Journal of Applied Mathematics

Volume 2013, Article ID 541250, 9 pages

http://dx.doi.org/10.1155/2013/541250

## A Berry-Esseen Type Bound in Kernel Density Estimation for Negatively Associated Censored Data

^{1}College of Science, Guilin University of Technology, Guilin 541004, China

^{2}Department of Mathematics, Ji'nan University, Guangzhou 510630, China

Received 19 February 2013; Accepted 11 July 2013

Academic Editor: XianHua Tang

Copyright © 2013 Qunying Wu and Pingyan Chen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We discuss kernel estimation of a density function based on censored data when the survival and the censoring times form stationary negatively associated (NA) sequences. Under certain regularity conditions, Berry-Esseen type bounds are derived for the kernel density estimator and the Kaplan-Meier kernel density estimator at a fixed point $x$.

#### 1. Introduction

Let $\{X_n; n \ge 1\}$ be a sequence of the true survival times. The random variables (r.v.s.) $X_i$ are not assumed to be mutually independent; it is assumed, however, that they have a common unknown continuous marginal distribution function (d.f.) $F(x) = P(X_i \le x)$ and density function $f$. Let the r.v.s. $X_i$ be censored on the right by the censoring r.v.s. $Y_i$, so that one observes only $(Z_i, \delta_i)$, where $Z_i = X_i \wedge Y_i$ and $\delta_i = I(X_i \le Y_i)$; here and in the sequel, $\wedge$ denotes the minimum and $I(\cdot)$ is the indicator random variable of the event in question. In this random censorship model, the censoring times $Y_i$, $i = 1, \dots, n$, are assumed to have the common d.f. $G$; they are also assumed to be independent of the r.v.s. $X_i$. Following the convention in the survival literature, we assume that both $X_i$ and $Y_i$ are nonnegative random variables. In contrast to statistics for complete data, we observe only the $n$ pairs $(Z_i, \delta_i)$, $i = 1, \dots, n$, and the estimators are based on these pairs.

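The following nonparametric estimators of the distribution functions $F$ and $G$, due to Kaplan and Meier [1], are widely used to estimate $F$ and $G$ on the basis of the data $(Z_i, \delta_i)$:

$$1 - \hat F_n(x) = \prod_{i=1}^{n} \left(1 - \frac{\delta_{(i)}}{n - i + 1}\right)^{I(Z_{(i)} \le x)}, \tag{1}$$

$$1 - \hat G_n(x) = \prod_{i=1}^{n} \left(1 - \frac{1 - \delta_{(i)}}{n - i + 1}\right)^{I(Z_{(i)} \le x)}, \tag{2}$$

where $Z_{(1)} \le Z_{(2)} \le \cdots \le Z_{(n)}$ denote the order statistics of $Z_1, \dots, Z_n$ and $\delta_{(i)}$ is the concomitant of $Z_{(i)}$.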
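The product-limit construction can be sketched numerically. The block below is a minimal illustration, assuming the standard product-limit form in which the $i$-th ordered observation contributes the factor $1 - d_{(i)}/(n-i+1)$, with $d_{(i)}$ the event indicator for the lifetime estimate and its complement for the censoring estimate. The function name `kaplan_meier_survival` and the `censoring` flag are hypothetical, and ties among the observed times are not treated specially.

```python
import numpy as np


def kaplan_meier_survival(z, delta, x, censoring=False):
    """Product-limit estimate of a survival function at the points x.

    z         : observed times, z[i] = min(lifetime, censoring time)
    delta     : 1 if the lifetime was observed (uncensored), else 0
    censoring : False -> estimate the lifetime survival function,
                True  -> estimate the censoring survival function
                         (roles of events and censorings are swapped)
    Assumption of this sketch: no ties among the observed times.
    """
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    n = z.size

    order = np.argsort(z, kind="stable")
    z_sorted = z[order]
    d_sorted = delta[order]          # concomitants of the order statistics
    if censoring:
        d_sorted = 1.0 - d_sorted

    # Factor for the i-th order statistic (1-indexed): 1 - d_(i) / (n - i + 1).
    at_risk = n - np.arange(n)       # n, n-1, ..., 1
    factors = 1.0 - d_sorted / at_risk

    # cumulative[k] is the product of the first k factors (empty product = 1).
    cumulative = np.concatenate(([1.0], np.cumprod(factors)))

    # Number of order statistics that are <= each evaluation point.
    counts = np.searchsorted(z_sorted, x, side="right")
    return cumulative[counts]
```

With every observation uncensored, the estimate reduces to one minus the empirical distribution function, which is a convenient sanity check for an implementation of this kind.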

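We introduce the kernel density estimator

$$f_n(x) = \frac{1}{n h_n} \sum_{i=1}^{n} \frac{\delta_i}{1 - G(Z_i)} K\left(\frac{x - Z_i}{h_n}\right), \tag{3}$$

where $h_n$ are bandwidths, $K$ is some kernel function, and $(Z_i, \delta_i)$ are the observed pairs. When $G$ is known, (3) can be used to estimate the common density $f$ of the lifetimes. However, in most practical cases $G$ is unknown and must be replaced by the Kaplan-Meier estimator $\hat G_n$, so the Kaplan-Meier kernel density estimator of $f$ is defined by

$$\hat f_n(x) = \frac{1}{n h_n} \sum_{i=1}^{n} \frac{\delta_i}{1 - \hat G_n(Z_i)} K\left(\frac{x - Z_i}{h_n}\right). \tag{4}$$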
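To make the weighting concrete, the following sketch evaluates a kernel density estimate in which each uncensored observation is weighted by the reciprocal of the censoring survival function at that observation. It assumes the usual inverse-censoring-weight form $\frac{1}{n h}\sum_i \frac{\delta_i}{1-G(Z_i)} K\big(\frac{x-Z_i}{h}\big)$; the names `censored_kernel_density`, `censor_survival`, and the Epanechnikov kernel are illustrative choices, not taken from the paper. Passing a known $1-G$ corresponds to the known-censoring case, and passing an estimate of $1-G$ corresponds to the Kaplan-Meier version.

```python
import numpy as np


def epanechnikov(u):
    """A bounded kernel with compact support [-1, 1] (illustrative choice)."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)


def censored_kernel_density(x, z, delta, h, censor_survival, kernel=epanechnikov):
    """Kernel density estimate at the point x from right-censored data.

    z, delta        : observed times and uncensoring indicators
    h               : bandwidth (positive)
    censor_survival : callable returning 1 - G(t) (known or estimated)
    """
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=float)
    surv = np.asarray(censor_survival(z), dtype=float)

    # Weight delta_i / (1 - G(Z_i)); censored points get weight 0.
    # Convention of this sketch: points where 1 - G vanishes are dropped.
    weights = np.zeros_like(z)
    usable = (delta > 0) & (surv > 0)
    weights[usable] = 1.0 / surv[usable]

    return float(np.sum(weights * kernel((x - z) / h)) / (z.size * h))


if __name__ == "__main__":
    # Independent exponential data, used only to exercise the function;
    # the paper's setting allows negatively associated sequences instead.
    rng = np.random.default_rng(0)
    n = 2000
    lifetimes = rng.exponential(1.0, n)      # density exp(-x)
    censor_times = rng.exponential(2.0, n)   # 1 - G(t) = exp(-t / 2)
    z = np.minimum(lifetimes, censor_times)
    delta = (lifetimes <= censor_times).astype(float)
    estimate = censored_kernel_density(
        1.0, z, delta, h=0.3, censor_survival=lambda t: np.exp(-t / 2.0)
    )
    print("estimate at x = 1:", estimate, " true density:", np.exp(-1.0))
```

Without censoring (all indicators equal to 1 and a censoring survival function identically 1) the function reduces to the ordinary kernel density estimator.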

There is an extensive literature on the Kaplan-Meier estimator for censored independent observations. We refer to papers by Földes and Rejtő [2], Gu and Lai [3], Gill [4], and Sun and Zhu [5]. Sun and Zhu obtained the following Berry-Esseen bound for i.i.d. censored sequences.

Theorem A. *Let be a bounded probability kernel function with compact support satisfying for integer ,
*

Let be -order continuously differentiable and let be continuously differentiable in a neighborhood of with for . Then where denotes the standard normal distribution function, and .

However, censored dependent data appear in a number of applications. For example, repeated measurements in survival analysis follow this pattern; see Kang and Koehler [6]. In the context of censored time series analysis, Shumway et al. [7] considered (hourly or daily) measurements of the concentration of a given substance subject to some detection limits, thus being potentially censored from the right. Lecoutre and Ould-Said [8], Cai [9], and Liang and Uña-Álvarez [10] studied the convergence for stationary $\alpha$-mixing data. However, the convergence for NA data has not been reported.

The main purpose of this paper is to study the kernel density estimator and the Kaplan-Meier kernel estimator of a density function based on censored data when the survival and the censoring times form stationary NA sequences (see Definition 1 below). Under certain regularity conditions, Berry-Esseen type bounds are derived for the kernel density estimator and the Kaplan-Meier kernel estimator at a fixed point $x$.

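*Definition 1.* Random variables $X_1, X_2, \dots, X_n$, $n \ge 2$, are said to be negatively associated (NA) if for every pair of disjoint subsets $A_1$ and $A_2$ of $\{1, 2, \dots, n\}$,

$$\operatorname{Cov}\big(f_1(X_i; i \in A_1),\, f_2(X_j; j \in A_2)\big) \le 0, \tag{7}$$

where $f_1$ and $f_2$ are increasing for every variable (or decreasing for every variable) such that this covariance exists. A sequence of random variables $\{X_n; n \ge 1\}$ is said to be NA if every finite subfamily is NA.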

Obviously, if $\{X_n; n \ge 1\}$ is a sequence of NA random variables and $\{g_n; n \ge 1\}$ is a sequence of nondecreasing (or nonincreasing) functions, then $\{g_n(X_n); n \ge 1\}$ is also a sequence of NA random variables.

This definition was introduced by Joag-Dev and Proschan [11]. Statistical tests depend greatly on sampling. Random sampling without replacement from a finite population is NA but not independent. NA sampling has wide applications, such as those in multivariate statistical analysis and reliability theory. Because of these applications, the limit behavior of NA random variables has received increasing attention recently. One can refer to Joag-Dev and Proschan [11] for fundamental properties, Matuła [12] for the three-series theorem, and Wu and Jiang [13, 14] for the strong convergence.
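The claim about sampling without replacement can be checked exactly on a small population. The sketch below enumerates all equally likely ordered samples and computes the covariance between two coordinatewise increasing functions of disjoint sets of draws; negative association requires this covariance to be nonpositive. The helper name `covariance_under_sampling` and the example population are hypothetical, and this is a spot check for particular functions, not a proof.

```python
from itertools import permutations
from statistics import fmean


def covariance_under_sampling(population, draws, f1, idx1, f2, idx2):
    """Exact Cov(f1(X_i, i in idx1), f2(X_j, j in idx2)) when
    (X_0, ..., X_{draws-1}) is an ordered sample drawn without
    replacement from `population` (all orderings equally likely)."""
    samples = list(permutations(population, draws))
    a = [f1(*(s[i] for i in idx1)) for s in samples]
    b = [f2(*(s[j] for j in idx2)) for s in samples]
    mean_a, mean_b = fmean(a), fmean(b)
    return fmean([(u - mean_a) * (v - mean_b) for u, v in zip(a, b)])


if __name__ == "__main__":
    population = [1, 2, 3, 4]

    # Two single draws: the covariance is -sigma^2 / (N - 1).
    pair_cov = covariance_under_sampling(
        population, 2, lambda u: u, [0], lambda v: v, [1]
    )
    print("Cov(X_0, X_1) =", pair_cov)

    # Increasing functions of disjoint index sets {0, 1} and {2}.
    block_cov = covariance_under_sampling(
        population, 3, max, [0, 1], lambda v: v, [2]
    )
    print("Cov(max(X_0, X_1), X_2) =", block_cov)
```

For two single draws the covariance equals $-\sigma^2/(N-1)$, where $\sigma^2$ is the population variance with divisor $N$, so the draws are dependent yet negatively correlated.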

#### 2. Main Results

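In what follows, let $H$ be the d.f. of the $Z_i$'s, $i = 1, \dots, n$. Since the sequences $\{X_n; n \ge 1\}$ and $\{Y_n; n \ge 1\}$ are independent, it follows that $1 - H(x) = (1 - F(x))(1 - G(x))$.

Define (possibly infinite) times $\tau_F$, $\tau_G$, and $\tau_H$ by

$$\tau_F = \inf\{y; F(y) = 1\}, \qquad \tau_G = \inf\{y; G(y) = 1\}, \qquad \tau_H = \inf\{y; H(y) = 1\}. \tag{8}$$

Then, $\tau_H = \tau_F \wedge \tau_G$.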

We give the following four lemmas, which are helpful in proving our theorems.

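Lemma 2 (Chang and Rao [15]). *Let $X$ and $Y$ be random variables; then for any $a > 0$,*

$$\sup_{t} \big|P(X + Y \le t) - \Phi(t)\big| \le \sup_{t} \big|P(X \le t) - \Phi(t)\big| + \frac{a}{\sqrt{2\pi}} + P(|Y| > a), \tag{9}$$

*where, here and in the sequel, $\Phi$ denotes the standard normal distribution function.*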
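A quick numerical check of a perturbation bound of this type may help fix ideas. The sketch assumes the commonly quoted form $\sup_t|P(X+Y\le t)-\Phi(t)| \le \sup_t|P(X\le t)-\Phi(t)| + a/\sqrt{2\pi} + P(|Y|>a)$ and takes $X \sim N(0,1)$ and $Y \sim N(0,s^2)$ independent, so that every term is available in closed form. The function name `lemma_sides`, the parameter `s`, and the evaluation grid are illustrative assumptions.

```python
import math


def phi(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def lemma_sides(s, a, grid=None):
    """Both sides of the bound for X ~ N(0, 1), Y ~ N(0, s^2) independent.

    Then X + Y ~ N(0, 1 + s^2), sup_t |P(X <= t) - Phi(t)| = 0, and
    P(|Y| > a) = 2 * (1 - Phi(a / s)). The supremum on the left is
    approximated by a maximum over a finite grid.
    """
    if grid is None:
        grid = [i / 100.0 for i in range(-600, 601)]
    scale = math.sqrt(1.0 + s * s)
    lhs = max(abs(phi(t / scale) - phi(t)) for t in grid)
    rhs = 0.0 + a / math.sqrt(2.0 * math.pi) + 2.0 * (1.0 - phi(a / s))
    return lhs, rhs


if __name__ == "__main__":
    for s, a in [(0.1, 0.3), (0.5, 1.0), (1.0, 0.5)]:
        lhs, rhs = lemma_sides(s, a)
        print(f"s={s}, a={a}: lhs={lhs:.4f} <= rhs={rhs:.4f}")
```

The bound is typically used with $a$ shrinking at a rate that balances $a/\sqrt{2\pi}$ against the tail probability of the remainder term.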

Lemma 3 (Su et al. [16, Theorem 1]). *Let be a sequence of NA r.v.s. with zero means and , and . Then for ,
**
where depends only on .*

Lemma 4. *Let be a sequence of NA r.v.s. with continuous d.f. , and let be the empirical d.f. based on the segments . Then
*

*Proof. *Similar to the proof of Lemma 4 in Yang [17], we can prove Lemma 4.

Lemma 5 (Wu and Chen [18, Theorem 1.3]). *Let and be two sequences of NA r.v.s. Suppose that the sequences and are independent. Then for any ,
*

In order to formulate our main results, we now list some assumptions.

() and are two sequences of stationary NA random variables, and and are independent.

() Suppose that , , and and have bounded derivative in a neighborhood of .

() For all integers , the conditional distribution , given , has a density , and for all , for and some , where represents a neighborhood of .

() The kernel is a bounded derivative function with for and .

() Let , , and be positive integers with where .

*Remark 6.* () implies and .

Let , .

Theorem 7. *Suppose that are satisfied; then
**
where , , , .**Consider the following:
**
where .**Furthermore, if
**
then
*

Theorem 8. *Assume that the conditions of Theorem 7 hold. Then
**
Furthermore, if (16) holds, then
*

#### 3. Proofs

*Proof of Theorem 7. *We observe that, by (3),

Let , , , where
and then
By (20),
We first estimate , , and . Obviously, implies that and are stationary; thus,

From , , and , we obtain
Hence, by , .

For and , by ,

Therefore, by ,
By , , , and Lemma 2.3 of Zhang [19], for ,

Thus, by and ,
Therefore, by the combination of , (24), (26), (28), and (30),

Similarly,

By (26), (27), , , and ,

By (25), , and , Note that for any random variables and ; from (31)–(33),
Therefore, from the combination of (23) and (31)–(34), it follows that
Thus, (14) holds.

Now, we prove (15). Let , , , . Then, . According to Lemma 2, (14), (20), (32), and (33), we have

Let , be independent random variables with the same distribution as for . Put , . Obviously,
Note that and from (20) and (24). By (14), (30), (32), and (33),
Note that , , are independent random variables, and . Therefore, by (from (39)), (14), and the Berry-Esseen inequality (cf. Petrov [20, page 154, Theorem 5.7]), there exists some constant such that

Similar to (26), we can get and . It is easy to see from Property P7 of Joag-Dev and Proschan [11] that is also a sequence of NA r.v.s., so by using Lemma 3, we have

Assume that and are the characteristic functions of and , respectively. By the Esseen inequality (cf. Petrov [20, page 146, Theorem 5.3]), for any , there exists some constant such that

By Theorem 10 in Newman [21], (14), and (30),
Therefore,
On applying (39)–(41), we have
Thus,
Choosing , then by (42)–(46),
Therefore, (15) holds by the combination of (37)–(39), (41), and (47).

Finally, we prove (17). By Lemma 2 and (15), for any ,

Applying (14), , , and the mean value theorem, there exists a constant , such that
Hence, there exists a constant sufficiently large such that . Let in (48); then . Therefore, by (48), (17) holds.

*Proof of Theorem 8. *Using (15) and Lemma 2,

Let be the empirical d.f. of . Then, by (2),

Thus, by Lemmas 4 and 5, for ,

Using (14), we get

Therefore, (18) holds from (50) and (53).

Using (18), similar to the proof of (17), we can prove (19). This completes the proof of Theorem 8.

#### Acknowledgments

The authors are very grateful to the referees and the editors for their valuable comments and helpful suggestions that improved the clarity and readability of the paper. This work was supported by the National Natural Science Foundation of China (11061012), the Program to Sponsor Teams for Innovation in the Construction of Talent Highlands in Guangxi Institutions of Higher Learning ((2011) 47), and the Support Program of the Guangxi China Science Foundation (2012GXNSFAA053010 and 2013GXNSFDA019001).

#### References

1. E. L. Kaplan and P. Meier, “Nonparametric estimation from incomplete observations,” *Journal of the American Statistical Association*, vol. 53, pp. 457–481, 1958.
2. A. Földes and L. Rejtő, “A LIL type result for the product limit estimator,” *Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete*, vol. 56, no. 1, pp. 75–86, 1981.
3. M. G. Gu and T. L. Lai, “Functional laws of the iterated logarithm for the product-limit estimator of a distribution function under random censorship or truncation,” *The Annals of Probability*, vol. 18, no. 1, pp. 160–189, 1990.
4. R. D. Gill, *Censoring and Stochastic Integrals*, vol. 124 of *Mathematical Centre Tracts*, Mathematisch Centrum, Amsterdam, The Netherlands, 1980.
5. L. Q. Sun and L. X. Zhu, “A Berry-Esseen type bound for kernel density estimators under random censorship,” *Acta Mathematica Sinica*, vol. 42, no. 4, pp. 627–636, 1999.
6. S.-S. Kang and K. J. Koehler, “Modification of the Greenwood formula for correlated response times,” *Biometrics*, vol. 53, no. 3, pp. 885–899, 1997.
7. R. H. Shumway, A. S. Azari, and P. Johnson, “Estimating mean concentrations under transformation for environmental data with detection limits,” *Technometrics*, vol. 31, pp. 347–356, 1988.
8. J.-P. Lecoutre and E. Ould-Said, “Convergence of the conditional Kaplan-Meier estimate under strong mixing,” *Journal of Statistical Planning and Inference*, vol. 44, no. 3, pp. 359–369, 1995.
9. Z. Cai, “Estimating a distribution function for censored time series data,” *Journal of Multivariate Analysis*, vol. 78, no. 2, pp. 299–318, 2001.
10. H.-Y. Liang and J. de Uña-Álvarez, “A Berry-Esseen type bound in kernel density estimation for strong mixing censored samples,” *Journal of Multivariate Analysis*, vol. 100, no. 6, pp. 1219–1231, 2009.
11. K. Joag-Dev and F. Proschan, “Negative association of random variables, with applications,” *The Annals of Statistics*, vol. 11, no. 1, pp. 286–295, 1983.
12. P. Matuła, “A note on the almost sure convergence of sums of negatively dependent random variables,” *Statistics & Probability Letters*, vol. 15, no. 3, pp. 209–213, 1992.
13. Q. Wu and Y. Jiang, “A law of the iterated logarithm of partial sums for NA random variables,” *Journal of the Korean Statistical Society*, vol. 39, no. 2, pp. 199–206, 2010.
14. Q. Wu and Y. Jiang, “Chover's law of the iterated logarithm for negatively associated sequences,” *Journal of Systems Science & Complexity*, vol. 23, no. 2, pp. 293–302, 2010.
15. M. N. Chang and P. V. Rao, “Berry-Esseen bound for the Kaplan-Meier estimator,” *Communications in Statistics*, vol. 18, no. 12, pp. 4647–4664, 1989.
16. C. Su, L. Zhao, and Y. Wang, “Moment inequalities and weak convergence for negatively associated sequences,” *Science in China A*, vol. 40, no. 2, pp. 172–182, 1997.
17. S. C. Yang, “Consistency of the nearest neighbor estimator of the density function for negatively associated samples,” *Acta Mathematicae Applicatae Sinica*, vol. 26, no. 3, pp. 385–394, 2003.
18. Q. Y. Wu and P. Y. Chen, “Strong representation results of Kaplan-Meier estimator for censored NA data,” *Journal of Inequalities and Applications*, vol. 2013, article 340, 2013.
19. L.-X. Zhang, “The weak convergence for functions of negatively associated random variables,” *Journal of Multivariate Analysis*, vol. 78, no. 2, pp. 272–298, 2001.
20. V. V. Petrov, *Limit Theorems of Probability Theory*, vol. 4, Oxford University Press, New York, NY, USA, 1995.
21. C. M. Newman, “Asymptotic independence and limit theorems for positively and negatively dependent random variables,” in *Inequalities in Statistics and Probability*, vol. 5 of *IMS Lecture Notes Monogr. Ser.*, pp. 127–140, Institute of Mathematical Statistics, Hayward, Calif, USA, 1984.