Journal of Applied Mathematics
Volume 2013 (2013), Article ID 541250, 9 pages
A Berry-Esseen Type Bound in Kernel Density Estimation for Negatively Associated Censored Data
Qunying Wu¹ and Pingyan Chen²
¹College of Science, Guilin University of Technology, Guilin 541004, China
²Department of Mathematics, Ji'nan University, Guangzhou 510630, China
Received 19 February 2013; Accepted 11 July 2013
Academic Editor: XianHua Tang
Copyright © 2013 Qunying Wu and Pingyan Chen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
We discuss the kernel estimation of a density function based on censored data when the survival and the censoring times form stationary negatively associated (NA) sequences. Under certain regularity conditions, Berry-Esseen type bounds are derived for the kernel density estimator and the Kaplan-Meier kernel density estimator at a fixed point $x$.
1. Introduction
Let $\{X_n; n \ge 1\}$ be a sequence of the true survival times. The random variables (r.v.s.) $X_n$ are not assumed to be mutually independent; it is assumed, however, that they have a common unknown continuous marginal distribution function (d.f.) $F$ and density function $f$. Let the r.v.s. $X_i$ be censored on the right by the censoring r.v.s. $Y_i$, so that one observes only $(Z_i, \delta_i)$, where $Z_i = X_i \wedge Y_i$ and $\delta_i = I(X_i \le Y_i)$; here and in the sequel, $\wedge$ denotes the minimum and $I(A)$ is the indicator random variable of the event $A$. In this random censorship model, the censoring times $Y_i$, $i \ge 1$, are assumed to have the common d.f. $G$; they are also assumed to be independent of the r.v.s. $X_i$. Following the convention in the survival literature, we assume that both $X_i$ and $Y_i$ are nonnegative random variables. In contrast to statistics for complete data, we observe only the $n$ pairs $(Z_i, \delta_i)$, $i = 1, \dots, n$, and the estimators are based on these pairs.
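The following nonparametric estimators of the distribution functions $F$ and $G$, due to Kaplan and Meier [1], are widely used to estimate $F$ and $G$ on the basis of the data $(Z_i, \delta_i)$:
$$1 - \hat{F}_n(x) = \prod_{i=1}^{n} \left(1 - \frac{\delta_{(i)}}{n - i + 1}\right)^{I(Z_{(i)} \le x)}, \tag{1}$$
$$1 - \hat{G}_n(x) = \prod_{i=1}^{n} \left(1 - \frac{1 - \delta_{(i)}}{n - i + 1}\right)^{I(Z_{(i)} \le x)}, \tag{2}$$
where $Z_{(1)} \le Z_{(2)} \le \cdots \le Z_{(n)}$ denote the order statistics of $Z_1, \dots, Z_n$ and $\delta_{(i)}$ is the concomitant of $Z_{(i)}$.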
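The product-limit formulas can be sketched in code. The following is a minimal illustration, assuming the usual form $1 - \hat{F}_n(x) = \prod_i (1 - \delta_{(i)}/(n-i+1))^{I(Z_{(i)} \le x)}$ (with $1 - \delta_{(i)}$ in place of $\delta_{(i)}$ for $\hat{G}_n$) and no ties among the observations; the function name `kaplan_meier` and its arguments are illustrative and do not come from the paper.

```python
import numpy as np


def kaplan_meier(z, delta, x, censoring=False):
    """Product-limit estimate at x of F, or of G when censoring=True.

    z     : observed times Z_i = min(X_i, Y_i)
    delta : indicators, 1 if Z_i is an uncensored survival time
    """
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=int)
    n = len(z)

    # Sort the observations; d carries the concomitant indicators.
    order = np.argsort(z, kind="stable")
    z_sorted = z[order]
    d = delta[order]
    if censoring:
        d = 1 - d  # for G the roles of "event" and "censored" swap

    # n - i + 1 for i = 1, ..., n: number of observations still at risk.
    at_risk = n - np.arange(n)
    factors = 1.0 - d / at_risk

    # Only factors with Z_(i) <= x enter the product.
    survival = np.prod(factors[z_sorted <= x])
    return 1.0 - survival
```

When every observation is uncensored, the estimate of $F$ reduces to the ordinary empirical distribution function, which is a convenient sanity check.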
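We introduce the kernel density estimator
$$f_n(x) = \frac{1}{n h_n} \sum_{i=1}^{n} \frac{\delta_i}{1 - G(Z_i)} K\left(\frac{x - Z_i}{h_n}\right), \tag{3}$$
where $h_n$ are bandwidths and $K$ is some kernel function. When $G$ is known, (3) can be used to estimate the common density $f$ of the lifetimes. However, in most practical cases $G$ is unknown and must be replaced by the Kaplan-Meier estimator $\hat{G}_n$, so the Kaplan-Meier kernel density estimator of $f$ is defined by
$$\hat{f}_n(x) = \frac{1}{n h_n} \sum_{i=1}^{n} \frac{\delta_i}{1 - \hat{G}_n(Z_i)} K\left(\frac{x - Z_i}{h_n}\right). \tag{4}$$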
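The two estimators can be sketched as follows, assuming the usual inverse-censoring-weighted form $(n h_n)^{-1} \sum_i \delta_i K((x - Z_i)/h_n) / (1 - G(Z_i))$, with $G$ replaced by its Kaplan-Meier estimate when it is unknown. The Epanechnikov kernel, the function names, and the `g_survival` argument are illustrative choices, not part of the paper.

```python
import numpy as np


def km_censoring_survival(z, delta):
    """Return 1 - G_hat(Z_i) for every observation (no ties assumed)."""
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=int)
    n = len(z)
    order = np.argsort(z, kind="stable")
    # Censored points (delta = 0) are the "events" for the censoring d.f.
    factors = 1.0 - (1 - delta[order]) / (n - np.arange(n))
    survival = np.empty(n)
    survival[order] = np.cumprod(factors)  # map back to the input order
    return survival


def epanechnikov(u):
    """A bounded kernel with compact support [-1, 1]."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)


def kernel_density_censored(x, z, delta, h, g_survival=None):
    """Density estimate at x from censored data.

    g_survival : optional callable t -> 1 - G(t) when G is known;
                 if None, the Kaplan-Meier estimate of G is used.
    """
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=int)
    if g_survival is None:
        weights = km_censoring_survival(z, delta)
    else:
        weights = np.asarray(g_survival(z), dtype=float)

    # Censored observations contribute nothing, so they are skipped;
    # this also avoids a 0/0 term when the largest observation is censored.
    uncensored = delta == 1
    terms = epanechnikov((x - z[uncensored]) / h) / weights[uncensored]
    return terms.sum() / (len(z) * h)
```

Without censoring all weights equal one and the function returns the ordinary kernel density estimate.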
There is an extensive literature on the Kaplan-Meier estimator for censored independent observations. We refer to the papers by Földes and Rejtő [2], Gu and Lai [3], Gill [4], and Sun and Zhu [5]. Sun and Zhu [5] obtained the following Berry-Esseen bound for i.i.d. censored sequences.
Theorem A. Let be a bounded probability kernel function with compact support satisfying for integer ,
Let be -order continuously differentiable and let be continuously differentiable in a neighborhood of with for . Then where denotes the standard normal distribution function, and .
However, censored dependent data appear in a number of applications. For example, repeated measurements in survival analysis follow this pattern; see Kang and Koehler [6]. In the context of censored time series analysis, Shumway et al. [7] considered (hourly or daily) measurements of the concentration of a given substance subject to some detection limits, thus being potentially censored from the right. Lecoutre and Ould-Said [8], Cai [9], and Liang and Uña-Álvarez [10] studied the convergence for stationary $\alpha$-mixing data. However, the convergence for NA data has not been reported.
The main purpose of this paper is to study the kernel density estimator and the Kaplan-Meier kernel estimator of a density function based on censored data when the survival and the censoring times form stationary NA sequences (see Definition 1 below). Under certain regularity conditions, Berry-Esseen type bounds are derived for the kernel density estimator and the Kaplan-Meier kernel estimator at a fixed point $x$.
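Definition 1. Random variables $X_1, X_2, \dots, X_n$, $n \ge 2$, are said to be negatively associated (NA) if for every pair of disjoint subsets $A_1$ and $A_2$ of $\{1, 2, \dots, n\}$,
$$\operatorname{Cov}\bigl(f_1(X_i; i \in A_1), f_2(X_j; j \in A_2)\bigr) \le 0,$$
where $f_1$ and $f_2$ are increasing for every variable (or decreasing for every variable) such that this covariance exists. A sequence of random variables $\{X_n; n \ge 1\}$ is said to be NA if every finite subfamily is NA.
Obviously, if $\{X_n; n \ge 1\}$ is a sequence of NA random variables and $\{f_n; n \ge 1\}$ is a sequence of nondecreasing (or nonincreasing) functions, then $\{f_n(X_n); n \ge 1\}$ is also a sequence of NA random variables.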
This definition was introduced by Joag-Dev and Proschan [11]. Statistical tests depend greatly on sampling. Random sampling without replacement from a finite population is NA but is not independent. NA sampling has wide applications, such as those in multivariate statistical analysis and reliability theory. Because of these applications, the limit behavior of NA random variables has received increasing attention recently. One can refer to Joag-Dev and Proschan [11] for fundamental properties, Matuła [12] for the three-series theorem, and Wu and Jiang [13, 14] for strong convergence.
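The negative dependence of sampling without replacement can be checked exactly on a small population. The sketch below enumerates all equally likely ordered pairs of two draws and computes $\operatorname{Cov}(f(X_1), g(X_2))$ with rational arithmetic; the function name and the example population are illustrative assumptions, and the check covers only single-coordinate subsets, not the full definition.

```python
from fractions import Fraction
from itertools import permutations


def covariance_two_draws(population, f=lambda v: v, g=lambda v: v):
    """Exact Cov(f(X1), g(X2)) for two draws without replacement.

    The population values and the outputs of f and g are assumed
    to be integers so that the arithmetic stays exact.
    """
    pairs = list(permutations(population, 2))  # equally likely ordered pairs
    m = len(pairs)
    mean_f = Fraction(sum(f(a) for a, _ in pairs), m)
    mean_g = Fraction(sum(g(b) for _, b in pairs), m)
    mean_fg = Fraction(sum(f(a) * g(b) for a, b in pairs), m)
    return mean_fg - mean_f * mean_g


# Identity functions on the population {1, 2, 3, 4}.
print(covariance_two_draws([1, 2, 3, 4]))  # -5/12
```

For identity functions the value agrees with the classical formula $-\sigma^2/(N-1)$, where $\sigma^2$ is the population variance and $N$ the population size; any other pair of increasing functions also yields a nonpositive covariance, as the definition requires.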
2. Main Results
In what follows, let $H$ be the d.f. of the $Z_i$'s, $i \ge 1$. Since the sequences $\{X_n; n \ge 1\}$ and $\{Y_n; n \ge 1\}$ are independent, it follows that $1 - H(x) = (1 - F(x))(1 - G(x))$.
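The factorization of the survival function of the observed times, $1 - H(x) = (1 - F(x))(1 - G(x))$, is easy to verify by simulation. The sketch below is only an illustration under simplifying assumptions that are not in the paper: the survival and censoring times are i.i.d. exponential (rather than NA) with hypothetical rates 1 and 2, so that $1 - H(t) = e^{-3t}$.

```python
import numpy as np


def simulate_censored_sample(n, rate_x=1.0, rate_y=2.0, seed=0):
    """Draw (Z_i, delta_i) from independent exponential X_i and Y_i."""
    rng = np.random.default_rng(seed)
    x = rng.exponential(scale=1.0 / rate_x, size=n)  # survival times
    y = rng.exponential(scale=1.0 / rate_y, size=n)  # censoring times
    z = np.minimum(x, y)                             # observed times
    delta = (x <= y).astype(int)                     # 1 if uncensored
    return z, delta


def survival_product_gap(z, t, rate_x=1.0, rate_y=2.0):
    """|empirical P(Z > t) - (1 - F(t))(1 - G(t))| for exponential F, G."""
    empirical = np.mean(z > t)
    product = np.exp(-rate_x * t) * np.exp(-rate_y * t)
    return abs(empirical - product)


z, delta = simulate_censored_sample(200_000)
gap = survival_product_gap(z, 0.3)  # small when the factorization holds
```

With these rates about one third of the observations are uncensored, which illustrates how much information the censoring removes.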
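Define (possibly infinite) times $\tau_F$, $\tau_G$, and $\tau_H$ by
$$\tau_F = \inf\{y : F(y) = 1\}, \quad \tau_G = \inf\{y : G(y) = 1\}, \quad \tau_H = \inf\{y : H(y) = 1\}.$$
Then, $\tau_H = \tau_F \wedge \tau_G$.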
We give the following four lemmas, which are helpful in proving our theorems.
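Lemma 2 (Chang and Rao [15]). Let $X$ and $Y$ be random variables; then for any $a > 0$,
$$\sup_t \left|P(X + Y \le t) - \Phi(t)\right| \le \sup_t \left|P(X \le t) - \Phi(t)\right| + \frac{a}{\sqrt{2\pi}} + P(|Y| > a),$$
where, here and in the sequel, $\Phi$ denotes the standard normal distribution function.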
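For orientation, the standard argument behind a bound of this type is short. The sketch below assumes the lemma has the usual form $\sup_t |P(X + Y \le t) - \Phi(t)| \le \sup_t |P(X \le t) - \Phi(t)| + a/\sqrt{2\pi} + P(|Y| > a)$ for $a > 0$; it is an outline under that assumption, not the proof given by Chang and Rao.

```latex
% Upper bound: on the event {|Y| <= a}, X + Y <= t forces X <= t + a.
\begin{align*}
P(X+Y\le t) - \Phi(t)
  &\le P(X\le t+a) + P(|Y|>a) - \Phi(t) \\
  &= \bigl[P(X\le t+a)-\Phi(t+a)\bigr] + \bigl[\Phi(t+a)-\Phi(t)\bigr] + P(|Y|>a) \\
  &\le \sup_u \bigl|P(X\le u)-\Phi(u)\bigr| + \frac{a}{\sqrt{2\pi}} + P(|Y|>a).
\end{align*}
% Lower bound: on the event {|Y| <= a}, X <= t - a forces X + Y <= t.
\begin{align*}
P(X+Y\le t) - \Phi(t)
  &\ge P(X\le t-a) - P(|Y|>a) - \Phi(t) \\
  &\ge -\sup_u \bigl|P(X\le u)-\Phi(u)\bigr| - \frac{a}{\sqrt{2\pi}} - P(|Y|>a).
\end{align*}
% Both chains use |\Phi(t \pm a) - \Phi(t)| \le a/\sqrt{2\pi},
% which holds because the normal density is bounded by 1/\sqrt{2\pi}.
```

Taking the supremum over $t$ in both chains gives the stated inequality.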
Lemma 3 (Su et al. [16, Theorem 1]). Let be a sequence of NA r.v.s. with zero means and , and . Then for , where depends only on .
Lemma 4. Let be a sequence of NA r.v.s. with continuous d.f. , and let be the empirical d.f. based on the segments . Then
Lemma 5 (Wu and Chen [18, Theorem 1.3]). Let and be two sequences of NA r.v.s. Suppose that the sequences and are independent. Then for any ,
In order to formulate our main results, we now list some assumptions.
() and are two sequences of stationary NA random variables, and and are independent.
() Suppose that , , and and have bounded derivative in a neighborhood of .
() For all integers , the conditional distribution , given , has a density , and for all , for and some , where represents a neighborhood of .
() The kernel is a bounded derivative function with for and .
() Let , , and be positive integers with where .
Remark 6. () Implies and .
Let , .
Theorem 7. Suppose that are satisfied; then
where , , , .
Consider the following: where .
Furthermore, if then
Proof of Theorem 7. We observe that, by (3),
Let , , , where and then By (20), We first estimate , , and . Obviously, implies that and are stationary; thus,
From , , and , we obtain Hence, by , .
For and , by ,
Therefore, by , By , , , and Lemma 2.3 of Zhang [19], for ,
Thus, by and , Therefore, by the combination of , (24), (26), (28), and (30),
By (26), (27), , , and ,
By (25), , and , Note that for any random variables and ; from (31)–(33), Therefore, from the combination of (23) and (31)–(34), it follows that Thus, (14) holds.
Now, we prove (15). Let , , , . Then, . According to Lemma 2, (14), (20), (32), and (33), we have
Let , be independent random variables with the same distribution as for . Put , . Obviously, Note that and from (20) and (24). By (14), (30), (32), and (33), Note that , , are independent random variables, and . Therefore, by (from (39)), (14), and the Berry-Esseen inequality (cf. Petrov [20, page 154, Theorem 5.7]), there exists some constant such that
Similar to (26), we can get and . It is easy to see from Property P7 of Joag-Dev and Proschan [11] that is also a sequence of NA r.v.s., so by using Lemma 3, we have
Assume that and are the characteristic functions of and , respectively. By the Esseen inequality (cf. Petrov [20, page 146, Theorem 5.3]), for any , there exists some constant such that
By Theorem 10 in Newman [21], (14), and (30), Therefore, On applying (39)–(41), we have Thus, Choosing , then by (42)–(46), Therefore, by the combination of (37)–(39), (41), and (47), (15) holds.
Finally, we prove (17). By Lemma 2 and (15), for any ,
Applying (14), , , and the mean value theorem, there exists a constant , such that Hence, there exists a constant sufficiently large such that . Let in (48); then . Therefore, by (48), (17) holds.
Proof of Theorem 8. Using (15) and Lemma 2,
Let be the empirical d.f. of . Then, by (2),
Thus, by Lemmas 4 and 5, for ,
Using (14), we get
Therefore, (18) holds from (50) and (53).
Using (18), similar to the proof of (17), we can prove (19). This completes the proof of Theorem 8.
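The Berry-Esseen inequality for independent summands, invoked in the proof of Theorem 7, can be illustrated numerically. The sketch below is a simplified special case and not the block construction of the proof: it takes i.i.d. Bernoulli($p$) summands, computes the exact Kolmogorov distance between the standardized binomial d.f. and $\Phi$, and compares it with $c\,\rho/(\sigma^3 \sqrt{n})$, where $\rho = E|X - p|^3$. The value $c = 0.56$ is an assumption taken from the literature on admissible constants; the paper only asserts that some constant exists.

```python
import math


def normal_cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def kolmogorov_distance_binomial(n, p):
    """Exact sup_t |P((S_n - np)/sqrt(npq) <= t) - Phi(t)| for S_n ~ Bin(n, p)."""
    q = 1.0 - p
    sd = math.sqrt(n * p * q)
    cdf = 0.0
    worst = 0.0
    for k in range(n + 1):
        phi = normal_cdf((k - n * p) / sd)
        worst = max(worst, abs(cdf - phi))  # left limit at the jump point
        cdf += math.comb(n, k) * p ** k * q ** (n - k)
        worst = max(worst, abs(cdf - phi))  # value at the jump point
    return worst


def berry_esseen_bound(n, p, c=0.56):
    """c * rho / (sigma^3 * sqrt(n)) for Bernoulli(p) summands."""
    q = 1.0 - p
    rho = p * q * (p * p + q * q)  # third absolute central moment
    sigma_cubed = (p * q) ** 1.5
    return c * rho / (sigma_cubed * math.sqrt(n))


for n in (100, 400):
    print(n, kolmogorov_distance_binomial(n, 0.3), berry_esseen_bound(n, 0.3))
```

Both the distance and the bound shrink at the rate $n^{-1/2}$, which is the benchmark against which the dependent-data rates in Theorems 7 and 8 are naturally compared.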
Acknowledgments
The authors are very grateful to the referees and the editors for their valuable comments and helpful suggestions, which improved the clarity and readability of the paper. This work was supported by the National Natural Science Foundation of China (11061012), the Program to Sponsor Teams for Innovation in the Construction of Talent Highlands in Guangxi Institutions of Higher Learning ((2011) 47), and the Support Program of the Guangxi China Science Foundation (2012GXNSFAA053010 and 2013GXNSFDA019001).
References
- E. L. Kaplan and P. Meier, “Nonparametric estimation from incomplete observations,” Journal of the American Statistical Association, vol. 53, pp. 457–481, 1958.
- A. Földes and L. Rejtő, “A LIL type result for the product limit estimator,” Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, vol. 56, no. 1, pp. 75–86, 1981.
- M. G. Gu and T. L. Lai, “Functional laws of the iterated logarithm for the product-limit estimator of a distribution function under random censorship or truncation,” The Annals of Probability, vol. 18, no. 1, pp. 160–189, 1990.
- R. D. Gill, Censoring and Stochastic Integrals, vol. 124 of Mathematical Centre Tracts, Mathematisch Centrum, Amsterdam, The Netherlands, 1980.
- L. Q. Sun and L. X. Zhu, “A Berry-Esseen type bound for kernel density estimators under random censorship,” Acta Mathematica Sinica, vol. 42, no. 4, pp. 627–636, 1999.
- S.-S. Kang and K. J. Koehler, “Modification of the Greenwood formula for correlated response times,” Biometrics, vol. 53, no. 3, pp. 885–899, 1997.
- R. H. Shumway, A. S. Azari, and P. Johnson, “Estimating mean concentrations under transformation for environmental data with detection limits,” Technometrics, vol. 31, pp. 347–356, 1988.
- J.-P. Lecoutre and E. Ould-Said, “Convergence of the conditional Kaplan-Meier estimate under strong mixing,” Journal of Statistical Planning and Inference, vol. 44, no. 3, pp. 359–369, 1995.
- Z. Cai, “Estimating a distribution function for censored time series data,” Journal of Multivariate Analysis, vol. 78, no. 2, pp. 299–318, 2001.
- H.-Y. Liang and J. de Uña-Álvarez, “A Berry-Esseen type bound in kernel density estimation for strong mixing censored samples,” Journal of Multivariate Analysis, vol. 100, no. 6, pp. 1219–1231, 2009.
- K. Joag-Dev and F. Proschan, “Negative association of random variables, with applications,” The Annals of Statistics, vol. 11, no. 1, pp. 286–295, 1983.
- P. Matuła, “A note on the almost sure convergence of sums of negatively dependent random variables,” Statistics & Probability Letters, vol. 15, no. 3, pp. 209–213, 1992.
- Q. Wu and Y. Jiang, “A law of the iterated logarithm of partial sums for NA random variables,” Journal of the Korean Statistical Society, vol. 39, no. 2, pp. 199–206, 2010.
- Q. Wu and Y. Jiang, “Chover's law of the iterated logarithm for negatively associated sequences,” Journal of Systems Science & Complexity, vol. 23, no. 2, pp. 293–302, 2010.
- M. N. Chang and P. V. Rao, “Berry-Esseen bound for the Kaplan-Meier estimator,” Communications in Statistics, vol. 18, no. 12, pp. 4647–4664, 1989.
- C. Su, L. Zhao, and Y. Wang, “Moment inequalities and weak convergence for negatively associated sequences,” Science in China A, vol. 40, no. 2, pp. 172–182, 1997.
- S. C. Yang, “Consistency of the nearest neighbor estimator of the density function for negatively associated samples,” Acta Mathematicae Applicatae Sinica, vol. 26, no. 3, pp. 385–394, 2003.
- Q. Y. Wu and P. Y. Chen, “Strong representation results of Kaplan-Meier estimator for censored NA data,” Journal of Inequalities and Applications, vol. 2013, article 340, 2013.
- L.-X. Zhang, “The weak convergence for functions of negatively associated random variables,” Journal of Multivariate Analysis, vol. 78, no. 2, pp. 272–298, 2001.
- V. V. Petrov, Limit Theorems of Probability Theory, vol. 4, Oxford University Press, New York, NY, USA, 1995.
- C. M. Newman, “Asymptotic independence and limit theorems for positively and negatively dependent random variables,” in Inequalities in Statistics and Probability, vol. 5 of IMS Lecture Notes Monogr. Ser., pp. 127–140, Institute of Mathematical Statistics, Hayward, Calif, USA, 1984.