Special Issue: Numerical Methods of Complex Valued Linear Algebraic System
Research Article | Open Access
The Local Linear M-Estimation with Missing Response Data
This paper studies the nonparametric regression function with missing response data. Three local linear M-estimators with the robustness of local linear regression smoothers are presented; they share the same asymptotic normality and consistency. Their finite-sample performance is then examined via simulation studies. The simulations demonstrate that the complete-case data M-estimator is not superior to the other two local linear M-estimators.
Local polynomial regression methods, which have advantages over popular kernel methods in terms of design adaptation and high asymptotic efficiency, have been demonstrated to be effective nonparametric smoothers. In addition, local polynomial regression smoothers can adapt to almost all regression settings and cope very well with edge effects. For details, see  and references therein. However, a drawback of these local regression estimators is their lack of robustness. It is well known that M-type regression estimators have many desirable robustness properties. As a result, M-type regression estimators are natural candidates.
In fact, several methods, such as kernel, local regression, spline, and orthogonal series methods, can estimate nonparametric functions. For an introduction to this subject area, see [1–3]. The local linear smoother, as an intuitively appealing method, has become popular in recent years because of its attractive statistical properties. As is shown in [4–7], local regression provides many advantages over modified kernel methods. Consequently, it is reasonable to expect that the local regression based M-type estimators carry over those advantages.
In the present paper, local M-type regression estimators are used to propose three M-estimators of the regression function with missing response data: the complete-case data M-estimator, the weighted M-estimator, and the estimated weighted M-estimator. These estimators have the same asymptotic normality and consistency. Finite-sample simulations show that the complete-case data M-estimator is not superior to the other two local linear M-estimators.
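In the regression analysis setup, the basic inference begins by considering a random sample in which the design point is always observed, while the response is observed for some units and missing for others, as recorded by a missingness indicator.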
Theoretically, this is a missing response problem. Recently, there has been considerable interest in nonparametric regression analysis with missing data, and many methods have been developed. Cheng (see [8]) employed a kernel regression imputation approach to define the estimator of the mean of the response. Hirano et al. (see [9]) defined a weighted estimator for the response mean when the response variable is missing. Then, Wang et al. (see [10]) developed estimation theory for semiparametric regression analysis in the presence of missing response. More recently, Liang [11] and Wang and Sun [12] discussed the generalized partially linear models with missing covariate data and the partially linear models with missing responses at random, respectively.
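To study the missing data (1), the MAR (missing at random) assumption would require that there exists a chance mechanism, the selection probability function, such that the conditional probability of observing the response given both the design point and the response equals the conditional probability given the design point alone, almost surely. In practice, (3) might be justifiable by the nature of the experiment when it is legitimate to assume that the missingness of the response mainly depends on the design point.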
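For concreteness, the missing-response setup and the MAR condition described above can be written as follows. This is a sketch in assumed notation: the symbols $X_i$ (design point), $Y_i$ (response), $\delta_i$ (missingness indicator), and $p(\cdot)$ (selection probability function) are chosen here for illustration and are standard in the missing-data literature.

```latex
% Assumed notation: X_i design point, Y_i response,
% \delta_i missingness indicator, p(x) selection probability.
\[
  \{(X_i, Y_i, \delta_i)\}_{i=1}^{n},
  \qquad
  \delta_i =
  \begin{cases}
    1, & \text{if } Y_i \text{ is observed},\\
    0, & \text{if } Y_i \text{ is missing},
  \end{cases}
\]
\[
  P(\delta = 1 \mid X, Y) \;=\; P(\delta = 1 \mid X) \;=\; p(X)
  \quad \text{almost surely.}
\]
```

Under this condition the chance of observing a response may vary with the design point but not with the unobserved response itself.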
The paper is organized as follows. Some notations and preliminary results about three M-estimators of the regression function with missing response data, including the local M-estimator with the complete-case data, the weighted M-estimator, and the estimated weighted M-estimator, are given in Section 2. The asymptotic normality and consistency of the three M-estimators are then presented in Section 3. In Section 4, simulation studies give some comparison results for the proposed estimators. Sketches of the proofs are given in Section 5.
2. Model and Estimators
2.1. The Models
The nonparametric regression model that we will consider for the incomplete data (1) expresses each response variable as the regression function evaluated at the design point plus a regression error. The regression errors are conveniently assumed to be independent and identically distributed (i.i.d.) random variables with mean 0 and finite variance. Furthermore, the conditional variance of the response is allowed to depend on the design point, allowing for heteroscedasticity. To simplify our preliminary discussion, the covariate will be assumed to be real-valued.
2.2. The Local M-Estimator with the Complete-Case Data
The local M-estimator with the complete-case data is defined as the solution of the following problem: find a local intercept and a local slope that minimize a kernel-weighted sum of losses over the complete cases or, equivalently, that satisfy the local estimating equations (6). Here the loss is a given outlier-resistant function whose derivative appears in the estimating equations, the local weights come from a kernel function, and the bandwidth is a sequence of positive numbers tending to zero.
The M-estimates of the regression function and of its derivative at the given point are defined as the two components of the solution to system (6).
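The complete-case local linear M-estimator can be sketched in code as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes the Huber loss as the outlier-resistant function, the Epanechnikov kernel, a tuning constant `c` expressed in raw response units (no scale estimate), and iteratively reweighted least squares (IRLS) as the solver. The function and argument names are hypothetical.

```python
import numpy as np


def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)


def huber_irls_weight(r, c):
    """psi(r) / r for the Huber loss: 1 if |r| <= c, else c / |r|."""
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))


def local_linear_m(x0, x, y, delta, h, c=1.345, n_iter=100, tol=1e-10):
    """Complete-case local linear M-estimate (intercept, slope) at x0.

    Only observations with delta == 1 are used. The estimating equations
    are solved approximately by IRLS, starting from the kernel-weighted
    least squares fit. Huber loss and IRLS are assumptions of this sketch.
    """
    keep = np.asarray(delta).astype(bool)
    d = np.asarray(x, dtype=float)[keep] - x0      # centred design points
    yc = np.asarray(y, dtype=float)[keep]          # observed responses only
    k = epanechnikov(d / h)                        # kernel weights
    Z = np.column_stack([np.ones_like(d), d])      # local linear design
    w = k.copy()                                   # first pass: least squares
    beta = np.full(2, np.inf)
    for _ in range(n_iter):
        A = Z.T @ (w[:, None] * Z)
        rhs = Z.T @ (w * yc)
        new_beta = np.linalg.lstsq(A, rhs, rcond=None)[0]
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
        # Down-weight large residuals according to the Huber psi function.
        w = k * huber_irls_weight(yc - Z @ beta, c)
    return float(beta[0]), float(beta[1])
```

The intercept estimates the regression function at `x0` and the slope estimates its derivative; a single gross outlier inside the kernel window shifts the intercept far less than it would under local least squares.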
2.3. The Weighted M-Estimator
If the local intercept and slope are instead chosen to minimize the corresponding weighted objective or, equivalently, to satisfy the weighted local estimating equations (8), then the solutions are called the weighted M-estimator.
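The difference between the complete-case and the weighted objectives can be illustrated with a single formula. This is a sketch under assumptions: the notation ($\rho$ for the outlier-resistant function, $K$ for the kernel, $h_n$ for the bandwidth, $a$ and $b$ for the local intercept and slope, $\delta_i$ for the missingness indicator, $p$ for the selection probability function) is chosen here for illustration, and the inverse-probability form of the weight is the usual construction in the weighted-estimator literature rather than something confirmed by the text above.

```latex
% Assumed notation; the inverse-probability weight is an assumption.
\[
  \min_{a,\,b}\;
  \sum_{i=1}^{n}
  \rho\bigl(Y_i - a - b\,(X_i - x)\bigr)\,
  K\!\left(\frac{X_i - x}{h_n}\right)\omega_i ,
\]
\[
  \omega_i = \delta_i \quad \text{(complete case)},
  \qquad
  \omega_i = \frac{\delta_i}{p(X_i)} \quad \text{(weighted)}.
\]
```

In both cases only units with an observed response contribute, but the weighted version gives more influence to observed responses from regions where responses are frequently missing.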
2.4. The Estimated Weighted M-Estimator
In practice, the selection probability function is usually unknown. To estimate the selection probabilities, we apply the local linear smoother to the missingness indicators, choosing a local intercept and slope so that a kernel-weighted sum of squares is minimized. A straightforward calculation yields a closed-form estimator (10). With the estimator defined in (10), the estimated weighted M-estimator is obtained by replacing the unknown selection probabilities with their estimates and then minimizing the resulting objective or, equivalently, satisfying the local estimating equations (13). The solutions of system (13) define the estimated weighted M-estimator.
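The closed form of the local linear smoother applied to the missingness indicators can be sketched as follows. This is an illustrative sketch, not the authors' code: it uses the standard local linear weights, assumes the Epanechnikov kernel, and adds a clipping step (the `eps` floor) as a practical safeguard that is not taken from the text. Function and argument names are hypothetical.

```python
import numpy as np


def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)


def selection_prob_ll(x0, x, delta, h, eps=1e-3):
    """Local linear estimate of the selection probability at x0.

    Uses the standard closed form
        p_hat(x0) = sum_i w_i * delta_i / sum_i w_i,
        w_i = K((X_i - x0) / h) * (S2 - (X_i - x0) * S1),
        S_j = sum_i K((X_i - x0) / h) * (X_i - x0)^j.
    The result is clipped to [eps, 1]; the clipping is an assumption of
    this sketch, added so that inverse weights stay finite.
    """
    d = np.asarray(x, dtype=float) - x0
    k = epanechnikov(d / h)
    s1 = np.sum(k * d)
    s2 = np.sum(k * d**2)
    w = k * (s2 - d * s1)
    denom = np.sum(w)
    if denom <= 0.0:
        # Too few design points inside the kernel window.
        return float("nan")
    p_hat = np.sum(w * np.asarray(delta, dtype=float)) / denom
    return float(np.clip(p_hat, eps, 1.0))
```

The estimated probabilities can then replace the unknown selection probabilities in the weights of the weighted estimator. Because a local linear fit can fall outside the unit interval, some form of truncation is commonly applied before taking reciprocals.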
3. Main Results
In this section, we present the main results of this paper on the asymptotic distribution and consistency of the three estimators. The following conditions will be used in the rest of this section.
(1) The regression function has a continuous second derivative at the given point and is continuous and bounded on support field .
(2) The sequence of bandwidths tends to zero such that .
(3) The design density is continuous at the given point and .
(4) The kernel function is a continuous probability density function with bounded support field .
(5) with .
(6) The function is continuous and has a derivative . Further, assume that and are positive and continuous at the given point and there exists such that is bounded in a neighborhood of .
(7) The function satisfies that as , uniformly in a neighborhood of .
In addition, for the convenience of representation and proof, will denote three estimators , , and , and will denote the estimators , , and . Lastly, assume for .
The following theorems show that the three M-estimators mentioned above have the same consistency and asymptotic normality.
Theorem 1. Under conditions (1)–(7), are bounded; furthermore, also satisfies Lipschitz’s condition; that is, for . If , then
Theorem 2. Under conditions (1)–(7), are bounded and also satisfies Lipschitz’s condition; that is, for . If , then where and .
4. Simulation Studies
In this section, we conduct simulations to better understand the finite-sample performance of the three M-estimators. We compare the biases, the sample mean square errors (MSE), and the sample mean average square errors (MASE) of the three M-estimators. The MASE of an estimator is defined as an average of its squared errors over a set of grid points.
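The three comparison criteria can be computed from replicated fits roughly as follows. This is a sketch under assumptions: the exact normalisation of the MASE is not reproduced here, so the code uses a common form (the pointwise MSE averaged over the grid), and the function and argument names are hypothetical.

```python
import numpy as np


def summarize_fits(estimates, truth):
    """Pointwise bias and MSE over replications, and their grid average.

    estimates: array of shape (R, N), one row per simulated data set and
               one column per grid point.
    truth:     array of shape (N,), the true regression function on the grid.
    Returns (bias, mse, mase), where bias and mse have shape (N,) and mase
    is the mean of mse over the grid (an assumed definition).
    """
    err = np.asarray(estimates, dtype=float) - np.asarray(truth, dtype=float)
    bias = err.mean(axis=0)          # average error at each grid point
    mse = (err**2).mean(axis=0)      # average squared error at each grid point
    mase = float(mse.mean())         # average of the pointwise MSE
    return bias, mse, mase
```

Applying the same function to the fits of each of the three estimators, on a common grid, gives directly comparable curves and a single summary number per estimator.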
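A fixed sample size was used and a nonparametric regression model was considered, in which the design points are drawn from the uniform (0,1) distribution and the regression errors are a random sample drawn independently of the design points. The kernel function was taken as the Epanechnikov kernel, $K(u) = \tfrac{3}{4}(1 - u^2)$ for $|u| \le 1$ and $0$ otherwise. To generate the missingness indicators, a selection probability function was chosen and pseudo i.i.d. uniform random variables were generated; each indicator was then set by comparing the uniform variable with the selection probability at the corresponding design point. In this way, 500 independent sets of data were generated.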
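The data-generating step described above can be sketched as follows. All concrete choices in the usage note (sample size, regression function, noise level, selection probability) are hypothetical placeholders, since the values used in the study are not reproduced here; the normal error distribution is likewise an assumption of this sketch.

```python
import numpy as np


def simulate_missing_response(n, m, p, sigma, rng):
    """Generate one data set with responses missing at random.

    n:     sample size
    m:     regression function (vectorised callable)
    p:     selection probability function (vectorised callable)
    sigma: standard deviation of the normal errors (assumed distribution)
    rng:   a numpy.random.Generator
    Returns (x, y, delta); y is NaN wherever delta == 0.
    """
    x = rng.uniform(0.0, 1.0, size=n)              # design points on (0, 1)
    y_full = m(x) + rng.normal(0.0, sigma, size=n)
    u = rng.uniform(0.0, 1.0, size=n)              # pseudo uniform variables
    delta = (u <= p(x)).astype(int)                # 1 = response observed
    y = np.where(delta == 1, y_full, np.nan)       # hide missing responses
    return x, y, delta
```

For example, `simulate_missing_response(200, lambda x: np.sin(2 * np.pi * x), lambda x: np.full_like(x, 0.8), 0.3, np.random.default_rng(0))` produces one data set in which roughly 80 percent of the responses are observed. Repeating the call 500 times with the same generator yields independent replications.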
For each of the three estimators, the optimal bandwidth was obtained from the 500 simulations: an optimal bandwidth was determined in each simulation, and these were then combined into a single bandwidth. With this bandwidth, we compare the bias, MSE, and MASE of the three M-estimators. The comparison results are shown in Figures 1–3.
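Figure 1 shows the biases of the three estimators. It is easy to see that the complete-case data M-estimator has considerable bias, while the weighted M-estimator and the estimated weighted M-estimator are very close at most points.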
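Figure 3 shows the MASE of the three estimators. We can see that the complete-case data M-estimator has considerable MASE, while the weighted M-estimator and the estimated weighted M-estimator are very close at most points.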
The comparison of the biases, the MSE, and the MASE of the three M-estimators in Figures 1, 2, and 3 shows that the weighted M-estimator and the estimated weighted M-estimator are clearly superior to the complete-case data M-estimator, while there is no appreciable difference between the weighted M-estimator and the estimated weighted M-estimator.
5. The Proofs of Theorems
Lemma 3. Under conditions (1)–(7) and for any random sequence , if and , then where .
Proof. Since the second equality can be obtained from the first one, we only prove the first equality. It is obvious that where Similar to the proof of Lemma 4 in , we have The following only proves . Let for any given and Then we have By condition (7) and noticing that in the above expressions, it is not difficult to see where and are two sequences of positive numbers, tending to zero as . Since , it follows that with . The conclusion is obtained coming from the fact that .
Lemma 4. Under conditions (1)–(7) and for any random sequence , if and , then where .
Proof. The proof is similar to the proof of Lemma 3.
Lemma 5. Under conditions (1)–(7), is asymptotically normal and where and .
Proof. Let . Then is a sum of i.i.d. random variables with mean zero and variance with Similar to the proof of Lemma 4 in , we can easily obtain asymptotic expression of , namely, and easily verify Lyapunov’s condition via using condition (6). That is, is asymptotically normal. With (28) we have This completes the proof of this lemma.
Lemma 6. Under conditions (1)–(7), is asymptotically normal and where and .
Proof. The proof is similar to the proof of Lemma 5.
Lemma 7. Under conditions (1)–(7), and are bounded; furthermore, also satisfies Lipschitz’s condition; that is, for . If , then the following equalities hold: ①;② ;③ .
Proof. We prove ① firstly.①Since equality (31) is changed into Now, we denote if . It is easy to see that Then we have with the method of the kernel density estimate that where By the operations property of (2) for an integer , Since and , Let . The following will prove In fact, (40) holds only if holds. Then, where and . It follows from (39) and the definition of and that Further, (41) and (42) yield . Again, since , which indicates if is sufficiently large. As a result, equality (40) holds. We conclude from (38) and (39) that this completes the proof of ①.②Similar to the proof of Theorem 1 in , one can easily obtain As a consequence, we have with (44) that This completes the proof of ②. ③The proof is similar to the proof of ②.
Lemma 8. Under conditions (1)–(7), if and when , then where is a compact set, , and is the estimate value of .
Proof. The same arguments as those of Theorem 2 in  can yield the proof of this lemma.
Lemma 9. Under conditions (1)–(7), and are bounded; furthermore, also satisfies Lipschitz’s condition; that is, for . If , then is asymptotically normal and , where and .
Proof of Theorem 1. The proof proceeds by the following two cases.
(i) If either or , the proof of this theorem is similar to the proof of Theorem 1 in .
(ii) If , similar to the proof of Theorem 1 in , the proof of this theorem is obtained immediately by the first equality of Lemmas 7 and 8. It is easy to see from (i) and (ii) that we complete the proof of this theorem.
Proof of Theorem 2. We can prove the conclusion of this theorem by the following three cases.
(1) If , let where is the estimate of . Then Using (6), we get Again, where By Lemma 3, we have that It follows from the consistency shown in (15) and condition (7) that Since and , we have with (50) and (54) that Therefore, By Lemma 5 and Slutsky’s Theorem, we get (16); that is,
(2) If , similar to the proof case , the proof can be obtained immediately by Lemmas 4 and 6.
(3) The following will prove the case when . Let where is the estimate of . Then we have By (13), one can get It then follows from the first equality of Lemma 7, Lemma 8, and (61) that where By Lemma 4, it is easy to see According to the first equality of Lemmas 7 and 8, the consistency shown in (15), and the condition (7), we have Since and , we have with (61) and (65) that Therefore, By Lemma 9 and Slutsky’s Theorem, we get (16); that is, It is easy to see from cases (1), (2), and (3) that we complete the proof.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was partly supported by the Science Foundation of the Education Department of Shaanxi Province of China (2013JK0593), the Scientific Research Foundation (BS1014) and the Education Reform Foundation (2012JG40) of Xi’an Polytechnic University, and the National Natural Science Foundation of China (11201362, 11271297, and 1101325).
References
[1] J. Fan and I. Gijbels, Local Polynomial Modelling and Its Applications, Chapman and Hall, London, UK, 1996.
[2] P. J. Green and B. W. Silverman, Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach, Chapman and Hall, London, UK, 1994.
[3] D. M. Titterington and G. M. Mill, “Kernel-based density estimates from incomplete data,” Journal of the Royal Statistical Society B: Methodological, vol. 45, no. 2, pp. 258–266, 1983.
[4] J. Fan, “Local linear regression smoothers and their minimax efficiencies,” The Annals of Statistics, vol. 21, no. 1, pp. 196–216, 1993.
[5] T. Hastie and C. Loader, “Local regression: automatic kernel estimators of regression curves,” Annals of Statistics, vol. 15, pp. 182–201, 1993.
[6] T. Orchard and M. A. Woodbury, “A missing information principle: theory and applications,” in Proceedings of the 6th Berkeley Symposium on Mathematical Statistics and Probability, vol. 3, pp. 697–715, University of California, June-July 1970.
[7] D. Ruppert and M. P. Wand, “Multivariate locally weighted least squares regression,” The Annals of Statistics, vol. 22, no. 3, pp. 1346–1370, 1994.
[8] P. E. Cheng, “Nonparametric estimation of mean functionals with data missing at random,” Journal of the American Statistical Association, vol. 89, no. 425, pp. 81–87, 1994.
[9] K. Hirano, G. W. Imbens, and G. Ridder, “Efficient estimation of average treatment effects using the estimated propensity score,” Econometrica, vol. 71, no. 4, pp. 1161–1189, 2003.
[10] Q. Wang, O. Linton, and W. Härdle, “Semiparametric regression analysis with missing response at random,” Journal of the American Statistical Association, vol. 99, no. 466, pp. 334–345, 2004.
[11] H. Liang, “Generalized partially linear models with missing covariates,” Journal of Multivariate Analysis, vol. 99, no. 5, pp. 880–895, 2008.
[12] Q. Wang and Z. Sun, “Estimation in partially linear models with missing responses at random,” Journal of Multivariate Analysis, vol. 98, no. 7, pp. 1470–1493, 2007.
[13] J. Fan and I. Gijbels, “Variable bandwidth and local linear regression smoothers,” The Annals of Statistics, vol. 20, no. 4, pp. 2008–2036, 1992.
[14] R. J. Carroll, J. Fan, I. Gijbels, and M. P. Wand, “Generalized partially linear single-index models,” Journal of the American Statistical Association, vol. 92, no. 438, pp. 477–489, 1997.
[15] J. Fan and J. Jiang, “Variable bandwidth and one-step local M-estimator,” Science in China A, vol. 29, no. 1, pp. 688–702, 1999.
Copyright © 2014 Shuanghua Luo et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.