Statistical Inference for the Heteroscedastic Partially Linear Varying-Coefficient Errors-in-Variables Model with Missing Censoring Indicators

Zou, Yuye; Wu, Chengxin

doi:https://doi.org/10.1155/2021/1141022

Discrete Dynamics in Nature and Society


Research Article | Open Access

Volume 2021 | Article ID 1141022 | https://doi.org/10.1155/2021/1141022


Yuye Zou^1,2 and Chengxin Wu^3,4

Academic Editor: Chris Goodrich

Received 16 Apr 2021

Accepted 26 May 2021

Published 07 Jun 2021

Abstract

In this paper, we focus on heteroscedastic partially linear varying-coefficient errors-in-variables models under right-censored data with censoring indicators missing at random. Based on regression calibration, imputation, and inverse probability weighted methods, we define a class of modified profile least square estimators of the parameter and local linear estimators of the coefficient function, which are applied to constructing estimators of the error variance function. In order to improve the estimation accuracy and take into account the heteroscedastic error, reweighted estimators of the parameter and coefficient function are developed. At the same time, we apply the empirical likelihood method to construct confidence regions and maximum empirical likelihood estimators of the parameter. Under appropriate assumptions, the asymptotic normality of the proposed estimators is studied. The strong uniform convergence rate for the estimators of the error variance function is considered. Also, the asymptotic chi-squared distribution of the empirical log-likelihood ratio statistics is proved. A simulation study is conducted to evaluate the finite sample performance of the proposed estimators. Meanwhile, one real data example is provided to illustrate our methods.

1. Introduction

In regression analysis, flexible and refined statistical regression models have long been widely applied in both theoretical study and practice. The main results for parametric and nonparametric regression models are rather mature. Semiparametric regression models reduce the high risk of misspecification associated with parametric regression models and avoid the “curse of dimensionality” of nonparametric regression models. Thanks to these advantages, semiparametric regression models have received considerable attention from statisticians. Semiparametric regression models take various forms. In particular, the partially linear varying-coefficient errors-in-variables (PLVCEV) model, a typical example, was introduced by You and Chen [1] and has the following form: where is the response variable, are the covariates, is a -dimensional vector of unknown parameters, is an unknown -dimensional vector of coefficient functions, and is the random error. The measurement error is independent of with mean zero and covariance matrix . In order to identify the model, is assumed to be known.
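For orientation, a PLVCEV model of this kind is commonly written in the form below. The symbols ($Y$, $X$, $W$, $U$, $V$, $e$, $\Sigma_e$) are a notational assumption made for this sketch and may differ from the notation of model (1).

```latex
\begin{aligned}
Y &= X^{\top}\beta + W^{\top}\alpha(U) + \varepsilon, \\
V &= X + e,
\end{aligned}
\qquad E(e) = 0, \quad \operatorname{Cov}(e) = \Sigma_e .
```

In this notation, $X$ is the covariate measured with error and observed only through the surrogate $V$, $\beta$ is the parametric component, and $\alpha(\cdot)$ is the vector of coefficient functions.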

As a general and flexible semiparametric model, model (1) includes a variety of models of interest. When is observed exactly, model (1) reduces to the PLVC model [2, 3]. When , and is observed exactly, model (1) reduces to the partially linear regression model [4]. When is observed exactly, and is a constant vector, model (1) becomes a linear regression model. When and , model (1) reduces to the partially linear EV model [5]. For model (1), You and Chen [1] proposed estimators of the parametric and nonparametric components and established their asymptotic properties. Liu and Liang [6] established the asymptotic normality of the jackknife estimator of the error variance and the standard chi-squared limiting distribution of the jackknife empirical log-likelihood statistic. Fan et al. [7] developed penalized profile least squares estimation of the parametric and nonparametric components of the model.

The literature mentioned above assumes that the random errors are homoscedastic, which means that the random error is independent of . However, in many practical applications, the error variance may change with the variables. Heteroscedastic error models have therefore attracted the attention of many scholars. For example, You et al. [8] considered the estimation of the parametric and nonparametric parts of partially linear regression models with heteroscedastic errors. Fan et al. [9] constructed confidence regions for the parameter of the heteroscedastic PLVCEV model based on the empirical likelihood method. Shen et al. [10] discussed estimation and inference for the PLVC model with heteroscedastic errors. Xu and Duan [11] extended the results of Shen et al. [10] to efficient estimation for the PLVCEV model with heteroscedastic errors.

The works above assume that the responses are observed completely. However, in many practical fields, especially in biomedical studies and survival analysis, the response cannot be completely observed because of censoring. Huang and Huang [12, 13] constructed confidence regions for the parameters of the varying-coefficient single-index model and the partially linear single-index EV model, respectively, by the empirical likelihood method under censored data. These results require that the censoring indicators always be observed. However, the censoring indicators may not be observed completely. For example, determining whether the death of an individual is attributable to the cause of interest may require information that was not gathered or was lost for various reasons [14]. In this paper, we assume that the censoring indicators are missing at random (MAR), which is a common and reasonable assumption in statistical analysis with missing data [15]. There are many works related to missing censoring indicators. For example, Wang and Dinse [16] and Li and Wang [17] proposed weighted least squares estimators of the unknown parameter and proved their asymptotic normality for the linear regression model. Shen and Liang [18] discussed estimation and variable selection for the PLVC quantile regression model. Wang et al. [19] considered composite quantile regression for the linear regression model. However, there is no literature focusing on estimation and confidence regions for heteroscedastic error models with right-censored data when the censoring indicators are MAR.

In this paper, we consider modified profile least squares (PLS) estimators of the unknown parameter and local linear estimators of the coefficient function. Besides point estimation, we are also interested in interval estimation by the empirical likelihood (EL) method. First introduced by Owen [20], EL is a very effective method for constructing confidence regions and enjoys many nice properties compared with normal approximation-based methods and the bootstrap approach. Thanks to these advantages, there is a large literature on EL methods. For instance, Fan et al. [21] considered penalized EL for the high-dimensional PLVCEV model. Wang and Drton [22] established estimation for linear structural equation models with dependent errors based on the EL method. Fan et al. [23] discussed weighted EL for the heteroscedastic varying-coefficient partially nonlinear model with missing data. Zou et al. [24] considered EL inference for the partially linear single-index EV model with missing censoring indicators.

Studying the PLVCEV model with heteroscedastic errors when the censoring indicators are MAR is therefore both novel and of interest. We consider estimation based on the modified PLS method and confidence regions based on EL inference. The main aims of this paper are the following: (1) define a class of modified PLS estimators of the parameter and local linear estimators of the coefficient function based on regression calibration, imputation, and inverse probability weighted approaches, and prove the asymptotic normality of the proposed estimators; (2) construct reweighted estimators of the parameter and coefficient function based on estimators of the error variance function, and establish the asymptotic properties of the proposed estimators; (3) establish the asymptotic standard chi-squared distribution of the empirical log-likelihood ratio functions, construct confidence regions for the parameter, and derive the asymptotic distribution of the corresponding maximum EL estimators. Finally, a simulation study and a real data analysis are conducted to demonstrate the finite sample performance of the proposed procedures.

The rest of this paper is organized as follows. In Section 2, we construct modified PLS estimators of the parameter and local linear estimators of the coefficient function. In Section 3, we propose empirical log-likelihood ratio statistics and maximum EL estimators. The main results are given in Section 4. Section 5 presents the simulation study and the real data analysis. Section 6 concludes. The proofs of the main results are given in the Appendix.

2. Methodology

Suppose that is a sample from model (1), that is, where the model error satisfies and , which is an unknown function of representing the heteroscedastic error. In practical applications, the response may be right censored for various reasons. Let be the censoring time with distribution function (df) . One can only observe with df and censoring indicator . Define the missing indicator to be , which is 0 if is missing; otherwise, it is 1. Throughout this article, we assume that is independent of , and is MAR, which implies that and are conditionally independent given , i.e.,
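A minimal sketch of the observation scheme described above: the response is right censored, and the censoring indicator is itself missing at random. The distributions, the function name `simulate_observed`, and the logistic form of the observation probability are assumptions chosen for illustration; conditioning on the observed response alone is a simplification.

```python
import math
import random


def simulate_observed(n, seed=0):
    """Return n tuples (z, delta_or_None, xi) under right censoring with MAR indicators."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = rng.gauss(1.0, 1.0)      # latent response (hypothetical distribution)
        c = rng.gauss(1.5, 1.0)      # censoring time (hypothetical distribution)
        z = min(y, c)                # what is actually observed
        delta = 1 if y <= c else 0   # censoring indicator: 1 means uncensored
        # MAR: the chance of observing delta depends on z only, never on delta itself.
        p_obs = 1.0 / (1.0 + math.exp(-(1.0 + 0.5 * z)))
        xi = 1 if rng.random() < p_obs else 0   # missing indicator: 0 means delta is missing
        rows.append((z, delta if xi == 1 else None, xi))
    return rows
```

Each returned row keeps the censoring indicator only when the missing indicator equals 1, which mirrors the data that an analyst would actually see.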

2.1. Modified Profile Least Squares Estimation

The local linear regression technique is employed to estimate the coefficient function . If has a continuous second derivative at the point , then for in a small neighborhood of , one can approximate by the following Taylor expansion: where . Then, can be estimated by minimizing the following objective function: where is a kernel function, and is a bandwidth sequence. Because of the missing indicators, some cannot be observed. Therefore, model (2) cannot be applied directly. One can replace with its conditional expectation . Thus, can be defined as the minimizer of
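The local linear step can be sketched for a single coefficient function as follows. This is a simplified illustration that assumes a scalar covariate `w`, an Epanechnikov kernel, and a fully observed working response `y`; the names are hypothetical and the paper's estimator handles vectors and surrogate responses.

```python
def epanechnikov(t):
    """Symmetric kernel with compact support [-1, 1] (an assumed choice)."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def local_linear_coef(u0, u, w, y, h):
    """Minimize sum_i [y_i - w_i * (a + b * (u_i - u0))]^2 * K((u_i - u0) / h) over (a, b).

    Returns (a, b): a estimates alpha(u0) and b estimates its derivative.
    """
    s00 = s01 = s11 = t0 = t1 = 0.0
    for ui, wi, yi in zip(u, w, y):
        k = epanechnikov((ui - u0) / h)
        d0 = wi                 # regressor multiplying a
        d1 = wi * (ui - u0)     # regressor multiplying b
        s00 += k * d0 * d0
        s01 += k * d0 * d1
        s11 += k * d1 * d1
        t0 += k * d0 * yi
        t1 += k * d1 * yi
    det = s00 * s11 - s01 * s01
    if abs(det) < 1e-12:
        raise ValueError("too few points in the kernel window")
    a = (s11 * t0 - s01 * t1) / det   # solve the 2x2 weighted normal equations
    b = (s00 * t1 - s01 * t0) / det
    return a, b
```

When the coefficient function is exactly linear and there is no noise, the local linear fit reproduces it exactly, which gives a convenient sanity check.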

However, in practice, the function is usually unknown. One can use parametric or nonparametric methods to estimate . However, when the covariates are high-dimensional, nonparametric estimation may suffer from “the curse of dimensionality.” Hence, throughout this paper, we assume that follows a parametric model , where is an unknown parameter vector. Following Wang and Dinse [16], the estimator of can be obtained by maximizing the following likelihood function:
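As an illustration of this step, the sketch below fits a two-parameter logistic model for the probability that the censoring indicator equals 1, using only the cases in which the indicator is observed. The logistic form with a single scalar covariate `z`, the Newton iteration, and the function names are assumptions for this example.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_logistic(z, delta, xi, iters=50, tol=1e-10):
    """Maximize sum_i xi_i * [delta_i*log m_i + (1-delta_i)*log(1-m_i)], m_i = sigmoid(t0 + t1*z_i)."""
    t0 = t1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for zi, di, xii in zip(z, delta, xi):
            if xii == 0:          # indicator missing: the case does not enter the likelihood
                continue
            p = sigmoid(t0 + t1 * zi)
            r = di - p            # score contribution
            v = p * (1.0 - p)     # information contribution
            g0 += r
            g1 += r * zi
            h00 += v
            h01 += v * zi
            h11 += v * zi * zi
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break
        s0 = (h11 * g0 - h01 * g1) / det   # Newton step
        s1 = (h00 * g1 - h01 * g0) / det
        t0 += s0
        t1 += s1
        if abs(s0) + abs(s1) < tol:
            break
    return t0, t1
```

The fitted probabilities can then be plugged in wherever the unknown conditional expectation of the censoring indicator is needed.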

Let . Since , we replace with its estimator . Hence, can be estimated by minimizing the following objective function: where is the estimator of , which is defined by which is the Nadaraya–Watson estimator of with the kernel function and bandwidth sequence .
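A Nadaraya–Watson estimator is a kernel-weighted local average. The sketch below shows the scalar case with an Epanechnikov kernel; the kernel choice and the function names are assumptions for illustration.

```python
def epanechnikov(t):
    """Symmetric kernel with compact support [-1, 1] (an assumed choice)."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def nw_estimate(x0, x, y, h, kernel=epanechnikov):
    """Nadaraya-Watson estimate of E[y | x = x0] with bandwidth h."""
    num = 0.0
    den = 0.0
    for xi, yi in zip(x, y):
        k = kernel((xi - x0) / h)
        num += k * yi
        den += k
    if den == 0.0:
        raise ValueError("no observations within the kernel window")
    return num / den
```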

For notational simplicity, let , , ,

If is known, one can obtain the local linear estimator of coefficient function by

Substituting (11) into the original model (8) and eliminating bias produced by the measurement error, we get the following modified PLS estimator of the parameter based on regression calibration method,

Then, the local linear regression estimator of is defined as follows:

Let . Since under the missing mechanism, we can impute with in expression (6). Hence, can be estimated by minimizing the following objective function:

If is known, one can obtain the local linear estimator of the coefficient function by where . Substituting (21) into the original model and eliminating the bias produced by the measurement error, we obtain the following modified PLS estimator of based on the imputation method:

Thus, the local linear regression estimator of is defined as follows:

Let . Note that under MAR assumption. Hence, we substitute with , where is a nonparametric estimator of with kernel function and bandwidth sequence . Hence, can be estimated by minimizing the following objective function:
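The three ways of handling a missing censoring indicator can be compared side by side. The formulas below follow the construction that is usual in the missing-censoring-indicator literature (for example, [16, 17]) and are stated here as an assumption; `m_hat` denotes an estimate of the conditional probability that the indicator equals 1, and `pi_hat` an estimate of the probability that the indicator is observed.

```python
def surrogate_indicators(delta, xi, m_hat, pi_hat):
    """Return (rc, im, ipw) surrogates for a possibly missing censoring indicator.

    delta  : observed indicator (ignored when xi == 0, may be None)
    xi     : 1 if delta is observed, 0 if it is missing
    m_hat  : estimate of P(delta = 1 | data), assumed to lie in [0, 1]
    pi_hat : estimate of P(xi = 1 | data), assumed to lie in (0, 1]
    """
    d = delta if xi == 1 else 0                 # delta is only used when observed
    rc = m_hat                                  # regression calibration: always use m_hat
    im = xi * d + (1 - xi) * m_hat              # imputation: fill in m_hat only when missing
    ipw = xi * d / pi_hat + (1.0 - xi / pi_hat) * m_hat   # inverse probability weighting
    return rc, im, ipw
```

All three surrogates share the same conditional mean as the true indicator under MAR when `m_hat` and `pi_hat` are correct, which is why each can replace the unobserved indicator in the objective function.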

If is known, one can obtain the local linear estimator of the coefficient function by where . Substituting (19) into the original model and eliminating the bias produced by the measurement error, we get the following modified PLS estimator of based on the inverse probability weighted method:

Hence, the local linear regression estimator of is defined as follows:

2.2. Estimation for Error Variance

In order to improve the estimation of the parametric and nonparametric parts, we construct local linear estimators of the error variance function in this subsection. Note that . By minimizing the following objective function with respect to , the local linear regression estimator of based on the regression calibration method is defined by where the weight function is defined by with

Note that

By minimizing the following objective function with respect to , the local linear regression estimator of based on the imputation method is defined by where the weight function is defined by with

Note that

By minimizing the following objective function with respect to , the local linear regression estimator of based on the inverse probability weighted method is defined by where the weight function is defined by with

2.3. Reweighted Estimation

In this subsection, we construct reweighted estimators of the parametric and nonparametric parts based on the error variance estimator given in (23). By minimizing the following objective function, we get the following reweighted estimator of based on the regression calibration method:

Furthermore, the reweighted estimator of the coefficient function is defined by

Similarly, based on the error variance estimator given in (28) and minimizing the following objective function, we get the reweighted estimator of based on the imputation method:

Hence, the reweighted estimator of the coefficient function is defined by

From the error variance estimator given in (33) and minimizing the following objective function, we get the reweighted estimator of based on the inverse probability weighted method:

Thus, the reweighted estimator of the coefficient function is defined by
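The idea behind reweighting is weighted least squares with weights equal to the inverse of the estimated error variance, combined with a correction for the measurement-error bias. The sketch below is a deliberately simplified scalar version without the varying-coefficient part; the function name, the argument `me_var` (a known measurement-error variance), and the scalar setting are assumptions for illustration.

```python
def reweighted_slope(v, y, sigma2, me_var=0.0):
    """Variance-weighted least squares slope with an attenuation correction.

    v      : error-prone covariate values
    y      : responses
    sigma2 : estimated error variances, one per observation (must be positive)
    me_var : known measurement-error variance of v (0.0 means no measurement error)
    """
    num = 0.0
    den = 0.0
    for vi, yi, s in zip(v, y, sigma2):
        if s <= 0.0:
            raise ValueError("variance estimates must be positive")
        num += vi * yi / s
        den += (vi * vi - me_var) / s   # subtracting me_var removes the attenuation bias
    if den == 0.0:
        raise ValueError("degenerate design")
    return num / den
```

Observations with a larger estimated variance receive less weight, which is the source of the efficiency gain over the unweighted estimator.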

3. Empirical Likelihood

Confidence regions for the parameter can be constructed from the asymptotic distributions in Theorems 1 and 4. However, estimating the asymptotic covariance is quite complicated. In this section, we employ the EL method to construct confidence regions for , which avoids estimating the complicated covariance.

3.1. Regression Calibration Empirical Likelihood

We introduce the following auxiliary random vector based on regression calibration method:

Thus, we define the empirical log-likelihood ratio function as follows:

The optimal value of satisfying (46) is given by , where is the solution to the equation . By the Lagrange multiplier method, the corresponding empirical log-likelihood ratio function is represented as

By maximizing , we can obtain a maximum EL estimator of with regression calibration method.
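The Lagrange multiplier computation can be sketched for a scalar auxiliary variable. Given values `eta` of the auxiliary variable at a candidate parameter, the multiplier solves a monotone score equation, which is found here by bisection, and the statistic returned is minus twice the log empirical likelihood ratio. The scalar setting, the bisection solver, and the function name are assumptions for this illustration.

```python
import math


def el_log_ratio(eta, iters=200):
    """Return 2 * sum_i log(1 + lam * eta_i), where lam solves sum_i eta_i / (1 + lam * eta_i) = 0."""
    lo_eta, hi_eta = min(eta), max(eta)
    if not (lo_eta < 0.0 < hi_eta):
        return math.inf              # zero lies outside the convex hull: ratio is zero
    eps = 1e-10
    lo = -1.0 / hi_eta + eps         # keeps every 1 + lam * eta_i positive
    hi = -1.0 / lo_eta - eps

    def score(lam):
        return sum(e / (1.0 + lam * e) for e in eta)

    for _ in range(iters):           # score is strictly decreasing in lam on (lo, hi)
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log(1.0 + lam * e) for e in eta)
```

For example, with auxiliary values -1 and 2 the multiplier is 1/4, so the statistic equals 2 log(1.125); when the values average to zero the multiplier is zero and the statistic vanishes.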

3.2. Imputation Empirical Likelihood

We introduce the following auxiliary random vector based on imputation method:

Hence, we define the empirical log-likelihood ratio function as follows:

The optimal value of satisfying (49) is given by , where is the solution to the equation . By the Lagrange multiplier method, the corresponding empirical log-likelihood ratio function is

By maximizing , we can obtain a maximum EL estimator of with imputation method.

3.3. Inverse Probability Weighted Empirical Likelihood

We introduce the following auxiliary random vector based on inverse probability weighted method:

Then, we define the empirical log-likelihood ratio function as follows:

The optimal value of satisfying (52) is given by , where is the solution to the equation . By the Lagrange multiplier method, the corresponding empirical log-likelihood ratio function is represented as

By maximizing , we can obtain a maximum EL estimator of with inverse probability weighted method.

4. Main Results

For convenience and simplicity, we use and generically to represent any positive constants, which may take different values for each appearance. Let , , , and . Denote

The following assumptions are needed to prove the main results stated in the theorems below:
(C1) The random variable has bounded support, and its density function is Lipschitz continuous and bounded away from zero on its support.
(C2) There is such that ., ., . and are nonsingular matrices.
(C3) has continuous second derivatives in .
(C4) The variance function is uniformly bounded, has continuous second-order derivatives, and is bounded away from zero.
(C5) The kernel is a symmetric density function with compact support , is Lipschitz continuous, and satisfies .
(C6) Denote and . Let and is continuous. is continuous for .
(C7) The kernel functions and are bounded with bounded compact supports, and , , and .
(C8) and have bounded derivatives of order 1, and there exists such that .
(C9) is positive definite. is continuous at .
(C10) The bandwidth satisfies , , and for .

Remark 1. (a) Assumptions (C1) and (C2) are used to establish the asymptotic normality and the oracle property of the estimators. Assumptions (C3) and (C4) are common conditions for varying-coefficient models with heteroscedastic errors. Assumption (C5) requires that the kernel function be a proper density with finite second moment, which is needed to derive the asymptotic variance of the estimators. Assumption (C6) implies that is bounded away from zero. Assumptions (C7)–(C9) are needed for the properties of and . Assumption (C10) specifies the relationship between the bandwidth and the sample size , which covers the optimal bandwidth in nonparametric estimation. (b) From the Taylor expansion and the conclusion in Li and Wang [17], one can get which, together with assumption (C9), gives .

The asymptotic properties of the proposed estimators are shown in the following theorems.

Theorem 1. Suppose that assumptions (C1)–(C10) are satisfied; then, we have where is taken to be one of , and . correspond to , and , respectively.

Theorem 2. Suppose that assumptions (C1)–(C10) are satisfied; then, we have where is taken to be one of , and . correspond to , and , respectively.

Theorem 3. Suppose that assumptions (C1)–(C10) are satisfied; let ; then, we have where is taken to be one of , and .

Theorem 4. Suppose that assumptions (C1)–(C10) are satisfied; then, we have where is taken to be one of , and . correspond to , and , respectively.

Theorem 5. Suppose that assumptions (C1)–(C10) are satisfied; then, we have where is taken to be one of , and . correspond to , and , respectively.

Theorem 6. Suppose that assumptions (C1)–(C10) are satisfied; if is the true value, then we have where denotes one of , , and . is a standard chi-squared random variable with 1 degree of freedom.

Theorem 7. Suppose that assumptions (C1)–(C10) are satisfied; then, we have where denotes one of , and . correspond to , and , respectively.

Remark 2. (a) From Theorems 1 and 4, the asymptotic variance of the reweighted estimator is not greater than that of the modified profile LS estimator ; that is, is a positive semidefinite matrix. The asymptotic variance of the reweighted estimator is smaller than that of , and is larger than that of , which indicates that performs the best, and performs the worst. The same conclusion holds for the modified PLS estimators , . (b) From Theorems 2 and 5, the local polynomial estimator and the reweighted estimator have the same asymptotic distribution, which reflects a characteristic of local regression in nonparametric models. (c) From Theorem 6, the EL confidence region for can be established as , where is the upper -quantile of the distribution of .
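Part (c) of the remark can be sketched as a grid search: keep every candidate parameter value whose statistic does not exceed the chi-squared critical value with 1 degree of freedom, as stated in Theorem 6. The function names and the grid-search strategy are assumptions for illustration, and `stat` stands for any function that returns the empirical log-likelihood ratio statistic at a candidate value.

```python
from statistics import NormalDist


def chi2_1_quantile(level):
    """Quantile of the chi-squared distribution with 1 degree of freedom.

    Uses the fact that the square of a standard normal variable is chi-squared(1).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return z * z


def confidence_set(stat, grid, level=0.95):
    """Return the grid points b with stat(b) <= chi-squared(1) quantile at the given level."""
    cut = chi2_1_quantile(level)
    return [b for b in grid if stat(b) <= cut]
```

At the 0.95 level the cut-off is about 3.84, so the region collects all candidate values that the statistic cannot reject at the 5% level.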

5. Simulation

In this section, we carry out numerical simulations to investigate the finite sample behavior of the proposed estimators. We compare the performance of the estimators based on the regression calibration method (RC), the imputation method (IM), and the inverse probability weighted method (IPW), and their corresponding reweighted estimators (R-RC, R-IM, R-IPW). In addition, we compare the EL method with the normal approximation (NA) approach in terms of coverage probabilities (CP) and average interval lengths (AL) under different settings. We also give a real data analysis. The kernel functions are taken as , and . The bandwidths , and are set to a common value selected by leave-one-out cross-validation. The following simulation is based on 500 replications. The sample size is chosen to be 100 and 400, respectively.
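Leave-one-out cross-validation for a bandwidth can be sketched as follows. The example uses a Nadaraya–Watson smoother with an Epanechnikov kernel on a scalar covariate; the smoother, the kernel, the candidate set, and the function names are assumptions for illustration and not the exact criterion used in the simulations.

```python
import math


def epanechnikov(t):
    """Symmetric kernel with compact support [-1, 1] (an assumed choice)."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def loo_cv_score(x, y, h, kernel=epanechnikov):
    """Mean squared leave-one-out prediction error of a Nadaraya-Watson fit."""
    total = 0.0
    n = len(x)
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(n):
            if j == i:            # leave observation i out of its own prediction
                continue
            k = kernel((x[j] - x[i]) / h)
            num += k * y[j]
            den += k
        if den == 0.0:
            return math.inf       # bandwidth too small: some point has no neighbours
        total += (y[i] - num / den) ** 2
    return total / n


def select_bandwidth(x, y, candidates, kernel=epanechnikov):
    """Return the candidate bandwidth with the smallest leave-one-out score."""
    return min(candidates, key=lambda h: loo_cv_score(x, y, h, kernel))
```

Returning an infinite score when a point has no neighbours prevents the search from favouring bandwidths that are too small to produce a prediction everywhere.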

5.1. Simulation Experiments

The data are generated from the following PLVCEV model: where , , , the covariates and are from and pairwise covariance . is from . is from , the model error . The error variance function is taken as for and 4. The measurement error . To represent different levels of measurement error, we take to be and in the simulations, respectively. Let the censoring time be from , where is adjusted to get different censoring ratios (CR). For the missing mechanism, we take . We choose different and to get different CR and missing ratios (MR). Suppose that follows a logistic model, that is, . Then, the parameter is estimated by the maximum likelihood method. See Table 1 for details.

In the first simulation, we study the finite sample performance of the proposed modified PLS estimators and reweighted estimators of based on the mean squared error (MSE) defined as and the global mean squared error (GMSE) of defined as where is a sequence of grid points. In addition, we plot QQ-plots of the reweighted estimator for under different settings in Figures 1 and 2. In the second simulation, we plot the curves of the proposed estimators , , and under different settings in Figures 3 and 4. In the third simulation, we consider the CP and AL of the confidence regions for based on the EL method (CPE, ALE) and the NA method (CPN, ALN) with nominal level 0.95 under different settings in Table 2.
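The two accuracy measures can be computed as below. The sketch assumes the usual definitions, namely that the MSE averages squared errors of a scalar parameter estimate over replications and that the GMSE averages squared errors of the estimated coefficient function over the grid points; the function names are hypothetical.

```python
def mse(estimates, true_value):
    """Average squared error of a scalar estimate over replications."""
    return sum((b - true_value) ** 2 for b in estimates) / len(estimates)


def gmse(fitted, truth):
    """Average squared error of a fitted curve over grid points.

    fitted and truth are sequences of function values at the same grid points.
    """
    if len(fitted) != len(truth):
        raise ValueError("fitted and truth must use the same grid")
    return sum((f - t) ** 2 for f, t in zip(fitted, truth)) / len(truth)
```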

From Tables 2–4 and Figures 1–4, it can be seen that:
(1) In Tables 3 and 4, the MSE and GMSE of the reweighted estimators are smaller than those of the modified PLS estimators under the same settings. The values for the IM estimators are smaller than those for the IPW estimators and larger than those for the RC estimators. The values increase as the measurement error, heteroscedasticity, and CR and/or MR increase, and decrease as the sample size increases. These results imply that the reweighted estimators perform better than the modified PLS estimators. The RC method performs best and the IPW method performs worst, which confirms the theoretical results.
(2) In Table 2, the CP of the reweighted estimators is larger than that of the modified PLS estimators. The CP of the RC method is the smallest, and that of the IPW method is the largest under the same settings. The CP decreases as heteroscedasticity and CR and/or MR increase, and increases as the sample size increases. The CP based on the EL method is smaller than that of the NA method. The AL behaves in the opposite way.
(3) In Figures 1 and 2, the fit improves as heteroscedasticity and measurement error decrease, and deteriorates as CR and/or MR increase.
(4) In Figures 3 and 4, the proposed estimators of the error variance perform better as the measurement error, heteroscedasticity, and CR and/or MR decrease. The estimator performs the best, and performs the worst under the same settings.

5.2. A Real Data Analysis

In the real data analysis, we illustrate the methodology via an application to a dataset from a breast cancer clinical trial [25]. This clinical trial was conducted by the Eastern Cooperative Oncology Group to evaluate tamoxifen as a treatment for stage II breast cancer among elderly women older than 65. A total of 169 elderly women participated in the trial, and we focus on the 79 women who had died by the end of the trial. Unfortunately, the cause of death is incomplete. Among them, 44 women died from breast cancer, 17 died from other known causes, and 18 died from unknown causes. Let the censoring indicator show whether the death was caused by breast cancer, and let the missing indicator show whether the cause of death was known. The dataset contains four covariates: whether the patient received the treatment (1, tamoxifen; 0, placebo), denoted as ; whether the estrogen receptor status was positive (1, yes; 0, unknown), denoted as ; whether there were four or more positive axillary lymph nodes (1, yes; 0, no), denoted as ; and whether the primary tumor was 3 cm or larger (1, yes; 0, no), denoted as . Then, we employ the following model to fit the data, where is the logarithm of the time to death due to breast cancer, which is censored, and the censoring indicator is MAR. The heteroscedastic error follows the form of . For the purpose of comparison, we compute both the mean squared prediction error (MSE) and the mean absolute deviation (MAD) of the predictions, which are defined as follows: , and where is the fitted value of . The values of MSE and MAD based on the different methods are given in Table 5. In addition, the estimated curves of based on the RC, IM, and IPW methods are reported in Figure 5.

From Table 5 and Figure 5, it can be seen that (1) in Table 5, the estimators of are positive, which indicates that breast cancer patients may live longer if they receive the treatment. Among these estimators, the MSE and MAD based on the R-RC method are the smallest, which agrees with the conclusions in Theorems 1 and 4. (2) In Figure 5, the primary tumor size of the patients is mainly between 0 and 3 cm. The survival time decreases markedly as the tumor size increases.

6. Conclusion

In this paper, we consider estimation based on the modified PLS method and confidence regions based on EL inference for the PLVCEV model with heteroscedastic errors when the censoring indicators are MAR. Asymptotic properties of the proposed estimators are established, and confidence regions for the parameter are constructed. In addition, a simulation study and a real data analysis are conducted to illustrate the proposed method.

Xu and Duan [11] established efficient estimation for the varying-coefficient heteroscedastic partially linear model with additive errors, but their results are restricted to completely observed responses. Statistical inference for the heteroscedastic PLVCEV model under right-censored data with censoring indicators MAR is an innovative and challenging topic.

An interesting open problem is whether the estimation method can be extended to functional regression models with incomplete data.

Appendix

Proof of Main Results

Lemma A.1. Let be independent and identically distributed (i.i.d.) random variables. If is bounded for , then .

Proof. Lemma A.1 can be verified as Lemma 3 in Owen [20].

Lemma A.2. Let be i.i.d. random vectors, where and are scalar random variables. Assume further that and , where denotes the joint density of . Let be a bounded positive function with a bounded support and satisfying a Lipschitz condition. Then, provided that for some .

Proof. Lemma A.2 comes from the basic corollary in Mack and Silverman [26].

Lemma A.3. Suppose that assumptions (C1)–(C10) hold; then, as , it holds that where . , is the -th element of and .

Proof. The proof of Lemma A.3 is similar to that of Lemma A.2 in Xia and Li [27].

Lemma A.4. Under assumptions (C6)–(C8), we have where with , , , , , and . is between and . is between and . is between and .

Proof. Following the proof of Theorem 1 in Wang and Ng [28], one can get the proof of Lemma A.4. To save space, here, we omit the details.

Lemma A.5. Suppose that assumptions (C1)–(C10) hold; then, as , we have and .

Proof. We only prove the results for . The results for and can be proved similarly. By the definition of in (13) in Section 2, we can write . Standard arguments yield that . Following (A.9) in Shen et al. [10], we have . Similarly, . Note that ; it is easy to prove that . Therefore, the proof of is finished. Using similar arguments, the verification of can be done.

Lemma A.6. Suppose that assumptions (C1)–(C10) hold, and is the true value; then, as , we have where is taken to be one of , and . The values correspond to , and , respectively.

Proof. We only prove the result related to . The proofs of and can be confirmed similarly. The verification of (b) can be done by Lemma 3 given in Owen [29]. Here, we omit the details. Consider . From Theorem 3, we have . Since . Under assumption (C10), we have . Under assumption (C5), we have . On applying assumption (C10), one can get . Applying , we have . Recalling Remark 1 and from the results in Theorem 3, it is easy to prove . Under the missing mechanism and similar to the proof of , it is easy to prove that for . Hence, we have . Consider . From Lemma A.4, we have . It is easy to prove , and we have under assumption (C10). Hence, . Similarly, we have and . Compared with , is far smaller than , and then . From Theorem 3, we have . Similarly, . Hence, we have . Note that . From assumption (C9), we get . Thus, it can be checked that . Hence, combining with and collecting the results above, one can obtain . By the central limit theorem, we have . It follows from the law of large numbers that . Consider the partial derivative of with respect to , and then . By the law of large numbers, we have . Therefore, the verification of Lemma A.6 is finished.

Proof of Theorem 1. We prove Theorem 1 for . The verifications related to and can be obtained similarly. Recalling the definition of in Section 2, one can write , where . Replace and with their true values, and then can be rewritten as . Recalling the definition of in Section 2, we have . Standard computations yield that ; then, it is easy to calculate . Note that . Hence, we have , and . Thus, it can be concluded that . Note that , which indicates that . Consider . In terms of the expansion of in Lemma A.4, one can get . Note that . Based on the independence of , we have . Hence, it holds that . Compared with , is far smaller than , and then . Similarly, and . Hence, we have . Following (A.20) in Lemma A.4, it is easy to prove . Collecting the results above, we have . Analogous to the arguments in the proof of Lemma A.6 (a), one can obtain . Collecting the results above, the proof of Theorem 1 is completed.

Proof of Theorem 2. We only prove the results for . The results for and can be proved similarly. By the definition of , we can write . By Taylor expansion, it can be checked that . Hence, from assumption (C5), we have . One direct simplification implies . Theorem 1 implies that . It is easy to verify that and . Hence, we have . By Slutsky’s theorem, we finish the proof of Theorem 2.

Proof of Theorem 3. We only prove the results for . The results for and can be proved similarly, and we omit the details to save space. Denote and . Note that . Following Lemma 4.1 in Chiou [30], it is easy to prove that . From Theorem 3.5 in You and Chen [1], we have , which indicates that . Based on the fact that , it is easy to prove for . In terms of Lemma A.2, one can obtain , which completes the proof of Theorem 3.

Proof of Theorem 4. Based on Theorem 3 and arguments similar to those in the proof of Theorem 1, Theorem 4 can be verified easily.

Proof of Theorem 5. Theorem 1 implies that and . Analogous to the proof of Theorem 3, it is easy to verify Theorem 5.

Proof of Theorem 6. Applying the Taylor expansion to the empirical log-likelihood ratio function, we have . Note that is the solution of the equation , and then one can get . Similar to Owen [20], we derive that . Hence, we have . Hence, from (A.36) and (A.38), it can be checked that which, together with Lemma A.6, completes the proof of Theorem 6.

Proof of Theorem 7. For convenience, let . Note that and satisfy and . By expanding at for , it can be shown that where . Hence, . From Lemma A.6, we have , and which, together with Lemma A.6, completes the proof of Theorem 7.

Data Availability

The simulation study uses Monte Carlo simulation to assess the finite sample performance of the proposed estimators. The real dataset is described in [25]: clinical trial E1178, conducted by the Eastern Cooperative Oncology Group, compared tamoxifen therapy with placebo in elderly women (aged ≥65) with stage II breast cancer.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the China Postdoctoral Science Foundation (2019M651422), the National Natural Science Foundation of China (11831008 and 11971171), the National Social Science Foundation Key Program (17ZDA091), the 111 Project of China (B14019), and the Natural Science Foundation of Shanghai (17ZR1409000).

References

[1] J. You and G. Chen, “Estimation of a semiparametric varying-coefficient partially linear errors-in-variables model,” Journal of Multivariate Analysis, vol. 97, no. 2, pp. 324–341, 2006.
[2] X. He, X. Feng, X. Tong, and X. Zhao, “Semiparametric partially linear varying coefficient models with panel count data,” Lifetime Data Analysis, vol. 23, no. 3, pp. 439–466, 2017.
[3] B. Kai, R. Li, and H. Zou, “New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models,” The Annals of Statistics, vol. 39, no. 1, pp. 305–332, 2011.
[4] R. F. Engle, C. W. J. Granger, J. Rice, and A. Weiss, “Semiparametric estimates of the relation between weather and electricity sales,” Journal of the American Statistical Association, vol. 81, no. 394, pp. 310–320, 1986.
[5] H. Liang, W. Härdle, and R. J. Carroll, “Estimation in a semiparametric partially linear errors-in-variables model,” The Annals of Statistics, vol. 27, pp. 1519–1535, 1999.
[6] A. A. Liu and H. Y. Liang, “Jackknife empirical likelihood of error variance in partially linear varying-coefficient errors-in-variables models,” Statistical Papers, vol. 58, no. 1, pp. 1–28, 2017.
[7] G.-L. Fan, H.-Y. Liang, and L.-X. Zhu, “Penalized profile least squares-based statistical inference for varying coefficient partially linear errors-in-variables models,” Science China Mathematics, vol. 61, no. 9, pp. 1677–1694, 2018.
[8] J. You, G. Chen, and Y. Zhou, “Statistical inference of partially linear regression models with heteroscedastic errors,” Journal of Multivariate Analysis, vol. 98, no. 8, pp. 1539–1557, 2007.
[9] G.-L. Fan, H.-Y. Liang, and J.-F. Wang, “Statistical inference for partially time-varying coefficient errors-in-variables models,” Journal of Statistical Planning and Inference, vol. 143, no. 3, pp. 505–519, 2013.
[10] S.-L. Shen, J.-L. Cui, C.-L. Mei, and C.-W. Wang, “Estimation and inference of semi-varying coefficient models with heteroscedastic errors,” Journal of Multivariate Analysis, vol. 124, pp. 70–93, 2014.
[11] H. Xu and X. Duan, “Efficient estimation for partially linear varying-coefficient errors-in-variables models with heteroscedastic errors,” Instrumentation Mesure Métrologie, vol. 18, no. 2, pp. 295–314, 2018.
[12] Z. Huang, “Empirical likelihood for single-index varying-coefficient models with right-censored data,” Journal of the Korean Statistical Society, vol. 39, no. 4, pp. 533–544, 2010.
[13] Z. Huang, “Empirical likelihood for a partially linear single-index measurement error model with right-censored data,” Communications in Statistics - Theory and Methods, vol. 40, no. 6, pp. 1015–1029, 2011.
[14] Q. Wang and J. Shen, “Estimation and confidence bands of a conditional survival function with censoring indicators missing at random,” Journal of Multivariate Analysis, vol. 99, no. 5, pp. 928–948, 2008.
[15] R. J. A. Little and D. B. Rubin, Statistical Analysis with Missing Data, John Wiley & Sons, New York, NY, USA, 1987.
[16] Q. Wang and G. E. Dinse, “Linear regression analysis of survival data with missing censoring indicators,” Lifetime Data Analysis, vol. 17, no. 2, pp. 256–279, 2011.
[17] X. Li and Q. Wang, “The weighted least square based estimators with censoring indicators missing at random,” Journal of Statistical Planning and Inference, vol. 142, no. 11, pp. 2913–2925, 2012.
[18] Y. Shen and H.-Y. Liang, “Quantile regression for partially linear varying-coefficient model with censoring indicators missing at random,” Computational Statistics & Data Analysis, vol. 117, pp. 1–18, 2018.
[19] J. F. Wang, W. J. Jiang, F. Y. Xu, and W. X. Fu, “Weighted composite quantile regression with censoring indicators missing at random,” Communications in Statistics - Theory and Methods, vol. 51, 2019.
[20] A. B. Owen, “Empirical likelihood ratio confidence regions,” The Annals of Statistics, vol. 18, no. 1, pp. 90–120, 1990.
[21] G.-L. Fan, H.-Y. Liang, and Y. Shen, “Penalized empirical likelihood for high-dimensional partially linear varying coefficient model with measurement errors,” Journal of Multivariate Analysis, vol. 147, pp. 183–201, 2016.
[22] Y. S. Wang and M. Drton, “Empirical likelihood for linear structural equation models with dependent errors,” Stat, vol. 6, no. 1, pp. 434–447, 2017.
[23] G. L. Fan, L. L. Wang, and H. X. Xu, “Weighted empirical likelihood for heteroscedastic varying coefficient partially non-linear models with missing data,” Stat, vol. 10, no. 1, 2021.
[24] Y. Zou, G. Fan, and R. Zhang, “Empirical likelihood and variable selection for partially linear single-index EV models with missing censoring indicators,” Journal of the Korean Statistical Society, vol. 50, no. 1, pp. 134–162, 2021.
[25] F. J. Cummings, R. Gray, T. E. Davis et al., “Tamoxifen versus placebo: double-blind adjuvant trial in elderly women with stage II breast cancer,” NCI Monographs, vol. 1, pp. 119–123, 1986.
[26] Y. P. Mack and B. W. Silverman, “Weak and strong uniform consistency of kernel regression estimates,” Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, vol. 61, no. 3, pp. 405–415, 1982.
[27] Y. C. Xia and W. K. Li, “On the estimation and testing of functional-coefficient linear models,” Statistica Sinica, vol. 9, pp. 735–757, 1999.
[28] Q. Wang and K. W. Ng, “Asymptotically efficient product-limit estimators with censoring indicators missing at random,” Statistica Sinica, vol. 18, no. 2, pp. 749–768, 2008.
[29] A. B. Owen, “Empirical likelihood ratio confidence intervals for a single functional,” Biometrika, vol. 75, no. 2, pp. 237–249, 1988.
[30] J. M. Chiou and H. G. Müller, “Nonparametric quasi-likelihood,” The Annals of Statistics, vol. 27, no. 1, pp. 36–64, 1999.

Copyright

Copyright © 2021 Yuye Zou and Chengxin Wu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
