Abstract

Block empirical likelihood inference for semiparametric varying-coefficient partially linear errors-in-variables models with longitudinal data is investigated. We apply the block empirical likelihood procedure to accommodate the within-group correlation of the longitudinal data. A block empirical log-likelihood ratio statistic for the parametric component is proposed, and a nonparametric version of Wilks' theorem is derived under mild conditions. Simulations are carried out to assess the performance of the proposed procedure.

1. Introduction

For longitudinal data, we consider semiparametric varying-coefficient partially linear model which has the following form: where is the response variable, , , and are regressors, is a -dimensional vector of unknown parameters, is a -dimensional vector of smooth functions of time , and is a zero-mean stochastic process. Due to the curse of dimensionality, for simplicity, we assume that is univariate.

Model (1) includes many commonly used parametric, nonparametric, and semiparametric models as special cases, and it has been studied by many authors. Zhang et al. [1] suggested a two-step method for estimating it. Li et al. [2] suggested a local least-squares procedure with a kernel weight function. Fan and Huang [3] developed a profile least-squares technique for estimating the parametric component. You and Zhou [4] and Huang and Zhang [5] proposed estimators of the parametric and nonparametric components, respectively. Fan et al. [6] proposed a semiparametric estimation of the working correlation matrix and applied a profile weighted least-squares approach.

However, in many practical situations, the covariates are often measured with error. In this paper, we consider the case in which the variable is measured with additive error and both and are measured exactly. That is, cannot be observed, but an unbiased measure of , denoted by , can be obtained as follows: where is the measurement error, which is independent of (, , , ), with mean zero and covariance matrix . We assume that is known. If is unknown, it can be estimated from repeated measurements of , as in Liang et al. [7]. For errors-in-variables models (1) and (2), Liang et al. [8] developed a profile least-squares procedure to estimate the parametric component and derived the asymptotic normality of the resulting estimator.
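As a rough illustration of the repeated-measurement idea mentioned above, the following sketch estimates the measurement-error covariance matrix by the pooled within-unit sample covariance of the replicates. The array layout, the function name `estimate_error_cov`, and the simulated values are assumptions made for this example only; they are not taken from Liang et al. [7].

```python
import numpy as np

def estimate_error_cov(W):
    """Pooled within-unit covariance of replicated surrogates.

    W has shape (n, m, p): n units, m >= 2 replicates per unit, p covariates,
    with W[i, k] = X[i] + U[i, k] (assumed layout for this sketch).
    """
    n, m, p = W.shape
    # Centering within each unit removes the unobserved true covariate X[i].
    centered = W - W.mean(axis=1, keepdims=True)
    # Sum of outer products over units and replicates.
    S = np.einsum("ikp,ikq->pq", centered, centered)
    # Each unit contributes m - 1 degrees of freedom.
    return S / (n * (m - 1))

rng = np.random.default_rng(0)
n, m, p = 5000, 2, 2
sigma_uu = np.array([[0.25, 0.0], [0.0, 0.16]])   # hypothetical true error covariance
X = rng.normal(size=(n, 1, p))                     # unobserved true covariates
U = rng.multivariate_normal(np.zeros(p), sigma_uu, size=(n, m))
W = X + U                                          # observed replicates
sigma_hat = estimate_error_cov(W)
print(np.round(sigma_hat, 3))
```

If the average of the replicates is then used as the surrogate covariate, its error covariance is the estimate above divided by the number of replicates.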

The empirical likelihood, which is a nonparametric approach for constructing confidence regions, was introduced by Owen [9] and has many nice statistical properties (see Owen [10]). Owen [11] applied empirical likelihood to linear regression models and Kolaczyk [12] made further extensions to generalized linear models. Recently, Xue and Zhu [13] considered varying-coefficient models. You and Zhou [4], Huang and Zhang [5], and Zhao and Xue [14] investigated empirical likelihood confidence regions for varying-coefficient partially linear models. Other related papers include Yang and Li [15], Hu et al. [16], Wang et al. [17], and Fan et al. [18, 19].

In this paper, we consider models (1) and (2) with longitudinal data. One aim of this paper is to construct the confidence region for the parametric components. To achieve this, we apply the block empirical likelihood approach [20] to construct a block empirical log-likelihood ratio statistic for the parameter and then establish the nonparametric Wilks phenomenon. Simulation studies are used to assess the proposed method. The other aim is to prove that the maximum empirical likelihood estimator (MELE) of the parameter is asymptotically normal under suitable conditions.

The rest of this paper is organized as follows. In Section 2, we construct the block empirical likelihood based confidence region for the parametric components. The assumptions and main results are given in Section 3. Simulation results are reported in Section 4. The proofs of the main results are given in Section 5. Finally, some concluding remarks are given.

2. Methodology

In this section, we extend the result of Hu [21] to the semivarying coefficient errors-in-variables model with longitudinal data.

We consider longitudinal data (, , , ), , and , which are generated from the semivarying coefficient errors-in-variables model through the following equation: where , , and , and , , . We use a counting process to describe the number of observations of the th subject. We assume that is bounded, while the number of subjects goes to infinity.

Suppose that is known; then, model (3) can be reduced to a varying-coefficient regression model: Here, the local linear regression method is applied to estimate the coefficient function in model (4). That is, for in a small neighborhood of , one can approximate locally by a linear function where . This leads to the following weighted least-squares problem: find to minimize where is a kernel function, , and is a bandwidth. Let

Then, the solution to problem (6) is given by Then, can be given by where is the identity matrix and is the zero matrix. Denote then, Substituting (12) into (4), we can obtain the approximate residuals as follows: where
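The local linear step described above can be sketched numerically as a kernel-weighted least-squares fit. The sketch below pools the observations of all subjects into flat arrays and uses a Gaussian kernel; the function name `local_linear_coef`, the variable names, and the simulated coefficient functions are illustrative assumptions rather than the notation of this paper.

```python
import numpy as np

def local_linear_coef(t0, T, Z, Y, h):
    """Local linear estimate of the coefficient functions at time t0.

    T: (N,) observation times, Z: (N, q) covariates, Y: (N,) responses,
    all pooled over subjects; h: bandwidth.
    """
    d = (T - t0) / h
    k = np.exp(-0.5 * d ** 2) / (h * np.sqrt(2.0 * np.pi))  # kernel weights K_h(T - t0)
    D = np.hstack([Z, Z * d[:, None]])                       # local linear design [Z, Z (T - t0)/h]
    DW = D * k[:, None]
    theta = np.linalg.solve(D.T @ DW, DW.T @ Y)              # weighted least-squares solution
    # The first q entries estimate the coefficient functions at t0;
    # the remaining q entries estimate h times their derivatives.
    return theta[: Z.shape[1]]

rng = np.random.default_rng(1)
N = 2000
T = rng.uniform(size=N)
Z = np.column_stack([np.ones(N), rng.normal(size=N)])
alpha = np.column_stack([np.sin(2 * np.pi * T), T ** 2])     # hypothetical coefficient functions
Y = np.sum(Z * alpha, axis=1) + 0.1 * rng.normal(size=N)

alpha_hat = local_linear_coef(0.5, T, Z, Y, h=0.1)
print(alpha_hat)   # should be close to (sin(pi), 0.25) = (0, 0.25)
```

Extracting only the first block of the solution plays the role of the selection matrix built from the identity and zero matrices in the text.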

Similar to Owen [10], can be treated as a random sieve approximation of the random error sequence . In order to deal with the within-group correlation, we use the block empirical likelihood method. The block empirical likelihood procedure treats the “data” of each subject as a whole. Hence, similar to Xue and Zhu [13], we introduce the auxiliary random vector

Following (13), if is true, then . If one ignores the measurement error and replaces by in , one can show that the resulting estimator is inconsistent. It is well known that the inconsistency caused by the measurement error can be overcome by applying the so-called correction for attenuation proposed by Fuller [22] in linear regression. In a similar way to Zhao and Xue [14], the corrected-attenuation auxiliary vector is introduced and defined as where . The term aims to avoid the underestimation of the parameter caused by the measurement error. Therefore, the empirical likelihood ratio function for is defined as A unique value for exists, provided that is inside the convex hull of the points . Using the Lagrange multiplier technique, the optimal value for is where is the solution of the equation Then, the block empirical log-likelihood ratio function is In addition, by maximizing , we can obtain the maximum empirical likelihood estimator (MELE) . Let If the matrix is invertible, then the MELE of can be given by According to , we can define the estimator as
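To make the Lagrange multiplier computation concrete, the following sketch evaluates a block empirical log-likelihood ratio in a simplified setting. It assumes that the varying-coefficient part has already been removed, so that only a linear errors-in-variables model remains, and it solves for the multiplier with a damped Newton iteration. The names `block_scores` and `el_statistic`, the balanced design, and all simulated values are assumptions for illustration, not the exact auxiliary vector of this paper.

```python
import numpy as np

def block_scores(beta, W, Y, sigma_uu):
    """One corrected score vector per subject (block).

    W: (n, m, p) surrogate covariates, Y: (n, m) responses.
    The scores of the m visits of a subject are summed, so the
    within-subject correlation stays inside a single block.
    """
    resid = Y - W @ beta                         # (n, m) residuals
    raw = np.einsum("ijp,ij->ip", W, resid)      # sum over visits of W_ij * resid_ij
    # Correction for attenuation: add sigma_uu @ beta once per visit.
    return raw + W.shape[1] * (sigma_uu @ beta)

def el_statistic(eta, max_iter=100, tol=1e-8):
    """-2 log empirical likelihood ratio for the hypothesis E[eta_i] = 0."""
    def dual(lam):
        d = 1.0 + eta @ lam
        return -np.inf if np.any(d <= 0) else np.log(d).sum()

    lam = np.zeros(eta.shape[1])
    for _ in range(max_iter):
        d = 1.0 + eta @ lam
        grad = (eta / d[:, None]).sum(axis=0)    # left side of the multiplier equation
        if np.linalg.norm(grad) < tol:
            break
        hess = (eta / d[:, None] ** 2).T @ eta
        step = np.linalg.solve(hess, grad)
        t = 1.0
        for _halving in range(60):               # keep all weights positive, never decrease the dual
            if dual(lam + t * step) >= dual(lam):
                break
            t /= 2.0
        else:
            break                                # no improving step was found
        lam = lam + t * step
    return 2.0 * dual(lam)

rng = np.random.default_rng(2)
n, m, p = 200, 4, 2
beta_true = np.array([1.0, 2.0])
sigma_uu = 0.04 * np.eye(p)                      # hypothetical known error covariance
X = rng.normal(size=(n, m, p))
W = X + 0.2 * rng.normal(size=(n, m, p))         # surrogates with additive error
subject_effect = 0.5 * rng.normal(size=(n, 1))   # induces within-subject correlation
Y = X @ beta_true + subject_effect + 0.5 * rng.normal(size=(n, m))

stat_true = el_statistic(block_scores(beta_true, W, Y, sigma_uu))
stat_far = el_statistic(block_scores(beta_true + 1.0, W, Y, sigma_uu))
print(stat_true, stat_far)
```

The statistic at the true parameter should be of moderate size, while a parameter value far from the truth should give a much larger value. A general-purpose convex optimizer could replace the hand-written Newton iteration.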

3. Main Results

To establish asymptotic properties of the block empirical log-likelihood ratio, we make the following assumptions, which are also used by You and Zhou [4]. We use to denote the Euclidean norm with and .

Assumption 1. The random variable has a compact support . The density function of has a continuous second derivative and is uniformly bounded away from zero.

Assumption 2. The matrix is nonsingular for each . , , and are all Lipschitz continuous.

Assumption 3. There is a such that , , , and and for some such that as .

Assumption 4. have continuous second derivatives in .

Assumption 5. The kernel is a symmetric probability density function and is of bounded variation on its support.

Assumption 6. The bandwidth satisfies and as .

The following theorem gives the asymptotic distribution of .

Theorem 1. Assume that Assumptions 1-6 hold. If is the true value of the parameter, then where denotes convergence in distribution and is a chi-square distribution with degrees of freedom.

Based on Theorem 1, we can construct confidence regions for the parameter . More precisely, for any , let be such that . Then, constitutes a confidence region for with asymptotic coverage .
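The calibration step can be sketched as follows: a candidate parameter value belongs to the confidence region whenever its log-likelihood ratio statistic does not exceed the chi-square critical value. In this sketch the statistic is supplied as a generic function, and a toy quadratic statistic stands in for the block empirical log-likelihood ratio; the names `confidence_set` and `toy_stat` and the numbers used are hypothetical.

```python
import numpy as np

# Upper 5% points of the chi-square distribution for 1, 2, and 3 degrees of freedom.
CHI2_95 = {1: 3.841459, 2: 5.991465, 3: 7.814728}

def confidence_set(candidates, stat_fn, df):
    """Return the candidates whose statistic is at most the 95% critical value."""
    crit = CHI2_95[df]
    stats = np.array([stat_fn(b) for b in candidates])
    return candidates[stats <= crit]

# Toy stand-in for the log-likelihood ratio of a scalar parameter (hypothetical values).
n_subjects, beta_hat, s2 = 100, 1.0, 1.0

def toy_stat(b):
    return n_subjects * (b - beta_hat) ** 2 / s2

grid = np.linspace(0.5, 1.5, 1001)
region = confidence_set(grid, toy_stat, df=1)
lo, hi = region.min(), region.max()
print(lo, hi)   # roughly 1 -/+ 1.96 / sqrt(100)
```

For a scalar parameter the region is an interval, so scanning a grid is enough; for a vector parameter the same rule is applied to each candidate vector.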

Theorem 2. Assume that Assumptions 1-6 hold. Then, one has where

4. Simulation Results

In this section, we conduct simulations to evaluate the empirical likelihood (EL) method. The data are generated from where , , , , , , , , , and .

In the simulation studies, for each combination of , and , we draw 1,000 random samples of sizes 100 or 200 from the above model, respectively. For each sample, a 95% confidence interval for is computed using our block empirical likelihood method. The kernel function is taken as the Gaussian kernel . The “leave-one-sample-out” method is used to select the bandwidth . We define the score of as follows: Then the cross-validation smoothing parameter is the minimizer of . Some representative coverage probabilities are reported in Table 1.
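The bandwidth selection can be illustrated by the following sketch of a leave-one-subject-out cross-validation. It assumes that the score is the usual sum of squared prediction errors when all observations of one subject are deleted in turn, and it uses a simple Gaussian kernel smoother of the response on time as a stand-in for the full model fit. The function names, the candidate bandwidths, and the simulated data are assumptions for this example.

```python
import numpy as np

def gauss_kernel(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def kernel_smooth(t_eval, t_train, y_train, h):
    """Kernel-weighted average of y_train at the points t_eval (stand-in smoother)."""
    w = gauss_kernel((t_eval[:, None] - t_train[None, :]) / h)
    return (w @ y_train) / w.sum(axis=1)

def cv_score(h, subj, t, y):
    """Sum of squared prediction errors, deleting one whole subject at a time."""
    score = 0.0
    for i in np.unique(subj):
        held = subj == i
        pred = kernel_smooth(t[held], t[~held], y[~held], h)
        score += np.sum((y[held] - pred) ** 2)
    return score

rng = np.random.default_rng(3)
n, m = 60, 5
subj = np.repeat(np.arange(n), m)                 # subject label of each observation
t = rng.uniform(size=n * m)
subject_effect = np.repeat(0.2 * rng.normal(size=n), m)
y = np.sin(2 * np.pi * t) + subject_effect + 0.2 * rng.normal(size=n * m)

candidates = np.array([0.02, 0.05, 0.1, 0.2, 0.5, 1.0])
scores = np.array([cv_score(h, subj, t, y) for h in candidates])
h_cv = candidates[np.argmin(scores)]
print(h_cv)
```

Deleting whole subjects rather than single observations prevents the within-subject correlation from favouring bandwidths that are too small.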

5. Proof of the Main Results

In order to prove the main results, we first introduce several lemmas. Let , , , , , , and .

Lemma 3. Let be i.i.d. random vectors, where is a scalar random variable. Further, assume that , , where denotes the joint density of . Let be a bounded positive function with a bounded support, satisfying a Lipschitz condition. Given that for some , then

Proof. This lemma can be found in Mack and Silverman [23].

Lemma 4. Let , be a sequence of mutually independent random variables with and . Then, Further, let be a permutation of . Then, one has

Proof. This lemma follows immediately from the Kolmogorov inequality.

Lemma 5. Let be i.i.d. random variables. If are uniformly bounded for , then one has

Proof. This lemma can be found in Shi and Lau [24].

Lemma 6. Suppose that Assumptions 1-6 hold; one has which holds for all , where .

Proof. This follows immediately from the result that was obtained by Yang and Li [15].

Lemma 7. Suppose that Assumptions 1-6 hold; one has, when ,

Proof. Let ; then, Lemma 7 follows directly from Lemma 6.

Lemma 8. Suppose that Assumptions 1-6 hold; one has where is defined by (26).

Proof of Theorem 1. From (36), using the same arguments as in the proof of Owen [10], we have where is defined in (19). Then, we have By using Lemma 8, we obtain Applying the Taylor expansion to (20), we get that Hence, together with (39), we have Together with Lemma 8, this proves Theorem 1.

Proof of Theorem 2. Following arguments similar to those used in the proof of Theorem 2 in Yang and Li [15], we have By (35), we can prove by the law of large numbers. Together with Lemma 8 and Slutsky’s theorem, this proves Theorem 2.