#### Abstract

In this paper, we consider a single-index varying-coefficient model with application to longitudinal data. To accommodate the within-group correlation, we apply the block empirical likelihood procedure to the longitudinal single-index varying-coefficient model and prove a nonparametric version of Wilks' theorem, which can be used to construct a block empirical likelihood confidence region with asymptotically correct coverage probability for the parametric component. In comparison with normal approximations, the proposed method does not require a consistent estimator for the asymptotic covariance matrix, making it easier to conduct inference on the model's parametric component. Simulations demonstrate the performance of the proposed method.

#### 1. Introduction

The single-index varying-coefficient model, proposed by Huang and Zhensheng [1], is an important tool for exploring dynamic patterns in many complex dynamic systems arising in economics, finance, politics, epidemiology, medical science, and ecology. As mentioned in Gao et al. [2], complex dynamic systems arise in many varieties. Such systems are often concurrent and distributed, because they have to react to various kinds of events, signals, and conditions. They may be characterized by uncertainties, time delays, stochastic perturbations, hybrid dynamics, distributed dynamics, chaotic dynamics, and a large number of algebraic loops. Moreover, much related work has appeared, such as Jian et al. [3] and Hu et al. [4]. The single-index varying-coefficient model is one tool that can be used to describe such complex dynamic systems. It is a natural extension of classical parametric models, has good interpretability, and is becoming more and more popular in data analysis.

Longitudinal data arise frequently in many scientific studies. For longitudinal data, we know that the data that are collected from the same subject at different times are correlated and that the observations from different subjects are often independent. Therefore, it is of great interest to estimate the regression function incorporating the within-subject correlation to improve the efficiency of estimation. The single-index varying-coefficient model is a popular nonparametric fitting technique; it is easily interpreted in real applications because it has the features of the single-index model and the varying-coefficient model. In addition, the single-index varying-coefficient model may include cross-product terms of some components of covariates. Hence, it has considerable flexibility to cater for a complex multivariate nonlinear structure.

Without loss of generality, we consider a longitudinal study with subjects and observations over time for the th subject, giving a total of observations. In this article, we extend the single-index varying-coefficient model to longitudinal data and propose a single-index varying-coefficient longitudinal data model of the form where is a vector of covariates, is the th measurement on the th unit, is an vector of unknown parameters, is an vector of unknown functions, and is a random error with mean 0 and finite variance , with and assumed independent. For identifiability, it is often assumed that and that the first nonzero element is positive, where denotes the Euclidean norm.

Obviously, model (1) includes a class of important statistical models. For example, if and , model (1) reduces to the single-index longitudinal data model, which Bai et al. [5] proposed to estimate the index coefficient and unknown link function in a single-index model for longitudinal data by combining penalized splines and quadratic inference functions. If and , (1) is the varying-coefficient longitudinal data model studied by Chiang et al. [6], Huang et al. [7], and Qu and Li [8], among others. Thus model (1) is easily interpreted in real applications because it combines the features of the single-index longitudinal data model and the varying-coefficient longitudinal data model. In addition, model (1) may include cross-product terms of some components of and . Hence, it has considerable flexibility to cater for complex multivariate nonlinear structure.

When , model (1) reduces to the nonlongitudinal single-index varying-coefficient model, whose estimation and applications several authors have studied. Recently, empirical likelihood methods have been applied to the nonlongitudinal single-index varying-coefficient model. For example, Xue and Wang [9] developed statistical techniques for the unknown coefficient functions and single-index parameters in single-index varying-coefficient models: they first estimate the nonparametric component via local linear fitting, then construct an estimated empirical likelihood ratio function, and hence obtain a maximum empirical likelihood estimator for the parametric component. The motivation is that empirical likelihood based inference has many desirable statistical properties. For example, the method does not involve any variance estimation, which is rather complicated in nonparametric or semiparametric regression settings, and hence is robust against heteroscedasticity; moreover, confidence regions based on the empirical likelihood method do not have a predetermined symmetry, so they can better reflect the true shape of the underlying distribution. Owen [10, 11] and many others, such as Wang and Jing [12], Chen and Qin [13], Shi and Lau [14], and Xue and Zhu [15–17], developed empirical likelihood into a general methodology. A recent survey on empirical likelihood can be found in the monograph of Owen [18]. Further methods for the single-index varying-coefficient model have been proposed, such as Huang and Zhang [19] and Feng and Xue [20]. When , model (1) is the single-index longitudinal data model. The usual empirical likelihood method cannot be applied, however, to the single-index longitudinal data model (1) due to correlation within groups. In this paper, we propose a block empirical likelihood procedure to accommodate this correlation.
A nonparametric version of the Wilks’ theorem is derived, which can be used to construct confidence regions with asymptotically correct coverage probabilities for the parametric component in the model. Compared with normal approximations, our method has the appealing feature that it does not require one to construct a consistent estimator for the asymptotic covariance matrix. Furthermore, the block empirical likelihood method avoids intensive Monte Carlo simulations usually required by the bootstrap method.

The rest of the paper is organized as follows. Section 2 introduces the estimated block empirical likelihood method. Section 3 derives the nonparametric version of Wilks’ theorem. Section 4 provides a data-driven procedure to choose the tuning parameters. A simulation study is given in Section 5. Proof of the main result is relegated to Section 6.

#### 2. Block Empirical Likelihood Method

In this section, we extend the results of You et al. [21] and Xue and Wang [9] to the single-index varying-coefficient longitudinal data model.

To apply the block empirical likelihood method to model (1), we introduce an auxiliary random vector where stands for the derivative of the function vector , and is a bounded weight function with a bounded support , which is introduced to control the boundary effect in the estimation of and . For convenience, we point out that is the indicator function of the set . Note that if . Hence, the problem of testing whether is the true parameter is equivalent to testing whether , for . Because and are unknown, we cannot directly use the block empirical likelihood method to make statistical inference on . A natural way is to replace and by their estimators. In this paper, we estimate the vector functions and via the local linear regression technique (see, e.g., Fan and Gijbels [22]). The local linear estimators for and are defined as and at the fixed point , where and minimize the weighted sum of squares: where , is a kernel function, and is a bandwidth sequence that decreases to 0 as increases to . It follows from least squares theory that where with
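Although the paper's local linear step estimates the whole vector of coefficient functions and their derivatives jointly, the weighted least squares mechanics can be illustrated with a scalar smoother. The following is a minimal sketch under our own conventions (the function names and the Epanechnikov kernel choice are illustrative, not taken from the paper):

```python
import numpy as np

def epanechnikov(t):
    """Epanechnikov kernel: a symmetric density with bounded support [-1, 1]."""
    return 0.75 * (1.0 - t**2) * (np.abs(t) <= 1.0)

def local_linear(u, y, u0, h, kernel=epanechnikov):
    """Local linear estimates of g(u0) and g'(u0) from scatter (u_i, y_i).

    Minimizes sum_i K((u_i - u0)/h) * (y_i - a - b*(u_i - u0))^2 over (a, b);
    the minimizer (a, b) estimates (g(u0), g'(u0)).
    """
    w = kernel((u - u0) / h)
    X = np.column_stack([np.ones_like(u), u - u0])
    # Weighted normal equations: (X'WX) beta = X'Wy
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta[0], beta[1]
```

Replacing the scalar regressor by the single-index values and stacking the covariates would recover a varying-coefficient version of this fit.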

*Remark 1. *If the same bandwidth is used, the estimator of converges more slowly than that of , which in turn slows the convergence of the estimator of . To increase the convergence rate of the estimator of , we introduce another bandwidth to replace in and define it as .

Similar to Owen [11] and Shi and Lau [14], can be treated as a random sieve approximation of the random error sequence . To deal with the correlation within groups, we use the block empirical likelihood procedure proposed by You et al. [21]. Unlike the usual empirical likelihood method, the block empirical likelihood procedure treats the “data” for as a whole. Let be , with and replaced by and , respectively, for . Then an estimated block empirical likelihood function for is defined as For a given , a unique maximum exists provided that 0 is inside the convex hull of the points for . The maximum of (7) may be found via the method of Lagrange multipliers. The optimal value satisfying (7) can be shown to be where the Lagrange multiplier is the solution of the following equation:
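The Lagrange multiplier step can be sketched generically. In the block version, the auxiliary vector for each group aggregates that subject's observations, and the usual empirical likelihood computation is then run on the group-level vectors. Below is a minimal sketch assuming mean-zero auxiliary vectors `eta`; the function name and the plain Newton solver are ours, not the paper's:

```python
import numpy as np

def el_log_ratio(eta, tol=1e-10, max_iter=100):
    """Empirical log-likelihood ratio statistic for the hypothesis E[eta_i] = 0.

    eta : (n, p) array of auxiliary vectors.
    Solves sum_i eta_i / (1 + lam' eta_i) = 0 for the Lagrange multiplier
    lam by Newton's method, then returns 2 * sum_i log(1 + lam' eta_i),
    which is nonnegative at the solution.
    """
    n, p = eta.shape
    lam = np.zeros(p)
    for _ in range(max_iter):
        denom = 1.0 + eta @ lam
        grad = (eta / denom[:, None]).sum(axis=0)
        # Jacobian of grad with respect to lam (negative definite)
        jac = -(eta[:, :, None] * eta[:, None, :]
                / (denom**2)[:, None, None]).sum(axis=0)
        step = np.linalg.solve(jac, -grad)
        lam = lam + step
        if np.linalg.norm(step) < tol:
            break
    return 2.0 * np.log(1.0 + eta @ lam).sum()
```

The profile ratio of the next section evaluates this statistic at candidate parameter values; the confidence region collects those values at which the statistic does not exceed a calibrated quantile.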

Since is maximized for in the absence of parametric constraints, we define the corresponding estimated profile block empirical log-likelihood ratio as

We will show in the next section that if is the true parameter vector, is asymptotically chi-square distributed.

#### 3. Theoretical Properties

Throughout this article, we assume that increases to push up the total sample size , while is fixed. To establish the nonparametric Wilks' theorem for , we first make the following assumptions.

() The density function of , , is bounded away from zero for and near and satisfies the Lipschitz condition of order 1 on , where is the support of .

() The functions , have continuous second derivatives on , where are the th components of .

() , , and .

() , .

() The kernel is a symmetric probability density function with a bounded support , satisfies the Lipschitz condition of order 1, and .

() The matrix is positive definite, and each entry of and satisfies the Lipschitz condition of order 1 on , where , and is defined in ().

() The matrices and are positive definite, where is defined in ().

*Remark 2. *Condition () bounds the density function of away from zero. This ensures that the denominators of and are, with probability one, bounded away from 0 for . The second derivatives in () are standard smoothness conditions. ()–() are necessary conditions for the asymptotic normality and the uniform consistency of the estimators. It should be pointed out that the condition can be replaced by , , and for some . In the current work, the exponential index of the norm is set to 6 because it is the minimum value needed for the asymptotic normality and the uniform consistency of the estimators. Conditions () and () ensure that the asymptotic variance of the estimator of exists.

Let , and the first nonzero element is positive}. Then is an inner point of the set . The following theorem shows that is asymptotically distributed as a weighted sum of independent variables.

Theorem 3. *Suppose that conditions ()–() hold. Then, as ,*

*where represents convergence in distribution, are independent variables, and the weights , for , are the eigenvalues of . Here is defined in condition (), and and are defined in condition ().*

To apply Theorem 3 to construct a confidence region or interval for , we need to estimate the unknown weights consistently. By the plug-in method, and can be consistently estimated by respectively, where is the maximum empirical likelihood estimator of defined by (9), , , and with where is a kernel function, and is a bandwidth with .

This implies that the eigenvalues of , say , consistently estimate for . Let be the quantile of the conditional distribution of the weighted sum given the data. Then an approximate confidence region for can be defined as follows:

In practice, the conditional distribution of the weighted sum , given the sample , can be calculated using Monte Carlo simulations by repeatedly generating independent samples from the distribution.
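This calibration step can be sketched as follows. Purely for illustration, we call the two plug-in matrices `D_hat` and `V_hat` and assume the weights are the eigenvalues of `D_hat^{-1} V_hat`, a form typical of such weighted chi-square limits; the paper's own notation for these matrices is given in the conditions above.

```python
import numpy as np

def weight_eigenvalues(D_hat, V_hat):
    """Estimated weights: eigenvalues of D_hat^{-1} V_hat (plug-in matrices)."""
    return np.linalg.eigvals(np.linalg.solve(D_hat, V_hat)).real

def weighted_chisq_quantile(weights, alpha, n_mc=100_000, seed=0):
    """Monte Carlo (1 - alpha) quantile of sum_k w_k * chi^2_1.

    Repeatedly generates independent chi-square(1) variables (squares of
    standard normals) and returns the empirical quantile of the weighted sum.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_mc, len(weights)))
    return np.quantile((z**2) @ np.asarray(weights), 1.0 - alpha)
```

When all weights equal 1, the quantile reduces to that of an ordinary chi-square distribution, which provides a quick sanity check of the Monte Carlo step.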

In addition to the above direct way of approximating the asymptotic distribution, we can also consider the following alternative, motivated by the results of Rao and Scott [24]. We now propose an adjusted empirical log-likelihood whose asymptotic distribution is chi-squared with degrees of freedom. The adjustment technique was developed by Wang and Rao [25] using an approximation result in Rao and Scott [24]. Note that can be written as

By examining the asymptotic expansion of , which is specified in the proof of Theorem 4 below, we define an adjustment factor by replacing in by , where . The adjusted empirical log-likelihood ratio is defined by where is defined in (10).

Theorem 4. *Suppose that conditions – hold. Then, .*

According to Theorem 4, can be used to construct an approximate confidence region for . Let Then, gives a confidence region for with asymptotically correct coverage probability .

#### 4. Bandwidth Selection

For practical implementation, the tuning parameters need to be chosen. We employ a data-driven procedure to choose the tuning parameter , which controls the smoothness of and . Various existing bandwidth selection techniques for nonparametric regression, such as cross-validation, generalized cross-validation, and the modified multifold cross-validation criterion, can be adapted for the estimation of and . Because the modified multifold cross-validation criterion proposed by Cai et al. [26] selects the optimal bandwidth with a simple and fast algorithm, we use it throughout the empirical studies in this paper. Specifically, let and be two given positive integers and . The basic idea is first to use subseries of lengths to estimate the unknown coefficient functions and then to compute the one-step forecasting error of the next section of the sample of length based on the estimated models. More precisely, we choose to minimize where are computed from the sample with bandwidth equal to . Note that for different sample sizes, we rescale the bandwidth according to its optimal rate, that is, . Since the selected bandwidth does not depend critically on the choice of and , for computational expediency we take and in our simulations.
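The subseries bookkeeping of the modified multifold cross-validation criterion can be sketched as below. We plug in a simple Nadaraya-Watson smoother as a stand-in for the actual model fit, so this illustrates only the splitting and forecast-error accumulation, not the paper's estimator; all names are illustrative:

```python
import numpy as np

def mmcv_bandwidth(u, y, h_grid, Q=4, m=None):
    """Pick a bandwidth by modified multifold cross-validation.

    For each candidate h, fit on the subseries u[:n - q*m] and accumulate
    the squared one-step forecasting error over the next block of m points,
    for q = 1, ..., Q; return the h with the smallest accumulated error.
    """
    n = len(u)
    if m is None:
        m = max(1, n // 10)

    def nw_predict(u_train, y_train, u_eval, h):
        # Nadaraya-Watson smoother with a Gaussian kernel (stand-in fit)
        w = np.exp(-0.5 * ((u_eval[:, None] - u_train[None, :]) / h) ** 2)
        return (w @ y_train) / w.sum(axis=1)

    best_h, best_err = h_grid[0], np.inf
    for h in h_grid:
        err = 0.0
        for q in range(1, Q + 1):
            split = n - q * m
            pred = nw_predict(u[:split], y[:split], u[split:split + m], h)
            err += np.sum((y[split:split + m] - pred) ** 2)
        if err < best_err:
            best_h, best_err = h, err
    return best_h
```

In the paper's setting, `nw_predict` would be replaced by the local linear single-index varying-coefficient fit, and the selected bandwidth is then rescaled to its optimal order as described above.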

Let be the bandwidth obtained by minimizing (21) with respect to ; that is, . Then is the optimal bandwidth for estimating . When calculating the block empirical likelihood ratios and the estimator of , we use the approximate bandwidth , because this ensures that the required bandwidth has the correct order of magnitude for optimal asymptotic performance (see, e.g., Carroll et al. [27]) and satisfies condition ().

#### 5. A Simulation Study

In this section, we carry out some simulations to study the finite sample performance of the estimated block empirical likelihood method.

*Example 5. *The data are generated from
where , , , , and are i.i.d. . For each combination of , , and , 1000 samples are generated from the above model. For each sample, a 95% confidence interval for is computed using our estimated block empirical likelihood method. For smoothing, we used a local linear smoother with the Gaussian kernel and a modified multifold cross-validation bandwidth throughout all smoothing steps. Some representative coverage probabilities and confidence intervals are reported in Table 1. The simulation results show that our estimated block empirical likelihood confidence regions have high coverage probabilities and short average confidence interval lengths.

*Example 6. *Consider the regression model
where and the are independent random variables. The sample was generated from a bivariate uniform distribution on with independent components, was generated from a bivariate normal distribution with , and the correlation coefficient between and is . In model (24), the coefficient functions are , and .

For smoothing, we use a local linear smoother with a Gaussian kernel , and we use the modified multifold cross-validation criterion proposed by Cai et al. [26], whose algorithm is simple and fast, to select the optimal bandwidth throughout all smoothing steps. We take the weight function . The sample size for the simulated data is 100, and 1000 replications are run in all simulations.

The confidence regions of and their coverage probabilities, with nominal level , were computed from 1000 runs. The estimated block empirical likelihood was used to construct the confidence regions. The simulated results are given in Figure 1. Simulation results show that our block empirical likelihood confidence regions have high coverage probabilities and short average confidence interval lengths.

The histograms of the 1000 estimators of the parameters and are in Figures 2(a) and 2(b), respectively. The Q-Q plots of the 1000 estimators of the parameters and are in Figures 3(a) and 3(b), respectively. Figures 2 and 3 show empirically that these estimators are asymptotically normal. The means of the estimates of the unknown parameters and are 0.33342 and 0.66673, respectively, and their biases (standard deviations) are 0.000128 (0.00308) and 0.000603 (0.00352), respectively.


We also consider the average estimates of the coefficient functions , , and over the 1000 replicates. The estimators are assessed via the root mean squared errors (RMSE); that is, , where and are regular grid points. The boxplot for the 1000 RMSEs is given in Figure 4. From Figures 4(a)–4(c) we see that every estimated curve agrees with the true function curve very closely. Figure 4(d) shows that all RMSEs of estimates for the unknown functions are very small.
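The RMSE criterion as we read it is, for each estimated curve, the root of the average squared deviation from the true function over the grid points; a short sketch:

```python
import numpy as np

def rmse_on_grid(g_hat, g_true, grid):
    """RMSE = sqrt(mean over grid points of (g_hat(u_k) - g_true(u_k))^2)."""
    diff = g_hat(grid) - g_true(grid)
    return float(np.sqrt(np.mean(diff**2)))
```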


*Example 7. *We now apply the block empirical likelihood method to analyze the data from a longitudinal hormone study [28]. The study involved 34 women whose urine samples were collected in one menstrual cycle and whose urinary progesterone was assayed on alternate days. A total of 492 observations were obtained, with each woman contributing from 11 to 28 observations over time. Each woman's cycle length was standardized uniformly to a reference 28-day cycle since the change of the progesterone level for each woman depends on time during a menstrual cycle. In the following, we consider the following model:
where is the th log-transformed progesterone value measured at standardized day since menstruation for the th woman, and and are the age and body mass index of the th individual at day , respectively.

We apply the block empirical likelihood method to fit the data. Because we focus on the estimators of and , we only summarize them in Figure 5. We denote by and the estimators of when the correlation structure is specified as independence and first-order autoregressive, respectively. We see from Figure 5 that neither nor is significant, since both confidence regions for the two estimators contain (0, 0). Therefore, we conclude that the parameters and are not significant, which is consistent with the conclusion of Zhang et al. [28].

#### 6. Proof of the Theorem

To prove Theorem 3, we introduce several lemmas. The first gives uniform convergence rates of and ; it is a straightforward extension of known results in nonparametric function estimation. Moreover, the proofs of Lemmas 9 and 10 are similar to those of the corresponding Lemmas 9 and 10 of Xue and Wang [9], so we omit them.

Lemma 8. *Let for some positive constant . Suppose that conditions ()–(), (), and () hold. Then
*

To state Lemma 9, we use the following notation. Denote . From Lemma 8, we have and ; hence, we can assume that lies in with and , where Let and .

Lemma 9. *Suppose that conditions ()–() hold. Let
*

*Then*

*where is defined in (12).*

Lemma 10. *Suppose that conditions ()–() hold. Then
*

*where is defined in (30), is defined in condition (), and is defined in (2).*

*Proof of Theorem 3. *Note that, when , Lemma 10 also holds. Applying the Taylor expansion to (7) and invoking Lemma 10, we obtain

By (9) and Lemma 10, we have
This together with (40) proves that
where and are defined in (30) and (37), respectively. From (37) of Lemma 10 and (42), we obtain
where . Let , where , are the eigenvalues of . Then there exists an orthogonal matrix such that . Using the notations of Lemma 9, we have

Noting that , from the above equation and Lemma 9, we have

Hence, by (35) of Lemma 9, we have
where is the identity matrix. This together with (43) proves Theorem 3.

*Proof of Theorem 4. *By Lemma 10 and an argument similar to the proof of (42), we obtain
uniformly for , where tends to 0 in probability uniformly for . Note that and . By the expansion of , defined in (19) and (47), we get

This together with (44) and (48) proves Theorem 4.

This completes the proof.

#### Acknowledgments

This research was supported by NNSF project (11171188 and 11231005) of China, Mathematical Finance-Backward Stochastic Analysis and Computations in Financial Risk Control of China (11221061), NSF and SRRF projects (ZR2010AZ001 and BS2011SF006) of Shandong Province of China, K C Wong-HKBU Fellowship Programme for Mainland China Scholars 2010-11, and the Fundamental Research Funds for the Central Universities (27R1310008A).