#### Abstract

To obtain efficient estimation of parameters is a major objective in population pharmacokinetic study. In this paper, we propose an empirical likelihood-based method to analyze the population pharmacokinetic data based on the generalized linear model. A nonparametric version of the Wilk's theorem for the limiting distributions of the empirical likelihood ratio is derived. Simulations are conducted to demonstrate the accuracy and efficiency of empirical likelihood method. An application illustrating our methods and supporting the simulation study results is presented. The results suggest that the proposed method is feasible for population pharmacokinetic data.

#### 1. Introduction

The parameter estimation in population pharmacokinetics (PPK) is a significant problem in clinical research. It is a novel problem in pharmacokinetic (PK) study that combines classic PK models with group statistical models. The PPK parameters, including group typical values, fixed effect parameter, interindividual variation, and intraindividual variation, which are the determinant factors of drug concentration in patients, are taken into consideration. PPK is capable of quantitatively describing the effects of different factors in drug metabolism, such as pathology, physiology, and combined medication, then providing guidance on the adjustment of therapeutic regimen, thus increasing the efficacy and safety in a new drug evaluation.

Many statistical models have been proposed to fit PPK parameters. The most popular analytic statistical model for PPK data is the random effects model proposed by Larid and Ware [1]. Besides, the classic compartmental model and nonlinear mixed effect model proposed by Sheiner et al. in 1977 [2] are the most commonly used statistical models for PPK. And recently, generalized linear model proposed by Salway and Wakefield in 2008 [3] has received wide attention.

In fact nonlinear models are used to estimate parameters of a chosen compartmental model. This model can provide generally good results. However, a disadvantage of the nonlinear models is that it is generally more difficult to fit. From a computational point of view, one also faces the usual challenges associated with nonlinear regression such as choosing starting values, problem with convergence, nonlinear regression diagnostics, and so forth. Other nonlinear models for the analysis of PPK data see [4–6]. Salway and Wakefield [3] proposed a generalized linear model with gamma distribution to deal with PPK data. The GLM is one of the most widely used regression models for statistical analysis. The relevant theoretical investigations are rather complete and the results of parameter estimation have been the standard results in the literature and textbooks. Many methods could solve the generalized linear models, including general least square (GLS) method, quasi-likelihood method, normal theory maximum likelihood, quadratic estimating equations, extended quasi-likelihood, and general estimation equation (GEE). McCulloch [7] proposed three kinds of effective algorithm which can estimate the parameter in generalized linear model accurately. They are Monte Carlo EM (MCEM) algorithm, Monte Carlo Newton-Raphson (MCNR) algorithm, and Simulated Maximum likelihood (SML) separately. In the progress of maximum, Newton-Raphson algorithms, Fisher algorithms, EM algorithms and other basic algorithms can also be used to get the approximate solution through iteration. However, referring to multidimensional integral, it is difficult to solve a likelihood equation, and sometimes the algorithm is not convergent.

In all these methods the covariance structure is an especially crucial problem. In general, we divided variance into four models following in normal distribution: the constant error model, the proportional error model, a combined error model, an alternative combined error model. Therefore, how to choose variance structure appropriately is really confusing. Wakefield et al. [8] suggested the use of integrated nested Laplace approximations for GLMs in this context. In this case, the empirical likelihood is an alternative good choice. Empirical likelihood as a nonparametric data-driven technique was first proposed by Owen [9]. It employs the likelihood function without specifically assuming a distribution for the data and incorporates side information through constraints or a prior distribution, which maximizes the efficiency of the method. What’s more, if the calculation procedure of the algorithm involves iteration, the problem of convergence may exist. Sometimes we cannot obtain convergent solution. But in principle, empirical likelihood can give estimates and confidence regions with almost any method that yields a “reasonable” set of estimating function. By “reasonable” we mean that the estimating functions have zero mean, finite variance, and either based on independent and identically distributed sampling or have higher order moments which allow use of a triangular array argument. Empirical likelihood only requires that the estimating functions have expectation zero. Hence empirical likelihood will be robust to mis-specification of the variance function, as long as the equation for the mean is correct.

In the past two decades, unique and desirable properties of the empirical likelihood method have been studied by a number of researchers. These properties include, but are not limited to, range-respecting, transformation-preserving, asymmetric confidence intervals, Bartlett correctability, and better coverage probability or shorter confidence interval. Empirical likelihood has both the easy implementation of nonparametric inference method and the effectiveness of likelihood inference method. Compared with parametric reference method, it just needs a mild assumption of the existence of the matrix. When the sample size is small, the confidence interval we get by this method is superior to that by the asymptotic normal method. The shape and direction of the confidence interval confidence depend completely on the data itself. The estimation of variance is generally not required. Besides, empirical likelihood has unique superiority in parameter estimation and test of goodness of fit. An updated comprehensive overview of the empirical likelihood is available in Owen [10]. In a large range of situations, empirical likelihood has been shown to have properties analogous to a real likelihood, see Qin and Lawless [11, 12], Li [13], Jing [14], and others. A general framework of empirical likelihood based on estimation function is discussed in Qin and Lawless [12]. Wang et al. [15] incorporated the within-subject correlation structure of the repeated measurements into the empirical likelihood method. They also performed empirical likelihood inference on individual components of regression function.

To the best of our knowledge, there is little literature on the empirical likelihood inference for PPK study with longitudinal data. In this paper, we develop an empirical likelihood method for inference of the parametric component in generalized linear models with population pharmacokinetic data through constructing auxiliary random vectors. The model is as follows: where is the response variable. A few conditions needed to be assumed for the transferred additive error model to be valid to use. Assumptions are listed as follows: These assumptions are used to insure a ascending absorption phase and a descending elimination phase. The model is first proposed by Salway and Wakefield [3].

The rest of the paper is organized as follows. In Section 2, the empirical likelihood method for PPK models is proposed. The asymptotic normality of the proposed empirical estimator and asymptotic chi-square distribution of the proposed empirical likelihood ratio are derived under some regularity assumptions. In Section 3, the simulation is presented and the results also support the theory. A real data analysis is reported in Section 4. Finally, the paper concludes with some discussions in Section 5. Proofs are given in the appendix.

#### 2. Methods

In recent years, numerous scholars have conducted a detailed study on PPK parameter estimation and a variety of methods appeared, for example GEE (Liang and Zeger [16]). Here, we use the method of quasi-likelihood to derive estimating functions for empirical likelihood. In the application part, we will compare the effectiveness of GEE and empirical likelihood method. Note that, since maximum likelihood and quasi-likelihood coincide for all linear exponential families, the class of estimating functions derived using the former method automatically falls within that of the latter. See Morris [17] and Nelder and Pregibon [18] Standard empirical likelihood for generalized linear models has been considered by Kolaczyk [19], based on constraints derived from the score function of the quasi-likelihood. We employ empirical likelihood and deduce as follows.

Assume there are subjects with observations per subject in a study. From (1.1), the generalized error model can be rewritten as where and with .

Therefore, change both side of (1.1) to logarithmic, and following Xue and Zhu [20], we can define an empirical log-likelihood ratio function as follows. Suppose that is a 3*1 dimensional parameters vector to be estimated. From model (1.1), an auxiliary random vector can be defined as where , , and is an invertible covariance matrix dependent on the parameter vector , and will be equal to . can be estimated by using the method of moments [16]. Note that = 0 if is the true parameters. Using this, an empirical log-likelihood-ratio function is defined as By the Lagrange multiplier method, the optimal value for is given by where is Lagrange multiplier

By combining (2.3) and (2.4), can be represented as where is determined by Define as the unique maximizer of . We state the asymptotic properties of the empirical likelihood ratio. The following theorems indicate that is asymptotically normal and the empirical log-likelihood ratio for is a standard chi-square distribution. Some assumptions and proofs of our results are given in the appendix.

Theorem 2.1. *In addition to Lemma A.1 mentioned in the appendix, one assumes that is continuous in a neighborhood of the true value . Then if can bounded by some integrals function in the neighborhood, then one has
**
where .*

Theorem 2.2. *Using the assumptions of Theorem 2.1, if is the true value of the parameter and the last component of is a positive number, then one has
**
where → stands for convergence in distribution, and is a chi-squared distribution with degrees of freedom.*

As a consequence of Theorem 2.1, confidence regions for the parameter can be constructed by (2.3). More precisely, for any , let be the fraction quintile such that . Then constitutes a confidence region for with asymptotically correct coverage probability .

#### 3. Simulation

In this section some simulation results are reported to show the performance of the proposed empirical likelihood method. A two-stage generalized linear random effect model that incorporates a finite mixture model is considered in the simulation.

At the first stage where .

At the second stage where is the associated fixed effect.

To perform the simulations, 500 datasets are generated, each consisting of , 50, 100 subjects. The time schedule is set as 1 2 4 8 10 12 16 32/h. Dosage is set to 5 mg. The fixed effect is set as . The random effect is generated from a normal distribution as follows: , and . For the variance matrix and , we choose a diagonal matrix with elements of 0.05^{2} and 0.005^{2}, respectively, for the reason that neither within-individual nor between-individual variation has been considered here.

The conjugate gradient method used to search for the optimal empirical likelihood parameter estimates by modifying a routing is given in Press et al. [21]. For each dataset, we compute the MELE beta for fixed effects, and the 95% intervals. To illustration the efficacy of empirical likelihood, we also analysis each dataset by GEE method. The simulation results for different sample size are summarized in Table 1, based on 500 simulations.

The results are listed in Table 1. The “Mean” is constructed according to the 500 simulated point estimators, which was computed as the mean value of 500 simulated results of . The confidence intervals are listed in the columns “CI”. The lower and upper bounds of “CI” are the average of 500 simulated lower bounds, and upper bounds, respectively. From Table 1, it is fairly that, as the sample size increases, the confidence intervals become smaller, and the length of EL confidence interval shows advantage over GEE method. For the estimator , these show that the empirical likelihood method for the model can obtain accurate estimator and can construct the effective confidence intervals of .

#### 4. Application

We apply the empirical likelihood in the proposed generalized linear models to the theophylline data from bronchial asthma study. They were originally analyzed in Upton et al. [22] and are available from the Resource Facility for Population Kinetics. Initially this data set was used for Bayesian analysis of linear and nonlinear population models by using the Gibbs sampler. Details about the related design, methods, and medical implications of the asthma study have been described by Upton et al. [22]. Although all the individuals were scheduled to have clinical visits followed the time schedule, due to the various reasons, some individuals missed scheduled visits, which resulted in unequal number of measurements and different measurement times across individuals. It has been also used in Bayesian analysis of linear and nonlinear population models employing Gibbs sampler with normal errors. Wakefield et al. [8] suggested the generalized linear mixed model with the gamma error utilize for it. Based on this model, he had obtained a Bayesian result better than the compartment model result [8]. A Bayesian analysis of the dataset was completed by Salway and Wakefied [3] and Yan et al. [23].

Here we applied our method to the data of 12 subjects who were administered an oral dose of the antiasthmatic agent theophylline, with 10 concentration measurements obtained from each individual over the subsequent 25 hours. Confidence regions for each parameters are shown in Figure 1, respectively. The ranges of the confidence regions are small, which suggests a good fit for the parameters.

**(a)**

**(b)**

A comparable representation with the GEE methods completed for the derived parameters can be seen in Table 2. We compared the proposed empirical likelihood method with GEE method and the typically compartment model results. It concludes that these methods are comparable in terms of accuracy and efficiency. Table 2 indicates that EL-based confidence region is smaller than the GEE-based region and compartment results. Moreover, these findings are similar to the ones of Salway and Wakefield [3] and Yan et al. [23].

#### 5. Conclusions

This paper shows that empirical likelihood is an effective method for analyzing the PPK model. The empirical log-likelihood ratios are proven to be asymptotically chi-squared, and then the corresponding confidence regions for the parameters of interest are constructed. A simulation study is used to investigate the performance of the empirical likelihood method. The results show that the empirical likelihood could obtain accurate estimator and construct the effective confidence regions. We also analyze the theophylline data from the Upton et al. [22], and the results are similar and comparable to the ones of Wakefield et al. [8] and Yan et al. [23].

At last, we have to mention the two disadvantages of this method. For one, the likelihood function in the empirical likelihood method is completely determined by the data, while avoiding the estimation of the variance, so the estimation of individual effect may not be obtained. But actually, it is an important parameter in PPK study and directly impacts doctor's recommendation of individual administration. The following two steps are advised to solve this problem. First, estimate the parameters in the model through empirical likelihood function and then calculate the random effect with the estimators.

For another, in this paper, we did not consider the correlation structure of within-subject observation. However, in the current literature on empirical likelihood methods, the potential correlation structure of within-subject observation was first taken into consideration by Wang et al. [15]. They also demonstrated that it was easy to incorporate the correlation structure. It offered a promising direction to study this problem. Furthermore, we have only accomplished the complete data set at present and incomplete data will be further studied in future. From the application point of view, missing data and censoring data are worth being studied.

#### Appendix

Lemma A.1. *Assume that is positive definite, is continuous in a neighborhood if the true value , 和 are bounded by some integrable function in this neighborhood, and the rank of is . Then, as , with probability 1 attains its minimum value at some point in the interior of the ball , and and satisfy
*

*where*

*Proof of Lemma A.1. *Denote , for , where . First, we give a lower bound for on the surface of the ball. Similar to the proof of Owen (1990) [24], when and , we have
uniformly about .

By this and Taylor expansion, we have (uniformly for )
where, , and is the smallest eigenvalue of
Similarly,
Since is a continuous function about as belongs to the ball , has minimum value in the interior if this ball, and satisfies

Theorem A.2. *In addition to Lemma A.1. one assumes that is continuous in in a neighborhood of the true value . Then if can bounded by some integrable function in the neighborhood, then
**
where .*

*Proof of Theorem A.2. *Taking derivatives about , , we have
Expanding at , by the condition of the Theorem 2.1 and Lemma A.1, we have
where .

We have
where
From this and , we know that . Easily we have
where, .

Theorem A.3. *If is the true value of the parameter and the last component of is a positive number, similar to Qin and Lawless [11], then one has
**
where → stands for convergence in distribution, and is a chi-squared distribution with p degrees of freedom.*

*Proof of Theorem A.3. *The log-empirical likelihood ratio test statistics is

Note that
where .

Also under :
and

Thus
Note that converges to a standard multivariate normal distribution and that is symmetric and idempotent, with trace equal to . Hence the empirical likelihood ratio statistic converges to .

#### Acknowledgments

Research was supported in part by the Project NSFC (Program no. 11171065 and 81130068), Natural Science Foundation of Jiangsu Province (Program no. BK2011058), and the Fundamental Research Funds for the Central Universities (Program no. JKQ2011032). The author thank the Editor and anonymous referees for all their comments and suggestions, which have considerably improved our presentations.