Journal of Probability and Statistics
Volume 2018, Article ID 4878925, 9 pages
https://doi.org/10.1155/2018/4878925
Research Article

Local Influence Analysis for Quasi-Likelihood Nonlinear Models with Random Effects

1School of Mathematics and Statistics, Guizhou University of Finance and Economics, Guiyang 550025, China
2Department of Mathematics and Statistics, University of North Carolina at Charlotte, NC 28223, USA
3Department of Mathematics, Southern University of Science and Technology, Shenzhen 518055, China

Correspondence should be addressed to Xuejun Jiang; jiangxj@sustc.edu.cn

Received 9 May 2018; Accepted 16 July 2018; Published 8 August 2018

Academic Editor: Steve Su

Copyright © 2018 Tian Xia et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

We propose a quasi-likelihood nonlinear model with random effects, which is a hybrid extension of quasi-likelihood nonlinear models and generalized linear mixed models. It includes a wide class of existing models as examples. A novel penalized quasi-likelihood estimation method is introduced. Based on the Laplace approximation and a penalized quasi-likelihood displacement, local influence of minor perturbations on the data set is investigated for the proposed model. Four concrete perturbation schemes are considered in the local influence analysis. The effectiveness of the proposed methodology is illustrated by some numerical examinations on a pharmacokinetics dataset.

1. Introduction

In this paper, we propose a quasi-likelihood nonlinear model with random effects (QLNMWRE) and investigate local influence of the model. The QLNMWRE is a hybrid generalization of quasi-likelihood nonlinear models [1, 2] and generalized linear mixed models, and it combines the advantages of both models. Generalized linear mixed models (GLMMs) extend the well-known generalized linear models [3] by adding random effects to the linear predictor. GLMMs are effective and flexible for modeling nonnormal responses, repeated measurements, and other forms of clustered data. Efficient inference for GLMMs depends on the underlying distribution of the data, yet the exact distribution is rarely known in practice. In contrast, the quasi-likelihood method [4] requires assumptions only on the first and second moments of the distribution and has been widely applied in the theory and practice of statistics (see, e.g., [5–8]).

Detecting influential observations is important in data analysis. Local influence analysis has become a general tool for detecting groups of points with great influence on the fitted model through perturbation schemes [9]. This approach has been successfully applied in many models, such as mixed models [10, 11], generalized linear models [12], generalized linear mixed models [13], exponential family nonlinear models [14], nonlinear reproductive dispersion mixed models [15], nonlinear mixed-effect models [16, 17], and multivariate threshold time series models [1]. However, in these references the local influence method depends heavily on the likelihood displacement, which requires a fully specified likelihood that is rarely available in practice. In contrast, quasi-likelihood methods require only the first two moments of the response variables rather than the exact likelihood function. Hence, we conduct influence analysis of the QLNMWRE using a novel penalized quasi-likelihood estimation method. The proposed methodology is illustrated by analyzing a pharmacokinetics dataset.

The remainder of this paper is organized as follows. In Section 2, we introduce the QLNMWRE and the corresponding estimation method. A Fisher-scoring iteration algorithm is presented to compute the estimators. In Section 3, a penalized quasi-likelihood displacement (PQLD) is proposed, and local influence is assessed under four different perturbation schemes. In Section 4, the pharmacokinetics dataset is used to illustrate the effectiveness of the proposed methodology. Finally, Section 5 concludes with a discussion.

2. Models and Estimation Method

Let be a response vector of length , and let and be and matrices of explanatory variables associated with fixed and random effects, respectively. Conditional on the dimensional vector of random effects, , the observations, , are independent and satisfy that where is an unknown parameter vector defined in a compact subset , and are defined in a subset of and a subset of , respectively, is a known variance function, is a dispersion parameter that is known or can be estimated separately, is a continuously differentiable function such that the derivative matrix has rank for all , with , and the random effects are assumed to be multivariate normally distributed: with being a known nonnegative definite matrix. Following [2, 3, 18, 19], the conditional log quasi-likelihood on is defined as where The model defined by (1)-(3) is the so-called QLNMWRE.
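For orientation, the standard ingredients of a quasi-likelihood model with normal random effects can be written as below. This is a sketch in generic notation, not a transcription of equations (1)-(3): the symbols $\mu_i^{b}$, $V$, $\sigma^{2}$, and $\Sigma$ are assumptions introduced here, and the paper's specific nonlinear mean function is not reproduced. The integral form of the quasi-likelihood is the usual one from Wedderburn [4].

```latex
\begin{align*}
  \mathrm{E}(y_i \mid b) &= \mu_i^{b}, \qquad
  \mathrm{Var}(y_i \mid b) = \sigma^{2}\, V(\mu_i^{b}), \qquad i = 1, \dots, n, \\
  b &\sim N_q(0, \Sigma), \\
  Q(\mu^{b}; y) &= \sum_{i=1}^{n} \int_{y_i}^{\mu_i^{b}}
      \frac{y_i - t}{\sigma^{2}\, V(t)} \, dt .
\end{align*}
```

Only the first two conditional moments enter this definition, which is why no full distributional assumption on the responses is needed.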

Clearly, this QLNMWRE encompasses some important special cases. If , then the above model is just the quasi-likelihood nonlinear model discussed by [2]; if , and are independently drawn from a one-parameter exponential family of distributions with density where is a measure, then it reduces to generalized linear models with random effects (see [20, 21]). Hence, the QLNMWRE is a hybrid extension of the quasi-likelihood nonlinear models and the generalized linear models with random effects.

Let be a probability density function of random effect . Then the joint log quasi-likelihood function of and is Similar to the relationship between the joint log-likelihood function and the marginal log-likelihood function, we have where is the marginal log quasi-likelihood function of and is the log quasi-likelihood function of given , i.e., Following the arguments in [20], the integrated log quasi-likelihood function used to estimate is defined by where denotes the deviance measure of fit. If, conditional on , is a member of the exponential family, then is the conditional log-likelihood of given , and is the log-likelihood function.

In general, no analytical expressions are available for the integral in (8), and approximation techniques are needed. The simplest approach is the Laplace approximation [22, 23]. Obviously, the right-hand side of (8) is where . When the Laplace method is applied to approximate the integrated quasi-likelihood function (8), estimates of for fixed are obtained by maximizing the penalized quasi-likelihood (PQL) (8): where , and is the root of for fixed . We will use the penalized quasi-likelihood to estimate and to conduct local influence analysis. To this end, we need the following assumptions.
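The Laplace step can be illustrated on a one-dimensional toy problem. The sketch below is not the paper's model: it assumes a scalar random intercept `b`, a log-type mean `exp(beta + b)`, variance function V(mu) = mu, and a normal penalty with variance `tau2` (all hypothetical names). It finds the maximiser of the penalized quasi-likelihood in `b` by Newton iteration, forms the Laplace approximation to the log of the integral, and compares it with a trapezoid-rule reference.

```python
import math

def penalized_ql(b, y, beta, tau2):
    """Quasi-likelihood kernel with V(mu) = mu plus the normal penalty for scalar b."""
    eta = beta + b
    return sum(yi * eta - math.exp(eta) for yi in y) - b * b / (2.0 * tau2)

def mode_of_b(y, beta, tau2, tol=1e-12, max_iter=100):
    """Newton iteration for the maximiser of penalized_ql in b (concave problem)."""
    b, n = 0.0, len(y)
    for _ in range(max_iter):
        mu = math.exp(beta + b)
        grad = sum(y) - n * mu - b / tau2
        hess = -n * mu - 1.0 / tau2          # always negative
        step = grad / hess
        b -= step
        if abs(step) < tol:
            break
    return b

def laplace_log_integral(y, beta, tau2):
    """log of the Laplace approximation to the integral of exp(penalized_ql) over b."""
    b_tilde = mode_of_b(y, beta, tau2)
    neg_hess = len(y) * math.exp(beta + b_tilde) + 1.0 / tau2
    return penalized_ql(b_tilde, y, beta, tau2) + 0.5 * math.log(2.0 * math.pi / neg_hess)

def quadrature_log_integral(y, beta, tau2, lo=-10.0, hi=10.0, m=20001):
    """Trapezoid-rule reference value, accumulated on the log scale for stability."""
    h = (hi - lo) / (m - 1)
    vals = [penalized_ql(lo + k * h, y, beta, tau2) for k in range(m)]
    vmax = max(vals)
    total = sum(math.exp(v - vmax) for v in vals)
    total -= 0.5 * (math.exp(vals[0] - vmax) + math.exp(vals[-1] - vmax))
    return vmax + math.log(total * h)

if __name__ == "__main__":
    y = [3, 5, 4, 6, 2]
    print(laplace_log_integral(y, 1.0, 0.5), quadrature_log_integral(y, 1.0, 0.5))
```

In this toy setting the two values agree closely because the integrand is sharply peaked around its mode; the approximation can be less accurate when the cluster carries little information.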

Assumption A.
(i) ;
(ii) there exists some constant and some compact subset such that

It is easily seen that Assumption A holds in generalized linear mixed models and exponential family nonlinear random effects models. Assumption A guarantees existence of the variance-covariance matrix of , where with . Let and , where Put , , , and . Under Assumption A, we have the following result.

Theorem 1. For the model defined by (1)-(3), conditional on , the quasi-score function, the quasi-observed information matrix, and the quasi-Fisher information matrix for admit the following representations: where indicates the array multiplication.

Let denote the maximum quasi-likelihood estimator (MQLE) of , which is the solution of equation . Then the Fisher-scoring iteration method can be used for computing by iteratively solving the following equation (see [14, 24]): where , and are all evaluated at and .

On the other hand, it follows from (5) that the quasi-score function and the quasi-Fisher information matrix for can be, respectively, expressed as where with , and Hence, the Fisher-scoring iteration algorithm for computing the predictor of under known is given by where and are all evaluated at and . As the iteration scheme (19) converges, converges to .

In general, the choice of initial value is important for the Fisher-scoring iteration algorithm. We use the algorithm in [2] for quasi-likelihood nonlinear models to find the starting values of parameter for QLNMWRE with . Hence, the MQLE of can be obtained by solving (16) and (19) until convergence.
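The alternating structure of the algorithm (one Fisher-scoring update for the fixed effects, then one for the random effects, repeated until convergence) can be sketched on a deliberately simple case. The code below is not equations (16) and (19): it assumes a toy model with a scalar fixed effect `beta`, cluster-specific random intercepts `b[j]`, conditional mean `exp(beta + b[j])`, variance function V(mu) = mu, and known random-effect variance `tau2`. All names are hypothetical.

```python
import math

def fit_pql(clusters, tau2, tol=1e-10, max_iter=1000):
    """Alternate Fisher-scoring steps for beta and for each b_j on a toy PQL."""
    n = sum(len(c) for c in clusters)
    beta = math.log(sum(sum(c) for c in clusters) / n)   # crude starting value
    b = [0.0] * len(clusters)
    for _ in range(max_iter):
        # Fisher-scoring step for beta with the random effects held fixed
        mu = [math.exp(beta + bj) for bj in b]
        score = sum(sum(c) - len(c) * m for c, m in zip(clusters, mu))
        info = sum(len(c) * m for c, m in zip(clusters, mu))
        step_beta = score / info
        beta += step_beta
        max_step = abs(step_beta)
        # Fisher-scoring step for each b_j with beta held fixed
        for j, c in enumerate(clusters):
            m = math.exp(beta + b[j])
            score_j = sum(c) - len(c) * m - b[j] / tau2   # penalty enters the score
            info_j = len(c) * m + 1.0 / tau2              # and the information
            step = score_j / info_j
            b[j] += step
            max_step = max(max_step, abs(step))
        if max_step < tol:
            break
    return beta, b

if __name__ == "__main__":
    data = [[2, 3, 4], [6, 7, 5], [1, 0, 2]]
    print(fit_pql(data, tau2=0.5))
```

The starting value here is the log of the overall mean, which plays the role of the fit without random effects used to initialise the iteration.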

In order to investigate the statistical diagnostic measures for QLNMWRE, we rewrite (16) where . When converges to , can be expressed as where , and are all evaluated at .

3. Local Influence

The aim of local influence analysis is to investigate the behavior of some influence measure when small perturbations are made in the model or data, where is an m-dimensional vector of perturbations restricted to some open subset . For simple statistical models, Cook [9] constructed the likelihood displacement (LD) and used it to assess the local influence of a minor perturbation. Although this approach is very useful, serious difficulties are encountered when applying it to complicated models, because the likelihood function is intractable. To cope with these difficulties, some authors have considered alternatives to the LD. For instance, Zhu and Lee [25] proposed the Q-likelihood displacement and established an approach to assess local influence in statistical models with incomplete data, and Jung [26] presented a quasi-likelihood displacement for local influence analysis in generalized estimating equations. Inspired by [25, 26], we define in this work a new penalized quasi-likelihood displacement and then adapt the local influence approach introduced by [9] to the QLNMWRE.

Let and be the penalized quasi-likelihood for the unperturbed and perturbed models, respectively. We assume that there is an such that . Let and be the MQLE of under the unperturbed and perturbed models, respectively. Similar to the likelihood displacement [9], we define the penalized quasi-likelihood displacement (PQLD) as The influence graph is defined as . Following the approach developed in [9, 25, 26], the normal curvature of at in the direction of some unit vector can be used to summarize the local behavior of the penalized quasi-likelihood displacement. As shown in [9], the normal curvature in the unit direction at is given by where , in which is a matrix evaluated at and , is a Hessian matrix evaluated at . The maximum curvature , which is the largest absolute eigenvalue of , and the corresponding direction vector are usually used for identifying locally influential observations. A large value of indicates a serious local problem, and if the -th element in is relatively large, special attention should be paid to the element being perturbed by . To apply the local influence method in [9] to the QLNMWRE, we consider the following four perturbation schemes.
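The curvature computation can be sketched numerically. The code below follows Cook's convention from [9], in which the normal curvature in direction d is 2|d' Δ' L̈⁻¹ Δ d|; it assumes the Hessian is negative definite, so that the matrix Δ'(−L̈)⁻¹Δ is positive semidefinite and power iteration recovers its largest eigenvalue. The function names and the plain-list matrix representation are choices made for this illustration, and the inputs `delta` (p × m) and `hessian` (p × p) would come from whichever perturbation scheme is being studied.

```python
import math

def solve(a, rhs):
    """Solve a x = rhs by Gauss-Jordan elimination with partial pivoting."""
    p = len(a)
    aug = [list(a[i]) + list(rhs[i]) for i in range(p)]
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(p):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * v for x, v in zip(aug[r], aug[col])]
    return [[aug[i][p + j] / aug[i][i] for j in range(len(rhs[0]))] for i in range(p)]

def max_curvature(delta, hessian, n_iter=500):
    """Return (C_max, d_max) for B = delta' (-hessian)^{-1} delta via power iteration."""
    p, m = len(delta), len(delta[0])
    neg_h = [[-v for v in row] for row in hessian]
    sol = solve(neg_h, delta)                          # (-H)^{-1} Delta, p x m
    bmat = [[sum(delta[k][i] * sol[k][j] for k in range(p)) for j in range(m)]
            for i in range(m)]
    d = [1.0 / math.sqrt(m)] * m                       # start vector; must not be
    lam = 0.0                                          # orthogonal to the top eigenvector
    for _ in range(n_iter):
        w = [sum(bmat[i][j] * d[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        d = [x / norm for x in w]
        lam = norm
    return 2.0 * lam, d

if __name__ == "__main__":
    print(max_curvature([[1.0, 2.0, 2.0]], [[-1.0]]))
```

An index plot of the absolute entries of the returned direction vector is the usual diagnostic display: cases with large entries are the ones flagged as locally influential.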

3.1. Case-Weights Perturbation

Let be an perturbation vector such that . The joint log quasi-likelihood function for the perturbed model is given by where . Then the penalized quasi-likelihood function can be expressed as where , , and satisfies Hence, , where . Then and thus where .

3.2. Response Variable Perturbation

A perturbation of the response variables is introduced by replacing by , where , and represents the situation with no perturbation. In this case, the joint log quasi-likelihood function for the perturbed model is given by where C is a constant. It follows from Section 2 that the penalized quasi-likelihood function is where , and satisfies It follows that , where with . Then and where with and .

3.3. Explanatory Variables Perturbation

In this case, we focus on the perturbation of a specific explanatory variable. Under this condition we have the perturbed explanatory matrix with , where is a single explanatory variable of matrix corresponding to and denotes no perturbation. Then the joint log quasi-likelihood function for the perturbed model is where C is a constant, , and . It follows from Section 2 that where , , and satisfies Therefore, and . Let and . Then where indicates the array multiplication.

3.4. Perturbation of Covariates in Random Effects

Consider perturbing the data for the th explanatory variable of , by modifying the data matrix Z as , where is a vector with 1 at th position and zeros elsewhere. Under this situation, the perturbed joint log quasi-likelihood can be expressed as where is a quantity that does not depend on and , and . When , it indicates no perturbation. It follows from Section 2 that where , and satisfies and therefore, with . Then Hence,

4. Numerical Results

To illustrate how to use the proposed methodology, we consider the data set reported by [27]. The data came from a study of the pharmacokinetics of indomethacin following bolus intravenous injection of the same dose in six human volunteers. For each subject, the plasma concentrations of indomethacin were measured at 11 time points from 15 min to 8 hours postinjection. Davidian and Giltinan analyzed this dataset in [28] using a nonlinear model for repeated measurements; we model it using the following QLNMWRE: where the response variables follow the Gumbel distribution (cf. [29]) with the density function , and . By [29], we have and , where is Euler's constant, and . It is easily shown that Assumption A holds for our proposed model. Therefore, we can apply our proposed methodology to estimate the parameters in model (43). Using the algorithm in Section 2, we obtain the MQLE of , the predictive values of as follows: and
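The two moments that the quasi-likelihood needs from the Gumbel distribution can be checked numerically. The sketch below assumes the maximum-type parametrisation with location `loc` and scale `scale`, for which the mean is loc + γ·scale and the variance is π²·scale²/6, with γ Euler's constant. The paper's own parametrisation may differ (for the minimum-type form the mean is loc − γ·scale, while the variance is unchanged). The function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def gumbel_pdf(y, loc, scale):
    """Density of the maximum-type Gumbel distribution."""
    z = (y - loc) / scale
    return math.exp(-z - math.exp(-z)) / scale

def gumbel_moments(loc, scale, n=100001):
    """Mean and variance by the trapezoid rule on a wide truncated interval."""
    lo, hi = loc - 10.0 * scale, loc + 40.0 * scale
    h = (hi - lo) / (n - 1)
    m0 = m1 = m2 = 0.0
    for k in range(n):
        y = lo + k * h
        w = 0.5 if k in (0, n - 1) else 1.0      # trapezoid end-point weights
        f = w * gumbel_pdf(y, loc, scale) * h
        m0 += f
        m1 += y * f
        m2 += y * y * f
    mean = m1 / m0
    return mean, m2 / m0 - mean ** 2

if __name__ == "__main__":
    mean, var = gumbel_moments(2.0, 1.5)
    print(mean, 2.0 + EULER_GAMMA * 1.5)
    print(var, math.pi ** 2 * 1.5 ** 2 / 6.0)
```

Because the variance does not depend on the location, the variance function for this family is constant in the mean, and only the dispersion involves the scale.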

Now we present local influence analysis for the above fitting results. Under case-weight perturbation, cases 23, 45, and 56 are the most influential, as depicted in Figure 1(a). Cases 1, 12, 23, 45, and 56 are identified as influential points, and case 23 is the most influential, as shown in Figure 1(b). The index plots of and for perturbation on explanatory variables are given in Figures 2(a) and 2(b), respectively. From Figure 2(a) we can see that cases 12, 23, 45, and 56 are identified as influential points. Figure 2(b) shows that cases 1, 12, 23, 34, 45, and 56 are influential. Figure 3 displays the index plots of for the perturbation of random effects. For these types of perturbation, case 23 is identified as the most influential. Note that case 23 exerts great influence in each perturbation scheme, which indicates that the results obtained through different perturbation schemes are quite consistent. Special attention should be paid to these influential cases, and it may be worthwhile to apply a more formal test to check whether they are outliers.

Figure 1: Index plots of - and for case-weights perturbation.
Figure 2: Index plots of perturbation of explanatory variables.
Figure 3: Index plots of for perturbation of random effects design matrix.

5. Conclusion

In this work, we have assessed the local influence of minor perturbations on the proposed model. The key idea of the classical approach is to study the behavior of the likelihood displacement obtained from a relevant perturbation. However, it is difficult to apply this directly to the proposed model because the marginal quasi-likelihood function of the QLNMWRE involves an intractable integral. To solve this problem, we have employed Laplace's method to approximate the marginal quasi-likelihood function of the QLNMWRE, which results in the penalized quasi-likelihood (PQL). Based on the PQL and the penalized quasi-likelihood displacement, estimates of the unknown parameters have been proposed, and local influence analysis has been carried out. Our numerical example has shown that the proposed local influence technique is useful for detecting influential points. Although the focus of this article is on the assessment of influential points in the QLNMWRE, the local influence approach can be extended to other complicated models.

Appendix

Proof of Theorem 1. Differentiating (10) with respect to yields that It follows from the definition of and (10) that which implies Substituting (A.3) into (A.1) yields (13). Differentiating (13) with respect to leads to Differentiating (A.3) with respect to yields Note that ; it follows that Combining (A.5) and (A.6) leads to which implies Substituting (A.8) into (A.4) yields (14). It follows from Assumption A and that (15) holds. Thus, the proof is completed.

Data Availability

The repeated measurement data supporting this study are from previously reported studies and datasets, which have been cited in [27, 28]. The processed data are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

T. Xia was supported by the NSFC (11361013 and 11571161), the Science and Technology Foundation of Guizhou Province of China [(2008)2249], and Talent Introduction Foundation of Guizhou University of Finance and Economics. X. Jiang was supported by the Natural Science Foundation of Guangdong Province of China (2016A030313856) and the Shenzhen Sci-Tech Fund (no. JCYJ20170307110329106).

References

  1. X. Jiang, T. Xia, and X. Wang, “Asymptotic properties of maximum quasi-likelihood estimator in quasi-likelihood nonlinear models with stochastic regression,” Communications in Statistics—Theory and Methods, vol. 46, no. 13, pp. 6229–6239, 2017.
  2. T. Xia, X.-R. Wang, and X.-J. Jiang, “Asymptotic properties of maximum quasi-likelihood estimator in quasi-likelihood nonlinear models with misspecified variance function,” Statistics. A Journal of Theoretical and Applied Statistics, vol. 48, no. 4, pp. 778–786, 2014.
  3. P. McCullagh and J. A. Nelder, Generalized Linear Models, Chapman & Hall, London, UK, 2nd edition, 1989.
  4. R. W. Wedderburn, “Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method,” Biometrika, vol. 61, pp. 439–447, 1974.
  5. J. Fan and I. Gijbels, Local Polynomial Modelling and Its Applications, Chapman & Hall, London, UK, 1996.
  6. J. Jiang, X. Jiang, J. Li, Y. Liu, and W. Yan, “Spatial quantile estimation of multivariate threshold time series models,” Physica A: Statistical Mechanics and its Applications, vol. 486, pp. 772–781, 2017.
  7. X. Jiang, J. Li, T. Xia, and W. Yan, “Robust and efficient estimation with weighted composite quantile regression,” Physica A: Statistical Mechanics and its Applications, vol. 457, pp. 413–423, 2016.
  8. K. Y. Liang and S. L. Zeger, “Longitudinal data analysis using generalized linear models,” Biometrika, vol. 73, no. 1, pp. 13–22, 1986.
  9. R. D. Cook, “Assessment of local influence,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 48, no. 2, pp. 133–169, 1986.
  10. R. J. Beckman, C. J. Nachtsheim, and R. D. Cook, “Diagnostics for mixed-model analysis of variance,” Technometrics. A Journal of Statistics for the Physical, Chemical and Engineering Sciences, vol. 29, no. 4, pp. 413–426, 1987.
  11. L. C. Montenegro, V. H. Lachos, and H. Bolfarine, “Local influence analysis for skew-normal linear mixed models,” Communications in Statistics—Theory and Methods, vol. 38, no. 3-5, pp. 484–496, 2009.
  12. W. Thomas and R. D. Cook, “Assessing influence on predictions from generalized linear models,” Technometrics, vol. 32, no. 1, pp. 59–65, 1990.
  13. L. Xiang, A. H. Lee, and S.-K. Tse, “Assessing local cluster influence in generalized linear mixed models,” Journal of Applied Statistics, vol. 30, no. 4, pp. 349–359, 2003.
  14. B. C. Wei, Exponential Family Nonlinear Models, Springer, Singapore, 1998.
  15. N.-S. Tang, B.-C. Wei, and W.-Z. Zhang, “Influence diagnostics in nonlinear reproductive dispersion mixed models,” Statistics. A Journal of Theoretical and Applied Statistics, vol. 40, no. 3, pp. 227–246, 2006.
  16. E. F. Vonesh and R. L. Carter, “Mixed-effects nonlinear regression for unbalanced repeated measures,” Biometrics - A Journal of the International Biometric Society, vol. 48, no. 1, pp. 1–17, 1992.
  17. E. F. Vonesh, “A note on the use of Laplace's approximation for nonlinear mixed-effects models,” Biometrika, vol. 83, no. 2, pp. 447–452, 1996.
  18. T. Xia, X. Jiang, and X. Wang, “Strong consistency of the maximum quasi-likelihood estimator in quasi-likelihood nonlinear models with stochastic regression,” Statistics & Probability Letters, vol. 103, pp. 37–45, 2015.
  19. T. Xia, X. Jiang, and X. Wang, “Diagnostics for quasi-likelihood nonlinear models,” Communications in Statistics—Theory and Methods, vol. 46, no. 18, pp. 8836–8851, 2017.
  20. N. E. Breslow and D. G. Clayton, “Approximate inference in generalized linear mixed models,” Journal of the American Statistical Association, vol. 88, no. 421, pp. 9–25, 1993.
  21. X. Lin, “Variance component testing in generalised linear models with random effects,” Biometrika, vol. 84, no. 2, pp. 309–326, 1997.
  22. L. Tierney, R. E. Kass, and J. B. Kadane, “Approximate marginal densities of nonlinear functions,” Biometrika, vol. 76, no. 3, pp. 425–433, 1989.
  23. R. Wolfinger, “Laplace's approximation for nonlinear mixed models,” Biometrika, vol. 80, no. 4, pp. 791–795, 1993.
  24. Y. Lee and J. A. Nelder, “Hierarchical generalized linear models,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 58, no. 4, pp. 619–678, 1996.
  25. H.-T. Zhu and S.-Y. Lee, “Local influence for incomplete-data models,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 63, no. 1, pp. 111–126, 2001.
  26. K.-M. Jung, “Local influence in generalized estimating equations,” Scandinavian Journal of Statistics, vol. 35, no. 2, pp. 286–294, 2008.
  27. K. C. Kwan, G. O. Breault, E. R. Umbenhauer, F. G. McMahon, and D. E. Duggan, “Kinetics of indomethacin absorption, elimination, and enterohepatic circulation in man,” Journal of Pharmacokinetics and Biopharmaceutics, vol. 4, no. 3, pp. 255–280, 1976.
  28. M. Davidian and D. M. Giltinan, Nonlinear Models for Repeated Measurement Data, Chapman and Hall, London, 1995.
  29. E. J. Gumbel, Statistics of Extremes, Columbia University Press, New York, 1958.