We propose a marginalized joint-modeling approach for marginal inference on the association between longitudinal responses and covariates when longitudinal measurements are subject to informative dropouts. The proposed model is motivated by the idea of linking longitudinal responses and dropout times by latent variables while focusing on marginal inferences. We develop a simple inference procedure based on a series of estimating equations, and the resulting estimators are consistent and asymptotically normal with a sandwich-type covariance matrix ready to be estimated by the usual plug-in rule. The performance of our approach is evaluated through simulations and illustrated with a renal disease data application.

1. Introduction

Longitudinal studies often encounter data attrition because subjects drop out before the designated study end. Both statistical analysis and practical interpretation of longitudinal data can be complicated by dropouts. For example, in the Modification of Diet in Renal Disease (MDRD) study [1, 2], one main interest was to investigate the efficacy of interventions of blood pressure control and diet modification on patients with impaired renal functions. The primary outcome was glomerular filtration rate (GFR), which measured filtering capacity of kidneys, and was repeatedly measured over the study period. However, some patients could leave the study prematurely for kidney transplant or dialysis, which precluded further GFR measurements. This resulted in a dropout mechanism that could relate to patients’ kidney function and correlate with their GFR values. Other patients were followed to the end of the study or dropped out due to independent reasons. Thus, statistical analysis of longitudinal GFR needs to take into consideration the presence of mixed types of informative and independent dropouts.

Many statistical models and inference approaches have been proposed to accommodate the nonignorable missingness into modeling longitudinal data (see reviews [38]). According to the target of inference and the interpretation of model parameters, existing methods can be classified into three categories: subject-specific inference, event-conditioning inference, and marginal inference. First, a widely used modeling strategy for longitudinal data with informative dropouts is to specify their joint distribution via shared or correlated latent variables. Under such model assumptions, the longitudinal parameters have a conditional, subject-specific interpretation (e.g., [911]). But the interpretation of longitudinal parameters usually changes with the number and characteristics of latent variables assumed, for example, a single random intercept versus a random intercept plus a random slope.

Second, event-conditioning approaches have also been widely used when the target of inference is within subgroups of patients with particular dropout patterns or when the dropout can potentially change the material characteristic of the longitudinal process (e.g., death). The inference is usually conducted conditioning on the dropout pattern or on the occurrence of the dropout event. Thus, model parameters have an event-conditioning subpopulation-averaged interpretation, for example, pattern-mixture models for the group expectation of each dropout pattern [3, 12]; treatment effects among survivors [13]; gender and age effects in mortal cohort [14]. Because the interpretation of such models is made by conditioning on a future event, event-conditioning approaches may be natural in a retrospective setting but may not be directly useful for the evaluation of treatment efficacy prospectively.

Lastly, when the research objective is to study covariate effects at population level in a dropout-free situation, marginal models address this concern directly. When data are without missing or missing completely at random (using Rubin's definition on missingness [15]), the estimation of model parameters can be carried out by the generalized estimating equation (GEE) approach assuming a “working” correlation matrix [16]. When dropouts are missing at random, the inverse probability-weighted GEE methods are commonly used [17, 18]. In the presence of informative dropouts, the class of selection models that were originally proposed to adjust selection bias in econometrics [19] have been widely used for the marginal analysis of longitudinal data [2022]. Recently, the marginalized transition model [23] and marginalized pattern-mixture model [24] were proposed for binary longitudinal data with finite nonignorable nonresponse patterns. These marginalized approaches provide a powerful tool for studying the marginal association between longitudinal outcomes and covariates while incorporating nonignorable nonresponses.

In this paper, we shall adopt the idea of shared latent variables to account for the dependence between longitudinal responses and informative dropouts while focusing on marginal inference for the longitudinal responses. Here dropouts can occur on a continuous time scale. We develop an effective estimation procedure built on a series of asymptotically unbiased estimating equations with light computational burden. The resulting estimators for longitudinal parameters are shown to be consistent and asymptotically normal, with a sandwich-type variance-covariance matrix that can be estimated by the usual plug-in rule.

The remainder of the paper is organized as follows. In Section 2, we introduce the notation and the proposed semiparametric marginalized model. In Section 3, a simple estimating equation-based procedure is first proposed for the situation with pure informative dropouts and is extended to a more general situation where there is a mixture of random dropouts and informative dropouts. Asymptotic properties of resulting estimators are also studied. Simulation studies and an application to a renal disease data set are given in Section 4. Some remarks are discussed in Section 5. All the technical details are provided in the appendix.

2. Notation and Model Specification

2.1. Data Notation

Consider that a longitudinal study follows 𝑛 subjects over time period [0,𝜏]. For the ith subject 𝑖=1,,𝑛, the complete-data consist of {𝑌𝑖𝑗,𝑋𝑖𝑗,𝑡𝑖𝑗,𝑗=1,,𝑛𝑖}, where 𝑌𝑖𝑗 is the value of response at the jth observation time 𝑡𝑖𝑗 and 𝑋𝑖𝑗 is a 𝑝×1 vector of covariates associated with response 𝑌𝑖𝑗. Note that 𝑋𝑖𝑗 includes baseline covariates that are separately denoted by 𝑍𝑖 and potential time-dependent covariates. Let 𝑇𝑖 denote the informative dropout time and 𝐶𝑖 denote the random censoring time that is independent of (𝑌𝑖𝑗,𝑇𝑖) given the covariates. In practice, we observe (𝑇𝑖,𝛿𝑖), where 𝑇𝑖=min(𝑇𝑖,𝐶𝑖) and 𝛿𝑖=𝐼(𝑇𝑖𝐶𝑖) taking the value of 1 if the informative dropout time is observed and 0 otherwise. Throughout the paper, let 𝐼() denote the indicator function. Due to the dropout, longitudinal responses and covariates can only be observed at 𝑡𝑖𝑗𝑇𝑖. Hence, the observed data are {𝑌𝑖𝑗,𝑋𝑖𝑗,𝑡𝑖𝑗,𝑇𝑖,𝛿𝑖,𝑖=1,,𝑛,𝑗=1,,𝑚𝑖}, where 𝑚𝑖=𝑛𝑖𝑗=1𝐼(𝑡𝑖𝑗𝑇𝑖).

2.2. Semiparametric Marginalized Latent Variable Model

We first introduce the composition of our proposed model and then discuss the model motivation and interpretation. The first component is a marginal generalized linear model for longitudinal responses 𝑌𝑖𝑗’s:𝑔E𝑌𝑖𝑗𝑋𝑖𝑗𝜇𝑔𝑖𝑗=𝛽𝑀𝑋𝑖𝑗,(2.1) where 𝑔() is a known link function and 𝜇𝑖𝑗 denotes the marginal expectation. The second component is a linear transformation model for the informative dropout time 𝑇𝑖:𝐻𝑇𝑖=𝜃𝑍𝑖+𝜂𝑖,(2.2) where 𝐻() is an unspecified monotone transformation function and 𝜂𝑖 is assumed to follow a known continuous distribution 𝐹() that is independent of 𝑍𝑖. The last component is a conditional mean model characterizing the dependence between longitudinal responses and informative dropouts:𝑔E𝑌𝑖𝑗𝑋𝑖𝑗,𝜂𝑖𝜇𝑔𝑖𝑗𝜂𝑖=Δ𝑖𝑗+𝛼𝑏𝑖𝑗𝜂𝑖,(2.3) where the latent random effects 𝐛𝑖(𝜂𝑖)={𝑏𝑖1(𝜂𝑖),,𝑏𝑖𝑛𝑖(𝜂𝑖)} are investigator-specified functions of 𝜂𝑖 and covariates, and Δ𝑖𝑗 is an implicit parameter whose value is determined by the integral equation matching the conditional mean model (2.3) with the corresponding marginal model (2.1), that is,𝑔1𝛽𝑀𝑋𝑖𝑗𝑌=E𝑖𝑗𝑋𝑖𝑗E𝑌=E𝑖𝑗𝑋𝑖𝑗,𝜂𝑖=𝑔1Δ𝑖𝑗+𝛼𝑏𝑖𝑗(𝜂)𝑑𝐹(𝜂).(2.4)

The marginal mean model (2.1) directly specifies the marginal relationship between the responses and covariates, and 𝛽𝑀 is the 𝑝×1 marginal regression parameters of main interest. Next, the semiparametric linear transformation model (2.2) is chosen to provide a flexible survival model for the informative dropout time while it still can be easily incorporated into model (2.3) for the dependence of the longitudinal responses and informative dropouts. Model (2.2) includes the proportional hazards model [25], the proportional odds model [26], and the Box-Cox transformation model as special cases and has been studied intensively in survival analysis literature [2729]. In addition, as we present in Section 3, the explicit assumption on the error distribution in (2.2) can facilitate the “marginalization” procedure for parameter estimation.

The conditional mean model (2.3) is motivated by the construction of the marginalized random-effects model [30, 31]. As a motivating example, we consider a continuous Gaussian process following a simple random-effects model, 𝑌𝑖𝑗=𝛽𝑋𝑖𝑗+𝑏0𝑖+𝑏1𝑖𝑡𝑖𝑗+𝜀𝑖𝑗, where (𝑏𝑖0,𝑏𝑖1) are the random intercept and slope, and error terms 𝜀𝑖𝑗,𝑗=1,,𝑛𝑖 are assumed to follow 𝑁(𝟎,Σ𝑖) but independent of (𝑏𝑖0,𝑏𝑖1) or 𝜂𝑖. Note that 𝜀𝑖𝑗's can still exhibit temporal dependence in addition to what has been accounted by the random effects, that is, Σ𝑖 with AR(1) covariance structure. Furthermore, as in joint modeling approaches via latent variables, the joint distribution of (𝑏0𝑖,𝑏1𝑖,𝜂𝑖) is assumed to be 𝑁𝟎,Σ𝑏𝐶,𝐶,1. It is easy to see that the conditional mean has the expression as model (2.3),E𝑌𝑖𝑗𝑋𝑖𝑗,𝜂𝑖=Δ𝑖𝑗+𝐶1𝜂𝑖+𝐶2𝜂𝑖𝑡𝑖𝑗.(2.5) We use model (2.3) primarily as a parsimonious model for the dependence structure between the longitudinal responses and informative dropout times. However, note that although model (2.3) takes a similar form as the marginalized random-effects model, it does not intend to fully specify the joint distribution of the repeated measurements since model (2.3) only specifies the conditional mean function and there is no conditional independence assumed.

Note that Δ𝑖𝑗 is the solution of (2.4), and thus its value implicitly depends on 𝛽𝑀, 𝛼, the formulation of 𝑏𝑖𝑗(𝜂𝑖) and the distribution of 𝜂𝑖. The specification of 𝑏𝑖𝑗(𝜂𝑖) reflects investigator's assumptions on the dependence structure among the longitudinal responses and their association with the dropout times. It is well known that the dependence assumptions between longitudinal measurements and informative dropouts are usually unverifiable from the observed data, but it intrinsically affects the inference about 𝛽𝑀. Thus, a sensitivity analysis under various assumptions is always warranted. It is clear that the sensitivity analysis can be easily conducted within the framework of model (2.3). For example, the sensitivity analysis may start with a large model in the specification for 𝛼𝑏𝑖𝑗(𝜂𝑖), for example, 𝛼1𝜂𝑖+𝛼2𝜂𝑖𝑡𝑖𝑗+𝛼3𝜂𝑖𝑋𝑖𝑗, and then examine the statistical significance of estimates for 𝛼’s to further simplify the model. As shown in the next subsection, complex structure can be imposed on 𝑏𝑖𝑗(𝜂𝑖) without introducing much extra computation. Lastly, we note that the marginal interpretation of longitudinal parameters in model (2.1) is invariant under different specifications of the conditional mean model (2.3).

3. Estimation and Asymptotic Properties

3.1. Conditional Generalized Estimating Equation

First, assume that 𝜂𝑖 is known. We construct a “conditional” generalized estimating equation for =(𝛽𝑀,𝛼). More specifically, the estimating function 𝑈() is specified as𝑛𝑖=1𝐀𝑖𝜂𝑖𝐖𝑖𝐘𝑖𝝁𝑖𝜂𝑖=𝑛𝑖=1𝐀𝑖𝜂𝑖𝐖𝑖𝐘𝑖𝑔1𝚫𝑖+𝐛𝑖𝜂𝑖𝛼,(3.1) where 𝐘𝑖 denotes the 𝑚𝑖×1 vector of observed responses of subject 𝑖; Δ𝑖=(Δ𝑖1,,Δ𝑖𝑚𝑖); 𝐛𝑖(𝜂𝑖)={𝑏𝑖1(𝜂𝑖),,𝑏𝑖𝑚𝑖(𝜂𝑖)}; 𝝁𝑖(𝜂𝑖)={𝜇𝑖1(𝜂𝑖),,𝜇𝑖𝑚𝑖(𝜂𝑖)}; 𝐀𝑖(𝜂𝑖)=[𝜕𝝁𝑖(𝜂𝑖)/𝜕𝛽𝑀,𝜕𝝁𝑖(𝜂𝑖)/𝜕𝛼]; 𝐖𝑖 is a 𝑚𝑖×𝑚𝑖 weight matrix.

It is easy to see that 𝑈() has mean zero at the true parameter values 0 under model (2.3). Note that the vector of marginal parameters 𝛽𝑀 is implicitly present in 𝑈() with Δ𝑖 through the constrain equation (2.4). Thus, the Jacobian matrix 𝐀𝑖(𝜂𝑖) needs to be derived using both the constrain (2.4) and models (2.1) and (2.3), which is different from the ordinary GEE. More specifically, entries of the Jacobian matrix 𝐀𝑖(𝜂𝑖) are given by 𝜕𝜇𝑖𝑗𝜂𝑖𝜕𝛽𝑀=𝑋𝑖𝑗Υ𝑖𝑗EΥ𝑖𝑗𝜂𝑖1Υ𝑖𝑗𝜂𝑖,𝜕𝜇𝑖𝑗𝜂𝑖=𝑏𝜕𝛼𝑖𝑗𝜂𝑖𝑏E𝑖𝑗𝜂𝑖Υ𝑖𝑗𝜂𝑖EΥ𝑖𝑗𝜂𝑖1Υ𝑖𝑗𝜂𝑖,(3.2) where Υ𝑖𝑗=1/̇𝑔{𝑔1(𝛽𝑀𝑋𝑖𝑗)}, Υ𝑖𝑗(𝜂𝑖)=1/̇𝑔[𝑔1{Δ𝑖𝑗+𝛼𝑏𝑖𝑗(𝜂𝑖)}], and we use ̇𝑎(𝑥) to denote the derivative of a function 𝑎(𝑥) throughout this paper. In particular, we have Υ𝑖𝑗=𝜇𝑖𝑗(1𝜇𝑖𝑗) and Υ𝑖𝑗(𝜂𝑖)=𝜇𝑖𝑗(𝜂𝑖){1𝜇𝑖𝑗(𝜂𝑖)} under the logit-link function for binary data; Υ𝑖𝑗=𝜇𝑖𝑗 and Υ𝑖𝑗(𝜂𝑖)=𝜇𝑖𝑗(𝜂𝑖) under the log-link function for count data. Thus, under these canonical link functions, Υ𝑖𝑗=Var(𝑌𝑖𝑗𝑋𝑖𝑗) and Υ𝑖𝑗(𝜂𝑖)=Var(𝑌𝑖𝑗𝑋𝑖𝑗,𝜂𝑖) are the marginal variance and conditional variance of the responses, respectively. In addition, these formulations also facilitate our selection of the weight matrix 𝐖𝑖. For example, for binary longitudinal data with logit-link function, we can choose a weight matrix as 𝐖𝑖1=diag{Var(𝑌𝑖𝑗𝑋𝑖𝑗,𝜂𝑖),𝑗=1,,𝑚𝑖}.

It is clear that the implementation of the estimating function (3.1) requires the knowledge of 𝜂𝑖, which is an unknown quantity and has to be estimated first. The estimation of the semiparametric linear transformation model (2.2) has been studied by many authors [2729]. In particular, Chen et al. [27] proposed a class of martingale-based estimating equations, 𝑛𝑖=10𝑍𝑖𝑑𝑁𝑖𝑇(𝑡)𝐼𝑖𝜃𝑡𝑑Λ𝑍𝑖+𝐻(𝑡)=0,𝑛𝑖=1𝑑𝑁𝑖𝑇(𝑡)𝐼𝑖𝜃𝑡𝑑Λ𝑍𝑖+𝐻(𝑡)=0,𝑡0,(3.3) where 𝑁𝑖(𝑡)=𝐼(𝑇𝑖𝑡,𝛿𝑖=1). Then an iterative algorithm can be carried out to solve 𝜃 and 𝐻 simultaneously. We estimate (𝜃,𝐻) using the approach of Chen et al. [27] and shall denote the estimates as ̂Θ(𝜃,𝐻).

3.2. Estimation Procedure for Pure Informative Dropouts

We first consider the situation of pure informative dropouts, that is, 𝛿𝑖1. Define ̂𝜂𝑖=𝐻(𝑇𝑖̂𝜃)+𝑍𝑖 and replace 𝜂𝑖’s in (3.1) with their estimated counterparts ̂𝜂𝑖’s. Denote the resulting estimating function by 𝑈(;Θ) and define the estimator of as the solution to 𝑈(;Θ)=0. The estimation of entails an iteration between solving nonlinear equations for Δ𝑖𝑗 and updating a Newton-Ralphson equation for . More specifically, given the current estimated value of (𝐽) at the Jth step, we first estimate Δ(𝐽)𝑖𝑗 from𝑔1𝛽(𝐽)𝑀𝑋𝑖𝑗=𝑔1Δ𝑖𝑗+𝛼(𝐽)𝑏𝑖𝑗(𝜂)𝑑𝐹(𝜂),(3.4) and then update the parameters by (𝐽+1)=(𝐽)+𝑛𝑖=1𝐀𝑖(𝐽)̂𝜂𝑖𝐖𝑖(𝐽)𝐀𝑖(𝐽)̂𝜂𝑖1𝑛𝑖=1𝐀𝑖(𝐽)̂𝜂𝑖𝐖𝑖(𝐽)𝐘𝑖𝝁(𝐽)𝑖̂𝜂𝑖,(3.5) where 𝐀𝑖(𝐽)(̂𝜂𝑖), 𝐖𝑖(𝐽), and 𝝁(𝐽)𝑖(̂𝜂𝑖) are evaluated at the current parameter values (𝐽) and Δ(𝐽)𝑖𝑗. The algorithm is iterated until it converges. Because 𝜂𝑖 is assumed to follow an explicit parametric distribution 𝐹, it greatly simplifies the marginalization procedure (3.4). We propose to use the Gaussian-quadrature approach [32] to numerically evaluate (3.4) and 𝐀𝑖(𝐽)(̂𝜂𝑖). Since the integrand of (3.4) is monotonic in Δ(𝐽)𝑖𝑗 and so is the whole integral, it is easy to calculate a large number of Δ(𝐽)𝑖𝑗,𝑖=1,,𝑛,𝑗=1,,𝑚𝑖, in all iterative steps. Moreover, the numerical integration is only upon the one-dimensional space of 𝜂𝑖 and requires light computation even with complex structure assumed on 𝑏𝑖𝑗(𝜂𝑖). The proposed iterative algorithm has been implemented using “R” codes, which are available from the authors upon request.

3.3. Estimation Procedure for Mixed Types of Dropouts

We generalize the proposed estimation function (3.1) to accommodate the situation where there are mixed informative dropouts and random censoring. More specifically, the modified estimating equation is given by𝑈(;Θ)=𝑛𝑖=1𝐀𝑖𝜂𝑖,𝛿𝑖𝐖𝑖𝐘𝑖𝝁𝑖𝜂𝑖,𝛿𝑖=0,(3.6) where 𝜂𝑖=𝐻(𝑇𝑖)+𝜃𝑍𝑖; the jth component of 𝝁𝑖(𝜂𝑖,𝛿𝑖) is 𝜇𝑖𝑗𝜂𝑖,𝛿𝑖=𝛿𝑖𝑔1Δ𝑖𝑗+𝛼𝑏𝑖𝑗𝜂𝑖+1𝛿𝑖E𝑔1Δ𝑖𝑗+𝛼𝑏𝑖𝑗(𝜂)𝜂𝜂𝑖,(3.7) and the Jacobian matrix 𝐀𝑖(𝜂𝑖,𝛿𝑖)=[𝜕𝝁𝑖(𝜂𝑖,𝛿𝑖)/𝜕𝛽𝑀,𝜕𝝁𝑖(𝜂𝑖,𝛿𝑖)/𝜕𝛼]. When 𝛿𝑖=1, the ith component of 𝑈(;Θ) is the same as the one in (3.1). For 𝛿𝑖=0, the entries of 𝐀𝑖(𝜂𝑖,0) are given by 𝜕𝜇𝑖𝑗𝜂𝑖,0𝜕𝛽𝑀=𝑋𝑖𝑗Υ𝑖𝑗EΥ𝑖𝑗𝜂𝑖1EΥ𝑖𝑗(𝜂)𝜂𝜂𝑖,𝜕𝜇𝑖𝑗𝜂𝑖,0𝑏𝜕𝛼=E𝑖𝑗(𝜂)Υ𝑖𝑗(𝜂)𝜂𝜂𝑖𝑏E𝑖𝑗𝜂𝑖Υ𝑖𝑗𝜂𝑖EΥ𝑖𝑗𝜂𝑖1EΥ𝑖𝑗(𝜂)𝜂𝜂𝑖.(3.8) In addition, the entries of the weight matrix can be changed to Var{𝑌𝑖𝑗𝑋𝑖𝑗,𝜂𝑖,𝛿𝑖} accordingly. Conditional expectations of various functions given 𝜂𝜂𝑖 are computed using the Gaussian-quadrature method. Let ̂𝜂𝑖=𝐻(𝑇𝑖̂𝜃)+𝑍𝑖 and replace 𝜂𝑖's in (3.6) with their estimated counterparts ̂𝜂𝑖’s. Denote the resulting estimating function by 𝑈(;Θ). Then the estimator of can be obtained from the equation 𝑈(;Θ)=0 using the same iterative algorithm described in the previous subsection.

3.4. Asymptotic Properties of

In this subsection, we establish the asymptotic properties of . Towards this end, we need the following assumptions. (C1)The covariates 𝑋𝑖𝑗’s are bounded with probability 1. (C2)The true parameter values 0 and 𝜃0 belong to the interior of a known compact set, and the true transformation function 𝐻0 has a continuous and positive derivative. (C3)Let Λ() denote the cumulative hazard function of 𝜂𝑖. Define ̇𝜆(𝑡)=Λ(𝑡) and ̇𝜓(𝑡)=𝜆(𝑡)/𝜆(𝑡). Then 𝜆() is positive, 𝜓() is continuous, and lim𝑡𝜆(𝑡)=0=lim𝑡𝜓(𝑡). (C4)𝜏 is finite and satisfies 𝑃(𝑇>𝜏)>0 and 𝑃(𝐶=𝜏)>0. (C5)The matrix ΩE{𝐀1(𝜂1,𝛿1)𝐖1𝐀1(𝜂1,𝛿1)} is positive finite, and the number of repeated measurements 𝑚𝑖𝑁.

The regularity conditions (C1)–(C4) are also used by Chen et al. [27] to derive the consistency and asymptotically normality of the estimators Θ. Condition (C5) is needed to establish the consistency and asymptotic normality of , which is given in the following theorem.

Theorem 3.1. Under conditions (C1)–(C5), with probability 1,  |0|0. In addition, one has, as 𝑛, 𝑛0𝐷𝑁0,Ω1𝑉Ω1.(3.9)

The definition of 𝑉 and a sketch of the proof for Theorem 3.1 are given in the appendix. The asymptotic variance-covariance matrix can be consistently estimated by its empirical counterpart Ω1𝑉Ω1, which can be easily obtained using the usual plug-in rule.

4. Numerical Studies

4.1. Simulations

We conducted a series of simulation studies to evaluate the finite-sample performance of our proposed approach. Consider a binary longitudinal process with the marginal probability of success as 𝑔1(𝛽0+𝛽1𝑡+𝛽2𝑍), where 𝑔 was the logit-link function; observations occurred at 𝐭𝑖={𝑡𝑖𝑗=𝑗,𝑗=0,1,,5}; 𝑍 was generated from a Bernoulli distribution with the success probability of 0.5, and (𝛽0,𝛽1,𝛽2)=(1.5,0.3,1). The informative dropout time 𝑇𝑖 was generated from a linear transformation model 𝐻(𝑇𝑖)=𝜃𝑍𝑖+𝜂𝑖, where 𝜃=0.5. We considered three distributions for 𝜂𝑖: the standard normal distribution (N.), the extreme value distribution (E.), and the logistic distribution (L.), corresponding to the normal transformation model, the Cox proportional hazard model, and the proportional odds model, respectively, for the informative dropout time. We then generated the binary response 𝑌𝑖𝑗 independently from a Bernoulli distribution with the success probability of 𝑔1{Δ𝑖𝑗+𝛼𝑏𝑖𝑗(𝜂𝑖)}, where Δ𝑖𝑗 was calculated to match the marginal mean value as in (2.4) and 𝛼 indicated the level of dependence. We considered several combinations to specify the dependence between longitudinal outcomes and informative dropout times. More specifically, when 𝛼=0, there was no informative dropouts; when 𝛼=0.5 (or 0.25) and 𝑏𝑖𝑗(𝜂𝑖)=𝜂𝑖, the dependence existed and was linear in the latent variable 𝜂𝑖; when 𝛼=0.25 (or 0.5) and 𝑏𝑖𝑗(𝜂𝑖)=𝜂𝑖𝑡𝑖𝑗, the dependence was present through an interaction between the latent variable and the observation time.

For each scenario, we considered samples of size 100 and 200 and conducted 500 runs of simulations. The Gaussian-quadrature approximation was calculated using 50 grid points. We first considered the situation of pure informative dropouts and generated the dropout time 𝑇𝑖 from the transformation model with 𝐻0(𝑡)=2{arctan(𝑡)+𝜋/2}. Under the assumptions of 𝜂𝑖 following the normal, the extreme value, and the logistic distributions, the average numbers of repeated measurements were 3.91, 3.37, and 3.94, respectively. The estimation results on 𝛽1 and 𝛼 are summarized in Table 1. The proposed estimators are unbiased under all simulated scenarios, and the Wald-type 95% confidence intervals all have reasonable empirical coverage probabilities. The performance of the proposed method is consistent with different distributional assumptions of 𝜂𝑖 and different specifications of the dependence structure, and the results improve as the sample size increases.

Next, we consider the situation where there are mixed informative dropouts and random censoring. For simplicity, let 𝐶𝑖 be an administrative censoring at the end of the study, that is, 𝜏=6. The informative dropout time 𝑇𝑖 was generated from the transformation model with 𝐻(𝑡)=log(𝑡)1 and 𝜂𝑖 followed the standard normal distribution. This yielded the proportion of informative dropouts of 69.6% and the average number of repeated measurements about 4. Other settings were kept the same as in the previous simulations. The simulation results are presented in Table 2. Again, the proposed approach gives unbiased parameter estimates and reasonable coverage probabilities under all the scenarios. For comparison, we also implemented the ordinary GEE method [16]. When the informative dropout is absent, that is, 𝛼=0, the GEE method yields consistent parameter estimates of 𝛽1 as expected. But when there is informative dropout (𝛼0), the performance of the GEE method deteriorates quickly as the magnitude of the dependence between the longitudinal data and informative dropout increases.

Last, we conducted sensitivity analysis for the proposed approach and our simulations consisted of two parts. First, as discussed in Section 2, to better characterize the dependence structure between longitudinal responses and informative dropouts, we would suggest to start with a large model in the specification for 𝛼𝑏𝑖𝑗(𝜂𝑖) and then examine the statistical significance of estimates for 𝛼’s to further simplify the model. We simulated data from a simple model with either 𝛼𝑏𝑖𝑗(𝜂𝑖)=𝛼1𝜂𝑖 or 𝛼𝑏𝑖𝑗(𝜂𝑖)=𝛼2𝜂𝑖𝑡𝑖𝑗, and then applied the proposed approach by assuming a bigger model (2.3) as Δ𝑖𝑗+𝛼1𝜂𝑖+𝛼2𝜂𝑖𝑡𝑖𝑗. The simulation results are summarized in the top panel in Table 3. The proposed method can reasonably well estimate all the parameters, and in particular, could correctly indicate the unnecessary zero term. Second, we simulated data from Δ𝑖𝑗+𝛼1𝜂𝑖+𝛼2𝜂𝑖𝑡𝑖𝑗 but fitted misspecified models that omitted some terms. The results are summarized in the lower panel of Table 3. It is evident that misspecified models lead to biased estimates for both longitudinal regression coefficients and dependence parameters.

4.2. Application to Renal Disease Data from MDRD Study

Here we considered a subgroup of 129 patients with low-protein diet in MDRD study B, among whom, 62 patients were randomized to the group of normal-blood-pressure control and 67 patients were randomized to the group of low-blood-pressure control. Besides the randomized intervention, other covariates included time in study (time), baseline disease progression status (Prog), baseline blood pressure (Bp), and log-transformed baseline urine protein level (log.Pt). There were 52 (40.3%) patients left the study prematurely for kidney transplant or dialysis and were treated as informative dropouts.

We applied the proposed approach to estimate the marginal effects of covariates on GFR values. To account for the possible informative dropouts, we assumed that the dependence term 𝛼𝑏𝑖𝑗(𝜂𝑖) had a form of 𝛼1𝜂𝑖+𝛼2𝜂𝑖𝑡𝑖𝑗, analogous to the joint modeling approach with latent random intercept and random slope used in Schluchter et al. [33]. We considered the situations of 𝜂𝑖 following the standard normal, the extreme value, or the standard logistic distributions. Because the outcome was a continuous variable, we used the identity link function.

Our results are presented in Table 4 and compared with the results from the ordinary GEE [16] with an independent working correlation matrix. More specifically, the slope estimates from the proposed approach indicate a much faster decreasing rate of GFR (e.g., time Est = −0.27, SE = 0.03, under the normality assumption for 𝜂𝑖) than the result from the ordinary GEE method (time Est = −0.14, SE = 0.03). A possible explanation is that those patients remaining under observation usually have better kidney functions and thus higher GFR values. The ordinary GEE approach that treats the observed patients as random representatives of the population tends to underestimate the degressive trend of GFR.

The estimates for the intervention on blood pressure control show positive effect of the low-blood-pressure control on the longitudinal GFR development. Although the results are not statistically significant, the estimates from the proposed method (e.g., Intervention Est = 0.82, SE = 1.07, under the normality assumption for 𝜂𝑖) are about twice large of the values from the ordinary GEE method (Intervention Est = 0.35, SE = 0.90). Moreover, for the proposed approach, the results under different distributional assumptions for 𝜂𝑖 are quite similar. The estimates of the dependency parameters (𝛼1,𝛼2) are positive and statistically significant. This indicates that higher GFR values are positively associated with longer dropout times in the study. In addition, our proposed approach shows that the baseline urine protein level is significantly associated with the longitudinal GFR development, but the ordinary GEE method does not show such significance. The results obtained using our proposed method are also consistent with those reported in Schluchter et al. [33].

5. Discussion

In this paper, we propose a semiparametric marginalized model for marginal inference of the relationship between longitudinal responses and covariates in the presence of informative dropouts. The regression parameters represent the covariate effects on the population level. The proposed estimators are expected to be insensitive to misspecification of the latent variable distribution [31], which is desirable pertaining to the sensitivity analysis on unverifiable assumptions for the informative dropouts. In practice, the choice between marginal models and other types of joint modeling approaches should be determined by study objective.

To estimate the regression parameters in the proposed marginalized model, we proposed a class of simple conditional generalized estimating equations and demonstrated its computational convenience. In general, a likelihood-based approach can be used to achieve more efficient inference and is also of great interest. For example, a marginalized random effects model [30, 31] can be used for the longitudinal process and a frailty model [34] can be used for the dropout time. Furthermore, latent variables (𝑏𝑖𝑗,𝜂𝑖) can be modeled by a copula distribution or non-Gaussian distributions [35]. The likelihood-based methods enjoy the high efficiency and facilitate the implementation of classical model selection procedures, such as AIC or BIC; however, intensive computations are often involved when the dimension of random effects is high.


A. Proof of Theorem 3.1

Under the conditions (C1)–(C4), Chen et al. [27] established the consistency of ̂𝜃 and the uniform consistency of a transformed 𝐻(), that is, sup𝑡[0,𝜏]|exp{𝐻(𝑡)}exp{𝐻0(𝑡)}|=𝑜𝑝(1). We first derive the consistency of the proposed estimator , which is the solution of the equation 𝑈(;Θ)=0. Note that (1/𝑛)(𝜕𝑈(;Θ)/𝜕)=(1/𝑛)𝑛𝐀𝑖(̂𝜂𝑖,𝛿𝑖)𝐖𝑖𝐀𝑖(̂𝜂𝑖,𝛿𝑖) is a positive definite matrix, and by the law of large numbers and the consistency of ̂𝜃 and 𝐻, it converges uniformly to a deterministic positive definite matrix Ω() over a compact set of . In addition, we have that (1/𝑛)𝑈(;Θ) converges uniformly to a deterministic function 𝑢(;Θ0) satisfying 𝑢(0;Θ0)=0 and (𝜕𝑢(;Θ0)/𝜕)=0=Ω(0)=Ω. Thus, the estimating equation 𝑈(;Θ)=0 exists a unique solution . Since 0 is the unique solution of 𝑢(0;Θ0)=0, the consistency of easily follows.

To prove the asymptotic normality, by the Taylor expansion, we have𝑛0=1𝑛𝜕𝑈;Θ𝜕11𝑛𝑈0;Θ,(A.1) where lies between and 0. Since is consistent and (1/𝑛)(𝜕𝑈(;Θ)/𝜕) converges uniformly to Ω(), we have that (1/𝑛)(𝜕𝑈(;Θ)/𝜕) converges to Ω. Furthermore, the Taylor expansion of 𝑈(0;Θ) around Θ0 gives1𝑛𝑈0;Θ=1𝑛𝑛𝑖=1𝐀𝑖𝜂𝑖,𝛿𝑖𝐖𝑖𝐘𝑖𝝁𝑖𝜂𝑖,𝛿𝑖𝑛𝑖=1𝐀𝑖𝜂𝑖,𝛿𝑖𝐖𝑖𝝁𝑖̂𝜂𝑖,𝛿𝑖𝝁𝑖𝜂𝑖,𝛿𝑖=1𝑛𝑈0;Θ0𝑛𝑖=1𝐀𝑖𝜂𝑖,𝛿𝑖𝐖𝑖𝜕𝝁𝑖(𝜂,)||||𝜕𝜂(𝜂𝑖,𝛿𝑖)×̂𝜃𝜃0𝑍𝑖+𝑂𝑖+𝐻𝑇𝑖;𝜃0𝐻0𝑇𝑖+𝑜𝑝(1),(A.2) where 𝑂𝑖=𝜕𝐻(𝑇𝑖;𝜃)/𝜕𝜃𝜃=𝜃0 and its asymptotic representation can be found in the appendix of Chen et al. [27]. The asymptotic representations of ̂𝜃 and 𝐻 have also been derived by Chen et al. [27],𝑛̂𝜃𝜃=Σ11𝑛𝑛𝑖=1𝜏0𝑍𝑖𝜇𝑏(𝑡)𝑑𝑀𝑖(𝑡)+𝑜𝑝Λ(1),𝐻𝑡;𝜃0Λ𝐻0(𝑡)=𝑛𝑛1𝑖=1𝑡0𝜆𝐻0(𝑠)𝐵2(𝑠)𝑑𝑀𝑖(𝑠)+𝑜𝑝𝑛1/2,(A.3) where 𝑁𝑖(𝑡)=𝛿𝑖𝐼(𝑇𝑖𝑡) and 𝜉𝑖(𝑡)=𝐼(𝑇𝑖𝑡) are the counting and at-risk processes, respectively, 𝑀𝑖(𝑡)=𝑁𝑖(𝑡)𝑡0𝜉𝑖(𝑠)𝑑Λ{𝜃0𝑍𝑖+𝐻0(𝑠)} is the mean-zero martingale process, and the definitions of Σ and the functions Λ(), 𝜆(), 𝐵2(), and 𝜇𝑏() are given in the appendix of Chen et al. [27].

Plugging these terms back to the expansion of (1/𝑛)𝑈(0;Θ) specified in (A.2), some rearrangement yields that it is equal to 1𝑛𝑛𝑖=1𝐀𝑖𝜂𝑖,𝛿𝑖𝐖𝑖𝐘𝑖𝝁𝑖𝜂𝑖,𝛿𝑖𝜏0𝑄1𝑍𝑖𝜇𝑏(𝑡)+𝑄2(𝑡)𝑑𝑀𝑖(𝑡)+𝑜𝑝(1),(A.4) where 𝑄1=lim𝑛𝑛𝑛1𝑖=1𝐀𝑖𝜂𝑖,𝛿𝑖𝐖𝑖𝜕𝝁𝑖(𝜂,)||||𝜕𝜂(𝜂𝑖,𝛿𝑖)𝑍𝑖+𝑂𝑖Σ1,𝑄2(𝑡)=lim𝑛𝑛𝑛1𝑖=1𝐀𝑖𝜂𝑖,𝛿𝑖𝐖𝑖𝜕𝝁𝑖(𝜂,)||||𝜕𝜂(𝜂𝑖,𝛿𝑖)𝐵𝑡,𝑇𝑖𝜉𝑖(𝑡)𝐵2.(𝑡)(A.5) The definition of 𝐵(𝑡,𝑠) can be found in Chen et al. [27].

Hence, (1/𝑛)𝑈(0;Θ) has been written as a standardized summation of independent terms with mean zero. By the central limit theorem, it is asymptotically equivalent to a multivariate Gaussian variable with zero mean and covariance matrix 𝑉, which is the limit of 1𝑛𝑛𝑖=1𝐀𝑖𝜂𝑖,𝛿𝑖𝐖𝑖𝐘𝑖𝝁𝑖𝜂𝑖,𝛿𝑖𝜏0𝑄1𝑍𝑖𝜇𝑏(𝑡)+𝑄2(𝑡)𝑑𝑀𝑖(𝑡)2.(A.6) From (A.1), it is easy to see that the estimator is asymptotically normal with mean zero and the variance-covariance matrix Ω1𝑉Ω1, which can be consistently estimated by its empirical counterpart  Ω1𝑉Ω1 using the usual plug-in rule.


The authors would like to thank the editor and the referees for their instructive comments. This research was supported partially by NIH grants RO1 CA-140632 and RO3 CA-153083.