Journal of Probability and Statistics

Volume 2019, Article ID 1019303, 10 pages

https://doi.org/10.1155/2019/1019303

## An Alternative Sensitivity Approach for Longitudinal Analysis with Dropout

^{1}Department of Statistics and Operations Research, College of Science, King Saud University, Riyadh, Saudi Arabia

^{2}School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne, UK

Correspondence should be addressed to Amal Almohisen; amalmoh@ksu.edu.sa

Received 22 March 2019; Accepted 4 June 2019; Published 1 July 2019

Academic Editor: Hyungjun Cho

Copyright © 2019 Amal Almohisen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

In longitudinal studies, dropout before the final timepoint can rarely be avoided. The chosen dropout model is commonly one of these types: Missing Completely at Random (MCAR), Missing at Random (MAR), Missing Not at Random (MNAR), and Shared Parameter (SP). In this paper we estimate the parameters of the longitudinal model for simulated data and real data using the Linear Mixed Effect (LME) method. We investigate the consequences of misspecifying the missingness mechanism by deriving the so-called least false values. These are the values to which the parameter estimates converge when the modelling assumptions may be wrong. Knowledge of the least false values allows us to conduct a sensitivity analysis, which is illustrated. This method provides an alternative to a local misspecification sensitivity procedure, which has been developed for likelihood-based analysis. We compare the results obtained by the proposed method with the results found by using the local misspecification method. We apply the local misspecification and least false methods to estimate the bias and sensitivity of parameter estimates for a clinical trial example.

#### 1. Introduction

Missing data are common in various settings, including surveys, clinical trials, and longitudinal studies. Methods for handling missing data strongly depend on the mechanism that generated the missing values as well as the distributional and modeling assumptions at various stages. This study focuses only on Missing at Random and Missing Not at Random dropout models, under a Linear Mixed Effect (LME) model.

Much of the literature on missing data problems assumes that dropout is MAR rather than MNAR, but this assumption is clearly restrictive [1]. The consequences of misspecifying the missingness mechanism are investigated by deriving the so-called least false values, which are the values to which the parameter estimates converge when the assumptions may be wrong. Theoretical least false values for the LME method are derived and illustrated under Missing at Random (MAR) and Missing Not at Random (MNAR) dropout. Throughout this study, the assumed (and possibly misspecified) dropout model is MAR.

Copas and Eguchi [2] gave a formula to estimate the bias under such misspecification using a likelihood approach. As the LME is a likelihood-based method, the estimates obtained through the Copas and Eguchi method can be compared with the LME least false estimates. The procedure will be applied by adding a tilt to the MAR dropout model to provide what Copas and Eguchi call local misspecification.

Local model uncertainty, as proposed by Copas and Eguchi [2], is elaborated and illustrated both when model misspecification is present and when the data are incomplete. Furthermore, we find that the Copas and Eguchi method gives very similar results to the least false method. The misspecification considered is that MAR is assumed when the truth is actually MNAR. Besides Copas and Eguchi [2], many other authors have developed methods to assess the sensitivity of inference under the MAR assumption [3, 4]. Moreover, Lin et al. [5] extended the Copas and Eguchi method and assumed a doubly misspecified model while having only single misspecification. Also, there has been interest in the Copas and Eguchi method from a Bayes perspective [6–10]. Recently, a simulation-based sensitivity analysis was performed in [11].

In Section 2, the LME method is presented and we show how to calculate the least false values. A description of the Copas and Eguchi method is provided in Section 3.1, followed by an example in Section 3.2. A simulation study is described in Section 4. The Copas and Eguchi bias estimates are compared with the least false values derived from the LME method, and we then show the coverage of nominal confidence intervals. A sensitivity analysis is conducted to assess how inference can depend on missing data. In Section 5, the methods are applied to data from a clinical trial with two treatments and two measurement times as introduced and analysed by Matthews et al. [12]. We compare the results obtained by the proposed method with the results found by using the Copas and Eguchi method.

#### 2. Linear Mixed Effect (LME) Method

A statistical model containing fixed effects and random effects is called a mixed effect model. These models have been shown to be effective in many disciplines in the biological, physical, and social sciences. Usually a linear form is assumed.

Reference [13] gave a definition of the response in the LME model which is of the form:
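For orientation, a linear mixed effect model is commonly written in the following general form. The notation ($Y_i$, $X_i$, $Z_i$, $\beta$, $b_i$, $\epsilon_i$, $D$, $\Sigma_i$) is the conventional one and is an assumption of this sketch; it may differ from the notation of [13].

```latex
Y_i = X_i \beta + Z_i b_i + \epsilon_i, \qquad
b_i \sim N(0, D), \qquad
\epsilon_i \sim N(0, \Sigma_i)
```

Here $Y_i$ is the response vector for subject $i$, $X_i$ and $Z_i$ are the design matrices for the fixed effects $\beta$ and the random effects $b_i$, and $b_i$ and $\epsilon_i$ are taken to be independent.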

For example, a simplified version of the Laird and Ware [14] mixed model approach for longitudinal data would include a random effect in the intercept term in a model for responses. If $Y_{ij}$ is the response at time $j$ on subject $i$, the model is $Y_{ij} = \mu_{ij} + U_i + \epsilon_{ij}$, where $\mu_{ij}$ is the marginal mean, which will usually be a linear function of covariates, $\epsilon_{ij}$ is independent Gaussian noise, and $U_i$ is a realisation of a zero mean scalar Gaussian random variable. Since $U_i$ has zero mean, the marginal mean of $Y_{ij}$ remains $\mu_{ij}$ after integrating out $U_i$. However, since $U_i$ is common to all $j$, we get dependence between observations on the same subject. For example, if $U_i$ is positive, then all $Y_{ij}$ values would tend to be above the marginal mean and so on. In the context of longitudinal data, some reviews of linear mixed models can be found in [15, 16].
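The dependence induced by the shared intercept can be made explicit with a short calculation. The symbols below are assumptions for illustration: $Y_{ij}$ for the response at time $j$ on subject $i$, $\mu_{ij}$ for the marginal mean, $U_i \sim N(0, \sigma_U^2)$ for the random intercept, and $\epsilon_{ij} \sim N(0, \sigma_\epsilon^2)$ for the noise, all mutually independent.

```latex
\begin{aligned}
E[Y_{ij}] &= \mu_{ij} + E[U_i] + E[\epsilon_{ij}] = \mu_{ij}, \\
\operatorname{Var}(Y_{ij}) &= \sigma_U^2 + \sigma_\epsilon^2, \\
\operatorname{Cov}(Y_{ij}, Y_{ik}) &= \operatorname{Var}(U_i) = \sigma_U^2 \quad (j \neq k), \\
\operatorname{Corr}(Y_{ij}, Y_{ik}) &= \frac{\sigma_U^2}{\sigma_U^2 + \sigma_\epsilon^2}.
\end{aligned}
```

Under these assumptions, any two responses on the same subject are positively correlated whenever $\sigma_U^2 > 0$, even though the noise terms themselves are independent.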

##### 2.1. Assumptions

Suppose there are $n$ individuals in a study and each provides longitudinal responses $Y_i$ and dropout information $R_i$. Generally, we will assume a linear model for $Y_i$ (in the absence of dropout) and logistic models for the probability of *continuing* to the next timepoint given that a subject is still under observation at time $j$. At times, we refer to a *true* or *generating* model as the way in which data are obtained and to an *assumed* or *fitting* model as that chosen by the analyst for estimation.

For simplicity in this work, the study assumes that there are just two observations or treatment periods. The methods are of course more general.

At time 1, there is a measurement provided for all subjects, denoted by $Y_{i1}$ for subject $i$. At time 2, some subjects drop out before measurement. Let $R_i = 1$ indicate that there is a measurement at time 2 and $R_i = 0$ otherwise. Let $\mu_i = E[Y_i]$ and assume $\mu_i = X_i\beta$, where $\beta$ is a parameter vector of dimension $p$ and $X_i$ is the design matrix associated with subject $i$, which is of dimension $2 \times p$. The standard model assumes just one covariate and is where , , , , , and , , and Let , and .

Returning to the general case, the influence of missing data depends on the missingness mechanism, that is, the probability model for missingness. Knowing the reason for the missingness is helpful in handling missing data. There are four general *missingness mechanisms* as introduced by Little and Rubin [17] and Wu and Carroll [18]. They are Missing Completely at Random (MCAR), Missing at Random (MAR), Missing Not at Random (MNAR), and Shared Parameter (SP).

For simplicity in this investigation, the parameters are assumed to be common between timepoints. Let the dropout parameters be . The MAR dropout logistic model is then

Missingness is called Missing Not at Random if it depends on unrecorded information that predicts the missing values. For example, a patient who is dissatisfied with a particular treatment may be more likely to leave the study. If missingness is not at random, then some bias is expected in inferences.

Let the dropout parameters now be . The MNAR version for the two-timepoint example is the logistic model:
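To make the distinction between the two mechanisms concrete, the following minimal sketch simulates two-timepoint data from a random intercept model and applies logistic dropout at time 2. All names and numerical values (`simulate`, `alpha`, `gamma`, the variances) are assumptions chosen for illustration and are not the parameterisation used in this paper: the probability of continuing is taken to be expit(alpha0 + alpha1·y1 + gamma·y2), so that gamma = 0 corresponds to MAR and gamma ≠ 0 to MNAR.

```python
import numpy as np


def expit(z):
    """Logistic function, mapping the real line to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def simulate(n, alpha=(0.5, 0.5), gamma=0.0, sigma_u=1.0, sigma_e=1.0, seed=0):
    """Simulate two responses per subject and a continuation indicator r.

    Hypothetical set-up: y_ij = u_i + e_ij, so the marginal means are zero.
    P(r = 1 | y1, y2) = expit(alpha[0] + alpha[1] * y1 + gamma * y2);
    gamma = 0 is MAR (depends on the observed y1 only), gamma != 0 is MNAR.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, sigma_u, n)            # shared random intercept
    y1 = u + rng.normal(0.0, sigma_e, n)       # response at time 1
    y2 = u + rng.normal(0.0, sigma_e, n)       # response at time 2
    p_continue = expit(alpha[0] + alpha[1] * y1 + gamma * y2)
    r = rng.random(n) < p_continue             # True if y2 is observed
    return y1, y2, r


if __name__ == "__main__":
    for label, gamma in [("MAR", 0.0), ("MNAR", 1.0)]:
        y1, y2, r = simulate(200_000, gamma=gamma)
        print(label,
              "proportion observed:", round(r.mean(), 3),
              "mean of observed y2:", round(y2[r].mean(), 3))
```

In this sketch the mean of the observed time 2 responses differs from the true marginal mean of zero under both mechanisms, because continuation depends on the responses. Under MAR the selection acts only through the observed y1, which a likelihood-based analysis can account for; under MNAR it also acts through y2 itself.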

##### 2.2. LME Least False

In this section, the Linear Mixed Effect (LME) method is investigated, which is based on a maximum likelihood estimating approach. The performance of the LME method under MAR and MNAR dropout is examined. Derivation and illustration of theoretical least false values are made. Assuming a Gaussian random intercept model, the score equation of current interest is [19]

where , is a design matrix associated with subject which is , and we will use as notation for the first row of ; thus , , and . We can rearrange the terms in (6) to be

These components are in detail

where , and

Also

Similarly for the right hand side of (8)

Finally

We assume independent and identically distributed responses, with finite variance for the covariate and error distributions, and dropout probabilities bounded away from both zero and one. On dividing all sums by $n$, the weak law of large numbers applies and we can replace the sums with expectations as follows:

In the left hand side of (14), there will be two parts. First

and second

Similarly, the right hand side is

Expressions for , , , , , , and have been obtained under different dropout models. For illustration, we show calculation of under MAR in the Supplementary Materials available at the journal website.

Finally, to find the least false values, we invert the matrix on the left hand side of (14) and multiply this inverse by the right hand side, which yields the array of least false values. In the following section, we present simulations showing how the LME method performs under MAR and MNAR dropout models.
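The final step, inverting the left hand side matrix and multiplying by the right hand side, can be sketched numerically. The example below is a Monte Carlo approximation under stated assumptions, not the analytic derivation of this paper. It takes a hypothetical mean model with design rows (1, x, 0) at time 1 and (1, x, 1) at time 2, treats the variance components `sigma_u` and `sigma_e` as known, uses the dropout form expit(alpha0 + alpha1·y1 + gamma·y2), and replaces each expectation by an average over a large simulated sample. All function and parameter names are illustrative.

```python
import numpy as np


def least_false(beta, gamma, n=400_000, alpha=(0.5, 0.5),
                sigma_u=1.0, sigma_e=1.0, seed=1):
    """Monte Carlo approximation to the limit of the random intercept
    estimating equation for beta when it is fitted to data with dropout.

    Solves A b = c with A = E[X_obs' V_obs^{-1} X_obs] and
    c = E[X_obs' V_obs^{-1} Y_obs], expectations replaced by sample means.
    """
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    x = rng.normal(size=n)                                # scalar covariate
    X1 = np.column_stack([np.ones(n), x, np.zeros(n)])    # time 1 design rows
    X2 = np.column_stack([np.ones(n), x, np.ones(n)])     # time 2 design rows
    u = rng.normal(0.0, sigma_u, n)                       # random intercepts
    y1 = X1 @ beta + u + rng.normal(0.0, sigma_e, n)
    y2 = X2 @ beta + u + rng.normal(0.0, sigma_e, n)
    p = 1.0 / (1.0 + np.exp(-(alpha[0] + alpha[1] * y1 + gamma * y2)))
    r = rng.random(n) < p                                 # True: y2 observed

    # Completers: 2x2 covariance of (y1, y2) under the random intercept model.
    V = np.array([[sigma_u**2 + sigma_e**2, sigma_u**2],
                  [sigma_u**2, sigma_u**2 + sigma_e**2]])
    W = np.linalg.inv(V)
    Xc = np.stack([X1[r], X2[r]], axis=1)                 # shape (n_c, 2, 3)
    Yc = np.stack([y1[r], y2[r]], axis=1)                 # shape (n_c, 2)
    A = np.einsum("nji,jk,nkl->il", Xc, W, Xc)            # sum of X' W X
    c = np.einsum("nji,jk,nk->i", Xc, W, Yc)              # sum of X' W Y

    # Dropouts: only y1 is observed, with scalar variance V[0, 0].
    Xd, yd = X1[~r], y1[~r]
    A += Xd.T @ Xd / V[0, 0]
    c += Xd.T @ yd / V[0, 0]

    # Divide by n (law of large numbers) and solve the linear system.
    return np.linalg.solve(A / n, c / n)


if __name__ == "__main__":
    true_beta = (1.0, 0.5, -0.3)
    print("MAR  limit:", least_false(true_beta, gamma=0.0))
    print("MNAR limit:", least_false(true_beta, gamma=1.0))
```

With gamma = 0 the returned values should be close to the generating `beta`, reflecting consistency under MAR. With gamma ≠ 0 they generally are not, and the difference from the generating values approximates the asymptotic bias under this assumed set-up.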

##### 2.3. Numerical Investigation

A scalar variable is generated, and then the longitudinal means are generated =, =. This was followed by from a bivariate normal distribution with mean . Missingness was generated from (4) and (5) for the MAR and MNAR models, respectively. In all of the following simulations, unless stated otherwise, the parameters =, , were used. In the following, we show the effect of dropout on the limiting values and .

As LME provides consistent estimates under MAR, the least false values and are not affected by changing the dropout probabilities under MAR. Therefore, only MNAR scenarios were considered. From a contour plot of under MNAR (Figure 1), in order to minimise the bias in , should be chosen to be around zero. For negative , the dropout is associated with large , so and both tend to be low if dropout does not occur. Hence is lower than it should be. The opposite happens for a positive .