Abstract

Landmark model (LM) is a dynamic prediction model that uses a longitudinal biomarker in time-to-event data to make prognosis prediction. This study was designed to improve this model and to apply it to assess the cardiovascular risk in on-treatment blood pressure patients. A frailty parameter was used in LM, landmark frailty model (LFM), to account the frailty of the patients and measure the correlation between different landmarks. The proposed model was compared with LM in different scenarios respecting data missing status, sample size (100, 200, and 400), landmarks (6, 12, 24, and 48), and failure percentage (30, 50, and 100%). Bias of parameter estimation and mean square error as well as deviance statistic between models were compared. Additionally, discrimination and calibration capability as the goodness of fit of the model were evaluated using dynamic concordance index (DCI), dynamic prediction error (DPE), and dynamic relative prediction error (DRPE). The proposed model was performed on blood pressure data, obtained from systolic blood pressure intervention trial (SPRINT), in order to calculate the cardiovascular risk. Dynpred, coxme, and coxphw packages in the R.3.4.3 software were used. It was proved that our proposed model, LFM, had a better performance than LM. Parameter estimation in LFM was closer to true values in comparison to that in LM. Deviance statistic showed that there was a statistically significant difference between the two models. In the landmark numbers 6, 12, and 24, the LFM had a higher DCI over time and the three landmarks showed better performance in discrimination. Both DPE and DRPE in LFM were lower in comparison to those in LM over time. It was indicated that LFM had better calibration in comparison to its peer. Moreover, real data showed that the structure of prognostic process was predicted better in LFM than in LM. Accordingly, it is recommended to use the LFM model for assessing cardiovascular risk due to its better performance.

1. Introduction

The risk prediction models (RPMs) are used as a diagnostic model to estimate the probability of an event occurrence in a disease or as a prognostic model to estimate probable consequences of a disease. Accurate prediction of a risk is essential in clinical research, and it is the patient’s right to be informed about their disease progress [1]. Recently, RPMs are being used to help clinicians to make the best decision in diagnostic and therapeutic approaches, based on patient’s demographics, test results, or disease characteristics [2]. Diagnostic models are usually used for risk classification in patients while prognostic models use time to assess disease progress [3, 4]. Nowadays, more prediction models have been used in cardiovascular diseases, such as diagnostic models for assessing the risk factors [3]. However, the cardiovascular risk assessment tools are static prediction models that use baseline predictors, but they still have some shortcomings, such as poor prediction [5], for instance, the inability to determine long-term survival of a heart attack patient with previous successful treatment or the inability to decrease the risk of cardiovascular event in a treated hypertensive patient. During the intervention, biomarkers are measured that are potentially informative in order to determine the treatment efficacy [69]. In this respect, the risk prediction using longitudinal biomarker is referred to as the dynamic prediction model (DPM), which was introduced by some researchers [1013]. One DPM model is joint modeling (JM) [1416], which requires correct determination of biomarker process distribution and event time, but this biomarker distribution is usually unclear. Moreover, its generalization to more than one marker leads to the production of ample computational complexity [17]. The landmark model (LM), another DPM, is used as an appropriate alternative to JM [9, 1719]. The main advantage of LM is its simplicity, since it requires fewer assumptions compared to JM and might have much more power. In LM, the time is divided into different landmarks and then the simple Cox proportional hazards (PH) model is applied to each landmark for individuals who are still alive until time [20, 21]. On the other hand, the biomarker value in each landmark time is considered as a fixed variable; hence, the prediction of risks becomes feasible. A landmark window should be considered to predict survival until time , which is called horizon (thor). is the length of time to predict patient survival as the prediction window.

By analyzing LM, the less frail patients are probably maintained dynamically during the landmark times. On the other hand, the estimated parameters in LM can be affected, if some patients do not follow the specific clinical visit schedule. Also, not considering the correlation between landmarks might affect the risk prediction. Bias in LM probably originates from these neglected issues. In order to improve LM, the frailty parameter was used to present a new model called the landmark frailty model (LFM). Finally, LFM was used to assess the cardiovascular risk in the on-treatment hypertensive patients. To reach this goal, a study with simulation data was designed, and the real blood pressure data which was obtained from systolic blood pressure intervention trial (SPRINT) was analyzed [22, 23].

The rest of this article is organized as follows. Section 2 provides a brief description of the landmark approach as well as the proposed approach. Also, the setting of simulation studies, goodness of fit indices, and real data description are shown in Section 2. We conducted simulation studies to compare LFM with LM in Sections 3. In Section 4, we exhibited our approach with the SPRINT data followed by Section 5, which concludes and discusses simulation and real data.

2. Materials and Methods

2.1. Landmark Approach

Assuming that and are survival time (failure time) and censoring time, then make the observed time. represents the vector of covariates, which can be measured once at the beginning of the study. For example, age and gender are measured only at baseline and they are considered as fixed variables. represents the longitudinal biomarker like systolic or diastolic blood pressure which can be measured for several time intervals. For risk assessment, the Cox (PH) model as the most famous model is defined as where and are hazard function at time and baseline hazard, respectively. In LM, the time is divided into several landmark times including . At landmark (), subjects who are still at risk are considered for analysis and the remaining individuals will be omitted [9]. At each landmark, longitudinal biomarker value is considered as a fixed variable. Then, a time period, a landmark window is considered to predict survival until time which is called horizon (thor). is the length of time to predict patient survival as prediction window, which is the so-called 3 or 5 years. The Cox PH model in equation (1) is reformulated and the conditional hazard function is estimated by

The model presented in equation (2) is defined as the simple or basic LM. It is used to fit a model to each landmark, and it estimates the specific landmark effect of a biomarker for predicting survival between and where is a different baseline hazard in each landmark. We assumed that a longitudinal biomarker, , for subject at the time of was obtained from the mixed-effect model via the following formula: where and denote the design vector for random and fixed effect and subscript is 0 or 1.

According to equation (2), to consider the frailty of patients and the correlation between sequential measurements, LFM is defined as follows: where indicates the frailty of patient , which follows the multivariate normal distribution with mean 0 and covariance matrix . The survival prediction model is related to cumulative hazard function by

To estimate the parameters, Cox partial likelihood is modified via the following integrated (over landmarks) partial log-likelihood (IPL) [24].

In this formula, indicates the risk set, if subject remains until time at landmark . Otherwise, is assumed 0. denotes the risk set at time at landmark . We used the integrated partial likelihood (IPL) by integrating the random effects [25].

By maximizing the IPL, maximum likelihood estimators (MLE) for the parameters are provided. In addition, the coxme function in coxme package can only provide an ML estimate. With respect to the complexity of IPL calculation, the coxme package uses the Laplace approximation technique. More details are described elsewhere [25, 26].

We can also perform a model for all landmarks by stacking data set defined as super LFM, which considers parameters as if they depended on the time in a smooth fashion. It is formulated as where

Dynpred, coxme, and coxphw packages in the R.3.4.3 software were used for data analysis.

2.2. Simulation Study Setting

To assess the application of the models in different aspects, we set up several scenarios with different specificities in terms of sample size (, 200, and 400), number of landmarks (6, 12, 24, and 48), failure rate (30, 50, and 100%), and complete/missing data. In order to perform these models, a dichotomous variable with binomial distribution like treatment effect and continuous covariate with normal distribution like age ( and , respectively) were considered. The regression coefficient, and , was set at 0.5 and 1.5 as true values, respectively, for and . In equation (3), we assume that has a bivariate normal distribution for random intercept and slope with a mean of 0 and covariance matrix of . We also assumed that the individual error term () follows a normal distribution with a mean of 0 and variance of 1. Moreover, it is assumed that continuous variable was measured for 10 times for each individual sequentially. The time was generated from Weibull distribution [27] as shown in the following:

In this equation, , , , , and has a uniform (0, 1) distribution. Moreover, we performed the simple Cox model that just included the baseline data in three different sample sizes and three different failure percentages.

2.3. Goodness of Fit (GOF) and Prediction Ability Indices

There are several indices to assess the goodness of fit (GOF) and prediction ability in DPM. We used the standard error, bias of parameter estimation, and the mean square error (MSE), which were obtained from 300 simulation data. To compare LFM with LM, log-likelihood as well as deviance statistic was used. The latter is compared with mixture chi-square value (1.92) that was obtained from . Akaike information criterion (AIC) was used as if smaller AIC implies a better fit. Moreover, the dynamic concordance index (DCI), dynamic prediction error (DPE), and dynamic relative prediction error (DRPE) were used to measure the discrimination and calibration ability. DPE was obtained from the Brier error score formula:

In fact, the Brier score measures the average discrepancies between true event status and predictive values of survival at time . Low Brier score of a model indicates the better predictive performance of that model. In this formula, as the actual observation for subject at time is an event status, which could be either 1 or 0 (the occurrence or nonoccurrence of an event, respectively).

The predicted survival is estimated by model LM or LFM [28]. DRPE was calculated from

In equation (12), errors are obtained from equation (11) and the null model is a model without any covariates, such as Kaplan-Meier estimate, and the current model is LM or LFM.

2.4. Real Data

We used a part of the systolic blood pressure intervention trial (SPRINT) study [29] (National Heart, Lung, and Blood Institute (NHLBI), funded by the National Institutes of Health; ClinicalTrials.gov number NCT01206062) upon a request ID of 4612. In the main study of SPRINT, methods were reported in detail [30]. In summary, in that randomized controlled trial study, 9361 nondiabetic participants with systolic blood pressure (SBP) of equal or more than130 mmHg were allocated to an intensive treatment () and standard treatment () groups. Baseline data, lab data, and repeated measurement of SBP for 21 times in 5 years were collected. Heart failure, stroke, myocardial infarction, other acute coronary syndromes, and death from cardiovascular causes were regarded as cardiovascular events. Hence, we designed a case-cohort study from this data, which included Framingham risk factors of age, gender, total cholesterol (TCH) level, high-density lipoprotein cholesterol (HDL) level, and SBP. In our model, treatment effect was added to the abovementioned data. We considered 10 measurements of SBP (baseline, 6, 12, 18, 24, 30, 36, 42, 48, and 54 months). The aim was to determine the dynamic effect of blood pressure on cardiovascular disease risk by comparing LFM with LM. On the other hand, these two models were compared with the simple Cox model by considering only baseline blood pressure data. To compare LFM with LM, we used AIC and deviance criteria. And the deviance criteria were tested using mixture chi-square.

3. Results of Simulation

Simple Cox model results are summarized in Table 1, and results of LM and LFM are summarized in Tables 24

3.1. Landmark Models vs. Simple Cox Model

Both LFM and LM had a better relative performance in comparison to the simple Cox model. As can be seen in all scenarios, bias and MSE of bias were lower in LMs; however, this difference decreased as the sample size increased from 100 to 200-400. Nevertheless, bias did not have basic changes in failure rate.

3.2. Comparing LFM vs. LM

The performance of the models was evaluated based on their ability to estimate the true value of the parameters and their ability to classify and predict the actual survival.

3.2.1. Ability to Estimate the True Value of the Parameters

Mean of parameter estimation and its SE, bias, and MSE for failure rate (30%) are shown in Table 2. In total, bias and MSE were lower in LFM in comparison with LM. According to deviance and AIC indices, using 12 landmarks and sample size of 100, there was no statistically significant difference between LFM and LM in both data sets. Deviance was 1.76 in complete data and 1.11 in incomplete data. Deviance index showed that there was a statistically significant difference between the two models at the sample size of 200 and 400 in the complete data. At the sample size of 200, AICs were 1515 and 3636 in LFM and LM, respectively, while at the sample of 400 they were 1521 and 3648.

In these cases, bias and MSE of bias for the two parameters were slightly lower in LFM. In the incomplete data, according to deviance index, there was no significant difference between the two models in all sample sizes, while in the complete data (200-400) statistically significant difference was observed (in both cases, deviance was greater than mixture chi-square and AICs were lower in LFM). In all scenarios with 24 and 40 landmarks, based on deviance and AIC indices, LFM fitted better in comparison to LM. In these cases, the mean estimation of two parameters in the LFM was closer to the true value. This result was more pronounced in the continuous variable. According to the results of failure rate of 50% and 100% summarized in Tables 3 and 4, the superiority of LFM over LM was higher, especially in the model with 12 landmarks.

3.2.2. Ability to Classify and Predict the Actual Survival

We used CDI to assess the discrimination ability of the two models that were run with fixed sample size and failure rate in 3 different landmarks (6, 12, and 24), where CDI value is greater than 0.5, indicating that the model had discrimination ability. As illustrated in Figure 1, LFM had better performance. This advantage was more evident by increasing the number of landmarks from 6 to 12-24. On the other hand, the more area under the curve indicated the more accurate model. DPE and DRPE for calibration ability were plotted in Figures 2 and 3. The error rates and relative error rates in LFM were much lower than those in LM, and this became more prominent by increasing landmark numbers.

4. Results of Real Data

Results of real data are summarized in Table 5 and Figure 4. The adjusted hazard ratio of variables and its confidence interval (CI) are provided. Furthermore, AIC and deviance index were extracted to assess the models. While in both LFM and LM the SBP was highly significant, it had no significant impact on cardiovascular events in simple Cox ( value = 0.258). However, LFM fitted better since it had an AIC equal to 5437 while it was 6385 in LM. Also, the deviance between two models was 559.7 (). After adjusting the treatment effect as well as baseline risk factor effect (Figure 4), it was shown that while SBP was decreasing over time, the hazard ratio (HR) was decreasing in line with SBP in both models. However, it is noteworthy that this reduction was more in the LFM. On the contrary, HR was constant over time in the Cox model. As the blood pressure decreases, the 3-year survival prediction increases. LFM predicts higher survival than LM and the simple Cox model (Figure 5).

5. Discussion

5.1. Discussion of Simulation Data

DPM includes time-dependent marker information during follow-up in order to improve personal survival prediction probabilities. At any follow-up, time-updated marker value can be used to generate a dynamic prediction [1012]. These models are essential to identify high-risk individuals and timely clinical decision-making. Recently, LM as DPM was extensively investigated by researchers [9, 24]. Some of them used LM in different aspects of survival data such as competing risk and cure data [14, 20]. However, they paid little attention to individuals’ frailty and regularity of visits as well as correlation between different landmarks. On the other hand, LM can be affected by the way how landmarks are selected and number of landmarks. Ignoring these issues might lead to an estimation error. Hence, we proposed a modified LM which used the frail parameter as LFM. Indeed, in LM, individuals who experienced the intended event or being censored at a defined landmark time are considered in data analysis. Frailty plays a critical role in those who are retained and repeated dynamically in sequential landmarks due to their low frailty. However, in our proposed model, considering the frailty of patients was included in the analysis; hence, we were able to overcome the abovementioned problems. Simulations showed that both LFM and LM had a relative advantage over the simple Cox model. This was confirmed by various criteria such as bias and MSE. Bias and MSE in dichotomous and continuous variables were higher in the simple Cox model, which is in line with other studies [20, 31]. In the simple Cox model, as the sample size increased, the estimation error slightly decreased and the estimates were closer to the true values. But the simple Cox model was still behind the LMs. LFM and LM were compared regarding different sample sizes, different number of landmarks, different failure rate, and diverse data structure. Generally, the superiority of the LFM over LM was confirmed in the present study. This conformation was very clear in a large sample size and higher landmark number in both complete and incomplete data.

To the best of our knowledge, this is the first study to investigate the effect of number of landmarks on accuracy of the results. Wright et al. [31] performed LM with 20 landmarks, and others have empirically found that 20 to 100 landmarks are appropriate [24]. However, in the case of large sample size, data become too large, and it took too much time to run the programs. There was no significant difference between LM and LFM in small sample size and low number of landmarks, which was the result of checking the deviance and AIC indices between the two models. Both models performed better by increasing failure rate. Although no statistical comparison was made, in each case, LFM was more appropriately fitted with fewer estimation errors. In most times, the discrimination ability of LFM was more than LM since DCI was more than 70% in LFM while this index was lower in LM. The difference between DCIs increased by the implementation of increased number of landmarks. Also, evaluation of collaboration ability (DPE and DRPE) showed that LMF had a better performance than LM.

5.2. Discussion of SPRINT Data

Hypertension is not only recognized as a major cardiovascular risk factor but also has a significant impact on the occurrence of events followed by therapeutic interventions [32]. In this study, we showed that considering hypertension as a dynamic risk factor had a basic role for obtaining real estimation of successful treatment in cardiovascular diseases. Hence, it should be considered dynamically during the course of treatment and not only at the time of admission as a primary measure that it was emphasized by other similar studies [9, 33]. SPRINT data confirmed the simulation results, which contained repeated measurements of SBP as a single longitudinal biomarker. As mentioned in the previous section, more than one biomarker could be used in LMs. By only using the baseline blood pressure data (simple Cox model), the role of SBP was hindered due to dominancy of treatment effect; hence, it was not recognized as a cardiovascular risk factor. This result is in line with the results of our previous study [22] and the same result was obtained from the study carried out by Group S.R. [29]. In both landmark models, as SBP decreased after treatment, the risky effect of SBP was also reduced. While HR in the simple Cox model is close to 1 and constant over time, HR in the two landmark models was close to 3 at the beginning of the study and then it relatively decreased by decreasing blood pressure over time, although the intensity of the significant reduction was higher in our proposed model. On the other hand, in the simple Cox model, the effect of intensive treatment was 38% (1-1/0.720) in comparison with standard treatment, while in LFM and LM it was 110% and 107%, respectively. This means that the protective effect of intensive treatment is highly exhibited in our model in comparison with simple Cox and LM. Thereby, predictability of the 3-year dynamic survival is higher in LFM. Other studies that worked on blood pressure have considered landmarks separately while we used a model that landmarks were considered continuously [21, 33, 34]. This study showed that landmark models can be used to help clinicians to make better decision for diagnosis and treatment. Landmark models, especially our proposed model, are useful for risk assessment, when the data is not complete or regular, similar to our data.

6. Conclusion

In this study, we provided a modified LM, which considered the frailty of the patients as well as the correlation between the landmarks. Our approach can be fitted better in the sense that it has a better GOF, improved real data analysis, and more optimized cardiovascular risk assessment.

Data Availability

The data used to support the findings of this study were supplied by [National Heart, Lung, and Blood Institute (NHLBI),] under license and so cannot be made freely available. Requests for access to these data should be made to {National Heart, Lung, and Blood Institute (NHLBI), Funded by the National Institutes of Health; ClinicalTrials.gov number, NCT01206062}. We have a request ID of 4612.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was extracted from the Ph.D. dissertation of Mr. M. Sayadi [35], and it was supported by Shiraz University of Medical Sciences, Shiraz, Iran (Grant no. 97-16792). The authors would like to thank Mr. I. Razeghian-Jahromi and Ms. Z. Hadi for their valuable suggestions. The authors wish to thank Mr. H. Argasi at the Research Consultation Center (RCC) of Shiraz University of Medical Sciences for his invaluable assistance in editing this manuscript.