Find the Essence through the Phenomena: Cardiovascular Diseases and Biomarkers 2019
Predicting Long-Term Mortality after Acute Coronary Syndrome Using Machine Learning Techniques and Hematological Markers
Introduction. Hematological indices including red cell distribution width and neutrophil to lymphocyte ratio are known to be associated with outcomes of acute coronary syndrome. The usefulness of machine learning techniques in predicting mortality after acute coronary syndrome based on such features has not been studied before. Objective. We aim to create an alternative risk assessment tool, which is based on easily obtainable features, including hematological indices and inflammation markers. Patients and Methods. We obtained the study data from the electronic medical records of 5053 patients hospitalized with acute coronary syndrome during a 5-year period. The time of follow-up ranged from 12 to 72 months. A machine learning classifier was trained to predict death during hospitalization and within 180 and 365 days from admission. Our method was compared with the Global Registry of Acute Coronary Events (GRACE) Score 2.0 on a test dataset. Results. For in-hospital mortality, our model achieved a C-statistic of 0.89 while the GRACE score 2.0 achieved 0.90. For six-month mortality, the results of our model and the GRACE score on the test set were 0.77 and 0.73, respectively. Red cell distribution width (HR 1.23; 95% CI 1.16-1.30) and neutrophil to lymphocyte ratio (HR 1.08; 95% CI 1.05-1.10) showed independent association with all-cause mortality in multivariable Cox regression. Conclusions. Hematological markers, such as neutrophil count and red cell distribution width, have a strong association with all-cause mortality after acute coronary syndrome. A machine-learned model which uses the abovementioned parameters can provide long-term predictions of accuracy comparable or superior to well-validated risk scores.
The term acute coronary syndrome (ACS) refers to many conditions which include non-ST-segment elevation acute coronary syndrome (NSTE-ACS) and ST-elevation myocardial infarction (STEMI). The common cause of these conditions is inadequate blood flow to the myocardium which can be related to acute cholesterol plaque rupture or erosion and thrombus formation. These conditions have a similar presentation, and the most frequent symptom reported by patients is chest pain, which is one of the most common causes of presentation to the emergency room, accounting for up to 6% of emergency department attendances and 27% of medical admissions. Current guidelines emphasize the usefulness of established quantitative risk scores for prognosis estimation, which is necessary for the adequate and cost-effective provision of evidence-based therapies.
Increased systemic and local inflammation plays a crucial role in the pathophysiology of ACS. Various hematological indices have been reported to be associated with poorer prognosis or the occurrence of major adverse cardiac events after ACS. These indices include neutrophil to lymphocyte ratio (NLR) [4–6], platelet to lymphocyte ratio (PLR), red cell distribution width (RDW), and mean platelet volume (MPV). These studies brought evidence that such nonspecific markers of the inflammatory response are associated with the GRACE score. Moreover, they can improve its discriminative capabilities [10, 11].
Machine learning (ML) is a field of computer science that uses various computational algorithms to give computer systems the ability to progressively improve performance on a specific task with data, without being explicitly programmed. This term describes a vast spectrum of computational methods, many of which, like logistic regression, have been used extensively in medical sciences for many years. The most advanced algorithms are currently the subject of intense research and have recently been shown to perform on par with trained ophthalmologists in detecting diabetic retinopathy in eye fundus images, classify skin lesion images automatically with dermatologist-level accuracy, or detect hip fractures from frontal pelvic X-rays.
In our previous research, we successfully used ML techniques to predict in-hospital mortality. In this study, we attempt to develop a new tool for long-term risk assessment following ACS and compare its performance with the GRACE 2.0 model. In contrast to existing risk scores, our tool relies on laboratory tests (including hematological indices) and simple measurements (including blood pressure and heart rate), rather than clinical features. The rationale for such an approach is the proven association of inflammatory response with ACS outcomes.
We retrospectively examined electronic medical records of patients admitted to a cardiology department between January 2012 and December 2016 to select all patients hospitalized because of an ACS. The analyzed group comprised patients who had their diagnosis confirmed by a cardiologist according to ESC guidelines.
A total of 5053 individual patients qualified (1522 with STEMI, 857 with NSTEMI, and 2674 with unstable angina). We analyzed the descriptions of the electrocardiograms in the patients' medical records to identify patients who had an ST-segment elevation or any ST-segment deviation (elevation or depression) according to current guidelines.
We obtained information on all-cause death or survival and on the exact date of death from the national death registry one year after the end of data collection. Patients who had incomplete records or had no blood sample taken during hospitalization were excluded from the study. If a patient was admitted with ACS more than one time in the analyzed period, only the last hospitalization was considered.
All patients were treated according to current guidelines and doctor’s therapeutic decisions. Each patient had a venous blood sample taken within 30 minutes from admission. The complete blood count and hematological parameters were analyzed using an automated blood cell counter CD-RUBY (Abbott, Lake Bluff, Illinois, USA). Biochemical parameters were measured using COBAS 6000 (Roche, Basel, Switzerland). The results of the laboratory tests as well as the clinical information were obtained retrospectively from the electronic medical record (EMR) system at the time of follow-up. During the period of data collection, both Troponin I and Troponin T were used. Therefore, we expressed troponin elevation as a ratio (actual value divided by the norm).
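The derived laboratory features mentioned above are simple ratios. The following is a minimal sketch of how they could be computed, not the authors' code: the function names are hypothetical, and interpreting "the norm" as the assay's upper reference limit is an assumption.

```python
def neutrophil_to_lymphocyte_ratio(neutrophils, lymphocytes):
    """NLR: neutrophil count divided by lymphocyte count (same units)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils / lymphocytes


def platelet_to_lymphocyte_ratio(platelets, lymphocytes):
    """PLR: platelet count divided by lymphocyte count (same units)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return platelets / lymphocytes


def troponin_elevation_ratio(measured, reference_limit):
    """Measured troponin divided by the norm for the assay used.

    Assumption: 'the norm' is taken to be the upper reference limit of the
    specific assay (Troponin I or Troponin T), so that results from both
    assays land on one dimensionless scale.
    """
    if reference_limit <= 0:
        raise ValueError("reference limit must be positive")
    return measured / reference_limit
```

A ratio of 1.0 then means a troponin value exactly at the norm, regardless of which assay produced it.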
Statistical analyses were performed using the RStudio Software. The Shapiro-Wilk test was used to test the variables' distribution for normality. Most of the analyzed variables did not have a normal distribution. Median and interquartile ranges were selected as measures of central tendency and dispersion. The univariable two-tailed Mann-Whitney test was used to compare numerical features. We created a multivariable Cox regression model using variables with statistically significant differences (p value <0.05) in univariate analysis. 310 observations were excluded from the analysis because of missing values. We did not use automated stepwise backward elimination. Instead, all variables which were suspected to influence the outcome were entered into the model. The list of variables used in the Cox regression model is presented in Table 1. The proportional hazard assumption was verified using Schoenfeld residuals. To assess the time-varying effects of the selected variables, Aalen's additive model was used. A p value <0.05 indicated statistical significance. The results were presented as hazard ratios with 95% confidence intervals (CI).
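For readers less familiar with survival analysis, the Cox model and the meaning of a hazard ratio can be written as below. This is the standard textbook form rather than a formula taken from the study; the symbols (baseline hazard h_0, covariates x_j, coefficients beta_j) are notation introduced here, and the value 1.2 is purely illustrative.

```latex
% Cox proportional hazards model: the covariates scale a shared baseline hazard
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \dots + \beta_p x_p\right),
\qquad
\mathrm{HR}_j = \exp(\beta_j)

% Illustrative value only: with HR_j = 1.2 per unit of x_j,
% a two-unit increase multiplies the hazard by
\exp(2\beta_j) = \mathrm{HR}_j^{\,2} = 1.2^{2} = 1.44
```

The proportional hazard assumption states that each HR_j does not change with time t, which is what the Schoenfeld residuals are used to check.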
The probability of death during hospitalization and after 6 and 12 months from admission according to the GRACE 2.0 score was calculated using the model coefficients published on the GRACE project website (https://www.outcomes-umassmed.org/grace/). A Python package was developed to allow for the batch calculation of the GRACE 2.0 death probability based on relevant clinical and laboratory features. As the information about Killip class and creatinine level was available for almost all patients, the full version of the algorithm was used. In 84 cases the missing data did not allow for the calculation of the GRACE probability. Table 1 presents and compares the variables analyzed in the Cox regression model as well as the variables used by the ML model and for the calculation of the GRACE score.
2.1. Machine Learning Methods
Model selection, optimization, and fitting were performed using Python 3.6 and the scikit-learn software package. We used 4969 observations for training and evaluating the ML model. We excluded 84 observations where variables necessary to calculate the GRACE score were missing, as presented in Table 1. The remaining missing values, which did not affect the calculation of the GRACE score, were imputed using the mean of all observations. The gradient-boosted tree algorithm was implemented using the xgboost software package.
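Mean imputation as described above can be sketched in a few lines. This is a dependency-free illustration, not the authors' implementation; the function name and the use of None to mark a missing value are assumptions made for the example.

```python
def impute_with_mean(column):
    """Replace missing entries (None) with the mean of the observed values.

    `column` is one feature across all observations. Hypothetical helper:
    the study does not describe how missing values were encoded.
    """
    observed = [value for value in column if value is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if value is None else value for value in column]


# Example: a creatinine-like column with one missing measurement.
filled = impute_with_mean([1.0, None, 3.0])
```

In scikit-learn the same step is commonly done with `SimpleImputer(strategy="mean")`; whether the study used that class is not stated.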
One-fifth of the available data was put aside as a test set and not used for training. Observations for the test set were chosen randomly, but in a way that preserved the ratio of positive to negative class (death and survival). The ML classifier was optimized using the training data only, with 5-fold cross-validation. In this process, the training data was divided into 5 parts; in turn, each part was held out to measure performance while the classifier was trained on the remaining four. We measured the performance of the GRACE score and our model by calculating the areas under Receiver Operating Characteristic (ROC) curves. The performance measurements during cross-validation were averaged and reported together with their standard deviation. Finally, the performance of both classifiers was compared by calculating the areas under the ROC curves on the test set, which was not used for training the ML model at all. This process was repeated in identical fashion for all analyzed endpoints: in-hospital death, 6-month death, and 12-month death.
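The evaluation protocol described above (a stratified hold-out set, k-fold cross-validation on the remainder, and the area under the ROC curve as the metric) can be sketched without external libraries. The study itself used scikit-learn and xgboost; the helper names below are hypothetical, and the folds in this sketch are not stratified.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling each class separately so the
    ratio of positive to negative cases is preserved in both sets."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists; every fold is held out once."""
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, folds[held_out]


def auroc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With scikit-learn, the corresponding tools are `train_test_split(..., stratify=y)`, `StratifiedKFold`, and `roc_auc_score`. The pairwise AUROC above is quadratic in the number of cases, which is acceptable for a few thousand observations.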
The mortality rate was 1.64% in hospital, 5.87% within 6 months from admission, and 7.85% within a year from admission. 766 patients (15%) died during the period of the study (from January 2012 until acquisition of the survival data in December 2017). The baseline clinical characteristics and laboratory test results according to survival status are presented in Tables 2 and 3. Some variables, including the presence of ST-segment elevation, troponin elevation, sodium levels, and systolic blood pressure, did not meet the proportional hazard assumption. However, examining Aalen's additive model indicated that these parameters have a high prognostic value shortly after admission that decreases over time. The results of the multivariable Cox regression analysis are visualized in the form of a forest plot in Figure 1. High RDW, NLR, monocyte count, creatinine level, prothrombin time, age, and heart rate as well as low sodium and hemoglobin were significantly associated with all-cause mortality in the multivariable model. Due to a large number of missing values for CRP and LDL levels, they were not considered for survival analysis, but we kept them in the machine-learned model because of their known association with ACS pathophysiology and outcomes.
3.1. Machine Learning Results
The model based on the gradient-boosted trees was trained using the following variables as input: troponin elevation ratio, NLR, PLR, RDW, CRP, platelet count, creatinine, hemoglobin, mean cell volume, sodium, prothrombin time, fibrinogen, age, neutrophil count, body mass index, systolic and diastolic blood pressure, heart rate, and sex. The variables were selected to maximize the model's performance, but clinical parameters including the data from the patient's medical history and physical examination were not included in the model. The point was to create a model that could use data that is routinely collected in the EMR system for all patients. The model's performance metrics are summarized in Table 4. Figure 2 presents the Receiver Operating Characteristic (ROC) curves for our classifier and the GRACE score 2.0 for the detection of in-hospital, 6-month, and one-year mortality. Visual inspection of the ROC curves and analysis of the areas under these curves (AUROC) reveal that the results of our model and the GRACE score 2.0 are similar. GRACE performed slightly better for short-term results (AUROC 0.90 vs. 0.89) while our model scored better in long-term results (AUROC 0.77 vs. 0.73 and 0.72 vs. 0.71 for 6-month and one-year mortality, respectively).
The results of the survival analysis using Cox regression confirm findings from numerous studies regarding the association of hematological indices including RDW, NLR, and neutrophil count with short- and long-term prognosis after acute coronary syndrome. The low-grade inflammatory process plays an important role in the formation and subsequent destabilization and rupture of the atherosclerotic plaque. In the multivariable Cox regression model, RDW had a strong association with all-cause mortality (HR 1.22, 95% CI 1.17-1.28). These results are consistent with the findings from other studies that identified RDW as a prognostic marker in cardiovascular diseases and heart failure and also as a predictor of all-cause mortality. It was suggested that patients with increased RDW have lower oxygen supply at tissue level due to decreased red blood cell deformability and impaired blood flow through microcirculation. Our results also seem to confirm the findings from other studies on the impact of admission anemia on long-term prognosis in ACS.
Our model performed better than the GRACE score for medium- and long-term prognosis. However, the difference in performance was small, and the calculations of the GRACE scores in our study were made based on retrospective data and could be inaccurate in some cases. This result needs to be confirmed in prospective validation. Better long-term performance of our model might be related to the fact that it uses inflammation biomarkers. The underlying inflammation process is known to be related to atherosclerosis, but the currently used risk scores do not take advantage of this fact.
GRACE score 2.0 has been extensively validated in various populations and proved to have superior discriminatory accuracy for predicting major adverse cardiac events when compared to other risk assessment tools [25, 26]. However, the adoption of its use in a clinical setting was reported to be unsatisfactory. One of the reasons is that it requires an external application with manual data input, which consumes extra time. Studies have shown that the integration of risk assessment scores into IT solutions resulted in higher compliance. With all the necessary data available in the electronic medical record system, after integration into existing software, our solution can provide risk assessment without any additional input from the physician. The result could then trigger relevant alerts, helping to select the highest risk patients.
Several studies investigated the application of machine learning techniques to risk stratification in ACS. Most of these studies used data collected retrospectively from a large number of electronic medical records, as we did in our study [29, 30]. The models they created, however, were based on numerous clinical features, and it is difficult to reproduce the results and apply their solution in a different setting. For instance, VanHouten et al. reported that their machine-learned model could outperform the GRACE score. They used numerous sparse features, including the full blood count in most patients, and their classifier achieved an area under the receiver operating characteristic curve of 0.85. Our model yields comparable performance, but thanks to using a smaller number of features that require no interpretation, it is easier to apply and validate externally.
5. Study Limitations
In our study, we retrospectively analyzed the electronic medical records of patients hospitalized over several years. This allowed for rapid development of an ML algorithm but is also a significant limitation.
Data stored in medical records are often incomplete, complex, messy, and can be biased. The naive use of raw medical records as input for either inferential statistics or machine learning models can lead to false conclusions. A good example of such a situation is the study of Fine et al., in which patients who were admitted with severe community-acquired pneumonia and died in the emergency department had very little information stored in medical records. As a result, some deceased patients appeared healthier than those who survived.
The most concerning limitation of our study is related to variables that were stored in medical records as unstructured data in the form of physicians’ notes (e.g., descriptions of electrocardiograms). When designing our classifier, we only intended to use features that are available in the medical records as single measurements. Clinical features, including the results of physical examination, patient’s symptoms, and medical history, were not considered. This approach is different than those proposed by many other studies exploring the application of machine learning methods in predicting ACS outcomes [29, 30], where all the features that were available in EMR were used. Nevertheless, determining the presence of ST-segment deviation was necessary for calculating the GRACE score. We did not analyze the electrocardiograms directly, and the classification of some ECG descriptions was not obvious. Therefore, the calculations of the GRACE score were especially prone to bias. To make a justified statement on the performance of our classifier vs. any other existing score, it is necessary to evaluate it prospectively, and the scores should be calculated on the day of admission to the hospital.
The follow-up in our study was limited to death or survival status. This is also an important limitation because it was not possible to assess the occurrence of major adverse cardiac events other than all-cause death. Many patients suffered from recurrent ACS, which we did not analyze in this study. Instead, we only took into account the last available hospitalization.
Another important limitation is related to using the Cox regression model. Some of the variables which we used in this model did not meet the proportional hazard assumption. Nevertheless, after analyzing different regression models, we concluded that the predictive value of ST-segment elevation, troponin elevation, sodium levels, and systolic blood pressure may decrease over time and that it is worth presenting the results in this form.
Finally, although the study included patients hospitalized over many years, this dataset is still modest in terms of machine learning model development. The performance of our classifier varied slightly, depending on which observations were chosen randomly for the test set. In contrast, the GRACE score was validated on over 100,000 patients worldwide; thus the evidence that supports its usefulness is strong. We do not aim to prove that our method is better than any existing well-validated risk score, but to present a new approach to long-term risk prediction in ACS based on different analytic methods and different variables than existing scores.
Hematological markers of inflammation show strong correlation with the outcomes of ACS, and they can be successfully incorporated into numerical models designed to support clinical decisions. Our model predicted long-term mortality better than the GRACE score, but the difference might not be significant, and it requires prospective validation. The potential of such a solution lies in taking advantage of the easily available hematological biomarkers and in eliminating the necessity to enter the results of clinical examination or the past medical history into the model.
ACS: Acute coronary syndrome
AUROC: Area under the receiver operating characteristic curve
BMI: Body mass index
CABG: Coronary artery bypass grafting
EMR: Electronic medical records
GFR: Glomerular filtration rate
GRACE: Global registry of acute coronary events
MCV: Mean cell volume
MPV: Mean platelet volume
NLR: Neutrophil to lymphocyte ratio
NSTE-ACS: Non-ST-segment elevation acute coronary syndrome
NSTEMI: Non-ST-segment elevation myocardial infarction
PLR: Platelet to lymphocyte ratio
RDW: Red cell distribution width
ROC: Receiver operating characteristic
STEMI: ST-segment elevation myocardial infarction.
The datasets used and analyzed during the study contain at least four indirect identifiers of patients which were used as input variables for machine learning algorithms (sex, age, weight, height, and place of treatment). For this reason, the data cannot be made publicly available in this form. However, the authors are willing to share their data on reasonable request and after case-by-case assessment of such a request by a local ethics committee.
Conflicts of Interest
The authors have no conflicts of interest to declare.
All the authors have had access to the data and all drafts of the manuscript. KP, JH, PB, PB, and JR designed the study. KP, KP, JH, and JB managed and analyzed the data. KP, KP, PB, and JH developed the machine-learning models. KP, PB, and JH wrote the draft of the manuscript. KP, JH, PB, and JB reviewed the manuscript. All the authors read and approved the final manuscript.
The Python implementation of the GRACE 2.0 death probability calculation, as well as the Python implementation of our classifier, will be published on GitHub repository https://github.com/konradpieszko upon publication of this manuscript. The Center for Outcomes Research did not evaluate nor formally approve our implementation of the GRACE 2.0 algorithm. However, they were informed about it and had no objections to us sharing the source code of our implementation.
U. U. Tamhane, S. Aneja, D. Montgomery, E. K. Rogers, K. A. Eagle, and H. S. Gurm, “Association between admission neutrophil to lymphocyte ratio and outcomes in patients with acute coronary syndrome,” The American Journal of Cardiology, vol. 102, no. 6, pp. 653–657, 2008.
S. Chatterjee, P. Chandra, G. Guha et al., “Pre-procedural elevated white blood cell count and neutrophil-lymphocyte (N/L) ratio are predictors of ventricular arrhythmias during percutaneous coronary intervention,” Cardiovascular & Hematological Disorders Drug Targets, vol. 11, no. 2, pp. 58–60, 2011.
H. Vakili, M. Shirazi, M. Charkhkar, I. Khaheshi, M. Memaryan, and M. Naderian, “Correlation of platelet-to-lymphocyte ratio and neutrophil-to-lymphocyte ratio with thrombolysis in myocardial infarction frame count in ST-segment elevation myocardial infarction,” European Journal of Clinical Investigation, vol. 47, no. 4, pp. 322–327, 2017.
H. Acet, F. Ertaş, M. A. Akıl et al., “Relationship between hematologic indices and global registry of acute coronary events risk score in patients with ST-segment elevation myocardial infarction,” Clinical and Applied Thrombosis/Hemostasis, vol. 22, no. 1, pp. 60–68, 2016.
A. T. Timóteo, A. L. Papoila, A. Lousinha et al., “Predictive impact on medium-term mortality of hematological parameters in acute coronary syndromes: added value on top of GRACE risk score,” European Heart Journal: Acute Cardiovascular, vol. 4, no. 2, pp. 172–179, 2015.
W. Gale, L. Oakden-Rayner, G. Carneiro, A. P. Bradley, and L. J. Palmer, Detecting Hip Fractures with Radiologist-Level Performance Using Deep Neural Networks, 2017, https://arxiv.org/abs/1711.06504.
P. L. Flom and D. L. Casell, “Stopping stepwise: why stepwise and similar selection methods are bad, and what you should use,” in NESUG 2007 Proceedings; NorthEast SAS Users Group 22th Annual Conference, pp. 13–16, Baltimore, MD, USA, November 2007.
T. Chen and C. Guestrin, “XGBoost: a scalable tree boosting system,” in Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 785–794, San Francisco, CA, USA, 2016.
B. Szyguła-Jurkiewicz, Ł. Siedlecki, Ł. Pyka, E. Romuk, P. Przybyłowski, and M. Gąsior, “Red blood cell distribution width, relative lymphocyte count, and type 2 diabetes predict all-cause mortality in patients with advanced heart failure,” Polish Archives of Internal Medicine, vol. 128, no. 2, pp. 1–6, 2018.
D. P. Chew, C. Juergens, J. French et al., “An examination of clinical intuition in risk assessment among acute coronary syndromes patients: observations from a prospective multi-center international observational registry,” International Journal of Cardiology, vol. 171, no. 2, pp. 209–216, 2014.
J. P. VanHouten, J. M. Starmer, N. M. Lorenzi, D. J. Maron, and T. A. Lasko, “Machine learning for risk prediction of acute coronary syndrome,” AMIA Annual Symposium proceedings Archive, vol. 2014, pp. 1940–1949, 2014.
J. Wallert, M. Tomasoni, G. Madison, and C. Held, “Predicting two-year survival versus non-survival after first myocardial infarction using machine learning and Swedish national register data,” BMC Medical Informatics and Decision Making, vol. 17, no. 1, p. 99, 2017.