Abstract

Cardiovascular diseases (CVDs) are the most common comorbidities in chronic obstructive pulmonary disease (COPD) and increase the risk of hospitalization, length of stay, and death in COPD patients. This study aimed to identify the predictors of CVDs in COPD patients and to construct a prediction model based on these predictors. In total, 1022 COPD patients from the National Health and Nutrition Examination Surveys (NHANES) were included in this cross-sectional study. All subjects were randomly divided into a training set (n = 709) and a testing set (n = 313). The data before and after handling of missing values were compared via sensitivity analysis. Univariate and multivariable analyses were employed to screen the predictors of CVDs in COPD patients. The performance of the prediction model was evaluated via the area under the curve (AUC), accuracy, sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), and calibration. Subgroup analysis was performed in the testing set according to COPD diagnosis method and smoking status. Male sex, older age, a smoking history, overweight, a history of blood transfusion, a history of heart disease in close relatives, and higher levels of white blood cells (WBC) and monocytes (MONO) were associated with an increased risk of CVDs in COPD patients. Higher levels of platelets (PLT) and lymphocytes (LYM) were associated with a reduced risk of CVDs in COPD patients. A prediction model for the risk of CVDs in COPD patients was established based on predictors including gender, age, a smoking history, BMI, a history of blood transfusion, a history of heart disease in close relatives, WBC, MONO, PLT, and LYM. The AUC of the prediction model was 0.75 (95% CI: 0.71–0.79) in the training set and 0.79 (95% CI: 0.73–0.85) in the testing set. The established prediction model showed good performance in predicting CVDs in COPD patients.

1. Introduction

As a complex respiratory disorder, chronic obstructive pulmonary disease (COPD) is characterized by persistent airflow limitation associated with an abnormal inflammatory response to noxious particles and gases [1, 2]. The prevalence of COPD is 11%–26%, and this worrisome trend is expected to continue over the next 25 years [3]. Alarmingly, over 6 million deaths were estimated to result from COPD annually worldwide, and COPD was predicted to become the third leading cause of death worldwide by 2030 [4]. This prediction has already been fulfilled, and COPD was reported to have caused 3.23 million deaths in 2019 [5]. COPD places a large burden on society, with an estimated cost of US$50 trillion per year [6]. With increasing air pollution and population aging worldwide, COPD is expected to become the main economic burden among chronic diseases in the future [7]. Particular attention to COPD is therefore essential for both society and patients.

Although COPD primarily affects the lungs, patients also suffer from comorbidities such as cardiovascular diseases (CVDs), lung cancer, and metabolic diseases [8]. CVDs are the most common comorbidities in COPD and increase the risk of hospitalization, length of stay, and death in COPD patients [9]. Previous studies have reported that the prevalence of CVDs in COPD patients is approximately 10%–38% [10] and that CVDs account for about 20%–50% of deaths in COPD patients [11]. Preventing the occurrence of CVDs in COPD patients is therefore of vital significance for improving their prognosis.

Research findings have accumulated over the past years, and risk factors for CVDs in COPD patients have been identified in various studies. Increased serum levels of factors associated with inflammation and oxidative stress, such as vascular cell adhesion molecule-1 [12] and human epididymis protein 4 [13], were reported to be correlated with an increased risk of CVDs in COPD patients. Chronic bronchial infection was also found to increase the incidence of CVDs in COPD patients [14]. Machine learning enables systems to learn automatically and build analytical models from experience, and various prediction models have been built to identify individuals at risk of certain diseases or for other clinical uses [15, 16]. Such prediction models provide valuable tools that help guide clinicians and nurses in treatment and care. Previously, a prediction model for the risk of CVDs in COPD patients was established based on the monocyte (MONO) level/HDL cholesterol ratio, with an area under the receiver operating characteristic curve (AUC) of 0.73 [17]. That model focused only on the MONO level/HDL cholesterol ratio and lacked important demographic and clinical variables associated with inflammation in COPD [18]. In this study, we collected the data of 1022 COPD patients from the National Health and Nutrition Examination Surveys (NHANES) between 2007 and 2018. The purpose of our study was to explore the predictors of CVDs in COPD patients and to construct a prediction model based on these predictors. A nomogram was also plotted so that the probability of CVDs in a COPD patient can be estimated quickly.

2. Methods

2.1. Study Population

The current cross-sectional study collected the data of 1199 COPD patients from the NHANES database from 2007 to 2018. NHANES is an ongoing program conducted by the National Center for Health Statistics (NCHS) of the Centers for Disease Control and Prevention (CDC) to evaluate the health and nutritional status of the civilian noninstitutionalized population of the United States [19]. Every year, about 5000 nationally representative individuals are sampled through a multistage, stratified, clustered sampling method [19]. As the data came from this existing public survey program, informed consent from the participants was waived. In our study, patients who answered “unknown” or “refuse” were excluded, and 1022 patients were finally included. All participants were divided into the training set (n = 709) and the testing set (n = 313). The screening process of subjects is displayed in Figure 1.

2.2. Main Variables and Outcome Variables

The main variables analyzed included age (years), gender, race (non-Hispanic White, non-Hispanic Black, or other race), education level (under 12th grade, high school graduate/general equivalent diploma or equivalent, or some college or above), marital status (married or other), annual family income (<$20,000 or ≥$20,000), smoking status (yes or no), overweight (yes or no), a history of blood transfusion, a history of heart disease in close relatives, white blood cell (WBC; 10^9/L), MONO (10^9/L), neutrophil (NEUT; 10^9/L), platelet (PLT; 10^9/L), lymphocyte (LYM; 10^9/L), NEUT/LYM ratio (NLR), and PLT/LYM ratio (PLR).
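The two derived ratios follow directly from the blood counts. The short sketch below is illustrative only: the function name `inflammation_ratios` and the decision to reject a non-positive lymphocyte count are assumptions, not part of the study's own code.

```python
from typing import Dict


def inflammation_ratios(neut: float, plt: float, lym: float) -> Dict[str, float]:
    """Return NLR (NEUT/LYM) and PLR (PLT/LYM); all counts in 10^9/L."""
    if lym <= 0:
        # Assumption: a non-positive lymphocyte count is treated as invalid input.
        raise ValueError("lymphocyte count must be positive")
    return {"NLR": neut / lym, "PLR": plt / lym}
```

Because both ratios share LYM as the denominator, they are strongly related to the LYM predictor itself, which is worth keeping in mind when they are screened alongside the raw counts.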

The outcome variable was the presence of CVDs in COPD patients. COPD was defined as an answer of “Yes” to question MCQ160O in the MCQ series (Has a doctor or other health professional ever told you that you had COPD?) or as a postbronchodilator forced expiratory volume in 1 second (FEV1)/forced vital capacity (FVC) ratio (FEV1/FVC) ≤0.7 verified by spirometry [20]. CVDs were defined as a history of at least one of the following: myocardial infarction, congestive heart failure, angina pectoris, or coronary heart disease.
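The two definitions above amount to simple rules, sketched here under stated assumptions. The function names, argument names, and dictionary keys are hypothetical and do not correspond to NHANES variable names (only MCQ160O is named in the text); the ≤0.7 threshold is the one given above.

```python
from typing import Mapping, Optional

# Hypothetical keys for the four conditions listed in the CVD definition.
CVD_CONDITIONS = (
    "myocardial_infarction",
    "congestive_heart_failure",
    "angina_pectoris",
    "coronary_heart_disease",
)


def has_copd(told_copd: bool,
             fev1: Optional[float] = None,
             fvc: Optional[float] = None) -> bool:
    """COPD if self-reported (MCQ160O = Yes) or postbronchodilator FEV1/FVC <= 0.7."""
    if told_copd:
        return True
    if fev1 is None or fvc is None or fvc <= 0:
        # Without usable spirometry, only the questionnaire answer counts.
        return False
    return fev1 / fvc <= 0.7


def has_cvd(history: Mapping[str, bool]) -> bool:
    """CVD if at least one of the four listed conditions is reported."""
    return any(history.get(name, False) for name in CVD_CONDITIONS)
```

Keeping the two diagnosis routes in one function, with the questionnaire checked first, mirrors the "or" in the definition; recording which route applied would be needed for the subgroup analysis by diagnosis method.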

2.3. Statistical Analysis

The Kolmogorov–Smirnov test was used to assess the normality of the measurement data. Normally distributed continuous variables were expressed as mean ± standard deviation (mean ± SD), and groups were compared with the t-test. Nonnormally distributed measurement data were expressed as median and quartiles [M (Q1, Q3)], and groups were compared with the Wilcoxon rank sum test. Enumeration data were described as n (%), and the chi-square (χ2) test or Fisher’s exact test was applied for comparisons between groups [21]. The data before and after handling of missing values were compared in the sensitivity analysis. All subjects were assigned to the training set or the testing set at a ratio of 7 : 3. Univariate and multivariable analyses were employed to screen the predictors of CVDs in COPD patients. The prediction model was then constructed and a nomogram was plotted. Model performance was evaluated via the area under the curve (AUC), accuracy, sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), and calibration. Receiver operating characteristic (ROC) curves were drawn, and subgroup analysis was performed in the testing set according to COPD diagnosis method and smoking status. Statistical analysis was conducted with SAS 9.4, and R 4.0.2 was used to construct the model. was considered as a statistical difference.

2.4. The Proposed Architecture of Our Study

In this study, the data of 1022 COPD patients from NHANES between 2007 and 2018 were collected. The purpose was to explore the predictors of CVDs in COPD patients and to construct a prediction model based on these predictors. All subjects were assigned to the training set or the testing set at a ratio of 7 : 3. Univariate and multivariable analyses were employed to screen the predictors of CVDs in COPD patients. The prediction model was then constructed, and a nomogram was plotted so that the probability of CVDs in a COPD patient can be estimated quickly. The prediction model showed good predictive performance, and the nomogram makes it easy for clinicians to estimate the probability of CVDs in a COPD patient and to provide timely interventions for patients at high risk of CVDs.

3. Results

3.1. The Manipulation of Missing Data

Missing values of each variable were imputed with the median of that variable. Sensitivity analysis was performed to compare the data before and after imputation. No statistical difference was observed between the data before and after median imputation (Supplementary Table 1).
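Median filling of a single variable can be sketched as follows. This is a minimal illustration rather than the study's procedure: the function name `impute_median` is hypothetical, missing values are represented by `None`, and the median is taken over all observed values of the column (the text does not say whether the median was computed before or after the training/testing split).

```python
from statistics import median
from typing import List, Optional


def impute_median(values: List[Optional[float]]) -> List[float]:
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to compute a median from")
    fill = median(observed)
    return [fill if v is None else v for v in values]
```

For example, `impute_median([1.0, None, 3.0, 5.0])` fills the gap with 3.0, the median of the three observed values.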

3.2. Comparison of the Characteristics between COPD Patients with CVDs and without CVDs

As depicted in Table 1, the mean age (60.34 years vs. 66.33 years, t = −7.38) and the mean levels of WBC (7.62 × 10^9/L vs. 8.08 × 10^9/L, t = −2.89, P = 0.004) and MONO (8.16 × 10^9/L vs. 8.75 × 10^9/L, t = −3.30) were lower in patients without CVDs than in patients with CVDs. The mean PLT level was higher in patients without CVDs than in patients with CVDs (244.59 × 10^9/L vs. 228.70 × 10^9/L, t = 3.43). The average levels of NEUT (4.40 vs. 4.80, Z = 3.289) and NLR (2.19 vs. 2.55, Z = 4.577) were lower in patients without CVDs than in patients with CVDs. The distributions of education level (Z = −2.139), annual family income (χ2 = 7.599), smoking (χ2 = 8.817), overweight (χ2 = 25.203), a history of blood transfusion (χ2 = 48.405), and a history of heart disease in close relatives (χ2 = 31.967) were statistically different between patients with and without CVDs.

3.3. Predictors of CVDs in COPD Patients

Variables that differed significantly between COPD patients with and without CVDs in the baseline comparison were included in the multivariable logistic regression analysis. Male sex had an odds ratio (OR) of 1.060 relative to female sex (95% confidence interval (CI): 0.752–1.494). Increased age was associated with a higher risk of CVDs (OR = 1.040, 95% CI: 1.025–1.055). A smoking history was associated with a 1.737-fold risk of CVDs (OR = 1.737, 95% CI: 1.118–2.697), and overweight with a 1.987-fold risk (OR = 1.987, 95% CI: 1.449–2.725). A history of blood transfusion was associated with a 2.437-fold risk of CVDs (OR = 2.437, 95% CI: 1.734–3.425), and a history of heart disease in close relatives with a 2.758-fold risk (OR = 2.758, 95% CI: 1.919–3.962). Higher levels of WBC (OR = 1.227, 95% CI: 1.125–1.338) and MONO (OR = 1.085, 95% CI: 1.016–1.159) were associated with a higher risk of CVDs in COPD patients. Higher levels of PLT (OR = 0.996, 95% CI: 0.993–0.998) and LYM (OR = 0.723, 95% CI: 0.568–0.920) were associated with a decreased risk of CVDs in COPD patients (Table 2).
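For reference, the standard logistic regression relations between a coefficient, its odds ratio, and the relative change in odds are summarized below. The symbols \(\beta_j\) and \(\mathrm{SE}(\beta_j)\) are introduced here for illustration, and the Wald form of the interval is an assumption, since the text does not state how the confidence intervals were computed.

```latex
\[
\mathrm{OR}_j = e^{\beta_j}, \qquad
95\%\ \mathrm{CI} = \left[\, e^{\beta_j - 1.96\,\mathrm{SE}(\beta_j)},\;
                            e^{\beta_j + 1.96\,\mathrm{SE}(\beta_j)} \,\right], \qquad
\text{relative change in odds} = \mathrm{OR}_j - 1 .
\]
```

Under these relations, the OR of 1.737 for a smoking history corresponds to odds that are 73.7% higher, and the OR of 0.723 for LYM corresponds to odds that are 27.7% lower per unit increase.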

3.4. The Equilibrium Test of the Training Set and Testing Set

The participants were randomly divided into the training set and the testing set (7 : 3). The equilibrium analysis revealed no statistically significant difference in any variable between the training set and the testing set (Table 3), which indicated that the two sets were well balanced.
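A seeded random split of this kind can be sketched as follows. The function name `split_train_test`, the seed, and the rounding rule are assumptions made for illustration; note that an exact 70% cut of 1022 subjects would give 715/307, whereas the study reports 709/313, so the authors' randomization evidently differed in detail.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_test(items: Sequence[T],
                     train_fraction: float = 0.7,
                     seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of the items with a fixed seed and cut it at train_fraction."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]
```

Fixing the seed makes the partition reproducible, which matters here because the equilibrium test and all reported performance figures depend on one particular split.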

3.5. Construction of the Logistic Prediction Model and Validation of the Predictive Value via the Testing Set

Predictors identified in the multivariable regression analysis were entered into the logistic prediction model. In the training set, male sex had an OR of 1.231 relative to female sex (95% CI: 0.817–1.856). Older age was associated with a higher risk of CVDs (OR = 1.037, 95% CI: 1.019–1.054). A smoking history had an OR of 1.497 (95% CI: 0.891–2.513). Overweight was associated with a 1.575-fold risk of CVDs in COPD patients (OR = 1.575, 95% CI: 1.080–2.298). A history of blood transfusion was associated with a 2.090-fold risk of CVDs (OR = 2.090, 95% CI: 1.387–3.148). A history of heart disease in close relatives increased the risk of CVDs in COPD patients (OR = 2.944, 95% CI: 1.923–4.506). The risk of CVDs was increased in patients with higher levels of WBC (OR = 1.197, 95% CI: 1.082–1.324) and MONO (OR = 1.115, 95% CI: 1.034–1.204). Higher levels of PLT (OR = 0.996, 95% CI: 0.993–0.999) and LYM (OR = 0.826, 95% CI: 0.625–1.091) had ORs below 1 (Table 4). The formula of the prediction model was: Logit(P) = ln[P/(1 − P)] = 0.208 × male + 0.036 × age + 0.202 × smoking + 0.227 × overweight + 0.368 × blood transfusion + 0.540 × heart disease in close relatives + 0.180 × WBC + 0.109 × MONO − 0.004 × PLT − 0.191 × LYM.
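The printed formula can be evaluated with a few lines of code. The sketch below is hedged in two respects: the formula as printed contains no intercept, so `intercept` is a required argument that the caller must supply, and the coding of the binary predictors (assumed 1 = yes or male, 0 = no or female) is not stated in the text. The dictionary keys and function names are hypothetical; the example patient is the one described with the nomogram in Section 3.5.

```python
import math
from typing import Dict

# Coefficients copied from the printed formula. Assumption: binary predictors
# are coded 1 = yes (or male) and 0 = no (or female); the text does not say.
COEFFICIENTS: Dict[str, float] = {
    "male": 0.208,
    "age": 0.036,
    "smoking": 0.202,
    "overweight": 0.227,
    "blood_transfusion": 0.368,
    "family_heart_disease": 0.540,
    "wbc": 0.180,
    "mono": 0.109,
    "plt": -0.004,
    "lym": -0.191,
}

# Example patient from Section 3.5: female, 58 years, smoker, transfused,
# not overweight, no heart disease in close relatives.
EXAMPLE_PATIENT: Dict[str, float] = {
    "male": 0, "age": 58, "smoking": 1, "overweight": 0,
    "blood_transfusion": 1, "family_heart_disease": 0,
    "wbc": 7.12, "mono": 6.09, "plt": 338, "lym": 1.91,
}


def linear_predictor(patient: Dict[str, float], intercept: float) -> float:
    """Intercept plus the coefficient-weighted sum of the ten predictors."""
    return intercept + sum(
        COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS
    )


def predicted_probability(patient: Dict[str, float], intercept: float) -> float:
    """Inverse logit of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(patient, intercept)))
```

With an intercept of zero, the example patient's linear predictor is about 2.89, which maps to a probability near 0.95 rather than the 0.15 read from the nomogram. An intercept (or a different variable coding) would therefore have to be supplied before this sketch could reproduce the reported value.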

The ROC curves of the training set and testing set are shown in Figure 2. The AUC of the prediction model was 0.75 (95% CI: 0.71–0.79) in the training set and 0.79 (95% CI: 0.73–0.85) in the testing set. The accuracy was 0.76 (95% CI: 0.72–0.79) in the training set and 0.77 (95% CI: 0.72–0.82) in the testing set. The sensitivity was 0.56 (95% CI: 0.46–0.63) in the training set and 0.51 (95% CI: 0.41–0.62) in the testing set. The specificity was 0.83 (95% CI: 0.79–0.86) in the training set and 0.87 (95% CI: 0.83–0.92) in the testing set. The NPV was 0.84 (95% CI: 0.81–0.87) in the training set and 0.83 (95% CI: 0.78–0.87) in the testing set. The PPV was 0.53 (95% CI: 0.46–0.60) in the training set and 0.60 (95% CI: 0.49–0.71) in the testing set (Table 5). The calibration curves of the model in the training set and testing set are shown in Figure 3; the predicted values deviated slightly from the ideal calibration line but remained close to it, indicating good agreement between the predicted probability and the actual probability. A nomogram was also established for predicting the occurrence of CVDs in COPD patients (Figure 4). As an example, one patient was randomly selected from the training set: a 58-year-old female with a history of smoking and blood transfusion, without a history of heart disease in close relatives, and not overweight. Her LYM level was 1.91 × 10^9/L, WBC 7.12 × 10^9/L, PLT 338 × 10^9/L, and MONO 6.09 × 10^9/L. The total score was 288 and the predicted probability of CVDs was 0.15, which was consistent with the actual outcome (Figure 4).
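The point estimates of these performance measures can be computed from labels and predictions as sketched below. This is an illustration under assumptions: the function names are hypothetical, the probability cutoff used to obtain predicted classes is not stated in the text and must be applied beforehand, the AUC uses the rank-based (Mann-Whitney) formulation, and the confidence intervals are not computed.

```python
import math
from typing import Dict, Sequence


def _ratio(numerator: int, denominator: int) -> float:
    """Safe division that returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def classification_metrics(y_true: Sequence[int],
                           y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, PPV, and NPV from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is quadratic in the number of subjects, which is acceptable for sets of a few hundred patients such as the testing set and its subgroups; the same functions can be applied to each subgroup separately.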

3.6. Subgroup Analysis of Prediction Ability of the Prediction Model

As two diagnosis methods for COPD were used in our study, subgroup analysis by diagnosis method was performed in the testing set. In COPD patients diagnosed by spirometry, the AUC was 0.69 (95% CI: 0.53–0.85), the accuracy was 0.83 (95% CI: 0.75–0.88), the sensitivity was 0.38 (95% CI: 0.14–0.61), the specificity was 0.88 (95% CI: 0.82–0.93), the NPV was 0.92 (95% CI: 0.87–0.97), and the PPV was 0.27 (95% CI: 0.09–0.46). In COPD patients diagnosed from the questionnaire, the AUC was 0.72 (95% CI: 0.64–0.80), the accuracy was 0.63 (95% CI: 0.55–0.71), the sensitivity was 0.70 (95% CI: 0.60–0.81), the specificity was 0.61 (95% CI: 0.51–0.72), the NPV was 0.70 (95% CI: 0.60–0.81), and the PPV was 0.57 (95% CI: 0.45–0.68). Subgroup analysis was also performed according to smoking history. In patients with a history of smoking, the AUC was 0.72 (95% CI: 0.64–0.80), the accuracy was 0.63 (95% CI: 0.55–0.71), the sensitivity was 0.70 (95% CI: 0.60–0.81), the specificity was 0.61 (95% CI: 0.51–0.72), the NPV was 0.70 (95% CI: 0.60–0.81), and the PPV was 0.57 (95% CI: 0.45–0.68). In patients without a history of smoking, the AUC was 0.64 (95% CI: 0.45–0.83), the accuracy was 0.72 (95% CI: 0.59–0.83), the sensitivity was 0.22 (95% CI: 0.01–0.49), the specificity was 0.82 (95% CI: 0.71–0.92), the NPV was 0.85 (95% CI: 0.75–0.95), and the PPV was 0.18 (95% CI: 0.01–0.41) (Table 6). The prediction model showed better performance in patients with COPD diagnosed according to questionnaires and in patients with a history of smoking.

4. Discussion

This study collected the data of 1022 COPD patients to evaluate the factors associated with the occurrence of CVDs in COPD patients and to establish a prediction model based on these predictors. The data revealed that male sex, age, smoking history, overweight, history of blood transfusion or heart disease in close relatives, and levels of WBC, PLT, LYM, and MONO were predictors of CVDs in COPD patients. Additionally, we established a prediction model for the occurrence of CVDs in COPD patients based on these predictors. The AUC of the prediction model was 0.75 in the training set and 0.79 in the testing set, which showed good predictive performance. Subgroup analysis revealed that the prediction model performed better in patients with COPD diagnosed according to questionnaires and in patients with a history of smoking.

Cigarette smoke is the major cause of COPD and accounts for about 95% of COPD cases in industrialized countries [22]. Smoking is also reported to be one of the most important risk factors for COPD with CVDs [23]. This may be due to the diverse inflammatory responses resulting from smoking in COPD patients, which increase the risk of CVDs [24]. Austin et al. identified that alveolar macrophages in bronchoalveolar lavage from the lungs of smokers might release more reactive oxygen species than those of nonsmokers [25]. Herein, a history of smoking was associated with a higher risk of CVDs in COPD patients. Several previous studies have indicated that age was associated with the incidence of CVDs in COPD patients [14, 26]. These findings support the results of our study, which showed that increased age was correlated with a higher risk of CVDs in COPD patients. In the current study, a history of blood transfusion was also associated with an increased risk of CVDs in COPD patients. As reported, blood transfusion was a risk factor for major cardiovascular events in patients with acute myocardial infarction and anemia [27, 28]. A family history of heart disease is widely proposed to be an essential marker for predicting the occurrence of cardiovascular events [29, 30], which supports our finding that a history of heart disease in close relatives was associated with an increased risk of CVDs in COPD patients.

Routine blood parameters are essential inflammatory markers of COPD, and some of these inflammatory markers are elevated in patients with COPD [31, 32]. Monocytes circulate in the blood, bone marrow, and spleen and are active participants in the inflammation of COPD [33]. An increased WBC count and a decreased LYM count were also identified in COPD patients compared with healthy subjects [34, 35]. Inflammation was associated with changes in the structure, shape, and dynamics of PLT, which may further affect atherogenic and thrombotic events [36]. Another study also showed that increased levels of inflammatory markers are associated with an increased incidence of atherosclerosis, coronary heart disease, congestive heart failure, and atrial fibrillation [37]. In our study, higher WBC and MONO levels were associated with an increased risk of CVDs in COPD patients, while higher levels of PLT and LYM were associated with a decreased risk. For COPD patients, routine blood tests should be conducted regularly, with close attention to the levels of WBC, MONO, PLT, and LYM, so that patients at high risk of CVDs can be identified in time.

The current study assessed the predictors of CVDs in COPD patients and established a prediction model based on these predictors in the training set. The model was validated in the testing set. The AUC values indicated good predictive ability in both the training set and the testing set. The calibration curves also suggested good agreement between the predicted probability and the actual probability, indicating that our prediction model had good predictive performance. Previously, a prediction model for the occurrence of CVDs in COPD patients was constructed based on the MONO level/HDL cholesterol ratio, which showed an AUC of 0.73, and our prediction model showed better predictive performance than that model. Additionally, a nomogram was plotted in line with our model, which allows clinicians to calculate the score directly from the graph and quickly estimate the probability of CVDs in a COPD patient. The nomogram might offer clinicians a tool for providing timely interventions to prevent the occurrence of CVDs in patients at high risk.

A strength of this study is that missing data were handled and a sensitivity analysis was conducted, which revealed no significant difference between the characteristics of patients before and after handling of the missing data. This suggests that imputing the missing data might introduce less bias than simply deleting incomplete records, which might increase the reliability of our results. Internal validation was also performed to verify the results of the present study. Several limitations exist in the present study. First, the sample size was small, which reduced the statistical power. Second, external validation was not conducted. Third, some data collected from NHANES questionnaires were self-reported, which might cause bias. In the future, studies with larger sample sizes are required to validate the findings of our study.

5. Conclusions

Herein, a prediction model for CVDs in COPD patients was constructed using the data of 1022 patients from the NHANES database, based on predictors including gender, age, smoking history, overweight, history of blood transfusion or heart disease in close relatives, and levels of WBC, MONO, PLT, and LYM. The model showed good predictive performance, with AUC values of 0.75 in the training set and 0.79 in the testing set. A nomogram was also plotted for predicting the occurrence of CVDs in COPD patients. The findings might help identify COPD patients at high risk of CVDs and support timely interventions and treatment to prevent the occurrence of CVDs. In future work, we plan to collect samples from COPD patients in our hospital and use these data to verify the predictive performance of the model. Meanwhile, advanced deep learning and optimization approaches will be employed to help improve the prediction of CVDs in COPD patients.

Data Availability

The datasets generated and/or analyzed during the current study are available via https://www.cdc.gov/nchs/nhanes/index.htm.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Supplementary Materials

Supplementary Table 1: Sensitivity analysis of the data before and after handling of missing values. (Supplementary Materials)