Abstract

Current atherosclerosis (AS) assessment devices are inconvenient for users to carry around. In response to this shortcoming, we propose collecting the wrist photoplethysmograph (PPG) signal and building models to predict two indicators of atherosclerosis: cardiovascular age and right brachial-ankle pulse wave velocity (baPWV). This study uses the maximum correlation coefficient method for feature selection and establishes multiple models to predict cardiovascular age and the right baPWV. The results show that the backpropagation (BP) neural network model predicts cardiovascular age best. Its Pearson correlation coefficient (PCC) is 0.9501, and its best six physiological features are crest time (CT), crest time ratio (CTR), slope K, stiffness index (SI), reflection index (RI), and heart rate (HR). For predicting the right baPWV, we propose a hybrid method, MLR_BP, which combines multiple linear regression (MLR) with the BP model and gives better experimental results than either BP or MLR alone. The MLR_BP model reaches a PCC of 0.9204 and needs only two features, HR and cardiovascular age. This study further verifies the results of related literature and supports the relationship between AS and related physiological parameters. The proposed method can be applied to wearable devices and has application value for diagnosing AS and preventing cardiovascular diseases.

1. Introduction

According to the World Heart Federation report at the World Heart Conference, approximately 20 million people die from various cardiovascular diseases (CVD) each year worldwide, and the number of CVD deaths is estimated to exceed 30 million in 2025 [1]. The “China Cardiovascular Health and Disease Report 2018” pointed out that China’s CVD prevalence and fatality rate are still on the rise. It is estimated that there are 270 million CVD patients, and CVD deaths account for more than 40% of residents’ disease mortality, which is higher than that of tumors and other diseases [2]. Arteriosclerosis (AS) is a significant predictor of CVD, so the prevention of AS is the key to reducing the risk of CVD [3]. Clinically, AS can be diagnosed by magnetic resonance imaging, ultrasound, and other methods. However, these methods require professional equipment, are costly and complicated to operate, and cannot capture AS status dynamically at any time. Developing a portable, noninvasive wearable device for AS diagnosis therefore has positive significance for the early screening and diagnosis of CVD.

The Framingham Heart Study proposed cardiovascular age and cardiovascular risk as new indicators to measure AS. The study used factors such as gender, age, and systolic blood pressure to predict cardiovascular age and cardiovascular risk [4]. In recent years, a large number of studies have found that physiological parameters such as the ankle-brachial index (ABI) and pulse wave velocity (PWV) are important indicators for evaluating AS [5, 6]. PWV measured at different sites, such as brachial-ankle pulse wave velocity (baPWV) and carotid-femoral PWV (cfPWV), has essential value in assessing AS. However, these PWV measurement methods are generally complicated to operate, and the equipment is inconvenient to carry around. Because of the critical role of cardiovascular age and baPWV in assessing AS, the goal of our research is to predict these two physiological indexes from the wrist PPG signal.

PWV is an independent predictor of cardiovascular risk [7]. Cardiovascular age is also one of the gold standards for assessing AS. Existing studies use PPG signals to evaluate blood pressure, arterial stiffness, and so on, and the signals are mostly collected at the finger. In daily life, collecting finger signals reduces the user’s comfort. The wrist has thick skin and fewer blood vessels, so its signal is weaker than that of the finger. Feature analysis based on the wrist PPG signal is therefore more complicated, but the wrist is more comfortable for users. To improve user comfort, this research first collects PPG signals at the wrist, extracts relevant features, and then establishes a cardiovascular age prediction model. The correlation between baPWV and aortic PWV is high, and so is the correlation between left and right baPWV. Therefore, this study takes the right baPWV as an example to establish the baPWV prediction model and uses the predicted cardiovascular age and baPWV to monitor AS noninvasively and dynamically.

The main contributions of this paper are as follows:

(1) Through many experimental analyses, we found the best feature subset and the best model for predicting cardiovascular age.

(2) When predicting the right baPWV, we proposed the MLR_BP model, which improves the prediction accuracy and further verifies that AS and cardiovascular age are correlated.

(3) Our study found that heart rate (HR) plays an essential role in predicting the right baPWV, indicating that HR has a specific relationship with AS. It may provide another convenient method for monitoring AS.

(4) The proposed model has few feature parameters and low computational overhead, can be embedded in wearable devices, and improves the comfort of the user experience. It has particular reference significance for detecting AS and preventing CVD.

There are six sections in the paper, and the content of each section is as follows.

In the first section, “Introduction,” we first introduce the background and significance of this research. Then, we introduce the existing methods for evaluating AS, including ABI, PWV, and cardiovascular age, and the disadvantages of related research. Finally, we explain the research content and innovation of this paper.

In the second section, “Related Work,” we introduce related research that extracts various characteristic parameters from PPG signals to evaluate cardiovascular function, AS, blood pressure, and other physiological indicators.

In the third section, “Materials and Methods,” there are four parts. First, the data acquisition subjects are introduced. Second, the PPG signal processing procedure is presented. Third, the methods of feature extraction and feature selection are introduced. Last is the construction of the models, which include MLR, SVR, BP, and our proposed MLR_BP model.

In the fourth section, “Results,” there are four parts. First, we introduce the results of model feature selection in predicting cardiovascular age. Second, we show the accuracy of the models in predicting cardiovascular age. Third, we present the results of model feature selection in predicting the right baPWV. Finally, we report the accuracy of the models in predicting the right baPWV. We also analyze the results to verify our contributions.

In the fifth section “Discussion,” we discuss the relationship between the experimental results and the conclusions in the relevant literature. The experimental results further verify the findings in the relevant literature. At the same time, there are new findings from the experimental results.

In the sixth section “Conclusion,” we summarize the work of the paper and related conclusions, as well as deficiencies and future work.

2. Related Work

PPG contains cardiovascular-related physiological and pathological information such as cardiac pulsation function, hemodynamics, and vascular conditions [8, 9]. The PPG tracing method is noninvasive, convenient, and low cost, and PPG signals are relatively easy to obtain compared with other biomedical signals. Therefore, more and more scholars use PPG waveforms to measure physiological indexes such as blood pressure, blood sugar, blood oxygen saturation, and heart rate. Pulse wave analysis (PWA) is a typical method for studying AS, and it is widely used in both Chinese and Western medicine [10, 11].

Nidigattu et al. applied feature engineering to PPG signals to find the best feature subset for predicting heart rate and blood pressure [12]. Zhang et al. extracted features from PPG signals and built a machine learning model to predict blood glucose, offering an alternative to the existing invasive measurement methods [13]. Couceiro Ricardo et al. proposed evaluating cardiovascular function from the PPG signal at the finger through multi-Gaussian fitting [14]. Zhao et al. established a joint framework for heart rate monitoring from PPG signals during exercise. It can be used for PPG heart rate monitoring in high-intensity physical activities and can be applied to fitness tracking and health information tracking on smart wearable devices [15]. Shen et al. recorded PPG signals in a free-moving environment and used deep learning algorithms to detect the onset of atrial fibrillation (AF) [16]. Tjahjadi et al. accurately classified blood pressure types based on bidirectional long short-term memory and time-frequency analysis of PPG signals [17].

The above studies extract different feature parameters from the PPG signal to study the relevant conditions of the cardiovascular system. It is of great significance for the realization of rapid and noninvasive monitoring of CVD. The PPG signal collected at the wrist is greatly affected by motion artifacts. The above research did not analyze and process the signal collected at the wrist. They have also not been applied to wearable devices for real-time and noninvasive detection of cardiovascular disease-related indicators. Therefore, the evaluation of AS indicators based on the PPG signal at the wrist is of great significance.

3. Materials and Methods

3.1. Selection and Data Collection of Research Objects

In this experiment, a self-developed watch is used to collect the PPG signal at the wrist. The PPG signal is transmitted to a smartphone via Bluetooth and can then be sent to a computer. Standard cardiovascular testing equipment measures each subject’s cardiovascular age, and AS testing equipment measures the left lower limb ABI, right lower limb ABI, left baPWV, and right baPWV. Thirty-seven subjects were recruited for this experiment, 20 males and 17 females aged 24–66. The population selection is representative.

In the experiment, each subject wore the wristwatch, which recorded ECG and PPG signals simultaneously for 1 minute. The sampling frequency was 500 Hz, so each 1-minute recording contained 30,000 signal points. To make up for the small number of participants, we randomly selected 3000-point signal segments for denoising and feature extraction and used the remaining signal points for data augmentation, which gave a simulated experimental population of 370 people. To reduce computing resources and make the algorithms easier to integrate into wearable devices, we analyze only the PPG signals. Participants did not engage in moderate or high-intensity exercise for more than 10 minutes within 1 hour before the experiment.
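
The arithmetic behind the simulated population can be checked with a short sketch. It assumes, as one plausible reading of the description, that each 30,000-point recording is cut into non-overlapping 3000-point segments; the function name `split_into_segments` and the placeholder recording are illustrative and are not taken from the study.

```python
def split_into_segments(signal, segment_length):
    """Cut a recording into consecutive, non-overlapping segments.

    Trailing samples that do not fill a whole segment are dropped.
    """
    n_segments = len(signal) // segment_length
    return [
        signal[i * segment_length:(i + 1) * segment_length]
        for i in range(n_segments)
    ]


SAMPLING_RATE_HZ = 500   # from the text
DURATION_S = 60          # 1 minute per subject
SEGMENT_LENGTH = 3000    # points per analysed segment (6 s at 500 Hz)
N_SUBJECTS = 37

# Placeholder recording: the values are irrelevant, only the length matters.
recording = [0.0] * (SAMPLING_RATE_HZ * DURATION_S)
segments = split_into_segments(recording, SEGMENT_LENGTH)

segments_per_subject = len(segments)
simulated_population = N_SUBJECTS * segments_per_subject
```

Under this assumption each subject contributes ten segments, which matches the reported total of 370 simulated samples.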

3.2. PPG Signal Processing

We use the discrete wavelet transform to filter high-frequency noise and a maximum-point recognition algorithm on the differential signal to identify the starting point of each pulse. A baseline calibration algorithm based on the starting points removes baseline drift, and the “time-domain analysis method” and the “derivative function analysis method” are used to extract feature points. We use the threshold method to extract the typical feature points and display them on the waveform graph. The signal processing flow is shown in Figure 1, and Figure 2 shows the original PPG waveform collected at the wrist and the waveform after denoising.
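
The starting-point identification and baseline calibration steps can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the signal has already been denoised, takes each sufficiently large local maximum of the first difference as a systolic upstroke, walks back to the preceding local minimum to find the starting point, and subtracts a piecewise-linear baseline drawn through consecutive starting points. The function names, the `slope_fraction` threshold, and the synthetic drifting signal are all hypothetical.

```python
import math


def first_difference(signal):
    """Discrete derivative: d[i] = signal[i + 1] - signal[i]."""
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]


def find_onsets(signal, slope_fraction=0.5):
    """Locate pulse starting points from maxima of the differential signal."""
    diff = first_difference(signal)
    threshold = slope_fraction * max(diff)
    onsets = []
    for i in range(1, len(diff) - 1):
        is_local_max = diff[i - 1] < diff[i] >= diff[i + 1]
        if is_local_max and diff[i] > threshold:
            # Walk back along the upstroke until the signal stops rising.
            j = i
            while j > 0 and diff[j - 1] > 0:
                j -= 1
            if not onsets or j != onsets[-1]:
                onsets.append(j)
    return onsets


def remove_baseline(signal, onsets):
    """Subtract the piecewise-linear baseline through consecutive onsets.

    Only the span between the first and last onset is returned.
    """
    corrected = list(signal)
    for start, end in zip(onsets, onsets[1:]):
        step = (signal[end] - signal[start]) / (end - start)
        for k in range(start, end):
            corrected[k] = signal[k] - (signal[start] + step * (k - start))
    return corrected[onsets[0]:onsets[-1]]


# Synthetic test signal: five identical pulses plus a linear drift.
PERIOD = 400  # samples per beat (0.8 s at 500 Hz)
raw = [
    math.sin(math.pi * (n % PERIOD) / PERIOD) ** 2 + 0.001 * n
    for n in range(PERIOD * 5)
]
onsets = find_onsets(raw)
flat = remove_baseline(raw, onsets)
```

On the synthetic signal the drift raises each successive raw peak, whereas the corrected beats have equal peak heights. A real wrist recording would additionally need the wavelet denoising step and a threshold tuned to the device.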

3.3. Feature Extraction and Feature Selection

In the denoised PPG signal, the AS-related indexes are extracted according to the “time-domain analysis method” and the “derivative function analysis method.” A total of 10 feature indexes are extracted, and their detailed definitions are shown in Table 1. Figure 3 shows the typical features of PPG. A is the starting point, B is the primary wave, C is the descending middle gorge wave, D is the dicrotic wave, T is the duration of a complete pulse wave, and K is the slope. The relevant index is calculated from the collected PPG signal Q(t), the peak of Q(t), the valley of Q(t), and the pulsation period T.
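
As an illustration of how such indexes can be computed from one pulse, the sketch below derives CT, CTR, HR, and a K value from a single onset-to-onset segment. The definitions are assumptions based on common usage, since Table 1 is not reproduced here: CT is the time from the starting point to the primary peak, CTR is CT divided by the period T, HR is 60/T, and K is taken as the pulse waveform characteristic value (mean of Q(t) minus its valley, divided by peak minus valley), which is one widely used quantity built from Q(t), its peak, its valley, and T. The function names and the synthetic pulse are hypothetical.

```python
import math


def pulse_features(pulse, sampling_rate_hz):
    """Compute simple time-domain features for one pulse (onset to onset)."""
    period_s = len(pulse) / sampling_rate_hz
    peak_index = max(range(len(pulse)), key=lambda i: pulse[i])
    crest_time_s = peak_index / sampling_rate_hz
    q_max, q_min = max(pulse), min(pulse)
    # Discrete stand-in for (1/T) * integral of Q(t) over one period.
    q_mean = sum(pulse) / len(pulse)
    return {
        "CT": crest_time_s,
        "CTR": crest_time_s / period_s,
        "HR": 60.0 / period_s,
        "K": (q_mean - q_min) / (q_max - q_min),  # assumed definition
    }


def synthetic_pulse(rise_samples=100, fall_samples=300):
    """Smooth asymmetric pulse: fast upstroke, slow decay (illustration only)."""
    pulse = []
    for n in range(rise_samples + fall_samples):
        if n < rise_samples:
            phase = 0.5 * n / rise_samples
        else:
            phase = 0.5 + 0.5 * (n - rise_samples) / fall_samples
        pulse.append(math.sin(math.pi * phase) ** 2)
    return pulse


features = pulse_features(synthetic_pulse(), sampling_rate_hz=500)
```

SI and RI are omitted because they additionally require the dicrotic wave position and, for SI, the subject's height.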

The best feature subset is selected for training to reduce the number of features and help the model generalize. The experiment uses the Pearson correlation coefficient (PCC) method for feature selection. We find the best feature subset for each machine learning model; the whole process is shown in Figure 4.

Feature selection methods are divided into three types: wrapper, embedded, and filter. There are multiple methods for each class. The wrapper method takes the performance of the learning model used as the evaluation criterion of the feature subset, and the purpose is to select a “tailor-made” feature subset for the learning model. The feature selection of the embedded method is embedded in the learning model training process; the feature selection is not clearly distinguished from the training process of the learner. Therefore, the feature selection of these two methods takes into account the subsequent learning model.

However, the filter method does not need to consider the learning model to be used later when selecting features. This method selects features based on their general properties, such as target correlation, autocorrelation, and divergence. One of the primary purposes of our research is to explore the relationship between AS or baPWV and the related features, so the filter method is chosen for feature selection. The filter approach itself includes a variety of methods, such as variance selection and correlation coefficient selection.

The variance selection method selects features whose variance exceeds a set threshold. It considers only the variance of each feature and ignores the target value, and it is not easy to determine an appropriate threshold. Intuitively, the larger the correlation coefficient between a feature and the target value (AS; baPWV), the more important the feature is. Compared with other methods, the correlation coefficient is more straightforward and more interpretable.

In addition, the computational overhead of the wrapper and embedded methods is larger than that of the filter method, which can quickly explore the best model and feature subset. Taking these considerations together, we use the maximum correlation coefficient method based on PCC for feature selection.
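
A minimal sketch of filter-style selection by correlation coefficient follows. It ranks features by the absolute value of their PCC with the target and keeps the top k; using the absolute value is an assumption, made so that strongly negative correlations also count as informative. The function names and the toy values are illustrative and are not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, target):
    """Return feature names sorted by |PCC| with the target, largest first."""
    scores = {
        name: abs(pearson(values, target))
        for name, values in features.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def top_k(features, target, k):
    """Keep the k features most correlated with the target."""
    return rank_features(features, target)[:k]


# Toy values for illustration only.
target = [30, 40, 50, 60, 70]
features = {
    "CT": [0.20, 0.15, 0.22, 0.12, 0.18],
    "HR": [60, 66, 71, 77, 83],
    "noise": [1, -1, 1, -1, 1],
}
ranking = rank_features(features, target)
best_two = top_k(features, target, 2)
```

Constant features have zero variance and would cause a division by zero in `pearson`, so they should be dropped before ranking.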

3.4. Model Construction

We use linear and nonlinear models to build cardiovascular age prediction models, including MLR, Ridge Regression (RR), Lasso Regression (LR), BP, Random Forest (FR), and Support Vector Regression (SVR).

In the experiment, the BP neural network and MLR models were used to predict the right baPWV value. We also propose a hybrid method, MLR_BP, based on the MLR and BP models. It predicts the right baPWV more accurately than either the BP neural network or the MLR model.

(1) MLR establishes the relationship between the response variable and the explanatory variables by fitting a linear formula. In this study, $x_1, x_2, \ldots, x_n$ are the extracted PPG signal features, and the response variable $y$ is the predicted cardiovascular age or baPWV value. The formula is as follows:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n$$

Among them, $\beta_0$ and $\beta_1, \ldots, \beta_n$ are the regression coefficients, which are fitted on the training set to minimize the loss function. The loss function adopts the least squares method, as shown in the following formula:

$$L = \sum_{i=1}^{m} \left(y_i - \hat{y}_i\right)^2$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value (the predicted cardiovascular age or baPWV value in this study), and $m$ is the number of training samples.

A fundamental problem in linear regression is overfitting, which means that the training error of the model is tiny but the test error is large. Two methods are generally used to reduce overfitting. One is to reduce the number of features; in this study, feature selection serves this purpose. The other is regularization. Lasso and Ridge Regression add L1 and L2 regularization, respectively, to standard linear regression. Therefore, the LR and RR models are also used in the experiment.

(2) The SVR model can cope with overfitting, nonlinearity, the curse of dimensionality, and local minima. It can handle both linear and nonlinear relationships and has good generalization ability. In this study, the feature variables extracted from the PPG signal may have a nonlinear relationship with the target, so this model is used to explore the nonlinear relationship.

(3) The BP neural network is composed of two parts: forward propagation and backpropagation of the error signal. Forward propagation runs from the input layer to the hidden layer and then to the output layer. The state of each layer of neurons only affects the next layer of neurons. If the output layer does not reach the expected result, the error signal is propagated backward. This process iterates until the error is minimized, using the gradient descent method. The output of the $j$th neural unit of the hidden layer is as follows:

$$h_j = f\left(\sum_{i=1}^{n} w_{ij} x_i + b_j\right)$$

Among them, $w_{ij}$ represents the weight between input node $i$ and hidden node $j$, $x_i$ represents the input (in this research, the value of an extracted feature), $b_j$ is the bias, and $f$ is the activation function. The output layer node is expressed as follows:

$$\hat{y} = f\left(\sum_{j} v_j h_j + c\right)$$

where $v_j$ is the weight between hidden node $j$ and the output node and $c$ is the bias.

(4) To improve the accuracy of predicting baPWV, we propose the MLR_BP model. This model is a fusion of the MLR and BP neural network models. The algorithm steps are as follows:

    (1) We use the BP model to find the best feature subset for predicting cardiovascular age.

    (2) We take the cardiovascular age predicted by the BP model as a new feature. Together with the ten features extracted from the PPG signal, it forms a new, larger feature set.

    (3) Based on the MLR model and the maximum correlation coefficient method, the best feature subset for predicting baPWV is found from the newly constructed feature set.
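
The forward pass and the gradient descent update of a BP network can be sketched in a few lines. This is an illustrative toy, not the network used in the study: it assumes one hidden layer with sigmoid units, a single linear output unit, a squared-error loss, and per-sample updates. The class name `TinyBPNetwork`, the learning rate, the hidden-layer size, and the toy samples are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyBPNetwork:
    """One hidden layer of sigmoid units feeding one linear output unit."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Input-to-hidden weights, one row per hidden unit.
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [0.0] * n_hidden  # hidden biases
        # Hidden-to-output weights and output bias.
        self.v = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.c = 0.0

    def forward(self, x):
        """Forward propagation: input -> hidden -> output."""
        hidden = [
            sigmoid(sum(w_i * x_i for w_i, x_i in zip(w_row, x)) + b_j)
            for w_row, b_j in zip(self.w, self.b)
        ]
        output = sum(v_j * h_j for v_j, h_j in zip(self.v, hidden)) + self.c
        return hidden, output

    def train_step(self, x, y, lr):
        """Backpropagate the error of one sample and update all weights."""
        hidden, output = self.forward(x)
        error = output - y  # derivative of 0.5 * error**2 w.r.t. output
        for j, h_j in enumerate(hidden):
            # Error signal at hidden unit j, computed with the old v[j].
            grad_hidden = error * self.v[j] * h_j * (1.0 - h_j)
            self.v[j] -= lr * error * h_j
            self.b[j] -= lr * grad_hidden
            for i, x_i in enumerate(x):
                self.w[j][i] -= lr * grad_hidden * x_i
        self.c -= lr * error
        return 0.5 * error ** 2


# Toy samples (inputs scaled to [0, 1]); not data from the study.
samples = [([0.0, 0.0], 0.1), ([0.0, 1.0], 0.6), ([1.0, 0.0], 0.4),
           ([1.0, 1.0], 0.9), ([0.5, 0.5], 0.5)]


def mean_loss(net):
    return sum(0.5 * (net.forward(x)[1] - y) ** 2
               for x, y in samples) / len(samples)


net = TinyBPNetwork(n_inputs=2, n_hidden=3)
initial_loss = mean_loss(net)
for _ in range(2000):
    for x, y in samples:
        net.train_step(x, y, lr=0.1)
final_loss = mean_loss(net)
```

The loop repeats forward propagation and error backpropagation until the loss stops improving; in practice the inputs would be the selected PPG features after scaling.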

In the experiment, 80% of the collected data are divided into the training set, and 20% are the test set. We used Python 3.6, and the best parameters were found by grid search and 5-fold cross-validation.
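
The split and the parameter search can be sketched without any library as follows. The sketch assumes a random 80/20 split over sample indices, five folds built from the training indices, and a caller-supplied scoring function; `evaluate`, the parameter name `hidden`, and the candidate values are hypothetical placeholders rather than the settings used in the study.

```python
import random


def train_test_split(indices, test_fraction=0.2, seed=0):
    """Shuffle the indices and hold out a fraction as the test set."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists for k roughly equal folds."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def grid_search(param_grid, train_indices, evaluate, k=5):
    """Return the parameters with the highest mean cross-validated score."""
    best_params, best_score = None, float("-inf")
    for params in param_grid:
        scores = [evaluate(params, tr, va)
                  for tr, va in k_fold(train_indices, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


train_idx, test_idx = train_test_split(range(370))


# Placeholder scorer: a real one would fit a model on `train` and return
# its score (for example, the PCC) on `validation`.
def evaluate(params, train, validation):
    return -abs(params["hidden"] - 6)


grid = [{"hidden": h} for h in (2, 4, 6, 8)]
best_params, best_score = grid_search(grid, train_idx, evaluate)
```

The test indices are never seen during the search, so they remain available for the final accuracy estimate.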

4. Results

4.1. The Results of Models Feature Selection in Predicting Cardiovascular Age

First, we perform feature selection based on PCC and rank the 10 features by their correlation coefficients. Then, the top 1 to 10 features are fed into each model in order of correlation coefficient. Figure 5 shows the PCC between the predicted value of each model and the standard value for each number of selected features.

We can see from Figure 5 that the BP model has the best effect when selecting six features. These six features are crest time (CT), crest time ratio (CTR), slope K, stiffness index (SI), reflection index (RI), and heart rate (HR). The three linear models of MLR, LR, and RR have better results when selecting four features. Only the representative models are shown in Figure 5, and the MLR model is representative of these three linear models. The SVR model also works best when the number of features is six, and the FR model works best when nine features are chosen.

4.2. Accuracy of Models Performance in Predicting Cardiovascular Age

The accuracy of the models is evaluated by PCC, Mean Deviation (MD), Residual Standard Deviation (RSD), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE). PCC represents the correlation between the predicted value and the measured value. MD is the average deviation between the predicted value and the measured value and reflects the bias between the two. RSD represents the degree of dispersion of the residuals. RMSE measures the deviation between the predicted value and the measured value. MAE is the average of the absolute errors between the predicted value and the measured value.
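
The five indicators can be computed directly from paired predicted and measured values, as in the sketch below. Two conventions are assumptions: the residual is taken as predicted minus measured, and RSD is the sample standard deviation of the residuals (divisor n − 1). The function name and the four toy value pairs are illustrative only.

```python
import math


def evaluation_metrics(predicted, measured):
    """Return PCC, MD, RSD, root mean square error, and MAE."""
    n = len(predicted)
    residuals = [p - m for p, m in zip(predicted, measured)]
    md = sum(residuals) / n
    rsd = math.sqrt(sum((r - md) ** 2 for r in residuals) / (n - 1))
    rmse = math.sqrt(sum(r ** 2 for r in residuals) / n)
    mae = sum(abs(r) for r in residuals) / n

    mean_p, mean_m = sum(predicted) / n, sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m)
              for p, m in zip(predicted, measured))
    pcc = cov / math.sqrt(
        sum((p - mean_p) ** 2 for p in predicted)
        * sum((m - mean_m) ** 2 for m in measured)
    )
    return {"PCC": pcc, "MD": md, "RSD": rsd, "RMSE": rmse, "MAE": mae}


# Toy values for illustration only.
metrics = evaluation_metrics(predicted=[31, 42, 48, 61],
                             measured=[30, 40, 50, 60])
```

MD keeps the sign of the errors, so positive and negative residuals can cancel, whereas MAE and the root mean square error cannot be reduced by cancellation.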

Table 2 shows the accuracy of each model. Among the six models, the BP model is the best, and the random forest model is the worst. The prediction results of the models are statistically significant.

Figure 6 shows the density plot of the residuals between the cardiovascular age predicted by each model and the measured value. Figures 6(a) and 6(b) represent the BP model and the MLR model, respectively. The residuals between the predicted and measured cardiovascular age of the BP model are normally distributed. The comparison in Figure 6 shows that the prediction result of the BP model is better and more credible than that of the MLR model.

Based on the above experimental results, the best model for predicting cardiovascular age is BP, and its best feature subset is CT, CTR, slope K, SI, RI, and HR.

4.3. The Results of Models Feature Selection in Predicting Right baPWV

To further predict the atherosclerosis index, the right baPWV was selected as the second prediction target after cardiovascular age. The MLR model and the BP model were used for comparison, and 1 to 10 features were selected in turn by the maximum correlation coefficient method. The result of feature selection is shown in Figure 7. Both the MLR and BP models have the best predictive effect when the number of features is 1, and that feature is HR.

To further improve the accuracy of predicting baPWV, we propose the MLR_BP model. The experimental results are shown in Figure 8. The MLR_BP model works best when the number of features is 2, and it is easy to see that the proposed model is better than the BP and MLR models. These two features are HR and cardiovascular age.
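
The two-stage structure of MLR_BP can be sketched as follows. The sketch assumes that a trained stage-one model is available as a callable; here `predict_age` is a hypothetical stand-in for the BP network, and the MLR stage is an ordinary least squares fit solved through the normal equations. All numbers are synthetic and chosen only so that the fitted coefficients can be checked; they are not results from the study.

```python
def fit_mlr(rows, targets):
    """Ordinary least squares via the normal equations.

    rows: list of feature lists (an intercept column is added here).
    Returns [beta_0, beta_1, ..., beta_n].
    """
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    # Augmented matrix [X^T X | X^T y].
    a = [[sum(d[i] * d[j] for d in design) for j in range(p)]
         + [sum(d[i] * t for d, t in zip(design, targets))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][p] / a[i][i] for i in range(p)]


def predict_mlr(beta, row):
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


def fit_mlr_bp(ppg_features, heart_rate, bapwv, predict_age):
    """Stage 1: predict cardiovascular age. Stage 2: MLR on [HR, age]."""
    predicted_age = [predict_age(f) for f in ppg_features]
    stage_two_rows = [[hr, age] for hr, age in zip(heart_rate, predicted_age)]
    return fit_mlr(stage_two_rows, bapwv), predicted_age


# Hypothetical stand-in for the trained BP model (illustration only).
def predict_age(features):
    return 20.0 + 100.0 * features[0]


# Synthetic data: baPWV is constructed as an exact linear function of
# HR and the predicted age so that the fit can be verified.
ppg_features = [[0.10], [0.25], [0.30], [0.45], [0.20], [0.40]]
heart_rate = [62, 70, 66, 80, 75, 68]
bapwv = [500.0 + 5.0 * hr + 10.0 * predict_age(f)
         for hr, f in zip(heart_rate, ppg_features)]

beta, predicted_age = fit_mlr_bp(ppg_features, heart_rate, bapwv, predict_age)
estimate = predict_mlr(beta, [heart_rate[0], predicted_age[0]])
```

Because the second stage is a two-feature linear model, inference on a wearable device reduces to one BP forward pass followed by a single weighted sum.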

4.4. Accuracy of Models Performance in Predicting Right baPWV

Table 3 shows the prediction performance of the three models. The prediction accuracy of the BP model and the MLR model is similar, and MLR_BP outperforms both models on every accuracy indicator. The correlation between the predicted value and the measured value is statistically significant for all three models.

Figure 9 is a density plot of the residuals between the right baPWV predicted by two models and the measured value. Figures 9(a) and 9(b) represent the MLR_BP and MLR models, respectively. Because Table 3 shows that the accuracy indicators of the MLR model and the BP model are similar, Figure 9 only draws the residual density plots of the MLR and MLR_BP models. The residuals of both the MLR_BP model and the MLR model are normally distributed. However, the residuals of the MLR_BP prediction are more concentrated around 0, so the MLR_BP model fits the data better than the MLR model. In summary, the MLR_BP model performs better than the MLR and BP models.

Therefore, the above experimental results and analysis show that our proposed MLR_BP model is the best for predicting the right baPWV, and its best feature subset is HR and cardiovascular age. The experimental results also show that, for the BP and MLR models, the best feature subset for predicting the right baPWV is HR alone. Whichever of these three models is used, HR plays an essential role in predicting the right baPWV, which further indicates that HR and AS have a specific relationship. Because many devices can measure HR easily, this may provide a more convenient method for monitoring AS.

In addition, the two models we proposed use few feature parameters and require few computing resources. They can be easily embedded in wearable devices and make AS assessment more comfortable for the user. This has particular reference significance for detecting AS and preventing CVD.

5. Discussion

The baPWV is a widely used clinical index for assessing AS and is the gold standard. It is recorded at the brachial and ankle arteries. This study uses the PPG signal at the wrist to predict the value of baPWV with high accuracy, indicating that the proposed model provides a specific reference value for the simple measurement of baPWV.

Millasseau et al. have confirmed that SI increases with age, correlates with cfPWV, and can be used as an AS index. At the same time, clinical studies have demonstrated that RI plays an essential role in evaluating arterial elasticity [18]. Based on the CT study, Wu et al. confirmed that parameters such as CTR have a significant positive correlation with SI, RI, and cfPWV and that, in the absence of obvious reflected waves, such parameters can still be used to assess the degree of AS [19]. The results of this study further verify these conclusions.

In the study of predicting cardiovascular age, the MLR model has the best prediction effect when using four features (K; SI; RI; HR). The BP neural network has the best effect when selecting six features (CT; CTR; K; SI; RI; HR). Experiments show that the addition of CT and CTR features improves the accuracy of the model’s prediction. It indicates that CT and CTR have a particular value in predicting cardiovascular age, which is also consistent with the research conclusions of Wu et al.

MLR and BP neural network models need only the HR feature to achieve the best prediction effect in predicting the right baPWV. The MLR_BP model needs the two features of HR and cardiovascular age to achieve the best accuracy. It shows that cardiovascular age and baPWV are correlated, which is consistent with the views of related literature [20]. The study results indicate that HR plays an essential role in the prediction of cardiovascular age and baPWV. It shows that HR may also be a crucial indicator in assessing AS.

6. Conclusion

This research collected PPG signals at the wrist, extracted relevant features, and established several models to predict cardiovascular age and the right baPWV. We found the best feature subsets for the different models, which further verifies the influence of related physiological indicators on AS. In the prediction of cardiovascular age, the BP model has the best accuracy. In the prediction of the right baPWV, the MLR_BP model has the best accuracy, and it uses only two features, HR and cardiovascular age. The experimental results further verified the correlation between AS and SI, RI, CT, and CTR, which is consistent with the research conclusions of related scholars. The results also show a specific correlation between HR and AS. Because HR monitoring is relatively mature and convenient, it may provide a new and convenient way to detect AS, although many more experiments are required to verify this.

In this research, based on the wrist PPG signal, models were established to assess two indicators of AS, cardiovascular age and right baPWV. The models have high prediction accuracy and few feature parameters, so they can be applied to wearable devices. They can make the detection of AS-related indexes more convenient, fast, and noninvasive, and they have a particular reference value for diagnosing AS and preventing cardiovascular diseases. The PPG signal is subject to many external interference factors, such as motion artifacts and temperature. The subjects were in a relatively quiet setting when we collected the data in this study, so we need to study further the robustness of the model during people’s daily exercise. In the future, we will explore wearable devices that apply our proposed model, collect data in different environments, and make the model more robust and more general.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by the Key Projects of the National Natural Science Foundation of Universities in Anhui Province (Grant no. KJ2020A0112), the National Natural Science Foundation of China (NSFC) (Grant no. 61701482), High-Level Talents Research Start-Up Fund of Hefei Normal University (Grant no. 2020rcjj45), and the Teaching and Research Projects in Anhui Province (Grants nos. 2020jyxm1572 and 2020jyxm1573).