Abstract

Background and Objective. The same range of blood pressure values may reflect different vascular functions, especially in the elderly. Therefore, a single blood pressure value may not comprehensively reveal cardiovascular function. This study focused on identifying pulse wave features in the elderly that can be used to show functional differences when blood pressure values are in the same range. Methods. First, pulse data were preprocessed and pulse cycles were segmented. Second, time domain, higher-order statistics, and energy features of wavelet packet decomposition coefficients were extracted. Finally, useful pulse wave features were evaluated using a feature selection and classifier design. Results. A total of 6,075 pulse wave cycles were grouped into 3 types according to different blood pressure levels and each group was divided into 2 categories according to a history of hypertension. The classification accuracy of feature selection in the 3 groups was 97.91%, 95.24%, and 92.28%, respectively. Conclusion. Selected features could be appropriately used to analyze cardiovascular function in the elderly and can serve as the basis for research on a cardiovascular risk assessment model based on Traditional Chinese Medicine pulse diagnosis.

1. Introduction

Pulse diagnosis is very important in Traditional Chinese Medicine (TCM), and research has focused on obtaining objective evidence for the technique [1, 2]. Pulse waves include objective information used in TCM pulse diagnosis. The time domain features of pulse waves have physiological significance and reflect the duration and amplitude of percussion waves, tidal waves, and dicrotic waves. The frequency and time domain features can reflect the disease state [3], especially in cardiovascular disease [4]. Early research [5] found that time domain features reflect hypertension. Recent studies have sought to identify a scientific correlation between pulse patterns (wiry pulse, slippery pulse, and others) and blood pressure using a computational approach. For example, the association between pulse waves and hemodynamic parameters has been studied in hypertensive patients [6, 7], and research has shown that blood pressure values can be predicted from pulse waves [8]. Pulse waves and blood pressure values are closely associated [9–11]. According to TCM, most hypertensive patients have a wiry pulse [8]. A wiry pulse is also commonly found in normotensive elderly individuals, especially in those over 60 years old. Moreover, the elderly are at higher risk for hypertension. In the elderly, it is unclear how to use pulse waves to distinguish hypertensive patients taking blood pressure medication from normotensive individuals when their blood pressure values are the same. In our previous studies, a series of features, including time domain (TD), energy, and higher-order statistics (HOS) features of wavelet packet decomposition coefficients (WPDC), were used in pulse classification of health versus subhealth and atherosclerosis versus nonarteriosclerosis [12]. The results demonstrated the feasibility of these features in pulse analysis.
We therefore hypothesized that the TD, energy, and HOS features of WPDC in the pulse wave cycle, which can identify signal characteristics [13], may reveal differences in pulse waves between hypertensive and normotensive individuals within the same range of blood pressure values.

This study focused on individuals over 60 years of age to identify useful features in the pulse wave cycle that can demonstrate differences between hypertensive and normotensive individuals within the same range of blood pressure values. The remainder of this paper is organized as follows. Section 2 introduces the methods, including pulse data acquisition, preprocessing, pulse wave cycle segmentation, and feature extraction. Section 3 describes the experimental design and results. Section 4 discusses the experimental results, and Section 5 presents the summary.

2. Methods

Pulse data acquisition, preprocessing, pulse wave cycle segmentation, and feature extraction and classifier evaluation were performed. The general flow diagram is shown in Figure 1.

2.1. Data Acquisition

Data were collected from elderly volunteer subjects who presented for physical examinations at the community health service center in Pudong New District of Shanghai. The subjects were allowed to rest for 3–5 minutes before data collection and were instructed to sit, breathe quietly, relax the upper arm, extend the forearm, and flex the shoulder and elbow to about 120°, with the left wrist on a pulse pillow. Then, our specially developed TCM pulse bracelet [14] was placed over the Guan position in the left hand to capture the best pulse signals for 10 s.

Subjects were excluded from analysis if they lacked complete data for control or outcome variables or had significant heart disease.

A total of 770 subjects met the inclusion criteria and provided 10 s of pulse data for grouping of pulse wave cycles into NG, HerG, and HestG categories, according to blood pressure values. NG subjects had a baseline systolic blood pressure < 120 mmHg or diastolic blood pressure < 80 mmHg. HerG subjects had a baseline systolic blood pressure of 130 to 139 mmHg or diastolic blood pressure 80 to 89 mmHg. HestG subjects had a baseline systolic blood pressure > 140 mmHg or diastolic blood pressure > 90 mmHg. Each group was divided into 2 classes by history of hypertension (yes or no). Pulse wave cycle data were obtained for 10 s in all subjects during data preprocessing.

2.2. Data Preprocessing

Baseline wandering of the original pulse data was removed with a high-pass filter in the sampling device. A bandpass filter from 0.5 Hz to 30 Hz was used to smooth waves affected by tremor or breathing. A Shannon Energy Envelope and Hilbert Transform (SEEHT) extractor was used to locate the percussion wave and the beginning of the pulse wave cycle, as it was thought to perform better than other extractors on wide or small pulse waves and on sudden changes in wave amplitude [12, 15]. A pulse wave cycle was defined as the interval between two consecutive cycle start points.

The SEEHT extractor for the percussion wave is described in more detail in [15] (Figure 2). First, a 1–4 Hz bandpass filter is applied to exclude other peaks and emphasize the percussion wave. Second, the bandpass-filtered data are transformed by the Shannon Energy Envelope (SEE) formula. However, wrist pulse signals between 1 Hz and 4 Hz are restrained by differentiated signals, and an SEE based on differentiated signals shows abrupt changes because the other waves are amplified by differentiation. The major local maxima of the smoothed SEE indicate the approximate locations of the percussion waves. Hence, a low-pass filter is used to smooth the SEE and reduce the complexity of searching for the local maxima. Third, the Hilbert Transform is used to find the peaks. A moving average filter removes the low-frequency drift from the Hilbert-transformed signal, and the peaks are located at its zero-crossing points from positive to negative. Finally, the real peak of each percussion wave is taken as the maximum within 0.25 s in the pulse data after the 0.5–30 Hz bandpass filter.
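Two of the steps above, the Shannon energy transform with smoothing and the final peak refinement, can be sketched in Python as follows. This is a minimal illustration and not the implementation from [15]: the bandpass filters and the Hilbert Transform are omitted, a centered moving average stands in for the low-pass filter, and the 0.25 s search window is assumed to be centered on the candidate peak. All function names are hypothetical.

```python
import math

def shannon_energy(signal):
    """Pointwise Shannon energy -a^2 * ln(a^2) of the amplitude-normalized signal."""
    peak = max(abs(v) for v in signal) or 1.0  # avoid division by zero on a flat signal
    energy = []
    for v in signal:
        a2 = (v / peak) ** 2
        energy.append(-a2 * math.log(a2) if a2 > 0.0 else 0.0)
    return energy

def moving_average(values, width):
    """Centered moving average, truncated at the edges (stand-in for low-pass smoothing)."""
    half = width // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        window = values[max(0, i - half):min(n, i + half + 1)]
        smoothed.append(sum(window) / len(window))
    return smoothed

def refine_peak(signal, candidate, fs, window_s=0.25):
    """Index of the maximum of `signal` in a window of `window_s` seconds
    centered on `candidate` (assumption: the window is centered)."""
    half = int(round(window_s * fs / 2))
    lo = max(0, candidate - half)
    hi = min(len(signal), candidate + half + 1)
    return max(range(lo, hi), key=lambda i: signal[i])
```

In use, the approximate peak locations would come from the smoothed envelope, and `refine_peak` would then be applied to the 0.5–30 Hz filtered pulse data with the device sampling rate passed as `fs`.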

Although the SEEHT method showed good results for extraction of the percussion wave and the beginning of a pulse wave cycle, errors were observed in segmentation, mainly because of morphological diversity in the pulse wave cycle. To eliminate the influence of segmentation errors on the experimental results, noisy pulse wave cycles were excluded by visual inspection. For example, Figure 3 shows a low-quality pulse signal (blue line). The red asterisks are the percussion waves detected by the SEEHT method. Each red circle marks the start point of a pulse cycle and the end point of the prior pulse cycle, so one pulse cycle spans from one red circle to the next. Noise caused three segmentation errors (red boxes). Therefore, for every pulse sample, the erroneous segments were identified by visual inspection and deleted to ensure the validity of the pulse cycles used in subsequent research.

2.3. Feature Extraction

To identify differences in pulse wave cycles between elderly hypertensive and normotensive individuals, TD, energy, and HOS features of WPDC were extracted after preprocessing.

2.3.1. Time Domain Feature Extraction

A standard pulse wave is made up of 3 components: the percussion wave, tidal wave, and dicrotic wave. TD features include the duration and amplitude of the inflection points of the 3 waves, which were extracted using a previously described method named the Shape Threshold Value (STV) method (Figure 4).

The STV method, which is described in more detail on pages 32–35 of [12], contains two main steps. First, the pulse wave cycles are classified by shape into eight types (Figure 5) according to expert experience and domain knowledge. Second, the inflection points in every shape are detected using different threshold values.

Most TD features have clear physiological significance. In this study, 20 TD features (Figure 4) were chosen for analysis, including 6 duration features, 5 amplitude features, 4 width features, and 2 area features (As, Ad). The meanings of these features are given in Table 1.

2.3.2. Wavelet Packet Decomposition

The discrete wavelet transform (DWT) only decomposes low-frequency components (approximations: A). The wavelet packet method, which is an expansion of the DWT method, extends the versatility and power of the wavelet transform [16]. Wavelet packet decomposition (WPD) decomposes both the low-frequency components and the high-frequency components (details: D). In WPD, both the approximation and the detail obtained at each level are split into new approximation and detail components, and this process is then repeated. The mother wavelet function is important for the wavelet packet coefficients and for the classification accuracy of the extracted features. It was reported that the best feature set was obtained with the db6 wavelet function [17]. Therefore, this study chose the db6 wavelet function as the mother wavelet function to estimate the wavelet packet coefficients. The number of decomposition levels was set at 4. Therefore, 30 subbands (2 + 4 + 8 + 16) were obtained across the 4 levels of WPD. Figure 6 shows the fourth level of the WPD tree of pulse wave cycles.
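The structure of a 4-level wavelet packet tree can be illustrated with a short Python sketch. It is an assumption-laden illustration: the Haar wavelet is swapped in for db6 to keep the code self-contained, boundary handling is ignored, and the function names are hypothetical. It only shows how splitting every node at every level yields 2 + 4 + 8 + 16 = 30 subbands.

```python
import math

def haar_split(x):
    """Split a sequence into Haar approximation and detail coefficients."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet(x, levels):
    """Return {level: [subband, ...]}. Unlike the DWT, every node
    (approximation and detail alike) is split again at the next level."""
    tree = {}
    current = [list(x)]
    for level in range(1, levels + 1):
        next_nodes = []
        for node in current:
            approx, detail = haar_split(node)
            next_nodes.extend([approx, detail])
        tree[level] = next_nodes
        current = next_nodes
    return tree

# Illustrative input only: 16 samples so that 4 levels divide evenly.
tree = wavelet_packet([float(v) for v in range(16)], levels=4)
subband_count = sum(len(nodes) for nodes in tree.values())
```

With a real pulse cycle, a library implementation of db6 would replace `haar_split`; the tree bookkeeping stays the same.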

2.3.3. Higher-Order Statistics and Energy Entropy

Higher-order statistics (HOS) have been applied successfully to extract features for classification [13]. In signal processing, many signals, especially nonlinear ones, cannot be examined properly by second-order statistical methods. Therefore, higher-order statistical methods have been developed. While first- and second-order statistics comprise the mean and variance, higher-order statistics comprise higher-order moments and nonlinear combinations of these moments, known as cumulants [18].

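Let $\{x(k)\}$ be a real, discrete time random process. The moments of $x(k)$ are defined as the coefficients in the Taylor series expansion of the moment generating function [19]. For zero mean discrete time signals, the moments are defined as [13]

\[
\begin{aligned}
m_1^x &= E\{x(k)\} = 0, \\
m_2^x(\tau_1) &= E\{x(k)\,x(k+\tau_1)\}, \\
m_3^x(\tau_1,\tau_2) &= E\{x(k)\,x(k+\tau_1)\,x(k+\tau_2)\}, \\
m_4^x(\tau_1,\tau_2,\tau_3) &= E\{x(k)\,x(k+\tau_1)\,x(k+\tau_2)\,x(k+\tau_3)\},
\end{aligned}
\]

where $E\{\cdot\}$ is defined as the expectation operation and $x(k)$ is the random process.

The second characteristic function of $x(k)$, defined as [13]

\[
C_x(\omega) = \ln E\{e^{j\omega x}\},
\]

is called the cumulant generating function, and the coefficients in its Taylor expansion are the $n$th-order cumulants of $x(k)$, represented as $c_n^x$. The cumulants are defined as [13]

\[
\begin{aligned}
c_1^x &= m_1^x = 0, \\
c_2^x(\tau_1) &= m_2^x(\tau_1), \\
c_3^x(\tau_1,\tau_2) &= m_3^x(\tau_1,\tau_2), \\
c_4^x(\tau_1,\tau_2,\tau_3) &= m_4^x(\tau_1,\tau_2,\tau_3) - m_2^x(\tau_1)\,m_2^x(\tau_2-\tau_3) - m_2^x(\tau_2)\,m_2^x(\tau_3-\tau_1) - m_2^x(\tau_3)\,m_2^x(\tau_1-\tau_2).
\end{aligned}
\]

The second-, third-, and fourth-order cumulants are calculated for each pulse cycle taking lag 0, which means that the values of $\tau_1$, $\tau_2$, and $\tau_3$ equal zero. The zero-lag cumulants have special names: $c_2^x(0)$ is the variance and is denoted by $\sigma^2$; $c_3^x(0,0)$ and $c_4^x(0,0,0)$ are denoted by $\gamma_3$ and $\gamma_4$ and known as skewness and kurtosis, respectively.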
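For a zero mean sequence, the zero-lag cumulants reduce to simple sample averages: the second- and third-order cumulants equal the second and third moments, and the fourth-order cumulant equals the fourth moment minus three times the squared second moment. A minimal Python sketch follows, assuming biased (divide by N) sample estimates and explicit mean removal; the function name is hypothetical and this is not the MATLAB routine used in the study.

```python
def zero_lag_cumulants(x):
    """Return (c2, c3, c4) at zero lag for the sequence x.

    The mean is removed first, then biased sample moments are used:
    c2 = m2, c3 = m3, c4 = m4 - 3 * m2**2.
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    m2 = sum(v ** 2 for v in centered) / n
    m3 = sum(v ** 3 for v in centered) / n
    m4 = sum(v ** 4 for v in centered) / n
    return m2, m3, m4 - 3.0 * m2 ** 2
```

Applied to the coefficients of each of the 30 subbands, this yields three values per subband.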

In this study, the HOS methods are used to extract a new and smaller set of features from the wavelet packet decomposition coefficients. As noted, there were 30 subbands across the 4 levels. Three features were extracted for each subband using HOS: the second-, third-, and fourth-order cumulants, which were calculated using the cumulant functions in MATLAB 2013a.

In addition, Shannon entropy was used to calculate the energy of WPDC with the entropy function in MATLAB 2013a, applied to the wavelet packet decomposition coefficients of every pulse cycle. Thus, 30 energy features and 90 HOS features were obtained for analysis.
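One common convention for the Shannon entropy of a coefficient vector is the non-normalized form, the negative sum of $s_i^2 \ln(s_i^2)$ with $0 \ln 0$ taken as 0. The Python sketch below assumes that convention; the source does not state which variant was used, and the function name is hypothetical.

```python
import math

def shannon_entropy_energy(coeffs):
    """Non-normalized Shannon entropy -sum(s^2 * ln(s^2)) of a subband,
    with the term for a zero coefficient taken as 0 (assumed convention)."""
    total = 0.0
    for s in coeffs:
        s2 = s * s
        if s2 > 0.0:
            total -= s2 * math.log(s2)
    return total
```

Computing one value per subband over the 30 subbands gives the 30 energy features.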

2.4. Feature Selection

CfsSubsetEval and BestFirst were used for feature selection; these are built-in attribute evaluator and search methods in WEKA 3.8. CfsSubsetEval evaluates the worth of a subset of features by considering the individual predictive ability of each feature along with the degree of redundancy. Subsets of features that are highly correlated with the class while having low intercorrelation are preferred. The BestFirst method searches the space of attribute subsets by greedy hillclimbing augmented with a backtracking facility. Setting the number of consecutive nonimproving nodes allowed control of the level of backtracking. BestFirst may start with an empty set of attributes and search forward, with a full set of attributes and search backward, or at any point and search in both directions. The process is shown as Algorithm 1.

Input: f[n] // feature set including 20 TD, 90 HOS, and 30 energy features
Output: f_s[m] // feature set after feature selection by CfsSubsetEval; m is the count of selected features
Begin // BestFirst process beginning
  Results[n] = KNN(f[n]) // get the KNN classification result for every feature in f[n]
  f_sort[n] = Sort(f[n], Results[n]) // sort the features by classification result in descending order
  // BestFirst process ending
  f_temp = f_sort[1] // f_temp holds the features selected so far
  for M in f_sort[2:end] // M is a feature in f_sort
    // for every M, use KNN classification to judge the effectiveness of M
    [classification_resultBefore, classification_resultAfter] = KNN(f_temp, M)
    if classification_resultAfter > classification_resultBefore // M is effective when added to f_temp for classification
      if Correlation(f_temp, M) < 0.5 // M and all features in f_temp are uncorrelated
        f_temp(end+1) = M // M is appended to f_temp
      end
    end
  end
  f_s = f_temp
End

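
The loop in Algorithm 1 can be sketched in Python as follows. This is an illustration of the pseudocode as written, not of WEKA's CfsSubsetEval or BestFirst implementations. The `evaluate` callable is a hypothetical stand-in for the KNN classification result of a feature subset, and Pearson correlation is assumed for the `Correlation` test.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def greedy_select(features, evaluate, max_corr=0.5):
    """features: dict mapping feature name -> list of values.
    evaluate: callable taking a list of names and returning a score
    (hypothetical stand-in for KNN accuracy on that subset)."""
    # Rank single features by their individual score, best first.
    ranked = sorted(features, key=lambda name: evaluate([name]), reverse=True)
    selected = [ranked[0]]
    for name in ranked[1:]:
        improves = evaluate(selected + [name]) > evaluate(selected)
        uncorrelated = all(
            abs(pearson(features[name], features[s])) < max_corr for s in selected
        )
        if improves and uncorrelated:
            selected.append(name)
    return selected
```

A candidate is kept only if it both raises the score and stays below the correlation threshold against every feature already selected, which mirrors the two nested `if` tests in the pseudocode.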
2.5. Classification

k-Nearest Neighbor (k-NN) [20], a common and effective nonparametric method in pattern recognition classification, was used to evaluate the effectiveness of all features. k-NN is independent of the statistical distribution of the training examples and classifies objects by computing their distance to the training examples in the feature space. The object is assigned to the class most common among its k nearest neighbors. In this study, k = 1, so the object is simply assigned to the class of its nearest neighbor.
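The nearest-neighbor rule can be sketched in a few lines of Python. Euclidean distance is an assumption here, since the source does not state the metric, and the function name is hypothetical.

```python
from collections import Counter
from math import dist

def knn_predict(train_x, train_y, query, k=1):
    """Assign `query` to the most common class among its k nearest
    training examples (Euclidean distance assumed)."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```

With `k=1` the vote reduces to copying the label of the single closest training example, which is the setting used in this study.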

To compare the results of classification, the statistical definitions used were as follows:
(i) TP: true positive, the number of positives classified as positive;
(ii) TN: true negative, the number of negatives classified as negative;
(iii) FP: false positive, the number of negatives classified as positive;
(iv) FN: false negative, the number of positives classified as negative;
(v) ROCA: receiver operating characteristic curve area, in which the x-axis and y-axis are the False Positive Rate (FPR) and True Positive Rate (TPR), respectively.

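In this study, positive means hypertension history, and negative means nonhypertension history. Finally, accuracy (ACC), sensitivity (SE), specificity (SP), and ROCA were used as evaluation indicators. The relevant formulas are as follows:

\[
\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\mathrm{SE} = \frac{TP}{TP + FN}, \qquad
\mathrm{SP} = \frac{TN}{TN + FP}.
\]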
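The accuracy, sensitivity, and specificity indicators follow directly from the four counts defined above. A minimal Python sketch with hypothetical function names is given below; ROCA is omitted because it requires ranked scores rather than counts.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, TN, FP, FN for a binary problem with the given positive label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, tn, fp, fn

def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, and specificity from confusion counts."""
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "SE": tp / (tp + fn),
        "SP": tn / (tn + fp),
    }
```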

3. Experimental Results

After data preprocessing and noise removal, 6,075 pulse wave cycles were analyzed for the NG, HerG, and HestG groups, and the hypertension history and nonhypertension history classes. The 6 data sets are detailed in Table 2.

In every experiment, feature selection was applied to the 20 TD features, 90 HOS features, and 30 energy features with 10-fold cross-validation. The occurrence frequency of the selected features was designated as OF. For example, a feature chosen 5 times in the 10-fold selection had OF = 5/10 × 100 = 50%. All chosen features with different OFs were divided into different combinations by eliminating lower values. Finally, 1-NN was used to verify the different feature combinations. The experimental software platform for 1-NN and feature selection was WEKA 3.8. All features were extracted in MATLAB 2013a.
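The OF computation amounts to counting, for each feature, the number of folds in which it was selected. A short Python sketch is given below; the function name and the feature names in the usage note are hypothetical and do not reproduce the study's selections.

```python
from collections import Counter

def occurrence_frequency(fold_selections):
    """Map each feature to the percentage of folds in which it was selected.

    fold_selections: one iterable of selected feature names per fold.
    """
    n_folds = len(fold_selections)
    counts = Counter(name for fold in fold_selections for name in set(fold))
    return {name: 100.0 * c / n_folds for name, c in counts.items()}
```

For instance, a feature `feat_a` selected in all 10 folds gets OF = 100%, while a feature `feat_b` selected in 5 of them gets OF = 50%.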

3.1. Feature Selection Results

Using 10-fold selection, 26 features were selected in experiment 1 for NG, 9 features in experiment 2 for HerG, and 15 features in experiment 3 for HestG. The selected features and OF values are shown in Table 3 and Figure 7.

3.2. Classification Results

All chosen features with different OF values were divided into different combinations by eliminating the lower OF in every experiment. For example, 26 features were selected in NG, including 10 with 10% OF, 5 with 20% OF, 3 with 30% OF, 1 with 40% OF, 3 with 90% OF, and 4 with 100% OF. Accordingly, NG experimental results were divided into 5 groups. The first group eliminated 10 features with 10% OF and retained the remaining features. Therefore, the last group only contained the 4 features with 100% OF. The grouping in HerG and HestG was the same as in NG. Moreover, based on all feature subgroups and all selected feature subgroups, NG, HerG, and HestG experiments were divided into 7, 6, and 9 subgroups, respectively.
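The grouping by progressively eliminating the lowest OF can be sketched in Python as follows. The feature names are hypothetical placeholders; only the OF counts (10, 5, 3, 1, 3, and 4 features at 10%, 20%, 30%, 40%, 90%, and 100%) are taken from the NG example above.

```python
def nested_subsets(of_by_feature):
    """Build nested feature groups by dropping the lowest remaining OF each time.

    of_by_feature: dict mapping feature name -> OF in percent.
    Returns one sorted list of names per elimination step.
    """
    levels = sorted(set(of_by_feature.values()))
    groups = []
    for threshold in levels[:-1]:  # the highest OF level is never eliminated
        groups.append(sorted(n for n, of in of_by_feature.items() if of > threshold))
    return groups

# NG example: counts per OF level from the text; names are placeholders.
ng_counts = {10: 10, 20: 5, 30: 3, 40: 1, 90: 3, 100: 4}
ng_of = {}
for of_value, count in ng_counts.items():
    for _ in range(count):
        ng_of[f"f{len(ng_of)}"] = of_value
ng_groups = nested_subsets(ng_of)
```

Here `ng_groups` holds 5 groups of 16, 11, 8, 7, and 4 features. Adding the all-features subgroup and the all-selected-features subgroup gives the 7 NG subgroups mentioned above.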

1-NN was used in every subgroup of experiments to evaluate the effectiveness of the feature groups. The classification results for NG, HerG, and HestG are shown in Tables 4, 5, and 6, respectively. To highlight the features of optimal combinations, line charts are shown in Figures 8, 9, and 10, respectively.

4. Discussion

In the NG experiment, 7 features were selected for best performance: age, BMI, , , HOS29, HOS81, and (Figure 8); these were found at least 9 times in 10-fold selection (OF ≥ 90). Using the same rules, the best performance features in the HerG experiment were age, BMI, , , , and HOS45, and the best performance features in the HestG experiment were age, BMI, , , , HOS8, HOS29, HOS65, , and .

Age and BMI in the classification results of the 3 experiments all showed good performance, consistent with other reports. There were different trends among selected TD features between normal values in the hypertension and nonhypertension groups. For example, in the NG experiment, an increase in age was accompanied by an increase in in the group with normal blood pressure values and a history of hypertension (Figure 11(a)), but there was no consistent change in those without a history of hypertension (Figure 11(b)).

Most of the selected HOS and energy features of WPDC had low-frequency components. One feature from the first level, 2 from the second level, 1 from the third level, and 4 from the fourth level (Figure 12) were selected as best features. The selected features included 3 third cumulants, 2 fourth cumulants, and 3 energy features.

Figure 12: Each subband level after WPD contained second-, third-, and fourth-order cumulants. Red box denotes selected features in NG; blue box denotes selected features in HerG; green box denotes selected features in HestG.

In TCM theory, the pulse type changes from slippery to wiry with age. The consensus among TCM physicians is that hypertensive patients have a wiry pulse. Research has shown a correlation between the rank of a wiry pulse and different levels of hypertension. Two types of wiry pulse (healthy elderly wiry pulse and hypertensive wiry pulse) show a distinct difference. The classification accuracy showed a decreasing trend as blood pressure values increased (97.91% in NG, 95.24% in HerG, and 92.28% in HestG). Because of the normal blood pressure values in the NG group, there are essentially 2 classes of a wiry pulse: the healthy elderly wiry pulse and the hypertensive wiry pulse. However, in the HerG and HestG groups, the pulse wave in those without a hypertension history reflected the features of a hypertensive wiry pulse. Thus, the classification accuracy in the HerG group was lower than that in the NG group, and the accuracy in the HestG group was lower than that in the HerG group. The features selected in the classification achieved an accuracy of at least 92.28% in all 3 groups. Although the features of a hypertensive wiry pulse were present in 2 classes (hypertension and nonhypertension history) in the HerG and HestG groups, the selected features can also reflect cardiovascular function under conditions of sustained hypertension.

5. Conclusion

In elderly individuals, pulse wave cycle features in the same blood pressure range show significant differences according to hypertension history. Recognition rates of over 90% have been achieved in classification experiments using the selected features. This shows that not all equivalent blood pressure levels represent the same cardiovascular function. Meanwhile, the TD, HOS, and energy features of WPDC can be used to evaluate cardiovascular function according to blood pressure values.

This study shows that management of health risk requires more than blood pressure medication in elderly individuals with hypertension. Changes in pulse wave and blood pressure values should be used in an evaluation index. Future research will focus on finding more effective features for assessment of blood vessels and analysis of the relationship between pulse features and central arterial pressure.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was funded by the National Natural Science Foundation of China (no. 81574058), Shanghai Key Research Program of Shanghai Municipal Commission of Health and Family Planning (ZY3-CCCX-3-2002), Shanghai Science and Technology Innovation Project of Traditional Chinese Medicine (ZYKC201602003 and ZYKC201702002), and the National Key Research and Development Program of China (2017YFC1703300 and 2017YFC1703301).