Abstract
Arrhythmia is a cardiovascular disease that seriously affects human health, and its identification and diagnosis is an effective means of preventing most heart diseases. In this paper, a BiLSTM-Treg algorithm that integrates rhythm information is proposed to realize the automatic classification of arrhythmia. Firstly, the discrete wavelet transform is used to denoise the ECG signal, after which heartbeat segmentation is performed while preserving the timing relationship between heartbeats. Then, multiple experiments with different heartbeat segment lengths and the BiLSTM network model are conducted to select the optimal heartbeat segment length. Finally, the tree regularization method is used to optimize the BiLSTM network model to improve classification accuracy, and the interpretability of the neural network model is examined by analyzing the simulated decision tree generated by the tree regularization method. The method divides heartbeats into five categories (nonectopic (N), supraventricular ectopic (S), ventricular ectopic (V), fusion heartbeats (F), and unknown heartbeats (Q)) and is validated on the MIT-BIH arrhythmia database. The results show that the overall classification accuracy of the algorithm is 99.32%. Compared with other heartbeat classification methods, the BiLSTM-Treg network model proposed in this paper not only improves the classification accuracy and obtains higher sensitivity and positive predictive value but also has higher interpretability.
1. Introduction
With the improvement of people’s living standards, the incidence and mortality of cardiovascular diseases are increasing year by year, and patients are becoming younger [1]. Arrhythmia is a common cardiovascular disease, which may endanger people’s lives in serious cases [2]. Therefore, the accurate detection of arrhythmia is of great significance for preventing heart disease. The electrocardiogram (ECG), as a comprehensive expression of cardiac electrical activity on the body surface, contains a wealth of physiological and pathological information reflecting cardiac rhythm and electrical conduction and is one of the important bases for the diagnosis of heart disease and the evaluation of cardiac function [3]. Different types of arrhythmias can be identified and diagnosed by analyzing the ECG waveform. Traditional ECG waveform analysis is performed manually by medical personnel, who give a diagnosis based on cardiovascular disease diagnosis rules and personal experience. Due to the individual differences of patients and the complexity of diseases, there are many types of ECG waveform. In addition, some arrhythmias occur only occasionally in the daily life of patients, so the ECG data need to be recorded for a long time. The amount of collected ECG data is therefore huge, which places a heavy burden on doctors. Under these circumstances, misdiagnoses and missed detections easily occur. With the rapid development of computer technology and electronic information technology, the computer has become an indispensable tool of medical modernization, and computer-aided medical treatment has penetrated into every corner of medical service [4]. In recent years, increasing attention has been paid to computer-aided analysis algorithms for electrocardiography, particularly those that can accurately and rapidly identify and diagnose arrhythmias.
The automatic classification and diagnosis algorithm of ECG signals can save doctors’ time by helping them better judge the symptoms of arrhythmia quickly. In addition, it can provide good healthcare in areas where medical resources are scarce.
This paper presents a beat classification method based on a time-series network that integrates the inter-heartbeat rhythm information. The method combines tree regularization constraints with the BiLSTM neural network model, which improves the accuracy of heartbeat classification. The interpretability of the proposed algorithm is analyzed through the tree regularization constraints and feature analysis. The main contributions of this work are as follows:
(1) A time-series BiLSTM-Treg algorithm is designed to classify the beats. It combines the information of consecutive beats so that the deep neural network can learn more rhythm information between heartbeats.
(2) A tree regularization method for the heartbeat classification model is proposed to optimize the BiLSTM-Treg algorithm and improve the generalization ability of the neural network model.
(3) By analyzing the key nodes of the simulated decision tree in tree regularization, the points that the BiLSTM-Treg algorithm focuses on during learning are identified, and the interpretability of the model is analyzed to a certain extent.
(4) Compared with other deep learning methods, the proposed BiLSTM-Treg algorithm improves the accuracy of heartbeat classification and reduces doctors’ misdiagnosis rate to a certain extent.
2. Related Work
The diagnosis of early arrhythmia is mainly the doctor’s manual analysis of ECG waveform, which requires the doctor to have a professional medical theoretical basis and rich clinical experience. Because of the diversity of arrhythmia and the complexity of the ECG waveform, this method cannot meet the needs of patients. With the development of artificial intelligence, the classification of arrhythmia using intelligent processing technology has become a hot topic in recent years.
Automatic analysis of ECG signals first appeared in the field of ECG research in the 1950s. At first, only ECG filtering technology was relatively mature. Later, with the continuous development of technology, researchers also began to explore the automatic detection and diagnosis of arrhythmia. In the past decades, domestic and foreign ECG researchers have proposed a variety of heartbeat classification methods. Depending on whether manual feature extraction of ECG signals is needed, these methods can be divided into two categories: feature-engineering-based classification methods and deep-learning-based methods [5]. Traditional rule-based and machine-learning-based heartbeat classification methods both require manual feature extraction.
2.1. Heartbeat Classification Method Based on Feature Engineering
Feature engineering processes the original data and extracts features that serve as the input of the model in order to improve model performance. It mainly includes three aspects: feature selection, feature extraction, and feature construction. Feature extraction is the key step of ECG signal classification and recognition, and the quality of the extracted features affects the accuracy of ECG signal classification and recognition [6]. Generally, the features of ECG signals extracted by researchers mainly include morphological features [7], interphase features [8, 9], wavelet transform features [10], higher-order statistics (HOS) [9, 11], Hermite basis functions (HBF) [12], the QRS amplitude vector [13], and the QRS complex area [14]. Machine-learning algorithms are then used for classification, such as the KNN algorithm [15], the support vector machine (SVM) [7], and random forest [9]. Zhu et al. [7] extracted the ECG morphological features and used the SVM algorithm to classify the heartbeat, achieving a high classification accuracy. Yang et al. [9] extracted a variety of features, including the RR interval, wavelet coefficients, and higher-order statistics, and then used a random forest classifier based on an extreme learning machine to detect arrhythmias. Ji et al. [15] proposed a multifeature combination and stacked DWKNN algorithm to classify arrhythmias and analyzed the effects of different feature combinations on heartbeat classification.
Although methods based on feature engineering can achieve relatively high classification accuracy, the complex waveform and poor anti-interference ability of the ECG signal mean that manually extracted features are prone to human error. Moreover, manually designed features depend heavily on the prior knowledge of the researcher. Deep learning has the advantage of extracting features and performing classification automatically, which addresses a series of problems caused by manual feature extraction.
2.2. Heartbeat Classification Method Based on Deep Learning
The deep learning model has become a common model for ECG data classification [16]. Compared with feature-engineering-based ECG classification methods, deep learning methods, which use the original data rather than manually extracted features as input, can achieve better classification performance. In deep learning methods, researchers use the nonlinear transformations of the hidden layers to automatically obtain effective features and transform the original features into new feature spaces by changing the structure of the hidden layers and the way they are stacked [17], so as to make full use of the rich hidden information in the data and improve the classification accuracy.
Recently, some researchers [18, 19] have used deep neural network models for the automatic classification of ECG signals. Ji et al. [20] proposed an ECG classification system based on Faster R-CNN, in which the one-dimensional ECG signal is converted into a two-dimensional image as the input of the neural network to classify arrhythmias. Akarya et al. [21] proposed a 9-layer deep convolutional neural network (CNN) for the automatic recognition of ECG signals. The original ECG signal and the ECG signal with high-frequency noise filtered out were used to classify the heartbeat, and the accuracy rates were 94.03% and 93.47%, respectively. Khan et al. [22] used the long short-term memory network (LSTM) to automatically identify 16 different types of arrhythmias. Wu et al. [23] proposed a heartbeat classification algorithm that integrates CNN and BiLSTM deep learning models, using the CNN and the BiLSTM to extract the morphological and temporal features of the heartbeat, respectively. Li et al. [24] proposed a BiLSTM-Attention network model to distinguish different types of arrhythmias. Pandey et al. [25] fed extracted wavelet, RR interval, morphological, and higher-order statistical features to a BiLSTM to achieve automatic heartbeat classification. Yildirim et al. [26] proposed a heartbeat classification model based on the wavelet transform and a BiLSTM network, which used wavelets to decompose ECG signals into signals of different frequency scales and used these signals as the input sequence of the BiLSTM model.
Classification methods for ECG signals based on deep learning realize an “end-to-end” learning mode, eliminate the manual feature design process, save manpower, and make ECG classification simpler and more efficient. Although all the above studies made effective use of deep neural networks to classify ECG signals, the rhythm information between heartbeats has not been fully considered, the interpretability of the networks has not been analyzed, and the classification accuracy still needs to be improved.
3. Method
The BiLSTM-Treg heartbeat classification method proposed in this paper, which integrates rhythm information between heartbeats, mainly includes the following steps. Firstly, the data are preprocessed to filter out the noise in the ECG signal and to segment the ECG signal into heartbeats. Secondly, consecutive single heartbeats are combined into heartbeat segments so that the rhythm information between the heartbeats is retained. Then, the BiLSTM-Treg model is constructed and optimized. Finally, the heartbeats are classified. Section 3.1 describes the preprocessing, Section 3.2 the representation of the rhythm information, and Section 3.3 the model building and optimization.
3.1. ECG Signal Preprocessing
The preprocessing stage mainly consists of denoising and segmenting the ECG signals. Generally speaking, the collected ECG signals inevitably contain noise due to the influence of the equipment and the human body itself [27], which mainly includes baseline drift, power frequency interference, and EMG interference. It is important to remove as much noise as possible from ECG signals before classifying them. The wavelet transform is a generalization of the short-time Fourier transform (STFT) [28] and is well suited to the time-frequency analysis of ECG signals. Compared with the equally spaced time-frequency localization of the STFT, the wavelet transform provides higher frequency resolution at low frequencies and higher time resolution at high frequencies. In this paper, the discrete wavelet transform is used to denoise ECG signals, which avoids losing important physiological details and better retains the characteristics of the ECG signals. Because of the high regularity of the Daubechies wavelets, the reconstructed signal is relatively smooth, and the strength spectrum of the DB6 wavelet [29, 30] is concentrated at low frequencies. Its moderate filter length and moderate coefficient values, compared with the other wavelets, provide more smoothing and less shift in the ECG fiducials. Therefore, in order to obtain a good classification accuracy, this paper uses the DB6 wavelet from the Daubechies wavelet family to process ECG signals. In terms of implementation, we use pywt, Python’s open-source wavelet transform tool. The discrete wavelet transform formula [31] is shown in (1) and (2):

$$W_f(j,k) = \int_{-\infty}^{+\infty} f(t)\,\psi_{j,k}^{*}(t)\,dt \quad (1)$$

$$\psi_{j,k}(t) = a_0^{j/2}\,\psi\left(a_0^{j}t - k b_0\right) \quad (2)$$

where $W_f(j,k)$ is the wavelet coefficient, $\psi_{j,k}(t)$ is the discrete wavelet function at different scales and locations, $f(t)$ is the input ECG signal, $\psi(t)$ is the wavelet basis function, and $j$ is the order of the scale. The larger $j$ is, the smaller the scale is, which means the higher the frequency is and the closer it is to the details. $k$ is the offset of position, $a_0$ is the scale parameter, and $b_0$ is the position parameter.
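The denoising step can be sketched with pywt as follows. This is only a minimal illustration: the paper specifies the DB6 wavelet and pywt but not the decomposition level or the thresholding rule, so the level of 4, the soft universal threshold, and the function name `dwt_denoise` are assumptions made here, not the authors' settings.

```python
import numpy as np
import pywt


def dwt_denoise(signal, wavelet="db6", level=4):
    """Denoise a 1-D signal by soft-thresholding DWT detail coefficients.

    Assumptions (not stated in the paper): decomposition level 4 and the
    universal threshold with a MAD-based noise estimate.
    """
    signal = np.asarray(signal, dtype=float)
    # Never request more levels than the signal length allows.
    max_level = pywt.dwt_max_level(len(signal), pywt.Wavelet(wavelet).dec_len)
    coeffs = pywt.wavedec(signal, wavelet, level=min(level, max_level))
    # Estimate the noise level from the finest detail coefficients.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thr = sigma * np.sqrt(2.0 * np.log(len(signal)))
    # Keep the approximation band, shrink every detail band.
    shrunk = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    # waverec can return one extra sample for odd-length inputs.
    return pywt.waverec(shrunk, wavelet)[: len(signal)]
```

Calling `dwt_denoise(record)` on a one-dimensional array returns an array of the same length. Removing baseline drift would additionally require handling the approximation coefficients, which this sketch leaves untouched.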
Signal comparison before and after pretreatment with discrete wavelet transform is shown in Figure 1 and Figure 2.
Heartbeat segmentation divides an ECG record into units of one complete heartbeat [32]. A complete heartbeat should contain the P wave, the QRS complex, and the T wave [33], as shown in Figure 3(a). In this paper, the R-wave peak annotated in the MIT-BIH database was used as the reference point for heartbeat segmentation, and 0.25 s before and 0.4 s after the R peak were extracted, as shown in Figure 3(b). We take this 0.65 s of data as the sample of a single heartbeat. For MIT-BIH ECG data with a sampling rate of 360 Hz, we extracted 90 points before the R peak and 144 points after the R peak. Therefore, including the R peak itself, the reconstructed sample has 90 + 1 + 144 = 235 points.
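The windowing described above can be sketched as follows. The function name `segment_beats` and the decision to drop beats whose window runs past the record boundary are assumptions for illustration; the paper does not say how boundary beats are handled.

```python
import numpy as np

FS = 360                  # MIT-BIH sampling rate in Hz
PRE = round(0.25 * FS)    # 90 samples before the R peak
POST = round(0.4 * FS)    # 144 samples after the R peak


def segment_beats(ecg, r_peaks, pre=PRE, post=POST):
    """Cut one fixed-length window per annotated R peak.

    Returns (beats, kept_peaks); beats has shape (n_beats, pre + 1 + post).
    Beats too close to either end of the record are skipped (assumption).
    """
    ecg = np.asarray(ecg)
    beats, kept = [], []
    for r in r_peaks:
        if r - pre >= 0 and r + post < len(ecg):
            beats.append(ecg[r - pre : r + post + 1])
            kept.append(r)
    return np.array(beats), np.array(kept)
```

With the default arguments each row has 235 samples, and the R peak sits at index 90 (zero-based) of the row.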
3.2. Rhythm Information between Heartbeats
The rhythm information between heartbeats contained in the ECG is an important basis for doctors to diagnose heart diseases. Changes in ECG rhythm can reflect problems in different parts of the heart, which can help medical staff design more rational treatment plans. Common rhythm types are bigeminy, trigeminy, ventricular tachycardia, and atrial tachycardia.
Bigeminy. Every normal heartbeat is followed by a premature beat. And the occurrence of three or more groups in a row is called bigeminy. According to the type of premature beat, it can be divided into ventricular bigeminy and atrial bigeminy. For example, the rhythm change of N–V–N–V–N–V is ventricular bigeminy, and the rhythm change of N–S–N–S–N–S is atrial bigeminy.
Trigeminy. A premature beat occurs after every two normal heartbeats. And the occurrence of three or more groups in a row is called trigeminy. According to the type of premature beat, it can be divided into ventricular trigeminy and atrial trigeminy. For example, the rhythm change of N–N–V–N–N–V–N–N–V is ventricular trigeminy, and the rhythm change of N–N–S–N–N–S–N–N–S is atrial trigeminy. The ECG signal with ventricular trigeminy is shown in Figure 4.
Ventricular Tachycardia. Three or more consecutive ventricular premature beats are called ventricular tachycardia, such as the rhythm change V–V–V.
Atrial Tachycardia. Three or more consecutive atrial premature beats are called atrial tachycardia, such as the rhythm change S–S–S.
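These four definitions can be written down directly as pattern checks on a sequence of AAMI beat labels. The sketch below is purely illustrative and is not part of the proposed model, which learns rhythm information from the signal itself; the function names are hypothetical.

```python
def has_repeated_pattern(labels, pattern, min_repeats=3):
    """True if `pattern` occurs at least `min_repeats` times in a row."""
    labels = list(labels)
    target = list(pattern) * min_repeats
    n = len(target)
    return any(labels[i : i + n] == target for i in range(len(labels) - n + 1))


def detect_rhythms(labels):
    """Check a beat-label sequence (e.g. "NVNVNV") for the rhythms above."""
    return {
        "ventricular bigeminy": has_repeated_pattern(labels, "NV"),
        "atrial bigeminy": has_repeated_pattern(labels, "NS"),
        "ventricular trigeminy": has_repeated_pattern(labels, "NNV"),
        "atrial trigeminy": has_repeated_pattern(labels, "NNS"),
        "ventricular tachycardia": has_repeated_pattern(labels, "V"),
        "atrial tachycardia": has_repeated_pattern(labels, "S"),
    }
```

For example, `detect_rhythms("NNVNNVNNV")` flags ventricular trigeminy but not ventricular bigeminy, because the shortest bigeminy pattern spans six beats and the shortest trigeminy pattern spans nine.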
In addition, the appearance of certain types of heartbeats also reflects changes in ECG rhythm. For example, after a continuous ventricular tachycardia, a ventricular fusion heartbeat is often generated due to electrical signals from the sinus node, followed by ventricular capture. Therefore, ventricular fusion heartbeat and ventricular capture are important characteristics of ventricular tachycardia.
In this paper, this rhythmic information, which is beneficial to the classification of heartbeats, was integrated into the model. Specifically, in processing the dataset, successive single beats were grouped into segments, which preserved information about rhythm between beats. Then, the ECG data is input into the neural network model in the unit of heartbeat segment, which enables the model to make full use of the rhythm information contained in the heartbeat segment when identifying the heartbeat type and improves the classification accuracy. The length of the heartbeat segment is one of the key points of our study.
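One way to group consecutive beats into segments is sketched below. The paper does not state whether segments overlap or how leftover beats are treated, so the non-overlapping windows, the dropped remainder, and the name `make_segments` are assumptions.

```python
import numpy as np


def make_segments(beats, labels, seg_len=15):
    """Group consecutive beats into fixed-length heartbeat segments.

    beats:  (n_beats, n_samples) array in recording order
    labels: (n_beats,) array of per-beat classes
    Returns x of shape (n_segments, seg_len, n_samples) and
    y of shape (n_segments, seg_len). Trailing beats that do not fill
    a whole segment are dropped (assumption).
    """
    beats = np.asarray(beats)
    labels = np.asarray(labels)
    n_seg = len(beats) // seg_len
    used = n_seg * seg_len
    x = beats[:used].reshape(n_seg, seg_len, beats.shape[1])
    y = labels[:used].reshape(n_seg, seg_len)
    return x, y
```

The second axis of `x` is the timestep axis of a recurrent network, so each timestep receives one whole 235-point heartbeat and the beat order within the record is preserved.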
3.3. BiLSTM-Treg Algorithm
The recurrent neural network (RNN) is a kind of neural network with short-term memory ability, which is very effective in processing data with sequence characteristics. However, in deep neural networks, the gradient is unstable, because the gradient close to the input layer is calculated as the product of the gradients of the subsequent layers [34]. When the neural network has too many hidden layers or the input sequence of the RNN is too long, the gradient near the input layer vanishes or explodes, which affects the performance of the RNN to some extent. In order to solve this problem, Hochreiter et al. [35] proposed the long short-term memory network (LSTM) in 1997. By adding gating units to the RNN, the network can choose whether to retain historical information, which alleviates the vanishing and exploding gradients caused by long-term dependence in the RNN.
3.3.1. BiLSTM Neural Network Structure
Compared with the RNN, the LSTM adds three gating units: the input gate, the forget gate, and the output gate. In addition, the LSTM has two other important parts, namely, the memory unit and the hidden state. The forget gate controls whether the information in the memory unit is discarded, the input gate controls whether the information of the current input and the hidden state is added to the memory unit, and the output gate determines which information in the memory unit is output. Figure 5 shows the unit structure of the LSTM, where $f_t$, $i_t$, and $o_t$, respectively, represent the forget gate, the input gate, and the output gate at the current moment; $c_{t-1}$ and $c_t$, respectively, represent the state value of the memory unit at the previous moment and the current moment; $h_{t-1}$ and $h_t$, respectively, represent the hidden state at the previous moment and the current moment. $x_t$ represents the input at the current moment, and $\tilde{c}_t$ is the candidate value of the memory unit at the current moment. $\sigma$ and $\tanh$ represent the sigmoid activation function and the tanh activation function, respectively.
The calculation process of the LSTM can be expressed as equations (3)–(8):

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i) \quad (3)$$

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f) \quad (4)$$

$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o) \quad (5)$$

$$\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c) \quad (6)$$

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \quad (7)$$

$$h_t = o_t \odot \tanh(c_t) \quad (8)$$

Formulas (3)–(6), respectively, give the input gate $i_t$, the forget gate $f_t$, the output gate $o_t$, and the candidate value $\tilde{c}_t$ of the memory unit. They are all determined by the input data at the current moment, the hidden state at the previous moment, and the corresponding weight matrices, where $W_i$, $W_f$, $W_o$, and $W_c$ are the weight matrices of the current input $x_t$; $U_i$, $U_f$, $U_o$, and $U_c$ are the weight matrices of the hidden state $h_{t-1}$ at the previous moment; and $b_i$, $b_f$, $b_o$, and $b_c$ are the corresponding bias terms. These weight matrices and bias terms are trained by gradient descent. Formula (7) indicates that the memory unit at the current moment is obtained from the current candidate value and the previous memory state, weighted by the input gate and the forget gate, respectively. Finally, formula (8) indicates that the output at the current moment, that is, the hidden state at the current moment, is determined by the current memory unit and the output gate.
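A single LSTM step following the gate, memory-unit, and hidden-state updates described above can be sketched in NumPy. The parameter names (`W_*` for input weights, `U_*` for hidden-state weights, `b_*` for biases) and the helper `init_params` are illustrative choices, not the authors' implementation, which uses TensorFlow.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_in, n_hid, seed=0):
    """Small random weights for the four blocks i, f, o, c (illustrative)."""
    rng = np.random.default_rng(seed)
    p = {}
    for g in "ifoc":
        p[f"W_{g}"] = rng.normal(0.0, 0.1, (n_hid, n_in))
        p[f"U_{g}"] = rng.normal(0.0, 0.1, (n_hid, n_hid))
        p[f"b_{g}"] = np.zeros(n_hid)
    return p


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; returns the new hidden state and memory unit."""
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["b_i"])    # input gate
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["b_f"])    # forget gate
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["b_o"])    # output gate
    c_hat = np.tanh(p["W_c"] @ x_t + p["U_c"] @ h_prev + p["b_c"])  # candidate
    c_t = f_t * c_prev + i_t * c_hat                                # memory unit
    h_t = o_t * np.tanh(c_t)                                        # hidden state
    return h_t, c_t
```

With all parameters set to zero every gate equals 0.5 and the candidate is 0, so the memory unit is simply halved at each step, which is a quick sanity check on the update order.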
One disadvantage of the LSTM is that it processes the sequence in one direction only, so it can use its past context but not its future context. In heartbeat classification, if the information of both the preceding and the following heartbeats can be obtained while classifying the current heartbeat, the rhythm information is grasped more accurately, which improves the classification accuracy of the current heartbeat. BiLSTM solves this problem well [36]. In each BiLSTM layer, two independent LSTMs process the sequence in the two directions, respectively. The specific formulas are shown in (9)–(11):

$$\overrightarrow{h}_t = \mathrm{LSTM}(x_t, \overrightarrow{h}_{t-1}) \quad (9)$$

$$\overleftarrow{h}_t = \mathrm{LSTM}(x_t, \overleftarrow{h}_{t+1}) \quad (10)$$

$$h_t = w_t \overrightarrow{h}_t + v_t \overleftarrow{h}_t + b_t \quad (11)$$

At time $t$, the hidden layer state $h_t$ of the BiLSTM obtains the heartbeat information before time $t$ through the forward LSTM state $\overrightarrow{h}_t$ and the heartbeat information after time $t$ through the backward LSTM state $\overleftarrow{h}_t$ and then takes the weighted sum of $\overrightarrow{h}_t$ and $\overleftarrow{h}_t$, where $w_t$ and $v_t$ are the corresponding weight matrices and $b_t$ is the bias term.
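The bidirectional combination can be sketched in NumPy as below: one LSTM reads the heartbeat segment forward, a second reads it in reverse, and the two hidden states at each timestep are combined by a weighted sum. The stacked-gate parameter layout and the names `init_lstm`, `run_lstm`, and `bilstm` are assumptions for illustration only.

```python
import numpy as np


def init_lstm(n_in, n_hid, rng, scale=0.5):
    """Gate weights stacked in the order i, f, o, candidate (illustrative)."""
    return {
        "W": rng.normal(0.0, scale, (4 * n_hid, n_in)),
        "U": rng.normal(0.0, scale, (4 * n_hid, n_hid)),
        "b": np.zeros(4 * n_hid),
    }


def run_lstm(xs, p):
    """Run an LSTM over xs of shape (T, n_in); return hidden states (T, n_hid)."""
    n_hid = p["b"].shape[0] // 4
    h, c, out = np.zeros(n_hid), np.zeros(n_hid), []
    for x in xs:
        z = p["W"] @ x + p["U"] @ h + p["b"]
        i, f, o = (1.0 / (1.0 + np.exp(-z[k * n_hid:(k + 1) * n_hid])) for k in range(3))
        g = np.tanh(z[3 * n_hid:])
        c = f * c + i * g
        h = o * np.tanh(c)
        out.append(h)
    return np.array(out)


def bilstm(xs, p_fwd, p_bwd, w, v, b):
    """Weighted sum of forward and backward hidden states at every timestep."""
    h_fwd = run_lstm(xs, p_fwd)
    # Process the reversed sequence, then flip back so index t lines up.
    h_bwd = run_lstm(xs[::-1], p_bwd)[::-1]
    return h_fwd @ w.T + h_bwd @ v.T + b
```

Because of the backward pass, the output for the first beat of a segment changes when the last beat changes, which is not the case for the forward LSTM alone.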
3.3.2. BiLSTM Network Based on Tree Regularization
In machine learning, there are many strategies designed to reduce the generalization error of a model, which are collectively referred to as regularization. The form of regularization is very simple: an additional term is added to the objective function to influence the selection of its optimal point. The common regularization methods are L1 regularization and L2 regularization. The objective function is shown in equation (12):

$$\min_{W} \sum_{n=1}^{N} \mathrm{loss}\left(y_n, \hat{y}_n(x_n, W)\right) + \lambda\,\Omega(W) \quad (12)$$

where $\Omega(W)$ is the regularization term and $\lambda$ is the regularization strength.
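As a small worked example of a regularized objective, the sketch below adds an L1 or L2 penalty on the weights to a given data loss. The function name and the choice of the squared L2 norm are illustrative assumptions.

```python
import numpy as np


def regularized_loss(data_loss, weights, lam, kind="l2"):
    """Return data_loss + lam * penalty(weights) for an L1 or L2 penalty.

    weights: iterable of arrays (one per layer); lam: regularization strength.
    """
    w = np.concatenate([np.ravel(m) for m in weights])
    if kind == "l1":
        omega = np.sum(np.abs(w))   # sum of absolute values
    elif kind == "l2":
        omega = np.sum(w ** 2)      # squared Euclidean norm
    else:
        raise ValueError(f"unknown regularizer: {kind}")
    return data_loss + lam * omega
```

For the weight matrix [[1, -2], [3, 0]] with a data loss of 1.0 and a strength of 0.1, the L1 objective is 1.0 + 0.1 × 6 = 1.6 and the L2 objective is 1.0 + 0.1 × 14 = 2.4.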
Tree regularization is a regularization method proposed by Wu et al. [37], which can not only effectively improve the generalization ability of the model but also support the analysis of its interpretability. As a method for the interpretability of deep network models, it is a post hoc approach, that is, model analysis is applied after model training to make the model interpretable. The method looks for a decision tree representation of the deep network model and makes the prediction results of the network understandable to humans by improving the human simulability of the network model. The implementation of tree regularization includes the following two stages. First, the deep neural network is trained while being closely approximated by a decision tree, so that this decision tree can accurately simulate the prediction process of the network. Secondly, the complexity metric of the decision tree, the average path length (APL), is taken as the penalty term for model optimization. In this way, the neural network is encouraged to yield simple decision trees and discouraged from yielding complex ones, which makes the generated decision trees easier for humans to simulate. The decision tree generation can be expressed by (13) and (14):

$$\hat{y}_n = f(x_n, W) \quad (13)$$

$$\hat{y}'_n = \mathrm{Tree}(x_n, \hat{y}_n) \quad (14)$$

where $x_n$ is the sample feature of the training set, $\hat{y}_n$ is the prediction label of the deep model, $W$ is the weight matrix of the deep model, and $\hat{y}'_n$ is the prediction label of the decision tree. The reason why $\hat{y}_n$ is used as the input of the decision tree is to make $\hat{y}_n$ and $\hat{y}'_n$ as similar as possible, so that the decision tree simulates the deep network.
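The calculation formula of tree regularization is shown in (15):

$$\Omega(W) = \frac{1}{N}\sum_{n=1}^{N} \mathrm{PathLength}(x_n) \quad (15)$$

where $\mathrm{PathLength}(x_n)$ is the path length of sample $x_n$ in the simulated decision tree and $\Omega(W)$ is the average path length, namely, the penalty term.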
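The average path length can be sketched with scikit-learn, which is an assumption here since the paper does not name its decision tree implementation. The tree is fitted to the network's predicted labels rather than the true labels, and the path length of a sample is counted as the number of nodes it visits, including the leaf; that counting convention and the `min_samples_leaf` value are also assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def average_path_length(X, y_hat, min_samples_leaf=5, seed=0):
    """Fit a tree that mimics the network's predictions and return (tree, APL).

    X:     (n_samples, n_features) inputs
    y_hat: (n_samples,) labels predicted by the deep model
    """
    tree = DecisionTreeClassifier(min_samples_leaf=min_samples_leaf, random_state=seed)
    tree.fit(X, y_hat)
    # decision_path returns a sparse indicator of the nodes each sample visits.
    node_indicator = tree.decision_path(X)
    path_lengths = np.asarray(node_indicator.sum(axis=1)).ravel()
    return tree, float(path_lengths.mean())
```

A network whose predictions depend on a single threshold on one feature yields a two-node path for every sample, whereas more entangled decision boundaries produce deeper trees and a larger penalty.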
It can be seen from equation (15) that $\Omega(W)$ is not differentiable with respect to the network parameter $W$. Therefore, in order to use the gradient descent strategy in the network optimization process, Wu et al. [37] proposed the surrogate regularization function $\hat{\Omega}(W)$, which replaces the previous APL calculation, as shown in equations (16) and (17). A multilayer perceptron (MLP) is trained to establish the mapping between the parameter vector $W$ of the neural network model and the APL, with $W$ as the input of the MLP and the APL as its regression target:

$$\hat{\Omega}(W; \xi) \approx \Omega(W) \quad (16)$$

$$\min_{\xi} \sum_{j=1}^{J} \left(\Omega(W_j) - \hat{\Omega}(W_j; \xi)\right)^2 + \epsilon \lVert \xi \rVert_2^2 \quad (17)$$

where $\xi$ represents the weight matrix of the MLP model, $\epsilon$ represents the regularization strength, $\{W_j, \Omega(W_j)\}_{j=1}^{J}$ represents the dataset of known parameter vectors and their corresponding true APL values, and $J$ represents the total number of entries in this dataset. Therefore, after using the surrogate model, the objective function of the BiLSTM network is shown in equation (18):

$$\min_{W} \sum_{n=1}^{N} \mathrm{loss}\left(y_n, \hat{y}_n(x_n, W)\right) + \lambda\,\hat{\Omega}(W) \quad (18)$$
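Fitting the surrogate amounts to a small regression problem: given parameter vectors collected during training and the true APL measured for each, an MLP is trained to predict the APL from the parameters with an L2 penalty on its own weights. The sketch below uses scikit-learn's `MLPRegressor` purely for illustration; the hidden layer size, the use of `alpha` in the role of the regularization strength, and the function name are assumptions, and a surrogate used inside gradient-based training would need to be built in the same differentiable framework as the network.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor


def fit_apl_surrogate(weight_vectors, apl_values, eps=1e-3, seed=0):
    """Fit an MLP that maps a flattened parameter vector to its APL.

    weight_vectors: (J, n_params) array of network parameter vectors
    apl_values:     (J,) array of true average path lengths
    eps:            L2 penalty strength on the MLP weights
    """
    surrogate = MLPRegressor(
        hidden_layer_sizes=(25,), alpha=eps, max_iter=2000, random_state=seed
    )
    surrogate.fit(np.asarray(weight_vectors), np.asarray(apl_values))
    return surrogate
```

Calling `fit_apl_surrogate(Ws, apls).predict(W_new)` then gives an APL estimate for unseen parameter vectors without refitting a decision tree.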
In this paper, tree regularization is used to optimize the BiLSTM model, reduce its generalization error, and improve the classification accuracy. At the same time, the generated simulated decision tree is used to analyze and understand how the BiLSTM model carries out heartbeat classification. The BiLSTM model using tree regularization is shown in Figure 6. Specifically, $x_t$ is used to represent a single heartbeat sample. The heartbeat segment composed of consecutive single heartbeats is used as the input of the network, and the number of single heartbeats in the heartbeat segment is the timestep of the network. The model first uses BiLSTM to classify the heartbeats. Secondly, the decision tree is used to simulate the BiLSTM, and the APL is calculated. Then, the MLP model is trained to obtain the surrogate regularization function $\hat{\Omega}(W)$, which is added to the objective function of the BiLSTM model for the next round of training. Algorithm 1 describes the BiLSTM-Treg model algorithm.

4. Experiment
The processing and analysis of the ECG signal is very important to heartbeat classification. The research focus of this paper is the construction and optimization of a model that integrates rhythm information. According to the ANSI/AAMI EC57:2012 standard of the Association for the Advancement of Medical Instrumentation (AAMI), heartbeats can be divided into five categories: N (normal or bundle branch block), S (supraventricular ectopic beat), V (ventricular ectopic beat), F (fusion beat), and Q (beat not specified). On the basis of the extracted continuous heartbeat segments, this experiment constructs a time-series network that integrates rhythm information between heartbeats and divides heartbeats into the above five types.
4.1. Experimental Environment
The model proposed in this paper is trained and tested on a PC workstation with an Intel Xeon Silver 4114 CPU, 32 GB of memory, and a GeForce RTX 2080 Ti graphics card. The workstation runs Ubuntu 18.04, and the algorithm is run under the TensorFlow-GPU v2.2.0 framework.
4.2. Experimental Data
A unified and authoritative standard database is the basis of the automatic analysis of ECG signals. In the research field of ECG signals, the MIT-BIH arrhythmia database is the database most widely used by researchers [38]. The database contains 48 records, each of which is about 30 minutes long, with about 650,000 sampling points and a sampling frequency of 360 Hz. Fifteen heartbeat categories are labeled in the MIT-BIH arrhythmia database. Table 1 shows the correspondence between the two heartbeat classification schemes.
In this paper, we classified 109,454 heartbeats from the MIT-BIH arrhythmia database, including 90,595 heartbeats of type N, 2,781 of type S, 7,235 of type V, only 802 of type F, and 8,041 of type Q. 90% of the heartbeat data were randomly selected from the dataset as the training set, and the remaining 10% were used as the test set. The specific distribution of the data is shown in Table 2.
4.3. Evaluation Metrics
In order to evaluate the performance of the model for heartbeat classification, the classification results were divided into four categories: TP, FP, TN, and FN. Taking type N as an example, formulas (19)–(22), respectively, give the calculation of the type N true-positive heartbeats ($TP_N$), type N false-positive heartbeats ($FP_N$), type N true-negative heartbeats ($TN_N$), and type N false-negative heartbeats ($FN_N$). Table 3 shows the confusion matrix of the classification results.
In this paper, sensitivity, specificity, positive predictive value, and accuracy are used as indicators of classifier performance. Sensitivity (Se), also known as the recall rate, is the proportion of actually positive samples that are correctly judged to be positive. The higher the sensitivity, the greater the proportion of positive samples correctly predicted. Specificity (Sp) is the proportion of actually negative samples that are correctly judged to be negative. The positive predictive value (+p) is the proportion of samples judged to be positive that are actually positive. Accuracy (Acc) is the ratio of the sum of true positives and true negatives to the total number of samples, reflecting the consistency between the test results and the actual results. The calculation formulas of the above four evaluation metrics are shown in (23)–(26):

$$Se = \frac{TP}{TP + FN} \quad (23)$$

$$Sp = \frac{TN}{TN + FP} \quad (24)$$

$$+p = \frac{TP}{TP + FP} \quad (25)$$

$$Acc = \frac{TP + TN}{TP + TN + FP + FN} \quad (26)$$
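The four metrics can be computed per class from a multiclass confusion matrix as sketched below. The convention that rows are true classes and columns are predicted classes is an assumption, as are the function names.

```python
import numpy as np


def per_class_metrics(cm):
    """One-vs-rest Se, Sp, +p, and Acc for each class of a confusion matrix.

    cm[i, j] = number of beats of true class i predicted as class j (assumption).
    """
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp   # true class i, predicted as something else
    fp = cm.sum(axis=0) - tp   # predicted class i, actually something else
    tn = total - tp - fn - fp
    return {
        "Se": tp / (tp + fn),
        "Sp": tn / (tn + fp),
        "+p": tp / (tp + fp),
        "Acc": (tp + tn) / total,
    }


def overall_accuracy(cm):
    """Fraction of all beats assigned to their true class."""
    cm = np.asarray(cm, dtype=float)
    return np.trace(cm) / cm.sum()
```

For the two-class matrix [[8, 2], [1, 9]], the first class has Se = 0.8, Sp = 0.9, +p = 8/9, and Acc = 0.85, and the overall accuracy is also 0.85.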
5. Results and Analysis
In order to build the time-series network model most suitable for the task of heartbeat classification and to distinguish the categories of arrhythmias more accurately, we conducted the following five groups of experiments. In this section, we first compare and analyze the performance of RNN, GRU, and LSTM in heartbeat classification (Section 5.1). Secondly, the networks are changed to bidirectional ones, and the classification results of BiRNN, BiGRU, and BiLSTM are compared (Section 5.2). Thirdly, by comparing the effects of different heartbeat segment lengths on the classification performance of the BiLSTM model, the optimal heartbeat segment length is selected (Section 5.3). Then, tree regularization is used to optimize the BiLSTM model; compared with the traditional L1 and L2 regularization, adding tree regularization improves the generalization ability and the classification accuracy of BiLSTM (Section 5.4). Next, the important features of the simulated decision tree are analyzed and verified by experiments (Section 5.5). Finally, the results are compared with those of other references (Section 5.6).
5.1. Analysis of Experimental Results of Different Time-Series Networks
In order to select the optimal time-series network model, Experiment 1 used three network models, namely, RNN, GRU, and LSTM, for heartbeat classification. The experimental results show that the overall classification accuracy of the RNN model and the GRU model is 98.98% and 98.97%, respectively. The overall classification accuracy of the LSTM model is 99.09%, which is better than that of the RNN and GRU models. However, a unidirectional recurrent neural network cannot fully consider the rhythm information in heartbeat classification. Table 4 shows the classification results and performance of the three unidirectional recurrent neural networks.
5.2. Analysis of Experimental Results of Different Bidirectional Time-Series Networks
A unidirectional recurrent neural network can only learn the heartbeat information before the current moment when performing heartbeat classification. Therefore, we extend the selected LSTM network to BiLSTM so that the network can consider both the previous and the future heartbeat information. The BiRNN and BiGRU networks are used for comparison and verification. The experimental results show that the overall classification accuracy of the BiRNN model and the BiGRU model is 99.13% and 98.92%, respectively. The overall classification accuracy of the BiLSTM model is 99.18%, which is better than that of the BiRNN and BiGRU models. Table 5 shows the classification results and performance of the three bidirectional recurrent neural networks.
5.3. Select the Optimal Length of Heartbeat Segment
In order to select the optimal length of the heartbeat segment, a total of 7 experiments were conducted. The heartbeat segment lengths selected are 1, 5, 10, 15, 20, 25, and 30, and the corresponding timestep of the BiLSTM is likewise 1, 5, 10, 15, 20, 25, and 30. The experimental results show that the classification accuracy of the network gradually improves as the length of the heartbeat segment increases up to 15. However, when the length of the heartbeat segment is greater than 15, the classification performance of the network decreases rapidly. The main reason is that the rhythm information of heartbeats, such as bigeminy, trigeminy, atrial tachycardia, and ventricular tachycardia, can be shown within 15 beats. When the heartbeat segment is too long, the heartbeat information considered by the network is too redundant, which affects the network performance. Table 6 shows the classification results of the BiLSTM network with different lengths of heartbeat segments.
5.4. Analyze the Experimental Results of Different Regularization Methods
To improve the generalization ability of BiLSTM and further improve the classification accuracy, we use tree regularization to constrain the weights of the network and compare it with the traditional L1 and L2 regularization. The experimental results verify the feasibility and effectiveness of the proposed model, whose overall classification accuracy is 99.32%. The overall classification accuracy of the models using L1 and L2 regularization is 99.26% and 99.23%, respectively. Compared with Experiment 2, the overall accuracy of Experiment 4 improved by 0.14 percentage points (from 99.18% to 99.32%), and the precision of classes S, V, and F all improved, with class F showing the largest gain of 5.62%. From this analysis, we conclude that tree regularization effectively improves the classification accuracy of the network and outperforms the traditional L1 and L2 regularization. Table 7 shows the classification results of BiLSTM models under different regularization methods. Figure 7 shows the confusion matrix of heartbeat classification results based on the BiLSTMTreg model.
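The three penalty terms being compared can be sketched side by side. The L1 and L2 penalties are standard; for tree regularization, the sketch assumes the commonly used formulation in which the penalty is the average decision path length of a tree that mimics the network's predictions. Whether the model in this paper uses exactly this form is an assumption, and the nested-dictionary tree, the function names, and the numbers are all hypothetical.

```python
def l1_penalty(weights):
    """Sum of absolute weight values."""
    return sum(abs(w) for w in weights)


def l2_penalty(weights):
    """Sum of squared weight values."""
    return sum(w * w for w in weights)


def path_length(tree, sample):
    """Number of decision nodes visited before reaching a leaf."""
    depth = 0
    while "leaf" not in tree:
        branch = "left" if sample[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
        depth += 1
    return depth


def average_path_length(tree, samples):
    """Tree-complexity penalty: mean path length over the samples."""
    return sum(path_length(tree, s) for s in samples) / len(samples)


def regularized_loss(data_loss, penalty, strength):
    """Data loss plus a weighted penalty term."""
    return data_loss + strength * penalty


# Hypothetical surrogate tree with two decision nodes.
toy_tree = {
    "feature": 0, "threshold": 0.0,
    "left": {"leaf": "N"},
    "right": {"feature": 1, "threshold": 1.0,
              "left": {"leaf": "S"}, "right": {"leaf": "V"}},
}
toy_samples = [[-1.0, 0.0], [1.0, 0.0], [1.0, 2.0], [-2.0, 5.0]]
toy_weights = [0.5, -1.5, 2.0]

apl = average_path_length(toy_tree, toy_samples)
print(l1_penalty(toy_weights), l2_penalty(toy_weights), apl)
print(regularized_loss(0.3, apl, 0.1))
```

Unlike L1 and L2, the path-length penalty depends on the model's decisions rather than directly on weight magnitudes, so it favors networks whose behavior a shallow tree can approximate. In practice the path length is not differentiable, and a differentiable surrogate is typically trained to estimate it; that step is omitted here.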
5.5. Analyze the Key Nodes of the Simulated Decision Tree
The tree regularization method used in this paper searches for a decision tree representation of the model during network training. The generated decision tree simulates the decision process of the BiLSTM network model, and we call it a simulated decision tree (SDT). Because a single heartbeat contains many feature points, the full SDT is too large to display, so Figure 8 shows only the tree generated from the 10 most important feature points of the SDT. These feature points are sampling points 126, 112, 162, 121, 153, 80, 224, 93, 100, and 120. Their positions on the ECG waveform are as follows: sampling points 126, 120, 121, and 153 lie in the ST segment; sampling point 112 corresponds to the J point; sampling point 224 corresponds to the endpoint of the T wave; sampling point 162 corresponds to the beginning of the T wave; sampling point 80 corresponds to the peak of the Q wave; sampling point 93 corresponds to the peak of the R wave; and sampling point 100 corresponds to the peak of the S wave, as shown in Figure 9.
In Figure 8, we have modified the representation of the value field in each decision tree node. The value field gives, for each of the classes N, S, V, F, and Q, the percentage of that class's total heartbeats that fall into the node. Taking node 2 as an example, 0.08 in the value field means that the class S heartbeats in this node account for 0.08% of all class S heartbeats, so this node contains almost no class S heartbeats. Based on this simulated decision tree, we make the following analysis:
(1) Sampling point 126 is the root node of the simulated decision tree. According to whether the voltage value at this point is less than −0.0692 mV, the samples are divided into two parts, namely, node 2 and node 3. In node 2, the proportions of class F and class Q heartbeats are relatively large, while those of the other three classes are relatively small. Thus, 28.95% of class F heartbeats and 46.22% of class Q heartbeats are separated from the total sample according to sampling point 126. Analysis of the reason: sampling point 126 lies in the ST segment of the ECG waveform. The ST segment is the segment between the end of the QRS complex and the beginning of the T wave, representing the period between ventricular depolarization and ventricular repolarization [39]. The normal ST segment is smooth and flush with the baseline.
(2) Nodes 11 and 12 show that node 2 distinguishes class F heartbeats from class Q heartbeats according to the value of sampling point 224. Only 0.38% of class Q heartbeats are in node 11, and 0% of class S heartbeats are in node 12. Analysis of the reason: sampling point 224 is the endpoint of the T wave in the ECG waveform. The T wave is a wave with a larger amplitude and longer duration that follows the QRS complex and reflects the process of ventricular repolarization.
(3) Node 5 shows that 25.1% of class V heartbeats and 25.53% of class Q heartbeats are separated from node 3 according to the value of sampling point 112, and nodes 14 and 15 show that node 5 distinguishes class S heartbeats from class Q heartbeats according to the value of sampling point 153. Analysis of the reason: sampling point 112 is the J point in the ECG waveform, which is the junction between the end of the QRS complex and the beginning of the ST segment.
(4) As indicated by node 13, 26.00% of class V heartbeats are separated from node 4 according to the value of sampling point 162. As indicated by node 16, 14.85% of class V heartbeats and 12.59% of class F heartbeats are separated from node 6 according to the value of sampling point 80. Sampling points 162 and 80 are the T wave start point and the Q wave peak, respectively.
(5) Node 8 shows that 11.88% of N, 34.33% of S, 13.11% of V, 3.22% of F, and 19.53% of Q heartbeats are separated from the sample of node 7 according to the value of sampling point 93. After passing through nodes 9 and 10, 6.81% of class V heartbeats are sorted out at node 18, 15.78% of class Q heartbeats at node 19, and 21.62% of class S heartbeats at node 20. Analysis of the reason: sampling points 93 and 100 are the R wave and S wave in the ECG waveform, respectively, which together with the Q wave at sampling point 80 constitute the QRS complex. The QRS complex is a group of waves with complex changes and large amplitude that reflects the process of ventricular depolarization [40].
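The modified value field used in the analysis above can be illustrated with a short sketch. It splits a toy set of beats on the voltage at sampling point 126 with the threshold of −0.0692 mV reported for the root node, and then computes, for each class, the percentage of that class's beats that land on one side. The beat values, the helper names, and the use of a dictionary keyed by sampling point are hypothetical; which side of the split corresponds to node 2 is not assumed.

```python
from collections import Counter

CLASSES = ("N", "S", "V", "F", "Q")


def class_share_in_node(node_labels, all_labels):
    """Percentage of each class's total beats that fall into one tree node."""
    totals = Counter(all_labels)
    in_node = Counter(node_labels)
    return {c: 100.0 * in_node[c] / totals[c] if totals[c] else 0.0 for c in CLASSES}


def split_on_point(beats, labels, point, threshold):
    """Partition labels by whether the sample at `point` is below `threshold` (in mV)."""
    below = [y for x, y in zip(beats, labels) if x[point] < threshold]
    at_or_above = [y for x, y in zip(beats, labels) if x[point] >= threshold]
    return below, at_or_above


# Hypothetical beats: only the voltage at sampling point 126 is stored.
toy_values = [0.05, 0.02, 0.01, -0.08, -0.10, 0.03, -0.20, -0.09]
toy_labels = ["N", "N", "N", "N", "F", "F", "Q", "Q"]
toy_beats = [{126: v} for v in toy_values]

below, at_or_above = split_on_point(toy_beats, toy_labels, 126, -0.0692)
shares = class_share_in_node(below, toy_labels)
print(shares)
```

Normalizing by the class total rather than by the node size makes rare classes such as F and Q visible in the tree even though class N dominates the raw counts.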
To verify that the BiLSTMTreg algorithm focuses on and learns from these medically significant feature points, Experiment 5 used only these 10 important feature points as the features of a single heartbeat and classified the heartbeats with the BiLSTMTreg algorithm. The experimental results are shown in Table 8, and the overall classification accuracy is 98.45%. Compared with Experiment 4, Experiment 5 showed no significant decrease in any metric except the sensitivity of class S. The experimental results validate the importance of these medically significant feature points in the model.
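The feature reduction used in Experiment 5 amounts to keeping 10 samples per heartbeat. A minimal sketch follows, assuming the sampling points are zero-based indices into the beat and that each beat has at least 225 samples; both are assumptions, and the 250-sample toy beat is hypothetical.

```python
# Sampling points identified as most important in the SDT analysis.
KEY_POINTS = [126, 112, 162, 121, 153, 80, 224, 93, 100, 120]


def reduce_beat(beat, points=KEY_POINTS):
    """Keep only the samples at the selected sampling points, in the listed order."""
    return [beat[p] for p in points]


# Hypothetical beat whose value at each index equals the index, to make the selection visible.
beat = [float(i) for i in range(250)]
reduced = reduce_beat(beat)
print(reduced)
```

Each reduced beat then replaces the full waveform as the per-timestep input, so the network sees a segment of 10-dimensional vectors instead of full heartbeats.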
5.6. Comparison with Previous Studies
Table 9 compares the classification performance of this method with that of methods from the literature, all of which also use data from the MIT-BIH arrhythmia database. Table 9 shows that the method proposed in this paper has the best classification accuracy, with an overall classification accuracy of 99.32%. The classification methods in [23, 25] both use the BiLSTM model. The results show that the proposed method has clear advantages in all metrics except for the lower sensitivity of class F, and its classification accuracy is 2.03% and 0.74% higher than that of the two methods, respectively. From the perspective of heartbeat type, the method presented in this paper significantly improves the sensitivity of class S compared with the other methods. Compared with [21], the method presented in this paper improves all metrics for class Q heartbeats; in particular, the sensitivity of class Q increased by 2.25%. The proposed classification method integrates the rhythm information between heartbeats, which doctors pay attention to, into the time-series network so that the network can learn this information effectively. Moreover, the bidirectional time-series network model can more readily obtain the context information of the heartbeat segment, which explains the better performance of the proposed algorithm on the heartbeat classification problem.
6. Conclusion
In this paper, an intelligent heartbeat classification method based on the BiLSTMTreg algorithm is proposed, which integrates rhythm information between heartbeats. This method fully considers the heart rhythm information that doctors pay attention to when diagnosing heart disease and realizes the automatic classification of heartbeats. The influence of different heartbeat segment lengths on the classification results of the model is analyzed to select the best heartbeat segment length. On this basis, the BiLSTMTreg algorithm is used for heartbeat classification. Experiments were carried out on the MIT-BIH arrhythmia database, and the results show that the method can effectively distinguish five types of heartbeats, N, S, V, F, and Q, with an overall classification accuracy of 99.32%. The significance of this study is to provide patients with more accurate medical care services. The highlights of this study are as follows:
(1) The heartbeat segment containing rhythm information between heartbeats was selected as the feature of the heartbeat sample, and the BiLSTMTreg algorithm was used to automatically learn the potential rhythm information of individuals
(2) A tree regularization method is proposed to optimize the BiLSTMTreg algorithm and improve the accuracy of heartbeat classification
(3) By analyzing the key nodes of the simulated decision tree, the interpretability of the BiLSTMTreg algorithm is analyzed
(4) The experimental results show that the algorithm proposed in this paper can effectively realize the classification of arrhythmia
In future work, we will collect more class F heartbeat data for pretraining the model so as to obtain more accurate intelligent ECG diagnosis results.
Data Availability
(1) All datasets used to support the findings of this study are included within the paper. (2) All datasets used to support the findings of this study were supplied by the publicly available MIT-BIH database from the Massachusetts Institute of Technology. The URL to access this data is https://archive.physionet.org/cgibin/atm/ATM. (3) The code used to support the findings of this study has not been made available because it is part of a national project and is a trade secret.
Conflicts of Interest
The authors declare no conflicts of interest.
Acknowledgments
This work was supported in part by the National Key Research and Development Program of China (Grant no. 2017YFB1401200); Key Science and Technology Project of Xinjiang Production and Construction Corps (Grant no. 2018AB017); Key Research, Development, and Dissemination Program of Henan Province (Science and Technology for the People) (Grant no. 182207310002); National Natural Science Foundation of China (Grant no. 61872324); Program for Science & Technology Innovation Talents in Universities of Henan Province (Grant no. 21HASTIT031); Training Plan for Young Backbone Teachers of Colleges and Universities in Henan (Grant no. 2019GGJS018); Key Scientific Research Project of Colleges and Universities in Henan Province (Grant no. 20A520035); and Collaborative Innovation Major Project of Zhengzhou (Grant no. 20XTZX06013).