In this paper, we investigate how to incorporate intelligence into human-centric IoT edges to detect arrhythmia, a heart condition often associated with morbidity and even mortality. We propose a classification algorithm based on an intrapatient convolutional neural network model and an interpatient attention residual network model to automatically identify the type of arrhythmia at the edge. Because the categories in the MIT-BIH arrhythmia database used by the algorithm are imbalanced, we slice and overlap the original ECG signal to even out the heartbeat sets of different types. The preprocessed data are then used to train the two proposed network models, which reach an overall accuracy of 99.03% and an F1 value of 0.87, respectively. The proposed model can be used as a real-time diagnostic tool for remote E-health systems in next-generation wireless communication networks.

1. Introduction

The World Health Organization reports that cardiovascular diseases are the leading cause of death worldwide, and arrhythmias are the most common among them [1]. Arrhythmias are caused by abnormalities in the conduction system of the heart. They can present as slow, rapid, or irregular heartbeats and can be life-threatening or non-life-threatening. Non-life-threatening arrhythmias need to be monitored over a long period so that their pathological causes can be detected early. The ECG is a physiological signal that records and reflects the condition of the heart; because it is simple and noninvasive, it is widely used in the diagnosis of arrhythmia. The early stage of cardiovascular disease is often accompanied by arrhythmia. Therefore, early diagnosis and prevention are very important for the intervention and treatment of patients.

Traditionally, the diagnosis of arrhythmias relies on a cardiologist’s ability to identify specific types of arrhythmia by analyzing the waveform of the electrical signals collected from the heart. However, because arrhythmias are complex and occur suddenly, ECG signals recorded over a short time may not accurately reflect the real cardiac activity of the patient. For ECG recordings that must be monitored over a long time, manual identification of the arrhythmia type is time-consuming and laborious, and events are easily missed. To make the reading of arrhythmias more efficient, real-time monitoring through automatic analysis can play an important auxiliary role in diagnosis. 6G wireless communication networks are expected to be ubiquitous, human-centric, full band, strongly secure, and intelligent [2, 3], offering distributed, low-latency, and reliable machine learning at the wireless network edge [4, 5].

In machine learning methods, an arrhythmia diagnosis algorithm usually includes three main steps: preprocessing, feature extraction, and classification. Feature extraction has dominated the field of arrhythmia diagnosis for decades, including feature extraction based on wavelets, morphology, and statistics [6]. The wavelet transform decomposes the signal into components of different scales [7], and the time localization of spectral components can be obtained through wavelet analysis. Some studies analyze ECG signals using features extracted by wavelets [8]. In [9], a random matrix was selected to extract the morphological features of the heartbeat, in which each column was normalized and each row was transformed by the discrete cosine transform to form the projection matrix. For the statistical characteristics of the ECG, signal analysis usually computes time-domain feature values such as energy, mean, kurtosis, skewness, maximum, and minimum [10, 11]. These features provide an effective way to analyze the complexity of the ECG time series and help distinguish the types of arrhythmia, yielding better classification performance. However, the quality of such arrhythmia diagnosis algorithms usually depends on the feature extraction stage, and the robustness of the diagnosis model is still limited by the complex feature extraction process.

In recent years, end-to-end deep learning methods have shown outstanding performance in automatic feature extraction, and the trend of using convolutional neural network models for arrhythmia diagnosis has become increasingly clear. Paper [12] proposes a 34-layer neural network model that requires no complex preprocessing or feature extraction steps. The data set it uses is 500 times the size of the open data set, and its arrhythmia classification achieves diagnostic performance similar to that of cardiologists. Although existing work has laid a solid foundation for this field, improving the robustness of arrhythmia diagnosis remains a challenge because of the long recording time of ECG signals, low signal quality, the diversity of pathological causes, and extremely scarce data sets.

One of the most effective tools for arrhythmia diagnosis is the ECG signal, and the morphological characteristics and frequency spectrum of a single heartbeat can provide meaningful clinical information for the automatic identification of the ECG. However, the shape and timing characteristics of the ECG signal differ greatly between patients and under different physical conditions, so the problem of ECG signal classification has not been fully solved. The main difficulty in using the ECG to diagnose arrhythmia is that different patients can have different ECG shapes although they suffer from the same disease, while two different diseases may show roughly the same characteristics in the ECG signal. These problems complicate the diagnosis of heart disease [13]. Most of the algorithms in the literature are evaluated under the intrapatient paradigm rather than the interpatient paradigm. Although these algorithms can obtain good accuracy under intrapatient evaluation [14, 15], individual differences exist objectively, so the results are not particularly reliable. Ensuring that the training data and the test data do not come from the same individuals is therefore most consistent with the actual application scenario. In addition, when the sample data in the arrhythmia database are scarce and the categories are unbalanced, existing arrhythmia diagnosis algorithms perform poorly on categories with relatively little data, for which both sensitivity and accuracy are very low [16]. The automatic classification of ECG signals is therefore still a difficult problem.

To address the above challenges, we use the data in the MIT-BIH arrhythmia database [17] and propose two models. The first is an intrapatient convolutional neural network model that classifies the heartbeats in the ECG records into normal beat (N), ventricular premature beat (V), right bundle branch block (R), and left bundle branch block (L). The second is an interpatient attention residual network model [18, 19] that classifies heartbeats into five types: normal (N), supraventricular ectopic (S), ventricular ectopic (V), fusion (F), and unknown beat (Q). Compared with the results in the existing literature, our method obtains relatively good results. The main contributions of this work are as follows:
(a) We propose a one-dimensional convolutional neural network model to classify heartbeats into four categories under the intrapatient paradigm
(b) We propose to combine the residual network module with the attention mechanism and conduct an interpatient ablation study on the proposed network model
(c) By slicing and overlapping the original ECG signal, we augment the data so that the amount of data of the various types can be balanced

The rest of this paper is arranged as follows: Section 2 introduces work related to the study of arrhythmia. Section 3 describes the data set used and the preprocessing of the data. Section 4 describes the two proposed network model structures. Section 5 introduces the evaluation metrics for the trained network models, and Section 6 analyzes the results. Section 7 concludes the paper and discusses future work.

2. Related Work

An arrhythmia diagnosis algorithm based on a deep neural network can be roughly divided into two steps: ECG data preprocessing and arrhythmia classification. The algorithm flow chart is shown in Figure 1.

2.1. Preprocessing

The ECG signal is usually a weak low-frequency signal collected by an electrocardiogram machine with electrodes attached to the surface of the human body. The signal frequency is usually between 0.05 and 100 Hz, and the signal is extremely susceptible to external noise. The purpose of the data preprocessing steps is to reduce this noise. Typical noise types are as follows:
(a) Baseline drift: low-frequency noise (0.15-0.3 Hz) caused by changes in electrode position due to movement artifacts or the patient’s respiration
(b) Power-line interference: mainly a noise signal at 50/60 Hz generated by the power system, with a bandwidth below 1 Hz
(c) EMG interference: high-frequency noise signals (30-300 Hz) generated by contractions of muscles other than the heart

In the classification of arrhythmia, noise of varying degrees strongly affects the diagnosis of patients and reduces its accuracy. Therefore, it is necessary to select an appropriate preprocessing method for noise removal [18].
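As a small illustration of how baseline drift can be suppressed, the sketch below subtracts a centered moving average from the signal, which acts as a crude high-pass filter. This is a deliberately simple stand-in and not the preprocessing used in this paper (Section 3.3 applies a db6 wavelet transform); the function name `remove_baseline` and the choice of window length are assumptions made for illustration.

```python
def remove_baseline(signal, window):
    """Subtract a centered moving average from the signal.

    Illustrative stand-in for baseline-drift removal, not the paper's method.
    `window` is the averaging length in samples (an assumed parameter); it
    should span well over one heartbeat so that QRS complexes are preserved.
    """
    n = len(signal)
    half = window // 2
    # Prefix sums let each window mean be computed in constant time.
    prefix = [0.0]
    for value in signal:
        prefix.append(prefix[-1] + value)
    cleaned = []
    for i in range(n):
        lo = max(0, i - half)          # window is truncated at the edges
        hi = min(n, i + half + 1)
        mean = (prefix[hi] - prefix[lo]) / (hi - lo)
        cleaned.append(signal[i] - mean)
    return cleaned
```

For MIT-BIH records, which are sampled at 360 Hz, a window of several hundred samples would be a plausible starting point, but the right value depends on the recording and should be checked against the data.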

2.2. Existing Methods Based on Deep Learning

In deep neural network methods, a classifier that can automatically extract features is needed to identify the types of arrhythmia after the ECG data preprocessing step. At present, probabilistic neural networks, fuzzy clustering neural networks, and recurrent neural networks are used to classify arrhythmia. The probabilistic neural network is a feedforward network derived from the Bayesian network and Fisher discriminant analysis. Literature [19] argues that the probabilistic neural network model is more robust and computationally effective than the traditional model. In a fuzzy clustering neural network, a fuzzy clustering layer and a neural network layer composed of a multilayer perceptron work in turn. The fuzzy layer performs the initial stage of the classification task, the neural network layer serves as the final classifier, and the fuzzy clustering thereby improves the performance of the neural network classifier [20]. In recent years, the fuzzy clustering neural network has been applied in some studies [21, 22]. In [23], a hybrid fuzzy neural network method is proposed to reduce the problems of the multilayer perceptron, improve its generalization ability, and reduce training time. The recurrent neural network is a neural network structure with closed-loop connections between neurons [24]. It can achieve highly nonlinear dynamic mapping and has been used in some ECG signal classification studies [25, 26].

3. Arrhythmia Data Preprocessing

3.1. Arrhythmia Dataset

This study uses the MIT-BIH arrhythmia data set [27]. This benchmark database was created by the Massachusetts Institute of Technology and Beth Israel Hospital in Boston, Massachusetts, USA, and was first released in 1980. It is the first data set used to evaluate the performance of arrhythmia detectors and has been widely used in well-known studies [28, 29]. Each record in the data set is independently annotated and confirmed by two or more cardiologists, and the peak or local extremum of each heartbeat is marked.

3.2. Segmentation of Intrapatient Heartbeat

The QRS wave in the ECG signal data is located, and then each single heartbeat is extracted. First, a 15-25 Hz band-pass filter is used to obtain the QRS band, and double-slope processing [30] is then applied to produce a signal composed of single-mode peaks. After this preprocessing, the QRS wave in the ECG signal is located through an adaptive threshold. Taking the QRS wave as the central reference, 100 and 150 sampling points are taken before and after it, respectively, for a rough interception, so that each extracted heartbeat is 250 sampling points long. The heartbeats are divided into normal heartbeat (74962), premature ventricular contraction (7034), right bundle branch block (7254), and left bundle branch block (8068). Since the numbers of the four heartbeat types are extremely unbalanced, only 7000 heartbeats of each type are selected for model training after segmentation and interception, and they are then randomly divided in half into a training set and a test set. Specifically, the samples corresponding to the first 14,000 indexes of the shuffled samples form the training set, and the rest are used as the test set.
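The segmentation and balanced split described above can be sketched as follows. This is a minimal illustration under stated assumptions: R-peak locations are taken as given (the band-pass filtering, double-slope processing, and adaptive thresholding are not shown), the 100 points are assumed to precede the peak and the 150 to follow it, beats too close to the record edges are skipped, and the function names and random seed are hypothetical.

```python
import random


def extract_beats(ecg, r_peaks, before=100, after=150):
    """Cut a fixed 250-sample window around each detected QRS location."""
    beats = []
    for r in r_peaks:
        start, end = r - before, r + after
        if start < 0 or end > len(ecg):
            continue  # assumption: incomplete windows at the edges are dropped
        beats.append(ecg[start:end])
    return beats


def balanced_split(beats_by_class, per_class=7000, seed=0):
    """Draw the same number of beats per class, shuffle, and split in half."""
    rng = random.Random(seed)
    samples = []
    for label, beats in beats_by_class.items():
        chosen = rng.sample(beats, per_class)
        samples.extend((beat, label) for beat in chosen)
    rng.shuffle(samples)
    half = len(samples) // 2
    # With four classes of 7000 beats, the first 14,000 indexes are training.
    return samples[:half], samples[half:]
```

Shuffling before taking the first half keeps the class proportions of the two halves roughly equal without stratifying explicitly.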

3.3. ECG Sequence Processing for Interpatient

In this paper, the four records containing paced beats are removed from the MIT-BIH arrhythmia data set. The division follows the interpatient paradigm, and the remaining ECG signal data are divided into the training set DS1 and the test set DS2. DS1 and DS2 each contain 22 records with a mixture of routine and complex arrhythmias, and each data set has about 50,000 heartbeats [6, 31].
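The record-level split can be written down explicitly. The record numbers below are an assumption: they follow the interpatient division that is commonly used with this database in the literature, since this text does not list the records assigned to DS1 and DS2. The helper `split_of` is a hypothetical name.

```python
# Records with paced beats, excluded from both sets.
PACED = {102, 104, 107, 217}

# Assumed record assignment (commonly used interpatient division).
DS1 = {101, 106, 108, 109, 112, 114, 115, 116, 118, 119, 122,
       124, 201, 203, 205, 207, 208, 209, 215, 220, 223, 230}
DS2 = {100, 103, 105, 111, 113, 117, 121, 123, 200, 202, 210,
       212, 213, 214, 219, 221, 222, 228, 231, 232, 233, 234}


def split_of(record):
    """Return 'train', 'test', or None for an excluded or unknown record."""
    if record in DS1:
        return "train"
    if record in DS2:
        return "test"
    return None
```

Because the split is made by record rather than by heartbeat, no patient contributes beats to both the training and the test set, which is the defining property of the interpatient paradigm.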

Because of the serious class imbalance in the data set used in this paper, especially in the training set, the network model is very likely to learn ineffectively or even fail to converge. To solve this problem, this paper slices the ECG signal data into segments of 5 s each and increases the amount of data by overlapping adjacent segments, which alleviates the impact of class imbalance. Taking the largest category, which is sliced without overlap, as the benchmark, the overlap length of the slices in the remaining categories is estimated by formula (1) from the length of each slice, the number of samples in the current category, and the number of samples in the largest category, with the result rounded. The training set DS1 is processed according to formula (1), after which the numbers of samples of the different types in DS1 are roughly balanced. The number of each type after slicing and overlap processing is shown in Table 1. Finally, the sliced ECG signal is subjected to a wavelet transform based on the db6 wavelet [30] and z-score standardization before resampling. The test set DS2 is not given overlap processing.
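Formula (1) itself is not reproduced in this text. One form that is consistent with the quantities it is described as using, and that gives zero overlap for the largest category, is the following; it is an assumption, not necessarily the authors' exact expression:

$$o = \mathrm{round}\left(L \left(1 - \frac{n_c}{n_{\max}}\right)\right)$$

Here $o$ is the overlap length, $L$ the slice length, $n_c$ the number of samples in the current category, and $n_{\max}$ the number in the largest category (all symbol names are introduced here for illustration). A minimal sketch of slicing with this overlap:

```python
def overlap_length(slice_len, n_current, n_max):
    """Assumed form of formula (1): more overlap for rarer categories."""
    overlap = int(round(slice_len * (1 - n_current / n_max)))
    # Keep at least one sample of stride so slicing always advances.
    return min(overlap, slice_len - 1)


def slice_with_overlap(signal, slice_len, overlap):
    """Cut fixed-length slices whose starts are slice_len - overlap apart."""
    step = slice_len - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the slice length")
    last_start = len(signal) - slice_len
    return [signal[s:s + slice_len] for s in range(0, last_start + 1, step)]
```

With this form, the stride shrinks in proportion to how rare a category is, so the number of slices obtained from a category grows by roughly the factor needed to match the largest one.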

4. Architecture of the Deep Learning Network Model

4.1. Structure of the Convolutional Network Model for Intrapatient Paradigm

This paper proposes a five-layer convolutional neural network model that classifies arrhythmia into normal beat (N), premature ventricular beat (V), right bundle branch block (R), and left bundle branch block (L) under the intrapatient paradigm. The local connections and weight sharing of convolutional neural networks reduce the number of network parameters, decrease the complexity of the model, and alleviate overfitting, and such networks have achieved great success in many fields such as computer vision. The structure of the network model is shown in Figure 2.

The proposed one-dimensional convolutional neural network model is composed of 2 convolutional layers, 2 pooling layers, and 1 fully connected layer, and each convolutional layer is followed by a ReLU activation function after the convolution operation. The first layer is a convolutional layer that performs the convolution operation on the input single heartbeat to extract local features. The size of the convolution kernel is set as , and the number of feature maps starts from 4. The size of the convolution kernel of the third layer, which is the second convolutional layer, is set to , and the number of feature maps is 8. When the convolution operation is performed in a convolutional layer, the stride of the convolution kernel is set to 1. In the second and fourth layers, average pooling is performed, which extracts the key feature information from the local features and discards redundant features. The pooling stride is set to 5 and 3, respectively. The specific parameters of the network model are shown in Table 2.
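To see how the feature length shrinks through the layers described above, the following sketch traces the sequence length from the 250-sample heartbeat to the input of the fully connected layer. The kernel sizes are not recoverable from this text, so the values 5 and 3 used in the usage note are hypothetical placeholders; unpadded convolution and a pooling window equal to the pooling stride are also assumptions.

```python
def conv_length(length, kernel, stride=1):
    """Output length of an unpadded 1-D convolution (padding is assumed absent)."""
    return (length - kernel) // stride + 1


def pool_length(length, window, stride):
    """Output length of an unpadded 1-D pooling layer."""
    return (length - window) // stride + 1


def trace_lengths(input_len, kernel1, kernel2, pool1=5, pool2=3):
    """Sequence length after conv, pool, conv, pool (window == stride assumed)."""
    lengths = [input_len]
    lengths.append(conv_length(lengths[-1], kernel1))
    lengths.append(pool_length(lengths[-1], pool1, pool1))
    lengths.append(conv_length(lengths[-1], kernel2))
    lengths.append(pool_length(lengths[-1], pool2, pool2))
    return lengths
```

Calling `trace_lengths(250, 5, 3)` with the placeholder kernels shows the kind of reduction the two pooling strides produce; multiplying the final length by the 8 feature maps of the second convolutional layer gives the size of the flattened input to the fully connected layer under these assumptions.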

4.2. Structure of the Attention Residual Network Model for Interpatient Paradigm

In this paper, the residual network module and the attention mechanism are combined to form an attention residual unit, and the attention residual network model built by stacking attention residual units is used to conduct the ablation study under the interpatient paradigm. The structure of the network model is shown in Figure 3.

This paper builds the model with the deep learning framework Keras, using TensorFlow as the backend [32]. The size of the convolution kernel in the residual network is , the number of feature maps starts from 12, L2 regularization is added to the weights of each layer, the dropout probability is set to 0.5 [33], the mini-batch size is set to 128, and the learning rate is initialized to 0.1 and subsequently changed stepwise. Because the momentum optimizer generalizes well in the ECG signal classification problem, this paper uses stochastic gradient descent with momentum to optimize the loss function. The size of the convolution kernel of the attention module introduced in the network model is . The network model is optimized by adjusting the number of convolutional layers in the residual network and the number of convolution kernels in the attention module.
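The optimizer settings can be illustrated with a framework-independent sketch of stochastic gradient descent with momentum and a stepwise learning rate. Only the initial learning rate of 0.1 comes from the text; the momentum coefficient of 0.9, the decay factor, and the decay interval are assumptions, and both function names are hypothetical.

```python
def step_lr(epoch, initial=0.1, drop=0.1, every=30):
    """Stepwise schedule: multiply the rate by `drop` every `every` epochs.

    `drop` and `every` are assumed values; the text states only that the
    rate starts at 0.1 and changes stepwise.
    """
    return initial * (drop ** (epoch // every))


def sgd_momentum_step(weights, grads, velocity, lr, momentum=0.9):
    """One update of SGD with (classical) momentum on flat lists of floats."""
    velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    weights = [w + v for w, v in zip(weights, velocity)]
    return weights, velocity
```

In Keras the same behavior would normally be obtained from the built-in SGD optimizer with a momentum argument and a learning-rate schedule, rather than written by hand.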

4.2.1. Residual Network Model

The residual network is composed of stacked residual modules and shows superior performance in computer vision and other fields. The schematic diagram of the residual module is shown in Figure 4. The process can be expressed as

$$y = \mathcal{F}(x, W) + x,$$

where $x$ is the input of the residual module, $\mathcal{F}$ is the residual function, $W$ is the weight parameter corresponding to the residual function, and $y$ is the output of the residual module.

In an arrhythmia diagnosis algorithm, a traditional neural network inevitably loses some information as it is transmitted through the layers. Gradients may also vanish or explode, which makes deep network models impossible to train. The residual block inside the residual network uses a skip connection to pass the input information directly to the output, bypassing the intermediate layers. This protects the integrity of the information and mitigates the vanishing gradient problem caused by increasing the network depth.
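The skip connection can be illustrated in a few lines. In this sketch the residual function is passed in as an ordinary callable and the input is a flat list of floats; both are simplifications made for illustration, and `residual_block` is a hypothetical name rather than part of the paper's implementation.

```python
def residual_block(x, residual_fn):
    """Return residual_fn(x) + x, the output of one residual module."""
    fx = residual_fn(x)
    if len(fx) != len(x):
        # The identity shortcut requires matching shapes.
        raise ValueError("residual function must preserve the input length")
    return [f + xi for f, xi in zip(fx, x)]
```

If the residual function outputs zeros, the block reduces to the identity mapping, which is why stacking such blocks does not degrade the signal passed through the shortcut.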

4.2.2. Attention Mechanism

The attention mechanism is a data processing method in machine learning. It can be understood as a mechanism that redistributes the originally allocated resources according to the importance of the objects attended to. The core idea is to weight the original data so that important features are emphasized and unnecessary ones are suppressed. Because of these advantages, this paper proposes to introduce a spatial attention module into the residual network.

The spatial attention module [34, 35] uses the spatial relationship between features to generate a spatial attention map. Its focus is on “where” the informative part of the feature map lies. The schematic diagram of the module is shown in Figure 5. The feature map is used as the input of the spatial attention module, and after the operations shown in Figure 5, the two-dimensional spatial attention map is obtained. The process can be summarized as

$$M_s(F) = \sigma\left(f^{k}\left(\left[\mathrm{AvgPool}(F);\ \mathrm{MaxPool}(F)\right]\right)\right), \qquad F' = M_s(F) \otimes F,$$

where $F$ is the input feature map, $M_s(F)$ is the spatial attention map, $\mathrm{AvgPool}$ means average pooling, $\mathrm{MaxPool}$ means maximum pooling, $f^{k}$ means a convolution operation with a convolution kernel of size $k$, $\sigma$ is the sigmoid activation function, $\otimes$ represents element-wise multiplication, and $F'$ is the refined output obtained after passing through the spatial attention module.
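A minimal pure-Python sketch of a spatial attention step for a one-dimensional feature map is given below: it pools across channels with average and maximum, convolves the two pooled sequences, applies a sigmoid, and rescales every channel. The representation of the feature map as a list of channels, the zero padding, the odd kernel length, and the function name are assumptions; in a trained model the kernel weights would be learned rather than supplied by hand.

```python
import math


def spatial_attention(feature_map, kernel, bias=0.0):
    """Apply spatial attention to a feature map given as channels x length.

    `kernel` has two rows (one for the average-pooled and one for the
    max-pooled sequence), each of the same odd length.
    """
    channels = len(feature_map)
    length = len(feature_map[0])
    # Pool across the channel axis at every position.
    avg = [sum(ch[t] for ch in feature_map) / channels for t in range(length)]
    mx = [max(ch[t] for ch in feature_map) for t in range(length)]
    k = len(kernel[0])
    pad = k // 2
    attention = []
    for t in range(length):
        acc = bias
        for row, pooled in zip(kernel, (avg, mx)):
            for j in range(k):
                idx = t + j - pad
                if 0 <= idx < length:  # zero padding outside the signal
                    acc += row[j] * pooled[idx]
        attention.append(1.0 / (1.0 + math.exp(-acc)))  # sigmoid
    # Rescale every channel position-wise by the attention map.
    refined = [[a * v for a, v in zip(attention, ch)] for ch in feature_map]
    return attention, refined
```

Because the attention map has one value per position and is shared by all channels, it changes where along the signal the network focuses without changing the relative weighting of the channels.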

4.2.3. Ablation Study

In this section, we conduct ablation experiments to better understand the effect of adding an attention module [36]. This paper uses the 18-layer residual network as the backbone architecture. By adding the attention module to the residual module, the network can concentrate more efficiently on the important parts of the ECG signal. Finally, we analyze and compare the results of the ablation experiments. All experiments are performed on the same machine with the same parameter settings.

5. Experiments and Result Analysis

5.1. Model Evaluation Metrics

In this paper, we follow the classification standards of the Association for the Advancement of Medical Instrumentation (AAMI) [37] and refer to other literature on the evaluation of ECG signal classification on the MIT-BIH arrhythmia database. We use accuracy (Acc), sensitivity (Se), prediction rate (P+), recall, F1 value, and the confusion matrix to evaluate the network model [35]. The final evaluation indicators are as follows:

$$\mathrm{Acc} = \frac{TP + TN}{N}, \qquad \mathrm{Se} = \frac{TP}{TP + FN}, \qquad P^{+} = \frac{TP}{TP + FP}, \qquad F1 = \frac{2 \cdot \mathrm{Se} \cdot P^{+}}{\mathrm{Se} + P^{+}},$$

where $TP$ is the number of true positive samples, $TN$ is the number of true negative samples, $FP$ is the number of false positive samples, $FN$ is the number of false negative samples, and $N$ is the total number of samples. Recall is computed in the same way as sensitivity.
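These indicators can be computed per class from a confusion matrix in a one-versus-rest manner, as in the sketch below. It uses the standard definitions of accuracy, sensitivity, positive prediction rate, and F1; the layout of the matrix (rows for true classes, columns for predicted classes) and the function name are assumptions.

```python
def per_class_metrics(confusion, index):
    """One-vs-rest metrics for class `index` of a square confusion matrix.

    Assumed layout: confusion[i][j] counts samples of true class i that
    were predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    tp = confusion[index][index]
    fn = sum(confusion[index]) - tp
    fp = sum(row[index] for row in confusion) - tp
    tn = total - tp - fn - fp
    se = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * se * ppv / (se + ppv) if se + ppv else 0.0
    return {"acc": (tp + tn) / total, "se": se, "p+": ppv, "f1": f1}
```

Returning zero when a denominator is empty is a convention chosen here so that a class that is never predicted, such as a very rare category, yields a defined score instead of a division error.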

6. Result Analysis

6.1. Intrapatient Model Performance

The network built with the end-to-end deep learning method avoids manual feature extraction and extracts features automatically for classification. Each extracted heartbeat is passed through the 5-layer convolutional neural network model. After repeated hyperparameter tuning, the overall accuracy of the four-class classification is 99.03%, and the per-class results are as follows: normal beat (N): 99.88%, premature ventricular beat (V): 97.83%, right bundle branch block (R): 99.12%, and left bundle branch block (L): 99.29%. The results are shown in Figure 6.

This model pools all the extracted heartbeats before dividing them into the training set and the test set, without considering the differences between individuals, which is somewhat inconsistent with the actual scenario. In practice, the labeled data already available come from existing patients, and the model needs to make predictions for new patients based on the patterns in these data. In that setting, individual differences make it difficult for a model trained on data from existing patients to generalize effectively to the data of new patients, and model performance deteriorates.

6.2. Interpatient Model Performance

The results of the ablation study of the two network models on the MIT-BIH arrhythmia data set are shown in Table 3. Table 3 shows that the model that introduces the attention mechanism into the residual network has higher accuracy than the plain residual network model, and the F1 value increases by 3%. In the residual network model, the sensitivities for the two categories of normal heartbeat and supraventricular ectopic heartbeat are both above 90%. Compared with the residual network model, the attention residual network model increases the prediction rate for normal heartbeats by 10%, but its sensitivity for supraventricular ectopic heartbeats decreases. The sensitivity and prediction rate for ventricular ectopic heartbeat and fusion heartbeat increase by 16%, 31%, 15%, and 19%, respectively, but neither model can classify unknown heartbeats. The results of the ablation study show that introducing the attention mechanism into the residual network greatly improves the diagnostic results for normal heartbeat, supraventricular ectopic heartbeat, and fusion heartbeat and also increases the robustness of the network model.

The confusion matrices of the two models from the ablation study are shown in Figure 7. They show that after the attention mechanism is introduced into the residual network model, the predictions of the , , and categories of arrhythmia improve greatly. There are also numerous mutual misclassifications among the , , and types, which suggests that the waveforms of these three heartbeat categories are similar.

Since the amount of original data in the Q category of the MIT-BIH data set used in this paper is very small, this category cannot be distinguished. In many studies that use this data set, it is essentially indistinguishable. The accuracy in some studies exceeds 90%, but the balance across categories is poor. For example, in [34], the accuracy reached 94.61%, but the Se of the category was only 20%, the P+ was only 0.16%, and the P+ of the category was only 0.52%. In [9], the accuracy reached 93%, but among the five categories, the and categories could not be distinguished, the category Se was only 70.8%, and the category Se and P+ were 29.5% and 38.4%, respectively.

7. Conclusion

Because the data set used in this paper is extremely unevenly distributed among categories, overlap processing is used when preprocessing the data, which increases the amount of data in each class and alleviates the overfitting of the proposed network model. In addition, compared with a single heartbeat, a sample obtained by intercepting arbitrary segments from a limited amount of data is much more complex, which frees the network model from its coupling with the QRS detection algorithm and makes the ECG signal diagnosis process simpler and more general. The attention residual network model proposed in this paper greatly improves the classification of the , , and types of arrhythmia, improves the performance of the network model, and increases its robustness. The research results presented in this paper have positive significance for improving the accuracy of arrhythmia diagnosis, but the research is limited by the small amount of data. It is necessary to increase the amount of data and to combine other neural network structures to further explore arrhythmia diagnosis algorithms. In future work, we will consider extending this study to device-to-device (D2D) and index modulation systems [38–40], which might automatically and adaptively monitor arrhythmia.

Data Availability

All data, models, or code generated or used during the study are available from the corresponding author (Ji Wang, email: [email protected]) upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This work was partly supported by the National Natural Science Foundation of China (61871645, J2024023) and by the Innovation Training Program for College Students of Guangzhou University (Provincial) under grant S202011078027.