Atrial fibrillation (AF) is a common abnormal heart rhythm. The development of an AF detection system is therefore of great significance for identifying critical illness. In this paper, we propose an automatic recognition method named CNN-LSTM that detects AF heartbeats using deep learning. The model combines a convolutional neural network (CNN), which extracts local correlation features, with a long short-term memory network (LSTM), which captures the temporal dependencies of electrocardiogram (ECG) sequence data. The CNN-LSTM is fed preprocessed data to automatically detect AF signals. We use the MIT-BIH Atrial Fibrillation Database to verify the validity of the model. On the heartbeat data of the test set, the model achieved an overall classification accuracy of 97.21%, a sensitivity of 97.34%, and a specificity of 97.08%. The experimental results show that our model can robustly detect the onset of AF from ECG signals and achieve stable classification performance, making it a suitable candidate for the automatic classification of AF.

1. Introduction

Heart disease is the leading cause of human death, and deaths due to cardiovascular diseases account for a large proportion of total deaths worldwide [1]. Most cardiovascular diseases are accompanied by arrhythmia, and atrial fibrillation (AF) is the most common persistent arrhythmia. In our country, more than 10 million people suffer from AF. Its incidence increases with age, but in recent years it has also shown an increasing trend in younger age groups [2–4]. In addition, a series of complications related to AF, such as stroke and heart failure, lead to high morbidity and mortality [5–7]. AF is also highly unpredictable, and capturing AF signals in real time is difficult [8, 9]. Electrocardiogram (ECG) detection technology forms an important basis for AF diagnosis [10–12], so applying automatic detection technology to ECG-based AF diagnosis is necessary. Machine learning has contributed significantly to the development of real-time AF monitoring, and timely intervention following effective detection of AF can avoid the serious consequences caused by an exacerbation of the disease [13].

With developments in information technology and artificial intelligence, the automatic classification of AF has made great progress. The traditional ECG classification algorithm is composed of two parts: feature extraction and classification. First, methods such as principal component analysis and latent Dirichlet allocation are used for feature extraction; the extracted features are then passed to a classifier such as a support vector machine or a random forest. A complete ECG signal is shown in Figure 1. AF is mainly characterized by the disappearance of the P wave or by irregular RR intervals. The RR interval is the time between two consecutive R waves, and feature recognition methods based on the RR interval show high accuracy. Because the R wave has the largest peak in the ECG signal, locating it is easy. However, the low amplitudes of the P and T waves make them challenging to detect, and feature extraction algorithms for them are still not mature enough. Traditional models also have some defects that are difficult to overcome. First, traditional algorithms require hand-designed feature extraction methods to extract useful information, which is then combined with machine learning algorithms for classification. This process may lose some information, and when the extracted features cannot fully reflect the data, the classification results may show larger errors. Second, feature design relies on considerable prior expert knowledge and sufficient biomedical signal processing capability. On this basis, a good classifier must also be designed, and achieving optimal results is difficult.
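As a minimal illustration of the RR-interval feature described above, the following Python sketch converts R-peak sample indices into RR intervals and a simple irregularity score. The function names, the example peak positions, the 250 Hz sampling rate used in the example, and the choice of the coefficient of variation as the score are assumptions made for illustration; they are not part of the method evaluated in this paper.

```python
from statistics import mean, pstdev

def rr_intervals(r_peaks, fs):
    """Return RR intervals in seconds from R-peak sample indices."""
    return [(b - a) / fs for a, b in zip(r_peaks, r_peaks[1:])]

def rr_irregularity(rr):
    """Coefficient of variation of the RR intervals (illustrative score only)."""
    return pstdev(rr) / mean(rr)

# Hypothetical R-peak positions (in samples) at an assumed 250 Hz sampling rate.
regular = [0, 200, 400, 600, 800]
irregular = [0, 150, 420, 560, 900]

rr_regular = rr_intervals(regular, fs=250)
rr_irregular = rr_intervals(irregular, fs=250)
```

A perfectly regular rhythm gives a score of zero, whereas the irregular example gives a clearly positive score; any decision threshold on such a score would have to be chosen on real data.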

Unlike traditional machine learning algorithms, deep learning-based methods can mine complex relations and useful features from data, and they have been widely researched and applied in the automatic classification of AF. Xu et al. proposed a framework that combined an improved frequency slice wavelet transform and a convolutional neural network (CNN) for automatic AF beat recognition and achieved good performance [14]. Wei et al. constructed a synchronous feature of each heartbeat of an ECG signal through a recursive complex network and subsequently used CNN to detect AF by analyzing the eigenvalues of the recursive complex network [15]. Andersen et al. proposed an end-to-end model combining CNN and recurrent neural networks to classify ECG signals as AF or a normal sinus rhythm [11]. Pourbabaee et al. developed a deep learning machine to screen and identify patients with paroxysmal AF [16]. Dang et al. proposed a model that uses a CNN-BLSTM network to diagnose arrhythmia and uses ECG signals to automatically detect AF, achieving relatively good results [17]. Several researchers have shown that combining deep learning features with a classifier significantly improves system performance and classification results. Although the above research can effectively address the classification of AF, and various neural networks can extract complex nonlinear characteristics from the original data without human intervention, learning ECG signal features with the high accuracy required for monitoring is still a difficult task. CNN and long short-term memory networks (LSTM) are very efficient for feature extraction from ECG signals, and this advantage is applied herein to the AF detection algorithm.

In this study, we propose a new diagnosis method for AF named CNN-LSTM, which can automatically detect AF from ECG signals. The contributions of this study are as follows:
(i) We propose an automatic recognition method named CNN-LSTM that uses heartbeat features as input datasets to automatically identify AF in an ECG signal.
(ii) CNN has advantages in image processing, while LSTM can compensate for the shortcomings of CNN in modeling sequence context. Therefore, the combination of CNN and LSTM can effectively improve accuracy in the field of AF recognition.
(iii) The use of multiscale signals representing the AF characteristics as the input of the network reduces computing resources. The design can be used to extract multiscale features and improve the generalization ability of the network model, and this study provides a high-precision classification method to meet the real-time monitoring needs of AF.

In the following sections, we provide a detailed experimental process and verify the performance of the method in the open access database. This automated method can analyze a large amount of data in a short time while ensuring high accuracy; thus, it may become a practical tool for providing real-time monitoring for patients and reducing the work pressure on doctors.

Section 1 of this article deals with the background of current research on AF and related research algorithms that have been implemented. Section 2 covers the data source and related network structure required for the experiment. Section 3 presents the experimental details, results, and analysis of the results. Section 4 presents the concluding statements and prospects for future work.

2. Material and Methods

2.1. Description of Dataset

Our experimental research is based on data from the MIT-BIH Atrial Fibrillation Database, which is publicly available from PhysioNet [18, 19]. This database includes 25 long-term ECG Holter records from different subjects (mainly with paroxysmal AF). Each record contains two ECG signal channels with AF annotations. The sampling rate of this database is 250 Hz, and the records also include beat annotations manually marked by expert clinicians.

We preprocess the ECG signals to train and evaluate the automatic AF prediction method based on the CNN-LSTM model. Because the duration of the AF recordings in these data differs from that of the normal recordings, all ECG signals are divided into segments of the same duration, and the classes are balanced to facilitate model training. After segmentation, 960,000 short-term ECG segments were obtained, comprising 480,000 segments of AF records and 480,000 segments of normal records. Figure 2 shows a comparison of the AF signal and the normal signal. The ECG signal segments are divided into a training set and a test set. To better assess the classification performance of the model, the dataset is processed and split at random.
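The preprocessing steps above (equal-length segmentation, class balancing, and a random split) can be sketched as follows. This is only an illustrative sketch: the window length, the test fraction, the use of random undersampling for balancing, and all function names are assumptions, since these details are not specified in the text.

```python
import random

def segment(signal, length):
    """Cut a 1-D signal into non-overlapping windows of `length` samples."""
    n = len(signal) // length
    return [signal[i * length:(i + 1) * length] for i in range(n)]

def balance(af_segments, normal_segments, rng):
    """Randomly undersample the larger class so both classes have equal size."""
    k = min(len(af_segments), len(normal_segments))
    return rng.sample(af_segments, k), rng.sample(normal_segments, k)

def train_test_split(segments, labels, test_fraction, rng):
    """Shuffle (segment, label) pairs and split them into train and test sets."""
    pairs = list(zip(segments, labels))
    rng.shuffle(pairs)
    n_test = int(len(pairs) * test_fraction)
    return pairs[n_test:], pairs[:n_test]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
af = segment(list(range(1000)), 100)       # 10 hypothetical AF windows
normal = segment(list(range(2550)), 100)   # 25 hypothetical normal windows
af_bal, normal_bal = balance(af, normal, rng)
train, test = train_test_split(
    af_bal + normal_bal,
    [1] * len(af_bal) + [0] * len(normal_bal),  # 1 = AF, 0 = normal
    test_fraction=0.2,
    rng=rng,
)
```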

2.2. Networks

This section first reviews the CNN and LSTM network models, which are closely related to the model structure proposed herein. Then, our proposed research model is put forward, and the structure, parameters, and mathematical expressions of the model are described in detail.

The network structure proposed in this paper mainly comprises two convolutional layers, one LSTM layer, fully connected layers, and other computing operations.

2.2.1. Convolutional Neural Network

CNN is a feedforward neural network that mainly includes an input layer, a convolutional layer, a pooling layer, and an output layer. Its special network structure gives it great advantages in feature extraction and learning, and it has been especially successful in the field of image recognition. Its structure is shown in Figure 3.

The CNN is connected to the input layer through a convolution kernel. The convolution kernel computes dot products over a sliding window to achieve multiscale feature extraction. At the same time, the weight-sharing mechanism of the convolutional layer makes feature extraction more effective and greatly reduces the number of free parameters that need to be learned. We then add a pooling layer after the convolutional layer to reduce the size of the feature matrix and the network complexity. Because the input ECG signals are one-dimensional time series, we use one-dimensional convolution in the convolutional layer, as shown in Figure 4.

Before training, we normalized the data. The convolutional layer extracts features from the original input. The output of the a-th neuron of the one-dimensional convolutional layer is shown in the following equation:

In this equation, the input is the one-dimensional ECG sequence, W denotes a matrix of weight coefficients, b is an offset coefficient, and n is the number of convolution kernels. The result of the convolution is passed through an activation function δ (in this case ReLU), and the output of the convolutional layer is then passed to the pooling layer.
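To make the convolution, activation, and pooling steps concrete, the following pure-Python sketch applies a single kernel to a toy signal. It follows the common form in which each output is an activation applied to a weighted sum of a window of the input plus a bias; the kernel values, the pooling size, and the function names are hypothetical and are not the parameters of the network in this paper.

```python
def conv1d(x, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation form, as used in CNN layers)."""
    k = len(kernel)
    return [sum(w * x[i + j] for j, w in enumerate(kernel)) + bias
            for i in range(len(x) - k + 1)]

def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

x = [0.0, 1.0, 3.0, 2.0, 0.0, -1.0, 1.0]   # toy normalized signal
kernel = [1.0, 0.0, -1.0]                  # hypothetical learned weights
conv_out = conv1d(x, kernel)
features = max_pool1d(relu(conv_out), size=2)
```

Because the same kernel weights are reused at every window position, the number of learned parameters depends on the kernel size rather than on the signal length, which is the weight-sharing property mentioned above.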

2.2.2. Long Short-Term Memory Network

LSTM is a special recurrent neural network (RNN). It is suitable for applications such as natural language processing [20, 21] and biomedical signal processing [22, 23]. LSTM improves on the standard RNN model by adding a gate mechanism, which mitigates the vanishing gradient, exploding gradient, and long-term dependence problems of traditional RNNs. The hidden layer of LSTM comprises an input gate, a forget gate, and an output gate. The structure is shown in Figure 5.

The input of the LSTM hidden layer includes not only the current element of the input sequence but also the state of the hidden layer at the previous time step; from these, the output and the cell state of the current step are calculated. The key part of LSTM is deciding which information to discard from the cell state at the previous moment and how much of it is transferred to the current state. This decision is made by the forget gate. The next step is to decide how much new information is added to the current state. Finally, the output value is determined from the output gate and the current cell state. In the LSTM update equations, the cell state holds the state information of the memory unit, the candidate state holds the information accumulated at the current moment, W is the weight coefficient matrix, b is the bias term, σ is the sigmoid activation function, and tanh is the hyperbolic tangent activation function.
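For reference, the gate mechanism described above is commonly written in the following standard form. The symbol names are assumptions chosen to match the description (x_t for the current input, h_{t-1} for the previous hidden state, c_{t-1} and c_t for the previous and current cell states, \tilde{c}_t for the candidate state, and f_t, i_t, o_t for the forget, input, and output gates), and the exact notation used for the model in this paper may differ.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right),\\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right),\\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right),\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t,\\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right),\\
h_t &= o_t \odot \tanh(c_t).
\end{aligned}
```

Here \odot denotes element-wise multiplication and [h_{t-1}, x_t] denotes the concatenation of the previous hidden state with the current input.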

2.2.3. Proposed Architecture

Neural networks each have their own feature learning method. The CNN model converts the original input into a fixed-length vector representation through convolution kernels, sliding windows, and pooling, capturing local features in the input; however, dependency relations within the original data are difficult for it to learn. LSTM can better capture the content of the input through its memory unit, which compensates for this shortcoming of the CNN. Therefore, a CNN-LSTM deep learning model is proposed in this paper to achieve the automatic classification of AF [24, 25].

Figure 6 shows the proposed CNN-LSTM network architecture. After the ECG signal is input, the convolutional and pooling layers of the CNN first extract local features, which are then passed to the hidden layer of the LSTM to obtain an optimal feature representation. Finally, the softmax function in the fully connected layer assigns the input to the corresponding category. The CNN also includes processing such as normalization, which can help avoid overfitting and speed up training.
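The layer order described above (two convolution and pooling stages, one LSTM layer, and a softmax output) can be illustrated by tracing how the data shape changes from stage to stage. The sketch below is not an implementation of the network; the input length, filter counts, kernel sizes, pooling sizes, and number of LSTM units are hypothetical placeholders, and "valid" convolution with non-overlapping pooling is assumed.

```python
def conv_output_length(n, kernel_size, stride=1):
    """Length after a 'valid' 1-D convolution."""
    return (n - kernel_size) // stride + 1

def pool_output_length(n, pool_size):
    """Length after non-overlapping pooling."""
    return n // pool_size

def cnn_lstm_shapes(input_length, conv_layers, lstm_units, n_classes=2):
    """Trace (time steps, channels) through conv+pool stages, LSTM, and softmax."""
    shapes = [("input", (input_length, 1))]
    n = input_length
    for idx, (filters, kernel_size, pool_size) in enumerate(conv_layers, start=1):
        n = conv_output_length(n, kernel_size)
        shapes.append((f"conv{idx}", (n, filters)))
        n = pool_output_length(n, pool_size)
        shapes.append((f"pool{idx}", (n, filters)))
    shapes.append(("lstm", (lstm_units,)))     # last hidden state only
    shapes.append(("softmax", (n_classes,)))   # AF vs. normal
    return shapes

# Hypothetical hyperparameters: (filters, kernel size, pool size) per stage.
shapes = cnn_lstm_shapes(250, [(32, 5, 2), (64, 5, 2)], lstm_units=64)
```

With these placeholder values, the LSTM receives a sequence of 59 time steps with 64 features each, which shows how the convolutional front end shortens the sequence that the recurrent layer has to process.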

3. Experimental Results

3.1. Performance Evaluation

To estimate the performance of heartbeat classification, the model is usually evaluated in terms of accuracy, specificity, and sensitivity [26–28]. They are defined as follows:

Accuracy = (TP + TN) / (TP + TN + FP + FN),
Sensitivity = TP / (TP + FN),
Specificity = TN / (TN + FP),

where TP denotes the number of correctly classified AF signals; FP denotes the number of normal (N) signals incorrectly classified as AF; TN is the number of correctly classified N signals; and FN is the number of AF signals incorrectly classified as N. The results of the experimental study are presented in the next section.
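As a small worked example of these three measures, the sketch below computes them from the four confusion counts using the standard definitions. The function name and the counts are hypothetical and are not the results reported in this paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # fraction of AF segments detected
    specificity = tn / (tn + fp)   # fraction of normal segments recognized
    return accuracy, sensitivity, specificity

# Hypothetical counts for illustration only.
acc, se, sp = classification_metrics(tp=90, fp=5, tn=95, fn=10)
```

With these counts, 185 of 200 segments are classified correctly, 90 of 100 AF segments are detected, and 95 of 100 normal segments are recognized.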

3.2. Implementation Details and Results

The experiment uses the TensorFlow neural network framework. Before the experiment, the data labels were converted into the corresponding one-hot vectors. Training and testing are based on an equal number of AF records and normal records, and the dataset is processed at random. During the experiment, the parameters were tuned: the batch size was set to 128, the learning rate was 0.01, the model was trained for multiple iterations, and the Adam optimizer was used to update the weights to obtain the best classification results. Table 1 lists the relevant parameters of the experimental network.
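To illustrate how the Adam update rule behaves with a learning rate of 0.01, the following sketch applies it to a single scalar parameter on a toy quadratic objective. It does not reproduce the training procedure of the experiment: the objective, the values of beta1, beta2, and eps (the commonly used defaults), and the function name are assumptions.

```python
import math

def adam_minimize(grad, w, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=1):
    """Run `steps` Adam updates on a scalar parameter `w`."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

def toy_gradient(w):
    """Gradient of the toy objective f(w) = (w - 3)^2; not the network loss."""
    return 2.0 * (w - 3.0)

w_one_step = adam_minimize(toy_gradient, w=0.0, steps=1)
w_final = adam_minimize(toy_gradient, w=0.0, steps=2000)
```

The first update moves the parameter by approximately the learning rate regardless of the gradient magnitude, which is one reason the learning rate is a key setting when using Adam.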

Figure 7 shows the loss and accuracy curves of the CNN-LSTM model. It indicates performance change in the training set as the number of iterations increases. The network continues to converge, and the model does not appear to be overfitting.

Figure 8 shows the receiver operating characteristic (ROC) curve of the test set. The abscissa of the curve is the false positive rate, and the ordinate is the true positive rate. AUC denotes the area under the ROC curve; the closer the AUC value is to 1, the better the performance of the model. The AUC achieved by the model was 0.97.
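The quantities plotted on the ROC curve and the AUC can be computed directly from predicted scores, as in the sketch below. It uses the rank-based interpretation of the AUC (the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one); the labels, scores, threshold, and function names are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def roc_point(labels, scores, threshold):
    """(false positive rate, true positive rate) at one decision threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return fp / labels.count(0), tp / labels.count(1)

labels = [0, 0, 1, 1]              # 1 = AF, 0 = normal (toy data)
scores = [0.1, 0.4, 0.35, 0.8]     # hypothetical predicted AF probabilities
auc = roc_auc(labels, scores)
point = roc_point(labels, scores, threshold=0.5)
```

Sweeping the threshold over all score values and plotting the resulting points traces the ROC curve.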

Figure 9 shows the confusion matrix of the test set, which is used to measure the accuracy of a classifier. The upper left corner is TP, the upper right corner is FP, the lower left corner is FN, and the lower right corner is TN. The counts in the confusion matrix can be converted to ratios between 0 and 1 to facilitate standardized measurement.
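The following sketch builds a two-class confusion matrix with TP in the upper left corner and TN in the lower right corner, and converts the counts to ratios. Normalizing each column by the number of examples in the corresponding actual class is an assumption (the normalization scheme is not specified in the text), and the labels and function names are hypothetical.

```python
def confusion_matrix(actual, predicted):
    """Return [[TP, FP], [FN, TN]] with AF encoded as 1 and normal as 0."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return [[tp, fp], [fn, tn]]

def normalize_by_actual_class(matrix):
    """Divide each column by its total so every column sums to 1."""
    col_totals = [matrix[0][j] + matrix[1][j] for j in range(2)]
    return [[matrix[i][j] / col_totals[j] for j in range(2)] for i in range(2)]

actual = [1, 1, 1, 1, 0, 0, 0, 0]      # toy ground-truth labels
predicted = [1, 1, 1, 0, 0, 0, 0, 1]   # toy model predictions
cm = confusion_matrix(actual, predicted)
ratios = normalize_by_actual_class(cm)
```

With this normalization, the upper left entry equals the sensitivity and the lower right entry equals the specificity.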

As mentioned above, the CNN-LSTM model achieved an overall classification accuracy of 97.28% on the test set, with a sensitivity of 97.51% and a specificity of 97.06%.

In this study, CNN and LSTM are combined in a single deep learning model to extract the characteristics of the ECG, so that the model can extract features automatically and achieve high accuracy.

Table 2 lists a series of studies based on ECG signals from the MIT-BIH AF database, compared on three evaluation indices: accuracy, sensitivity, and specificity. The comparison shows that our proposed CNN-LSTM network model improves on the model input signal and the network structure relative to other deep learning methods and achieves good results.

4. Conclusion

In this study, we investigated ECG classification algorithms and constructed a network combining CNN and LSTM. This network can extract the characteristics of ECG signals and classify them. Evaluated on the MIT-BIH AF database, our proposed CNN-LSTM network achieved a high classification accuracy in comparison with traditional ECG classification methods. The experimental results confirm that the proposed CNN-LSTM network is effective for the automatic detection and classification of AF. In addition, this method occupies fewer computing resources and can theoretically achieve real-time performance, thereby contributing to the development of wearable ECG detection devices. Our future research may involve a model that classifies AF from inputs of nonfixed scale to further optimize the neural network.

Data Availability

The data used to support the findings of this study have not been made available because the data also form part of an ongoing study. The original data of the study can be obtained at https://physionet.org/.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This study was supported by the Shandong University Undergraduate Teaching Reform Research Project (approval number: M2018X078) and the Shandong Province Graduate Education Quality Improvement Program 2018 (approval number: SDYAL18088). This study was also partially supported by the Major Science and Technology Innovation Projects of Shandong Province (grant no. 2019JZZY010731).