Abstract

Among cardiac rhythm disorders, atrial fibrillation (AF) is one of the most dangerous. ECG signals therefore play a crucial role in preventing cardiovascular disease by promptly detecting atrial fibrillation in a patient. Unfortunately, reliable automatic AF detection in clinical settings remains difficult. Deep learning has become a potent tool for complex data analysis since it requires little pre- and postprocessing. As a result, several machine learning and deep learning approaches have recently been applied to ECG data to diagnose AF automatically. This study analyses electrocardiogram (ECG) data from the PhysioNet/Computing in Cardiology (CinC) Challenge 2017 to differentiate between atrial fibrillation (AF) and three other classes: normal, other rhythms, and too noisy for assessment. The ECG data, including the AF rhythm, were classified using a novel model based on a combination of traditional machine learning techniques and deep neural networks. To categorize AF rhythms from ECG data, this hybrid model combined a convolutional neural network (Residual Network (ResNet)) with a Bidirectional Long Short-Term Memory (BLSTM) network and a Radial Basis Function (RBF) neural network. Both the F1 score and the accuracy of the final hybrid model are relatively high, at 0.80 and 0.85, respectively.

1. Introduction

Atrial fibrillation (AF) is a common arrhythmia that has been linked to serious heart-related conditions such as stroke and heart failure [1, 2]. It increases the risk of heart failure and, as a result, contributes significantly to morbidity and mortality [3, 4]. Furthermore, AF affects many people worldwide, and the risk increases with age [5]. The capacity of Artificial Intelligence (AI) and machine learning techniques to enhance the early detection of cardiovascular disease from simple ECG tests remains largely unexplored. The 2017 PhysioNet/Computing in Cardiology challenge encouraged researchers to propose solutions for automated AF detection from brief single-lead ECG data [1]. The challenge is framed as a traditional machine learning problem, with a labeled training set and submissions evaluated against a hidden test set of records. Although the primary criterion for the final ranking is the accuracy of the proposed model, several other characteristics should be evaluated before any proposal is adopted in clinical practice. Because the ECG faithfully captures the electrical activity of the heart [1], ECG analysis can provide competent AF detection in clinical practice [3]. Because symptoms occur in episodes, it is challenging to catch AF during routine in-office visits [2]. Recent techniques are motivated by the high fatality rate and the inadequacy of current AF detection [6]. ECG signal analysis for AF detection is conducted in the time or frequency domain; the dominant AF frequency is commonly estimated on a signal from which the QRS complex and T wave have been removed [7, 8]. This study aims to provide a classification model and evaluate its ability to separate brief single-lead ECG signals into AF, Normal (N), noisy, and Other Rhythms (O) using the 2017 PhysioNet/Computing in Cardiology Challenge database [9].

The ECG is the most common diagnostic tool for identifying cardiac arrhythmias. AF is the most prevalent cardiac rhythm disorder and is characterized by disorganized atrial contraction [4]. The prevalence of AF increases with each succeeding decade of age, from 0.5% at 50-59 years to approximately 9% at 80-90 years [10]. An estimated 2.3 million adults in the United States suffer from AF, a figure expected to rise to 5.6 million by 2050 as the population ages [11]. AF is considered a substantial cause of mortality and morbidity, as it increases the risk of heart failure and stroke.

For this reason, a variety of automatic algorithms operating on the surface ECG have been proposed in recent years. Many exploit the two changes that the arrhythmia induces on the ECG. On the one hand, AF is characterized by rapid atrial activity, whose rate can vary between 240 and 540 activations/min [12]. Because these activations bombard the atrioventricular (AV) node, a fast and irregular ventricular response is visible on the ECG. Clearly, this behavior diverges from the regular pattern of the R-R interval series during normal sinus rhythm (NSR). On the other hand, the homogeneous P-wave associated with atrial depolarization during NSR is replaced by a series of low-amplitude fibrillatory waves of varied morphology (f-waves) [12, 13]. AF is underdiagnosed and is typically detected only after a patient presents with serious complications such as stroke or heart failure.

Drugs can ease symptoms and help prevent serious complications such as stroke. Electrophysiological procedures and radiofrequency ablation have been effective in restoring normal rhythm [14]. Recent progress in mobile technology (network connectivity and computational power) makes it possible to develop low-cost, widely accessible, and accurate medical devices. These devices can be used to address the shortage of healthcare resources in the developing world and to lower the cost of healthcare in developed countries. Automatic AF detectors allow preliminary screening and identification of AF compared with manual methods. Most current algorithms are based on analyses of the ventricular response and atrial activity. Garcia et al. [15, 16] describe AF detection features, including heart rate variability, wavelet entropy, and P-wave detection. However, current AF detection strategies in clinical settings remain limited [17]. In past studies, classification was performed only on clean data; however, noise is unavoidable in continuous monitoring settings because of lead detachment, respiration, or motion.

Furthermore, such setups typically only distinguish AF signals from normal ones [17]. Since AF is regularly misdiagnosed as other arrhythmia types, classifying AF against alternative rhythms would make the detector more robust. AF is an excellent candidate for which the impact of such well-designed mobile technology would be high. However, despite the availability of low-cost medical equipment, the ability to process data directly on a phone, and the availability of vast databases of biosignals, little has been done to develop intelligent algorithms that can automatically interpret these medical data. The PhysioNet/Computing in Cardiology Challenge 2017 [18] addresses this theme, encouraging researchers worldwide to create methods for classifying AF from a short single-lead electrocardiogram (ECG) recording acquired using a mobile device.

Much work has gone into ECG classification, and more continues to be done. In [19], Garcia et al. propose a strategy that exploits the variability of ventricular and atrial activity as seen on the surface electrocardiogram (ECG). First, time series are constructed from the RR intervals and from the fibrillatory wave morphology derived from TQ intervals. The Coefficient of Sample Entropy (COSEn) is then used to measure their regularity. The gathered features are finally combined through a multiclass Support Vector Machine approach to discriminate among short episodes of AF, Normal Sinus Rhythm (NSR), and Other Rhythms (OR).

Rajpurkar et al. [20] propose using a ResNet model to classify ECG data into four groups. They also incorporate a number of advanced features, such as statistical modeling of atrial activity, heart rate variability analysis in both the frequency and time domains, and spectral power analysis. They devised a hierarchical classification model, employing oversampling across categories, to determine whether an electrocardiogram signal is normal, noisy, exhibiting atrial fibrillation (AF), or displaying an alternative rhythm. Maknickas V and Maknickas A [21] suggest an LSTM network for ECG classification that learns patterns directly from precomputed QRS complex features. The procedure for extracting information from each beat of the ECG signal is outlined in [22].

Jiménez-Serrano et al. [9] devised a method for automatically extracting 72 ventricular activity parameters from the 8528 ECG recordings submitted to the 2017 PhysioNet/Computing in Cardiology Challenge; a grid search over Feed Forward Neural Network (FFNN) training parameters was then performed to carry out feature selection (FS) and training/validation [3]. The authors of [3] use templates sensitive to heart rate variability and waveform, together with an AdaBoost classifier. Multiparametric atrial fibrillation classification is described in [23], based on HRV analysis, noise detection, detection of atrial activity through the presence of a P-wave in the average beat and f-waves during TQ intervals, and beat morphology analysis following robust synthesis of an average beat and delineation of the P, QRS, and T waves; a linear discriminant classifier was then used to assign the ECG data to four categories. Furthermore, Ojha et al. [24] constructed a deep autoencoder-based SVM classifier to categorize ECG signals into five categories using an arrhythmia database, achieving better results than previously published research and a more accurate classification of the ECG signal.

This classification task can be performed with various pattern recognition algorithms. This research therefore aims to develop a new model based on deep learning techniques for the early diagnosis of arrhythmias from ECG signals; that is, it focuses on arrhythmia-oriented ECG signal processing and classification in order to propose new models that can help cardiologists diagnose arrhythmia early.

The significant contributions of this paper are the following:
(i) the dataset is taken from the PhysioNet/Computing in Cardiology Challenge 2017;
(ii) first, we perform a data preprocessing step using bandpass Butterworth filters to remove noise from the ECG signals;
(iii) next, Z-score normalization is applied to the amplitude values of the filtered ECG signals;
(iv) the dataset used in this paper is highly imbalanced, so the Synthetic Minority Oversampling Technique (SMOTE) is used to balance it before the dataset is divided into training and testing sets for modeling;
(v) three deep learning configurations are trained on the dataset, namely ResNet, a combination of ResNet and BLSTM, and a combination of ResNet and RBF, for the detection of atrial fibrillation heartbeats in the ECG signals;
(vi) validation is then performed on the test set, classifying ECG signals into four classes: normal, AF, noisy, and other;
(vii) finally, this study introduces a new hybrid model based on deep learning techniques that classifies ECG signals into these four classes and improves the effectiveness and efficiency of heartbeat classification compared with other machine learning and deep learning models on the same ECG challenge dataset.

The rest of the paper is structured as follows: Section 2 describes the materials and methodology, Section 3 presents the results and a discussion of the proposed methods, and Section 4 concludes the paper.

2. Materials and Methods

Deep learning techniques are widely used in healthcare today. Two deep learning methods are employed in this study: Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks. Parameter sharing, translation invariance, and sparse connectivity make CNN training computationally efficient and popular in computer vision [25, 26]. The downside of CNNs is that they rely on grid-like input structures (e.g., images or fixed segment windows). A typical CNN is shown in Figure 1.

One recent development that has eased the training and improved the accuracy of deeper CNNs is the Residual Network (ResNet) [27]. By using shortcut identity connections, reminiscent of a feedforward LSTM (a subtype of RNN) [18, 28], ResNet makes feature maps from lower layers accessible at higher stages.
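To make the shortcut idea concrete, the sketch below shows a minimal 1-D residual block in Keras; the filter count and kernel size are illustrative assumptions, not the exact configuration of the 36-layer model used later in this paper.

```python
# A minimal sketch of a 1-D residual block with an identity shortcut,
# in the spirit of ResNet [27]. Assumes the input tensor already has
# `filters` channels so the Add() shapes match.
from tensorflow.keras import layers

def residual_block(x, filters=64, kernel_size=16):
    shortcut = x
    y = layers.Conv1D(filters, kernel_size, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv1D(filters, kernel_size, padding="same")(y)
    y = layers.BatchNormalization()(y)
    # Identity shortcut: lower-layer feature maps flow directly to
    # higher stages, easing optimization of deep networks.
    y = layers.Add()([shortcut, y])
    return layers.ReLU()(y)
```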

2.1. Dataset Collection

The ECG recordings used in the challenge were acquired with the AliveCor device and made public. Training used an open database of 8528 single-lead ECGs with accompanying annotations, while testing was performed on the PhysioNet Challenge server against a hidden set of 3658 ECG recordings. The database contains four categories of ECGs: Normal Sinus Rhythm (N), Atrial Fibrillation (A), Other Rhythms (O), and Noisy Recordings. The recordings are single-lead, sampled at 300 Hz, and range from 9 to 61 seconds in length. The data repository is offered as a downloadable zip file. Table 1 and Figure 2 show the number of heartbeats in each category.

2.2. Signal Preprocessing and Normalization

Each ECG signal was preprocessed using 10th-order bandpass Butterworth filters [29], with a passband of either 5-45 Hz (narrowband) or 1-100 Hz (wideband). The frequency response of a Butterworth filter is maximally flat (i.e., has no ripples) in the passband and rolls off toward zero in the stopband.
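A minimal sketch of this filtering step with SciPy follows, shown for the narrowband 5-45 Hz setting; second-order sections are used here for numerical stability at this filter order.

```python
# A sketch of the 10th-order Butterworth bandpass filtering described
# above, using SciPy; the narrowband 5-45 Hz passband is shown.
from scipy.signal import butter, sosfiltfilt

FS = 300  # sampling rate of the challenge recordings (Hz)

def bandpass(ecg, low=5.0, high=45.0, order=10, fs=FS):
    sos = butter(order, [low, high], btype="band", fs=fs, output="sos")
    # Forward-backward filtering gives a zero-phase result.
    return sosfiltfilt(sos, ecg)
```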

For this analysis, the ECG data are segmented into 20-second windows, each treated as a single training sample. Given the 300 Hz sampling rate, each 20-second training segment contains 20 × 300 = 6000 samples, matching the requirements of the CinC 2017 database. Before segmentation, each recording is normalized with the Z-score technique to have a mean of zero and a standard deviation of one, making the amplitude values comparable across recordings: z = (x - μ) / σ, where x represents each sample of the 20-second ECG signal, μ is the mean of the signal values, and σ is the standard deviation. Beyond the filtering described above, no additional processing was applied, as the ECG had already been bandpass filtered by the recording device [22].
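As a concrete illustration, the sketch below applies Z-score normalization to a whole recording and then cuts it into non-overlapping 20-second windows; padding of recordings shorter than 20 s is an implementation detail omitted here.

```python
# A sketch of the normalization and segmentation steps: Z-score
# normalization of the whole recording, then non-overlapping 20 s
# windows (6000 samples at 300 Hz).
import numpy as np

FS = 300            # sampling rate (Hz)
SEG_LEN = 20 * FS   # 6000 samples per 20 s segment

def normalize_and_segment(recording):
    rec = np.asarray(recording, dtype=float)
    rec = (rec - rec.mean()) / rec.std()  # z = (x - mu) / sigma
    n = len(rec) // SEG_LEN
    # Drop any trailing partial window shorter than 20 s.
    return rec[: n * SEG_LEN].reshape(n, SEG_LEN)
```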

2.3. Oversampling

Predictive accuracy is commonly used to evaluate the performance of deep learning algorithms. However, this is not appropriate when the data are imbalanced and the costs of different errors vary significantly. The present work uses the Synthetic Minority Oversampling Technique (SMOTE) [30], an oversampling strategy in which the minority class is oversampled by creating "synthetic" examples rather than by sampling with replacement. SMOTE generates new synthetic samples for the minority classes without duplication: it computes the k nearest neighbors of each minority-class observation and then creates synthetic samples along the lines joining the observation to one or more of those neighbors, depending on the degree of oversampling required.
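As an illustration, the sketch below balances a toy imbalanced dataset with SMOTE from the imbalanced-learn package; the feature matrix merely stands in for the (much wider) flattened ECG segments, and oversampling should be applied to the training split only.

```python
# A sketch of SMOTE oversampling with imbalanced-learn.
import numpy as np
from imblearn.over_sampling import SMOTE

# Placeholder imbalanced data standing in for flattened ECG segments.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 100))
y_train = np.array([0] * 800 + [1] * 120 + [2] * 60 + [3] * 20)

# Each synthetic minority sample is interpolated between a minority
# observation and one of its k nearest minority-class neighbors.
sm = SMOTE(k_neighbors=5, random_state=42)
X_bal, y_bal = sm.fit_resample(X_train, y_train)
print(np.bincount(y_bal))  # -> [800 800 800 800]
```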

2.4. Proposed Deep Learning Model

In the present work, we take two different approaches. The first is similar to [20], using a ResNet model to classify the ECG recordings into four classes; this ResNet has 36 layers, combining convolutional, max-pooling, and fully connected layers. The second approach combines the ResNet with additional models, namely a Bidirectional Long Short-Term Memory (BLSTM) network or a Radial Basis Function (RBF) network. This intuitive hybrid approach offers insight into how different neural network models can be combined into a single hybrid model that performs the classification task more effectively.

2.4.1. Bidirectional LSTM (BLSTM)

Bidirectional LSTMs augment the traditional LSTM and improve model performance on sequence classification tasks [31, 32]. A BLSTM trains two LSTMs on the input sequence rather than one: the first operates on the input sequence as-is, while the second operates on a reversed copy of it [33]. This amounts to duplicating the recurrent layer so that two layers sit side by side, feeding the input sequence to one layer and a reversed copy to the other (https://machinelearningmastery.com/develop-bidirectional-lstm-sequence-classification-python-keras/). The recurrent connections between LSTM units let information loop across adjacent time steps, creating an internal feedback state that allows the network to capture the notion of time and discover the temporal dynamics in the presented data. LSTM units can remember or forget information by maintaining a memory state: the most important information is retained and propagated, while less important information is discarded [34]. The architecture of the BLSTM network is shown in Figure 3.
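A minimal BLSTM sequence classifier in Keras might look as follows; the input shape corresponds to one 20-second segment at 300 Hz, and the layer sizes are illustrative assumptions rather than the paper's exact settings.

```python
# A sketch of a BLSTM classifier: one LSTM reads the sequence forward,
# a second reads a reversed copy, and their outputs are concatenated.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(6000, 1)),          # 20 s segment at 300 Hz
    layers.Bidirectional(layers.LSTM(64)),  # forward + backward LSTMs
    layers.Dense(4, activation="softmax"),  # N, AF, Other, Noisy
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```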

2.4.2. Hybrid Architecture

An LSTM applied directly to the raw signal does not, by itself, produce accurate classification results. A better strategy is therefore a hybrid model combining a ResNet (CNN) with an LSTM [35, 36]. The ResNet-LSTM model uses the ResNet layers to learn features, which are then fed to the LSTM layer to support accurate prediction. Both ResNet (CNN) and LSTM perform reasonably well on ECG signals. Moreover, deep learning models do not require hand-crafted feature extraction and are relatively simple to implement [34]. Hence, this paper combines these two algorithms to detect arrhythmias: the bidirectional LSTM takes the output of the ResNet architecture and classifies the input into four classes, namely AF, Normal, Noisy, and Other. Figure 4 shows the hybrid structure of the ResNet and BLSTM model.
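A compressed sketch of this ResNet-BLSTM pipeline is shown below; a short plain Conv1D stack stands in for the full 36-layer ResNet, and the layer sizes are illustrative assumptions, not the paper's exact configuration.

```python
# A sketch of the hybrid idea: convolutional feature extraction
# followed by a BLSTM classifier over the downsampled feature sequence.
from tensorflow.keras import layers, models

inp = layers.Input(shape=(6000, 1))
x = layers.Conv1D(64, 16, strides=2, padding="same", activation="relu")(inp)
x = layers.MaxPooling1D(2)(x)
x = layers.Conv1D(64, 16, strides=2, padding="same", activation="relu")(x)
x = layers.MaxPooling1D(2)(x)           # shorter feature sequence
x = layers.Bidirectional(layers.LSTM(64))(x)
out = layers.Dense(4, activation="softmax")(x)  # AF, Normal, Noisy, Other

model = models.Model(inp, out)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Note that the default Adam learning rate of 0.001 matches the setting reported in Section 3.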

Another variant feeds the output of the ResNet model into an RBF network [37, 38]. The RBF network then classifies the incoming features from the ResNet into the four classes discussed above, as shown in Figure 5.
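Keras has no built-in RBF layer, so the sketch below implements a Gaussian RBF layer with learnable centers as a custom layer; this construction is an assumption for illustration, not the paper's exact implementation.

```python
# A sketch of a Gaussian RBF layer with learnable centers; the pooled
# ResNet feature vector would pass through this layer and then a
# Dense(4, activation="softmax") output.
import tensorflow as tf
from tensorflow.keras import layers

class RBFLayer(layers.Layer):
    def __init__(self, units, gamma=1.0, **kwargs):
        super().__init__(**kwargs)
        self.units, self.gamma = units, gamma

    def build(self, input_shape):
        self.centers = self.add_weight(
            name="centers", shape=(self.units, input_shape[-1]),
            initializer="random_normal", trainable=True)

    def call(self, x):
        # Gaussian activation on squared distances to the centers.
        diff = tf.expand_dims(x, 1) - self.centers  # (batch, units, dim)
        d2 = tf.reduce_sum(tf.square(diff), axis=-1)
        return tf.exp(-self.gamma * d2)
```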

3. Results and Discussion

The above models are trained and tested using the publicly available free cloud notebook environment Google Colab (http://colab.research.google.com). Colab provides a free GPU with up to 11 GB of memory and 358.27 GB of storage, a 2.3 GHz CPU, and a Tesla T4 GPU with a memory clock rate of 1.59 GHz. The dataset is downloaded directly from the PhysioNet website to avoid the overhead of uploading data from a local machine. This hardware setup therefore provides an efficient way to train and test the deep neural networks without interference from local devices. The models use a learning rate of 0.001 with the Adam optimizer from the Keras library. Cross-validation is a resampling method for evaluating machine learning models on a limited data sample. A five-fold cross-validation strategy is used in this paper. The methodology has a single parameter, k, which refers to the number of groups into which the data sample is split; it is therefore generally called k-fold cross-validation. The whole dataset is first divided into k equal parts. Then k-1 parts are used to train the classification models, and the remaining part is used to test them. In this fashion, the model is trained k times on different parts of the dataset, and each time it is tested on a fold it did not see during training.
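A sketch of this protocol with scikit-learn follows; build_model is a hypothetical factory returning a freshly compiled Keras network for each fold, and the use of stratified folds (preserving class proportions) is an assumption.

```python
# A sketch of k-fold cross-validation (k=5) for a Keras model.
# Assumes build_model() returns a compiled model with an accuracy metric.
import numpy as np
from sklearn.model_selection import StratifiedKFold

def cross_validate(X, y, build_model, k=5):
    scores = []
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=42)
    for train_idx, test_idx in skf.split(X, y):
        model = build_model()                 # fresh model per fold
        model.fit(X[train_idx], y[train_idx], epochs=20, verbose=0)
        _, acc = model.evaluate(X[test_idx], y[test_idx], verbose=0)
        scores.append(acc)                    # held-out fold accuracy
    return np.mean(scores)
```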

The following tables describe our experimental results; the different models are trained and tested using the five-fold cross-validation strategy. The training and test datasets are split 80/20 for all models used in this paper. Accuracy and F1 score are used to evaluate the performance of the models; F1 combines precision and recall into a single measure of a model's correctness, as shown below.
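For reference, F1 is the harmonic mean of precision and recall; for a multiclass task such as this one, the per-class F1 values are averaged into an overall score (macro averaging over the C classes is assumed here):

```latex
F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}
                   {\mathrm{precision} + \mathrm{recall}},
\qquad
F_1^{\mathrm{overall}} = \frac{1}{C} \sum_{c=1}^{C} F_{1,c}
```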

Table 2 presents the five-fold cross-validation results for the ResNet-36 model, listing the validation accuracy of each run together with the computed F1 score; the overall F1 score achieved is 80.58%.

Figure 6 plots the validation accuracy of the ResNet-36 model at 5, 10, 15, and 20 epochs. The maximum validation accuracy achieved is 84.40%, at epoch 20 on the fifth cross-validation fold. Validation accuracy grows roughly linearly with the number of epochs.

Table 3 presents the accuracy and computed F1 score under five-fold cross-validation for the hybrid model of ResNet and bidirectional LSTM. The overall computed F1 score is 80.08%, with per-fold F1 scores of 71.73%, 77.97%, 79.85%, 84.93%, and 85.94%, respectively. Figure 7 shows how the number of epochs affects the validation accuracy across the cross-validation folds; the highest validation accuracy achieved is 82.87%, and accuracy increases with the number of epochs.

Table 4 shows the F1 score and validation accuracy for the ResNet + RBF network under five-fold cross-validation, with an overall F1 score of 80.20%. Figures 8 and 9 show the accuracy and F1 score across epochs, with a highest validation accuracy of 84.56%.

In the present work, we have classified the short ECG recordings into four classes using deep neural networks: ResNet, a hybrid model (ResNet and bidirectional LSTM), and ResNet + RBF. We compared the results across the different models and conclude that the presented models achieve a significant improvement over the related work discussed in [29]. The model in [29] could only be scaled up to a certain size because of computational constraints, a restriction the present work does not face, and our results improved significantly. However, distorted and noisy signals remain a limitation that lowers the overall accuracy and computed F1 score.

The work by Garcia et al. [19] used a multiclass SVM approach for classification and achieved an F1 score of 73%. In comparison, Rajpurkar et al. [20] used a 34-layer ResNet that converts a sequence of ECG samples into a sequence of rhythm classes, achieving an overall F1 score of 79.9%. Coppola et al. [1] used a hierarchical classification model for assigning ECGs to different rhythm classes, with an F1 score of 78.55%. Maknickas V and Maknickas A [21] used an LSTM network that learns patterns directly from precomputed QRS complex features, achieving an F1 score of 78%. Schwab et al. [22] used an ensemble of RNNs with an LSTM attention model and achieved an F1 score of 79%. Andreotti et al. [29] used a ResNet model and achieved an accuracy of 79%. Jiménez-Serrano et al. [9] used a Feedforward Neural Network (FFNN) with an F1 score of 77%. Our present work takes two approaches, one similar to [20, 29] with a ResNet model and the other a hybrid of ResNet with BLSTM or ResNet with RBF, achieving F1 scores of 80.58%, 80.08%, and 80.20%, respectively. Table 5 and Figure 10 compare the performance of the proposed models with existing work.

4. Conclusion

Overall, many studies have addressed ECG rhythm classification, and the present work adds another variation of the ResNet model and two new hybrid architectures involving BLSTM and RBF networks. The results are promising and could be improved in various ways with the availability of more open, publicly accessible data, the scarcity of which has been a continuing obstacle for this study. New biomedical technologies allow researchers to work with an unprecedented amount of precise data. Given the nature of this work and the range of deep neural networks available, there remains broad scope for improvement in this field. Many researchers are working on this problem, and many R&D institutes have taken an interest in it, which leads us to expect the domain to flourish in the near future.

Although we tried to incorporate as many models as possible given the limited time and computational resources, a wide range of additional techniques and models, such as the Multilayer Perceptron (MLP), could be applied to this problem. As the work requires substantial computational and data resources, the advent of new and better technologies may allow the complexity to be reduced so that results can be obtained in less time.

Data Availability

Data will be made available upon request.

Conflicts of Interest

The authors declare no conflicts of interest.