#### Abstract

To solve the problem of real-time arrhythmia classification, this paper proposes a deep learning-based algorithm with low latency, high practicality, and high reliability that can be easily applied to a real-time arrhythmia classification system. In the algorithm, a classifier detects the QRS complex position in real time for heartbeat segmentation. Then, the ECG_RRR feature is constructed according to the heartbeat segmentation result. Finally, another classifier classifies the arrhythmia in real time using the ECG_RRR feature. This article uses the MIT-BIH arrhythmia database and divides the 44 qualified records into two groups (DS1 and DS2) for training and evaluation, respectively. The results show that the recall rate, precision rate, and overall accuracy of the algorithm’s interpatient QRS complex position prediction are 98.0%, 99.5%, and 97.6%, respectively. The overall accuracy for 5-class and 13-class interpatient arrhythmia classification is 91.5% and 75.6%, respectively. Furthermore, the proposed algorithm has the advantages of practicability and low latency. It is easy to deploy because its input is the original ECG signal, with no feature processing required, and the latency of the arrhythmia classification is only the duration of one heartbeat cycle.

#### 1. Introduction

According to the World Health Organization (WHO), 17.9 million people died of cardiovascular disease in 2019, of which 85% died of sudden heart disease and stroke [1]. Arrhythmia refers to a problem with the frequency or rhythm of the heartbeat, and severe arrhythmia may cause lethal heart disease [2]. Clinically, doctors usually diagnose it by analyzing the patient’s electrocardiogram (ECG) together with the relevant medical history and clinical manifestations [3]. However, abnormal ECG signals usually occur intermittently and may not be captured by a short-term ECG. Relying only on manual processing of a patient’s long-term ECG records is time-consuming and labor-intensive and lacks objectivity. Moreover, centralized analysis after recording the ECG lacks real-time performance and cannot deal well with sudden risks to patients.

With the development of computer science and technology, computer-aided diagnosis (CAD) for ECG analysis has helped overcome the shortcomings of manual ECG processing [4]. An increasing number of algorithms have been proposed for the automatic analysis of ECG signals for arrhythmia classification. Generally, an arrhythmia classification algorithm consists of four steps: preprocessing, heartbeat segmentation, feature extraction, and classification. Heartbeat segmentation has been studied for 30 years [5–8]. The classical heartbeat segmentation method uses an adaptive threshold [5]. As research has deepened, more and more new technologies have been applied to heartbeat segmentation, such as wavelet transform [9], genetic algorithm [10], and neural network [11]. The accuracy of heartbeat segmentation has a great impact on the final arrhythmia classification. However, many studies on arrhythmia classification algorithms directly use the heartbeat markers in the database, ignoring the influence of heartbeat segmentation errors on the overall algorithm. Typical ECG features include single features (such as heart rate variability, QRS width, PQ/PR interval, and amplitude of QRS) and time-domain features or corresponding frequency-domain features extracted directly from ECG signals. Popular classifiers for arrhythmia classification include support vector machine (SVM) [12, 13], artificial neural network (ANN) [14, 15], linear discriminant (LD) [16], and logistic regression (LR) [17].

Deep learning is a branch of machine learning. It is a computational model with multiple processing layers to learn data representation with multiple levels of abstraction [18]. Deep learning optimizes the parameters of each layer by backpropagation and discovers complex structures in big datasets [18]. It has been proven to be useful for many disciplines, such as computer vision, speech recognition, natural language processing, and bioinformatics. Methods based on deep learning are increasingly used to study the classification of arrhythmia. Mathews et al. [19] proposed a deep learning-based ECG classifier using single-lead ECG and trained it to classify arrhythmias on the ECG signal with a 114 Hz sampling rate. Paweł and Acharya [20] used long-duration (10 s) ECG signal segments, strengthened the characteristic ECG signal features with spectral power density estimation, and introduced a novel three-layer deep genetic ensemble of classifiers. Shaker et al. [21] proposed a novel data augmentation technique using generative adversarial networks (GANs) to balance the dataset and effectively improve the performance of ECG classification over the same models trained on the original dataset. These studies have made good progress in the accuracy and the interpatient performance of the arrhythmia classification algorithm; however, they do not address its real-time performance. In practice, developing the most appropriate classifier that is capable of classifying arrhythmia in real time is also an issue in ECG arrhythmia classification [22].

Due to the specificity of individual ECGs, the main concern in practical application is the interpatient performance of the arrhythmia classification algorithm. There are two main ways to improve it. The first method is to have an expert annotate arrhythmia on a portion of a specific patient’s ECG and fine-tune the model with the annotation to improve the model performance for this patient [23–26]. Among them, Luo et al. [26] proposed a patient-specific arrhythmia classifier based on deep learning, in which a deterministic patient-specific heartbeat classifier is fine-tuned on heartbeat samples that include a small subset of individual samples (the overall accuracy increases from 89.3% to 97.5%). This method is feasible but not scalable because fine-tuning the model requires expert intervention. The second method is to train a general classifier with good interpatient performance through a reasonable selection of features, normalization, training datasets, and evaluation methods [14, 27–31]. The second method is used in this paper because it is cheaper and more practical.

The combination of edge computing [32] and wearable ECG acquisition technology for real-time arrhythmia monitoring can not only help patients monitor their health and prevent sudden risks but also has the advantages of low latency, low power consumption, low bandwidth, and high privacy. An arrhythmia classification algorithm with low latency, high practicability, and reliability is the key to the monitoring system. Therefore, this paper proposes a real-time arrhythmia classification algorithm using deep learning. In the algorithm, a classifier based on the FFNN (Feedforward Neural Network) model first detects the QRS complex position for real-time heartbeat segmentation. Then, the time-domain morphological features of each heartbeat cycle are extracted according to the heartbeat segments, and another classifier based on the CNN (Convolutional Neural Network) model classifies arrhythmia in real time. The algorithm classifies arrhythmia with a latency of one heartbeat cycle and overcomes the shortcomings of traditional arrhythmia classification algorithms, which are unscalable, complex in feature processing, and not suitable for edge computing. This paper provides a real-time, efficient, and reliable arrhythmia classification algorithm for edge computing-based ECG monitoring systems.

#### 2. Materials and Methods

##### 2.1. Data Description

The MIT-BIH arrhythmia database [33] is published by the Massachusetts Institute of Technology-Beth Israel Hospital. It contains 48 ECG records with a sampling rate of 360 Hz and a duration of 30 minutes. In this paper, only the ECG signals of the MLII lead are used for arrhythmia classification, which requires less hardware and computational cost, while giving a satisfactory overall accuracy [31, 34].

ANSI/AAMI [35] categorizes the 15 recommended classes of arrhythmia into five superclasses, which are normal (N), supraventricular ectopic beat (SVEB), ventricular ectopic beat (VEB), fusion beat (F), and unknown beat (Q). It is recommended to classify with only these superclasses. Since there are only 13 classes of arrhythmia in MIT-BIH, this paper designs the heartbeat classification model as a 13-class classification model, from which the 5-class classification results are then obtained according to the class hierarchy. Some studies instead classify arrhythmia into a different set of five classes: normal (N), left bundle branch block (LBBB), right bundle branch block (RBBB), premature ventricular contraction (PVC), and atrial premature beat (APB) [36–38].

To better adapt to the actual environment and to improve the practicality of the algorithm, this paper divides the 44 records in the MIT-BIH database without pacemakers into two groups, according to the work of Chazal et al. [16]. The first group, named the DS1 group, consists of the records of 101, 106, 108, 109, 112, 114, 115, 116, 118, 119, 122, 124, 201, 203, 205, 207, 208, 209, 215, 220, 223, and 230. The second group, named the DS2 group, consists of the records of 100, 103, 105, 111, 113, 117, 121, 123, 200, 202, 210, 212, 213, 214, 219, 221, 222, 228, 231, 232, 233, and 234. The records of Groups DS1 and DS2 are basically from different patients (201 and 202 are from the same patient but in different groups). Figure 1 shows part of the waveform of each data record in the DS1 and DS2 groups.

Table 1 shows the hierarchy of superclasses and subclasses in arrhythmia classes. It also contains descriptions and quantity statistics of each class in Groups DS1 and DS2. Since some nonbeat annotations exist in the MIT-BIH database, this article marks them as the OTHER superclass. The relevant data should be excluded from the following process of constructing datasets.

##### 2.2. Overall Design

The overall algorithm design is shown in Figure 2:

1. The real-time ECG signal sequence is cut into ECG segments with a 200 ms time window.
2. The algorithm detects the QRS complex positions with Classifier 1 in the ECG segments.
3. The algorithm caches ECG data and extracts the time-domain feature (named ECG_RRR) based on the last three QRS complex positions.
4. Classifier 2 predicts arrhythmias using the ECG_RRR feature.

The algorithm’s output is the arrhythmia type and the corresponding QRS complex position information. As shown in Figure 2, the ECG_RRR feature is a resampled ECG sequence with a fixed length. The starting point of the sequence is the *R* wave point preceding the current heartbeat cycle, and the endpoint is the *R* wave point following it. Therefore, the ECG_RRR feature contains the complete ECG signal of the current heartbeat cycle. Meanwhile, to improve the accuracy of QRS complex detection, Classifier 1 adopts a step-window voting mechanism to ensure that the moving window contains complete QRS complex information. Since the preprocessing of the ECG signal (such as removing baseline offset, noise, and power-line interference) is expected to be done at the acquisition devices, this algorithm does not consider signal preprocessing.

The 200 ms time window is selected because it not only guarantees real-time performance but also ensures that each two adjacent time windows contain no more than one complete QRS complex.

The real-time ECG signal $s$ is cut by a 200 ms time window into consecutive ECG segments (named $x_k$; the length is $L = 72$ at the 360 Hz sample rate used in this paper), expressed by

$$x_k = \left[s_{kL},\ s_{kL+1},\ \ldots,\ s_{kL+L-1}\right], \quad k = 0, 1, 2, \ldots$$

The segment $x_k$ is input into Classifier 1, which outputs an integer $y_k \in [0, 71]$. $y_k = 0$ means that the input ECG segment does not contain a QRS complex. $y_k > 0$ means that the input ECG segment contains a QRS complex whose center position within the segment is $y_k$. The position of this QRS complex in the ECG signal is $p = kL + y_k$, which is stored in the QRS buffer.
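The windowing step and the conversion of a Classifier 1 output into a position in the full signal can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the zero-based segment index and offset convention are assumptions.

```python
FS = 360                       # sampling rate in Hz (from the paper)
WINDOW = FS * 200 // 1000      # 200 ms window -> 72 samples

def split_into_segments(signal, window=WINDOW):
    """Cut a signal into consecutive, non-overlapping windows.

    An incomplete tail shorter than one window is dropped.
    """
    n_full = len(signal) // window
    return [signal[k * window:(k + 1) * window] for k in range(n_full)]

def to_signal_position(segment_index, y, window=WINDOW):
    """Map a Classifier 1 output y to a sample index in the full signal.

    y == 0 means "no QRS complex in this segment" and yields None.
    Assumption: segments are indexed from 0 and y is an offset inside the segment.
    """
    if y == 0:
        return None
    return segment_index * window + y

segments = split_into_segments(list(range(150)))
```

In a streaming deployment the same arithmetic would be applied to each newly arrived 72-sample block instead of a stored list.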

After obtaining the positions of the last three consecutive QRS complexes, the heart rate (HR) and heart rate variability (HRV) are calculated from these positions and the sampling rate of the ECG (360 Hz in this paper).

Only when HR and HRV are both within the normal range (30∼150) are the three QRS positions considered valid. Then, the ECG_RRR feature is obtained by resampling the ECG signal in the ECG buffer between the first and the third of these QRS positions into 360 data points.

Finally, the ECG_RRR feature is input into Classifier 2, which outputs the arrhythmia classification of the heartbeat cycle at the second (middle) QRS position.
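A rough sketch of the ECG_RRR extraction is given below. Several points are assumptions rather than details from the paper: the validity check is applied to the two beat-to-beat heart rates (the exact HR and HRV formulas are not reproduced in this text), the resampling uses linear interpolation, and all function names are hypothetical.

```python
def heart_rate(p_a, p_b, fs=360):
    """Beats per minute implied by two consecutive QRS positions (in samples)."""
    return 60.0 * fs / (p_b - p_a)

def resample_linear(samples, n_out=360):
    """Resample a sequence to n_out points by linear interpolation (assumed method)."""
    n_in = len(samples)
    if n_in < 2:
        raise ValueError("need at least two samples to resample")
    out = []
    for k in range(n_out):
        t = k * (n_in - 1) / (n_out - 1)   # fractional index into the input
        i = min(int(t), n_in - 2)          # left neighbour, clamped at the end
        frac = t - i
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
    return out

def extract_ecg_rrr(ecg, p1, p2, p3, fs=360, lo=30, hi=150):
    """Return the 360-point feature spanning p1..p3, or None if the beats look invalid.

    Assumption: both beat-to-beat rates must lie in [lo, hi] beats per minute.
    """
    rates = (heart_rate(p1, p2, fs), heart_rate(p2, p3, fs))
    if not all(lo <= r <= hi for r in rates):
        return None
    return resample_linear(ecg[p1:p3 + 1], 360)

ramp = [float(i) for i in range(1000)]      # synthetic signal for illustration
feature = extract_ecg_rrr(ramp, 100, 400, 700)
```

Because the output length is fixed at 360 regardless of the R-R intervals, the feature can feed a network with a fixed input size.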

The ECG_RRR feature is selected for arrhythmia heartbeat classification because it contains both the complete information of the current heartbeat cycle and partial information of the preceding and succeeding heartbeat cycles as the heart rate variation information.

Classifiers 1 and 2 are both based on deep learning models. According to the characteristics of the input features, Classifier 1 uses a Deep Neural Network (DNN) model of fully connected layers, also called Feedforward Neural Network (FFNN). Classifier 2 uses the Convolutional Neural Network (CNN) model. CNN is a neural network with multiple hidden layers such as fully connected layers, convolutional layers, and pooling layers. CNN is a type of DNN with convolutional layers and pooling layers to extract and abstract features.

For the fully connected layers, which are directly connected layer by layer, the forward propagation from the layer $l-1$ with $n$ features to the layer $l$ with $m$ features is

$$a_i^{l} = \sigma\left(\sum_{j=1}^{n} w_{ij}^{l}\, a_j^{l-1} + b_i^{l}\right), \quad i = 1, \ldots, m.$$

The formula in the matrix form is

$$a^{l} = \sigma\left(W^{l} a^{l-1} + b^{l}\right),$$

where $a^{l-1}$ is the feature vector of the layer $l-1$, a column vector of length $n$; $a^{l}$ is the feature vector of the layer $l$, a column vector of length $m$; $W^{l}$ is the weight matrix of the layer $l$, a matrix of *m* rows and *n* columns; $b^{l}$ is the bias vector of the layer $l$, a column vector of length $m$; and $\sigma$ is the nonlinear activation function of the layer.
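As a small worked illustration of the fully connected forward pass, the sketch below multiplies a weight matrix of *m* rows and *n* columns by an input vector, adds the bias, and applies an activation. The function names and the toy numbers are illustrative only.

```python
def relu(z):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in z]

def dense_forward(weights, bias, a_prev, activation=relu):
    """One fully connected layer: activation(W @ a_prev + b).

    weights is a list of m rows, each holding n values; bias has length m.
    """
    z = [
        sum(w * a for w, a in zip(row, a_prev)) + b
        for row, b in zip(weights, bias)
    ]
    return activation(z)

# Toy layer with n = 2 inputs and m = 3 outputs.
W = [[1.0, -1.0], [0.5, 0.5], [-1.0, -1.0]]
b = [0.0, 1.0, 0.0]
out = dense_forward(W, b, [2.0, 3.0])   # pre-activations are [-1.0, 3.5, -5.0]
```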

Each convolution layer contains several convolution kernels. The features of the previous layer are convolved with the corresponding convolution kernels to output new features. For the layer $l-1$ with depth $d_{l-1}$ (the feature maps are $x_1^{l-1}, \ldots, x_{d_{l-1}}^{l-1}$), after the operation of the convolution layer (with learnable convolution kernels $k_{ij}^{l}$ and learnable bias $b_j^{l}$), the layer $l$ with depth $d_l$ (the feature maps are $x_1^{l}, \ldots, x_{d_l}^{l}$) is obtained. The calculation formula of each feature map is

$$x_j^{l} = f\left(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\right),$$

where $M_j$ is the set of input feature maps, $*$ is the convolution operation, and $f$ is the activation function.

For a CNN model, there is usually a pooling layer following the convolutional layer to reduce the feature size. After several convolutional layers and pooling layers, a flatten layer transforms the features into a vector, and then, the output is obtained through several fully connected layers.

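In this study, the hidden layer uses the linear rectification function (Rectified Linear Unit, ReLU) as the activation function, which has the advantage of fast convergence. For an input column vector $x = (x_1, \ldots, x_n)^{T}$, the function output is

$$\mathrm{ReLU}(x)_i = \max(0, x_i), \quad i = 1, \ldots, n.$$

In this paper, the output layer uses the SoftMax activation function to normalize the output into a probability distribution. For an input column vector $x = (x_1, \ldots, x_n)^{T}$, the function output is

$$\mathrm{SoftMax}(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}, \quad i = 1, \ldots, n.$$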
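The SoftMax normalization of the output layer, followed by picking the most probable class, can be sketched as below. Subtracting the maximum before exponentiating is a common numerical-stability trick and is an addition here, not something stated in the paper; the function names are illustrative.

```python
import math

def softmax(z):
    """Normalize raw scores into a probability distribution."""
    m = max(z)                               # shift for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def predict_class(z):
    """Index of the largest SoftMax probability."""
    probs = softmax(z)
    return max(range(len(probs)), key=probs.__getitem__)

probs = softmax([1.0, 2.0, 3.0])
```

The shift leaves the result unchanged mathematically but avoids overflow for large scores.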

The loss function evaluates the difference between the predicted output $\hat{y}$ and the true value $y$. In this paper, the cross-entropy loss function is used, and its function for an *n*-class classification problem is

$$L(y, \hat{y}) = -\sum_{i=1}^{n} y_i \log \hat{y}_i,$$

where $\hat{y}$ is the predicted output and $y$ is usually obtained by the one-hot encoding of the true class.
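For a one-hot target, the cross-entropy loss reduces to the negative log-probability assigned to the true class. The sketch below illustrates this; the small `eps` clamp that guards against `log(0)` is an implementation convenience assumed here, and the function names are illustrative.

```python
import math

def one_hot(index, n_classes):
    """One-hot encode a class index."""
    return [1.0 if i == index else 0.0 for i in range(n_classes)]

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# True class 1 predicted with probability 0.5 gives a loss of log(2).
loss = cross_entropy(one_hot(1, 3), [0.25, 0.5, 0.25])
```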

In this study, the Adam algorithm optimizes the model parameters. It has the advantages of the adaptive learning rate, fast convergence, and stable results. This article uses TensorFlow 2, a deep learning open-source tool released by Google, to build, train, and test the models.

##### 2.3. Dataset Construction

According to the overall algorithm design, there are two classifiers, and the cores of them are deep learning-based neural network models. They are trained on two datasets, named ECG_200 ms_POS_72 and ECG_RRR_TYPE_360, respectively.

The generation of the ECG_200 ms_POS_72 and ECG_RRR_TYPE_360 datasets from the MIT-BIH database is shown in Figure 3. To ensure the validity of the dataset, we should eliminate the relevant data in the superclass OTHER. Meanwhile, when intercepting ECG_RRR, a moving window with a length of 200 ms (72) and a step of 100 ms (36) splits the original data.

Figure 4 shows partial waveform diagrams of various ECG_RRR features. There are apparent differences in the morphological characteristics of various ECG_RRR features, which further proves the effectiveness of ECG_RRR features for arrhythmia classification.

This paper constructs the two datasets (ECG_200 ms_POS_72 and ECG_RRR_TYPE_360) after processing all 44 records. Table 2 describes the two constructed datasets. Table 3 shows the distribution of samples in the ECG_200 ms_POS_72 dataset, which contains a total of 794,376 samples (sample_A). Among them, 590,230 samples (sample_N) are negative samples with label 0, indicating that no QRS complex is included. The other 204,146 samples (sample_P) are positive samples with positive integer labels, meaning that a QRS complex is included. The positive label values are approximately uniformly distributed in [1, 71]: across the 71 label values, the average number of samples per label (P_ave) is 2875.30, the maximum (P_max) is 3019, the minimum (P_min) is 2784, and the variance (P_s) is 55.13. Table 4 shows the distribution of the samples in the ECG_RRR_TYPE_360 dataset, which contains a total of 97,898 records. Although Categories F (fusion beat) and Q (unknown beat) are underrepresented (the amount of data is small), these two categories are not the main classification targets of the classifier.

##### 2.4. QRS Position Detection

The QRS position detection subalgorithm marks the position of the QRS complex in the ECG signal so that the overall algorithm can segment the heartbeat. In this paper, the input of the QRS position detection subalgorithm is the 200 ms ECG segment with a length of 72, and the output is the QRS complex position (an integer in [1, 71]) or 0 (when the QRS complex is not included).

This subalgorithm consists of a 72-class FFNN-based model (named Model_QRS) and a QRS detection strategy.

The Model_QRS model is trained and evaluated on the DS1 group of the ECG_200 ms_POS_72 dataset, which is randomly divided into a training set (train set) and a validation set (valid set) with a ratio of 8 : 2. The test set consists of the data in the DS2 group of the ECG_200 ms_POS_72 dataset.

To reduce the computational cost and meet the requirements of edge computing equipment, the model should be as small as possible while still achieving good results. Therefore, grid search is employed to identify the optimal model for QRS position detection. All hidden layers are given the same size to simplify the search. As shown in Figure 5, the number of hidden layers varies from 1 to 8. Since the input feature size is 72, the candidate size of each hidden layer is {9, 18, 36, 72, 144}.

Sometimes there is a slight error between the marked points in the MIT-BIH database and the actual QRS complex position. In practice, a certain error in the predicted QRS complex position is also acceptable. Therefore, this article focuses on the accuracy (named acc2) with which the predicted QRS complex position falls within the tolerated error range (±5).
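One way to compute a tolerance-based accuracy such as acc2 is sketched below. The handling of label 0 (it must match exactly, because it means "no QRS complex") is an assumption, as is the function name; the paper does not spell out these details here.

```python
def tolerant_accuracy(y_true, y_pred, tol=5):
    """Fraction of predictions within +/- tol of the true position label.

    Assumption: label 0 ("no QRS complex") only counts as correct when both
    the true and the predicted label are 0.
    """
    hits = 0
    for t, p in zip(y_true, y_pred):
        if t == 0 or p == 0:
            hits += int(t == p)
        else:
            hits += int(abs(t - p) <= tol)
    return hits / len(y_true)

# Two hits (exact 0, and 14 vs 10), two misses (50 vs 40, missed beat at 60).
score = tolerant_accuracy([0, 10, 40, 60], [0, 14, 50, 0])
```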

The sizes of the input and output layers are both 72 (the predicted QRS complex position is the index of the maximum value in the output layer). After training the model with the cross-entropy loss, Adam optimizer, and early stopping strategy (if the loss of the model is less than 1e−5 for five consecutive epochs, training stops to prevent overfitting), we get the result of the grid search, which is shown in Figure 6. As can be seen from the figure, the accuracy of the model improves as the number and size of hidden layers increase, but when the size of each hidden layer is greater than 36 and the number of hidden layers is greater than three, the improvement of acc2 is not significant. Therefore, in this paper, an FFNN model with three hidden layers and 36 neurons in each hidden layer is selected as the model of Classifier 1 for QRS position detection. Figure 7 shows the structure of the final selected FFNN model. Table 5 shows the details of each layer of the model. The number of parameters in this model is 7956, the model size is about 32 kB, and the amount of computation required for the model is 16092 FLOPs (floating-point operations [39]). Figure 8 shows the accuracy and loss curves of the model training process.
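The size of the search grid and the reported parameter count can be checked with a few lines of arithmetic. The sketch below enumerates the 8 × 5 candidate architectures and counts weights plus biases for each; the function name is illustrative, and the 4 bytes per parameter (32-bit floats) used for the size estimate is an assumption.

```python
def ffnn_param_count(n_hidden_layers, hidden_size, n_in=72, n_out=72):
    """Weights plus biases of a fully connected network with equal hidden sizes."""
    sizes = [n_in] + [hidden_size] * n_hidden_layers + [n_out]
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))

# Grid from the paper: 1..8 hidden layers, hidden size in {9, 18, 36, 72, 144}.
grid = {
    (depth, width): ffnn_param_count(depth, width)
    for depth in range(1, 9)
    for width in (9, 18, 36, 72, 144)
}

selected = grid[(3, 36)]        # three hidden layers of 36 neurons
size_bytes = selected * 4       # assuming 32-bit floats
```

With three hidden layers of 36 neurons this gives 7956 parameters and roughly 32 kB, in line with the figures reported above.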

Finally, the acc2 on the test set is 96.8%.

The algorithm uses a strategy based on a step-window voting mechanism when the model is deployed in production to improve the accuracy, as shown in Figure 9:

1. The algorithm caches the latest ECG segment with a length of 144 and a duration of 400 ms and uniformly places windows of length 72 over it to obtain $N$ 200 ms ECG segments $x_1, \ldots, x_N$.
2. The algorithm inputs $x_1, \ldots, x_N$ to the FFNN model to get the $N$ QRS complex position results $y_1, \ldots, y_N$.
3. The final QRS position $p$ and the total votes $v$ are calculated according to $y_1, \ldots, y_N$. The prediction result of this QRS complex is retained only when $v$ is greater than a vote threshold $T_v$ and the distance of $p$ from the previous QRS complex is greater than a distance threshold $T_d$. The values of the thresholds $T_v$ and $T_d$ and the number of windows $N$ can be adjusted according to requirements.
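The voting steps above can be sketched as follows. This is one possible reading, not the authors' procedure: taking the median candidate as the final position, counting votes within ±5 samples of it, and the default values for the number of windows and both thresholds are all assumptions, and `predict` stands in for the trained FFNN.

```python
from statistics import median

def window_starts(buffer_len=144, window=72, n_windows=8):
    """Uniformly spaced start offsets of the sliding windows inside the buffer."""
    step = (buffer_len - window) / (n_windows - 1)
    return [round(k * step) for k in range(n_windows)]

def vote_qrs(buffer, predict, last_qrs=None, n_windows=8, window=72,
             min_votes=4, min_distance=72, tol=5):
    """Return (position, votes) in buffer coordinates, or None if rejected."""
    candidates = []
    for start in window_starts(len(buffer), window, n_windows):
        y = predict(buffer[start:start + window])   # 0 means "no QRS"
        if y > 0:
            candidates.append(start + y)
    if not candidates:
        return None
    position = round(median(candidates))             # assumed aggregation rule
    votes = sum(abs(c - position) <= tol for c in candidates)
    if votes <= min_votes:
        return None
    if last_qrs is not None and position - last_qrs <= min_distance:
        return None
    return position, votes

def stub_predict(segment):
    """Hypothetical stand-in for the FFNN: report the location of a clear peak."""
    peak = max(segment)
    return segment.index(peak) if peak > 0.5 else 0

demo_buffer = [0.0] * 144
demo_buffer[80] = 1.0                                # synthetic R peak
result = vote_qrs(demo_buffer, stub_predict)         # (80, 7)
```

Seven of the eight windows contain the synthetic peak, so the candidate collects seven votes and passes the vote threshold.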

In this paper, specific values are chosen for these parameters, and all 22 records of the DS2 group in the original data are used to evaluate Classifier 1 (for QRS position detection). The recall rate, precision rate, and overall accuracy of QRS complex position prediction are 98.0%, 99.5%, and 97.6%, respectively.

##### 2.5. Heartbeat Classification

The heartbeat classification subalgorithm uses the ECG_RRR feature. Since there are only 13 classes of arrhythmia in MIT-BIH, this subalgorithm is a 13-class arrhythmia classification model that can get 5-class arrhythmia classification according to the class hierarchy.
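Deriving the 5-class result from a 13-class prediction amounts to a lookup through the class hierarchy. The mapping below is an assumption: it follows the grouping of MIT-BIH annotation symbols commonly used with the AAMI superclasses, with the paced-beat classes left out, and should be checked against Table 1. The function name is illustrative.

```python
# Assumed hierarchy: MIT-BIH beat symbol -> AAMI superclass.
SUPERCLASS_OF = {
    "N": "N", "L": "N", "R": "N", "e": "N", "j": "N",   # normal and bundle branch block beats
    "A": "S", "a": "S", "J": "S", "S": "S",             # supraventricular ectopic beats
    "V": "V", "E": "V",                                 # ventricular ectopic beats
    "F": "F",                                           # fusion beat
    "Q": "Q",                                           # unknown beat
}

def to_superclass(symbol):
    """Map a 13-class prediction to its 5-class superclass."""
    return SUPERCLASS_OF[symbol]

predicted_13 = ["N", "L", "V", "A", "F"]
predicted_5 = [to_superclass(s) for s in predicted_13]
```

Because several subclasses collapse into one superclass, a 13-class error inside the same superclass does not count as a 5-class error, which is why the 5-class accuracy is higher.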

Since ECG_RRR features have relatively strong morphological characteristics, the core of this subalgorithm is a 13-class classifier based on the CNN model (named Model_TYPE), which uses the ECG_RRR_TYPE_360 dataset for training and evaluation. We use the DS1 group of the ECG_RRR_TYPE_360 dataset as training data, which is further randomly split into a training set (train set) and a validation set (valid set) with a ratio of 8 : 2. The test data comes from the DS2 group of the ECG_RRR_TYPE_360 dataset.

The input of the model is the ECG_RRR feature proposed in this paper, with a length of 360 and clear morphological characteristics. Therefore, the model is built as a one-dimensional CNN, and the results show that it performs well.

After considering the tradeoff between the size and accuracy of the model, we select the CNN-based structure shown in Figure 10. Table 6 shows the details of the model. The number of parameters in this model is 6273, the model size is about 25 kB, and the amount of computation required is 105678 FLOPs (floating-point operations [39]). The input of the model is the ECG_RRR feature with a depth of 1 and a length of 360. The model then uses a cascade of three one-dimensional convolutional layers and pooling layers to transform the data into features with depth 20 and length 6. Then, a flatten layer flattens the features to a vector of length 120. Finally, two fully connected layers of size 30 and one output layer produce an output of length 13, each element representing the probability of one of the 13 classes. The output with the highest probability is the model’s predicted class, and then, the 5-class prediction is obtained according to the class hierarchy.

To prevent overfitting, dropout is applied to the pooling layers and fully connected layers during model training and randomly disables 30% of neuron connections.

After training the model with the cross-entropy loss, Adam optimizer, and early stopping strategy (if the loss of the model is less than 1e−5 for five consecutive epochs, training stops to prevent overfitting), the accuracy and loss curves during the training process are shown in Figure 11. Since dropout is applied in the training process, the training loss is higher than the validation loss, and the training accuracy is lower than the validation accuracy.

Since the data in the DS1 and DS2 groups belong to different patients and the model only learns on 80% of the data in the DS1 group, the performance of the model on the DS2 group is an indicator of interpatient performance. The accuracy of 13-class classification is not good enough (only 77.0%), but the accuracy of 5-class classification according to the hierarchy is as high as 94.2%.

#### 3. Results and Discussion

In the experimental test, the algorithm first uses Classifier 1 to predict the QRS complex position, which assists in segmenting the ECG_RRR feature. It then uses Classifier 2 to predict the corresponding arrhythmia type. Finally, the QRS position predictions and the corresponding arrhythmia predictions are compared with the ground truth in the database to obtain the evaluation results.

This paper uses all the 44 patients’ MLII lead original ECG signals in Groups DS1 and DS2 of the MIT-BIH arrhythmia database to test the overall algorithm. Besides, the interpatient performance of the algorithm is obtained by testing on the data in the DS2 group, which is also an important evaluation metric in practice.

For each of these records, the classification pipeline is as follows. First, the ECG signal of the MLII lead is sliced by a 200 ms time window and passed into the algorithm chronologically. The algorithm then predicts the position of the QRS complex and the corresponding arrhythmia class in real time. After all data in this record has been entered, the predictions of the QRS complex position sequence (QRS_pred) and the corresponding heartbeat type sequence (TYPE_pred) are obtained. Meanwhile, the ground truth QRS complex position sequence (QRS_true) and the corresponding heartbeat type sequence (TYPE_true) are obtained by analyzing the marker information in this record. Finally, by comparing QRS_pred and TYPE_pred with QRS_true and TYPE_true, respectively, the corresponding confusion matrix and statistical matrix are obtained. When comparing QRS_true with QRS_pred, an error of ±5 is allowed.

The recall rate (*R*), precision rate (*P*), and overall accuracy (Acc) are the evaluation metrics of classification performance. For an *n*-class classification problem with classes $c_1, c_2, \ldots, c_n$, the calculation formulas of precision rate, recall rate, and overall accuracy are as follows:

$$P_i = \frac{n_{ii}}{n_{\cdot i}}, \tag{9}$$

$$R_i = \frac{n_{ii}}{n_{i \cdot}}, \tag{10}$$

$$\mathrm{Acc} = \frac{\sum_{i=1}^{n} n_{ii}}{\sum_{i=1}^{n} \sum_{j=1}^{n} n_{ij}}, \tag{11}$$

where $n_{ij}$ represents the number of samples whose true class is $c_i$ and predicted class is $c_j$, $n_{ii}$ represents the number of samples whose true class and predicted class are both $c_i$, $n_{i \cdot} = \sum_{j} n_{ij}$ represents the number of samples whose true class is $c_i$, and $n_{\cdot i} = \sum_{j} n_{ji}$ represents the number of samples whose predicted class is $c_i$.

For the real-time ECG arrhythmia classification in this paper, there is a specific case in which the prediction of the QRS complex position is incorrect. This article marks this specific case as class 0, and the quantity statistics are represented by $n_{0i}$ and $n_{i0}$, where $n_{0i}$ represents the number of samples whose predicted class is $c_i$ among the false-positive samples and $n_{i0}$ represents the number of samples whose true class is $c_i$ among the false-negative samples.
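The per-class precision and recall and the overall accuracy can be computed directly from a confusion matrix whose rows are true classes and whose columns are predicted classes. The sketch below uses a toy two-class matrix; the function name is illustrative, and returning 0.0 for an empty row or column is a convention chosen here.

```python
def classification_metrics(cm):
    """Per-class precision and recall plus overall accuracy.

    cm[i][j] is the number of samples with true class i and predicted class j.
    """
    n = len(cm)
    row_sums = [sum(cm[i]) for i in range(n)]                        # true-class totals
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]   # predicted-class totals
    precision = [cm[i][i] / col_sums[i] if col_sums[i] else 0.0 for i in range(n)]
    recall = [cm[i][i] / row_sums[i] if row_sums[i] else 0.0 for i in range(n)]
    accuracy = sum(cm[i][i] for i in range(n)) / sum(row_sums)
    return precision, recall, accuracy

toy_cm = [[8, 2],
          [1, 9]]
precision, recall, accuracy = classification_metrics(toy_cm)
```

The same function applies unchanged to the 6-class case once the extra class for incorrect QRS positions is added as row and column 0.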

For the 5-class arrhythmia classification, the results are recorded in the confusion matrix as Table 7 and the statistical table as Table 8.

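The recall rate, precision rate, and overall accuracy of N, S (SVEB), V (VEB), F, and Q are calculated according to the 6-class case of equations (9)–(11), in which the additional class 0 collects the incorrect QRS position predictions. Specifically, the numbers of false-positive (FP), false-negative (FN), and true-positive (TP) samples of QRS position prediction are calculated by

$$\mathrm{FP} = \sum_{i=1}^{5} n_{0i}, \quad \mathrm{FN} = \sum_{i=1}^{5} n_{i0}, \quad \mathrm{TP} = \sum_{i=1}^{5} \sum_{j=1}^{5} n_{ij},$$

where $n_{ij}$ is the number of samples whose true class is $i$ and predicted class is $j$.

The precision rate (*P*), recall rate (*R*), and overall accuracy (Acc) of QRS complex position prediction are calculated by

$$P = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \quad R = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \quad \mathrm{Acc} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}.$$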

After testing all records in the database, 44 confusion matrices are obtained. Aggregating them by group (DS1 and DS2) yields two result tables (Tables 9 and 10) and a statistics table (Table 11).

From Tables 9 and 10, it can be calculated that, for the DS1 group, the recall rate, precision rate, and overall accuracy are 98.3%, 99.6%, and 98.0%, respectively. For the DS2 group, they are 98.0%, 99.5%, and 97.6%, respectively.

Table 11 shows that, on the data in the MIT-BIH database, the overall accuracy for 5-class arrhythmia classification is 93.6%, and the interpatient accuracy is 91.5%. The statistics of each patient’s 5-class arrhythmia classification results in Group DS2 are shown in Table 12, and the distribution of overall accuracy is shown in Figure 12. As shown in the table, except for the extremely low accuracy of Patient 232 and the low accuracy of Patients 222, 219, and 213, the accuracies of the other 18 patients are all above 90%.

Table 13 compares the proposed algorithm with state-of-the-art methods that do not require expert intervention. It demonstrates that the proposed algorithm has an advantage in overall accuracy, although the recall rate and precision rate of the SVEB classification are lower than those of the other studies. The algorithm of this paper uses the heartbeat segmentation provided by Classifier 1 for arrhythmia classification, so its overall accuracy may be affected by heartbeat segmentation errors. However, many other studies simply segment heartbeats by the marked points in the database for arrhythmia classification. For a fair comparison, the results of Classifier 2 on the DS2 group in the ECG_RRR_TYPE_360 dataset are also listed (the proposed rows in the table), which give the interpatient arrhythmia classification performance using the heartbeat markers in the database.

#### 4. Conclusions

Real-time monitoring of ECG and intelligent diagnosis in daily life are of great significance to reduce the risk of cardiovascular disease. With the development of wearable ECG measurement technology and edge computing, a real-time arrhythmia diagnosis system combining the two is a solution in which a reliable real-time arrhythmia classification algorithm is the core. This paper introduces a novel arrhythmia classification model which performs with high accuracy and in real time. The model takes the raw ECG signal as input and segments it with a time window. Then, it detects the QRS complex positions with an FFNN-based model and extracts the time-domain feature (ECG_RRR) for another CNN-based model to predict the arrhythmia type. Experimental results show that our model performs very well on the MIT-BIH dataset. In addition, our model requires low computing power for real-time prediction, which is available on most desktop and mobile processors.

##### 4.1. Practicality

The input of the algorithm is the original ECG signal without complex feature extraction. Moreover, to adapt to different sampling rates, a sampling rate adaptation layer can be added in front of the algorithm in practice.

The core of the algorithm is deep learning, which is already very mature. There are many excellent deep learning frameworks for cloud computing or mobile devices, such as TensorFlow, Caffe, PyTorch, and MXNet, so the proposed algorithm can be easily deployed.

##### 4.2. Real Time

The cores of the algorithm are two deep learning-based classifiers with simple structure, high performance, and low computational cost.

The input of the algorithm is a 200 ms ECG segment. The latency of detecting the QRS complex position is 400 ms, and the latency of arrhythmia classification is just one heartbeat cycle. The numbers of parameters in the two models are 7956 and 6273, respectively, and the computational requirements are 16092 FLOPs and 105678 FLOPs, respectively. Considering the use of a step-window voting strategy to detect QRS complexes, the comprehensive computation is 0.235 MFLOPs. The device only needs 1.2 MFLOPS (FLOPS means Floating-Point Operations Per Second) of floating-point operation capability to meet the computing needs. At present, desktop-level CPUs and GPUs can reach the magnitude of GFLOPS or even TFLOPS, and mobile processors can reach hundreds of MFLOPS. Therefore, the computational requirement of the algorithm is not a problem.
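The throughput figure follows from the per-window cost and the window rate, as the short calculation below illustrates. The constants are the values reported above; the last line is only an inference about how many FFNN passes the per-window figure implies, under the assumption that it includes one pass of Classifier 2 per window.

```python
WINDOW_MS = 200                    # one new ECG segment every 200 ms
FLOPS_CLASSIFIER_1 = 16092         # one FFNN forward pass
FLOPS_CLASSIFIER_2 = 105678        # one CNN forward pass
FLOPS_PER_WINDOW = 0.235e6         # reported comprehensive computation per window

windows_per_second = 1000 / WINDOW_MS
required_flops_per_second = FLOPS_PER_WINDOW * windows_per_second

# Inference only: FFNN passes per window if one CNN pass is included in the total.
implied_ffnn_passes = (FLOPS_PER_WINDOW - FLOPS_CLASSIFIER_2) / FLOPS_CLASSIFIER_1
```

Five windows per second at 0.235 MFLOPs each give about 1.2 MFLOPS, matching the requirement stated above.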

##### 4.3. Effectiveness

The proposed algorithm includes two deep learning-based classifiers for QRS complex position detection and arrhythmia classification, respectively. Meanwhile, a strategy based on a step-window voting mechanism is proposed to improve QRS complex position prediction accuracy.

For interpatient performance, the recall rate, precision rate, and overall accuracy of the algorithm’s interpatient QRS complex position prediction are 98.0%, 99.5%, and 97.6%, respectively. The algorithm has overall accuracies of 91.5% and 75.6% for 5-class and 13-class arrhythmia classification, respectively.

#### Data Availability

The two datasets constructed in this study for training and evaluating the models are available from the corresponding author upon request, and the original real-world ECG data can be obtained from the MIT-BIH database (https://www.physionet.org/content/mitdb/1.0.0/).

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

#### Acknowledgments

This work was supported by Science and Technology Development Project of Jilin Province, China (20190303043SF), Science and Technology Development Project of Jilin Province, China (20200404205YY), and National Key Research and Development Program of China (2018YFF0300806-1).