Abstract

Atrial fibrillation (AF) is the most common cardiac arrhythmia in clinical practice. It often starts with short, asymptomatic episodes, which are difficult to detect without the assistance of automatic monitoring tools. The vast majority of methods proposed for this purpose are based on quantifying the irregular ventricular response (i.e., RR series) during the arrhythmia. However, although AF totally alters the atrial activity (AA) reflected on the electrocardiogram (ECG), replacing stable P-waves by chaotic and time-variant fibrillatory waves, this information has still not been explored for automated screening of AF. Hence, a pioneering AF detector based on quantifying the variability over time of the AA morphological pattern is proposed here. Results from two public reference databases show that the proposed method outperforms current state-of-the-art algorithms, reporting accuracy higher than 90%. A lower false positive rate in the presence of arrhythmias other than AF was also noticed. Finally, the combination of this algorithm with the classical analysis of RR series variability also yielded a promising trade-off between AF accuracy and detection delay. Indeed, this combination provided accuracy similar to that of RR-based methods, but with a significantly shorter delay of 10 beats.

1. Introduction

Atrial fibrillation (AF) is nowadays the most common heart rhythm disturbance [1]. Its prevalence is closely related to age, thus rising notably among elderly people [2]. While 0.12–0.16% of population under 49 years suffer from this cardiac arrhythmia, this percentage increases to 10–17% for those aged 80 years or older [3]. Bearing in mind the fast expected growth of the elderly population, from 841 million in 2013 to more than 2000 million by 2050 [4], AF can be considered as an acute and burgeoning public health problem. Indeed, whereas this disease currently affects 8.8 million adults (over the age of 55) in the European Union, this population will roughly double by 2060 [5]. Similarly, 5.2 million Americans presented AF in 2010, but it is expected that the number of cases will exceed 12 million by 2030 [6].

Although this arrhythmia is not life-threatening in itself, it provokes hemodynamic alterations predisposing to the formation of blood clots within the atria [7] that, eventually, can travel to the brain, thus notably increasing the likelihood of triggering a critical stroke. In fact, AF patients present a fivefold risk of stroke and a twofold risk of death compared with healthy people of the same age [8]. Moreover, approximately 20% of all strokes occur in patients with this arrhythmia [3]. However, the pathophysiological mechanisms causing and maintaining AF are still not completely understood [9]. This fact makes its diagnosis and therapy extremely challenging and, often, inefficient [10]. Indeed, AF accounts for approximately one-third of hospitalizations for all cardiac rhythm disorders [11], thus requiring a significant part of the healthcare budget [12].

In this context, more research in AF prevention has recently been considered a priority and, hence, the early detection of the arrhythmia may provide interesting clinical benefits [13]. More precisely, AF usually starts with episodes as short as a few beats in length, but their frequency and duration often increase after some time [14]. Indeed, recent studies point out that between 18 and 25% of patients evolve to permanent AF in less than 5 years [3]. Currently, it is also clinically accepted that AF causes electrophysiological alterations in the atrial tissue that favor its maintenance [9]. Hence, the detection of AF signs as early as possible is essential to enable successful preventive therapies and, thus, to reduce its burden [15]. However, about 90% of initial arrhythmic episodes have proven to be asymptomatic [16] and, consequently, routine physical examination can only provide late diagnostic evidence [3]. To overcome this problem, the automatic identification of AF from continuous monitoring of the electrocardiogram (ECG) has been proposed [17]. Along similar lines, the use of automatic algorithms to detect the early appearance of AF after ischemic stroke has also been recently suggested as a key measure to reduce additional attacks [18]. In fact, although the presence of brief AF episodes has not been associated with an increased risk of clinical events in patients with pacemakers and implantable cardioverter defibrillators [19], it is an accepted marker of recurrent risk of stroke [18]. Considering these aspects, many authors have identified personal monitoring devices, massively developed in recent years, as extremely helpful tools for early identification of silent AF [18, 20].

A broad variety of algorithms to detect AF automatically can be found in the literature. The vast majority rely on quantifying the two significant alterations provoked by the arrhythmia on the ECG. Briefly, in contrast to sinus rhythm (SR), where the atria contract in response to a repetitive and synchronized electrical impulse originating at the sinus node, atrial contraction during AF is activated by very rapid and disorganized impulses generated at multiple locations [9]. This disorganized activation causes ineffective atrial contractions, such that the common P-waves in SR are replaced by irregular fibrillatory (f-) waves during AF [21]. Moreover, during the arrhythmia the atrioventricular (AV) node also transmits impulses to the ventricles more irregularly and quickly than in SR [22]. Because ventricular contractions are reflected on the ECG as QRS complexes, the series of temporal distances between R-peaks is then characterized by being quick and irregular during AF.

Given the high immunity to noise of the R-peaks, this latter ECG feature has been commonly exploited by most AF detectors proposed to date. Indeed, the RR interval series variability has been widely quantified from time, frequency, and complexity domains. Thus, some entropy-based indices, relying on RR time series, have reported the highest ability to discern AF from other rhythms (OR) [20]. However, these metrics have to be computed from time intervals with at least several dozens of beats, thus introducing a delay in the identification of AF and burying the detection of short episodes [23]. Bearing in mind that the occurrence of brief asymptomatic episodes often conforms the typical advent of AF [14] and that this fact has been closely associated with an elevated risk of thrombus formation [24] and ischemic stroke recurrence [18], this limitation involves a serious drawback in RR time series-based methods.

Precisely, the most recent works are putting special emphasis on palliating this issue [23, 25–30]. In fact, some authors have proposed the use of information from RR interval series irregularity in combination with features obtained from the atrial activity (AA), that is, the P- or f-waves [23, 28, 29]. This way, episodes as brief as a few beats in length have been successfully detected [23, 30]. A similar outcome has also been reported by some algorithms through the sole use of the AA morphological information [25–27]. Due to the lower signal-to-noise ratio (SNR) of the P- and f-waves in the ECG, with respect to the QRS complex, this kind of analysis has not received much attention in past years. However, the characterization of these waves via the stationary wavelet entropy (SWEn) has provided very useful information to identify AF, even when the arrhythmia does not present an irregular ventricular response [25–27]. Despite this promising outcome, the information carried within the variability in size, shape, and timing of the f-waves [21] has not been completely exploited to detect AF. Hence, the main goal of the present study is to analyze whether the morphological variability of the f-waves reflected on the ECG can improve the discrimination between AF and OR episodes. Thus, an algorithm commonly used to estimate the RR interval series regularity has been adapted to assess the time course variability of P- and f-waves characterized through SWEn.

The remainder of the manuscript is organized as follows. Section 2 describes the ECG recordings used to validate the proposed algorithm. Next, Section 3 outlines the preprocessing applied to these signals as well as how P- and f-waves are characterized and their variability computed. Classification results between AF and OR episodes are then introduced in Section 4 and discussed in Section 5. Finally, Section 6 presents the concluding remarks of this study.

2. Materials

Two freely available databases in PhysioNet [31] were used in this study. According to previous works [32, 33], the MIT-BIH AF Database (AFDB) was first considered to train the proposed algorithm. This dataset consists of a large number of AF and OR episodes and, therefore, robust and stable tuning parameters can be obtained. Moreover, the AFDB has been widely used to assess previous AF detectors [20], thus enabling an easy and fair comparison among methods. In short, 23 ECG recordings of 10 hours in length were collected from paroxysmal AF patients. They were acquired with a sampling rate of 250 Hz and 12-bit resolution over a range of ±10 mV. Also, note that more than one million and one hundred thousand beats were manually annotated into four different rhythms, including AF, atrial flutter, junctional rhythms, and OR.

On the other hand, the MIT-BIH Arrhythmia Database (ARRDB) was used to test the algorithm. Recordings from this group have sometimes been considered for validation of AF detectors [20], since they contain a wide variety of arrhythmias other than AF. Precisely, this database is formed by 48 short-term (30-minute) ECG recordings divided into two sets. The 100 series includes 23 subjects without AF, whereas the 200 series contains AF episodes and other rhythms, such as atrial and ventricular bigeminy, ventricular trigeminy, atrial flutter, and ventricular and supraventricular tachycardia. Although the signals were initially recorded with a sampling rate of 360 Hz, they were here downsampled to 250 Hz.

3. Methods

The proposed algorithm to discern between AF and OR episodes is graphically summarized in Figure 1. As can be seen, the ECG is first preprocessed and, then, the variability both in the morphological pattern of P- or f-waves and in the RR intervals is computed separately. Finally, the information gained from both paths is combined via a linear discriminant analysis (LDA) to assign a potential class to the signal. More details about each step are provided next.

3.1. Data Preprocessing

Two leads were available in all the ECG recordings, but only the one showing the largest P- and f-waves was analyzed. This lead was manually selected by visual inspection, because the databases contain no information about the leads acquired in each recording. Note that, although the proposed algorithm can work successfully from any lead, its performance improves as the SNR of the analyzed P- and f-waves increases [26, 27].

To improve further analysis of the selected signal, a first step of preprocessing was considered. Thus, baseline wander was removed by making use of an IIR high-pass filtering with 0.5 Hz of cut-off frequency [34]. Additionally, high-frequency noise and powerline interference were reduced through an IIR low-pass filtering with 50 Hz of cut-off frequency [35]. Both filters were designed by using a Chebyshev window with a relative sidelobe attenuation of 40 dB and applied in a forward/backward fashion. With the aim of detecting efficiently the R-peaks from the resulting signal, a phasor transform-based approach was used [36]. The method has been widely validated on several databases manually annotated by experts, thus providing values of sensitivity and positive predictivity greater than 99.65% and 99.70%, respectively. It was also able to deal indifferently with normal and ectopic beats, which is an interesting ability within the context of AF. Indeed, it is well known that the onset of paroxysmal AF is often preceded by atrial and ventricular premature complexes [37].

3.2. Morphological Characterization of P- and f-Waves

The application of nonlinear metrics to the surface ECG in AF has provided significant insights in recent years [38, 39]. Thus, with the aim of characterizing the morphological pattern of P- and f-waves, every single TQ interval was detected and then decomposed into the wavelet domain, as in previous works [26, 27]. Given the difficulty in accurately detecting the T-wave offset during AF [40], the TQ interval was selected as a window of varying size. Briefly, taking a reference point for each beat placed 50 ms before the R-peak, the TQ interval was detected as the preceding segment. Because its length is highly variable with the heart rate [41], it was adaptively selected as a quarter of the mean RR interval for the last five beats [27]. Additionally, to increase the typically low SNR of the P-wave and draw a clear distinction from the f-waves, consecutive TQ intervals were averaged. Thus, for every beat, the median TQ interval from a given number of preceding beats was obtained. Note that if the TQ intervals only contain clearly defined P-waves, their average will highlight this waveform [27]. Contrarily, when the TQ intervals contain f-waves, the averaged signal will result in a noise-like pattern, thus provoking more acute morphological variations over time [27]. Consequently, in addition to noise reduction, the averaging of consecutive TQ intervals also emphasized the stable or variable nature of P- and f-waves, respectively [26, 27]. In order to quantify the effect of this averaging on the variability shown by the morphology of P- and f-waves, numbers of averaged beats of 2, 5, 10, 15, 20, 30, 40, and 50 were considered.
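The windowing rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `tq_window` and `median_tq` are hypothetical, R-peak positions are assumed to be given in samples at 250 Hz, the 50 ms offset is truncated to whole samples, and segments of different lengths are assumed to be aligned at the reference point before taking the sample-wise median.

```python
from statistics import mean, median

def tq_window(r_peaks, beat, fs=250, offset_ms=50, n_rr=5):
    """Return (start, end) sample indices of the TQ window preceding `beat` (beat >= 1)."""
    # RR intervals (in samples) for up to the last n_rr beats
    first = max(1, beat - n_rr + 1)
    rr = [r_peaks[k] - r_peaks[k - 1] for k in range(first, beat + 1)]
    # Window length: a quarter of the mean RR interval
    length = round(mean(rr) / 4)
    # Reference point: offset_ms before the R-peak (truncated to whole samples)
    end = r_peaks[beat] - (offset_ms * fs) // 1000
    return end - length, end

def median_tq(segments):
    """Sample-wise median of TQ segments, aligned at their last sample (assumption)."""
    n = min(len(s) for s in segments)
    aligned = [s[-n:] for s in segments]
    return [median(col) for col in zip(*aligned)]
```

For a regular rhythm with R-peaks every 200 samples (0.8 s), the window is 50 samples long and ends 12 samples before the R-peak.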

Next, SWEn was used to characterize the median TQ interval for every beat, thus obtaining a SWEn time series with the same length as the RR interval series. This entropy-based metric quantifies morphological complexity by decomposing a waveform into different time-frequency scales and, then, computing Shannon entropy from their relative energy distributions [26]. While low values are obtained for extremely organized signals, such as P-waves, high ones are associated with disorganized waveforms, like f-waves [26]. Note that this index was computed using 4 decomposition levels and a sixth-order Daubechies wavelet function. Additionally, the index was normalized to report values between 0 and 1.
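The entropy step can be sketched as follows, assuming the wavelet decomposition has already been carried out elsewhere and its sub-band coefficients are available as plain lists. The function name `wavelet_entropy` is hypothetical, and dividing by the logarithm of the number of sub-bands is only one plausible way of normalizing to the range 0 to 1; the source does not state which normalization was used.

```python
import math

def wavelet_entropy(bands):
    """Normalized Shannon entropy of the relative energy across wavelet sub-bands."""
    # Energy of each sub-band
    energies = [sum(c * c for c in band) for band in bands]
    total = sum(energies)
    if total == 0 or len(bands) < 2:
        return 0.0
    # Relative energy distribution across scales
    probs = [e / total for e in energies]
    h = -sum(p * math.log(p) for p in probs if p > 0)
    # Assumed normalization: divide by the maximum attainable entropy
    return h / math.log(len(bands))
```

Energy concentrated in one scale gives a value of 0, whereas energy spread evenly over all scales gives a value of 1, in line with the organized versus disorganized interpretation given above.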

3.3. Variability of the TQ Interval Series

Single values of the SWEn time series have recently shown an ability of about 95% to discern between noise-free AF and SR beats [26]; however, its variability has still not been analyzed. With that aim, an algorithm similar to the coefficient of sample entropy (COSEn) has been adapted to work with this time series. COSEn was defined by Lake and Moorman to estimate short-term variability of the RR interval series and, thus, to discern AF and OR episodes from ECG segments of 12 beats [32]. This index is based on the sample entropy (SEn), which estimates irregularity in a time series by computing the repetitiveness of similar patterns. More precisely, given $N$ data points for a time series $x(1), x(2), \ldots, x(N)$, the first step to compute SEn is to form vectors of size $m$ samples, such that $X_m(i) = \{x(i), x(i+1), \ldots, x(i+m-1)\}$, for $1 \le i \le N-m+1$. Next, the maximum absolute distance between every pair of vectors is estimated as
$$d[X_m(i), X_m(j)] = \max_{k=0,\ldots,m-1} |x(i+k) - x(j+k)|,$$
such that they are considered similar if $d[X_m(i), X_m(j)]$ is lower than a tolerance $r$. Then, the number of vectors similar to $X_m(i)$, that is, $B_i$, is obtained by excluding self-matches and the average share for all vectors of length $m$ can be estimated as
$$B^m(r) = \frac{1}{N-m} \sum_{i=1}^{N-m} \frac{B_i}{N-m-1}.$$
Repeating the process for vectors of length $m+1$ to obtain $A^m(r)$, SEn can be computed as [42]
$$\mathrm{SEn}(m, r, N) = -\ln \frac{A^m(r)}{B^m(r)}.$$
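As an illustration of the SEn computation, the following minimal Python sketch counts template matches directly. The names `match_counts` and `sample_entropy` are hypothetical, and the same number of templates is used for both vector lengths, so that the normalizing factors cancel in the ratio.

```python
import math

def match_counts(x, m, r):
    """Count matching template pairs of length m+1 (a) and m (b), excluding self-matches."""
    n = len(x)
    a = b = 0
    for i in range(n - m):
        for j in range(i + 1, n - m):
            # Maximum absolute distance between the two length-m templates
            if max(abs(x[i + k] - x[j + k]) for k in range(m)) < r:
                b += 1
                # Extend the comparison by one sample for length m+1
                if abs(x[i + m] - x[j + m]) < r:
                    a += 1
    return a, b

def sample_entropy(x, m, r):
    """Sample entropy; infinite when no matches are found."""
    a, b = match_counts(x, m, r)
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)
```

A perfectly repetitive series gives a value of 0, and the estimate becomes undefined (here reported as infinity) when no matches of length m+1 exist, which is the situation that motivates an adaptive tolerance for very short series.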

A common difficulty when dealing with SEn is the selection of the tolerance $r$ to obtain reliable entropy estimates [43]. Although the recommendations proposed by Pincus [44] have been widely used in previous works, Lake [45] introduced a modification of SEn, called quadratic SEn (QSEn), which yields comparable entropy estimates regardless of $r$. This modification mainly consists of adding the term $\ln(2r)$ to SEn and is key to obtaining reliable SEn estimates from very short time series [32]. In this context, $r$ needs to be progressively increased until the number of similar patterns is large enough to obtain confident match counts for both vector lengths. Considering this aspect, $r$ was here obtained adaptively for each analyzed interval by starting from an initial value and increasing it by 5% until the number of matches was higher than a specific threshold. Following Lake and Moorman [32], this cut-off was experimentally obtained by analyzing a set of candidate values. Similarly, the initial value of $r$ was also chosen by considering values ranging from 1 to 10% in steps of 1%.
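The adaptive selection of the tolerance can be sketched as below. This is a hedged illustration rather than the published implementation: the names `match_counts` and `qsen_adaptive`, the parameter `min_matches`, and the iteration cap are assumptions, and the stopping rule is taken to be that the number of matches for the longer vectors exceeds the threshold.

```python
import math

def match_counts(x, m, r):
    """Count matching template pairs of length m+1 (a) and m (b), excluding self-matches."""
    n = len(x)
    a = b = 0
    for i in range(n - m):
        for j in range(i + 1, n - m):
            if max(abs(x[i + k] - x[j + k]) for k in range(m)) < r:
                b += 1
                if abs(x[i + m] - x[j + m]) < r:
                    a += 1
    return a, b

def qsen_adaptive(x, m, r0, min_matches, step=1.05, max_iter=200):
    """Return (QSEn, r), growing r by 5% per step until enough matches are found."""
    r = r0
    for _ in range(max_iter):
        a, b = match_counts(x, m, r)
        if a > min_matches:
            # Quadratic sample entropy: SEn plus ln(2r)
            return -math.log(a / b) + math.log(2 * r), r
        r *= step
    raise ValueError("no tolerance produced enough matches")
```

Because the term ln(2r) compensates for the tolerance actually used, estimates obtained with different final values of r remain comparable across windows.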

Another key aspect to reach accurate and confident QSEn estimates is the appropriate selection of the vector length $m$. In fact, similar patterns can be more easily found when $m$ is reduced [42]. As in previous works [32, 43], two values of $m$ were analyzed. Moreover, because this parameter is closely related to the data length, different window lengths, in beats, were also studied.

As a final step, because the mean value of the SWEn time series has previously revealed a promising ability to identify AF [26], the complementarity between this information and the time series regularity, estimated by QSEn, was explored through LDA. Results showed that the two parameters were independent AF detectors, thus providing a discriminant model where both presented very similar coefficients in magnitude and sign. Following Lake and Moorman’s philosophy [32], a simple new index TQEn was then defined as the plain sum of both parameters, that is, $\mathrm{TQEn} = \mathrm{QSEn} + \overline{\mathrm{SWEn}}$, where the overline denotes the mean of the SWEn time series over the analyzed window. Note that the LDA coefficients were discarded to simplify the model. As expected, and in the same way as in [32], this simplification did not significantly alter the classification outcomes.

3.4. Variability of the RR Interval Series

As a reference for the proposed index TQEn, COSEn was also obtained from the RR interval series. Thus, this metric was computed as defined in [32], with the computational parameters also taken from that work [32]. Regarding the analyzed data window, values of 5, 12, 15, and 30 beats were considered for a more thorough and fairer comparison with TQEn. Finally, complementary information provided by both TQEn and COSEn was also studied by means of LDA, as in previous analyses.

3.5. Performance Measures

The ability of TQEn, COSEn, and their LDA-based combination to discriminate between AF and OR episodes was assessed in terms of sensitivity (Se) and specificity (Sp). While the first parameter refers to the proportion of AF beats correctly classified, the second one is the percentage of OR beats properly identified. For the training AFDB, these metrics were computed through a receiver operating characteristic (ROC) curve. The ROC is a graphical representation of sensitivity and specificity for several cut-off points, such that the optimal threshold (Th) was chosen to maximize the proportion of total beats correctly classified, that is, the diagnostic accuracy (Acc). The cut-off points obtained in this way were then used to validate the three performance indices on the ARRDB.
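The threshold selection described above can be illustrated with a short Python sketch. The names `se_sp_acc` and `best_threshold` are hypothetical, a beat is assumed to be labeled as AF when its index value exceeds the threshold, and both classes are assumed to be present in the data.

```python
def se_sp_acc(scores, labels, th):
    """Sensitivity, specificity, and accuracy when beats with score > th are called AF."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s > th)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s <= th)
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg, (tp + tn) / len(labels)

def best_threshold(scores, labels):
    """Pick the observed score that maximizes diagnostic accuracy."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda th: se_sp_acc(scores, labels, th)[2])
```

The threshold found on the training data would then be reused unchanged on the test data, mirroring the validation procedure on the ARRDB.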

On the other hand, given the previously described importance of detecting brief AF episodes, the delay introduced by the method in detecting the transition between two different rhythms was also studied. Note that the optimal thresholds obtained during the training stage were used later to assess this parameter on both the AFDB and ARRDB.

4. Results

4.1. Training with the AFDB

As described before, computation of TQEn depends on several parameters totally interconnected. Hence, because simultaneous experiments for their joint optimization are impossible, each parameter was separately tuned and typical values were considered for the remaining ones. To this respect, Figure 2(a) shows the diagnostic accuracy of TQEn as a function of for , , beats, and beats. As can be seen, only tiny differences were noticed, although the best distinction between AF and OR episodes was reported for . As a consequence, this value of was selected for the remaining analyses. Regarding the threshold , Figure 2(b) displays the discriminant ability of TQEn for , , beats, and beats. Again, apart from , no significant differences were observed. Thus, to reduce computational burden of TQEn as much as possible, was chosen. As a last step, to obtain reliable estimates of QSEn, was also optimized. For each value of , the ratios of patterns with no template matches (i.e., ) and all matches (i.e., ) are displayed in Figure 2(c), along with the average values of QSEn and the diagnostic accuracy of TQEn. According to Lake and Moorman [32], the optimal value of must be taken as the cut-off point where average QSEn is approximately 0.5 and the ratios and are similar. A value of was then selected, as illustrated in Figure 2(c) with the grey shaded band.

With regard to the number of averaged beats to obtain the median TQ interval, Figure 2(d) displays the diagnostic accuracy of TQEn for beats, as well as of its two components, that is, and . As can be seen, these two latter metrics presented a very similar behavior for beats, but an opposite trend for larger values of . Thus, whereas the accuracy of QSEn decreased notably for beats, the one of displayed a slight increase. These variations were also reflected on the behavior of TQEn, which presented the best classification outcome for beats. Although differences in diagnostic accuracy lower than 2% were only observed between values of and beats, the detection delay rose notably from about 9 to 14 beats. Anyway, both values of were considered to study variability differences in the morphology of P- and f-waves as a function of .

To this last respect, Figure 2(e) shows how the discriminant ability of TQEn increased for both values of (5 and 15 beats) when longer data windows were analyzed. As expected, increasing delays in the detection of AF and OR episodes as a function of were also noticed. Having this result in mind, a reasonable trade-off between accuracy and delay was beats. Indeed, for longer values of , a limited increase in diagnostic accuracy lower than 1.5% was observed, regardless of . Nonetheless, for a thorough comparison between TQEn and COSEn, Table 1 displays the classification outcomes for both indices computed with different values of and . As expected, the accuracy increased for larger values of at the cost of having a longer detection delay. Furthermore, COSEn provided a slightly better classification result with a shorter detection delay for the same value of .

Aimed at studying how the distributions of TQEn and COSEn are spread in the training database, Figure 3 shows their representation for AF and OR episodes. As can be observed, very reduced differences between the distributions associated with TQEn for different values of and were noticed. Contrarily, a notably higher dissimilarity between the distributions of COSEn for AF and OR episodes can be seen as increases.

Finally, Table 2 presents the classification results for the obtained LDA-based combination of TQEn and COSEn. Compared with the diagnostic accuracy reported by each single index, the discriminant model showed improvements between 0.5 and 3%, with completely balanced values of sensitivity and specificity. It is also remarkable that the delay for this classifier always presented values between TQEn and COSEn.

4.2. Validation on the ARRDB

Making use of the optimal decision-making thresholds obtained with the AFDB, the classification results computed on the ARRDB are presented in Table 3. In general terms, both TQEn and COSEn reported a lower discriminant ability than for the training dataset. More precisely, regardless of and , the diagnostic accuracy decreased about 5% for TQEn and more than 15% for COSEn. This finding agrees with a notably higher overlapping between the distributions of AF and OR episodes for COSEn than for TQEn, such as can be observed in Figure 4.

Similarly, Table 4 shows that the proposed LDA-based combination of TQEn and COSEn also provided a lower discriminant ability in this validation stage than for the AFDB. Indeed, a decrease between 5 and 10% can be noticed for the different values of and . Additionally, improvements in diagnostic accuracy about 1% were only reached in comparison with TQEn.

5. Discussion

To the best of our knowledge, this work introduces for the first time the idea of quantifying the variability of the TQ interval morphology for automated screening of AF. Information obtained in this way has provided a slightly better discriminant ability than the previously studied mean value of SWEn [26, 27], whenever the number of averaged TQ intervals was lower than 30 (see Figure 2(d)). Moreover, in accordance with the more variable morphology in size, shape, and timing presented by f-waves than by P-waves [21], higher variability was reported by TQEn for AF than for OR episodes (see Figures 3 and 4). Note that, although it is well known that the morphology of f-waves may evolve over time [54], this finding held for most AF episodes regardless of their duration. Contrarily, when more than 30 TQ intervals were averaged, the median TQ interval tended to be a zero signal during AF and, then, its variability failed to be informative about the patient’s rhythm. Anyway, it is interesting to remark that the combination of QSEn and SWEn into TQEn always improved their individual discriminant ability by about 5% (see Figure 2(d)). Hence, this novel index has been able to reach values of accuracy higher than 93 and 85% for training and testing databases, respectively (see Tables 1 and 3).

These classification outcomes for AF and OR episodes have exceeded those reported by all previously proposed AF detectors based on estimating the presence of P-waves [25–27, 55]. Indeed, although Ladavich and Ghoraani have presented an algorithm with a high sensitivity of about 98%, its specificity is considerably lower (around 91%), thus introducing a significant rate of false positives [25]. Moreover, that method also presents a serious limitation compared with the proposed TQEn. More precisely, it requires an initial long-term training (about 35 minutes) for every recording under study before being able to identify AF episodes [25]. Consequently, due to the lack of enough SR intervals, that algorithm is not applicable to patients with persistent or permanent AF, or with other cardiac diseases which are present during the whole recording time. On the contrary, TQEn can be applied without additional training to any kind of patient and recording. Indeed, once the algorithm was trained with the AFDB, the obtained optimal threshold can be used to discern blindly between any AF and OR episodes from any database.

Compared with COSEn, TQEn has also reported a similar discriminant ability for the AFDB. Indeed, for comparable delays, the diagnostic accuracy of TQEn was only about 1% lower than that of COSEn (see Table 1). However, TQEn presents some additional and interesting advantages. On the one hand, it is less sensitive to the presence of ectopic beats than COSEn and most RR-based AF detectors. Thus, whereas the RR interval series is completely altered by the premature occurrence of both ventricular and atrial beats [23, 47], this ectopic activity only slightly modifies the median TQ interval. As an example, Figure 5(a) shows an excerpt of SR where numerous ventricular ectopics render COSEn classification inaccurate, but they do not alter the successful performance of TQEn. In agreement with this result, many previous works have also pointed out a loss of effectiveness of RR-based detectors in the presence of ectopic beats [56]. Hence, it is not surprising that these abnormal beats have often been removed before quantifying the RR interval series regularity [33, 47]. On the other hand, most RR-based algorithms, including COSEn, fail to detect AF episodes presenting a regular ventricular response [25]. This is frequent when AF is accompanied by AV block, as well as ventricular and AV junctional tachycardia [25, 52]. Moreover, the incidence of AF is about 50% in patients with a paced ventricular rhythm over 2 years [57]. A negative outcome is also obtained when these algorithms deal with OR episodes showing irregular RR interval series, such as sinus arrhythmia [52]. For instance, Figure 5(b) shows an SR interval with an irregular ventricular response that was wrongly classified as AF by COSEn, but correctly identified by TQEn.

Nonetheless, TQEn also presents some disadvantages with regard to the RR-based methods. Its most serious limitation is a higher sensitivity to noise. Whereas a significantly strong noise is required to mask the R-peak in the ECG, the TQ interval can be more easily disturbed by mild interference, because P- and f-waves display the most limited SNR in the ECG [21]. More precisely, the f-waves often present lower amplitude than P-waves, but this aspect is not a concern for TQEn as long as the waves are not completely masked by noise [27]. Nonetheless, to palliate this problem, morphological analysis of the median TQ interval computed from several beats has been recently proposed [26, 27]. Given that this approach could reduce the morphological variability among successive TQ intervals, different values of and were tested. The results shown in Figure 2(d) suggest that averaging a limited number of beats does not significantly shrink the variability in morphology of the TQ interval, thus improving its discriminant ability for automated screening of AF. However, as aforementioned, when the TQ interval was completely masked by noise, it was impossible to discern the presence of P- or f-waves [27]. In this respect, Figure 5(c) displays how TQEn incorrectly classifies an SR segment when noise fully obscures the TQ interval. Contrarily, because the R-peaks are still visible, COSEn is able to obtain a successful outcome.

On the other hand, the need to identify the TQ interval can also lead to some problems for TQEn. In this respect, in patients with a rapid ventricular response during AF, the TQ interval length can be very limited. However, because normal P-waves have a duration of about 100 ms [58], TQEn is able to work successfully even with a TQ interval as short as this length [27]. Indeed, whereas the algorithm provided accuracy about 90% for heart rates higher than 150 bpm, its performance was reduced to 77% for 160 bpm. However, in this last case, only three episodes were available for analysis in the databases, thus indicating how rarely this situation occurs. Another limitation related to the TQ interval detection is the use of a reference point 50 ms before the R-peak. Although this temporal distance is sufficiently long to exclude the Q-wave from the TQ interval for most rhythms, this is not the case for patients with bundle branch block or ventricular pacing. Nonetheless, this aspect has only a limited impact on TQEn, as long as the part of the Q-wave included in the TQ interval does not notably alter its morphology through time. Indeed, taking a reference point 30 ms before the R-peak, only a decrease lower than 1% in the diagnostic accuracy of TQEn has been noticed for both the AFDB and ARRDB. Additionally, more than 92% of paced ventricular beats included in the ARRDB have also been correctly classified by the algorithm.

Bearing the described advantages and drawbacks of TQEn and COSEn in mind, it is not surprising that their LDA-based combination has provided the best classification outcomes for automated screening of AF. Nonetheless, it must be noted that the improvement in diagnostic accuracy reached by the discriminant model was notably higher for the AFDB than for the ARRDB. The loss of effectiveness of TQEn and, particularly, COSEn in the latter database could justify this outcome. Even so, the combination of TQEn and COSEn has presented a better trade-off between discriminant ability and detection delay than most previously published detectors of AF, as Table 5 shows. More precisely, this algorithm, working with a delay lower than 10 beats, has reported the best classification outcome, for both training and test databases, in comparison with other methods featuring a similar delay. Additionally, its performance was only 1% lower than other methods presenting delays of about 100 beats, even though the algorithm introduces a maximum delay of about 20 beats.

It is worth remarking that a detection delay as short as possible is essential for an AF detector, because episodes shorter than this delay cannot be detected. Clearly, the longer the delay, the later the identification of AF, which reduces its clinical usefulness. Additionally, when most brief AF episodes go undetected, only an imprecise AF burden can be computed, which could have a negative impact on further management of the patient [15]. In view of the obtained outcomes, this is not the case for the proposed combination of TQEn and COSEn. Indeed, whereas the AF burden from manual annotations in the AFDB and ARRDB was 44.73% and 10.75%, respectively, the proposed algorithm estimated values of 44.01% and 9.98%.
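The effect of the detection delay on the AF burden can be illustrated with a small sketch. It assumes that the burden is computed as the percentage of beats labelled as AF (a time-based definition is equally possible), and it mimics a detector with a given delay by simply discarding every AF run shorter than that delay. The helper names and the toy label sequence are hypothetical.

```python
def af_burden(labels):
    """Percentage of beats labelled as AF (True) in a per-beat sequence."""
    labels = [bool(v) for v in labels]
    if not labels:
        raise ValueError("at least one beat is required")
    return 100.0 * sum(labels) / len(labels)


def drop_short_episodes(labels, min_len):
    """Relabel as non-AF every AF run shorter than min_len beats.

    Crude model of a detector that cannot see episodes shorter than its delay.
    """
    out = [bool(v) for v in labels]
    i = 0
    while i < len(out):
        if out[i]:
            j = i
            while j < len(out) and out[j]:
                j += 1  # advance to the end of the current AF run
            if j - i < min_len:
                out[i:j] = [False] * (j - i)
            i = j
        else:
            i += 1
    return out


# Toy annotation: one 8-beat and one 40-beat AF episode within 200 beats
reference = [False] * 50 + [True] * 8 + [False] * 20 + [True] * 40 + [False] * 82
print(af_burden(reference))                            # 24.0
print(af_burden(drop_short_episodes(reference, 10)))   # 20.0
print(af_burden(drop_short_episodes(reference, 100)))  # 0.0
```

In this toy sequence, a 10-beat delay misses only the 8-beat episode, whereas a 100-beat delay misses both episodes and the burden collapses to zero.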

Despite these promising results, a point deserving special attention is the remarkable difference in diagnostic accuracy reported by TQEn and COSEn between the training and testing datasets (see Tables 1 and 3). A possible explanation lies in the fact that the AFDB contains mainly AF and SR episodes, whereas the ARRDB presents a considerable number of atrial and ventricular arrhythmias other than AF. Some of these arrhythmias are associated with an irregular ventricular response, and their presence makes the identification of AF more challenging [23, 47]. Along this line, a recent work has shown that bigeminy suppression is able to improve RR-based identification of AF [23]. Moreover, compared with the distributions of TQEn and COSEn for the AFDB displayed in Figure 3, a markedly greater overlap between AF and OR episodes can be seen for the ARRDB in Figure 4. Interestingly, this overlap is appreciably larger for COSEn than for TQEn, which agrees with a larger decrease in its discriminant ability (about 17% for COSEn and 6% for TQEn). As a consequence, TQEn seems to be less sensitive than COSEn to the presence of arrhythmias other than AF.

Other works dealing with the ARRDB have also reported notably lower diagnostic accuracy than for the AFDB, as Table 5 shows. Although a considerably smaller performance reduction can be noticed for these methods than for COSEn, they made use of much longer data windows. It is reasonable to think that the shorter the analyzed data window, the higher the impact of every transitory alteration of the RR interval series on the identification of AF. In this respect, TQEn and COSEn have also shown a rising discriminant ability for growing values of and , as Table 3 displays.

Considering the aforementioned differences between the studied databases, their joint use for the validation of any AF detector is highly interesting. In fact, the AFDB contains a balanced number of AF and OR beats, thus providing robust training for further classification of other blinded rhythms. It is worth noting that if an unbalanced number of both kinds of beats were used to train the algorithm, a significant bias could be introduced towards detecting the predominant rhythm. On the other hand, the ARRDB only presents an AF burden of about 11%, with more than 75% of the episodes shorter than 100 beats, thus depicting a realistic scenario in which every automatic AF detector will have to work. In fact, these algorithms will have to be mainly applied to patients at high risk of suffering from AF, who may only present a few brief AF episodes. Therefore, this database provides a robust and realistic testing context for any AF detector. Hence, considering that the LDA-based combination of TQEn and COSEn has been validated on more than 1,250,000 beats from these databases and, moreover, has reached equal or higher diagnostic accuracy than other previously proposed methods, this algorithm can be considered sufficiently general to provide a similar classification outcome when applied to other datasets.

Finally, a limitation of the proposed combination of TQEn and COSEn is the need to compute three entropy-based indices to classify each beat as AF or OR. Although the computational cost of QSEn is high, some approaches have been proposed to accelerate its computation, particularly for short segments [59, 60]. Additionally, some previous works have also shown that the wavelet transform, along with additional extensive processing, can be run in real time [61, 62], thus enabling the use of TQEn and its combination with COSEn in continuous monitoring of patients at risk of AF.
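To give an idea of where this cost comes from, the sketch below computes sample entropy by brute force and then applies the commonly cited quadratic correction, QSEn = SampEn + ln(2r). It is a direct, unoptimized version rather than one of the accelerated approaches in [59, 60]; the function names, the default values of `m` and `r`, and the use of an absolute tolerance `r` are assumptions of this example.

```python
import numpy as np


def sample_entropy(x, m=2, r=0.2):
    """Brute-force sample entropy with absolute tolerance r (Chebyshev distance)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")

    def count_matches(length):
        # The same n - m starting points are used for both template lengths
        templates = np.array([x[i:i + length] for i in range(n - m)])
        # Chebyshev distance between every pair of templates: O(n^2) memory/time
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Count unordered pairs i < j within tolerance (self-matches excluded)
        upper = np.triu_indices(n - m, k=1)
        return np.count_nonzero(dist[upper] <= r)

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined when no matches are found
    return float(-np.log(a / b))


def quadratic_sample_entropy(x, m=2, r=0.2):
    """Sample entropy normalized by the tolerance width: SampEn + ln(2r)."""
    return sample_entropy(x, m, r) + float(np.log(2.0 * r))


rng = np.random.default_rng(0)
regular = np.tile([0.0, 1.0], 25)   # strictly periodic series
irregular = rng.normal(size=200)    # white noise
print(quadratic_sample_entropy(regular), quadratic_sample_entropy(irregular))
```

The pairwise distance matrix makes the cost grow quadratically with the segment length, which is why short analysis windows and accelerated algorithms matter for continuous monitoring.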

6. Conclusions

A novel study of the morphological variability of the TQ interval, obtained from the surface ECG, has been introduced for automated screening of AF. The application of quadratic sample entropy to the time series generated by quantifying the TQ interval morphology via the stationary wavelet transform has proven to be a more reliable AF screener than previous methods, especially in the presence of ectopic beats and of other arrhythmias. Additionally, the combination of this algorithm with the traditional RR series variability has also reported an interesting trade-off between diagnostic accuracy and detection delay. Indeed, classification results comparable to those of well-established AF detectors have been obtained with the advantage of a much shorter detection delay. This work offers new insights into the challenging problem of prompt screening of brief asymptomatic AF episodes, which are the most common type of episode in the earliest stages of AF.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This work was supported by the Spanish Ministry of Economy and Competitiveness (Project TEC2014-52250-R).