Abstract

Machine learning is an expanding research area. One of its main applications is in the medical field, particularly the detection of epilepsy and epileptic seizures from electroencephalographic (EEG) signals. The aim is to design an intelligent framework that enables an immediate diagnosis of this disease without neurological consultation and thus helps save the lives of epileptic patients by detecting seizures and warning patients before they happen. However, as a real-time application, this kind of framework faces several challenges such as accuracy, fast response, and optimal memory usage. Our work was carried out within this context. We propose a new machine learning framework based on chaos and fractal theories. Two main novelties are presented in this paper. Firstly, we propose a new method for signal preprocessing in which we reconstruct new versions of the studied EEG signals using derivative determination and chaotic injection. Secondly, we suggest a new method for fractal analysis using the Higuchi fractal dimension (HFD). HFDs extracted from EEG derivatives lead to epilepsy detection, whereas HFDs extracted from EEG with a chaotic signal injection lead to seizure detection. In addition, feature fusion helped to linearize all classification problems. An experimental study using the Bonn EEG database demonstrates the efficiency of our contributions in comparison to published research. An accuracy of 100% was achieved in different classification cases using few features and a simple linear classifier.

1. Introduction

Epilepsy is a dangerous brain disorder that can put patients’ lives at high risk. EEG is the basic examination for epilepsy therapy since it gives significant results, is easy to perform, and is inexpensive. However, the interpretation of EEG traces is time consuming and requires neurologists. Over the past decades, technological progress has made it possible to record EEG in digital form for computer analysis. Since then, developing automatic systems for epilepsy detection and epileptic patient monitoring has become an active research field, with many challenges for real-time application. Beyond robustness and efficiency, such systems must address some key points. First, learning and prediction times must be reduced, which matters when the model needs to be updated and when a fast prediction of the patient’s state is needed in order to take the necessary precautions. Second, memory requirements impose data reduction and a careful choice of the data processing method. Moreover, EEGs are complex signals due to their irregularity, nonlinearity, and nonstationarity, and they are inevitably disturbed by noise. For all these reasons, developing machine learning models based on the EEG is a hard task. According to the literature, EEG processing goes through the following steps. The first is signal preprocessing for noise removal using different filtering methods. The second is feature extraction and selection, which reduces the data vector using the time domain or a transformed or decomposed version of the studied signal in other domains. The final step is classification, which is used to detect the patient’s brain state. Over the last century, many published studies have shown that the complexity of EEG signals is not random but rather chaotic with a fractal structure [1–3]. Consequently, chaos and fractal theories have been widely used to study EEG signals.
In [4], the authors extracted features by transforming the signal to the frequency domain with the fast Fourier transform (FFT) and then calculated the relative intensity ratio (RIR) for different frequencies. The fractal dimension was also calculated with the Petrosian (FDP) and HFD methods. Moreover, Hjorth parameters (HPs) were used to evaluate signal complexity. Sharma et al. [5] decomposed the EEG signal into 17 subbands using the analytic time-frequency flexible wavelet transform (ATFFWT) and extracted the HFD of each one. These features were then fed to a support vector machine (SVM) classifier. In [6], Sharma and Pachori decomposed the signal with the tunable Q-factor wavelet transform (TQFWT), extracted the HFD of the different subbands, and evaluated them using the least-squares support vector machine (LS-SVM). In 2019, Sharma et al. [7] proposed a novel approach using the biorthogonal wavelet transform (BOWT); they combined HFD with other nonlinear features (NLfeats) such as Shannon, Renyi, and fuzzy entropies (ShanEn, ReEn, and FuzEn) and energy (En) and classified signals with the SVM. Fractal and nonlinear analyses were also used by Zhang and Chen [8], who calculated HFD, ReEn, and the Hurst exponent (HE) from the EEG and extracted six temporal statistic features (TSFs) from the product function of the local mean decomposition (LMD); these features were classified with the SVM. The same classifier is used in [9] with multifractal analysis (MFA). Recently, Jiang et al. [10] decomposed the signal through the scattering transform (ScT) and then calculated the entropy by two methods, FuzEn and log energy entropy (LogEEn). The SVM was then used for classification.

According to this literature review, nonlinear analysis and fractal dimension are useful tools for characterizing EEG signals. However, all the cited papers [4–10] rely on signal decomposition methods that produce a high-dimensional feature vector, which is a drawback in real applications in terms of time consumption and memory usage. In our previous work [11], we proved the usefulness of derivatives for extracting information from the EEG. We extracted the logarithm of variance (LogVar) from the EEG signal as well as from its first two derivatives. Then, we used the kernel trick (KT) to linearize the feature space. The accuracy achieved in that paper is high, but the work is limited to solving only a part of the state-of-the-art classification problems defined on the same database.

In this work, we propose two new methods for building machine learning models based on fractal and chaos theories and on EEG signal derivatives. Features are extracted from transformed versions of the EEG signals. A chaotic signal injection allows seizure detection. In addition, a fractal study applied to the EEG and its derivatives, together with a chaotic injection into the extracted derivatives, makes it possible to identify healthy subjects and to detect epilepsy. Finally, a feature fusion using a nonlinear function is applied to the features extracted from all signal versions to solve all the classification cases studied in our work. Compared to other published works that use the same database, our contributions to EEG signal processing are of interest in terms of achieved accuracy, simplicity, and run time.

This paper is organized as follows: Section 2 presents the database used, together with several works carried out with the same objective, and introduces our proposed method. Section 3 reports the achieved results, and Section 4 discusses our work. Finally, Section 5 summarizes the paper with a conclusion and perspectives.

2. Materials and Methods

2.1. EEG Database

In our study, we use an open-source database published by Bonn University and described in [12]; it contains 5 subsets (Z, O, N, F, and S), and each subset contains 100 signals of 4097 samples. Z and O were recorded from healthy (H) subjects with open and closed eyes, respectively. N and F were recorded from epileptic (E) subjects in different brain zones. S was recorded during epileptic seizures (ES). This database has been used in many published works. In Table 1, we present a selection of the performed experiments (Exps), with the subsets used (U-Sub), that we will study later.

2.2. Proposed Machine Learning Model

The method proposed in this manuscript to detect a patient’s brain state is illustrated in Figure 1; it goes through five steps:
(i) Signal preprocessing: we reconstruct three new versions of the processed signal. These new time series are built from the studied EEG signal e, a chaotic signal c extracted from the logistic map and scaled by an amplitude factor, and the third derivative of e.
(ii) Feature extraction: we extract the Higuchi fractal dimension from the EEG signal, e, and from the three reconstructed versions.
(iii) Feature space reduction: we remove irrelevant features and select the most significant ones for solving the classification problems. Feature reduction is an important step because it reduces the memory needed to save the trained model. In our work, it also allows features to be presented in 2D and 3D spaces, which is helpful in the following steps.
(iv) Feature space linearization: features extracted from the four versions of the studied signal are used to reconstruct a new feature space based on the feature fusion approach. Since our goal is to classify three types of EEG signals (H, E, and ES), we divide the classification problems studied in our paper into three categories:
(a) One versus one (Exp 1, Exp 2, and Exp 3): to solve the binary classification problem using two types of signals. Features selected in the previous step are used to design the linear feature space using “feature fusion 1.”
(b) One versus all (Exp 4 and Exp 5): to solve the binary classification problem using three types of signals. We use a transformed version of the linear features selected from the reconstructed space in (a) to reduce the feature space in “feature fusion 2.”
(c) Multiclass (Exp 6): to solve the multiclass classification problem using three types of signals. The output of the previous step simplifies this problem based on “feature fusion 3.”
(v) Classification: through the previous steps, we managed to solve all the classification problems considered in our work with a linear classifier.
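
The preprocessing step (i) above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the injection is assumed to be additive (signal plus amplitude times chaotic sequence), the logistic map parameter `r = 4.0` and seed `x0 = 0.3` are arbitrary choices, the third derivative is approximated by a third-order forward difference, and all function names are hypothetical.

```python
def logistic_map(n, r=4.0, x0=0.3):
    """Return n samples of the logistic map x[t+1] = r * x[t] * (1 - x[t]).

    r and x0 are illustrative assumptions; the source does not state them.
    """
    x = x0
    samples = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        samples.append(x)
    return samples


def third_derivative(e):
    """Third-order forward difference, a discrete stand-in for the third derivative."""
    return [e[i + 3] - 3 * e[i + 2] + 3 * e[i + 1] - e[i] for i in range(len(e) - 3)]


def inject_chaos(signal, amplitude, r=4.0, x0=0.3):
    """Add a scaled logistic-map sequence to the signal (additive injection is assumed)."""
    c = logistic_map(len(signal), r=r, x0=x0)
    return [s + amplitude * ci for s, ci in zip(signal, c)]


def signal_versions(e, amplitude):
    """Return the original signal and three reconstructed versions (hypothetical names)."""
    d3 = third_derivative(e)
    return {
        "e": list(e),
        "e_chaos": inject_chaos(e, amplitude),
        "d3": d3,
        "d3_chaos": inject_chaos(d3, amplitude),
    }
```

Each of the four returned sequences could then be passed to a fractal dimension estimator in step (ii). The amplitude is left as a free parameter here because the text treats it as a quantity to be varied.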

2.3. Fractal Dimension

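EEG signal processing through the fractal dimension (FD) is widely used to detect many neurological disorders such as Alzheimer’s [13], mental disorders [4], and epilepsy [4–11]. FD can be calculated through different methods: the Katz algorithm, multiresolution box-counting, the Higuchi method [14], etc. The most used method in EEG signal processing is HFD [15]. HFD was first defined by Higuchi [14]; its determination follows these steps:
(i) From a time series $X(1), X(2), \ldots, X(N)$, where N is its length, we construct new time series $X_k^m$ calculated as
$$X_k^m:\; X(m),\, X(m+k),\, X(m+2k),\, \ldots,\, X\!\left(m+\left\lfloor \frac{N-m}{k}\right\rfloor k\right), \quad m = 1, 2, \ldots, k, \tag{4}$$
where m and k are two integers, and $\lfloor\cdot\rfloor$ is the integer part.
(ii) For each m and k, we calculate the curve length $L_m(k)$ as
$$L_m(k) = \frac{1}{k}\left[\left(\sum_{i=1}^{\lfloor (N-m)/k\rfloor}\left|X(m+ik)-X(m+(i-1)k)\right|\right)\frac{N-1}{\lfloor (N-m)/k\rfloor\, k}\right]. \tag{5}$$
(iii) The length $L(k)$ is defined by
$$L(k) = \frac{1}{k}\sum_{m=1}^{k} L_m(k). \tag{6}$$
(iv) Higuchi supposed that, for a fractal time series, $L(k)$ tends towards $k^{-f}$, where f is the Higuchi fractal dimension (HFD), so HFD is calculated as
$$\mathrm{HFD} = f = \frac{\ln L(k)}{\ln (1/k)}, \tag{7}$$
which is estimated in practice as the slope of the line fitted to $\ln L(k)$ versus $\ln(1/k)$ for $k = 1, \ldots, k_{\max}$.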

The main problem in using HFD is the determination of the $k_{\max}$ factor. As shown in equations (4)–(6), the number of program iterations depends on the chosen $k_{\max}$. In the paper in which Higuchi defined his algorithm, he did not give a theoretical rule for choosing an exact value of this factor. He only proposed to choose it through the linear regression of a log-log plot; this method is detailed in [4]. As a result, different values of $k_{\max}$ are found for the same signals; for example, it is chosen to be equal to 5 in [4, 7] and 128 in [5], for EEG signals extracted from the same database.
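
To make the role of the maximum delay concrete, the following is a minimal pure-Python sketch of the standard Higuchi algorithm, with the dimension estimated as the least-squares slope of ln L(k) against ln(1/k). The function name `higuchi_fd` and the input check are illustrative choices, and the white-noise test signal is synthetic rather than taken from the Bonn database.

```python
import math
import random


def higuchi_fd(x, k_max):
    """Higuchi fractal dimension of the sequence x using delays k = 1..k_max.

    Raises ValueError on a constant signal, because log(0) is undefined.
    """
    n = len(x)
    if k_max < 2 or 2 * k_max > n:
        raise ValueError("need 2 <= k_max <= len(x) / 2")
    log_inv_k, log_len = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(1, k + 1):  # 1-based start index, as in Higuchi's notation
            n_steps = (n - m) // k
            dist = sum(
                abs(x[m - 1 + i * k] - x[m - 1 + (i - 1) * k])
                for i in range(1, n_steps + 1)
            )
            # Normalised curve length L_m(k)
            lengths.append(dist * (n - 1) / (n_steps * k) / k)
        log_inv_k.append(math.log(1.0 / k))
        log_len.append(math.log(sum(lengths) / k))  # ln of the mean length L(k)
    # Least-squares slope of ln L(k) versus ln(1/k)
    mean_x = sum(log_inv_k) / k_max
    mean_y = sum(log_len) / k_max
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_inv_k, log_len))
    den = sum((a - mean_x) ** 2 for a in log_inv_k)
    return num / den


if __name__ == "__main__":
    rng = random.Random(0)
    noise = [rng.random() for _ in range(2000)]  # synthetic white noise
    for k_max in (5, 10, 50):
        print(k_max, round(higuchi_fd(noise, k_max), 3))
```

The two nested loops show why the running time grows with the maximum delay: every additional value of k adds k sub-series to traverse. A straight line gives a dimension of 1 and white noise a value close to 2, which is a convenient sanity check before applying the function to EEG data.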

2.4. Proposed Methodology
2.4.1. Chaos Injection

The evolution of HFD as a function of $k_{\max}$ for five EEG signals extracted from the different subsets (Z, O, N, F, and S) is shown in Figure 2. HFD increases with $k_{\max}$; we notice that the different curves pass through a critical point at which the growth changes from strong to weak. Chaos injection will be used in this work to detect signal irregularity.

We measured the HFD as a function of $k_{\max}$ for different values of the chaotic signal amplitude; a selection of 3 cases is shown in Figure 1. We notice that the HFD of signals recorded outside seizures (Z, O, N, and F) is highly modified, whereas that of the S signal is only weakly modified. Moreover, we note that the curve presents a new critical point, where the direction of change is modified for H cases and the rate of change is modified for E cases.

2.4.2. Derivative Exploitation

In our previous work [11], we showed that features extracted from the derivative of the EEG give significant results in epilepsy diagnosis. In this paper, the HFD of the third derivative of the EEG is calculated. In Figure 3(a), we show the HFD of e and (for ), for each subset. In Figure 3(b), we show the HFD of the derivatives, () and (), for the same sets. We note that the HFD based on chaos injection applied to the signal derivatives is highly modified for the ES signal.

As depicted in the previous two figures, the evolution of the fractal dimension as a function of $k_{\max}$ presents some critical points. In the remainder of this work, we calculate five HFDs of the signal under the different conditions summarized in Table 2.

3. Experimental Results

In this part, we randomly selected 50% of the signals in each subset for model training and the other 50% for testing.
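
A random 50/50 split of one subset can be sketched as below. The function name `split_half` and the fixed seed are assumptions made for reproducibility of the illustration; the source does not say how the random selection was seeded or implemented.

```python
import random


def split_half(signals, seed=0):
    """Randomly split one subset into two halves (training, testing).

    The seed is an illustrative assumption so that the split is repeatable.
    """
    rng = random.Random(seed)
    indices = list(range(len(signals)))
    rng.shuffle(indices)
    half = len(indices) // 2
    train = [signals[i] for i in indices[:half]]
    test = [signals[i] for i in indices[half:]]
    return train, test
```

Applying the function separately to each of the five subsets keeps the classes balanced, since every subset then contributes 50 signals to training and 50 to testing.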

3.1. Binary Classification Using Two Types of Signals

We used the five features introduced in Table 2 (HFD1, HFD2, HFD3, HFD4, and HFD5). In Table 3, we report the results of the classification of EEG signals using all the features, KNN (all_feat), and a subset of them, KNN (sel_feat).
(i) Exp 1: we found that HFD1, HFD2, and HFD4 are sufficient to solve this problem. This result is illustrated in Figure 4(a), where the separation between the classes is clearly visible. Moreover, we found that, for the two subproblems (ZO-N and ZO-F), the classification can be achieved with only two features.
(ii) Exp 2: HFD4 leads to a more accurate classification, as shown in Figure 4(b).
(iii) Exp 3: the best classification accuracy for seizure detection is achieved using three features (HFD2, HFD3, and HFD5). In Figure 4(c), we present this problem using all these features. Moreover, we find that, for the subproblems (N-S and F-S), two features are sufficient for the classification.

3.2. Binary and Three-Class Classification Using Three Types of Signals

In this part, we take advantage of the results obtained in the first three experiments. The features selected for each case in each Exp are listed in Table 4. We use the feature fusion (FF) presented in Table 4 to solve the following Exps.
(iv) Exp 4: we used HFD4 and a selection of , , and according to the need for classification, which is imposed by the subset combination requirements.
(v) Exp 5: to detect the H subject, for all the possible subset combinations, we used HFD4 and , and we designed a new feature (Feat) as mentioned in Table 4.
(vi) Exp 6: for the three-class classification cases, we exploit the FF done in Exp 4 and Exp 5 to represent the problem in the 2D space; the FF of this Exp is shown in Table 4.

All classification problems are assessed by a k-nearest neighbor (KNN) classifier using all features (KNN all) and a selection of features obtained with our proposed method (PML). The classification accuracy is reported in Table 5. For these cases, we have also reduced the feature space, which is advantageous for a real application, while increasing the classification accuracy. Our method presents several advantages. Firstly, we used a minimal number of features, varying between 1 and 5, to solve the whole classification problem using all possible combinations of subsets. In addition, the classification relies on several transformations applied to the features that reduce the dimension of the feature space; this allowed us to simplify the choice of the decision function that solves the classification problem. Moreover, the accuracy achieved in each Exp is high. Finally, we used features calculated with a simple algorithm; the calculation time of each feature is shown in Table 6. Experimental tests were performed on an Intel® Xeon Dual-Core W3550 at 3.06 GHz.
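
As a rough illustration of the KNN assessment on a low-dimensional feature space, the sketch below implements a plain k-nearest neighbor vote with Euclidean distance. The neighbor count `k = 3`, the function names, and the toy feature values are assumptions made for the example; they are not the settings or data reported in the tables.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of query by majority vote among its k nearest neighbours."""
    ranked = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_x, train_y, q, k) == y for q, y in zip(test_x, test_y)
    )
    return hits / len(test_y)


if __name__ == "__main__":
    # Made-up two-feature points, for illustration only (not values from the paper).
    train_x = [(1.0, 1.1), (1.1, 1.0), (1.2, 1.2), (1.9, 2.0), (2.0, 1.9), (2.1, 2.1)]
    train_y = ["H", "H", "H", "ES", "ES", "ES"]
    test_x = [(1.05, 1.05), (2.0, 2.0)]
    test_y = ["H", "ES"]
    print(knn_accuracy(train_x, train_y, test_x, test_y))
```

With only one to five features per signal, as in the experiments above, both the distance computation and the stored training set stay small, which is the practical benefit of the feature space reduction.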

4. Discussion

At this point, it is important to evaluate the performance of our method compared to other published works that use the same database. In Table 7, we present the accuracy of a selection of state-of-the-art works, indicating the machine learning methods, the number of used features, the combinations of subsets used (U-Sub), and the publication year (PY) of each reference (Ref).

According to these state-of-the-art papers, two main methodologies were used to achieve high accuracy in the experiments listed in Table 7. The first is fractal theory and nonlinear analysis for feature extraction, which are considered efficient tools for studying EEG signals in the cited papers, combined with various signal preprocessing methods. The scattering transform was used in [10] to extract 14 nonlinear features. A nonlinear analysis based on multifractal decomposition was used in [9] to extract 14 features. HFD was used to extract 17 features in [5] and 13 features in [6] from two different wavelet decompositions, ATFFWT and TQFWT, respectively. In addition to HFD, other nonlinear features were extracted from the EEG using other signal decomposition methods: FFT in [4] to extract 38 features, LMD in [8] to extract 9 features, and BOWT in [7] to extract 25 features. The second methodology is the use of the signal derivative in the preprocessing step, presented in our previous work [11], where we showed that derivatives contain important information about EEG signals. Using derivatives in the signal preprocessing step, we needed few features in all classification experiments, but those results are valid only for a limited combination of the used subsets. In this paper, HFD and derivative determination were combined to design an efficient machine learning model that surpasses the accuracies achieved in the cited works. Although we use a minimal number of features compared to other works, only in this work was an accuracy of 100% reached for some or all of the possible combinations of subsets in each Exp.

5. Conclusion

An innovative EEG signal processing method for the automatic detection of epilepsy and epileptic seizures was presented in this paper. We suggest a new method for signal preprocessing using EEG derivative determination and chaotic signal injection. In addition, we propose a new method for Higuchi fractal dimension determination. The experimental results proved the efficiency of our strategy: we achieved a high accuracy with few features and with a minimal execution time compared to other published works. Our future work will explore how to exploit the richness of the two theories, fractal and chaos, for analyzing EMG and ECG signals in order to propose an accurate epilepsy diagnostic system based on a multisensor system.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.