Deep and Transfer Learning Approaches for Complex Data Analysis in the Industry 4.0 Era
Recognition of Imbalanced Epileptic EEG Signals by a Graph-Based Extreme Learning Machine
Epileptic EEG signal recognition is an important method for epilepsy detection. In essence, epileptic EEG signal recognition is a typical imbalanced classification task. However, traditional machine learning methods used for imbalanced epileptic EEG signal recognition face many challenges: (1) traditional machine learning methods often ignore the imbalance of epileptic EEG signals, which leads to misclassification of positive samples and may cause serious consequences and (2) the existing imbalanced classification methods ignore the interrelationship between samples, resulting in poor classification performance. To overcome these challenges, a graph-based extreme learning machine method (G-ELM) is proposed for imbalanced epileptic EEG signal recognition. The proposed method uses graph theory to construct a relationship graph of samples according to data distribution. Then, a model combining the relationship graph and ELM is constructed; it inherits the rapid learning and good generalization capabilities of ELM and improves the classification performance. Experiments on a real imbalanced epileptic EEG dataset demonstrated the effectiveness and applicability of the proposed method.
Epilepsy is a common neurological disease that can cause recurrent seizures. During seizures, injury or life-threatening events may occur owing to the distraction or involuntary spasms of the patient [1, 2]. In the clinical diagnosis of various seizures, electroencephalogram (EEG) signal detection plays a crucial role, because the epileptic brain releases characteristic waves during seizures. In recent years, an increasing number of machine learning-based methods have been applied for epileptic EEG signal recognition [4–8]. Figure 1 illustrates a machine learning-based system for epileptic EEG signal recognition. As the figure shows, an epileptic EEG signal recognition system involves three main steps: (1) a feature extraction method is applied to the original epileptic EEG signals for training and testing, (2) the extracted training features are used to train the machine learning-based model that constitutes the epileptic EEG signal recognition system, and (3) the extracted testing features are then inputted into the epileptic EEG signal recognition system for detection.
Previously, many machine learning methods have been proposed for epileptic EEG signal recognition, such as the naive Bayes method (NB), k-nearest neighbor (KNN), support vector machine (SVM), fuzzy systems [12, 13], and the extreme learning machine (ELM) [14, 15], and they have shown good effectiveness. In essence, epileptic EEG signal recognition is a typical imbalanced classification task [16, 17]. Compared with negative samples (people without epilepsy), positive samples (patients with epilepsy) have extremely low representation and cannot be well classified by traditional classifiers. Although the misclassification of positive samples has little effect on model accuracy, it may cause serious medical malpractice. Therefore, traditional machine learning methods face several critical challenges in recognizing imbalanced epileptic EEG signals: (1) traditional machine learning methods often ignore the imbalance of epileptic EEG signals and misclassify positive samples, which may cause serious medical malpractice, and (2) existing imbalanced classification methods ignore the interrelationship between samples, resulting in poor classification performance. Therefore, building a classifier that considers both the imbalance of the epileptic EEG signals and additional knowledge of the samples becomes imperative for the classification of imbalanced epileptic EEG datasets.
To overcome these challenges, a novel imbalanced epileptic EEG signal recognition method based on a graph and ELM is proposed in this study. ELM has become a classical machine learning method owing to its solid theoretical foundations, fast training speed, and good predictive performance [18, 19]. Although ELM can universally approximate any continuous function, it is not effective for classifying imbalanced datasets. Therefore, it is necessary to adopt strategies that make ELM correctly classify positive samples to obtain a reasonable classification result on an imbalanced dataset. Previously, numerous imbalanced ELM-based methods have been proposed. For example, Zong et al. proposed the weighted extreme learning machine (WELM), which pioneered the application of ELM in imbalanced classification. Similarly, Zhang and Ji proposed a fuzzy ELM (FELM), which regulated the distributions of penalty factors by inserting a fuzzy matrix. Yu et al. proposed a special cost-sensitive ELM (ODOC-ELM) for imbalanced classification problems. Li et al. proposed an ensemble WELM algorithm based on the AdaBoost framework to learn the weights of different samples adaptively. Yang et al. proposed a novel ELM-based imbalanced classification method by estimating the probability density distributions of two imbalanced classes. Shukla and Yadav combined CC-ELM with WELM to propose a regularized weighted CC-ELM. Xiao et al. proposed an imbalanced ELM-based algorithm for two-class classification tasks by solving the classification error of each class. Du et al. proposed an online sequential extreme learning machine with under- and oversampling (OSELM-UO) for online imbalanced big data classification. In addition, other ELM-based imbalanced methods, such as ensemble weighted ELM, class-specific cost regulation ELM, label-weighted extreme learning machine, and class-specific ELM, have also been proposed.
However, to the best of our knowledge, there is no study that uses imbalanced ELM methods for epileptic EEG signal recognition; therefore, it is necessary to propose such a method for epileptic EEG signal recognition.
In this study, inspired by WELM, we propose a novel graph-based ELM (G-ELM) for imbalanced epileptic EEG signal recognition. First, we use graph theory to construct a relationship graph of the samples according to their data distribution. Then, we combine the relationship graph with ELM to propose G-ELM. The experimental results on a real imbalanced epileptic EEG dataset show that the proposed method can address imbalanced classification of epileptic EEG signals effectively. The main contributions of this study are as follows. (1) The proposed G-ELM sets the compensation for the loss of positive samples to be greater than that of negative samples based on graph theory and then combines the resulting relationship graph with ELM to classify imbalanced data effectively. It is a novel imbalanced ELM-based method, which attains good classification performance and inherits the rapid learning and good generalization capabilities of ELM. (2) The proposed imbalanced classification method considers both the imbalance and the interrelationship of epileptic EEG samples to obtain better performance for imbalanced epileptic EEG signal recognition. It not only realizes effective classification of imbalanced epileptic EEG signals from a new perspective but also expands the application of ELM-based algorithms. (3) We use six imbalanced classification evaluation indices, i.e., accuracy, precision, recall, F-measure, G-means, and AUC, to compare the performance of the proposed G-ELM with that of existing imbalanced ELM-based methods. Extensive experiments on a real imbalanced epileptic EEG dataset indicate that the proposed method addresses imbalanced epileptic EEG signal recognition effectively and outperforms the existing imbalanced ELM-based methods.
The rest of this paper is organized as follows. Section 2 introduces the background underlying the proposed epileptic EEG recognition method. In Section 3, the details of the proposed G-ELM are presented. The performance of the proposed method is evaluated with several comparative methods in Section 4. The conclusions of this paper are provided in Section 5.
In this section, we briefly describe the background related to the proposed epileptic EEG signal recognition method. It includes the epileptic EEG dataset, the feature extraction methods, and the classical ELM, which are used for epileptic EEG signal detection.
2.1. Epileptic EEG Dataset
The real epileptic EEG dataset used in this paper is the Bonn dataset, which is from the University of Bonn, Germany. It can be publicly downloaded from http://www.epileptologie-bonn.de/cms/upload/workgroup/lehnertz/eegdata.html. There are five groups (denoted by A–E) in Bonn, and each group contains 100 samples of 23.6 s segments. Detailed descriptions of the five groups are given in Table 1. Groups A and B are segments acquired from five healthy volunteers with eyes open (Group A) and eyes closed (Group B). Groups C–E are segments acquired from volunteers with epilepsy. In Group C, EEG signals were measured in the hippocampus of the brain during seizure-free intervals, and those in Group D were measured in the epileptogenic zone during seizure-free intervals. In Group E, EEG signals were measured during seizure activity. Five representative original epileptic EEG signals from the five different groups are shown in Figure 2.
2.2. Feature Extraction
Many studies [33–35] have shown that original EEG signals cannot be used directly to train machine learning-based models and that feature extraction is a necessary step. This is because the original EEG signals are usually high-dimensional, stochastic, nonstationary, and nonlinear, and the background noise in the original signals is very complex. The commonly used feature extraction methods can be divided into three main categories: time domain analysis, frequency domain analysis, and time-frequency analysis. Time domain analysis-based methods extract features by analyzing characteristics of the original EEG signals, such as mean, variance, amplitude, and kurtosis. Frequency domain analysis-based methods usually analyze the EEG signals in the frequency domain to extract features, using, for example, fast Fourier transforms and short-time Fourier transforms. In time-frequency analysis methods, time- and frequency-domain information is considered simultaneously to extract features from the original epileptic EEG signals. Typical time-frequency analysis-based methods are wavelet transform methods [39, 40]. In this paper, we use wavelet packet decomposition for feature extraction from the original epileptic EEG signals to utilize time- and frequency-domain information simultaneously.
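As a concrete illustration of this kind of time-frequency feature extraction, the sketch below computes relative wavelet packet energies with a hand-rolled Haar filter bank. The Haar basis, the three decomposition levels, and the random stand-in signal are all illustrative assumptions; the paper does not specify the wavelet or depth it uses.

```python
import numpy as np

def haar_step(signal):
    """One-level Haar analysis: return (approximation, detail) half-length bands."""
    s = signal[: len(signal) // 2 * 2].reshape(-1, 2)
    approx = (s[:, 0] + s[:, 1]) / np.sqrt(2.0)
    detail = (s[:, 0] - s[:, 1]) / np.sqrt(2.0)
    return approx, detail

def wavelet_packet_energies(signal, levels=3):
    """Full wavelet packet tree: split every band at every level, then
    return the relative energy of each leaf band as a feature vector."""
    bands = [np.asarray(signal, dtype=float)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            a, d = haar_step(band)
            next_bands.extend([a, d])
        bands = next_bands
    energies = np.array([np.sum(b ** 2) for b in bands])
    return energies / energies.sum()  # 2**levels relative-energy features

rng = np.random.default_rng(0)
eeg_segment = rng.standard_normal(4096)          # stand-in for one EEG segment
features = wavelet_packet_energies(eeg_segment)  # 8 features for levels=3
```

In practice, a library such as PyWavelets with a smoother wavelet (e.g., a Daubechies family member) would typically replace the Haar filter bank; the relative-energy feature per leaf band stays the same.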
ELM, which was first proposed by Huang et al., is a single-hidden-layer feedforward neural network. It can directly optimize the output weights of the hidden layer by setting the number of hidden nodes, without paying attention to the weights and offsets of the input layer, which can be generated randomly. Compared with other traditional supervised learning methods, it has good generalization ability and a high learning speed. Figure 3 shows the network structure of an ELM.
ELM considers both empirical and structural risks, and its objective function is as follows:

$$\min_{\beta,\xi}\ \frac{1}{2}\|\beta\|^{2}+\frac{C}{2}\|\xi\|^{2},\qquad \text{s.t.}\ H\beta=T-\xi, \tag{1}$$

where $H=\left[h(x_{1})^{T},\ldots,h(x_{N})^{T}\right]^{T}$ represents the hidden layer feature matrix, with $h(x_{i})=\left[g(w_{1}\cdot x_{i}+b_{1}),\ldots,g(w_{L}\cdot x_{i}+b_{L})\right]$; $w_{j}$ represents the $j$th row of the weight matrix $W$, $b_{j}$ represents the offset, $\{(x_{i},y_{i})\}_{i=1}^{N}$ denotes the training samples, $N$ is the number of training samples, $d$ is the sample dimension, and $L$ is the number of hidden nodes; $\xi$ is the error matrix between the network outputs and the target outputs $T$. $C$ is a penalty parameter, which can adjust the accuracy and generalization ability of the ELM.
The optimization problem in (1) can be solved based on the Karush–Kuhn–Tucker theory. The output weight of ELM can be calculated by

$$\beta=\left(\frac{I}{C}+H^{T}H\right)^{-1}H^{T}T. \tag{2}$$
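The training procedure above admits a very short implementation: draw random input weights and biases, build the sigmoid hidden-layer matrix $H$, and solve the regularized least-squares system for the output weights. The following NumPy sketch does exactly this; the toy dataset and hyperparameter values are illustrative assumptions.

```python
import numpy as np

def elm_fit(X, T, L=100, C=1.0, seed=0):
    """Train a basic ELM: random input weights, sigmoid hidden layer,
    closed-form ridge solution beta = (I/C + H^T H)^{-1} H^T T."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1, 1, size=(X.shape[1], L))  # random input weights
    b = rng.uniform(-1, 1, size=L)                # random offsets
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))        # hidden-layer feature matrix
    beta = np.linalg.solve(np.eye(L) / C + H.T @ H, H.T @ T)
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.sign(H @ beta)

# toy two-class problem: two well-separated Gaussian clusters
rng = np.random.default_rng(1)
Xp = rng.normal(loc=2.0, size=(50, 2))
Xn = rng.normal(loc=-2.0, size=(50, 2))
X = np.vstack([Xp, Xn])
T = np.concatenate([np.ones(50), -np.ones(50)])
W, b, beta = elm_fit(X, T, L=50, C=10.0)
acc = np.mean(elm_predict(X, W, b, beta) == T)
```

Only the linear solve depends on the targets, which is why training is fast: the hidden layer is never iteratively tuned.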
3. Graph-Based Extreme Learning Machine
In this section, a graph-based ELM (G-ELM) is proposed. We first introduce the relationship graph of an imbalanced dataset and then develop the proposed imbalanced classification method G-ELM by combining the relationship graph with an ELM.
3.1. Relationship Graph of an Imbalanced Dataset
In the context of an imbalanced classification problem, the relationships between the training samples can be regarded as an undirected graph.
An undirected graph can be expressed as $G=(V,E)$, where $V$ is the vertex set of graph $G$ and $E$ is the edge set of graph $G$. Figure 4 shows an undirected graph of an imbalanced synthetic dataset with 7 samples, where the 2 positive samples are represented by blue circles and the 5 negative samples are represented by red stars. All samples are numbered for subsequent display. Note that connections exist only between samples in different classes, each with a weight of 1; samples in the same class are not connected.
The elements of the adjacency matrix $A$ can be defined as follows:

$$A_{ij}=\begin{cases}1, & y_{i}\neq y_{j},\\ 0, & y_{i}=y_{j}.\end{cases}$$

Here, $y_{i}$ is the label of sample $x_{i}$.

According to the above definition of the adjacency matrix $A$, the distance between samples in the same class can be considered 0, while the distance between samples in different classes can be considered 1.
Then, the relationship graph matrix $G$ can be expressed as

$$G=D=\operatorname{diag}\left(A\mathbf{1}_{N}\right),$$

where $D$ is the degree matrix, $\mathbf{1}_{N}$ stands for a vector of length $N$ whose elements are exactly 1, and $N$ is the number of training samples.
As for an imbalanced dataset, we need to increase the loss incurred by misclassifying positive samples, because the misclassification of positive samples (patients with epilepsy) could cause serious consequences. This can be realized by regulating the degree matrix $D$: since every positive sample is connected to all negative samples, minority-class vertices naturally receive larger degrees. The shortcomings of cost-sensitive learning algorithms can thus be compensated by incorporating the relationships between samples. Therefore, the relationship graph not only ensures the accuracy of positive sample classification but also makes up for the lack of mutual relationships and prior knowledge between samples.
According to the above description, the relationship graph matrix of the synthetic dataset in Figure 4 can be expressed as $G=\operatorname{diag}(d_{1},\ldots,d_{7})$, where $d_{i}=5$ for each of the two positive samples (each is connected to all five negative samples) and $d_{i}=2$ for each of the five negative samples.
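The adjacency matrix and the diagonal relationship graph matrix for the 7-sample synthetic dataset can be built in a few lines of NumPy (a minimal sketch; numbering the positive samples first is an arbitrary assumption):

```python
import numpy as np

# Labels of the 7-sample synthetic dataset: 2 positives (+1), 5 negatives (-1)
y = np.array([1, 1, -1, -1, -1, -1, -1])

# Adjacency matrix: A[i, j] = 1 iff samples i and j belong to different classes
A = (y[:, None] != y[None, :]).astype(float)

# Relationship graph matrix: diagonal matrix of vertex degrees
G = np.diag(A @ np.ones(len(y)))
# Each positive vertex has degree 5 (connected to all 5 negatives);
# each negative vertex has degree 2 (connected to both positives).
```

The degrees show how the graph encodes the imbalance: the rarer the class, the larger the degree, and hence the larger the loss weight its samples will receive.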
3.2. Objective Function of G-ELM
According to the above relationship graph and ELM, the objective function of the G-ELM can be expressed as follows:

$$\min_{\beta,\xi}\ \frac{1}{2}\|\beta\|^{2}+\frac{C}{2}\,\xi^{T}G\,\xi,\qquad \text{s.t.}\ H\beta=T-\xi. \tag{7}$$

Here, $\{(x_{i},y_{i})\}_{i=1}^{N}$ denotes the training set, $N$ is the number of samples, $d$ is the sample dimension, and $T$ represents the true class labels of the samples. $C$ and $H$ are the same as defined in ELM. $\beta$ represents the output weight vector. $\xi$ represents the loss between the network outputs and the target outputs. $G$ is the relationship graph matrix of the samples defined above.
3.3. Solution of G-ELM
In this subsection, we attempt to optimize the objective function of G-ELM, which is a convex optimization problem. The specific optimization solution process is as follows.
The Lagrangian function corresponding to (7) is

$$\mathcal{L}(\beta,\xi,\alpha)=\frac{1}{2}\|\beta\|^{2}+\frac{C}{2}\,\xi^{T}G\,\xi-\alpha^{T}\left(H\beta-T+\xi\right).$$

Setting the derivatives of $\mathcal{L}$ with respect to $\beta$, $\xi$, and $\alpha$ equal to zero gives

$$\frac{\partial\mathcal{L}}{\partial\beta}=0\ \Rightarrow\ \beta=H^{T}\alpha,\qquad \frac{\partial\mathcal{L}}{\partial\xi}=0\ \Rightarrow\ \alpha=C\,G\,\xi,\qquad \frac{\partial\mathcal{L}}{\partial\alpha}=0\ \Rightarrow\ H\beta-T+\xi=0,$$

which yields the closed-form output weight

$$\beta=\left(\frac{I}{C}+H^{T}GH\right)^{-1}H^{T}GT.$$
With the obtained solution $\beta$, the predicted class label of a testing sample $x$ can be obtained as follows:

$$y=\operatorname{sign}\left(h(x)\beta\right),$$

where $h(x)$ is the hidden-layer feature vector of the testing sample $x$.
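Putting the pieces together, the following sketch trains G-ELM via the closed-form solution derived above, with the diagonal relationship graph entries acting as per-sample loss weights. The toy imbalanced dataset and hyperparameters are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def gelm_fit(X, y, L=100, C=1.0, seed=0):
    """Train the G-ELM sketch: same random hidden layer as ELM, but the
    closed-form solution weights each sample's loss by its graph degree,
    beta = (I/C + H^T G H)^{-1} H^T G T, with G = diag(A @ 1)."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1, 1, size=(X.shape[1], L))
    b = rng.uniform(-1, 1, size=L)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    A = (y[:, None] != y[None, :]).astype(float)  # class-relationship graph
    g = A.sum(axis=1)                             # degrees: minority gets larger weight
    beta = np.linalg.solve(np.eye(L) / C + H.T @ (g[:, None] * H),
                           H.T @ (g * y))
    return W, b, beta

def gelm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.sign(H @ beta)

# imbalanced toy set: 10 positives vs. 90 negatives
rng = np.random.default_rng(2)
Xp = rng.normal(loc=1.0, size=(10, 2))
Xn = rng.normal(loc=-1.0, size=(90, 2))
X = np.vstack([Xp, Xn])
y = np.concatenate([np.ones(10), -np.ones(90)])
W, b, beta = gelm_fit(X, y, L=50, C=10.0)
recall = np.mean(gelm_predict(Xp, W, b, beta) == 1)
```

Because $G$ is diagonal here, the weighted terms reduce to row scalings of $H$ (`g[:, None] * H`), so no explicit $N \times N$ matrix multiplication is needed; each of the 10 positives carries degree 90 while each negative carries degree 10.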
3.4. Learning Algorithm of G-ELM
According to the above derivation, the implementation of G-ELM is summarized in Algorithm 1.
To demonstrate the effectiveness of the proposed G-ELM, we conducted extensive experiments on a real epileptic EEG dataset. The proposed G-ELM was verified by comparing it with five ELM-based methods, i.e., ELM, W1-ELM, W2-ELM, R1-ELM, and R2-ELM, using six imbalanced classification evaluation indices (reported as means and standard deviations) on the real Bonn dataset. Except for ELM, the comparison methods are all imbalanced classification methods. All the experiments were conducted on a computer with an Intel Core i5-3317U 1.70 GHz CPU and 16 GB RAM using MATLAB 2016a. The details of the experimental settings and results are presented in the following sections.
4.1. Data Preparation
Although the real Bonn dataset has been used in many studies, the way it is used in this study differs from previous works. To evaluate the performance of the proposed G-ELM, nine imbalanced datasets were generated from the original five groups of EEG signals to simulate the imbalanced classification scenario. The details of the nine datasets are summarized in Table 2. In each dataset, the EEG signals of patients with epilepsy (E) were regarded as the positive class, while the other groups were regarded as the negative class, to identify whether a patient with epilepsy is experiencing seizure activity. A brief description of the five groups (A, B, C, D, and E) can be found in Table 1. The last column of Table 2 is the imbalance ratio (IR), which shows the degree of imbalance of the dataset. IR can be defined as follows:

$$\mathrm{IR}=\frac{N^{-}}{N^{+}},$$

where $N^{+}$ and $N^{-}$ represent the number of samples of the positive class and the negative class, respectively.
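For instance, for a dataset that takes group E as the positive class and groups A–D together as the negative class (each group contributing 100 segments), the imbalance ratio works out as:

```python
# Imbalance ratio IR = N_neg / N_pos for an ABCD-vs-E composition
n_pos = 100        # group E: seizure segments (positive class)
n_neg = 4 * 100    # groups A, B, C, D: 100 segments each (negative class)
IR = n_neg / n_pos
```

The ABCD-vs-E composition is used here only as a plausible example of one of the nine generated datasets; the exact compositions are those listed in Table 2.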
In our experiment, we randomly partitioned each dataset: 80% of the samples were used for training and the remaining 20% for testing.
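The random 80/20 partition can be sketched as follows (a minimal helper; the seed and sample count are arbitrary illustrations):

```python
import numpy as np

def split_80_20(n_samples, seed=0):
    """Randomly shuffle sample indices and return (train_idx, test_idx)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    cut = int(0.8 * n_samples)
    return idx[:cut], idx[cut:]

train_idx, test_idx = split_80_20(500)  # e.g., a 400-vs-100 Bonn composition
```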
4.2. Evaluation Indices
In our experiments, we used six imbalanced classification evaluation indices to evaluate all the adopted methods. The six indices were accuracy, precision, recall, F-measure, G-means, and AUC, which can be, respectively, defined as

$$\text{accuracy}=\frac{TP+TN}{TP+TN+FP+FN},\qquad \text{precision}=\frac{TP}{TP+FP},\qquad \text{recall}=\frac{TP}{TP+FN},$$

$$F\text{-measure}=\frac{2\times\text{precision}\times\text{recall}}{\text{precision}+\text{recall}},\qquad G\text{-means}=\sqrt{\frac{TP}{TP+FN}\times\frac{TN}{TN+FP}},$$

$$\mathrm{AUC}=\frac{1}{N^{+}N^{-}}\sum_{i\in I^{+}}\sum_{j\in I^{-}}\mathbb{I}\left(f(x_{i})>f(x_{j})\right).$$

Here, $TP$ is the number of true positive samples, $FN$ is the number of false negative samples, $TN$ is the number of true negative samples, and $FP$ is the number of false positive samples. $I^{+}$ is the set of all the indexes of the positive samples and $I^{-}$ is the set of those of the negative samples; $N^{+}=|I^{+}|$ and $N^{-}=|I^{-}|$. $f(x_{i})$ is the prediction value of $x_{i}$, and $\mathbb{I}(\cdot)$ is the indicator function.
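The six indices can be computed directly from the confusion-matrix counts and the decision scores, as in this sketch (the toy labels and scores are made up for illustration; the AUC here counts strictly ordered positive-negative pairs, treating ties as incorrect):

```python
import numpy as np

def imbalance_metrics(y_true, y_pred, scores):
    """Compute accuracy, precision, recall, F-measure, G-means, and AUC
    for binary labels in {+1, -1}; scores are real-valued decision values."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))    # true positives
    fn = np.sum((y_true == 1) & (y_pred == -1))   # false negatives
    tn = np.sum((y_true == -1) & (y_pred == -1))  # true negatives
    fp = np.sum((y_true == -1) & (y_pred == 1))   # false positives
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    g_means = np.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))
    # AUC: fraction of (positive, negative) score pairs ranked correctly
    pos = np.asarray(scores)[y_true == 1]
    neg = np.asarray(scores)[y_true == -1]
    auc = np.mean(pos[:, None] > neg[None, :])
    return accuracy, precision, recall, f_measure, g_means, auc

# toy example: predictions obtained by thresholding the scores at 0.5
y_true = np.array([1, 1, 1, -1, -1, -1, -1, -1])
scores = np.array([0.9, 0.8, 0.3, 0.2, 0.1, 0.4, 0.7, 0.05])
y_pred = np.where(scores > 0.5, 1, -1)
acc, prec, rec, f1, gm, auc = imbalance_metrics(y_true, y_pred, scores)
```

Note that accuracy alone can look acceptable while recall on the minority class is poor, which is precisely why the remaining five indices are reported.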
4.3. Adopted Methods and Parameter Settings
In the experiments, five ELM-based methods, i.e., ELM, W1-ELM, W2-ELM, R1-ELM, and R2-ELM, were adopted for comparison with G-ELM. Referring to the guidelines in [2, 20, 46], a grid search strategy based on G-means was used to determine appropriate parameters for all the methods; the penalty parameter $C$ and the number of hidden nodes $L$ were searched over the same ranges for all the adopted methods. All the adopted methods were run ten times on each generated imbalanced dataset, and the average results for the six imbalanced classification evaluation indices are reported.
4.4. Experimental Results
To evaluate the classification performance of the proposed G-ELM, five ELM-based methods were used for performance comparison. All experiments were repeated ten times for fairness. The mean and standard deviation of the corresponding indices of all methods in each dataset are reported in Tables 3–8. The best results are shown in bold. The improvement of G-ELM relative to ELM on all datasets using the six imbalanced classification evaluation indices is shown in Figure 5.
According to the experimental results in Tables 3–8, the following observations can be made. (1) For the six adopted imbalanced classification evaluation indices, the proposed G-ELM performs best on most datasets. This is because G-ELM can suppress the misclassification of positive samples while maintaining the accuracy on negative samples, giving it a high classification performance suitable for imbalanced epileptic EEG signal recognition. (2) In general, G-ELM, R1-ELM, R2-ELM, W1-ELM, and W2-ELM achieved better performance than ELM. This is due to the addition of a cost matrix, which makes them more suitable for imbalanced classification. Moreover, G-ELM performs best among them because it adds sample information through the relationship graph. (3) Tables 4 and 5 show the recall and precision of all methods, which evaluate imbalanced classification performance from two different perspectives. From the excellent performance of G-ELM in Tables 4 and 5, we can see that adding information about the relationships between samples effectively increases the weight of positive samples. (4) The F-measure and G-means in Tables 6 and 7 are two important indices for measuring the performance of imbalanced classification methods, and they can be combined with recall and precision to evaluate the methods. The results show that the proposed G-ELM performs best and is thus well suited to imbalanced epileptic EEG signal recognition. (5) AUC is an important index for evaluating imbalanced classifiers. Table 8 shows that G-ELM performs best on all datasets, confirming its effectiveness for imbalanced epileptic EEG signal recognition.
4.5. Statistical Analysis
Statistical analysis was performed to further analyze the performances of all the adopted methods in our experiments. For conciseness, we only present the statistical analysis of the G-means results. First, the Friedman test was used to calculate the average ranking of each method. The rankings of all the adopted methods are shown in Figure 6, from which we can see that the performance of G-ELM is the best.
Then, the post hoc hypothesis test was used to evaluate the statistical significance of the performance differences between G-ELM and the other adopted methods. The post hoc hypothesis test results are presented in Table 9, which shows that the null hypothesis is rejected at the adopted significance level. Therefore, the performance differences between G-ELM and the other adopted methods are significant, which means that G-ELM is effective for imbalanced epileptic EEG signal recognition.
In this study, we aimed to address the challenge that traditional machine learning methods ignore the imbalance of epileptic EEG datasets and the existing imbalanced classification methods ignore the relationships between samples. A graph-based ELM was proposed for imbalanced epileptic EEG signal recognition. First, graph theory was used to construct the relationship between samples according to the distribution. Second, a model combining the relationship graph and ELM was proposed; this model inherited the rapid learning and good generalization capabilities of ELM while maintaining satisfactory classification. Experiments on a real imbalanced epileptic EEG dataset demonstrated the effectiveness and applicability of the proposed method. However, there is still room for improvement in the scope and search method of the optimal parameters in this experiment. In the future, ways to design a better method to determine the optimal parameters will be further studied and explored.
Data can be downloaded from http://www.epileptologie-bonn.de/cms/upload/workgroup/lehnertz/eegdata.html.
Conflicts of Interest
None of the authors have any conflicts of interest.
This work was supported in part by the National Natural Science Foundation of China under Grant 61772198 and by the Natural Science Foundation of Jiangsu Province under Grant BK20161268.
F. P. Lestari, M. Haekal, R. E. Edison, F. R. Fauzy, S. N. Khotimah, and F. Haryanto, “Epileptic seizure detection in EEGs by using random tree forest, naïve Bayes and KNN classification,” Journal of Physics: Conference Series, vol. 1505, no. 1, p. 12055, 2020.
M. K. Siddiqui, X. Huang, R. Morales-Menendez, N. Hussain, and K. Khatoon, “Machine learning based novel cost-sensitive seizure detection classifier for imbalanced EEG data sets,” International Journal on Interactive Design and Manufacturing, vol. 14, no. 4, pp. 1491–1509, 2020.
J. Yang, H. Yu, X. Yang, and X. Zuo, “Imbalanced extreme learning machine based on probability density estimation,” in International Workshop on Multi-disciplinary Trends in Artificial Intelligence, pp. 160–167, Springer, Cham, 2015.
R. G. Andrzejak, K. Lehnertz, F. Mormann, C. Rieke, P. David, and C. E. Elger, “Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state,” Physical Review E, vol. 64, no. 6, article 061907, 2001.
T. Strutz, Data Fitting and Uncertainty: A Practical Introduction to Weighted Least Squares and Beyond, Vieweg, Wiesbaden, Germany, 2010.
C. W. N. F. C. W. Fadzal, W. Mansor, L. Y. Khuan, and A. Zabidi, “Short-time Fourier transform analysis of EEG signal from writing,” in 2012 IEEE 8th International Colloquium on Signal Processing and its Applications, pp. 525–527, Malacca, Malaysia, March 2012.
Z. Deng, K.-S. Choi, Y. Jiang, and S. Wang, “Generalized hidden-mapping ridge regression, knowledge-leveraged inductive transfer learning for neural networks, fuzzy systems and kernel methods,” IEEE Transactions on Cybernetics, vol. 44, no. 12, pp. 2585–2599, 2014.