Abstract
Mobile edge computing (MEC) supports pattern recognition and intelligent processing of real-time data. Electroencephalogram (EEG) is a very important tool in the study of epilepsy, providing rich information that cannot be obtained by other physiological methods. In the automatic classification of EEG signals by intelligent algorithms, feature extraction and classifier design are both very important steps. Different feature extraction methods, such as time domain, frequency domain, and nonlinear dynamic feature methods, capture independent and diverse information. Using multiple forms of features at the same time can improve the accuracy of epilepsy recognition. In this paper, we apply metric learning to epileptic EEG signal recognition. Inspired by the equidistance constrained metric learning algorithm, we propose multifeature metric learning based on enhanced equidistance embedding (MMLE^{3}) for EEG recognition of epilepsy. The MMLE^{3} algorithm makes use of various forms of EEG features, and the feature weight vector is learned adaptively, without manual adjustment. The MMLE^{3} algorithm maximizes the distance between samples under cannot-link constraints, so that samples of different classes become equidistant; meanwhile, it minimizes the distance between samples under must-link constraints, so that samples of the same class are compressed to one point. Under the premise that the classification tasks of the various features are consistent, MMLE^{3} can fully extract the associated and complementary information hidden between the features. The experimental results on the CHB-MIT dataset verify that the MMLE^{3} algorithm has good generalization performance.
1. Introduction
Mobile edge computing (MEC) brings cloud computing capabilities and the Internet service environment to the edge of the network, where it can provide services to nearby users and effectively make up for the deficiencies of cloud computing [1]. The combination of MEC and artificial intelligence technology has been a research hotspot in recent years, and MEC has rich application scenarios in the field of intelligent medicine. Electroencephalogram (EEG) analysis is widely used in neuroscience, especially in epilepsy diagnosis and seizure detection [2, 3]. In clinical practice, the diagnosis of epilepsy is mainly based on the patient’s history of seizures, and further examination and diagnosis are made with reference to the EEG signals. EEG-based epilepsy detection mainly relies on the personal experience of the doctor. With the gradual development of intelligent medical treatment, the automatic recognition and detection of epileptic EEG signals have become an important auxiliary detection means. How to extract effective features from EEG and design appropriate classification algorithms is the key task in epilepsy detection.
At present, the most commonly used feature extraction methods include time domain analysis, frequency domain analysis, time-frequency analysis, nonlinear dynamics, and model-based methods [4]. The time domain feature method regards EEG as a time series, calculates the correlation statistics of the sequence, and extracts the corresponding epileptic EEG features. For example, Kaya et al. [5] used histogram features based on local binary patterns (LBP) together with Bayesian networks to classify epileptic seizures. Another widely used strategy is to extract frequency domain features from a given EEG signal. The Fourier transform is one of the most commonly used algorithms to extract frequency domain features from time series data. Frassineti et al. [6] proposed a preprocessing step in which the signal is filtered by a fixed wavelet transform to reduce possible artifacts; a fine Gaussian support vector machine is then used to detect epilepsy. Chandel et al. [7] proposed a combination of features based on ternary wavelet decomposition to predict the onset and termination of epilepsy. This method extracted the standard deviation, variance, and high-order moments to represent the characteristics of different EEG activities and used linear discriminant analysis and k-nearest neighbor (KNN) classifiers to classify EEG between seizure and interictal periods. In the extraction of nonlinear dynamic features, methods based on complexity analysis are widely used in epilepsy detection, and the most common are feature extraction methods based on entropy. For example, Xiang et al. [8] developed a feature extraction algorithm using fuzzy entropy. This method first calculated the fuzzy entropy of EEG signals from different epileptic states, then performed feature selection, and finally used a support vector machine for prediction. Hussein et al. [9] identified EEG seizures by modifying the fuzzy entropy with minimum variance. First, appropriate filtering and independent component analysis were carried out to remove noise and artifacts, and then a proportional operation was carried out to obtain the optimal features.
Empirical mode decomposition is an adaptive signal analysis method that underlies the Hilbert-Huang transform. Recently, it has also been widely used in epilepsy detection. Bajaj and Pachori [10] regarded the intrinsic mode functions (IMFs) as a group of amplitude- and frequency-modulated signals and gave the analytical expression of each IMF by using its Hilbert transform. Two kinds of bandwidth were calculated from the IMFs, namely, the amplitude modulation (AM) bandwidth and the frequency modulation (FM) bandwidth. Kaleem et al. [11] used a new variant of empirical mode decomposition. This model allowed the detrending of the signal based on the time scale, decomposing the signal into detrended and non-detrended components according to a frequency separation criterion, and then extracted features from the decomposed components. Usman et al. [12] converted the EEG data into a proxy channel and then used empirical mode decomposition to improve the prediction results.
Supervised learning is widely used in epilepsy detection. Well-known supervised learning algorithms, such as KNN, decision trees, and metric learning, have been successfully applied to this task. Metric learning aims to learn a more suitable distance measurement criterion in the feature space, in order to represent the similarity between samples more accurately. It is widely used in face recognition, object detection, image recognition, and so on. Weinberger and Saul [13] developed a large margin nearest neighbor algorithm inspired by the support vector machine. The obtained Mahalanobis distance had the advantages of maximum margin and internal consistency. Liu et al. [14] developed a global metric learning algorithm, which increased the separation between samples of different categories in EEG signal recognition. Phan et al. [15] developed a global metric learning framework using supervised information. The algorithm can directly process EEG data without preprocessing such as artifact removal. Alwasiti et al. [16] developed a deep metric learning algorithm. Unlike traditional deep learning models, which require a large amount of training data, this algorithm required very little training data.
Many classification algorithms rely heavily on the distance measurement of the input data. In EEG classification, the key problem is to find a good distance measurement, so that a test EEG sample is assigned to the class of its nearest EEG samples. Many studies have shown that an appropriate distance measure can significantly improve classification accuracy. In metric learning, EEG recognition depends on the similarity between the input EEG data samples, and this similarity is computed as a distance between their input feature vectors [17, 18]. Therefore, it is crucial to find a good distance metric in the sample feature space.
Due to the rhythm of EEG signals and the collection of EEG signals from multiple channels, EEG data samples have rich feature information. Different forms of features contribute differently to EEG recognition: some play a decisive role, some play an auxiliary role, and some play a small role or none. Conventional metric learning measures the similarity of EEG data samples while treating all features equally. This strategy cannot accurately measure the similarity between EEG data samples, which affects EEG signal recognition. In addition, various types of EEG signal features can be obtained from different feature extraction algorithms. Based on the principle of consistency and complementarity, each type of feature contains specific information, and using multiple forms of features at the same time will improve the accuracy of epilepsy recognition.
In this paper, we apply metric learning to the recognition of epileptic EEG signals. We make full use of various forms of EEG features and assign their weights automatically. We try to find an appropriate distance measure for EEG data samples, so as to measure the similarity between them more accurately and thereby improve the accuracy of epilepsy recognition. To achieve this goal, we propose multifeature metric learning based on enhanced equidistance embedding (MMLE^{3}) for EEG recognition of epilepsy. We borrow from the EquiDML algorithm [19] to maximize the distance between samples under cannot-link constraints, so that samples belonging to different classes become equidistant. At the same time, the distance between samples under must-link constraints is minimized, so that samples belonging to the same class are compressed to one point. In the process of metric matrix learning, a feature weight vector is introduced, and the various features are adaptively weighted to adjust the weight relationship between them. Under the premise that the classification tasks of the various features are consistent, the MMLE^{3} algorithm can effectively mine the hidden and complementary information between the features and highlight the role of the optimal feature, which gives it a stronger discriminative ability. We conduct experiments on the CHB-MIT dataset, and the experimental results validate the effectiveness of MMLE^{3}.
2. Related Work
Metric learning uses given pairs of samples to learn how to measure the similarity between pairs of feature vectors, generally through a distance metric. Taking the commonly used Mahalanobis distance as an example, the distance between two samples $x_i$ and $x_j$ can be written as
$$d_{M}(x_i, x_j) = \sqrt{(x_i - x_j)^{T} M (x_i - x_j)}, \tag{1}$$
where $M$ is a positive semidefinite matrix. $M$ can be decomposed as $M = AA^{T}$, where the matrix $A$ is the metric matrix (or projection matrix). Therefore, Equation (1) can be expressed as
$$d_{A}(x_i, x_j) = \left\| A^{T}(x_i - x_j) \right\|_{2}. \tag{2}$$
Therefore, the essence of metric learning is to learn a mapping space. In classification tasks, the commonly used strategy for metric learning is to output a positive value close to zero for pairs of samples of the same class and output larger values for pairs of samples of different classes.
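The equivalence between the Mahalanobis form and a Euclidean distance in a projected space can be checked numerically. The following is a minimal sketch, assuming the decomposition $M = AA^{T}$ with a $d \times r$ matrix $A$; the function names, the matrix `A`, and the two sample vectors are hypothetical and chosen only for illustration.

```python
import math

def gram(A):
    """Build M = A A^T from a d x r matrix A given as nested lists."""
    d, r = len(A), len(A[0])
    return [[sum(A[i][k] * A[j][k] for k in range(r)) for j in range(d)]
            for i in range(d)]

def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semidefinite M."""
    diff = [a - b for a, b in zip(x, y)]
    n = len(diff)
    m_diff = [sum(M[i][j] * diff[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(d * m for d, m in zip(diff, m_diff)))

def projected_distance(x, y, A):
    """Euclidean distance after projection: ||A^T (x - y)||_2."""
    diff = [a - b for a, b in zip(x, y)]
    r = len(A[0])
    proj = [sum(A[i][k] * diff[i] for i in range(len(diff))) for k in range(r)]
    return math.sqrt(sum(p * p for p in proj))

# Hypothetical 3 x 2 projection matrix and two 3-dimensional samples.
A = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]]
x, y = [1.0, 2.0, 3.0], [0.0, 1.0, 5.0]

d_full = mahalanobis(x, y, gram(A))
d_proj = projected_distance(x, y, A)
print(d_full, d_proj)  # the two values agree
```

Working with $A$ rather than $M$ keeps the metric positive semidefinite by construction and makes the target dimension $r$ an explicit design choice.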
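Given a labeled dataset $X = [x_1, \ldots, x_N] \in \mathbb{R}^{d \times N}$ with dimensionality $d$ and $N$ samples, the label matrix $Y$ is composed of all class labels of $X$, with $y_i$ denoting the label of $x_i$. The must-link set $\mathcal{S}$ and the cannot-link set $\mathcal{D}$ are defined as
$$\mathcal{S} = \{(x_i, x_j) \mid y_i = y_j\}, \qquad \mathcal{D} = \{(x_i, x_j) \mid y_i \neq y_j\}. \tag{3}$$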
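According to the classification principle of minimum intraclass distance and maximum interclass distance, a supervised metric learning framework can be represented as
$$d_{M}(x_i, x_j) \le u \quad \forall (x_i, x_j) \in \mathcal{S}, \qquad d_{M}(x_i, x_j) \ge l \quad \forall (x_i, x_j) \in \mathcal{D}, \tag{4}$$
where $u$ and $l$ are thresholds for the must-link set $\mathcal{S}$ and the cannot-link set $\mathcal{D}$, respectively, and $0 < u < l$.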
In the EquiDML algorithm [19], the sample pairs in the must-link set $\mathcal{S}$ are gathered directly to a single point, and the distances of the sample pairs in the cannot-link set $\mathcal{D}$ are forced to have the same constant value. The constraints in the EquiDML algorithm are expressed as
$$d_{M}(x_i, x_j) = 0 \quad \forall (x_i, x_j) \in \mathcal{S}, \qquad d_{M}(x_i, x_j) = l \quad \forall (x_i, x_j) \in \mathcal{D}, \tag{5}$$
where $l$ is a positive value. The equidistance constraint implies that the distance between classes must be greater than the distance within classes. In the metric space, the pairs in the set $\mathcal{D}$ correspond to samples of different classes, and the distances of all such pairs have the same constant value.
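The pair sets and the equidistance constraints can be illustrated with a small sketch. It builds must-link and cannot-link index pairs from class labels and checks whether a set of already embedded points satisfies the two conditions described above (zero distance within a class, one common distance between classes). The function names, the tolerance, and the toy embedding are assumptions made for illustration, not part of EquiDML itself.

```python
import math
from itertools import combinations

def build_pair_sets(labels):
    """Split all index pairs into must-link (same label) and cannot-link."""
    must_link, cannot_link = [], []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            must_link.append((i, j))
        else:
            cannot_link.append((i, j))
    return must_link, cannot_link

def satisfies_equidistance(points, labels, target, tol=1e-9):
    """Check d = 0 on must-link pairs and d = target on cannot-link pairs."""
    must_link, cannot_link = build_pair_sets(labels)
    same_ok = all(math.dist(points[i], points[j]) <= tol
                  for i, j in must_link)
    diff_ok = all(abs(math.dist(points[i], points[j]) - target) <= tol
                  for i, j in cannot_link)
    return same_ok and diff_ok

# Toy embedding: three classes placed on the vertices of an equilateral
# triangle with side length 1; the two class-0 samples coincide.
labels = [0, 0, 1, 2]
points = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]

must_link, cannot_link = build_pair_sets(labels)
ideal = satisfies_equidistance(points, labels, target=1.0)
wrong_target = satisfies_equidistance(points, labels, target=2.0)
```

The triangle example also shows why the embedding dimension matters: placing $C$ classes at mutually equal distances requires at least $C - 1$ dimensions.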
3. Multifeature Metric Learning Based on Enhanced Equidistance Embedding
3.1. The Objective Function of MMLE^{3}
We try to learn a metric space with distinguishing ability. In this metric space, the samples belonging to the same class with different feature forms are as close as possible, and the samples belonging to different classes with different feature forms are as far away as possible. That is, there are more compact intraclass distances and more separable interclass distances in the metric space. The Mahalanobis distance of the $k$th feature of samples $x_i$ and $x_j$ can be written in the following form:
$$d_{M}(x_i^{k}, x_j^{k}) = \sqrt{(x_i^{k} - x_j^{k})^{T} M (x_i^{k} - x_j^{k})}, \tag{6}$$
where $x_i^{k}$ denotes the $k$th feature form of sample $x_i$.
Based on the EquiDML algorithm [19], the proposed MMLE^{3} algorithm makes use of the correlation and difference between multiple forms of features and becomes more discriminative by learning the complementary information of different types of features. The MMLE^{3} algorithm can then be represented by the objective in Equation (7), where $\mathcal{S}^{k}$ and $\mathcal{D}^{k}$ are the must-link and cannot-link sample sets under the $k$th feature representation, respectively, and $|\mathcal{S}^{k}|$ and $|\mathcal{D}^{k}|$ are their sizes. $\lambda$ is the tradeoff parameter, $\Theta$ is the feature weight vector, and $\theta_k$ is the weight of the $k$th feature form. It is worth emphasizing that (1) in order to reduce the convergence time, MMLE^{3} uses the shifted squared loss for the cannot-link set $\mathcal{D}^{k}$, and (2) $\Theta$ is not a parameter that needs to be adjusted manually; it can be obtained in closed form.
3.2. The Optimization Procedure
According to the Lagrange multiplier method, the Lagrangian function of Equation (7) can be represented as where is the Lagrange parameter.
There are two parameters to be optimized in MMLE^{3}: the metric matrix and the feature weight vector. We use an alternating update strategy to obtain their optimal values. When the feature weight vector is fixed in Equation (8), optimizing the metric matrix is equivalent to solving the following problem:
Using the simplest projected gradient method, can be updated by where . denotes projection on . and are the step and regulation parameters, respectively.
At the th iteration, denote , where . Using the positive semidefinite matrix approximation method [19], we can obtain as where . is a positive diagonal matrix.
When is fixed in Equation (8), the optimization of is equal to solve . We can obtain
Then, the solution of is
Unlike a manual parameter adjustment strategy, the weight parameter in the MMLE^{3} algorithm is adaptive, and it converges to the extremum from any initial value.
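To make the idea of a closed-form, adaptive weight update concrete, the sketch below uses a weighting scheme that is common in multiview learning. It is an assumption and is not taken from the equations of this paper: the weights minimize $\sum_k \theta_k^{r} f_k$ subject to $\sum_k \theta_k = 1$, where $f_k > 0$ is a hypothetical per-feature cost and $r > 1$ is a smoothing exponent. Setting the derivative of the Lagrangian to zero gives $\theta_k \propto f_k^{-1/(r-1)}$. The function name, the costs, and the value of `r` are illustrative only.

```python
def closed_form_weights(costs, r=2.0):
    """Weights minimizing sum(theta_k**r * f_k) with sum(theta_k) == 1.

    Assumed form (not the paper's exact update): theta_k is proportional
    to (1 / f_k) ** (1 / (r - 1)), so features with lower cost get a
    larger weight. Requires every cost to be positive and r > 1.
    """
    inverse = [(1.0 / f) ** (1.0 / (r - 1.0)) for f in costs]
    total = sum(inverse)
    return [value / total for value in inverse]

# Hypothetical costs for three feature forms (lower means more useful).
weights = closed_form_weights([1.0, 2.0, 4.0], r=2.0)
print(weights)
```

Because the update has no step size, it can be recomputed after every metric update inside an alternating scheme, which is what makes manual tuning of the weights unnecessary in this kind of formulation.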
Based on the above analysis, we give the MMLE^{3} algorithm as follows.

4. Experiments
4.1. Dataset and Feature Extraction
The dataset used, called CHB-MIT, is from Boston Children’s Hospital [20]. The signals are recorded with the international 10–20 system, and the sampling frequency is 256 Hz. Example EEG signals and the international 10–20 system are shown in Figures 1 and 2, respectively. The dataset includes the scalp EEG data of 23 patients with epilepsy. Among the 23 patients, 5 are males, aged between 3 and 22 years, and 17 are females, aged between 1.5 and 19 years. Case No. 21 was recorded from the same patient as case No. 1, one and a half years later. The gender of patient No. 24 is unknown. In our experiment, 21 out of the 24 cases are selected, excluding Nos. 6, 12, and 16, since some channel data of these patients cannot be read.
We use three forms of EEG features in the experiment. The first form of features is the time domain features of EEG signals. We extract the correlation coefficient matrix of the original EEG signal and its eigenvalues and fuse them. The detailed time domain feature extraction process is shown in Figure 3. The “Date” is the original EEG signals. The “Sta” is the standardized matrix of EEG signals. The “CorrM” is the correlation coefficient matrix of “Sta,” and the “Eigen” is the set of eigenvalues of the “CorrM” matrix. The “Corr” is the expansion of “CorrM.” The “Feat1” is the experimental time domain feature obtained by fusing “Corr” and “Eigen.”
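The first steps of this pipeline (standardization, correlation coefficient matrix, and expansion into a vector) can be sketched as follows. This is only an illustration under stated assumptions: the expansion is taken to be the strict upper triangle of the correlation matrix, the eigenvalue step is omitted because it would normally rely on a linear algebra library, and the function names and the three toy channels are hypothetical.

```python
import math

def standardize(channel):
    """Zero-mean, unit-variance version of one channel (population std)."""
    n = len(channel)
    mean = sum(channel) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in channel) / n)
    return [(v - mean) / std for v in channel]

def correlation_matrix(signals):
    """Pearson correlation between every pair of channels."""
    sta = [standardize(ch) for ch in signals]
    n = len(sta[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in sta]
            for ci in sta]

def upper_triangle(matrix):
    """Flatten the strict upper triangle (assumed form of the expansion)."""
    size = len(matrix)
    return [matrix[i][j] for i in range(size) for j in range(i + 1, size)]

# Three toy channels: the second is a scaled copy of the first and the
# third is the first one reversed.
signals = [[1.0, 2.0, 3.0, 4.0],
           [2.0, 4.0, 6.0, 8.0],
           [4.0, 3.0, 2.0, 1.0]]

corr_m = correlation_matrix(signals)
corr = upper_triangle(corr_m)  # channel pairs (0,1), (0,2), (1,2)
```

For a recording with $c$ channels the expansion has $c(c-1)/2$ entries, so the feature length grows quadratically with the number of channels.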
The second form of features is the frequency domain features of EEG signals. The detailed frequency domain feature extraction process is shown in Figure 4. The amplitude spectrum and phase spectrum in the frequency domain are two important features related to the time domain information. After the amplitude and phase features (called “AS” and “PS” in Figure 4) are extracted, the correlation coefficients (called “CorrM1” and “CorrM2” in Figure 4) and eigenvalues of the spectra (called “Eigen1” and “Eigen2” in Figure 4) are further extracted. The “Feat2” is the experimental frequency domain feature obtained by fusing “Eigen1,” “Eigen2,” “Corr1,” and “Corr2.” The third form of features is the nonlinear features of EEG signals. In the experiment, the Shannon entropy, spectral entropy, and differential entropy of the delta (1–4 Hz), theta (4–7 Hz), alpha (7–13 Hz), and beta (13–30 Hz) bands of EEG signals are calculated. Then, the nonlinear feature “Feat3” is obtained from the three entropies.
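As an illustration of one of the nonlinear features, the sketch below computes a band-limited spectral entropy: the power spectrum inside a frequency band is normalized to a probability distribution, and its Shannon entropy is taken. The exact definition used in the experiments is not given in the text, so this form, the function names, the plain discrete Fourier transform, and the toy 64 Hz sampling rate are all assumptions.

```python
import cmath
import math

def power_spectrum(signal):
    """One-sided power spectrum computed with a direct DFT."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def band_spectral_entropy(signal, fs, low, high):
    """Shannon entropy of the normalized power inside [low, high) Hz."""
    n = len(signal)
    power = power_spectrum(signal)
    band = [p for k, p in enumerate(power) if low <= k * fs / n < high]
    total = sum(band)
    if total == 0:
        return 0.0
    probs = [p / total for p in band if p > 0]
    return -sum(p * math.log(p) for p in probs)

fs = 64  # toy sampling rate in Hz; the dataset itself is sampled at 256 Hz
single = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
double = [math.sin(2 * math.pi * 8 * t / fs)
          + math.sin(2 * math.pi * 12 * t / fs) for t in range(fs)]

h_single = band_spectral_entropy(single, fs, 7, 13)  # one alpha component
h_double = band_spectral_entropy(double, fs, 7, 13)  # two equal components
```

A single dominant rhythm in the band gives an entropy close to zero, while power spread over several frequencies gives a larger value, which is the property that makes the measure informative for nonstationary EEG.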
We compare MMLE^{3} with seven algorithms: LMNN [13], ITML [21], RDMLCCPVL [22], EquiDML [19], CMML [23], MVTSKFS [4], and MvCVM [24]. The slack variable in ITML is selected in {0.01, 0.1, 1, 10}. In CMML, the tradeoff parameter, learning rate, and parameter are set to 1, 10^{-6}, and 5, respectively. The number of fuzzy rules in MVTSKFS is selected in , and three regulation parameters are set in . In MVTSKFS and MvCVM, the penalty parameter for each view is selected in {1, 10, 10^{2}, 10^{3}}, and the Gaussian kernel parameter is selected in {10^{-2}, 10^{-1}, …, 10^{2}}. In MMLE^{3}, the balance parameter is selected in {0.1, 0.2, …, 0.9}, and the parameter is set to 2. KNN is used as the classifier in MMLE^{3}. We use grid search and 5-fold cross-validation to select the best parameters. All algorithms are run on an i7-8700K CPU at 3.2 GHz with 32 GB RAM, and the software is Matlab 2016. Specificity, sensitivity, and classification accuracy are used as the evaluation indices. Each experiment is executed 10 times.
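The three evaluation indices follow the standard confusion-matrix definitions for a binary task. A minimal sketch, assuming the seizure class is encoded as 1 and that both classes occur in the ground truth (the function name and the toy labels are hypothetical):

```python
def evaluate(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)  # share of seizure samples detected
    specificity = tn / (tn + fp)  # share of non-seizure samples kept
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy

# Toy ground truth and predictions (1 = seizure, 0 = non-seizure).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
sensitivity, specificity, accuracy = evaluate(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy matters for seizure data, where the two classes are usually far from balanced.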
4.2. MMLE^{3} Performance Verification
The classification accuracy of MMLE^{3} with different parameters is shown in Figure 5. The first parameter is the balance parameter, which takes values in [0, 1] and balances the proportion of the term minimizing the distance between same-class samples against the term maximizing the distance between different-class samples in the objective function. The accuracy of MMLE^{3} with different values of the balance parameter is shown in Figure 5(a). When the balance parameter is 1, the MMLE^{3} algorithm only optimizes the must-link constraint and ignores the cannot-link constraint, so its classification accuracy is low. When the balance parameter is close to 0, the objective function of MMLE^{3} ignores the optimization of the must-link constraint, so the classification accuracy of MMLE^{3} is also unsatisfactory. From Figure 5(a), when the balance parameter is between 0.4 and 0.6, the two optimization terms are balanced, so that the EEG data samples in the metric space have the highest discriminative ability, and the classification accuracy of MMLE^{3} is the highest.
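Figure 5: Classification accuracy of MMLE^{3} with different parameters: (a) balance parameter; (b) projection dimension; (c) KNN classifier parameter.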
Second, we evaluate the dimension of the projection matrix. The dimension of each form of features is 200. The MMLE^{3} algorithm obtains the Mahalanobis matrix by metric learning on the multiple forms of EEG features and projects the features into the projection space. The projection dimension plays an important role in MMLE^{3}. The classification accuracy of MMLE^{3} with different projection dimensions is shown in Figure 5(b). When the dimension is very small, MMLE^{3} ignores most of the discriminative feature information, which leads to low classification accuracy. As the dimension increases, the discriminative feature information increases, and this improves the classification accuracy. When the dimension reaches a certain value, all the discriminative feature information has been obtained, and the remaining weak or ineffective feature information contributes little to EEG signal recognition. Therefore, the classification accuracy of MMLE^{3} remains stable.
Third, we evaluate the KNN classifier parameter in MMLE^{3}, that is, the number of nearest neighbors. The parameter is selected in {3, 6, …, 30}. The accuracy of MMLE^{3} with different numbers of neighbors is shown in Figure 5(c). The number of neighbors has little effect on classification accuracy: regardless of its value, the fluctuation of classification accuracy is very small.
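The role of the KNN classifier on top of a learned metric can be sketched as follows: samples are first projected with a projection matrix, and the majority label among the nearest projected training samples is returned. The matrix `A`, the training data, and the function names are hypothetical; in practice `A` would come from the metric learning step rather than being written by hand.

```python
import math
from collections import Counter

def project(x, A):
    """Map a d-dimensional sample into r dimensions: A^T x."""
    r = len(A[0])
    return [sum(A[i][k] * x[i] for i in range(len(x))) for k in range(r)]

def knn_predict(query, train_x, train_y, A, k=3):
    """Majority vote among the k nearest training samples after projection."""
    q = project(query, A)
    scored = [(math.dist(q, project(x, A)), y)
              for x, y in zip(train_x, train_y)]
    scored.sort(key=lambda item: item[0])
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

# Hypothetical projection that keeps only the first (informative)
# coordinate and discards the second (noisy) one.
A = [[1.0], [0.0]]
train_x = [[0.0, 5.0], [0.2, -3.0], [0.1, 9.0],
           [1.0, 0.0], [1.1, 4.0], [0.9, -7.0]]
train_y = ["interictal"] * 3 + ["ictal"] * 3

pred_a = knn_predict([0.95, 100.0], train_x, train_y, A, k=3)
pred_b = knn_predict([0.05, -50.0], train_x, train_y, A, k=3)
```

When the projection already separates the classes well, as in this toy case, the vote is the same for a wide range of neighbor counts, which is consistent with the small fluctuation reported above.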
4.3. Algorithm Performance Comparison
The proposed MMLE^{3} algorithm is compared with several algorithms on the CHB-MIT dataset. During the experiment, every algorithm is run 10 times, and the specificity, sensitivity, and accuracy of all algorithms are recorded in Tables 1–3. CMML, MVTSKFS, MvCVM, and MMLE^{3} can make use of various forms of EEG features. In the experiment, three forms of EEG features are used: time domain, frequency domain, and nonlinear features. In the time domain, the pattern of the eigenvalues of the correlation coefficient matrix of the EEG signal changes before and after a seizure, which shows that the time domain correlation coefficient matrix and its eigenvalues can predict the onset and termination of a seizure to a certain extent. The amplitude and phase in the frequency domain are effective features, which can directly reflect the difference between the seizure period and the interictal period. Entropy describes the uncertainty of an information source and plays an important role in analyzing nonstationary EEG signals. MMLE^{3} obtains the best classification performance and has the highest generalization ability; it is 2.43%, 2.52%, and 2.44% higher than the baseline algorithm EquiDML in specificity, sensitivity, and accuracy, respectively. It can be seen that the comprehensive use of multifeature information can improve the accuracy of epilepsy recognition.
In addition, the MMLE^{3} algorithm uses must-link and cannot-link constraints to project the samples into a low-dimensional space, in which the distance between samples under cannot-link constraints is maximized, so that samples of different classes become equidistant; meanwhile, the distance between samples under must-link constraints is minimized, so that samples of the same class are compressed to a point. We introduce the feature weight vector to adaptively weight the various features and adjust the weight relationship between them in the process of metric matrix learning. On the premise that the classification tasks of all the features are consistent, the MMLE^{3} algorithm can effectively mine the associated and complementary information hidden among the features and highlight the role of the optimal features, which gives it a stronger discriminative ability. Therefore, the results in Tables 1–3 indicate that various forms of EEG features can be treated differently in the MMLE^{3} algorithm and the similarity between EEG data samples can be measured more accurately. The MMLE^{3} algorithm shows superiority and effectiveness for EEG recognition of epilepsy.
5. Conclusion
In clinical research, EEG is a basic tool for diagnosing and studying brain diseases, especially in the field of epilepsy diagnosis. This study explores how to improve the classification accuracy of epileptic EEG based on various feature extraction methods and a metric learning algorithm. We propose the MMLE^{3} algorithm for EEG recognition of epilepsy. In the process of metric matrix learning, MMLE^{3} uses various forms of EEG features and adjusts the weight relationship between them. Experiments show that the classification performance obtained by using multiple features together is significantly better than that of a single feature, and multifeature metric learning has better stability and generalization ability. In the future, we will embed the proposed algorithm into a deep network to learn new latent representations, and we will apply the proposed algorithm to clinical diagnosis in the next stage. In addition, with the development of computer-aided technology, visualized operating systems are one of the development trends of future medical care. We will also try to build the MMLE^{3} algorithm into a visual operating system to facilitate its application in clinical diagnosis.
Data Availability
Copies of the used data can be obtained free of charge from https://physionet.org/content/chbmit/1.0.0/.
Conflicts of Interest
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported in part by the Science and Technology Project of Changzhou City under Grant No. CE20215032, the Natural Science Foundation of Jiangsu Province under Grant BK 20211333, and the Future Network Scientific Research Fund Project under Grant No. FNSRFP2021YB36.