Abstract

Mobile edge computing (MEC) enables pattern recognition and intelligent processing of real-time data. The electroencephalogram (EEG) is a very important tool in the study of epilepsy, providing rich information that cannot be provided by other physiological methods. In the automatic classification of EEG signals by intelligent algorithms, feature extraction and classifier construction are both very important steps. Different feature extraction methods, such as time domain, frequency domain, and nonlinear dynamic feature methods, capture independent and diverse information, and using multiple forms of features at the same time can improve the accuracy of epilepsy recognition. In this paper, we apply metric learning to epileptic EEG signal recognition. Inspired by the equidistance constrained metric learning algorithm, we propose multifeature metric learning based on enhanced equidistance embedding (MMLE3) for EEG recognition of epilepsy. The MMLE3 algorithm makes use of various forms of EEG features, and the feature weight vector is adjusted adaptively, without manual tuning. MMLE3 maximizes the distance between samples under cannot-link constraints, so that samples of different classes become equidistant; meanwhile, it minimizes the distance between samples under must-link constraints, so that samples of the same class are compressed to one point. Under the premise that the classification tasks of the various features are consistent, MMLE3 can fully extract the associated and complementary information hidden between the features. The experimental results on the CHB-MIT dataset verify that the MMLE3 algorithm has good generalization performance.

1. Introduction

Mobile edge computing (MEC) brings cloud computing capabilities and the Internet service environment to the edge of the network, which allows it to serve nearby users and effectively makes up for the deficiencies of cloud computing [1]. The combination of MEC and artificial intelligence technology has been a research hotspot in recent years, and MEC has rich application scenarios in the field of intelligent medicine. Electroencephalogram (EEG) analysis is widely used in neuroscience, especially in the diagnosis of epilepsy and the detection of seizures [2, 3]. In clinical practice, the diagnosis of epilepsy is mainly based on the patient’s history of seizures, with further examination and diagnosis based on the EEG signals. EEG-based epilepsy detection mainly relies on the personal experience of the doctor. With the gradual development of intelligent medical treatment, the automatic recognition and detection of epileptic EEG signals have become an important auxiliary detection means. How to extract effective features from EEG and design appropriate classification algorithms is the key task in epilepsy detection.

At present, the most commonly used feature extraction methods include time domain analysis, frequency domain analysis, time-frequency analysis, nonlinear dynamics, and model-based methods [4]. The time domain feature method regards EEG as a time series, calculates the relevant statistics of the sequence, and extracts the corresponding epileptic EEG features. For example, Kaya et al. [5] used histogram features based on local binary patterns (LBP) together with Bayesian networks to classify epileptic seizures. Another widely used strategy is to extract frequency domain features from a given EEG signal. The Fourier transform is one of the most commonly used algorithms for extracting frequency domain features from time series data. Frassineti et al. [6] proposed a preprocessing step in which the signal is filtered by a stationary wavelet transform to reduce possible artifacts; a fine Gaussian support vector machine is then used to detect epilepsy. Chandel et al. [7] proposed a combination of features based on triadic wavelet decomposition to predict the onset and termination of epileptic seizures. This method extracted the standard deviation, variance, and high-order moments to represent the characteristics of different EEG activities and used linear discriminant analysis and k-nearest neighbor (KNN) classifiers to classify EEG between seizure and interictal periods. In the extraction of nonlinear dynamic features, methods based on complexity analysis are widely used in epilepsy detection, and the most common are feature extraction methods based on entropy. For example, Xiang et al. [8] developed a feature extraction algorithm using fuzzy entropy. This method first calculated the fuzzy entropy of EEG signals from different epileptic states, then performed feature selection, and finally used a support vector machine for prediction. Hussein et al. [9] identified EEG seizures by modifying the fuzzy entropy with minimum variance. First, appropriate filtering and independent component analysis were carried out to remove noise and artifacts, and then the proportional operation was carried out to obtain the optimal features. Empirical mode decomposition is an adaptive analysis method associated with the Hilbert-Huang transform, and it has recently been widely used in epilepsy detection. Bajaj and Pachori [10] regarded the intrinsic mode functions (IMFs) as a group of amplitude- and frequency-modulated signals and gave the analytic representation of each IMF by using its Hilbert transform. Two kinds of bandwidth were then calculated from the IMFs, namely, the amplitude modulation (AM) bandwidth and the frequency modulation (FM) bandwidth. Kaleem et al. [11] used a new variant of empirical mode decomposition. This model allowed the detrending of the signal based on the time scale, decomposed the signal into detrended and nondetrended components according to a frequency separation criterion, and then extracted features from the decomposed components. Usman et al. [12] converted the EEG data into a surrogate channel and then used empirical mode decomposition to improve the prediction results.

Supervised learning is widely used in epilepsy detection. Some well-known supervised learning algorithms, such as KNN, decision trees, and metric learning, have been successfully used for this task. Metric learning aims to learn a more suitable distance measurement criterion in the feature space, in order to represent the similarity between samples more accurately. It is widely used in face recognition, object detection, image recognition, and so on. Weinberger and Saul [13] developed the large margin nearest neighbor (LMNN) algorithm based on a support vector machine. The obtained Mahalanobis distance had the advantages of maximum margin and internal consistency. Liu et al. [14] developed a global metric learning algorithm, which increased the separation between samples of different categories in EEG signal recognition. Phan et al. [15] developed a global metric learning framework using supervised information. The algorithm can directly process EEG data without preprocessing such as artifact removal. Alwasiti et al. [16] developed a deep metric learning algorithm. Unlike traditional deep learning models, which require a large amount of training data, this algorithm requires very little training data.

Many classification algorithms rely heavily on the distance measurement of the input data. In EEG classification, the key problem is to find a good distance measurement, so that a test EEG sample is assigned to the class of its nearest EEG samples. Many studies have shown that an appropriate distance measure can significantly improve classification accuracy. In metric learning, EEG recognition depends on the similarity measurement between the input EEG data samples, which is realized by the distance measurement between the input feature vectors of those samples [17, 18]. Therefore, it is crucial to find a good distance metric in the sample feature space.

Due to the rhythm of EEG signals and the collection of EEG signals from multiple channels, EEG data samples have rich feature information. Different forms of features contribute differently to EEG recognition: some play a decisive role, some play an auxiliary role, and some play a small role or none. Conventional metric learning measures the similarity of EEG data samples while treating all features equally. Obviously, this strategy cannot accurately measure the similarity between EEG data samples, which will affect EEG signal recognition. In addition, various types of EEG signal features can be obtained from different feature extraction algorithms. Based on the principle of consistency and complementarity, the features of each type contain specific information, and the use of multiple forms of features at the same time will improve the accuracy of epilepsy recognition.

In this paper, we apply metric learning to the recognition of epileptic EEG signals. We make full use of various forms of EEG features and assign their weights automatically. We seek an appropriate distance measure for EEG data samples, so as to measure the similarity between them more accurately and thereby improve the accuracy of epilepsy recognition. To achieve this goal, we propose multifeature metric learning based on enhanced equidistance embedding (MMLE3) for EEG recognition of epilepsy. We borrow the techniques of the EquiDML algorithm [19] to maximize the distance between samples under cannot-link constraints, so that samples belonging to different classes become equidistant. At the same time, the distance between samples under must-link constraints is minimized, so that samples belonging to the same class are compressed to one point. In the process of metric matrix learning, a feature weight vector is introduced, and the various features are adaptively weighted to effectively adjust the weight relationship between them. Under the premise that the classification tasks of the various features are consistent, the MMLE3 algorithm can effectively mine the hidden and complementary information between the features and highlight the role of the optimal feature, which gives it a stronger discriminative ability. We conduct experiments on the CHB-MIT dataset, and the experimental results validate the effectiveness of MMLE3.

Metric learning uses given pairs of samples to calculate the similarity between pairs of feature vectors, generally through a distance metric. Taking the commonly used Mahalanobis distance as an example, the distance metric between two samples $x_i$ and $x_j$ can be written as

$$d_M(x_i, x_j) = \sqrt{(x_i - x_j)^T M (x_i - x_j)}, \tag{1}$$

where $M$ is a positive semidefinite matrix. $M$ can be decomposed as $M = QQ^T$, where the matrix $Q$ is the metric matrix (or projection matrix). Therefore, Equation (1) can be expressed as

$$d_M(x_i, x_j) = \sqrt{(x_i - x_j)^T QQ^T (x_i - x_j)} = \left\| Q^T (x_i - x_j) \right\|_2. \tag{2}$$

Therefore, the essence of metric learning is to learn a mapping space. In classification tasks, the commonly used strategy for metric learning is to output a positive value close to zero for pairs of samples of the same class and output larger values for pairs of samples of different classes.
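The equivalence between the matrix form and the projected form of the Mahalanobis distance can be checked numerically. The following is a minimal sketch, not the authors' implementation; the factorization convention `M = Q @ Q.T`, the function names, and the random test data are assumptions made for illustration.

```python
import numpy as np


def mahalanobis_via_projection(x_i, x_j, Q):
    """Distance computed in the projected space: ||Q^T (x_i - x_j)||_2."""
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    return float(np.linalg.norm(Q.T @ diff))


def mahalanobis_via_matrix(x_i, x_j, Q):
    """Same distance computed from M = Q Q^T (assumed factorization)."""
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    M = Q @ Q.T  # positive semidefinite by construction
    return float(np.sqrt(diff @ M @ diff))


# Hypothetical data: 5-dimensional samples projected to 2 dimensions.
rng = np.random.default_rng(0)
Q = rng.standard_normal((5, 2))
x_i = rng.standard_normal(5)
x_j = rng.standard_normal(5)

d_proj = mahalanobis_via_projection(x_i, x_j, Q)
d_mat = mahalanobis_via_matrix(x_i, x_j, Q)
print(np.isclose(d_proj, d_mat))
```

With `Q` equal to the identity matrix, both functions reduce to the ordinary Euclidean distance, which is why learning `Q` can be read as learning a mapping space.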

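Given a labeled dataset $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{d \times n}$ with dimensionality $d$ and $n$ samples, the label matrix $Y$ is composed of all class labels of $X$, with $y_i$ denoting the class label of $x_i$. The sets of must-link and cannot-link are defined as

$$S = \{(x_i, x_j) \mid y_i = y_j\}, \qquad D = \{(x_i, x_j) \mid y_i \neq y_j\}. \tag{3}$$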

According to the classification principle of minimum intraclass distance and maximum interclass distance, a supervised metric learning framework can be represented as where and are thresholds for sets of must-link and cannot-link, respectively, and .

In the EquiDML algorithm [19], the sample pairs in the must-link set are gathered directly to a single point, and the distances of sample pairs in the cannot-link set are forced to have the same constant value. The constraints in the EquiDML algorithm are expressed as

$$d_M(x_i, x_j) = 0 \ \text{for must-link pairs}, \qquad d_M(x_i, x_j) = c \ \text{for cannot-link pairs}, \tag{5}$$

where $c$ is a positive value. The equidistance constraint indicates that the distance between classes must be greater than the distance within classes. In the metric space, the pairs in the cannot-link set correspond to samples of different classes, and the distance of every such pair takes the same constant value.
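To make the pairwise constraints concrete, the sketch below builds must-link and cannot-link index pairs from class labels and measures how far a given projection is from the equidistance ideal. It is an illustration under stated assumptions rather than the EquiDML implementation: the function names, the constant `c`, the squared penalties, and the toy data are all hypothetical.

```python
from itertools import combinations

import numpy as np


def build_pair_sets(labels):
    """Split all index pairs into must-link (same label) and cannot-link."""
    must_link, cannot_link = [], []
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            must_link.append((i, j))
        else:
            cannot_link.append((i, j))
    return must_link, cannot_link


def equidistance_losses(X, Q, must_link, cannot_link, c=1.0):
    """Return (must-link loss, cannot-link loss) for samples in rows of X.

    Must-link pairs are penalized by their squared distance (target 0);
    cannot-link pairs by the squared deviation from the constant c.
    """
    Z = X @ Q  # each row is Q^T x for one sample
    ml = sum(np.linalg.norm(Z[i] - Z[j]) ** 2 for i, j in must_link)
    cl = sum((np.linalg.norm(Z[i] - Z[j]) - c) ** 2 for i, j in cannot_link)
    return float(ml), float(cl)


# Toy data: each class already collapsed to a point, classes one unit apart.
labels = [0, 0, 1, 1]
X = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
must_link, cannot_link = build_pair_sets(labels)
ml_loss, cl_loss = equidistance_losses(X, np.eye(2), must_link, cannot_link, c=1.0)
print(must_link, cannot_link, ml_loss, cl_loss)
```

In this toy configuration both losses vanish, which is the situation the equidistance constraint aims for: same-class samples coincide and every different-class pair sits at the same distance.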

3. Multifeature Metric Learning Based on Enhanced Equidistance Embedding

3.1. The Objective Function of MMLE3

We try to learn a metric space with distinguishing ability. In this metric space, the samples belonging to the same class with different feature forms are as close as possible, and the samples belonging to different classes with different feature forms are as far away as possible. That is, there are more compact intraclass distances and more separable interclass distances in the metric space. The Mahalanobis distance of the -th feature of samples and can be written in the following form as

Based on the EquiDML algorithm [19], the proposed MMLE3 algorithm exploits the correlation and difference between multiple forms of features and becomes more discriminative by learning the complementary information of different types of features. Then, the MMLE3 algorithm can be represented by where and are the sample sets of must-link and cannot-link with the -th feature expression, respectively. and are the size of and , respectively. is the trade-off parameter, and is the feature weight vector, and is the feature weight of the -th sample features. It is worth emphasizing that (1) in order to reduce the convergence time, MMLE3 uses the shifted squared loss for the set , and (2) the feature weight vector is not a parameter that needs to be adjusted manually; it can be obtained in closed form.
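As a rough illustration of how a feature weight vector can combine several forms of features into one distance, the sketch below computes a weighted sum of per-feature squared distances in a shared projected space. This is only one plausible reading: the weighted-sum form, the shared matrix `Q`, the symbol `w`, and the function name are assumptions, and the exact MMLE3 objective may combine the terms differently.

```python
import numpy as np


def multifeature_distance(views_i, views_j, Q, w):
    """Weighted sum over feature forms of ||Q^T (x_i^v - x_j^v)||^2.

    views_i, views_j: sequences with one feature vector per feature form.
    Q: projection matrix shared by all feature forms (assumption).
    w: nonnegative feature weights, one per feature form (assumption).
    """
    total = 0.0
    for x_i, x_j, w_v in zip(views_i, views_j, w):
        diff = Q.T @ (np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float))
        total += float(w_v) * float(diff @ diff)
    return total


# Two hypothetical feature forms of dimension 2, equal weights, identity Q.
views_i = [[0.0, 0.0], [0.0, 0.0]]
views_j = [[1.0, 0.0], [0.0, 2.0]]
d = multifeature_distance(views_i, views_j, np.eye(2), w=[0.5, 0.5])
print(d)  # 0.5 * 1 + 0.5 * 4 = 2.5
```

Raising the weight of a feature form makes its distances dominate the combined value, which is the mechanism by which adaptive weights can highlight the most useful features.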

3.2. The Optimization Procedure

According to the Lagrange multiplier method, the Lagrangian function of Equation (7) can be represented as where is the Lagrange parameter.

There are two parameters to optimize in MMLE3, the metric matrix $Q$ and the feature weight vector. We use an alternating updating strategy to obtain their optimal values. When the feature weight vector is fixed in Equation (8), the optimization of $Q$ is equivalent to solving the following problem:

Using the simplest projected gradient method, can be updated by where . denotes projection on . and are the step size and regularization parameters, respectively.
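A generic projected gradient step can be sketched as follows. The sketch assumes that the projection is onto the cone of positive semidefinite matrices, which is the usual choice in Mahalanobis metric learning but is an assumption here; the function names and the example matrix are hypothetical, and the gradient of the MMLE3 objective itself is not reproduced.

```python
import numpy as np


def project_psd(M):
    """Project a square matrix onto the positive semidefinite cone.

    The matrix is symmetrized, and negative eigenvalues are clipped to zero.
    """
    sym = (M + M.T) / 2.0
    eigvals, eigvecs = np.linalg.eigh(sym)
    clipped = np.clip(eigvals, 0.0, None)
    return (eigvecs * clipped) @ eigvecs.T


def projected_gradient_step(M, grad, step):
    """One gradient descent step followed by projection onto the PSD cone."""
    return project_psd(M - step * grad)


# A matrix with one negative eigenvalue is mapped to its nearest PSD matrix.
M = np.array([[1.0, 0.0], [0.0, -2.0]])
M_psd = project_psd(M)
print(M_psd)
```

Clipping eigenvalues gives the closest positive semidefinite matrix in the Frobenius norm, so each iterate remains a valid metric matrix after the update.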

At the -th iteration, denote , where . Using the positive semidefinite matrix approximation method [19], we can obtain as where . is a positive diagonal matrix.

When $Q$ is fixed in Equation (8), the optimization of the feature weight vector is equivalent to solving . We can obtain

Then, the solution of is

Unlike a manual parameter adjustment strategy, the weight parameter in the MMLE3 algorithm is adaptive, and it can converge to the extreme value from any initial value.

Based on the above analysis, we give the MMLE3 algorithm as follows.

Input: Multi-feature matrix , its cannot-link and must-link sets;
Output: The metric matrix Q and the feature weight vector.
Initialization:, Q = I,
Repeat
  t = t + 1
  Step 1: fix Q(t), compute the feature weight vector using Eq. (13);
  Step 2: fix the feature weight vector, compute Q(t) using Eqs. (9)-(11);
  Step 3: compute the value of the objective function J(t);
Until J(t) converges or
Step 4: obtain the optimal Q and the feature weight vector.

4. Experiments

4.1. Dataset and Feature Extraction

The dataset used, CHB-MIT, is from Boston Children’s Hospital [20]. The signals were recorded with the international standard 10-20 system at a sampling frequency of 256 Hz. Example EEG signals and the international standard 10-20 system are shown in Figures 1 and 2, respectively. The dataset includes the scalp EEG data of 23 patients with epilepsy. Among them, 5 are males, aged between 3 and 22 years, and 17 are females, aged between 1.5 and 19 years. Case No. 21 was recorded from patient No. 1 again, one and a half years later. The gender of patient No. 24 is unknown. In our experiment, 21 of the 24 cases are selected, excluding Nos. 6, 12, and 16, since some channel data of these patients cannot be read.

We use three forms of EEG features in the experiment. The first form is the time domain features of EEG signals. We extract the correlation coefficient matrix of the original EEG signal and its eigenvalues and fuse them. The detailed time domain feature extraction process is shown in Figure 3. The “Date” is the original EEG signals. The “Sta” is the standardized matrix of EEG signals. The “CorrM” is the correlation coefficient matrix of “Sta,” and the “Eigen” is the set of eigenvalues of the “CorrM” matrix. The “Corr” is the expansion of “CorrM.” The “Feat1” is the experimental time domain feature obtained by fusing “Corr” and “Eigen.”
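The time domain pipeline described above (standardize, correlate, take eigenvalues, expand, fuse) can be sketched in a few lines. This is an illustrative sketch only: the function name, the choice of the strict upper triangle as the "expansion" of the correlation matrix, and the random 4-channel, 2-second segment at 256 Hz are assumptions, not details confirmed by the paper.

```python
import numpy as np


def time_domain_features(data):
    """Build a time domain feature vector from one EEG segment.

    data: array of shape (n_channels, n_samples).
    """
    # "Sta": standardize each channel to zero mean and unit variance.
    sta = (data - data.mean(axis=1, keepdims=True)) / data.std(axis=1, keepdims=True)
    # "CorrM": channel-by-channel correlation coefficient matrix.
    corr_m = np.corrcoef(sta)
    # "Eigen": eigenvalues of the symmetric correlation matrix.
    eigen = np.linalg.eigvalsh(corr_m)
    # "Corr": expansion of CorrM, here its strict upper triangle (assumption).
    corr = corr_m[np.triu_indices_from(corr_m, k=1)]
    # "Feat1": fusion by concatenation.
    return np.concatenate([corr, eigen])


# Hypothetical segment: 4 channels, 2 seconds at 256 Hz.
rng = np.random.default_rng(0)
segment = rng.standard_normal((4, 512))
feat1 = time_domain_features(segment)
print(feat1.shape)
```

Because a correlation matrix has ones on its diagonal, its eigenvalues always sum to the number of channels, which gives a quick sanity check on the extracted vector.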

The second form is the frequency domain features of EEG signals. The detailed frequency domain feature extraction process is shown in Figure 4. The amplitude spectrum and phase spectrum in the frequency domain are two important features related to the time domain information. After the amplitude and phase features (called “AS” and “PS” in Figure 4) are extracted, the correlation coefficients (called “CorrM1” and “CorrM2” in Figure 4) and the eigenvalues of the spectra (called “Eigen1” and “Eigen2” in Figure 4) are further extracted. The Feat2 is the experimental frequency domain feature obtained by fusing “Eigen1,” “Eigen2,” “Corr1,” and “Corr2.”

The third form is the nonlinear features of EEG signals. In the experiment, the Shannon entropy, spectral entropy, and differential entropy of each of the delta (1-4 Hz), theta (4-7 Hz), alpha (7-13 Hz), and beta (13-30 Hz) bands of EEG signals are calculated. Then, the nonlinear feature Feat3 is obtained from the three entropies.
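The frequency domain feature described above can be sketched in the same spirit as the time domain one. The sketch assumes a real-input FFT per channel, the strict upper triangle as the expansion of each correlation matrix, and a random 4-channel segment; the function names are hypothetical and the entropy-based Feat3 is not covered here.

```python
import numpy as np


def _corr_and_eigen(spectrum):
    """Return the expanded correlation matrix and its eigenvalues."""
    corr_m = np.corrcoef(spectrum)
    eigen = np.linalg.eigvalsh(corr_m)
    corr = corr_m[np.triu_indices_from(corr_m, k=1)]  # expansion (assumption)
    return corr, eigen


def frequency_domain_features(data):
    """Build a frequency domain feature vector from one EEG segment.

    data: array of shape (n_channels, n_samples).
    """
    spectrum = np.fft.rfft(data, axis=1)
    amplitude = np.abs(spectrum)   # "AS": amplitude spectrum per channel
    phase = np.angle(spectrum)     # "PS": phase spectrum per channel
    corr1, eigen1 = _corr_and_eigen(amplitude)
    corr2, eigen2 = _corr_and_eigen(phase)
    # "Feat2": fusion of Eigen1, Eigen2, Corr1, and Corr2 by concatenation.
    return np.concatenate([eigen1, eigen2, corr1, corr2])


# Hypothetical segment: 4 channels, 2 seconds at 256 Hz.
rng = np.random.default_rng(1)
segment = rng.standard_normal((4, 512))
feat2 = frequency_domain_features(segment)
print(feat2.shape)
```

For 4 channels the vector holds 4 + 4 eigenvalues and 6 + 6 correlation coefficients; real recordings with more channels scale accordingly.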

We compare MMLE3 with seven algorithms: LMNN [13], ITML [21], RDML-CCPVL [22], EquiDML [19], CMML [23], MV-TSK-FS [4], and MvCVM [24]. The slack variable in ITML is selected in {0.01, 0.1, 1, 10}. In CMML, the tradeoff parameter, learning rate, and parameter are set to 1, $10^{-6}$, and 5, respectively. The number of fuzzy rules in MV-TSK-FS is selected in , and three regularization parameters are set in . In MV-TSK-FS and MvCVM, the penalty parameter for each view is selected in {1, 10, $10^{2}$, $10^{3}$}, and the Gaussian kernel parameter is selected in {$10^{-2}$, $10^{-1}$, …, $10^{2}$}. In MMLE3, the parameter is selected in {0.1, 0.2, …, 0.9}, and the parameter is set to be 2. KNN is used as the classifier in MMLE3. We use grid search and 5-fold cross-validation to select the best parameters. All algorithms are run on an i7-8700K CPU at 3.2 GHz with 32 GB RAM, and the software is MATLAB 2016. The evaluation indices are specificity, sensitivity, and classification accuracy. Each experiment is executed 10 times.
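The three evaluation indices follow directly from the confusion counts of a binary classifier. The sketch below uses the standard definitions; treating the seizure class as the positive label `1`, the function name, and the toy predictions are assumptions made for illustration.

```python
def seizure_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1  # seizure correctly detected
            else:
                fn += 1  # seizure missed
        else:
            if p == positive:
                fp += 1  # false alarm
            else:
                tn += 1  # nonseizure correctly rejected
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(y_true) if len(y_true) else float("nan")
    return sensitivity, specificity, accuracy


# Hypothetical predictions: 3 seizure segments and 5 nonseizure segments.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
sens, spec, acc = seizure_metrics(y_true, y_pred)
print(sens, spec, acc)
```

Reporting sensitivity and specificity alongside accuracy matters for seizure data, where nonseizure segments usually far outnumber seizure segments.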

4.2. MMLE3 Performance Verification

The classification accuracy of MMLE3 with different parameters is shown in Figure 5. The first parameter is the balance parameter . The parameter is between [0,1] to balance the proportion of minimizing the distance term of the same class samples and maximizing the distance term of different class samples in the objective function. The accuracy of MMLE3 with different is shown in Figure 5(a). When the balance parameter is 1, the MMLE3 algorithm only optimizes the must-link constraint and ignores the cannot-link constraint, so its classification accuracy is low. When the balance parameter is close to 0, the objective function of MMLE3 ignores the optimization of the must-link constraint, so the classification accuracy of MMLE3 is also unsatisfactory. From Figure 5(a), when the balance parameter is between 0.4 and 0.6, the two optimization terms can be balanced, so that the EEG data samples in the metric space have the highest discriminative ability, and the classification accuracy of MMLE3 is the highest.

Second, we evaluate the dimension in matrix . The dimension of each form of features is 200. The MMLE3 algorithm obtains the Mahalanobis matrix by metric learning on the multiple forms of EEG features and projects the features into the projection space. The dimension plays an important role in MMLE3. The classification accuracy of MMLE3 with different is shown in Figure 5(b). When is very small, MMLE3 ignores most of the discriminative feature information, which leads to low classification accuracy. As the value of increases, more discriminative feature information is retained, and the classification accuracy improves. When increases to a certain value, all the discriminative feature information has been obtained, and the remaining minor or ineffective feature information contributes little to EEG signal recognition. Therefore, the classification accuracy of MMLE3 remains stable.

Third, we evaluate the KNN classifier parameter $k$ in MMLE3. The parameter $k$ is selected in {3, 6, …, 30}. The accuracy of MMLE3 with different $k$ is shown in Figure 5(c). The value of $k$ has little effect on classification accuracy. Regardless of the value of $k$, the fluctuation of classification accuracy is very small.
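Classification with KNN in a learned metric space amounts to projecting the samples and then voting among the nearest neighbors. The following is a minimal sketch under stated assumptions: samples are stored in rows, the projection is a single matrix `Q`, and the function name and toy data are hypothetical rather than taken from the paper.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, X_test, Q, k=3):
    """Predict labels by majority vote among k nearest projected neighbors."""
    Z_train = X_train @ Q  # each row is Q^T x for one training sample
    Z_test = X_test @ Q
    predictions = []
    for z in Z_test:
        dists = np.linalg.norm(Z_train - z, axis=1)
        nearest = np.argsort(dists)[:k]
        votes = Counter(int(y_train[i]) for i in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return np.array(predictions)


# Toy data: two well-separated classes; identity Q leaves the space unchanged.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_test = np.array([[0.2, 0.2], [5.5, 5.5]])
preds = knn_predict(X_train, y_train, X_test, np.eye(2), k=3)
print(preds)
```

When the learned projection already pulls each class toward a point, the vote is rarely close, which is consistent with the small sensitivity to the neighborhood size reported above.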

4.3. Algorithm Performance Comparison

The proposed MMLE3 algorithm is compared with several algorithms on the CHB-MIT dataset. During the experiment, every algorithm runs 10 times, and the specificity, sensitivity, and accuracy of all algorithms are recorded in Tables 1-3. CMML, MV-TSK-FS, MvCVM, and MMLE3 can make use of various forms of EEG features. In the experiment, three forms of EEG features are used: time domain, frequency domain, and nonlinear features. In the time domain, the pattern of the eigenvalues of the correlation coefficient matrix of the EEG signal changes before and after a seizure, which shows that the time domain correlation coefficient matrix and its eigenvalues can predict the onset and termination of seizures to a certain extent. The amplitude and phase in the frequency domain are effective features, which directly reflect the difference between the seizure period and the interval between seizures. Entropy describes the uncertainty of an information source and plays an important role in the analysis of nonstationary EEG signals. MMLE3 obtains the best classification performance and has the highest generalization ability; it is 2.43%, 2.52%, and 2.44% higher than the baseline algorithm EquiDML in specificity, sensitivity, and accuracy, respectively. It can be seen that the comprehensive use of multifeature information can improve the accuracy of epilepsy recognition.

In addition, the MMLE3 algorithm uses must-link and cannot-link constraints to project the samples into a low-dimensional space, in which the distance between samples under cannot-link constraints is maximized, so that samples of different classes become equidistant; meanwhile, the distance between samples under must-link constraints is minimized, so that samples of the same class are compressed to a point. We introduce the feature weight vector to adaptively weight the various features and effectively adjust the weight relationship between them in the process of metric matrix learning. On the premise that the classification tasks of the various features are consistent, the MMLE3 algorithm can effectively mine the associated and complementary information hidden among features and highlight the role of the optimal features, which gives it a stronger discriminative ability. Therefore, the results in Tables 1-3 indicate that various forms of EEG features can be treated differently in the MMLE3 algorithm and that the similarity between EEG data samples can be measured more accurately. The MMLE3 algorithm shows superiority and effectiveness for EEG recognition of epilepsy.

5. Conclusion

In clinical research, EEG is a basic tool for diagnosing and studying brain diseases, especially in the field of epilepsy diagnosis. This study explores how to improve the classification accuracy of epileptic EEG based on various feature extraction methods and a metric learning algorithm. We propose the MMLE3 algorithm for EEG recognition of epilepsy. In the process of metric matrix learning, MMLE3 uses various forms of EEG features and effectively adjusts the weight relationship between them. Experiments show that using multiple features together yields significantly better classification performance than using a single feature and that multifeature metric learning has better stability and generalization ability. In the future, we will embed the proposed algorithm into a deep network to learn new latent representations, and we will apply the proposed algorithm to clinical diagnosis in the next stage. In addition, with the development of computer-aided technology, visualized operating systems are one of the development trends of future medical care. We will also try to build the MMLE3 algorithm into a visual operating system to facilitate its application in clinical diagnosis.

Data Availability

Copies of the used data can be obtained free of charge from https://physionet.org/content/chbmit/1.0.0/.

Conflicts of Interest

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported in part by the Science and Technology Project of Changzhou City under Grant No. CE20215032, the Natural Science Foundation of Jiangsu Province under Grant BK 20211333, and the Future Network Scientific Research Fund Project under Grant No. FNSRFP-2021-YB-36.