#### Abstract

Epilepsy detection based on electroencephalogram (EEG) signals is of great significance to the diagnosis and treatment of epilepsy. Most traditional epilepsy detection methods operate on denoised EEG signals. However, because denoising is imperfect, local information is lost and residual noise remains, which degrades detection performance. To solve this problem, this paper proposes an epilepsy detection method that works directly in a noisy environment. Although epileptic and nonepileptic signals are distinguishable to some extent, the features used to separate them must overcome the interference of noise. Hence, the improved sample entropy and the phase synchronization indexes of corresponding pairs of intrinsic mode functions (IMFs) produced by variational mode decomposition (VMD) are proposed as features, which reduce the impact of noise on detection performance. The experimental results show that the accuracy, sensitivity, and specificity are 91.78%, 91.27%, and 93.61%, respectively. The method can be used as an auxiliary tool for the clinical treatment of epilepsy.

#### 1. Introduction

Epilepsy is one of the most common diseases of the nervous system, affecting about 60 million people around the world [1]. Epilepsy detection results are one of the main bases on which neurosurgeons treat epilepsy. Traditionally, epilepsy detection is performed by neurosurgeons who inspect the electroencephalogram (EEG) and rely on their own clinical experience [2]. This approach is time-consuming and depends on the subjective judgment of the neurosurgeon. Therefore, automatic, high-performance epilepsy detection is a major research direction [3].

As far as the authors know, denoised EEG signals are widely used to detect epilepsy. However, epilepsy detection methods based on denoised signals are limited in practical application because denoising is imperfect. Denoising methods fall into two classes. The first uses a bandpass filter, based on the assumption that signal and noise occupy different frequency bands. However, the boundary effect of the filter causes poor filtering near the cut-off frequency. The second identifies noise based on the assumption that noise and signal come from different sources. Independent component analysis (ICA) is an outstanding representative of this class [4]. ICA is reported to filter out up to 95% of the noise, but its filtering performance degrades as the number of channels decreases. In addition, ICA is time-consuming and cannot support real-time epilepsy detection. Moreover, both classes of denoising methods sometimes filter out epileptic signals by mistake, resulting in the loss of epileptic information. To avoid this, the paper realizes epilepsy detection based on the complete signals in a noisy environment.

EEG is a complex physiological phenomenon produced by the interaction of different tissues and organs. Nonlinear methods can describe its physiological features accurately and yield information that is closer to the real state of brain regulation [5, 6]. The most common nonlinear analysis methods include the correlation dimension, the Lyapunov exponent, and sample entropy. Because the correlation dimension and the Lyapunov exponent place certain requirements on data length [7], sample entropy is widely used for EEG analysis at present [8]. As far as the authors know, sample entropy in epilepsy detection has so far been computed from denoised signals. Because denoising is imperfect, this sample entropy may differ from the real sample entropy. In addition, sample entropy represents the overall complexity of a signal but lacks local information. The paper uses the changing trend of the local sample entropy to improve the sample entropy, so that the improved sample entropy represents both overall and local complexity and therefore reflects the characteristics of signals in a noisy environment more faithfully.

Phase synchronization is the result of interaction between brain nerves, and its high sensitivity allows it to capture small dynamic changes [9, 10]. In clinical application, neurosurgeons determine the type of epilepsy according to features in different frequency bands in order to make a treatment plan [11]. The frequency bands differ between patients, and a bandpass filter with fixed bands cannot always compensate for this individual difference. Variational mode decomposition (VMD), as an adaptive decomposition method, solves this problem of frequency band decomposition [12]. In the paper, the phase synchronization index is selected as the measure of phase synchronization. The signals from channel FZCZ and channel CZPZ are each decomposed by VMD into 6 IMFs occupying different frequency bands. The 5 phase synchronization indexes between corresponding IMF pairs (excluding IMF6) from the two channels are selected for epilepsy detection because of their remarkable detection ability in a noisy environment.

In the paper, the improved sample entropy and phase synchronization indexes are adopted in a noisy environment. The main contributions can be summarized as follows.

(i) The improved sample entropy is proposed for noisy environments. It overcomes the shortcoming that sample entropy sometimes fails to represent the real complexity of part of the signal because of local special signals. Compared with traditional sample entropy, the improved sample entropy has stronger epilepsy detection ability in a noisy environment.

(ii) In order to obtain features that can be used by neurologists, frequency band decomposition is realized by VMD. It solves the problem that fixed frequency bands cannot be extracted because of individual differences between patients. The 5 phase synchronization indexes between corresponding IMF pairs (excluding IMF6) are shown to have strong epilepsy detection ability in a noisy environment.

(iii) As far as the authors know, this paper realizes epilepsy detection in a noisy environment for the first time. It avoids the loss of epileptic information in the process of filtering.

The remainder of the paper is organized as follows. Section 2 presents the block diagram of epilepsy detection and the principles of the methods adopted in the paper. Section 3 describes the experimental processing and results, including feature extraction, feature analysis, and the realization of epilepsy detection. Section 4 gives a brief conclusion and future research directions.

#### 2. Principle and Methods

##### 2.1. Overall Structure of Epilepsy Detection

In the paper, the signals from channel FZCZ and channel CZPZ are selected and subjected to outlier processing. The improved sample entropy is then computed, which represents the complexity of the signals in a noisy environment more faithfully. The signals from the 2 channels are each decomposed into 6 IMFs by VMD, and the 5 phase synchronization indexes between corresponding IMF pairs (excluding IMF6) are selected as features. A random forest model realizes epilepsy detection based on the improved sample entropy and the phase synchronization indexes. The block diagram of epilepsy detection is shown in Figure 1.

##### 2.2. VMD

VMD is a completely nonrecursive signal decomposition method, which mainly decomposes the signal into several narrow band components around different center frequencies. The center frequency is constantly changing. By finding the optimal solution of the constrained variational model, the variational modal components are obtained. The adaptive segmentation of each component in the frequency domain is completed. More details are in [12]. EEG signal can be decomposed into multiple components living in different frequency bands by VMD in the paper. An example of VMD is shown in Figure 2.
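For reference, the constrained variational model mentioned above is usually written in the following form. This is the standard VMD formulation rather than a formula quoted from this paper; the symbols $f(t)$ (input signal), $u_k(t)$ (the $k$-th mode), $\omega_k$ (its center frequency), and $K$ (the number of modes) are introduced here for illustration.

$$\min_{\{u_k\},\{\omega_k\}} \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$

Each term measures the bandwidth of one mode after it is shifted to baseband around its center frequency, so minimizing the sum yields narrow-band components that together reconstruct the signal.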

##### 2.3. Improved Sample Entropy

The paper proposes a method for improving sample entropy so that it represents signal complexity faithfully in a noisy environment. The method consists of two stages: nonuniform processing and sample entropy adjustment.

###### 2.3.1. Principle of Sample Entropy

Sample entropy is an important index to describe the complexity of signal [13]; the steps are as follows.

*Step 1.* The sequence $x(n)$, $n = 1, 2, \ldots, N$, is composed into a group of *P*-dimensional vectors denoted as *X*(*l*). It is expressed as

$$X(l) = [x(l), x(l+1), \ldots, x(l+P-1)], \quad l = 1, 2, \ldots, N-P+1.$$

*Step 2.* Define the distance *D*[*X*(*l*), *X*(*s*)] between vectors *X*(*l*) and *X*(*s*) as the largest difference between the corresponding elements of the two vectors. *D*[*X*(*l*), *X*(*s*)] is expressed as follows:

$$D[X(l), X(s)] = \max_{k = 0, 1, \ldots, P-1} \left| x(l+k) - x(s+k) \right|.$$

*Step 3.* Given threshold *R*, count the number of *D*[*X*(*l*), *X*(*s*)] less than *R* and the proportion of this number to the total *N*−*P*, which is given by

$$B_l^P(R) = \frac{M}{N-P},$$

where *M* is the number of vectors *X*(*s*) that satisfy *D*[*X*(*l*), *X*(*s*)] < *R*, *l* = 1∼*N* − *P* + 1, *l* ≠ *s*.

*Step 4.* Calculate the average value of $B_l^P(R)$ as follows:

$$B^P(R) = \frac{1}{N-P+1} \sum_{l=1}^{N-P+1} B_l^P(R).$$

*Step 5.* The sequence is composed into (*P* + 1)-dimensional vectors in order, and step 1 to step 4 are repeated to get the corresponding average, denoted as $B^{P+1}(R)$.

*Step 6.* The sample entropy of *x*(*n*) is denoted as

$$\mathrm{SampEn} = -\ln \frac{B^{P+1}(R)}{B^P(R)}.$$
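The six steps can be sketched in plain Python as follows. This is a minimal, unoptimized illustration rather than the authors' implementation; the function names and the convention of returning infinity when no matches are found are assumptions made here. The threshold `r` is taken as an absolute value, since the paper does not state how *R* is chosen.

```python
import math


def match_fraction(x, p, r):
    """Steps 1-4: average fraction of vector pairs (l != s) closer than r."""
    n_vec = len(x) - p + 1                       # number of p-dimensional vectors
    vectors = [x[l:l + p] for l in range(n_vec)]
    total = 0.0
    for l in range(n_vec):
        matches = 0
        for s in range(n_vec):
            if s == l:                           # self-matches are excluded
                continue
            # Step 2: largest element-wise difference (Chebyshev distance)
            dist = max(abs(a - b) for a, b in zip(vectors[l], vectors[s]))
            if dist < r:
                matches += 1
        total += matches / (n_vec - 1)           # Step 3: proportion for vector l
    return total / n_vec                         # Step 4: average over all vectors


def sample_entropy(x, p, r):
    """Steps 5-6: negative log ratio of the (p+1)- and p-dimensional averages."""
    b_p = match_fraction(x, p, r)
    b_p1 = match_fraction(x, p + 1, r)
    if b_p == 0 or b_p1 == 0:
        return float("inf")                      # assumed convention: no matches
    return -math.log(b_p1 / b_p)
```

A strictly periodic sequence gives a value close to zero, while an irregular sequence gives a larger value, which is the sense in which sample entropy measures complexity.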

###### 2.3.2. Nonuniform Processing

Compared with nonepileptic signals, the proportion of epileptic signals with high amplitude in noisy environment is larger. In order to reduce the influence of noise on the amplitude of EEG signal, the nonuniform processing method is proposed. The method can enlarge the amplitude difference between epileptic signals and nonepileptic signals. It is denoted as follows:

where *x* and *y* are the signal before and after nonuniform processing, respectively.

###### 2.3.3. Sample Entropy Adjustment

Existing EEG analysis methods divide the signal into several periods and extract features in each period. Sample entropy is widely used in epilepsy detection. It reflects the overall complexity of a period but lacks local information [14], and it is sometimes dominated by a local special signal. The development of epilepsy is not sudden but a process that evolves over time. Therefore, the changing trend of the local sample entropy represents the character of the overall sample entropy to a certain extent [15]. The paper proposes a sample entropy improving method for epilepsy detection, which takes 2 seconds as an analysis period and divides each period into 8 segments. According to the changing trend of the sample entropy across the 8 segments, the sample entropy is adjusted so as to improve the epilepsy recognition ability in a noisy environment. The improved sample entropy is denoted as follows:

SampEn and SampEn_{imp} denote sample entropy and improved sample entropy, respectively. *a* is the nonepileptic modulation coefficient of sample entropy. When the sample entropy of the segments increases by 20% for 3 consecutive times, *a* is set to 1; otherwise it is set to 0. *β* is the epileptic modulation coefficient of sample entropy. When the sample entropy of the segments decreases by 20% for 3 consecutive times, *β* is set to 1; otherwise it is set to 0. *µ* is the adjustment factor, which is determined by the difference in sample entropy between epileptic signals and nonepileptic signals in the specific channel.
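The trend rule for the two modulation coefficients can be sketched as below. The run detection follows the description in the text, with "increases 20%" read as "at least 20% higher than the previous segment", which is an interpretation. The way the coefficients are combined in `improved_sample_entropy` is purely an assumption for illustration: an additive shift by the adjustment factor, upward for a nonepileptic trend and downward for an epileptic trend. All function names are hypothetical.

```python
def has_run(values, step_ok, run_length=3):
    """True if `step_ok(prev, curr)` holds for `run_length` consecutive steps."""
    run = 0
    for prev, curr in zip(values, values[1:]):
        run = run + 1 if step_ok(prev, curr) else 0
        if run >= run_length:
            return True
    return False


def modulation_coefficients(segment_entropy, change=0.20, run_length=3):
    """Return (a, beta) from the sample entropy of the 8 segments of one period."""
    rising = has_run(segment_entropy, lambda p, c: c >= p * (1 + change), run_length)
    falling = has_run(segment_entropy, lambda p, c: c <= p * (1 - change), run_length)
    return int(rising), int(falling)


def improved_sample_entropy(samp_en, segment_entropy, mu):
    """ASSUMED additive form: SampEn + (a - beta) * mu. Not taken from the paper."""
    a, beta = modulation_coefficients(segment_entropy)
    return samp_en + (a - beta) * mu
```

Under this assumed form, a period whose segments show neither trend keeps its original sample entropy, and a period that meets both conditions is also left unchanged because the two shifts cancel.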

##### 2.4. Phase Synchronization

It is important to select signals based on features from different regions. The original signal transmits to the scalp through complex paths. It will be affected by other signals. In the paper, the EEG signal is obtained from 5 regions based on position: forehead region, left temporal region, right temporal region, occipital region, and hippocampus region. The region division is shown in Figure 3. The signals in the same regions mostly come from the same source, so they usually have strong similarity [16].

Phase synchronization means that there is a certain relationship between the phases of 2 signals. Even when the amplitudes of the two signals remain uncorrelated, their phases may be in a synchronous state [17]. Phase synchronization can be described as follows.

Assuming that two oscillatory systems *X* and *Y* interact with each other and the output signals of the systems are *x*(*t*) and *y*(*t*), the *n*:*m* (*n* and *m* are natural numbers) phase synchronization is defined as

$$\left| n\phi_x(t) - m\phi_y(t) \right| < \varepsilon,$$

where $\phi_x(t)$ and $\phi_y(t)$ are the instantaneous phases of *x*(*t*) and *y*(*t*), respectively, which include the initial phases $\theta_x$ and $\theta_y$ of *x*(*t*) and *y*(*t*), and $\varepsilon$ is a small positive constant. If *X* and *Y* have the same frequency, $n = m = 1$, and the phase difference is denoted as $\varphi(t) = \phi_x(t) - \phi_y(t)$.

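The phase synchronization index is an important measure of phase synchronization, which is defined as follows:

$$\gamma = \left| \left\langle e^{j\varphi(t)} \right\rangle \right| = \sqrt{\langle \cos\varphi(t) \rangle^2 + \langle \sin\varphi(t) \rangle^2},$$

where $\varphi(t)$ is the phase difference between the two signals and $\langle \cdot \rangle$ means getting the average over time.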
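As a small illustration, the index can be computed from two sequences of instantaneous phases as follows. The sketch uses the common mean-phase-coherence form (the length of the average unit vector of the phase difference), which is assumed here to be the definition intended. How the instantaneous phases are obtained is not stated in the text; a Hilbert transform of each IMF is a typical choice and is left outside this sketch. The function name is hypothetical.

```python
import math


def phase_sync_index(phase_x, phase_y):
    """Length of the time-averaged unit vector of the 1:1 phase difference."""
    diffs = [px - py for px, py in zip(phase_x, phase_y)]
    mean_cos = sum(math.cos(d) for d in diffs) / len(diffs)
    mean_sin = sum(math.sin(d) for d in diffs) / len(diffs)
    # A value near 1 means a nearly constant phase difference,
    # a value near 0 means the phase difference is spread over the circle.
    return math.hypot(mean_cos, mean_sin)
```

Because only the phase difference enters the computation, two signals with a constant phase lag are reported as fully synchronized regardless of their amplitudes.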

#### 3. Experiments and Results

##### 3.1. Experimental Data

In this paper, the data are provided by the Massachusetts Institute of Technology (MIT) [18]. The data contain 24 EEG records of 23 patients, collected with the international 10–20 system. The start time and end time of each epileptic episode are manually labeled by epilepsy experts. The total duration of the records is 979.8 hours, including 197 records of epileptic signals lasting 3.23 hours. The data set is large and representative, covers many types of epileptic data, and is widely used for epilepsy detection.

##### 3.2. Outlier Processing

In the paper, outlier processing is used to reduce the impact of noise. As an example, 120 seconds of epileptic signals and 120 seconds of nonepileptic signals of patient 18 are randomly selected. Each 2-second span is taken as an analysis period. Outliers in each period are identified by the Pauta criterion and replaced by the median. The number of outliers is shown in Figure 4.
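The outlier step can be sketched as below. The Pauta criterion is the 3σ rule: a sample is treated as an outlier when it deviates from the mean of its period by more than three standard deviations. Using the sample standard deviation and taking the median over the whole period, outliers included, are assumptions of this sketch, and the function name is hypothetical.

```python
import statistics


def replace_outliers(period):
    """Replace samples outside mean +/- 3*std of one analysis period by the median."""
    mean = statistics.mean(period)
    std = statistics.stdev(period)        # sample standard deviation (assumption)
    median = statistics.median(period)    # median of the whole period (assumption)
    return [median if abs(v - mean) > 3 * std else v for v in period]
```

The function is meant to be applied to each 2-second period separately, so that the mean and standard deviation adapt to the local signal level.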

It can be seen from Figure 4 that more outliers are found in nonepileptic signals than in epileptic signals. The main reason is that outliers with smaller amplitude are submerged by the high-amplitude epileptic signals, which reduces the number of detected outliers.

Outliers in the EEG signal usually affect the performance of epilepsy detection, and some of them are caused by noise. Hence outlier processing is used to reduce the impact of noise. At present, ICA is widely used as one of the best denoising methods and can remove up to 95% of the noise, so the signal processed by ICA is used as the reference standard. The outliers in each period are identified by the Pauta criterion. The signals are processed by outlier processing and by ICA, respectively, and the correlation coefficient between the two resulting signals is shown in Figure 5.

It can be seen from Figure 5 that, for the different channels, the EEG signals after outlier processing are strongly correlated with the signals after ICA denoising. This indicates that outlier processing can reduce the effect of noise in a way comparable to ICA.

##### 3.3. Improved Sample Entropy

Compared with nonepileptic signals, the amplitude of epileptic signal is relatively higher. In order to enlarge the difference between epileptic signal and nonepileptic signal, the paper adopts nonuniform processing method to process EEG signal. The paper takes signal from channel FZCZ of patient 18 as an example. The amplitude distribution of epileptic signals and nonepileptic signals is given in Figure 6. It can be seen from Figure 6 that, compared with epileptic signals, the number of nonepileptic signals with high amplitude is relatively smaller.


Enlarging the amplitude difference between epileptic signals and nonepileptic signals makes the two classes easier to separate, which supports high-performance epilepsy detection.

The analysis of variance (ANOVA) is used to analyze the significance of the difference between epileptic signals and nonepileptic signals. The proportion by which the *P*-value decreases is shown in Figure 7. The smaller the *P*-value, the stronger the ability to distinguish epilepsy from nonepilepsy. It can be seen from Figure 7 that the *P*-value decreases markedly after nonuniform processing; that is, the difference between the two classes becomes more significant.

The adjustment factor is decided by average sample entropy of epileptic signals and nonepileptic signals. The average sample entropy is shown in Table 1.

It can be seen from Table 1 that the relationship between the sample entropy of epileptic signals and that of nonepileptic signals in the same channel is not consistent across channels. The main reason is that different channels are located at different positions on the brain, so they are affected by different noise and by signals from other channels.

Sometimes sample entropy cannot truly represent the complexity of EEG signal because of local special signals. In order to get the complexity of local signal, the 2-second period is divided into 8 segments (represented as No 1, No 2, etc.). The sample entropy of signals in each segment is calculated respectively. The changing trend of sample entropy between 8 segments is integrated into the whole sample entropy. The partial sample entropy needs to be adjusted as given in Table 2. The sample entropy meeting the ascending adjustment requirements accounts for 4.21% of the total. The sample entropy meeting the descending adjustment standard accounts for 2.94% of the total. The sample entropy meeting the descending adjustment standard and ascending adjustment standard accounts for 0.04% of the total. The sample entropy will be adjusted based on adjustment factors when meeting adjustment standard. The adjustment factor is obtained according to average sample entropy of specific channel in Table 1. Take channel CZPZ as an example; the average sample entropy of epileptic signals is 0.66, and the average sample entropy of nonepileptic signals is 0.71. Therefore, the adjustment factor of channel CZPZ is the difference of epileptic signals and nonepileptic signals (0.05 in this case).

The ANOVA is used to analyze the significance difference of the improved sample entropy between epileptic signals and nonepileptic signals. The *P*-value is obtained by this method. The significance difference of sample entropy before and after adjustment was calculated. The results are shown in Table 3. It can be seen from Table 3 that the significance of epileptic signals and nonepileptic signals after adjustment significantly increases. In particular, the *P*-value of signals in channel F7T7 is adjusted from 0.045 to 0.023 by sample entropy adjustment. Therefore, the improved sample entropy improves the detection ability of epileptic signals and nonepileptic signals.

The data acquisition environment is nonideal, so missing data are inevitable during acquisition. The robustness of the improved sample entropy is therefore analyzed. Data with no missing samples are chosen as the reference standard. Incomplete data are generated by randomly removing samples in proportions of 0.05, 0.1, 0.15, 0.2, 0.25, and 0.3. The paper calculates the correlation coefficient between the incomplete data and the complete data. The results are shown in Figure 8. It can be seen from Figure 8 that, for the same proportion of missing data, the correlation coefficient of the improved sample entropy in a noisy environment is larger than that of the traditional sample entropy based on denoised signals. This shows that the improved sample entropy is more robust than the traditional sample entropy.
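The robustness experiment can be sketched as follows, assuming that the correlation is computed between a feature evaluated on the complete periods and the same feature evaluated on the periods with samples removed. This reading, the Pearson correlation, and all function names are assumptions; `feature` stands for any per-period feature, such as a sample entropy routine.

```python
import math
import random
import statistics


def drop_random(signal, proportion, rng):
    """Remove a random `proportion` of the samples while keeping their order."""
    n_drop = round(len(signal) * proportion)
    kept = sorted(rng.sample(range(len(signal)), len(signal) - n_drop))
    return [signal[i] for i in kept]


def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm = math.sqrt(sum((a - mean_u) ** 2 for a in u)
                     * sum((b - mean_v) ** 2 for b in v))
    return cov / norm


def robustness(periods, feature, proportion, seed=0):
    """Correlate the feature on complete periods with the feature on incomplete ones."""
    rng = random.Random(seed)
    complete = [feature(p) for p in periods]
    incomplete = [feature(drop_random(p, proportion, rng)) for p in periods]
    return pearson(complete, incomplete)
```

Calling `robustness` once for each proportion from 0.05 to 0.3 would produce one curve of the kind compared in Figure 8; a feature whose curve stays closer to 1 is less sensitive to missing data.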

##### 3.4. Phase Synchronization Analysis in the Same Regions

The interaction between signals in the same frequency band is active and contains abundant physiological information. In order to obtain IMFs in different frequency bands, the EEG signals are decomposed by VMD. Selecting an appropriate number of IMFs (denoted as *K*) is of great significance. If *K* is too small, the decomposition is insufficient and meaningful information is neglected. If *K* is too large, the center frequencies of different IMFs may be close to each other, resulting in mode aliasing. Hence, the center frequency observation method of [19] is adopted to determine *K*. Because the center frequency of the same IMF differs over time, the average center frequency is obtained by averaging the center frequency of the same IMF. The paper analyzes the relationship between the number of IMFs and the average center frequency of each IMF. The results are shown in Figure 9.


It can be seen from Figure 9 that when the number of IMFs is 6, the average center frequencies of the IMFs are clearly separated. However, when the number of IMFs is 7, the difference in center frequency between IMFs is smaller than when the number of IMFs is 6. This is caused by excessive decomposition. It can be concluded that 6 is the best number of IMFs in the paper.

The signals from 2 channels in the same region are decomposed to get IMFs by VMD. The average phase synchronization index of corresponding 2 IMFs in the same region is analyzed. The results are shown in Table 4.

It can be seen from Table 4 that the average phase synchronization index of the IMFs differs between regions. The farther a region is from the hippocampus, the lower the phase synchronization. On the whole, the phase synchronization of epileptic signals is higher than that of nonepileptic signals. The main reason is that nonepileptic signals contain more random features than epileptic signals. When epileptic signals occur, the proportion of epileptic information in the EEG signal becomes larger, so the degree of synchronization is higher. At the same time, the phase synchronization index gradually decreases as the frequency of the IMF increases. The phase synchronization information carried by high-frequency IMFs is significantly less than that carried by low-frequency IMFs.

In the paper, the EEG of patient 18 was randomly selected for analysis with 2 seconds as analysis period. The signal is decomposed into 6 IMFs by VMD (*K* = 6). In the same region, the significance difference of phase synchronization index of corresponding 2 IMFs is analyzed. The *P*-value obtained by ANOVA is shown in Table 5.

It can be seen from Table 5 that when the phase synchronization index is used as the feature for distinguishing epilepsy from nonepilepsy, the significance is more obvious in the noisy environment than after signal denoising. The main reason is that denoising methods consider only amplitude and frequency and ignore phase, so part of the phase information is lost and the phase synchronization index is affected. The ability of different IMFs to separate epilepsy from nonepilepsy differs significantly between regions. On the whole, the significance of the IMFs in the hippocampus region is the most obvious. This region is closest to the source of epileptic seizures, so in theory the epileptic information obtained there is the most timely. Therefore, the phase coupling features of the hippocampus region are the best choice for phase synchronization analysis.

##### 3.5. Phase Synchronization Analysis in the Different Regions

In the paper, 5 channels from 5 different regions were randomly selected for analysis. Channel FP1F3 in the forehead region, channel F7T7 in the left temporal region, channel F8T8 in the right temporal region, channel P7O1 in the occipital region, and channel FZCZ in the hippocampus region are selected as representative channels. The data from each channel are decomposed into 6 IMFs by VMD. ANOVA is used for significance analysis between epileptic signals and nonepileptic signals. The results are shown in Table 6. It can be seen from Table 6 that the difference between epileptic signals and nonepileptic signals is significant for most IMF pairs in different regions, but there is at least one pair of IMFs whose significance is not obvious. Therefore, the significance between epileptic signals and nonepileptic signals is worse for channels in different regions than for channels in the same region.

##### 3.6. Epilepsy Detection Results

In the paper, channel FZCZ and channel CZPZ in the hippocampus region are selected as analysis channels, and 2 seconds is taken as an analysis period. The improved sample entropy is used as a feature. The signals from the two channels are each decomposed into 6 IMFs by VMD, and the phase synchronization index between corresponding IMFs (excluding IMF6) is calculated, giving a total of 7 signal features. The random forest model is used to realize epilepsy detection. Random forest is an ensemble learning algorithm and a representative of bagging. It combines multiple weak classifiers and obtains the final result by voting or by taking the mean value. The random forest model has high performance and good generalization. The grid search method is used to optimize the model parameters. The optimal number of decision trees is 900, and the number of variables randomly selected at each split is 5. In order to reduce overfitting, 10-fold cross validation is used for performance verification. Many scholars have used the same data to achieve epilepsy detection, and the performance is compared in Table 7.

It can be seen from Table 7 that the method proposed in the paper can achieve epilepsy detection in a noisy environment. From the experimental results, it can be inferred that the improved sample entropy and the phase synchronization index combined with VMD perform well as features in a noisy environment. The signal loss caused by denoising is avoided, and signal integrity is preserved to the greatest extent. At the same time, only two channels are needed. The 10-fold cross validation helps ensure that the results are independent of the subject. The 10-fold cross validation results show that the method is effective and achieves high accuracy. The method detects most epileptic signals, and only very few nonepileptic signals are classified as epileptic.
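For clarity, the three reported measures can be computed from per-period labels as in the sketch below, using the usual definitions: accuracy is the fraction of correctly classified periods, sensitivity is the fraction of epileptic periods detected, and specificity is the fraction of nonepileptic periods correctly rejected. The label convention (1 for epileptic, 0 for nonepileptic) and the function name are assumptions of this illustration.

```python
def detection_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels, 1 = epileptic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # epileptic, detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # epileptic, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # nonepileptic, rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # nonepileptic, false alarm
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity
```

In a 10-fold cross validation, these values would typically be computed on each held-out fold and then averaged.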

#### 4. Conclusion

Epilepsy detection is realized in a noisy environment, which avoids the information loss caused by denoising. The improved sample entropy and the phase synchronization index are selected as features. Through nonuniform processing and adjustment, the improved sample entropy achieves stronger epilepsy detection ability than the original sample entropy. For phase synchronization analysis, channels in the same region perform better than channels in different regions. VMD adaptively decomposes the signal into 6 IMFs, and the phase synchronization indexes between corresponding IMF pairs (excluding IMF6) can distinguish epilepsy from nonepilepsy. The random forest model realizes epilepsy detection. The results show that the accuracy, sensitivity, and specificity are 91.78%, 91.27%, and 93.61%, respectively. These results verify the advantage of the method: it can still detect epilepsy with high performance from EEG signals contaminated by complex noise. When filtering removes epileptic information, some epileptic events cannot be detected. By avoiding this loss, the method effectively avoids the diagnostic delay caused by erroneous detection. Hence, the method has wider application potential.

However, the method has a disadvantage. Channel CZPZ and channel FZCZ do not always contain enough epileptic information for detection, especially for some refractory epilepsy. In some periods, when the signal quality of the two channels is poor, they are not the optimal channels. Therefore, an adaptive channel selection method that does not rely on fixed channels could further improve detection performance by improving local performance. Adaptive channel selection in a noisy environment will be our future research work.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was supported by the National Natural Science Foundation of China (61370222), National Key R&D Program of China under Grant 2020YFB1710200, and State Key Laboratory for Novel Software Technology (2020YFB1710200).