Computational Intelligence and Neuroscience

Computational Intelligence and Neuroscience / 2019 / Article
Special Issue

Brain-Inspired Intelligent Systems for Daily Assistance

View this Special Issue

Research Article | Open Access

Volume 2019 |Article ID 5627156 | 13 pages | https://doi.org/10.1155/2019/5627156

Motor Imagery EEG Classification Based on Decision Tree Framework and Riemannian Geometry

Academic Editor: Anastassia Angelopoulou
Received24 Sep 2018
Revised20 Dec 2018
Accepted03 Jan 2019
Published21 Jan 2019

Abstract

This paper proposes a novel classification framework and a novel data reduction method to distinguish multiclass motor imagery (MI) electroencephalography (EEG) for brain computer interface (BCI) based on the manifold of covariance matrices in a Riemannian perspective. For method 1, a subject-specific decision tree (SSDT) framework with filter geodesic minimum distance to Riemannian mean (FGMDRM) is designed to identify MI tasks and reduce the classification error in the nonseparable region of FGMDRM. Method 2 includes a feature extraction algorithm and a classification algorithm. The feature extraction algorithm combines semisupervised joint mutual information (semi-JMI) with general discriminate analysis (GDA), namely, SJGDA, to reduce the dimension of vectors in the Riemannian tangent plane. And the classification algorithm replaces the FGMDRM in method 1 with k-nearest neighbor (KNN), named SSDT-KNN. By applying method 2 on BCI competition IV dataset 2a, the kappa value has been improved from 0.57 to 0.607 compared to the winner of dataset 2a. And method 2 also obtains high recognition rate on the other two datasets.

1. Introduction

Brain computer interface (BCI) based on motor imagery (MI) is used to analyze human intention by electroencephalogram (EEG) signals generated by human brain electrophysiological activity [1, 2]. Based on BCI technology, exoskeletons can be used to help people with physical disabilities regain their motor ability, and BCI also has wide applications in smart home, entertainment, military, and other fields [36].

Common spatial pattern (CSP) is widely used in motor imagery to extract EEG features [7]. CSP has excellent performance in two classification tasks, but the drawback is that it needs a lot of electrodes [8].

Despite its short history, the use of the Riemannian geometry in BCI decoding is currently attracting increasing attention [913]. Covariance matrices lie in the space of symmetric positive definite (SPD) matrices, which can be formulated as a Riemannian manifold [14]. In the BCI field, the connections of the CSP algorithm and the tools of information geometry have been investigated, considering several divergence functions in alternative to the Riemannian distance [1518]. Barachant et al. proposed a simple data augmentation approach for improving the performance of the Riemannian mean distance to mean (MDM) algorithm [13]. Kumar et al. propose a single band CSP framework for MI-BCI that utilizes the concept of tangent space mapping in the manifold of covariance matrices, and the proposed method obtains good results when compared to other competing methods [19]. A hierarchical MDM classifier for multiclass problem has been tested in [20].

Advanced classifiers based on the tangent space on the Riemannian manifold of positive matrices are also receiving increasing attention. Barachant et al. map the covariance matrices in the tangent space and apply feature selection and linear discriminate analysis (LDA) in the tangent space [10]. For the application of the classifier in the tangent space, the problem is that the curse of dimensionality. Traditional data dimensionality reduction methods include two categories: linear dimensionality reduction (LDR) and nonlinear dimensionality reduction (NLDR). Since most of the actual data are nonlinear, NLDR techniques such as locally linear embedding (LLE) [21], isometric mapping (ISOMAP) [22], maximum variance unfolding (MVU) [23], and t-distributed stochastic neighbor embedding (t-SNE) [24, 25] are used to tackle problems widely. Lee et al. used discrete wavelet transform (DWT) and continuous wavelet transform (CWT) to extract features of MI tasks, and Gaussian mixture model (GMM) was used to construct GMM supervectors; this method accelerates the speed of training and improves the accuracy of motor imagery [26]. Sadatnejad et al. propose a new kernel to preserve the topology of data points in the feature space, and the proposed kernel is strong, particularly in the cases where data points have a complex and nonlinear separable distribution [8]. Xie et al. proposed a framework for intrinsic submanifold learning from a high-dimensional Riemannian manifold; the proposed method exhibited strong robustness against a small training dataset [27].

There is still another approach for overcoming the problem of high dimensionality in SPD manifolds. And this method maps from a high-dimensional SPD manifold to a lower dimensional one while the geometry of SPD manifolds is preserved. And there are only two works of this way. Davoudi et al. [14] proposed distance preservation to local mean (DPLM) as dimensionality reduction technique, combined with FGMDM, the best performance of this article in terms of kappa value is 0.60. Harandi et al. [28] learned a mapping that maximizes the geodesic distances between interclass and simultaneously minimizes the distances between intraclass, and it is done via an optimization on Grassmann manifolds.

In this paper, we proposed a novel SSDT-FGMDRM and SSDT-KNN for the classification of multiclass MI tasks by designing a simple yet efficient subject-specific decision tree framework. Method 1 contains SSDT-FGMDRM to improve the performance of FGMDRM. For each individual, method 1 first separates the two most discriminative classes from the group. Furthermore, the remaining categories including the misclassification samples of the previous nodes are reclassified in the last node. Method 2 contains SSDT-KNN and a NLDR method named SJGDA. SJGDA combines the advantage of semi-JMI and GDA, and method 2 performed well on different datasets. The aims of this article are as follows:(1)To verify the effectiveness of the proposed SSDT framework through dataset 1(2)To verify the superiority of SJGDA in feature extraction, compared with semi-JMI and GDA(3)To validate the generalization ability of method 2 through different datasets, in this paper

The rest of the paper is organized as follows: Section 2 introduced the mathematical preliminaries of the Riemannian geometry. Section 3 discussed the proposed methods in detail. Three datasets are introduced in Section 4. The results of our work are discussed in Section 5. And in Section 6, we compared our methods with the state of the art. This paper concludes in Section 7.

2. Geometry of SPD Matrices

Let Xi represent a short segment of continuous EEG signals, and Xi can be denoted as follows:where Xi corresponds to the trail of imaged movement starting at time t = Ti. denotes the number of sampled points of the selected segment.

For the trail, the spatial covariance matrix (SCM) can be calculated as follows:

Based on the SCM, there are two ways to classify MI tasks in the Riemannian manifold.

2.1. Filter Geodesic Minimum Distance to the Riemannian Mean

The Riemannian distance between two SPD matrices P1 and P2 in P(n) is given by [29]

Given m SPD matrices P1, … , Pm, the geometric mean in the Riemannian sense is defined as

For algorithm mean Riemannian distance to Riemannian mean (MDRM), we compute the Riemannian distance between unknown class P to the Riemannian mean point of each class and classify the unknown class into categories corresponding to the shortest distance. Inspired by the principal geodesics analysis (PGA) method [30], the literature [31] finds a set of filters by applying an extension of Fisher linear discriminant analysis (FLDA) named Fisher geodesic discriminant analysis (FGDA). And then, apply these filters to MDRM to form filter geodesic minimum distance to Riemannian mean (FGMDRM). More details can be seen from [31].

2.2. Tangent Space Mapping

As shown in Figure 1, the SPD matrix of P is denoted by a differentiable Riemannian manifold Z. Each tangent vector Si can be seen as the derivative at t = 0 of the geodesic Γ(t) between P and the exponential mapping Pi = EXPP(Si), defined as follows:

The inverse mapping is given by the logarithmic mapping and can be defined as follows:

Using the Riemannian geodesic distance, the Riemannian mean of I > 1 SPD matrices by

Using the tangent space located at the geometric mean of the whole set trials, , and then, each SCM Pi is mapped into this tangent space, to yield the set of m=n(n + 1)/2 dimensional vectors:

Many efficient classification algorithms can be implemented in the Riemannian space [10].

3. Methods

3.1. Subject-Specific Decision Tree Framework

Decision tree is a common machine learning method. Each node of decision tree can be defined as a rule. Guo and Gelfand [32] proposed classification trees with neural network, and this method embeds multilayer neural networks directly in nodes. In the decision tree, one of the most important things is to construct a proper binary tree structure; the upper nodes have the greater impact of the accuracy of the whole samples [33]. In order to solve the multiclassification problem in this paper, we constructed a subject-specific decision tree (SSDT) classification framework as shown in Figure 2 according to the best separating principle [34]. As can be seen from Figure 2, the SSDT proposed in this paper trains a different classification model at different nodes of the decision tree.

The advantages of the SSDT framework are as follows:(1)This model separates the two MI tasks (e.g., C.1 and C.2) with the highest recognition rate as far as possible(2)At the last node, we reclassify some samples to enhance the classification ability of the classifier

3.2. Method 1: A Direct Classification Method Based on SSDT-FGMDRM

Firstly, we point out one problem of the multiclass FGMDRM by using an example. Figure 3 gives a three-class classification problem. Figure 3(a) shows the classification progress by FGMDRM. We can see that three Riemannian mean points (RMPs) are located on the manifold. Since the classification criterion is decided by the distance calculated between the test point and the RMP, it caused a wrong classification. Figure 3(b) shows the example of the classification results obtained by using the first node of the SSDT-FGMDRM framework. It can be seen that the error classification is corrected by using the decision tree framework.

Method 1 is used to classify four types of MI tasks directly. The training and testing diagram is shown in Figure 4.

3.3. Feature Extraction Algorithm Based on the Riemannian Tangent Space

In this paragraph, we propose a novel data reduction method which combines semi-JMI and GDA, namely SJGDA, to solve the dimension disaster problem after tangent space mapping.

3.3.1. Semisupervised Joint Mutual Information

Semisupervised dataset D = D{DL ∪ DU} consists of two parts, are labelled data and are unlabelled data. A binary random variable S is introduced to determine the distribution of labelled dataset and unlabelled dataset. When s = 1, we record the value of y, otherwise not. In this way, the labelled set DL comes from the joint distribution p(x, y|s = 1), while the unlabelled set DU comes from the distribution p(x|s = 0). The underlying mechanism S turns out to be very important for feature selection.

Feature selection method based on mutual information theory is a common feature selection method [35]. In these methods, we rank the features according to the score and select the features with higher scores. For example, by ranking the features according to their mutual information with the labels, we get the sort of correlation that is related to class labels. The characteristics of the score are defined as follows:where Xθ represents the set of the features already selected and Xk is the feature ranked by scores. Y represents the label corresponding to feature Xk.

Semi-JMI is a method of using a semisupervised dataset as a training set for JMI. More details can be seen from Reference [36]. In this paper, the missingness mechanism is class-prior-change semisupervised scenario (MAR-C) [37]. After feature ranking, we can obtain a feature vector as follows:where n is the length of the tangent vectors Si. Since information redundancy exists in f, we select the best vector length m (m < n) of each subject by the classification recognition rate:

3.3.2. Generalized Discriminant Analysis

After variable selection, this paper uses generalized discriminant analysis (GDA) [38, 39], which is a nonlinear feature reduction technique based on kernels to reduce the length of the feature vectors fSJ and their redundancies. Mapping X (fSJ) into a high-dimensional space F through a kernel function Φ:

The linear Fisher decision is performed in the F space, and the criterion function for its extension iswhere WΦ ∈ F and SB and SW are between-class scatter and within-class scatter, respectively.

For the convenience of the numerical calculation, kernel functions are introduced to solve the problem:

Gauss kernel, poly kernel, and sigmoid kernel are widely used in GDA [40]. For test data z, its image Φ(z) in F space projects on WΦ is as follows:

This paper uses ploy kernel to reduce the dimension. After GDA, we can get a vector fG as follows:where d of fG is decided by the actual needs, and in this paper we set d = 1. And then, SJGDA is applied to the dataset of this paper, and the final feature vectors are constructed as follows:

3.4. Method 2: SJGDA and Subject-Specific Decision Tree k-Nearest Neighbor

Method 2 is used to classify four types of MI tasks after tangent space mapping. The training and testing diagram is shown in Figure 5.

4. Description of Data

4.1. Dataset 1

BCI competition IV dataset 2a is used to evaluate the performance of the proposed two methods [41]. Dataset 2a collects 22 channel EEG data and 3 EOG channel data. Four types of motor imagery were collected: left hand, right hand, foot, and tongue. The dataset contains nine healthy subjects and each subject has two sessions, one training session and one test session. Each session has 288 trails of MI data with 72 trails for each MI task. The EEG signals are bandpass filtered by a 5-th order Butterworth filter in the 8–30 Hz frequency band. The selection of trial period is important in MI classification; we select 2 s data (0.5 s and 2.5 s) after the cue, instructing the user to perform the MI tasks by the winner of the competition.

4.2. Dataset 2

BCI competition III dataset IIIa is used to evaluate the performance of method 2. BCI III dataset IIIa contains 3 subjects: K3b, K6b, and L1b, and collects 64 channel EEG data. The EEG was sampled with 250 Hz. Four types of motor imagery were collected: left hand, right hand, foot, and tongue. More details about this dataset can be seen at Reference [42].

4.3. Dataset 3

In our own dataset, Emotiv Epoc+ is used to collect EEG data of motor imagery. It is a portable EEG acquisition device with a sampling rate of 128 Hz. It has fourteen electrode channels (AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, and AF4), two inference electrodes (CMS and DRL), and the electrode placement follows the international 10–20 standard. Equipment and the Emotiv 14 electrodes are located over 10–20 international system positions as shown in Figure 6. This experiment collected three kinds of EEG signals of one joint: imagination of shoulder flexion (F), extension (E), and abduction (A), as shown in Figure 7.

Seven subjects participated in this experimental study. These subjects were in good health. During the experiment, subjects were naturally placed with both hands, trying to avoid body or head movement. During the experiment, subjects carried out motor imagery under the outside cue, a single experiment collected EEG signal for 5 seconds, and then took 5–7 seconds to have rest, each action repeated acquisition 20 times. The experimental process is shown in Figure 8.

5. Results

5.1. Results of Method 1

We use SSDT-FGMDRM to classify multiclass MI tasks as introduced in Section 3.1. Since there are four classes, we can have four pairs of MI tasks: left vs rest (L/RE), right vs rest (R/RE), foot vs rest (F/RE), and tongue vs rest (T/RE). For each subject, the pair with the highest accuracy is used to train N.1, and the pair with the second highest accuracy is to train N.2. Table 1 gives the ten-folder cross-validation results obtained using FGMDRM in OVR scheme.


SubjectA01A02A03A04A05A06A07A08A09MeanStd

L/RE84.4067.3993.0577.6466.9674.3486.3393.8394.0982.0010.90
R/RE89.2675.3395.1678.5663.9271.8181.1893.3982.9281.2810.24
F/RE77.8083.4489.2080.8871.5678.2288.8979.1884.481.515.65
T/RE88.2168.7690.5979.9571.8576.7290.6494.4394.3883.959.82

Table 2 displays the kappa values obtained by method 1. Compared with other methods, five subjects (A03, A06, A07, A08, and A09) achieved higher kappa value of nine without exploring the frequency domain information by method 1. In the case of fixed frequency window, we have improved the mean kappa value of 0.069 than MDRM (), and 0.139 than FGMDM_fixed (). Our approach also shows significant improvement than FGMDM (), which has exploited subject-specific frequency information, in terms of the kappa value of 0.039.


SubjectA01A02A03A04A05A06A07A08A09Mean value

Method 10.660.390.780.470.250.410.720.790.830.589
MDRM [10]0.750.370.660.530.290.270.560.580.680.520.4683
FGMDM [14]0.720.500.640.380.280.340.640.680.750.550.6607
FGMDM_fixed [14]0.690.350.600.280.210.300.460.620.530.450.1423

5.2. Results of Method 2

The results in Figure 9 show the T/RE feature distribution of the five features of subject A09. Figure 9(a) shows the first five ranked features with semi-JMI. After applying the semi-JMI, the first five best features extracted have shown statistically significant improvement in the separability with p values <0.05 except feature 2 with value 0.77. In Figure 9(b), the first five features extracted from primitive feature vectors with value of 0.13, 0.05, 0.87, 0.05, and 0.13. The values indicate that the pair T/RE have no significance in the primitive feature vectors. The results show that with our semisupervised feature ranking algorithm, the separable degree of the feature has been greatly improved.

Figure 10 shows the evolution of the classification accuracy with KNN (k = 5 in this paper) against the number of ranked variables in OVR scheme. L/RE and T/RE are the two pairs with the highest recognition rate, and they achieved the highest recognition rate in 100 variables. But this is still a curse of dimensionality for classifiers; GDA is used to analyze the first 100 sorted variables in our study.

As the separation of characteristics cannot meet our requirements, GDA is used to get more obvious variables. Figure 11 illustrates distributions for the first five most discriminant variables with GDA and semi-JMI. It can be seen from Figure 11 that L/RE is separated equally well by using GDA.

Table 3 displays ten-folder cross-validation results obtained using SJGDA and KNN in OVR scheme. It can be seen that the vectors which are mapped to the tangent space have better classification performance than that in the Riemannian manifold directly.


SubjectA01A02A03A04A05A06A07A08A09MeanStd

L/RE90.2974.6293.0575.0275.3178.1585.0892.3595.1484.338.12
R/RE91.3477.1194.4777.0773.6175.7481.5893.4187.5383.547.75
F/RE85.0683.7084.7777.1175.3677.0988.1782.3085.4482.114.24
T/RE88.8976.7289.2580.9178.5176.0489.2794.0893.4185.236.76

Table 4 presents the results obtained by SJGDA in pairwise way for multiclass MI tasks. We have six pairs of MI tasks: left and right (L/R), left and foot (L/F), left and tongue (L/T), right and foot (R/F), right and tongue (R/T), and foot and tongue (F/T).


SubjectA01A02A03A04A05A06A07A08A09MeanStd

L/R90.9567.2994.5263.8764.4870.3370.2997.9595.0079.4113.85
L/F95.1987.4393.9381.8468.1077.1498.5788.2993.8187.149.29
L/T96.5264.2496.5783.2473.6769.9197.9597.2399.2986.5113.16
R/F95.9087.5295.0584.8466.7172.8897.2493.1887.4386.7510.01
R/T99.3379.8296.4879.1272.2271.4897.2495.0592.2987.0010.61
F/T83.9484.1786.6275.1062.6274.2188.3193.1090.3382.049.10

Table 5 displays the comparison of classification accuracy using SJGDA and KNN for L/R task in 10-folder cross validation. References [8, 4345] contain the classification of other publications. We have improved the accuracy compared with Reference [44] (p = 0.85) and Reference [45] (). Gaur et al. [43] () explored the specific frequency information for each subject, and Sadatnejad and Shiry Ghidary [8] () used a novel kernel for dimensionality reduction which is similar to SJGDA. Although the results in the paper are not as high as those in Reference [43], it can be concluded that there is no difference between the results in Reference [43] and those in this paper because of .


SubjectA01A02A03A04A05A06A07A08A09MeanStdp value

Method 290.9567.2994.5263.8764.4870.3370.2997.9595.0079.4113.85
Reference [43]91.4960.5694.1676.7258.5268.5278.5797.0193.8579.9314.130.95
Reference [8]88.8959.0390.2878.4762.5075.0072.9293.0687.5078.6311.630.90
Reference [44]88.8951.3996.5370.1454.8671.5381.2593.7593.7578.0116.040.85
Reference [45]90.2854.1793.7564.5857.6465.2862.5090.9785.4273.8415.020.45

Table 6 presents the results in terms of the kappa value. The proposed method 1 achieved a mean performance of 0.589 which ranks this method to the first place of the competition. And with our proposed method 2, we have achieved a mean performance of 0.607, which makes method 2 to acquire the best performance of the state of the art.


SubjectA01A02A03A04A05A06A07A08A09Meanp value

Method 20.770.380.760.470.270.420.730.810.850.607
Method 10.660.390.780.470.250.410.720.790.830.5890.8632
Reference [43]0.860.240.700.680.360.340.660.750.820.600.9586
TSLDA [10]0.740.380.720.500.260.340.690.710.760.5670.6894
Winner 1 [46]0.680.420.750.480.400.270.770.750.610.570.7051
Reference [8]0.710.460.760.440.260.370.790.750.610.570.7245
Reference [14]0.750.490.760.490.340.360.680.760.760.600.9353

Dataset 2 is used to verify the effect of method 2, and the classification results are given directly in this paper. The results are shown in Table 7. As can be seen from Table 7, method 2 obtained the second highest recognition rate in the comparative literature. Compared with the recent reference [47], method 2 achieved good classification results.


SubjectMethod 2Reference [47]Reference [48]Reference [49]Reference [50]

k3b91.6790.0086.6794.2094.44
k6b75.0076.2581.6769.0062.50
l1b81.6777.9185.0078.6078.33
Mean82.7881.3884.4480.6078.42
p value0.960.420.940.73

5.3. Results of Dataset 3

Dataset 3 is used to evaluate the performance of method 2. Figure 12 shows the classification error with KNN against the number of ranked variables in OVR scheme. A/RE and F/RE are the two pairs with the lowest classification error, and they all achieved the highest recognition rate within 60 variables. In this paper, the first 60 ranked variables are used for the next analysis.

Figure 13 displays 5-folder cross-validation results obtained by using SJGDA and KNN in OVR and OVO scheme. This Figure 13(a) illustrates three possible pairs of MI tasks (F/RE, E/RE, and A/RE) for each subject. It can be learned from the figure that flexion and abduction are the easiest movement to distinguish in six subjects of seven, and the six subjects are S1, S3, S4, S5, S6, and S7. However, due to individual differences, the highest recognition rate of each subject is different.

We also compared three possible pairs (F/E, F/A, and E/A) in OVO scheme of seven subjects. Figure 13(b) depicts the comparison results for each subject, and it can be seen that the pair of F/A obtained the highest recognition rate in seven subjects. Combined with the analysis results of Figures 13(a) and 13(b), it can be considered that flexion and extension are more obvious in the three MI tasks.

As SJGDA is a new method proposed in this paper, we also compared the feature distribution of SJGDA, GDA, and semi-JMI to illustrate the effectiveness of SJGDA. Figure 14 depicts the feature distribution of F/E MI tasks of seven subjects. The blue and red circles represent the two different feature classes. As shown in Figure 14, the F/E MI tasks learned by SJGDA have high separability than GDA and semi-JMI.

The performance of the proposed method 2 is evaluated by using classification accuracy. Since there are three classes, the chance level is 33.33%. Figure 15 demonstrates that the proposed method achieves higher performance for six subjects (S1, S2, S3, S4, S5, and S6) out of seven except S7 compared to semi-JMI and GDA methods. In addition, it also can be seen that GDA obtains a better classification accuracy for four subjects of seven (S1, S2, S5, and S7) compared with semi-JMI. The reasons for this phenomenon can be attributed to as follows: In the process of feature selection, we manually select feature dimensions suitable for classifiers, which results in partial information loss. As a feature dimensionality reduction technique, GDA is suitable for the preservation of useful information from the primitive vectors. And the proposed method SJGDA in this paper not only preserves the advantages of GDA but also adds some high ranking features to strengthen the expressive ability of the features.

6. Discussions

In this paper, we proposed a novel SSDT framework combined with classifiers to improve the performance of classifiers for multiclass MI tasks. We also proposed a novel NLDR method named SJGDA, and this NLDR method performs better than both semi-JMI and GDA on different datasets. In the following paragraphs, we have discussed the two methods in detail.

Method 1 indicates the drawback of FGMDRM, and then the novel SSDT framework is used to improve the accuracy for each individual. As shown in Table 2, compared with other published results, method 1 gets a quite good result in the case of processing the EEG signals of fixed frequency segment (8–30 Hz).

As shown in Table 6, Gaur et al. [43] proposed SS-MEMDBF to select the subject-specific frequency to obtain enhanced EEG signals which represent MI tasks related to µ and β rhythms, then classification with the Riemannian distance directly. TSLDA was proposed by Barachant et al. [10], and the covariance matrices are mapped onto a higher dimensional space where they can be vectorized and treated as Euclidean objects. Ang et al. [46] is the winner of the competition, FBCSP and multiple OVR classifiers were used for MI tasks, and achieved the mean kappa value of 0.57. Sadatnejad and Shiry Ghidary [8] proposed a new kernel for NLDR over the manifold of SPD matrices, the kappa value is 0.576. Davoudi et al. [14] considered the geometry of SPD matrices and provides a low-dimensional representation of the manifold with high-class discrimination, and the best result of this method in terms of the kappa value is 0.60.

In method 2, SJGDA is used to get more obvious vectors from the tangent vectors, and a SSDT-KNN classifier is used to identify different MI tasks. Combined with SJGDA and SSDT-KNN, we have achieved a better performance compared with method 1 (), Reference [43] (), TSLDA (), winner 1 (), Reference [8] (), and Reference [14] (). It is clear that the proposed method in this paper is effective for MI tasks in a BCI system.

In order to prove the effectiveness of the proposed method 2, we tested it on two other datasets. As shown in Table 7 and Figure 15, method 2 achieves good classification results on two datasets.

7. Conclusion

The experimental results of method 1 show that the proposed classification framework significantly improves the classification performance of the classifier. The experimental results of method 2 show that the SJGDA algorithm proposed in this paper is superior to GDA and semi-JMI in feature extraction, and method 2 has the highest recognition rate in this paper. However, as the classifiers in the SSDT framework is substitutable, the focus of the next work is to combine more advanced classifiers with SSDT to increase the recognition rate of the BCI systems.

Data Availability

The dataset 1 and dataset 2 used to support the findings of this study are available from http://bnci-horizon-2020.eu/database/data-sets. The dataset 3 used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

Northeast Electric Power University (Grant number BSJXM-201521) and Jilin City Science and Technology Bureau (Grant number 20166012).

References

  1. J. R. Wolpaw, N. Birbaumer, D. J. McFarland, G. Pfurtscheller, and T. M. Vaughan, “Brain-computer interfaces for communication and control,” Clinical neurophysiology, vol. 113, no. 6, pp. 767–791, 2002. View at: Publisher Site | Google Scholar
  2. N. Birbaumer, “Breaking the silence: brain?computer interfaces (BCI) for communication and motor control,” Psychophysiology, vol. 43, no. 6, pp. 517–532, 2006. View at: Publisher Site | Google Scholar
  3. K. K. Ang, C. Guan, K. S. Phua et al., “Transcranial direct current stimulation and EEG-based motor imagery BCI for upper limb stroke rehabilitation,” in Proceedings of 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 4128–4131, IEEE, San Diego, CA, USA, August 2012. View at: Google Scholar
  4. B. Xu, S. Peng, A. Song, R. Yang, and L. Pan, “Robot-aided upper-limb rehabilitation based on motor imagery EEG,” International Journal of Advanced Robotic Systems, vol. 8, no. 4, p. 40, 2011. View at: Publisher Site | Google Scholar
  5. F. Wang, X. Zhang, R. Fu, and G. Sun, “Study of the home-auxiliary robot based on BCI,” Sensors, vol. 18, no. 6, p. 1779, 2018. View at: Publisher Site | Google Scholar
  6. F. Wang, H. Wang, and R. Fu, “Real-Time ECG-based detection of fatigue driving using sample entropy,” Entropy, vol. 20, no. 3, 2018. View at: Publisher Site | Google Scholar
  7. Z. J. Koles, M. S. Lazar, and S. Z. Zhou, “Spatial patterns underlying population differences in the background EEG,” Brain Topography, vol. 2, no. 4, pp. 275–284, 1990. View at: Publisher Site | Google Scholar
  8. K. Sadatnejad and S. Shiry Ghidary, “Kernel learning over the manifold of symmetric positive definite matrices for dimensionality reduction in a BCI application,” Neurocomputing, vol. 179, pp. 152–160, 2016. View at: Publisher Site | Google Scholar
  9. M. Congedo, A. Barachant, and R. Bhatia, “Riemannian geometry for EEG-based brain-computer interfaces; a primer and a review,” Brain-Computer Interfaces, vol. 4, no. 3, pp. 155–174, 2017. View at: Publisher Site | Google Scholar
  10. A. Barachant, S. Bonnet, M. Congedo, and C. Jutten, “Multiclass brain-computer interface classification by riemannian geometry,” IEEE Transactions on Biomedical Engineering, vol. 59, no. 4, pp. 920–928, 2012. View at: Publisher Site | Google Scholar
  11. A. Barachant, S. Bonnet, M. Congedo, and C. Jutten, “Common spatial pattern revisited by riemannian geometry,” in Proceedings of 2010 IEEE International Workshop on Multimedia Signal Processing (MMSP), pp. 472–476, IEEE, Saint Malo, France, October 2010. View at: Google Scholar
  12. A. Barachant, S. Bonnet, M. Congedo, and C. Jutten, “A brain-switch using riemannian geometry,” in Proceedings of 5th International Brain-Computer Interface Conference 2011 (BCI 2011), pp. 64–67, Graz, Austria, September 2011. View at: Google Scholar
  13. A. Barachant, S. Bonnet, M. Congedo, and C. Jutten, “Classification of covariance matrices using a riemannian-based kernel for BCI applications,” Neurocomputing, vol. 112, pp. 172–178, 2013. View at: Publisher Site | Google Scholar
  14. A. Davoudi, S. S. Ghidary, and K. Sadatnejad, “Dimensionality reduction based on distance preservation to local mean (DPLM) for spd matrices and its application in BCI,” 2016, https://arxiv.org/abs/1608.00514. View at: Google Scholar
  15. S. Brandl, K.-R. Müller, and W. Samek, “Robust common spatial patterns based on Bhattacharyya distance and gamma divergence,” in Proceedings of 2015 3rd International Winter Conference on Brain-Computer Interface (BCI), pp. 1–4, IEEE, Gangwon-Do, South Korea, January 2015. View at: Google Scholar
  16. W. Samek, M. Kawanabe, and K.-R. Muller, “Divergence-based framework for common spatial patterns algorithms,” IEEE Reviews in Biomedical Engineering, vol. 7, pp. 50–72, 2014. View at: Publisher Site | Google Scholar
  17. W. Samek and K.-R. Müller, “Information geometry meets BCI spatial filtering using divergences,” in Proceedings of 2014 International Winter Workshop on Brain-Computer Interface (BCI), pp. 1–4, IEEE, Gangwon, South Korea, February 2014. View at: Google Scholar
  18. W. Samek and M. Kawanabe, “Robust common spatial patterns by minimum divergence covariance estimator,” in Proceedings of 2014 IEEE International Conference onAcoustics, Speech and Signal Processing (ICASSP), pp. 2040–2043, IEEE, Florence, Italy, May 2014. View at: Google Scholar
  19. S. Kumar, K. Mamun, and A. Sharma, “CSP-TSM: optimizing the performance of riemannian tangent space mapping using common spatial pattern for mi-BCI,” Computers in biology and medicine, vol. 91, pp. 231–242, 2017. View at: Publisher Site | Google Scholar
  20. C. Lindig-León, N. Gayraud, L. Bougrain, and M. Clerc, “Comparison of hierarchical and non-hierarchical classification for motor imagery based BCI systems,” in Proceedings of The Sixth International Brain-Computer Interfaces Meeting, Pacific Grove, CA, USA, May-June 2016. View at: Google Scholar
  21. S. T. Roweis and L. K. Saul, “Nonlinear dimensionality reduction by locally linear embedding,” Science, vol. 290, no. 5500, pp. 2323–2326, 2000. View at: Publisher Site | Google Scholar
  22. O. Kramer and D. Lückehe, “Visualization of evolutionary runs with isometric mapping,” in Proceedings of 2015 IEEE Congress on Evolutionary Computation (CEC), pp. 1359–1363, IEEE, Sendai, Japan, May 2015. View at: Google Scholar
  23. K. Q. Weinberger, F. Sha, and L. K. Saul, “Learning a kernel matrix for nonlinear dimensionality reduction,” in Proceedings of the Twenty-First International Conference on Machine Learning, p. 106, ACM, Banff, Canada, July 2004. View at: Google Scholar
  24. L. Van Der Maaten, “Learning a parametric embedding by preserving local structure,” RBM, vol. 500, p. 26, 2009. View at: Google Scholar
  25. L. Van Der Maaten, “Accelerating t-SNE using tree-based algorithms,” Journal of Machine Learning Research, vol. 15, pp. 3221–3245, 2014. View at: Google Scholar
  26. D. Lee, S.-H. Park, and S.-G. Lee, “Improving the accuracy and training speed of motor imagery brain-computer interfaces using wavelet-based combined feature vectors and Gaussian mixture model-supervectors,” Sensors, vol. 17, no. 10, p. 2282, 2017. View at: Publisher Site | Google Scholar
  27. X. Xie, Z. L. Yu, H. Lu, Z. Gu, and Y. Li, “Motor imagery classification based on bilinear sub-manifold learning of symmetric positive-definite matrices,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 25, no. 6, pp. 504–516, 2017. View at: Publisher Site | Google Scholar
  28. M. T. Harandi, M. Salzmann, and R. Hartley, “From manifold to manifold: Geometry-aware dimensionality reduction for SPD matrices,” in Proceedings of European Conference on Computer Vision, pp. 17–32, Springer, Zurich, Switzerland, September 2014. View at: Google Scholar
  29. M. Moakher, “A differential geometric approach to the geometric mean of symmetric positive-definite matrices,” SIAM Journal on Matrix Analysis and Applications, vol. 26, no. 3, pp. 735–747, 2005. View at: Publisher Site | Google Scholar
  30. P. T. Fletcher and S. Joshi, “Principal geodesic analysis on symmetric spaces: statistics of diffusion tensors,” in Computer Vision and Mathematical Methods in Medical and Biomedical Image Analysis, M. Sonka, I. A. Kakadiaris, and J. Kybic, Eds., pp. 87–98, Springer, Berlin, Heidelberg, 2004. View at: Publisher Site | Google Scholar
  31. A. Barachant, S. Bonnet, M. Congedo, and C. Jutten, “Riemannian geometry applied to BCI classification,” in Proceedings of International Conference on Latent Variable Analysis and Signal Separation, pp. 629–636, Springer, St. Malo, France, September 2010. View at: Google Scholar
  32. H. Guo and S. B. Gelfand, “Classification trees with neural network feature extraction,” IEEE Transactions on Neural Networks, vol. 3, no. 6, pp. 923–933, 1992. View at: Publisher Site | Google Scholar
  33. S. R. Safavian and D. Landgrebe, “A survey of decision tree classifier methodology,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 21, no. 3, pp. 660–674, 1991. View at: Publisher Site | Google Scholar
  34. Y.-H. Shao, W.-J. Chen, W.-B. Huang, Z.-M. Yang, and N.-Y. Deng, “The best separating decision tree twin support vector machine for multi-class classification,” Procedia Computer Science, vol. 17, pp. 1032–1038, 2013. View at: Publisher Site | Google Scholar
  35. G. Brown, A. Pocock, M.-J. Zhao, and M. Luján, “Conditional likelihood maximisation: a unifying framework for information theoretic feature selection,” Journal of Machine Learning Research, vol. 13, pp. 27–66, 2012. View at: Google Scholar
  36. K. Sechidis and G. Brown, “Simple strategies for semi-supervised feature selection,” Machine Learning, vol. 107, no. 2, pp. 357–395, 2017. View at: Publisher Site | Google Scholar
  37. J. G. Moreno-Torres, T. Raeder, R. Alaiz-Rodríguez, N. V. Chawla, and F. Herrera, “A unifying view on dataset shift in classification,” Pattern Recognition, vol. 45, no. 1, pp. 521–530, 2012. View at: Publisher Site | Google Scholar
  38. G. Baudat and F. Anouar, “Generalized discriminant analysis using a kernel approach,” Neural Computation, vol. 12, no. 10, pp. 2385–2404, 2000. View at: Publisher Site | Google Scholar
  39. M. Haghighat, S. Zonouz, and M. Abdel-Mottaleb, “Cloudid: trustworthy cloud-based and cross-enterprise biometric identification,” Expert Systems with Applications, vol. 42, no. 21, pp. 7905–7916, 2015. View at: Publisher Site | Google Scholar
  40. V. Vapnik, The Nature of Statistical Learning Theory, Springer Science & Business Media, Berlin, Germany, 2013.
  41. C. Brunner, R. Leeb, G. Müller-Putz, A. Schlögl, and G. Pfurtscheller, BCI Competition 2008–Graz Data Set A, vol. 16, Institute for Knowledge Discovery (Laboratory of Brain-Computer Interfaces) Graz University of Technology, Graz, Austria, 2008.
  42. B. Blankertz, K. R. Muller, D. J. Krusienski et al., “The BCI competition III: validating alternative approaches to actual BCI problems,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 14, no. 2, pp. 153–159, 2006. View at: Publisher Site | Google Scholar
  43. P. Gaur, R. B. Pachori, H. Wang, and G. Prasad, “A multi-class EEG-based BCI classification using multivariate empirical mode decomposition based filtering and riemannian geometry,” Expert Systems with Applications, vol. 95, pp. 201–211, 2018. View at: Publisher Site | Google Scholar
  44. F. Lotte and C. Cuntai Guan, “Regularizing common spatial patterns to improve BCI designs: unified theory and new algorithms,” IEEE Transactions on Biomedical Engineering, vol. 58, no. 2, pp. 355–362, 2011. View at: Publisher Site | Google Scholar
  45. H. Raza, H. Cecotti, Y. Li, and G. Prasad, “Adaptive learning with covariate shift-detection for motor imagery-based brain-computer interface,” Soft Computing, vol. 20, no. 8, pp. 3085–3096, 2015. View at: Publisher Site | Google Scholar
  46. K. K. Ang, Z. Y. Chin, C. Wang, C. Guan, and H. Zhang, “Filter bank common spatial pattern algorithm on BCI competition IV datasets 2a and 2b,” Frontiers in Neuroscience, vol. 6, p. 39, 2012. View at: Publisher Site | Google Scholar
  47. H. Baali, A. Khorshidtalab, M. Mesbah, and M. J. E. Salami, “A transform-based feature extraction approach for motor imagery tasks classification,” IEEE Journal of Translational Engineering in Health and Medicine, vol. 3, pp. 1–8, 2015. View at: Publisher Site | Google Scholar
  48. A. Schlögl, F. Lee, H. Bischof, and G. Pfurtscheller, “Characterization of four-class motor imagery EEG data for the BCI-competition 2005,” Journal of Neural Engineering, vol. 2, no. 4, pp. L14–L22, 2005. View at: Publisher Site | Google Scholar
  49. M. Grosse-Wentrup and M. Buss, “Multiclass common spatial patterns and information theoretic feature extraction,” IEEE Transactions on Biomedical Engineering, vol. 55, no. 8, pp. 1991–2000, 2008. View at: Publisher Site | Google Scholar
  50. I. Koprinska, “Feature selection for brain-computer interfaces,” in Prooceedings of Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 106–117, Springer, Bangkok, Thailand, April 2009. View at: Google Scholar

Copyright © 2019 Shan Guan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


More related articles

1787 Views | 663 Downloads | 8 Citations
 PDF  Download Citation  Citation
 Download other formatsMore
 Order printed copiesOrder

Related articles

We are committed to sharing findings related to COVID-19 as quickly and safely as possible. Any author submitting a COVID-19 paper should notify us at help@hindawi.com to ensure their research is fast-tracked and made available on a preprint server as soon as possible. We will be providing unlimited waivers of publication charges for accepted articles related to COVID-19. Sign up here as a reviewer to help fast-track new submissions.