Computational and Mathematical Methods in Medicine

Volume 2017 (2017), Article ID 2948742, 10 pages

https://doi.org/10.1155/2017/2948742

## Sequential Probability Ratio Testing with Power Projective Base Method Improves Decision-Making for BCI

^{1}Biomedical Engineering Department, Dalian University of Technology, Dalian, Liaoning 116024, China

^{2}Affiliated Zhongshan Hospital of Dalian University, Dalian, Liaoning 116001, China

^{3}Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China

Correspondence should be addressed to Yongxuan Wang; wyx8904@mail.dlut.edu.cn

Received 30 June 2017; Accepted 11 September 2017; Published 14 November 2017

Academic Editor: Fei Chen

Copyright © 2017 Rong Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Obtaining a fast and reliable decision is an important issue in brain-computer interfaces (BCI), particularly in practical real-time applications such as wheelchair or neuroprosthetic control. In this study, the EEG signals were first analyzed with a power projective base method. Then we applied a decision-making model, the sequential probability ratio testing (SPRT), for single-trial classification of motor imagery movement events. The unique strength of this proposed classification method lies in its accumulative process, which increases the discriminative power as more and more evidence is observed over time. The properties of the method were illustrated on thirteen subjects’ recordings from three datasets. Results showed that our proposed power projective method outperformed two benchmark methods for every subject. Moreover, with sequential classifiers, the accuracies across subjects were significantly higher than those with nonsequential ones. The average maximum accuracy of the SPRT method was 84.1%, as compared with 82.3% accuracy for the sequential Bayesian (SB) method. The proposed SPRT method provides an explicit relationship between stopping time, thresholds, and error, which is important for balancing the time-accuracy trade-off. These results suggest SPRT would be useful in speeding up decision-making while trading off errors in BCI.

#### 1. Introduction

Noninvasive brain-computer interface (BCI) based on the electroencephalogram (EEG) offers a new means of communication to locked-in or paralyzed patients [1, 2] and of controlling a prosthesis [3, 4] without reliance on the usual neuromuscular pathways. The critical challenge of BCI technology is to classify brain signals and mental tasks accurately and quickly. However, the EEG recorded from the scalp has low amplitude and a low signal-to-noise ratio (SNR), and the EEG differences between mental tasks are small. Therefore, various pattern recognition algorithms have been used in BCI systems to extract and classify EEG features.

Event-related desynchronization/synchronization (ERD/ERS) patterns of motor imagery are effective features for EEG-based BCI systems. Experiments show that the ERD/ERS phenomenon varies among individuals. Therefore, a pattern recognition algorithm should be used to facilitate decoding “motor intent,” both to find subject-specific EEG features that maximize the separation between the patterns generated by executing the mental tasks and to train classifiers that minimize the classification error rates of these specific patterns. Currently, feature extraction for discrimination of left- and right-hand motor imagery EEG is usually based on EEG band power (BP). For example, the autoregressive (AR) model [5], the discrete Fourier transform (DFT) [6], and wavelet transforms (WT) [7] have been used to extract EEG features for classification. The wavelet method is one of the most effective algorithms. However, the success of a wavelet application depends greatly on the proper selection of subject-specific parameters. In fact, the wavelet transform can be viewed as projecting the EEG onto a wavelet basis, with the band power given by the modulus of the projective coefficients. Inspired by the wavelet method, we introduce a new feature extraction method based on power projective bases that classifies EEGs without the constraint of a fixed wavelet form.

Moreover, the ability to make rapid decisions based on transient stimuli is a unique aspect of our brains’ capacity to process information. Broadly speaking, signal detection theory (SDT) and sequential analysis (SA) are two branches of mathematical models that provide a theoretical framework for understanding how decisions are made [8]. SDT converts a single observation into a categorical choice. Different decision rules lead to different testing approaches to this problem [9]. For example, Bayesian decision theory is derived by minimizing the posterior expected loss, while the Neyman-Pearson (NP) criterion seeks the most powerful test at a given error probability (*α*) level. As with most statistical classification methods, for example, linear discriminant analysis (LDA) and support vector machines (SVM), the classification error is the only characteristic of the SDT decision strategies. The number of observation samples required by these criteria can be very large, which is especially impractical for BCI applications. To control brain-actuated devices, such as robots and neuroprostheses, both fast decision-making and a stable control signal with a minimal error rate are important [10, 11]. Therefore, attention has recently turned to variable-length sequential sampling models.

A systematic theory of optimal stopping emerged with the work by Wald on the optimality of the sequential probability ratio test (SPRT) [12]. The SPRT achieves a desired error rate with the smallest number of samples, on average. Therefore, in this paper, we combine a new feature extraction method based on the power projective base with the SPRT approach to obtain a continuous dynamic estimate of the brain state that balances accuracy and decision speed.

#### 2. Methodology

##### 2.1. Data Description

The EEG data used in this work were obtained from thirteen subjects from BCI Competitions II, III, and IV. The task was performed based on left- and right-hand motor imagination.

###### 2.1.1. Dataset III from BCI Competition II

This dataset contains EEG data from one subject (S1) [13]. The data were recorded from three channels (C3, Cz, and C4) and sampled at 128 Hz. The data consist of 140 labelled and 140 unlabelled trials with an equal number of left- and right-hand trials. Each trial has a duration of 9 s: after a 3 s preparation period, a visual cue (arrow) pointing to the left or the right is presented, followed by a 6 s motor imagery (MI) task.

###### 2.1.2. Dataset IIIb from BCI Competition III

The second dataset contains EEG data recorded over channels C3 and C4 from three subjects (S2, S3, S4) with some corrections [14]. The data were sampled at 125 Hz. Training and testing sets were available for each subject. In the original dataset's naming, subject O3 has only 320 trials in each set, whereas subjects S4 and X11 each have 540 labelled and 540 unlabelled trials. Each trial has a duration of 7 s, consisting of a 3 s preparation period and 1 s of visual cue presentation, followed by 3 s for the imagination task.

###### 2.1.3. Dataset IIb from BCI Competition IV

This dataset contains EEG data from nine subjects (S5–S13) [15]. The data were recorded from three bipolar channels (C3, Cz, and C4) and three EOG channels. The sampling frequency was 250 Hz. Training and testing sets were available for each subject. Each subject participated in two screening sessions without feedback and three online feedback sessions with smiley feedback. The trials without feedback had a duration of 7 s, and a visual cue was presented for 1.25 s followed by another 4 s for the imagination task. The trials with feedback had a duration of 7.5 s, and a visual cue was presented for 4.5 s until the end of motor imagination.

##### 2.2. Feature Extraction Method Based on Power Projective Base

Motor imagery can be regarded as mental rehearsal of a motor act without any obvious motor output. Recent studies show that during motor imagination the *μ* (8–13 Hz) and *β* (18–30 Hz) rhythms exhibit event-related desynchronization and synchronization (ERD/ERS) over the sensorimotor cortex, just as they do during actual motor tasks. Because nonstationary effects are often observed in brain signals, we propose a power projective base method to extract classification features from the C3 and C4 channels. This method improves the classification accuracy by maximizing the difference in average projective power between the two classes of signals. Specifically, the projective bases can be obtained for each subject by generalized eigenvalue decomposition.

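Let $X^{c}=\{x_i^{c}\in\mathbb{R}^{N},\ i=1,\dots,M\}$ be the training dataset from one channel, where $c\in\{1,2\}$ denotes the left- or right-hand motor imagery task, $N$ denotes the number of sampling points, and $M$ is the number of trials. Moreover, let $w\in\mathbb{R}^{N}$ be the projective basis with $\|w\|=1$. The projective power of signal $x_i^{c}$ on the projective basis $w$ is

$$P_i^{c}=\left(w^{T}x_i^{c}\right)^{2}=w^{T}x_i^{c}\left(x_i^{c}\right)^{T}w. \tag{1}$$

So the mean projective power can be calculated as

$$\bar{P}^{c}=\frac{1}{M}\sum_{i=1}^{M}P_i^{c}=w^{T}R^{c}w, \tag{2}$$

where $R^{c}=\frac{1}{M}\sum_{i=1}^{M}x_i^{c}\left(x_i^{c}\right)^{T}$ is the autocorrelation matrix, which is usually positive definite.

The objective function is formulated as the ratio of the two-class average projective powers,

$$J(w)=\frac{w^{T}R^{1}w}{w^{T}R^{2}w}. \tag{3}$$

By maximizing or minimizing $J(w)$ to be $J_{\max}$ or $J_{\min}$, the corresponding eigenvector $w_{\max}$ or $w_{\min}$ is the optimal projective base to be solved. The optimization of (3) can be solved by generalized eigenvalue decomposition. First of all, we can get the following decomposition:

$$U^{T}R^{1}U=D,\qquad U^{T}R^{2}U=I, \tag{4}$$

where $U=[u_1,\dots,u_N]$ is the generalized eigenvector matrix and $D=\mathrm{diag}(\lambda_1,\dots,\lambda_N)$ contains the generalized eigenvalues. Therefore, the ratio of mean projective powers turns to

$$J=\frac{y^{T}Dy}{y^{T}y}=\frac{\sum_{i=1}^{N}\lambda_i y_i^{2}}{\sum_{i=1}^{N}y_i^{2}}, \tag{5}$$

where $y=U^{-1}w$. Since $\lambda_1\ge\lambda_2\ge\dots\ge\lambda_N$, $J$ has the maximum value $\lambda_1$ and minimum value $\lambda_N$. Obviously, the corresponding vectors can be obtained by

$$y_{\max}=[1,0,\dots,0]^{T},\qquad y_{\min}=[0,\dots,0,1]^{T}. \tag{6}$$

Then, we have

$$w_{\max}=Uy_{\max}=u_1,\qquad w_{\min}=Uy_{\min}=u_N, \tag{7}$$

which means that $w_{\max}$ and $w_{\min}$ are the first column and last column of $U$, respectively (rescaled to unit norm, which leaves $J$ unchanged). Choosing $w_{\max}$ or $w_{\min}$ as the projective base depends on which is larger between $\lambda_1$ and $1/\lambda_N$. The projective powers of the C3 and C4 signals on their own projective bases are then stacked together into a 2-dimensional feature vector $f=[P_{\mathrm{C3}},P_{\mathrm{C4}}]^{T}$.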
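The generalized eigenvalue step described above can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names `projective_bases` and `power_ratio`, the `(n_trials, n_samples)` array layout, and the optional ridge term `reg` are all assumptions introduced here, and the data are synthetic.

```python
import numpy as np


def projective_bases(trials_a, trials_b, reg=0.0):
    """Return (w_max, w_min, eigvals) for J(w) = w'R_a w / w'R_b w.

    trials_a, trials_b: arrays of shape (n_trials, n_samples), one
    single-channel window per trial (hypothetical layout).
    reg: optional ridge added to R_b (an assumption, useful when there
    are fewer trials than samples and R_b is not positive definite).
    """
    n = trials_a.shape[1]
    R_a = trials_a.T @ trials_a / len(trials_a)   # autocorrelation, class a
    R_b = trials_b.T @ trials_b / len(trials_b) + reg * np.eye(n)
    # Whiten with the Cholesky factor of R_b so that the generalized
    # problem R_a u = lambda R_b u becomes an ordinary symmetric one.
    L_inv = np.linalg.inv(np.linalg.cholesky(R_b))
    eigvals, V = np.linalg.eigh(L_inv @ R_a @ L_inv.T)  # ascending order
    U = (L_inv.T @ V)[:, ::-1]                    # columns, descending
    eigvals = eigvals[::-1]
    U = U / np.linalg.norm(U, axis=0)             # unit-norm bases
    return U[:, 0], U[:, -1], eigvals


def power_ratio(w, trials_a, trials_b):
    """Ratio of mean projective powers of the two classes on basis w."""
    return np.mean((trials_a @ w) ** 2) / np.mean((trials_b @ w) ** 2)


# Synthetic demo: class a carries an extra oscillation, class b is noise.
rng = np.random.default_rng(0)
wave = np.sin(2 * np.pi * np.arange(16) / 8)
class_a = rng.normal(size=(200, 16)) + 3.0 * wave
class_b = rng.normal(size=(200, 16))
w_max, w_min, eigvals = projective_bases(class_a, class_b)
print(power_ratio(w_max, class_a, class_b), eigvals[0])
```

With `reg=0`, the ratio achieved by `w_max` coincides with the largest generalized eigenvalue, and no other basis gives a larger ratio on the same training data.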

##### 2.3. SPRT Classification Method

Sequential analysis is a statistical decision model that assumes decisions are formed by continuously sampling information until the response criterion is satisfied. Once a boundary has been reached, the decision process is concluded and a response is elicited. The number of observations needed for a decision is not determined in advance of the experiment, but by the observations obtained during the test. The data should be fed to the SPRT algorithm sequentially, so we divide each trial into segments with overlap and each one has the same length as that of the projective base used in feature extraction.
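The segmentation step can be sketched as below. The helper name `sliding_segments` and the step of one eighth of a second are assumptions made for illustration; the 1 s window matches the projective-base length reported in the results, and 128 Hz is the sampling rate of Dataset III.

```python
def sliding_segments(trial, window, step):
    """Split one trial (a sequence of samples) into overlapping segments."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = len(trial) - window
    return [trial[start:start + window]
            for start in range(0, last_start + 1, step)]


fs = 128                       # sampling rate in Hz (Dataset III)
trial = list(range(9 * fs))    # stand-in for one 9 s single-channel trial
segments = sliding_segments(trial, window=fs, step=fs // 8)
print(len(segments), len(segments[0]))
```

Each segment would then be projected onto the basis to give one feature per channel, and the features are passed to the sequential test in time order.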

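Taking into account the nonstationarity of the EEG, we assume that the probability distribution of the $k$th segment feature $f_k$ for class $c_j$ is $p_k(f_k\mid c_j)$, $j=1,2$, and then the probability ratio for the $k$th segment is

$$l_k=\frac{p_k(f_k\mid c_1)}{p_k(f_k\mid c_2)}, \tag{8}$$

and the evidence accumulation turns to

$$L_K=\frac{p(F_K\mid c_1)}{p(F_K\mid c_2)}, \tag{9}$$

where $K$ is the number of accumulated segments. Assuming the segments are independent for computational convenience [12], we have

$$L_K=\prod_{k=1}^{K}l_k, \tag{10}$$

where $F_K=[f_1^{T},\dots,f_K^{T}]^{T}$ and $p(F_K\mid c_j)$ is the joint probability distribution of the $2K$-dimensional vector $F_K$. Because consecutive segments overlap, this independence assumption holds only approximately in practice.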

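The decision rule with two thresholds $A$ and $B$ ($A>B$) is

$$\begin{cases}L_K\ge A: & \text{decide } c_1,\\ L_K\le B: & \text{decide } c_2,\\ B<L_K<A: & \text{continue sampling.}\end{cases} \tag{11}$$

With two thresholds, we have the option to increase $A$ or decrease $B$, which raises the probability of a correct decision and lowers the probability of a wrong one (by waiting to accumulate more evidence), at the price of a delayed decision. The error probabilities are defined as

$$\alpha=P(\text{decide } c_1\mid c_2), \tag{12}$$

$$\beta=P(\text{decide } c_2\mid c_1). \tag{13}$$

If $L_K\ge A$ is satisfied, we define the corresponding space of vector $F_K$ to be $\Omega_1$. With (11), we have

$$p(F_K\mid c_1)\ge A\,p(F_K\mid c_2),\qquad F_K\in\Omega_1. \tag{14}$$

Equation (14) is then integrated in $\Omega_1$, yielding

$$\int_{\Omega_1}p(F_K\mid c_1)\,dF_K\ge A\int_{\Omega_1}p(F_K\mid c_2)\,dF_K. \tag{15}$$

That is,

$$1-\beta\ge A\alpha. \tag{16}$$

Analogous reasoning for $L_K\le B$ yields

$$\beta\le B(1-\alpha). \tag{17}$$

Thus, the two detection thresholds $A$ and $B$ are related to the error probabilities by

$$A\le\frac{1-\beta}{\alpha},\qquad B\ge\frac{\beta}{1-\alpha}. \tag{18}$$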
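As a worked example of the relation between thresholds and error probabilities, the sketch below uses Wald's classical approximation, which treats the bounds as equalities, so that the upper threshold is (1 − β)/α and the lower threshold is β/(1 − α). Here α is taken as the probability of deciding the first class when the second is true and β as the reverse; the function name `wald_thresholds` is an assumption for illustration.

```python
import math


def wald_thresholds(alpha, beta):
    """Approximate SPRT thresholds (A, B) for target error rates."""
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie in (0, 1)")
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1")
    upper = (1.0 - beta) / alpha      # decide class 1 at or above this
    lower = beta / (1.0 - alpha)      # decide class 2 at or below this
    return upper, lower


A, B = wald_thresholds(alpha=0.05, beta=0.05)
print(A, B, math.log(A), math.log(B))
```

For equal error targets of 5% the thresholds are about 19 and 1/19, or roughly ±2.94 on the log scale; tightening the targets pushes the thresholds apart, so more segments are needed before a decision.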

The two kinds of error probabilities can be lowered by either increasing $A$ or decreasing $B$. However, due to the limited number of segments, the indecision ratio will increase as the error probability $\alpha$ or $\beta$ is decreased. Therefore, we may not obtain the optimal result by simply increasing $A$ or decreasing $B$. Suitable $A$ and $B$ can be obtained with the following optimization criteria.

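Under the assumption that the features follow a Gaussian distribution, $p_k(f_k\mid c_j)=\mathcal{N}(\mu_{jk},\Sigma_{jk})$, we can take the logarithm on both sides of (10) to obtain the log probability ratio (log PR) of the accumulated probability ratio $L_K$, which leads to a sequential probability ratio test (SPRT) as

$$Z_K=\ln L_K=\sum_{k=1}^{K}z_k,\qquad z_k=\frac{1}{2}\left(d_{2k}^{2}-d_{1k}^{2}\right)+\frac{1}{2}\ln\frac{|\Sigma_{2k}|}{|\Sigma_{1k}|}, \tag{19}$$

where $d_{jk}=\sqrt{(f_k-\mu_{jk})^{T}\Sigma_{jk}^{-1}(f_k-\mu_{jk})}$ is the Mahalanobis distance of $f_k$ from class $c_j$ and $z_k$ is the log PR of the $k$th segment.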
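For the 2-dimensional features used here (one projective power per channel), the Gaussian log PR of a segment and its running sum can be written out directly. The sketch below takes the ratio as class 1 over class 2, so positive values favour class 1; the function names and the toy means and covariances are assumptions for illustration.

```python
import math


def mahalanobis_sq(f, mu, cov):
    """Squared Mahalanobis distance and determinant for a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = f[0] - mu[0], f[1] - mu[1]
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    dist_sq = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return dist_sq, det


def log_pr(f, mu1, cov1, mu2, cov2):
    """log p(f | class 1) - log p(f | class 2) for Gaussian classes."""
    d1_sq, det1 = mahalanobis_sq(f, mu1, cov1)
    d2_sq, det2 = mahalanobis_sq(f, mu2, cov2)
    return 0.5 * (d2_sq - d1_sq) + 0.5 * math.log(det2 / det1)


def accumulate(features, params):
    """Running sum of segment log PRs; params[k] = (mu1, cov1, mu2, cov2)."""
    total, curve = 0.0, []
    for f, p in zip(features, params):
        total += log_pr(f, *p)
        curve.append(total)
    return curve


eye = ((1.0, 0.0), (0.0, 1.0))
segment_params = [((1.0, 0.0), eye, (-1.0, 0.0), eye)] * 3
print(accumulate([(1.0, 0.0), (0.0, 0.0), (-1.0, 0.0)], segment_params))
```

Allowing a separate parameter tuple per segment reflects the nonstationarity assumption: the class means and covariances may change from one segment to the next.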

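We can derive the average accumulated log PR for each class:

$$\bar{Z}_K^{(j)}=\frac{1}{M_j}\sum_{i\in c_j}Z_K^{(i)},\qquad j=1,2, \tag{20}$$

where $Z_K^{(i)}$ is the accumulated log PR of the $i$th training trial after $K$ segments and $M_j$ is the number of training trials of class $c_j$.

For any given threshold pair $A$ and $B$, the number of accumulated segments to make a correct decision, that is, the stopping time, for class $c_j$ is $K_j$, which satisfies

$$K_1=\min\left\{K:\bar{Z}_K^{(1)}\ge\ln A\right\},\qquad K_2=\min\left\{K:\bar{Z}_K^{(2)}\le\ln B\right\}, \tag{21}$$

where $K_j$ is the minimum element of the set of $K$ satisfying the condition. Generally, $K_1$ and $K_2$ may be different. Since the stopping time is a key point in the sequential analysis, we constrain the two thresholds by requiring the stopping times of the two classes to be equal, that is, $K_1=K_2=K$. Then the thresholds are given by

$$\ln A=\bar{Z}_K^{(1)},\qquad \ln B=\bar{Z}_K^{(2)}. \tag{22}$$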
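One reading of this threshold rule is sketched below: average the accumulated log-PR curves of the training trials within each class, then take the two class averages at the common stopping time as the log thresholds. The function names, the 1/2 class labels, and the toy curves are assumptions made for illustration.

```python
def class_mean_curves(curves, labels):
    """Average accumulated log-PR curves per class (labels are 1 or 2)."""
    means = {}
    for cls in (1, 2):
        rows = [c for c, y in zip(curves, labels) if y == cls]
        if not rows:
            raise ValueError(f"no training trials for class {cls}")
        # zip(*rows) walks the curves segment by segment.
        means[cls] = [sum(col) / len(rows) for col in zip(*rows)]
    return means


def thresholds_at(means, stop):
    """Log thresholds (upper, lower) for a common stopping time (1-based)."""
    if not 1 <= stop <= len(means[1]):
        raise ValueError("stopping time outside the available segments")
    return means[1][stop - 1], means[2][stop - 1]


train_curves = [[1.0, 2.0, 3.0], [1.0, 3.0, 5.0],
                [-1.0, -2.0, -3.0], [-2.0, -3.0, -5.0]]
train_labels = [1, 1, 2, 2]
means = class_mean_curves(train_curves, train_labels)
print(thresholds_at(means, stop=2))
```

Because the class averages drift apart as evidence accumulates, a later stopping time yields wider thresholds, which is how a single parameter controls both.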

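For any given stopping time $K$, there is a corresponding threshold pair $A$ and $B$. The decision rule with two thresholds $A$ and $B$ for the accumulated log PR $Z_k$ after $k<K$ segments is

$$\begin{cases}Z_k\ge\ln A: & \text{decide } c_1,\\ Z_k\le\ln B: & \text{decide } c_2,\\ \ln B<Z_k<\ln A: & \text{undecided, continue to segment } k+1.\end{cases} \tag{23}$$

From this decision policy, we can see that, rather than always assigning one of the two classes $c_1$ and $c_2$, the decision function may remain undecided and continue testing with the next observation. The “undecided” response keeps the number of errors (false positives or false negatives) low, which is useful for avoiding the excessive mistakes that come from rushed decisions, for example, a BCI-controlled wheelchair running into an obstacle [16]. In addition, if the test is still undecided on reaching the stopping time $K$, we specify that when $k=K$, the decision rule is

$$\begin{cases}Z_K\ge 0: & \text{decide } c_1,\\ Z_K<0: & \text{decide } c_2.\end{cases} \tag{24}$$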
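The two-threshold rule with an "undecided" state and a forced choice at the stopping time can be sketched as follows. The function name `sprt_decide` and the return convention are assumptions, and the zero cut-off for the forced choice is also an assumption (the maximum-likelihood choice for a log ratio), not a value confirmed by the text.

```python
def sprt_decide(curve, log_upper, log_lower, stop):
    """Return (label, k): class 1 or 2 and the 1-based decision segment.

    curve: accumulated log PR after each segment of one trial.
    log_upper, log_lower: log thresholds, with log_upper > log_lower.
    stop: stopping time (maximum number of segments to examine).
    """
    if not 1 <= stop <= len(curve):
        raise ValueError("stopping time outside the available segments")
    for k in range(1, stop + 1):
        z = curve[k - 1]
        if z >= log_upper:
            return 1, k               # enough evidence for class 1
        if z <= log_lower:
            return 2, k               # enough evidence for class 2
        # otherwise undecided: examine the next segment
    # Still undecided at the stopping time: forced choice (assumed cut-off 0).
    return (1 if curve[stop - 1] >= 0.0 else 2), stop


print(sprt_decide([0.5, 1.2, 3.1, 4.0], 3.0, -3.0, stop=4))
print(sprt_decide([0.5, 1.0, -0.2], 3.0, -3.0, stop=3))
```

Returning the decision segment alongside the label makes it straightforward to compute the average decision time over a set of trials.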

So far, with the above decision rules, the resulting measures, such as accuracy, mutual information (MI) [17], the steepness of MI [18], and average decision time, depend only on the stopping time and the data to be analyzed. Depending on the specific needs of the application, we can take the accuracy, the MI, the steepness of MI, or the average decision time as the optimization target to determine the optimal stopping time. The two thresholds are determined at the same time.

#### 3. Results

##### 3.1. Feature Extraction

To evaluate the performance of our method, we tested it using Dataset III from BCI Competition II and Dataset IIIb from BCI Competition III. They were obtained from four subjects, denoted as S1–S4. The task performed was based on left- and right-hand motor imagination.

The dimension of the projective base, that is, the length of the sliding window, is set to 1 s. The time-domain waveforms of the optimal projective bases of the two channels for subject S1 are shown in Figures 1(a) and 1(b). The corresponding frequency spectra are shown in Figures 1(c) and 1(d). The average projective power time courses during right-hand (dashed line) and left-hand (solid line) imagined movement for C3 and C4 are displayed in Figures 1(e) and 1(f). From this figure, we can see that the projective bases resemble modulated sine signals and that their spectra have band-pass characteristics similar to those of a wavelet basis. For this subject, the projective power is dominated by the *μ* rhythm. During the first 3.5 s (up to 0.5 s after cue presentation), the projective power curves under the two conditions are close; after 3.5 s, a distinct difference in projective power can be observed, which provides a good classification feature. The projective bases for subjects S2 and S3 are similar to that of subject S1.