Abstract

Surface electromyography- (sEMG-) based gesture recognition is widely used in rehabilitation training, prosthetics, and human-computer interaction. The purpose of this study is to simplify sEMG devices by reducing the number of channels while maintaining comparably high gesture recognition accuracy. We propose a compound channel selection scheme that combines variable selection algorithms based on multitask sparse representation (MTSR) and minimum Redundancy Maximum Relevance (mRMR). Specifically, channelwise features are first extracted to compose channel-feature paired variables, on which variable selection by MTSR and by mRMR is carried out separately. Then, we rank all the channels according to their occurrences in each variable selection procedure and identify a given number of informative channels by fusing these channel rankings. Finally, the gesture classification performance using the selected channels is evaluated with a support vector machine (SVM) classifier. Experimental results validate the effectiveness of the proposed method.

1. Introduction

Surface electromyography (sEMG) is commonly used in clinical and engineering areas because signal acquisition is noninvasive and convenient. For example, sEMG provides information for diagnosing neuromuscular disorders [1, 2]. More generally, it may play an important role in the control of assistive robots, arm prostheses, rehabilitation equipment, and other instruments [3, 4].

Most of the related works have been carried out with multichannel sEMG to guarantee satisfactory recognition performance [5]. However, increasing the number of channels raises not only the engineering cost but also the complexity of the sEMG devices and the data processing burden. In addition, performance may deteriorate due to signal crosstalk [6, 7]. To overcome these problems, it is worthwhile to select a reduced group of channels in a myoelectric control system. This is the aim of our work: to simplify the sEMG device by removing redundant electrodes while preserving the desired classification performance.

Feature extraction is a routine procedure that describes the sEMG signals with a feature vector. Numerous features in the time domain, frequency domain, and time-frequency domain have been widely applied in sEMG-based classification tasks. When multiple features are extracted channel by channel, the resulting feature set is quite large (the number of features per channel times the number of channels). Hence, feature selection can follow to reduce feature redundancy and alleviate the curse of dimensionality. Metrics including scatter plots of features, statistical analysis, and recognition rate are applied to evaluate the effectiveness of features [8, 9], and feature search strategies including sequential forward selection (SFS), sequential backward selection (SBS), or bidirectional search are adopted to find the most informative features [10].

Like feature selection, channel selection lowers the feature dimension; in addition, it removes the channels that are unnecessary or irrelevant for classifying different gestures. In fact, channel selection is closely related to feature selection, since features coming from all the channels are generally combined to create a set of channel-feature paired variables. Hence, channel selection can be performed as a subsequent step after feature selection, using the selected or fixed features.

To select useful channels from multielectrode recordings, Nagata and his colleagues [11] used the recognition rate to evaluate each measurement channel and found the best combination of channels by the Monte Carlo method. Huang et al. [12] applied the SFS search strategy to find the desired channels, where four kinds of time-domain features and an LDA classifier were used in the search iterations. Khushaba and Al-Jumaily [13] also adopted a wrapper method, particle swarm optimization, for channel selection, where the importance of subsets was measured using the error rates of a multilayer perceptron neural network trained with backpropagation. Similar work by Oskoei et al. [14] employed a multiobjective genetic search algorithm with the data separability index or the classification rate as the objective function. Besides, filter methods have also been applied to rank the channels, where the minimum Redundancy Maximum Relevance (mRMR) [15] was used by Liu et al. [16] and Gupta et al. [17], the Relief-F by Qu et al. [18], and the Markov random field (MRF) by Qu et al. [16] as well.

As shown in the literature above, channel selection is usually conducted with a fixed feature subset, which means that the best features and channels cannot be selected simultaneously. This can be improved as follows. Features and channels are combined into feature-channel pairs, leading to a hybrid feature-channel selection problem. After finding the least redundant and most informative group of feature-channel pairs among all possible ones, the best channels are taken to be those that occur most often in the group. Along these lines, some classic or modified ranking methods have been applied to select channel-feature variables, such as mRMR-FCO [19] and certain correlation-based or distance-based evaluation functions in the work by Al-Angari et al. [20].

Channel selection methods of this kind follow a common feature-channel filtering pipeline and differ only in the specific ranking scores or search strategies. Our work falls under this framework: we resort to multitask sparse learning [21] together with mRMR filtering to find the sEMG channels that are discriminative across the classification of multiple gestures.

Since the classic least square regression model in sparse learning does not pursue the class-discriminative power of features, discriminative regularization terms are often added to make up for this limitation. Zhu et al. [22] put forward a group-sparsity-based least square regression framework integrating linear discriminant analysis and locality preserving projection. Similarly, to better capture the discriminative information among subjects, a multitask feature selection method was proposed to incorporate the intraclass and interclass Laplacian matrices [23]. However, this kind of approach generally leads to a complicated optimization problem and is likely to incur a heavy computation cost.

Inspired by the works on multitask sparse learning, we propose a channel selection method that combines the multitask sparse representation (MTSR) and mRMR algorithms. Instead of superimposing discriminative regularization terms on the MTSR framework, we evaluate the sEMG channels using MTSR and mRMR separately and then fuse their results to identify the most informative channels. The flowchart of the proposed method is shown in Figure 1.

3. Methods

3.1. Dataset and Evaluation Metrics

The sEMG dataset [24] contains thirty healthy normal-limbed subjects, who were kept relaxed and performed 7 distinct hand gestures including hand open, hand close, supination, pronation, wrist flexion, wrist extension, and rest. Eight surface electrodes were used for sEMG acquisition. In other words, we have signals with eight channels.

In this work, three classic measures, namely precision, recall, and accuracy, are selected as indicators to evaluate the performance of gesture classification. These metrics are defined as follows:

$$\text{Precision} = \frac{TP}{TP + FP}, \tag{1}$$

$$\text{Recall} = \frac{TP}{TP + FN}, \tag{2}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{3}$$

where TP, FP, TN, and FN are True Positive, False Positive, True Negative, and False Negative, respectively. The classification metrics reported in the experiments below are averaged over 5-fold cross validation.
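As a minimal illustrative sketch (not the authors' implementation), the three metrics can be computed from true and predicted labels by treating one gesture at a time as the positive class. The function names and the toy labels are hypothetical, and the one-vs-rest treatment is an assumption, since the multiclass averaging scheme is not stated here.

```python
def one_vs_rest_counts(y_true, y_pred, positive):
    """Return (TP, FP, TN, FN), treating `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def precision_recall_accuracy(y_true, y_pred, positive):
    """Precision, recall, and accuracy for one class (one-vs-rest)."""
    tp, fp, tn, fn = one_vs_rest_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, recall, accuracy


# Hypothetical toy labels for three of the gestures.
y_true = ["open", "open", "close", "rest", "rest", "close"]
y_pred = ["open", "close", "close", "rest", "open", "close"]
print(precision_recall_accuracy(y_true, y_pred, "open"))
```

Averaging these per-class values over all gestures and over the cross-validation folds would give one summary figure per metric.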

3.2. Feature Extraction

To analyze the sEMG signal, a sliding window is applied to the 8 channels. In total, 11 time-domain features, as listed in Table 1, are extracted; these features have been proved effective for myoelectric pattern recognition [16]. This yields 8 × 11 = 88 channel-feature paired variables.
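The windowed extraction can be sketched as follows for three of the time-domain features named later in the paper (IAV, SSI, and WL), using their standard textbook definitions. The window length, step size, function names, and the random stand-in signal are assumptions for illustration only; the actual windowing parameters are not given here.

```python
import numpy as np


def sliding_windows(signal, window, step):
    """Yield successive analysis windows of shape (window, n_channels)."""
    for start in range(0, signal.shape[0] - window + 1, step):
        yield signal[start:start + window]


def window_features(win):
    """IAV, SSI, and WL per channel for one window of shape (window, n_channels)."""
    iav = np.sum(np.abs(win), axis=0)                  # integrated absolute value
    ssi = np.sum(win ** 2, axis=0)                     # simple square integral
    wl = np.sum(np.abs(np.diff(win, axis=0)), axis=0)  # waveform length
    # Channel-major layout: all features of channel 1, then channel 2, ...
    return np.stack([iav, ssi, wl], axis=1).ravel()


# Random stand-in for an 8-channel recording (hypothetical data).
rng = np.random.default_rng(0)
emg = rng.standard_normal((1000, 8))
features = np.array(
    [window_features(w) for w in sliding_windows(emg, window=256, step=128)]
)
print(features.shape)  # (number of windows, 8 channels x 3 features)
```

With all 11 features of Table 1 in place of these three, each row would hold the 88 channel-feature variables of one analysis window.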

In Table 1, $L$ is the signal length, and $x_i$ is the signal in an analysis window. SD is the standard deviation. $p$ is the order of the autoregressive model, $\varepsilon_i$ is a white noise term, and the coefficients $a_k$ are used as features.

3.3. Channel Selection Scheme

This study aims to reduce the number of sEMG channels by finding the smallest set of electrode locations that best discriminates different hand motions. For the channel-feature variables, we first perform a composite variable selection for the task of gesture recognition. Then, all channels are ranked according to their occurrences among the selected channel-feature variables, where MTSR and the mRMR variable ranking method are used separately. By fusing these two ranking results, we finally obtain a reduced set of channels that retains high recognition capability for hand gestures.

3.3.1. MTSR-Based Variable Selection

Given a feature matrix $X \in \mathbb{R}^{d \times n}$, where d and n are the numbers of features and samples, respectively, we also have a class indicator matrix $Y \in \mathbb{R}^{n \times c}$ with the class number c. Since multiple response variables are included in the class indicator matrix $Y$, for each response variable, we can find a regression coefficient vector individually. By regularizing a least square regression model with an $\ell_{2,1}$-norm, the multiclass feature selection problem can be formulated as a sparse least square regression model as follows [21]:

$$\min_{W} \left\| Y - X^{T} W \right\|_{F}^{2} + \lambda \left\| W \right\|_{2,1}, \tag{4}$$

where $W \in \mathbb{R}^{d \times c}$ is a coefficient matrix for regression, $\|W\|_{2,1}$ is the sum of the $\ell_2$-norms of the rows of $W$, and the parameter λ is adopted to adjust the sparsity of $W$. By enforcing the group sparsity on the coefficient matrix with an $\ell_{2,1}$-norm, some rows in $W$ will be zero. The first term in equation (4) controls the data fitting error, and the regularization parameter λ balances the relative importance of both terms. A larger λ results in more zero rows in the coefficient matrix. It can be assumed that the optimal solution assigns large weights to the important features and zero or small weights to the less important features.
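To make the row-sparsity effect of equation (4) concrete, the following is a minimal sketch of an $\ell_{2,1}$-regularized least squares solver based on proximal gradient descent. The solver choice, the function name `mtsr_l21`, the synthetic data, and the value of λ are all assumptions for illustration; the optimization algorithm actually used is not specified here. Note that the code stores samples in rows, so its `X` corresponds to the transpose of the feature matrix in the text.

```python
import numpy as np


def mtsr_l21(X, Y, lam, n_iter=300):
    """Proximal-gradient sketch for min_W ||X W - Y||_F^2 + lam * ||W||_{2,1}.

    X: (n_samples, n_variables), Y: (n_samples, n_classes) one-hot indicators.
    """
    # Step size 1/L, where L = 2 * sigma_max(X)^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        G = W - step * 2.0 * X.T @ (X @ W - Y)  # gradient step on the fitting term
        row_norms = np.linalg.norm(G, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(row_norms, 1e-12))
        W = shrink * G                          # row-wise soft-thresholding
    return W


# Synthetic data (hypothetical): only the first three variables carry class information.
rng = np.random.default_rng(0)
n, d, c = 600, 10, 3
X = rng.standard_normal((n, d))
labels = np.argmax(X[:, :c], axis=1)
Y = np.eye(c)[labels]                           # one-hot class indicator matrix

W = mtsr_l21(X, Y, lam=200.0)
kept = np.flatnonzero(np.linalg.norm(W, axis=1) > 0)
print(kept)                                     # indices of variables with nonzero rows
```

Variables whose rows of `W` are entirely zero are discarded for all classes at once, which is the group-sparsity behavior described above; raising `lam` removes more rows.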

3.3.2. mRMR Variable Ranking

The above MTSR method mainly focuses on the relationship between labels and features but ignores the relationship between features to some extent. Hence, we resort to mRMR algorithm to select features from a different perspective.

The mRMR criteria [15] aim to choose features that are mutually dissimilar to each other and marginally similar to the classification labels, ranking candidate features based on a compromise between relevance and redundancy. In this paper, we use mutual information to measure both redundancy and relevance.

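Mutual information is defined as follows:

$$I(X; Y) = \iint p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)} \, dx \, dy, \tag{5}$$

where X and Y denote two feature vectors and p(x, y) is the joint probability density, while p(x) and p(y) are the marginal probability densities. The goal is to find a subset S with m features, and the maximum relevance and the minimum redundancy are defined by equations (6) and (7):

$$\max D(S, c), \quad D = \frac{1}{|S|} \sum_{x_i \in S} I(x_i; c), \tag{6}$$

$$\min R(S), \quad R = \frac{1}{|S|^{2}} \sum_{x_i, x_j \in S} I(x_i; x_j), \tag{7}$$

where $x_i$ is the $i$-th feature, c is the class variable, and S is the feature subset. The maximum relevance and the minimum redundancy are integrated by equation (8) or (9):

$$\max \Phi(D, R), \quad \Phi = D - R, \tag{8}$$

$$\max \Phi(D, R), \quad \Phi = D / R. \tag{9}$$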

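The incremental search method is used to find a near-optimal feature subset. Supposing that we already have the feature set $S_{m-1}$ with $m-1$ features, the next step is to find the m-th feature from the remaining feature set $X - S_{m-1}$, where $X$ here denotes the set of all candidate features, maximizing $\Phi(\cdot)$. The incremental algorithm optimizes the formula [15]

$$\max_{x_j \in X - S_{m-1}} \left[ I(x_j; c) - \frac{1}{m-1} \sum_{x_i \in S_{m-1}} I(x_j; x_i) \right]. \tag{10}$$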
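The incremental search can be sketched for discrete (already quantized) variables as below, using the difference form of the criterion. The function names and the toy variables are hypothetical, and how continuous sEMG features are discretized before estimating mutual information is left open, since it is not described here.

```python
from collections import Counter
from math import log


def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(k / n * log(k * n / (pa[x] * pb[y])) for (x, y), k in pab.items())


def mrmr_rank(features, labels, m):
    """Greedy mRMR: at each step pick the variable maximizing relevance minus mean redundancy."""
    relevance = {f: mutual_information(v, labels) for f, v in features.items()}
    selected = []
    while len(selected) < m:
        def score(f):
            if not selected:
                return relevance[f]
            redundancy = sum(
                mutual_information(features[f], features[s]) for s in selected
            ) / len(selected)
            return relevance[f] - redundancy

        best = max((f for f in features if f not in selected), key=score)
        selected.append(best)
    return selected


# Hypothetical toy variables: the class is determined jointly by "a" and "b".
features = {
    "a":      [0, 0, 0, 0, 1, 1, 1, 1],
    "a_copy": [0, 0, 0, 0, 1, 1, 1, 1],  # fully redundant with "a"
    "b":      [0, 0, 1, 1, 0, 0, 1, 1],
    "noise":  [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the class
}
labels = [0, 0, 1, 1, 2, 2, 3, 3]
print(mrmr_rank(features, labels, m=3))
```

In this toy case the duplicate of "a" is as relevant as "a" itself, yet it is ranked after "b" because its redundancy with the already selected "a" cancels its relevance.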

3.4. Fusion of the Channel Rankings

As stated above, we select the effective channel-feature pairs by the MTSR and mRMR ranking methods in turn. Thus, we obtain two rankings of all channels according to their occurrences in the screened channel-feature variables. These two channel ranking methods work on different principles, but their results share common informative components even if they differ to a certain extent. We combine the two channel ranking results to reduce the risk of a wrong decision by either method alone.

4. Results

4.1. Channel Selection

Since 11 features are extracted for each of the 8 channels, we have 88 channel-feature paired variables in total for each analysis window. We apply the multiclass sparse representation model to the training data. According to equation (4), the parameter λ controls the sparsity of the coefficient matrix W, namely, the number of screened channel-feature variables. The gesture recognition performance is affected by the features and classifiers employed.

Let λ vary from 0.01 to 0.1. The channel-feature variables corresponding to nonzero rows of the coefficient matrix W are kept and fed to a support vector machine (SVM) classifier with a radial basis function kernel [25]. We hope to achieve a high recognition rate (accuracy is used in Sections 4.1 and 4.3) while using only a few feature variables.

We make a comparison to show how to decide on a proper value for λ. When λ varies from 0.01 to 0.1, the number of screened channel-feature variables varies greatly, but the recognition rate does not decrease much. The change of the recognition rate and the channel-feature number with λ is shown in Figure 2. We can also see that a good balance between the recognition rate and the channel-feature dimension is achieved when λ equals 0.03. Accordingly, we keep 36 channel-feature variables in the following channel selection procedure.

For mRMR-based channel-feature selection, we also keep the top 36 variables, which will be fused with the results of MTSR.

Table 2 lists the selected 36 channel-feature variables (features for each channel) by MTSR and mRMR, respectively. There is a clear difference between the results screened by these two methods. For instance, the autoregressive features AR1 and AR2 play important roles in the MTSR model, being used by most channels. However, for mRMR, the two features only appear in channel ⑧. Therefore, we select channels based on channel utilization rather than by analyzing the features. We count the number of times that any two channels occupy a common feature, namely, the number of features shared by a channel pair. The more frequently a channel is utilized, the more important the channel is. The corresponding statistical results for MTSR and mRMR are shown in Tables 3 and 4.
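The counting step can be sketched as follows. The helper names and the small list of selected (channel, feature) pairs are hypothetical and do not reproduce Table 2; summing the pairwise counts into one utilization score per channel is an assumption about how Tables 3 and 4 are turned into a ranking.

```python
from collections import defaultdict
from itertools import combinations


def shared_feature_counts(selected_pairs):
    """Map each channel pair to the number of features that both channels retain."""
    channels_by_feature = defaultdict(set)
    for channel, feature in selected_pairs:
        channels_by_feature[feature].add(channel)
    shared = defaultdict(int)
    for channels in channels_by_feature.values():
        for a, b in combinations(sorted(channels), 2):
            shared[(a, b)] += 1
    return dict(shared)


def channel_utilization(shared):
    """Sum, for each channel, the shared-feature counts of all pairs it belongs to."""
    totals = defaultdict(int)
    for (a, b), count in shared.items():
        totals[a] += count
        totals[b] += count
    return dict(totals)


# Hypothetical selection of (channel, feature) variables.
selected = [(1, "WL"), (2, "WL"), (3, "WL"), (2, "AR1"), (3, "AR1"), (3, "SSI")]
shared = shared_feature_counts(selected)
print(shared)
print(channel_utilization(shared))
```

Sorting the channels by this utilization score in descending order gives one channel ranking per selection method.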

From Tables 3 and 4, we sort the channels by the number of times they are used. The order is ②>③=⑧>⑤>⑦>①=④>⑥ for MTSR and ①=⑤>⑧>③>⑦>②=⑥>④ for mRMR. By decision-level fusion of these two rankings, three channels, ③, ⑤, and ⑧, are adopted for the subsequent gesture recognition.
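The exact fusion rule is not spelled out in the text. One simple rule that is consistent with the reported selections is to order the channels by their worse rank across the two methods and to break ties by the rank sum. The sketch below implements this rule as an assumption for illustration, not as the authors' stated procedure; the function names are hypothetical, and the two rankings are the ones given above, written as tiers of tied channels.

```python
def dense_ranks(tiers):
    """Map each channel to its 1-based tier index; tied channels share a rank."""
    return {ch: rank for rank, tier in enumerate(tiers, start=1) for ch in tier}


def fuse_rankings(tiers_a, tiers_b):
    """Order channels by worse rank, then by rank sum, then by channel number."""
    ra, rb = dense_ranks(tiers_a), dense_ranks(tiers_b)
    return sorted(ra, key=lambda ch: (max(ra[ch], rb[ch]), ra[ch] + rb[ch], ch))


# Channel rankings from Tables 3 and 4, best tier first.
mtsr = [[2], [3, 8], [5], [7], [1, 4], [6]]
mrmr = [[1, 5], [8], [3], [7], [2, 6], [4]]

fused = fuse_rankings(mtsr, mrmr)
for k in (2, 3, 4):
    print(k, sorted(fused[:k]))
```

Under this rule, a channel must rank reasonably well with both methods to be kept, which is why channels ② and ①, each favored by only one method, are not selected.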

4.2. Feature Selection

Based on the same channel-feature variables screened by MTSR and mRMR, we list all the channels occupying a given feature (shown in Table 5). If a feature is shared by more than half of the channels (>4), it is selected for the gesture recognition task. Specifically, we have WL, AR1, and AR2 from the MTSR-based results, and WL, IAV, SSI, and Kurtosis from mRMR. These six features, WL, IAV, SSI, Kurtosis, AR1, and AR2, are fed into the classifier in the following experiments.
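The majority rule for features can be sketched in a few lines. The function name and the example selection are hypothetical and do not reproduce Table 5.

```python
from collections import defaultdict


def majority_features(selected_pairs, n_channels):
    """Return the features retained by more than half of the channels."""
    channels_by_feature = defaultdict(set)
    for channel, feature in selected_pairs:
        channels_by_feature[feature].add(channel)
    return sorted(
        f for f, chs in channels_by_feature.items() if len(chs) > n_channels / 2
    )


# Hypothetical selection: WL kept by 5 channels, AR1 by 4, SSI by 2.
selected = (
    [(ch, "WL") for ch in range(1, 6)]
    + [(ch, "AR1") for ch in range(1, 5)]
    + [(ch, "SSI") for ch in range(1, 3)]
)
print(majority_features(selected, n_channels=8))
```

Applying the rule once to the MTSR selection and once to the mRMR selection, and taking the union of the two results, corresponds to the six-feature set described above.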

4.3. Classification Performance Based on Channel and Feature Selection

According to Section 4.1, three channels (③, ⑤, and ⑧) are jointly selected by fusing MTSR and mRMR. We first compare the gesture classification performance using these three channels with that of the channels selected by MTSR or mRMR individually. The top three channels are ②, ③, and ⑧ for MTSR and ①, ⑤, and ⑧ for mRMR. The corresponding gesture recognition accuracies are shown in Figure 3. With channels ③, ⑤, and ⑧, selected by combining MTSR and mRMR, the average recognition rate is 98.68%, which is higher than the 95.59% obtained with channels ②, ③, and ⑧ and the 81.15% obtained with channels ①, ⑤, and ⑧.

In addition, comparative experiments for gesture classification are carried out using two or four channels selected by the different methods. When choosing two channels, we obtain channels ⑤ and ⑧ by fusing MTSR and mRMR. For the MTSR-based method, the top two channels are ② and ③ or ② and ⑧; for mRMR, the selected two channels are ① and ⑤. The gesture classification accuracies are illustrated in Figure 4, where the channels selected by jointly using MTSR and mRMR achieve the highest classification accuracy.

When choosing four channels, channels ③, ⑤, ⑦, and ⑧ are selected by fusing MTSR and mRMR. For the MTSR-based method, the top four channels are ②, ③, ⑤, and ⑧; for mRMR, the four channels are ①, ③, ⑤, and ⑧. The corresponding gesture classification accuracies are shown in Figure 5. This again verifies that the channels selected by jointly using MTSR and mRMR achieve the highest classification accuracy.

4.4. Performance Evaluation and Comparison

To evaluate the performance of our method of fusing MTSR and mRMR for channel selection, comparative experiments are conducted in two respects. Firstly, we further compare the proposed method with MTSR and mRMR in the task of channel selection. For the number of selected channels varying from 2 to 4, the precision and recall for gesture classification corresponding to the different methods are listed in Table 6, where the selected channels are given in square brackets.

Compared with using only 2 channels, the recognition performance improves significantly when 3 channels are selected. This indicates that even 2 informative channels cannot capture enough information to distinguish the different hand gestures in the experiment, where the best combination of 2 channels, [5 8], is picked out by the proposed method. Within a certain range, adding more channels improves the overall recognition performance. In all cases, as shown in the table, our MTSR- and mRMR-fused method outperforms each of the two base methods alone.

Secondly, a recent work proposed a mean Relief-F-based channel selection method (MRCS) [18]. Under the same experimental conditions, including dataset and features, its classification performance is shown in the third row of Table 6. When selecting four channels, the channel combination [1 3 5 7] is obtained by MRCS, and the corresponding classification rate is lower than that of our method with either 3 or 4 channels. It should be noted that the classification performance can be further improved by using more informative features, as demonstrated in [18], which will be the focus of our future work.

5. Conclusion

Recent developments in sEMG instrumentation have made it possible to record many channels from single or multiple muscles simultaneously. The current study combines MTSR and mRMR to process the channel-feature variables, aiming to reduce the channel number without degrading the gesture recognition performance.

For a gesture recognition task, an sEMG dataset of 8 channels recorded for 7 hand motions is used. Given the channel-feature pairs obtained from time-domain features, the most informative channels are determined by the variable selection method combining MTSR and mRMR. With this combination, the selected variables not only reflect the relationship between labels and feature vectors but also aim to meet the requirement of maximum relevance and minimum redundancy between feature vectors. Experimental results have verified the effectiveness of the proposed method.

It is worth noting that only time-domain features are extracted from the sEMG signals in this paper, and the channel selection depends on these features. Features from the frequency domain or time-frequency domain will be used to test this feature/variable selection method in future work. In addition, the proposed method for feature selection can also be used in other pattern recognition and machine learning applications.

Data Availability

The pattern recognition library is available at http://www.sce.carleton.ca/faculty/chan.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Natural Science Foundation of Shandong Province (ZR2020MF086).