#### Abstract

Human motion intention recognition is key to achieving good human-machine coordination and wearing comfort in wearable robots. Surface electromyography (sEMG), a bioelectrical signal, is generated prior to the corresponding motion and directly reflects human motion intention. Thus, better human-machine interaction can be achieved by using sEMG based motion intention recognition. In this paper, we review and discuss in detail the state of the art of the main sEMG based motion intention recognition methods. According to the method adopted, motion intention recognition is divided into two groups: sEMG-driven musculoskeletal (MS) model based motion intention recognition and machine learning (ML) model based motion intention recognition. The specific models and recognition performance of each study are analyzed and systematically compared. Finally, a discussion of the existing problems in current studies, major advances, and future challenges is presented.

#### 1. Introduction

With worldwide population aging and the increasing number of disabled people and amputees, wearable robots have recently been researched extensively. For wearable robots, the human-machine interface is a research hotspot: it acquires human motion intention by collecting and analyzing related information and helps external devices develop effective control strategies [1]. Accurate and real-time recognition of human motion intention is the key to achieving good human-machine coordination and wearing comfort [2, 3]. As a bioelectrical signal, surface electromyography (sEMG) is generated when a neural signal carrying human intention information is transmitted to the related muscles, and it reflects the human motion intention directly [4, 5]. Hence, the motion intention can be estimated without information delay or loss [6, 7]. Because sEMG contains rich information, its acquisition technology is mature, and it is noninvasive, human motion intention recognition based on sEMG is becoming mainstream [8, 9].

The methods of sEMG based motion intention recognition can be divided into two groups: sEMG-driven musculoskeletal (MS) model based and machine learning (ML) based. For the former, a function relating sEMG to joint moment, angular velocity, or angular acceleration is established through a biomechanical model of the muscles. The advantage of this method is that it explains the motion production process [3, 10]. For the latter, sEMG features or the processed sEMG are provided as input to the ML model. Discrete-motion classification or continuous-motion estimation is realized by establishing the mapping between the input and human motion intention. The ML models commonly used for motion intention recognition include support vector machine (SVM), linear discriminant analysis (LDA), back-propagation neural network (BPNN), and deep learning (DL) [11, 12]. Compared to the MS model, the ML model has lower computational complexity, shorter operation time, and better real-time performance. With the development of DL research in recent years, DL is increasingly used for human motion intention recognition. Compared to the other ML models, DL greatly improves the nonlinearity of the model, the ability to solve complex problems, and the recognition accuracy [13]. The DL models commonly used for motion intention recognition include deep belief network (DBN), convolutional neural network (CNN), and stacked auto-encoder (SAE) [14, 15].

Several related review papers have appeared in recent years. Nazmi et al. [16] reviewed the classification methods of motion patterns based on sEMG and provided a brief comparison of the different methods for preprocessing, feature extraction, and classification of sEMG signals. Chowdhury [17] analyzed the signal processing of sEMG and evaluated the pros and cons of different classification models. Singh et al. [2] discussed the current development and challenges of the sEMG based control schemes employed in designing exoskeletons for stroke rehabilitation. However, sEMG based continuous-motion intention regression and sEMG-driven musculoskeletal model based motion intention recognition are rarely reviewed, although these two methods are more valuable for realizing smooth control of wearable robot movements [3, 18, 19]. To give a fuller picture of sEMG based motion intention recognition, this paper briefly reviews the commonly used methods of human motion intention recognition from the last decade.

The rest of this paper is organized as follows. In Section 2, the motion intention recognition methods based on the sEMG-driven musculoskeletal model are reviewed. In Section 3, we discuss the various ML methods for discrete-motion classification and continuous-motion regression. In Section 4, a succinct conclusion is presented.

#### 2. sEMG-Driven MS Model Based Motion Intention Recognition

The sEMG is a nonstationary, weak electrical signal whose amplitude is concentrated in 0.01-10 mV and whose frequency is concentrated in 20-500 Hz, especially in 50-150 Hz [20]. Because it is generated about 30-150 ms before the corresponding motion, sEMG is an ideal choice for motion intention estimation [20, 21]. Figure 1 shows the human-machine interaction process based on sEMG. In the whole process, human motion intention recognition is the most critical part. It can be achieved in two ways: with an sEMG-driven MS model or with an ML model.

The sEMG-driven musculoskeletal model can be divided into three submodels, i.e., activation model, contraction model, and musculoskeletal geometry model, as shown in Figure 2 [10]. To serve as input to the model, the raw sEMG signal should be preprocessed by high-pass filtering, full-wave rectification, low-pass filtering, and normalization [3]. For the activation model, the relationship between the muscular activation $a_i(t)$ and the processed sEMG signal $u_i(t)$ of muscle $i$ at time $t$ can be expressed as the following equation [3, 22]:

$$a_i(t) = \frac{e^{A_i u_i(t)} - 1}{e^{A_i} - 1}$$

where $A_i$ is a nonlinear shape factor of muscle number $i$. For the contraction model, the Hill-type muscle model is commonly used, as shown in Figure 3 [23]. The force produced by the muscle-tendon unit, $F_i^{\mathrm{mt}}$, can be given by

$$F_i^{\mathrm{mt}}(t) = F_i^{\mathrm{t}} = F_i^{\max}\left[ f_{A,i}\, f_{V,i}\, a_i(t) + f_{P,i} \right] \cos\varphi_i$$

where $F_i^{\mathrm{t}}$ and $F_i^{\max}$ denote the tendon force and maximum isometric muscle force. $f_{A,i}$, $f_{V,i}$, and $f_{P,i}$ are the generic force-length, generic force-velocity, and parallel passive elastic force-length curves of muscle number $i$, respectively. $\varphi_i$ represents the pennation angle, which is defined as the angle between the muscle fiber and the tendon [3, 22]. For the musculoskeletal geometry model, the moment arm of the muscle-tendon unit, $r_i$, can be defined as

$$r_i = \frac{\partial l_i^{\mathrm{mt}}}{\partial \theta}$$

where $\theta$ is the joint angle and $l_i^{\mathrm{mt}}$ is the muscle-tendon length. $l_i^{\mathrm{mt}}$ can be calculated by

$$l_i^{\mathrm{mt}} = l_i^{\mathrm{t}} + l_i^{\mathrm{m}} \cos\varphi_i$$

where $l_i^{\mathrm{t}}$ and $l_i^{\mathrm{m}}$ represent the lengths of tendon and muscle fiber, respectively [3, 10]. Thus, the joint moment $M$ can be given by the following equation [3]:

$$M = \sum_{i=1}^{m} r_i F_i^{\mathrm{mt}} - \sum_{j=1}^{n} r_j F_j^{\mathrm{mt}}$$

where $m$ and $n$ denote the number of agonist and antagonist muscles acting on the joint, respectively. The joint angular acceleration $\ddot{\theta}$ can be calculated by the joint forward dynamics [23, 24]:

$$\ddot{\theta} = \frac{M - M_{\mathrm{ext}}}{I}$$

where $I$ represents the joint inertia and $M_{\mathrm{ext}}$ includes the external torque and the limb gravity torque. Consequently, the joint angular velocity $\dot{\theta}$ and angle $\theta$ can be calculated by

$$\dot{\theta}(t) = \dot{\theta}(0) + \int_0^t \ddot{\theta}(\tau)\, d\tau, \qquad \theta(t) = \theta(0) + \int_0^t \dot{\theta}(\tau)\, d\tau$$

There are several unknown parameters in the sEMG-driven musculoskeletal model, so parameter identification through a preliminary experiment is necessary. Han et al. [23] and Ding et al. [24] developed a state-space sEMG model to estimate the continuous motion of the elbow joint directly, and a closed-loop prediction-correction approach was employed. The results of the preliminary experiment showed that the root mean squared error (RMSE) of angle and angular velocity between estimated and actual values is around 0.10 rad and 0.15 rad/s, and the correlation coefficient (CC) is around 0.99 and 0.91, respectively. Lloyd et al. [22] utilized a modified Hill-type muscle model to estimate muscle forces and knee moments. An average CC of 0.91 and a mean residual error (MRE) of 12 Nm were observed. Karavas [3] employed the common sEMG-driven musculoskeletal model to estimate the knee torque, trajectory, and stiffness trend. The results showed that the normalized RMSE was about 0.12 between the estimated and actual values. In the study of Sartori et al. [10], a multi-DOF sEMG-driven model was developed to estimate the muscle force and joint moment of the lower extremity. The results showed that the average normalized mean absolute error (MAE) of joint moment of three lower extremities was around 0.15.
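The activation and forward-dynamics steps of this pipeline can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any cited study. It assumes the exponential activation form $a = (e^{Au} - 1)/(e^{A} - 1)$, the sign convention $\ddot{\theta} = (M - M_{\mathrm{ext}})/I$, and a simple explicit Euler step. The function names `activation` and `integrate_joint` and all numeric values are hypothetical.

```python
import math


def activation(u, shape_factor):
    """Map a normalized sEMG value u in [0, 1] to a muscle activation in [0, 1].

    Assumed form: a = (exp(A * u) - 1) / (exp(A) - 1), with A != 0.
    """
    return (math.exp(shape_factor * u) - 1.0) / (math.exp(shape_factor) - 1.0)


def integrate_joint(moments, inertia, external_moment, dt, theta0=0.0, omega0=0.0):
    """Explicit Euler integration of theta_ddot = (M - M_ext) / I.

    moments holds one net muscle moment per time step. Returns a list of
    (angle, angular_velocity) pairs, one per step.
    """
    theta, omega = theta0, omega0
    states = []
    for moment in moments:
        alpha = (moment - external_moment) / inertia  # angular acceleration
        theta += omega * dt  # advance the angle with the velocity at the step start
        omega += alpha * dt  # then advance the velocity with the acceleration
        states.append((theta, omega))
    return states


# Hypothetical values chosen only for illustration.
print(activation(0.5, shape_factor=-2.0))
print(integrate_joint([1.0, 1.0], inertia=0.5, external_moment=0.0, dt=0.1))
```

In practice the shape factor, inertia, and muscle parameters would come from the parameter identification step described above, and a higher-order integrator may be preferable to the Euler step used here.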

#### 3. Machine Learning Based Motion Intention Recognition

Machine learning (ML) based motion intention recognition can be divided into two groups: discrete-motion classification and continuous-motion regression. For the former, a mapping between sEMG and discrete motions of the upper/lower limbs needs to be established. The commonly classified lower limb motions include walking, running, sit-to-stand, stand-to-sit, stair ascent, and stair descent. The commonly classified upper limb motions include shoulder flexion/extension/adduction/abduction, elbow flexion/extension, wrist flexion/extension/radial deviation/ulnar deviation, thumb flexion/extension/adduction/abduction, index finger flexion/extension, middle finger flexion/extension, ring finger flexion/extension, little finger flexion/extension, hand grasp, and pinch grasp [25, 26]. For the latter, a mapping between sEMG and continuous motions of the upper/lower limbs needs to be constructed. The commonly regressed limb motions include angle, angular velocity, angular acceleration, force, and moment of the hip, knee, ankle, shoulder, elbow, and wrist joints. Compared to the former, which is a mature method, the latter is more valuable for the smooth control of wearable robots and will be the focus of future research [20].

##### 3.1. Machine Learning Based Discrete-Motion Classification

Table 1 reviews the most recent studies on discrete-motion classification. As shown in Figure 1, feature extraction and classification model construction are the two most important steps in discrete-motion classification. The commonly used features can be divided into time domain features, frequency domain features, and time-frequency domain features. For the time domain, mean absolute value (MAV) [27–32], root mean square (RMS) [29, 31], variance (VAR) [29, 31], standard deviation (SD) [29], zero count (ZC) [27, 29, 32], waveform length (WL) [27, 29, 32], slope sign change (SSC) [29, 32], integrated EMG (IEMG) [33], and difference of mean absolute value (DMAV) [27] are commonly utilized. Although time domain features are simple to calculate, they are not enough to describe the information in the signals. For the frequency domain, peak frequency (PF), median frequency (MF), and mean power frequency (MPF) are commonly utilized. These features are typically used only for analyzing muscle fatigue [34]. For the time-frequency domain, Fourier transform features [27] and wavelet transform features [35] are commonly used. Although they capture comprehensive information about the signal, the extraction process is complex and time consuming. When multichannel sEMG signals are used for feature extraction, feature redundancy often exists. Therefore, a dimensionality reduction algorithm, usually principal component analysis, is needed for multichannel feature extraction [20].
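As a hedged illustration of how cheap the time domain features are to compute, the sketch below extracts MAV, RMS, WL, ZC, and SSC from one analysis window of a single sEMG channel. The function name `time_domain_features` and the noise `threshold` parameter are assumptions for this example, and the exact ZC and SSC definitions vary between studies.

```python
import math


def time_domain_features(window, threshold=0.0):
    """Compute five common time domain features for one sEMG window.

    window is a list of samples; threshold suppresses zero crossings and
    slope sign changes caused by low-amplitude noise (0.0 disables it).
    """
    n = len(window)
    mav = sum(abs(x) for x in window) / n               # mean absolute value
    rms = math.sqrt(sum(x * x for x in window) / n)     # root mean square
    wl = sum(abs(window[i + 1] - window[i]) for i in range(n - 1))  # waveform length
    zc = sum(                                           # zero count
        1
        for i in range(n - 1)
        if window[i] * window[i + 1] < 0
        and abs(window[i] - window[i + 1]) >= threshold
    )
    ssc = sum(                                          # slope sign change
        1
        for i in range(1, n - 1)
        if (window[i] - window[i - 1]) * (window[i] - window[i + 1]) > 0
        and (
            abs(window[i] - window[i - 1]) >= threshold
            or abs(window[i] - window[i + 1]) >= threshold
        )
    )
    return {"MAV": mav, "RMS": rms, "WL": wl, "ZC": zc, "SSC": ssc}


# Toy window, not real sEMG data.
print(time_domain_features([1.0, -1.0, 2.0, -2.0]))
```

With several channels, the per-channel dictionaries are usually concatenated into one feature vector, which is where the redundancy mentioned above arises.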

The SVM based classification model can solve the nonlinear binary classification problem by constructing an optimal classification hyperplane with the largest margin to separate the two classes of samples [25]. For multiclass problems, one-versus-one SVM, one-versus-rest SVM, multistep SVM, etc. are commonly used. Babita et al. [36] employed linear SVM and wavelet packet transform to classify binary elbow flexion and extension. A classification accuracy of 91.1% was observed for this method. Yang et al. [37] classified eight hand motions including palm extension, palm turn downwards, palm turn upwards, palm enstrophe, palm ectropion, fist turn downwards, fist turn upwards, and clenching by using genetic algorithm optimized SVM. Power spectral density was used for feature extraction. The results showed that the training and testing recognition accuracy could reach 99.37% and 90.33%, respectively. Sui et al. [38] utilized an improved SVM to classify six upper limb motions, namely, elbow flexion, elbow extension, wrist internal rotation, wrist external rotation, fist clenching, and fist unfolding. The energy and variance of the wavelet packet coefficients were selected as feature vectors. The results showed that the average recognition accuracy could reach 90.66%. Cai et al. [25] adopted one-versus-one SVM to classify five upper limb motions, namely, shoulder flexion, shoulder abduction, internal rotation, external rotation, and elbow flexion. The results showed that the classification accuracy could reach 94.18%. Pan et al. [39] classified six finger motions, namely, thumb bending, index finger bending, middle finger bending, ring finger bending, and little finger bending by using one-versus-one SVM. The relative energy coefficient of the wavelet packet was selected as the input feature of the classifier. The results showed that the recognition accuracy reached 97.78%. Chen et al. [40] utilized two-step SVM to classify seven upper limb motions, namely, shoulder flexion, shoulder extension, shoulder adduction, shoulder abduction, elbow flexion, and elbow extension. By extracting RMS as the input feature, a shorter classification time and more accurate results could be obtained. Naik et al. [41] developed a twin SVM to classify seven motions including wrist flexion, ring and middle finger flexion, wrist flexion toward little finger, wrist flexion toward thumb, finger and wrist flexion, finger and wrist flexion toward little finger, and finger and wrist flexion toward thumb. A classification accuracy of 84.83% was observed for this method.
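The one-versus-one strategy mentioned above trains one binary classifier per pair of classes, k(k-1)/2 in total for k classes, and lets them vote. The sketch below shows only the voting step. To stay self-contained it uses hypothetical nearest-center rules on a one-dimensional feature as stand-ins for trained binary SVMs; the names `one_vs_one_predict` and `nearest_center_classifier` and the class centers are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(sample, classes, binary_classifiers):
    """Return the class that wins the most pairwise votes.

    binary_classifiers maps each (class_a, class_b) pair to a function that
    takes the sample and returns either class_a or class_b.
    """
    votes = Counter()
    for pair in combinations(classes, 2):
        votes[binary_classifiers[pair](sample)] += 1
    # Ties are resolved in favor of the class that received a vote first.
    return votes.most_common(1)[0][0]


def nearest_center_classifier(class_a, class_b, centers):
    """Stand-in for a trained binary SVM: pick the closer class center."""
    def decide(sample):
        dist_a = abs(sample - centers[class_a])
        dist_b = abs(sample - centers[class_b])
        return class_a if dist_a <= dist_b else class_b
    return decide


# Hypothetical 1-D feature centers for three motion classes.
centers = {"rest": 0.0, "flexion": 1.0, "extension": 2.0}
classes = list(centers)
classifiers = {
    pair: nearest_center_classifier(pair[0], pair[1], centers)
    for pair in combinations(classes, 2)
}
print(one_vs_one_predict(0.9, classes, classifiers))
```

Replacing each stand-in with a trained binary SVM decision function leaves the voting logic unchanged.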

Like SVM, LDA, k-nearest neighbour (K-NN), naive Bayes (NB), quadratic discriminant analysis (QDA), random tree (RT), random forest (RF), etc. are also commonly used as classifiers. Liu et al. [42] employed mixed LDA to classify thirteen hand motions including fist, open hand, radial deviation, ulnar deviation, wrist flexion, wrist extension, pronation, supination, fine pinch, key grip, ball grasp, and cylinder grasp. An average classification accuracy of 88.74% was reached with this method. Dhindsa et al. [43] compared four classifiers, namely, LDA, NB, K-NN, and SVM, in classifying five classes of knee angle. Fifteen features including time domain features, frequency domain features, and autoregressive coefficients were used as input vectors. The results showed that the classification accuracy with the LDA, NB, K-NN, and SVM classifiers could reach 71.6%, 75.1%, 87.9%, and 92.2%, respectively. Pancholi et al. [33] classified seven hand motions including hand open, hand close, wrist flexion, wrist extension, soft gripping, medium gripping, and hard gripping by using LDA, K-NN, QDA, SVM, RT, and RF. Nine time domain features and seven frequency domain features were extracted as input vectors. The results showed that RF had the maximum classification accuracy (99.54%) and LDA had the minimum classification accuracy (75.38%). Bian et al. [11] utilized LDA, RF, NB, and SVM to classify eight hand motions, including twist a water bottle cap, turn a key, press an automatic pencil, press a nail clipper, perform “shoot” gesture, perform “rock” gesture, perform “ok” gesture, and perform “yeah” gesture. IEMG, SD, RMS, MPF, and MF were selected as the input features. Classification accuracies of 91.67% for LDA, 87.50% for RF, 86.83% for NB, and 92.25% for SVM were obtained in this study. Alomari et al. [12] compared LDA, QDA, and K-NN in classifying eight hand motions, namely, wrist flexion, wrist extension, ulnar deviation, radial deviation, grip, open hand, pinch, and catch cylindrical subject. Sample entropy, RMS, myopulse percentage rate (MYOP), and difference absolute standard deviation value (DASDV) were selected as features. The results showed that the classification accuracy with the LDA, QDA, and K-NN classifiers could reach 98.56%, 93.42%, and 94.25%, respectively.

As shown in Figure 4, the ANN based classification model can learn complex nonlinear patterns by adjusting a set of free parameters known as synaptic weights. A typical shallow ANN architecture consists of an input layer, a hidden layer, and an output layer. Each layer has a weight matrix, a bias vector, and an output vector. The number of neurons in the input layer is given by the number of features obtained from the above methods, and the number in the output layer is given by the number of motions to be classified. Oleinikov et al. [27] classified hand motions by using ANN. The input features include four time domain features (MAV, DMAV, ZC, and WL) and two frequency domain features for two samples. The hyperbolic tangent sigmoid transfer function was used for twenty-five hidden neurons and the SoftMax function for the output neurons. The results showed 82% offline classification accuracy for eight hand motions and 91% accuracy for six hand motions. Oweis et al. [44] adopted ANN to classify five motions including grasping, extension, flexion, ulnar deviation, and radial deviation. Seventeen time and time-series domain features were used as input neurons. The proposed ANN includes 30 neurons in the hidden layer and 5 neurons in the output layer. The results showed that the average classification accuracy could reach 96.7%. Mane et al. [35] utilized ANN to classify open palm, closed palm, and wrist extension of hand motion. Discrete wavelet transform was used for feature extraction. The ANN architecture considered in this study comprised two neurons in the input layer, ten neurons in the hidden layer, and three neurons in the output layer. An average recognition rate of 93.25% was observed with the proposed method. Two cascaded ANNs were exploited in the study of Gandolla et al. [30] to detect three hand grasp motions, namely, pinching, grasp an object, and grasping. The two ANNs have the same 1025 neurons, i.e., pattern vectors, in the input layer, 25 neurons in the hidden layer, and 2 neurons in the output layer. In the first ANN, the pattern vector was classified into clusters. In the second ANN, the clusters containing more than one task were then classified. The preliminary experiment results illustrated that the proposed method had 76% accuracy for hand motion intention. Ahsan et al. [29] designed an optimal ANN structure with seven neurons (MAV, RMS, VAR, SD, ZC, SSC, and WL) in the input layer, ten tan-sigmoid neurons in the hidden layer, and four linear neurons in the output layer. An average success rate of 88.4% was obtained for classifying single channel sEMG signals. Shen et al. [21] utilized a neural network ensemble of three back-propagation (BP) neural networks to recognize the phases of sit-to-stand motion. The sEMG characteristics from four muscles of the lower limbs and two floor reaction force (FRF) characteristics were used as input to the proposed networks. Each BP network has six neurons in the input layer and five neurons in the linear output layer. The tan-sigmoid hidden layers of the three BP networks had 12, 13, and 15 neurons, respectively. The preliminary experiment result showed that the recognition accuracy of the proposed method was about 93.48%.
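The shallow architecture described in this paragraph (one weight matrix and bias vector per layer, a tan-sigmoid hidden layer, and a SoftMax output) amounts to the forward pass sketched below. The weights here are arbitrary placeholders rather than trained values, the function name `forward` is hypothetical, and training by back-propagation is deliberately left out.

```python
import math


def forward(features, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a shallow ANN: tanh hidden layer, softmax output.

    w_hidden and w_out are lists of rows, one row of weights per neuron.
    Returns one probability per motion class.
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    logits = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Placeholder weights: 2 input features, 2 hidden neurons, 3 motion classes.
probabilities = forward(
    features=[0.5, -0.2],
    w_hidden=[[1.0, 0.0], [0.0, 1.0]],
    b_hidden=[0.0, 0.0],
    w_out=[[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
    b_out=[0.0, 0.0, 0.0],
)
print(probabilities)
```

The predicted motion is the class with the largest output probability.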

DL has been widely employed to classify human motions in recent years because it improves the nonlinearity of the model and the accuracy of recognition. The common methods for motion classification include convolutional neural network (CNN), recurrent neural network (RNN), and stacked auto-encoder (SAE). For the CNN, a typical architecture is shown in Figure 5, which consists of an input layer, convolutional layer, pooling layer, fully connected layer, and output layer. Park [14] employed a deep feature learning model based on convolutional neural network to classify six different hand motions including tip pinch grasp, prismatic four fingers grasp, power grasp, parallel extension grasp, lateral grasp, and opening a bottle with a tripod grasp. The proposed model was composed of one input layer, four convolutional layers, four pooling layers, and two fully connected layers. The results showed that the classification accuracy of this method could be up to 90%. Asai et al. [15] estimated four finger motions, namely, thumb open, thumb close, fingers except thumb open, and fingers except thumb close, based on the frequency conversion of sEMG using convolutional neural network. The proposed method contained two pairs of convolution-pooling layers and two fully connected layers. A preliminary experimental result illustrated that the accuracy of motion estimation reached 83%. For the RNN, Bu et al. [45] utilized a five-layer recurrent log-linearized Gaussian mixture network (R-LLGMN) to classify six motions including flexion, extension, pronation, supination, grasping, and opening. An average recognition accuracy of 88.4% was observed for this method. For the SAE, Orjuela et al. [46] employed an auto-encoder based deep ANN to classify five classes of wrist angles. Discrete wavelet transform was used to extract twelve features. The DNN architecture consisted of a sixty-neuron input layer, a five-neuron auto-encoder layer, a four-neuron hidden layer, and a five-neuron output layer. The results showed that the average classification accuracy was about 73.41% for five wrist positions.
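To make the convolution and pooling layers of a CNN concrete, the sketch below applies them to a one-dimensional signal. It is a simplified illustration only: the reviewed models may operate on two-dimensional inputs and learn their kernels during training, whereas the kernel here is a fixed, hypothetical difference filter. As in most deep learning frameworks, the "convolution" is implemented as a sliding dot product without flipping the kernel.

```python
def conv1d(signal, kernel):
    """Valid-mode sliding dot product of kernel over signal (no padding)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


# Toy signal and a fixed difference kernel, for illustration only.
feature_map = relu(conv1d([1.0, 2.0, 3.0, 4.0, 5.0], [-1.0, 0.0, 1.0]))
pooled = max_pool1d([1.0, 3.0, 2.0, 5.0, 4.0], size=2)
print(feature_map, pooled)
```

Stacking several such convolution-pooling pairs and ending with fully connected layers gives the general layout described for Figure 5.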

In general, the motion description in discrete-motion classification is relatively simple, and there is no uniform classification standard. In addition, the types of motion used for classification are predefined, so a motion type that has not been defined cannot be classified when it appears [20].

##### 3.2. Machine Learning Based Continuous-Motion Regression

Motion classification can only recognize a few discrete body motions and cannot be used for smooth control of wearable robots. Therefore, continuous-motion regression, which estimates more motion information than classification, will become the new focus. As with sEMG-driven musculoskeletal model based motion intention recognition, the mapping between sEMG and joint angle, angular velocity, angular acceleration, or joint moment can also be established by ML. The commonly used ML based continuous-motion regression methods include shallow ANN and DL, so these two regression methods are the main subject of this section. Table 2 reviews the most recent studies on continuous-motion regression.

###### 3.2.1. Mapping between sEMG and Joint Kinematics

For joint kinematics regression, the mapping between sEMG and joint angle is commonly established. The estimated angle is used as an input signal in the control system of wearable robots to achieve accurate angle trajectory tracking. The shallow ANN is currently more common than the deep ANN for sEMG based joint kinematics regression. The deep ANN is still in the development stage and is expected to be widely used in the future.

For upper limb motion estimation, Luh et al. [47] estimated the angle of the elbow joint by using BPNN. The first layer consisted of sixteen filtered sEMG feature nodes. The second (hidden) layer consisted of 240 nodes, and the third layer had one angle output node. The simulation results illustrated that the proposed method was capable of estimating the elbow angle with satisfactory accuracy. Chen et al. [48] adopted a hierarchical projection regression (HPR) for estimation of elbow angle using sEMG. The HPR projects the original data into a lower-dimensional feature space to achieve a locally refined mapping between sEMG and the human motion. An average regression error of 9.8 deg was observed in the preliminary experiment. Raj et al. [49] utilized multilayered perceptron neural network (MLPNN) and radial basis function neural network (RBFNN) to identify the human forearm kinematics. The IEMG and ZC features were extracted as the input signals. The results indicated that the RBFNN gave better identification, with an average CC of 0.76 and 0.39 for angle and angular velocity, respectively. Wang et al. [50] also utilized RBFNN to map the relationship between sEMG and elbow joint angle. The Gaussian function was used between the input layer and hidden layer. The experimental results showed that the RMSE and CC were around 0.043 and 0.905, respectively. Kwon et al. [51] estimated upper limb motion by using feed forward neural network (FFNN). The network inputs were the MAV of sEMG and angular velocities. The outputs were the estimated angles of the elbow and shoulder joints. Ngeo et al. [52] employed FFNN to build the nonlinear relationship between finger joint angles and sEMG signals. The proposed network consisted of an eight-node input layer, a hidden layer with tan-sigmoid activation function, and a fourteen-node linear output layer. The results showed that the correlation between predicted and actual finger joint angles was up to 0.92. Thirty neurons were used in the hidden layer. The results showed that the average NRMSE was around 8.5 deg. Xia et al. [13] implemented a recurrent convolutional neural network (RCNN), which combined the properties of RNN and CNN, to estimate the movement of upper limbs. As shown in Figure 6, the proposed RCNN architecture was composed of one input layer, three convolutional layers, two pooling layers, two long short-term memory (LSTM) layers, and one output layer. An average CC of 93% was obtained for the proposed RCNN method.

For lower limb motion estimation, Zhang et al. [53] employed BPNN to establish the mapping between sEMG and the joint angles of the ankle, knee, and hip. The proposed network consisted of a sixty-neuron input layer, a twenty-neuron hidden layer, and a three-neuron output layer. The results showed that the average error of different leg motions was less than 9 deg. Jiang et al. [5] developed an sEMG based real-time control method. The raw sEMG signal was processed and then input to a four-layer FFNN model to establish the mapping between sEMG and knee angle. In the proposed network, five sEMG signals were the neurons of the input layer and the knee joint angle was the neuron of the output layer. The first hidden layer had 23 neurons, and the second hidden layer had 13. The results of the preliminary experiment showed that the average value of CC was about 0.963. Anwar et al. [54] estimated the knee joint angle based on generalized regression neural network (GRNN). The experiment results showed that the MSE obtained by using GRNN with multiscale wavelet transform features was around 1.57. Mefoued [18] developed an RBFNN to map the nonlinearities between the sEMG signal and the desired knee angle. The RBFNN architecture considered in this study comprised two neurons in the input layer, five neurons in the hidden layer, and one neuron in the linear output layer. The nonlinear radial basis function was used as the activation function. The maximal RMS error of knee position estimation was equal to 1.34 deg.
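The regression studies above report accuracy mainly as RMSE, normalized RMSE (NRMSE), and correlation coefficient (CC) between the estimated and measured trajectories. A minimal sketch of these three metrics follows. Note the assumptions: CC is taken to be the Pearson correlation, and NRMSE is normalized by the range of the measured signal, which is only one of several conventions and may differ from what individual studies used. The sample trajectories are made up.

```python
import math


def rmse(actual, estimated):
    """Root mean squared error between two equally long sequences."""
    squared = sum((a - e) ** 2 for a, e in zip(actual, estimated))
    return math.sqrt(squared / len(actual))


def nrmse(actual, estimated):
    """RMSE divided by the range of the actual signal (assumed convention)."""
    return rmse(actual, estimated) / (max(actual) - min(actual))


def correlation(actual, estimated):
    """Pearson correlation coefficient between the two sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_e = sum(estimated) / n
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, estimated))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_a * var_e)


# Made-up joint angle trajectories (deg), for illustration only.
measured = [0.0, 1.0, 2.0, 3.0]
predicted = [0.0, 1.0, 2.0, 5.0]
print(rmse(measured, predicted), nrmse(measured, predicted))
print(correlation(measured, predicted))
```

Because the metrics and their normalizations differ between studies, the numbers quoted in this section are not always directly comparable.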

###### 3.2.2. Mapping between sEMG and Joint Kinetics

For joint kinetics regression, the mapping between sEMG and joint force or moment is commonly constructed. On the one hand, the estimated force or moment is used as an input signal in the control system of wearable robots to achieve accurate torque trajectory tracking. On the other hand, the estimated moment and the estimated angles from the previous section are used together as input signals to achieve accurate double closed-loop impedance control. Compared to joint kinematics regression, studies on kinetics regression are relatively rare.

For upper limb motion estimation, Ziai et al. [55] estimated the wrist joint torques using sEMG based ANN. The proposed network used FFBPNN with one 8-neuron input layer, two hidden layers, and one torque output layer. An average NRMSE of 2.8% was observed. Yokoyama et al. [8] utilized sEMG based ANN to predict the handgrip force. The proposed network consisted of one input layer, four hidden layers, and one output layer. The RMS features from four sEMG signals were used as the input layer, and the estimated handgrip force was used as the output layer. The four hidden layers had 64, 32, 16, and 8 neurons, respectively. The experimental results showed that the average CC was 0.84 between the predicted and observed forces. Naeem et al. [11] estimated human arm muscle force by implementing a BPNN. The proposed network utilized the rectified, smoothed sEMG as input to generate the estimated muscle force as output. The results illustrated that the CC between the proposed model and the Hill-type model can exceed 0.99.

For lower limb motion estimation, Pena et al. [19] proposed a multilayer perceptron neural network to map the sEMG signals to the knee torque and stiffness. The input signals were the sEMG signals, knee angle, and angular velocities, and the output signals were the estimated knee torque and stiffness. A second-order sliding mode control was developed to control the assistive device by using the desired knee angle. Chandrapal et al. [56] established a mapping between five sEMG signals and knee torque by implementing ANN. There are three neurons in the hidden layer of the multilayer perceptron (MLP) and three neurons in the fully connected cascade (FCC) network. The results showed that the mean lowest estimation error was 10.46% for the proposed methods. Ardestani et al. [57] developed a generic multidimensional wavelet neural network (WNN) to predict the moment of human lower extremity joints. A total of ten inputs, including eight sEMG signals and two GRF components, were used for the WNN, and three joint moments of the lower extremity were the outputs. The results showed that the proposed WNN can estimate joint moments to a high level of accuracy, with NRMSE less than 10% and CC more than 0.94. Khoshdel et al. [4] developed an optimized ANN (one input layer, two hidden layers, and one output layer) for knee force estimation. The input layer consisted of four preprocessed sEMG signals, and the output layer consisted of the estimated force. A total error of 3.45 was obtained for the proposed optimized ANN.

#### 4. Conclusions

In this study, recent advances in sEMG based motion intention recognition were discussed for two methods: the sEMG-driven musculoskeletal model and the machine learning based model. For the sEMG-driven musculoskeletal model, fundamental modelling theory and the performance of models from different studies have been analyzed. For the machine learning based model, feature extraction and classification model construction for discrete-motion classification, and the mapping between sEMG and joint kinematics/kinetics for continuous-motion regression, have been discussed. Additionally, the advantages and disadvantages of the existing motion intention recognition methods have been discussed according to their different purposes in application.

It is hard to find an sEMG based recognition method that can estimate all human motion intentions completely and thoroughly. Because of the lack of day-to-day repeatability and the long training procedure, the existing sEMG based motion intention recognition methods are still at the laboratory stage, and few of them have reached the market. Deep learning based methods have an important positive impact on recognition accuracy and will become the trend of future development. In general, the proposed methods are only applicable to specific users and movement patterns. Improving the robustness and practicability of recognition methods is very important, and developing more precise and real-time human motion intention recognition methods will remain a crucial challenge in the future.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.