Abstract

Recent years have witnessed the rapid development of microelectromechanical systems, and human motion tracking technology based on the IMU (inertial measurement unit) has attracted much attention. However, the magnetic field varies with time and position, which makes it necessary to calibrate the sensors before tracking. To address the poor adaptability of IMUs to their environment and improve the accuracy of the estimated traces, this paper presents an ENN-based (Elman neural network) method to track human arm motions, which consists of two steps. First, the data derived from the IMUs are preprocessed to obtain rough Euler angles; then, an ENN is trained to estimate the motions. We use the initially estimated position to calibrate the acceleration measurements that serve as the input of the ENN. Real-world experiments of arm motion tracking are carried out with the ground truth from an optical motion tracking system. The experimental results show that the mean tracking errors are around 35 mm, with a strong ability to eliminate the effect of extreme measurement and environment noises, while avoiding calibration of the magnetometer. Applying the well-trained model to independent motions indicates that the proposed method is robust, and compared with 4 traditional methods, the errors are reduced by 37.2% on one axis and are similar on another. This method suits situations where trajectory tracking of standardized motions is required, such as medical rehabilitation.

1. Introduction

Human motion tracking is the procedure by which the trace of human movements is detected quantitatively and qualitatively via on-body sensors [1]. Nowadays, this technology is applicable in a wide range of fields including medical health [2], virtual reality [3], and sports biomechanics [4].

Current motion tracking methods include marker-based optical tracking, exoskeleton-based mechanical tracking, and IMU-based (inertial measurement unit) tracking. The optical tracking system has good accuracy, but it requires multiple fixed high-quality cameras and, thus, is restricted to a relatively small indoor space [5]. Mechanical tracking faces the problem of reducing the error between the mechanical rotational axis and the human joint [6]. IMU sensors consist of an accelerometer, a gyroscope, and a magnetometer, which can measure the orientation of the rigid body they are attached to, making it possible to track human motion [7]. However, this method also suffers from limitations such as long-term drift, magnetic interference, and inconsistency [8].

Thanks to the rapid development of microelectromechanical systems (MEMS), IMU-based methods have received much attention for their portability and low cost [9]. In recent years, various studies have been conducted on IMU-based methods, especially in the data fusion field [10]. Zhu and Zhou designed a real-time motion tracking system based on a Kalman filter using IMUs [11]. Yun et al. developed a quaternion-based extended Kalman filter to obtain the optimal orientations [12]. Fourati et al. presented a complementary observer to calculate the attitude information based on quaternions [13]. Atrsaei et al. introduced a velocity constraint to the IMU to track fast motion [14]. Chen et al. improved the real-time tracking strategy by combining displacement and movement angle using the complementary and Kalman filters [15]. All these works aim to obtain the optimal estimation of quaternions through traditional filtering technology. However, they do not resolve the problems of calibration and magnetic field distortion caused by different environments.

Another problem of the IMU-based method is the alignment of multiple sensors. Zimmermann et al. developed an LSTM model to align the IMU to the segment to obtain biomechanical joint angles [16]. Chen et al. designed a novel online IMU-based human gait estimation framework, which introduces kinematic chain constraints between multiple segments, achieving adaptive alignment and drift rejection [17]. These works contribute to alignment, but a complicated modeling process is unavoidable.

Nowadays, due to the advances in artificial intelligence, more researchers focus on how to improve IMU measurements based on the ANN (artificial neural network) [18]. Zhang et al. [19] proposed a DNN model to process the IMU data, integrating the DNN-estimated value and the numerical value to gain a more reliable pose. To get a more precise pose, Brossard et al. [20] applied a convolutional neural network to regress the gyro corrections. This work aimed at denoising the gyroscopes and achieved good precision compared with other methods. Though it was not designed for human motion tracking, it shows the possibility of applying neural networks to this area. Compared with complex deep learning models, the ENN (Elman neural network) has won the favor of researchers due to its structure and its successful applications to nonlinear problems [21]. Kolanowski et al. presented an ENN-based navigation system to estimate the attitude of the rigid body to which the IMU is attached [22]. Guo et al. proposed an attitude calculation algorithm aided by an ENN to overcome the IMU’s poor adaptability to environments [23]. Chong et al. proposed a genetic Elman neural network to improve the temperature drift modeling precision of the gyroscope [24]. All these works show the possibility of estimating the human motion trace by processing the IMU data with a neural network.

To eliminate extreme measurement noises and avoid the influence of the environment, this paper proposes an ENN-based method for human arm motion tracking using three attached IMUs. The high-end optical motion tracking system, Opti Track, provides the ground truth, and the ENN is trained to estimate the arm traces. Real-world experiments of arm motion tracking are then carried out to verify the effectiveness of the proposed method. The results show that the accuracy and robustness of the method are both acceptable.

The rest of this paper is organized as follows. Section 2 provides detailed information about the proposed method. Section 3 reports the environment, process, and results of the experiments. Section 4 discusses the results and possible error sources. Finally, Section 5 presents the conclusions and future work.

2. Methods

Generally, a human arm can be modeled as three joints connecting two rigid bodies, as shown in Figure 1. The arm can be simplified as two consecutive links, and three IMUs are attached to the three joints (the wrist, elbow, and shoulder) to describe the motion trace of the arm. Each IMU is described in its own frame, which is defined by an orientation together with three position coordinates in the corresponding coordinate system.

Based on the human arm model setup, an ENN-based model is presented to estimate the trace of the arm. Figure 2 depicts the procedure of our proposed method, which contains two steps.

Step 1. Data preprocessing: the collected data are first segmented and then preprocessed. The acceleration and angular velocity are applied to calculate the attitude information, which is then aligned in the same frame.

Step 2. Training/estimation: the preprocessed data are set as the input of the ENN, while the ground truth coordinates are collected by the optical tracking system. The output coordinates describe the arm trace. In this step, we introduce feedback to help optimize the body acceleration.

Once the IMU data have been collected and preprocessed, the well-trained ENN model is called to compute the coordinates of the arm, and a smoother is then applied to obtain the final trace of the arm.

2.1. Data Preprocessing

The IMU provides the acceleration from the accelerometer, the earth's magnetic field from the magnetometer, and the angular velocity from the gyroscope [25]. The acceleration can be decomposed into three components as

$$\mathbf{a} = \mathbf{g} + \mathbf{a}_b + \mathbf{n} \tag{2}$$

where $\mathbf{g}$ is the gravitational acceleration, $\mathbf{a}_b$ denotes the body acceleration (the acceleration generated by the person's movements), and $\mathbf{n}$ represents the measurement noise, which is normally assumed to follow a Gaussian distribution. Among the three components, $\mathbf{g}$ can be considered a constant vector for any object; thus, it is feasible to extract $\mathbf{g}$ to estimate the orientation of the object. With the assumption that the noise can be neglected, we design a low-pass filter for the acceleration signal to extract the gravity component [26].
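
The gravity extraction described above can be sketched with a first-order low-pass filter. This is only an illustrative sketch: the filter order, the smoothing factor `alpha`, and the function name `split_acceleration` are assumptions, since the filter design is not specified here.

```python
import numpy as np


def split_acceleration(acc, alpha=0.9):
    """Split accelerometer samples into gravity and body acceleration.

    acc   : (n, 3) array of raw accelerometer samples
    alpha : smoothing factor in [0, 1); a hypothetical value, larger
            values give a smoother (slower) gravity estimate
    """
    acc = np.asarray(acc, dtype=float)
    gravity = np.empty_like(acc)
    gravity[0] = acc[0]  # initialise the filter with the first sample
    for k in range(1, len(acc)):
        # Exponential smoothing keeps the slowly varying gravity component
        gravity[k] = alpha * gravity[k - 1] + (1.0 - alpha) * acc[k]
    body = acc - gravity  # what remains is body acceleration plus noise
    return gravity, body
```

For a sensor at rest, the filtered output equals the measured acceleration and the body component is zero, which is a quick sanity check for the filter.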

In practice, the magnetic field measured by the magnetometer is affected by ferrous materials in the environment, so it is necessary to calibrate the magnetometer before estimating the arm movements [27]. However, this becomes inconvenient or even troublesome when the number of sensors increases. Considering that the acceleration and angular velocity offer enough information about arm movements, the magnetic field measurement is removed from the orientation estimation. Thus, the adaptability of the IMU to environments can be improved to some extent. However, the change of magnetic field strength with the movement is an important feature, so it is used to improve the ENN model, which will be detailed in the next section.

For arm motion tracking, it is particularly important to obtain the attitude information of the arm segment; hence, it is necessary to feed the attitude information into the network [28]. There are three common methods to calculate the attitude, namely, the Euler algorithm, the direction cosine method, and the quaternion method. The direction cosine algorithm is widely used in navigation coordinate systems; however, its complicated calculation restricts its application in motion tracking. The quaternion method offers fast computation and handles all kinds of attitude calculation, but it does not allow the attitude angles to be separated directly and is prone to instability once the measurement of one sensor is disturbed [29]. Though the Euler angles suffer from gimbal lock, they are easier to understand, decompose rotations efficiently into individual degrees of freedom, and require less computational effort [30].

This paper applies the Euler algorithm to calculate the attitude. By solving Equation (3), we can get the roll and pitch, and by solving Equation (4), we can get the roll, pitch, and yaw. By fusing the data from the accelerometer and gyroscope, we can get the final attitude as a weighted combination of the angle calculated from the accelerometer data and the angle calculated from the gyroscope data, where the scale factor that balances the two is 0.4 in our example.
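
A weighted fusion of this kind can be sketched as a simple complementary filter. Several points in the sketch are assumptions rather than details given here: the scale factor `k` is applied to the accelerometer angle, the roll and pitch formulas follow one common axis convention, the gyroscope rates are integrated directly as Euler angle rates (a small-angle simplification), and the yaw comes from the gyroscope alone. The function names are hypothetical.

```python
import numpy as np


def accel_roll_pitch(g):
    """Roll and pitch from a gravity vector (one common convention)."""
    gx, gy, gz = g
    roll = np.arctan2(gy, gz)
    pitch = np.arctan2(-gx, np.hypot(gy, gz))
    return roll, pitch


def fuse_attitude(gravity, gyro, dt, k=0.4):
    """Fuse accelerometer and gyroscope angles with a scale factor k.

    gravity : (n, 3) gravity estimates from the accelerometer
    gyro    : (n, 3) angular rates in rad/s (roll, pitch, yaw order assumed)
    dt      : sampling period in seconds
    k       : weight of the accelerometer angle (assumed assignment)
    """
    gravity = np.asarray(gravity, dtype=float)
    gyro = np.asarray(gyro, dtype=float)
    angles = np.zeros((len(gravity), 3))
    angles[0, :2] = accel_roll_pitch(gravity[0])
    for i in range(1, len(gravity)):
        # Gyroscope estimate: previous angles advanced by the angular rate
        gyro_angles = angles[i - 1] + gyro[i] * dt
        acc_roll, acc_pitch = accel_roll_pitch(gravity[i])
        angles[i, 0] = k * acc_roll + (1.0 - k) * gyro_angles[0]
        angles[i, 1] = k * acc_pitch + (1.0 - k) * gyro_angles[1]
        angles[i, 2] = gyro_angles[2]  # yaw: gyroscope integration only
    return angles
```

The accelerometer term bounds the drift of the integrated gyroscope angles, while the gyroscope term smooths the noisier accelerometer angles.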

2.2. Elman Neural Network

Traditionally, a series of computations is needed to figure out the final moving trace; here, we apply an ANN to help calculate the coordinates. The coordinates calculated by the Opti Track system are set as the ground truth data. To find the mapping between the IMU data and the coordinates, we develop an ENN for each IMU.

The ENN is one kind of recurrent neural network (RNN), and its structure is depicted in Figure 3. Besides the input layer, it is composed of three layers, namely, the hidden layer, output layer, and context layer [31]. Compared with other ANNs, the ENN has the unique advantage that the context nodes memorize the values of the previous hidden nodes, which makes it applicable to dynamic system identification and prediction [32].

The Elman network can be described by the following equations:
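
As a reference point, a commonly used formulation of the Elman recurrence can be sketched in NumPy: the context layer holds a copy of the previous hidden state, the hidden layer combines the current input with that context, and the output layer is linear. The layer sizes, the `tanh` activation, the weight initialisation, and the class name `ElmanNetwork` are illustrative assumptions and are not taken from the model used in this paper.

```python
import numpy as np


class ElmanNetwork:
    """Minimal Elman network (forward pass only, untrained weights)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(0.0, 0.1, (n_hidden, n_in))       # input -> hidden
        self.W_ctx = rng.normal(0.0, 0.1, (n_hidden, n_hidden))  # context -> hidden
        self.W_out = rng.normal(0.0, 0.1, (n_out, n_hidden))     # hidden -> output
        self.b_h = np.zeros(n_hidden)
        self.b_y = np.zeros(n_out)

    def forward(self, inputs):
        """Run a sequence of shape (n_steps, n_in) through the network."""
        context = np.zeros(self.W_ctx.shape[0])  # context starts empty
        outputs = []
        for u in np.asarray(inputs, dtype=float):
            hidden = np.tanh(self.W_in @ u + self.W_ctx @ context + self.b_h)
            outputs.append(self.W_out @ hidden + self.b_y)
            context = hidden  # context memorizes the hidden state
        return np.array(outputs)
```

Because the context is carried from one sample to the next, the same input can produce different outputs at different time steps, which is what makes the network suitable for sequential IMU data.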

During the training process, we pay more attention to the drastically changing axis. Although the accuracy of the magnetic field measurement is severely affected by the environment, its relative changes across different actions are similar. Therefore, we use the magnetic field measurement to improve the traditional MSE (mean square error) loss function. We redesign the loss function as a weighted combination of the MSEs of the errors on the three axes, where the three error weights are calculated according to the change of the magnetic field strength. Equation (10) shows the calculation of one of the weights as an example; it relies on the standard deviation of the magnetic field measurements.
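
One way such a weighted loss could look is sketched below. The exact weight formula is an assumption: here each weight is the standard deviation of the magnetic field on one axis divided by the sum over the three axes, so that the axis with the largest change receives the largest weight. The function names are hypothetical.

```python
import numpy as np


def axis_weights(mag):
    """Per-axis error weights from magnetometer samples of shape (n, 3).

    Assumed form: sigma_axis / (sigma_x + sigma_y + sigma_z).
    The input must vary on at least one axis, otherwise the sum is zero.
    """
    sigma = np.std(np.asarray(mag, dtype=float), axis=0)
    return sigma / sigma.sum()


def weighted_mse(pred, target, weights):
    """Weighted sum of the per-axis mean square errors."""
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    mse_per_axis = np.mean(diff**2, axis=0)
    return float(np.dot(weights, mse_per_axis))
```

With equal weights of 1/3, this reduces to the ordinary MSE averaged over the three axes.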

We introduce feedback to calibrate the body acceleration input. The predicted coordinate is used to compute the body acceleration, which helps correct the input. We take one axis as an example. First, the acceleration at a given sample on that axis in the Opti frame can be calculated from the predicted coordinates.

Since the IMU and Opti frames are aligned, the estimated body acceleration can be obtained by rotating this acceleration with the corresponding rotation matrix.

Then, the weighted average of the measured and estimated body accelerations is regarded as the corrected body acceleration input. The input and output matrices are then assembled, where the input components are calculated as in Section 2.1, while the target components are provided by the Opti Track system. After the estimation, we apply the five-dot-cubic algorithm [33] to smooth the coordinates.
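
The feedback and smoothing steps can be sketched as follows. The central second difference, the default averaging weight of 0.5, and the function names are assumptions made for illustration. The smoothing function uses the standard five-point cubic least-squares coefficients, which leave any cubic polynomial unchanged.

```python
import numpy as np


def second_difference_acceleration(pos, dt):
    """Approximate acceleration from positions with a central second difference."""
    pos = np.asarray(pos, dtype=float)
    acc = np.zeros_like(pos)
    acc[1:-1] = (pos[2:] - 2.0 * pos[1:-1] + pos[:-2]) / dt**2
    acc[0], acc[-1] = acc[1], acc[-2]  # copy the neighbours at both ends
    return acc


def corrected_body_acceleration(measured, predicted_pos, rotation, dt, weight=0.5):
    """Weighted average of measured and position-derived body acceleration.

    rotation : (3, 3) matrix mapping the optical frame to the IMU frame
    weight   : share given to the measured acceleration (hypothetical value)
    """
    acc_opti = second_difference_acceleration(predicted_pos, dt)
    acc_est = acc_opti @ np.asarray(rotation, dtype=float).T  # rotate each row
    return weight * np.asarray(measured, dtype=float) + (1.0 - weight) * acc_est


def five_point_cubic_smooth(y):
    """Five-point cubic smoothing of a sequence (along the first axis)."""
    y = np.asarray(y, dtype=float)
    if len(y) < 5:
        return y.copy()  # too short to smooth
    out = np.empty_like(y)
    out[0] = (69 * y[0] + 4 * y[1] - 6 * y[2] + 4 * y[3] - y[4]) / 70
    out[1] = (2 * y[0] + 27 * y[1] + 12 * y[2] - 8 * y[3] + 2 * y[4]) / 35
    out[2:-2] = (-3 * y[:-4] + 12 * y[1:-3] + 17 * y[2:-2]
                 + 12 * y[3:-1] - 3 * y[4:]) / 35
    out[-2] = (2 * y[-5] - 8 * y[-4] + 12 * y[-3] + 27 * y[-2] + 2 * y[-1]) / 35
    out[-1] = (-y[-5] + 4 * y[-4] - 6 * y[-3] + 4 * y[-2] + 69 * y[-1]) / 70
    return out
```

In a pipeline of this kind, the smoothing would be applied to each coordinate sequence produced by the network, and it can be repeated if a stronger effect is needed.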

3. Experiments and Results

3.1. Experimental Setup

To verify the effectiveness of the proposed algorithm, an experiment was carried out. The ground truth was obtained from the Opti Track Motive system, which has millimeter-level accuracy, while tracking a subject equipped with 4 IMUs. Three of them were attached to the left arm of the subject (on the wrist, elbow, and shoulder), together with three corresponding markers for the optical system to track. The fourth IMU was fixed on the chest of the subject as a reference. Of the three coordinate axes, one points forward, one points to the right side, and one is perpendicular to the ground. Figure 4 shows how the subject wears the IMUs and the markers.

After the experimental environment was set up, the subject was asked to perform several movements at a relatively slow speed, including forward-smooth-lift (FSL), lateral-smooth-lift (LSL), forearm-supination (FS), and elbow-smooth-lift (ESL). Each movement starts and ends with the N-pose gesture (standing still on the ground with the arms hanging vertically alongside the trunk) and lasts for at least 10 seconds. Figure 5 shows how the movements are organized.

The captured data include the marker positions in the optical coordinate system and the IMU signals in the respective sensor reference systems. The Opti Track data were sampled at 120 Hz, while the IMU (HI221, hipnuc) data were sampled at 35 Hz. We resampled the Opti data so that its frequency is exactly the same as that of the IMU data. The IMU and Opti data were captured by different terminals, but they were manually synchronized using the N-pose gesture at the beginning and end of each sequence. This synchronization method may result in a misalignment in time, but it is acceptable because the misalignment is quite short.
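
The rate matching can be sketched with linear interpolation onto a 35 Hz time grid. This is an assumption about the resampling scheme, which is not specified here. Linear interpolation applies no anti-aliasing filter, which seems tolerable only because the movements are slow. The function name is hypothetical.

```python
import numpy as np


def resample_to_rate(t_src, values, rate_hz):
    """Linearly interpolate (n, d) samples onto a uniform grid at rate_hz.

    t_src   : (n,) increasing timestamps in seconds
    values  : (n, d) samples, e.g. marker coordinates
    rate_hz : target sampling rate, e.g. 35.0 for the IMU
    """
    t_src = np.asarray(t_src, dtype=float)
    values = np.asarray(values, dtype=float)
    # New time grid that stays inside the original recording
    t_new = np.arange(t_src[0], t_src[-1] + 1e-9, 1.0 / rate_hz)
    columns = [np.interp(t_new, t_src, values[:, j]) for j in range(values.shape[1])]
    return t_new, np.column_stack(columns)
```

For example, two seconds of 120 Hz marker data (240 samples) become 70 samples on the 35 Hz grid.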

3.2. Performance Index

To assess the performance of our method, we define some indices to evaluate the accuracy and robustness of the model. Given two variables with the same number of samples, one describing the estimated position on one axis and the other describing the ground truth position on the same axis, the following indices can be used for assessment:

Mean error:

Maximum error:

Correlation coefficient:
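
The three indices can be computed as in the sketch below. It assumes that the mean and maximum errors are taken over absolute differences and that the correlation coefficient is the Pearson coefficient; both are common choices but are assumptions here, and the function names are hypothetical.

```python
import numpy as np


def mean_error(est, truth):
    """Mean absolute difference between estimate and ground truth."""
    return float(np.mean(np.abs(np.asarray(est) - np.asarray(truth))))


def max_error(est, truth):
    """Largest absolute difference between estimate and ground truth."""
    return float(np.max(np.abs(np.asarray(est) - np.asarray(truth))))


def correlation(est, truth):
    """Pearson correlation coefficient between the two sequences."""
    return float(np.corrcoef(est, truth)[0, 1])
```

A correlation close to 1 indicates that the estimated trace follows the shape of the ground truth even when a constant offset inflates the mean error.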

3.3. Results

The aligned IMU and Opti data are then fed into the Elman NN to train the model, where the data are separated into a training set (70%) and a test set (30%). Then, we apply the trained model to estimate a new independent motion to assess the generalization ability of the model. We have compared the accuracy and the robustness in several respects, and the results are listed as follows. Here, we pay more attention to two of the axes, because the arm motions in the experiments involve little movement along the remaining axis, where the variation can be regarded as random error.

First, the data captured in one motion are used to verify the proposed algorithm. Figure 6 depicts the error between the estimated coordinate and the ground truth data on one axis of the wrist in the four motions, which shows that the proposed method can recover the trace of the arm relatively accurately. Table 1 reports all the performance indices of the different parts of the arm in the 4 motions on the test sets.

Then, to evaluate the robustness of the method, the well-trained model is applied to estimate another four independent motions. Figure 7 is a boxplot depicting the errors on one axis of the wrist in the four motions, and Table 2 reports the corresponding performance indices on the wrist. Finally, to further verify the effectiveness of the proposed method, we compare it with four classical methods, namely, the Zhu model [11], Yun model [12], Young model [34], and Bleser model [35], based on the dataset of [8]. Figure 8 compares the errors of the selected methods on one axis for one motion (elbow flexion/extension), and Table 3 reports the errors of the five methods on the three axes.

4. Discussion

This section will discuss the performance of the proposed method on tracking the arm based on the results in Section 3 from two aspects, accuracy and robustness. Then, some possible error sources of this work will be mentioned, which can be a guide for our future work.

4.1. Accuracy

The first aspect taken into consideration is the accuracy, which is reflected by the mean and maximum errors on the three axes. Generally, the smaller the errors are, the more accurate the model is. The analysis of the errors in Figure 6 and Table 1 suggests that the accuracy of the method is acceptable. On the one hand, the 12 well-trained models (each motion has one model for each part) all perform well. The mean error of each model is around 30 mm, and the maximum errors are around 50 mm (very few exceed 100 mm). On the other hand, the errors on the two axes of interest are similar to each other (each has its advantages in different motions). Overall, the error of the action is acceptable. The reconstructed IMU motion trajectory has a high correlation with the Opti system, and only 4 correlation values are lower than 0.85. This shows that the reconstructed motion is highly consistent with the actual motion, so the characteristics of the original action can be recovered from the reconstructed motion. From this perspective, the accuracy of our proposed method is good.

A comparison between the proposed method and other traditional methods is also conducted. Based on the open-source dataset provided in the literature [8], we compared the errors of these methods on the three axes. Figure 8 shows the errors on one axis, and Table 3 reports the errors on the three axes in detail. We find that the proposed method performs best on two of the axes, reducing the mean errors by about 37.2%, and on the remaining axis, the error is also acceptable.

4.2. Robustness

The next aspect that plays an important role is robustness, which is reflected by the performance of the well-trained models on other independent estimations. Generally, robustness refers to the ability of the model to tolerate perturbations. We have tested four other independent actions for each motion to verify the robustness of the proposed method.

Figure 7 presents the distribution of the errors on one axis in the new actions. The red symbol, “+”, represents the outliers (values lying more than 1.5 times the interquartile range beyond the quartiles). We find that the mean errors of the new actions are similar to those of the test set, while the gross errors seem to have increased, and Table 2 further supports this point. The maximum error of one of the motions has increased by about 40 mm, which suggests that the robustness of this method is weak in this respect. Nevertheless, the correlation values are consistent with those in Table 1, showing that the estimated trace can reconstruct the human motion. Therefore, it can be concluded that the proposed method predicts the trajectory of the same motion well, regardless of whether the actions are continuous. Although the maximum errors and outliers become larger, which can be reduced by introducing the kinematic chain in the future, the consistency of the actions has not decreased. Overall, the robustness of the model is acceptable.
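
The outlier rule used in a standard boxplot can be sketched as follows: a value is flagged when it lies more than 1.5 times the interquartile range below the first quartile or above the third quartile. The function name and the sample values are hypothetical.

```python
import numpy as np


def boxplot_outliers(errors):
    """Return the values a standard boxplot would mark as outliers."""
    errors = np.asarray(errors, dtype=float)
    q1, q3 = np.percentile(errors, [25, 75])
    iqr = q3 - q1  # interquartile range
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return errors[(errors < lower) | (errors > upper)]
```

For instance, in the error sequence 1, 2, 3, 4, 5, 100 (in mm), only the value 100 falls outside the fences and would be drawn as a “+”.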

4.3. The Error Sources

Though the accuracy and robustness are acceptable overall, there are still some unpredictable errors (like the maximum errors in Tables 1 and 2), which may be caused by the following two aspects:

4.3.1. Experiment

In Table 2, the mean and maximum errors are larger than those in Table 1, while the correlation coefficient is similar; thus, these errors may be caused by the independent experiments. This is because the position where the experimenter stands differs somewhat between the two experiments, which may cause a relatively constant error on the affected axes. However, this does not have a serious impact on the reconstruction of the arm trace, because the reconstruction of the arm movements is still clear and the correlations remain high.

4.3.2. Data

The maximum errors shown in Section 3 cannot be ignored and possibly result from the data collected in our experiments. On the one hand, there are some missing values in the ground truth data (Opti data) caused by unavoidable occlusion of some markers. We filled the missing values by interpolation and resampled the data to the same frequency as the IMU data, which may introduce some outliers with errors over 100 mm. On the other hand, the IMUs were working continuously during the entire experiment, which may result in more noise in the last several motions.
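
Gap filling of this kind can be sketched with linear interpolation over time. The interpolation scheme is not specified here, so the linear form, the encoding of missing samples as NaN, and the function name are assumptions.

```python
import numpy as np


def fill_missing(t, values):
    """Fill NaN gaps in one coordinate sequence by linear interpolation.

    t      : (n,) increasing timestamps in seconds
    values : (n,) coordinate samples, NaN where the marker was occluded
    """
    t = np.asarray(t, dtype=float)
    filled = np.asarray(values, dtype=float).copy()
    missing = np.isnan(filled)
    # Interpolate the gaps from the valid samples on either side
    filled[missing] = np.interp(t[missing], t[~missing], filled[~missing])
    return filled
```

Long occlusions are replaced by straight segments under this scheme, which is one way large deviations from the true trajectory can arise.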

5. Conclusion

This paper proposes an arm motion tracking method based on wearable inertial sensors, using the ENN. This method effectively avoids the poor adaptability to the environment that affects traditional inertia-based solving methods. In terms of model training, the magnetometer information sensed by the IMU is used to train the model, and the acceleration and angular velocity are applied to calculate the attitude angles, which are set as the ENN input vector. Feedback is designed to calibrate the body acceleration so that more accurate results can be derived. Finally, the five-dot-cubic algorithm smooths the estimated trace. Experiments verify the effectiveness of the proposed method in both accuracy and robustness. In addition, this article also uses open-source data to compare the method with other traditional estimators to further verify the reliability of the ENN-based method. In practical applications, this method suits situations where fixed motions require assessment, including rehabilitation and fitness exercises. Future work will focus on reducing accumulative errors by introducing the kinematic chain and on reducing the number of trained models through motion classification and reconstruction.

Data Availability

Data are available from the authors by email upon request.

Conflicts of Interest

The authors declare that they have no conflict of interest regarding the publication of this paper.

Acknowledgments

This research was funded by the projects of the Aeronautical Science Foundation of China (Grant Number: 20185869009), Science and Technology Project of Jiangsu Market Supervision Administration (Grant Number: KJ196013), “333 Project” scientific research project of Jiangsu Province in 2020 (Grant Number: BRA2020253), National Natural Science Foundation of China (NSFC 61903081), and Zhishan Youth Scholar Program of Southeast University (Grant Number: 2242021R41135).