Abstract

Plantar pressure data record information about the movement of the human body, and their analysis can be used to judge whether human motion function is normal. A two-meter footscan® system was used to collect plantar pressure data, from which kinetic and dynamic gait characteristics were extracted. Based on the different descriptions of the gait characteristics, a set of models was established for different groups of people to represent the movement of the lower limbs. Using machine learning, the FCM clustering algorithm clusters the sample set to create the model set, and the SVM algorithm then identifies new samples, which completes the identification of normal and abnormal motion function. The multimodel method presented in this paper was applied to the analysis of anterior cruciate ligament deficiency. The method proved effective and can provide auxiliary analysis for clinical diagnosis.

1. Introduction

The foot is a very important part of the human body. Its basic functions are to support the body, absorb impact force, produce forward thrust, and help regulate and maintain balance. Under the action of gravity, the foot receives a vertical reaction force from the ground when we stand or walk. When the structure or the state of the human body changes, the plantar pressure distribution changes accordingly. Research based on static or dynamic plantar pressure data can reveal how the plantar pressure distribution of patients differs from that of normal controls, which helps to find the causes of a condition and its evolution.

The study of plantar pressure measurement was started by Beely in 1882, and systematic gait analysis and extensive clinical research began in the 1950s. Eisenhardt et al. [1] studied the effects of different heel heights on the foot bones of women by measuring the plantar pressure of 30 women aged 18 to 30 who wore high-heeled shoes. They also pointed out that normal human plantar pressure distribution follows a certain pattern and that foot deformity and dysfunction disrupt this normal distribution. Through the analysis of plantar pressure measurements, Minns and Craxford [2] identified differences in plantar pressure distribution between patients with rheumatoid arthritis and normal controls. Abboud et al. [3] analyzed and compared the plantar pressure of diabetic patients and healthy persons. Stokes et al. [4] pointed out that the peak pressure produced by the four toes on the medial side of the foot decreased markedly in patients with hallux valgus and that the hallux valgus angle was related to plantar pressure deviation. Stolwijk et al. [5] found that long-duration walking changes the walking pattern because of leg fatigue and increases the load on the heel. Stolwijk et al. [6] compared plantar pressure data from Malawi and the Netherlands, analyzing gait feature parameters such as the arch index (AI) and the trajectory of the center of pressure (COP), in order to explore why foot disease is less common in Malawi. Pataky and Maiwald [7] held that plantar pressure data contain high-quality biomechanical information; for this reason, they developed 3D interactive visualization tools to analyze the data and explore foot behavior more deeply. With the development of sensor technology and the spread of computer technology, the measurement and study of plantar pressure have become increasingly important in gait research.

However, plantar pressure data analysis is still far from mature. Current research focuses mainly on the effect of human behavior on plantar pressure data and on the relationship between plantar pressure and gait features [8, 9]. In addition, because the structure of the human body is complex, plantar pressure behaves in complex ways under different conditions, and large volumes of plantar pressure data place a heavy burden on medical workers. Introducing suitable machine learning algorithms can ease this situation considerably.

Different classification criteria generate different clustering analysis methods. With the development of fuzzy set theory, Ruspini [10] introduced the concept of fuzzy partition into cluster analysis at the end of the 1960s. Many results followed, such as clustering methods based on similarity relations and fuzzy relations [11] and fuzzy clustering methods based on evolutionary algorithms [12]. The most widely used approach is fuzzy clustering based on an objective function. It transforms the clustering problem into a mathematical optimization problem and has been widely adopted because of its simple structure [13]. Among the objective-function-based fuzzy clustering algorithms, the fuzzy $c$-means clustering algorithm (FCM), introduced by Dunn and generalized by Bezdek [14], is the most typical representative. The algorithm is derived by optimizing the objective function of hard clustering using the concept of fuzzy partition.

The support vector machine (SVM) algorithm, which is based on statistical learning, plays a key role in pattern recognition. It is a supervised machine learning algorithm developed in the 1990s and first proposed by Cortes and Vapnik [15] in 1995. Based on statistical learning theory, SVM applies structural risk minimization to improve its generalization ability, so it can obtain good learning results even when sample information is limited.

Multimodel adaptive control [16, 17] has been a major method in the control field over the past 20 years. The range of the model parameters is divided into multiple subsections, and a model of the system is set up for each subsection. The overall model of the system is then obtained by combining the models of the different subsections in different ways.

With the advent of new plantar pressure measurement devices, a large amount of plantar pressure data is being measured, but these data have not been adequately utilized. Currently, medical diagnostic work is done entirely by doctors relying on their own experience. In this paper, based on the FCM clustering algorithm and the SVM algorithm, the concept of multimodel is applied to the classification and identification of anterior cruciate ligament deficiency. Introducing machine learning algorithms into clinical diagnosis in this way can support both diagnosis and rehabilitation.

2. Multimodel Modeling of Plantar Pressure Characteristics

In this study, we used a plantar pressure measurement device called footscan to measure plantar pressure data. Anterior cruciate ligament deficiency was the main object of analysis. The subjects included both normal subjects and subjects with anterior cruciate ligament rupture. We first put all the data samples together for analysis, using the FCM algorithm for clustering and the SVM algorithm for identification. Then, with the aid of prior knowledge, the left and right plantar pressure data were analyzed separately in the same way.

2.1. Plantar Pressure Data Feature Modeling

As shown in Figure 1, the footscan plantar pressure measurement system is used to collect the plantar pressure data of the test subjects. This system can measure plantar pressure in static or dynamic motion. The total measurement area is 200 cm × 40 cm, and it contains 16384 pressure sensors. The plate is placed in the center of a 16-meter-long runway. During the test, testers walk barefoot across the plate according to their own walking habits. Testers must follow the "three-step" protocol; in other words, they walk alternating their left and right feet. To ensure the validity of the measurement data, each foot landing should begin with the first touchdown of the heel. The sampling frequency is set to 126 Hz.

Gait characteristics used in biomechanics and sports medicine always play important roles in the analysis of plantar pressure data, such as foot progression angle, AI, and the trajectory of COP. Among these, the trajectory of COP will play a crucial role in the analysis of plantar pressure, especially for the anterior cruciate ligament deficiency.

The trajectory of COP [5] is a parameter that describes the movement of the center of pressure under the foot over the whole stance phase. As described in [18], this parameter can be obtained by connecting the pressure centers of all pressure data frames into a line. An example of a COP trajectory can be seen in Figure 2.

2.2. Fuzzy $c$-Means Clustering Algorithm

The fuzzy $c$-means clustering (FCM) algorithm is a fuzzy clustering algorithm. Data samples are allowed to belong to two or more clusters in this algorithm, and the membership degree is used to determine which cluster a sample belongs to. By minimizing an objective function based on a certain norm and clustering prototype, the FCM algorithm attempts to divide a finite collection of elements into a collection of fuzzy clusters.

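$X$ is set as a series of samples:

$$X = \{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^{s},$$

where $n$ is the number of samples and $s$ is the dimension of the sample space. When $c$ ($2 \le c < n$) is set as the number of clusters, FCM can be described as below:

$$\min J_m(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} \, d_{ik}^{2},$$

where

$$d_{ik} = \lVert x_k - v_i \rVert, \qquad \sum_{i=1}^{c} u_{ik} = 1, \qquad u_{ik} \in [0, 1].$$

In this formula, $m > 1$ is the weighted index; $U$ is the fuzzy membership matrix:

$$U = [u_{ik}]_{c \times n};$$

$V$ is the matrix of clustering centers:

$$V = [v_1, v_2, \ldots, v_c].$$

In this algorithm, clustering centers can be calculated by the following formula:

$$v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m} x_k}{\sum_{k=1}^{n} u_{ik}^{m}}, \qquad i = 1, \ldots, c.$$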

The FCM algorithm can be realized by using the following steps.

Step 1. Choose the number of clusters. Initialize the fuzzy membership matrix. The membership values are randomly generated.

Step 2. Calculate the clustering centers using the formula above.

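Step 3. Update the membership matrix $U$. If

$$\lVert U^{(l+1)} - U^{(l)} \rVert < \varepsilon,$$

where $l$ is the iteration index and $\varepsilon$ is the minimum iteration error, then stop. Otherwise return to Step 2.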

The $k$-means clustering algorithm [19] is the best-known clustering algorithm, and FCM is similar to it. FCM also minimizes intracluster variance, and it shares the problems of $k$-means: it is sensitive to the initial choice of the fuzzy membership matrix, and the result may be a local rather than the global minimum.
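
The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name `fcm`, the defaults `m=2.0` and `eps=1e-5`, the fixed random seed, and the use of the largest absolute membership change as the stopping norm are all assumptions. The membership update uses the standard FCM rule $u_{ik} = 1 / \sum_{j=1}^{c} (d_{ik}/d_{jk})^{2/(m-1)}$.

```python
import random

def fcm(samples, c, m=2.0, max_iter=100, eps=1e-5, seed=0):
    """Fuzzy c-means on a list of equal-length feature lists.

    Returns (centers, u), where u[i][k] is the membership of sample k in
    cluster i. The defaults for m, eps and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    n, dim = len(samples), len(samples[0])

    # Step 1: random membership matrix whose columns sum to 1.
    u = [[rng.random() for _ in range(n)] for _ in range(c)]
    for k in range(n):
        col = sum(u[i][k] for i in range(c))
        for i in range(c):
            u[i][k] /= col

    centers = []
    for _ in range(max_iter):
        # Step 2: each center is the u^m-weighted mean of the samples.
        centers = []
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            centers.append([sum(w[k] * samples[k][j] for k in range(n)) / total
                            for j in range(dim)])

        # Step 3: update memberships from the distances to the new centers.
        new_u = [[0.0] * n for _ in range(c)]
        for k in range(n):
            # Clamp distances so a sample sitting on a center cannot divide by zero.
            d = [max(sum((a - b) ** 2 for a, b in zip(samples[k], centers[i])) ** 0.5,
                     1e-12) for i in range(c)]
            for i in range(c):
                new_u[i][k] = 1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0))
                                        for j in range(c))

        # Stop when the largest membership change falls below eps.
        change = max(abs(new_u[i][k] - u[i][k]) for i in range(c) for k in range(n))
        u = new_u
        if change < eps:
            break
    return centers, u
```

Because the initial membership matrix is random, different seeds can lead to different local minima, which is the sensitivity noted above; running the sketch with several seeds and keeping the result with the lowest objective value is one common way to reduce that risk.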

2.3. Multimodel Set Based on Plantar Pressure Data and FCM Algorithm

In this section, FCM algorithm is used to classify different plantar pressure data samples. According to the important parameter COP of gait characteristics, multiple models will be set up for different testers.

First of all, the plantar pressure data of the left and right sides differ. In addition, the data on each side include normal foot data and abnormal foot data. Therefore, there are four types of plantar pressure data models. As mentioned above, the plantar pressure data are collected by the footscan device, and each data sample is represented by a set of data tables sorted in chronological order. Each frame of pressure data is described by a data table whose rows and columns correspond to the sensor array on the footscan plate. The number of tables depends on the walking speed of the tester and the sampling frequency.

Firstly, the pressure center of the two-dimensional pressure data table can be calculated according to the following formula:

$$x_c = \frac{\sum_{i} x_i \, p_i}{\sum_{i} p_i}, \qquad y_c = \frac{\sum_{i} y_i \, p_i}{\sum_{i} p_i}.$$

In this formula, $x_i$ and $y_i$ represent the coordinate values of point $i$ in the two dimensions, with the lower left corner of the space occupied by the foot recorded as the origin of the coordinates, and $p_i$ indicates the pressure value of each point. The pressure centers of all frames are connected together to obtain the COP line.
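
As a minimal sketch of this computation, the pressure center of one frame is the pressure-weighted mean of the sensor coordinates. The function names and the data layout are assumptions: each frame is taken to be a list of rows already cropped to the foot region, with row 0 at the origin and coordinates expressed in sensor-index units.

```python
def center_of_pressure(frame):
    """Pressure-weighted centroid (x, y) of one 2D pressure frame.

    frame[r][c] is the pressure at row r (y index) and column c (x index).
    Returns None when the frame carries no load.
    """
    total = sum(sum(row) for row in frame)
    if total == 0:
        return None
    x = sum(c * p for row in frame for c, p in enumerate(row)) / total
    y = sum(r * p for r, row in enumerate(frame) for p in row) / total
    return x, y

def cop_line(frames):
    """COP trajectory: pressure centers of all loaded frames, in time order."""
    points = (center_of_pressure(frame) for frame in frames)
    return [p for p in points if p is not None]
```

Frames without load are skipped so that the trajectory covers only the stance phase.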

Because different people have different foot sizes, comparing the COP lines directly would give inaccurate results. Therefore, these plantar pressure data also need a coordinate mapping, given by the following formula:

$$x' = \frac{x_c}{w_x} \, W, \qquad y' = \frac{y_c}{w_y} \, H.$$

In this formula, $(x_c, y_c)$ is the pressure center before mapping; $x'$ and $y'$ are the pressure center coordinate values after coordinate mapping; $w_x$ and $w_y$ are the widths of the foot region in the two dimensions; and $W \times H$ is the size of the common rectangular coordinate system to which all the plantar pressure data are mapped (this size is derived from the sensor array model corresponding to the standard foot size).
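
The mapping can be sketched as a simple rescaling. The function and parameter names below are hypothetical, and the target size is left as a parameter because its value comes from the standard foot size and is not restated here; the sketch assumes each coordinate is scaled linearly by the width of the foot region in that dimension.

```python
def map_to_standard(points, foot_w, foot_h, target_w, target_h):
    """Rescale COP points from one foot's bounding box to a common rectangle.

    points: (x, y) pairs measured from the lower-left corner of the foot region.
    foot_w, foot_h: widths of that foot region in the two dimensions.
    target_w, target_h: size of the common coordinate system (placeholders).
    """
    if foot_w <= 0 or foot_h <= 0:
        raise ValueError("foot region widths must be positive")
    return [(x / foot_w * target_w, y / foot_h * target_h) for x, y in points]
```

After this step, COP lines from feet of different sizes occupy the same coordinate range and can be compared point by point.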

Because each COP line is composed of hundreds of pressure center points, if all the data is used for cluster analysis, it will lead to a big computing burden. In order to reduce the computational complexity, the extraction of the coordinate points should be conducted. In our study, 5 key points (10 characteristic values) will be extracted for characteristic description.

In order to guarantee the same influence of the $x$ and $y$ coordinates in the analysis process, the extracted features need to be normalized. The feature values are transformed into the range $[0, 1]$ by dividing each by the maximum value of the corresponding feature over all the samples.
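
The feature extraction and normalization can be sketched as follows. How the 5 key points are chosen is not specified in the text, so the even spacing by frame index used in `key_points` is purely an assumption, as are all function names; the normalization follows the divide-by-maximum rule described above and assumes nonnegative coordinates.

```python
def key_points(cop, count=5):
    """Pick `count` points evenly spaced by index along a COP line (assumed rule)."""
    if count < 2 or len(cop) < count:
        raise ValueError("need count >= 2 and at least `count` COP points")
    last = len(cop) - 1
    return [cop[round(i * last / (count - 1))] for i in range(count)]

def feature_vector(cop, count=5):
    """Flatten the key points into [x1, y1, ..., x5, y5] (10 values for count=5)."""
    return [value for point in key_points(cop, count) for value in point]

def normalize_by_max(features):
    """Divide every feature column by its maximum over all samples."""
    maxima = [max(column) for column in zip(*features)]
    return [[value / mx if mx else 0.0 for value, mx in zip(row, maxima)]
            for row in features]
```

Each row returned by `normalize_by_max` is one sample's 10-dimensional feature sequence, ready to be clustered.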

By using FCM and the normalization process above, the plantar pressure data can be divided into different models. The model sets with four and two models are shown in Figures 3–5.

3. Diagnosis of Sports Injuries Based on Multimodel

According to the results of the above clustering analysis, different plantar pressure data can be assigned to different COP models. Once the model set has been established, the main problem is how to assign a new data sample to one of its models. The SVM algorithm described below is used to solve this problem.

3.1. Support Vector Machine

SVM is an important algorithm in machine learning that is widely used for classification and regression analysis. It was first proposed for the linear two-class classification problem, in which a hyperplane is constructed to separate two kinds of samples. The method has since been extended to nonlinear classification problems by using a penalty function and a kernel function.

3.1.1. Linear Classification Problem

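For a set of training samples $\{(x_i, y_i)\}_{i=1}^{n}$, $x_i \in \mathbb{R}^{d}$ represents the sample vector of dimension $d$ and $y_i \in \{-1, +1\}$ represents the category. A linear classifier is designed to achieve such a classification of the $d$-dimensional space, and the hyperplane equation can be expressed as $w^{T} x + b = 0$.

Taking the two-dimensional plane as an example, two different groups of points on the plane are separated by a straight line L (the hyperplane in two-dimensional space is a straight line), which is shown in Figure 6.

Let $f(x) = w^{T} x + b$. If $f(x) = 0$, then $x$ is a point in the hyperplane; if $f(x) > 0$, $x$ is a point in mode I; otherwise, if $f(x) < 0$, $x$ is a point in mode II.

In this way, we only need to get the values of $w$ and $b$ to construct the classification function.

Finally the design of the classifier can be transformed into a convex optimization problem as follows:

$$\min_{w, b} \ \frac{1}{2} \lVert w \rVert^{2} \quad \text{s.t.} \quad y_i (w^{T} x_i + b) \ge 1, \quad i = 1, \ldots, n.$$

The convex optimization problem can be converted to the following function expression by introducing the Lagrange multipliers $\alpha_i \ge 0$:

$$L(w, b, \alpha) = \frac{1}{2} \lVert w \rVert^{2} - \sum_{i=1}^{n} \alpha_i \left( y_i (w^{T} x_i + b) - 1 \right).$$

And then the objective function becomes

$$\min_{w, b} \ \max_{\alpha_i \ge 0} \ L(w, b, \alpha).$$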
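
As a worked step, the standard route from this min-max problem to a problem in the multipliers alone is sketched below. It assumes the usual hard-margin Lagrangian $L(w, b, \alpha) = \frac{1}{2}\lVert w \rVert^{2} - \sum_i \alpha_i \left( y_i (w^{T} x_i + b) - 1 \right)$ with multipliers $\alpha_i \ge 0$; the notation is the textbook one and may differ from the paper's own equations.

```latex
% Stationarity of L with respect to w and b:
\frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{n} \alpha_i y_i x_i,
\qquad
\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{n} \alpha_i y_i = 0.

% Substituting both conditions back into L gives the dual problem:
\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, x_i^{T} x_j
\quad \text{s.t.} \quad \alpha_i \ge 0, \;\; \sum_{i=1}^{n} \alpha_i y_i = 0.
```

The dual depends on the samples only through the inner products $x_i^{T} x_j$, which is what allows a kernel function to replace them in the nonlinear case.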

3.1.2. Nonlinear Classification Problem

As to nonlinear classification problem, the support vector machine maps the sample data to the high dimension space by kernel function, so that the nonlinear classification problem can be solved.

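Now the classification function can be expressed as

$$f(x) = \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b,$$

where $K(\cdot, \cdot)$ is the kernel function.

$\alpha$ can be obtained by solving the following problem:

$$\max_{\alpha} \ \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{n} \alpha_i y_i = 0,$$

where $C$ is the penalty parameter.

And then according to the following formula, $w$ and $b$ can be calculated:

$$w = \sum_{i=1}^{n} \alpha_i y_i \, \phi(x_i), \qquad b = y_j - \sum_{i=1}^{n} \alpha_i y_i K(x_i, x_j) \quad \text{for any } j \text{ with } 0 < \alpha_j < C,$$

where $\phi$ is the feature mapping induced by the kernel, so that $K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$.

SVM is a supervised learning algorithm. Its application is divided into two parts: classifier training on a labeled data set and identification of new sample data. Classifier training is the process of calculating $\alpha$, $w$, and $b$ in order to determine the classification function, and identification substitutes the new samples into the classification function to calculate their category.

There are some commonly used kernel functions:

Polynomial kernel function: $K(x_1, x_2) = (\langle x_1, x_2 \rangle + r)^{p}$.

Gauss kernel function: $K(x_1, x_2) = \exp\left( -\lVert x_1 - x_2 \rVert^{2} / (2\sigma^{2}) \right)$.

Linear kernel function: $K(x_1, x_2) = \langle x_1, x_2 \rangle$.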
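
The three kernels, and the way a trained classifier uses them, can be sketched in plain Python. The parameter names (`degree`, `offset`, `sigma`) and their defaults are assumptions, and `decision_value` simply evaluates the standard kernel form $f(x) = \sum_i \alpha_i y_i K(x_i, x) + b$ for multipliers and a bias that are assumed to have been trained elsewhere.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def linear_kernel(a, b):
    return dot(a, b)

def polynomial_kernel(a, b, degree=2, offset=1.0):
    return (dot(a, b) + offset) ** degree

def gaussian_kernel(a, b, sigma=1.0):
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def decision_value(x, support_vectors, labels, alphas, bias, kernel):
    """Evaluate f(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(alpha * y * kernel(sv, x)
               for sv, y, alpha in zip(support_vectors, labels, alphas)) + bias

def classify(x, support_vectors, labels, alphas, bias, kernel):
    """Return +1 or -1 according to the sign of the decision value."""
    value = decision_value(x, support_vectors, labels, alphas, bias, kernel)
    return 1 if value >= 0 else -1
```

Only samples with nonzero multipliers (the support vectors) contribute to the sum, so identification of a new sample needs just those vectors, their labels, and the bias.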

3.2. Multimodel Identification for Plantar Pressure Based on SVM

The SVM algorithm is a supervised learning algorithm. After the plantar pressure data have been clustered by the FCM algorithm, every data sample and the model to which it belongs are known. These samples and their model labels are used to train the parameters $\alpha$, $w$, and $b$. Once training has converged, function (14) is used to decide which model a new plantar pressure sample belongs to.

4. Simulation

The anterior cruciate ligament is a very important structure of the human knee joint. It maintains the stability of knee motion. Anterior cruciate ligament deficiency (ACLD) is a common sports injury. It leads to instability of the knee and has a serious negative impact on knee function, and it also leads to abnormality of the COP line [20–24].

After data processing, we obtained 100 groups of plantar pressure characteristic sequences, including normal data and ACLD data. We first use the FCM algorithm for clustering analysis and then use the SVM algorithm for data identification.

Firstly, we put the left and right side data together for analysis. These 100 sets of data should include 4 models: left-side normal data, left-side ACLD data, right-side normal data, and right-side ACLD data. So the number of clusters in the FCM algorithm is set to $c = 4$. The weighted index and the minimum iteration error are fixed in advance, and the maximum iteration number is set to 100. The cluster analysis results are shown in Table 1.

In Table 1, $x_1$ represents the $x$-coordinate value of the first feature point in the COP line; the others are similar. Mode_1 indicates the sample's membership in Mode_1, which is shown in Figure 3. Category represents the final category to which the current sample belongs.
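
One plausible way to obtain the Category column from the membership columns is to take the mode with the largest membership for each sample. The sketch below assumes this argmax rule, which the text does not state explicitly, and the function name is hypothetical.

```python
def hard_labels(memberships):
    """Assign each sample to the mode with the largest membership.

    memberships[k] holds one sample's memberships [Mode_1, Mode_2, ...].
    Returns 1-based mode numbers so that 1 corresponds to Mode_1.
    """
    return [max(range(len(row)), key=row.__getitem__) + 1 for row in memberships]
```

For example, a sample with made-up memberships `[0.7, 0.1, 0.1, 0.1]` would be assigned to category 1. The resulting categories are the labels on which the SVM is subsequently trained.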

The results of the FCM clustering analysis are used for the training and identification of the SVM algorithm. We directly take the first 80 groups of data as training samples and the remaining 20 groups as test samples. The final identification accuracy, evaluated over the cluster labels, is 50%. This still falls short of our expectations.

Next, we analyze the plantar pressure data of the left and right sides separately. Take the left side data as an example. Now all data samples should fall into two categories, the normal data and the ACLD data, so the number of clusters in FCM is set to $c = 2$, with the other parameters unchanged. The results of the cluster analysis are shown in Table 2.

In Table 2, Mode_1 and Mode_2 are shown in Figure 4. We use the $k$-fold cross validation method (we take $k = 5$ in this study) for the analysis of the data. The data were divided into five groups. Because the original data sequence is randomly arranged, every 10 adjacent samples were placed in the same group. In each test, four groups are selected as the training data and the remaining group is used as the test data. The final identification results are calculated by the SVM algorithm and are shown in Table 3.
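
The grouping and averaging described above can be sketched as follows. The function names are hypothetical, and `train_and_predict` stands in for whatever classifier is used (the SVM in this study); it is an assumed callback that takes training samples, training labels, and test samples and returns predicted labels.

```python
def contiguous_folds(n_samples, k=5):
    """Split indices 0..n_samples-1 into k groups of adjacent samples."""
    if n_samples < k:
        raise ValueError("need at least one sample per group")
    size = n_samples // k
    folds = [list(range(i * size, (i + 1) * size)) for i in range(k)]
    folds[-1].extend(range(k * size, n_samples))  # leftovers join the last group
    return folds

def cross_validate(samples, labels, train_and_predict, k=5):
    """Return (mean accuracy, per-group accuracies) over k train/test splits."""
    accuracies = []
    for test_idx in contiguous_folds(len(samples), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        predicted = train_and_predict([samples[i] for i in train_idx],
                                      [labels[i] for i in train_idx],
                                      [samples[i] for i in test_idx])
        hits = sum(p == labels[i] for p, i in zip(predicted, test_idx))
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / len(accuracies), accuracies
```

Grouping adjacent samples is only a fair split when the sample order is random, as stated above; otherwise the indices should be shuffled before the groups are formed.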

The average value of the five simulation test results is obtained, and the average identification accuracy is 76%.

Then we can use the same method to analyze the right plantar pressure data and get the result of FCM cluster analysis which is shown in Table 4.

In Table 4, Mode_1 and Mode_2 are shown in Figure 5. All the data were divided into five groups; four groups were selected as the training data and the remaining set of data as the test data. The final identification results are shown in Table 5.

The average value of the five simulation test results is obtained, and the average identification accuracy is 62%.

Three groups of experiments were done in this paper. In the first experiment, 100 sets of data samples were divided into 4 categories, with FCM algorithm for clustering and SVM algorithm for identification. The final identification accuracy is 50%. In the second and third experiments, the left and right plantar pressure data were analyzed, respectively, in the same way. The final identification accuracy is 76% and 62%. The comparison of identification results is shown in Figure 7.

From the simulation, it can be seen that clustering and identification of plantar pressure data with the FCM and SVM algorithms give acceptable results. On the other hand, identification accuracy decreases as the number of models in the model set grows: with four models the accuracy was 50%, whereas with two models it was 76% for the left side and 62% for the right side. Using existing information or knowledge to reduce the number of models in the model set therefore improves the accuracy markedly. Compared with the doctor's clinical diagnosis, the accuracy is acceptable, but it should be improved in future study. This research can serve as a reference for doctors' clinical diagnosis and provide auxiliary analysis.

5. Conclusion

Machine learning algorithms are applied to the analysis of plantar pressure data. The COP line is used as the main feature for distinguishing ACLD patients from normal controls. The FCM algorithm is used to cluster the samples, and the SVM algorithm is used for identification. To the authors' knowledge, there is still no research examining plantar pressure using a clustering algorithm. The combination of machine learning with plantar pressure data analysis is tentative research; the result in this paper is acceptable, but the accuracy should be improved in future study. This research can serve as an auxiliary method for doctors and is relevant to clinical diagnosis, rehabilitation evaluation, orthosis prescription, and sports exercises.

Disclosure

Xiaoli Li and Hongshi Huang are co-first authors.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is partially supported by the Fund of National Natural Science Foundation of China (61473034 and 61673053), Specialized Research Fund for the Doctoral Program of Higher Education (2013000611008), the Start-Up Funding of Content development and the Introduced Talent Research of Beijing University of Technology, Beijing Nova Program Interdisciplinary Cooperation Project (Z161100004916041), Beijing Nova Program (2008A006), Opening Foundation of Key Laboratory of Cryogenics (CRYO 201316, TIPC, CAS), and the Seeding Grant for Medicine and Information Sciences of Peking University (2014-MI-24). The study protocol was approved by the Institutional Research Board of Peking University Third Hospital (IRB00006761-2012010).