Shock and Vibration

Volume 2014, Article ID 717465, 15 pages

http://dx.doi.org/10.1155/2014/717465

## Bearing Degradation Process Prediction Based on the Support Vector Machine and Markov Model

^{1}School of Mechatronics and Automotive Engineering, Chongqing Jiaotong University, Chongqing 400074, China

^{2}The State Key Laboratory of Mechanical Transmission, Chongqing University, Chongqing 400030, China

^{3}Key Laboratory of Road Construction Technology and Equipment, Ministry of Education, Chang’an University, Xi’an 710021, China

Received 15 March 2013; Accepted 5 August 2013; Published 5 March 2014

Academic Editor: Valder Steffen

Copyright © 2014 Shaojiang Dong et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Predicting the degradation process of bearings before they reach the failure threshold is extremely important in industry. This paper proposes a novel method based on the support vector machine (SVM) and the Markov model to achieve this goal. First, features are extracted by time domain and time-frequency domain methods. Because the extracted original features are still high dimensional and include superfluous information, the nonlinear multifeature fusion technique local tangent space alignment (LTSA) is used to merge the features and reduce the dimension. Then, based on the extracted features, the SVM model is used to predict the bearing degradation process, and the CAO method is used to determine the embedding dimension of the SVM model. After the bearing degradation process is predicted by the SVM model, the Markov model is used to improve the prediction accuracy. The proposed method was validated by two bearing run-to-failure experiments, and the results demonstrated the effectiveness of the methodology.

#### 1. Introduction

Bearings are among the most important components in rotating machinery. Accurate prediction of the bearing degradation process is the key to effective implementation of condition based maintenance; it can prevent unexpected failures and minimize overall maintenance costs [1, 2].

To achieve effective degradation process prediction of the bearing, features should first be extracted from the collected vibration data. Then, based on the extracted features, effective prediction models should be selected [3]. Feature extraction is the process of transforming the raw vibration data collected from running equipment into information relevant to the health condition. There are three types of methods for dealing with raw vibration data: time domain analysis, frequency domain analysis, and time-frequency domain analysis. All three are commonly used to extract features. For example, Yu [4] chose time domain and frequency domain transforms to describe the characteristics of the vibration signals. Yan et al. [5] chose the short-time Fourier transform to extract the features. Ocak et al. [6] chose the wavelet packet transform to extract features of bearing wear information. Because frequency features from FFT analysis tend to average out transient vibrations and thus do not provide a complete measure of the bearing health status, in this paper the time domain and time-frequency domain characteristics are used to extract the original features.

Although the original features can be extracted, they are still high dimensional and include superfluous information. A feature fusion and dimensional reduction method should therefore be applied to the original features to select the typical features. The most commonly used feature fusion and dimensional reduction method is principal component analysis (PCA) [7, 8]. However, PCA is mainly suited to linear data sets, while bearing vibration features usually exhibit nonlinear characteristics, so PCA cannot work effectively. It is therefore challenging to find an effective nonlinear feature fusion and dimensional reduction method. In this research a new feature extraction method, local tangent space alignment (LTSA) [9], is chosen. LTSA is an efficient manifold-learning algorithm that can be used as a preprocessing method to transform high dimensional data into more easily handled low dimensional data [10]; the method has been used in many fields, such as face recognition, character recognition, and image recognition [11, 12]. In this paper, LTSA is used to extract the more sensitive features.

After selecting the typical features, another challenge is how to effectively predict the bearing degradation process based on the extracted features. The existing equipment degradation process prediction methods can be roughly classified into model-based (or physics-based) methods and data-driven methods [13]. The model-based methods predict the equipment degradation process using physical models of the components and damage propagation models based on damage mechanics [14, 15]. However, equipment dynamic response and damage propagation processes are typically very complex, and authentic physics-based models are very difficult to build [16]. Data-driven methods, also known as artificial intelligence approaches, are derived directly from routine condition monitoring data of the monitored system and predict the failure progression based on a learning or training process. The more prior data are used for the training process, the more accurate the resulting model is [17]. Artificial intelligence techniques have been increasingly applied to bearing remaining life prediction recently. Lee et al. [18] presented an Elman neural network method for health condition prediction. Huang et al. [19] proposed a back-propagation network-based method for bearing degradation process prediction. However, neural networks have the drawbacks of slow convergence, difficulty in escaping from local minima, and uncertain network structure; these problems become more troublesome when the bearing degradation process is predicted from large data sets. The SVM [20] has been widely used recently and has good classification and regression ability. In this paper, the SVM is used to predict the bearing degradation process.

Although the SVM is effective in predicting the bearing running state, prediction error still exists. Because any method that predicts the future from historical data will have some prediction error, it is necessary to improve the prediction results. The prediction error is affected by many factors, fluctuates, appears highly random, and its individual points are not related. To achieve high prediction accuracy, we therefore need to find the regularity of the prediction error and correct it. The Markov model [21] uses the state transition matrix to achieve a more precise prediction and can be used to improve the prediction accuracy. So in this research, the Markov model is used to improve the prediction accuracy.

The remainder of this paper is organized as follows. The methods of features extraction and the theory of dimensional reduction method LTSA are introduced in Section 2. The SVM model and the Markov model for bearing degradation process prediction are described in Section 3. In Section 4, the flowchart and the procedure of this research are introduced. The case validation and actual application are presented in Section 5. Finally, the conclusions are given in Section 6.

#### 2. Methods of Signal Processing and Dimensional Reduction

##### 2.1. Feature Extraction

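This section presents a brief discussion of original feature extraction from the time domain and the time-frequency domain of vibration signals. Time domain methods usually involve statistical features that are sensitive to impulsive oscillation, such as kurtosis, skewness, peak-peak (P-P), RMS, and sample variance. These five time-domain statistical features, which have been used in the literature [2, 3], serve as original features. The RMS and the sample variance are defined as

$$x_{\mathrm{rms}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^{2}}, \qquad \sigma^{2} = \frac{1}{N-1}\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^{2},$$

where $N$ is the number of discrete points, $x_i$ represents the signal value at those points, and $\bar{x}$ is the mean value.

The $k$th central moment for a set of data is defined as

$$m_k = \frac{1}{N}\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^{k}.$$

The normalized fourth moment, kurtosis, which is commonly used in bearing diagnostics, is defined as

$$\mathrm{Kurtosis} = \frac{m_4}{m_2^{2}}.$$

The skewness is defined as

$$\mathrm{Skewness} = \frac{m_3}{m_2^{3/2}}.$$

The peak-peak is defined as

$$\mathrm{P\text{-}P} = \max_i\left(x_i\right) - \min_i\left(x_i\right).$$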
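As an illustration, the five time-domain features can be sketched in plain Python. This is a minimal sketch, not the authors' implementation: it assumes the $1/N$ form for the central moments and the $N-1$ form for the sample variance, and the function name `time_domain_features` is hypothetical.

```python
import math

def time_domain_features(x):
    """Return RMS, sample variance, kurtosis, skewness, and peak-peak of a signal."""
    n = len(x)
    mean = sum(x) / n

    def central_moment(k):
        # k-th central moment with the 1/N convention (an assumption of this sketch)
        return sum((v - mean) ** k for v in x) / n

    m2, m3, m4 = central_moment(2), central_moment(3), central_moment(4)
    return {
        "rms": math.sqrt(sum(v * v for v in x) / n),
        "variance": sum((v - mean) ** 2 for v in x) / (n - 1),  # N-1 form assumed
        "kurtosis": m4 / m2 ** 2,       # normalized fourth moment
        "skewness": m3 / m2 ** 1.5,     # normalized third moment
        "peak_peak": max(x) - min(x),
    }

# Toy square-wave signal, for illustration only
signal = [1.0, -1.0, 1.0, -1.0]
features = time_domain_features(signal)
```

In practice the function would be applied to each recorded vibration snapshot, giving one feature vector per snapshot.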

Empirical mode decomposition (EMD) is a powerful tool in time-frequency domain analysis. The advantage of EMD is that it presents signals in time-frequency distribution diagrams with multiresolution without requiring any parameters to be chosen. This property is essential in the detection of bearing faults. The EMD energy can represent the characteristics of vibration signals, and thus it is used as an input feature. The intrinsic mode function (IMF) energy data sets are chosen as original features in this paper. The original features used for bearing degradation process prediction are shown in Table 1.
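The IMF energy feature can be illustrated with a short sketch. It assumes that the IMFs have already been produced by some EMD routine (not shown here) and that the energy of an IMF is the sum of its squared samples; the function name `imf_energies` and the toy data are hypothetical.

```python
def imf_energies(imfs):
    """Energy of each IMF, taken here as the sum of its squared samples."""
    return [sum(v * v for v in imf) for imf in imfs]

# Two made-up "IMFs" standing in for the output of an EMD routine
imfs = [[1.0, -1.0, 1.0, -1.0], [0.5, 0.5, -0.5, -0.5]]
energies = imf_energies(imfs)
```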

##### 2.2. Dimensional Reduction Based on the LTSA

Because the generated original feature sets are still high dimensional and include superfluous information, the feature extraction method LTSA is used to fuse the relevant useful features and extract more sensitive features that serve as the input of the proposed prediction model.

The basic idea of LTSA is to use the tangent space at each sample point to represent the local geometry and then to align these local tangent spaces to construct the global coordinates. Given a data set $X = [x_1, x_2, \ldots, x_N]$, $x_i \in \mathbb{R}^{D}$, a $d$-dimensional ($d < D$) manifold is extracted. The LTSA feature extraction algorithm is as follows [9].

(1) Extract local information: for each $x_i$, $i = 1, 2, \ldots, N$, use the Euclidean distance to determine a set $X_i = [x_{i_1}, \ldots, x_{i_k}]$ of its neighboring points (e.g., the $k$ nearest neighbors).

(2) Local linear fitting: in the neighborhood of $x_i$, a set of orthogonal basis vectors $Q_i$ can be selected to construct the $d$-dimensional tangent space at $x_i$, and the orthogonal projection of each neighbor $x_{i_j}$ onto this tangent space can be calculated as $\theta_j^{(i)} = Q_i^{T}\left(x_{i_j} - \bar{x}_i\right)$, where $\bar{x}_i$ is the mean of the neighborhood. The projections form the local coordinates $\Theta_i = [\theta_1^{(i)}, \ldots, \theta_k^{(i)}]$, which describe the most important geometric information around $x_i$.

(3) Global alignment of the local coordinates: supposing that the global coordinates of $X_i$ obtained from $\Theta_i$ are $T_i = [t_{i_1}, \ldots, t_{i_k}]$, the reconstruction error is

$$E_i = T_i\left(I - \frac{1}{k} e e^{T}\right) - L_i \Theta_i,$$

where $I$ is the identity matrix, $e$ is the vector of all ones, $k$ is the number of points in the neighborhood, and $L_i$ is the transformation matrix. To minimize the error, $L_i$ and $T_i$ should be found, and then

$$L_i = T_i\left(I - \frac{1}{k} e e^{T}\right)\Theta_i^{+}, \qquad E_i = T_i\left(I - \frac{1}{k} e e^{T}\right)\left(I - \Theta_i^{+}\Theta_i\right),$$

where $\Theta_i^{+}$ is the Moore-Penrose generalized inverse of $\Theta_i$. Suppose

$$W_i = \left(I - \frac{1}{k} e e^{T}\right)\left(I - \Theta_i^{+}\Theta_i\right).$$

Let $T = [t_1, \ldots, t_N]$ be the global coordinates and let $S_i$ be a 0-1 selection matrix such that $T S_i = T_i$; the weight (alignment) matrix is

$$B = S W W^{T} S^{T}, \qquad S = [S_1, \ldots, S_N], \qquad W = \mathrm{diag}(W_1, \ldots, W_N).$$

The constraint is $T T^{T} = I_d$.

(4) Extract the low-dimensional manifold feature: since $e$ is an eigenvector of $B$ corresponding to the eigenvalue 0, the optimal $T$ is composed of the eigenvectors corresponding to the 2nd to $(d+1)$th smallest eigenvalues of $B$. $T$ is the global low-dimensional coordinate mapping of the nonlinear high-dimensional data set $X$.
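The four LTSA steps can be sketched with NumPy as follows. This is a minimal sketch following the commonly used formulation of the algorithm in [9], not the authors' code; the function name `ltsa`, the row-wise data layout, and the toy arc data are assumptions made for illustration. The neighborhood size 8 and target dimension 1 mirror the settings reported later in this paper.

```python
import numpy as np

def ltsa(X, n_neighbors, d):
    """Sketch of LTSA. X has shape (N, D), one sample per row; returns (N, d)."""
    N = X.shape[0]
    B = np.zeros((N, N))                      # global alignment matrix
    for i in range(N):
        # Step 1: k nearest neighbors by Euclidean distance (includes x_i itself)
        dist = np.linalg.norm(X - X[i], axis=1)
        idx = np.argsort(dist)[:n_neighbors]
        # Step 2: local linear fitting on the centered neighborhood
        Xi = X[idx] - X[idx].mean(axis=0)
        U, _, _ = np.linalg.svd(Xi, full_matrices=False)
        # Orthonormal basis spanning the constant vector and the d local coordinates
        G = np.hstack([np.ones((n_neighbors, 1)) / np.sqrt(n_neighbors), U[:, :d]])
        # Step 3: accumulate the local alignment term I - G G^T into B
        B[np.ix_(idx, idx)] += np.eye(n_neighbors) - G @ G.T
    # Step 4: eigenvectors of the 2nd to (d+1)th smallest eigenvalues of B
    _, eigvecs = np.linalg.eigh(B)            # eigenvalues returned in ascending order
    return eigvecs[:, 1:d + 1]

# Toy data: 40 points on a gentle 3-D arc (illustrative only)
t = np.linspace(0.0, 1.0, 40)
X = np.column_stack([np.cos(t), np.sin(t), t])
T = ltsa(X, n_neighbors=8, d=1)
```

The returned coordinates are defined only up to sign and scale, which is why the result is usually interpreted as a trend rather than as an absolute value.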

The procedure of feature extraction can be described as follows.

(1) Use the time domain analysis methods kurtosis, skewness, peak-peak, RMS, and sample variance to extract the statistical features.

(2) Use the EMD method to decompose the collected vibration signal of each data set and get the IMF components; calculate the energy of each IMF component to obtain the features of the bearing at this time; then obtain the features of the other data sets in the same way.

(3) Use the LTSA to reduce the dimension of the original features and get the main features; the extracted features are used as the input of the SVM model for bearing degradation process prediction.

#### 3. The SVM and Markov Model for Degradation Process Prediction

##### 3.1. SVM Prediction Model

SVM is a machine learning tool that uses statistical learning theory to solve multidimensional function estimation problems. It is based on the structural risk minimization principle, which overcomes the over-learning (overfitting) problem of ANN.

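The learning process of an SVM regression model is essentially a problem in quadratic programming. Given a set of data points $\{(x_i, y_i)\}_{i=1}^{n}$ with $x_i \in \mathbb{R}^{d}$ as input and $y_i \in \mathbb{R}$ as target output, the regression problem is to find a function such as

$$f(x) = w \cdot \phi(x) + b,$$

where $\phi(x)$ is the high dimensional feature space, which is nonlinearly mapped from the input space $x$, $w$ is the weight vector, and $b$ is the bias [22].

After training, the corresponding $y$ can be found through $f(x)$ for any $x$ outside the sample. The $\varepsilon$-support vector regression ($\varepsilon$-SVR) by Vapnik controls the precision of the algorithm through a specified tolerance error $\varepsilon$. If the error of a sample is $\xi$, the loss is ignored when $|\xi| \le \varepsilon$; otherwise the loss is taken as $|\xi| - \varepsilon$. First, map the sample into a high dimensional feature space by a nonlinear mapping function and convert the problem of nonlinear function estimation into a linear regression problem in the high dimensional feature space. If we let $\phi(x)$ be the conversion function from the sample space into the high dimensional feature space, then the problem of solving the parameters of $f(x)$ is converted to solving the optimization problem (12) with the constraints in (13):

$$\min_{w,\,b,\,\xi,\,\xi^{*}} \ \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^{*}\right) \tag{12}$$

$$\text{s.t.}\quad y_i - w\cdot\phi(x_i) - b \le \varepsilon + \xi_i, \qquad w\cdot\phi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \qquad \xi_i,\ \xi_i^{*} \ge 0. \tag{13}$$

The feature space is one of high dimensionality and the target function is nondifferentiable. In general, the SVM regression problem is solved by establishing a Lagrange function and converting this problem to a dual optimization, that is, problem (14) with the constraints of (15), in order to determine the Lagrange multipliers $\alpha_i$, $\alpha_i^{*}$:

$$\max_{\alpha,\,\alpha^{*}} \ -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\left(\alpha_i-\alpha_i^{*}\right)\left(\alpha_j-\alpha_j^{*}\right)K(x_i, x_j) - \varepsilon\sum_{i=1}^{n}\left(\alpha_i+\alpha_i^{*}\right) + \sum_{i=1}^{n} y_i\left(\alpha_i-\alpha_i^{*}\right) \tag{14}$$

$$\text{s.t.}\quad \sum_{i=1}^{n}\left(\alpha_i-\alpha_i^{*}\right) = 0, \qquad 0 \le \alpha_i,\ \alpha_i^{*} \le C, \tag{15}$$

where $\alpha_i$, $\alpha_i^{*}$ are Lagrange multipliers and $C$ evaluates the tradeoff between the empirical risk and the smoothness of the model.

The SVM regression problem has therefore been transformed into a quadratic programming problem. The regression equation can be obtained by solving this problem. With the kernel function $K(x_i, x)$, the corresponding regression function is provided by

$$f(x) = \sum_{i=1}^{n}\left(\alpha_i-\alpha_i^{*}\right)K(x_i, x) + b,$$

where the kernel function $K(x_i, x) = \phi(x_i)\cdot\phi(x)$ is the inner product of the vectors $x_i$ and $x$ in the feature spaces $\phi(x_i)$ and $\phi(x)$.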
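Two ingredients of the SVR formulation can be made concrete with a short sketch: the tolerance-based loss and the evaluation of the kernel regression function once the dual coefficients are known. This is not a trained model; the function names, the Gaussian kernel parametrization with width `sigma`, and the coefficient values are all hypothetical and chosen only for illustration.

```python
import math

def eps_insensitive_loss(y_true, y_pred, eps):
    """Zero loss inside the eps tube, linear loss outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)

def rbf_kernel(u, v, sigma):
    """Gaussian kernel; the 2*sigma^2 parametrization is an assumption of this sketch."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def svr_predict(x, support_vectors, dual_coefs, bias, sigma):
    """Evaluate f(x) = sum_i coef_i * K(x_i, x) + b, where coef_i stands for
    the difference of the two Lagrange multipliers of sample i."""
    return sum(c * rbf_kernel(sv, x, sigma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias

# Hypothetical one-support-vector model, for illustration only
prediction = svr_predict([0.0], [[0.0]], [2.0], bias=0.5, sigma=1.0)
```

In a real application the coefficients and the bias come from solving the quadratic program, typically with an existing SVM library.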

##### 3.2. The Prediction Strategy and the Structure of the SVM Model

Traditional forecasting methods mainly achieve single-step prediction; when they are used for multistep prediction, they cannot capture the overall development trend of the series. Multistep prediction methods can obtain overall information about the series, which makes long-term prediction possible. There are two typical alternatives for building a multistep life prediction model: iterated prediction and direct prediction. Comparisons of the two strategies can be found in the literature [23]. Marcellino et al. [24] presented a large-scale empirical comparison of iterated versus direct prediction. The results show that iterated prediction typically outperforms direct prediction. The iterated multistep prediction strategy is therefore adopted in this paper.
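The iterated strategy can be sketched as follows: a one-step model is applied repeatedly, and each prediction is fed back into the input window. This is a minimal sketch; the function names and the toy one-step model (a linear extrapolation standing in for a trained regressor) are hypothetical.

```python
def iterated_forecast(history, one_step_model, window_size, steps):
    """Predict `steps` values ahead by feeding each prediction back as input."""
    window = list(history[-window_size:])
    predictions = []
    for _ in range(steps):
        y_next = one_step_model(window)      # single-step prediction
        predictions.append(y_next)
        window = window[1:] + [y_next]       # slide the window forward
    return predictions

def linear_extrapolation(window):
    """Toy one-step model used only to make the sketch runnable."""
    return 2 * window[-1] - window[-2]

forecast = iterated_forecast([1.0, 2.0, 3.0, 4.0], linear_extrapolation,
                             window_size=3, steps=3)
```

Because predictions are reused as inputs, errors can accumulate with the horizon, which is one motivation for correcting the prediction error afterwards.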

In order to determine the structure of the SVM, we constructed a three-layer SVM prediction model. To achieve multistep time series life prediction, a basic problem must be solved: how many past observations (inputs) should be used to forecast the future value (the output node number is 1), the so-called embedding dimension. To solve this problem, the CAO method [25], which determines the minimum embedding dimension efficiently by examining how neighboring points in the embedding space separate as the dimension grows, is employed to select an appropriate embedding dimension. The SVM input node number is then determined.

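To select an appropriate embedding dimension with the CAO method, the phase space reconstruction method should first be introduced. The fundamental theorem of phase space reconstruction was pioneered by Takens [26]. For an $N$-point time series $\{x_1, x_2, \ldots, x_N\}$, a sequence of vectors $y_i$ in a new space can be generated as $y_i = [x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau}]$, where $i = 1, 2, \ldots, M$, $M = N-(m-1)\tau$ is the number of reconstructed vectors $y_i$, $m$ is the embedding dimension of the reconstructed state space, and $\tau$ is the embedding delay time. The time delay is chosen through the autocorrelation function [27]:

$$C(\tau) = \frac{\sum_{i=1}^{N-\tau}\left(x_i-\bar{x}\right)\left(x_{i+\tau}-\bar{x}\right)}{\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^{2}},$$

where $\tau = 1, 2, \ldots$ and $\bar{x}$ is the average value of the time series. The optimal time delay is determined when the first minimum value of $C(\tau)$ occurs.

The embedding dimension is chosen through the CAO method, defining the quantity $a(i, m)$ as follows:

$$a(i, m) = \frac{\left\|y_i(m+1) - y_{n(i,m)}(m+1)\right\|}{\left\|y_i(m) - y_{n(i,m)}(m)\right\|},$$

where $\|\cdot\|$ is a measure of Euclidean distance and is given by the maximum norm. $y_i(m)$ means the $i$th reconstructed vector in dimension $m$, and $n(i, m)$ is an integer such that $y_{n(i,m)}(m)$ is the nearest neighbor of $y_i(m)$ in the embedding dimension $m$. A new quantity is defined as the mean value of all $a(i, m)$:

$$E(m) = \frac{1}{N-m\tau}\sum_{i=1}^{N-m\tau} a(i, m),$$

where $E(m)$ is only dependent on the dimension $m$ and the time delay $\tau$. To investigate its variation from $m$ to $m+1$, the parameter $E_1(m)$ is given by

$$E_1(m) = \frac{E(m+1)}{E(m)}.$$

As $m$ increases, $E_1(m)$ also increases, and it stops increasing once $m$ is large enough when the time series comes from a deterministic process. If a plateau is observed for $m \ge m_0$, then $m_0 + 1$ is the minimum embedding dimension. In practice, however, it can be difficult to tell whether $E_1(m)$ is still slowly increasing or has stopped changing when $m$ is sufficiently large. CAO introduced another quantity $E_2(m)$ to overcome the problem:

$$E_2(m) = \frac{E^{*}(m+1)}{E^{*}(m)},$$

where

$$E^{*}(m) = \frac{1}{N-m\tau}\sum_{i=1}^{N-m\tau}\left|x_{i+m\tau} - x_{n(i,m)+m\tau}\right|.$$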

Through CAO method, the embedding dimension of the SVM prediction model is chosen. The structure of the SVM model is determined.
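The selection of the delay and the embedding dimension can be sketched in plain Python. The sketch below assumes a normalized autocorrelation estimate, the maximum norm for distances, and zero-based indexing; the function names and the two test series (a sine wave and a logistic map) are illustrative assumptions and are not taken from the experiments in this paper.

```python
import math

def autocorrelation(x, tau):
    """Normalized autocorrelation of the series x at lag tau."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + tau] - mean) for i in range(n - tau))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def first_minimum_delay(x, max_tau):
    """Smallest lag at which the autocorrelation has a local minimum."""
    values = [autocorrelation(x, t) for t in range(1, max_tau + 1)]
    for k in range(1, len(values) - 1):
        if values[k] < values[k - 1] and values[k] <= values[k + 1]:
            return k + 1                      # lags start at 1
    return max_tau                            # fallback when no interior minimum exists

def embed(x, m, tau, count):
    """First `count` delay vectors of dimension m."""
    return [[x[i + j * tau] for j in range(m)] for i in range(count)]

def max_norm(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

def cao_E(x, m, tau):
    """Mean ratio of neighbor distances in dimension m+1 versus dimension m."""
    count = len(x) - m * tau                  # vectors that exist in both dimensions
    ym, ym1 = embed(x, m, tau, count), embed(x, m + 1, tau, count)
    total, used = 0.0, 0
    for i in range(count):
        best_j, best_d = None, float("inf")
        for j in range(count):                # nearest neighbor with nonzero distance
            if j != i:
                dist = max_norm(ym[i], ym[j])
                if 0.0 < dist < best_d:
                    best_j, best_d = j, dist
        if best_j is not None:
            total += max_norm(ym1[i], ym1[best_j]) / best_d
            used += 1
    return total / used

def cao_E1(x, tau, max_m):
    """E1(m) = E(m+1) / E(m) for m = 1..max_m."""
    E = [cao_E(x, m, tau) for m in range(1, max_m + 2)]
    return [E[m] / E[m - 1] for m in range(1, max_m + 1)]

sine = [math.sin(2 * math.pi * i / 20) for i in range(200)]
delay = first_minimum_delay(sine, 15)

logistic = [0.3]
for _ in range(199):
    logistic.append(4.0 * logistic[-1] * (1.0 - logistic[-1]))
e1 = cao_E1(logistic, tau=1, max_m=5)
```

The embedding dimension is then read off as the point where the `e1` curve levels out; the nearest-neighbor search here is quadratic in the series length, which is acceptable for a few hundred points.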

##### 3.3. SOM Clustering Method to Divide the Prediction Error

State division is the process of determining the mapping from a random variable to the state space. How to obtain the state division is a crux of the Markov model. Traditionally, it is performed as follows. Let $\{x_t\}$ be the random sequence and let $E = \{1, 2, \ldots, n\}$ denote the state space. If the random variable $x_t \in [a_{i-1}, a_i)$, where $a_{i-1} < a_i$ and $i \in E$, then the variable belongs to state $i$; the interval boundaries $a_i$ are usually spaced uniformly. However, uniform division depends on the analyst's experience, which affects the prediction precision. In this research, the SOM neural network [28] is used to divide the states. The SOM can be created from highly deviating, nonlinear data. After the data are input, the SOM is trained iteratively.

In each training step, one sample vector $x$ from the input data set is chosen randomly, and the distance between it and all the weight vectors of the SOM, which are originally initialised randomly, is calculated using some distance measure. The best matching unit (BMU) is the map unit whose weight vector is closest to $x$. After the BMU is identified, the weight vectors of the BMU, as well as those of its topological neighbors, are updated so that they are moved closer to the input vector in the input space. The vectors are updated following the learning rule

$$w_i(t+1) = w_i(t) + \alpha(t)\, h_{ci}(t)\left[x(t) - w_i(t)\right],$$

where $h_{ci}(t)$ is the neighborhood function, which is monotonically decreasing with respect to the distance between the BMU $c$ and unit $i$ in the grid and with respect to the training time $t$, and $\alpha(t)$ is the learning rate, a decreasing function with $0 < \alpha(t) < 1$.

At the end of the learning process, the weight vectors are grouped into clusters depending on their distance in the input space. Unlike networks based on supervised learning, which require that target values corresponding to input vectors are known, the SOM can cluster data without knowing the class membership of the input data. This property suits the prediction error of the SVM model, whose structure is not known in advance and which must therefore be classified without prior learning. The SOM is thus an efficient and suitable method for clustering the states. Based on the clustering results, the state space is divided into several districts.
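For scalar prediction errors, the SOM training loop reduces to a one-dimensional map. The following is a minimal sketch under several assumptions that are not taken from this paper: a chain of three units, a Gaussian neighborhood function, linearly decaying learning rate and radius, and made-up error values. The function names are hypothetical.

```python
import math
import random

def train_som_1d(data, n_units=3, iterations=2000, seed=0):
    """Train a 1-D SOM on scalar data and return the unit weights."""
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    weights = [rng.uniform(lo, hi) for _ in range(n_units)]   # random initialisation
    for t in range(iterations):
        frac = t / iterations
        alpha = 0.5 * (1.0 - frac)                            # decreasing learning rate
        radius = max(0.5 * n_units * (1.0 - frac), 0.1)       # shrinking neighborhood
        x = rng.choice(data)                                  # random training sample
        bmu = min(range(n_units), key=lambda i: abs(weights[i] - x))
        for i in range(n_units):
            h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
            weights[i] += alpha * h * (x - weights[i])        # move toward the sample
    return weights

def assign_state(x, weights):
    """State label 1..n of the closest unit, with units ranked by weight value."""
    ranked = sorted(weights)
    nearest = min(ranked, key=lambda w: abs(w - x))
    return ranked.index(nearest) + 1

# Made-up prediction errors, for illustration only
errors = [-0.05, -0.04, -0.045, 0.0, 0.005, -0.003, 0.04, 0.05, 0.045]
weights = train_som_1d(errors)
states = [assign_state(e, weights) for e in errors]
```

Ranking the units by weight value makes the state labels increase with the error value, which is convenient when the states are later used as Markov states.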

##### 3.4. Markov Prediction Model to Improve the Prediction Accuracy

Consider a stochastic process $\{X_n, n = 0, 1, 2, \ldots\}$ that takes on a finite or countable number of possible values. Unless otherwise mentioned, this set of possible values of the process will be denoted by the set of nonnegative integers $\{0, 1, 2, \ldots\}$. If $X_n = i$, then the process is said to be in state $i$ at time $n$. Suppose that whenever the process is in state $i$, there is a fixed probability $P_{ij}$ that it will next be in state $j$. That is,

$$P\{X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_1 = i_1, X_0 = i_0\} = P_{ij} \tag{24}$$

for all states $i_0, i_1, \ldots, i_{n-1}, i, j$ and all $n \ge 0$. Such a stochastic process is known as a Markov chain. Equation (24) can be interpreted as stating that for a Markov model, the conditional distribution of any future state $X_{n+1}$, given the past states $X_0, X_1, \ldots, X_{n-1}$ and the present state $X_n$, is independent of the past states and depends only on the present state. This is called the Markovian property. The value $P_{ij}$ represents the probability that the process will, when in state $i$, next make a transition into state $j$. Since probabilities are nonnegative and since the process must make a transition into some state, we have that

$$P_{ij} \ge 0, \quad i, j \ge 0; \qquad \sum_{j=0}^{\infty} P_{ij} = 1, \quad i = 0, 1, \ldots.$$

If the process has a finite number of states, which means the state space is $E = \{1, 2, \ldots, n\}$, then the Markov chain model can be defined by the matrix of one-step transition probabilities, denoted as

$$P = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{bmatrix}.$$

The transition probability $p_{ij}$ is estimated by

$$p_{ij} = \frac{M_{ij}}{M_i},$$

where $M_{ij}$ denotes the number of transitions from state $i$ to state $j$ and $M_i$ denotes the number of random variables belonging to state $i$.

The Markov model adopts a state vector and the state transition matrix to deal with the prediction issue. Suppose that the state vector at moment $t-1$ is $S(t-1)$, the state vector at moment $t$ is $S(t)$, and the state transition matrix is $P$; then the relationship is

$$S(t) = S(t-1)\,P.$$

Updating $t$ from 1 to $k$ gives

$$S(k) = S(0)\,P^{k}, \tag{29}$$

where $S(k)$ is the state vector at moment $k$. Equation (29) is the basic Markov prediction model: if the initial state vector $S(0)$ and the transition matrix $P$ are given, any future state vector can be calculated.
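Estimating the transition matrix from an observed state sequence and propagating a state vector forward can be sketched as follows. This is a minimal sketch: each row is normalized by the number of transitions that leave the state (so that rows sum to 1), states are labeled from 1, and the function names and the example sequence are hypothetical.

```python
def transition_matrix(states, n_states):
    """Estimate one-step transition probabilities from a sequence of state labels."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states[:-1], states[1:]):
        counts[a - 1][b - 1] += 1             # one observed transition a -> b
    matrix = []
    for row in counts:
        total = sum(row)                      # transitions leaving this state
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

def propagate(state_vector, matrix, k):
    """Return the state vector after k steps (row vector times matrix, k times)."""
    s = list(state_vector)
    n = len(matrix)
    for _ in range(k):
        s = [sum(s[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return s

# Made-up two-state sequence, for illustration only
P = transition_matrix([1, 1, 2, 1, 2, 2, 1], n_states=2)
s2 = propagate([1.0, 0.0], P, k=2)
```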

#### 4. Proposed Method

The flowchart of the proposed method is shown in Figure 1. The method consists of four procedures sequentially: data processing and features extraction, merge of the original features, constructing-training SVM model and predicting, and Markov model for improving the prediction result. The role of each procedure is explained as follows.

*Step 1.* Data processing and features extraction. The time domain and time-frequency domain signal processing methods are used to extract the original features from the collected mass vibration data.

*Step 2.* Merge of the original features. The LTSA method is used to extract the typical features and reduce the dimension of the features. The extracted features are used for training the SVM model.

*Step 3.* Constructing the SVM model. The SVM model is constructed; the CAO method is used to determine the embedding dimension. The iterated multistep prediction method is used to forecast the future value.

*Step 4.* Markov model for improving the prediction result. This procedure uses the SOM method to cluster the prediction error before the Markov method is applied; based on the state division results, the Markov model is used to improve the prediction results obtained by the SVM model and thus to get a more precise prediction.

#### 5. Validation and Application

##### 5.1. Validation

In order to validate the effect of the proposed method, a validation test was performed. The vibration signals used in this paper are provided by the Center for Intelligent Maintenance Systems (IMS), University of Cincinnati [29]. The experimental data sets are generated from bearing run-to-failure tests under constant load conditions on a specially designed test rig, as shown in Figure 2. The rotation speed is kept constant at 2000 rpm. A radial load of 6000 lbs is applied to the shaft and bearing by a spring mechanism. The data sampling rate is 20 kHz and the data length is 20,480 points, as shown in Figure 3. It took a total of 7 days until the bearing failed. At the end, one severely damaged bearing is used to test the proposed method, as shown in Figure 4.

The time domain and time-frequency domain methods are used to process the collected vibration data as described in Section 2 and Table 1. The measured values of kurtosis and skewness are depicted in Figures 5(a) and 5(b). The IMF1 and IMF2 energies are depicted in Figures 6(a) and 6(b).

From the extracted features shown in Figures 5 and 6 we can see the following. The bearing is in normal condition during the time corresponding to the first 700 points. After that, the condition of the bearing suddenly changes, which indicates that some faults have occurred in this bearing. Different features reflect the bearing running state in different ways. For example, the kurtosis and the IMF1 energy indicate the bearing running state with an upward trend, while the skewness indicates it with a downward trend. The kurtosis, skewness, and IMF1 energy show a large fluctuation after point 700, while the IMF2 energy shows a large fluctuation only after point 850. Because the individual features disagree in this way, no single one of them appropriately reflects the bearing running condition, and it is necessary to extract a sensitive feature from the original features.

We performed feature extraction by means of LTSA to extract a sensitive feature and reduce the dimensionality of the calculated features. After LTSA is applied (in this article the neighborhood factor equals 8 and the embedding dimension equals 1), the bearing running state feature dataset is obtained. The first main projected vector is chosen as the input of the SVM model. The result is shown in Figure 7. For comparison with the LTSA method, we also extracted the features through the PCA method, choosing the first principal component. The result is shown in Figure 8.

From Figures 7 and 8 we can see that the LTSA method extracts an effective feature dataset that is sensitive to changes of the bearing running state, while the feature extracted by the PCA method performs poorly: before point 700 the fluctuation of the bearing running state cannot be seen, and the change of trend is also not obvious, so the bearing running state cannot be identified effectively. This result indicates that the information extracted by LTSA is more effective than that extracted by PCA.

After extracting the typical features, the CAO method is used to determine the embedding dimension of the SVM model, based on the theorem of phase space reconstruction. We first choose the delay time through the autocorrelation function; the optimal time delay is taken where the first minimum of the autocorrelation function occurs. Based on the extracted feature dataset, the delay time is set to 3 for the projected vector values, as shown in Figure 9.

Then the embedding dimension is selected by the CAO method. The result is shown in Figure 10; the optimal embedding dimension for the projected vector is chosen as 10.

Based on the selected optimal embedding dimension, the SVM model is used to achieve the multistep prediction, with the RBF kernel function.

In this research the regularity parameter is set to 90.3, the kernel function parameter is set to 20, and the tolerance $\varepsilon$ is set to 0.001. The regularity parameter and the kernel function parameter are selected by the Particle Swarm Optimization algorithm (PSO) [30]. The population size of the PSO is set to 100, the iteration number of the PSO is set to 20, and the fitness function of the PSO selects the parameters that make the fitting error of the SVM model in the training process the smallest. The error goal is set to 0.05, the dimension of the PSO is set to 2, and an inertia weight is used.

Based on the selected typical features, the feature dataset is used to train the SVM model; the number of input features of the SVM is 9, as determined by the embedding dimension. Then, the trained SVM model is used to predict the bearing running state. Before the 700th point the bearing is working in a normal state, so points 701–900 are used to train the SVM model and the following 85 points are employed for testing. In order to evaluate the prediction performance, the root-mean-square error (RMSE) is utilized as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}},$$

where $n$ represents the total number of data points in the test set, $y_i$ is the actual value in the training set or test set, and $\hat{y}_i$ represents the predicted value of the model. The actual value and the predicted result are shown in Figure 11.
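The RMSE criterion can be written as a short helper. This is a minimal sketch with made-up numbers; the function name `rmse` is hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error between two equally long sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

# Made-up values, for illustration only
score = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```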

From Figure 11 we can see that the trend of the bearing running state can be predicted by the SVM model. The prediction gives a general understanding of the future bearing running state, but the predicted result is not accurate, especially over points 70–85. By calculation, the RMSE between the actual and the predicted result is 0.0469, so the prediction result is not satisfactory. The prediction error of the SVM model (the predicted result minus the actual data) is shown in Figure 12.

Then the SOM method is used to divide the error into several districts; the iteration number of the SOM is 2000. The results of the state division by SOM are shown in Table 2.

It can be seen from Table 2 that when there is a downward trend or the point value is less than 0, the state is set to 1, and when there is an upward trend, the state is set to 3. According to the classification of the SOM model, the Markov state space is divided into three districts. The Markov model is then used to correct the prediction error. For example, for point 78, the Markov state transition matrix is estimated from points 1–77.

Choose the three points 77, 76, and 75, which are the most recent before point 78, and set the transfer steps to 1, 2, and 3, respectively; the state prediction results based on the Markov model are shown in Table 3.

From Table 3 we can see that the accumulated value of state 3 is the largest, so the state of point 78 is set to 3, which agrees with the result of the SOM clustering method.
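The accumulation over several transfer steps can be sketched as follows: for each of the most recent points, take the row of the matching power of the transition matrix that belongs to its state, add the rows together, and pick the state with the largest total. This is a minimal sketch of that idea; the transition matrix and the recent states below are made up and are not the values of Table 3, and the function names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def predict_state(recent_states, P):
    """recent_states[0] is the latest point (step 1), recent_states[1] the one
    before it (step 2), and so on. Returns the predicted state and the scores."""
    n = len(P)
    scores = [0.0] * n
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for state in recent_states:
        power = mat_mul(power, P)             # P to the power of the current step
        for j in range(n):
            scores[j] += power[state - 1][j]  # row of the state the point was in
    best = max(range(n), key=lambda j: scores[j]) + 1
    return best, scores

# Hypothetical three-state transition matrix and recent states
P = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.1, 0.2, 0.7]]
best_state, scores = predict_state([3, 3, 2], P)
```

Each added row is a probability distribution, so the scores always sum to the number of points used.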

In this research, according to the state predicted by the Markov model, the corrected value is calculated from the median value of the divided state interval and the value predicted through the SVM model.

For point 78, the value corrected in this way is compared with the actual value at this point, 0.37206. The other points are corrected through the same method. The corrected results for points 70–85 are shown in Table 4.

From Table 4 we can see that the Markov model makes the results more precise, which confirms the necessity of using the Markov model in the proposed method. The RMSE between the actual and the predicted result is 0.0091, so the prediction accuracy is improved significantly. However, because these points lie farther into the prediction horizon than points 1–69, the results still contain some error.

For comparison, the most commonly used prediction model, the BP neural network (BPNN), is used to predict the bearing running state based on the selected features. The learning rate of the neural network and its momentum coefficient are 0.01; the weights are initialized to uniformly distributed random values between −0.1 and 0.1; the iteration number is 2000; the training error is 0.001; the input number is 9; the hidden number is 15; the output node number is 1. The prediction results are shown in Figure 13.

From Figure 13 we can see that the prediction based on the BPNN model does not work effectively. The prediction contains some peaks at positions where the actual status shows no obvious change. The RMSE of the predicted result is 0.0932, so the traditional BPNN model is less effective than the SVM model. In addition, the prediction based on the BPNN is unstable: when the same data are used to train and predict, the results differ from run to run, and the neural network may even fall into a local optimum, as shown in Figures 14 and 15.

With the same data, the proposed method has also been compared with other methods proposed in related research: (1) the principal signal features extracted by PCA are utilized by an HMM to predict the bearing running state [31]; (2) the time domain and frequency domain features are used directly as the input of the prediction model, and the result is predicted by a neural network algorithm [1]; (3) the original features are extracted by PCA and used as the input of the SVM prediction model [32]; (4) the method proposed in this research. The RMSE of the results predicted by the different methods is shown in Table 5.

From Table 5, we can see that the RMSE differs considerably among the prediction methods. The method in which the original features are used directly as the input of the NN model performs worst. The reason is that the original features are still high dimensional and include superfluous information, which makes them inappropriate for state prediction; in addition, the NN prediction model has the drawbacks of slow convergence and difficulty in escaping from local minima. The method based on the HMM is less effective than the method based on the SVM model, because the HMM is not appropriate for long-term forecasting. The proposed method performs best, because the LTSA feature extraction method can effectively extract the typical features and reduce the dimension, and the SVM-Markov model can predict the state more precisely than either the SVM or the Markov model alone. The comparison therefore shows that the proposed method is very effective in bearing running state prediction.

##### 5.2. Application

After the effectiveness of the proposed method was validated, the method was applied to a practical case. The test rig is shown in Figure 16.

The bearings are mounted on the shaft, and the shaft is driven by an AC motor. The rotation speed is kept at 1000 rpm and a radial load of 3 kg is applied to the bearing. The data sampling rate is 25600 Hz and the data length is 102400 points; the data collected on 2011.11.25 are shown in Figure 17. The vibration data are collected once every 2 hours. The data collected from 2011.11.25 to 2011.12.17, after running for 1 year, are analyzed.

The time domain and time-frequency domain methods are used to deal with the collected vibration data as described in Section 2, Table 1. Then the features are normalized through and processed into the interval . The LTSA is used to reduce the dimensionality of calculated features and the result is shown in Figure 18.
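The normalization formula and the target interval are not legible in this text, so the following is only a sketch of one common choice, min-max scaling. The function name `min_max_scale` and the default interval [0, 1] are assumptions, not the authors' stated settings.

```python
def min_max_scale(values, lo=0.0, hi=1.0):
    """Linearly map values onto [lo, hi]; lo and hi are assumed defaults."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant feature carries no trend; map it to the lower bound.
        return [lo for _ in values]
    span = v_max - v_min
    return [lo + (hi - lo) * (v - v_min) / span for v in values]


# Hypothetical feature values for illustration only.
feature = [2.0, 4.0, 6.0, 10.0]
scaled = min_max_scale(feature)
```

Scaling each feature separately keeps features with large numeric ranges from dominating the subsequent dimensionality reduction.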

From Figure 18, we can see that the bearing running state fluctuates and shows an upward trend. In particular, around point 150 there is a sudden change of trend, which reflects a change in the bearing's working status at that moment.

After the typical features are extracted, the CAO method is used to determine the embedding dimension of the SVM model. The delay time for the projected vector values is set to 2 through the autocorrelation function, as shown in Figure 19.
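The rule used to read the delay time off the autocorrelation function is not stated in this section. The sketch below assumes one common convention: take the first lag at which the normalized autocorrelation falls below a chosen threshold. The function names, the threshold value, and the sample series are all hypothetical.

```python
def autocorrelation(series, lag):
    """Normalized autocorrelation of a series at a given lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        raise ValueError("series is constant")
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


def first_lag_below(series, threshold, max_lag):
    """Return the first lag whose autocorrelation drops below threshold."""
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) < threshold:
            return lag
    return None  # no lag up to max_lag satisfies the assumed criterion


# Hypothetical series for illustration only.
ramp = [float(i) for i in range(8)]
delay = first_lag_below(ramp, threshold=0.5, max_lag=5)
```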

The embedding dimension is selected by the CAO method. The result is shown in Figure 20 and the optimal embedding dimension for the projected vector is chosen as 10.
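As a rough illustration of how such a dimension can be selected, the sketch below computes the E1 statistic from Cao's method [25]: the embedding dimension is usually taken where E1(d) stops changing and stays close to 1. This is a simplified version written under assumptions (maximum norm, neighbors at zero distance skipped, the companion E2 statistic omitted), and the logistic-map test series is hypothetical rather than the bearing data.

```python
def embed(series, dim, tau, count):
    """First `count` delay vectors of dimension `dim` with delay `tau`."""
    return [[series[i + k * tau] for k in range(dim)] for i in range(count)]


def chebyshev(u, v):
    """Maximum-norm distance between two vectors."""
    return max(abs(a - b) for a, b in zip(u, v))


def cao_e(series, dim, tau):
    """Mean ratio of neighbor distances in dim + 1 versus dim dimensions."""
    count = len(series) - dim * tau  # vectors that also exist in dim + 1
    if count < 2:
        raise ValueError("series too short for this dim and tau")
    low = embed(series, dim, tau, count)
    high = embed(series, dim + 1, tau, count)
    ratios = []
    for i in range(count):
        best_j, best_dist = None, float("inf")
        for j in range(count):
            if j == i:
                continue
            dist = chebyshev(low[i], low[j])
            if 0.0 < dist < best_dist:  # skip identical points
                best_j, best_dist = j, dist
        if best_j is not None:
            ratios.append(chebyshev(high[i], high[best_j]) / best_dist)
    if not ratios:
        raise ValueError("no usable neighbors found")
    return sum(ratios) / len(ratios)


def cao_e1(series, tau, max_dim):
    """E1(d) = E(d + 1) / E(d) for d = 1 .. max_dim."""
    e = [cao_e(series, d, tau) for d in range(1, max_dim + 2)]
    return [e[d] / e[d - 1] for d in range(1, max_dim + 1)]


# Hypothetical test series (logistic map), not the bearing feature data.
x = [0.3]
for _ in range(199):
    x.append(4.0 * x[-1] * (1.0 - x[-1]))
e1 = cao_e1(x, tau=1, max_dim=3)
```

In practice the E1 curve would be inspected against the dimension, as in Figure 20, rather than thresholded automatically.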

Based on the selected optimal embedding dimension, the SVM model is used to achieve the prediction. The regularization parameter is set as 909.5 and the is set to 0.01, selected by the PSO method. Based on the selected typical features, points 1–130 of the feature dataset are used to train the SVM model, and the following 20 points are employed for testing. The prediction results are shown in Figure 21.
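The way the training samples are built from the feature series is not spelled out here. The sketch below assumes a standard delay embedding with one-step-ahead targets, using the embedding dimension 10, delay time 2, and the 130/20 split given in the text; the function name `make_samples` and the placeholder series are hypothetical.

```python
def make_samples(series, dim, tau):
    """Build (input vector, next value) pairs from a scalar series.

    Each input is (x[t], x[t + tau], ..., x[t + (dim - 1) * tau]) and the
    target is the value one step after the last input component. The
    one-step-ahead target is an assumption of this sketch.
    """
    span = (dim - 1) * tau
    inputs, targets = [], []
    for start in range(len(series) - span - 1):
        inputs.append([series[start + k * tau] for k in range(dim)])
        targets.append(series[start + span + 1])
    return inputs, targets


# Placeholder series standing in for the 150 fused feature values.
series = [0.01 * i for i in range(150)]
train, test = series[:130], series[130:150]
X_train, y_train = make_samples(train, dim=10, tau=2)
```

The resulting `X_train` and `y_train` could then be passed to any support vector regression implementation with the parameters selected by PSO.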

From Figure 21 we can see that the trend of the bearing running state can be predicted by the SVM model; the prediction shows a general upward trend similar to the actual status. In addition, the sudden change at points 8 to 12 (near point 150 in the original signal, as mentioned in Figure 17) is also captured. However, the results are still not precise. In order to improve the prediction, the SOM method is used to divide the prediction error into several districts; the iteration number of the SOM is 1000; the structure of the state classification matrix is . The results of the state division by SOM are shown in Table 6.

Based on the classification of the SOM model, the Markov state is divided into the following districts , , and ; the Markov model is used to improve the prediction error. The corrected results of the points 1–20 are shown in Table 7.
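The district boundaries are not legible in this text, and the exact correction formula is defined earlier in the paper, so the following is only a sketch of the general idea: estimate a state transition matrix from the sequence of error states, pick the most probable next state, and shift the SVM prediction by the midpoint of that state's error interval. The additive correction, the interval values, and the state sequence are all hypothetical.

```python
def transition_matrix(states, n_states):
    """Row-normalized counts of transitions between consecutive states."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def correct(prediction, current_state, matrix, intervals):
    """Add the midpoint of the most probable next error interval.

    Assumes error = actual - predicted, so the correction is additive.
    """
    row = matrix[current_state]
    next_state = max(range(len(row)), key=row.__getitem__)
    lower, upper = intervals[next_state]
    return prediction + 0.5 * (lower + upper)


# Hypothetical error states and error intervals for illustration only.
states = [0, 1, 2, 2, 1, 2, 0, 1, 2]
intervals = [(-0.03, -0.01), (-0.01, 0.01), (0.01, 0.03)]
P = transition_matrix(states, 3)
corrected = correct(0.50, 1, P, intervals)
```

With these illustrative values, state 1 is always followed by state 2, so the prediction is shifted upward by the midpoint of the third interval.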

From Table 7, we can see that the Markov model makes the results more precise.

The validation and the practical application show that the proposed method can predict the future status of the bearing, which is necessary for planning maintenance and reducing the risk of unexpected accidents.

#### 6. Conclusions

(1) The time domain and time-frequency domain methods are used to extract the original features from the mass vibration data. In order to reduce the dimension of the original features and remove their superfluous information, the multifeature fusion technique LTSA is used to fuse the original features and reduce the dimension.

(2) The proposed SVM model is used to achieve bearing running state prediction. The proposed approach is validated by real-world vibration signals. The results show that the proposed methodology has high accuracy and is effective for bearing running state prediction.

(3) This research gives an example of combined approaches for bearing running state prediction. The analysis and validation show that the proposed method makes good use of the advantages of each part and achieves high recognition accuracy and efficiency.

(4) As the redundancy increases, the complexity of computation increases as well. This is one of the main shortcomings of the proposed method, which will be explored in the future.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

The research is supported by the Natural Science Foundation Project of CQ cstc2013jcyjA0896, National Nature Science Foundation of Chongqing, China (no. 2010BB2276), Fundamental Research Funds for the Central Universities (Project no. 2013G1502053), and Natural Science Foundation Project of CQ CSTC2011jjjq0006. The authors are grateful to the anonymous reviewers for their helpful comments and constructive suggestions.

#### References

- P. L. Zhang, B. Li, and S. S. Mi, “Bearing fault detection using multi-scale fractal dimensions based on morphological covers,” *Shock and Vibration*, vol. 19, no. 6, pp. 1373–1383, 2012.
- S. Dong and T. Luo, “Bearing degradation process prediction based on the PCA and optimized LS-SVM model,” *Measurement*, vol. 46, pp. 3143–3152, 2013.
- L. L. Jiang, Y. L. Liu, X. J. Li, and A. Chen, “Degradation assessment and fault diagnosis for roller bearing based on AR model and fuzzy cluster analysis,” *Shock and Vibration*, vol. 18, no. 1-2, pp. 127–137, 2011.
- J. B. Yu, “Bearing performance degradation assessment using locality preserving projections and Gaussian mixture models,” *Mechanical Systems and Signal Processing*, vol. 25, no. 7, pp. 2573–2588, 2011.
- J. H. Yan, C. Z. Guo, and X. Wang, “A dynamic multi-scale Markov model based methodology for remaining life prediction,” *Mechanical Systems and Signal Processing*, vol. 25, no. 4, pp. 1364–1376, 2011.
- H. Ocak, K. A. Loparo, and F. M. Discenzo, “Online tracking of bearing wear using wavelet packet decomposition and probabilistic modeling: a method for bearing prognostics,” *Journal of Sound and Vibration*, vol. 302, no. 4-5, pp. 951–961, 2007.
- C. Sun, Z. S. Zhang, and Z. J. He, “Research on bearing life prediction based on support vector machine and its application,” *Journal of Physics*, vol. 305, no. 1, Article ID 012028, 2011.
- J. Wang, G. H. Xu, Q. Zhang, and L. Liang, “Application of improved morphological filter to the extraction of impulsive attenuation signals,” *Mechanical Systems and Signal Processing*, vol. 23, no. 1, pp. 236–245, 2009.
- Y. B. Zhan and J. P. Yin, “Robust local tangent space alignment via iterative weighted PCA,” *Neurocomputing*, vol. 74, no. 11, pp. 1985–1993, 2011.
- J. Wang, W. Jiang, and J. Gou, “Extended local tangent space alignment for classification,” *Neurocomputing*, vol. 77, no. 1, pp. 261–266, 2012.
- S. Dong, B. Tang, and R. Chen, “Bearing running state recognition based on non-extensive wavelet feature scale entropy and support vector machine,” *Measurement*, vol. 46, no. 10, pp. 4189–4199, 2013.
- Y. Qin, B. P. Tang, and J. X. Wang, “Higher-density dyadic wavelet transform and its application,” *Mechanical Systems and Signal Processing*, vol. 24, no. 3, pp. 823–834, 2010.
- A. Moosavian, H. Ahmadi, and A. Tabatabaeefar, “Comparison of two classifiers; K-nearest neighbor and artificial neural network, for fault diagnosis on a main engine journal-bearing,” *Shock and Vibration*, vol. 20, no. 2, pp. 263–272, 2013.
- T. Ghidini and C. Dalle Donne, “Fatigue life predictions using fracture mechanics methods,” *Engineering Fracture Mechanics*, vol. 76, no. 1, pp. 134–148, 2009.
- S. Marble and B. P. Morton, “Predicting the remaining life of propulsion system bearings,” in *Proceedings of the 2006 IEEE Aerospace Conference*, Big Sky, Mont, USA, March 2006.
- Z. G. Tian, L. N. Wong, and N. M. Safaei, “A neural network approach for remaining useful life prediction utilizing both failure and suspension histories,” *Mechanical Systems and Signal Processing*, vol. 24, no. 5, pp. 1542–1555, 2010.
- W. Caesarendra, A. Widodo, and B. Yang, “Combination of probability approach and support vector machine towards machine health prognostics,” *Probabilistic Engineering Mechanics*, vol. 26, no. 2, pp. 165–173, 2011.
- J. Lee, J. Ni, D. Djurdjanovic, H. Qiu, and H. Liao, “Intelligent prognostics tools and e-maintenance,” *Computers in Industry*, vol. 57, no. 6, pp. 476–489, 2006.
- R. Q. Huang, L. F. Xi, X. L. Li, C. Richard Liu, H. Qiu, and J. Lee, “Residual life predictions for ball bearings based on self-organizing map and back propagation neural network methods,” *Mechanical Systems and Signal Processing*, vol. 21, no. 1, pp. 193–207, 2007.
- C. W. Fei and G. C. Bai, “Wavelet correlation feature scale entropy and fuzzy support vector machine approach for aeroengine whole-body vibration fault diagnosis,” *Shock and Vibration*, vol. 20, no. 2, pp. 341–349, 2013.
- P. C. Gonçalves, A. R. Fioravanti, and J. C. Geromel, “Markov jump linear systems and filtering through network transmitted measurements,” *Signal Processing*, vol. 90, no. 10, pp. 2842–2850, 2010.
- A. N. Jiang, S. Y. Wang, and S. L. Tang, “Feedback analysis of tunnel construction using a hybrid arithmetic based on Support Vector Machine and Particle Swarm Optimisation,” *Automation in Construction*, vol. 20, no. 4, pp. 482–489, 2011.
- V. T. Tran, B. Yang, and A. C. C. Tan, “Multi-step ahead direct prediction for the machine condition prognosis using regression trees and neuro-fuzzy systems,” *Expert Systems with Applications*, vol. 36, no. 5, pp. 9378–9387, 2009.
- M. Marcellino, J. H. Stock, and M. W. Watson, “A comparison of direct and iterated multistep AR methods for forecasting macroeconomic time series,” *Journal of Econometrics*, vol. 135, no. 1-2, pp. 499–526, 2006.
- L. Cao, “Practical method for determining the minimum embedding dimension of a scalar time series,” *Physica D*, vol. 110, no. 1-2, pp. 43–50, 1997.
- F. Takens, “Detecting strange attractors in turbulence,” in *Dynamical Systems and Turbulence*, D. A. Rand and L. S. Young, Eds., pp. 366–381, Springer, New York, NY, USA, 1981.
- A. M. Fraser and H. L. Swinney, “Independent coordinates for strange attractors from mutual information,” *Physical Review A*, vol. 33, no. 2, pp. 1134–1140, 1986.
- T. Kohonen, *Self-Organizing Maps*, Springer, Berlin, Germany, 1995.
- J. Lee, H. Qiu, G. Yu, and J. Lin, “Rexnord Technical Services, “Bearing Data Set”, IMS, University of Cincinnati, NASA Ames Prognostics Data Repository,” NASA Ames, Moffett Field, CA, http://ti.arc.nasa.gov/project/prognostic-data-repository/.
- H. Mohkami, R. Hooshmand, and A. Khodabakhshian, “Fuzzy optimal placement of capacitors in the presence of nonlinear loads in unbalanced distribution networks using BF-PSO algorithm,” *Applied Soft Computing Journal*, vol. 11, no. 4, pp. 3634–3642, 2011.
- X. D. Zhang, R. G. Xu, C. Kwan, S. Y. Liang, Q. Xie, and L. Haynes, “An integrated approach to bearing fault diagnostics and prognostics,” in *Proceedings of the American Control Conference (ACC '05)*, pp. 2750–2755, Portland, Ore, USA, June 2005.
- C. Sun, Z. S. Zhang, and Z. J. He, “Research on bearing life prediction based on support vector machine and its application,” *Journal of Physics*, vol. 305, no. 1, Article ID 012028, pp. 1–9, 2011.