#### Abstract

The fault diagnosis process is essentially a class discrimination problem. However, traditional class discrimination methods such as SVM and ANN fail to capitalize on the interactions among the feature variables. Variable predictive model-based class discrimination (VPMCD) can make adequate use of these interactions, but feature extraction and selection greatly affect the accuracy and stability of the VPMCD classifier. To address the nonstationary characteristics of vibration signals from rotating machinery with local faults, a singular value decomposition (SVD) technique based on local characteristic-scale decomposition (LCD) was developed to extract the feature variables. Subsequently, by combining an artificial neural network (ANN) with the mean impact value (MIV), a feature selection approach called ANN-MIV was proposed to select more suitable feature variables as the input vector of the VPMCD classifier. Finally, a novel fault diagnosis model based on LCD-SVD-ANN-MIV and VPMCD is proposed and validated by an experimental application to roller bearing fault diagnosis. The results show that the proposed method is effective and noise tolerant. The comparative results demonstrate that the proposed method is superior to the other methods in diagnosis speed, diagnosis success rate, and diagnosis stability.

#### 1. Introduction

Fault diagnosis is essentially a class discrimination problem. Various methods have been applied to build classifiers for fault diagnosis [1–5]. However, these existing methods have intrinsic limitations. To overcome these shortcomings, variable predictive model-based class discrimination (VPMCD) was presented by Raghuraj and Lakshminarayanan as a new multivariate classification approach [6–8]. Recently, our team has carried out a series of studies on the application of VPMCD to fault diagnosis, and the results show that VPMCD is a superior solution for fault diagnosis of rotating machinery with small-sample and multiclass problems [9–12].

It is known that there are interactions among feature variables and that the VPMCD method can make adequate use of them. However, in applying VPMCD to fault diagnosis of rotating machinery, we found that feature extraction and selection have a great influence on the performance of the VPMCD classifier. As a new time-frequency signal processing method, the local characteristic-scale decomposition (LCD) method can decompose a nonstationary signal into several intrinsic scale components (ISCs). Many applications show that LCD is superior to empirical mode decomposition (EMD) [13–15] in running time, in reducing the end effect, and in relieving mode mixing [16, 17]. On the other hand, the singular value decomposition (SVD) technique based on phase space reconstruction theory has a good analytical ability for nonlinear and nonstationary time series and has been widely used in fault diagnosis for rotating machinery [18, 19]. However, it is difficult to determine the optimal reconstruction parameters for the SVD technique [20]. To solve this problem, the LCD method is applied to decompose the original vibration signal into a number of ISCs, which are used to construct the initial matrix [21]; the feature variables can then be obtained by the SVD technique.

After feature extraction, we need to answer the following questions. Which feature variables have interrelationships that describe the system's dynamic characteristics more effectively? How can more representative feature variables be selected to improve the performance of the VPMCD classifier? In many practical applications, operators lack clear theoretical guidance, so they cannot select better input features to design a better VPMCD classifier. In this case, the accuracy of the VPMCD classifier decreases, which seriously affects the accuracy of fault diagnosis. In other words, feature selection is critical to designing a VPMCD classifier with better performance. The mean impact value can sensitively capture the interaction between the independent variable and the dependent variable [22]. Combining an artificial neural network (ANN) with the mean impact value (MIV), we propose the ANN-MIV approach to choose more suitable features as VPMCD input features. Finally, a novel fault diagnosis model based on LCD-SVD-ANN-MIV and VPMCD is proposed and validated by a practical experiment on roller bearing fault diagnosis.

The rest of this paper is organized as follows. The VPMCD method is introduced in Section 2. The LCD-SVD technique is given in Section 3. The feature selection approach based on ANN-MIV is described in Section 4. A fault diagnosis model based on LCD-SVD-ANN-MIV and VPMCD is proposed in Section 5. The proposed model is applied to roller bearing fault diagnosis for experimental validation in Section 6. Conclusions are drawn in Section 7.

#### 2. VPMCD

##### 2.1. Variable Predictive Model (VPM)

It is known that different system behaviors are quantified by measurable features and the interactions among them. For mechanical fault diagnosis, there exist linear or nonlinear associations among the features extracted from the vibration signals under different work conditions. In VPMCD, variable predictive models (VPMs) are defined to describe linear/nonlinear and direct/indirect quantitative relationships among the features using one of four model types: Linear, Linear Interaction, Quadratic Interaction, and Pure Quadratic.
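As a hedged reference, the four model types (Linear, Linear Interaction, Quadratic Interaction, and Pure Quadratic) are commonly written in the VPMCD literature as shown below for a predicted variable $X_i$ and predictor order $r$. The coefficient symbols $b_0$, $b_j$, $b_{jj}$, and $b_{jk}$ are notational assumptions, and the predictors $X_j$ are taken from the variables other than $X_i$.

$$
\begin{aligned}
\text{Linear (L):}\quad & X_i = b_0 + \sum_{j=1}^{r} b_j X_j \\
\text{Linear Interaction (LI):}\quad & X_i = b_0 + \sum_{j=1}^{r} b_j X_j + \sum_{j=1}^{r} \sum_{k=j+1}^{r} b_{jk} X_j X_k \\
\text{Quadratic Interaction (QI):}\quad & X_i = b_0 + \sum_{j=1}^{r} b_j X_j + \sum_{j=1}^{r} b_{jj} X_j^2 + \sum_{j=1}^{r} \sum_{k=j+1}^{r} b_{jk} X_j X_k \\
\text{Pure Quadratic (Q):}\quad & X_i = b_0 + \sum_{j=1}^{r} b_j X_j + \sum_{j=1}^{r} b_{jj} X_j^2
\end{aligned}
$$

For instance, with $r = 2$ and predictors $X_1$ and $X_2$, the Quadratic Interaction model for $X_3$ has six coefficients: $X_3 = b_0 + b_1 X_1 + b_2 X_2 + b_{11} X_1^2 + b_{22} X_2^2 + b_{12} X_1 X_2$.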

Suppose that there are $g$ classes and $p$ different variables in each failure class, which can be expressed by a feature vector $X = [X_1, X_2, \ldots, X_p]$. After selecting one of the above model types and the number of other variables used for prediction (referred to as the predictor order $r$, with $r \le p - 1$), a model $\mathrm{VPM}_i$ for any variable $X_i$ can be fitted using sample measurements of the other variables $X_j$ ($j \ne i$). In other words, $\mathrm{VPM}_i^k$ defines variable $X_i$ as a function of the best set of other variables of the same class $k$. One way to determine the set of coefficient values "$b$" is to solve the ordinary least squares problem $D B = X_i$, that is, $B = (D^T D)^{-1} D^T X_i$, where $B$ is the coefficient vector and $D$ is the design matrix containing the polynomial values of the predictor variable set. Note that the number of possible models is $C_{p-1}^{r}$; the one with the minimum prediction error during validation is selected as the best $\mathrm{VPM}_i$ for variable $X_i$, and the collection of these best $\mathrm{VPM}_i$ is regarded as the characteristic model representing the intervariable associations.

If there are $g$ classes and the structure of associations among the same set of variables is different in each class, then $\mathrm{VPM}_i^k$ ($k = 1, 2, \ldots, g$) can be developed during supervised training using the known dataset of feature variables, so the distinct VPMs can be used to identify the class of an unknown sample.

##### 2.2. Classification Algorithm

Taking a fault diagnosis problem as an example, the VPMCD algorithm includes two steps. The first step is to train the VPMs of each class; the second step is to repredict the feature variables of a sample with each of these VPMs and then classify the sample. The detailed procedure is as follows.

*Step 1 (VPMs training procedure).* (1) Collect samples from each of the $g$ fault classes.

(2) Extract the feature vector $X = [X_1, X_2, \ldots, X_p]$ for each class.

(3) For any predicted variable $X_i$ in a given class, choose the appropriate model type, predictor variables, and predictor order $r$, and establish $\mathrm{VPM}_i$ using the observation samples belonging to this class.

(4) For the classification problem with $g$ classes, establish $\mathrm{VPM}_i^k$ for $i = 1, 2, \ldots, p$ and $k = 1, 2, \ldots, g$.

*Step 2 (classification procedure).* (1) For an unknown sample, extract the feature vector $X = [X_1, X_2, \ldots, X_p]$.

(2) Repredict each feature variable $X_i$ through the $\mathrm{VPM}_i^k$ of each class $k$ to obtain its prediction value $\hat{X}_i^k$.

(3) For each class $k$, calculate the sum of squared prediction errors $\mathrm{SSE}^k$ over all feature variables.

(4) Classify the unknown sample into the class with the minimum sum of squared prediction errors:

$$\mathrm{SSE}^k = \sum_{i=1}^{p} \left( X_i - \hat{X}_i^k \right)^2, \qquad \hat{k} = \arg\min_{k} \mathrm{SSE}^k$$
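The two steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`design_matrix`, `fit_vpms`, `classify`) and the model codes `"L"`, `"LI"`, `"QI"`, `"Q"` are hypothetical, and the best predictor subset is chosen by training error here, whereas the text selects it by validation error.

```python
from itertools import combinations

import numpy as np


def design_matrix(Xp, model="L"):
    """Build the design matrix D for a predictor block Xp of shape (n, r)."""
    n, r = Xp.shape
    cols = [np.ones(n)] + [Xp[:, j] for j in range(r)]
    if model in ("Q", "QI"):          # squared terms
        cols += [Xp[:, j] ** 2 for j in range(r)]
    if model in ("LI", "QI"):         # pairwise interaction terms
        cols += [Xp[:, j] * Xp[:, k] for j, k in combinations(range(r), 2)]
    return np.column_stack(cols)


def fit_vpms(X, model="L", r=1):
    """Step 1: fit one VPM per variable for a single class.

    Assumption: the best predictor subset is picked by training SSE.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    if not 1 <= r <= p - 1:
        raise ValueError("predictor order r must satisfy 1 <= r <= p - 1")
    vpms = []
    for i in range(p):
        others = [j for j in range(p) if j != i]
        best = None
        for subset in combinations(others, r):
            D = design_matrix(X[:, list(subset)], model)
            b, *_ = np.linalg.lstsq(D, X[:, i], rcond=None)  # ordinary least squares
            sse = float(np.sum((D @ b - X[:, i]) ** 2))
            if best is None or sse < best[0]:
                best = (sse, subset, b)
        vpms.append((best[1], best[2]))
    return vpms


def class_sse(x, vpms, model="L"):
    """Sum of squared prediction errors of sample x under one class's VPMs."""
    x = np.asarray(x, dtype=float)
    total = 0.0
    for i, (subset, b) in enumerate(vpms):
        D = design_matrix(x[list(subset)][None, :], model)
        total += float((x[i] - (D @ b)[0]) ** 2)
    return total


def classify(x, class_vpms, model="L"):
    """Step 2: assign x to the class with the minimum SSE."""
    errors = {k: class_sse(x, v, model) for k, v in class_vpms.items()}
    return min(errors, key=errors.get)
```

A caller would fit one set of VPMs per class, for example `models = {"normal": fit_vpms(X_normal, "Q", 2), "inner": fit_vpms(X_inner, "Q", 2)}`, and then call `classify(x, models, "Q")`; the class labels and array names here are placeholders.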

#### 3. LCD-SVD Technique

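A trajectory matrix $A \in \mathbb{R}^{m \times n}$ can be decomposed into a series of mutually orthogonal, unit-rank, elementary matrices by using SVD; that is, $A = U \Sigma V^T$, where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthogonal matrices and $\Sigma$ is a diagonal matrix. Let $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_q$ be the nonzero diagonal elements arranged in decreasing order; the $\sigma_i$ are called the singular values of matrix $A$, namely, the singular spectrum, where $q = \operatorname{rank}(A) \le \min(m, n)$.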

It is known that the reconstruction parameters, such as lag time and embedding dimension, affect the result of the SVD method, and that these parameters are difficult to determine. To solve this problem, the LCD-SVD technique is presented. The LCD method is introduced as follows.

LCD method can decompose a complex multicomponent signal into series of intrinsic scale components (ISCs), in which each ISC is a monocomponent signal whose instantaneous frequency has specific physical meaning.

That is, the original signal $x(t)$ is decomposed into

$$x(t) = \sum_{i=1}^{n} \mathrm{ISC}_i(t) + r_n(t)$$

where $\mathrm{ISC}_i(t)$ is the $i$th ISC and $r_n(t)$ is the residual component. Since the baseline functions in the LCD method are obtained by linear transformation of the signal, the LCD method has obvious advantages over the EMD method. Details are given in [16].
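The feature extraction step of LCD-SVD can be sketched as below. The sketch assumes the ISCs have already been produced by some LCD implementation, which is not shown here, and that they are stacked as rows of the initial matrix; the function name `lcd_svd_features` is hypothetical.

```python
import numpy as np


def lcd_svd_features(iscs):
    """Return the singular values (in decreasing order) of the ISC matrix.

    iscs: array-like of shape (n_iscs, n_samples), one ISC per row.
    Assumption: the ISCs come from an external LCD routine.
    """
    A = np.asarray(iscs, dtype=float)
    if A.ndim != 2:
        raise ValueError("iscs must be a 2-D array with one ISC per row")
    # compute_uv=False returns only the singular values, sorted descending
    return np.linalg.svd(A, compute_uv=False)
```

Because the matrix is built directly from the ISCs, no lag time or embedding dimension has to be chosen, which is the point made in the text about avoiding reconstruction parameters.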

#### 4. Feature Selection Approach Based on ANN-MIV

MIV is an evaluation index showing how much the independent variables influence the dependent variable. Its absolute value represents the relative importance of each independent variable. In combination with ANN, we use MIV to rank the independent variables and select more representative features. The ANN-MIV algorithm is described as follows.

To elaborate the algorithm, $g$ is used to represent the number of classes and $n$ is used to represent the number of samples of each class in the training sets. $k$ is used to represent a given class, and its value ranges from 1 to $g$.

*Step 1.* $p$-dimensional features are extracted from the training sample sets of the different classes.

*Step 2.* For a given class $k$, the feature variable $X_i$ is used as the dependent variable and the remaining feature variables $X_j$ ($j \ne i$) are used as the independent variables. Then we train an ANN model with the training sample set of the $k$th class. Note that the input size of the ANN is equal to the dimension of the independent variables and the output size is equal to one.

*Step 3.* The $s$th sample from the training set of the $k$th class is selected and the simulated result is obtained via the trained ANN. Then the value of the $j$th feature variable is varied by ±10% to constitute a pair of new feature variables and form a pair of new samples.

*Step 4.* The pair of new samples is tested, and a pair of simulation outputs, denoted $y_{+}$ and $y_{-}$, is obtained by feeding the samples to the corresponding ANN. Then the difference between the pair of simulation outputs is calculated as follows:

$$\mathrm{IV}_{ij}^{k}(s) = y_{+} - y_{-}$$

where $\mathrm{IV}_{ij}^{k}(s)$ represents how much the $j$th independent feature variable affects the dependent feature variable $X_i$ in class $k$. Here, the value of the difference is called the impact value (IV) of the corresponding dependent variable when the given independent variable changes.

*Step 5.* For the $k$th class, the process is repeated for the remaining samples. We can calculate $n$ impact values and their mean, which is called the mean impact value (MIV):

$$\mathrm{MIV}_{ij}^{k} = \frac{1}{n} \sum_{s=1}^{n} \mathrm{IV}_{ij}^{k}(s)$$

*Step 6.* The process from Step 2 to Step 5 is repeated for the other feature variables of the $k$th class, and the series of corresponding MIVs can be calculated. For example, considering the feature $X_1$ as the dependent feature variable, its MIVs can be expressed as $[\mathrm{MIV}_{12}^{k}, \mathrm{MIV}_{13}^{k}, \ldots, \mathrm{MIV}_{1p}^{k}]$.

*Step 7.* The value of $\mathrm{MIV}_{ij}^{k}$ determines which feature variables have a more distinct interaction with a given $X_i$. Thus, we rank the MIVs and select the features with larger MIVs as more suitable variables to form the VPM for the VPMCD classifier. Finally, the process is repeated for the other classes, so that the corresponding MIV matrix can be calculated and the VPMCD classifiers for each class can be obtained.
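Steps 3 to 5 and the ranking in Step 7 can be sketched as follows. This is a hedged illustration, not the authors' code: the trained ANN is abstracted as any callable `predict` that maps an array of shape (n_samples, n_inputs) to n_samples outputs, and the names `mean_impact_values` and `select_features` are hypothetical. The ±10% perturbation follows Step 3, and the default threshold of 0.1 follows the value quoted in Section 6.2.

```python
import numpy as np


def mean_impact_values(predict, X, delta=0.10):
    """Compute one MIV per independent variable (column of X).

    predict: callable standing in for the trained ANN (assumption).
    X: array of shape (n_samples, n_inputs) of independent variables.
    """
    X = np.asarray(X, dtype=float)
    n_inputs = X.shape[1]
    miv = np.empty(n_inputs)
    for j in range(n_inputs):
        up, down = X.copy(), X.copy()
        up[:, j] *= 1.0 + delta      # feature j increased by 10%
        down[:, j] *= 1.0 - delta    # feature j decreased by 10%
        impact = np.asarray(predict(up)) - np.asarray(predict(down))  # IV per sample
        miv[j] = impact.mean()       # mean over all samples gives the MIV
    return miv


def select_features(miv, threshold=0.1):
    """Indices of inputs whose absolute MIV reaches the threshold."""
    return [j for j, v in enumerate(miv) if abs(v) >= threshold]
```

Any regression model with a vectorized prediction function can be passed as `predict`; repeating the call with each feature in turn as the dependent variable yields the rows of the MIV matrix described in Steps 6 and 7.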

#### 5. Fault Diagnosis Model

A novel fault diagnosis model for rotating machinery based on LCD-SVD-ANN-MIV and VPMCD is proposed in this paper. Firstly, the LCD-SVD technique is used for fault feature extraction. Subsequently, more suitable features are selected by the ANN-MIV approach to form the feature vector. Lastly, the VPMCD method is used to design the classifier that identifies the work condition. The flow chart of the proposed fault diagnosis model is given in Figure 1.

#### 6. Application to Fault Diagnosis for Roller Bearing

##### 6.1. Datasets Collection

All datasets and system descriptions of the roller bearing were downloaded from the website of Case Western Reserve University. The vibration signals were acquired by an accelerometer mounted on the bearing housing at the drive end of the motor. The bearing at the drive end is a 6205-2RS SKF, whose parameters are given in Table 1. The digital datasets for roller bearing faults were collected by a 16-channel data recorder at a sampling frequency of 12 kHz. When local defects occur, high frequency resonances are strongly excited and appear in the vibration signal. Groups of vibration signals were sampled under each running condition and together form the full dataset. Single point faults were introduced using electro-discharge machining. The test was carried out with a 2 hp load at a shaft rotating speed of 1750 rpm. Seven work conditions were tested: normal condition, inner-race fault with a fault diameter of 0.007 inches, outer-race fault with a fault diameter of 0.007 inches, ball fault with a fault diameter of 0.007 inches, inner-race fault with a fault diameter of 0.021 inches, outer-race fault with a fault diameter of 0.021 inches, and ball fault with a fault diameter of 0.021 inches. Figure 2 shows the time domain waveforms for the seven work conditions of the roller bearing.

Figure 2: Time domain waveforms for the seven work conditions of the roller bearing, panels (a) to (g).

##### 6.2. Feature Extraction and Feature Selection

Using the LCD method described above, the vibration signal of the roller bearing was decomposed into about 10 ISCs, whose frequency bands range from high frequency to low frequency. Figure 3 shows the ISCs of a vibration signal for a bearing fault with a fault diameter of 0.021 inches.

From Figure 3, it can be seen that the first several ISCs are high frequency components that contain the main fault information of the roller bearing. Furthermore, a cross-correlation analysis between the $i$th ISC and the original signal was carried out, and the results show that the first eight ISCs have larger cross-correlation coefficients, as seen in Table 2. Therefore, the first eight ISCs were used to construct the initial feature vector matrix, from which the feature values were obtained using the LCD-SVD technique. Subsequently, we used the ANN-MIV approach to select more suitable features for designing the VPMCD classifier. Through experiment, we found that the interaction among the feature variables is clearly deterministic, and the corresponding features are suitable, when the value of MIV is greater than or equal to 0.1. Such features are marked with "☆." The results of feature selection based on ANN-MIV are given in Table 3; the last column of Table 3 counts how many times each feature was selected. Some features are selected more often than the others, which shows that they are more suitable for expressing the interactions among the feature variables. Therefore, we adopt these features to build the feature vector used as the input of the VPMCD classifier, shown in Table 4.
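The screening of ISCs by their correlation with the original signal can be sketched as below. This is an illustration under assumptions: the cross-correlation coefficient is taken to be the zero-lag Pearson coefficient, the function name `select_iscs_by_correlation` is hypothetical, and the threshold is left to the caller because no threshold value is available in the text.

```python
import numpy as np


def select_iscs_by_correlation(iscs, signal, threshold):
    """Return (kept_indices, coefficients) for ISCs correlated with the signal.

    iscs: array-like of shape (n_iscs, n_samples), one ISC per row.
    signal: the original vibration signal, length n_samples.
    threshold: caller-supplied cutoff on the absolute coefficient (assumption).
    """
    iscs = np.asarray(iscs, dtype=float)
    signal = np.asarray(signal, dtype=float)
    # Zero-lag Pearson correlation between each ISC and the original signal
    coeffs = np.array([abs(np.corrcoef(c, signal)[0, 1]) for c in iscs])
    keep = np.flatnonzero(coeffs >= threshold)
    return keep, coeffs
```

The rows indexed by `keep` would then form the initial matrix passed to the SVD step.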

##### 6.3. VPMCD Classifier Design and Experimental Results

In this application, ten VPMs in total were used to perform the initial tests for the seven work conditions by combining all four model types with different predictor orders $r$: three Linear VPMs, two Linear Interaction VPMs, two Quadratic Interaction VPMs, and three Pure Quadratic VPMs. Table 5 gives the test results for the fault diagnosis of the roller bearing in this study. These results are averages over 100 tests. As can be seen, the accuracies vary from 94% to 100%, and the time cost is under 0.2 seconds. These results indicate that all classifiers based on the LCD-SVD-ANN-MIV and VPMCD algorithms not only have high self-consistency but also run quickly. Moreover, the Linear Interaction VPM, Quadratic Interaction VPM, and Pure Quadratic VPM have better self-consistency than the Linear VPM in this application, which can be attributed to the nonlinear interaction among the singular values of the initial feature vector matrix constructed from the ISCs. Singular values reflect the energy distribution of the signal in the frequency domain. Each ISC occupies a certain frequency band, and their energy distribution is nonlinear, which is consistent with the above result.

After the detailed comparative analysis in Table 5, one Pure Quadratic VPM configuration is selected because it gives the highest self-consistency accuracy with only five acquired training samples, which makes it more suitable than the other models when fault samples are scarce. Therefore, this Pure Quadratic VPM is used to build the corresponding $\mathrm{VPM}_i$ for each feature variable $X_i$ of each class, with the general form $X_i = b_0 + \sum_{j=1}^{r} b_j X_j + \sum_{j=1}^{r} b_{jj} X_j^2$.

To reduce the influence of chance on the testing accuracy, a Monte Carlo test approach is applied. In this experiment, all samples under each running condition were randomly divided into three groups for training, validating, and testing. Using the VPMCD algorithm described in Section 2.2, a VPM matrix containing $\mathrm{VPM}_i^k$ was established during the training process to express the interactions among the feature variables for the different running conditions. The mathematical expression for the inner-race fault condition is shown in Table 6. The numbers of training and validating samples were fixed for each class. Subsequently, VPMCD classification was used to identify the running condition of the roller bearing with the testing samples of each class. The results show that the diagnosis accuracy is 100%, as seen in the last row of Table 8.
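The repeated random division into training, validating, and testing groups can be sketched as follows. The helper names `random_split` and `monte_carlo_mean` are hypothetical, `evaluate` stands for any user-supplied function that trains on the given indices and returns an accuracy, and the group sizes are left as parameters because their values are not recoverable from the text. The default of 100 trials mirrors the averaging over 100 tests mentioned in Section 6.3.

```python
import numpy as np


def random_split(n_samples, n_train, n_valid, rng):
    """Randomly partition sample indices into train, validation, and test groups."""
    if n_train + n_valid > n_samples:
        raise ValueError("n_train + n_valid must not exceed n_samples")
    idx = rng.permutation(n_samples)
    return idx[:n_train], idx[n_train:n_train + n_valid], idx[n_train + n_valid:]


def monte_carlo_mean(evaluate, n_samples, n_train, n_valid, n_trials=100, seed=0):
    """Average evaluate(train_idx, valid_idx, test_idx) over random splits."""
    rng = np.random.default_rng(seed)
    scores = [
        evaluate(*random_split(n_samples, n_train, n_valid, rng))
        for _ in range(n_trials)
    ]
    return float(np.mean(scores))
```

Averaging over many random splits reduces the dependence of the reported accuracy on any single lucky or unlucky partition.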

##### 6.4. Classifier’s Sensitiveness to Noise

Since the experiment was performed without masking sources such as gear vibration, Gaussian noise was added to the original vibration signals. Noisy signals with SNR 10 dB and −20 dB were obtained for each original vibration signal. The average classification accuracy of the proposed model is given in Table 7. As seen in Table 7, LCD-SVD can reduce noise by selecting the main ISCs to construct the trajectory matrices for the SVD technique. The singular values obtained by the LCD-SVD and ANN-MIV techniques contain the main information of the vibration signal, so the different conditions can be detected even for noisy signals. The performance of the proposed fault diagnosis model remains high even for noisy signals.
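Adding Gaussian noise at a prescribed SNR can be sketched as below. The sketch assumes the usual definition $\mathrm{SNR}_{\mathrm{dB}} = 10 \log_{10}(P_{\mathrm{signal}} / P_{\mathrm{noise}})$ with power measured as the mean square of the samples; the function name `add_gaussian_noise` is hypothetical.

```python
import numpy as np


def add_gaussian_noise(signal, snr_db, rng=None):
    """Return signal plus zero-mean white Gaussian noise at the target SNR (dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    p_signal = np.mean(signal ** 2)               # mean-square signal power
    p_noise = p_signal / (10.0 ** (snr_db / 10))  # noise power for the target SNR
    noise = rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)
    return signal + noise
```

At 10 dB the noise power is one tenth of the signal power, while at −20 dB it is one hundred times the signal power, which is why the second case is a much harder test of the classifier.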

##### 6.5. Comparisons with Recent Research Works

To validate the proposed model, Table 8 compares it with several recently published studies. All of these studies use the same roller bearing dataset.

In [18], two methods based on singular spectrum analysis (SSA) and BP-ANN were used for feature extraction and class discrimination. The combination of SSA with BP-ANN was found to be suitable for roller bearing fault diagnosis. However, the procedures for selecting both the window length and the singular values are complex, and the chosen values greatly affect the accuracy, especially for noisy signals. In addition, a neural network is a slower algorithm because its computational load depends on the number of classes, the number of variables, and the data size.

In [23], multiscale entropy was extracted as the fault feature and a support vector machine (SVM) was employed to classify the fault location. However, the classification rate drops from 97.42% to 73.94% for noisy signals, which means that the method is not noise tolerant. Hence, a proper denoising approach should be applied to improve the performance of the method.

In [24], the fault characteristic frequency in the envelope spectrum was used as the feature, and a new optimization algorithm, called the artificial chemical reaction optimization algorithm, was used to optimize the kernel parameters of the basic SVM. The experimental results show that the algorithm is effective and fast. However, the multiclass classifier was not well established because the basic SVM is binary and the design of a multiclass classifier requires heavy computation. Moreover, only a dataset with one fault size was studied, and fewer samples were tested. The effects of noise were also not considered in that work.

As seen in Table 8, all the above-mentioned methods achieve high diagnosis accuracy, but the proposed model does not exhibit their limitations, such as complex parameter tuning, heavy computation, and accuracy that drops with noise. The proposed method can diagnose the faults of the roller bearing effectively and stably in a short time. In addition, a comparison among the last four methods shows that fault feature extraction based on the LCD-SVD technique is effective and that the accuracy increases when LCD-SVD is combined with ANN-MIV as the feature selection approach. Furthermore, the VPMCD classifier works faster than the LSSVM classifier because it does not need parameter optimization.

#### 7. Conclusions

A novel fault diagnosis model was presented in this paper. Firstly, a new singular value decomposition technique based on local characteristic-scale decomposition, called the LCD-SVD technique, was introduced for roller bearing fault feature extraction. The LCD-SVD technique avoids the difficulty of selecting the reconstruction parameters, which affect the accuracy of the traditional SVD technique. Secondly, a feature selection approach based on ANN-MIV was proposed to choose more suitable feature variables as input features for the VPMCD classifier. Thirdly, a fault diagnosis model based on LCD-SVD-ANN-MIV and VPMCD was proposed. Lastly, the proposed model was applied to roller bearing fault diagnosis, the effect of noise on classification performance was studied, and a comparison with other methods was made. The investigation shows that the proposed model performs well for signals with a low SNR. The comparative results demonstrate that the proposed model is superior to the other methods in diagnosis speed, diagnosis success rate, and stability.

#### Competing Interests

The authors declare that they have no competing interests.

#### Acknowledgments

The authors would like to acknowledge the support from the Chinese National Science Foundation Grant (no. 51375152), the Cooperative Innovation Center for the Construction & Development of Dongting Lake Ecological Economic Zone (XJT2015351), the Construct Program of the Key Discipline in Hunan Province (Mechanical Design and Theory) (XJF201176), and the Cooperative Demonstration Base of Universities in Hunan "R&D and Industrialization of Rock Drilling Machines" (XJT2014239). They would also like to express their appreciation to Case Western Reserve University for offering the free bearing dataset.