Research Article  Open Access
A Novel Multimode Fault Classification Method Based on Deep Learning
Abstract
Owing to varying loads or changing environments, machinery often operates in multiple modes, and the data features contained in the observations change with the mode. Mode partition is therefore a fundamental step before fault classification. This paper proposes a multimode classification method based on deep learning that constructs a hierarchical DNN model whose first hierarchy is devised specifically for mode partition. In the second hierarchy, a separate DNN classification model is constructed for each mode to obtain more accurate fault classification results. To provide helpful information for predictive maintenance, an additional DNN is constructed in the third hierarchy to further classify a given fault in a given mode into several classes of different fault severity. Application to multimode fault classification of rolling bearings shows the effectiveness of the proposed method.
1. Introduction
Rolling bearings are pivotal components of rotating machines, which are widely used in large-scale automated industrial equipment. Mechanical failure caused by rolling bearings may lead to abnormal operation of the rotating machinery system, resulting in huge economic losses and even casualties [1–5]. Therefore, timely and precise fault classification is critical for bearing monitoring.
Methods for mechanical equipment fault classification can be divided into qualitative model-based methods, quantitative model-based methods, and data-driven methods [6, 7]. Qualitative and quantitative model-based methods require a precise mathematical model or a large amount of expert knowledge of the system, which inevitably limits their application in fault classification. Over the last two decades, data-driven methods have been widely used for fault detection in complex systems. Instead of requiring extensive prior knowledge, a data-driven approach detects faults using only the measured data of the complex system [8–11]. The most commonly used data-driven fault classification methods are based on statistical feature extraction or on machine learning. However, methods based on statistical feature extraction can only realize fault detection and are unable to realize fault classification. For fault classification, machine learning methods such as the Support Vector Machine (SVM) and the artificial neural network (ANN) are more suitable.
In mechanical system fault classification, vibration signals are usually used as the data source because the vibration spectrum is sensitive to equipment failure. Because mechanical equipment is non-stable, nonlinear, large-scale, high-dimensional, and polluted by noise, precise fault feature extraction, which is the most critical factor for the accuracy of mechanical equipment monitoring, is usually very difficult [12–14]. Some scholars have proposed feature extraction methods that combine signal processing with machine learning for fault classification of mechanical equipment. Widodo and Yang extracted frequency-domain features as the data source of an SVM to detect machinery faults [13]. For the case where the number of samples is small and the signals are nonstationary, Yu et al. proposed a bearing fault classification method that combines SVM and Empirical Mode Decomposition (EMD) [10]. Hu et al. extracted the energy of each wavelet packet transform (WPT) node as the pre-extracted feature to develop a combined WPT-SVM method for more accurate bearing fault classification [15]. Wang et al. also used WPT to extract nonstationary characteristics of the bearing vibration signal as the pre-extracted feature for an ANN [16]; the method uses the nonlinear classification ability and self-organizing ability of the ANN to classify and diagnose bearing faults. Yang and Tang proposed a method combining an expert system and a back propagation neural network (BPNN) [17], which exploits the advantages of both to detect bearing failure successfully. Since bearing vibration signals are susceptible to Gaussian noise, Jiang et al. used higher-order statistics as the feature vector of a BPNN to improve its performance in bearing fault classification [18].
However, SVM and BPNN share the shortcomings of shallow learning methods. SVM is inherently a binary classifier and is inefficient for multiclass classification, especially when the number of observation samples is very large; selecting an appropriate kernel function and scale parameter also usually requires a wealth of experience. ANN suffers from defects as well: (1) it converges slowly and can easily converge to a local optimum, and (2) it is ineffective in feature learning for complex nonlinear data, which usually results in poor classification accuracy. In summary, SVM and BPNN, as shallow learning methods, cannot adequately extract the data features contained in high-dimensional unsteady data [19]. As the load varies, a bearing can work in different steady states, which is called the “multimode” phenomenon. Current research on machine-learning-based classification has not taken the multimode problem into account.
For a multimode process, the data features of each mode are different [20], but current research on bearing fault classification usually treats the process as a single mode to simplify data processing, which leads to inaccurate classification because the extracted features are inaccurate [21–23]. Therefore, mode partition should be carried out first, so that fault features can be extracted accurately for each separate mode. Zhang et al. proposed an improved k-means clustering algorithm based on existing modal partition methods [20]. Song et al. studied how to distinguish stable modes from transition modes without knowing the number of modes in advance [24]. Zhao et al. separated multiple modalities according to a diversity analysis of the operational phases and established an online monitoring method along multiple batch directions [25]. Zhang et al. used a modal subspace separation method to deal with multimode monitoring problems [26]. By using the characteristics of the subspaces, different modes can be separated well, which makes more accurate multimode fault classification possible.
Unfortunately, existing mode partition and fault monitoring methods for multimode processes were each developed for a specific industrial process [20, 24–27], so a more universal method is required. Deep learning is a promising general-purpose feature extraction tool that has attracted wide attention from scholars in various fields [21, 28–30]. Compared with shallow learning, deep learning can handle feature extraction for nonlinear big data by constructing a deep network [31, 32]. Through the unsupervised layer-by-layer greedy training algorithm and BP-based global parameter fine-tuning, a deep neural network (DNN) can not only avoid the local optimum problem but also alleviate the limitations imposed by a small number of labeled samples and by poor generalization ability. The deep learning method was first proposed by Hinton and Salakhutdinov in 2006 [22]. In view of its excellent feature extraction capability, it has also attracted the attention of fault classification experts. Lu et al. successfully used the feature extraction ability of a deep neural network to diagnose bearing faults [33]; their method overcomes the shortcoming that traditional feature extraction methods cannot discover unknown fault types in a timely and effective manner. Jia et al. used a deep neural network to monitor bearing failure [34]. Gan et al. proposed a fault classification method based on a hierarchical neural network [11]; by constructing a two-layer neural network, the method can not only locate the position of a bearing fault but also effectively identify the fault size at that position. Deep learning, as one of the most popular machine learning methods, has revolutionized the field of artificial intelligence. However, its application to fault classification is still in its infancy, and many issues remain to be improved.
For example, the data in [11] are derived from a single mode, without considering the multimode observations caused by varying load. Therefore, that method cannot fully extract the fault features contained in the observations of different modes, which is essential for the accuracy of multimode fault classification.
To solve the above-mentioned problems, this paper presents a multimode fault classification method based on deep learning. First, a DNN model is constructed, and the trained network is used for mode partition; then, a new set of DNNs is constructed for the observation data of each mode, and the trained networks are used to determine which component has failed, that is, to recognize the fault location; finally, for a given fault in a given mode, another DNN is constructed to classify the observation data by fault size.
The remainder of this paper is organized as follows. Section 2 overviews the theory of deep learning. Section 3 develops a multimode fault classification method based on DNN by hierarchically constructing DNN models with different purposes. In Section 4, the effectiveness of the proposed multimode fault classification method is demonstrated by experimental analysis. Section 5 concludes the paper.
2. Theory of Deep Learning
Deep learning is a method based on unsupervised feature learning. We use deep learning theory to construct the DNN. The DNN training process consists of two steps: (1) using an unsupervised learning algorithm to pretrain the network layer by layer, which helps the DNN mine features from raw data efficiently; (2) using the back propagation algorithm to fine-tune the parameters of the whole network, which optimizes the ability of the DNN to mine features from the raw data. In this paper, the DNN is pretrained by stacking multiple AutoEncoders (AEs).
2.1. AutoEncoder
AutoEncoder is an unsupervised machine learning structure that can be viewed as a three-layer feedforward artificial neural network, as shown in Figure 1. It consists of the input layer, the hidden layer, and the output layer. It is a special neural network with a single hidden layer whose target output is equal to its input. The network parameters are adjusted through repeated training so that the reconstructed output approximates the input with high accuracy. AutoEncoder is composed of two parts: an encoder and a decoder. The encoder network encodes the input data from the high-dimensional space into a low-dimensional space; the decoder network then maps the low-dimensional data back into the high-dimensional space, which realizes the reconstruction of the input at the output. Therefore, the low-dimensional data can be used as a feature representation of the input data.
Given an unlabeled dataset $X = \{x_m\}_{m=1}^{M}$ with $x_m \in \mathbb{R}^{n}$, that is, $n$ observation features (variables) with $M$ samples each, the encoder network maps each sample $x_m$ to the hidden activation value $h_m$ through an activation function $s_f$. The encoder process is described as follows:

$$h_m = f_{\theta}(x_m) = s_f(W x_m + b), \tag{1}$$

where $f_{\theta}$ is the encoder function, $s_f$ is the activation function of the encoder (the Sigmoid function is usually used), $W$ is the weight matrix between the input layer and the hidden layer, $b$ is the bias vector of the encoder network, and $\theta = \{W, b\}$ is the connection parameter between the input layer and the hidden layer. The Sigmoid function is

$$s_f(t) = \frac{1}{1 + e^{-t}}. \tag{2}$$

Similarly, the decoder network uses the feature $h_m$ obtained from the encoder network to produce a reconstruction $\hat{x}_m$ that should equal the input $x_m$. The decoder process is described as follows:

$$\hat{x}_m = g_{\theta'}(h_m) = s_g(W' h_m + d), \tag{3}$$

where $g_{\theta'}$ is the decoder function, $s_g$ is the activation function of the decoder, $W'$ is the weight matrix between the hidden layer and the output layer, $d$ is the bias vector of the decoder, and $\theta' = \{W', d\}$.

The essence of the AE training process is to optimize the network parameters $\theta$ and $\theta'$. To make the output as close as possible to the input, the degree of approximation between input and output is characterized by the reconstruction error $L(X, \hat{X})$, which is minimized:

$$\{\theta, \theta'\} = \arg\min_{\theta, \theta'} L(X, \hat{X}), \qquad L(X, \hat{X}) = \frac{1}{M} \sum_{m=1}^{M} \lVert x_m - \hat{x}_m \rVert^{2}. \tag{4}$$

In each training iteration, the gradient descent method is used to update the parameters $\theta$ and $\theta'$ of the AE network:

$$W \leftarrow W - \eta \frac{\partial L}{\partial W}, \qquad b \leftarrow b - \eta \frac{\partial L}{\partial b}, \tag{5}$$

and likewise for $W'$ and $d$, where $\eta$ represents the learning rate and the partial derivatives $\partial L / \partial W$ and $\partial L / \partial b$ can be calculated with the back propagation algorithm.
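The encoder, decoder, reconstruction error, and gradient update described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function name `train_autoencoder`, the Gaussian initialization, the learning rate, and the use of a Sigmoid in the decoder as well as the encoder are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(t):
    """Sigmoid activation, 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + np.exp(-t))

def train_autoencoder(X, n_hidden, lr=0.5, epochs=200, seed=0):
    """Train one AutoEncoder by full-batch gradient descent.

    X has shape (M, n): M samples in rows, n features in columns.
    Returns encoder (W, b), decoder (W2, d), and the loss per epoch.
    Initialization and hyperparameters are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    M, n = X.shape
    W = rng.normal(0.0, 0.1, size=(n, n_hidden))   # encoder weights
    b = np.zeros(n_hidden)                         # encoder bias
    W2 = rng.normal(0.0, 0.1, size=(n_hidden, n))  # decoder weights
    d = np.zeros(n)                                # decoder bias
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W + b)        # hidden features (encoder)
        X_hat = sigmoid(H @ W2 + d)   # reconstruction (decoder)
        err = X_hat - X
        # Mean over samples of the squared reconstruction error.
        losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
        # Back propagation of the loss through both Sigmoid layers.
        delta_out = 2.0 * err * X_hat * (1.0 - X_hat) / M
        delta_hid = (delta_out @ W2.T) * H * (1.0 - H)
        # Gradient descent updates with learning rate lr.
        W2 -= lr * (H.T @ delta_out)
        d -= lr * delta_out.sum(axis=0)
        W -= lr * (X.T @ delta_hid)
        b -= lr * delta_hid.sum(axis=0)
    return W, b, W2, d, losses
```

After training, `sigmoid(X @ W + b)` gives the low-dimensional features that would be passed on as the input of the next AutoEncoder in a stack.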
A DNN can be viewed simply as a neural network with multiple hidden layers formed by stacking several AutoEncoders. The model uses bottom-up unsupervised learning to extract features layer by layer. A supervised learning method is then applied to fine-tune the parameters of the whole network, so that the most essential characteristics can be extracted from the original signals. The structure of the DNN is shown in Figure 2.
First, pretrain the DNN using the unsupervised layer-by-layer greedy training algorithm. The first AutoEncoder (AE1) is trained with the unlabeled dataset $X$ as the input of its encoder network. The encoded feature $h_1$ is the hidden layer of AE1, and the training parameter $\theta_1$ is obtained by requiring the output of AE1 to reconstruct $X$. Then, $h_1$ is used as the input of the second AutoEncoder (AE2), and AE2 is trained to acquire the network training parameter $\theta_2$; its hidden layer $h_2$ can be viewed as the feature extracted by AE2. After that, $h_2$ is chosen as the input of the third AutoEncoder (AE3). The process is repeated to obtain the hidden layer feature $h_N$ of the $N$th AutoEncoder (AE$N$) and the corresponding network training parameter $\theta_N$.

Secondly, a classifier is added on the top layer of the DNN. The pretraining process extracts feature information by unsupervised learning, but the DNN has no classification ability of its own, so a classifier must be added on top of it. In this paper, a Softmax classifier is used as the output layer of the DNN. Suppose the training dataset is $\{x_m\}_{m=1}^{M}$ with labels $y_m \in \{1, \ldots, K\}$; the probability of each category can be calculated via the following hypothesis function:

$$p(y_m = k \mid x_m; \theta_{N+1}) = \frac{e^{\vartheta_k^{\top} x_m}}{\sum_{j=1}^{K} e^{\vartheta_j^{\top} x_m}}, \qquad k = 1, \ldots, K, \tag{6}$$

where $\theta_{N+1} = \{\vartheta_1, \ldots, \vartheta_K\}$ is the model parameter of Softmax. As with the AE model, to guarantee the performance of the classifier, the classifier parameter is trained by minimizing the cost function $J(\theta_{N+1})$. The cost function of the Softmax training process is shown in (7), and the top network parameter $\theta_{N+1}$ is obtained from minimizing $J(\theta_{N+1})$:

$$J(\theta_{N+1}) = -\frac{1}{M} \sum_{m=1}^{M} \sum_{k=1}^{K} 1\{y_m = k\} \log p(y_m = k \mid x_m; \theta_{N+1}). \tag{7}$$

Finally, fine-tune the network. To guarantee the accuracy of feature extraction and the classification effectiveness of the output layer, the training parameters of the whole DNN are fine-tuned by the supervised back propagation algorithm with a limited number of sample labels. Fine-tuning is completed by minimizing the error $E(\Theta)$ between the actual output $\hat{y}_m$ and the label $y_m$. The parameters are updated as follows:

$$\Theta \leftarrow \Theta - \alpha \frac{\partial E(\Theta)}{\partial \Theta}, \tag{8}$$

where $\hat{y}_m$ represents the actual output value, $\Theta = \{\theta_1, \ldots, \theta_N, \theta_{N+1}\}$ is the parameter set generated from the whole network training, the back propagation algorithm is used to update the network parameter $\Theta$, and $\alpha$ is the learning rate in the process of deep learning. The fine-tuning process uses the labeled data to improve the performance of the DNN.
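As an illustration of the Softmax output layer, the following NumPy sketch computes the class probabilities and the mean negative log-likelihood cost for a batch of top-layer features. The function names, the zero-based integer labels, and the absence of a bias or weight-decay term are assumptions; the source does not specify these details.

```python
import numpy as np

def softmax_probs(features, theta):
    """Class probabilities for each sample.

    features: (M, d) top-layer features; theta: (d, K) Softmax weights.
    Returns an (M, K) array whose rows sum to 1.
    """
    scores = features @ theta
    # Subtract the row maximum for numerical stability; the result is unchanged.
    scores = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

def softmax_cost(features, labels, theta):
    """Mean negative log-likelihood of integer labels in {0, ..., K-1}."""
    p = softmax_probs(features, theta)
    M = features.shape[0]
    return float(-np.mean(np.log(p[np.arange(M), labels])))
```

With all-zero weights every class receives probability 1/K, so the cost equals log K; training lowers the cost from that starting point.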
2.2. DNN-Based Classification
To extract the essential characteristics of the health condition of mechanical equipment accurately by DNN modeling, the following steps are required. First, the original vibration signals are preprocessed: since frequency-domain signals are more sensitive to mechanical equipment faults, the original time-domain signals are converted into frequency-domain signals. Second, the preprocessed data are used as the input of the DNN model to extract features of the equipment health condition with unsupervised layer-by-layer pretraining. Finally, the parameters of the whole network are updated by the back propagation algorithm to fine-tune the DNN when a limited number of labeled samples is available. In this way an effective feature extraction result for fault classification is obtained. The preprocessed datasets are divided into training data and testing data. The training data are used to construct the DNN model and obtain the training parameter $\Theta$, and the testing network initialized with the training parameter $\Theta$ is used to verify its effectiveness. The misclassification rate is used as the accuracy indicator of the DNN-based fault classification method. Detailed steps of DNN-based fault classification for mechanical systems are shown in Figure 3.
3. Multimode Fault Classification Model Based on Deep Learning
Many practical systems are multimode processes. For a multimode process, the latent features extracted from the observations of each steady mode differ, so it is necessary to separate the observations into operation modes for accurate feature extraction.
Therefore, mode partition is a fundamental step before fault classification. In this paper, this problem is solved by constructing a hierarchical DNN model whose first hierarchy is devised specifically for mode partition. This yields an effective mode partition for a multimode process, which increases the accuracy of DNN-based fault classification. The framework of the three-layer DNN is shown in Figure 4.
The detailed steps for multimode fault classification are as follows.
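Step 1 (mode partition). In this step, we build a DNN model to determine the mode label of each sample. The whole dataset is used as the input of the multimode classification model. The mode partition process is as follows.
(1) Construct a new network $\mathrm{DNN}_1$ with $N$ hidden AE layers, as described in (9), and initialize its training parameters $\theta^{1} = \{W^{1}, b^{1}\}$, where $W^{1}$ is the weight matrix and $b^{1}$ is the bias vector. $H_1, \ldots, H_N$ are the numbers of hidden-layer neurons in $\mathrm{DNN}_1$, which together with the input-layer size define the network configuration. $X^{1}$ denotes the training dataset, and $N_{\mathrm{in}}$ in (10) represents the number of neurons in the input layer of $\mathrm{DNN}_1$. The parameters of $\mathrm{DNN}_1$ are initialized as in (11).
(2) Train $\mathrm{DNN}_1$ to obtain the network parameter $\Theta^{1}$. Unsupervised layer-by-layer feature extraction based on the training dataset $X^{1}$ is applied to the $N$-level AE defined in (9).
Add a Softmax classifier on top of $\mathrm{DNN}_1$. A limited number of training labels is used to fine-tune and update the training parameter $\Theta^{1}$ via the back propagation update of (12)–(13),

$$\Theta^{1} \leftarrow \Theta^{1} - \alpha \frac{\partial E(\Theta^{1})}{\partial \Theta^{1}},$$

where $\Theta^{1}$ comprises $\theta^{1}$ and the Softmax parameter calculated by (6) and (7), $E(\Theta^{1})$ is the error over the $M$ labeled training samples between the labels and the output $\hat{y}$ of $\mathrm{DNN}_1$, and $\alpha$ is the learning rate in the fine-tuning process.
(3) Perform mode partition with the trained $\mathrm{DNN}_1$. Once a test sample $x_m$ is obtained, compute the probability of each mode for that sample with the trained $\mathrm{DNN}_1$. Then use (14) to assign the test sample to a mode:

$$\hat{c}_m = \arg\max_{c} \; p(c \mid x_m; \Theta^{1}), \tag{14}$$

where $c = 1, \ldots, C$ is the mode type of the sample and $\hat{c}_m$ denotes the mode label of the test sample.
Compare the mode partition label $\hat{c}_m$ with the actual mode label $c_m$ to determine the misclassification number as

$$e_1 = \lvert S_1 \rvert, \tag{15}$$

where $\lvert \cdot \rvert$ is the operation that gives the size of a set and $S_1$ is the misclassification set defined by

$$S_1 = \{\, x_m : \hat{c}_m \neq c_m \,\}. \tag{16}$$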
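The mode assignment and misclassification count of Step 1 can be sketched as follows, assuming the trained first-hierarchy network has already produced one row of mode probabilities per test sample. The function names and zero-based mode labels are illustrative assumptions.

```python
import numpy as np

def partition_modes(probs):
    """Assign each sample to the mode with the highest probability.

    probs: (M, C) array, one row of mode probabilities per sample.
    Returns an array of M zero-based mode labels.
    """
    return np.argmax(np.asarray(probs), axis=1)

def misclassified_set(predicted, actual):
    """Indices of samples whose predicted label differs from the actual one."""
    return np.flatnonzero(np.asarray(predicted) != np.asarray(actual))

# Usage: four test samples, three candidate modes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
predicted = partition_modes(probs)
errors = misclassified_set(predicted, [0, 1, 1, 0])
n_errors = len(errors)  # size of the misclassification set
```

The same two helpers apply unchanged to the fault location and fault severity hierarchies, since each of them also takes the most probable class and counts disagreements with the true labels.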
Step 2 (fault source location). For each mode partitioned in Step 1, we can further locate the fault source. The procedure is analogous to Step 1 and is described below.
(1) According to the mode partition result, build the second hierarchy of the model, which comprises a set of DNNs $\mathrm{DNN}_2^{c}$, $c = 1, \ldots, C$, where $X^{2,c}$ denotes the training dataset of mode $c$. The parameter initialization mechanism of $\mathrm{DNN}_2^{c}$ is the same as in Step 1.
(2) Train $\mathrm{DNN}_2^{c}$ to obtain the network parameters $\Theta^{2,c}$. For the detailed calculation process, refer to (12)–(13).
(3) Determine the fault location with the trained $\mathrm{DNN}_2^{c}$.
The test dataset is used to predict the unknown fault locations with the trained $\mathrm{DNN}_2^{c}$. Assume that each mode has $K$ different fault locations; the fault location label of the $m$th sample of the $c$th mode can then be calculated via

$$\hat{k}_m^{c} = \arg\max_{k} \; p(k \mid x_m^{c}; \Theta^{2,c}), \qquad k = 1, \ldots, K. \tag{17}$$

Compute the misclassification number $e_2^{c}$ of the $c$th mode. The misclassification number of this classification step can then be computed via

$$e_2 = \sum_{c=1}^{C} e_2^{c}. \tag{18}$$
Step 3 (fault severity recognition). The third hierarchy is devised to distinguish the fault severity. Construct the third deep network $\mathrm{DNN}_3^{c,k}$, where $X^{3,c,k}$ is its training dataset and $X^{3,c,k}_{\mathrm{test}}$ is the test dataset. The parameter training process is similar to that of Step 2. The severity classification label of the $m$th sample in the test dataset can be determined by

$$\hat{s}_m^{c,k} = \arg\max_{s} \; p(s \mid x_m^{c,k}; \Theta^{3,c,k}). \tag{19}$$

The misclassification number for a given fault in a certain mode can be computed via

$$e_3 = \sum_{c=1}^{C} \sum_{k=1}^{K} e_3^{c,k}, \tag{20}$$

where $e_3^{c,k}$ is the misclassification number of the $k$th fault location in the $c$th mode, the outer sum runs over all modes, and $e_3$ is the misclassification number in this step.
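The three hierarchies can be read as a routing procedure: the first classifier picks the mode, the classifier of that mode picks the fault location, and the classifier of that (mode, location) pair picks the severity. The sketch below shows only this routing; the classifiers are passed in as plain callables, and the dictionary layout and the convention that a location without a severity classifier (for example, the normal condition) returns `None` are assumptions made for illustration.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Classifier = Callable[[Sequence[float]], int]

def classify_hierarchically(
    x: Sequence[float],
    mode_clf: Classifier,
    location_clfs: Dict[int, Classifier],
    severity_clfs: Dict[Tuple[int, int], Classifier],
) -> Tuple[int, int, Optional[int]]:
    """Route one sample through mode, location, and severity classifiers."""
    mode = mode_clf(x)                       # first hierarchy
    location = location_clfs[mode](x)        # second hierarchy, one model per mode
    severity_clf = severity_clfs.get((mode, location))
    # Third hierarchy: only defined for (mode, location) pairs that have faults.
    severity = severity_clf(x) if severity_clf is not None else None
    return mode, location, severity

# Usage with stand-in classifiers (hypothetical thresholds on a 1-D sample).
mode_clf = lambda x: 0 if x[0] < 0.5 else 1
location_clfs = {0: lambda x: 1, 1: lambda x: 0}
severity_clfs = {(0, 1): lambda x: 2}
result_fault = classify_hierarchically([0.2], mode_clf, location_clfs, severity_clfs)
result_normal = classify_hierarchically([0.9], mode_clf, location_clfs, severity_clfs)
```

One consequence of this design is that an error in the first hierarchy sends the sample to the wrong downstream model, which is why mode partition accuracy matters for the overall result.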
Step 4 (accuracy computation of the whole multimode classification network). In this paper, the classification accuracy of the hierarchical DNN is measured by the number of misclassifications. The final accuracy is calculated from the ratio of the total number of misclassifications to the total number of samples. The procedure of calculation is as follows:

$$e = e_1 + e_2 + e_3, \tag{21}$$

$$\mathrm{Acc} = \left(1 - \frac{e}{M}\right) \times 100\%. \tag{22}$$

Combining (21) with (22), the final accuracy of the proposed multimode fault classification based on DNN can be formulated as

$$\mathrm{Acc} = \left(1 - \frac{e_1 + e_2 + e_3}{M}\right) \times 100\%, \tag{23}$$

where $M$ is the total number of samples. The flow chart of the proposed multimode fault classification method based on the three-layer DNN is depicted in Figure 5.
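A minimal sketch of this bookkeeping is given below. It assumes that the accuracy is taken as one minus the ratio of all misclassifications to all samples, and that the misclassifications of the three hierarchies are simply added; the function name and argument layout are hypothetical.

```python
from typing import Iterable

def hierarchical_accuracy(
    n_samples: int,
    mode_errors: int,
    location_errors: Iterable[int],
    severity_errors: Iterable[int],
) -> float:
    """Overall accuracy from per-hierarchy misclassification counts.

    mode_errors: misclassifications in mode partition.
    location_errors: misclassifications per mode in fault location.
    severity_errors: misclassifications per (mode, location) in severity.
    Assumes the three counts are added and divided by the sample total.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    total = mode_errors + sum(location_errors) + sum(severity_errors)
    return 1.0 - total / n_samples

# Usage: 1000 test samples and 10 misclassifications in total.
acc = hierarchical_accuracy(1000, 2, [1, 0, 3, 0], [4])
```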
4. Application to Rolling Bearing Fault Classification
Rolling bearings play an important role in rotating machinery, and the health condition of the bearing directly affects the reliability and stability of the whole system. A rolling bearing experimental platform is used to verify the effectiveness of the hierarchical DNN multimode fault classification method, and the performance of the proposed method is compared with that of traditional methods, namely DNN, BPNN, SVM, hierarchical BPNN, and hierarchical SVM, as detailed in Section 4.3.
4.1. Experimental Platform
The experimental datasets are obtained from the Case Western Reserve University Bearing Data Center in the United States [35]. The experimental platform is shown in Figure 6. It consists of a 2 hp motor, a power meter, an electronic controller, a torque sensor, and a load motor. The vibration signals at the drive end of the motor are collected by an acceleration sensor and serve as the experimental datasets for bearing fault classification. In this experiment, the vibration signals are collected under loads of 0 hp, 1 hp, 2 hp, and 3 hp, respectively, at a sampling frequency of 48 kHz. There are four types of bearing health condition: (1) normal condition; (2) inner race fault; (3) outer race fault; (4) roller fault. The fault diameters were 0.007 in., 0.014 in., and 0.021 in., respectively.
4.2. Data Description
In this case, we collect the vibration signals at the bearing drive end under different loads. The dataset contains 4 modes, corresponding to motor loads of 0 hp, 1 hp, 2 hp, and 3 hp, respectively; the 4 modes are shown in Table 1. In each mode there are four states, namely, inner race fault, outer race fault, roller fault, and normal, with 3 different fault sizes in each fault state, that is, 10 different fault types in a single mode. This paper selects 200 samples for each fault type, and each sample contains 2048 observation points. 100 samples are randomly selected as the training data, and the other 100 samples are used as the testing data. We apply the Fast Fourier Transform (FFT) to each sample to obtain 2048 Fourier coefficients. Because of the symmetry of the Fourier coefficients, we take the first 1024 coefficients as the new sample; the dataset thus contains 4 × 10 × 200 = 8000 samples. To compare the proposed hierarchical network with a single-layer network and to explore the effect of different sample numbers on the network, the sample number of each DNN for a given mode is listed in Table 2. In addition, the original time-domain waveforms of the 10 fault types in mode 1 of dataset A are shown in Figure 7.
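The FFT preprocessing of one 2048-point sample can be sketched with NumPy as follows. The source states only that the first 1024 Fourier coefficients are kept; using their magnitudes as the feature vector, and the function name `to_frequency_features`, are assumptions made here.

```python
import numpy as np

def to_frequency_features(segment):
    """Convert one time-domain segment into half-spectrum magnitude features.

    For a real signal of length N the spectrum is conjugate-symmetric, so
    only the first N // 2 coefficients carry independent information.
    """
    segment = np.asarray(segment, dtype=float)
    spectrum = np.fft.fft(segment)
    half = segment.size // 2
    return np.abs(spectrum[:half])  # magnitudes: an assumption, see lead-in

# Usage: a synthetic 2048-point sine with exactly 100 cycles in the window.
t = np.arange(2048)
signal = np.sin(2 * np.pi * 100 * t / 2048)
features = to_frequency_features(signal)
peak_bin = int(np.argmax(features))
```

For this synthetic signal the energy concentrates in a single frequency bin, which is the kind of localized signature that makes frequency-domain inputs easier to separate than raw waveforms.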


4.3. Results of Fault Classification
The proposed hierarchical DNN structure is applied to bearing fault classification. Dataset A contains 8000 samples, 4 different modes, 4 fault positions in each mode, and 40 health conditions in total. The health conditions of a rotating machinery system under multiple modes, multiple conditions, multiple fault types, and large sample data are thus simulated to demonstrate the performance of the proposed method. To reduce the effect of randomness, the experiment was repeated 20 times. The initial parameters of the DNN pretraining process are shown in Table 3.

Network training uses the stochastic gradient descent method; the maximum number of DNN iterations on the three hierarchies is 500, 300, and 300, respectively. Three traditional methods, BPNN, SVM, and DNN, are compared with the proposed multimode fault classification approach to verify its effectiveness. In addition, hierarchical BPNN (HBPNN) and hierarchical SVM (HSVM) are compared with hierarchical DNN (HDNN). BPNN uses the gradient descent method to update the network weights and biases; the SVM uses a radial basis kernel and is trained with a one-to-one mechanism. The training mechanism of HBPNN and HSVM is the same as that of HDNN.
Table 4 compares the fault classification accuracies in the time domain and the frequency domain. It can be seen from Table 4 that rotating machinery faults are more sensitive in the frequency domain, so we use the FFT to preprocess the original data.

Table 5 compares the fault classification results after mode partition. It can be seen from line 2 and line 3 that HDNN can obtain more accurate classification either for fault source location or for fault severity recognition which tells us that mode partition is a critical step in multimode fault classification.

The hierarchical model for the case of BPNN and SVM also confirms this conclusion. Comparing line 2 with line 4 and line 6, we can see that HDNN is significantly superior to other hierarchical machine learning models because of the fact that HDNN can get better mode partition accuracy which is shown in Table 6. On the other hand, we can draw another conclusion that the performance of traditional BPNN method is superior to the traditional SVM method in the large sample case, but the accuracy of HSVM is higher than that of HBPNN due to the fact that SVM does well in small sample learning.

In order to demonstrate the performance of the proposed multimode classification method, the hierarchical machine learning methods are employed in this paper. As can be seen from Table 6, the accuracy of mode partition with proposed HDNN method can reach 99.96%, and we can naturally find that the performance of HDNN is superior to HBPNN and HSVM in mode partition procedure.
The excellent performance of the proposed multimode classification method depends on accurate feature extraction. To verify the effectiveness of the HDNN-based feature extraction method, scatter plots of the extracted features are shown in Figures 8–10. As shown in Table 3, the number of neurons in the last hidden layer is 100 in each training process; that is, the feature dimension is 100, which is too large to be visualized. Therefore, PCA is used as a data compression tool to reduce the feature dimension, and the first three principal components are used to plot the scatter charts. Figure 8 shows the scatter plots of the fault source location features extracted by HDNN after mode partition, while Figure 9 shows the scatter plots of the fault features extracted by DNN without mode partition. From Figures 9 and 10, we can see that some fault features overlap, which results in an unsatisfactory fault classification result.
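The PCA compression used for visualization can be sketched with a singular value decomposition, as below. The function name is hypothetical and the random matrix merely stands in for the 100-dimensional hidden-layer features; no result from the paper is reproduced.

```python
import numpy as np

def pca_project(features, n_components=3):
    """Project samples onto their first principal components.

    features: (M, D) array of extracted features, one sample per row.
    Returns an (M, n_components) array of principal component scores.
    """
    X = np.asarray(features, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Usage: 50 stand-in feature vectors of dimension 100, reduced to 3-D.
rng = np.random.default_rng(0)
hidden_features = rng.normal(size=(50, 100))
scores = pca_project(hidden_features, n_components=3)
```

Each row of `scores` gives the three coordinates of one sample in a 3-D scatter plot, with points colored by fault class or mode.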
Figure 10 is the scatter plot of the feature extracted for different modes. We can see from Figure 10 that HDNN does well in multimode fault feature extraction which will greatly affect the accuracy of the successive fault classification.
In summary, the proposed multimode classification method can accurately extract the different fault features based on its strong nonlinear characterization ability.
In general, the performance of a fault classification method is affected by the number of training samples. Figure 11 displays the fault classification accuracy of DNN and HDNN in two cases. The red line denotes the classification accuracy when more samples are used as the training data. The black line denotes the classification accuracy when fewer samples (only 1/2 of the first case) are used as the training data. In addition, the line with “” is the simulation result of HDNN and the line with “□” is the simulation result of traditional DNN.
From Figure 11, it can be clearly seen that (1) the fault classification accuracy of HDNN does not vary much between the two cases, while the fault classification accuracy of DNN is greatly affected by the amount of training data used, and (2) in both cases the fault classification accuracy of HDNN is much better than that of DNN. We can therefore conclude that HDNN is a more robust method for multimode bearing fault classification when fewer training data are available.
5. Conclusions
In this paper, a novel multimode fault classification method based on DNN is developed. The main idea is to construct a hierarchical DNN model whose first hierarchy is devised specifically for mode partition. The second hierarchy, comprising a set of DNNs, is devised to extract features separately for the different modes and to diagnose the fault source precisely. Another set of DNNs is devised to distinguish the severity of a given fault in a given mode, which is helpful for predictive maintenance of the machinery. A rolling bearing experimental platform is used to verify the effectiveness of the proposed method.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This research was supported in part by the Natural Science Fund of China (Grant nos. U1604158, U1509203, and 61333005) and Technical Innovation Talents Scheme of Henan Province (Grant no. 2012HASTIT005).
References
[1] D.H. Zhou, Y. Liu, and X. He, “Review on fault diagnosis techniques for closed-loop systems,” Acta Automatica Sinica, vol. 39, no. 11, pp. 1933–1943, 2013.
[2] F. N. Zhou, J. H. Park, and Y. J. Liu, “Differential feature based hierarchical PCA fault detection method for dynamic fault,” Neurocomputing, vol. 202, pp. 27–35, 2016.
[3] C. Li, R.V. Sanchez, G. Zurita, M. Cerrada, D. Cabrera, and R. E. Vásquez, “Multimodal deep support vector classification with homologous features and its application to gearbox fault diagnosis,” Neurocomputing, vol. 168, pp. 119–127, 2015.
[4] X. He, Z. Wang, Y. Liu, and D. H. Zhou, “Least-squares fault detection and diagnosis for networked sensing systems using a direct state estimation approach,” IEEE Transactions on Industrial Informatics, vol. 9, no. 3, pp. 1670–1679, 2013.
[5] D. Zhao, D. Shen, and Y. Q. Wang, “Fault diagnosis and compensation for two-dimensional discrete time systems with sensor faults and time-varying delays,” International Journal of Robust and Nonlinear Control, 2017.
[6] H. Li and D. Y. Xiao, “Survey on data driven fault classification methods,” Control and Decision, vol. 26, no. 1, pp. 1–16, 2011.
[7] R. M. An and Y. Gao, “Spacecraft fault classification based on hierarchical neural network,” Spacecraft Environment Engineering, vol. 30, no. 2, pp. 203–208, 2013.
[8] F. N. Zhou, C. L. Wen, Y. B. Leng, and Z. G. Chen, “A data-driven fault propagation analysis method,” Journal of Chemical Industry and Engineering (China), vol. 61, no. 8, pp. 1993–2001, 2010.
[9] H. Ji, X. He, and D. Zhou, “On the use of reconstruction-based contribution for fault diagnosis,” Journal of Process Control, vol. 40, pp. 24–34, 2016.
[10] D. J. Yu, M. F. Chen, J. S. Cheng, and Y. Yang, “A fault classification approach for rotor systems based on empirical mode decomposition method and support vector machines,” Proceedings of the Chinese Society for Electrical Engineering, vol. 26, no. 16, pp. 162–167, 2006.
[11] M. Gan, C. Wang, and C. A. Zhu, “Construction of hierarchical diagnosis network based on deep learning and its application in the fault pattern recognition of rolling element bearings,” Mechanical Systems and Signal Processing, vol. 72–73, pp. 92–104, 2016.
[12] G. F. Bin, J. J. Gao, X. J. Li, and B. S. Dhillon, “Early fault diagnosis of rotating machinery based on wavelet packets-empirical mode decomposition feature extraction and neural network,” Mechanical Systems and Signal Processing, vol. 27, no. 1, pp. 696–711, 2012.
[13] A. Widodo and B.S. Yang, “Support vector machine in machine condition monitoring and fault diagnosis,” Mechanical Systems and Signal Processing, vol. 21, no. 6, pp. 2560–2574, 2007.
[14] D. Zhao, Z. Lin, and Y. Wang, “Integrated state/disturbance observers for two-dimensional linear systems,” IET Control Theory & Applications, vol. 9, no. 9, pp. 1373–1383, 2015.
[15] Q. Hu, Z. He, Z. Zhang, and Y. Zi, “Fault diagnosis of rotating machinery based on improved wavelet package transform and SVMs ensemble,” Mechanical Systems and Signal Processing, vol. 21, no. 2, pp. 688–705, 2007.
[16] L. Y. Wang, W. G. Zhao, and Y. Liu, “Rolling bearing fault diagnosis based on wavelet packet neural network characteristic entropy,” Advanced Materials Research, vol. 108–111, no. 1, pp. 1075–1079, 2010.
[17] Y. Yang and W. Tang, “Study of remote bearing fault diagnosis based on BP Neural Network combination,” in Proceedings of the 7th International Conference on Natural Computation (ICNC '11), pp. 618–621, IEEE, Shanghai, China, July 2011.
[18] L. Jiang, Q. Li, J. Cui, and J. Xi, “Rolling bearing fault diagnosis based on higher-order cumulants and BP neural network,” in Proceedings of the 27th Chinese Control and Decision Conference (CCDC '15), pp. 2664–2667, IEEE, Qingdao, China, May 2015.
[19] T. Kuremoto, S. Kimura, K. Kobayashi, and M. Obayashi, “Time series forecasting using a deep belief network with restricted Boltzmann machines,” Neurocomputing, vol. 137, pp. 47–56, 2014.
[20] S. M. Zhang, F. L. Wang, S. Tan, and S. Wang, “A fully automatic online mode identification method for multimode processes,” Acta Automatica Sinica, vol. 42, no. 1, pp. 60–80, 2016.
[21] J. Schmidhuber, “Deep Learning in neural networks: an overview,” Neural Networks, vol. 61, pp. 85–117, 2015.
[22] G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504–507, 2006.
[23] H. Ze, A. Senior, and M. Schuster, “Statistical parametric speech synthesis using deep neural networks,” in Proceedings of the 38th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '13), pp. 7962–7966, IEEE, Vancouver, Canada, May 2013.
[24] B. Song, S. Tan, and H. B. Shi, “Key principal components with recursive local outlier factor for multimode chemical process monitoring,” Journal of Process Control, vol. 47, pp. 136–149, 2016.
[25] L. Zhao, C. Zhao, and F. Gao, “Inter-batch-evolution-traced process monitoring based on inter-batch mode division for multiphase batch processes,” Chemometrics and Intelligent Laboratory Systems, vol. 138, pp. 178–192, 2014.
[26] Y. Zhang, C. Wang, and R. Lu, “Modeling and monitoring of multimode process based on subspace separation,” Chemical Engineering Research and Design, vol. 91, no. 5, pp. 831–842, 2013.
[27] F.N. Zhou, C.L. Wen, T.H. Tang, and Z.G. Chen, “DCA based multiple faults diagnosis method,” Acta Automatica Sinica, vol. 35, no. 7, pp. 971–982, 2009.
[28] P. Tamilselvan and P. Wang, “Failure diagnosis using deep belief learning based health state classification,” Reliability Engineering & System Safety, vol. 115, pp. 124–135, 2013.
[29] R. Huang, C. Liu, G. Li, and J. Zhou, “Adaptive deep supervised autoencoder based image reconstruction for face recognition,” Mathematical Problems in Engineering, vol. 2016, Article ID 6795352, 14 pages, 2016.
[30] H. Liu, L. Li, and J. Ma, “Rolling bearing fault diagnosis based on STFT-deep learning and sound signals,” Shock and Vibration, vol. 2016, Article ID 6127479, 12 pages, 2016.
[31] P. L. Wang and C. J. Xia, “Fault detection and self-learning identification based on PCA-PDBNs,” Chinese Journal of Scientific Instrument, vol. 36, no. 5, pp. 1147–1154, 2015.
[32] R. Pang, Z. B. Yu, W. Y. Xiong, and H. Li, “Faults recognition of high-speed train bogie based on deep learning,” Journal of Railway Science and Engineering, vol. 12, no. 6, pp. 1283–1288, 2015.
[33] C. Lu, Z. Y. Wang, W. L. Qin, and J. Ma, “Fault diagnosis of rotary machinery components using a stacked denoising autoencoder-based health state identification,” Signal Processing, vol. 130, pp. 377–388, 2017.
[34] F. Jia, Y. Lei, J. Lin, X. Zhou, and N. Lu, “Deep neural networks: a promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data,” Mechanical Systems and Signal Processing, vol. 72–73, pp. 303–315, 2016.
[35] Bearing Data Centre, Case Western Reserve University, http://csegroups.case.edu/bearingdatacenter/home
Copyright
Copyright © 2017 Funa Zhou et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.