Research Article  Open Access
Muhammad Sohaib, Jong-Myon Kim, "Reliable Fault Diagnosis of Rotary Machine Bearings Using a Stacked Sparse Autoencoder-Based Deep Neural Network", Shock and Vibration, vol. 2018, Article ID 2919637, 11 pages, 2018. https://doi.org/10.1155/2018/2919637
Reliable Fault Diagnosis of Rotary Machine Bearings Using a Stacked Sparse Autoencoder-Based Deep Neural Network
Abstract
Due to enhanced safety, cost-effectiveness, and reliability requirements, fault diagnosis of bearings using vibration acceleration signals has been a key area of research over the past several decades. Many fault diagnosis algorithms have been developed that can efficiently classify faults under constant speed conditions. However, the performance of these traditional algorithms deteriorates when the shaft speed fluctuates. In the past couple of years, deep learning algorithms have not only improved classification performance in various disciplines (e.g., image processing and natural language processing) but also reduced the complexity of the feature extraction and selection processes. In this study, a fault diagnosis scheme that can cope with fluctuations of the shaft speed is developed using complex envelope spectra and stacked sparse autoencoder (SSAE) based deep neural networks (DNNs). The complex envelope spectrum makes the frequency components associated with each fault type prominent, which helps the autoencoders learn the characteristic features from the given input signals more readily. Moreover, using an SSAE-DNN for bearing fault diagnosis avoids the need for the handcrafted features used in traditional fault diagnosis schemes. The experimental results demonstrate that the proposed scheme outperforms conventional fault diagnosis algorithms in terms of fault classification accuracy when tested with variable shaft speed data.
1. Introduction
Reliable fault diagnosis of industrial machinery is an essential task, as it not only contributes to the safety and reliability of the machinery, but also decreases the associated maintenance and operational costs [1–7]. Vibration acceleration signals collected from complex industrial machines provide useful information about their health status, and therefore, vibration condition monitoring is considered a standard approach that allows for corroboration as a part of reliable fault diagnosis schemes [8–12]. Bearings are the most frequently used components in rotating machinery and account for approximately 40–51% of the failure occurrences [13–15]. As a result, bearing fault diagnosis has been extensively investigated.
Traditional fault diagnosis methods combine feature extraction with a machine learning algorithm, such as nearest neighbors (NN), support vector machines (SVMs), or artificial neural networks (ANNs), to perform fault diagnosis [16–20]. Feature extraction is a cumbersome process that requires expert knowledge and also adds to the complexity of the fault diagnosis scheme [21]. In one previous study, ten statistical parameters reflecting the bearing health conditions were calculated and then provided as input to an ANN for fault classification [22]. In another, nineteen statistical parameters were extracted from the vibration signals, and fault classification was performed using SVMs [23]. A combination of the coefficients of a linear time-invariant autoregressive model and a nearest neighbor classifier has also been utilized for fault diagnosis [24]. These methods can efficiently perform fault classification under constant shaft speeds. However, their efficiency decreases when they are tested with data for variable shaft speeds. A mechanism is therefore required that allows the underlying network to extract useful information from nonstationary shaft speed data, making it suitable for fault diagnosis under variable shaft speed conditions.
In recent years, deep learning has emerged as a useful tool for solving pattern recognition, image processing, computer vision, and natural language processing problems and is capable of attaining informative features from minimally processed data via nonlinear transformations [25–27]. Deep learning algorithms can replace the need for handcrafted features, as the algorithms are capable of unsupervised feature learning and hierarchical feature extraction [28].
In this study, a three-step mechanism was developed that can diagnose bearing faults under shaft speed fluctuations using vibration acceleration signals. First, the energy-frequency distribution of the input vibration acceleration signal is estimated by calculating its complex envelope spectrum. Then, the calculated complex envelope spectrum, which captures essential properties of the signal, is provided as input to the stacked sparse autoencoder (SSAE) based deep neural network. The final fault classification is performed using a softmax classifier. The efficiency of the proposed scheme was evaluated by testing it with vibration data obtained for four different shaft speeds. The results demonstrate a noticeable improvement of the diagnostic performance compared to existing techniques. The rest of this paper is organized as follows. Section 2 describes the dataset used for this study. Section 3 presents the details of the proposed scheme, and Section 4 discusses the experimental results. Finally, Section 5 concludes the paper.
2. Bearing Dataset
The effectiveness of the proposed scheme was tested using bearing fault data provided by Case Western Reserve University [29]. The faults were seeded in the test bearings on their outer raceway, inner raceway, and rolling element by electro-discharge machining (EDM). A variable length vibration signal was collected each time from a 2-horsepower (hp) Reliance Electric motor for a normal condition and three faulty conditions. The drive end vibration data obtained from an accelerometer, placed at the 12 o’clock position on the bearing housing, was utilized in the experiments. The sampling rate, fault diameter, and crack depth were 12,000 Hz, 7 mils, and 0.11 inches, respectively. The data was collected for four shaft speeds of 1,722, 1,748, 1,772, and 1,796 revolutions per minute (rpm). The four types of signals, corresponding to the three faulty states and the one normal state, are shown in Table 1.

3. Complex Envelope Spectrum and Stacked Autoencoders for Fault Diagnosis
The proposed speed-invariant fault diagnosis scheme is illustrated in Figure 1. First, segmentation of the vibration acceleration signal is carried out using a fixed size window, resulting in 117 segments of 1,024 data points each for every fault type and speed condition. The complex envelope spectrum of each segment is computed to reveal the instantaneous features hidden in the time domain signal. This spectrum contains a series of impulses at certain defect frequencies associated with each fault type [30]. The defect frequencies are functions of the shaft speed, and therefore, even a slight variation of the shaft speed shifts the positions of these defect frequencies in the envelope power spectrum. Under variable speed conditions, traditional envelope analysis therefore yields poor results, as it relies on detecting bearing faults through the exact locations of the corresponding defect frequencies in the envelope power spectrum. The proposed method does not rely on the exact locations of the defect frequencies; instead, it mines features from the entire spectrum. Although a variation of the shaft speed changes the exact locations of the peaks at the defect frequencies, the relative positions of these peaks and of the peaks at the principal harmonics of the defect frequencies remain unaltered. Thus, changes of the shaft speed skew the envelope power spectrum but do not drastically change its overall shape and structure. Therefore, due to their unsupervised learning and hierarchical feature extraction mechanism, stacked autoencoders can overcome the speed variations and automatically mine meaningful information from the complex envelope spectrum. In this study, the focus was to compute the complex envelope spectra of the three fault conditions (i.e., inner raceway, outer raceway, and roller faults).
As a result, stacked autoencoders may take advantage of variations of the amplitude levels, as well as the relative positions of the defect frequencies and their principal harmonics in the spectrum, resulting in the extraction of informative features.
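The dependence of the defect frequencies on shaft speed can be illustrated with the standard kinematic formulas for a rolling element bearing. The following is a minimal sketch, not the authors' code: the bearing geometry (`n_balls`, `ball_d`, `pitch_d`) is a hypothetical set of values chosen only for illustration and is not taken from this paper. It shows that the absolute defect frequencies shift with speed while their ratios stay constant.

```python
import math


def defect_frequencies(rpm, n_balls, ball_d, pitch_d, contact_angle_deg=0.0):
    """Return (BPFO, BPFI, BSF) in Hz from the standard kinematic formulas."""
    f_r = rpm / 60.0  # shaft rotation frequency in Hz
    ratio = (ball_d / pitch_d) * math.cos(math.radians(contact_angle_deg))
    bpfo = 0.5 * n_balls * f_r * (1.0 - ratio)  # outer raceway defect frequency
    bpfi = 0.5 * n_balls * f_r * (1.0 + ratio)  # inner raceway defect frequency
    bsf = 0.5 * (pitch_d / ball_d) * f_r * (1.0 - ratio ** 2)  # ball spin frequency
    return bpfo, bpfi, bsf


# Hypothetical geometry, for illustration only (not taken from the paper).
GEOMETRY = dict(n_balls=9, ball_d=0.3126, pitch_d=1.537)
# Shaft speeds used in the paper's dataset.
SPEEDS_RPM = [1722, 1748, 1772, 1796]

table = {rpm: defect_frequencies(rpm, **GEOMETRY) for rpm in SPEEDS_RPM}
for rpm, (bpfo, bpfi, bsf) in table.items():
    print(f"{rpm} rpm: BPFO={bpfo:.1f} Hz, BPFI={bpfi:.1f} Hz, "
          f"BPFI/BPFO={bpfi / bpfo:.4f}")
```

Every defect frequency is proportional to the shaft frequency, so a speed change rescales the frequency axis without altering the relative spacing of the peaks, which is the property the scheme relies on.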
3.1. Complex Envelope Spectrum
The following steps are involved in computing the complex envelope spectrum.
(1) Computation of the analytical signal. The analytical signal is composed of a real signal and its Hilbert transform. Let us suppose that $x(t)$ is a time domain signal; then its Hilbert transform $h(t)$ and a new time domain signal $x_a(t)$, which is also known as the analytical signal, can be mathematically represented as shown below:
$$x_a(t) = x(t) + j\,h(t). \tag{1}$$
Here, the value of $j$ is $\sqrt{-1}$. The mathematical representation of the Hilbert transform is given as
$$h(t) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{x(\tau)}{t - \tau}\, d\tau. \tag{2}$$
(2) The complex envelope spectrum is computed by applying a Fourier transform to the analytical signal.
(3) The absolute value of this spectrum is used for further processing with stacked autoencoders.
Typically, a high-pass filter is applied to the raw signal as a preprocessing step to eliminate the effects caused by slow vibrations. Given that the deep network can extract meaningful information automatically from the input data, high-pass filtering of the raw signal is skipped while computing the complex envelope spectrum. Thus, feeding the unfiltered spectrum directly to the deep network reduces both the complexity and the number of steps needed to calculate the complex envelope spectrum of the signals.
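The segmentation and the three steps above can be sketched as follows. This is an illustrative NumPy implementation under stated assumptions, not the authors' code: the function names are hypothetical, the analytical signal is obtained with the usual FFT-based construction (zeroing the negative frequencies), and the synthetic test tone stands in for real vibration data. The window length of 1,024 samples and the 12,000 Hz sampling rate come from the text.

```python
import numpy as np


def segment(signal, window=1024):
    """Split a 1-D signal into non-overlapping windows, dropping the remainder."""
    n_segments = len(signal) // window
    return np.asarray(signal[: n_segments * window]).reshape(n_segments, window)


def analytic_signal(x):
    """Analytical signal x + j*H[x] via the FFT (assumes an even window length)."""
    n = x.shape[-1]
    spectrum = np.fft.fft(x, axis=-1)
    gain = np.zeros(n)
    gain[0] = 1.0           # keep the DC component
    gain[n // 2] = 1.0      # keep the Nyquist component
    gain[1 : n // 2] = 2.0  # double positive frequencies; negative ones stay zero
    return np.fft.ifft(spectrum * gain, axis=-1)


def complex_envelope_spectrum(x):
    """Absolute value of the Fourier transform of the analytical signal."""
    return np.abs(np.fft.fft(analytic_signal(x), axis=-1))


# Synthetic stand-in for a vibration record: one tone centred on FFT bin 100.
fs = 12_000
t = np.arange(117 * 1024 + 500) / fs
tone_hz = 100 * fs / 1024
signal = np.sin(2 * np.pi * tone_hz * t)

segments = segment(signal)                     # one row per 1,024-sample window
spectra = complex_envelope_spectrum(segments)  # one spectrum per segment
print(segments.shape, int(np.argmax(spectra[0])))
```

Each row of `spectra` would then serve as one 1,024-point input vector for the first autoencoder.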
3.2. Stacked Autoencoders
A stacked autoencoder is a deep artificial neural network having more than one hidden layer, and it is formed by stacking simple autoencoders for feature extraction and classification. The functionality of stacked autoencoders can be understood by first considering a single autoencoder. Figure 2 shows the architecture of a typical stacked autoencoder. An autoencoder is a three-layered artificial neural network (ANN) that operates in an unsupervised manner. There are two main parts of an autoencoder, the encoder and the decoder.
The encoder takes an input $x$ and transforms it from a higher-dimensional space (i.e., $D_x$ dimensions) to a lower-dimensional space (i.e., $D_z$ dimensions). The resulting output vector $z$ is known as the code, or the latent variables, and it can be mathematically represented as follows:
$$z = f(Wx + b), \tag{3}$$
where $f$, $W$, and $b$ are the encoding activation function, weights, and biases, respectively, which are used in the hidden layer. The decoder, on the other hand, tries to reconstruct the input from the generated code. The reconstruction process can be represented as follows:
$$\hat{x} = g(W' z + b'). \tag{4}$$
Here, $g$, $W'$, and $b'$ are the decoding transfer function, weights, and biases, respectively, which are used in the reconstruction process. During training, the autoencoder tries to minimize the reconstruction error by using the following loss function:
$$E = \frac{1}{N} \sum_{n=1}^{N} \lVert x_n - \hat{x}_n \rVert^2, \tag{5}$$
where $N$ is the number of observations. In the current work, sparse autoencoders were used to create stacked sparse autoencoders (SSAEs). The concept of sparsity in an autoencoder is explained in the following section.
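The encoder, decoder, and reconstruction loss described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the authors' implementation: the logistic sigmoid is assumed for both transfer functions, the weights are random and untrained, and the names `encode`, `decode`, and `reconstruction_error` are hypothetical. The sizes (1,024 inputs, 100 hidden units) follow the first layer of the network adopted later in the paper.

```python
import numpy as np


def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))


def encode(x, W, b):
    """Map inputs (rows of x) to lower-dimensional codes."""
    return sigmoid(x @ W.T + b)


def decode(z, W_dec, b_dec):
    """Reconstruct inputs from their codes."""
    return sigmoid(z @ W_dec.T + b_dec)


def reconstruction_error(x, x_hat):
    """Mean over observations of the squared reconstruction error."""
    return np.mean(np.sum((x - x_hat) ** 2, axis=1))


rng = np.random.default_rng(0)
n_obs, d_x, d_z = 8, 1024, 100            # observations, input size, code size
X = rng.random((n_obs, d_x))              # placeholder for envelope spectra
W = rng.normal(scale=0.01, size=(d_z, d_x))
b = np.zeros(d_z)
W_dec = rng.normal(scale=0.01, size=(d_x, d_z))
b_dec = np.zeros(d_x)

Z = encode(X, W, b)
X_hat = decode(Z, W_dec, b_dec)
error = reconstruction_error(X, X_hat)
print(Z.shape, X_hat.shape, round(float(error), 3))
```

Training would adjust the four parameter arrays to drive `error` down; the untrained values here only demonstrate the shapes and the data flow.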
3.3. Sparse Autoencoder
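The sparsity constraint can be introduced to the cost function of an autoencoder with the help of a regularization term. This regularization term is a function of the average output activation value of a neuron and is helpful in avoiding the overfitting problem. The regularized cost function after introducing sparsity and weight regularization can be represented as follows:
$$E = \frac{1}{N} \sum_{n=1}^{N} \lVert x_n - \hat{x}_n \rVert^2 + \lambda\,\Omega_{\text{weights}} + \beta\,\Omega_{\text{sparsity}}, \tag{6}$$
where $x_n$ and $\hat{x}_n$ are the $n$th input and its reconstruction. Here, $\lambda$ indicates the $L_2$ regularization coefficient and $\beta$ is the sparsity regularization coefficient. During the training of an autoencoder, it is possible for the sparsity regularization term to decrease simply because the weights $W$ grow while the latent codes $z$ shrink. This issue can be resolved by adding an $L_2$ regularization term to the cost function, which can be formulated as
$$\Omega_{\text{weights}} = \frac{1}{2} \sum_{l=1}^{L} \sum_{j=1}^{N} \sum_{i=1}^{K} \left( w_{ji}^{(l)} \right)^2, \tag{7}$$
where $L$, $N$, and $K$ denote the number of hidden layers, number of observations, and number of variables in the input data, respectively. The sparsity regularization term controls the sparsity constraint on the output from the hidden layer neurons. It takes a higher value when a neuron $i$ provides an average activation value $\hat{\rho}_i$ that deviates substantially from the desired value $\rho$. It can be defined by the Kullback-Leibler divergence as follows:
$$\Omega_{\text{sparsity}} = \sum_{i=1}^{D_z} \mathrm{KL}\left(\rho \,\|\, \hat{\rho}_i\right) = \sum_{i=1}^{D_z} \left[ \rho \log \frac{\rho}{\hat{\rho}_i} + (1 - \rho) \log \frac{1 - \rho}{1 - \hat{\rho}_i} \right], \tag{8}$$
where $D_z$ is the number of neurons in the hidden layer. The function given in (8) measures the difference between the two distributions; if the two distributions are equal, it takes a zero value, and it increases as the distributions diverge from each other. When minimizing the cost function, this term is forced to be as small as possible; as a result, the two values $\rho$ and $\hat{\rho}_i$ come closer to one another. The average activation of the $i$th hidden neuron can be defined as
$$\hat{\rho}_i = \frac{1}{N} \sum_{n=1}^{N} z_i(x_n) = \frac{1}{N} \sum_{n=1}^{N} f\left( w_i^{\top} x_n + b_i \right). \tag{9}$$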
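The regularized cost described above can be sketched as follows. This is an illustrative implementation, not the authors' code: the function names are hypothetical, and the default coefficients `lam` and `beta` are placeholder values that do not come from the paper. The target sparsity `rho=0.15` matches the sparsity proportion the paper selects in Section 4.1.3.

```python
import numpy as np


def kl_sparsity(rho, rho_hat):
    """Sum over hidden neurons of KL(rho || rho_hat_i) for Bernoulli activations."""
    rho_hat = np.clip(rho_hat, 1e-12, 1.0 - 1e-12)  # guard against log(0)
    return float(np.sum(rho * np.log(rho / rho_hat)
                        + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat))))


def l2_weights(weights):
    """Half the sum of squared weights over all weight matrices."""
    return 0.5 * sum(float(np.sum(W ** 2)) for W in weights)


def sparse_cost(X, X_hat, Z, weights, rho=0.15, lam=1e-3, beta=1.0):
    """Reconstruction error plus weighted L2 and sparsity penalties.

    lam and beta are hypothetical defaults chosen only for illustration.
    """
    mse = float(np.mean(np.sum((X - X_hat) ** 2, axis=1)))
    rho_hat = Z.mean(axis=0)  # average activation of each hidden neuron
    return mse + lam * l2_weights(weights) + beta * kl_sparsity(rho, rho_hat)


# The penalty vanishes when every neuron hits the target and grows as it drifts.
on_target = kl_sparsity(0.15, np.full(5, 0.15))
too_active = kl_sparsity(0.15, np.full(5, 0.50))
print(on_target, too_active)
```

Clipping `rho_hat` is a numerical safeguard added for this sketch; it keeps the logarithms finite if a neuron saturates at exactly 0 or 1.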
Once all the sparse autoencoders are trained individually, they are stacked to form the deep neural network (DNN). A typical three-layered DNN based on SSAEs is given in Figure 3. In each hidden layer, sparsity is introduced into the network by a sparsity regularization term. In such a DNN, the autoencoders extract useful features through an unsupervised learning process. The DNN is then fine-tuned in a supervised manner using backpropagation in combination with the standard gradient descent algorithm. After fine-tuning, the network is tested using unseen data. The steps that are carried out during the fine-tuning of a deep neural network using the standard gradient descent based backpropagation algorithm are given as follows:
(1) The weights and biases are initialized with small random nonzero values.
(2) A set of input observations $x$ is provided to the DNN, and the corresponding activation $a^{(1)}$ is calculated.
(3) For every layer $l$ in the network, an output $u^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}$ is computed and fed forward along with the activation of the neurons, $a^{(l)} = \sigma(u^{(l)})$.
(4) The predicted output is compared with the actual value to calculate the error between the two values. The computed error is denoted by $\delta^{(L)} = \nabla_a C \odot \sigma'(u^{(L)})$, where $\nabla_a C$ is the change in the cost function $C$ with respect to the output activations and $\sigma'$ is the derivative of the activation function used in the neurons of a layer.
(5) Backpropagation of the error is performed to update the weights in order to minimize the error.
(6) The gradient of the cost function is calculated as $\partial C / \partial w_{jk}^{(l)} = a_k^{(l-1)} \delta_j^{(l)}$ and $\partial C / \partial b_j^{(l)} = \delta_j^{(l)}$.
(7) Steps (2)–(6) are repeated until the overall error is reduced to the smallest possible value.
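The fine-tuning steps listed above can be sketched with a small NumPy network. This is a simplified illustration and not the authors' setup: it assumes sigmoid units in every layer and a quadratic cost, whereas the paper uses a softmax classifier at the output; the layer sizes, learning rate, and toy data are hypothetical, and the weights are randomly initialized rather than taken from pretrained autoencoders.

```python
import numpy as np


def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))


def sigmoid_prime(u):
    s = sigmoid(u)
    return s * (1.0 - s)


def train(X, Y, sizes, lr=0.5, epochs=2000, seed=0):
    """Full-batch gradient descent with backpropagation on a sigmoid network."""
    rng = np.random.default_rng(seed)
    # Step 1: small random nonzero weights and biases.
    Ws = [rng.normal(scale=0.1, size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
    bs = [rng.normal(scale=0.1, size=m) for m in sizes[1:]]
    history = []
    for _ in range(epochs):
        # Steps 2-3: feed the observations forward, keeping u and a for each layer.
        a, activations, us = X, [X], []
        for W, b in zip(Ws, bs):
            u = a @ W.T + b
            a = sigmoid(u)
            us.append(u)
            activations.append(a)
        # Step 4: output error for the quadratic cost C = mean(||a - y||^2) / 2.
        history.append(0.5 * float(np.mean(np.sum((a - Y) ** 2, axis=1))))
        delta = (a - Y) * sigmoid_prime(us[-1])
        # Steps 5-6: propagate the error backwards and apply the gradients.
        for layer in range(len(Ws) - 1, -1, -1):
            grad_W = delta.T @ activations[layer] / len(X)
            grad_b = delta.mean(axis=0)
            if layer > 0:
                # Uses the weights from before this layer's update.
                delta = (delta @ Ws[layer]) * sigmoid_prime(us[layer - 1])
            Ws[layer] -= lr * grad_W
            bs[layer] -= lr * grad_b
        # Step 7: the loop repeats steps 2-6 for a fixed number of epochs.
    return Ws, bs, history


# Toy data: four one-hot inputs mapped to four one-hot classes.
X = np.eye(4)
Y = np.eye(4)
Ws, bs, history = train(X, Y, sizes=[4, 3, 4])
print(history[0], history[-1])
```

In the scheme described in the text, `Ws` and `bs` would start from the encoder parameters of the pretrained sparse autoencoders instead of random values, so fine-tuning only has to adjust features that are already informative.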
4. Experimental Setup
In the current work, four experiments were carried out to validate the effectiveness of the proposed scheme when dealing with shaft speed fluctuations. The set of experiments is listed in Table 2, and each experiment was conducted multiple times with different numbers of epochs to train the network. The bearing fault data was divided into four separate datasets based on the shaft speed, with each dataset containing samples from the normal state, as well as from the inner raceway, outer raceway, and roller faults (468 samples). In each experiment, the network was trained using samples from one shaft speed dataset and validated with the samples of the other shaft speed datasets.
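The cross-speed protocol can be written down compactly. The sketch below is an assumption-laden illustration: the function name and dictionary keys are hypothetical, and the pairing of training and test speeds follows the description in the text (train on one speed, evaluate on the others) rather than the exact contents of Table 2. The per-speed sample count is derived from the 117 segments per condition mentioned in Section 3.

```python
SPEEDS_RPM = [1722, 1748, 1772, 1796]
HEALTH_STATES = ["normal", "inner raceway", "outer raceway", "roller"]
SEGMENTS_PER_STATE = 117  # segments of 1,024 points per state and speed


def cross_speed_experiments(speeds):
    """One experiment per speed: train on it, test on all remaining speeds."""
    return [
        {"train_rpm": train_rpm,
         "test_rpm": [s for s in speeds if s != train_rpm]}
        for train_rpm in speeds
    ]


experiments = cross_speed_experiments(SPEEDS_RPM)
samples_per_speed = len(HEALTH_STATES) * SEGMENTS_PER_STATE

for number, exp in enumerate(experiments, start=1):
    print(f"Experiment {number}: train on {exp['train_rpm']} rpm "
          f"({samples_per_speed} samples), test on {exp['test_rpm']} rpm")
```

Because the training and test speeds never overlap within an experiment, the reported accuracy reflects generalization across shaft speeds and not memorization of one operating condition.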

4.1. Parameter Selection for Stacked Sparse Autoencoders (SSAEs)
According to [31], the selection of parameters during deep learning affects the performance of the model. In this work, while developing an SSAE-based DNN for bearing fault diagnosis, the model was repeatedly tested with different values of parameters such as the receptive input size, the sparsity constraint, and the number of hidden nodes, and their effects on the reconstruction error of the model were observed. The reconstruction error is the difference between the reconstructed input and the original input and thus can help to improve the developed fault diagnosis model. The subsequent sections give the details of the parameters needed to develop an autoencoder. These details are provided by considering the first autoencoder used in the SSAEs.
4.1.1. Receptive Input Size
The length of a single sample that is provided as input to an autoencoder is called the receptive input size. It is observed that the quality of the higher-level representative features extracted from the input improves when a larger input sample size is provided to the autoencoders. However, a larger input also increases the computational overhead, so a smaller receptive input size gives better computational performance. In this work, because a window size of 1,024 is used to calculate the complex envelope spectrum, the receptive input size is 1,024. Using an even larger input size would significantly increase the training time for the DNN but may not yield proportional improvements in diagnostic performance. Therefore, an input size of 1,024 data points is used to achieve a reasonable tradeoff between diagnostic performance and computational cost.
4.1.2. Number of Hidden Neurons
The number of neurons in the hidden layer of an autoencoder plays a crucial role in the extraction of higher-level representative features. There is no defined rule for selecting the number of nodes in the hidden layer of an autoencoder. According to the available literature [32], the number of nodes in the hidden layer must be less than the receptive input size to learn a compressed representation of the input data. Figure 4 shows the effect of the number of hidden layer neurons in the first autoencoder on the reconstruction error. It is evident that the reconstruction error of the autoencoder is lower when the number of nodes is equal to half of the receptive input size or fewer. This criterion of half or fewer also holds for all the subsequent hidden layers of the SSAEs.
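The "half or fewer" criterion can be turned into a small helper that generates candidate layer widths. This is a hypothetical utility written for illustration, not part of the authors' method; integer halving is an assumption, although it does reproduce the layer widths reported later in the paper (500, 250, 125, and 62 neurons, and 100, 50, and 25 features).

```python
def halved_layer_sizes(first_layer, n_layers):
    """Return n_layers widths, each one half (rounded down) of the previous one."""
    if n_layers < 1:
        raise ValueError("n_layers must be at least 1")
    sizes = [first_layer]
    while len(sizes) < n_layers:
        sizes.append(sizes[-1] // 2)
    return sizes


RECEPTIVE_INPUT_SIZE = 1024

for first, depth in [(500, 4), (100, 3)]:
    sizes = halved_layer_sizes(first, depth)
    # The first hidden layer should not exceed half of the input size.
    assert sizes[0] <= RECEPTIVE_INPUT_SIZE // 2
    print(first, depth, sizes)
```

Starting the first layer well below the 512-neuron ceiling, as in the 100/50/25 configuration, trades a little reconstruction accuracy for a much smaller network.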
4.1.3. Sparsity Constraint
The primary objective of an autoencoder is to extract higher-level representative features through an unsupervised learning process. In unsupervised learning, an appropriate sparsity constraint can improve the forward learning of an autoencoder. The effect of the sparsity constraint on the reconstruction error of the first autoencoder is shown in Figure 5. It is apparent that the reconstruction error is almost invariant when the sparsity proportion is kept in the range between 0.15 and 0.2. Therefore, to construct the deep neural network, the sparsity proportion in all the hidden layers is set to 0.15.
4.1.4. Number of Hidden Layers
The number of hidden layers influences the learning process of the SSAE-based DNN. Table 3 shows the influence of the number of hidden layers used to develop the DNN. It can be observed that the smallest reconstruction error is obtained when four hidden layers are used in the DNN. However, there is no noticeable decline in the reconstruction error as the number of hidden layers increases. The error remains almost the same when a greater number of hidden layers is used, and the performance of the DNN is almost unchanged.

4.1.5. Average Execution Cost
In addition to the reconstruction error, another metric that is worth considering is the computational cost of the training process, that is, the average amount of time required to train the DNN. The number of hidden layers and the number of nodes in each hidden layer affect the average execution cost, or the time required to train the network. The DNN with the highest number of hidden layers and nodes will have the highest average execution cost, as it has more network parameters to tune. Figure 6 shows the average execution cost for the different DNN structures considered in this study. It can be noted from Figure 6 that the DNN with four hidden layers of 500, 250, 125, and 62 neurons, respectively, has the highest execution cost, whereas the execution cost is reduced when the network architecture is simple, that is, with fewer hidden layers and fewer nodes in each hidden layer (i.e., 100/50). From these observations, it can be concluded that adding more nodes to the hidden layers adds complexity to the DNN structure, thereby requiring more execution time.
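The link between architecture and training cost can be made concrete by counting the tunable parameters. The sketch below is a rough illustration under stated assumptions: it counts only the encoder-side weights and biases of fully connected layers fed by a 1,024-point input, and it ignores the decoder parameters used during pretraining as well as the final classifier layer, so the totals are indicative only. The function name is hypothetical.

```python
def count_parameters(input_size, hidden_sizes):
    """Weights plus biases of a chain of fully connected layers."""
    sizes = [input_size, *hidden_sizes]
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))


INPUT_SIZE = 1024
ARCHITECTURES = {
    "100/50": [100, 50],
    "100/50/25": [100, 50, 25],
    "500/250/125/62": [500, 250, 125, 62],
}

counts = {name: count_parameters(INPUT_SIZE, hidden)
          for name, hidden in ARCHITECTURES.items()}
for name, total in counts.items():
    print(f"{name}: {total:,} parameters")
```

Under these assumptions, the first layer dominates the total, which is why widening the first hidden layer raises the execution cost far more than appending a small extra layer.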
By observing the effect of the different parameters on the two metrics used to evaluate the performance of the DNN, it can be noted that although the reconstruction error is the lowest for the most complex network, that is, the DNN with four hidden layers, the average execution cost is also at its maximum in this case. Moreover, it can be observed in Table 3 that if more nodes are added to a given hidden layer, the network reconstruction error does not decrease substantially. So, to achieve the best tradeoff between reconstruction error and execution cost, a three-layered network structure has been adopted throughout the experiments. The reconstruction error and the execution cost for this network structure vary little compared to the two-layered network structure, where these values are at their minimum. Network structures with multiple hidden layers efficiently perform dimensionality reduction, thereby improving the final classification results. Each hidden layer performs a reduction analogous to principal component analysis on its input and outputs a reduced set of representative features, hence reducing the dimensions of the feature vector. The adopted three-layered network thus provides a reduced set of features, that is, 100, 50, and 25 features per feature set from its first, second, and third layer, respectively. Based on the above discussion and after observing the effect of the different parameters on the performance of stacked sparse autoencoders (SSAEs), the optimal parameters selected for the SSAE-based DNNs are listed in Table 4.

5. Results
Figure 7 shows the extracted complex envelope spectra for the inner raceway, outer raceway, and roller element faults for different shaft speeds. The complex envelope spectrum is used to calculate the energy-frequency distribution of the given vibration signal. In the complex envelope spectrum, defect frequencies exist for a given fault. A noticeable variation of the energy levels, as well as of the energy distribution pattern, can be observed among the spectra of the different fault types. However, for a given fault type, the variation across the different speed conditions is indistinct. By taking advantage of the variations of the energy levels and the defect frequencies present in the complex envelope spectrum for a given fault type, the stacked autoencoders can learn distinct features.
In Figure 8, the scatter plots of the features extracted from the complex envelope spectrum for different shaft speeds are given. It is worth noticing that the features extracted by the autoencoders from the complex envelope spectrum for a given health condition under different shaft speeds are clearly distinguished from one another and clustered separately. These discriminant features enhance the performance of the DNN, enabling effective fault classification when fluctuations of the shaft speed occur. A comparison of the results of the proposed scheme, stacked denoising autoencoders (SDA) [33], and vibration spectral imaging (VSI) [34] for four different experiments is presented in Table 5. In the SDA-based scheme, a two-layer deep neural network was developed using denoising autoencoders. The raw vibration signals describing four health conditions of the bearings were used as inputs, which were then contaminated with noise and segmented into windows of 200 samples. The resulting samples were provided as inputs to the DNN for bearing fault diagnosis, which provided satisfactory diagnostic results for the bearing when using a noisy signal under constant speed conditions. In the VSI-based scheme, the authors presented a bearing fault diagnosis method based on vibration spectrum imaging and artificial neural networks (ANNs). In this scheme, a 513-point fast Fourier transform (FFT) was first calculated from 1,024-sample windows of the vibration signal. The calculated 513-point spectral data was then stacked to create a 513 × 8 grayscale image. The resulting images were subjected to an 8 × 4 smoothing filter and later converted to binary images by using an optimum threshold value (0.7). The resulting binary images, having a total of 4014 frequency components, were fed to an ANN with three hidden nodes. Both schemes (SDA and VSI) were evaluated using the Case Western Reserve University seeded fault bearing dataset.
In addition, the results of a backpropagation neural network (BPNN), trained on the same data used in the proposed method, are also included for comparison. It can be observed that the minimum average fault classification accuracy of the proposed method is 90%. On the other hand, SDA, VSI, and BPNN, despite having superior performance in constant speed scenarios, fail to provide better results when speed fluctuations are experienced. Based on the results of this study, the proposed method outperformed the existing methods. In the proposed method, the variations of the energy levels and the presence of defect frequencies in the complex envelope spectrum of a given fault made the anomalous pattern more prominent, helping the autoencoders to efficiently mine informative features that distinguish the machine health conditions under shaft speed fluctuations.

In addition to the steady-state regime, this work also presents the results of an experiment in which subsamples from each fault category and operating speed are taken for training the SAE-based DNN. The results obtained using the proposed model in this configuration are compared with those of ANN and SDA. These experimental results are shown in Figure 9, and they clearly reveal that the proposed method yields better results than the other two algorithms when subsamples from each fault category and operating speed are used for training the network.
6. Conclusions
This work presents a stacked sparse autoencoder-based deep neural network (SSAE-DNN), which, in combination with a complex envelope spectrum as input, performs fault diagnosis of rotary machines when there are fluctuations of the shaft speed. In the proposed scheme, vibration signals related to different health conditions of a motor bearing are preprocessed by computing their complex envelope spectrum. The information obtained by the stacked autoencoders from the defect frequency, as well as from its principal harmonics present in the complex envelope spectrum for a given fault, makes it possible to classify faults under varying speeds. The efficiency of the proposed scheme was validated using rotating machine bearing data for four different shaft speeds. A series of experiments was performed in which the fault data for the four shaft speeds was divided into separate datasets and each dataset was processed separately for fault diagnosis. In each experiment, the complex envelope spectrum of one operating speed was used to train the network before testing with datasets comprising the remaining three shaft speeds. This procedure was conducted for all of the datasets. The minimum average classification accuracy for every experiment was 90%, which demonstrates that the proposed scheme can classify faults when fluctuations of the shaft speed exist. This scheme was trained and tested on the complex envelope spectrum of high-speed bearings. Therefore, the proposed method can perform fault diagnosis on vibration signals with a high and variable shaft speed and periodicity.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this manuscript.
Acknowledgments
This work was supported by the Korea Institute of Energy Technology Evaluation and Planning (KETEP) and the Ministry of Trade, Industry & Energy (MOTIE) of the Republic of Korea (nos. 20162220100050, 20161120100350, and 20172510102130). It was also funded in part by the Leading Human Resource Training Program of Regional Neo Industry through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (NRF2016H1D5A1910564), in part through the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2016R1D1A3B03931927), and in part by the development of a basic fusion technology in the electric power industry (Ministry of Trade, Industry & Energy, 201301010170D).
References
 K. S. Sachin, K. P. Soman, N. Mohan, and P. Poornachandran, “Condition monitoring in roller bearings using cyclostationary features,” in Proceedings of the 3rd International Symposium on Women in Computing and Informatics, WCI 2015, pp. 690–697, India, August 2015. View at: Publisher Site  Google Scholar
 J. Chen, Z. Li, J. Pan et al., “Wavelet transform based on inner product in fault diagnosis of rotating machinery: a review,” Mechanical Systems and Signal Processing, vol. 7071, pp. 1–35, 2016. View at: Publisher Site  Google Scholar
 S. Li, G. Liu, X. Tang, J. Lu, and J. Hu, “An ensemble deep convolutional neural network model with improved DS evidence fusion for bearing fault diagnosis,” Sensors, vol. 17, no. 8, article no. 1729, 2017. View at: Publisher Site  Google Scholar
 H. Zhou, T. Shi, G. Liao et al., “Weighted kernel entropy component analysis for fault diagnosis of rolling bearings,” Sensors, vol. 17, no. 3, article no. 625, 2017. View at: Publisher Site  Google Scholar
 A. Rai and S. H. Upadhyay, “A review on signal processing techniques utilized in the fault diagnosis of rolling element bearings,” Tribology International, vol. 96, pp. 289–306, 2016. View at: Publisher Site  Google Scholar
 W. A. Smith and R. B. Randall, “Rolling element bearing diagnostics using the Case Western Reserve University data: a benchmark study,” Mechanical Systems and Signal Processing, vol. 6465, pp. 100–131, 2015. View at: Publisher Site  Google Scholar
 P. W. Tse and D. Wang, “State space formulation of nonlinear vibration responses collected from a dynamic rotorbearing system: An extension of bearing diagnostics to bearing prognostics,” Sensors, vol. 17, no. 2, article no. 369, 2017. View at: Publisher Site  Google Scholar
 D. Hansen and A. H. Olsson, “ISO Standard 133732: 2005: Condition monitoring and diagnostics of machines–Vibration condition monitoring–Part 2: Processing, analysis and presentation of vibration data,” International Standards Organization, 2009. View at: Google Scholar
 P. J. Tavner, “Review of condition monitoring of rotating electrical machines,” IET Electric Power Applications, vol. 2, no. 4, pp. 215–247, 2008. View at: Publisher Site  Google Scholar
 J. Huang, G. Chen, L. Shu, S. Wang, and Y. Zhang, “An Experimental Study of Clogging Fault Diagnosis in Heat Exchangers Based on Vibration Signals,” IEEE Access, vol. 4, pp. 1800–1809, 2016. View at: Publisher Site  Google Scholar
 Z. Huo, Y. Zhang, P. Francq, L. Shu, and J. Huang, “Incipient fault diagnosis of roller bearing using optimized wavelet transform based multispeed vibration signatures,” IEEE Access, 2017. View at: Publisher Site  Google Scholar
 M. Sohaib and J.M. Kim, “A Robust Deep Learning Based Fault Diagnosis of Rotary Machine Bearings,” Advanced Science Letters, vol. 23, no. 12, pp. 12797–12801, 2017. View at: Google Scholar
 S. A. Khan and J.M. Kim, “Rotational speed invariant fault diagnosis in bearings using vibration signal imaging and local binary patterns,” The Journal of the Acoustical Society of America, vol. 139, no. 4, pp. EL100–EL104, 2016. View at: Publisher Site  Google Scholar
 V. Tra, J. Kim, S. A. Khan, and J.-M. Kim, “Incipient fault diagnosis in bearings under variable speed conditions using multiresolution analysis and a weighted committee machine,” The Journal of the Acoustical Society of America, vol. 142, no. 1, pp. EL35–EL41, 2017.
 M. Sohaib, C.-H. Kim, and J.-M. Kim, “A hybrid feature model and deep-learning-based bearing fault diagnosis,” Sensors, vol. 17, no. 12, article no. 2876, 2017.
 J. Burriel-Valencia, R. Puche-Panadero, J. Martinez-Roman, A. Sapena-Bano, and M. Pineda-Sanchez, “Short-Frequency Fourier Transform for Fault Diagnosis of Induction Machines Working in Transient Regime,” IEEE Transactions on Instrumentation and Measurement, vol. 66, no. 3, pp. 432–440, 2017.
 L. Wu, B. Yao, Z. Peng, and Y. Guan, “Fault diagnosis of roller bearings based on a wavelet neural network and manifold learning,” Applied Sciences (Switzerland), vol. 7, no. 2, article no. 158, 2017.
 M. Kang, M. R. Islam, J. Kim, J.-M. Kim, and M. Pecht, “A hybrid feature selection scheme for reducing diagnostic performance deterioration caused by outliers in data-driven diagnostics,” IEEE Transactions on Industrial Electronics, vol. 63, no. 5, pp. 3299–3310, 2016.
 M. M. M. Islam, J. Kim, S. A. Khan, and J.-M. Kim, “Reliable bearing fault diagnosis using Bayesian inference-based multiclass support vector machines,” The Journal of the Acoustical Society of America, vol. 141, no. 2, pp. EL89–EL95, 2017.
 D. K. Appana, M. R. Islam, and J.-M. Kim, “Reliable fault diagnosis of bearings using distance and density similarity on an Enhanced kNN,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10142, pp. 193–203, 2017.
 Z. Chen, X. Zeng, W. Li, and G. Liao, “Machine fault classification using deep belief network,” in Proceedings of the 2016 IEEE International Instrumentation and Measurement Technology Conference, I2MTC 2016, Taiwan, May 2016.
 Y. G. Lei, Z. J. He, and Y. Y. Zi, “EEMD method and WNN for fault diagnosis of locomotive roller bearings,” Expert Systems with Applications, vol. 38, no. 6, pp. 7334–7341, 2011.
 X. Zhang, B. Wang, and X. Chen, “Intelligent fault diagnosis of roller bearings with multivariable ensemble-based incremental support vector machine,” Knowledge-Based Systems, vol. 89, pp. 56–85, 2015.
 H. Al-Bugharbee and I. Trendafilova, “A fault diagnosis methodology for rolling element bearings based on advanced signal pretreatment and autoregressive modelling,” Journal of Sound and Vibration, vol. 369, pp. 246–265, 2016.
 S. Afaq Ali Shah, M. Bennamoun, and F. Boussaid, “Iterative deep learning for image set based face and object recognition,” Neurocomputing, vol. 174, pp. 866–874, 2016.
 G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504–507, 2006.
 R. Sarikaya, G. E. Hinton, and A. Deoras, “Application of deep belief networks for natural language understanding,” IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 22, no. 4, pp. 778–784, 2014.
 R. Zhang, Z. Peng, L. Wu, B. Yao, and Y. Guan, “Fault diagnosis from raw sensor data using deep neural networks considering temporal coherence,” Sensors, vol. 17, no. 3, article no. 549, 2017.
 Case Western Reserve University, http://csegroups.case.edu/bearingdatacenter/home/, 2017.
 T. W. Rauber, F. de Assis Boldt, and F. M. Varejão, “Heterogeneous feature models and feature selection applied to bearing fault diagnosis,” IEEE Transactions on Industrial Electronics, vol. 62, no. 1, pp. 637–646, 2015.
 A. Coates, A. Ng, and H. Lee, “An analysis of single-layer networks in unsupervised feature learning,” in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pp. 215–223, ACM, Ft. Lauderdale, FL, USA, 2011.
 A. Ng, “Sparse autoencoder,” CS294A Lecture notes, vol. 72, no. 2011, pp. 1–19, 2011.
 C. Lu, Z. Y. Wang, W. L. Qin, and J. Ma, “Fault diagnosis of rotary machinery components using a stacked denoising autoencoder-based health state identification,” Signal Processing, vol. 130, pp. 377–388, 2017.
 M. Amar, I. Gondal, and C. Wilson, “Vibration spectrum imaging: a novel bearing fault classification approach,” IEEE Transactions on Industrial Electronics, vol. 62, no. 1, pp. 494–502, 2015.
Copyright
Copyright © 2018 Muhammad Sohaib and Jong-Myon Kim. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.