Research Article  Open Access
Dong Wang, Kwok-Leung Tsui, Peter W. Tse, Ming J. Zuo, "Principal Components of Superhigh-Dimensional Statistical Features and Support Vector Machine for Improving Identification Accuracies of Different Gear Crack Levels under Different Working Conditions", Shock and Vibration, vol. 2015, Article ID 420168, 14 pages, 2015. https://doi.org/10.1155/2015/420168
Principal Components of Superhigh-Dimensional Statistical Features and Support Vector Machine for Improving Identification Accuracies of Different Gear Crack Levels under Different Working Conditions
Abstract
Gears are widely used in gearboxes to transmit power from one shaft to another. Gear crack is one of the most frequent gear fault modes found in industry. Identifying different gear crack levels helps prevent unexpected machine breakdown and reduce economic loss, because a gear crack leads to gear tooth breakage. In this paper, an intelligent fault diagnosis method for identification of different gear crack levels under different working conditions is proposed. First, superhigh-dimensional statistical features are extracted from the continuous wavelet transform at different scales. The proposed method extracts 920 statistical features, so the extracted feature set is superhigh dimensional. To reduce the dimensionality of the extracted statistical features and generate new significant low-dimensional statistical features, a simple and effective method called principal component analysis is used. To further improve identification accuracies of different gear crack levels under different working conditions, a support vector machine is employed. Three experiments are investigated to show the superiority of the proposed method. Comparisons with other existing gear crack level identification methods are conducted. The results show that the proposed method has the highest identification accuracies among all the compared methods.
1. Introduction
Gears are commonly used in mechanical transmission systems to transmit power from one shaft to another. Because a gear is a mechanical component, its performance degrades over time in service [1–4]. Gear crack evaluation is a special case in which the gear crack is taken as the first agent in the gear performance degradation process [5]. Identifying different gear crack levels helps prevent unexpected machine breakdown and reduce economic loss, because a gear crack leads to gear tooth breakage [6]. Vibration analysis is a major tool for diagnosing gear faults because vibration signals are easily collected from the casing of gearboxes [7, 8]. In past years, wavelet analysis of vibration signals has been widely used to diagnose gear faults [9–11]. Additionally, in recent years, to diagnose the multistage gearbox used in a bucket wheel excavator system, Bartelmus and Zimroz [12–16] proposed a simple and effective health indicator for distinguishing the good and bad health conditions of the multistage gearbox. Data collected from such complex machines are nonstationary gearbox vibration signals under time-varying working conditions. Because a gearbox in a bad health condition is more susceptible to external varying loads, the proposed health indicator was designed to be a function of the instantaneous input speed. Their results show that the health indicator has a linear relationship with the instantaneous input speed whether the gearbox is in the good or the bad health condition. Moreover, as the gearbox degrades, the slope of the linear relationship increases.
Wavelet analysis is one of the most popular methods for diagnosing gear faults. For wavelet analysis, the selection of a proper wavelet basis is crucial because the analysis calculates the inner product between a signal and the wavelet basis. Recently, Rafiee et al. [17, 18] made an exhaustive study of the performance of 324 mother wavelet candidates on gear fault feature extraction. Their results showed that Daubechies 44 has the shape most similar to gear fault features. Even though gear fault diagnosis has become a hot topic, according to our literature review, only a few methods, such as mean frequency of the Scalogram [19], instantaneous energy density [20], regularization dimension [21], cyclic spectral analysis [22], and self-similarity [23], are closely related to identification of different gear crack levels. However, because those methods are based on signal processing, interpreting their results requires expertise.
To automatically identify different gear crack levels, intelligent methods [24–26] need to be developed. To fill this gap, first, a series of experiments [5] were conducted to collect gear vibration signals under different working conditions including four different motor speeds and three different loads. Five different artificial gear crack levels, including crack level 0%, crack level 25%, crack level 50%, crack level 75%, and crack level 100%, were produced to simulate gear crack deterioration levels. The definition of these different gear crack levels is given in Section 3.1. Then, a weighted nearest neighbor algorithm [24] was proposed to identify three different gear crack levels, including crack level 0%, crack level 25%, and crack level 50%, under four different motor speeds and three different loads, because identification of early gear crack levels is more useful for preventive maintenance. The results in [24] showed that the weighted nearest neighbor algorithm achieves high prediction accuracies in identifying the three different gear crack levels under the four different motor speeds and the three different loads.
In this paper, to improve the work reported in [24], identification of three gear crack levels under the different loads and speeds is extended to identification of five gear crack levels under the different loads and speeds. This means that two more crack deterioration levels, crack level 75% and crack level 100%, are considered. The gear crack levels ranging from 0% to 100% in increments of 25% are used to describe the whole gear crack deterioration process. Because the number of gear crack levels increases, the gear crack level identification problem under different loads and speeds becomes more complicated and difficult. Therefore, it is necessary to propose an advanced gear crack identification method.
The rest of this paper is organized as follows. The proposed method for identification of different gear crack levels under different loads and speeds is introduced in Section 2. Three experiments are investigated in Section 3 to illustrate how the proposed method works and comparisons with other existing gear crack level identification methods are conducted. Conclusions are drawn in Section 4.
2. The Proposed Method for Identification of Different Gear Crack Levels under Different Motor Speeds and Loads
The proposed method for identification of five different gear crack levels under different loads and speeds is summarized in Figure 1, where the mathematical formulas of the proposed method are introduced in the following subsections.
First, to represent different gear crack levels, statistical features must be extracted from a gear vibration signal. Traditionally, statistical features are directly extracted from the temporal gear vibration signal and its corresponding frequency spectrum, which is obtained by applying the Fourier transform to the temporal signal. These features can be regarded as “global” statistical features. They are useful if the signal-to-noise ratio of the gear vibration signal is high, in other words, if the fault features caused by a gear crack can be clearly found in the gear vibration signal and its corresponding frequency spectrum. However, besides the fault features caused by a gear crack, the gear vibration signal contains considerable noise and unknown vibration components. It is therefore necessary to enhance the signal-to-noise ratio of the gear vibration signal before the statistical features are extracted. Unlike the Fourier transform, which decomposes the gear vibration signal into a sum of global complex exponentials, a continuous wavelet transform uses an inner product operation to measure the local similarity between a gear vibration signal and a wavelet mother function. The wavelet mother function is a locally oscillating analyzing function that can be shifted and scaled: the smaller the scale, the more compressed the wavelet mother function; the larger the scale, the more stretched. The continuous wavelet transform at different scales thus facilitates detecting the different local characteristics of the gear vibration signal, such as the features generated by a gear crack.
Therefore, in this paper, the continuous wavelet transform at different scales, namely, Scalogram [27, 28], is conducted on the gear vibration signal to highlight the local gear fault features. If statistical features are extracted from the continuous wavelet coefficients, these statistical features can be regarded as “local” statistical features and can be used to better represent different gear crack levels. Additionally, it is not difficult to find that the resulting wavelet coefficients are so redundant that the statistical features extracted from the resulting wavelet coefficients are redundant because the wavelet functions at some of these different scales have very similar shapes.
Ten popular statistical features, including mean, standard deviation, root mean square, peak, skewness, kurtosis, crest factor, clearance factor, shape factor, and impulse factor, are applied to the Scalogram at scales ranging from 1 to 45 to generate 920-dimensional redundant statistical features for the representation of the gear crack level. The dimensionality of the redundant statistical features is calculated as follows. Based on the above ten statistical features and the Scalogram at scales ranging from 1 to 45, $10 \times 45 = 450$ redundant statistical features are extracted. Besides the statistical features extracted from the Scalogram, it is necessary to extract the statistical features from the frequency spectra of the Scalogram, which exhibit different gear fault signatures from the Scalogram [29]. Consequently, there are $2 \times 450 = 900$ redundant statistical features extracted from the Scalogram and its frequency spectra. Considering 20 more “global” statistical features directly extracted from the original gear vibration signal and its corresponding frequency spectrum, there are $900 + 20 = 920$ redundant statistical features in all. Such redundant statistical features characterize the “global” and “local” features of the gear vibration signal. Moreover, according to our literature review, the use of high-dimensional redundant features for identification of different gear faults is rarely reported.
Second, compared with the statistical features used in other intelligent gear crack level identification methods [24–26], in which only 10 to 30 statistical features were extracted, the 920 statistical features extracted by the proposed method are highly redundant and superhigh dimensional, so that it is impractical to directly use all of them to train and test statistical models. Before any statistical model is used, it is necessary to reduce the dimensionality of the 920 redundant statistical features. There are many dimensionality reduction methods, both linear and nonlinear [30]. The nonlinear dimensionality reduction methods have the following disadvantages. First, their computational efficiency is low. Second, they need a large amount of computer memory; without it, they fail to generate low-dimensional features. Third, even though these nonlinear methods can be used to process training data, mapping of testing data to the low-dimensional space, namely, out-of-sample extension, is still questionable and often results in distinct estimation errors. Besides, through numerical and real case studies, the authors of [30] concluded that nonlinear dimensionality reduction methods do not outperform traditional linear dimensionality reduction methods, such as principal component analysis. Therefore, a simple and efficient linear dimensionality reduction method called principal component analysis [31] is employed in this paper to generate new significant low-dimensional statistical features, namely, principal components, to distinguish different gear crack levels under the different working conditions.
Finally, to ensure high training and testing accuracies, a support vector machine [32, 33] is used to identify different gear crack levels under the different working conditions because it can use the kernel trick to map these new significant statistical features to a high-dimensional feature space, where linear classification is possible.
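The feature count described above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function name and its parameters are hypothetical and only restate the numbers given in the text.

```python
def redundant_feature_count(n_stats=10, n_scales=45, n_global=20):
    """Count the redundant statistical features described in the text."""
    from_scalogram = n_stats * n_scales   # ten features at each of the 45 scales
    from_spectra = n_stats * n_scales     # the same ten features on each scale's frequency spectrum
    return from_scalogram + from_spectra + n_global  # plus the "global" features


print(redundant_feature_count())  # 920
```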
2.1. Redundant Feature Extraction
Continuous wavelet transform [27] aims to use an artificial wavelet mother function to calculate the inner product between a signal $x(t)$ and a wavelet mother function $\psi(t)$ at different scales $a$ and translations $b$:
$$W(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} x(t)\, \psi^{*}\!\left(\frac{t - b}{a}\right) dt, \tag{1}$$
where $*$ is the complex conjugate operator and $W(a, b)$ are the wavelet coefficients. From (1), it is seen that the continuous wavelet transform converts a one-dimensional signal to a two-dimensional signal (a time-scale representation), which generates redundant wavelet coefficients at different scales. Besides, the wavelet mother function has a significant impact on the wavelet coefficients: different wavelet mother functions result in different wavelet coefficients. Therefore, for the use of continuous wavelet transform, proper selection of the wavelet mother function is an open question. As mentioned in the Introduction, Daubechies 44 is used in this paper and its temporal waveform is plotted in Figure 2(a). To show the redundant wavelet coefficients obtained by the continuous wavelet transform, the data with gear crack levels 0% and 100% are plotted in Figures 2(b) and 2(c), respectively, and their corresponding wavelet coefficients are plotted in Figures 2(d) and 2(e), respectively, in which the absolute wavelet coefficients are used to enhance the three-dimensional visualization of the wavelet coefficients. From the results shown in Figures 2(d) and 2(e), it is seen that each of the one-dimensional gear signals shown in Figures 2(b) and 2(c) is transformed to a two-dimensional time-scale diagram, which consists of the redundant wavelet coefficients. As a result, compared with the original signals shown in Figures 2(b) and 2(c), the redundant wavelet coefficients provide more fault signatures. Following the use of the continuous wavelet transform, the 10 statistical features shown in the following can be applied to quantify the wavelet coefficients at scales ranging from 1 to 45 and their corresponding frequency spectra.
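To make the inner-product view of the continuous wavelet transform concrete, the sketch below evaluates a discretized version of the transform with a unit sampling interval. It is an illustration under stated assumptions, not the authors' implementation: the Mexican-hat (Ricker) wavelet is a stand-in for Daubechies 44, the synthetic signal is invented, and the names `ricker` and `cwt` are hypothetical.

```python
import math


def ricker(t):
    """Mexican-hat (Ricker) wavelet; a real-valued stand-in for Daubechies 44 (assumption)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def cwt(signal, scales, wavelet=ricker):
    """Return coeffs[i][b] = a**-0.5 * sum_t signal[t] * wavelet((t - b) / a) for a = scales[i].

    The wavelet is real, so the complex conjugate leaves it unchanged.
    The cost is O(N^2) per scale, which is acceptable only for short signals.
    """
    n = len(signal)
    coeffs = []
    for a in scales:
        norm = 1.0 / math.sqrt(a)
        row = []
        for b in range(n):
            acc = sum(signal[t] * wavelet((t - b) / a) for t in range(n))
            row.append(norm * acc)
        coeffs.append(row)
    return coeffs


# Synthetic signal (invented for illustration): a sine wave with one impulse at sample 32.
signal = [math.sin(2 * math.pi * t / 16) for t in range(64)]
signal[32] += 5.0

scalogram = cwt(signal, scales=range(1, 6))
print(len(scalogram), len(scalogram[0]))  # 5 scales by 64 translations
```

At the smallest scale the compressed wavelet responds most strongly at the impulse, which is the kind of local feature the text attributes to a gear crack.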
Considering 20 more statistical features extracted from the original signal (without processing by the continuous wavelet transform) and its frequency spectrum, there are $2 \times 10 \times 45 + 20 = 920$ redundant statistical features in all. The statistical features used for quantifying the wavelet coefficients at different scales are listed below; the same statistical features are used to quantify the frequency spectra of the wavelet coefficients at different scales, and $N$ denotes the length of the signal.
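Figure 2: (a) Temporal waveform of Daubechies 44; (b) and (c) vibration signals with gear crack levels 0% and 100%; (d) and (e) the corresponding absolute wavelet coefficients.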
Mean Value
Root Mean Square
Standard Deviation
Skewness
Kurtosis
Crest Factor
Clearance Factor
Shape Factor
Impulse Factor
Maximum
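The ten features named above are commonly defined as in the following sketch. The exact normalizations used by the authors (for example, $N$ versus $N - 1$ in the standard deviation, or whether the maximum is taken over absolute values) are not recoverable from the text, so the choices below are assumptions, and the function name is hypothetical.

```python
import math


def statistical_features(values):
    """Ten common statistical features of one row of wavelet coefficients (or of a spectrum).

    Assumes len(values) >= 2 and a non-constant, non-zero input; otherwise a division by zero occurs.
    """
    n = len(values)
    mean = sum(values) / n
    rms = math.sqrt(sum(v * v for v in values) / n)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))   # N - 1 is an assumption
    skewness = sum((v - mean) ** 3 for v in values) / ((n - 1) * std ** 3)
    kurtosis = sum((v - mean) ** 4 for v in values) / ((n - 1) * std ** 4)
    peak = max(abs(v) for v in values)                                # maximum absolute value (assumption)
    mean_abs = sum(abs(v) for v in values) / n
    mean_sqrt_abs = sum(math.sqrt(abs(v)) for v in values) / n
    return {
        "mean": mean,
        "rms": rms,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "crest_factor": peak / rms,
        "clearance_factor": peak / mean_sqrt_abs ** 2,
        "shape_factor": rms / mean_abs,
        "impulse_factor": peak / mean_abs,
        "maximum": peak,
    }


demo = statistical_features([0.5, -1.0, 2.0, -0.5, 1.5])
print(sorted(demo))
```

Applying this function to each of the 45 scale rows and to each row's frequency spectrum, and then to the raw signal and its spectrum, would give the 920 values counted in the text.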
2.2. Dimensionality Reduction
In the previous section, the superhigh-dimensional statistical features are extracted based on the use of the continuous wavelet transform. The dimensionality of the statistical features is 920. The direct use of all 920 statistical features for identification of different gear crack levels under the different working conditions will lead to the curse of dimensionality [34], which means that the amount of data needed to support any result grows exponentially with the dimensionality. To relieve this problem, dimensionality reduction must be conducted prior to the use of support vector machines. As discussed at the beginning of Section 2, principal component analysis is chosen in this paper and its fundamentals are introduced in the following [31]. Suppose that $n$ training samples for different gear crack levels are obtained. Based on the $n$ training samples and the 920 statistical features extracted from each sample, the feature matrix $\mathbf{X}$ is constructed as follows:
$$\mathbf{X} = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,920} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,920} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,920} \end{bmatrix}. \tag{12}$$
From (12), it is seen that the statistical features used in (12) are highly redundant. Principal component analysis aims to generate significant new statistical features from the high-dimensional space and form a low-dimensional orthogonal space to express different gear crack levels. Each of the newly generated features is called a principal component. The first principal component has the greatest variance, the second principal component has the second greatest variance, and so on. Suppose that each column (feature) of (12) has zero mean and unit variance. To find the direction $\mathbf{w}$ of greatest variance, the following optimization problem is constructed:
$$\max_{\mathbf{w}} \; \mathbf{w}^{\mathrm{T}} \mathbf{X}^{\mathrm{T}} \mathbf{X} \mathbf{w} \quad \text{subject to} \quad \mathbf{w}^{\mathrm{T}} \mathbf{w} = 1, \tag{13}$$
where $\mathrm{T}$ is the transpose operator. The Lagrange function of (13) is built as follows:
$$L(\mathbf{w}, \lambda) = \mathbf{w}^{\mathrm{T}} \mathbf{X}^{\mathrm{T}} \mathbf{X} \mathbf{w} - \lambda \left( \mathbf{w}^{\mathrm{T}} \mathbf{w} - 1 \right), \tag{14}$$
where $\lambda$ is the Lagrange multiplier. The first partial derivative of $L$ with respect to $\mathbf{w}$ is obtained as follows:
$$\frac{\partial L}{\partial \mathbf{w}} = 2 \mathbf{X}^{\mathrm{T}} \mathbf{X} \mathbf{w} - 2 \lambda \mathbf{w}. \tag{15}$$
If (15) is set to zero, the relationship between $\mathbf{w}$ and the Lagrange multiplier $\lambda$ is the eigenvalue equation of the symmetric matrix $\mathbf{X}^{\mathrm{T}} \mathbf{X}$ and it is written as follows:
$$\mathbf{X}^{\mathrm{T}} \mathbf{X} \mathbf{w} = \lambda \mathbf{w}. \tag{16}$$
Suppose that the first column of $\mathbf{W}$ is the eigenvector corresponding to the largest eigenvalue, the second column of $\mathbf{W}$ is the eigenvector corresponding to the second largest eigenvalue, and so on. Consequently, the feature matrix shown in (12) can be mapped to a new space consisting of the principal components:
$$\mathbf{P} = \mathbf{X} \mathbf{W}, \tag{17}$$
where the first column of $\mathbf{P}$ is the first principal component $\mathbf{p}_1$, the second column of $\mathbf{P}$ is the second principal component $\mathbf{p}_2$, and so on. For fair comparison with the other existing gear crack level identification methods reported in [24], where seven statistical features were selected from 25 statistical features, the first seven principal components obtained by the proposed method are used in this paper to train and test support vector machines. Besides, the testing data can be directly mapped to the new principal component space by using the established linear transformation matrix $\mathbf{W}$.
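The eigenvector view of principal component analysis can be sketched in a few lines. The example below is a minimal illustration, assuming power iteration with deflation as the eigen-solver (the text does not say which solver the authors used); the function names and the toy data are hypothetical.

```python
import math


def standardize(rows):
    """Give every column zero mean and unit variance (constant columns become zeros)."""
    n, d = len(rows), len(rows[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [r[j] for r in rows]
        mu = sum(col) / n
        sd = math.sqrt(sum((v - mu) ** 2 for v in col) / (n - 1))
        for i in range(n):
            out[i][j] = (col[i] - mu) / sd if sd > 0 else 0.0
    return out


def top_eigenvectors(sym, k, iters=500):
    """Power iteration with deflation on a symmetric positive semidefinite matrix."""
    d = len(sym)
    mat = [row[:] for row in sym]
    vectors = []
    for _component in range(k):
        # Deterministic start; assumed not to be orthogonal to the wanted eigenvector.
        w = [1.0 + j for j in range(d)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        for _step in range(iters):
            v = [sum(mat[i][j] * w[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in v))
            if norm < 1e-12:
                break
            w = [x / norm for x in v]
        lam = sum(w[i] * mat[i][j] * w[j] for i in range(d) for j in range(d))
        vectors.append(w)
        # Deflate: remove the component that was just found.
        mat = [[mat[i][j] - lam * w[i] * w[j] for j in range(d)] for i in range(d)]
    return vectors


def pca(rows, k):
    """Return (eigenvectors of X^T X, projected rows) for the first k principal components."""
    x = standardize(rows)
    n, d = len(x), len(x[0])
    gram = [[sum(x[i][a] * x[i][b] for i in range(n)) for b in range(d)] for a in range(d)]
    w = top_eigenvectors(gram, k)
    scores = [[sum(r[j] * w[c][j] for j in range(d)) for c in range(k)] for r in x]
    return w, scores


# Toy data (invented): two strongly correlated features.
rows = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.8]]
w, scores = pca(rows, 2)
print([round(v, 3) for v in w[0]])
```

This sketch standardizes and projects the same rows. To map testing data with the established matrix, the training means and standard deviations would presumably be reused, which the sketch does not show.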
2.3. Identification of Different Gear Crack Levels under Different Working Conditions
To automatically identify different gear crack levels under different motor speeds and loads, a support vector machine [32] is used in this paper. It is a supervised learning method that has been widely investigated in past years for solving various classification and regression problems. Given the training data of two different gear crack levels $\{(\mathbf{x}_i, y_i)\}_{i=1}^{n}$, where $\mathbf{x}_i \in \mathbb{R}^{d}$ is the statistical feature vector with a dimensionality of $d$ and $y_i \in \{-1, +1\}$ is the binary classification label, if the training data are linearly separable, a linear decision function can be determined by solving the following optimization problem [32]:
$$\min_{\mathbf{w}, b} \; \frac{1}{2} \lVert \mathbf{w} \rVert^{2} \quad \text{subject to} \quad y_i \left( \mathbf{w} \cdot \mathbf{x}_i + b \right) \geq 1, \quad i = 1, \ldots, n, \tag{18}$$
where $\mathbf{w}$ is the normal vector to the decision function, $\cdot$ is the dot product, and $b$ is the offset of the decision function. The objective function of (18) aims to maximize the distance between two hyperplanes that have no training data between them. This means that the linear decision function maximizes its distance to the nearest training data.
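Considering the noise with the slack variables $\xi_i$ and the error penalty constant $C$, (18) is revised as [32]
$$\min_{\mathbf{w}, b, \xi} \; \frac{1}{2} \lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i \quad \text{subject to} \quad y_i \left( \mathbf{w} \cdot \mathbf{x}_i + b \right) \geq 1 - \xi_i, \quad \xi_i \geq 0. \tag{19}$$
Solving (19) is equivalent to solving the following Lagrangian problem with Lagrange multipliers $\alpha_i \geq 0$ and $\beta_i \geq 0$:
$$L = \frac{1}{2} \lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i - \sum_{i=1}^{n} \alpha_i \left[ y_i \left( \mathbf{w} \cdot \mathbf{x}_i + b \right) - 1 + \xi_i \right] - \sum_{i=1}^{n} \beta_i \xi_i. \tag{20}$$
Taking the derivatives of (20) with respect to $\mathbf{w}$, $b$, and $\xi_i$, respectively, and setting them to zero, it is derived that
$$\mathbf{w} = \sum_{i=1}^{n} \alpha_i y_i \mathbf{x}_i, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0, \qquad C - \alpha_i - \beta_i = 0. \tag{21}$$
Substituting (21) into (20), (20) becomes a dual quadratic optimization problem [32]:
$$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \left( \mathbf{x}_i \cdot \mathbf{x}_j \right) \quad \text{subject to} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \quad 0 \leq \alpha_i \leq C. \tag{22}$$
After solving (22), the linear decision function is obtained as follows:
$$f(\mathbf{x}) = \operatorname{sign} \left( \sum_{i=1}^{n} \alpha_i y_i \left( \mathbf{x}_i \cdot \mathbf{x} \right) + b \right). \tag{23}$$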
To extend the linear classification problem to the nonlinear classification problem, the kernel trick can be used to map the training data to a high-dimensional space, where the linear classification problem can possibly be solved, prior to the establishment of (23). Considering the kernel function, (23) can be revised as [32]
$$f(\mathbf{x}) = \operatorname{sign} \left( \sum_{i=1}^{n} \alpha_i y_i K(\mathbf{x}_i, \mathbf{x}) + b \right), \tag{24}$$
where $\alpha_i$ are the Lagrange multipliers obtained from (22) and $K(\cdot, \cdot)$ is the linear or nonlinear kernel function, which should satisfy Mercer’s theorem. There are three popular kernel functions: linear, polynomial, and Gaussian radial basis functions [32]. Generally, the Gaussian radial basis function is the preferable choice for the support vector machine because, unlike the linear kernel, it can handle the nonlinear classification problem. Furthermore, the Gaussian radial basis function has fewer hyperparameters than the polynomial function. Therefore, in this paper, the Gaussian radial basis function is chosen and (24) is revised as
$$f(\mathbf{x}) = \operatorname{sign} \left( \sum_{i=1}^{n} \alpha_i y_i \exp \left( -\frac{\lVert \mathbf{x}_i - \mathbf{x} \rVert^{2}}{2 \sigma^{2}} \right) + b \right), \tag{25}$$
where $\sigma$ is the kernel parameter and $\lVert \cdot \rVert$ is the modulus of the feature vector. To classify five different gear crack levels, the popular one-against-all strategy and one-against-one strategy can be used [32, 33].
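The kernelized decision function can be evaluated directly once the multipliers are known. The sketch below assumes the Gaussian kernel form $\exp(-\lVert \mathbf{x}_i - \mathbf{x} \rVert^2 / (2\sigma^2))$; the support vectors, multipliers, and offset are hand-picked hypothetical values rather than the result of training, and the function names are illustrative.

```python
import math
from itertools import combinations


def rbf_kernel(x, z, sigma):
    """Gaussian radial basis function exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def decision(x, support_vectors, labels, alphas, bias, sigma):
    """Sign of sum_i alpha_i * y_i * K(x_i, x) + b for one binary classifier."""
    total = sum(a * y * rbf_kernel(sv, x, sigma)
                for sv, y, a in zip(support_vectors, labels, alphas))
    return 1 if total + bias >= 0 else -1


# Hypothetical values (not trained): one support vector per class,
# chosen so that sum_i alpha_i * y_i = 0 holds.
svs = [[0.0, 0.0], [1.0, 1.0]]
ys = [-1, 1]
alphas = [1.0, 1.0]

print(decision([0.9, 0.8], svs, ys, alphas, bias=0.0, sigma=0.5))  # 1, closer to the +1 vector

# One-against-one needs one binary classifier per pair of crack levels.
crack_levels = [0, 25, 50, 75, 100]
print(len(list(combinations(crack_levels, 2))))  # 10
```

With five crack levels, one-against-one would train ten binary classifiers and combine them by voting, whereas one-against-all would train five.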
3. Instance Studies and Comparisons with Other Existing Advanced Gear Crack Level Identification Methods
3.1. Experimental Platform
In this paper, one of the coauthors [5] designed the experiments to collect the different gear crack level data under different motor speeds and loads from the experimental setup shown in Figures 3(a) and 3(b). The experimental setup included a gearbox, a 3-hp AC motor, which was used to drive the input shaft of the gearbox, and a magnetic brake, which was used to provide different loads. Four different motor speeds (1200 rpm, 1400 rpm, 1600 rpm, and 1800 rpm) and three different loads (no load, half load, and full load) were used. Gears 1, 2, 3, and 4 had 48, 16, 24, and 40 teeth, respectively. Gear 3 was the tested gear used in the experimental setup. Artificial gear cracks denoted as crack levels 0%, 25%, 50%, 75%, and 100% were produced to simulate all gear crack deterioration levels and their geometries are tabulated in Table 1. The crack thickness was 0.4 mm because the thinnest knife available in the coauthor’s lab was 0.4 mm. For the four nonzero gear crack levels (25% to 100%), the crack depths were 0.25, 0.5, 0.75, and 1 times half of the chordal tooth thickness, respectively, because the tooth will break rapidly when the crack depth is more than half of the chordal tooth thickness. Here, the chordal tooth thickness is the tooth thickness at the pitch line. The crack widths were 0.25, 0.5, 0.75, and 1 times the face width of 25 mm, respectively. The crack angle was 45 degrees. The diagrammatic sketches of the chordal tooth thickness, crack width, and crack angle are plotted in Figures 4(a) and 4(b). The different gear crack levels are shown in Figures 5(a) to 5(d). The vibration signals were measured by two acceleration sensors, which were produced by PCB Electronics with model number 352C67. These two sensors were mounted on the casing of the gearbox in the vertical and horizontal directions, respectively.
In [5], it was reported that the vertical direction is more sensitive for identifying the different crack levels. Therefore, the vibration signals collected from the vertical direction were used in this paper. The sampling frequency was set to 5120 Hz and each sample contained 8192 sampling points. For each working condition, two samples were collected. Consequently, there were 2 samples × 3 loads × 4 speeds × 5 gear health conditions = 120 samples in all. By considering different combinations of motor speeds and loads, three experiments similar to those designed in [24] were used in this paper. Compared with the experiments designed in [24], two more gear crack levels, 75% and 100%, were considered in this paper, which makes the gear crack level identification more difficult. The designed experiments used in this paper are tabulated in Table 2.


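Figure 3: (a), (b) The experimental setup.
Figure 4: (a), (b) Diagrammatic sketches of the chordal tooth thickness, crack width, and crack angle.
Figure 5: (a)–(d) The different gear crack levels.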
In the first experiment, for each gear crack level, 24 samples were collected from the machine under 4 different motor speeds and 3 different loads. Therefore, 24 × 5 = 120 samples were collected in all. Then, half of the samples (60) were used in the training phase. The other samples (60) were used in the testing phase.
In the training phase of the second experiment, for each gear crack level, 12 samples were collected from the machine under 2 different motor speeds and 3 different loads. Therefore, 12 × 5 = 60 samples were collected. Then, in the testing phase, for each gear crack level, another 12 samples were collected from the machine under the other 2 motor speeds and 3 different loads. Therefore, 12 × 5 = 60 samples were collected. This experimental design aims to investigate the influence of different motor speeds on the prediction accuracies of the proposed method.
In the training phase of the third experiment, for each gear crack level, 8 samples were collected from the machine under 4 different motor speeds and 1 load. Therefore, 8 × 5 = 40 samples were collected. Then, in the testing phase, for each gear crack level, 16 samples were collected from the machine under the same 4 motor speeds and the other 2 loads. Therefore, 16 × 5 = 80 samples were collected. This experimental design aims to investigate the influence of different loads on the prediction accuracies of the proposed method.
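The sample counts of the three experiments follow from the working conditions stated in Section 3.1. The short sketch below recomputes them; the constant and function names are hypothetical, and which specific speeds and loads were used for training is not stated in the surviving text, so only the counts are modeled.

```python
SPEEDS_RPM = (1200, 1400, 1600, 1800)
LOADS = ("no load", "half load", "full load")
CRACK_LEVELS = (0, 25, 50, 75, 100)   # percent
REPEATS = 2                            # samples collected per working condition


def sample_count(n_speeds, n_loads):
    """Number of samples over all crack levels for a subset of working conditions."""
    return REPEATS * n_speeds * n_loads * len(CRACK_LEVELS)


total = sample_count(len(SPEEDS_RPM), len(LOADS))

# (training samples, testing samples) for each experiment
splits = {
    "experiment 1": (total // 2, total - total // 2),           # all conditions, split in half
    "experiment 2": (sample_count(2, 3), sample_count(2, 3)),    # two speeds each, all loads
    "experiment 3": (sample_count(4, 1), sample_count(4, 2)),    # all speeds, one load vs. two loads
}

print(total)   # 120
print(splits)
```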
3.2. Comparisons of the Proposed Method with the Four Existing Gear Crack Level Identification Methods Reported in [24]
For experiment 1, support vector machines are trained and tested on the five different gear crack levels under the four different motor speeds and the three different loads. This means that all working conditions are considered in the training of the support vector machines, which makes identification of the different crack levels relatively easy compared with experiments 2 and 3, in which only part of the working conditions are used to train the support vector machines. In vibration analysis, different motor speeds and loads have great influence on the amplitudes and waveforms of vibration signals, and these changes alter the statistical feature values. Therefore, the design of experiments 2 and 3 makes the identification of different gear crack levels more complicated. For the SVM, the Gaussian radial basis function was used and its kernel width was optimized to 0.14. The prediction accuracies of the proposed method and the four existing advanced gear crack level identification methods reported in [24] are tabulated in Table 3, where method 1 is the direct use of nearest neighbor without statistical feature selection, method 2 is nearest neighbor with random statistical feature selection, method 3 is nearest neighbor with Euclidean distance evaluation used for statistical feature selection and without the weighting technique, and method 4 is the weighted nearest neighbor method proposed in [24]. From the results shown in Table 3, it is found that the proposed method has the highest prediction accuracies among all methods, reaching 100%.

The reasons why the proposed method has such high prediction accuracies are explained as follows. First, the statistical features used in the proposed method are highly redundant and their number is as high as 920. The redundant statistical features provide more gear crack fault signatures. Second, the principal components are the new significant statistical features generated from the redundant statistical features to represent different gear crack levels. For experiment 1, the first two principal components and the first three principal components of the training data are plotted in Figures 6(a) and 6(b), respectively. From the result shown in Figure 6(a), it is seen that only the gear crack levels 25% and 100% overlap with each other. From the result shown in Figure 6(b), the five different gear crack levels are separable in a three-dimensional principal component space. The first two principal components and the first three principal components of the testing data are shown in Figures 7(a) and 7(b), respectively, where it is found that the principal components are the new significant statistical features that distinguish the five different gear crack levels. It can be inferred that, as the number of principal components increases, the five different gear crack levels become well separable. For experiments 2 and 3, the first two and three principal components of the training data and testing data are plotted in Figures 8–11, where it is found that the five different gear crack levels are separable in the three-dimensional principal component space. Finally, because the support vector machine uses the kernel trick to map a low-dimensional feature space to a high-dimensional feature space, where the features can possibly be linearly separated, this technique enhances the prediction accuracy for the five different gear crack levels.
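Figure 6: First two (a) and first three (b) principal components of the training data for experiment 1.
Figure 7: First two (a) and first three (b) principal components of the testing data for experiment 1.
Figures 8–11: First two (a) and first three (b) principal components of the training and testing data for experiments 2 and 3.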
4. Conclusions
In this paper, an intelligent gear crack level identification method under different working conditions is proposed. The major idea of the proposed method is to use superhigh-dimensional redundant statistical features to represent five different gear crack levels under different working conditions. The number of redundant features is as high as 920, obtained by applying 10 statistical features to the Scalogram at 45 scales and to its frequency spectra, together with 20 global features of the original signal and its spectrum. Then, to reduce the dimensionality of the redundant statistical features and relieve the curse of dimensionality, principal component analysis is performed on the redundant statistical features to generate new significant statistical features. Finally, support vector machines with a Gaussian radial basis function are used to identify the five different gear crack levels under the four different motor speeds and three different loads. The comparisons with the four existing gear crack level identification methods show that the proposed method has the highest prediction accuracies among the compared methods and that its prediction accuracies reach 100% for the three different experiments. The reasons for such high prediction accuracies of the proposed method are summarized as follows.
First, extraction of high-dimensional redundant statistical features from the continuous wavelet transform at different scales helps mine more gear fault signatures under different working conditions because these features reflect both the global and the local characteristics of the gear crack level data. Second, the new significant statistical features, namely, the principal components, generated from these high-dimensional redundant statistical features are useful for distinguishing different crack levels under different working conditions. From the principal components shown in Figures 6 to 11, it is evident that, as the number of principal components increases from 1 to 3, the five different crack levels become well separable. Finally, the support vector machine uses the kernel trick to map the principal components to a high-dimensional feature space, where the separation of the five different crack levels is more pronounced.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This research work was partly supported by the CityU SRG Project no. 7004085, the RGC CRF no. 8730031, the Research Grants Council of the Hong Kong Special Administrative Region, China (Project no. CityU 122513), Shanghai Aerospace Science and Technology Innovation Fund (no. SAST201311), and the Natural Sciences and Engineering Research Council of Canada (NSERC).
Copyright
Copyright © 2015 Dong Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.