#### Abstract

The vibration signal of a heavy gearbox is nonlinear and nonstationary, which makes gear fault diagnosis difficult. Moreover, the useful fault information is concentrated mainly in the high-frequency components of the raw signal, which complicates fault feature extraction from the vibration signal. For this reason, a novel signal processing method based on variational mode decomposition (VMD) and detrended fluctuation analysis (DFA) is proposed to diagnose gear faults in heavy gearboxes. Since the high-frequency components contain more fault information, the raw vibration signal is decomposed into several mode components by VMD, so that the low-frequency component can be removed and the high-frequency components retained. The most sensitive mode component is then selected from these high-frequency components by a maximal indicator composed of kurtosis and the correlation coefficient. DFA is applied to the most sensitive mode component to obtain the bi-logarithmic map, and a sliding-window algorithm is employed to capture the turning point of the map, thus extracting the small-time-scale fault feature used to identify gear faults. The effectiveness of the proposed method is validated by experimental data analysis, and the comparison results demonstrate that it markedly improves the recognition rate of gear fault conditions relative to small-time-scale DFA (STS-DFA) and EMD-DFA.

#### 1. Introduction

Heavy gearboxes are widely used in the manufacturing, metallurgical, and marine industries because of their strong load-bearing capacity, compact structure, and large transmission ratio. A high-speed gearbox usually works under harsh operating conditions, such as unavoidable impacts and complex alternating loads, which make cracks, pitting, scratches, and spalling of the gears difficult to avoid [1, 2]. Moreover, the gear is a critical component of the gearbox, and gear faults account for a significant proportion of all gearbox faults; they can damage the unit and lead to high maintenance costs. Therefore, research on gear fault diagnosis enables timely and effective fault detection and helps ensure the normal operation of mechanical equipment.

Currently, a number of signal analysis methods are used to diagnose gear faults. Among them, time-frequency analysis methods, which extract information from the vibration signal as a function of both time and frequency, are widely used. Conventional time-frequency analysis includes the short-time Fourier transform (STFT) [3, 4], Wigner–Ville distribution (WVD) [5], wavelet transform (WT) [6], S-transform [7], and so on, but these methods have some inherent restrictions. For instance, the performance of STFT depends on the selection of the window function, and its fixed window makes it unsuitable for analyzing nonstationary signals. WVD suffers from cross-term interference when analyzing composite signals. Although the wavelet transform shows better performance in time-frequency analysis, it is derived from the Fourier transform, which does not accurately describe frequency as a function of time. The S-transform combines the advantages of STFT and WT, and one of its prime advantages is its simpler calculation [7]; nevertheless, it is not locally adaptive either. The complicated structure of a heavy gearbox, its long transmission path, and strong background noise usually make the vibration signal nonlinear and nonstationary, which increases the difficulty of extracting fault features from it. Consequently, these conventional time-frequency methods are not well suited to the nonlinear and nonstationary vibration signal of a gearbox.

To overcome the disadvantages of conventional time-frequency methods, many nonlinear and nonstationary signal analysis methods have been proposed and developed for fault diagnosis. Huang et al. [8] proposed a recursive method termed empirical mode decomposition (EMD), which adaptively decomposes a signal into a finite number of intrinsic mode functions (IMFs) and a residue. However, EMD lacks a rigorous mathematical foundation and is prone to mode mixing [5, 9]. A novel decomposition algorithm named variational mode decomposition (VMD), proposed by Dragomiretskiy and Zosso [10], adapts well to nonlinear and nonstationary signals. VMD decomposes a signal into a series of mode components, and each mode is repeatedly updated by Wiener filtering to minimize a constrained variational model. As a result, each mode is gradually demodulated to baseband around its central frequency, which mitigates mode mixing [11]. Comparative analysis indicates that VMD overcomes the lack of theoretical basis and the noise sensitivity of EMD when analyzing nonlinear and nonstationary signals. Owing to these advantages, VMD has been widely applied to fault diagnosis [11–15]. Long et al. [12] proposed a method combining VMD with WT to reduce the strong background noise in the raw signal while effectively preserving its fault features. A method combining variational mode decomposition and permutation entropy was introduced in [15]; it used VMD to extract the relatively high-frequency mode components of the raw vibration signal, because the fault information in the vibration signal is concentrated mainly in the high-frequency components.

Apart from time-frequency-based approaches, many feature extraction methods based on fractal theory have also been developed and applied in fault diagnosis. The traditional fractal methods, such as rescaled-range analysis (R/S) and fluctuation analysis (FA) [16], were developed as statistical tools to evaluate the scaling exponent. However, they are better suited to stationary time series. More recently, a generalized scale exponent calculation method based on random walk theory, named detrended fluctuation analysis (DFA), was introduced in [17]. DFA is extensively used to detect long-range correlation and power-law properties in nonlinear and nonstationary time series, and it can extract precise intrinsic statistical features from a time series by removing external polynomial trends of different orders [18]. Furthermore, it avoids the spurious detection of correlations that are artifacts of nonstationarity [19]. Thus, DFA has been applied in many fields, such as climate [20], heart rate dynamics [21–23], and mechanical engineering [24–32]. Lin and Chen reported the valuable crossover properties of the scale exponents corresponding to different time scales in the double logarithmic chart [26]. Liu [27] reported that the DFA curve of a rolling bearing vibration signal can be quantified by two scale indices and that the index of the small time scale can be used to identify the type of bearing fault. Wu and Xiao [29] used the sliding-window algorithm to find the turning point of the bi-logarithmic map of DFA adaptively. In [30], a signal enhancement method combining ensemble empirical mode decomposition (EEMD) with DFA was proposed. Although EEMD alleviates some disadvantages of EMD, it is still sensitive to strong background noise. Liu et al. [31] employed VMD and DFA for signal denoising. In their method, the scale exponent extracted by DFA was used to determine the number of VMD mode components, since choosing too many or too few modes (overbinning or underbinning) degrades the filtering; VMD was then employed to decompose the raw signal into the given number of modes. The filtered signal was constructed by VMD and DFA, with noise suppressed on the basis of the Wiener filtering principle and DFA used for parameter optimization. In our research, by contrast, DFA is used to extract fault features rather than to determine the number of VMD mode components; the main purpose of our work is fault characteristic extraction, which differs from that previous research. Wang et al. [32] proposed a method that combines the scale exponent with the intercept of the DFA double logarithmic map and uses the small time scale to classify the fault patterns of gears. We term this method small-time-scale DFA (STS-DFA). STS-DFA shows better fault pattern recognition because the fractal features of the small time scale represent the local fluctuation, that is, the high-frequency component [32]. However, the characteristics of different fault modes partially overlap, because fluctuations corresponding to large time scales may affect the local fluctuations corresponding to small time scales, which reduces the accuracy of fault identification.

Motivated by the previous work, and since the feature vector of the local fluctuation corresponding to the high-frequency components performs better for gear fault classification, a novel method combining VMD with DFA is applied to gear fault diagnosis. VMD is used to extract the high-frequency mode components and eliminate the influence of the fluctuation corresponding to the large time scale. The number of decomposition components is set to 3 in our experiment, which gives better performance according to [10]. DFA is employed to extract the fractal feature vector of the high-frequency mode components, and the feature vectors of the small time scale are then used for gear fault diagnosis. Furthermore, to validate the accuracy of the experimental results, the number of Gaussian mixture components in our experiment is set to 2, which is estimated by maximum a posteriori (MAP) estimation following [33].

The remainder of this paper is organized as follows. The principles of VMD and DFA are described in Section 2. Section 3 details the proposed method combining VMD and DFA. Section 4 describes the experiments and the results obtained by applying the proposed method to signals measured on a gear test rig. Finally, conclusions and future directions are given in Section 5.

#### 2. Theory Descriptions

##### 2.1. Variational Mode Decomposition

VMD builds on three fundamental concepts: Wiener filtering, the one-dimensional Hilbert transform, and frequency mixing with heterodyne demodulation. It aims to decompose the original signal *f* into a number of modes $u_k$, each of which has specific sparsity properties and a finite bandwidth. Furthermore, each mode has a corresponding central frequency $\omega_k$. The bandwidth of each mode is estimated as follows.

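By using the squared $L^2$-norm of the gradient, the bandwidth of each mode can be estimated. The constrained variational problem is then obtained as follows:

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k} u_k = f, \tag{1}$$

where $\{u_k\} := \{u_1, \ldots, u_K\}$ and $\{\omega_k\} := \{\omega_1, \ldots, \omega_K\}$, respectively, represent all the modes and all the central frequencies, *f* is the original signal, *k* indexes the modes ($K$ being the number of modes), *δ* is the Dirac distribution, *t* denotes time, and $*$ denotes the convolution operator.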

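The reconstruction constraint can be handled in several ways. Here, a quadratic penalty term *α* and Lagrange multipliers *λ* are used to render the problem unconstrained, which gives the augmented Lagrangian:

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k} u_k(t) \right\rangle. \tag{2}$$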

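To solve this problem, the saddle point of the augmented Lagrangian is sought with the alternate direction method of multipliers, which iteratively updates $u_k$, $\omega_k$, and *λ*. The specific steps are as follows:

(1) Initialize $\{\hat{u}_k^1\}$, $\{\omega_k^1\}$, $\hat{\lambda}^1$, and $n$.

(2) Minimize with regard to $u_k$ and $\omega_k$, and update *λ*, in the frequency domain:

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}^n(\omega)/2}{1 + 2\alpha(\omega - \omega_k^n)^2}, \tag{3}$$

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}, \tag{4}$$

$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^n(\omega) + \tau \left( \hat{f}(\omega) - \sum_{k} \hat{u}_k^{n+1}(\omega) \right), \tag{5}$$

where the hat denotes the Fourier transform and *τ* is the update step of the multiplier.

(3) Repeat the updates of equations (3)–(5) until they converge, that is, until $\sum_{k} \| \hat{u}_k^{n+1} - \hat{u}_k^n \|_2^2 / \| \hat{u}_k^n \|_2^2 < \varepsilon$, where $\varepsilon$ is a given tolerance.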
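The iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function name `vmd`, its default values, the use of the one-sided FFT (`rfft`) with frequencies in cycles per sample, the uniform initialization of the central frequencies, and the omission of the boundary mirroring used by reference implementations are all choices made here for brevity.

```python
import numpy as np

def vmd(f, K=3, alpha=2000.0, tau=0.0, tol=1e-7, max_iter=500):
    """Simplified VMD sketch (hypothetical helper, no mirror extension).

    Works on the one-sided spectrum; frequencies are in cycles/sample (0..0.5).
    Returns the K real-valued modes and their central frequencies.
    """
    f = np.asarray(f, dtype=float)
    T = f.size
    freqs = np.fft.rfftfreq(T)                 # non-negative frequency axis
    f_hat = np.fft.rfft(f)                     # one-sided spectrum of the signal
    u_hat = np.zeros((K, freqs.size), dtype=complex)
    omega = 0.5 * np.arange(K) / K             # assumed uniform initialization
    lam = np.zeros(freqs.size, dtype=complex)  # Lagrange multiplier spectrum

    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            # Wiener-filter update of mode k on the residual of the other modes
            others = u_hat.sum(axis=0) - u_hat[k]
            u_hat[k] = (f_hat - others + lam / 2) / (1 + 2 * alpha * (freqs - omega[k]) ** 2)
            # Central frequency = centre of gravity of the mode's power spectrum
            power = np.abs(u_hat[k]) ** 2
            omega[k] = np.sum(freqs * power) / (np.sum(power) + 1e-30)
        # Dual ascent on the reconstruction constraint (inactive when tau = 0)
        lam = lam + tau * (f_hat - u_hat.sum(axis=0))
        # Relative change of the modes, used as the stopping criterion
        change = sum(
            np.sum(np.abs(u_hat[k] - u_prev[k]) ** 2) / (np.sum(np.abs(u_prev[k]) ** 2) + 1e-30)
            for k in range(K)
        )
        if change < tol:
            break

    order = np.argsort(omega)                  # sort modes from low to high frequency
    modes = np.fft.irfft(u_hat[order], n=T, axis=-1)
    return modes, omega[order]
```

Calling `vmd(signal, K=3)` on a record returns three modes ordered by central frequency, so the last rows are the high-frequency components that the method retains. The value of `alpha` controls how narrow each mode is and would need tuning for real gearbox data.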

##### 2.2. Detrended Fluctuation Analysis

The detrended fluctuation analysis was proposed by Peng et al. [17]. It is suitable for nonlinear and nonstationary signals and avoids the spurious detection of long-range dependence that is merely an artifact of nonstationarity. DFA consists of the following four steps:

(1) Suppose $x_i$ ($i = 1, \ldots, N$) is a time series of length *N*. Integrate it to obtain the profile $Y(i)$:

$$Y(i) = \sum_{k=1}^{i} \left( x_k - \bar{x} \right), \quad i = 1, \ldots, N,$$

where $\bar{x}$ is the average of the time series $x_i$.

(2) To obtain the corresponding fluctuation function, the profile $Y(i)$ is divided into $N_s = \lfloor N/s \rfloor$ nonoverlapping segments of equal length *s*. Since *N* is generally not an integer multiple of *s*, the division is repeated starting from the opposite end of the series, so that $2N_s$ segments are obtained in total. In each segment, the local trend is obtained by a least-squares fit of order *m*:

$$y_k(i) = \sum_{j=0}^{m} a_j\, i^j, \quad i = 1, \ldots, s,$$

where $y_k(i)$ is the trend of the *k*-th segment, that is, the fitting polynomial in this segment, and $a_j$ is the coefficient of *j*-th order. Linear, quadratic, cubic, or higher-order polynomials can be used in the fitting procedure (usually called DFA1, DFA2, DFA3, etc.).

(3) For each segment length *s*, compute the fluctuation function:

$$F(s) = \sqrt{\frac{1}{2N_s} \sum_{k=1}^{2N_s} F^2(s, k)},$$

where

$$F^2(s, k) = \frac{1}{s} \sum_{i=1}^{s} \left[ Y\big((k-1)s + i\big) - y_k(i) \right]^2, \quad k = 1, \ldots, N_s,$$

$$F^2(s, k) = \frac{1}{s} \sum_{i=1}^{s} \left[ Y\big(N - (k - N_s)s + i\big) - y_k(i) \right]^2, \quad k = N_s + 1, \ldots, 2N_s.$$

Here $y_k(i)$ is the fitted trend of the *k*-th segment of length *s*, and $F(s)$ is the fluctuation function.

(4) Repeat steps 2 and 3 for each segment length *s*. If the time series has long-range power-law correlation, then

$$F(s) \propto s^{\alpha}, \tag{12}$$

where *α* is the general scale exponent. It can be calculated by taking the logarithm of both sides of (12):

$$\log F(s) = \alpha \log s + b, \tag{13}$$

and the scale exponent *α* and the intercept *b* are used as the feature vector of the time series.
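The four DFA steps can be illustrated with a short NumPy sketch. The function name `dfa`, its signature, and the use of base-10 logarithms are assumptions made for illustration; only the procedure itself (profile, two-directional segmentation, polynomial detrending, log-log fit) follows the description above.

```python
import numpy as np

def dfa(x, scales, order=1):
    """DFA sketch (hypothetical helper): returns (alpha, b, F) for the given scales."""
    x = np.asarray(x, dtype=float)
    N = x.size
    profile = np.cumsum(x - x.mean())          # step 1: integrated, mean-removed series
    F = []
    for s in scales:
        n_seg = N // s
        # step 2: segments counted from the start and again from the end (2 * n_seg)
        starts = [k * s for k in range(n_seg)] + [N - (k + 1) * s for k in range(n_seg)]
        i = np.arange(s)
        sq = []
        for st in starts:
            seg = profile[st:st + s]
            trend = np.polyval(np.polyfit(i, seg, order), i)   # local polynomial trend
            sq.append(np.mean((seg - trend) ** 2))              # detrended variance
        F.append(np.sqrt(np.mean(sq)))         # step 3: fluctuation function F(s)
    F = np.array(F)
    # step 4: slope and intercept of the bi-logarithmic map
    alpha, b = np.polyfit(np.log10(scales), np.log10(F), 1)
    return alpha, b, F
```

As a sanity check, white noise passed to this function should give a scale exponent close to 0.5, and its cumulative sum (a random walk) a value close to 1.5, in line with the interpretation of the exponent given in this section.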

The scale exponent *α* characterizes the autocorrelation of the time series, namely its long-range power-law correlation properties. Theoretically, the value of the scale exponent falls into five cases, which represent different correlation properties of the time series. When the scale exponent equals 0.5, 1, or 1.5, the time series is an uncorrelated signal (white noise), 1/*f* noise, or Brownian motion, respectively.

When $0 < \alpha < 0.5$, the time series in this scale interval presents antipersistent long-memory characteristics (negative correlations). When $0.5 < \alpha < 1$, it shows persistent long-memory characteristics in this scale interval (positive correlations).

#### 3. The Proposed Fault Diagnosis Method of Gearbox Based on VMD-DFA

In the proposed method, the raw vibration signal from the gearbox is decomposed by VMD in order to extract the high-frequency mode components corresponding to the local fluctuation, which eliminates the influence of the fluctuation corresponding to the large time scale. The kurtosis and cross-correlation indices of Equations (14)–(16) are used to select the most sensitive mode component from the decomposed modes. DFA is then used to calculate the feature values of the sensitive mode. The flowchart of the VMD-DFA fault diagnosis method is plotted in Figure 1. The specific procedures are as follows:

(1) The raw vibration signal *f* collected by the sensor is decomposed by VMD into *N* mode components $u(i) = \{u(1), u(2), \ldots, u(N)\}$, $i = 1, \ldots, N$.

(2) The index *q* of the most sensitive component in *u*(*i*) is selected by (14):

$$q = \arg\max_{i} \left\{ \frac{E\left[ \left( u(i) - \mu_i \right)^4 \right]}{\sigma_i^4} \right\}, \tag{14}$$

where $\mu_i$ and $\sigma_i$ correspond to the mean and standard deviation of the *i*-th mode component of VMD, respectively. The quantity in braces is the kurtosis of *u*(*i*).

(3) If the maximal kurtosis values of several mode components are about the same, the index *q* of the most sensitive component is instead selected by (15) and (16), as the candidate with the largest cross-correlation coefficient *C*_{i}. Here *C*_{i} is the cross-correlation coefficient between the original signal and the *i*-th candidate mode component (one of those whose kurtosis is about equal to the maximum), computed over the *m* samples of the time series.

(4) The bi-logarithmic map is obtained by applying DFA to the sensitive component *q*. The sliding-window algorithm is then used to capture the turning point of the bi-logarithmic map. The turning point is the position at which the indicator of (17) is smallest; this indicator is formed from the variances of the distances between the points of the two time scales and their corresponding fitted lines.

(5) The region to the left of the turning point is termed the small time scale and the region to the right the large time scale. The characteristic parameters (*α*, *b*) of each of the two time scales are extracted by (13). These values (*α*, *b*) are used to construct the feature vector *P*, where *α* is the slope and *b* is the intercept.
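The selection of the sensitive mode and the search for the turning point can be sketched as follows. Several details here are assumptions made for illustration and are not confirmed by the text: the tie tolerance `rel_tol` that decides when kurtosis values count as "about the same", the use of the Pearson coefficient (`np.corrcoef`) as the cross-correlation measure, vertical residuals as the "distance" to the fitted line, and the sum of the two residual variances as the quantity to minimize. All function names are hypothetical.

```python
import numpy as np

def kurtosis(u):
    """Fourth standardized moment of a mode component."""
    u = np.asarray(u, dtype=float)
    return np.mean((u - u.mean()) ** 4) / u.std() ** 4

def select_sensitive_mode(signal, modes, rel_tol=0.05):
    """Index of the mode with maximal kurtosis; near-ties are broken by
    the (assumed Pearson) correlation with the raw signal."""
    k = np.array([kurtosis(u) for u in modes])
    candidates = np.flatnonzero(k >= (1 - rel_tol) * k.max())
    if candidates.size == 1:
        return int(candidates[0])
    corr = [abs(np.corrcoef(signal, modes[i])[0, 1]) for i in candidates]
    return int(candidates[int(np.argmax(corr))])

def turning_point(log_s, log_F, min_pts=3):
    """Slide a split position along the bi-logarithmic map, fit one line on
    each side, and return the split with the smallest summed residual variance.
    The returned index is the first point of the large time scale."""
    log_s = np.asarray(log_s, dtype=float)
    log_F = np.asarray(log_F, dtype=float)
    best_idx, best_cost = None, np.inf
    for i in range(min_pts, len(log_s) - min_pts + 1):
        cost = 0.0
        for xs, ys in ((log_s[:i], log_F[:i]), (log_s[i:], log_F[i:])):
            slope, intercept = np.polyfit(xs, ys, 1)
            cost += np.var(ys - (slope * xs + intercept))
        if cost < best_cost:
            best_idx, best_cost = i, cost
    return best_idx
```

With the split index `i` in hand, a separate line fit on `log_s[:i]`, `log_F[:i]` would give the small-time-scale pair (*α*, *b*), and a fit on the remaining points the large-time-scale pair.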

#### 4. Experiment

To verify the feasibility and effectiveness of the proposed method in fault diagnosis, vibration signals collected on an experimental test rig are used. The test rig is shown in Figure 2. It is composed of a single-stage gearbox with a pair of spur gears, an electric motor, a magnetic powder brake providing the necessary load, and an IOtech WaveBook/516E 16-bit 1 MHz data acquisition system with Ethernet interface. The 20-tooth pinion is mounted on the input shaft driven by a 0.55 kW DC motor and meshes with a 37-tooth gear, which is loaded by the magnetic powder brake. The vibration signals from the gearbox are picked up by a PCB piezoelectric accelerometer mounted in the vertical direction on the input bearing block. The motor's rotational speed varies randomly in the range of 300–1217 r/min. The dataset includes four fault patterns (normal, scratched, toothless, and circular pitch error), and each pattern has 100 samples. The vibration signals are acquired with a sampling frequency of 8000 Hz and a sampling time of 0.5 s. The vibration acceleration signals of the four states are shown in Figure 3.

**Figure 3:** Vibration acceleration signals of the four gear states, panels (a)–(d).

##### 4.1. Analysis and Comparison of VMD-DFA and STS-DFA

As an example, the vibration acceleration signal of a scratched gear (Figure 3(b)) is decomposed into three components by VMD, as shown in Figure 4. The three mode components are distributed across different frequency bands. The fault information in the gear vibration signal is concentrated in the high-frequency components, which correspond to the local fluctuation. Therefore, the high-frequency components *u*2 and *u*3 are retained, and the sensitive mode component is selected from them by (14)–(16). With the DFA algorithm, the feature parameters (*α*, *b*) of the double time scale are then extracted for gear fault diagnosis.

**Figure 4:** The three mode components obtained by VMD from the vibration signal of the scratched gear, panels (a)–(c).

In our research, the detrending order is one, which has been verified by experiment and performs better according to [32]. The window sizes range from 8 to 512 sample points. Each of the 100 samples of the four fault patterns (toothless, scratched, normal, and circular pitch error) is processed by VMD-DFA to obtain the characteristic values (*α*_{ij}, *b*_{ij}), where *i* denotes the gear fault pattern and *j* is the sample index within that pattern. The mapping of the feature vectors (*α*_{ij}, *b*_{ij}) for the four gear states is shown in Figure 5. Figures 5(a) and 5(b) show the results of the DFA calculations on the small and large time scales, respectively, and Figure 5(c) shows the result of STS-DFA.
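To give a feeling for what the window sizes mean in physical terms, the short calculation below converts them to durations and segment counts using the sampling settings stated for the experiment (8000 Hz, 0.5 s). The choice of power-of-two window sizes between 8 and 512 is an assumption for illustration; the text only gives the range.

```python
fs_hz = 8000          # sampling frequency stated for the experiment
duration_s = 0.5      # sampling time of one record
n_points = int(fs_hz * duration_s)   # samples per record

# Assumed window sizes: powers of two spanning the stated range 8..512
scales = [2 ** p for p in range(3, 10)]

def scale_table(scales, fs_hz, n_points):
    """Return (window size, window duration in ms, whole segments per record)."""
    return [(s, 1000.0 * s / fs_hz, n_points // s) for s in scales]

table = scale_table(scales, fs_hz, n_points)
for s, ms, n_seg in table:
    print(f"s = {s:3d} points  ->  {ms:5.1f} ms,  {n_seg} segments per direction")
```

Under these settings each record holds 4000 points, the smallest window spans 1 ms, and the largest spans 64 ms with only 7 whole segments per direction, so the fluctuation estimate at the largest windows rests on comparatively few segments.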

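**Figure 5:** Feature vectors (*α*, *b*) of the four gear states: (a) proposed method, small time scale; (b) proposed method, large time scale; (c) STS-DFA.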

Figure 5(a) shows that the four states (toothless, scratched, normal, and circular pitch error) are essentially separated by the proposed method on the small time scale, except for a slight overlap between toothless and scratched. Figure 5(b) shows that, on the large time scale, the circular pitch error state overlaps heavily with the normal state, and the scratched and toothless states overlap completely. These results verify that the feature parameters of the small time scale perform better than those of the large time scale for distinguishing the four states. In addition, the scale exponents (slopes) of the four gear states tend to zero, which indicates strongly antipersistent long-memory characteristics in this time scale interval. The small-time-scale result of STS-DFA is shown in Figure 5(c), where the normal, toothless, and scratched states overlap to varying degrees, especially toothless and scratched. The proposed method therefore distinguishes the four gear states better than STS-DFA.

To verify the advantage of VMD-DFA, 320 samples (80 per state) are used for training, and the remaining 80 samples (20 per state) are used for testing. We build the probability distribution model of the feature vector (*α*_{ij}, *b*_{ij}) with a Gaussian mixture model (GMM) and identify the fault pattern of the testing data with a Bayesian maximum likelihood classifier. The GMM can be described as follows:

$$p(x \mid \lambda) = \sum_{k=1}^{M} w_k\, N(x; \mu_k, \Sigma_k),$$

where $\lambda = \{w_k, \mu_k, \Sigma_k\}$ denotes the model parameters, *M* is the number of mixtures, $w_k$ are the mixture weights subject to the constraints $0 \le w_k \le 1$ and $\sum_{k=1}^{M} w_k = 1$, $N(x; \mu_k, \Sigma_k)$ represents the Gaussian probability density function of the *k*-th normal distribution, $\mu_k$ is the mean, and $\Sigma_k$ is the covariance matrix.

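Under the Gaussian mixture model $\lambda_i$ of the *i*-th gear fault condition, the maximum likelihood decision can be expressed as follows:

$$\hat{i} = \arg\max_{i}\; p(Y \mid \lambda_i),$$

where *Y* is the feature vector of the testing sample and $p(Y \mid \lambda_i)$ is the probability of *Y* given the *i*-th gear fault condition described by the *i*-th GMM.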
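The decision rule can be illustrated with a small NumPy sketch that evaluates the log-likelihood of a feature vector under each class GMM and picks the largest. Fitting the mixtures is not shown, and the function names, the dictionary layout of `models`, and the numerical parameters used in the usage example are hypothetical placeholders, not values from the experiment.

```python
import numpy as np

def gaussian_logpdf(y, mean, cov):
    """Log-density of a multivariate normal distribution at y."""
    d = y - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)          # squared Mahalanobis distance
    return -0.5 * (len(y) * np.log(2 * np.pi) + logdet + maha)

def gmm_loglik(y, weights, means, covs):
    """Log-likelihood of y under a mixture: log sum_k w_k N(y; mu_k, Sigma_k)."""
    comps = [np.log(w) + gaussian_logpdf(y, m, c)
             for w, m, c in zip(weights, means, covs)]
    return np.logaddexp.reduce(comps)           # numerically stable log-sum-exp

def classify(y, models):
    """Return the label whose GMM gives the highest likelihood for y."""
    y = np.asarray(y, dtype=float)
    scores = {label: gmm_loglik(y, *params) for label, params in models.items()}
    return max(scores, key=scores.get)

# Hypothetical two-component models for two gear states (illustrative numbers only)
models = {
    "normal": ([0.5, 0.5], np.array([[0.10, -1.0], [0.15, -1.1]]), [np.eye(2) * 0.01] * 2),
    "scratched": ([0.5, 0.5], np.array([[0.40, -2.0], [0.45, -2.1]]), [np.eye(2) * 0.01] * 2),
}
print(classify([0.12, -1.05], models))
```

Working in the log domain avoids underflow when a feature vector lies far from every component, which is why the sketch sums the components with `np.logaddexp` rather than adding raw densities.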

To evaluate the validity of the proposed algorithm, the proposed method and STS-DFA are compared by the resubstitution test, the jackknife test, and the independent dataset test. The resubstitution test reflects the self-consistency of the algorithm, the jackknife test is a cross-validation method that reflects its generalization ability, and the independent dataset test verifies its performance in actual application. The experimental results are listed in Tables 1–3. They show that the recognition rate of VMD-DFA is 95% or higher in all three tests. In the resubstitution test (Table 1), the total recognition rate of VMD-DFA reaches 98.44%, which is 6% higher than that of STS-DFA and indicates higher self-consistency. In the independent dataset test (Table 2), the recognition rates of the scratched and toothless states are only 70% and 75%, respectively, with STS-DFA, whereas they reach 100% and 90%, respectively, with the proposed method. These results verify that VMD-DFA is more accurate, with the overall recognition rate increased by 12.5%. In the jackknife test (Table 3), the overall recognition rate of VMD-DFA is 7.5% higher than that of STS-DFA, which indicates that it better reveals the characteristic vibration of the original signal. The results of the three tests lead to the conclusion that the proposed method is superior to STS-DFA and has a higher recognition rate for the fault signal patterns.
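The difference between the resubstitution and jackknife tests can be made concrete with a short sketch. To keep it self-contained, a nearest-centroid classifier stands in for the GMM classifier of the paper; that substitution, the function names, and the synthetic data in the usage example are assumptions for illustration only.

```python
import numpy as np

def fit_centroids(X, y):
    """Stand-in classifier: one centroid per class (not the paper's GMM)."""
    labels = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in labels])
    return labels, centroids

def predict(model, x):
    """Assign x to the class with the nearest centroid."""
    labels, centroids = model
    return labels[np.argmin(np.linalg.norm(centroids - x, axis=1))]

def resubstitution_rate(X, y):
    """Train on all samples, then test on the same samples."""
    model = fit_centroids(X, y)
    return np.mean([predict(model, x) == t for x, t in zip(X, y)])

def jackknife_rate(X, y):
    """Leave-one-out: each sample is tested on a model trained without it."""
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = fit_centroids(X[keep], y[keep])
        hits += int(predict(model, X[i]) == y[i])
    return hits / len(y)

# Synthetic, well-separated feature vectors for two classes (illustrative only)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
print(resubstitution_rate(X, y), jackknife_rate(X, y))
```

The independent dataset test differs from both in that the model is trained once on the training samples and then evaluated on samples it has never seen. As a side note on the reported figures, a resubstitution rate of 98.44% over 320 training samples would correspond to 315 correct classifications (315/320 = 0.984375).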

##### 4.2. Validation Analysis of VMD-DFA and EMD-DFA

To further verify the effectiveness of the proposed method, EMD is used instead of VMD to decompose the raw vibration signal. After the same steps 2–5 of Section 3 are applied, the modes decomposed by EMD and the small-time-scale EMD-DFA results are shown in Figures 6 and 7, respectively. Figure 7 shows that, with EMD-DFA, the normal, scratched, and toothless states overlap to different degrees. Comparing STS-DFA and EMD-DFA with the proposed method, the experimental results show that the proposed algorithm has a better ability to identify fault features.

**Figures 6 and 7:** Mode components obtained by EMD from the raw vibration signal and the small-time-scale EMD-DFA feature map, panels (a)–(j).

#### 5. Conclusion

Because the vibration signal of a heavy gearbox is nonlinear and nonstationary, a novel gearbox fault diagnosis method, VMD-DFA, is proposed in this paper. The main research work is summarized as follows:

(1) VMD is used to extract the high-frequency mode components, which eliminates the influence of the fluctuation corresponding to the large time scale and retains the main fault information of the vibration signal. DFA is used to extract the fractal feature vector of the local fluctuation of the signal.

(2) Gear signals measured on the gear experiment system were used to verify the effectiveness of the proposed method. Using the Gaussian mixture model (GMM) with the Bayesian maximum likelihood classifier, three test methods were employed for comparative analysis. The experimental results demonstrate that the proposed method has obvious advantages over STS-DFA and EMD-DFA in extracting gearbox fault characteristics.

(3) However, the parameters of VMD, such as the number of modes *u*, the bandwidth control parameter *σ*, and the decay coefficient τ, have a certain effect on the extraction of the fault characteristic frequency as well as on the denoising effect. Thus, future research should focus on optimizing the relevant VMD parameters.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

#### Authors’ Contributions

Han Xiao conceived the idea, organized the paper, and analyzed the data. Sinian Hu contributed to algorithm design and data handling. All the authors revised the paper for intellectual content.

#### Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant nos. 51475339 and 51605344, the Natural Science Foundation of Hubei Province under Grant no. 2016CFA042, and the Applied Basic Research Programs of Wuhan Science and Technology Bureau under Grant no. 2017010201010115.