#### Abstract

This paper presents a method that combines the Shuffled Frog Leaping Algorithm (SFLA) with the Support Vector Machine (SVM) to identify the fault types of rolling bearings in a gearbox. The collected vibration signals are first processed by wavelet threshold denoising, which improves the accuracy of fault identification. The global optimization capability and high computational efficiency of SFLA are applied to the SVM model. Simulation results show that the SFLA-SVM algorithm is effective in fault diagnosis. Compared with the SVM and Particle Swarm Optimization SVM (PSO-SVM) algorithms, the SFLA-SVM algorithm offers better global optimization, higher accuracy, and more reliable diagnosis. Its accuracy is further improved by integrating the wavelet threshold denoising method.

#### 1. Introduction

The rolling bearing is an important rotating mechanical part widely used in aerospace, metallurgy, and the automotive, manufacturing, and chemical industries. Its working condition affects the entire equipment as well as the whole production process. Its failure can cause economic losses and possibly personal injury [1]. For example, a major derailment of train 1479 of the Lanzhou Railway Branch on November 30, 1991, was caused by a poor-quality bearing and a broken cage [2]. In June 1992, a 600 MW supercritical unit of the Hainan Power Plant of Japan's Kansai Electric Power Company suffered strong vibration during an overspeed test because of a bearing failure and a drop in the critical speed. The accident not only damaged the unit but also resulted in an economic loss of up to 5 billion yen [3]. Therefore, it is very important to monitor and diagnose rolling bearings in operation.

Around 60% of mechanical equipment failures are caused by the gearbox, and bearing failures account for about 19% of these [4]. Many methods, such as PSO, the Genetic Algorithm, and the Ant Colony Algorithm, have been developed for bearing fault analysis. In this paper, a combined SFLA-SVM method is proposed for the first time to diagnose gearbox faults by determining the failure type of the rolling bearing in the gearbox. The performance of the new method is then compared with that of the SVM and PSO-SVM methods.

#### 2. Methods

##### 2.1. Shuffled Frog Leaping Algorithm (SFLA)

SFLA is a swarm intelligence optimization algorithm proposed by Eusuff and Lansey in 2001 and refined in 2003 and 2006 [5]. The algorithm is inspired by foraging behavior in nature: it combines local search with information sharing across the population, so that the global optimization has both random and deterministic elements. SFLA combines the advantages of the Memetic Algorithm (MA) and PSO, both of which have local search capabilities. By repeatedly dividing the population into memeplexes (subpopulations), evolving them, and mixing them again, it achieves global information sharing and can find the global optimal solution rather than a local optimum. The advantageous characteristics of SFLA include ease of setup, high precision, fast convergence, and global optimization capability [6].

The SFLA model is as follows:

(1) Initialization of the SFLA parameters: based on the experimental data, set the total number of frogs $F$, the number of memeplexes (subpopulations) $m$, the number of evolution generations within each memeplex $N_{\mathrm{loc}}$, the number of mixing iterations of the whole algorithm $N_{\mathrm{glob}}$, and the maximum distance $D_{\max}$ that an individual frog can move in one step.

(2) Calculation of the fitness value: assume the initial frog population is

$$P = \{X_1, X_2, \ldots, X_F\}, \tag{1}$$

where $X_i$ is an individual frog. The fitness value $f(X_i)$ of every frog is calculated first, and the frogs are then sorted by fitness from best to worst and stored in that order as $P$. Finally, the best frog in the whole population is recorded as $X_g$.

(3) Division of the population: the frog population is divided into $m$ memeplexes $M_1, M_2, \ldots, M_m$ of $n$ frogs each, so that $F = m \times n$, using the following equation, and the best frog $X_b$ and the worst frog $X_w$ in each memeplex are recorded:

$$M_k = \{X_{k+m(j-1)} \in P \mid 1 \le j \le n\}, \quad 1 \le k \le m. \tag{2}$$

(4) Local search: in each memeplex, the worst frog $X_w$ is updated. The frog jumping step is decided by

$$D = r\,(X_b - X_w), \quad -D_{\max} \le D \le D_{\max}, \tag{3}$$

where $r$ is a random number in $[0, 1]$, and the new position is

$$X_w' = X_w + D. \tag{4}$$

Equation (4) is used to update $X_w$, and the fitness value of the updated frog is calculated. If the updated frog is better than the original frog, it replaces the original frog. Otherwise, $X_b$ is replaced with $X_g$ and Equations (3) and (4) are applied again. If the optimization still fails, a new frog is generated randomly to replace the original $X_w$. This process is repeated $N_{\mathrm{loc}}$ times in each memeplex to obtain the evolved memeplexes.

(5) Mixing of populations: the evolved memeplexes are mixed again to form a new population $P$, and the global best frog $X_g$ is updated and recorded. Then the frogs in $P$ are grouped once again.

(6) If the number of mixing iterations has not yet reached $N_{\mathrm{glob}}$, go back to step (4); otherwise, output the best frog.
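
The six steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names (`sfla_minimize`, `leap`), the default parameter values, the box-shaped search range, and the convention that a lower fitness is better are all hypothetical choices made for the example, and a simple sphere function stands in for the real fitness.

```python
import random


def leap(worst, leader, d_max, lower, upper, rng):
    """Move `worst` toward `leader` by a random fraction, clipped to d_max."""
    r = rng.random()
    moved = []
    for w, b in zip(worst, leader):
        step = max(-d_max, min(d_max, r * (b - w)))
        moved.append(max(lower, min(upper, w + step)))
    return moved


def sfla_minimize(fitness, dim, lower, upper, n_frogs=30, n_memeplexes=5,
                  local_iters=10, global_iters=20, d_max=1.0, seed=0):
    """Minimise `fitness` over the box [lower, upper]^dim (lower is better)."""
    rng = random.Random(seed)

    def random_frog():
        return [rng.uniform(lower, upper) for _ in range(dim)]

    frogs = [random_frog() for _ in range(n_frogs)]
    for _ in range(global_iters):
        # Rank the whole population, best frog first.
        frogs.sort(key=fitness)
        global_best = frogs[0]
        # Deal the ranked frogs into memeplexes: k, k+m, k+2m, ...
        memeplexes = [frogs[k::n_memeplexes] for k in range(n_memeplexes)]
        for plex in memeplexes:
            for _ in range(local_iters):
                plex.sort(key=fitness)
                worst = plex[-1]
                # Try the memeplex best first, then the global best.
                for leader in (plex[0], global_best):
                    candidate = leap(worst, leader, d_max, lower, upper, rng)
                    if fitness(candidate) < fitness(worst):
                        break
                else:
                    # Neither leap helped: replace with a random frog.
                    candidate = random_frog()
                plex[-1] = candidate
        # Mix the evolved memeplexes back into one population.
        frogs = [frog for plex in memeplexes for frog in plex]
    return min(frogs, key=fitness)


def sphere(x):
    """Toy fitness used only for this illustration."""
    return sum(v * v for v in x)


best = sfla_minimize(sphere, dim=2, lower=-5.0, upper=5.0)
print(best, sphere(best))
```

Because only the worst frog of a memeplex is ever replaced, the best frog found so far is never lost, so the returned solution is at least as good as the best frog of the initial population.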

##### 2.2. Support Vector Machine

In 1995, Vapnik proposed the Support Vector Machine (SVM), a machine learning method based on statistical learning theory and structural risk minimization [7]. Its core idea is to map a linearly inseparable problem into a higher-dimensional feature space through a kernel function, and then to find the optimal classification surface in that space by solving a convex quadratic programming problem [8]. It avoids the overfitting and local minimum problems and also has good generalization ability.

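The SVM algorithm model is as follows:

(1) Give the training samples $x_i$ and the matching class labels $y_i \in \{+1, -1\}$, $i = 1, \ldots, l$.

(2) Select a proper kernel function $K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)$ and the related parameters.

(3) Maximize $W(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$ under the constraints $\sum_{i=1}^{l} \alpha_i y_i = 0$ and $0 \le \alpha_i \le C$ to obtain the best value $\alpha^*$.

(4) Calculate $w^* = \sum_{i=1}^{l} \alpha_i^* y_i \varphi(x_i)$ and the best bias $b^*$. The best decision plane is then $w^* \cdot \varphi(x) + b^* = 0$.

(5) Classify the vector $x$ by calculating $f(x) = \operatorname{sgn}\left(\sum_{i=1}^{l} \alpha_i^* y_i K(x_i, x) + b^*\right)$, which is +1 or −1, to decide the class of $x$.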
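
Steps (4) and (5) can be illustrated with a short Python sketch. It assumes a Gaussian kernel of the form $\exp(-g\lVert x - z\rVert^2)$ and takes the multipliers as already known; the function names and the two-sample toy problem are hypothetical and serve only to show how the bias and the decision value follow from the multipliers.

```python
import math


def gaussian_kernel(x, z, g):
    """K(x, z) = exp(-g * ||x - z||^2)."""
    return math.exp(-g * sum((a - b) ** 2 for a, b in zip(x, z)))


def decision_value(x, alpha, X, y, b, g):
    """sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(a * yi * gaussian_kernel(xi, x, g)
               for a, xi, yi in zip(alpha, X, y)) + b


def best_bias(alpha, X, y, C, g, tol=1e-9):
    """Average y_i - sum_j alpha_j y_j K(x_j, x_i) over free support vectors."""
    free = [i for i, a in enumerate(alpha) if tol < a < C - tol]
    if not free:
        raise ValueError("no support vector with 0 < alpha_i < C")
    return sum(y[i] - decision_value(X[i], alpha, X, y, 0.0, g)
               for i in free) / len(free)


def classify(x, alpha, X, y, b, g):
    """Return +1 or -1 according to the sign of the decision value."""
    return 1 if decision_value(x, alpha, X, y, b, g) >= 0 else -1


# Toy problem: one training point per class. With alpha_1 = alpha_2 = a the
# dual objective is 2a - a^2 (1 - K(x_1, x_2)), maximised at a = 1 / (1 - K).
X = [[0.0], [1.0]]
y = [-1, 1]
g, C = 1.0, 10.0
a_star = 1.0 / (1.0 - gaussian_kernel(X[0], X[1], g))
alpha = [a_star, a_star]
b = best_bias(alpha, X, y, C, g)
print(classify([0.1], alpha, X, y, b, g), classify([0.9], alpha, X, y, b, g))
```

In this symmetric toy case the bias is zero, and a test point is assigned to the class of the nearer training sample.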

##### 2.3. SFLA-SVM Model

Fu found that two parameters mainly affect the learning ability of the SVM: the penalty parameter $C$ and the Gaussian kernel coefficient $g$ [9]. These two parameters directly affect the SVM's classification accuracy and generalization ability. If $C$ is too large, the training accuracy is high while the generalization ability is poor. If $C$ is too small, the training accuracy is poor. When $g$ is too large, the classification accuracy of the SVM is reduced. When $g$ is too small, the reasoning ability of the SVM is worsened. Therefore, proper parameters enable the model to have better generalization ability and classification accuracy.

Although there is no unified method for deciding the best values of the penalty parameter $C$ and the kernel coefficient $g$, grid search and cross-validation are normally used [10]. In this paper, the SFLA-SVM model is proposed, whose process is outlined below:

(1) Initialize the frog population by randomly forming $F$ vectors $\alpha$ of Lagrange multipliers. Each component $\alpha_i$ is a random number in $[0, C]$. The total number of iterations of the frog population is $N_{\mathrm{glob}}$, the number of subpopulations is $m$, and the number of subpopulation iterations is $N_{\mathrm{loc}}$.

(2) Calculate the fitness value of each frog. If the constraint $\sum_i \alpha_i y_i = 0$, where $y_i \in \{+1, -1\}$ is the class label of training sample $i$, is not satisfied, set the fitness value of the frog to positive infinity. Otherwise, keep the fitness value. Then divide the population into subpopulations.

(3) Perform iterative optimization on each subpopulation, then mix all the subpopulations to form a new population and return to step (2). Repeat steps (2) and (3) until the total number of population iterations is reached, and return the best frog $\alpha^*$.

(4) Calculate the best bias $b^*$ from $\alpha^*$.

(5) Compute the decision function $f(x)$ and use it to decide the class of the vector $x$.
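
The fitness evaluation in step (2) might look like the following sketch. It is an illustration only: it assumes that a frog is a vector of Lagrange multipliers, that the fitness to be minimized is the negated SVM dual objective with a Gaussian kernel, and that the equality constraint is checked within a small tolerance `tol` (an exact zero is practically never reached by random vectors). The names `frog_fitness` and `rbf` are hypothetical.

```python
import math


def rbf(x, z, g):
    """Gaussian kernel exp(-g * ||x - z||^2)."""
    return math.exp(-g * sum((a - b) ** 2 for a, b in zip(x, z)))


def frog_fitness(alpha, X, y, C, g, tol=1e-6):
    """Negated dual objective; +inf when a constraint is violated."""
    # Box constraint 0 <= alpha_i <= C.
    if any(a < 0.0 or a > C for a in alpha):
        return math.inf
    # Equality constraint sum_i alpha_i * y_i = 0, checked up to `tol`.
    if abs(sum(a * yi for a, yi in zip(alpha, y))) > tol:
        return math.inf
    n = len(alpha)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * rbf(X[i], X[j], g)
               for i in range(n) for j in range(n))
    # Minimising 0.5 * quad - sum(alpha) maximises the dual objective.
    return 0.5 * quad - sum(alpha)


X = [[0.0], [1.0]]
y = [-1, 1]
print(frog_fitness([1.0, 1.0], X, y, C=10.0, g=1.0))   # feasible frog
print(frog_fitness([1.0, 2.0], X, y, C=10.0, g=1.0))   # violates the constraint
```

Assigning an infinite fitness to infeasible frogs means that they always rank last and are the first to be replaced during the local search.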

##### 2.4. Wavelet Threshold Denoising

A significant problem in wavelet threshold denoising is how to choose the threshold. If the chosen threshold is too small, the noise will largely remain in the signal. However, if the threshold is too large, it will remove useful and important characteristic information from the signal resulting in deviation. Therefore, the threshold will directly affect the denoising effect [11].

Another problem in wavelet threshold denoising is how to choose the threshold function. Wavelet threshold denoising includes hard threshold denoising, soft threshold denoising, and default threshold denoising [12]. The hard and soft threshold functions are the two most commonly used.

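The expression for the hard threshold function is

$$\hat{w}_{j,k} = \begin{cases} w_{j,k}, & |w_{j,k}| \ge \lambda, \\ 0, & |w_{j,k}| < \lambda. \end{cases} \tag{5}$$

The expression for the soft threshold function is

$$\hat{w}_{j,k} = \begin{cases} \operatorname{sgn}(w_{j,k})\,(|w_{j,k}| - \lambda), & |w_{j,k}| \ge \lambda, \\ 0, & |w_{j,k}| < \lambda. \end{cases} \tag{6}$$

In Equations (5) and (6), $w_{j,k}$ is a wavelet coefficient, $\hat{w}_{j,k}$ is the denoised wavelet coefficient, and $\lambda$ is the threshold value, given by

$$\lambda = \sigma \sqrt{2 \ln N}, \tag{7}$$

where $\sigma$ is the standard deviation of the noise and $N$ is the length of the signal.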
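
As a small numerical illustration, the hard and soft threshold functions and the threshold $\sigma\sqrt{2\ln N}$ can be written in Python as below. The function names are hypothetical. For a signal of 1024 points and unit noise standard deviation, the threshold evaluates to about 3.72.

```python
import math


def universal_threshold(sigma, n):
    """Threshold sigma * sqrt(2 ln n) for noise level sigma and n samples."""
    return sigma * math.sqrt(2.0 * math.log(n))


def hard_threshold(w, lam):
    """Keep the coefficient unchanged if |w| >= lam, otherwise set it to 0."""
    return w if abs(w) >= lam else 0.0


def soft_threshold(w, lam):
    """Shrink the coefficient toward 0 by lam; set it to 0 if |w| < lam."""
    if abs(w) < lam:
        return 0.0
    return math.copysign(abs(w) - lam, w)


lam = universal_threshold(1.0, 1024)
for w in (-5.0, -1.0, 2.0, 4.0):
    print(w, hard_threshold(w, lam), soft_threshold(w, lam))
```

The printout shows the difference between the two rules: coefficients above the threshold pass through the hard rule unchanged, while the soft rule reduces their magnitude by the threshold.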

After wavelet decomposition of the noisy signal, the wavelet coefficients of the signal have larger amplitudes than those of the noise. The coefficients to keep can therefore be selected by setting a threshold.

The basic steps of wavelet threshold denoising using (6) are as follows:

(1) Use the wavelet transform to decompose the noisy signal and obtain a set of wavelet coefficients $w_{j,k}$.

(2) Threshold the wavelet coefficients $w_{j,k}$ to obtain the estimated wavelet coefficients $\hat{w}_{j,k}$, so that the deviation of $\hat{w}_{j,k}$ from the wavelet coefficients of the noise-free signal is minimum.

(3) Use the inverse wavelet transform to reconstruct the signal from $\hat{w}_{j,k}$ and obtain the estimated signal, which is the signal after denoising.
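
The three steps can be sketched with a Haar wavelet, chosen here only because it fits in a few lines; the wavelet used in the experiments is not specified, so this choice, the function names, and the toy signal are assumptions. The signal length must be divisible by $2^{\text{levels}}$.

```python
import math

S = 1.0 / math.sqrt(2.0)


def haar_forward(signal):
    """One Haar level: return (approximation, detail) coefficients."""
    approx = [(signal[i] + signal[i + 1]) * S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert one Haar level."""
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) * S, (a - d) * S))
    return out


def soft(w, lam):
    """Soft threshold: shrink |w| by lam, never below zero."""
    return math.copysign(max(abs(w) - lam, 0.0), w)


def denoise(signal, levels, lam):
    # Step (1): decompose the noisy signal into wavelet coefficients.
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_forward(approx)
        details.append(d)
    # Step (2): threshold the detail coefficients.
    details = [[soft(w, lam) for w in d] for d in details]
    # Step (3): reconstruct the signal with the inverse transform.
    for d in reversed(details):
        approx = haar_inverse(approx, d)
    return approx


noisy = [1.1, 0.9] * 4          # constant level 1.0 plus alternating noise
print(denoise(noisy, levels=1, lam=0.2))
```

In this toy case the noise appears only in the detail coefficients, which fall below the threshold, so the reconstruction returns the constant level.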

There is a constant deviation between the wavelet coefficient obtained by the soft threshold function and that of the original signal, while the hard threshold function is not continuous at the threshold point. These defects degrade the denoising result. Therefore, to overcome the shortcomings of the traditional soft and hard thresholds, it is necessary to improve the selection of the threshold.

The improved threshold $\lambda_j$ depends on the resolution scale $j$: as the scale increases, the threshold decreases. Compared with the original method, the new threshold adapts better to separating noise at all levels.

The improved threshold function computes the denoised wavelet coefficient $\hat{w}_{j,k}$ from the wavelet coefficient $w_{j,k}$, the threshold $\lambda$, and an adjustment parameter. It has the advantages of both the soft and hard threshold functions.

#### 3. Application of SFLA-SVM Algorithm in Gearbox Bearing Fault Diagnosis

##### 3.1. Experimental Platform

Figure 1 shows the experimental platform, gearbox equipment of Beijing Capital Airlines. It consists of a frequency controller, motors, brakes, gearboxes, and other parts. The gearbox equipment performs stably and can withstand a certain load impact. There is enough space for gear replacement and installation. It can simulate various types of fault conditions for gearbox analysis, noise characteristic analysis, vibration characteristic analysis, health/condition monitoring, and fault diagnosis.

##### 3.2. Sensors’ Setting

Due to restrictions in this experiment, the sensors cannot be set inside the gearbox. Thus, six sensors are mounted on the gearbox test stand, which is shown in Figure 2.

##### 3.3. Signal Acquisition

The experimental gearbox vibration signal is collected by the HG8916, a comprehensive data collection and fault diagnosis device. The experiment uses 6 vibration channels and 1 speed channel to collect signals simultaneously. In the time-domain signal acquisition module, the maximum number of sampling points is 32768 and the maximum analysis frequency is 50 kHz. Once the signal is collected, it is saved to a data file, which is exported to Matlab for further analysis and processing.

Throughout the experiment, 1000 sets of data are measured, covering the rolling bearing under normal working, rolling wear, inner ring wear, and outer ring wear conditions. The first 920 sets are used as training data and the remaining 80 sets as test data. The signal sampling frequency is 2000 Hz, and the number of collection points is 1024.

The following figures show some of the collected data. Figures 3–6 show the vibration signals in the normal running, rolling wear, inner ring wear, and outer ring wear states, respectively.

##### 3.4. Selection of the Characteristic Value of the Fault

The eigenvalues of bearing faults are closely correlated with the accuracy of diagnosis. Various methods have been proposed to extract the eigenvalue information. Eigenvalue indicators can be classified into dimensional indicators, such as square root amplitude and variance, and dimensionless indicators, such as the waveform indicator, margin indicator, kurtosis indicator, and skewness indicator. The formulas for calculating the eigenvalue indicators are shown in Table 1.

The table shows that some parameters are not independent. For example, the variance indicator, skewness indicator, and kurtosis indicator are related. The margin indicator and square root amplitude are related as well. Due to the nonlinear and nonstationary nature of the vibration signal, the correlation of these features does not exhibit a linear relationship and there is no collinear relationship between these data.
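
For orientation, the indicators named above can be computed as in the following sketch. The definitions used here are common textbook forms and are an assumption; they may differ in detail from the formulas of Table 1. The function name and dictionary keys are hypothetical, and the input is assumed to be a non-constant signal.

```python
import math


def time_domain_indicators(x):
    """Common time-domain indicators of a vibration record (assumed forms)."""
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    peak = max(abs(v) for v in x)
    # Square root amplitude: squared mean of sqrt(|x_i|).
    sqrt_amp = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2
    variance = sum((v - mean) ** 2 for v in x) / n
    std = math.sqrt(variance)
    return {
        "square_root_amplitude": sqrt_amp,            # dimensional
        "variance": variance,                          # dimensional
        "waveform": rms / abs_mean,                    # dimensionless
        "margin": peak / sqrt_amp,                     # dimensionless
        "kurtosis": sum((v - mean) ** 4 for v in x) / n / std ** 4,
        "skewness": sum((v - mean) ** 3 for v in x) / n / std ** 3,
    }


print(time_domain_indicators([1.0, -1.0, 1.0, -1.0]))
```

The sketch also makes the dependence noted above visible: the margin indicator is computed from the square root amplitude, and the skewness and kurtosis indicators both use the variance.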

The dimensionless eigenvalue indicators are calculated from the collected data and then normalized to form a unified basis for determining the fault type. Table 2 shows the eigenvalue indicators calculated from the training data, and Table 3 those from the test data. The normalization formula used here is as follows [13]:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}},$$

where $x'$ is the normalized eigenvalue, $x$ is the eigenvalue, and $x_{\max}$ and $x_{\min}$ are the maximum and the minimum of $x$, respectively.
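
A minimal sketch of such a normalization, assuming the usual min-max form that maps each indicator to the range [0, 1], is given below. The function name is hypothetical, and returning zeros for a constant column is an assumption made to avoid division by zero.

```python
def min_max_normalize(values):
    """Map values linearly so that the minimum becomes 0 and the maximum 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: no spread to normalize (assumed convention).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


print(min_max_normalize([2.0, 4.0, 6.0]))
```

In practice the minimum and maximum would be taken from the training data and reused for the test data, so that both sets share the same scale.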

##### 3.5. Signal Denoising

To reduce the influence of noise on the vibration signal, the vibration signal under the different conditions is denoised by the wavelet threshold method before fault identification. First, a simulation signal and a noise signal are constructed and synthesized. The synthesized signal is then denoised with the wavelet adaptive threshold and with the given threshold, respectively. The results are shown in Figure 7: with the given threshold, the noise part of the signal is removed and the remaining signal is evenly distributed, whereas the result of adaptive threshold denoising is not ideal and leaves noise unevenly distributed in the signal. Therefore, the acquired signal should be denoised with the given threshold. The threshold can be applied in three ways: hard threshold, soft threshold, and default threshold denoising. The denoising results for the different conditions are shown in Figures 8–11.

A comparison of the three denoising methods indicates that soft-threshold denoising with the given threshold best preserves the characteristic information of the vibration signal. Therefore, the signals under the different working conditions are denoised with the given soft threshold and then passed to the SFLA-SVM algorithm for fault identification.

#### 4. Results and Discussion

With the denoised data signals under four working conditions, the fault diagnosis is simulated by SFLA-SVM algorithm, as outlined below.

Firstly, the SFLA parameters are set as follows:

(1) $F$ is the total frog population, which is the primary parameter of the algorithm. It is related to the complexity and dimension of the problem to be solved. Based on the preliminary experiments in this study, $F$ is set to 100.

(2) $m$ is the number of subpopulations. If $m$ is too small, the optimal information in the subpopulation cannot be completely shared in the local range. If $m$ is too large, the evolution process becomes complicated and tends to fall into a local optimum. The number of subpopulations follows the relationship $F = m \times n$, where $n$ represents the number of frogs contained in each subpopulation; $m$ and $n$ are chosen in this experiment to satisfy this relationship.

(3) $N_{\mathrm{loc}}$ is the number of subpopulation iterations. A large value causes the "frog" to change its position frequently and ignore the information exchange between individuals, while a small value may cause the subpopulation to fail to find an optimal solution and fall into a local optimum. $N_{\mathrm{loc}}$ is set to 10 in this experiment.

(4) $N_{\mathrm{glob}}$ is the total number of iterations. If $N_{\mathrm{glob}}$ is too small, the "frog" ignores the information exchange between individuals. If $N_{\mathrm{glob}}$ is too large, the computational workload increases and the search may end in a local optimum. $N_{\mathrm{glob}}$ is set to 20 in this experiment.

Secondly, the SVM parameters are set as follows:

The SVM has two main parameters, namely, the penalty parameter $C$ and the kernel function coefficient $g$ of the Gaussian kernel function. Both parameters are given initial values before the optimization.

SFLA then optimizes $C$ and $g$. The simulation results of the fault diagnosis by the SFLA-SVM algorithm are shown in Figure 12.

To highlight the advantages of SFLA-SVM in the fault diagnosis of rolling bearings, it is compared with the SVM and PSO-SVM algorithms; the dataset and the SVM initialization parameters are the same for the three algorithms. For PSO-SVM, the penalty parameter and the kernel function coefficient are optimized by PSO. The diagnostic results of the SVM and PSO-SVM algorithms are shown in Figures 13 and 14, respectively.

A more comprehensive comparison of the diagnostic results and running times of SVM, PSO-SVM, and SFLA-SVM is shown in Table 4.

Figures 12–14 indicate that the SFLA-SVM algorithm has the highest accuracy, with only 3 misclassified results. In comparison, the SVM and PSO-SVM algorithms have 13 and 9 misclassifications, respectively. The comparative analysis in Table 4 further indicates that optimizing the SVM with the PSO and SFLA algorithms improves the accuracy of the diagnosis, although the running time increases by 27.46 s and 55.30 s, respectively. The SFLA-SVM algorithm thus significantly improves the accuracy of SVM recognition, at the cost of the longest running time of the three algorithms: about 72% longer than SVM and about 27% longer than PSO-SVM.

The vibration signals under the different working conditions and the corresponding signals denoised with the improved threshold are analyzed in the frequency domain. Their Fourier spectra and power spectral densities are shown in Figures 15–18.
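
A frequency-domain analysis of this kind can be sketched as follows. The example computes a one-sided periodogram from a direct discrete Fourier transform; the periodogram estimator, the function names, and the 100 Hz test tone are assumptions for illustration, while the 2000 Hz sampling frequency is the one used in the experiment. The direct transform costs $O(n^2)$ operations, so a fast Fourier transform would be preferable for long records.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def periodogram(x, fs):
    """One-sided power spectral density estimate |X_k|^2 / (fs * n)."""
    n = len(x)
    spectrum = dft(x)
    half = n // 2 + 1
    freqs = [k * fs / n for k in range(half)]
    psd = [abs(spectrum[k]) ** 2 / (fs * n) for k in range(half)]
    # Double every bin except DC (and Nyquist when n is even).
    last = half - 1 if n % 2 == 0 else half
    for k in range(1, last):
        psd[k] *= 2.0
    return freqs, psd


fs, n = 2000.0, 200
tone = [math.sin(2.0 * math.pi * 100.0 * t / fs) for t in range(n)]
freqs, psd = periodogram(tone, fs)
peak_freq = freqs[max(range(len(psd)), key=psd.__getitem__)]
print(peak_freq)
```

Integrating the density over frequency recovers the mean power of the signal, which is 0.5 for a unit-amplitude sine wave.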

The comparative analysis of these figures indicates that the denoising with the improved threshold preserves the most characteristic information of the signal and generates very little distortion. The fault identification results are shown in Figure 19.

As shown in Figure 19, the SFLA-SVM algorithm produces only two misclassifications after the wavelet threshold denoising, which means that its accuracy has been improved to 97.5%.
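
The reported accuracies follow directly from the misclassification counts on the 80 test sets, and the running-time differences from the running times reported for Table 4. The short check below reproduces this arithmetic; the variable names are hypothetical.

```python
TEST_SETS = 80

# Misclassification counts reported for Figures 12-14 and Figure 19.
misclassified = {
    "SVM": 13,
    "PSO-SVM": 9,
    "SFLA-SVM": 3,
    "SFLA-SVM (improved threshold)": 2,
}
# Running times in seconds as reported for Table 4.
running_time_s = {"SVM": 76.52, "PSO-SVM": 103.98, "SFLA-SVM": 131.82}

accuracy = {name: 100.0 * (TEST_SETS - miss) / TEST_SETS
            for name, miss in misclassified.items()}
extra_time = {name: round(running_time_s["SFLA-SVM"] - t, 2)
              for name, t in running_time_s.items() if name != "SFLA-SVM"}

for name, acc in accuracy.items():
    print(f"{name}: {acc:.2f}%")
print(extra_time)
```

The counts of 13, 9, 3, and 2 misclassifications correspond to accuracies of 83.75%, 88.75%, 96.25%, and 97.5%, and SFLA-SVM runs 55.30 s and 27.84 s longer than SVM and PSO-SVM, respectively.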

#### 5. Conclusions

This paper proposes the SFLA-SVM algorithm for fault detection of rolling bearings in gearboxes, which is based on the SVM algorithm and takes advantage of the global optimization capability of SFLA. Table 4 shows that the accuracies of SVM, PSO-SVM, and SFLA-SVM are 83.75%, 88.75%, and 96.25%, respectively, and the running times are 76.52 s, 103.98 s, and 131.82 s, respectively. The accuracy of SFLA-SVM is the highest. Although SFLA-SVM has the longest running time, it takes only 55.30 s and 27.84 s more than SVM and PSO-SVM, respectively. Moreover, the accuracy of the SFLA-SVM algorithm improves to 97.5% when the wavelet threshold denoising method is adopted. Comparing the running time and accuracy of the three algorithms therefore highlights the advantages of SFLA-SVM.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest in publishing this paper.

#### Acknowledgments

This work was supported by the Henan University of Science and Technology Innovation Team Support Program (19IRTSTHN011), Henan Province Postgraduate Education Teaching Reform Research and Practice Project (2017SJGLX006Y), Zhengzhou Measurement and Control Technology and Instrument Key Laboratory (121PYFZX181), and Ninth Graduate Student Innovation Project of NCWU (YK-2017-08).