#### Abstract

Fault diagnosis of rotating machinery mainly comprises fault feature extraction and fault classification. Vibration signals produced by operating machinery usually help diagnose the operational state of the equipment. Different types of fault usually have different vibrational features, which form the basis of fault diagnosis. This paper proposes a novel fault diagnosis model that, at the feature extraction stage, extracts features by combining vibration severity, dyadic wavelet energy time-spectrum, and coefficient power spectrum of the maximum wavelet energy level (VWC). At the fault classification stage, we design a support vector machine (SVM) based on the modified shuffled frog-leaping algorithm (MSFLA) to classify machinery faults accurately. Specifically, we use MSFLA to optimize the SVM parameters. MSFLA avoids getting trapped in local optima, speeds up convergence, and improves classification accuracy. Finally, we evaluate our model on a real rotating machinery platform with four different states, i.e., normal state, eccentric axle fault (EAF), bearing pedestal fault (BPF), and sealing ring wear fault (SRWF). As the results demonstrate, the VWC method is efficient in extracting vibration signal features of rotating machinery. Based on the extracted features, we further compare our classification method with three other fault classification methods, i.e., backpropagation neural network (BPNN), SVM optimized by the artificial chemical reaction optimization algorithm (ACROA-SVM), and SFLA-SVM. The experimental results show that MSFLA-SVM achieves a higher fault classification rate than BPNN, ACROA-SVM, and SFLA-SVM.

#### 1. Introduction

Fault diagnosis of rotating machinery is a highly integrated interdisciplinary field that has grown with the rapid development of modern industry. Fault feature extraction and fault classification are hot topics in fault diagnosis. The vibration signals from rotating machinery contain a lot of information that can be used to determine whether the equipment is operating normally. Different fault types produce different vibration signal features. Feature extraction analyzes the large quantity of raw data with the corresponding analysis theory and algorithm, so as to extract the feature parameters of the fault from the raw data and provide accurate data for fault classification. Based on the extracted fault signal features, we can then distinguish the fault types that these features characterize. Thus, the fault diagnosis model gives the operating state of the rotating machinery and simultaneously identifies the fault type, which reduces fault risks.

Fault feature extraction and fault classification have long been studied by scholars in fault diagnosis [1–21]. Based on a careful analysis of multiple mechanical accidents, Soher [1] classified mechanical faults into 9 categories with 37 types. Beard [2] reported a fault diagnosis technology based on analytic redundancy. Aiming at incipient gearbox faults, Saravanan and Ramachandran [22] exploited the discrete wavelet transform (DWT) to design a feature extraction method and then applied artificial neural networks (ANN) for fault classification based on the extracted features. Samanta and Al-Balushi [23] extracted the features of the vibration signals under normal and fault states of the rotating machinery and used these data as the input of an ANN. In this way, the ANN can be used for fault diagnosis of rotating machinery. Persis and Isidori [24] designed a fault detection filter and applied it to nonlinear fault detection and isolation. Zhang et al. [25] proposed a hybrid model for motor-bearing fault detection and classification. They also applied SVM to classify the type of fault and assess its severity. Li et al. [26] came up with a new approach named multimodal deep support vector classification, which reached a high classification rate on spur and helical gearboxes. Bordoloi and Tiwari [27] used the grid-search method, the artificial-bee-colony algorithm (ABCA), and the genetic algorithm (GA) to optimize SVM, which could classify four different gear faults accurately. Dong and Luo [28] designed a prediction method for bearing degradation with PCA and LS-SVM methods, which uses particle swarm optimization (PSO) to select SVM parameters. The experimental results indicate that this method is efficient. Li et al. [29] proposed a novel fault diagnosis method combining time-frequency analysis and neural networks and analyzed vibration signals of motor bearings.
Reference [30] gave a fault diagnosis model using empirical mode decomposition (EMD) and GA-SVM and analyzed a high-voltage circuit breaker. Specifically, they combined EMD and energy entropy as the feature vector and used GA-SVM to improve generalization ability and classification accuracy. Aiming at gear fault diagnosis, Yang et al. [31] exploited ensemble empirical mode decomposition to extract fault features and used SVM to classify faults. Furthermore, they adopted ABCA to optimize SVM parameters and obtained higher classification accuracy than GA-based and PSO-based methods. Seshadrinath et al. [32] used complex wavelets to identify multiple faults in variable frequency drives and validated the efficiency of complex wavelets in fault feature extraction. Dybala and Zimroz [33] proposed to diagnose the faults of rolling bearings with the EMD algorithm, which could detect faults at the early stage of bearing failure. Reference [34] showed an early fault diagnosis model of rotating machinery, which combined wavelet packet decomposition and EMD to extract the fault characteristic frequency and used BPNN to process 10 rotor fault types. The team of Cai et al. [10, 35, 36] has made remarkable achievements in fault diagnosis. They diagnosed faults of three-phase inverters using Bayesian networks (BNs). Reference [10] provided a series of fault classification methods for BNs. Cai et al. [36] proposed a multisource information fusion-based fault diagnosis method for ground-source heat pumps using BNs, significantly improving the accuracy of fault diagnosis.

The traditional approach to feature extraction relies on spectrum analysis based on the Fourier transform. However, the Fourier transform is poorly suited to analyzing the nonstationary vibration signals of high-speed rotating machinery and yields relatively low accuracy. This paper proposes a time-frequency feature extraction method named VWC, which combines the vibration severity, dyadic wavelet energy time-spectrum, and coefficient power spectrum (CPS) of the maximum wavelet energy level. VWC can extract fault vibration signal features from the time domain and the time-frequency domain. It is known that the SVM algorithm, which is based on statistical learning theory, is particularly suitable for learning from a small sample dataset. The parameters of SVM strongly influence classification performance. In particular, the selection of the kernel function parameter and the cost parameter is important to the SVM classification result. This paper proposes an MSFLA-based optimizing method for SVM, which searches for the optimal kernel function parameter and cost parameter. With the 22 features as input, MSFLA exploits an improved position updating policy and Gaussian mutation to generate new solutions, in order to avoid being trapped in a local optimum by random search. Compared with SFLA, MSFLA takes full advantage of global and local information. It keeps the diversity of the population and avoids getting trapped in a local optimum. Meanwhile, MSFLA avoids blind search, which speeds up convergence and improves classification accuracy.

#### 2. Model

The raw data are extremely large and contain both valid signals and interference signals. The interference signals make it difficult to acquire the vibration information. Thus, it is necessary to extract vibrational signal features of rotating machinery for classifying fault types. This paper proposes the VWC method for fault feature extraction of rotating machinery. Fault classification, in nature, establishes a correspondence between fault feature parameters and fault types, which can be used for accurate classification of fault types. This paper proposes the MSFLA-SVM method for fault classification.

As demonstrated in Figure 1, the fault diagnosis model consists of front-end data acquisition, signal feature extraction, and fault classification. The front-end data acquisition consists of an axial flow pump, an acceleration sensor, and a PCI eXtensions for Instrumentation (PXI) data acquisition system. The collected data are processed with the signal extraction method and are then used for fault classification. Specifically, the data are divided into training data and test data. The model can identify four operating states, including three fault states and one normal state. This paper focuses on the extraction of the signal features and the recognition of the faults.

Three fault states are considered in this paper. EAF is a rotor imbalance fault. Under ideal conditions, the pressure generated on the bearing is equal when the rotor of the rotary machine rotates; that is, the rotor is balanced. Rotor imbalance is caused by the mass eccentricity of the rotor components or by defects in the components. It is one of the most common faults in rotary machinery. EAF can cause fatigue damage and breakage of equipment, produce vibration and noise, speed up bearing wear, reduce the working efficiency and service life of the machine, and may cause destructive accidents in severe cases. In the experiment, we manually changed the weight block on the shaft to create the eccentricity fault of the shaft.

BPF is caused by bearing faults. In rotary machinery with rolling bearings, faults caused by bearings are common. The bearing pedestal consists of a bearing and a box. The fault of a rolling bearing is mainly caused by fatigue flaking, wear, and gluing. When the bearing fails, the bearing pedestal that supports the bearing jumps periodically, which changes the stiffness of the system with an impact effect and thus loosens the bearing pedestal.

The sealing ring is mainly used to prevent the medium at the outlet from flowing back, that is, it provides sealing. Meanwhile, it avoids damage caused by direct contact between the impeller and the pump casing and thus protects the impeller. SRWF can cause backflow of the internal medium and can also damage the impeller.

##### 2.1. Feature Extraction Method: VWC

Vibration severity, a time-domain parameter of the vibration signal, is an important criterion for characterizing the degree of vibration of machinery, measuring the vibration state of a machine, and indicating the damage caused by vibration. The rotor is the core of the rotating machine. When the rotor fails, the vibration frequency of the fault is closely related to the fundamental frequency of the rotor. The vibration signals contain the fundamental vibration of the rotor as well as other frequency components, such as 1/2*f*_{0}, *f*_{0}, 2*f*_{0}, and 3*f*_{0}. Different types of faults have different effects on these frequency components. When a fault occurs, the distribution of signal energy changes in each frequency band, and the fault types can be recognized according to the distribution of energy. When the machinery shows the first signs of a fault, the vibration features are relatively weak rather than obvious. Furthermore, these weak fault signals are likely to be submerged in the periodic signals and interference signals. Therefore, the dyadic wavelet energy time-spectrum method, built on the dyadic wavelet transform, is used to process the vibration signal. By analyzing the signal energy distribution, the decomposition level with the maximum dyadic wavelet energy can be found, and by analyzing the coefficient power spectrum of this maximum wavelet energy level, the energy value at each feature frequency can be extracted. In this way, the weak vibration signals of the fault can be detected.

###### 2.1.1. Vibration Severity

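The vibration severity can be calculated as

$$\mathrm{VIB} = \sqrt{\frac{1}{T}\int_{0}^{T} v^{2}(t)\,dt}, \tag{1}$$

where VIB is the vibration severity, *v*(*t*) is the vibration velocity value, and *T* is the sampling time.

Discretizing equation (1) with *T* = *N*Δ*t*, where *N* is the number of samples and Δ*t* is the sampling interval, we obtain equation (2), that is,

$$\mathrm{VIB} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} v_{i}^{2}}. \tag{2}$$

Here, the velocity *v*_{i} could be computed as equation (3) from the acceleration value *a*_{i}(*t*) obtained by the acceleration sensor:

$$v_{i} = \int a_{i}(t)\,dt. \tag{3}$$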

Rotor rotation is a periodic motion, which excites the vibration. The vibration frequency equals the rotational speed of the rotor, that is, the fundamental frequency *f*_{0}. The relationship between *f*_{0}, the rotation frequency of the rotor *f*_{r}, and the rotor speed *n* (in r/min) is as follows:

$$f_{0} = f_{r} = \frac{n}{60}. \tag{4}$$
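The time-domain quantities of this subsection can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the trapezoidal integration, the mean removal used to suppress integration drift, the function names, and the rotor speed of 960 r/min (chosen only because it gives the 16 Hz fundamental frequency reported later) are all assumptions.

```python
import math

def velocity_from_acceleration(accel, dt):
    """Integrate acceleration samples (m/s^2) into velocity (m/s) with the
    trapezoidal rule, then subtract the mean to suppress integration drift."""
    vel = [0.0]
    for a_prev, a_next in zip(accel, accel[1:]):
        vel.append(vel[-1] + 0.5 * (a_prev + a_next) * dt)
    mean = sum(vel) / len(vel)
    return [v - mean for v in vel]

def vibration_severity(velocity):
    """Root-mean-square of the velocity samples (discrete vibration severity)."""
    return math.sqrt(sum(v * v for v in velocity) / len(velocity))

def fundamental_frequency(rpm):
    """Rotation frequency in Hz for a rotor speed given in r/min."""
    return rpm / 60.0

# Synthetic check: a 16 Hz sinusoidal velocity with 2 mm/s amplitude, 10 kHz sampling.
fs = 10_000                               # sampling frequency in Hz
f0 = fundamental_frequency(960)           # assumed rotor speed of 960 r/min
amp = 2e-3                                # velocity amplitude in m/s
accel = [amp * 2 * math.pi * f0 * math.cos(2 * math.pi * f0 * i / fs)
         for i in range(fs)]              # one second of acceleration samples
vel = velocity_from_acceleration(accel, 1.0 / fs)
vib_mm_s = 1e3 * vibration_severity(vel)  # close to 2 / sqrt(2) mm/s
```

For a pure sinusoid the severity is the amplitude divided by the square root of 2, which gives a quick sanity check of the integration step before applying it to measured data.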

###### 2.1.2. Dyadic Wavelet Energy Time Spectrum and Coefficient Power Spectrum of Maximum Wavelet Energy Level

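If *f*(*t*) ∈ *L*^{2}(*R*), where *L*^{2}(*R*) denotes the square integrable space, we define the dyadic wavelet as follows:

$$\psi_{2^{j},b}(t) = 2^{-j/2}\,\psi\!\left(\frac{t-b}{2^{j}}\right), \tag{5}$$

here, *ψ*(*t*) denotes the mother wavelet, 2^{j} denotes the scale parameter, *j* ∈ Z, and *b* denotes the translation parameter. The dyadic wavelet transform is

$$W_{2^{j}}f(b) = \int_{-\infty}^{+\infty} f(t)\,\overline{\psi_{2^{j},b}(t)}\,dt = f * \overline{\psi}_{2^{j}}(b), \tag{6}$$

where $\overline{\psi}$ denotes the complex conjugate of *ψ*, $\overline{\psi}_{2^{j}}(t) = 2^{-j/2}\,\overline{\psi}(-t/2^{j})$, and $*$ denotes the convolution calculation symbol.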

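Equations (7) and (8) are the well-known Mallat wavelet decomposition algorithm and Mallat wavelet reconstruction algorithm, respectively. The coefficients *cA* and *cD* obtained by wavelet decomposition can reconstruct the original waveform:

$$cA_{j+1}(n) = \langle f, \varphi_{j+1,n} \rangle = \sum_{m} h(m-2n)\,cA_{j}(m), \qquad cD_{j+1}(n) = \langle f, \psi_{j+1,n} \rangle = \sum_{m} g(m-2n)\,cA_{j}(m), \tag{7}$$

$$cA_{j}(m) = \sum_{n} h^{*}(m-2n)\,cA_{j+1}(n) + \sum_{n} g^{*}(m-2n)\,cD_{j+1}(n), \tag{8}$$

where $\langle\cdot,\cdot\rangle$ denotes the inner product, *φ* denotes the scaling function, *h* and *g* denote the low-pass and high-pass decomposition filters, *h*^{∗} denotes the low-pass reconstruction filter, and *g*^{∗} denotes the high-pass reconstruction filter.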

According to equations (6), (7), and (8), we derive

Then, from the above equation, we further derive

Assuming *E*_{j} as the detail signal energy of the dyadic wavelet at the *j*th level, the dyadic wavelet energy time-spectrum is computed as

$$E_{j} = \sum_{n} \left| cD_{j}(n) \right|^{2}. \tag{11}$$

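In particular, equation (11) shows the degree of concentration of the signal energy at different scales. Then, assuming the maximum energy level as *k*, with detail coefficients *cD*_{k}(*n*), we compute its Fourier transform as equation (12) and obtain the corresponding coefficient power spectrum of the maximum wavelet energy level as equation (13):

$$\widehat{cD}_{k}(\omega) = \sum_{n=0}^{N-1} cD_{k}(n)\,e^{-i\omega n}, \tag{12}$$

$$P_{k}(\omega) = \frac{1}{N}\left| \widehat{cD}_{k}(\omega) \right|^{2}. \tag{13}$$

Here, the symbol “^” denotes the Fourier transform and *N* represents the data length.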

Equations (11) and (13) characterize how the energy of the vibration signal varies across scales. From equations (11) and (13), we can also find the main frequency band in which the energy is concentrated, which is significant for fault feature extraction.
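The procedure described above (decompose, find the level with the most detail energy, then take the power spectrum of that level's coefficients) can be illustrated with a small pure-Python sketch. Several choices here are assumptions made for illustration only: the Haar filter pair stands in for the unspecified mother wavelet, the decomposition is the undecimated (à trous) form with circular boundaries, the power spectrum uses a naive DFT, and all function names and the synthetic test signal are hypothetical.

```python
import cmath
import math

def dyadic_haar_details(signal, levels):
    """Undecimated (a trous) Haar decomposition with circular boundaries.
    Returns one detail sequence per level, each as long as `signal`."""
    n = len(signal)
    approx = list(signal)
    details = []
    for j in range(levels):
        step = 2 ** j  # the two filter taps are 2^j samples apart at level j + 1
        shifted = [approx[(i + step) % n] for i in range(n)]
        details.append([0.5 * (a - b) for a, b in zip(approx, shifted)])
        approx = [0.5 * (a + b) for a, b in zip(approx, shifted)]
    return details

def level_energies(details):
    """Energy of the detail coefficients at each level (sum of squares)."""
    return [sum(d * d for d in level) for level in details]

def coefficient_power_spectrum(coeffs):
    """|DFT|^2 / N for bins 0..N/2, using a naive O(N^2) DFT for clarity."""
    n = len(coeffs)
    return [
        abs(sum(c * cmath.exp(-2j * math.pi * k * i / n)
                for i, c in enumerate(coeffs))) ** 2 / n
        for k in range(n // 2 + 1)
    ]

# Synthetic signal: a 6 Hz tone plus a weaker 40 Hz tone, 256 Hz sampling, 1 s long.
fs = 256
signal = [math.sin(2 * math.pi * 6 * i / fs) + 0.2 * math.sin(2 * math.pi * 40 * i / fs)
          for i in range(fs)]
details = dyadic_haar_details(signal, levels=6)
energies = level_energies(details)
max_level = energies.index(max(energies)) + 1      # 1-based level with the most energy
cps = coefficient_power_spectrum(details[max_level - 1])
peak_hz = cps.index(max(cps)) * fs / len(signal)   # frequency of the strongest CPS bin
```

Because the undecimated form keeps every level at the original sampling rate, the frequency axis of the coefficient power spectrum can be read directly in Hz. A decimated decomposition would need the detail signal reconstructed to full length first.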

##### 2.2. Fault Classification Method: MSFLA-SVM

###### 2.2.1. SVM

Compared with traditional methods, SVM performs better with limited sample data and in dealing with global optimization and the curse of dimensionality. In engineering applications, the sample data are mostly nonlinear, which is a common difficulty. A kernel function can be used to map data that are linearly nonseparable in a low-dimensional space to a high-dimensional space in which the data are separable. Here, the radial basis function (RBF) kernel function is as follows:

$$K(x_{i}, x_{j}) = \exp\left(-\frac{\left\| x_{i} - x_{j} \right\|^{2}}{2\sigma^{2}}\right), \tag{14}$$

where *σ* represents the kernel function width.

The algorithm steps for the SVM classifier are as follows:

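*Step 1.* Assume the training data sample set as *T*:

$$T = \{(x_{1}, y_{1}), \ldots, (x_{l}, y_{l})\}, \quad x_{i} \in R^{n},\ y_{i} \in \{-1, 1\},\ i = 1, \ldots, l. \tag{15}$$

*Step 2.* Select the appropriate kernel function *K*(*x*, *x*′) and cost parameter *C*.

*Step 3.* Construct and solve the optimization problem:

$$\min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l} y_{i} y_{j} \alpha_{i} \alpha_{j} K(x_{i}, x_{j}) - \sum_{j=1}^{l} \alpha_{j}, \quad \text{s.t.}\ \sum_{i=1}^{l} y_{i}\alpha_{i} = 0,\ 0 \le \alpha_{i} \le C,\ i = 1, \ldots, l. \tag{16}$$

Solving this problem yields the optimal solution $\alpha^{*} = (\alpha_{1}^{*}, \ldots, \alpha_{l}^{*})^{T}$, which is used in the following steps.

*Step 4.* Select a positive component $\alpha_{j}^{*}$ of $\alpha^{*}$ with $0 < \alpha_{j}^{*} < C$ to calculate the threshold $b^{*}$:

$$b^{*} = y_{j} - \sum_{i=1}^{l} y_{i}\,\alpha_{i}^{*} K(x_{i}, x_{j}). \tag{17}$$

*Step 5.* Construct the decision function:

$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{l} \alpha_{i}^{*} y_{i} K(x, x_{i}) + b^{*}\right). \tag{18}$$

In SVM, the parameters *C* and *σ* have a great effect on fault classification. A small *C* usually leads to underfitting, thus causing lower training and prediction accuracy. In contrast, a large *C* could lead to overfitting, which brings higher training accuracy but poorer prediction accuracy. *σ* can balance the impact of *C*. Reasonable values of *C* and *σ* usually achieve a balance between training accuracy, generalization ability, and classification accuracy for SVM. In this paper, we use the MSFLA method to optimize these SVM parameters.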
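To make Steps 2 and 5 concrete, the following Python sketch evaluates an RBF kernel and the sign of the resulting kernel expansion. It assumes the common Gaussian form of the kernel with width `sigma`. The support vectors, multipliers, and bias are hand-picked placeholders rather than the output of a trained model, and the function names are hypothetical.

```python
import math

def rbf_kernel(x, z, sigma):
    """Gaussian RBF kernel exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def decision(x, support_vectors, labels, alphas, bias, sigma):
    """Sign of the kernel expansion: returns +1 or -1."""
    score = bias + sum(a * y * rbf_kernel(x, sv, sigma)
                       for sv, y, a in zip(support_vectors, labels, alphas))
    return 1 if score >= 0 else -1

# Placeholder two-point model (not trained): one support vector per class.
svs = [(0.0, 0.0), (2.0, 2.0)]
ys = [-1, 1]
alphas = [1.0, 1.0]
bias = 0.0

near_negative = decision((0.1, 0.1), svs, ys, alphas, bias, sigma=1.0)
near_positive = decision((1.9, 2.1), svs, ys, alphas, bias, sigma=1.0)
```

A smaller `sigma` makes each support vector influence only its immediate neighborhood, which is one way to see why the kernel width and the cost parameter have to be tuned together.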

###### 2.2.2. SFLA

As a heuristic algorithm, SFLA combines the advantages of the memetic algorithm (MA) and PSO and features a simple structure, few parameters, fast convergence, and easy implementation. In SFLA, one frog represents one candidate solution. The frog population is divided into several memeplexes, each of which consists of some frogs. The algorithm combines global search and local search to evolve towards the global optimum.

SFLA process is as follows:

*Step 1 (initial frog population).* Generate *N* candidate solutions randomly and take them as the initial frog population *F* = (*X*_{1}, *X*_{2},…, *X*_{N}). The candidate solution represented by the frog numbered *i* is *X*_{i} = (*X*_{i1}, *X*_{i2},…, *X*_{id}), where *d* is the dimension of the solution and 1 ≤ *i* ≤ *N*.

*Step 2 (calculate fitness).* Calculate the fitness value of all frogs according to the fitness function, which is defined as the classification accuracy in the cross-validation sense.

*Step 3 (memeplex division).* Arrange the *N* frogs in descending order of fitness value and divide the population into *M* memeplexes. Allocate the first frog to the first memeplex, the second frog to the second memeplex, and so on until the *M*th frog is allocated to the *M*th memeplex. Then, allocate the (*M* + 1)th frog to the first memeplex, the (*M* + 2)th frog to the second memeplex, and so on, until all *N* frogs are allocated. The whole population is thus divided into *M* memeplexes, each containing *P* frogs, namely, *N* = *M* × *P*. We use *z* to denote the memeplex number. The division formula is shown in equation (19):

$$U_{z} = \{ X_{z+M(l-1)} \mid 1 \le l \le P \}, \quad 1 \le z \le M. \tag{19}$$

*Step 4 (local updating).* In each memeplex, *F*_{b} represents the frog that occupies the best position in its memeplex, *F*_{w} represents the frog in the poorest position in its memeplex, and *F*_{g} represents the frog in the best position in the whole population. In each iteration within a memeplex, *F*_{w} is adjusted with the method shown in equation (20):

$$D_{i} = \mathrm{rand}() \times (F_{b} - F_{w}), \qquad F_{\mathrm{new\_w}} = F_{w} + D_{i}, \qquad -D_{\max} \le D_{i} \le D_{\max}, \tag{20}$$

here, *D*_{i} represents the frog’s movement distance.

Equation (20) gives the updated position of the poorest frog, where good positions indicate high fitness values. rand( ) is a random number in [0, 1]. The frog’s movement distance is limited to the range (−*D*_{max}, *D*_{max}). After each iteration within a memeplex, if the poorest frog obtains a position better than the previous one, i.e., *F*_{new_w} > *F*_{w}, then the frog in this new position takes the place of the old frog, which means replacing *F*_{w} with *F*_{new_w}. Otherwise, *F*_{g} is used to replace *F*_{b} in equation (20), which gives equation (21), and the above updating process is repeated:

$$D_{i} = \mathrm{rand}() \times (F_{g} - F_{w}), \qquad F_{\mathrm{new\_w}} = F_{w} + D_{i}, \qquad -D_{\max} \le D_{i} \le D_{\max}. \tag{21}$$

If the poorest frog still does not improve or its movement distance exceeds the maximum movement distance after calculating with equation (21), a new solution is generated randomly to replace the original *F*_{w}. With this method, each memeplex is updated internally a certain number of times, and the poorest frog position keeps being updated until the number of local searches is reached.

*Step 5 (global updating).* After each memeplex completes its local search, all frogs are mixed. Repeat Steps 2–4 until the maximum number of global iterations is reached or the accuracy requirement is met.
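Steps 1 to 5 can be sketched as a compact Python routine. This is an illustrative reading of the procedure, not the authors' code. The toy fitness function replaces the cross-validated classification accuracy, steps larger than `d_max` are clipped rather than rejected, candidates are clamped to the search bounds, and every name and default parameter value is hypothetical.

```python
import random

def partition(frogs_sorted, m):
    """Deal frogs (sorted best first) into m memeplexes: frog i goes to memeplex i mod m."""
    return [frogs_sorted[z::m] for z in range(m)]

def sfla(fitness, dim, bounds, n_frogs=20, m=4, local_iters=5,
         global_iters=30, d_max=1.0, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds

    def new_frog():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    frogs = [new_frog() for _ in range(n_frogs)]           # Step 1
    for _ in range(global_iters):
        frogs.sort(key=fitness, reverse=True)              # Step 2: rank by fitness
        best_global = frogs[0][:]
        memeplexes = partition(frogs, m)                   # Step 3
        for mem in memeplexes:                             # Step 4: local search
            for _ in range(local_iters):
                mem.sort(key=fitness, reverse=True)
                worst = mem[-1]
                for leader in (mem[0], best_global):       # try F_b first, then F_g
                    r = rng.random()
                    step = [max(-d_max, min(d_max, r * (l - w)))
                            for l, w in zip(leader, worst)]
                    cand = [min(hi, max(lo, w + s)) for w, s in zip(worst, step)]
                    if fitness(cand) > fitness(worst):
                        break
                else:
                    cand = new_frog()                      # no improvement: random frog
                mem[-1] = cand
        frogs = [f for mem in memeplexes for f in mem]     # Step 5: mix all frogs
    return max(frogs, key=fitness)

def toy_fitness(x):
    """Higher is better; the maximum value 0 is reached at (1, -2)."""
    return -((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)

best = sfla(toy_fitness, dim=2, bounds=(-5.0, 5.0))
```

Only the poorest frog of a memeplex is ever replaced, so the best fitness in the population never decreases from one global iteration to the next. For SVM tuning, each frog would hold a candidate pair of kernel width and cost parameter.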

###### 2.2.3. MSFLA

As the number of iterations increases, the convergence of individual frogs in SFLA decreases population diversity, so the algorithm is easily trapped in a local optimum and yields a solution of low accuracy. Focusing on this problem, this paper proposes MSFLA. The method improves the update of the poorest frog *F*_{w} and, at the same time, replaces the random new solution with Gaussian mutation, so as to avoid the blind search in SFLA. MSFLA balances the global and local search ability of SFLA in a better way and improves the classification accuracy of the solution.

*(1) Updating Strategy*. When updating *F*_{w}, SFLA first moves *F*_{w} toward *F*_{b} in its memeplex. Only if the resulting *F*_{new_w} brings no improvement is *F*_{g} used instead. SFLA therefore does not make full use of *F*_{g}, which makes it easy for SFLA to converge to a local optimum. This paper proposes a novel *F*_{w} updating strategy based on equations (20) and (21).

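Suppose *U*_{z_c} is the center of the *z*th memeplex; then this center point is given by

$$U_{z\_c} = \frac{1}{P}\sum_{i=1}^{P} X_{i}^{z},$$

here, *P* denotes the number of frogs in the *z*th memeplex and $X_{i}^{z}$ denotes the *i*th frog of that memeplex.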

Then, denoting the index of the globally best frog *F*_{g} by *S*, 1 ≤ *S* ≤ *N*, we have three important definitions.

*Definition 1.* The distance between individual frogs is measured by Euclidean distance and is defined as

here, *X*_{i} and *X*_{j} represent two individual frogs (candidate solutions) and *d* represents the dimension. In Definition 1, both the distance between individuals and the distance between the center points of the memeplexes in which the individuals are located are taken into consideration.

*Definition 2.* The maximum distance between individual frogs and the global optimal frog is defined as

*Definition 3.* The minimum distance between individual frogs and the global optimal frog is defined as

This paper proposes a new frog-position updating strategy as

here, *h* and *q* denote the learning efficiencies of *F*_{b} and *F*_{g}, respectively.

For individual frogs close to *F*_{g}, simple local searching increases the probability of learning from *F*_{b}. Meanwhile, with larger *h* and smaller *q*, *F*_{b} influences *F*_{new_w} more than *F*_{g} does. On the contrary, individual frogs far from *F*_{g} are more likely to learn from *F*_{g}, and thus, *F*_{g} influences *F*_{new_w} more than *F*_{b} does. As shown in Figure 2, the update of *F*_{new_w} depends on the distances between the frogs and *F*_{g}, with *h* and *q* adjusted accordingly. Since the update of *F*_{w} in each memeplex is based on both *F*_{b} and *F*_{g}, it maintains the population diversity and prevents the algorithm from getting stuck in a local optimum. The method also helps accelerate convergence to some extent.

*(2) Improved Mutation Process Based on Gaussian Perturbation*. In SFLA, if the fitness is still worse than the original one after several local and global searches, a random new solution *F*_{new_w} is generated to replace *F*_{w}, which lowers the convergence speed. This paper instead applies a Gaussian random perturbation to obtain *F*_{new_w} as

$$F_{\mathrm{new\_w}} = F_{w} + F_{w} \times N(0, 1),$$

here, *N* (0, 1) is a Gaussian distribution with mean 0 and variance 1. In particular, the perturbation term *F*_{w} × *N* (0, 1) helps the search avoid being trapped in a local optimum.
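Two ingredients of MSFLA described above, the memeplex center point and the Gaussian mutation of the poorest frog, can be sketched in Python as follows. This is a hedged illustration only. Drawing an independent *N*(0, 1) sample per coordinate and clamping the result to the search bounds are assumptions, as are the function names. The distance-dependent learning efficiencies *h* and *q* are not reproduced here.

```python
import random

def memeplex_center(memeplex):
    """Coordinate-wise mean of the frogs in one memeplex."""
    p = len(memeplex)
    dim = len(memeplex[0])
    return [sum(frog[k] for frog in memeplex) / p for k in range(dim)]

def gaussian_mutation(worst, bounds, rng):
    """Perturb each coordinate x of the poorest frog as x + x * N(0, 1),
    then clamp the result to the search bounds."""
    lo, hi = bounds
    return [min(hi, max(lo, x + x * rng.gauss(0.0, 1.0))) for x in worst]

rng = random.Random(1)
center = memeplex_center([[0.0, 0.0], [2.0, 4.0]])
mutated = gaussian_mutation([1.0, -3.0], (-5.0, 5.0), rng)
```

One property of this multiplicative perturbation is that a coordinate that is exactly zero stays at zero, so in practice the search range for parameters such as *C* and the kernel width would be kept away from the origin.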

#### 3. Experiment and Analysis

##### 3.1. Experiment Environment

The experiment focuses mainly on four working modes, namely, normal state, EAF, BPF, and SRWF, with the latter three known as typical rotating machinery faults. We use a vertical axial flow pump in the experiment. The fundamental frequency *f*_{0} is 16 Hz according to equation (4). The experiment is carried out in a closed water loop, in which the running vertical axial flow pump drives the water to flow around the circuit. A vibration acceleration sensor is installed on the vertical axial flow pump to collect vibration signals. The sampling frequency of the acquisition card is set to 10 kHz. The experiment environment parameters are shown in Table 1. The experiment site is shown in Figure 3.

##### 3.2. Working Condition and Analysis

###### 3.2.1. Normal State

As shown in Figure 4(a), the waveform is relatively messy but contains rich frequency components. The vibration severity of the vibration signal is 1.051 mm/s. Furthermore, according to equation (11), we obtain its dyadic wavelet energy time-spectrum, shown in Figure 4(c). The d5 level is clearly the maximum wavelet energy level. From Figure 4(b), a large amount of energy concentrates at *f*_{0}. Meanwhile, energy concentration can also be found in the power spectrum at 32 Hz and 48 Hz (the second and third harmonics). There is power-line interference at 50 Hz. An energy peak at 4*f*_{0} (64 Hz) is also found, caused by the impeller of the vertical axial flow pump.

Figure 4: panels (a)–(c).

###### 3.2.2. EAF

For the signal in Figure 5(a), the vibration severity is 3.737 mm/s. The energy is mainly concentrated at *f*_{0}. In Figure 5(b), some energy also appears at 28 Hz, 34 Hz, and 38 Hz, due to crosstalk from other devices at the experimental site.

Figure 5: panels (a)–(c).

###### 3.2.3. BPF

From the raw data in Figure 6(a), we obtain a vibration severity of 1.161 mm/s. Figure 6(c) shows the dyadic wavelet energy time-spectrum of BPF. From Figure 6(b), the energy concentrates at *f*_{0}, and the components at 1/2*f*_{0} and 3*f*_{0} become more prominent.

Figure 6: panels (a)–(c).

###### 3.2.4. SRWF

For the signal in Figure 7(a), the vibration severity is 1.70024 mm/s. From Figure 7(b), the energy concentrates at *f*_{0}, and the components at 1/2*f*_{0} and 2*f*_{0} become more prominent.

Figure 7: panels (a)–(c).

###### 3.2.5. Analysis and Discussion

After feature extraction, we select the vibration severity and the wavelet CPS amplitudes at 1/2*f*_{0}, *f*_{0}, 2*f*_{0}, 3*f*_{0}, 4*f*_{0}, 5*f*_{0}, and 6*f*_{0} as feature parameters. Each frequency is measured in 3 vibrational directions, i.e., the axial, tangential, and radial directions. Thus, one group of data includes 1 + 7 × 3 = 22 fault characteristic parameters in total. We collect 60 groups of data for each of the four states in this experiment. Experimental conditions prevented us from collecting more sample data, but SVM is well suited to small sample datasets. Therefore, for each state, we select 45 groups as training data and the other 15 groups as testing data. The training data form a 180 × 22 input matrix, and the testing data form a 60 × 22 input matrix. The training dataset is collected from the 4 different rotating machinery states, i.e., normal, EAF, BPF, and SRWF. The 22 features, composed of the vibration severity and the amplitudes at 1/2*f*_{0}∼6*f*_{0} in three directions, are shown in Figures 8 and 9.
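The feature layout and the per-state split described above can be checked with a short Python sketch. The feature ordering, the random selection of the 45 training groups, and all names are assumptions. The all-zero vectors are placeholders used only to verify the shapes, not measured data.

```python
import random

HARMONICS = ["1/2f0", "f0", "2f0", "3f0", "4f0", "5f0", "6f0"]
DIRECTIONS = ["axial", "tangential", "radial"]
STATES = ["normal", "EAF", "BPF", "SRWF"]

def feature_names():
    """Vibration severity followed by one CPS amplitude per (direction, harmonic)."""
    return ["vibration_severity"] + [f"cps_{d}_{h}" for d in DIRECTIONS for h in HARMONICS]

def split_per_state(groups_by_state, n_train, rng):
    """Shuffle each state's groups and split them into (features, label) pairs."""
    train, test = [], []
    for state, groups in groups_by_state.items():
        shuffled = groups[:]
        rng.shuffle(shuffled)
        train += [(g, state) for g in shuffled[:n_train]]
        test += [(g, state) for g in shuffled[n_train:]]
    return train, test

names = feature_names()
# Placeholder data: 60 all-zero feature vectors per state, only to check the shapes.
data = {s: [[0.0] * len(names) for _ in range(60)] for s in STATES}
train_set, test_set = split_per_state(data, n_train=45, rng=random.Random(0))
```

Splitting within each state keeps the four classes balanced in both sets, with 45 training and 15 testing groups per state.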

Figure 9: panels (a)–(l).

From Figure 8, we can see that the vibration severity of EAF is the largest, followed by that of BPF, and the vibration severity of the normal state is the smallest. From Figures 9(a)–9(c), we can see that the amplitude of wavelet CPS in the tangential direction is larger than that in the axial and radial directions in the normal state, and the maximum value in the tangential direction appears at *f*_{0}, which suggests that the energy is mainly concentrated at *f*_{0} in the normal state. From Figures 9(d)–9(f), the maximum wavelet CPS amplitude in the tangential direction increases from 1.683 × 10^{−3} in the normal state to 3.322 × 10^{−3}. Meanwhile, Figures 9(a) and 9(f) also show that the maximal CPS values at the 7 characteristic frequency points 1/2*f*_{0}∼6*f*_{0} increase when the equipment is in the EAF state. Compared with Figure 9(a), the waveform at the 7 feature frequency points in Figure 9(d) is more chaotic and less regular than in the normal state. Comparing Figures 9(b) and 9(e), we can see that the axial energy is mainly concentrated at *f*_{0} in the normal condition and is dispersed to the two frequencies 1/2*f*_{0} and *f*_{0} when EAF occurs. In Figures 9(c) and 9(f), the maximal CPS value increases from 4.574 × 10^{−4} in the normal state to 2.492 × 10^{−3}. It can be seen from Figures 9(d)–9(f) that EAF mainly affects the tangential direction and that the maximum energy is mainly concentrated at *f*_{0} in the tangential direction, which is in line with the fault features of EAF. In Figures 9(g)–9(i), the largest energy is also concentrated in the tangential direction. Compared with the normal state, when BPF occurs, the maximum value of CPS increases from 1.683 × 10^{−3} to 4.701 × 10^{−3}. In Figure 9(g), the energy is mainly concentrated at 1/2*f*_{0}, which is not consistent with the normal state.
However, when BPF occurs, the value at 1/2*f*_{0} increases, indicating that BPF in the tangential direction has the greatest influence on 1/2*f*_{0}. Normally, the energy in the axial direction is concentrated at 1/2*f*_{0}. From Figure 9(h), we can see that the energy at 5*f*_{0} and 6*f*_{0} increases. The analysis suggests that these high-frequency energy shocks are caused by the flow-induced vibration and natural frequency of the internal components of the equipment. It can be seen from Figures 9(c) and 9(i) that the energy in the radial direction decreases slightly and its concentration shifts from *f*_{0} to 1/2*f*_{0}, 3*f*_{0}, and 4*f*_{0}. The maximum energy appears at 1/2*f*_{0} in the tangential direction when BPF happens. Thus, BPF mainly influences the tangential direction, which is consistent with the fault features of BPF. As shown in Figures 9(j)–9(l), the energy of SRWF in the tangential direction reaches the maximum, and the energy intensity of SRWF in the axial and radial directions lies between those of EAF and BPF.

Before fault classification, we normalize all training data and testing data, which speeds up convergence. We evaluate MSFLA-SVM with 240 real data groups, which form a small dataset. In particular, each of the 4 states has 60 groups, each consisting of 22 features. We use 180 groups as training samples and the other 60 groups as testing samples. The comparison of the training classification results is shown in Table 2, and the comparison of the testing classification results is shown in Table 3. The BPNN was proposed by Huang and Xie [37], and the ACROA-SVM was developed by Ao et al. [38].
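The normalization step is not specified further, so the following Python sketch shows one common choice as an assumption: per-feature min-max scaling whose statistics are computed on the training rows only and then reused for the testing rows. The function names and the small numeric example are hypothetical.

```python
def fit_min_max(rows):
    """Per-feature minimum and range, computed from the training rows."""
    cols = list(zip(*rows))
    mins = [min(c) for c in cols]
    # A constant feature has zero range; use 1.0 there to avoid division by zero.
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return mins, spans

def apply_min_max(rows, mins, spans):
    """Scale every row with the statistics obtained from the training data."""
    return [[(v - lo) / span for v, lo, span in zip(row, mins, spans)] for row in rows]

train_rows = [[1.0, 10.0], [3.0, 30.0], [2.0, 20.0]]
test_rows = [[2.5, 40.0]]
mins, spans = fit_min_max(train_rows)
train_scaled = apply_min_max(train_rows, mins, spans)
test_scaled = apply_min_max(test_rows, mins, spans)
```

Training values land in [0, 1], while a testing value outside the training range, such as 40.0 here, maps outside that interval. This is expected when the scaling is fitted on the training set alone.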

From Table 3, the normal state has the highest classification accuracy, with an average testing classification accuracy of 96.667%, followed by EAF and SRWF with 90.000% each. BPF has the lowest classification rate, only 85.000%. The main reason is that the raw signal of BPF is affected by component resonance and field interference signals, which lowers the classification rate. Meanwhile, MSFLA-SVM achieves the best classification accuracy of 93.333%, higher than the 91.667% of ACROA-SVM, the 91.667% of SFLA-SVM, and the 85.000% of BPNN. On the 60 testing groups, these rates correspond to 56, 55, 55, and 51 correctly classified groups, respectively.

#### 4. Conclusions

Fault feature extraction and fault classification are the core of fault diagnosis. Since the traditional Fourier transform-based methods can only analyze stationary signals, how to analyze nonstationary signals is still an open topic. Local signal features are very effective in fault diagnosis yet difficult to extract. Focusing on this problem, this paper proposes a novel fault diagnosis model consisting of VWC and MSFLA-SVM. As the results demonstrate, VWC can accurately capture the weak local transient changes of the signal. Based on the features extracted by VWC, we classify the normal state, EAF, BPF, and SRWF with BPNN, ACROA-SVM, SFLA-SVM, and MSFLA-SVM. As demonstrated by the experimental results, the proposed MSFLA-SVM achieves the best classification accuracy among the four methods. Yet, due to the complexity of BPF signals, all four methods have low classification accuracy on the BPF signal. In the future, we will focus on improving the classification rate of BPF by increasing signal conditioning and reducing interference signals.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

#### Acknowledgments

This research was supported by the Scientific Research Fund of State’s Key Project of Research and Development Plan (2016YFB0800600), Scientific Research Fund of Sichuan Provincial Education Department of China (17ZA0082), Sichuan Science and Technology Program (2018JY0292 and 2018GZ0247), Open Fund of Engineering Laboratory of Spatial Information Technology of Highway Geological Disaster Early Warning in Hunan Province (Changsha University of Science & Technology, kfj170602), Scientific Research Fund of Key Laboratory of Pattern Recognition and Intelligent Information Processing of Sichuan, Chengdu University (MSSB-2018-04), and Chengdu Science and Technology Program (2018-YF05-00731-SN). The authors would like to express their sincere thanks to Mr. Pan He (Nuclear Power Institute of China).