Abstract

When the vibration signals of rolling bearings contain strong interference noise, the spectrum division of the signals is seriously disturbed. The traditional empirical wavelet transform (EWT) then decomposes a signal into a large number of components, and it is difficult to select the components that contain fault information. To address these problems, this paper proposes an improved empirical wavelet transform (IEWT). A simulation experiment shows that IEWT avoids the excessive number of EWT components and effectively separates the impact component, which carries the bearing fault information, from noise. IEWT is combined with the support vector machine (SVM) to diagnose rolling bearing faults. Permutation entropy (PE) is used to construct the feature vectors because it is sensitive to dynamic changes in nonstationary and nonlinear signals. The penalty factor C and the kernel parameter of the SVM are optimized by the quantum genetic algorithm (QGA). Comparison with the traditional EWT and variational mode decomposition (VMD) methods demonstrates the effectiveness and advantages of the proposed method. The classification ability of SVM is also better than that of K-nearest neighbor (KNN) and extreme learning machine (ELM).

1. Introduction

The rolling bearing is an essential part of rotating machinery. Because it works in a harsh environment and bears complex loads, it is one of the components most prone to faults. Condition monitoring and fault diagnosis are therefore necessary to ensure the normal operation of rotating machinery.

Vibration signals of a rolling bearing are nonlinear and nonstationary. The most commonly used method for such signals is empirical mode decomposition (EMD), proposed in [1]. EMD was a major breakthrough in extracting fault information and improving the signal-to-noise ratio [2]. However, EMD has serious defects, namely, mode-mixing and the endpoint effect [3]. Although ensemble empirical mode decomposition (EEMD) and its improved variants [4, 5] suppress mode-mixing to a certain extent, their effectiveness still needs improvement.

To address the lack of a complete theoretical derivation for EMD, as well as its mode-mixing and endpoint-effect problems, the empirical wavelet transform (EWT), which is based on the wavelet transform, was proposed [6]. EWT divides the Fourier spectrum and constructs an orthogonal wavelet filter bank to obtain physically meaningful amplitude-modulated (AM) and frequency-modulated (FM) components, thereby realizing an adaptive decomposition of the signal. EWT has a rigorous mathematical derivation, and its calculation involves no iteration, so it is more efficient than other signal decomposition methods [7].

Xi [8] combined EWT with singular value decomposition (SVD) and singular value packet decomposition (SVDP) for fault diagnosis of bearings, rotors, and universal shafts and achieved certain results. Zhou [9] proposed a fault classification and diagnosis method for transmission lines based on EWT and a learning vector quantization (LVQ) neural network. In signal processing with EWT, spectrum division is the key step that determines the quality of the decomposition. When the collected vibration signals contain a lot of noise, the spectrum is seriously affected, and the resulting components rarely isolate the fault characteristic components. Moreover, when EWT faces a complex spectrum, the scale-space method is adopted to select the boundary points. Because the threshold value is small, too many cutoff points are obtained, which produces too many EWT components and makes it difficult to select useful components in the subsequent analysis. Deng et al. [10] proposed dividing the frequency spectrum by constructing a variable-bandwidth sliding frequency window. Their envelope spectrum harmonic noise ratio index was used to determine the position of the frequency window, which improved the accuracy of the spectrum division. Reference [11] proposed a method based on the spectral envelope to overcome the excessive division of the high-amplitude spectrum. However, this method requires manual estimation of the number of subdivisions, which limits its adaptability, and overdecomposition or underdecomposition still occurs for complex spectra when the estimate is inaccurate. In this paper, IEWT is proposed to divide the spectrum based on mutual information (MI) and thereby solve the problem of excessive components.

In bearing signal acquisition, it is difficult to obtain a large amount of bearing data, and a large amount of data is also difficult to analyze and process. SVM is an intelligent fault diagnosis method based on the principle of structural risk minimization, with prominent advantages for small-sample data [12]. The core of SVM is to map feature vectors into a high-dimensional feature space through a kernel function and to select an optimal hyperplane to classify the data. The kernel parameter and the penalty coefficient C of SVM have a great influence on the classification results. Therefore, this paper adopts an intelligent optimization algorithm to optimize these key parameters. In [13], particle swarm optimization (PSO) and the genetic algorithm (GA) were used to optimize the kernel parameter and penalty coefficient C, but both are prone to falling into local optima and have poor stability. In this paper, the quantum genetic algorithm (QGA), which has good stability and global search ability, is used instead. Permutation entropy (PE) is sensitive to dynamic changes in nonstationary and nonlinear signals, and good results have been achieved in representing the failure state of rolling bearings by PE [14]. Therefore, PE is employed as the feature vector input of SVM in this paper.

In this paper, a spectrum repartition method based on MI is proposed to solve the problem that EWT, when heavily affected by noise, produces many spectral demarcation points and therefore too many decomposition components. With this method, the fault impact components are extracted successfully from a complex time-domain series, which improves the accuracy of signal decomposition.

On this basis, a rolling bearing fault diagnosis method based on IEWT and SVM is proposed. The PEs of the components are used as the feature vector input of SVM, and the penalty factor C and kernel parameter are optimized by QGA. The rest of this paper is organized as follows: Section 2 introduces the basic theories of EWT, SVM, QGA, and PE. Section 3 describes IEWT, its verification on a simulation signal, and the proposed bearing fault diagnosis method. Section 4 presents the experiment and the comparisons. Finally, the conclusions of this work are summarized in Section 5.

2. Materials and Methods

2.1. Empirical Wavelet Transform

EWT is an adaptive decomposition method under the framework of wavelet transform theory. First, the spectrum of the vibration signal is divided adaptively, and then an orthogonal wavelet filter bank is used for decomposition to obtain AM-FM components with compactly supported Fourier spectra [15].

The frequency spectrum of the vibration signal is computed and normalized to the range [0, π]. The spectrum is divided into N contiguous segments, where ω0 = 0 and ωN = π are, respectively, the left and right boundaries, and each frequency band is represented by Λn = [ωn−1, ωn], so the entire spectrum can be represented as $\bigcup_{n=1}^{N} \Lambda_n = [0, \pi]$.

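Following the construction of the Littlewood–Paley and Meyer wavelets, an empirical wavelet is constructed. The corresponding scale function and wavelet function are defined in the Fourier domain as follows:

$$\hat{\phi}_n(\omega) = \begin{cases} 1, & |\omega| \le (1-\gamma)\omega_n \\ \cos\left[\dfrac{\pi}{2}\beta\left(\dfrac{1}{2\gamma\omega_n}\left(|\omega| - (1-\gamma)\omega_n\right)\right)\right], & (1-\gamma)\omega_n \le |\omega| \le (1+\gamma)\omega_n \\ 0, & \text{otherwise} \end{cases} \tag{1}$$

$$\hat{\psi}_n(\omega) = \begin{cases} 1, & (1+\gamma)\omega_n \le |\omega| \le (1-\gamma)\omega_{n+1} \\ \cos\left[\dfrac{\pi}{2}\beta\left(\dfrac{1}{2\gamma\omega_{n+1}}\left(|\omega| - (1-\gamma)\omega_{n+1}\right)\right)\right], & (1-\gamma)\omega_{n+1} \le |\omega| \le (1+\gamma)\omega_{n+1} \\ \sin\left[\dfrac{\pi}{2}\beta\left(\dfrac{1}{2\gamma\omega_n}\left(|\omega| - (1-\gamma)\omega_n\right)\right)\right], & (1-\gamma)\omega_n \le |\omega| \le (1+\gamma)\omega_n \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

where $\beta(x) = x^4(35 - 84x + 70x^2 - 20x^3)$ and $\gamma < \min_n \left[ (\omega_{n+1} - \omega_n) / (\omega_{n+1} + \omega_n) \right]$.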
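The Meyer-type filters can be evaluated numerically with a few lines of Python. The following is a minimal sketch, assuming the usual transition polynomial β(x) = x⁴(35 − 84x + 70x² − 20x³) and a transition-width parameter γ; the function names `beta`, `scale_hat`, and `wavelet_hat` are illustrative and not taken from any library.

```python
import math

def beta(x):
    """Transition polynomial: 0 for x <= 0, 1 for x >= 1, smooth in between."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x**4 * (35 - 84 * x + 70 * x**2 - 20 * x**3)

def scale_hat(w, wn, gamma):
    """Empirical scale function in the Fourier domain for the boundary wn."""
    w = abs(w)
    if w <= (1 - gamma) * wn:
        return 1.0
    if w <= (1 + gamma) * wn:
        arg = (w - (1 - gamma) * wn) / (2 * gamma * wn)
        return math.cos(0.5 * math.pi * beta(arg))
    return 0.0

def wavelet_hat(w, wn, wn1, gamma):
    """Empirical wavelet in the Fourier domain for the band [wn, wn1]."""
    w = abs(w)
    if (1 + gamma) * wn <= w <= (1 - gamma) * wn1:
        return 1.0  # flat passband
    if (1 - gamma) * wn1 <= w <= (1 + gamma) * wn1:
        arg = (w - (1 - gamma) * wn1) / (2 * gamma * wn1)
        return math.cos(0.5 * math.pi * beta(arg))  # upper transition
    if (1 - gamma) * wn <= w <= (1 + gamma) * wn:
        arg = (w - (1 - gamma) * wn) / (2 * gamma * wn)
        return math.sin(0.5 * math.pi * beta(arg))  # lower transition
    return 0.0
```

In the lower transition zone the scale function uses a cosine and the wavelet a sine of the same argument, so their squared magnitudes sum to one there, which is what makes the filter bank a tight frame when γ is small enough.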

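The empirical wavelet transform is defined in the same way as the classical wavelet transform. The detail coefficients and approximation coefficients are described as follows:

$$W_f^{\varepsilon}(n, t) = \langle f, \psi_n \rangle = \int f(\tau)\,\overline{\psi_n(\tau - t)}\,d\tau = F^{-1}\left[\hat{f}(\omega)\,\overline{\hat{\psi}_n(\omega)}\right]$$

$$W_f^{\varepsilon}(0, t) = \langle f, \phi_1 \rangle = \int f(\tau)\,\overline{\phi_1(\tau - t)}\,d\tau = F^{-1}\left[\hat{f}(\omega)\,\overline{\hat{\phi}_1(\omega)}\right]$$

where $\hat{\phi}_1(\omega)$ and $\hat{\psi}_n(\omega)$ are determined by equations (1) and (2), respectively.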

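According to wavelet theory, the signal is reconstructed from these coefficients as

$$f(t) = W_f^{\varepsilon}(0, t) * \phi_1(t) + \sum_{n=1}^{N-1} W_f^{\varepsilon}(n, t) * \psi_n(t) = F^{-1}\left[\hat{W}_f^{\varepsilon}(0, \omega)\,\hat{\phi}_1(\omega) + \sum_{n=1}^{N-1} \hat{W}_f^{\varepsilon}(n, \omega)\,\hat{\psi}_n(\omega)\right]$$

and the empirical mode components are obtained as

$$f_0(t) = W_f^{\varepsilon}(0, t) * \phi_1(t), \qquad f_k(t) = W_f^{\varepsilon}(k, t) * \psi_k(t), \quad k = 1, 2, \ldots, N-1$$

where $*$ denotes convolution, φ1 is the empirical scale function, and $\hat{W}_f^{\varepsilon}(0, \omega)$ and $\hat{W}_f^{\varepsilon}(n, \omega)$ are, respectively, the Fourier transforms of $W_f^{\varepsilon}(0, t)$ and $W_f^{\varepsilon}(n, t)$.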

2.2. Support Vector Machine

SVM is a machine learning method based on VC dimension theory and the structural risk minimization principle. SVM shows strong advantages in the classification and regression of small-sample data [16]. The core of SVM is to map data that are not linearly separable into a high-dimensional feature space through a kernel function and to construct the optimal classification hyperplane in that space to realize data classification [17]. In this paper, cross-validation is used to train and verify the model.

Given the data {(xi, yi); i = 1, 2, …, n}, xi is the sample data and yi ∈ {−1, +1} is the sample category. If the hyperplane ω·x + b = 0 correctly separates the samples and has the largest classification interval, then finding the optimal classification plane can be converted into the following convex quadratic optimization problem:

$$\min_{\omega, b} \; \frac{1}{2}\|\omega\|^2 \quad \text{s.t.} \quad y_i(\omega \cdot x_i + b) \ge 1, \quad i = 1, 2, \ldots, n$$

where ω is the weight vector and b is the bias.

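When slack variables ξi and the penalty factor C are introduced into the solution process, the optimization problem is expressed as

$$\min_{\omega, b, \xi} \; \frac{1}{2}\|\omega\|^2 + C\sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad y_i(\omega \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, n$$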

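A kernel function K(xi, xj) that satisfies the Mercer condition is used to replace the inner product operation, so the optimal classification function is

$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b\right)$$

where αi are the Lagrange multipliers of the dual problem.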

In this paper, the RBF kernel function is adopted, and the equation is

$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)$$

where σ is the standard deviation (kernel width), which represents the degree of correlation between the support vectors.
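As an illustration of how the kernel enters the classification function, the following Python sketch evaluates an RBF kernel of the form exp(−‖xi − xj‖²/(2σ²)) and the sign of the kernel expansion. It is a sketch under stated assumptions: the support vectors, multipliers `alphas`, and bias `b` in the usage are made-up values rather than the result of training, and the function names are hypothetical.

```python
import math

def rbf_kernel(xi, xj, sigma):
    """RBF kernel, assuming the width is parameterized by sigma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def svm_decision(x, support_vectors, labels, alphas, b, sigma):
    """Return +1 or -1 from the sign of the kernel expansion plus bias."""
    score = sum(a * y * rbf_kernel(sv, x, sigma)
                for sv, y, a in zip(support_vectors, labels, alphas))
    return 1 if score + b >= 0 else -1

# Toy usage with hand-picked (not trained) multipliers.
svs, ys, alphas = [[0.0], [2.0]], [1, -1], [1.0, 1.0]
left = svm_decision([0.1], svs, ys, alphas, b=0.0, sigma=1.0)
right = svm_decision([1.9], svs, ys, alphas, b=0.0, sigma=1.0)
```

A larger σ makes the kernel value decay more slowly with distance, so more support vectors influence each prediction.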

2.3. Quantum Genetic Algorithm

In SVM classification, the penalty factor C and the kernel parameter σ have a great influence on the classification accuracy and generalization ability. When C is large, the classification accuracy is higher but the generalization ability is poor. When C is small, the classification accuracy is reduced. When σ is large, the relationship between the support vectors is too close and the classification accuracy is reduced. When σ is small, the relationship between the support vectors is sparse, the model is relatively complex, and the generalization ability is weak [18]. To obtain the best classification effect and balance generalization ability against classification accuracy, an intelligent optimization algorithm is adopted to optimize the parameters.

Although GA and PSO are structurally simple, easy to implement, and fast to converge, they are prone to premature convergence, which makes it difficult to find the optimal parameters [19]. QGA compensates for the shortcomings of GA in convergence speed and its tendency to fall into local optima, and it has stronger adaptability, stability, and global search ability.

The key to QGA is to construct appropriate quantum gates and to update the qubits with the quantum rotation gate. The specific operations are as follows:

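Suppose there is a population containing N individuals, $Q(T) = \{q_1^T, q_2^T, \ldots, q_N^T\}$, where $q_j^T$ is the j-th individual of the T-th generation of the population, and there is

$$q_j^T = \begin{bmatrix} \alpha_1^T & \alpha_2^T & \cdots & \alpha_m^T \\ \beta_1^T & \beta_2^T & \cdots & \beta_m^T \end{bmatrix}, \qquad |\alpha_i^T|^2 + |\beta_i^T|^2 = 1$$

where T represents the evolutionary generation and m represents the chromosome length.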

QGA optimization is realized by qubit coding and by updating the qubits with the quantum rotation gate. Building a new population, initializing it, and measuring the qubits to obtain binary values are all carried out on the qubit encoding. To optimize the model parameters, the fitness of each individual is calculated and the best solution is saved; the quantum rotation gate then updates the current qubits toward this best solution, and the loop repeats until the termination condition is met, at which point the optimal solution is output [20].
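The measure, evaluate, and rotate loop can be sketched in Python as follows. This is a simplified illustration rather than the authors' implementation: each qubit is stored as an angle, the rotation step `delta` is fixed instead of being read from a lookup table, and the names `decode` and `qga_maximize` as well as the toy fitness in the usage are assumptions made for the example.

```python
import math
import random

def decode(bits, lo, hi):
    """Map a bit list to a real value in [lo, hi], e.g. a candidate C."""
    as_int = int("".join(str(b) for b in bits), 2)
    return lo + (hi - lo) * as_int / (2 ** len(bits) - 1)

def qga_maximize(fitness, n_bits, pop_size=20, generations=50,
                 delta=0.05 * math.pi, seed=0):
    rng = random.Random(seed)
    eps = 0.05  # keeps every qubit slightly away from a pure 0 or 1 state
    # A qubit (alpha, beta) is stored as an angle: alpha = cos(t), beta = sin(t).
    # t = pi/4 means 0 and 1 are equally likely when the qubit is measured.
    population = [[math.pi / 4] * n_bits for _ in range(pop_size)]
    best_bits, best_fit, history = None, -math.inf, []
    for _ in range(generations):
        for chrom in population:
            # Measurement: each bit collapses to 1 with probability sin(t)^2.
            bits = [1 if rng.random() < math.sin(t) ** 2 else 0 for t in chrom]
            fit = fitness(bits)
            if fit > best_fit:
                best_bits, best_fit = bits, fit
            # Rotation gate: turn each disagreeing qubit toward the best bit.
            for i, (b, g) in enumerate(zip(bits, best_bits)):
                if b != g:
                    step = delta if g == 1 else -delta
                    chrom[i] = min(max(chrom[i] + step, eps), math.pi / 2 - eps)
        history.append(best_fit)  # best fitness found so far
    return best_bits, best_fit, history

# Toy usage: find the 8-bit code whose decoded value is closest to 10.
bits, fit, hist = qga_maximize(lambda b: -abs(decode(b, 0.1, 100.0) - 10.0), 8)
```

In the diagnosis setting, the fitness would be the cross-validated classification accuracy for the decoded pair of SVM parameters instead of this toy objective.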

2.4. Permutation Entropy

Permutation entropy is a measure of the complexity of a time series. It is sensitive to small changes in nonstationary and nonlinear signals and has good robustness to noise. It shows great advantages in representing the fault state of vibration signals of rotating machinery [21, 22].

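Define a time series {X(i), i = 1, 2, …, k}. After phase-space reconstruction with embedding dimension m and time delay τ, each reconstructed vector is mapped to the ordinal pattern of its elements, and Pl denotes the relative frequency of the l-th pattern. According to Shannon entropy, the permutation entropy can then be defined as

$$H_p(m) = -\sum_{l=1}^{L} P_l \ln P_l$$

where m represents the embedding dimension and L ≤ m! is the number of distinct patterns that occur. When Pl = 1/m!, Hp(m) reaches its maximum value ln(m!). Hp(m) can be normalized as

$$H_p = \frac{H_p(m)}{\ln(m!)}$$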

The range of Hp is 0 ≤ Hp ≤ 1, and the value of Hp indicates the randomness and complexity of the time series. The larger the Hp is, the stronger the randomness of time series is. The smaller the Hp is, the stronger the regularity of time series is [23].

The embedding dimension m and time delay τ have a great influence on the calculated permutation entropy. In [24], the authors suggest that permutation entropy changes little when τ is in the range [1, 6], so τ = 1 is used in this paper. When m is in the range [3, 7], the permutation entropy reflects small changes in the time series well. In [25], the authors indicate that m = 5∼7 gives the best characterization of the dynamic changes of a time series. Our experiments show that m = 5 gives the highest computational efficiency within this range, so m = 5 is used in this paper.
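The computation can be illustrated with a short, dependency-free Python sketch. It assumes that ordinal patterns are obtained by sorting each embedded vector, with ties broken by position; the function name `permutation_entropy` and its defaults (m = 5, τ = 1, as chosen above) are illustrative.

```python
import math
from collections import Counter

def permutation_entropy(x, m=5, tau=1, normalize=True):
    """Permutation entropy of sequence x with embedding dimension m >= 2."""
    n_vec = len(x) - (m - 1) * tau  # number of embedded vectors
    if n_vec <= 0:
        raise ValueError("time series too short for the chosen m and tau")
    patterns = Counter()
    for i in range(n_vec):
        window = [x[i + j * tau] for j in range(m)]
        # Ordinal pattern: the index order that sorts the window.
        patterns[tuple(sorted(range(m), key=window.__getitem__))] += 1
    h = -sum((c / n_vec) * math.log(c / n_vec) for c in patterns.values())
    return h / math.log(math.factorial(m)) if normalize else h

# A monotonic ramp has a single ordinal pattern, so its entropy is zero.
ramp_pe = permutation_entropy(list(range(50)), m=3)
```

A perfectly regular series yields a value near 0, while a series in which all m! patterns are equally frequent yields 1.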

3. Fault Diagnosis Method Based on IEWT and SVM

3.1. Improved Empirical Wavelet Transform

In the scale-space plane, each minimum corresponds to a scale-space curve. The threshold Th is determined by the Otsu method, and the minima whose scale-space curves are longer than Th are retained as cutoff points. However, the threshold obtained by the Otsu method is relatively small, so too many cutoff points are selected, resulting in excessive EWT components [26].
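The thresholding idea can be sketched in Python as follows. This is a sample-based variant of the Otsu criterion (maximizing the between-class variance over all splits of the sorted values) rather than the histogram-based form, and the input list of curve lengths is a made-up example.

```python
def otsu_threshold(values):
    """Return the largest value of the lower class under the Otsu criterion."""
    v = sorted(values)
    n = len(v)
    if n < 2:
        raise ValueError("need at least two values")
    total = sum(v)
    best_k, best_var, left_sum = 1, -1.0, 0.0
    for k in range(1, n):          # lower class = v[:k], upper class = v[k:]
        left_sum += v[k - 1]
        if v[k] == v[k - 1]:
            continue               # equal values cannot be separated
        w0 = k / n
        mu0 = left_sum / k
        mu1 = (total - left_sum) / (n - k)
        var_between = w0 * (1 - w0) * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_k = var_between, k
    return v[best_k - 1]

# Hypothetical scale-space curve lengths for six local minima.
lengths = [1, 1, 2, 2, 9, 10]
th = otsu_threshold(lengths)
kept = [c for c in lengths if c > th]  # minima retained as cutoff points
```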

Mutual information (MI) was developed from the concept of entropy in information theory. It is commonly employed to measure the degree of correlation between two random variables and is regarded as more accurate than the correlation coefficient [27]:

$$MI(X, Y) = H(Y) - H(Y|X)$$

where H(Y) represents the entropy of Y and H(Y|X) represents the conditional entropy of Y given X. The stronger the correlation between X and Y, the smaller the conditional entropy H(Y|X) and the larger the value of MI(X, Y) [28].
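For discrete or binned data, MI can be estimated directly from empirical frequencies. The sketch below assumes equal-width binning for continuous signals; the helper names and the default of 16 bins are arbitrary illustrative choices.

```python
import math
from collections import Counter

def entropy(symbols):
    """Shannon entropy (natural log) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log(c / n) for c in Counter(symbols).values())

def mutual_information(x, y):
    """MI(X, Y) = H(Y) - H(Y|X), using H(Y|X) = H(X, Y) - H(X)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    h_cond = entropy(list(zip(x, y))) - entropy(x)
    return entropy(y) - h_cond

def discretize(signal, bins=16):
    """Equal-width binning so that continuous samples become symbols."""
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant input
    return [min(int((s - lo) / width), bins - 1) for s in signal]
```

Identical sequences give MI equal to their entropy, while statistically independent sequences give MI close to zero.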

To solve the problem of excessive EWT components, an improved EWT method is proposed. Redundant cutoff points are discarded so that only the valid spectrum cutoff points remain. According to their correlation with the original signal, the redundant components are further combined, and the vibration signal is finally decomposed into meaningful components. The proposed method includes five steps:
Step 1: FFT is carried out on the vibration signal to obtain the frequency spectrum.
Step 2: the Otsu method is used to determine the threshold Th of the scale-space curves, and the scale-space method is used to determine the initial boundary points and obtain the initial components.
Step 3: the MI between each initial component and the original signal is calculated, and the mean of these MI values is taken as the reference. Adjacent components are merged when their MI values lie on the same side of the mean (both greater than the mean or both less than the mean); they remain independent when one is greater than the mean and the other is less. The initial boundary points between merged components are discarded, and the remaining ones are the reserved demarcation points.
Step 4: according to the reserved boundary points, the Fourier spectrum of the signal is redivided.
Step 5: based on the redivided Fourier spectrum, a new orthogonal filter bank is constructed and the new decomposition components are obtained.
The implementation process of IEWT is shown in Figure 1.
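The merging rule of Step 3 can be expressed compactly: a boundary survives only where the two neighbouring components fall on opposite sides of the mean MI. A minimal Python sketch of that reading follows; the function name and the sample MI values are hypothetical.

```python
def retained_boundaries(mi_values, boundaries):
    """Keep boundary i (between components i and i+1) only when the two
    components lie on opposite sides of the mean MI."""
    if len(boundaries) != len(mi_values) - 1:
        raise ValueError("expected one boundary between each pair of components")
    mean_mi = sum(mi_values) / len(mi_values)
    above = [v > mean_mi for v in mi_values]
    return [b for i, b in enumerate(boundaries) if above[i] != above[i + 1]]

# Five initial components separated by four boundaries (in Hz, made-up values).
mi = [0.9, 0.8, 0.1, 0.2, 0.7]
kept = retained_boundaries(mi, [10, 20, 30, 40])
```

Here components 1 and 2 merge (both above the mean), components 3 and 4 merge (both below), and component 5 stays separate, so two boundaries remain.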

3.2. Simulation Signal Verification

The simulation signal, given in equation (18), is composed of an impact component y, a harmonic oscillation component x, and random noise e. The sampling frequency is fs = 20 kHz, the damping coefficient is ξ = 0.1, the displacement constant is y0 = 5, the failure period is 0.01 s, the natural frequency is fn = 3 kHz, the rotating frequency is fr = 25 Hz, the number of samples is N = 5000, and e(t) is random noise. The time-domain diagram of the simulation signal is shown in Figure 2.
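A signal of this kind can be generated with the following Python sketch. The parameter values are those listed above, but the functional forms are assumptions: the impact is modeled as an exponentially damped sinusoid restarted every failure period, the harmonic as a unit-amplitude sine at fr, and the noise as zero-mean Gaussian with an arbitrary standard deviation `noise_std`.

```python
import math
import random

def simulate(fs=20000.0, n=5000, xi=0.1, y0=5.0, period=0.01,
             fn=3000.0, fr=25.0, noise_std=1.0, seed=0):
    """Return (s, y, x, e): the sum and its impact, harmonic, noise parts."""
    rng = random.Random(seed)
    wn = 2.0 * math.pi * fn               # natural angular frequency
    wd = wn * math.sqrt(1.0 - xi ** 2)    # damped angular frequency
    y, x, e = [], [], []
    for k in range(n):
        t = k / fs
        t0 = t % period                   # time since the most recent impact
        y.append(y0 * math.exp(-xi * wn * t0) * math.sin(wd * t0))
        x.append(math.sin(2.0 * math.pi * fr * t))  # assumed unit amplitude
        e.append(rng.gauss(0.0, noise_std))         # assumed Gaussian noise
    s = [a + b + c for a, b, c in zip(y, x, e)]
    return s, y, x, e

s, y, x, e = simulate()
```

With N = 5000 samples at 20 kHz, the record lasts 0.25 s and contains 25 impacts.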

Following step 1, FFT is carried out on the simulation signal to obtain the spectrum.

Following step 2, the threshold is obtained by the Otsu method, as shown in Figure 3(a), where the red line is the threshold. Points corresponding to scale-space curves greater than the threshold are retained, and the spectrum is divided according to the cutoff points obtained by the scale-space method. The decomposition result is plotted in Figure 3(b). The spectrum is divided into 37 frequency bands, which is too many components to select useful ones from in subsequent analysis. Figure 4(a) shows the MI values of each component and the repartitioned spectrum.

Following step 3, only 3 boundary points are preserved, as shown in Figure 4(a) where the 3 purple lines represent the points.

Following step 4, the spectrum is divided into 4 segments according to the reserved cutoff points shown in Figure 4(b).

Following step 5, the IEWT components are obtained as shown in Figure 5. It can be clearly seen from Figure 5 that the harmonic component, impact component, and noise component of the simulation signal are separated successfully, and the number of components is reduced from 37 to 4. The problem of excessive EWT components has been solved.

The time-domain waveforms of the IEWT components in Figure 5 confirm this separation. The improved method therefore overcomes the tendency of the traditional EWT to produce repeated, redundant components when processing complex signals. Table 1 shows the correlation coefficients of the IEWT components, and Table 2 shows the kurtosis of the IEWT components.

According to the correlation coefficients between the constituent components of the simulation signal and the IEWT components in Table 1, IEWT1 corresponds to x, IEWT3 corresponds to y, and IEWT4 corresponds to e. Each matched pair has a high correlation coefficient, which shows that the IEWT method decomposes the signal well and avoids the redundant components generated by the traditional EWT.
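The two indices used in Tables 1 and 2 are straightforward to compute. The sketch below uses the Pearson correlation coefficient and the non-excess kurtosis (for which a Gaussian signal gives 3); which kurtosis convention the tables use is not stated, so that choice is an assumption.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)

def kurtosis(a):
    """Fourth central moment divided by the squared variance (non-excess)."""
    n = len(a)
    m = sum(a) / n
    m2 = sum((p - m) ** 2 for p in a) / n
    m4 = sum((p - m) ** 4 for p in a) / n
    return m4 / m2 ** 2
```

Impulsive components such as periodic fault impacts have heavy-tailed amplitude distributions and therefore a higher kurtosis than harmonic or Gaussian components.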

In order to verify the decomposition effect of the IEWT method, the EEMD method was used for comparison. Figure 6 shows the components of simulation signals obtained by EEMD decomposition.

As can be seen from Figure 6, the simulation signal is decomposed into 13 components by EEMD, among which IMF2 shows clear periodic impacts and IMF10∼IMF13 are meaningless trend components. Table 3 shows the correlation coefficients of the EEMD components, and Table 4 shows the kurtosis values of the EEMD components.

According to Table 3, the component corresponding to the impact component y is IMF1, with a correlation coefficient of 0.4565. The component corresponding to the harmonic component x is IMF9, with a correlation coefficient of 0.9799. The component corresponding to the random noise component e is also IMF1, with a correlation coefficient of 0.6636. In addition, IMF2 and IMF3 have a high correlation with the impact component y, and IMF2, IMF3, and IMF4 have a high correlation with the random noise component e. According to Table 4, the optimal component obtained by EEMD is IMF2, with a kurtosis value of 4.7932, which agrees with the visual observation. This analysis shows that EEMD is prone to producing similar components and that a single constituent component of the signal is spread over several IMFs instead of being isolated in one.

The decomposition effects of the two methods are compared comprehensively in terms of the number of IEWT and EEMD components, the kurtosis of the optimal component, and the correlation coefficients between the constituent components and their corresponding decomposed components in the above tables.

According to Table 5, the IEWT components have higher correlation coefficients than the corresponding EEMD components, the optimal IEWT component has a higher kurtosis value, and IEWT does not produce the trend components seen with EEMD. The number of IEWT components is also closer to the actual number of constituent components. The IEWT method not only avoids the excessive components produced by the traditional EWT, but its running time is also only 1/5 of that of EEMD, which shows that IEWT has a higher decomposition efficiency than EEMD.

In summary, the comparative analysis of the simulation signal shows that IEWT overcomes the defect of the traditional EWT of generating redundant components, greatly reduces the number of components, and simplifies subsequent analysis and processing. The comparison with EEMD shows that IEWT decomposes better than EEMD, both in isolating each constituent component in a single component and in separating the fault impact component, and that it is also faster.

3.3. IEWT and SVM Combined Bearing Fault Diagnosis Method

The IEWT method is proposed to improve the traditional EWT. The PEs of the IEWT components are calculated as the input feature vector of SVM. QGA is used to optimize the parameters of the SVM model, and the optimal parameters are used for model prediction. The specific implementation steps are as follows:
Step 1: the IEWT method is used to decompose the vibration signals of the different states, and the first 5 components are reserved.
Step 2: the PEs of all components are calculated, normalized, and used as the feature vector input of SVM.
Step 3: 10 groups of samples in each state are selected to train the model. The QGA optimization algorithm is adopted, with the cross-validation (CV) prediction accuracy as the fitness function. The maximum evolution generation is 200, the population size is 20, the range of C is [0.1, 100], and the range of the kernel parameter is [0.01, 1000].
Step 4: the optimal parameters are input into the trained SVM model, and 15 groups of samples in each state are selected for fault classification prediction.
Figure 7 shows the fault diagnosis flow chart.
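The cross-validation accuracy used as the fitness in Step 3 can be sketched as follows. To keep the example dependency-free, a nearest-centroid classifier stands in for the SVM; it is a swapped-in technique, not the paper's classifier, and all function names and the toy features are hypothetical.

```python
def kfold_accuracy(features, labels, fit, predict, k=5):
    """Fraction of samples classified correctly over k interleaved folds."""
    n = len(features)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample is held out
        train_x = [features[i] for i in range(n) if i not in test_idx]
        train_y = [labels[i] for i in range(n) if i not in test_idx]
        model = fit(train_x, train_y)
        correct += sum(predict(model, features[i]) == labels[i] for i in test_idx)
    return correct / n

def fit_centroids(xs, ys):
    """Stand-in classifier: store the mean feature vector of each class."""
    sums = {}
    for x, y in zip(xs, ys):
        acc, cnt = sums.get(y, ([0.0] * len(x), 0))
        sums[y] = ([a + b for a, b in zip(acc, x)], cnt + 1)
    return {y: [a / cnt for a in acc] for y, (acc, cnt) in sums.items()}

def predict_centroid(model, x):
    """Assign x to the class with the nearest centroid."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))

# Toy one-dimensional features for two well-separated states.
feats = [[0.0], [0.1], [0.2], [0.3], [5.0], [5.1], [5.2], [5.3]]
labs = [0, 0, 0, 0, 1, 1, 1, 1]
acc = kfold_accuracy(feats, labs, fit_centroids, predict_centroid, k=4)
```

In the proposed method, `fit` would train an SVM with a candidate pair of parameters, and the returned accuracy would be the fitness value that QGA maximizes.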

4. Experimental Evaluation

To validate the effectiveness of the proposed method, bearing data from the laboratory of Case Western Reserve University are adopted for the experiment [29]. The test bearing is a 6205-2RS JEM SKF deep groove ball bearing, and faults of different sizes were machined into the bearing by electrical discharge machining to simulate bearing failure. For each of the 7 states (normal, outer race fault 1, outer race fault 2, inner race fault 1, inner race fault 2, rolling element fault 1, and rolling element fault 2), 25 groups of samples were selected, of which 10 groups were used for training and 15 groups for testing. The fault sizes of fault 1 and fault 2 are 0.1778 mm and 0.3556 mm, respectively, the sample length is 2048, and the sampling frequency fs is 12 kHz. The parameters of the experimental bearings are shown in Table 6.

The time-domain diagram of a group of randomly selected state signals is plotted in Figure 8. IEWT is performed on samples of different states, the first 5 components are reserved, and the PEs of all components are calculated to construct feature vectors, as shown in Table 7 (limited by length, only partial data are displayed).

It can be observed from Table 7 that the PEs of IEWT components in the same fault state fluctuate within a small range and the PEs of different fault states are very different. This indicates that PEs can well represent the failure states of the rolling bearings.

Ten groups of training sample feature vectors of different states are input into SVM, respectively, and QGA is adopted for cross-validation with classification accuracy as the fitness function to find the optimal solution. Figure 9 shows QGA iterative optimization results.

According to the QGA iterative optimization results in Figure 9, the highest classification accuracy is obtained at the optimal values of C and the kernel parameter found by QGA. This optimal parameter solution is applied to the SVM prediction model. The classification result of the IEWT-SVM method is shown in Figure 10.

The classification results of IEWT-SVM show that the proposed method can diagnose bearing faults effectively and accurately. To validate the advantages of IEWT-SVM, the traditional EWT and VMD methods are used for comparison. The classification results of the traditional EWT method are shown in Figure 11.

With the QGA-optimized parameters, the classification accuracy of the EWT-SVM method is 85.33%. The classification results of the VMD-SVM method are shown in Figure 12.

The components obtained by VMD are also sent to SVM for classification. With the QGA-optimized parameters, the classification accuracy of the VMD-SVM method is 90.67%.

In order to prove the classification prediction effect of SVM, different classifiers were used to classify the feature vectors. K-nearest neighbor (KNN) algorithm is a supervised learning algorithm based on prior knowledge, which has a simple structure and strong classification performance, so it has been widely applied in the field of fault diagnosis [30]. Since the selection of the nearest neighbor parameter K has an important influence on the classification prediction results, the prediction accuracy of the KNN classifier under different K values is studied. The value range of K is set as [1, 20]. Figure 13 shows the prediction accuracy under different K values.

As can be seen from Figure 13, the KNN classifier has high classification accuracy when the value of K is in the range [1, 8]. When K = 1, the accuracy of classification prediction was the highest, 91.43%. Therefore, K = 1 was taken in this paper. Figure 14 shows the prediction result of the KNN classifier when K = 1.
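For reference, the KNN decision rule amounts to a distance sort followed by a majority vote. A minimal Python sketch follows, assuming Euclidean distance on the feature vectors; the function name and toy data are illustrative.

```python
from collections import Counter

def knn_predict(train_x, train_y, x, k=1):
    """Label x by majority vote among its k nearest training samples."""
    dists = sorted((sum((a - b) ** 2 for a, b in zip(tx, x)), i)
                   for i, tx in enumerate(train_x))
    votes = Counter(train_y[i] for _, i in dists[:k])
    return votes.most_common(1)[0][0]

train_x = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [6.0, 6.0]]
train_y = ["a", "a", "b", "b"]
near_a = knn_predict(train_x, train_y, [0.4, 0.4], k=1)
near_b = knn_predict(train_x, train_y, [5.4, 5.0], k=3)
```

With K = 1, the prediction is simply the label of the single closest training sample, which makes the classifier sensitive to outliers but free of any voting ambiguity.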

The extreme learning machine (ELM) is a supervised learning algorithm based on a single-hidden layer feedforward neural network (SLFN). Compared with the traditional gradient descent algorithm, ELM has a shorter training time and better generalization performance, which is widely used in the field of mechanical fault diagnosis [31]. Figure 15 shows the prediction results of the IEWT-ELM method. The fault diagnosis results of the proposed IEWT-SVM and comparative methods are shown in Table 8.

The results of the 5 methods are compared in Table 8, where the running time includes the signal decomposition time. The IEWT-SVM method proposed in this paper has a higher classification accuracy and operational efficiency than the methods based on the traditional EWT and VMD. For VMD, the larger the number of components K, the more time the decomposition takes. The proposed IEWT-SVM method also has a higher prediction accuracy than IEWT-KNN and IEWT-ELM. In summary, the combined IEWT and SVM bearing fault diagnosis method achieves high classification accuracy and high operational efficiency at the same time.

5. Conclusions

Vibration signal analysis is a common and practical way to diagnose bearing faults. EWT has great advantages in signal decomposition because it avoids mode-mixing and runs fast. However, EWT can produce an excessive number of components, which makes component selection difficult. To overcome this problem, the IEWT method is proposed in this paper. Simulation signal analysis validates that the IEWT method not only reduces the number of components effectively but also extracts fault impact components from complex signals. The IEWT method is combined with SVM, and the PEs of the IEWT components are used to construct the feature vectors. QGA is employed to optimize the penalty factor C and kernel parameter of SVM. Compared with the traditional EWT and VMD methods, the proposed IEWT and SVM bearing fault diagnosis method is shown to be more effective and accurate.

Data Availability

The experimental data supporting the analysis were supplied by Case Western Reserve University, and the website from which they can be freely downloaded is cited in [29].

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was funded by the National Natural Science Foundation of China (11790282; 11572206; U1534204; and 11802184), Natural Science Foundation of Hebei Province (A2016210099), Scientific Research Project of Talent Project Training of Hebei Province (A2016002036), and the Program for Advanced Talent in the Universities of Hebei Province (GCC2014021).