Jing Jiao, Jianhai Yue, Di Pei, Zhunqing Hu, "Application of Feature Fusion Using Coaxial Vibration Signal for Diagnosis of Rolling Element Bearings", Shock and Vibration, vol. 2020, Article ID 8831723, 14 pages, 2020. https://doi.org/10.1155/2020/8831723
Application of Feature Fusion Using Coaxial Vibration Signal for Diagnosis of Rolling Element Bearings
Abstract
Fault diagnosis of rolling element bearings (REBs) based on vibration signal analysis from a single sensor is very common. However, the information provided by an individual sensor is limited, and the robustness of such a system is poor. In this paper, a novel fault diagnosis method based on coaxial vibration signal feature fusion (CVSFF) is proposed to fully analyze the multisensor information of the system and build a more reliable diagnostic system. An ensemble empirical mode decomposition (EEMD) method is used to decompose the original vibration signal into a number of intrinsic mode functions (IMFs). Then autocorrelation analysis is introduced to reduce the random noise remaining in the IMFs. After that, the Rényi entropy is calculated as the feature of the bearings. Finally, the features of the coaxial vibration signals are fused by a multiple-kernel learning support vector machine (MKLSVM) to classify bearing conditions. In order to verify the effectiveness of the CVSFF method in REB diagnosis, eight data sets from the Case Western Reserve University Bearing Data Center are selected. The fault classification results demonstrate that the proposed approach is a valuable tool for bearing fault detection and that the fused feature from coaxial sensors improves fault classification accuracy for REBs.
1. Introduction
The state of a bearing is very important to the efficient operation of mechanical equipment. A faulty bearing causes periodic impacts in the vibration, which lead to problems in other parts of the mechanical system. Therefore, it is of great significance to detect a bearing fault in time and replace the bearing to avoid breakdown of the machine.
With the development of information technology, judging the bearing state by signal analysis has become an important trend of condition-based monitoring (CBM) [1]. In recent years, many methods have been applied to fault detection of REBs, such as vibration signal, oil, temperature, and acoustic emission analysis. Among all these methods, vibration signal analysis is the most widely used and effective owing to the rich information that the vibration signal contains [2, 3]. If a bearing is operating with a local spall, it will cause vibration impulses [4]. Through analysis of the collected vibration signals, the fault impact characteristics can be obtained, so as to realize bearing fault diagnosis. In the research on bearing fault diagnosis based on vibration signals, many papers focus on the signal collected from a single sensor. However, for a complex mechanical system, the information of an individual sensor carries considerable uncertainty, which leads to unreliable diagnosis results in many cases.
Information fusion is a technology that merges data to obtain more consistent, informative, and accurate information than the original raw data, which are mostly uncertain [5]. Some scholars have made achievements in bearing fault diagnosis based on multisensor information fusion technology. These studies are mainly divided into multisource data fusion using data from the same kind of sensors and from different kinds of sensors. Owing to the advantages of vibration signals, many scholars adopt the fusion of signals from multiple vibration sensors in bearing fault diagnosis. These signal fusion studies can be classified as data-level fusion, feature-level fusion, and decision-level fusion.
Yan et al. [6] proposed a concept of space-time fragments, in which vibration signals captured by multiple sensors are fused at the data level. For feature-level fusion, Jiang et al. [7] developed a feature-level information fusion methodology of fault diagnosis in rotating machinery. Tao et al. [8] presented a novel bearing fault diagnosis method using multivibration signals and a deep belief network (DBN). Banerjee and Das [9] suggested a hybrid model for an on-board fault-tolerant control system by vibration data fusion. Wang et al. [10] fused vibration data features as the health indicator of bearing status and got a good representation of bearing defect conditions. Zhou et al. [11] addressed a feature fusion approach based on NCA and a coupled hidden Markov model. Hong et al. [12] introduced a preprocessing model of bearing using wavelet packet-empirical mode decomposition (WPEMD) for feature extraction of vertical and horizontal vibrations. For decision-level fusion, Hui et al. [13] proposed an automated bearing fault diagnosis model that employs SVM and the Dempster–Shafer evidence theory in classification.
For the fusion of different kinds of signals, some scholars have carried out bearing fault diagnosis based on the fusion of vibration signals and other types of signals. Safizadeh and Latifi [14] presented a new method of bearing fault diagnosis using the fusion of two primary sensors: an accelerometer and a load cell. Lu et al. [15] designed a sound-aided vibration signal adaptive stochastic resonance (SAVASR) method for bearing fault detection. All those works indicate that multisensor information fusion methods have higher classification accuracy than single-sensor data analysis in bearing fault diagnosis.
Nevertheless, there are still some problems in vibration signal source of the above bearing fault diagnosis research studies. The authors of [6, 9, 10] do not specify the installation location of vibration sensors. The experiments of papers [7, 9] have too many sensors, which make the system more complex. The researchers [11, 12, 14, 15] only collect vibration signals of one or multidirections of a single bearing. In the experiment Case 2 of [11], vibration sensors are only installed on the base of the test rig. As a matter of fact, the location of vibration sensor is very important in fault diagnosis of mechanical equipment. Vibration data are composed of multiple vibration source signals and noise signals. It must pass through multiple interfaces to reach an accelerometer, which will definitely cause energy dissipation. Because a shaft and bearing inner race are rigidly connected, the shaft plays a role of vibration transmission between coaxial bearings. The fault signals of coaxial bearings are usually similar in a frequency domain. Therefore, the coaxial vibration signal feature fusion (CVSFF) algorithm proposed in this paper takes vibration sensors of coaxial bearings as data sources. Enough information about REBs can be obtained from limited multisource data. Then, the characteristics of this information are fused to realize bearing fault diagnosis.
In the research on bearing fault diagnosis based on information fusion technology, feature-level fusion is widely used. Among the existing algorithms, the support vector machine (SVM) has been widely used because of its good classification performance. The researchers of papers [7, 9, 13] used SVM as a classifier for fault classification and achieved good results. However, the classification performance of SVM is greatly affected by the kernel function, and the choice of kernel function depends on human experience [16]. To solve this problem, multiple kernel learning (MKL) methods have been proposed. MKL learns the kernel function and classifier parameters simultaneously, which can effectively solve the problem of kernel function selection. Meanwhile, the SVM trained by MKL has more flexibility and higher classification accuracy. Many scholars have applied MKL to SVM model optimization and obtained good results [16, 17].
Based on the above-mentioned analysis, a new method of fault diagnosis based on CVSFF is presented in this paper. First, an EEMD method is used to decompose the original vibration signals collected from bearings at both ends of a shaft. Then, autocorrelation is carried out to reduce the random noise in the IMF components. After that, the energy ratio of each IMF component is calculated to obtain the probability mass function (PMF). The Rényi entropy feature matrix of the coaxial sensors is obtained based on the PMF. Finally, different states of REBs are classified by MKLSVM.
This paper is organized as follows: the proposed method of CVSFF is presented in the next section. In Section 3, the algorithm is validated and analyzed with experimental data. Meanwhile, MKLSVM classification based on an individual sensor and SVM, genetic algorithm-optimized SVM (GASVM), and particle swarm optimization SVM (PSOSVM) based on coaxial features are carried out to evaluate the effectiveness of CVSFF. Section 4 presents the conclusion.
2. Methods
2.1. EEMD
There are many time-frequency data analysis methods to decompose nonlinear and nonstationary time series into a set of components, such as empirical mode decomposition (EMD) [18], ensemble empirical mode decomposition (EEMD) [19], variational mode decomposition (VMD) [20], and broadband mode decomposition (BMD) [21, 22]. Among these methods, EEMD is the most widely used algorithm in CBM of REBs. EEMD can significantly improve the decomposition effect by reducing mode mixing. The two important parameters of the EEMD algorithm are the ratio k of the standard deviation of the added white noise to the standard deviation of the signal and the total number M of EMD trials. Huang suggested that k is generally set as 0.2 [19]. Moreover, the amplitude of white noise should be reduced appropriately for a signal mainly composed of high-frequency components and increased for a signal mainly composed of low-frequency components. Besides, Huang found that M follows the statistical law
$$e = \frac{k}{\sqrt{M}}, \tag{1}$$
where e is the maximum relative error of signal decomposition. In this paper, M is taken as 100, and k is taken as 0.2.
The specific decomposition steps of EEMD are as follows:
(1) Add random white noise kσ_{x}n(t) to x(t) as shown in the following equation, where n(t) is Gaussian white noise with mean value 0 and standard deviation 1 and σ_{x} is the standard deviation of signal x(t):
$$x_{m}(t) = x(t) + k\sigma_{x}n(t). \tag{2}$$
(2) Decompose the signal x_{m}(t) by EMD.
(3) Repeat Steps 1 and 2 M times, drawing new noise each time.
(4) Average each IMF component over the M EMD decompositions to obtain the global average:
$$c_{i}(t) = \frac{1}{M}\sum_{m=1}^{M} c_{i,m}(t), \qquad r(t) = \frac{1}{M}\sum_{m=1}^{M} r_{m}(t), \tag{3}$$
where c_{i}(t) and r(t) are the i^{th} IMF and residual component, respectively, and c_{i,m}(t) and r_{m}(t) are their counterparts from the m^{th} decomposition.
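The effect of ensemble averaging can be illustrated with a minimal sketch. It does not perform EMD: it only adds scaled white noise to a toy signal (Step 1), repeats this M times, and averages (Steps 3 and 4), to show that the noise left after averaging is close to k/√M of the signal's standard deviation. The toy sine signal, the function names, and the seed are assumptions made for illustration.

```python
import math
import random
import statistics

def residual_noise_level(k, M):
    """Relative error k / sqrt(M) expected after averaging M noisy trials."""
    return k / math.sqrt(M)

def noisy_copy(x, k, rng):
    """Step 1: x_m(t) = x(t) + k * sigma_x * n(t), with n(t) ~ N(0, 1)."""
    sigma_x = statistics.pstdev(x)
    return [xi + k * sigma_x * rng.gauss(0.0, 1.0) for xi in x]

def ensemble_mean(x, k, M, seed=0):
    """Steps 3-4 without the EMD step: average M noisy copies point by point."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(M):
        for j, v in enumerate(noisy_copy(x, k, rng)):
            total[j] += v
    return [s / M for s in total]

# Toy signal (not bearing data): a 50 Hz sine sampled at 12 kHz for 0.1 s.
x = [math.sin(2 * math.pi * 50 * n / 12000) for n in range(1200)]
avg = ensemble_mean(x, k=0.2, M=100)

# Remaining noise, relative to the standard deviation of the signal.
err = statistics.pstdev([a - b for a, b in zip(avg, x)]) / statistics.pstdev(x)
print(round(residual_noise_level(0.2, 100), 4))  # 0.02
print(err)  # should be close to 0.02
```

With the values used in this paper (k = 0.2, M = 100), the relation gives a residual level of 0.02, and the simulated value lands near it.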
2.2. Autocorrelation
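An autocorrelation function describes the relationship of a signal at different times. The autocorrelation function of signal x(t) is defined as
$$R_{x}(\tau) = \lim_{T \to \infty} \frac{1}{T}\int_{0}^{T} x(t)\,x(t+\tau)\,dt, \tag{4}$$
where τ is the time delay.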
According to the properties of autocorrelation function, a periodic signal has the same cycle as an original signal after autocorrelation. Furthermore, the autocorrelation function of random noise attenuates quickly and tends to zero with the increase of time delay τ. If the periodic signal contains random noise, the autocorrelation function can be used to reduce noise. In this paper, autocorrelation function is applied to noise reduction of IMFs, so as to retain the useful periodic signals in IMFs and reduce random white noise.
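The two properties used here can be checked numerically. The following is a minimal sketch, assuming a discrete, biased sample estimate of the autocorrelation and a synthetic sine with additive Gaussian noise; the period, noise level, and seed are illustrative choices, not values from the experiments.

```python
import math
import random

def autocorrelation(x, max_lag):
    """Biased sample estimate R(lag) = (1/N) * sum_t x[t] * x[t + lag]."""
    n = len(x)
    return [sum(x[t] * x[t + lag] for t in range(n - lag)) / n
            for lag in range(max_lag + 1)]

rng = random.Random(1)
period = 40  # samples per cycle of the periodic component (illustrative)
clean = [math.sin(2 * math.pi * t / period) for t in range(1200)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1200)]
noisy = [c + w for c, w in zip(clean, noise)]

r_noise = autocorrelation(noise, 2 * period)
r_noisy = autocorrelation(noisy, 2 * period)

# White noise: large at lag 0 only, then close to zero.
print(r_noise[0], max(abs(v) for v in r_noise[1:]))
# Noisy sine: a positive peak at one period, a negative one at half a period.
print(r_noisy[period], r_noisy[period // 2])
```

The noise contributes almost nothing away from lag 0, while the periodic component keeps its cycle, which is why the autocorrelation of an IMF retains the periodic content and suppresses random noise.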
2.3. Rényi Entropy
Self-information I(x_{i}) refers to the amount of information contained in an event x_{i} of a physical system. Different events in a physical system contain different amounts of information, so I(x_{i}) is a random variable that cannot be used as a measure of information about the whole system.
Shannon [23] defines the mathematical expectation of self-information as information entropy, that is, the average amount of information in a source:
$$H = -\sum_{i=1}^{n} p_{i}\log p_{i}, \tag{5}$$
where p_{i} is the probability mass function (PMF).
Shannon entropy is a non-parametric measure, and it is well known that it does not have significant sensitivity when dealing with noisy data. For this reason, Rényi entropy [24] has been chosen as another potential entropy measure, which is defined as follows:
$$H_{\alpha} = \frac{1}{1-\alpha}\log\left(\sum_{i=1}^{n} p_{i}^{\alpha}\right), \quad \alpha \ge 0,\ \alpha \ne 1. \tag{6}$$
The parameter α in Rényi entropy can be used to make the entropy more or less sensitive to particular segments of the probability distributions. α ⟶ 0 causes the Rényi entropy to become highly sensitive to changes in the tails of the distribution. And for α ⟶ 1, it reduces to the Shannon entropy, and hence, the Rényi entropy becomes more sensitive in the regions where the bulk of PMF is located [25].
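In order to capture how the energy of the rolling bearing vibration signal is distributed over frequency, p_{i} is taken as the ratio of the energy of each IMF to the total energy:
$$p_{i} = \frac{E_{i}}{E}, \tag{7}$$
where $E_{i} = \sum_{j=1}^{N} |c_{i}(j)|^{2}$ is the energy of IMF_{i}, $E = \sum_{i=1}^{n} E_{i}$, n is the number of IMFs, N is the number of points of signal $c_{i}(j)$, and $j = 1, 2, \ldots, N$.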
The Rényi entropy is a parameter that characterizes the statistical properties of random variables and reflects the randomness of variables. The energy distribution of normal rolling bearing vibration signal in each frequency band is uniform. It means that the energy distribution is uncertain, so entropy is relatively large. When rolling bearing spall occurs, the energy is mainly distributed in the resonance frequency. The uncertainty of energy distribution is relatively reduced, so the entropy decreases. Therefore, the Rényi entropy is a sensitive feature for REBs classification.
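As an illustration of this behaviour, the sketch below computes the energy ratios and the Rényi entropy for two made-up sets of components: one with energy spread evenly and one dominated by a single component. The component values and the order α = 2 are assumptions for the example only; the order used in the experiments is not stated in this section.

```python
import math

def energy_ratios(imfs):
    """p_i = E_i / E, with E_i the sum of squared samples of component i."""
    energies = [sum(v * v for v in imf) for imf in imfs]
    total = sum(energies)
    return [e / total for e in energies]

def renyi_entropy(p, alpha):
    """H_alpha = log(sum_i p_i**alpha) / (1 - alpha), alpha >= 0."""
    if alpha == 1:
        # The limit alpha -> 1 is the Shannon entropy.
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return math.log(sum(pi ** alpha for pi in p if pi > 0)) / (1 - alpha)

# Hypothetical components (4 components, 4 samples each), not measured data.
even_energy = [[1, -1, 1, -1]] * 4
peaked_energy = [[3, -3, 3, -3]] + [[0.1, -0.1, 0.1, -0.1]] * 3

h_even = renyi_entropy(energy_ratios(even_energy), alpha=2)
h_peaked = renyi_entropy(energy_ratios(peaked_energy), alpha=2)
print(h_even, h_peaked)
```

The evenly spread case reaches the maximum value log(4) for four components, while the case dominated by one component gives an entropy close to zero, mirroring the contrast between normal and spalled bearings described above.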
2.4. MKLSVM
SVM is a machine learning method based on statistical learning theory and the structural risk minimization principle, developed by Vapnik and his group [26]. The optimal classification function is as follows:
$$f(x) = \sum_{i=1}^{\ell} \alpha_{i}^{*} y_{i} K(x, x_{i}) + b^{*}, \tag{8}$$
where K(·, ·) is the kernel function associated with a reproducing kernel Hilbert space (RKHS) H, x_{i} is the i^{th} training data with class label y_{i} ∈ {−1, +1}, ℓ is the number of training data, x is the data to be classified, and $\alpha_{i}^{*}$ and $b^{*}$ are the unknown coefficients.
There are many kinds of kernel functions, such as the linear kernel function $K(x, x_{i}) = x^{T}x_{i}$; the polynomial kernel function $K(x, x_{i}) = (x^{T}x_{i})^{d}$; the Gaussian radial basis kernel function $K(x, x_{i}) = \exp(-\|x - x_{i}\|^{2}/(2\sigma^{2}))$; and the sigmoid kernel function $K(x, x_{i}) = \tanh(\beta x^{T}x_{i} + \theta)$.
The kernel function and SVM model parameters directly affect the performance of the SVM classifier. However, for a complex classification problem it is difficult to map the samples into a suitable high-dimensional feature space by using a single kernel function. In recent years, many scholars have carried out relevant research on MKLSVM. Instead of selecting one kernel function in advance, MKLSVM learns a combination of different kernel functions and thus has more flexibility, better generalization ability, and stronger model interpretation ability.
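In the MKL framework, K(·, ·) is a convex linear combination of a set of basic kernels:
$$K(x, x') = \sum_{m=1}^{M} d_{m}K_{m}(x, x'), \quad d_{m} \ge 0,\ \sum_{m=1}^{M} d_{m} = 1, \tag{9}$$
where M here denotes the number of basic kernels and d_{m} is the weight of K_{m}(·, ·) obtained by sample learning. The decision function of MKL is as follows:
$$f(x) + b = \sum_{m=1}^{M} f_{m}(x) + b, \tag{10}$$
where each function f_{m} belongs to a different RKHS H_{m} associated with a kernel K_{m}. Rakotomamonjy et al. [27] proposed a simple MKL method to learn both the coefficients $\alpha_{i}^{*}$ and the weights d_{m}. They adopted a gradient method to solve the MKL problem. The optimal solution is obtained by calculating the gradient of the objective function J with respect to d_{m}:
$$\frac{\partial J}{\partial d_{m}} = -\frac{1}{2}\sum_{i,j} \alpha_{i}^{*}\alpha_{j}^{*}y_{i}y_{j}K_{m}(x_{i}, x_{j}), \tag{11}$$
where y_{i} ∈ {−1, +1} is the class label of the training data x_{i}.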
Finally, the decision functions of MKLSVM are obtained as follows:
$$f(x) = \sum_{i} \alpha_{i}^{*}y_{i}\sum_{m=1}^{M} d_{m}K_{m}(x, x_{i}) + b^{*}. \tag{12}$$
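A weighted kernel combination and the resulting decision value can be sketched in a few lines. This is not the SimpleMKL solver of [27]: the weights, support vectors, coefficients, and bias below are hypothetical fixed numbers rather than learned values, and the polynomial kernel is assumed to be the homogeneous form (x·z)^d.

```python
import math

def gaussian_kernel(sigma):
    def k(x, z):
        sq = sum((a - b) ** 2 for a, b in zip(x, z))
        return math.exp(-sq / (2 * sigma ** 2))
    return k

def polynomial_kernel(d):
    def k(x, z):
        return sum(a * b for a, b in zip(x, z)) ** d
    return k

def combined_kernel(kernels, weights):
    """K(x, z) = sum_m d_m * K_m(x, z) with d_m >= 0 and sum_m d_m = 1."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    def k(x, z):
        return sum(w * km(x, z) for w, km in zip(weights, kernels))
    return k

def decision_value(x, support, alphas, labels, bias, kernel):
    """f(x) = sum_i alpha_i * y_i * K(x, x_i) + b."""
    return sum(a * y * kernel(x, xi)
               for a, y, xi in zip(alphas, labels, support)) + bias

# Hypothetical model: two support vectors and two basic kernels.
kernel = combined_kernel([gaussian_kernel(1.0), polynomial_kernel(2)],
                         weights=[0.7, 0.3])
f = decision_value((0.0, 0.0), support=[(0.0, 0.0), (1.0, 1.0)],
                   alphas=[1.0, 1.0], labels=[1, -1], bias=0.0, kernel=kernel)
print(f)  # positive, so the point is assigned to the +1 class
```

Because the weights are non-negative and sum to one, the combination remains a valid kernel, and a weight of zero simply removes the corresponding basic kernel from the model.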
2.5. Structure of CVSFF
The process of the CVSFF method proposed in this paper is as follows:
Step 1: collect signals from the coaxial vibration sensors
Step 2: decompose the vibration data through EEMD and obtain the respective IMFs
Step 3: denoise the IMFs by the autocorrelation function
Step 4: extract the Rényi entropy of the IMF components
Step 5: randomly select the training set and test set from the feature matrix
Step 6: train the MKLSVMs
Step 7: input the test set into the MKLSVM models, and output the classification results
The method flowchart is shown in Figure 1.
3. Experiments Analysis and Discussion
3.1. Data Set
The bearing data diagnosed in this paper were obtained from the Case Western Reserve University (CWRU) Bearing Data Center. These data sets have been considered as a benchmark and analyzed by many researchers. As shown in Figure 2, the test rig consists of a 2 hp motor, a torque transducer, and a dynamometer.
Two coaxial accelerometers are installed at both ends of the motor to measure vibration signal of the corresponding bearing. In this paper, these two coaxial vibration signals in the drive end and fan end are used for fault diagnosis.
Smith and Randall [28] used three established bearing diagnostic techniques to provide a benchmark analysis of these widely used data sets. According to the data analysis conclusion in [28], data of eight states that are difficult to diagnose by the benchmark method were selected for analysis in this paper, as shown in Table 1.

The last column in Table 1 gives the categorisation by the benchmark method proposed in [28]. The diagnosis categories are explained as follows:
Y1: data clearly diagnosable and showing classic characteristics for the given bearing fault in both the time and frequency domains
Y2: data clearly diagnosable but showing nonclassic characteristics in either or both of the time and frequency domains
P1: data probably diagnosable; e.g., the envelope spectrum shows discrete components at the expected fault frequencies, but they are not dominant in the spectrum
P2: data potentially diagnosable; e.g., the envelope spectrum shows smeared components that appear to coincide with the expected fault frequencies
N1: data not diagnosable for the specified bearing fault but with other identifiable problems (e.g., looseness)
N2: data not diagnosable and virtually indistinguishable from noise, with the possible exception of shaft harmonics in the envelope spectrum
Randall only used the benchmark method to analyze four sets of data with data set number 169, 170, 171, and 172 for inner ring fault of the drive end with 12 kHz sampling frequency. Data set 171 was selected from the four groups with poor diagnostic results. The reason for choosing data set 276 is the same as that for choosing data set 171. Except these two sets and normal bearing data, data sets in Table 1 are not diagnosable by the benchmark method in [28].
3.2. EEMD Analysis
For each of the eight states, ten seconds of vibration signal (120,000 points) from each of the two coaxial sensors was cut into segments, giving 1600 segments in total. Each state of bearing has 200 segments from the two coaxial sensors (each sensor has 100 segments; each segment has 1200 points). Every segment is decomposed into nine IMFs and a residual component by EEMD. Drive-end raw time waveforms and part of the decomposition results for the eight states are given in Figure 3. Only the original signal and the first seven IMFs are shown in the figure.
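The segment counts can be reproduced with a short sketch. It assumes consecutive, non-overlapping segments and uses placeholder samples in place of the CWRU recordings; the 12 kHz sampling frequency follows from 120,000 points in ten seconds.

```python
fs = 12_000            # sampling frequency in Hz
duration_s = 10        # seconds of signal used per sensor and state
segment_len = 1_200    # points per segment
n_states, n_sensors = 8, 2

def split_segments(signal, segment_len):
    """Cut a 1-D sequence into consecutive, non-overlapping segments."""
    n = len(signal) // segment_len
    return [signal[i * segment_len:(i + 1) * segment_len] for i in range(n)]

signal = list(range(fs * duration_s))   # placeholder samples, not CWRU data
segments = split_segments(signal, segment_len)

print(len(signal))                            # 120000
print(len(segments))                          # 100
print(len(segments) * n_sensors * n_states)   # 1600
print(segment_len / fs)                       # 0.1
```

Each segment therefore covers 0.1 s of signal.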
In the time-domain distribution of the IMFs, there is no fault-related impulse component. In order to view the frequency distribution of each IMF, a fast Fourier transform (FFT) was performed for each component. Taking the first segment of data for each of the eight states as an example, the IMF frequency-domain distribution of these data is shown in Figure 4.
It can be seen in Figure 4 that the energy of IMFs is mainly concentrated in IMF1. Due to the characteristics of EEMD, the frequency content of IMFs ranges from high to low with the increase of order. Moreover, EEMD separates the frequency bands successfully without mode mixing. It may also be noticed that the frequency distribution of normal bearing coaxial sensor data is very similar by comparing Figures 4(a) and 4(b). In Figures 4(c) to 4(p), which are fault bearing coaxial data, components except IMF1 have similar frequency distribution too. Meanwhile, it can be seen from the red box in Figure 4 that the frequency distribution of IMF components is spread over many frequency bands. The random noise in IMFs submerged part of the fault impulse, thus affecting the subsequent feature extraction effect. Therefore, the method of autocorrelation noise reduction is needed to reduce random noise in IMF components.
3.3. Autocorrelation Noise Reduction
In order to suppress the noise, the autocorrelation function of each IMF is computed in the time domain. Figure 5 is the frequency distribution of the corresponding data in Figure 4 after autocorrelation denoise.
By comparing Figures 5 and 4, it can be seen that the frequency amplitude marked by the red box in Figure 4 significantly decreases after the autocorrelation noise reduction. And the frequency band corresponding to the large amplitude of each component does not change.
The frequency distribution of the IMFs is more concentrated, which means that random noise in the components is effectively suppressed after autocorrelation. However, Res in Figures 5(b), 5(d), and 5(f) shows the same low-frequency peak at 10 Hz, marked by the red dotted box. We take Res of Figure 5(d) as an example to analyze this phenomenon. The time-domain waveform, autocorrelation function, and corresponding frequency-domain distribution of this component are shown in Figure 6.
As can be seen from Figure 6, the signal becomes a monotone signal after autocorrelation. So, the frequency peak of 10 Hz in Figures 5(b), 5(d), and 5(f) corresponds to the 0.1 s time length of the signal (1/0.1 s = 10 Hz). It has no actual physical meaning. Since this paper only uses components IMF1 to IMF9 to calculate the entropy value, it has no influence on the feature matrix distribution.
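The origin of such a peak can be reproduced with a small sketch. A linear ramp is used here as a stand-in for the monotone residue (an assumption; the actual residue is not reproduced), and only the first ten non-zero DFT bins are evaluated directly.

```python
import cmath

fs = 12_000   # sampling frequency in Hz
n = 1_200     # points per segment, i.e. 0.1 s of signal

def dft_magnitude(x, k):
    """Magnitude of DFT bin k of the sequence x."""
    size = len(x)
    return abs(sum(v * cmath.exp(-2j * cmath.pi * k * i / size)
                   for i, v in enumerate(x)))

ramp = [i / n for i in range(n)]          # stand-in for a monotone residue
mean = sum(ramp) / n
ramp = [v - mean for v in ramp]           # remove the DC offset

mags = [dft_magnitude(ramp, k) for k in range(1, 11)]
peak_bin = 1 + mags.index(max(mags))
print(peak_bin * fs / n)   # 10.0
```

The largest non-zero bin is the first one, whose frequency fs/N equals the reciprocal of the 0.1 s segment length, so the peak reflects the window length rather than any bearing behaviour.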
3.4. Entropy Feature Extraction
According to equations (6) and (7), we calculated the Rényi entropy of the IMFs. The dimension of the multisensor entropy feature matrix of each state is 100 × 2. The 100 rows of the feature matrix correspond to the 100 segments of each state, and the 2 columns correspond to the 2 coaxial sensors. Table 2 shows the mean values of entropy for the eight states, which are calculated by averaging the 100 Rényi entropy values for every state of the bearings. The entropy values of IMFs without and with denoising are compared in Table 2.

The variation in Table 2 is the rate of change of the mean entropy value of the noise-reduced data compared with the original IMFs:
$$\text{Variation} = \frac{R_{1} - R_{2}}{R_{1}} \times 100\%, \tag{13}$$
where R_{1} is the mean value of entropy for data without denoising and R_{2} is the mean value of entropy for data with autocorrelation denoising.
As can be seen from Table 2, the entropy of all measuring points decreases after autocorrelation. Arrays with a large reduction (greater than 50%) are highlighted in bold. The maximum reduction is 81.53%. It can also be seen from Table 2 that entropy of data from sensors located in faulty bearing end generally drops greatly. This may be because the vibration signal at the fault bearing end is more susceptible to noise interference.
Figure 7 is the Rényi entropy box plot calculated from the vibration signals of the two coaxial sensors. Figure 7(a) is the Rényi entropy distribution for data from the drive-end sensor. Figure 7(b) is the Rényi entropy distribution for data from the fan-end sensor. The black part in Figure 7 is the entropy value calculated from raw data, while the red part is the entropy value calculated from IMFs after autocorrelation noise reduction. It is clear that all entropy values decreased after noise reduction. Furthermore, for both the drive-end and fan-end sensors, the entropy distribution without noise reduction is similar to that after autocorrelation denoising. Although autocorrelation denoising hardly changes the entropy tendency of bearings in different states, it increases the differentiation of the eight states. In addition, the entropy values of the normal bearing at both sensors are greater than those of all faulty bearings, which is consistent with the characteristics of the Rényi entropy.
In order to check the distribution of the outliers in the box plot, we counted the number of outliers in Figure 7. It can be clearly seen from Table 3 that the entropy distribution becomes more concentrated and the number of outliers decreases after noise reduction. The overall decrease in outliers is attributed to bearing states IF, O3F, and O12F. That is, the reduction in outliers mainly comes from the coaxial data of bearings with a fan-end fault.

The Rényi entropy values of the vibration signals at the drive end and fan end constitute an 800 × 2 multisensor feature matrix. To compare the distribution of the feature matrix without and with noise reduction more distinctly, we draw feature scatter plots. Their distributions are shown in Figure 8.
As can be observed, features from all conditions except normal bearing have relatively concentrated space distribution in Figure 8(a). In Figure 8(b), although conditions BF, O3F, and O12F still have some similar features, the aliasing of the feature distributions of different states has been significantly improved after noise reduction. The characteristic differentiation degree of bearing in eight different states is increased.
3.5. Result Comparison and Discussion
Through the signal preprocessing methods carried out in Sections 3.2 to 3.4, we obtained the coaxial vibration feature matrix. To classify the eight different states of the bearing, MKLSVM is used in this paper. Kernel functions are applied to the features of a single sensor and of all sensors.
Common kernel functions are divided into two types: local kernel function and global kernel function. The local kernel function has local characteristics, strong learning ability, and weak generalization ability. Linear and Gaussian kernel functions are typical local kernel function. On the contrary, the global kernel function has global characteristics, strong generalization ability, and weak learning ability. Polynomial and sigmoid kernel functions are typical global kernel functions.
In order to give the model a good classification effect for data with different characteristics, Gaussian kernel and polynomial kernel functions with different parameters were selected in this paper. Detailed parameters are as follows:
(1) Five Gaussian kernel functions with different bandwidths: the bandwidth is uniformly sampled on the interval [0.01, 100] on a logarithmic scale.
(2) Three polynomial kernel functions of different orders: the orders d are 1, 2, and 3.
These eight kernel functions are applied to each single-sensor feature and to both features together. Therefore, there are 8 × 3 = 24 kernels in total.
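The kernel count can be checked by enumerating the bank. This sketch assumes that "a single feature and all features" means the drive-end feature alone, the fan-end feature alone, and both together; the tuple layout and names are illustrative only.

```python
from itertools import product

# Five bandwidths, log-uniform on [0.01, 100]: 0.01, 0.1, 1, 10, 100.
bandwidths = [10.0 ** e for e in range(-2, 3)]
orders = [1, 2, 3]

base_kernels = ([("gaussian", s) for s in bandwidths]
                + [("polynomial", d) for d in orders])

# Each base kernel acts on one sensor's feature alone or on both together.
feature_sets = ["drive end", "fan end", "drive end + fan end"]

kernel_bank = list(product(feature_sets, base_kernels))
print(len(base_kernels))  # 8
print(len(kernel_bank))   # 24
```

MKL then assigns one weight to each of these 24 entries, so a kernel that acts on only one sensor can be switched off by a zero weight.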
3.5.1. Comparison and Discussion of Coaxial Signals with Individual Signal
In order to verify the sensitivity of the coaxial signal features, we also performed the same diagnosis based on a single sensor. To train the MKLSVMs, 70% of the feature matrix is selected as training data. The one-against-all method is adopted to construct the multiclass model. To improve computational efficiency, we jointly optimized the eight binary classification problems. A total of eight MKLSVMs and one combination of kernels are obtained for the coaxial signals or for an individual signal. The detailed parameters of the models are shown in Tables 4 and 5.


As can be observed from Table 4, the weight of the kernel functions acting on the data of both coaxial sensors accounts for 91.41%, which is the major part of the MKL model. In other words, the fusion of the two-dimensional coaxial features outperforms the feature of a single sensor. It can also be concluded from Table 4 that the weights obtained by MKL realize not only the selection of kernel functions but also the selection of data. The kernel function parameters of the model based on the drive-end data in Table 5 are similar to those of the fan end. The selected kernel functions, sorted by weight, are as follows: Gaussian kernel with σ = 0.01, Gaussian kernel with σ = 0.1, and polynomial kernel with d = 3. In addition, a first-order polynomial kernel is included for the drive-end data, but its weight is very small. According to the previous formula, the first-order polynomial kernel is a linear kernel. The classification results of these three models based on different data are shown in Table 6. The classification accuracy based on the CVSFF method is 97.50%. This is much higher than the single-signal classification accuracies, which are 66.67% and 60.42%, respectively. The results of the three models show that the fusion of data from two coaxial sensors improves accuracy and robustness in bearing fault detection.

The large difference in results is mainly due to the good differentiation of the two-dimensional features based on coaxial signals, while the one-dimensional entropy distributions of an individual sensor overlap, as shown in Figure 7. For the drive-end features, IF and O12F overlap in the interval [0.6, 1.2]; ID and O6D overlap in the interval [0.3, 0.5]. For the fan-end features, ID and IF overlap in the interval [1.1, 1.6]; O6D, BF, O3F, and O12F overlap in the interval [0.3, 0.7].
3.5.2. Comparison and Discussion of MKLSVM with SVM, GASVM, and PSOSVM
To demonstrate the effectiveness of MKLSVM, single-kernel SVM, genetic algorithm-optimized SVM (GASVM), and particle swarm optimization SVM (PSOSVM) are evaluated in this section. All parameters of SVM are left at their defaults. That is to say, the kernel function is the Gaussian kernel, and the bandwidth of the Gaussian kernel is 1.
In order to compare the results of CVSFF, we use the same feature matrix that is used in MKLSVM to train and test the other three SVM models. The classification results are shown in Table 7. The classification accuracies are 97.50%, 95.00%, 95.83%, and 96.67% using the MKLSVM, original SVM, GASVM, and PSOSVM, respectively. The Gaussian kernel bandwidth obtained by GA and PSO optimization is 8.64 and 5.64. Although the classification accuracy of two optimized algorithms is 0.83% and 1.67% higher than that of SVM, it is still lower than MKLSVM.
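The reported percentages can be translated back into counts of test segments. This is an inferred consistency check rather than data taken from the tables: it assumes a test set of 30% of the 800 feature rows, that is, 240 segments, or 30 per state if the split is balanced.

```python
n_states = 8
segments_per_state = 100
test_fraction = 0.3          # 70% of the feature matrix is used for training

n_test = round(n_states * segments_per_state * test_fraction)

reported = {"MKLSVM": 97.50, "SVM": 95.00, "GASVM": 95.83, "PSOSVM": 96.67}

counts = {}
for name, acc in reported.items():
    correct = round(acc / 100 * n_test)
    counts[name] = correct
    # Recompute the percentage from the integer count as a consistency check.
    print(name, correct, n_test - correct, round(100 * correct / n_test, 2))
```

Under this assumption, the four accuracies correspond to 6, 12, 10, and 8 misclassified test segments for MKLSVM, SVM, GASVM, and PSOSVM, respectively, and each recomputed percentage matches the reported value to two decimals.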

Furthermore, the running time of the models is compared. The experiments are carried out on an Intel(R) Core(TM) i5-4210U CPU at 2.4 GHz with 4 GB RAM, Windows 7, and MATLAB R2016b. With the exception of the SVM model, the other three models need to optimize parameters. Thus, the running time of these three models is greatly increased compared with that of SVM. Among the first, third, and fourth models, GASVM has the shortest running time but the lowest accuracy. The MKLSVM model adopted in this paper has a higher classification accuracy, although its running time is 86.17 s longer than that of GASVM. Compared with PSOSVM, this method is not only more efficient but also more accurate in classification.
To analyze the results of these four models in detail, we draw the confusion matrix corresponding to the classification results in Table 7, as shown in Figure 9.
The misclassifications are mainly concentrated in BF, O3F, and O12F, and the predicted states of these misclassified segments are also BF, O3F, and O12F. This confirms the characteristics of the feature distribution in Figure 8(b). The lowest single-state classification accuracy of MKLSVM is 90.00% (O3F, 27/30), which is significantly higher than that of SVM (73.33%, BF, 22/30). In addition, the classification accuracy of MKLSVM for the first five states is 100%, while SVM, GASVM, and PSOSVM have 1, 2, and 1 misclassified segments in the ID state, respectively.
However, it can also be seen from Figure 9 that there are three misclassified segments for O3F with the method proposed in this paper, while the other three models perform better on this state. The three segments misclassified by MKLSVM are the 79^{th}, 88^{th}, and 97^{th} segments of data O3F. GASVM misclassified the 71^{st} segment of O3F. PSOSVM misclassified the 88^{th} segment of O3F, which is one of the segments misclassified by MKLSVM. The feature of the 97^{th} segment is not close to that of BF, but it is still misclassified as BF. The segments misclassified by MKLSVM need further study and improvement. Nevertheless, MKLSVM outperforms SVM, GASVM, and PSOSVM in both overall accuracy and lowest single-state accuracy.
4. Conclusion
This paper proposed a feature fusion method based on MKLSVM using coaxial vibration signals to classify REB states. The accuracy obtained with coaxial signals is 97.50%, which is much higher than the results of a single sensor signal for the drive end (66.67%) and fan end (60.42%). A polynomial kernel with global characteristics and a Gaussian kernel with local characteristics are selected, which greatly improves the generalization ability of the model. By comparing CVSFF with SVM-, GASVM-, and PSOSVM-based feature fusion, accuracies of 97.50%, 95.00%, 95.83%, and 96.67% are obtained, respectively. This shows that the method proposed in this paper is more effective for REB fault classification.
Data Availability
The bearing data used to support the findings of this study are obtained from the Case Western Reserve University (CWRU) Bearing Data Center (http://csegroups.case.edu/bearingdatacenter/home).
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This research was supported by China Railway (grant number 2017J004H).
Copyright
Copyright © 2020 Jing Jiao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.