Acoustic diagnosis has been a research hotspot in recent years because acoustic signals can be acquired without contact. However, acoustic diagnosis has not yet been applied to bearing fault diagnosis of Electric Multiple Unit (EMU) traction motors, and traditional fault diagnosis methods struggle with acoustic signals that contain complex noise. This paper proposes an intelligent fault diagnosis method based on the cross wavelet transform (XWT) and the GoogLeNet model. First, a fault feature enhancement algorithm is proposed using XWT and bandpass filtering. Second, a CR400 EMU traction motor bearing fault test bed is built to collect real fault acoustic signals from two different positions; XWT is applied to the original signals to identify the fault feature frequency band, and bandpass filtering then removes the noise frequency bands outside it. Finally, the kurtosis spectra of the denoised signal and the original signal are input into GoogLeNet separately for fault classification. The results show that (1) GoogLeNet achieves 98.23% accuracy in fault classification for the denoised signals but only 89.66% for the original signals, and (2) deep learning is an effective method for the acoustic diagnosis of motor bearing faults in EMU trains.

1. Introduction

The railway transportation system plays a major role in the rapid development of the national economy. The traction motor is a core component of the train, and its health status directly affects the safety of train operation [1]. Surveys show that bearing failure is one of the most frequent train faults [2–4]. An undetected bearing failure may lead to train derailment, resulting in serious accidents and economic losses. Therefore, it is necessary to monitor train bearings and obtain their health status in time. As a traditional diagnostic technology, diagnosis based on vibration signals has been steadily improved and has become the mainstream fault diagnosis technology [5–8]. Over the past 15 years, acoustic diagnosis has played an important role in bearing condition monitoring and has remained a research hotspot in fault diagnosis [9–15]. Previous studies have shown that acoustic measurement can be successfully applied to fault diagnosis [16, 17]. Compared with vibration signals, acoustic signals are more convenient to collect: there is no need to drill the bearing seat, so the strength of the structure is not affected. In this respect, acoustic diagnosis has an advantage over vibration diagnosis.

Although acoustic diagnosis has these advantages, in practice acoustic signals are extremely complex and contain a lot of noise, so it is difficult to extract fault features from the original signals, which undermines the reliability of acoustic diagnosis [18]. With the development of computer technology, machine learning has been widely used in fault diagnosis of mechanical equipment. By taking the time-domain statistical characteristics of the original or denoised signal as the input dataset, excellent diagnostic results can be achieved with algorithms such as KNN, SVM, and DBN [19–22]. Convolutional neural networks (CNNs) perform particularly well in feature recognition: taking the characteristic data of the signal as the CNN input for deep feature extraction can yield excellent fault classification. For example, Fengli proposed deep convolution domain adversarial transfer learning (DCDATL), in which a deep convolution residual feature extraction method extracts deep fault features from the bearing signal, and the fault classification achieves an accuracy of more than 90% [23]. Alexander took the kurtosis spectrum of the acoustic emission signal as the input image of a LeNet-5 network and used a softmax classifier for fault classification; under working conditions of 250 rpm–500 rpm, he achieved a classification accuracy of 95.6% to 100% [24]. Zhan proposed a normalized convolutional neural network model, applied it to fault classification on the Case Western Reserve University bearing dataset, and achieved a classification accuracy of 98.5% [25]. Liu used a probabilistic neural network (PNN) to diagnose the denoised vibration signal and obtained 100% classification accuracy [26]. Zhang et al. applied a support vector machine (SVM) to intelligent fault diagnosis of bearings and achieved 89.58% accuracy on the overall sample [27]. Appana extracted the deep features of the signal through a self-built CNN architecture and used softmax to classify the fault type, achieving an accuracy of 86.5% [28]. Tao used a DBN for bearing fault identification and compared its diagnostic effect with that of SVM, BPNN, and KNN [29]. Liu et al. proposed the Categorical Adversarial Autoencoder (CatAAE) for unsupervised bearing fault diagnosis; as the SNR falls from 20 dB to −4 dB, the diagnostic accuracy falls from 96.76% to 85.76% [30]. Kumar et al. used an ANN to diagnose signals after wavelet denoising and achieved 96.67% accuracy [31]. In addition, using a machine learning algorithm for fault classification and comparing its accuracy on the denoised signal and the original signal can also verify the effect of denoising.

The cross wavelet transform (XWT) has been widely used in regional climate analysis. It can be used to analyze the coherence of two signals in the time-frequency domain and obtain their common frequency band [32]. Research shows that XWT can enhance the fault features of bearing signals, but it is rarely used in fault diagnosis. For example, Jimeng proposed a bearing fault feature enhancement method based on XWT, and his experimental comparison showed that the bearing fault features were enhanced in both the time and frequency domains of the vibration signal [33]. Lihua applied XWT to transformer fault diagnosis: XWT was applied to vibration signals collected from different directions, and the principal components of the signal were then extracted according to the cross wavelet coherence spectrum. The results show that the interference components in the signal were greatly reduced and the fault features were enhanced [34]. At present, XWT has not been applied to fault feature extraction from acoustic signals. To address the difficulty of extracting fault features from the complex sound produced by a working train traction motor, this paper proposes a fault feature enhancement method that combines XWT with bandpass filtering. The innovations of this paper are as follows: (1) to study the acoustic diagnosis of bearing faults based on real acoustic signals of a traction motor, we established a CR400 experimental platform for collecting acoustic signals of train traction motor bearing faults; (2) the fault feature frequency band (coherent frequency band) is identified by wavelet coherence analysis of the acoustic signals; (3) the kurtosis spectrum of the signal is used as the input image dataset of a CNN to realize the acoustic diagnosis of the traction motor bearing of the CR400 EMU.

The layout of this paper is as follows: Section 2 introduces the proposed fault feature enhancement method, Section 3 describes the experimental platform and the source of the sound data, and Section 4 details the fault feature extraction process. In Section 4.3, bearing fault classification based on GoogLeNet is carried out to verify the effectiveness of the method. Finally, the research is summarized in Section 5.

2. Fault Feature Enhancement Algorithm Based on XWT

In practice, the waveform of the sound signal emitted by a working traction motor is very complex. The sound generated by the motor contains many noise sources, so it is difficult to identify a bearing fault. When the noise is removed from the original signal, the periodic impact component caused by the bearing fault becomes more obvious. To remove the noise component, this paper proposes a method that combines XWT and bandpass filtering to isolate the fault feature frequency band of the original signal. The feature signal is reconstructed with this method to achieve noise reduction. In the following, the sound signal emitted by the traction motor is denoted xsound(t) (abbreviated as xs(t)) for convenience.

2.1. Wavelet Coherence Analysis Based on XWT

Based on wavelet analysis theory, XWT can be used to analyze the coherence of two time-domain signals across their whole frequency range. When several sound signals are collected from different directions for the same target, each of them contains both noise and the fault signal. In this paper, two microphones collect the sound of the working traction motor from two different directions; the collected signals are called xs1(t) and xs2(t), respectively. Because the sound propagation paths differ, the signals collected by microphones at different positions differ as well. Based on the principle of the wavelet transform, XWT is applied to xs1(t) and xs2(t) as follows (written here in the standard form):

$$W_{x_{s1}-x_{s2}}(a,\tau) = W_{x_{s1}}(a,\tau)\, W_{x_{s2}}^{*}(a,\tau),$$

$$W_{x}(a,\tau) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} x(t)\, \psi^{*}\!\left(\frac{t-\tau}{a}\right) dt,$$

$$\psi(t) = \pi^{-1/4}\, e^{i\omega_0 t}\, e^{-t^{2}/2},$$

where a is the scaling factor, τ is the translation factor, * represents the complex conjugate, ψ(t) is the Morlet wavelet function, and ω0 is the dimensionless center frequency of the Morlet wavelet. The absolute value of Wxs1−xs2(a,τ) is the cross wavelet power spectral density. The higher the value, the greater the coherence between xs1(t) and xs2(t). Because noise is random and unstable, the cross wavelet power spectral density between noise signals is very small, that is, their coherence is very small. This difference in coherence makes it easy to distinguish noise from the fault signal across the complete frequency band of the traction motor signal [32, 35]. To observe the coherence over the whole frequency band and locate the bands of noise and fault signal, the wavelet coherence spectrum of xs1(t) and xs2(t) is plotted. In the wavelet coherence spectrum, the coherence is expressed by the brightness of the color.
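As a rough illustration of the cross wavelet power described above, the following minimal sketch evaluates a Morlet continuous wavelet transform by direct summation and multiplies one transform by the conjugate of the other. It is not the authors' implementation: the value ω0 = 6, the helper names (`morlet`, `cwt`, `cross_wavelet_power`, `scale_for`), and the synthetic signals standing in for xs1(t) and xs2(t) are all assumptions made for the example.

```python
import cmath
import math
import random

OMEGA0 = 6.0  # assumed dimensionless centre frequency of the Morlet wavelet


def morlet(u):
    """Morlet wavelet: pi^(-1/4) * exp(i*OMEGA0*u) * exp(-u^2 / 2)."""
    return math.pi ** -0.25 * cmath.exp(1j * OMEGA0 * u) * math.exp(-0.5 * u * u)


def cwt(x, dt, scale):
    """Wavelet coefficients of x at one scale, for every translation k*dt."""
    n = len(x)
    half = int(5.0 * scale / dt) + 1  # the wavelet is negligible beyond 5 scales
    coeffs = []
    for k in range(n):
        acc = 0j
        for j in range(max(0, k - half), min(n, k + half + 1)):
            acc += x[j] * morlet((j - k) * dt / scale).conjugate()
        coeffs.append(acc * dt / math.sqrt(scale))
    return coeffs


def cross_wavelet_power(x1, x2, dt, scale):
    """|W_x1 * conj(W_x2)| at one scale, for every translation."""
    w1, w2 = cwt(x1, dt, scale), cwt(x2, dt, scale)
    return [abs(a * b.conjugate()) for a, b in zip(w1, w2)]


def scale_for(freq_hz):
    """Scale whose Morlet centre frequency is approximately freq_hz."""
    return OMEGA0 / (2.0 * math.pi * freq_hz)


# Synthetic stand-ins for xs1(t) and xs2(t): a shared 16 Hz tone seen with a
# different phase, plus independent noise at each "microphone".
fs = 256.0
dt = 1.0 / fs
rng = random.Random(0)
t = [k * dt for k in range(256)]
xs1 = [math.sin(2 * math.pi * 16 * tk) + rng.gauss(0, 0.5) for tk in t]
xs2 = [math.sin(2 * math.pi * 16 * tk + 0.7) + rng.gauss(0, 0.5) for tk in t]

# Mean cross wavelet power at the shared frequency and at a noise-only frequency.
p_shared = sum(cross_wavelet_power(xs1, xs2, dt, scale_for(16.0))) / len(t)
p_noise = sum(cross_wavelet_power(xs1, xs2, dt, scale_for(64.0))) / len(t)
print(p_shared > p_noise)
```

In this toy setting the component common to both channels dominates the cross wavelet power at its own scale, while the independent noise contributes little at the other scale, which is the property the method relies on to separate coherent bands from noise bands.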

2.2. Bandpass Filter

A bandpass filter passes signals within a specific frequency range and blocks signals outside it, achieving selective transmission. Once the noise frequency band or the fault feature frequency band has been identified, the noise can be removed from the original signal with a bandpass filter. Bandpass filters are widely used in signal processing.

2.3. Steps of Fault Feature Enhancement

According to the wavelet coherence spectrum, the frequency band of the noise can be identified and the design parameters of the bandpass filter can be determined. The steps of the fault feature enhancement algorithm based on the wavelet coherence spectrum and filtering are as follows:
(1) Deploy two microphones at two different positions around the motor to collect two sets of sound signals (xs1(t) and xs2(t)). To ensure that the propagation paths of xs1(t) and xs2(t) differ, the two microphones are placed at positions around the motor that are 90 degrees apart.
(2) Apply the cross wavelet transform to xs1(t) and xs2(t) and draw the wavelet coherence spectrum between them. According to the light and dark distribution of each frequency band in the wavelet coherence spectrum, determine the frequency band of the noise. Based on this noise band, design a bandpass filter.
(3) Use the bandpass filter to reduce the noise of xs2(t) based on the wavelet coherence spectrum. The denoised signal is denoted xfeature(t).
The flow of the whole process is shown in Figure 1.

3. CR400 EMU Motor Bearing Acoustic Data Experiment

An acoustic bearing fault test bed of the traction motor of CR400 EMU is established. The model of the traction motor is YQ-625, its rated output power is 625 kW, and the maximum output speed is 5600 rpm. The test bearing is a cylindrical roller bearing whose type is NU214. In this experiment, the bearing is installed at the drive end of the traction motor. Bearing specifications are shown in Table 1.

Laser etching is used to produce single-point damage. The width of the etched damage is 0.3 mm for a mild fault and 2 mm for a severe fault. Mild and severe faults are produced on the cage, ball, outer race, and inner race, giving eight fault types: cage mild fault, cage severe fault, ball mild fault, ball severe fault, outer race mild fault, outer race severe fault, inner race mild fault, and inner race severe fault. These fault types are referred to simply as CMF, CSF, BMF, BSF, ORMF, ORSF, IRMF, and IRSF, respectively. The faulty bearings are shown in Figure 2.

Five microphones are placed around the traction motor to collect the sound signal when the traction motor is working. The layout of the test bed is shown in Figure 3. The bearing speed is set to 2414 rpm to simulate the working condition of the train at 160 km/h. The sampling rate is set to 54.94 kHz. The sampling duration of bearing acoustic signal is shown in Table 2.

4. Acoustic Diagnosis of CR400 EMU Motor Faulty Bearing

4.1. Signal Denoising Based on XWT

The proposed method first requires two microphones deployed around the motor. As shown in Figure 3, microphone 1 and microphone 2 are selected as research objects because their positions relative to the motor differ by 90 degrees, so their signal propagation directions are completely different. The acoustic signals collected by the two microphones are abbreviated as xs1(t) and xs2(t). Their sound pressure waveforms are shown in Figure 4.

The CMF fault is taken as an example to illustrate the method. First, the cross wavelet transform is applied to xs1(t) and xs2(t), and the wavelet coherence spectrum is obtained, as shown in Figure 5.

It can be observed from Figure 5 that the noise frequency bands lie between 0.125 kHz and 4 kHz and below 0.03125 kHz, and the other frequency bands are coherent frequency bands (common frequency bands). The bandpass filter is then used to reduce the noise of xs2(t), so that only the coherent frequency bands are retained in the signal, which yields the fault characteristic signal. The sound pressure waveform of xfeature(t) is shown in Figure 6.
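The band selection described here can be sketched as follows. The paper does not specify the filter design, so this example swaps in an ideal FFT mask that zeroes every bin outside the coherent bands; it is an illustration under that assumption, not the authors' filter. The function names (`fft`, `keep_bands`) and the two test tones are hypothetical, while the band edges and the 54.94 kHz sampling rate are taken from the text.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2], inverse), fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + tw, even[k] - tw
    return out


def keep_bands(x, fs, bands):
    """Zero every FFT bin whose frequency lies outside all (low, high) bands."""
    n = len(x)
    spectrum = fft([complex(v) for v in x])
    for k in range(n):
        f = min(k, n - k) * fs / n  # absolute frequency of bin k in Hz
        if not any(lo <= f <= hi for lo, hi in bands):
            spectrum[k] = 0j
    # The mask is symmetric, so the inverse transform is real up to rounding.
    return [v.real / n for v in fft(spectrum, inverse=True)]


FS = 54940.0  # sampling rate from Section 3, in Hz
# Coherent bands as described for Figure 5: 31.25 Hz to 125 Hz, and above 4 kHz.
COHERENT_BANDS = [(31.25, 125.0), (4000.0, FS / 2.0)]

n = 4096
low_bin, high_bin = 75, 600  # about 1.0 kHz (noise band) and 8.0 kHz (kept)
noise_tone = [math.sin(2 * math.pi * low_bin * i / n) for i in range(n)]
fault_tone = [math.sin(2 * math.pi * high_bin * i / n) for i in range(n)]
xs2 = [a + b for a, b in zip(noise_tone, fault_tone)]
xfeature = keep_bands(xs2, FS, COHERENT_BANDS)
```

An ideal mask is convenient here because the retained region consists of two separate bands; a practical design would more likely use smooth filter responses to limit ringing in the time domain.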

Compared with the original signal, xfeature(t) is less complex. A large part of the noise in the signal is filtered out, and the burrs in the waveform are largely eliminated. The other fault signals are processed in the same way, and the details are not repeated here.

4.2. Spectral Kurtosis Analysis of Feature Signal

Spectral kurtosis (SK) was precisely defined by Antoni in 2006 [36]. The kurtogram, a further development of SK analysis for optimal selection of the bandwidth, is widely accepted and used in fault diagnosis, particularly for bearings. Moreover, kurtosis is likely to take a high value for non-Gaussian noise and fault features [37]. When a bearing fails, a gap appears between its components, and they collide violently during high-speed rotation. The fault impact introduces a component in a specific frequency band into the original signal. In this paper, the spectral kurtosis of the raw signal and the feature signal is calculated based on the short-time Fourier transform. SK is very sensitive to transient impacts in the signal. When the noise is removed, the SK value increases, indicating that the signal better reflects the bearing fault.
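For readers who want to see how an STFT-based spectral kurtosis behaves, the sketch below uses the common estimator SK(f) = ⟨|X(t,f)|⁴⟩ / ⟨|X(t,f)|²⟩² − 2, where the averages run over the STFT frames. The window length, hop size, Hann window, and synthetic test signals are assumptions chosen for illustration; they are not the settings used in the paper.

```python
import cmath
import math
import random


def stft_frames(x, win_len, hop):
    """Hann-windowed DFT of each frame; keeps bins 0 .. win_len/2."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / win_len) for i in range(win_len)]
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        seg = [x[start + i] * window[i] for i in range(win_len)]
        frames.append([
            sum(seg[i] * cmath.exp(-2j * math.pi * k * i / win_len)
                for i in range(win_len))
            for k in range(win_len // 2 + 1)
        ])
    return frames


def spectral_kurtosis(x, win_len=32, hop=16):
    """SK per frequency bin: <|X|^4> / <|X|^2>^2 - 2, averaged over frames."""
    frames = stft_frames(x, win_len, hop)
    sk = []
    for k in range(win_len // 2 + 1):
        power = [abs(frame[k]) ** 2 for frame in frames]
        m2 = sum(power) / len(power)
        m4 = sum(p * p for p in power) / len(power)
        sk.append(m4 / (m2 * m2) - 2.0)
    return sk


def mean_inner(sk):
    """Average SK over the bins between DC and Nyquist."""
    inner = sk[1:-1]
    return sum(inner) / len(inner)


# Stationary Gaussian noise versus the same noise with periodic impacts added.
rng = random.Random(1)
noise = [rng.gauss(0, 1) for _ in range(2048)]
impulsive = [v + (20.0 if i % 256 == 100 else 0.0) for i, v in enumerate(noise)]

sk_noise = mean_inner(spectral_kurtosis(noise))
sk_impulsive = mean_inner(spectral_kurtosis(impulsive))
print(sk_impulsive > sk_noise)
```

For stationary Gaussian noise the estimator stays close to zero, whereas repeated transients raise it, which matches the statement that SK is sensitive to transient impacts.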

The kurtogram takes frequency as the horizontal axis and uses a color scale to represent the spectral kurtosis value at each frequency. Figure 7 (left) shows the spectral kurtosis of the IRMF original signal, and Figure 7 (right) shows the spectral kurtosis of its feature signal (xfeature(t)). Figure 7 (right) shows that the spectral kurtosis is largest in the band with a center frequency of 2.5753 kHz and a bandwidth of 1.7169 kHz, that is, the transient impact is most obvious in this range. In Figure 7, the maximum kurtosis (Kmax) of the feature signal is 23.8332, which is significantly larger than the 2.6636 of the original signal. Therefore, xfeature(t) reflects the bearing fault better than the original signal. In the kurtogram, the frequency band corresponding to the brightest color region best characterizes the impact between bearing parts [38].

Figures 8–14 show the kurtograms of the other seven fault feature signals. For all fault types, the Kmax of the fault feature signal is significantly greater than that of the corresponding original signal. According to the definition of spectral kurtosis, the greater the value, the more obvious the impact component in the signal. After processing by the proposed method, the noise in the original signal is removed to a great extent.

The spectral kurtosis differs greatly between fault types. Differences in center frequency, bandwidth, and maximum spectral kurtosis produce different light and dark distributions in the corresponding kurtograms. The kurtograms of xfeature(t) are therefore used as the feature images of the bearing fault signals to represent the fault condition.

4.3. Fault Diagnosis of EMU Motor Bearing Fault Based on CNN
4.3.1. Preparation of Bearing Faults’ Kurtograms Dataset

Figure 15 shows the preparation process of the kurtogram dataset. The sampling durations of the signals are recorded in Table 2. To increase the number of samples, equal-interval overlapping segmentation is used to generate the sample signals. The length of a single sample is set to 1 second and the interval between sample starts is set to 0.5 second. After segmentation, a total of 4512 sample signals are generated. The kurtograms of all sample signals are then randomly divided into a training dataset and a testing dataset in a ratio of about 7 : 3.
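The segmentation and split described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the 10 s recording is a placeholder rather than a duration from Table 2, and the split is cut at exactly 70%, whereas the paper only states a ratio of about 7 : 3.

```python
import random


def overlapping_segments(signal, fs, length_s=1.0, step_s=0.5):
    """Split a signal into fixed-length windows whose starts are step_s apart."""
    length = int(round(length_s * fs))
    step = int(round(step_s * fs))
    return [signal[start:start + length]
            for start in range(0, len(signal) - length + 1, step)]


def train_test_split(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and cut it at train_fraction."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


fs = 54940  # sampling rate from Section 3, in Hz
recording = [0.0] * (10 * fs)  # placeholder for a hypothetical 10 s recording
segments = overlapping_segments(recording, fs)

# Indices of the 4512 samples reported in the text, split at 70%.
train, test = train_test_split(range(4512))
```

With a 1 s window and a 0.5 s step, a recording of T seconds yields 2T − 1 windows when T is a multiple of 0.5 s, so the overlap roughly doubles the number of samples compared with non-overlapping segmentation.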

4.3.2. Fault Diagnosis Based on GoogLeNet and Denoised Signal

A convolutional neural network (CNN) is used for fault classification, and the effectiveness of the proposed method is further verified by comparing the classification performance of the CNN on the original signal and on the fault feature signal. A fault classification model is established based on the original GoogLeNet. GoogLeNet uses the Inception structure to improve the sparsity of the network and thereby the computing speed, and it outperforms AlexNet and LeNet-5. For more details about GoogLeNet, see reference [39]. The GoogLeNet structure is shown in Table 3.

Since there are 8 types of bearing faults in this paper, the number of output classes of GoogLeNet was set to 8. The loss function is the cross-entropy loss, and the Adam optimizer is used to optimize all the weight parameters of GoogLeNet. To fit the dataset quickly, the initial learning rate was set to 0.001. The model is trained on the kurtogram dataset for 45 epochs in total, and the learning rate is changed every 15 epochs to ensure the stability of the model: it was set to 0.0001 at the 16th epoch and to 0.00001 at the 31st epoch. After every training epoch, the performance of the model on the test dataset is output.
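The learning rate schedule stated above amounts to a step decay by a factor of 10 every 15 passes over the training set. A minimal sketch, counting each pass as one epoch and using a hypothetical helper name `learning_rate`, is shown below. The paper does not name its training framework; in PyTorch, for instance, a comparable schedule could be expressed with `torch.optim.lr_scheduler.MultiStepLR` using milestones [15, 30] and gamma 0.1, but that mapping is an assumption.

```python
def learning_rate(epoch, initial=1e-3, factor=0.1, step=15):
    """Step decay: multiply the rate by `factor` after every `step` epochs.

    `epoch` is counted from 1, so epochs 1-15 use `initial`.
    """
    return initial * factor ** ((epoch - 1) // step)


# Learning rate for each of the 45 epochs described in the text.
schedule = [learning_rate(epoch) for epoch in range(1, 46)]
```

The list holds 0.001 for epochs 1 to 15, 0.0001 for epochs 16 to 30, and 0.00001 for epochs 31 to 45, up to floating-point rounding.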

Figure 16 shows the fitting process of GoogLeNet on the kurtogram dataset. The blue curve shows how well the model fits the training dataset, and the orange curve shows the performance of the model on the test dataset after each training epoch. At the end of the 15th epoch, the accuracy of the model on the training dataset reached 98.23% and the classification accuracy on the test dataset reached 96.30%. As the learning rate decreases to 0.0001, the model fits better. At the end of the 45th epoch, the accuracy reached 98.23%. The result in Figure 16 shows that the processed fault feature signal effectively reflects the fault state of the bearing.

The results show that the method proposed in this paper can enhance the fault feature, and GoogLeNet can accurately identify the fault types of faulty bearings according to the information of kurtograms of fault feature signals.

4.3.3. Comparison Experiment with and without Denoising

The original signal is divided based on the steps shown in Figure 15, but XWT-Bandpass noise reduction is not applied before the division process. Then, the kurtosis spectrum dataset of all original signal samples is produced. Finally, the kurtosis spectrum of the original signal is input into the GoogLeNet described in Table 3 for fault classification, and the fault classification accuracy of the original signal is obtained.

Figure 17 shows the fitting process of GoogLeNet on the kurtogram datasets of the original signal and the fault feature signal. The blue and orange curves in Figure 17 indicate the performance of the model on the fault feature signal. The green and red curves show how well the model fits the training dataset and the test dataset of the original signal, respectively. At the end of training, the classification accuracy on the original signal test set is only 89.66%. Compared with the fault feature signal, GoogLeNet fits the original signal worse, which indicates that the fault features are more obvious after noise reduction. Figure 18 shows the confusion matrices for classification of the fault feature signal and the original signal. Figure 18 shows that the classification accuracy of all fault types improved after fault feature enhancement.

The fault classification recall and precision of all fault types before and after noise reduction are shown in Table 4.
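Per-class precision and recall of the kind reported in Table 4 can be derived from a confusion matrix as sketched below. The 3-class counts are hypothetical and serve only to illustrate the arithmetic; they are not values from Table 4 or Figure 18, and the row/column convention (rows are true classes, columns are predicted classes) is an assumption.

```python
def precision_recall(confusion):
    """Per-class (precision, recall) from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    results = []
    for c in range(n):
        true_positive = confusion[c][c]
        predicted = sum(confusion[r][c] for r in range(n))  # column sum
        actual = sum(confusion[c])                          # row sum
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / actual if actual else 0.0
        results.append((precision, recall))
    return results


# Hypothetical 3-class counts for illustration only, not values from Table 4.
example = [
    [48, 2, 0],
    [1, 47, 2],
    [0, 3, 47],
]
metrics = precision_recall(example)
accuracy = sum(example[i][i] for i in range(3)) / sum(map(sum, example))
```

Precision answers how many of the samples assigned to a fault type truly belong to it, while recall answers how many samples of that fault type were found, so the two can differ even when the overall accuracy is high.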

4.3.4. Comparison with Other Diagnostic Methods

At present, many classification methods have been used in bearing fault detection, such as SVM, CNN, DBN, KNN, and BPNN, and their application in fault diagnosis has achieved good results. Table 5 lists the results of these methods and of our method, together with the per-fault classification accuracy for some of the methods, for comparison. Table 5 shows that the accuracy of the proposed method is higher than that of the other methods, and that its accuracy for each fault type stays between 95% and 100%. The proposed method therefore performs better and more stably, which again demonstrates its advantage.

5. Conclusion

In this paper, microphones are used to collect the sound signals of the working motor from different positions. Using the cross wavelet transform and bandpass filtering, the coherent frequency band is identified and retained. Comparing the spectral kurtosis values of the original signal and the feature signal shows that the feature signal better reflects the impact caused by the bearing fault. The kurtograms of the feature signal are input to GoogLeNet for fault classification, and an accuracy of 98.23% is achieved, whereas the fault classification accuracy of GoogLeNet on the original signal is only 89.66%. This further demonstrates the effectiveness of the proposed method. Therefore, the method proposed in this paper can effectively remove noise and enhance the fault features, and excellent fault diagnosis can be achieved with a convolutional neural network.

Data Availability

This study does not involve any public data sets.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This work was supported by the National Key R&D Program of China (Grant no. 2020YFB1200300ZL) and the Sichuan Science and Technology Program (Grant no. 2022YFG0088).