#### Abstract

A wheelset bearing is a crucial energy transmission element in high-speed trains, and a fault in any of its parts may endanger the safety of the railway service. Therefore, it is important to monitor the running condition of a wheelset bearing. Multiple simultaneous faults are common in wheelset bearings, and the impulsive components generated by different types of faults may interact with each other, which makes it difficult to identify all of the faults. To solve the multifault problem, this paper proposes a hierarchical shift-invariant K-means singular value decomposition (H-SI-K-SVD) that hierarchically separates the multifault impulsive components based on their power levels. Each separated impulse signal contains the impulses of only one fault, so the fault information is highlighted in both the time domain and the frequency domain. In addition, the sparsity of envelope spectrum (SES) is introduced as an indicator to adaptively tune a key parameter of the method. The effectiveness of the proposed method is verified with both simulation and experimental signals. Compared with ensemble empirical mode decomposition (EEMD), the proposed method performs better in separating the multifault impulsive components and detecting the faults of a wheelset bearing.

#### 1. Introduction

A wheelset, as an important part of a high-speed train, mainly consists of three types of components: an axle, two wheels, and several wheelset bearings. When a high-speed train is running, the wheelsets commonly operate under harsh conditions, such as natural wear, random impacts, rapid humidity and thermal variations, and rolling contact fatigue [1]. The wheelset bearing is a key part of a wheelset and maintains stable energy transmission between the driven system and the wheelset. Rapid, long-term alternating transmission may accelerate the wear and failure of train wheelset bearings and ultimately endanger the safety of the railway service [2]. Therefore, it is very important to monitor the running condition of train wheelset bearings.

Vibration-based fault identification methods are powerful diagnostic techniques that have been widely applied in industrial applications because of their effectiveness, lower cost, and convenient installation [3, 4]. For a wheelset bearing, when there is a defect on the surface of one of its components, a series of vibration impulses is generated as the wheelset axle rotates [5]. Therefore, monitoring wheelset bearing failures based on the vibration signal is an effective and reasonable approach.

Vibration-based signal processing methods mainly include time-domain analysis and transform-based methods. In time-domain analyses, statistical indicators, such as the root mean square and kurtosis factor, have been used to analyze the vibration signals and monitor the running condition of a bearing [6]. However, time-domain analysis cannot capture time-varying running information. Traditional transform-based methods, such as empirical mode decomposition (EMD) [7–9], short-time Fourier transformation (STFT) [10], Wigner–Ville distribution (WVD) [11], and wavelet transformation (WT) [12] and its extensions [13, 14], transform the measured vibration signal into another space in which the fault information is notably enhanced [15–17]. Nevertheless, these methods depend on a good match between the transform basis and the target fault feature. In other words, transform-based methods lack flexibility and adaptability.

The dictionary learning algorithm, as a branch of sparse representation theory, aims at learning a dictionary from the analyzed signal itself and representing the essential components embedded in it. The learned dictionary can reveal the high-level structure and intrinsic behaviour of the impulsive components embedded in the measured signals [18]. Compared with traditional time-domain statistical indicators and transform-based methods, dictionary learning is a data-driven method and can adaptively match the fault feature information in the measured vibration signals [19, 20]. Dictionary learning methods mainly include regular dictionary learning and shift-invariant dictionary learning [21]. A wheelset bearing is a rotating machine, and its vibration signal has shift-invariant features when there are defects on the surfaces of its components. Therefore, shift-invariant dictionaries are very helpful both in extracting the latent similar components generated by the faults from the measured vibration signal and in monitoring the running condition of the wheelset bearing [22, 23]. Shift-invariant K-means singular value decomposition (SI-K-SVD) is a shift-invariant dictionary learning method that has the potential to solve the shift-invariant issue and obtain an optimized sparse representation of the measured signal [24, 25].

Fault detection for bearings using SI-K-SVD has been studied, and the results have verified its validity [26, 27]. However, detecting faults of a wheelset bearing with this method is more difficult because of severe background noise and other types of interference, such as wheelset axle vibration and other structural vibration [28]. Furthermore, multifault detection is a greater challenge for SI-K-SVD than single-fault detection. A good signal-to-noise ratio (SNR) of the target impulse signal relative to the measured vibration signal is the premise and key of the method [20]. However, in multifault detection, when one fault impulse signal is being separated, the other fault impulses act as noise, which greatly reduces the SNR of this impulse signal [29, 30]. In addition, the impulsive components generated by different faults have different power levels, and fault impulses with higher power can be so prominent that lower power impulses are easily submerged [31]. That is, it is difficult to completely identify multiple faults using this method, especially for the lower power impulse signals. In conclusion, these problems make it difficult to simultaneously separate and detect multifault impulsive components using SI-K-SVD. To address these difficulties, this paper proposes hierarchical SI-K-SVD (H-SI-K-SVD), built on SI-K-SVD, to hierarchically separate the wheelset bearing multifault impulsive components based on the power levels of the impulse signals. Each of the separated impulse signals contains only one fault impulse and can be employed to detect the multifault of a wheelset bearing. In addition, the sparsity of envelope spectrum (SES) is employed to adaptively tune a key parameter, namely, the number of iterations.

The remainder of this paper is organized as follows: the basic theory of SI-K-SVD is introduced in Section 2. Then, in Section 3, the separation of multifault impulse signals using H-SI-K-SVD is elaborated in detail. Next, Section 4 shows the results of the simulation signals employing the proposed method. In Section 5, a bench experiment is undertaken to verify the effectiveness of the proposed method. Finally, conclusions are presented in Section 6.

#### 2. Basic Theory of SI-K-SVD

SI-K-SVD is a dictionary learning method that addresses the shift-invariant issue. The dictionary $D$ in SI-K-SVD consists of atoms built by shifting a family of normalized patterns $\{m_k\}$. Based on shift-invariant theory, components with similar characteristics can be represented by a single basic function combined with different shift operators that locate it in the given signal. Dictionary learning is therefore transformed into learning a set of patterns, defined as follows [25]:

$$\min_{m_k,\,\alpha}\ \Bigl\| y - \sum_{k}\sum_{\tau} \alpha_{k,\tau}\, T_{\tau} m_k \Bigr\|_2^2 \quad \text{s.t.}\quad \|\alpha\|_0 \le K,\quad m_k \in \mathbb{R}^{l},\quad \|m_k\|_2 = 1, \qquad (1)$$

where $y$ is the measured signal, $K$ is the number of nonzero entries of the sparse coefficients $\alpha$, $\|\cdot\|_0$ is the zero norm, $m_k$ represents the learned pattern, $l$ is the pattern length, $T_\tau$ is the shift operator that takes the pattern $m_k$ and returns an atom that is null everywhere except for a copy of $m_k$ that starts at instant $\tau$, and $\alpha_{k,\tau}$ is the coefficient associated with the pattern $m_k$ shifted to instant $\tau$. That is, the dictionary is defined as $D = \{T_\tau m_k\}_{k,\tau}$.

In equation (1), because each pattern $m_k$ and its corresponding coefficients $\alpha_{k,\tau}$ are coupled, the two variables cannot be solved simultaneously. However, the coupled problem can be solved by alternately updating the pattern and its corresponding coefficients, so that the solution reduces to two optimization problems.

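The pattern $m_k$ is solved by the optimization problem as follows:

$$m_k = \arg\max_{\|m_k\|_2 = 1} \sum_{\tau} \bigl|\langle y,\, T_\tau m_k \rangle\bigr|^2 = \arg\max_{\|m_k\|_2 = 1} \sum_{\tau} \bigl|\langle T_\tau^{*} y,\, m_k \rangle\bigr|^2,$$

where $\langle\cdot,\cdot\rangle$ is the inner product, $\|\cdot\|_2$ is the 2 norm, and $T_\tau^{*}$ denotes the transpose of $T_\tau$. When the pattern $m_k$ is obtained, the corresponding coefficients $\alpha_{k,\tau}$ can be solved using the following equation:

$$\alpha_{k,\tau} = \langle y,\, T_\tau m_k \rangle = \langle T_\tau^{*} y,\, m_k \rangle.$$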

Although the pattern $m_k$ and its corresponding coefficients $\alpha_{k,\tau}$ are obtained, the two variables need to be updated to minimize the objective function in equation (1). As in K-SVD, SI-K-SVD updates the patterns and their coefficients via singular value decomposition (SVD) [25]. Specifically, when the pattern $m_k$ is updated, the relevant shift operators $T_\tau$ are held fixed. The pattern and its coefficients are updated by the following equation:

$$(m_k,\, \alpha_k) = \arg\min_{\|m_k\|_2 = 1} \sum_{\tau \in \sigma_k} \bigl\| T_\tau^{*} E_k - \alpha_{k,\tau}\, m_k \bigr\|_2^2,$$

where $\sigma_k$ is the set of all data indices that use $m_k$ in their representation and $E_k$ represents the overall error without the contribution of $m_k$.
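The SVD-based update described above can be illustrated with a minimal sketch. It assumes non-overlapping occurrences of a single pattern and that the shifts are already known; the function name `svd_update` and its arguments are hypothetical and are not taken from the paper or from any SI-K-SVD library.

```python
import numpy as np

def svd_update(residual, shifts, length):
    """Rank-1 update of one pattern and its coefficients from residual patches.

    Assumes every shift satisfies tau + length <= len(residual).
    """
    # Each column is the residual restricted to one occurrence of the pattern.
    patches = np.stack([residual[tau:tau + length] for tau in shifts], axis=1)
    u, s, vt = np.linalg.svd(patches, full_matrices=False)
    pattern = u[:, 0]            # first left singular vector, unit 2-norm
    coeffs = s[0] * vt[0, :]     # one amplitude per shift
    # Resolve the SVD sign ambiguity so that the largest coefficient is positive.
    if coeffs[np.argmax(np.abs(coeffs))] < 0:
        pattern, coeffs = -pattern, -coeffs
    return pattern, coeffs
```

The first left singular vector gives the best single pattern in the least-squares sense, and the scaled first right singular vector gives one amplitude per occurrence, which is why one SVD updates both variables at once.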

When SI-K-SVD is applied to the fault detection of a wheelset bearing, the pattern $m_k$ corresponds to a fault impulse element, and its coefficient $\alpha_{k,\tau}$ relates to the amplitude of this impulse element at time $\tau$. Besides, the time location of the impulse element is represented by the shift operator $T_\tau$. Finally, the fault impulse signal embedded in the measured vibration signal can be separated, and the separated fault impulse signal is written as

$$s_k = \sum_{\tau} \alpha_{k,\tau}\, T_\tau m_k.$$
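To make the roles of the pattern, the coefficients, and the shift operator concrete, the following minimal sketch builds a separated impulse signal as a sum of shifted, scaled copies of one pattern. The pattern shape, the shifts, and the amplitudes are hypothetical values chosen only for illustration.

```python
import numpy as np

def shift_atom(pattern, tau, n):
    """Shift operator: a length-n vector that is zero except for a copy of
    `pattern` starting at sample `tau` (requires tau + len(pattern) <= n)."""
    atom = np.zeros(n)
    atom[tau:tau + len(pattern)] = pattern
    return atom

def synthesize(pattern, coeffs, n):
    """Sum of amplitude * shifted pattern over a {shift: amplitude} mapping."""
    signal = np.zeros(n)
    for tau, alpha in coeffs.items():
        signal += alpha * shift_atom(pattern, tau, n)
    return signal

# Hypothetical pattern: a decaying oscillation normalized to unit 2-norm.
t = np.arange(32)
pattern = np.exp(-0.15 * t) * np.sin(2 * np.pi * 0.2 * t)
pattern /= np.linalg.norm(pattern)

coeffs = {100: 2.0, 300: 1.5, 500: 1.8}   # shift in samples -> amplitude
s_k = synthesize(pattern, coeffs, n=600)
```

Because the pattern has unit norm and the occurrences do not overlap, the inner product of `s_k` with any single shifted atom returns the amplitude stored for that shift.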

When all the fault impulse signals are obtained, the fault information used to monitor the running condition of a wheelset bearing can be obtained from the Hilbert envelope spectrum of these impulse signals [2].

#### 3. Separation of Fault-Impulsive Components Using H-SI-K-SVD

##### 3.1. Proposed H-SI-K-SVD

Although SI-K-SVD can in theory learn more than one pattern, each corresponding to a fault, multifault detection remains difficult, as mentioned in Section 1. For a multifault vibration signal, the fault-impulsive components with higher power usually have a better SNR relative to the original measured signal and are easier to separate, whereas the lower power impulse signals are more difficult to find and separate. However, if the higher power impulse signals are mostly removed from the original signal before the lower ones are separated, then the SNR of the lower power impulse signals relative to the residual signal is greatly improved, and these signals become much easier to separate. On the basis of this idea, this paper proposes H-SI-K-SVD to hierarchically separate the wheelset bearing multifault impulsive components based on the power levels of the impulse signals. In this paper, the residual signal in each separation stage is obtained by setting the amplitude of the processed signal to zero at the positions where the amplitude of the separated impulse signal is nonzero. The procedure of H-SI-K-SVD for multifault impulsive component separation is shown in Figure 1.

First, the impulse signal $s_1$ of the fault with the highest power is separated from the measured vibration signal using SI-K-SVD. Then, the resulting residual signal is used to separate the other fault impulse signals with lower powers. Next, the impulse signal $s_2$ with the second highest power is separated from the residual signal in the second separation stage. The remaining impulse signals are separated in the same way. Finally, when all the fault impulse signals have been separated from the measured signal, the Hilbert envelope spectrum of each separated impulse signal is employed to assess the faults of a wheelset bearing.
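The stage-by-stage procedure and the residual rule can be sketched as a short loop. This is only an outline under stated assumptions: `separate_fn` stands in for one SI-K-SVD separation stage, and `toy_separator` is a hypothetical placeholder that keeps a window around the largest sample, not the paper's algorithm.

```python
import numpy as np

def residual_after(signal, impulse):
    """Zero the processed signal wherever the separated impulse signal is nonzero."""
    residual = signal.copy()
    residual[impulse != 0] = 0.0
    return residual

def hierarchical_separate(signal, separate_fn, n_stages):
    """Run `separate_fn` stage by stage, feeding each stage the previous residual."""
    separated = []
    current = np.asarray(signal, dtype=float)
    for _ in range(n_stages):
        impulse = separate_fn(current)
        separated.append(impulse)
        current = residual_after(current, impulse)
    return separated, current

def toy_separator(x, width=5):
    """Placeholder for one separation stage: keep a window around the largest sample."""
    out = np.zeros_like(x)
    peak = int(np.argmax(np.abs(x)))
    lo, hi = max(0, peak - width), min(len(x), peak + width + 1)
    out[lo:hi] = x[lo:hi]
    return out

# Two isolated "faults" with different power levels.
x = np.zeros(100)
x[20], x[70] = 5.0, 2.0
stages, rest = hierarchical_separate(x, toy_separator, n_stages=2)
```

In this toy run the stronger component is taken in the first stage and the weaker one in the second, mirroring the ordering by power level described above.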

##### 3.2. Discussion regarding the Number of Iterations

As previously mentioned, lower power fault impulse signals are more easily submerged and harder to find. To reduce the influence of the higher power fault impulse signals on the lower ones and to recognize all the faults, the target fault impulse signal with higher power should be mostly removed in each separation stage. The results of impulsive component separation are closely related to a key parameter, namely, the number of iterations $K$, which corresponds to the number of nonzero entries of the sparse coefficients in the algorithm. When a certain fault impulse signal is separated, if $K$ is smaller than the real number of target impulses embedded in the measured vibration signal, then some impulses and fault information generated by the target fault will be lost. Conversely, if $K$ is larger than the real number, then noise without any fault information, or false impulses generated by other faults, will be separated as well.

Considering that the waveform of a single-fault impulse signal is simpler and clearer, a simulation signal with a single fault, shown in Figure 2(a), is applied to intuitively illustrate the influence of this key parameter on the separation results. The sampling rate was 10 kHz, and the simulated fault frequency was 49 Hz. In 0.4096 s, 19 fault impulses could be clearly seen from the simulation signal. If the number of iterations is not properly selected, then certain impulses, generated by the target fault, marked by green rectangles, will be missed or some noise components, indicated by red circles, will be mistaken for the target fault impulses, as shown in Figures 2(b)–2(d). Notably, those noise components previously mentioned may be just noise or other impulses generated by other faults in multifault detection.

##### 3.3. Selection of the Number of Iterations

The number of iterations $K$ affects the number, distribution, and sparseness of the impulses in the separated impulse signal. The envelope spectrum of a signal is sparser when the signal contains distinct, continuous periodic impulses. A higher SES value means that the separated impulse signal has more periodic impulsive components and fewer nonperiodic components [32]. Therefore, SES is an effective indicator of the sparseness and periodicity of the separated impulse signal and is suitable for selecting the number of iterations $K$.

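The Hilbert transform of the extracted impulse signal $s(t)$ is expressed as

$$H[s(t)] = \frac{1}{\pi}\int_{-\infty}^{+\infty} \frac{s(\tau)}{t-\tau}\, d\tau,$$

and the envelope spectrum of $s(t)$ is written as

$$ES(f) = \left| \mathcal{F}\left\{ \sqrt{s^{2}(t) + H^{2}[s(t)]} \right\} \right|,$$

where $\mathcal{F}\{\cdot\}$ denotes the Fourier transform. The SES is then computed from $ES(f)$ as defined in [32].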
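A minimal sketch of the envelope spectrum and a sparsity score is given below. The exact SES definition from [32] is not reproduced in this text, so the sketch assumes a common sparsity measure, the ratio of the L2 norm to the L1 norm of the envelope spectrum; the function names, the mean removal, and the test signal are likewise assumptions made for illustration.

```python
import numpy as np
from scipy.signal import hilbert

def envelope_spectrum(x, fs):
    """Amplitude spectrum of the Hilbert envelope of x (mean removed)."""
    envelope = np.abs(hilbert(x))            # magnitude of the analytic signal
    envelope = envelope - envelope.mean()    # drop the DC component
    spectrum = np.abs(np.fft.rfft(envelope)) / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs, spectrum

def ses(spectrum):
    """Sparsity as the L2/L1 norm ratio (an assumed definition, see lead-in)."""
    l1 = np.sum(np.abs(spectrum))
    return float(np.sqrt(np.sum(spectrum ** 2)) / l1) if l1 > 0 else 0.0

# Hypothetical check signal: a 100 Hz carrier modulated at 10 Hz.
fs = 1000
t = np.arange(1000) / fs
x = (1 + 0.5 * np.cos(2 * np.pi * 10 * t)) * np.cos(2 * np.pi * 100 * t)
freqs, spec = envelope_spectrum(x, fs)
peak_freq = freqs[np.argmax(spec)]
score = ses(spec)
```

With this measure, a spectrum concentrated in a single line scores close to 1, while a flat spectrum over $n$ bins scores $1/\sqrt{n}$, so more periodic envelopes receive higher values.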

When SES is applied to separate the impulsive components from the simulation signal in Figure 2(a), the SES values for different numbers of iterations are computed and shown in Figure 3(a). The SES has a clear peak at $K = 19$, which is equal to the number of impulses in the simulation signal. With the selected $K$, all the fault impulsive components are separated from the original simulation signal, as shown in Figure 3(b). This result verifies the effectiveness of SES as an indicator for selecting the number of iterations. The procedure for obtaining the optimal separated impulse signal based on SES is shown in Figure 4, where the notation indicates the number of iterations adopted for each separated impulse signal.
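The selection rule amounts to sweeping candidate numbers of iterations and keeping the one with the highest score. The sketch below shows only that sweep; `select_iterations` and `score_fn` are hypothetical names, and in practice `score_fn` would wrap "separate with $K$ iterations, then compute SES", which is not implemented here.

```python
def select_iterations(candidates, score_fn):
    """Return the candidate K with the highest score, plus all scores."""
    scores = {k: score_fn(k) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores

# Stand-in score with a single peak at K = 19, used only to exercise the sweep.
best_k, all_scores = select_iterations(range(10, 31), lambda k: -abs(k - 19))
```

If several candidates tie for the highest score, this sketch returns the first one encountered, which is a design choice of the example rather than something stated in the paper.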

#### 4. Simulation Validation

To illustrate the effectiveness of the proposed method for multifault impulsive component separation, a simulated bearing vibration signal with two faults is introduced in this section. The separation was carried out with the proposed H-SI-K-SVD. The fault-related parameters of the simulation signals are listed in Table 1, where $A$ represents the amplitude of the impulses, $\xi$ is the coefficient of structure damping, $f_n$ is the excited resonance frequency, and $f_c$ is the characteristic frequency of the impulse.
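A simulated two-fault signal of this kind can be sketched as two trains of exponentially decaying oscillations plus white noise. The impulse model, the interpretation of the damping coefficient as a decay rate, and every parameter value below are assumptions made for illustration; they are not the values of Table 1.

```python
import numpy as np

def impulse_train(n, fs, amp, damping, f_res, f_fault):
    """Exponentially decaying oscillations that repeat at the fault frequency."""
    t = np.arange(n) / fs
    t_local = np.mod(t, 1.0 / f_fault)     # time since the latest impulse
    return amp * np.exp(-damping * t_local) * np.sin(2 * np.pi * f_res * t_local)

def add_noise(x, snr_db, rng):
    """Add white Gaussian noise so that the expected SNR equals snr_db."""
    p_signal = np.mean(x ** 2)
    p_noise = p_signal / (10 ** (snr_db / 10))
    return x + rng.normal(0.0, np.sqrt(p_noise), size=x.shape)

rng = np.random.default_rng(0)
fs, n = 10_000, 4096
# Hypothetical parameter values (amplitude, decay rate, resonance, fault frequency).
fault_1 = impulse_train(n, fs, amp=1.0, damping=800.0, f_res=2000.0, f_fault=49.0)
fault_2 = impulse_train(n, fs, amp=0.5, damping=600.0, f_res=3000.0, f_fault=80.0)
clean = fault_1 + fault_2
noisy = add_noise(clean, snr_db=-3.0, rng=rng)
```

Giving the two faults different amplitudes reproduces the situation the method targets, where the weaker impulse train is partly hidden by the stronger one and by the noise.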

The simulation signal with an SNR of −3 dB is shown in Figure 5(a), and the Fourier and Hilbert envelope spectra of the simulation signal are shown in Figures 5(b) and 5(c), respectively. The characteristic frequency of one fault and its harmonics were easily identified in the Hilbert envelope spectrum in Figure 5(c), but the characteristic frequency of the other fault was barely evident. The simulation signal was then processed by the proposed method, and the separation results for the two faults are shown in Figures 6 and 7.

When the first impulse signal is separated, the SES values shown in Figure 6(a) exhibit an obvious peak, which gives the optimal number of iterations. With this number of iterations, the first impulse signal is separated from the original signal, as shown in Figure 6(b). Figure 6(d) shows that the characteristic frequency of the first fault and its harmonics are found, and their amplitudes are higher than those in the Hilbert envelope spectrum of the original simulation signal in Figure 5(c). Similarly, the peak of the SES curve in Figure 7(a) was selected as the optimal number of iterations for separating the second impulse signal, whose time-domain waveform is shown in Figure 7(b). The characteristic frequency of the second fault and its harmonics can be clearly seen in Figure 7(d), although they were barely evident in the Hilbert envelope spectrum in Figure 5(c). This fault information can be employed to assess the faults of a bearing.

To illustrate the superiority of the proposed method, an existing and widely used method for bearing fault diagnosis, ensemble empirical mode decomposition (EEMD), was employed to process the same simulation signal, and the results are shown in Figure 8. Although the envelope spectrum of the first intrinsic mode function (IMF1) in Figure 8(b) reveals the characteristic frequency and its harmonics, some harmonics are missing (the fourth and thirteenth harmonics, marked by carmine dashed lines), and the amplitudes of the detected harmonics are clearly lower than those found in Figure 7(d). Besides, the time-domain waveform of IMF1 contains considerable noise, which is marked between two green dashed lines. The envelope spectra of IMF2 to IMF4 indicate the characteristic frequency and its harmonics as well, but their amplitudes are also lower than those in Figure 6. In addition, the impulsive components generated by only one fault are divided among three intrinsic mode functions, IMF2 to IMF4, in Figure 8(a). This differs from the single separated impulse signal obtained with the proposed method in Figure 6(b) and has an adverse effect on the detection of this fault. Therefore, the comparison shows that the proposed method performs better both in separating the impulse signals generated by different faults and in integrating the fault information.

#### 5. Experimental Validation

An experimental bench for a wheelset bearing was used to verify the effectiveness of the proposed H-SI-K-SVD under experimental conditions, as shown in Figure 9. A detailed schematic diagram of this experimental bench is shown in Figure 10.

The bench mainly consisted of a driving motor, a wheelset, a gear box, two driving wheels, two loading devices, two axle boxes, and some fixing devices. Artificial notches were introduced on the surface of a roller (Figure 11(a)) and of the outer race (Figure 11(b)). The wheelset bearing with these two types of artificial notches was installed in the axle box and used to conduct the fault experiments. The geometric parameters of this faulty wheelset bearing are listed in Table 2. An accelerometer was mounted on the top surface of the axle box to collect the vibration signal of this wheelset bearing, as shown in Figure 9(b). To simulate the actual operation of the wheelset, a load was applied to the axle box through the loading devices, and a rubber buffer was mounted on the top surface of the axle box to isolate the vibration from the running wheelset.

The vibration signals, shown in Figure 12, were collected from the accelerometer installed on the top of the axle box. The sampling rate was 10 kHz. The rotational frequency of the wheelset was $f_{\mathrm{rot}} = 10.3$ Hz, which corresponded to a running speed of 100 km/h. The fault characteristic frequencies of the roller, $f_r$, and of the outer race, $f_o$, are defined as [3]

$$f_r = \frac{D}{d}\left[1 - \left(\frac{d}{D}\cos\theta\right)^{2}\right] f_{\mathrm{rot}}, \qquad f_o = \frac{Z}{2}\left(1 - \frac{d}{D}\cos\theta\right) f_{\mathrm{rot}},$$

where $Z$ denotes the number of rollers, $D$ is the pitch diameter, $d$ is the diameter of a roller, and $\theta$ denotes the contact angle. According to the parameters listed in Table 2, the roller characteristic frequency is 69.6 Hz and the outer race characteristic frequency is 83.8 Hz.
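For reference, the standard kinematic formulas for these two characteristic frequencies can be evaluated as in the sketch below. It assumes a stationary outer race and the convention in which a roller defect strikes both races once per roller revolution (factor $D/d$ rather than $D/(2d)$); the geometry in the example is hypothetical and is not taken from Table 2.

```python
import math

def outer_race_frequency(f_rot, n_rollers, pitch_d, roller_d, contact_deg):
    """Outer race fault frequency for a stationary outer race."""
    ratio = roller_d / pitch_d * math.cos(math.radians(contact_deg))
    return 0.5 * n_rollers * f_rot * (1.0 - ratio)

def roller_frequency(f_rot, pitch_d, roller_d, contact_deg):
    """Roller fault frequency, assuming two impacts per roller revolution."""
    ratio = roller_d / pitch_d * math.cos(math.radians(contact_deg))
    return pitch_d / roller_d * f_rot * (1.0 - ratio ** 2)

# Hypothetical geometry: 20 rollers, 100 mm pitch diameter, 10 mm rollers, 0 degrees.
f_outer = outer_race_frequency(10.0, 20, 100.0, 10.0, 0.0)
f_roller = roller_frequency(10.0, 100.0, 10.0, 0.0)
```

Only the ratio of roller diameter to pitch diameter enters the formulas, so the two diameters can be given in any common unit.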

From Figure 12(c), only the roller characteristic frequency and its harmonics were easily found, whereas those of the outer race were hardly evident in the Hilbert envelope spectrum of the measured vibration signal. The results of impulsive component separation using H-SI-K-SVD are shown in Figures 13 and 14. Following the rule for selecting the number of iterations based on SES, the number of iterations equals 28 (as shown in Figure 13(a)) for separating the roller fault impulse signal and 47 (as shown in Figure 14(a)) for separating the outer race fault impulse signal. The vibration pattern of a defective roller alternately entering the bearing zone (BZ) and nonbearing zone (NBZ) can be seen clearly in Figure 13(b), and this phenomenon indicates a defect on a roller [33]. Furthermore, the Hilbert envelope spectrum in Figure 13(d) reveals the roller fault characteristic frequency and seven of its harmonics, and the amplitudes of these harmonics are higher than those in Figure 12(c). In addition, the outer race fault impulsive components are separated from the measured vibration signal, as shown in Figure 14. The characteristic frequency of the outer race fault and five of its harmonics are clearly observed in the Hilbert envelope spectrum in Figure 14(d). Therefore, the obtained results show the effectiveness of the proposed H-SI-K-SVD for separating multifault impulsive components under experimental conditions.

EEMD, as a comparison method, is also employed to analyze the same signal, and the results are shown in Figure 15. In the frequency domain, the roller fault characteristic frequency and its second and third harmonics can be seen clearly in the Hilbert envelope spectrum of IMF1, but both the amplitude and the number of these spectrum lines are lower than those in Figure 13(d). For the outer race fault, the envelope spectrum of IMF2 reveals only the characteristic frequency and its second and third harmonics. In addition, although some harmonics of the outer race fault characteristic frequency can be found, many interference spectrum lines with similar energy make these spectrum lines difficult to recognize. In the time domain, the separated impulse signals obtained with the proposed method (Figures 13(b) and 14(b)) have a high SNR, and each of them contains the impulsive components generated by only one fault. Moreover, separated impulse signals with a high SNR facilitate the detection of the time-domain waveform characteristics of a wheelset bearing, which is difficult to achieve with EEMD. In conclusion, the proposed H-SI-K-SVD performs better in separating the multifault impulse signals from the measured vibration signals and in diagnosing the multifault of a wheelset bearing under experimental conditions.

#### 6. Conclusion

A train wheelset bearing is an important element that maintains stable energy transmission between the wheelsets and driven systems. If any part of a wheelset bearing fails, the safety of train service is diminished. Compared with single-fault detection, multifault detection of a wheelset bearing remains difficult because impulsive components generated by faults with lower power are easily missed and are barely evident when other fault impulse signals have higher power. This paper reports a method, namely, H-SI-K-SVD, that hierarchically separates the multifault impulsive components from the measured vibration signals and can be used for detecting the multifault of a wheelset bearing. The hierarchical process separates the impulsive components generated by different faults based on their power levels and highlights the fault information of each fault in both the time domain and the frequency domain. In addition, SES is introduced as an indicator for adaptively selecting the number of iterations. Simulation and experimental vibration signals are employed to verify the effectiveness of the proposed method in separating the multifault impulsive components and in diagnosing the multifault of a wheelset bearing. Compared with an existing and widely used method, namely, EEMD, the proposed method performs better in impulsive component separation and fault detection.

Lastly, further work will concentrate on improving the proposed method to extract fault information from modulated vibration signals which are common for rotating component assembly systems, such as multistage transmission gear-box systems.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 51875481), the Fundamental Research Foundations for the Central Universities (No. 2682017CX011), the China Postdoctoral Science Foundation (No. 2017M623009), the China National Key Research and Development Plan for Advanced Rail Transit (No. 2017YFB1201004), and the Research Fund of the State Key Laboratory of Traction Power (No. 2019TPL_T08).