#### Abstract

Feature extraction of wheelset-bearing faults is important for the safe service of high-speed trains. In recent years, sparse representation has gradually been applied to the fault diagnosis of wheelset-bearings. However, traditional sparse representation struggles to extract fault features when strong interference components are superimposed on the signal. Therefore, this paper proposes a novel feature extraction method for wheelset-bearing faults based on wavelet sparse representation with adaptive local iterative filtering. In this method, adaptive local iterative filtering effectively reduces the impact of interference components and aids the extraction of sparse impulses. The wavelet sparse representation, which adopts L1-regularized optimization for a globally optimal solution in sparse coding, extracts the intrinsic features of the fault in the wavelet domain. To validate the proposed method, both simulated and experimental signals are analyzed. The results show that the fault features of the wheelset-bearing are sufficiently extracted by the proposed method.

#### 1. Introduction

As the core system of high-speed trains, the bogie frame plays a crucial role during operation. Among the components of the bogie frame, the wheelset-bearing is the core component connecting the wheelset and the frame, and it plays an important role in power transmission when the train runs on the rail. The service conditions of the wheelset-bearing differ markedly from those of common bearings on stationary machinery. The wheelset-bearing carries not only the static load of the high-speed train but also the unstable dynamic load caused by radial acceleration during operation. A higher speed naturally imposes greater vibration and dynamic force on the wheelset-bearing. In addition, the wheelset-bearing bears a large axial force when the train passes through a curve.

When a high-speed train runs on the rail, various operating characteristics can cause wheelset-bearing faults. Since the distance between adjacent stations is usually short for high-speed trains, acceleration and braking occur frequently, so the dynamic load on the wheelset-bearing changes greatly and often. In addition, the severe impacts from the track-vehicle system, which are caused by polygonal wheel wear, track irregularities, and irregular turnouts, may produce various faults (such as peeling and flaws) on the wheelset-bearing. Once these faults appear under high-speed rotation, the service conditions of the wheelset-bearing deteriorate rapidly, which eventually affects the safety of the high-speed train. Therefore, it is of great significance to detect wheelset-bearing faults [1]. Traditional fault detection cannot be conducted until the wheelset-bearing is disassembled during the level-three maintenance of a high-speed train, so faults cannot be detected before that maintenance. Therefore, fault diagnosis based on vibration signals becomes a feasible technique for detecting wheelset-bearing faults at an early stage.

In general, a bearing fault generates periodic impulses, so vibration signals are collected to determine whether a fault exists on the bearing. During the operation of a high-speed train, defects inevitably appear on the wheel-tread. Compared with a traditional bearing, the energy at the wheelset rotation frequency, generated by the wheel-tread defect, is larger because of the interaction between the wheel-tread and the rail. Therefore, the wheelset rotation frequency and its harmonics are also clearly present in the collected vibration signals, which makes their frequency content more complex [2]. This is the most important difference between the vibration signals of a traditional bearing and those of a wheelset-bearing. If the rotation frequency information of the wheelset is not handled appropriately during analysis, mistaken fault features might be extracted. In addition, the collected vibration signals usually contain a nonstationary component and are corrupted by noise, which poses extraordinary challenges for fault detection [3, 4].

Many fault diagnosis methods have been proposed for general bearing fault detection, including empirical mode decomposition and its variants [5, 6], empirical wavelet transform [7, 8], variational mode decomposition [9, 10], minimum entropy deconvolution [11, 12], local mean decomposition [13, 14], deep learning [15, 16], and sparse representation [17, 18]. Among these techniques, sparse representation is a promising method for feature extraction of bearing faults, and many scholars have recently applied it to bearing fault detection. Chen et al. [17] proposed a method named SpaEIAD for fault extraction, which performed well in signal denoising. In [19], feature-sign search was adopted for sparse representation and also obtained the desired extraction result. Sun et al. [20] designed a parametric impulsive dictionary and improved the stopping criteria of OMP for sparse representation. Ding [21] proposed a shock response convolutional sparse coding technique and extracted the shock response based on time location coefficients. Qin [22] proposed a new sparse representation method based on a family of model-based impulsive wavelets, which was able to represent the bearing fault impulses accurately. In [23], a bearing fault extraction method based on the adaptive OMP algorithm and an improved K-SVD with an adaptive transient dictionary was proposed, which performed well in both fault detection and computation speed.

In traditional sparse representation, the power levels of different features affect the extraction results [24]. When the energy of the interference, usually a nonstationary component, is stronger than the energy of the fault features, sparse representation detects and extracts the interference component instead of the fault features. To address this, Qin successfully separated the harmonics and modulated components from the vibration signal of a gearbox bearing with the improved OMP and a Fourier dictionary in [23], and the method performed well in fault extraction in two experiments. However, the service conditions of the wheelset-bearing are more complex than those of the gearbox bearing in [23]. As analyzed above, besides the nonstationary component, the wheelset’s rotation frequency component must also be removed. The separation algorithm in [23] would therefore have to be executed an additional time, which increases the complexity of the algorithm. In addition, the Fourier dictionary might not be suitable for separating the wheelset’s rotation frequency component. Therefore, Qin’s algorithm might not be fully suitable for fault extraction of the wheelset-bearing. According to the analysis in [2], the resonance frequencies excited by the wheel-tread defect and by the wheelset-bearing fault are more likely to lie at low frequency and high frequency, respectively. Additionally, the nonstationary component, which is distributed throughout the signal, also lies at low frequency relative to the resonance frequency excited by the wheelset-bearing fault. To remove the impact of these two components conveniently and effectively, adaptive local iterative filtering (ALIF) [25] is introduced. ALIF is well suited to separating components that occupy different frequency bands. Therefore, ALIF can reduce or eliminate the influence of the aforementioned two components.

When sparse representation is applied solely in the time domain, feature extraction is inadequate, which means that the fault features cannot be extracted thoroughly. The wavelet domain offers another, multiscale representation of the signal [26]. The most useful time-domain information can be compressed and represented in the wavelet domain without losing local information. During wavelet decomposition, the noise component is partially separated, which highlights the local features of the signal. In addition, the initial atoms of the dictionary, obtained in the wavelet domain, fit the signal better. Therefore, the wavelet sparse representation, i.e., sparse representation applied in the wavelet domain, is able to extract the intrinsic features of signals.

To diagnose wheelset-bearing fault more effectively, a novel feature extraction method, namely, ALIF-SBAKW, based on the wavelet sparse representation (Split Bregman for sparse coding and approximate K-SVD for dictionary learning) with the adaptive local iterative filtering (ALIF), is proposed in this paper. The paper is organized as follows. Section 2 elaborates the details of wavelet sparse representation. Section 3 describes the main principle of ALIF-SBAKW for feature extraction. The proposed ALIF-SBAKW is verified by simulations and experiments, respectively, in Sections 4 and 5. Section 6 concludes the paper.

#### 2. Wavelet Sparse Representation

##### 2.1. Sparse Representation

According to the sparsity of fault impulses, the observed signal can be represented sparsely by combining a dictionary with a sparse coefficient vector, as shown in

$$\mathbf{y} = \mathbf{D}\mathbf{x} + \mathbf{v}, \tag{1}$$

where $\mathbf{y}$ denotes the observed signal, $\mathbf{v}$ is the measurement noise, $\mathbf{D}$ denotes the dictionary matrix, and $\mathbf{x}$ denotes the sparse coefficient vector. The extracted signal is constructed by multiplying the dictionary and the sparse coefficient vector, i.e., $\mathbf{D}\mathbf{x}$.

Equation (1) is underdetermined, which means that it has infinitely many solutions. A sparse solution can be sought through L0-regularized or L1-regularized optimization. The L0-regularized problem is NP-hard, whereas the L1-regularized problem is a convex relaxation that can be solved efficiently [19]. Therefore, L1-regularized optimization is adopted, as given in

$$\min_{\mathbf{x}} \|\mathbf{x}\|_1 \quad \text{s.t.} \quad \|\mathbf{y} - \mathbf{D}\mathbf{x}\|_2^2 \le \epsilon, \tag{2}$$

where the tolerance $\epsilon$ is set from the standard deviation of the noise and *c* denotes the adjusted gain applied to that standard deviation.

To solve this optimization problem, a penalty factor $\mu$ can be introduced to remove the constraint. A new objective function is obtained:

$$\min_{\mathbf{x}} \|\mathbf{x}\|_1 + \frac{\mu}{2}\|\mathbf{y} - \mathbf{D}\mathbf{x}\|_2^2. \tag{3}$$

In practice, the signal length *n* is usually large, so solving (3) directly consumes a large amount of computing resources. To reduce the computational burden, the observed signal is segmented into a series of truncated signals with a certain overlap; as shown in Figure 1, a dataset composed of the truncated signals is obtained, and (3) can then be solved in matrix form. Let the dataset after segmentation be $\mathbf{Y} \in \mathbb{R}^{b \times N}$, and let the dictionary matrix and sparse coefficient matrix be $\mathbf{D} \in \mathbb{R}^{b \times k}$ and $\mathbf{X} \in \mathbb{R}^{k \times N}$, respectively. Then (3) is transformed into

$$\min_{\mathbf{D},\mathbf{X}} \|\mathbf{X}\|_1 + \frac{\mu}{2}\|\mathbf{Y} - \mathbf{D}\mathbf{X}\|_F^2. \tag{4}$$
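The segmentation into overlapping truncated signals can be sketched in Python as follows. This is a minimal illustration under assumptions: the function names `segment_signal` and `tile_segments`, the `step` parameter that controls the overlap, and the averaging of overlapping samples when tiling the segments back are choices made here, not details given in the text.

```python
import numpy as np

def segment_signal(y, b, step=1):
    """Cut a 1-D signal into overlapping segments of length b.

    Each segment becomes one column of the returned (b, N) dataset.
    step=1 gives the maximal overlap (neighbouring segments share b-1 samples).
    """
    y = np.asarray(y, dtype=float)
    if b > y.size:
        raise ValueError("segment length exceeds signal length")
    starts = range(0, y.size - b + 1, step)
    return np.stack([y[s:s + b] for s in starts], axis=1)

def tile_segments(Y, n, step=1):
    """Rebuild a length-n signal by averaging the overlapping columns of Y."""
    b, n_segments = Y.shape
    total = np.zeros(n)
    count = np.zeros(n)
    for j in range(n_segments):
        s = j * step
        total[s:s + b] += Y[:, j]
        count[s:s + b] += 1
    # Samples never covered by a segment keep the value 0.
    return total / np.maximum(count, 1)

# Usage: a 12-sample signal cut into segments of length 4.
y = np.arange(12.0)
Y = segment_signal(y, b=4, step=1)
y_back = tile_segments(Y, n=y.size, step=1)
```

With `step=1`, every sample of the signal is covered by at least one segment, so tiling an unmodified dataset returns the original signal.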

To solve the optimization problem in (4), two steps, sparse coding and dictionary learning, are executed alternately. In sparse coding, a sparse approximation is used to find the sparse coefficients with the dictionary fixed. In dictionary learning, the dictionary is updated with the obtained sparse coefficients.

##### 2.2. Wavelet Sparse Representation

To better extract the intrinsic features of signals, wavelet decomposition can be adopted in sparse representation. For a one-dimensional signal, wavelet decomposition yields two kinds of coefficients: approximation coefficients (CAs) and detail coefficients (CDs). The CAs contain the main information of the original signal, whereas the CDs contain its subordinate components. The wavelet decomposition of the signal is illustrated in Figure 2. The number of decomposed subbands depends on the decomposition level *q*: the subbands consist of one CA band and *q* CD bands, so the number of subbands is *q* + 1.

The fault features of the bearing are mainly hidden in the CAs after wavelet decomposition, so the CAs can be called impulse wavelet coefficients (IWCs), whereas the CDs mainly contain the noise component of the signal. Accordingly, the sparse representation can be carried out on the IWCs, and the objective function becomes

$$\min_{\mathbf{D},\mathbf{X}} \sum_{i} \left( \|\mathbf{x}_i\|_1 + \frac{\mu}{2}\|\mathbf{y}_i - \mathbf{D}\mathbf{x}_i\|_2^2 \right), \tag{5}$$

where $\mathbf{y}_i$ denotes the $i$-th column of the dataset $\mathbf{Y}$, which is composed of the segmented IWCs in the wavelet domain, $\mathbf{x}_i$ is the corresponding column of the sparse coefficient matrix $\mathbf{X}$, and $\mu$ is the penalty factor. Equation (5) yields the dictionary matrix $\mathbf{D}$ and the sparse coefficient matrix $\mathbf{X}$ of the wavelet sparse representation. The CDs can be set directly to zero because they mainly contain noise. The level of wavelet decomposition matters greatly for the IWCs: each level of one-dimensional wavelet decomposition downsamples the signal, so with too many levels the IWCs are distorted and the fault features hidden in them are weakened or lost. Therefore, the choice of decomposition level is extremely important.
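The split into CAs and CDs, and the effect of zeroing the CDs, can be illustrated with a one-level decomposition. The sketch below is an assumption-laden stand-in: it uses the Haar wavelet because its filters can be written in two lines, whereas the paper selects a Daubechies 8-tap wavelet, for which a wavelet library would normally be used. The function names `haar_dwt` and `haar_idwt` are hypothetical.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar decomposition of an even-length signal.

    Returns (ca, cd): approximation and detail coefficients,
    each half as long as the input (the downsampling step).
    """
    x = np.asarray(x, dtype=float)
    if x.size % 2:
        raise ValueError("signal length must be even")
    ca = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    cd = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return ca, cd

def haar_idwt(ca, cd):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    x = np.empty(2 * ca.size)
    x[0::2] = (ca + cd) / np.sqrt(2.0)
    x[1::2] = (ca - cd) / np.sqrt(2.0)
    return x

# Usage: keep the approximation band, zero the detail band, reconstruct.
x = np.array([4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0])
ca, cd = haar_dwt(x)
x_from_ca_only = haar_idwt(ca, np.zeros_like(cd))
```

Because every level halves the number of coefficients, a short signal leaves very few IWCs after two or three levels, which is one way to see why the decomposition level has to stay small.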

At the beginning of wavelet sparse representation, *k* columns of the dataset are randomly selected as the initial dictionary. The number of rows *b* of the dictionary is always much smaller than its number of columns *k*, which makes the dictionary redundant. A redundant dictionary is better at expressing a highly diversified signal and at reconstructing the local features of the signal. Furthermore, noise is generally not sparse. Therefore, the fault signal, which must be extracted from the noise, becomes much sparser and more stable when represented with a redundant dictionary.

##### 2.3. Split Bregman for Sparse Coding

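Split Bregman (SB) iteration is an effective method for L1-regularized optimization. Because it solves a very wide class of L1-regularized problems by alternating iteration, SB has been widely used in image processing [27, 28]. It is also suitable for sparse coding. In this paper, the sparse coding problem is solved in the form of (6) and (7) based on the principle of SB:

$$\min_{\mathbf{x},\mathbf{d}} \|\mathbf{d}\|_1 + \frac{\mu}{2}\|\mathbf{y} - \mathbf{D}\mathbf{x}\|_2^2 \quad \text{s.t.} \quad \mathbf{d} = \mathbf{x}, \tag{6}$$

$$\min_{\mathbf{x},\mathbf{d}} \|\mathbf{d}\|_1 + \frac{\mu}{2}\|\mathbf{y} - \mathbf{D}\mathbf{x}\|_2^2 + \frac{\lambda}{2}\|\mathbf{d} - \mathbf{x}\|_2^2, \tag{7}$$

where $\mathbf{y}$ and $\mathbf{D}$ are given, and $\mathbf{x}$ and $\mathbf{d}$ are the unknown quantities. Introducing the Bregman variable $\mathbf{b}$, an elegant form of the iteration is obtained by simplifying (7), as seen in (8) and (9):

$$(\mathbf{x}^{j+1}, \mathbf{d}^{j+1}) = \arg\min_{\mathbf{x},\mathbf{d}} \|\mathbf{d}\|_1 + \frac{\mu}{2}\|\mathbf{y} - \mathbf{D}\mathbf{x}\|_2^2 + \frac{\lambda}{2}\|\mathbf{d} - \mathbf{x} - \mathbf{b}^{j}\|_2^2, \tag{8}$$

$$\mathbf{b}^{j+1} = \mathbf{b}^{j} + \mathbf{x}^{j+1} - \mathbf{d}^{j+1}. \tag{9}$$

In (8), the objective contains L1 and L2 terms in two independent variables. When one variable is updated, the other can be treated as a constant. Owing to this alternating structure, (8) can be split into two steps:

$$\mathbf{x}^{j+1} = \arg\min_{\mathbf{x}} \frac{\mu}{2}\|\mathbf{y} - \mathbf{D}\mathbf{x}\|_2^2 + \frac{\lambda}{2}\|\mathbf{d}^{j} - \mathbf{x} - \mathbf{b}^{j}\|_2^2, \tag{10}$$

$$\mathbf{d}^{j+1} = \arg\min_{\mathbf{d}} \|\mathbf{d}\|_1 + \frac{\lambda}{2}\|\mathbf{d} - \mathbf{x}^{j+1} - \mathbf{b}^{j}\|_2^2. \tag{11}$$

Problem (10) is solved by differentiating with respect to $\mathbf{x}$ and setting the result to zero, which gives

$$\mathbf{x}^{j+1} = \left(\mu \mathbf{D}^{T}\mathbf{D} + \lambda \mathbf{I}\right)^{-1}\left(\mu \mathbf{D}^{T}\mathbf{y} + \lambda(\mathbf{d}^{j} - \mathbf{b}^{j})\right). \tag{12}$$

In addition, $\mathbf{d}^{j+1}$ is obtained in the second step with a shrinkage operator, as shown in (13) and (14), where $\odot$ denotes elementwise multiplication:

$$\mathbf{d}^{j+1} = \operatorname{shrink}\left(\mathbf{x}^{j+1} + \mathbf{b}^{j}, \tfrac{1}{\lambda}\right), \tag{13}$$

$$\operatorname{shrink}(\mathbf{z}, \tau) = \operatorname{sign}(\mathbf{z}) \odot \max(|\mathbf{z}| - \tau, 0). \tag{14}$$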
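The alternating structure described above (a least-squares step, a shrinkage step, and a Bregman update) can be sketched in Python. This is a generic Split Bregman solver for an L1-penalized least-squares problem, written under assumptions: the objective `||x||_1 + (mu/2)||y - D x||^2`, the parameter names `mu` and `lam`, and the fixed iteration count are choices made here, and the paper's exact formulation and parameter placement may differ.

```python
import numpy as np

def shrink(z, tau):
    """Soft-thresholding: sign(z) * max(|z| - tau, 0), elementwise."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def split_bregman_l1(D, y, mu=1.0, lam=1.0, n_iter=200):
    """Approximately solve min_x ||x||_1 + (mu/2) * ||y - D x||_2^2.

    The variable is split as d = x. y may be a vector or a matrix whose
    columns are coded independently with the same dictionary D.
    """
    D = np.asarray(D, dtype=float)
    y = np.asarray(y, dtype=float)
    k = D.shape[1]
    shape = (k,) + y.shape[1:]
    d = np.zeros(shape)          # auxiliary (sparse) variable
    b = np.zeros(shape)          # Bregman variable
    # The system matrix of the least-squares step never changes.
    A = mu * D.T @ D + lam * np.eye(k)
    Dty = mu * D.T @ y
    for _ in range(n_iter):
        x = np.linalg.solve(A, Dty + lam * (d - b))   # L2 step
        d = shrink(x + b, 1.0 / lam)                  # L1 step
        b = b + x - d                                 # Bregman update
    return d   # d carries exact zeros; x converges to the same point

# Usage: with an identity dictionary the solution is soft-thresholding of y.
codes = split_bregman_l1(np.eye(3), np.array([3.0, -0.2, 0.5]))
```

Returning `d` rather than `x` is a design choice of this sketch: the shrinkage output contains exact zeros, which is convenient when the sparsity pattern is reused in the dictionary update.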

##### 2.4. Approximate K-SVD for Dictionary Learning

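After the sparse coefficient matrix is calculated, the dictionary is updated. K-SVD is an effective method for dictionary learning, which can improve the sparsity of the sparse coefficient matrix associated with the dictionary [23]. The objective function can be rewritten as

$$\|\mathbf{Y} - \mathbf{D}\mathbf{X}\|_F^2 = \Big\|\mathbf{Y} - \sum_{j \ne k} \mathbf{d}_j \mathbf{x}_T^{j} - \mathbf{d}_k \mathbf{x}_T^{k}\Big\|_F^2 = \|\mathbf{E}_k - \mathbf{d}_k \mathbf{x}_T^{k}\|_F^2, \tag{15}$$

where $\mathbf{d}_j$ denotes the $j$-th column of $\mathbf{D}$ and $\mathbf{x}_T^{j}$ denotes the $j$-th row of $\mathbf{X}$. Based on the principle of K-SVD, the optimization of (15) is equivalent to optimizing only the nonzero elements in $\mathbf{x}_T^{k}$:

$$\min_{\mathbf{d}_k, \mathbf{x}_R^{k}} \|\mathbf{E}_k \boldsymbol{\Omega}_k - \mathbf{d}_k \mathbf{x}_T^{k} \boldsymbol{\Omega}_k\|_F^2 = \|\mathbf{E}_k^{R} - \mathbf{d}_k \mathbf{x}_R^{k}\|_F^2, \tag{16}$$

where the matrix $\boldsymbol{\Omega}_k$ has size $N \times |\omega_k|$ with ones at the entries $(\omega_k(i), i)$ and zeros elsewhere, $\omega_k$ denotes the set of indices of the nonzero elements in $\mathbf{x}_T^{k}$, i.e., $\omega_k = \{ i \mid 1 \le i \le N,\ \mathbf{x}_T^{k}(i) \ne 0 \}$, and $|\omega_k|$ denotes the number of nonzero elements in $\mathbf{x}_T^{k}$.

This problem can be solved directly by singular value decomposition (i.e., $\mathbf{E}_k^{R} = \mathbf{U}\boldsymbol{\Delta}\mathbf{V}^{T}$). $\mathbf{d}_k$ and $\mathbf{x}_R^{k}$ are taken as the first column of $\mathbf{U}$ and the first column of $\mathbf{V}$ multiplied by $\boldsymbol{\Delta}(1,1)$, respectively.

Based on K-SVD, approximate K-SVD (AK-SVD) is introduced to improve the dictionary by iteration while reducing the computational burden [29]. In this case, the following two steps are iterated alternately to obtain an approximate solution:

$$\mathbf{d}_k = \frac{\mathbf{E}_k^{R} (\mathbf{x}_R^{k})^{T}}{\|\mathbf{E}_k^{R} (\mathbf{x}_R^{k})^{T}\|_2}, \tag{17}$$

$$\mathbf{x}_R^{k} = \mathbf{d}_k^{T} \mathbf{E}_k^{R}. \tag{18}$$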
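The approximate atom update can be sketched as follows. This is a minimal illustration, assuming the usual approximate K-SVD rule in which the SVD is replaced by one alternating pass: the atom becomes the normalized product of the restricted residual with the current coefficients, and the coefficients become the projection of the residual onto the new atom. The function name `aksvd_update` and the single sweep per call are assumptions of this sketch.

```python
import numpy as np

def aksvd_update(Y, D, X):
    """One approximate K-SVD sweep over all atoms.

    Y: (b, N) dataset, D: (b, k) dictionary, X: (k, N) sparse codes.
    Returns updated copies of D and X; the sparsity pattern of X never grows.
    """
    D = np.array(D, dtype=float)
    X = np.array(X, dtype=float)
    for k in range(D.shape[1]):
        idx = np.flatnonzero(X[k, :])      # columns of Y that use atom k
        if idx.size == 0:
            continue                       # unused atom: leave unchanged
        g = X[k, idx].copy()
        # Residual without the contribution of atom k, on those columns only.
        E = Y[:, idx] - D @ X[:, idx] + np.outer(D[:, k], g)
        d = E @ g
        norm = np.linalg.norm(d)
        if norm == 0.0:
            continue                       # degenerate case: keep old atom
        d /= norm
        D[:, k] = d                        # step 1: new unit-norm atom
        X[k, idx] = d @ E                  # step 2: matching coefficients
    return D, X

# Usage on random data (illustrative values only).
rng = np.random.default_rng(0)
Y = rng.standard_normal((6, 30))
D0 = rng.standard_normal((6, 12))
D0 /= np.linalg.norm(D0, axis=0)
X0 = rng.standard_normal((12, 30)) * (rng.random((12, 30)) < 0.2)
D1, X1 = aksvd_update(Y, D0, X0)
```

Each atom update can only lower (or keep) the residual on the columns that use the atom, so the overall representation error does not increase over a sweep.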

#### 3. Proposed ALIF-SBAKW

##### 3.1. The ALIF Algorithm

In order to reduce the impact of the nonstationary and the wheelset’s rotation frequency components, adaptive local iterative filtering (ALIF) is used to process signals. ALIF is a novel time-frequency analysis algorithm, which is inspired by EMD [25]. The flowchart of ALIF is shown in Figure 3.

In this algorithm, the operator $\mathcal{L}$ denotes the moving average of the signal $f$, as shown in (19), where $w(x,t)$ denotes the low-pass filter constructed from the solution of the Fokker–Planck (FP) equation with mask length $l(x)$:

$$\mathcal{L}(f)(x) = \int_{-l(x)}^{l(x)} f(x+t)\, w(x,t)\, \mathrm{d}t. \tag{19}$$

The mask length is computed as a multiple of the distance between subsequent local minima and maxima of $f$. A continuously varying and smooth $l(x)$ is obtained by interpolating these distances and removing the high-frequency content from the interpolated line. As shown in Figure 3, ALIF consists of two loops. The outer loop collects the IMFs captured by the inner loop and decides, from the number of extrema, whether the decomposition can stop. The inner loop captures a single IMF component under a stopping criterion. In principle the inner loop should run until it converges, which is difficult to apply in practice. Therefore, a fixed threshold $\delta$ is set as the stopping criterion shown in (20), where $f_n$ is the *n*-th step of the $i$-th inner loop given in (21):

$$SD = \frac{\|f_{n+1} - f_n\|_2}{\|f_n\|_2} < \delta, \tag{20}$$

$$f_{n+1} = f_n - \mathcal{L}_{i,n}(f_n). \tag{21}$$

ALIF is well suited to separating the aforementioned two components. It follows the iterative framework of the EMD algorithm, but its moving average is the convolution between the signal and a low-pass filter. The low-pass filter, constructed from the solution of the FP equation, is compactly supported and tends to zero smoothly at both ends, which prevents artificial oscillations. In addition, the filter length in ALIF adapts to the signal, so nonstationary changes in the signal are captured more effectively. As a result, ALIF is more stable under perturbation.
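The inner loop, which repeatedly subtracts a moving average until the relative change falls below a threshold, can be sketched as follows. This is a deliberately simplified illustration, not ALIF itself: it uses a fixed mask length (plain iterative filtering rather than the adaptive, position-dependent length), a normalized Hann window as a stand-in for the Fokker–Planck filter, and periodic boundary handling. The names `moving_average` and `extract_imf` are hypothetical.

```python
import numpy as np

def moving_average(s, half_len):
    """Zero-phase moving average with a normalized Hann mask.

    The Hann window only stands in for the FP filter; like that filter it is
    compactly supported and goes to zero smoothly at both ends.
    """
    w = np.hanning(2 * half_len + 1)
    w /= w.sum()
    padded = np.pad(s, half_len, mode="wrap")   # periodic extension
    return np.convolve(padded, w, mode="valid")

def extract_imf(signal, half_len, delta=1e-3, max_iter=100):
    """Inner loop: subtract the moving average until the relative change
    between two consecutive iterates drops below delta."""
    s = np.asarray(signal, dtype=float).copy()
    for _ in range(max_iter):
        norm = np.linalg.norm(s)
        if norm == 0.0:
            break
        s_next = s - moving_average(s, half_len)
        sd = np.linalg.norm(s_next - s) / norm
        s = s_next
        if sd < delta:
            break
    return s

# Usage: a 100 Hz oscillation riding on a 3 Hz component, sampled at 1 kHz.
t = np.arange(1000) / 1000.0
high = np.sin(2 * np.pi * 100 * t)
low = np.sin(2 * np.pi * 3 * t)
imf1 = extract_imf(high + low, half_len=10)     # keeps the fast component
residual = (high + low) - imf1                  # slow component remains
```

The outer loop would repeat `extract_imf` on the residual with a longer mask to obtain the next, lower-frequency IMF, which is why the IMFs come out ordered from high to low frequency.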

##### 3.2. The Choice of Impulse-IMF

As presented in Section 1, to facilitate fault extraction, the nonstationary component and the wheelset’s rotation frequency component, both caused by the service conditions of the wheelset-bearing, should be removed first. After that, an appropriate signal component that mainly contains fault information should be chosen as the impulse-IMF for wavelet sparse representation. ALIF produces a series of IMFs in different frequency bands, arranged from high to low frequency. Because the fault information lies at high frequency, IMF1 is empirically chosen as the impulse-IMF. This effectively reduces or eliminates the influence of the aforementioned two low-frequency components, which benefits the feature extraction of the wheelset-bearing fault.

##### 3.3. Proposed ALIF-SBAKW

A novel feature extraction method for wheelset-bearing faults, ALIF-SBAKW, is proposed in this paper. The flowchart of ALIF-SBAKW is shown in Figure 4. ALIF-SBAKW mainly consists of the following steps:

Step 1: the collected vibration signal is decomposed into a series of IMFs by ALIF. The impulse-IMF, which contains the impulses together with noise, is selected from the IMFs.

Step 2: the IWCs are obtained by wavelet decomposition of the impulse-IMF. Generally, the number of decomposition levels can be set to 1 or 2 for a short signal. The noise level of the IWCs is estimated, and the adjusted gain *c* is specified. The IWCs are segmented, with maximal overlap, into a series of truncated signals of length *b*, which form the dataset. Then *k* columns of the dataset are randomly selected as the initial dictionary, and the sparse coefficient matrix is initialized to zeros.

Step 3: sparse representation is applied to the quantities prepared in Step 2. The dictionary and the sparse coefficient matrix are updated by SB and AK-SVD. After sparse representation, the dataset is reconstructed by multiplying the dictionary and the sparse coefficient matrix, and the IWCs are restored by tiling the reconstructed dataset.

Step 4: to strengthen the IWCs and enhance the impulse response in the original signal, the IWCs are convolved with a typical impulse. The extracted impulses of the vibration signal are then reconstructed by wavelet reconstruction from the convolved IWCs and the zeroed CDs.

#### 4. Simulation Validation

To illustrate and verify the effect of the proposed ALIF-SBAKW, a simulation is designed in this section. Following the analysis in Section 1, a simulated signal generated by bearing rotation is constructed as

$$x(t) = s(t) + h(t) + n(t), \tag{22}$$

where $s(t)$ denotes the impulse component generated by the bearing fault. It is expressed by (23), where $u(t)$ is the unit step function, $M$ is the total number of generated impulses, $A_j$ is the amplitude of the *j*-th impulse, $\beta$ denotes the structural damping coefficient, $f_n$ denotes the resonance frequency, and $T$, whose reciprocal is the fault characteristic frequency $f_c$, is the time interval between two adjacent impulses:

$$s(t) = \sum_{j=1}^{M} A_j e^{-\beta (t - jT)} \sin\left(2\pi f_n (t - jT)\right) u(t - jT). \tag{23}$$

The simulation parameters of (23) are listed in Table 1. The time-domain waveform of $s(t)$ is shown in Figure 5(a).

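**Figure 5.** (a) Time-domain waveform of the impulse component; (b) simulated signal; (c) Hilbert envelope spectrum of the simulated signal.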

The simulated interference $h(t)$ represents unknown interference with a nonstationary component generated during measurement. It can be described by a cosine function and its corresponding modulation function; $h(t)$ is written out directly in (24). In addition, $n(t)$ denotes the noise component of $x(t)$, described by Gaussian random noise with a standard deviation of 1. The power ratio between the impulse component and the total noisy simulated signal is −18.7008 dB.

With the sampling frequency and sampling time set to 10000 Hz and 1 s, respectively, the simulated signal $x(t)$ shown in Figure 5(b) is obtained. Figure 5(c) is the Hilbert envelope spectrum of $x(t)$. Figure 5(b) shows that the impulses are completely masked by the interference components, so the features of the bearing fault cannot be identified in the time domain. In addition, the fault characteristic frequency cannot be detected in the corresponding Hilbert envelope spectrum.
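A signal of this kind can be generated with a few lines of Python. The sketch below is illustrative only: the sampling frequency, duration, and unit noise standard deviation come from the text, but the impulse parameters (period, amplitude, damping, resonance frequency) are hypothetical placeholders because Table 1 is not reproduced here, each impulse is modeled as an exponentially decaying sinusoid as an assumption, and the interference term of (24) is left out.

```python
import numpy as np

def impulse_train(t, period, amplitude, damping, f_res):
    """Sum of exponentially decaying sinusoids, one starting every `period` s."""
    s = np.zeros_like(t)
    n_impulses = int(np.floor(t[-1] / period))
    for j in range(1, n_impulses + 1):
        tau = t - j * period
        on = tau >= 0                      # unit step: impulse j has started
        s[on] += (amplitude * np.exp(-damping * tau[on])
                  * np.sin(2 * np.pi * f_res * tau[on]))
    return s

fs, duration = 10_000, 1.0                 # from the text: 10000 Hz, 1 s
t = np.arange(int(fs * duration)) / fs

# Hypothetical parameters (NOT the values of Table 1).
period = 1 / 80.0                          # 80 Hz fault characteristic frequency
s = impulse_train(t, period, amplitude=1.0, damping=800.0, f_res=2000.0)

rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1.0, t.size)       # Gaussian noise, standard deviation 1
x = s + noise                              # interference term (24) omitted here
```

Adding a strong low-frequency modulated cosine to `x` would reproduce the situation described in the text, in which the impulses are no longer visible in the time domain.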

##### 4.1. The Setting Rule of Adjusted Gain *c*

To obtain good fault extraction with ALIF-SBAKW, selecting a suitable adjusted gain *c* is crucial. If *c* is too small, more false impulses and noise are extracted. If *c* is too large, too few fault impulses are extracted. In bearing fault diagnosis, how well the fault characteristic frequency and its harmonics stand out in the Hilbert envelope spectrum is an important evaluation criterion. Envelope spectrum kurtosis (ESK) is an effective index of the extracted fault features in the Hilbert envelope spectrum. A higher ESK implies clearer fault features with larger amplitudes in the envelope spectrum and a larger number of harmonics of the fault characteristic frequency [2]. Therefore, to select *c* more reasonably, the ESK of the signals extracted with different values of *c* in a given range is calculated, and the value of *c* with the highest ESK is selected.
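The ESK index can be computed as sketched below. This is one common way to define it, stated as an assumption: the envelope is the magnitude of the analytic signal (built here with an FFT), the envelope spectrum is the amplitude spectrum of the mean-removed envelope, and ESK is the kurtosis (fourth central moment over squared variance) of that spectrum. The definition used in [2] may differ in normalization, and all function names are hypothetical.

```python
import numpy as np

def hilbert_envelope(x):
    """Magnitude of the analytic signal, computed in the frequency domain."""
    x = np.asarray(x, dtype=float)
    n = x.size
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0               # double the positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * gain))

def envelope_spectrum(x, fs):
    """Frequencies and amplitude spectrum of the mean-removed envelope."""
    env = hilbert_envelope(x)
    env = env - env.mean()
    spec = np.abs(np.fft.rfft(env)) / env.size
    freqs = np.fft.rfftfreq(env.size, d=1.0 / fs)
    return freqs, spec

def esk(spec):
    """Kurtosis of the envelope spectrum amplitudes."""
    c = spec - spec.mean()
    return np.mean(c ** 4) / np.mean(c ** 2) ** 2

# Usage: a 1 kHz carrier amplitude-modulated at 50 Hz.
fs = 10_000
t = np.arange(fs) / fs
clean = (1 + 0.5 * np.cos(2 * np.pi * 50 * t)) * np.cos(2 * np.pi * 1000 * t)
freqs, spec = envelope_spectrum(clean, fs)
peak_frequency = freqs[np.argmax(spec)]
```

Selecting *c* then amounts to running the extraction for each candidate value, computing `esk` on the envelope spectrum of each result, and keeping the candidate with the largest value.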

##### 4.2. The Choice of Wavelet Base

Besides the adjusted gain *c*, the wavelet basis must also be chosen reasonably to obtain good wavelet decomposition and reconstruction. The Daubechies wavelet is compactly supported and has good regularity. It not only retains the peak feature of the impulse but also yields a smooth bearing fault signal. Therefore, it is well suited to the wavelet transform of bearing fault signals [30]. The support length of the wavelet also affects the analytical results. To avoid the boundary problem caused by an overlong support and the low-order vanishing moments caused by a short support, a filter length of 8 (i.e., the Daubechies 8-tap wavelet) is selected for wavelet decomposition in this paper.

##### 4.3. Simulation Results

According to the flowchart of ALIF-SBAKW, the key parameters are set as follows. The threshold of the stopping criterion in ALIF is set to 0.001. The level of wavelet decomposition is 1. The chosen wavelet basis is the Daubechies 8-tap wavelet. The adjusted gain *c* of the noise standard deviation is set to 4.2. The size of the dictionary is set to 10 × 40, which means that the length of each segmented signal is 10. The two parameters in SB are set to 100 and 1000, respectively. The chosen impulse-IMF is IMF1. The simulated signal (22) is then analyzed following the procedure of the ALIF-SBAKW method. The IMFs of (22) decomposed by ALIF are shown in Figure 6. The extracted fault features are shown in Figure 7(a), and the corresponding Hilbert envelope spectrum is shown in Figure 7(b). The fault impulses are clearly detected in Figure 7(a), in which the interference components are eliminated. The fault characteristic frequency and its harmonics are clearly visible in Figure 7(b). The simulation results show that the fault features are sufficiently extracted by the proposed method both in the time domain and in the Hilbert envelope spectrum.

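**Figure 6.** (a)–(e) IMFs of the simulated signal decomposed by ALIF.

**Figure 7.** (a) Fault features extracted by ALIF-SBAKW; (b) corresponding Hilbert envelope spectrum.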

##### 4.4. Performance Comparison

To further illustrate the advantages of the proposed method, two comparative methods are applied to the same simulated signal. First, wavelet sparse representation alone, i.e., SBAKW, is used to process the simulated signal directly. The adjusted gain *c* of the noise standard deviation is set to 24, and the other parameters are the same as above. The result of SBAKW is shown in Figure 8(a), and its Hilbert envelope spectrum is shown in Figure 8(b). Second, the OMP-KSVD algorithm, a well-known sparse representation algorithm, is applied to the IMFs decomposed by ALIF, i.e., ALIF-OMPK. In general, the OMP algorithm has two different stopping criteria: target sparsity and error goal. The size of the dictionary is set to 64 × 256 in ALIF-OMPK. The number of iterations is set to 10, and the chosen impulse-IMF is again IMF1. When target sparsity is adopted as the stopping criterion and set to 2, the result is shown in Figure 9(a) and its Hilbert envelope spectrum in Figure 9(b). When error goal is adopted as the stopping criterion and set to 54.7, the result is shown in Figure 10(a) and its Hilbert envelope spectrum in Figure 10(b).
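The two OMP stopping criteria mentioned above can be made concrete with a short sketch. This is a textbook orthogonal matching pursuit written for illustration, not the implementation used in the comparison; in particular, interpreting the error goal as a bound on the L2 norm of the residual is an assumption, and the function name `omp` and its parameters are hypothetical.

```python
import numpy as np

def omp(D, y, sparsity=None, error_goal=None):
    """Orthogonal matching pursuit with either stopping criterion.

    sparsity:   stop once this many atoms have been selected.
    error_goal: stop once the residual norm is at or below this value.
    """
    if sparsity is None and error_goal is None:
        raise ValueError("give a target sparsity or an error goal")
    D = np.asarray(D, dtype=float)
    y = np.asarray(y, dtype=float)
    max_atoms = min(D.shape) if sparsity is None else sparsity
    support = []
    coef = np.zeros(0)
    residual = y.copy()
    while len(support) < max_atoms:
        if error_goal is not None and np.linalg.norm(residual) <= error_goal:
            break
        k = int(np.argmax(np.abs(D.T @ residual)))   # best-matching atom
        if k in support:
            break                                    # no further progress
        support.append(k)
        # Least-squares fit of y on all atoms selected so far.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    if support:
        x[support] = coef
    return x

# Usage with a trivial identity dictionary.
D = np.eye(4)
y = np.array([0.0, 3.0, 0.0, -1.0])
x_sparse = omp(D, y, sparsity=1)        # keeps only the largest component
x_error = omp(D, y, error_goal=0.5)     # adds atoms until the residual is small
```

Target sparsity fixes the number of atoms per segment regardless of the noise level, whereas the error goal lets that number vary with how well the dictionary fits each segment.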

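**Figure 8.** (a) Result extracted by SBAKW; (b) its Hilbert envelope spectrum.

**Figure 9.** (a) Result extracted by ALIF-OMPK with target sparsity as the stopping criterion; (b) its Hilbert envelope spectrum.

**Figure 10.** (a) Result extracted by ALIF-OMPK with error goal as the stopping criterion; (b) its Hilbert envelope spectrum.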

Comparing SBAKW with ALIF-SBAKW, SBAKW extracts features of the interference, as shown in Figure 8(a). This is because sparse representation is more likely to detect the components with stronger energy, and in this simulated case the energy of the nonstationary component is much stronger than that of the fault features. The extracted features are clearly not fault features, and the fault characteristic frequency cannot be identified in the Hilbert envelope spectrum either. Therefore, SBAKW cannot detect the fault features of the bearing directly.

The results extracted by ALIF-OMPK under the two stopping criteria also reveal the fault characteristic information, as shown in Figures 9 and 10. However, they are less satisfactory than the results extracted by ALIF-SBAKW. ALIF-SBAKW removes far more of the noise between adjacent impulses and therefore denoises better than ALIF-OMPK. Additionally, although the fault characteristic frequency and its harmonics can be identified in the Hilbert envelope spectrum obtained with ALIF-OMPK, the amplitude of the fault characteristic frequency is weaker than that extracted by ALIF-SBAKW, and ALIF-SBAKW also extracts more harmonics. To compare the methods further, the ESK of each result is calculated as an indicator. The ESK values in Figures 7(b), 9(b), and 10(b) are 511.7, 336.4, and 477.8, respectively, which implies that the results extracted by ALIF-SBAKW are superior. Therefore, compared with ALIF-OMPK, the proposed ALIF-SBAKW method extracts richer fault information both in the time domain and in the Hilbert envelope spectrum.

#### 5. Experimental Validation

To further validate the effect of the proposed ALIF-SBAKW, experimental data of wheelset-bearing faults were obtained on the testing rig shown in Figure 11(a). The experiment was conducted in a project between Southwest Jiaotong University and CRRC Corporation. The axle box bearing running on the testing rig comes from a China Railway High-speed (CRH) vehicle, whose axle box uses a double-row tapered roller bearing. The testing rig consists of a motor, a loading device, a pair of driving wheels, a testing wheelset, and an axle box. The testing wheelset, which is driven by the driving wheels at the bottom, is supported by the axle box bearing. The driving power is delivered by the motor and conveyed to the driving wheels through rubber belts. The accelerometer for collecting vibration signals is mounted on the axle box, as shown in Figure 11(b), and the signal is collected at a sampling rate of 10 kHz. As shown in Figures 11(c) and 11(d), two typical bearing faults are introduced into the experiment: an outer-race fault and a roller fault. Their fault characteristic frequencies are calculated using (25) and (26), respectively:

$$f_o = \frac{Z}{2} f_r \left(1 - \frac{d}{D}\cos\alpha\right), \tag{25}$$

$$f_b = \frac{D}{2d} f_r \left(1 - \frac{d^2}{D^2}\cos^2\alpha\right), \tag{26}$$

where the bearing parameters $d$, $D$, $Z$, and $\alpha$ denote the roller diameter, the pitch diameter, the number of rollers, and the contact angle, respectively, and $f_r$ denotes the rotation frequency. The values of the bearing parameters are listed in Table 2. When the speed is 100 km/h, the corresponding rotation frequency is 10.3 Hz, which gives fault characteristic frequencies of 83.23 Hz for the outer-race fault and 33.69 Hz for the roller fault.
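The characteristic frequencies can be evaluated with a small helper. The sketch below assumes the standard kinematic formulas for a bearing with a stationary outer race: the outer-race frequency is (Z/2)·f_r·(1 − (d/D)·cos α) and the roller spin frequency is (D/(2d))·f_r·(1 − (d/D)²·cos² α). The function names are hypothetical, and the geometry in the usage lines is a placeholder, not the content of Table 2.

```python
import math

def outer_race_frequency(f_r, n_rollers, d, D, alpha):
    """Outer-race fault frequency for rotation frequency f_r (Hz).

    d: roller diameter, D: pitch diameter (same unit), alpha: contact angle (rad).
    """
    return 0.5 * n_rollers * f_r * (1.0 - (d / D) * math.cos(alpha))

def roller_frequency(f_r, d, D, alpha):
    """Roller (spin) fault frequency for rotation frequency f_r (Hz)."""
    ratio = (d / D) * math.cos(alpha)
    return 0.5 * (D / d) * f_r * (1.0 - ratio ** 2)

# Usage with placeholder geometry (NOT the values of Table 2).
f_r = 10.3                                  # rotation frequency at 100 km/h
f_outer = outer_race_frequency(f_r, n_rollers=20, d=25.0, D=180.0,
                               alpha=math.radians(10.0))
f_roller = roller_frequency(f_r, d=25.0, D=180.0, alpha=math.radians(10.0))
```

Both frequencies scale linearly with the rotation frequency, so the expected spectral lines move in proportion to the train speed.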

**Figure 11.** (a) Testing rig; (b) accelerometer mounted on the axle box; (c) outer-race fault; (d) roller fault.

In this section, the proposed ALIF-SBAKW method is used to analyze the collected vibration signals. To further validate the proposed ALIF-SBAKW and highlight its superiority, four comparative methods, namely, SBAKW, ALIF-OMPK (using target sparsity), EWT, and fast kurtogram [7], are applied to the same signals.

##### 5.1. Outer-Race Fault Experiment

In the outer-race fault experiment, the collected signal is shown in Figure 12(a) and its Hilbert envelope spectrum in Figure 12(b). The fault characteristic frequency and its harmonics cannot be identified clearly because of power-line interference and noise.

**Figure 12:** (a) collected signal of the outer-race fault experiment; (b) its Hilbert envelope spectrum.
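
The Hilbert envelope spectrum used throughout this section can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the analytic signal is built with an FFT, its magnitude gives the envelope, and the spectrum of the mean-removed envelope is returned. The test signal is a hypothetical amplitude-modulated carrier standing in for a resonance excited at the outer-race fault frequency; the 2 kHz carrier and the modulation depth are assumptions.

```python
import numpy as np


def envelope_spectrum(x, fs):
    """Return (freqs, amplitude) of the Hilbert envelope spectrum of x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Analytic signal in the frequency domain: keep DC (and Nyquist),
    # double the positive frequencies, zero the negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(np.fft.fft(x) * weights)
    envelope = np.abs(analytic)
    envelope -= envelope.mean()  # remove DC so it does not dominate
    amplitude = 2.0 * np.abs(np.fft.rfft(envelope)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amplitude


fs = 10_000                 # sampling rate from the text (Hz)
t = np.arange(fs) / fs      # one second of data
f_fault = 83.23             # outer-race fault frequency from the text (Hz)
carrier = 2_000.0           # assumed resonance frequency (Hz)
x = (1.0 + 0.5 * np.cos(2 * np.pi * f_fault * t)) * np.cos(2 * np.pi * carrier * t)

freqs, amp = envelope_spectrum(x, fs)
peak_freq = freqs[np.argmax(amp)]
print("dominant envelope frequency (Hz):", peak_freq)
```

With one second of data the frequency resolution is 1 Hz, so the dominant line falls in the bin nearest to the modulation frequency rather than exactly on it.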

The proposed ALIF-SBAKW method is used to analyze the collected signal in Figure 12(a). The parameters used in ALIF-SBAKW are mostly identical to those set in the simulation validation. The adjusted gain *c* and the segmented length *b* are set to 3 and 10, respectively. ALIF decomposes the signal into 6 IMFs, of which the first 4 are shown in Figures 13(a)–13(d). As before, IMF1 is selected as the impulse-IMF. The results of fault extraction are shown in Figures 13(e) and 13(f). The impulses are clearly extracted in the time domain. In addition, the outer-race fault characteristic frequency and its harmonics are clearly displayed in the Hilbert envelope spectrum, which indicates that a fault is present on the surface of the outer race.

**Figure 13:** (a)–(d) first four IMFs decomposed by ALIF; (e) extracted signal obtained by ALIF-SBAKW; (f) its Hilbert envelope spectrum.

To further validate the proposed ALIF-SBAKW, the four comparative methods are applied to the same signal. In SBAKW, the adjusted gain *c* of the noise standard deviation is set to 5.5, and the other parameters are identical to those set in ALIF-SBAKW. The results extracted by SBAKW are shown in Figure 14. In ALIF-OMPK, the size of the dictionary is set to 128 × 512, the sparsity is 2, and the number of iterations is set to 10. The signal extracted by ALIF-OMPK is shown in Figure 15(a), and its Hilbert envelope spectrum is shown in Figure 15(b). In EWT, the maximum number of segmented bands is set to 14. Because the spectrum is segmented into narrow frequency bands, most of the subband signals obtained by EWT contain no useful information. Among all bands, only the 13th subband signal carries some feature information. The analytical results are shown in Figure 16. In fast kurtogram, the highest level of decomposition is set to 6. The resulting kurtogram is shown in Figure 17(a). The centre frequency and the optimal level of the filter are 3945.3125 Hz and 6, respectively. The extracted results are shown in Figures 17(b) and 17(c).

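**Figure 14:** Results extracted by SBAKW, panels (a) and (b).

**Figure 15:** Results of ALIF-OMPK: (a) extracted signal; (b) its Hilbert envelope spectrum.

**Figure 16:** Results of EWT, panels (a)–(c).

**Figure 17:** Results of fast kurtogram: (a) kurtogram; (b) and (c) extracted results.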
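
The centre frequency reported for the fast kurtogram can be checked with a little arithmetic. The sketch below assumes the common dyadic convention in which level *k* splits the band from 0 to half the sampling rate into 2^k equal sub-bands. The function name `band_centre` and the band index are illustrative, not taken from the paper.

```python
def band_centre(fs, level, band_index):
    """Centre frequency (Hz) of sub-band `band_index` at dyadic `level`.

    Assumes level k divides [0, fs / 2] into 2**k equal bands.
    """
    bandwidth = fs / 2 ** (level + 1)
    return (band_index + 0.5) * bandwidth


fs = 10_000.0                       # sampling rate from the text (Hz)
level = 6                           # optimal level reported in the text
bandwidth = fs / 2 ** (level + 1)   # width of each sub-band at this level
band_index = int(3945.3125 // bandwidth)  # band containing the reported centre

print("bandwidth (Hz):", bandwidth)
print("band index:", band_index)
print("centre (Hz):", band_centre(fs, level, band_index))
```

Under this convention the reported centre frequency sits exactly in the middle of one level-6 sub-band, which is consistent with the stated level and sampling rate.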

Comparing SBAKW with ALIF-SBAKW, some vibration features can be extracted by SBAKW, but they mainly derive from the energy of the power-line interference. The characteristic frequency of the power-line interference can be detected in the Hilbert envelope spectrum, whereas the outer-race fault characteristic frequency is almost submerged in other frequency components and its harmonics cannot be extracted. This shows that the interference seriously hampers the extraction of the correct impulse features. With ALIF-OMPK, EWT, and fast kurtogram, the extracted signals are not purified, and the fault impulses are nearly drowned in noise in the time domain. Furthermore, the overall amplitude of the signals extracted by EWT and fast kurtogram is greatly reduced compared with the signal extracted by ALIF-SBAKW, which means that the fault extraction is impaired. In the Hilbert envelope spectra, the most significant difference is that the amplitudes of the fault characteristic frequency and its harmonics extracted by ALIF-OMPK, EWT, and fast kurtogram are generally weaker than those extracted by ALIF-SBAKW. For ALIF-OMPK, although more harmonics appear, the fault characteristic frequency itself is inconspicuous. For EWT, far fewer harmonics of the fault characteristic frequency appear than in Figure 13(f). For fast kurtogram, the fault characteristic frequency and its harmonics cannot be identified directly in Figure 17(c). Therefore, the proposed ALIF-SBAKW extracts the outer-race fault effectively and performs better than the four comparative methods.

##### 5.2. Roller Fault Experiment

In the roller fault experiment, the collected signal is shown in Figure 18(a) and its Hilbert envelope spectrum is shown in Figure 18(b). It should be noted that the even harmonics of the roller fault characteristic frequency are often dominant in the Hilbert envelope spectrum. This is because a defect on the rolling-element surface strikes both the inner race and the outer race, which excites two impulses per basic period [10, 31]. Therefore, the double fault characteristic frequency and its harmonics should be used as the diagnostic indices for the roller fault. Figure 18(a) shows that the impulses of the roller fault are almost overwhelmed in the time-domain waveform. Additionally, although the double fault characteristic frequency can barely be identified, its harmonics are buried in complex frequency components and thus cannot be detected.

**Figure 18:** (a) collected signal of the roller fault experiment; (b) its Hilbert envelope spectrum.

The collected signal is first processed by the proposed ALIF-SBAKW method. The adjusted gain *c* and the segmented length *b* are set to 4 and 10, respectively. The IMFs of the collected signal, decomposed by ALIF, are shown in Figures 19(a)–19(d) (again, only the first 4 IMFs are shown). The result of fault extraction is shown in Figure 19(e), and the corresponding Hilbert envelope spectrum is shown in Figure 19(f). The impulses caused by the roller fault are directly extracted, and the double fault characteristic frequency of the roller and its harmonics are clearly visible in the Hilbert envelope spectrum.

**Figure 19:** (a)–(d) first four IMFs decomposed by ALIF; (e) extracted signal obtained by ALIF-SBAKW; (f) its Hilbert envelope spectrum.

The four comparative methods are again used to process the collected signal. In SBAKW, the adjusted gain *c* of the noise standard deviation is set to 6, and the other parameters are the same as those set in ALIF-SBAKW. The analytical results are shown in Figure 20. In ALIF-OMPK, the parameters are the same as those used for ALIF-OMPK in Section 5.1. The extracted results are shown in Figure 21. In EWT, the detected boundaries of the Fourier spectrum are shown in Figure 22(a). Only the 17th subband signal carries some feature information, as shown in Figures 22(b) and 22(c). In fast kurtogram, the centre frequency and the optimal level of the filter are 1875 Hz and 5, respectively. The analytical results are shown in Figure 23.

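**Figure 20:** Results extracted by SBAKW: (a) extracted signal; (b) its Hilbert envelope spectrum.

**Figure 21:** Results of ALIF-OMPK: (a) extracted signal; (b) its Hilbert envelope spectrum.

**Figure 22:** Results of EWT: (a) detected boundaries of the Fourier spectrum; (b) 17th subband signal; (c) its Hilbert envelope spectrum.

**Figure 23:** Results of fast kurtogram, panels (a)–(c).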

The double fault characteristic frequency can be detected in the Hilbert envelope spectrum of the original collected signal because the energy of the fault impulses in the roller fault data is exceptionally strong. Nevertheless, the results of the four comparative methods are still not as good as those of ALIF-SBAKW. SBAKW extracts some vibration features, but, unlike the signal in Figure 19(e), it also extracts other unrelated features. This is due to power-line interference and noise, which make the fault characteristic frequency inconspicuous in Figure 20(b). For the remaining three methods, only the peaks of the impulses can be observed in the extracted signals, and noise remains in the intervals between adjacent impulses, in contrast to the signal in Figure 19(e).

In the Hilbert envelope spectra, Figures 20(b) and 22(c) show that the harmonics of the fault characteristic frequency can hardly be detected in the results of SBAKW and EWT. In Figures 21(b) and 23(c), although the amplitudes of some harmonics extracted by ALIF-OMPK and fast kurtogram are not the most prominent, a certain number of harmonics can still be identified. To further highlight the superiority of ALIF-SBAKW, several feature indicators are calculated to make a more direct comparison between ALIF-SBAKW, ALIF-OMPK, and fast kurtogram. The indicators are the crest factor (CF), the impulse factor (IF), the kurtosis [32], and the envelope spectrum kurtosis (ESK). The related parameters and calculated values are shown in Table 3.

In theory, higher values of CF, IF, and kurtosis indicate stronger impulse features in the time domain, while the ESK mainly reflects the richness of fault information in the Hilbert envelope spectrum. As shown in Table 3, every indicator calculated from the results of ALIF-SBAKW is higher than the corresponding value for the other two methods. Consequently, the comparison indicates that ALIF-SBAKW is superior to ALIF-OMPK and fast kurtogram. Overall, compared with SBAKW, ALIF-OMPK, EWT, and fast kurtogram, the proposed ALIF-SBAKW performs better to some degree.
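
The four indicators can be computed along the following lines. This sketch uses the usual textbook definitions of CF, IF, and kurtosis; the ESK is taken here as the kurtosis of the amplitude spectrum of the Hilbert envelope, which is an assumption, since the exact definitions are given only in Table 3. The two test signals are synthetic (Gaussian noise with and without hypothetical periodic impulses) and are unrelated to the experimental data.

```python
import numpy as np


def crest_factor(x):
    """Peak absolute value divided by the RMS value."""
    x = np.asarray(x, dtype=float)
    return np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2))


def impulse_factor(x):
    """Peak absolute value divided by the mean absolute value."""
    x = np.asarray(x, dtype=float)
    return np.max(np.abs(x)) / np.mean(np.abs(x))


def kurtosis(x):
    """Fourth central moment normalised by the squared variance."""
    x = np.asarray(x, dtype=float)
    centred = x - x.mean()
    return np.mean(centred ** 4) / np.mean(centred ** 2) ** 2


def envelope_spectrum_kurtosis(x):
    """Kurtosis of the Hilbert envelope amplitude spectrum (assumed definition)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # FFT-based analytic signal: keep DC/Nyquist, double positive frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    envelope = np.abs(np.fft.ifft(np.fft.fft(x) * weights))
    envelope -= envelope.mean()
    return kurtosis(np.abs(np.fft.rfft(envelope)))


rng = np.random.default_rng(0)
noise = rng.standard_normal(10_000)   # noise-only reference signal
impulsive = noise.copy()
impulsive[::120] += 20.0              # hypothetical periodic impulses

for name, sig in (("noise", noise), ("impulsive", impulsive)):
    print(name, crest_factor(sig), impulse_factor(sig),
          kurtosis(sig), envelope_spectrum_kurtosis(sig))
```

On such synthetic data, the time-domain indicators of the impulsive signal exceed those of the noise-only signal, which mirrors how the indicators are read in the comparison above.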

#### 6. Conclusions

The fault diagnosis of wheelset-bearings is of great significance to the safety of high-speed trains. Sparse representation is an advanced method for bearing fault extraction. However, it is hard for traditional sparse representation to extract faults under severe service conditions, especially in the complicated track-vehicle system. If the energy of the interference is stronger than the energy of the fault features, the interference component, instead of the fault features, will be detected and extracted. Therefore, the ALIF-SBAKW is proposed in this paper. There are two reasons why this method can solve the problem. On the one hand, ALIF can effectively reduce or eliminate the nonstationary components and the wheelset's rotation-frequency components (caused by the severe service conditions of the high-speed train), which facilitates fault extraction. On the other hand, the wavelet sparse representation can uncover the intrinsic features of the signal and extract the wheelset-bearing fault. The ALIF-SBAKW method is validated by simulated and experimental signals. The results show that the method is well suited to the fault feature extraction of wheelset-bearing signals and, in the experimental validation, outperforms SBAKW, ALIF-OMPK, EWT, and fast kurtogram. To some extent, this method can effectively accomplish the fault diagnosis of wheelset-bearings.

Finally, although the ALIF-SBAKW method can effectively extract fault features, it cannot yet be applied effectively to the separation of multiple faults. Further research is needed to address this limitation. In addition, the fault extraction method proposed by Qin et al. in [23] is an excellent technique. In future research on the separation of multiple faults, a more comprehensive comparison with the method in [23] will be conducted in terms of component separation and fault extraction.

#### Appendix

The derivation of the simplification from (7) to (8) and (9) is outlined briefly here. According to the principle of SB [28], (7) is solved by a Bregman iteration, including the updates (A.2)–(A.3), in which the auxiliary variables lie in the subgradient of the objective function at the current iterates and enter the minimization through inner products. Defining an auxiliary variable from these subgradients, (A.2)–(A.3) can be rewritten in terms of it.

Both update equations then reduce to the same iterative equation, i.e., (9) in Section 2.2. Finally, the minimization step can be simplified to a form that differs from (8) only by constant terms involved in the simplification. These constants do not affect the minimizer and can be ignored, which yields (8).
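
For orientation, the generic split Bregman scheme on which this kind of simplification rests can be written as follows. The notation (variables $u$, $d$, operator $\Phi$, data term $H$, penalty $\lambda$, subgradients $p_u$, $p_d$, auxiliary variable $b$) is that of the generic formulation and is an assumption here; it is not necessarily the notation of (7)–(9) or (A.1)–(A.3).

```latex
\begin{aligned}
&\min_{u,d}\ \|d\|_1 + H(u) \quad \text{s.t.} \quad d = \Phi u, \\[4pt]
&(u^{k+1}, d^{k+1}) = \arg\min_{u,d}\ \|d\|_1 + H(u)
   - \langle p_u^{k},\, u - u^{k} \rangle
   - \langle p_d^{k},\, d - d^{k} \rangle
   + \tfrac{\lambda}{2}\,\|d - \Phi u\|_2^2, \\
&p_u^{k+1} = p_u^{k} - \lambda\, \Phi^{\mathsf T}\!\left(\Phi u^{k+1} - d^{k+1}\right), \\
&p_d^{k+1} = p_d^{k} - \lambda \left(d^{k+1} - \Phi u^{k+1}\right), \\[4pt]
&\text{and, with } p_d^{k} = \lambda b^{k},\ p_u^{k} = -\lambda\, \Phi^{\mathsf T} b^{k}: \\
&(u^{k+1}, d^{k+1}) = \arg\min_{u,d}\ \|d\|_1 + H(u)
   + \tfrac{\lambda}{2}\,\|d - \Phi u - b^{k}\|_2^2, \\
&b^{k+1} = b^{k} + \Phi u^{k+1} - d^{k+1}.
\end{aligned}
```

Substituting the two expressions for $p_u^{k}$ and $p_d^{k}$ turns both subgradient updates into the single update for $b$, and completing the square changes the objective only by terms that do not depend on $u$ or $d$, which is why such constants can be dropped in the minimization.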

#### Data Availability

The experimental data are based on the CRRC project and are confidential.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 51905453) and the China Postdoctoral Science Foundation (no. 2019M663899XB).