#### Abstract

Rolling element bearings are among the most commonly used supporting parts in rotating machinery and also among the most failure-prone, so effective fault diagnosis methods for them are of great safety and economic significance. In practical engineering, the fault characteristic signal of a rolling bearing is often contaminated by interference signals, and the situation is much more serious when the faulty bearing is located in a gearbox. In addition, only a limited number of measuring points can be used during signal acquisition because of restrictions on sensor installation. Together, these two factors often turn rolling bearing fault diagnosis into an underdetermined blind source separation problem. Most existing blind source separation methods require the source signals to be independent and non-Gaussian. Unlike these traditional methods, sparse component analysis (SCA), which originates from sparse representation, is effective for underdetermined blind source separation because it does not require independence or non-Gaussianity and relies only on the sparsity of the signals to extract the source signals from the observed signals. On this basis, an SCA method based on linear clustering (LC), named LC-SCA, is proposed for underdetermined blind source separation of rolling element bearing vibration signals; LC is introduced into SCA to improve its computational efficiency. The effectiveness of the proposed method is verified by simulation and experiment, and its superiority is verified by comparison with related methods such as constrained independent component analysis (cICA) and SCA.

#### 1. Introduction

As the key and most commonly used supporting part in modern high-speed, large-scale rotating machinery, the rolling element bearing is central to machine health monitoring, and its effective fault diagnosis is of great safety and economic significance. In practical engineering, the collected vibration signals of a rolling bearing usually come from multiple sources, so fault diagnosis of rolling bearings is, to some extent, a blind source separation problem. In recent years, various blind source separation methods [1–5] and other advanced methods [6, 7] for analyzing vibration signals of rolling element bearings have emerged. However, most of these methods address determined problems (i.e., with as many sensors as sources) or overdetermined problems (i.e., with more sensors than sources). In most engineering applications, rolling bearing fault diagnosis is an underdetermined blind source separation problem (i.e., with more sources than sensors) because of restrictions on sensor installation. Underdetermined blind source separation of rolling bearing vibration signals is therefore an active and difficult research topic.

Unlike traditional blind source separation methods, SCA, which originates from sparse representation, does not require the source signals to be independent or non-Gaussian; it relies only on the sparsity of the signals to extract the source signals from the observed signals. SCA therefore has great application potential in underdetermined blind source separation, and research on blind source separation methods based on sparse representation has been increasing in recent years. To address the difficulty that conventional blind separation methods have with complex operating conditions, a blind source separation method for composite bearing vibration signals that combines low-rank and sparse decomposition was proposed by considering bearing faults from the perspective of signal rank and sparsity [8]. A two-channel blind source separation method based on a sparse learning strategy, which takes advantage of the sparse nature of the acoustic path impulse responses of the mixing model, was designed to enhance speech quality [9]. SCA was used in underdetermined blind modal identification of structures from earthquake and ambient vibration measurements [10], and its performance was investigated with a synthetic example and an experiment. The links between SCA and independent component analysis (ICA) were studied in [11], and a new optimization framework combining the advantages of both was proposed. A feedback mechanism was combined with sparse component analysis in [12], yielding a blind source separation algorithm named feedback sparse component analysis for separating mixed images. A method based on sparse component analysis was proposed to estimate modal parameters [13] and was confirmed by an experiment conducted on a column beam.
To address the main issues of solution-space convergence and separation quality in existing nonnegative matrix factorization (NMF), an adaptive parameterized hybrid kernel based sparse NMF algorithm was proposed for blind source separation [14]. In [15], a new algorithm for approximately estimating the mixing matrix A was proposed, which addressed major problems of underdetermined sparse component analysis in (semi)blind source separation. A two-stage sparse representation approach to underdetermined blind source separation, including precise estimation of the unknown mixing matrix and of the source matrix, was proposed in [16], and the theoretical results were illustrated by simulation. A four-step blind source separation method based on sparse features was proposed for the fault signals of continuous mills, and it separated the complex signals into independent status signals [17]. A block-based approach coupled with an adaptive dictionary was presented for underdetermined blind speech separation; the algorithm, derived as a multistage method, reformulates the underdetermined blind source separation problem as a sparse coding problem [18]. A decentralized modal identification method using parallel factor decomposition and sparse blind source separation was proposed in [19]. The SCA-based underdetermined blind source separation method of [13] was also applied to estimate the time-varying modal parameters of a beam. Although several kinds of underdetermined blind source separation methods have emerged, as reviewed above, most of them focus on other areas of signal processing, such as audio and image signals, and very few address fault diagnosis of rotating machinery.
On this basis, a sparse component analysis method based on linear clustering, named LC-SCA, is proposed for underdetermined blind source separation of rolling bearing vibration signals. The paper is organized as follows: Section 1 is the introduction, and Section 2 presents the theory of the proposed method. Sections 3 and 4 present the simulation and the experiments used to verify the effectiveness of the proposed method. Section 5 presents a comparison study to verify the advantage of the proposed method. Conclusions are drawn in Section 6.

#### 2. Basic Theory

SCA is a relatively new blind source separation technique. Many real-world signals are sparse. Unlike ICA, SCA does not require the signals to be independent or non-Gaussian; it makes full use of their sparsity to extract the source signals from the mixed signals. SCA has been widely used in spectral estimation, data mining, medical image processing, and so on [20–24]. In this paper, an SCA method based on linear clustering, named LC-SCA, is proposed for underdetermined blind source separation. Compared with other blind source separation methods such as SCA and cICA, it has the advantages of a simple calculation theory and a more efficient separation result.

##### 2.1. Basis of SCA

The basic model of SCA is as follows:

$$X = AS, \tag{1}$$

where $X \in \mathbb{R}^{m \times N}$ represents the observed mixed signals and $A \in \mathbb{R}^{m \times n}$ represents the mixing matrix, which does not need to be sparse. The source signals are expressed as $S \in \mathbb{R}^{n \times N}$ and should be sparse. The target of SCA is to separate the sparse source signals from the observed mixed signals without knowing the mixing matrix or the source signals.

The following two concepts are introduced first:

(1) Vector sparsity: for a vector $v \in \mathbb{R}^{n}$, if the number of zeros among its elements is $k$, then the sparsity of $v$ is $k$.

(2) Matrix sparsity: for a matrix $S$, if the vector sparsity of every column of $S$ is at least $k$, then the sparsity of $S$ is $k$.

Georgiev et al. [20] proposed and proved the following two conditions under which SCA can reconstruct the source signals completely in blind source separation:

(1) Any submatrix of size $m \times m$ of the mixing matrix $A$ is nonsingular.

(2) The sparsity of the source signal matrix $S$ is at least $n - m + 1$.
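The two sparsity definitions and the two recovery conditions can be checked numerically. The following is a minimal sketch, assuming the model X = AS with a mixing matrix A of size m × n and assuming that the required sparsity bound is n − m + 1 (each column of S has at most m − 1 nonzero entries). The function names and the tolerance `tol` are illustrative choices, not part of the original method.

```python
from itertools import combinations

import numpy as np


def vector_sparsity(v, tol=1e-12):
    """Number of (near-)zero entries of the vector v."""
    return int(np.sum(np.abs(v) <= tol))


def matrix_sparsity(S, tol=1e-12):
    """Smallest column sparsity: every column of S has at least this many zeros."""
    return min(vector_sparsity(S[:, t], tol) for t in range(S.shape[1]))


def satisfies_sca_conditions(A, S, tol=1e-12):
    """Check the two recovery conditions for an m x n mixing matrix A (assumed form)."""
    m, n = A.shape
    # Condition (1): every m x m submatrix of A is nonsingular.
    cond1 = all(
        np.linalg.matrix_rank(A[:, list(cols)]) == m
        for cols in combinations(range(n), m)
    )
    # Condition (2): the sparsity of S is at least n - m + 1.
    cond2 = matrix_sparsity(S, tol) >= n - m + 1
    return cond1 and cond2
```

The exhaustive submatrix check grows combinatorially with n, so it is only practical for small mixing matrices such as the 2 × 5 and 2 × 3 cases considered in this paper.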

##### 2.2. LC-SCA

Theorem 1. *For a complete set of source signals $S$ with sparsity $n - 1$, the column vectors of the mixed signals $X$ cluster linearly along the directions of the column vectors of the mixing matrix $A$.*

*Proof.* Assume that, in the source signal matrix $S$, only the $i$th source signal is nonzero at moment $t$; that is, the following relationships hold:

$$s_i(t) \neq 0, \qquad s_j(t) = 0 \quad (j \neq i), \tag{2}$$

and then $x(t) = a_i s_i(t)$, so the column vector $x(t)$ is collinear with the column vector $a_i$. It follows that all columns of the mixed signals $X$ satisfying (2) are collinear with the column vector $a_i$ of the mixing matrix. The direction of each linear clustering center of the columns of the mixed signals therefore determines the direction of one column vector of the mixing matrix, and the number of clusters formed by the mixed vectors along linear directions is the number of columns of the mixing matrix $A$.
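Theorem 1 can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical 2 × 3 random mixing matrix `A` and synthetic sources in which exactly one source is active at every sample; none of these values come from the paper. Each mixed column is then expected to be collinear with the mixing-matrix column of its active source.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, N = 2, 3, 300                    # sensors, sources, samples (illustrative)
A = rng.normal(size=(m, n))            # hypothetical mixing matrix

# Exactly one active source per sample, i.e. source sparsity n - 1.
active = rng.integers(0, n, size=N)
S = np.zeros((n, N))
S[active, np.arange(N)] = rng.uniform(0.5, 2.0, size=N) * rng.choice([-1.0, 1.0], size=N)

X = A @ S                              # observed mixtures


def abs_cosine(x, a):
    """Absolute cosine of the angle between x and a (1.0 means collinear)."""
    return abs(x @ a) / (np.linalg.norm(x) * np.linalg.norm(a))


# Compare every mixed column with the mixing-matrix column of its active source.
cos_vals = np.array([abs_cosine(X[:, t], A[:, active[t]]) for t in range(N)])
```

Plotting the columns of `X` as points in the plane would show them lying on n straight lines through the origin, one per column of `A`.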

###### 2.2.1. Estimation of Mixed Matrix

Because the source signals form a complete set with sparsity $n - 1$, the column vectors of the mixed signals cluster along the directions of the column vectors of the mixing matrix, as stated in Theorem 1. The mixing matrix can therefore be obtained by the following linear clustering method:

(1) Direction unification: each column $x(t)$ of the mixed signal matrix whose orientation is opposite to the chosen reference orientation is replaced by $-x(t)$, so that collinear columns of opposite sign point in the same direction.

(2) Linear clustering: for any two column vectors $x(t_1)$ and $x(t_2)$ of the mixed signals, if $x(t_1)/\|x(t_1)\| = x(t_2)/\|x(t_2)\|$, then $x(t_1)$ and $x(t_2)$ are collinear; all the columns of $X$ are clustered in this way.

(3) Clustering center calculation: suppose that class $j$ contains $N_j$ elements in total; its clustering center vector is calculated as $c_j = \frac{1}{N_j}\sum_{x(t) \in \text{class } j} x(t)$.

(4) Mixing matrix estimation: the direction of each clustering center vector calculated by the above steps is the direction of the corresponding column vector of the mixing matrix. Because the source signals are allowed to be scaled, one may take $\hat{a}_j = c_j$.
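The four steps above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the sign convention (first component made nonnegative), the cosine threshold `cos_tol` used in place of exact collinearity, the greedy seed-based clustering, and the unit-norm scaling of the centers are all choices made here for the example.

```python
import numpy as np


def estimate_mixing_matrix(X, cos_tol=0.999, min_norm=1e-8):
    """Estimate unit-norm mixing directions from X (m x N) by linear clustering."""
    # Drop near-zero columns, which carry no direction information.
    norms = np.linalg.norm(X, axis=0)
    keep = norms > min_norm
    cols = X[:, keep] / norms[keep]

    # Step 1, direction unification (assumed convention): make the first
    # component nonnegative. Columns whose first component is exactly zero
    # are left as they are, which is a limitation of this simple rule.
    flip = cols[0, :] < 0
    cols[:, flip] *= -1.0

    # Step 2, linear clustering: a column joins the first cluster whose seed
    # direction it matches within the cosine tolerance.
    clusters = []
    for x in cols.T:
        for members in clusters:
            if members[0] @ x >= cos_tol:
                members.append(x)
                break
        else:
            clusters.append([x])

    # Steps 3 and 4: cluster centers, rescaled to unit norm, are the columns.
    centers = [np.mean(members, axis=0) for members in clusters]
    return np.stack([c / np.linalg.norm(c) for c in centers], axis=1)
```

With noisy data, small clusters produced by samples in which several sources are active at once would normally be discarded before step 3; that filtering is omitted here for brevity.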

###### 2.2.2. Source Signals Estimation

The source signals can be solved from the mixing matrix estimated above and the observed mixed signals as follows.

For each column $x(t)$ of the mixed signal matrix $X$, if $x(t)$ and $\hat{a}_i$ are collinear, that is, $x(t) = \lambda \hat{a}_i$ for some scalar $\lambda$, then

$$\hat{s}_i(t) = \lambda, \qquad \hat{s}_j(t) = 0 \quad (j \neq i). \tag{3}$$

For the details of this process and SCA algorithm, refer to [22–24].
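The source estimation step can be sketched as follows, assuming the mixing matrix has already been estimated. A column of the mixed signals that is collinear with an estimated column is assigned entirely to the corresponding source, with the scale factor obtained by projection. The cosine tolerance `cos_tol` and the decision to leave non-collinear columns at zero are simplifying assumptions of this sketch; the full SCA procedure for such columns is described in [22–24].

```python
import numpy as np


def estimate_sources(X, A_hat, cos_tol=0.999, min_norm=1e-8):
    """Recover sources column by column, assuming one active source per column."""
    m, N = X.shape
    n = A_hat.shape[1]
    A_unit = A_hat / np.linalg.norm(A_hat, axis=0)
    S_hat = np.zeros((n, N))
    for t in range(N):
        x = X[:, t]
        nx = np.linalg.norm(x)
        if nx <= min_norm:
            continue  # silent sample: all sources stay zero
        cosines = (A_unit.T @ x) / nx
        i = int(np.argmax(np.abs(cosines)))
        if abs(cosines[i]) >= cos_tol:
            # x = lam * a_i  =>  lam = (a_i . x) / (a_i . a_i)
            a = A_hat[:, i]
            S_hat[i, t] = (a @ x) / (a @ a)
    return S_hat
```

Because the columns of the estimated mixing matrix are only known up to scale and order, the recovered sources are likewise only determined up to scale and permutation.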

#### 3. Simulation

In this section, a simulation is carried out to verify the effectiveness of the proposed method. The mathematical expressions of the five original signals are presented in equations (4)–(8), and their time-domain waveforms are shown in Figure 1. The first and third signals are modulated signals, the second and fourth signals are periodic signals, and the fifth signal is an impulsive signal. The sampling frequency is set to $f_s$ = 1024 Hz. To verify the underdetermined blind source separation ability of LC-SCA, let *X* = *H* ∗ *S* represent the two observed signals. The mixing matrix *H*, with 2 rows and 5 columns, is generated randomly in MATLAB and is given in equation (9), and *S* is composed of the five original signals, as expressed in equation (10).

The time-domain waveforms of the two observed signals *X* are shown in Figure 2, from which the time-domain features of the five original signals cannot be identified. *X* is input into the LC-SCA model, and the time-domain waveforms of the five separated signals are shown in Figure 3. A visual comparison of Figures 1 and 3 indicates that the separation is satisfactory. Equation (11) is used to quantify the separation effect, where $\hat{S}$ represents the five separated signals, *S* represents the five original signals, and *C* represents the cross-correlation result between $\hat{S}$ and *S*, whose calculated value is given in equation (12).

The diagonal values of the matrix *C* (marked in red) are the cross-correlation coefficients between each of the five original signals and its corresponding separated signal, and they quantify the capability of the proposed method for underdetermined blind source separation.
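A cross-correlation matrix of this kind can be computed with a few lines of NumPy. The following is a minimal sketch, assuming that the entries of *C* are Pearson correlation coefficients between the rows of the original and separated signal matrices; the exact form of equation (11) may differ, and the function name is illustrative.

```python
import numpy as np


def cross_correlation_matrix(S, S_hat):
    """C[i, j] is the Pearson correlation between S[i] and S_hat[j]."""
    n = S.shape[0]
    # np.corrcoef stacks the rows of both inputs; keep only the cross block.
    return np.corrcoef(S, S_hat)[:n, n:]
```

Since blind source separation recovers sources only up to sign, scale, and order, it is usually the absolute values of the coefficients that are compared, after matching each separated signal to its best-correlated original.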

#### 4. Experiment

In this section, two experiments are carried out to verify the effectiveness of the proposed method.

##### 4.1. Experiment 1

In the first experiment, vibration data are collected for three states of the rolling bearing (inner race fault, outer race fault, and normal). The test rig is shown in Figure 4. The two ends of the rotor are supported by two rolling element bearings, one of which can be replaced conveniently with the test bearing during the experiment. The test rig is equipped with a hydraulic positioning and clamping device to fix the outer race of the test bearing. The test rig is driven by an AC motor, which drives the rotor through a coupling.

An acceleration sensor of type 8791A250 is used in the experiment; it is lightweight and insensitive to temperature transients. The sensor is fixed on the outer race of the test bearing by wax mounting, and the installation diagram is shown in Figure 5.

Pitting faults are machined on the inner race and the outer race of two different test bearings, respectively, using electrical discharge machining (EDM). All the test bearings are of type GB6023, and the parameters are given in Table 1. During the experiment, the outer race of the test bearing is fixed on the test bench and the inner race rotates synchronously with the shaft. The rotating speed of the shaft is 720 r/min; that is, the rotating frequency is $f_r$ = 12 Hz. The inner race and outer race fault characteristic frequencies (FCFs) of the test bearing are calculated using equations (13) and (14), which give an inner race FCF of $f_i$ = 51.9 Hz and an outer race FCF of $f_o$ = 32.1 Hz. The sampling frequency is set to $f_s$ = 12.8 kHz.
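A minimal sketch of this FCF calculation follows, assuming equations (13) and (14) take the standard kinematic form for a bearing with a fixed outer race, $f_i = \frac{Z}{2} f_r \left(1 + \frac{d}{D}\cos\alpha\right)$ and $f_o = \frac{Z}{2} f_r \left(1 - \frac{d}{D}\cos\alpha\right)$, where $Z$ is the number of rolling elements, $d$ the rolling element diameter, $D$ the pitch diameter, and $\alpha$ the contact angle. The geometry used in the usage note is hypothetical: it is not taken from Table 1 and was chosen only because it reproduces the reported 51.9 Hz and 32.1 Hz at 12 Hz.

```python
import math


def bearing_fault_frequencies(f_r, n_rollers, d, D, contact_angle_deg=0.0):
    """Inner and outer race fault frequencies (Hz) for a fixed outer race."""
    ratio = (d / D) * math.cos(math.radians(contact_angle_deg))
    f_inner = 0.5 * n_rollers * f_r * (1.0 + ratio)
    f_outer = 0.5 * n_rollers * f_r * (1.0 - ratio)
    return f_inner, f_outer


# Hypothetical geometry (NOT from Table 1): 7 rolling elements, d/D = 0.2357.
f_inner, f_outer = bearing_fault_frequencies(f_r=12.0, n_rollers=7, d=0.2357, D=1.0)
```

Under these formulas the two frequencies always sum to $Z f_r$, which gives a quick consistency check: 51.9 Hz + 32.1 Hz = 84 Hz = 7 × 12 Hz.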

The time-domain waveforms of the three states of the test bearing (inner race fault, outer race fault, and normal) are shown in Figure 6, and the corresponding envelope demodulation spectra are shown in Figure 7, from which the FCFs are clearly extracted (note: random sliding between the rollers and the raceway causes the difference between the theoretical and the actual FCFs). To verify the separation ability of the proposed method in the underdetermined case, a matrix *H*′ of 2 rows and 3 columns is generated randomly in MATLAB; its expression is given in equation (15). Let *S*′ represent the signals of the three states of the test bearing; the two observed signals *X*′ are then obtained as shown in equation (16).
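The envelope demodulation spectra used throughout this section can be sketched as follows. The paper does not state how the envelope is computed; this sketch assumes the common Hilbert-transform approach, implemented here with NumPy FFTs only, and the function name is illustrative.

```python
import numpy as np


def envelope_spectrum(x, fs):
    """Return (freqs, amplitude) of the envelope spectrum of x sampled at fs."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Analytic signal: keep DC, double positive frequencies, zero negative ones.
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    if n % 2 == 0:
        weights[0] = weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[0] = 1.0
        weights[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(spectrum * weights)
    # Envelope is the magnitude of the analytic signal; remove its mean so
    # that the DC line does not dominate the spectrum.
    envelope = np.abs(analytic)
    envelope = envelope - envelope.mean()
    amplitude = np.abs(np.fft.rfft(envelope)) * 2.0 / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amplitude
```

For a fault signal, peaks are expected at the FCF and its harmonics; in practice the signal is often band-pass filtered around a resonance band before demodulation, a step omitted in this sketch.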

The time-domain waveforms of the two observed signals *X*′ are shown in Figure 8, and the corresponding envelope demodulation spectra are shown in Figure 9. The spectral lines in Figure 9 are chaotic: the inner race FCF can be identified, but the outer race FCF cannot. The observed signals *X*′ are input into the LC-SCA model, and the three separated signals *S*″ and their envelope demodulation spectra are shown in Figures 10 and 11, respectively. Comparing Figure 10 with Figure 6 shows that the separation result is satisfactory in the time domain, which is further confirmed by Figure 11, where both the outer race FCF and the inner race FCF are clearly extracted. As in the simulation, the cross-correlation values between *S*′ and *S*″ defined in equation (17) are calculated to quantify the separation effect.

In the calculated matrix *C*′, the diagonal values (marked in red) are the cross-correlation coefficients between *S*′ and *S*″, which further quantify the capability of the proposed method for underdetermined blind source separation.

##### 4.2. Experiment 2

The test rig of experiment 2 is shown in Figure 12. It is composed of a transmission platform, a control panel, and a data acquisition system. The transmission platform consists of a variable frequency motor, a gearbox, and a magnetic powder brake. The control panel consists of a frequency converter and a tension controller, which are used to adjust the input speed of the motor and the loading torque of the magnetic powder brake. The parameters of the main components in the transmission line are as follows:

(1) Variable frequency motor. Type: YVP80M1; rated power: 0.55 kW; rated speed: 1400 r/min; rated torque: 3.5 N·m; rated current: 1.6 A; rated frequency: 50 Hz.

(2) Gearbox. The gearbox is a two-shaft single-stage transmission device composed of a pair of standard spur gears. The tooth numbers of the two gears are $z_1$ = 28 and $z_2$ = 39, respectively, and the module is 2, so the transmission ratio of the pair of gears follows from the tooth ratio 28:39. The structure of the gearbox is shown in Figure 13.

(3) Magnetic powder brake. Type: FZ5.J; rated torque: 5 N·m; rated speed: 1500 r/min.

When sensors are used to collect vibration signals, the measuring points should be as close as possible to the place where the vibration occurs. Because the vibration inside the gearbox is mainly transmitted through the shafts, four measuring points are arranged at the four bearing positions of the two shafts, as shown in Figure 13, in order to capture the vibration inside the gearbox more faithfully. An acceleration sensor is installed at each measuring point (the same 8791A250 accelerometer as in experiment 1) to collect the vibration signal, and the actual sensor installation is shown in Figure 14. The sampling frequency is set to 25.6 kHz; each group of data contains four channels with a length of 5 s, i.e., 4 ∗ 128000 points.

The main focus of this experiment is the rolling bearing fault arising in gearbox, and the gears used in the test are in normal state. The corresponding fault combination is shown in Table 2.

All the test bearings are of type 6023, the same as in experiment 1, and their structural parameters and characteristic frequencies are given in Tables 1 and 3. The signals of two faulty bearings are collected for analysis with the gears in the normal state: the bearing with an inner race pitting fault is installed at measuring point 2, and the bearing with an outer race fault is installed at measuring point 3. Pictures of the two faulty bearings are shown in Figure 15. The rotating frequency of the input shaft is $f_r$ = 10.4 Hz, and the load of the magnetic powder brake is 3 N·m. The FCFs and the gear mesh frequency are calculated and given in Table 4.

The time-domain waveforms of the vibration signals collected at measuring points 1, 2, 3, and 4 are shown in Figure 16, and their envelope demodulation spectra are shown in Figure 17. The structures of the four envelope spectra in Figure 17 are almost the same, and neither the inner race FCF nor the outer race FCF of the test bearings can be identified. The reason is that the four signals are complex: gear meshing components, the shaft rotating component with its harmonics, and the rolling bearing fault components are all contained in them, so envelope demodulation spectral analysis alone does not work effectively. To use this experiment to verify the proposed method in the underdetermined case, in keeping with the subject of the paper, only the signals collected at measuring points 2 and 3 are taken as the two observed signals, because these two signals are closest to the fault sources and contain more fault characteristic components. The two observed vibration signals are input into the LC-SCA model, and the blind source separation results are shown in Figure 18. Envelope demodulation spectral analysis is then applied to each separated signal in Figure 18, and the results are presented in Figure 19, from which both the inner race and outer race FCFs are extracted.

#### 5. Comparison

##### 5.1. Comparison 1

In this section, the signals shown in Figure 16 are analyzed using SCA to verify the advantages of the proposed method. Figure 20 shows the four signals separated by SCA, and their envelope demodulation spectra are presented in Figure 21. Neither the inner race nor the outer race fault characteristic frequency can be identified in Figure 21, which verifies one advantage of the proposed method. In addition, the calculation times of the proposed method and SCA on the same computer are about 35 s and 50 s, respectively, which verifies the higher computational efficiency of the proposed method.

##### 5.2. Comparison 2

In this section, the cICA method [25] is used for comparison to verify the advantage of the proposed method. The cycles (numbers of points) of the two square-wave reference signals (the inner race fault reference and the outer race fault reference) could be set as *T* = 1/FCF ∗ fs = 104 based on the works of [25], so the cycle of the inner race fault reference is *T*1 = 1/51.5 ∗ 25600 = 497, and the cycle of the outer race fault reference is *T*2 = 1/22.4 ∗ 25600 = 1143. The inner race fault and outer race fault reference signals are then constructed from *T*1 and *T*2, and they are shown in Figures 22(a) and 22(d). Following the blind source extraction process described in [25], the reference signal in Figure 22(a) and the two observed signals in Figure 16 (the signals at measuring points 2 and 3, as in experiment 2) are first input into the cICA model, and the corresponding output signal is shown in Figure 22(b). Similarly, the reference signal in Figure 22(d) and the same two observed signals are input into the cICA model, and the corresponding output signal is shown in Figure 22(e). Envelope demodulation spectral analysis is applied to the output signals in Figures 22(b) and 22(e), and the results are shown in Figures 22(c) and 22(f), respectively. It is evident that the inner race and outer race FCFs cannot be extracted by the cICA method, which verifies the advantage of the proposed method over cICA.
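The reference-signal cycles quoted above follow directly from the sampling frequency and the FCFs. The following is a minimal sketch of that arithmetic and of one possible square-wave construction. The ±1 amplitude, the 50% duty cycle, and the function names are assumptions of this sketch; the exact reference waveform used in [25] may differ.

```python
import numpy as np


def reference_period(fcf_hz, fs_hz):
    """Cycle of the reference signal in samples, rounded to the nearest integer."""
    return int(round(fs_hz / fcf_hz))


def square_wave_reference(period, n_samples, duty=0.5):
    """Square wave of +1/-1 with the given period in samples (assumed shape)."""
    k = np.arange(n_samples) % period
    return np.where(k < duty * period, 1.0, -1.0)


fs = 25600.0
T1 = reference_period(51.5, fs)   # inner race fault reference cycle
T2 = reference_period(22.4, fs)   # outer race fault reference cycle
ref_inner = square_wave_reference(T1, n_samples=128000)
ref_outer = square_wave_reference(T2, n_samples=128000)
```

The length of 128000 samples matches the 5 s records of experiment 2, so each reference has the same length as the observed signals it accompanies.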

#### 6. Conclusion

Underdetermined blind source separation is an active and difficult branch of blind source separation; it refers to the recovery or extraction of source signals from a group of observed signals when there are more sources than sensors. Most existing blind source separation methods require the source signals to be independent and non-Gaussian, but these assumptions are not satisfied in most cases. Unlike traditional blind source separation methods, SCA, which originates from sparse representation, is effective for underdetermined blind source separation because it does not require independence or non-Gaussianity and relies only on the sparsity of the signals to extract the source signals from the observed signals. Motivated by the potential of sparse component analysis for underdetermined blind source separation of rolling element bearing fault signals, and by the property that linear mixtures of sparse source signals cluster along the column vectors of the mixing matrix, a sparse component analysis method based on linear clustering, named LC-SCA, is proposed in this paper together with its algorithm. The effectiveness of the proposed method is verified through simulation and experiments, and its advantage over related methods such as SCA and cICA in underdetermined blind source separation of rolling element bearing fault signals is also verified. The proposed method provides a new and simple way to perform underdetermined blind source separation of rotating machinery vibration signals.

#### Data Availability

The data are available from the corresponding author upon request by e-mail: hongchao1983@126.com.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This research was supported by the National Natural Science Foundation (approved grant: U1804141) and the Key Science and Technology Research Project of Henan Province (approved grant: 192102210105).