Short-Sampled Blind Source Separation of Rotating Machinery Signals Based on Spectrum Correction
Existing blind source separation (BSS) algorithms for rotating machinery fault diagnosis can hardly meet the demands of fast response, high stability, and low complexity simultaneously. Therefore, this paper proposes a spectrum correction based BSS algorithm. By incorporating FFT, spectrum correction, a screening procedure (consisting of frequency merging, candidate pattern selection, and single-source-component recognition), modified K-means based source number estimation, and mixing matrix estimation, the proposed BSS algorithm can accurately sense the harmonics of field rotating machinery faults from short-sampled observations. Both numerical simulation and a practical experiment verify the proposed BSS algorithm’s superiority over the existing ICA-based methods in recovery quality, stability under insufficient samples, and efficiency. Besides rotating machinery fault diagnosis, the proposed BSS algorithm also has considerable potential in other harmonics-related application fields.
1. Introduction
As one of the most common classes of mechanical equipment, rotating machinery plays a significant role in industrial applications. Since it generally operates under harsh working conditions, it is prone to failures, which may cause the machinery to break down or degrade its service performance, such as manufacturing quality and operation safety. Moreover, rotating machines in modern industry tend to be larger, more precise, and more automated, which further increases the difficulty of detecting potential faults.
Blind source separation (BSS), which can recover underlying sources from observations without knowledge of the mixing system, is widely used in machinery fault diagnosis [1–5], speech recognition, wireless communication, and so on. Currently, BSS techniques applied to machinery fault diagnosis mainly focus on two aspects: the removal of interferences and disturbances, and parameter modeling and feature detection for mechanical faults.
On the one hand, as is known, rotating components (such as gears and bearings) are common and key components of modern machinery. Affected by many field factors (such as multiple motors fixed to the same structure or several fault events happening simultaneously), the signal recorded from a sensor cannot solely reflect the operating state of a specific component. Furthermore, in industrial applications, these recorded signals are inevitably disrupted by the environment (ambient noise, other mechanical systems, etc.). Hence, BSS can act as an effective preprocessing procedure to remove the interferences from other components or the disturbances arising from the environment. Wu et al. proposed a BSS algorithm to remove the interferences of acoustic emission signals from a multiple cylinder diesel engine. An improved morphological component analysis (MCA) has also been proposed to diagnose compound faults of gearboxes. Cui et al. put forward a null-space pursuit (NSP) BSS algorithm to diagnose compound faults of roller bearings.
On the other hand, since several rotors operate at certain speeds, the signal recorded from a vibration sensor is mainly composed of multiple periodic harmonic components. For different categories of faults, the spectra of these recorded vibration signals exhibit distinct harmonics-related features. For example, a vibration signal caused by rotor misalignment is mainly characterized by the 2nd harmonic component. The loosening of the bearing in the bearing block often generates components higher than the 10th harmonic (even up to the 20th harmonic). The fault of oil whirl always gives rise to subharmonics close to the half harmonic, and so forth. Hence, BSS is expected to accurately extract these harmonic features of individual sources. What is more, model-based fault identification assumes that there exists a certain model characterizing a mechanical structure, in which the variation of model parameters can reflect the abnormal behaviors of the machinery system. As a result, BSS can be utilized to identify the model parameters.
Hence, many studies of the BSS problem have been carried out in the fields of feature extraction and model identification. For example, sparse component analysis based and independent component analysis (ICA) based BSS methods were employed to estimate the modal parameters of vibration signals. Following this, Žvokelj et al. proposed the ensemble empirical mode decomposition based multiscale ICA (EEMD-MSICA) method and applied it to bearing fault detection. Li et al. proposed the supervised order tracking bounded component analysis (SOTBCA) based BSS algorithm for gear fault detection, which is suitable for situations in which the vibration signals do not satisfy the independence condition.
To reduce the loss arising from fault accidents, it is urgently demanded in field operations that rotating machinery fault analysis should be as fast as possible. One possible solution is to implement the BSS in a short period of observations.
However, these existing BSS methods can hardly work well with short-sampled observations. For example, the mainstream BSS method in rotating machinery fault diagnosis is ICA. Many ICA-based methods [18, 19], or improved ICA variants such as second-order ICA, nonlinear adaptive ICA, and kernel ICA, are applied to failure detection and analysis. As will be elaborated in this paper, ICA is likely to fall into nondeterministic solutions when provided with only short-sampled observations. This arises from the fact that ICA is based on optimizing a kurtosis-related objective function. Since kurtosis is a fourth-order cumulant statistic, its estimation requires a large number of samples. In fact, other statistics-based BSS methods, such as the fourth-order-only blind identification (FOOBI) method, which is based on constructing high-order tensors, also exhibit poor performance in short-sampled situations.
Hence, in this paper, we propose a novel blind source separation method which works well with both long and short observations. Owing to the incorporation of spectrum correction and a phase coherence criterion, this BSS method can accurately extract the harmonic features (frequency, amplitude, and phase) of individual sources. Short-sampled observations reduce the frequency resolution of the fast Fourier transform (FFT) spectrum and thus aggravate the picket-fence effect; even in this case, the proposed BSS can still estimate the harmonic parameters by means of spectrum correction. Moreover, a frequency screening procedure consisting of frequency merging, candidate pattern selection, and single-source-component recognition is able to exclude the interference between individual harmonics-related components. Therefore, unlike ICA or the FOOBI method, the proposed BSS is capable of dealing with insufficient samples. In addition, the proposed BSS algorithm does not require a priori knowledge of the source number. Both numerical simulation and a practical experiment verify the proposed BSS algorithm’s superiority in efficiency and accuracy over the existing ICA-based methods.
2. Blind Source Separation Model
2.1. Temporal Model
Consider $N$ underlying sources and $M$ recording sensors. Suppose that the structure under investigation has a high rigidity, and the transmission delays in the mechanical structure are negligible compared to the sampling period $T_s$. In this case, the mixing system can be treated as an instantaneous one, which can be modeled as
$$\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t) + \mathbf{n}(t). \tag{1}$$
In (1), $\mathbf{s}(t) = [s_1(t), \ldots, s_N(t)]^{T}$ is the source vector, $\mathbf{x}(t) = [x_1(t), \ldots, x_M(t)]^{T}$ is the observation vector, $\mathbf{n}(t)$ is the additive noise vector, and $\mathbf{A}$ is the $M \times N$ mixing matrix. The task of short-sampled BSS is to recover the sources $\mathbf{s}(t)$ from the observations $\mathbf{x}(t)$ without knowledge of the mixing matrix $\mathbf{A}$ when only a small number of samples is available.
According to the relationship between $M$ and $N$, the BSS problem can be divided into two cases: the overdetermined or determined BSS ($M \geq N$) and the underdetermined BSS ($M < N$). This paper focuses on the overdetermined case.
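Since the vibration of a mechanical component stems from the rotation of the rotor, the $n$th source can be formulated as a combination of individual harmonics: that is,
$$s_n(t) = \sum_{k=1}^{K_n} a_{n,k}\cos\left(2\pi f_{n,k} t + \varphi_{n,k}\right), \tag{2}$$
where $K_n$ is the number of components and $a_{n,k}$, $f_{n,k}$, and $\varphi_{n,k}$ are the amplitude, frequency, and phase parameters of the $k$th component of the $n$th source, respectively.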
Based on this model, this paper aims to develop a BSS algorithm that uses only a small number of samples to estimate the mixing matrix $\mathbf{A}$ and recover all sources. Besides, it should be emphasized that, in industrial applications, the source number is usually not known in advance. Therefore, this paper also addresses the problem of source number estimation.
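The temporal model of this subsection can be illustrated with a small synthetic example. The following is a minimal sketch rather than the authors' code: the helper names `harmonic_source` and `mix`, the sampling rate, the harmonic parameters, and the 3-by-2 mixing matrix are all hypothetical values chosen for illustration, and a cosine form is assumed for each harmonic.

```python
import numpy as np

def harmonic_source(t, amps, freqs, phases):
    """Sum of harmonics: s(t) = sum_k a_k * cos(2*pi*f_k*t + phi_k) (assumed cosine form)."""
    t = np.asarray(t, dtype=float)
    s = np.zeros_like(t)
    for a, f, p in zip(amps, freqs, phases):
        s += a * np.cos(2.0 * np.pi * f * t + p)
    return s

def mix(A, S, noise_std=0.0, seed=0):
    """Instantaneous mixing x(t) = A s(t) + n(t); S holds one source per row."""
    rng = np.random.default_rng(seed)
    X = A @ S
    return X + noise_std * rng.standard_normal(X.shape)

fs = 1000.0                      # hypothetical sampling rate in Hz
L = 64                           # deliberately short record
t = np.arange(L) / fs
S = np.vstack([
    harmonic_source(t, [1.0, 0.5], [50.3, 100.6], [0.2, 1.0]),  # source 1: two harmonics
    harmonic_source(t, [0.8], [73.1], [0.7]),                   # source 2: one harmonic
])
A = np.array([[1.0, 0.6],
              [0.4, 1.0],
              [0.7, 0.3]])       # 3 sensors, 2 sources (overdetermined case)
X = mix(A, S)                    # noise-free observations, shape (3, 64)
```

Each row of `X` is one sensor record; the BSS task is to recover `S` and `A` from `X` alone.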
2.2. Harmonics Based BSS Model
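Since a real signal contains two conjugate side spectra, we rewrite $s_n(t)$ in (2) as
$$s_n(t) = \tilde{s}_n(t) + \tilde{s}_n^{*}(t), \tag{3}$$
where
$$\tilde{s}_n(t) = \sum_{k=1}^{K_n} \frac{a_{n,k}}{2}\, e^{j\left(2\pi f_{n,k} t + \varphi_{n,k}\right)}. \tag{4}$$
Further, if the harmonic frequency is far from the direct component (DC), a single side spectrum is enough to achieve BSS. In combination with (1), we have a frequency-domain model:
$$\mathbf{X}(f) = \mathbf{A}\,\tilde{\mathbf{S}}(f) + \mathbf{N}(f). \tag{5}$$
As is known, the ideal Fourier transform of a complex exponential signal is a Dirac function. Hence, the spectrum of the $n$th source in (4) is
$$\tilde{S}_n(f) = \sum_{k=1}^{K_n} \frac{a_{n,k}}{2}\, e^{j\varphi_{n,k}}\, \delta\left(f - f_{n,k}\right). \tag{6}$$
Denote the mixing matrix as $\mathbf{A} = [\mathbf{a}_1, \ldots, \mathbf{a}_N]$. Substituting (6) into (5), we have
$$\mathbf{X}(f) = \sum_{n=1}^{N} \mathbf{a}_n \sum_{k=1}^{K_n} \frac{a_{n,k}}{2}\, e^{j\varphi_{n,k}}\, \delta\left(f - f_{n,k}\right) + \mathbf{N}(f). \tag{7}$$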
To determine each column vector of the mixing matrix , some particular frequency which is only included in a single source and excluded by other sources, is considered; that is, should satisfy
Then, it can be inferred from (9) that the frequency-domain vector corresponding to the component is parallel to . Hence, as long as sufficient single-source components are collected, every column of the mixing matrix can be sequentially determined.
2.3. Difficulty of Short-Sampled BSS
Note that (7) is an ideal Fourier model of the BSS system, in which the frequency $f$ is a continuous variable. However, as is known, the ideal Fourier transform is unrealizable since it requires an infinite number of samples.
In practice, the ideal Fourier transform is replaced by an $L$-point discrete Fourier transform (DFT) (“$L$” refers to the number of consumed samples), in which $f$ in (7) is only allowed to be one of the $L$ frequencies $k\Delta f$, $k = 0, \ldots, L-1$ ($\Delta f = f_s/L$ is the frequency resolution and $f_s$ refers to the system sampling rate). Thus, the DFT spectrum of each observation will suffer from a severe picket-fence effect.
In addition, it is very likely that a frequency $f_{n,k}$ of the $n$th source is not exactly an integer multiple of the DFT frequency resolution $\Delta f$, so that the Dirac function in (7) cannot be sampled ideally. This deviation is also reflected in the observations’ DFT spectra, which exhibit spectral leakage.
Without loss of generality, denote the frequency of the $n$th source as the sum of an integer multiple and a fractional multiple of $\Delta f$: that is,
$$f_{n,k} = \left(q_{n,k} + \delta_{n,k}\right)\Delta f, \tag{10}$$
where $q_{n,k}$ is an integer and $\delta_{n,k}$ is a fraction.
When the sample length $L$ becomes smaller, the DFT frequency unit $\Delta f$ gets larger and thus the DFT spectrum gets coarser. Limited by the picket-fence effect, the fractional item “$\delta_{n,k}\Delta f$” in (10) cannot be directly obtained from DFT bins, and thus the frequency $f_{n,k}$ has to be treated as an integer multiple of $\Delta f$ (i.e., $q_{n,k}\Delta f$), which corresponds to several peak DFT spectral bins of the observations. As a result, a large deviation of frequency estimation inevitably occurs.
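The growth of the frequency error with shrinking sample length can be checked numerically. The snippet below is an illustrative sketch under assumed values (a single 50.3 Hz cosine sampled at 1000 Hz; `coarse_peak_frequency` is a hypothetical helper name): it reads the frequency straight off the largest DFT bin, with no correction, for a long and a short record.

```python
import numpy as np

def coarse_peak_frequency(x, fs):
    """Frequency of the largest DFT bin (no correction): k_peak * fs / L."""
    L = len(x)
    spectrum = np.abs(np.fft.rfft(x))
    k = int(np.argmax(spectrum[1:]) + 1)   # skip the DC bin
    return k * fs / L

fs, f0 = 1000.0, 50.3                      # assumed sampling rate and tone frequency
errors = {}
for L in (1024, 64):
    t = np.arange(L) / fs
    est = coarse_peak_frequency(np.cos(2 * np.pi * f0 * t + 0.3), fs)
    errors[L] = abs(est - f0)
    print(f"L={L:5d}  resolution={fs / L:7.3f} Hz  estimate={est:8.3f} Hz  error={errors[L]:.3f} Hz")
```

With 64 samples the bin spacing is 15.625 Hz, so the uncorrected estimate can only land on a multiple of that spacing; this is the deviation that spectrum correction is meant to remove.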
Furthermore, as (7) shows, since an observation contains multiple components, severe mutual interference inevitably occurs among distinct components when these frequency estimates are inaccurate. As a result, the recovered source spectrum is bound to differ greatly from the ideal spectrum, which increases the difficulty of BSS in the case of short-sampled observations.
To overcome this difficulty, we introduce spectrum correction.
3. Spectrum Correction Based BSS
3.1. Spectrum Correction
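In this paper, we apply the ratio-based spectrum correction method to the $m$th ($m = 1, \ldots, M$) observation to overcome the short-sampled difficulty. The spectrum correction consists of the following steps:
(1) Implement the Hanning-windowed DFT on the $L$-length observation $x_m(n)$ and acquire its DFT spectrum $X_m(k)$ ($k = 0, \ldots, L-1$).
(2) Collect all the peak indices of $|X_m(k)|$. For the peak index $k_{m,p}$, $p = 1, \ldots, P_m$ ($P_m$ is the peak number of $|X_m(k)|$), calculate the amplitude ratio between $|X_m(k_{m,p})|$ and its subpeak neighbor: that is,
$$v_{m,p} = \frac{|X_m(k_{m,p})|}{\max\left\{|X_m(k_{m,p}-1)|,\ |X_m(k_{m,p}+1)|\right\}}. \tag{11}$$
Further, a variable $u_{m,p}$ can be obtained as
$$u_{m,p} = \frac{2 - v_{m,p}}{1 + v_{m,p}}. \tag{12}$$
(3) Adjust the sign of $u_{m,p}$ to estimate the fractional number as
$$\hat{\delta}_{m,p} = \begin{cases} u_{m,p}, & |X_m(k_{m,p}+1)| \geq |X_m(k_{m,p}-1)|, \\ -u_{m,p}, & \text{otherwise}. \end{cases} \tag{13}$$
Then, the accurate frequency estimate is
$$\hat{f}_{m,p} = \left(k_{m,p} + \hat{\delta}_{m,p}\right)\Delta f. \tag{14}$$
(4) Acquire the corrected amplitude estimate and phase estimate (for a DFT normalized by $1/L$, the amplitude being that of the single-sided component) as
$$\hat{a}_{m,p} = \frac{2\pi\hat{\delta}_{m,p}\left(1 - \hat{\delta}_{m,p}^{2}\right)}{\sin\left(\pi\hat{\delta}_{m,p}\right)}\,|X_m(k_{m,p})|, \qquad \hat{\varphi}_{m,p} = \operatorname{ang}\left[X_m(k_{m,p})\right] - \pi\hat{\delta}_{m,p}, \tag{15}$$
where “ang” is the angle operator.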
After spectrum correction, 3 harmonic parameter sets , , and of th observation () can be acquired.
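The correction steps above can be sketched for a single observation as follows. This is one possible reading of a Hann-window ratio correction, not a transcription of the cited method: the function name `hann_ratio_correction`, the relative peak threshold `rel_thresh`, the use of a periodic Hann window, and the unnormalized DFT (hence the `4 / L` amplitude factor for a real cosine) are all assumptions made for this illustration.

```python
import numpy as np

def hann_ratio_correction(x, fs, rel_thresh=0.1):
    """Return a list of (frequency, amplitude, phase), one per spectral peak of x.

    Assumes a periodic Hann window and an unnormalized DFT; the amplitude is that
    of the real cosine a*cos(2*pi*f*t + phase).
    """
    L = len(x)
    n = np.arange(L)
    w = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / L)    # periodic Hann window
    X = np.fft.fft(w * x)
    mag = np.abs(X[: L // 2])                      # positive-frequency side only
    results = []
    for k in range(1, L // 2 - 1):
        is_peak = mag[k] > mag[k - 1] and mag[k] >= mag[k + 1]
        if not is_peak or mag[k] < rel_thresh * mag.max():
            continue
        # Ratio between the peak bin and its larger ("subpeak") neighbour.
        right = mag[k + 1] >= mag[k - 1]
        v = mag[k] / (mag[k + 1] if right else mag[k - 1])
        u = (2.0 - v) / (1.0 + v)
        delta = u if right else -u                 # fractional bin offset
        freq = (k + delta) * fs / L
        # Undo the Hann main-lobe attenuation (L/2) * sinc(delta) / (1 - delta^2).
        amp = 4.0 * mag[k] * (1.0 - delta**2) / (L * np.sinc(delta))
        # A periodic Hann window is centred at n = L/2, giving a phase shift of pi*delta.
        phase = float(np.angle(X[k] * np.exp(-1j * np.pi * delta)))
        results.append((float(freq), float(amp), phase))
    return results

# Usage on a 64-sample record of a hypothetical 123.4 Hz tone (bin spacing 15.625 Hz).
fs, L = 1000.0, 64
t = np.arange(L) / fs
peaks = hann_ratio_correction(1.5 * np.cos(2 * np.pi * 123.4 * t + 0.6), fs)
```

In this noise-free example the corrected frequency falls far inside one bin spacing of the true value, which is the property the later screening stages rely on.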
Further, as (8) and (9) demonstrate, for an estimated frequency , only when it is included by a single source can it be utilized to estimate a column of the mixing matrix . Hence, it is necessary to screen those single-source related frequencies from , .
3.2. Screening Single-Source Components
The proposed scheme of screening single-source components consists of 3 stages: frequency merging, candidate pattern selection, and single-source-component recognition.
3.2.1. Frequency Merging
It is noteworthy that, affected by noise and interference, even for the same single-source component, the frequency estimates obtained by spectrum correction from the different observations still exhibit tiny differences. Hence, a frequency merging procedure should be implemented.
If we put all these frequency estimates together and sort them in an ascending order, the aforementioned frequency estimates of tiny differences tend to converge into a cluster. Assuming altogether that clusters are formed, without loss of generality, denote th () cluster as ( refers to th cluster’s size). Then, elements of this cluster can be merged by their average:
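A minimal sketch of the merging step is given below. The text does not state how the clusters are delimited, so the gap threshold `tol` (a new cluster starts wherever two sorted neighbours differ by more than `tol`) and the function name `merge_frequencies` are assumptions of this example.

```python
import numpy as np

def merge_frequencies(freq_sets, tol):
    """Pool the per-observation frequency estimates, sort them, split the sorted
    list wherever the gap between neighbours exceeds tol, and average each cluster."""
    pooled = np.sort(np.concatenate([np.asarray(f, dtype=float) for f in freq_sets]))
    if pooled.size == 0:
        return np.array([])
    breaks = np.where(np.diff(pooled) > tol)[0] + 1   # first index of each new cluster
    return np.array([cluster.mean() for cluster in np.split(pooled, breaks)])

# Three observations reporting slightly different estimates of the same two tones.
merged = merge_frequencies([[50.31, 100.60], [50.29, 100.62], [50.30]], tol=0.5)
```

Here the five pooled estimates collapse to two merged frequencies, one per cluster.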
3.2.2. Candidate Pattern Selection
Theoretically, in terms of the BSS model (1), as long as the mixing matrix does not contain zero elements, any source component should be included in all the observations. In other words, those corrected frequencies not contained by all the observations can be treated as fake components and should be removed.
In practice, given a small threshold , for a merged frequency , if for each observation index () there exists only one peak subscript satisfying can be regarded as an effective component. Accordingly, in combination with (9), a pattern vector relevant to this component’s corrected parameter pairs (amplitude and phase) can be selected as a candidate vector to estimate a column of the matrix ; that is,
After candidate pattern selection, the number of merged frequencies is reduced from to .
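The selection rule can be sketched as follows, assuming the corrected parameters of each observation are stored as one array per observation. The function name `select_candidates`, the argument layout, and the choice of a complex entry (amplitude times a unit phasor) for each element of the pattern vector are hypothetical conventions of this example.

```python
import numpy as np

def select_candidates(merged, freqs, amps, phases, eps):
    """Keep a merged frequency only if every observation has exactly one corrected
    peak within eps of it; return the kept frequencies and their pattern vectors."""
    kept, vectors = [], []
    for f_bar in merged:
        entries = []
        for f_m, a_m, p_m in zip(freqs, amps, phases):
            hits = np.flatnonzero(np.abs(np.asarray(f_m) - f_bar) < eps)
            if hits.size != 1:           # missing or ambiguous in this observation
                break
            i = hits[0]
            entries.append(a_m[i] * np.exp(1j * p_m[i]))
        else:                            # reached only if no observation failed
            kept.append(f_bar)
            vectors.append(np.array(entries))
    return np.array(kept), vectors

# Two observations: the 80 Hz peak appears in only one of them and is discarded.
freqs = [np.array([50.30, 80.0]), np.array([50.31])]
amps = [np.array([1.0, 0.2]), np.array([0.5])]
phases = [np.array([0.30, 1.0]), np.array([0.31])]
kept, vecs = select_candidates([50.305, 80.0], freqs, amps, phases, eps=0.1)
```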
3.2.3. Single-Source-Component Criterion
In rotational machinery fault analysis, it is likely that multiple sources contain some common harmonic components (i.e., the overlapping frequencies). Obviously, these frequencies are not in accordance with (8) and (9) and should not be adopted to estimate the mixing matrix . Hence, these overlapping components are invalid and should be removed from the candidate frequencies .
Assume that among candidate frequencies , only frequencies are single-source components. Since only belongs to a single source, in combination with (9), its corresponding single-source-component vector (i.e., the item in (9)) should be parallel to a column of the mixing matrix .
Furthermore, since the matrix is real-valued, from (8) and (9), one can find that phases of all the entries of originate from the same phase of a single source’s component (i.e., in (8)) and thus should be equal to each other.
Thus, a single-source-component vector should exhibit two special properties:
(1) Its amplitude vector is parallel to a column of the mixing matrix .
(2) Its phase vector possesses a property of coherence, in which any two phase entries of should approximately point to the same direction. In other words, the following inequality of single-source-component criterion should be satisfied: where , , , and is a small positive value.
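The phase coherence property can be tested numerically along the following lines. The exact inequality of the criterion is not reproduced here; as an assumption, this sketch accepts a candidate vector when every pair of its entries differs in phase by less than a small threshold `xi`, with differences wrapped to the principal interval. The name `is_phase_coherent` is hypothetical.

```python
import numpy as np

def is_phase_coherent(vec, xi):
    """True if every pair of entries of the complex vector differs in phase by less than xi."""
    ph = np.angle(np.asarray(vec))
    # Pairwise differences, wrapped to (-pi, pi] so that 3.13 and -3.13 count as close.
    diff = np.angle(np.exp(1j * (ph[:, None] - ph[None, :])))
    return bool(np.max(np.abs(diff)) < xi)

single = np.array([1.0, 0.4, 0.7]) * np.exp(1j * np.array([0.50, 0.52, 0.49]))
overlap = np.array([1.0, 0.4]) * np.exp(1j * np.array([0.50, 1.40]))
print(is_phase_coherent(single, 0.1), is_phase_coherent(overlap, 0.1))
```

A component shared by several sources generally mixes different source phases with different weights in each sensor, so its entries drift apart in phase and the check rejects it.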
3.3. DB-Index Based Source Number Estimation and K-Means Clustering
If the source number $N$ is known, one can directly employ a clustering algorithm (such as K-means clustering) on the single-source-component vectors to estimate all the columns of the mixing matrix. However, in industrial applications, the source number is usually unknown in advance. Therefore, this section combines the DB-index with K-means clustering to estimate both the source number and the mixing matrix.
Clearly, if the number of clusters is specified as , then, the conventional -means clustering algorithm can classify into clusters (), whose entries can be denoted asThe relationship between these clusters and the entire set of single-source-component vectors can be expressed as
Davies Bouldin index (DB-index) is used to evaluate the appropriateness of data partitions  of a clustering algorithm. The definition of the DB-index is formulated aswhere represents the dispersion measurement of two distinct groups (assuming their cluster centers are ) and refers to the similarity between these two groups. They are calculated with the following two formulas:
Apparently, on the one hand, the larger is, the less the similarity between th and th clusters is, that is, the better the partition discrimination is. On the other hand, the smaller the dispersion degree is, the higher the concentration degree of the group is. As a result, the smaller the DB-index is, the more appropriate the data partition is. Therefore, the source number estimation can be realized by searching out the minimum DB-index of the -means algorithm: that is,
Once the source number is determined, the magnitude parts of the cluster centers of the groups generated by the K-means algorithm can be directly treated as the columns of the mixing matrix estimate.
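The search for the cluster number with the smallest DB-index can be sketched as below. This is an illustration under several assumptions, not the authors' modified K-means: it uses plain Lloyd iterations with a deterministic farthest-point initialisation, the standard Davies-Bouldin definition, unit-norm scaling of the magnitude vectors so that only their direction matters, and a search range starting at 2 (the index is undefined for a single cluster). All function names and the test data are hypothetical.

```python
import numpy as np

def kmeans(P, K, iters=50):
    """Plain Lloyd K-means with a deterministic farthest-point initialisation."""
    centers = [P[0]]
    for _ in range(1, K):
        d = np.min([np.linalg.norm(P - c, axis=1) for c in centers], axis=0)
        centers.append(P[np.argmax(d)])
    C = np.array(centers, dtype=float)
    for _ in range(iters):
        labels = np.argmin(np.linalg.norm(P[:, None] - C[None], axis=2), axis=1)
        C = np.array([P[labels == k].mean(axis=0) if np.any(labels == k) else C[k]
                      for k in range(K)])
    labels = np.argmin(np.linalg.norm(P[:, None] - C[None], axis=2), axis=1)
    return labels, C

def db_index(P, labels, C):
    """Davies-Bouldin index: mean over clusters of the worst (S_i + S_j) / d_ij."""
    K = len(C)
    S = np.array([np.linalg.norm(P[labels == k] - C[k], axis=1).mean()
                  if np.any(labels == k) else 0.0 for k in range(K)])
    worst = [max((S[i] + S[j]) / np.linalg.norm(C[i] - C[j])
                 for j in range(K) if j != i) for i in range(K)]
    return float(np.mean(worst))

def estimate_sources(P, K_max):
    """Return (K_hat, centers) minimising the DB-index over K = 2..K_max.
    Assumes K_max is smaller than the number of distinct vectors in P."""
    P = P / np.linalg.norm(P, axis=1, keepdims=True)   # keep only the direction
    best = None
    for K in range(2, K_max + 1):
        labels, C = kmeans(P, K)
        score = db_index(P, labels, C)
        if best is None or score < best[0]:
            best = (score, K, C)
    return best[1], best[2]

# Hypothetical magnitude vectors: scaled, slightly perturbed copies of two columns.
a1, a2 = np.array([1.0, 0.4, 0.7]), np.array([0.6, 1.0, 0.3])
P = np.array([1.0 * a1 + [0.01, 0, 0], 0.5 * a1 + [0, 0.005, 0],
              0.8 * a1 + [0, 0, -0.01], 0.3 * a1,
              0.9 * a2 + [0.01, 0, 0], 0.4 * a2 + [0, -0.004, 0], 0.7 * a2])
K_hat, C = estimate_sources(P, K_max=4)
```

The returned centers match the true columns only up to scale and ordering, which is the usual ambiguity of blind source separation.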
3.4. Summary of the Proposed BSS Recovery Algorithm
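Having obtained the overdetermined mixing matrix estimate $\hat{\mathbf{A}}$, the sources can be recovered by
$$\hat{\mathbf{s}}(t) = \hat{\mathbf{A}}^{\dagger}\,\mathbf{x}(t), \tag{25}$$
where $\hat{\mathbf{A}}^{\dagger}$ refers to the pseudoinverse of $\hat{\mathbf{A}}$.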
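The recovery step amounts to a single matrix product. The sketch below uses hypothetical data (a 3-by-2 mixing matrix and random stand-in sources) and assumes, for simplicity, that the estimated mixing matrix equals the true one and that the observations are noise free; `recover_sources` is a name introduced for this example.

```python
import numpy as np

def recover_sources(A_hat, X):
    """Least-squares source estimate pinv(A_hat) @ X, applied to every sample (column) of X."""
    return np.linalg.pinv(A_hat) @ X

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.6],
              [0.4, 1.0],
              [0.7, 0.3]])               # overdetermined: 3 sensors, 2 sources
S = rng.standard_normal((2, 64))         # stand-in sources, one per row
X = A @ S                                # noise-free observations
S_hat = recover_sources(A, X)
print(np.max(np.abs(S_hat - S)))         # near machine precision here
```

When the columns of the estimate are scaled or permuted versions of the true columns, the recovered sources are scaled and permuted accordingly.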
To summarize, the proposed BSS algorithm is listed as follows.
Step 1. Implement the procedure of spectrum correction addressed in Section 3.1 on , to acquire the corrected frequency set , amplitude set , and phase set .
Step 2. Merge the corrected frequencies using (16). Further, use (17) and (18) to acquire the candidate vectors . Then, in terms of the screening criterion (19), pick out single-source-component vectors from these candidate patterns.
Step 3. Implement the modified K-means clustering on the single-source-component vectors to obtain the final estimates of the source number and the mixing matrix.
Step 4. Calculate the pseudoinverse of and recover the source by (25).
4. Experiments
In this section, both a numerical simulation with synthetic signals and a practical mechanical diagnosis experiment are conducted to verify the performance of the proposed BSS algorithm. For comparison, the results of fast-ICA are also presented.
4.1. Numerical Simulation
Consider a mixing system expressed asTwo sources and are formulated asThe sampling rate was fixed as Hz and 4 cases of sample length () were taken into account. Since fast-ICA needs several iterative operations to optimize a kurtosis-related objective function, which starts from a random initialization on the demixing matrix, it is likely to fall into failure in case of insufficient samples. Hence, for each sample length case, 1000 trials were conducted. The times of successful trials of both BSS algorithms were recorded in Table 1. Moreover, among these successful trials, correlation coefficients between the recovered signals and the sources were statistically averaged and also listed in Table 1. Figures 1 and 2 present the recovered results of these two BSS algorithms in case of long observations (), while Figures 3 and 4 present the short observation case ().
As Figures 1 and 2 depict, both the fast-ICA and proposed algorithm can acquire high-quality recovered waveforms in case of long observations (, limited by page layout, only half-duration waveforms are plotted). However, when the sample length reduces into , one can observe that obvious distortions appear in the waveforms recovered by fast-ICA in Figure 3. In contrast, there exist no distortions in the recovered waveforms in Figure 4, reflecting that the proposed BSS algorithm outperforms fast-ICA in dealing with insufficient samples.
Table 1 shows that as the sample length decreases, the number of successful recoveries of fast-ICA declines sharply, and the average correlation coefficient also becomes slightly smaller. In contrast, all the trials of the proposed BSS algorithm succeed for every sample length, and all correlation coefficients remain 1. This is because, unlike fast-ICA, the proposed BSS algorithm is based on spectrum-correction-related harmonic analysis rather than statistical analysis and is thus insensitive to the sample length.
4.2. Mechanical Diagnosis Experiment
In this section, two practical fault signals , collected from field rotating machineries are treated as sources. is an imbalance fault signal with the rotating frequency 89.6853 Hz, and is a misalignment fault signal with the rotating frequency 102.8811 Hz. The mixing system is the same as the matrix in (26). Different sample lengths () were considered. In each case, 1000 trials were conducted. Figures 5–8 present the recovery results of both BSS algorithms. Table 2 lists their recovery performance indexes.
From Figures 5 and 6, one can see that, just like the recovery of synthesis signals in (27), both the fast-ICA and the proposed BSS algorithm can achieve excellent recovery results in the long-sample situation (). Nevertheless, when it comes to the short-sample situation (), the proposed BSS algorithm exhibits better performance than the fast-ICA does.
From Table 2, one can see that, as the sample length decreases from to , the proposed BSS algorithm’s superiority over fast-ICA becomes more obvious. In particular, due to the effect of field noise, the correlation coefficients resulting from the proposed algorithm do not remain 1 but approximate to 1. Hence, the proposed BSS algorithm outperforms the fast-ICA in rotating machinery fault diagnosis.
5. Conclusions
This paper proposes a novel blind source separation algorithm based on spectrum correction. Both numerical simulation and practical experiment verify the proposed BSS algorithm’s excellent performance. In general, this algorithm possesses the following 4 merits:
(1) Compared to the classical fast-ICA algorithm, the proposed algorithm can achieve a higher-quality source recovery even in case of short-sample observations. This meets the demand of fast response of the rotating machinery fault analysis.
(2) The spectrum correction involved in the proposed algorithm does well in harmonics information extraction and thus is especially suitable for rotating machinery fault analysis. As is known, most of these faults arise from the rotor malfunction, which generates a lot of rotating-frequency related harmonics.
(3) The proposed BSS algorithm can accurately determine the underlying source number by means of the modified K-means clustering, which is in accordance with the practical situation of rotating machinery operations.
(4) Unlike fast-ICA, the proposed BSS algorithm does not involve random initialization and iterative operations and thus possesses a higher stability and lower complexity, which enhances the reliability and efficiency of the rotating machinery fault analysis.
In fact, besides rotating machinery fault analysis, harmonics analysis is also frequently encountered in a lot of fields such as power harmonics analysis, channel estimation in communication, radar, and sonar. Hence, the proposed BSS algorithm possesses a vast potential in a wide range of applications.
Xiangdong Huang and Haipeng Fu are IEEE members.
The authors declare that they have no competing interests.
This work was supported by the National Natural Science Foundation of China under Grant 61271322.
W. Wu, T. R. Lin, and A. C. C. Tan, “Normalization and source separation of acoustic emission signals for condition monitoring and fault detection of multi-cylinder diesel engines,” Mechanical Systems and Signal Processing, vol. 64, article 3837, pp. 479–497, 2015.
Z. Li, X. Yan, X. Wang, and Z. Peng, “Detection of gear cracks in a complex gearbox of wind turbines using supervised bounded component analysis of vibration signals collected from multi-channel sensors,” Journal of Sound and Vibration, vol. 371, pp. 406–433, 2016.
A. Ypma and P. Pajunen, “Rotating machine vibration analysis with second-order independent component analysis,” in Proceedings of the 1st International Workshop on Independent Component Analysis and Signal Separation, vol. 99, pp. 37–42, 1999.
Z. Li, X. Yan, Z. Tian, C. Yuan, Z. Peng, and L. Li, “Blind vibration component separation and nonlinear feature extraction applied to the nonstationary vibration signals for the gearbox multi-fault diagnosis,” Measurement: Journal of the International Measurement Confederation, vol. 46, no. 1, pp. 259–271, 2013.