Shock and Vibration

Volume 2018, Article ID 2530248, 17 pages

https://doi.org/10.1155/2018/2530248

## Feature Frequency Extraction Based on Principal Component Analysis and Its Application in Axis Orbit

School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640, China

Correspondence should be addressed to Weiguang Li; wguangli@scut.edu.cn and Xuezhi Zhao; mezhaoxz@scut.edu.cn

Received 14 March 2018; Accepted 15 May 2018; Published 12 July 2018

Academic Editor: Jean-Jacques Sinou

Copyright © 2018 Zhen Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Vibration-based diagnosis has been employed as a powerful tool for maintaining the operating efficiency and safety of large rotating machinery. However, the extraction of malfunction features is not accurate enough with traditional vibration signal processing techniques, owing to their intrinsic shortcomings. In this paper, the relationship between the effective eigenvalues and the frequency components was investigated, and a new characteristic frequency separation method based on PCA (CFSM-PCA) was proposed. A given feature frequency can be purified by reconstructing the specified eigenvalues. Furthermore, three significant perspectives were studied via the distribution of the effective eigenvalues, and the theoretical derivations were subsequently illustrated. More importantly, the proposed scheme can also be used to synthesize the axis orbits of large machines. The purified curves were highly explicit, and the CFSM-PCA exhibited higher efficiency than the harmonic wavelet and wavelet packet methods.

#### 1. Introduction

Principal component analysis (PCA), which can reduce the dimensionality of a data set while retaining most of the original information [1–3], has been widely used in the fields of image processing, fault diagnosis, pattern recognition, neural networks, data compression, wavelet transforms, and so on. For example, Kirby et al. [4] employed the PCA algorithm to compress images and extract their main features. Moreover, the combination of PCA and Back Propagation (BP) neural networks can also be applied to the recognition of facial images. Xi et al. [5] and Malhi et al. [6] individually applied the PCA approach to reduce the dimension of data and extract the feature variables; a neural network was then used as a classifier to categorize bearing faults. To investigate the fault diagnosis of impellers in centrifugal compressors, PCA was also adopted by Jiang's group [7] to decrease the dimensionality of multiple time series. Sun et al. [8] analyzed the defects of conventional fault diagnosis methods and introduced data mining technology into fault diagnosis; after that, a new scheme for reducing data features was proposed based on the C4.5 decision tree and the PCA algorithm.

Generally, when PCA is used for denoising or data compression, the number of effective eigenvalues is determined by the cumulative contribution rate and its variants [9–14], expressed as

$$\eta(k) = \frac{\sum_{i=1}^{k}\lambda_i}{\sum_{i=1}^{m}\lambda_i} \times 100\%, \tag{1}$$

where $\lambda_1, \lambda_2, \dots, \lambda_m$ are the eigenvalues of the covariance matrix, $m$ is the number of eigenvalues of the covariance matrix, and $k$ is the number of effective eigenvalues. When the cumulative contribution rate is greater than a certain value (80%–95%), $k$ can be decided [2].
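The rule above can be sketched in a few lines of Python; the 90% threshold and the function name `effective_rank` are illustrative choices within the 80%–95% range mentioned:

```python
import numpy as np

def effective_rank(eigvals, threshold=0.90):
    """Smallest k whose cumulative contribution rate exceeds the threshold.

    eigvals: eigenvalues of the covariance matrix (any order).
    """
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]  # descending order
    ratio = np.cumsum(lam) / lam.sum()                     # cumulative contribution rate
    return int(np.searchsorted(ratio, threshold) + 1)

# Example: one dominant eigenvalue pair plus small residual eigenvalues.
print(effective_rank([5.0, 4.8, 0.3, 0.2, 0.1], threshold=0.90))  # → 2
```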

Although impressive progress has been achieved in signal denoising and dimensionality reduction, studies on the extraction or elimination of a specific characteristic spectrum (a single frequency) via this classical PCA method have largely been ignored. However, precise extraction of the fundamental frequency (1X), the second harmonic (2X), or other feature frequencies of a raw signal is of significance for the purification of axis orbits, notch filtering [15], speech recognition [16], fault diagnosis of rolling bearings [17], and so forth. Over the past decade, many signal processing tools for the extraction of a certain frequency have been developed, such as the wavelet packet transform, the harmonic wavelet, ensemble empirical mode decomposition (EEMD), and sparse decomposition [18]. For instance, references [19–21] adopted the multilevel division technique of the wavelet packet to select a certain frequency band for the extraction of a specific frequency, from which the axis orbit could be constructed. References [22, 23] subdivided arbitrary frequency bands infinitely via the harmonic wavelet to extract the frequencies of interest; the refinement of the rotor center's orbit from one or more frequency bands of interest could subsequently be realized. Nevertheless, the wavelet packet and harmonic wavelet algorithms are subject to the Heisenberg uncertainty principle, so the resolutions in the time domain and frequency domain cannot be arbitrarily high simultaneously, i.e., $\Delta t \cdot \Delta f \ge 1/(4\pi)$. In addition, in the EEMD method, signals are adaptively decomposed into sums of intrinsic mode functions (IMFs), whose instantaneous frequencies have physical meanings. In practice, an IMF is often multicomponent rather than monocomponent, resulting in unexplainable irregularity in its instantaneous amplitude and blind extraction of the 1X, 2X, and other subharmonics. Hence, the EEMD method is not suitable for the decomposition of signals with multiple components in a narrow band [24].
Using the singular value decomposition (SVD) strategy, reference [25] generated axis orbits by means of the cumulative contribution rate to denoise noisy signals; however, this method is unable to extract a specified feature frequency. Therefore, other more effective and simple methods, which are free from the above disadvantages, remain to be explored.

Our group has been committed to studying the fault diagnosis of large-scale equipment [26–28]. During single-frequency simulations via PCA, an interesting phenomenon was discovered unexpectedly: one frequency component produces two eigenvalues. After intensive study, we found that the PCA algorithm can be used to extract specified single or multiple feature frequencies from a crude signal. The guidelines are summarized as follows: (i) each characteristic frequency in a signal produces only two valid eigenvalues; (ii) the number of effective eigenvalues is related to the number of frequencies in the raw signal and has nothing to do with the magnitudes of the amplitudes $A_i$, frequencies $f_i$, and phases $\varphi_i$; (iii) the order of the eigenvalues of the covariance matrix in its distribution chart is determined by the amplitudes of the feature frequencies. Building on these discoveries, a novel frequency separation method based on PCA was proposed, through which the axis orbits of large rotating machines are readily purified. Moreover, the purification results are better than those of existing methods, such as the wavelet packet and harmonic wavelet.

Hereafter, the paper is organized as follows. Section 2 briefly introduces the basic principles of PCA in signal processing, and a new method of signal recovery is introduced. In Section 3, the theoretical discovery of the relationship between the eigenvalues and the feature frequencies and its verification are given. Section 4 illustrates the application to the purification of the axis orbit of a large rotor test bed and compares the experimental results with those of the harmonic wavelet and wavelet packet, proving the high efficiency of the proposed scheme. The filtration of a single frequency is given in Section 5. Finally, Section 6 draws the conclusions.

#### 2. Basic Theories of PCA in Signal Processing

We assume that there are $m$ random vectors $x_i$ ($i = 1, 2, \dots, m$), with each vector containing $n$ samples. An $m \times n$ matrix $X$ with $m$ rows and $n$ columns can be described as

$$X = [x_1, x_2, \dots, x_m]^{T}, \tag{2}$$

where $x_i = [x_{i1}, x_{i2}, \dots, x_{in}]^{T}$ and T denotes the vector transpose. Supposing that $x_1, x_2, \dots, x_m$ represent the crude original variables, new variables $y_i$ ($i = 1, 2, \dots, m$) can be obtained after principal component analysis of $X$, described as

$$y_i = u_i^{T} X = \sum_{j=1}^{m} u_{ij} x_j^{T}, \tag{3}$$

where $u_i = [u_{i1}, u_{i2}, \dots, u_{im}]^{T}$. According to the definition of PCA [3], the $u_i$ are the eigenvectors corresponding to the $i$th eigenvalue, in descending order, of the covariance matrix of $X$ [2–4, 6], and (4) should be satisfied:

$$u_i^{T} u_i = 1, \quad i = 1, 2, \dots, m. \tag{4}$$

The covariance matrix of $X$ in (2) is

$$C_X = \frac{1}{n} X X^{T}, \tag{5}$$

where each row of $X$ is assumed to be zero-mean. Based on PCA theory, the characteristic equation of the covariance matrix can be given by

$$C_X u_i = \lambda_i u_i, \quad i = 1, 2, \dots, m, \tag{6}$$

where $\lambda_i$ are the eigenvalues of the covariance matrix and $u_i$ are the eigenvectors corresponding to $\lambda_i$. Given that the eigenvalues are ranged in descending order, that is, $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m \ge 0$, data dimensionality reduction can be achieved by using (1) and (3), and the original $m$ variables are converted to $k$ new ones. If signal processing is to be performed, signal reconstruction is needed. Considering that the covariance matrix is a positive semidefinite symmetric matrix, its eigenvectors are orthogonal to each other, i.e., $u_i^{T} u_j = 0$ for $i \ne j$ [1, 2]. After left multiplication of both sides of (3) by $u_i$ and summation over $i$, (7) can be given as follows:

$$X = \sum_{i=1}^{m} u_i y_i. \tag{7}$$
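The decomposition steps above can be sketched as follows. This is a minimal illustration assuming zero-mean rows and the $1/n$ covariance normalization; the function name `pca_decompose` and the test matrix are choices made for this example only:

```python
import numpy as np

def pca_decompose(X):
    """Eigen-decompose the covariance matrix of the m x n data matrix X.

    Returns the eigenvalues in descending order and the matching
    orthonormal eigenvectors as columns of U, so that Y = U.T @ X gives
    the principal components and X = U @ Y reconstructs the data.
    """
    m, n = X.shape
    C = (X @ X.T) / n                 # m x m covariance matrix (zero-mean rows assumed)
    lam, U = np.linalg.eigh(C)        # eigh returns ascending order for symmetric C
    order = np.argsort(lam)[::-1]     # re-sort into descending order
    return lam[order], U[:, order]

# Sanity check: U is orthogonal, so summing u_i * y_i over all i recovers X.
X = np.vstack([np.sin(2 * np.pi * 0.1 * np.arange(8) + s) for s in range(4)])
lam, U = pca_decompose(X)
Y = U.T @ X
print(np.allclose(U @ Y, X))  # → True
```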

If the former $k$ principal components are chosen to reconstruct $X$ in light of the cumulative contribution rate $\eta(k)$, an approximate matrix $\hat{X}$ can be formulated as

$$\hat{X} = \sum_{i=1}^{k} u_i y_i. \tag{8}$$
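A numerical check of this truncated reconstruction: a Hankel matrix built from a single sine has rank two, so its two leading principal components reproduce it almost exactly. The sizes and frequency below are illustrative only:

```python
import numpy as np

def reconstruct_rank_k(X, k):
    """Approximate X from its k leading principal components (X_hat = sum u_i y_i)."""
    n = X.shape[1]
    lam, U = np.linalg.eigh((X @ X.T) / n)   # ascending eigenvalues
    Uk = U[:, np.argsort(lam)[::-1][:k]]     # eigenvectors of the k largest eigenvalues
    return Uk @ (Uk.T @ X)

# A Hankel matrix of one sinusoid has rank 2, so k = 2 recovers it closely.
t = np.arange(64) / 64.0
x = np.sin(2 * np.pi * 4 * t)
H = np.array([[x[i + j] for j in range(33)] for i in range(32)])  # 32 x 33 Hankel
err = np.linalg.norm(H - reconstruct_rank_k(H, 2)) / np.linalg.norm(H)
print(err < 1e-8)  # → True
```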

Compared with the original matrix $X$, the reconstructed approximate matrix $\hat{X}$ comprises most of the information of $X$ and excludes redundant features, such as noise and power frequency interference [9, 10].

The signal can be recovered from the reconstructed matrix $\hat{X}$ in terms of its matrix composition mode. $\hat{X}$ is converted to a $1 \times mn$ row vector with its row vectors arranged in head-to-tail fashion; the new vector is recorded as $w = [\hat{x}_{11}, \dots, \hat{x}_{1n}, \hat{x}_{21}, \dots, \hat{x}_{mn}]$. The recovered signal is obtained via the inverse transformation of $w$, derived as

$$\hat{x} = w P^{+}, \tag{9}$$

where $\hat{x} = [\hat{x}_1, \hat{x}_2, \dots, \hat{x}_L]$, $L = m + n - 1$ is the length of the recovered data, and $P^{+}$ is the pseudoinverse of $P \in \mathbb{R}^{L \times mn}$. $P$ comprises $m$ unit matrices, where the rank of each unit matrix is $n$. Taking $m = 3$, $n = 3$ as an example, in this case $L = m + n - 1 = 5$ and $mn = 9$, the matrix $P$ can be expressed as follows:

$$P = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}. \tag{10}$$
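The pseudoinverse recovery can be verified on the $m = n = 3$ toy case. The Hankel row layout assumed here, with row $i$ holding samples $x_i, \dots, x_{i+n-1}$, is one common convention and may differ in detail from the authors' matrix:

```python
import numpy as np

# Selection matrix P for m = 3, n = 3 (L = 5): w = x @ P lays the length-5
# signal x out as the row-concatenated 3 x 3 Hankel matrix.
m, n = 3, 3
L = m + n - 1
P = np.zeros((L, m * n))
for i in range(m):
    for j in range(n):
        P[i + j, i * n + j] = 1.0    # row i of the Hankel matrix holds x[i..i+n-1]

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = x @ P                            # vectorised Hankel arrangement (1 x mn)
x_rec = w @ np.linalg.pinv(P)        # recovery via the pseudoinverse of P
print(np.allclose(x_rec, x))  # → True
```

Since $P P^{T}$ is diagonal with the antidiagonal counts on its diagonal, the pseudoinverse solution is exactly the antidiagonal average described next.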

Recovering the signal via the pseudoinverse matrix $P^{+}$ actually computes the average value of the elements on each antidiagonal of the matrix $\hat{X}$, which is consistent with the method reported in [23]. Since $P$ is a sparse matrix, especially when the values of $m$ and $n$ are large, the random-access memory and time required to calculate the pseudoinverse matrix grow rapidly. Hence, a simpler signal recovery method is highly desirable. With this aim in mind, we developed a new averaging method and adopted it to recover the signal from the matrix $\hat{X}$. The expression is as follows:

$$\hat{x}_l = \frac{1}{c_l} \sum_{i+j-1=l} \hat{x}_{ij}, \quad l = 1, 2, \dots, L, \tag{11}$$

where $\hat{x}_{ij}$ is the element of the $i$th row and the $j$th column of the reconstructed approximate matrix $\hat{X}$, and $c_l$ is the number of elements on the $l$th antidiagonal. The signal can be restored facilely according to (11).
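The averaging method can be sketched without forming $P$ at all; the loop below simply accumulates and averages the antidiagonals (a straightforward reading of the formula, not the authors' code):

```python
import numpy as np

def average_antidiagonals(A):
    """Recover a length m+n-1 signal by averaging the antidiagonals of A.

    Equivalent to the pseudoinverse recovery, but without building any
    large sparse matrix.
    """
    m, n = A.shape
    L = m + n - 1
    sums = np.zeros(L)
    counts = np.zeros(L)
    for i in range(m):
        for j in range(n):
            sums[i + j] += A[i, j]   # element (i, j) lies on antidiagonal i + j
            counts[i + j] += 1
    return sums / counts

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 3.0, 4.0],
              [3.0, 4.0, 5.0]])
print(average_antidiagonals(A))  # → [1. 2. 3. 4. 5.]
```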

#### 3. Internal Law of Effective Eigenvalues and Frequency Components

##### 3.1. The Process of Feature Frequency Separation Method Based on PCA

For a noise-free signal, the effective eigenvalues are the nonzero ones. It can be seen from (7) that the signal has $m$ eigenvectors, most of which are generated by noise in practice. So, for a signal containing a certain number of frequency components, what is the relationship between the number of effective eigenvalues and the number of frequencies?

During our research, an important connection between the nonzero eigenvalues and the number of frequency components was discovered. To illustrate this connection explicitly, a signal $x(t)$ with different amplitudes $A_i$, frequencies $f_i$, and phases $\varphi_i$ was constructed as follows:

$$x(t) = \sum_{i=1}^{N} A_i \sin(2\pi f_i t + \varphi_i), \tag{12}$$

where $N$ is the number of frequency components.

A total of 4096 data points were collected at a sampling frequency of 1024 Hz, and the Hankel matrix with $m$ rows and $n$ columns was then formed from the signal. Decomposition and reconstruction of the signal proceeded by employing the PCA algorithm of Section 2 [9, 13]. The effect of the constructed Hankel matrix on signal processing was studied by Zhao et al. [26]; they pointed out that the processing effect is better when the number of rows is close to the number of columns. Accordingly, when $L$ is even, $m = L/2$ and $n = L/2 + 1$; when $L$ is odd, $m = n = (L+1)/2$. Hence, in our case $m = 2048$ and $n = 2049$ were applied. The decomposition procedure consists of the following:

Given $N = 1$ and $A_1 = 1$, three groups of signals were constructed, and their principal component eigenvalue distribution maps are shown in Figure 1. The number of eigenvalues ranges from 1 to 2048, and only the leading 50 eigenvalues are shown in this case.
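The claim that one sinusoid yields exactly two effective eigenvalues can be checked numerically as follows. The sizes here are smaller than the 4096-point case above purely to keep the example fast, and the 50 Hz component is an assumed stand-in for the test signals of (12):

```python
import numpy as np

# Numerical check: a single sinusoid yields exactly two effective
# (nonzero) eigenvalues, following the rule m = L/2, n = L/2 + 1 for even L.
fs, L = 1024, 256
m, n = L // 2, L // 2 + 1
t = np.arange(L) / fs
x = np.sin(2 * np.pi * 50 * t)                      # N = 1, A_1 = 1, f_1 = 50 Hz (assumed)

H = np.array([x[i:i + n] for i in range(m)])        # m x n Hankel matrix
lam = np.sort(np.linalg.eigvalsh((H @ H.T) / n))[::-1]
print(H.shape, int(np.sum(lam > 1e-6 * lam[0])))    # → (128, 129) 2
```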