#### Abstract

Feature extraction is a key procedure in the fault diagnosis of rotating machinery. To obtain fault features with lower dimensionality and higher sensitivity, a feature extraction method based on the adaptive multiwavelets transform (AMWT) and local tangent space alignment (LTSA) is proposed in this paper. AMWT is first used to extract multiple features from the vibration signals of the machine under test, forming a high-dimensional feature set. Then, to prevent irrelevant features in this set from degrading the fault diagnosis result, a detection index (DI) is used to evaluate the sensitivity of the features, and those with lower sensitivity are removed. After that, LTSA is applied for feature fusion to reduce the redundant features that remain. To validate the proposed method, it is compared with four feature extraction schemes based on (i) wavelet and LTSA, (ii) Geronimo, Hardin, and Massopust (GHM) multiwavelets and LTSA, (iii) AMWT and principal component analysis (PCA), and (iv) AMWT and multidimensional scaling (MDS). The features extracted by these methods are then fed into a K-medoids classifier to discriminate the faults. The results show that the proposed method improves the sensitivity of the extracted features and achieves a higher fault recognition rate.

#### 1. Introduction

Rotating machinery is widely used in modern industry [1, 2]. Faults occurring in rotating machinery may lead to fatal breakdowns. Therefore, it is important to diagnose rotating machinery faults accurately at an early stage to avoid huge maintenance costs and catastrophic accidents [3, 4].

Feature extraction is a pivotal procedure in the process of fault diagnosis [5]. An effective feature extraction method may significantly improve the accuracy of fault diagnosis results. Because vibration signals are rich in information on machinery health conditions, they are often processed by advanced signal processing techniques to extract features from rotating machinery [6, 7]. To date, many methods have been extensively applied to vibration signal processing, such as the fast Fourier transform (FFT), short-time Fourier transform (STFT) [8], wavelet transform (WT) [9], Wigner–Ville distribution (WVD) [10], empirical mode decomposition (EMD) [11], and multiwavelets transform (MWT) [12]. However, despite their achievements, these methods are fixed and independent of the measured signals. Hence, the sensitivity of the extracted features can hardly be enhanced further.

As an extension of WT and MWT, the adaptive multiwavelets transform (AMWT) not only has good localization in both the time and frequency domains but also simultaneously possesses orthogonality, symmetry, short support, and a higher order of vanishing moments. More importantly, it can change adaptively according to the characteristics of the vibration signals and thereby enhance the sensitivity of the extracted fault features [13]. Therefore, AMWT constructed with the two-scale similarity transform (TST) [14] is investigated in this paper and applied to feature extraction for fault diagnosis of rotating machinery.

To characterize the machinery condition as comprehensively as possible, many features are generally extracted from the vibration signal beforehand to form a high-dimensional feature set. However, this feature set often contains irrelevant or redundant features, which adversely affect the accuracy of the fault diagnosis result. Additionally, applying such a high-dimensional feature set directly to rotating machinery fault diagnosis is computationally demanding [15]. Therefore, investigating feature selection and feature fusion methods, which reduce irrelevant and redundant features, respectively, is important for obtaining fault features with lower dimensionality and higher sensitivity, increasing the accuracy of fault diagnosis results, and reducing the complexity of the fault diagnosis algorithm.

Generally, feature selection methods can be divided into two categories: filters and wrappers [16]. Filters conduct feature selection based on information contained in the feature subset itself [17], whereas wrappers generally use the accuracy of the classifier result to evaluate the feature subset [18]. Compared with wrappers, filters are independent of classifiers and thus need less computation time and generalize well [16]. Therefore, filters are considered in this paper. In this process, an effective criterion for feature sensitivity evaluation is essential to obtain more sensitive features from the high-dimensional feature set. The detection index (DI) [14, 19, 20] is employed in this paper to evaluate the rotating machinery vibration signal features for feature selection.

There are many classical feature fusion methods for reducing redundant features, such as principal component analysis (PCA) [21] and multidimensional scaling (MDS) [22]. These methods are easy to implement. However, as linear methods, they fail to capture the nonlinear characteristics of the dataset and are thus likely to perform poorly in nonlinear feature fusion [23–26]. In recent years, many manifold learning techniques have been investigated, such as locally linear embedding (LLE) [27], isometric feature mapping (IsoMap) [28], and local tangent space alignment (LTSA) [29]. Among them, LTSA aims to preserve local structures in small neighborhoods and discovers the intrinsic features of nonlinear manifolds [30]. It has been reported to be superior to LLE and IsoMap in recovering the intrinsic data structure [31, 32] and has been applied to vibration signal feature extraction [23, 33, 34].

In this paper, a new feature extraction method based on AMWT and LTSA is proposed. The method addresses three aspects in order to extract fault features with lower dimensionality and higher sensitivity from rotating machinery signals. Firstly, because AMWT can change adaptively with the signal characteristics, it is employed to improve the synthetic sensitivity of the candidate high-dimensional feature set. Secondly, with DI as the evaluation index, feature selection is conducted to eliminate the irrelevant features from the original high-dimensional feature set. Thirdly, feature fusion is implemented with LTSA to remove the redundant features from the feature subset obtained by the feature selection step. The results show that, with the proposed method, fault features with lower dimensionality and higher sensitivity can be obtained, and the accuracy of rotating machinery fault diagnosis can be improved.

The rest of this paper is arranged as follows. In Section 2, theories of AMWT and LTSA are introduced briefly. The specific process of the proposed method is illustrated in Section 3. After that, effectiveness of the proposed method in feature extraction by application in rotating machinery fault diagnosis is demonstrated in Section 4. Finally, conclusions are described in Section 5.

#### 2. Adaptive Multiwavelets and LTSA

##### 2.1. Adaptive Multiwavelets

Adaptive multiwavelets have been applied in rotating machinery fault diagnosis with good results [13, 14]. The TST-based adaptive multiwavelets are investigated in this paper for rotating machinery vibration fault feature extraction. They are constructed on the basis of multiwavelets and TST theory.

###### 2.1.1. Multiwavelets

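Unlike scalar wavelets, multiwavelets [35] contain a vector multiscaling function $\Phi(t)$ and a vector multiwavelet function $\Psi(t)$, formed by $r$ scaling functions $\phi_i(t)$ and $r$ wavelet functions $\psi_i(t)$, respectively:

$$\Phi(t) = [\phi_1(t), \phi_2(t), \ldots, \phi_r(t)]^{T}, \qquad \Psi(t) = [\psi_1(t), \psi_2(t), \ldots, \psi_r(t)]^{T}. \tag{1}$$

Similar to the two-scale relations of scalar wavelets, the multiscaling functions and multiwavelet functions satisfy the two-scale matrix equations

$$\Phi(t) = \sqrt{2}\sum_{k} H_k\,\Phi(2t-k), \tag{2}$$

$$\Psi(t) = \sqrt{2}\sum_{k} G_k\,\Phi(2t-k), \tag{3}$$

where $H_k$ and $G_k$ are the $r \times r$ two-scale matrix sequences.

In the frequency domain, the two-scale matrix equations can be expressed as

$$\hat{\Phi}(\omega) = H\!\left(\tfrac{\omega}{2}\right)\hat{\Phi}\!\left(\tfrac{\omega}{2}\right), \qquad \hat{\Psi}(\omega) = G\!\left(\tfrac{\omega}{2}\right)\hat{\Phi}\!\left(\tfrac{\omega}{2}\right), \tag{4}$$

where $\hat{\Phi}(\omega)$ and $\hat{\Psi}(\omega)$ represent the Fourier transforms of the multiscaling functions and multiwavelet functions, respectively, and $H(\omega)$ and $G(\omega)$ represent the two-scale matrix symbols corresponding to the multiscaling and multiwavelet functions, respectively:

$$H(\omega) = \frac{1}{\sqrt{2}}\sum_{k} H_k e^{-i\omega k}, \qquad G(\omega) = \frac{1}{\sqrt{2}}\sum_{k} G_k e^{-i\omega k}. \tag{5}$$

With the two-scale matrix equations (2) and (3), the decomposition and reconstruction of multiwavelets can be obtained as (6) and (7), respectively:

$$c_{j-1,n} = \sum_{k} H_{k-2n}\,c_{j,k}, \qquad d_{j-1,n} = \sum_{k} G_{k-2n}\,c_{j,k}, \tag{6}$$

$$c_{j,k} = \sum_{n}\left(H_{k-2n}^{*}\,c_{j-1,n} + G_{k-2n}^{*}\,d_{j-1,n}\right), \tag{7}$$

where $c_{j,k}$ and $d_{j,k}$ are the $r$-dimensional low-frequency and high-frequency coefficient vectors at scale $j$, and the superscript $*$ represents the complex conjugate transpose.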

Since the two-scale matrix sequences are matrices, two or more input streams are required in the process of multiwavelet decomposition. Hence, preprocessing is needed to map a univariate data sequence to a sequence of *r*-dimensional vectors. Among the proposed preprocessing methods, the oversampling scheme has been shown to be more effective for feature extraction than the others [36]. Therefore, it is adopted in this paper.
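As an illustration of what such a preprocessing step does, the sketch below implements a simple repeated-row form of oversampling, in which the scalar signal is copied into *r* input streams. This is a minimal sketch under stated assumptions: the function name `oversample_preprocess` and the optional per-stream weights `scale` are hypothetical, and the exact scheme used in [36] may differ.

```python
import numpy as np

def oversample_preprocess(x, r=2, scale=None):
    """Map a 1-D signal to r input streams by repeating it (repeated-row scheme).

    `scale` holds one optional weight per stream (assumption: all ones by default).
    """
    x = np.asarray(x, dtype=float)
    if scale is None:
        scale = np.ones(r)
    scale = np.asarray(scale, dtype=float)
    if scale.shape != (r,):
        raise ValueError("scale must have one entry per stream")
    # Row i of the result is scale[i] * x, so the output has shape (r, len(x)).
    return scale[:, None] * x[None, :]
```

Each column of the returned array is one *r*-dimensional input vector for the multifilter bank; because every sample is duplicated, the number of coefficients after decomposition is *r* times that of a critically sampled scheme.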

###### 2.1.2. TST

Two-scale similarity transform theory was first proposed in [37]. There, TST was divided into singular TST and regular TST. Singular TST can generate multiwavelets whose approximation order is one higher than that of the original multiwavelets, and it has therefore attracted more attention. However, the multiwavelets obtained in this way are not orthogonal. To overcome this drawback, the singular TST was extended to biorthogonal multiwavelets [38].

Suppose , , , and are the multiscaling functions, multiwavelet functions, dual multiscaling functions, and dual multiwavelet functions, respectively, , , , and are their corresponding two-scale matrix symbols, has approximation order of at least one, and is a TST matrix. Then, the biorthogonal multiwavelets singular TST is

Correspondingly, the biorthogonal multiwavelets singular inverse TST is

###### 2.1.3. Adaptive Multiwavelets

Adaptive multiwavelets are constructed by adding some free parameters to the TST matrix in TST process of GHM multiwavelets [39].

The GHM multiwavelets have the properties of orthogonality, short support, symmetry, and approximation order of 2. The two-scale matrix sequences of GHM multiwavelets are

The two multiscaling functions and two multiwavelet functions of GHM are shown in Figure 1.

**Figure 1:** The two multiscaling functions and two multiwavelet functions of GHM, shown in panels (a) and (b).

In order to construct adaptive multiwavelets, two TST matrices with free parameters are needed. They are

Substituting (11) into (8) and (9), new biorthogonal multiwavelets with two-scale matrix symbols of , , , and can be obtained. Then, taking , , , and as the original two-scale matrix symbols , , , and in (8) and (9), respectively, and substituting (12) into (8) and (9), the final biorthogonal multiwavelets with two-scale matrix symbols of , , , and can be obtained. Since the TST matrices shown as (11) and (12) are applied in this process, the two-scale matrix symbols , , , and are formed by free parameters *a*, *b*, *c*, *d*, and *f*. Therefore, the final obtained biorthogonal multiwavelets can adaptively change their basis functions with the characteristics of signals.

##### 2.2. LTSA

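As a fundamental manifold learning method, LTSA employs locally linear transformations to extract the low-dimensional manifold structure embedded in high-dimensional data, and it is thus able to uncover valuable information and reduce the feature dimension. The LTSA algorithm generally includes the following three steps [29]:

(1) Neighborhood identification: for each data point $x_i$ in the sample set $X = [x_1, x_2, \ldots, x_N]$, identify its *k* neighborhood samples with the K-nearest neighbor algorithm to construct the neighborhood $X_i = [x_{i1}, x_{i2}, \ldots, x_{ik}]$.

(2) Local information extraction: project the *k* points in $X_i$ into a tangent space of the manifold at $x_i$:

$$\Theta_i = Q_i^{T}\left(X_i - \bar{x}_i e^{T}\right) = \left[\theta_1^{(i)}, \theta_2^{(i)}, \ldots, \theta_k^{(i)}\right], \tag{13}$$

where $\bar{x}_i$ is the neighborhood mean vector; $e$ is a column vector with all ones; $\Theta_i$ represents the local coordinate; and $Q_i$ denotes the orthogonal basis of the tangent space, which can be estimated by performing the singular value decomposition of $X_i - \bar{x}_i e^{T}$: it consists of the *d* left singular vectors of $X_i - \bar{x}_i e^{T}$ corresponding to the *d* largest singular values.

(3) Global coordinate determination: the global coordinate system can be obtained by linearly aligning the local coordinates:

$$\tau_{ij} = \bar{\tau}_i + L_i\,\theta_j^{(i)} + \varepsilon_j^{(i)}, \qquad j = 1, 2, \ldots, k, \tag{14}$$

where $L_i$ is an affine transformation matrix, $\varepsilon_j^{(i)}$ is the reconstruction error, $\tau_{ij}$ is the global coordinate of $x_{ij}$, and $\bar{\tau}_i$ is the mean vector of $\tau_{i1}, \ldots, \tau_{ik}$. Expressing (14) in matrix form, we get

$$T_i = \frac{1}{k} T_i e e^{T} + L_i \Theta_i + E_i, \tag{15}$$

where $T_i = [\tau_{i1}, \ldots, \tau_{ik}]$, $\Theta_i = [\theta_1^{(i)}, \ldots, \theta_k^{(i)}]$, and $E_i = [\varepsilon_1^{(i)}, \ldots, \varepsilon_k^{(i)}]$.

To preserve as much of the local geometry as possible in the *d*-dimensional space, the reconstruction error needs to be minimized:

$$\min \sum_{i} \left\| E_i \right\|_F^{2} = \min \sum_{i} \left\| T_i\left(I - \tfrac{1}{k} e e^{T}\right) - L_i \Theta_i \right\|_F^{2}, \tag{16}$$

where $\|\cdot\|_F$ represents the matrix Frobenius norm. To minimize the reconstruction error, the affine transformation matrix should be given by $L_i = T_i\left(I - \tfrac{1}{k} e e^{T}\right)\Theta_i^{+}$, where $\Theta_i^{+}$ represents the Moore–Penrose generalized inverse of $\Theta_i$.

Let $S = [S_1, \ldots, S_N]$ be the 0-1 selection matrix and $W = \mathrm{diag}(W_1, \ldots, W_N)$ be the weight matrix; then

$$\sum_{i} \left\| E_i \right\|_F^{2} = \left\| T S W \right\|_F^{2}, \tag{17}$$

where $T = [\tau_1, \ldots, \tau_N]$, $T S_i = T_i$, and $W_i = \left(I - \tfrac{1}{k} e e^{T}\right)\left(I - \Theta_i^{+}\Theta_i\right)$.

In order to determine the global coordinate, let $B = S W W^{T} S^{T}$ and impose the normalization constraint $T T^{T} = I_d$. Then, the optimal global *d*-dimensional coordinate **T** is given by the *d* eigenvectors of **B** corresponding to its second to (*d* + 1)th smallest eigenvalues.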
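The three steps above can be sketched in NumPy as follows. This is a minimal illustration of the standard LTSA procedure of [29] rather than the authors' implementation; the function name `ltsa` and its defaults are assumptions, samples are stored as rows (so the array is the transpose of the column-oriented notation used in the text), and a brute-force neighbor search is used for brevity.

```python
import numpy as np

def ltsa(X, k=8, d=2):
    """Embed the rows of X (n_samples, n_features) into d dimensions with LTSA."""
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    if not (d < k <= n) or d > m:
        raise ValueError("need d < k <= n_samples and d <= n_features")
    # Step 1: k nearest neighbours of every sample (a sample is its own neighbour).
    sq = np.sum(X ** 2, axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    neighbours = np.argsort(dist2, axis=1)[:, :k]
    B = np.zeros((n, n))
    for i in range(n):
        idx = neighbours[i]
        Xi = X[idx] - X[idx].mean(axis=0)          # centred neighbourhood, shape (k, m)
        # Step 2: with samples as rows, the d leading left singular vectors
        # form an orthonormal basis for the local tangent coordinates.
        U, _, _ = np.linalg.svd(Xi, full_matrices=False)
        G = np.hstack([np.full((k, 1), 1.0 / np.sqrt(k)), U[:, :d]])
        # Step 3: accumulate the alignment matrix; I - G G^T is the local
        # projector that removes the mean and the tangent coordinates.
        B[np.ix_(idx, idx)] += np.eye(k) - G @ G.T
    # Global coordinates: eigenvectors of the 2nd to (d+1)-th smallest eigenvalues.
    _, vecs = np.linalg.eigh(B)
    return vecs[:, 1:d + 1]
```

The eigenvector belonging to the smallest eigenvalue is (numerically) the constant vector and carries no information, which is why it is skipped. For production use, a tested implementation such as scikit-learn's `LocallyLinearEmbedding(method="ltsa")` is likely preferable.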

#### 3. Feature Extraction Based on Adaptive Multiwavelets and LTSA

The feature extraction method for rotating machinery vibration fault diagnosis mainly includes three steps. First, a high-dimensional candidate feature set is constructed via AMWT. Second, DI is used as the feature sensitivity evaluation index to remove irrelevant features from the candidate feature set and form a subset of vibration features. Third, LTSA is applied to the fusion of the redundant features in the feature subset to obtain favorable vibration features with lower dimensionality and higher sensitivity.

##### 3.1. High-Dimensional Feature Set Construction Based on Adaptive Multiwavelets

To improve the sensitivity of the features in the original high-dimensional feature set, AMWT, rather than a conventional fixed signal processing method, is applied to process the vibration signals in this paper. The maximum value of SDI is used as the optimization objective, and a genetic algorithm (GA) is used to find the optimal multiwavelets in the adaptive multiwavelets library. Then, the optimal multiwavelets are used to extract the values of several nondimensional symptom parameters (NSPs) from the vibration signal to form the original high-dimensional feature set.

###### 3.1.1. Summary of SDI

SDI is an index formed from DI for feature sensitivity evaluation [19]. Suppose that $s_1$ and $s_2$ are values of a fault feature calculated from the signals measured in machinery state 1 and state 2, respectively, and that they obey the normal distributions $N(\mu_1, \sigma_1^{2})$ and $N(\mu_2, \sigma_2^{2})$. Then, the DI corresponding to these two states is

$$DI = \frac{\left|\mu_1 - \mu_2\right|}{\sqrt{\sigma_1^{2} + \sigma_2^{2}}}. \tag{18}$$

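Suppose the number of features used for diagnosis and the number of machinery states are *M* and *N*, respectively. Then, the SDI can be expressed as

$$SDI = \sum_{m=1}^{M}\sum_{i=1}^{N-1}\sum_{j=i+1}^{N} DI_{i,j}^{(m)}, \tag{19}$$

where $DI_{i,j}^{(m)}$ is the DI of the *m*th feature between machinery states *i* and *j*.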

According to [19], the larger the value of DI, the more sensitive is the fault feature. In addition, it is obvious from (18) and (19) that SDI is the sum of the DI over all of the fault features and fault types. Therefore, increasing the value of SDI can improve the sensitivity of the fault feature set comprehensively.
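A small numerical sketch may make the two indices concrete. It assumes the usual form of the detection index, DI = |μ₁ − μ₂| / √(σ₁² + σ₂²), with means and variances estimated from the samples, and it takes SDI as the plain sum of DI over all features and all pairs of states. The function names and the use of the unbiased sample variance (`ddof=1`) are assumptions made here for illustration.

```python
import numpy as np
from itertools import combinations

def detection_index(s1, s2):
    """DI of one feature between two states: |mu1 - mu2| / sqrt(var1 + var2)."""
    s1, s2 = np.asarray(s1, dtype=float), np.asarray(s2, dtype=float)
    # Assumption: unbiased sample variances stand in for sigma^2.
    return abs(s1.mean() - s2.mean()) / np.sqrt(s1.var(ddof=1) + s2.var(ddof=1))

def sdi(features, labels):
    """Sum of DI over all feature columns and all pairs of machinery states."""
    features, labels = np.asarray(features, dtype=float), np.asarray(labels)
    total = 0.0
    for a, b in combinations(np.unique(labels), 2):
        for m in range(features.shape[1]):
            total += detection_index(features[labels == a, m],
                                     features[labels == b, m])
    return total
```

For example, the samples `[1, 2, 3]` and `[5, 6, 7]` have means 2 and 6 and unit sample variances, so their DI is 4/√2 ≈ 2.83.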

###### 3.1.2. Types of NSPs

To construct the high-dimensional candidate feature set, the following ten NSPs are taken into consideration:where denotes the value of sampling point, and represent the peak and valley of , respectively; , , and represent the means of , , and , respectively; , , and represent the standard deviations of , , and , respectively.

###### 3.1.3. High-Dimensional Features Set Construction

To keep the noise component of the signals from degrading the extracted features, the original signals are first processed by the GHM neighborhood coefficients denoising method [36] before being input into the algorithm that constructs the high-dimensional feature set. The flowchart of this algorithm is shown in Figure 2. Its specific steps are as follows:

(1) Input the data denoised by the GHM neighborhood coefficients denoising method.

(2) Determine the decomposition level of the adaptive multiwavelets and assign it to the variable *n*.

(3) Choose one coefficient scale for optimization from the 2(*n* + 1) scales of coefficients of the adaptive multiwavelets.

(4) Assign values to the adaptive parameters *a*, *b*, *c*, *d*, and *f*, and substitute them into (11) and (12) to get the TST matrices.

(5) Perform two steps of biorthogonal multiwavelets singular TST and biorthogonal multiwavelets singular inverse TST with (8) and (9) to get new biorthogonal multiwavelets.

(6) Preprocess the input data with the oversampling preprocessing scheme [40].

(7) Decompose the preprocessed data with the biorthogonal multiwavelets constructed in step (5).

(8) Calculate the value of SDI corresponding to the chosen scale of coefficients with (19).

(9) Compare the iteration number with the preassigned maximum step of the GA. If the former is smaller, return to step (4); otherwise, save the free parameters *a*, *b*, *c*, *d*, and *f* and proceed to the next step.

(10) Determine whether all coefficient scales have been chosen. If so, proceed to the next step; otherwise, return to step (3).

(11) Use the obtained 2(*n* + 1) optimized multiwavelets to process the input data.

(12) Extract the 10 NSPs in (20) to (29) from the chosen scale of coefficients corresponding to each of the 2(*n* + 1) optimized multiwavelets.

(13) Output the 10 × 2(*n* + 1) features obtained in step (12) to form a high-dimensional candidate feature set.
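The inner loop of steps (4) to (9) is a parameter search that maximizes a fitness value over the five free parameters. The sketch below shows one way such a loop could look, using a small real-coded GA with elitism. It is only an outline under stated assumptions: `genetic_maximize`, its defaults, and the blend-crossover and Gaussian-mutation operators are hypothetical choices, and `fitness` is a placeholder for "build the multiwavelets from (*a*, *b*, *c*, *d*, *f*), decompose the data, and return the SDI of the chosen scale", which is not reproduced here.

```python
import numpy as np

def genetic_maximize(fitness, bounds, pop_size=30, generations=40,
                     mutation_scale=0.05, seed=0):
    """Maximise fitness(params) over box bounds with a small real-coded GA."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    pop = rng.uniform(low, high, size=(pop_size, low.size))
    for _ in range(generations):
        scores = np.array([fitness(p) for p in pop])
        order = np.argsort(scores)[::-1]
        elite = pop[order[: pop_size // 2]]            # keep the better half unchanged
        n_children = pop_size - len(elite)
        # Crossover: each child is a random blend of two elite parents.
        pa = elite[rng.integers(len(elite), size=n_children)]
        pb = elite[rng.integers(len(elite), size=n_children)]
        w = rng.uniform(size=pa.shape)
        children = w * pa + (1.0 - w) * pb
        # Mutation: Gaussian perturbation, clipped back into the bounds.
        children += rng.normal(scale=mutation_scale * (high - low),
                               size=children.shape)
        pop = np.vstack([elite, np.clip(children, low, high)])
    scores = np.array([fitness(p) for p in pop])
    best = int(np.argmax(scores))
    return pop[best], float(scores[best])
```

In the setting of the algorithm above, this search would be run once per coefficient scale, giving 2(*n* + 1) parameter sets in total.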

##### 3.2. Feature Selection with DI

With adaptive multiwavelets, an original feature set can be constructed. However, it contains multiple irrelevant and redundant features, which adversely affect fault diagnosis.

According to Section 3.1.1, DI can be used as an index to evaluate the sensitivity of the fault features. Therefore, feature selection with DI is conducted to remove the irrelevant features from the constructed original feature set. Generally, different scales of coefficients contain different frequency components of the signal. In order to consider fault features in each frequency band, the 10 NSPs corresponding to each scale of multiwavelet coefficients are evaluated by DI, and the 2 NSPs with the best DI values are selected for each scale. The feature selection scheme is illustrated in Figure 3. After this process, 4(*n* + 1) NSPs are obtained.
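The per-scale selection can be sketched as below. The text does not state how the DI values of several state pairs are combined into one ranking score when more than two machinery states are involved; this sketch ranks each NSP by its smallest pairwise DI, which is an assumption borrowed from the criterion used later for feature fusion. The function name and the array layout `(n_samples, n_scales, n_nsp)` are likewise hypothetical.

```python
import numpy as np
from itertools import combinations

def select_per_scale(features, labels, n_keep=2):
    """Pick the n_keep most sensitive NSPs for every coefficient scale.

    features: array of shape (n_samples, n_scales, n_nsp).
    Returns an (n_scales, n_keep) array of NSP indices, best first.
    """
    features, labels = np.asarray(features, dtype=float), np.asarray(labels)
    n_scales, n_nsp = features.shape[1:]
    score = np.full((n_scales, n_nsp), np.inf)
    for a, b in combinations(np.unique(labels), 2):
        fa, fb = features[labels == a], features[labels == b]
        di = np.abs(fa.mean(axis=0) - fb.mean(axis=0)) / np.sqrt(
            fa.var(axis=0, ddof=1) + fb.var(axis=0, ddof=1))
        # Assumption: an NSP is scored by its worst-case DI over all state pairs.
        score = np.minimum(score, di)
    return np.argsort(-score, axis=1)[:, :n_keep]
```

With 2(*n* + 1) scales and `n_keep=2`, the result indexes the 4(*n* + 1) NSPs that are passed on to feature fusion.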

##### 3.3. Feature Fusion Based on LTSA

With feature selection method, irrelevant features can be excluded from the high dimensional set. However, there still exist some redundant features in the obtained feature subset, which may have negative influence on the fault diagnosis process. Therefore, feature fusion should be implemented on the feature subset to obtain fault features with lower dimensionality and higher sensitivity.

Compared with the traditional PCA and MDS feature fusion methods, LTSA has shown better performance on nonlinear signals, which are common in measurements from rotating machinery. Hence, it is used in this paper to process the feature subset for feature fusion. In this process, GA is taken as the optimization algorithm to select proper values of the neighborhood sample number *k* and the lower dimensionality *d* for the LTSA algorithm, both of which have a marked influence on the feature extraction result. As mentioned in Section 3.1.1, the value of DI can be used to evaluate the sensitivity of a feature in discriminating two machinery states. Suppose that there are *M* machinery states to be recognized; then *M*(*M*−1)/2 DI values are related to each feature. Among these DI values, the smallest one determines the comprehensive sensitivity of the feature. Therefore, maximizing the smallest DI is taken as the optimization objective in the search for the optimal parameters *k* and *d*. The flowchart of the feature fusion algorithm is illustrated in Figure 4. The specific steps of this algorithm are as follows:

(1) Input the 4(*n* + 1) NSPs obtained in the feature selection process.

(2) Assign values to the parameters *k* and *d* of the LTSA algorithm.

(3) Use the LTSA algorithm to process the 4(*n* + 1) NSPs.

(4) Calculate the DI corresponding to each obtained feature in the lower-dimensional space.

(5) Determine whether the maximum step has been reached. If so, continue to the next step; otherwise, return to step (2).

(6) Output the *d*-dimensional features extracted by the optimal LTSA algorithm.
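The search over *k* and *d* can be illustrated with the sketch below. Two simplifications are made and should be read as assumptions: an exhaustive grid search replaces the GA used in the paper, and the objective is taken as the smallest DI over all state pairs and all fused features, since the text does not specify how the per-feature values are combined. The embedding is passed in as a callable `embed(X, k, d)` (for instance an LTSA routine), so the names `smallest_di`, `search_k_d`, and `embed` are all hypothetical.

```python
import numpy as np
from itertools import combinations, product

def smallest_di(Y, labels):
    """Smallest pairwise DI over all state pairs and all columns of Y."""
    Y, labels = np.asarray(Y, dtype=float), np.asarray(labels)
    worst = np.inf
    for a, b in combinations(np.unique(labels), 2):   # M(M-1)/2 state pairs
        ya, yb = Y[labels == a], Y[labels == b]
        di = np.abs(ya.mean(axis=0) - yb.mean(axis=0)) / np.sqrt(
            ya.var(axis=0, ddof=1) + yb.var(axis=0, ddof=1))
        worst = min(worst, float(di.min()))
    return worst

def search_k_d(X, labels, embed, k_values, d_values):
    """Return (best_score, k, d) maximising smallest_di(embed(X, k, d), labels)."""
    best = (-np.inf, None, None)
    for k, d in product(k_values, d_values):
        if d >= k:
            continue        # LTSA needs more neighbours than output dimensions
        score = smallest_di(embed(X, k, d), labels)
        if score > best[0]:
            best = (score, k, d)
    return best
```

A grid search is practical here because both parameters are small integers; a GA, as used in the paper, becomes more attractive when further parameters are tuned jointly.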

#### 4. Application Experiments

To verify the feature extraction capability of the proposed method, two datasets are used for testing: rotating machinery platform simulation data and hydroelectric generator unit data.

##### 4.1. Rotating Machinery Platform Simulation Data

The rotating machinery platform shown in Figure 5 is applied to collect vibration signals for the feature extraction experiment. The rotor of this platform is driven by a DC motor which is controlled by a speed controller. It is composed of 2 single shafts coupled together and supported by 4 bearing blocks. 2 mass disks are fixed on the rotor and 2 rub screw housings are installed on the rack of this system. The vibration signals measured by the sensors are sent to a proximitor for filtering and amplification and then transmitted to the computer for storage and analysis.

The test was performed with the platform speed set to 1200 rpm and the sampling frequency set to 2048 Hz. Four common classes of rotating machinery samples, namely normal, unbalance, misalignment, and rotor-to-stator rub samples, were collected from the platform. For each of these classes, 20 samples of 2048 points each were used to verify the feature extraction method. Typical waveforms of the 4 kinds of samples are presented in Figure 6.
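For orientation, the stated settings imply the following derived quantities. This is a worked calculation from the numbers above, not a set of values reported by the authors, and the symbols $f_r$ (rotation frequency), $T$ (sample duration), and $\Delta f$ (frequency resolution) are notation introduced here:

$$f_r = \frac{1200\ \text{rpm}}{60\ \text{s/min}} = 20\ \text{Hz}, \qquad T = \frac{2048\ \text{points}}{2048\ \text{Hz}} = 1\ \text{s}, \qquad \Delta f = \frac{1}{T} = 1\ \text{Hz}.$$

Each sample therefore covers about 20 shaft revolutions, with 2048/20 = 102.4 points per revolution.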

To verify the effectiveness of the proposed method, the acquired data were processed by GHM neighborhood coefficients denoising method first and then input into the algorithm in Section 3.1 to construct a high-dimensional candidate feature set. In this process, settings of some relevant parameters are listed in Table 1. After the high-dimensional feature set construction, 8 optimal multiwavelets and 80 features were obtained. The values of the SDI and free parameters *a*, *b*, *c*, *d*, and *f* corresponding to these optimal multiwavelets are shown in Table 2.

To exclude the irrelevant features in the high-dimensional feature set, the obtained 80 features were processed by the feature selection scheme described in Section 3.2. Take the optimal multiwavelets corresponding to the A01 scale of multiwavelet coefficients as an example. The DI values corresponding to the A01 scale optimal multiwavelets are shown in Table 3. It can be seen in Table 3 that the DI values of NSP4 and NSP6 are significantly better than those of the other features. Therefore, NSP4 and NSP6 are selected from the 10 features corresponding to the A01 scale optimal multiwavelets. Similarly, 2 features were selected from the 10 features corresponding to each of the 8 optimal multiwavelets to form a 16-dimensional feature subset. The 16 features obtained for the 20 samples of each of the 4 machinery states are shown in Table 4.

The 16-dimensional feature subset was then input into the method described in Section 3.3 for feature fusion, with the neighborhood sample number *k*, the lower dimensionality *d*, and the relevant GA parameters set to the values listed in Table 1. The final fault features were extracted with the optimal dimensionality 3 and the optimal neighborhood sample number 6. These features are illustrated in Figure 7, where four different markers represent the features of normal samples, unbalance samples, misalignment samples, and rub samples, respectively. It can be seen from Figure 7 that, with the proposed AMWT- and LTSA-based feature extraction method, the 4 rotating machinery states can be distinguished clearly.

Considering that GHM multiwavelets are the basis of the AMWT, features of rotating machinery were extracted based on GHM multiwavelets and LTSA for comparison. In this process, the AMWT was replaced with GHM multiwavelets in the high-dimensional feature set construction step. The feature selection step and feature fusion step were the same as those of the proposed method. With this method, the fault features were extracted with the optimal dimensionality 5 and the optimal neighborhood sample number 9. Similarly, features were also extracted based on the Db4 wavelet and LTSA for comparison; in this case, the fault features were extracted with the optimal dimensionality 5 and the optimal neighborhood sample number 13.

In addition, 2 other comparative experiments were carried out to verify the effectiveness of the feature extraction method based on AMWT and LTSA. One replaced LTSA with PCA, and the other replaced LTSA with MDS in the proposed method. The three-dimensional features extracted in these two comparative experiments are illustrated in Figures 8 and 9, respectively. It can be seen from Figure 8 that the features extracted by the AMWT and PCA method under misalignment and rub conditions are mixed together, and the features under normal and unbalance conditions are very close to each other. According to Figure 9, the 4 kinds of samples can be roughly distinguished; however, the features are widely scattered.

To further verify the effectiveness of the proposed method, the features extracted by the proposed method and by each of the 4 comparative methods were input into a clustering algorithm for fault classification. According to [41], K-medoids clustering is more robust to noise and outliers than K-means clustering. Therefore, it is applied in this paper. The fault discrimination results are shown in Table 5.
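For readers unfamiliar with K-medoids, the sketch below shows the basic alternating scheme: assign every sample to its nearest medoid, then move each medoid to the cluster member with the smallest total distance to the other members. It is a minimal illustration, not the implementation used in the experiments; the function name `k_medoids`, the Euclidean distance, and the deterministic farthest-point initialization are assumptions made here.

```python
import numpy as np

def k_medoids(X, n_clusters, n_iter=100):
    """Alternating K-medoids on the rows of X; returns (medoid_indices, labels)."""
    X = np.asarray(X, dtype=float)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Initialisation (assumption): start from the most central sample, then
    # repeatedly add the sample farthest from the medoids chosen so far.
    medoids = [int(np.argmin(dist.sum(axis=1)))]
    while len(medoids) < n_clusters:
        medoids.append(int(np.argmax(dist[:, medoids].min(axis=1))))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(n_clusters):
            members = np.flatnonzero(labels == c)
            if members.size:
                # New medoid: the member with the smallest summed distance
                # to all other members of the same cluster.
                within = dist[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels
```

Because every cluster center is an actual sample rather than a mean, a single outlier cannot pull the center away from the bulk of its cluster, which is the robustness property referred to above.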

It can be seen from Table 5 that the 4 rotating machinery states can all be classified by the proposed method, whereas some of these states cannot be entirely identified by the comparative methods. This indicates that the proposed method extracts more sensitive fault features for rotating machinery.

##### 4.2. Hydroelectric Generator Unit Data

The effectiveness of the proposed method was also verified by extracting features from the vibration signals of practical rotating machinery, namely a hydroelectric generator unit in a hydropower plant in China. A sketch of the structure of this hydroelectric generator unit is shown in Figure 10. It is composed of a generator, a hydroturbine, 2 brackets, a thrust bearing, and 3 guide bearings.

On one occasion, the unit was observed to vibrate so intensely that the amplitude of the vibration signals exceeded the safety threshold. A specialist identified the fault pattern as an unbalance fault. After a counterweight procedure was performed on the unit, the amplitude of the vibration signals returned to normal.

The signals acquired before and after the counterweight procedure by the accelerometer mounted on the lower bracket are taken as signals under unbalance and normal conditions, respectively. During operation of the unit, the speed was 500 rpm and the sampling frequency was set to 500 Hz. For each of the 2 machinery conditions, 16 samples of 512 points each acquired from the unit were used to verify the feature extraction method. Typical waveforms of the 2 kinds of samples are presented in Figure 11.

As in the rotating machinery platform experiment, the raw signals acquired from the hydroelectric generator unit were first denoised with the GHM neighborhood coefficients denoising method and then processed by the algorithm in Section 3.1 to construct a high-dimensional candidate feature set with 80 features. In this process, the parameter settings were the same as those for the rotating machinery platform, shown in Table 1. The values of the SDI and the optimal parameters *a*, *b*, *c*, *d*, and *f* corresponding to the obtained 8 optimal multiwavelets are shown in Table 6.

By inputting these 80 features into the method described in Section 3.2 to remove irrelevant features, 16 features were obtained. These features for the 16 samples of each of the 2 machinery states are shown in Table 7.

The obtained 16 features were processed with the feature fusion method in Section 3.3 to remove the redundant features. Then, the final fault features with the optimal dimensionality 2 and the optimal neighborhood sample number 10 were extracted. They are shown in Figure 12. It can be seen that the features of the two classes can be distinguished clearly.

The 4 comparative experiments were also carried out to verify the effectiveness of the feature extraction method. With the method based on GHM and LTSA, the fault features were extracted with the optimal dimensionality 2 and the optimal neighborhood sample number 15. The 2-dimensional extracted features are shown in Figure 13. According to Figure 13, some features of the two classes are mixed together, so the classes cannot be distinguished clearly. With the method based on Db4 and LTSA, the fault features were extracted with the optimal dimensionality 4 and the optimal neighborhood sample number 9. With the other two methods, based on AMWT and PCA and on AMWT and MDS, the 2-dimensional fault features shown in Figures 14 and 15, respectively, were obtained. It can be seen in Figures 14 and 15 that the two classes of samples can be separated; however, the points are relatively dispersed.

Inputting the features extracted by the proposed method and the 4 comparative methods into the K-medoids method gives the discrimination results shown in Table 8. According to Table 8, the discrimination rates of the proposed method and of the two comparative methods based on AMWT and PCA and on AMWT and MDS are all 100%, which is higher than the rates obtained in the other two comparative experiments. Nevertheless, comparing Figure 12 with Figures 14 and 15 shows that the two classes of samples in Figures 14 and 15 are more dispersed than those obtained with the proposed method. Therefore, the proposed method can be applied to extract fault features for rotating machinery effectively.

#### 5. Conclusions

Rotating machinery is commonly used in modern industry. An efficient feature extraction method is necessary for accurate diagnosis of the fault type and for avoiding catastrophic accidents and huge economic losses. In this paper, a novel feature extraction scheme based on AMWT and LTSA is proposed for fault diagnosis of rotating machinery. To obtain fault features with lower dimensionality and higher sensitivity, three main steps are implemented. Firstly, the high-dimensional NSPs are extracted from vibration signals with AMWT. Because the AMWT can change adaptively with the characteristics of the signals, the comprehensive sensitivity of the constructed high-dimensional feature set is improved. Then, the DI is used to evaluate the sensitivity of the NSPs in the original high-dimensional candidate feature set and to remove the irrelevant features. With this step, the dimensionality of the feature set is reduced substantially and the features with higher sensitivity are retained. After that, LTSA is used to exclude the redundant features and further reduce the dimensionality of the feature space. In this process, GA is applied to optimize the free parameters of LTSA so as to obtain an optimal feature fusion result. To verify the effectiveness of the proposed method, fault feature extraction tests were carried out with data acquired from a rotating machinery platform and a hydroelectric generator unit. The experimental results show that the proposed method can obtain fault features with lower dimensionality and higher sensitivity and can thus improve the fault classification performance.

#### Data Availability

The data are attached in the “Optional Supplementary Materials” named as “supp.1201084.xlsx.”

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

The authors acknowledge the financial support from the National Natural Science Foundation of China under Grant No. 51609203 and from the Xi'an Science and Technology Planning Project under Grant No. NC1504(1).

#### Supplementary Materials

Four common classes of rotating machinery vibration signals, namely normal, unbalance, misalignment, and rotor-to-stator rub samples, collected from the rotating machinery platform described in the paper are included in the data file. For each of these classes, there are 20 samples of 2048 points each. The file contains four sheets, which hold the normal, unbalance, misalignment, and rotor-to-stator rub samples, respectively.