Abstract

Dimension reduction methods have proved powerful and practical for extracting latent features from signals for process monitoring. A linear dimension reduction method called nonlocal orthogonal preserving embedding (NLOPE) and its nonlinear form, nonlocal kernel orthogonal preserving embedding (NLKOPE), are proposed and applied to condition monitoring and fault detection. Unlike kernel orthogonal neighborhood preserving embedding (KONPE) and kernel principal component analysis (KPCA), the NLOPE and NLKOPE models aim at preserving the global and local data structures simultaneously by constructing a dual-objective optimization function, and a weighting parameter is introduced to adjust the trade-off between the two structures. NLKOPE combines the advantages of KONPE and KPCA, and it is also more powerful than NLOPE in extracting potentially useful features from nonlinear data sets. For condition monitoring and fault detection, monitoring statistics are constructed in the feature space. Finally, three case studies on gearbox and bearing test rigs demonstrate the effectiveness of the proposed nonlinear fault detection method.

1. Introduction

Mechanical equipment is widely used in modern industrial production, but it often suffers damage during long-term operation, such as the fracture of bearings and broken gear teeth; defects in these parts may degrade machine performance or even cause safety accidents. Therefore, fault detection in mechanical equipment is of great significance for ensuring the safety and economic benefits of the industrial production process. In recent years, multivariate statistical process monitoring (MSPM) techniques have been developed to detect faults in industrial production processes, such as principal component analysis (PCA) [1], partial least squares (PLS) [2], and independent component analysis (ICA) [3]. These classical monitoring methods perform dimension reduction on the process data and extract a few components to construct monitoring statistics that reflect the characteristics of the original data; consequently, the quality of the dimension reduction directly affects the monitoring performance.

The multivariate data-driven PCA-based monitoring framework is the most frequently employed approach in the field of condition monitoring and fault detection. To overcome the weakness that linear monitoring methods may perform poorly on nonlinear processes, KPCA-based monitoring has been widely investigated and used to detect faults [4, 5]. Although the improved PCA-based monitoring methods can retain latent features of the raw data, they capture only the global structure of the data and ignore its local structure, even though features extracted from the local structure also represent different aspects of the data. The loss of this important information may degrade both the dimension reduction and the monitoring results [6].

In contrast to dimension reduction techniques that preserve the global data structure, manifold learning methods have been developed to preserve local structure characteristics, represented by Laplacian eigenmaps (LE) [7], locality preserving projections (LPP) [8], locally linear embedding (LLE) [9], and neighborhood preserving embedding (NPE) [10]. LPP and NPE are both linear projection methods that can process testing data conveniently, and manifold-learning-based monitoring methods can overcome some limitations of PCA-based monitoring. However, these manifold learning methods consider only neighborhood relationships to preserve local properties among samples and thus may lose crucial information contained in the global data structure. To take both global and local structure characteristics into account, methods that unify LPP and PCA have been proposed, and their fault detection performance has proven better than that of LPP and PCA alone [11, 12]. But these approaches are still linear; when employed on nonlinear process data, they have inherent limitations and may deliver poor monitoring performance.

On the other hand, kernel functions are commonly used to extend linear methods to nonlinear ones by mapping the original data from the input space into a high dimensional feature space and then performing the linear method there. To take full advantage of the global and local data structures while handling nonlinear monitoring problems efficiently, a kernel global-local preserving projections (KGLPP) method [13] based on KLPP and KPCA has been proposed, and the results show that it outperforms the linear global-local preserving projections (GLPP) method [14]. Orthogonal neighborhood preserving embedding (ONPE) is an orthogonal form of the conventional NPE algorithm that adds an orthogonality constraint on the projection vectors [15]; thus, ONPE not only inherits the local structure preserving property but also avoids the distortion defects of NPE [16]. Moreover, orthogonality is an advantage for fault detection and diagnosis. First, orthogonal transformations enhance the locality preserving power, which aids data reconstruction and the computation of reconstruction errors, both useful for fault detection. Second, dimension reduction methods with an orthogonality constraint can improve identification performance, which helps detect faults effectively [17].

In this paper, a new nonlinear dimension reduction method named nonlocal kernel orthogonal preserving embedding (NLKOPE) is proposed on the basis of a linear dimension reduction method named nonlocal orthogonal preserving embedding (NLOPE). NLOPE combines the advantages of ONPE and PCA, and NLKOPE is its nonlinear extension. An exponentially weighted moving average (EWMA) statistic is built for condition monitoring and fault detection. To verify the effectiveness of the proposed methods, they are employed to detect gearbox faults and to evaluate bearing performance degradation. To diagnose the bearing fault type, the dual-tree complex wavelet packet transform (DTCWPT) is used for noise reduction, and the Hilbert transform envelope algorithm is employed to extract the fault characteristic frequency.

The rest of the paper is organized as follows. KPCA, ONPE, and KONPE are reviewed and analyzed in Section 2. The proposed NLOPE-based monitoring method is developed in Section 3, and the proposed NLKOPE-based monitoring method in Section 4. In Section 5, three case studies demonstrate the effectiveness of the proposed methods. Finally, conclusions are drawn in Section 6.

2. Background Techniques

2.1. Kernel Principal Component Analysis

As a multivariate method, PCA is widely used for process monitoring. However, in complicated industrial processes with nonlinear characteristics, PCA performs poorly because it treats the process data as linear, and useful nonlinear features may be lost during dimension reduction and feature extraction. KPCA performs a nonlinear PCA by constructing a nonlinear mapping from the input space to the feature space through a kernel function. Given a data set $X = [x_1, x_2, \ldots, x_n]$, $x_i \in \mathbb{R}^{m}$, where $n$ is the number of samples and $m$ is the number of variables, the samples in the input space are mapped into the feature space $F$ by a nonlinear mapping $\Phi: \mathbb{R}^{m} \to F$. The covariance matrix in the feature space can be expressed as

$$C = \frac{1}{n}\sum_{i=1}^{n}\Phi(x_i)\Phi(x_i)^{T} \tag{1}$$

where it is assumed that the data set in the feature space is centered, $\sum_{i=1}^{n}\Phi(x_i) = 0$. The principal components can be calculated by solving the eigenvalue problem in the feature space

$$\lambda v = Cv = \frac{1}{n}\sum_{i=1}^{n}\langle\Phi(x_i), v\rangle\Phi(x_i) \tag{2}$$

where $\langle\Phi(x_i), v\rangle$ denotes the dot product between $\Phi(x_i)$ and $v$, $\lambda$ denotes an eigenvalue, and $v$ denotes the corresponding eigenvector. For $\lambda \neq 0$, the eigenvector $v$ can be regarded as a linear combination of $\Phi(x_1), \ldots, \Phi(x_n)$ with coefficients $\alpha_i$, $i = 1, 2, \ldots, n$:

$$v = \sum_{i=1}^{n}\alpha_i\Phi(x_i) \tag{3}$$

Multiplying both sides of (2) by $\Phi(x_k)$, $k = 1, 2, \ldots, n$, we obtain

$$\lambda\langle\Phi(x_k), v\rangle = \langle\Phi(x_k), Cv\rangle \tag{4}$$

Defining a kernel matrix $K \in \mathbb{R}^{n\times n}$ with $K_{ij} = \langle\Phi(x_i), \Phi(x_j)\rangle$, (4) can be expressed as

$$n\lambda\alpha = K\alpha \tag{5}$$

where $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_n]^{T}$. Solving the eigenvalue problem of (5) yields eigenvectors $\alpha^{1}, \alpha^{2}, \ldots, \alpha^{n}$ with eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$. The coefficient vectors are normalized to satisfy $\lambda_k\langle\alpha^{k}, \alpha^{k}\rangle = 1$, which ensures $\langle v^{k}, v^{k}\rangle = 1$. The projection of a test sample $x$ onto the $k$th component is obtained as follows:

$$t_k = \langle v^{k}, \Phi(x)\rangle = \sum_{i=1}^{n}\alpha_i^{k}\langle\Phi(x_i), \Phi(x)\rangle \tag{6}$$
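To make the KPCA steps concrete, the following is a minimal NumPy sketch; the Gaussian kernel and its width are illustrative choices, not prescribed by the text, and all function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(X, Y, width):
    """K[i, j] = exp(-||X[i] - Y[j]||^2 / width)."""
    d2 = (np.sum(X ** 2, axis=1)[:, None]
          + np.sum(Y ** 2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-d2 / width)

def kpca_fit(X, n_components, width=5.0):
    """X: (n, m) samples as rows; returns centered kernel and coefficients."""
    n = X.shape[0]
    K = gaussian_kernel(X, X, width)
    E = np.ones((n, n)) / n
    Kc = K - E @ K - K @ E + E @ K @ E        # feature-space centering, cf. (14)
    eigval, eigvec = np.linalg.eigh(Kc)       # ascending eigenvalues
    lam = eigval[::-1][:n_components]         # leading eigenpairs of (5)
    alpha = eigvec[:, ::-1][:, :n_components]
    alpha = alpha / np.sqrt(np.maximum(lam, 1e-12))  # normalization, cf. (5)-(6)
    return Kc, alpha

# Scores of the training samples themselves: T = Kc @ alpha, one row per sample.
```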

2.2. Kernel Orthogonal Neighborhood Preserving Embedding
2.2.1. Orthogonal Neighborhood Preserving Embedding

Given a high dimensional data set $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{m\times n}$, ONPE, as a linear dimension reduction method, seeks a transformation matrix $A$ that maps the high dimensional data set to the low dimensional data set $Y = [y_1, y_2, \ldots, y_n] \in \mathbb{R}^{d\times n}$, that is, $Y = A^{T}X$. In the NPE algorithm, in order to preserve the local geometric structure, an adjacency graph is built to reflect the relationships between samples; each sample is reconstructed by a linear combination of its neighbors with corresponding weight coefficients. The weight coefficient matrix $W$ is computed by minimizing the following objective function:

$$\min_{W}\sum_{i=1}^{n}\Big\|x_i - \sum_{j=1}^{n}W_{ij}x_j\Big\|^{2}, \quad \text{s.t.}\ \sum_{j=1}^{n}W_{ij} = 1 \tag{7}$$

The reconstruction weights between $x_i$ and its neighbors are preserved so that $y_i$ is reconstructed from its corresponding neighbors. The low dimensional embedding is then obtained by optimizing the error function

$$\min_{Y}\sum_{i=1}^{n}\Big\|y_i - \sum_{j=1}^{n}W_{ij}y_j\Big\|^{2} = \min_{A}\operatorname{tr}\big(A^{T}XMX^{T}A\big) \tag{8}$$

where $M = (I - W)^{T}(I - W)$, $k$ is the number of neighbors of $x_i$, and $W_{ij} = 0$ if $x_j$ is not among the $k$ nearest neighbors of $x_i$. In ONPE, any high dimensional sample can be mapped into the reduced space by the orthogonal projection matrix $A = [a_1, a_2, \ldots, a_d]$; according to (7)-(8), the matrix $A$ is obtained from

$$\min_{a_k}\ a_k^{T}XMX^{T}a_k, \quad \text{s.t.}\ a_k^{T}a_k = 1,\ a_k^{T}a_j = 0\ (j = 1, \ldots, k-1) \tag{9}$$

Using the Lagrange multiplier method, the orthogonal vectors $a_k$ are calculated iteratively as follows:

(1) $a_1$ is the eigenvector corresponding to the smallest eigenvalue of the matrix $XMX^{T}$.

(2) $a_k$ is the eigenvector corresponding to the smallest eigenvalue of the matrix

$$M^{(k)} = \big(I - A_{k-1}A_{k-1}^{T}\big)XMX^{T} \tag{10}$$

where $A_{k-1} = [a_1, a_2, \ldots, a_{k-1}]$.
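The weight step in (7) is a small constrained least-squares problem per sample. The sketch below is one plausible implementation; note it takes samples as rows (the transpose of the $X \in \mathbb{R}^{m\times n}$ convention above), and the regularization constant is an illustrative stability choice, not from the paper.

```python
import numpy as np
from scipy.spatial.distance import cdist

def npe_weights(X, k):
    """X: (n, m) samples as rows; returns W with each x_i reconstructed
    from its k nearest neighbors, weights summing to one, cf. (7)."""
    n = X.shape[0]
    dist = cdist(X, X)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(dist[i])[1:k + 1]     # skip the sample itself
        Z = X[nbrs] - X[i]                      # neighbors shifted to x_i
        G = Z @ Z.T                             # local Gram matrix
        G += 1e-3 * np.trace(G) * np.eye(k)     # regularize when k > m
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs] = w / w.sum()                # enforce the sum-to-one constraint
    return W

# The matrix M in (8) then follows as M = (I - W).T @ (I - W).
```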

2.2.2. Kernel Orthogonal Neighborhood Preserving Embedding

The ONPE algorithm has some ability to handle nonlinear data, but it is essentially a linear method, and its limitations become obvious when extracting nonlinear features. KONPE is a nonlinear extension of ONPE: the original data are mapped into a kernel-induced feature space, in which ONPE can be applied to obtain the low dimensional intrinsic characteristics.

Given a data set $X = [x_1, x_2, \ldots, x_n]$, the data are projected onto the high dimensional feature space by the nonlinear mapping $\Phi$, where it is assumed that the mapped data are centered, $\sum_{i=1}^{n}\Phi(x_i) = 0$. Let $P = [p_1, p_2, \ldots, p_d]$ be the linear mapping matrix that projects the data from the feature space to the low dimensional space; the low dimensional embedding is obtained as $Y = P^{T}\Phi(X)$. Each column of $P$ can be expressed as a linear combination of $\Phi(x_1), \ldots, \Phi(x_n)$ with coefficients $\beta_i$:

$$p = \sum_{i=1}^{n}\beta_i\Phi(x_i) = \Phi(X)\beta \tag{11}$$

where $\beta = [\beta_1, \beta_2, \ldots, \beta_n]^{T}$. Substituting (11) into the ONPE objective (9) and writing $\tilde{K}$ for the centered kernel matrix, the problem of computing $P$ is converted into solving for the coefficient vectors:

$$\min_{\beta_k}\ \beta_k^{T}\tilde{K}M\tilde{K}\beta_k, \quad \text{s.t.}\ \beta_k^{T}\tilde{K}\beta_k = 1,\ \beta_k^{T}\tilde{K}\beta_j = 0\ (j = 1, \ldots, k-1) \tag{12}$$

Based on the Lagrange multiplier method, the specific steps are as follows:

(1) $\beta_1$ is the eigenvector corresponding to the smallest eigenvalue of the matrix $M\tilde{K}$, obtained from the generalized eigenvalue problem $\tilde{K}M\tilde{K}\beta = \lambda\tilde{K}\beta$.

(2) $\beta_k$ is the eigenvector corresponding to the smallest eigenvalue of the matrix

$$M^{(k)} = \big(I - B_{k-1}B_{k-1}^{T}\tilde{K}\big)M\tilde{K} \tag{13}$$

where $B_{k-1} = [\beta_1, \beta_2, \ldots, \beta_{k-1}]$. The kernel matrix $K$, $K_{ij} = \langle\Phi(x_i), \Phi(x_j)\rangle$, should be centered by

$$\tilde{K} = K - E_nK - KE_n + E_nKE_n \tag{14}$$

where $E_n \in \mathbb{R}^{n\times n}$ is the matrix whose entries are all $1/n$.

Given a test sample $x_t$, its low dimensional representation is obtained by mapping $x_t$ into the feature space vector $\Phi(x_t)$:

$$y_t = P^{T}\Phi(x_t) = B^{T}\tilde{k}_t \tag{15}$$

where $B = [\beta_1, \ldots, \beta_d]$ and $\tilde{k}_t$ is the centered kernel vector of $x_t$ with entries $k(x_i, x_t) = \langle\Phi(x_i), \Phi(x_t)\rangle$.

The test kernel vector should also be centered as follows:

$$\tilde{k}_t = k_t - E_nk_t - K\mathbf{1}_n + E_nK\mathbf{1}_n \tag{16}$$

where $k_t = [k(x_1, x_t), \ldots, k(x_n, x_t)]^{T}$ and $\mathbf{1}_n \in \mathbb{R}^{n}$ is the column vector whose entries are all $1/n$.
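A minimal sketch of the test-kernel centering in (16), written for a batch of test samples whose rows are $k(x_t, x_1), \ldots, k(x_t, x_n)$; only training statistics are used, and the function name is illustrative.

```python
import numpy as np

def center_test_kernel(K_train, K_test):
    """K_train: (n, n) training kernel; K_test: (nt, n) test kernel rows."""
    n = K_train.shape[0]
    nt = K_test.shape[0]
    E = np.ones((n, n)) / n
    Et = np.ones((nt, n)) / n
    # Row-wise analogue of (16): subtract train/test means, add back the grand mean.
    return K_test - Et @ K_train - K_test @ E + Et @ K_train @ E
```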

3. Nonlocal Orthogonal Preserving Embedding

3.1. Algorithm Description

In order to preserve both the local and global data structures, the NLOPE algorithm is proposed to unify the advantages of PCA and ONPE. Given a data set $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{m\times n}$, the objective function of NLOPE is

$$J(a) = \min_{a}\big\{\eta\,J_{L}(a) - (1-\eta)\,J_{G}(a)\big\} = \min_{a}\ a^{T}Ga, \quad \text{s.t.}\ a^{T}a = 1 \tag{17}$$

where $J_{L}(a) = a^{T}XMX^{T}a$ is the local reconstruction error of ONPE, $J_{G}(a) = a^{T}XX^{T}a$ is the global variance of PCA, $G = \eta XMX^{T} - (1-\eta)XX^{T}$, and $\eta \in [0, 1]$ is the weighting parameter that balances the two structures.

Using the Lagrange multiplier method, the projection vectors can be calculated by solving the following eigenvector problems:

(1) $a_1$ is the eigenvector corresponding to the smallest eigenvalue of the matrix $G$.

(2) $a_k$ is the eigenvector corresponding to the smallest eigenvalue of the matrix

$$M^{(k)} = \big(I - A_{k-1}A_{k-1}^{T}\big)G \tag{18}$$

where $A_{k-1} = [a_1, a_2, \ldots, a_{k-1}]$, $k = 2, \ldots, d$, and $d$ is the dimension of the samples in the NLOPE space. A strict mathematical proof of the projection vectors is given in Appendix A.
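The following minimal NumPy sketch implements the two-step computation in (17)-(18); the deflation form of $M^{(k)}$ follows the reconstruction in Appendix A, and all names are illustrative.

```python
import numpy as np

def nlope_fit(X, M, eta, d):
    """X: (m, n) data with samples as columns; M from the NPE weight step."""
    m = X.shape[0]
    G = eta * X @ M @ X.T - (1.0 - eta) * X @ X.T   # cf. (17)
    A = np.zeros((m, d))
    for k in range(d):
        if k == 0:
            Mk = G
        else:
            Ak = A[:, :k]
            Mk = (np.eye(m) - Ak @ Ak.T) @ G        # deflation, cf. (18)
        eigval, eigvec = np.linalg.eig(Mk)          # Mk is not symmetric for k > 0
        a = np.real(eigvec[:, np.argmin(np.real(eigval))])
        A[:, k] = a / np.linalg.norm(a)             # enforce a^T a = 1
    return A                                         # embedding: Y = A.T @ X
```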

3.2. Selection of Parameter

The parameter $\eta$ describes the different roles of global and local structure preservation in constructing the NLOPE model; choosing an appropriate value of $\eta$ is important because it affects the extraction of latent variables. Since a dual-objective optimization problem must be solved, it is usually hard to find an absolutely optimal solution that optimizes both subobjectives simultaneously. However, a relatively optimal solution can be obtained by balancing them. The parameter $\eta$ balances the matrices $XMX^{T}$ and $XX^{T}$ in (17) and can be regarded as balancing their energy variations. Thus, the spectral radius of each matrix is used to estimate the value of $\eta$.

To balance the global and local structure of the data, $\eta$ can be selected as follows [6]:

$$\eta\,\gamma_{L} = (1-\eta)\,\gamma_{G} \tag{19}$$

where $\gamma_{L} = \rho(XMX^{T})$ and $\gamma_{G} = \rho(XX^{T})$ denote the energy variations of the local and global objectives and $\rho(\cdot)$ is the spectral radius of a matrix; $XMX^{T}$ and $XX^{T}$ are defined in (17). Thus, $\eta$ is computed by

$$\eta = \frac{\rho\big(XX^{T}\big)}{\rho\big(XMX^{T}\big) + \rho\big(XX^{T}\big)} \tag{20}$$
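A sketch of the trade-off selection in (19)-(20), assuming the reconstructed spectral-radius form above; the function name is illustrative.

```python
import numpy as np

def select_eta(X, M):
    """X: (m, n) centered data; M: (n, n) NPE matrix; returns eta in (0, 1)."""
    rho_global = np.max(np.abs(np.linalg.eigvals(X @ X.T)))      # rho(X X^T)
    rho_local = np.max(np.abs(np.linalg.eigvals(X @ M @ X.T)))   # rho(X M X^T)
    return rho_global / (rho_global + rho_local)                 # cf. (20)
```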

3.3. Monitoring Model

In the PCA-based monitoring method, Hotelling's $T^{2}$ statistic and the squared prediction error (SPE) statistic are often used for fault detection. Similarly, these two monitoring statistics are applied in the NLOPE-based model. Hotelling's $T^{2}$ measures the variation in the latent variable space and flags a new sample if its variation in the latent variables is greater than that explained by the model; it is computed as

$$T^{2} = y^{T}\Lambda^{-1}y \tag{21}$$

where $y = A^{T}x$ and $\Lambda$ is the covariance matrix of the projected vectors of the training samples in the NLOPE subspace.

The SPE statistic measures the variation in the residual space and the goodness of fit of a new sample to the model; it is defined as follows [20]:

$$\mathrm{SPE} = \|x - Ay\|^{2} \tag{22}$$

where $y$ is the embedding of the input sample $x$ in the NLOPE subspace.
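A minimal sketch of the two statistics in (21)-(22) for a new sample; names are illustrative.

```python
import numpy as np

def t2_spe(x, A, scores_train):
    """A: (m, d) NLOPE projection matrix; scores_train: (n, d) training scores."""
    y = A.T @ x                                # embedding of x
    S = np.cov(scores_train, rowvar=False)     # covariance of training scores
    t2 = y @ np.linalg.inv(S) @ y              # Hotelling's T^2, cf. (21)
    residual = x - A @ y                       # reconstruction error
    spe = residual @ residual                  # SPE, cf. (22)
    return t2, spe
```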

As it is hard to assess the condition of a machine from the raw vibration signal alone, features need to be constructed. Time-domain and frequency-domain features generated from vibration data are widely used to characterize the state of machinery: time-domain features such as kurtosis, crest factor, and impulse factor are sensitive to impulsive oscillation, while frequency-domain features reveal information that time-domain features cannot capture. In this study, 11 time-domain features and 13 frequency-domain features [21] were extracted from each sample to construct the high dimensional feature sample. For condition monitoring and fault detection, it is critical to extract the most useful information hidden in the current machine state; the dimension reduction methods above are therefore employed to extract latent features effectively.
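For illustration, a sketch of the three time-domain features named in the text; the full set of 11 time-domain and 13 frequency-domain features of [21] is not reproduced here.

```python
import numpy as np

def time_domain_features(x):
    """Kurtosis, crest factor, and impulse factor of one vibration sample x."""
    rms = np.sqrt(np.mean(x ** 2))
    kurtosis = np.mean((x - x.mean()) ** 4) / (x.std() ** 4)
    crest_factor = np.max(np.abs(x)) / rms
    impulse_factor = np.max(np.abs(x)) / np.mean(np.abs(x))
    return np.array([kurtosis, crest_factor, impulse_factor])
```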

In order to detect incipient faults of mechanical equipment more accurately and reliably, an exponentially weighted moving average (EWMA) statistic based on a combined index of the $T^{2}$ and SPE statistics is developed. The combined index $\varphi$ is a summation of the $T^{2}$ and SPE statistics:

$$\varphi = \frac{T^{2}}{T^{2}_{\lim}} + \frac{\mathrm{SPE}}{\mathrm{SPE}_{\lim}} \tag{23}$$

where $T^{2}_{\lim}$ and $\mathrm{SPE}_{\lim}$ are the control limits of the $T^{2}$ and SPE statistics, computed by the kernel density estimation (KDE) algorithm; the values of the $T^{2}$ statistic are first normalized between 0 and 1 using the maximal and minimal values of $T^{2}$, and the values of the SPE statistic are normalized in the same way.

The EWMA statistic is computed as follows:

$$z_{i} = \theta\varphi_{i} + (1-\theta)z_{i-1} \tag{24}$$

where $z_{0}$ is calculated as the average of the preliminary data and $\theta$ is a smoothing constant between 0 and 1. When $\theta$ is large, $z_{i}$ puts more weight on the current statistic than on the historical statistics. The control limit for the EWMA statistic is also calculated by the KDE method. In this study, the value of $\theta$ is set to 0.2.
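A sketch of (23)-(24), assuming the min-max normalization is applied to each statistic before the summation; initializing $z_0$ from the first value rather than an average of preliminary data is a simplification.

```python
import numpy as np

def ewma_statistic(t2, spe, t2_lim, spe_lim, theta=0.2):
    """t2, spe: 1-D arrays of statistics over time; returns the EWMA series."""
    t2n = (t2 - t2.min()) / (t2.max() - t2.min())        # normalize T^2 to [0, 1]
    spen = (spe - spe.min()) / (spe.max() - spe.min())   # normalize SPE to [0, 1]
    phi = t2n / t2_lim + spen / spe_lim                  # combined index, cf. (23)
    z = np.empty_like(phi)
    z[0] = phi[0]                                         # simplification for z_0
    for i in range(1, len(phi)):
        z[i] = theta * phi[i] + (1.0 - theta) * z[i - 1]  # EWMA recursion, cf. (24)
    return z
```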

The offline modeling procedure is as follows:

(1) Take the healthy samples as training samples, convert each original training sample into a high dimensional feature sample, and normalize the feature samples to zero mean and unit variance.

(2) Calculate the projection matrix $A$ by solving the eigenvector problems in (17)-(18), and compute the projected vectors of the training samples in the NLOPE subspace.

(3) Compute the $T^{2}$ and SPE statistics of all training samples, calculate the control limits $T^{2}_{\lim}$ and $\mathrm{SPE}_{\lim}$, and then obtain the EWMA statistics and their control limit.

The online monitoring procedure is as follows (see the sketch after this list):

(1) Convert each testing sample into a high dimensional feature sample, and normalize it with the mean and variance of the training feature samples.

(2) Calculate the projected vectors of the testing samples as $y = A^{T}x$.

(3) Compute the EWMA statistics associated with $\varphi$, and check whether they exceed the control limit.
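A compact end-to-end sketch tying the preceding snippets together (`npe_weights`, `select_eta`, `nlope_fit`, `t2_spe`, `ewma_statistic`); `kde_limit` is a simple-quantile stand-in for the KDE control-limit step, which is an assumption rather than the paper's method.

```python
import numpy as np

def kde_limit(values, q=0.99):
    """Placeholder for the KDE-based 99% control limit."""
    return np.quantile(values, q)

def nlope_monitor(train_feats, test_feats, k=20, d=3):
    """train_feats, test_feats: (n, 24) feature samples, one row per sample."""
    mu, sd = train_feats.mean(0), train_feats.std(0)
    Xtr = ((train_feats - mu) / sd).T            # (m, n), samples as columns
    Xte = ((test_feats - mu) / sd).T
    W = npe_weights(Xtr.T, k)                    # local reconstruction weights
    I_n = np.eye(W.shape[0])
    M = (I_n - W).T @ (I_n - W)
    eta = select_eta(Xtr, M)                     # global/local trade-off
    A = nlope_fit(Xtr, M, eta, d)                # NLOPE projection matrix
    scores = (A.T @ Xtr).T
    stats_tr = np.array([t2_spe(x, A, scores) for x in Xtr.T])
    t2_lim = kde_limit(stats_tr[:, 0])
    spe_lim = kde_limit(stats_tr[:, 1])
    stats_te = np.array([t2_spe(x, A, scores) for x in Xte.T])
    return ewma_statistic(stats_te[:, 0], stats_te[:, 1], t2_lim, spe_lim)
```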

4. Nonlocal Kernel Orthogonal Preserving Embedding

4.1. Algorithm Description

The NLKOPE algorithm performs a nonlinear NLOPE by means of the kernel trick. Given a data set $X = [x_1, x_2, \ldots, x_n]$, the nonlinear mapping $\Phi$ projects $X$ onto the feature space, and the mapped data set is assumed to be centered, $\sum_{i=1}^{n}\Phi(x_i) = 0$. The objective function of the NLKOPE algorithm is

$$J(p) = \min_{p}\ p^{T}\big\{\eta\,\Phi(X)M\Phi(X)^{T} - (1-\eta)\,\Phi(X)\Phi(X)^{T}\big\}p, \quad \text{s.t.}\ p^{T}p = 1 \tag{25}$$

According to (11), computing $P$ reduces to finding the coefficient vectors $\beta$; using the Lagrange multiplier method, the formulation is converted into

$$J(\beta) = \min_{\beta}\ \beta^{T}\big\{\eta\,\tilde{K}M\tilde{K} - (1-\eta)\,\tilde{K}\tilde{K}\big\}\beta, \quad \text{s.t.}\ \beta^{T}\tilde{K}\beta = 1 \tag{26}$$

The coefficient vectors are obtained as follows:

(1) $\beta_1$ is the eigenvector corresponding to the smallest eigenvalue of the matrix

$$H = \eta\,M\tilde{K} - (1-\eta)\,\tilde{K} \tag{27}$$

(2) $\beta_k$ is the eigenvector corresponding to the smallest eigenvalue of the matrix

$$M^{(k)} = \big(I - B_{k-1}B_{k-1}^{T}\tilde{K}\big)H \tag{28}$$

where $B_{k-1} = [\beta_1, \beta_2, \ldots, \beta_{k-1}]$, $k = 2, \ldots, d$, $d$ is the dimension of the samples in the NLKOPE space, and $\tilde{K}$ is the centered kernel matrix of $X$ obtained by (14). A strict mathematical proof of the coefficient vectors is given in Appendix B.
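A minimal sketch of the coefficient computation in (26)-(28), mirroring the NLOPE loop in kernel form; the deflation follows the reconstruction in Appendix B, and names are illustrative.

```python
import numpy as np

def nlkope_fit(Kc, M, eta, d):
    """Kc: (n, n) centered training kernel from (14); M: (n, n) NPE matrix."""
    n = Kc.shape[0]
    H = eta * M @ Kc - (1.0 - eta) * Kc              # cf. (27)
    B = np.zeros((n, d))
    for k in range(d):
        if k == 0:
            Mk = H
        else:
            Bk = B[:, :k]
            Mk = (np.eye(n) - Bk @ Bk.T @ Kc) @ H    # deflation, cf. (28)
        eigval, eigvec = np.linalg.eig(Mk)
        b = np.real(eigvec[:, np.argmin(np.real(eigval))])
        b /= np.sqrt(abs(b @ Kc @ b) + 1e-12)        # enforce beta^T K beta = 1
        B[:, k] = b
    return B                                          # training scores: Y = Kc @ B
```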

4.2. Selection of Parameter

The method for choosing the parameter $\eta$ in the NLKOPE model is the same as in the NLOPE model, while the spectral radii are computed from the kernel matrices, and $\eta$ is set as

$$\eta = \frac{\rho\big(\tilde{K}\tilde{K}\big)}{\rho\big(\tilde{K}M\tilde{K}\big) + \rho\big(\tilde{K}\tilde{K}\big)} \tag{29}$$

where $\tilde{K}M\tilde{K}$ and $\tilde{K}\tilde{K}$ are defined in (26).

4.3. Monitoring Model

Hotelling's $T^{2}$ statistic and the SPE statistic are also used in the NLKOPE-based model to monitor abnormal variations. The $T^{2}$ statistic is defined as

$$T^{2} = y^{T}\Lambda^{-1}y \tag{30}$$

where $y = B^{T}\tilde{k}_t$ and $\Lambda$ is the covariance matrix of the projected vectors of the training samples.

The SPE statistic is defined as [22]

$$\mathrm{SPE} = \sum_{j=1}^{n}y_{j}^{2} - \sum_{j=1}^{d}y_{j}^{2} \tag{31}$$

where $y_{j} = \beta_{j}^{T}\tilde{k}_t$ and $\tilde{k}_t$ is the centered kernel vector of $x_t$ obtained via (16).
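A sketch of (30)-(31), assuming the coefficients for all $n$ directions are retained so that the discarded-energy form of the SPE can be evaluated; names are illustrative.

```python
import numpy as np

def nlkope_t2_spe(kt_c, B_full, d, scores_train):
    """kt_c: centered kernel vector of one test sample from (16);
    B_full: (n, n) coefficients for all directions, first d span the model;
    scores_train: training scores Y = Kc @ B_full."""
    y_all = B_full.T @ kt_c                        # scores in every direction
    y = y_all[:d]                                  # retained NLKOPE scores
    S = np.cov(scores_train[:, :d], rowvar=False)
    t2 = y @ np.linalg.inv(S) @ y                  # Hotelling's T^2, cf. (30)
    spe = y_all @ y_all - y @ y                    # discarded energy, cf. (31)
    return t2, spe
```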

The offline modeling procedure is as follows:

(1) Take the healthy samples as training samples, convert each original training sample into a high dimensional feature sample, and normalize the feature samples to zero mean and unit variance.

(2) Compute the kernel matrix $K$ by selecting a kernel function, and center it via (14).

(3) Obtain the projection coefficients $B$ by solving the eigenvector problems in (27)-(28).

(4) Compute the $T^{2}$ and SPE statistics of all training samples, calculate the control limits $T^{2}_{\lim}$ and $\mathrm{SPE}_{\lim}$, and then obtain the EWMA statistics and their control limit.

The online monitoring procedure is as follows:

(1) Convert each testing sample into a high dimensional feature sample, and normalize it with the mean and variance of the training feature samples.

(2) Compute the kernel vector of the testing sample and center it via (16).

(3) Calculate the projected vectors of the testing samples via (15).

(4) Compute the EWMA statistics associated with $\varphi$, and judge whether they exceed the control limit.

The procedure of condition monitoring and fault detection using NLKOPE is shown in Figure 1. Healthy vibration signals are collected to implement NLKOPE and construct the offline model, which is then employed for online condition monitoring, fault detection, and performance degradation assessment.

5. Case Studies and Result Analysis

5.1. Fault Detection of Gearboxes

The 2009 PHM gearbox fault data [18] are representative of generic industrial gearbox data and are used to evaluate the proposed methods. The gearbox contains 4 gears, 6 bearings, and 3 shafts; the measured signals consist of two accelerometer signals and a tachometer signal with a sampling frequency of 66.67 kHz. The schematic and overview of the gearbox are shown in Figure 2. In this study, 3 different health conditions of the helical gearbox under low load and 30 Hz speed are used to test fault detection, and the detailed description of the data and patterns is given in Table 1. In the health pattern, all mechanical elements in the gearbox are normal. In the pattern of fault 1, the gear with 24 teeth on the idler shaft is chipped. In the pattern of fault 2, the gear with 24 teeth on the idler shaft is broken, and the bearing on the output side of the idler shaft also has an inner race defect.

In this case, 1024 sampling points are selected as one sample, and 30 samples are extracted for each pattern. The 30 samples from the health pattern are used as training samples, and the remaining 60 samples from the patterns of fault 1 and fault 2 are collected as testing samples. In other words, all 90 samples are monitored to detect whether the gearbox is faulty, and the gearbox actually starts to fail at the 31st sample.

For the purpose of comparison, five monitoring methods based on KPCA, KONPE, NLOPE, KGLPP [13], and NLKOPE are applied to detect the gearbox fault. The embedding dimension in each model is set to 3, and the number of nearest neighbors is set to 20 in the KONPE, NLOPE, KGLPP, and NLKOPE models. The 99% confidence limit is used for the $T^{2}$, SPE, and EWMA statistics. In order to compare the results more clearly, the fault detection rate (FDR) is used as an indicator in this case.

Monitoring charts of the five methods are shown in Figure 3, and the detailed fault detection results are listed in Table 2. The KPCA monitoring model first detects the gearbox fault at the 33rd sample, as shown in Figure 3(a), while the 35th, 36th, 37th, and 38th samples are all under the control limit, which means detection fails at these samples. Figure 3(b) illustrates that KONPE detects the fault from the 33rd sample, meaning detection fails at the 31st and 32nd samples. As shown in Figures 3(c)-3(d), NLOPE and KGLPP detect the fault from the 32nd sample, whereas the gearbox actually starts to fail at the 31st sample. The detection result of the NLKOPE monitoring model is shown in Figure 3(e); its EWMA statistic works well and detects the gearbox fault accurately. Besides, as shown in Table 2, the fault detection rates of NLOPE, KGLPP, and NLKOPE are higher than those of KPCA and KONPE, which consider only the global or the local data structure. Although NLOPE considers the global-local data information, its ability to process nonlinear data is not as prominent as that of NLKOPE. The results indicate that the NLKOPE-based monitoring method outperforms the KPCA-, KONPE-, NLOPE-, and KGLPP-based monitoring methods.

5.2. Dimension Reduction Performance Assessment

In this case, the experimental data from Case Western Reserve University [23] are used to evaluate the dimension reduction performance of the proposed methods. The bearings used at the drive end are deep groove ball bearings of type 6205-2RS JEM SKF. Data were collected at a 12 kHz sampling frequency, a rotating speed of 1797 rpm, and 0 HP load. The sample sets include 7 different conditions: health; inner race faults with fault sizes of 0.007, 0.014, 0.021, and 0.028 in; and outer race and ball faults with a fault size of 0.014 in. We select 1024 sampling points as one sample and extract 70 samples for each condition. The first 35 samples of each condition are collected as training samples, and the remaining 35 samples are used as testing samples.

The purpose of dimension reduction is to make the low dimensional samples cluster within classes and separate between classes, which helps improve fault classification. Thus, the clustering degree is used as a quantitative index to evaluate dimension reduction performance; it is defined as follows:

$$J_{c} = \frac{\dfrac{1}{C}\displaystyle\sum_{c=1}^{C}\dfrac{1}{n_{c}}\sum_{i=1}^{n_{c}}\big\|y_{i}^{(c)} - \bar{y}^{(c)}\big\|^{2}}{\dfrac{1}{C}\displaystyle\sum_{c=1}^{C}\big\|\bar{y}^{(c)} - \bar{y}\big\|^{2}} \tag{32}$$

where $C$ is the number of fault types, $n_{c}$ is the sample size of the $c$th fault type, $y_{i}^{(c)}$ is the low dimensional embedded coordinate, $\bar{y}^{(c)}$ is the mean of the embedded coordinates of the $c$th fault type, and $\bar{y}$ is the mean of all low dimensional embedded coordinates.
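A sketch of the index, assuming the within-class over between-class scatter form reconstructed in (32), so that smaller values indicate tighter and better-separated classes.

```python
import numpy as np

def clustering_degree(Y, labels):
    """Y: (N, d) low dimensional embeddings; labels: (N,) fault-type indices."""
    classes = np.unique(labels)
    overall_mean = Y.mean(axis=0)
    within, between = 0.0, 0.0
    for c in classes:
        Yc = Y[labels == c]
        mc = Yc.mean(axis=0)
        within += np.sum((Yc - mc) ** 2) / len(Yc)       # intraclass scatter
        between += np.sum((mc - overall_mean) ** 2)      # interclass scatter
    return (within / len(classes)) / (between / len(classes))
```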

11 time-domain features and 13 frequency-domain features [21] are extracted from each sample as the variables making up the high dimensional sample, and for clear visualization the embedding dimension is set to 3. For the purpose of comparison, five methods including KPCA, KONPE, NLOPE, KGLPP [13], and NLKOPE are used to obtain the dimension reduction results on the training and testing samples, respectively; scatter plots of the three features are shown in Figure 4. Health, Fault1, Fault2, Fault3, Fault4, Fault5, and Fault6 in Figure 4 represent the 7 different fault types, namely health, the four inner race faults with fault sizes of 0.007, 0.014, 0.021, and 0.028 in, and the outer race and ball faults with a fault size of 0.014 in, and x, y, z indicate the three-dimensional representation based on the three features extracted by the respective methods. Figure 4 illustrates the classification abilities of the five methods for the 3D-clustered samples, where samples of the same fault type are marked in the same color.

The distribution of samples of the same fault type is dispersed in Figure 4(a), while different fault types gather together in Figure 4(b); both situations may increase the probability of misclassification. The clustering degrees are listed in Table 3: the value for KGLPP is close to that of NLKOPE, and the dimension reduction result based on NLKOPE has the minimum clustering degree, which is beneficial for improving the accuracy of fault classification.

5.3. Condition Monitoring and Performance Degradation Assessment of Bearing

In this case, the aim is to implement condition monitoring and evaluate the performance degradation of a bearing, for which the degradation index is essential; we hope to identify degradation at an early stage in order to avoid continuous deterioration and minimize machine downtime. The bearing experimental data were generated from a run-to-failure test [19, 24]. Figure 5 illustrates the bearing test rig. The rotation speed was kept constant at 2000 rpm, and each sample consists of 20480 points with the sampling rate set at 20 kHz. The structural and kinematical parameters (shaft frequency $f_r$, inner-race fault frequency $f_i$, rolling element fault frequency $f_b$, and outer-race fault frequency $f_o$) of the experimental bearing are listed in Table 4, and detailed information about the experiments is given in the literature [19]. One bearing (bearing 3 of test 1) with an inner race defect is used to verify the performance of the proposed algorithm. We extract 2100 run-to-failure samples recorded for bearing 3; the first 500 samples are used as training samples, and the rest serve as testing samples.

For the purpose of comparison, five monitoring methods based on KPCA, KONPE, NLOPE, KGLPP [13], and NLKOPE are applied to assess the bearing performance state. The 99% confidence limit is used for the $T^{2}$, SPE, and EWMA statistics. In this case, 2100 run-to-failure samples were extracted, and the 1790th sample is regarded as the initial weak degradation point based on the research in the literature [25].

As shown in Figures 6 and 7, the EWMA statistics present the state of the bearing, and the 1797th sample is identified as the initial weak degradation point where the performance of the bearing begins to degrade. As the samples were recorded every ten minutes, detection by the KPCA- or KONPE-based monitoring method is 70 minutes later than the result in the literature [25], and the KPCA-EWMA statistic, with its large fluctuations, is not suitable for condition monitoring. Figures 8-10 illustrate the detection results of the NLOPE-, KGLPP-, and NLKOPE-based monitoring methods; all three locate the initial weak degradation point at the 1789th sample, 10 minutes earlier than the result in [25], and the statistics after the 1789th sample all exceed the control limits, whereas the LPP-EWMA statistics [25] between the 1950th and 2150th samples fall below the control limit, which means detection fails in that interval. Although the fault detection accuracy of the NLOPE-based method is better than that of KPCA and KONPE, this advantage is not prominent, since the EWMA statistic after the initial weak degradation point fluctuates considerably, as shown in Figure 8, which hampers evaluation of the bearing performance state. The performance degradation assessment of the KGLPP-based method is slightly inferior to that of the NLKOPE-based method, because the NLKOPE-EWMA statistic reflects the damage degree of the bearing from the occurrence of the incipient defect to final failure: as shown in Figure 10, the EWMA statistic continues to grow from the severe degradation stage to final failure, which is consistent with the actual bearing degradation. We can thus conclude that the NLKOPE-based monitoring model, which considers the global and local data structures together, obtains better monitoring performance than models considering only the global or only the local structure of the data.

The above results show that the proposed method can be used effectively for bearing fault detection; the next step is to diagnose the fault type. We extract the 1789th sample for analysis. As shown in Figure 11, the signal is complex and contains a lot of noise, so it is hard to judge whether the bearing is faulty from the time waveform alone, as the fault features are submerged by the strong noise. To extract useful features for diagnosis, the noise in the original vibration signal must be suppressed. The dual-tree complex wavelet packet transform (DTCWPT) is a multiscale method with attractive properties such as near shift-invariance and reduced aliasing, and it has been widely used in signal processing [26]. In this study, DTCWPT combined with thresholding is employed to denoise the original vibration signal, and the Hilbert transform envelope algorithm is applied to extract the fault characteristic frequency. As shown in Figure 12, the noise in the vibration signal is greatly reduced, and transient periodicity produced by the impacts of the bearing defect becomes visible. The envelope spectrum of the denoised vibration signal is presented in Figure 13: the shaft frequency $f_r$ and its harmonics, as well as the fault characteristic frequency $f_i$ and its harmonics, are all effectively extracted, and sidebands at $f_i - f_r$ and $f_i + f_r$ appear on both sides of the fault characteristic frequency $f_i$. Therefore, the bearing inner race can be judged to be faulty, which is in line with the actual condition of the bearing.
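A minimal sketch of the Hilbert envelope-spectrum step; the DTCWPT denoising is assumed to have been applied beforehand, as no standard SciPy implementation exists for it.

```python
import numpy as np
from scipy.signal import hilbert

def envelope_spectrum(x, fs):
    """Return the frequency axis and envelope-spectrum amplitude of signal x."""
    envelope = np.abs(hilbert(x))                 # amplitude envelope
    envelope -= envelope.mean()                   # remove the DC component
    spec = np.abs(np.fft.rfft(envelope)) / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs, spec

# Peaks at the inner-race fault frequency f_i and its harmonics, with sidebands
# spaced at the shaft frequency f_r, indicate an inner race defect.
```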

6. Conclusions

In this paper, a linear dimension reduction method called nonlocal orthogonal preserving embedding (NLOPE) is proposed, together with its nonlinear form, nonlocal kernel orthogonal preserving embedding (NLKOPE). In order to retain the geometry of the latent manifold, NLOPE and NLKOPE both take the global and local data structures into account, and a trade-off parameter is introduced to balance global and local preservation. Hence, compared with KPCA and KONPE, NLKOPE is more general and flexible, and it is also more powerful than NLOPE in extracting latent information from nonlinear data. Based on the results of the three cases, NLKOPE achieves the best dimension reduction performance, which is beneficial for improving the accuracy of fault classification; the NLKOPE-based monitoring method also attains a higher fault detection rate and is more sensitive and effective in evaluating bearing performance degradation than the KPCA-, KONPE-, and NLOPE-based monitoring methods.

Appendix

A.

To obtain $a_1$, we construct the Lagrange function of (17):

$$L(a_1) = a_1^{T}Ga_1 - \lambda\big(a_1^{T}a_1 - 1\big) \tag{A.1}$$

where $G = \eta XMX^{T} - (1-\eta)XX^{T}$. Setting the partial derivative of $L$ with respect to $a_1$ to zero, we get

$$Ga_1 = \lambda a_1 \tag{A.2}$$

Since the objective value equals $a_1^{T}Ga_1 = \lambda$, $a_1$ is the eigenvector corresponding to the smallest eigenvalue of the matrix $G$.

To obtain $a_k$, we construct the Lagrange function of (17) with the orthogonality constraints:

$$L(a_k) = a_k^{T}Ga_k - \lambda\big(a_k^{T}a_k - 1\big) - \sum_{j=1}^{k-1}\mu_j a_k^{T}a_j \tag{A.3}$$

Setting the partial derivative of $L$ with respect to $a_k$ to zero gives

$$2Ga_k - 2\lambda a_k - \sum_{j=1}^{k-1}\mu_j a_j = 0 \tag{A.4}$$

Multiplying the left side of (A.4) by $a_j^{T}$ ($j = 1, \ldots, k-1$) and using $a_j^{T}a_k = 0$ and $a_j^{T}a_l = \delta_{jl}$, we have

$$2a_j^{T}Ga_k - \mu_j = 0 \tag{A.5}$$

Let $A_{k-1} = [a_1, \ldots, a_{k-1}]$ and $\mu = [\mu_1, \ldots, \mu_{k-1}]^{T}$; then (A.5) can be represented as

$$\mu = 2A_{k-1}^{T}Ga_k \tag{A.6}$$

Substituting (A.6) into (A.4), we obtain

$$\big(I - A_{k-1}A_{k-1}^{T}\big)Ga_k = \lambda a_k \tag{A.7}$$

Multiplying the left side of (A.7) by $a_k^{T}$ shows that $\lambda$ again equals the objective value $a_k^{T}Ga_k$. Thus, $a_k$ is the eigenvector corresponding to the smallest eigenvalue of the matrix

$$M^{(k)} = \big(I - A_{k-1}A_{k-1}^{T}\big)G \tag{A.8}$$

B.

To obtain $\beta_1$, we construct the Lagrange function of (26):

$$L(\beta_1) = \beta_1^{T}G_{K}\beta_1 - \lambda\big(\beta_1^{T}\tilde{K}\beta_1 - 1\big) \tag{B.1}$$

where $G_{K} = \eta\tilde{K}M\tilde{K} - (1-\eta)\tilde{K}\tilde{K}$. Setting the partial derivative of $L$ with respect to $\beta_1$ to zero, we get

$$G_{K}\beta_1 = \lambda\tilde{K}\beta_1 \tag{B.2}$$

Multiplying both sides of (B.2) by $\tilde{K}^{-1}$ yields $H\beta_1 = \lambda\beta_1$ with $H = \eta M\tilde{K} - (1-\eta)\tilde{K}$; thus $\beta_1$ is the eigenvector corresponding to the smallest eigenvalue of the matrix $H$.

To obtain $\beta_k$, we construct the Lagrange function of (26) with the orthogonality constraints:

$$L(\beta_k) = \beta_k^{T}G_{K}\beta_k - \lambda\big(\beta_k^{T}\tilde{K}\beta_k - 1\big) - \sum_{j=1}^{k-1}\mu_j\beta_k^{T}\tilde{K}\beta_j \tag{B.3}$$

Setting the partial derivative of $L$ with respect to $\beta_k$ to zero and multiplying by $\tilde{K}^{-1}$, we obtain

$$2H\beta_k - 2\lambda\beta_k - \sum_{j=1}^{k-1}\mu_j\beta_j = 0 \tag{B.4}$$

Multiplying the left side of (B.4) by $\beta_j^{T}\tilde{K}$ ($j = 1, \ldots, k-1$) and using $\beta_j^{T}\tilde{K}\beta_k = 0$ and $\beta_j^{T}\tilde{K}\beta_l = \delta_{jl}$, we have

$$2\beta_j^{T}\tilde{K}H\beta_k - \mu_j = 0 \tag{B.5}$$

Let $B_{k-1} = [\beta_1, \ldots, \beta_{k-1}]$ and $\mu = [\mu_1, \ldots, \mu_{k-1}]^{T}$; then

$$\mu = 2B_{k-1}^{T}\tilde{K}H\beta_k \tag{B.6}$$

Substituting (B.6) into (B.4), we obtain

$$\big(I - B_{k-1}B_{k-1}^{T}\tilde{K}\big)H\beta_k = \lambda\beta_k \tag{B.7}$$

Thus, $\beta_k$ is the eigenvector corresponding to the smallest eigenvalue of the matrix

$$M^{(k)} = \big(I - B_{k-1}B_{k-1}^{T}\tilde{K}\big)H \tag{B.8}$$

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research is supported by the National Natural Science Foundation of China (nos. 61640308 and 61573364) and the Natural Science Foundation of Naval University of Engineering (no. 20161579). The authors are also grateful to the 2009 PHM Challenge Competition, Case Western Reserve University, and the NSF I/UCR Center for Intelligent Maintenance Systems, University of Cincinnati, USA, for providing the experimental data.