#### Abstract

A trend prediction method based on the Pchip-EEMD-GM(1,1) to predict the remaining useful life (RUL) of rolling bearings was proposed in this paper. Firstly, the dimension of the extracted features was reduced by the KPCA dimensionality reduction method, and the WPHM model parameters were estimated via the kernel principal components. Secondly, the hazard rate was calculated at each time, and the Pchip interpolation method was used to obtain the uniformly spaced interpolation data series. Then the main trend of signal was obtained through the EEMD method to fit the GM(1,1) prediction model. Finally, the GM (1,1) method was used to predict the remaining life of the rolling bearing. The full life test of rolling bearing was provided to demonstrate that the method predicting the hazard data directly has the higher accuracy compared with predicting the covariates, and the results verified the feasibility and effectiveness of the proposed method for predicting the remaining life.

#### 1. Introduction

Rolling bearings are critical components that determine the remaining lifetime of machinery [1]. The bearings are important parts in rotating machinery, and bearing failure may lead to catastrophic accidents [2]. Previous analysis has shown that faults caused by vibration accounted for 70% of all mechanical faults. Furthermore, faulty rolling bearings accounted for 30% of the faults caused by vibration [3]. The CBM uses modern sensing technology to provide the maintenance decisions before the equipment becomes faulty by real-time updated degradation information of the equipment [4]. The CBM is more effective when determining equipment maintenance based on the vibration data, rather than the traditional maintenance strategies. The CBM initially establishes exact correspondence between the condition parameters of equipment and reliability in order to provide accurate maintenance decisions [5]. Effective maintenance strategies reduce the maintenance downtime and cost, while simultaneously ensuring the normal operation of the equipment [6, 7]. Improving the prediction accuracy of the residual life plays an important role in making correct maintenance decisions that are based on the running status of the equipment.

The life prediction theory based on condition monitoring is an important research as continuous improvements are made in the measuring and the testing techniques. The proportional hazards model (PHM) [8] is a statistic analysis model of the lifetime data. It is applicable in models with no special requirements for data distribution or residual distribution. The PHM analyzes censored data, which establishes the failure model based on equipment condition monitoring and historical life data. This method is commonly used for predicting equipment life. Ding F. et al. [9] set the root mean square (RMS) and the kurtosis as the covariant and then used the Weibull proportional hazards model (WPHM) to assess the reliability of the rolling bearings on railway wheels. Zimroz R et al.[10] used the load-dependent feature processing to diagnose the wind turbine bearings in strong operating and nonstationarity conditions. Liao H. T. et al. [11] set the RMS and the kurtosis as a covariant, where the logistic regression model and the PHM were used to predict the remaining useful life of single equipment. Zhang Q. et al. [12] used the mixed WPHM to predict the remaining life of a mechanical system that contained multiple failure modes. Chi K. R. L. et al. [13] used the state-space Switching Kalman Filter (SKF) method for predicting the remaining life and providing maintenance decisions based on the degradation model. In real life applications, the trend prediction is a key component of the remaining life prediction. By improving the accuracy of trend prediction, the accuracy of the remaining life is confirmed.

The trend prediction uses historical time-series vibration data and conjecture data variations from future data to establish the prediction model. Current trend prediction methods include the curve-fitting methods, the time-series methods, the neural network methods, the support vector machine methods, and the grey model methods. The computational process of the curve-fitting methods is simple; however, the prediction accuracy is low. The standard time-series prediction methods are based on the autoregressive model and the autoregressive moving average model. The prediction accuracy is low, and the methods are only suitable for short-term prediction. The neural network methods and the support vector machine methods require trained data, which cannot be used in the trend prediction of small datasets. The grey system theory [14] demonstrated the evolution law of things based on the analysis of lacking systematic characteristics, operating mechanism, and behaviors. Liu S. D. et al. [15] normalized the weighted time domain parameters and used the grey model to predict the life trend, which resulted in a high prediction accuracy. Yang J. T. et al. [16] used a multivariable grey prediction model to predict the development of mechanical failure, which overcame the disadvantages of the traditional fault prediction methods by considering each characteristic parameter separately. Liu E. L. et al. [17] used the vibration signals of rolling bearings to adopt the GM(1, 1) model, which predicted the life trend based on two typical characteristic indicators (RMS and kurtosis). Tabaszewski M. [18, 19] used the moving method applied to GM(1,1) model for estimating the model parameters to develop the method based on the weighted mean forecast. The life trend reflects the performance degradation process and became the foundation for subsequent RUL prediction. The grey model methods require less data and the prediction accuracy is high. This paper proposed a modified grey model knew as the Pchip-EEMD-GM(1, 1) model to predict the trend based on WPHM.

#### 2. The Fundamental Theory

##### 2.1. WPHM

The WPHM established the mathematical relationship between the running status of the equipment and the reliability. The current hazard rate was obtained based on the real-time running status. The hazard function of the WPHM is defined as follows:where is the parameter shape of the Weibull distribution and is the scale parameter of the Weibull distribution. is a column vector that is composed of a covariant, which have time-varying characteristics. The degree to which the selected covariant could accurately reflect the performance degradation process is imperative to the accuracy of the model. is a row vector composed of the regression parameters of the corresponding covariant.

The reliability and the failure probability density are estimated with

The rotating rolling bearing is considered to be in failure when the reliability of time is lower than the preset threshold. The failure time is defined as follows:where is the failure threshold or the characteristic life when it is equal to [20].

The error of the RUL prediction is defined aswhere is the predicted RUL and is the actual RUL.

##### 2.2. GM(1, 1) Model

The grey system theory revealed a variation law of research objects with less data and insufficient information based on the analysis of systematic characteristics. The theory uses small samples and insufficient information to establish the prediction model [21].The GM(1, 1) model is the most widely used prediction model of the grey system theory, where the specific modeling process is described as follows.

is set as the original nonnegative data series. is described as the generation data series. The relationship between and can be described as (5), which is referred to as the accumulated generating operation:

The accumulated generating operation effectively eliminates the volatility and the randomness of the original data series, aiming at building an ascending series. The new-generation data series is roughly the exponential growth. Then, the differential coefficient equation can be established as

which is referred to as the winterization equation of GM(1, 1) model. and are unknown parameters, denoted as . The parameter of vector can be solved through the least square method. The discrete solutions of and can be described in

Finally, the grey prediction series of the original data can be described as

#### 3. Method and Steps

This paper extracted the kernel principal elements to establish the WPHM model and used Pchip-EEMD method to preprocess the obtained hazard rate data directly and fit the requirements of the GM(1,1) prediction model. Figure 1 showed the flow chart of this method.

The specific steps are as follows.

(1) Extraction of the kernel principal components: the feature parameters were extracted from the full life vibration data, and the KPCA method was used to reduce the dimension of the feature set and obtain the kernel principal elements.

(2) Establishing the WPHM: the WHPM parameters were estimated based on the obtained kernel principal components and the hazard rate was calculated at the corresponding time.

(3) Data preprocessing: the Pchip interpolation method was used to get the uniformly spaced interpolation data series, and the main trend was reconstructed by removing the high-frequency signal through the EEMD decomposition method to fit the GM(1,1) model.

(4) RUL prediction: the GM (1,1) model was used to predict the trend of hazard rate data, and the reliability was calculated to predict the remaining life of the bearing.

#### 4. Experiment Verification

##### 4.1. The Features Extraction

The rolling bearing life cycle datasets for this experiment were provided by the Center for Intelligent Maintenance Systems (IMS), University of Cincinnati [22]. The experimental datasets were generated from bearing run-to-failure tests under load conditions on a specially designed test rig as shown in Figure 2.

**(a) Bearing test rig**

**(b) Sensor placement illustration**

There were four test double row bearings (Rexnord ZA-2115) on a single shaft within the bearing test rig. The shaft was driven by an AC motor and coupled with rub belts. A 6,000 lb radial load was added to the shaft and the bearings via a spring mechanism. The rotation speed was kept constant at 2,000 rpm throughout the experiment. A magnetic plug was installed in the oil feedback pipe in order to collect the oil debris as evidence of bearing degradation. The test was stopped when the accumulated debris that had adhered to the magnetic plug exceeded a certain level, which caused an electrical switch to close. The vibration data was collected every 20 minutes with a National Instruments DAQCard-6062E data acquisition card (data sampling rate 20 kHz and data length 20,480 points). The data collection was conducted in the National Instruments LabVIEW program.

Figure 3 displays the vibration signal of a tested bearing during its entire lifecycle. The results show that the amplitudes of vibration signals are steady during the early stage, but the amplitude has significantly increased compared to the normal standard amplitude in the end of the bearing life, which indicates that the vibration signals contain the bearing performance degradation. Figure 4 shows the pictures of bearing components after a test; the bearing failure in different forms. From the vibration signal, 11 features that could reflect the degradation process are selected to form a feature set [23].The features were as follows:

**(a) Inner race defect**

**(b) Roller element defect**

**(c) Outer race defect**

(1) Time domain: RMS, kurtosis, peak to peak value, and peak factor.

(2) Frequency domain: spectral mean, the root mean square value of the spectrum, and spectral variance.

(3) Time-frequency domain: the third band normalized wavelet packet energy spectrum of 3-level wavelet packet decomposition (E3), the seventh band normalized wavelet packet energy spectrum (E7), the entropy of the third band of the 3-level wavelet packet decomposition (S3), and the seventh band samples entropy value (S7).

Figure 5 shows the degradation trend in full life cycle of the selected features over time, which are suitable for RUL prediction of the rolling bearing.

**(a) Time domain, RMS**

**(b) Time domain, peak-peak value**

**(c) Time domain, kurtosis**

**(d) Time domain, peak factor**

**(e) Frequency domain, spectrum mean**

**(f) Frequency domain, spectrum variance**

**(g) Frequency domain, spectrum RMS**

**(h) Time-frequency domain, E3**

**(i) Time-frequency domain, E7**

**(j) Time-frequency domain, S3**

**(k) Time-frequency domain, S7**

**(l) The time of data collection**

##### 4.2. The GM(1,1) Model Application

The KPCA was used to reduce the dimension of the bearing feature set and to select the first three-dimensional kernel principal components, whose cumulative contribution rate was greater than 85%. The WPHM parameters were estimated and shown in Table 1 [23].

The hazard rate was calculated by using the variation data series of the KPC1, the KPC2, and the KPC3. The different degradation stages [23] were used to select the data from four time periods in order to predict the trend with the GM(1,1) model. The characteristics of the data were shown by amplifying and marking each time period in Figure 6.

Figure 6 shows that the data was collected with the unequal interval sampling method. The GM(1, 1) model required the use of the equal interval data series when predicting the trend. Thus, equally spaced interpolation was required. The MATLAB toolkit provided a variety of interpolation methods, including the Pchip interpolation and the Spline interpolation. These methods were compared by using the data from day 30.2 to day 31.0 (period (a)), as well as the data from day 32.1 to day 33.1 (period (c)). Figures 7 and 8 showed the results of the two interpolation methods. The performance is nearly the same in period (a) of Figure 7, because the data was approximately the same interval with no large gaps, but the intervals between data points were large and uneven in period (c) of Figure 8; the transient overshoot in the interpolated line between the two points throughout the Spline interpolation method was more obvious than in the Pchip interpolation method. So, the Pchip interpolation method performed better in terms of shape preserving especially dealing with the data that has large spacing and are uneven.

**(a) Pchip interpolation**

**(b) Spline interpolation**

**(a) Pchip interpolation**

**(b) Spline interpolation**

The data series adopted interpolation processing, although the oscillation still existed within the data points. The oscillation had a passive influence on trend prediction according to the GM(1, 1) model. The influence of the transient overshoot component among data points need to be reduced in order to achieve a smoother trend component of the data. This was accomplished by dividing the data series into a dozen signals using the ensemble empirical mode decomposition (EEMD). The first several signals that contained the high-frequency oscillation component were omitted in order to reconstruct the main trend of data series based on the remaining component signals.

Figures 9 and 10 show the interpolation curve and the main trend curve of periods (a) and (c), and Figures 11 and 12 show the comparison results of the trend prediction curve of periods (a) and (c). As seen in Figures 11 and 12, the proposed Pchip-EEMD-GM(1, 1) model (the hazard rate prediction method) was used to predict the trend that would effectively eliminate high-frequency oscillation throughout the data series. This substantially decreased the influence of discrete points selection and obtained a more stable and more reliable trend prediction result.

##### 4.3. RUL Prediction

Previous research verified that the kernel principal components could reflect the degradation process of the rolling bearings as the covariants, and it included more bearing life cycle data than RMS values or kurtosis [23]. Both the covariate and the hazard rate trends can be predicted using the proposed method. However, the hazard rate prediction method can optimize the prediction steps and reduce the amount of model calculation. To verify the effectiveness of the proposed method, the first three KPCs and the hazard rate were used to predict the main trend, respectively, and then predict the remaining life. The results were compared to the actual URL to verify the accuracy of the two methods. Figure 13 showed the variation of the kernel components and the hazard rate over the entire life of the bearing; the three KPCs curves were unstable and oscillated more while the hazard rate obtained from the KPCs was relatively stable with less turbulence. The reconstructed data series from the four periods ((a), (b), (c), and (d)) were used to predict the future trend periods through the covariant prediction method and the hazard rate prediction method.

Figures 14–17 show the trend prediction results of the covariates and the hazard rate method throughout different degradation stages of the rolling bearings. The predicted values were closer to the actual values as time progressed. The RUL accuracy of the prediction methods was verified through the threshold of reliability, which is typically set by statistical experience. The reliability value of the tested rolling bearing at the severe wear stage was around [23]; we set the reliability value of as the moment of complete failure of the bearing; the bearing operating time was taken as the bearing life. The prediction results of RUL are presented in Table 2.

The prediction errors of the two RUL prediction methods were calculated using (4). The results were shown in Figure 18. The prediction error continued to decrease as the rolling bearings degenerated. As seen in Table 2, the proposed method achieved a better performance than the covariates prediction method throughout the four stages with smaller error. And the predicted RUL was quite different from the actual RUL of the tested rolling bearing during the normal working stage. This may cause by accident the damage of the rolling bearings; the trend of the hazard varied nonlinearly, which led to larger errors at the normal working stage. The accumulation of the new degradation characteristics at the early failure stage led to the predicted RUL becoming closer to the actual RUL. The accuracy degree reached 90.16% (1-error rate ) at the medium wear stage. The result showed that the hazard rate prediction method of the Pchip-EEMD-GM(1, 1) model accurately predicted the rolling bearing’s RUL with less computational amount than the covariates prediction method, as well as providing timely and effective maintenance decisions.

#### 5. Conclusion

In this study, the Pchip-EEMD-GM(1, 1) model successfully and accurately predicted the variation trend and the RUL of the rolling bearings based on the vibration data. Pchip-EEMD was used to extract the main trend of data change. It overcame the shortcomings of the traditional trend predicting methods with large fluctuations and low prediction accuracies. The proposed method improved the prediction accuracy more effectively. The advantages of the proposed method were listed as follows.

(1) The original data adopts interpolation processing by the Pchip interpolation method. This method has a good form of conformality and prevention of overshoot, maintains the trend characteristics of the original signal, and helps to improve the reliability of RUL prediction results.

(2) The EEMD method is used to extract the main trend of the data; it reduces data fluctuations to fit the the GM (1,1) prediction model, which helps to improve the model prediction accuracy.

(3) The WPHM was used to calculate the hazard rate in the bearing life cycle, and the hazard rate was predicted directly, which reduced the error of prediction and the amount of calculation, optimized the prediction steps, and improved the prediction accuracy.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was supported by the National Natural Science Foundation of China (no. 51375067).