Shock and Vibration

Volume 2017 (2017), Article ID 6184190, 11 pages

https://doi.org/10.1155/2017/6184190

## Rolling Bearing Reliability Assessment via Kernel Principal Component Analysis and Weibull Proportional Hazard Model

^{1}School of Mechanical Engineering, Dalian University of Technology, Dalian, China; ^{2}School of Mathematical Sciences, Dalian University of Technology, Dalian, China; ^{3}School of Business Management, Dalian University of Technology, Dalian, China

Correspondence should be addressed to Fengtao Wang; wangft@dlut.edu.cn

Received 9 December 2016; Revised 24 February 2017; Accepted 13 March 2017; Published 16 April 2017

Academic Editor: Giorgio Dalpiaz

Copyright © 2017 Fengtao Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Reliability assessment is a critical consideration in equipment engineering projects. Successful reliability assessment, which depends on selecting features that accurately reflect performance degradation as inputs to the assessment model, allows for proactive maintenance of equipment. In this paper, a novel method based on kernel principal component analysis (KPCA) and the Weibull proportional hazard model (WPHM) is proposed to assess the reliability of rolling bearings. A high relative feature set is constructed by extracting time domain, frequency domain, and time-frequency domain features from the bearing’s life cycle data and selecting the effective ones. The kernel principal components that accurately reflect the performance degradation process are obtained by KPCA and then used as the covariates of the WPHM to assess reliability. An example was conducted to validate the proposed method. Extracting relative features reduces the effect of differences in manufacturing, installation, and working conditions among bearings of the same type, which enhances the practicability and stability of the proposed method.

#### 1. Introduction

The rolling bearing is one of the most important components of rotating machinery [1], and its running state directly influences the health of the entire system [2]. It is also easily damaged, as it not only supports load but also permits relative motion [3, 4]. Effective rolling bearing maintenance strategies can reduce downtime and maintenance cost and also ensure the normal operation of the equipment [5, 6]. Accurate reliability assessment is essential for making predictive maintenance decisions based on the real-time status of equipment.

Reliability assessment comes with two key challenges: the construction of an appropriate reliability assessment model and the selection of features that accurately reflect the performance degradation process. Reliability assessment based on real-time equipment conditions has become a popular research topic in recent years [7], and advancements in information technology and artificial intelligence have brought about a number of valuable contributions to the literature. For example, the proportional hazard model (PHM), first introduced by the British statistician Cox [8] in 1972, is a powerful statistical analysis methodology. It is an important statistical regression model based on lifetime data and condition monitoring data and has been successfully used for reliability assessment in accelerated life testing. Ding et al. [9] used the Weibull proportional hazard model (WPHM) to assess the reliability of a rolling bearing in real time. Liao et al. [10] used a logistic regression model and the PHM to assess the reliability of an individual unit. Zhang et al. [11] used a mixed WPHM to predict the failure of a mechanical system with multiple failure modes.

The WPHM is a well-established mathematical model. However, applying it to real equipment life prediction raises problems in covariate selection, reliability threshold setting, trend prediction, and other areas. In terms of covariate selection, most previous studies rely on direct time domain statistical analysis, in which one or more time domain features are selected to build a reliability assessment model. However, a single feature, or features drawn from a single domain, cannot accurately reflect the performance degradation process, which reduces the overall accuracy of reliability assessment. Although time domain, frequency domain, and time-frequency domain features together can comprehensively reflect the performance degradation process of bearings over their entire service lifetime, an excessive number of parameters leads to data redundancy. Further, selecting more covariates for the WPHM makes the parameter estimation process more challenging. The vibration signals of faulty machinery are generally nonstationary and nonlinear under complicated operating conditions [12, 13]. Therefore, it is crucial to select features through nonlinear dimensionality reduction that removes redundant features.

Kernel principal component analysis (KPCA), first proposed by Schölkopf et al. [14, 15], generalizes principal component analysis (PCA) to nonlinear cases by nonlinearly mapping input samples to a higher dimensional feature space and then performing PCA as usual [16]. It has been successfully utilized in process monitoring and fault diagnosis applications. Lee et al. [17] developed a new nonlinear process monitoring technique based on KPCA. Jiang et al. [18] proposed a fault diagnosis approach based on KPCA and multiclass classifiers of a support vector machine. He et al. [19] used the low-dimensional principal component representations from the statistical features of measured signals to characterize and monitor gearbox conditions. Su et al. [20] used a Euclidean distance discriminating approach to distinguish bearing fault data by adopting the first seven principal components as inputs.

To overcome the weakness in WPHM covariate selection, this paper proposes a novel method for assessing the reliability of rolling bearings based on KPCA and WPHM. The novelty of this research lies in improving the covariate selection method of the WPHM, which has considerable value in practical application. A high relative feature set is constructed by extracting time domain, frequency domain, and time-frequency domain features from the bearing’s life cycle data and selecting the effective ones. Then the first three kernel principal components (KPCs) obtained through KPCA, which accurately reflect the performance degradation process, are selected as WPHM covariates to assess the reliability. The feasibility and effectiveness of the method were validated using a bearing’s life cycle data, and the method can provide an important basis for proactive equipment maintenance. Extracting relative features reduces the effect of differences in manufacturing, installation, and working conditions among bearings of the same type, which enhances the practicability and stability of the proposed method compared to traditional assessment techniques. The work enriches the theory of covariate selection and places its emphasis on application innovation.

The remainder of this paper is organized as follows. Section 2 presents the fundamental theories of KPCA and WPHM. Section 3 presents the proposed method for reliability assessment in detail. Section 4 discusses the feature extraction method used to reflect the bearing performance degradation process through KPCA. The case study conducted to validate the proposed method is reported in Section 5, and conclusions are given in Section 6.

#### 2. Fundamental Theory

##### 2.1. Kernel Principal Component Analysis

KPCA essentially works by nonlinearly mapping input samples to a high-dimensional feature space and applying linear PCA to the transformed signals. For nonlinear data, KPCA is more effective than PCA.

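In KPCA, a set of multidimensional signals $x_i \in \mathbb{R}^m$, $i = 1, 2, \ldots, N$, is mapped to $\Phi(x_i)$, $i = 1, 2, \ldots, N$, by a nonlinear mapping $\Phi$. Assume $\Phi(x_i)$, $i = 1, 2, \ldots, N$, has been mean-centered. PCA is performed by finding the eigenvalues $\lambda$ and eigenvectors $v$ satisfying $\lambda v = C v$, where the sample covariance matrix of $\Phi(x_i)$ is

$$C = \frac{1}{N}\sum_{j=1}^{N}\Phi(x_j)\,\Phi(x_j)^{T}. \tag{1}$$

Substituting (1) into the eigenvector equation yields

$$\lambda v = C v = \frac{1}{N}\sum_{j=1}^{N}\langle \Phi(x_j), v\rangle\,\Phi(x_j). \tag{2}$$

The eigenvectors can be expanded as follows:

$$v = \sum_{i=1}^{N}\alpha_i\,\Phi(x_i), \tag{3}$$

where $\alpha_i$ is the expansion coefficient. Substituting (3) into (2) and taking the inner product with $\Phi(x_k)$ yields

$$\lambda\sum_{i=1}^{N}\alpha_i\langle\Phi(x_k),\Phi(x_i)\rangle = \frac{1}{N}\sum_{i=1}^{N}\alpha_i\sum_{j=1}^{N}\langle\Phi(x_k),\Phi(x_j)\rangle\langle\Phi(x_j),\Phi(x_i)\rangle, \quad k = 1, 2, \ldots, N, \tag{4}$$

and $K$ is the symmetric $N \times N$ matrix, where

$$K_{ij} = \langle\Phi(x_i),\Phi(x_j)\rangle. \tag{5}$$

Equation (4) can then be written as $N\lambda\alpha = K\alpha$. $N\lambda$ are the corresponding eigenvalues of $K$, and $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_N]^{T}$ are the eigenvectors of $K$. If $\lambda_p$ is the minimum eigenvalue (a nonzero number), the normalized eigenvectors $v_k$ are obtained by requiring

$$\langle v_k, v_k\rangle = 1, \quad k = 1, 2, \ldots, p. \tag{6}$$

Finally, the principal components $t_k$ of a test sample $x$ can be calculated as follows:

$$t_k = \langle v_k, \Phi(x)\rangle = \sum_{i=1}^{N}\alpha_i^{k}\,\langle\Phi(x_i),\Phi(x)\rangle, \tag{7}$$

where $\langle\Phi(x_i),\Phi(x)\rangle = K(x_i, x)$ is evaluated with the kernel function and $\alpha_i^{k}$ is the $i$th entry of the $k$th eigenvector of $K$.

The above algorithm assumes that $\Phi(x_i)$ is mean-centered. To enforce this in general, the kernel matrix of the mean-centered mapped data is expressed as follows:

$$\tilde{K} = K - \mathbf{1}_N K - K\,\mathbf{1}_N + \mathbf{1}_N K\,\mathbf{1}_N, \tag{8}$$

where $\mathbf{1}_N$ is the $N \times N$ matrix whose entries are all equal to $1/N$.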

The cumulative contribution rate (CCR) is utilized to determine the number of principal components $p$:

$$\mathrm{CCR} = \frac{\sum_{i=1}^{p}\lambda_i}{\sum_{i=1}^{N}\lambda_i}, \tag{9}$$

where the eigenvalues $\lambda_i$ are sorted in descending order.

The CCR threshold follows [21, 22] and could be set at 85%, 90%, or 95%. In general, once the CCR exceeds 85%, the first $p$ principal components contain most of the information in the original feature set, so the threshold in this paper is set at 85%.
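As a minimal sketch of the CCR rule, the snippet below picks the smallest number of components whose leading eigenvalues account for at least the chosen share of the eigenvalue sum. The function name `select_by_ccr` and the eigenvalues are hypothetical and are not taken from the paper.

```python
import numpy as np

def select_by_ccr(eigenvalues, threshold=0.85):
    """Return (p, ccr): the smallest p whose CCR reaches `threshold`, and the CCR curve."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # descending order
    lam = np.clip(lam, 0.0, None)        # drop tiny negatives caused by round-off
    ccr = np.cumsum(lam) / lam.sum()     # CCR after keeping 1, 2, ..., N components
    p = int(np.searchsorted(ccr, threshold)) + 1  # first position with ccr >= threshold
    return min(p, lam.size), ccr

# Hypothetical eigenvalues, for illustration only (not from the paper).
p, ccr = select_by_ccr([5.0, 3.0, 1.0, 0.6, 0.4])
print(p, np.round(ccr, 2))
```

Raising the threshold to 90% or 95% only changes the `threshold` argument; the selection rule itself stays the same.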

There are three common types of kernel functions: the polynomial kernel function, the radial basis Gaussian (RBG) kernel function, and the neural network kernel function. The transformation matrices of the RBG kernel function are positive definite and have a wide convergence field. The RBG kernel contains only one parameter, and the calculation process is relatively simple [23]. The RBG kernel function is utilized here:

$$K(x, y) = \exp\left(-\frac{\lVert x - y\rVert^{2}}{2\sigma^{2}}\right), \tag{10}$$

where $\sigma$ is a parameter related to the kernel width. $\sigma$ can be obtained by optimizing the parameter of the kernel function [24].
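The KPCA steps described in this section (kernel matrix, mean-centering, eigendecomposition, normalization, projection) can be sketched in NumPy as follows. This is an illustrative sketch, not the authors' implementation: the function names, the random data, and the value of `sigma` are hypothetical, and the Gaussian kernel is assumed to take the form $\exp(-\lVert x-y\rVert^2/(2\sigma^2))$.

```python
import numpy as np

def rbg_kernel(A, B, sigma):
    """Gaussian kernel matrix; the 2*sigma**2 denominator is an assumption of this sketch."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def kpca_fit(X, sigma, n_components):
    N = X.shape[0]
    K = rbg_kernel(X, X, sigma)
    one_n = np.full((N, N), 1.0 / N)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n   # mean-centering in feature space
    mu, alpha = np.linalg.eigh(Kc)                       # eigenvalues in ascending order
    order = np.argsort(mu)[::-1][:n_components]          # keep the largest ones
    mu, alpha = mu[order], alpha[:, order]
    alpha = alpha / np.sqrt(mu)   # rescale so that <v_k, v_k> = mu_k * <alpha_k, alpha_k> = 1
    return {"X": X, "K": K, "alpha": alpha, "mu": mu, "sigma": sigma}

def kpca_transform(model, X_new):
    """Project samples onto the retained kernel principal components."""
    X, K = model["X"], model["K"]
    N = X.shape[0]
    Kt = rbg_kernel(X_new, X, model["sigma"])
    one_n = np.full((N, N), 1.0 / N)
    one_t = np.full((X_new.shape[0], N), 1.0 / N)
    Ktc = Kt - one_t @ K - Kt @ one_n + one_t @ K @ one_n  # center against training data
    return Ktc @ model["alpha"]

# Hypothetical feature matrix: 30 samples, 4 features (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
model = kpca_fit(X, sigma=2.0, n_components=3)
T = kpca_transform(model, X)   # kernel principal component scores of the training data
```

Here `mu` holds the eigenvalues of the centered kernel matrix, so the retained components must have strictly positive eigenvalues for the rescaling step to be valid.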

##### 2.2. Weibull Proportional Hazard Model

The PHM builds a mathematical relationship between the feature parameters of the equipment running status and the reliability. From the feature parameters monitored during operation, the PHM yields the hazard rate of the device in its current state, which is used to assess the current reliability of the equipment. The hazard rate at time $t$ is expressed as follows:

$$h(t; Z(t)) = h_0(t)\exp\left(Z(t)\gamma\right), \tag{11}$$

where $h_0(t)$ is the baseline hazard rate dependent on the service time, $Z(t)$ is a time-dependent row vector composed of monitoring values at time $t$, and $\gamma$ is a column vector composed of the regression parameters corresponding to the monitoring variables. In the PHM, $Z(t)$ is regarded as a vector of covariates that increases or decreases the system hazard rate proportionally; its coefficient vector $\gamma$ defines the influence of the monitoring variables on the failure process.

The Weibull distribution is frequently used to model the failure time of mechanical systems. The hazard rate function of the Weibull distribution is commonly selected as the baseline hazard rate of the PHM. The hazard rate for the two-parameter Weibull distribution is written as follows:

$$h_0(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}, \tag{12}$$

where $\beta$ and $\eta$ are the shape and scale parameters of the Weibull distribution, respectively.

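The PHM with the Weibull baseline function is called the WPHM, the hazard function of which is defined as follows:

$$h(t; Z(t)) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}\exp\left(Z(t)\gamma\right). \tag{13}$$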
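A minimal sketch of the WPHM hazard: a two-parameter Weibull baseline multiplied by the exponential of the covariate-coefficient inner product. The function name and all numerical values are hypothetical and serve only to show the proportional effect of the covariates.

```python
import math

def wphm_hazard(t, z, beta, eta, gamma):
    """Weibull baseline hazard scaled by exp(z . gamma)."""
    baseline = (beta / eta) * (t / eta) ** (beta - 1.0)
    return baseline * math.exp(sum(zi * gi for zi, gi in zip(z, gamma)))

# Hypothetical parameters (shape 2.0, scale 1000 h) and covariate values.
h_ref = wphm_hazard(500.0, [0.0, 0.0], 2.0, 1000.0, [0.8, 0.3])
h_cov = wphm_hazard(500.0, [1.0, 0.0], 2.0, 1000.0, [0.8, 0.3])
print(h_ref, h_cov / h_ref)   # the ratio equals exp(0.8) at every t
```

The ratio of the two hazards does not depend on `t`, which is the proportionality property the model is named after.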

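According to the principle of reliability analysis [25], the reliability $R$ and the failure probability density $f$ can be estimated as follows:

$$R(t; Z(t)) = \exp\left[-\int_{0}^{t} h(s; Z(s))\,ds\right], \qquad f(t; Z(t)) = h(t; Z(t))\,R(t; Z(t)). \tag{14}$$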

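The key to using the WPHM to assess the operating status of equipment is to estimate the unknown parameters from the feature data and time data of the real-time status. The maximum likelihood method is commonly applied to estimate the unknown WPHM parameters. In practice, a mechanical system may be run until it fails, or it may be repaired prior to failure. The lifetime data therefore usually contain both failure times and suspension times. To properly account for both types of data, the likelihood function with time-dependent covariates is defined as follows:

$$L(\beta, \eta, \gamma) = \prod_{i=1}^{n} f(t_i; Z(t_i)) \prod_{j=1}^{m} R(t_j; Z(t_j)), \tag{15}$$

where $i$ indexes the failure times, $j$ indexes the suspension times, $n$ is the number of failure samples, and $m$ is the number of suspension samples. By substituting (14) into (15), the likelihood function can be rewritten as follows:

$$L(\beta, \eta, \gamma) = \prod_{i=1}^{n} h(t_i; Z(t_i)) \prod_{k=1}^{n+m} R(t_k; Z(t_k)), \tag{16}$$

where $k$ indexes both the failure times and the suspension times. The log-likelihood function is

$$\ln L = \sum_{i=1}^{n}\left[\ln\frac{\beta}{\eta} + (\beta-1)\ln\frac{t_i}{\eta} + Z(t_i)\gamma\right] - \sum_{k=1}^{n+m}\int_{0}^{t_k}\frac{\beta}{\eta}\left(\frac{s}{\eta}\right)^{\beta-1}\exp\left(Z(s)\gamma\right)ds. \tag{17}$$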

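In the above equations, the covariates of the WPHM are time-dependent. When the covariates only relate to the current time (i.e., they are non-time-dependent), the reliability and the failure probability density can be, respectively, rewritten as follows:

$$R(t; Z(t)) = \exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\exp\left(Z(t)\gamma\right)\right], \qquad f(t; Z(t)) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}\exp\left(Z(t)\gamma\right)\exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\exp\left(Z(t)\gamma\right)\right]. \tag{18}$$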
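When the covariates are held fixed, the integral of the Weibull hazard has the closed form $(t/\eta)^{\beta}\exp(Z\gamma)$, which gives the reliability directly. The sketch below compares that closed form with a midpoint-rule integration of the hazard. The function names and parameter values are hypothetical.

```python
import math

def wphm_reliability(t, z, beta, eta, gamma):
    """Closed-form reliability exp(-(t/eta)^beta * exp(z . gamma)) for fixed covariates."""
    risk = math.exp(sum(zi * gi for zi, gi in zip(z, gamma)))
    return math.exp(-((t / eta) ** beta) * risk)

def wphm_density(t, z, beta, eta, gamma):
    """Failure probability density, computed as hazard times reliability."""
    risk = math.exp(sum(zi * gi for zi, gi in zip(z, gamma)))
    hazard = (beta / eta) * (t / eta) ** (beta - 1.0) * risk
    return hazard * wphm_reliability(t, z, beta, eta, gamma)

def reliability_by_quadrature(t, z, beta, eta, gamma, steps=20000):
    """Numerical check: exp(-integral of the hazard over [0, t]) by the midpoint rule."""
    risk = math.exp(sum(zi * gi for zi, gi in zip(z, gamma)))
    ds = t / steps
    cumulative = sum(
        (beta / eta) * (((k + 0.5) * ds) / eta) ** (beta - 1.0) * risk
        for k in range(steps)
    ) * ds
    return math.exp(-cumulative)

# Hypothetical parameters and covariates, for illustration only.
args = ([0.5, -0.2], 1.5, 1000.0, [0.8, 0.3])
r_closed = wphm_reliability(800.0, *args)
r_numeric = reliability_by_quadrature(800.0, *args)
print(r_closed, r_numeric)
```

With time-dependent covariates no such closed form is available in general, and the integral has to be evaluated numerically along the recorded covariate history.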

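Therefore, by substituting (18) into (17), the log-likelihood function can be rewritten as follows:

$$\ln L = \sum_{i=1}^{n}\left[\ln\frac{\beta}{\eta} + (\beta-1)\ln\frac{t_i}{\eta} + Z(t_i)\gamma\right] - \sum_{k=1}^{n+m}\left(\frac{t_k}{\eta}\right)^{\beta}\exp\left(Z(t_k)\gamma\right). \tag{19}$$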

By setting the partial derivatives of (17) or (19) with respect to the parameters $\beta$, $\eta$, and $\gamma$ equal to zero, $\beta$, $\eta$, and $\gamma$ can be obtained via the Newton iterative method. Unless the initial value is suitable, however, the numerical solution is very difficult to obtain.

As the number of covariates increases, the complexity of the maximum likelihood estimation increases substantially. Therefore, the Nelder-Mead iterative algorithm [26, 27] is applied to estimate these parameters.
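The following sketch shows one way such an estimation could be set up with SciPy's Nelder-Mead implementation, maximizing the log-likelihood for covariates that are fixed per sample. It is not the authors' code. The synthetic data, the starting point, and the log-parameterization of the shape and scale parameters (used to keep them positive during the derivative-free search) are assumptions of this example.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(theta, t, Z, failed):
    """Negative WPHM log-likelihood; theta = [log(beta), log(eta), gamma_1, ..., gamma_q]."""
    beta, eta = np.exp(theta[0]), np.exp(theta[1])
    lin = Z @ theta[2:]                                  # covariate effect per sample
    log_h = np.log(beta / eta) + (beta - 1.0) * np.log(t / eta) + lin
    cum_h = (t / eta) ** beta * np.exp(lin)              # equals -ln R(t_k)
    # Failures contribute ln h + ln R; suspensions contribute ln R only.
    return -(log_h[failed].sum() - cum_h.sum())

def fit_wphm(t, Z, failed):
    # Hypothetical starting point: shape 1.5, scale equal to the mean time, gamma = 0.
    x0 = np.concatenate(([np.log(1.5), np.log(t.mean())], np.zeros(Z.shape[1])))
    res = minimize(neg_log_likelihood, x0, args=(t, Z, failed), method="Nelder-Mead",
                   options={"xatol": 1e-8, "fatol": 1e-8,
                            "maxiter": 20000, "maxfev": 20000})
    return np.exp(res.x[0]), np.exp(res.x[1]), res.x[2:], res, x0

# Synthetic lifetimes drawn from a WPHM with known parameters (illustration only).
rng = np.random.default_rng(1)
n = 300
Z = rng.normal(size=(n, 2))
beta_true, eta_true, gamma_true = 2.0, 1000.0, np.array([0.5, -0.3])
u = rng.uniform(size=n)
t = eta_true * (-np.log(u) / np.exp(Z @ gamma_true)) ** (1.0 / beta_true)
failed = t <= 1500.0            # units still running at 1500 h are suspensions
t = np.minimum(t, 1500.0)

beta_hat, eta_hat, gamma_hat, res, x0 = fit_wphm(t, Z, failed)
print(beta_hat, eta_hat, gamma_hat)
```

Because Nelder-Mead uses only function values, no derivatives of the log-likelihood are needed, but the result can still depend on the starting point, so restarting from a few different initial values is a reasonable precaution.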

#### 3. Proposed Method

A flowchart of the proposed method is shown in Figure 1.