#### Abstract

To address the scarcity of fault data samples and the insufficient accuracy of remaining useful life (RUL) prediction for nuclear power machinery, a method based on an exponential degradation model is proposed to predict the RUL of equipment after the failure warning system raises an alarm. After data preprocessing, time-domain feature extraction, feature selection, and dimensionality-reduction fusion of multiple degradation variables, the exponential degradation model is constructed, and its parameters are estimated through the Bayesian process using prior information. As an application, the RUL of a nuclear power turbine was calculated from actual monitoring data, the accuracy curve was used to evaluate the prediction effect, and the RUL prediction results verified the effectiveness of the proposed method.

#### 1. Introduction

Remaining useful life (RUL) prediction is a key part of fault prognostics and health management systems, which are widely used in industry [1, 2]. Reliable and accurate RUL prediction of nuclear power mechanical equipment is crucial for predictive maintenance and accidental failure control.

Existing RUL prediction methods are generally divided into physical failure model-based methods and data-driven methods [3]. Physical failure model-based methods describe the failure mechanism of components with a mathematical model and use it to predict the RUL of a system [4–6]. Models that describe damage behavior, such as crack propagation or lamination propagation, are usually adopted, and a large amount of measurement data is needed to verify their accuracy [7]. For example, noting that previous methods could track only a single damage mode, Daigle and Goebel [8] used a particle filter to compute a joint state-parameter estimate for RUL prediction, which allowed multiple damage modes to be processed simultaneously. Lei et al. [9] proposed a rolling bearing RUL prediction method based on a nonlinear degradation model that considers four sources of variability in a random degradation process. Yu et al. [10] proposed a sequential prediction method for hybrid systems based on the concept of dynamic fault isolation, in which the estimated degradation model and a user-selected fault threshold were used to calculate the RUL of faulty components under changing operating modes. Although a physical failure model-based method can achieve accurate prediction, it requires an in-depth understanding of the physical failure characteristics. When equipment operates in a complex environment, its failure mechanism is difficult to grasp fully and an accurate physical model for RUL prediction is difficult to obtain [11], so the application of such methods is greatly limited.

Commonly used data-driven methods include neural networks, filtering, threshold regression, and stochastic models [12]. Li et al. [13] established a model for the RUL prediction of industrial equipment based on long short-term memory and multivariate monitoring data and adopted a mini-batch gradient descent algorithm in the model's training process, which could quickly find the global optimal solution. Caesarendra et al. [14] studied the application of relevance vector machines in machine degradation assessment together with a logistic regression method. Kundu et al. [15] proposed a Weibull accelerated failure time regression model that considers the influence of working conditions on condition monitoring data during model development. The model can effectively cover a range of operating conditions and improves the RUL prediction accuracy without the need to train different models for different operating conditions. Loutas et al. [16] proposed a rolling bearing RUL prediction method based on support vector regression that uses information about the past life history for offline training and calibration; a new feature (Wiener entropy or spectral flatness) was also proposed for condition-based maintenance. Gebraeel et al. [17] used a Bayesian method to update the parameters of an exponential degradation model online and calculated the life distribution of equipment in combination with the equipment failure threshold.

The exponential degradation model approach is an RUL prediction method based on a statistical model. Gebraeel [18] established an exponential model in which the posterior distribution of the model parameters and the RUL distribution of the equipment are continuously updated by the Bayesian method as new sensor monitoring data arrive. The model was verified on a rolling bearing database, but no detailed technique was given for selecting the prior distribution of the parameters. On this basis, You et al. [19] combined an equipment historical fault database with real-time monitoring data and considered two equipment failure modes and their corresponding parameter updating methods to improve prediction accuracy. Si et al. [20] relaxed the requirement of the Gebraeel model that the prior hyperparameters be determined from multiple historical degradation datasets of the same type of equipment, so that the prior parameters of the model could be estimated by relying only on the monitoring data of the equipment itself. Li et al. [21] proposed an adaptive method to determine the fault threshold and used a particle filter to reduce the random error of the stochastic process.

A severe accident at a nuclear power plant is likely to have disastrous consequences. As the service time of nuclear power equipment increases, its RUL decreases and its probability of failure increases [22]. Therefore, it is necessary to predict the RUL of nuclear power equipment. Barbieri et al. [23] predicted the performance degradation trend of three-phase motors using the general trajectory model. Upadhyaya and Hines [24] and Ardsomang et al. [25] predicted degradation trends using a linear general trajectory model and a Bayesian method, respectively, based on data generated from a heat exchanger degradation experiment. Kim et al. [26] used linear regression to predict the cumulative thermal efficiency loss and the changing trend of key parameters of a thermal efficiency monitoring system in a nuclear power plant. Chookah et al. [27] fitted a degradation trajectory generated by Monte Carlo simulation with a mechanistic superposition model and verified it in a corrosion fatigue experiment on carbon steel specimens. Yuan and Pandey [28] made a nonlinear improvement to the autoassociative kernel regression method and verified it in an experiment on carbon steel pipes in a nuclear power plant. Coble and Hines [29] used autoassociative kernel regression to process data and made a posterior estimation of model parameters through the Bayesian process. However, most existing methods are stochastic processes or linear models with a single degradation quantity, which makes it difficult to ensure prediction accuracy when data quality is poor or when life is predicted in the middle and late periods. Recently, the continuous improvement of the monitoring capability of nuclear power plants has provided the prerequisites for establishing an exponential degradation model. With such a model, the model parameters can be updated as data arrive and prediction results can be obtained with high precision.

In this study, the mean, root mean square (RMS), margin factor, and other variables of the vibration signal of nuclear power steam turbine units are analyzed for feature selection and fused by principal component analysis (PCA). Thereafter, an exponential degradation model is established, and the model parameters are iteratively optimized by Bayesian updating and expectation-maximization, so as to realize RUL prediction of nuclear power equipment after a fault alarm.

#### 2. RUL Prediction Method Based on an Exponential Degradation Model

An RUL prediction method based on an exponential degradation model requires less data, so RUL prediction can be performed based on the current historical data of equipment operation rather than data from equipment operation to failure occurrence. It is suitable for nuclear power equipment with few failure samples. In the proposed method, the mapping relationship between RUL and degradation trend is obtained, a comprehensive health parameter is constructed for multiple variables, and a failure threshold is set. Finally, the RUL is predicted by the regression method.

The prediction process is shown in Figure 1. After denoising data, time and frequency-domain parameters related to the data are calculated, and the parameters with high monotonicity values are extracted for PCA. After dimensionality reduction, the principal components with the largest degradation trend are selected as health indices, and the exponential degradation model is built to achieve RUL prediction.

##### 2.1. Data Preprocessing

The moving average method was used to denoise the data. In essence, the moving average is a low-pass filter. The *N* most recent samples are regarded as a queue of fixed length. Each time a new sample is collected, the oldest sample at the head of the queue is removed, the new sample is appended, and the arithmetic mean of the updated queue is calculated, so that a new average is obtained for each sample.
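The queue-based description above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `moving_average` and the choice to average over a partially filled queue during the first `window - 1` samples are assumptions.

```python
from collections import deque


def moving_average(samples, window):
    """Causal moving average over a fixed-length FIFO queue (hypothetical helper)."""
    if window < 1:
        raise ValueError("window must be >= 1")
    queue = deque(maxlen=window)  # the oldest sample is dropped automatically when full
    running_sum = 0.0
    smoothed = []
    for x in samples:
        if len(queue) == window:
            running_sum -= queue[0]  # sample that is about to leave the queue
        queue.append(x)
        running_sum += x
        # Assumption: while the queue is still filling, average what is available.
        smoothed.append(running_sum / len(queue))
    return smoothed
```

A larger `window` gives a smoother curve at the cost of a longer lag, which is the usual trade-off of this kind of low-pass filtering.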

##### 2.2. Feature Extraction and Fusion

The monitoring signals of nuclear power turbine units are usually discrete data, so the data were analyzed in the time domain.

Time-domain features include the mean value, standard deviation, RMS, kurtosis, skewness, peak-to-peak value, crest factor, shape factor, impulse factor, margin factor, and energy.
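As a rough illustration, the eleven features can be computed from one block of samples as follows. The paper does not state its exact definitions, so the formulas below are commonly used conventions and should be read as assumptions (for example, population rather than sample moments, and the margin factor taken as peak divided by the squared mean of the square-rooted absolute values). The function name is hypothetical.

```python
import math


def time_domain_features(x):
    """Common time-domain features of one signal block (assumed definitions).

    Note: a constant or all-zero block makes some ratios undefined (division by zero).
    """
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)  # population standard deviation
    rms = math.sqrt(sum(v * v for v in x) / n)
    abs_mean = sum(abs(v) for v in x) / n
    sqrt_abs_mean = sum(math.sqrt(abs(v)) for v in x) / n
    peak = max(abs(v) for v in x)
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "skewness": sum((v - mean) ** 3 for v in x) / n / std ** 3,
        "kurtosis": sum((v - mean) ** 4 for v in x) / n / std ** 4,
        "peak_to_peak": max(x) - min(x),
        "crest_factor": peak / rms,
        "shape_factor": rms / abs_mean,
        "impulse_factor": peak / abs_mean,
        "margin_factor": peak / sqrt_abs_mean ** 2,
        "energy": sum(v * v for v in x),
    }
```

In the case study, a function of this kind would be applied to each day of monitoring data to obtain one feature vector per day.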

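Monotonicity screening was performed on the obtained time-domain features, and the features with stronger monotonic trends were selected. The monotonicity value is calculated quantitatively; the closer the value is to 1, the more obvious the monotonic trend of the time-domain feature. The calculation method is given by the following formula:

$$\mathrm{Monotonicity}(x) = \frac{1}{N-1}\left|\,\#\{n : x_{n+1}-x_{n} > 0\} - \#\{n : x_{n+1}-x_{n} < 0\}\,\right|, \tag{1}$$

where *N* is the number of sampling points, $x_n$ is the feature value at the $n$-th sampling point, and $\#\{\cdot\}$ denotes the number of elements in a set.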
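A minimal sketch of the monotonicity screening is given below. It assumes the widely used sign-of-differences definition, in which the number of negative first differences is subtracted from the number of positive ones and the absolute value is divided by *N* − 1; the function name and this exact form are assumptions, since the paper's own formula may differ in detail.

```python
def monotonicity(series):
    """Monotonicity in [0, 1] based on the signs of first differences (assumed definition)."""
    n = len(series)
    if n < 2:
        raise ValueError("at least two sampling points are required")
    diffs = [b - a for a, b in zip(series, series[1:])]
    n_pos = sum(d > 0 for d in diffs)  # number of increasing steps
    n_neg = sum(d < 0 for d in diffs)  # number of decreasing steps
    return abs(n_pos - n_neg) / (n - 1)
```

Because only the signs of the differences count, a small amount of noise can flip many of them, which is why smoothing the features before screening matters.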

PCA was used to reduce and fuse standardized feature data, and the principal components with obvious degradation trends were selected as health indices to construct the exponential degradation model.

##### 2.3. Exponential Degradation Model

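The basic principle of an exponential degradation model is to use the Bayesian process to combine prior information about the model parameters with measurement information and thereby obtain a posterior estimate of the exponential model parameters.

Assume that a series of discrete equipment feature monitoring points has been accumulated at times $t_1, t_2, \ldots, t_k$, where $t_k$ represents the latest moment. The degradation feature at moment $t_n$ ($n = 1, \ldots, k$) can be expressed by an exponential model as follows:

$$S(t_n) = \phi + \theta \exp\left(\beta t_n + \varepsilon(t_n) - \frac{\sigma^2}{2}\right), \tag{2}$$

where $\phi$ is a known constant; $\sigma$ is also a constant, used to represent the uncertainty of the degradation process; $\varepsilon(t_n)$ is a random error term subject to the normal distribution, i.e., $\varepsilon(t_n) \sim N(0, \sigma^2)$, and $\varepsilon(t_1), \ldots, \varepsilon(t_k)$ are independent of each other; and $\ln\theta$ and $\beta$ are also independent random variables subject to the normal distribution, i.e., $\ln\theta \sim N(\mu_0, \sigma_0^2)$ and $\beta \sim N(\mu_1, \sigma_1^2)$, which describe the differences between individuals, where $\mu_0, \sigma_0^2$ and $\mu_1, \sigma_1^2$ represent the mean and variance of the normal distributions of $\ln\theta$ and $\beta$. Taking the logarithm of equation (2) gives

$$L(t_n) = \ln\left(S(t_n) - \phi\right) = \theta' + \beta t_n + \varepsilon(t_n), \tag{3}$$

where $\theta' = \ln\theta - \sigma^2/2$. In addition, because $\ln\theta$ is normally distributed, $\theta'$ is also normally distributed, so

$$\theta' \sim N\left(\mu_0', \sigma_0^2\right), \qquad \mu_0' = \mu_0 - \frac{\sigma^2}{2}. \tag{4}$$

The posterior estimates of $\theta'$ and $\beta$ are evaluated using the Bayesian process once the logarithmic degradation features accumulated up to a certain time are known. The logarithmic degradation features accumulated at the current time are written as $L_{1:k} = (L_1, \ldots, L_k)$, where $L_n = L(t_n)$; given $\theta'$ and $\beta$, the joint distribution of $L_{1:k}$ is a multivariate normal distribution, so

$$f\left(L_{1:k} \mid \theta', \beta\right) = \left(\frac{1}{2\pi\sigma^2}\right)^{k/2} \exp\left(-\sum_{n=1}^{k}\frac{\left(L_n - \theta' - \beta t_n\right)^2}{2\sigma^2}\right). \tag{5}$$

Because the prior distributions of $\theta'$ and $\beta$ are normal, the posterior distribution of $\theta'$ and $\beta$ given $L_{1:k}$ is a bivariate Gaussian distribution, i.e., $(\theta', \beta \mid L_{1:k}) \sim N(\mu_{\theta'}, \sigma_{\theta'}^2, \mu_{\beta}, \sigma_{\beta}^2, \rho)$, more accurately expressed as follows:

$$p\left(\theta', \beta \mid L_{1:k}\right) \propto f\left(L_{1:k} \mid \theta', \beta\right)\, \pi(\theta')\, \pi(\beta), \tag{6}$$

$$\mu_{\theta'} = \frac{\left(\sigma_0^2 \sum_{n} L_n + \mu_0' \sigma^2\right)\left(\sigma_1^2 \sum_{n} t_n^2 + \sigma^2\right) - \sigma_0^2 \sum_{n} t_n \left(\sigma_1^2 \sum_{n} L_n t_n + \mu_1 \sigma^2\right)}{D}, \qquad \sigma_{\theta'}^2 = \frac{\sigma^2 \sigma_0^2 \left(\sigma_1^2 \sum_{n} t_n^2 + \sigma^2\right)}{D}, \tag{7}$$

$$\mu_{\beta} = \frac{\left(k \sigma_0^2 + \sigma^2\right)\left(\sigma_1^2 \sum_{n} L_n t_n + \mu_1 \sigma^2\right) - \sigma_1^2 \sum_{n} t_n \left(\sigma_0^2 \sum_{n} L_n + \mu_0' \sigma^2\right)}{D}, \qquad \sigma_{\beta}^2 = \frac{\sigma^2 \sigma_1^2 \left(k \sigma_0^2 + \sigma^2\right)}{D}, \tag{8}$$

where $\pi(\cdot)$ denotes a prior density, all sums run over $n = 1, \ldots, k$, $D = \left(k \sigma_0^2 + \sigma^2\right)\left(\sigma_1^2 \sum_{n} t_n^2 + \sigma^2\right) - \sigma_0^2 \sigma_1^2 \left(\sum_{n} t_n\right)^2$, and the posterior correlation coefficient is $\rho = -\sigma_0 \sigma_1 \sum_{n} t_n \big/ \sqrt{\left(k \sigma_0^2 + \sigma^2\right)\left(\sigma_1^2 \sum_{n} t_n^2 + \sigma^2\right)}$.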
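The Bayesian update can be illustrated with a short Python sketch. It assumes the log-linear form in which the logarithmic degradation feature equals an intercept plus a slope times time plus independent normal noise, with independent normal priors on the intercept and slope; the closed-form posterior below is the standard conjugate result for that setting. The function and argument names are hypothetical, and the noise variance and prior hyperparameters are treated as known inputs.

```python
import math


def posterior_update(times, log_values, mu0, var0, mu1, var1, noise_var):
    """Posterior of (intercept, slope) for L = intercept + slope * t + noise.

    mu0, var0: prior mean and variance of the intercept (assumed normal).
    mu1, var1: prior mean and variance of the slope (assumed normal).
    noise_var: variance of the measurement noise (assumed known).
    Returns (mean_intercept, mean_slope, var_intercept, var_slope, correlation).
    """
    k = len(times)
    s_t = sum(times)
    s_tt = sum(t * t for t in times)
    s_l = sum(log_values)
    s_lt = sum(l * t for l, t in zip(log_values, times))

    a = k * var0 + noise_var
    b = s_tt * var1 + noise_var
    denom = a * b - s_t ** 2 * var0 * var1

    mean_intercept = ((s_l * var0 + mu0 * noise_var) * b
                      - s_t * var0 * (s_lt * var1 + mu1 * noise_var)) / denom
    mean_slope = (a * (s_lt * var1 + mu1 * noise_var)
                  - s_t * var1 * (s_l * var0 + mu0 * noise_var)) / denom
    var_intercept = noise_var * var0 * b / denom
    var_slope = noise_var * var1 * a / denom
    correlation = -math.sqrt(var0 * var1) * s_t / math.sqrt(a * b)
    return mean_intercept, mean_slope, var_intercept, var_slope, correlation
```

With no data the function returns the prior unchanged, and with very diffuse priors the posterior means approach the ordinary least-squares fit, which is a convenient sanity check.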

##### 2.4. RUL Prediction

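To calculate the RUL, the threshold $w$ for entering the failure period should be determined first; its logarithmic counterpart is $\ln(w - \phi)$. Let $L(t_k + l)$ denote the logarithmic degradation feature value $l$ time units after the current moment $t_k$. When the logarithmic degradation feature values $L_{1:k}$ are known, it can be proven that $L(t_k + l)$ is also normally distributed and expressed as

$$L(t_k + l) \mid L_{1:k} \sim N\left(\tilde{\mu}(t_k + l),\ \tilde{\sigma}^2(t_k + l)\right), \tag{9}$$

where

$$\tilde{\mu}(t_k + l) = \mu_{\theta'} + \mu_{\beta}\,(t_k + l), \qquad \tilde{\sigma}^2(t_k + l) = \sigma_{\theta'}^2 + \sigma_{\beta}^2 (t_k + l)^2 + 2\rho\,\sigma_{\theta'}\sigma_{\beta}\,(t_k + l) + \sigma^2,$$

and $\mu_{\theta'}$, $\mu_{\beta}$, $\sigma_{\theta'}^2$, $\sigma_{\beta}^2$, and $\rho$ are the posterior means, variances, and correlation coefficient of $\theta'$ and $\beta$.

It is assumed that, counting from the current time $t_k$, the feature $L(t_k + T)$ just reaches the threshold $\ln(w - \phi)$ at time $t_k + T$, so that $T$ is the RUL at time $t_k$. Given $L_{1:k}$, the conditional cumulative distribution function of the RUL is

$$F_{T \mid L_{1:k}}(t) = P\left(T \le t \mid L_{1:k}\right) = P\left(L(t_k + t) \ge \ln(w - \phi) \mid L_{1:k}\right) = \Phi\left(g(t)\right), \tag{10}$$

where $g(t) = \left(\tilde{\mu}(t_k + t) - \ln(w - \phi)\right) / \tilde{\sigma}(t_k + t)$ and $\Phi(\cdot)$ is the standard normal cumulative distribution function.

Because the RUL satisfies $T \ge 0$, the cumulative distribution function is truncated as follows:

$$F_{T \mid L_{1:k},\, T \ge 0}(t) = \frac{\Phi\left(g(t)\right) - \Phi\left(g(0)\right)}{1 - \Phi\left(g(0)\right)}, \qquad t \ge 0. \tag{11}$$

The conditional probability density function of the RUL is expressed as follows:

$$f_{T \mid L_{1:k},\, T \ge 0}(t) = \frac{\varphi\left(g(t)\right)\, g'(t)}{1 - \Phi\left(g(0)\right)}, \qquad t \ge 0, \tag{12}$$

where $\varphi(\cdot)$ is the standard normal probability density function.

Theoretically, the distribution of the RUL at time $t_k$ can be calculated according to equations (11) and (12), but this is relatively difficult. Another technique is to use the condition $L(t_k + l) = \ln(w - \phi)$ for the RUL at moment $t_k$, which is interpreted as the degradation feature value being equal to the failure threshold after $l$ units of time. Because $L(t_k + l)$ is a random variable subject to a normal distribution, its expected value can be used in place of the random variable, and $l$ can then be obtained. The formula for calculating the RUL is as follows, where $l$ is an approximate estimate under the maximum probability:

$$l = \frac{\ln(w - \phi) - \mu_{\theta'}}{\mu_{\beta}} - t_k, \tag{13}$$

where $\mu_{\theta'}$ and $\mu_{\beta}$ can be calculated from equations (7) and (8), respectively, but there are five nonrandom unknown parameters in equations (7) and (8). Si et al. [20] proposed the expectation-maximization method for the iterative optimization of these parameters, $\Theta = \left[\mu_0', \sigma_0^2, \mu_1, \sigma_1^2, \sigma^2\right]$. The iteration is terminated when the norm of the difference between the parameter vectors of two successive iterations is less than a certain threshold. The last estimated value of $\Theta$ is then used to calculate the RUL at this time.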
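The point estimate described above amounts to solving a straight line in the log domain for the time at which it reaches the log threshold. The following sketch assumes that the posterior mean intercept and slope are already available; the function name, the optional offset argument `phi`, and the handling of a non-positive slope or an already exceeded threshold are assumptions added for illustration.

```python
import math


def rul_point_estimate(mean_intercept, mean_slope, threshold, current_time, phi=0.0):
    """RUL from the expected log-degradation path reaching the log threshold.

    mean_intercept, mean_slope: posterior means of the log-linear model parameters.
    threshold: failure threshold on the original (non-log) health index.
    phi: known offset of the exponential model (assumed 0 by default).
    """
    if mean_slope <= 0:
        # Assumption: a non-increasing expected path never reaches the threshold.
        return math.inf
    failure_time = (math.log(threshold - phi) - mean_intercept) / mean_slope
    # Assumption: if the threshold is already exceeded, report zero remaining life.
    return max(failure_time - current_time, 0.0)
```

For example, with a posterior mean intercept of 0, a slope of 0.1 per day, and a threshold of e^5, the expected path reaches the threshold at day 50, so the estimate at day 40 is about 10 days.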

#### 3. Case Analysis

The proposed method was validated with actual monitoring data from a steam turbine unit in a nuclear power plant. The data reflect the horizontal relative vibration of the generator in the plant. When the generator works normally, its vibration value remains within a normal range with small fluctuations; when the generator fails, the vibration rises rapidly to the failure level. This case uses the monitoring data of the generator vibration value from February 6 to April 3, 2017, which consist of normal-operation data and rapidly rising data. On March 28, the early warning system of the nuclear power plant issued an alarm and the unit was shut down. An inspection of the control cabinet showed that the fuse of the overspeed protection module had burnt out and that the speed threshold box was faulty. The unit had been seriously overspeeding, which resulted in an emergency shutdown.

The rotating speed of the generator is an important operating parameter and must be kept within a certain range. A speed that is too low prevents the output voltage of the unit from meeting the requirements, whereas a speed that is too high raises the output voltage, which may lead to equipment discharge or even breakdown and thus to safety accidents. The rotating speed of the generator is closely related to its load. When the load increases, the consumed mechanical energy increases and the speed decreases; conversely, when the load decreases, the rotating speed increases. During operation of the generator in the nuclear power plant, the load changes are complex, so a speed controller is installed to adjust the input energy in time according to the required load value and stabilize the rotating speed of the unit. In addition, maximum and minimum speed thresholds are set. When the speed is outside the threshold range, the unit is shut down for inspection.

In this case, the speed controller failed, and a load decrease led to a continuous increase in the generator speed, so the vibration value continued to rise until it exceeded the threshold, resulting in an alarm and shutdown. To avoid prediction lag and improve the reliability and timeliness of the prediction, the monitoring data up to March 27 (50 days in total) were selected.

##### 3.1. Data Preprocessing

The moving average method was adopted to denoise the data for smoothness (Figure 2).

##### 3.2. Feature Extraction and Fusion

Time-domain feature extraction was performed on daily monitoring data to obtain daily time-domain feature parameters. Three parameters were selected (Figure 3).

From Figure 3, some time-domain features show obvious monotonic trends, whereas others do not. For RUL prediction, the features with obvious monotonic trends reflect the degradation process of the equipment, so these features should be selected.

The monotonicity of the 11 time-domain features obtained from the 50 days of data was evaluated, and the features with high monotonicity were selected to construct the health index. Notably, the monotonicity calculation is very sensitive to noise, so the features must first be smoothed by the moving average before the monotonicity value is calculated. Figure 4 shows the margin factor feature before and after smoothing.

After smoothing, the monotonicity value is calculated (Figure 5).

For the monotonicity value, the larger the monotonicity value is, the more likely the feature is to reflect the equipment degradation process. Therefore, four features with a monotonicity value greater than 0.3, namely, mean, RMS, margin factor, and energy, were selected for feature fusion.

It would be difficult to use all four selected features directly for RUL prediction, so PCA was used to fuse them into a single health index. Before PCA, the data were normalized to eliminate the influence of differing units and scales. Notably, data for the entire life cycle are unavailable while the equipment is in operation; only the data up to the current time can be obtained.

The collected data were assumed to end some time before the end of the equipment life, so the 50 days of data were divided into training and test sets. The first 40% of the period, i.e., 20 days of data, was used for training, and the data of the remaining 30 days were used to verify the model. The mean and variance of the first 20 days of data were calculated, and the entire dataset was then normalized with this mean and variance using the z-score method. PCA dimensionality reduction was performed on the normalized data. The PCA coefficient matrix and the standard deviations were computed from the training data and then applied to the entire dataset. This is equivalent to updating the health index in actual prediction from the existing monitoring data as new data arrive. The first and second principal components obtained after PCA dimensionality reduction are shown in Figure 6.
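The key point of this step is that the normalization statistics and the PCA coefficients come from the training period only and are then reused for all later data. The sketch below illustrates this with plain Python. As a simplification, it finds only the first principal axis by power iteration instead of a full eigendecomposition; this substitution, the function names, and the use of the sample (n − 1) standard deviation are assumptions for illustration.

```python
import math


def fit_zscore(train_rows):
    """Column means and sample standard deviations from the training rows only."""
    n, d = len(train_rows), len(train_rows[0])
    means = [sum(row[j] for row in train_rows) / n for j in range(d)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in train_rows) / (n - 1))
            for j in range(d)]
    return means, stds


def apply_zscore(rows, means, stds):
    """Normalize any rows with statistics that were fitted on the training rows."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def first_principal_axis(z_train, iterations=200):
    """Leading eigenvector of the training covariance matrix via power iteration.

    Caveat: the uniform start vector must not be orthogonal to the leading axis;
    this holds when the selected features rise together, as assumed here.
    """
    n, d = len(z_train), len(z_train[0])
    cov = [[sum(row[i] * row[j] for row in z_train) / (n - 1) for j in range(d)]
           for i in range(d)]
    axis = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * axis[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        axis = [c / norm for c in w]
    return axis


def health_index(rows, means, stds, axis):
    """Project normalized rows onto the first principal axis."""
    return [sum(z * a for z, a in zip(z_row, axis))
            for z_row in apply_zscore(rows, means, stds)]
```

In use, `fit_zscore` and `first_principal_axis` would be called once on the first 20 days of features, and `health_index` would then be evaluated on all 50 days, so that no information from later data leaks into the transformation.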

From the figure, the first principal component has a more obvious degradation trend than the second principal component, so the first principal component can be used as a health index reflecting the degradation state of the equipment. To make the degradation process more obvious, the first principal component was also smoothed by the moving average to obtain a smooth health index. In addition, the seventh health index value, which is the minimum, was subtracted from all health index values so that the minimum health index value is 0 for the subsequent construction of the exponential model. The absolute value of all health index values was then taken so that the health index curve lies in the first quadrant, which meets the requirements of the exponential model curve. The resulting health index curve is shown in Figure 7.

##### 3.3. Exponential Degradation Model Construction

Notably, the prediction error of the exponential model accumulates as the prediction horizon increases. When the exponential model is constructed for the entire health index curve, the error is large, because (1) the prediction horizon is long and (2) the growth trend of the health index in late life is slower than that in early life, so the model parameters obtained in late life are unsuitable for early prediction and the prediction accuracy is low. Therefore, to improve the prediction accuracy, the prediction horizon was shortened and the period with an obvious growth trend of the health index was selected for prediction, which can fully reflect the effectiveness of the model. The construction of the exponential model starts with setting a threshold, so that the time at which the current index level reaches the threshold can be calculated. Observation of the health index curve shows that the health index value on the last day is closest to the value at shutdown, so the value on the last day was set as the threshold. The monotonicity of the health index during this period is good, which accords with the exponential model.

##### 3.4. RUL Prediction and Evaluation of Results

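The accuracy curve was used to evaluate the prediction effect. The accuracy is defined as a binary metric that evaluates whether the prediction at a particular time $t_\lambda$ is within the specified range $\alpha$. In this study, $t_\lambda$ is a time between $t_P$ and the end-of-life time $t_{\mathrm{EoL}}$, and $t_P$ is the time when the predicted result falls into the evaluation index for the first time. $\alpha$ is a percentage of the true RUL at time $t_\lambda$:

$$\alpha\text{-}\lambda\ \text{accuracy} = \begin{cases} 1, & \pi\left[r(t_\lambda)\right]_{\alpha^-}^{\alpha^+} \ge \beta, \\ 0, & \text{otherwise}, \end{cases} \tag{14}$$

where $\lambda$ is the time window modifier, such that $t_\lambda = t_P + \lambda\,(t_{\mathrm{EoL}} - t_P)$; $\beta$ is the minimum acceptable probability mass; and $r(t_\lambda)$ is the predicted RUL at time $t_\lambda$. The quantity $\pi\left[r(t_\lambda)\right]_{\alpha^-}^{\alpha^+}$ is the probability mass of the predicted RUL distribution within the $\alpha$ bounds, which are given by $\alpha^+ = r^*(t_\lambda)\,(1 + \alpha)$ and $\alpha^- = r^*(t_\lambda)\,(1 - \alpha)$, where $r^*(t_\lambda)$ is the true RUL at time $t_\lambda$.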
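As an illustration of such a binary accuracy metric, the sketch below checks whether enough probability mass of the predicted RUL falls within a relative band around the true RUL. It assumes that the predicted RUL distribution is approximated by a normal distribution with a given mean and standard deviation, and the default values (a band of 20% and a minimum mass of 0.5) as well as the function names are assumptions, not values taken from the paper apart from the 20% band used in Figure 8.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def alpha_lambda_accuracy(pred_mean, pred_std, true_rul, alpha=0.2, beta=0.5):
    """Return 1 if the predicted RUL mass inside the alpha bounds is at least beta."""
    lower = (1.0 - alpha) * true_rul
    upper = (1.0 + alpha) * true_rul
    # Probability mass of the (assumed normal) prediction between the bounds.
    mass = normal_cdf(upper, pred_mean, pred_std) - normal_cdf(lower, pred_mean, pred_std)
    return 1 if mass >= beta else 0
```

Because the band is proportional to the true RUL, it narrows as the end of life approaches, so the metric becomes stricter for late predictions.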

Based on the evaluation of the RUL prediction results, the curves of the RUL prediction results for the 40 days from day 10 to day 50 were obtained together with the confidence interval; the value of α in Figure 8 is 20%.

From the figure, the first 20 days were the stage in which the Bayesian parameters were learned and updated; as time increases, the parameters are gradually updated to a level consistent with the data. The prediction accuracy during the first ten days was too low for the prediction to be credible, so those results are not given. In the early stage, the RUL prediction curve could not reflect the correct RUL. As the Bayesian updating of the exponential model proceeds and the parameters converge to an appropriate distribution, the prediction becomes more accurate. On the 37th day, the prediction curve of the exponential model falls into the preset α bounds, and the model's prediction becomes credible.

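To verify the accuracy of the exponential degradation model for this dataset, the same steps for RUL prediction were performed based on a linear degradation model, in which the degradation feature can be expressed as follows:

$$S(t_n) = \phi + \theta\, t_n + \varepsilon(t_n), \tag{15}$$

where $\phi$ is a constant, $\theta$ is a random slope, and $\varepsilon(t_n)$ is a normally distributed random error term.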

The curve and prediction results for the linear model are shown in Figure 8. The figure illustrates that the actual and predicted RUL curves have a similar downward trend only after the 29th day, but the deviation is significant, and the prediction accuracy is obviously lower than that of the exponential model.

#### 4. Conclusions

In this study, an exponential degradation model based on Bayesian updating and expectation-maximization is proposed to predict the RUL of equipment promptly after a failure warning. After analyzing the existing and simulated data, the following conclusions are drawn:

(1) The proposed exponential model-based method can be effectively applied to the prediction of the RUL of nuclear power machinery under the condition of multiple variables. Only the historical data of the equipment up to the present moment, rather than run-to-failure data, are required, which makes the model widely applicable.

(2) The model parameters are estimated and updated through Bayesian updating of the exponential degradation model combined with expectation-maximization, which improves the accuracy of the prediction results.

(3) After feature extraction, feature selection, and PCA dimensionality reduction, the exponential model was used to predict the 10–50 d life. The prediction effect was evaluated by the accuracy curve, and the model was verified.

#### Data Availability

The datasets used or analyzed during the current study are available from the corresponding author on reasonable request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This research was funded by the Science and Technology Planning Project of Guangdong Province (grant no. 2021A0505030005), the National Natural Science Foundation of China (grant nos. 51875209 and 11975181), the Natural Science Foundation of Guangdong Province (grant no. 2022A1515011004), the Open Funds of State Key Laboratory of Nuclear Power Safety Monitoring Technology and Equipment (grant no. K-A2020.408), and the Guangdong Basic and Applied Basic Research Foundation (grant no. 2019B1515120060).