#### Abstract

This paper proposes an optimized principal component analysis (PCA) framework for condition monitoring of sensors in a nuclear power plant (NPP). Compared with the common PCA method used in previous research, the PCA method in this paper is optimized at three modeling stages: data preprocessing, modeling parameter selection, and fault detection and isolation. These optimizations improve the model's performance. Sensor measurements from a real NPP are used to train the optimized PCA model in order to guarantee the credibility and reliability of the simulation results. Artificial faults are then imposed sequentially on the sensor measurements to evaluate the fault detection and isolation ability of the proposed PCA model. Simulation results show that the optimized PCA model can detect and isolate faulty sensors whether the failures are major or small. The quantitative evaluation results also indicate that the optimized PCA method performs better than the common PCA method.

#### 1. Introduction

Safety is of prime importance in NPPs, which are safety-critical systems. Meanwhile, there is an increasing demand for NPPs to operate more cost-effectively [1]. Thus, advanced technologies for performance diagnosis and control are incorporated into engineering designs, aiming to guarantee the safety and improve the economy of the whole NPP simultaneously. With the wide application of digital instrumentation and control (I&C) systems in NPPs, more sensors are used to obtain the operating information of the plant. On the one hand, more sensors support advanced diagnosis and control technologies, which need large numbers of sensors to deliver data about the key indicators of system status and performance; on the other hand, more sensors also increase the probability of sensor faults in NPPs [2]. If an abrupt or incipient failure occurs on a sensor, a characteristic property of the sensor deviates beyond its permitted range. As a result, inaccurate measurements are delivered to related systems, which may cause plant operation to deviate from the optimal condition and may result in process shutdown or even severe accidents in NPPs [3]. Thus, it is necessary to implement condition monitoring for sensors in NPPs.

Confirmed sensor measurements, in addition to conveying the operating information effectively to where it is required to ensure the safety and economy of the NPP, are also beneficial to the condition-based maintenance (CBM) strategy in NPPs. At present, a preventive maintenance strategy is mainly adopted in sensor calibrations during the regular refueling of a NPP. This not only presents a significant cost in time but also leads to component degradation due to repetitive manipulations compared with the CBM strategy [4, 5].

A traditional approach to sensor condition monitoring is based on hardware redundancy [6]. The major problem with hardware redundancy is the cost (including the sensor cost and maintenance cost). In this context, approaches based on analytical redundancy have been proposed in the literature, including artificial neural networks (ANN) [7–9], independent component analysis (ICA) [10, 11], support vector machine (SVM) [12, 13], fuzzy logic [14–16], partial least-squares regression (PLSR) [17], and PCA [18–24]. A study conducted by Hines and Seibert concluded that the simplicity of analytical redundancy techniques and the tractability of their uncertainty calculations could favor them for acceptance by regulatory bodies [25]. Hence, PCA is adopted for sensor condition monitoring in this paper due to its simplicity and individual strong points.

In the literature, PCA has been used for sensor condition monitoring in many cases. Rosani and Hines applied PCA to monitor 5 temperature sensors in a research reactor [20]. Water-cooled chiller sensors were analyzed with the PCA technique by Hu [22]. Jamil et al. implemented fault diagnosis on the Pakistan Research Reactor-2 with PCA and Fisher discriminant analysis (FDA) [18]. Magan-Carrion et al. introduced a PCA-based method to carry out fault detection in wireless sensor networks (WSNs) [26]. Liu et al. [27] and Delimargas et al. [28] each used the PCA method to address calibration sensitivity.

However, the previous research is mainly focused on the design of the PCA model and the implementation of the PCA method in various industries. Several problems remain in the common PCA method. Firstly, there is usually an implicit assumption that all the data are prepared in advance; in practice, data from a real NPP are usually contaminated by random noise or unknown factors. Secondly, since thousands of sensors are applied in a NPP, it is impossible to put all the sensors into a single PCA model, and how to separate sensors into various PCA models is not considered in previous research. Finally, false alarms are inevitable in practice due to external and internal influences, yet how to reduce the false alarms to guarantee the reliability of the PCA model has received little attention.

The contribution of this paper is as follows: various optimization techniques are proposed to deal with the foregoing problems in the common PCA method. The optimizations cover different modeling procedures of the common PCA method, including data preprocessing, modeling parameter selection, and fault detection and isolation.

The paper is organized as follows. Section 1 describes the necessity of sensor condition monitoring. Based on the previous research, an optimized PCA framework is proposed. Section 2 outlines the common PCA method. Section 3 details the PCA optimization framework. The effectiveness of the optimized PCA method is tested and evaluated with sensor measurements from a real NPP in Section 4. Conclusions and future work are given in the last section.

#### 2. PCA Methodology

The basic concepts and formulas involved in the PCA method will be briefly explained in this section. For detailed mathematical derivation processes, refer to Li, He, or Jose [29–31].

##### 2.1. Basic Theories of PCA

PCA transforms a set of correlated variables into a set of new uncorrelated variables and meanwhile retains most information of the original data. Then, the principal components (PCs) are derived from the uncorrelated variables to detect and isolate process abnormalities in a robust way [32].

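The original data matrix $X \in \mathbb{R}^{n \times m}$ ($n$ samples, $m$ variables) is decomposed as the sum of an estimation matrix $\hat{X}$ and a residual matrix $E$:

$$X = \hat{X} + E = TP^{T} + E, \tag{1}$$

where $T$ and $P$ are the score and loading matrices of $X$, respectively. The score vectors $t_i$ are mutually orthogonal, and the loading vectors $p_i$ are orthonormal. Meanwhile, $t_i$ is a linear combination of the columns of $X$, which is derived as

$$t_i = X p_i. \tag{2}$$

Vector $t_i$ represents how the samples are related to each other, while vector $p_i$ represents how the variables are related to each other.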

The next step is to select the PCs in a PCA model. There are various criteria to determine the number of PCs [33]. The eigenvalues $\lambda_i$ corresponding to the eigenvectors describe how much information each PC contains. The cumulative percent variance (CPV) represents the percentage of all the variation of $X$ that is accounted for by the selected PCs, and it is adopted here to determine the number of PCs. For the first $l$ of $m$ PCs, it is defined as

$$\mathrm{CPV}(l) = \frac{\sum_{i=1}^{l} \lambda_i}{\sum_{i=1}^{m} \lambda_i} \times 100\%. \tag{3}$$

That is, PCA divides $X$ into two parts in the foregoing steps: the model estimation matrix $\hat{X}$ and the residual matrix $E$.
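The decomposition and the CPV rule can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the autoscaling step, the `cpv_threshold` value of 0.90, the function name `fit_pca`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(X, cpv_threshold=0.90):
    """Fit a PCA model on normal-operation data X (n samples x m variables).

    cpv_threshold is a hypothetical setting, not a value taken from the paper.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Xs = (X - mean) / std                       # autoscale each variable (assumed)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xs, rowvar=False))
    order = np.argsort(eigvals)[::-1]           # sort eigenpairs, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cpv = np.cumsum(eigvals) / eigvals.sum()    # CPV for l = 1..m, as a fraction
    # Smallest number of PCs whose CPV reaches the threshold.
    n_pc = min(int(np.searchsorted(cpv, cpv_threshold)) + 1, len(eigvals))
    P = eigvecs[:, :n_pc]                       # loading matrix
    T = Xs @ P                                  # score matrix, t_i = X p_i
    E = Xs - T @ P.T                            # residual matrix
    return {"mean": mean, "std": std, "P": P, "T": T, "E": E,
            "eigvals": eigvals, "cpv": cpv, "n_pc": n_pc}

# Synthetic stand-in for plant data: 6 correlated "sensors" driven by 2 factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 6))
X = factors @ mixing + 0.05 * rng.normal(size=(500, 6))
model = fit_pca(X)
print(model["n_pc"], model["cpv"].round(3))
```

The returned `T`, `P`, and `E` satisfy the decomposition of the scaled data into an estimation part and a residual part, so the same dictionary can feed the monitoring statistics of the next subsection.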

##### 2.2. Fault Detection of PCA

Two statistics are commonly used to carry out this task: the $Q$ statistic and Hotelling's $T^2$ statistic. They are defined to measure the variation in the residual matrix $E$ and the estimation matrix $\hat{X}$, respectively. If a new testing vector exceeds the effective region in the principal subspace or a significant residual is observed in the residual subspace, a special event, either due to disturbance changes or due to changes in the relationship between variables, can be detected [2].

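The $Q$ statistic quantifies the lack of fit between the testing vectors and the model. It indicates the distance that a testing vector falls from the PC model. The Hotelling $T^2$ statistic measures the variation within the PCA model. For a testing vector $x$, they are calculated as

$$Q = \left\| \left(I - PP^{T}\right) x \right\|^{2} \le Q_{\alpha}, \qquad T^{2} = x^{T} P \Lambda^{-1} P^{T} x \le T^{2}_{\alpha}, \tag{4}$$

where $P$ is the loading matrix of the retained PCs, $\Lambda$ is the diagonal matrix of the corresponding eigenvalues, and $Q_{\alpha}$ and $T^{2}_{\alpha}$ are the confidence limits for the $Q$ and $T^2$ statistics, respectively. For the calculation of $Q_{\alpha}$ and $T^{2}_{\alpha}$, refer to the doctoral thesis by Li [34].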
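As a hedged illustration of the two statistics, the sketch below computes the squared residual norm and the eigenvalue-weighted score norm for one test vector. The analytical confidence limits of [34] are not reproduced; an empirical 99th percentile of the training statistics is substituted as a stand-in, and the synthetic data, the sensor gains, and the function names are assumptions.

```python
import numpy as np

def fit_pca(X, n_pc):
    """Return mean, std, loadings P (m x n_pc) and the retained eigenvalues."""
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Xs = (X - mean) / std
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xs, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_pc]
    return mean, std, eigvecs[:, order], eigvals[order]

def q_and_t2(x, mean, std, P, eigvals):
    """Q (residual subspace) and T^2 (principal subspace) for one raw vector x."""
    xs = (x - mean) / std
    t = P.T @ xs                        # scores of the test vector
    residual = xs - P @ t               # part of xs the model does not explain
    return float(residual @ residual), float(np.sum(t**2 / eigvals))

# Synthetic stand-in: five sensors that follow one common latent signal.
rng = np.random.default_rng(1)
latent = rng.normal(size=(1000, 1))
gains = np.array([[1.0, 0.8, 1.2, 0.9, 1.1]])
X_train = latent @ gains + 0.05 * rng.normal(size=(1000, 5))
mean, std, P, eigvals = fit_pca(X_train, n_pc=1)

# Empirical limits (assumption): 99th percentile of the training statistics.
train_stats = np.array([q_and_t2(x, mean, std, P, eigvals) for x in X_train])
q_limit, t2_limit = np.percentile(train_stats, 99, axis=0)

x_fault = X_train[0].copy()
x_fault[0] += 1.0                       # hypothetical offset on the first sensor
q, t2 = q_and_t2(x_fault, mean, std, P, eigvals)
print(q > q_limit, t2 > t2_limit)
```

An offset on a single sensor breaks the correlation the model has learned, so in this toy setup it shows up mainly in the residual subspace.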

#### 3. Optimized Framework for Sensor Condition Monitoring Based on Common PCA

All the optimizations based on the common PCA method are summarized in Figure 1. First, the original data are preprocessed with statistical analysis and the sliding window method. The preprocessed data are then used to train the PCA model. At the PCA modeling stage, three kinds of modeling parameter selection criteria are proposed and compared with the common random selection criterion: the variance of sensor measurements, the correlation of sensor measurements, and the type of sensors. The variance criterion contains two different criteria, namely, the standard deviation and the volatility degree of the sensor measurements. Next, a false alarm reducing method is applied to reduce the false alarms of the $Q$ and $T^2$ statistics in the fault detection stage. Finally, the detected abnormal behavior is analyzed in the principal and residual spaces simultaneously to locate the faulty sensor more accurately in the isolation stage. With these optimizations, more credible and reliable monitoring results can be obtained than with the common PCA method.

##### 3.1. Data Preprocessing Stage

Since sensors in a NPP usually work in environments of high temperature, high pressure, high radiation, high humidity, or high corrosion, singular points or noise-like fluctuations are inevitable in the original measurements [35]. If these data are used directly to develop the PCA model (nine coolant outlet temperature sensors are selected as an example), the monitoring results with 1000 testing samples are as shown in Figure 2. It is evident from Figure 2 that the results are unsatisfactory: both the $Q$ and $T^2$ statistics present quite a few alarms under normal operating conditions. Thus, data preprocessing is necessary for data from a real environment.

The abnormal fluctuations in the original data are further classified into singular points and random fluctuations, and they are preprocessed with various methods in this paper.

To eliminate the singular points in the original data, a statistics-based analysis method is applied, which is characterized by its simple structure, small amount of calculation, and fast speed [36]. These advantages make it well suited to the monitoring of sensors in a NPP, where a large number of sensors are installed. The theory of this statistics-based method is explained as follows.

Most random errors obey a normal distribution under normal operating conditions, so there is only a very small probability that a random error is greater than 3 standard deviations of the sensor measurements [37]. Whether a measurement $x_i(k)$ of sensor $i$ is a singular point can be inferred by

$$\left| x_i(k) - \bar{x}_i \right| > 3\sigma_i, \tag{5}$$

where $\bar{x}_i$ is the arithmetic average and $\sigma_i$ is the standard deviation estimate for the equal-precision measurements of sensor $i$. If $x_i(k)$ satisfies (5), it is treated as a singular point and eliminated from the original data directly.
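The 3 standard deviation rule can be sketched in a few lines of Python. The function name, the `n_sigma` parameter, and the synthetic flow-like series are hypothetical; the sample standard deviation from the `statistics` module is assumed as the estimator.

```python
from statistics import mean, stdev

def remove_singular_points(samples, n_sigma=3.0):
    """Drop samples farther than n_sigma standard deviations from the mean."""
    center = mean(samples)
    spread = stdev(samples)             # sample standard deviation (assumed)
    return [x for x in samples if abs(x - center) <= n_sigma * spread]

# Hypothetical series: small fluctuations around 300 with one spike.
flow = [300.0 + 0.1 * ((7 * k) % 5 - 2) for k in range(50)]
flow[20] = 310.0
cleaned = remove_singular_points(flow)
print(len(flow), len(cleaned))
```

Note that the mean and standard deviation are computed with the spike included, so in very short series a single outlier can inflate the standard deviation enough to escape the test; the rule works best on reasonably long records.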

The measurements of three feedwater flow sensors are selected as an example to show the effectiveness of the singular point elimination method, and the results are given in Figure 3. Based on the foregoing analysis, singular points are present in the measurements of all three feedwater sensors (1#, 2#, and 3#).

After singular points are eliminated according to (5), random fluctuations in the measurements are further reduced. Median filtering, arithmetic average filtering, weighted recursive filtering, and wavelet analysis are the most used methods to reduce random fluctuations [38]. The selection of the method mainly depends on the characteristics of the measurements. Considering the type of sensors applied during modeling in this paper, the sliding window average method is used as the denoising method for the sensor measurements from a real NPP [39]. It is a time-domain denoising method that repeatedly takes $N$ contiguous measurements of sensor $i$ and calculates their arithmetic average, where $N$ is the length of the sliding window. The average value in the sliding window is then regarded as the estimated value at moment $k$; this averaging operation is referred to as (6).

Random fluctuations are filtered based on (6). Then, the data present a smoother changing trend after singular points and random fluctuations are reduced from the original. The measurements in Figure 2 are used again to show the effectiveness of data preprocessing, and the results in this case are shown in Figure 4.
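A minimal sketch of sliding window averaging follows. The alignment of the window is an assumption: here the window trails the current moment (it covers the current sample and the preceding ones), and the first few estimates use a shorter window. The function name is hypothetical.

```python
def sliding_window_average(samples, window):
    """Trailing moving average; early samples use as many points as exist."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for k, x in enumerate(samples):
        running += x
        if k >= window:
            running -= samples[k - window]   # drop the sample leaving the window
        smoothed.append(running / min(k + 1, window))
    return smoothed

print(sliding_window_average([1, 2, 3, 4, 5], 3))  # [1.0, 1.5, 2.0, 3.0, 4.0]
```

A longer window gives a smoother trend but also delays the response to a genuine change, which matters when the filtered data are later used to detect slow drifts.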

Compared with Figure 2, it is clear that the false alarms of the $Q$ and $T^2$ statistics are greatly reduced. It can be concluded that data preprocessing is effective in improving the accuracy of the PCA model and that preprocessing data from a real operating environment is necessary.

##### 3.2. Modeling Parameter Selection Stage

After the original data are preprocessed, the next step is to develop the PCA model with the preprocessed measurements. Obviously, it is unrealistic and unreasonable to put all sensors in a NPP into a single PCA model; thus, a distributed framework is proposed in this paper, that is, multiple PCA models running in parallel to implement condition monitoring for all the monitored sensors in a NPP. Hence, how to best group various sensors into various PCA models to get optimal performance is very important [35]. In this context, the following criteria are proposed, which are compared with random modeling parameter selection criterion.

*(1) Variance*. Two different criteria are included in variance, which are standard deviation and volatility degree of the sensor measurements. They are described as follows.

*(a) Standard Deviation*. It refers to the standard deviation of the sensor measurements in the usual statistical sense. Since a similar standard deviation among the sensor measurements in a PCA model may be beneficial to the detection of small failures, it is adopted as a criterion in this paper and calculated by (7).

*(b) Volatility Degree*. It refers to the volatility degree of the sensor measurements, which differs slightly from the “standard deviation” defined in statistical terminology. The volatility degree of the measurements of a sensor is defined by (8).

Compared with the criterion of standard deviation, the criterion of volatility degree may be more reasonable. Since the sensor measurements cover different orders of magnitude, the standard deviation may not describe the variation in the measurements accurately. Two example vectors, given in (9), are used for explanation. Their changing trends, namely, their volatility degrees, are equal. Calculating the standard deviation and the volatility degree of the two vectors, as in (10), confirms this inference: the two vectors have the same volatility degree but different standard deviations. Thus, the volatility degree-based criterion is proposed as a supplement to the standard deviation-based criterion in this paper. This way, sensor measurements with similar changing trends (namely, with similar volatility degree) rather than with similar standard deviation can be grouped together to train a PCA model, which should make the PCA model more sensitive to glitches in the monitored sensors. The fault detection sensitivity with these two criteria is evaluated in the simulation section.
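The scale dependence of the standard deviation can be shown with a small numerical sketch. The paper's own definition of volatility degree in (8) is not reproduced here; the coefficient of variation (standard deviation divided by the mean) is used purely as an assumed stand-in for a measure that ignores the order of magnitude, and the two series are invented.

```python
from math import isclose
from statistics import mean, stdev

def coefficient_of_variation(samples):
    """Scale-free spread measure, used here as a stand-in for volatility degree."""
    return stdev(samples) / abs(mean(samples))

# Two hypothetical series with the same relative trend at different magnitudes.
low = [10.0, 10.1, 9.9, 10.2, 9.8]
high = [100.0 * v for v in low]

print(stdev(low), stdev(high))                         # differ by a factor of 100
print(coefficient_of_variation(low), coefficient_of_variation(high))
```

Grouping by a scale-free measure of this kind would place the two series together, whereas grouping by standard deviation would separate them.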

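*(2) Correlation*. It refers to the correlation coefficient $r_{ij}$ between sensors $i$ and $j$, which can be calculated as

$$r_{ij} = \frac{\operatorname{cov}(x_i, x_j)}{\sigma_i \sigma_j}, \tag{11}$$

where $\operatorname{cov}(x_i, x_j)$ is the covariance of the measurements of sensors $i$ and $j$, and $\sigma_i$ and $\sigma_j$ are their standard deviations. A higher value usually means a more significant linear correlation between sensors $i$ and $j$. Since PCA is a linear analysis method, it is naturally advantageous to group linearly dependent sensors into a single set to develop the PCA model. Thus, this criterion is proposed.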

Then, the sensor measurements with higher correlation coefficients are grouped into the same PCA model. That is, the sensors in each PCA model present a higher linear correlation than those in a randomly grouped PCA model.
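Correlation-based grouping around a reference sensor can be sketched as follows. The data layout (a dictionary from sensor number to measurement list), the function names, and the toy values are hypothetical. Ranking by the absolute value of the coefficient is also an assumption, made so that strongly anti-correlated sensors count as linearly related.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def most_correlated(data, reference, count):
    """Sensor ids ranked by |r| with the reference sensor, strongest first."""
    scores = {sid: abs(pearson(data[reference], series))
              for sid, series in data.items() if sid != reference}
    return sorted(scores, key=scores.get, reverse=True)[:count]

# Toy database keyed by hypothetical sensor numbers.
data = {
    1: [1.0, 2.0, 3.0, 4.0, 5.0],
    2: [2.0, 4.0, 6.0, 8.0, 10.5],
    3: [5.0, 1.0, 4.0, 2.0, 3.0],
    4: [10.0, 8.0, 6.0, 4.0, 2.0],
}
print(most_correlated(data, reference=1, count=2))
```

The selected sensors plus the reference sensor would then form the variable set of one PCA model.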

*(3) Type*. It refers to the types of sensors that are used to measure various parameters in a NPP. Different parameters are usually measured with different types of sensors, which usually have different measurement precisions, work in different environments, and suffer from different external disturbances. Considering all these factors, a type-based modeling parameter selection criterion is proposed. Sensors of the same type can then be grouped together to train a PCA model, which minimizes the influence of the foregoing factors.

All the proposed criteria are tested and evaluated in Section 4 to get an optimal modeling parameter selection criterion.

##### 3.3. Fault Detection and Isolation Stage

Based on data preprocessing and modeling parameter selection, a false alarm reducing method is further applied to improve the accuracy and reliability of the PCA model in the fault detection stage. Meanwhile, the detected abnormal behavior is analyzed in principal and residual space simultaneously in order to locate the faulty sensor more accurately in the fault isolation stage.

The false alarm reducing method defines another confidence limit to further reduce the false alarms of the $Q$ and $T^2$ statistics. If the confidence limit $Q_{\alpha}$ or $T^{2}_{\alpha}$ of Section 2.2 is called the first confidence limit, this new confidence limit is called the second confidence limit for the $Q$ and $T^2$ statistics.

Suppose that the false alarm probability for the $Q$ or $T^2$ statistic is $\alpha$, which is usually set between 0 and 0.05 according to experience in process industries [40]. With a basic observation window of a chosen length, the allowable maximum number of false alarms in the window, namely, the second confidence limit, is derived from (12). The formula also involves an empirical value that is determined by the model precision and is usually set between 0.98 and 1 according to experience in process industries [40]. If the number of false alarms of the $Q$ or $T^2$ statistic in the observation window before a sample exceeds the second confidence limit, that sample is defined as a true faulty state.
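One plausible reading of the second confidence limit is sketched below. It is an assumption, not the paper's formula (12): false alarms in a window are modeled as independent events with a fixed probability, so their count follows a binomial distribution, and the limit is the smallest count whose cumulative probability reaches the empirical confidence value. The names `alpha`, `window`, `beta`, and both functions are hypothetical, as is the choice to include the current sample in the window.

```python
from math import comb

def second_confidence_limit(alpha, window, beta):
    """Smallest k with P(false alarms <= k) >= beta under Binomial(window, alpha)."""
    cumulative = 0.0
    for k in range(window + 1):
        cumulative += comb(window, k) * alpha**k * (1 - alpha) ** (window - k)
        if cumulative >= beta:
            return k
    return window

def confirmed_fault(alarm_flags, t, window, limit):
    """True if the alarms in the window ending at sample t exceed the limit."""
    start = max(0, t - window + 1)
    return sum(alarm_flags[start:t + 1]) > limit

limit = second_confidence_limit(alpha=0.05, window=20, beta=0.99)
alarms = [0] * 30 + [1] * 10            # hypothetical first-limit violations
print(limit, confirmed_fault(alarms, t=39, window=20, limit=limit))
```

Under this reading, isolated violations of the first limit are tolerated, and only a cluster of violations within one window is reported as a fault.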

After the $Q$ or $T^2$ statistic exceeds the second confidence limit, an abnormality is detected. The abnormal behavior is then analyzed in the principal and residual spaces simultaneously to locate the faulty sensor more accurately in the fault isolation stage. Since the $T^2$ and $Q$ statistics represent the total variation in the principal and residual spaces, respectively, the contributions of sensors to the $Q$ and $T^2$ statistics are applied simultaneously to identify the faulty sensor [30].

Suppose that a testing vector is expressed as $x = [x_1, x_2, \ldots, x_m]$, where $m$ is the number of sensors in $x$. The contribution of sensor $i$ to the total variation in the residual subspace (represented by the $Q$ statistic) is defined by (13). The contribution of sensor $i$ to the total variation in the principal subspace (represented by the $T^2$ statistic) is calculated in the following two steps.

First, calculate the contribution of $x_i$ to each score $t_j$ by (14), where $p_{ij}$ is the $i$th element of the loading vector $p_j$.

Second, calculate the contribution of $x_i$ to the $T^2$ statistic by (15).
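The two kinds of contribution can be sketched with a formulation that is common in the PCA monitoring literature; it is an assumption and may differ from the paper's equations (13) to (15). The contribution to the residual statistic is taken as the squared residual of each sensor, and the contribution to the principal-subspace statistic sums, over the retained scores, the product of the normalized score, the loading element, and the sensor value, with negative terms optionally set to zero. All names and the toy model are hypothetical.

```python
import numpy as np

def sensor_contributions(xs, P, eigvals, clip_negative=True):
    """xs: scaled test vector (m,), P: loadings (m, l), eigvals: (l,)."""
    t = P.T @ xs                                   # scores
    residual = xs - P @ t
    q_contrib = residual**2                        # per-sensor share of Q
    # Element [i, j] = (t_j / lambda_j) * p_ij * x_i
    per_score = P * (t / eigvals) * xs[:, None]
    if clip_negative:
        per_score = np.clip(per_score, 0.0, None)  # drop negative terms (assumed)
    t2_contrib = per_score.sum(axis=1)             # per-sensor share of T^2
    return q_contrib, t2_contrib

def as_percent(contrib):
    """Express contributions as percentages of their total."""
    total = contrib.sum()
    return 100.0 * contrib / total if total > 0 else np.zeros_like(contrib)

# Toy model: five sensors, one PC with equal loadings, offset on sensor 0.
P = np.ones((5, 1)) / np.sqrt(5.0)
eigvals = np.array([4.0])
xs = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
q_c, t2_c = sensor_contributions(xs, P, eigvals)
print(as_percent(q_c).round(1), as_percent(t2_c).round(1))
```

Without clipping, the per-sensor terms add up exactly to the squared residual norm and to the eigenvalue-weighted score norm, which is a convenient consistency check.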

When a NPP is operating under normal conditions, the $Q$ and $T^2$ statistics should be within their confidence limits, and the contributions of each sensor to the $Q$ and $T^2$ statistics should theoretically be almost equal. If a fault occurs on the monitored sensors, the $Q$ and/or $T^2$ statistics will exceed their confidence limits, and the two contribution indexes can then be used directly to locate the faulty sensor. Furthermore, if the fault is just a small glitch, such as a small drift that the $Q$ and $T^2$ statistics may fail to detect, an evident increasing trend can still be seen in one or both contribution indexes for the drifting sensor. These two fault isolation indexes are therefore beneficial to both the detection and the isolation of such small faults.

Small drifts on sensors may not result in severe accidents, but if the drifting sensor participates in important control processes in the NPP, the drift may cause operation to deviate from the optimal condition, with a potential decline of the plant economy as a consequence. Even if small drifts appear on sensors that do not participate in important control processes and serve only monitoring purposes, these two fault isolation indexes can still contribute to the CBM strategy in a NPP. Since a higher index value usually indicates unknown degradation of the sensor, sensors can be calibrated, maintained, or repaired as required, and excessive calibration and maintenance manipulations can be avoided.

#### 4. Simulation Tests and Results

In order to test the functionality of the optimized PCA method, sensor measurements are acquired from a real NPP under normal operating conditions at full power to carry out the simulations. Since a large number of sensors are included in the database of a NPP, the sensors are numbered with Arabic numerals so that the simulation results can be presented more conveniently. To verify the performance of PCA models with various modeling parameter selection criteria, five PCA models are built based on the proposed criteria, as described in the following. Meanwhile, in order to verify the fault detection and isolation performance of the optimized PCA model, failures of different degrees are imposed sequentially on the measurements of a coolant outlet temperature sensor (marked as the 1# sensor in the database). Failures are introduced on this sensor because the 1# sensor is included in all five PCA models.

The five proposed PCA models are determined as follows.

*(1) PCA Model with Modeling Parameter Selection Criterion of Type*. Since the 1# sensor is contained in all five PCA models, sensors of the same type as the 1# sensor are selected from the database to train this PCA model. Sensors are identified by Arabic numerals that represent their positions in the database.

*(2) PCA Model with Modeling Parameter Selection Criterion of Standard Deviation*. The 1# sensor is also included in this PCA model. First, the standard deviation of all sensors in the database is calculated based on (7). Then, the sensors in the database whose standard deviation is most similar to that of the 1# sensor are selected to train this PCA model. The selected sensors are ordered by the similarity of their standard deviation to that of the 1# sensor, from large to small.

*(3) PCA Model with Modeling Parameter Selection Criterion of Volatility Degree*. In the same way, the volatility degree of the sensors in the database is first calculated based on (8), and the sensors whose volatility degree is most similar to that of the 1# sensor are selected as the modeling parameters of this PCA model. The selected sensors are ordered by the similarity of their volatility degree to that of the 1# sensor, from large to small.

*(4) PCA Model with Modeling Parameter Selection Criterion of Correlation Coefficients*. To determine this PCA model, the correlation coefficients between the 1# sensor and all the other sensors in the database are first calculated based on (11). The eight sensors with the largest correlation coefficients to the 1# sensor are then selected as the modeling parameters of this PCA model, ordered by their correlation coefficients to the 1# sensor from large to small.

*(5) PCA Model with Modeling Parameter Selection Criterion of Random*. This PCA model is developed for comparison. Together with the 1# sensor, the selected modeling parameters are the 80#, 47#, 55#, 61#, 130#, 149#, 102#, and 112# sensors, which cover different types and different orders of magnitude of standard deviation, volatility degree, and correlation coefficient.

The 1# sensor is common to all five PCA models, and each PCA model includes nine sensors. In this context, failures can be imposed on the shared 1# sensor measurements for every PCA model, and the model performances with different modeling parameter selection criteria can be evaluated under comparable preconditions.

##### 4.1. Simulations with Normal Measurements

1000 original samples are used to train the five PCA models, and another 1000 original samples are selected as the testing data to carry out the simulation tests. The results of the $Q$ and $T^2$ statistics in the five PCA models are shown in Figures 5 and 6, respectively. Red dotted lines in the figures are the confidence limits for the $Q$ and $T^2$ statistics. One of the two statistics presents false alarms in all five PCA models under normal operating conditions. The other performs relatively better: its false alarms occur only in the PCA models with the parameter selection criteria of random and standard deviation.

The original samples are then preprocessed with the methods proposed in this paper, and the preprocessed data are used to train the five PCA models. In this context, the simulation results of the $Q$ and $T^2$ statistics in the five PCA models are shown in Figures 7 and 8. Since singular points and random fluctuations in the original samples are eliminated by the statistical and sliding window methods, the false alarms of the $Q$ and $T^2$ statistics are reduced to some extent.

On the basis of the data preprocessing, the second confidence limit for the $Q$ and $T^2$ statistics is applied to further reduce their false alarms. With the second confidence limit applied, the false alarm probabilities of the $Q$ and $T^2$ statistics in the five PCA models are summarized in Table 1. The false alarms of the $Q$ and $T^2$ statistics in all five PCA models are reduced to lower levels by the false alarm reducing method. As a result, preprocessing the original data and applying the false alarm reducing method to the $Q$ and $T^2$ statistics both contribute to false alarm reduction under normal operating conditions, and the model performance is improved in this way.

From Table 1, it can be seen that the PCA model with the parameter selection criterion of correlation shows the best performance on sensor fault detection among the five PCA models. The false alarms of the $Q$ and $T^2$ statistics are reduced to 0 and 0.2%, respectively, in this PCA model, which are lower than those in the other four PCA models.

Due to the influence of model precision and external environments, the contributions of sensors to the $Q$ and $T^2$ statistics in a PCA model are not equal under normal operating conditions, as shown in Figure 9. Thus, two samples are selected from the 1000 samples (namely, the 600th and 1000th samples) for comparison. The contributions of sensors to the $Q$ and $T^2$ statistics in the five PCA models are calculated at the 600th and 1000th sample points and illustrated in Figures 9(a), 9(b), and 9(c). The statistic shown for the PCA model with the random parameter selection criterion in Figure 9(a) is taken as an example. At the 600th sample point, the contribution of the 1# sensor to this statistic is about 14%, while the contribution of the 130# sensor is about 7%. There is a large contribution difference between these two sensors, which in theory should indicate unknown failures in the monitored sensors. However, at the 1000th sample point, the contribution of the 1# sensor in this PCA model is still around 14%, and that of the 130# sensor is still around 7%. Similar results can be seen for the other sensors in this PCA model. That is, the contributions of the sensors in a PCA model to the $Q$ or $T^2$ statistic are not equal at a single sample point, but the contribution of each sensor remains almost unchanged across sample points. It can therefore be inferred that no failures occur in the monitored sensors; the contribution differences among sensors may result from unknown uncertainty factors in the PCA model rather than from sensor failures. Similar results are obtained in the other four PCA models.

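[Figure 9, panels (a) to (c): contributions of sensors to the $Q$ and $T^2$ statistics in the five PCA models at the 600th and 1000th sample points under normal operating conditions.]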

The contribution figures also show that the PCA model with the parameter selection criterion of correlation performs better on fault isolation under normal operating conditions. For one of the two statistics, the contributions of the sensors in this model are almost equal, which best accords with the theoretical analysis. For the other statistic, the contributions in this PCA model also agree better with the theoretical analysis than those in the other four PCA models. On the other hand, Figure 9 shows that the PCA model with the random parameter selection criterion performs worst in this respect: for both the $Q$ and $T^2$ statistics, the contributions of the sensors are quite different in this case.

##### 4.2. Simulations with Abnormal Measurements

In order to verify the fault detection and isolation ability of the proposed PCA model, two artificial drifts (ramps) are imposed on the coolant outlet temperature sensor (namely, the 1# sensor in the database) at the 400th sample point. The first drift simulates a common problem that affects process sensors and may result from aging. It is a ramp that grows to 0.45°C on the 1# sensor measurements. This small drift corresponds to a maximum change of 0.15% in the measurements, which is imperceptible in the time profile. The second drift is relatively bigger and represents a common issue that may result from mechanical failures. It is also a ramp, growing to 3.5°C on the 1# sensor measurements, which is equivalent to a maximum change of 1.15% and can be seen in the time profile.
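Imposing such an artificial ramp on a measurement series can be sketched as below. Several details are assumptions: the ramp is taken to be linear, to start from zero at index 400 (zero-based), and to reach its final value at the last sample; the constant baseline of 300.0 is a hypothetical placeholder rather than plant data, and the function name is invented.

```python
def impose_ramp_drift(samples, start, final_drift):
    """Add a linear ramp that is 0 at index start and final_drift at the end."""
    last = len(samples) - 1
    if not 0 <= start < last:
        raise ValueError("start must leave at least one sample for the ramp")
    drifted = list(samples)
    for k in range(start, last + 1):
        drifted[k] += final_drift * (k - start) / (last - start)
    return drifted

baseline = [300.0] * 1000                      # hypothetical flat temperature
small = impose_ramp_drift(baseline, start=400, final_drift=0.45)
large = impose_ramp_drift(baseline, start=400, final_drift=3.5)
print(small[399], small[-1], large[-1])
```

Feeding the drifted series through the trained model and tracking the statistics and contributions over time then reproduces the kind of test described in this subsection.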

As shown in Figure 10, the statistic plotted there cannot detect the small drift on the 1# sensor in any of the five PCA models. In Figure 11, increasing trends of the other statistic can be seen in the last period of the tests; however, the trends are not significant and show high volatility, so the results are uncertain. The contributions of sensors, illustrated in Figure 12, are therefore needed to help detect the small failure on the 1# sensor. For explanation, the PCA model with random parameter selection in Figure 12(a) is taken as an example.

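[Figure 12, panels (a) to (c): contributions of sensors to the $Q$ and $T^2$ statistics in the five PCA models at the 600th and 1000th sample points with the small drift imposed on the 1# sensor.]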

From Figure 12(a), the contribution of the 1# sensor to the statistic that trends upward in Figure 11 is about 22% at the 600th sample point, and it almost reaches 30% at the 1000th sample point. This large increase differs from the situation under normal conditions, where the contributions remain unchanged between the 600th and 1000th sample points. In contrast, the contribution of the 80# sensor to the same statistic is about 20% at the 600th sample point and falls to 18% at the 1000th sample point. The other sensors in this PCA model (47#, 55#, 61#, 130#, 149#, 102#, and 112#) behave in the same way: their contributions remain almost unchanged or decrease slightly as the drift develops on the 1# sensor. For the other statistic, however, no evident contribution differences appear on any sensor in this PCA model between the 600th and 1000th sample points. This is consistent with Figure 10, where that statistic shows almost no obvious change during the test either.

Based on the analysis of the statistics and the contributions to them, it can be inferred that 1# sensor behaves abnormally. That is, the PCA models are able to detect and isolate a sensor with this level of drift. Meanwhile, Figure 11 also shows that the PCA model with correlation parameter selection is more sensitive in fault detection than the other four PCA models, since it detects the small drift on 1# sensor more quickly. The PCA models with the standard deviation and volatility degree selection criteria rank second and third, and the PCA model with the random parameter selection criterion shows the worst fault detection performance in this case.

From Figure 12, it can also be concluded that the PCA model with correlation parameter selection performs better in isolating the small fault. The contributions of 1# sensor to the $Q$ statistics at the 1000th sample point in the five PCA models are taken as an example to support this conclusion.

Since the failure imposed on 1# sensor is a ramp function, it develops over time, and the contribution of 1# sensor to the $Q$ statistics grows accordingly. At the 1000th sample point, this contribution reaches about 30%, 30%, 35%, 40%, and 60% in the PCA models with the random, standard deviation, volatility degree, type, and correlation parameter selection criteria, respectively. The contribution in the model with the correlation criterion is clearly larger than in the other four PCA models, which greatly helps to isolate the drift on 1# sensor among the monitored sensors. Thus, compared with the other four PCA models, the PCA model with the correlation parameter selection criterion shows the best performance in isolating sensor faults with small drifts.
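The contribution analysis used for isolation can be illustrated with a short sketch. It assumes one common definition, in which each sensor's contribution is its share of the squared residual left after projection onto the retained principal components; the paper's exact formula may differ. The helper names `fit_pca` and `q_contributions`, the synthetic nine-sensor data, and the drift size are hypothetical.

```python
import numpy as np

def fit_pca(X_train, n_components):
    """Fit PCA on standardized fault-free data; return (mean, std, loadings P)."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]           # eigenvalues in descending order
    return mean, std, eigvecs[:, order[:n_components]]

def q_contributions(X, mean, std, P):
    """Per-sensor share (in %) of the squared residual for each row of X."""
    Z = (X - mean) / std
    residual = Z - Z @ P @ P.T                  # part not explained by the retained PCs
    sq = residual ** 2
    return 100.0 * sq / sq.sum(axis=1, keepdims=True)

# Synthetic stand-in data (an assumption): 9 correlated sensors, 2 latent factors.
rng = np.random.default_rng(0)
angles = np.arange(9) * np.pi / 9
loadings = np.vstack([np.cos(angles), np.sin(angles)])   # shape (2, 9)

def simulate(n):
    return rng.standard_normal((n, 2)) @ loadings + 0.1 * rng.standard_normal((n, 9))

mean, std, P = fit_pca(simulate(2000), n_components=2)

X_test = simulate(1000)
X_test[399:, 0] += np.linspace(0.0, 3.0, 601)   # ramp drift on sensor index 0

contrib = q_contributions(X_test, mean, std, P)
print("sensor 0 share at the 600th sample:  %.1f%%" % contrib[599, 0])
print("sensor 0 share at the 1000th sample: %.1f%%" % contrib[999, 0])
```

Because the contributions are normalized to sum to 100% at each sample, a growing share on one sensor necessarily shrinks the shares of the others, which matches the slight decreases reported for the healthy sensors.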

In contrast, the condition monitoring results with the larger drift on 1# sensor are shown in Figures 13 and 14. The figures indicate that both the $T^2$ and $Q$ statistics in all five PCA models detect the failure during the test. That is, the PCA method is sufficiently sensitive to failures of this magnitude on the monitored sensors.
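For readers who want to reproduce this kind of detection test, the sketch below computes both monitoring statistics for a PCA model. It is a toy under stated assumptions: the control limits are plain empirical quantiles of the fault-free statistics rather than the limits used in the paper, and the data, the names `fit_monitor` and `t2_q_statistics`, and the drift size are hypothetical. How sensitive each statistic is depends on the data, so the toy numbers are not meant to reproduce Figures 13 and 14.

```python
import numpy as np

def t2_q_statistics(X, model):
    """Return Hotelling's T^2 and Q (squared prediction error) for each row of X."""
    Z = (X - model["mean"]) / model["std"]
    scores = Z @ model["P"]                           # projection onto retained PCs
    t2 = np.sum(scores ** 2 / model["eigvals"], axis=1)
    residual = Z - scores @ model["P"].T              # part left in the residual space
    q = np.sum(residual ** 2, axis=1)
    return t2, q

def fit_monitor(X_train, n_components, quantile=0.99):
    """Fit PCA on fault-free data and set empirical control limits (an assumption)."""
    mean, std = X_train.mean(axis=0), X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    keep = np.argsort(eigvals)[::-1][:n_components]
    model = {"mean": mean, "std": std, "P": eigvecs[:, keep], "eigvals": eigvals[keep]}
    t2, q = t2_q_statistics(X_train, model)
    model["t2_limit"] = np.quantile(t2, quantile)
    model["q_limit"] = np.quantile(q, quantile)
    return model

# Synthetic stand-in data (an assumption): 9 correlated sensors, 2 latent factors.
rng = np.random.default_rng(1)
angles = np.arange(9) * np.pi / 9
loadings = np.vstack([np.cos(angles), np.sin(angles)])

def simulate(n):
    return rng.standard_normal((n, 2)) @ loadings + 0.1 * rng.standard_normal((n, 9))

model = fit_monitor(simulate(2000), n_components=2)

X_normal = simulate(1000)
X_faulty = X_normal.copy()
X_faulty[399:, 0] += np.linspace(0.0, 3.0, 601)      # ramp drift on sensor index 0

t2_normal, q_normal = t2_q_statistics(X_normal, model)
t2_faulty, q_faulty = t2_q_statistics(X_faulty, model)

for name, stat, limit in [("T2", t2_faulty, model["t2_limit"]),
                          ("Q", q_faulty, model["q_limit"])]:
    rate = np.mean(stat[-100:] > limit)
    print(f"{name}: {rate:.0%} of the last 100 samples exceed the limit")
```

Splitting the check into a principal-space statistic and a residual-space statistic is what allows a drift that breaks the correlation between sensors to show up in $Q$ even when the scores themselves stay within their normal range.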

In this case, the contributions of the sensors to the $T^2$ and $Q$ statistics in the five PCA models are shown in Figure 15. In each PCA model, the contribution of 1# sensor to the $T^2$ or $Q$ statistics at the 1000th sample point is significantly larger than that at the 600th sample point, which agrees with the theoretical analysis. Moreover, owing to the larger drift on 1# sensor, the contributions of 1# sensor are also significantly greater than those in Figure 12. As a result, the failure on 1# sensor is located from the contribution distribution of the sensors.

Meanwhile, Figure 15 also shows that the PCA model with the random parameter selection criterion performs worst among the PCA models. Only in this model is the contribution of 1# sensor to statistics below 50% at either the 600th or the 1000th sample point. In the other four PCA models, the contributions are all well above 50% at both the 600th and the 1000th sample points, which indicates more effective fault detection and isolation during the test. Thus, it can be concluded that the PCA models with the standard deviation, volatility degree, type, and correlation parameter selection criteria all perform well in isolating sensor faults with larger failures.

Based on the foregoing simulations, the following conclusions can be drawn:

(1) The proposed data preprocessing and false alarm reducing methods prove effective in reducing the false alarms of the $T^2$ and $Q$ statistics in a PCA model, which amounts to an improvement in model performance.

(2) Simulations under normal and abnormal conditions show that the PCA model with the correlation parameter selection criterion performs better in both fault detection and fault isolation than the other four PCA models.

#### 5. Conclusions and Perspectives

An optimized PCA framework for sensor condition monitoring is proposed in this paper. The proposed optimizations cover several modeling procedures of the common PCA method, including the data preprocessing stage, the modeling parameter selection stage, and the fault detection and isolation stage. In the data preprocessing stage, singular points and random fluctuations in the original data are eliminated with various techniques. In the modeling parameter selection stage, various parameter selection criteria are proposed to obtain optimal model performance of the PCA method. In the fault detection and isolation stage, a statistics-based method is further applied, on top of data preprocessing, to reduce the false alarms of the $T^2$ and $Q$ statistics. Meanwhile, the confirmed faulty state is examined in the principal and residual spaces simultaneously to locate the faulty sensor more precisely.

Data from a real NPP are used to test the optimized PCA method in this paper. According to the simulation results under normal conditions, false alarms of the $T^2$ and $Q$ statistics can be greatly reduced by applying the data preprocessing and false alarm reducing methods. Based on the simulations with faulty data, the optimized PCA method proves effective in sensor fault detection and isolation for both small and major failures. Meanwhile, it can be concluded that the PCA model with the correlation parameter selection criterion performs better under both normal and abnormal operating conditions.

Although valuable improvements have been made in this paper, much work remains for the future. How to further process the remaining false alarms and how best to reconstruct the faulty data will be analyzed on the basis of the work in this paper.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

The authors acknowledge the financial support provided to the present research by the national project “Research on Online Monitoring and Operation Support Techniques in a Nuclear Power Plant.”