Abstract

Many reliability, availability, and maintainability (RAM) models have been proposed in the literature. However, such models are becoming increasingly complex as they attempt to capture equipment performance more realistically, for example by explicitly addressing the effect of component ageing and degradation, surveillance activities, and corrective and preventive maintenance policies. The best model must then be fitted to real data by estimating the model parameters with an appropriate tool. This problem is not easy to solve in some cases, since the number of parameters is large and the available data are scarce. This paper considers two main failure models commonly adopted to represent the probability of failure on demand (PFD) of safety equipment: demand-caused and standby-related failures. It proposes a maximum likelihood estimation (MLE) approach for parameter estimation of a reliability model of demand-caused and standby-related failures of safety components exposed to degradation by demand stress and ageing that undergo imperfect maintenance. The case study considers real failure, test, and maintenance data for a typical motor-operated valve in a nuclear power plant. The results of the parameter estimation and the selection of the best model are discussed.

1. Introduction

The safety of nuclear power plants (NPPs) depends on the availability of safety-related components that are normally on standby and only operate in the case of a true demand. These components typically have two main types of failure modes that contribute to the probability of failure on demand: (a) demand-caused failures, associated with a demand failure probability, and (b) standby-related failures, associated with a standby hazard function.

Both are generally treated as constant values in standard Probabilistic Risk Assessment (PRA) models. In PRA, such parameters are assigned probability density functions, which are tailored from an a priori generic probability distribution, for example, exponential, lognormal, Weibull, or beta, depending on the particular type of component, for example, a motor-driven pump or a motor-operated valve. A Bayesian approach is used to combine such generic probability density functions with plant-specific failure data for each particular component [1–4].

However, both failure modes are often affected by degradation mechanisms such as demand-related stress and ageing, which cause the component to degrade with chronological time and ultimately to fail. Maintenance and test activities are performed to control degradation and the unreliability and unavailability of such components, although they have both positive and negative effects. Thus, different approaches have been proposed in the literature to model a time-dependent demand failure probability and standby hazard function that take such effects into account either implicitly or explicitly.

Samanta et al. [5, 6] proposed a well-organized foundation to account for ageing and the positive and adverse effects of testing components in modelling demand failure probability and standby hazard function models. However, this model does not take into account the positive effect of maintenance activities as a function of their effectiveness in managing component degradation due to demand-induced stress and ageing.

As regards the standby-related failure mode, Martorell et al. [7] provide an age-dependent reliability model associated only with standby-related failures, which explicitly takes into account the effect of equipment ageing and the positive and negative effects of maintenance activities, founded on imperfect maintenance modelling. Mullor [8] proposes an approach for parameter estimation of this type of imperfect maintenance model. Martón et al. [9] propose an approach to modelling the unavailability of safety-related components associated with standby-related failures that explicitly addresses all aspects of the effect of ageing, maintenance effectiveness, and test efficiency. Other authors have proposed alternative approaches to modelling the effect of ageing and test and maintenance activities [10–13].

As regards the demand-caused failure mode, this probability of a safety component is normally considered to be mainly affected by demand-induced stress, for example, due to true demands, proof tests, and others. The demand-induced stress is therefore modelled with a stochastic degradation jump in [14, 15]. These studies consider that random shocks occur according to a Nonhomogeneous Poisson Process, leading to the immediate failure of the component. Shin et al. [16] propose an age-dependent model that considers, among others, the effect of “test stress” and maintenance effects. In general, previous studies have found that the demand failure probability should be considered as a function not only of the number of tests but also of the effectiveness of maintenance activities. Thus, recently, Martorell et al. [17] have proposed a new reliability model for the demand failure probability that explicitly addresses all aspects of the effect of demand-induced stress, maintenance effectiveness, and test efficiency.

In this context, the objective of this paper is to fit the best model to represent the real operation of safety-related equipment, dealing with the problem of estimating a significant number of parameters from a small amount of data. With this aim, a methodology for parameter estimation and model selection is developed. This methodology allows the joint estimation of reliability and maintenance-related parameters, and provides a measure of goodness of fit to select the best imperfect maintenance model for each failure mode. This study considers a standby-related failure model assuming linear ageing and a demand-caused failure model assuming test-induced stress. In addition, it considers imperfect maintenance, adopting Proportional Age Setback and Proportional Age Reduction for preventive maintenance modelling. Then, maximum likelihood estimation (MLE) using a direct search algorithm based on the Nelder-Mead Simplex (NMS) method is used to estimate maintenance effectiveness and ageing rate simultaneously. A practical and realistic case study is included, which addresses the parameter estimation for a typical motor-operated valve in a nuclear power plant. Additionally, it is shown how the estimates obtained can be used, for example, in the planning of maintenance and surveillance test activities with the aim of minimizing equipment unavailability.

The rest of this paper is organized as follows: Section 2 introduces briefly the demand failure probability model and the standby-related failure model that addresses component degradation because of demand-induced stress and ageing, respectively, and the positive effect of imperfect preventive maintenance. Section 3 describes the parameter estimation method used to fit plant data to reliability models introduced in the previous section. Section 4 describes a case study involving a motor-operated valve of a pressurized water reactor nuclear power plant. Lastly, Section 5 presents the concluding remarks.

2. Reliability Models under Imperfect Maintenance

In this paper the models presented by Martorell et al. [7, 17] have been selected to model the standby hazard function and the demand failure probability, respectively. In the following subsections, both models are briefly described and the expressions involved in the parameter estimation and model selection are obtained under the following assumptions:
(1) Time-directed preventive maintenance whose effect depends on its effectiveness. The effectiveness is represented by an imperfect maintenance model with parameter ε, ranging in the interval [0, 1] and adopting either the Proportional Age Setback (PAS) or the Proportional Age Reduction (PAR) model.
(2) Corrective maintenance with minimal repair. That is, repairing failures does not improve the age of the equipment. Therefore, for corrective maintenance, we adopt the Bad As Old (BAO) model.
(3) A linear ageing model, which is selected to model the standby hazard function.
(4) Test-caused stress, which is the only degradation mechanism considered to model the demand failure probability.

2.1. Reliability Model of Standby-Related Failures

In the context of safety-related equipment of NPPs, the most frequently used function in reliability analysis is the hazard function. The standby hazard function of equipment depends on its age, which is a function of the chronological time elapsed since its installation and the effectiveness of the maintenance activities performed on it. So, an age-dependent hazard function model for the period following maintenance number $m$ can be expressed, as in [7], in terms of the initial hazard function of the equipment and the age of the equipment immediately after that maintenance activity.

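Adopting a linear model for the hazard function, the age-dependent hazard function after maintenance number $m$ can be written as

$$h_m(w) = h_0 + \alpha\, w, \tag{2}$$

where $h_0$ is the initial hazard function, $\alpha$ is the linear ageing rate, and

$$w = w_m^{+} + (t - t_m), \tag{3}$$

with $w_m^{+}$ being the age immediately after maintenance $m$ and $t_m$ being the time at which the equipment undergoes maintenance activity $m$.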

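The cumulative hazard function in the period after maintenance number $m$ can be obtained by integrating the hazard function given by equation (2), with initial hazard $h_0$, linear ageing rate $\alpha$, and age $w_m^{+}$ immediately after the maintenance performed at time $t_m$, as

$$H_m(t) = \int_{t_m}^{t} h_m\big(w(u)\big)\,du = h_0\,(t - t_m) + \alpha\left[w_m^{+}\,(t - t_m) + \tfrac{1}{2}\,(t - t_m)^2\right]. \tag{4}$$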

The age of the component immediately after maintenance number $m$, $w_m^{+}$, and, therefore, the hazard function and the cumulative hazard function depend on the model of imperfect maintenance selected (PAS or PAR). In the following subsections, the particularization of the previous equations to the PAS or PAR model is presented.
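As a rough numerical illustration of the linear-ageing model, the sketch below evaluates a hazard of the assumed form `h0 + alpha * w`, with `w = w_plus + (t - t_m)`, and compares the closed-form cumulative hazard over a maintenance period with a midpoint-rule integral. The function names, the parameter names `h0`, `alpha`, `w_plus`, and all numeric values are illustrative assumptions, not plant data.

```python
def hazard(t, t_m, w_plus, h0, alpha):
    """Linear-ageing hazard at chronological time t >= t_m (assumed form)."""
    return h0 + alpha * (w_plus + (t - t_m))


def cumulative_hazard(t, t_m, w_plus, h0, alpha):
    """Closed-form integral of `hazard` from t_m to t."""
    d = t - t_m
    return h0 * d + alpha * (w_plus * d + 0.5 * d * d)


def cumulative_hazard_numeric(t, t_m, w_plus, h0, alpha, steps=1000):
    """Midpoint-rule integral of `hazard`, used only as a cross-check."""
    dt = (t - t_m) / steps
    return sum(
        hazard(t_m + (k + 0.5) * dt, t_m, w_plus, h0, alpha) * dt
        for k in range(steps)
    )


# Hypothetical values: times in days, hazard in 1/day.
closed = cumulative_hazard(900.0, 500.0, 120.0, 1e-5, 2e-8)
numeric = cumulative_hazard_numeric(900.0, 500.0, 120.0, 1e-5, 2e-8)
print(closed, numeric)
```

Because the hazard is linear in time within a period, the midpoint rule reproduces the closed form up to rounding, which makes it a convenient sanity check before moving to the PAS and PAR particularizations.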

2.1.1. Proportional Age Setback Model

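In the PAS approach, each maintenance activity is assumed to shift the origin of time from which the age of the equipment is evaluated. The PAS model considers that maintenance activities reduce, proportionally to a factor ε, the age the equipment had immediately before it enters maintenance. If ε = 0, the PAS model simply reduces to the BAO situation, whereas ε = 1 corresponds to the Good As New (GAN) situation. Thus, this model is a natural generalization of both GAN and BAO models in order to account for imperfect maintenance. Under the PAS approach, the age of the equipment immediately after maintenance activity $m$ is given by [7]

$$w_m^{+} = (1 - \varepsilon)\left[w_{m-1}^{+} + (t_m - t_{m-1})\right] = \sum_{k=1}^{m} (1 - \varepsilon)^{m-k+1}\,(t_k - t_{k-1}), \tag{5}$$

with $w_0^{+} = 0$ and $t_0 = 0$. Replacing the expressions for $w$ and $w_m^{+}$ given by (3) and (5), respectively, into (2), the induced hazard function, with initial hazard $h_0$ and linear ageing rate $\alpha$, becomes

$$h_m(t) = h_0 + \alpha\left[\sum_{k=1}^{m} (1 - \varepsilon)^{m-k+1}\,(t_k - t_{k-1}) + (t - t_m)\right]. \tag{6}$$

In a similar way, the cumulative hazard function in the period can be obtained by replacing (3) and (5) into (4), obtaining

$$H_m(t) = h_0\,(t - t_m) + \alpha\left[(t - t_m)\sum_{k=1}^{m} (1 - \varepsilon)^{m-k+1}\,(t_k - t_{k-1}) + \tfrac{1}{2}\,(t - t_m)^2\right]. \tag{7}$$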

2.1.2. Proportional Age Reduction Model

In the PAR approach, each maintenance activity is assumed to reduce proportionally the age gained from the previous maintenance. Thus, while the PAS model considers that each maintenance activity reduces the total equipment age, the PAR model assumes that maintenance only reduces a portion of the equipment age, the one gained from the previous maintenance, keeping the rest unaffected. The PAR model considers that maintenance reduces the age gained between two consecutive maintenance activities by a factor ε. Again, if ε = 0, the PAR model simply reduces to BAO, whereas if ε = 1 it reduces to GAN.
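The verbal definitions of PAS and PAR can be contrasted with a short sketch. It assumes the age is zero at time zero, that PAS multiplies the whole age held just before maintenance by (1 - eps), and that PAR applies the same factor only to the age gained since the previous maintenance. The function name and the maintenance times are hypothetical.

```python
def ages_after_maintenance(maint_times, eps, model):
    """Age right after each maintenance, starting from age 0 at time 0.

    maint_times: ascending maintenance times; eps: effectiveness in [0, 1];
    model: "PAS" or "PAR".
    """
    ages, w_plus, t_prev = [], 0.0, 0.0
    for t_m in maint_times:
        gained = t_m - t_prev  # age accumulated since the previous maintenance
        if model == "PAS":
            # Setback of the total age held just before maintenance.
            w_plus = (1.0 - eps) * (w_plus + gained)
        elif model == "PAR":
            # Reduction of only the age gained since the previous maintenance.
            w_plus = w_plus + (1.0 - eps) * gained
        else:
            raise ValueError("model must be 'PAS' or 'PAR'")
        ages.append(w_plus)
        t_prev = t_m
    return ages


times = [10.0, 20.0, 30.0]  # hypothetical maintenance times
pas = ages_after_maintenance(times, 0.5, "PAS")
par = ages_after_maintenance(times, 0.5, "PAR")
print(pas, par)
```

With eps = 0 both models return the chronological times (BAO), and with eps = 1 both return zero ages (GAN). For intermediate values the PAS age levels off under periodic maintenance, whereas the PAR age keeps growing linearly.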

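According to the above conditions, the age of the equipment at instant $t$ of the period after maintenance activity $m$, performed at time $t_m$, using the PAR model is given by

$$w = t - \varepsilon\, t_m, \tag{8}$$

which follows from $w_m^{+} = w_{m-1}^{+} + (1 - \varepsilon)(t_m - t_{m-1})$ with $w_0^{+} = 0$ at $t_0 = 0$.

Using a similar process as the one described for the PAS model, but adopting (8) instead of (5), it is possible to derive the hazard function and the cumulative hazard function under imperfect maintenance at instant $t$ under the PAR approach, with $h_0$ the initial hazard and $\alpha$ the linear ageing rate:

$$h_m(t) = h_0 + \alpha\,(t - \varepsilon\, t_m), \qquad H_m(t) = h_0\,(t - t_m) + \alpha\left[(1 - \varepsilon)\, t_m\,(t - t_m) + \tfrac{1}{2}\,(t - t_m)^2\right]. \tag{9}$$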

2.2. Reliability Model of Failures by Demand

The demand failure probability of a component, which is normally in standby and ready to perform a safety function on demand, depends on the number of demands performed on the component, which are often associated with performing surveillance tests. In addition, it is necessary to consider the positive effect that the preventive maintenance activities performed on the equipment have on the degradation factor and, therefore, on the demand failure probability.

A time-dependent demand failure probability model that addresses the demand-induced stress and the effect of maintenance activities can be formulated for each maintenance period, following [17], in terms of the residual demand failure probability and the degradation function.

Assuming the same degradation factor for all types of demands, the evolution of the degradation function within a maintenance period, that is, between two consecutive maintenance activities, can be expressed in terms of the degradation function immediately after the preceding maintenance, which depends on the selected imperfect maintenance model (PAS or PAR), and of a floor function, which gives the largest integer less than or equal to its argument and here returns the number of tests performed in the interval at the given test periodicity.

The time-dependent evolution of the cumulative demand failure probability over a maintenance period can be obtained by adding the cumulative distribution function at the preceding maintenance to the demand probability functions at each test performed over the period. In general, the cumulative demand failure probability does not have a closed-form expression.

In the following subsections, the particularization of these expressions for the PAS or PAR model is presented.

2.2.1. Proportional Age Setback Model

If a PAS model is considered and preventive maintenance activities are performed on a regular basis with a constant maintenance interval, the degradation function after each maintenance is formulated as in [17], which gives equation (12). Substituting (11) and (12) into (10) yields the demand failure probability function for the period, equation (13).

The distribution function of the cumulative demand failure probability in the period after a given maintenance activity can be obtained, as mentioned above, by summing the distribution function immediately after that maintenance activity and the probability functions at the tests performed between that maintenance and the next one, which yields equation (14).

2.2.2. Proportional Age Reduction Model

In the PAR approach, the degradation function immediately after each maintenance, assuming preventive maintenance activities are performed on a regular basis with a constant maintenance interval, is given by equation (15).

Using a procedure analogous to the one described for the PAS model, a time-dependent model for the demand failure probability can be obtained by substituting (15) into (10) and (11), which yields equation (16).

In addition, the cumulative demand failure probability under a PAR model is given by equation (17).

3. Methodology of Parameters Estimation and Model Selection

Many methods for parameter estimation of reliability models have been proposed in the literature, such as maximum likelihood, the method of moments, and Bayesian estimators. In this paper, the maximum likelihood estimation method has been selected to estimate the parameters of the reliability models presented in Section 2. For a given model and a set of observed data, the likelihood function is the product of the probabilities of the observed data as a function of the model parameters. It can be applied to reliability and imperfect maintenance models for standby-related failures and for demand-caused failures, so that one likelihood function is formulated for standby-related failures and another for demand-caused failures.

The maximum likelihood estimation (MLE) method provides estimators, called maximum likelihood estimators, of the parameters involved in the reliability and maintenance models. The maximum likelihood estimates of these parameters are those values which make the likelihood function as large as possible, that is, which maximize the probability of the observed data. Since the natural logarithm is an increasing function, the likelihood function and its logarithm achieve their maximum at the same parameter values. For computational purposes it is preferable to maximize the log likelihood function. By maximizing the log likelihood expressions, the maximum likelihood estimators of the parameters are obtained. In this paper, the Nelder-Mead Simplex algorithm [18, 19] is used to maximize the likelihood functions for each proposed model.
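A minimal sketch of maximum likelihood estimation by a Nelder-Mead direct search is given below. It uses a deliberately simple constant-hazard special case, so that the result can be checked against the closed-form estimate (number of failures divided by exposure time). The `nelder_mead` helper is a compact textbook variant written for this illustration, not the implementation used in the paper, and the data (3 failures in 12 time units) are hypothetical.

```python
import math


def nelder_mead(f, x0, step=0.1, xtol=1e-8, max_iter=500):
    """Minimize f with a basic Nelder-Mead simplex search
    (reflection, expansion, inside contraction, shrink)."""
    n = len(x0)
    simplex = [list(x0)] + [
        [x + (step if i == j else 0.0) for j, x in enumerate(x0)]
        for i in range(n)
    ]
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        size = max(abs(p[i] - best[i]) for p in simplex for i in range(n))
        if size < xtol:
            break
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        if f(reflected) < f(best):
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:  # shrink every vertex toward the best one
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=f)


def neg_log_likelihood(params, n_failures, exposure):
    """Negative log likelihood of a constant hazard h (no ageing)."""
    h = params[0]
    if h <= 0.0:
        return math.inf  # keep the search inside the admissible region
    return -(n_failures * math.log(h) - h * exposure)


n_failures, exposure = 3, 12.0  # hypothetical data
estimate = nelder_mead(
    lambda p: neg_log_likelihood(p, n_failures, exposure), [1.0]
)
h_hat = estimate[0]
print(h_hat)
```

A derivative-free search is convenient here because the likelihoods of the imperfect maintenance models involve step-like terms and bounded parameters, for which returning infinity outside the admissible region is a simple way to enforce the bounds.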

The maximum likelihood estimation method provides, in addition to the parameter estimates, information on their variability through the Fisher information matrix, which is defined as the negative of the matrix of second partial derivatives of the log likelihood, that is, the negative of its Hessian. So, for the set of estimated parameters, the variance-covariance matrix can be obtained as the inverse of the information matrix divided by the sample size.

In particular, taking advantage of the asymptotic normality of the maximum likelihood estimator, if the sample size is large enough, the standard deviations of the parameter estimates can be obtained as the square roots of the main diagonal of the variance-covariance matrix. These yield confidence intervals for each of the parameters, and the covariances provide information on the relationship between the parameters.
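The step from the information matrix to standard deviations can be sketched for a one-parameter case. The example below approximates the second derivative of a negative log likelihood by central finite differences at the estimate and inverts it. It assumes the observed information of the full-sample log likelihood is used, so no further division by the sample size is applied. The constant-hazard model and the data (3 failures in 12 time units) are hypothetical.

```python
import math


def neg_log_likelihood(h, n_failures, exposure):
    """Negative log likelihood of a constant hazard h."""
    return -(n_failures * math.log(h) - h * exposure)


def observed_information(f, x, step=1e-4):
    """Second derivative of f at x by central finite differences."""
    return (f(x + step) - 2.0 * f(x) + f(x - step)) / step**2


n_failures, exposure = 3, 12.0      # hypothetical data
h_hat = n_failures / exposure       # closed-form estimate for this model
info = observed_information(
    lambda h: neg_log_likelihood(h, n_failures, exposure), h_hat
)
std_dev = math.sqrt(1.0 / info)     # asymptotic standard deviation
interval = (h_hat - 2.0 * std_dev, h_hat + 2.0 * std_dev)
print(h_hat, std_dev, interval)
```

With only three failures the interval of plus or minus two standard deviations extends below zero, which shows how scarce data limit the usefulness of the asymptotic normal approximation.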

3.1. Likelihood Function for Standby-Related Failures

Let the standby-related failures of each component be recorded by maintenance period, together with their times of occurrence and the chronological time of each maintenance activity on that component. The likelihood function for identical components of equipment under imperfect preventive maintenance is a function of the vector of unknown parameters. For each component, it is built from the number of preventive maintenance activities performed during the observation period, the induced hazard function evaluated at the failure times, the cumulative hazard function over each maintenance period, and the cumulative hazard function at the censoring time.

The log likelihood function is obtained by taking the natural logarithm of this likelihood function.

Equation (21) must be particularized depending on the imperfect maintenance model considered. If a PAS imperfect maintenance model is considered, the hazard function is obtained from (6) evaluated at the failure times, and the cumulative hazard functions are obtained from (7) evaluated at the preventive maintenance times and at the censoring time.

In the case of the PAR imperfect maintenance model, the failure rate and the cumulative failure rates are obtained from (9), evaluated at the same times.
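A log likelihood of this kind can be sketched for a single component as follows. The sketch assumes the usual minimal-repair form (sum of log hazards at the failure times minus the cumulative hazard over every period up to the censoring time), a linear-ageing hazard `h0 + alpha * w`, and the PAS or PAR age update at each maintenance. All names and data are hypothetical, and this is not presented as the paper's exact expression.

```python
import math


def log_likelihood(h0, alpha, eps, model, maint_times, failure_times, t_end):
    """Log likelihood for one component observed on (0, t_end]."""
    period_starts = [0.0] + list(maint_times)
    period_ends = list(maint_times) + [t_end]
    total, w_plus = 0.0, 0.0  # w_plus: age right after the latest maintenance
    for k, (t0, t1) in enumerate(zip(period_starts, period_ends)):
        if k > 0:  # apply the maintenance that opens this period
            gained = t0 - period_starts[k - 1]
            if model == "PAS":
                w_plus = (1.0 - eps) * (w_plus + gained)
            elif model == "PAR":
                w_plus = w_plus + (1.0 - eps) * gained
            else:
                raise ValueError("model must be 'PAS' or 'PAR'")
        d = t1 - t0
        # Subtract the cumulative hazard accumulated over the period.
        total -= h0 * d + alpha * (w_plus * d + 0.5 * d * d)
        # Add the log hazard at each failure time falling in the period.
        for tf in failure_times:
            if t0 < tf <= t1:
                total += math.log(h0 + alpha * (w_plus + tf - t0))
    return total


maint = [10.0, 20.0]      # hypothetical maintenance times
failures = [5.0, 25.0]    # hypothetical failure times
ll_pas = log_likelihood(0.01, 1e-3, 0.5, "PAS", maint, failures, 30.0)
ll_par = log_likelihood(0.01, 1e-3, 0.5, "PAR", maint, failures, 30.0)
print(ll_pas, ll_par)
```

Comparing the maximized values of such a function under PAS and under PAR is one simple way to rank the two imperfect maintenance models on the same data.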

3.2. Likelihood Function for Demand-Caused Failures

In the same way as in the previous section, let the demand-caused failures of each component be recorded by maintenance period, together with their times of occurrence and the chronological time of each maintenance activity on that component. The log likelihood function for identical components of equipment under imperfect preventive maintenance is built from the probability function and the cumulative probability functions of demand failure.

The probability function and the cumulative probability functions depend on the imperfect maintenance model considered. In the case of a PAS model these functions are obtained from (13) and (14).

If a PAR maintenance model is adopted, the expressions corresponding to the demand failure probability and the cumulative demand failure probability are obtained by particularizing (16) and (17), respectively.

4. Case Study

This section encompasses the estimation of the parameters associated with the reliability models presented in Section 2 for a motor-operated valve (MOV) of a nuclear power plant. The parameters are estimated and the reliability models that best fit the plant data are selected using the methods presented in Section 3. Then, the estimates obtained are used to predict the performance of the MOV as a function of test and maintenance intervals. In particular, the MOV average unreliability contribution of each failure mode and the total MOV unavailability are computed and plotted as a function of maintenance and test intervals for a 10-year horizon.

4.1. Historical Maintenance and Testing Data

Historical failure, maintenance, and test data have been collected from a nuclear power plant for two identical motor-operated safety valves. The data set contains all the failures, preventive maintenance, and surveillance test activities registered during an observation period of 27 years.

Table 1 shows the failure times of the two MOVs studied, obtained from the plant operational data. Table 1 also provides a brief description of the corresponding failure cause and failure mode. The failures have been classified as either standby-related or demand-caused failures, taking into account the information available on the failure cause.

A total of 432 surveillance tests and 17 preventive maintenance tasks were performed on MOV1, distributed uniformly with periodicity 22 and 572 days, respectively, along the 27-year period analysed. In addition, a total of 424 surveillance tests and 18 preventive maintenance tasks were performed on MOV2, distributed uniformly with periodicity 22 and 528 days, respectively, within the same period.
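As a quick consistency check of these figures, the following sketch multiplies each activity count by its periodicity and converts the result to years. The dictionary layout and the 365.25 days-per-year conversion are assumptions made for the illustration.

```python
DAYS_PER_YEAR = 365.25  # assumed conversion factor

records = {
    "MOV1": {"tests": 432, "test_period": 22, "pm": 17, "pm_period": 572},
    "MOV2": {"tests": 424, "test_period": 22, "pm": 18, "pm_period": 528},
}

spans = {}
for name, r in records.items():
    test_span = r["tests"] * r["test_period"]  # days covered by the tests
    pm_span = r["pm"] * r["pm_period"]         # days covered by maintenance
    spans[name] = (test_span, pm_span)
    print(name, test_span, round(test_span / DAYS_PER_YEAR, 1),
          pm_span, round(pm_span / DAYS_PER_YEAR, 1))
```

The spans come out between roughly 25.5 and 26.6 years, in line with the 27-year observation period.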

4.2. Results of the Maximum Likelihood Estimation

This section presents the results of the joint estimation of the effectiveness of maintenance, ε, and the reliability parameters for standby-related failures and for demand-caused failures, under the PAS and PAR imperfect maintenance models, using the plant data introduced in the previous section. The model that provides the best fit is identified for each of the two failure modes.

The maximum likelihood estimates of the model parameters are obtained by maximizing the log likelihood functions given by (20) for standby-related failures and (25) for demand-caused failures using the Nelder-Mead Simplex algorithm. Table 2 gives the MLEs of the parameters of the reliability model of standby-related failures under the PAS and PAR imperfect maintenance models, twice the standard deviations, which are obtained from the Fisher information matrix, and the values of the likelihood functions. Table 3 shows the same information for the reliability model of demand-caused failures.

The best model for both standby-related failures and demand-caused failures is the PAS model, since it provides the higher value of the likelihood function in Tables 2 and 3, respectively. Therefore, the reliability model that considers PAS imperfect maintenance is selected for both failure modes, with the values of the corresponding model parameters given in Tables 2 and 3.

4.3. Average Unreliability Contribution as a Function of Maintenance and Test Intervals

The average unreliability contribution to the unavailability of a component normally in standby over its renewal period can be formulated, following [9, 17], in terms of the standby-related unreliability contribution and the demand-caused unreliability contribution.

On the one hand, adopting the PAS model to represent the behavior of imperfect maintenance for the standby-related failures of the component, according to the results in the previous section, the standby-related unreliability contribution is given in [9].

On the other hand, adopting the PAS model to represent the behavior of imperfect maintenance for the demand-caused failures of the component, according to the results in the previous section, the demand-caused unreliability contribution is given in [17].

Figure 1 shows the evolution of the standby-related and demand-caused unreliability contributions as a function of the test interval, for different preventive maintenance intervals and a 10-year horizon renewal period. It can be seen that the standby-related contribution increases significantly for high test and maintenance interval values. Nevertheless, the effect of maintenance is positive for both unreliability contributions. Moreover, an increase in test frequency between maintenances, that is, low test interval values, has a very negative effect on the demand-caused contribution at very low intervals.

In addition, Figure 1 shows confidence intervals for the values predicted for the two unreliability contributions for different pairs of test and maintenance intervals. Large confidence intervals exist, which even increase with both intervals, because of the RAM model and the uncertainty in the estimation of the model parameters shown in Tables 2 and 3.

4.4. Average Unavailability as a Function of Maintenance and Test Intervals Regarding Unreliability and Downtime Effects

In accordance with [9], the averaged unavailability of a component is the sum of the average unreliability contribution and the unavailability contributions due to detected downtimes for performing testing and maintenance activities with the plant at power. These downtime contributions are the unavailability contribution due to testing, the unavailability contribution due to performing preventive maintenance, the unavailability contribution due to performing corrective maintenance conditional on detecting a failure during a previous test, and the contribution due to replacement of the equipment, if any. They can be evaluated using the equations given in [9], which depend on the downtime for testing, the downtime for preventive maintenance, the downtime for corrective maintenance or repair, and the downtime for replacement or renewal, respectively.

For the sake of simplicity, the last two contributions, those due to corrective maintenance and to replacement, are not included in the sensitivity analysis because both are negligible compared with the downtime effect of preventive maintenance and testing activities. Therefore, the averaged unavailability of the component is given by the sum of the unreliability contribution and the downtime contributions due to testing and preventive maintenance.

Figure 2 shows the evolution of the unreliability contribution versus the sum of the downtime contributions due to testing and preventive maintenance as a function of the test interval, considering different preventive maintenance intervals for a 10-year horizon renewal period. The unreliability term quantifies the benefit of performing test and maintenance activities on the total component unavailability, while the sum of the downtime contributions represents their negative effect.

Figure 2 also shows confidence intervals for the values predicted for the unavailability contributions for different pairs of test and maintenance intervals. Again, large confidence intervals exist, which even increase with both intervals, because of the RAM model and the uncertainty in the estimation of the model parameters shown in Tables 2 and 3.

Substituting (29), (30), (31), and (32) into (36) yields the formulation of the total average unavailability of a component, equation (37).

The last study involves the analysis of the total average unavailability of the component as a function of the pair of test and maintenance intervals for a 10-year horizon, which is shown in Figure 3. The highest values of the total unavailability are reached when adopting the highest maintenance and test intervals. The main contributor to the total unavailability (see (37)) is the standby-related unreliability contribution given by equation (29), as can be seen in Figure 2. This explains the direct and proportional dependence between the total unavailability and the test and maintenance intervals. Nevertheless, the sum of the demand unreliability contribution and the downtime effects considered, that is, the downtime effect of preventive maintenance and testing activities, becomes more relevant for very low interval values. This can also be appreciated in Figure 2.

5. Concluding Remarks

This paper presents a methodology of parameters estimation and model selection for safety-related equipment. In the literature, complex reliability, availability, and maintainability (RAM) models have been proposed with the aim of capturing equipment performance in a more realistic way, such as explicitly addressing the effect of component ageing and degradation, surveillance activities, and corrective and preventive maintenance policies. A major challenge for the adoption of the new models in practice is to estimate reliability and maintenance parameters with the aim of selecting the best model for describing the real operation of safety-related equipment.

The best model must then be fitted to real data by estimating the model parameters with an appropriate tool, which can be a problem in some cases because the number of parameters is large and the available data are scarce. This may have a great influence on the confidence intervals of the values found for the model parameters that best fit the data.

The paper considers a standby-related failure model assuming linear ageing and a demand-caused failures model assuming test-induced stress. In addition, it considers imperfect maintenance adopting Proportional Age Setback and Proportional Age Reduction for preventive maintenance modelling. Maximum likelihood estimation (MLE) using a direct search algorithm based on the Nelder-Mead Simplex (NMS) method is used to estimate maintenance effectiveness and ageing rate simultaneously.

A practical and realistic case study is included, which addresses the parameter estimation for a typical motor-operated valve in a nuclear power plant using real failure, test, and maintenance data. The results of the parameter estimation include confidence intervals and the selection of the best model.

Equipment RAM is quantified based on the best fitted model in order to make clear the impact of such an estimation in a testing and maintenance-planning context. Thus, the results of such a predictive model may help to plan the test and maintenance program more efficiently, providing an appropriate balance among the different contributions to the unavailability of the MOV, with the aim of minimizing its unavailability while assuring a low level of unreliability.

However, the effect of the uncertainties introduced in the estimation of the model parameters, because of the scarcity of available data, can jeopardize the decision-making. Thus, the example of application shows large confidence intervals for the unreliability and unavailability contributions for different pairs of test and maintenance intervals, which even increase with both intervals, because of the RAM model and the uncertainty in the estimation of the model parameters shown in Tables 2 and 3.

It can therefore be concluded that, by estimating the parameters and consequently fitting these models, it is possible to manage the test and maintenance program more efficiently, by providing an appropriate balance among the different contributions to the unreliability and unavailability of the component. However, the data set used needs to be enlarged to reduce the uncertainty in the decision-making.

Acronyms and Notations

BAO: Bad As Old
GAN: Good As New
MLE: Maximum likelihood estimation
MOV: Motor-operated valve
NPP: Nuclear power plant
PAR: Proportional Age Reduction
PAS: Proportional Age Setback
PFD: Probability of failure on demand
PRA: Probabilistic Risk Assessment
RAM: Reliability, availability, and maintainability
:Residual demand failure probability
:Time-dependent demand failure probability for the period
:Cumulative demand failure probability in the period
:Degradation function of the component immediately after maintenance
:Degradation function of the component associated with demand-related stress for the period
:Residual standby-related hazard function
:Cumulative hazard function in the period
:Likelihood function for identical components of an equipment under imperfect preventive maintenance for standby time-related hazard function
:Likelihood function for identical components of an equipment under imperfect preventive maintenance for demand failure probability
:Preventive maintenance number
:Preventive maintenance interval
:Cumulative number of demands at time
:Test degradation factor associated with demand failures
:Number of failures of component during the maintenance period
:Replacement interval (overhaul maintenance)
:Chronological time
:Time at which the component undertakes the maintenance number
:Test interval
:Averaged unavailability of a component
:Averaged unavailability contribution due to performing corrective maintenance
:Averaged unavailability contribution due to performing preventive maintenance
:Averaged unavailability contribution due to replacement or component renewal
:Unreliability contribution to the component averaged unavailability over the component useful life
:Demand-caused unreliability contribution
:Standby-related unreliability contribution
:Averaged unavailability contribution due to testing
:Age of the component immediately after the maintenance
:Age of the component in the period
:Linear ageing rate
:Downtime for preventive maintenance
:Preventive maintenance effectiveness
:Downtime for testing
:Downtime for corrective maintenance or repair
:Downtime for replacement or renewal
:Standard deviation
:Failure times.

Conflicts of Interest

The funding mentioned in “Acknowledgments” did not lead to any conflicts of interest regarding the publication of this manuscript. There are no other possible conflicts of interest in the manuscript.

Acknowledgments

The authors are grateful to the Spanish Ministry of Science and Innovation for the financial support received (Research Project ENE2016-80401-R) and the doctoral scholarship awarded (BES-2014-067602). The study also received financial support from the Spanish Research Agency and the European Regional Development Fund.