Research Article | Open Access
A Data-Driven Reliability Estimation Approach for Phased-Mission Systems
We address reliability estimation for phased-mission systems (PMS) and present a novel data-driven approach that uses the condition monitoring information and degradation data of such a system under a dynamic operating scenario. In this sense, this paper differs from existing methods, which consider only the static scenario without using real-time information and estimate the reliability of a population rather than of an individual system. In the presented approach, to establish a linkage between the historical data and the real-time information of the individual PMS, we adopt a stochastic filtering model for the phase duration and update the estimate of the mission time by Bayes' law at each phase. Meanwhile, the lifetime of the PMS is estimated from degradation data, which are modeled by an adaptive Brownian motion. The mission reliability can thus be obtained in real time from the estimated distribution of the mission time in conjunction with the estimated lifetime distribution. We demonstrate the usefulness of the developed approach via a numerical example.
Many complex systems are designed to perform missions that consist of phases or stages in which the deterioration and configuration of the system may change from phase to phase. These systems are called phased-mission systems in the literature. Formally, a phased-mission system (PMS) is defined as a system subject to multiple, consecutive, and nonoverlapping phases of operation required to finish the final product or service [1, 2]. These systems were first introduced by  and a vast literature has accumulated since then. Most real-world systems operate in phased missions where the reliability structure varies over consecutive time periods, known as phases. During each phase, the PMS has to accomplish a specified task, so the system behavior can change from phase to phase. A typical and frequently studied PMS is the on-board system for aided guidance of aircraft, whose mission consists of takeoff, ascent, cruise, approach, and landing phases. Another example is NASA's Mars Exploration Rover Mission, which consists of many phases such as vehicle launch, cruise, approach, entry, descent and landing on Mars, rover egress, and a number of surface operations that involve scientific data collection and transmission to Earth. For mission success, all phases must be completed without failure. If the system cannot be repaired during the mission, it is known as a nonrepairable phased mission.
Reliability is an important measure for system design, operation, and maintenance and has long been recognized as a metric for quantifying the performance of engineering systems. Reliable and accurate estimates of the reliability of a PMS are therefore important for the maintenance and logistic support of such systems, and can lead to lifecycle cost reduction and the avoidance of catastrophic failures. In this paper we focus on reliability estimation for PMS, with an emphasis on a data-driven method as discussed later. Here, the data are the condition monitoring data obtained from sensors.
1.2. The Literature Review
Reliability engineering research has developed many methods to analyze the reliability of PMS, among which fault tree analysis methods are mainstream. The earliest of these methods involved the direct manipulation of the fault trees. Esary and Ziehms  introduced a fault tree based method to transform a phased mission into an equivalent single-phase mission. The transformed phase fault trees are then combined into a single fault tree, and standard fault tree methods are used to derive the system's reliability (see, e.g., [5–8]). However, with these methods the size of the problem becomes very large as the number of phases increases.
More recently, it has been recognized that increasing the solution efficiency is particularly important for real-time analysis, where the timeliness of the analysis results is crucial. Several papers have addressed the issue of reducing the computational burden, including [10–13]. Even so, the fault tree based methods remain unsuitable for analyzing large systems within reasonable timeframes. This led to the adoption of the more efficient and powerful binary decision diagram (BDD) technique [4, 9].
Over the past decade, researchers have proposed a set of new BDD-based algorithms for fault tree analysis of a wide range of PMS. Zang et al.  proposed an algorithm for nonrepairable systems with general failure distributions. This work was the first to use the BDD method to analyze the reliability of phased-mission systems and marked a significant step forward by enabling large phased-mission systems to be analyzed. Xing and Dugan  analyzed a more general class of systems, which includes phased-mission systems with combinatorial phase requirements and imperfect coverage. Other important recent papers on generalized phased-mission systems include [16–19]. In a recent study, Çekyay and Özekici  analyzed the reliability of mission-based systems under a general setting by proposing three different reliability definitions. Çekyay and Özekici  further extended this line of research by analyzing the availability of mission-based systems under the maximal repair policy.
As observed in the literature, current approaches depend heavily on knowledge of the structure of the PMS. In practice, however, the structure of the mission system at hand is often too complicated to determine, and complete knowledge is not always available, which makes these approaches difficult to apply to a practical PMS. In addition, all of them focus on a population of a common type, and no work directly establishes the link between the reliability and the historical data/real-time condition monitoring (CM) information of an individual PMS in service. These approaches consider only the static scenario and are offline in nature. Finally, most previous works assume that the degradation of the PMS follows a finite-state continuous- or discrete-time Markov chain, so the lifetime estimate of the PMS depends only on the current state. These limitations are our primary motivation for developing a novel reliability estimation approach for PMS.
1.3. The Proposed Approach
Owing to the rapid development of information and sensing technologies, an abundance of data is now readily available in many real-world PMS. This profusion of process/product measurement data provides opportunities for effective reliability estimation through the full exploitation of the data-rich environment [22, 23]. To the best of our knowledge, there is no report on how to use such CM data to analyze the reliability of a PMS. The primary purpose of this paper is therefore to show how such data can be used for this analysis.
In this paper we address reliability estimation for PMS and present a novel approach that uses the CM information and degradation data of such a system under a dynamic operating scenario. To establish a linkage between the historical data and the real-time information of the individual PMS, we adopt a stochastic filtering model for the phase duration and update the estimate of the mission time by Bayes' law. Meanwhile, the lifetime of the PMS is estimated from the degradation data, which are modeled by an adaptive Brownian motion. The mission reliability can then be obtained from the estimated distribution of the mission time in conjunction with the estimated lifetime distribution. To our knowledge, this contribution has not been documented before. We demonstrate the usefulness of the developed approach via a numerical example.
The remainder of the paper is organized as follows. Section 2 gives the problem description. In Section 3, we formulate the mission time estimation from the CM information. Section 4 formulates the degradation data-based lifetime estimation for the mission system. Section 5 discusses the mission reliability and presents the formulations. In Section 6, we provide a numerical study for illustration. Section 7 draws the main conclusions and comments on future research.
2. Problem Description and Assumptions
2.1. Problem Description
In this paper, we consider a multiphase mission process having phases. Let denote the duration of the th phase, which is a random variable taking values in . Further, we let a random variable denote the total time of completing the mission. Thus, the random variable can be represented as . If there are some linkages among , , such as the probability density function (PDF) , for , then the PDF of the mission time can be estimated from the historical data. However, this mechanism is aimed for the population of this type of mission systems. To achieve the aim for a specific system, we need to estimate the mission time at each phase using the CM information at the current time , denoted by , which is related to the mission phase duration . Here we represent the estimated PDF of the mission time as , which shows the dependency of the estimated mission time on the CM information to date. Further, let a random variable denote the lifetime of the mission system. To estimate the PDF of the lifetime from the observed degradation data to , denoted by , we use the degradation modeling technique, in which the estimated PDF of is represented as .
After obtaining the estimated and , our primary objective is to compute two kinds of mission reliability. We summarize the general formulations for these two cases as follows.
(i) Compute the probability that the mission can be successfully accomplished before a given time without the system failure, formulated as .
(ii) Compute the probability that the mission can be successfully accomplished before the system fails, formulated as .
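Because the symbols in the two formulations above did not survive in this copy, the following LaTeX block restates them in an assumed notation. Everything here is our own labeling, not necessarily the authors': $T$ is the mission time, $L$ the system lifetime, $Y_{1:i}$ the CM history, $X_{1:i}$ the degradation history, and $t$ the given deadline. It is a sketch of what the two verbal definitions appear to say.

```latex
% Assumed notation; the original symbols are not recoverable from the text.
\begin{align}
  R_1(t) &= \Pr\{\, T \le t,\; L > T \mid Y_{1:i}, X_{1:i} \,\}, \\
  R_2    &= \Pr\{\, L > T \mid Y_{1:i}, X_{1:i} \,\}.
\end{align}
% If the degradation and mission processes are independent, then with
% f_T the estimated mission-time PDF and F_L the estimated lifetime CDF:
\begin{align}
  R_1(t) &= \int_0^{t} f_T(u)\,\bigl[1 - F_L(u)\bigr]\,du, &
  R_2    &= \int_0^{\infty} f_T(u)\,\bigl[1 - F_L(u)\bigr]\,du .
\end{align}
```

Under this reading, $R_1(t)$ is nondecreasing in $t$ and tends to $R_2$ as $t \to \infty$.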
2.2. Assumptions
(1) No maintenance activities are involved during the process of carrying out a mission.
(2) The mission consists of a set of consecutive phases.
(3) For mission success, all phases must be completed.
(4) The phases of the mission are sequential; that is, the order of the mission phases is deterministic.
(5) The durations of the different phases are dependent.
(6) The duration of every phase is random, following a general distribution.
(7) The degradation process is independent of the mission process.
(8) Failure resulting from degradation will lead to a mission failure.
(9) The duration of a future phase depends only on the durations of the current and previous phases; for example, , and .
Assumptions except and have already been widely adopted in the literature. Assumption makes our focus on the random phase duration with a general distribution. Assumption is used for model simplification but is also practical. For example, at the first phase, we only observe the CM information which is related to the duration of the first phase. Therefore, given and , it is reasonable to assume that the duration of the second phase is only dependent on . Following the same procedure, given , , and , the duration of the third phase is only dependent on and , and so on.
3. Model Formulation for Mission Process to Estimate the Mission Time
Without loss of generality, we consider a three-phase mission process for illustration. In the following, we treat the model formulation for the mission system phase by phase.
3.1. Model Formulation for the First Phase
Considering that the exact duration of the phase is unknown in its operation, but one thing we do know is that, over a monitoring interval of time, the duration is just an interval shorter at the end of the interval than at the beginning of the interval if nothing happened during that interval. In the meantime we may observe an increasing or decreasing trend of the monitored CM information . Based on these observations, the problem can be formulated as follows with a simple and intuitive form. If we define as the remaining duration of the first phase at time , the current monitoring check point, with the realization , and the relationship between and can be described as , if . It is noted that is actually the duration of the first phase. Furthermore, the duration of the mission time is always positive and thus we use the transformation with the realization to guarantee . In order to estimate from , we need to model the stochastic relationship between and . To do so, we use a concept called a floating scale parameter to model the relationship between and which is modeled by a stochastic distribution in this paper [24–26]. The basic idea was to let the mean parameter of be a function of , which enables an updating mechanism of the mean parameter.
Together with the above description, the relationship among , , , and can be described in  as follows: where is a function to be determined, which describes the relationship between the mission process and the CM data relative to the duration of the phase, and is the normally distributed measurement error represented as .
Therefore, the key for remaining time estimation is to formulate the relationship between and the condition monitoring history . By the classical stochastic filtering theory, it can be shown that this relationship can be established recursively as follows:
In order to solve and formulate the above equation explicitly, we here develop a method using the extended Kalman filtering (EKF) technique based on the work in , in which the EKF technique was used to estimate the residual life. As aforementioned, the duration of the mission time must be positive. As such, we define as a log-normal random variable and thus as the unknown state of the model (1). After obtaining the CM information at , we can use the EKF to estimate/update the conditional PDF of and further the remaining duration . We denote the updated and one-step predicted conditional PDF of as and , respectively, where the parameters , , , and can be obtained by the EKF as follows. Specifically, the updating equation of the expectation of the state can be formulated as where is the Kalman gain function formulated by where .
Correspondingly, the updating equation for the estimation variance can be obtained as
When applying the above EKF methodology, we need to initiate the algorithm at the start of the mission phase using the parameters and , which can be estimated from historical data. In addition, in the above updating equations, it is required to calculate the one-step estimation for the expectation and variance . In the following, we present one method to obtain these quantities.
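The measurement update described above (Kalman gain, updated mean, updated variance) can be sketched for a scalar state as follows. This is a minimal illustration under stated assumptions, not the authors' exact equations: the state is taken to be the logarithm of the remaining phase duration, and the observation function `h`, its derivative `dh`, and the constants `A`, `B`, `C` are hypothetical choices made only so that the example runs.

```python
import math

def ekf_update(m_pred, p_pred, y, h, dh, r_var):
    """One scalar EKF measurement update.

    m_pred, p_pred : one-step predicted mean and variance of the state
                     (assumed here to be x = ln(remaining phase duration))
    y              : new CM reading
    h, dh          : observation function of x and its derivative (assumed forms)
    r_var          : variance of the normally distributed measurement error
    """
    H = dh(m_pred)                       # linearize h around the predicted mean
    S = H * p_pred * H + r_var           # innovation variance
    K = p_pred * H / S                   # Kalman gain
    m = m_pred + K * (y - h(m_pred))     # updated mean
    p = (1.0 - K * H) * p_pred           # updated variance
    return m, p, K

# Hypothetical CM model: the reading rises as the remaining time r = exp(x) shrinks.
A, B, C = 1.0, 5.0, 0.1

def h(x):
    return A + B * math.exp(-C * math.exp(x))

def dh(x):
    r = math.exp(x)
    return -B * C * r * math.exp(-C * r)

# Predicted remaining duration around 20 time units; the reading is above prediction.
m, p, K = ekf_update(m_pred=math.log(20.0), p_pred=0.25, y=2.0, h=h, dh=dh, r_var=0.04)
```

With this assumed `h`, a reading above the predicted one pulls the state estimate down, that is, toward a shorter remaining duration, and the variance shrinks after the update.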
Considering that and , we can obtain with the associated variance
The above two equations are implied from the relationship between the normal distribution and the log-normal distribution. Thus, we further have . Then, based on the first equation in (1), a one-step forecasting of the remaining mission phase duration from to is
Since the change in the established state equation is deterministic over the interval , the variance about the mean estimate is thus formulated as
By reversing the relationships given in (6) and (7) and together with the previous results, can be transformed into for the next CM time as Furthermore, without any random variation in the prediction of the state, we have Using these results, the expectation and variance can be straightforwardly formulated as
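The one-step prediction described in this passage (convert the log-scale parameters to the mean and variance of the remaining duration, shift the mean by the monitoring interval, keep the variance, and convert back) can be sketched as below. The function and parameter names are our own, and the sketch assumes the state is the logarithm of the remaining duration with a normal posterior; it uses only the standard normal/log-normal moment relations.

```python
import math

def lognormal_moments(mu, s2):
    """Mean and variance of exp(X) when X ~ N(mu, s2)."""
    mean = math.exp(mu + 0.5 * s2)
    var = (math.exp(s2) - 1.0) * math.exp(2.0 * mu + s2)
    return mean, var

def lognormal_params(mean, var):
    """Inverse relation: (mu, s2) of ln(R) from the mean and variance of R."""
    s2 = math.log(1.0 + var / mean ** 2)
    mu = math.log(mean) - 0.5 * s2
    return mu, s2

def predict(mu, s2, dt):
    """One-step prediction over a monitoring interval of length dt.

    The remaining duration is deterministically dt shorter at the next check,
    so its mean drops by dt while its variance is unchanged.
    """
    mean, var = lognormal_moments(mu, s2)
    mean_next = mean - dt
    if mean_next <= 0.0:
        raise ValueError("predicted remaining duration must stay positive")
    return lognormal_params(mean_next, var)

mu_pred, s2_pred = predict(mu=3.0, s2=0.04, dt=1.0)
```

The returned pair would then serve as the predicted mean and variance fed into the next measurement update.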
From the above updating equation, the estimated PDF of the remaining duration of the first phase, , can be formulated as
According to the relationship between the duration of the first phase and its remaining duration, we have
Then we directly estimate the distribution of according to the variable transformation as where can be calculated by (13). Therefore, the duration of the second stage conditional on the data up to and can be computed as Similarly, the duration of the third phase conditional on the data up to and can be computed as Since , we have
Therefore, the distribution of the mission time , denoted by , can be calculated as
Then, by differentiating regarding to , we have the PDF of the mission time as follows:
From the above formulation, we can obtain the estimated mission time from the observed CM data associated with the duration of the first phase.
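As a numerical cross-check on the analytical mission-time distribution, the same composition (first-phase duration from the filter, then each later phase conditional on the earlier ones, then the sum) can be approximated by Monte Carlo. The sketch below is an illustration under assumptions: the conditional log-normal forms and the constants `A2`, `B2`, `S2`, `A3`, `B31`, `B32`, `S3` are hypothetical and are not taken from the paper.

```python
import math
import random

# Hypothetical dependence structure (assumed for illustration only):
#   ln T2 | T1      ~ N(A2 + B2 * ln T1, S2^2)
#   ln T3 | T1, T2  ~ N(A3 + B31 * ln T1 + B32 * ln T2, S3^2)
A2, B2, S2 = 1.0, 0.5, 0.2
A3, B31, B32, S3 = 0.5, 0.3, 0.4, 0.2

def sample_mission_times(t_now, m, p, n=20000, seed=0):
    """Draw mission times given the first-phase filter output.

    t_now : elapsed time in the first phase at the current check
    m, p  : updated mean and variance of ln(remaining first-phase duration)
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        t1 = t_now + rng.lognormvariate(m, math.sqrt(p))   # elapsed + remaining
        t2 = rng.lognormvariate(A2 + B2 * math.log(t1), S2)
        t3 = rng.lognormvariate(A3 + B31 * math.log(t1) + B32 * math.log(t2), S3)
        samples.append(t1 + t2 + t3)
    return samples

def empirical_cdf(samples, t):
    """Fraction of sampled mission times not exceeding t."""
    return sum(s <= t for s in samples) / len(samples)

samples = sample_mission_times(t_now=5.0, m=math.log(20.0), p=0.09, n=2000, seed=1)
```

Calling `empirical_cdf(samples, t)` on a grid of `t` values gives an estimate of the mission-time CDF that can be compared with the closed-form result.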
3.2. Model Formulation for the Second Phase
In this case, it is worth noting that the second phase duration is dependent on the termination time of the first phase. In a similar way to the case of the first phase, the relationship between , , , and can be described as where is a function to be determined and is the measurement error which is normally distributed as .
After obtaining the CM information at , we can use the EKF to estimate/update the conditional PDF of the remaining duration on the basis of . Define and . As Section 3.1, the updating equation of the expectation of the state can be formulated as where .
Further, the one-step estimation for the expectation and variance can be formulated as follows:
From the above updating equation, upon obtaining the CM information at , the estimated PDF of the remaining duration of the second phase, , can be formulated as
According to the relationship between the duration of the second phase and its remaining duration , we have
Here, is known since at this time the first phase of the mission process has been accomplished by our model setting of the sequential nature of the mission phase. This distinguishes the second phase from the first phase. Based on this fact, we directly have the distribution of according to the variable transformation as where can be calculated by (24).
Therefore, the duration of the third stage conditional on the data up to can be estimated as
Accordingly, the distribution of the mission time can be calculated by
Since , the above equation becomes
Then, by differentiating regarding to , we have the PDF of the mission time as
3.3. Model Formulation for the Third Phase
In this case, the mission process is in the third phase and then in a similar way the relationship between , , , and can be similarly described as where is a function to be determined and is the measurement error which is normally distributed as .
After obtaining the CM information at , the updating equation of the expectation of the state can be formulated as where .
Further, the one-step estimation for the expectation and variance can be formulated as
From the above updating equation, upon obtaining the CM information at , the estimated can be formulated as
According to the relationship between and its remaining duration , we have
In this case, and are known since at this time the first and second phases of the mission process have been accomplished. Based on this fact, we directly have where can be calculated by (34).
Therefore, the distribution of the mission time conditional on the related CM information , denoted by , can be calculated by
Then, differentiating regarding to yields
So far, we have completed the task of formulating the mission time distribution based on the related CM information.
4. Model Formulation for System Degradation Process to Estimate the Lifetime
In this paper, we use a Wiener process to model the degradation process of the mission system. Without loss of generality, we assume that the start reading of the degradation process is . Then, the evolution of the monitored variable over time can be described as
This type of Wiener process-based model is a typical model used in the literature to characterize the evolving path of the degradation process [27–31]. Considering the potential for updating knowledge of the process, we model the degradation process over time since as
To incorporate the history of the observations, we consider an updating procedure for the drifting parameter by making evolve as , where . In order to establish the linkage between the drift parameter and the observation history up to date, the degradation equation can be reconstructed and taken to be a self-organizing state-space model as where and . The updated estimation of can be obtained from Algorithm 1.
Under the Gaussian assumption and the principle of Bayesian filtering, we can obtain the PDF of at as
The threshold-based principle for modeling the remaining useful life (RUL) is as follows. When the degradation modeled by (40) reaches a preset critical level , the plant is declared to have failed. It is therefore natural to view the end of the lifetime as the point at which the degradation process first exceeds the threshold level. Using the concept of the first hitting time, we define the RUL at time as with the cumulative distribution function (CDF) and the PDF .
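A scalar Kalman filter is one common way to update a random-walk drift from degradation increments, and the sketch below shows that idea. It is not a reproduction of Algorithm 1, which is not available in this copy. The assumed model is: the drift follows a random walk with step variance `Q`, and each observed increment equals drift times `dt` plus Brownian noise with variance `sigma2 * dt`. All names and numbers are hypothetical.

```python
import math
import random

def kf_drift_update(lam, P, dx, dt, sigma2, Q):
    """One Kalman step for the drift of a Wiener degradation process.

    lam, P : current mean and variance of the drift estimate
    dx     : observed degradation increment over the interval dt
    sigma2 : diffusion variance per unit time
    Q      : variance of the random-walk step of the drift
    """
    lam_pred = lam                        # random walk: mean is unchanged
    P_pred = P + Q                        # uncertainty grows by Q
    S = dt * dt * P_pred + sigma2 * dt    # innovation variance
    K = P_pred * dt / S                   # Kalman gain
    lam_new = lam_pred + K * (dx - lam_pred * dt)
    P_new = (1.0 - K * dt) * P_pred
    return lam_new, P_new

def run_demo(seed=0, n=500, dt=1.0, true_lam=0.5, sigma=0.5):
    """Simulate increments with a fixed true drift and filter them."""
    rng = random.Random(seed)
    lam, P = 0.0, 1.0                     # vague prior on the drift
    for _ in range(n):
        dx = true_lam * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        lam, P = kf_drift_update(lam, P, dx, dt, sigma ** 2, Q=1e-6)
    return lam, P

lam_hat, P_hat = run_demo()
```

In this sketch the posterior of the drift after each observation is normal with mean `lam` and variance `P`, which is the kind of distribution the lifetime formulas later average over.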
Considering the adaptive mechanism introduced by the state-space model (41), we can predict the future degradation at , represented by the PDF . This distribution is normal and can be written as . Further, according to the standard theory of Wiener process, it is direct to obtain the PDF and CDF of the RUL at time as follows:
As mentioned previously, the drift parameter is evolving as a random variable in model (41) with a distribution . To consider the impact of the adaptive mechanism on the estimated lifetime distribution, the PDF and CDF of the RUL conditional on the observations to date can be, respectively, obtained by the law of total probability as 
Accordingly, the PDF and CDF of the estimated lifetime of the PMS can be formulated as
From (47), we can also observe the dependency of the estimated lifetime of the system on the observation history up to .
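The first-hitting-time densities discussed in this section can be sketched numerically. The fixed-drift case is the standard inverse Gaussian density for a Wiener process hitting a threshold; the random-drift case uses the well-known closed form obtained by averaging over a normally distributed drift. The symbols here are our own assumptions: `d` is the distance from the current degradation level to the threshold, `sigma2` the diffusion variance, and `lam_mean`, `lam_var` the posterior mean and variance of the drift. The formulas may differ in notation from (42)–(47).

```python
import math

def rul_pdf_fixed(l, d, lam, sigma2):
    """RUL PDF for a Wiener process with known drift lam (inverse Gaussian)."""
    return d / math.sqrt(2.0 * math.pi * sigma2 * l ** 3) * math.exp(
        -(d - lam * l) ** 2 / (2.0 * sigma2 * l))

def rul_pdf_random(l, d, lam_mean, lam_var, sigma2):
    """RUL PDF when the drift is N(lam_mean, lam_var), averaged analytically."""
    v = lam_var * l + sigma2
    return d / math.sqrt(2.0 * math.pi * l ** 3 * v) * math.exp(
        -(d - lam_mean * l) ** 2 / (2.0 * l * v))

def rul_cdf(pdf, l, steps=4000):
    """Trapezoidal integral of a one-argument pdf over (0, l]; pdf(0+) is taken as 0."""
    h = l / steps
    total = 0.5 * pdf(l)
    for i in range(1, steps):
        total += pdf(i * h)
    return total * h

# Hypothetical values: threshold distance 10, drift about 1, unit diffusion.
F_fixed = rul_cdf(lambda l: rul_pdf_fixed(l, 10.0, 1.0, 1.0), 200.0)
F_random = rul_cdf(lambda l: rul_pdf_random(l, 10.0, 1.0, 0.01, 1.0), 200.0)
```

Setting `lam_var` to zero in `rul_pdf_random` recovers the fixed-drift density, which is a quick consistency check on the two expressions.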
5. Reliability Estimation for PMS
After obtaining the estimated mission system lifetime distribution and the mission time , we can estimate the reliability of the mission process according to the two definitions of the mission reliability given in Section 2.1. Together with these analyses, the reliability of PMS at under the th phase can be, respectively, formulated as where and have been modeled in Sections 3 and 4.
From (48) and (49), we can observe that our approach to mission reliability estimation establishes a linkage between the historical data and the real-time information of the individual PMS. The associated model parameters can be estimated from the historical data by the maximum likelihood approach; to limit our scope, we do not discuss this estimation issue further.
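Given a mission-time PDF and a lifetime CDF, both kinds of mission reliability reduce to one-dimensional integrals when the two processes are independent, as assumed in Section 2. The sketch below evaluates them numerically. It is an illustration only: the log-normal mission-time PDF, the inverse Gaussian lifetime CDF, and all parameter values are stand-ins that we assume for the example, and the integral form is our reading of the two definitions rather than a transcription of (48) and (49).

```python
import math

def norm_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def lifetime_cdf(l, d, lam, sigma):
    """First-hitting-time CDF of a Wiener process (drift lam > 0) at threshold d."""
    if l <= 0.0:
        return 0.0
    s = sigma * math.sqrt(l)
    val = norm_cdf((lam * l - d) / s) + math.exp(2.0 * lam * d / sigma ** 2) * norm_cdf(-(d + lam * l) / s)
    return min(1.0, val)

def lognormal_pdf(t, mu, s):
    """PDF of a log-normal variable with log-mean mu and log-sd s."""
    return math.exp(-(math.log(t) - mu) ** 2 / (2.0 * s * s)) / (t * s * math.sqrt(2.0 * math.pi))

def mission_reliability(t_max, f_T, F_L, steps=4000):
    """Trapezoidal estimate of the integral of f_T(u) * (1 - F_L(u)) over (0, t_max]."""
    h = t_max / steps
    total = 0.5 * f_T(t_max) * (1.0 - F_L(t_max))   # integrand is 0 at u = 0
    for i in range(1, steps):
        u = i * h
        total += f_T(u) * (1.0 - F_L(u))
    return total * h

f_T = lambda u: lognormal_pdf(u, math.log(30.0), 0.2)     # assumed mission-time PDF
F_L = lambda u: lifetime_cdf(u, d=50.0, lam=1.0, sigma=2.0)  # assumed lifetime CDF

R1 = mission_reliability(30.0, f_T, F_L)    # finish by t = 30 without failure
R2 = mission_reliability(300.0, f_T, F_L)   # finish before failure (upper limit stands in for infinity)
```

In practice `f_T` and `F_L` would be replaced by the estimated distributions from Sections 3 and 4, re-evaluated at each monitoring time.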
6. Numerical Studies
In this section, we provide a numerical example to illustrate the implementation and the performance of the presented approach.
Suppose that a PMS is designed to complete a three-phase mission process. The phase durations are log-normally distributed but correlated. For an individual PMS conducting a particular mission process, sensors monitor the CM information related to the phase duration and the degradation data related to the lifetime of the PMS. The CM information is used to update the phase duration and the mission time, while the degradation data are used to estimate the lifetime of the PMS. Specifically, we consider the following relationship among the phase durations: where , and , are the parameters of the log-normal distributions. These distributions correspond, respectively, to the distributions of , and , in the filtering models.
In the presented approach, it is required to determine the functional forms of the CM information and the remaining phase duration, that is, , . In this numerical study, we use the following functional forms of :
The above is just an idea to model the relationship between and . In order to generate the degradation data to estimate the lifetime of the PMS, we use the following discrete equation: where .
Now, given (51), (52), and (53) and the parameters in these equations, we can simulate the required data for our modeling and reliability estimation. Table 1 shows the parameters used for the data simulation.
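The data-generation step can be sketched as follows. Since the parameter table and the exact forms of (51)–(53) are not available in this copy, every constant and functional form below is a hypothetical placeholder: correlated log-normal phase durations are produced by letting each log-mean depend on the previous duration, and the degradation path is generated by a discretized Wiener process with constant drift.

```python
import math
import random

def simulate_phase_durations(rng, mu1=3.0, s1=0.3, a2=1.0, b2=0.5, s2=0.2,
                             a3=0.5, b3=0.6, s3=0.2):
    """Draw three correlated log-normal phase durations (assumed dependence)."""
    t1 = rng.lognormvariate(mu1, s1)
    t2 = rng.lognormvariate(a2 + b2 * math.log(t1), s2)   # depends on phase 1
    t3 = rng.lognormvariate(a3 + b3 * math.log(t2), s3)   # depends on phase 2
    return t1, t2, t3

def simulate_degradation(rng, n, dt, lam, sigma, x0=0.0):
    """Discretized Wiener path: x_k = x_{k-1} + lam*dt + sigma*sqrt(dt)*eps_k."""
    path = [x0]
    for _ in range(n):
        path.append(path[-1] + lam * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return path

rng = random.Random(42)
durations = simulate_phase_durations(rng)
path = simulate_degradation(rng, n=100, dt=0.5, lam=1.0, sigma=0.5)
```

A fixed seed keeps the simulated data reproducible, which helps when comparing reliability estimates across runs.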