Abstract

The performance level of a multistate system (MSS) can take several values rather than only two states (perfect functioning and complete failure). To improve the reliability of MSSs, maintenance strategies are adopted to satisfy customer demand, and this study proposes a reliability model of an MSS with preventive maintenance and customer demand. Accounting for regular degradation and random failure at each state, the proposed MSS with preventive maintenance is modeled as a Markov process such that the customer demand is satisfied in a specific state. This model can also be adapted to compute other reliability indices. Based on this model, the effect of different preventive maintenance actions on the reliability indices can be analyzed and compared. Two numerical examples illustrate the validity of the proposed model. The reliability model presented in this study can be used to assess this type of MSS and help reliability engineers to compare different maintenance actions quantitatively and make optimal decisions.

1. Introduction

Generally, all systems and/or components will undergo an aging process before complete failure. This aging process is often modeled as a continuous and deterministic function of time. For example, the failure rate is usually depicted as a bathtub curve as a function of time. However, in most real-life situations, the failure rate depends not only on time but also on the states of the systems and/or components. Moreover, the traditional binary reliability theory assumes that there are only two states: perfect functioning and complete failure. The binary-state assumption may oversimplify the practical circumstances. A multistate degradation system may operate in an intermediate state between perfect functioning and complete failure. These intermediate states can be caused by system deterioration or peripheral factors, such as fatigue, burn-in, vibration, efficiency, failure of nonessential components, and the number of random shocks. Furthermore, the sojourn times in every state are typically uncertain, which can result in the uncertainty of the state-dependent failure rate. Therefore, reliability modeling and evaluation of such multistate degraded systems have attracted considerable research, some of which is discussed in the following.

The basic concepts, such as models, definitions of the structure function, and the properties of a stochastic multistate degradation system, were developed [1–3]. The notions of minimal path set, minimal cut set, coherence, and component relevancy have also been introduced. Based on these concepts, some corresponding performance measures, such as reliability, availability, mean time-to-failure, and redundancy can be deduced as the reliability description of the system under study [4–13].

To retain the reliability of a degraded system at a desired level, maintenance plays an important role. There are two types of maintenance that are based on time: corrective maintenance (CM) and preventive maintenance (PM). An optimal PM scheduling for a system consisting of deteriorating components was developed, and the simulated annealing method was employed to obtain the optimum solution [14]. A Markov-based reliability model was presented to evaluate a three-state system, and a novel approach to solving the differential equations of the Markov process reduced the computational time significantly [15]. The system reliability of a multistate network with multiple sinks was modeled as one of the probabilities, and an efficient algorithm was developed [16]. To improve the availability of nuclear power systems, the PM optimization for the series-parallel structure was modeled and a metaheuristic method was applied to solve the formulated problem [17]. Aiming at the degradation modeling and failure probability quantification of nuclear power plant piping systems, a multistate physics modeling approach was proposed and applied to the piping system of a pressurized water reactor undergoing thermal fatigue [18]. Some researchers have focused on the multistate k-out-of-n system with identical and nonidentical components and developed a novel recursive algorithm to assess the reliability, together with an optimal design to improve it [19–24]. For a mission-based system, where missions are executed successfully with random durations, a periodic and random inspection policy with postponed replacement was introduced [25]. An age-based preventive replacement policy was applied to components, and a recursive method was developed to obtain the availability measure [26]. Although most of the models for reliability analysis have assumed that degradation will induce a decrease in system reliability, they have not considered abrupt failure from a normal working state.

Furthermore, maintenance actions are not always able to restore a system back to its “as-good-as-new” condition. If that were the case, the system might be used for an infinite period of time or for an unlimited number of missions, which is almost impossible to achieve in actual situations. Once a system fails stochastically, either in a perfect or degraded working state, a proper maintenance action will bring the system back to the state that existed just before the failure. As mentioned above, CM can be adopted when systems fall into the failure state, and PM can improve the performance when systems are in degraded states. Perfect PM will bring the system to as-good-as-new conditions; however, most PMs are imperfect due to limited maintenance resources such as time, budget, maintenance tools, technical level of maintenance engineers, and working environment [27]. An imperfect PM model is depicted as one in which, upon failure, the system will be replaced with probability p and be minimally repaired with probability 1 − p [28]. Such a model was given and the extended great deluge optimization method was illustrated to give the best solution [29]. Similarly, under imperfect PM, an optimal selective maintenance strategy was resolved by a genetic algorithm [30, 31]. A systematic replacement model with minimal repair based on the cumulative repair-cost limit and an optimal PM policy based on a cumulative damage model for the used system were proposed and analyzed, respectively [32, 33]. For the system with two competing failure modes, degradation-based and shock-based failure, a condition-based maintenance model was proposed [34]. Shocks arriving according to a nonhomogeneous Poisson process can significantly weaken the safety of a system operating in an uncertain dynamic environment [35].

After the concept of imperfect maintenance had been introduced in the literature, the application of imperfect PM for multistate degraded systems has drawn the attention of many researchers. Because the scheduled PMs can be either imperfect or perfect, the optimal PM policies and repair decisions have been studied in order to significantly improve the maintenance efficiency of MSS modeled by the nonhomogeneous continuous time Markov process [36]. The difference from the MSS model previously mentioned is that the model proposed in this study is based on the homogeneous Markov chain and also incorporates features such as Poisson failure and customer demand. Here, a multistate degraded system with stochastic failure and imperfect PM has been modeled. Under the satisfaction of customer demand, this kind of system can fail stochastically in any state between perfect functioning and complete failure. When a system degrades to an unacceptable state, PM can be chosen to restore the system to a better state. Based on the Markov chain theory and imperfect maintenance theory, the corresponding differential equations have been built up. Some reliability measures have also been developed and can be obtained by solving the model.

The rest of this article is composed of five sections. Section 2 formulates the MSS and stochastic model based on Markov chain theory. In Section 3, some reliability indices based on this model are deduced. The detailed method of modeling MSS is given in Section 4. Several illustrative examples are shown in Section 5. Section 6 discusses and concludes that the proposed model for reliability analysis is valid for practical application.

2. System Description and Modeling

2.1. System Description

The degraded system considered herein can be of performance levels including degraded states. Initially, the system will be in perfect functioning denoted by state 1. As time progresses, it can fall into one of two states: a failure state (because of an abrupt failure) or the first deterioration state 2 (which is at a lower performance level or production rate). This type of failure, which occurs randomly and suddenly, is often called a Poisson failure and is ubiquitous in many working environments [28, 32, 33, 37–39]. Upon a Poisson failure occurring at the failure rate from state 1, the minimal repair will be implemented immediately at the repair rate , which will restore the system back to state 1. If the system falls into the degraded state 2 from initial state 1 at the failure rate , it will proceed in the same manner. In other words, the system will transit from state 2 to the second degraded state 3 at the failure rate or to the repairable Poisson failure state at the failure rate . The other states follow the same transition pattern, as can be seen in Figure 1. When it reaches the last degraded state , it will only fall into the complete failure state at the failure rate .

The deterioration states can be observed through some system parameters provided by some online supervised systems, and the time for inspection is also neglected. All the states of the system can be categorized into four types. State 1 is the perfect state with the perfect performance . State 2 to state are the degraded states with corresponding performance or production rate ranging from to . State to state are repairable states of corresponding maintenance rate that can restore back to the corresponding state, which increase the Poisson failure. At the same time, the repairable states are of zero production rate . The last state is of complete failure with no repair and zero performance level . For the purpose of clarity, the backgrounds of different states are shown in distinctive shades and patterns.

To satisfy customer demand , the system performance will be restored to a better state by resorting to PM before the last state . There are several PM actions that can be chosen, from minor to major maintenance. The system will be restored to the previous deteriorated state by minor maintenance, while the major maintenance will take the system back to the initial as-good-as-new condition. We can suppose that the state is the state that satisfies customer demand at the lowest performance level. It can be expressed by

When an inspection finds that a system has fallen into the last acceptable state, a PM should be implemented to restore the system to one of the previous higher performance levels at the maintenance rate. The maintenance rates differ for different states, as shown in Figure 1. Furthermore, we can assume that only one transition from state to state , will occur. For example, the transition rate denotes the state transition from to 3 after a minor maintenance is chosen.

2.2. Stochastic Modeling

As mentioned previously, the order of the performance level of system states can be expressed as follows:

These performance levels are represented by the set . At any time , the system performance level is a random variable which takes its value from the set . When the system operates after a time interval , the performance level can be treated as a stochastic process. At an instant time , the probabilities associated with the respective state are expressed as the set , where

Usually, the customer demand can also be seen as a discrete stochastic variable taking a value from the set . For a specific system, we can assume that the customer demand takes a constant value . The acceptability of the system performance level is usually dependent on the relation between the system performance level and customer demand . If we assume that the state is the last acceptable state, the deterioration states after it will mean nothing. According to equation (5), those states between state and state will be out of consideration and can be aggregated into one single failed state with performance level . Consequently, these repairable states after state can also be omitted. Thus, Figure 1 can be simplified further, as shown in Figure 2.
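The aggregation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the performance levels and the constant demand are hypothetical numbers, the function names are invented for this sketch, and the aggregated failed state is assumed to have zero performance.

```python
def last_acceptable_state(performance, demand):
    """Return the 1-based index of the lowest-performance state that still
    meets the demand, or None if even the best state falls short.

    performance: levels ordered from the perfect state downwards.
    """
    k = None
    for idx, g in enumerate(performance, start=1):
        if g >= demand:
            k = idx
        else:
            break  # levels are non-increasing, so later states fail too
    return k


def aggregate(performance, demand):
    """Keep the acceptable states and lump all remaining states into one
    failed state (assumed here to have zero performance)."""
    k = last_acceptable_state(performance, demand)
    kept = list(performance[:k]) if k else []
    return kept + [0.0]


# Hypothetical performance levels and constant customer demand.
LEVELS = [1000.0, 750.0, 600.0, 300.0, 0.0]
DEMAND = 600.0

print(last_acceptable_state(LEVELS, DEMAND))  # index of the last acceptable state
print(aggregate(LEVELS, DEMAND))              # levels of the simplified model
```

With a constant demand, the comparison reduces to a single threshold test per state, which is why the simplified diagram needs only the acceptable states plus one lumped failed state.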

To assess the effect caused by a PM, a binary variable can be defined as where . Because only one PM can be performed such that

If we assume that all the transition rates (including the Poisson failure rate, repair rate, degradation rate, and PM rate) are constant, so that the sojourn time in each state is exponentially distributed, then the transition process can be described by a Markov process. According to Figure 2 and the assumptions mentioned above, the Chapman–Kolmogorov equations corresponding to the Markov model are written as follows, together with the initial conditions:

These state probabilities can be obtained by solving the differential equations given by equations (6) and (7). According to these state probabilities, reliability indices can be calculated further.
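As a rough illustration of how such state probabilities might be computed numerically, the sketch below integrates the forward Kolmogorov equations dP/dt = P·Q with a classical fourth-order Runge–Kutta scheme. It is not the model of Figure 2: the four-state chain, its rates, and all function names are assumptions made for this example, and states are numbered from 0 in the code.

```python
# Hypothetical 4-state chain (rates are illustrative, not taken from the paper):
#   0: perfect state, 1: last acceptable degraded state,
#   2: repairable Poisson-failure state attached to state 0,
#   3: unacceptable (failed) state.
RATES = {
    (0, 1): 0.05,  # degradation, perfect -> degraded
    (0, 2): 0.02,  # Poisson failure from the perfect state
    (2, 0): 0.50,  # minimal repair back to the perfect state
    (1, 3): 0.04,  # degradation, degraded -> failed
    (1, 0): 0.10,  # preventive maintenance, degraded -> perfect
}


def generator(rates, n):
    """Build the generator matrix Q: off-diagonals are rates, rows sum to zero."""
    q = [[0.0] * n for _ in range(n)]
    for (i, j), r in rates.items():
        q[i][j] += r
        q[i][i] -= r
    return q


def derivative(p, q):
    """Forward Kolmogorov equations: dP_j/dt = sum_i P_i * Q[i][j]."""
    n = len(p)
    return [sum(p[i] * q[i][j] for i in range(n)) for j in range(n)]


def state_probabilities(q, p0, t_end, steps=2000):
    """Integrate dP/dt = P Q from 0 to t_end with classical RK4."""
    h = t_end / steps
    p = list(p0)
    for _ in range(steps):
        k1 = derivative(p, q)
        k2 = derivative([x + 0.5 * h * k for x, k in zip(p, k1)], q)
        k3 = derivative([x + 0.5 * h * k for x, k in zip(p, k2)], q)
        k4 = derivative([x + h * k for x, k in zip(p, k3)], q)
        p = [x + h / 6.0 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


Q = generator(RATES, 4)
P0 = [1.0, 0.0, 0.0, 0.0]  # the system starts in the perfect state
P_T = state_probabilities(Q, P0, t_end=50.0)
print([round(x, 4) for x in P_T])
```

Because every row of Q sums to zero, the probabilities keep summing to one throughout the integration, which is a convenient sanity check. For larger models a library ODE solver or a matrix exponential would usually be preferable to a hand-written integrator.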

3. Reliability Indices

3.1. Reliability, Availability, and Production Rate

The reliability function is the probability of the event that there will be successful operation of the repairable degraded system without any interruption until the time . The time is usually the time to the first failure, which is a random variable defined as the time from the beginning of the system life up to the instant that the degraded system reaches the first degraded or unacceptable state. Under this condition, the initial performance level of the degraded system can satisfy the customer demand , and the reliability function will be given by

To determine in Figure 2, the repairable states and the unacceptable state should be grouped into one absorbing state denoted by state 0. In addition, all the repairs that make the degraded system transit from state 0 to any degraded state are removed. The failure rate from the last acceptable state to state 0 is equal to the sum of and . Based on the analysis mentioned above, the Markov model can be built, as shown in Figure 3.
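The grouping of failure states into a single absorbing state can be expressed as a small transformation on the transition-rate table. The sketch below is illustrative only: the state labels (11 and 12 for the Poisson-failure states), the rates, and the function name are assumptions, not values from Figure 2 or Figure 3.

```python
def to_reliability_model(rates, failure_states, absorbing=0):
    """Merge all failure states into one absorbing state.

    rates: dict mapping (from_state, to_state) -> constant transition rate.
    failure_states: states to merge (repairable and unacceptable states).
    Transitions out of a failure state (repairs) are dropped; transitions
    into any failure state are redirected to `absorbing` and summed.
    """
    reduced = {}
    for (i, j), r in rates.items():
        if i in failure_states:
            continue  # remove repairs leaving the failed states
        target = absorbing if j in failure_states else j
        reduced[(i, target)] = reduced.get((i, target), 0.0) + r
    return reduced


# Hypothetical model: states 1 (perfect) and 2 (last acceptable),
# Poisson-failure states 11 and 12, unacceptable state 3.
FULL = {
    (1, 2): 0.05, (1, 11): 0.02, (11, 1): 0.50,
    (2, 3): 0.04, (2, 12): 0.03, (12, 2): 0.40,
    (2, 1): 0.10,  # preventive maintenance
}
REDUCED = to_reliability_model(FULL, failure_states={3, 11, 12})
print(REDUCED)
```

In this hypothetical chain the rate from the last acceptable state 2 to the absorbing state is the sum of its degradation rate and its Poisson failure rate, matching the summation rule stated in the text.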

According to Figure 3, the differential equations and their initial conditions take the following form:

These state probabilities can be solved and used to calculate the reliability function. Whenever the degraded system enters into the absorbing state 0, it will never leave it. The state probability can be easily used to calculate the reliability function because it characterizes the , which will be written as

It is to be noted that as time progresses to infinity, the final state probabilities of the degraded system are and others are all equal to zero, because the degraded system always enters the final absorbing state 0.

The instantaneous availability function is the probability that the degraded system will be found in the operational state at time . For the system described in Figure 2, these states of working efficiency are the perfect and the degraded states. That is to say, is the sum of the probability that the degraded system is in state 1 and one of the other acceptable degraded states at time . Combining the results of foregoing equations (6) and (7),

At the same time, the production rate can also be obtained by the probability distribution of each state. The instantaneous production rate function at time is a de facto output performance expectation, viz., . The value can be given by
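Once the state probabilities at a given time are known, both indices reduce to weighted sums. The following sketch assumes a hypothetical probability vector, hypothetical performance levels, and a constant demand; the function names are chosen for this example only.

```python
def availability(probs, performance, demand):
    """Probability mass on the states whose performance meets the demand."""
    return sum(p for p, g in zip(probs, performance) if g >= demand)


def production_rate(probs, performance):
    """Expected output performance: sum of g_i * P_i(t) over all states."""
    return sum(p * g for p, g in zip(probs, performance))


# Hypothetical snapshot of the state probabilities at some time t.
PROBS = [0.70, 0.15, 0.05, 0.08, 0.02]
LEVELS = [1000.0, 750.0, 600.0, 0.0, 0.0]  # failed states produce nothing
DEMAND = 600.0

print(availability(PROBS, LEVELS, DEMAND))
print(production_rate(PROBS, LEVELS))
```

States with zero production contribute nothing to the expectation, so summing over all states or only over the acceptable ones gives the same production rate here.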

3.2. Other Indices

Assuming that the life time of a degraded system is the time to reach the designated state due to degradation, the unavailability of the system due to Poisson failure can be calculated as

Hence, the probability that the degraded system fails completely at the state can be defined as

During time , the expected operational time spent in each state is as follows: where .

Further, the expected operational time (EOT) and the expected down time (EDT) during time are given by
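One way to approximate such time-integrated indices is numerical quadrature of the availability function. The sketch below uses the well-known closed-form availability of a two-state repairable unit as a stand-in for the model's availability, and it assumes that all non-operational time counts as down time; the rates, the horizon, and the function names are hypothetical.

```python
import math


def trapezoid(f, a, b, n=10_000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def two_state_availability(t, lam, mu):
    """Availability of a unit with failure rate lam and repair rate mu
    that starts in the working state."""
    s = lam + mu
    return mu / s + lam / s * math.exp(-s * t)


def expected_times(avail, horizon):
    """Return (EOT, EDT) over [0, horizon], with EDT taken as horizon - EOT."""
    eot = trapezoid(avail, 0.0, horizon)
    return eot, horizon - eot


LAM, MU, HORIZON = 0.02, 0.5, 100.0  # hypothetical rates and time horizon
EOT, EDT = expected_times(lambda t: two_state_availability(t, LAM, MU), HORIZON)
print(round(EOT, 4), round(EDT, 4))
```

For a multistate model the same quadrature would be applied to each state probability separately, and the per-state results summed over the operational and the failed states.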

Furthermore, the mean life time (MLT) is the expected life time of the system:

The mean operational life time (MOLT) is the expected operational life time of the system, which is given by

The mean time to first failure (MTTFF) of the degraded system is the expected time to the first failure which can be obtained by
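For an absorbing Markov model, the mean time to absorption can also be obtained without integrating the reliability function, by solving a linear system built from the transition rates among the transient states. This is a standard result for continuous-time Markov chains rather than the paper's own derivation; the chain, its rates, and the function names below are assumptions made for this sketch.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def mean_time_to_absorption(rates, transient):
    """Expected time to leave the transient set, from each transient state.

    rates: dict (i, j) -> rate; transient: ordered list of transient states.
    Solves A tau = 1, where A is minus the generator restricted to `transient`.
    """
    index = {s: k for k, s in enumerate(transient)}
    n = len(transient)
    a = [[0.0] * n for _ in range(n)]
    for (i, j), r in rates.items():
        if i not in index:
            continue
        a[index[i]][index[i]] += r      # total exit rate on the diagonal
        if j in index:
            a[index[i]][index[j]] -= r  # minus the rate to another transient state
    return solve(a, [1.0] * n)


# Hypothetical reliability model: 1 -> 2 -> 0, with direct failures 1 -> 0.
RATES = {(1, 2): 0.05, (1, 0): 0.02, (2, 0): 0.07}
TAU = mean_time_to_absorption(RATES, transient=[1, 2])
print([round(t, 4) for t in TAU])  # TAU[0] is the mean time to first failure from state 1
```

The first entry corresponds to a system that starts in the perfect state, which is the case described in the text.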

4. Method for Modeling

Based on the abovementioned analysis, the method for modeling an MSS with Poisson failure under customer demand can be summarized as follows:
S1. According to the practical production system, its Markov model can be sketched by drawing the state transition diagram, as shown in Figure 1
S2. Considering the customer demand on the performance level of this system, the Markov model shown in Figure 1 will be simplified to Figure 2
S3. State probability of the system can be obtained by solving equations (6) and (7)
S4. Relevant indices can be calculated according to equations (12) to (19)
S5. In order to obtain the reliability function of the system, the model shown in Figure 2 needs to be altered to Figure 3
S6. Solving equations (9) and (10), the probability at each state can be obtained
S7. The reliability function and MTTFF can be calculated easily by using equations (11) and (20)

After the abovementioned steps are fulfilled, the reliability evaluation based on the MSS model will be performed according to the relevant equations.

5. Application Examples

5.1. Example without PM

Given the degraded system shown in Figure 4, where the parameters are labeled, reliability indices can be obtained according to the model equations (6) and (7) to illustrate an example without PM.

The differential equations are built up according to Figure 4, and the results of some indices can be found. The expected times spent in each state are , , and . The relations of , , and are depicted as follows.

From the abovementioned curves during the time interval , it can be observed that the system availability decreases with time. However, the unavailability increases with time. The probability curve of is near to zero, indicating that complete failure at state is nearly impossible. At the time , the values of the three indices are shown in Figure 5.

The results of the EOT and EDT listed in Table 1 are based on 5 chosen time intervals.

According to equations (18) and (19), and . Obviously, is greater than . However, is higher than which increases with time .

In order to obtain the reliability function for this example, equations (9) and (10) of the model will be adopted. After solving the equations, the reliability function can be found using equation (11). The curve of is shown in Figure 6.

Combining equation (20), we obtain and , and the point B shown in Figure 6. If the probability value needs to be greater than or equal to 0.6, would be adequate as the point A implies.

5.2. Example with PM

A more practical example with PM actions can also be illustrated using the model. For the feeding water system in a power plant, the performance level is usually measured by the weight of water pumped to the boiler. According to the different power generation needs in one district, the production rate of the feeding water system can take the values 2000, 1500, 700, or 0 tons/hour. In other words, there are different states corresponding to those production rates. State 1 is the perfect functioning state at the 2000 performance level. States 2 and 3 are the degraded states whose performance levels are 1500 and 700, respectively. State 4 is the unacceptable state whose performance level is below the requirement. The other states are the Poisson failure states. For this degraded feeding water system, which has 7 states, some PM actions may need to be adopted. The state transition diagram of this system is given in Figure 7.

Two PM actions can be chosen at state 3: one is the imperfect PM with the transition rate and the other is the perfect PM with the transition rate . The values of all transition rates are listed in Table 2 where their meanings correspond to Figure 7. The production rates are 1000, 750, and 600 for states 1, 2, and 3, respectively. The other states can be seen as failure states whose production rate is zero. Furthermore, the customer demand for this system can be assumed by . Therefore, when the system degrades to state 3, one PM action should be taken to meet the customer demand.

Using equations (6) and (7) from the model, the probabilities of each state can be obtained. Then, the availability function will be evaluated by equation (12). To compare the effectiveness of the PM actions, three types of actions are adopted. The first is to do nothing, that is, without PM. The second is imperfect PM with the transition rate , and the last is perfect PM with the transition rate . The results of the three PM actions on are depicted in Figure 8 within the interval .

From Figure 8, it can be observed that the availability decreases with time. When PM actions are implemented, the availability is improved, and perfect PM improves it more than imperfect PM. The availability values for the three types of PM actions at time are shown by points A, B, and C, respectively.

Similarly, the production rate will be calculated according to equation (13). The results of production rate are shown in Figure 9. At time , the production rates are shown as points A, B, and C for the three types of PM actions, respectively. Although the production rate decreases with time, the findings show that the PM will improve the production rate of this system.

In order to calculate the reliability function, equations (9) and (10) from the model are used. Combining the three types of PM actions, the changing trends of are depicted in Figure 10.

In this figure, the reliability decreases with time under all three types of actions, and the PM actions raise the reliability. For example, the reliability values at time are the points A, B, and C for the three types of actions, respectively, as shown in Figure 10.

Furthermore, the MTTFF of the system will be calculated. According to equation (20), this index under three types of PM actions can be obtained as , , and . Obviously, the PM actions prolong the mean time to first failure significantly.
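The qualitative ordering of the three actions can be reproduced on a small hypothetical chain. The sketch below does not use the rates of Table 2: it assumes three acceptable states 1 → 2 → 3, a common Poisson failure rate into the absorbing state 0, and a PM transition out of state 3 toward state 2 (imperfect) or state 1 (perfect). The resulting 3×3 linear system for the mean time to first failure is solved with Cramer's rule; all names and numbers are illustrative.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def mttff_three_states(d12, d23, d30, p, pm_rate=0.0, pm_target=None):
    """MTTFF from state 1 of a hypothetical chain 1 -> 2 -> 3 -> 0.

    d12, d23, d30: degradation rates; p: Poisson failure rate into the
    absorbing state 0 (assumed equal in every state); pm_rate: PM rate out
    of state 3; pm_target: 1 (perfect PM), 2 (imperfect PM) or None (no PM).
    Solves A tau = 1, with A minus the generator restricted to states 1..3.
    """
    m = pm_rate if pm_target in (1, 2) else 0.0
    a = [
        [d12 + p, -d12, 0.0],
        [0.0, d23 + p, -d23],
        [0.0, 0.0, d30 + p + m],
    ]
    if pm_target in (1, 2):
        a[2][pm_target - 1] -= m
    a1 = [[1.0] + row[1:] for row in a]  # first column replaced by the ones vector
    return det3(a1) / det3(a)


# Hypothetical rates, chosen only to illustrate the comparison.
BASE = dict(d12=0.05, d23=0.05, d30=0.05, p=0.01)
NO_PM = mttff_three_states(**BASE)
IMPERFECT = mttff_three_states(**BASE, pm_rate=0.1, pm_target=2)
PERFECT = mttff_three_states(**BASE, pm_rate=0.1, pm_target=1)
print(round(NO_PM, 2), round(IMPERFECT, 2), round(PERFECT, 2))
```

Under these assumed rates the mean time to first failure increases from no PM to imperfect PM to perfect PM, which is the same ordering the example reports.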

6. Discussion and Conclusion

In this study, reliability modeling for a degraded MSS was considered. Its practical implication includes two aspects. First, it takes into account a sudden and random failure called Poisson failure, which may occur with a certain failure rate at each degraded state, and a maintenance action can restore the system back to the state just before the Poisson failure at a certain maintenance rate. Second, it includes the customer demand on the performance level. When the performance level of the MSS degrades to a level below the customer’s specified demand, the model will be simplified to meet the customer’s performance limit. Moreover, some PM actions can be adopted to restore the system back to a better state at a certain transition rate in order to improve the reliability of the degraded MSS. The proposed method is not only convenient to model the degraded MSS under a customer’s specific reliability demand but also suitable to calculate those reliability indices for the qualification of PM actions.

The proposed model can be applied in many practical situations because it can be tailored to a customer’s reliability demands. Furthermore, some PM actions can be qualified, and expressions of reliability indices can be easily derived and compared by maintenance engineers for decision making. A limitation of this study is that the transition rates among states are considered constant. A model that treats the transition rates as a type of distribution rather than as constants will be part of our future work to strengthen the proposed model.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported financially in part by a grant from the Fundamental Research Funds for the Central Universities (nos. 2018MS076 and 2020MS120) and China Scholarship Council (no. 201906735027).