Special Issue: Complexity in Manufacturing Processes and Systems 2019
Research Article | Open Access
Jinlei Qin, Zheng Li, "Reliability and Sensitivity Analysis Method for a Multistate System with Common Cause Failure", Complexity, vol. 2019, Article ID 6535726, 8 pages, 2019. https://doi.org/10.1155/2019/6535726
Reliability and Sensitivity Analysis Method for a Multistate System with Common Cause Failure
Abstract
With the increasing complexity of industrial products and systems, intermediate states beyond the traditional two (working and failed) are often encountered during reliability assessment. A system with more than two states is called a multistate system (MSS), and such behavior is now common in components and systems alike. Moreover, common cause failure (CCF) often plays an important role in the assessment of system reliability. A method is proposed to assess the reliability and sensitivity of an MSS with CCF. A component may be not only in a failure state that affects itself alone, but also in a state that causes other components to fail with a certain probability. The components affected by one type of CCF form a set, and these sets can overlap on some components. Using the universal generating function (UGF) technique, the CCF of a component can be incorporated into the expression of its UGF. Consequently, reliability indices can be calculated from the UGF expression of the MSS. Sensitivity analysis can help engineers judge which type of CCF should be eliminated first under various resource limitations. Examples illustrate and validate the method.
1. Introduction
Common cause failures (CCFs) are dependent failures of multiple components that originate from a common cause, a single occurrence, or a shared condition. CCFs can be important contributors to system unavailability or accident risk. For example, lightning can cause outages of unprotected electronic equipment, and voltage surges caused by inappropriate switching can lead to multiple component failures in a power system. Because a CCF increases the joint-failure probability and thereby reduces system reliability, many researchers have modeled and estimated the reliability or availability of systems with CCF, and some basic concepts and theories have been investigated and developed.
Two fundamental kinds of methods, termed implicit and explicit approaches, are used to incorporate CCFs in system reliability analysis. The implicit approach proceeds as follows. The CCF is first ignored, and the system logic is modeled with basic single-failure (component-level) events. The system reliability is then given by algebraic probability expressions of the basic components, which are quantified to include the contribution of the CCF, so that the system reliability or availability with the CCF can be expressed. This method has been studied for both binary-state and multistate system (MSS) reliability. In the explicit approach, the CCF is modeled as a basic cause event in the system reliability block diagram or fault tree, appearing as a repeated input to all elements or gates affected by the CCF. Because a CCF can occur at random times in a redundant standby safety system, expressions for the basic event probabilities of the explicit fault tree model have been developed.
In the context of imperfect fault coverage (IPC), an uncovered failure may often lead to complete system failure. For this situation, other methods such as ordered binary decision diagrams and generalized reliability block diagrams have been suggested to assess system reliability. For phased-mission systems with IPC and CCF, a new binary decision diagram approach to reliability analysis was suggested. Myers demonstrated how the coverage effect can be computed using combinatorial and recursive techniques for four coverage models. For the reliability redundancy allocation problem, three nonlinear optimization models that allow component mixing and include CCF events were addressed. The reliability of an IPC system with functional or performance dependence was also investigated. An extended object-oriented Petri net model was proposed for mission reliability simulation of a repairable phased-mission system with both external and internal CCFs. Using the reliability block diagram method, a quantitative study of the probability of failing safely (PFS) for a safety instrumented system showed that the CCF increases the PFS. Under the assumption that a single failure event may lead to the simultaneous failure of multiple components, a nonparametric predictive inference for system reliability following the CCF of components was presented.
In many cases, a system and/or its components can function in several different states characterized by distinct performance levels. Such systems are often referred to as MSSs. The theory of the MSS was investigated by Murchland in 1975. Many researchers have analyzed reliability, which is often regarded as an important measure of the ability of an MSS to provide the desired performance level. An MSS can also be subject to CCF, which can lead to the failure of an entire system or subsystem.
Many studies have investigated the reliability and modeling of the MSS with CCF. For instance, an algorithm was proposed to evaluate the reliability of a complex nonrepairable series-parallel MSS with CCF under the assumption that the failure propagation time is a random value with a given distribution. Another MSS reliability analysis method incorporated the CCF into a Bayesian network, which rests on a well-defined theory of probabilistic reasoning and can express complex dependence between random variables. To reduce the computational complexity of the MSS, in which the number of states grows rapidly with the number of elements, the universal generating function (UGF) was introduced by Ushakov. Further developments and applications have been studied in detail. Recently, UGF-based reliability analysis has been employed in many settings. A new reliability-evaluation methodology based on the UGF and a recursive algorithm was applied to a multistate weighted k-out-of-n system. To analyze the reliability of the generalized linear multistate consecutively connected system, a UGF-based method was suggested. Using the UGF, an evaluation method for the reliability characteristics of a nonrepairable complex system has also been presented.
Although much interest has been focused on methods of reliability analysis for systems with CCF, less attention has been paid to an MSS in which the CCF originates in only a few independent elements. The reliability of a transmission MSS considering CCF effects has been analyzed with a Bayesian network model. This model can clearly express the influence of the CCF on system reliability without computing minimum cut sets or determining an algebraic expression for unreliability. However, the parameter values used in the β-factor model reduce its accuracy, because these values are based on engineering experience or published CCF statistics. An implicit two-stage procedure has also been proposed to evaluate the reliability of an MSS with CCF. This procedure can obtain the system reliability function in both analytic and numerical form, but it can be applied only to a series-parallel MSS in which each component belongs to exactly one common cause group.
To address the limitations mentioned above, namely the inaccuracy of parameter values and the requirement that common cause groups do not overlap, an easily programmed recursive method is suggested, with the benefit of reduced computational complexity for reliability and performance distribution assessment. Using sensitivity analysis, it is easy to identify which CCF is best eliminated, which helps engineers find the bottlenecks of system reliability design. The method is based on the application of the UGF.
This paper is organized as follows. Section 2 formulates the MSS with CCF. In Section 3, based on the expression of the CCF, the algorithm to calculate the system reliability is presented. Several illustrative examples are shown in Section 4. Section 5 presents the conclusions.
2. System Description
2.1. Formulations of the MSS
Under the framework of multistate modeling, an MSS consists of components connected in series-parallel. A characteristic of the MSS is that any component can be in one of several performance levels, each corresponding to one of its states. State 0 denotes perfect function, the last state denotes complete failure, and the intermediate states denote partial failure or degradation. Without loss of generality, the performance level of a component in each state takes values from the finite set of that component's performance levels, given as (1).
The space of all combinations of performance levels over all components is then the Cartesian product of these component sets, given as (2). The performance level of a component at any time is assumed to be a discrete random variable, whose probability set is given as (3).
Obviously, the probabilities in (3) sum to one, because the states constitute a group of mutually exclusive events; that is, a component can be in one and only one of its states at any time. The relationship between this random variable and the performance levels is given as (4).
Another characteristic of the MSS is that it usually has more than two states. Here, external factors that cause failure or degradation of the system are not considered. Under this assumption, the performance level of the MSS is unambiguously determined by the performance levels of its components, whose states consequently determine the state of the MSS.
Suppose that the MSS has a finite number of states, each with a corresponding performance level. The MSS performance level can then be seen as a discrete random variable taking values from the set of these levels. Using (2), the system function can be written as (5).
This formula maps the space of component performance levels to the system performance level. Similarly, the probability of each system performance level can be expressed as (6).
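For readers who want the formulation of this subsection in symbols, one conventional way to write relations (1) to (6) is sketched below. The notation ($n$ components, states $0,\ldots,k_i$ of component $i$, levels $g_{ij}$, probabilities $p_{ij}$, random variables $G_i$ and $V$, system levels $v_j$ with probabilities $q_j$, structure function $\phi$) is an assumption chosen for illustration and may differ from the authors' own symbols.

```latex
\begin{align}
  \mathbf{g}_i &= \{g_{i0}, g_{i1}, \ldots, g_{ik_i}\}, \quad i = 1, \ldots, n, \tag{1}\\
  L^n &= \mathbf{g}_1 \times \mathbf{g}_2 \times \cdots \times \mathbf{g}_n, \tag{2}\\
  \mathbf{p}_i &= \{p_{i0}, p_{i1}, \ldots, p_{ik_i}\}, \qquad \sum_{j=0}^{k_i} p_{ij} = 1, \tag{3}\\
  p_{ij} &= \Pr\{G_i = g_{ij}\}, \tag{4}\\
  V &= \phi(G_1, G_2, \ldots, G_n), \qquad \phi : L^n \to \{v_0, v_1, \ldots, v_K\}, \tag{5}\\
  q_j &= \Pr\{V = v_j\}, \quad j = 0, \ldots, K. \tag{6}
\end{align}
```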
From the analysis above, the key points of the MSS model are as follows:
(i) The MSS consists of components connected in series-parallel.
(ii) Every component has a finite number of states. The performance level of each state and the probability of that level are given by (1) and (3), respectively.
(iii) The MSS has states deduced from the states of its components as defined in (5), and its performance level takes a value from the set of system performance levels.
(iv) The probability of the MSS being in a given state is given by (6).
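The model summarized above can be illustrated with a brute-force computation of the system performance distribution. The following is a minimal sketch, assuming independent components; the three component distributions and the structure function `phi` (two components in parallel, in series with a third) are hypothetical and are not taken from the paper's tables.

```python
from itertools import product

def system_pmf(component_pmfs, structure):
    """Enumerate every combination of component levels (the Cartesian
    product) and accumulate the probability of each system level."""
    pmf = {}
    for combo in product(*(c.items() for c in component_pmfs)):
        levels = [level for level, _ in combo]
        prob = 1.0
        for _, p in combo:
            prob *= p  # components are assumed independent
        v = structure(levels)
        pmf[v] = pmf.get(v, 0.0) + prob
    return pmf

# Hypothetical components: {performance level: probability}
c1 = {0: 0.1, 5: 0.9}
c2 = {0: 0.2, 3: 0.8}
c3 = {0: 0.1, 4: 0.3, 8: 0.6}

def phi(levels):
    """Hypothetical structure: c1 and c2 in parallel, then c3 in series."""
    g1, g2, g3 = levels
    return min(g1 + g2, g3)

pmf = system_pmf([c1, c2, c3], phi)
print(sorted(pmf.items()))
```

Full enumeration grows exponentially with the number of components, which is the cost the UGF-based recursion is meant to reduce.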
2.2. Indices of Reliability
According to the origin of the CCF, there are two types of cause: external and internal. A CCF with an internal cause is often called a propagated failure because it affects other system components. In an MSS where the CCF has an internal cause, the failure of a component propagates to other components, and the sets of affected components may be mutually disjoint or may overlap on one or more components. The system performance level is thereby undermined, with an induced decrease in reliability. When the system performance level falls below a limit value, often called the system demand, the system is regarded as unacceptable. The reliability of the MSS with CCF can be defined as the probability that the system satisfies the demand; from (6), it is given by (7), with the auxiliary term defined in (8).
Another important measure is the conditional expected performance, which expresses the expected performance of the system under the condition that the MSS is in an acceptable state. Given the system reliability, this measure can be calculated by (9).
To calculate these measures, the performance distribution of the system should first be obtained according to (6). The UGF has proved to be an effective method for the reliability assessment of different types of MSS. In particular, it suits series-parallel systems, for which it can be applied recursively.
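Both indices follow directly from a system performance distribution. The sketch below assumes that "acceptable" means a performance level at least equal to the demand; the distribution `pmf` is hypothetical and the function names are illustrative.

```python
def reliability(pmf, demand):
    """Probability that the system performance meets the demand."""
    return sum(p for v, p in pmf.items() if v >= demand)

def conditional_expected_performance(pmf, demand):
    """Expected performance given that the system is in an acceptable state."""
    r = reliability(pmf, demand)
    if r == 0.0:
        raise ValueError("no acceptable state has positive probability")
    return sum(v * p for v, p in pmf.items() if v >= demand) / r

# Hypothetical system distribution: {performance level: probability}
pmf = {0: 0.118, 3: 0.072, 4: 0.27, 5: 0.108, 8: 0.432}

r = reliability(pmf, demand=4)
e = conditional_expected_performance(pmf, demand=4)
print(round(r, 4), round(e, 4))
```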
3. Analysis Method
The CCF caused by a component corresponds to a dedicated state of that component with its own performance level. This performance value can coincide with the component's performance in the state of local failure and can usually be taken as zero. When the component is in this state, all system components affected by the CCF are also in the failure state. If a component cannot cause a CCF, the probability of the corresponding state is set to zero. The UGF of a component incorporating the CCF can then be rewritten as (10).
The conditional PMF of any component, given that it does not fail due to the CCF, can be represented by (11), which depends on the probability of the component causing the CCF.
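The conditioning step can be sketched as follows. This assumes the usual normalization, in which the probabilities of the ordinary states are divided by one minus the probability of the CCF-causing state; whether this matches the exact form of (11) is an assumption. The data layout (a dict of ordinary states plus a separate `p_ccf`) and the numbers are hypothetical.

```python
def conditional_pmf(states, p_ccf):
    """PMF of a component's ordinary states, given that it causes no CCF.

    states: {performance level: probability} for the ordinary states,
            whose probabilities must sum to 1 - p_ccf.
    p_ccf:  probability of the state in which the component causes a CCF.
    """
    if not 0.0 <= p_ccf < 1.0:
        raise ValueError("p_ccf must lie in [0, 1)")
    if abs(sum(states.values()) + p_ccf - 1.0) > 1e-9:
        raise ValueError("state probabilities and p_ccf must sum to 1")
    return {g: p / (1.0 - p_ccf) for g, p in states.items()}

# Hypothetical component: local failure (0), working (5), CCF state 0.10
cond = conditional_pmf({0: 0.05, 5: 0.85}, p_ccf=0.10)

# A component that cannot cause a CCF keeps its distribution unchanged
same = conditional_pmf({0: 0.1, 5: 0.9}, p_ccf=0.0)
```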
Suppose several components can independently and simultaneously cause a CCF. These components are collected in a vector, and each has its own probability of originating a CCF. Since the CCFs can be combined independently, each such component either causes a CCF or does not, so the number of combinations is two raised to the number of these components.
For any combination, condition (12) determines whether the CCF originates from a given component. After evaluating this condition for every component that can cause a CCF, the set of originating components corresponding to the combination is obtained. For every combination, the probability of the corresponding CCF is given by (13).
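One common way to enumerate the combinations is to read the combination index as a bit mask, with one bit per component that can cause a CCF. The sketch below uses that encoding as an assumption (it is not necessarily the form of condition (12)), and the two CCF probabilities are hypothetical.

```python
def originators(h, m):
    """Indices of the components that cause a CCF in combination h,
    reading h as a bit mask over m candidate components."""
    return [i for i in range(m) if (h >> i) & 1]

def combination_probability(h, ccf_probs):
    """Probability of combination h when CCFs originate independently."""
    q = 1.0
    for i, p in enumerate(ccf_probs):
        q *= p if (h >> i) & 1 else 1.0 - p
    return q

ccf_probs = [0.1, 0.2]  # hypothetical CCF probabilities of two components
m = len(ccf_probs)
table = {h: (originators(h, m), combination_probability(h, ccf_probs))
         for h in range(2 ** m)}
for h, (who, q) in table.items():
    print(h, who, round(q, 4))
```

The combination probabilities sum to one, which is a convenient check when programming the method.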
According to the properties of the CCF, each originating component has an associated set of other components that fail because of it. For every combination, the set of components that cause or are affected by the CCF can be obtained by (14) and its accompanying definition.
Given that a component fails due to the CCF, the conditional PMF of its performance level reduces to a single term, which indicates that the component can only be in the failure state.
When the CCF corresponding to a combination occurs, all components in the affected set enter the CCF mode, and their UGFs are replaced by the UGF of the failure state. The UGFs of components not belonging to the set, however, must be represented by (11). The conditional PMF of the overall system performance is then obtained in the form of a UGF. Using the combination probabilities from (13), the system UGF can be calculated as (16).
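Because the groups may overlap, the affected set for a combination is simply the union of the groups of its originating components. The sketch below uses the two groups described for the example of Section 4.1 (C11 with C21; C13 with C21 and C22); the bit-mask encoding of the combination index is an assumption.

```python
def affected_set(h, origin, groups):
    """Union of the CCF groups of all components that originate a CCF
    in combination h (bit i of h refers to origin[i])."""
    affected = set()
    for i, name in enumerate(origin):
        if (h >> i) & 1:
            affected |= groups[name] | {name}
    return affected

origin = ["C11", "C13"]
groups = {
    "C11": {"C11", "C21"},         # CG1
    "C13": {"C13", "C21", "C22"},  # CG2, overlaps CG1 on C21
}

for h in range(2 ** len(origin)):
    print(h, sorted(affected_set(h, origin, groups)))
```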
Combining the above analysis, the assessment proceeds in the following steps:
S1. Incorporate the CCF into each component's UGF according to (10).
S2. Obtain the conditional UGF of each component by (11).
S3. Determine all possible combinations of components that cause the CCF.
S4. For every combination, identify the components that cause the CCF according to (12).
S5. Calculate the probability of every combination using (13).
S6. Determine the set of components affected by the CCF for each combination according to (14).
S7. Replace the UGFs of all components belonging to this set with the UGF of the failure state, and use the conditional UGFs of the components not belonging to it.
S8. Based on the physical structure of the series-parallel MSS, form the conditional UGF of the system.
S9. Obtain the UGF of the entire system according to (16).
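The steps above can be sketched end to end for a system made of parallel subsystems connected in series. This is an illustrative implementation under stated assumptions, not the authors' code: UGFs are stored as dicts from performance level to probability, parallel components add their performance, series subsystems take the minimum, the failure state has performance zero, and the combination index is read as a bit mask. The topology and groups follow the description of Section 4.1, but every probability and performance value is hypothetical.

```python
from functools import reduce
from operator import add

def compose(u1, u2, op):
    """Combine two UGFs (dicts level -> probability) with operator op."""
    out = {}
    for g1, p1 in u1.items():
        for g2, p2 in u2.items():
            g = op(g1, g2)
            out[g] = out.get(g, 0.0) + p1 * p2
    return out

FAILED = {0: 1.0}  # UGF of a component forced into the failure state

def system_ugf_with_ccf(components, subsystems, groups):
    """components: name -> (ordinary states {level: prob}, p_ccf), where
         the ordinary probabilities sum to 1 - p_ccf.
       subsystems: list of lists of names (parallel inside, series between).
       groups: originator name -> set of components that fail with it."""
    origin = sorted(groups)
    # S1-S2: conditional UGF of each component, given it causes no CCF
    cond = {n: {g: p / (1.0 - pc) for g, p in st.items()}
            for n, (st, pc) in components.items()}
    total = {}
    for h in range(2 ** len(origin)):                  # S3
        active = {n for i, n in enumerate(origin) if (h >> i) & 1}  # S4
        q = 1.0                                        # S5
        affected = set()                               # S6
        for n in origin:
            pc = components[n][1]
            q *= pc if n in active else 1.0 - pc
            if n in active:
                affected |= groups[n] | {n}
        subs = []                                      # S7-S8
        for sub in subsystems:
            ugfs = [FAILED if n in affected else cond[n] for n in sub]
            subs.append(reduce(lambda a, b: compose(a, b, add), ugfs))
        u_h = reduce(lambda a, b: compose(a, b, min), subs)
        for g, p in u_h.items():                       # S9
            total[g] = total.get(g, 0.0) + q * p
    return total

# Hypothetical data: name -> (ordinary states, probability of causing a CCF)
components = {
    "C11": ({0: 0.05, 5: 0.85}, 0.10),
    "C12": ({0: 0.10, 5: 0.90}, 0.0),
    "C13": ({0: 0.05, 5: 0.90}, 0.05),
    "C21": ({0: 0.10, 8: 0.90}, 0.0),
    "C22": ({0: 0.10, 8: 0.90}, 0.0),
}
subsystems = [["C11", "C12", "C13"], ["C21", "C22"]]
groups = {"C11": {"C11", "C21"}, "C13": {"C13", "C21", "C22"}}

ugf = system_ugf_with_ccf(components, subsystems, groups)
print(sorted((g, round(p, 4)) for g, p in ugf.items()))
```

The outer loop visits every combination once, so the cost grows with two to the power of the number of CCF-causing components rather than with the full number of system states.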
4. Application Examples
4.1. Basic Application
A numerical example illustrates the above method for the reliability assessment of the MSS with CCF. A type of MSS with a series-parallel structure, as shown in Figure 1, is a typical configuration. In a flow transition system, matter such as oil, water, or steam is often transmitted from terminal A to terminal B by interconnected components.
For example, in the feedwater system of a power plant, the components (pumps) are grouped into two subsystems: sub1 consists of C11, C12, and C13 connected in parallel, and sub2 consists of C21 and C22. The parameters of their performance distributions are listed in Table 1. As shown in the table, every component can fail with a given probability, but only components C11 and C13 can cause a CCF. C11 can destroy component C21 owing to its nearby location, and together with C21 it constitutes the CCF group CG1. Component C13 causes components C21 and C22 to fail at the same time, forming the second CCF group CG2. The CCF can result from events such as fire or lightning.
According to the above steps, the UGF of every component can be obtained as follows:
From (11), the conditional UGFs of components take the following forms:
The conditional UGFs of components C12, C21, and C22 are obtained directly by removing the CCF term, since these components cannot cause a CCF. Two components, C11 and C13, can cause the CCF, which gives four possible combinations. The other variables are listed in Table 2.
According to the physical nature of the components' interconnection, the composition function within subsystems sub1 and sub2 is the sum (parallel components add their performance), and the function between the two subsystems is the minimum (the series connection is limited by the weaker subsystem). The conditional UGF of the entire system can then be written as (19).
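The sum and minimum operators can be applied pairwise, so the system UGF is built up recursively instead of by enumerating all component state combinations. The sketch below mirrors the topology just described (three parallel components in series with two parallel components); the distributions `a` and `b` are hypothetical and the helper names are illustrative.

```python
from functools import reduce
from operator import add

def compose(u1, u2, op):
    """Combine two UGFs (dicts level -> probability) with operator op,
    collecting like terms as the product is expanded."""
    out = {}
    for g1, p1 in u1.items():
        for g2, p2 in u2.items():
            g = op(g1, g2)
            out[g] = out.get(g, 0.0) + p1 * p2
    return out

def parallel(ugfs):
    """Parallel components: performances add up."""
    return reduce(lambda x, y: compose(x, y, add), ugfs)

def series(ugfs):
    """Series subsystems: the weakest one limits the flow."""
    return reduce(lambda x, y: compose(x, y, min), ugfs)

# Hypothetical component distributions
a = {0: 0.1, 5: 0.9}
b = {0: 0.1, 8: 0.9}

sub1 = parallel([a, a, a])
sub2 = parallel([b, b])
system = series([sub1, sub2])
print(sorted((g, round(p, 5)) for g, p in system.items()))
```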
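For the combination in which no CCF occurs, no UGF is replaced with the UGF of the failure state, and (19) is evaluated with the conditional UGFs of all components.
For the combination in which only C11 causes the CCF, only the UGFs of C11 and C21 are substituted by the UGF of the failure state, whose performance level is zero, and (19) is rewritten accordingly.
For the combination in which only C13 causes the CCF, the UGFs of components C13, C21, and C22 must be replaced by the UGF of the failure state in (19).
When both C11 and C13 cause the CCF, the UGFs of components C11, C13, C21, and C22 are substituted with the UGF of the failure state, and (19) is rewritten accordingly.
The UGF of the whole system can then be obtained by (16), weighting each conditional UGF by the probability of its combination.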
Now the reliability of the entire system with CCF can be depicted as in Figure 2.
Using (9), the conditional expected system performance level at reliability is
4.2. Advanced Application
A more practical example is now presented, in which the sensitivity of the system to each CCF is also analyzed. Suppose an MSS production system consists of nine components connected in series-parallel, as shown in Figure 3.
The system can be divided into subsystems. Components C22, C31, and C33 have only binary states. All the others are multistate components with two operating states: perfect and degraded performance.
Four components can cause a CCF, and each has its own set of affected components. Together these form four CCF groups, CG1, CG2, CG3, and CG4. The performance distributions of the system components for the operating and failure states are listed in Table 3.
According to the analysis method suggested above, the reliability curve at different performance levels can be represented as in Figure 4.
As the results show, this MSS has 21 performance levels. The data points A, B, C, and D mark the reliabilities at four of these levels.
Under a cost limitation, it is necessary to find the most valuable component, that is, the one whose CCF, once eliminated, improves system reliability the most. In other words, the sensitivity of the system to each component can be analyzed by the suggested method. When the CCF of a component is localized, its UGF is rewritten according to (10).
The CCF state is then incorporated in the failure state without causing other components to fail, and its performance level is still zero. The system reliability can then be reassessed with the CCF of one single component eliminated. The improvements shown in Table 4 are measured against the case where no CCF is removed.
In Table 4, the performance level ranges from 8 to 15, and the baseline reliability corresponds to the case where no CCF is excluded. For each of the four components that can cause the CCF, the reliability improvement after elimination of the corresponding CCF is expressed as a percentage.
Moreover, for a specified performance level, the best component whose CCF should be localized can be chosen. In the example, at one such level the best component to improve is the one whose improvement reaches 10.14%. Conversely, once a component has been chosen, the best performance level can be determined from the extent of improvement. For example, if limitations allow the CCF of only one particular component to be removed, the best performance level is the one with the largest reliability improvement, which is 0.7% in the example. The analysis of sensitivity to CCF localization helps engineers find bottlenecks in the system reliability design and direct investment in reliability enhancement properly.
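The sensitivity study can be sketched by recomputing the reliability with one CCF localized at a time. The code below is a minimal illustration under stated assumptions: localization is modeled by shrinking a component's CCF group to the component itself, the improvement is reported relative to the baseline reliability (the exact definition behind Table 4 is assumed), and the five-component system and all of its numbers are hypothetical rather than the nine-component example.

```python
from functools import reduce
from operator import add

def compose(u1, u2, op):
    out = {}
    for g1, p1 in u1.items():
        for g2, p2 in u2.items():
            g = op(g1, g2)
            out[g] = out.get(g, 0.0) + p1 * p2
    return out

def system_ugf(components, subsystems, groups):
    """System UGF for parallel subsystems in series, summed over all
    combinations of CCF-causing components (bit-mask enumeration)."""
    origin = sorted(groups)
    cond = {n: {g: p / (1.0 - pc) for g, p in st.items()}
            for n, (st, pc) in components.items()}
    total = {}
    for h in range(2 ** len(origin)):
        q, affected = 1.0, set()
        for i, n in enumerate(origin):
            pc = components[n][1]
            if (h >> i) & 1:
                q *= pc
                affected |= groups[n] | {n}
            else:
                q *= 1.0 - pc
        subs = [reduce(lambda a, b: compose(a, b, add),
                       [{0: 1.0} if n in affected else cond[n] for n in sub])
                for sub in subsystems]
        for g, p in reduce(lambda a, b: compose(a, b, min), subs).items():
            total[g] = total.get(g, 0.0) + q * p
    return total

def reliability(ugf, demand):
    return sum(p for g, p in ugf.items() if g >= demand)

def ccf_sensitivity(components, subsystems, groups, demand):
    """Relative reliability gain (%) from localizing each CCF in turn."""
    base = reliability(system_ugf(components, subsystems, groups), demand)
    gains = {}
    for name in groups:
        localized = dict(groups)
        localized[name] = {name}  # the failure no longer propagates
        r = reliability(system_ugf(components, subsystems, localized), demand)
        gains[name] = 100.0 * (r - base) / base
    return base, gains

# Hypothetical data: name -> (ordinary states, probability of causing a CCF)
components = {
    "C11": ({0: 0.05, 5: 0.85}, 0.10),
    "C12": ({0: 0.10, 5: 0.90}, 0.0),
    "C13": ({0: 0.05, 5: 0.90}, 0.05),
    "C21": ({0: 0.10, 8: 0.90}, 0.0),
    "C22": ({0: 0.10, 8: 0.90}, 0.0),
}
subsystems = [["C11", "C12", "C13"], ["C21", "C22"]]
groups = {"C11": {"C11", "C21"}, "C13": {"C13", "C21", "C22"}}

base, gains = ccf_sensitivity(components, subsystems, groups, demand=8)
best = max(gains, key=gains.get)
print(round(base, 4), {n: round(g, 2) for n, g in gains.items()}, best)
```

Repeating the call for several demand values gives the kind of per-level comparison described above, from which the most valuable component at each level can be read off.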
5. Concluding Remarks
In this paper, a reliability assessment method is suggested for the series-parallel MSS with CCF. The components that can cause a CCF may cause failures of different subsets of system components, and these subsets can even overlap on some components. The method is based on the UGF, and the CCF is incorporated in the expression of the UGF. Because the same computation is repeated for every combination, the method is conveniently programmed by iteration. The indices of reliability and conditional expected performance can be obtained, and the sensitivity of the system to each component can be further analyzed. The method helps reliability engineers decide in which component it is most valuable to invest so as to eliminate its CCF. Under the assumptions of the method, it applies only to series-parallel MSS structures. More complex MSS topologies, such as bridge and k-out-of-n:G structures, need to be considered in the future.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was supported financially in part by a grant from Fundamental Research Funds for the Central Universities (no. 2015MS128; no. 2018MS076).
References
- J. K. Vaurio, “The theory and quantification of common cause shock events for redundant standby systems,” Reliability Engineering & System Safety, vol. 43, no. 3, pp. 289–305, 1994.
- K. N. Fleming and A. Mosleh, “Common cause data analysis and implications in system modeling,” in Proceedings of the International Topical Meeting on Probabilistic Safety Methods and Applications, pp. 1–12, 1985.
- G. Levitin, “Incorporating common-cause failures into nonrepairable multistate series-parallel system analysis,” IEEE Transactions on Reliability, vol. 50, no. 4, pp. 380–388, 2001.
- J. K. Vaurio, “Common cause failure probabilities in standby safety system fault tree analysis with testing—scheme and timing dependencies,” Reliability Engineering & System Safety, vol. 79, no. 1, pp. 43–57, 2003.
- S. V. Amari, J. B. Dugan, and R. B. Misra, “A separable method for incorporating imperfect fault-coverage into combinatorial models,” IEEE Transactions on Reliability, vol. 48, no. 3, pp. 267–274, 1999.
- G. Levitin, “Block diagram method for analyzing multi-state systems with uncovered failures,” Reliability Engineering & System Safety, vol. 92, no. 6, pp. 727–734, 2007.
- L. Xing, “Reliability evaluation of phased-mission systems with imperfect fault coverage and common-cause failures,” IEEE Transactions on Reliability, vol. 56, no. 1, pp. 58–68, 2007.
- A. F. Myers, “k-out-of-n:G system reliability with imperfect fault coverage,” IEEE Transactions on Reliability, vol. 56, no. 3, pp. 464–473, 2007.
- J. E. Ramirez-Marquez and D. W. Coit, “Optimization of system reliability in the presence of common cause failures,” Reliability Engineering & System Safety, vol. 92, no. 10, pp. 1421–1434, 2007.
- L. Xing, B. A. Morrissette, and J. B. Dugan, “Efficient analysis of imperfect coverage systems with functional dependence,” in Proceedings of the Reliability and Maintainability Symposium (RAMS), pp. 1–6, 2010.
- X.-Y. Wu and X.-Y. Wu, “Extended object-oriented Petri net model for mission reliability simulation of repairable PMS with common cause failures,” Reliability Engineering & System Safety, vol. 136, pp. 109–119, 2015.
- J. Jin, L. Pang, S. Zhao, and B. Hu, “Quantitative assessment of probability of failing safely for the safety instrumented system using reliability block diagram method,” Annals of Nuclear Energy, vol. 77, pp. 30–34, 2015.
- F. P. A. Coolen and T. Coolen-Maturi, “Predictive inference for system reliability after common-cause component failures,” Reliability Engineering & System Safety, vol. 135, pp. 27–33, 2015.
- J. Murchland, “Fundamental concepts and relations for reliability analysis of multi-state systems,” in Reliability and Fault Tree Analysis, 1975.
- X. Janan, “On multistate system analysis,” IEEE Transactions on Reliability, vol. R-34, no. 4, pp. 329–337, 1985.
- A. Shrestha, L. Xing, and S. V. Amari, “Reliability and sensitivity analysis of imperfect coverage multi-state systems,” in Proceedings of the Annual Reliability and Maintainability Symposium: The International Symposium on Product Quality and Integrity, RAMS 2010, pp. 1–6, USA, January 2010.
- G. Levitin, L. Xing, H. Ben-Haim, and Y. Dai, “Reliability of series-parallel systems with random failure propagation time,” IEEE Transactions on Reliability, vol. 62, no. 3, pp. 637–647, 2013.
- J. Mi, Y. Li, H.-Z. Huang, Y. Liu, and X. Zhang, “Reliability analysis of multi-state systems with common cause failure based on Bayesian Networks,” Eksploatacja I Niezawodnosc-Maintenance and Reliability, vol. 15, no. 2, pp. 169–175, 2013.
- I. A. Ushakov, “Optimal standby problems and a universal generating function,” Soviet Journal of Computer and Systems Sciences, vol. 25, no. 4, pp. 79–82, 1987.
- G. Levitin, Universal Generating Function in Reliability Analysis and Optimization, Springer, London, UK, 2005.
- H. A. Khorshidi, I. Gunawan, and M. Y. Ibrahim, “On reliability evaluation of multistate weighted k-out-of-n system using present value,” The Engineering Economist, vol. 60, no. 1, pp. 22–39, 2015.
- G. Levitin, L. Xing, and Y. Dai, “Linear multistate consecutively-connected systems subject to a constrained number of gaps,” Reliability Engineering & System Safety, vol. 133, pp. 246–252, 2015.
- S. Negi and S. B. Singh, “Reliability analysis of non-repairable complex system with weighted subsystems connected in series,” Applied Mathematics and Computation, vol. 262, pp. 29–89, 2015.
Copyright © 2019 Jinlei Qin and Zheng Li. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.