Transient Analysis of a k-out-of-n:G System with N-Policy, Repairmen’s Multiple Vacations, and Redundant Dependency
This paper analyzes a k-out-of-n:G repairable system with N-policy, repairmen’s multiple synchronous vacations, and redundant dependency. When there is no failed component in the system, the repairmen leave for a vacation, the duration of which follows a phase type distribution. Upon returning from vacation, they take another vacation if fewer than N failed components are waiting in the system. This pattern continues until at least N failed components are waiting. Moreover, redundant dependency, a special kind of failure dependency, is taken into account in the multicomponent system. Under these assumptions, the availability, the rate of occurrence of failures, and the reliability of the system are derived in the transient regime by applying the quasi-birth-and-death process. Furthermore, the Runge-Kutta method is used to study numerically the time-dependent behavior of the system reliability measures. Finally, a special case of the system is presented to show the validity of our model.
1. Introduction
In reliability theory, redundancy is a technique widely used to improve system availability and reliability. The k-out-of-n:G system, a popular type of redundancy, is often encountered in industrial systems. A k-out-of-n:G system consists of n components, all of which are working initially even though only k of them are required for the system to function normally. Classical applications include communication systems with multiple transmitters, power transmission and distribution systems, the design of electronic circuits, the cables in a suspension bridge, the multipump system in a hydraulic control system, the multidisplay system in a cockpit, and the multiengine system in an aircraft. As a specific example, in a data processing system with four video displays, the full data can be displayed as long as at least two video displays are in good condition. Thus, the display system works as a 2-out-of-4:G system. Similarly, an aircraft with five engines may be able to fly with a minimum of three engines operational. Hence, the aircraft functions as a 3-out-of-5:G system. Over the past decades, many authors have studied the availability and the reliability of k-out-of-n:G systems because of their importance in industry and in fault-tolerant systems. For extensive discussions of k-out-of-n:G systems, the reader is referred to Kuo and Zuo and to Cao and Cheng.
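The two numerical examples above (2-out-of-4 and 3-out-of-5) can be made concrete with a minimal sketch. It assumes independent, nonrepairable components that each work with the same probability p, which is a deliberate simplification: the model studied in this paper adds repair, vacations, and failure dependency. The function name and the value p = 0.9 are hypothetical and chosen only for illustration.

```python
from math import comb

def k_out_of_n_reliability(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent components work.

    Assumption (not from the paper): components are independent and each
    works with the same probability p; repair and dependency are ignored.
    """
    if not 0 <= k <= n:
        raise ValueError("require 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    # Sum the binomial probabilities of exactly j working components, j = k..n.
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Display system: 2-out-of-4; aircraft engines: 3-out-of-5 (p is hypothetical).
display = k_out_of_n_reliability(2, 4, 0.9)
aircraft = k_out_of_n_reliability(3, 5, 0.9)
print(display, aircraft)
```

With the same component quality, the 2-out-of-4 configuration tolerates two failures out of four, whereas the 3-out-of-5 configuration tolerates two out of five, which is why the first value is slightly higher.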
When a k-out-of-n:G system is modelled, the random times involved are usually assumed to be exponentially distributed; see, for example, Moustafa, Krishnamoorthy and Ushakumari, Ushakumari and Krishnamoorthy, Li et al., Tang and Zhang, Zhang and Wu, Jain and Gupta, and Kang and Kim, who investigated such redundant systems under different assumptions. Krishnamoorthy et al. studied a k-out-of-n:G system with repair under N-policy, in which a repairman is activated for repair as soon as the number of failed components accumulates to a predefined value N. They discussed three different situations: cold system, warm system, and hot system. Besides, because of the mathematical complexity of the k-out-of-n:G system, Khatab et al. proposed an algorithm for the automatic construction of the system state transition diagram to analyze the availability of a k-out-of-n:G system with nonidentical components and a repair priority rule. Later, Moghaddass et al. generalized this work to a k-out-of-n:G system with nonidentical components, similar or different repair priorities, and shut-off rules. The main contribution of their study is an algorithm that systematically generates the system state vectors and the transition rate matrix.
In traditional repairable systems, authors generally assumed that the repairman remains idle until a failed component is present or a repair control policy is triggered, which leads to a waste of human resources. In many real-life situations, the repairman may perform another assigned job during his/her idle period. The time spent by the repairman on such secondary tasks is called vacation time. Because reliability models with repairman’s vacation are important both in theory and in practice, Guo et al., Wu and Ke, and Yu et al. introduced the “repairman’s vacation policy” into repairable systems with different vacation strategies. However, little more than a beginning seems to have been made for k-out-of-n:G repairable systems. Apart from [17, 18], no work considering such a system has come to our notice. With this in mind, repairmen’s multiple vacations are taken into account in this paper. We are interested in how the system reliability measures are influenced by the vacation time and how long the vacation time can be without affecting the repairman’s primary work.
Another important factor in a multicomponent system is failure dependency, which captures the interactions between the failures of components. Dependency is a common phenomenon in real-world scenarios. For instance, the failure or addition of one component alters the system reliability measures both through the loss or gain of that component’s reliability and through the reconfiguration of the system loading. Fricks and Trivedi presented a classification of failure dependencies in which common cause failures, standby dependency, and other failure dependencies are discussed. Later, a variety of methods for analyzing failure dependency models were used by many researchers (see [20–23]). In 2007, Yu et al. introduced a specific kind of failure dependency called redundant dependency, in which any component can be viewed as a redundancy of another component. They investigated the reliability optimization problem of a nonrepairable parallel system with identical components. In their work, the dependence function was first defined to quantify the redundant dependency. With the help of the dependence function, redundant dependencies are further classified as independence, weak, linear, and strong dependencies. More recently, Yu et al. presented a constructive approach to optimize the availability of an n-component parallel repairable system by modeling the dependency of the components. The optimization problem is formulated and the resolution procedure is developed step by step.
This paper considers a k-out-of-n:G repairable system with repairmen’s multiple synchronous vacations and redundant dependency. To make the model more flexible and realistic, we assume that repair follows the N-policy. Applying the quasi-birth-and-death process, several important reliability measures, including the availability, the rate of occurrence of failures, and the reliability of the system, are obtained in the transient regime. In addition, the Runge-Kutta method is applied to show the influence of various parameters on the evolution of the system reliability measures. Through a particular case of our system, we show numerically that our formulae agree exactly with those provided in Cao and Cheng. The new contribution of this work is that the repairmen are assumed to take multiple vacations, the duration of which follows a phase type distribution. Moreover, compared with [24, 25], a more general k-out-of-n:G repairable system, which is a common form of redundancy, is considered.
The rest of this paper is arranged as follows. Section 2 gives the assumptions of the model and some notations. The infinitesimal generator of the vector-valued Markov process that governs the system is constructed in Section 3. Moreover, system reliability measures are derived in Section 4. In Section 5, the numerical results are reported. Finally, conclusions are drawn in Section 6.
2. Model Description
The detailed assumptions of the system are described as follows.
Assumption 1. The system is composed of n identical components, all of which are working initially even though only k of them are needed for the system to function normally. The system is down as soon as the number of components in the working state drops to k − 1. Once the system has failed, no further working components can break down.
Assumption 2. The working time of every component is governed by an exponential distribution with nominal failure rate . Failed components in the system form a single waiting line and receive repair provided by repairmen in the order of their failures, that is, FCFS discipline. The repair time of each failed component follows an exponential distribution with parameter . Moreover, a repaired component is as good as new; that is, the repair is perfect.
Assumption 3. There are repairmen. All the repairmen leave for a vacation together whenever there is no failed component in the system. Upon returning from the vacation, if there are at least N failed components waiting for repair, the repairmen start to repair components. Otherwise, they leave for another vacation. This pattern continues until there are at least N failed components waiting in the system. The vacation time follows a PH distribution with representation (, ) of order .
Assumption 4. The redundant dependency is taken into account, which is originally defined in . The dependency of a system is called redundant dependency if any of the components can be viewed as a redundancy of another component. As the components are identical and redundant to each other in our model, the dependency must be symmetric with respect to these components. To incorporate the redundant dependency into the system failure, we assume that the failure rate of each component is determined by its nominal failure rate and the dependence function in the following way
where is the dependence function and is the number of working components in the system. When one component fails, the failure rate of the working components is updated according to the dependence function .
Assumption 5. The random variables , , and are assumed to be independent of each other.
Furthermore, we define the following notations for use in the sequel. Let be an identity matrix of order , and let be a column vector of order of 1’s. We denote by a zero matrix of order , by a zero matrix of dimension , and by the set of all matrices over the field of complex numbers. Write .
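As a small illustration of the PH-distributed vacation time in Assumption 3, the following sketch computes the mean of a phase type distribution from its representation, using the standard identity mean = −α T⁻¹ 1. The names alpha and T for the initial probability vector and the subgenerator matrix, the helper functions, and the Erlang parameters are assumptions made for this sketch; they are not the representations used in the numerical section.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's matrix is left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ph_mean(alpha, T):
    """Mean of a PH distribution with representation (alpha, T).

    Uses mean = -alpha T^{-1} 1: solve T x = -1, then return alpha . x.
    """
    x = solve_linear(T, [-1.0] * len(alpha))
    return sum(a * xi for a, xi in zip(alpha, x))

# Hypothetical Erlang vacation time: 3 phases, each with rate 2.0.
alpha = [1.0, 0.0, 0.0]
T = [[-2.0, 2.0, 0.0],
     [0.0, -2.0, 2.0],
     [0.0, 0.0, -2.0]]
print(ph_mean(alpha, T))  # an Erlang with 3 phases of rate 2 has mean 3/2
```

Solving the linear system T x = −1 avoids forming the inverse of T explicitly, which is the usual choice for small dense subgenerators.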
3. Infinitesimal Generator
The k-out-of-n:G repairable system described above can be studied as a block-structured continuous-time Markov chain (CTMC). To see this, define to be the number of failed components (either waiting or being repaired) at time ; it follows that . Let be the state of the repairmen at time and
Since , there is at least one repairman who is busy during the nonvacation period, and some repairmen may be idle.
Because the model under study is a redundant system with dependent components, the failure rate of the system depends on the dependence function and on the number of redundant components. The failure rate and the repair rate of the system are given as follows:
According to the above assumptions and analysis, it can be shown that the stochastic process is a CTMC with state space given by
By partitioning the system state space into levels according to the number of failed components and ordering the states lexicographically, we observe that the corresponding infinitesimal generator matrix of is of dimension , exhibiting the following block-structured form:
from which it follows that we are dealing with a finite quasi-birth-and-death (QBD) process. Each block of the matrix is defined as follows:
4. Transient Analysis
This section discusses the transient behavior of the system reliability measures, including the point availability, the rate of occurrence of failures at time t, and the reliability at time t. We first assume that all the components are new and the repairmen are on vacation at phase 1 initially and define the following notations:
By straightforward probability arguments, the transition equations of the Markov model are formed as ordinary differential equations with an initial condition
where .
The solution of the above ordinary differential equations can be written as
Let the distinct eigenvalues of the matrix be , with respective algebraic multiplicities where . By the existence assertion of the Jordan canonical form theorem (see [26, Theorem 3.1.11] for details), there is a nonsingular matrix such that
where is a Jordan matrix.
In order to determine the nonsingular matrix , we write with , . It follows from (10) that
Then, we have that
Denote where , . Thus, from (13), we know that
Moreover, we obtain
It appears from (15) that is the eigenvector of corresponding to the eigenvalue . Once the explicit expression for is given, the explicit expressions for can be obtained by a recursion procedure. Moreover, the nonsingular matrix is determined. Furthermore, we get
From (17), we can get the system state probabilities explicitly. Some important reliability measures such as the availability and the rate of occurrence of failures of the system are easily obtained.
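As a small worked illustration of the eigenvalue route described above, consider a single repairable component, which is far simpler than the model of this paper. The failure rate \lambda, the repair rate \mu, and the state labels 0 (working) and 1 (failed) are assumptions made only for this sketch. The generator has the two distinct eigenvalues 0 and -(\lambda+\mu), so every Jordan block has size one, and with the component initially working the state probabilities follow directly:

```latex
\begin{align*}
Q &= \begin{pmatrix} -\lambda & \lambda \\ \mu & -\mu \end{pmatrix},
\qquad \det(sI - Q) = s\,(s + \lambda + \mu), \\
p_0(t) &= \frac{\mu}{\lambda+\mu} + \frac{\lambda}{\lambda+\mu}\, e^{-(\lambda+\mu)t}, \\
p_1(t) &= \frac{\lambda}{\lambda+\mu}\left(1 - e^{-(\lambda+\mu)t}\right).
\end{align*}
```

Here p_0(t) is the point availability of this toy system; it starts at 1 and tends to \mu/(\lambda+\mu) as t grows, with the nonzero eigenvalue setting the decay rate of the transient term.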
In a repairable system, availability is the primary reliability measure. It is the probability that the system is working at a specific time t. By definition, the availability of the system is
4.2. The Rate of Occurrence of Failures at Time t
The rate of occurrence of failures , called also the failure frequency, is defined as the mean number of failures per unit time. Based on the formula proposed by Lam , the expression of is given by
4.3. System Reliability
In order to get the reliability of the system, we lump all failure states together to make one absorbing state. Let be the number of failed components in the system at time , . Thus, the new process forms a Markov process with state space . The corresponding infinitesimal generator of the resulting process has the form
Moreover, define the system state probabilities
By a similar argument to (8), we have
The solution of (22) is
Similar to (17), we can obtain the explicit expressions for the state probabilities . Thus, the reliability of the system at time is given by
Certainly, it is rather difficult to derive the reliability measures of the system analytically when the values of and are very large. Even when the system reliability measures can be obtained, the explicit expressions are very tedious. Instead, we apply a readily available differential equation solver, namely, the Runge-Kutta method, to derive the system state probabilities and the three reliability measures numerically. The fourth-order Runge-Kutta method, the most widely used member of this family, is an effective single-step method. It is particularly useful when the solution is required at many points or when plots are needed to show the evolution of certain performance measures.
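A minimal sketch of how the classical fourth-order Runge-Kutta scheme can be applied to the forward equations p'(t) = p(t)Q of a finite CTMC is given below. The function name rk4_transient, the fixed step size, and the two-state test chain with rates 0.5 and 2.0 are illustrative assumptions, not the model or the parameters of this paper; the closed-form availability of the two-state chain serves only as a check on the integrator.

```python
import math

def rk4_transient(Q, p0, t_end, steps):
    """Integrate p'(t) = p(t) Q from t = 0 to t_end with classical RK4.

    Q is a generator matrix (list of rows), p0 the initial row vector,
    and steps the number of equal time steps (an assumed fixed-step scheme).
    """
    n = len(p0)

    def deriv(p):
        # Row vector times matrix: (p Q)_j = sum_i p_i Q_ij.
        return [sum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]

    h = t_end / steps
    p = list(p0)
    for _ in range(steps):
        k1 = deriv(p)
        k2 = deriv([pi + 0.5 * h * k for pi, k in zip(p, k1)])
        k3 = deriv([pi + 0.5 * h * k for pi, k in zip(p, k2)])
        k4 = deriv([pi + h * k for pi, k in zip(p, k3)])
        p = [pi + h / 6.0 * (a + 2 * b + 2 * c + d)
             for pi, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p

# Hypothetical two-state chain: state 0 = up, state 1 = down.
lam, mu = 0.5, 2.0
Q = [[-lam, lam], [mu, -mu]]
p = rk4_transient(Q, [1.0, 0.0], t_end=1.0, steps=200)

# Closed-form point availability of the two-state chain, used as a check.
exact = mu / (lam + mu) + lam / (lam + mu) * math.exp(-(lam + mu) * 1.0)
print(p[0], exact)
```

The same routine applies unchanged to a larger block-structured generator once it has been flattened into a single matrix; only the cost of the vector-matrix product grows.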
5. Numerical Illustrations
The model under investigation can be applied to many real-world systems. For illustration, we consider a communication system with n transmitters. The average message load may require at least k transmitters to be operational at any time; otherwise some critical messages may be lost. Whenever a transmitter fails, the transmitters that are still operating have to bear the message load of the failed transmitter, so the performance of the surviving transmitters is strongly affected by the increased load. Transmitters fail according to a Poisson process with parameter . Once failed, a transmitter is repaired by the repairmen, and the repair time follows an exponential distribution with parameter . Moreover, the repairmen take on additional duties (e.g., maintenance or repair work in other places) when there are no failed transmitters. Under normal circumstances, the repairmen return periodically to check the status of the system. If there are at least N failed transmitters, the repairmen repair them immediately. Otherwise, they leave the system together again for another duty. Consequently, our model provides a good approximation of such a practical system. In such a system, practitioners and researchers are interested in how the performance of the system is affected by the various system parameters. Here, we first study the influence of the repair capacity, including the number of repairmen and the vacation time, on the three reliability measures, namely, the availability, the rate of occurrence of failures, and the reliability. Then, we analyze the influence of the redundant dependency on these measures. Finally, a particular case is presented to verify the correctness of the formulae obtained in our model.
5.1. The Influence of the Repair Capacity on the System Reliability Measures
Example 1. First, we select , , , , , and . The vacation time follows a PH distribution with representation (, ) of order 4 and mean 0.2744, where
The numerical example combines the above parameters with different numbers of repairmen. For conciseness, the numerical results for the state probabilities at time with are tabulated in Table 1. Moreover, the three reliability measures are plotted against time in Figures 1(a)–1(c). Figure 1 shows that the system reliability measures are sensitive to the number of repairmen. Figures 1(a) and 1(c) indicate that the availability and the reliability increase with the number of repairmen, while Figure 1(b) reveals that the rate of occurrence of failures decreases as the number of repairmen increases.
Example 2. In this example, we first choose , , , , , , and . We then consider the following three cases of PH vacation time, with the PH representations given as follows.
Case 1. Phase type distribution (PH):
Case 2. Coxian distribution (COX):
Case 3. Erlangian distribution (ERL):
The expected values of the above three PH distributions are 0.1926, 1.3500, and 3.3333, respectively. The computational results for the state probabilities at time with Coxian vacation time distribution are provided in Table 2. For the three cases, Figures 2(a)–2(c) show the influence of the vacation time on the three system reliability measures. We observe from Figure 2(a) that the system availability decreases as the mean vacation time increases, and so does the system reliability. Figure 2(b) shows that the rate of occurrence of failures of the system increases with the mean vacation time. Thus the system performance measures are also sensitive to the vacation time.
Examples 1 and 2 indicate that (a) the curves of the system availability and the rate of occurrence of failures fluctuate strongly in the early stage, but the fluctuations die out after some time; (b) the repair capacity can be improved in two ways: increasing the number of repairmen and shortening the vacation time.
5.2. The Influence of the Redundant Dependency on the System Reliability Measures
In (1), the dependence function completely characterizes the redundant dependency. Clearly, for a given number of working components , the larger is, the stronger the dependency is. According to the reference , several types of redundant dependencies are quantified and classified by the value of as given in Table 3.
Based on Table 3, the redundant dependencies are classified into independence, weak, linear, and strong dependence. We here study the influence of different types of the redundant dependencies on the system reliability measures by taking the , as one specific form.
We select , , , , , and . The vacation time is governed by a PH distribution with 3 stages and mean 0.6567, where
The numerical example uses the above parameters with different values of , and 1.5 (corresponding to independence, weak dependence, linear dependence, and strong dependence, respectively). The system state probabilities at time with strong dependency are given in Table 4. Moreover, we plot the three reliability measures against time in Figures 3(a)–3(c). Figure 3(a) shows that the point availability increases with increasing . Figure 3(b) shows that the rate of occurrence of failures is a monotonically decreasing function of . We observe from Figure 3(c) that the system reliability increases as increases. This example suggests that dependency is an effective means of improving the reliability of the system. That is, a system could be rescued by improving its interior strength.
5.3. Special Case
Set , , , and let the mean vacation time tend to zero; then our model reduces to the classical k-out-of-n:G Markovian repairable system. Cao and Cheng analyzed the Markovian repairable system in which the working times and the repair times of components follow exponential distributions with parameters and , respectively. Applying the Markov analysis method, the steady-state availability and the rate of occurrence of failures are obtained as follows:
To illustrate the correctness of the formulae obtained in this paper, we select , , , and the vacation process defined by and . The computation results are tabulated in Tables 5 and 6, respectively. The numerical results show that the formulae obtained in the present paper agree exactly with those given in Cao and Cheng.
6. Conclusions
A k-out-of-n:G repairable system with N-policy, repairmen’s multiple vacations, and redundant dependency is discussed in this paper. By using a finite quasi-birth-and-death process, the availability, the rate of occurrence of failures, and the reliability of the system are obtained in the transient regime. The Runge-Kutta method is used to study the effect of various parameters on the system reliability measures. In our work, the dependence function is introduced to describe the redundant system with dependent components. The reliability of a system in the independence class is clearly lower than that of a system in the dependency classes. The stronger the dependency is, the higher the reliability of the system is. This implies that a system could be rescued by improving its interior strength. Consequently, a model with failure dependence seems much more applicable to practical situations. Further work is needed to solve the k-out-of-n:G repairable system with dependent but nonidentical components.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
This research is supported by the Project of Visual Computing and Virtual Reality Key Laboratory of Sichuan Province (no. KJ201401) and the National Natural Science Foundation of China (no. 71171138).
W. Kuo and M. J. Zuo, Optimal Reliability Modeling: Principles and Applications, John Wiley & Sons, New York, NY, USA, 2003.
J. H. Cao and K. Cheng, Introduction to Reliability Mathematics, Higher Education Press, Beijing, China, 2006.
Y. H. Tang and J. Zhang, “New model for load-sharing k-out-of-n:G system with different components,” Journal of Systems Engineering and Electronics, vol. 19, no. 4, pp. 748–751, 2008.
L. N. Guo, H. B. Xu, C. Gao, and G. T. Zhu, “Stability analysis of a new kind n-unit series repairable system,” Applied Mathematical Modelling: Simulation and Computation for Engineering and Environmental Systems, vol. 35, no. 1, pp. 202–217, 2011.
R. M. Fricks and K. S. Trivedi, “Modeling failure dependencies in reliability analysis using stochastic petri-nets,” in Proceedings of the European Simulation Multiconference (ESM '97), Istanbul, Turkey, 1997.
H. Y. Yu, C. B. Chu, F. Yalaoui, and É. Châtelet, “Optimal reliability allocation: simplification and fallacy,” in Proceedings of the IFAC Conference on Management and Control of Production and Logistics, Santiago, Chile, 2004.
R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, New York, NY, USA, 2nd edition, 2013.