Abstract

We introduce a two-unit cold standby shock model with multiple adaptive vacations, in which the startup and replacement of the repair facility are also considered. Using the supplementary variable method and the Laplace transform, we derive several important reliability indices, including availability, failure frequency, mean vacation period, mean renewal cycle, mean startup period, and replacement frequency. Finally, a production line controlled by two cold-standby computers is modeled to give a numerical illustration and to determine the part-time job policy that maximizes profit.

1. Introduction

It is well known that shock models are used to study the external causes that may make a system fail. For example, a computer system may fail due to a virus invasion or an attack by an attacker. Many authors have investigated various shock models under different assumptions. Among them, Esary et al. [1], Barlow and Proschan [2], Ross [3], and Fagiuoli and Pellerey [4] dealt with Poisson shock models. Eryilmaz [5, 6] discussed discrete-time shock models and their lifetime behavior. Also, Gottlieb [7], Aven and Gaarder [8], Wang and Zhang [9], Lam and Zhang [10], and Tang and Lam [11] obtained the optimal replacement policies for several shock models. Recently, from the reliability viewpoint, Li and Zhao [12] studied a -shock model consisting of components, and Q. T. Wu and S. M. Wu [13] analyzed a two-unit cold standby shock model with single vacation.

However, the existing shock literature assumes that the repair facility is always available when a unit fails, although this assumption is evidently unrealistic. In fact, in many practical situations the repair facility needs a startup time of random length for preparatory work before repair can begin. Furthermore, the busy repair facility is typically subject to lengthy and unpredictable breakdowns and has to be replaced (see [14–16]). On the other hand, to use the repairman’s idle time effectively and increase profit, the system manager can assign secondary jobs to the idle repairman. But the repairman’s additional tasks reduce system availability and sometimes cause large economic losses. Therefore, it is important for the system manager to know how to assign additional jobs to the idle repairman so as to maximize profit while keeping availability high. In this paper, the period during which the repairman undertakes additional jobs is represented by the repairman’s vacation time. A comprehensive study of vacation models can be found in Tian and Zhang’s book [17].

Based on the above facts, in this paper we present an extended shock model for a cold-standby system with two identical units. Here, cold standby means that the redundant unit cannot fail while in standby. Our study differs from previous work [1–13] in four respects: (i) it considers the startup and breakdown of the repair facility; (ii) it introduces the multiple adaptive vacation policy (MAVP), first proposed by Tian and Zhang [17], which is more general than single vacation, multiple vacations, and variant vacations and is useful for achieving high availability and optimizing the profit of the system (see Section 5); (iii) some new reliability indices are derived, such as mean renewal cycle, mean vacation period, mean startup period, mean idle period, and mean busy period; (iv) as an application of our model, a production line controlled by two identical cold-standby computers is modeled to analyze its optimal part-time job policy.

The paper is organized as follows. Section 2 presents the assumptions. In Sections 3 and 4 we obtain the solutions to state probability equations and main reliability indices by the supplementary variable method. In Section 5 the numerical illustration and optimal part-time job policy for a production line are given. Conclusions are drawn in Section 6.

2. Assumptions

The extended shock model we consider here consists of two identical cold-standby units, a repair facility, and a repairman. The model assumptions are as follows.

(1) The external shocks arrive according to a Poisson process with the rate . The magnitude of each shock, , is independent with common distribution function . Shocks only influence the operating unit. The operating unit will fail if outstrips a threshold , where is assumed to be nonnegative with a distribution function .

(2) Suppose that shocks are the only cause of unit failure. The system breaks down if and only if two units fail. When the operating unit fails, the standby one begins to operate if there is one (the switch is instantaneous and perfect). The failed units are repaired in order of failure. The repairman can repair only one failed unit at a time by means of the repair facility. The repaired unit is as good as new and immediately goes into the standby or operating state. The repair times are independent and identically distributed (i.i.d.) with common distribution function , density function , hazard rate function , and finite mean , respectively.

(3) After completing repair, the repairman will shut down the repair facility and follow a multiple adaptive vacation policy. Under this vacation policy, the vacation lengths are i.i.d. random variables following a general distribution function with density function , hazard rate function , and mean vacation time , respectively. The maximum vacation number of the repairman, denoted by , has an arbitrary distribution with probability generating function , . At each vacation completion instant, the repairman checks the system and decides the action to take according to the system state. There are three possible cases:
(A) if there is any failed unit in the system, he will immediately spend a startup time to turn the repair facility on and then repair until there are no failed units;
(B) if there is no failed unit in the system and the total number of vacations is still less than , he will take another vacation;
(C) if there is no failed unit in the system and the total number of vacations is equal to , he will remain idle in the system until the first failed unit appears, which induces a startup time and subsequent repair.
We assume that the startup time has distribution function , density function , hazard rate function , and mean startup time , respectively.

(4) The repair facility may break down with a Poisson rate in the process of repair. The broken facility is immediately replaced by the repairman. The replacement times are assumed to be i.i.d. random variables having a general distribution function , density function , hazard rate function , and mean replacement time , respectively. After replacement, the repair facility continues its remaining repair. The repair time of the failed unit is cumulative.

(5) Initially, both units are new (one is operating and the other is in cold standby), and the repairman is idle. After the first busy period is completed, he begins to follow the multiple adaptive vacation policy. All random variables are mutually independent.
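The decision rule in cases (A) to (C) of Assumption (3) can be sketched as a small function. This is only an illustrative sketch: the names `Action`, `action_at_vacation_end`, `failed_units`, `vacations_taken`, and `max_vacations` are hypothetical and do not appear in the paper, and an infinite maximum is used here to stand for the classical multiple vacation policy.

```python
from enum import Enum


class Action(Enum):
    START_UP = "start up the repair facility, then repair"
    VACATION = "take another vacation"
    IDLE = "stay idle until a unit fails"


def action_at_vacation_end(failed_units, vacations_taken, max_vacations):
    """Repairman's decision at a vacation completion instant.

    failed_units    -- number of failed units found at the check (0, 1, or 2)
    vacations_taken -- vacations completed so far in this cycle
    max_vacations   -- realized maximum vacation number (may be float('inf'))
    """
    if failed_units > 0:
        return Action.START_UP       # case (A)
    if vacations_taken < max_vacations:
        return Action.VACATION       # case (B)
    return Action.IDLE               # case (C)


# A repairman allowed at most 3 vacations finds no failure after the 3rd one.
print(action_at_vacation_end(0, 3, 3))
```

Passing `float('inf')` as `max_vacations` means case (C) is never reached, so the repairman keeps taking vacations until a failure is found.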

Remark 1. From Assumptions (1) and (5), the probability that a shock causes the operating unit to fail is given by where is the probability of event .
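As a numerical illustration of Remark 1, the sketch below evaluates the probability that an independent shock magnitude exceeds an independent random threshold by integrating the magnitude's survival function against the threshold's density. The exponential choices, the parameter values, and all names (`shock_failure_probability`, `a`, `b`, `shock_rate`) are assumptions made for the example, not values from the paper. Because each shock fails the unit independently with this probability, the thinning property of the Poisson process gives a unit failure rate equal to the shock rate times this probability.

```python
import math


def shock_failure_probability(magnitude_sf, threshold_pdf, upper=50.0, steps=200_000):
    """P(magnitude > threshold) for independent nonnegative variables.

    Computes the integral of P(magnitude > t) * threshold_pdf(t) over [0, upper]
    with the midpoint rule; `upper` must be large enough for the tail to vanish.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += magnitude_sf(t) * threshold_pdf(t)
    return total * h


# Assumed example: magnitude ~ Exp(a), threshold ~ Exp(b).
a, b = 0.5, 2.0
p = shock_failure_probability(
    lambda t: math.exp(-a * t),        # survival function of the magnitude
    lambda t: b * math.exp(-b * t),    # density of the threshold
)
closed_form = b / (a + b)              # known result for two exponentials

shock_rate = 1.5                       # assumed Poisson shock rate
failure_rate = shock_rate * p          # thinned Poisson rate of unit failures

print(p, closed_form, failure_rate)
```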

3. The State Probability Equations and Solutions

We define the possible states of the system as follows:    ,    ,    ,    ,    ,    ,    ,    ; ,    , ; , , ; , where , , , , and represent that one unit is operating, in cold standby, waiting for repair, waiting for remaining repair (preserving the time spent in repair), and under repair, and , , and represent that the repairman is idle, turning the repair facility on, replacing the repair facility, and taking the th vacation under the condition that the maximum vacation number is , ; , respectively. By definition, the system states , , , , , and are operable and , , , and are inoperable.

Let be the system state at time . For , we define as the elapsed startup time of repair facility at time , the elapsed repair time of the failed unit at time , the elapsed replacement time of the broken repair facility at time , and the elapsed vacation time of the repairman at time . Then, is a vector Markov process. Define the state probabilities at time as follows: where , , , and are the values taken by , , , and , respectively.

In steady state, we define

Let , since the process is a vector Markov process in continuous time, one can write the equations of the process in the usual way by considering the transitions occurring in and . For example, we have Letting tend to zero yields By taking the limit in , we can obtain In the same way, we readily get the following steady-state equations for state probabilities: with the boundary conditions and the normalization condition

In order to derive important reliability indices, we define the Laplace transform of a nonnegative function as , and we also denote , Define then we have the following.

Lemma 2. The expressions of , are given as follows: where

Proof. Solving (6) and (10), respectively, and using (19), we get Applying (28), (7) becomes which leads to Similarly, by the theory of first-order, linear, ordinary differential equation, we get
Now, we need to determine the values of , , , and , ; .
Substituting (32) and (35) into (21) and (22), respectively, leads to which gives rise to From (5), (35), and (40), we obtain Substituting (42), (36), and (41) into (15) gives Combining (16), (37), and (41) leads to By (17) and (18), we get It follows from (45) that Finally, according to (47) and (43), we have Thus, , are obtained by (28)–(37) and the above results. Again by the normalization condition (24), we get the value of .
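For reference, the standard transform identities that this kind of supplementary variable argument relies on can be written out. The notation below ($f^{*}$, $F$, $\bar F$, $r$, $m$) is assumed for illustration and may differ from the symbols used in the paper.

```latex
% Laplace transform of a nonnegative function f
f^{*}(s) = \int_{0}^{\infty} e^{-st} f(t)\,dt, \qquad s \ge 0 .

% Density and survival function expressed through the hazard rate r(x)
\bar F(x) = 1 - F(x) = \exp\!\Big(-\int_{0}^{x} r(u)\,du\Big), \qquad
f(x) = r(x)\,\bar F(x) .

% Transform of the survival function, and the mean m of the distribution
\int_{0}^{\infty} e^{-sx}\,\bar F(x)\,dx = \frac{1 - f^{*}(s)}{s}, \qquad
m = \int_{0}^{\infty} \bar F(x)\,dx = -\frac{d f^{*}(s)}{ds}\Big|_{s=0} .
```

The second line is what turns a first-order linear differential equation in the elapsed time into a solution of the form "boundary value times survival function", and the third line is what appears when such solutions are integrated over the elapsed time.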

4. Reliability Indices

Based on the results obtained in Lemma 2, some important reliability indices are easily derived as follows.

Theorem 3. (1) The steady-state availability of the system, that is, in steady state, the probability that the system is operating, denoted by , is
(2) The steady-state failure frequency of the system, that is, in steady state, the rate of occurrence of failures of the system, is given by
(3) In steady state, let and denote the repairman’s idle and vacation probabilities, and and are the startup and busy (repair and replacement) probabilities of repair facility, respectively; then
(4) In steady state, let denote the unavailability of the repair facility, that is, the probability that the repair facility is under replacement, and is the replacement frequency of the repair facility, that is, the rate of occurrence of replacements; then where is given by Lemma 2.

Proof. According to the frequency formula in [18], and are easily obtained. The remaining probabilities follow from their definitions, Lemma 2, and direct calculation.

Noting that the time points at which the repairman completes repair and begins to take vacation are regenerative ones, we define the following.
(i) System renewal cycle denoted by : this is the length of time from the beginning of the last vacation period to the beginning of the next vacation period.
(ii) Vacation period denoted by : this is the length of total vacation time of repairman per renewal cycle.
(iii) Idle period denoted by : this is the length of total idle time of repairman per renewal cycle.
(iv) Startup period denoted by : this is the length of total startup time of repair facility per renewal cycle.
(v) Busy period denoted by : this is the sum of repair time and replacement time of repair facility per renewal cycle.

Theorem 4. Denote by and the mean vacation period and mean actual vacation number of repairman, respectively; then

Proof. Let denote the complementary distribution of vacation period ; then . Now, we will construct a vector Markov process with finite absorbing states to get .
Let all system states defined in Section 3 except , ; ; and be absorbing states; then we obtain a new vector Markov process. For this new process, using similar state probability notations to Section 3, we get the following set of equations: with the boundary conditions where if , if , and the initial conditions: , , the rest are zero.
These equations can be solved in a similar manner to that in Section 3. The Laplace transforms of solutions are given by Therefore, we have Substituting the values of , ; ; and into the above equation and direct calculations complete the proof of Theorem 4.
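The quantities in Theorem 4 can be cross-checked by an elementary argument, sketched below under stated assumptions; it is not the paper's derivation and should be compared against Theorem 4 rather than read as its statement. At the start of a vacation period both units are good, and by Assumption (1) the operating unit fails after an exponential time, so a vacation ends with no failure found with probability `theta`, the Laplace-Stieltjes transform of the vacation length evaluated at the unit failure rate. The repairman then takes a `j`-th vacation only if the maximum vacation number is at least `j` and the first `j - 1` vacations found no failure, which gives a mean actual vacation number of `(1 - pgf(theta)) / (1 - theta)`, where `pgf` is the probability generating function of the maximum vacation number. The names `theta`, `pgf`, `fail_rate`, and `vac_rate`, and the exponential vacation length used in the simulation, are hypothetical choices for the example.

```python
import random


def mean_actual_vacations(pgf, theta):
    """Closed form (1 - pgf(theta)) / (1 - theta); requires theta < 1."""
    return (1.0 - pgf(theta)) / (1.0 - theta)


def simulate_mean_vacations(sample_max, fail_rate, vac_rate, runs, seed=0):
    """Monte Carlo estimate with exponential vacations and exponential failures."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        max_vacations = sample_max(rng)
        taken = 0
        while taken < max_vacations:
            taken += 1
            vacation = rng.expovariate(vac_rate)
            # Memorylessness: the time to the next failure restarts each vacation.
            time_to_failure = rng.expovariate(fail_rate)
            if time_to_failure < vacation:
                break                  # a failed unit is found at this check
        total += taken
    return total / runs


fail_rate, vac_rate = 0.4, 1.6               # assumed rates
theta = vac_rate / (vac_rate + fail_rate)    # P(no failure during one vacation)
closed = mean_actual_vacations(lambda z: z ** 3, theta)   # maximum fixed at 3
simulated = simulate_mean_vacations(lambda rng: 3, fail_rate, vac_rate, runs=100_000)
mean_vacation_period = closed / vac_rate     # Wald's identity: E[number] * E[length]

print(closed, simulated, mean_vacation_period)
```

With the maximum fixed at 3 the closed form reduces to `1 + theta + theta**2`, which is an easy sanity check on the generating-function expression.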

Corollary 5. Denote by , , , and the mean renewal cycle, mean idle period, mean startup period, and mean busy period, respectively; then where is given by Lemma 2.

Proof. The results are easily derived by noting that

Remark 6. Since the time points at which the repairman completes repair and begins to take vacation are regenerative ones, by the frequency formula in [18] and in (32), the steady-state renewal frequency of the system, that is, the rate of occurrence of renewals of the system, is given by . Again by Proposition 3.3 in [19], we get . The result derived by this method agrees with that in Corollary 5, which confirms that the derivations in Theorem 4 and Corollary 5 are correct. More importantly, Theorem 4 presents an effective analysis technique to derive the mean vacation period and mean actual vacation number.
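The relation between per-cycle means, steady-state probabilities, and the renewal frequency can be illustrated with a short sketch. It assumes that a renewal cycle splits into vacation, idle, startup, and busy time, which matches the period definitions given before Theorem 4, and applies the standard renewal-reward argument: the long-run fraction of time spent in a period type is its mean length per cycle divided by the mean cycle length. The function name `cycle_summary` and the input values are hypothetical and are not results from the paper.

```python
def cycle_summary(mean_vacation, mean_idle, mean_startup, mean_busy):
    """Steady-state fractions and renewal frequency from per-cycle means."""
    mean_cycle = mean_vacation + mean_idle + mean_startup + mean_busy
    return {
        "mean_cycle": mean_cycle,
        "renewal_frequency": 1.0 / mean_cycle,   # renewals per unit time
        "P_vacation": mean_vacation / mean_cycle,
        "P_idle": mean_idle / mean_cycle,
        "P_startup": mean_startup / mean_cycle,
        "P_busy": mean_busy / mean_cycle,
    }


# Assumed per-cycle means, chosen only to make the arithmetic easy to follow.
summary = cycle_summary(mean_vacation=2.0, mean_idle=0.5, mean_startup=0.25, mean_busy=1.25)
print(summary)
```

Read in the other direction, a known vacation probability and mean vacation period give the mean renewal cycle as their ratio, which is the kind of consistency check described in Remark 6.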

Remark 7 (special cases). (1) If , then no shock harms the operating unit; if , then each shock will cause the operating unit to fail.
(2) If , then our model becomes a two-unit cold standby shock system with repair facility startup and multiple adaptive vacations.
(3) If we let , then when , , our model becomes a two-unit cold standby shock system with multiple adaptive vacations.
(4) If we let and , , , then our model can describe a two-unit cold standby shock system with repair facility startup and replacement under a variant vacation policy. In this special case, and represent no vacation, single vacation, vacations, and multiple vacations, respectively.

5. Numerical Illustration and Optimal Part-Time Job Policy for a Production Line

As an application example of our model and its results, we consider a production line controlled by a huge operating computer, which may fail due to a virus invasion. The virus arrives according to a Poisson process with rate . If the magnitude of a shock outstrips a threshold the operating computer will fail, where and are assumed to have distribution functions and , , respectively. When the operating computer fails, an identical standby computer begins to operate. If both computers fail, production stops. The repairer repairs the failed computer(s) in order of failure at a repair station. Before repair the repair station needs a random startup time . After startup, the repairer continues repairing until no failed computer remains. The repair time of each computer is assumed to be a random variable. The repaired computer goes into the operating or standby state. Whenever no failed computer remains, the repairer shuts down the repair station and performs part-time jobs (vacations), where is a discrete random variable with generating function . The part-time jobs generate profit for the production line. Upon completion of each part-time job time , the repairer checks for computer failures and decides whether or not to begin startup and subsequent repair. If at this moment there is no failed computer, he may decide to take another part-time job. If a failed computer is found, he will start up the repair station before repair. If part-time jobs are completed, the repairer will be idle and wait for a computer failure. Further, the repair of a failed computer may be interrupted by unpredictable events, which occur according to a Poisson process with rate . The interrupted repair is immediately recovered with a random time . The repair resumes when the interruption is recovered.

Firstly, we numerically analyze the effects of part-time job, virus arrival rate, startup, and repair recovery time on main reliability indices of production line. These indices include the stop production frequency , mean renewal cycle , mean startup period , mean part-time job period , the availability , busy probability , unavailability , and repair recovery frequency .

For convenience, we assume that (1) the time of each part-time job is exponentially distributed with parameter , (2) the repair time is deterministic with mean , (3) the startup time follows a 2-stage Erlang distribution with parameter , and (4) the repair recovery time follows a hyperexponential distribution with density function , . Also, suppose that and , ; then we get . The numerical results are reported in Tables 1–5 for varying values of , , , and , where in Tables 2–5 it is assumed that is geometrically distributed with mean .
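The four distribution families named above all have closed-form Laplace-Stieltjes transforms, which is what makes them convenient inputs for transform-based indices. The sketch below lists these standard transforms and recovers each mean from the derivative at zero. The function names and every parameter value are hypothetical, chosen for illustration, and are not the settings behind the paper's tables.

```python
import math


def lst_exponential(rate):
    return lambda s: rate / (rate + s)


def lst_deterministic(length):
    return lambda s: math.exp(-s * length)


def lst_erlang(stages, rate):
    # Sum of `stages` independent exponential phases, each with the given rate.
    return lambda s: (rate / (rate + s)) ** stages


def lst_hyperexponential(weights, rates):
    # Mixture of exponentials; the weights must sum to 1.
    return lambda s: sum(w * r / (r + s) for w, r in zip(weights, rates))


def mean_from_lst(lst, h=1e-6):
    """Mean as minus the derivative of the transform at 0 (central difference)."""
    return (lst(-h) - lst(h)) / (2.0 * h)


part_time_job = lst_exponential(2.0)                      # mean 0.5
repair = lst_deterministic(0.7)                           # mean 0.7
startup = lst_erlang(2, 4.0)                              # mean 2 / 4 = 0.5
recovery = lst_hyperexponential((0.4, 0.6), (1.0, 3.0))   # mean 0.4/1 + 0.6/3 = 0.6

for name, lst in [("part-time job", part_time_job), ("repair", repair),
                  ("startup", startup), ("recovery", recovery)]:
    print(name, round(mean_from_lst(lst), 6))
```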

Using the results derived in Section 4, Table 1 shows the effects of the maximum part-time job number on the main reliability indices of the production line for the set of parameters . Here, we assume that ; that is, suppose that the repairman takes no part-time job, a single part-time job, 3 part-time jobs, 5 part-time jobs, and multiple part-time jobs, respectively. It is observed that the stop production frequency , mean renewal cycle , and mean part-time job period all increase monotonically, and the availability , busy probability , and repair recovery frequency all decrease monotonically as the value increases. The same conclusion holds for decreasing values of the part-time job rate in Table 2.

The effects of the virus arrival rate on the main reliability indices are reported in Table 3, where we set . As expected, , , and decrease, and , , , and increase as increases, while the mean startup period does not change. The effect of varying the startup parameter is shown in Table 4 for the set of parameters . We see that the mean startup period increases monotonically as the value decreases.

Table 5 reports the effects of the repair recovery time parameter on the main reliability indices for the set of parameters . It should be noted that the repair recovery time does not affect the mean startup period or the mean part-time job period. The trends shown by Tables 1–5 are as expected.

Next, we numerically analyze the optimal part-time job policy of production line while maintaining a maximum profit. Let us define income of the production line per unit time when one computer is operating, income per unit time when the repairer is performing part-time jobs, loss of the production line when the production stops for each time, idle cost of the repairer per unit time, startup cost of the repair station per unit time, repair and repair recovery costs per unit time, loss of the production line when the repair is recovered for each time;

then the average profit of the production line per unit time is given by where , , , , , , , and are given by Theorems 3 and 4 and Corollary 5, respectively. For optimization analysis, we let
(1) , , , , , , and ;
(2) , , , and ;
(3) the startup time follows a 4-stage Erlang distribution with mean ;
(4) the repair time follows a 3-stage Erlang distribution with mean ;
(5) the repair is interrupted according to a Poisson process with rate ;
(6) the repair recovery time is deterministic with mean .

The following two optimal part-time job policies will be determined so as to maximize the average profit and maintain an availability no less than 0.85: (a) the maximum number of part-time jobs when the time of each part-time job follows an exponential distribution with mean ; (b) the maximum value of when the maximum number of part-time jobs obeys a negative binomial distribution and the time of each part-time job is a positive integer .

For case (a), we set and ; the optimization problem is given by For case (b), we set and , ; the optimization problem is With Matlab 7.0 and a heuristic method, the numerical results for two optimization problems are reported in Tables 6 and 7, respectively. Table 6 shows that for case (a), a maximum average profit value per unit time of is achieved at . It is seen from Table 7 that for case (b), the maximum average profit per unit time is 1946.6 as the maximum positive integer value of equals 2.
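The structure of both optimization problems, maximizing average profit over integer candidates subject to an availability floor of 0.85, can be sketched as a plain exhaustive search. This is a stand-in technique and not the Matlab heuristic mentioned above. The `profit` and `availability` functions below are toy placeholders invented for the example; in practice they would be replaced by the expressions from Section 4, and the numbers they produce have no relation to Tables 6 and 7.

```python
def best_feasible(candidates, profit, availability, min_availability=0.85):
    """Return (candidate, profit) maximizing profit among feasible candidates.

    A candidate is feasible when availability(candidate) >= min_availability.
    Returns None if no candidate is feasible.
    """
    best = None
    for n in candidates:
        if availability(n) < min_availability:
            continue                       # violates the availability constraint
        value = profit(n)
        if best is None or value > best[1]:
            best = (n, value)
    return best


# Toy placeholders (assumptions): availability falls and profit first rises
# as more part-time jobs are allowed.
def availability(n):
    return 0.96 - 0.02 * n


def profit(n):
    return 100 + 30 * n - 2 * n * n


result = best_feasible(range(0, 11), profit, availability)
print(result)   # (5, 200): the availability floor binds before profit peaks
```

In this toy instance the unconstrained profit would keep rising up to 7 or 8 jobs, so the example also shows how the availability requirement, rather than the profit curve itself, can determine the chosen policy.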

6. Conclusions

In this paper, using the supplementary variable method and the Laplace transform, we obtain the main steady-state reliability indices of an extended shock model, such as availability, failure frequency, mean renewal cycle, mean vacation period, and mean startup period. Some special cases are given. As an application, a production line is modeled to provide a numerical illustration and to study its optimal part-time job policy. For future research, one could consider discrete-time shock models with repairman vacations and their optimization applications.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank the referees and editor for their valuable comments and suggestions. This work is supported by the Basic and Frontier Research Foundation of Chongqing of China (cstc2013jcyjA00008) and the Scientific Research Starting Foundation for Doctors of Chongqing University of Technology (2012ZD48).