Abstract

This paper studies the effect of postponing interruptions of the service in M/G/1 queueing systems until the end of the service in progress. The waiting and sojourn times in systems with postponement are compared with those in comparable systems without postponement, both for the case that interruptions can only occur during busy times of the server and for the case that interruptions are also possible during idle times of the server.

1. Introduction

Kelton et al. [1, Sc. 4.2.3] propose the following rule of thumb for dealing with failures of machines in the context of a simulation model: if the expected time to failure is large compared to the expected failure duration, postpone the failure until the end of the service in progress. They apply this rule to a machine which forms a node in a queueing network. The present paper compares, in terms of the time in system of the customers, this postponement with failures which preempt the service in progress. For the simulation model in Kelton et al. [1, Sc. 4.2.3] the mean time at the node with the machine with failures is with postponement versus with preemption (95% confidence intervals from 100 replications of simulations with common random numbers of 500 days after 1 day warm up). A 95% paired-t confidence interval, cf. Kelton et al. [1, Sc. 6.4], of the time at the node with preemption minus that with postponement is , so the difference is significant at the 95% level. This means that the model with postponed failures underestimates the mean time at the node in the model with preemption by about 1.8%, partly because there will be somewhat (about 1%) fewer failures in the former model due to the postponement.
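The comparison described above (replications with common random numbers, followed by a paired-t interval on the difference) can be sketched as follows. This is a minimal illustration under stated assumptions, not the model of Kelton et al.: it uses a single queue with exponential service, repair, and interarrival times, failures only while a customer is in service, an empty start without warm-up deletion, and hypothetical parameter values. The names `replicate`, `paired_t_interval`, and `compare` are introduced here for illustration only.

```python
import math
import random
import statistics

# Two-sided 95% Student-t quantile for 99 degrees of freedom,
# i.e. valid for exactly 100 replications.
T_975_DF99 = 1.9842


def replicate(rng, customers, lam, b_mean, nu, g_mean):
    """One replication; returns the mean time in system (preempt, postpone).

    Both policies are driven by the same random draws (common random
    numbers), and waiting times follow the Lindley recursion.
    """
    wait_pre = wait_post = 0.0
    total_pre = total_post = 0.0
    for _ in range(customers):
        service = rng.expovariate(1.0 / b_mean)
        down_all = 0.0    # all repairs hitting this service (preemption)
        down_first = 0.0  # only the first repair (postponement)
        hits = 0
        clock = rng.expovariate(nu)  # failure epochs on the service clock
        while clock < service:
            repair = rng.expovariate(1.0 / g_mean)
            if hits == 0:
                down_first = repair
            down_all += repair
            hits += 1
            clock += rng.expovariate(nu)
        total_pre += wait_pre + service + down_all
        total_post += wait_post + service  # customer leaves before the repair
        gap = rng.expovariate(lam)  # time until the next arrival
        wait_pre = max(0.0, wait_pre + service + down_all - gap)
        wait_post = max(0.0, wait_post + service + down_first - gap)
    return total_pre / customers, total_post / customers


def paired_t_interval(diffs, t_quantile):
    """Paired-t confidence interval for the mean of the differences."""
    mean = statistics.fmean(diffs)
    half = t_quantile * statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean - half, mean + half


def compare(replications=100, customers=2000, seed=1):
    """Differences (preempt minus postpone) and their 95% interval.

    The hard-coded quantile assumes replications == 100.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(replications):
        pre, post = replicate(rng, customers, 0.8, 1.0, 0.05, 1.0)
        diffs.append(pre - post)
    return diffs, paired_t_interval(diffs, T_975_DF99)


if __name__ == "__main__":
    _, (low, high) = compare()
    print(f"95% paired-t interval for the difference: [{low:.4f}, {high:.4f}]")
```

Because both policies share the same service, failure, and repair draws, each replication's difference is nonnegative by construction, which is what makes the paired interval much tighter than one based on independent runs.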

In the rest of this paper, we investigate the difference between postponement and preemption more systematically for M/G/1 queueing systems in which services may be interrupted because the server becomes unavailable during some time. For instance, the server could be a machine that is subject to breakdowns, a clerk whose servicing is interrupted by phone calls, or a doctor who is called away from consultations for more urgent matters. The arrivals are generated by a Poisson process at rate . The service times are assumed to be a sequence of independent, identically distributed random variables with distribution , with moments , and Laplace-Stieltjes transform (LST) . The interruptions or breakdowns are generated by an (interrupted) Poisson process at rate . The interruption or repair times are assumed to be a sequence of independent, identically distributed random variables with distribution , with moments , and LST . Throughout it will be assumed that a preempted service is resumed when the service interruption is over. M/G/1 systems with service interruptions have been discussed by Gaver [2], Keilson [3], and Cohen [4, Sc. III.3]. More recent work on queueing systems with service interruptions (without postponement) includes Fiems et al. [5], who consider a mixture of preemptive resume and preemptive repeat interruptions, and Krishnamoorthy et al. [6], who consider a queue with interruptions in a matrix-analytic setting.

The rest of this paper is organized as follows. The imbedded Markov chain analysis for M/G/1 systems is briefly reviewed in Section 2. Section 3 considers the effect of postponing interruptions for the case that interruptions can only occur during active periods of the server. Section 4 discusses the same effect for the case that interruptions can also occur when the server is idle. A conclusion can be found in Section 5.

2. Review of M/G/1 Analysis

Consider the standard M/G/1 system without interruptions. The analysis of this system is often based on the imbedded Markov chain of the number of customers present just after service completions, cf. for example, Cohen [4, Sc. II.4.2] and Gross and Harris [7, Sc. 5.1]. Let denote the number of customers present just after the service completion of the th customer, not including this th customer. This sequence of random variables satisfies the following well-known recursion: Here, represents the number of arrivals during the service time of the th customer. Since the service times are assumed to be mutually independent, and since the arrivals occur according to a Poisson process, the random variables are also mutually independent. Their probability generating function (PGF) is , . Recursion (2.1) leads to the following relation for the PGF of the stationary version of the sequence : Since , an M/G/1 system is stable if , and the PGF of the distribution of follows as Next, let denote the number of customers waiting in the queue just after the start of the service of the th customer, not including this th customer. If , the service of the th customer starts immediately after the departure of customer with customers present in the system, so customers are waiting in the queue. If , the service of the th customer starts after an idle period of the server with no customers waiting. Hence, these random variables are related by With (2.1) and (2.3), it readily follows that the stationary version of the sequence has a distribution with PGF Since there is a single server, the number of customers waiting in the queue, , must be equal to the number of arrivals during the first-come-first-served (FCFS) waiting time of the th customer. For a Poisson arrival process, the number of arrivals during , , only depends on the length of and the arrival rate : . 
This implies for the stationary versions of these sequences that This leads with (2.5) to the LST of the stationary FCFS waiting-time distribution
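For reference, the chain of relations described in this section can be written out as below. This is a sketch in assumed notation, since the symbols are not fixed by the surrounding text: arrival rate $\lambda$, service-time LST $\beta(s)$ with mean $\beta_1$, load $\rho = \lambda\beta_1$, and $x_n$, $a_n$, $q_n$, $w_n$ for the number left behind, the arrivals during a service, the queue length at a service start, and the FCFS waiting time. The equation numbers are inferred from the cross-references in the text.

```latex
% Requires amsmath. Notation and numbering are assumptions, see lead-in.
\begin{align}
x_{n+1} &=
  \begin{cases}
    x_n - 1 + a_{n+1}, & x_n \ge 1,\\
    a_{n+1},           & x_n = 0,
  \end{cases} \tag{2.1}\\
\mathrm{E}\,z^{x} &= \beta(\lambda(1-z))
  \left[\frac{\mathrm{E}\,z^{x} - \Pr\{x=0\}}{z} + \Pr\{x=0\}\right],
  \qquad \mathrm{E}\,z^{a_n} = \beta(\lambda(1-z)),\ |z|\le 1, \tag{2.2}\\
\mathrm{E}\,z^{x} &= \frac{(1-\rho)(1-z)\,\beta(\lambda(1-z))}
  {\beta(\lambda(1-z)) - z}, \qquad \rho = \lambda\beta_1 < 1, \tag{2.3}\\
q_n &= \max(x_{n-1} - 1,\, 0), \tag{2.4}\\
\mathrm{E}\,z^{q} &= \frac{\mathrm{E}\,z^{x}}{\beta(\lambda(1-z))}
  = \frac{(1-\rho)(1-z)}{\beta(\lambda(1-z)) - z}, \tag{2.5}\\
\mathrm{E}\,z^{q} &= \mathrm{E}\,e^{-\lambda(1-z)\,w}, \tag{2.6}\\
\mathrm{E}\,e^{-s w} &= \frac{(1-\rho)\,s}{s - \lambda\,(1-\beta(s))},
  \qquad \operatorname{Re} s \ge 0. \tag{2.7}
\end{align}
```

Step (2.5) uses $x_n = q_n + a_n$ with $q_n$ and $a_n$ independent, and (2.7) follows from (2.5) and (2.6) by substituting $s = \lambda(1-z)$; the result is the Pollaczek-Khinchine formula.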

3. Interruptions during Busy Periods Only

In this section it will be assumed that interruptions can only occur when the server is active. This means that the instants generated by the Poisson process with failure rate are only interpreted as a server interruption if the server is occupied with the service of a customer; they are ignored during idle periods and during interruption times. The analysis of this variant of the M/G/1 system can be reduced to that of a standard M/G/1 system with a modified service-time distribution. The total duration of a service, called the service-completion time, cf. Gaver [2], may include one or more interruption times. The number of interruptions during a service, , only depends on the length of the service time, and not on those of the interruption times. Given the service time and the number of interruptions , the total duration is the sum of and of independent interruption times, that is, for ,  , Because the number of interruptions during a service time of duration is Poisson-distributed with mean , it follows that The first three moments of the distribution of the service completion time, including interruptions, are Results for the standard M/G/1 system, cf. Section 2, are translated to the present variant with interruptions by replacing the LST and related quantities throughout by and corresponding quantities. For instance, the M/G/1 system with interruptions is stable if the modified load of the system, , is bounded by one so that The latter inequality states that the offered load must be smaller than the probability that the server is available for service when interruptions are possible at any instant. Further, let denote the waiting-time until the start of service under FCFS. The LST of this waiting time distribution becomes, cf. 
(2.7), as follows: The probability that the waiting time is zero is The mean waiting time until the start of service is and the variance of the waiting times until the start of service is The time in system of a customer is the independent sum of the waiting time until the start of service and a service completion time , so that The fraction of time the server is busy is equal to the offered load and the fraction of time the server is idle is by the PASTA (Poisson Arrivals See Time Averages) property equal to , cf. (3.6). Hence, the fraction of time the server is interrupted is Next, it will be assumed that interruptions can be postponed until the end of a service. Let denote the number of customers left behind in the system at the end of the th service, including a possible interruption. Then, this sequence of random variables forms an imbedded Markov chain and satisfies the recursion, cf. (2.1), as follows: Here, the random variable represents the length of a service time and a possible interruption. With the service time and the first interruption occurrence after the start of this service, we have for the distribution of such a service completion time so that The first three moments of the distribution of the service completion time, including a possible, postponed interruption, are This system is stable if From (3.11), it follows that the stationary distribution of the sequence has a similar PGF as that of the standard M/G/1 system, cf. (2.3), but with replaced by . Further, although is not necessarily related to a departure instant in the present system, relation (2.4) does hold for the present system, with the number of waiting customers at the start of the service of customer . Hence, the stationary distribution of the sequence also has a similar PGF as that of the standard M/G/1 system, cf. (2.5), but with replaced by . Let denote the waiting-time until the start of service under FCFS for this system with postponed interruptions. 
Since also holds for the present system, the LST of this waiting time distribution becomes that of the standard M/G/1 system (2.7) with replaced by as follows: The probability that the waiting time is zero is The mean waiting time until the start of service is and the variance of this waiting time is The time in system of a customer is the independent sum of the waiting time until the start of service and an uninterrupted service time so that, for example, The fraction of time this system is in an interrupted state is, cf. (3.17), as follows:

Table 1 contains a numerical comparison of M/G/1 systems with preempting interruptions and postponed interruptions, where interruptions can only occur when the server is busy. In all cases, the mean service time is normalized to . Case I is the base case, with offered load , with exponentially distributed service times and interruption times, with failure rate , and mean interruption time equal to the mean service time () so that the mean time to failure is more than 20 times the mean interruption time. The mean time in system is smaller when interruptions are postponed. Case II concerns less frequent, longer interruptions. Less frequent, longer interruptions are known to increase the delays of customers, but a decrease of the interruption rate by a factor 5 leads to a reduction in the difference of the mean times in system to 1.3%. Case III concerns more frequent, shorter interruptions. An increase of the interruption rate by a factor 5 leads to a rise in the difference of the mean times in system to 7.1%. Case IV concerns more variable service times, modeled by a gamma distribution with shape parameter 0.5. An increase of the squared coefficient of variation by a factor 2 leads to a rise in the difference of the mean times in system to 3.3%. Case V concerns more variable interruption times, also modeled by a gamma distribution with shape parameter 0.5. An increase of the squared coefficient of variation by a factor 2 leads to a slight rise in the difference of the mean times in system to 2.7%. Case VI concerns more frequent interruptions. An increase of the interruption rate by a factor 2 leads to a rise in the difference of the mean times in system to 9.0%. Case VII concerns less frequent interruptions. A decrease of the interruption rate by a factor 5 leads to a reduction in the difference of the mean times in system to 0.3%. Case VIII concerns a higher arrival rate. An increase of the load from 0.8 to 0.9 leads to a rise in the difference of the mean times in system to 4.7%, while the mean times in system also increase strongly. Case IX concerns slightly more frequent interruptions, to compensate for the reduction in the number of interruptions caused by postponing them; that is, at is the same as at . So, the values of the postponed interruptions model of case IX should be compared with the values of the preempting interruptions model of case I. After correction for the decrease of the number of actual interruptions, the difference of the mean times in system is still 1.1%. Finally, case X forms a kind of worst-case scenario, with a rise in the difference of the mean times in system to 13.3%.
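The comparison of this section can be sketched numerically as follows. The sketch assumes notation that the text does not fix: arrival rate `lam`, interruption rate `nu`, and gamma-distributed service and interruption times given by mean and shape (shape 1 is the exponential case). For preemption it uses the completion time C = B + (sum of a Poisson number of interruptions), and for postponement C' = B + I·1{a failure event falls within B}, as described above. The input values for Cases I to IV (load 0.8, base failure rate 0.05) are assumptions chosen to be consistent with the text, and all function names are hypothetical.

```python
def gamma_moments(mean, shape):
    """First two moments of a gamma distribution (shape 1 = exponential)."""
    return mean, mean * mean * (1.0 + 1.0 / shape)


def gamma_lst(mean, shape, s):
    """LST of a gamma distribution and its derivative, both evaluated at s."""
    base = 1.0 + s * mean / shape
    return base ** (-shape), -mean * base ** (-shape - 1.0)


def mg1_mean_wait(lam, c1, c2):
    """Pollaczek-Khinchine mean wait for completion-time moments c1, c2."""
    if lam * c1 >= 1.0:
        raise ValueError("unstable: lam * c1 must be below 1")
    return lam * c2 / (2.0 * (1.0 - lam * c1))


def mean_sojourn_preempt(lam, nu, b_mean, b_shape, g_mean, g_shape):
    """Mean time in system when interruptions preempt (and resume) service."""
    b1, b2 = gamma_moments(b_mean, b_shape)
    g1, g2 = gamma_moments(g_mean, g_shape)
    stretch = 1.0 + nu * g1
    c1 = b1 * stretch
    c2 = b2 * stretch * stretch + b1 * nu * g2
    return mg1_mean_wait(lam, c1, c2) + c1


def mean_sojourn_postpone(lam, nu, b_mean, b_shape, g_mean, g_shape):
    """Mean time in system when an interruption waits for the service end."""
    b1, b2 = gamma_moments(b_mean, b_shape)
    g1, g2 = gamma_moments(g_mean, g_shape)
    lst, dlst = gamma_lst(b_mean, b_shape, nu)
    p_hit = 1.0 - lst    # P(a failure event falls within the service)
    eb_hit = b1 + dlst   # E[B; a failure event falls within the service]
    c1 = b1 + g1 * p_hit
    c2 = b2 + 2.0 * g1 * eb_hit + g2 * p_hit
    # The customer leaves after an uninterrupted service time b1.
    return mg1_mean_wait(lam, c1, c2) + b1


def relative_difference(**case):
    """(preempt - postpone) / preempt for one parameter set."""
    pre = mean_sojourn_preempt(**case)
    post = mean_sojourn_postpone(**case)
    return (pre - post) / pre


# Assumed inputs, not copied from Table 1.
CASES = {
    "I": dict(lam=0.8, nu=0.05, b_mean=1.0, b_shape=1.0, g_mean=1.0, g_shape=1.0),
    "II": dict(lam=0.8, nu=0.01, b_mean=1.0, b_shape=1.0, g_mean=5.0, g_shape=1.0),
    "III": dict(lam=0.8, nu=0.25, b_mean=1.0, b_shape=1.0, g_mean=0.2, g_shape=1.0),
    "IV": dict(lam=0.8, nu=0.05, b_mean=1.0, b_shape=0.5, g_mean=1.0, g_shape=1.0),
}

if __name__ == "__main__":
    for name, case in CASES.items():
        print(f"case {name}: {100.0 * relative_difference(**case):.1f}%")
```

Under these assumed inputs the relative differences come out at roughly 1.3%, 7.1%, and 3.3% for Cases II to IV, in line with the percentages quoted above, which suggests that the difference is measured relative to the mean time in system under preemption.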

4. Interruptions during Idle and Busy Periods

In this section, it will be assumed that interruptions may also occur during idle periods. In this case, the probability that the first customer of a busy period finds the server available, upon arrival is, since is the probability that no customer arrives during a time when the server is available and is the probability that no customer arrives during an interruption time is as follows: The imbedded Markov chain of the number of customers left behind at departure instants satisfies the recurrence relations, for , as follows: Here,   denotes the number of arrivals during the th service completion time (including interruption times), with PGF , cf. (3.2), and denotes the number of arrivals during the interruption time in which the th customer arrives (if any), . The PGF of the latter random variables is From the recursions (4.2), it follows by the procedure of Section 2 that the PGF of the stationary distribution of the number of customers in the system, , for , is given by, Here, denotes the LST of the distribution of a residual interruption time. It turns out that the condition for stability is the same as that for systems without interruptions during idle periods, cf. (3.4). The LST of the stationary distribution of the waiting time until the start of service under FCFS follows from (4.4) by the procedure of Section 2 as follows: This LST is the product of the LST of the distribution of the waiting time in systems without interruptions during idle periods, cf. (3.5), and the LST of the distribution of a residual interruption time at an arbitrary instant. Hence, it follows with (3.7) that Observe that does not vanish as , in contrast with , due to the possibly positive residual interruption time at an arrival instant. 
The variance of the waiting times can be computed from The probability that an arbitrary customer does not have to wait is the limit of the LST (4.5) as The time in system is the independent sum of the waiting time and the service completion time , so that The fraction of time this system is in an interrupted state is, cf. (3.10), (4.8), given by This is intuitively clear, since the mean time that the server is interrupted is , and the mean time between interruptions is .
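The product form described above can be made concrete as below. This is a sketch in assumed notation: $\nu$ for the interruption rate, $\gamma(s)$ and $\gamma_k$ for the LST and moments of an interruption time, $W$ for the waiting time without interruptions during idle periods, $\tilde W$ for the waiting time with them, and $R$ for the residual interruption time at an arbitrary instant (zero when the server is not interrupted).

```latex
% Requires amsmath. Notation is an assumption, see lead-in.
\begin{align*}
\mathrm{E}\,e^{-sR} &= \frac{1}{1+\nu\gamma_1}
  + \frac{\nu\gamma_1}{1+\nu\gamma_1}\cdot\frac{1-\gamma(s)}{s\,\gamma_1},\\
\mathrm{E}R &= \frac{\nu\gamma_2}{2(1+\nu\gamma_1)}, \qquad
\mathrm{E}R^2 = \frac{\nu\gamma_3}{3(1+\nu\gamma_1)},\\
\mathrm{E}\tilde W &= \mathrm{E}W + \mathrm{E}R, \qquad
\operatorname{var}\tilde W = \operatorname{var}W + \mathrm{E}R^2 - (\mathrm{E}R)^2,\\
\Pr\{\tilde W = 0\} &= \lim_{s\to\infty}\mathrm{E}\,e^{-s\tilde W}
  = \frac{1-\lambda\beta_1(1+\nu\gamma_1)}{1+\nu\gamma_1},\\
\Pr\{\text{server interrupted}\} &= \frac{\gamma_1}{\gamma_1 + 1/\nu}
  = \frac{\nu\gamma_1}{1+\nu\gamma_1}.
\end{align*}
```

The first line mixes an atom at zero, with the probability that the server is not interrupted, and the usual residual-lifetime LST of an interruption time. The term $\mathrm{E}R$ does not depend on $\lambda$, which is why $\mathrm{E}\tilde W$ stays positive as the arrival rate tends to zero.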

Next, it will be assumed that interruptions can be postponed until the end of a service. Let denote the number of customers left behind in the system at the end of the th service, including a possible interruption. Then, this sequence of random variables forms an imbedded Markov chain and satisfies the recursion Here, the random variable represents again the length of a service time and a possible interruption, and the PGF of is again given by (4.3). From these recursions it follows by the procedure of Section 2 that the PGF of the stationary distribution of the number of customers in the system, , for , is given by, The number of customers waiting in the queue at the start of the th service satisfies, like (2.4), the following: With (4.11), it follows that for the stationary versions, so that (4.12) implies that Since holds as in the standard M/G/1 system, cf. (2.6), the LST of the FCFS waiting-time distribution follows as Like (3.16), this LST is the product of the LST of the distribution of the waiting time in systems without interruptions during idle periods and the LST of the distribution of a residual interruption time at an arbitrary instant. Hence, it follows like (4.6) and (4.7) that: The probability that an arbitrary customer does not have to wait is The time in system of a customer is the independent sum of the waiting time until the start of service and a service time so that The fraction of time this system is in an interrupted state is, cf. (3.21), (4.17), as follows: From (4.18), (4.9), (4.6), (4.7), (4.16), (3.9), and (3.20), it follows that the absolute differences between the preempting and the postponing policies are the same for the cases with and without interruptions during idle periods. Table 2 contains similar cases as Table 1 but for systems in which interruptions can also occur during idle periods. As relation (4.20) predicts, the absolute differences of the mean times in system without and with postponement are the same in both tables. 
Since the mean time in system is somewhat higher in systems in which interruptions can also occur during idle periods than in corresponding systems without interruptions during idle periods, the relative differences of the mean times in system without and with postponement are somewhat smaller in Table 2. For case IX, the failure rate which makes the fraction of time in the interrupted state for the system with postponement, , equal to the fraction of time in the interrupted state for the system with preemption, , at , is , a bit smaller than the in Table 1. This correction of the failure rate leads to a difference of the mean times in system of 1.3% versus 2.6% in the base case I without this correction.
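The relation between the two tables can be illustrated with a short calculation. The sketch below assumes exponential service and interruption times and uses hypothetical names and base-case inputs (arrival rate 0.8, failure rate 0.05, unit means). It adds the mean residual interruption time, nu·g2 / (2·(1 + nu·g1)), to both policies to move from the busy-only setting to the setting with interruptions during idle periods, as the product form of this section implies.

```python
def busy_only_sojourn(lam, nu, b1, g1):
    """Mean time in system (preempt, postpone) for exponential service and
    interruption times, with interruptions only while the server is busy."""
    b2, g2 = 2.0 * b1 * b1, 2.0 * g1 * g1
    # Preemption: completion time is the service plus all its interruptions.
    stretch = 1.0 + nu * g1
    c1 = b1 * stretch
    c2 = b2 * stretch * stretch + b1 * nu * g2
    if lam * c1 >= 1.0:
        raise ValueError("unstable parameter set")
    pre = lam * c2 / (2.0 * (1.0 - lam * c1)) + c1
    # Postponement: at most one interruption, appended after the service.
    lst = 1.0 / (1.0 + nu * b1)             # service-time LST at nu
    dlst = -b1 / (1.0 + nu * b1) ** 2       # its derivative at nu
    p_hit = 1.0 - lst
    d1 = b1 + g1 * p_hit
    d2 = b2 + 2.0 * g1 * (b1 + dlst) + g2 * p_hit
    post = lam * d2 / (2.0 * (1.0 - lam * d1)) + b1
    return pre, post


def residual_interruption_mean(nu, g1, g2):
    """Mean residual interruption time seen at an arbitrary instant."""
    return nu * g2 / (2.0 * (1.0 + nu * g1))


def compare(lam, nu, b1, g1):
    """Mean times in system for both policies in both settings."""
    pre, post = busy_only_sojourn(lam, nu, b1, g1)
    extra = residual_interruption_mean(nu, g1, 2.0 * g1 * g1)
    return {
        "busy_only": (pre, post),
        "idle_and_busy": (pre + extra, post + extra),
    }


if __name__ == "__main__":
    for setting, (pre, post) in compare(0.8, 0.05, 1.0, 1.0).items():
        rel = 100.0 * (pre - post) / pre
        print(f"{setting}: preempt {pre:.4f}, postpone {post:.4f}, "
              f"difference {pre - post:.4f} ({rel:.2f}%)")
```

The absolute difference is identical in the two settings by construction, while the relative difference is slightly smaller when interruptions can also occur during idle periods, because both means are shifted upward by the same amount.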

5. Conclusion

This paper has studied the differences in performance between queueing systems in which interruptions can be postponed until the end of the service in progress and comparable systems in which interruptions preempt the service in progress. It turns out that the absolute differences are the same whether interruptions occur only during busy times of the server or also during idle times. The difference amounts to only a few percent in many cases, but may become quite large (well over 10%) if the failure rate increases at a fixed ratio of, say, 1 : 20 between the failure rate and the repair rate. The difference increases further with the variability of the service times and with the arrival rate. The difference is partly due to a reduction in the number of failures when failures are postponed and partly due to the feature that the customer in service is not delayed by a failure which is postponed.