#### Abstract

There are many applications where it is necessary to model queuing systems that involve finite queue size. Most of the models consider traffic with Poisson arrivals and exponentially distributed service times. Unfortunately, when the traffic behavior does not consider Poisson arrivals and exponentially distributed service times, closed-form solutions are not always available or have high mathematical complexity. Based on Lindley’s recursion, this paper presents a fast simulation model for an accurate estimation of the performance metrics of G/G/1/K queues. One of the main characteristics of this approach is the support for long-range dependence traffic models. The model can be used to model queuing systems in the same way that a discrete event simulator would do it. This model has a speedup of at least two orders of magnitude concerning implementations in conventional discrete event simulators.

#### 1. Introduction

Traditionally, queuing theory applications are limited to systems with assumptions that derive in closed-form expressions. For example, most of them restrict the tractability of solutions or the tools available to achieve numerical results. Some features that impact tractability are queue size, arrival process, service time distribution, and queue service’s policy. In terms of queue size, the queuing systems solutions divided into infinite and finite queue size.

For practical reasons, real-world applications in manufacturing, transportation, communication, networking, and computation systems have finite queues [1, 2]. Moreover, classical works have demonstrated that the queue size has a significant impact on the performance metrics of queuing systems [1, 2].

In the same way, arrival processes can be loosely classified by the decay of its autocorrelation function in ones with Short Range Dependence (SRD), i.e., fast decay, and ones with Long-Range Dependence (LRD) or slow decay [3, 4]. Many applications consider Poisson arrivals, which are SRD and produce solutions with closed-form expressions [1]. However, real systems have an arrival process different from that of Poisson. For example, some applications exhibit Long-Range Dependence (LRD) traffic [3–6]. Therefore, it is desirable to have models for general queuing systems, such as the queue, which is generally difficult to analyze; e.g., the queue was studied through approximations that result in high mathematical complexity in [7, 8], while, in [9], asymptotic representations were used for the metrics of the system.

Service time distribution broadly classifies queuing systems in exponentially distributed (Markovian property, i.e., memoryless, ), heavy-tail distributions, among others. In the literature, there are exact solutions and approximations for the queue [2, 10, 11]. For example, the Poisson Pareto Burst Process (PPBP) was studied by their applications in real broadband traffic modeling [5, 6, 12]. In PPBP burst arrivals occur as Poisson processes with Pareto distributed length, and each burst is divided into small pieces that will represent the work in the system (packets, entities, users, etc.). Some interesting, theoretical, and simulation results for the PPBP model are presented in [5], where short and long bursts are divided in order to study the impact of each subprocess.

On the other hand, simulations can solve queuing systems with no closed expressions but, unfortunately, have high computation times in many cases. A fast discrete event simulation model for a priority round robin multiplexer based on Lindley-type recursions is presented in [13]. In [14], Lindley-type recursive representations for multiserver tandem queues are presented. The models presented both in [13, 14] have run times faster than those implemented in conventional simulations. This paper, based on Lindley’s recursion [15], proposes a fast discrete event simulation (FDES) model for the study of the queue. The model can accurately estimate the metrics of interest and can be applied in optimization problems [11, 16] even when the traffic model has LRD features. Additionally, the proposed model can easily be used to the performance analysis of queuing systems [11, 16, 17].

The rest of the paper is organized as follows. Section 3 introduces the queueing system model. First, the well known single server model with infinite queue size is presented. Later, based on the single server model for infinite queues, an algorithm for the single server model with finite queue size is presented. The performance analysis results for the queue with different traffic behavior are presented in Section 4. Finally, Section 5 gives some conclusions.

#### 2. Mathematical Preliminary

This section introduces a brief description of some analytical results for finite queuing systems.

##### 2.1. The M/M/1/K Queue

As mentioned in the previous section, there are few results for the queueing system. This subsection presents some average results for this kind of systems. Perhaps, the most studied queueing model is . For example, in [10, 17–19], there are results for the queueing systems. These results are the blocking probability, the expected number of entities in the system, and the average response time. Consider that the customers or entities arrive at the system with an average arrival rate and are served with an average service rate . The traffic intensity is defined as . From [19] we have the following:

The blocking probabilityThe effective arrival rate or throughput from the input side is given byand the expected number of entities in the systemwhile the expected number of entities in the queue

The average response time that is the expected sojourn time of an entity (in other words, the expected waiting time of an entity given that the entity must wait in the queue) can be derived from Little’s theorem and using equations (2) and (3) [19]:

##### 2.2. The Imbedded-Markov-Chain Method for M/G/1/K Queueing Systems

The method is based on the consideration that the M/G/1/K queueing system can be seen as an imbedded-Markov-chain observing the number of entities left behind upon the departure of an entity [2, 10, 18, 19]. Then we define the single-step transition matrix truncated to , aswhich implies that the stationary equation iswhere is the set of stationary probabilities at points of departures. The probability of arrivals during a service time, , is given bywhere represents the distribution of the service time. Table 1 shows results for for distributions of the service time exponential, Erlang-2 and deterministic. The blocking probability depends on the probability that an arrival finds the system full, and in general, the probability distribution of the system size encountered by an arrival (here ) will be different from as and the blocking probability is given by

##### 2.3. Long-Range Dependence and Self-Similarity Basis

Modern investigations of traffic measurements suggest that self-similar processes and Long-Range dependence (LRD) can be applied to the study and accurate modeling of network traffic [20, 21]. A desirable feature of modeling network traffic through self-similar processes is that the correlation structure is expressed in terms of a single parameter called the Hurst index or Hurst parameter. The self-similarity is defined through continuous and discrete stochastic processes [22]. In this work, the discrete self-similarity is used.

Let be a discrete stochastic process or discrete time series with mean , variance , and autocorrelation function , . Assume to be of the form as , where H is the Hurst index [23]. It is known that stochastic processes with present short range dependence (SRD), while stochastic processes with present LRD. If , neither SRD nor LRD are present; i.e., not presenting any dependency. This is the well-known property of white Gaussian noise [22]. When considering discrete time series, the definition of self-similarity is given in terms of the aggregated processes, as is shown in (11) [24]:

Figure 1 illustrates an aggregated process, where represents the aggregation level and is obtained by averaging the original series over nonoverlapping blocks of size , and each term is given by

Then it is said that is self-similar with Hurst parameter if [25]where denotes equality in distribution.

The variance of the aggregated time series is defined by equation (14) as follows [24]:

The plot versus is known as Variance Plot. It is a straight line of slope for self-similar processes. This plot is the basis of the variance based estimator of the Hurst parameter [22]. Several methods have been developed, then in order to estimate the Hurst parameter [22, 26]; in this work, the periodogram method is used [26].

#### 3. The Fast Discrete Event Simulation Model

This approach is based on difference equations based on Lindley’s recursion, in order to model the waiting and service times of a queuing system. Performance metrics such as mean service time and mean queue size can be calculated numerically from the data processed by the difference equations. This process is the same as any simulation software executes. However, this approach takes off all the unnecessary details needed for a full-scale simulation. In the Fast Discrete Event Simulation (FDES) approach we call every customer an entity because it can be a client, packet, call, etc. depending on the system modeled. However, FDES only takes into consideration the arrival and service times for every entity that comes into the system. Then FDES takes these two quantities and processes them into the recursive equations to obtain the waiting and departure times for every entity.

##### 3.1. The Single Server Model: Infinite Queue

The model consists of a single queue server with an infinite queue and operates according to the First-In-First-Out (FIFO) queuing discipline, as shown in Figure 2. Once an entity arrives into the system and the server is free, the entity is attended. If the entity finds the server busy, it is placed into the queue and waits to be served. In this model, the events of the system are represented by the arrivals and departures of the entities.

Let , , and denote the sequences that represent random variables of the arrival instant to the system, service time, and departure instant, respectively, of the th entity, . Here the entity may be interpreted as a packet or a customer. Furthermore, let () denote the sequence that represents the interarrival (interdeparture) time between the arrivals (departures) of th and . Entities arrive to the system with an average arrival rate and are served with an average service rate . The traffic intensity is defined as .

Figure 3 shows a timing diagram for an infinite queuing system. Observing the figure, the arrival of the th entity occurs at the time epoch, and this entity might have to wait for a random time before being served for a random service time . When the th entity is served, it departs from the system at the time epoch. Initially, the system is considered to be empty. The random sequences that describe the system-time behavior of the th entity can be obtained.

We define the random variable aswhere represents a stability variable since for a stable system the expectation of must be negative. This is . From Figure 3, we can see that the waiting time can be expressed in terms of the stability variable and the waiting time of the previous entity, as is shown in equation (16):A general representation for the waiting time that considers those entities that do not wait in the queue because the server is available results in Lindley’s recursion [1, 15]; thenEquation (17) can also be expressed as follows:The departure instants from the system for the th entity can be expressed asand the interdeparture time for the th entity is given by

From the bottom of Figure 3, the number of entities found in the system immediately prior to the arrival of (or the number of entities in the queue upon the arrival of ), denoted by , can be determined aswhere is the number of entities served between the arrivals of and . In order to determine , we need first to find . We propose Algorithm 1 in order to find the number of entities served between two consecutive arrivals.

Input: The last entity attended until the arrival of entity , at the begin ; | |

the -th arrival epoch ; the departures epochs previous to . | |

Output: Number of entities served between the arrival of and , . | |

1: for to do | |

2:if then | |

3: | |

4: | |

5:brake the for loop. | |

6:end if | |

7: end for |

The description of Algorithm 1 is as follows: First, the inputs of the algorithm are defined. is the index of the last entity served until the arrival of entity . Initially, it is considered that . The algorithm also requires the th arrival epoch and the departure epochs previous to . In line 1, the** for** loop sequentially searches the entities that have departed between the arrivals of and . In line 2, a comparison between and is necessary in order to know which of the entities have departed before the arrival of entity . If a departing entity is found, the number of entities between and , can be calculated (line 3), and the auxiliary variable is updated (line 4). Then, the loop is broken and the algorithm finishes.

##### 3.2. The Single Server Model: Finite Queue

In the previous section, we presented a system model for a single server with an infinite queue. Based on this model, we propose a model. In this queuing system, there is a finite amount of waiting room , which includes the room for the server (according to Kendall’s notation).

The model consists of a single queue server with a finite queue and operates according to the First-In-First-Out (FIFO) queuing discipline, as shown in Figure 2. The entities () arrive to the system with an arbitrary distribution. Once an entity arrives to the system, one of the following cases occurs: if the server is empty, an entity will be immediately served for a time that depends on an arbitrary distribution; if the server is busy and there is at least one place in the queue the entity will wait in the queue until all the entities that arrived previously have been served; if the queue is full, the entity will be blocked and cleared from the system.

In order to estimate the performance metrics of the queuing system, we propose Algorithm 2. In Algorithm 2, we first define the input and output variables. is the number of entities to consider in the model. Initially, the system is empty. Line 1 shows the initialization of the variables. In line 2, a loop** for** is utilized to evaluate the different variables of interest for each entity under consideration. In line 3, the indexes for the entities are updated so that the expression obtained for the queuing system in the previous section can be utilized without considering the blocked entities. When an entity arrives into the system, the number of entities in the queue is calculated (line 4). Notice that in (21) represents the number of entities found in the system immediately prior to the arrival of , so the distribution of the system size can be determined. Then, we ask for the availability of the queue (line 5); if there is at least one place in the queue, then the entity is accepted to be served and the waiting time in the queue and the departure epoch of the entity is calculated (lines 6 and 7). Index is updated only for nonblocked entities (line 8); if the queue is full upon the arrival of entity , then the entity is blocked and a counter to determine the number of blocked entities is increased (line 10).

The blocking probability , that is, the probability that an arriving entity finds the queue full, can be estimated through the distribution of or simply by the relation of and . The average waiting time can be estimated as follows:

#### 4. Results

This section presents numerical results for the performance analysis of the queue. Additionally, the FDES runtime is analyzed and an estimation of the time complexity is presented.

##### 4.1. Performance Analysis Results

Exact (or theoretical) results are obtained as follows (unless otherwise stated): the queue is solved through the method of embedded Markov chains; the ( refers to deterministic) queue estimation was generated according to the numerical approximation in the computer program McQueue [27]; the was generated from the numerical approximation presented in [7]. The average service rate is fixed to 1 second. For the simulation and the FDES model, independent and identically distributed (i.i.d.) Gamma random variables were used to model nonexponentials, such as the 2-stage Erlang, labeled as . In the representation of -x, x = refers to theoretical (t), simulation (s), and FDES model (f). The simulations were implemented in SimEvents of Mathworks Matlab and in OMNET++. Both simulations and the FDES model were fed with the same traces. An Intel core i7 2600 @ 3.4 GHz computer with 8 GB of RAM running on a 64-bit Windows 10 was used to obtain model outcomes.

Table 2 shows results for the average number in the system (), the average number in the queue (), the probability that there is no customer in the system (), the blocking probability (), and the runtime. It can be observed that the c FDES model accurately estimates the metrics with respect to the theoretical results and the results of the simulation and the FDES model are exactly the same since the models are fed with the same traces. However, the FDES model is two orders of magnitude faster than the simulation implemented in SimEvents.

Likewise, Table 3 shows results for the queue considering traffic with different burstiness. Markov Modulated Poisson Processes (MMPP) as arrival processes were considered in order to simulate LRD traffic. Exponential service times were considered and the average service time is fixed to 1 second. As the table shows, both simulation and FDES metrics agree. Although Markov Modulated Poisson Processes (MMPP) cannot exhibit LDR features, in this paper they are used as an emulation of LDR such as in [19, 28]. The purpose of the aforementioned is to specify different sources of traffic. The Hurst parameter () was estimated using the periodogram method [29, 30]. Markov Modulated Poisson Processes with two states were considered. The state transition rates and , for states 1 and 2, were 0.001 and 0.01 transitions/seconds, respectively. The arrival rate in state 1 is , and the arrival rate in state 2 is , where is the burstiness. Notice that corresponds to a simple Poisson process (nonmodulated).

Figures 4 and 5 show the average waiting time in the queue versus the traffic intensity and the blocking probability for diverse values of the capacity , respectively. Different queuing systems and 30 realizations of the fast simulation are considered. Results show that the FDES model perfectly matches the theoretical outcomes; this is an expected result since the FDES model is a fast simulation based on Lindley’s formula.

Table 4 shows results for the FDES and simulation considering Pareto Modulated Poisson Processes (PMPP), labeled as , and pure Pareto arrivals processes, labeled as , both of them considering exponential service times. For the mean of the Pareto distribution is and the variance is infinite. In the Pareto distribution, denotes the thickness of the tail of the distribution and the minimum value of the random variable [5, 12]. On-Off Pareto Modulate Poisson Processes can be used to model LDR traffic [31]. In the same way, pure Pareto arrivals exhibit self-similar properties. The Hurst parameter can be calculated as , where denotes the thickness of the tail of the Pareto distribution [31]. The state transition rates and , for states* on* and* off*, were 0.001 and 0.01 transitions/seconds, respectively. Besides, we consider that the thickness parameter is . As can be observed, from Table 4, the FDES model and simulation agree perfectly. Furthermore, for high Hurst parameter, the worst performance is shown, but the pure Pareto traffic model presents a worse performance than the PMPP.

Figures 6 and 7 show the performance for the pure Pareto arrivals queue, , considering the FDES model. In both figures, the well-known model is used as a reference for comparison purposes, and, theoretically, the exponential arrival processes have a Hurst parameter of 0.5. Figure 6 shows the blocking probability versus the buffer capacity for different values of the Hurst parameter. As the capacity is increased, the blocking probability is reduced, which is expected since the buffer size increases. Besides, notice that for a higher Hurst parameter we obtain a higher blocking probability. This last result is also expected since the Hurst parameter represents a measure of the self-similarity and the Pareto distribution is heave tailed with self-similarity behavior. On the other hand, Figure 7 shows the average waiting time in the queue versus the buffer capacity. Notice that, as the capacity increases, the average waiting time in the queue increases too. Again, the worst case for the waiting time occurs for a high Hurst parameter. Both in Figures 6 and 7 the keeps below the curves that have values of the Hurst parameter greater than 0.5, while those curves that are near to that value of have a performance similar to the model. This kind of performance analysis can be accomplished through the utilization of the FDES model.

Other interesting results can be observed when analyzing a type of process similar to the PPBP model. Consider that, in PPBP, burst arrivals occur as Poisson processes with Pareto distributed length. In PPBP a burst is divided into small pieces at a rate . These kinds of systems are used to model real traffic where the buffer size is fixed and the service time is constant [5, 6]. The methodology presented in this work does not represent an exact PPBP model since Algorithms 1 and 2 are not designed to handle the partition of bursts into small entities. However, the FDES queue model could study the performance analysis of the model. In the model burst arrivals occur as Poisson processes, the service times are Pareto distributed random variables, there is a single server, and the queue size is . The model and the PPBP model are similar in the sense that the burst duration (for PPBP) and the service time (for ) are Pareto distributed and both of them represent the amount of work that the server most processes.

Figures 8 and 9 show the performance for the queue considering the FDES model. As before, there are some results for different values of the Hurst parameter. Results for the Pareto and the truncated Pareto distribution (labeled as ) are presented. The truncated Pareto distribution is considered to limit the maximum duration of the service time. For the case of a PPBP model, the truncated Pareto distribution limits the maximum bust duration. In the case of truncated Pareto service times, a limit of the service time of 1000 was considered. Additionally, a mean service time of 1 second and a traffic intensity () of 0.6 were considered. Figure 8 shows the blocking probability versus the buffer size for different values of the Hurst parameter. As the buffer size is increased the blocking probability is decreased. When the Hurst parameter is increased the blocking probability is increased too. Long service time can occur due to the infinite variance of Pareto distribution, which contributes to system instability. Indeed, as increases the performance is worst and, for each realization of the simulation, there will be different results. On the other hand, when a truncated Pareto distribution is considered, the system tends to stability, which is expected since the system does not consider long service times [5]. A similar result is presented in Figure 9, where the waiting time in the queue versus the buffer size is analyzed. Notice that for high values of the waiting time increases and the system tends to the instability. However, when the service time is limited, the system tends to stability. Of course, it is necessary to determine the limit value, such as what was done in [5], in order to ensure the stability of the system.

##### 4.2. Runtime

Finally, the average runtime for the different models was estimated and was shown in Figure 10. For these results, the following assumptions were considered: the traffic intensity is fixed to 0.6; the number of simulated entities was ; and the considered models are the FDES model, the model implemented in OMNET++, and the model implemented in SimEvents. Although the theoretical model implemented with imbedded Markov chains does not depend on the number of simulated entities, it was also considered for the runtime results. The runtime was averaged over 30 realizations. Figure 10 shows that the runtime for the FDES model is at least one order of magnitude faster than the model implemented in OMNET++ and at least two orders of magnitude faster than the model implemented in SimEvents.

On the other hand, Figure 11 shows results for the runtime as a function of the number of simulated entities and the buffer capacity . Again, the traffic intensity is fixed to 0.6 and the number of simulated entities was . The runtime for different queue models, using the FDES model, was analyzed: the , , , and . The label in Figure 11 corresponds to simulations that do not consider the runtime of the generated traffic sequences. From Figure 11, it can be observed that buffer size does not have an impact over the runtime, at least for moderated values of the number of simulated entities. Nevertheless, the runtime is considerably affected by the number of simulated entities, as can be observed in Figure 11. Furthermore, the runtime of the FDES is not considerably affected by the implemented model, as can be shown in Figure 11. Data from Figure 11 has been processed utilizing genetic programming to best fit the curve [32]. The worst cases represented by and have been considered. The resulting fitting model for the runtime is approximated as a polynomial expression given as and , for and , respectively. The R-squared goodness of the polynomial fit was 0.9999 in both cases, which indicates a good fit to the curves. The dominant term in both expressions is , but it is attenuated by a coefficient of the order of . Therefore, considering that the maximum value of in the simulation was , the time complexity could be approximated to . Moreover, Algorithm 1 has a computational complexity of , whereas the computational complexity for Algorithm 2 is for . When the computational complexity is approximately , which corresponds to the aforementioned time complexity. Therefore, since the computational complexity is , the FDES model is a good option to study the performance analysis of queuing systems.

#### 5. Conclusions

Based on Lindley’s recursion, a fast simulation model for the performance analysis of the queue was presented. Different traffic sources can be considered including LRD traffic. The model can accurately estimate the main metrics of the queuing system quickly compared with the models implemented in OMNET++ and SimEvents of Mathworks. The model has a speedup of at least two orders of magnitude with respect to implementations in the aforementioned discrete-event simulators. Our model can be exploited in the performance analysis of queuing systems embedded in optimization problems.

#### Data Availability

No data were used to support this study.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.