Abstract

In Internet traffic modeling, many authors have presented models based on particular fractal shot noise representations. The drawback of these approaches is the multitude of assumptions and the lack of tools to check them. In this paper we propose a unified model based on a general Poisson shot noise representation for the cumulative input process (CIP). We present a procedure for approximating this process and then give a procedure for controlling the bandwidth of Internet providers. The approximation and control rest on limit theorems for functionals of the CIP, namely, the supremum process, the right inverse, and the storage mapping.

1. Introduction

The aim of this paper is to present a method for approximating the traffic that accumulates in an Internet server. We suppose that a unique server deals with an infinite-sized source that sends data to the server over independent transmissions arriving according to a stationary Poisson process. We are interested in the cumulative input of work to the server (also called total accumulated work). This stochastic process, generated by the traffic of the transmissions over an interval of time, is denoted by and called the cumulative input process (CIP for short). The term work can be understood as the portion of bandwidth that has to be allocated in order to evacuate the traffic. We use the term transmission on purpose, because the approximation we propose can be applied to the traffic of upstream and/or downstream packets, or even to more global objects related to the traffic, such as downloads from the Internet. The approximation we propose works for a fixed, single type of transmission. Different types of transmissions simply induce the superposition of their corresponding CIPs.

In the specialized literature, the common practice is to assume a mechanism for the evolution in time of the CIP [16]. The drawback of these assumptions on the evolution in time is that they are nearly impossible to check.

In [7], we presented a model based on a general Poisson shot noise representation without imposing any such mechanism. With minimal technical assumptions we showed that the CIP can be approximated, in a certain sense, by a nicely tractable stochastic process, namely, a stable process with only positive jumps. The sense is the weak convergence in law of a modified version of the CIP to the stable process. This mode of convergence is extremely useful because it allows some functionals of the CIP to converge to the appropriate functionals of the stable process. This is the key to forecasting the bandwidth allocation via stable processes. A widespread literature is devoted to stable processes, and powerful statistical methods are available; see [8]. The approximation procedure for the CIP is summarized as follows: within a large enough interval of time, we observe a certain number of transmissions, ordered by arrival. In Internet traffic each transmission can be traced by its related packets, and several data can be extracted for each transmission. We showed in [7], with Theorem 1 stated below, that the nice approximation of the CIP holds weakly in law with limit , a strict stable process totally skewed to the right, for some correcting terms , . The latter holds under these three easily checkable properties:

(1) the arrivals of the transmissions form a Poisson process (possibly with intensity increasing at large time scales);

(2) the size of each transmission is heavy tailed with infinite variance. In probabilistic terminology, the right tail of its distribution is regularly varying with index for some . This heaviness parameter gives the index of self-similarity of the stable process;

(3) the length of each transmission has finite expectation (alternative weaker assumptions are feasible).

A comparative study with several of the existing models is also provided in our paper [7] and illustrates why our assumptions are weaker and more tractable.

In the present paper we start in Section 2 by presenting the model for the CIP. In Section 3 we give its corresponding limit theorem. In Section 4 we give an approximation procedure for functionals of the CIP, namely,

(1) the supremum process, corresponding to the maximum input of work;

(2) the right inverse, corresponding to the first passage time process over critical barriers;

(3) the storage mapping, corresponding to the process solution of a storage equation.

We will see that we obtain limit theorems similar to those obtained in [7] and that no additional assumption is needed for achieving this goal. The following result is stated in Theorem 3 under the same formalism as Theorem 1: take any of the mappings behind the previous functionals. Then, there exist correcting terms , and a companion mapping of such that Section 5 is devoted to the proofs. The tools we use there are based on the monograph of Whitt [9].

2. The Model

Through all the following, we assume that a unique server deals with an infinite sized source. Transmissions arrive to the server according to a homogeneous Poisson process on labeled by its points so that and hence are i.i.d. exponentially distributed random variables with parameter is considered as the time of initiation of the th transmission.

Let the counting measure and define the Poisson process by The quantity represents the number of transmissions started between time and time . We are interested in the cumulative input process (CIP) generated over an interval of time and denoted by . It corresponds to the size of the files transmitted by the source. There are many ways to model it (from the most trivial way to the most sophisticated). Specification of the source behavior could be taken into account adding more and more parameters. In order to avoid this intricacy, most of the authors [26] have a macroscopic approach strongly connected with times of initiation of the transmissions, their duration, and their rate. As we will see, this paper confirms the pertinence of this approach, and we show that it is sufficient for having the required control on the cumulative input process.

Our aim is to describe, in the most general setting, the CIP and to give an approximation of the law of functionals of this process. Notice that the CIP describes the work generated over the interval . Time is when our “observation starts” and is when our “observation finishes.” Observe that times of initiation of transmissions are either positive or negative (after or before our observation starts). The th transmission starts at time and continues over the interval of time .

Suppose we had observed the transmissions from the infinite past until time and we want to calculate the work generated by the th transmission. This work holds over the random interval . The length of this interval is the r.v. We deduce that the work generated by the th transmission is given by a quantity which depends on the length . We will denote this work by where is a stochastic process; the random variable is an increasing function of , vanishing if and describing the quantity of work that could be generated by the th transmission over an interval of time of length .

If we had observed only over the interval (instead of ), the work that could be generated by the th transmission should be written as the difference Notice that is equivalent to . Because of the above considerations, we propose the following model describing the CIP over by a “moving average type”: The processes are assumed to be an i.i.d. sequence independent of the arrival process . For each , the process has right continuous left limited sample paths, vanishing on the negative real axis and increasing to a finite r.v. .

We will see in the sequel that the asymptotic distributional behavior of the process is the same as the one of its “finite memory” part which is a Poisson shot noise (see [10, 11]): Observe that the process has stationary increments, while has not. The process is of a special interest because it only takes into account the transmissions started after time . The special structure of the process calls some other comments. The problem is that at any fixed time we can not “see” if the th transmission has finished or not and what is more we are unable to calculate during the time the accumulated work . The only available information is the quantity which is the total work required by the th transmission.

It is then natural to introduce the process which characterizes the total work required by all the transmissions started within the interval . This process enjoys a very special property: it is a Lévy process; that is, it has independent and stationary increments, and it is a compound Poisson process (see the appendix and [12] for more account on Lévy processes). This process turns out to be the principal component of the processes and , a component which will give the right approximation by a stable process, as stated in Theorem 1 below. Another special process is defined by This process is special because it is stationary. Contrary to the others, its definition is problematic. Namely, the contribution of the past (before time 0) could make the random variables be infinite. The same problem can occur for the process , but it is actually finite under our assumptions. The processes and are always well defined because they are finite sums. This is the reason why we will only consider the processes , , and . These processes are increasing. Because these processes are closely connected to the load of the transmissions, we will call them load processes. We will see in the next section that, when adequately modified, they satisfy a weak limit theorem and share a common limiting process, which is a stable process totally skewed to the right.
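The total work process described above can be illustrated by a small simulation. The following is only a sketch under stated assumptions: the function names, the Pareto law for the sizes, and the numerical values (rate 5, horizon 100, tail index 1.5) are illustrative choices and do not come from the paper.

```python
import math
import random


def simulate_arrivals(rate, horizon, rng):
    """Arrival times of a homogeneous Poisson process on (0, horizon]."""
    times = []
    t = rng.expovariate(rate)          # i.i.d. exponential inter-arrival times
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def pareto_size(alpha, rng):
    """Assumed size law: P(size > x) = x**(-alpha) for x >= 1."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)


def total_work(arrivals, sizes, t):
    """Total work required by all transmissions started in [0, t]."""
    return sum(s for a, s in zip(arrivals, sizes) if a <= t)


rng = random.Random(0)
arrivals = simulate_arrivals(rate=5.0, horizon=100.0, rng=rng)
sizes = [pareto_size(1.5, rng) for _ in arrivals]   # infinite variance for alpha < 2
path = [total_work(arrivals, sizes, t) for t in range(0, 101, 10)]
print(len(arrivals), path[-1])
```

The sampled path is a compound Poisson sum, hence nondecreasing, and its increments over disjoint intervals are independent by construction.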

3. Forecasting the CIP

We present here the main result obtained in [7]. This result may appear highly theoretical, but the assumptions under which it holds are actually easy to check on real Internet data. We state the result in its theoretical form and then indicate how best to exploit it.

3.1. Technical Assumptions for the Approximation of the CIP

The next theorem shows that the load processes , or , after being correctly drifted and normalized, are approximated in law by a strict stable process totally skewed to the right (see [7, 12] for more account on stable processes). The main keys for proving these results were

(1) the infinite divisibility property of the load processes ;

(2) the stationary increments property of the CIP and the fact that it shares the same finite-dimensional expectations as the total work process in case is finite;

(3) the stationary and independent increments property of the process .

3.1.1. The Assumptions

Recall that the processes are i.i.d. Let and the stopping time defined by The r.v. and are actually versions of the size and the length of any transmission.

3.1.2. Behavior of the Size of a Transmission

The r.v. is finite and has a regularly varying tail of index , with ; that is, there exists a deterministic increasing function , such that Condition (13) is equivalent to where is a slowly varying function (i.e., for all , ). Actually, the function is simply the quantile function , that is, the generalized inverse of the function . It is known that for some slowly varying function and we can always choose an increasing version of  . For more account on regular variation theory, the reader is referred to [13].
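As a worked example of the scaling function, consider a pure Pareto size. The sketch below assumes that the tail condition takes the usual form "t times the tail probability at level a(t)·x tends to x to the power minus alpha"; this form, the Pareto law, and the function names are assumptions made for illustration only.

```python
import math


def pareto_tail(x, alpha):
    """Assumed tail: P(X > x) = x**(-alpha) for x >= 1, and 1 below 1."""
    return 1.0 if x < 1.0 else x ** (-alpha)


def pareto_scaling(t, alpha):
    """Quantile-type scaling a(t) solving P(X > a(t)) = 1/t, for t >= 1."""
    return t ** (1.0 / alpha)


alpha = 1.5
ratios = []
for t in (10.0, 1e3, 1e6):
    a_t = pareto_scaling(t, alpha)
    # For a pure Pareto tail the limit relation is exact at every scale t.
    ratios.append(t * pareto_tail(2.0 * a_t, alpha))
print(ratios, 2.0 ** (-alpha))
```

For a pure Pareto law the slowly varying factor is constant, so the scaling function is an exact power of t with exponent 1/alpha; for other heavy-tailed laws the relation holds only in the limit.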

3.1.3. Assumption on the Arrivals

Notice that the intensity parameter of the Poisson arrival process is not necessarily fixed. It may depend on a scale and may go to infinity with as studied by Maulik and Resnick [14] and Kaj [1]. Throughout what follows, we are in the situation where is fixed or is increasing to a value when goes to infinity.

3.1.4. Technical Assumptions on the Length of the Transmissions: Connection with the Intensity and with the Size

Actually, at large time scales, we do not really need to distinguish between the two cases of finite intensity ( or ) and infinite intensity (), with the latter being obviously more technical. We will only need this assumption: In case of finite intensity at large scales, or , it is also obvious that condition (15) is equivalently expressed with . Moreover, in this case, it is simply implied by . See [7] for more comments on assumption (15).

Theorem 1 presented below is proved in [7]. It generalizes many existing models (see [7] for a comparison), treats the complete infinite variance case, and provides more powerful approximations (weak convergence in law) with fewer conditions than are usually assumed in the literature. It states that the processes , , and , when correctly drifted and normalized, are attracted in law by a common non-Brownian -stable process (the case corresponding to standard Brownian motion will be explicitly excluded in the sequel).

3.2. Importance of the Mode of Convergence

The convergence in Theorems 1 and 3 holds in the weak mode in the space of càdlàg functions endowed with the -Skorohod topology. We will not go into details about Skorohod topologies; we just say that, according to [15, 3.20 page 350], a sequence of stochastic processes is said to converge in law to a process , and we denote if and only if the limit process is well identified via finite-dimensional convergence; that is, for all and we have convergence in distribution of the random variable and the sequence of processes must be tight, where tightness is a technical criterion (strongly related to the modulus of continuity of the topology) ensuring the existence of the limit. This paper mostly uses the powerful tools on topology presented in the book of Jacod and Shiryaev [15]. This topology is nicely tractable, since many important functionals are continuous and preserve convergence. This is the subject of the next section. We stress that all convergences in Theorem 3 also hold in the space .

3.3. Approximation of the CIP by a Stable Limit

Stable processes and their functionals are widely studied in the probability theory literature (see [8]). We recall that a stable process totally skewed to the right is defined as a Lévy process (see [12] for more account) with no negative jumps, whose Laplace transforms are given for every by : where is the index of stability and

Theorem 1. The processes , ,  and are attracted in law by a common stable process [7].
Assume (13) and (15). Let , , or and be a strict -stable process totally skewed to the right. Define for each where the function is given by (13), the function is defined by and the function is given by (19). Letting , one gets the weak convergence in law

4. Forecasting the Bandwidth

The main idea of this work is to extend convergence (22) and give its equivalent for functionals of the process . The approximated functionals make it possible to forecast the bandwidth allocation needed to avoid congestion. For this purpose, we present three natural and important functionals that are continuous on and also weak-convergence-preserving there. Theorem 3 provides limit theorems for functionals of the processes , , and .

4.1. Some Important Continuous Mappings on the Space

Let the subset of processes that are in and are increasing, the subset of processes in in that are unbounded above and null in 0 and the subset of processes in that are in and satisfy . Let a process in and consider for all :

(1) The right inverse or first passage time process is defined by The random variable is simply the first time that the stochastic process crosses some critical barrier and then describes the congestion time.

(2) The supremum and the infimum processes and are defined by The r.v. and are, respectively, the maximum and the minimum of the values of the process over the interval of time and then, they describe the extreme workloads.

(3) The reflection mapping process is defined by , where is given by so that .

More can be said about the reflection mapping. The pair of processes is said to solve the Skorohod problem associated with the process with , is nondecreasing, , and the reflection condition holds. See [16, page 375] for more details on the Skorohod problem. To illustrate this problem, consider a server dealing with an input process and serving at a constant rate . Then, the buffer content process, denoted by , has positive paths and solves the following storage equation with service rate  : Rewriting the storage equation with , it is easy to see that which corresponds to the reflection mapping of the drifted process . For this reason we call the storage mapping, which coincides with when .
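The three functionals can be sketched on a path sampled on a regular time grid. This is a discrete-time illustration only: the function names are hypothetical, the reflection is written in the standard one-sided form "path minus the running minimum of the path and 0" (an assumption about the displayed formula), and the sample values are made up.

```python
def first_passage_index(path, level):
    """Index of the first sample strictly above `level`, or None if never crossed."""
    for i, value in enumerate(path):
        if value > level:
            return i
    return None


def running_supremum(path):
    """Running maximum of the sampled path."""
    out, best = [], float("-inf")
    for value in path:
        best = max(best, value)
        out.append(best)
    return out


def reflect(path):
    """One-sided reflection at 0: x(t) - min(0, inf of x over [0, t])."""
    out, low = [], 0.0
    for value in path:
        low = min(low, value)
        out.append(value - low)
    return out


def buffer_content(cum_input, rate, dt):
    """Storage mapping: reflection of the drifted path (input minus rate * time)."""
    drifted = [a - rate * i * dt for i, a in enumerate(cum_input)]
    return reflect(drifted)


cum_input = [0, 3, 3, 3, 8, 8]            # made-up cumulative input samples
print(first_passage_index(cum_input, 5))  # 4
print(running_supremum([0, 1, -1, -3, 0, -2]))        # [0, 1, 1, 1, 1, 1]
print(buffer_content(cum_input, rate=2.0, dt=1.0))    # [0.0, 1.0, 0.0, 0.0, 3.0, 1.0]
```

On this grid the buffer content agrees with the usual recursion "previous content plus new input minus service, floored at 0", which is a quick sanity check on the reflection formula.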

4.2. The Result

Having justified the importance of such functionals of the CIP, we will apply them on , , or . We will show that, correctly normalized as in (22), these functionals of are approximated by a companion functional of the limiting -stable process . Of course the approximation depends on the index of stability . This fact is explained by Theorem 3.

Recall the functions ,  ,  and given, respectively, by (13), (21), and (19). When , we obviously have

Lemma 2. Limit (28) is also true for .

For readability, the proofs of the last lemma and of the following theorem are postponed to Section 5. Now, recall that the first passage time process is given by (23), the supremum is given by (24), and the reflection mapping process is given by (25).

Theorem 3. Functionals of the CIP are attracted in law by functionals of the stable process.
Assume (13) and (15). Letting one gets the following weak convergence.

(1) First passage time: One has where the quadruple formed by the correcting functions , ,  and and the process is given by

(2) Supremum: One has

(3) Storage mapping: choose such that admits a limit . If is the storage functional corresponding to the rate , then where

Remark 4. (i) Recall that the intensity is allowed to depend increasingly on a scale parameter . Then, as , we have and so does its right inverse . Finally, replacing in (29) with and taking the compositions , we equivalently have
(ii) When the corresponding -stable process is increasing, null in 0, and then is equal to its supremum process. When and , the reflected process of is also equal to itself.
(iii) With similar arguments, Theorem 3 can be generalized and stated for other quantities of interest such as the infimum functional. One could also consider the level of congestion (at time of congestion) : Let be a critical value fixing the quality of service of the Internet provider. The composed quantity determines the load reached when the buffer content crosses the critical value .

4.3. How to Forecast?

Hopefully, the reader is now convinced that only three facts are relevant for approximating the CIP.

(1) The first assumption to test is that of Poisson arrivals of the transmissions. After validation, we need to estimate the intensity . This assumption is to be expected, since the behavior of many individuals acting independently is often well modeled by a Poisson process. It may fail in the presence of transmissions triggered automatically by machines. Further work is needed to compare the proportions of human and machine transmissions. If the proportion of the second kind is negligible compared to the first, one could validate this assumption. Some statistical methods for testing the Poisson assumption are presented in the well-documented review of Resnick [17].

(2) Boundedness of the length of transmissions is obvious for Internet traffic, since this quantity usually has a magnitude of a microsecond. This validates assumption (15).

(3) The parameter giving the index of self-similarity can be easily computed provided that each transmission (labeled by the number ) captured over an interval of time of observation large enough gives the total work it requires (the payload one can extract from headers) denoted by . Since these sizes form an i.i.d. sequence of r.v. then plotting a Hill estimator is straightforward (see [17]).

(4) Since the length of transmissions has the magnitude of a microsecond, an observation over a few minutes would be sufficient. Theorem 1 shows that for large time the distribution of the accumulated work over the interval is distributed as The quantity is simply the quantile function of . The function given in (21) is available after the previous steps and is an -stable r.v. totally skewed to the right.

(5) Now, the problem of bandwidth allocation can be tackled. Notice that Theorem 3 uses functionals like first passage time, opposite and reflection mapping of the stable processes. There is a widespread literature devoted to stable distributions and processes (see the encyclopedic web page of Nolan [8]). We recommend the monograph of Zolotarev [18] (for finite dimensional properties), the book of Samorodnitsky and Taqqu [19], and the one of [12] for further trajectorial properties.

Let the input processes take the bandwidth quantity (time or critical level depending on the functional) big enough and consider the random variables , , and .

(a) The first time the traffic accumulates over the critical bandwidth level is the quantity . It is approximated in distribution by The functions , given in (4) are easily computed from and .

(b) Here is the time. is the maximum of the input over the interval of time and has the same approximation in distribution as the proper input :

(c) Here is the time. is the buffer content when the service is delivered at a rate . It is approximated in distribution by
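Steps (1) and (3) of the procedure above can be sketched on synthetic data. This is a minimal sketch, not the author's implementation: the function names, the choice of the number of upper order statistics `k`, and the synthetic Poisson/Pareto trace (rate 5, tail index 1.5) are assumptions made for illustration. In practice one would plot the Hill estimate against `k` rather than fix a single value.

```python
import math
import random


def estimate_intensity(arrival_times, horizon):
    """Maximum-likelihood estimate of a constant Poisson intensity on [0, horizon]."""
    return len(arrival_times) / horizon


def hill_estimator(sizes, k):
    """Hill estimate of the tail index from the k largest observations."""
    ordered = sorted(sizes, reverse=True)
    if not 0 < k < len(ordered):
        raise ValueError("k must satisfy 0 < k < number of observations")
    threshold = ordered[k]                      # (k+1)-th largest value
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess


# Synthetic trace standing in for captured transmissions (assumed values).
rng = random.Random(1)
true_rate, true_alpha, horizon = 5.0, 1.5, 1000.0
arrivals, t = [], rng.expovariate(true_rate)
while t <= horizon:
    arrivals.append(t)
    t += rng.expovariate(true_rate)
sizes = [(1.0 - rng.random()) ** (-1.0 / true_alpha) for _ in arrivals]

rate_hat = estimate_intensity(arrivals, horizon)
alpha_hat = hill_estimator(sizes, k=len(sizes) // 10)
print(rate_hat, alpha_hat)
```

Both estimates should land close to the values used to generate the trace; on real data the stability of the Hill plot over a range of `k` is what supports the heavy-tail assumption.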

5. The Proofs

Proof of Lemma 2. Write for We have and as and then . Now, use the fact that is decreasing on and write Performing an integration by parts and a change of variable in the r.h.s., write By (13), and the difference of the first two terms in go to 0. For every , we have which is a bounded quantity, uniformly in . Using again (13), write Finally, for arbitrary , arrive at and let .

Proof of Theorem 3. The procedure goes by adapting some tools one can find in the monograph of Whitt [9]. The key is to exploit the Skorohod representation theorem: If weakly converges to in , then there exist other random elements of , , and , defined on a common underlying probability space, such that (for every ), , and (when ). The idea is to consider the situation of a central limit theorem such as (22), where we want to say something about the limits of functionals of or in (22).
Preliminary of the Proof. There are two procedures that allow one to approximate functionals of the processes , and introduced in the model.
(1) Weak convergence is trivially preserved by applying continuous mappings. This follows directly from the definition of weak convergence: if weakly (resp. almost surely) converges to in and is a continuous mapping on , then Continuous functionals of the centered process in (22) are treated according to procedure (44).
(2) Weak convergence in a central limit theorem is preserved by applying “nice” mappings. We emphasize that this second procedure deals with functionals of the noncentered process and not . Assume that we have a central limit theorem where and when . Rewriting the limit, we have that with being the identity process. Since , it is immediate that and then, by procedure 1, convergence is preserved by continuous mappings, but this is not our aim. We go further with the Skorohod representation theorem: there exists a version of of and a version of of such that We wish to study “nice” mappings allowing with some companion mapping . The procedure is finished whenever the mapping is well identified. Finally, “nice” mappings (with companion ) will be called weak-convergence-preserving on since they satisfy what we are asking for:
Completion of the Proof. Recall that if and, by Lemma 2 (28), The proof relies on the two preceding procedures and on key homogeneity relations: first, if and , then and second, the identity process is preserved by the three mappings: , , and .
(1) The inverse functional is Lipschitz (Corollary 13.6.3 [9]). When obtain (29) by applying the continuous inverse mappings on and using the homogeneity relation (51). The inverse functional is weak-convergence-preserving in the sense of (49) with companion (Theorem [9]). By rewriting and using the homogeneity relations (51), we get convergence (29) for .
(2) The supremum functional is Lipschitz (Theorem 13.4.1 [9]). When , the corresponding -stable limit process is increasing and then is equal to its supremum. We obtain (31) for by applying the continuous supremum mappings on and using the homogeneity relation (52). The supremum functional is weak-convergence-preserving in the sense of (49) with companion (Theorem 13.4.2 [9]). Rewriting as in (55) and using the homogeneity relations (52), we get convergence (31) for .
(3) The storage functional is seen as the reflection mapping applied to the drifted process . The reflection mapping is Lipschitz (Theorem 13.5.1 [9]). A trick for the storage functional is needed: rewrite in the form which we know, by Theorem 1, to have a stable limit. We get convergence (32) for by these two cases.
(i) Case : a Slutsky argument on (58) shows that We obtain (32) with limit by applying the continuous reflection mappings on and using the homogeneity relation (53).
(ii) Case : we use the fact that the supremum functional applied on is weak-convergence-preserving in the sense of (49) with companion (Theorem 13.5.2 [9]) together with the homogeneity relations (53).

6. Prospects

As we have seen, under the assumption of Poisson-like arrivals and heavy tail of the load of transmissions, we were able to approximate many characteristics of the traffic. As a continuation of this work, we aim to study the situation of intermediate (of the magnitude of a second) and small time (of the magnitude of a microsecond) behavior of the CIP. This is another story.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This project was supported by King Saud University, Deanship of Scientific Research, College of Science Research Center. The author is grateful to the anonymous referees for their valuable comments that improved the presentation of this paper.