Journal of Applied Mathematics

Journal of Applied Mathematics / 2012 / Article

Research Article | Open Access

Volume 2012 |Article ID 831909 | https://doi.org/10.1155/2012/831909

Michael J. Neely, "Stability and Probability 1 Convergence for Queueing Networks via Lyapunov Optimization", Journal of Applied Mathematics, vol. 2012, Article ID 831909, 35 pages, 2012. https://doi.org/10.1155/2012/831909

Stability and Probability 1 Convergence for Queueing Networks via Lyapunov Optimization

Academic Editor: P. G. L. Leach
Received15 Aug 2011
Revised17 Mar 2012
Accepted11 Apr 2012
Published19 Jul 2012

Abstract

Lyapunov drift is a powerful tool for optimizing stochastic queueing networks subject to stability. However, the most convenient drift conditions often provide results in terms of a time average expectation, rather than a pure time average. This paper provides an extended drift-plus-penalty result that ensures stability with desired time averages with probability 1. The analysis uses the law of large numbers for martingale differences. This is applied to quadratic and subquadratic Lyapunov methods for minimizing the time average of a network penalty function subject to stability and to additional time average constraints. Similar to known results for time average expectations, this paper shows that pure time average penalties can be pushed arbitrarily close to optimality, with a corresponding tradeoff in average queue size. Further, in the special case of quadratic Lyapunov functions, the basic drift condition is shown to imply all major forms of queue stability.

1. Introduction

This paper considers Lyapunov methods for analysis and control of queueing networks. Landmark work in [1, 2] uses the drift of a quadratic Lyapunov function to design general max-weight algorithms for network stability. Work in [3, 4] extends this result using a drift-plus-penalty algorithm that leads to joint stability and penalty optimization. The results in [1, 2] are developed for systems that evolve on ergodic Markov chains with countably infinite-state space. The results in [3, 4] apply to more general systems but are stated in terms of time average expectations, rather than pure time averages. Time average expectations can often be translated into pure time averages when the system has a Markov structure and some additional assumptions are satisfied, such as when the system regularly returns to a renewal state. However, these additional assumptions often fail and can be difficult to verify for particular systems of interest. This paper seeks to provide simple conditions that ensure desirable queue sample paths and desirable time average penalties with probability 1.

First, a basic drift condition for a general class of Lyapunov functions is shown to imply two weak forms of queue stability, called rate stability and mean rate stability. Next, this paper focuses on quadratic and subquadratic Lyapunov functions. Using the law of large numbers for martingale differences, a simple drift-plus-penalty condition is shown to ensure queue stability together with desirable time averages of an associated network penalty function. In the special case of quadratic Lyapunov functions, it is shown that the basic drift condition implies all major forms of queue stability, including rate and mean rate stability, as well as four stronger versions. These results require only mild bounded fourth moment conditions on queue difference processes and bounded second-moment conditions on penalty processes and do not require a Markov structure or renewal assumptions. Two simple examples are given in Appendix A to show what can go wrong if these bounded moment conditions are violated.

Finally, these results are used to design and analyze a class of algorithms for minimizing the time average of a network penalty subject to queue stability constraints and additional time average constraints. It is shown that the time average penalty can be pushed arbitrarily close to optimality, with a corresponding tradeoff in queue size. In the special case of quadratic functions, the tradeoff results of [3, 4] are recovered in a stronger form. They are shown to hold for time averages with probability 1 (rather than time average expectations). The results for quadratic functions in this paper assume the problem is feasible and that the constraints can be achieved with β€œπœ–-slackness” (similar to a Slater-type condition of static optimization theory [5]). However, this paper also obtains results for subquadratics. Algorithms based on subquadratics are shown to provide desired results whenever the problem is feasible, without requiring the additional slackness assumption. This analysis is useful for control of queueing networks, and for other stochastic problems that seek to optimize time averages subject to time average constraints.

1.1. On Relationships between Time Average Expectations and Time Averages

It is known by Fatou's Lemma that if a random process is deterministically lower bounded (such as being nonnegative) and has time averages that converge to a constant with probability 1, then this constant must be less than or equal to the liminf time average expectation (see, e.g., [6]). This can be used to translate the existing bounds in [4], which hold for time average expectations, into sample path bounds that hold with probability 1. However, this translation is only justified in the special case when the processes are suitably lower-bounded and have convergent time averages. Time average convergence can often be shown when the system is defined on a Markov chain with suitable renewal properties [7, 8]. Further, queue stability is often defined in terms of positive Harris recurrence [9, 10] although systems with positive Harris recurrence do not always have finite average queue size. For example, consider a standard 𝑀/𝐺/1 queue with service times that have finite means 𝔼[𝑋] but infinite second moments, and with arrival rate πœ†<1/𝔼[𝑋]. The system spends a positive fraction of time in the 0 state but has infinite time average queue size.

Networks with flow control often have a structure that yields deterministically bounded queues [11, 12]. While this is perhaps the strongest form of stability, it requires special structure. Primal-dual algorithms are considered for scheduling in wireless systems with β€œinfinite backlog” in [13, 14] and shown to converge to a utility-optimal operating point although this work does not consider queueing or time average constraints. Related primal-dual algorithms are treated in [15, 16] for systems with queues, where [15] shows utility optimality for a fluid version of the system, and [16] proves that the actual network exhibits rate stability with near-optimal utility. Related analysis is considered in [17] for systems with countably infinite-state ergodic Markov chains, and utility-optimal scheduling is treated for deterministic systems in [18]. The approaches in [13–18] do not obtain the desired tradeoffs between network utility and queue size.

1.2. Prior Work on Quadratics and Other Lyapunov Functions

Much prior work on queue stability uses quadratic Lyapunov functions, including [1, 2, 19–21] for ergodic systems defined on countably infinite Markov chains, and [22] for more general systems but where stability is defined in terms of a time average expectation. Stability with exponential Lyapunov functions is treated in [23, 24], and stability with more general Lyapunov functions is in [25]. The works [26–29] develop algorithms based on subquadratic Lyapunov functions and suggest that these often have better delay in comparison to algorithms derived from quadratic Lyapunov functions. This motivates recent work in [30] that has combined subquadratic Lyapunov functions with joint stability and penalty minimization, using time average expectations similar to [4].

2. Lyapunov Functions and Drift

Let 𝐐(𝑑)=(𝑄1(𝑑),…,𝑄𝐾(𝑑)) be a real-valued random vector that evolves over time slots π‘‘βˆˆ{0,1,2,…}. The distribution of 𝐐(0) is given, and the values of 𝐐(𝑑) for π‘‘βˆˆ{1,2,3,…} are determined by events on a probability space. Specifically, assume that outcomes on the probability space have the structure {πœ‚(βˆ’1),πœ‚(0),πœ‚(1),πœ‚(2),…}, where πœ‚(βˆ’1)=𝐐(0) is the initial vector state, and πœ‚(𝑑) for π‘‘βˆˆ{0,1,2,…} represents an event that occurs on slot 𝑑. For each slot π‘‘βˆˆ{0,1,2,…}, the value of 𝐐(𝑑) is assumed to be determined by the history {𝐐(0),πœ‚(0),πœ‚(1),…,πœ‚(π‘‘βˆ’1)}. Let β„‹(𝑑) represent this history up to (but not including) slot 𝑑. Formally, β„‹(𝑑) is a filtration of the infinite horizon probability space onto the finite set of slots {βˆ’1,0,1,…,π‘‘βˆ’1}. For any slot π‘‘βˆˆ{0,1,2,…}, a given β„‹(𝑑) includes information that specifies the value of 𝐐(𝑑).

An example of such a process 𝐐(𝑑) is the queue backlog process in a network of 𝐾 queues that evolve according to some control law via the queueing equation π‘„π‘˜ξ€Ίπ‘„(𝑑+1)=maxπ‘˜(𝑑)+π‘Žπ‘˜(𝑑)βˆ’π‘π‘˜ξ€»(𝑑),0,(2.1) where π‘Žπ‘˜(𝑑) and π‘π‘˜(𝑑) are arrival and service processes for queue π‘˜, which take values determined by πœ‚(𝑑). For this reason, throughout we call 𝐐(𝑑) the queue vector. However, our results are not restricted to processes of the type (2.1) and hold for more general processes, including processes 𝐐(𝑑) that can have negative components.

2.1. Lyapunov Functions

Let 𝐿(𝐐(𝑑)) be a nonnegative function of the current queue vector 𝐐(𝑑). We call this a Lyapunov function. Define Ξ”(𝑑)=𝐿(𝐐(𝑑+1))βˆ’πΏ(𝐐(𝑑)). Algorithms for queue stability are often developed by defining a convenient Lyapunov function, greedily minimizing a bound on Ξ”(𝑑) every slot, and showing that the resulting expectation 𝔼[Ξ”(𝑑)] has desirable properties [1, 2, 19–24]. An example Lyapunov function that is often used is 1𝐿(𝐐)=π›½πΎξ“π‘˜=1π‘€π‘˜||π‘„π‘˜||𝛽,(2.2) where 𝛽,π‘€π‘˜ are given positive constants, with 𝛽>1. This type of function represents a scalar measure of the vector queue size. Algorithms that greedily minimize a bound on Ξ”(𝑑) every slot often satisfy a drift condition of the form 𝔼[Ξ”(𝑑)]≀𝐡 for all 𝑑, for some finite constant 𝐡. Such drift conditions are examined in Section 3 in the context of queue stability.

Now let 𝑝(𝑑) be an additional penalty process for the system, representing some penalty (such as power expenditure) incurred on slot 𝑑. In many cases we desire to jointly stabilize all queues while minimizing the time average of 𝑝(𝑑). The drift-plus-penalty theory in [4] approaches this by taking control actions that greedily minimize a bound on Ξ”(𝑑)+𝑉𝑝(𝑑), where 𝑉 is a nonnegative weight that affects a performance tradeoff. Such policies often ensure a drift-plus-penalty condition of the following form: 𝔼[]Ξ”(𝑑)+𝑉𝑝(𝑑)βˆ£β„‹(𝑑)≀𝐡+π‘‰π‘βˆ—βˆ’πœ–π‘“(𝐐(𝑑))(2.3) for some constants π‘βˆ—, πœ–β‰₯0, and some nonnegative function 𝑓(𝐐(𝑑)). The value π‘βˆ— represents a target value for the time average of 𝑝(𝑑). The function 𝑓(𝐐(𝑑)) represents a measure of the total queue size. Expressions of the type (2.3) are examined in Sections 4, 5, and 6, and a broad class of networks that lead to expressions of this type are considered in Section 7.

2.2. Performance for Time Average Expectations

Results for time average expected queue size and time average expected penalty can be immediately derived from the drift-plus-penalty condition (2.3) together with minimal assumptions. Specifically, assume that (2.3) holds for all 𝑑 and all β„‹(𝑑), with 𝑉β‰₯0, πœ–β‰₯0, 𝑓(𝐐)β‰₯0. Assume that 𝔼[𝐿(𝐐(0))]<∞, and that there is a (possibly negative) constant 𝑝min such that 𝔼[𝑝(𝑑)]β‰₯𝑝min for all 𝑑. Taking expectations of (2.3) for a given slot 𝜏 gives the following: 𝔼[][]Ξ”(𝜏)+𝑉𝔼𝑝(𝜏)≀𝐡+π‘‰π‘βˆ—[].βˆ’πœ–π”Όπ‘“(𝐐(𝜏))(2.4) Now fix any integer 𝑑>0. Using Ξ”(𝜏)=𝐿(𝐐(𝜏+1))βˆ’πΏ(𝐐(𝜏)), summing over 𝜏∈{0,…,π‘‘βˆ’1}, and dividing by 𝑑 yields 𝔼[][]𝐿(𝐐(𝑑))βˆ’π”ΌπΏ(𝐐(0))𝑑1+π‘‰π‘‘π‘‘βˆ’1ξ“πœ=0𝔼[]𝑝(𝜏)≀𝐡+π‘‰π‘βˆ—1βˆ’πœ–π‘‘π‘‘βˆ’1ξ“πœ=0𝔼[].𝑓(𝐐(𝜏))(2.5) Rearranging the above and using 𝔼[𝐿(𝐐(𝑑))]β‰₯0 leads to the following two inequalities for all 𝑑>01π‘‘π‘‘βˆ’1ξ“πœ=0𝔼[]𝑝(𝜏)β‰€π‘βˆ—+𝐡𝑉+𝔼[]𝐿(𝐐(0)),1π‘‰π‘‘π‘‘π‘‘βˆ’1ξ“πœ=0𝔼[]≀𝑝𝑓(𝐐(𝜏))𝐡+π‘‰βˆ—βˆ’π‘minξ€Έπœ–+𝔼[]𝐿(𝐐(0)),πœ–π‘‘(2.6) where the first inequality holds whenever 𝑉>0, and the second holds whenever πœ–>0. Taking a limsup of the above gives limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝔼[]𝑝(𝜏)β‰€π‘βˆ—+𝐡𝑉,(2.7)limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝔼[]≀𝑝𝑓(𝐐(𝜏))𝐡+π‘‰βˆ—βˆ’π‘minξ€Έπœ–.(2.8) this shows that performance can be parameterized by the constant 𝑉, giving time average expected penalty within 𝑂(1/𝑉) of the target π‘βˆ—, with an 𝑂(𝑉) tradeoff in the time average expectation of 𝑓(𝐐(𝑑)) (representing a time average queue size metric). This method is used in [3, 4] for analysis and design of stochastic network optimization algorithms. Related manipulations of drift expressions in [31–34] lead to statements concerning time average expectations in other contexts.

The above arithmetic did not require Markov chain assumptions, renewal assumptions, or assumptions on the higher-order moments of the processes. We desire to obtain performance bounds similar to (2.7) and (2.8), but with the above limiting time average expectations replaced, with probability 1, by limiting time averages over a sample path. This cannot be done without imposing more structural assumptions on the problem (see Appendix A for simple counterexamples). The goal is to find simple and easily verifiable additional assumptions that preserve the elegance of the method while strengthening the results.

3. Rate Stability and Mean Rate Stability

This section considers drift conditions of the type 𝔼[Ξ”(𝑑)]≀𝐡. Let 𝐐(𝑑)β‰œ(𝑄1(𝑑),…,𝑄𝐾(𝑑)) be a stochastic vector defined over slots π‘‘βˆˆ{0,1,2,…}. Assume that there are constants 𝐷>0, πœƒ>0 such that 𝔼||π‘„π‘˜(𝑑+1)βˆ’π‘„π‘˜||(𝑑)1+πœƒξ‚„β‰€π·βˆ€π‘‘βˆˆ{0,1,2,…},βˆ€π‘˜βˆˆ{1,…,𝐾}.(3.1) For example, if πœƒ=1, then the condition (3.1) states that the second moment of queue changes is bounded for all time. This holds, for example, whenever the queue evolves according to (2.1) with arrival and service processes π‘Žπ‘˜(𝑑), π‘π‘˜(𝑑) that have bounded second moments for all slots. The use of the variable πœƒ>0 allows more general situations when arrivals and/or service variables can have infinite second moments.

Let 𝐿(𝐐) be a nonnegative function that has the following structural property. There exist constants π‘π‘˜>0, π‘π‘˜>0, π‘‘π‘˜β‰₯0 such that for all vectors 𝐐=(𝑄1,…,𝑄𝐾). 𝐿(𝐐)β‰₯π‘π‘˜||π‘„π‘˜||1+π‘π‘˜βˆ’π‘‘π‘˜βˆ€π‘˜βˆˆ{1,…,𝐾}.(3.2) Condition (3.2) implies that 𝐿(𝐐) grows superlinearly in |π‘„π‘˜|, at least as fast as Ξ©(|π‘„π‘˜|1+π‘π‘˜) for some π‘π‘˜>0. This is satisfied for a large class of functions used in practice, such as the example functions in (2.2) with 𝛽>1. Define Ξ”(𝑑)=𝐿(𝐐(𝑑+1))βˆ’πΏ(𝐐(𝑑)).

Proposition 3.1. Suppose that 𝐿(𝐐) is nonnegative and satisfies (3.2), and that there is a constant 𝐡β‰₯0 such that 𝔼[Ξ”(𝑑)]≀𝐡 for all π‘‘βˆˆ{0,1,2,…}. Suppose that 𝔼[𝐿(𝐐(0))]<∞. Then we have the following.
(a) For all queues π‘˜βˆˆ{1,…,𝐾}, we have limπ‘‘β†’βˆžπ”Όξ€Ί||π‘„π‘˜||ξ€»(𝑑)𝑑=0.(3.3)
(b) If additionally assumption (3.1) holds for some constants 𝐷>0, πœƒ>0, then for all queues π‘˜βˆˆ{1,…,𝐾} we have limπ‘‘β†’βˆžπ‘„π‘˜(𝑑)𝑑=0(𝑀.𝑝.1),(3.4) where β€œw.p.1” stands for β€œwith probability 1.”

The condition (3.3) is called mean rate stability, and the condition (3.4) is called rate stability. Both are weak forms of stability. Parts (a) and (b) of the proposition are proven separately.

Proof (Proposition 3.1 Part (a)). By definition of Ξ”(𝜏), the statement 𝔼[Ξ”(𝜏)]≀𝐡 is equivalent to 𝔼[][]𝐿(𝐐(𝜏+1))βˆ’π”ΌπΏ(𝐐(𝜏))≀𝐡.(3.5) The above holds for all slots 𝜏. Summing over 𝜏∈{0,1,…,π‘‘βˆ’1} for some integer 𝑑>0 gives 𝔼[][]𝐿(𝐐(𝑑))βˆ’π”ΌπΏ(𝐐(0))≀𝐡𝑑.(3.6) Now select any π‘˜βˆˆ{1,…,𝐾}. From (3.2), we have π‘π‘˜||π‘„π‘˜||(𝑑)1+π‘π‘˜βˆ’π‘‘π‘˜β‰€πΏ(𝐐(𝑑)).(3.7) Taking expectations of the above and using (3.6) gives: π‘π‘˜π”Όξ‚ƒ||π‘„π‘˜||(𝑑)1+π‘π‘˜ξ‚„βˆ’π‘‘π‘˜[]≀𝔼𝐿(𝐐(0))+𝐡𝑑.(3.8) Rearranging the above gives 𝔼||π‘„π‘˜||(𝑑)1+π‘π‘˜ξ‚„β‰€1π‘π‘˜ξ€Ίπ‘‘π‘˜[]ξ€».+𝔼𝐿(𝐐(0))+𝐡𝑑(3.9) The above holds for all π‘‘βˆˆ{1,2,3,…}. Thus, there is a constant 𝐢>0 such that 𝔼||π‘„π‘˜||(𝑑)1+π‘π‘˜ξ‚„β‰€πΆπ‘‘βˆ€π‘‘βˆˆ{1,2,3,…}.(3.10) By Jensen's inequality we have 𝔼[|π‘„π‘˜(𝑑)|]1+π‘π‘˜β‰€π”Ό[|π‘„π‘˜(𝑑)|1+π‘π‘˜]. Using this in (3.10) gives 𝔼||π‘„π‘˜||ξ€»(𝑑)1+π‘π‘˜β‰€πΆπ‘‘.(3.11) Therefore, 𝔼||π‘„π‘˜||ξ€»(𝑑)𝑑≀(𝐢𝑑)1/(1+π‘π‘˜)𝑑.(3.12) Taking a limit as π‘‘β†’βˆž proves part (a).

The proof of part (b) follows by noting that the condition (3.10), together with condition (3.1), satisfies the requirements of the following lemma concerning queues with higher-order moments that grow at most linearly.

Lemma 3.2. Let 𝑄(𝑑) be a real-valued stochastic process that evolves over slots π‘‘βˆˆ{0,1,2,…}, and suppose there are constants 𝑏>0, πœƒ>0, 𝐢>0, 𝐷>0 such that 𝔼||||𝑄(𝑑)1+𝑏𝔼||||β‰€πΆπ‘‘βˆ€π‘‘βˆˆ{1,2,…},𝑄(𝑑+1)βˆ’π‘„(𝑑)1+πœƒξ‚„β‰€π·βˆ€π‘‘βˆˆ{1,2,…}.(3.13) Then limπ‘‘β†’βˆžπ‘„(𝑑)𝑑=0(𝑀.𝑝.1).(3.14)

Proof. See Appendix B.

4. Convergence of Time Averages

This section considers drift conditions of the type (2.3).

4.1. The Law of Large Numbers for Martingale Differences

Let 𝑋(𝑑) be a real-valued stochastic process defined over slots π‘‘βˆˆ{0,1,2,…}. For integers 𝑑>0, let β„‹(𝑑) be a filtration that includes the information {𝑋(0),…,𝑋(π‘‘βˆ’1)}. The following theorem is from [35] and concerns processes that satisfy 𝔼[𝑋(𝑑)βˆ£β„‹(𝑑)]=0. Such processes are called martingale differences.

Theorem 4.1 (from [35]). Suppose that 𝑋(𝑑) is a real-valued random process over π‘‘βˆˆ{0,1,2,…} such that 𝔼[𝑋(𝑑)βˆ£β„‹(𝑑)]=0 for all π‘‘βˆˆ{1,2,…} and all β„‹(𝑑). Further suppose that the second moments 𝔼[𝑋(𝑑)2] are finite for all 𝑑 and satisfy βˆžξ“π‘‘=1𝔼𝑋(𝑑)2𝑑2<∞.(4.1) Then limπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝑋(𝜏)=0(𝑀.𝑝.1).(4.2)

Corollary 4.2. Let 𝑋(𝑑) be a real-valued random process over slots π‘‘βˆˆ{0,1,2,…}, and suppose there is a finite constant 𝐡 such that 𝔼[𝑋(𝑑)βˆ£β„‹(𝑑)]≀𝐡 for all π‘‘βˆˆ{1,2,…} and all β„‹(𝑑). Further suppose the second moments 𝔼[𝑋(𝑑)2] are finite for all 𝑑 and satisfy βˆžξ“π‘‘=1𝔼𝑋(𝑑)2𝑑2<∞.(4.3) Then limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝑋(𝜏)≀𝐡(𝑀.𝑝.1).(4.4)

Proof. Define 𝑋(𝑑)β‰œπ‘‹(𝑑)βˆ’π”Ό[𝑋(𝑑)βˆ£β„‹(𝑑)]. Then 𝔼[𝑋(𝑑)βˆ£β„‹(𝑑)]=0. Further, using (4.3), it is not difficult to verify that βˆ‘βˆžπ‘‘=1𝔼[𝑋(𝑑)2]/𝑑2 is finite (note that 𝔼[𝑋(𝑑)2]≀2𝔼[𝑋(𝑑)2]+2𝔼[𝔼[𝑋(𝑑)βˆ£β„‹(𝑑)]2], and the latter term can be bounded by 2𝔼[𝑋(𝑑)2] via Jensen's inequality). Thus, the conditions required for Theorem 4.1 hold, so we conclude that limπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝑋(𝜏)=0(w.p.1).(4.5) However, because 𝑋(𝑑)=𝑋(𝑑)+𝔼[𝑋(𝑑)βˆ£β„‹(𝑑)] and 𝔼[𝑋(𝑑)βˆ£β„‹(𝑑)]≀𝐡, we know that 𝑋(𝑑)≀𝑋(𝑑)+𝐡 for all 𝑑. Thus, for all 𝑑>0, we have 1π‘‘π‘‘βˆ’1ξ“πœ=01𝑋(𝜏)β‰€π‘‘π‘‘βˆ’1ξ“πœ=0𝑋(𝑑)+𝐡.(4.6) Taking a limsup of the above and using (4.5) proves the result.

4.2. A Drift-Plus-Penalty Result for General Lyapunov Functions

Now consider the stochastic system with queue vector 𝐐(𝑑)=(𝑄1(𝑑),…,𝑄𝐾(𝑑)) and penalty process 𝑝(𝑑), as described in Section 2.1. The history β„‹(𝑑) includes information {𝐐(0),…,𝐐(𝑑),𝑝(0),…,𝑝(π‘‘βˆ’1)}. Assume that there is a finite (possibly negative) constant 𝑝min such that for all 𝑑 and all β„‹(𝑑)𝔼[]𝑝(𝑑)βˆ£β„‹(𝑑)β‰₯𝑝min.(4.7) Further assume that 𝔼[𝑝(𝑑)2] is finite for all 𝑑, and βˆžξ“π‘‘=1𝔼𝑝(𝑑)2𝑑2<∞.(4.8) Let 𝐿(𝐐(𝑑)) be any nonnegative function of 𝐐(𝑑), and define Ξ”(𝑑)=𝐿(𝐐(𝑑+1))βˆ’πΏ(𝐐(𝑑)). Let 𝑓(𝐐(𝑑)) be another nonnegative function of 𝐐(𝑑).

Proposition 4.3. Suppose 𝐿(𝐐(0)) is finite with probability 1, that 𝔼[Ξ”(𝑑)2],𝔼[𝑓(𝐐(𝑑))2],𝔼[𝑝(𝑑)2] are finite for all 𝑑, that (4.7) and (4.8) hold, and that βˆžξ“π‘‘=1𝔼Δ(𝑑)2ξ€»ξ€Ί+𝔼𝑓(𝐐(𝑑))2𝑑2<∞.(4.9) Further suppose there are constants π‘βˆ—, 𝑉β‰₯0, πœ–β‰₯0, 𝐡β‰₯0 such that the following drift-plus-penalty condition holds for all π‘‘βˆˆ{0,1,2,…} and all possible β„‹(𝑑)𝔼[]Ξ”(𝑑)+𝑉𝑝(𝑑)βˆ£β„‹(𝑑)≀𝐡+π‘‰π‘βˆ—βˆ’πœ–π‘“(𝐐(𝑑)).(4.10) Then limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝑝(𝜏)β‰€π‘βˆ—+𝐡𝑉(𝑀.𝑝.1),(4.11)limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝑝𝑓(𝐐(𝜏))≀𝐡+π‘‰βˆ—βˆ’π‘minξ€Έπœ–(𝑀.𝑝.1),(4.12) where (4.11) holds when 𝑉>0, and (4.12) holds when πœ–>0.

Proof. Define 𝑋(𝑑)=Ξ”(𝑑)+𝑉𝑝(𝑑)+πœ–π‘“(𝐐(𝑑)), and note by algebra that 𝑋(𝑑)2≀4Ξ”(𝑑)2+4𝑉2𝑝(𝑑)2+4πœ–2𝑓(𝐐(𝑑))2(4.13) It follows that βˆžξ“π‘‘=1𝔼𝑋(𝑑)2𝑑2<∞.(4.14) Further, by (4.10), we have 𝔼[]𝑋(𝑑)βˆ£β„‹(𝑑)≀𝐡+π‘‰π‘βˆ—.(4.15) Thus, by Corollary 4.2: limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝑋(𝜏)≀𝐡+π‘‰π‘βˆ—.(4.16) Because 𝑋(𝑑)β‰₯Ξ”(𝑑)+𝑉𝑝(𝑑), we have 1π‘‘π‘‘βˆ’1ξ“πœ=01𝑋(𝜏)β‰₯π‘‘π‘‘βˆ’1ξ“πœ=0[]=Ξ”(𝜏)+𝑉𝑝(𝜏)𝐿(𝐐(𝑑))βˆ’πΏ(𝐐(0))𝑑+π‘‰π‘‘π‘‘βˆ’1ξ“πœ=0β‰₯𝑝(𝜏)βˆ’πΏ(𝐐(0))𝑑+π‘‰π‘‘π‘‘βˆ’1ξ“πœ=0𝑝(𝜏),(4.17) where the final inequality is because 𝐿(𝐐(𝑑))β‰₯0. Taking a limsup of the above and using (4.16) gives limsupπ‘‘β†’βˆžπ‘‰π‘‘π‘‘βˆ’1ξ“πœ=0𝑝(𝜏)≀𝐡+π‘‰π‘βˆ—(4.18) which implies (4.11).
The condition (4.10) together with 𝔼[𝑝(𝑑)βˆ£β„‹(𝑑)]β‰₯𝑝min also implies that 𝔼[Ξ”]𝑝(𝑑)+πœ–π‘“(𝐐(𝑑))βˆ£β„‹(𝑑)≀𝐡+π‘‰βˆ—βˆ’π‘minξ€Έ(4.19) Defining 𝑋(𝑑)=Ξ”(𝑑)+πœ–π‘“(𝐐(𝑑)) and again applying Corollary 4.2 similarly shows that limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0ξ€·π‘πœ–π‘“(𝐐(𝑑))≀𝐡+π‘‰βˆ—βˆ’π‘minξ€Έ(4.20) which implies (4.12).

The result of Proposition 4.3 is almost the result we desire, with the exception that the condition (4.9) may be difficult to verify. Note that (4.9) often trivially holds in special cases when all queues are deterministically bounded. Hence, in the flow control algorithms [11, 12] designed from drift-plus-penalty theory, Proposition 4.3 proves that time average penalties satisfy (4.11) with probability 1. The next section shows that this condition follows from the drift assumption when 𝐿(𝐐) is a subquadratic Lyapunov function. Section 6 treats quadratic Lyapunov functions.

5. Subquadratic Lyapunov Functions

Suppose that the queue processes π‘„π‘˜(𝑑) are nonnegative for all π‘˜βˆˆ{1,…,𝐾} and all 𝑑, and consider a Lyapunov function 𝐿(𝐐) of the following form: 1𝐿(𝐐)=π›½πΎξ“π‘˜=1π‘€π‘˜π‘„π›½π‘˜,(5.1) where π‘€π‘˜ and 𝛽 are positive constants, with 1<𝛽<2. The scaling by 1/𝛽 is only for convenience later.

5.1. The Drift Structure

For each π‘˜βˆˆ{1,…,𝐾}, define π›Ώπ‘˜(𝑑)=π‘„π‘˜(𝑑+1)βˆ’π‘„π‘˜(𝑑). Assume throughout that there is a finite constant 𝐷>0 such that for all π‘‘βˆˆ{0,1,2,…}, all β„‹(𝑑), and all π‘˜βˆˆ{1,…,𝐾}: π”Όξ€Ίπ›Ώπ‘˜(𝑑)2ξ€»βˆ£β„‹(𝑑)≀𝐷,βˆžξ“π‘‘=1𝔼||π›Ώπ‘˜||(𝑑)2𝛽𝑑2<∞.(5.2) Because 1<𝛽<2, the above conditions, hold, for example, whenever 𝔼[|π›Ώπ‘˜(𝑑)|2π›½βˆ£β„‹(𝑑)]≀𝐢 for some finite constant 𝐢. In particular, this holds if queues have the structure (2.1) and arrival and service processes π‘Žπ‘˜(𝑑), π‘π‘˜(𝑑) have bounded (2𝛽)th moments, regardless of the history β„‹(𝑑).

Appendix D shows that (5.2) implies there is a real number 𝐢>0 such that 𝔼Δ(𝑑)2≀𝐢+πΆπΎξ“π‘˜=1𝔼||π›Ώπ‘˜||(𝑑)2𝛽+π‘„π‘˜(𝑑)2π›½βˆ’2ξ‚„.(5.3) Further, Appendix D shows that there is a real number 𝐡>0 such that 𝔼[]Ξ”(𝑑)βˆ£β„‹(𝑑)≀𝐡+πΎξ“π‘˜=1π‘€π‘˜π‘„π‘˜(𝑑)π›½βˆ’1π”Όξ€Ίπ›Ώπ‘˜ξ€».(𝑑)βˆ£β„‹(𝑑)(5.4) Again let 𝑝(𝑑) be a penalty process. Algorithms that seek to minimize a bound on Ξ”(𝑑)+𝑉𝑝(𝑑) every slot 𝑑 often lead to drift conditions of the following form (see Section 7): 𝔼[]Ξ”(𝑑)+𝑉𝑝(𝑑)βˆ£β„‹(𝑑)≀𝐡+π‘‰π‘βˆ—βˆ’πœ–πΎξ“π‘˜=1π‘€π‘˜π‘„π‘˜(𝑑)π›½βˆ’1(5.5) for some πœ–β‰₯0. This has the general form (4.10) with βˆ‘π‘“(𝐐(𝑑))=πΎπ‘˜=1π‘€π‘˜π‘„π‘˜(𝑑)π›½βˆ’1.

5.2. Drift-Plus-Penalty for Subquadratics

Proposition 5.1. Let 𝐿(𝐐) be a subquadratic Lyapunov function of the form (5.1), with 1<𝛽<2. Assume that 𝔼[𝐿(𝐐(0))]<∞, (5.2) holds for the queue difference process π›Ώπ‘˜(𝑑), and (4.7)-(4.8) hold for the penalty process 𝑝(𝑑). Suppose that the drift-plus-penalty condition (5.5) holds for some constants π‘βˆ—, 𝑉β‰₯0, 𝐡β‰₯0, πœ–β‰₯0. Then for all π‘˜βˆˆ{1,…,𝐾}limπ‘‘β†’βˆžπ‘„π‘˜(𝑑)𝑑=0(𝑀.𝑝.1).(5.6) Further, we have limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝑝(𝜏)β‰€π‘βˆ—+𝐡𝑉(𝑀.𝑝.1),(5.7)limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πΎπœ=0ξ“π‘˜=1π‘€π‘˜π‘„π‘˜(𝜏)π›½βˆ’1≀𝑝𝐡+π‘‰βˆ—βˆ’π‘minξ€Έπœ–(𝑀.𝑝.1),(5.8) where (5.7) holds when 𝑉>0 and (5.8) holds when πœ–>0.

Proof. The drift-plus-penalty expression (5.5) implies that 𝔼[]Ξ”(𝑑)βˆ£β„‹(𝑑)≀𝐹,(5.9) where we define πΉβ‰œπ΅+𝑉(π‘βˆ—βˆ’π‘min). Taking expectations of the above yields 𝔼[Ξ”(𝑑)]≀𝐹. The conditions of the theorem ensure that all conditions hold that are required for Proposition 3.1, and so Proposition 3.1 implies that all queues are rate stable, so that (5.6) holds.
Define βˆ‘π‘“(𝐐(𝑑))=πΎπ‘˜=1π‘€π‘˜π‘„π‘˜(𝑑)π›½βˆ’1. If we can show that βˆžξ“π‘‘=1𝔼Δ(𝑑)2+𝑓(𝐐(𝑑))2𝑑2<∞,(5.10) then our system satisfies all the requirements for Proposition 4.3, and so (5.7)-(5.8) hold. Thus, it suffices to prove (5.10). To this end, we have 𝔼𝑓(𝐐(𝑑))2ξ€»βŽ‘βŽ’βŽ’βŽ£ξƒ©=π”ΌπΎξ“π‘˜=1π‘€π‘˜π‘„π‘˜(𝑑)π›½βˆ’1ξƒͺ2⎀βŽ₯βŽ₯βŽ¦β‰€πΊπ‘Šξ“π‘˜=1π”Όξ€Ίπ‘„π‘˜(𝑑)2π›½βˆ’2ξ€»(5.11) for some real number 𝐺>0 (where we have used the fact (B.6) in Appendix B). This together with (5.3) gives βˆžξ“π‘‘=1𝔼Δ(𝑑)2ξ€»ξ€Ί+𝔼𝑓(𝐐(𝑑))2𝑑2β‰€βˆžξ“π‘‘=1βˆ‘πΆ+(𝐢+𝐺)πΎπ‘˜=1𝔼||π›Ώπ‘˜||(𝑑)2𝛽+π‘„π‘˜(𝑑)2π›½βˆ’2𝑑2.(5.12) By (5.2), to prove that the above is finite, it suffices to prove that for each π‘˜βˆˆ{1,…,𝐾} we have: βˆžξ“π‘‘=1π”Όξ€Ίπ‘„π‘˜(𝑑)2π›½βˆ’2𝑑2<∞.(5.13)
Recall from (5.9) that 𝔼[Ξ”(𝜏)]≀𝐹 for all 𝜏, and so 𝔼[][]𝐿(𝐐(𝜏+1))βˆ’π”ΌπΏ(𝐐(𝜏))≀𝐹.(5.14) Summing the above over 𝜏∈{0,1,…,π‘‘βˆ’1} for some integer 𝑑>0 gives the following: 𝔼[][]𝐿(𝐐(𝑑))βˆ’π”ΌπΏ(𝐐(0))≀𝐹𝑑.(5.15) Thus, we have πΎξ“π‘˜=1π‘€π‘˜π›½π”Όξ€Ίπ‘„π‘˜(𝑑)𝛽[]≀𝔼𝐿(𝐐(0))+𝐹𝑑.(5.16) Thus, there is a real number 𝑏>0 such that for each π‘˜βˆˆ{1,…,𝐾} and all π‘‘βˆˆ{1,2,3,…}π”Όξ€Ίπ‘„π‘˜(𝑑)𝛽≀𝑏𝑑.(5.17) Because 1<𝛽<2, we have 0<(2π›½βˆ’2)/𝛽<1. Thus, by Jensen's inequality for the concave function π‘₯(2π›½βˆ’2)/𝛽 over π‘₯β‰₯0, we have π”Όξ€Ίπ‘„π‘˜(𝑑)2π›½βˆ’2ξ€»ξ€Ίπ‘„β‰€π”Όπ‘˜(𝑑)𝛽(2π›½βˆ’2)/𝛽≀(𝑏𝑑)2βˆ’2/𝛽(5.18) Thus, we have βˆžξ“π‘‘=1π”Όξ€Ίπ‘„π‘˜(𝑑)2π›½βˆ’2𝑑2β‰€βˆžξ“π‘‘=1(𝑏𝑑)2βˆ’2/𝛽𝑑2<∞,(5.19) where the final inequality holds because the summation terms are 𝑂(1/𝑑2/𝛽), and 2/𝛽>1. This proves (5.13), completing the proof.

Note that the final line of the proof requires 𝑂(1/𝑑2/𝛽) to be summable, which is true for 1<𝛽<2 but not true for 𝛽=2. This is one reason why quadratic Lyapunov functions must be considered separately. Another distinction is that the above proposition holds even for πœ–=0, whereas we require πœ–>0 for the quadratic analysis in the next section.

6. Quadratic Lyapunov Functions

Here, we allow π‘„π‘˜(𝑑) to possibly take negative values. Consider quadratic Lyapunov functions: 1𝐿(𝐐)=2πΎξ“π‘˜=1π‘€π‘˜π‘„2π‘˜,(6.1) where π‘€π‘˜ are any positive weights. This is the same form as in the previous section, assuming that 𝛽=2. For 𝛽=2, the assumptions (5.2) become π”Όξ€Ίπ›Ώπ‘˜(𝑑)2ξ€»βˆ£β„‹(𝑑)≀𝐷,βˆžξ“π‘‘=1π”Όξ€Ίπ›Ώπ‘˜(𝑑)4𝑑2<∞.(6.2)

Define Ξ”(𝑑)=𝐿(𝐐(𝑑+1))βˆ’πΏ(𝐐(𝑑)). These quadratic Lyapunov functions typically lead to drift-plus-penalty expressions of the following form (see [36]): 𝔼[]Ξ”(𝑑)+𝑉𝑝(𝑑)βˆ£β„‹(𝑑)≀𝐡+π‘‰π‘βˆ—βˆ’πœ–πΎξ“π‘˜=1π‘€π‘˜||π‘„π‘˜||(𝑑)(6.3) which is the same as (4.10) with βˆ‘π‘“(𝐐)=πΎπ‘˜=1π‘€π‘˜|π‘„π‘˜|. Again define π›Ώπ‘˜(𝑑)β‰œπ‘„π‘˜(𝑑+1)βˆ’π‘„π‘˜(𝑑).

Proposition 6.1. Suppose that 𝐿(𝐐) has the quadratic form (6.1), and 𝔼[|π‘„π‘˜(0)|3]<∞ for all π‘˜βˆˆ{1,…,𝐾}. Assume that the bounded moment conditions (6.2), (4.7), and (4.8) hold. If the drift-plus-penalty expression (6.3) holds for some constants π‘βˆ—, 𝐡β‰₯0, 𝑉β‰₯0, πœ–>0, π‘€π‘˜>0 for π‘˜βˆˆ{1,…,𝐾}, then limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝑝(𝜏)β‰€π‘βˆ—+𝐡𝑉(𝑀.𝑝.1),(6.4)limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πΎπœ=0ξ“π‘˜=1π‘€π‘˜||π‘„π‘˜||≀𝑝(𝜏)𝐡+π‘‰βˆ—βˆ’π‘minξ€Έπœ–(𝑀.𝑝.1),(6.5) where (6.4) holds when 𝑉>0.

The proposition is proven with the assistance of the following lemma.

Lemma 6.2. Suppose that the same assumptions of Proposition 6.1 hold, and that the following additional assumption holds. For all π‘˜βˆˆ{1,…,𝐾}, we have βˆžξ“π‘‘=1π”Όξ€Ίπ‘„π‘˜(𝑑)2𝑑2<∞.(6.6) Then (6.4) and (6.5) hold.

Proof (see Lemma 6.2). Using βˆ‘π‘“(𝐐)=πΎπ‘˜=1|π‘„π‘˜|, by Proposition 4.3, it suffices to show that βˆžξ“π‘‘=1𝔼Δ(𝑑)2𝑑2<∞.(6.7) To this end, we have 1Ξ”(𝑑)=2πΎξ“π‘˜=1π‘€π‘˜ξ‚ƒξ€·π‘„π‘˜(𝑑)+π›Ώπ‘˜ξ€Έ(𝑑)2βˆ’π‘„π‘˜(𝑑)2ξ‚„=12πΎξ“π‘˜=1π‘€π‘˜ξ€Ί2π›Ώπ‘˜(𝑑)π‘„π‘˜(𝑑)+π›Ώπ‘˜(𝑑)2ξ€».(6.8) Therefore, by (B.6) in Appendix B, there is a real number 𝐢>0 such that 𝔼Δ(𝑑)2ξ€»β‰€πΆπΎξ“π‘˜=1π”Όξ€Ίπ›Ώπ‘˜(𝑑)2π‘„π‘˜(𝑑)2+π›Ώπ‘˜(𝑑)4ξ€».(6.9) However, because 𝔼[π›Ώπ‘˜(𝑑)2βˆ£β„‹(𝑑)]≀𝐷, we have π”Όξ€Ίπ›Ώπ‘˜(𝑑)2π‘„π‘˜(𝑑)2𝑄=π”Όπ‘˜(𝑑)2π”Όξ€Ίπ›Ώπ‘˜(𝑑)2ξ€Ίπ‘„βˆ£β„‹(𝑑)ξ€»ξ€»β‰€π·π”Όπ‘˜(𝑑)2ξ€»(6.10) Thus, we have 𝔼Δ(𝑑)2ξ€»β‰€πΆπ·πΎξ“π‘˜=1π”Όξ€Ίπ‘„π‘˜(𝑑)2ξ€»+πΆπΎξ“π‘˜=1π”Όξ€Ίπ›Ώπ‘˜(𝑑)4ξ€».(6.11) Because βˆ‘βˆžπ‘‘=1𝔼[π‘„π‘˜(𝑑)2]/𝑑2<∞ and βˆ‘βˆžπ‘‘=1𝔼[π›Ώπ‘˜(𝑑)4]/𝑑2<∞, we also have βˆ‘βˆžπ‘‘=1𝔼[Ξ”(𝑑)2]/𝑑2<∞.

To prove Proposition 6.1, it remains only to show that the requirement βˆ‘βˆžπ‘‘=1𝔼[π‘„π‘˜(𝑑)2]/𝑑2<∞ used in Lemma 6.2 follows from the other assumptions in the lemma and hence can be removed. Note that the drift-plus-penalty expression (6.3) together with 𝔼[𝑝(𝑑)βˆ£β„‹(𝑑)]β‰₯𝑝min implies that for all 𝑑 and all β„‹(𝑑)𝔼[]𝑝Δ(𝑑)βˆ£β„‹(𝑑)β‰€π΅βˆ’π‘‰βˆ—βˆ’π‘minξ€Έβˆ’πœ–πΎξ“π‘˜=1π‘€π‘˜||π‘„π‘˜||.(𝑑)(6.12) Then the requirement βˆ‘βˆžπ‘‘=1𝔼[π‘„π‘˜(𝑑)2]/𝑑2<∞ follows by the next proposition.

Proposition 6.3. Suppose that 𝐿(𝐐) has the quadratic form (6.1), and that 𝔼[|π‘„π‘˜(0)|3]<∞ for all π‘˜βˆˆ{1,…,𝐾}. Suppose that (6.2) holds, and that there are constants 𝐹β‰₯0, πœ–>0, and π‘€π‘˜>0 for π‘˜βˆˆ{1,…,𝐾} such that for all 𝑑 and all β„‹(𝑑)𝔼[]Ξ”(𝑑)βˆ£β„‹(𝑑)β‰€πΉβˆ’πœ–πΎξ“π‘˜=1π‘€π‘˜||π‘„π‘˜||.(𝑑)(6.13) Then:
(a) For all π‘˜βˆˆ{1,…,𝐾} we have βˆžξ“π‘‘=1π”Όξ€Ίπ‘„π‘˜(𝑑)2𝑑2<∞.(6.14)
(b) For all π‘˜βˆˆ{1,…,𝐾} we have limπ‘‘β†’βˆžπ”Όξ€Ί||π‘„π‘˜||ξ€»(𝑑)𝑑=0,(6.15)limπ‘‘β†’βˆžπ‘„π‘˜(𝑑)𝑑=0(𝑀.𝑝.1),(6.16)limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πΎπœ=0ξ“π‘˜=1π‘€π‘˜π”Όξ€Ί||π‘„π‘˜||≀𝐹(𝜏)πœ–,(6.17)limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πΎπœ=0ξ“π‘˜=1π‘€π‘˜||π‘„π‘˜||≀𝐹(𝜏)πœ–(𝑀.𝑝.1),(6.18)limπ‘€β†’βˆžξƒ¬limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0ξ€Ί||𝑄Prπ‘˜||ξ€»ξƒ­(𝜏)>𝑀=0,(6.19)limπ‘€β†’βˆžξƒ¬limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=01ξ€½||π‘„π‘˜||ξ€Ύξƒ­(𝜏)>𝑀=0(𝑀.𝑝.1),(6.20) where 1{|π‘„π‘˜(𝜏)|>𝑀} is an indicator function that is 1 if |π‘„π‘˜(𝜏)|>𝑀, and 0 else.

Inequalities (6.15)–(6.20) represent six major forms of queue stability. Thus, all of these forms of stability are satisfied whenever the bounded second and fourth moment conditions (6.2) hold, and when the drift condition (6.13) holds for a quadratic Lyapunov function.

Proof. Part (a) is proven in Appendix C. To prove part (b), note that second moments are bounded because (6.2) holds. Further, the quadratic Lyapunov function (6.1) satisfies the condition (3.2), and the expression (6.13) implies that 𝔼[]Ξ”(𝑑)≀𝐹.(6.21) Thus, the rate stability and mean rate stability results (6.15)-(6.16) follow by Proposition 3.1. The result (6.17) follows by (2.8) (with 𝑉=0). The result (6.18) follows by part (a) together with Lemma 6.2 (for 𝑉=0). The results (6.19) and (6.20) follow immediately from (6.17) and (6.18), respectively, by noting that for all 𝑀>0 we have |π‘„π‘˜(𝜏)|β‰₯𝑀1{|π‘„π‘˜(𝜏)|>𝑀}.

7. Stochastic Network Optimization

Here we use Propositions 5.1 and 6.1 to analyze an important class of stochastic queueing networks. This is the same scenario treated in [4, 36]. However, while the work in [4] obtains bounds on the time average expectations, here we obtain bounds on the pure time averages. Further, the works [4, 36] treat only quadratic Lyapunov functions. Here we treat both quadratics and subquadratics.

Consider a network of 𝐾 queues. The queue vector 𝐐(𝑑)=(𝑄1(𝑑),…,𝑄𝐾(𝑑)) evolves in slotted time π‘‘βˆˆ{0,1,2,…} with update equation as follows: π‘„π‘˜ξ€Ίπ‘„(𝑑+1)=maxπ‘˜(𝑑)βˆ’π‘π‘˜(𝑑)+π‘Žπ‘˜ξ€»(𝑑),0βˆ€π‘˜βˆˆ{1,…,𝐾},(7.1) where π‘Žπ‘˜(𝑑) and π‘π‘˜(𝑑) are arrival and service variables, respectively, for queue π‘˜. These are determined on slot 𝑑 by general functions Μ‚π‘Žπ‘˜(𝛼(𝑑),πœ”(𝑑)), Μ‚π‘π‘˜(𝛼(𝑑),πœ”(𝑑)) of a network state πœ”(𝑑) and a control action 𝛼(𝑑) as follows: π‘Žπ‘˜(𝑑)=Μ‚π‘Žπ‘˜(𝛼(𝑑),πœ”(𝑑)),π‘π‘˜Μ‚π‘(𝑑)=π‘˜(𝛼(𝑑),πœ”(𝑑)),(7.2) where the control action 𝛼(𝑑) is made every slot 𝑑 with knowledge of the current πœ”(𝑑) and is chosen within some abstract set π’œπœ”(𝑑). The πœ”(𝑑) value can represent random arrival and channel state information on slot 𝑑, and 𝛼(𝑑) can represent a resource allocation decision. For simplicity, assume that the πœ”(𝑑) process is independent and identically distributed (i.i.d.) over slots.

The control action additionally incurs a vector of penalties 𝐲(𝑑)=(𝑦0(𝑑),𝑦1(𝑑),…,𝑦𝑀(𝑑)), again given by general functions of 𝛼(𝑑) and πœ”(𝑑) as follows: π‘¦π‘š(𝑑)=Μ‚π‘¦π‘š(𝛼(𝑑),πœ”(𝑑)).(7.3) For 𝑑>0, define π‘Žπ‘˜(𝑑), π‘π‘˜(𝑑), π‘¦π‘š(𝑑), π‘„π‘˜(𝑑) as time averages over the first 𝑑 slots: π‘Žπ‘˜1(𝑑)β‰œπ‘‘π‘‘βˆ’1ξ“πœ=0π‘Žπ‘˜(𝜏),π‘π‘˜1(𝑑)β‰œπ‘‘π‘‘βˆ’1ξ“πœ=0π‘π‘˜(𝜏),(7.4)π‘¦π‘š1(𝑑)β‰œπ‘‘π‘‘βˆ’1ξ“πœ=0π‘¦π‘š(𝜏),π‘„π‘˜1(𝑑)β‰œπ‘‘π‘‘βˆ’1ξ“πœ=0π‘„π‘˜(𝜏).(7.5)

Thus, every slot 𝑑, the controller observes πœ”(𝑑) and chooses 𝛼(𝑑)βˆˆπ’œπœ”(𝑑), with the goal of solving the following stochastic network optimization problem: Minimize:limsupπ‘‘β†’βˆžπ‘¦0(𝑑)(7.6)Subjectto:(1)limsupπ‘‘β†’βˆžπ‘¦π‘š(𝑑)≀0βˆ€π‘šβˆˆ{1,…,𝑀}(7.7)(2)π‘„π‘˜(𝑑)isratestableβˆ€π‘˜βˆˆ{1,…,𝐾}(7.8)(3)𝛼(𝑑)βˆˆπ’œπœ”(𝑑)βˆ€π‘‘βˆˆ{0,1,2,…}.(7.9) Assume that the problem is feasible (so that the constraints can be met), and define 𝑦opt0 as the infimum value of (7.6) over all feasible algorithms.

Typical penalties can represent power expenditures [11]. For example, suppose that π‘¦π‘š(𝑑)β‰œπ‘π‘š(𝑑)βˆ’π‘avπ‘š, where π‘π‘š(𝑑) is the power incurred in component π‘š of the network on slot 𝑑, and 𝑝avπ‘š is a required time average power expenditure. Then ensuring limsupπ‘‘β†’βˆžπ‘¦π‘š(𝑑)≀0 ensures that limsupπ‘‘β†’βˆžπ‘π‘š(𝑑)≀𝑝avπ‘š, so that the desired time average power constraint is met.

To ensure that the time average penalty constraints are met, for each π‘šβˆˆ{1,…,𝑀}, we define a virtual queue π‘π‘š(𝑑) as follows: π‘π‘šξ€Ίπ‘(𝑑+1)=maxπ‘š(𝑑)+π‘¦π‘šξ€»(𝑑),0.(7.10) Clearly π‘π‘š(𝜏+1)β‰₯π‘π‘š(𝜏)+π‘¦π‘š(𝜏) for any integer 𝜏β‰₯0. Summing this over 𝜏∈{0,1,…,π‘‘βˆ’1} yields the following (for any integer 𝑑>0): π‘π‘š(𝑑)βˆ’π‘π‘š(0)β‰₯π‘‘βˆ’1ξ“πœ=0π‘¦π‘š(𝜏).(7.11) Dividing by 𝑑 and rearranging terms yields 1π‘‘π‘‘βˆ’1ξ“πœ=0π‘¦π‘šπ‘(𝜏)β‰€π‘š(𝑑)π‘‘βˆ’π‘π‘š(0)𝑑.(7.12) It follows that if π‘π‘š(𝑑) is rate stable for all π‘š, so that π‘π‘š(𝑑)/𝑑→0 with probability 1, then the constraint (7.7) is satisfied with probability 1.

For simplicity, assume that all queues are initially empty. Now define Θ(𝑑)β‰œ[𝐐(𝑑),𝐙(𝑑)] as the combined queue vector, and define the Lyapunov function as follows: 1𝐿(𝚯(𝑑))β‰œπ›½ξƒ¬πΎξ“π‘˜=1π‘„π‘˜(𝑑)𝛽+π‘€ξ“π‘š=1π‘π‘š(𝑑)𝛽,(7.13) where 1<𝛽≀2. The case 𝛽=2 is a quadratic Lyapunov function, and the case 1<𝛽<2 is a subquadratic. The drift-plus-penalty algorithm thus seeks to minimize a bound on 𝔼Δ(𝑑)+𝑉̂𝑦0ξ€»(𝛼(𝑑),πœ”(𝑑))βˆ£β„‹(𝑑).(7.14)

7.1. Computing the Drift-Plus-Penalty Inequality

Assume that the functions Μ‚π‘Žπ‘˜(β‹…), Μ‚π‘π‘˜(β‹…), and ̂𝑦0(β‹…) satisfy the following for all possible πœ”(𝑑) and all possible 𝛼(𝑑)βˆˆπ’œπœ”(𝑑): 0β‰€Μ‚π‘Žπ‘˜Μ‚π‘(𝛼(𝑑),πœ”(𝑑)),0β‰€π‘˜(𝛼(𝑑),πœ”(𝑑)),𝑦0min≀̂𝑦0(𝛼(𝑑),πœ”(𝑑))≀𝑦0max,(7.15) where 𝑦0min, and 𝑦0max are deterministic bounds on 𝑦0(𝑑) for all 𝑑. Also assume that there is a finite constant 𝐷>0 such that for all (possibly randomized) choices of 𝛼(𝑑) in reaction to the i.i.d. πœ”(𝑑), we have π”Όξ€ΊΜ‚π‘Žπ‘˜(𝛼(𝑑),πœ”(𝑑))4ξ€»π”Όξ€ΊΜ‚π‘β‰€π·βˆ€π‘˜βˆˆ{1,…,𝐾}π‘˜(𝛼(𝑑),πœ”(𝑑))4ξ€»π”Όξ€Ίβ‰€π·βˆ€π‘˜βˆˆ{1,…,𝐾}Μ‚π‘¦π‘š(𝛼(𝑑),πœ”(𝑑))4ξ€»β‰€π·βˆ€π‘šβˆˆ{1,…,𝑀},(7.16) where the expectations are taken with respect to the distribution of the i.i.d. πœ”(𝑑) process, and the possibly randomized decisions 𝛼(𝑑)βˆˆπ’œπœ”(𝑑).

Using the results of (5.4) for subquadratics and results in [4] for quadratics, it can be shown that the drift-plus-penalty expression satisfies the following bound: 𝔼Δ(𝑑)+𝑉̂𝑦0ξ€»ξ€Ί(𝛼(𝑑),πœ”(𝑑))βˆ£β„‹(𝑑)≀𝐡+𝑉𝔼̂𝑦0ξ€»+(𝛼(𝑑),πœ”(𝑑))βˆ£β„‹(𝑑)πΎξ“π‘˜=1π‘„π‘˜(𝑑)π›½βˆ’1π”Όξ€ΊΜ‚π‘Žπ‘˜Μ‚π‘(𝛼(𝑑),πœ”(𝑑))βˆ’π‘˜ξ€»+(𝛼(𝑑),πœ”(𝑑))βˆ£β„‹(𝑑)π‘€ξ“π‘š=1π‘π‘š(𝑑)π›½βˆ’1π”Όξ€ΊΜ‚π‘¦π‘šξ€»(𝛼(𝑑),πœ”(𝑑))βˆ£β„‹(𝑑)(7.17) for some finite constant 𝐡>0.

7.2. The Dynamic Algorithm

It is easy to show that, given β„‹(𝑑), the right-hand-side of (7.17) is minimized on slot 𝑑 by the policy that observes the current queue values 𝐐(𝑑), 𝐙(𝑑) and the current πœ”(𝑑) and chooses 𝛼(𝑑)βˆˆπ’œπœ”(𝑑) to minimize the following expression: 𝑉̂𝑦0(𝛼(𝑑),πœ”(𝑑))+πΎξ“π‘˜=1π‘„π‘˜(𝑑)π›½βˆ’1ξ€ΊΜ‚π‘Žπ‘˜Μ‚π‘(𝛼(𝑑),πœ”(𝑑))βˆ’π‘˜ξ€»+(𝛼(𝑑),πœ”(𝑑))π‘€ξ“π‘š=1π‘π‘š(𝑑)π›½βˆ’1Μ‚π‘¦π‘š(𝛼(𝑑),πœ”(𝑑)).(7.18) This is done at the beginning of each slot 𝑑, and at the end of the slot the actual and virtual queues π‘„π‘˜(𝑑), π‘π‘š(𝑑) are updated according to (7.1) and (7.10). This policy does not require knowledge of the probability distribution for πœ”(𝑑). One difficulty is that it may not be possible to achieve the infimum of the above expression over the set π’œπœ”(𝑑) because we are using general (possibly noncontinuous) functions Μ‚π‘Žπ‘˜(𝛼(𝑑),πœ”(𝑑)), Μ‚π‘π‘˜(𝛼(𝑑),πœ”(𝑑)), Μ‚π‘¦π‘š(𝛼(𝑑),πœ”(𝑑)), and a general (possibly noncompact) set π’œπœ”(𝑑). Thus, we simply assume that there is a finite constant 𝐢β‰₯0 such that our algorithm chooses 𝛼(𝑑)βˆˆπ’œπœ”(𝑑) to come within an additive constant 𝐢 of the infimum on every slot 𝑑, so that 𝑉̂𝑦0+(𝛼(𝑑),πœ”(𝑑))πΎξ“π‘˜=1π‘„π‘˜(𝑑)π›½βˆ’1ξ€ΊΜ‚π‘Žπ‘˜Μ‚π‘(𝛼(𝑑),πœ”(𝑑))βˆ’π‘˜ξ€»+(𝛼(𝑑),πœ”(𝑑))π‘€ξ“π‘š=1π‘π‘š(𝑑)π›½βˆ’1Μ‚π‘¦π‘š(𝛼(𝑑),πœ”(𝑑))≀𝐢+infπ›Όβˆˆπ’œπœ”(𝑑)𝑉̂𝑦0(𝛼,πœ”(𝑑))+πΎξ“π‘˜=1π‘„π‘˜(𝑑)π›½βˆ’1ξ€ΊΜ‚π‘Žπ‘˜Μ‚π‘(𝛼,πœ”(𝑑))βˆ’π‘˜ξ€»+(𝛼,πœ”(𝑑))π‘€ξ“π‘š=1π‘π‘š(𝑑)π›½βˆ’1Μ‚π‘¦π‘šξƒ­.(𝛼,πœ”(𝑑))(7.19) Such a choice of 𝛼(𝑑) is called a 𝐢-additive approximation. The case 𝐢=0 corresponds to achieving the exact infimum every slot.

7.3. πœ”-Only Policies and the Slackness Condition

Define an πœ”-only policy to be one that chooses 𝛼(𝑑)βˆˆπ’œπœ”(𝑑) every slot 𝑑 according to a stationary and randomized decision based only on the observed πœ”(𝑑) (in particular, being independent of β„‹(𝑑)). In [36] it is shown that if the problem (7.6)–(7.9) is feasible, then for all 𝛿>0 there must be an πœ”-only algorithm π›Όβˆ—(𝑑) that satisfies the following: 𝔼̂𝑦0ξ€·π›Όβˆ—(𝑑),πœ”(𝑑)≀𝑦opt0𝔼+𝛿,Μ‚π‘Žπ‘˜ξ€·π›Όβˆ—ξ€Έβˆ’Μ‚π‘(𝑑),πœ”(𝑑)π‘˜ξ€·π›Όβˆ—π”Όξ€Ί(𝑑),πœ”(𝑑)ξ€Έξ€»β‰€π›Ώβˆ€π‘˜βˆˆ{1,…,𝐾},Μ‚π‘¦π‘šξ€·π›Όβˆ—(𝑑),πœ”(𝑑)ξ€Έξ€»β‰€π›Ώβˆ€π‘šβˆˆ{1,…,𝑀}.(7.20)

We assume that the problem is feasible. It is often useful to assume that the following stronger slackness assumption is satisfied. There exists an πœ–>0 and a particular πœ”-only policy π›Όβˆ—(𝑑) (typically different than the policy that yields (7.20)) that yields the following: π”Όξ€ΊΜ‚π‘Žπ‘˜ξ€·π›Όβˆ—ξ€Έβˆ’Μ‚π‘(𝑑),πœ”(𝑑)π‘˜ξ€·π›Όβˆ—π”Όξ€Ί(𝑑),πœ”(𝑑)ξ€Έξ€»β‰€βˆ’πœ–βˆ€π‘˜βˆˆ{1,…,𝐾}Μ‚π‘¦π‘šξ€·π›Όβˆ—(𝑑),πœ”(𝑑)ξ€Έξ€»β‰€βˆ’πœ–βˆ€π‘šβˆˆ{1,…,𝑀}.(7.21) This assumption is similar to a Slater-type assumption used in convex optimization theory [5].

7.4. Performance Bounds for Subquadratics

Assume 1<𝛽<2. Because our policy 𝛼(𝑑) comes within 𝐢β‰₯0 of minimizing the right-hand-side of (7.17) every slot 𝑑 (given the observed β„‹(𝑑)), we have for all 𝑑 and all possible β„‹(𝑑)𝔼Δ(𝑑)+𝑉̂𝑦0ξ€»ξ€Ί(𝛼(𝑑),πœ”(𝑑))βˆ£β„‹(𝑑)≀𝐡+𝐢+𝑉𝔼̂𝑦0ξ€·π›Όβˆ—ξ€Έξ€»+(𝑑),πœ”(𝑑)βˆ£β„‹(𝑑)πΎξ“π‘˜=1π‘„π‘˜(𝑑)π›½βˆ’1π”Όξ€ΊΜ‚π‘Žπ‘˜ξ€·π›Όβˆ—ξ€Έβˆ’Μ‚π‘(𝑑),πœ”(𝑑)π‘˜ξ€·π›Όβˆ—ξ€Έξ€»+(𝑑),πœ”(𝑑)βˆ£β„‹(𝑑)π‘€ξ“π‘š=1π‘π‘š(𝑑)π›½βˆ’1π”Όξ€ΊΜ‚π‘¦π‘šξ€·π›Όβˆ—ξ€Έξ€»,(𝑑),πœ”(𝑑)βˆ£β„‹(𝑑)(7.22) where π›Όβˆ—(𝑑) is any other decision that can be implemented on slot 𝑑. Fix any 𝛿>0. Using the policy π›Όβˆ—(𝑑) designed to achieve (7.20) and noting that this policy makes decisions independent of β„‹(𝑑) yields 𝔼Δ(𝑑)+𝑉̂𝑦0𝑦(𝛼(𝑑),πœ”(𝑑))βˆ£β„‹(𝑑)≀𝐡+𝐢+𝑉opt0ξ€Έ+𝛿+π›ΏπΎξ“π‘˜=1π‘„π‘˜(𝑑)π›½βˆ’1+π›Ώπ‘€ξ“π‘š=1π‘π‘š(𝑑)π›½βˆ’1.(7.23) The above holds for all 𝛿>0. Taking a limit as 𝛿→0 yields 𝔼Δ(𝑑)+𝑉𝑦0ξ€»(𝑑)βˆ£β„‹(𝑑)≀𝐡+𝐢+𝑉𝑦opt0,(7.24) where for simplicity we have substituted 𝑦0(𝑑)=̂𝑦0(𝛼(𝑑),πœ”(𝑑)) on the left-hand side. Inequality (7.24) is in the exact form of the drift-plus-penalty condition (5.5) (with πœ–=0). Recall that the penalty 𝑦0(𝑑) is deterministically lower bounded by some finite (possibly negative) value 𝑦0min. Further, the moment bounds (7.16) can easily be shown to imply that the boundedness assumptions (5.2), (4.7), and (4.8) hold. Thus, we can apply Proposition 5.1 to conclude that all queues are rate stable. In particular π‘π‘š(𝑑)/𝑑→0 with probability 1 for all π‘˜, so that the constraints (7.7) are satisfied as follows: limsupπ‘‘β†’βˆžπ‘¦π‘š(𝑑)≀0βˆ€π‘šβˆˆ{1,…,𝑀}(w.p.1).(7.25) Further, again by Proposition 5.1, we have limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝑦0(𝜏)≀𝑦opt0+𝐡+𝐢𝑉(w.p.1).(7.26) Thus, all constraints of the problem are met, and the time average of penalty 𝑦0(𝑑) is within 𝑂(1/𝑉) of optimality.

The above did not require the slackness assumption. However, we can understand the tradeoff with 𝑉 by assuming that the slackness assumption holds, so that a policy π›Όβˆ—(𝑑) exists that satisfies (7.21) for some πœ–>0. Using (7.21) in the right-hand-side of (7.22) gives 𝔼[]𝑦Δ(𝑑)βˆ£β„‹(𝑑)≀𝐡+𝐢+𝑉0maxβˆ’π‘¦0minξ€Έβˆ’πœ–πΎξ“π‘˜=1π‘„π‘˜(𝑑)π›½βˆ’1βˆ’πœ–π‘€ξ“π‘š=1π‘π‘š(𝑑)π›½βˆ’1(7.27) From which we obtain by Proposition 5.1 that limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0ξƒ¬πΎξ“π‘˜=1π‘„π‘˜(𝜏)π›½βˆ’1+π‘€ξ“π‘š=1π‘π‘š(𝜏)π›½βˆ’1≀𝑦𝐡+𝐢+𝑉0maxβˆ’π‘¦0minξ€Έπœ–(w.p.1).(7.28)

7.5. Performance Bounds for Quadratics

Similarly, we can obtain performance bounds for the quadratic case 𝛽=2. Without the slackness condition, we can obtain an expression identical to (7.24). This ensures that all queues are rate stable (by Proposition 3.1), so that all desired constraints are satisfied. However, we cannot use Proposition 6.1 because the drift expression has πœ–=0. Thus, we require the slackness assumption to proceed further. Under this assumption, we derive (similar to (7.27)): 𝔼[]𝑦Δ(𝑑)βˆ£β„‹(𝑑)≀𝐡+𝐢+𝑉0maxβˆ’π‘¦0minξ€Έβˆ’πœ–πΎξ“π‘˜=1π‘„π‘˜(𝑑)βˆ’πœ–π‘€ξ“π‘š=1π‘π‘š(𝑑)(7.29) from which we obtain by Proposition 6.1limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0ξƒ¬πΎξ“π‘˜=1π‘„π‘˜(𝜏)+π‘€ξ“π‘š=1π‘π‘šξƒ­β‰€ξ€·π‘¦(𝜏)𝐡+𝐢+𝑉0maxβˆ’π‘¦0minξ€Έπœ–(w.p.1).(7.30) We also know from (7.29) and Proposition 6.3 that βˆ‘βˆžπ‘‘=1(𝔼[π‘„π‘˜(𝑑)2]/𝑑2)<∞, so that we conclude from (7.24) that limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝑦0(𝜏)≀𝑦opt0+𝐡+𝐢𝑉(w.p.1).(7.31)

8. Conclusions

This work derives sample path stability and time average penalty results for stochastic networks. It uses the law of large numbers for martingale differences to strengthen existing performance guarantees from time average expectations to pure time averages, with probability 1. This requires only mild-bounded second and fourth moment assumptions, and these assumptions were shown to be necessary via two simple counterexamples in Appendix A. The analysis considers both quadratic and subquadratic Lyapunov functions and uses both types of functions to develop and analyze algorithms for minimizing a time average penalty subject to network stability constraints and additional time average constraints. These results are useful for control of queueing networks, and for other stochastic problems of optimizing time averages subject to time average constraints.

Appendices

A. Counterexamples

This appendix provides two examples of systems that satisfy the drift-plus-penalty condition (2.3), so that the desired time average expectations (2.7)-(2.8) are satisfied. However, the corresponding sample path time averages do not hold because one of the bounded moment assumptions (6.2), (4.7), and (4.8) is violated.

A.1. The Penalty Process 𝑝(𝑑)

Define a process 𝑝(𝑑) that is independent over slots π‘‘βˆˆ{0,1,2,…} and satisfies the following: ⎧βŽͺ⎨βŽͺ⎩1𝑝(𝑑)=0withprobability1βˆ’21(𝑑+1)𝑑+1withprobability.2(𝑑+1)(A.1) Formally define 𝐐(𝑑)=𝟎 for all 𝑑, and use βˆ‘πΏ(𝐐)=(1/2)πΎπ‘˜=1𝑄2π‘˜, 𝑉=1, so that 𝔼[Ξ”(𝑑)+𝑉𝑝(𝑑)βˆ£β„‹(𝑑)]=𝔼[𝑝(𝑑)βˆ£β„‹(𝑑)]=1/2 for all 𝑑 and all β„‹(𝑑). It is clear that limπ‘‘β†’βˆžβˆ‘(1/𝑑)π‘‘βˆ’1𝜏=0𝔼[𝑝(𝜏)]=1/2. However, because the probability 1/(2(𝑑+1)) decays slowly, 𝑝(𝑑) is equal to 𝑑+1 for infinitely many slots π‘‘βˆˆ{0,1,2,…} (with probability 1), and for any such time π‘‘βˆ— we have 1π‘‘βˆ—+1π‘‘βˆ—ξ“πœ=0𝑝𝑑𝑝(𝜏)β‰₯βˆ—ξ€Έπ‘‘βˆ—=𝑑+1βˆ—+1π‘‘βˆ—+1=1.(A.2) Thus, with probability 1 we have limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝑝(𝜏)β‰₯1.(A.3) This differs from the time average expectation because the second moment condition (4.8) is violated.

A.2. The Queue Process 𝑄(𝑑)

Let 𝐾=1, 𝑉=0, 𝑝(𝑑)=0 for all 𝑑. Define a queue process 𝑄(𝑑) with 𝑄(0)=0 and with dynamics as follows: [],𝑄(𝑑+1)=max𝑄(𝑑)+π‘Ž(𝑑)βˆ’1,0(A.4) where π‘Ž(𝑑) is independent over slots π‘‘βˆˆ{0,1,2,…} and satisfies ⎧βŽͺ⎨βŽͺ⎩1π‘Ž(𝑑)=0withprobability1βˆ’βˆš100(𝑑+1)101𝑑+1withprobability.100(𝑑+1)(A.5) Define 𝐿(𝑄(𝑑))=(1/2)𝑄(𝑑)2 and define Ξ”(𝑑)=𝐿(𝑄(𝑑+1))βˆ’πΏ(𝑄(𝑑)). Clearly 𝑄(𝑑+1)2≀(𝑄(𝑑)+π‘Ž(𝑑)βˆ’1)2, and so a basic calculation shows that for all π‘‘βˆˆ{0,1,2,…} and all β„‹(𝑑) we have 𝔼[]≀1Ξ”(𝑑)βˆ£β„‹(𝑑)2𝔼1+π‘Ž(𝑑)2ξ€»[]1βˆ£β„‹(𝑑)βˆ’π‘„(𝑑)𝔼1βˆ’π‘Ž(𝑑)βˆ£β„‹(𝑑)=1βˆ’π‘„(𝑑)1βˆ’βˆš10ξƒ­ξ‚€9𝑑+1≀1βˆ’ξ‚10𝑄(𝑑).(A.6) Using πœ–=9/10, it follows by (2.8) that limsupπ‘‘β†’βˆž1π‘‘π‘‘βˆ’1ξ“πœ=0𝔼[]≀𝑄(𝜏)109.(A.7) However, with probability 1, we have that βˆšπ‘Ž(𝑑)=10𝑑+1 for an infinite number of slots π‘‘βˆˆ{0,1,2,…}. Let π‘‘βˆ— be such a slot, and assume that π‘‘βˆ—β‰₯1. Because 𝑄(𝑑) can decrease by at most 1 every slot, it follows that for all 𝜏∈{π‘‘βˆ—+1,…,π‘‘βˆ—βˆš+βŒˆπ‘‘βˆ—+1βŒ‰} we have βˆšπ‘„(𝜏)β‰₯10π‘‘βˆ—ξ‚œβˆš+1βˆ’π‘‘βˆ—ξ‚βˆš+1β‰₯8π‘‘βˆ—+1,(A.8) where we have used the fact that βŒˆβˆšπ‘‘βˆ—βˆš+1βŒ‰β‰€2π‘‘βˆ—+1. Then we have π‘‘βˆ—βˆš+βŒˆπ‘‘βˆ—+1βŒ‰ξ“πœ=0𝑄(𝜏)β‰₯π‘‘βˆ—βˆš+βŒˆπ‘‘βˆ—+1βŒ‰ξ“πœ=π‘‘βˆ—+18βˆšπ‘‘βˆ—