Backpressure-based control protocols: design and computational aspects

(1)

Backpressure-based Control Protocols:

Design and Computational Aspects

D.I. Miretskiy

University of Twente Enschede, The Netherlands Email: d.miretskiy@math.utwente.nl

W.R.W. Scheinhardt

University of Twente Enschede, The Netherlands Email: w.r.w.scheinhardt@utwente.nl

M.R.H. Mandjes

University of Amsterdam Amsterdam, The Netherlands Email: mmandjes@science.uva.nl

Abstract—Congestion control in packet-based networks is often

realized by feedback protocols. In this paper we assess their performance under a back-pressure mechanism that has been proposed and standardized for Ethernet metropolitan networks. In such a mechanism the service rate of an upstream queue is reduced when the downstream queue is congested, in order to protect the downstream queue. We study a Markovian model that captures the essentials of the protocol, but at the same time allows for numerical analysis. We first derive explicit results for the stability condition of the model (which turns out to be nontrivial). Then we present logarithmic estimates of the probability of buffer overflow in the second queue, which are subsequentially used when devising an efficient simulation procedure based on importance sampling. We conclude the paper by presenting a number of numerical results, and some general design guidelines.

I. INTRODUCTION

A variety of protocols has been proposed for purposes of congestion control in packet networks. A well-known mechanism is random early detection, as proposed in e.g. [4], which has been intensively studied. Here, packets are randomly dropped when the buffer content exceeds some threshold, so as to notify the user about incipient congestion. Similar feedback-based mechanisms have been proposed and standardized for congestion control in Ethernet metropoli-tan networks. The backpressure scheme defined in IEEE 802.3x [11], is intended to provide flow control on a hop-by-hop basis. For ease restricting ourselves to a two-node network (but the principle applies to networks of any size), it works roughly as follows. Traffic enters the first queue (‘upstream queue’), and after being served it is routed to a second queue (‘downstream queue’), where it is served and leaves the network. When the buffer content of the downstream queue exceeds some threshold, the service rate at the first queue is reduced, so as to protect the downstream queue. The upstream queue resumes serving at its orginal rate when the buffer content of the downstream queue is below the threshold again.

It is clear that it is of crucial importance to find simple, yet adequate design rules that determine suitable values for the buffer thresholds and service rates. A problem is that there are hardly any explicit results known for models of this type. Even under Markovian assumptions, the buffer content distribution cannot be given in closed form – just rough characteristics are

known, see e.g. [13]. The present paper presents performance evaluation techniques that enable numerical analysis, and in addition we present a number of insights that are useful when designing such a backpressure system.

More specifically, our contributions are the following: A. We determine the stability condition of the backpressure-based model. In other words, we find a condition on the arrival rate, service rates, and threshold value such that the queue does not systematically fill. This condition is rather surprising (and very different from the common ‘% < 1 conditions’).

B. Main performance metrics are the (end-to-end) delay, the throughput, and the probability of overflow in the sec-ond queue (as that was the queue that was intended to be protected). From a computational point of view, the latter metric is the most challenging. It is noted that it is not straightforward to estimate this probability through simulation, because of the rare event involved. We propose an advanced simulation algorithm, that manages to reliably estimate the probability of interest (which is the probability of overflow in the downstream queue before the system becomes empty, starting off from any given state) in a reasonable amount of time. This simulation procedure is based on importance sampling (IS). The method borrows elements from algorithms developed earlier for related (tandem) systems [2], [7].

The main idea behind IS is to simulate a system under a new probability measure which guarantees more frequent occurrence of the rare event of interest. To obtain an unbiased estimate, the output of the simulations is corrected by so-called likelihood ratios. The challenge is to construct a ‘good’ new measure. An often-used notion in this respect is that of asymptotic efficiency (or: asymptotic optimality) which essentially means that the variance of the estimator behaves approximately as the square of its first moment. When this is not the case, the estimator may even have infinite variance. We refer to [5] for more background on IS.

In more detail, the contribution is that we present a simple and efficient IS implementation for simulating the overflow probability in the slow-down model. On the one hand it is as easy to implement as the scheme in [2], or even as that in [7], but on the other hand it performs comparably in terms of computational demand. On the other hand it allows any given starting state, while extensive numerical experimentation

(2)

shows that it inherits the efficiency properties of [9]. In the analysis it turns out that three regimes (in terms of the decay rates of the queues) need to be distinguished.

C. The last part of the paper is devoted to a study of the performance aspects of our system. In particular, we quantify the effect of the value of the threshold. It clearly affects the mean end-to-end delay as well as the overflow probability of the downstream queue, and it is not clear a priori what value is optimal. Therefore we construct a cost function (including both performance metrics), and use it to develop a procedure that is capable of finding the optimal threshold value. The numerical values for the overflow probability are obtained by using the IS algorithm mentioned under item B.

This paper is structured as follows. The next section con-tains a model description, the stability result, and background on IS. In Section III we propose an IS scheme, and argue why it has attractive efficiency properties. In Section IV we demonstrate how to develop procedures for setting the parameter values (focusing on the value of the threshold), and we provide supporting numerical results. We draw conclusions in Section V.

II. MODEL ANDPRELIMINARIES

A. Model and Stability

In this section we describe the system, which is a series of two stations (or: queues), in detail. Jobs enter the system at the upstream queue according to a Poisson process with rate λ, and after being served they are forwarded to the downstream queue; after service in this second queue, they leave the network. Service times at the second station are exponential with parameter µ2all the time, but the service speed at the first

queue depends on the content of the second queue. Normally, service times at the first station are exponential with parameter

µ1, but if the number of jobs in the second queue is larger

than some prespecified threshold m – the so-called

slow-down threshold – then the service times are exponential with

parameter µ+₁, where µ+₁ < µ1. When the system ‘stabilizes’

and the number of jobs in the second queue is again strictly below the slow-down threshold, the rate of the first station returns to its original value µ1.

For convenience we choose the parameters such that λ +

µ1+µ2= 1, without loss of generality (in fact, we ‘rescale our

time unit’), and hence λ + µ+₁+ µ2< 1. As in [9], we assume

(as an approximation) the waiting rooms at both stations to be infinitely large. We define the discrete-time joint queue-length process Qj= (Q1,j, Q2,j), where Qi,jis the number of

jobs at node i after the j-th transition. We define the possible jump directions of the process Qj via vectors v0 = (1, 0),

v1= (−1, 1) and v2= (0, −1) with corresponding jump rates

λ, µ1 (or µ+1) and µ2(where it is noted that vk is impossible

if queue k is empty – then take vk= (0, 0) instead).

A first question is under what condition this process is stable – clearly for design purposes such a criterion is of crucial importance. Interestingly, the resulting criterion is

substantially more involved than the usual ‘% < 1 conditions’. Define ψ := µ1/µ2and ψ+:= µ+1/µ2.

Theorem 2.1: Case I: µ+₁ < µ2. The network is stable if

λ < µ1(1 − ψ

m_{)(1 − ψ}+_{) + µ}+

1ψm(1 − ψ)

(1 − ψm_{)(1 − ψ}+_{) + ψ}m_{(1 − ψ)} .

Case II: µ+₁ ≥ µ2. The network is stable if λ < µ2.

Proof: It is obvious that λ < µ2 is necessary, but not

sufficient for stability. We deal with both cases separately. Case I: µ+₁ < µ2. The proof relies on techniques from the

theory of Quasi-Birth-and-Death (QBD) processes [6]; we aim to prove the positive recurrence of the discrete-time process

Qj. From now on we will treat Qj as a discrete-time QBD

with Q1,j and Q2,jbeing the level and the phase, respectively;

note that this is not a ‘standard QBD’, as the number of phases is infinite.

We now introduce some QBD related notation. Let M0, M1

and M2be (n+1)×(n+1) dimensional matrices, with n being

the number of phases (either finite or infinite). M0 represents

an increase in level (new job arrives to the system), M1 no

change in level (job leaves the system) and M2 an decrease

in level (job is forwarded from the first queue to the second one); the precise definitions of these matrices were given in [13] for our model. [6, Thm. 7.2.3] now states that if the number of phases is finite, then the QBD is positive recurrent if αM0e < αM2e, where the vector α is the solution to

αM = 0, with M := M0+ M1+ M2; e is an all-1 vector.

Application of this result (which is not legitimate in our case, due to the infinite number of phases!) would yield that the QBD is positive recurrent if

m−1 X i=0 α0ψi(λ − µ1) + ∞ X i=m α0ψm(ψ+)i−m(λ − µ+1) < 0, (1) where αi= α0ψi, i < m α0ψm(ψ+)i−m, i ≥ m ; α0= m−1 X i=0 ψi+ ψm ∞ X i=m (ψ+)i−m !−1 ,

from which the first statement of the theorem would follow. There is a counterpart of [6, Thm. 7.2.3] that does deal with an infinite number of phases, though: [12, Thm. 5] states that the QBD with infinite number of phases is positive recurrent if αM0e < αM2e provided that ¯M = M1+ M2. Here ¯M is

an infinite-dimensional matrix that describes the behavior of the phase-process of the QBD at level 0, see again [13] for its precise form. The condition ¯M = M1+ M2effectively means

that the phase process in level 0 is the same as for any other level. Obviously, this requirement fails in our case. In order to be able to apply [12, Thm. 5] we modify the QBD Qj in

order to satisfy the condition condition ¯M = M1+ M2:

v1= (0, 1), when Q1,j = 0,

where the other transition vectors remain unchanged. Now we can conclude that this modified process is stable if (1) holds.

(3)

An elementary inspection yields that the cycle time (i.e., the number of transitions it takes the discrete-time process to reach the origin) of the modified QBD is stochastically larger than the cycle time of the original QBD, and hence stability of the original QBD is implied by the stability of the modified QBD. Case II: µ2 ≤ µ+1. Here we cannot apply the reasoning

mentioned above since the distribution αidoes not exist when

µ2 ≤ µ+1. However, in this case, stability can be proven

rather straightforwardly. Clearly, the expected cycle length in this case can be bounded from above by the expected cycle length for m = 0, due to elementary coupling arguments. The latter value is finite, since it corresponds to the mean busy cycle length of the two-node Jackson tandem with parameters

(λ, µ+₁, µ2), which is stable under λ < µ2≤ µ+1.

We remark that somewhat related results for the slow-down system with a finite second buffer were reported in [13, Thm. 15]. Also, interestingly, the slow-down system can be stable

even when λ > µ+₁! The intuition behind this is as follows.

Consider the case when both λ > µ+₁ and the condition in Theorem 2.1 hold true. The content of the first queue typically increases when the number of jobs in the second queue is above m. However, it stays finite because the content of the second queue tends to decrease and the system returns to its normal state in which the number of jobs in the first queue tends to decrease. It is also worthwhile to mention that when the slow down threshold m is 0 or ∞, the criterion mentioned above reduces to the standard stability condition for the ordinary two-node Jackson tandem network.

B. State Space and System Dynamics

In Section III we focus on estimating the probability of reaching some high level B in the second queue before it returns to the origin, starting from any given state; this subsection describes a number of notions that are useful with this goal in mind. From now on we let the threshold m scale linearly with B that is, m ≡ θB for some θ ∈ (0, 1). In terms of the scaled process (X1, X2) = (Q1/B, Q2/B), we

analyze the probability that its second coordinate attains the value 1 before reaching the origin. Note that an advantage of this scaling is that, in order to analyze this probability, we may use the state space [0, ∞) × [0, 1] (for any value of B!). We introduce the following subsets of the state space, with

x := (x1, x2): D := {x : x1> 0, 0 < x2< θ}, ∂2:= {(x1, 0) : x1> 0}, D+_{:= {x : x} 1> 0, θ ≤ x2< 1}, ∂θ:= {(x1, θ) : x1≥ 0}, ∂₁+:= {(0, x2) : x2∈ [θ, 1)}, ∂e:= {(x1, 1) : x1≥ 0}, ∂1:= {(0, x2) : x2> 0}.

The full state space is ¯D ∪ ¯D+_{, where ¯}_{D := D ∪ ∂}

θ∪ (∂1\

∂₁+) ∪ ∂2 and ¯D+ := D+ ∪ ∂e∪ ∂1+ ∪ ∂θ. Recall that the

transition vk is impossible when queue k is empty, i.e., when

Xj ∈ ∂k. We modify the process Xj to deal with this by

allowing ‘self-loop transitions’ in the following way (see also

Figure 1): for k = 1, 2, P(Xj+1= Xj|Xj∈ ∂k\ ∂+1) = µk, (2) P(Xj+1= Xj|Xj∈ ∂1+) = µ + 1/(λ + µ + 1 + µ2). -6 x1 x2 1 θ 0 @ @ R ∂1 @ @ R ∂₁+ ∂2 ∂e ∂θ -λ ?µ₂ @ @ I µ1 -λ ?µ2 @ @ I µ+₁ j µ1 ?µ₂ -λ Dm Dm+ @ @ I µ1 -λ j Y_µ 2

Fig. 1. State space and transition structure for the scaled process Xj.

Next, we introduce the stopping time τ_Bs, which is the first time that the process Xj hits level 1, starting from state s =

(s1, s2), without visits to the origin: with X0= s,

τ_Bs = inf{k : Xk ∈ ∂e, Xj 6= 0 for j = 1, . . . , k − 1}, (3)

and we define τs

B= ∞ if Xj hits the origin before ∂e. We let

I(As

B) be the indicator of the event {τBs < ∞} for the scaled

sample path (Xj, j = 0, . . . : X0 = s). Thus we can write

the probability of our interest as

ps_B _{= EI(A}s_B_{) = P(τ}_Bs < ∞). (4) It is clear that estimating the probability ps_B through direct, na¨ıve, simulations is not feasible when B grows large, because of the rarity of the event involved. We therefore have to use some alternative techniques to obtain a reliable estimator. In Section III we focus on importance sampling, which we will now describe briefly.

C. Background on Importance Sampling

To estimate ps

B, IS generates samples under a new

prob-ability measure Q, with respect to which P is absolutely continuous. The probability ps

B can, in self-evident notation,

now alternatively be expressed as

ps_B _{= E[I] = E}Q_[LI], ₍₅₎

where I is an indicator function and L is the likelihood ratio of a realization (‘path’) ω:

L = dP

dQ(ω). (6)

After n replications we obtain a family of observations

(4)

estimator of ps_Bby n−1·Pn

i=1LiIi. We conclude this

subsec-tion by recalling that an IS scheme (corresponding to measure

Q) is said to be asymptotically efficient if

lim inf

B→∞

log EQ_[L2_I]

log EQ_[LI] ≥ 2. (7)

If the probability of {As_B} decays exponentially in B, i.e., B−1log ps

B tends (for B large) to some value in (−∞, 0),

then, due to (5) and EQ_[L2

I] = E[LI], (7) simplifies to lim sup

B→∞

1

Blog E[LI] ≤ 2 limB→∞

1 Blog p

s

B. (8)

III. IMPORTANCESAMPLING

In [9] we have developed an asymptotically efficient IS-based method for estimating ps

B, but this has, from a practical

point of view, important drawbacks: the new measure is state-dependent, and needs to be recomputed at every transition (amounting to jointly solving two cubic equations), thus severely limiting the efficiency gain. In this section we present a new measure that is still state-dependent, but its computation is substantially less demanding, as it requires just a few cubic equations to be solved. As we will see, the speed-up of this new scheme is still substantial.

We give a detailed description of the IS scheme for the case µ2< µ+1 < µ1; for the other cases (i.e., µ+1 < µ2≤ µ1

and µ+₁ ≤ µ1 < µ2), we just present the results. Throughout

this section we fix the starting state s and assume it is situated below the slow-down threshold, i.e., s ∈ ¯D, which is evidently

the most interesting case.

A. IS Scheme for Case µ2 < µ+1 < µ1. At first recall

from [9] the most probable path to overflow and the pair of new measures (˜λ, ˜µ1, ˜µ2) and (˜λ+, ˜µ+1, ˜µ

+

2) that will ensure

that any sample path will follow the optimal trajectory with high probability. To this end we assign a ‘cost’ to any path. Minimizing this we obtain the optimal trajectory and corresponding new measure, see [9]. To ease the exposition on the new measures we divide the state space as it is shown in Figure 2, where α1 := (µ1− µ2)/(µ1 − λ) and

α+₁ := (µ+₁ − µ2)/(µ+1 − λ). This figure also provides some

examples of the most probable overflow trajectories. We are particularly interested in the partition of the bottom part of the state space, i.e., in A1, A2and A3. The new measures for

s ∈ A1∪ A3are not difficult, as we will see in (17). However,

to find the optimal new measure for s ∈ A2one first needs to

jointly solve            λ(line)_{= µ}(line) 1 + κ(line)_−s 1 θ−s2 (µ (line) 1 − µ (line) 2 ) λ(line)_{+ µ}(line) 1 + µ (line) 2 = λ + µ1+ µ2 λ(line)_µ(line) 1 µ (line) 2 = λµ1µ2 λ(line)_{≤ µ}(line) 1 and µ (line) 1 > µ (line) 2 λ(line)_{, µ}(line) 1 , µ (line) 2 > 0 (9) -6 θ 1 0 x1 x2 α1θ+α+1(1−θ) θ α1 + 1−θ α+₁ A1 A2 A3 r K r K o r I k

Fig. 2. Partition of ¯D ∪ ¯D+_{when µ}

2< µ+1 < µ1. and            λ+(line)_{= µ}+(line) 1 − κ(line) 1−θ(µ +(line) 1 − µ +(line) 2 ) λ+(line)_{+ µ}+(line) 1 + µ +(line) 2 = λ + µ + 1 + µ2 λ+(line)_µ+(line) 1 µ +(line) 2 = λµ + 1µ2 λ+(line)_{≤ µ}+(line) 1 and µ +(line) 1 > µ +(line) 2 λ+(line)_{, µ}+(line) 1 , µ +(line) 2 > 0 (10)

with the condition

κ(line)_{:= s} 1− µ(line) 1 − λ (line) µ(line) 1 − µ (line) 2 (θ − s2) = µ+(line) 1 − λ +(line) µ+(line) 1 − µ +(line) 2 (1 − θ). (11) It can be verified that this amounts to solving two coupled cubic equations. The superscripts (line) and +(line) indicate that the solution is in fact the optimal change of measure to reach the state (0, 1) following a concatenation of two straight line starting in s, with intersection in (κ(line)_{, θ).}

Now we can define the (overall) optimal new measures below and above the slow-down threshold, which depend only on the starting state s. The new measure below the slow-down threshold, as given through (˜λ, ˜µ1, ˜µ2), is

(˜λ, ˜µ1, ˜µ2) =    (µ2, µ1, λ) if s ∈ A1, (λ(line)_{, µ}(line) 1 , µ (line) 2 ) if s ∈ A2, (λ, µ1, µ2) if s ∈ A3. (12)

Above the slow-down threshold the new measure, given through (˜λ+_{, ˜}_µ+ 1, ˜µ + 2), is (˜λ+, ˜µ+₁, ˜µ+₂) =    (µ2, µ+1, λ) if s ∈ A1, (λ+(line)_{, µ}+(line) 1 , µ +(line) 2 ) if s ∈ A2, (λ, µ+₁, µ2) if s ∈ A3. (13) Now, let us define γs_{(x) to be the residual cost of moving}

from state x to ∂ealong the path to overflow that started in s:

γs(x) := γs 1(x) + γ2s(κ?, θ) if x ∈ ¯D, γs 2(x) if x ∈ ¯D+, (14) with γ1s(x) := − (x1− κ?) log ˜ λ λ− (θ − x2) log ˜ µ2 µ2 , if x ∈ ¯D (15)

(5)

being the minimal cost of the bottom part of the path to overflow and γs₂(x) := −x1log ˜ λ+ λ − (1 − x2) log ˜ µ+₂ µ2 , if x ∈ ¯D+ (16) being the minimal cost of the top part of the optimal path to overflow; the optimal crossing state (κ?_{, θ) is as follows}

κ?:=    max(0, s1− α1(θ − s2)) if s ∈ A1, κ(line) _{if s ∈ A} 2, s1− (θ − s2)/α1 if s ∈ A3. (17) Note that (˜λ, ˜µ1, ˜µ2), (˜λ+, ˜µ+1, ˜µ + 2) and κ? (given by (12),

(13) and (17) respectively) are fixed, i.e., they only depend on the fixed initial state s, and not on the current state x (as was the case in [9]). It is also important to note that γs(s), the total

cost of moving from the starting state s to the exit boundary

∂e, equals the decay rate of psB, i.e., B−1· log p s B→ −γ

s

(s),

see Theorem 3.2 in [9].

Notice that the function γs(x) is piecewise-linear in x,

since the new tilde-measure, i.e., (˜λ, ˜µ1, ˜µ2) and (˜λ+, ˜µ+1, ˜µ + 2),

depends only on the fixed initial state s, and not on the current state x. This is the main difference with the new measure studied in [9], where we used the optimal new measure for each current state x with its cost γx_{(x). Therefore a pair of}

cubic equations (corresponding to (9)–(11) with s replaced by

x) had to be solved for each state x in the sample path. In our

current approach, computation of the new measure requires the solution of (9)–(11) just once.

It is known, e.g. from our previous research [7], that the new measures (˜λ, ˜µ1, ˜µ2) and (˜λ+, ˜µ+1, ˜µ

+

2), which make a sample

path ‘on average’ follow the optimal trajectory to the rare set, is not necessarily asymptotically efficient; this is due to the possibility of several visits to the horizontal axis, which inflate the likelihood ratio, cf. [1], [10]. In order to resolve this, we first introduce the ‘hat’-measure (ˆλ, ˆµ1, ˆµ2) and (ˆλ+, ˆµ+1, ˆµ

+ 2)

as in [3], to be used when the current state is on or near the horizontal axis, through

(˜λ, µ1λ/˜λ, µ2) and (˜λ+, µ+1λ/˜λ +_{, µ}+

2). (18)

The main idea behind it is to make the likelihood ratios of the loops around the horizontal axis not greater than 1 (by ensuring ˆµ2= µ2).

Having introduced the ‘tilde-measure’ and the ‘hat-measure’, we are now ready to define the measure

(¯λ(x), ¯µ1(x), ¯µ2(x)), of which we will prove asymptotic

ef-ficiency, and which is a combination of the two measures defined above and the original measure:

¯ λ(x) = λ˜ρ1_λˆρ2_λρ3_{M (x),} _{if x ∈ ¯}_D, ¯ µ1(x) = µ˜ ρ1 1 µˆ ρ2 1 µ ρ3 1 M (x), if x ∈ ¯D, ¯ µ2(x) = µ˜ ρ1 2 µˆ ρ2 2 µ ρ3 2 M (x), if x ∈ ¯D, ¯ λ+_{(x) =} _(˜_λ+₎ρ1_(ˆ_λ+₎ρ2_(λ)ρ3_M+_(x), _{if x ∈ ¯}_D+_, ¯ µ+₁(x) = (˜µ+₁)ρ1_(ˆ_µ+ 1) ρ2_(µ+ 1) ρ3_M+_(x), _{if x ∈ ¯}_D+_, ¯ µ+₂(x) = (˜µ+₂)ρ1_(ˆ_µ+ 2)ρ2(µ2)ρ3M+(x), if x ∈ ¯D+, (19)

where M (x) and M+(x) are normalization functions, and the ρi(x) are weights, cf. [2], [3], given by

ρ1(x) = N (x) · e− 2γs (x1,x2)−δ _, ρ2(x) = N (x) · e− 2γs (x1,δ2logµ2λ)−δ _, ₍₂₀₎ ρ3(x) = N (x) · e− 2γs (0,0)−δ .

Here N (x) is a normalization function and δ and are small positive numbers. We mention again that this measure is, albeit state-dependent, of low computational complexity, as it does not require to solve cubic systems for any point along the path (except s). There are interesting relations between this new measure, and the scheme proposed in [2].

B. IS Scheme for Case µ+₁ ≤ µ2< µ1. Here we present the IS

scheme for the case when µ+₁ ≤ µ2< µ1. At first we provide

the pair of new measures (˜λ, ˜µ1, ˜µ2) and (˜λ+, ˜µ+1, ˜µ + 2) under

which virtually any sample path follows the most probable trajectory with high probability. The bottom part of the state space ¯D is divided in subspaces Bi as depicted in Fig. 3,

where α+₂ := (µ2− µ+1)/(µ2− λ) and α1has been introduced

in the previous subsection, cf. [9]. The new measure below

m, (˜λ, ˜µ1, ˜µ2), is as follows: (˜λ, ˜µ1, ˜µ2) =    (µ2, µ1, λ), if s ∈ B1, (λ(line)_{, µ}(line) 1 , µ (line) 2 ), if s ∈ B2, (λ, µ1, µ2), if s ∈ B3, (21)

whereas above m we have

(˜λ+, ˜µ+₁, ˜µ+₂) =      ( q λµ+₁ z+ , q λµ+₁ z+ , µ2z+), if s ∈ B1 (λ+(line)_{, µ}+(line) 1 , µ +(line) 2 ), if s ∈ B2, (λ, µ2, µ+1), if s ∈ B3, (22) where z+ is the solution (unique in (0, 1)) to

λ + µ+₁ + µ2(1 − z+) = 2

s λµ+₁

z+ . (23)

The optimal crossing state (κ?, θ) is now given by

κ?:=    0 if s ∈ B1, κ(line) _{if s ∈ B} 2, s1− (θ − s2)/α1 if s ∈ B3. (24)

The function γs_{(x) is defined by (14) with (15), (16) and}

(24). The total cost of the path γs_{(s) again is the decay}

rate of ps

B, i.e., B−1· log psB → −γs(s), see [9, Thm. 3.2].

The new state-dependent measures (¯λ(x), ¯µ1(x), ¯µ2(x)) and

(¯λ+_{(x), ¯}_µ+ 1(x), ¯µ

+

2(x)) are given by (19), where the

‘hat’-measures and the weights ρi(x) are defined by (18) and (20),

respectively.

C. IS Scheme for Case µ+₁ < µ1≤ µ2. This case is the most

difficult case. We partition the state space as in Fig. 4. To see whether a starting state s belongs to C1 or C2, one needs to

jointly solve (9)–(11). Then s belongs to C1 if and only if

f (s) ≤ 0, where f (s) = θ logµ2 λ + (θ − 1) log q − s1log µ1 λ − γ s_(s),

(6)

θ -6 x1 x2 1 0 α1θ _αθ 1 + 1−θ α+₂ B1 B2 B3 r r r K 6 6 o M I k

Fig. 3. Partition of ¯D ∪ ¯D+_{when µ}+

1 ≤ µ2< µ1. θ -6 x1 x2 1 0 β1 _αθ 2 + 1−θ α+₂ 0 β C1 C2 C3 r r r K O j -o K I k

Fig. 4. Partition of ¯D ∪ ¯D+when µ+₁ < µ1≤ µ2.

with q being the solution (unique in (0, µ+₁λ/µ2 1)) to

µ1µ2q2+ µ1(µ1− λ − µ+1 − µ2)q + λµ+1 = 0, (25)

see also [7], [9]. The constants β and β1 are solutions to

f (0, s2) = 0 and f (s1, 0) = 0 respectively; α2 := (µ2−

µ1)/(µ2− λ).

• At first let us assume that s ∈ C2∪ C3. Then we define

the new measure below the slow-down threshold

(˜λ, ˜µ1, ˜µ2) = (λ(line)_{, µ}(line) 1 , µ (line) 2 ), if s ∈ C2, (λ, µ2, µ1), if s ∈ C3, (26) and above it (˜λ+, ˜µ+₁, ˜µ+₂) = (λ+(line)_{, µ}+(line) 1 , µ +(line) 2 ), if s ∈ C2, (λ, µ2, µ+1), if s ∈ C3, (27) The optimal crossing state (κ?, θ) is now given by

κ?:= κ(line) _{if s ∈ C} 2, s1− (θ − s2)/α2 if s ∈ C3. (28) The function γs_{(x) is again defined by (14) with (15),}

(16) and (28). The total costs of the path, γs_{(s), is the}

decay rate of ps

B, i.e., B−1 · log psB → −γs(s), see

[9, Thm. 3.2]. The new measures (¯λ(x), ¯µ1(x), ¯µ2(x))

and (¯λ+(x), ¯µ+₁(x), ¯µ+₂(x)) are given by (19), where the

‘hat’-measures and the weights ρi(x) are defined by (18)

and (20), respectively.

• Let us now proceed to the case s ∈ C1. In this case the

optimal new measure below the slow-down threshold at the k-th transition is

(˜λ, ˜µ1, ˜µ2) = (29)

(µ1, λI{k<τ?_}+ µ₂I_{k≥τ?_}, µ₂I_{k<τ?_}+ λI_{k≥τ?_}), where τ?_{:= min{k : X}

k ∈ C3 and X2,k= 0}; see also

the optimal trajectory that starts in C1, Figure 4. Above

it the new measure is defined by

(˜λ+, ˜µ+₁, ˜µ+₂) = (µ1, λµ+1/qµ1, qµ2), (30)

where q is defined by (25).

It is important, that the residual cost of the bottom part of the path to overflow, namely γ1(x), for this case is

different from the one defined by (15). Here it is

γ₁s(x) = θ logµ2 λ − (x1− κ ?_{) log}λ˜ λ+ x2log ˜ µ2 µ2 , (31) where κ?= (1 − θ)(λµ+₁ − qµ21)/(λµ + 1 − q 2_µ 1µ2). (32)

For this case γs

2(x) is defined by (16), and γs(x) again

by (14) with (31), (16) and (32). As before, γs_{(s) is}

the decay rate of ps

B, i.e., B−1 · log psB → −γs(s).

The new state-dependent measures (¯λ(x), ¯µ1(x), ¯µ2(x))

and (¯λ+(x), ¯µ+₁(x), ¯µ+₂(x)) are given by (19), where the

‘hat’-measures and the weights ρi(x) are defined by (18)

and (20) respectively.

For convenience we summarize the resulting IS scheme for the different cases.

• When µ2< µ+1 < µ1 one needs to

1. define the ‘primary’ new measures (12) and (13); 2. define ‘hat’-measures (18);

3. define weights ρi(x) by (20) based on (14)–(17);

4. apply (19).

• When µ+₁ ≤ µ2 < µ1, the same procedure is

followed, only replacing the ‘primary’ new mea-sures (12) and (13) by (21) and (22) in step 1; and (17) by (24) in step 3.

• When µ+₁ < µ1≤ µ2 and s ∈ C2∪ C3, we follow

the same algorithm, only with the ‘primary’ new measure replaced by (26) and (27) in step 1; and using (28) instead of (17) in step 3.

• When µ+₁ < µ1 ≤ µ2 and s ∈ C1, again the

same procedure is followed, this time replacing the ‘primary’ new measure by that in (29) and (30); we also replace (15) by (31) and (17) by (32) in step 3.

In [9] we proved asymptotic efficiency of the fully state-dependent IS scheme, i.e., when the ‘primary’ new measure was dependent on the current state. Analyzing the simplified

(7)

IS scheme (19) we have to deal with the additional compli-cation that the discontinuity of γs(x) around the slow-down

threshold, see e.g., (14)–(16), (12) and (13). We do conjecture, though, that the scheme is asymptotic efficient; for specific cases we proved this in [8].

IV. COMPUTATIONALASPECTS ANDDESIGNISSUES In this section we first study the efficiency gain achieved by applying IS (rather than na¨ıve simulation), and then we address a number of design issues.

In Table I we present simulation results for four different parameter settings using the new measure defined in (19). Instead of performing a fixed number of simulation runs such as in much of the IS literature, we simulated until the relative error of the estimator reached some prespecified value, viz.

10−2 for the first three cases and 5 · 10−2 for the last one. In the table we present 95% confidence intervals for ps

B, the

number of needed replications (# runs), the used machine time in seconds, and the number of ‘succesful’ replications (# succ.), i.e. the number of runs that resulted in buffer overflow. We compare three values of the overflow level B; the value of δ was taken to be δ = −1₃ log and = 0.001, as in

[8]. We fixed the starting state s at the origin, since this is the most natural choice from a practical perspective. Clearly,

(λ, µ1, µ+1, µ2) = (0.1, 0.7, 0.15, 0.2), RE = 0.01

B ps_B # succ. # runs time

20 3.79 · 10−7_{± 7.44 · 10}−9 _{15, 576} _{28, 332} _0.4 50 1.28 · 10−16_{± 2.52 · 10}−18 _{33, 542} _{58, 332} ₂ 100 3.54 · 10−32± 6.95 · 10−34 _{56, 982} _{109, 992} ₈ (λ, µ1, µ+1, µ2) = (0.3, 0.36, 0.32, 0.34), RE = 0.01 20 5.63 · 10−2± 1.11 · 10−4 39, 496 91, 596 2 50 1.19 · 10−3± 2.33 · 10−5 99, 567 241, 332 18 100 1.63 · 10−6_{± 3.21 · 10}−8 _{128, 864} _{320, 120} ₄₉ (λ, µ1, µ+1, µ2) = (0.3, 0.36, 0.35, 0.34), RE = 0.01 20 5.86 · 10−2± 1.44 · 10−4 _{32, 283} _{76, 169} ₂ 50 1.42 · 10−3± 2.79 · 10−5 61, 034 152, 283 12 100 2.64 · 10−6± 5.18 · 10−8 113, 527 279, 196 47 (λ, µ1, µ+1, µ2) = (0.25, 0.35, 0.28, 0.4), RE = 0.05 20 1.11 · 10−4± 1.09 · 10−5 _{45, 685} _{83, 436} ₂ 50 3.43 · 10−11± 3.36 · 10−12 _{79, 901} _{148, 256} ₇ 100 5.72 · 10−22± 5.60 · 10−23 _{235, 502} _{439, 006} ₄₂ TABLE I

SIMULATION RESULTS FORθ = 0.8

the IS scheme provides fast and reliable estimates. In some cases, especially when B grows large, the running times may be sensitive to the choice of and δ. In Table I we used relatively ‘good’ and δ. For instance, choosing = 0.01 will lead to almost 9−fold increase in the number of replications in the last line of the first part of Table I, i.e., for the case when (λ, µ1, µ+1, µ2) = (0.1, 0.7, 0.15, 0.2) and B = 100.

We also performed a few standard Monte Carlo simulations (i.e., without IS) for comparison, using the same relative error of 10−2to assess the efficiency gain. For the parameter settings of the first part of Table I with B = 20, this took 4521 seconds (±5 · 109 runs). In the settings of the second part of Table I with B = 50 it took 118 seconds (±107 runs).

To enable comparison with the state-independent scheme in [7] and the state-dependent scheme in [2], we also fixed the

(λ, µ1, µ+1, µ2) = (0.1, 0.7, 0.15, 0.2)

B st.-ind., [7] st.-dep., [2] current 20 1.49 · 10−3 _{2.63 · 10}−3 _{3.54 · 10}−3 50 2.06 · 10−3 _{7.87 · 10}−3 _{8.00 · 10}−3 100 2.75 · 10−3 19.71 · 10−3 17.01 · 10−3 (λ, µ1, µ+1, µ2) = (0.3, 0.36, 0.32, 0.34) 20 0.92 · 10−3 5.30 · 10−3 6.00 · 10−3 50 12.50 · 10−3 8.40 · 10−3 11.00 · 10−3 100 39.69 · 10−3 _{12.20 · 10}−3 _{11.00 · 10}−3 TABLE II

COMPARISON OF RELATIVE ERRORS FOR THREEISSCHEMES

number of runs to be 106and compared the relative errors, see Table II. Here, s = (0, 0), θ = 0.8, and in the state-dependent schemes = 0.03/√B and δ = − log . As can be expected,

both state-dependent schemes provide good estimates (in terms of the relative errors), but the performance of the state-independent scheme strongly depends on the parameters.

We now demonstrate techniques that enable selection of a proper value for the slowdown threshold m = θB. A first

caveat is the following. It is natural to expect that smaller

θ will provide better protection of the second node and

consequently smaller probability of overflow (with s being the origin), but this is not always the case. Indeed, numerical experiments show that, starting in θ = 1, decreasing θ leads to a reduction of ps

B. However, continuing to decrease θ, the

probability of interest will start to increase. The same holds for the stationary probability of the process Xk, with the

slow-down threshold θ, to be above level 1, denoted by πθ_{(B), see}

Figs. 5 and 6. In these graphs we plotted πθ_{(B) against θ}

for parameters (λ, µ1, µ+1, µ2) = (0.3, 0.36, 0.32, 0.34), with

overflow levels B = 20 and B = 50.

Fig. 5. πθ_{(20) against θ}

Fig. 6. πθ_{(50) against θ}

For the case of ‘shifting bottlenecks’, i.e., µ+₁ < µ2 < µ1,

we now provide an explanation for the non-monotone behavior of πθ_{(B). Clearly, π}θ_{(B) has decay rate θ log ρ}

2 + (1 −

(8)

ρ2 is defined as λ/µ2. However, when θ = 0 the decay rate

is just log ρ2, as the queue is then an ordinary tandem queue

(without backpressure); in [7] it was shown that z+< ρ2. This

shows that the decay rate is discontinuous in θ = 0, explaining that π0_{(B) > π}0+_{(B) (for B large).}

The above type of justification for the observed non-monotonicity is valid only for the case that µ+₁ ≤ µ2 < µ1.

Another explanation for the decreasing nature of πθ_{(B) for}

small θ is in the ‘specific’ behavior of the Xk around the

origin. More precisely, consider the process Xk with starting

state s = (0, 1), and compare threshold levels θ = 0 and

θ = 1/B. In the latter case, the first server operates at full

speed only when the second queue is empty. It is not difficult to see that the probability of transition (1, 0) → (0, 1) is higher when θ = 1/B; and the probability of the ‘terminating’ transition (0, 1) → (0, 0) is µ2/(λ + µ2), which does not

depend on θ. This means that the probability of overflow starting from the origin, p0

B, is higher when θ = 0 than when

θ = 1/B, even though we have ‘more slow-down’ in the

first case. One can generalize this type of arguments for the other states around the origin and the other values of θ. These arguments, unlike the ones based on decay rates, hold for all parameter values.

We now demonstrate how to develop procedures for op-timally choosing the value of the slow-down threshold. The primary role of the backpressure mechanism is to control the probability of some undesirable event, viz. overflow in the second buffer (expressed in terms of ps_B). However, introduc-ing server slow-down has a negative side effect: the expected sojourn time of a job is decreasing in θ. In order to find an optimal value of θ, one could, for given coefficient α and β, minimize the following (dis-)utility function

u(θ, B) = −α log−1p0_B+ βS(θ, B),

where S(θ, B) is the mean sojourn time of a job, α is the penalty for overflow and β is the cost for each job being in the system per unit time; we assume s is the origin. The α and β should be chosen by the service provider, and should reflect the Service Level Agreement (SLA) as agreed upon.

We present plots of the utility functions u(θ, 20) and

u(θ, 50), with α = 10 and β = 3, for a system with parameters (λ, µ1, µ+1, µ2) = (0.3, 0.36, 0.32, 0.34) in Figures 7 and 8

respectively.

Fig. 7. Utility function against θ for B = 20

We observe that the optimal points are located close to the minimum of πθ_{(B), see Figures 5 and 6, as one may expect.}

Fig. 8. Utility function against θ for B = 50

V. CONCLUSION

In this paper we analyzed a backpressure-based control mechanism. We first determined its (non-trivial) stability con-dition. Then we focused on efficient IS-based simulation techniques for estimating the probability of overflow in the downstream queue (which outperforms methods [9] developed earlier). It is noted that several aspects, which are not captured by the notion of asymptotic efficiency, play a crucial role in as-sessing the performance of this type of simulation techniques: it matters for instance very much whether a new measure requires computation of new transition rates ‘on the fly’, or whether these can be precomputed. These issues have been taken into account in the present paper. We then demonstrated how the evaluation techniques developed in this paper help in tuning the design parameters involved, specifically focusing on selecting an appropriate value for the slow-down threshold.

REFERENCES

[1] P.T. de Boer. Analysis of state-independent importance-sampling mea-sures for the two-node tandem queue. ACM Transactions on Modeling and Computer Simulation, 16(3):225–250, 2006.

[2] P. Dupuis, K. Leder, and H. Wang. Large deviations and importance sampling for a tandem network with slow-down. Queueing Systems: Theory and Applications, 57(2-3):71–83, 2007.

[3] P. Dupuis, A.D. Sezer, and H. Wang. Dynamic importance sampling for queueing networks. Annals of Applied Probability, 17(4):1306–1346, 2007.

[4] S. Floyd and V. Jacobson. Congestion gateways for packet networks. IEEE/ACM Transactions on Networking, 1:397–413, 1993.

[5] P. Heidelberger. Fast simulation of rare events in queueing and reliability models. ACM Transactions on Modeling and Computer Simulation, 5(1):43–85, 1995.

[6] G. Latouche and V. Ramaswami. Introduction to matrix analytic methods in stochastic modelling. SIAM, 1999.

[7] D.I. Miretskiy, W.R.W. Scheinhardt, and M.R.H. Mandjes. Efficient simulation of a tandem queue with server slow-down. Simulation, 83(11):751–767, 2007.

[8] D.I. Miretskiy, W.R.W. Scheinhardt, and M.R.H. Mandjes. Simple and efficient importance sampling scheme for a tandem queue with server slow-down. In Proceedings of RESIM, pages 38–50, Rennes, France, 2008.

[9] D.I. Miretskiy, W.R.W. Scheinhardt, and M.R.H. Mandjes. State-dependent importance sampling for a Slow-down tandem queue. Submit-ted, 2008. See also Memorandum 1879, Dept. of Applied Mathematics, University of Twente, URL: http://eprints.eemcs.utwente.nl/13251/. [10] S. Parekh and J. Walrand. A quick simulation method for excessive

backlogs in networks of queues. IEEE Transactions on Automatic Control, 34:54–66, 1989.

[11] IEEE standard 802.3. Carrier Sense Multiple Access with Collision Detection (CSMA/CD) access method and physical layer specification, annex 31b. 1998.

[12] R.L. Tweedie. Operator-geometric stationary distributions for Markov chains, with applications to queueing models. Adv. in Appl. Probab., 14:368–391, 1982.

[13] N. D. van Foreest, M.R.H. Mandjes, J.C.W. van Ommeren, and W.R.W. Scheinhardt. A tandem queue with server slow-down and blocking. Stochastic Models, 21(2-3):695–724, 2005.