Index of /pub/pub/pub/pub/pub/pub/pub/pub/pub/SISTA/pcoppens/CDC_2021

Academic year: 2022


stochastic systems

Peter Coppens and Panagiotis Patrinos

Abstract

In this paper we introduce a novel approach to distributionally robust optimal control that supports online learning of the ambiguity set, while guaranteeing recursive feasibility. We introduce conic representable risk, which is useful to derive tractable reformulations of distributionally robust optimization problems.

Specifically, to illustrate the techniques introduced, we utilize risk measures constructed based on data-driven ambiguity sets, constraining the second moment of the random disturbance. In the optimal control setting, such moment-based risk measures lead to tractable optimal controllers when combined with affine disturbance feedback. Assumptions on the constraints are given that guarantee recursive feasibility. The resulting control scheme acts as a robust controller when little data is available and converges to the certainty equivalent controller when a large sample count implies high confidence in the estimated second moment.

This is illustrated in a numerical experiment.

I. INTRODUCTION

Distributionally robust optimization (DRO) has gained traction recently as a technique that balances robustness with performance in an intuitive fashion. From a theoretical point of view such techniques act as regularizers [1], and in a data-driven setting DRO acts at the interface between stochastic and robust optimization [2].

In the control community the potential of such techniques has not gone unnoticed [3], [4]. Here one would ideally solve stochastic optimal control problems like

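$$
\begin{aligned}
\underset{\pi \in \Pi}{\text{minimize}} \quad & \mathbb{E}\Big[\sum_{t=0}^{N-1} \ell_t(x_t, \pi_t(w_0, \dots, w_{t-1})) + \ell_N(x_N)\Big] \\
\text{subj. to} \quad & x_{t+1} = f(x_t, \pi_t(w_0, \dots, w_{t-1}), w_t), \quad t \in \mathbb{N}_{0:N-1}, \\
& \mathbb{P}[\phi(x_t) \le 0] \ge 1 - \varepsilon, \quad t \in \mathbb{N}_{1:N-1}, \\
& \psi(x_N) \le 0 \ \text{a.s.}, \quad x_0 \ \text{given},
\end{aligned}
$$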

where $x_t \in \mathbb{R}^{n_x}$ denotes the state. Parametrized, causal policies $\pi \in \Pi$ map disturbances to inputs. That is, an element $\pi_t$ of the sequence $\pi = \{\pi_t\}_{t=0}^{N-1}$ maps $\{w_i\}_{i=0}^{t-1}$ to inputs in $\mathbb{R}^{n_u}$ for $t \ge 1$, and $\pi_0 \in \mathbb{R}^{n_u}$. Here, the disturbances $w_t \in \mathbb{R}^{n_w}$ are i.i.d. random vectors, the distribution of which is unknown, usually introducing the need for robust approaches. DRO then improves upon classical robust control by using available data to infer properties of the distribution, while retaining guarantees.

P. Coppens and P. Patrinos are with the Department of Electrical Engineering (ESAT-STADIUS), KU Leuven, Kasteelpark Arenberg 10, 3001 Leuven, Belgium. Email: peter.coppens@kuleuven.be, panos.patrinos@kuleuven.be

This work was supported by: the Research Foundation Flanders (FWO) PhD grant 11E5520N and research projects G0A0920N, G086518N and G086318N; Research Council KU Leuven C1 project No. C14/18/068; Fonds de la Recherche Scientifique – FNRS and the FWO – Vlaanderen under EOS project no 30468160 (SeLMA); EU’s Horizon 2020 research and innovation programme: Marie Skłodowska-Curie grant No. 953348.

The core construct in DRO is the ambiguity set, a set of distributions against which one should robustify. Several ambiguity sets have been examined with varying success.

The most common are moment-based, φ-divergence and Wasserstein ambiguity sets [5].

Such ambiguity sets are connected to so-called risk measures by duality. Hence this approach is directly related to risk-averse optimization [6].

Throughout the paper we rely on conic representable ambiguity and risk to derive tractable problems, similar to the methodology presented in [7], [8]. The main contributions are then as follows: (i) we derive tight, data-driven, moment-based ambiguity sets that are conic representable and shrink when more data becomes available; (ii) we extend conic risks to the multi-stage setting and use them to model average value-at-risk constraints; (iii) we synthesize the controller such that it is recursively feasible when it is applied in a receding horizon fashion; (iv) we illustrate how our framework leads to tractable controllers based on affine disturbance feedback policies [9], which are evaluated in numerical experiments.

Similar results were achieved in [10] for a tube-based approach with Wasserstein ambiguity, the radius of which is not data-driven; in [11] for relaxed, robust constraints; in [4] for moment-based ambiguity which is not data-driven and does not guarantee recursive feasibility; and in [12], [13] for discrete distributions. DR control of Markov decision processes with finite state-spaces was also considered in [14]. Our framework supports online learning of truly data-driven ambiguity sets and risk constraints within a continuous state-space, while guaranteeing recursive feasibility.

This section continues with notation and preliminaries. Next, §II introduces conic and data-driven ambiguity in a single-stage setting and §III introduces multi-stage extensions as well as the optimal control problem that we want to solve. Then §IV shows how to construct a controller such that recursive feasibility is guaranteed. Finally, §V illustrates how our techniques lead to tractable controllers and contains numerical experiments.

A. Notation and preliminaries

Let $\mathbb{N}$ denote the integers and $\mathbb{R}$ ($\mathbb{R}_+$) the (nonnegative) reals. We denote by $\mathbb{S}^d$ the symmetric $d$ by $d$ matrices and by $\mathbb{S}^d_{++}$ ($\mathbb{S}^d_+$) the positive (semi)definite matrices.

For two matrices of compatible dimensions $X, Y$ we use $[X; Y]$ ($[X, Y]$) for vertical (horizontal) concatenation. We use $\|\cdot\|_2$ to denote the spectral norm (Euclidean norm) for matrices (vectors) and $[\cdot]_+ := \max(0, \cdot)$. For matrices (or vectors) $X, Y$ and cone $\mathcal{K}$, let $X \preceq_{\mathcal{K}} Y$ ($X \succeq_{\mathcal{K}} Y$) be $Y - X \in \mathcal{K}$ ($X - Y \in \mathcal{K}$). When $\mathcal{K} = \mathbb{S}^d_+$ we use $\preceq$ ($\succeq$).

Meanwhile, $(X, Y) := [\mathrm{vec}(X); \mathrm{vec}(Y)]$ interprets $X, Y$ as column vectors in vertical concatenation. Let $\mathrm{diag}(X, Y)$ be a (block) diagonal matrix and let $I_d \in \mathbb{S}^d$ denote the identity. For a vector $x \in \mathbb{R}^d$, $[x]_i$ denotes the $i$'th element.

Slice notation: We introduce $\mathbb{N}_{a:b} = \{a, \dots, b\}$. Similarly we use $w_{a:b}$ to denote the sequence $\{w_i\}_{i \in \mathbb{N}_{a:b}}$. For a sequence of length $N$, index $a$ ($b$) is omitted when $0$ ($N - 1$) is implied (when both are omitted we write $w$).

Interpreting $w_{0:N-1}$ with $w_i \in \mathbb{R}^{n_w}$ as an element of $\mathbb{R}^{N n_w}$, consider affine maps $x_{0:M-1} = A w_{0:N-1} + a_{0:M-1}$ ($A \in \mathbb{R}^{M n_x \times N n_w}$). Introducing homogeneous coordinates $\bar{w} = (w, 1)$ gives $x = \bar{A} \bar{w}$, with $\bar{A} = [A, a]$.

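For a matrix $A$ acting on sequences, the slice $A_{i:j,k:\ell}$ describes the part mapping $w_{k:\ell}$ to $x_{i:j}$. So we take block rows and block columns, with blocks of size $\mathbb{R}^{n_x \times n_w}$.

Risk measures and ambiguity: Given some measurable space $(\mathcal{W}, \mathcal{B})$ with $\mathcal{W}$ a compact subset of $\mathbb{R}^{n_w}$ and $\mathcal{B}$ the associated Borel sigma-algebra, we use $\mathcal{M}_+(\mathcal{W})$ ($\mathcal{M}(\mathcal{W})$) to denote the space of finite (signed) measures on $(\mathcal{W}, \mathcal{B})$, making the dependency on $\mathcal{W}$ explicit. Similarly, let $\mathcal{P}(\mathcal{W})$ denote the space of probability measures.

We also consider the space $\mathcal{Z} := C(\mathcal{W})$ of continuous (bounded) $\mathcal{B}$-measurable functions $z : \mathcal{W} \to \mathbb{R}$. Elements of $\mathcal{Z}$ act as random loss functions. Notation $z \sim \mu$ means $z$ has distribution $\mu \in \mathcal{P}(\mathcal{W})$. The spaces $\mathcal{M}(\mathcal{W})$ and $\mathcal{Z}$ are paired by the bilinear form [6, §2.2], for $z \in \mathcal{Z}$, $\mu \in \mathcal{M}(\mathcal{W})$,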

$$
\langle z, \mu \rangle := \int_{\mathcal{W}} z(w)\, d\mu(w).
$$

We endow M(W) with the weak topology.

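We write $z \succeq 0$ ($\mu \succeq 0$) to imply $z(w) \ge 0$ ($\mu(w) \ge 0$), $\forall w \in \mathcal{W}$. Note that, since $\mathcal{Z}$ and $\mathcal{M}(\mathcal{W})$ are linear spaces, we can use the usual notation of linear operators (e.g., let $E : \mathcal{M}(\mathcal{W}) \to \mathbb{R}^n$, then $E\mu = (\langle \epsilon_0, \mu \rangle, \dots, \langle \epsilon_{n-1}, \mu \rangle)$ for some random variables $\epsilon_i \in \mathcal{Z}$, $i \in \mathbb{N}_{0:n-1}$). For each linear operator $E$ we have the adjoint $E^* : \mathbb{R}^n \to \mathcal{Z}$, with $E^*\lambda := (\epsilon_0, \dots, \epsilon_{n-1}) \cdot \lambda$, where $\cdot$ is the usual inner product between vectors. After all,

$$
\begin{aligned}
E\mu \cdot \lambda &= (\langle \epsilon_0, \mu \rangle, \dots, \langle \epsilon_{n-1}, \mu \rangle) \cdot \lambda \\
&= \Big( \int_{\mathcal{W}} (\epsilon_0(w), \dots, \epsilon_{n-1}(w))\, d\mu(w) \Big) \cdot \lambda \\
&= \int_{\mathcal{W}} \big( (\epsilon_0(w), \dots, \epsilon_{n-1}(w)) \cdot \lambda \big)\, d\mu(w) \\
&= \langle (\epsilon_0, \dots, \epsilon_{n-1}) \cdot \lambda, \mu \rangle = \langle E^*\lambda, \mu \rangle.
\end{aligned}
\tag{1}
$$

We define risk based on its ambiguity as in most DRO literature [6]. Specifically, we say that $\mathcal{A} \subseteq \mathcal{M}_+(\mathcal{W})$ is an ambiguity set if it is a non-empty, closed and convex subset of $\mathcal{P}(\mathcal{W})$. The associated risk measure $\rho_{\mathcal{A}} : \mathcal{Z} \to \mathbb{R}$ is then [6, §2]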

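$$
\rho_{\mathcal{A}}[z] = \max_{\mu \in \mathcal{A}} \langle z, \mu \rangle = \max_{\mu \in \mathcal{A}} \mathbb{E}_\mu[z], \tag{2}
$$

where $\mathbb{E}_\mu[\cdot]$ denotes the expected value w.r.t. $\mu \in \mathcal{P}(\mathcal{W})$. The risk measure constitutes a mapping from random loss functions to the real line, which (similarly to expectation) can be used to deterministically compare random variables.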

Our definition of an ambiguity set is directly related to that of coherent risk [15].

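Lemma I.1. Suppose that $\mathcal{A} \subseteq \mathcal{P}(\mathcal{W})$ is non-empty, closed and convex. Then $\rho_{\mathcal{A}}$ in (2) is coherent. Specifically, $\forall z, z' \in \mathcal{Z}$ and $\alpha \in \mathbb{R}$, $\rho_{\mathcal{A}}$ is

(i) convex, proper, and lower semi-continuous;

(ii) monotone: $\rho_{\mathcal{A}}(z) \ge \rho_{\mathcal{A}}(z')$ if $z \succeq z'$;

(iii) translation equivariant: $\rho_{\mathcal{A}}(z + \alpha) = \rho_{\mathcal{A}}(z) + \alpha$;

(iv) positive homogeneous: $\rho_{\mathcal{A}}(\alpha z) = \alpha \rho_{\mathcal{A}}(z)$ if $\alpha > 0$.

Moreover, $\mathcal{A}$ is compact and equal to the domain of $\rho^*_{\mathcal{A}}$, and $\rho_{\mathcal{A}}[z]$ is finite, where $\rho^*_{\mathcal{A}}$ denotes the convex conjugate.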


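Proof. Let $\chi_{\mathcal{A}}$ denote the indicator function of $\mathcal{A}$ (i.e. $\chi_{\mathcal{A}}[\mu] = +\infty$ if $\mu \notin \mathcal{A}$ and $0$ otherwise). Then, by (2),

$$
\rho[z] = \chi^*[z] = \sup_{\mu \in \mathcal{M}(\mathcal{W})} \{ \langle z, \mu \rangle - \chi[\mu] \}, \tag{3}
$$

where we omit the subscript of $\rho_{\mathcal{A}}$ and $\chi_{\mathcal{A}}$ for convenience. Since $\chi$ is an indicator function, it is convex ($\mathcal{A}$ is convex); lower semi-continuous ($\mathcal{A}$ is closed); and proper ($\mathcal{A}$ is nonempty). Therefore, by [16, Prop. 2.112] and (3), $\rho$ is proper, convex and lower semi-continuous (its epigraph is an intersection of closed halfspaces). Therefore (i) holds.

Since $\chi$ is convex and lower semi-continuous, we apply [16, Thm. 2.133] to show $\chi = \chi^{**} = \rho^*$, where the second equality follows by (3). Hence the domain of $\rho^*$ is $\mathcal{A}$.

Compactness of $\mathcal{P}(\mathcal{W})$ follows by Prohorov's theorem [17, p. 13]. Since $\mathcal{A}$ is a closed subset of $\mathcal{P}(\mathcal{W})$ it is also compact.

The results (ii)–(iv) follow directly from [15, Thm. 2.2]. Specifically, (ii) follows from $\mathcal{A} \subset \mathcal{M}_+(\mathcal{W})$, (iii) from $\mu(\mathcal{W}) = 1$ for all $\mu \in \mathcal{A}$ and (iv) from (2).

Next, we show that for any $z \in \mathcal{Z}$,

$$
\langle z, \mu \rangle \le \alpha, \ \forall \mu \in \mathcal{P}(\mathcal{W}) \quad \Leftrightarrow \quad z \preceq \alpha, \tag{4}
$$

where the inequality on the right holds pointwise over $\mathcal{W}$ (i.e. $z(w) \le \alpha$, $\forall w \in \mathcal{W}$).

The argument for (4) is as follows [17, Eq. 3.7]. Since $\mu(\mathcal{W}) = 1$, $\langle z, \mu \rangle \le \alpha$ iff $\langle \alpha - z, \mu \rangle \ge 0$, which holds if $\alpha - z \succeq 0$ (since $\mu \succeq 0$). For the converse, note that $\delta_w \in \mathcal{P}(\mathcal{W})$ for any $w \in \mathcal{W}$, with $\delta_w$ a Dirac measure. So $\langle \alpha - z, \delta_w \rangle = \alpha - z(w) \ge 0$ for any $w \in \mathcal{W}$ is a necessary condition. So we have shown (4).

From (4) we can conclude $\langle z, \mu \rangle \le \sup_{w \in \mathcal{W}} z(w)$, $\forall z \in \mathcal{Z}$, $\mu \in \mathcal{A}$. Hence, by (2), $\rho[z] \le \sup_{w \in \mathcal{W}} z(w)$. Since $z(w)$ is finite for any $w \in \mathcal{W}$, $\rho[z]$ is finite (cf. [6, §2.2]).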
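As a concrete illustration of (2) and Lem. I.1, the following minimal sketch evaluates a risk measure on a finite sample space. It assumes, purely for illustration, that the ambiguity set is the convex hull of a few hand-picked probability vectors; since the pairing is linear in the measure, the maximum over the hull is attained at one of these vectors. The names `VERTICES`, `expectation` and `risk` and all numbers are hypothetical and do not come from the paper.

```python
# Minimal sketch: the risk measure (2) on a finite sample space {w_0, w_1, w_2}.
# Assumption: the ambiguity set is the convex hull of the rows of VERTICES.

def expectation(z, mu):
    """Pairing <z, mu> for a loss vector z and a probability vector mu."""
    return sum(zi * mi for zi, mi in zip(z, mu))

def risk(z, vertices):
    """Worst-case expectation; a linear objective attains its max at a vertex."""
    return max(expectation(z, mu) for mu in vertices)

VERTICES = [
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [1 / 3, 1 / 3, 1 / 3],
]

z = [1.0, -2.0, 4.0]        # a loss function on the three outcomes
z_small = [0.5, -3.0, 4.0]  # pointwise below z
alpha = 2.5

rho_z = risk(z, VERTICES)
print("risk of z:", rho_z)
print("monotone:", rho_z >= risk(z_small, VERTICES))
print("translation:", risk([zi + alpha for zi in z], VERTICES) - rho_z)
print("homogeneous:", risk([alpha * zi for zi in z], VERTICES) / rho_z)
print("bounded by worst outcome:", rho_z <= max(z))
```

The printed checks mirror properties (ii)–(iv) and the final bound of the proof; they are numerical sanity checks for this toy set, not a proof.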

II. SINGLE-STAGE PROBLEMS

Given the dual formulation of a risk measure in (2), it is clear that the choice of $\mathcal{A}$ is a critical design decision. In this section we introduce how ambiguity sets, using moment information, are derived from data. We also introduce conic representable risk, used to derive tractable problems.

A. Data-driven risk

In DRO the reasoning is usually as follows. Consider a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and the optimization problem

$$
\underset{u \in U}{\text{minimize}} \quad \mathbb{E}_{\mu_\star}[f(u, w)], \tag{5}
$$

with $u \in \mathbb{R}^{n_u}$ some decision variable, $f$ some loss function and $w : \Omega \to \mathcal{W}$ a random variable with $\mathcal{W} \subset \mathbb{R}^{n_w}$ the (compact) support of $w$. The main difficulty in solving the stochastic optimization problem (5) is that the distribution (or push-forward measure) $\mu_\star \in \mathcal{P}(\mathcal{W})$, defined on the sample space $(\mathcal{W}, \mathcal{B})$ as $\mu_\star(O) = \mathbb{P}[w^{-1}(O)]$ for all $O \in \mathcal{B}$ and with $w^{-1}(O)$ the pre-image of $O$, is unknown.

Hence, instead one introduces an ambiguity set $\mathcal{A} \subseteq \mathcal{P}(\mathcal{W})$, which contains $\mu_\star$ with some confidence. To do so one can estimate some statistic $\theta$ based on data. In the case of $\phi$-divergence [12] and Wasserstein ambiguity [18], this $\theta$ is the empirical distribution, while for moment-based ambiguity, $\theta$ encapsulates moment information. We will consider this final case in §II-B. To summarize:

Definition II.1. Consider random variable $w : \Omega \to \mathcal{W}$ with distribution $\mu_\star$ and i.i.d. samples $w_{0:M-1} : \Omega \to \mathcal{W}^M$. Let $\theta : \mathcal{W}^M \to \Theta$ denote a statistic for a set $\Theta$ and let $\beta \in \mathbb{R}$ be some radius$^1$. Then a data-driven ambiguity $\mathcal{A} : \Theta \times \mathbb{R} \rightrightarrows \mathcal{P}(\mathcal{W})$ with confidence $\delta \in (0, 1)$ maps $(\theta(w), \beta)$ to an ambiguity set $\mathcal{A}_\beta(\theta(w)) \subseteq \mathcal{P}(\mathcal{W})$ such that

$$
\mathbb{P}[\mu_\star \in \mathcal{A}_\beta(\theta(w_{0:M-1}))] \ge 1 - \delta. \tag{6}
$$

In [12] this is referred to as a learning system.

Based on (6) we minimize $\rho_{\mathcal{A}_\beta(\hat{\theta})}[f(u, w)]$ instead of (5). The result upper bounds (5) with probability at least $1 - \delta$.

B. Moment-based ambiguity

As mentioned before, we focus on the case where θ encapsulates moment information.

Such ambiguity sets have the advantage that [18] (i) they can contain measures with support not limited to the observed samples (unlike most φ-divergence based sets); (ii) the radius is estimated with reasonable accuracy based on known information of the distribution (unlike for Wasserstein-based sets); and (iii) problem complexity does not grow with the sample count.

To ensure that an ambiguity set satisfying (6) can be derived, we assume that W is bounded, which is often the case in control applications and is therefore the usual assumption in robust control. Other common choices are that w is multivariate Gaussian or that it satisfies some concentration properties (e.g., sub-Gaussian) [19]. We have:

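Lemma II.2. Let $\mathcal{W} = \{w \in \mathbb{R}^{n_w} : \|w\|_2 \le r\}$ and $R_w = \mathrm{diag}(I_{n_w}, c r)$ with $c \in \mathbb{R}$. Assume we have a set of i.i.d. samples $w_{0:M-1}$ of $w \sim \mu_\star$ and let $\hat{C} := \sum_{i=0}^{M-1} \bar{w}_i \bar{w}_i^\top / M$, with $\bar{w}_i = (w_i, 1)$ the homogeneous coordinates of §I-A. Then,

$$
\mathcal{A}_\beta(\hat{C}) = \Big\{ \mu \in \mathcal{P}(\mathcal{W}) : \big\| R_w (\hat{C} - \mathbb{E}_\mu[\bar{w} \bar{w}^\top]) R_w^\top \big\|_2 \le \beta \Big\},
$$

satisfies (6) when $\beta = 0.5 r^2 \big(1 + \sqrt{1 + 16 c^2}\big) \sqrt{2 \log(2(n_w + 1)/\delta)/M}$.

Proof. We use a matrix Hoeffding bound [20, Thm. 1.3] with improved constants. See App. A for the full proof.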

Remark II.3. In the numerical experiments towards the end of the paper we select $c = 1/4$. This choice results in a relatively simple expression for the radius, $\beta = 0.5 (1 + \sqrt{2}) r^2 \sqrt{2 \log(2(n_w + 1)/\delta)/M}$, and performed well in experiments.
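To give a feeling for how the radius of Lem. II.2 shrinks with the sample count, the sketch below evaluates $\beta$ and compares it with the realized error of the scaled second-moment estimate. It assumes a scalar disturbance ($n_w = 1$) that is uniform on $[-r, r]$, so that the true moment matrix of $(w, 1)$ is $\mathrm{diag}(r^2/3, 1)$; the distribution, the function names and the numbers are illustrative assumptions and not part of the paper.

```python
import math
import random

def radius(r, c, n_w, delta, M):
    """Radius beta from Lem. II.2 for support radius r and M samples."""
    return (0.5 * r**2 * (1.0 + math.sqrt(1.0 + 16.0 * c**2))
            * math.sqrt(2.0 * math.log(2.0 * (n_w + 1) / delta) / M))

def scaled_moment_error(samples, r, c):
    """Spectral norm of R_w (C_hat - C) R_w^T for scalar w.

    Assumption: w is uniform on [-r, r], so E[w] = 0 and E[w^2] = r^2 / 3.
    The scaled error matrix is [[a, b], [b, 0]]; its eigenvalues are
    (a +/- sqrt(a^2 + 4 b^2)) / 2, which gives the norm in closed form.
    """
    M = len(samples)
    m1 = sum(samples) / M
    m2 = sum(w * w for w in samples) / M
    a = m2 - r**2 / 3.0   # error in the second moment
    b = c * r * m1        # scaled error in the mean
    return 0.5 * (abs(a) + math.sqrt(a * a + 4.0 * b * b))

rng = random.Random(0)    # fixed seed for reproducibility
r, c, delta = 1.0, 0.25, 0.05
for M in (10, 100, 1000):
    samples = [rng.uniform(-r, r) for _ in range(M)]
    beta = radius(r, c, 1, delta, M)
    err = scaled_moment_error(samples, r, c)
    print(f"M={M:5d}  beta={beta:.4f}  realized error={err:.4f}")
```

The radius decays as $1/\sqrt{M}$, and in this toy setting the realized error stays well inside it, which is consistent with the bound being a high-confidence guarantee rather than a typical-case estimate.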

$^1$ Some moment-based ambiguity sets can have multiple radii (cf. [2]).


C. Conic-representable ambiguity

We introduce conic representable ambiguity (similar to the framework in [7], [8]) below and show how such risk is related to robust optimization through conic duality.

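Definition II.4. Consider a compact sample space $\mathcal{W} \subset \mathbb{R}^{n_w}$ and $\mathcal{Z} = C(\mathcal{W})$. An ambiguity set $\mathcal{A}$ is conic representable if, for some $E, F : \mathcal{M}(\mathcal{W}) \to \mathbb{R}^{n_b}$ and $b \in \mathbb{R}^{n_b}$,

$$
\mathcal{A} = \{ \mu \in \mathcal{P}(\mathcal{W}) : \exists \nu \in \mathcal{M}_+(\mathcal{W}), \ E\mu + F\nu \preceq_{\mathcal{K}} b \},
$$

with $\nu$ some auxiliary measure and $\mathcal{K}$ a closed, convex cone. Usually we assume $F = 0$. When $F \neq 0$ we refer to the ambiguity as $\nu$-conic representable. Similarly we refer to $\rho_{\mathcal{A}}$, as in (2), as ($\nu$-)conic representable risk (conic for short).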

The parameters of $\mathcal{A}$ should be selected such that it is an ambiguity set (i.e., a nonempty, closed and convex subset of $\mathcal{P}(\mathcal{W})$). Since we usually want an $\mathcal{A}$ satisfying (6), it will be non-empty as it should at least contain the true distribution. The random variables used to construct $E$ and $F$ are all continuous. Therefore [21, Thm. 15.5] $E$ and $F$ are continuous mappings. Thus $\mathcal{A}$ is the intersection between the closed set $\mathcal{P}(\mathcal{W})$ and the pre-image of a closed set under a continuous mapping, which is also closed. Hence $\mathcal{A}$ is a closed subset of $\mathcal{P}(\mathcal{W})$. Convexity of $\mathcal{A}$ then follows, since $E$ and $F$ are linear and $\mathcal{K}$ is convex.

In [8] it was shown that both the average and entropic value-at-risk are conic whenever W is finite. Many more risks fall under this framework [7], [22].

Direct application of conic linear duality [17] gives:

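Lemma II.5. A risk $\rho_{\mathcal{A}}[z]$ as in Def. II.4 is equal to the optimal value of

$$
\begin{aligned}
\underset{\lambda \succeq_{\mathcal{K}^*} 0,\, \tau}{\text{minimize}} \quad & \tau + b \cdot \lambda \\
\text{subj. to} \quad & E^*\lambda + \tau \succeq z, \quad F^*\lambda \succeq 0,
\end{aligned}
\tag{D}
$$

where the functional inequalities should hold pointwise for all $w \in \mathcal{W}$, $E^*$ and $F^*$ denote the adjoint operators (cf. §I-A), and $\mathcal{K}^*$ the dual cone.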

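Proof. By (2) the primal problem is

$$
\begin{aligned}
\underset{\mu, \nu \in \mathcal{M}_+(\mathcal{W})}{\text{maximize}} \quad & \langle z, \mu \rangle \\
\text{subj. to} \quad & E\mu + F\nu \preceq_{\mathcal{K}} b, \quad \langle 1, \mu \rangle = 1,
\end{aligned}
\tag{P}
$$

with $\mathrm{val}(P) = \rho[z]$ (where we omit the subscript for convenience). We refer to the minimization problem (D) as the dual problem. Let $\tau \in \mathbb{R}$ and $\lambda \in \mathbb{R}^{n_b}$. Then the Lagrangian is

$$
\begin{aligned}
\varphi[\mu, \nu, \lambda] &:= \langle z, \mu \rangle + (1 - \langle 1, \mu \rangle) \cdot \tau + (b - E\mu - F\nu) \cdot \lambda \\
&= \tau + b \cdot \lambda - \langle \tau + E^*\lambda - z, \mu \rangle - \langle F^*\lambda, \nu \rangle,
\end{aligned}
$$

where we can use (1) to construct the adjoints. We have

$$
\mathcal{K}^* := \{ \lambda \in \mathbb{R}^{n_b} : \lambda' \cdot \lambda \ge 0, \ \forall \lambda' \in \mathcal{K} \}.
$$

Hence $\max_{\nu \in \mathcal{M}_+(\mathcal{W})} \min_{\lambda \in \mathcal{K}^*} \{ (b - E\mu - F\nu) \cdot \lambda \} = -\chi[\mu]$, where $\chi$ is the indicator of $\mathcal{A}$. Therefore

$$
\max_{\mu, \nu \in \mathcal{M}_+(\mathcal{W})} \min_{\lambda \in \mathcal{K}^*} \{ \varphi[\mu, \nu, \lambda] \} = \max_{\mu \in \mathcal{M}_+(\mathcal{W})} \{ \langle z, \mu \rangle - \chi[\mu] \} = \rho[z].
$$

Similarly note that [17, Eq. 3.7]

$$
\begin{aligned}
\mathcal{M}_+(\mathcal{W})^* &:= \{ z \in \mathcal{Z} : \langle z, \mu \rangle \ge 0, \ \forall \mu \in \mathcal{M}_+(\mathcal{W}) \} \\
&= \{ z \in \mathcal{Z} : z(w) \ge 0, \ \forall w \in \mathcal{W} \} = \{ z \in \mathcal{Z} : z \succeq 0 \},
\end{aligned}
$$

which follows from a similar argument as (4). As such

$$
\min_{\lambda \in \mathcal{K}^*} \max_{\mu, \nu \in \mathcal{M}_+(\mathcal{W})} \{ \varphi[\mu, \nu, \lambda] \} = \mathrm{val}(D),
$$

since $\lambda$ gives a finite cost iff $\tau + E^*\lambda - z \in \mathcal{M}_+(\mathcal{W})^*$ and $F^*\lambda \in \mathcal{M}_+(\mathcal{W})^*$ (i.e. $\tau + E^*\lambda - z \succeq 0$ and $F^*\lambda \succeq 0$).

All that is left is to show strong duality (i.e. $\mathrm{val}(D) = \mathrm{val}(P) = \rho[z]$). This follows directly from coherence of $\rho$ (specifically $\rho$ being proper, implying consistency of (P)), compactness of $\mathcal{W}$ and [17, Cor. 3.1].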

Note that constraints in the dual are robust constraints, since they hold for all $w \in \mathcal{W}$.

Hence, techniques from robust optimization enable finding tractable reformulations.

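Example II.6. The ambiguity set $\mathcal{A}_\beta(\hat{C})$ of Lem. II.2 is conic with $n_b = 3 n_w^2$, $E\mu = (\pm \langle R_w \bar{w} \bar{w}^\top R_w^\top, \mu \rangle)$, $b = (R_w \hat{C} R_w^\top \pm \beta I)$ and $\mathcal{K} = \mathbb{S}^{n_w+1}_+ \times \mathbb{S}^{n_w+1}_+$, where $\bar{w} = (w, 1)$.

Moreover, letting $\lambda = (\Lambda, V)$ with $\Lambda, V \in \mathbb{S}^{n_w+1}$ and $\tau \in \mathbb{R}$ while using Lem. II.5, means

$$
\begin{aligned}
\rho_{\mathcal{A}_\beta(\hat{C})}[z] = \min_{\Lambda, V \succeq 0,\, \tau} \quad & \tau + \mathrm{Tr}[\Lambda (R_w \hat{C} R_w^\top + \beta I)] + \mathrm{Tr}[V (R_w \hat{C} R_w^\top - \beta I)] \\
\text{s.t.} \quad & \tau + E^*\lambda \succeq z,
\end{aligned}
$$

where the adjoint $E^* : \mathbb{R}^{n_b} \to \mathcal{Z}$ is (cf. (1))

$$
\begin{aligned}
(\tau + E^*\lambda)(w) &= \tau + \mathrm{Tr}[R_w \bar{w} \bar{w}^\top R_w^\top (\Lambda - V)] \\
&= \bar{w}^\top R_w^\top (\Lambda - V) R_w \bar{w} + \tau.
\end{aligned}
$$

If the constraint $\tau + E^*\lambda \succeq z$ is LMI representable, then $\rho_{\mathcal{A}_\beta(\hat{C})}[z]$ can be evaluated by solving an SDP. For example, if $z = \bar{w}^\top P \bar{w}$, then, since $w^\top w \le r^2$, we can apply the S-Lemma [23, Thm. B.2.1] to show that $\tau + E^*\lambda \succeq z$ iff

$$
\exists s \ge 0, \quad R_w^\top (\Lambda - V) R_w + \mathrm{diag}(sI, \tau - s r^2) - P \succeq 0. \tag{7}
$$

We also consider ambiguity with only support constraints.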

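Example II.7. Ambiguity $\mathcal{P}(\mathcal{W})$ is conic representable with $n_b = 0$. Hence, $\rho_{\mathcal{P}(\mathcal{W})}[z] = \min_\tau \{ \tau : \tau \succeq z \}$ corresponds to $\rho_{\mathcal{P}(\mathcal{W})}[z] = \max_{w \in \mathcal{W}} z(w)$ and only considers the support, as is common in robust optimization.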

III. MULTI-STAGE PROBLEMS

In this section we show how conic single-stage risk can be extended to a multi-stage setting, which is required to develop distributionally robust MPC controllers. Specifically, we will consider risk measures operating on the dynamics

$$
x_{t+1} = f(x_t, u_t, w_t),
$$

with $x_t \in \mathbb{R}^{n_x}$ ($u_t \in \mathbb{R}^{n_u}$) the state (input) and $w_t \in \mathbb{R}^{n_w}$ the disturbance, which follows a random process. For $t \in \mathbb{N}_{0:N-1}$ we consider $\ell_t : \mathbb{R}^{n_x} \times \mathbb{R}^{n_u} \to \mathbb{R}_+$ a stage cost function, and $\ell_N : \mathbb{R}^{n_x} \to \mathbb{R}_+$ the terminal cost.

For each stage $t$, the trajectory up to that time $w_{0:t-1}$ is an element of $\mathcal{W}^t$. For each $\mathcal{W}^t$, $\mathcal{B}_t$ is the accompanying Borel sigma-algebra, $\mathcal{M}_t^+$ ($\mathcal{M}_t$) the set of (signed) measures and $\mathcal{P}_t$ the set of probability measures on $(\mathcal{W}^t, \mathcal{B}_t)$. For brevity we henceforth omit the explicit dependency on $\mathcal{W}^t$. Also consider the paired spaces of continuous functions $\mathcal{Z}_t = C(\mathcal{W}^t)$.

We can then consider multistage ambiguity sets $\mathcal{A}_t$, which are nonempty, closed and convex subsets of $\mathcal{P}_t$. These in turn define a multistage analog to risk measures$^2$, multistage risk measures [6, §4.2], $\rho_{\mathcal{A}_t} : \mathcal{Z}_t \to \mathbb{R}$. Since this is simply a usual risk measure, but defined on $\mathcal{Z}_t$, the properties of Lem. I.1 generalize. We specifically consider coherent multistage risk

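$$
\rho_{\mathcal{A}_t}[z_t] = \max_{\mu_t \in \mathcal{A}_t} \langle z_t, \mu_t \rangle = \max_{\mu_t \in \mathcal{A}_t} \mathbb{E}_{\mu_t}[z_t]. \tag{8}
$$

Given such risks, the goal is to solve, for a given $x_0$,

$$
\begin{aligned}
\underset{\pi \in \Pi}{\text{minimize}} \quad & \rho_{\mathcal{A}_N}\Big[ \sum_{t=0}^{N-1} \ell_t(x_t, \pi_t(w_{:t-1})) + \ell_N(x_N) \Big] && \text{(9a)} \\
\text{subj. to} \quad & x_{t+1} = f(x_t, \pi_t(w_{:t-1}), w_t), \quad t \in \mathbb{N}_{0:N-1} && \text{(9b)} \\
& r^t_{\mathcal{A}}[\phi(x_t)] \le 0, \quad t \in \mathbb{N}_{1:N-1} && \text{(9c)} \\
& \psi(x_N) \le 0 \ \text{a.s.}, && \text{(9d)}
\end{aligned}
$$

where $\Pi$ denotes a set of parametrized, continuous, causal policies. The risk constraints (9c) involve the multistage risk measures $r^t_{\mathcal{A}}$ and are discussed in detail in §IV. We illustrate how (9) interpolates between the robust setting and (1) in §V.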

Remark III.1. Problem (9) is not exact as we optimize over parametrized policies (cf. §V), for tractability. As such, time-consistency [24] cannot be guaranteed (i.e. a policy computed at $t = 0$ may not be optimal at $t = 1$ after realization of $w_0$). Hence a receding horizon scheme is used.

A. Product ambiguity

To enforce independence of the disturbances $w_t$ we introduce product ambiguity [6, §4.2]. For a sequence of single-stage ambiguity factors $\mathcal{A}_i$ for $i \in \mathbb{N}_{0:t-1}$, consider

$$
\mathop{\times}_{i=0}^{t-1} \mathcal{A}_i = \mathcal{A}_0 \times \dots \times \mathcal{A}_{t-1}, \tag{10}
$$

where some $\mu_t \in \mathcal{A}_0 \times \dots \times \mathcal{A}_{t-1}$ if it is constructed as a product measure of some $\mu_i \in \mathcal{A}_i$, for $i \in \mathbb{N}_{0:t-1}$ (denoted by $\mu_t = \mu_0 \times \dots \times \mu_{t-1}$). We show that in certain cases such ambiguities are conic representable.

$^2$ Multistage risk is often constructed using nested conditional risk measures. We avoid such a construction for conciseness and tractability. The consequences of this are discussed in [6, §4].

Before doing so we need to extend linear operators $E_i : \mathcal{M} \to \mathbb{R}^{n_b}$ to take arguments in $\mathcal{M}_t$ in a natural way. To do so, note that for any $E_i : \mathcal{M} \to \mathbb{R}^{n_b}$ and $\mu \in \mathcal{M}$ we have $E_i \mu = \int_{\mathcal{W}} e_i(w)\, d\mu(w)$ for some $e_i : \mathcal{W} \to \mathbb{R}^{n_b}$, by definition. Measures $\mu_t \in \mathcal{M}_t$ take arguments $w_{:t-1} = (w_0, \dots, w_{t-1})$, so we introduce $E_{|i} : \mathcal{M}_t \to \mathbb{R}^{n_b}$ such that

$$
E_{|i} \mu_t = \int_{\mathcal{W}^t} e_i(w_i)\, d\mu_t(w_{:t-1}), \quad \forall \mu_t \in \mathcal{M}_t. \tag{11}
$$

With these new operators we have

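Lemma III.2. Let $\mathcal{A}_i$ be conic representable with parameters $E_i, b_i, \mathcal{K}_i$ for $i \in \mathbb{N}_{0:t-1}$. Then $\mathop{\times}_{i=0}^{t-1} \mathcal{A}_i$ is also conic representable with parameters $E\mu_t = (E_{|0}\mu_t, \dots, E_{|t-1}\mu_t)$, $b = (b_0, \dots, b_{t-1})$, and $\mathcal{K} = \mathcal{K}_0 \times \dots \times \mathcal{K}_{t-1}$. Moreover,

$$
\rho_{\times_{i=0}^{t-1} \mathcal{A}_i}[z_t] = \min_{\lambda_i \succeq_{\mathcal{K}_i^*} 0,\, \tau} \Big\{ \tau + \sum_{i=0}^{t-1} b_i \cdot \lambda_i : \tau + \sum_{i=0}^{t-1} E_{|i}^* \lambda_i \succeq z_t \Big\}.
$$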

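Proof. Let $\mu_t = \mu_0 \times \dots \times \mu_{t-1}$, with $\mu_i \in \mathcal{P}_i$. Then, following the notation in (11),

$$
\begin{aligned}
E_{|i} \mu_t &:= \int_{\mathcal{W}^t} e_i(w_i)\, d\mu_t(w_{0:t-1}) \\
&\overset{(i)}{=} \int_{\mathcal{W}} \cdots \int_{\mathcal{W}} e_i(w_i)\, d\mu_0(w_0) \dots d\mu_{t-1}(w_{t-1}) = \int_{\mathcal{W}} e_i(w_i)\, d\mu_i(w_i),
\end{aligned}
$$

where (i) follows from $\mu_t = \mu_0 \times \dots \times \mu_{t-1}$ and $\mu_t \in \mathcal{P}_t$. Hence $E_{|i} \mu_t \preceq_{\mathcal{K}_i} b_i$ iff $E_i \mu_i \preceq_{\mathcal{K}_i} b_i$. Repeating the same argument for each $i$ proves that $\mathcal{A}_t := \mathop{\times}_{i=0}^{t-1} \mathcal{A}_i$ is conic representable. Since the $\mathcal{A}_i$ are all non-empty, $\mathcal{A}_t$ is also nonempty. Convexity and closedness follow from the arguments below Def. II.4. The dual then follows from applying Lem. II.5.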
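The key step (i) of the proof, namely that a stage-wise moment under a product measure reduces to the same moment under the corresponding marginal, can be checked numerically on discrete distributions. The sketch below does so for three independent stages; the marginals, the moment function and all names are hypothetical choices made only for illustration.

```python
from itertools import product

def product_measure(marginals):
    """Joint pmf of independent stages; each marginal is a dict value -> prob."""
    joint = {}
    for combo in product(*(m.items() for m in marginals)):
        trajectory = tuple(value for value, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        joint[trajectory] = joint.get(trajectory, 0.0) + prob
    return joint

def stage_moment(joint, i, e):
    """Discrete analog of (11): integrate e(w_i) against the joint measure."""
    return sum(e(traj[i]) * p for traj, p in joint.items())

def marginal_moment(mu, e):
    """Single-stage moment: integrate e(w) against one marginal."""
    return sum(e(w) * p for w, p in mu.items())

# Hypothetical single-stage distributions on a bounded support.
marginals = [
    {-1.0: 0.5, 1.0: 0.5},
    {0.0: 0.2, 2.0: 0.8},
    {-0.5: 0.3, 0.5: 0.7},
]
joint = product_measure(marginals)

def second_moment(w):
    return w * w

for i, mu in enumerate(marginals):
    lhs = stage_moment(joint, i, second_moment)
    rhs = marginal_moment(mu, second_moment)
    print(f"stage {i}: joint moment {lhs:.4f}, marginal moment {rhs:.4f}")
```

Because the two moments coincide, a conic constraint on stage $i$ of the product measure is equivalent to the same constraint on the $i$-th factor, which is what the lemma exploits.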

B. Risk constraints

Ideally constraints like (9c) would require the state to lie within some set almost surely. Since such a constraint in a stochastic setting can be very conservative, we will instead implement average value-at-risk constraints, for $\alpha \in (0, 1)$,

$$
\mathrm{AV@R}^{\mu}_{\alpha}[z] := \inf_{\tau \in \mathbb{R}} \big\{ \tau + \alpha^{-1} \mathbb{E}_\mu[z - \tau]_+ \big\} \le 0. \tag{12}
$$

Such constraints (i) act as a convex relaxation of chance constraints [25]; (ii) penalize the expected violation in the $\alpha$ quantile where violations do occur. In control applications (12) is natural, since it penalizes large violations more.
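For an empirical (uniform, finitely supported) distribution the infimum in (12) can be evaluated exactly: the objective is convex and piecewise linear in $\tau$ with kinks at the sample values, so it suffices to evaluate it at the samples. The following sketch does this; the function names and sample values are illustrative assumptions, and the empirical distribution stands in for $\mu$.

```python
def avar(samples, alpha):
    """AV@R_alpha of the empirical distribution of `samples`, via (12).

    The objective tau + E[z - tau]_+ / alpha is convex and piecewise linear
    in tau with kinks at the samples, so the infimum is attained at a sample.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    n = len(samples)

    def objective(tau):
        return tau + sum(max(z - tau, 0.0) for z in samples) / (alpha * n)

    return min(objective(tau) for tau in samples)

def violation_frequency(samples):
    """Fraction of samples with z > 0, i.e. the empirical P[z > 0]."""
    return sum(1 for z in samples if z > 0.0) / len(samples)

losses = [1.0, 2.0, 3.0, 4.0]
print(avar(losses, 0.5))    # average of the worst half
print(avar(losses, 0.25))   # average of the worst quarter

constraint_values = [-3.0, -2.0, -1.0, -1.0, 0.5]
print(avar(constraint_values, 0.5), violation_frequency(constraint_values))
```

In the last line the AV@R value is nonpositive while the empirical violation frequency stays below $\alpha$, illustrating how (12) acts as a relaxation of a chance constraint.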

To evaluate the expectation in (12), true knowledge about the distribution is needed. Hence, we will operate on the distributionally robust AV@R constraint instead:

$$
\text{r-AV@R}^{\mathcal{A}}_{\alpha}[z] := \max_{\mu \in \mathcal{A}} \mathrm{AV@R}^{\mu}_{\alpha}[z] \le 0, \tag{13}
$$

with $\mathcal{A}$ the core ambiguity. If $\mathcal{A}$ satisfies (6), then (13) implies the chance constraint $\mathbb{P}[z_t \le 0] \ge 1 - \varepsilon$ holds with $1 - \varepsilon \le (1 - \delta)(1 - \alpha)$. Moreover, whenever $\mathcal{A}$ is conic, then robust AV@R is $\nu$-conic.

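Lemma III.3. Let $\mathcal{A}$ be conic with parameters $E_c, b_c, \mathcal{K}_c$. Then $\text{r-AV@R}^{\mathcal{A}}_{\alpha}$ in (13) is $\nu$-conic with $E\mu = (E_c \mu, \langle 1, \mu \rangle)$, $F\nu = (E_c \nu, \langle 1, \nu \rangle)$, $b = (b_c, 1)/\alpha$, and $\mathcal{K} = \mathcal{K}_c \times \{0\}$. Moreover, $\text{r-AV@R}^{\mathcal{A}}_{\alpha}[z]$ equals

$$
\min_{\lambda \succeq_{\mathcal{K}_c^*} 0,\, \tau,\, \tau_c} \big\{ \tau + \alpha^{-1}(\tau_c + b_c \cdot \lambda) : E_c^*\lambda + \tau_c \succeq 0, \ E_c^*\lambda + \tau_c + \tau \succeq z \big\}. \tag{14}
$$

Proof. This proof generalizes the methodology of [26] to arbitrary conic representable risk. First note that

$$
\max_{\mu \in \mathcal{A}} \inf_{\tau \in \mathbb{R}} \big\{ \tau + \alpha^{-1} \mathbb{E}_\mu[z - \tau]_+ \big\} = \inf_{\tau \in \mathbb{R}} \big\{ \tau + \alpha^{-1} \rho_{\mathcal{A}}\big[[z - \tau]_+\big] \big\}, \tag{15}
$$

by [27, Thm. 2.1]. Specifically let $\phi(\tau, w) = \tau + \alpha^{-1}[z(w) - \tau]_+$. Then we have (i) $\phi(\tau, \cdot) \in \mathcal{Z}$, implying that it is $\mu$-integrable and measurable; (ii) $\phi(\cdot, w)$ is convex for any $w \in \mathcal{W}$; (iii) $\rho_{\mathcal{A}}\big[[z - \tau]_+\big]$ is finite (Lem. I.1); (iv) the set $\mathcal{A} \subseteq \mathcal{P}(\mathcal{W})$ is compact (Lem. I.1); (v) $\phi(\tau, \cdot)$ is continuous and hence bounded for any $\tau \in \mathbb{R}$ on $\mathcal{W}$. Under these properties, as well as $\mathcal{A}$ being convex, [27, Thm. 2.1] states that strong duality holds, allowing us to exchange the inf and the max.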

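Applying Lem. II.5 to $\rho_{\mathcal{A}}$ on the r.h.s. results in (14) (where $[\cdot]_+$ produces two separate constraints and $\tau_c, \lambda_c$ act as the Lagrangian multipliers for the constraint $\mu_c \in \mathcal{A}_c$). Again applying Lem. II.5 gives the original $\nu$-conic representation.

The second application of Lem. II.5 requires the resulting set of measures (denoted $\mathcal{A}_\alpha$ below) to be a nonempty, closed and convex subset of $\mathcal{P}(\mathcal{W})$. By construction we already have $\mathcal{A}_\alpha \subseteq \mathcal{P}(\mathcal{W})$. Next, we show that $\mathcal{A}_\alpha$ is larger than $\mathcal{A}$. After all, for any $\mu_c \in \mathcal{A}$, take $\mu = \mu_c$ and $\nu = \alpha^{-1}(1 - \alpha)\mu_c \succeq 0$, since $\alpha \in (0, 1)$. Moreover, since $\mathcal{K}_c$ is a cone,

$$
E_c \mu + E_c \nu \preceq_{\mathcal{K}_c} \alpha^{-1} b_c \quad \Leftrightarrow \quad E_c(\alpha\mu + \alpha\nu) \preceq_{\mathcal{K}_c} b_c,
$$

with $\alpha\mu + \alpha\nu = \alpha\mu_c + (1 - \alpha)\mu_c = \mu_c$. For the same reason we have $\langle 1, \mu \rangle + \langle 1, \nu \rangle = \alpha^{-1} \langle 1, \mu_c \rangle = \alpha^{-1}$. Therefore $\mu_c \in \mathcal{A}_\alpha$ for each $\mu_c \in \mathcal{A}$. Hence, since $\mathcal{A}$ is nonempty, so is $\mathcal{A}_\alpha$. Closedness and convexity then follow by the arguments below Def. II.4. So using Lem. II.5 is justified.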

IV. RECURSIVE FEASIBILITY

We show how one can configure the constraints of (9) such that recursive feasibility is ensured. To do so we assume

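(A1) $r^t_{\mathcal{A}}[z_t] := \text{r-AV@R}^{\mathcal{P}_{t-1} \times \mathcal{A}}_{\alpha}[z_t]$, $\forall t \in \mathbb{N}_{0:N-1}$, $z_t \in \mathcal{Z}_t$;

(A2) $\mathcal{A}$ is updated based on measurements as $g(\mathcal{A}, w)$ (e.g., following Lem. II.2) and $\mathcal{A}_+ := g(\mathcal{A}, w) \subseteq \mathcal{A}$ a.s.

We introduce the terminal set $X_N := \{x : \psi(x) \le 0\}$. Let $V_N^{\mathcal{A}}(x_0)$ denote the minimum of (9) for some $x_0$ and let $D_N(\mathcal{A})$ denote its domain. Then consider the set of feasible policies

$$
\Pi_N(x_0, \mathcal{A}) := \{ \pi \in \Pi : \text{(9b), (9c), (9d)} \}.
$$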

We begin with the following definition.

Definition IV.1 (Recursive Feasibility). Let $x_0 \in D_N(\mathcal{A})$ and $\pi \in \Pi_N(x_0, \mathcal{A})$. If $f(x_0, \pi_0, w_0) \in D_N(g(\mathcal{A}, w_0))$ a.s., then (9) is recursively feasible (RF).

We can then prove the following theorem:

Theorem IV.2. Assume (A1), (A2) and that we are given some terminal policy $\pi_f(x_N)$ such that

(A3) $X_N \subseteq \{x \in \mathbb{R}^{n_x} : \phi(f(x, \pi_f(x), w)) \le 0, \ \forall w \in \mathcal{W}\}$;

(A4) $X_N$ is robust positive invariant (RPI) for $\pi_f$ (i.e., $f(x, \pi_f(x), w) \in X_N$ for each $(x, w) \in X_N \times \mathcal{W}$);

(A5) $\forall \pi_{0:N-1} \in \Pi$, let $\pi_N = \pi_f(x_N)$, depending on $w_{:N}$ through $x_N$. Then the shifted policy $\pi^+_{0:N-1} = \pi_{1:N}(w_0, \cdot)$, for any fixed $w_0 \in \mathcal{W}$, lies in $\Pi$.

Then, (9) is recursively feasible.

Proof. We will consider any fixed $w_0 \in \mathcal{W}$ and show that, given that (9) is feasible for some $x_0$, it will also be feasible for the next time step starting from $x_0^+ = f(x_0, \pi_0, w_0)$ (cf. Def. IV.1). Here, we consider the feasible policy $\pi_{0:N-1} \in \Pi_N(x_0, \mathcal{A})$, to which we append $\pi_N = \pi_f(x_N)$. Propagating the dynamics with this policy gives the sequence of states $x_{0:N+1}$, depending on $w_{:N}$ through (9b).

We then define the shifted sequence of states as

$$
x^+_{0:N}(w_{1:N-1}) := (x_1(w_0, w_{1:N}), \dots, x_{N+1}(w_0, w_{1:N})),
$$

where $w_0$ is considered fixed and $w_{1:N}$ is left variable. We can analogously define the shifted policy $\pi^+_{0:N-1}$ as

$$
\pi^+_{0:N-1}(w_{1:N-1}) := (\pi_1(w_0, w_{1:N-1}), \dots, \pi_f(x_N(w_0, w_{1:N-1}))).
$$

By construction, these shifted sequences satisfy (9b) and we can consider risk measures over (continuous) functions of these, where integration is performed over $w_{1:N}$.

Using this coupling between the feasible problem and the shifted problem we show $\pi^+ \in \Pi_N(x_0^+, \mathcal{A}_+)$. That is, the candidate policy $\pi^+$ is feasible for the shifted problem.

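I: By (A5), $\pi^+ \in \Pi$;

II: We show that (9c) at $t$ implies (9c) in the shifted problem at $t - 1$. That is, $r^t_{\mathcal{A}}[\phi(x_t)] \ge r^{t-1}_{\mathcal{A}}[\phi(x^+_{t-1})]$ for any $w_0 \in \mathcal{W}$. So, letting $z = \phi(x_t)$, by (15),

$$
r^t_{\mathcal{A}}[z] = \max_{\mu_t \in \mathcal{P}_{t-1} \times \mathcal{A}} \inf_{\tau \in \mathbb{R}} \big\{ \tau + \alpha^{-1} \mathbb{E}_{\mu_t}[z_\tau] \big\} = \inf_{\tau \in \mathbb{R}} \big\{ \tau + \alpha^{-1} \rho_{\mathcal{P}_{t-1} \times \mathcal{A}}[z_\tau] \big\},
$$

with $z_\tau = [z - \tau]_+$. For $z^+ := \phi(x^+_{t-1}) = z(w_0, w_{1:t-1})$ ($z^+_\tau = [z^+ - \tau]_+$), we replace $\rho_{\mathcal{P}_{t-1} \times \mathcal{A}}[z_\tau]$ with $\rho_{\mathcal{P}_{t-2} \times \mathcal{A}}[z^+_\tau]$ for $r^{t-1}_{\mathcal{A}}[z^+]$. Writing out $\rho_{\mathcal{P}_{t-1} \times \mathcal{A}}$, with $\mu_{:t-2}$ denoting the measure over $w_{:t-2}$ and $\mu_{t-1}$ the measure over $w_{t-1}$, gives

$$
\begin{aligned}
\max_{\mu_t \in \mathcal{P}_{t-1} \times \mathcal{A}} \int_{\mathcal{W}^t} z_\tau(w_{:t-1})\, d\mu_t(w_{:t-1})
&\overset{(i)}{=} \max_{\mu_{t-1} \in \mathcal{A}} \max_{\mu_{:t-2} \in \mathcal{P}_{t-1}} \int_{\mathcal{W}^{t-1}} \underbrace{\Big( \int_{\mathcal{W}} z_\tau(w_{:t-1})\, d\mu_{t-1}(w_{t-1}) \Big)}_{h(w_{:t-2})} d\mu_{:t-2}(w_{:t-2}) \\
&\overset{(ii)}{=} \max_{\mu_{t-1} \in \mathcal{A}} \max_{w_{:t-2} \in \mathcal{W}^{t-1}} h(w_{:t-2}) \\
&\overset{(iii)}{\ge} \max_{\mu_{t-1} \in \mathcal{A}} \max_{w_{1:t-2} \in \mathcal{W}^{t-2}} h(w_0, w_{1:t-2}).
\end{aligned}
$$

Noting that $\mu_t = \mu_{:t-2} \times \mu_{t-1}$ with $\mu_{:t-2} \in \mathcal{P}_{t-1}$ and $\mu_{t-1} \in \mathcal{A}$, before splitting up the max and the integrals, gives (i). The inner integral (i.e. $h(w_{:t-2})$) then acts as a continuous random variable $\mathcal{W}^{t-1} \to \mathbb{R}$ for any fixed $\mu_{t-1}$ (cf. App. B). Hence we can apply the reasoning within Ex. II.7 to maximize over $w_{:t-2} \in \mathcal{W}^{t-1}$ instead of over measures, resulting in (ii). It is clear that fixing the value of $w_0$ results in the inequality (iii). Reverting the steps (i) and (ii) to get a maximization over measures in $\mathcal{P}_{t-2} \times \mathcal{A}$ shows that the final expression after (iii) equals $\rho_{\mathcal{P}_{t-2} \times \mathcal{A}}\big[[z^+ - \tau]_+\big]$. Hence $\rho_{\mathcal{P}_{t-1} \times \mathcal{A}}\big[[z - \tau]_+\big] \ge \rho_{\mathcal{P}_{t-2} \times \mathcal{A}}\big[[z^+ - \tau]_+\big]$ for all $\tau \in \mathbb{R}$, $z \in \mathcal{Z}_t$ and $w_0 \in \mathcal{W}$. Therefore $r^t_{\mathcal{A}}[\phi(x_t)] \ge r^{t-1}_{\mathcal{A}}[\phi(x^+_{t-1})]$. Since $\mathcal{A}_+ \subseteq \mathcal{A}$ by (A2), $r^t_{\mathcal{A}}[\phi(x_t)] \ge r^{t-1}_{\mathcal{A}_+}[\phi(x^+_{t-1})]$. Hence (9c) holds for all $t \in \mathbb{N}_{1:N-2}$ in the shifted problem. For $t = N - 1$ we rely on (A3) and (A4).

III: The terminal constraint (9d) follows directly from (A4).

We have thus shown that $\pi^+$ is a feasible policy.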
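Assumption (A4) is easy to verify in simple cases. The sketch below checks robust positive invariance for a scalar linear system $x^+ = a x + b u + e w$ with terminal policy $\pi_f(x) = k x$, terminal set $X_N = [-x_{\max}, x_{\max}]$ and support $\mathcal{W} = [-r, r]$. The scalar setting, the interval sets and all names are assumptions made for illustration; the paper's terminal set is defined through $\psi$.

```python
from itertools import product

def is_rpi(a, b, e, k, x_max, r):
    """RPI test for x+ = (a + b k) x + e w on X_N = [-x_max, x_max], |w| <= r.

    The worst-case successor magnitude is |a + b k| x_max + |e| r, so the
    interval is invariant iff this value does not exceed x_max.
    """
    return abs(a + b * k) * x_max + abs(e) * r <= x_max

def successors_at_vertices(a, b, e, k, x_max, r):
    """Successor states at the corners of X_N x W (the map is affine)."""
    return [(a + b * k) * x + e * w
            for x, w in product((-x_max, x_max), (-r, r))]

a, b, e, k, r = 1.0, 1.0, 1.0, -0.5, 1.0
for x_max in (1.0, 2.0, 4.0):
    worst = max(abs(x) for x in successors_at_vertices(a, b, e, k, x_max, r))
    print(f"x_max={x_max}: worst successor {worst}, RPI: {is_rpi(a, b, e, k, x_max, r)}")
```

Since the closed-loop map is affine in $(x, w)$, checking the corners of $X_N \times \mathcal{W}$ is sufficient in this interval setting; for the quadratic sets used in §V a different certificate would be needed.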

Remark IV.3. Note that (A1) is essential since RF is a robust property, holding a.s. It acts as a convex relaxation of chance constraints conditioned on previous time steps (i.e., $\mathbb{P}[\phi(x_t) \le 0 \mid x_{t-1}] \le \varepsilon$ which holds a.s., hence $\forall \mu_{t-1} \in \mathcal{P}_{t-1}$ by Ex. II.7). Due to the reduction of the policy space $\Pi$ (cf. Rem. III.1) it is harder to satisfy such constraints for larger $t$. Other (less conservative) reformulations exist in the stochastic MPC literature, which impose all constraints at the first time step using a (maximal) RPI set (cf. [28]).

V. AFFINE DISTURBANCE FEEDBACK

To make the reformulations above more concrete, we show how (9) is solved. In general this is intractable, since we need to optimize over infinite dimensional policies $\pi$, under robust constraints associated with the risks (cf. Rem. III.1). Hence, we use affine disturbance feedback. The resulting optimization problem is an SDP. Different ambiguity sets and policies would give other reformulations (e.g., [11], [8]).

Consider linear dynamics, quadratic losses and constraints:

$$
\begin{aligned}
f(x, u, w) &= Ax + Bu + Ew, & \pi_f(x) &= K_f x, \\
\ell_t(x, u) &= x^\top Q x + u^\top R u, & \ell_N(x) &= x^\top Q_f x, \\
\phi(x) &= x^\top G x + 2 g^\top x + \gamma, & \psi(x) &= x^\top G_f x + 2 g_f^\top x + \gamma_f,
\end{aligned}
$$

with $Q$, $R$, $Q_f$, $G$ and $G_f$ positive (semi)definite. We could include (hard) input constraints as well or multiple state constraints (cf. discussion in [4, §1] on modeling joint chance constraints), but abstain from doing so for conciseness.

In this setting affine disturbance feedback [9] has been applied to solve many robust optimal control problems (and even some DRO problems [4]). The idea is to let

$$
\pi(w) = F w + f,
$$

where $F : \mathbb{R}^{(N+1) n_w} \to \mathbb{R}^{(N+1) n_x}$ (defined in App. C) has a structure that enforces causality of $\pi$. Note $x_{:N} \in \mathbb{R}^{(N+1) n_x}$, $w_{:N-1} \in \mathbb{R}^{N n_w}$ and $u_{:N-1} \in \mathbb{R}^{N n_u}$.

The state trajectory then depends on the disturbance as

$$
x = (\mathbf{B} F + \mathbf{E}) w + (\mathbf{A} x_0 + \mathbf{B} f) = \bar{H} \bar{w},
$$

with $\mathbf{A}, \mathbf{B}, \mathbf{E}$ defined in App. C, $\bar{F} = [F, f]$ and $\bar{H} = [H, h] = [\mathbf{B} F + \mathbf{E}, \mathbf{A} x_0 + \mathbf{B} f]$.
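The affine dependence of the state trajectory on the disturbance can be verified numerically without the stacked matrices of App. C. The sketch below simulates a scalar system under a strictly causal affine disturbance feedback policy and recovers the pair $(H, h)$ by superposition (a zero-disturbance rollout gives $h$, unit disturbances give the columns of $H$). The scalar dynamics, the gains and all names are hypothetical and only serve as an illustration.

```python
import random

def rollout(a, b, e, x0, F, f, w):
    """Simulate x_{t+1} = a x_t + b u_t + e w_t with u_t = f[t] + sum_{i<t} F[t][i] w_i."""
    x = [x0]
    for t in range(len(w)):
        # Causality: the input at time t only uses w_0, ..., w_{t-1}.
        u = f[t] + sum(F[t][i] * w[i] for i in range(t))
        x.append(a * x[t] + b * u + e * w[t])
    return x

def affine_map(a, b, e, x0, F, f):
    """Return (H, h) such that the state trajectory equals H w + h."""
    N = len(f)
    h = rollout(a, b, e, x0, F, f, [0.0] * N)
    columns = []
    for i in range(N):
        unit = [0.0] * N
        unit[i] = 1.0
        x_unit = rollout(a, b, e, x0, F, f, unit)
        columns.append([x_unit[t] - h[t] for t in range(N + 1)])
    H = [[columns[i][t] for i in range(N)] for t in range(N + 1)]
    return H, h

# Hypothetical scalar system and strictly lower-triangular feedback gains.
a, b, e, x0 = 0.9, 1.0, 1.0, 1.0
F = [[0.0, 0.0, 0.0], [-0.5, 0.0, 0.0], [-0.2, -0.5, 0.0]]
f = [-0.1, 0.0, 0.0]
H, h = affine_map(a, b, e, x0, F, f)

rng = random.Random(0)
w = [rng.uniform(-1.0, 1.0) for _ in range(3)]
x_sim = rollout(a, b, e, x0, F, f, w)
x_aff = [h[t] + sum(H[t][i] * w[i] for i in range(3)) for t in range(4)]
print(max(abs(xs - xa) for xs, xa in zip(x_sim, x_aff)))
```

The rows of `H` are lower triangular in the sense that the state at time $t$ only depends on $w_0, \dots, w_{t-1}$, which reflects the causality structure imposed on $F$.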

Here the linear part, H, can be interpreted as the sensitivity of the state to the disturbance,
