University of Groningen
Master Thesis
Decentralized Stochastic Unit
Commitment with Storage
Friso Kramer
Supervised by
First supervisor: dr. W. Romeijnders Second supervisor: dr. N.D. van Foreest
Faculty of Economics and Business University of Groningen
Author: Friso Kramer
Student number: s1910620 Date: July 5, 2017
University of Groningen
Faculty of Economics and Business
Econometrics, Operations Research and Actuarial Studies
Abstract
The Unit Commitment (UC) problem is concerned with minimizing the operating cost of a power grid. In this thesis we consider three recent developments that significantly change this optimization problem. First, the increasing penetration of renewable energy production into the power grid has increased the volatility and uncertainty in the future net demand (demand minus renewable production). Moreover, since renewable energy production has an intermittent nature and fluctuates on a short time frame, it is hard to forecast. That is why we use stochastic programming to capture this uncertainty in the optimization problem, contrary to the traditional UC problem that merely uses the future net demand as an input. Second, we also consider storage devices in our problem, since they are a promising technical solution to deal with this uncertainty. The third development is the increasing use of distributed generation, which raises the concern that individuals are not willing to share their private information. In this thesis we assume that the system operator that needs to solve the UC problem has no knowledge of the individual costs and operating constraints of generators and storage devices. We develop an algorithm that works in a decentralized way, meaning that individuals do not need to share their private information, such as their cost functions and operating constraints. The decentralized algorithm is obtained by applying Lagrangian relaxation to the UC problem to decompose it into single-facility sub-problems, and by subsequently updating the Lagrange multipliers using a subgradient algorithm. Within this algorithm, Progressive Hedging is used to solve single-facility single-scenario sub-problems. Our numerical results indicate considerable savings when we compare our stochastic method to traditional UC problems in which the uncertainty is ignored.
Contents
1 Introduction 1
1.1 Research motivation . . . 1
1.1.1 Uncertainty of renewable energy generation . . . 1
1.1.2 Limited knowledge . . . 2
1.1.3 Energy storage . . . 2
1.2 Research focus . . . 3
1.3 Outline . . . 4
2 Research Methodology 5
2.1 Methodological framework and research goals . . . 5
2.2 Literature review . . . 6
2.2.1 Storage in the Unit Commitment problem . . . 6
2.2.2 Distributed optimization . . . 6
2.2.3 Stochastic Unit Commitment problem . . . 7
2.2.4 Decomposition techniques for SUC . . . 7
2.3 Research contribution . . . 8
3 Problem Formulation 9
3.1 Deterministic Unit Commitment . . . 9
3.2 Stochastic Unit Commitment . . . 11
3.2.1 Deterministic equivalent problem . . . 12
3.3 Outline Chapter 4 and 5 . . . 14
4 Decomposition 15
4.1 Lagrangian relaxation demand supply constraint . . . 15
4.2 Lagrangian relaxation of non-anticipativity constraints . . . 16
4.2.1 Distributed interpretation . . . 17
5 Algorithm 19
5.1 Subgradient method . . . 19
5.2 Progressive Hedging algorithm . . . 19
5.3 Updating Lagrange multipliers . . . 21
5.4 Initial Lagrange multipliers . . . 21
5.4.1 Generator model approximation . . . 22
5.4.2 Battery model approximation . . . 22
5.5 Algorithm enhancement . . . 24
5.5.1 Rounding . . . 24
5.5.2 Sub-optimization per iteration . . . 25
5.6 Final algorithm . . . 26
6 Numerical Results 28
6.1 Simulation setup . . . 28
6.1.1 Software and hardware . . . 28
6.1.3 Performance measure . . . 30
6.1.4 Experiments . . . 30
6.2 Parameter testing . . . 31
6.2.1 Initialization Lagrange multipliers . . . 31
6.2.2 Step-lengths and Lagrangian multiplier updates . . . 31
6.2.3 Stopping criteria . . . 34
6.2.4 Rounding . . . 34
6.2.5 Parallel implementation . . . 34
6.2.6 Number of scenarios . . . 35
6.3 Impact of storage and SUC . . . 36
7 Conclusion & Further Work 39
7.1 Conclusion . . . 39
1 | Introduction
In this chapter, the motivation for this research is described together with the research focus. This chapter ends with the outline of this thesis.
1.1 Research motivation
This research is motivated by three recent developments in power grids. First, the increasing penetration of renewable energy generation into the power grid increases the uncertainty in supply and demand. The second development is the increasing share of distributed generation, which introduces a limitation that we will call limited knowledge. The third development is that ongoing technological advances allow for efficient energy storage devices. We elaborate on these topics in the next three sections.
1.1.1 Uncertainty of renewable energy generation
Nowadays almost everyone is connected to the electricity grid and we are heavily dependent on it. While facing a worldwide increase in energy demand year by year and a growing trend to become independent of fossil fuels, a reliable power grid has to be guaranteed. The trend to increase the use of renewable energy sources worldwide is motivated by many reasons. The European Union has established a target for the year 2020 that 20% of the energy production must come from renewable energy sources (European-Commission, 2010). Reasons are, among others, to be less dependent on depleting fossil fuels and to reduce the emission levels of greenhouse gases. Renewable energy sources have two main difficulties: their production is uncertain and intermittent, and it is uncontrollable when compared to traditional thermal generators (Papavasiliou and Oren, 2013). Due to the volatile and intermittent nature of renewable energy sources, the power grid is under growing stress, since supply and demand have to be matched at all times.
The Unit Commitment (UC) problem is concerned with finding a subset of generators that can achieve some common goal. Usually this goal is to balance supply and demand at minimum operating costs subject to individual operating constraints per generator. One has to decide when to switch generators on and off at different moments in time. Traditional UC deals with uncertainty in demand. The uncertainty does not explicitly enter the UC problem, instead a reserve capacity constraint is added to the problem. This constraint makes sure that there is always enough online capacity to deal with unexpected peak demands. The production mix mainly consists of thermal units (conventional, nuclear, combined-cycle heat, and power co-generation, etc.) and energy storage in the form of hydro units together with pumped-storage units. These hydro units use flowing water to generate electricity. Furthermore, water can be pumped uphill to store its potential energy such that it can be used at a later moment to generate electricity.
We can identify three planning layers for the power grid. The first layer looks one to two days ahead and is mainly concerned with the scheduling of large steam units that require a day-ahead planning due to their long start-up times. We call this the long-term UC problem. The second layer is defined as short-term UC and it is concerned with the scheduling of
gas-fired generators. These generators have a start-up time of typically 0.5 to 1 hour. Hence their commitment to production must be determined one hour in advance. The final layer of control is real-time control. Since the demand is uncertain there is always a mismatch between supply and demand. Due to the increasing penetration of renewable energy sources this uncertainty only becomes larger. Start-up times of generators make it impossible to schedule the commitment of generators in real time. Hence the only way to balance supply and demand is to increase or decrease the generation of the committed gas-fired generators. The goal of all three layers is to match supply and demand at minimum operating costs subject to individual operating constraints per generator.
Modern UC also includes renewable generation units. The increasing uncertainty in the net demand (demand minus renewable production) due to the increasing penetration of renewable generation in the recent years challenges the traditional view that the forecasted net demand is accurate enough (Keyhani et al., 2009).
1.1.2 Limited knowledge
Kok et al. (2005) mention that there is an increasing penetration of distributed renewable energy sources, such as wind and solar power into the power grid. Distributed generation is defined as an electric power source connected directly to the distribution network or on the customer site of the meter (Ackermann et al., 2001). The definition does not specify the technology of the generator, some examples are wind turbines, photo-voltaic cells, hydro and (geo)thermal generators, flywheels, batteries, and fuel cells. All these generators have different costs and in general they are more expensive in a distributed fashion than the current centralized generators if one looks at the price per unit generated power. However, due to technological and regulatory improvements these small scale distributed generators become able to compete with economically feasible, traditional centralized, electricity plants (Strachan and Farrell, 2006).
Recent research in the field of Control Theory, for both real-time control and discrete-time control, adds an extra dimension to the current problem: the assumption of limited knowledge. Traditionally, optimization problems are solved in a centralized way. By centralized we mean that all the information necessary to solve the problem is available and the problem can be solved by an Independent System Operator (ISO). However, increased distributed generation raises concerns about ownership and private information. It may be possible that owners of generators are not willing to share their private information about costs and profits. In this case the optimization problem cannot be solved, since not all information is available (limited knowledge). Another concern is that the centralized problem may simply be too large to solve in a central fashion. Motivated by these concerns, much research is focused on decentralized real-time control (cf. Dörfler and Grammatico (2017); Trip et al. (2016)) and discrete-time control (cf. Christofides et al. (2013)). The main assumption in this research area is that there is a communication network such that the different players can communicate with each other. Through this communication network (price) signals can be communicated that give the players an incentive to change their production in order to balance supply and demand.
1.1.3 Energy storage
generators can only be effectively overcome by storing excess power. Different storage devices are examined by Hadjipaschalis et al. (2009), for example flywheels, batteries, super-capacitors, pneumatic storage, and pumped storage. Their conclusion is that the ideal storage technology depends heavily on the network environment: among other things on the amount of energy that is stored, the storage time, and costs. Barton and Infield (2004) describe the application and duration of storage devices in the power grid. Batteries are able to store energy for less than three days and can supply energy at any given time (Barton and Infield, 2004). Therefore batteries are the dominant technology compared to other storage devices (Hadjipaschalis et al., 2009). This supports the use of batteries in the short-term UC problem.
Incorporating storage in the UC problem may lead to significant cost reductions since batteries may be discharged in unexpected peak demands, preventing expensive thermal generation. However, given the uncertainty in energy supply and demand, it is unclear when to charge and discharge the batteries, which means that these decisions have to be taken explicitly into account in the UC problem.
1.2 Research focus
These three developments of increasing uncertainty, limited knowledge, and energy storage impact the way we traditionally control the power grid. To overcome the difficulties of these developments we reformulate the UC problem. More specifically, in this research we focus on a decentralized approach to solve the short-term Stochastic Unit Commitment (SUC) problem including storage. We focus on the short-term UC since the fast generators that need to be scheduled on the short term have the flexibility that we need to deal with the increasing uncertainty in net demand. This flexibility is missing in long-term UC problems. We assume that the commitment of the slow generators is already given and focus on the commitment of the fast generators. Furthermore, we also consider storage devices. The storage devices are capable of charging and discharging at any moment and therefore we can assume that they are always online.
The deterministic UC problem only deals with uncertainty through a reserve-capacity constraint. This constraint ensures that there is always enough production capacity to supply the demand. A generalization of deterministic programming is stochastic programming, which deals with problems involving uncertain parameters; in the case of the UC problem the uncertain parameter is the future net demand. The generalization from deterministic UC to Stochastic Unit Commitment (SUC) seems only natural in light of the recent changes and the increasing uncertainty. To deal with the uncertainty we use stochastic programming, and more specifically a multi-stage stochastic program. Next to multi-stage programs one can also formulate the SUC problem as a chance-constrained problem (cf. Ono and Williams (2010); Yang et al. (2016); Bienstock et al. (2014)) or as a robust optimization problem (cf. Jiang et al. (2012); Zugno et al. (2016)). We focus here on multi-stage stochastic programs since they intuitively model the decision problem over multiple time stages under the given uncertainty. Furthermore, their structure is very similar to that of the deterministic UC problem, which allows us to easily use existing decomposition techniques that are required for decentralized control. For more details about multi-stage programming we refer to Chapter 3.
devices such that information about production and demand can be shared. Decentralized control is mainly motivated by limited system knowledge and the sharing of computational power. We exploit the separability of the optimization problem and divide it into smaller sub-problems that can be solved by each player in the network. This is also the main approach used for decentralized control. In mathematical programming, and in particular stochastic programming, there has always been a research effort to reduce computation time. One way is by separating the problem into sub-problems and solving these by parallel computing. These techniques show the potential for decentralized optimization of the UC problem with limited system knowledge. However, their focus lies on computational efficiency: separability is used to solve (part of) the problem in parallel, but not necessarily in a decentralized fashion. We will in particular use Lagrangian relaxation and Progressive Hedging to exploit the separability of the optimization problem. This is discussed in Chapters 4 and 5.
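The price-based coordination idea described above can be illustrated on a toy problem. The following is a minimal sketch, not the algorithm developed in this thesis: Lagrangian relaxation of a single supply-demand constraint for two quadratic-cost producers, with a subgradient update of the multiplier. All function names, cost coefficients, and the step size are illustrative assumptions.

```python
# Toy dual decomposition: minimize q1*x1^2 + q2*x2^2 s.t. x1 + x2 >= d.
# Each producer reacts only to the price signal lam and never reveals
# its cost function, mimicking the limited-knowledge setting.

def local_best_response(q, lam):
    """Each facility privately minimizes q*x^2 - lam*x  =>  x = lam/(2q)."""
    return max(lam / (2.0 * q), 0.0)

def dual_decomposition(q_list, d, step=0.5, iters=2000):
    lam = 0.0  # Lagrange multiplier (the price signal)
    x = [0.0] * len(q_list)
    for _ in range(iters):
        x = [local_best_response(q, lam) for q in q_list]
        # Subgradient of the dual: the constraint violation d - sum(x).
        lam = max(0.0, lam + step * (d - sum(x)))
    return lam, x

lam, x = dual_decomposition([1.0, 2.0], d=3.0)
# Analytic optimum for q = (1, 2), d = 3: x = (2, 1) at price lam = 4.
```

The coordinator only needs the aggregate mismatch d - sum(x) to update the price, which is why the scheme extends to the decentralized setting with private cost functions.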
1.3 Outline
2 | Research Methodology
In this chapter the focus is on the methodological aspects of this research. We present our research goals and research framework.
2.1 Methodological framework and research goals
As a methodological framework we use the design science research framework described by Hevner (2007). The objective of design science research is to develop technology-based solutions to important and relevant business problems; the research must produce a viable artifact in the form of a construct, a model, a method, or an instantiation.
Figure 2.1 shows the three different parts of design science research. Hevner et al. (2004) state that the research must be initiated by a relevant problem. The three developments of increasing uncertainty, limited knowledge, and storage come from the environment in this figure. These developments explain why this research is relevant. Furthermore, Hevner et al. (2004) state that the outcome of the research must be an artifact and that the research relies upon the application of rigorous methods in both the construction and evaluation of the design artifact. The goal of this research is to develop an algorithm that solves the SUC problem with storage in a decentralized way. We use the literature for the construction of our algorithm and evaluate it by numerical analysis. Hence, this leads to the following two research goals, displayed here for easy reference.
• Research goal 1: Use separability and parallel processing methods to develop an algorithm that solves the Stochastic Unit Commitment problem (approximately) in a decentralized way with limited knowledge.
• Research goal 2: Use numerical experiments to evaluate the performance of the algorithm, to investigate the impact of storage on the optimal policies for the SUC problem, and to compare optimal policies for the SUC problem to those of the traditional deterministic UC problem.
In the next section we discuss the relevant literature for addressing these research goals, and subsequently in section 2.3 we describe how our research will contribute to the literature.
Figure 2.1: Design science research cycles (Hevner, 2007)
2.2 Literature review
2.2.1 Storage in the Unit Commitment problem
Storage has also been considered in the recent literature. Handschin et al. (2006) consider combined heat and power units that can produce both heat and power; mostly this is combined with a boiler to store heat. Yang et al. (2016) consider the deterministic UC problem with batteries as storage devices. Typical constraints for storage devices are similar to the constraints for thermal generators described below (cf. Hawkes and Leach (2009)). In contrast to thermal generators, a storage device can both charge and discharge energy. There are also maximum and minimum storage constraints. We use these constraints as well.
2.2.2 Distributed optimization
2.2.3 Stochastic Unit Commitment problem
One of the first appearances of the UC problem in the literature is by Muckstadt and Koenig (1977). They consider a mixed-integer program with known demand. In deterministic UC problems, a common approach is to enforce some amount of spinning reserves to deal with uncertainty in demand. While the supply-demand balancing constraint enforces enough supply, the spinning reserve constraint enforces that the committed units have residual capacity greater than some quantity. The spinning reserve can be called upon in case of an unexpected increase in demand, since starting up new generators on short notice is not possible. Different constraints are considered in the UC problem. We consider only minimum and maximum generator capacity limits and minimum up/down time constraints for the generators. Other constraints that are commonly used are ramping constraints on the rate of change of generator output, upper bounds on the start-up variables, and flow capacity constraints (cf. Papavasiliou and Oren (2013)). We expect that we can easily incorporate these constraints in our algorithm, except for the flow capacity constraints, since these are coupling constraints. Coupling constraints are the constraints that we try to relax to develop a decentralized algorithm (Boyd et al., 2011).
One of the first attempts at solving a stochastic unit commitment (SUC) problem is by Takriti et al. (1996). They consider a quadratic mixed-integer multi-stage stochastic program. They consider a system with a limited amount of hydro-electric energy stored in dams and thermal generating units that together must satisfy the (uncertain) demand. They motivate their approach as a generalization of the deterministic UC problem. Instead of only the supply-demand balance constraint and a spinning reserve constraint, Takriti et al. (1996) consider a number of scenarios that represent possible occurrences of the demand. Therefore there is no need for an explicit spinning reserve constraint anymore, since it is implicitly covered by the demand scenarios. We largely rely on the mathematical model and method of Takriti et al. (1996) and Takriti and Birge (2000) to solve the SUC problem for generators only. When storage devices are considered, their approach is not decentralized.
2.2.4 Decomposition techniques for SUC
Borghetti et al. (2003) also use Lagrangian relaxation to separate a deterministic UC problem into n + m independent problems, where n stands for the number of thermal generators and m for the number of hydroelectric generators. Hence, for a given Lagrange multiplier the individual agents (generators) can solve their sub-problem and determine their commitment and production. However, Borghetti et al. (2003) make use of bundle methods for solving the Lagrangian dual, which is not completely decentralized. Nowak and Römisch (2000) also use bundle methods for solving the Lagrangian dual. Carpentier et al. (1996) use the augmented Lagrangian algorithm to obtain single-generator sub-problems for their multi-stage stochastic program.
Papavasiliou and Oren (2013) consider a two-stage stochastic program with uncertain wind production in various locations of the network as well as generator and transmission line failures. They also decompose their two-stage problem into one sub-problem per scenario, but not into one sub-problem per facility.
Despite the age of the Progressive Hedging (PH) algorithm, there is a lot of recent research about it. Watson and Woodruff (2011) and Gade et al. (2016) consider different techniques to enhance the PH algorithm for mixed-integer problems. Ordoudis et al. (2015) also study solution methods based on the PH algorithm for the stochastic unit commitment problem. Boland et al. (2017) present a new primal-dual algorithm based on the PH algorithm. Their algorithm converges to the optimal Lagrangian dual for stochastic mixed-integer programs, but only two-stage recourse models are considered. We will use these enhancements of the PH algorithm to improve our algorithm.
2.3 Research contribution
We develop a completely decentralized algorithm for the Stochastic Unit Commitment problem including storage. To the best of our knowledge, this has not been considered before in the literature. Furthermore, we give a new interpretation to methods already used in the field of stochastic programming. There has always been a research effort in stochastic programming to decompose problems into smaller sub-problems that can be solved efficiently, but we have not encountered the interpretation of decentralized algorithms that solve these problems with limited knowledge.
Most of the research about SUC and PH considers two-stage models and/or linear cost functions. We use quadratic cost functions, since they model the fuel costs of generators more accurately, and we consider multiple time stages.
The algorithm of Takriti and Birge (2000) can be executed in a decentralized way if only generators are considered. We extend their method so that we can also handle storage devices in a decentralized way. We relax the coupling constraints by Lagrangian relaxation to create sub-problems that can be solved by each facility (generator or storage device) separately.
3 | Problem Formulation
In this chapter we first explain the unit commitment problem with storage for the deterministic case. We then extend the formulation to the stochastic case.
3.1 Deterministic Unit Commitment
We consider a set of generators that can produce electricity. These generators are subject to a start-up time of 1 hour. Hence, we have to decide 1 hour in advance whether to start up a generator. We use the same constraints for the generators as Takriti and Birge (2000); that is, generators are subject to capacity constraints and minimum and maximum generation time constraints. The storage devices are assumed to be always online, since they have no start-up times, and are subject to capacity constraints. We can now define the following optimization problem. Takriti and Birge (2000) consider a similar model, but we extend it with the formulation of the storage devices.
\min_{x,u,w} \sum_{i=1}^{n} \sum_{t=1}^{H} \bigl[ f_i(x_i^t) + g_i(w_i^t) \bigr]   (3.1)

s.t.  \sum_{i=1}^{n} x_i^t \ge d^t,   t = 1, \dots, H   (3.2)
      \sum_{i=1}^{n} u_i^t x_i^{max} \ge \bar{d}^t,   t = 1, \dots, H   (3.3)
      E_i^{min} \le E_i^0 + \sum_{\tau=1}^{t} \bigl( \eta_i x_i^{\tau c} - x_i^{\tau d} / \eta_i \bigr) \le E_i^{max},   t = 1, \dots, H, \; i \in B   (3.4)
      x_i^t = x_i^{td} - x_i^{tc},   t = 1, \dots, H, \; i \in B   (3.5)
      x_i^{min} u_i^t \le x_i^t \le x_i^{max} u_i^t,   t = 1, \dots, H, \; i \in G   (3.6)
      w_i^t \ge u_i^t - u_i^{t-1},   t = 1, \dots, H, \; i \in G   (3.7)
      w_i^t \ge 0,   t = 1, \dots, H, \; i \in G   (3.8)
      u_i^t \in U_i^t(u_i^1, \dots, u_i^{t-1}),   t = 1, \dots, H, \; i \in G   (3.9)
      x_i^{td}, x_i^{tc} \ge 0,   t = 1, \dots, H, \; i \in B   (3.10)
      u_i^t \in \{0, 1\},   t = 1, \dots, H, \; i \in G   (3.11)

We consider two sets: a set G that represents the generators and a set B that represents the storage devices. The total number of facilities is n.
For every generator $i \in G$ there is a binary variable $u_i^t$ that indicates whether the facility is on ($u_i^t = 1$) or off ($u_i^t = 0$) at time $t = 1, \dots, H$ (constraint (3.11)). The planning horizon is defined by H. There are n continuous variables $x_i^t$ for each stage t that represent the amount of energy generated by facility $i = 1, \dots, n$, where the charging and discharging variables $x_i^{tc}$ and $x_i^{td}$ respectively are non-negative (constraint (3.10)). Constraint (3.6) models lower and upper bounds on the amount of generation, where $0 \le x_i^{min} \le x_i^{max}$ represent the minimum and maximum operating capacity of a generator. Note that the minimum generation is always greater than or equal to zero.
For storage devices we restrict the continuous variables as follows. In each stage t the continuous variables $x_i^t$ are linked and constrained through time by the linear storage constraints (3.4) and (3.5), which allow for transferring energy between time periods, where $E_i^{min}$, $E_i^{max}$, $E_i^0$, and $\eta_i$ represent the minimum and maximum storage capacity, the initial energy level, and the efficiency of storage device $i \in B$, respectively. Storing energy is not free and energy is lost during this process. The efficiency of a storage device is given by $\eta \in (0, 1]$, where $\eta = 1$ stands for lossless storage; for $\eta < 1$ a fraction of the stored energy is lost. If $x_i^t > 0$ the storage device is supplying energy, and if $x_i^t < 0$ it is charging at time t.
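The storage dynamics in constraint (3.4) can be made concrete by simulating the energy level forward in time. This is a hypothetical sketch with made-up parameter values; the function names are illustrative and not part of the model.

```python
# Track E_i^t = E_i^0 + sum_{tau <= t} (eta * x^{tau c} - x^{tau d} / eta)
# and check the bounds E^min <= E^t <= E^max of constraint (3.4).

def energy_levels(E0, eta, charge, discharge):
    """Energy level after each period: charging adds eta*x_c,
    discharging drains x_d/eta (round-trip losses when eta < 1)."""
    E, levels = E0, []
    for xc, xd in zip(charge, discharge):
        E += eta * xc - xd / eta
        levels.append(E)
    return levels

def feasible(levels, Emin, Emax):
    return all(Emin <= E <= Emax for E in levels)

levels = energy_levels(E0=5.0, eta=0.9,
                       charge=[2.0, 0.0, 0.0],
                       discharge=[0.0, 0.0, 1.8])
# E^1 = 5 + 0.9*2 = 6.8;  E^2 = 6.8;  E^3 = 6.8 - 1.8/0.9 = 4.8
```

Note the asymmetry: charging 2.0 units only stores 1.8, and delivering 1.8 drains 2.0 from the store, so a full charge-discharge cycle loses a factor $\eta^2$ of the energy.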
In each stage t, the total energy generation must be greater than or equal to the demand $d^t$; hence we have constraint (3.2). The peak demand for time period t is given by $\bar{d}^t$. It is a widely used approach to develop forecasts of both the average net demand $d^t$ and the peak demand $\bar{d}^t$ for each time period (Takriti et al., 1996). Constraint (3.3) is the reserve capacity constraint.

The constraint $u_i^t \in U_i^t(u_i^1, \dots, u_i^{t-1})$ models minimum and maximum generation time. For example, if a generator switches on, it must remain online for a number of periods. Hence, the binary variables $u_i^t$ are linked through time.
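The time-linking of the binary variables through $U_i^t$ can be illustrated with a small feasibility check. The sketch below assumes a simple symmetric rule (stay on for at least `min_up` periods after switching on, and off for at least `min_down` periods after switching off); the exact form of $U_i^t$ may differ, and all names are illustrative.

```python
# Check whether a 0/1 commitment sequence u (with initial state u0)
# respects minimum up-time and minimum down-time restrictions.

def respects_min_times(u, u0, min_up, min_down):
    seq = [u0] + list(u)
    for t in range(1, len(seq)):
        if seq[t] != seq[t - 1]:  # a switch happens at period t
            run = min_up if seq[t] == 1 else min_down
            window = seq[t:t + run]
            # every period in the window (that exists) must keep the new state
            if any(v != seq[t] for v in window):
                return False
    return True

# Switching on at t=1 and off again at t=2 violates min_up = 2:
# respects_min_times([1, 0, 0, 0], u0=0, min_up=2, min_down=1) -> False
```

A commitment such as [1, 1, 0, 0] with min_up = min_down = 2 passes the check, since both the on-run and the off-run last two periods.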
Fuel costs are commonly modeled as quadratic functions (cf. Dörfler and Grammatico (2017) and Trip et al. (2016)) or as piece-wise linear functions (cf. Klein Haneveld and van der Vlerk (2001)). We let the quadratic function $f_i(x_i^t) = q_i (x_i^t)^2$, $q_i > 0$, be the production costs of facility i at time t. Let $g_i(u_i^{t-1}, u_i^t)$ be the cost or penalty associated with switching a facility on or off. We assume without loss of generality that costs are only incurred whenever a generator has to start up. That is, the function $g_i$ satisfies $g_i(0, 0) = g_i(1, 1) = 0$, $g_i(0, 1) \ge 0$, and $g_i(1, 0) = 0$. This assumption is also used by Takriti and Birge (2000). It means that costs are incurred if the facility is starting up; no costs are incurred if the value of the binary variable does not change in consecutive time periods or if the facility is shutting down.

We can represent the start-up costs of a generator by the function $g_i(u_i^{t-1}, u_i^t) = c_i \max\{u_i^t - u_i^{t-1}, 0\}$. Instead of this non-linear form we reformulate it in the linear form $g_i(w_i^t) = c_i w_i^t$ and use the constraints $w_i^t \ge u_i^t - u_i^{t-1}$ and $w_i^t \ge 0$, $i \in G$, $t = 1, \dots, H$, with $u_i^0$ a given input parameter. Takriti and Birge (2000) do not mention explicitly how they deal with the non-linearity of $g_i(u_i^{t-1}, u_i^t)$; hence, they do not explicitly consider constraints (3.4), (3.7), and (3.8). Our formulation is similar to that of Muckstadt and Koenig (1977). Note that we do not constrain the start-up variables $w_i^t$ to be binary but treat them as continuous variables, since every optimal solution $w_i^t$ of the optimization problem will be binary (cf. O'Neill et al. (2010)).
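The equivalence between the non-linear start-up cost and its linearization can be checked numerically. In this illustrative sketch (parameter values assumed), the linear form picks the smallest $w^t$ satisfying constraints (3.7) and (3.8), which reproduces $\max\{u^t - u^{t-1}, 0\}$.

```python
# Compare c * max(u^t - u^{t-1}, 0) with the linearized cost c * w^t,
# where w^t is the minimal value meeting w^t >= u^t - u^{t-1}, w^t >= 0.

def startup_cost_nonlinear(u, u0, c):
    prev, total = u0, 0.0
    for ut in u:
        total += c * max(ut - prev, 0)
        prev = ut
    return total

def startup_cost_linearized(u, u0, c):
    prev, total = u0, 0.0
    for ut in u:
        # minimal w satisfying both w >= u^t - u^{t-1} and w >= 0
        w = max(ut - prev, 0)
        total += c * w
        prev = ut
    return total

u = [1, 1, 0, 1]  # starts up at t = 1 (from u0 = 0) and again at t = 4
# both formulations charge exactly two start-ups
```

Since the objective minimizes $c_i w_i^t$ with $c_i \ge 0$, an optimal solution always drives $w_i^t$ down to this minimal value, which is why the relaxation to continuous $w_i^t$ is exact.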
We can represent this problem in short notation as

F = \min_{x,u,w} \Big\{ \sum_{i=1}^{n} \sum_{t=1}^{H} f_i(x_i^t) + g_i(w_i^t) \;:\; \sum_{i=1}^{n} x_i^t \ge d^t, \; t = 1, \dots, H; \;\; \sum_{i=1}^{n} u_i^t x_i^{max} \ge \bar{d}^t, \; t = 1, \dots, H; \;\; \{x, u, w\} \in K \Big\}   (3.12)

where K = K_G \cup K_B, with K_G and K_B given by

K_G = \big\{ (x, u, w) : x_i^{min} u_i^t \le x_i^t \le x_i^{max} u_i^t, \; w_i^t \ge u_i^t - u_i^{t-1}, \; w_i^t \ge 0, \; u_i^t \in U_i^t, \; x_i^t \in \mathbb{R}, \; u_i^t \in \{0, 1\}, \; t = 1, \dots, H, \; i \in G \big\},

K_B = \big\{ x : E_i^{min} \le E_i^0 + \sum_{\tau=1}^{t} \bigl( \eta_i x_i^{\tau c} - x_i^{\tau d} / \eta_i \bigr) \le E_i^{max}, \; x_i^t = x_i^{td} - x_i^{tc}, \; x_i^{td}, x_i^{tc} \ge 0, \; t = 1, \dots, H, \; i \in B \big\}.
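A tiny instance of (3.12) can be solved by brute force to make the model concrete. This is purely an illustrative sketch with assumed parameter values (two generators, no storage, no minimum up/down times, dispatch by grid search); it is not the solution method of this thesis, which would be far too slow for realistic instances.

```python
# Brute-force toy UC: quadratic costs f_i(x) = q_i x^2, start-up costs c_i,
# capacities x_i^max; enumerate every 0/1 commitment schedule.
from itertools import product

def cheapest_dispatch(q, xmax, on, d, steps=200):
    """Cheapest split of demand d over committed units, or None if the
    committed capacity cannot cover d."""
    cap = [xmax[i] if on[i] else 0.0 for i in range(2)]
    if cap[0] + cap[1] < d:
        return None
    best = None
    for k in range(steps + 1):
        x0 = cap[0] * k / steps if cap[0] > 0 else 0.0
        x1 = d - x0
        if 0.0 <= x1 <= cap[1] + 1e-12:
            cost = q[0] * x0 ** 2 + q[1] * x1 ** 2
            best = cost if best is None else min(best, cost)
        if cap[0] == 0:
            break  # x0 is fixed at 0, nothing else to scan
    return best

def solve_toy_uc(q, c, xmax, demand, u0=(0, 0)):
    """Enumerate all commitment schedules and keep the cheapest one."""
    H = len(demand)
    best_cost, best_sched = float("inf"), None
    for bits in product((0, 1), repeat=2 * H):
        u = [bits[2 * t: 2 * t + 2] for t in range(H)]
        total, prev = 0.0, u0
        for t in range(H):
            total += sum(c[i] * max(u[t][i] - prev[i], 0) for i in range(2))
            disp = cheapest_dispatch(q, xmax, u[t], demand[t])
            if disp is None:
                total = float("inf")
                break
            total += disp
            prev = u[t]
        if total < best_cost:
            best_cost, best_sched = total, u
    return best_cost, best_sched

cost, sched = solve_toy_uc(q=(1.0, 1.0), c=(5.0, 5.0),
                           xmax=(2.0, 2.0), demand=[1.0, 1.0])
# committing a single generator for both periods is optimal here
```

With identical units, paying one start-up (5) plus two dispatch costs of 1 beats committing both units (two start-ups plus a cheaper split), which already shows why commitment and dispatch must be optimized jointly.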
3.2 Stochastic Unit Commitment
The above-mentioned model can be solved relatively easily by standard optimization software since it is relatively small. In this model we assume that the vector $d^t$ is known in advance. Hence, prescient decision makers that know the future demand can use this model. However, in practice the realizations of the demand vector $d^t$ are only known up to (and thus including) time t when the t-th stage decision has to be made. In stochastic programming it is common to represent the uncertain quantity as a random variable. Hence, we assume that the information about the net demand is given by the discrete-time stochastic process $\{\xi^t\}_{t=1}^{H}$ that is defined on the probability space $(\Omega, \mathcal{F}, P)$, with $\xi^t$ taking on real values. Clearly the net demand $d^t$ is influenced by, for example, the weather, the time of the day, the day of the week, and many other factors (Takriti et al., 1996), and hence can be represented as a random variable. The available information at stage t consists of the random vector $(\xi^1, \dots, \xi^t)$ and the decisions at stage t may only depend on this vector.
are given by (3.14), (3.15), and (3.16) respectively in the following stochastic decision model:

\min_{y} \; E\Big[ \sum_{t=1}^{H} f(y^t, \xi^t) \Big]   (3.13)
s.t. \; y^t \text{ is non-anticipative},   (3.14)
x^t \in X^t, \; B^t(\xi^t) y^t \ge d^t(\xi^t), \; P\text{-a.s.}, \; t = 1, \dots, H   (3.15)
\sum_{\tau=1}^{t} A^{t\tau}(\xi^t) y^\tau \ge g^t(\xi^t), \; P\text{-a.s.}, \; t = 1, \dots, H   (3.16)

where P-a.s. means that constraints (3.15) and (3.16) hold almost surely, i.e. for all $\omega \in \Omega$ (except perhaps for sets with zero probability) (Birge and Louveaux, 1997). For further details we refer to Römisch and Schultz (2001).
| Stage | Prior commitment | Prior generation | Observed outcomes | Residual uncertainty | New commitment | New generation |
|-------|------------------|------------------|-------------------|----------------------|----------------|----------------|
| 1 | $u^1$ (given) | none | $\xi^1$ | $P(\xi^2, \dots, \xi^H \mid \xi^1)$ | $u^2$ | $x^1$ |
| 2 | $u^1, u^2$ | $x^1$ | $\xi^1, \xi^2$ | $P(\xi^3, \dots, \xi^H \mid \xi^1, \xi^2)$ | $u^3$ | $x^2$ |
| 3 | $u^1, u^2, u^3$ | $x^1, x^2$ | $\xi^1, \xi^2, \xi^3$ | $P(\xi^4, \dots, \xi^H \mid \xi^1, \xi^2, \xi^3)$ | $u^4$ | $x^3$ |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| H−1 | $u^1, \dots, u^{H-1}$ | $x^1, \dots, x^{H-2}$ | $\xi^1, \dots, \xi^{H-1}$ | $P(\xi^H \mid \xi^1, \dots, \xi^{H-1})$ | $u^H$ | $x^{H-1}$ |
| H | $u^1, \dots, u^H$ | $x^1, \dots, x^{H-1}$ | $\xi^1, \dots, \xi^H$ | none | none | $x^H$ |

Table 3.1: Decision stages, setting the order of observations and decisions. From Defourny et al. (2011)
3.2.1 Deterministic equivalent problem
We assume that Ω is finite, that is Ω = {ω_s}_{s=1}^S, that F is the power set of Ω, and that P({ω_s}) = p_s, s = 1,...,S. We denote by d^t_s := ξ_t(ω_s) the value of the net demand in scenario s at stage t. Hence, we consider a set of possible scenarios that represent the possible future demand. A weight p_s is assigned to each scenario and represents the probability of its occurrence, s = 1,...,S. Hence, instead of considering the demand vector d, we consider the demand vectors d_s, s = 1,...,S. Importantly, we introduce the non-anticipativity constraints. If we solve problem (3.1) S times, every time with a different demand vector d_s, we in general get different solutions. This information is not useful to the decision maker, since he must make his decisions a priori and does not know the future demand with certainty. "Non-anticipativity means that if two different scenarios s_1 and s_2 are indistinguishable at time t on the basis of information available about them at time t, then the decision made for scenario s_1 must be the same as that of scenario s_2" (Takriti and Birge, 2000). Following Takriti and Birge (2000), we partition the scenario set at each time period into disjoint subsets called bundles. If two scenarios have the same demand values for time periods 1,...,t, then they are members of the same bundle B(s,t) at time t. We can write the non-anticipativity constraint as
\[
B(s_1,t) = B(s_2,t) \implies u^t_{is_1} = u^t_{is_2},\; x^t_{is_1} = x^t_{is_2}, \quad i = 1,\dots,n,\; t = 1,\dots,H,\; \forall (s_1,s_2) \in \{1,\dots,S\}^2.
\]
Figure 3.1 gives an example of a scenario tree; the connected dotted lines represent the scenario bundles. In each bundle the decision variables must be non-anticipative and hence must be equal to each other.
Figure 3.1: Example Scenario tree and non-anticipativity constraints. Taken from Nowak and Römisch (2000)
To represent this as a linear constraint, we adopt the following notation from Takriti and Birge (2000). Let T(s) be the first period in which a scenario s ≥ 2 does not share a bundle with any scenario s' < s. Let σ(s) < s be a scenario that shares the same bundle with s at all time periods prior to and including T(s) − 1. Then we can rewrite the bundle constraint as
\[
u^t_{is} - u^t_{i\sigma(s)} = 0, \quad x^t_{is} - x^t_{i\sigma(s)} = 0, \qquad i = 1,\dots,n,\; s = 2,\dots,S,\; t = 1,\dots,T(s)-1.
\]
Now we can define the following deterministic equivalent problem, which is a large-scale mixed-integer quadratic program.
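The bundle structure B(s,t) and the derived quantities T(s) and σ(s) can be computed directly from the scenario demand paths by grouping scenarios on their demand prefixes. The following sketch (ours, not from the thesis; names and the small example are illustrative) shows one way to do this:

```python
def bundles(demand):
    """demand[s][t]: net demand of scenario s at period t (0-indexed).
    Returns B with B[s][t] = frozenset of scenarios sharing s's demand prefix up to t."""
    S, H = len(demand), len(demand[0])
    B = [[None] * H for _ in range(S)]
    for t in range(H):
        groups = {}
        for s in range(S):
            groups.setdefault(tuple(demand[s][:t + 1]), []).append(s)
        for members in groups.values():
            fs = frozenset(members)
            for s in members:
                B[s][t] = fs
    return B

def first_split_and_predecessor(B):
    """T[s]: first period (1-indexed) where s shares no bundle with any s' < s.
    sigma[s]: a scenario < s sharing s's bundle at all periods up to T[s] - 1."""
    S, H = len(B), len(B[0])
    T, sigma = [None] * S, [None] * S
    for s in range(1, S):
        Ts = H + 1  # convention: s never splits within the horizon
        for t in range(H):
            if min(B[s][t]) == s:  # no smaller scenario left in s's bundle
                Ts = t + 1
                break
        T[s] = Ts
        last_shared = Ts - 2  # 0-indexed period T(s) - 1
        if last_shared >= 0:
            sigma[s] = min(B[s][last_shared])
    return T, sigma
```

For example, with three scenarios whose demand paths are (1,1,1), (1,1,2) and (1,2,3), scenarios 1 and 2 (0-indexed 0 and 1) share a bundle through period 2 and split at period 3, while scenario 3 splits at period 2.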
\[
F = \min_{x,u,w} \Big\{ \sum_{s=1}^{S} p_s \sum_{i=1}^{n} \sum_{t=1}^{H} \big[ f_i(x^t_{is}) + g_i(w^t_{is}) \big] \;:\; \sum_{i=1}^{n} x^t_{is} \ge d^t_s,\; \{x,u\} \in \mathcal{N},\; \{x_s,u_s,w_s\} \in K_s,\; t = 1,\dots,H,\; s = 1,\dots,S \Big\} \tag{3.17}
\]
where K_s = K_s^G ∪ K_s^B, with K_s^G and K_s^B given by
\[
K_s^G = \Big\{ (x_s,u_s,w_s) : x_i^{\min} u^t_{is} \le x^t_{is} \le x_i^{\max} u^t_{is},\; w^t_{is} \ge u^t_{is} - u^{t-1}_{is},\; w^t_{is} \ge 0,\; u^t_{is} \in U^t_{is},\; x^t_{is} \in \mathbb{R},\; u^t_{is} \in \{0,1\},\; t = 1,\dots,H,\; i \in G \Big\},
\]
\[
K_s^B = \Big\{ x : E_i^{\min} \le E_i^0 + \sum_{\tau=1}^{t} \Big( \eta_i x^{\tau c}_{is} - \frac{x^{\tau d}_{is}}{\eta_i} \Big) \le E_i^{\max},\; x^t_{is} = x^{td}_{is} - x^{tc}_{is},\; x^{td}_{is}, x^{tc}_{is} \ge 0,\; t = 1,\dots,H,\; i \in B \Big\}.
\]
The set N represents the non-anticipativity constraints, that is
\[
\mathcal{N} = \Big\{ (x,u) : u^t_{is} - u^t_{i\sigma(s)} = 0,\; x^t_{is} - x^t_{i\sigma(s)} = 0,\; i = 1,\dots,n,\; s = 2,\dots,S,\; t = 1,\dots,T(s)-1 \Big\}.
\]
Table 3.2: Dimension of the large-scale deterministic equivalent mixed-integer program for H = 24, |G| = 10, |B| = 10, with three branches each period.

Branching periods   Scenarios        Nodes        Binary variables   Continuous variables   Constraints
 1                  1                24           240                960                    2,256
 2                  3                70           720                2,880                  6,580
 3                  9                202          2,160              8,640                  18,988
 4                  27               580          6,480              25,920                 54,520
 5                  81               1,660        19,440             77,760                 156,040
10                  59,049           8.56·10^5    1.42·10^7          5.67·10^7              8.05·10^7
15                  14,348,907       1.36·10^8    3.44·10^9          1.38·10^10             1.28·10^10
20                  3,486,784,401    1.57·10^10   8.37·10^11         3.35·10^12             1.47·10^12
The integer constraints on (some of) the decision variables render the stochastic programming problem non-convex and therefore increase the difficulty of the problem considerably (Watson and Woodruff, 2011). Standard MIP solvers can be used directly to solve optimization problem (3.17), in which all S scenarios are considered simultaneously. However, this works only for small and easy problem instances (Gade et al., 2016). Note that problem (3.17) consists of S deterministic sub-problems that are coupled through the non-anticipativity constraints. The problem grows exponentially in size as we increase the number of scenarios S. This is illustrated in Table 3.2, where we consider 24 time periods, 10 generators and 10 storage devices. Hence, for large instances, standard MIP solvers are no longer capable of solving the problem within a practically desired time frame.
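The variable counts in Table 3.2 can be reproduced from the instance dimensions. The sketch below (ours, not from the thesis) uses S = 3^(b−1) scenarios for b branching periods, as in the first rows of the table, and a hypothetical variable split — one binary commitment per generator, one continuous production level per generator, and three continuous variables (net, charge, discharge) per battery — whose totals match the table:

```python
def problem_size(branching_periods, H=24, n_gen=10, n_bat=10, branches=3):
    """Rough variable counts for the deterministic equivalent (3.17).
    Binary: commitment u per generator, period and scenario.
    Continuous: x per generator plus x, x^c, x^d per battery (an assumed
    accounting whose totals reproduce Table 3.2; the thesis does not
    spell out the split)."""
    S = branches ** (branching_periods - 1)
    binary = n_gen * H * S
    continuous = (n_gen + 3 * n_bat) * H * S
    return S, binary, continuous
```

The exponential factor S = 3^(b−1) dominates: moving from 5 to 10 branching periods multiplies every count by several hundred, which is why standard MIP solvers stop being practical.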
3.3 Outline of Chapters 4 and 5

Up to now we have presented the deterministic model, the stochastic model and the deterministic equivalent model. To accomplish our research goal, we will use Lagrangian relaxation of the coupling constraint Σ_{i=1}^n x^t_{is} ≥ d^t_s such that the problem can be separated into n single-facility sub-problems, that is, one sub-problem for each facility. This is discussed in section 4.1. However, due to the size of S, these single-facility sub-problems are still large-scale optimization problems. Hence, we again use Lagrangian relaxation, now on the non-anticipativity constraints, to decompose these single-facility sub-problems into single-scenario single-facility sub-problems that can be easily solved. This is described in section 4.2.
4 | Decomposition

In this chapter we consider two decompositions. For a decentralized algorithm we need to relax the coupling constraints; we use Lagrangian relaxation to accomplish this, which results in n single-facility sub-problems. We then relax the non-anticipativity constraints of each of these sub-problems, again via Lagrangian relaxation, to create single-scenario single-facility sub-problems that are easy to solve.
4.1 Lagrangian relaxation of the demand-supply constraint

According to our research goal, we are looking for a decentralized algorithm to solve the stochastic unit commitment problem. A common technique to design decentralized algorithms is to decompose the given problem by relaxing the coupling constraint, in our case the demand-supply constraints. This decomposes the problem into n sub-problems. Consider optimization problem (3.17). The Lagrangian function, in which we have relaxed the demand-supply balancing constraint, is
\[
L(\lambda) = \min_{u,x,w} \sum_{s=1}^{S} p_s \Big[ \sum_{i=1}^{n} \sum_{t=1}^{H} \big( f_i(x^t_{is}) + g_i(w^t_{is}) \big) - \sum_{t=1}^{H} \lambda^t_s \Big( \sum_{i=1}^{n} x^t_{is} - d^t_s \Big) \Big]
\]
\[
\text{s.t.}\quad \{x,u\} \in \mathcal{N},\qquad \{x_s,u_s,w_s\} \in K_s,\; s = 1,\dots,S,
\]
where λ^t_s is the Lagrange multiplier. We can rewrite this as
\[
L(\lambda) = \min_{u,x,w} \sum_{s=1}^{S} p_s \sum_{i=1}^{n} \sum_{t=1}^{H} \Big[ f_i(x^t_{is}) + g_i(w^t_{is}) - \lambda^t_s \Big( x^t_{is} - \frac{1}{n} d^t_s \Big) \Big]
\]
\[
\text{s.t.}\quad \{x,u\} \in \mathcal{N},\qquad \{x_s,u_s,w_s\} \in K_s,\; s = 1,\dots,S.
\]
Note that L(λ) is separable in each facility i, such that L(λ) = Σ_{i∈G} φ^G_i(λ) + Σ_{i∈B} φ^B_i(λ), where
\[
\phi^G_i(\lambda) = \min_{u_i,x_i,w_i} \sum_{s=1}^{S} p_s \sum_{t=1}^{H} \Big[ f_i(x^t_{is}) + g_i(w^t_{is}) - \lambda^t_s \Big( x^t_{is} - \frac{1}{n} d^t_s \Big) \Big] \quad \text{s.t.}\; \{x_i,u_i\} \in \mathcal{N}_i,\; \{x_{is},u_{is},w_{is}\} \in K^G_{is},\; s = 1,\dots,S,
\]
and
\[
\phi^B_i(\lambda) = \min_{x_i} \sum_{s=1}^{S} p_s \sum_{t=1}^{H} \Big[ f_i(x^t_{is}) - \lambda^t_s \Big( x^t_{is} - \frac{1}{n} d^t_s \Big) \Big] \quad \text{s.t.}\; x_i \in \mathcal{N}_i,\; x_{is} \in K^B_{is},\; s = 1,\dots,S.
\]
Furthermore, note that the sets N, K^G_s and K^B_s are also separable in each facility i. Hence, for a given λ, the problem L(λ) can be solved by solving the sub-problems φ_i(λ) for i = 1,...,n. The Lagrangian dual problem has the form sup_{λ≥0} L(λ).
The Lagrangian function L(λ) is concave and continuous in λ (Takriti et al., 1996). Furthermore, it is bounded; hence a global optimum can be reached using an appropriate convex programming method (Takriti et al., 1996). By weak duality, the maximum dual objective value is a lower bound on the optimal primal objective value. We define the duality gap for any feasible solution as
\[
DG(\lambda) = F(x,u,w) - L(\lambda) \ge 0.
\]
Since the feasible set of L(λ) is not convex, the duality gap is expected to be strictly positive. Note that the sub-problems φ^G_i(λ) and φ^B_i(λ) inherit the same properties as L(λ).
4.2 Lagrangian relaxation of the non-anticipativity constraints

Due to the size of S, φ^G_i(λ) and φ^B_i(λ) are very large problems. To solve these problems, we use Lagrangian relaxation of the non-anticipativity constraints.
Note that if we assume that the production variables x^t_{is} of the storage devices are non-anticipative (that is, x^t_{is} ∈ N for i ∈ B), then the non-anticipativity constraints on the commitment variables u^t_{is} ∈ N of the generators imply non-anticipativity of the generation variables x^t_{is} ∈ N of the generators i ∈ G. We can explain this by rewriting the demand-supply constraint as Σ_{i∈G} x^t_{is} ≥ d^t_s − Σ_{i∈B} x^t_{is}. We assume that B(s_1,t) = B(s_2,t) implies x^t_{i,s_1} = x^t_{i,s_2} for i ∈ B. The functions f_i and the sets K^t_i are independent of s, and the demand for every scenario in a bundle is the same, also for previous time periods; that is, B(s_1,t) = B(s_2,t) implies B(s_1,τ) = B(s_2,τ), τ = 1,...,t−1 (Takriti and Birge, 2000). Because of this, the bundle constraints on u^t_{is} result in continuous variables x^t_{is} that also satisfy the bundle constraints. A similar argument is given by Takriti and Birge (2000), who only consider generators and no storage devices. Hence we can rewrite φ^G_i(λ) as
\[
\phi^G_i(\lambda) = \min_{u_i,x_i,w_i} \sum_{s=1}^{S} p_s \sum_{t=1}^{H} \Big[ f_i(x^t_{is}) + g_i(w^t_{is}) - \lambda^t_s \Big( x^t_{is} - \frac{1}{n} d^t_s \Big) \Big]
\]
\[
\text{s.t.}\quad \{x_{is},u_{is},w_{is}\} \in K^G_{is},\; s = 1,\dots,S, \qquad u^t_{is} - u^t_{i\sigma(s)} = 0,\; s = 2,\dots,S,\; t = 1,\dots,T(s)-1.
\]
If we relax the non-anticipativity constraints, we obtain the following dual function:
L^G_i(λ, µ), which is separable in S sub-problems. Such a sub-problem is
\[
\rho^G_{is}(\lambda,\mu) = \min_{u_{is},x_{is},w_{is}} p_s \sum_{t=1}^{H} \Big[ f_i(x^t_{is}) + g_i(w^t_{is}) - \lambda^t_s \Big( x^t_{is} - \frac{1}{n} d^t_s \Big) \Big] - \sum_{t=1}^{T(s)-1} \mu^t_{is} \big( u^t_{is} - u^t_{i\sigma(s)} \big) \quad \text{s.t.}\; \{x_{is},u_{is},w_{is}\} \in K^G_{is}.
\]
In a similar way we can define S sub-problems for each storage device i ∈ B:
\[
\rho^B_{is}(\lambda,\mu) = \min_{x_{is}} p_s \sum_{t=1}^{H} \Big[ f_i(x^t_{is}) - \lambda^t_s \Big( x^t_{is} - \frac{1}{n} d^t_s \Big) \Big] - \sum_{t=1}^{T(s)-1} \mu^t_{is} \big( x^t_{is} - x^t_{i\sigma(s)} \big) \quad \text{s.t.}\; x_{is} \in K^B_{is}.
\]
The Lagrangian function L_i(λ, µ) is concave and continuous in µ (Takriti et al., 1996). Furthermore, it is bounded. For any feasible solution we can define the duality gaps for both sub-problems as
\[
DG^G_i = \phi^G_i(\lambda) - L^G_i(\lambda,\mu) \ge 0, \qquad DG^B_i = \phi^B_i(\lambda) - L^B_i(\lambda,\mu) \ge 0,
\]
where L^G_i(λ,µ) = Σ_{s=1}^S ρ^G_{is}(λ,µ) and L^B_i(λ,µ) = Σ_{s=1}^S ρ^B_{is}(λ,µ). Note that for the sub-problem ρ^B_{is}(λ,µ) the feasible set K^B_{is} is convex, since it contains no integrality requirements. Hence a global optimum can be found for these sub-problems.
4.2.1 Distributed interpretation
We have decomposed the optimization problem into the single-facility sub-problems φ^G_i(λ), i ∈ G, and φ^B_i(λ), i ∈ B. Hence, for a given λ we can solve these sub-problems in a decentralized way. Since they are still very large optimization problems, we decompose them further into the single-facility single-scenario sub-problems ρ^G_{is}(λ,µ), i ∈ G, s = 1,...,S, and ρ^B_{is}(λ,µ), i ∈ B, s = 1,...,S, which can be easily solved.
[Figure: schematic of the distributed setup; the ISO broadcasts the price λ to the facilities i = 1, ..., 4, which return their decisions x_1, ..., x_4.]
5 | Algorithm

Up to now we have sketched a decentralized algorithm in which we have decomposed our optimization problem into sub-problems for every facility and every scenario. Each facility can solve its own sub-problems given the Lagrange multipliers. In this chapter we discuss techniques for solving these sub-problems and for determining the Lagrange multipliers.
5.1 Subgradient method

We use a simple yet powerful approach to solve the Lagrangian dual: the subgradient algorithm (cf. Wolsey (1998)), sometimes also called the broadcast-and-gather algorithm (Boyd et al., 2011). Its pseudo-code is given in Algorithm 1. For a given initial λ_0 we solve L(λ); we do this in parallel since L(λ) is separable in each facility i. We then calculate the duality gap DG and update λ, until the duality gap is small enough or the maximum number of iterations is reached. We use the termination criterion DG(λ)/F ≤ ε = 0.01; that is, whenever the duality gap is less than 1% the algorithm terminates. The Lagrange multiplier λ is moved in the direction of the subgradient of the dual function. Wolsey (1998) mentions that the difficulty lies in choosing the step lengths {α_s(k)}_{k=1}^∞; we discuss this in the next chapter. To maintain dual feasibility, we only allow non-negative values for λ.
Algorithm 1 Decentralized Subgradient Algorithm
1: Broadcast: set λ(0) = λ_0, k = 1.
2: while DG(λ) ≥ ε and k < k_max do
3:   Gather: x_i(k) = argmin_{x_i} φ_i(λ) for i = 1,...,n
4:   Broadcast: λ^t_s(k+1) = max{λ^t_s(k) + α_s(k)(d^t_s − Σ_{i=1}^n x^t_{is}), 0}
5:   k = k + 1
6: end while
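To illustrate the broadcast-and-gather mechanics, the sketch below (ours, not the thesis implementation) runs the subgradient price update on a deliberately simplified single-period, single-scenario problem with purely quadratic generators — no commitment variables, no storage — so that each facility's response to the price λ has a closed form. A constant step size is used instead of a diminishing one for simplicity; all numbers are illustrative:

```python
def clip(v, lo, hi):
    return max(lo, min(hi, v))

def run_subgradient(q, x_max, demand, alpha=0.1, iters=300):
    """Each generator i responds to the broadcast price with
    argmin_x q[i] x^2 - lam x on [0, x_max[i]], i.e. x = clip(lam / (2 q[i])).
    The coordinator then moves lam in the direction of the subgradient
    d - sum(x), projecting onto lam >= 0."""
    lam = 0.0
    for _ in range(iters):
        x = [clip(lam / (2 * qi), 0.0, xm) for qi, xm in zip(q, x_max)]  # gather
        lam = max(lam + alpha * (demand - sum(x)), 0.0)                  # broadcast
    return lam, x

lam, x = run_subgradient(q=[0.5, 1.0], x_max=[100.0, 100.0], demand=50.0)
```

Because this toy problem is convex, the duality gap vanishes and the price settles at the market-clearing level, with total production matching demand.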
In the remainder of this chapter we explain the algorithm in more detail. In section 5.2 we describe the PH algorithm: the single-facility sub-problems are too large to solve directly, so we have decomposed them into single-facility single-scenario sub-problems, and we use the PH algorithm to find non-anticipative solutions to the single-facility sub-problems. In section 5.3 we explain in more detail how we update the Lagrange multiplier λ. In section 5.4 we describe how to efficiently find an initial Lagrange multiplier λ_0. In section 5.5 we describe several methods to enhance the algorithm. Finally, in section 5.6 we present the final algorithm.
5.2 Progressive Hedging algorithm

To solve the single-scenario single-facility sub-problems we use the Progressive Hedging (PH) algorithm. We could use the subgradient method to update the Lagrange multiplier µ; however, similar to Takriti and Birge (2000), we use the PH algorithm, since it converges at a faster rate than standard Lagrangian relaxation (Takriti and Birge, 2000). We rewrite the non-anticipativity constraints as u^t_{is} = c^{B(s,t)}_i and x^t_{is} = c^{B(s,t)}_i, where c^{B(s,t)}_i is the value that needs to be assigned to the i-th decision variable u^t_{is} (if i ∈ G) or x^t_{is} (if i ∈ B) for all scenarios that belong to the bundle B(s,t) at time t (Takriti and Birge, 2000). Hence, we can rewrite the sub-problems ρ^G_{is}(λ,µ) and ρ^B_{is}(λ,µ) as
\[
\rho^G_{is}(\lambda,\mu) = \min_{u_{is},x_{is},w_{is}} \Big\{ p_s \sum_{t=1}^{H} \Big[ f_i(x^t_{is}) + g_i(w^t_{is}) - \lambda^t_s \Big( x^t_{is} - \frac{1}{n} d^t_s \Big) \Big] - \sum_{t=1}^{H} \mu^t_{is} \big( u^t_{is} - c^{B(s,t)}_i \big) : \{x_{is},u_{is},w_{is}\} \in K^G_{is} \Big\}
\]
and
\[
\rho^B_{is}(\lambda,\mu) = \min_{x_{is}} \Big\{ p_s \sum_{t=1}^{H} \Big[ f_i(x^t_{is}) - \lambda^t_s \Big( x^t_{is} - \frac{1}{n} d^t_s \Big) \Big] - \sum_{t=1}^{H} \mu^t_{is} \big( x^t_{is} - c^{B(s,t)}_i \big) : x_{is} \in K^B_{is} \Big\}.
\]
As mentioned by Takriti and Birge (2000), the value of c^{B(s,t)}_i is not known in advance, but it can be approximated by a weighted average of the decisions made:
\[
c^{B(s,t)}_i = \frac{\sum_{s'} p_{s'} u^t_{is'}}{\sum_{s'} p_{s'}} \;\text{ if } i \in G \qquad \text{and} \qquad c^{B(s,t)}_i = \frac{\sum_{s'} p_{s'} x^t_{is'}}{\sum_{s'} p_{s'}} \;\text{ if } i \in B,
\]
where the summation is taken over all scenarios s' that share the bundle B(s,t) at time t.
Similar to Takriti and Birge (2000), the Lagrange multiplier µ^t_{is} is updated via the rule
\[
\mu^t_{is}(k+1) = \mu^t_{is}(k) + r(k) \big( u^t_{is}(k) - c^{B(s,t)}_i(k) \big),
\]
where r(k) > 0, and we use the same progressive hedging algorithm; see Algorithm 2.
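One PH averaging-and-update step is elementary bookkeeping: compute the probability-weighted bundle average c^{B(s,t)}_i and move each scenario's multiplier towards the consensus value. A minimal sketch (our naming, illustrative numbers, not the thesis instance):

```python
def bundle_average(p, u, bundle):
    """c^{B(s,t)}_i: probability-weighted average of the decisions of the
    scenarios in the bundle."""
    weight = sum(p[s] for s in bundle)
    return sum(p[s] * u[s] for s in bundle) / weight

def ph_multiplier_step(p, u, mu, bundle, r):
    """mu(k+1) = mu(k) + r (u(k) - c(k)), applied per scenario in the bundle."""
    c = bundle_average(p, u, bundle)
    return c, {s: mu[s] + r * (u[s] - c) for s in bundle}

# Three scenarios in one bundle; the first two commit, the third does not.
c, mu = ph_multiplier_step(p=[0.5, 0.3, 0.2], u=[1, 1, 0],
                           mu={0: 0.0, 1: 0.0, 2: 0.0}, bundle=[0, 1, 2], r=1.0)
```

The penalty pushes dissenting scenarios towards the consensus: here c = 0.8, so the multiplier of the dissenting scenario moves strongly negative while the agreeing scenarios receive a small positive push.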
Algorithm 2 Progressive Hedging Algorithm Generator, from Takriti and Birge (2000)
1: Initialize multipliers µ^t_{is}(0) and r(k) > 0. Let k = 0. Go to 2.
2: Let (u^t_{is}, x^t_{is}) solve ρ^G_{is}(λ_s, µ_{is}) for s = 1,...,S (in parallel).
3: Compute c^{B(s,t)}_i(k) and µ^t_{is}(k+1) = µ^t_{is}(k) + r(k)(u^t_{is}(k) − c^{B(s,t)}_i(k)).
4: If c^{B(s,t)}_i(k) = u^t_{is}(k) and µ^t_{is}(k+1) = µ^t_{is}(k), then stop; c^{B(s,t)}_i(k) and u^t_{is}(k) are optimal. Otherwise, let k = k + 1 and go to 2.
Algorithm 3 Progressive Hedging Algorithm Battery
1: Initialize multipliers µ^t_{is}(0) and r(k) > 0. Let k = 0. Go to 2.
2: Let x^t_{is} solve ρ^B_{is}(λ_s, µ_{is}) for s = 1,...,S (in parallel).
3: Compute c^{B(s,t)}_i(k) and µ^t_{is}(k+1) = µ^t_{is}(k) + r(k)(x^t_{is}(k) − c^{B(s,t)}_i(k)).
4: If c^{B(s,t)}_i(k) = x^t_{is}(k) and µ^t_{is}(k+1) = µ^t_{is}(k), then stop; c^{B(s,t)}_i(k) and x^t_{is}(k) are optimal. Otherwise, let k = k + 1 and go to 2.
The PH algorithm for the battery model ρ^B_{is}(λ,µ) (Algorithm 3) is similar to the PH algorithm for the generator model ρ^G_{is}(λ,µ); the difference is that the averaging and multiplier updates now act on the continuous variables x^t_{is} instead of the binary commitment variables. Terminating before full convergence is often practically desirable to avoid long run-times and to mitigate issues associated with numerical tolerances. Therefore, for the Progressive Hedging algorithm we do not limit the maximum number of iterations, but terminate whenever
\[
\sum_{s=1}^{S} \sum_{t=1}^{H} \big| u^t_{is} - c^{B(s,t)}_i \big| \le \epsilon = 0.01
\]
for the generators, and similarly for the storage devices.
5.3 Updating Lagrange multipliers

We base our update rule on the subgradient method as described by Takriti et al. (1996):
\[
\lambda_s(l+1) = \max\{\lambda_s(l) + \alpha(l)\,\xi_s(l),\; 0\}, \tag{5.1}
\]
where ξ_s(l) is the subgradient of L(·) at λ_s(l). Hence, for a given scenario s, the subgradient at time t is equal to d^t_s − Σ_{i=1}^n x^t_{is}; since L is concave and we maximize it, λ is moved in the direction of the subgradient. To maintain dual feasibility we only allow non-negative values for λ. Similar to Takriti et al. (1996), we set α(k) = 1/(a + bk) with a, b > 0. The impact of a and b on the number of iterations is large; this is tested in section 6.2.2.
5.4 Initial Lagrange multipliers

The goal of Algorithm 1 is to find values λ* such that the duality gap satisfies DG(λ*) ≤ ε. The initial Lagrange multipliers λ_0 have a large impact on the number of iterations and therefore on the execution time of the algorithm: if the initial Lagrange multiplier λ_0 is very close to λ*, we expect that fewer iterations are necessary. In this section we describe how to find good initial values λ_0. We follow the method of Takriti et al. (1996) and extend it so that we can also consider the storage devices. The idea is to approximate the problems φ^G_i and φ^B_i in such a way that we can easily solve them. We solve these approximations for a given λ and subsequently update λ until an optimal solution of the approximation is found. The approximation is a convex optimization problem, and therefore we can find an optimal solution with the subgradient algorithm. The final value of λ can then be used as the initial Lagrange multiplier λ_0 in Algorithm 1.

We use Algorithm 4 to find an initial Lagrange multiplier λ_0; the steps of the algorithm are explained in sections 5.4.1 and 5.4.2. Since we solve a convex optimization problem, we can find a solution whose duality gap is equal to zero; for numerical reasons we terminate the algorithm once the duality gap is smaller than some threshold value ε.
Algorithm 4 Lagrange multiplier initialization
1: Set λ^t_s = 0 for all t = 1,...,H, s = 1,...,S, and set k = 0.
2: while DG(λ) ≥ ε do
3:   k := k + 1
4:   For given λ^t_s, find x^t_{is} according to equation (5.2) for i ∈ G (in parallel).
5:   For given λ^t_s, find x^t_{is} with Algorithm 5 for i ∈ B (in parallel).
6:   Update λ.
7: end while
5.4.1 Generator model approximation

Consider the following approximation of the generator sub-problem, as described by Takriti et al. (1996). We approximate φ^G_i(λ) by
\[
\min_{x_i} \sum_{s=1}^{S} p_s \sum_{t=1}^{H} \Big[ f_i(x^t_{is}) + \frac{c_i}{x_i^{\max}} x^t_{is} - \lambda^t_s \Big( x^t_{is} - \frac{1}{n} d^t_s \Big) \Big] \quad \text{s.t.}\; 0 \le x^t_{is} \le x_i^{\max},\; t = 1,\dots,H,
\]
where we have omitted the binary variables u^t_{is}. We approximate the start-up costs by the term (c_i / x_i^max) x^t_{is}; this term takes the start-up costs of each unit into consideration (Takriti et al., 1996). Furthermore, we relax the minimum generation constraint. Note that this problem is separable into S × H sub-problems. We get the following one-variable sub-problem:
\[
\min_{x^t_{is}} \; f_i(x^t_{is}) + \frac{c_i}{x_i^{\max}} x^t_{is} - \lambda^t_s \Big( x^t_{is} - \frac{1}{n} d^t_s \Big) \quad \text{s.t.}\; 0 \le x^t_{is} \le x_i^{\max}.
\]
Since f_i(·) is a quadratic function, we can easily find an algebraic expression for x^t_{is} as a function of λ^t_s. If we assume an interior solution, then x^t_{is} follows from the first-order optimality condition
\[
\frac{\partial}{\partial x^t_{is}} \Big[ q_i (x^t_{is})^2 + \frac{c_i}{x_i^{\max}} x^t_{is} - \lambda^t_s \Big( x^t_{is} - \frac{1}{n} d^t_s \Big) \Big] = 0 \iff x^t_{is} = \frac{\lambda^t_s - c_i / x_i^{\max}}{2 q_i}.
\]
Since x^t_{is} is either an interior or a boundary solution, it follows that
\[
x^t_{is} = \max\Big\{ \min\Big\{ \frac{\lambda^t_s - c_i / x_i^{\max}}{2 q_i},\; x_i^{\max} \Big\},\; 0 \Big\}. \tag{5.2}
\]
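Equation (5.2) makes each generator's response to a price a one-line computation. A sketch (our naming; the numeric example below uses the parameters of generator 1 in Table 6.1):

```python
def generator_response(lam, q, c, x_max):
    """Closed-form minimizer of q x^2 + (c / x_max) x - lam x on [0, x_max],
    i.e. equation (5.2): the interior stationary point clipped to the box."""
    interior = (lam - c / x_max) / (2 * q)
    return max(min(interior, x_max), 0.0)
```

For q = 0.01, c = 15, x^max = 60: a low price yields zero output, a moderate price the interior solution, and a high price the capacity bound.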
5.4.2 Battery model approximation

We identify at least three possible methods to determine a non-anticipative solution for the storage devices given a Lagrange multiplier λ. The first is to use the PH algorithm for the storage devices, that is, Algorithm 3. However, following the method of Takriti et al. (1996) for the generator model approximation, we want an approximation that is separable in the number of time periods. The second method is to use the PH algorithm and, on top of that, Lagrangian relaxation of the storage constraint, so that the problem becomes separable in the number of time periods. We, however, use a combination of these two methods: we use Lagrangian relaxation to make the problem separable, omit the non-anticipativity constraints, but force non-anticipativity of the Lagrange multipliers. This is also called the projected subgradient method (cf. Duchi et al. (2012) or Boyd et al. (2003)).
Consider the optimization problem φ^B_i(λ). To make the problem separable in the number of time periods, we relax the storage constraint. The Lagrangian is given by
\[
L(\lambda,\alpha,\beta) = \min_{x_i} \sum_{s=1}^{S} p_s \Big[ \sum_{t=1}^{H} f_i(x^t_{is}) - \lambda^t_s \Big( x^t_{is} - \frac{1}{n} d^t_s \Big) + \sum_{t=1}^{H} (\beta^t_{is} - \alpha^t_{is}) E_i^0 - \beta^t_{is} E_i^{\max} - (\beta^t_{is} - \alpha^t_{is}) \sum_{\tau=1}^{t} \Big( \eta_i x^{\tau c}_{is} - \frac{x^{\tau d}_{is}}{\eta_i} \Big) \Big]
\]
\[
\text{s.t.}\quad x^t_{is} = x^{td}_{is} - x^{tc}_{is},\quad x^{td}_{is},\, x^{tc}_{is} \ge 0,\quad x^t_i \in \mathcal{N}_i,\qquad t = 1,\dots,H,\; i \in B.
\]
Note that this problem is separable into H sub-problems. This becomes clear if we rewrite it in the following form:
\[
L(\lambda,\alpha,\beta) = \min_{x_i} \sum_{s=1}^{S} p_s \sum_{t=1}^{H} \Big[ f_i(x^t_{is}) - \lambda^t_s \Big( x^t_{is} - \frac{1}{n} d^t_s \Big) + (\beta^t_{is} - \alpha^t_{is}) E_i^0 - \beta^t_{is} E_i^{\max} - \gamma^t_{is} \Big( \eta_i x^{tc}_{is} - \frac{x^{td}_{is}}{\eta_i} \Big) \Big],
\]
subject to the same constraints, where γ^t_{is} = Σ_{j=t}^H (β^j_{is} − α^j_{is}). The corresponding dual problem is given by max_{α,β≥0} L(λ, α, β).
By omitting the non-anticipativity constraint, the problem is separable into S × H sub-problems. From the first-order optimality conditions it follows directly that
\[
\frac{\partial L(\lambda,\alpha,\beta)}{\partial \alpha^t_{is}} = -E_i^0 + \sum_{\tau=1}^{t} \Big( \eta_i x^{\tau c}_{is} - \frac{x^{\tau d}_{is}}{\eta_i} \Big), \qquad \frac{\partial L(\lambda,\alpha,\beta)}{\partial \beta^t_{is}} = E_i^0 - E_i^{\max} - \sum_{\tau=1}^{t} \Big( \eta_i x^{\tau c}_{is} - \frac{x^{\tau d}_{is}}{\eta_i} \Big).
\]
Hence, the update rules for α and β are
\[
\alpha^t_{is}(k+1) = \alpha^t_{is}(k) + \delta(k) \Big( -E_i^0 + \sum_{\tau=1}^{t} \Big( \eta_i x^{\tau c}_{is} - \frac{x^{\tau d}_{is}}{\eta_i} \Big) \Big), \qquad \beta^t_{is}(k+1) = \beta^t_{is}(k) + \delta(k) \Big( E_i^0 - E_i^{\max} - \sum_{\tau=1}^{t} \Big( \eta_i x^{\tau c}_{is} - \frac{x^{\tau d}_{is}}{\eta_i} \Big) \Big).
\]
To maintain dual feasibility, we set the Lagrange multipliers α and β equal to zero whenever they become negative. The step length is δ(k). Furthermore,
\[
\frac{\partial L(\lambda,\alpha(k),\beta(k))}{\partial x^{tc}_{is}} = 0 \iff x^{tc}_{is}(k) = \frac{-\lambda^t_s + \eta_i \gamma^t_{is}(k)}{2 q_i}, \tag{5.3}
\]
\[
\frac{\partial L(\lambda,\alpha(k),\beta(k))}{\partial x^{td}_{is}} = 0 \iff x^{td}_{is}(k) = \frac{\lambda^t_s - \gamma^t_{is}(k)/\eta_i}{2 q_i}. \tag{5.4}
\]
To maintain feasibility, we do not allow negative values for x^{td}_{is} and x^{tc}_{is}, and hence
\[
x^{tc}_{is}(k) = \max\Big\{ \frac{-\lambda^t_s + \eta_i \gamma^t_{is}(k)}{2 q_i},\; 0 \Big\}, \qquad x^{td}_{is}(k) = \max\Big\{ \frac{\lambda^t_s - \gamma^t_{is}(k)/\eta_i}{2 q_i},\; 0 \Big\}. \tag{5.5}
\]
Note that λ ∈ N and γ(k) ∈ N imply that x^t_{is} ∈ N for s = 1,...,S and t = 1,...,H. Our goal is to find optimal production levels for the storage devices such that they are non-anticipative. However, after updating α and β, we can no longer guarantee that γ is non-anticipative, and if γ is not non-anticipative, neither are the production levels. To make sure that the production levels are non-anticipative, we average γ per scenario bundle B(s,t): we calculate the weighted average
\[
c^{B(s,t)}_i = \frac{\sum_{s'} p_{s'} \gamma^t_{is'}}{\sum_{s'} p_{s'}}
\]
and set γ^t_{is} := c^{B(s,t)}_i. In other words, we project γ onto the set N. The projected subgradient method has the same convergence results as the subgradient method; hence Algorithm 5 finds the optimal value of the problem φ^B_i(λ) (Boyd et al., 2003).
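The projected closed forms (5.3)–(5.5) can be evaluated independently per period and scenario. Note that for λ ≥ 0 and η ≤ 1 at most one of the two candidate levels is positive, since η γ > λ and λ > γ/η cannot hold simultaneously. A sketch (our naming; illustrative numbers):

```python
def battery_response(lam, gamma, q, eta):
    """Projected closed forms (5.3)-(5.5): candidate charge/discharge levels
    for one period and one scenario, clipped at zero to keep them feasible."""
    x_charge = max((-lam + eta * gamma) / (2 * q), 0.0)
    x_discharge = max((lam - gamma / eta) / (2 * q), 0.0)
    return x_charge, x_discharge
```

With λ = 10, γ = 2, q = 0.05 and η = 0.9 the device discharges; lowering the price to λ = 1 (keeping γ = 2) makes charging attractive instead.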
Algorithm 5 Projected Subgradient Algorithm
1: Set α^t_{is} = 0, β^t_{is} = 0 for all t = 1,...,H, s = 1,...,S, and set k = 0.
2: while γ(k) ≠ γ(k−1) do
3:   k := k + 1
4:   Find x^t_{is}(k) according to equation (5.5) for all t = 1,...,H, s = 1,...,S.
5:   Update α(k) and β(k), where δ(k) = 1/(a + bk), a, b > 0.
6:   Update γ(k) := c^{B(s,t)}_i.
7: end while
As mentioned, both the PH algorithm and the projected subgradient algorithm can be used to solve the convex optimization problem φ^B_i(λ) optimally. In section 6.2.1 we compare both algorithms on run time and solution quality.
5.5 Algorithm enhancement

To improve the speed of the algorithm, we consider two additional methods. Within the PH algorithm we use rounding, and within the subgradient method we use an additional optimization round in which we fix the binary variables; this improves the solution quality.
5.5.1 Rounding

Ordoudis et al. (2015) use rounding for their two-stage unit commitment problem: they fix the binary variables u^t_{is} in a scenario bundle B(s,t) for which the weighted average c^{B(s,t)}_i of the binary variables in that bundle is close to 0 or 1, via thresholds 1 ≥ 1 − θ > κ ≥ 0. Their method can easily be transformed to the multistage case:
\[
u^t_{is} =
\begin{cases}
1 & \text{if } c^{B(s,t)}_i \ge 1 - \theta, \\
0 & \text{if } c^{B(s,t)}_i \le \kappa.
\end{cases} \tag{5.6}
\]
This procedure is used in the PH algorithm for the generators: rounding is applied after the update of c^{B(s,t)}_i in the algorithm.
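The threshold rule (5.6) is easy to state in code; in this sketch (ours), variables whose bundle average falls strictly between the thresholds stay free and are returned as None:

```python
def round_commitment(c, theta=0.1, kappa=0.1):
    """Rounding rule (5.6): fix u to 1 when the bundle average c is within
    theta of 1, fix u to 0 when c is at most kappa, otherwise leave u free."""
    if c >= 1.0 - theta:
        return 1
    if c <= kappa:
        return 0
    return None
```

The threshold values θ = κ = 0.1 here are placeholders, not the settings used in the thesis experiments.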
The size of a bundle depends on the number of scenarios, which means that rounding will not always be possible for low values of θ. In our case, the scenario tree has 3 branches at every node; hence, for the leaf nodes, rounding will never be used for θ < 1/3. This is illustrated in Figure 5.1. Note that for t = 1 the commitment is assumed to be given; we then have to decide the commitment of the generators for the next period. Since this decision must be non-anticipative, all the variables are linked (see the dotted line) over the scenarios: for t = 2, for each i ∈ G it must hold that u^t_{i1} = u^t_{is'} for 2 ≤ s' ≤ 9. In this case the rounding rule of Ordoudis et al. (2015) is effective, since the bundle is large enough. However, for the leaf nodes, illustrated in Figure 5.1 by the ellipse, the bundle is not large enough to use rounding for small values of θ. To overcome this problem for the leaf nodes, we developed the following simple procedure: we fix the rounded variables and solve the model again with these fixed variables. If this gives a feasible solution, we return it; otherwise we return to the PH algorithm.
Figure 5.1: Example scenario tree.
5.5.2 Sub-optimization per iteration

While testing the algorithm, we noticed that during the iterations the subgradient algorithm struggles to find an allocation that exactly matches supply and demand. This is due to the binary values that represent whether a generator is online or offline, in combination with the update rule of the Lagrange multiplier. Once the value of λ^t_s is high enough in a certain scenario s and time period t, a generator i will start up and produce at least its minimum production level x_i^min. If this leads to excess supply, the multiplier is pushed down again and the generator may shut down in the next iteration, so that λ keeps oscillating (see Figure 5.2 for an example). In the sub-optimization round we therefore fix the binary variables and re-optimize the continuous production levels.
Figure 5.2: Example of the convergence of the Lagrange multiplier λ.
5.6 Final algorithm

Algorithm 6 Final algorithm
1: Initialize Lagrange multipliers λ with Algorithm 4.
2: Broadcast: initial λ(1) = λ_0, k = 1.
3: while DG(λ) ≥ ε and k < k_max do
4:   for i = 1,...,n (in parallel) do
5:     Gather: x^t_{is}(k) with the PH algorithm (Algorithms 2 and 3) for t = 1,...,H, s = 1,...,S
6:   end for
7:   Sub-optimization round (see section 5.5.2).
8:   Broadcast: λ^t_s(k+1) = max{λ^t_s(k) + α_s(k)(d^t_s − Σ_{i=1}^n x^t_{is}), 0}
9:   k = k + 1
10: end while
As mentioned earlier, Algorithm 6 can be executed in parallel: each facility calculates its optimal generation given a price λ. Since finding this optimal generation is still a very large optimization problem, we have decomposed it into single-facility single-scenario sub-problems. Given the price λ, each facility i can now solve its S sub-problems in parallel with the PH algorithm. This is represented graphically in Figure 5.3.
[Figure 5.3: the ISO broadcasts the price λ to the facilities i = 1, ..., 4, which return their decisions x_1, ..., x_4; internally, each facility solves its scenario sub-problems s = 1, ..., S in parallel, exchanging µ and c^{B(s,t)} against u, x.]
6 | Numerical Results

In this chapter the performance of the algorithm is evaluated. We first test the different parameters, and then use the resulting parameter settings to test the algorithm on a number of generated instances.
6.1 Simulation setup

6.1.1 Software and hardware

The simulations are performed using MATLAB R2015a. To solve the constrained quadratic programming problems, the external solver Gurobi Optimizer version 6.0.5 is used. Furthermore, YALMIP version R20160930 (cf. Löfberg (2004)) is used to model the problem in MATLAB. YALMIP is available as a free toolbox and functions as a modeling language for advanced modeling of optimization problems. An advantage of YALMIP is that the language is consistent with standard MATLAB syntax, which makes algorithm development easier for someone familiar with MATLAB.
The simulations are performed on a HP Compaq dc7800 PC desktop with an Intel(R) Core(TM)2 Duo CPU 3.00 GHz processor and 4.00 GB of installed memory (RAM). The PC has a 64-bit operating system.
6.1.2 Input parameters

We consider a power grid with up to 10 storage facilities and up to 10 generators. We represent the demand in each period as a random variable that follows the normal distribution N(µ_t, σ_t), with mean µ_t and standard deviation σ_t, for t = 1,...,H. Here we assume that µ_t represents the forecast of the net demand. A typical daily electricity demand is characterized by two peak periods, one in the morning and one in the evening, and two low-demand periods, midday and at night. For simplicity we assume that the average demand is characterized by a sine function; see Figure 6.1 for an example. To create a scenario tree, we assume that the demand can take three possible values, µ_t − σ_t, µ_t, and µ_t + σ_t, at each time stage t, where µ_t depends on the demand in the previous time period. See Figure 6.2 for an example of the demand scenario tree. We cannot consider a fully branching scenario tree for 24 time periods due to computational limits; therefore we create our demand scenarios such that after a certain number of periods the demand is the same for every scenario. This is also illustrated in Figure 6.2.
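The demand scenario construction described above — a sinusoidal mean, three branches µ_t − σ_t, µ_t, µ_t + σ_t per branching period, and no further branching after a cut-off period — can be sketched as follows. This is our simplification (the thesis lets µ_t depend on the previous realization; here the mean depends only on t), and all parameter values are illustrative:

```python
import math

def demand_scenarios(H=24, branching_periods=5, base=50.0, amp=30.0, sigma=5.0):
    """Enumerate scenario demand paths. For the first `branching_periods`
    stages each path branches into mean - sigma, mean, mean + sigma around a
    daily sine pattern; afterwards every path follows the sine forecast, so
    all scenarios coincide from that point on."""
    paths = [[]]
    for t in range(H):
        mu_t = base + amp * math.sin(2 * math.pi * t / H)
        if t < branching_periods:
            paths = [p + [mu_t + k * sigma] for p in paths for k in (-1, 0, 1)]
        else:
            paths = [p + [mu_t] for p in paths]
    return paths
```

With 5 branching periods this yields 3^5 demand paths of length 24 that all agree after period 5, which keeps the deterministic equivalent tractable.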
The cost and capacity parameters are given in Table 6.1 and Table 6.2. The cost coefficients are fictitious but the proportions are not unrealistic (cf. Klein Haneveld and van der Vlerk (2001), Dörfler and Grammatico (2017), Yang et al. (2016)).
[Figure: two panels plotting generation against time, t = 0,...,24.]

Figure 6.3: Numerical example. (a) Average demand; (b) demand scenario tree for S = 3^5.
Table 6.1: Parameters per generator

Generator   1      2      3      4      5       6       7      8       9       10
q_i         0.01   0.04   0.02   0.015  0.0814  0.0905  0.127  0.0913  0.0632  0.0975
c_i         15     15     15     15     15      15      15     15      15      15
x_i^min     10     5      2      7      6       0       8      4       3       6
x_i^max     60     50     25     75     65      4       85     93      68      76
Table 6.2: Parameters per storage facility

Storage facility   1       2       3       4       5       6       7       8       9       10
q_i                0.005   0.040   0.010   0.097   0.095   0.048   0.800   0.014   0.042   0.0915
E_i^min            0       0       0       0       0       0       0       0       0       0
6.1.3 Performance measure

The most important performance measure is the total cost of a solution. We can compare costs on two levels: the duality gap, which indicates how close we are to the optimal solution, and the difference between the costs of the stochastic problem and those of the deterministic problem in which only the average net demand is used as input. Birge and Louveaux (1997) call the latter the Expected Value (EV) solution. The expected result of the EV solution is called the Expectation of the EV (EEV) solution. The Value of the Stochastic Solution (VSS) is defined as the difference between the EEV solution and the Stochastic Problem (SP) solution, that is, VSS = EEV − SP (Birge and Louveaux, 1997). To recall,
\[
SP = \min_{x,u,w} \Big\{ \sum_{s=1}^{S} p_s \sum_{i=1}^{n} \sum_{t=1}^{H} \big[ f_i(x^t_{is}) + g_i(w^t_{is}) \big] : \sum_{i=1}^{n} x^t_{is} \ge d^t_s,\; \{x,u\} \in \mathcal{N},\; \{x_s,u_s,w_s\} \in K_s,\; t = 1,\dots,H,\; s = 1,\dots,S \Big\}. \tag{6.1}
\]
If µ^t = E[ξ_t] and \bar{d}^t = max_{s=1,...,S} d^t_s, the expected value problem is given by
\[
EV = \min_{x,u,w} \Big\{ \sum_{i=1}^{n} \sum_{t=1}^{H} \big[ f_i(x^t_i) + g_i(w^t_i) \big] : \sum_{i=1}^{n} x^t_i \ge \mu^t,\; \sum_{i=1}^{n} u^t_i x_i^{\max} \ge \bar{d}^t,\; t = 1,\dots,H,\; \{x,u,w\} \in K \Big\}. \tag{6.2}
\]
If we let \bar{u} be the optimal commitment of the expected value problem, then
\[
EEV = \min_{x,w} \Big\{ \sum_{s=1}^{S} p_s \sum_{i=1}^{n} \sum_{t=1}^{H} \big[ f_i(x^t_{is}) + g_i(w^t_{is}) \big] : \sum_{i=1}^{n} x^t_{is} \ge d^t_s,\; \{x_s,\bar{u}_s,w_s\} \in K_s,\; t = 1,\dots,H,\; s = 1,\dots,S \Big\}. \tag{6.3}
\]
Second, besides cost, the computation time of the algorithm is an important performance measure. Since run times must be practical, we also address the speed of the algorithm. Most likely there is a trade-off between solution quality in terms of cost and running time.
6.1.4 Experiments
As mentioned in the previous chapter, there are various unknown parameters that influence the performance of the algorithm. We perform numerical experiments to assess the influence of each of them:
• Initialization of the Lagrange multipliers (Section 6.2.1)
• Step lengths and Lagrange multiplier updates (Section 6.2.2)
• Stopping criteria (Section 6.2.3)
• Rounding (Section 6.2.4)
• Parallel implementation (Section 6.2.5)
• Number of scenarios (Section 6.2.6)
6.2 Parameter testing
6.2.1 Initialization Lagrange multipliers
If we consider generators only, the initialization of the algorithm is fast, since the generation is determined by the simple function (5.2). We only need to repeat this for a number of iterations and update the Lagrange multiplier. The duration of this initialization is given in Table 6.4 for different numbers of scenarios.

When storage devices are included in the model, the initialization requires more time. We consider two ways to find the initial Lagrange multipliers in this case. The most obvious one is to use the PH algorithm as described in Chapter 5. In Chapter 5 we also describe an alternative, which we call the Projected Subgradient (PS) method. In this section we compare the solutions and running times of both methods. In our numerical tests both methods result in the same initial value of the Lagrange multiplier λ, up to a small numerical difference of at most 10^{-4}. This is to be expected, since both methods theoretically converge to the optimal solution; the small numerical difference is due to the stopping criteria. Table 6.3 shows the average duration of the initialization over 10 runs for different numbers of scenarios. We see from Table 6.3 that, except for the smallest instance (S = 3), the PS method is much faster than the PH algorithm.
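To sketch the PS idea, the following is a generic projected subgradient ascent on the dual function of a one-generator toy problem. The step rule a/(k+1), the example, and all numbers are assumptions for illustration only, not the exact method described in Chapter 5:

```python
import numpy as np

# One generator with cost q x^2 + c x, output in [0, xmax], demand d.
# Dual function: theta(lam) = min_x { q x^2 + c x + lam (d - x) };
# a subgradient of theta at lam is d - x(lam).
q, c, xmax, d = 0.01, 15.0, 60.0, 40.0

def x_of(lam):
    # inner minimizer of the Lagrangian: x(lam) = clip((lam - c)/(2q), 0, xmax)
    return float(np.clip((lam - c) / (2*q), 0.0, xmax))

lam, a = 0.0, 1.0                        # initial multiplier, step-length parameter a
for k in range(5000):
    g = d - x_of(lam)                    # subgradient: demand-constraint violation
    lam = max(0.0, lam + (a/(k+1))*g)    # diminishing step, project onto lam >= 0

# at the dual optimum the dispatch meets demand exactly: lam* = c + 2 q d = 15.8
```

The projection max(0, ·) keeps the multiplier dual-feasible, and the diminishing step lengths a/(k+1) satisfy the standard convergence conditions for subgradient methods (not summable, but square-summable).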
Table 6.3: Average duration of the initialization of the Lagrange multipliers

S        3     9     27     81     243     729
PS (s)   2.64  4.03  8.69   18.29  58.47   205.27
PH (s)   0.87  4.11  20.01  89.00  283.73  1591.14

Table 6.4: Average duration of the initialization of the Lagrange multipliers, generators only

S             3     9     27    81    243   729   2187  6561  19683  59049
Duration (s)  0.50  0.52  0.55  0.60  0.72  1.10  2.13  6.42  18.82  56.17
6.2.2 Step-lengths and Lagrangian multiplier updates
[Plot: objective value against iterations for step lengths a = 10, 30, 100, 300, 500.]
Figure 6.4: Example of the convergence of the Lagrange multiplier for different step lengths a.
[Second plot: objective value against iterations for step lengths a = 10, 30, 100, 300, 500.]