
The effect of modeling choices on decision making regarding the optimal planning of distributed generation

Master’s Thesis Econometrics, Operations Research and Actuarial Studies

Supervisors: Dr. ir. D. Catanzaro (RuG), Drs. F. Phillipson RTD (TNO), M.H. Schreuder (TNO)

Lieke Kools
s1771817

January 29, 2014


Summary

Recently much research has been directed towards the optimal planning of distributed generation. For this research it is common to use coarse data as input to the optimization models. This data does not reflect reality well. We investigate whether using more fine grained data leads to a different optimal planning of distributed generation.

Most models that are currently used for the optimal planning of distributed generation include many technical details on the system. These models become so complicated that they do not allow for fine grained data. Thus, in order to say something about the effect of the granularity of data on the solutions we have to take a more abstract approach to the problem. The variations in demand and supply of electricity on a short time scale are hard to predict and therefore best described by stochastic input.

For our evaluations we start off using a deterministic model developed in a master thesis written at the University of Groningen and the Netherlands Organization for Applied Scientific Research. This model can be used to find the optimal installed capacity levels of different types of distributed generation within a residential area. When using stochastic input this model is equivalent to a two-stage recourse model. We simplify this model greatly to be able to incorporate stochastic fine grained data. The most essential simplification is that we take the second stage decisions outside of the optimization model. By doing this, we are left with a quite simple model in which we minimize an expected value.

In order to find a solution to our model we use a slightly adapted version of Stochastic Decomposition. We notice that the original algorithm performs poorly in terms of convergence and stability due to the flatness of our objective function. We therefore choose to adapt the updating of cuts in the algorithm which turns out to work very well for the problem at hand.

For our experiments we consider an average Dutch neighborhood where each household has the possibility to install a micro Combined Heat and Power system and photovoltaic solar cells. We generate stochastic demand and supply profiles in two different ways. For the first we use information on the average hourly demand and supply and make assumptions about the random intra-hour distributions. For the second we generate fine grained data using an open source simulator.

The results from the experiments indicate that including stochasticity in the model has a great impact on the optimal decisions. We therefore advise to use stochastic optimization models when determining the optimal planning of distributed generation. In this context the definition of supply has a greater impact on the optimal solution than the definition of demand. We noticed that current stochastic definitions for electricity demand and supply are either inaccurate or very complex and thus advise to investigate these definitions further.


Preface

With this thesis I conclude not only a master's in econometrics, operations research and actuarial studies at the University of Groningen, but also a graduation internship at TNO. Writing this thesis was an incredibly valuable experience. TNO provided me with time to learn all I wanted to know about stochastic optimization and the changing electricity sector, and with space to shape my own research, make mistakes and discover how to recover from them. I am very grateful for the guidance I received along the way from my supervisors. Frank patiently guided me in understanding the problem and showed me how to structure my work. Max provided me with new perspectives on the electricity sector and tips on how to improve my writing. Daniele bore with me on the many side paths I took and set boundaries such that at some point I could converge. I am very happy Maarten stepped in early as a co-assessor to give me some extra feedback on the contents. This made everything fall into place and turned out to be the start of finishing the project.


Contents

1 Introduction

2 Literature review
2.1 Important factors for minimizing losses
2.2 Alternative objectives
2.3 Modeling demand and supply
2.4 Stochastic models of the electricity grid

3 Mathematical model
3.1 Problem formulation and notation
3.2 Objective
3.3 Constraints
3.4 Properties of the model
3.5 Complexity of the model
3.6 Simplifying the second stage

4 Solution method
4.1 Optimization algorithms based on sampling
4.2 An adaption of Stochastic Decomposition
4.3 Obtaining the set of near optimal solutions

5 Data
5.1 Top-down demand and supply profiles
5.2 Bottom-up demand and supply profiles
5.3 Values of the parameters

6 Results
6.1 Computational performances
6.2 The influence of stochasticity
6.3 The influence of granularity
6.4 Sensitivity analysis
6.5 A linear approximation of transport losses
6.6 Using data from a simulator

1 Introduction

Sustainable energy is a hot topic in politics. The Dutch government aims for 14% of the energy in the Netherlands to be produced in a sustainable fashion by 2020 and even heads for a completely sustainable provision of energy by 2050.¹ In order to reach this goal a lot of changes need to be made in the current energy sector. The production of electricity is expected to be leading in this transition [10].

Traditionally electricity is generated at several large power plants, for example ones fired by coal or gas. These power plants are connected to a transmission network that transports the electricity from the power plants to distribution networks all over the country, such that it eventually arrives at the end consumers. In this system the power plants are controlled such that they supply exactly the amount of electricity the consumers demand at any moment. On top of that, storage is used to balance loads. However, as consumers nowadays also start to generate electricity, this system must change from being one-directional (electricity flows from power plant to consumer) to a system in which power flows in two directions, see Figure 1.

The idea of people consuming the electricity they have produced themselves seems promising. For the consumers it is an opportunity to actively contribute to a sustainable environment and save money on their electricity bill. For distribution system operators it is a means for saving money as electricity gets produced close to where it is needed. In this way electricity does not have to travel far and thus little of it is lost on the way. In the end this can be beneficial for transmission system operators as the need for transmission capacity may be reduced and with that the need for costly capacity expansion.

However, it is not as simple as it sounds. The current grid is not built for distributed generation, such that many investments need to be made to transform it. Also, distributed generation is often based on renewable resources. The amount of output these sources generate can vary a lot over time, such that electricity is not always provided at the time needed. If for example many households in one neighborhood own solar panels, on a sunny day a lot of electricity may be produced in this neighborhood while there is only little demand. This excess of electricity then has to be fed back on the distribution grid and transported to other consumers. The Netherlands Organization for Applied Scientific Research (TNO) investigates how medium-sized electricity consumers can be utilized to consume part of such excess supply. This could be established by offering such consumers flexible electricity prices.²

A term that is often used in this context is smart grids. These are electricity grids that use advanced ICT to optimally control the grid and facilitate coordination of the energy market [1]. Research in this area focuses for example on possibilities for storage, such that produced electricity in excess of demand can be used at a later time. Also smart meters that can track demand and supply more precisely have been introduced. Combining this with the development of coordination mechanisms, such as the PowerMatcher [23], allows for optimal control of small electricity consuming and producing units, making way for the integration of more distributed generation in the system.

Accurate information on electricity demand and supply is needed to assist research on the transformation of the electricity system. This data has mostly been provided in fifteen minute or hourly averages. However, research shows that such averages do not capture the nature of demand and supply well, such that the proportions of over- as well as underproduction of electricity are underestimated [45]. We wonder if this also affects the smart grid. For example, does

¹ http://www.rijksoverheid.nl/onderwerpen/duurzame-energie/meer-duurzame-energie-in-de-toekomst
² https://www.tno.nl/content.cfm?context=overtno&content=persbericht&laag1=37&item_id=


Figure 1: A schematic representation of the changes in the electricity system, from [36].

this mean we should install more storage and bigger cables? And what happens to the optimal mix of distributed generators in a neighborhood? In this thesis we will focus on the latter, our research question being:

What is the effect of modeling demand and supply for electricity on a short time scale with respect to the optimal planning of distributed generation?

We will focus on finding a way of planning that is optimal in the sense that it minimizes grid losses, to continue on work that has already been done by students from the University of Groningen [9] and the University of Amsterdam [26] for TNO. Grid losses are losses encountered when transporting electricity from the producers to the consumers. In the Netherlands, an amount of electricity worth 300 million euros is lost on a yearly basis. In the United States these losses even amount to 26 billion dollars [41], showing us that these losses are quite substantial.

We do not have any data on the variations of demand and supply on a time scale smaller than fifteen minutes. Even if we had such data, we could not simply describe the variations on small timescales using an average daily profile, as they depend on uncertain weather conditions and unpredictable usage of household appliances. In order to incorporate information on a fine granularity we will thus work with models that allow for uncertainty in the parameters. Most existing models in the context of electricity do not incorporate such uncertainties, hence before we can answer our main question we should find answers to the following supporting research questions:


• How should we adapt current models for optimal planning of distributed generation to include stochasticity?

• What is the effect of modeling demand and supply for electricity using random distributions with respect to the optimal planning of distributed generation?

Only recently researchers have started to apply stochastic models in the context of electricity. With this thesis we hope to contribute to this branch of research. Furthermore we aim to provide a better understanding of the required granularity of data on electricity demand and supply for optimal distributed generation planning. If the results in this thesis indicate that variations on a smaller time scale can have a substantial impact on decision making, this motivates the need to acquire actual fine grained data.

The rest of this thesis is outlined as follows. In Section 2 we discuss the existing literature on the integration of renewable resources. In Section 3 we present a mathematical formulation of the problem at hand. In Section 4 we explain our approach to solving the model. In Section 5 we present the data we use for our empirical evaluations, of which the results can be found in Section 6. We conclude with our recommendations and suggestions for further research in Section 7.

2 Literature review

Already a lot has been written about the optimal integration of distributed generators (DGs). For example Tan et al. [40] give a nice overview of methods for optimal distributed renewable generation planning. The proposed models differ greatly in aim, scope and complexity. When we speak of distributed generators this includes any kind of generator that can be placed close to the consumer. In this sense we do not limit ourselves to renewable energy sources, but also consider storage devices and small scale conventional generation.

2.1 Important factors for minimizing losses

When evaluating the effect of distributed generation on losses three things are important: (1) the capacity of the generators, (2) the type of the generators and (3) the location of the generators. Location is a somewhat ambiguous term here, since it could mean both the physical location within a given network and the logical location in the system of electricity distribution, that is, whether generators should be located at the medium voltage or low voltage level. It should be made clear that when we speak of location we are referring to the physical location unless stated otherwise.

Quezada et al. [33] show that increasing the capacity of generators decreases losses at first (1). But, as we keep including more generators, at some point losses start to increase again. In the same article it is shown that of the different types of DGs wind turbines have the least positive impact on losses (2) and that DGs have a greater impact on losses when their locations are more dispersed (3). In order to reach a setting where losses are minimized all three aspects should be taken into account. However, since looking at the aspects separately can already be very complicated, research often does not include more than two of them simultaneously.


pumps and electric vehicles get introduced and concludes that the current grid does not have the capacity to handle that.

Another approach is to look at the optimal size (1) as well as the placement (3) of the DGs, without differing between types (2). An example is the analytical method derived by Gözel and Hocaoglu [12]. The writers assume one is given the layout of a specific distribution system. For every bus in that system they find the derivative of the total power losses with respect to the real power injected at that bus, i.e. the size of the DG, based on some technical relations. If we want to find the optimal size for the DG at a certain bus, we naturally set this derivative equal to zero. Given this information and using an iterative procedure it can then be determined at which buses these DGs should be added.

Since investment in DGs is most often done by consumers and not by the network operators, the optimal planning of DGs cannot simply be implemented in practice. However, it can serve as a guideline, such that incentives can be given to the consumers to invest in the right way. Soroudi [38] takes into account that the decisions on investment in DGs are out of the hands of network operators. He models demand and wind generation probabilistically and includes uncertainty about the investment in DGs in a possibilistic way. Given this information he looks at losses as well as the voltage profile. Next to the capacity, location and type of DG, he adds the timing of the investment as a fourth important factor for minimizing losses. The model should help network operators anticipate possible investment decisions by consumers.

2.2 Alternative objectives

We explained above that we aim to minimize losses by integrating DGs in an optimal way. The choice of this objective reflects the idea that DGs can be used to make the system of electricity supply more efficient. Electricity can be produced close to where it is consumed, such that little power is lost in transmission and distribution. However, distributed generation can also be motivated from other points of view, which are reflected using alternative objectives.

A natural objective would be to minimize costs, since often distributed generation is promoted as a means of cost saving to consumers. If we aim to minimize costs at the consumer level, we are not that much concerned with placement or capacity of generators, but more so with how the generator should be operated after installation. In this context Jesudasan et al. [17] look at optimal storage policies, such that excess electricity is stored at times when outside supply is cheap and can be used or sold at times when supply is more costly. El-Khattam et al. [11] assume that not the consumers, but the distribution companies will invest in distributed generators. These companies have the task to serve the electricity demand of their customers by purchasing electricity for them. They buy electricity either at some prespecified amount from generation companies or directly on the market. By investing in DGs the distribution company may save on the costs for purchasing electricity, since it can now serve its customers using the electricity produced by these DGs. The writers build a heuristic that minimizes total costs, which includes not only investment costs, but also the cost of purchasing power on the market, operation costs, and more.


Figure 2: Electricity losses within the power and energy framework, from [31].

2.3 Modeling demand and supply

Just as the perspective on electricity generation has changed over the years, the view on how demand and supply of electricity should be modeled has changed tremendously. Traditionally demand was seen as given and supply was scheduled such that it would meet demand. When it comes to sizing the grid, the only aspect of load deemed relevant was the peak demand. This was simply given by a fixed parameter on which sizing rules could be established. However, with the introduction of renewable sources and now the smart grid, more complicated models for demand as well as supply get employed.

Ochoa and Harrison [31] explain that the approaches for minimizing losses can be divided into power and energy approaches. For the power approach we take a snapshot of the system at one point in time, such that demand and supply can be modeled as a constant. When using the energy approach on the other hand we consider the system over a period of time, asking for time-varying input. The first approach is often used for models focusing on very technical details related to losses, whereas the latter is used for models that look at the grid from a more abstract level. The authors show that using the power approach results in high levels of recommended capacity and low energy losses. However, when demand is modeled as time-varying the optimal capacity level is much lower, just as the calculated losses. When supply is also taken to be time-varying the effect of DG on losses is overall less distinct and losses turn out to be much higher, see Figure 2. Ochoa and Harrison include time-variability by using a deterministic supply profile; however, it can also be modeled using a stochastic process such that uncertainties can be included in the model, just as Soroudi [38] does.


the activity profiles. A bottom-up approach for supply could be based on weather conditions, such as a recently derived model for photovoltaic (PV) output by Richardson and Thomson [34]. This model is based on the degree of cloud coverage over the day, which can be modeled as a Markov chain. The authors also define a base curve for clear sky irradiance levels, which depends on for example the distance of the earth to the sun and the angle of the earth’s rotation, such that it is different for every day. They combine these two elements and a factor for the efficiency and size of the panels to derive stochastic PV output.

Bottom-up type models often ask for a lot of input; in the above demand model we for example need information on the usage of every single appliance, while we may sometimes prefer a simple distribution to describe load. These distributions can be retrieved using a top-down type model, such as the demand model described in Ardakanian et al. [3]. Although the authors were inspired by the adding of appliances, in the model only the time-variation in total demand is considered. They model electricity demand as a continuous time Markov chain as on the one hand this is a natural way to model the addition of on and off loads and on the other hand these types of models are easy to use for mathematical analysis. They group households into four categories depending on for example the size of their house and fit separate Markov models for each type. Also, they divide the day into sections of high, medium, and low demand and fit separate models for each of these.

For an example of a top-down type supply model we can look at Brokish and Kirtley [7], who investigate the use of Markov chains to model wind power. They explain that knowing the wind power generated at this moment we can easily predict the wind power in the next period, as this will usually be either the same or a little bit higher or lower. This behavior can be easily modeled using Markov chains. However, the writers emphasize that Markov chains are not always suitable for modeling wind. Especially when using time steps shorter than fifteen minutes, the model cannot sufficiently capture the dependence of current wind power on the wind power from some hours ago. The authors use their models to estimate the capacity of storage needed to provide an islanded micro grid with just wind power. They find the necessary capacity is underestimated no matter which time scale they model on.

Although top-down models for demand and supply are easy to use, they cannot capture all characteristics of load. An important factor that is difficult to model is the nonstationarity of the processes. In the method of Ardakanian et al. [3] this for example had to be solved by modeling different processes for different points of the day.

One concern with modeling demand and supply is that in general the time increments considered are fairly large. As Wright and Firth [45] point out, information on electricity loads is usually given on time intervals of thirty minutes in the UK, although a fifteen minute resolution is also common in some countries. They find that such timescales do not sufficiently capture variations in demand. A situation is considered in which a household generates its own supply at a constant rate. Demand data is retrieved as one-minute averages, but also used to create data averaged over larger intervals. Given this information the authors measure the proportion of electricity that is imported or exported and find that both export and import proportions are underestimated when using intervals larger than a minute. However, the effect is less prominent for export than for import proportions.
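To make this averaging effect concrete, the sketch below constructs a synthetic one-minute net load for a single household with constant on-site generation and measures imported and exported energy at several averaging intervals; the demand pattern, the generation level and all numbers are illustrative assumptions, not the measurements from [45].

```python
# Illustrative sketch (synthetic data): averaging the net load of a household with
# constant on-site generation over longer intervals shrinks the measured import and
# export volumes.
import numpy as np

rng = np.random.default_rng(0)

minutes = 24 * 60
base = 0.35 + 0.25 * np.sin(np.linspace(0, 2 * np.pi, minutes))  # slow daily pattern (kW)
spikes = rng.choice([0.0, 1.5], size=minutes, p=[0.97, 0.03])    # appliance switching (kW)
demand = base + spikes                                           # one-minute demand (kW)
generation = 0.6                                                 # constant on-site supply (kW)
net = demand - generation                                        # >0 means import, <0 export

def import_export(net_load, block):
    """Average net load over `block`-minute intervals, then total imports/exports in kWh."""
    avg = net_load.reshape(-1, block).mean(axis=1)
    imported = avg[avg > 0].sum() * block / 60
    exported = -avg[avg < 0].sum() * block / 60
    return imported, exported

for block in (1, 15, 30):
    imp, exp = import_export(net, block)
    print(f"{block:2d}-minute averages: import {imp:5.2f} kWh, export {exp:5.2f} kWh")
```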

Similarly, large-scale averages do not capture the variability in supply. As Von Meier [42] points out, output from wind turbines and solar PV panels can vary at the time scale of minutes or even seconds, see also Figure 3. In case of wind this is related to the variations in wind speed, for solar PV output cloud coverage is of great impact.


Figure 3: Time scales in electric grid operations, from [42].

approach it is assumed that within an hour the behavior of demand or supply can be defined by a single known distribution. This knowledge can be combined with information on hourly averages over a day to obtain a complete stochastic profile of demand or supply. Intra-hour demand can for example be modeled using a gamma distribution [8, 28], solar PV output using a Markov process [27] and wind power using a skew-Laplace distribution [29].
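As a minimal illustration of such a top-down construction, the sketch below draws minute-level demand from a gamma distribution whose mean matches a given hourly average; the hourly averages and the shape parameter are assumed, illustrative values, not those used in [8, 28] or later in this thesis.

```python
# Minimal sketch of a top-down intra-hour profile: minute-level demand is drawn from a
# gamma distribution whose mean matches a given hourly average.
import numpy as np

rng = np.random.default_rng(1)

hourly_avg_kw = [0.3, 0.25, 0.2, 0.2, 0.25, 0.4, 0.8, 1.1]  # example hourly means (kW)
shape = 2.0                                                 # assumed gamma shape parameter

minute_profile = np.concatenate([
    rng.gamma(shape, mean / shape, size=60)                 # gamma mean = shape * scale
    for mean in hourly_avg_kw
])

print(f"{minute_profile.size} minute values, overall mean {minute_profile.mean():.2f} kW")
```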

2.4 Stochastic models of the electricity grid

As explained above, many models regarding optimal planning of DGs do not incorporate the uncertainties surrounding supply and demand. We believe that especially when considering renewable sources it is important to consider uncertainties and hence discuss some research where uncertainties are included.

In his master thesis Leenman [26] puts the optimal planning of DGs in the context of Stochastic Programming, a field of research concerned with including stochastic parameters in optimization models. He investigates the optimal placement of windmills at the mid voltage grid. For this he takes the 'snapshot' power approach to the problem, in which we consider the state of the system at a single point in time. The current load levels are taken as fixed, but the output levels of the windmills to install are defined using a random distribution. He develops a two-stage recourse model in which the first stage determines the optimal placement decision and the second stage determines the optimal power flow given a chosen placement. The main drawback of his model is that it easily gets too complex, asking for a heuristic approach to solving it. Furthermore, in order to include storage in our model, it is necessary to take an energy approach to the problem.


very complex models, which become too computationally expensive.

To allow for more flexible models, Hutterer et al. [16] suggest to adopt a simulation-based optimization approach. By using a simulation tool one can evaluate a given performance measure without the need to define closed form mathematical expressions for all aspects of the model. Simulation is usually used to evaluate a given setting or, by running it multiple times, to compare a couple of settings to each other. However, when we combine the simulation model with a metaheuristic we can actually compare many different settings and search for the best of these settings with respect to the chosen performance measure. The metaheuristic drives the search for new solutions and the objective values of the given solutions are evaluated using the simulation model. This method is very flexible with respect to the aspects one can include in the model. Also, using multiple performance measures it is quite straightforward to define and solve a multi-objective optimization model in this way. The flipside to this flexibility is that performing simulations can result in a great computational burden.

Many stochastic models aimed at optimally planning DGs look at the reliability of such sources, rather than their effect on losses. Although this is a different viewpoint than ours, it is interesting to look at the techniques used in this field. In this sense the grid is often compared to a communication network, which is designed to provide a certain quality of service. The level of service may be influenced in two ways: on the one hand by choosing the appropriate capacity of the server, on the other hand by controlling how the traffic enters the network. Controlling can be done by delaying arrivals using buffers or by blocking traffic to keep bandwidth available for more important traffic that might still come in. For the electricity grid, similarly, the capacity of renewable resources is chosen and their output is controlled using storage. Or, if we look at it from another viewpoint, demand levels are decreased by encouraging more energy efficient appliances and demand is controlled using price incentives. This two-way approach to capacity and control makes the latter system much more complicated, see Figure 4.

An example of such research is that by Wang et al. [44], which aims to jointly size buffers and resources in an islanded grid. They model demand as well as the different sources for supply using stochastic arrival curves. Furthermore they assume fully efficient storage, such that they can use theory on loss queues and results from stochastic network calculus to find an upper bound on the loss of power supply as well as the waste of power supply. Based on the first, the fraction of time that energy demand cannot be met is derived. Using trade off curves they evaluate this metric for different capacities of storage and generation. Similarly, Ardakanian et al. [4, 5] show that determining the probability of overloading a server is equivalent to finding the probability that a demand cannot be served from storage. They do not use this to evaluate the capacity of renewable sources, but that of transformers and buffers.

Another piece of research that is interesting to mention in this context is that of Jiang et al. [18]. They look into the capacity value of variable generators. The maximum possible output may not be very informative in such cases, so they distinguish two other types of capacity metrics: Guaranteed Capacity and Best-Effort Capacity. The first provides the output level that a source will at least attain at any point in time. The second indicates to what extent the resource can be used to serve demand, so its value depends on the variations in supply compared to the variations in demand. The authors argue that the first may underestimate the usefulness of certain generators. For example, it does not matter that PV panels do not generate anything at night, as long as their variations during the day closely match demand.



Figure 4: The use of capacity and control in communication and electricity networks.

robustness, and accuracy. We would like to see if we can find a way to get around some of these problems. The research described regarding reliability in the power grid seems like a great source of inspiration. Here elegant and seemingly simple solutions are provided for the optimal planning of DGs. Unfortunately, ideas from this stream of research cannot easily be applied to problems related to losses. For one, it may be quite straightforward to calculate the probabilities of having storage overload or underload, but finding the expected amounts of overload is much more involved. Moreover, although it is possible to incorporate storage and its inefficiencies in the model, it is hard to derive the expected amount of losses resulting from these inefficiencies. We thus choose to take on yet another approach.

3 Mathematical model

3.1 Problem formulation and notation

We derive a mathematical model to find the optimal mix of distributed generators such that real energy losses are minimized. For this purpose we build upon the model defined by Croes [9]. She considers a residential area where each household can install one or more types of DG. The capacity of the separate DG units is fixed. The model is defined such that demand equals supply at all times. When necessary the neighborhood can export or import electricity from outside. Also the neighborhood has access to a storage unit that can be used to shift supply.

As explained in Section 2, there are many different aspects we could take into account in our model. We focus on the capacity and the type of generators needed and do not consider the physical placements of the generators. To be able to deal with this last aspect many technical details would have to be included in the model, which complicates the model tremendously. As we want to analyze the system over a period of time rather than at a fixed point in time, such models are not suitable for our research. Besides that, since we are considering a residential area we assume that the investment in the generators is done by the consumers. Although there may be ways to use incentives to let a certain percentage of households acquire generators, pinpointing the investments to specific consumers is not feasible.


be known exactly in advance. From the perspective of mathematical models, there are, broadly speaking, three ways in which we can deal with such uncertainty. Most often the choice is made to ignore the uncertainty altogether. In that case an estimate of the parameter is made and taken as fixed. This makes the model much easier; however, often this is too crude. Another approach is to acknowledge that we do not know the exact value of the parameter and specify a range which the parameter can fall in. These kinds of models are dealt with in the area of robust optimization. In this thesis we will follow yet another approach, namely we assume that the parameters can be represented by random variables with a known joint distribution, as is done in the field of Stochastic Programming. The lecture notes by Klein Haneveld and Van der Vlerk [21] provide a useful guide to Stochastic Programming. The expressions, definitions and theorems used here are as defined in this text.

In Table 1 we introduce the notation used throughout this thesis.

Indices
i = 1, …, N_DG   Type of the distributed generator
t = 1, …, T      Time period considered

Variables
d_t     Demand in period t
s_it    Supply of one generator of type i in period t
s_t     Total supply in period t
r_t     Resistive losses in period t
c_t     Conversion losses in period t
b_t     Storage level at the end of period t
w_t     Exported electricity in period t, decision variable
g_t     Imported electricity in period t, decision variable
l_t     Charging amount in period t, decision variable
u_t     Discharging amount in period t, decision variable
x_i     Installed capacity for generator of type i, decision variable

Parameters
N_h     Number of houses in the district
b_max   Capacity of the storage unit
α_l     Charging efficiency
α_u     Discharging efficiency
β_w     Export efficiency (quadratic)
β_g     Import efficiency (quadratic)
γ_w     Export efficiency (linear)
γ_g     Import efficiency (linear)

Table 1: Overview of notation

3.2 Objective

Our objective is to minimize real energy losses. These losses can result from storage or transport. In order to store electricity it needs to be converted to another type of energy, depending on the type of storage used. This process is usually not efficient, such that not all of the energy that we plan to store actually gets charged onto the storage unit. Similarly, not all of the energy that we discharge from the storage unit can be used as electricity by the consumer. Let α_l ∈ (0, 1] be the fraction of the energy that we plan to charge that actually gets charged onto the storage unit, and α_u ∈ (0, 1] the fraction of discharged energy that reaches the consumer. The amount of electricity lost due to conversion during period t, c_t, is then given by

    c_t = (1 − α_l) l_t + ((1 − α_u)/α_u) u_t,    t = 1, …, T.    (1)

When exporting or importing electricity from outside the neighborhood we lose electricity in transport. Basically when pushing the electricity over a transmission line this line heats up, so some of the electricity is transformed into heat. This process is called resistive heating. The losses incurred in this way are therefore also known as resistive losses. By Joule’s first law we know that the generated heat is a quadratic function of the current passed through. To be precise

    Q = I² R t,

where Q is the generated heat for a period of time t, R is the resistance of the line and I is the electric current. Furthermore we know that electric power P is defined as

P = IV,

with V the voltage level. So for a given voltage level, transmitting more electrical power means passing through a higher electrical current. Then keeping the resistance fixed the resistive losses will be a quadratic function of transmitted power. So the resistive losses during period t, r_t, are given by

    r_t = β_w w_t² + β_g g_t²,    t = 1, …, T,    (2)

with β_w ∈ [0, ∞) and β_g ∈ [0, ∞) coefficients to be estimated. As we would prefer to have a linear objective we could also use a linearization of transport losses as suggested by Croes [9]. Let γ_g ∈ (0, 1] denote the fraction of generated energy that reaches the consumers and let γ_w ∈ (0, 1] be the fraction of exported energy that gets to its destination. Then equation (2) can be rewritten to

    r_t = (1 − γ_w) w_t + ((1 − γ_g)/γ_g) g_t,    t = 1, …, T.

Although this linearization makes the mathematical model easier to handle, it may have a big impact on the results, especially when the fluctuations in excess supply are very big.

We can now formulate our objective as

    min_{x_i, l_t, u_t, g_t, w_t}  E[ Σ_{t=1}^{T} (c_t + r_t) ].    (3)
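For concreteness, the following sketch evaluates the per-period losses of equations (1)-(3) for given decisions; the efficiency and loss coefficients are placeholders, not the parameter values used in Section 5.

```python
# Sketch of the per-period losses in equations (1)-(3) for given decisions.
import numpy as np

alpha_l, alpha_u = 0.9, 0.9     # charging / discharging efficiency (assumed)
beta_w, beta_g = 0.02, 0.02     # quadratic export / import loss coefficients (assumed)
gamma_w, gamma_g = 0.95, 0.95   # linear export / import efficiencies (assumed)

def total_losses(w, g, l, u, quadratic=True):
    """Sum over t of c_t + r_t for arrays of export w, import g, charge l, discharge u."""
    c = (1 - alpha_l) * l + (1 - alpha_u) / alpha_u * u          # conversion losses, eq. (1)
    if quadratic:
        r = beta_w * w ** 2 + beta_g * g ** 2                    # resistive losses, eq. (2)
    else:
        r = (1 - gamma_w) * w + (1 - gamma_g) / gamma_g * g      # linearized transport losses
    return float(np.sum(c + r))

w = np.array([0.0, 1.2, 0.0]); g = np.array([0.8, 0.0, 0.5])
l = np.array([0.0, 0.3, 0.0]); u = np.array([0.2, 0.0, 0.1])
print(total_losses(w, g, l, u), total_losses(w, g, l, u, quadratic=False))
```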

3.3 Constraints

We consider a neighborhood with a known number of households N_h. Each of these households is allowed to install one or more types of DG, each with a given capacity. There are different types of generators available with a total of N_DG types. Let x_i be a decision variable indicating the number of households that have installed a generator of type i, i = 1, …, N_DG. We get the restriction

    0 ≤ x_i ≤ N_h,    i = 1, …, N_DG.    (4)

It may feel natural to define x_i as an integer variable. However, with our model we aim to give


purpose the integrality constraint is actually not that important. Since integer variables are much harder to deal with than continuous ones, we define x_i to be continuous.

We will look at the neighborhood as being one entity, that is, we will look at the aggregated demand and supply for the neighborhood. Let d_t be a stochastic variable representing the neighborhood demand during time period t and s_t a stochastic variable representing neighborhood supply. We assume the distribution of d_t is known. However, the distribution of s_t depends on the amount of installed generators. Let s_it be a stochastic variable representing the output of a generator of type i during period t, such that

    s_t = Σ_{i=1}^{N_DG} s_it x_i,    t = 1, …, T.    (5)

We define ∆_t to be the excess supply, so the amount of electricity generated by the DGs in the neighborhood minus the amount that is used for consumption within the neighborhood,

    ∆_t = s_t − d_t,    t = 1, …, T.    (6)

A negative value of ∆_t indicates that there was less electricity generated than was consumed.

Demand and supply of electricity should always be matched perfectly. We could obtain this balance by importing electricity (g_t, g from generation) from producers outside the neighborhood or exporting electricity (w_t, w from waste) to consumers outside the neighborhood. Furthermore, we assume that the neighborhood owns a storage device which can be used to charge electricity to (l_t, l from load) or to take electricity from (u_t, u from unload). So,

    w_t − g_t + l_t − u_t = ∆_t,    t = 1, …, T,    (7)
    w_t, l_t, g_t, u_t ≥ 0,          t = 1, …, T.    (8)

There are some constraints on using storage. Obviously we cannot store negative amounts of energy, just as there is some upper limit to the amount of energy we can store (b_max). Both when we charge and when we discharge, some energy goes to waste. Above we defined α_l ∈ (0, 1] to be the fraction of the energy that we plan to charge that actually gets charged onto the storage unit and α_u ∈ (0, 1] the fraction of energy discharged that arrives to the customer. Letting b_t (b from battery) be the amount of energy stored at the end of period t, we get

    b_t = b_{t−1} + α_l l_t − (1/α_u) u_t,    t = 1, …, T,    (9)
    0 ≤ b_t ≤ b_max,                           t = 1, …, T.    (10)


Figure 5: A schematic overview of the mathematical model.

Combining equation (1) to (10), we get

    min_{x_i, w_t, g_t, l_t, u_t}  E[ Σ_{t=1}^{T} (c_t + r_t) ]

    s.t.  x_i ≤ N_h                                               i = 1, …, N_DG,
          w_t − g_t + l_t − u_t = Σ_{i=1}^{N_DG} s_it x_i − d_t    t = 1, …, T,
          b_t = b_{t−1} + α_l l_t − (1/α_u) u_t                    t = 1, …, T,
          c_t = (1 − α_l) l_t + ((1 − α_u)/α_u) u_t                t = 1, …, T,      (11)
          r_t = β_w w_t² + β_g g_t²                                t = 1, …, T,
          b_t ≤ b_max                                              t = 1, …, T,
          b_t, w_t, g_t, l_t, u_t ≥ 0                              t = 1, …, T,
          x_i ≥ 0                                                  i = 1, …, N_DG.
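To illustrate the structure of model (11), the sketch below solves a single-scenario (deterministic) instance: with one fixed realization of demand and per-unit supply the expectation disappears and the model becomes a small convex quadratic program. It assumes the cvxpy package is available; the horizon, demand, supply and efficiency values are illustrative only.

```python
# Sketch of a single-scenario (deterministic) instance of model (11); illustrative data.
import cvxpy as cp
import numpy as np

T, N_DG, N_h = 4, 2, 100
d = np.array([40.0, 55.0, 30.0, 60.0])               # neighborhood demand per period (kWh)
s = np.array([[0.2, 0.5, 0.6, 0.1],                  # output of one unit of DG type 1
              [0.4, 0.4, 0.4, 0.4]])                 # output of one unit of DG type 2
alpha_l, alpha_u, b_max = 0.9, 0.9, 20.0
beta_w, beta_g = 0.02, 0.02

x = cp.Variable(N_DG, nonneg=True)                   # installed units per DG type
w = cp.Variable(T, nonneg=True)                      # export
g = cp.Variable(T, nonneg=True)                      # import
l = cp.Variable(T, nonneg=True)                      # charge
u = cp.Variable(T, nonneg=True)                      # discharge
b = cp.Variable(T, nonneg=True)                      # storage level

constraints = [x <= N_h, b <= b_max]
for t in range(T):
    b_prev = 0.0 if t == 0 else b[t - 1]             # empty storage at the start (assumed)
    constraints += [
        w[t] - g[t] + l[t] - u[t] == s[:, t] @ x - d[t],
        b[t] == b_prev + alpha_l * l[t] - u[t] / alpha_u,
    ]

conversion = (1 - alpha_l) * cp.sum(l) + (1 - alpha_u) / alpha_u * cp.sum(u)
resistive = beta_w * cp.sum_squares(w) + beta_g * cp.sum_squares(g)
problem = cp.Problem(cp.Minimize(conversion + resistive), constraints)
problem.solve()
print("installed units per type:", np.round(x.value, 1), "losses:", round(problem.value, 2))
```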

3.4 Properties of the model

The model defined above can be viewed as a two-stage multiperiod recourse model. As the name already suggests, for these types of models we assume we can take our decisions in two steps. Some decisions have to be taken now, while we do not know yet what values the uncertain parameters will take. Other decisions can be made after we have observed the actual values of the uncertain parameters. In our case we now have to decide on the number and type of DGs (x_i) and only after demand and supply in a certain period are known we can decide what to do with the excess or shortage of supply in that period (w_t, g_t, l_t, u_t). The model is called multiperiod because

in the second stage we are not just considering demand and supply in one period, but over T periods.

Mathematically, two-stage recourse models are expressed as

    min_{x ∈ X} { cx + E_ω[ min_{y ∈ Y} { qy : Wy ∼ h(ω) − T(ω)x } ] : Ax ∼ b },    (12)

where x ∈ R^n is a vector reflecting the decisions to be made in the initial stage, c is the n-vector of costs associated to these first stage decisions, the set X provides upper and lower bounds of x, and Ax ∼ b reflects m_1 other linear constraints on x. The symbol ∼ indicates that this set of constraints can include both equality (=) and inequality (≤, ≥) constraints. y ∈ R^p is a vector reflecting the decisions to be made in the second stage. Similar to the first stage, q is the vector of costs associated to the second stage decisions, the set Y provides upper and lower bounds of y, and Wy ∼ h(ω) − T(ω)x reflects m other linear constraints on y for given x. The m-vector h(ω) and the m × n matrix T(ω) may depend on the random variable ω ∈ R^r. In fact we assume we can write both h(ω) and T(ω) as linear functions of ω. We will write T_k for T(ω_k) and h_k for h(ω_k), with ω_k the k-th realization of ω. The recourse structure (Y, q, W) as defined above is called fixed recourse, in the sense that W, Y, and q do not depend on ω.

We want to put our optimization model (11) into the standard Stochastic Programming structure given in equation (12). We assume for now that the transport losses can indeed be approximated by a linear function. x is the vector indicating the number of generators to install. In our initial stage we cannot incur any losses, so c = 0. We have no restrictions on x apart from the upper and lower bounds which are reflected in the set X. The second stage determines the values of g_t (generation), w_t (waste), l_t (loading), u_t (unloading), and indirectly b_t (storage level), t = 1, …, T, so

    y = [y_1 y_2 … y_T],  with  y_t = [g_t w_t l_t u_t b_t],    t = 1, …, T,
    q = [q_1 q_2 … q_T],  with  q_t = [1 − α_l   (1 − α_u)/α_u   1 − γ_w   (1 − γ_g)/γ_g   0],    t = 1, …, T.

The set Y reflects that all elements of y are nonnegative and b_t < b_max for every t = 1, …, T, such that we are left with the 2T constraints


Define

    v(z) := min_{y ∈ Y} { qy : Wy ∼ z },    z ∈ R^m,
    Q(x) := E_ω[ v(h(ω) − T(ω)x) ],         x ∈ R^n.

We recall the following definitions and theorem from [21].

Definition 3.1. The fixed recourse structure (Y, q, W) is called complete if for all z ∈ R^m there exists a y ∈ Y, possibly depending on z, such that Wy ∼ z. Equivalently, if v(z) < +∞ for all z ∈ R^m.

Definition 3.2. The fixed recourse structure (Y, q, W) is called extremely inexpensive if there exists a z ∈ R^m such that v(z) = −∞.

Theorem 3.3. Consider a SLPwR (stochastic linear program with recourse) model with fixed recourse structure (Y, q, W). If

(a) the recourse structure is complete,

(b) the recourse structure is not extremely inexpensive,

(c) and E_ω[|ω_i|] < ∞, i = 1, …, r,

then Q(x) is a finite convex function for all x ∈ R^n. Moreover, it is polyhedral (piecewise linear) if the distribution of ω is discrete, and it is everywhere differentiable if the distribution of ω is a continuous one.

The definitions above require the stochastic program to be in canonical form. It is easy to see that the problem at hand has a complete recourse structure and is not extremely inexpensive. Hence, as long as we define our random variables such that condition (c) is also fulfilled, we are dealing with a convex problem in x. It follows naturally that when we take into account that the transport losses are actually quadratic in waste and generation, we are still dealing with a convex problem in x.

3.5 Complexity of the model

Although our model has some nice properties, it may still be hard to solve in a reasonable amount of time. The difficulty of solving recourse problems is in evaluating the expected value, since this requires calculating iterated integrals. To circumvent this, we could perform either one of two commonly made approximations. First of all, if we have uncertain parameters that are described by a continuous distribution, we can approximate this distribution by a discrete one. In that case the expectation can be replaced by the weighted sum of the outcomes for the different scenarios, such that we are left with a deterministic problem. This problem can be solved by, for example, the L-shaped method, which is equivalent to Benders' Decomposition as used for Mixed Integer Programming problems.

Looking at our model, suppose that we only consider one type of generator. Also assume that the output of this generator within one period can be approximated by a discrete distribution containing S_1 values. Similarly, let demand be represented by a discrete distribution with S_2 possible outcomes, such that there are S_1 S_2 combinations of the two. As these random variables are present in every period, in total we have (S_1 S_2)^T possible realizations. Say that we will


equivalent of our model, is huge. Hence applying this approximation to our model leaves us with an unsolvable problem.
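A quick calculation makes this growth concrete; the values of S_1, S_2 and T below are hypothetical, not the ones used in our experiments.

```python
# Quick arithmetic: even small per-period supports explode once raised to the number
# of periods. S1, S2 and T are hypothetical, illustrative values.
S1, S2, T = 5, 5, 96   # supply outcomes, demand outcomes, periods (assumed)
print(f"scenarios per period: {S1 * S2}")
print(f"total scenarios (S1*S2)**T: {(S1 * S2) ** T:.3e}")   # about 1.6e+134 here
```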

The second type of approximation involves approximating the objective function rather than the input distribution, by sampling from the joint distribution of the uncertain parameters. The resulting model can then again be solved using the L-shaped algorithm. For some algorithms it is not necessary to approximate the objective before applying the algorithm. In those cases the sampling process is used to guide the algorithm towards the optimal solution. An example of this is Stochastic Decomposition, which we could see as a stochastic equivalent of the L-shaped algorithm.

By applying this second type of approximation to our problem, we eliminate the need to evaluate infinitely many scenarios. Unfortunately, even when using this approximation we are left with a huge model. Applying either the L-shaped method or Stochastic Decomposition, we have to iteratively evaluate the second stage for a given scenario. In our case this implies that in every iteration of the algorithm we have to evaluate an LP with 11,520 constraints and 28,800 variables, which is an impractical task.

3.6 Simplifying the second stage

In the previous section we explained that by using different forms of approximations, seemingly unsolvable models can be made solvable. Unfortunately some models, such as ours, are even after these efforts very complicated. In some cases it may pay off to take a closer look at the type of problem that is solved in the second stage, to see if we can use this to our advantage. We believe that this may be so in our case. For a given x, the second stage boils down to finding an optimal control policy for the storage unit: when should we charge/discharge the storage unit to minimize losses as much as possible? If we can choose appropriate policies before solving for x, rather than performing these tasks simultaneously, the problem becomes much simpler.

Finding optimal control policies for electricity storage is a lively research area in itself, see for example [24, 25, 32, 39]. As we implicitly did above, these articles assume that the smart grid will contain a smart storage system. This storage system makes decisions itself on when and how much power to charge, taking into account knowledge about its current state as well as the future. However, it might be more reasonable to assume that this decision is only based on the current state of the system. Suppose first that we are dealing with such a 'not so smart' storage system, which does not anticipate future demand and supply.

The first thing we can establish is that we will never charge the battery using generation. Also, we will never unload the battery to export. This is a simple consequence of all the nonnegative coefficients in our objective. Suppose we import one extra unit of conventional generation to charge our battery. We would incur generation losses as well as storage losses (β_g + α_l ≥ 0), which we could save by simply not importing the extra unit. Similarly, we will never charge and discharge simultaneously or import and export simultaneously. With this in mind we can establish that an excess of electricity can be covered by either storing it or exporting it. A shortage of electricity can be dealt with by unloading the battery or by importing.

When we assume we have a more advanced storage system that can make predictions about the future and take these into account, the above findings no longer hold. Since transport losses are a quadratic function of waste and generation, we want to avoid the need to generate a lot within one time period. We can anticipate expected high demands by charging some conventional energy to our storage system during periods of low demand. Similarly, we can absorb an expected large excess of supply by discharging the storage system at times of little excess supply.


∆_t range | w_t | g_t | l_t | u_t | b_t
∆_t ≤ 0   | 0   | −∆_t | 0  | 0   | b_{t−1}
0 < ∆_t   | ∆_t | 0    | 0  | 0   | b_{t−1}

Table 2: Control Policy I.

rather than quadratic losses. In this case we would never choose to anticipate the future. This can be made clear by considering an example in which we do anticipate the future. Suppose we decide to import one unit of electricity which we charge to our battery today, because we want to use it t periods from now. We incur losses from importing and loading (2 − γ_g − α_l) at this moment and from unloading ((1 − α_u)/α_u) t periods from now. While if we would just generate one unit of electricity t periods from now, we would only have the losses from generating: 1 − γ_g < 2 − γ_g − α_l + (1 − α_u)/α_u.

From here on we assume that our storage system does not take into account any information about the future. We suggest three possible optimal control policies.

Control policy I: Never using storage

The first control policy consists of never using storage. In that case all excess supply will go to waste and all shortage of supply will be covered by generation, see Table 2. By defining x⁺ = max(0, x) and x⁻ = max(0, −x), we can write w_t = ∆_t⁺, g_t = ∆_t⁻ and the model becomes

    min_x  E[ Σ_{t=1}^{T} ( β_w (∆_t⁺)² + β_g (∆_t⁻)² ) ]

    s.t.  ∆_t = Σ_{i=1}^{N_DG} s_it x_i − d_t    t = 1, …, T,
          0 ≤ x_i ≤ N_h                           i = 1, …, N_DG.

If we use the linear approximation to the problem the objective is replaced by

    min_x  E[ Σ_{t=1}^{T} ( (1 − γ_w) ∆_t⁺ + ((1 − γ_g)/γ_g) ∆_t⁻ ) ].

This type of problem is also known as simple recourse.
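The sketch below illustrates this simple recourse structure for a single DG type: under control policy I the second stage reduces to w_t = ∆_t⁺ and g_t = ∆_t⁻, so the expected losses G(x) can be estimated by plain Monte Carlo sampling. The demand and supply distributions and all parameter values are illustrative assumptions.

```python
# Sketch of evaluating G(x) = E[g(x, omega)] under control policy I for one DG type.
import numpy as np

rng = np.random.default_rng(2)
T, beta_w, beta_g, N_h = 24, 0.02, 0.02, 100

def sample_scenario():
    """One realization of hourly demand d_t and per-unit supply s_t (assumed distributions)."""
    d = rng.gamma(shape=4.0, scale=10.0, size=T)          # neighborhood demand (kWh)
    s = np.clip(rng.normal(0.3, 0.15, size=T), 0, None)   # output of one DG unit (kWh)
    return d, s

def policy_I_losses(x, d, s):
    delta = s * x - d                                     # excess supply
    w, g = np.maximum(delta, 0.0), np.maximum(-delta, 0.0)
    return np.sum(beta_w * w ** 2 + beta_g * g ** 2)

def G_estimate(x, K=2000):
    return np.mean([policy_I_losses(x, *sample_scenario()) for _ in range(K)])

for x in (0, 50, 100):
    print(f"x = {x:3d} units: estimated expected losses {G_estimate(x):8.1f}")
```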

Control policy I is optimal when the storage system is so inefficient that we incur more losses by shifting supply using storage than by transporting it over the grid. This is the case when the marginal losses from transporting are always lower than the marginal losses from using storage, that is, when 1 − α_l > 2β_g (−min{∆_t}) and 1 − α_u > 2β_w max{∆_t}. If we can assume losses are linear, this control policy is optimal when both α_l < γ_g and α_u < γ_w.

Control policy II: Using storage when possible


∆_t range | w_t | g_t | l_t | u_t | b_t
∆_t ≤ −α_u b_{t−1} | 0 | −∆_t − α_u b_{t−1} | 0 | α_u b_{t−1} | 0
−α_u b_{t−1} < ∆_t ≤ 0 | 0 | 0 | 0 | −∆_t | b_{t−1} + (1/α_u) ∆_t
0 < ∆_t ≤ (b_max − b_{t−1})/α_l | 0 | 0 | ∆_t | 0 | b_{t−1} + α_l ∆_t
(b_max − b_{t−1})/α_l < ∆_t | ∆_t − (b_max − b_{t−1})/α_l | 0 | (b_max − b_{t−1})/α_l | 0 | b_max

Table 3: Control Policy II.

and only if this is already empty it will be imported. The relationship between our second stage variables and excess supply is displayed in Table 3.

Within the original formulation of the problem this control policy is only optimal when the storage unit is completely efficient, so α_l = α_u = 1, since otherwise there always exists a value of ∆_t, possibly very close to zero, such that 1 − α_l > 2β_g max(∆_t⁻). Our mathematical model in this case reduces to

    min_x  E[ Σ_{t=1}^{T} ( β_w w_t² + β_g g_t² ) ]

    s.t.  w_t = [∆_t − b_max + b_{t−1}]⁺             t = 1, …, T,
          g_t = [−∆_t − b_{t−1}]⁺                    t = 1, …, T,
          b_t = b_{t−1} + ∆_t − w_t + g_t            t = 1, …, T,
          ∆_t = Σ_{i=1}^{N_DG} s_it x_i − d_t        t = 1, …, T,
          0 ≤ x_i ≤ N_h                               i = 1, …, N_DG.

If we can assume losses are linear, this control policy is optimal when it is both more efficient to charge than to export and more efficient to discharge than to import. That is, it is optimal when both α_l > γ_g and α_u > γ_w. The mathematical model we are left with is


which looks a lot more complicated than the two models above.

Note that in this case we are basically dealing with a queue, as in the setting of Wang et al. [44]. However, there are some important differences between our model and those generally used in queuing theory. First of all we have an inefficient queue, which is as far as we know never used in queuing theory. Only queues with impatient customers show some resemblance, which are systems in which it is assumed customers leave if they have been in line for longer than a certain threshold period. A second important difference is that where we normally have arrivals according to a certain distribution that get serviced with a given rate, we do not have a specific service rate, but positive and negative arrivals which are independent of each other. The last point at which our model does not fit in standard queuing theory is the fact that we are dealing with nonstationary processes, as mean supply and demand vary over the day. For an analysis where demand and supply are assumed to be stationary we refer to Su and El Gamal [39], who perform some interesting analyses on this case.

Control policy III: Transporting up to a certain threshold

When applying this policy we will use transport as long as the marginal losses in transport are smaller than the marginal losses for storage. Once the marginal losses in transport start exceeding the marginal losses for storage we start using storage, until it is full/empty, such that we have to use transport again. The first two control policies can be seen as special cases of this control policy. The relationship between ∆_t and second stage variables is shown in Table 4.

Rewriting our problem given this control policy we get

    min_x  E[ Σ_{t=1}^{T} ( (1 − α_l) l_t + ((1 − α_u)/α_u) u_t + β_w w_t² + β_g g_t² ) ]

    s.t.  l_t = min{ [∆_t − (1 − α_l)/(2β_w)]⁺, (b_max − b_{t−1})/α_l }    t = 1, …, T,
          u_t = min{ [−∆_t − (1 − α_u)/(2β_g)]⁺, α_u b_{t−1} }             t = 1, …, T,
          w_t = [∆_t − l_t]⁺                                               t = 1, …, T,
          g_t = [−∆_t − u_t]⁺                                              t = 1, …, T,
          b_t = b_{t−1} + α_l l_t − (1/α_u) u_t                            t = 1, …, T,
          ∆_t = Σ_{i=1}^{N_DG} s_it x_i − d_t                              t = 1, …, T,
          0 ≤ x_i ≤ N_h                                                     i = 1, …, N_DG.
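The following sketch implements control policy III as a per-period rule consistent with the model above; setting the thresholds to zero essentially recovers policy II, while letting them grow very large recovers policy I. The efficiency and loss parameters are illustrative assumptions.

```python
# Sketch of control policy III as a per-period rule: transport up to the threshold at
# which marginal transport losses reach marginal storage losses, then use storage until
# it is full or empty.
import numpy as np

alpha_l, alpha_u, b_max = 0.9, 0.9, 20.0
beta_w, beta_g = 0.02, 0.02

def policy_III_step(delta, b_prev):
    """Return (w, g, l, u, b) for excess supply `delta` and previous storage level `b_prev`."""
    thr_w = (1 - alpha_l) / (2 * beta_w)   # export level where charging becomes cheaper
    thr_g = (1 - alpha_u) / (2 * beta_g)   # import level where discharging becomes cheaper
    l = min(max(delta - thr_w, 0.0), (b_max - b_prev) / alpha_l)
    u = min(max(-delta - thr_g, 0.0), alpha_u * b_prev)
    w = max(delta - l, 0.0)
    g = max(-delta - u, 0.0)
    b = b_prev + alpha_l * l - u / alpha_u
    return w, g, l, u, b

b = 0.0
for delta in (5.0, 40.0, -3.0, -60.0):     # a few example excess-supply values
    w, g, l, u, b = policy_III_step(delta, b)
    print(f"delta={delta:6.1f}: export {w:5.1f}, import {g:5.1f}, "
          f"charge {l:5.1f}, discharge {u:5.1f}, storage {b:5.1f}")
```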


we only deplete the battery but never use it does not make sense in reality.

The different control policies provide a direct (though not necessarily easy) expression of losses in terms of our random variable. So rather than needing to evaluate the two-stage recourse problem

    min_{x ∈ X}  E_ω[ min_{y ∈ Y} { qy : Wy ≤ h(ω) − T(ω)x } ],

we are now dealing with a set of problems of the form

    min_{x ∈ X} G(x),    G(x) := E_ω[ g(x, ω) ].    (13)

These new problems are much smaller in size when it comes to the number of constraints and decision variables. Every realization of our continuous random variable ω still contains 2T (possibly correlated) elements, so calculating the expectation is not trivial. In Section 3.5 we already touched upon the subject of sampling to get around this problem. In the next section we will go into greater detail on how we can use this, and how we will apply sampling to find an optimal solution.

4 Solution method

4.1 Optimization algorithms based on sampling

There are many different algorithms involving sampling within stochastic programming. Rather than using G(x), these kinds of methods make use of an approximation of G(x), given by

    G_K(x) := (1/K) Σ_{k=1}^{K} g(x, ω_k),

where the ω_k are independent samples from the joint distribution of ω. We can differentiate between two sampling strategies: either we sample outside the method or we sample within. For the first type of strategy we take K samples and solve the problem

    min_{x ∈ X} G_K(x).

This problem is known as the Sample Average Approximation (SAA) problem. This is a big deterministic problem and can be solved as such. It is important to choose K wisely, since too small a sample may result in unreliable solutions, while a very big sample may ask for unnecessary computing power. Therefore it is advised not to apply this strategy to the problem just once, but to repeatedly solve the problem, each time using a bigger sample, as described in Kleywegt et al. [22]. This process can be stopped when the expected difference between the solution from the SAA problem and the 'real' problem is sufficiently small. It has been shown that the solutions retrieved in this way converge to the optimal solution.
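A minimal sketch of this outer-sampling strategy for the simplified model: draw K scenarios, then minimize the sample average G_K(x) over the installed capacity of a single DG type (a grid search suffices in one dimension). The distributions and parameters are illustrative assumptions, and the stopping rule of [22] is not implemented.

```python
# Sketch of the Sample Average Approximation idea for the simplified (policy I) model.
import numpy as np

rng = np.random.default_rng(3)
T, N_h = 24, 100
beta_w, beta_g = 0.02, 0.02

def g_loss(x, d, s):
    """Second-stage losses under control policy I for one scenario (d, s)."""
    delta = s * x - d
    return np.sum(beta_w * np.maximum(delta, 0) ** 2 + beta_g * np.maximum(-delta, 0) ** 2)

def solve_saa(K):
    scenarios = [(rng.gamma(4.0, 10.0, T), np.clip(rng.normal(0.3, 0.15, T), 0, None))
                 for _ in range(K)]
    grid = np.linspace(0, N_h, 101)
    G_K = [np.mean([g_loss(x, d, s) for d, s in scenarios]) for x in grid]
    best = int(np.argmin(G_K))
    return grid[best], G_K[best]

for K in (10, 100, 1000):                 # growing sample sizes, as suggested in [22]
    x_star, val = solve_saa(K)
    print(f"K = {K:4d}: x* = {x_star:5.1f}, G_K(x*) = {val:8.1f}")
```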


For the second strategy, in which sampling takes place within the optimization method, we can imagine that for a solution far away from the optimum it is no problem if the approximation of the objective value at that point is not very accurate, while for solutions near the optimum we would like this approximation to be very precise. We could thus start our algorithm using only a small sample and increase the sample size as the algorithm progresses. In this way most sampling effort is allocated to solutions that are close to the optimum. See for example Alrefaei and Andradóttir [2] for such an adaptation of Simulated Annealing for discrete stochastic optimization problems.

For the problem at hand we propose an algorithm that follows the second strategy, called Stochastic Decomposition. This algorithm was first presented in Higle and Sen [15]. The aim of this method is to create an outer approximation of the objective function by building a simple approximating LP problem.

We start with the problem
$$\min_{x}\; cx \quad \text{s.t.}\; Ax \le b,\; x \in \mathbb{R}^n_+.$$

Solving this gives us x^1, which we use as the starting value of the algorithm. Then at the kth iteration we consider the problem

$$
\begin{aligned}
\min_{x,\theta}\;\; & cx + \theta \\
\text{s.t.}\;\; & Ax \le b, \\
& \theta \ge \alpha^l + \beta^l x, \qquad l = 1,\ldots,k-1, \\
& x \in \mathbb{R}^n_+,\; \theta \in \mathbb{R}.
\end{aligned}
\tag{14}
$$

At the current solution x^k we construct a cutting plane to the objective, which we add as a constraint to problem (14). Then we resolve the problem to obtain x^{k+1}, construct another cut, and so on. So in iteration k the objective function is approximated by k − 1 cuts. As more cuts are added, the LP provides a more precise approximation to the objective function.

We expect that the cutting planes added in later iterations are more likely to be binding at the optimal solution than the ones created in earlier iterations. Moreover, it can be quite tedious to construct cutting planes to G(x). Therefore, rather than constructing cutting planes to G(x) at every iteration, at iteration k we construct a cutting plane to G_k(x), the SAA of G(x) based on the information from k samples. This implies that we are constructing cuts with incomplete information, and hence we may be constructing invalid cuts. We therefore have to make sure that the old cuts are updated at every iteration, such that all cuts in problem (14) form statistically valid lower bounds to G_{k−1}.

In the standard version of the algorithm the following steps are taken in every iteration.

1. Generate observation ω^k.

2. Solve the second stage problem at (x^k, ω^k) and add the resulting dual vertex to the set Λ_k.

3. Construct the optimality cut θ ≥ α^k_k + β^k_k x and add it to problem (14), where for l = 1, …, k − 1,
   λ^k_l = argmax_λ { λ(h_l − T_l x^k) : λ ∈ Λ_k }.

4. Similarly, update the cut at the incumbent x̄^{k−1}.

5. Update the other cuts such that [α^k_l, β^k_l] = ((k − 1)/k) [α^{k−1}_l, β^{k−1}_l].

6. Let f_k(x) := max{ α^k_l + β^k_l x | l = 1, …, k }. If
   f_k(x^k) − f_k(x̄^{k−1}) < r ( f_{k−1}(x^k) − f_{k−1}(x̄^{k−1}) ),
   then the incumbent becomes x̄^k = x^k, else x̄^k = x̄^{k−1}.

7. Solve the master problem in order to find x^{k+1}.

The cuts in steps 2 and 3 are constructed using subgradients of the objective at x^k, which are calculated using the dual of the second stage problem. With the construction of the control policies we eliminated the second stage from our problem. We therefore have to adapt this algorithm slightly to be able to use it on the resulting set of simplified models; how we do this is explained in Section 4.2. In step 5 the old cuts are updated to ensure that at iteration k + 1 the constraints of problem (14) form statistically valid lower bounds to G_k. In step 6 we check whether the incumbent should be updated. We only do this when the current solution is a sufficient improvement with respect to the current incumbent. Using the parameter r ∈ (0, 1) one can indicate how much of an improvement is required for the incumbent to be updated.

As the algorithm proceeds, the current solution x^k does not necessarily converge to the optimal solution. Fortunately, it can be shown that an identifiable subsequence of the incumbent solutions x̄^k does. We say the algorithm has converged when the incumbent solution does not change, the objective value changes by less than a specified ε > 0 and no new dual vertex λ is found. It often happens that the sequence of incumbent solutions has already converged to the optimal solution while the objective value is not yet stable. This occurs, for example, when many samples are needed to obtain an accurate estimate of the objective function. With this in mind, several other stopping rules have been developed, for example ones based on statistical tests [14].
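A minimal sketch of such a stopping check is given below. It only covers the first two conditions (an unchanged incumbent and a stable objective value), since in our adapted algorithm there is no set of dual vertices to monitor; the window length and tolerance are arbitrary illustrative choices.

```python
import numpy as np

def has_converged(incumbents, values, eps=1e-4, window=10):
    """Stop when the incumbent has not changed and the approximate objective
    value has moved by less than eps over the last `window` iterations."""
    if len(values) < window or len(incumbents) < window:
        return False
    same_incumbent = all(np.array_equal(incumbents[-1], z) for z in incumbents[-window:])
    stable_value = (max(values[-window:]) - min(values[-window:])) < eps
    return same_incumbent and stable_value
```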

4.2 An adaptation of Stochastic Decomposition

Calculating the subgradients

The standard version of Stochastic Decomposition uses the dual of the second stage to construct supporting hyperplanes. We no longer have a second stage, but we do have a direct expression of our objective function in terms of x and realizations ω^k. This allows us to numerically calculate the gradients of our objective function for a given x and thereby obtain the supporting hyperplanes. Specifically, we calculate the subgradients based on all the information known at that point in the algorithm, so the subgradient calculated at the kth iteration is a subgradient of the function G_k(x) at the point x^k. The other steps of the algorithm are unchanged.

Similarly to above we choose a starting value x^1. At the kth iteration we consider problem (14) and perform the following steps:

1. Generate an observation ω^k.

2. Numerically calculate ∇G_k(x^k) and construct the optimality cut θ ≥ α^k_k + β^k_k x, where
   β^k_k = ∇G_k(x^k) and α^k_k = G_k(x^k) − β^k_k x^k.

3. Update the cut at the incumbent x̄^{k−1}.

4. Update the other cuts such that [α^k_l, β^k_l] = ((k − 1)/k) [α^{k−1}_l, β^{k−1}_l].

5. Evaluate the incumbent: if
   f_k(x^k) − f_k(x̄^{k−1}) < r ( f_{k−1}(x^k) − f_{k−1}(x̄^{k−1}) ),
   then x̄^k = x^k, else x̄^k = x̄^{k−1}.

6. Solve the master problem in order to find x^{k+1}.
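The sketch below shows one iteration of this adapted scheme. It is a simplification under several assumptions of our own: G_k is the sample average over the scenarios drawn so far, the subgradient is approximated by central differences, the general first stage constraints Ax ≤ b are replaced by the simple bounds 0 ≤ x_i ≤ N_h of our problem, and the master LP is solved with scipy. Names such as `sd_iteration` are illustrative, not part of an existing implementation.

```python
import numpy as np
from scipy.optimize import linprog

def numerical_gradient(f, x, h=1e-4):
    """Central-difference approximation of the (sub)gradient of f at x."""
    grad = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        grad[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return grad

def sd_iteration(k, x_k, x_inc, cuts, scenarios, g, c, bounds, r=0.2):
    """One iteration of the adapted Stochastic Decomposition (illustrative sketch).

    cuts is a list of (alpha, beta) pairs; scenarios holds the k observations
    drawn so far, so G_k below is the sample average over all of them."""
    x_k = np.asarray(x_k, dtype=float)
    G_k = lambda x: np.mean([g(x, w) for w in scenarios])

    # value of the previous approximation f_{k-1}, needed for the incumbent test
    f_prev = lambda x: max(a + b @ x for a, b in cuts) if cuts else 0.0
    delta_prev = f_prev(x_k) - f_prev(x_inc)

    # step 2: numerical subgradient and new optimality cut (tangent to G_k at x_k)
    beta_new = numerical_gradient(G_k, x_k)
    alpha_new = G_k(x_k) - beta_new @ x_k

    # steps 3-4: rescale the old cuts so that they remain valid for G_k, add the new cut
    cuts = [((k - 1) / k * a, (k - 1) / k * b) for a, b in cuts]
    cuts.append((alpha_new, beta_new))

    # step 5: incumbent test
    f_k = lambda x: max(a + b @ x for a, b in cuts)
    if f_k(x_k) - f_k(x_inc) < r * delta_prev:
        x_inc = x_k

    # step 6: master LP  min c.x + theta  s.t.  beta.x - theta <= -alpha, x within bounds
    n = len(x_k)
    c_lp = np.append(np.asarray(c, dtype=float), 1.0)        # decision vector (x, theta)
    A_ub = np.array([np.append(b, -1.0) for a, b in cuts])
    b_ub = np.array([-a for a, b in cuts])
    res = linprog(c_lp, A_ub=A_ub, b_ub=b_ub, bounds=list(bounds) + [(None, None)])
    x_next = res.x[:n]
    return x_next, x_inc, cuts
```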

Since we only adapted the way in which the subgradients are calculated, we expect that convergence still holds. The cutting planes still provide statistically valid lower bounds, since
$$\alpha^k + \beta^k x \le G_k(x),$$
which by construction is a supporting hyperplane to the average function. For the updated cuts we have
$$\alpha^{k-m} + \beta^{k-m} x \le \frac{1}{k-m}\sum_{l=1}^{k-m} g(x, \omega^l) \le \frac{1}{k-m}\sum_{l=1}^{k} g(x, \omega^l),$$
so that
$$\alpha^k_{k-m} + \beta^k_{k-m} x = \frac{k-1}{k}\,\frac{k-2}{k-1}\cdots\frac{k-m}{k-m+1}\left(\alpha^{k-m} + \beta^{k-m} x\right) = \frac{k-m}{k}\left(\alpha^{k-m} + \beta^{k-m} x\right) \le \frac{1}{k}\sum_{l=1}^{k} g(x, \omega^l).$$

Updating old cuts

We tested the adapted algorithm on some toy problems (for example min_x {ω x² + c}, with ω a random variable and c a given constant) and observed that the algorithm always converged to the optimum. However, when running it on instances of our problem the algorithm often did not manage to converge. Specifically, after performing multiple runs, each using different starting values, the recommended percentages of houses installing micro-CHP were as much as fourteen percentage points apart. For PV the solutions were even up to 26 percentage points apart. Since the algorithm itself is stochastic in nature, it is not strange that different runs produce slightly different solutions, so to verify that this is really an issue of convergence, we tried the algorithm on deterministic input. Were the algorithm to converge well, it would always lead to approximately the same point, regardless of where we start. As expected, runs using the same starting point indeed led to the same solution, but unfortunately different starting points still led to solutions that were up to four percentage points apart.


Figure 6: An example of the behavior of the incumbent and candidate solution throughout the algorithm.

This lack of solution stability is related to the phenomenon described by King et al. [20]. They explain that while we can expect to achieve stability of the objective function, requiring stability of the solutions may be unreasonable. As Shapiro et al. [37] further explain, large problems in numerical analysis are often ill conditioned. In the case of solving large stochastic problems using SAA, this results in downward biased estimators of the objective value, which makes it hard to perform tests on the optimality of the solution found. For practical purposes it is not really an issue that our solutions do not converge to the actual optimum, as long as the recommendations give quite similar outcomes with respect to losses. Be that as it may, we would like to compare the effect of different types of inputs on the solutions, which is hindered by these issues with convergence.

Figure 6 shows an example of the incumbent (solid lines) and the optimal solutions (dotted lines) throughout the algorithm. Figure 7 shows the value of the objective function at every iteration. We see that the algorithm converges to an approximate solution quite quickly, after which it keeps jumping back and forth around this solution. Although the jumps seem to become somewhat smaller as the algorithm proceeds, after around 100 iterations the behavior starts to become more unpredictable, until at some point it stops. The incumbent at this point is in fact not optimal. The objective appears to have been quite stable for 200 iterations already, but if we zoom in as in Figure 7b, we see that it in fact follows the same pattern as the optimal solutions (which makes sense) and only becomes more stable during the last 10 iterations.


Figure 7: An example of the behavior of the optimal value throughout the algorithm.

To counter this behavior we propose to leave newly created cuts untouched for their first q iterations, such that we update the cut constructed in iteration k − q using
$$\alpha^k_{k-q} + \beta^k_{k-q} x = \frac{k-q}{k}\left(\alpha^{k-q} + \beta^{k-q} x\right).$$
The cuts constructed at iterations m = 1, …, k − q are updated as before, i.e.
$$\alpha^k_{k-m} + \beta^k_{k-m} x = \frac{k-1}{k}\left(\alpha^{k-1}_{k-m} + \beta^{k-1}_{k-m} x\right).$$

We have to note that due to this adaptation the approximating problem no longer forms a statistically valid lower bound to the average function. For the cuts created at iterations 1, …, k − q nothing has changed, as
$$\alpha^k_{k-m} + \beta^k_{k-m} x = \frac{k-q}{k}\,\frac{k-q-1}{k-q}\cdots\frac{k-m}{k-m+1}\left(\alpha^{k-m} + \beta^{k-m} x\right) = \frac{k-m}{k}\left(\alpha^{k-m} + \beta^{k-m} x\right).$$

For the cuts that have not been updated yet (s < q) we have
$$\alpha^{k-s} + \beta^{k-s} x \le \frac{1}{k-s}\sum_{l=1}^{k-s} g(x, \omega^l) \le \frac{1}{k-s}\sum_{l=1}^{k} g(x, \omega^l),$$
but we cannot be sure that $\alpha^{k-s} + \beta^{k-s} x \le \frac{1}{k}\sum_{l=1}^{k} g(x, \omega^l)$. Hence all cuts, except the ones that were created up to q iterations ago (and thus not updated in this iteration), provide statistically valid lower bounds. In the limit all cuts provide statistically valid lower bounds, since
$$\lim_{k\to\infty} \frac{k-s}{k}\left(\alpha^{k-s} + \beta^{k-s} x\right) = \lim_{k\to\infty}\left(\alpha^{k-s} + \beta^{k-s} x\right).$$
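A small sketch of this delayed updating rule is given below. Cuts are stored together with the iteration at which they were created and q is the chosen delay; representing a cut as a (creation iteration, α, β) triple is our own bookkeeping choice.

```python
def update_cuts_delayed(cuts, k, q):
    """Rescale the cuts at iteration k, leaving cuts younger than q iterations untouched.

    Each cut is a triple (j, alpha, beta), where j is the iteration of creation:
    age < q  -> not updated yet;
    age == q -> first update, scaled by (k - q)/k in one step;
    age > q  -> regular update by (k - 1)/k, as in the original algorithm."""
    updated = []
    for j, alpha, beta in cuts:
        age = k - j
        if age < q:
            factor = 1.0
        elif age == q:
            factor = (k - q) / k
        else:
            factor = (k - 1) / k
        updated.append((j, factor * alpha, factor * beta))
    return updated
```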


Figure 8: Sketch providing an intuition of how the optimal solution should behave in a deterministic setting.

4.3 Obtaining the set of near optimal solutions

When the objective is this flat around the optimum it may make more sense to look for a set of near optimal solutions, rather than for the one best answer. There is no standard approach to defining or finding near optimal solutions in the setting of stochastic programming. We define the set of ε-optimal solutions to be the subset X_ε of X such that
$$X_\varepsilon = \{x \in X : G(x) < G(x^*) + \varepsilon\}.$$
Rather than trying to find the exact set X_ε, we aim to find a reasonably large subset of this set, which we denote by X̃_ε. We find this subset by defining some c ∈ R^n. Then, starting from c = 0, we increase (decrease) one of its elements until G(x* + c) > G(x*) + ε. In this way we can find a range of values for every element of x for which we obtain an ε-optimal solution. The set of convex combinations of the end points of these ranges for all elements of x then forms a subset of X_ε by the convexity of our objective function. In the one-dimensional case X̃_ε = X_ε; in the two-dimensional case the relation is as depicted in Figure 9.
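The coordinate-wise search described above can be sketched as follows; the step size and the treatment of the bounds 0 ≤ x_i ≤ N_h are illustrative choices rather than part of the original procedure.

```python
import numpy as np

def eps_optimal_ranges(G, x_star, eps, step=0.01, lower=0.0, upper=100.0):
    """Scan every element of x up and down from x* until G exceeds G(x*) + eps.

    Returns per element the interval (lo, hi); the convex hull of the resulting
    end points is a subset of the eps-optimal set by convexity of G."""
    x_star = np.asarray(x_star, dtype=float)
    G_star = G(x_star)
    ranges = []
    for i in range(len(x_star)):
        hi = x_star[i]
        x = x_star.copy()
        while G(x) <= G_star + eps and x[i] <= upper:   # increase element i
            hi = x[i]
            x[i] += step
        lo = x_star[i]
        x = x_star.copy()
        while G(x) <= G_star + eps and x[i] >= lower:   # decrease element i
            lo = x[i]
            x[i] -= step
        ranges.append((lo, hi))
    return ranges
```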

We can choose ε based on how much of a difference in losses is considered negligible. An alternative approach is to define ε in terms of a confidence limit. Suppose we have found x* and approximate G(x*) by G_K(x*) = (1/K) Σ_{k=1}^{K} g(x*, ω^k). Furthermore, assume we can calculate the sample standard deviation, which we denote by s_K. If we consider all solutions x for which G_K(x) is smaller than the upper limit of the 95% confidence interval around G_K(x*) to be near optimal, then we should choose ε = t_{K−1,0.975} · s_K. In our case this leads to a very small ε.


Figure 9: Graphical representation of the near optimal set when the solution space is two-dimensional.

5 Data

5.1 Top-down demand and supply profiles

As input for our model we need to generate streams of demand and supply. We will run our model at a granularity of one minute as well as fifteen minutes. Above we assumed that the distribution of demand and supply is known at every point in time. From earlier research by TNO we obtained deterministic fifteen-minute demand and supply levels for one week in every season. We used this data as a basis to derive top-down demand and supply profiles.

First we derived a typical profile for one day in each season. The profiles measured at 15-minute time steps were very bumpy, such that we did not consider them suitable estimators of means. We therefore summed all 15-minute values within one hour to obtain a profile with hourly time steps. Furthermore, we created an average day profile for every season by averaging the values over the different days of the week. For this we did not differentiate between week and weekend days, since the patterns for these days are close together.
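The aggregation and averaging step can be summarized in a few lines. The reshape below assumes one week (seven days) of consecutive 15-minute readings, which is an assumption about the layout of the raw data rather than a description of the TNO files.

```python
import numpy as np

def hourly_day_profile(quarter_hour_values, days=7):
    """Sum the four 15-minute values within each hour and average over the days."""
    values = np.asarray(quarter_hour_values, dtype=float).reshape(days, 24, 4)
    hourly_per_day = values.sum(axis=2)     # energy per hour, for each day separately
    return hourly_per_day.mean(axis=0)      # typical day profile: 24 hourly values
```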

Demand


Figure 10: Average demand profiles.


Figure 12: Average PV output profiles.


Figure 14: Average CHP output profiles.

We used Figure 11 to make assumptions about the variance and the distribution family of energy demand at one-minute granularity. High variability of power demand seems to coincide with high average power demand, and the distribution is heavily skewed. Demand is thus best described by a skewed distribution in which the mean and variance are defined such that they are related to each other.
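As an illustration of such a distribution, the sketch below draws one-minute demand values from a lognormal distribution whose standard deviation is proportional to its mean. The lognormal family and the coefficient of variation are our own illustrative assumptions; the data only suggests that the distribution should be skewed with a mean-dependent variance.

```python
import numpy as np

def sample_minute_demand(hourly_mean, cv=0.5, rng=None):
    """Draw 60 one-minute demand values for one hour from a skewed (lognormal)
    distribution with standard deviation proportional to the mean (ratio cv)."""
    rng = np.random.default_rng() if rng is None else rng
    mean = hourly_mean / 60.0                   # average demand per minute
    sigma2 = np.log(1.0 + cv**2)                # lognormal parameters matching the
    mu = np.log(mean) - sigma2 / 2.0            # desired mean and variance
    return rng.lognormal(mu, np.sqrt(sigma2), size=60)
```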

PV output

Figure 12 shows the average power profile of PV generation for all seasons. As one would expect, PV panels only generate power during the day, with a peak around noon. The amount of generation depends on the season, as summer has more hours of sunshine than winter.

On a smaller time scale PV output is quite variable due to passing clouds. Figure 13 shows solar output on the hourly and one-minute time scales. For PV output the connection between variability and mean does not seem to be as clear as for demand. The output on the one-minute time scale is highly left skewed.

Micro-CHP output
