Machine learning applied to smart grids

DISSERTATION

submitted to obtain the degree of doctor at Eindhoven University of Technology, on the authority of the Rector Magnificus, prof.dr.ir. F.P.T. Baaijens, to be defended in public before a committee appointed by the Doctorate Board, on Monday 9 October 2017 at 16:00

by

Elena Mocanu

The doctoral committee is composed as follows:

Chair: prof.dr.ir. A.B. Smolders

Promotor: prof.dr.ir. J.G. Slootweg

Co-promotors: dr. M. Gibescu, dr. P.H. Nguyen

Members: prof.dr. M.E. Webber (The University of Texas at Austin), prof.dr. P. Pinson (Technical University of Denmark), prof.dr. A. Liotta, prof.ir. W. Zeiler

The research or design described in this dissertation has been carried out in accordance with the TU/e Code of Scientific Conduct.

A catalogue record is available from the Eindhoven University of Technology Library.

ISBN: 978-90-386-4338-0

NUR: 984

Title: Machine Learning applied to Smart Grids

Author: Elena Mocanu

Eindhoven University of Technology, 2017.

Copyright © 2017 by Elena Mocanu

All rights reserved. No part of this publication may be stored in a retrieval system, reproduced, or transmitted in any form or by any means, electronic or mechanical, including photocopy or recording, without the prior written permission of the author.

Typeset using LaTeX, printed by Ipskamp Printing, Enschede, the Netherlands.

The cover is inspired by the recent work of the visual artist Mihai Topescu. The artist painted a Romanian forest, "Dumbrava", Poienari, Gorj, as a response to abusive deforestation and as an artistic intervention to rediscover the beauty of nature.

prof.ir. W.L. Kling (†) and prof.dr.ir. J.G. Slootweg

This work is part of the TKI SG-BEMS project. This project is funded by Agentschap NL, an agency of the Dutch Ministry of Economic Affairs, Agriculture and Innovation.

Acknowledgments

First, I would like to thank Madeleine Gibescu and Phuong Nguyen for their guidance and support throughout the last four years. It was a great pleasure to work with both of you. Madeleine, your professionalism, open-mindedness and love for science are truly admirable; doing research with you was rewarding and fun. Thank you very much for everything, and everything means a lot. Phuong, thank you very much for always keeping your door open for me. Your practical approach always pushed me to go one step further in my research. Special thanks go to my promotor Han Slootweg for his sharp advice on scientific writing and his remarkable help in concluding this thesis. I really appreciate it! I would also like to thank Wil Kling for including me in his scientific family, the EES group. I am forever grateful for the research time we shared and I am deeply sorry that you did not live to see me finish. You are the key to my whole PhD journey. Thank you!

For their willingness to serve on my doctoral committee and for their helpful comments on my research, I thank Michael Webber, Pierre Pinson, Antonio Liotta, Wim Zeiler, Phuong Nguyen, Madeleine Gibescu and Han Slootweg. Furthermore, I am very grateful for the excellent cooperation with all my coauthors.

My special thanks go to my PhD colleague, Luis Hurtado Muñoz, for his unconditional support during these four years, and for being a good collaborator and a great friend. Thanks also go to all SG-BEMS project collaborators, especially Wim Zeiler, Eric Pauwels, Gert Boxem, Joep van der Velden, Peet van Tooren, and Kennedy Aduda.

Parts of this thesis are based on experience gained during my research visits, for which I am profoundly grateful to my host groups. I enjoyed discussing and interacting with Pierre Pinson and the people from the ELMA group at DTU, where I had the opportunity to give my first international invited talk. From the same group, I thank Emil Larsen for preparing a dataset. During my research visit to the University of Texas at Austin, in the Webber Energy Group led by Michael Webber and the LARG group led by Peter Stone, I not only had an excellent and amazing experience but also came to understand anew the importance of setting high goals. Michael, thank you for inspiring me.

Of course, there is a huge community of other people who supported me deeply. I don't have the space or time to thank individually every EES member with whom I overlapped, but they have all made my time here amazing: Nikos, Luis, Vladimir, Michail, Ioannis, Daniele, Annia, Shahab, Elke, Michiel, Raoul, Andrew, Samina, Pavlo, Jerom, Helder, Ballard, Mansoor, Babar, Ramiro and Niels. Special thanks go to Guus Pemen for his continuous support and to Annemarie for helping me with all the bureaucratic aspects (but also for inviting me to my first Dutch wedding). Particular thanks go to Michiel Nijhuis for translating the summary of this thesis. It has been a delight to collaborate with Nikolaos Paterakis and Daniele Bosich. For random inspiring discussions I thank Teresa Piovesan from CWI, Maria Torres Vega from the ECO group, Bahadir Saltik from the CS group, Iliana Pappi, and Haitham Bou Ammar (my master thesis supervisor and friend from Maastricht). Huge thanks to all my Romanian friends, in particular Roxi, Monica, Urlă, Cătălina, Mihai, Victor, Carmen, Vali, Laura and Bobo, for keeping me grounded.

The path to my doctorate certainly started well before I moved to the Netherlands. I would like to acknowledge Prof. Dumitru Melencu for teaching me (how to love) mathematics. His teaching was life-changing for me. Further on, my thanks go to Prof. Popa Marin for giving me the opportunity to work at a university and for sharing with me his interest in graph theory. From the same university, but from the Faculty of Physics, my special thanks go to Liviu Giurgiu and Minola Leonovici. I would also like to extend my thanks to my master thesis supervisor, Evgueni Smirnov from Maastricht University, for his continuous support throughout my PhD research. Special thanks go to Georgiana Ifrim from UCD Dublin for our discussions on interpretable machine learning, and for her guidance and trust during the last year.

This thesis is dedicated to my family. To my parents, Despina and Virgil, for their unconditional love and support, and for teaching me the wonderful things that come from hard work and the pursuit of knowledge. To my sisters, Emilia and Maria, for always being on my side. To my parents-in-law, Toia and Traian, for their trust and confidence, to my grandparents, and to Andrei and Alex, for all the support and good times over the years.

And to Dec, for keeping me grounded and laughing throughout the most difficult times of this journey, but also for our discussions on artificial intelligence, deep learning, and all kinds of possible architectures, which made my research what it is now.

Elena Mocanu

Summary

Scientific advancements based on electricity, as a way to transfer energy, are fundamental to our understanding of different types of complex networks. On the one hand, the principle of electricity has evolved through the largest interconnected network built by humans for delivering electricity: the electric grid. The energy system is complex, and for the industrialized and developing world energy is a commodity on which we all rely for our quality of life. An energy transition has been underway since the start of this millennium, consisting primarily of a push towards replacing large fossil-fuel plants with renewable and distributed generation. This ongoing transformation results in increased uncertainty and complexity, both in the business transactions and in the physical flows of electricity in the smart grid. On the other hand, part of the scientific community, relying on the same electricity concept, was able to build the first artificial (brain-like) network. Although this is controversial, part of this community argues, from a reductionist standpoint, that we are all electric machines. Although still under development, artificial intelligence has recorded remarkable success using increasingly complex neural networks. Both types of networks (the electricity grid and neural networks) are composed of many interconnected layers. From a theoretical perspective, this thesis tries to bring the future electric grid and artificial intelligence closer together for their mutual benefit.

Energy is a limited resource which faces additional challenges due to recent efficiency and de-carbonization goals worldwide. The ongoing transition of the electric grid towards a sustainable, efficient and flexible electricity network requires increasingly complex methods. Moreover, urbanization and electrification trends show that the total energy demand will increase in the future, while at the same time both the total electrical consumption in the world and the penetration of energy from renewable sources are increasing. Consequently, in this thesis we investigate the capabilities that future smart grids need in order to obtain a system that can monitor, predict, schedule, learn and make decisions regarding local energy consumption and production in real time. These challenging problems in smart grids were narrowed down to more fundamental research problems, such as: (1) how to obtain a more accurate prediction method; (2) how to find an optimal schedule when the learning task is performed online; and (3) how to learn multiple tasks in a more automatic way.

Prediction: As prediction developed, different sub-fields were created. The electrical demand forecasting problem can be regarded as a nonlinear time series prediction problem that depends on many complex factors, since forecasts are required at various aggregation levels and at high resolution. To solve this challenging problem, various time series and machine learning approaches have been proposed in the literature. As an evolution of neural network-based prediction methods, deep learning techniques are expected to increase the prediction accuracy by being stochastic and allowing bi-directional connections between neurons. Based on the available data, we explore and extend the supervised prediction methods and pioneer the use of unsupervised building energy prediction methods.

Supervised energy prediction: We propose three new deep learning methods for supervised energy prediction. The thesis details the mathematical derivation of the proposed learning methods as well as a comparison with state-of-the-art methods for energy prediction, such as artificial neural networks, recurrent neural networks, support vector machines, hidden Markov models and the persistence method. The methods are tested over different time horizons and at various resolutions. Two datasets are used to validate our proposed methods at the building level, while another dataset, collected on the Danish island of Bornholm within the EcoGrid EU project, is used to analyze the prediction accuracy at the aggregated level.

Unsupervised energy prediction: We introduce a new paradigm for building energy prediction, which does not require historical data from the specific building under scrutiny. In a unified approach, we successfully learn a building model by including a generalization of the state space domain, and then transfer it to other buildings. The contribution is two-fold. First, we present a Deep Belief Network for automatic feature extraction. Second, we extend two standard Reinforcement Learning algorithms that are able to perform knowledge transfer between domains (building models), namely the State-Action-Reward-State-Action (SARSA) algorithm and the Q-learning algorithm, by incorporating the states estimated with a Deep Belief Network. The newly proposed machine learning methods for energy prediction are evaluated over different time horizons and at different time resolutions using real data.
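For readers unfamiliar with the two reinforcement learning algorithms named above, the following is a minimal sketch of their standard tabular update rules, not the thesis implementation. It assumes a small discrete state and action space stored as a list of lists; in the thesis the states are estimated with a Deep Belief Network, which is not modelled here. The function names, the toy table and the default values of the learning rate `alpha` and discount factor `gamma` are illustrative assumptions.

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action available in s_next."""
    td_target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (td_target - Q[s][a])
    return Q


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken in s_next."""
    td_target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (td_target - Q[s][a])
    return Q


def copy_table(Q):
    """Return an independent copy of a Q-table stored as a list of lists."""
    return [row[:] for row in Q]


# Hypothetical toy table: 3 states, 2 actions (values chosen for illustration).
Q = [[0.0, 0.0], [1.0, 2.0], [0.0, 0.0]]

# Same transition (s=0, a=0, r=1.0, s_next=1) under both rules.
Q_ql = q_learning_update(copy_table(Q), s=0, a=0, r=1.0, s_next=1)
Q_sa = sarsa_update(copy_table(Q), s=0, a=0, r=1.0, s_next=1, a_next=0)

print(Q_ql[0][0], Q_sa[0][0])
```

The only difference between the two rules is the bootstrap term: Q-learning uses the maximum value in the next state, while SARSA uses the value of the action that the current policy actually selects there.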

Optimization: In the second part, the potential benefits of strategic optimization at the building and aggregated levels are presented. This work is motivated by the hypothesis that an optimal resource allocation of end-user patterns, based on the daily profiles of smart electrical devices, could be used to smoothly reconcile differences between future energy consumption patterns and the supply of variable sources such as wind and solar, leveraging valuable information to facilitate Demand Response and Demand Side Management programs. Furthermore, it is expected that a cost minimization problem could be solved to activate real-time price-responsive behavior.

Online building energy optimization: We introduce deep reinforcement learning for online building energy optimization. Two important problems are addressed: price reduction and energy minimization. Specifically, we propose two methods, Deep Q-learning and Deep Policy Gradient, which are extended to perform multiple actions simultaneously. The proposed approach was validated on the high-dimensional Pecan Street Inc. database in terms of accuracy, convergence and scalability. We have shown that these online strategies can be used to provide real-time feedback to consumers to encourage a more efficient use of electricity.

Analysis and quantification of building flexibility: Secondly, the building flexibility is quantified based on a data-driven approach. Within a unified framework, we divide the problem into four subproblems, namely energy disaggregation, flexibility identification, flexibility prediction and estimation of the optimal flexibility. To this end, an automatic solution for multi-task learning is presented. The main contribution is a fifth-order restricted Boltzmann machine that solves all four subproblems at once. The proposed architecture is compared with state-of-the-art solutions, such as classification methods for detection and deep learning methods for prediction.

To conclude, in this thesis we start by considering challenging problems in smart grids. These challenges lead us to make fundamental theoretical contributions in the areas of artificial intelligence and smart grids, building new bridges between them. In summary, by looking at the synergy between smart grids and complex neural networks, we have been able to advance the state-of-the-art prediction accuracy by proposing three new deep learning methods. Additionally, we have conceived a new class of unsupervised stochastic prediction methods. Having arrived at a reasonable predictive model, we have pioneered the use of Deep Reinforcement Learning for online building energy optimization. While most effort so far has been put into trying to quantify the building flexibility, we showed for the first time that it is possible to perform flexibility detection, prediction and planning in a unified method. To this end, the artificial intelligence methods were improved using high-order restricted Boltzmann machine paradigms in order to perform multi-task learning.

Samenvatting

Scientific progress based on electricity, as a form of energy, is fundamental to our understanding of different types of complex networks. On the one hand, the principle of electricity evolved through the largest interconnected electricity distribution network built by humans: the electricity grid. The energy system is complex, and for the industrialized and developing world energy is a commodity on which we all rely for our quality of life. An energy transition has been underway since the beginning of this millennium, consisting mainly of replacing large fossil-fuel power plants with renewable and decentralized generation. This transformation results in increased uncertainty and complexity, both in the business transactions and in the physical flow of electricity in the smart grid. On the other hand, science was able to build the first artificial (brain-like) network by making use of the same concept of electricity. Although it is regarded as controversial, part of the scientific community argues, from a reductionist standpoint, that we are all electric machines. Although it is still under development, artificial intelligence has already achieved remarkable successes by using increasingly complex neural networks. Both types of networks (the electricity grid and artificial neural networks) are composed of many interconnected layers. From a theoretical point of view, this thesis attempts to bring the future electric grid and artificial intelligence closer together, to the benefit of both.

Energy is available in limited supply. The energy domain now also faces additional challenges because of efficiency and environmental goals worldwide. The transition of the electricity grid to a sustainable, efficient and flexible electricity grid requires increasingly complex methods. Moreover, urbanization and electrification mean that the total demand for energy will increase in the future, while at the same time, as the total electricity consumption in the world rises, the penetration of energy from renewable sources also increases. Therefore, in this thesis we investigate methods that can be used in future smart grids, so that an electricity grid can be created that can measure, predict, plan, learn and make decisions about local energy consumption and production in real time. We have narrowed these challenging problems in enabling smart grids down to more fundamental research problems, such as: (1) how to obtain a more accurate prediction method; (2) how to make an optimal schedule and when to perform a learning task online; and (3) how to learn multiple tasks simultaneously in an automatic way.

Prediction: Various methods can be used to make predictions. Forecasting electricity demand is a nonlinear time series prediction problem that depends on many complex factors, since results are needed at different aggregation levels and at high resolution. To solve this challenging problem, various time series and machine learning approaches have been proposed in the literature. As an evolution of neural-network-based prediction techniques, deep learning techniques are expected to increase the prediction accuracy through the stochastic bi-directional connections between the neurons. Based on the available data, we explore and improve the supervised prediction techniques and pave the way for the use of unsupervised prediction techniques.

Supervised energy prediction: We propose three new deep learning techniques for supervised energy prediction. The thesis describes the mathematical derivation of the proposed deep learning techniques. A comparison is made with state-of-the-art techniques for predicting energy demand, such as artificial neural networks, recurrent neural networks, support vector machines, hidden Markov models and persistence models. The methods are tested for different time horizons and for different resolutions. Two datasets are used to validate our proposed techniques at the building level, and a dataset collected on the Danish island of Bornholm in the EcoGrid EU project is used to analyze the prediction accuracy at a higher aggregation level.

Unsupervised energy prediction: We introduce a new approach for predicting the energy consumption in a building. This approach does not require historical data from the specific building. In an integrated approach, we can successfully learn a model of a building by incorporating a generalization of the state of a building. We can then transfer this state to another building. The contribution is twofold. First, we present a Deep Belief Network for automatic feature extraction, and second, we extend two standard Reinforcement Learning algorithms that can perform knowledge transfer between domains (building models), namely the State-Action-Reward-State-Action (SARSA) algorithm and the Q-learning algorithm, by incorporating states estimated with a Deep Belief Network. The newly proposed machine learning methods for energy prediction are evaluated for different time horizons and for different time resolutions using measured data.

Optimization: In the second part, the potential benefits of strategic optimization at the building level and the aggregated level are presented. An optimal allocation of energy consumption, by means of smart electrical devices that can predict their daily consumption pattern, can be used to minimize the differences between consumption and the supply of variable energy sources such as wind and solar energy. It is expected that in the future a cost minimization problem can be solved to activate real-time price-responsive behavior.

Online building energy optimization: We introduce a deep reinforcement learning technique for online energy optimization. With it, two important goals are achieved: a lower energy price and lower energy consumption. Specifically, we propose two methods, Deep Q-learning and Deep Policy Gradient, which are extended to perform multiple actions simultaneously. The accuracy, convergence and scalability of the proposed methods were validated on the high-dimensional Pecan Street Inc. database. We have shown that these online strategies can be used to give real-time feedback to consumers so that they can use electricity more efficiently.

Analysis and quantification of building flexibility: Secondly, the flexibility of the building is quantified using a data-driven approach. Within a unified framework, we divide the problem into four subproblems, namely energy disaggregation, flexibility identification, flexibility prediction and estimation of the optimal flexibility. To this end, we developed a multi-task learning method that can solve all four subproblems simultaneously. This method consists of a fifth-order restricted Boltzmann machine. The proposed architecture is compared with state-of-the-art solutions, such as classification methods for detection and deep learning methods for prediction.

In summary, in this thesis we started from challenging problems in smart grids. These challenges led us to fundamental theoretical contributions in the fields of artificial intelligence and smart grids, building new bridges between the two fields. By looking at the synergy between smart grids and complex neural networks, we have been able to advance the state-of-the-art prediction accuracy by proposing three new deep learning methods. In addition, we developed a new class of unsupervised stochastic prediction methods. Having built a good predictive model, we were the first to use Deep Reinforcement Learning for online building energy optimization. While most efforts so far have been devoted to quantifying the flexibility of a building, we showed for the first time that flexibility detection, prediction and planning can be performed simultaneously in one integrated method. To this end, the artificial intelligence techniques were improved by using high-order restricted Boltzmann machine paradigms to enable multi-task learning.

List of Abbreviations

Abbreviation Description

AI Artificial Intelligence

AB AdaBoost

AMI Advanced Metering Infrastructure

ANN Artificial Neural Network

ANN-NAR Artificial Neural Network (with one time series as input)

ANN-NARX Artificial Neural Network (with more time series as input)

BEMS Building Energy Management System

BM Boltzmann Machine

CD Contrastive Divergence

CRBM Conditional Restricted Boltzmann Machine

DBM Deep Boltzmann Machine

DBN Deep Belief Network

DER Distributed Energy Resources

DFFW-CRBM Disjunctive Factored Four-Way CRBM

DL Deep Learning

DKL Kullback-Leibler Divergence

DMS Demand-side Management

DNN Deep Neural Network

DPG Deep Policy Gradient

DQL Deep Q-Learning

DR Demand Response

DRL Deep Reinforcement Learning

DS Distributed Storage

DSO Distribution System Operator

FCRBM Factored Conditional Restricted Boltzmann Machine

FFW-CRBM Factored Four-Way Conditional Restricted Boltzmann Machines

FFive-RBM Factored Five-Way Conditional Restricted Boltzmann Machines

Five-RBM Five-Way Conditional Restricted Boltzmann Machines

GMM Gaussian Mixture Model

GRBM Gaussian Restricted Boltzmann Machine

HMM Hidden Markov Model

HVAC Heating, Ventilating, and Air-Conditioning

IoT Internet of Things

KNN k-Nearest Neighbors

MDP Markov Decision Process

ML Machine Learning

NB Naïve Bayes

NRMSE Normalized root-mean-square error

PCC Pearson Correlation Coefficient

POMDP Partially Observable Markov Decision Process

RBM Restricted Boltzmann Machine

RES Renewable Energy Sources

RL Reinforcement Learning

RMSE Root-mean-square error

RNN Recurrent Neural Networks

SARSA State-Action-Reward-State-Action

SVM Support Vector Machine

TL Transfer Learning

TSO Transmission System Operator

Special Symbols

Symbol Description

N, R The sets of natural and real numbers

∑, ∏, ∫ Sum, product, and integral

p[·], p[·], P[·] Probability value/vector/matrix

p(a|b) The conditional probability of a given b

N(µ, σ²) The normal distribution with mean µ and standard deviation σ

E[·] Expected value operator

∇f, ∂[·] The gradient of f; the partial derivative of [·]

∝ Proportionality

A ⊗ B Tensor product (Kronecker product) of matrices A and B

List of Notations

Chapter 2 Description

n_u represents the index of the last history neuron (input)

n_v represents the index of the last visible neuron (output)

n_h represents the index of the last hidden neuron

u = [u_1, . . . , u_{n_u}] represents a vector with all history neurons, u ∈ R

v = [v_1, . . . , v_{n_v}] is a vector collecting all visible units v_i, v ∈ R

h = [h_1, . . . , h_{n_h}] is a vector collecting all the hidden units h_j, h ∈ {0, 1}

Wvh ∈ R^{n_h×n_v} represents the matrix of all weights connecting v and h

Wuv ∈ R^{n_u×n_v} represents the matrix of all weights connecting u and v

Wuh ∈ R^{n_u×n_h} represents the matrix of all weights connecting u and h

bh ∈ R^{n_h} represents the biases for hidden neurons

bv ∈ R^{n_v} represents the biases for visible neurons

τ represents the iteration

Chapter 3 Description

t Time

D Dataset

X The training set, ∀X ∈ D

M Building energy consumption model

S The set of states, ∀s ∈ S.

A The set of actions, ∀a ∈ A.

T Transition probability function, T : S × A × S → [0, 1].

R The reward function

Q The quality matrix, Q : S × A → R

π represents the policy

V represents the value function, V^π : S → R

τ represents the iteration

α Learning rate, ∀α ∈ [0, 1]

γ Discount factor, ∀γ ∈ (0, 1)

v Vector collecting all visible units, v_i ∈ R

h Vector collecting all the hidden units, h_j ∈ {0, 1}

Wvh Matrix of all weights connecting v and h

E Total energy function in the RBM model

Z Normalization function

k The number of hidden layers in DBN

Chapter 4 Description

T Time horizon

D Dataset

X The training set, ∀X ∈ D

B Set of buildings

i index of buildings, such that B_i ∈ B, ∀i ∈ N

d index of electrical devices, d ∈ {1, ..., m_i}

P^+ Power generation

P^− Power consumption

P_d^− Power consumption per device

λ_t^+ The price value of P^+

λ_t^− The price value of P^−

S The set of states, ∀s ∈ S.

A The set of actions, ∀a ∈ A.

T Transition probability function, T : S × A × S → [0, 1]

R The reward function, R : S × A × S → R

Q The quality matrix, Q : S × A → R

π represents the stochastic policy, π : S × A × R → R+

θ represents the set of parameters in Deep Belief Network

k represents the number of layers in Deep Belief Network

η represents a coefficient controlling the activation function

τ represents a trajectory, τ = (s_t, a_t, r_t)

g, \hat{g} represents the gradient and the estimated gradient

ζ_1, ζ_2 represents the coefficients controlling the reward function

Chapter 5 Description

t Time

D Dataset

B Set of buildings

n represents the number of buildings

N The number of history frames

d Electrical device consumption

\hat{d} Estimated electrical device consumption

PoA_d The period-of-activation for every device d

\widehat{PoA}_d The estimated period-of-activation for every device d

π^* Optimal flexibility allocation

\hat{π}^* Estimated optimal flexibility

C Classifier, C : D → B

n_v represents the index of the last visible neuron (output)

n_h represents the index of the last hidden neuron

v = [v_1, . . . , v_{n_v}] represents the visible layer in RBM, v ∈ R

h = [h_1, . . . , h_{n_h}] represents the hidden layer in RBM, h ∈ {0, 1}

Wvh ∈ R^{n_h×n_v} represents the matrix of all weights connecting v and h

a ∈ R^{n_v} represents the biases for visible neurons

b ∈ R^{n_h} represents the biases for hidden neurons

E(v, h) The total energy function in RBM

θ represents the model parameter, θ = [W, a, b, τ, α, ρ, ξ]

τ represents the update number

α represents the learning rate

ρ represents the momentum

ξ represents the weights decay

A represents the confusion matrix

v_{<t} represents the history layer in FFW-CRBM

v represents the visible layer in FFW-CRBM (output)

h represents the hidden layer in FFW-CRBM

l represents the label layer in FFW-CRBM (output)

W_{ijko} The fourth-order tensor, W_{ijko} ∈ R^{n_v×n_h×n_{v<t}×n_l}

E_{FFW-CRBM} The energy of FFW-CRBM, E(v_t, h_t, l_t | v_{<t}, Θ)

v represents the history layer in Five-RBM

h represents the hidden layer in Five-RBM

o(1) represents the label layer in Five-RBM (output 1)

o(2) represents the visible layer in Five-RBM (output 2)

o(3) represents the estimation layer in Five-RBM (output 3)

W_{ijklm} The fifth-order tensor, W_{ijklm} ∈ R^{n_{o(1)}×n_h×n_v×n_{o(2)}×n_{o(3)}}

E_{Five-RBM} The energy of Five-RBM, E(o(1), h, o(2), o(3) | v, Θ)

a, b, c, b Biases for the present, hidden, label and estimated layers

Acknowledgments

Summary

Samenvatting

Notations

1. Introduction

1.1. Challenges and opportunities in the electric grid

1.1.1. The Smart Grid concept

1.1.2. The synergy between Smart Grids and Smart Buildings

1.2. Machine Learning

1.3. Research Questions and Objectives

1.4. Thesis Contribution and Outline

Part 1. Prediction

2. Supervised energy prediction

2.1. Introduction

2.1.1. Deep Learning

2.1.2. Restricted Boltzmann Machine

2.2. Energy prediction at building level

2.2.1. Conditional Restricted Boltzmann Machines

2.2.2. Experiments and results

2.3. Energy prediction under different time horizon

2.3.1. Factored Conditional Restricted Boltzmann Machine

2.3.2. Experiments and results

2.4. Energy prediction in a price-responsive context

2.4.1. Fusion architecture based on Restricted Boltzmann Machine

2.4.2. Experiments and results

2.5. Summary

3. Unsupervised energy prediction

3.1. State-of-the-art

3.2. Problem Formulation

3.3. Reinforcement Learning

3.3.2. SARSA

3.4. States estimation via Deep Belief Networks

3.4.1. Deep Belief Networks

3.5. Numerical Results

3.6. Summary

Part 2. Optimization

4. On-line Building Energy Optimization

4.1. Introduction

4.2. Resource allocation problem

4.3. Background and Preliminaries

4.4. Deep reinforcement learning

4.4.1. Deep Q-learning (DQN)

4.4.2. Deep Policy Gradient (DPG)

4.5. Implementation details

4.6. Results and Discussion

4.6.1. Numerical results - Peak reduction problem

4.6.2. Numerical results - Cost minimization problem

4.6.3. Scalability and learning capabilities of DRL

4.7. Summary

5. Analysis and quantification of building flexibility

5.1. Introduction

5.2. Flexibility identification

5.2.1. Classification using first-order Restricted Boltzmann Machine

5.2.2. Experiments and Results

5.3. Flexibility identification and prediction

5.3.1. Four-way Conditional Restricted Boltzmann Machines

5.3.2. Experiments and Results

5.4. Flexibility estimation using multi-task learning

5.4.1. Five-way conditional restricted Boltzmann machine

5.4.2. Factored five-way conditional restricted Boltzmann machine

5.4.3. Experiments and Results

5.5. Summary

6. Conclusions and discussions

6.1. Results and discussions

6.1.1. Thesis contributions

6.1.2. Limitations

6.2. Recommendations for future research

Appendix A Gaussian Mixture Model

Bibliography

List of Tables

List of Publications

Curriculum Vitae


Introduction

Scientific advancements based on the concept of electricity are fundamental to our understanding of different types of complex networks. On the one hand, the principle of electricity evolved through the largest interconnected network built by humans for delivering electricity: the electric grid. The energy system is complex, and for the industrialized and developing world energy is a commodity on which we all rely for our quality of life [1]. An energy transition has been underway since the start of this millennium, consisting primarily of a push towards replacing large fossil-fuel plants with renewable and distributed generation. This ongoing transformation results in increased uncertainty and complexity both in the business transactions and in the physical flows of electricity in the smart grid. On the other hand, part of the scientific community relying on the concept of electricity was able to build the first artificial (brain-like) network. Although controversial, part of this community argues, in the light of a reductionist approach, that we are all electric machines [2]. In general, the AI community tries to understand and develop more comprehensive models of the human brain [3]. Nowadays, artificial intelligence has achieved remarkable success using increasingly complex neural networks, an approach called deep learning [3–5].

Both types of networks (i.e. the electricity grid and neural networks) are composed of many interconnected layers. From a theoretical perspective, this thesis tries to bring closer the future electric grid and the concept of artificial intelligence, as shown in Figure 1.1. For example, a large-scale smart grid database could be optimized overall through an infusion of computational intelligence [6], while the spread of information in a complex network could be analyzed using ideas inspired by current-flow concepts [7]. In this thesis, special attention is given to on-line and autonomous solutions that aim to perform multi-task learning in the smart grid context, as depicted in Figure 1.1.

1.1. Challenges and opportunities in the electric grid

Energy is a limited resource which faces additional challenges due to recent efficiency and de-carbonization goals worldwide. The current transition towards a sustainable, efficient and flexible electricity network requires increasingly complex methods for planning and operating the grid. This ongoing transformation results in increased uncertainty and complexity both in the business transactions and in the physical flows of electricity in the smart grid.

1.1.1. The Smart Grid concept

The Smart Grid is a complex system comprising different subsystems at different levels of aggregation. It facilitates a bi-directional energy flow accompanied by a bi-directional information flow among all the actors, such as producers, end-users, transmission and distribution system operators (TSO/DSO) and demand response (DR) aggregators. The amount of flexibility generated through the active participation of consumers in demand response, using direct control or dynamic pricing incentives, is still an open problem in the future Smart Grid. Optimizing distributed assets is expected to open the market for advanced energy management, such as building energy management systems (BEMS). Furthermore, the Smart Grid has to find solutions that allow the incorporation of distributed energy resources (DER), including distributed storage (DS) and renewable energy sources (RES), at all voltage levels [8, 9]. Integration of information given by deployed sensors, smart meters and smart appliances will increase: 1) the monitoring capabilities over the direction and the amount of power flow, 2) monitoring of the locations and shapes of the DER's generation patterns, and 3) monitoring of DER evolution over time in the system [10]. Another important aspect of smart grids that has to be considered is their real-time self-healing capability. Various Smart Grid concepts have been proposed to handle these challenges. Major solutions are based on the synergy of traditional power system planning and operation practices with the latest developments in information and communication technology. However, many of these issues are solved separately, by different methods, in different areas of the world. At this point, there is neither a universally accepted definition of the "Smart Grid" concept nor a common language for developing it. In 2012, the European Committee for Standardization (CEN), the European Committee for Electrotechnical Standardization (CENELEC) and the European Telecommunications Standards Institute (ETSI) published a general framework, the Smart Grid Architecture Model (SGAM) [11]. In addition, a Universal Smart Energy Framework (USEF) has been developed in the Netherlands to provide a guideline for a common market reference model for all the actors involved in the system, from both a technology and an implementation point of view [12].


Moreover, urbanization and electrification trends show that the total energy demand (for all carriers) will increase in the future [13], while at the same time the share of electricity in this world-wide energy demand is increasing, as well as the penetration of energy from renewable sources. Electricity generation and demand must be balanced over various temporal and spatial scales. Therefore, future smart grids need a system that can monitor, predict, schedule, learn and make decisions regarding (local) energy consumption and production in real-time.

1.1.2. The synergy between Smart Grids and Smart Buildings

Because the built environment is currently the largest consumer of electricity, a deeper look at building energy consumption holds promise for helping to achieve overall optimization of the energy system. Next to the transport sector, buildings are critical in achieving overall energy efficiency and a reduction in carbon footprints [14]. In performing this function, automation systems play a key role, especially in large, non-residential buildings [15]. Initial automation in buildings was geared towards maintaining comfort and safety. As energy efficiency issues became paramount and information, communication and control systems became more advanced, further improvements have enabled dedicated energy management systems, distributed decision making and coordination inside and among buildings [16].

The complexity of building physics and the uncertain factors influencing local electricity production and demand make optimal building energy management very hard to describe with a proper mathematical formalism. Variations in energy flows are caused by weather conditions, the use of appliances and equipment, the occupancy patterns and the occupants' comfort preferences, the efficiency of components such as lighting, appliances and HVAC (Heating, Ventilation, and Air-Conditioning) systems, and so on. See for example [17] and [18] for an overview of the complexity of the building and its environment.

This work aims to show that production variability due to RES such as wind and solar can be matched on a daily basis by optimally scheduling smart electrical devices at the end-user side. Furthermore, it is expected that a cost minimization problem can be solved to activate real-time price-responsive behavior of such end-users [19]. The outcomes of this research will provide valuable information to facilitate Demand Response (DR) programs. A wide range of methods has been proposed to solve the above optimization problems, but they usually fail to consider on-line solutions for large-scale, real data [20].

A summary of the Scopus-indexed publications aiming to optimize building energy consumption is depicted in Figure 1.2. Zooming in on the methods used in the past five years, we can observe an increasing trend for prediction methods, swarm intelligence methods (e.g. genetic algorithms, particle swarm, ant colony optimization and other heuristic methods) as well as methods based on dynamic game theory. One promising technology for solving multi-actor, multi-objective, multi-time-period optimization problems is Multi-Agent Systems (MAS) [21–24]. MAS is a sub-area of artificial intelligence that focuses particularly on agent interactions and the technologies that contribute to such interactions.

Figure 1.2 – BEMS: a summary of the Scopus-indexed publications with a focus on building energy optimization over the years (i.e. 1972-2016), including a zoom on the years 2011-2016. [Plot: number of publications versus time in years; legend entries: prediction, linear programming, dynamic programming, stochastic programming, genetic algorithms, particle swarm optimization, ant colony optimization, other heuristics, gradient methods, multi-agent system, game theory, robust optimization, model predictive control, fuzzy methods, other methods.]

Specifically for applications in buildings, researchers have employed MAS to address the challenges of adaptively controlling an HVAC system in order to minimize cost [25, 26], and to compute energy-efficient schedules for the optimal use of limited energy resources [23, 24].

On the one hand, complementing these prior studies, this thesis focuses on developing algorithms (agents) able to autonomously perform multi-task learning. On the other hand, we aim to move from traditional off-line optimization methods to on-line solutions. Enabling real-time applications from the high level of aggregation in the smart grid will put end-users in a position to change their consumption patterns, offering useful benefits for the system as a whole. Moreover, understanding individual consumption behavior and the ways in which consumers use energy is an essential step towards optimizing building energy consumption and, consequently, its effects on the grid.

1.2. Machine Learning

In the literature, machine learning methods are classified, based on the available data and the expected learning tasks, into three general categories: supervised learning, semi-supervised learning and unsupervised learning. Prior studies in the smart grid context generally focused on the use of supervised methods. In this thesis we explore the ability of several learning methods to prefigure the intelligent buildings and smart grids of the future. Hence, new concepts are conceived based on deep learning, transfer learning and deep reinforcement learning in order to optimize the connection between flexible buildings and smart grids.

Deep Learning: While deep learning has had a huge success in past years, the reasons for this success remain elusive [4]. We start by putting our proposed methods in the context of recent developments, and we argue that a topological investigation of these deep learning architectures based on the order of tensor factorization, as depicted in Figure 1.3, could highlight two general trends in this research field. By increasing the number of layers in the well-known artificial neural networks, we arrive at what are nowadays called Deep Neural Networks (i.e. Deep Belief Networks and ANNs), the principal direction in deep learning (Figure 1.3, blue shadow).

Figure 1.3 – Deep Learning: a schematic representation of a) high-order restricted Boltzmann machine architectures and b) their corresponding factorization. [Diagram: methods arranged by order of tensor connection (1st to 5th), number of hidden layers and total number of layers (1 to 5); labeled methods: Perceptron, ANN, RBM, CRBM, FCRBM, FW-CRBM, Five-way CRBM, Deep Belief Network, Deep Boltzmann Machine.]

Nevertheless, based on the Restricted Boltzmann Machine (RBM) architecture, a second direction using higher-order tensor factorization arises. Overall, each point highlighted in Figure 1.3 is a method, or a class of methods, that will be detailed separately throughout this thesis. For example, we used the Conditional Restricted Boltzmann Machine (CRBM) and the Factored Conditional Restricted Boltzmann Machine (FCRBM) for supervised energy prediction, while the generalization capabilities of the RBM are used for feature extraction in order to reduce dimensionality and thus increase computational efficiency.

Reinforcement and transfer learning: these concepts are extended with deep learning methods for domain adaptation, such as Deep Belief Networks for continuous state estimation. In smart grids, and more specifically in smart buildings, the electrical patterns are generated by different factors, almost independently. This is what allows us to learn about the effect of one of the factors without having to know everything about all the other factors and their exponentially large number of interactions. This assumption also arises naturally if we first assume something that appears very straightforward: the input data we observe are the effects of some underlying causes, and these causes are marginally related to each other in simple ways (e.g. independent causes being the extreme situation). The predictions are more directly connected to causes, whereas the inputs (i.e. the observations) can be seen as effects. This assumption also suggests that unsupervised pre-training and semi-supervised learning of representations will work well when there is not enough labeled data for supervised learning.

Deep reinforcement learning: this state-of-the-art machine learning method [5], able to use information dynamically in an on-line manner, is introduced for building energy optimization.

Multi-task learning: first, we envisioned a model for the 5th-order connection of a conditional RBM as part of the red shadow area in Figure 1.3. Second, the 5th-order weight tensor connection is factored, and the mathematical derivation of the resulting new method is presented. This model structure makes it possible to learn more flexible correlations. Finally, we demonstrate on both synthetic and real-world databases that our factored model achieves better generative capabilities than the existing 4th-order connection model, at no additional computational cost.
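To illustrate why factoring a high-order weight tensor matters, the sketch below compares the number of free parameters of a full five-way tensor with that of a factored form in which each entry is a sum, over F factors, of products of one matrix entry per layer. This particular factorization follows the usual construction for factored conditional RBMs and is an assumption here, since the exact form is derived later in the thesis; the function names, the layer sizes and the number of factors are made-up values for illustration only.

```python
from math import prod


def full_tensor_params(sizes):
    """Parameters of an unfactored weight tensor with one axis per layer."""
    return prod(sizes)


def factored_tensor_params(sizes, n_factors):
    """Parameters when the tensor is replaced by one (size x F) matrix per layer."""
    return n_factors * sum(sizes)


def factored_entry(matrices, index):
    """Rebuild one tensor entry: sum over factors of the product across layers."""
    n_factors = len(matrices[0][0])
    return sum(prod(m[i][f] for m, i in zip(matrices, index))
               for f in range(n_factors))


# Hypothetical layer sizes (n_o1, n_h, n_v, n_o2, n_o3) and factor count.
sizes = (4, 10, 20, 20, 1)
n_factors = 8
full = full_tensor_params(sizes)                      # 4 * 10 * 20 * 20 * 1
factored = factored_tensor_params(sizes, n_factors)   # 8 * (4 + 10 + 20 + 20 + 1)
```

With these made-up sizes the full tensor has 16000 weights while the factored form has 440, so the factored parameter count grows with the sum rather than the product of the layer sizes.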

1.3. Research Questions and Objectives

In this thesis we investigate some of the capabilities of future smart grids, focusing on the interface between buildings and grids, in order to create a system that can monitor, predict, schedule, learn and make decisions regarding local energy consumption and production in real time. These challenging problems are narrowed down to the following fundamental research questions:

• How to obtain a more accurate energy prediction method?

• How to derive an optimal resource schedule through on-line learning?

• How to learn multiple tasks in a more automatic way?

To address these research questions in the smart grid context, the following objectives have been identified and addressed in this thesis:

◦ Investigate the existing machine learning methods for energy prediction.

◦ Develop new methods able to decrease the energy prediction error.

◦ Investigate the existing methods for building resource allocation.

◦ Investigate the potential of reducing the energy costs at the building level.

◦ Understand building energy patterns and perform automatic extraction.

◦ Develop a method to dynamically quantify building energy flexibility.

1.4. Thesis Contribution and Outline

This thesis contains five further chapters: four research chapters and a final chapter that summarizes the research findings and presents recommendations for future research. The research chapters are split into two parts.

Part I (Chapters 2 & 3): As the field of prediction developed, different sub-fields were created. The electrical demand forecasting problem can be regarded as a nonlinear time series prediction problem depending on many complex factors, since it is required at various aggregation levels and at high resolution. To solve this challenging problem, various time series and machine learning approaches have been proposed in the literature. These range from heuristic-based approaches to mathematically grounded ones, such as those residing in the realm of Machine Learning (ML). As an evolution of neural network-based prediction methods, deep learning techniques are expected to increase the prediction accuracy by being stochastic and allowing bi-directional connections between neurons [27, 28]. Based on the available data, we explore and extend the supervised prediction methods in Chapter 2 and pioneer the use of unsupervised building energy prediction methods in Chapter 3.

In Chapter 2, we propose three deep learning methods for supervised energy prediction. Overall, the chapter details the mathematical derivation of the proposed learning methods and compares them to state-of-the-art methods for energy prediction, such as artificial neural networks, recurrent neural networks, support vector machines, hidden Markov models and the persistence method. The methods are tested under different time horizons using various resolutions. Two datasets are used to validate our proposed methods at the building level: one provided by the TKI Swich2SmartGrid project from a typical Dutch building, part of the Kropman Installatietechniek BV office buildings, and an online database available on the UCI Machine Learning Repository [29]. A third dataset, collected from the Danish island of Bornholm within the EcoGrid EU project, was used to analyze the prediction accuracy at the aggregated level.

Chapter 3 introduces a new paradigm for building energy prediction, which does not require historical data from the specific building under consideration. In a unified approach, we can successfully learn a building model by including a generalization of the state space domain, and then transfer it to other buildings. The contribution is two-fold. First, we present a Deep Belief Network for automatic feature extraction. Second, we extend two standard Reinforcement Learning algorithms, namely the State-Action-Reward-State-Action (SARSA) algorithm and the Q-learning algorithm, so that they can perform knowledge transfer between domains (building models) by incorporating the states estimated with the Deep Belief Network [30]. The proposed machine learning methods for unsupervised energy prediction are evaluated over different time horizons with different time resolutions using real data.
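The two reinforcement learning algorithms named here differ only in how they bootstrap from the next state. The sketch below shows the standard tabular update rules, not the extended versions developed in Chapter 3 (which additionally use states estimated by a Deep Belief Network); the function names, the dictionary-based Q-table, the state and action labels, and the values of the learning rate `alpha` and discount factor `gamma` are illustrative assumptions.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in s_next."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


# Hypothetical two-action problem with made-up state names and values.
actions = ["low", "high"]
Q_ql = defaultdict(float)
Q_sarsa = defaultdict(float)
for table in (Q_ql, Q_sarsa):
    table[("s1", "low")] = 1.0
    table[("s1", "high")] = 2.0

# Same transition for both; SARSA is told that "low" was chosen in s1.
q_learning_update(Q_ql, "s0", "low", r=1.0, s_next="s1", actions=actions)
sarsa_update(Q_sarsa, "s0", "low", r=1.0, s_next="s1", a_next="low")
```

On this single transition Q-learning uses the larger value of the next state (2.0) while SARSA uses the value of the action that was actually followed (1.0), so the two tables diverge even though the experience is identical.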

Part II (Chapters 4 & 5): In the second part, the potential benefits of strategic optimization at the building and aggregated levels are investigated. A comparative study of various heuristic methods (e.g. genetic algorithms, particle swarm optimization, quantum particle swarm optimization, min-max algorithm) used for building resource allocation, as well as an analysis of centralized versus decentralized and cooperative versus non-cooperative approaches (e.g. Nash n-player game, majority voting, Q-learning and extended joint action learning), are not included in this thesis but are a result of this doctoral study and are presented in [31–33].

Chapter 4 introduces deep reinforcement learning for online building energy optimization. Therein, two important problems are addressed: cost reduction and energy minimization. Specifically, we propose two methods, Deep Q-learning and Deep Policy Gradient, which are extended to perform multiple actions simultaneously. The proposed approach was validated on the high-dimensional Pecan Street Inc. database in terms of accuracy, convergence and scalability.

In Chapter 5, building flexibility is analyzed and quantified based on a data-driven approach using high-order Restricted Boltzmann Machines (RBMs). Within a unified framework, we divide the problem into four subproblems, namely energy disaggregation, flexibility identification, flexibility prediction and estimation of optimal flexibility. The proposed methods focus on autonomous agents aiming to handle multiple tasks simultaneously. Our main contribution is the first agent able to perform detection, prediction and estimation of optimal flexibility simultaneously, using a fifth-order restricted Boltzmann machine. The proposed architecture is compared with state-of-the-art solutions, such as classification methods for detection and deep learning methods for prediction.

Based on the previous chapters, conclusions and recommendations are formulated in Chapter 6.

A Guide for the Reader: The chapters of this thesis are generally self-contained, and there is no need to read all chapters successively. Figure 1.4 gives an overview of the main topics studied in this thesis in connection with their corresponding chapters.

[Figure 1.4: diagram of the thesis structure. Labels: Introduction; Player (TSO, DSO, end-user); Data; Methods; different time horizon; different resolution; electric signals; influencing factors; supervised; unsupervised; stochastic decision making; multi-task learning; system oriented; market oriented; building level; aggregated level; PART I Prediction (Chapter 2&3); PART II Optimization (Chapter 4&5); Conclusions and discussions (Chapter 6).]

Supervised energy prediction

Abstract: Deep Learning has become one of the most important trends in Machine Learning, with a wide range of applications. In this chapter, we introduce for the first time its application to energy prediction by proposing three Deep Learning methods. Firstly, a new stochastic model for time series prediction of total building energy consumption and lighting load, namely the Conditional Restricted Boltzmann Machine (CRBM), is introduced in Section 2.2. The assessment is made on a real dataset consisting of 7 weeks of electricity consumption collected with hourly resolution from a Dutch office building. The results of CRBMs are compared with the well-known Artificial Neural Networks (ANNs) and Hidden Markov Models (HMMs). Secondly, we investigate a new deep learning method, namely the Factored Conditional Restricted Boltzmann Machine (FCRBM), and extend its applicability to the energy prediction problem. The assessment is made on a benchmark dataset consisting of almost four years of electric power consumption data with one-minute resolution, collected from an individual residential customer. The results show that for this database, FCRBM outperforms ANN, Support Vector Machine (SVM), Recurrent Neural Networks (RNN) and CRBM. Thirdly, in Section 2.4 we propose a new fusion model for time series prediction of energy consumption in a price-responsive context, based on the Restricted Boltzmann Machine and FCRBM. The assessment is made on the EcoGrid EU dataset consisting of aggregated electric power consumption, price and meteorological data collected from 1900 household customers. The results show that for the energy prediction problem solved here, FCRBM outperforms SVM and the persistence method.

2.1. Introduction

The need to predict an uncertain event (or series of events) drives the scientific community to search continuously for ever more accurate methods. In an attempt to determine which approaches are the most popular, and to place our proposed methods in the existing literature, Figure 2.1 gives a short overview of the evolution of the machine learning methods applied to prediction in general, with a zoom on the last decade. This is a starting point with respect to the state of the art, and a glimpse into future trends in prediction, with a focus on electricity prediction.

Overall, perhaps the most investigated machine learning methods are based on Neural Networks (NNs) and their variations (e.g. Artificial Neural Networks (ANN), Recurrent Neural Networks (RNN), Deep Neural Networks (DNN) or Deep Belief Networks (DBN)). It is worth mentioning that, up to 2016, the collection of publications indexed by Scopus and related to NNs numbers more than one million. From this, we can observe that Deep Learning models (including DNNs) represent the most important trend in recent years.

Figure 2.1 – Prediction: a summary of the Scopus-indexed publications with a focus on prediction over the last century (i.e. 1909-2015), including a zoom on the years 2005-2015. [Plot: number of publications versus time in years; legend entries: electricity prediction, NN, ANN, RNN, DNN, DL, SVM.]

From a more specific perspective, a short bibliometric analysis of the collection of publications related to the electricity prediction problem is performed using specialized queries on the Scopus database. With a focus on energy prediction, there are 10071 publications in the last decade, of which 1605 were published in 2016. On the one hand, Figure 2.2 shows the distribution of this existing literature by type of publication (conferences, articles, reviews and others). On the other hand, using specific queries we classified these papers into machine learning and non-machine learning methods.

Figure 2.2 – Electricity prediction: a summary of the Scopus-indexed publications with a focus on electricity prediction in the years 2007-2016. [Plot: number of publications per year, split by method (NN, NN and SVM, SVM, non-ML) and by publication type (conferences, articles, reviews, other types).]

Specifically, over the last 10 years, 3042 publications use Neural Networks, 1041 use Support Vector Machines (SVM), and 907 use both methods simultaneously. Despite this large number of publications, electricity prediction remains a thorny issue that has to take into account a number of energy pattern peculiarities.


Energy prediction particularities: The complexity of the consumers' energy-producing and energy-consuming technologies and the uncertainty in the influencing factors yield frequent fluctuations. Nowadays, commercial, industrial and residential buildings account for a very large share of global energy use. Moreover, urbanization and electrification trends show that the total energy demand will increase in the future, and the penetration of energy from renewable sources is increasing as well. Therefore, the electrical demand forecasting problem, at various aggregation levels, can be regarded as a highly nonlinear and non-stationary time series prediction problem [28, 34–36]. Tang et al. [37] summarize a comprehensive list of data characteristics of energy time series, such as stationarity (and non-stationarity), linearity (and nonlinearity), complexity, chaotic property, fractality, regularity (and irregularity), cyclicity, seasonality, saltation (or mutability), randomness and so on. Therein, the characteristics are split into two types, i.e. nature and pattern characteristics, which analyze energy time series data from different perspectives. The first type, given by the nature, refers to a series of components, i.e. trend, cyclical, seasonal, saltatory and noisy patterns. The pattern refers to the ability of a prediction method to extract coexisting hidden patterns from data. In this thesis, the following three energy pattern characteristics are considered:

(1) Prediction horizon and resolution: Prediction of temporal energy consumption enables building operators to schedule the energy usage over time, shift energy usage to off-peak periods, and make more effective energy purchase plans. From this perspective, demand forecasting can be considered to fall into three categories: (i) short-term forecasts usually apply to intervals ranging from one hour to one week, (ii) medium-term forecasts usually range from one week to one year, and (iii) long-term forecasts cover ranges longer than one year. The predictions performed in this chapter are restricted to the short-term and medium-term intervals, with a special focus on day-ahead energy prediction with various resolutions.

Traditionally, the short-term prediction problem refers to resolutions of 1 hour and 15 minutes, but higher resolutions make the problem even more complicated. The proposed methods are validated using three datasets specially selected for their differing granularity. In Section 2.3 we use data with one-minute resolution, while in Section 2.4 the analysis is performed using data with 5-minute resolution. A summary of all scenarios investigated in this chapter is depicted in Figure 2.3.

(2) Level of aggregation: In the Smart Grid context it is important to predict not only at the aggregated level but also at the level of the individual building, so that distributed generation resources can be deployed based on the local forecast. Decomposition of demand forecasting helps to analyze energy consumption patterns and identify the prime targets for energy conservation. Hence, the commercial (Section 2.2) and residential (Section 2.3) building profiles, as well as the sub-metering (Section 2.3) and aggregated (Section 2.4) levels, are estimated. The validation was done using three datasets: one provided by the TKI Swich2SmartGrid project from a typical Dutch office building (Section 2.2), an online database available on the UCI Machine Learning Repository [29] (Section 2.3), and a dataset collected from the Danish island of Bornholm within the EcoGrid EU project (Section 2.4).

(3) Influencing factors – The complexity of building energy behavior and the uncertainty of the influencing factors, which cause larger fluctuations in demand, make energy prediction a hard problem. These fluctuations are driven by weather conditions, the building construction and the thermal properties of the physical materials used, the occupants and their behavior, and sub-level system components such as lighting or HVAC (Heating, Ventilating, and Air-Conditioning).

[Figure 2.3: resolution time (1 min, 5 min, 15 min, 1 hour, 1 week) plotted against prediction horizon (5 min, 15 min, 1 hour, 6 hours, 1 day, 1 week, 1 year), with scenarios S1 to S7 marked.]

Figure 2.3 – Overview of the scenarios (S) considered in this chapter, including Section 2.2 (magenta), Section 2.3 (blue) and Section 2.4 (yellow).
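The horizon categories of item (1) can be written down as a small helper. This is only an illustrative sketch: the function name `horizon_category`, the treatment of horizons below one hour as short-term, the inclusive upper bounds, and the 365-day year are assumptions made here, not definitions from the thesis.

```python
from datetime import timedelta

# Assumed boundaries: the text gives "one week" and "one year" as the limits;
# which side of the boundary they fall on is a choice made for this sketch.
ONE_WEEK = timedelta(weeks=1)
ONE_YEAR = timedelta(days=365)  # assumption: a non-leap year


def horizon_category(horizon: timedelta) -> str:
    """Return the forecast category for a given prediction horizon."""
    if horizon <= timedelta(0):
        raise ValueError("the prediction horizon must be positive")
    if horizon <= ONE_WEEK:
        # Horizons shorter than one hour are also treated as short-term here.
        return "short-term"
    if horizon <= ONE_YEAR:
        return "medium-term"
    return "long-term"


day_ahead = horizon_category(timedelta(days=1))
```

Under these assumptions, the day-ahead prediction that this chapter focuses on falls into the short-term category.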

Throughout the remainder of this chapter we focus on deep learning methods for building energy prediction problems. The chapter concludes by highlighting the existing open challenges addressed by the proposed solutions (Section 2.5).

2.1.1. Deep Learning

Since its conception, deep learning [38] has been widely studied and applied, from pure academic research to large-scale industrial applications, due to its success in different real-world machine learning problems such as audio recognition [39], reinforcement learning [40], transfer learning [41], and activity recognition [42].

Deep learning models are artificial neural networks with multiple layers of hidden neurons, which have connections only among neurons belonging to consecutive layers and no connections within the same layer. In general, these models are composed of basic building blocks, such as Restricted Boltzmann Machines (RBMs) [43]. In turn, RBMs have proven successful not only in providing good initialization weights in deep architectures (in both supervised and unsupervised learning), but also as standalone models in other types of applications [44]. Examples are density estimation to model human choice [45], collaborative filtering [46], information retrieval [47], and multi-class classification [48]. Thus, an important research direction is to improve the performance of RBMs in any respect (e.g. computational time, generative and discriminative capabilities).


2.1.2. Restricted Boltzmann Machine

Restricted Boltzmann Machines (RBMs) [43] have been applied in different machine learning fields, including multi-class classification [48] and collaborative filtering [46], among others. They are energy-based models for unsupervised learning. These models have stochastic nodes and layers, making them less vulnerable to local minima [49]. Further, due to their multiple layers and neural configurations, RBMs possess excellent generalisation capabilities [38]. Formally, an RBM consists of visible and hidden binary layers. The visible layer represents the data, while the hidden one increases the learning capacity by enlarging the class of distributions that can be represented to an arbitrary complexity [49].

Figure 2.4 – A schematic representation of a Boltzmann machine (right) and a restricted Boltzmann machine (left). Binary neurons are marked with a dedicated symbol; the other nodes hold continuous or binary data.

This thesis follows a standard notation where $i$ indexes the units of the visible layer, $j$ those of the hidden layer, and $w_{ij}$ denotes the weight of the connection between the $i$th visible and $j$th hidden unit. Further, $v_i$ and $h_j$ denote the state of the $i$th visible and $j$th hidden unit, respectively. According to the above definitions, the energy function¹ of an RBM is given by:

$$E(v, h) = -\sum_{i=1}^{m}\sum_{j=1}^{n} v_i h_j w_{ij} - \sum_{i=1}^{m} v_i a_i - \sum_{j=1}^{n} h_j b_j \tag{2.1}$$

where $a_i$ and $b_j$ represent the biases of the visible and hidden layers, respectively. The term $\sum_{i,j} v_i h_j w_{ij}$ represents the total energy between neurons from different layers, $\sum_i v_i a_i$ represents the energy of the visible layer, and $\sum_j h_j b_j$ the energy of the hidden layer.
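As a concrete check of Eq. (2.1), the three sums can be evaluated with matrix products. The sketch below is illustrative only: the function name `rbm_energy` and the toy values of `W`, `a`, `b`, `v` and `h` are assumptions chosen for the example, with $m = 3$ visible and $n = 2$ hidden units.

```python
import numpy as np


def rbm_energy(v, h, W, a, b):
    """Energy of Eq. (2.1): -sum_ij v_i h_j w_ij - sum_i v_i a_i - sum_j h_j b_j."""
    v = np.asarray(v, dtype=float)
    h = np.asarray(h, dtype=float)
    interaction = v @ W @ h   # sum over i, j of v_i * w_ij * h_j
    visible_term = a @ v      # sum over i of v_i * a_i
    hidden_term = b @ h       # sum over j of h_j * b_j
    return float(-interaction - visible_term - hidden_term)


# Toy parameters (hypothetical values, not taken from the thesis).
W = np.array([[1.0, -1.0],
              [0.5, 0.0],
              [0.0, 2.0]])          # shape (m, n) = (3, 2)
a = np.array([0.1, 0.2, 0.3])       # visible biases
b = np.array([-0.5, 0.5])           # hidden biases
v = np.array([1, 0, 1])             # visible state
h = np.array([1, 1])                # hidden state

energy = rbm_energy(v, h, W, a, b)
```

Lower energies correspond to more probable joint states once the energy is turned into a probability.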

Inference in Restricted Boltzmann Machines – Inference in RBMs is stochastic and is done by conditioning on the observed data. The joint probability of a state of the hidden and visible layers is defined as:

$$P(v, h) = \frac{\exp(-E(v, h))}{Z} = \frac{1}{Z}\exp\left(\sum_{i=1}^{m}\sum_{j=1}^{n} v_i h_j w_{ij} + \sum_{i=1}^{m} v_i a_i + \sum_{j=1}^{n} h_j b_j\right) \tag{2.2}$$

¹ Please note that the energy function of an RBM should not be confused with the aggregated


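with the partition function $Z = \sum_{x,y} \exp(-E(x, y))$, where the sum runs over all joint states of the visible and hidden layers. To determine the probability of a data point represented by a state $v$, the marginal probability is used. This is determined by summing out the state of the hidden layer, $p(v) = \sum_{h} P(v, h)$, as:

$$p(v) = \frac{1}{Z}\sum_{h}\exp\left(\sum_{i=1}^{m}\sum_{j=1}^{n} v_i h_j w_{ij} + \sum_{i=1}^{m} v_i a_i + \sum_{j=1}^{n} h_j b_j\right) \tag{2.3}$$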
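For a very small RBM the partition function and the marginal of Eq. (2.3) can be computed by brute-force enumeration, which makes the definitions easy to verify. The sketch below assumes binary visible and hidden units and $P(v, h) \propto \exp(-E(v, h))$; the helper names and the randomly drawn parameters are hypothetical. Enumeration costs $2^{m+n}$ terms, so it is only feasible for toy sizes.

```python
import itertools

import numpy as np


def energy(v, h, W, a, b):
    """Energy of Eq. (2.1) for one joint state (v, h)."""
    return -(v @ W @ h) - a @ v - b @ h


def binary_states(size):
    """All 2**size binary vectors of the given length."""
    return [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=size)]


def partition_function(W, a, b):
    """Z: sum of exp(-E) over every joint state of both layers."""
    m, n = W.shape
    return sum(np.exp(-energy(v, h, W, a, b))
               for v in binary_states(m) for h in binary_states(n))


def marginal_v(v, W, a, b, Z):
    """p(v) of Eq. (2.3): the hidden layer is summed out."""
    n = W.shape[1]
    return sum(np.exp(-energy(v, h, W, a, b)) for h in binary_states(n)) / Z


# Toy RBM with m = 3 visible and n = 2 hidden units (hypothetical parameters).
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(3, 2))
a = rng.normal(size=3)
b = rng.normal(size=2)

Z = partition_function(W, a, b)
total = sum(marginal_v(v, W, a, b, Z) for v in binary_states(3))
```

The marginals over all visible states sum to one, which is a quick sanity check on the sign convention used for the energy.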

For any hidden neuron $j$, the inference is given by $p(h_j = 1 \mid v) = \mathrm{sig}(b_j + \sum_i v_i w_{ij})$, and for any visible unit $i$ it is given by $p(v_i = 1 \mid h) = \mathrm{sig}(a_i + \sum_j h_j w_{ij})$, where $\mathrm{sig}(\cdot)$ is the sigmoid function. In order to maximize the likelihood of the model, the gradients of the energy function with respect to the weights have to be calculated. Because the log-likelihood gradient is difficult to compute exactly, an approximation method called Contrastive Divergence (CD) was proposed in [50].

Contrastive Divergence – In maximum likelihood, the learning phase actually minimizes the Kullback-Leibler (KL) divergence between the input data distribution and the model approximation [50]. In CD, learning follows the gradient of:

$$CD_n \propto D_{KL}(p_0(x)\,\|\,p_\infty(x)) - D_{KL}(p_n(x)\,\|\,p_\infty(x)) \tag{2.4}$$

where $p_n(\cdot)$ is the distribution of a Markov chain after running for $n$ steps. Since the visible units are conditionally independent given the hidden units and vice versa, learning can be performed using one-step Gibbs sampling, which is carried out in two half-steps: (1) update all the hidden units, and (2) update all the visible units. Thus, the weight updates in $CD_n$, using Eq. (2.4), are done as follows:

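$$w_{ij}^{\tau+1} = w_{ij}^{\tau} + \alpha\left(\langle h_j v_i\rangle_{p(h|v;W)_0} - \langle h_j v_i\rangle_{n}\right) \tag{2.5}$$

where $\tau$ is the iteration, $\alpha$ is the learning rate, and

$$\langle h_j v_i\rangle_{p(h|v;W)_0} = \frac{1}{N}\sum_{k=1}^{N} v_i^{(k)}\, P(h_j^{(k)} = 1 \mid v^{(k)}; W) \tag{2.6}$$

$$\langle h_j v_i\rangle_{n} = \frac{1}{N}\sum_{k=1}^{N} v_i^{(k)(n)}\, P(h_j^{(k)(n)} = 1 \mid v^{(k)(n)}; W) \tag{2.7}$$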

where $N$ is the total number of input instances, and the superscript $(n)$ indicates that the states are obtained after $n$ iterations of Gibbs sampling from the Markov chain starting at $p_0(\cdot)$. Besides that, other methods have been proposed to train RBMs (e.g. persistent contrastive divergence [51], fast persistent contrastive divergence [52], parallel tempering [53]), or to replace the Gibbs sampling with a transition operator for a faster mixing rate and to improve the learning accuracy without affecting computational costs [54].
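The update of Eqs. (2.5) to (2.7) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the function name `cd_step`, the toy data, and the hyperparameters are hypothetical; binary units are assumed; and the bias updates are an assumed analogue of the weight rule, since the text only gives the update for $w_{ij}$.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cd_step(V0, W, a, b, alpha=0.1, n_gibbs=1, rng=None):
    """One CD-n update on a batch V0 of shape (N, m) with binary entries."""
    rng = np.random.default_rng() if rng is None else rng
    N = V0.shape[0]

    # Data statistics, Eq. (2.6): v_i * P(h_j = 1 | v) averaged over the batch.
    ph0 = sigmoid(V0 @ W + b)
    positive = V0.T @ ph0 / N

    # n steps of Gibbs sampling, each made of two half-steps.
    Vn, phn = V0, ph0
    for _ in range(n_gibbs):
        H = (rng.random(phn.shape) < phn).astype(float)   # (1) sample hidden units
        pv = sigmoid(H @ W.T + a)
        Vn = (rng.random(pv.shape) < pv).astype(float)    # (2) sample visible units
        phn = sigmoid(Vn @ W + b)

    # Model statistics after n steps, Eq. (2.7).
    negative = Vn.T @ phn / N

    # Weight update, Eq. (2.5).
    W_new = W + alpha * (positive - negative)
    # Assumption: biases follow the same data-minus-model pattern.
    a_new = a + alpha * (V0 - Vn).mean(axis=0)
    b_new = b + alpha * (ph0 - phn).mean(axis=0)
    return W_new, a_new, b_new


# Toy usage: four binary patterns, m = 4 visible and n = 2 hidden units.
rng = np.random.default_rng(0)
V = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
W = rng.normal(scale=0.01, size=(4, 2))
a = np.zeros(4)
b = np.zeros(2)
for _ in range(200):
    W, a, b = cd_step(V, W, a, b, alpha=0.1, rng=rng)
```

Using the hidden probabilities rather than sampled hidden states in the two averages is a common variance-reduction choice and matches the form of Eqs. (2.6) and (2.7), where $P(h_j = 1 \mid v; W)$ appears explicitly.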


2.2. Energy prediction at building level

Introduction² – In this section, we compare three machine learning methods for estimating the energy consumption of a Dutch office building, part of the Kropman Installatietechniek BV portfolio, within the TKI Swich2SmartGrid project. Specifically, we investigate the capabilities of the proposed methods for total building energy consumption and lighting energy consumption. Because energy consumption can be treated as a time series problem, we propose the use of the Conditional Restricted Boltzmann Machine (CRBM) [46], a recently introduced stochastic machine learning method that has been applied successfully to model highly non-linear time series (e.g. human motion style, structured output prediction) [49] [55]. To the best of our knowledge, this method has never been used in the context of building energy forecasting. The method is compared with the widely used Artificial Neural Networks (ANNs) [56] [57] and Support Vector Machines [58] for energy prediction. The general architecture of these models is presented in Figure 2.5.

[Figure 2.5 diagram; panel group titles: Neural networks, Stochastic methods.]

Figure 2.5 – The general architecture of the models used in this section to predict the energy consumption: a) Artificial Neural Networks, b) Conditional Restricted Boltzmann Machines, and c) Hidden Markov Models. In both types of neural networks, u is the conditional history layer (input), h is the hidden layer and v is the visible layer (output).

Problem Definition – Predicting the energy consumption is equivalent to minimizing the (expected) distance between the real and estimated values. More formally, let us define the following: $i \in \mathbb{N}_l$ represents the index of the energy consumption data instances, $t \in \mathbb{N}_T$ denotes time, and $\chi \subset \mathbb{R}^d$ represents a $d$-dimensional feature space.
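To make this setup concrete, the sketch below turns a consumption series into pairs of a history window of $N$ past samples and the value that follows it, and measures the distance between real and estimated values. Everything here is an illustrative assumption: the names `make_windows` and `rmse`, the choice of root mean squared error as the distance, the synthetic series, and the persistence baseline used as the estimate.

```python
import numpy as np


def make_windows(series, N):
    """Split a (T, d) series into histories U of shape (l, d, N) and targets v of shape (l, d)."""
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        series = series[:, None]          # treat a 1-D series as d = 1
    T, _ = series.shape
    if N < 1 or N >= T:
        raise ValueError("the window length N must satisfy 1 <= N < T")
    # The window for time t holds the samples at t-N, ..., t-1.
    U = np.stack([series[t - N:t].T for t in range(N, T)])
    v = series[N:]
    return U, v


def rmse(actual, estimated):
    """Root mean squared error, one possible distance between real and estimated values."""
    diff = np.asarray(actual, dtype=float) - np.asarray(estimated, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic load series 0, 1, ..., 9 (hypothetical data) with a window of N = 3.
load = np.arange(10.0)
U, v = make_windows(load, N=3)

# Persistence baseline: estimate the next value by the last value in the window.
baseline = U[:, :, -1]
error = rmse(v, baseline)
```

A learned model would replace the persistence baseline with its own estimate for each window, and the same error measure could then be used to compare methods.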

Definition 2.1. (Energy estimation problem) Given a data set $D_{\mathrm{Energy}} = \{U^{(i)}, v^{(i)}\}_{i=1}^{l}$, where $U^{(i)} \subseteq \mathbb{R}^{d\times(t-N:t-1)}$ is a $d$-dimensional input sequence, $\{t-N : t-1\}$ represents a temporal window, and $v_t^{(i)} \subseteq \mathbb{R}^{d}$ is a multidimensional output vector over the space of real-valued outputs, determine $p(V \mid \Gamma; \Theta)$, with $V \subseteq l \times \mathbb{R}^{d}$ and $\Gamma \subseteq l \times \mathbb{R}^{d\times(t-N:t-1)}$ representing the concatenation of all outputs

² The content of this section is based on: E. Mocanu, P.H. Nguyen, M. Gibescu and W.L. Kling, Comparison of machine learning methods for estimating energy consumption in buildings. Proceedings of the 13th International Conference on Probabilistic Methods Applied to Power Systems (PMAPS), 2014.
