## Predicting demand of natural gas: a system identification approach

Master research project Applied Mathematics

March 2013

Student: R.R.A. Prins

First advisor: Prof. Dr. M.K. Camlibel

### Abstract

In this thesis we construct a linear discrete-time time-invariant input/state/output system in order to predict the demand of natural gas based on the available data. A Python program is written to create this model based on the known input and output. Furthermore, a methodology is established to find the parameters of the system which lead to the best predictions.

### Acknowledgements

I would like to thank GasTerra B.V. for giving me the opportunity to do my final year's project at their company. Furthermore I thank the coworkers at GasTerra B.V. for helping me during the process. I would especially like to thank H. Kremer for providing an interesting research topic, and L.M.H. Haveman and S. Douma for their guidance and moral support.

At the university I would like to thank Prof. dr. J. Top for accepting the role of second supervisor. Last but definitely not least I would like to give special thanks to Prof. dr. M.K. Camlibel for supervising me in this field of research, unknown to both of us, and for his guidance and support along the way.

Of course I am also grateful for all the support given by friends, family and fellow students, especially S. Baars for helping me understand Python.

Thank you all.

### List of Tables

- 1.1 List of connections to Germany
- 1.2 List of factors
- 1.3 Used factors
- 2.1 Input data
- 2.2 Output data
- 3.1 Results for hour-ahead
- 3.2 Results for 12-hours-ahead
- 3.3 Results for day-ahead
- 3.4 Results for week-ahead

### List of Figures

- 2.1 n = 17, N = 2500, p = 31, i = 2530
- 3.1 Reinitializing
- 3.2 4.5 years of data

### Contents

- 1 Introduction
  - 1.1 GasTerra B.V.
  - 1.2 This research
  - 1.3 Existing methods
  - 1.4 Choosing the model
- 2 The model
  - 2.1 Discrete-time dynamical linear models
  - 2.2 System identification
    - 2.2.1 Preliminaries
  - 2.3 Predicting gas demand
- 3 Using the model
  - 3.1 Reinitialization
  - 3.2 Parameter analysis
- 4 Conclusions and future work
- A Python programs
  - A.1 Data processing
    - A.1.1 Creating the input
    - A.1.2 Creating the output
    - A.1.3 Creating input and output
  - A.2 Estimating the system matrices
  - A.3 Running the system
  - A.4 Data analysis
  - A.5 Plotting and reinitialization
  - A.6 Optimizing p, N, n

## Chapter 1: Introduction

This master thesis is the result of a research project worth 50 European Credits, carried out at GasTerra B.V. in association with the University of Groningen, concerning the prediction of the demand of natural gas.

### 1.1 GasTerra B.V.

GasTerra B.V. is an international company trading in natural gas. It operates on the European energy market and accounts for a significant share of the Dutch gas supply. The company also provides services related to gas trading. The company has a strong purchasing position and almost fifty years of experience in natural gas procurement and sales [4].

At GasTerra B.V. there is a need to predict the demand of natural gas; based on forecasts the company can make decisions that are important to secure its position in the gas market. The methods used at this moment are based on stationary models, which do not change in response to a changing environment. The company's interest lies in constructing a model which can learn from new information and adjust itself in such a way that it captures changes in the environment.

### 1.2 This research

Based on the needs of GasTerra B.V. we want to construct a model which learns from the changing environment and predicts the demand of gas. The focus of this thesis will be on the export to Germany. Table 1.1 shows the pipeline connections between the Netherlands and Germany.

| Connections to Germany |
| --- |
| Oude Statenzijl (OSZ) |
| Winterswijk |
| Zevenaar |
| Tegelen |
| Bocholtz |

Table 1.1: List of connections to Germany

The connection Zevenaar/Winterswijk will be considered in this work. Together with Oude Statenzijl (OSZ), it is the biggest connection to Germany. Zevenaar/Winterswijk delivers only L-gas; OSZ, on the other hand, delivers both L-gas and G-gas. Tegelen also delivers to the same distributor in Germany, but in comparison to Zevenaar/Winterswijk this is a very small quantity, and it is therefore not considered in this thesis. Hence we focus on the total amount of natural gas which flows through Zevenaar/Winterswijk. The aim of this work is to create a dynamic model which can predict the demand of natural gas and adjusts itself in case of changes in the behavior of the system.

The rest of the chapter proceeds as follows. First we give a summary of already existing methods in Section 1.3. In Section 1.4 the proposed model is introduced.

### 1.3 Existing methods

In [3], the authors propose a model for the demand of gas for household use in four states of the United States of America, first using only physical factors and then also including economic factors. It turns out that the accuracy of the model differs between the four states. Furthermore, the influence of the economic factors was in some cases insignificant and in other cases significant but small in comparison to the physical factors. One can conclude from this that physical factors are the leading influence on the demand of natural gas and that economic factors may also play a role, albeit a smaller one. H.G. Berrisford [2] stated that using a predetermined seasonal pattern would reduce predictive power. He only uses the direct past history of the temperature by means of an exponentially weighted average.

In [1], P. Balestra and M. Nerlove propose a dynamical model; for the estimation of the parameters they use the least squares and maximum likelihood methods, and both methods of estimation result in weak predictions. In [17], Siemek et al. employ a Hubbert model, which is based on a decay method. This method works too slowly to be able to deal with hourly data. It is used to estimate the overall demand of natural gas of a specific country, so not specifically for a company or for a certain client. Liu and Lin [12] forecast household use of natural gas by using monthly and quarterly time series. They found that using quarterly data was less time-consuming, but not necessarily more accurate in comparison to monthly data. Furthermore, they state that using monthly data has the tendency to over-parametrize the system. In [18], the authors model the demand of gas in Bangladesh. Their model gives a lower bound and does not estimate the actual amount.

Many other models exist for predicting the demand of gas. Most of these models are stationary. Almost all papers estimate the demand for household use and do not aim to estimate the amount of gas for a specific company. In this thesis, we will propose a dynamical model in order to predict the demand of natural gas for a specific company.

### 1.4 Choosing the model

In [13], Ljung mentions that the construction of a model involves three basic entities:

1. a data set;
2. a set of candidate models;
3. a rule by which the candidate models can be assessed using the data.

This can be used as a guide to decide which kind of model one should use. We start with constructing a data set. The list of factors in Table 1.2 consists of factors which might be of interest. Among these possible factors, we focus on physical parameters suggested by the experts of GasTerra B.V. The output will be the physical flow.

| | | |
| --- | --- | --- |
| Temperature | Price of gas | Day |
| Hours of sun | Wind | Price of other energy carriers |
| CO₂ price | Spark spread | Storage |
| Unemployment rate | Consumer trust | Einkaufsmanagers index |
| Transportation costs | Rainfall | Time |

Table 1.2: List of factors

The spark spread shows the balance between the price of gas and the price of coal, including the CO₂ price. The CO₂ price represents the costs a company has to pay for the CO₂ created by, for instance, burning coal. The Einkaufsmanagers index is an index which stands for the development of the economy of Germany.

For data concerning the weather, we retrieve our information from the Koninklijk Nederlands Meteorologisch Instituut (KNMI) [9]. For the data concerning gas flows, we retrieve information from Gasunie Transport Services (GTS) [16]. In the rest of the thesis we will use the factors given in Table 1.3.

| Used factors | | |
| --- | --- | --- |
| Direction of the wind | Speed of the wind | Temperature |
| Sunshine duration | Solar radiation | Duration of rainfall |
| Amount of rainfall | Air pressure | Overcast |

Table 1.3: Used factors

The direction of the wind (in degrees) represents the average of the last 10 minutes of the past hour. The speed of the wind (in m/s) is the hourly average. Temperature (in degrees Celsius) is measured 1.50 meters above sea level and represents the temperature at that moment. Sunshine duration (in hours) is the duration of sunshine in an hour. Solar radiation (in J/cm²) represents the strength of the sun. The duration of rainfall (in hours) is the duration of rainfall during the hour. The amount of rainfall (in mm) is the total amount of rainfall in an hour. Air pressure is given in hPa. Overcast is a number which represents the cloudiness.

In this thesis, we will construct a discrete-time dynamical model based on the factors given in Table 1.3. One of the advantages of working with the model class we have chosen is the ease of including new factors and/or discarding existing ones.

The class of discrete dynamical linear models can be divided into time-invariant and time-varying models. The demand of gas is most likely time-varying, since the same conditions in the winter might lead to another result than the same conditions in the summer. However, dealing with time-varying models is not always preferable due to the heavy computations required by such models. In this thesis, we will employ time-invariant models and try to capture the time dependency by creating new models over time. The findings of this thesis can be taken as a starting point for further research on modeling the gas demand by time-varying systems. We will discuss the details of the model we propose in Section 2.1. Before dealing with the mathematical details, we first discuss various existing methodologies in what follows.

In [6], the authors investigate how a linear input/state/output system can be constructed from given input/output data. Their method uses the so-called information matrix along with its Toeplitz matrix structure. The information matrix contains the given input/output data. By using the information matrix as well as its shifted versions, they construct the matrices that determine a linear input/state/output system. In [19], Wang and Qin propose a subspace identification method that is based on principal component analysis (PCA), which gives consistent estimates in the errors-in-variables (EIV) formulation. Lacy and Bernstein presented a constrained optimization method to ensure asymptotic stability of the identified model in the context of subspace identification in [11]. Several other papers investigate how to estimate the system matrices in the presence of noise, e.g. [5] and [15]. Katayama and Tanaka [8] also include the noise term; they estimate the matrices using an LQ-decomposition, which can be used to create the orthogonal and oblique projections of parts of the information matrices. The paper [10] exploits the information matrix and its Toeplitz structure in the presence of noise. In [7] adaptive iterative learning control (ILC) is employed: the system matrices are first estimated, in whatever way one might prefer, and then repeatedly reshaped by use of ILC. Mercère and Bako [14] revisit the problem of determining a canonical state-space representation for multi-variable systems. They first estimate the system matrices and finally put them in canonical representation.

Most of the methods mentioned in these papers use an information matrix and do essentially the same as Juang in [6]; even the papers concerning noise terms often use this method of estimation. In all the papers mentioned above, the estimation problem boils down to solving certain linear equations. The papers differ from each other in the methods they propose for solving these equations, such as LQ-factorization, least squares, or the pseudo-inverse. In this thesis, we will focus on the method proposed by Juang in [6].

The structure of the thesis is as follows. In Chapter 2 the modeling methodology we adopt is discussed in detail. Chapter 3 is devoted to the question of how to update the model when new data becomes available. Finally, the thesis closes with the conclusions and a discussion of further research in Chapter 4. The computer programs that were used for this thesis are included in Appendix A.

## Chapter 2: The model

Physical behavior can be modeled as a system; consider for instance the motion of a car driving in a straight line. We can use as input the pressure applied to the throttle, and as output we want the distance traveled. This is best modeled by a dynamical input/state/output system, where the state represents some physical quantities and is able to keep track of the speed and the distance. This chapter provides a dynamical system which will be used to predict the amount of gas. Note that we do not know the exact behavior of the demand, and therefore cannot give a physical meaning to the state.

The organization of this chapter is as follows. In Section 2.1 we introduce discrete-time time-invariant input/state/output dynamical linear systems and review certain properties of such systems. In Section 2.2, we present a methodology based on [6] in order to estimate the system matrices from given input/output observations. Finally, we discuss how this methodology can be employed in the context of predicting gas demand in Section 2.3.

### 2.1 Discrete-time dynamical linear models

Consider the linear discrete-time time-invariant input/state/output system $\Sigma(A, B, C, D)$ of the form

$$
\begin{aligned}
x(k+1) &= Ax(k) + Bu(k),\\
y(k) &= Cx(k) + Du(k),
\end{aligned}
\tag{2.1}
$$

where $k \in \mathbb{N}$ denotes the time, $u(k) \in \mathbb{R}^r$ the input vector, $x(k) \in \mathbb{R}^n$ the state vector, $y(k) \in \mathbb{R}^m$ the output vector, and the so-called system matrices $A$, $B$, $C$, $D$ are of appropriate dimensions. We sometimes refer to this representation as the state-space representation. If the system description (2.1) is obtained from a physical system, the state of the system (very often) has a physical interpretation. For instance, if one models an electrical circuit containing linear dynamical elements like capacitors and inductors, then the components of the state vector correspond to the voltages/charges of the capacitors and the currents/flows of the inductors. In this thesis we attempt to identify a system of the form (2.1) in order to model gas demand. As such, the state vector will not necessarily admit a certain physical meaning.

Next, we illustrate the underlying dynamics of the system of the form (2.1) by means of an example.

**Example.** Consider the state-space representation of the form (2.1) where

$$
A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix},\quad
B = \begin{bmatrix} 1 & -0.2 & 3 \\ 4 & -0.5 & 6 \end{bmatrix},\quad
C = \begin{bmatrix} 1 & 2 \end{bmatrix},\quad
D = \begin{bmatrix} 1 & -0.4 & 2 \end{bmatrix}.
$$

For this model, the input vector $u$ consists of three components: the first corresponds to the speed of the wind (in $10^{-1}$ m/s), the second to the temperature (in $10^{-1}$ degrees Celsius), and the third to the amount of rain fallen in the hour (in $10^{-1}$ mm)¹, whereas the scalar output $y$ corresponds to the amount of gas (in $10^{7}$ kWh).

Suppose that the initial state $x(0)$ is given by

$$x(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.$$

To calculate the output we use the input, the system matrices and the state together with equation (2.1). Assume that at time instance 0 the speed of the wind is 1.0 m/s, the temperature is 10.6 degrees and the amount of rain fallen is 1.6 mm. This leads to the output at time instance 0:

$$y(0) = Cx(0) + Du(0) = \begin{bmatrix} 1 & 2 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 & -0.4 & 2 \end{bmatrix}\begin{bmatrix} 10 \\ 106 \\ 16 \end{bmatrix} = 2.6.$$

Additionally, we calculate $x(1)$:

$$x(1) = Ax(0) + Bu(0) = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 & -0.2 & 3 \\ 4 & -0.5 & 6 \end{bmatrix}\begin{bmatrix} 10 \\ 106 \\ 16 \end{bmatrix} = \begin{bmatrix} 39.8 \\ 90 \end{bmatrix}.$$

Assume at time instance 1 that the speed of the wind is 2.0 m/s, the temperature is 10.8 and the amount of rain fallen is 1.2 mm. This leads to the output at time instance 1:

$$y(1) = Cx(1) + Du(1) = \begin{bmatrix} 1 & 2 \end{bmatrix}\begin{bmatrix} 39.8 \\ 90 \end{bmatrix} + \begin{bmatrix} 1 & -0.4 & 2 \end{bmatrix}\begin{bmatrix} 20 \\ 108 \\ 12 \end{bmatrix} = 220.6.$$

Additionally, we calculate $x(2)$:

$$x(2) = Ax(1) + Bu(1) = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 39.8 \\ 90 \end{bmatrix} + \begin{bmatrix} 1 & -0.2 & 3 \\ 4 & -0.5 & 6 \end{bmatrix}\begin{bmatrix} 20 \\ 108 \\ 12 \end{bmatrix} = \begin{bmatrix} 254.2 \\ 577.4 \end{bmatrix}.$$

In a similar fashion, one can compute the state and the output at time instant $k$, i.e. $x(k)$ and $y(k)$, based on the initial state and the values of the inputs $u(0), u(1), \ldots, u(k)$.

¹1 millimeter of rainfall is the equivalent of 1 liter of water per square meter per hour.
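The recursion in this example is easy to check numerically. The following sketch (our own illustration, not the thesis's program from Appendix A) simulates the system (2.1) with the matrices of the example:

```python
import numpy as np

# System matrices of the example (inputs in 1e-1 units, output in 1e7 kWh).
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[1.0, -0.2, 3.0], [4.0, -0.5, 6.0]])
C = np.array([[1.0, 2.0]])
D = np.array([[1.0, -0.4, 2.0]])

def simulate(A, B, C, D, x0, inputs):
    """Run the recursion (2.1): returns outputs y(0..K-1) and states x(0..K)."""
    x = np.asarray(x0, dtype=float)
    xs, ys = [x], []
    for u in inputs:
        u = np.asarray(u, dtype=float)
        ys.append(C @ x + D @ u)   # y(k) = C x(k) + D u(k)
        x = A @ x + B @ u          # x(k+1) = A x(k) + B u(k)
        xs.append(x)
    return np.array(ys), np.array(xs)

# u(0): wind 1.0 m/s, temperature 10.6, rain 1.6 mm (all scaled by 10).
ys, xs = simulate(A, B, C, D, [1.0, 1.0], [[10, 106, 16], [20, 108, 12]])
print(ys[0])  # y(0) = 2.6
```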

### 2.2 System identification

Roughly speaking, the system identification problem we deal with in this thesis amounts to constructing a dynamical system of the form (2.1) based on given input/output observations. In order to elaborate on the method we use, we first introduce some machinery in what follows.

#### 2.2.1 Preliminaries

This section is devoted to the definitions of mathematical concepts and notations that will be used throughout the thesis.

Most of the time we follow standard mathematical notation and conventions. We denote the set of natural numbers (including zero) by $\mathbb{N}$, the set of real numbers by $\mathbb{R}$, the set of $q$-tuples of real numbers by $\mathbb{R}^q$ and the set of $q \times p$ matrices by $\mathbb{R}^{q \times p}$.

In this chapter the Moore-Penrose pseudo-inverse will be used, hence we define it below.

**Definition 2.2.1** (Moore-Penrose pseudo-inverse). For a matrix $A \in \mathbb{R}^{m \times n}$, the Moore-Penrose pseudo-inverse is the matrix $A^{\dagger} \in \mathbb{R}^{n \times m}$ satisfying the following properties:

(i) $A A^{\dagger} A = A$;
(ii) $A^{\dagger} A A^{\dagger} = A^{\dagger}$;
(iii) $(A A^{\dagger})^T = A A^{\dagger}$;
(iv) $(A^{\dagger} A)^T = A^{\dagger} A$.

Furthermore, $A^{\dagger}$ always exists and is unique.

In the case that $A$ is invertible, the pseudo-inverse coincides with the ordinary inverse of an invertible matrix. An advantage of the Moore-Penrose pseudo-inverse is that it is widely used and most mathematical programming languages have a built-in routine to compute it for any matrix.
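For instance, NumPy provides such a routine as `numpy.linalg.pinv`. A minimal sketch checking the four defining properties on a non-square matrix of our choosing:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # 2x3, so no ordinary inverse exists

A_dag = np.linalg.pinv(A)         # Moore-Penrose pseudo-inverse, 3x2

# The four defining properties of Definition 2.2.1:
assert np.allclose(A @ A_dag @ A, A)            # (i)
assert np.allclose(A_dag @ A @ A_dag, A_dag)    # (ii)
assert np.allclose((A @ A_dag).T, A @ A_dag)    # (iii)
assert np.allclose((A_dag @ A).T, A_dag @ A)    # (iv)
```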

A commonly used class of matrices in the context of system identification is the class of so-called Hankel matrices. The output matrix and the input matrix, defined below, have the shape of a Hankel matrix.

**Definition 2.2.2** (Hankel matrix). A matrix $M \in \mathbb{R}^{q \times q}$ is called a Hankel matrix if $M_{ij} = M_{kl}$ whenever $i + j = k + l$, that is, if it is of the form

$$M = \begin{bmatrix}
m_1 & m_2 & m_3 & \cdots & m_q \\
m_2 & m_3 & m_4 & \cdots & m_{q+1} \\
m_3 & m_4 & m_5 & \cdots & m_{q+2} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
m_q & m_{q+1} & m_{q+2} & \cdots & m_{2q-1}
\end{bmatrix}$$

for some real numbers $m_i$ with $1 \le i \le 2q-1$.

We sometimes say that a (possibly non-square) matrix is a block Hankel matrix if it admits the Hankel structure blockwise.

Another relevant class of matrices is the class of so-called Toeplitz matrices.

**Definition 2.2.3** (Toeplitz matrix). Given the system matrices $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times r}$, $C \in \mathbb{R}^{m \times n}$, $D \in \mathbb{R}^{m \times r}$ and an integer $p \in \mathbb{N}$, we define the corresponding Toeplitz matrix as

$$T_p := \begin{bmatrix}
D & 0 & \cdots & \cdots & 0 \\
CB & D & 0 & & \vdots \\
CAB & CB & D & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & 0 \\
CA^{p-2}B & CA^{p-3}B & \cdots & CB & D
\end{bmatrix}.$$
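As an illustration (the function name is ours), $T_p$ can be assembled directly from given system matrices; the usage below takes the matrices of the example in Section 2.1 and $p = 3$:

```python
import numpy as np

def toeplitz_matrix(A, B, C, D, p):
    """Build the (pm x pr) block Toeplitz matrix T_p of Definition 2.2.3."""
    m, r = D.shape
    T = np.zeros((p * m, p * r))
    for i in range(p):            # block row
        for j in range(i + 1):    # block column (lower block triangular)
            if i == j:
                block = D
            else:
                # block (i, j) = C A^(i-j-1) B
                block = C @ np.linalg.matrix_power(A, i - j - 1) @ B
            T[i * m:(i + 1) * m, j * r:(j + 1) * r] = block
    return T

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[1.0, -0.2, 3.0], [4.0, -0.5, 6.0]])
C = np.array([[1.0, 2.0]])
D = np.array([[1.0, -0.4, 2.0]])

T3 = toeplitz_matrix(A, B, C, D, p=3)   # 3 x 9, diagonal blocks equal to D
```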

In the course of the discussion, we will work with matrices obtained from sequences of inputs, states and outputs. In what follows we introduce these matrices.

**Definition 2.2.4** (Input matrix). Let $u(k) \in \mathbb{R}^r$ be the input at time instance $k \in \mathbb{N}$ and let $p \in \mathbb{N}$; then we define the vector $u_p(k) \in \mathbb{R}^{pr}$ as:

$$u_p(k) := \begin{bmatrix} u(k) \\ u(k+1) \\ u(k+2) \\ \vdots \\ u(k+p-1) \end{bmatrix}.$$

Then for given $N \in \mathbb{N}$ we define the input matrix $U_p(k) \in \mathbb{R}^{pr \times N}$ as:

$$U_p(k) := \begin{bmatrix} u_p(k) & u_p(k+1) & \cdots & u_p(k+N-1) \end{bmatrix}
= \begin{bmatrix}
u(k) & u(k+1) & \cdots & u(k+N-1) \\
u(k+1) & u(k+2) & \cdots & u(k+N) \\
\vdots & \vdots & \ddots & \vdots \\
u(k+p-1) & u(k+p) & \cdots & u(k+p+N-2)
\end{bmatrix}.$$

Similar to the input matrix, we define the output matrix as follows.

**Definition 2.2.5** (Output matrix). Let $y(k) \in \mathbb{R}^m$ be the output at time instance $k \in \mathbb{N}$ and let $p \in \mathbb{N}$; then we define the vector $y_p(k) \in \mathbb{R}^{pm}$ as:

$$y_p(k) := \begin{bmatrix} y(k) \\ y(k+1) \\ y(k+2) \\ \vdots \\ y(k+p-1) \end{bmatrix}.$$

Then for given $N \in \mathbb{N}$ we define the output matrix $Y_p(k) \in \mathbb{R}^{pm \times N}$ as:

$$Y_p(k) := \begin{bmatrix} y_p(k) & y_p(k+1) & \cdots & y_p(k+N-1) \end{bmatrix}
= \begin{bmatrix}
y(k) & y(k+1) & \cdots & y(k+N-1) \\
y(k+1) & y(k+2) & \cdots & y(k+N) \\
\vdots & \vdots & \ddots & \vdots \\
y(k+p-1) & y(k+p) & \cdots & y(k+p+N-2)
\end{bmatrix}.$$

Note that the input and output matrices are of block Hankel matrix form.

Apart from these matrices, we need to define the state matrix.

**Definition 2.2.6** (State matrix). Let $x(k) \in \mathbb{R}^n$ be the state at time instance $k \in \mathbb{N}$; then for a given $N \in \mathbb{N}$ we define the state matrix $X(k) \in \mathbb{R}^{n \times N}$ as:

$$X(k) := \begin{bmatrix} x(k) & x(k+1) & \cdots & x(k+N-1) \end{bmatrix}.$$

Finally, we define the observability matrix.

**Definition 2.2.7** (Observability matrix). Given the system matrices $A \in \mathbb{R}^{n \times n}$, $C \in \mathbb{R}^{m \times n}$ and $p \in \mathbb{N}$, we define the observability matrix as:

$$O_p := \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{p-1} \end{bmatrix}.$$
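A sketch of this stacking (the function name is ours), applied to the $A$ and $C$ of the earlier example:

```python
import numpy as np

def observability_matrix(A, C, p):
    """Stack C, CA, ..., CA^(p-1) into the (pm x n) matrix O_p of Definition 2.2.7."""
    return np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(p)])

A = np.array([[1.0, 2.0], [3.0, 4.0]])
C = np.array([[1.0, 2.0]])

O3 = observability_matrix(A, C, p=3)   # rows: C, CA, CA^2
```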

A discussion of the relationships among the matrices defined above is now in order. By straightforward calculation, we arrive at the following state and output trajectories of the system (2.1) for the initial state $x(0)$ and the input sequence $u(0), u(1), \ldots, u(k)$:

$$
\begin{aligned}
x(k) &= A^k x(0) + A^{k-1}Bu(0) + A^{k-2}Bu(1) + \cdots + ABu(k-2) + Bu(k-1),\\
y(k) &= CA^k x(0) + CA^{k-1}Bu(0) + \cdots + CABu(k-2) + CBu(k-1) + Du(k).
\end{aligned}
$$

In a more compact form we have

$$y_p(k) = O_p x(k) + T_p u_p(k), \tag{2.2}$$

$$Y_p(k) = O_p X(k) + T_p U_p(k). \tag{2.3}$$

The goal of the method we use is to determine the matrices $A$, $B$, $C$, $D$ from given input/output observations. To do so, we first focus on determining the observability matrix $O_p$. Define:

$$
\begin{aligned}
R_{yy} &= \tfrac{1}{N}\, Y_p(k)\, Y_p^T(k), & R_{yu} &= \tfrac{1}{N}\, Y_p(k)\, U_p^T(k), & R_{uu} &= \tfrac{1}{N}\, U_p(k)\, U_p^T(k),\\
R_{xx} &= \tfrac{1}{N}\, X(k)\, X^T(k), & R_{yx} &= \tfrac{1}{N}\, Y_p(k)\, X^T(k), & R_{xu} &= \tfrac{1}{N}\, X(k)\, U_p^T(k).
\end{aligned}
\tag{2.4}
$$

The matrices $R_{yy} \in \mathbb{R}^{mp \times mp}$, $R_{uu} \in \mathbb{R}^{rp \times rp}$ and $R_{xx} \in \mathbb{R}^{n \times n}$ are symmetric matrices that represent the autocorrelations of the output data $y$ with its time shifts, the input data $u$ with its time shifts, and the state vector $x$, respectively, whereas the matrices $R_{yu} \in \mathbb{R}^{mp \times rp}$, $R_{yx} \in \mathbb{R}^{mp \times n}$ and $R_{xu} \in \mathbb{R}^{n \times rp}$ represent the cross-correlations of the output data $y$ (with time shifts) and the input data $u$ (with time shifts), the output data $y$ (with time shifts) and the state vector $x$, and the state vector $x$ and the input $u$ (with time shifts), respectively.

| $k$ | 0 | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- | --- |
| Speed of the wind (in $10^{-1}$ m/s) | 10 | 10 | 20 | 30 | 10 | 20 |
| Temperature (in $10^{-1}$ degrees Celsius) | 98 | 94 | 136 | 106 | 102 | 92 |
| Amount of rain fallen (in $10^{-1}$ mm) | 18 | 73 | 16 | 33 | 40 | 31 |

Table 2.1: Input data

| $k$ | 0 | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- | --- |
| Energy (in $10^{7}$ kWh) | 1.2 | 1.1 | 1.2 | 1.1 | 1.2 | 1.3 |

Table 2.2: Output data

**Example.** We can use the input data and output data given in Table 2.1 and Table 2.2, respectively, to calculate the input matrix $U_3(0)$ and the output matrix $Y_3(0)$. We have $r = 3$ and $m = 1$; choose $p = 3$, $N = 4$ and $k = 0$. This leads to the following input and output matrices:

$$U_3(0) = \begin{bmatrix}
10 & 10 & 20 & 30 \\
98 & 94 & 136 & 106 \\
18 & 73 & 16 & 33 \\
10 & 20 & 30 & 10 \\
94 & 136 & 106 & 102 \\
73 & 16 & 33 & 40 \\
20 & 30 & 10 & 20 \\
136 & 106 & 102 & 92 \\
16 & 33 & 40 & 31
\end{bmatrix},\qquad
Y_3(0) = \begin{bmatrix}
1.2 & 1.1 & 1.2 & 1.1 \\
1.1 & 1.2 & 1.1 & 1.2 \\
1.2 & 1.1 & 1.2 & 1.3
\end{bmatrix}.$$
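Both the block Hankel matrices and the correlation matrices of (2.4) are mechanical to compute; the following sketch (the function name is ours) reproduces $U_3(0)$ and $Y_3(0)$ from the data of Tables 2.1 and 2.2:

```python
import numpy as np

def block_hankel(data, p, N, k=0):
    """Stack data into the block Hankel matrix of Definitions 2.2.4/2.2.5.

    data is a (time x dimension) array; the result is (p*dimension) x N."""
    data = np.asarray(data, dtype=float)
    return np.vstack([data[k + i : k + i + N].T for i in range(p)])

# Tables 2.1 and 2.2: one row per time instance 0..5.
u = [[10, 98, 18], [10, 94, 73], [20, 136, 16],
     [30, 106, 33], [10, 102, 40], [20, 92, 31]]   # wind, temperature, rain
y = [[1.2], [1.1], [1.2], [1.1], [1.2], [1.3]]

p, N = 3, 4
U3 = block_hankel(u, p, N)   # the 9 x 4 input matrix U_3(0)
Y3 = block_hankel(y, p, N)   # the 3 x 4 output matrix Y_3(0)

# Correlation matrices of (2.4):
R_uu = U3 @ U3.T / N
R_yu = Y3 @ U3.T / N
R_yy = Y3 @ Y3.T / N
```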

Using these matrices, the quantities $R_{yy}$, $R_{yu}$ and $R_{uu}$ can be calculated using (2.4).

#### Estimating the observability matrix

By post-multiplying (2.3) by $U_p^T(k)/N$, we get

$$R_{yu} = \tfrac{1}{N}\, Y_p(k)\, U_p^T(k) = \tfrac{1}{N}\left(O_p X(k) U_p^T(k) + T_p U_p(k) U_p^T(k)\right) = O_p R_{xu} + T_p R_{uu}.$$

Suppose that $R_{uu}$ is nonsingular. Then we obtain the Toeplitz matrix $T_p$ by post-multiplying the last equation by $R_{uu}^{-1}$:

$$T_p = \left(R_{yu} - O_p R_{xu}\right) R_{uu}^{-1}. \tag{2.5}$$

Similarly, we get

$$R_{yy} = \tfrac{1}{N}\, Y_p(k)\, Y_p^T(k) = \tfrac{1}{N}\left(O_p X(k) Y_p^T(k) + T_p U_p(k) Y_p^T(k)\right) = O_p R_{yx}^T + T_p R_{yu}^T, \tag{2.6}$$

and

$$R_{yx} = \tfrac{1}{N}\, Y_p(k)\, X^T(k) = \tfrac{1}{N}\left(O_p X(k) X^T(k) + T_p U_p(k) X^T(k)\right) = O_p R_{xx} + T_p R_{xu}^T, \tag{2.7}$$

by post-multiplying (2.3) by $Y_p^T(k)/N$ and by $X^T(k)/N$, respectively. Substituting (2.7) in (2.6) results in

$$R_{yy} = O_p \left(O_p R_{xx} + T_p R_{xu}^T\right)^T + T_p R_{yu}^T = O_p R_{xx} O_p^T + O_p R_{xu} T_p^T + T_p R_{yu}^T.$$

Furthermore, one can substitute $T_p$ given by (2.5) and get:

$$
\begin{aligned}
R_{yy} &= O_p R_{xx} O_p^T + O_p R_{xu} T_p^T + T_p R_{yu}^T\\
&= O_p R_{xx} O_p^T + O_p R_{xu} R_{uu}^{-1} \left(R_{yu} - O_p R_{xu}\right)^T + \left(R_{yu} - O_p R_{xu}\right) R_{uu}^{-1} R_{yu}^T\\
&= O_p R_{xx} O_p^T + O_p R_{xu} R_{uu}^{-1} R_{yu}^T - O_p R_{xu} R_{uu}^{-1} R_{xu}^T O_p^T + R_{yu} R_{uu}^{-1} R_{yu}^T - O_p R_{xu} R_{uu}^{-1} R_{yu}^T\\
&= O_p R_{xx} O_p^T - O_p R_{xu} R_{uu}^{-1} R_{xu}^T O_p^T + R_{yu} R_{uu}^{-1} R_{yu}^T,
\end{aligned}
$$

where we used that $R_{uu}$, and hence $R_{uu}^{-1}$, is symmetric. This leads to

$$R_{yy} - R_{yu} R_{uu}^{-1} R_{yu}^T = O_p \left(R_{xx} - R_{xu} R_{uu}^{-1} R_{xu}^T\right) O_p^T. \tag{2.8}$$

Now, we define

$$R_{hh} = R_{yy} - R_{yu} R_{uu}^{-1} R_{yu}^T, \tag{2.9}$$

$$\tilde{R}_{xx} = R_{xx} - R_{xu} R_{uu}^{-1} R_{xu}^T. \tag{2.10}$$

Then, (2.8) can be written as

$$R_{hh} = O_p \tilde{R}_{xx} O_p^T. \tag{2.11}$$

Clearly, both $R_{hh} \in \mathbb{R}^{mp \times mp}$ and $\tilde{R}_{xx} \in \mathbb{R}^{n \times n}$ are symmetric matrices.

We are now in a position to compute an estimate of the observability matrix. Following the footsteps of [6], let

$$R_{hh} = U \Sigma^2 V^T = \begin{bmatrix} U_n & U_0 \end{bmatrix} \begin{bmatrix} \Sigma_n^2 & 0_{n \times n_0} \\ 0_{n_0 \times n} & 0_{n_0 \times n_0} \end{bmatrix} \begin{bmatrix} V_n^T \\ V_0^T \end{bmatrix} = U_n \Sigma_n^2 V_n^T \tag{2.12}$$

be the singular value decomposition of $R_{hh}$, where $n_0$ is the number of singular values equal to 0. Since $R_{hh}$ is symmetric, we have $U_n = V_n$ and $U_0 = V_0$. Then, it follows from (2.11) that

$$R_{hh} = O_p \tilde{R}_{xx} O_p^T = U_n \Sigma_n^2 U_n^T.$$

This leads to the obvious estimates for $O_p$ and $\tilde{R}_{xx}$:

$$O_p = U_n, \tag{2.13}$$

$$\tilde{R}_{xx} = \Sigma_n^2. \tag{2.14}$$

In this method the dimension of the state equals $n$. In order to get a specific state dimension, we must have $pm - n$ singular values equal to 0. In practice none of the singular values of $R_{hh}$ are exactly 0. To achieve the desired state dimension we truncate the smallest $pm - n$ singular values, so that $\Sigma_n^2$ contains the $n$ largest singular values. Then we have $O_p = U_n \in \mathbb{R}^{pm \times n}$.

**Example.** Assume $p = 3$ and that $N$ is chosen such that $R_{uu}$ is nonsingular. Using $R_{yy}$, $R_{yu}$ and $R_{uu}$ we can calculate $R_{hh}$. Assume that we have

$$R_{hh} = \begin{bmatrix} 2 & 3 & 5 \\ 3 & 8 & 7 \\ 5 & 7 & 4 \end{bmatrix}.$$

Using the singular value decomposition shown in (2.12) gives us the following matrices $U_n$, $\Sigma_n^2$, $V_n$:

$$U_n = \begin{bmatrix} -0.39 & -0.57 & -0.73 \\ -0.71 & -0.33 & 0.63 \\ -0.59 & 0.75 & -0.28 \end{bmatrix},\quad
\Sigma_n^2 = \begin{bmatrix} 15.49 & 0 & 0 \\ 0 & 2.85 & 0 \\ 0 & 0 & 1.36 \end{bmatrix},\quad
V_n = \begin{bmatrix} -0.39 & -0.57 & -0.73 \\ -0.71 & -0.33 & 0.63 \\ -0.59 & 0.75 & -0.28 \end{bmatrix}.$$

Clearly, $U_n = V_n$. Assume $n = 2$; then this leads to:

$$O_p = \begin{bmatrix} -0.39 & -0.57 \\ -0.71 & -0.33 \\ -0.59 & 0.75 \end{bmatrix}. \tag{2.15}$$

#### Estimating A

We construct the matrix $A \in \mathbb{R}^{n \times n}$ using the observability matrix. We have the following:

$$O_p(m+1:pm,:) = \begin{bmatrix} CA \\ CA^2 \\ CA^3 \\ \vdots \\ CA^{p-1} \end{bmatrix} = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{p-2} \end{bmatrix} A = O_p(1:(p-1)m,:)\, A.$$

We use the Moore-Penrose pseudo-inverse to compute the matrix $A$. This leads to:

$$A = O_p^{\dagger}(1:(p-1)m,:)\; O_p(m+1:pm,:), \tag{2.16}$$

where $O_p^{\dagger}(1:(p-1)m,:)$ denotes the Moore-Penrose pseudo-inverse of $O_p(1:(p-1)m,:)$.

Example. Using the observability matrix given by (2.15), we have

$$
A = \begin{bmatrix} -0.39 & -0.57 \\ -0.71 & -0.33 \end{bmatrix}^{\dagger}
\begin{bmatrix} -0.71 & -0.33 \\ -0.59 & 0.75 \end{bmatrix}
= \begin{bmatrix} 0.39 & -1.90 \\ 0.98 & 1.83 \end{bmatrix}. \tag{2.17}
$$
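The same computation can be sketched in NumPy. Because this sketch starts from the rounded entries of (2.15), the result differs slightly from (2.17), which was computed from the unrounded SVD factors.

```python
import numpy as np

# Shift-invariance estimate (2.16): A equals the pseudo-inverse of O_p
# without its last m rows, times O_p without its first m rows.
# O_p below is the example matrix (2.15); here m = 1.
O_p = np.array([[-0.39, -0.57],
                [-0.71, -0.33],
                [-0.59,  0.75]])
m = 1
A = np.linalg.pinv(O_p[:-m]) @ O_p[m:]
C = O_p[:m]   # cf. (2.18): C is simply the first m rows of O_p
```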

### Estimating C

The estimation of the matrix C ∈ R^{m×n} follows directly from the estimation of the observability matrix. Recall that the first m rows of the observability matrix are exactly C. In other words, we have

$$
C = O_p(1:m,:). \tag{2.18}
$$

Example. Using the observability matrix given by (2.15), we have

$$
C = \begin{bmatrix} -0.39 & -0.57 \end{bmatrix}. \tag{2.19}
$$

### Estimating B and D

The observability matrix and the form of the Toeplitz matrix are used for the estimation of the matrices B and D. To do so we use the output-error minimization method proposed in [6].

We begin by substituting p = N and k = 0 in (2.2):

$$
y_N(0) = O_N x(0) + T_N u_N(0). \tag{2.20}
$$

We write the matrices B and D in the following way:

$$
B = \begin{bmatrix} b_1 & b_2 & \cdots & b_r \end{bmatrix}, \qquad
D = \begin{bmatrix} d_1 & d_2 & \cdots & d_r \end{bmatrix},
$$

where b_i ∈ R^n and d_i ∈ R^m for i = 1, 2, . . . , r. Define the vectors b ∈ R^{rn} and d ∈ R^{rm} as:

$$
b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_r \end{bmatrix}, \qquad
d = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_r \end{bmatrix}.
$$

Also, let

$$
u(k) = \begin{bmatrix} u_1(k) \\ u_2(k) \\ \vdots \\ u_r(k) \end{bmatrix}.
$$

We now have the following for T_N u_N(0):

$$
T_N u_N(0) = \begin{bmatrix}
D & & & & \\
CB & D & & & \\
CAB & CB & D & & \\
\vdots & \vdots & \vdots & \ddots & \\
CA^{N-2}B & CA^{N-3}B & CA^{N-4}B & \cdots & D
\end{bmatrix}
\begin{bmatrix} u(0) \\ u(1) \\ u(2) \\ \vdots \\ u(N-1) \end{bmatrix}
= \begin{bmatrix} U_m(0) \\ U_m(1) \\ U_m(2) \\ \vdots \\ U_m(N-1) \end{bmatrix} d
+ \begin{bmatrix} 0_{m\times nr} \\ CU_n(0) \\ CAU_n(0) + CU_n(1) \\ \vdots \\ \sum_{k=0}^{N-2} CA^{N-k-2} U_n(k) \end{bmatrix} b,
\tag{2.21}
$$

where 0_{m×nr} ∈ R^{m×nr} denotes the zero matrix, and

$$
U_m(k) = \begin{bmatrix} I_m u_1(k) & I_m u_2(k) & \cdots & I_m u_r(k) \end{bmatrix}, \qquad
U_n(k) = \begin{bmatrix} I_n u_1(k) & I_n u_2(k) & \cdots & I_n u_r(k) \end{bmatrix},
$$

where I_m ∈ R^{m×m} and I_n ∈ R^{n×n} are the identity matrices of order m and n, respectively. Substituting the expression for T_N u_N(0) from (2.21) into (2.20) gives:

$$
y_N(0) = \Phi\Theta,
$$

where

$$
\Theta = \begin{bmatrix} x(0) \\ d \\ b \end{bmatrix}, \qquad
\Phi = \begin{bmatrix}
C & U_m(0) & 0_{m\times nr} \\
CA & U_m(1) & CU_n(0) \\
CA^2 & U_m(2) & CAU_n(0) + CU_n(1) \\
\vdots & \vdots & \vdots \\
CA^{N-1} & U_m(N-1) & \sum_{k=0}^{N-2} CA^{N-k-2} U_n(k)
\end{bmatrix}.
$$

Using the Moore-Penrose pseudo-inverse, we get the following:

$$
\Theta = \Phi^{\dagger} y_N(0), \tag{2.22}
$$

where Φ† denotes the Moore-Penrose pseudo-inverse of Φ. Then we have:

$$
b = \Theta(mr + n + 1 : (r + 1)n + mr), \qquad d = \Theta(n + 1 : mr + n). \tag{2.23}
$$

This method also provides an estimate of the initial state vector x(0):

$$
x(0) = \Theta(1 : n). \tag{2.24}
$$

Example. Assume p = 1, N = 4, m = 1, r = 3, and that A and C are given by (2.17) and (2.19), respectively. Furthermore, assume that we have the input and output data given by Table 2.1 and Table 2.2, respectively. Then we get

$$
\Phi = \begin{bmatrix}
-0.39 & -0.57 & 10 & 98 & 18 & 0 & 0 & 0 & 0 & 0 & 0 \\
-0.41 & -0.30 & 10 & 94 & 73 & -3.9 & -5.7 & -38.22 & -55.86 & -7.02 & -10.26 \\
-0.14 & 0.22 & 20 & 136 & 16 & -7.97 & -8.72 & -76.50 & -83.19 & -35.79 & -47.05 \\
0.27 & 0.66 & 30 & 106 & 33 & -13.24 & -12.23 & -104.73 & -84.41 & -39.39 & -27.22
\end{bmatrix},
$$

and

$$
y_N(0) = \begin{bmatrix} 1.2 \\ 1.1 \\ 1.2 \\ 1.1 \end{bmatrix}.
$$

As such, we have

$$
\Phi^{\dagger} = \begin{bmatrix}
1.66\cdot10^{-4} & -2.26\cdot10^{-4} & -2.13\cdot10^{-4} & 2.92\cdot10^{-4} \\
-1.46\cdot10^{-5} & -1.40\cdot10^{-4} & -6.46\cdot10^{-5} & 1.74\cdot10^{-4} \\
9.54\cdot10^{-3} & -8.08\cdot10^{-3} & -9.13\cdot10^{-3} & 1.12\cdot10^{-2} \\
1.15\cdot10^{-2} & -2.70\cdot10^{-3} & -1.38\cdot10^{-4} & 1.18\cdot10^{-3} \\
-1.23\cdot10^{-2} & 1.92\cdot10^{-2} & 5.82\cdot10^{-3} & -1.27\cdot10^{-2} \\
-2.12\cdot10^{-3} & 2.44\cdot10^{-3} & 2.75\cdot10^{-3} & -3.78\cdot10^{-3} \\
5.50\cdot10^{-4} & 1.01\cdot10^{-4} & 8.51\cdot10^{-5} & -6.88\cdot10^{-4} \\
5.24\cdot10^{-3} & 2.81\cdot10^{-3} & -2.03\cdot10^{-3} & -4.81\cdot10^{-3} \\
-2.90\cdot10^{-3} & 1.02\cdot10^{-3} & 1.18\cdot10^{-2} & -1.28\cdot10^{-2} \\
8.32\cdot10^{-3} & -1.22\cdot10^{-3} & -9.65\cdot10^{-3} & 5.57\cdot10^{-3} \\
1.80\cdot10^{-2} & -9.66\cdot10^{-3} & -1.20\cdot10^{-2} & -1.73\cdot10^{-2}
\end{bmatrix}.
$$

Together with (2.22), this leads to

$$
\Theta = \begin{bmatrix}
-4.49\cdot10^{-5} \\ 4.09\cdot10^{-5} \\ 5.04\cdot10^{-3} \\ 1.21\cdot10^{-2} \\ -1.88\cdot10^{-3} \\
-1.10\cdot10^{-3} \\ 4.73\cdot10^{-5} \\ 1.12\cdot10^{-3} \\ -3.59\cdot10^{-3} \\ 3.75\cdot10^{-3} \\ 8.01\cdot10^{-3}
\end{bmatrix}. \tag{2.25}
$$

Using Θ, (2.23) results in

$$
B = \begin{bmatrix}
-1.10\cdot10^{-3} & 1.12\cdot10^{-3} & 3.75\cdot10^{-3} \\
4.73\cdot10^{-5} & -3.59\cdot10^{-3} & 8.01\cdot10^{-3}
\end{bmatrix}, \tag{2.26}
$$

$$
D = \begin{bmatrix} 5.04\cdot10^{-3} & 1.21\cdot10^{-2} & -1.88\cdot10^{-3} \end{bmatrix}, \tag{2.27}
$$

and (2.24) in

$$
x(0) = \begin{bmatrix} -4.49\cdot10^{-5} \\ 4.09\cdot10^{-5} \end{bmatrix}. \tag{2.28}
$$
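The output-error steps (2.20)-(2.24) can be sketched end-to-end in NumPy as follows. This is an illustrative reconstruction under this section's notation, not the program from Appendix A; the helper name `estimate_b_d` and its argument layout are assumptions.

```python
import numpy as np

def estimate_b_d(A, C, u, y):
    """Build Phi as in (2.20)-(2.21), solve Theta = pinv(Phi) y_N(0) as in
    (2.22), and unpack x(0), d, b via (2.23)-(2.24).  Hypothetical helper:
    u is N x r (inputs u(0..N-1)), y is N x m (outputs y(0..N-1))."""
    N, r = u.shape
    m, n = C.shape
    # U_m(k) = [I_m u_1(k) ... I_m u_r(k)]; likewise U_n(k) with I_n
    Um = lambda k: np.kron(u[k], np.eye(m))          # shape m x mr
    Un = lambda k: np.kron(u[k], np.eye(n))          # shape n x nr
    Apow = [np.linalg.matrix_power(A, j) for j in range(N)]
    rows = []
    for i in range(N):
        # block row i of Phi: [C A^i, U_m(i), sum_{k<i} C A^{i-k-1} U_n(k)]
        conv = sum((C @ Apow[i - k - 1] @ Un(k) for k in range(i)),
                   np.zeros((m, n * r)))
        rows.append(np.hstack([C @ Apow[i], Um(i), conv]))
    Phi = np.vstack(rows)
    Theta = np.linalg.pinv(Phi) @ y.reshape(-1)      # (2.22)
    x0 = Theta[:n]                                   # (2.24)
    D = Theta[n:n + m * r].reshape(r, m).T           # columns d_1 ... d_r
    B = Theta[n + m * r:].reshape(r, n).T            # columns b_1 ... b_r
    return x0, B, D
```

On exact data from a state-space model this recovers B, D and x(0) whenever Φ has full column rank, which generically requires a sufficiently rich input and Nm ≥ n + rm + rn.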

### 2.3 Predicting gas demand

The methodology described in this chapter can be used to construct the linear discrete-time time-invariant input/state/output system Σ(A, B, C, D) given by (2.1). The matrices A, B, C, D and the initial state are given by (2.16), (2.23), (2.18), (2.23) and (2.24), respectively. Note that the initial state is calculated for the first input/output pair used for the estimations, so u(0) is the first input. For the creation of the matrices we used N + p − 1 data points, so the output y(N + p − 1), resulting from input u(N + p − 1), will be the first prediction.
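Once the system matrices and x(0) are known, producing predictions is just a forward simulation of (2.1). A minimal sketch, illustrative only (the thesis' own program is listed in Appendix A):

```python
import numpy as np

def predict(A, B, C, D, x0, u):
    """Simulate x(k+1) = A x(k) + B u(k), y(k) = C x(k) + D u(k)
    from the estimated initial state for a given input sequence u."""
    x = np.asarray(x0, dtype=float)
    y = []
    for uk in u:
        y.append(C @ x + D @ uk)   # predicted output y(k)
        x = A @ x + B @ uk         # state update
    return np.array(y)
```

Feeding the inputs from time N + p − 1 onward then yields the predicted demand described above.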

A Python program is written that calculates the system matrices A, B, C, D and the initial state x(0), and creates a system which can produce the estimated output y for given input u. Additionally, we wrote a program to plot the actual output against the estimated output, as well as the error. The code can be found in Appendix A. Such a plot is given in Figure 2.1. For the creation of the matrices we used n = 17, N = 2500, p = 30. The parameter i = 2530 indicates the time index after which only predicted output is shown; in the plot this is indicated by the vertical line. The second part of the plot contains the estimation error. The horizontal lines indicate a certain allowed error, set for this example at 0.25 · 10^7.