Relationship between Granger non-causality and network graph of state-space
representations
Jozsa, Monika
Document Version
Publisher's PDF, also known as Version of record
Publication date: 2019
Citation for published version (APA):
Jozsa, M. (2019). Relationship between Granger non-causality and network graph of state-space representations. University of Groningen.
Chapter 4

Granger causality and Kalman representations with transitive acyclic directed graph zero structure
We have seen in Chapters 2 and 3 that the existence of Kalman representations whose network graphs are star graphs is equivalent to the lack of conditional and unconditional Granger causalities in the output process of those representations. In this chapter we present a generalization of these results by introducing Kalman representations with network graphs that are transitive acyclic directed graphs (TADG). We call these representations Kalman representations with TADG-zero structure. The existence of Kalman representations with TADG-zero structure is then associated with the presence of conditional and unconditional Granger non-causalities in the output process; these non-causalities are consistent with the network graph of the representation. We also present algorithms for constructing a Kalman representation with TADG-zero structure in the presence of the appropriate conditional and unconditional Granger non-causalities.
The paper (Caines and Wynn, 2007) is the closest one to the results in this chapter. The cited paper studies LTI–SS representations of Gaussian processes in a form that is a subclass of the Kalman representations with TADG-zero structure, with additional assumptions on the covariance matrix of the noise process. The existence of these LTI–SS representations is formalized by conditional orthogonality conditions which are stronger than the conditional orthogonality conditions that are counterparts of the conditional and unconditional Granger causalities proposed in this chapter. Note that (Caines and Wynn, 2007) contains no detailed proofs or algorithms to calculate the representations. Furthermore, it does not deal with non-coercive or non-Gaussian processes. The results of this chapter are based on the conference paper (Jozsa et al., 2017a). However, several additional statements are presented here that were not in the cited paper.
This chapter is organized as follows: First, we introduce Kalman representations with TADG-zero structure. Then, we characterize their existence in terms of conditional and unconditional Granger causality. Next, the construction for calculating Kalman representations with TADG-zero structure and the corresponding algorithms are presented. Finally, we provide an example to illustrate the results. The proofs of the statements can be found in Appendices 4.A, 4.B, and 4.C. If not stated otherwise, we assume throughout this chapter that $y = [y_1^\top, \ldots, y_n^\top]^\top$ is a ZMSIR process where $n \geq 2$, $y_i \in \mathbb{R}^{r_i}$, and $r_i > 0$ for $i = 1, \ldots, n$.
4.1 Kalman representation with TADG-zero structure
In this section, we introduce Kalman representations whose network graph is a transitive acyclic directed graph (TADG) and discuss their properties. To begin with, we define the class of transitive acyclic directed graphs.
Definition 4.1 (TADG). A directed graph $G = (V, E)$, with set of nodes $V = \{1, \ldots, k\}$ and set of directed edges $E \subseteq V \times V$, is called acyclic if it contains no cycle, i.e., no closed directed path. Furthermore, it is transitive if for $i, j, l \in V$ the implication $(i, j), (j, l) \in E \Rightarrow (i, l) \in E$ holds. The class of transitive acyclic directed graphs is denoted by TADG.
For convenience we make the following assumption, which applies to all ZMSIR processes throughout this chapter.
Assumption 4.2. For a process $y = [y_1^\top, \ldots, y_n^\top]^\top$, we assume that none of the components of $y$ is a white noise process or, equivalently, that the dimension of a minimal Kalman representation of $y_i$ is strictly positive for all $i \in \{1, \ldots, n\}$.
For a TADG $G = (V = \{1, \ldots, n\}, E)$, the set of nodes $V$ has a so-called topological ordering. By topological ordering we mean an ordering on $V$ such that if $(i, j) \in E$ is a directed edge then $i > j$. Throughout this chapter we use integers to represent nodes of graphs and, without loss of generality, we assume the following:
Assumption 4.3. Consider a TADG $G = (V, E)$ where $V = \{1, \ldots, n\}$. Then $(i, j) \in E$ implies $i > j$.
Remark 4.4. Let $G = (V = \{a_1, \ldots, a_n\}, E)$ be a TADG; then we can generate a topological ordering on $G$ as follows: Assume that the leaves of $G$ are $(a_{i_1}, \ldots, a_{i_{k_1}})$, where $k_1 \geq 1$, $i_j \in \{1, \ldots, n\}$ for $j = 1, \ldots, k_1$, and the leaves are enumerated in an arbitrary order. Then delete the leaves of $G$ and all the directed edges whose target node is a leaf, and call the new graph $G_1$. Assume now that the leaves of $G_1$ are $(a_{i_{k_1+1}}, \ldots, a_{i_{k_2}})$, where $k_2 > k_1$, $i_j \in \{1, \ldots, n\}$ for $j = k_1+1, \ldots, k_2$, and the leaves are again enumerated in an arbitrary order. Then delete the leaves of $G_1$ and all the directed edges whose target node is a leaf of $G_1$. Continue in this manner until each node of $G$ has been enumerated. The new graph $\tilde{G} = (\{1, \ldots, n\}, \tilde{E})$, where $(k, l) \in \tilde{E}$ if and only if $(a_{i_k}, a_{i_l}) \in E$, is isomorphic to $G$ and has a topological ordering.
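The leaf-removal procedure of Remark 4.4 can be sketched in a few lines of Python. This is an illustrative, hypothetical helper (not from the thesis): edges are pairs $(i, j)$ from parent $i$ to child $j$, a leaf is a node with no outgoing edge, and the graph is assumed acyclic.

```python
def topological_relabel(nodes, edges):
    """Relabel the nodes of a TADG as 1..n so that every edge (i, j)
    (from parent i to child j) satisfies i > j, by repeatedly
    enumerating and deleting leaves (nodes with no outgoing edge),
    as in Remark 4.4. Assumes the graph is acyclic."""
    remaining = set(nodes)
    live_edges = set(edges)
    label, next_label = {}, 1
    while remaining:
        # leaves of the current graph: nodes that are a parent of no one
        leaves = sorted(v for v in remaining
                        if not any(i == v for (i, j) in live_edges))
        for v in leaves:  # enumerated here in an arbitrary (sorted) order
            label[v] = next_label
            next_label += 1
        remaining -= set(leaves)
        # delete all edges whose target node is a removed leaf
        live_edges = {(i, j) for (i, j) in live_edges if j in remaining}
    new_edges = {(label[i], label[j]) for (i, j) in edges}
    return label, new_edges
```

For the chain $a \to b \to c$ this returns the labels $c \mapsto 1$, $b \mapsto 2$, $a \mapsto 3$, and every relabeled edge $(i, j)$ indeed satisfies $i > j$.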
The class of transitive acyclic directed graphs will be used to represent the internal interconnection structure of Kalman representations. We will say that a Kalman representation has TADG-zero structure if its network graph is a TADG. To define this class of Kalman representations, we need to introduce some new terminology.
Notation 4.5 (parent and non-parent succeeding nodes). Let $G = (V = \{1, \ldots, n\}, E)$ be a TADG and consider a node $j \in V$. The set of parent nodes $\{i \in V \mid (i, j) \in E\}$ of $j$ is denoted by $I_j$. In addition, the set of non-parent succeeding (with respect to the topological ordering of $V$) nodes $\{i \in V \mid i > j, (i, j) \notin E\}$ of $j$ is denoted by $\bar{I}_j$.
The topological ordering on the set of nodes of a TADG implies that $I_j, \bar{I}_j \subseteq \{j+1, \ldots, n\}$ for all $j \in \{1, \ldots, n-1\}$. Furthermore, from the definition of $\bar{I}_j$, we have that $I_j \cup \bar{I}_j = \{j+1, \ldots, n\}$. The next notation helps in referring to components of processes beyond the original partitioning of those processes.
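The sets $I_j$ and $\bar{I}_j$ of Notation 4.5 are straightforward to compute. The sketch below is an illustrative helper (not from the thesis), assuming the node ordering of Assumption 4.3; it also makes the identity $I_j \cup \bar{I}_j = \{j+1, \ldots, n\}$ easy to verify.

```python
def parent_sets(n, edges):
    """For a TADG on V = {1, ..., n} with edges (i, j), i > j
    (Assumption 4.3), return for every node j the parent set I[j]
    and the non-parent succeeding set Ibar[j] of Notation 4.5."""
    I = {j: {i for i in range(1, n + 1) if (i, j) in edges}
         for j in range(1, n + 1)}
    Ibar = {j: {i for i in range(j + 1, n + 1) if (i, j) not in edges}
            for j in range(1, n + 1)}
    return I, Ibar
```

For the example graph used later in this section, $G = (\{1,2,3,4\}, \{(4,1), (4,2), (3,1), (2,1)\})$, this gives $I_1 = \{2,3,4\}$, $\bar I_1 = \emptyset$, $I_2 = \{4\}$, and $\bar I_2 = \{3\}$.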
Notation 4.6 (sub-process). Consider the finite set $V = \{1, \ldots, n\}$ and a tuple $J = (j_1, \ldots, j_l)$ where $j_1, \ldots, j_l \in V$. Then for a process $y = [y_1^\top, \ldots, y_n^\top]^\top$, we denote the sub-process $[y_{j_1}^\top, \ldots, y_{j_l}^\top]^\top$ by $y_{j_1, \ldots, j_l}$ or by $y_J$. By abuse of terminology, if $J$ is a subset of $V$ and not a tuple, then $y_J$ will mean the process $y_\alpha$, where $\alpha$ is the tuple obtained by taking the elements of $J$ in increasing order, i.e., if $J = \{j_1, \ldots, j_k\}$, $j_1 < j_2 < \cdots < j_k$, then $\alpha = (j_1, \ldots, j_k)$. However, $y_{\alpha, \beta}$ always means $[y_\alpha^\top, y_\beta^\top]^\top$, regardless of the topological order between the elements of $\alpha$ and $\beta$.
Next, we introduce what we mean by a partition of matrices. Call the set $\{(p_i, q_i)\}_{i=1}^{k}$ a partition of $(p, q)$, where $p, q > 0$, if $\sum_{i=1}^{k} p_i = p$ and $\sum_{i=1}^{k} q_i = q$, where $p_i, q_i > 0$ for $i = 1, \ldots, k$.
Definition 4.7 (partition of a matrix). Let $\{(p_i, q_i)\}_{i=1}^{k}$ be a partition of $(p, q)$ for some $p, q > 0$. Then the partition of a matrix $M \in \mathbb{R}^{p \times q}$ with respect to $\{(p_i, q_i)\}_{i=1}^{k}$ is the collection of matrices $\{M_{ij} \in \mathbb{R}^{p_i \times q_j}\}_{i,j=1}^{k}$ such that
$$M = \begin{bmatrix} M_{11} & \cdots & M_{1k} \\ \vdots & \ddots & \vdots \\ M_{k1} & \cdots & M_{kk} \end{bmatrix}.$$
In Definition 4.7, the indexing of the matrix $M$ refers to the blocks of $M$ and does not refer directly to the elements of $M$. It is parallel to the component-wise indexing of processes, where the components can be multidimensional.
Notation 4.8 (sub-matrix). Consider the partition $\{M_{ij} \in \mathbb{R}^{p_i \times q_j}\}_{i,j=1}^{k}$ of a matrix $M \in \mathbb{R}^{p \times q}$ with respect to the partition $\{(p_i, q_i)\}_{i=1}^{k}$ of $(p, q)$, and consider tuples $I = (i_1, \ldots, i_n)$ and $J = (j_1, \ldots, j_m)$ where $i_1, \ldots, i_n, j_1, \ldots, j_m \in \{1, \ldots, k\}$. Then by the sub-matrix of $M$ indexed by $IJ$ we mean
$$M_{IJ} := \begin{bmatrix} M_{i_1 j_1} & \cdots & M_{i_1 j_m} \\ \vdots & \ddots & \vdots \\ M_{i_n j_1} & \cdots & M_{i_n j_m} \end{bmatrix}.$$
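Notation 4.8 translates directly into block-index bookkeeping. The following NumPy sketch (an illustrative helper, not from the thesis) extracts $M_{IJ}$ from a block-partitioned matrix using 1-indexed block tuples.

```python
import numpy as np

def sub_matrix(M, row_sizes, col_sizes, I, J):
    """Sub-matrix M_IJ of Notation 4.8: M is partitioned into blocks
    of sizes row_sizes x col_sizes, and M_IJ stacks the blocks
    selected by the (1-indexed) tuples I and J."""
    r_off = np.concatenate(([0], np.cumsum(row_sizes)))
    c_off = np.concatenate(([0], np.cumsum(col_sizes)))
    def block(i, j):
        # block M_ij of the partition
        return M[r_off[i - 1]:r_off[i], c_off[j - 1]:c_off[j]]
    return np.block([[block(i, j) for j in J] for i in I])
```

For a $4 \times 4$ matrix partitioned into $2 \times 2$ blocks, `sub_matrix(M, [2, 2], [2, 2], (2,), (1, 2))` returns the bottom half of $M$, i.e., $[M_{21}\ M_{22}]$.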
We are now ready to define Kalman representations which have a so-called TADG-zero structure:
Definition 4.9 (G-zero structure). Consider a process $y = [y_1^\top, \ldots, y_n^\top]^\top$ and a TADG $G = (V = \{1, \ldots, n\}, E)$. Let $(A, K, C, I, e)$ be a $p$-dimensional Kalman representation of $y \in \mathbb{R}^r$ and partition $A$ with respect to $\{(p_i, p_i)\}_{i=1}^{n}$, $K$ with respect to $\{(p_i, r_i)\}_{i=1}^{n}$, and $C$ with respect to $\{(r_i, p_i)\}_{i=1}^{n}$, where $\{(p_i, r_i)\}_{i=1}^{n}$ is a partition of $(p, r)$. Then we say that $(A, K, C, I, e)$ has G-zero structure if $A_{ij} = 0$, $K_{ij} = 0$, $C_{ij} = 0$ whenever $i \neq j$ and $(j, i) \notin E$. If, in addition, for all $j \in V$ the tuple $J := (j, \bar{I}_j, I_j)$ defines a Kalman representation $(A_{JJ}, K_{JJ}, C_{JJ}, I, [e_j^\top, e_{\bar{I}_j}^\top, e_{I_j}^\top]^\top)$ of $[y_j^\top, y_{\bar{I}_j}^\top, y_{I_j}^\top]^\top$ in causal coordinated form (see Definition 3.1), then we say that $(A, K, C, I, e)$ has causal G-zero structure.
Besides saying that a representation has G-zero structure or causal G-zero structure, we also speak of a representation with G-zero structure or with causal G-zero structure.
Consider the TADGs $G_1 = (\{1, 2\}, \{(2, 1)\})$ and $G_2 = (\{1, 2, \ldots, n\}, \{(n, 1), (n, 2), \ldots, (n, n-1)\})$. If the graph $G$ in Definition 4.9 is $G_1$, then Definition 4.9 coincides with Definition 2.1 in Section 2.1, considering ZMSIR processes that satisfy Assumption 4.2 (see Remark 2.2 in Section 2.1). In a similar manner, if the graph $G$ in Definition 4.9 is $G_2$, then it coincides with Definition 3.1 in Section 3.1, considering ZMSIR processes that satisfy Assumption 4.2.
If a $p$-dimensional Kalman representation $(A, K, C, I, e)$ of $y \in \mathbb{R}^r$ has causal G-zero structure, where $G = (V, E)$ is a TADG, then the partition $\{(p_i, r_i)\}_{i=1}^{n}$ of $(p, r)$ in Definition 4.9 is uniquely determined by $y$. Equivalently, the block dimensions of the partitioned matrices $A$, $K$, and $C$ are uniquely determined by $y$. Indeed, for all nodes $j \in V$ the tuple $J := (j, \bar{I}_j, I_j)$ defines a Kalman representation $(A_{JJ}, K_{JJ}, C_{JJ}, I, [e_j^\top, e_{\bar{I}_j}^\top, e_{I_j}^\top]^\top)$ of $[y_j^\top, y_{\bar{I}_j}^\top, y_{I_j}^\top]^\top$ in causal coordinated form. Therefore, from Chapter 3 we know that the dimensions of $(A_{lk}, K_{lk}, C_{lk})$ for $k, l \in \{j, \bar{I}_j, I_j\}$ are uniquely determined by $[y_j^\top, y_{\bar{I}_j}^\top, y_{I_j}^\top]^\top$. Applying this for the nodes $j = n-1, \ldots, 1$, it is easy to see that all block dimensions of the partitioned matrices $A$, $K$, and $C$ are determined by $y$.
A Kalman representation with TADG-zero structure can be viewed as consisting of subsystems where each subsystem generates a component of $y = [y_1^\top, \ldots, y_n^\top]^\top$.
More precisely, let $G = (V = \{1, \ldots, n\}, E)$ be a TADG and $(A, K, C, I, e, y)$ be a $p$-dimensional Kalman representation with G-zero structure, where $A$, $K$, and $C$ are partitioned with respect to a partition $\{(p_i, r_i)\}_{i=1}^{n}$ of $(p, r)$ as in Definition 4.9. Furthermore, let $x = [x_1^\top, \ldots, x_n^\top]^\top$ be its state such that $x_i \in \mathbb{R}^{p_i}$, $i = 1, \ldots, n$. Then the subsystem with output $y_j$, $j \in V$, is of the form
$$S_j: \begin{cases} x_j(t+1) = A_{jj} x_j(t) + \big(A_{j I_j} x_{I_j}(t) + K_{j I_j} e_{I_j}(t)\big) + K_{jj} e_j(t) \\ y_j(t) = C_{jj} x_j(t) + C_{j I_j} x_{I_j}(t) + e_j(t). \end{cases} \tag{4.1}$$
Notice that if $(i, j) \in E$, i.e., $i$ is a parent node of $j$, then subsystem $S_j$ takes inputs from subsystem $S_i$, namely the state and noise processes of $S_i$. In contrast, if $(i, j) \notin E$, then $S_j$ does not take input from $S_i$. Intuitively, this means that the subsystems communicate with each other as allowed by the directed paths of the graph $G$. Note that by transitivity, if there is a directed path from $i \in V$ to $j \in V$ then there is also an edge $(i, j) \in E$.
Take the TADG $G = (\{1, 2, 3, 4\}, \{(4, 1), (4, 2), (3, 1), (2, 1)\})$ and a process $[y_1^\top, y_2^\top, y_3^\top, y_4^\top]^\top$ with innovation process $[e_1^\top, e_2^\top, e_3^\top, e_4^\top]^\top$. Then a Kalman representation with G-zero structure of $[y_1^\top, y_2^\top, y_3^\top, y_4^\top]^\top$ is given by
$$\begin{bmatrix} x_1(t+1) \\ x_2(t+1) \\ x_3(t+1) \\ x_4(t+1) \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} & A_{13} & A_{14} \\ 0 & A_{22} & 0 & A_{24} \\ 0 & 0 & A_{33} & 0 \\ 0 & 0 & 0 & A_{44} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \\ x_4(t) \end{bmatrix} + \begin{bmatrix} K_{11} & K_{12} & K_{13} & K_{14} \\ 0 & K_{22} & 0 & K_{24} \\ 0 & 0 & K_{33} & 0 \\ 0 & 0 & 0 & K_{44} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \\ e_4(t) \end{bmatrix}$$
$$\begin{bmatrix} y_1(t) \\ y_2(t) \\ y_3(t) \\ y_4(t) \end{bmatrix} = \begin{bmatrix} C_{11} & C_{12} & C_{13} & C_{14} \\ 0 & C_{22} & 0 & C_{24} \\ 0 & 0 & C_{33} & 0 \\ 0 & 0 & 0 & C_{44} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \\ x_4(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \\ e_4(t) \end{bmatrix}, \tag{4.2}$$
where $A_{ij} \in \mathbb{R}^{p_i \times p_j}$, $K_{ij} \in \mathbb{R}^{p_i \times r_j}$, $C_{ij} \in \mathbb{R}^{r_i \times p_j}$ and $y_i, e_i \in \mathbb{R}^{r_i}$, $x_i \in \mathbb{R}^{p_i}$ for some $p_i > 0$, $i, j = 1, 2, 3, 4$. The network graph of this representation is the network of the subsystems $S_1, S_2, S_3, S_4$ defined in (4.1), generating $y_1, y_2, y_3$, and $y_4$, respectively. See Figure 4.1 for an illustration of this network graph.
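As an illustration of the zero pattern in (4.2), the following sketch (a hypothetical helper, assuming square diagonal blocks of the given sizes) checks the condition of Definition 4.9 that $A_{ij} = 0$ whenever $i \neq j$ and $(j, i) \notin E$.

```python
import numpy as np

def has_g_zero_structure(A, sizes, edges):
    """Check the zero pattern of Definition 4.9 on a matrix A that is
    partitioned into square blocks of the given sizes (blocks are
    1-indexed): block A_ij must vanish whenever i != j and the edge
    (j, i) is not in E."""
    off = np.concatenate(([0], np.cumsum(sizes)))
    n = len(sizes)
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if i != j and (j, i) not in edges:
                blk = A[off[i - 1]:off[i], off[j - 1]:off[j]]
                if np.any(blk != 0):
                    return False
    return True
```

With scalar blocks, the $A$-pattern of (4.2) passes the check for $G = (\{1,2,3,4\}, \{(4,1), (4,2), (3,1), (2,1)\})$, while placing a nonzero entry in block $(3, 1)$ violates it, since $(1, 3) \notin E$.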
Motivation for Kalman representations with causal TADG-zero structure
If we consider a general LTI–SS representation of a process $y = [y_1^\top, \ldots, y_n^\top]^\top$ whose network graph is a TADG $G = (V = \{1, \ldots, n\}, E)$, then the noise process could be any process. As for LTI–SS representations in coordinated form in Chapter 3, if the noise process were not the innovation process of $y$, then it could happen that information flows through it in an implicit way that is not allowed by the directed
Figure 4.1: Network graph of the Kalman representation (4.2) with G-zero structure. [The figure shows the subsystems $S_1, \ldots, S_4$ of (4.1); an edge from $S_i$ to $S_j$ carries the pair $(x_i, e_i)$.]
paths (edges) of $G$. However, if we assume that $(A, K, C, I, e, y)$ is a Kalman representation with causal G-zero structure, then $[e_j^\top, e_{I_j}^\top]^\top$ is the innovation process of $[y_j^\top, y_{I_j}^\top]^\top$ and $e_{I_j}$ is the innovation process of $y_{I_j}$ for $j = 1, \ldots, n$. Hence, the present value of $e_j$ depends only on the past and present values of $y_j, y_{I_j}$, whereas the present value of $e_{I_j}$ depends only on the past and present values of $y_{I_j}$. Moreover, $x_{I_j}$ depends only on the past values of $y_{I_j}$ and $x_j$ depends only on the past values of $y_j, y_{I_j}$. That is, in the case of Kalman representations with causal G-zero structure, information only flows from the subsystems $S_{I_j}$, generating $y_{I_j}$, to the subsystem $S_j$, generating $y_j$; see (4.1). That is, the information flows according to the directed paths (edges) of $G$.
Kalman representations with causal TADG-zero structure have a number of desirable properties; e.g., as explained above, the block dimensions of the system matrices are determined by $y$. Furthermore, in order to estimate a state $x_j$ using a Kalman filter, only the output $[y_j^\top, y_{I_j}^\top]^\top$ is necessary (if $j$ is a root node of the TADG, then only $y_j$ is necessary). Moreover, from Lemma 4.10 below, Kalman representations with causal G-zero structure are isomorphic (see Definition 1.11). Hence, if they represent the same output process, their properties are essentially the same. Note that, as a consequence of Lemma 4.10, if a Kalman representation of a process $y$ with causal TADG-zero structure is not minimal, then there does not exist a minimal Kalman representation of $y$ with causal TADG-zero structure.
Lemma 4.10. Consider a TADG $G = (V = \{1, \ldots, n\}, E)$ and a process $y = [y_1^\top, \ldots, y_n^\top]^\top$. Then any two Kalman representations of $y$ with causal G-zero structure are isomorphic.
4.2 Granger causality and Kalman representation with TADG-zero structure
Kalman representations of a process y with causal TADG-zero structure determine causal relationships among the components of y. In fact, we will show that the existence of a Kalman representation of y with causal TADG-zero structure can be characterized by conditional Granger non-causalities among the components of y.
To begin with, we define the G-consistent causality structure of a process, which involves a combination of conditional Granger non-causality conditions between the components of the process.
Definition 4.11 (G-consistent causality structure). Consider a TADG $G = (V, E)$, where $V = \{1, \ldots, n\}$, and a process $y = [y_1^\top, \ldots, y_n^\top]^\top$. We say that $y$ has G-consistent causality structure if $y_i$ conditionally does not Granger cause $y_j$ with respect to $y_{I_j}$ for any $i, j \in V$, $i \neq j$, such that $(i, j) \notin E$.
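The conditions of Definition 4.11 can be enumerated mechanically from the graph. The sketch below (an illustrative helper, not from the thesis) lists, for a TADG on $\{1, \ldots, n\}$, every required condition "$y_i$ does not conditionally Granger cause $y_j$ with respect to $y_{I_j}$" as a triple $(i, j, I_j)$.

```python
def g_consistency_conditions(n, edges):
    """Enumerate the conditional Granger non-causality requirements of
    Definition 4.11: for every ordered pair (i, j), i != j, with
    (i, j) not in E, the component y_i must not conditionally
    Granger cause y_j with respect to y_{I_j}."""
    parents = {j: sorted(i for i in range(1, n + 1) if (i, j) in edges)
               for j in range(1, n + 1)}
    return [(i, j, parents[j])
            for j in range(1, n + 1)
            for i in range(1, n + 1)
            if i != j and (i, j) not in edges]
```

For $G_1 = (\{1, 2\}, \{(2, 1)\})$ the only condition produced is $(1, 2, [\,])$, i.e., "$y_1$ does not Granger cause $y_2$", consistent with the coincidence with Definition 2.3 noted below.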
If $G = (\{1, 2\}, \{(2, 1)\})$, then Definition 4.11 coincides with Definition 2.3. Furthermore, if $G = (\{1, 2, \ldots, n\}, \{(n, 1), (n, 2), \ldots, (n, n-1)\})$, then Definition 4.11 coincides with Definition 3.3.
Remark 4.12. Notice that if $i$ is a root node in the TADG, i.e., $I_i = \emptyset$, then none of the other components causes $y_i$. In this case, the conditional Granger non-causality condition that for $(j, i) \notin E$ the process $y_j$ conditionally does not Granger cause $y_i$ with respect to $y_{I_i}$ simplifies to: $y_j$ does not Granger cause $y_i$.
Lemma 4.13 below provides an equivalent reformulation of Definition 4.11.

Lemma 4.13. Consider a TADG $G = (V, E)$, where $V = \{1, \ldots, n\}$, and a process $y = [y_1^\top, \ldots, y_n^\top]^\top$. Then $y$ has G-consistent causality structure if and only if

• $y_j$ does not Granger cause $y_{I_j}$,
• $y_{\bar{I}_j}$ does not Granger cause $y_{I_j}$,
• $y_j$ does not Granger cause $[y_{I_j}^\top, y_{\bar{I}_j}^\top]^\top$,
• $y_{\bar{I}_j}$ does not Granger cause $[y_{I_j}^\top, y_j^\top]^\top$,

for all nodes $j \in V$ of $G$.
The main result of this chapter includes a condition for the existence of minimal Kalman representations with G-zero structure. For this, we recall Definition 3.4, the definition of the conditionally trivial intersection of two subspaces $U, V \subseteq H$ in a Hilbert space $H$ with respect to a closed subspace $W \subseteq H$.
Definition 4.14 (conditionally trivial intersection). Consider subspaces $U, V, W \subseteq H$ such that $W$ is closed. Then $U, V$ have a conditionally trivial intersection with respect to $W$, denoted by $U \cap V \mid W = \{0\}$, if
$$\{u - E_l[u \mid W] \mid u \in U\} \cap \{v - E_l[v \mid W] \mid v \in V\} = \{0\},$$
i.e., the intersection of the projections of $U$ and $V$ onto the orthogonal complement of $W$ in $H$ is the zero subspace.
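In a finite-dimensional setting, Definition 4.14 can be checked numerically. The sketch below is illustrative only (not from the thesis): subspaces are given by full-column-rank basis matrices, $E_l[\,\cdot \mid W]$ is taken to be the orthogonal projection onto $\mathrm{span}(W)$, and triviality of the intersection of the projected subspaces is tested via a rank identity.

```python
import numpy as np

def conditionally_trivial_intersection(U, V, W, tol=1e-9):
    """Numerical check of Definition 4.14 for finite-dimensional
    subspaces: U, V, W are matrices whose columns are full-rank bases.
    Project the columns of U and V onto the orthogonal complement of
    span(W); the projected subspaces intersect only in {0} iff
    rank([U' V']) = rank(U') + rank(V')."""
    Qw, _ = np.linalg.qr(W)                 # orthonormal basis of span(W)
    proj = lambda M: M - Qw @ (Qw.T @ M)    # projection onto span(W)-perp
    Up, Vp = proj(U), proj(V)
    rank = lambda M: np.linalg.matrix_rank(M, tol)
    return rank(np.hstack([Up, Vp])) == rank(Up) + rank(Vp)
```

For example, in $\mathbb{R}^3$ with $W = \mathrm{span}(e_3)$ and $U = \mathrm{span}(e_1)$: the pair $V = \mathrm{span}(e_1 + e_3)$ fails the condition (both project onto $e_1$), while $V = \mathrm{span}(e_2)$ satisfies it.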
Now we are ready to state the main result of this chapter:
Theorem 4.15. Consider the following statements for a TADG $G = (V = \{1, \ldots, n\}, E)$ and a process $y = [y_1^\top, \ldots, y_n^\top]^\top$:

(i) $y$ has G-consistent causality structure;
(ii) (i) holds and for any node $j \in V$ in $G$
$$E_l[H^{y_j}_{t+} \mid H^{y_j, y_{I_j}}_{t-}] \cap E_l[H^{y_{\bar{I}_j}}_{t+} \mid H^{y_{\bar{I}_j}, y_{I_j}}_{t-}] \;\big|\; E_l[H^{y_{I_j}}_{t+} \mid H^{y_{I_j}}_{t-}] = \{0\}; \tag{4.3}$$
(iii) there exists a minimal Kalman representation of $y$ with causal G-zero structure;
(iv) there exists a Kalman representation of $y$ with causal G-zero structure;
(v) there exists a Kalman representation of $y$ with G-zero structure.

Then, the following hold:
(a) (ii) $\Leftrightarrow$ (iii); (b) (i) $\Rightarrow$ (v); (c) (iv) $\Rightarrow$ (i).
If, in addition, $y$ is coercive, then we have (d) (i) $\Leftrightarrow$ (iv) $\Leftrightarrow$ (v).
The proof can be found in Appendix 4.C.
The intuition behind Theorem 4.15 is the following. If the information flows among the subsystems $\{S_i\}_{i=1}^{n}$ (see (4.1)) according to the topology of a TADG $G = (V = \{1, \ldots, n\}, E)$, then the outputs of subsystems that are not connected by a directed path (edge) in $G$ should not influence each other. For instance, there is no edge from a child node to its parent nodes, which implies that $y_j$ should not Granger cause $y_{I_j}$. Similarly, $(i, k) \notin E$ if $i \in \bar{I}_j$ and $k \in I_j$; thus $y_{\bar{I}_j}$ should not Granger cause $y_{I_j}$. In addition, the succeeding non-parent nodes of a node $j$ are disconnected from $j$, i.e., there is no edge from $j$ to $\bar{I}_j$ or from $\bar{I}_j$ to $j$; they can only have common parent nodes in $I_j$. In a similar manner to Chapter 3, this implies that $y_{\bar{I}_j}$ and $y_j$ conditionally do not Granger cause each other with respect to $y_{I_j}$. With the help of Lemmas 4.13 and 4.32 in Appendix 4.A, it can be seen that the discussion above supports the statement that a Kalman representation with causal G-zero structure implies G-consistent causality structure in its output process; see the proof of Theorem 4.15 for more details. The implication that a process with G-consistent causality structure always has a Kalman representation with G-zero structure is more involved. It is based on the construction of the Kalman representation with G-zero structure; see Section 4.3 and the proof of Theorem 4.15.
Condition (ii) for minimality in Theorem 4.15 can be explained as follows. It can be shown that a Kalman representation with causal G-zero structure is observable, so for minimality we only have to ensure its reachability. This is equivalent to the components of the state, denoted by $x = [x_1^\top, \ldots, x_n^\top]^\top$, being linearly independent at each time. In a Kalman representation with causal G-zero structure, the space generated by the components of $x_j(t)$ for a node $j \in V$ is $E_l[H^{y_j}_{t+} \mid H^{y_j, y_{I_j}}_{t-}]$, the space generated by the components of $x_{\bar{I}_j}(t)$ is $E_l[H^{y_{\bar{I}_j}}_{t+} \mid H^{y_{\bar{I}_j}, y_{I_j}}_{t-}]$, and the space generated by the components of $x_{I_j}(t)$ is $E_l[H^{y_{I_j}}_{t+} \mid H^{y_{I_j}}_{t-}]$. It can then be seen that condition (ii) is a necessary and sufficient condition for the state $x$ to be linearly independent in a Kalman representation with causal G-zero structure (for more details see the proof of Theorem 4.15 in Appendix 4.C).
Using minimal Kalman representations with causal G-zero structure is desirable since they are isomorphic to any other minimal Kalman representation of the same process (see Proposition 1.12). Hence, any property derived for a minimal Kalman representation with causal G-zero structure, if it is invariant under isomorphism, remains valid for any other minimal Kalman representation of the same process. Theorem 4.15 gives a necessary and sufficient condition for the existence of minimal Kalman representations with causal G-zero structure. Finding conditions on the output process that ensure the existence of a minimal Kalman representation with (non-causal) G-zero structure is an open problem, to the best of the author's knowledge.
4.3 Computing Kalman representations with TADG-zero structure
Assuming that a ZMSIR process $y = [y_1^\top, \ldots, y_n^\top]^\top$ has G-consistent causality structure, a Kalman representation of $y$ with G-zero structure can be calculated algorithmically. In this section, we formulate two algorithms for this purpose: the first algorithm, Algorithm 11, takes the second-order statistics of $y$ as input and calculates a Kalman representation of $y$ with G-zero structure. The second algorithm, Algorithm 12, calculates the same representation but takes an arbitrary LTI–SS representation of $y$ as its input.
In the rest of this chapter, we will use the following notation:
Notation 4.16. The restriction of a TADG $G = (V = \{1, \ldots, k\}, E)$ to $I = \{i_1, \ldots, i_p\} \subseteq V$ is the graph defined by $G|_I := (\{i_1, \ldots, i_p\}, \{(i, j) \in E \mid i, j \in I\})$.
Remark 4.17. The restriction of a TADG to any subset of nodes is a TADG.
Consider a TADG $G = (V = \{1, \ldots, n\}, E)$ and a process $y = [y_1^\top, \ldots, y_n^\top]^\top$, and recall that we assumed a topological ordering on $V$, see Assumption 4.3. The main idea of the procedure that calculates a Kalman representation of $y$ with G-zero structure is as follows: first, we take a minimal Kalman representation $S_0$ of $y_n$. Second, $S_0$ is extended to a Kalman representation $S_1$ of $[y_{n-1}^\top, y_n^\top]^\top$ with $G|_{\{n-1, n\}}$-zero structure. That is, if $(n, n-1) \in E$ then $S_1$ is in block triangular form; otherwise it is in block diagonal form, i.e., the system matrices are block diagonal, see Lemma 4.23 below. Then we continue as follows: for $i = 2, \ldots, n-1$, $S_{i-1}$ is extended to a Kalman representation $S_i$ of $[y_{n-i}^\top, \ldots, y_n^\top]^\top$ with $G|_{\{n-i, \ldots, n\}}$-zero structure. In general, the extension of $S_{i-1}$ can happen in three different ways, depending on the edges of $G$: in the first case, when $\bar{I}_{n-i}$ is empty, the extended representation has block triangular form; in the second case, when $I_{n-i}$ is empty, it has block diagonal form; and in the third case, when neither $\bar{I}_{n-i}$ nor $I_{n-i}$ is empty, it has coordinated form. Note that $\bar{I}_{n-i} \cup I_{n-i} = \{n-i+1, \ldots, n\}$, i.e., $\bar{I}_{n-i}$ and $I_{n-i}$ cannot both be empty for $i = 2, \ldots, n-1$. To ease the formulation of the procedure described above, we introduce some auxiliary results and algorithms on the above-mentioned extensions of Kalman representations.
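The case analysis of the construction above depends only on the sets $I_j$ and $\bar{I}_j$ of the component being added. The following small helper is illustrative only (it is not one of the thesis algorithms); it computes, for each component added in the order $n-1, \ldots, 1$, which of the three extension cases applies.

```python
def extension_plan(n, edges):
    """For the construction procedure of Section 4.3: starting from a
    representation of y_n and adding components y_{n-1}, ..., y_1 one
    by one, decide for each new component j which extension is needed,
    based on I_j (parents) and Ibar_j (non-parent successors):
    block triangular if Ibar_j is empty, block diagonal if I_j is
    empty, coordinated form otherwise."""
    plan = []
    for j in range(n - 1, 0, -1):  # components are added in order n-1, ..., 1
        I = {i for i in range(j + 1, n + 1) if (i, j) in edges}
        Ibar = set(range(j + 1, n + 1)) - I
        if not Ibar:
            case = "block triangular"
        elif not I:
            case = "block diagonal"
        else:
            case = "coordinated"
        plan.append((j, case))
    return plan
```

For the example graph of Section 4.1, $G = (\{1,2,3,4\}, \{(4,1), (4,2), (3,1), (2,1)\})$, adding $y_3$ requires a block diagonal extension, adding $y_2$ a coordinated-form extension, and adding $y_1$ a block triangular extension.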
4.3.1 Auxiliary results
To extend a Kalman representation of $y_2$ to a Kalman representation of $[y_1^\top, y_2^\top]^\top$ in block triangular form, we will use the following definition:

Definition 4.18. Consider an observable Kalman representation $(A_{22}, K_{22}, C_{22}, I, e_2)$ of $y_2$. An extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block triangular form is an observable Kalman representation of $y$ of the form
$$\begin{bmatrix} x_1(t+1) \\ x_2(t+1) \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} K_{11} & K_{12} \\ 0 & K_{22} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix}$$
$$\begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} = \begin{bmatrix} C_{11} & C_{12} \\ 0 & C_{22} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix}. \tag{4.4}$$
Next, we present Lemma 4.19, Algorithm 8, and Corollary 4.20 below on extensions of Kalman representations in block triangular form.
Lemma 4.19. Consider a process $y = [y_1^\top, y_2^\top]^\top$ and an observable Kalman representation $(A_{22}, K_{22}, C_{22}, I, e_2)$ of $y_2$. If $y_1$ does not Granger cause $y_2$, then there exists an extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block triangular form. Moreover, if the representation $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ is minimal, then there exists an extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in causal block triangular form which is a minimal Kalman representation of $y$.
The proof can be found in Appendix 4.B. Note that Lemma 4.19 can be seen as a consequence of Theorem 2.5. If $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ is minimal, then by the isomorphism between minimal Kalman representations, the minimal Kalman representations of $y_2$ in Theorem 2.5 and in Lemma 4.19 are isomorphic. Using this isomorphism, the minimal Kalman representations of $y$ in Theorem 2.5 and in Lemma 4.19 are isomorphic as well. If $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ is not minimal, then the representation of $y_2$ in Theorem 2.5 can be transformed to the representation of $y_2$ in Lemma 4.19 with a non-singular state-space transformation. Also, the representation of $y$ in Theorem 2.5 can be transformed to an observable Kalman representation of $y$ that is the extension of the representation of $y_2$ in Lemma 4.19. Therefore, Theorem 2.5 ensures the existence of the Kalman representations of $y$ in Lemma 4.19, provided that $y_1$ does not Granger cause $y_2$.
The representation in Lemma 4.19 can be calculated using Algorithm 5; this is elaborated in Algorithm 8 below. Recall that for a pair $(A, C)$ of two matrices $A \in \mathbb{R}^{n \times n}$ and $C \in \mathbb{R}^{m \times n}$, the finite observability matrix up to $N > 0$ is defined by
$$\mathcal{O}_N = [C^\top \ (CA)^\top \ \cdots \ (CA^{N-1})^\top]^\top. \tag{4.5}$$
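The finite observability matrix (4.5) can be computed with a few lines of NumPy; the following is an illustrative sketch.

```python
import numpy as np

def observability_matrix(A, C, N):
    """Finite observability matrix O_N = [C; CA; ...; CA^{N-1}]
    of the pair (A, C), as in (4.5)."""
    blocks, M = [], C
    for _ in range(N):
        blocks.append(M)
        M = M @ A          # next block row: C A^k
    return np.vstack(blocks)
```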
Consider a ZMSIR process $y = [y_1^\top, y_2^\top]^\top$ with covariance sequence $\{\Lambda^y_k\}_{k=0}^{\infty}$. Let $e$ be the innovation process of $y$ and let $N$ be any number larger than or equal to the dimension of a minimal Kalman representation of $y$. Assume that $y_1$ does not Granger cause $y_2$, and note that Algorithm 5 then calculates a minimal Kalman representation of $y$ in block triangular form. Apply Algorithm 8 below with input $\{A_{22}, K_{22}, C_{22}\}$ and $\{\Lambda^y_k\}_{k=0}^{2N}$, where $\{A_{22}, K_{22}, C_{22}\}$ are the system matrices of an observable Kalman representation $(A_{22}, K_{22}, C_{22}, I, e_2)$ of $y_2$. Denote the output by $\{A, K, C\}$. Then we have the following result:

Algorithm 8 Extension of an observable Kalman representation in block triangular form
Input $\{A_{22}, K_{22}, C_{22}\}$ and $\{\Lambda^y_k\}_{k=0}^{2N}$: system matrices of an observable Kalman representation of $y_2$ and covariance sequence of $y = [y_1^\top, y_2^\top]^\top$
Output $\{A, K, C\}$: system matrices of (4.4)
Step 1 Apply Algorithm 5 with input $\{\Lambda^y_k\}_{k=0}^{2N}$ and denote its output by $\{\hat{A}, \hat{K}, \hat{C}\}$, where
$$\hat{A} = \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ 0 & \hat{A}_{22} \end{bmatrix}, \quad \hat{K} = \begin{bmatrix} \hat{K}_{11} & \hat{K}_{12} \\ 0 & \hat{K}_{22} \end{bmatrix}, \quad \hat{C} = \begin{bmatrix} \hat{C}_{11} & \hat{C}_{12} \\ 0 & \hat{C}_{22} \end{bmatrix}.$$
Step 2 Define $T = \hat{\mathcal{O}}_N^{+} \mathcal{O}_N$, where $\hat{\mathcal{O}}_N^{+}$ is the left inverse of the finite (up to $N$) observability matrix of $(\hat{A}_{22}, \hat{C}_{22})$ and $\mathcal{O}_N$ is the finite (up to $N$) observability matrix of $(A_{22}, C_{22})$.
Step 3 Define the following matrices:
$$A = \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} T \\ 0 & A_{22} \end{bmatrix}, \quad K = \begin{bmatrix} \hat{K}_{11} & \hat{K}_{12} \\ 0 & K_{22} \end{bmatrix}, \quad C = \begin{bmatrix} \hat{C}_{11} & \hat{C}_{12} T \\ 0 & C_{22} \end{bmatrix}.$$
Corollary 4.20 (Correctness of Algorithm 8). The tuple $(A, K, C, I, e, y)$ is an observable Kalman representation and it is an extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block triangular form. Furthermore, if $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ is minimal, then $(A, K, C, I, e, y)$ is a minimal Kalman representation in causal block triangular form.

The proof can be found in Appendix 4.B.
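Steps 2 and 3 of Algorithm 8 amount to a change of coordinates of the lower-right subsystem. The sketch below is illustrative only: Step 1 (Algorithm 5 itself) is assumed to have already produced the block triangular matrices $\hat{A}, \hat{K}, \hat{C}$, and the left inverse in Step 2 is realized here via the Moore–Penrose pseudoinverse.

```python
import numpy as np

def obs(A, C, N):
    # finite observability matrix up to N, as in (4.5)
    rows, M = [], C
    for _ in range(N):
        rows.append(M)
        M = M @ A
    return np.vstack(rows)

def algorithm8_steps23(Ahat, Khat, Chat, A22, K22, C22, p1, r1, N):
    """Steps 2-3 of Algorithm 8 (a sketch): re-express the lower-right
    subsystem of Algorithm 5's output (Ahat, Khat, Chat) in the given
    coordinates (A22, C22) via T = pinv(O_N(Ahat22, Chat22)) O_N(A22, C22).
    p1 and r1 are the state and output dimensions of the y_1 subsystem."""
    A11h, A12h, A22h = Ahat[:p1, :p1], Ahat[:p1, p1:], Ahat[p1:, p1:]
    K11h, K12h = Khat[:p1, :r1], Khat[:p1, r1:]
    C11h, C12h, C22h = Chat[:r1, :p1], Chat[:r1, p1:], Chat[r1:, p1:]
    T = np.linalg.pinv(obs(A22h, C22h, N)) @ obs(A22, C22, N)
    p2, r2 = A22.shape[0], C22.shape[0]
    A = np.block([[A11h, A12h @ T], [np.zeros((p2, p1)), A22]])
    K = np.block([[K11h, K12h], [np.zeros((p2, r1)), K22]])
    C = np.block([[C11h, C12h @ T], [np.zeros((r2, p1)), C22]])
    return A, K, C
```

In the scalar example below, the lower-right subsystem of Algorithm 5's hypothetical output has output matrix $2$ while the given realization has output matrix $1$, so $T = 0.5$ and the off-diagonal blocks $\hat{A}_{12}$ and $\hat{C}_{12}$ are rescaled accordingly.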
Remark 4.21. Similar to Algorithm 8, Algorithms 5 and 4 in Chapter 2 also calculate Kalman representations in block triangular form, provided that $y_1$ does not Granger cause $y_2$. However, Algorithm 8 takes, besides the covariances of $y$, the system matrices of a Kalman representation of $y_2$ as its input and extends them in such a way that the input Kalman representation of $y_2$ is a sub-system of the Kalman representation of $y$ that the output matrices of Algorithm 8 define. As a consequence, contrary to Algorithms 5 and 4, the Kalman representation that Algorithm 8 defines is not necessarily minimal, nor is it necessarily in causal block triangular form.
To extend a Kalman representation of $y_2$ to a Kalman representation of $[y_1^\top, y_2^\top]^\top$ in block diagonal form, we will use the following definition:

Definition 4.22. Consider an observable Kalman representation $(A_{22}, K_{22}, C_{22}, I, e_2)$ of $y_2$. An extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block diagonal form is an observable Kalman representation of $y$ of the form
$$\begin{bmatrix} x_1(t+1) \\ x_2(t+1) \end{bmatrix} = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} K_{11} & 0 \\ 0 & K_{22} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix}$$
$$\begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} = \begin{bmatrix} C_{11} & 0 \\ 0 & C_{22} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix}. \tag{4.6}$$
Next, we present Lemma 4.23, Algorithm 9, and Corollary 4.24 below on extensions of Kalman representations in block diagonal form.
Lemma 4.23. Consider a process $y = [y_1^\top, y_2^\top]^\top$ and an observable Kalman representation $(A_{22}, K_{22}, C_{22}, I, e_2)$ of $y_2$. If $y_1$ and $y_2$ mutually do not Granger cause each other, then there exists an extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block diagonal form. Moreover, if the representation $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ is minimal, then there exists an extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block diagonal form which is a minimal Kalman representation of $y$.
The proof can be found in Appendix 4.B.
An algorithm that calculates the representation in Lemma 4.23 is presented next. Consider $y = [y_1^\top, y_2^\top]^\top$, its innovation process $e$, and the covariance sequence $\{\Lambda^{y_1}_k\}_{k=0}^{2N}$ of $y_1$, where $N$ is larger than or equal to the dimension of a minimal Kalman representation of $y_1$. Assume that $y_1$ and $y_2$ mutually do not Granger cause each other. Apply Algorithm 9 below with input $\{A_{22}, K_{22}, C_{22}\}$ and $\{\Lambda^{y_1}_k\}_{k=0}^{2N}$, where $\{A_{22}, K_{22}, C_{22}\}$ are the system matrices of an observable Kalman representation $(A_{22}, K_{22}, C_{22}, I, e_2)$ of $y_2$. Denote the output by $\{A, K, C\}$. Then we have the following result.

Algorithm 9 Extension of an observable Kalman representation in block diagonal form
Input $\{A_{22}, K_{22}, C_{22}\}$ and $\{\Lambda^{y_1}_k\}_{k=0}^{2N}$: system matrices of an observable Kalman representation of $y_2$ and covariance sequence of $y_1$
Output $\{A, K, C\}$: system matrices of (4.6)
Step 1 Apply Algorithm 1 with input $\{\Lambda^{y_1}_k\}_{k=0}^{2N}$ and denote its output by $\{A_{11}, K_{11}, C_{11}\}$.
Step 2 Define the following matrices:
$$A = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}, \quad K = \begin{bmatrix} K_{11} & 0 \\ 0 & K_{22} \end{bmatrix}, \quad C = \begin{bmatrix} C_{11} & 0 \\ 0 & C_{22} \end{bmatrix}.$$
Corollary 4.24 (Correctness of Algorithm 9). The tuple $(A, K, C, I, e, y)$ is an observable Kalman representation and it is an extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block diagonal form. Furthermore, if $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ is minimal, then $(A, K, C, I, e, y)$ is also minimal.
The proof can be found in Appendix 4.B.
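Step 2 of Algorithm 9 is a plain block diagonal assembly; a minimal NumPy sketch (illustrative only):

```python
import numpy as np

def algorithm9_step2(A11, K11, C11, A22, K22, C22):
    """Step 2 of Algorithm 9: assemble the block diagonal system
    matrices of the extension (4.6) from the two subsystem
    realizations (A11, K11, C11) and (A22, K22, C22)."""
    def blkdiag(M1, M2):
        # place M1 and M2 on the diagonal, zeros elsewhere
        return np.block([[M1, np.zeros((M1.shape[0], M2.shape[1]))],
                         [np.zeros((M2.shape[0], M1.shape[1])), M2]])
    return blkdiag(A11, A22), blkdiag(K11, K22), blkdiag(C11, C22)
```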
Next, we discuss the extension of Kalman representations in coordinated form. To extend a Kalman representation of $[y_2^\top, y_3^\top]^\top$ in block triangular form to a Kalman representation of $[y_1^\top, y_2^\top, y_3^\top]^\top$ in coordinated form, we will use the following definition:
Definition 4.25. Consider an observable Kalman representation
$$S = \left( \begin{bmatrix} A_{22} & A_{23} \\ 0 & A_{33} \end{bmatrix}, \begin{bmatrix} K_{22} & K_{23} \\ 0 & K_{33} \end{bmatrix}, \begin{bmatrix} C_{22} & C_{23} \\ 0 & C_{33} \end{bmatrix}, I, \begin{bmatrix} x_2 \\ x_3 \end{bmatrix}, \begin{bmatrix} e_2 \\ e_3 \end{bmatrix} \right)$$
of $[y_2^\top, y_3^\top]^\top$ in block triangular form. An extension of $S$ for $y = [y_1^\top, y_2^\top, y_3^\top]^\top$ in coordinated form is an observable Kalman representation of $y$ of the form
$$\begin{bmatrix} x_1(t+1) \\ x_2(t+1) \\ x_3(t+1) \end{bmatrix} = \begin{bmatrix} A_{11} & 0 & A_{13} \\ 0 & A_{22} & A_{23} \\ 0 & 0 & A_{33} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} K_{11} & 0 & K_{13} \\ 0 & K_{22} & K_{23} \\ 0 & 0 & K_{33} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{bmatrix}$$
$$\begin{bmatrix} y_1(t) \\ y_2(t) \\ y_3(t) \end{bmatrix} = \begin{bmatrix} C_{11} & 0 & C_{13} \\ 0 & C_{22} & C_{23} \\ 0 & 0 & C_{33} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{bmatrix}. \tag{4.7}$$
Next, we present Lemma 4.26, Algorithm 10 and Corollary 4.27 on extensions of Kalman representations in coordinated form.
Lemma 4.26. Consider a process y = [y1^T, y2^T, y3^T]^T and an observable Kalman representation S of [y2^T, y3^T]^T in block triangular form. If

(i) y1 does not Granger cause y3,

(ii) y2 does not Granger cause y3,

(iii) y1 conditionally does not Granger cause y2 with respect to y3,

then there exists an extension of S for y in coordinated form. Moreover, if S is a minimal Kalman representation in causal block triangular form and for i ≠ j, i, j = 1, 2,
\[
E_l[H^{y_i}_{t+} \mid H^{y_i,y_3}_{t-}] \cap E_l[H^{y_j}_{t+} \mid H^{y_j,y_3}_{t-}] \mid E_l[H^{y_3}_{t+} \mid H^{y_3}_{t-}] = \{0\},
\]
then there exists an extension of S for y in coordinated form which is a Kalman representation of y in causal coordinated form.
The proof can be found in Appendix 4.B.
Note that Lemma 4.26 can be seen as a consequence of Theorem 3.5 applied to a process y = [y1^T, y2^T, y3^T]^T. If S is minimal and in causal block triangular form, then by the isomorphism between minimal Kalman representations, the representations of [y2^T, y3^T]^T in Theorem 3.5 and in Lemma 4.26 are isomorphic. Otherwise, the representation of [y2^T, y3^T]^T in Theorem 3.5 can be transformed to the representation S in Lemma 4.26 with a non-singular state-space transformation. Also, the representation of y in Theorem 3.5 can be transformed to an observable Kalman representation of y that is the extension of the representation S in Lemma 4.26. Therefore, Theorem 3.5 ensures the existence of the Kalman representations of y in Lemma 4.19, provided that the conditional and unconditional Granger non-causality conditions hold.

An algorithm that calculates the representation in Lemma 4.26 is presented next. Consider a process y = [y1^T, y2^T, y3^T]^T, its innovation process e and its covariance sequence {Λ_k^y}_{k=0}^{2N}, where N is larger than or equal to the dimension of a minimal Kalman representation of y. Assume that y1 and y2 do not Granger cause each other with respect to y3 and that y1, y2 do not Granger cause y3. Apply Algorithm 10 with input {A2, K2, C2} and {Λ_k^y}_{k=0}^{2N}, where {A2, K2, C2} are the system matrices of an observable Kalman representation (A2, K2, C2, I, e2) of [y2^T, y3^T]^T in block triangular form. Denote the output by {A, K, C}. Then we have the following result:
Corollary 4.27 (Correctness of Algorithm 10). The tuple (A, K, C, I, e, y) is an observable Kalman representation and it is an extension of (A2, K2, C2, I, e2, y2) for y in coordinated form. Furthermore, if (A2, K2, C2, I, e2, y2) is a minimal Kalman representation in causal block triangular form, then (A, K, C, I, e, y) is a Kalman representation in causal coordinated form.
The proof can be found in Appendix 4.B.
Remark 4.28. Similar to Algorithm 10, Algorithms 7 and 6 in Chapter 3 also calculate Kalman representations in coordinated form. However, Algorithms 7 and 6 are formulated for a more general process class, where y has n ≥ 3 components. Furthermore, Algorithm 10 takes, besides the covariances of y, the system matrices of a Kalman representation of [y2^T, y3^T]^T as its input; this input Kalman representation is a sub-system of the Kalman representation of y that the output matrices of Algorithm 10 define. Therefore, contrary to Algorithms 7 and 6, the Kalman representation that Algorithm 10 defines is not necessarily in a causal coordinated form.

Algorithm 10 Extension of an observable Kalman representation in coordinated form
Input {A2, K2, C2} and {Λ_k^y}_{k=0}^{2N}: System matrices
\[
A_2 = \begin{bmatrix} A_{22} & A_{23} \\ 0 & A_{33} \end{bmatrix}, \qquad K_2 = \begin{bmatrix} K_{22} & K_{23} \\ 0 & K_{33} \end{bmatrix}, \qquad C_2 = \begin{bmatrix} C_{22} & C_{23} \\ 0 & C_{33} \end{bmatrix}
\]
of an observable Kalman representation of [y2^T, y3^T]^T and covariance sequence of y = [y1^T, y2^T, y3^T]^T
Output {A, K, C}: System matrices of (4.7)
Step 1 Apply Algorithm 7 with input {Λ_k^y}_{k=0}^{2N} and denote its output by {Â, K̂, Ĉ}, where
\[
\hat{A} = \begin{bmatrix} \hat{A}_{11} & 0 & \hat{A}_{13} \\ 0 & \hat{A}_{22} & \hat{A}_{23} \\ 0 & 0 & \hat{A}_{33} \end{bmatrix}, \qquad \hat{K} = \begin{bmatrix} \hat{K}_{11} & 0 & \hat{K}_{13} \\ 0 & \hat{K}_{22} & \hat{K}_{23} \\ 0 & 0 & \hat{K}_{33} \end{bmatrix}, \qquad \hat{C} = \begin{bmatrix} \hat{C}_{11} & 0 & \hat{C}_{13} \\ 0 & \hat{C}_{22} & \hat{C}_{23} \\ 0 & 0 & \hat{C}_{33} \end{bmatrix}
\]
Step 2 Define T = Ô_N^+ O_N, where Ô_N^+ is the left inverse of the finite (up to N) observability matrix Ô_N of (Â33, Ĉ33) and O_N is the finite (up to N) observability matrix of (A33, C33).
Step 3 Define the following matrices
\[
A = \begin{bmatrix} \hat{A}_{11} & 0 & \hat{A}_{13} T \\ 0 & A_{22} & A_{23} \\ 0 & 0 & A_{33} \end{bmatrix}, \qquad K = \begin{bmatrix} \hat{K}_{11} & 0 & \hat{K}_{13} \\ 0 & K_{22} & K_{23} \\ 0 & 0 & K_{33} \end{bmatrix}, \qquad C = \begin{bmatrix} \hat{C}_{11} & 0 & \hat{C}_{13} T \\ 0 & C_{22} & C_{23} \\ 0 & 0 & C_{33} \end{bmatrix}
\]
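Step 2 of Algorithm 10 matches the state bases of the two representations of y3 through their finite observability matrices. A minimal sketch, assuming both pairs are observable so that the Moore–Penrose pseudoinverse serves as the left inverse:

```python
import numpy as np

def observability_matrix(A, C, N):
    """Finite (up to N) observability matrix [C; CA; ...; C A^{N-1}]."""
    blocks, M = [], C
    for _ in range(N):
        blocks.append(M)
        M = M @ A
    return np.vstack(blocks)

def state_transformation(A33_hat, C33_hat, A33, C33, N):
    """Step 2 of Algorithm 10 (sketch): T = O_hat_N^+ O_N, where O_hat_N^+
    is a left inverse of the observability matrix of (A33_hat, C33_hat)
    and O_N is the observability matrix of (A33, C33). For an observable
    pair, the pseudoinverse is such a left inverse."""
    O_hat = observability_matrix(A33_hat, C33_hat, N)
    O_N = observability_matrix(A33, C33, N)
    return np.linalg.pinv(O_hat) @ O_N
```

If the two pairs realize the same subprocess y3 in different bases, T is the change of basis between their state processes, which is exactly how Â13 and Ĉ13 are adjusted in Step 3.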
4.3.2 Algorithms for Kalman representation with causal TADG-zero structure
To formulate the algorithms that calculate a Kalman representation of y with G-zero structure, we will use Algorithms 8, 9, and 10. Notice that these algorithms only calculate system matrices of Kalman representations if certain Granger causality conditions hold. We ensure these conditions by relying on the following result:
Lemma 4.29. Consider a process y = [y1^T, ..., yn^T]^T and a TADG G = (V, E) with V = {1, ..., n}, and assume that y has G-consistent causality structure. Then for any j ∈ {1, ..., n − 1} the following holds:

(i) yj does not Granger cause y_{Ij}

(ii) y_{Īj} does not Granger cause y_{Ij}

(iii) yj conditionally does not Granger cause y_{Īj} with respect to y_{Ij}

(iv) y_{Īj} conditionally does not Granger cause yj with respect to y_{Ij}.
The proof can be found in Appendix 4.C.
Notice that for j ≠ n, Īj = ∅ and Ij = ∅ cannot happen at the same time, since Īj ∪ Ij = {j + 1, ..., n}. When Īj = ∅, then Ij = {j + 1, ..., n} and the Granger causality conditions (i), (ii), (iii), and (iv) simplify to the condition that yj does not Granger cause y_{Ij}. Hence, Algorithm 8 can be applied. On the other hand, if Ij = ∅, then Īj = {j + 1, ..., n} and the Granger causality conditions (i), (ii), (iii), and (iv) in Lemma 4.29 simplify to the conditions that y_{Īj} does not Granger cause yj and yj does not Granger cause y_{Īj}. Therefore, Algorithm 9 can be applied. If neither Ij nor Īj is the empty set, then from conditions (i)–(iv) Algorithm 10 can be applied.

Consider a process y = [y1^T, ..., yn^T]^T and a TADG G = (V, E) with V = {1, ..., n}, and assume that y has G-consistent causality structure. The algorithms that calculate a Kalman representation of y with G-zero structure are elaborated in Algorithms 11 and 12 below. Algorithm 11 takes the covariances of y as its input and transforms them into a Kalman representation of y with G-zero structure. Algorithm 12 calculates the same representation from an LTI–SS representation of y. Note that by using empirical covariances, Algorithm 11 can be applied to data.
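Since Algorithm 11 only needs the covariance sequence, it can be run on data by substituting sample estimates for the true covariances. A minimal sketch of this estimation step (the function name and data layout are our own choices, not part of the algorithm):

```python
import numpy as np

def empirical_covariances(Y, max_lag):
    """Sample covariances of a zero-mean stationary process.
    Y has shape (T, p): rows are observations y(0), ..., y(T-1) in R^p.
    Returns estimates of Lambda_k = E[y(t+k) y(t)^T] for
    k = 0, ..., max_lag, which can replace the true covariance sequence
    {Lambda_k^y} in the input of Algorithm 11."""
    T = Y.shape[0]
    return [Y[k:].T @ Y[:T - k] / (T - k) for k in range(max_lag + 1)]
```

With max_lag = 2N, the returned list plays the role of {Λ_k^y}_{k=0}^{2N}; the quality of the resulting Kalman representation then depends on the sample size T.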
Let {Λ_k^y}_{k=0}^∞ be the covariance sequence of y and e the innovation process of y. Furthermore, let N be any number larger than or equal to the dimension of a minimal Kalman representation of y. Assume that y has G-consistent causality structure and note that Algorithms 8, 9, and 10 calculate Kalman representations in causal block triangular, block diagonal, and causal coordinated form, respectively (see Corollaries 4.20, 4.24, and 4.27). Apply Algorithm 11 with input {Λ_k^y}_{k=0}^{2N} and denote its output by {A, K, C}. Now apply Algorithm 12 with input {Ā, B̄, C̄, D̄, Λ_0^v}, where (Ā, B̄, C̄, D̄, v) defines an LTI–SS representation of y and Λ_0^v = E[v(t)v^T(t)]. Denote the output by {Â, K̂, Ĉ}. Then we have the following result.
Corollary 4.30 (Correctness of Algorithm 11 and Algorithm 12). The tuples (A, K, C, I, e) and (Â, K̂, Ĉ, I, e) are observable Kalman representations of y with G-zero structure. Furthermore, if for all nodes j ∈ V in G
\[
E_l[H^{y_j}_{t+} \mid H^{y_j, y_{I_j}}_{t-}] \cap E_l[H^{y_{\bar{I}_j}}_{t+} \mid H^{y_{\bar{I}_j}, y_{I_j}}_{t-}] \mid E_l[H^{y_{I_j}}_{t+} \mid H^{y_{I_j}}_{t-}] = \{0\},
\]
then (A, K, C, I, e) and (Â, K̂, Ĉ, I, e) are minimal Kalman representations of y with causal G-zero structure.
Algorithm 11 Kalman representation with causal G-zero structure based on output covariances
Input {Λ_k^y}_{k=0}^{2N}: Covariance sequence of y = [y1^T, ..., yn^T]^T
Output {A, K, C}: System matrices of a Kalman representation of y with G-zero structure
Step 1 Apply Algorithm 1 with input {Λ_k^{y_n}}_{k=0}^{2N} and denote its output by {A_n, K_n, C_n, Q^{e_n}}.
Step 2
for i = n, n − 1, ..., 2
  if Ī_{i−1} = ∅ then apply Algorithm 8 with input {A_i, K_i, C_i} and {Λ_k^z}_{k=0}^{2N}, where z = [y_{i−1}^T, y_{\{i,...,n\}}^T]^T. Denote the output by {A_{i−1}, K_{i−1}, C_{i−1}}.
  else if I_{i−1} = ∅ then apply Algorithm 9 with input {A_i, K_i, C_i} and {Λ_k^{y_{i−1}}}_{k=0}^{2N}. Denote the output by {A_{i−1}, K_{i−1}, C_{i−1}}.
  end if
  if I_{i−1} ≠ ∅ and Ī_{i−1} ≠ ∅ then apply Algorithm 10 with input {A_i, K_i, C_i} and {Λ_k^z}_{k=0}^{2N}, where z = [y_{i−1}^T, y_{Ī_{i−1}}^T, y_{I_{i−1}}^T]^T. Denote the output by {A_{i−1}, K_{i−1}, C_{i−1}}.
  end if
end for
Step 3 Define A = A_1, K = K_1 and C = C_1.
Algorithm 12 Kalman representation with G-zero structure based on LTI–SS representation
Input {Ā, B̄, C̄, D̄, Λ_0^v}, G = (V, E): System matrices of an LTI–SS representation (Ā, B̄, C̄, D̄, v) of y and variance matrix of v
Output {A, K, C}: System matrices of a Kalman representation of y with G-zero structure
Step 1 Find the solution Σ^x of the Lyapunov equation Σ = ĀΣĀ^T + B̄Λ_0^v B̄^T.
Step 2 Define G := C̄Σ^x Ā^T + D̄Λ_0^v B̄^T and calculate the output covariance matrices Λ_k^y := C̄Ā^{k−1}G^T for k = 0, ..., 2n, where n is such that Ā ∈ R^{n×n}.
Step 3 Apply Algorithm 11 with input {Λ_k^y}_{k=0}^{2n} and denote the output by {A, K, C}.
The proof can be found in Appendix 4.C.
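Steps 1 and 2 of Algorithm 12 amount to solving a discrete Lyapunov equation and propagating the resulting covariances. A minimal sketch; we read the k = 0 case of the formula in Step 2 as the stationary output variance C̄Σ^x C̄^T + D̄Λ_0^v D̄^T, which is an assumption on our part, since C̄Ā^{k−1}G^T is only defined for k ≥ 1:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def output_covariances(A_bar, B_bar, C_bar, D_bar, Lambda_v0):
    """Steps 1-2 of Algorithm 12 (sketch). From an LTI-SS representation
    (A_bar, B_bar, C_bar, D_bar, v) of y with E[v(t) v(t)^T] = Lambda_v0:
    Step 1: Sigma_x solves Sigma = A Sigma A^T + B Lambda_v0 B^T.
    Step 2: G := C Sigma_x A^T + D Lambda_v0 B^T and
            Lambda_k^y := C A^{k-1} G^T for k = 1, ..., 2n,
            with Lambda_0^y := C Sigma_x C^T + D Lambda_v0 D^T."""
    n = A_bar.shape[0]
    Sigma_x = solve_discrete_lyapunov(A_bar, B_bar @ Lambda_v0 @ B_bar.T)
    G = C_bar @ Sigma_x @ A_bar.T + D_bar @ Lambda_v0 @ B_bar.T
    Lambdas = [C_bar @ Sigma_x @ C_bar.T + D_bar @ Lambda_v0 @ D_bar.T]
    A_pow = np.eye(n)
    for _ in range(2 * n):
        Lambdas.append(C_bar @ A_pow @ G.T)
        A_pow = A_pow @ A_bar
    return Lambdas  # covariance sequence for Step 3 (Algorithm 11)
```

For a stable Ā the Lyapunov equation has a unique positive semidefinite solution, so the covariance sequence is well defined.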
4.4 Conclusions
In this chapter, we have studied Kalman representations whose network graphs are transitive acyclic directed graphs (TADGs), called Kalman representations with
TADG-zero structure. This class of Kalman representations has been related to conditional Granger causality conditions among the components of their output processes. More precisely, we have shown that there exists a Kalman representation with a TADG G-zero structure if and only if certain conditional Granger causality conditions hold that are determined by G. To construct the Kalman representations in question, we provided algorithms that take an arbitrary LTI–SS representation of the output process, or the covariance sequence of that process, as their input. In fact, the latter input can be substituted with empirical covariances, and thus the algorithm can be applied to data. Also, the results deal with the minimality of the representations and the so-called coercive property of the output processes.
4.A Proofs of Lemmas 4.10 and 4.13
Proof of Lemma 4.10. Consider a TADG G = (V = {1, ..., n}, E) and a process y = [y1^T, ..., yn^T]^T where y_i ∈ R^{r_i}, for r_i > 0, i = 1, ..., n. Let S = (A, K, C, I, e) and Ŝ = (Â, K̂, Ĉ, I, e) be two Kalman representations of y with causal G-zero structure. Then, by Definition 4.9, for J := (1, Ī_1, I_1) the tuples
\[
S_{JJ} = (A_{JJ}, K_{JJ}, C_{JJ}, I, [e_1^T, e_{\bar{I}_1}^T, e_{I_1}^T]^T), \qquad \hat{S}_{JJ} = (\hat{A}_{JJ}, \hat{K}_{JJ}, \hat{C}_{JJ}, I, [e_1^T, e_{\bar{I}_1}^T, e_{I_1}^T]^T)
\]
are Kalman representations of [y1^T, y_{Ī_1}^T, y_{I_1}^T]^T in causal coordinated form. By using Lemma 3.2, it follows that S_{JJ} and Ŝ_{JJ} are isomorphic with a non-singular transformation matrix T, i.e., A_{JJ} = T Â_{JJ} T^{−1}, K_{JJ} = T K̂_{JJ}, C_{JJ} = Ĉ_{JJ} T^{−1}. Let the state processes of S and Ŝ be x = [x1^T, ..., xn^T]^T and x̂ = [x̂1^T, ..., x̂n^T]^T, respectively. Assume that Ī_1 = {i_1, ..., i_k} and that I_1 = {i_{k+1}, ..., i_{n−1}}. Define the permutation matrices P_y and P_x such that
\[
[y_1^T, \ldots, y_n^T]^T = P_y [y_1^T, y_{\bar{I}_1}^T, y_{I_1}^T]^T, \qquad [x_1^T, \ldots, x_n^T]^T = P_x [x_1^T, x_{\bar{I}_1}^T, x_{I_1}^T]^T.
\]
Note that P_y^{−1} = P_y^T and P_x^{−1} = P_x^T. Then
\[
A = P_x^T A_{JJ} P_x, \quad K = P_x^T K_{JJ} P_y, \quad C = P_y^T C_{JJ} P_x, \qquad \hat{A} = P_x^T \hat{A}_{JJ} P_x, \quad \hat{K} = P_x^T \hat{K}_{JJ} P_y, \quad \hat{C} = P_y^T \hat{C}_{JJ} P_x.
\]
Therefore, by using A_{JJ} = T Â_{JJ} T^{−1} we obtain that P_x A P_x^T = T P_x Â P_x^T T^{−1}; by using K_{JJ} = T K̂_{JJ} we obtain that P_x K P_y^T = T P_x K̂ P_y^T; and lastly, by using C_{JJ} = Ĉ_{JJ} T^{−1} we obtain that P_y C P_x^T = P_y Ĉ P_x^T T^{−1}. It then follows that by the transformation matrix T̃ = P_x^T T P_x the representations S and Ŝ are isomorphic.
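The proof above rests on the orthogonality of permutation matrices, P^{−1} = P^T, which is what allows the grouped and natural orderings of the state and output components to be exchanged freely. A small numerical illustration (the concrete grouping is hypothetical, chosen only for the example):

```python
import numpy as np

# A permutation matrix that maps the stacked vector [y1; y2; y3; y4] to a
# grouped order [y1; y_Ibar1; y_I1] = [y1; y3; y4; y2], pretending (for
# illustration only) that Ibar_1 = {3, 4} and I_1 = {2}.
P = np.eye(4)[[0, 2, 3, 1]]

v = np.array([1.0, 2.0, 3.0, 4.0])   # stands for [y1; y2; y3; y4]
grouped = P @ v                       # grouped order [y1; y3; y4; y2]
back = P.T @ grouped                  # P^{-1} = P^T recovers the original
```

The same property, applied to P_x and P_y, is what turns the isomorphism T of the grouped representations into the isomorphism T̃ = P_x^T T P_x of the original ones.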
For the proof of Lemma 4.13 we need auxiliary lemmas on the properties of Granger and conditional Granger causality. First, we recall Lemma 3.12 from Appendix 3.B and Lemma 3.6 from Chapter 3:
Lemma 4.31 (Lemma 3.12). Consider a ZMSIR process y = [y1^T, y2^T, y3^T, y4^T]^T. Then y1 and y2 conditionally do not Granger cause y3 with respect to y4 if and only if [y1^T, y2^T]^T conditionally does not Granger cause y3 with respect to y4.
Lemma 4.32 (Lemma 3.6). Consider a process y = [y1^T, y2^T, y3^T]^T and the following conditions:

(i) y1 does not Granger cause y3

(ii) y2 does not Granger cause y3

(iii) y1 conditionally does not Granger cause y2 with respect to y3

(iv) y1 does not Granger cause [y2^T, y3^T]^T

Then (i)-(ii)-(iii) hold if and only if (ii)-(iv) hold.
There are two additional results that we employ in the proof of Lemma 4.13. The first one is as follows:
Lemma 4.33. Consider a process y = [y1^T, y2^T, y3^T, y4^T]^T. If y1 and y2 conditionally do not Granger cause y3 with respect to y4, then y1 conditionally does not Granger cause y3 with respect to [y2^T, y4^T]^T.

Proof. Let α = y3(t + s) − E_l[y3(t + s) | H^{y3,y4}_{t−}] for some t, s ∈ Z, s > 0. Then from the conditional Granger non-causality conditions we obtain that α = y3(t + s) − E_l[y3(t + s) | H^{y1,y3,y4}_{t−}] and α = y3(t + s) − E_l[y3(t + s) | H^{y2,y3,y4}_{t−}]. Therefore, α is orthogonal to H^{y2,y3,y4}_{t−} and to H^{y1,y3,y4}_{t−} and thus to H^{y1,y2,y3,y4}_{t−}. This implies that E_l[α | H^{y1,y2,y3,y4}_{t−}] = 0 and thus E_l[y3(t + s) | H^{y1,y2,y3,y4}_{t−}] = E_l[y3(t + s) | H^{y3,y4}_{t−}]. From the condition that y2 conditionally does not Granger cause y3 w.r.t. y4, the latter is further equal to E_l[y3(t + s) | H^{y2,y3,y4}_{t−}]. That is, E_l[y3(t + s) | H^{y1,y2,y3,y4}_{t−}] = E_l[y3(t + s) | H^{y2,y3,y4}_{t−}], which holds for any choice of t, s ∈ Z, s > 0. This, by definition, means that y1 conditionally does not Granger cause y3 w.r.t. [y2^T, y4^T]^T.
The last auxiliary lemma that helps us in proving Lemma 4.13 is presented below.

Lemma 4.34. Consider a process y = [y1^T, ..., yn^T]^T and a TADG G = (V = {1, ..., n}, E). Then for any node i ∈ V, conditions (i), (ii), and (iii) below hold if and only if conditions (i), (iv), and (v) hold:

(i) y_i and y_{Ī_i} do not Granger cause y_{I_i}

(ii) y_i does not Granger cause [y_{I_i}^T, y_{Ī_i}^T]^T

(iii) y_{Ī_i} does not Granger cause [y_{I_i}^T, y_i^T]^T

(iv) y_i conditionally does not Granger cause y_{Ī_i} with respect to y_{I_i}

(v) y_{Ī_i} conditionally does not Granger cause y_i with respect to y_{I_i}

Proof. Considering (i), (ii), and (iii), we can apply Lemma 4.32 to y = [y_i^T, y_{Ī_i}^T, y_{I_i}^T]^T and to y = [y_{Ī_i}^T, y_i^T, y_{I_i}^T]^T. As a result, we obtain that the conditions (i), (ii), and (iii) are equivalent to the conditions (i), (iv), and (v).
Remark 4.35. A Granger causality condition that y1 does not Granger cause [y2^T, y3^T]^T means by definition that
\[
E_l\!\left[\begin{bmatrix} y_2(t+k) \\ y_3(t+k) \end{bmatrix} \,\middle|\, H^{y_1,y_2,y_3}_{t-}\right] = E_l\!\left[\begin{bmatrix} y_2(t+k) \\ y_3(t+k) \end{bmatrix} \,\middle|\, H^{y_2,y_3}_{t-}\right]
\]
for all t, k ∈ Z, k > 0. By looking at the latter component-wise, an equivalent form is that y1 conditionally does not Granger cause y2 with respect to y3 and y1 conditionally does not Granger cause y3 with respect to y2.

Also, it trivially holds for any process [y1^T, y2^T, y3^T]^T that y1 conditionally does not Granger cause y2 with respect to [y1^T, y3^T]^T. That is, the conditional Granger non-causality holds automatically because y1 is in the conditioning set of the conditional Granger non-causality.
Now we are ready to present the proof of Lemma 4.13.
Proof of Lemma 4.13. Necessity: We will prove that if the conditions (i), (ii), and (iii) in Lemma 4.34 hold for all i ∈ V, then y has G-consistent causality structure. From Lemma 4.34 we know that (i), (ii), and (iii) imply (iv) and (v). By Lemma 4.31, (v) holds if and only if for all i ∈ Ī_j, y_i does not Granger cause y_j with respect to y_{I_j}. Recall that y has G-consistent causality structure if (i, j) ∉ E implies that y_i does not Granger cause y_j with respect to y_{I_j}. Therefore, considering (v), it remains to show that y_i does not Granger cause y_j with respect to y_{I_j} for all i ∈ V ∖ Ī_j where (i, j) ∉ E.

Define the set S = {i ∈ V | i < j} and notice that S = {i ∈ V ∖ Ī_j | (i, j) ∉ E}. Therefore, to finish our proof, we have to show that for any i ∈ S, y_i does not Granger cause y_j with respect to y_{I_j}. Fix an s = j − L ∈ S and apply condition (ii) to y_{j−l}, l = L, ..., 1: for component y_j it gives that y_{j−l} conditionally does not Granger cause y_j with respect to [y_{I_{j−l}}^T, y_{Ī_{j−l}}^T]^T; see also Remark 4.35. From I_{j−l} ∪ Ī_{j−l} = I_{j−l+1} ∪ Ī_{j−l+1} ∪ {j − l + 1} the latter implies that
\[
\begin{aligned}
E_l[y_j(t+k) \mid H^{y_{I_{j-L}}, y_{\bar{I}_{j-L}}, y_{j-L}}_{t-}] &= E_l[y_j(t+k) \mid H^{y_{I_{j-L}}, y_{\bar{I}_{j-L}}}_{t-}] \\
= E_l[y_j(t+k) \mid H^{y_{I_{j-L+1}}, y_{\bar{I}_{j-L+1}}, y_{j-L+1}}_{t-}] &= E_l[y_j(t+k) \mid H^{y_{I_{j-L+1}}, y_{\bar{I}_{j-L+1}}}_{t-}] \\
= \cdots = E_l[y_j(t+k) \mid H^{y_{I_{j-1}}, y_{\bar{I}_{j-1}}, y_{j-1}}_{t-}] &= E_l[y_j(t+k) \mid H^{y_{I_{j-1}}, y_{\bar{I}_{j-1}}}_{t-}] \\
= E_l[y_j(t+k) \mid H^{y_{I_j}, y_{\bar{I}_j}, y_j}_{t-}] &= E_l[y_j(t+k) \mid H^{y_{I_j}, y_j}_{t-}].
\end{aligned} \tag{4.8}
\]
Projecting the equation above onto H^{y_{I_j}, y_j, y_{j−L}}_{t−} and considering that
\[
H^{y_{I_{j-L}}, y_{\bar{I}_{j-L}}, y_{j-L}}_{t-} \supseteq H^{y_{I_j}, y_j, y_{j-L}}_{t-} \supseteq H^{y_{I_j}, y_j}_{t-},
\]
it follows that E_l[y_j(t+k) | H^{y_{I_j}, y_j, y_{j−L}}_{t−}] = E_l[y_j(t+k) | H^{y_{I_j}, y_j}_{t−}]. This, by definition, means that y_{j−L} does not Granger cause y_j with respect to y_{I_j}. Since s = j − L was an arbitrary element in S, this proves that for any s ∈ S, y_s does not Granger cause y_j with respect to y_{I_j}, which completes the necessity part of the proof.
Sufficiency: Below, we show that the G-consistent causality structure of y implies (i), (ii), and (iii) in Lemma 4.34, respectively.

G-consistent causality implies (i): Consider a node j ∈ V and let S be a subset of the set {l ∈ V | (l, j) ∉ E}. Then, since for all s ∈ S, y_s does not Granger cause y_j with respect to y_{I_j}, from Lemma 4.31 it follows that y_S does not Granger cause y_j with respect to y_{I_j}. By definition this gives that
\[
E_l[y_j(t+k) \mid H^{y_{I_j}, y_j, y_S}_{t-}] = E_l[y_j(t+k) \mid H^{y_{I_j}, y_j}_{t-}]. \tag{4.9}
\]
Next, by using (4.9), we show that for any i ∈ V, y_i does not Granger cause y_{I_i}. Let j ∈ I_i and notice that since G is acyclic, (i, j) ∉ E. Moreover, for any l ∈ I_i ∖ I_j, (l, j) ∉ E; hence, applying (4.9) to S = {i} ∪ (I_i ∖ I_j) and then to S = I_i ∖ I_j, it follows that
\[
E_l[y_j(t+k) \mid H^{y_{I_i}, y_i}_{t-}] = E_l[y_j(t+k) \mid H^{y_{I_j}, y_j}_{t-}] = E_l[y_j(t+k) \mid H^{y_{I_i}, y_j}_{t-}]
\]
for every j ∈ I_i. In other words, y_i does not Granger cause y_{I_i}; see Remark 4.35.

For proving that y_{Ī_i} does not Granger cause y_{I_i}, we apply (4.9) for S = Ī_i and j ∈ I_i. Notice that Ī_i ⊆ {l ∈ V | (l, j) ∉ E}, since if l ∈ Ī_i, then (l, j) ∉ E for any j ∈ I_i; otherwise (l, j), (j, i) ∈ E would imply (l, i) ∈ E by transitivity, which contradicts l ∈ Ī_i.
G-consistent causality implies (ii): Let s ∈ I_j ∪ Ī_j. Then notice that since s > j, we know that (j, s) ∉ E. Hence, from the fact that y has G-consistent causality structure, y_j conditionally does not Granger cause y_s with respect to y_{I_s}. Let S = (I_j ∪ Ī_j) ∖ I_s and assume that S = {s_1, ..., s_L}. To see that condition (ii) follows from the G-consistent causality structure of y, we first show that y_j and y_{s_{p+1}} conditionally do not Granger cause y_s with respect to [y_{I_s}^T, y_{s_1,...,s_p}^T]^T for all p = 1, ..., L − 1.

Notice that (s_k, s) ∉ E for all k = 1, ..., L; hence, y_{s_1} and y_{s_2} conditionally do not Granger cause y_s with respect to y_{I_s}. Considering the latter two conditional Granger non-causalities, we can apply Lemma 4.33 to [y_j^T, y_{s_1}^T, y_s^T, y_{I_s}^T]^T and to [y_{s_2}^T, y_{s_1}^T, y_s^T, y_{I_s}^T]^T. Then, we obtain that y_j and y_{s_2} conditionally do not Granger cause y_s with respect to [y_{I_s}^T, y_{s_1}^T]^T. Assume by induction that y_j and y_{s_p} conditionally do not Granger cause y_s with respect to [y_{I_s}^T, y_{s_1,...,s_{p−1}}^T]^T, where p is smaller than the number of elements in S. From this, we can apply Lemma 4.33 to [y_j^T, y_{s_p}^T, y_s^T, [y_{I_s}^T, y_{s_1,...,s_{p−1}}^T]^T]^T and to [y_{s_{p+1}}^T, y_{s_p}^T, y_s^T, [y_{I_s}^T, y_{s_1,...,s_{p−1}}^T]^T]^T. As a result, we obtain that y_j and y_{s_{p+1}} conditionally do not Granger cause y_s with respect to [y_{I_s}^T, y_{s_1,...,s_p}^T]^T, which completes the induction.

From the discussion above, y_j and y_{s_L} conditionally do not Granger cause y_s with respect to [y_{I_s}^T, y_{s_1,...,s_{L−1}}^T]^T. By applying Lemma 4.33 to [y_j^T, y_{s_L}^T, y_s^T, [y_{I_s}^T, y_{s_1,...,s_{L−1}}^T]^T]^T we obtain that y_j conditionally does not Granger cause y_s with respect to [y_{I_s}^T, y_S^T]^T. Since s is an arbitrary element of I_j ∪ Ī_j, the latter is equivalent to condition (ii) if we look at the condition component-wise; see Remark 4.35.

G-consistent causality implies (iii): From Lemma 4.32 it follows that (i) and (v) are equivalent to (i) and (iii). Since (i) holds, in order to prove (iii) we will instead prove (v). Let j ∈ V and Ī_j = {ī_1, ..., ī_p}, and notice that from (ī_k, j) ∉ E it follows that y_{ī_k} conditionally does not Granger cause y_j with respect to y_{I_j}, k = 1, ..., p. Then, applying Lemma 4.31 to [y_{ī_1}^T, y_{ī_2}^T, y_j^T, y_{I_j}^T]^T we obtain that [y_{ī_1}^T, y_{ī_2}^T]^T conditionally does not Granger cause y_j with respect to y_{I_j}. Apply now Lemma 4.31 to [y_{ī_1,ī_2}^T, y_{ī_3}^T, y_j^T, y_{I_j}^T]^T, ..., [y_{ī_1,...,ī_{p−1}}^T, y_{ī_p}^T, y_j^T, y_{I_j}^T]^T, respectively. It then implies that y_{Ī_j} conditionally does not Granger cause y_j with respect to y_{I_j}, i.e., (v) holds.
4.B Proofs of auxiliary results in Section 4.3.1
In this section, we present the proofs of Lemmas 4.19, 4.23, and 4.26 and Corollaries 4.20, 4.24, and 4.27 from Section 4.3.1. These results are used later on in the proof of Theorem 4.15.
Proof of Lemma 4.19. Consider an observable Kalman representation S_2 = (A22, K22, C22, I, e2, y2) with state process x2 ∈ R^{n_2}. Then E_l[y2(t + k) | H^{y2}_{t−}] = C22 A22^k x2(t) for all k ≥ 1 and thus E_l[Y2(t) | H^{y2}_{t−}] = O_{n_2} x2(t), where O_{n_2} is the finite observability matrix of (A22, C22) (up to n_2) and Y2(t) = [y2^T(t) ... y2^T(t + n_2 − 1)]^T.

Recall now the result (i) ⟺ (iii) of Theorem 2.5 in Chapter 2:

Corollary 4.36 (Theorem 2.5, (i) ⟺ (iii)). Consider a ZMSIR process y = [y1^T, y2^T]^T. Then y1 does not Granger cause y2 if and only if there exists a minimal Kalman representation of y in block triangular form
\[
\begin{aligned}
\begin{bmatrix} \hat{x}_1(t+1) \\ \hat{x}_2(t+1) \end{bmatrix} &= \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ 0 & \hat{A}_{22} \end{bmatrix} \begin{bmatrix} \hat{x}_1(t) \\ \hat{x}_2(t) \end{bmatrix} + \begin{bmatrix} \hat{K}_{11} & \hat{K}_{12} \\ 0 & \hat{K}_{22} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix} \\
\begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} &= \begin{bmatrix} \hat{C}_{11} & \hat{C}_{12} \\ 0 & \hat{C}_{22} \end{bmatrix} \begin{bmatrix} \hat{x}_1(t) \\ \hat{x}_2(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix},
\end{aligned} \tag{4.10}
\]
where (Â22, K̂22, Ĉ22, I, e2) is a minimal Kalman representation of y2.

Consider a minimal Kalman representation (4.10) of y (by assumption y1 does not Granger cause y2). Then (Â22, K̂22, Ĉ22, I, e2, y2) is a minimal Kalman representation with state process x̂2. Notice that E_l[Y2(t) | H^{y2}_{t−}] = Ô_{n_2} x̂2(t), where Ô_{n_2} is the finite observability matrix of (Â22, Ĉ22) (up to n_2, see (4.5)) and Y2(t) = [y2^T(t) ... y2^T(t + n_2 − 1)]^T. Since (Â22, K̂22, Ĉ22, I, e2, y2) is a minimal, thus observable, Kalman representation, we have that Ô^+_{n_2} E_l[Y2(t) | H^{y2}_{t−}] = x̂2(t), where Ô^+_{n_2} is the left inverse of Ô_{n_2}. Define now T = Ô^+_{n_2} O_{n_2} and notice that x̂2 = T x2. Then
\[
\begin{aligned}
\begin{bmatrix} \hat{x}_1(t+1) \\ x_2(t+1) \end{bmatrix} &= \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} T \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} \hat{x}_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} \hat{K}_{11} & \hat{K}_{12} \\ 0 & K_{22} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix} \\
\begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} &= \begin{bmatrix} \hat{C}_{11} & \hat{C}_{12} T \\ 0 & C_{22} \end{bmatrix} \begin{bmatrix} \hat{x}_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix}
\end{aligned} \tag{4.11}
\]
is a Kalman representation of y. Furthermore, it is observable since the observability of (A22, C22) and (Â11, Ĉ11) ensures the observability of (4.11). Hence, (4.11) is the extension of (A22, K22, C22, I, e2, y2) for y in block triangular form.

If (A22, K22, C22, I, e2) were a minimal Kalman representation of y2, then the dimensions of x̂2 and x2 would be the same, i.e., (4.11) would be a minimal Kalman representation in causal block triangular form.

Proof of Corollary 4.20. Let S_2 in the proof of Lemma 4.19 be (A22, K22, C22, I, e2, y2), where e2 is the innovation process of y2 and A22, K22, C22 are the input matrices of Algorithm 8. Then the representation (4.11) coincides with the Kalman representation (A, K, C, I, e), where A, K, C are the output matrices of Algorithm 8. This completes the proof.
Proof of Lemma 4.23. If y1 and y2 mutually do not Granger cause each other, then the innovation process e1 of y1 and the innovation process e2 of y2 together as [e1^T, e2^T]^T form the innovation process of y = [y1^T, y2^T]^T. Then, putting together a minimal Kalman representation (A11, K11, C11, I, e1) of y1 and an observable Kalman representation (A22, K22, C22, I, e2) of y2 into a block diagonal representation such as
\[
\begin{aligned}
\begin{bmatrix} x_1(t+1) \\ x_2(t+1) \end{bmatrix} &= \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} K_{11} & 0 \\ 0 & K_{22} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix} \\
\begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} &= \begin{bmatrix} C_{11} & 0 \\ 0 & C_{22} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix},
\end{aligned} \tag{4.12}
\]
we obtain an observable Kalman representation of y which is an extension of the Kalman representation (A22, K22, C22, I, e2) in block diagonal form. Note that the observability of (4.12) is ensured by the observability of (A11, C11) and (A22, C22).

If the representation (A22, K22, C22, I, e2) of y2 were minimal, then (4.12) would also be minimal. Indeed, the controllability of (A11, K11) and (A22, K22) ensures the controllability of (4.12) (see Proposition 1.10).

Proof of Corollary 4.24. Let the observable Kalman representation of y2 in the proof of Lemma 4.23 be (A22, K22, C22, I, e2, y2), where e2 is the innovation process of y2 and A22, K22, C22 are the input matrices of Algorithm 9. Then the representation (4.12) coincides with the Kalman representation (A, K, C, I, e), where A, K, C are the output matrices of Algorithm 9. This completes the proof.
Proof of Lemma 4.26. Consider an observable Kalman representation
\[
S = \left( \begin{bmatrix} A_{22} & A_{23} \\ 0 & A_{33} \end{bmatrix}, \begin{bmatrix} K_{22} & K_{23} \\ 0 & K_{33} \end{bmatrix}, \begin{bmatrix} C_{22} & C_{23} \\ 0 & C_{33} \end{bmatrix}, I, \begin{bmatrix} e_2 \\ e_3 \end{bmatrix} \right)
\]
of [y2^T, y3^T]^T in block triangular form, where dim(e_i) = dim(y_i) for i = 2, 3. Denote the tuple (A33, K33, C33, I, e3) by S_3. Notice that A33 is stable and, because of (ii), the noise process e3 is the innovation process of y3; hence S_3 is a Kalman representation of y3. Furthermore, it is observable. Then, by using (i), we can apply Lemma 4.19 to obtain an observable Kalman representation of [y1^T, y3^T]^T in block triangular form as follows:
\[
\begin{aligned}
\begin{bmatrix} x_1(t+1) \\ x_3(t+1) \end{bmatrix} &= \begin{bmatrix} A_{11} & A_{13} \\ 0 & A_{33} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} K_{11} & K_{13} \\ 0 & K_{33} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_3(t) \end{bmatrix} \\
\begin{bmatrix} y_1(t) \\ y_3(t) \end{bmatrix} &= \begin{bmatrix} C_{11} & C_{13} \\ 0 & C_{33} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_3(t) \end{bmatrix}.
\end{aligned} \tag{4.13}
\]
Combine the representation S of [y2^T, y3^T]^T and the representation (4.13) of [y1^T, y3^T]^T such that
\[
\begin{aligned}
\begin{bmatrix} x_1(t+1) \\ x_2(t+1) \\ x_3(t+1) \end{bmatrix} &= \begin{bmatrix} A_{11} & 0 & A_{13} \\ 0 & A_{22} & A_{23} \\ 0 & 0 & A_{33} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} K_{11} & 0 & K_{13} \\ 0 & K_{22} & K_{23} \\ 0 & 0 & K_{33} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{bmatrix} \\
\begin{bmatrix} y_1(t) \\ y_2(t) \\ y_3(t) \end{bmatrix} &= \begin{bmatrix} C_{11} & 0 & C_{13} \\ 0 & C_{22} & C_{23} \\ 0 & 0 & C_{33} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{bmatrix}.
\end{aligned} \tag{4.14}
\]
From conditions (iii) and (iv), it follows that e1 and e2 are the first and second components of the innovation process of y = [y1^T, y2^T, y3^T]^T. In addition, by using Lemma 3.12, which was recalled as Lemma 4.31 in Appendix 4.A, we obtain that the conditions (i) and (ii) are equivalent to the condition that [y1^T, y2^T]^T does not Granger cause y3. The latter implies that E_l[y3(t + k) | H^{y3}_{t−}] = E_l[y3(t + k) | H^{y}_{t−}] for all t, k ∈ Z, k > 0, i.e., e3(t) is the third component of the innovation process of y. As a consequence, (4.14) is a Kalman representation of y in coordinated form. Furthermore, (4.14) is observable since the pairs (A11, C11), (A22, C22) and (A33, C33) are observable pairs.

Assume that S is a minimal representation in causal block triangular form and that for i ≠ j, i, j = 1, 2,
\[
E_l[H^{y_i}_{t+} \mid H^{y_i,y_3}_{t-}] \cap E_l[H^{y_j}_{t+} \mid H^{y_j,y_3}_{t-}] \mid E_l[H^{y_3}_{t+} \mid H^{y_3}_{t-}] = \{0\}. \tag{4.15}
\]
Then (A33, K33, C33, I, e3) is a minimal representation of y3. Hence, when we apply Lemma 4.19, the representation (4.13) is minimal and in causal block triangular form. Therefore, (4.14) is a Kalman representation of y in causal coordinated form. We know from Theorem 3.5 that the conditions (i), (ii), (iii), (iv) and (4.15) imply that there exists a minimal Kalman representation of y in causal coordinated form. Since, by Lemma 3.2, Kalman representations of y in causal coordinated form are isomorphic, we obtain that (4.14) is also minimal.
Proof of Corollary 4.27. Notice that the steps of the proof of Lemma 4.26 coincide with the steps of Algorithm 10. That is, if in the proof of Lemma 4.26 the initial observable Kalman representation S of [y2^T, y3^T]^T was the Kalman representation (A2, K2, C2, I, [e2^T, e3^T]^T, [y2^T, y3^T]^T), where A2, K2, C2 are the input matrices of Algorithm 10, then the system matrices A, K, C of the Kalman representation (4.14) in the proof of Lemma 4.26 would coincide with the output matrices of Algorithm 10. This completes the proof.
4.C Proofs of Lemma 4.29, Theorem 4.15, and Corollary 4.30
For the proof of Lemma 4.29 we will use two auxiliary lemmas. The first one is Lemma 3.6 from Chapter 3, which was recalled as Lemma 4.32 in Appendix 4.A, and the second one is Lemma 4.13.
Proof of Lemma 4.29. By using Lemma 4.13 we obtain that y has G-consistent causality structure if and only if

(i) yj does not Granger cause y_{Ij},

(ii) y_{Īj} does not Granger cause y_{Ij},

(iii) yj does not Granger cause [y_{Ij}^T, y_{Īj}^T]^T,

(iv) y_{Īj} does not Granger cause [y_{Ij}^T, yj^T]^T.

Then, if we apply Lemma 4.32 to y = [yj^T, y_{Īj}^T, y_{Ij}^T]^T and to y = [y_{Īj}^T, yj^T, y_{Ij}^T]^T, we obtain the statement of the lemma.
Next, we present the proof of Theorem 4.15.
Proof of Theorem 4.15. Consider a TADG G = (V = {1, ..., n}, E) and a process y = [y1^T, ..., yn^T]^T. Then notice that any Kalman representation with causal G-zero structure is a Kalman representation with G-zero structure; hence (iv) ⟹ (v) follows. We continue with the proof of the remaining implications.

(i) ⟹ (v): Assume that y has G-consistent causality structure. Using induction, we will show that, with the help of Lemmas 4.19, 4.23, and 4.26, an observable Kalman representation with G-zero structure can be constructed. In fact, we will show that for y = [y_{n−j}^T, ..., y_n^T]^T, j = 1, ..., n − 1, there exists an observable Kalman representation S_j with a G|_{{n−j,...,n}}-zero structure such that S_j is an extension of S_{j−1} in block triangular, block diagonal or coordinated form.

Recall that the graph G|_{{n−j,...,n}} is the restriction of G to the set of vertices {n − j, ..., n} ⊆ V and note that if G is a TADG, then so is G|_{{n−j,...,n}}; see also Remark 4.17. For j = 1, G|_{{n−1,n}} can be two types of graph: either G|_{{n−1,n}} = ({n − 1, n}, {(n, n − 1)}) or G|_{{n−1,n}} = ({n − 1, n}, ∅). Let S_1 = (A_{nn}, K_{nn}, C_{nn}, I, e_n) be a minimal Kalman representation of y_n. If G|_{{n−1,n}} = ({n − 1, n}, {(n, n − 1)}), then by assumption, y_{n−1} does not Granger cause y_n. Hence, by using Lemma 4.19 for S_1 and [y_{n−1}^T, y_n^T]^T, we obtain a minimal Kalman representation of [y_{n−1}^T, y_n^T]^T that is an extension of S_1 for [y_{n−1}^T, y_n^T]^T in causal block triangular form:
\[
\begin{aligned}
\begin{bmatrix} x_{n-1}(t+1) \\ x_n(t+1) \end{bmatrix} &= \begin{bmatrix} A_{(n-1)(n-1)} & A_{(n-1)n} \\ 0 & A_{nn} \end{bmatrix} \begin{bmatrix} x_{n-1}(t) \\ x_n(t) \end{bmatrix} + \begin{bmatrix} K_{(n-1)(n-1)} & K_{(n-1)n} \\ 0 & K_{nn} \end{bmatrix} \begin{bmatrix} e_{n-1}(t) \\ e_n(t) \end{bmatrix} \\
\begin{bmatrix} y_{n-1}(t) \\ y_n(t) \end{bmatrix} &= \begin{bmatrix} C_{(n-1)(n-1)} & C_{(n-1)n} \\ 0 & C_{nn} \end{bmatrix} \begin{bmatrix} x_{n-1}(t) \\ x_n(t) \end{bmatrix} + \begin{bmatrix} e_{n-1}(t) \\ e_n(t) \end{bmatrix}.
\end{aligned} \tag{4.16}
\]
Defining a partition {p_i, r_i}_{i=n−1}^n where p_i = dim(x_i) for i = n − 1, n, it follows that the representation (4.16) has a causal G|_{{n−1,n}}-zero structure.

If G|_{{n−1,n}} = ({n − 1, n}, ∅), then by assumption y_n and y_{n−1} do not Granger cause each other. Hence, by using Lemma 4.23 for S_1 and [y_{n−1}^T, y_n^T]^T, we obtain a minimal Kalman representation of [y_{n−1}^T, y_n^T]^T that is an extension of S_1 for [y_{n−1}^T, y_n^T]^T in block diagonal form
\[
\begin{aligned}
\begin{bmatrix} x_{n-1}(t+1) \\ x_n(t+1) \end{bmatrix} &= \begin{bmatrix} A_{(n-1)(n-1)} & 0 \\ 0 & A_{nn} \end{bmatrix} \begin{bmatrix} x_{n-1}(t) \\ x_n(t) \end{bmatrix} + \begin{bmatrix} K_{(n-1)(n-1)} & 0 \\ 0 & K_{nn} \end{bmatrix} \begin{bmatrix} e_{n-1}(t) \\ e_n(t) \end{bmatrix} \\
\begin{bmatrix} y_{n-1}(t) \\ y_n(t) \end{bmatrix} &= \begin{bmatrix} C_{(n-1)(n-1)} & 0 \\ 0 & C_{nn} \end{bmatrix} \begin{bmatrix} x_{n-1}(t) \\ x_n(t) \end{bmatrix} + \begin{bmatrix} e_{n-1}(t) \\ e_n(t) \end{bmatrix}.
\end{aligned} \tag{4.17}
\]
Defining a partition {p_i, r_i}_{i=n−1}^n where p_i = dim(x_i) for i = n − 1, n, we can see that the representation (4.17) has a causal G|_{{n−1,n}}-zero structure.

Suppose that we have an observable Kalman representation S_j = (A, K, C, I, e) of [y_{n−j}^T, ..., y_n^T]^T, j ∈ {1, ..., n − 2}, with a G|_{{n−j,...,n}}-zero structure with respect to a partition {p_i, r_i}_{i=n−j}^n, i.e., the system matrices are
\[
A = \begin{bmatrix}
A_{n-j,n-j} & A_{n-j,n-j+1} & \cdots & A_{n-j,n} \\
A_{n-j+1,n-j} & A_{n-j+1,n-j+1} & \cdots & A_{n-j+1,n} \\
\vdots & \vdots & & \vdots \\
A_{n,n-j} & A_{n,n-j+1} & \cdots & A_{n,n}
\end{bmatrix}, \quad
K = \begin{bmatrix}
K_{n-j,n-j} & K_{n-j,n-j+1} & \cdots & K_{n-j,n} \\
K_{n-j+1,n-j} & K_{n-j+1,n-j+1} & \cdots & K_{n-j+1,n} \\
\vdots & \vdots & & \vdots \\
K_{n,n-j} & K_{n,n-j+1} & \cdots & K_{n,n}
\end{bmatrix}, \quad
C = \begin{bmatrix}
C_{n-j,n-j} & C_{n-j,n-j+1} & \cdots & C_{n-j,n} \\
C_{n-j+1,n-j} & C_{n-j+1,n-j+1} & \cdots & C_{n-j+1,n} \\
\vdots & \vdots & & \vdots \\
C_{n,n-j} & C_{n,n-j+1} & \cdots & C_{n,n}
\end{bmatrix} \tag{4.18}
\]
such that if t, s ∈ {n − j, ..., n}, (t, s) ∉ E, then A_{st} = 0, K_{st} = 0, C_{st} = 0. We will show that S_j can be extended to a representation of [y_{n−j−1}^T, ..., y_n^T]^T with a G|_{{n−j−1,...,n}}-zero structure with a partition {p_i, r_i}_{i=n−j−1}^n. Note that the state process x of S_j is partitioned by x = [x_{n−j}^T, ..., x_n^T]^T, where x_i ∈ R^{p_i}. For convenience, define i := n − j − 1 and let the sets of parent and non-parent nodes of i be I_i = {i_1, ..., i_k} and Ī_i = {ī_1, ..., ī_l}. Accordingly, we denote the subprocesses x_{I_i} = [x_{i_1}^T, ..., x_{i_k}^T]^T, x_{Ī_i} = [x_{ī_1}^T, ..., x_{ī_l}^T]^T, y_{I_i} = [y_{i_1}^T, ..., y_{i_k}^T]^T, y_{Ī_i} = [y_{ī_1}^T, ..., y_{ī_l}^T]^T and e_{I_i} = [e_{i_1}^T, ..., e_{i_k}^T]^T, e_{Ī_i} = [e_{ī_1}^T, ..., e_{ī_l}^T]^T.

Notice that because of I_i ∪ Ī_i = {i + 1, ..., n}, we can define permutation matrices P_y and P_x such that [y_{Ī_i}^T, y_{I_i}^T]^T = P_y [y_{i+1}^T, ..., y_n^T]^T, [e_{Ī_i}^T, e_{I_i}^T]^T = P_y [e_{i+1}^T, ..., e_n^T]^T and [x_{Ī_i}^T, x_{I_i}^T]^T