Relationship between Granger non-causality and network graph of state-space
representations
Jozsa, Monika
Document Version
Publisher's PDF, also known as Version of record
Publication date: 2019
Citation for published version (APA):
Jozsa, M. (2019). Relationship between Granger non-causality and network graph of state-space representations. University of Groningen.
Chapter 4

Granger causality and Kalman representations with transitive acyclic directed graph zero structure
We have seen in Chapters 2 and 3 that the existence of Kalman representations whose network graphs are star graphs is equivalent to the lack of conditional and unconditional Granger causalities in the output process of those representations. In this chapter we present a generalization of these results by introducing Kalman representations with network graphs that are transitive acyclic directed graphs (TADG). We call these representations Kalman representations with TADG-zero structure. The existence of Kalman representations with TADG-zero structure is then associated with the presence of conditional and unconditional Granger non-causalities in the output process; these non-causalities are consistent with the network graph of the representation. We also present algorithms for constructing a Kalman representation with TADG-zero structure in the presence of the appropriate conditional and unconditional Granger non-causalities.
The paper (Caines and Wynn, 2007) is the closest one to the results in this chapter. The cited paper studies LTI–SS representations of Gaussian processes in a form that is a subclass of the Kalman representations with TADG-zero structure, with additional assumptions on the covariance matrix of the noise process. The existence of these LTI–SS representations is formalized by conditional orthogonality conditions which are stronger than the conditional orthogonality conditions that are counterparts of the conditional and unconditional Granger causalities proposed in this chapter. Note that (Caines and Wynn, 2007) contains no detailed proofs or algorithms to calculate the representations. Furthermore, it does not deal with non-coercive or non-Gaussian processes. The results of this chapter are based on the conference paper (Jozsa et al., 2017a). However, several additional statements are presented here that were not in the cited paper.
This chapter is organized as follows: First, we introduce Kalman representations with TADG-zero structure. Then, we characterize their existence in terms of conditional and unconditional Granger causality. Next, the construction for calculating Kalman representations with TADG-zero structure and the corresponding algorithms are presented. Finally, we provide an example to illustrate the results. The proofs of the statements can be found in Appendices 4.A, 4.B, and 4.C. If not stated otherwise, we assume throughout this chapter that $y = [y_1^\top, \ldots, y_n^\top]^\top$ is a ZMSIR process where $n \geq 2$, $y_i \in \mathbb{R}^{r_i}$, and $r_i > 0$ for $i = 1, \ldots, n$.
4.1 Kalman representation with TADG-zero structure
In this section, we introduce Kalman representations whose network graph is a transitive acyclic directed graph (TADG) and discuss their properties. To begin with, we define the class of transitive acyclic directed graphs.
Definition 4.1 (TADG). A directed graph $G = (V, E)$, with set of nodes $V = \{1, \ldots, k\}$ and set of directed edges $E \subseteq V \times V$, is called acyclic if it contains no cycle, i.e., no closed directed path. Furthermore, it is transitive if for $i, j, l \in V$ the implication $(i, j), (j, l) \in E \Rightarrow (i, l) \in E$ holds. The class of transitive acyclic directed graphs is denoted by TADG.
For convenience we make the following assumption, which applies to all ZMSIR processes throughout this chapter.
Assumption 4.2. For a process $y = [y_1^\top, \ldots, y_n^\top]^\top$, we assume that none of the components of $y$ is a white noise process or, equivalently, that the dimension of a minimal Kalman representation of $y_i$ is strictly positive for all $i \in \{1, \ldots, n\}$.
For a TADG $G = (V = \{1, \ldots, n\}, E)$, the set of nodes $V$ has a so-called topological ordering. By topological ordering we mean an ordering on $V$ such that if $(i, j) \in E$ is a directed edge then $i > j$. Throughout this chapter we use integers to represent nodes of graphs and, without loss of generality, we assume the following:
Assumption 4.3. Consider a TADG $G = (V, E)$ where $V = \{1, \ldots, n\}$. Then $(i, j) \in E$ implies $i > j$.
Remark 4.4. Let $G = (V = \{a_1, \ldots, a_n\}, E)$ be a TADG; then we can generate a topological ordering on $G$ as follows: Assume that the leaves of $G$ are $(a_{i_1}, \ldots, a_{i_{k_1}})$, where $k_1 \geq 1$, $i_j \in \{1, \ldots, n\}$ for $j = 1, \ldots, k_1$, and the leaves are enumerated in an arbitrary order. Then delete the leaves of $G$ and all the directed edges whose target node is a leaf, and call the new graph $G_1$. Assume now that the leaves of $G_1$ are $(a_{i_{k_1+1}}, \ldots, a_{i_{k_2}})$, where $k_2 > k_1$, $i_j \in \{1, \ldots, n\}$ for $j = k_1+1, \ldots, k_2$, and the leaves are again enumerated in an arbitrary order. Then delete the leaves of $G_1$ and all the directed edges whose target node is a leaf of $G_1$. Continue in this manner until each node of $G$ has been enumerated. The new graph $\tilde{G} = (\{1, \ldots, n\}, \tilde{E})$, where $(k, l) \in \tilde{E}$ if and only if $(a_{i_k}, a_{i_l}) \in E$, is isomorphic to $G$ and has a topological ordering.
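The leaf-removal procedure of Remark 4.4 can be sketched in a few lines of Python. This is an illustrative, hypothetical helper (not from the thesis): edges are pairs $(i, j)$ from parent $i$ to child $j$, a leaf is a node with no outgoing edge, and the graph is assumed acyclic.

```python
def topological_relabel(nodes, edges):
    """Relabel the nodes of a TADG as 1..n so that every edge (i, j)
    (from parent i to child j) satisfies i > j, by repeatedly
    enumerating and deleting leaves (nodes with no outgoing edge),
    as in Remark 4.4. Assumes the graph is acyclic."""
    remaining = set(nodes)
    live_edges = set(edges)
    label, next_label = {}, 1
    while remaining:
        # leaves of the current graph: nodes that are a parent of no one
        leaves = sorted(v for v in remaining
                        if not any(i == v for (i, j) in live_edges))
        for v in leaves:  # enumerated here in an arbitrary (sorted) order
            label[v] = next_label
            next_label += 1
        remaining -= set(leaves)
        # delete all edges whose target node is a removed leaf
        live_edges = {(i, j) for (i, j) in live_edges if j in remaining}
    new_edges = {(label[i], label[j]) for (i, j) in edges}
    return label, new_edges
```

For the chain $a \to b \to c$ this returns the labels $c \mapsto 1$, $b \mapsto 2$, $a \mapsto 3$, and every relabeled edge $(i, j)$ indeed satisfies $i > j$.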
The class of transitive acyclic directed graphs will be used to represent the internal interconnection structure of Kalman representations. We will say that a Kalman representation has TADG-zero structure if its network graph is a TADG. To define this class of Kalman representations, we need to introduce some new terminology.
Notation 4.5 (parent and non-parent succeeding nodes). Let $G = (V = \{1, \ldots, n\}, E)$ be a TADG and consider a node $j \in V$. The set of parent nodes $\{i \in V \mid (i, j) \in E\}$ of $j$ is denoted by $I_j$. In addition, the set of non-parent succeeding (with respect to the topological ordering of $V$) nodes $\{i \in V \mid i > j, (i, j) \notin E\}$ of $j$ is denoted by $\bar{I}_j$.
The topological ordering on the set of nodes of a TADG implies that $I_j, \bar{I}_j \subseteq \{j+1, \ldots, n\}$ for all $j \in \{1, \ldots, n-1\}$. Furthermore, from the definition of $\bar{I}_j$, we have that $I_j \cup \bar{I}_j = \{j+1, \ldots, n\}$. The next notation helps in referring to components of processes beyond the original partitioning of those processes.
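The sets $I_j$ and $\bar{I}_j$ of Notation 4.5 are straightforward to compute. The sketch below is an illustrative helper (not from the thesis), assuming the node ordering of Assumption 4.3; it also makes the identity $I_j \cup \bar{I}_j = \{j+1, \ldots, n\}$ easy to verify.

```python
def parent_sets(n, edges):
    """For a TADG on V = {1, ..., n} with edges (i, j), i > j
    (Assumption 4.3), return for every node j the parent set I[j]
    and the non-parent succeeding set Ibar[j] of Notation 4.5."""
    I = {j: {i for i in range(1, n + 1) if (i, j) in edges}
         for j in range(1, n + 1)}
    Ibar = {j: {i for i in range(j + 1, n + 1) if (i, j) not in edges}
            for j in range(1, n + 1)}
    return I, Ibar
```

For the example graph used later in this section, $G = (\{1,2,3,4\}, \{(4,1), (4,2), (3,1), (2,1)\})$, this gives $I_1 = \{2,3,4\}$, $\bar I_1 = \emptyset$, $I_2 = \{4\}$, and $\bar I_2 = \{3\}$.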
Notation 4.6 (sub-process). Consider the finite set $V = \{1, \ldots, n\}$ and a tuple $J = (j_1, \ldots, j_l)$ where $j_1, \ldots, j_l \in V$. Then for a process $y = [y_1^\top, \ldots, y_n^\top]^\top$, we denote the sub-process $[y_{j_1}^\top, \ldots, y_{j_l}^\top]^\top$ by $y_{j_1, \ldots, j_l}$ or by $y_J$. By abuse of terminology, if $J$ is a subset of $V$ and not a tuple, then $y_J$ will mean the process $y_\alpha$, where $\alpha$ is the tuple obtained by taking the elements of $J$ in increasing order, i.e., if $J = \{j_1, \ldots, j_k\}$, $j_1 < j_2 < \cdots < j_k$, then $\alpha = (j_1, \ldots, j_k)$. However, $y_{\alpha, \beta}$ always means $[y_\alpha^\top, y_\beta^\top]^\top$, regardless of the topological order between the elements of $\alpha$ and $\beta$.
Next, we introduce what we mean by a partition of matrices. Call the set $\{(p_i, q_i)\}_{i=1}^{k}$ a partition of $(p, q)$, where $p, q > 0$, if $\sum_{i=1}^{k} p_i = p$ and $\sum_{i=1}^{k} q_i = q$, where $p_i, q_i > 0$ for $i = 1, \ldots, k$.
Definition 4.7 (partition of a matrix). Let $\{(p_i, q_i)\}_{i=1}^{k}$ be a partition of $(p, q)$ for some $p, q > 0$. Then the partition of a matrix $M \in \mathbb{R}^{p \times q}$ with respect to $\{(p_i, q_i)\}_{i=1}^{k}$ is the collection of matrices $\{M_{ij} \in \mathbb{R}^{p_i \times q_j}\}_{i,j=1}^{k}$ such that
$$M = \begin{bmatrix} M_{11} & \cdots & M_{1k} \\ \vdots & \ddots & \vdots \\ M_{k1} & \cdots & M_{kk} \end{bmatrix}.$$
In Definition 4.7, the indexing of the matrix $M$ refers to the blocks of $M$ and does not refer directly to the elements of $M$. It is parallel to the component-wise indexing of processes, where the components can be multidimensional.
Notation 4.8 (sub-matrix). Consider the partition $\{M_{ij} \in \mathbb{R}^{p_i \times q_j}\}_{i,j=1}^{k}$ of a matrix $M \in \mathbb{R}^{p \times q}$ with respect to the partition $\{(p_i, q_i)\}_{i=1}^{k}$ of $(p, q)$, and consider tuples $I = (i_1, \ldots, i_n)$ and $J = (j_1, \ldots, j_m)$ where $i_1, \ldots, i_n, j_1, \ldots, j_m \in \{1, \ldots, k\}$. Then by the sub-matrix of $M$ indexed by $IJ$ we mean
$$M_{IJ} := \begin{bmatrix} M_{i_1 j_1} & \cdots & M_{i_1 j_m} \\ \vdots & \ddots & \vdots \\ M_{i_n j_1} & \cdots & M_{i_n j_m} \end{bmatrix}.$$
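Notation 4.8 translates directly into block-index bookkeeping. The following NumPy sketch (an illustrative helper, not from the thesis) extracts $M_{IJ}$ from a block-partitioned matrix using 1-indexed block tuples.

```python
import numpy as np

def sub_matrix(M, row_sizes, col_sizes, I, J):
    """Sub-matrix M_IJ of Notation 4.8: M is partitioned into blocks
    of sizes row_sizes x col_sizes, and M_IJ stacks the blocks
    selected by the (1-indexed) tuples I and J."""
    r_off = np.concatenate(([0], np.cumsum(row_sizes)))
    c_off = np.concatenate(([0], np.cumsum(col_sizes)))
    def block(i, j):
        # block M_ij of the partition
        return M[r_off[i - 1]:r_off[i], c_off[j - 1]:c_off[j]]
    return np.block([[block(i, j) for j in J] for i in I])
```

For a $4 \times 4$ matrix partitioned into $2 \times 2$ blocks, `sub_matrix(M, [2, 2], [2, 2], (2,), (1, 2))` returns the bottom half of $M$, i.e., $[M_{21}\ M_{22}]$.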
We are now ready to define Kalman representations which have a so-called TADG-zero structure:
Definition 4.9 (G-zero structure). Consider a process $y = [y_1^\top, \ldots, y_n^\top]^\top$ and a TADG $G = (V = \{1, \ldots, n\}, E)$. Let $(A, K, C, I, e)$ be a $p$-dimensional Kalman representation of $y \in \mathbb{R}^r$ and partition $A$ with respect to $\{(p_i, p_i)\}_{i=1}^{n}$, $K$ with respect to $\{(p_i, r_i)\}_{i=1}^{n}$, and $C$ with respect to $\{(r_i, p_i)\}_{i=1}^{n}$, where $\{(p_i, r_i)\}_{i=1}^{n}$ is a partition of $(p, r)$. Then we say that $(A, K, C, I, e)$ has G-zero structure if $A_{ij} = 0$, $K_{ij} = 0$, $C_{ij} = 0$ whenever $i \neq j$ and $(j, i) \notin E$. If, in addition, for all $j \in V$ the tuple $J := (j, \bar{I}_j, I_j)$ defines a Kalman representation $(A_{JJ}, K_{JJ}, C_{JJ}, I, [e_j^\top, e_{\bar{I}_j}^\top, e_{I_j}^\top]^\top)$ of $[y_j^\top, y_{\bar{I}_j}^\top, y_{I_j}^\top]^\top$ in causal coordinated form (see Definition 3.1), then we say that $(A, K, C, I, e)$ has causal G-zero structure.
Besides saying that a representation has G-zero structure or causal G-zero structure, we also speak of a representation with G-zero structure or with causal G-zero structure.
Consider the TADGs $G_1 = (\{1, 2\}, \{(2, 1)\})$ and $G_2 = (\{1, 2, \ldots, n\}, \{(n, 1), (n, 2), \ldots, (n, n-1)\})$. If the graph $G$ in Definition 4.9 is $G_1$, then Definition 4.9 coincides with Definition 2.1 in Section 2.1, considering ZMSIR processes that satisfy Assumption 4.2 (see Remark 2.2 in Section 2.1). In a similar manner, if the graph $G$ in Definition 4.9 is $G_2$, then it coincides with Definition 3.1 in Section 3.1, considering ZMSIR processes that satisfy Assumption 4.2.
If a $p$-dimensional Kalman representation $(A, K, C, I, e)$ of $y \in \mathbb{R}^r$ has causal G-zero structure, where $G = (V, E)$ is a TADG, then the partition $\{(p_i, r_i)\}_{i=1}^{n}$ of $(p, r)$ in Definition 4.9 is uniquely determined by $y$. Equivalently, the block dimensions of the partitioned matrices $A$, $K$, and $C$ are uniquely determined by $y$. Indeed, for all nodes $j \in V$ the tuple $J := (j, \bar{I}_j, I_j)$ defines a Kalman representation $(A_{JJ}, K_{JJ}, C_{JJ}, I, [e_j^\top, e_{\bar{I}_j}^\top, e_{I_j}^\top]^\top)$ of $[y_j^\top, y_{\bar{I}_j}^\top, y_{I_j}^\top]^\top$ in causal coordinated form. Therefore, from Chapter 3 we know that the dimensions of $(A_{lk}, K_{lk}, C_{lk})$ for $k, l \in \{j, \bar{I}_j, I_j\}$ are uniquely determined by $[y_j^\top, y_{\bar{I}_j}^\top, y_{I_j}^\top]^\top$. Applying this for the nodes $j = n-1, \ldots, 1$, it is easy to see that all block dimensions of the partitioned matrices $A$, $K$, and $C$ are determined by $y$.
A Kalman representation with TADG-zero structure can be viewed as consisting of subsystems where each subsystem generates a component of $y = [y_1^\top, \ldots, y_n^\top]^\top$.
More precisely, let $G = (V = \{1, \ldots, n\}, E)$ be a TADG and $(A, K, C, I, e, y)$ be a $p$-dimensional Kalman representation with G-zero structure, where $A$, $K$, and $C$ are partitioned with respect to a partition $\{(p_i, r_i)\}_{i=1}^{n}$ of $(p, r)$ as in Definition 4.9. Furthermore, let $x = [x_1^\top, \ldots, x_n^\top]^\top$ be its state such that $x_i \in \mathbb{R}^{p_i}$, $i = 1, \ldots, n$. Then the subsystem with output $y_j$, $j \in V$, is of the form
$$S_j: \begin{cases} x_j(t+1) = A_{jj} x_j(t) + \big(A_{j I_j} x_{I_j}(t) + K_{j I_j} e_{I_j}(t)\big) + K_{jj} e_j(t) \\ y_j(t) = C_{jj} x_j(t) + C_{j I_j} x_{I_j}(t) + e_j(t). \end{cases} \tag{4.1}$$
Notice that if $(i, j) \in E$, i.e., $i$ is a parent node of $j$, then subsystem $S_j$ takes inputs from subsystem $S_i$, namely the state and noise processes of $S_i$. In contrast, if $(i, j) \notin E$, then $S_j$ does not take input from $S_i$. Intuitively, this means that the subsystems communicate with each other as allowed by the directed paths of the graph $G$. Note that by transitivity, if there is a directed path from $i \in V$ to $j \in V$ then there is also an edge $(i, j) \in E$.
Take the TADG $G = (\{1, 2, 3, 4\}, \{(4, 1), (4, 2), (3, 1), (2, 1)\})$ and a process $[y_1^\top, y_2^\top, y_3^\top, y_4^\top]^\top$ with innovation process $[e_1^\top, e_2^\top, e_3^\top, e_4^\top]^\top$. Then a Kalman representation with G-zero structure of $[y_1^\top, y_2^\top, y_3^\top, y_4^\top]^\top$ is given by
$$\begin{bmatrix} x_1(t+1) \\ x_2(t+1) \\ x_3(t+1) \\ x_4(t+1) \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} & A_{13} & A_{14} \\ 0 & A_{22} & 0 & A_{24} \\ 0 & 0 & A_{33} & 0 \\ 0 & 0 & 0 & A_{44} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \\ x_4(t) \end{bmatrix} + \begin{bmatrix} K_{11} & K_{12} & K_{13} & K_{14} \\ 0 & K_{22} & 0 & K_{24} \\ 0 & 0 & K_{33} & 0 \\ 0 & 0 & 0 & K_{44} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \\ e_4(t) \end{bmatrix}$$
$$\begin{bmatrix} y_1(t) \\ y_2(t) \\ y_3(t) \\ y_4(t) \end{bmatrix} = \begin{bmatrix} C_{11} & C_{12} & C_{13} & C_{14} \\ 0 & C_{22} & 0 & C_{24} \\ 0 & 0 & C_{33} & 0 \\ 0 & 0 & 0 & C_{44} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \\ x_4(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \\ e_4(t) \end{bmatrix}, \tag{4.2}$$
where $A_{ij} \in \mathbb{R}^{p_i \times p_j}$, $K_{ij} \in \mathbb{R}^{p_i \times r_j}$, $C_{ij} \in \mathbb{R}^{r_i \times p_j}$ and $y_i, e_i \in \mathbb{R}^{r_i}$, $x_i \in \mathbb{R}^{p_i}$ for some $p_i > 0$, $i, j = 1, 2, 3, 4$. The network graph of this representation is the network of the subsystems $S_1, S_2, S_3, S_4$ defined in (4.1), generating $y_1, y_2, y_3$, and $y_4$, respectively. See Figure 4.1 for an illustration of this network graph.
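As an illustration of the zero pattern in (4.2), the following sketch (a hypothetical helper, assuming square diagonal blocks of the given sizes) checks the condition of Definition 4.9 that $A_{ij} = 0$ whenever $i \neq j$ and $(j, i) \notin E$.

```python
import numpy as np

def has_g_zero_structure(A, sizes, edges):
    """Check the zero pattern of Definition 4.9 on a matrix A that is
    partitioned into square blocks of the given sizes (blocks are
    1-indexed): block A_ij must vanish whenever i != j and the edge
    (j, i) is not in E."""
    off = np.concatenate(([0], np.cumsum(sizes)))
    n = len(sizes)
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if i != j and (j, i) not in edges:
                blk = A[off[i - 1]:off[i], off[j - 1]:off[j]]
                if np.any(blk != 0):
                    return False
    return True
```

With scalar blocks, the $A$-pattern of (4.2) passes the check for $G = (\{1,2,3,4\}, \{(4,1), (4,2), (3,1), (2,1)\})$, while placing a nonzero entry in block $(3, 1)$ violates it, since $(1, 3) \notin E$.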
Motivation for Kalman representations with causal TADG-zero structure
If we consider a general LTI–SS representation of a process $y = [y_1^\top, \ldots, y_n^\top]^\top$ whose network graph is a TADG $G = (V = \{1, \ldots, n\}, E)$, then the noise process could be any process. As for LTI–SS representations in coordinated form in Chapter 3, if the noise process were not the innovation process of $y$, then it could happen that information flows through it in an implicit way that is not allowed by the directed
Figure 4.1: Network graph of the Kalman representation (4.2) with G-zero structure. [The figure shows the subsystems $S_1, \ldots, S_4$ of (4.1); an edge from $S_i$ to $S_j$ carries the pair $(x_i, e_i)$.]
paths (edges) of $G$. However, if we assume that $(A, K, C, I, e, y)$ is a Kalman representation with causal G-zero structure, then $[e_j^\top, e_{I_j}^\top]^\top$ is the innovation process of $[y_j^\top, y_{I_j}^\top]^\top$ and $e_{I_j}$ is the innovation process of $y_{I_j}$ for $j = 1, \ldots, n$. Hence, the present value of $e_j$ depends only on the past and present values of $y_j, y_{I_j}$, whereas the present value of $e_{I_j}$ depends only on the past and present values of $y_{I_j}$. Moreover, $x_{I_j}$ depends only on the past values of $y_{I_j}$ and $x_j$ depends only on the past values of $y_j, y_{I_j}$. That is, in the case of Kalman representations with causal G-zero structure, information only flows from the subsystems $S_{I_j}$, generating $y_{I_j}$, to the subsystem $S_j$, generating $y_j$; see (4.1). That is, the information flows according to the directed paths (edges) of $G$.
Kalman representations with causal TADG-zero structure have a number of desirable properties; e.g., as explained above, the block dimensions of the system matrices are determined by $y$. Furthermore, in order to estimate a state $x_j$ using a Kalman filter, only the output $[y_j^\top, y_{I_j}^\top]^\top$ is necessary (if $j$ is a root node of the TADG, then only $y_j$ is necessary). Moreover, from Lemma 4.10 below, Kalman representations with causal G-zero structure are isomorphic (see Definition 1.11). Hence, if they represent the same output process, their properties are essentially the same. Note that, as a consequence of Lemma 4.10, if a Kalman representation of a process $y$ with causal TADG-zero structure is not minimal, then there does not exist a minimal Kalman representation of $y$ with causal TADG-zero structure.
Lemma 4.10. Consider a TADG $G = (V = \{1, \ldots, n\}, E)$ and a process $y = [y_1^\top, \ldots, y_n^\top]^\top$. Then any two Kalman representations of $y$ with causal G-zero structure are isomorphic.
4.2 Granger causality and Kalman representation with TADG-zero structure
Kalman representations of a process y with causal TADG-zero structure determine causal relationships among the components of y. In fact, we will show that the existence of a Kalman representation of y with causal TADG-zero structure can be characterized by conditional Granger non-causalities among the components of y.
To begin with, we define the G-consistent causality structure of a process, which involves a combination of conditional Granger non-causality conditions between the components of the process.
Definition 4.11 (G-consistent causality structure). Consider a TADG $G = (V, E)$, where $V = \{1, \ldots, n\}$, and a process $y = [y_1^\top, \ldots, y_n^\top]^\top$. We say that $y$ has G-consistent causality structure if $y_i$ conditionally does not Granger cause $y_j$ with respect to $y_{I_j}$ for any $i, j \in V$, $i \neq j$, such that $(i, j) \notin E$.
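The conditions of Definition 4.11 can be enumerated mechanically from the graph. The sketch below (an illustrative helper, not from the thesis) lists, for a TADG on $\{1, \ldots, n\}$, every required condition "$y_i$ does not conditionally Granger cause $y_j$ with respect to $y_{I_j}$" as a triple $(i, j, I_j)$.

```python
def g_consistency_conditions(n, edges):
    """Enumerate the conditional Granger non-causality requirements of
    Definition 4.11: for every ordered pair (i, j), i != j, with
    (i, j) not in E, the component y_i must not conditionally
    Granger cause y_j with respect to y_{I_j}."""
    parents = {j: sorted(i for i in range(1, n + 1) if (i, j) in edges)
               for j in range(1, n + 1)}
    return [(i, j, parents[j])
            for j in range(1, n + 1)
            for i in range(1, n + 1)
            if i != j and (i, j) not in edges]
```

For $G_1 = (\{1, 2\}, \{(2, 1)\})$ the only condition produced is $(1, 2, [\,])$, i.e., "$y_1$ does not Granger cause $y_2$", consistent with the coincidence with Definition 2.3 noted below.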
If $G = (\{1, 2\}, \{(2, 1)\})$, then Definition 4.11 coincides with Definition 2.3. Furthermore, if $G = (\{1, 2, \ldots, n\}, \{(n, 1), (n, 2), \ldots, (n, n-1)\})$, then Definition 4.11 coincides with Definition 3.3.
Remark 4.12. Notice that if $i$ is a root node in the TADG, i.e., $I_i = \emptyset$, then none of the other components causes $y_i$. In this case, the conditional Granger non-causality condition that for $(j, i) \notin E$ the process $y_j$ conditionally does not Granger cause $y_i$ with respect to $y_{I_i}$ simplifies to: $y_j$ does not Granger cause $y_i$.
Lemma 4.13 below provides an equivalent reformulation of Definition 4.11.

Lemma 4.13. Consider a TADG $G = (V, E)$, where $V = \{1, \ldots, n\}$, and a process $y = [y_1^\top, \ldots, y_n^\top]^\top$. Then $y$ has G-consistent causality structure if and only if

• $y_j$ does not Granger cause $y_{I_j}$,
• $y_{\bar{I}_j}$ does not Granger cause $y_{I_j}$,
• $y_j$ does not Granger cause $[y_{I_j}^\top, y_{\bar{I}_j}^\top]^\top$,
• $y_{\bar{I}_j}$ does not Granger cause $[y_{I_j}^\top, y_j^\top]^\top$,

for all nodes $j \in V$ of $G$.
The main result of this chapter includes a condition for the existence of minimal Kalman representations with G-zero structure. For this, we recall Definition 3.4, the definition of the conditionally trivial intersection of two subspaces $U, V \subseteq H$ in a Hilbert space $H$ with respect to a closed subspace $W \subseteq H$.
Definition 4.14 (conditionally trivial intersection). Consider subspaces $U, V, W \subseteq H$ such that $W$ is closed. Then $U, V$ have a conditionally trivial intersection with respect to $W$, denoted by $U \cap V \mid W = \{0\}$, if
$$\{u - E_l[u \mid W] \mid u \in U\} \cap \{v - E_l[v \mid W] \mid v \in V\} = \{0\},$$
i.e., the intersection of the projections of $U$ and $V$ onto the orthogonal complement of $W$ in $H$ is the zero subspace.
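In a finite-dimensional setting, Definition 4.14 can be checked numerically. The sketch below is illustrative only (not from the thesis): subspaces are given by full-column-rank basis matrices, $E_l[\,\cdot \mid W]$ is taken to be the orthogonal projection onto $\mathrm{span}(W)$, and triviality of the intersection of the projected subspaces is tested via a rank identity.

```python
import numpy as np

def conditionally_trivial_intersection(U, V, W, tol=1e-9):
    """Numerical check of Definition 4.14 for finite-dimensional
    subspaces: U, V, W are matrices whose columns are full-rank bases.
    Project the columns of U and V onto the orthogonal complement of
    span(W); the projected subspaces intersect only in {0} iff
    rank([U' V']) = rank(U') + rank(V')."""
    Qw, _ = np.linalg.qr(W)                 # orthonormal basis of span(W)
    proj = lambda M: M - Qw @ (Qw.T @ M)    # projection onto span(W)-perp
    Up, Vp = proj(U), proj(V)
    rank = lambda M: np.linalg.matrix_rank(M, tol)
    return rank(np.hstack([Up, Vp])) == rank(Up) + rank(Vp)
```

For example, in $\mathbb{R}^3$ with $W = \mathrm{span}(e_3)$ and $U = \mathrm{span}(e_1)$: the pair $V = \mathrm{span}(e_1 + e_3)$ fails the condition (both project onto $e_1$), while $V = \mathrm{span}(e_2)$ satisfies it.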
Now we are ready to state the main result of this chapter:
Theorem 4.15. Consider the following statements for a TADG $G = (V = \{1, \ldots, n\}, E)$ and a process $y = [y_1^\top, \ldots, y_n^\top]^\top$:

(i) $y$ has G-consistent causality structure;
(ii) (i) holds and for any node $j \in V$ in $G$
$$E_l[H^{y_j}_{t+} \mid H^{y_j, y_{I_j}}_{t-}] \cap E_l[H^{y_{\bar{I}_j}}_{t+} \mid H^{y_{\bar{I}_j}, y_{I_j}}_{t-}] \;\big|\; E_l[H^{y_{I_j}}_{t+} \mid H^{y_{I_j}}_{t-}] = \{0\}; \tag{4.3}$$
(iii) there exists a minimal Kalman representation of $y$ with causal G-zero structure;
(iv) there exists a Kalman representation of $y$ with causal G-zero structure;
(v) there exists a Kalman representation of $y$ with G-zero structure.

Then, the following hold:
(a) (ii) $\Leftrightarrow$ (iii); (b) (i) $\Rightarrow$ (v); (c) (iv) $\Rightarrow$ (i).
If, in addition, $y$ is coercive, then we have (d) (i) $\Leftrightarrow$ (iv) $\Leftrightarrow$ (v).
The proof can be found in Appendix 4.C.
The intuition behind Theorem 4.15 is the following. If the information flows among the subsystems $\{S_i\}_{i=1}^{n}$ (see (4.1)) according to the topology of a TADG $G = (V = \{1, \ldots, n\}, E)$, then the outputs of subsystems that are not connected by a directed path (edge) in $G$ should not influence each other. For instance, there is no edge from a child node to its parent nodes, which implies that $y_j$ should not Granger cause $y_{I_j}$. Similarly, $(i, k) \notin E$ if $i \in \bar{I}_j$ and $k \in I_j$; thus $y_{\bar{I}_j}$ should not Granger cause $y_{I_j}$. In addition, the succeeding non-parent nodes of a node $j$ are disconnected from $j$, i.e., there is no edge from $j$ to $\bar{I}_j$ or from $\bar{I}_j$ to $j$; they can only have common parent nodes in $I_j$. In a similar manner to Chapter 3, this implies that $y_{\bar{I}_j}$ and $y_j$ conditionally do not Granger cause each other with respect to $y_{I_j}$. With the help of Lemmas 4.13 and 4.32 in Appendix 4.A, it can be seen that the discussion above supports the statement that a Kalman representation with causal G-zero structure implies G-consistent causality structure in its output process; see the proof of Theorem 4.15 for more details. The implication that a process with G-consistent causality structure always has a Kalman representation with G-zero structure is more involved. It is based on the construction of the Kalman representation with G-zero structure; see Section 4.3 and the proof of Theorem 4.15.
Condition (ii) for minimality in Theorem 4.15 can be explained as follows. It can be shown that a Kalman representation with causal G-zero structure is observable, so for minimality we only have to ensure its reachability. This is equivalent to the components of the state, denoted by $x = [x_1^\top, \ldots, x_n^\top]^\top$, being linearly independent at each time. In a Kalman representation with causal G-zero structure, the space generated by the components of $x_j(t)$ for a node $j \in V$ is $E_l[H^{y_j}_{t+} \mid H^{y_j, y_{I_j}}_{t-}]$, the space generated by the components of $x_{\bar{I}_j}(t)$ is $E_l[H^{y_{\bar{I}_j}}_{t+} \mid H^{y_{\bar{I}_j}, y_{I_j}}_{t-}]$, and the space generated by the components of $x_{I_j}(t)$ is $E_l[H^{y_{I_j}}_{t+} \mid H^{y_{I_j}}_{t-}]$. It can then be seen that condition (ii) is a necessary and sufficient condition for the state $x$ to be linearly independent in a Kalman representation with causal G-zero structure (for more details see the proof of Theorem 4.15 in Appendix 4.C).
Using minimal Kalman representations with causal G-zero structure is desirable since they are isomorphic to any other minimal Kalman representation of the same process (see Proposition 1.12). Hence, any property derived for a minimal Kalman representation with causal G-zero structure, if it is invariant under isomorphism, remains valid for any other minimal Kalman representation of the same process. Theorem 4.15 gives a necessary and sufficient condition for the existence of minimal Kalman representations with causal G-zero structure. Finding conditions on the output process that ensure the existence of a minimal Kalman representation with (non-causal) G-zero structure is an open problem, to the best of the author's knowledge.
4.3 Computing Kalman representations with TADG-zero structure
Assuming that a ZMSIR process $y = [y_1^\top, \ldots, y_n^\top]^\top$ has G-consistent causality structure, a Kalman representation of $y$ with G-zero structure can be calculated algorithmically. In this section, we formulate two algorithms for this purpose: the first algorithm, Algorithm 11, takes the second-order statistics of $y$ as input and calculates a Kalman representation of $y$ with G-zero structure. The second algorithm, Algorithm 12, calculates the same representation but takes an arbitrary LTI–SS representation of $y$ as its input.
In the rest of this chapter, we will use the following notation:
Notation 4.16. The restriction of a TADG $G = (V = \{1, \ldots, k\}, E)$ to $I = \{i_1, \ldots, i_p\} \subseteq V$ is the graph defined by $G|_I := (\{i_1, \ldots, i_p\}, \{(i, j) \in E \mid i, j \in I\})$.
Remark 4.17. The restriction of a TADG to any subset of nodes is a TADG.
Consider a TADG $G = (V = \{1, \ldots, n\}, E)$ and a process $y = [y_1^\top, \ldots, y_n^\top]^\top$, and recall that we assumed a topological ordering on $V$, see Assumption 4.3. The main idea of the procedure that calculates a Kalman representation of $y$ with G-zero structure is as follows: first, we take a minimal Kalman representation $S_0$ of $y_n$. Second, $S_0$ is extended to a Kalman representation $S_1$ of $[y_{n-1}^\top, y_n^\top]^\top$ with $G|_{\{n-1, n\}}$-zero structure. That is, if $(n, n-1) \in E$ then $S_1$ is in block triangular form; otherwise it is in block diagonal form, i.e., the system matrices are block diagonal, see Lemma 4.23 below. Then we continue as follows: for $i = 2, \ldots, n-1$, $S_{i-1}$ is extended to a Kalman representation $S_i$ of $[y_{n-i}^\top, \ldots, y_n^\top]^\top$ with $G|_{\{n-i, \ldots, n\}}$-zero structure. In general, the extension of $S_{i-1}$ can happen in three different ways, depending on the edges of $G$: in the first case, when $\bar{I}_{n-i}$ is empty, the extended representation has block triangular form; in the second case, when $I_{n-i}$ is empty, it has block diagonal form; and in the third case, when neither $\bar{I}_{n-i}$ nor $I_{n-i}$ is empty, it has coordinated form. Note that $\bar{I}_{n-i} \cup I_{n-i} = \{n-i+1, \ldots, n\}$, i.e., $\bar{I}_{n-i}$ and $I_{n-i}$ cannot both be empty for $i = 2, \ldots, n-1$. To ease the formulation of the procedure described above, we introduce some auxiliary results and algorithms on the above-mentioned extensions of Kalman representations.
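The case analysis of the construction above depends only on the sets $I_j$ and $\bar{I}_j$ of the component being added. The following small helper is illustrative only (it is not one of the thesis algorithms); it computes, for each component added in the order $n-1, \ldots, 1$, which of the three extension cases applies.

```python
def extension_plan(n, edges):
    """For the construction procedure of Section 4.3: starting from a
    representation of y_n and adding components y_{n-1}, ..., y_1 one
    by one, decide for each new component j which extension is needed,
    based on I_j (parents) and Ibar_j (non-parent successors):
    block triangular if Ibar_j is empty, block diagonal if I_j is
    empty, coordinated form otherwise."""
    plan = []
    for j in range(n - 1, 0, -1):  # components are added in order n-1, ..., 1
        I = {i for i in range(j + 1, n + 1) if (i, j) in edges}
        Ibar = set(range(j + 1, n + 1)) - I
        if not Ibar:
            case = "block triangular"
        elif not I:
            case = "block diagonal"
        else:
            case = "coordinated"
        plan.append((j, case))
    return plan
```

For the example graph of Section 4.1, $G = (\{1,2,3,4\}, \{(4,1), (4,2), (3,1), (2,1)\})$, adding $y_3$ requires a block diagonal extension, adding $y_2$ a coordinated-form extension, and adding $y_1$ a block triangular extension.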
4.3.1 Auxiliary results
To extend a Kalman representation of $y_2$ to a Kalman representation of $[y_1^\top, y_2^\top]^\top$ in block triangular form, we will use the following definition:

Definition 4.18. Consider an observable Kalman representation $(A_{22}, K_{22}, C_{22}, I, e_2)$ of $y_2$. An extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block triangular form is an observable Kalman representation of $y$ of the form
$$\begin{bmatrix} x_1(t+1) \\ x_2(t+1) \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} K_{11} & K_{12} \\ 0 & K_{22} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix}$$
$$\begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} = \begin{bmatrix} C_{11} & C_{12} \\ 0 & C_{22} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix}. \tag{4.4}$$
Next, we present Lemma 4.19, Algorithm 8, and Corollary 4.20 below on extensions of Kalman representations in block triangular form.
Lemma 4.19. Consider a process $y = [y_1^\top, y_2^\top]^\top$ and an observable Kalman representation $(A_{22}, K_{22}, C_{22}, I, e_2)$ of $y_2$. If $y_1$ does not Granger cause $y_2$, then there exists an extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block triangular form. Moreover, if the representation $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ is minimal, then there exists an extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in causal block triangular form which is a minimal Kalman representation of $y$.
The proof can be found in Appendix 4.B. Note that Lemma 4.19 can be seen as a consequence of Theorem 2.5. If $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ is minimal, then by the isomorphism between minimal Kalman representations, the minimal Kalman representations of $y_2$ in Theorem 2.5 and in Lemma 4.19 are isomorphic. Using this isomorphism, the minimal Kalman representations of $y$ in Theorem 2.5 and in Lemma 4.19 are isomorphic as well. If $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ is not minimal, then the representation of $y_2$ in Theorem 2.5 can be transformed to the representation of $y_2$ in Lemma 4.19 with a non-singular state-space transformation. Also, the representation of $y$ in Theorem 2.5 can be transformed to an observable Kalman representation of $y$ that is the extension of the representation of $y_2$ in Lemma 4.19. Therefore, Theorem 2.5 ensures the existence of the Kalman representations of $y$ in Lemma 4.19, provided that $y_1$ does not Granger cause $y_2$.
The representation in Lemma 4.19 can be calculated using Algorithm 5; this is elaborated in Algorithm 8 below. Recall that for a pair $(A, C)$ of two matrices $A \in \mathbb{R}^{n \times n}$ and $C \in \mathbb{R}^{m \times n}$, the finite observability matrix up to $N > 0$ is defined by
$$\mathcal{O}_N = [C^\top \ (CA)^\top \ \cdots \ (CA^{N-1})^\top]^\top. \tag{4.5}$$
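The finite observability matrix (4.5) can be computed with a few lines of NumPy; the following is an illustrative sketch.

```python
import numpy as np

def observability_matrix(A, C, N):
    """Finite observability matrix O_N = [C; CA; ...; CA^{N-1}]
    of the pair (A, C), as in (4.5)."""
    blocks, M = [], C
    for _ in range(N):
        blocks.append(M)
        M = M @ A          # next block row: C A^k
    return np.vstack(blocks)
```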
Consider a ZMSIR process $y = [y_1^\top, y_2^\top]^\top$ with covariance sequence $\{\Lambda^y_k\}_{k=0}^{\infty}$. Let $e$ be the innovation process of $y$ and let $N$ be any number larger than or equal to the dimension of a minimal Kalman representation of $y$. Assume that $y_1$ does not Granger cause $y_2$, and note that Algorithm 5 then calculates a minimal Kalman representation of $y$ in block triangular form. Apply Algorithm 8 below with input $\{A_{22}, K_{22}, C_{22}\}$ and $\{\Lambda^y_k\}_{k=0}^{2N}$, where $\{A_{22}, K_{22}, C_{22}\}$ are the system matrices of an observable Kalman representation $(A_{22}, K_{22}, C_{22}, I, e_2)$ of $y_2$. Denote the output by $\{A, K, C\}$. Then we have the following result:

Algorithm 8 Extension of an observable Kalman representation in block triangular form
Input $\{A_{22}, K_{22}, C_{22}\}$ and $\{\Lambda^y_k\}_{k=0}^{2N}$: system matrices of an observable Kalman representation of $y_2$ and covariance sequence of $y = [y_1^\top, y_2^\top]^\top$
Output $\{A, K, C\}$: system matrices of (4.4)
Step 1 Apply Algorithm 5 with input $\{\Lambda^y_k\}_{k=0}^{2N}$ and denote its output by $\{\hat{A}, \hat{K}, \hat{C}\}$, where
$$\hat{A} = \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ 0 & \hat{A}_{22} \end{bmatrix}, \quad \hat{K} = \begin{bmatrix} \hat{K}_{11} & \hat{K}_{12} \\ 0 & \hat{K}_{22} \end{bmatrix}, \quad \hat{C} = \begin{bmatrix} \hat{C}_{11} & \hat{C}_{12} \\ 0 & \hat{C}_{22} \end{bmatrix}.$$
Step 2 Define $T = \hat{\mathcal{O}}_N^{+} \mathcal{O}_N$, where $\hat{\mathcal{O}}_N^{+}$ is the left inverse of the finite (up to $N$) observability matrix of $(\hat{A}_{22}, \hat{C}_{22})$ and $\mathcal{O}_N$ is the finite (up to $N$) observability matrix of $(A_{22}, C_{22})$.
Step 3 Define the following matrices:
$$A = \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} T \\ 0 & A_{22} \end{bmatrix}, \quad K = \begin{bmatrix} \hat{K}_{11} & \hat{K}_{12} \\ 0 & K_{22} \end{bmatrix}, \quad C = \begin{bmatrix} \hat{C}_{11} & \hat{C}_{12} T \\ 0 & C_{22} \end{bmatrix}.$$
Corollary 4.20 (Correctness of Algorithm 8). The tuple $(A, K, C, I, e, y)$ is an observable Kalman representation and it is an extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block triangular form. Furthermore, if $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ is minimal, then $(A, K, C, I, e, y)$ is a minimal Kalman representation in causal block triangular form.

The proof can be found in Appendix 4.B.
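Steps 2 and 3 of Algorithm 8 amount to a change of coordinates of the lower-right subsystem. The sketch below is illustrative only: Step 1 (Algorithm 5 itself) is assumed to have already produced the block triangular matrices $\hat{A}, \hat{K}, \hat{C}$, and the left inverse in Step 2 is realized here via the Moore–Penrose pseudoinverse.

```python
import numpy as np

def obs(A, C, N):
    # finite observability matrix up to N, as in (4.5)
    rows, M = [], C
    for _ in range(N):
        rows.append(M)
        M = M @ A
    return np.vstack(rows)

def algorithm8_steps23(Ahat, Khat, Chat, A22, K22, C22, p1, r1, N):
    """Steps 2-3 of Algorithm 8 (a sketch): re-express the lower-right
    subsystem of Algorithm 5's output (Ahat, Khat, Chat) in the given
    coordinates (A22, C22) via T = pinv(O_N(Ahat22, Chat22)) O_N(A22, C22).
    p1 and r1 are the state and output dimensions of the y_1 subsystem."""
    A11h, A12h, A22h = Ahat[:p1, :p1], Ahat[:p1, p1:], Ahat[p1:, p1:]
    K11h, K12h = Khat[:p1, :r1], Khat[:p1, r1:]
    C11h, C12h, C22h = Chat[:r1, :p1], Chat[:r1, p1:], Chat[r1:, p1:]
    T = np.linalg.pinv(obs(A22h, C22h, N)) @ obs(A22, C22, N)
    p2, r2 = A22.shape[0], C22.shape[0]
    A = np.block([[A11h, A12h @ T], [np.zeros((p2, p1)), A22]])
    K = np.block([[K11h, K12h], [np.zeros((p2, r1)), K22]])
    C = np.block([[C11h, C12h @ T], [np.zeros((r2, p1)), C22]])
    return A, K, C
```

In the scalar example below, the lower-right subsystem of Algorithm 5's hypothetical output has output matrix $2$ while the given realization has output matrix $1$, so $T = 0.5$ and the off-diagonal blocks $\hat{A}_{12}$ and $\hat{C}_{12}$ are rescaled accordingly.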
Remark 4.21. Similar to Algorithm 8, Algorithms 5 and 4 in Chapter 2 also calculate Kalman representations in block triangular form, provided that $y_1$ does not Granger cause $y_2$. However, Algorithm 8 takes, besides the covariances of $y$, the system matrices of a Kalman representation of $y_2$ as its input and extends them in such a way that the input Kalman representation of $y_2$ is a sub-system of the Kalman representation of $y$ that the output matrices of Algorithm 8 define. As a consequence, contrary to Algorithms 5 and 4, the Kalman representation that Algorithm 8 defines is not necessarily minimal, nor is it necessarily in causal block triangular form.
To extend a Kalman representation of $y_2$ to a Kalman representation of $[y_1^\top, y_2^\top]^\top$ in block diagonal form, we will use the following definition:

Definition 4.22. Consider an observable Kalman representation $(A_{22}, K_{22}, C_{22}, I, e_2)$ of $y_2$. An extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block diagonal form is an observable Kalman representation of $y$ of the form
$$\begin{bmatrix} x_1(t+1) \\ x_2(t+1) \end{bmatrix} = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} K_{11} & 0 \\ 0 & K_{22} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix}$$
$$\begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} = \begin{bmatrix} C_{11} & 0 \\ 0 & C_{22} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix}. \tag{4.6}$$
Next, we present Lemma 4.23, Algorithm 9, and Corollary 4.24 below on extensions of Kalman representations in block diagonal form.
Lemma 4.23. Consider a process $y = [y_1^\top, y_2^\top]^\top$ and an observable Kalman representation $(A_{22}, K_{22}, C_{22}, I, e_2)$ of $y_2$. If $y_1$ and $y_2$ mutually do not Granger cause each other, then there exists an extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block diagonal form. Moreover, if the representation $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ is minimal, then there exists an extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block diagonal form which is a minimal Kalman representation of $y$.
The proof can be found in Appendix 4.B.
An algorithm that calculates the representation in Lemma 4.23 is presented next. Consider $y = [y_1^\top, y_2^\top]^\top$, its innovation process $e$, and the covariance sequence $\{\Lambda^{y_1}_k\}_{k=0}^{2N}$ of $y_1$, where $N$ is larger than or equal to the dimension of a minimal Kalman representation of $y_1$. Assume that $y_1$ and $y_2$ mutually do not Granger cause each other. Apply Algorithm 9 below with input $\{A_{22}, K_{22}, C_{22}\}$ and $\{\Lambda^{y_1}_k\}_{k=0}^{2N}$, where $\{A_{22}, K_{22}, C_{22}\}$ are the system matrices of an observable Kalman representation $(A_{22}, K_{22}, C_{22}, I, e_2)$ of $y_2$. Denote the output by $\{A, K, C\}$. Then we have the following result.

Algorithm 9 Extension of an observable Kalman representation in block diagonal form
Input $\{A_{22}, K_{22}, C_{22}\}$ and $\{\Lambda^{y_1}_k\}_{k=0}^{2N}$: system matrices of an observable Kalman representation of $y_2$ and covariance sequence of $y_1$
Output $\{A, K, C\}$: system matrices of (4.6)
Step 1 Apply Algorithm 1 with input $\{\Lambda^{y_1}_k\}_{k=0}^{2N}$ and denote its output by $\{A_{11}, K_{11}, C_{11}\}$.
Step 2 Define the following matrices:
$$A = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}, \quad K = \begin{bmatrix} K_{11} & 0 \\ 0 & K_{22} \end{bmatrix}, \quad C = \begin{bmatrix} C_{11} & 0 \\ 0 & C_{22} \end{bmatrix}.$$
Corollary 4.24 (Correctness of Algorithm 9). The tuple $(A, K, C, I, e, y)$ is an observable Kalman representation and it is an extension of $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ for $y$ in block diagonal form. Furthermore, if $(A_{22}, K_{22}, C_{22}, I, e_2, y_2)$ is minimal, then $(A, K, C, I, e, y)$ is also minimal.
The proof can be found in Appendix 4.B.
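Step 2 of Algorithm 9 is a plain block diagonal assembly; a minimal NumPy sketch (illustrative only):

```python
import numpy as np

def algorithm9_step2(A11, K11, C11, A22, K22, C22):
    """Step 2 of Algorithm 9: assemble the block diagonal system
    matrices of the extension (4.6) from the two subsystem
    realizations (A11, K11, C11) and (A22, K22, C22)."""
    def blkdiag(M1, M2):
        # place M1 and M2 on the diagonal, zeros elsewhere
        return np.block([[M1, np.zeros((M1.shape[0], M2.shape[1]))],
                         [np.zeros((M2.shape[0], M1.shape[1])), M2]])
    return blkdiag(A11, A22), blkdiag(K11, K22), blkdiag(C11, C22)
```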
Next, we discuss the extension of Kalman representations in coordinated form. To extend a Kalman representation of $[y_2^\top, y_3^\top]^\top$ in block triangular form to a Kalman representation of $[y_1^\top, y_2^\top, y_3^\top]^\top$ in coordinated form, we will use the following definition:
Definition 4.25. Consider an observable Kalman representation
$$S = \left( \begin{bmatrix} A_{22} & A_{23} \\ 0 & A_{33} \end{bmatrix}, \begin{bmatrix} K_{22} & K_{23} \\ 0 & K_{33} \end{bmatrix}, \begin{bmatrix} C_{22} & C_{23} \\ 0 & C_{33} \end{bmatrix}, I, \begin{bmatrix} x_2 \\ x_3 \end{bmatrix}, \begin{bmatrix} e_2 \\ e_3 \end{bmatrix} \right)$$
of $[y_2^\top, y_3^\top]^\top$ in block triangular form. An extension of $S$ for $y = [y_1^\top, y_2^\top, y_3^\top]^\top$ in coordinated form is an observable Kalman representation of $y$ of the form
$$\begin{bmatrix} x_1(t+1) \\ x_2(t+1) \\ x_3(t+1) \end{bmatrix} = \begin{bmatrix} A_{11} & 0 & A_{13} \\ 0 & A_{22} & A_{23} \\ 0 & 0 & A_{33} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} K_{11} & 0 & K_{13} \\ 0 & K_{22} & K_{23} \\ 0 & 0 & K_{33} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{bmatrix}$$
$$\begin{bmatrix} y_1(t) \\ y_2(t) \\ y_3(t) \end{bmatrix} = \begin{bmatrix} C_{11} & 0 & C_{13} \\ 0 & C_{22} & C_{23} \\ 0 & 0 & C_{33} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{bmatrix}. \tag{4.7}$$
Next, we present Lemma 4.26, Algorithm 10 and Corollary 4.27 on extensions of Kalman representations in coordinated form.
Lemma 4.26. Consider a process y = [y1^T, y2^T, y3^T]^T and an observable Kalman representation S of [y2^T, y3^T]^T in block triangular form. If

(i) y1 does not Granger cause y3,

(ii) y2 does not Granger cause y3,

(iii) y1 conditionally does not Granger cause y2 with respect to y3,

then there exists an extension of S for y in coordinated form. Moreover, if S is a minimal Kalman representation in causal block triangular form and for i ≠ j, i, j = 1, 2,
\[
E_l[H^{y_i}_{t+} \mid H^{y_i,y_3}_{t-}] \cap E_l[H^{y_j}_{t+} \mid H^{y_j,y_3}_{t-}] \mid E_l[H^{y_3}_{t+} \mid H^{y_3}_{t-}] = \{0\},
\]
then there exists an extension of S for y in coordinated form which is a Kalman representation of y in causal coordinated form.
The proof can be found in Appendix 4.B.
Note that Lemma 4.26 can be seen as a consequence of Theorem 3.5 applied to a process y = [y1^T, y2^T, y3^T]^T. If S is minimal and in causal block triangular form, then by the isomorphism between minimal Kalman representations, the representations of [y2^T, y3^T]^T in Theorem 3.5 and in Lemma 4.26 are isomorphic. Otherwise, the representation of [y2^T, y3^T]^T in Theorem 3.5 can be transformed to the representation S in Lemma 4.26 with a non-singular state-space transformation. Also, the representation of y in Theorem 3.5 can be transformed to an observable Kalman representation of y that is the extension of the representation S in Lemma 4.26. Therefore, Theorem 3.5 ensures the existence of the Kalman representations of y in Lemma 4.19, provided that the conditional and unconditional Granger non-causality conditions hold.

An algorithm that calculates the representation in Lemma 4.26 is presented next. Consider a process y = [y1^T, y2^T, y3^T]^T, its innovation process e and its covariance sequence {Λ_k^y}_{k=0}^{2N}, where N is larger than or equal to the dimension of a minimal Kalman representation of y. Assume that y1 and y2 do not Granger cause each other with respect to y3 and that y1, y2 do not Granger cause y3. Apply Algorithm 10 with input {A2, K2, C2} and {Λ_k^y}_{k=0}^{2N}, where {A2, K2, C2} are the system matrices of an observable Kalman representation (A2, K2, C2, I, e2) of [y2^T, y3^T]^T in block triangular form. Denote the output by {A, K, C}. Then we have the following result:
Corollary 4.27 (Correctness of Algorithm 10). The tuple (A, K, C, I, e, y) is an observable Kalman representation and it is an extension of (A2, K2, C2, I, e2, y2) for y in coordinated form. Furthermore, if (A2, K2, C2, I, e2, y2) is a minimal Kalman representation in causal block triangular form, then (A, K, C, I, e, y) is a Kalman representation in causal coordinated form.
The proof can be found in Appendix 4.B.
Remark 4.28. Similar to Algorithm 10, Algorithms 7 and 6 in Chapter 3 also calculate Kalman representations in coordinated form. However, Algorithms 7 and 6 are formulated for a more general process class, where y has n ≥ 3 components. Furthermore, Algorithm 10 takes, besides the covariances of y, the system matrices of a Kalman representation of [y2^T, y3^T]^T as its input; this input Kalman representation is a sub-system of the Kalman representation of y that the output matrices of Algorithm 10 define. Therefore, contrary to Algorithms 7 and 6, the Kalman representation that Algorithm 10 defines is not necessarily in a causal coordinated form.

Algorithm 10 Extension of an observable Kalman representation in coordinated form
Input {A2, K2, C2} and {Λ_k^y}_{k=0}^{2N}: System matrices
\[
A_2 = \begin{bmatrix} A_{22} & A_{23} \\ 0 & A_{33} \end{bmatrix}, \qquad K_2 = \begin{bmatrix} K_{22} & K_{23} \\ 0 & K_{33} \end{bmatrix}, \qquad C_2 = \begin{bmatrix} C_{22} & C_{23} \\ 0 & C_{33} \end{bmatrix}
\]
of an observable Kalman representation of [y2^T, y3^T]^T and covariance sequence of y = [y1^T, y2^T, y3^T]^T
Output {A, K, C}: System matrices of (4.7)
Step 1 Apply Algorithm 7 with input {Λ_k^y}_{k=0}^{2N} and denote its output by {Â, K̂, Ĉ}, where
\[
\hat{A} = \begin{bmatrix} \hat{A}_{11} & 0 & \hat{A}_{13} \\ 0 & \hat{A}_{22} & \hat{A}_{23} \\ 0 & 0 & \hat{A}_{33} \end{bmatrix}, \qquad \hat{K} = \begin{bmatrix} \hat{K}_{11} & 0 & \hat{K}_{13} \\ 0 & \hat{K}_{22} & \hat{K}_{23} \\ 0 & 0 & \hat{K}_{33} \end{bmatrix}, \qquad \hat{C} = \begin{bmatrix} \hat{C}_{11} & 0 & \hat{C}_{13} \\ 0 & \hat{C}_{22} & \hat{C}_{23} \\ 0 & 0 & \hat{C}_{33} \end{bmatrix}
\]
Step 2 Define T = Ô_N^+ O_N, where Ô_N^+ is the left inverse of the finite (up to N) observability matrix Ô_N of (Â33, Ĉ33) and O_N is the finite (up to N) observability matrix of (A33, C33).
Step 3 Define the following matrices
\[
A = \begin{bmatrix} \hat{A}_{11} & 0 & \hat{A}_{13} T \\ 0 & A_{22} & A_{23} \\ 0 & 0 & A_{33} \end{bmatrix}, \qquad K = \begin{bmatrix} \hat{K}_{11} & 0 & \hat{K}_{13} \\ 0 & K_{22} & K_{23} \\ 0 & 0 & K_{33} \end{bmatrix}, \qquad C = \begin{bmatrix} \hat{C}_{11} & 0 & \hat{C}_{13} T \\ 0 & C_{22} & C_{23} \\ 0 & 0 & C_{33} \end{bmatrix}
\]
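Step 2 of Algorithm 10 matches the state bases of the two representations of y3 through their finite observability matrices. A minimal sketch, assuming both pairs are observable so that the Moore–Penrose pseudoinverse serves as the left inverse:

```python
import numpy as np

def observability_matrix(A, C, N):
    """Finite (up to N) observability matrix [C; CA; ...; C A^{N-1}]."""
    blocks, M = [], C
    for _ in range(N):
        blocks.append(M)
        M = M @ A
    return np.vstack(blocks)

def state_transformation(A33_hat, C33_hat, A33, C33, N):
    """Step 2 of Algorithm 10 (sketch): T = O_hat_N^+ O_N, where O_hat_N^+
    is a left inverse of the observability matrix of (A33_hat, C33_hat)
    and O_N is the observability matrix of (A33, C33). For an observable
    pair, the pseudoinverse is such a left inverse."""
    O_hat = observability_matrix(A33_hat, C33_hat, N)
    O_N = observability_matrix(A33, C33, N)
    return np.linalg.pinv(O_hat) @ O_N
```

If the two pairs realize the same subprocess y3 in different bases, T is the change of basis between their state processes, which is exactly how Â13 and Ĉ13 are adjusted in Step 3.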
4.3.2 Algorithms for Kalman representation with causal TADG-zero structure
To formulate the algorithms that calculate a Kalman representation of y with G-zero structure, we will use Algorithms 8, 9, and 10. Notice that these algorithms only calculate system matrices of Kalman representations if certain Granger causality conditions hold. We ensure these conditions by relying on the following result:
Lemma 4.29. Consider a process y = [y1^T, ..., yn^T]^T and a TADG G = (V, E) with V = {1, ..., n}, and assume that y has G-consistent causality structure. Then for any j ∈ {1, ..., n − 1} the following holds:

(i) yj does not Granger cause y_{Ij}

(ii) y_{Īj} does not Granger cause y_{Ij}

(iii) yj conditionally does not Granger cause y_{Īj} with respect to y_{Ij}

(iv) y_{Īj} conditionally does not Granger cause yj with respect to y_{Ij}.
The proof can be found in Appendix 4.C.
Notice that for j ≠ n, Īj = ∅ and Ij = ∅ cannot happen at the same time, since Īj ∪ Ij = {j + 1, ..., n}. When Īj = ∅, then Ij = {j + 1, ..., n} and the Granger causality conditions (i), (ii), (iii), and (iv) simplify to the condition that yj does not Granger cause y_{Ij}. Hence, Algorithm 8 can be applied. On the other hand, if Ij = ∅, then Īj = {j + 1, ..., n} and the Granger causality conditions (i), (ii), (iii), and (iv) in Lemma 4.29 simplify to the conditions that y_{Īj} does not Granger cause yj and yj does not Granger cause y_{Īj}. Therefore, Algorithm 9 can be applied. If neither Ij nor Īj is the empty set, then from conditions (i)–(iv) Algorithm 10 can be applied.

Consider a process y = [y1^T, ..., yn^T]^T and a TADG G = (V, E) with V = {1, ..., n}, and assume that y has G-consistent causality structure. The algorithms that calculate a Kalman representation of y with G-zero structure are elaborated in Algorithms 11 and 12 below. Algorithm 11 takes the covariances of y as its input and transforms them into a Kalman representation of y with G-zero structure. Algorithm 12 calculates the same representation from an LTI–SS representation of y. Note that by using empirical covariances, Algorithm 11 can be applied to data.
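Since Algorithm 11 only needs the covariance sequence, it can be run on data by substituting sample estimates for the true covariances. A minimal sketch of this estimation step (the function name and data layout are our own choices, not part of the algorithm):

```python
import numpy as np

def empirical_covariances(Y, max_lag):
    """Sample covariances of a zero-mean stationary process.
    Y has shape (T, p): rows are observations y(0), ..., y(T-1) in R^p.
    Returns estimates of Lambda_k = E[y(t+k) y(t)^T] for
    k = 0, ..., max_lag, which can replace the true covariance sequence
    {Lambda_k^y} in the input of Algorithm 11."""
    T = Y.shape[0]
    return [Y[k:].T @ Y[:T - k] / (T - k) for k in range(max_lag + 1)]
```

With max_lag = 2N, the returned list plays the role of {Λ_k^y}_{k=0}^{2N}; the quality of the resulting Kalman representation then depends on the sample size T.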
Let {Λ_k^y}_{k=0}^∞ be the covariance sequence of y and e the innovation process of y. Furthermore, let N be any number larger than or equal to the dimension of a minimal Kalman representation of y. Assume that y has G-consistent causality structure and note that Algorithms 8, 9, and 10 calculate Kalman representations in causal block triangular, block diagonal, and causal coordinated form, respectively (see Corollaries 4.20, 4.24, and 4.27). Apply Algorithm 11 with input {Λ_k^y}_{k=0}^{2N} and denote its output by {A, K, C}. Now apply Algorithm 12 with input {Ā, B̄, C̄, D̄, Λ_0^v}, where (Ā, B̄, C̄, D̄, v) defines an LTI–SS representation of y and Λ_0^v = E[v(t)v^T(t)]. Denote the output by {Â, K̂, Ĉ}. Then we have the following result.
Corollary 4.30 (Correctness of Algorithm 11 and Algorithm 12). The tuples (A, K, C, I, e) and (Â, K̂, Ĉ, I, e) are observable Kalman representations of y with G-zero structure. Furthermore, if for all nodes j ∈ V in G
\[
E_l[H^{y_j}_{t+} \mid H^{y_j, y_{I_j}}_{t-}] \cap E_l[H^{y_{\bar{I}_j}}_{t+} \mid H^{y_{\bar{I}_j}, y_{I_j}}_{t-}] \mid E_l[H^{y_{I_j}}_{t+} \mid H^{y_{I_j}}_{t-}] = \{0\},
\]
then (A, K, C, I, e) and (Â, K̂, Ĉ, I, e) are minimal Kalman representations of y with causal G-zero structure.
Algorithm 11 Kalman representation with causal G-zero structure based on output covariances
Input {Λ_k^y}_{k=0}^{2N}: Covariance sequence of y = [y1^T, ..., yn^T]^T
Output {A, K, C}: System matrices of a Kalman representation of y with G-zero structure
Step 1 Apply Algorithm 1 with input {Λ_k^{y_n}}_{k=0}^{2N} and denote its output by {A_n, K_n, C_n, Q^{e_n}}.
Step 2
for i = n, n − 1, ..., 2
  if Ī_{i−1} = ∅ then apply Algorithm 8 with input {A_i, K_i, C_i} and {Λ_k^z}_{k=0}^{2N}, where z = [y_{i−1}^T, y_{\{i,...,n\}}^T]^T. Denote the output by {A_{i−1}, K_{i−1}, C_{i−1}}.
  else if I_{i−1} = ∅ then apply Algorithm 9 with input {A_i, K_i, C_i} and {Λ_k^{y_{i−1}}}_{k=0}^{2N}. Denote the output by {A_{i−1}, K_{i−1}, C_{i−1}}.
  end if
  if I_{i−1} ≠ ∅ and Ī_{i−1} ≠ ∅ then apply Algorithm 10 with input {A_i, K_i, C_i} and {Λ_k^z}_{k=0}^{2N}, where z = [y_{i−1}^T, y_{Ī_{i−1}}^T, y_{I_{i−1}}^T]^T. Denote the output by {A_{i−1}, K_{i−1}, C_{i−1}}.
  end if
end for
Step 3 Define A = A_1, K = K_1 and C = C_1.
Algorithm 12 Kalman representation with G-zero structure based on LTI–SS representation
Input {Ā, B̄, C̄, D̄, Λ_0^v}, G = (V, E): System matrices of an LTI–SS representation (Ā, B̄, C̄, D̄, v) of y and variance matrix of v
Output {A, K, C}: System matrices of a Kalman representation of y with G-zero structure
Step 1 Find the solution Σ^x of the Lyapunov equation Σ = ĀΣĀ^T + B̄Λ_0^v B̄^T.
Step 2 Define G := C̄Σ^x Ā^T + D̄Λ_0^v B̄^T and calculate the output covariance matrices Λ_k^y := C̄Ā^{k−1}G^T for k = 0, ..., 2n, where n is such that Ā ∈ R^{n×n}.
Step 3 Apply Algorithm 11 with input {Λ_k^y}_{k=0}^{2n} and denote the output by {A, K, C}.
The proof can be found in Appendix 4.C.
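Steps 1 and 2 of Algorithm 12 amount to solving a discrete Lyapunov equation and propagating the resulting covariances. A minimal sketch; we read the k = 0 case of the formula in Step 2 as the stationary output variance C̄Σ^x C̄^T + D̄Λ_0^v D̄^T, which is an assumption on our part, since C̄Ā^{k−1}G^T is only defined for k ≥ 1:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def output_covariances(A_bar, B_bar, C_bar, D_bar, Lambda_v0):
    """Steps 1-2 of Algorithm 12 (sketch). From an LTI-SS representation
    (A_bar, B_bar, C_bar, D_bar, v) of y with E[v(t) v(t)^T] = Lambda_v0:
    Step 1: Sigma_x solves Sigma = A Sigma A^T + B Lambda_v0 B^T.
    Step 2: G := C Sigma_x A^T + D Lambda_v0 B^T and
            Lambda_k^y := C A^{k-1} G^T for k = 1, ..., 2n,
            with Lambda_0^y := C Sigma_x C^T + D Lambda_v0 D^T."""
    n = A_bar.shape[0]
    Sigma_x = solve_discrete_lyapunov(A_bar, B_bar @ Lambda_v0 @ B_bar.T)
    G = C_bar @ Sigma_x @ A_bar.T + D_bar @ Lambda_v0 @ B_bar.T
    Lambdas = [C_bar @ Sigma_x @ C_bar.T + D_bar @ Lambda_v0 @ D_bar.T]
    A_pow = np.eye(n)
    for _ in range(2 * n):
        Lambdas.append(C_bar @ A_pow @ G.T)
        A_pow = A_pow @ A_bar
    return Lambdas  # covariance sequence for Step 3 (Algorithm 11)
```

For a stable Ā the Lyapunov equation has a unique positive semidefinite solution, so the covariance sequence is well defined.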
4.4 Conclusions
In this chapter, we have studied Kalman representations whose network graphs are transitive acyclic directed graphs (TADGs), called Kalman representations with
TADG-zero structure. This class of Kalman representations has been related to conditional Granger causality conditions among the components of their output processes. More precisely, we have shown that there exists a Kalman representation with a TADG G-zero structure if and only if certain conditional Granger causality conditions hold that are determined by G. To construct the Kalman representations in question, we provided algorithms that take an arbitrary LTI–SS representation of the output process, or the covariance sequence of that process, as their input. In fact, the latter input can be substituted with empirical covariances, and thus the algorithm can be applied to data. Also, the results deal with the minimality of the representations and the so-called coercive property of the output processes.
4.A Proofs of Lemmas 4.10 and 4.13
Proof of Lemma 4.10. Consider a TADG G = (V = {1, ..., n}, E) and a process y = [y1^T, ..., yn^T]^T where y_i ∈ R^{r_i}, for r_i > 0, i = 1, ..., n. Let S = (A, K, C, I, e) and Ŝ = (Â, K̂, Ĉ, I, e) be two Kalman representations of y with causal G-zero structure. Then, by Definition 4.9, for J := (1, Ī_1, I_1) the tuples
\[
S_{JJ} = (A_{JJ}, K_{JJ}, C_{JJ}, I, [e_1^T, e_{\bar{I}_1}^T, e_{I_1}^T]^T), \qquad \hat{S}_{JJ} = (\hat{A}_{JJ}, \hat{K}_{JJ}, \hat{C}_{JJ}, I, [e_1^T, e_{\bar{I}_1}^T, e_{I_1}^T]^T)
\]
are Kalman representations of [y1^T, y_{Ī_1}^T, y_{I_1}^T]^T in causal coordinated form. By using Lemma 3.2, it follows that S_{JJ} and Ŝ_{JJ} are isomorphic with a non-singular transformation matrix T, i.e., A_{JJ} = T Â_{JJ} T^{−1}, K_{JJ} = T K̂_{JJ}, C_{JJ} = Ĉ_{JJ} T^{−1}. Let the state processes of S and Ŝ be x = [x1^T, ..., xn^T]^T and x̂ = [x̂1^T, ..., x̂n^T]^T, respectively. Assume that Ī_1 = {i_1, ..., i_k} and that I_1 = {i_{k+1}, ..., i_{n−1}}. Define the permutation matrices P_y and P_x such that
\[
[y_1^T, \ldots, y_n^T]^T = P_y [y_1^T, y_{\bar{I}_1}^T, y_{I_1}^T]^T, \qquad [x_1^T, \ldots, x_n^T]^T = P_x [x_1^T, x_{\bar{I}_1}^T, x_{I_1}^T]^T.
\]
Note that P_y^{−1} = P_y^T and P_x^{−1} = P_x^T. Then
\[
A = P_x^T A_{JJ} P_x, \quad K = P_x^T K_{JJ} P_y, \quad C = P_y^T C_{JJ} P_x, \qquad \hat{A} = P_x^T \hat{A}_{JJ} P_x, \quad \hat{K} = P_x^T \hat{K}_{JJ} P_y, \quad \hat{C} = P_y^T \hat{C}_{JJ} P_x.
\]
Therefore, by using A_{JJ} = T Â_{JJ} T^{−1} we obtain that P_x A P_x^T = T P_x Â P_x^T T^{−1}; by using K_{JJ} = T K̂_{JJ} we obtain that P_x K P_y^T = T P_x K̂ P_y^T; and lastly, by using C_{JJ} = Ĉ_{JJ} T^{−1} we obtain that P_y C P_x^T = P_y Ĉ P_x^T T^{−1}. It then follows that by the transformation matrix T̃ = P_x^T T P_x the representations S and Ŝ are isomorphic.
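The proof above rests on the orthogonality of permutation matrices, P^{−1} = P^T, which is what allows the grouped and natural orderings of the state and output components to be exchanged freely. A small numerical illustration (the concrete grouping is hypothetical, chosen only for the example):

```python
import numpy as np

# A permutation matrix that maps the stacked vector [y1; y2; y3; y4] to a
# grouped order [y1; y_Ibar1; y_I1] = [y1; y3; y4; y2], pretending (for
# illustration only) that Ibar_1 = {3, 4} and I_1 = {2}.
P = np.eye(4)[[0, 2, 3, 1]]

v = np.array([1.0, 2.0, 3.0, 4.0])   # stands for [y1; y2; y3; y4]
grouped = P @ v                       # grouped order [y1; y3; y4; y2]
back = P.T @ grouped                  # P^{-1} = P^T recovers the original
```

The same property, applied to P_x and P_y, is what turns the isomorphism T of the grouped representations into the isomorphism T̃ = P_x^T T P_x of the original ones.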
For the proof of Lemma 4.13 we need auxiliary lemmas on the properties of Granger and conditional Granger causality. First, we recall Lemma 3.12 from Appendix 3.B and Lemma 3.6 from Chapter 3:
Lemma 4.31 (Lemma 3.12). Consider a ZMSIR process y = [y1^T, y2^T, y3^T, y4^T]^T. Then y1 and y2 conditionally do not Granger cause y3 with respect to y4 if and only if [y1^T, y2^T]^T conditionally does not Granger cause y3 with respect to y4.
Lemma 4.32 (Lemma 3.6). Consider a process y = [y1^T, y2^T, y3^T]^T and the following conditions:

(i) y1 does not Granger cause y3

(ii) y2 does not Granger cause y3

(iii) y1 conditionally does not Granger cause y2 with respect to y3

(iv) y1 does not Granger cause [y2^T, y3^T]^T

Then (i)-(ii)-(iii) hold if and only if (ii)-(iv) hold.
There are two additional results that we employ in the proof of Lemma 4.13. The first one is as follows:
Lemma 4.33. Consider a process y = [y1^T, y2^T, y3^T, y4^T]^T. If y1 and y2 conditionally do not Granger cause y3 with respect to y4, then y1 conditionally does not Granger cause y3 with respect to [y2^T, y4^T]^T.

Proof. Let α = y3(t + s) − E_l[y3(t + s) | H^{y3,y4}_{t−}] for some t, s ∈ Z, s > 0. Then from the conditional Granger non-causality conditions we obtain that α = y3(t + s) − E_l[y3(t + s) | H^{y1,y3,y4}_{t−}] and α = y3(t + s) − E_l[y3(t + s) | H^{y2,y3,y4}_{t−}]. Therefore, α is orthogonal to H^{y2,y3,y4}_{t−} and to H^{y1,y3,y4}_{t−} and thus to H^{y1,y2,y3,y4}_{t−}. This implies that E_l[α | H^{y1,y2,y3,y4}_{t−}] = 0 and thus E_l[y3(t + s) | H^{y1,y2,y3,y4}_{t−}] = E_l[y3(t + s) | H^{y3,y4}_{t−}]. From the condition that y2 conditionally does not Granger cause y3 w.r.t. y4, the latter is further equal to E_l[y3(t + s) | H^{y2,y3,y4}_{t−}]. That is, E_l[y3(t + s) | H^{y1,y2,y3,y4}_{t−}] = E_l[y3(t + s) | H^{y2,y3,y4}_{t−}], which holds for any choice of t, s ∈ Z, s > 0. This, by definition, means that y1 conditionally does not Granger cause y3 w.r.t. [y2^T, y4^T]^T.
The last auxiliary lemma that helps us in proving Lemma 4.13 is presented below.

Lemma 4.34. Consider a process y = [y1^T, ..., yn^T]^T and a TADG G = (V = {1, ..., n}, E). Then for any node i ∈ V, conditions (i), (ii), and (iii) below hold if and only if conditions (i), (iv), and (v) hold:

(i) y_i and y_{Ī_i} do not Granger cause y_{I_i}

(ii) y_i does not Granger cause [y_{I_i}^T, y_{Ī_i}^T]^T

(iii) y_{Ī_i} does not Granger cause [y_{I_i}^T, y_i^T]^T

(iv) y_i conditionally does not Granger cause y_{Ī_i} with respect to y_{I_i}

(v) y_{Ī_i} conditionally does not Granger cause y_i with respect to y_{I_i}

Proof. Considering (i), (ii), and (iii), we can apply Lemma 4.32 to y = [y_i^T, y_{Ī_i}^T, y_{I_i}^T]^T and to y = [y_{Ī_i}^T, y_i^T, y_{I_i}^T]^T. As a result, we obtain that the conditions (i), (ii), and (iii) are equivalent to the conditions (i), (iv), and (v).
Remark 4.35. A Granger causality condition that y1 does not Granger cause [y2^T, y3^T]^T means by definition that
\[
E_l\!\left[\begin{bmatrix} y_2(t+k) \\ y_3(t+k) \end{bmatrix} \,\middle|\, H^{y_1,y_2,y_3}_{t-}\right] = E_l\!\left[\begin{bmatrix} y_2(t+k) \\ y_3(t+k) \end{bmatrix} \,\middle|\, H^{y_2,y_3}_{t-}\right]
\]
for all t, k ∈ Z, k > 0. By looking at the latter component-wise, an equivalent form is that y1 conditionally does not Granger cause y2 with respect to y3 and y1 conditionally does not Granger cause y3 with respect to y2.

Also, it trivially holds for any process [y1^T, y2^T, y3^T]^T that y1 conditionally does not Granger cause y2 with respect to [y1^T, y3^T]^T. That is, the conditional Granger non-causality holds automatically because y1 is in the conditioning set of the conditional Granger non-causality.
Now we are ready to present the proof of Lemma 4.13.
Proof of Lemma 4.13. Necessity: We will prove that if the conditions (i), (ii), and (iii) in Lemma 4.34 hold for all i ∈ V, then y has G-consistent causality structure. From Lemma 4.34 we know that (i), (ii), and (iii) imply (iv) and (v). By Lemma 4.31, (v) holds if and only if for all i ∈ Ī_j, y_i does not Granger cause y_j with respect to y_{I_j}. Recall that y has G-consistent causality structure if (i, j) ∉ E implies that y_i does not Granger cause y_j with respect to y_{I_j}. Therefore, considering (v), it remains to show that y_i does not Granger cause y_j with respect to y_{I_j} for all i ∈ V ∖ Ī_j where (i, j) ∉ E.

Define the set S = {i ∈ V | i < j} and notice that S = {i ∈ V ∖ Ī_j | (i, j) ∉ E}. Therefore, to finish our proof, we have to show that for any i ∈ S, y_i does not Granger cause y_j with respect to y_{I_j}. Fix an s = j − L ∈ S and apply condition (ii) to y_{j−l}, l = L, ..., 1: for component y_j it gives that y_{j−l} conditionally does not Granger cause y_j with respect to [y_{I_{j−l}}^T, y_{Ī_{j−l}}^T]^T; see also Remark 4.35. From I_{j−l} ∪ Ī_{j−l} = I_{j−l+1} ∪ Ī_{j−l+1} ∪ {j − l + 1} the latter implies that
\[
\begin{aligned}
E_l[y_j(t+k) \mid H^{y_{I_{j-L}}, y_{\bar{I}_{j-L}}, y_{j-L}}_{t-}] &= E_l[y_j(t+k) \mid H^{y_{I_{j-L}}, y_{\bar{I}_{j-L}}}_{t-}] \\
= E_l[y_j(t+k) \mid H^{y_{I_{j-L+1}}, y_{\bar{I}_{j-L+1}}, y_{j-L+1}}_{t-}] &= E_l[y_j(t+k) \mid H^{y_{I_{j-L+1}}, y_{\bar{I}_{j-L+1}}}_{t-}] \\
= \cdots = E_l[y_j(t+k) \mid H^{y_{I_{j-1}}, y_{\bar{I}_{j-1}}, y_{j-1}}_{t-}] &= E_l[y_j(t+k) \mid H^{y_{I_{j-1}}, y_{\bar{I}_{j-1}}}_{t-}] \\
= E_l[y_j(t+k) \mid H^{y_{I_j}, y_{\bar{I}_j}, y_j}_{t-}] &= E_l[y_j(t+k) \mid H^{y_{I_j}, y_j}_{t-}].
\end{aligned} \tag{4.8}
\]
Projecting the equation above onto H^{y_{I_j}, y_j, y_{j−L}}_{t−} and considering that
\[
H^{y_{I_{j-L}}, y_{\bar{I}_{j-L}}, y_{j-L}}_{t-} \supseteq H^{y_{I_j}, y_j, y_{j-L}}_{t-} \supseteq H^{y_{I_j}, y_j}_{t-},
\]
it follows that E_l[y_j(t+k) | H^{y_{I_j}, y_j, y_{j−L}}_{t−}] = E_l[y_j(t+k) | H^{y_{I_j}, y_j}_{t−}]. This, by definition, means that y_{j−L} does not Granger cause y_j with respect to y_{I_j}. Since s = j − L was an arbitrary element in S, this proves that for any s ∈ S, y_s does not Granger cause y_j with respect to y_{I_j}, which completes the necessity part of the proof.
Sufficiency: Below, we show that the G-consistent causality structure of y implies (i), (ii), and (iii) in Lemma 4.34, respectively.

G-consistent causality implies (i): Consider a node j ∈ V and let S be a subset of the set {l ∈ V | (l, j) ∉ E}. Then, since for all s ∈ S, y_s does not Granger cause y_j with respect to y_{I_j}, from Lemma 4.31 it follows that y_S does not Granger cause y_j with respect to y_{I_j}. By definition this gives that
\[
E_l[y_j(t+k) \mid H^{y_{I_j}, y_j, y_S}_{t-}] = E_l[y_j(t+k) \mid H^{y_{I_j}, y_j}_{t-}]. \tag{4.9}
\]
Next, by using (4.9), we show that for any i ∈ V, y_i does not Granger cause y_{I_i}. Let j ∈ I_i and notice that since G is acyclic, (i, j) ∉ E. Moreover, for any l ∈ I_i ∖ I_j, (l, j) ∉ E; hence, applying (4.9) to S = {i} ∪ (I_i ∖ I_j) and then to S = I_i ∖ I_j, it follows that
\[
E_l[y_j(t+k) \mid H^{y_{I_i}, y_i}_{t-}] = E_l[y_j(t+k) \mid H^{y_{I_j}, y_j}_{t-}] = E_l[y_j(t+k) \mid H^{y_{I_i}, y_j}_{t-}]
\]
for every j ∈ I_i. In other words, y_i does not Granger cause y_{I_i}; see Remark 4.35.

For proving that y_{Ī_i} does not Granger cause y_{I_i}, we apply (4.9) for S = Ī_i and j ∈ I_i. Notice that Ī_i ⊆ {l ∈ V | (l, j) ∉ E}, since if l ∈ Ī_i, then (l, j) ∉ E for any j ∈ I_i; otherwise (l, j), (j, i) ∈ E would imply (l, i) ∈ E by transitivity, which contradicts l ∈ Ī_i.
G-consistent causality implies (ii): Let s ∈ I_j ∪ Ī_j. Then notice that since s > j, we know that (j, s) ∉ E. Hence, from the fact that y has G-consistent causality structure, y_j conditionally does not Granger cause y_s with respect to y_{I_s}. Let S = (I_j ∪ Ī_j) ∖ I_s and assume that S = {s_1, ..., s_L}. To see that condition (ii) follows from the G-consistent causality structure of y, we first show that y_j and y_{s_{p+1}} conditionally do not Granger cause y_s with respect to [y_{I_s}^T, y_{s_1,...,s_p}^T]^T for all p = 1, ..., L − 1.

Notice that (s_k, s) ∉ E for all k = 1, ..., L; hence, y_{s_1} and y_{s_2} conditionally do not Granger cause y_s with respect to y_{I_s}. Considering the latter two conditional Granger non-causalities, we can apply Lemma 4.33 to [y_j^T, y_{s_1}^T, y_s^T, y_{I_s}^T]^T and to [y_{s_2}^T, y_{s_1}^T, y_s^T, y_{I_s}^T]^T. Then, we obtain that y_j and y_{s_2} conditionally do not Granger cause y_s with respect to [y_{I_s}^T, y_{s_1}^T]^T. Assume by induction that y_j and y_{s_p} conditionally do not Granger cause y_s with respect to [y_{I_s}^T, y_{s_1,...,s_{p−1}}^T]^T, where p is smaller than the number of elements in S. From this, we can apply Lemma 4.33 to [y_j^T, y_{s_p}^T, y_s^T, [y_{I_s}^T, y_{s_1,...,s_{p−1}}^T]^T]^T and to [y_{s_{p+1}}^T, y_{s_p}^T, y_s^T, [y_{I_s}^T, y_{s_1,...,s_{p−1}}^T]^T]^T. As a result, we obtain that y_j and y_{s_{p+1}} conditionally do not Granger cause y_s with respect to [y_{I_s}^T, y_{s_1,...,s_p}^T]^T, which completes the induction.

From the discussion above, y_j and y_{s_L} conditionally do not Granger cause y_s with respect to [y_{I_s}^T, y_{s_1,...,s_{L−1}}^T]^T. By applying Lemma 4.33 to [y_j^T, y_{s_L}^T, y_s^T, [y_{I_s}^T, y_{s_1,...,s_{L−1}}^T]^T]^T we obtain that y_j conditionally does not Granger cause y_s with respect to [y_{I_s}^T, y_S^T]^T. Since s is an arbitrary element of I_j ∪ Ī_j, the latter is equivalent to condition (ii) if we look at the condition component-wise; see Remark 4.35.

G-consistent causality implies (iii): From Lemma 4.32 it follows that (i) and (v) are equivalent to (i) and (iii). Since (i) holds, in order to prove (iii) we will instead prove (v). Let j ∈ V and Ī_j = {ī_1, ..., ī_p}, and notice that from (ī_k, j) ∉ E it follows that y_{ī_k} conditionally does not Granger cause y_j with respect to y_{I_j}, k = 1, ..., p. Then, applying Lemma 4.31 to [y_{ī_1}^T, y_{ī_2}^T, y_j^T, y_{I_j}^T]^T we obtain that [y_{ī_1}^T, y_{ī_2}^T]^T conditionally does not Granger cause y_j with respect to y_{I_j}. Apply now Lemma 4.31 to [y_{ī_1,ī_2}^T, y_{ī_3}^T, y_j^T, y_{I_j}^T]^T, ..., [y_{ī_1,...,ī_{p−1}}^T, y_{ī_p}^T, y_j^T, y_{I_j}^T]^T, respectively. It then implies that y_{Ī_j} conditionally does not Granger cause y_j with respect to y_{I_j}, i.e., (v) holds.
4.B Proofs of auxiliary results in Section 4.3.1
In this section, we present the proofs of Lemmas 4.19, 4.23, and 4.26 and Corollaries 4.20, 4.24, and 4.27 from Section 4.3.1. These results are used later on in the proof of Theorem 4.15.
Proof of Lemma 4.19. Consider an observable Kalman representation S_2 = (A22, K22, C22, I, e2, y2) with state process x2 ∈ R^{n_2}. Then E_l[y2(t + k) | H^{y2}_{t−}] = C22 A22^k x2(t) for all k ≥ 1 and thus E_l[Y2(t) | H^{y2}_{t−}] = O_{n_2} x2(t), where O_{n_2} is the finite observability matrix of (A22, C22) (up to n_2) and Y2(t) = [y2^T(t) ... y2^T(t + n_2 − 1)]^T.

Recall now the result (i) ⟺ (iii) of Theorem 2.5 in Chapter 2:

Corollary 4.36 (Theorem 2.5, (i) ⟺ (iii)). Consider a ZMSIR process y = [y1^T, y2^T]^T. Then y1 does not Granger cause y2 if and only if there exists a minimal Kalman representation of y in block triangular form
\[
\begin{aligned}
\begin{bmatrix} \hat{x}_1(t+1) \\ \hat{x}_2(t+1) \end{bmatrix} &= \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} \\ 0 & \hat{A}_{22} \end{bmatrix} \begin{bmatrix} \hat{x}_1(t) \\ \hat{x}_2(t) \end{bmatrix} + \begin{bmatrix} \hat{K}_{11} & \hat{K}_{12} \\ 0 & \hat{K}_{22} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix} \\
\begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} &= \begin{bmatrix} \hat{C}_{11} & \hat{C}_{12} \\ 0 & \hat{C}_{22} \end{bmatrix} \begin{bmatrix} \hat{x}_1(t) \\ \hat{x}_2(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix},
\end{aligned} \tag{4.10}
\]
where (Â22, K̂22, Ĉ22, I, e2) is a minimal Kalman representation of y2.

Consider a minimal Kalman representation (4.10) of y (by assumption y1 does not Granger cause y2). Then (Â22, K̂22, Ĉ22, I, e2, y2) is a minimal Kalman representation with state process x̂2. Notice that E_l[Y2(t) | H^{y2}_{t−}] = Ô_{n_2} x̂2(t), where Ô_{n_2} is the finite observability matrix of (Â22, Ĉ22) (up to n_2, see (4.5)) and Y2(t) = [y2^T(t) ... y2^T(t + n_2 − 1)]^T. Since (Â22, K̂22, Ĉ22, I, e2, y2) is a minimal, thus observable, Kalman representation, we have that Ô^+_{n_2} E_l[Y2(t) | H^{y2}_{t−}] = x̂2(t), where Ô^+_{n_2} is the left inverse of Ô_{n_2}. Define now T = Ô^+_{n_2} O_{n_2} and notice that x̂2 = T x2. Then
\[
\begin{aligned}
\begin{bmatrix} \hat{x}_1(t+1) \\ x_2(t+1) \end{bmatrix} &= \begin{bmatrix} \hat{A}_{11} & \hat{A}_{12} T \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} \hat{x}_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} \hat{K}_{11} & \hat{K}_{12} \\ 0 & K_{22} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix} \\
\begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} &= \begin{bmatrix} \hat{C}_{11} & \hat{C}_{12} T \\ 0 & C_{22} \end{bmatrix} \begin{bmatrix} \hat{x}_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix}
\end{aligned} \tag{4.11}
\]
is a Kalman representation of y. Furthermore, it is observable since the observability of (A22, C22) and (Â11, Ĉ11) ensures the observability of (4.11). Hence, (4.11) is the extension of (A22, K22, C22, I, e2, y2) for y in block triangular form.

If (A22, K22, C22, I, e2) were a minimal Kalman representation of y2, then the dimensions of x̂2 and x2 would be the same, i.e., (4.11) would be a minimal Kalman representation in causal block triangular form.

Proof of Corollary 4.20. Let S_2 in the proof of Lemma 4.19 be (A22, K22, C22, I, e2, y2), where e2 is the innovation process of y2 and A22, K22, C22 are the input matrices of Algorithm 8. Then the representation (4.11) coincides with the Kalman representation (A, K, C, I, e), where A, K, C are the output matrices of Algorithm 8. This completes the proof.
Proof of Lemma 4.23. If y1 and y2 mutually do not Granger cause each other, then the innovation process e1 of y1 and the innovation process e2 of y2 together as [e1^T, e2^T]^T form the innovation process of y = [y1^T, y2^T]^T. Then, putting together a minimal Kalman representation (A11, K11, C11, I, e1) of y1 and an observable Kalman representation (A22, K22, C22, I, e2) of y2 into a block diagonal representation such as
\[
\begin{aligned}
\begin{bmatrix} x_1(t+1) \\ x_2(t+1) \end{bmatrix} &= \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} K_{11} & 0 \\ 0 & K_{22} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix} \\
\begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix} &= \begin{bmatrix} C_{11} & 0 \\ 0 & C_{22} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \end{bmatrix},
\end{aligned} \tag{4.12}
\]
we obtain an observable Kalman representation of y which is an extension of the Kalman representation (A22, K22, C22, I, e2) in block diagonal form. Note that the observability of (4.12) is ensured by the observability of (A11, C11) and (A22, C22).

If the representation (A22, K22, C22, I, e2) of y2 were minimal, then (4.12) would also be minimal. Indeed, the controllability of (A11, K11) and (A22, K22) ensures the controllability of (4.12) (see Proposition 1.10).

Proof of Corollary 4.24. Let the observable Kalman representation of y2 in the proof of Lemma 4.23 be (A22, K22, C22, I, e2, y2), where e2 is the innovation process of y2 and A22, K22, C22 are the input matrices of Algorithm 9. Then the representation (4.12) coincides with the Kalman representation (A, K, C, I, e), where A, K, C are the output matrices of Algorithm 9. This completes the proof.
Proof of Lemma 4.26. Consider an observable Kalman representation
\[
S = \left( \begin{bmatrix} A_{22} & A_{23} \\ 0 & A_{33} \end{bmatrix}, \begin{bmatrix} K_{22} & K_{23} \\ 0 & K_{33} \end{bmatrix}, \begin{bmatrix} C_{22} & C_{23} \\ 0 & C_{33} \end{bmatrix}, I, \begin{bmatrix} e_2 \\ e_3 \end{bmatrix} \right)
\]
of [y2^T, y3^T]^T in block triangular form, where dim(e_i) = dim(y_i) for i = 2, 3. Denote the tuple (A33, K33, C33, I, e3) by S_3. Notice that A33 is stable and, because of (ii), the noise process e3 is the innovation process of y3; hence S_3 is a Kalman representation of y3. Furthermore, it is observable. Then, by using (i), we can apply Lemma 4.19 to obtain an observable Kalman representation of [y1^T, y3^T]^T in block triangular form as follows:
\[
\begin{aligned}
\begin{bmatrix} x_1(t+1) \\ x_3(t+1) \end{bmatrix} &= \begin{bmatrix} A_{11} & A_{13} \\ 0 & A_{33} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} K_{11} & K_{13} \\ 0 & K_{33} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_3(t) \end{bmatrix} \\
\begin{bmatrix} y_1(t) \\ y_3(t) \end{bmatrix} &= \begin{bmatrix} C_{11} & C_{13} \\ 0 & C_{33} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_3(t) \end{bmatrix}.
\end{aligned} \tag{4.13}
\]
Combine the representation S of [y2^T, y3^T]^T and the representation (4.13) of [y1^T, y3^T]^T such that
\[
\begin{aligned}
\begin{bmatrix} x_1(t+1) \\ x_2(t+1) \\ x_3(t+1) \end{bmatrix} &= \begin{bmatrix} A_{11} & 0 & A_{13} \\ 0 & A_{22} & A_{23} \\ 0 & 0 & A_{33} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} K_{11} & 0 & K_{13} \\ 0 & K_{22} & K_{23} \\ 0 & 0 & K_{33} \end{bmatrix} \begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{bmatrix} \\
\begin{bmatrix} y_1(t) \\ y_2(t) \\ y_3(t) \end{bmatrix} &= \begin{bmatrix} C_{11} & 0 & C_{13} \\ 0 & C_{22} & C_{23} \\ 0 & 0 & C_{33} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix} + \begin{bmatrix} e_1(t) \\ e_2(t) \\ e_3(t) \end{bmatrix}.
\end{aligned} \tag{4.14}
\]
From conditions (iii) and (iv), it follows that e1 and e2 are the first and second components of the innovation process of y = [y1^T, y2^T, y3^T]^T. In addition, by using Lemma 3.12, which was recalled as Lemma 4.31 in Appendix 4.A, we obtain that the conditions (i) and (ii) are equivalent to the condition that [y1^T, y2^T]^T does not Granger cause y3. The latter implies that E_l[y3(t + k) | H^{y3}_{t−}] = E_l[y3(t + k) | H^{y}_{t−}] for all t, k ∈ Z, k > 0, i.e., e3(t) is the third component of the innovation process of y. As a consequence, (4.14) is a Kalman representation of y in coordinated form. Furthermore, (4.14) is observable since the pairs (A11, C11), (A22, C22) and (A33, C33) are observable pairs.

Assume that S is a minimal representation in causal block triangular form and that for i ≠ j, i, j = 1, 2,
\[
E_l[H^{y_i}_{t+} \mid H^{y_i,y_3}_{t-}] \cap E_l[H^{y_j}_{t+} \mid H^{y_j,y_3}_{t-}] \mid E_l[H^{y_3}_{t+} \mid H^{y_3}_{t-}] = \{0\}. \tag{4.15}
\]
Then (A33, K33, C33, I, e3) is a minimal representation of y3. Hence, when we apply Lemma 4.19, the representation (4.13) is minimal and in causal block triangular form. Therefore, (4.14) is a Kalman representation of y in causal coordinated form. We know from Theorem 3.5 that the conditions (i), (ii), (iii), (iv) and (4.15) imply that there exists a minimal Kalman representation of y in causal coordinated form. Since, by Lemma 3.2, Kalman representations of y in causal coordinated form are isomorphic, we obtain that (4.14) is also minimal.
Proof of Corollary 4.27. Notice that the steps of the proof of Lemma 4.26 coincide with the steps of Algorithm 10. That is, if in the proof of Lemma 4.26 the initial observable Kalman representation S of [y2^T, y3^T]^T was the Kalman representation (A2, K2, C2, I, [e2^T, e3^T]^T, [y2^T, y3^T]^T), where A2, K2, C2 are the input matrices of Algorithm 10, then the system matrices A, K, C of the Kalman representation (4.14) in the proof of Lemma 4.26 would coincide with the output matrices of Algorithm 10. This completes the proof.
4.C Proofs of Lemma 4.29, Theorem 4.15, and Corollary 4.30
For the proof of Lemma 4.29 we will use two auxiliary lemmas. The first one is Lemma 3.6 from Chapter 3, which was recalled as Lemma 4.32 in Appendix 4.A, and the second one is Lemma 4.13.
Proof of Lemma 4.29. By using Lemma 4.13 we obtain that y has G-consistent causality structure if and only if

(i) yj does not Granger cause y_{Ij},

(ii) y_{Īj} does not Granger cause y_{Ij},

(iii) yj does not Granger cause [y_{Ij}^T, y_{Īj}^T]^T,

(iv) y_{Īj} does not Granger cause [y_{Ij}^T, yj^T]^T.

Then, if we apply Lemma 4.32 to y = [yj^T, y_{Īj}^T, y_{Ij}^T]^T and to y = [y_{Īj}^T, yj^T, y_{Ij}^T]^T, we obtain the statement of the lemma.
Next, we present the proof of Theorem 4.15.
Proof of Theorem 4.15. Consider a TADG G = (V = {1, ..., n}, E) and a process y = [y1^T, ..., yn^T]^T. Then notice that any Kalman representation with causal G-zero structure is a Kalman representation with G-zero structure; hence (iv) ⟹ (v) follows. We continue with the proof of the remaining implications.

(i) ⟹ (v): Assume that y has G-consistent causality structure. Using induction, we will show that, with the help of Lemmas 4.19, 4.23, and 4.26, an observable Kalman representation with G-zero structure can be constructed. In fact, we will show that for y = [y_{n−j}^T, ..., y_n^T]^T, j = 1, ..., n − 1, there exists an observable Kalman representation S_j with a G|_{{n−j,...,n}}-zero structure such that S_j is an extension of S_{j−1} in block triangular, block diagonal or coordinated form.

Recall that the graph G|_{{n−j,...,n}} is the restriction of G to the set of vertices {n − j, ..., n} ⊆ V and note that if G is a TADG, then so is G|_{{n−j,...,n}}; see also Remark 4.17. For j = 1, G|_{{n−1,n}} can be two types of graph: either G|_{{n−1,n}} = ({n − 1, n}, {(n, n − 1)}) or G|_{{n−1,n}} = ({n − 1, n}, ∅). Let S_1 = (A_{nn}, K_{nn}, C_{nn}, I, e_n) be a minimal Kalman representation of y_n. If G|_{{n−1,n}} = ({n − 1, n}, {(n, n − 1)}), then by assumption, y_{n−1} does not Granger cause y_n. Hence, by using Lemma 4.19 for S_1 and [y_{n−1}^T, y_n^T]^T, we obtain a minimal Kalman representation of [y_{n−1}^T, y_n^T]^T that is an extension of S_1 for [y_{n−1}^T, y_n^T]^T in causal block triangular form:
\[
\begin{aligned}
\begin{bmatrix} x_{n-1}(t+1) \\ x_n(t+1) \end{bmatrix} &= \begin{bmatrix} A_{(n-1)(n-1)} & A_{(n-1)n} \\ 0 & A_{nn} \end{bmatrix} \begin{bmatrix} x_{n-1}(t) \\ x_n(t) \end{bmatrix} + \begin{bmatrix} K_{(n-1)(n-1)} & K_{(n-1)n} \\ 0 & K_{nn} \end{bmatrix} \begin{bmatrix} e_{n-1}(t) \\ e_n(t) \end{bmatrix} \\
\begin{bmatrix} y_{n-1}(t) \\ y_n(t) \end{bmatrix} &= \begin{bmatrix} C_{(n-1)(n-1)} & C_{(n-1)n} \\ 0 & C_{nn} \end{bmatrix} \begin{bmatrix} x_{n-1}(t) \\ x_n(t) \end{bmatrix} + \begin{bmatrix} e_{n-1}(t) \\ e_n(t) \end{bmatrix}.
\end{aligned} \tag{4.16}
\]
Defining a partition {p_i, r_i}_{i=n−1}^n where p_i = dim(x_i) for i = n − 1, n, it follows that the representation (4.16) has a causal G|_{{n−1,n}}-zero structure.

If G|_{{n−1,n}} = ({n − 1, n}, ∅), then by assumption y_n and y_{n−1} do not Granger cause each other. Hence, by using Lemma 4.23 for S_1 and [y_{n−1}^T, y_n^T]^T, we obtain a minimal Kalman representation of [y_{n−1}^T, y_n^T]^T that is an extension of S_1 for [y_{n−1}^T, y_n^T]^T in block diagonal form
\[
\begin{aligned}
\begin{bmatrix} x_{n-1}(t+1) \\ x_n(t+1) \end{bmatrix} &= \begin{bmatrix} A_{(n-1)(n-1)} & 0 \\ 0 & A_{nn} \end{bmatrix} \begin{bmatrix} x_{n-1}(t) \\ x_n(t) \end{bmatrix} + \begin{bmatrix} K_{(n-1)(n-1)} & 0 \\ 0 & K_{nn} \end{bmatrix} \begin{bmatrix} e_{n-1}(t) \\ e_n(t) \end{bmatrix} \\
\begin{bmatrix} y_{n-1}(t) \\ y_n(t) \end{bmatrix} &= \begin{bmatrix} C_{(n-1)(n-1)} & 0 \\ 0 & C_{nn} \end{bmatrix} \begin{bmatrix} x_{n-1}(t) \\ x_n(t) \end{bmatrix} + \begin{bmatrix} e_{n-1}(t) \\ e_n(t) \end{bmatrix}.
\end{aligned} \tag{4.17}
\]
Defining a partition {p_i, r_i}_{i=n−1}^n where p_i = dim(x_i) for i = n − 1, n, we can see that the representation (4.17) has a causal G|_{{n−1,n}}-zero structure.

Suppose that we have an observable Kalman representation S_j = (A, K, C, I, e) of [y_{n−j}^T, ..., y_n^T]^T, j ∈ {1, ..., n − 2}, with a G|_{{n−j,...,n}}-zero structure with respect to a partition {p_i, r_i}_{i=n−j}^n, i.e., the system matrices are
\[
A = \begin{bmatrix}
A_{n-j,n-j} & A_{n-j,n-j+1} & \cdots & A_{n-j,n} \\
A_{n-j+1,n-j} & A_{n-j+1,n-j+1} & \cdots & A_{n-j+1,n} \\
\vdots & \vdots & & \vdots \\
A_{n,n-j} & A_{n,n-j+1} & \cdots & A_{n,n}
\end{bmatrix}, \quad
K = \begin{bmatrix}
K_{n-j,n-j} & K_{n-j,n-j+1} & \cdots & K_{n-j,n} \\
K_{n-j+1,n-j} & K_{n-j+1,n-j+1} & \cdots & K_{n-j+1,n} \\
\vdots & \vdots & & \vdots \\
K_{n,n-j} & K_{n,n-j+1} & \cdots & K_{n,n}
\end{bmatrix}, \quad
C = \begin{bmatrix}
C_{n-j,n-j} & C_{n-j,n-j+1} & \cdots & C_{n-j,n} \\
C_{n-j+1,n-j} & C_{n-j+1,n-j+1} & \cdots & C_{n-j+1,n} \\
\vdots & \vdots & & \vdots \\
C_{n,n-j} & C_{n,n-j+1} & \cdots & C_{n,n}
\end{bmatrix} \tag{4.18}
\]
such that if t, s ∈ {n − j, ..., n}, (t, s) ∉ E, then A_{st} = 0, K_{st} = 0, C_{st} = 0. We will show that S_j can be extended to a representation of [y_{n−j−1}^T, ..., y_n^T]^T with a G|_{{n−j−1,...,n}}-zero structure with a partition {p_i, r_i}_{i=n−j−1}^n. Note that the state process x of S_j is partitioned by x = [x_{n−j}^T, ..., x_n^T]^T, where x_i ∈ R^{p_i}. For convenience, define i := n − j − 1 and let the sets of parent and non-parent nodes of i be I_i = {i_1, ..., i_k} and Ī_i = {ī_1, ..., ī_l}. Accordingly, we denote the subprocesses x_{I_i} = [x_{i_1}^T, ..., x_{i_k}^T]^T, x_{Ī_i} = [x_{ī_1}^T, ..., x_{ī_l}^T]^T, y_{I_i} = [y_{i_1}^T, ..., y_{i_k}^T]^T, y_{Ī_i} = [y_{ī_1}^T, ..., y_{ī_l}^T]^T and e_{I_i} = [e_{i_1}^T, ..., e_{i_k}^T]^T, e_{Ī_i} = [e_{ī_1}^T, ..., e_{ī_l}^T]^T.

Notice that because of I_i ∪ Ī_i = {i + 1, ..., n}, we can define permutation matrices P_y and P_x such that [y_{Ī_i}^T, y_{I_i}^T]^T = P_y [y_{i+1}^T, ..., y_n^T]^T, [e_{Ī_i}^T, e_{I_i}^T]^T = P_y [e_{i+1}^T, ..., e_n^T]^T and [x_{Ī_i}^T, x_{I_i}^T]^T