A note on persistency of excitation


www.elsevier.com/locate/sysconle


Jan C. Willems^a, Paolo Rapisarda^b, Ivan Markovsky^a,∗, Bart L.M. De Moor^a

^a ESAT, SCD/SISTA, K.U. Leuven, Kasteelpark Arenberg 10, B 3001 Leuven, Heverlee, Belgium
^b Department of Mathematics, University of Maastricht, 6200 MD Maastricht, The Netherlands

Received 3 June 2004; accepted 7 September 2004. Available online 30 November 2004.

Abstract

We prove that if a component of the response signal of a controllable linear time-invariant system is persistently exciting of sufficiently high order, then the windows of the signal span the full system behavior. This is then applied to obtain conditions under which the state trajectory of a state representation spans the whole state space. The related question of when the matrix formed from a state sequence has linearly independent rows from the matrix formed from an input sequence and a finite number of its shifts is of central importance in subspace system identification.

Keywords: Behavioral systems; Persistency of excitation; Lags; Annihilators; System identification

1. Introduction

Persistency of excitation of an input or a noise signal is of importance in system identification and adaptive control; see, for example, [1,3–6]. In this paper, we examine consequences of persistency of excitation using the behavioral language.

The problem studied may be posed as follows. Assume that a response

$$\tilde w(1), \tilde w(2), \ldots, \tilde w(T)$$

of a linear time-invariant system is observed. Now consider, for some $L$, $1 \le L \le T$, the 'windows' of length $L$:

$$[\tilde w(1), \tilde w(2), \ldots, \tilde w(L)],\quad [\tilde w(2), \tilde w(3), \ldots, \tilde w(L+1)],\quad \ldots,\quad [\tilde w(T-L+1), \tilde w(T-L+2), \ldots, \tilde w(T)]. \tag{1}$$

Under which conditions do these windows span the whole space of all possible windows of length $L$ which the system can produce?

We will show that a sufficient condition for this is that a component (typically the input component) of the observed signal is persistently exciting of order $L + \mathtt{n}$, where $\mathtt{n}$ equals the dimension of the state space of the system.

∗ Corresponding author. Tel.: +32 16 321710; fax: +32 16 321970.
E-mail addresses: jan.willems@esat.kuleuven.ac.be (J.C. Willems), p.rapisarda@math.unimaas.nl (P. Rapisarda), ivan.markovsky@esat.kuleuven.ac.be (I. Markovsky), bart.demoor@esat.kuleuven.ac.be (B.L.M. De Moor).
doi:10.1016/j.sysconle.2004.09.003

2. Linear time-invariant systems

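We use the behavioral language [7,11]. A dynamical system is defined as $\Sigma = (\mathbb{T}, \mathbb{W}, \mathfrak{B})$, with $\mathbb{T} \subseteq \mathbb{R}$ the time axis, $\mathbb{W}$ the signal space, and $\mathfrak{B} \subseteq \mathbb{W}^{\mathbb{T}}$ the behavior. In the present paper, we deal exclusively with discrete-time systems with time axis $\mathbb{T} = \mathbb{N}$, $\mathbb{W}$ a finite-dimensional real vector space (generic notation $\mathbb{W} = \mathbb{R}^{\mathtt{w}}$), and a behavior $\mathfrak{B}$ that is (i) linear, (ii) shift-invariant ($\sigma\mathfrak{B} \subseteq \mathfrak{B}$, where $\sigma$ denotes the shift: $(\sigma f)(t) := f(t+1)$), and (iii) complete, i.e., $\mathfrak{B}$ is closed in the topology of pointwise convergence. We denote the class of systems $\Sigma = (\mathbb{N}, \mathbb{R}^{\mathtt{w}}, \mathfrak{B})$ satisfying (i)–(iii) by $\mathcal{L}^{\mathtt{w}}$. Since throughout the time axis equals $\mathbb{N}$, we use both notations $\Sigma \in \mathcal{L}^{\mathtt{w}}$ and $\mathfrak{B} \in \mathcal{L}^{\mathtt{w}}$. If $\mathtt{w}$ is not specified, we use $\Sigma \in \mathcal{L}^{\bullet}$ or $\mathfrak{B} \in \mathcal{L}^{\bullet}$, whence $\mathcal{L}^{\bullet} = \bigcup_{\mathtt{w} \in \mathbb{Z}_+} \mathcal{L}^{\mathtt{w}}$.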

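It is well known (see [10, Theorem 5]) that $\mathfrak{B} \in \mathcal{L}^{\mathtt{w}}$ if and only if there exists a real polynomial matrix $R \in \mathbb{R}^{\bullet \times \mathtt{w}}[\xi]$ (this notation means: $R$ is a matrix of polynomials in the indeterminate $\xi$, with real coefficients, $\mathtt{w}$ columns, and any finite number of rows) such that

$$\mathfrak{B} = \{ w : \mathbb{N} \to \mathbb{R}^{\mathtt{w}} \mid R(\sigma) w = 0 \}.$$

Equivalently, $\mathfrak{B} = \ker(R(\sigma))$. We call $R(\sigma)w = 0$ a kernel representation of this $\mathfrak{B}$ or of $\Sigma = (\mathbb{N}, \mathbb{R}^{\mathtt{w}}, \mathfrak{B})$.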

The kernel representation associated with a given $\mathfrak{B} \in \mathcal{L}^{\mathtt{w}}$ is not unique, and there always exists one in which the polynomial matrix $R$ has full row rank. Such kernel representations are called minimal.
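As a concrete illustration of a kernel representation, the following minimal Python sketch checks $R(\sigma)w = 0$ on a finite trajectory. The first-order system $y(t+1) = a\,y(t) + b\,u(t)$, the values of `a` and `b`, the input sequence, and the helper name `apply_kernel` are assumptions made here for illustration; none of them appears in the paper.

```python
def apply_kernel(coeffs, w):
    """Evaluate (R(sigma) w)(t) for a single-row R on a finite trajectory.

    coeffs[k] is the coefficient row of xi**k in R(xi); w[t] is the sample at
    time t + 1 (Python lists are 0-based, the paper's time axis starts at 1).
    """
    degree = len(coeffs) - 1
    return [
        sum(c * v for k in range(degree + 1) for c, v in zip(coeffs[k], w[t + k]))
        for t in range(len(w) - degree)
    ]


# Hypothetical example system: y(t+1) = a*y(t) + b*u(t), with w = (u, y).
a, b = 2, 3
# R(xi) = [-b, xi - a] = [-b, -a] + [0, 1] * xi
R_coeffs = [(-b, -a), (0, 1)]

u = [1, 0, -1, 2, 5, -3]
y = [4]  # arbitrary initial output
for t in range(len(u) - 1):
    y.append(a * y[t] + b * u[t])

w = list(zip(u, y))
residual = apply_kernel(R_coeffs, w)
print(residual)  # every entry is zero because w satisfies the system law
```

Here $R$ has a single row, so it trivially has full row rank and the representation is minimal in the sense above.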

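Let $\mathfrak{B} \in \mathcal{L}^{\bullet}$. Denote, for $T \in \mathbb{N}$, by $\mathfrak{B}|_{[1,T]}$ the $w \in \mathfrak{B}$ restricted to $[1,T]$, i.e.

$$\mathfrak{B}|_{[1,T]} := \{ w : [1,T] \to \mathbb{R}^{\mathtt{w}} \mid \exists\, v \in \mathfrak{B} : w(t) = v(t) \text{ for } 1 \le t \le T \}.$$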

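Closely associated with a kernel representation of $\mathfrak{B} \in \mathcal{L}^{\mathtt{w}}$ is the module of annihilators of $\mathfrak{B}$, $\mathcal{N}_{\mathfrak{B}} \subseteq \mathbb{R}^{\mathtt{w}}[\xi]$, defined by

$$\mathcal{N}_{\mathfrak{B}} := \{ n \in \mathbb{R}^{\mathtt{w}}[\xi] \mid n(\sigma)\mathfrak{B} = 0 \}.$$

It is easy to see that $\mathcal{N}_{\mathfrak{B}}$ is a submodule of $\mathbb{R}^{\mathtt{w}}[\xi]$. $\mathcal{N}_{\ker(R(\sigma))}$ is, in fact, the submodule generated by the rows of $R$. Consider also, for nonnegative integers $\Delta \in \mathbb{Z}_+$, the annihilators of degree at most $\Delta$,

$$\mathcal{N}^{\Delta}_{\mathfrak{B}} := \{ n \in \mathcal{N}_{\mathfrak{B}} \mid \text{each element of } n \text{ is of degree} \le \Delta \}.$$

Observe that there holds (with apologies for the slight abuse of notation), for $L \in \mathbb{Z}_+$,

$$\mathcal{N}^{L-1}_{\mathfrak{B}} = \ker(\mathfrak{B}|_{[1,L]}).$$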

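There are a number of important 'integer invariants' associated with $\mathcal{L}^{\bullet}$. The following are of interest to us in this paper.

• $\mathtt{w} : \mathcal{L}^{\bullet} \to \mathbb{Z}_+$, the variable cardinality. If $\mathfrak{B} \in \mathcal{L}^{\mathtt{w}}$, then $\mathtt{w}(\mathfrak{B}) := \mathtt{w}$.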

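• $\mathtt{m} : \mathcal{L}^{\bullet} \to \mathbb{Z}_+$, the input cardinality. This may be defined as follows. $\mathtt{m}(\mathfrak{B}) = \mathtt{m}$ if there exists an $\mathtt{m}$-dimensional subvector that is free in $\mathfrak{B}$ and no subvector of larger dimension is free. A $\mathtt{w}_1$-dimensional subvector $w_1$ is free in $\mathfrak{B}$ if, after permutation, $w = (w_1, w_2)$ with $w_1 : \mathbb{N} \to \mathbb{R}^{\mathtt{w}_1}$, $w_2 : \mathbb{N} \to \mathbb{R}^{\mathtt{w}_2}$, and $\mathtt{w}_1 + \mathtt{w}_2 = \mathtt{w}$, then for all $w_1 : \mathbb{N} \to \mathbb{R}^{\mathtt{w}_1}$ there exists a $w_2 : \mathbb{N} \to \mathbb{R}^{\mathtt{w}_2}$ such that $(w_1, w_2) \in \mathfrak{B}$.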

• $\mathtt{p} : \mathcal{L}^{\bullet} \to \mathbb{Z}_+$, the output cardinality, defined as $\mathtt{p}(\mathfrak{B}) := \mathtt{w}(\mathfrak{B}) - \mathtt{m}(\mathfrak{B})$.

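• $\mathtt{n} : \mathcal{L}^{\bullet} \to \mathbb{Z}_+$, the state cardinality. This may be defined as follows. Every $\mathfrak{B} \in \mathcal{L}^{\mathtt{w}}$ admits a state representation, i.e., there exist $\mathtt{n} \in \mathbb{Z}_+$ (called the state dimension) and $\mathfrak{B}_{\mathrm{full}} \in \mathcal{L}^{\mathtt{w}+\mathtt{n}}$, such that

$$\mathfrak{B} = \{ w \mid \exists\, x : (w, x) \in \mathfrak{B}_{\mathrm{full}} \}$$

and such that $\mathfrak{B}_{\mathrm{full}}$ satisfies the state axiom. This means that if $(w_1, x_1), (w_2, x_2) \in \mathfrak{B}_{\mathrm{full}}$ and $t_0 \in \mathbb{N}$ satisfy $x_1(t_0) = x_2(t_0)$, then $(w, x) = (w_1, x_1) \wedge_{t_0} (w_2, x_2) \in \mathfrak{B}_{\mathrm{full}}$. Here $\wedge_{t_0}$ denotes concatenation at $t_0$, defined by

$$\big((w_1, x_1) \wedge_{t_0} (w_2, x_2)\big)(t) := \begin{cases} (w_1, x_1)(t) & \text{for } t < t_0, \\ (w_2, x_2)(t) & \text{for } t \ge t_0. \end{cases}$$

The smallest state-space dimension among all state representations of $\mathfrak{B}$ is the state cardinality $\mathtt{n}(\mathfrak{B})$ of $\mathfrak{B}$.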

• $\mathtt{L} : \mathcal{L}^{\bullet} \to \mathbb{Z}_+$, the lag. This may be defined as follows. Let $R(\sigma)w = 0$ be a kernel representation of $\mathfrak{B}$. The maximum of the degrees of the polynomial elements of $R$ is called the lag associated with this particular kernel representation. $\mathtt{L}(\mathfrak{B})$ is the smallest possible lag over all kernel representations of $\mathfrak{B}$. In fact, $\mathtt{L}(\mathfrak{B})$ is also the smallest $\Delta$ such that $\mathcal{N}^{\Delta}_{\mathfrak{B}}$ generates the module $\mathcal{N}_{\mathfrak{B}}$. In particular, there exists a kernel representation of $\mathfrak{B}$ with equation lags less than or equal to $\mathtt{L}(\mathfrak{B})$.

• $\mathtt{l} : \mathcal{L}^{\bullet} \to \mathbb{Z}_+$, the shortest lag. This may be defined as follows. Let $R(\sigma)w = 0$ be a kernel representation of $\mathfrak{B}$. Define the degree of a vector of polynomials to be equal to the largest of the degrees of the entries. The minimum of the degrees of the rows of $R$ is called the minimal lag associated with this kernel representation: each equation in $R(\sigma)w = 0$ involves lags at least equal to the minimal lag. $\mathtt{l}(\mathfrak{B})$ is the smallest possible minimal lag over all kernel representations of $\mathfrak{B}$. Every equation in every kernel representation of $\mathfrak{B}$ has lag at least $\mathtt{l}(\mathfrak{B})$.

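These integers are all readily computable from a kernel representation, and certainly from an input/state/output representation of $\mathfrak{B}$ (see [10, Section 7]). It is, for example, possible to prove that

$$\mathtt{n}(\mathfrak{B}) \ge \mathtt{L}(\mathfrak{B}).$$

Also,

$$L \ge \mathtt{L}(\mathfrak{B}) \iff \dim(\mathfrak{B}|_{[1,L]}) = \mathtt{m}(\mathfrak{B})L + \mathtt{n}(\mathfrak{B}) \iff \dim(\mathcal{N}^{L-1}_{\mathfrak{B}}) = \mathtt{p}(\mathfrak{B})L - \mathtt{n}(\mathfrak{B}),$$

and

$$L \le \mathtt{l}(\mathfrak{B}) \iff \dim(\mathfrak{B}|_{[1,L]}) = \mathtt{w}(\mathfrak{B})L \iff \dim(\mathcal{N}^{L-1}_{\mathfrak{B}}) = 0.$$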

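Recall that $\mathfrak{B} \in \mathcal{L}^{\bullet}$ is said to be controllable :⇔ for all $T \in \mathbb{N}$, $w_1 \in \mathfrak{B}|_{[1,T]}$, and $w_2 \in \mathfrak{B}$, there exist $v \in \mathfrak{B}$ and $T' \in \mathbb{N}$ such that $v|_{[1,T]} = w_1$ and $w_2(t - T - T') = v(t)$ for $t > T + T'$. Denote by $\mathcal{L}^{\bullet}_{\text{controllable}}$ and $\mathcal{L}^{\mathtt{w}}_{\text{controllable}}$ the controllable elements of $\mathcal{L}^{\bullet}$ and $\mathcal{L}^{\mathtt{w}}$, respectively.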

3. Sequences with spanning windows

Let $\mathfrak{B} \in \mathcal{L}^{\bullet}$, and assume that a finite trajectory $\tilde w \in \mathfrak{B}|_{[1,T]}$ is 'observed'. Under which conditions is it possible to recover from $\tilde w$ the laws of the system $\mathfrak{B}$ that generated $\tilde w$? This question is closely related to the question asked in the introduction: Under which conditions do the observed windows of length $L$ span the space of all possible windows of length $L$ which the system can produce?

Define the Hankel matrix of depth $L$ associated with the vector signal $f(1), f(2), \ldots, f(T)$ by

$$\mathcal{H}_L(f) := \begin{bmatrix} f(1) & f(2) & \cdots & f(T-L+1) \\ f(2) & f(3) & \cdots & f(T-L+2) \\ \vdots & \vdots & & \vdots \\ f(L) & f(L+1) & \cdots & f(T) \end{bmatrix}.$$

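Note that the columns of the Hankel matrix $\mathcal{H}_L(\tilde w)$ correspond to the windows of $\tilde w$ displayed in the introduction. Of course, since $\tilde w \in \mathfrak{B}|_{[1,T]}$, any $n \in \mathcal{N}^{L-1}_{\mathfrak{B}}$, $n(\xi) = n_0 + n_1\xi + \cdots + n_{L-1}\xi^{L-1}$, is such that

$$[\,n_0\ \ n_1\ \ \cdots\ \ n_{L-1}\,]\,\mathcal{H}_L(\tilde w) = 0.$$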

Therefore, the left kernel of $\mathcal{H}_L(\tilde w)$ contains the vectors generated by the elements of $\mathcal{N}^{L-1}_{\mathfrak{B}}$. The question is: When are there no other annihilators? Equivalently (with a very slight abuse of notation): When is $\operatorname{leftkernel}(\mathcal{H}_L(\tilde w)) = \mathcal{N}^{L-1}_{\mathfrak{B}}$, equivalently, $\operatorname{colspan}(\mathcal{H}_L(\tilde w)) = \mathfrak{B}|_{[1,L]}$?
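The Hankel matrix of depth $L$ defined above can be assembled directly from a list of samples. The following is a minimal Python sketch; the function name `hankel`, the list-of-rows matrix layout, and the example signal are choices made here for illustration and are not part of the paper.

```python
def hankel(f, L):
    """Block Hankel matrix of depth L for the samples f(1), ..., f(T).

    f is a list of T samples, each a list of equal length (the signal
    dimension). The result is a list of rows with L * dim rows and
    T - L + 1 columns; column j stacks the window f(j+1), ..., f(j+L).
    """
    T, dim = len(f), len(f[0])
    if not 1 <= L <= T:
        raise ValueError("depth L must satisfy 1 <= L <= T")
    return [
        [f[i + j][k] for j in range(T - L + 1)]
        for i in range(L)
        for k in range(dim)
    ]


# Scalar example signal f(t) = t, t = 1, ..., 5, stored as 1-dimensional vectors.
f = [[1], [2], [3], [4], [5]]
H = hankel(f, 2)
for row in H:
    print(row)

# Each column of H is one window of length 2.
windows = [list(col) for col in zip(*H)]
```

Reading the matrix column by column recovers the windows, which is why the span of the windows is the column span of the Hankel matrix.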

Crucial in our result is the persistency of excitation of a component (typically, the input component) of $\tilde w$. The signal $f : [1,T] \cap \mathbb{N} \to \mathbb{R}^{\mathtt{f}}$ is said to be persistently exciting of order $L$ :⇔ $\operatorname{rank}(\mathcal{H}_L(f)) = L\mathtt{f}$, i.e., if there exist no non-trivial linear relations of order $L$ among the $f(t)$'s. In other words, there are no row vectors $a_1, a_2, \ldots, a_L \in \mathbb{R}^{1 \times \mathtt{f}}$, not all zero, such that

$$a_1 f(t) + a_2 f(t+1) + \cdots + a_L f(t+L-1) = 0 \quad \text{for } t = 1, 2, \ldots, T-L+1.$$
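The rank condition in this definition can be tested numerically. Below is a minimal Python sketch that uses exact rational arithmetic so that no rank tolerance has to be chosen; the helper names `hankel`, `rank`, and `is_persistently_exciting`, and the two test signals, are assumptions introduced here for illustration.

```python
from fractions import Fraction


def hankel(f, L):
    """Block Hankel matrix of depth L (rows listed block by block)."""
    T, dim = len(f), len(f[0])
    return [[f[i + j][k] for j in range(T - L + 1)] for i in range(L) for k in range(dim)]


def rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [x - factor * p for x, p in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r


def is_persistently_exciting(f, L):
    """True if rank(H_L(f)) equals L times the signal dimension."""
    if L > len(f):
        return False
    return rank(hankel(f, L)) == L * len(f[0])


impulse = [[0], [0], [1], [0], [0]]
constant = [[1], [1], [1], [1], [1]]
print(is_persistently_exciting(impulse, 3))   # H_3 is a 3x3 anti-diagonal matrix
print(is_persistently_exciting(constant, 2))  # both rows of H_2 coincide
```

For floating-point data a tolerance-based rank (for example from a singular value decomposition) would be the usual substitute; exact arithmetic is used here only to keep the sketch deterministic.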

The following is the main result of the paper.

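Theorem 1. Consider $\mathfrak{B} \in \mathcal{L}^{\mathtt{w}}_{\text{controllable}}$. Let $\tilde u : [1,T] \to \mathbb{R}^{\mathtt{m}(\mathfrak{B})}$, $\tilde y : [1,T] \to \mathbb{R}^{\mathtt{p}(\mathfrak{B})}$, and $\tilde w = (\tilde u, \tilde y)$. Assume that $\tilde w \in \mathfrak{B}|_{[1,T]}$. Then, if $\tilde u$ is persistently exciting of order $L + \mathtt{n}(\mathfrak{B})$,

$$\operatorname{leftkernel}(\mathcal{H}_L(\tilde w)) = \mathcal{N}^{L-1}_{\mathfrak{B}} \tag{K}$$

and

$$\operatorname{colspan}(\mathcal{H}_L(\tilde w)) = \mathfrak{B}|_{[1,L]}. \tag{I}$$

Proof. We only need to prove (K). The inclusion $\operatorname{leftkernel}(\mathcal{H}_L(\tilde w)) \supseteq \mathcal{N}^{L-1}_{\mathfrak{B}}$ is obvious.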

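Consider the reverse inclusion: $\operatorname{leftkernel}(\mathcal{H}_L(\tilde w)) \subseteq \mathcal{N}^{L-1}_{\mathfrak{B}}$. Assume, to the contrary, that

$$0 \ne r = [\,r_0\ \ r_1\ \ \cdots\ \ r_{L-1}\,] \in \operatorname{leftkernel}(\mathcal{H}_L(\tilde w))$$

but $r(\xi) = r_0 + r_1\xi + \cdots + r_{L-1}\xi^{L-1} \notin \mathcal{N}^{L-1}_{\mathfrak{B}}$. Consider $\mathcal{H}_{L+\mathtt{n}(\mathfrak{B})}(\tilde w)$. Obviously, $\operatorname{leftkernel}(\mathcal{H}_{L+\mathtt{n}(\mathfrak{B})}(\tilde w))$ contains $\mathcal{N}^{L+\mathtt{n}(\mathfrak{B})-1}_{\mathfrak{B}} + \mathcal{R}$, with $\mathcal{R} \subset \mathbb{R}^{\mathtt{w}}[\xi]$ the linear span

$$\mathcal{R} = \operatorname{span}\{ r(\xi), \xi r(\xi), \ldots, \xi^{\mathtt{n}(\mathfrak{B})} r(\xi) \}.$$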

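Recall that

$$\dim(\mathcal{N}^{L+\mathtt{n}(\mathfrak{B})-1}_{\mathfrak{B}}) = (L + \mathtt{n}(\mathfrak{B}))\,\mathtt{p}(\mathfrak{B}) - \mathtt{n}(\mathfrak{B}).$$

Clearly, $\dim(\mathcal{R}) = \mathtt{n}(\mathfrak{B}) + 1$. We now show that the persistency of excitation assumption implies $\mathcal{R} \cap \mathcal{N}^{L+\mathtt{n}(\mathfrak{B})-1}_{\mathfrak{B}} \ne \{0\}$. If $\mathcal{R} \cap \mathcal{N}^{L+\mathtt{n}(\mathfrak{B})-1}_{\mathfrak{B}} = \{0\}$, then

$$\dim(\mathcal{N}^{L+\mathtt{n}(\mathfrak{B})-1}_{\mathfrak{B}} + \mathcal{R}) = (L + \mathtt{n}(\mathfrak{B}))\,\mathtt{p}(\mathfrak{B}) + 1.$$

But the persistency of excitation implies

$$\operatorname{rank}(\mathcal{H}_{L+\mathtt{n}(\mathfrak{B})}(\tilde w)) \ge (L + \mathtt{n}(\mathfrak{B}))\,\mathtt{m}(\mathfrak{B}),$$

and since $\mathcal{H}_{L+\mathtt{n}(\mathfrak{B})}(\tilde w)$ has $(L + \mathtt{n}(\mathfrak{B}))\,\mathtt{w}(\mathfrak{B})$ rows, its left kernel has dimension at most $(L + \mathtt{n}(\mathfrak{B}))\,\mathtt{p}(\mathfrak{B})$. Hence

$$\dim(\mathcal{N}^{L+\mathtt{n}(\mathfrak{B})-1}_{\mathfrak{B}} + \mathcal{R}) = (L + \mathtt{n}(\mathfrak{B}))\,\mathtt{p}(\mathfrak{B}) + 1 \le \dim(\operatorname{leftkernel}(\mathcal{H}_{L+\mathtt{n}(\mathfrak{B})}(\tilde w))) \le (L + \mathtt{n}(\mathfrak{B}))\,\mathtt{p}(\mathfrak{B}),$$

which is a contradiction. Therefore $\mathcal{R} \cap \mathcal{N}^{L+\mathtt{n}(\mathfrak{B})-1}_{\mathfrak{B}} \ne \{0\}$.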

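Consequently, there is a non-zero linear combination of $r(\xi), \xi r(\xi), \ldots, \xi^{\mathtt{n}(\mathfrak{B})} r(\xi)$ that is contained in $\mathcal{N}^{L+\mathtt{n}(\mathfrak{B})-1}_{\mathfrak{B}}$. In terms of a minimal kernel representation $R(\sigma)w = 0$ of $\mathfrak{B}$, this means that there is $0 \ne f \in \mathbb{R}[\xi]$ such that $f r = F R$, for some $0 \ne F \in \mathbb{R}^{1 \times \operatorname{rowdim}(R)}[\xi]$. If $\deg(f) \ge 1$, then there is $\lambda \in \mathbb{C}$ such that $f(\lambda) = 0$, hence $F(\lambda)R(\lambda) = 0$.

Now use the well-known fact [11] that $R(\sigma)w = 0$ is a minimal kernel representation of a controllable behavior if and only if $R(\lambda)$ has full row rank for all $\lambda \in \mathbb{C}$. Hence controllability implies $F(\lambda) = 0$.

This implies that $f$ and each element of $F$ have a common root. Cancel this common factor. Proceed until $\deg(f) = 0$. But then, absorbing the non-zero constant $f$ into $F$, $r = F R$. This contradicts the assumption $r \notin \mathcal{N}^{L-1}_{\mathfrak{B}}$. Hence $\operatorname{leftkernel}(\mathcal{H}_L(\tilde w)) \subseteq \mathcal{N}^{L-1}_{\mathfrak{B}}$, and (K) holds.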
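The statement of Theorem 1 can be checked on a small example. The following Python sketch is only an illustration under stated assumptions: the scalar system $x(t+1) = 2x(t) + u(t)$, $y(t) = x(t)$, the impulse input, the initial state, and the helper names are all chosen here and do not come from the paper. The helpers `hankel` and `rank` are repeated so that the block is self-contained.

```python
from fractions import Fraction


def hankel(f, L):
    """Block Hankel matrix of depth L (rows listed block by block)."""
    T, dim = len(f), len(f[0])
    return [[f[i + j][k] for j in range(T - L + 1)] for i in range(L) for k in range(dim)]


def rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [x - factor * p for x, p in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r


# Hypothetical controllable system: x(t+1) = 2 x(t) + u(t), y(t) = x(t),
# so m = 1 input, p = 1 output, n = 1 state.
m, n, L = 1, 1, 2
u = [0, 0, 1, 0, 0]          # impulse input
x = [1]                      # arbitrary initial state
for t in range(len(u) - 1):
    x.append(2 * x[t] + u[t])
w = [[u_t, x_t] for u_t, x_t in zip(u, x)]   # w = (u, y) with y = x

# Hypothesis of the theorem: u is persistently exciting of order L + n.
u_is_pe = rank(hankel([[v] for v in u], L + n)) == (L + n) * m

# Conclusion (I): the windows span a space of dimension m*L + n.
H = hankel(w, L)
rank_H = rank(H)

# Conclusion (K): the law y(t+1) - 2 y(t) - u(t) = 0, i.e. r(xi) = [-1, xi - 2],
# gives the left-kernel vector [r0 r1] = [-1, -2, 0, 1].
r = [-1, -2, 0, 1]
r_times_H = [sum(r[i] * H[i][j] for i in range(len(r))) for j in range(len(H[0]))]
print(u_is_pe, rank_H, r_times_H)
```

In this example $\mathcal{H}_L(\tilde w)$ has four rows and rank three, so its left kernel is one-dimensional and is spanned by the single system law, in agreement with $\mathtt{p}L - \mathtt{n} = 1$.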

4. Comments and corollaries

1. The interesting, and somewhat surprising, part of Theorem 1 is that persistency of excitation of order $L + \mathtt{n}(\mathfrak{B})$ is needed in order to be able to deduce that the observed sequences (1) of length $L$ have the 'correct' annihilators and the 'correct' span. In other words, we have to assume a 'deeper' persistency of excitation on $\tilde u$ than the width of the windows of $(\tilde u, \tilde y)$ which are considered.

2. Note that Theorem 1 holds for all $L$ (and not just for $L > \mathtt{L}(\mathfrak{B})$). So, in particular, if $L \le \mathtt{l}(\mathfrak{B})$, and under persistency of excitation of order $L + \mathtt{n}(\mathfrak{B})$, $\mathcal{H}_L(\tilde w)$ has full row rank. Also, if $L > \mathtt{L}(\mathfrak{B})$, and under persistency of excitation of order $L + \mathtt{n}(\mathfrak{B})$, the left kernel of $\mathcal{H}_L(\tilde w)$ (identified in the obvious way with polynomial vectors) generates the full annihilator module $\mathcal{N}_{\mathfrak{B}}$. The observed system signal then completely specifies the laws of the system.

3. An interesting special case is when $\mathfrak{B}$ is the usual state space system $\sigma x = Ax + Bu$. Note that for this system, $\mathtt{L}(\mathfrak{B}) = 1$. Theorem 1 yields the following corollary.

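Corollary 2. Assume that $\sigma x = Ax + Bu$ is controllable. Consider a trajectory $\tilde u(1), \tilde u(2), \ldots, \tilde u(T)$; $\tilde x(1), \tilde x(2), \ldots, \tilde x(T)$ of this system. Then

(i) If $\tilde u$ is persistently exciting of order $\dim(x)$, then

$$\operatorname{rank}[\,\tilde x(1)\ \ \tilde x(2)\ \ \cdots\ \ \tilde x(T)\,] = \dim(x).$$

(ii) If $\tilde u$ is persistently exciting of order $\dim(x) + 1$, then

$$\operatorname{rank}\begin{bmatrix} \tilde u(1) & \cdots & \tilde u(T) \\ \tilde x(1) & \cdots & \tilde x(T) \end{bmatrix} = \dim(x) + \dim(u).$$

(iii) If $\tilde u$ is persistently exciting of order $\dim(x) + L$, then

$$\operatorname{rank}\begin{bmatrix} \mathcal{H}_L(\tilde u) \\ \tilde X \end{bmatrix} = \dim(x) + L \dim(u),$$

where $\tilde X := [\,\tilde x(1)\ \ \cdots\ \ \tilde x(T-L+1)\,]$.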

In [9, Section 3.3] the condition

$$\operatorname{rank}\begin{bmatrix} \mathcal{H}_L(\tilde u) \\ \tilde X \end{bmatrix} = \dim(x) + L \dim(u)$$

is recognized to have a crucial role in subspace system identification. To the best of our knowledge, however, a test to verify it from given data $(\tilde u, \tilde y)$ that is an arbitrary response of the system is not available in the literature. Special cases that were studied are: $u$ white noise [2] and $u$ periodic [9, Theorem 2]. Corollary 2 gives such a test for an arbitrary $u$.
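The rank conditions (ii) and (iii) of Corollary 2 can be verified on simulated data. The sketch below is a hedged illustration: the two-state single-input system `A`, `B`, the impulse input, the zero initial state, and the helper names `simulate` and `input_state_rank` are all assumptions made here, not taken from the paper. Exact rational arithmetic is used for the rank.

```python
from fractions import Fraction


def hankel(f, L):
    """Block Hankel matrix of depth L (rows listed block by block)."""
    T, dim = len(f), len(f[0])
    return [[f[i + j][k] for j in range(T - L + 1)] for i in range(L) for k in range(dim)]


def rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [x - factor * p for x, p in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r


# Hypothetical controllable system x(t+1) = A x(t) + B u(t);
# [B, AB] = [[0, 1], [1, 1]] is nonsingular.
A = [[0, 1], [1, 1]]
B = [0, 1]
n, m = 2, 1


def simulate(u, x1):
    """Return the states x(1), ..., x(T) generated by u(1), ..., u(T-1)."""
    xs = [list(x1)]
    for u_t in u[:-1]:
        x = xs[-1]
        xs.append([sum(A[i][j] * x[j] for j in range(n)) + B[i] * u_t for i in range(n)])
    return xs


def input_state_rank(u, xs, L):
    """Rank of [H_L(u); X] with X = [x(1) ... x(T-L+1)]."""
    H_u = hankel([[v] for v in u], L)
    cols = len(u) - L + 1
    X = [[xs[t][i] for t in range(cols)] for i in range(n)]
    return rank(H_u + X)


u = [0, 0, 0, 1, 0, 0, 0]              # impulse; H_4(u) is 4x4 and nonsingular
xs = simulate(u, [0, 0])
rank_L1 = input_state_rank(u, xs, 1)   # case (ii): compare with n + m
rank_L2 = input_state_rank(u, xs, 2)   # case (iii), L = 2: compare with n + 2*m
print(rank_L1, rank_L2)
```

The impulse of length seven is persistently exciting of order four, which covers both the order $\dim(x) + 1 = 3$ needed for (ii) and the order $\dim(x) + 2 = 4$ needed for (iii) with $L = 2$.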

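4. The matrices $A, B, C, D$ of the system

$$\sigma x = Ax + Bu, \quad y = Cx + Du$$

can be recovered from the input/state/output trajectory

$$\begin{bmatrix} \tilde u(1) \\ \tilde x(1) \\ \tilde y(1) \end{bmatrix}, \begin{bmatrix} \tilde u(2) \\ \tilde x(2) \\ \tilde y(2) \end{bmatrix}, \ldots, \begin{bmatrix} \tilde u(T) \\ \tilde x(T) \\ \tilde y(T) \end{bmatrix}$$

(think of the input/output as measured directly, and the state computed using a subspace algorithm) if $\tilde u$ is persistently exciting of order $\dim(x) + 2$.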
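One way to carry out this recovery, assuming exact data, is to solve the linear equations $\begin{bmatrix} x(t+1) \\ y(t) \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \end{bmatrix}$ for $t = 1, \ldots, T-1$. The Python sketch below does this through the normal equations in exact rational arithmetic; the scalar example system, the data, and the function name `solve_right` are assumptions made here for illustration, and the paper does not prescribe this particular procedure.

```python
from fractions import Fraction


def solve_right(Y, Z):
    """Solve Theta Z = Y exactly via the normal equations.

    Y and Z are lists of rows with the same number of columns; Z is assumed
    to have full row rank, so that G = Z Z^T is invertible.
    """
    k, q = len(Z), len(Y)
    G = [[sum(Fraction(a) * b for a, b in zip(Z[i], Z[j])) for j in range(k)] for i in range(k)]
    N = [[sum(Fraction(a) * b for a, b in zip(Y[i], Z[j])) for j in range(k)] for i in range(q)]
    # Gauss-Jordan elimination on [G | N^T]; G is symmetric, so G Theta^T = N^T.
    aug = [G[i] + [N[j][i] for j in range(q)] for i in range(k)]
    for c in range(k):
        pivot = next(i for i in range(c, k) if aug[i][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        aug[c] = [v / aug[c][c] for v in aug[c]]
        for i in range(k):
            if i != c:
                aug[i] = [v - aug[i][c] * p for v, p in zip(aug[i], aug[c])]
    return [[aug[i][k + j] for i in range(k)] for j in range(q)]


# Hypothetical data from x(t+1) = 2 x(t) + u(t), y(t) = x(t), i.e.
# A = 2, B = 1, C = 1, D = 0, with an impulse input (T = 5).
u = [0, 0, 1, 0, 0]
x = [1, 2, 4, 9, 18]
y = list(x)

Z = [x[:-1], u[:-1]]   # rows: x(t), u(t)   for t = 1, ..., T-1
Y = [x[1:], y[:-1]]    # rows: x(t+1), y(t) for t = 1, ..., T-1
Theta = solve_right(Y, Z)   # Theta = [[A, B], [C, D]]
print(Theta)
```

With noisy data the same normal equations give the least-squares estimate instead of an exact solution; that variant is outside the exact setting of this note.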

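5. Let $w = (u, y)$ with $u$ the input and $y$ the output of $\mathfrak{B} \in \mathcal{L}^{\bullet}_{\text{controllable}}$. Assume that the system is driven by a 'random' input $\tilde u$, meaning an input that is persistently exciting of any order. How many (exact) data points

$$\begin{bmatrix} \tilde u(1) \\ \tilde y(1) \end{bmatrix}, \begin{bmatrix} \tilde u(2) \\ \tilde y(2) \end{bmatrix}, \ldots, \begin{bmatrix} \tilde u(T) \\ \tilde y(T) \end{bmatrix},$$

input/output measurements, do we need in order to be able to identify the system? The left kernel of

$$\mathcal{H}_{\mathtt{L}(\mathfrak{B})+1}\!\left( \begin{bmatrix} \tilde u \\ \tilde y \end{bmatrix} \right)$$

will give us the laws, provided $\tilde u$ is persistently exciting of order $\mathtt{L}(\mathfrak{B}) + \mathtt{n}(\mathfrak{B}) + 1$. This yields the inequality

$$T \ge (\mathtt{L}(\mathfrak{B}) + \mathtt{n}(\mathfrak{B}) + 1)\,\mathtt{m}(\mathfrak{B}) + \mathtt{L}(\mathfrak{B}) + \mathtt{n}(\mathfrak{B}).$$

Adapted for the case $D = 0$ and known zero initial conditions, our bound on $T$ is the same as the one derived in [8].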
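The inequality follows from counting rows and columns: with $K = \mathtt{L}(\mathfrak{B}) + \mathtt{n}(\mathfrak{B}) + 1$, the matrix $\mathcal{H}_K(\tilde u)$ has $K\,\mathtt{m}(\mathfrak{B})$ rows and $T - K + 1$ columns, so full row rank requires $T - K + 1 \ge K\,\mathtt{m}(\mathfrak{B})$. A minimal Python sketch of this arithmetic, with a function name and example values chosen here purely for illustration:

```python
def min_data_length(lag, n, m):
    """Smallest T allowed by the bound T >= (lag + n + 1) * m + lag + n.

    lag, n and m stand for the lag, state cardinality and input cardinality
    of the system. The bound is necessary for persistency of excitation of
    order K = lag + n + 1, because H_K(u) has K * m rows and T - K + 1 columns.
    """
    K = lag + n + 1
    return K * m + K - 1


# Hypothetical examples: a first-order single-input system, and a system
# with lag 2, four states and two inputs.
print(min_data_length(1, 1, 1))
print(min_data_length(2, 4, 2))
```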

Acknowledgements

This research is supported by the Belgian Federal Government under the DWTC program Interuniversity Attraction Poles, Phase V, 2002-2006, Dynamical Systems and Control: Computation, Identification and Modelling, by the KUL Concerted Research Action (GOA) MEFISTO-666, and by several grants and projects from IWT-Flanders and the Flemish Fund for Scientific Research.

References

[1] M. Cadic, J.W. Polderman, I.M.Y. Mareels, Set membership identification for adaptive control: input design, in: Proceedings of the 42nd IEEE Conference on Decision and Control, Maui, Hawaii, 2003, pp. 5011–5026.

[2] B. Gopinath, On the identification of linear time-invariant systems from input–output data, Bell System Tech. J. 48 (5) (1969) 1101–1113.

[3] I.M.Y. Mareels, Sufficiency of excitation, Systems Control Lett. 5 (1984) 159–163.

[4] I.M.Y. Mareels, R.R. Bitmead, M. Gevers, C.R. Johnson, R.L. Kosut, M.A. Poubelle, How exciting can a signal really be? Systems Control Lett. 8 (1987) 197–204.

[5] I.M.Y. Mareels, M. Gevers, Persistence of excitation criteria for linear, multivariable, time-varying systems, Math. Control Signals Systems 1 (1988) 203–226.

[6] I.M.Y. Mareels, J.W. Polderman, Adaptive Systems: An Introduction, Birkhäuser, Basel, 1996.

[7] J.W. Polderman, J.C. Willems, Introduction to Mathematical Systems Theory, Springer, New York, 1998.

[8] E. Sontag, On the length of inputs necessary in order to identify a deterministic linear system, IEEE Trans. Automat. Control 25 (1) (1980) 120–121.

[9] M. Verhaegen, P. Dewilde, Subspace model identification, part I: the output error state space model identification class of algorithms, Int. J. Control 56 (1992) 1187–1210.

[10] J.C. Willems, From time series to linear system - part I. Finite dimensional linear time invariant systems, Automatica 22 (5) (1986) 561–580.

[11] J.C. Willems, Paradigms and puzzles in the theory of dynamical systems, IEEE Trans. Automat. Control 36 (3) (1991) 259–294.