Parameter estimation from noisy observations of inputs and outputs


Citation for published version (APA):

Vregelaar, ten, J. M. (1988). Parameter estimation from noisy observations of inputs and outputs. (Memorandum COSOR; Vol. 8813). Technische Universiteit Eindhoven.

Document status and date: Published: 01/01/1988

Document version: Publisher's PDF, also known as Version of Record (includes final page, issue and volume numbers)

Department of Mathematics and Computing Science

Memorandum COSOR 88-13 Parameter estimation from noisy observations of inputs and outputs

by

J.M. ten Vregelaar

Eindhoven University of Technology

Department of Mathematics and Computing Science, P.O. Box 513

5600 MB Eindhoven, The Netherlands

Eindhoven, May 1988

In this paper an algorithm is given to compute least squares estimates for the parameters of a dynamic model from noisy measurements of inputs and outputs. The corresponding estimators are proven to be strongly consistent and asymptotically normal under some assumptions.

Keywords: Least squares parameter estimation, ARMA-representation, Errors-in-variables, Consistency, Asymptotic normality.

1. Introduction

In this paper we discuss the least squares parameter estimation method for a dynamic errors-in-variables model. An efficient algorithm to compute estimates for the unknown parameters is presented in Section 2.

In Section 3 we deal with 5 assumptions which turn out to be sufficient conditions for the strong consistency of the least squares estimators. The latter is proved in Section 4. Under some additional conditions these estimators are asymptotically normally distributed, as is shown in Section 5.

We consider a dynamic model for inputs $\xi_t$ and outputs $\eta_t$, represented by an ARMA-description:

$$\eta_t = \sum_{i=1}^{p} \alpha_i \eta_{t-i} + \sum_{j=0}^{q} \beta_j \xi_{t-j}, \qquad t = m+1, m+2, \ldots \tag{1.1}$$

with known orders $p$ and $q$ and $m := \max(p, q)$.

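Inputs $\xi_t$ and outputs $\eta_t$ are vectors in $\mathbb{R}^r$ and $\mathbb{R}^s$ respectively, allowing for MIMO models. Both inputs and outputs are supposed to be measured with noises $\delta_t$ and $\varepsilon_t$ respectively (see figure below):

$$y_t = \eta_t + \varepsilon_t, \qquad x_t = \xi_t + \delta_t, \qquad t = 1, 2, \ldots, m+N. \tag{1.2}$$

[Figure: block diagram. The ARMA system maps the input $\xi$ to the output $\eta$; the noises $\delta$ and $\varepsilon$ are added to give the observations $x$ and $y$.]

In (1.2), $m + N$ represents the total number of measurements $(x_t, y_t)$.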

The problem is now to estimate the unknown parameter matrix (with size $s \times [ps + (q+1)r]$)

$$\theta = [\alpha_1\ \alpha_2\ \cdots\ \alpha_p\ \beta_0\ \beta_1\ \cdots\ \beta_q] \tag{1.3}$$

from the set of data $\{(x_1, y_1), (x_2, y_2), \ldots, (x_{m+N}, y_{m+N})\}$.

We introduce the least squares estimation method for this problem. Estimates for $\theta$ are obtained by minimizing, with respect to $\theta, \xi_1, \ldots, \xi_{m+N}, \eta_1, \ldots, \eta_{m+N}$, the sum of squares

$$\sum_{t=1}^{m+N} \left( \|y_t - \eta_t\|^2 + \|x_t - \xi_t\|^2 \right) \tag{1.4}$$

subject to the model equation (1.1) for $t = m+1, \ldots, m+N$.

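In this paper $\|X\|$ denotes $(\operatorname{tr} X^T X)^{1/2}$ for any real matrix $X$. It is convenient to employ short-hand notations for model and observations.

We rewrite (1.1) for $t = m+1, \ldots, m+N$ as a matrix-vector equation

$$D(\theta)\,\zeta = 0, \tag{1.5a}$$

where

$$D = \left[\begin{array}{cccccc|cccccc} -I & \alpha_1 & \cdots & \alpha_m & & & \beta_0 & \beta_1 & \cdots & \beta_m & & \\ & \ddots & \ddots & & \ddots & & & \ddots & \ddots & & \ddots & \\ & & -I & \alpha_1 & \cdots & \alpha_m & & & \beta_0 & \beta_1 & \cdots & \beta_m \end{array}\right] \tag{1.5b}$$

and

$$\zeta = \begin{bmatrix} \eta_{m+N} \\ \vdots \\ \eta_1 \\ \xi_{m+N} \\ \vdots \\ \xi_1 \end{bmatrix} \in \mathbb{R}^{(s+r)(m+N)}. \tag{1.5c}$$

Matrix $D$ is $sN \times (s+r)(m+N)$; empty space represents zero elements in (1.5b). Furthermore, $I$ is the identity matrix and we define $\alpha_k := 0$ for $k > p$ and $\beta_k := 0$ for $k > q$. Observation and noise vectors $z$ and $e$ are defined corresponding to $\zeta$. Then

$$z = \zeta + e \tag{1.6}$$

is a short-hand notation for (1.2).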

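Now, minimization problem (1.4) becomes

$$\min_{\theta,\,\zeta} \|z - \zeta\|^2 \quad \text{subject to} \quad D\zeta = 0. \tag{1.7}$$

To get rid of the constraint and the minimization with respect to $\zeta$ we consider the stationary point equations of the Lagrangian

$$L(\zeta, \lambda) = \tfrac{1}{2}(z - \zeta)^T (z - \zeta) + \lambda^T D \zeta :$$

setting the derivatives with respect to $\zeta$ and $\lambda$ equal to zero gives $\zeta = z - D^T\lambda$ and $D\zeta = 0$, so that

$$\lambda = (D^+)^T z, \qquad \zeta = (I - D^+ D)\, z. \tag{1.8}$$

Here, $D^+ := D^T (D D^T)^{-1}$ is the Moore-Penrose inverse of $D$ since $D$ has full row rank. Therefore, (1.7) reduces to

$$\min_{\theta}\ z^T P(\theta)\, z \quad \text{with } P = D^+ D. \tag{1.9}$$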
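As a numerical illustration of the reduction from (1.7) to (1.9), the following minimal Python sketch builds $D(\theta)$ for a hypothetical SISO example ($s = r = 1$, $p = q = 1$) and checks that $\zeta = (I - D^+D)z$ satisfies the constraint and that the minimal squared distance equals $z^T P z$. The function name `build_D`, the parameter values and the random vector `z` are assumptions made for this example only; NumPy is assumed to be available.

```python
import numpy as np

def build_D(alpha, beta, N):
    """SISO version of D in (1.5b); alpha = [a_1..a_p], beta = [b_0..b_q] (hypothetical helper)."""
    p, q = len(alpha), len(beta) - 1
    m = max(p, q)
    a = np.zeros(m + 1); a[0] = -1.0; a[1:p + 1] = alpha   # [-1, a_1, ..., a_m]
    b = np.zeros(m + 1); b[:q + 1] = beta                  # [b_0, ..., b_m]
    D = np.zeros((N, 2 * (m + N)))
    for i in range(N):                                     # row i belongs to t = m + N - i
        D[i, i:i + m + 1] = a                              # acts on eta_t, ..., eta_{t-m}
        D[i, m + N + i:m + N + i + m + 1] = b              # acts on xi_t, ..., xi_{t-m}
    return D

rng = np.random.default_rng(0)
alpha, beta, N = [0.5], [1.0, 0.3], 50       # illustrative values, not from the paper
D = build_D(alpha, beta, N)
z = rng.standard_normal(D.shape[1])          # arbitrary observation vector

lam = np.linalg.solve(D @ D.T, D @ z)        # lambda = (D^+)^T z
zeta_hat = z - D.T @ lam                     # zeta = (I - P) z
objective = z @ D.T @ lam                    # z^T P z

constraint_residual = np.max(np.abs(D @ zeta_hat))
distance_sq = np.sum((z - zeta_hat) ** 2)
print(constraint_residual, distance_sq, objective)
```

For a fixed $\theta$ the inner minimization over $\zeta$ is thus a linear least squares projection, which is why only $\theta$ remains as the optimization variable in (1.9).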

As pointed out by Aoki and Yue (1970a) for the SISO model, solutions of (1.9) correspond to maximum likelihood estimates for $\theta$ when the noise is white and Gaussian. However, they only provide algorithms and convergence results for approximate maximum likelihood methods. In a companion paper, Aoki and Yue (1970b), they prove convergence results for the true maximum likelihood estimators in the special case of no input noise.

For a special case, an algorithm and convergence results are provided in Eising, Linssen and Rietbergen (1983). Their arguments are not rigorous and no explicit assumptions are given.

Two related papers, Söderström (1981) and Anderson (1985), should be mentioned here as well. Both authors employ the frequency domain approach and are mainly interested in the identifiability aspect.

2. Algorithm

Since (1.9) has in general no closed-form solution, we need an iterative algorithm to calculate estimates. We propose the Broyden-Fletcher-Goldfarb-Shanno formula, cf. Scales (1985) pp. 89-90, which has good numerical properties. It uses object function and gradient evaluations. Let $J_N$ denote the object function,

$$J_N(\theta) = z^T P(\theta)\, z, \tag{2.1}$$

then for any component of its gradient $J_N'$,

$$\dot J_N = 2\, z^T D^+ \dot D P^\perp z \tag{2.2}$$

holds, since $\dot P = D^+ \dot D P^\perp + P^\perp \dot D^T (D^+)^T$ (the dot representing $\partial / \partial\theta_i$ for any element $\theta_i$ from $\theta$).

Here, both $P$ and $P^\perp := I - P$ are orthogonal projection matrices. To evaluate $J_N$ and $J_N'$ we prefer performing a Q-R decomposition of $D^T$ rather than inverting $DD^T$. The matrix $D^T$ has full column rank, so there exist orthogonal $Q$ and regular $R$ ($sN \times sN$) such that

$$Q^T D^T = \begin{bmatrix} R \\ 0 \end{bmatrix}. \tag{2.3}$$

Because of the special form of $D^T$, $R$ will be lower triangular. If $Q_1$ is the submatrix of $Q$ consisting of its first $sN$ columns, then

$$D^T = Q_1 R \tag{2.4}$$

holds.

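Due to the orthogonality of $Q$,

$$P = Q_1 Q_1^T, \tag{2.5}$$

hence

$$J_N = \|Q_1^T z\|^2. \tag{2.6}$$

Furthermore, when $\lambda := (D^+)^T z$ (cf. (1.8)) then

$$D^T \lambda = P z \tag{2.7}$$

and (2.2) implies

$$\dot J_N = 2\, \lambda^T \dot D\, (z - D^T \lambda). \tag{2.8}$$

Premultiplying (2.7) by $Q_1^T$ gives via (2.4) and (2.5)

$$R \lambda = Q_1^T z. \tag{2.9}$$

Summarizing, when matrix $R$ and vector $u := Q_1^T z \in \mathbb{R}^{sN}$ are computed from

$$Q^T [\, D^T \mid z \,] = \begin{bmatrix} R & u \\ 0 & * \end{bmatrix}, \tag{2.10}$$

then, from (2.6), (2.8) and (2.9) we obtain

$$J_N = \|u\|^2 \tag{2.11a}$$

and

$$\dot J_N = 2\, \lambda^T \dot D\, (z - D^T \lambda), \tag{2.11b}$$

where $\lambda$ is easily solved from

$$R \lambda = u, \tag{2.11c}$$

since $R$ is lower triangular.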

The matrix $Q$ in (2.10) is not computed explicitly: by means of Householder matrices, $D^T$ is transformed into $\begin{bmatrix} R \\ 0 \end{bmatrix}$. Using the special Toeplitz and band structure of $D^T$ this can be done very efficiently. For details we refer to Ten Vregelaar (1987).

The computation of $R$ takes $O(N)$ operations, whereas matrix inversion, proposed in Eising, Linssen and Rietbergen (1983), is of order $O(N^2)$.

The computation of $u$ in (2.10) takes $O(N^2)$ operations. Alternatively, once $R$ is computed we can solve $u$ from $R^T u = Dz$, which takes $O(N)$ operations, since $R$ is a band matrix:

$$R = \begin{bmatrix} r_{1,1} & & & \\ \vdots & \ddots & & \\ r_{m+1,1} & & \ddots & \\ & \ddots & & \ddots \\ & & r_{N,N-m}\ \cdots & r_{N,N} \end{bmatrix} \quad \text{(all } r_{i,j} \text{ are } s \times s\text{)}.$$

Therefore, if $N \gg ps^2 + (q+1)sr$ we determine $u$ from $R^T u = Dz$ rather than from (2.10). Besides, Q-R decomposition is known to be a numerically stable procedure.
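The evaluation scheme (2.10)-(2.11) can be sketched in Python as follows for a hypothetical SISO example. This is an illustration under stated assumptions, not the structured Householder algorithm of Ten Vregelaar (1987): it uses the dense routine `numpy.linalg.qr`, which costs more than $O(N)$ operations and returns an upper triangular $R$ rather than the lower triangular factor described above. The names `build_D` and `objective_and_gradient` and all numerical values are made up for this example.

```python
import numpy as np

def build_D(theta, p, q, N):
    """SISO version of D in (1.5b); theta = [a_1..a_p, b_0..b_q] (hypothetical helper)."""
    m = max(p, q)
    a = np.zeros(m + 1); a[0] = -1.0; a[1:p + 1] = theta[:p]
    b = np.zeros(m + 1); b[:q + 1] = theta[p:]
    D = np.zeros((N, 2 * (m + N)))
    for i in range(N):
        D[i, i:i + m + 1] = a
        D[i, m + N + i:m + N + i + m + 1] = b
    return D

def objective_and_gradient(theta, p, q, z):
    """Return J_N = z^T P z and its gradient via a reduced QR of D^T."""
    N = z.size // 2 - max(p, q)
    D = build_D(theta, p, q, N)
    Q1, R = np.linalg.qr(D.T)            # reduced QR: D^T = Q1 R, cf. (2.4)
    u = Q1.T @ z                         # u = Q1^T z
    lam = np.linalg.solve(R, u)          # R lam = u, cf. (2.11c)
    resid = z - D.T @ lam                # z - D^T lam = (I - P) z
    grad = np.empty(len(theta))
    for i in range(len(theta)):
        e = np.zeros(len(theta)); e[i] = 1.0
        dD = build_D(theta + e, p, q, N) - D   # D is affine in theta, so this is dD/dtheta_i
        grad[i] = 2.0 * lam @ dD @ resid       # cf. (2.11b)
    return u @ u, grad                   # J_N = ||u||^2, cf. (2.11a)

theta0 = np.array([0.5, 1.0, 0.3])       # illustrative [a_1, b_0, b_1]
rng = np.random.default_rng(1)
z = rng.standard_normal(2 * (1 + 40))    # m = 1, N = 40
J, grad = objective_and_gradient(theta0, 1, 1, z)
print(J, grad)
```

A quasi-Newton routine such as BFGS can be driven by the returned pair of object function value and gradient.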


3. Assumptions

From now on the noise vector $e$ in (1.6) is supposed to be random with zero mean, all its scalar components are stochastically independent and the variance-covariance matrix of $e$ is

$$\operatorname{var} e = \sigma^2 I,$$

where $\sigma^2$ is unknown. The latter assumption is made to avoid the problem pointed out in Solari (1969), where the stationary point of the likelihood for the SISO model without dynamics ($p = q = 0$) is a saddle point if both the variances on input and output are unknown. However, we may allow for different variances if their quotient is known.

For convenience the object function, which is random now, is multiplied by $\frac{1}{sN}$:

$$J_N(\theta) = \frac{1}{sN}\, z^T P(\theta)\, z. \tag{3.1}$$

Furthermore, $\theta$ is redefined as a vector in $\mathbb{R}^\mu$, with $\mu = ps^2 + (q+1)sr$, by

$$\theta^T = [(\alpha_1)_{1*} \cdots (\alpha_1)_{s*}\ (\alpha_2)_{1*} \cdots (\alpha_p)_{s*}\ (\beta_0)_{1*} \cdots (\beta_q)_{s*}]. \tag{3.2}$$

Here $M_{i*}$ and $M_{*j}$ denote row $i$ and column $j$ of any matrix $M$.

The 5 assumptions to be introduced below are more or less commonly used to derive asymptotic properties of estimators.

Let $\Theta$ denote the parameter space, then any least squares estimator is defined by

$$\hat\theta_N = \arg\min_{\theta \in \Theta} J_N(\theta). \tag{3.3}$$

Assumption 1

The parameter space $\Theta$ is a known convex and compact subset of $\mathbb{R}^\mu$, containing the unknown true parameter vector $\theta_0$.

Since $J_N$ is almost surely (a.s.) a $C^\infty$-function w.r.t. $\theta$, hence in particular continuous, the compactness of $\Theta$ implies that $\hat\theta_N$ defined by (3.3) is indeed a random vector in the sense that it is measurable, cf. Bierens (1981), p. 53.

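As an introduction for the next assumption we introduce the polynomial matrices

$$A(\lambda) = -I + \sum_{i=1}^{p} \alpha_i \lambda^i \quad \text{and} \quad B(\lambda) = \sum_{j=0}^{q} \beta_j \lambda^j. \tag{3.4}$$

Interpreting $\lambda$ as the backward shift operator ($\lambda \eta_t = \eta_{t-1}$), model (1.1) reads

$$A(\lambda)\,\eta_t + B(\lambda)\,\xi_t = 0, \qquad t = m+1, m+2, \ldots \tag{3.5}$$

Assumption 2

For all $\theta \in \Theta$, $A(\lambda)$ is stable, i.e. the zeros of $\det A(\lambda)$ lie outside the closed unit disk.

We associate matrices $A$ and $B$ with the polynomial matrices $A(\lambda)$ and $B(\lambda)$, by defining

$$A = -I + \sum_{k=1}^{p} S^k \otimes \alpha_k \quad (sN \times sN) \tag{3.6}$$

and

$$B = \sum_{k=0}^{q} S^k \otimes \beta_k \quad (sN \times rN), \tag{3.7}$$

where $S$ is the $N \times N$ shift matrix

$$S = \begin{bmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \\ & & & 0 \end{bmatrix}.$$

In (3.6), $\otimes$ denotes the Kronecker product for matrices. From (1.5b) it follows immediately that

$$D = [\, A \ \ C_1 \mid B \ \ C_2 \,] \tag{3.8}$$

with

$$C_1 = \begin{bmatrix} 0 & & \\ \alpha_m & & \\ \vdots & \ddots & \\ \alpha_1 & \cdots & \alpha_m \end{bmatrix} \ (sN \times sm) \quad \text{and} \quad C_2 = \begin{bmatrix} 0 & & \\ \beta_m & & \\ \vdots & \ddots & \\ \beta_1 & \cdots & \beta_m \end{bmatrix} \ (sN \times rm).$$

Assumption 2 has some important consequences concerning $(DD^T)^{-1}$, which appears in the object function and gradient.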

Lemma 1

(i) Some constants $\rho_1$ and $\rho_2$ exist with $0 < \rho_1 < \rho_2 < \infty$ such that

$$\rho_1 I \le (DD^T)^{-1} \le \rho_2 I \quad \text{for all } \theta \in \Theta \text{ and all } N \ge p + 1.$$

(By definition: $M_1 \le M_2$ if $x^T M_1 x \le x^T M_2 x$ for all $x$; $M_1$ and $M_2$ are symmetric matrices.)

(ii) There exists a constant $\rho_3 < \infty$ such that

$$\|(DD^T)^{-1}\|_\infty \le \rho_3 \quad \text{for all } \theta \in \Theta \text{ and all } N \ge p + 1.$$

(By definition: if $M$ is $n \times n$ then $\|M\|_\infty := \max_{i=1,\ldots,n} \sum_{j=1}^{n} |M_{ij}|$.)

Proof

(i) See (appendix of) Ten Vregelaar (1988), which contains a proof of the analogous results for $AA^T$ and $DD^T$. Then the result for $(DD^T)^{-1}$ follows immediately.

(ii) The matrix $DD^T$ is a block-Toeplitz and band matrix and it is positive definite. Hence it can be interpreted as the covariance matrix of an $s$-variate MA($m$) process. Then for $N \gg m$ we can approximate $(DD^T)^{-1}$ by the covariance matrix $\Sigma_{AR}$ of the corresponding $s$-variate AR($m$) process, see Mentz (1976). Suppose $\Sigma_{AR} = [a(|i - j|)]$, then

$$a(k) = \sum_{i=1}^{m} C_i\, x_i^k,$$

where $C_i$ are $m \times m$ matrices. Without loss of generality we assume $x_1, \ldots, x_m$ are different. Now $x_1, \ldots, x_m$ can be chosen such that $x := \max_i |x_i| < 1$, hence

$$\|\Sigma_{AR}\|_\infty \le 2 \sum_{k=0}^{\infty} \|a(k)\| \le \text{const} \sum_{k=0}^{\infty} x^k = \frac{\text{const}}{1 - x}. \qquad \square$$

Remark

The result $(DD^T)^{-1} \le \rho_2 I$ is a consequence of $\|(DD^T)^{-1}\|_\infty \le \rho_3$, since $\max_i |\lambda_i| \le \|(DD^T)^{-1}\|_\infty$, where $\lambda_i$ denote the eigenvalues of $(DD^T)^{-1}$, see Wilkinson (1965) p. 59.
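The uniform bounds of Lemma 1 (i) can be observed numerically. The sketch below, a hedged illustration with made-up SISO parameters ($\alpha_1 = 0.5$, $\beta_0 = 1$, $\beta_1 = 0.3$, so that the zero of $A(\lambda) = -1 + 0.5\lambda$ is $\lambda = 2$ and Assumption 2 holds), computes the extreme eigenvalues of $DD^T$ for increasing $N$. The helper `build_D` is hypothetical and NumPy is assumed.

```python
import numpy as np

def build_D(alpha, beta, N):
    """SISO version of D in (1.5b); alpha = [a_1..a_p], beta = [b_0..b_q] (hypothetical helper)."""
    p, q = len(alpha), len(beta) - 1
    m = max(p, q)
    a = np.zeros(m + 1); a[0] = -1.0; a[1:p + 1] = alpha
    b = np.zeros(m + 1); b[:q + 1] = beta
    D = np.zeros((N, 2 * (m + N)))
    for i in range(N):
        D[i, i:i + m + 1] = a
        D[i, m + N + i:m + N + i + m + 1] = b
    return D

alpha, beta = [0.5], [1.0, 0.3]          # illustrative stable example
extremes = {}
for N in (10, 100, 400):
    D = build_D(alpha, beta, N)
    eig = np.linalg.eigvalsh(D @ D.T)    # ascending eigenvalues of D D^T
    extremes[N] = (eig[0], eig[-1])
    # eigenvalue range of (D D^T)^{-1}: [1 / largest, 1 / smallest]
    print(N, 1.0 / eig[-1], 1.0 / eig[0])
```

In this particular example $DD^T$ is tridiagonal Toeplitz with $2.34$ on the diagonal and $-0.2$ next to it, so its eigenvalues lie in $(1.94, 2.74)$ for every $N$, and $\rho_1 = 1/2.74$, $\rho_2 = 1/1.94$ serve as bounds independent of $N$.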

Assumption 3

The input sequence is bounded: there exists a constant $M_1$ such that $\|\xi_i\| \le M_1$ for $i = 1, 2, \ldots$

Corollary 1

The sequence of outputs $\{\eta_i\}_{i=1}^{\infty}$ is bounded.

Proof

This well-known BIBO-stability result is an immediate consequence of Assumptions 2 and 3. $\square$

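We prepare the next assumption by rewriting the vector $D\zeta$ in (1.5) as

$$D\zeta = (H + K)\,\theta - \begin{bmatrix} \eta_{m+N} \\ \vdots \\ \eta_{m+1} \end{bmatrix}, \tag{3.9}$$

where $\theta$ is as defined in (3.2) and $H$ and $K$ are $sN \times \mu$ matrices. The matrix $H$ is given by

$$H = [\, (S \otimes I_s)\bar\eta \ \cdots\ (S^p \otimes I_s)\bar\eta \mid \bar\xi \ \ (S \otimes I_s)\bar\xi \ \cdots\ (S^q \otimes I_s)\bar\xi \,], \tag{3.10a}$$

and $K$, defined in (3.10b), collects the contributions of the initial values $\eta_1, \ldots, \eta_m$ and $\xi_1, \ldots, \xi_m$; it has a finite number of nonzero elements. Here $I_s$ is the $s \times s$ identity matrix,

$$\bar\eta = \begin{bmatrix} I_s \otimes \eta_{m+N}^T \\ \vdots \\ I_s \otimes \eta_{m+1}^T \end{bmatrix} \ (sN \times s^2) \quad \text{and} \quad \bar\xi = \begin{bmatrix} I_s \otimes \xi_{m+N}^T \\ \vdots \\ I_s \otimes \xi_{m+1}^T \end{bmatrix} \ (sN \times sr).$$

Since $D(\theta_0)\,\zeta = 0$ for the true parameter vector $\theta_0$, (3.9) yields

$$D(\theta)\,\zeta = (H + K)(\theta - \theta_0). \tag{3.11}$$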

Assumption 4

The matrix $\dfrac{H^T (DD^T)^{-1} H}{sN}$ converges as $N \to \infty$, for all $\theta \in \Theta$. The limiting matrix, say $G$, is positive definite on $\Theta$.

Because of Lemma 1, $sN \ge \mu$ is a necessary condition for Assumption 4 to hold. As will be seen in the next section this assumption implies also a convergence result for $\mathbb{E} J_N(\theta)$, which is one of the tools for proving consistency.

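Generalizing Aoki and Yue (1970a), we can give an interpretation for the convergence of $\frac{1}{N} H^T (DD^T)^{-1} H$ to a positive definite matrix in the SISO-case $s = r = 1$. The above defined $\theta$, $H$, $\bar\eta$ and $\bar\xi$ reduce to

$$\theta = [\alpha_1 \cdots \alpha_p\ \beta_0 \cdots \beta_q]^T, \qquad H = [\, S\eta \ \cdots\ S^p\eta \mid \xi \ \ S\xi \ \cdots\ S^q\xi \,],$$

$$\eta = [\eta_{m+N} \cdots \eta_{m+1}]^T \quad \text{and} \quad \xi = [\xi_{m+N} \cdots \xi_{m+1}]^T.$$

Defining $v = D\zeta - A\eta - B\xi$, (1.5) and (3.8) imply

$$v = C_1 \begin{bmatrix} \eta_m \\ \vdots \\ \eta_1 \end{bmatrix} + C_2 \begin{bmatrix} \xi_m \\ \vdots \\ \xi_1 \end{bmatrix}.$$

For $\theta = \theta_0$ (notation: superscript 0),

$$A^0 \eta = -B^0 \xi - v^0$$

holds. Then, using $A S^k = S^k A$ for $k = 1, 2, \ldots, m$,

$$A^0 H = [\, -S B^0 \xi \ \cdots\ -S^p B^0 \xi \mid A^0 \xi \ \ S A^0 \xi \ \cdots\ S^q A^0 \xi \,] + \Delta^0,$$

where $\Delta^0 := -[\, S v^0 \ \cdots\ S^p v^0 \mid 0 \ \cdots\ 0 \,]$. Hence

$$A^0 H = \Xi E^0 + \Delta^0$$

with $\Xi := [\, \xi \ \ S\xi \ \cdots\ S^{p+q}\xi \,]$ and

$$E = \begin{bmatrix} 0 & & & -1 & & \\ -\beta_0 & \ddots & & \alpha_1 & \ddots & \\ \vdots & \ddots & 0 & \vdots & \ddots & -1 \\ -\beta_q & & -\beta_0 & \alpha_p & & \alpha_1 \\ & \ddots & \vdots & & \ddots & \vdots \\ & & -\beta_q & & & \alpha_p \end{bmatrix}, \qquad (p+q+1) \times (p+q+1).$$

The effect of $\Delta^0$ in $\frac{1}{N} H^T (DD^T)^{-1} H$ vanishes, whence

$$\lim_{N\to\infty} \frac{H^T (DD^T)^{-1} H}{N} = (E^0)^T \lim_{N\to\infty} \frac{\Xi^T (A^0)^{-T} (DD^T)^{-1} (A^0)^{-1}\, \Xi}{N}\; E^0. \tag{3.12}$$

The matrices $(A^0)^{-T}(A^0)^{-1}$ and $(DD^T)^{-1}$ can be bounded in the sense of Lemma 1. Therefore, provided the limits exist, the limit in the left hand side of (3.12) is positive definite if and only if

(i) $\lim_{N\to\infty} \dfrac{\Xi^T \Xi}{N} > 0$,

(ii) $E^0$ is regular.

Condition (i) could be a definition of persistency of excitation of order $p + q$ for the input sequence $\{\xi_i\}_{i=1}^{\infty}$ (see Aoki and Yue (1970a), p. 544), whereas the second condition is equivalent to the statement that the polynomials $A(\lambda)$ and $B(\lambda)$ in (3.4) are coprime (see Wolovich (1974), pp. 234-236).

In turn, the latter is equivalent to the statement, in state-space terminology, that the system is controllable if it is observable.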

The last assumption requires the fourth moment of the noises to be uniformly bounded.

Assumption 5

Let $e_i$ denote (scalar) component $i$ of the noise vector $e$. There exists a constant $M_2$ such that $\mathbb{E}\, e_i^4 \le M_2$ for all $i$.

4. Consistency

Consistency is obtained by applying the following argument: when the object function converges in some sense uniformly on a compact set to a continuous limit function which is uniquely minimal in the true value of the parameter vector, then any minimizing solution converges in that sense to the true value, see Bierens (1981), p. 54, 65.

In the sequel uniform convergence refers to convergence with respect to $\theta$ on the convex and compact $\Theta$.

Let us start by proving two lemmas.

Lemma 2

For all $\theta \in \Theta$,

$$\lim_{N\to\infty} \mathbb{E} J_N(\theta) = J(\theta),$$

where $J(\theta) := \sigma^2 + (\theta - \theta_0)^T G(\theta)\, (\theta - \theta_0)$.

Proof

Observe that the object function defined by (3.1) has mean

$$\mathbb{E} J_N = \sigma^2 + \frac{1}{sN}\, \zeta^T P \zeta = \sigma^2 + (\theta - \theta_0)^T\, \frac{(H+K)^T (DD^T)^{-1} (H+K)}{sN}\, (\theta - \theta_0) \tag{4.1}$$

by (3.11). The lemma follows from Assumption 4, since $K$ has a finite number of nonzero elements. $\square$
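The first equality in (4.1) can be verified in one line. The following worked step is added for the reader's convenience; it only uses $z = \zeta + e$, $\mathbb{E}\,e = 0$, $\operatorname{var} e = \sigma^2 I$ and the fact that $P$ is an orthogonal projection of rank $sN$, so that $\operatorname{tr} P = sN$:

```latex
\mathbb{E}\, z^T P z
  = \zeta^T P \zeta + 2\, \zeta^T P\, \mathbb{E} e + \mathbb{E}\, e^T P e
  = \zeta^T P \zeta + \sigma^2 \operatorname{tr} P
  = \zeta^T P \zeta + \sigma^2 sN .
```

Dividing by $sN$ gives $\mathbb{E} J_N = \sigma^2 + \zeta^T P \zeta / (sN)$. The second equality in (4.1) then follows from $\zeta^T P \zeta = (D\zeta)^T (DD^T)^{-1} (D\zeta)$.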

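Lemma 3

The sequence of functions $\{\mathbb{E} J_N(\theta)\}$ is equicontinuous on $\Theta$, i.e. (see Rudin (1964)) for every $\varepsilon > 0$ there exists a $\delta > 0$ such that

$$|\mathbb{E} J_N(\theta_1) - \mathbb{E} J_N(\theta_2)| < \varepsilon \quad \text{for all } N \text{ and all } \theta_1, \theta_2 \in \Theta \text{ with } \|\theta_1 - \theta_2\| < \delta.$$

Proof

According to the mean value theorem,

$$|\mathbb{E} J_N(\theta_1) - \mathbb{E} J_N(\theta_2)| \le \left\| \frac{\partial}{\partial\theta}\, \mathbb{E} J_N(\bar\theta_N) \right\| \, \|\theta_1 - \theta_2\|$$

for some $\bar\theta_N \in \Theta$ with $\|\bar\theta_N - \theta_2\| \le \|\theta_1 - \theta_2\|$.

If the dot denotes $\partial/\partial\theta_i$ for arbitrary $\theta_i$, we have

$$\frac{\partial}{\partial\theta_i}\, \mathbb{E} J_N(\theta) = \frac{\zeta^T \dot P \zeta}{sN}.$$

We recall from (2.2) that

$$\dot P = D^+ \dot D P^\perp + P^\perp \dot D^T (D^+)^T.$$

Using Lemma 1 part (i), it is easy to verify that there exists a constant $k > 0$ such that $-kI \le \dot P \le kI$ holds for all $\theta \in \Theta$ and $N \ge \mu / s$. Hence

$$\left| \frac{\partial}{\partial\theta_i}\, \mathbb{E} J_N(\theta) \right| \le k\, \frac{\zeta^T \zeta}{sN}.$$

By virtue of Assumption 3 and its Corollary 1, it follows that $\left\| \frac{\partial}{\partial\theta}\, \mathbb{E} J_N(\bar\theta_N) \right\| \le k$ for some constant $k$, which proves the lemma (take $\delta = \varepsilon / k$). $\square$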

Proposition 1

$\mathbb{E} J_N(\theta) \to J(\theta)$ as $N \to \infty$, uniformly.

Proof

The proposition is a consequence of Lemmas 2 and 3, cf. Dieudonné (1969) Theorem 7.5.6. $\square$

Remark

The uniform limit function $J$ is continuous on $\Theta$, and since $\Theta$ is compact it is even uniformly continuous on $\Theta$.

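In order to obtain a convergence result for $J_N(\theta)$ we investigate

$$L_N(\theta) := J_N(\theta) - \mathbb{E} J_N(\theta).$$

Lemma 4

$L_N(\theta) \to 0$ a.s., for all $\theta \in \Theta$.

Proof

We have

$$L_N(\theta) = \frac{2}{sN}\, e^T P(\theta)\, \zeta + \frac{1}{sN}\, e^T P(\theta)\, e - \sigma^2. \tag{4.2}$$

By application of a result in Whittle (1960), p. 302 and Assumption 5, we obtain from Assumption 3 and Corollary 1 that

$$\sum_{N=1}^{\infty} \mathbb{P}\left( \left| \frac{2}{sN}\, e^T P(\theta)\, \zeta \right| \ge \varepsilon \right) < \infty \quad \text{for all } \varepsilon > 0.$$

Due to the Borel-Cantelli Lemma, it follows that

$$\frac{2}{sN}\, e^T P(\theta)\, \zeta \to 0 \quad \text{a.s.}$$

On the other hand,

$$\mathbb{P}\left( \left| \frac{e^T P(\theta)\, e}{sN} - \sigma^2 \right| \ge \varepsilon \right) \le \frac{1}{s^2 N^2 \varepsilon^2}\, \operatorname{var}\left( e^T P(\theta)\, e \right) \le \text{const}\, \frac{1}{sN}$$

holds, by consequence of another result in Whittle (1960) p. 302 and Assumption 5. Hence $\frac{1}{sN} e^T P(\theta) e$ converges to $\sigma^2$ in probability. Now $P \ge 0$ and $\operatorname{tr} \frac{P}{sN} = 1$ imply

$$\frac{e^T P(\theta)\, e}{sN} \to \sigma^2 \quad \text{a.s.},$$

cf. Varberg (1968) Corollary 3. $\square$

Lemma 5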

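The sequence of random functions $\{L_N(\theta)\}$ is a.s. equicontinuous on $\Theta$, i.e. if $\{\Omega, \mathcal{F}, \mathbb{P}\}$ is the probability space involved, there exists a null set $E \subset \Omega$ (i.e. $\mathbb{P}(E) = 0$) such that for every $\omega \in \Omega - E$ the sequence $\{L_N(\theta, \omega)\}$ is equicontinuous on $\Theta$.

Proof

The argument is analogous to that given in the proof of Lemma 3. If the dot denotes $\partial/\partial\theta_i$ again, then by (4.2)

$$\dot L_N = \frac{2}{sN}\, e^T \dot P \zeta + \frac{1}{sN}\, e^T \dot P e,$$

and

$$\left| \frac{e^T \dot P e}{sN} \right| \le k\, \frac{e^T e}{sN} \to k\, \frac{s+r}{s}\, \sigma^2 \quad \text{a.s.}$$

by the Kolmogorov strong law of large numbers ($\operatorname{var} e_i^2 \le \text{const}$ by Assumption 5), see Tucker (1967) p. 124.

Then using the Cauchy-Schwarz inequality it follows

$$\left| \frac{e^T \dot P \zeta}{sN} \right| \le \sqrt{\frac{\zeta^T \dot P^2 \zeta}{sN}}\, \sqrt{\frac{e^T e}{sN}} \le k\, \sqrt{\frac{\zeta^T \zeta}{sN}}\, \sqrt{\frac{e^T e}{sN}} \le \text{const}\, \sqrt{\frac{e^T e}{sN}} \to \text{const}\, \sqrt{\frac{s+r}{s}}\, \sigma \quad \text{a.s.},$$

see again the proof of Lemma 3.

It is obvious now that for the gradient vector $L_N'$, $\|L_N'(\theta)\|$ is a.s. bounded, uniformly on $\Theta$, i.e. there exist a constant $c$ and a null set $E$ such that for every $\omega \in \Omega - E$ there exists some integer $N_1(\omega)$ with

$$\|L_N'\| \le c \quad \text{for } N > N_1(\omega) \text{ and for all } \theta \in \Theta.$$

Applying the mean value theorem again, it follows that for every $\omega \in \Omega - E$ and every $\varepsilon > 0$ there exists a $\delta := \varepsilon / c > 0$ such that

$$|L_N(\theta_1, \omega) - L_N(\theta_2, \omega)| \le \|L_N'(\bar\theta_N, \omega)\|\, \|\theta_1 - \theta_2\| < \varepsilon$$

for $N > N_1(\omega)$ and for $\theta_1, \theta_2 \in \Theta$ with $\|\theta_1 - \theta_2\| < \delta$. $\square$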

Proposition 2

$L_N(\theta) \to 0$ a.s., uniformly, i.e. $\sup_{\theta \in \Theta} |L_N(\theta)| \to 0$ a.s.

Proof

Let $\varepsilon > 0$ and $\theta \in \Theta$ be arbitrary. Then, by virtue of Lemma 4 and the a.s. boundedness of $\|L_N'(\theta)\|$, uniformly on $\Theta$ (see the proof of Lemma 5), there exist a neighbourhood $U$ of $\theta$ and a null set $E$ such that for every $\omega \in \Omega - E$ there exists an integer $N_0$, satisfying

$$|L_N(\theta', \omega)| < \varepsilon \quad \text{for all } N > N_0 \text{ and all } \theta' \in U.$$

This follows by applying the mean value theorem. Since $\Theta$ is compact, the claim results from covering $\Theta$ by a finite subcover of the union of all neighbourhoods $U$. $\square$

Now we are able to give the main result of this section.

Theorem 1

Under Assumptions 1-5, any sequence of estimators $\{\hat\theta_N\}$, defined by (3.3), is strongly consistent for the true parameter vector $\theta_0$, i.e.

$$\hat\theta_N \to \theta_0 \quad \text{a.s.}$$

Proof

Propositions 1 and 2 imply $J_N \to J$ a.s., uniformly. The limit function $J$, which has been defined in Lemma 2 (cf. (4.1)), has on $\Theta$ a unique minimum in $\theta_0$, see Assumption 4. Furthermore $J$ is continuous according to the remark at Proposition 1. The theorem is an application of Lemma 3.1.3 in Bierens (1981). $\square$

Remark

An estimate for the unknown variance $\sigma^2$ is given by $J_N(\hat\theta_N)$: as a consequence of Theorem 1 and $J_N \to J$ a.s., uniformly,

$$J_N(\hat\theta_N) \to J(\theta_0) = \sigma^2 \quad \text{a.s.}$$
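The consistency result and the variance estimate $J_N(\hat\theta_N)$ can be tried out on simulated data. The sketch below is an illustration only: it assumes a SISO model with made-up true parameters, a bounded uniformly distributed input, Gaussian noise with standard deviation `sigma`, and it replaces the BFGS iteration of Section 2 by a crude grid search over a small box around the true value. The helper names `build_D` and `J_N` are hypothetical; NumPy is assumed.

```python
import itertools
import numpy as np

def build_D(theta, p, q, N):
    """SISO version of D in (1.5b); theta = [a_1..a_p, b_0..b_q] (hypothetical helper)."""
    m = max(p, q)
    a = np.zeros(m + 1); a[0] = -1.0; a[1:p + 1] = theta[:p]
    b = np.zeros(m + 1); b[:q + 1] = theta[p:]
    D = np.zeros((N, 2 * (m + N)))
    for i in range(N):
        D[i, i:i + m + 1] = a
        D[i, m + N + i:m + N + i + m + 1] = b
    return D

def J_N(theta, p, q, z):
    """Scaled object function (3.1) for s = 1: z^T P(theta) z / N."""
    N = z.size // 2 - max(p, q)
    D = build_D(theta, p, q, N)
    Dz = D @ z
    return Dz @ np.linalg.solve(D @ D.T, Dz) / N

rng = np.random.default_rng(0)
p = q = m = 1
N, sigma = 200, 0.01
theta_true = np.array([0.5, 1.0, 0.3])        # illustrative [alpha_1, beta_0, beta_1]

xi = rng.uniform(-1.0, 1.0, m + N)            # bounded input, cf. Assumption 3
eta = np.zeros(m + N)                         # eta_1 = 0 is taken as initial value
for t in range(m, m + N):                     # model (1.1)
    eta[t] = theta_true[0] * eta[t - 1] + theta_true[1] * xi[t] + theta_true[2] * xi[t - 1]

zeta = np.concatenate([eta[::-1], xi[::-1]])  # ordering of (1.5c): newest sample first
z = zeta + sigma * rng.standard_normal(zeta.size)

# Crude stand-in for the iterative minimization: 5 x 5 x 5 grid with step 0.05.
grid = [np.linspace(c - 0.1, c + 0.1, 5) for c in theta_true]
theta_hat = min((np.array(t) for t in itertools.product(*grid)),
                key=lambda t: J_N(t, p, q, z))
sigma2_hat = J_N(theta_hat, p, q, z)          # estimate of sigma^2, cf. the remark above
print(theta_hat, sigma2_hat, sigma ** 2)
```

A grid search is used here only to keep the example short and free of extra dependencies; in practice the gradient formula (2.11b) makes a quasi-Newton method the natural choice.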

5. Asymptotic normality

In this section the asymptotic normality property of any sequence $\{\hat\theta_N\}$ defined by (3.3) is shown. We need some additional assumptions.

Assumption 1a

The true parameter vector $\theta_0$ is an interior point of the parameter space $\Theta$.

A common method is starting from the Taylor expansion of $J_N'(\hat\theta_N)$:

$$J_N'(\hat\theta_N) = J_N'(\theta_0) + H_N (\hat\theta_N - \theta_0). \tag{5.1}$$

Here $H_N$ is the matrix of second derivatives evaluated in some mean value points $\bar\theta_N^{\,i}$:

$$(H_N)_{i,j} = \frac{\partial^2 J_N}{\partial\theta_j\, \partial\theta_i}(\bar\theta_N^{\,i}), \qquad i, j = 1, \ldots, \mu \tag{5.2a}$$

with

$$\|\bar\theta_N^{\,i} - \theta_0\| \le \|\hat\theta_N - \theta_0\| \quad \text{for } i = 1, \ldots, \mu. \tag{5.2b}$$

From the assumption above and Theorem 1 it follows

$$\sqrt{sN}\, J_N'(\hat\theta_N) = 0 \quad \text{a.s., for } N \text{ sufficiently large}.$$

Then, by (5.1),

$$\sqrt{sN}\, J_N'(\theta_0) + H_N \sqrt{sN}\, (\hat\theta_N - \theta_0) = 0 \quad \text{a.s., for } N \text{ sufficiently large}. \tag{5.3}$$

Let $J_N''$ denote the $\mu \times \mu$ matrix of second derivatives of $J_N$. It can be verified that $\{\mathbb{E} J_N''\}$ is bounded, uniformly on $\Theta$.

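Assumption 6

For all $\theta \in \Theta$, $\mathbb{E} J_N''$ converges as $N \to \infty$.

Proposition 3

As $N \to \infty$, $J_N' \to J'$ a.s., uniformly, and $J_N'' \to J''$ a.s., uniformly, hold.

Proof

Analogous to Proposition 1, it follows that $\mathbb{E} J_N''$ converges uniformly. By virtue of Theorem 8.6.3 in Dieudonné (1969), this implies $J$ is a $C^2$-function and $\mathbb{E} J_N' \to J'$ and $\mathbb{E} J_N'' \to J''$, uniformly.

The proof of

$$J_N'(\theta) - \mathbb{E} J_N'(\theta) \to 0 \quad \text{a.s., uniformly}$$

and

$$J_N''(\theta) - \mathbb{E} J_N''(\theta) \to 0 \quad \text{a.s., uniformly}$$

is similar to that of Proposition 2: essential are the results

$$\left( \frac{\partial P}{\partial\theta_i} \right)^2 \le k_1 I, \qquad \left( \frac{\partial^2 P}{\partial\theta_j\, \partial\theta_i} \right)^2 \le k_2 I \quad \text{and} \quad \left( \frac{\partial^3 P}{\partial\theta_k\, \partial\theta_j\, \partial\theta_i} \right)^2 \le k_3 I,$$

for some constants $k_1$, $k_2$ and $k_3$, which follow from Lemma 1 part (i) after some calculations; the a.s. convergence of

$$\frac{1}{sN}\, e^T \frac{\partial P}{\partial\theta_i}\, e \quad \text{and} \quad \frac{1}{sN}\, e^T \frac{\partial^2 P}{\partial\theta_j\, \partial\theta_i}\, e$$

to zero (since $\operatorname{tr} \frac{\partial P}{\partial\theta_i} = \operatorname{tr} \frac{\partial^2 P}{\partial\theta_j\, \partial\theta_i} = 0$) is a consequence of Varberg (1968) Theorem 3. Compare also Lemma 4. The results of the proposition are obvious now. $\square$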

Corollary 2

$$H_N \to H_0 := J''(\theta_0) > 0 \quad \text{a.s.}$$

Proof

By virtue of Theorem 1, (5.2) and Proposition 3 we obtain the a.s. convergence of $H_N$ to $J''(\theta_0)$. Since $J''(\theta_0) = 2\, G(\theta_0)$, see (4.1), it is positive definite by Assumption 4. $\square$

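It remains to show asymptotic normality for $\sqrt{sN}\, J_N'(\theta_0)$, see (5.3). We define

$$S_N := \sqrt{sN}\, \lambda_N^T J_N'(\theta_0), \tag{5.4}$$

where $\{\lambda_N\}$ is any sequence of normed vectors in $\mathbb{R}^\mu$. If $\lambda_N := [\lambda_{N,1}, \ldots, \lambda_{N,\mu}]^T$ then

$$S_N = z^T L z, \tag{5.5a}$$

with

$$L := \frac{1}{\sqrt{sN}} \sum_{i=1}^{\mu} \lambda_{N,i}\, P^i(\theta_0). \tag{5.5b}$$

Here $P^i$ denotes $\frac{\partial P}{\partial\theta_i}$, which is $(s+r)(m+N) \times (s+r)(m+N)$.

Due to $\operatorname{tr} L = 0$ and $\zeta^T L \zeta = 0$,

$$\mathbb{E} S_N = 0$$

holds.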

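To denote the model equations as $D(\theta)\,\zeta = 0$ we choose now, instead of (1.5),

$$D = \begin{bmatrix} \gamma_0 & \cdots & \gamma_m & & \\ & \ddots & & \ddots & \\ & & \gamma_0 & \cdots & \gamma_m \end{bmatrix}, \qquad sN \times (s+r)(m+N) \tag{5.7a}$$

and

$$\zeta^T = [\, \eta_{m+N}^T \ \ \xi_{m+N}^T \ \cdots\ \eta_1^T \ \ \xi_1^T \,]. \tag{5.7b}$$

Likewise the vectors $z$ and $e$ in $\mathbb{R}^{(s+r)(m+N)}$ are redefined in this way, hence (1.6) carries over. In (5.7a), $\gamma_k := [\alpha_k \ \ \beta_k]$ for $k = 0, 1, \ldots, m$ and $\alpha_0 := -I$.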

Assumption 5 is replaced by the stronger

Assumption 5'

If $e = [e_1 \cdots e_{(s+r)(m+N)}]^T$ then

a) there exist a $\delta > 0$ and a constant $M_2 < \infty$ such that $\mathbb{E}\, |e_i|^{4+2\delta} \le M_2$, for all $i = 1, 2, \ldots$

b) $\mathbb{E}\, e_i^3 = 0$ for $i = 1, \ldots, (s+r)(m+N)$.

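In order to apply some central limit theorem, matrix $D$ in (5.7a) is partitioned as

$$D = [\, \Gamma \ \ C \,], \tag{5.8a}$$

where

$$\Gamma = \begin{bmatrix} \gamma_0 & \cdots & \gamma_m & & \\ & \ddots & & \ddots & \\ & & \ddots & & \gamma_m \\ & & & \ddots & \vdots \\ & & & & \gamma_0 \end{bmatrix}, \qquad sN \times (s+r)N \tag{5.8b}$$

and

$$C = \begin{bmatrix} 0 & & \\ \gamma_m & & \\ \vdots & \ddots & \\ \gamma_1 & \cdots & \gamma_m \end{bmatrix}, \qquad sN \times (s+r)m. \tag{5.8c}$$

For $m + N^2$ measurements the corresponding matrix is written as

$$\bar D = \bar\Gamma + \bar C, \tag{5.9a}$$

where

$$\bar\Gamma := \begin{bmatrix} \Gamma & & 0 \\ & \ddots & \\ 0 & & \Gamma \end{bmatrix} \tag{5.9b}$$

and

$$\bar C := \begin{bmatrix} 0 & C & & \\ & \ddots & \ddots & \\ & & 0 & C \end{bmatrix}, \tag{5.9c}$$

the number of block matrices being $N$.

Then, we obtain for $S_{N^2}$,

$$S_{N^2} = \bar z^T \bar L \bar z = \bar z^T M \bar z + \bar z^T \bar R \bar z. \tag{5.10}$$

Here $\bar R := \bar L - M$ and $\bar z$ is the vector in $\mathbb{R}^{(s+r)(m+N^2)}$ corresponding to $m + N^2$ measurements.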

Proposition 4

$\dfrac{S_N}{\sqrt{\operatorname{var} S_N}}$ converges in distribution to the standard normal distribution.

Proof

Consider $S_{N^2}$ defined in (5.10). After some tedious calculations in which Lemma 1 plays a key role, it is obtained that

$$\frac{\operatorname{var} \bar z^T \bar R \bar z}{\operatorname{var} S_{N^2}} \to 0 \quad \text{as } N \to \infty.$$

According to Bernstein's lemma (see e.g. Whittle (1964) pp. 105-106) asymptotic normality of $\bar z^T M \bar z$ implies asymptotic normality of $S_{N^2}$.

We have

$$\bar z^T M \bar z = \sum_{i=1}^{N} X_i,$$

where $X_i = \bar z_i^T M \bar z_i$ and the vectors $\bar z_i \in \mathbb{R}^{(s+r)N}$ are the consecutive blocks of $\bar z$. It is now easy to verify that for $\delta$ defined in Assumption 5' a,

$$\frac{\sum_{i=1}^{N} \mathbb{E}\, |X_i - \mathbb{E} X_i|^{2+\delta}}{(\operatorname{var} \bar z^T M \bar z)^{1+\delta/2}} < \text{const}\; N^{-\delta/2},$$

hence tending to zero as $N \to \infty$. Therefore by virtue of the Liapounov central limit theorem, $\bar z^T M \bar z$ is asymptotically normal.

Now $S_{N^2}$ is asymptotically normal. For those $N$ with $\sqrt{N} \notin \mathbb{N}$ we use a somewhat different partition of the matrix $D$. $\square$

Corollary 3

$$[\operatorname{var}(\sqrt{sN}\, J_N'(\theta_0))]^{-\frac{1}{2}T}\, \sqrt{sN}\, J_N'(\theta_0) \xrightarrow{d} N(0, I),$$

i.e. converges in distribution to the standard multivariate normal distribution. The square root is arbitrary.

Remark

By virtue of Assumption 4, $\operatorname{var}(\sqrt{sN}\, J_N'(\theta_0)) > 0$ for $N$ sufficiently large.

Proof

Immediate from the definition of $S_N$ and Proposition 4. $\square$

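Theorem 2

Under Assumptions 1-4, 1a, 5' and 6, any sequence $\{\hat\theta_N\}$ defined by (3.3) satisfies

$$[\operatorname{var}(\sqrt{sN}\, J_N'(\theta_0))]^{-\frac{1}{2}T}\, H_0\, \sqrt{sN}\, (\hat\theta_N - \theta_0) \xrightarrow{d} N(0, I).$$

Proof

By virtue of (5.3) and Corollary 3,

$$-[\operatorname{var}(\sqrt{sN}\, J_N'(\theta_0))]^{-\frac{1}{2}T}\, H_N\, \sqrt{sN}\, (\hat\theta_N - \theta_0) \xrightarrow{d} N(0, I) \tag{5.11}$$

holds. Corollary 2 implies $H_N^{-1} \to H_0^{-1}$ a.s. and hence

$$[\operatorname{var}(\sqrt{sN}\, J_N'(\theta_0))]^{-\frac{1}{2}T}\, H_0 H_N^{-1}\, [\operatorname{var}(\sqrt{sN}\, J_N'(\theta_0))]^{\frac{1}{2}T} \to I \quad \text{a.s.} \tag{5.12}$$

The theorem follows from (5.11) and (5.12). $\square$

Remark

If $\operatorname{var}(\sqrt{sN}\, J_N'(\theta_0))$ converges with limit $V(\theta_0)$, we obtain

$$\sqrt{sN}\, (\hat\theta_N - \theta_0) \xrightarrow{d} N\left(0,\ H_0^{-1} V(\theta_0) H_0^{-1}\right).$$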

References

Anderson, B.D.O. (1985), Identification of scalar errors-in-variables models with dynamics, Automatica 21, pp. 709-716.

Aoki, M. and P.C. Yue (1970a), On a priori error estimates of some identification methods, IEEE Trans. Automatic Control AC-15, pp. 541-548.

Aoki, M. and P.C. Yue (1970b), On certain convergence questions in system identification, SIAM J. Control 8 (2), pp. 239-256.

Bierens, H.J. (1981), Robust Methods and Asymptotic Theory in Nonlinear Econometrics, Lect. Notes Econom. Math. No. 192, Springer, Berlin.

Dieudonné, J. (1969), Foundations of Modern Analysis, Academic Press, New York.

Eising, F., H.N. Linssen and H. Rietbergen (1983), System identification from noisy measurements of inputs and outputs, Systems & Control Letters 2, pp. 348-353.

Mentz, R.P. (1976), On the inverse of some covariance matrices of Toeplitz type, SIAM J. Appl. Math. 31 (3), pp. 426-437.

Rudin, W. (1964), Principles of Mathematical Analysis, McGraw-Hill, New York.

Scales, L.E. (1985), Introduction to Non-linear Optimization, MacMillan, London.

Söderström, T. (1981), Identification of stochastic linear systems in presence of input noise, Automatica 17, pp. 713-725.

Solari, M.E. (1969), The "maximum likelihood solution" of the problem of estimating a linear functional relationship, J.R. Statist. Soc. B 31 (2), pp. 372-375.

Ten Vregelaar, J.M. (1987), An algorithm for computing estimates for parameters of an ARMA-model from noisy measurements of inputs and outputs, Memorandum COSOR 87-13, Eindhoven University of Technology.

Ten Vregelaar, J.M. (1988), On estimating the parameters of a dynamic model from noisy input and output measurements, Memorandum COSOR 88-02, Eindhoven University of Technology.

Tucker, H.G. (1967), A Graduate Course in Probability, Academic Press, New York.

Varberg, D.E. (1968), Almost sure convergence of quadratic forms in independent random variables, Annals of Math. Stat. 39 (5), pp. 1502-1506.

Whittle, P. (1960), Bounds for the moments of linear and quadratic forms in independent variables, Theory of Prob. Appl. 5, pp. 302-305.

Whittle, P. (1964), On the convergence to normality of quadratic forms in independent variables, Theory of Prob. Appl. 9, 1, pp. 103-108.

Wilkinson, J.H. (1965), The Algebraic Eigenvalue Problem, Clarendon Press, Oxford.

Wolovich, W.A. (1974), Linear Multivariable Systems, Springer Verlag, New York.