Parameter estimation from noisy observations of inputs and outputs
Citation for published version (APA):
Vregelaar, ten, J. M. (1988). Parameter estimation from noisy observations of inputs and outputs. (Memorandum COSOR; Vol. 8813). Technische Universiteit Eindhoven.
Document status and date: Published: 01/01/1988
Document version: Publisher's PDF, also known as Version of Record
Department of Mathematics and Computing Science
Memorandum COSOR 88-13
Parameter estimation from noisy observations of inputs and outputs
by
J.M. ten Vregelaar
Eindhoven University of Technology
Department of Mathematics and Computing Science
P.O. Box 513
5600 MB Eindhoven
The Netherlands
Eindhoven, May 1988
In this paper an algorithm is given to compute least squares estimates for the parameters of a dynamic model from noisy measurements of inputs and outputs. The corresponding estimators are proven to be strongly consistent and asymptotically normal under some assumptions.
Keywords: Least squares parameter estimation, ARMA-representation, Errors-in-variables, Consistency, Asymptotic normality.
1. Introduction
In this paper we discuss the least squares parameter estimation method for a dynamic errors-in-variables model. An efficient algorithm to compute estimates for the unknown parameters is presented in Section 2.
In Section 3 we deal with 5 assumptions which turn out to be sufficient conditions for the strong consistency of the least squares estimators. The latter is proved in Section 4. Under some additional conditions these estimators are asymptotically normally distributed, as is shown in Section 5.
We consider a dynamic model for inputs $\xi_t$ and outputs $\eta_t$, represented by an ARMA-description:
$$\eta_t = \sum_{i=1}^{p} \alpha_i \eta_{t-i} + \sum_{j=0}^{q} \beta_j \xi_{t-j}, \qquad t = m+1, m+2, \ldots \qquad (1.1)$$
with known orders $p$ and $q$ and $m := \max(p, q)$.
Inputs $\xi_t$ and outputs $\eta_t$ are vectors in $\mathbb{R}^r$ and $\mathbb{R}^s$ respectively, allowing for MIMO models. Both inputs and outputs are supposed to be measured with noises $\delta_t$ and $\varepsilon_t$ respectively (see figure below):
$$y_t = \eta_t + \varepsilon_t, \qquad x_t = \xi_t + \delta_t, \qquad t = 1, 2, \ldots, m+N. \qquad (1.2)$$
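[Figure: block diagram. The input $\xi$ drives the ARMA system, which produces the output $\eta$; the noises $\delta$ and $\varepsilon$ are added to give the observations $x$ and $y$.]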
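In (1.2), $m+N$ represents the total number of measurements $(x_t, y_t)$. The problem is now to estimate the unknown parameter matrix (with size $s \times [ps + (q+1)r]$)
$$\theta = [\alpha_1 \;\; \alpha_2 \;\; \cdots \;\; \alpha_p \;\; \beta_0 \;\; \beta_1 \;\; \cdots \;\; \beta_q] \qquad (1.3)$$
from the set of data $\{(x_1, y_1), (x_2, y_2), \ldots, (x_{m+N}, y_{m+N})\}$.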
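To make the data-generating mechanism (1.1)-(1.2) concrete, the following is a minimal sketch for the SISO case $s = r = 1$. The function name `simulate_eiv`, the uniform input distribution, the zero initial outputs, the Gaussian noise and all numerical values are assumptions chosen for illustration; none of them come from the paper.

```python
import random

def simulate_eiv(alpha, beta, n_obs, noise_sd, seed=0):
    """Simulate a SISO instance of (1.1)-(1.2).

    alpha = [a_1, ..., a_p], beta = [b_0, ..., b_q].
    Returns (xi, eta, x, y), lists of length m + n_obs; index 0 holds t = 1.
    """
    rng = random.Random(seed)
    p, q = len(alpha), len(beta) - 1
    m = max(p, q)
    total = m + n_obs
    xi = [rng.uniform(-1.0, 1.0) for _ in range(total)]   # bounded true inputs
    eta = [0.0] * total            # first m outputs: initial values (assumed zero)
    for t in range(m, total):      # model equation (1.1)
        ar = sum(alpha[i - 1] * eta[t - i] for i in range(1, p + 1))
        ma = sum(beta[j] * xi[t - j] for j in range(q + 1))
        eta[t] = ar + ma
    # observations (1.2): both signals are measured with additive noise
    x = [v + rng.gauss(0.0, noise_sd) for v in xi]
    y = [v + rng.gauss(0.0, noise_sd) for v in eta]
    return xi, eta, x, y

xi, eta, x, y = simulate_eiv(alpha=[0.5], beta=[1.0, 0.3], n_obs=50, noise_sd=0.1)
print(len(x), len(y))
```

Only the pairs `(x[t], y[t])` would be available to an estimator; `xi` and `eta` are returned here purely so that the model equation can be checked.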
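We introduce the least squares estimation method for this problem. Estimates for $\theta$ are obtained by minimizing, with respect to $\theta$, $\xi_1, \ldots, \xi_{m+N}$, $\eta_1, \ldots, \eta_{m+N}$, the sum of squares
$$\sum_{t=1}^{m+N} \left( \|y_t - \eta_t\|^2 + \|x_t - \xi_t\|^2 \right) \qquad (1.4)$$
subject to the model equation (1.1) for $t = m+1, \ldots, m+N$.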
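In this paper $\|X\|$ denotes $(\operatorname{tr} X^T X)^{1/2}$ for any real matrix $X$. It is convenient to employ short-hand notations for model and observations. We rewrite (1.1) for $t = m+1, \ldots, m+N$ as a matrix-vector equation
$$D(\theta)\,\zeta = 0, \qquad (1.5a)$$
where
$$D = \begin{bmatrix} -I & \alpha_1 & \cdots & \alpha_m & & & \beta_0 & \cdots & \beta_m & & \\ & \ddots & \ddots & & \ddots & & & \ddots & & \ddots & \\ & & -I & \alpha_1 & \cdots & \alpha_m & & & \beta_0 & \cdots & \beta_m \end{bmatrix} \qquad (1.5b)$$
and
$$\zeta = \begin{bmatrix} \eta_{m+N} \\ \vdots \\ \eta_1 \\ \xi_{m+N} \\ \vdots \\ \xi_1 \end{bmatrix} \in \mathbb{R}^{(s+r)(m+N)}. \qquad (1.5c)$$
Matrix $D$ is $sN \times (s+r)(m+N)$; empty space represents zero elements in (1.5b). Furthermore, $I$ is the identity matrix and we define $\alpha_k := 0$ for $k > p$ and $\beta_k := 0$ for $k > q$. Observation and noise vectors $z$ and $e$ are defined corresponding to $\zeta$. Then
$$z = \zeta + e \qquad (1.6)$$
is a short-hand notation for (1.2).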
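The band structure of (1.5b) is easy to get wrong by one index, so a small check may help. The sketch below builds $D$ for the SISO case $s = r = 1$ and verifies $D\zeta = 0$ on a noise-free sequence. The helper names `build_D` and `stack_zeta` and the numerical values are assumptions made for this illustration.

```python
def build_D(alpha, beta, n_eq):
    """Return D of (1.5b) for s = r = 1 as a list of rows.

    alpha = [a_1..a_p], beta = [b_0..b_q]; D is n_eq x 2(m + n_eq).
    """
    p, q = len(alpha), len(beta) - 1
    m = max(p, q)
    a = [-1.0] + list(alpha) + [0.0] * (m - p)   # -I, a_1..a_m (zero padded)
    b = list(beta) + [0.0] * (m - q)             # b_0..b_m (zero padded)
    width = m + n_eq
    rows = []
    for i in range(n_eq):
        row = [0.0] * (2 * width)
        for k in range(m + 1):
            row[i + k] = a[k]             # block acting on the outputs
            row[width + i + k] = b[k]     # block acting on the inputs
        rows.append(row)
    return rows

def stack_zeta(eta, xi):
    """zeta of (1.5c): outputs, then inputs, each in reverse time order."""
    return list(reversed(eta)) + list(reversed(xi))

# Noise-free data for p = 1, q = 1 (so m = 1) and N = 4 model equations.
alpha, beta, N = [0.5], [1.0, 0.3], 4
xi = [1.0, 2.0, -1.0, 0.5, 3.0]
eta = [0.0]
for t in range(1, len(xi)):
    eta.append(alpha[0] * eta[t - 1] + beta[0] * xi[t] + beta[1] * xi[t - 1])

D = build_D(alpha, beta, N)
zeta = stack_zeta(eta, xi)
residual = [sum(d * v for d, v in zip(row, zeta)) for row in D]
print(max(abs(r) for r in residual))
```

Row `i` of `D` corresponds to the model equation at time $t = m + N - i$, which is why both halves of `zeta` are stored in reverse time order.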
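Now, minimization problem (1.4) gets into
$$\min_{\theta,\,\zeta} \|z - \zeta\|^2 \quad \text{subject to} \quad D\zeta = 0. \qquad (1.7)$$
To get rid of the constraint and the minimization with respect to $\zeta$ we consider the stationary point equations of the Lagrangian
$$L(\zeta, \lambda) = \tfrac{1}{2}(z - \zeta)^T (z - \zeta) + \lambda^T D \zeta:$$
$$\zeta = (I - D^+ D)\,z, \qquad \lambda = (D^+)^T z. \qquad (1.8)$$
Here, $D^+ := D^T (DD^T)^{-1}$ is the Moore-Penrose inverse of $D$ since $D$ has full row rank. Therefore, (1.7) reduces to
$$\min_{\theta} z^T P(\theta)\, z \quad \text{with} \quad P = D^+ D. \qquad (1.9)$$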
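The reduction from (1.7) to (1.9) can be checked numerically. The sketch below, for a fixed $\theta$ and a small hand-written SISO matrix $D$, computes $\lambda = (DD^T)^{-1} D z$, the minimizing $\zeta = z - D^T \lambda$ and the value $z^T P z$, and confirms that $D\zeta = 0$ and $\|z - \zeta\|^2 = z^T P z$. The dense Gaussian elimination in `solve` is a stand-in chosen for brevity, not the method of the paper; all names and numbers are illustrative assumptions.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def objective(D, z):
    """Return (z^T P z, zeta_hat, lam) with P = D^T (D D^T)^{-1} D."""
    Dz = [sum(d * v for d, v in zip(row, z)) for row in D]
    DDt = [[sum(u * v for u, v in zip(r1, r2)) for r2 in D] for r1 in D]
    lam = solve(DDt, Dz)                                   # lambda = (D^+)^T z
    Dt_lam = [sum(D[i][j] * lam[i] for i in range(len(D))) for j in range(len(z))]
    zeta_hat = [v - w for v, w in zip(z, Dt_lam)]          # (I - P) z
    value = sum(l * d for l, d in zip(lam, Dz))            # z^T P z
    return value, zeta_hat, lam

# D for s = r = 1, p = q = 1, N = 2 with a_1 = 0.5, b_0 = 1.0, b_1 = 0.3.
D = [[-1.0, 0.5, 0.0, 1.0, 0.3, 0.0],
     [0.0, -1.0, 0.5, 0.0, 1.0, 0.3]]
z = [1.0, 0.2, -0.4, 0.7, 0.1, 0.9]      # arbitrary observation vector

value, zeta_hat, lam = objective(D, z)
constraint = [sum(d * v for d, v in zip(row, zeta_hat)) for row in D]
dist2 = sum((a - b) ** 2 for a, b in zip(z, zeta_hat))
print(value, dist2, max(abs(c) for c in constraint))
```

The two printed values of the objective should agree up to rounding, and the constraint residual should be at rounding level.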
As pointed out by Aoki and Yue (1970a) for the SISO model, solutions of (1.9) correspond to maximum likelihood estimates for $\theta$ when the noise is white and Gaussian. However, they only provide algorithms and convergence results for approximate maximum likelihood methods. In a companion paper, Aoki and Yue (1970b), they prove convergence results for the true maximum likelihood estimators in the special case of no input noise.
For a special case, an algorithm and convergence results are provided in Eising, Linssen and Rietbergen (1983). Their arguments are not rigorous and no explicit assumptions are given.
Two related papers, Söderström (1981) and Anderson (1985), should be mentioned here as well. Both authors employ the frequency domain approach and are mainly interested in the identifiability aspect.
2. Algorithm
Since (1.9) has in general no closed-form solution, we need an iterative algorithm to calculate estimates. We propose the Broyden-Fletcher-Goldfarb-Shanno formula, cf. Scales (1985) pp. 89-90, which has good numerical properties. It uses object function and gradient evaluations. Let $J_N$ denote the object function,
$$J_N(\theta) = z^T P(\theta)\, z, \qquad (2.1)$$
then for any component of its gradient $J_N'$,
$$\dot J_N = 2 z^T D^+ \dot D P^\perp z \qquad (2.2)$$
holds, since $\dot P = D^+ \dot D P^\perp + P^\perp \dot D^T (D^+)^T$ (the dot representing $\partial / \partial \theta_i$ for any element $\theta_i$ from $\theta$).
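Formula (2.2) can be sanity-checked against a central finite difference. The sketch below does this for the derivative with respect to $\alpha_1$ in a SISO model with $p = q = 1$, using $\lambda = (D^+)^T z$ so that $z^T D^+ \dot D P^\perp z = \lambda^T \dot D (z - D^T \lambda)$. The helper names, the dense solver and the data are assumptions made for this illustration only.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def build_D(a1, b0, b1, n_eq):
    """D of (1.5b) for p = q = 1 and s = r = 1."""
    width = n_eq + 1
    rows = []
    for i in range(n_eq):
        row = [0.0] * (2 * width)
        row[i], row[i + 1] = -1.0, a1
        row[width + i], row[width + i + 1] = b0, b1
        rows.append(row)
    return rows

def value_and_multiplier(D, z):
    """Return z^T P z and lambda = (D D^T)^{-1} D z."""
    Dz = [sum(d * v for d, v in zip(row, z)) for row in D]
    DDt = [[sum(u * v for u, v in zip(r1, r2)) for r2 in D] for r1 in D]
    lam = solve(DDt, Dz)
    return sum(l * d for l, d in zip(lam, Dz)), lam

def grad_a1(D, z, lam):
    """Formula (2.2) for the derivative with respect to a1."""
    n_eq, n = len(D), len(z)
    # P_perp z = z - D^T lambda
    resid = [z[j] - sum(D[i][j] * lam[i] for i in range(n_eq)) for j in range(n)]
    # dD/da1 has a single entry 1 in row i, at column i + 1
    return 2.0 * sum(lam[i] * resid[i + 1] for i in range(n_eq))

a1, b0, b1, N = 0.5, 1.0, 0.3, 4
z = [0.9, -0.3, 1.1, 0.4, -0.8, 0.5, 1.2, -0.6, 0.2, 0.7]   # length 2(N + 1)

D = build_D(a1, b0, b1, N)
_, lam = value_and_multiplier(D, z)
analytic = grad_a1(D, z, lam)

h = 1e-6
j_plus, _ = value_and_multiplier(build_D(a1 + h, b0, b1, N), z)
j_minus, _ = value_and_multiplier(build_D(a1 - h, b0, b1, N), z)
finite_diff = (j_plus - j_minus) / (2.0 * h)
print(analytic, finite_diff)
```

The two printed numbers should agree to several digits; a mismatch would point to an indexing error in $\dot D$ rather than to the formula itself.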
Here, both $P$ and $P^\perp := I - P$ are orthogonal projection matrices. To evaluate $J_N$ and $J_N'$ we prefer performing a Q-R decomposition of $D^T$ rather than inverting $DD^T$. The matrix $D^T$ has full column rank, so there exist orthogonal $Q$ and regular $R$ ($sN \times sN$) such that
$$Q^T D^T = \begin{bmatrix} R \\ 0 \end{bmatrix}. \qquad (2.3)$$
Because of the special form of $D^T$, $R$ will be lower triangular. If $Q_1$ is the submatrix of $Q$ consisting of its first $sN$ columns, then
$$D^T = Q_1 R \qquad (2.4)$$
holds.
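Due to the orthogonality of $Q$,
$$P = Q_1 Q_1^T, \qquad (2.5)$$
hence
$$J_N = \|Q_1^T z\|^2. \qquad (2.6)$$
Furthermore, when $\lambda := (D^+)^T z$ (cf. (1.8)) then
$$D^T \lambda = P z \qquad (2.7)$$
and (2.2) implies
$$\dot J_N = 2 \lambda^T \dot D (z - D^T \lambda). \qquad (2.8)$$
Premultiplying (2.7) by $Q_1^T$ gives via (2.4) and (2.5)
$$R \lambda = Q_1^T z. \qquad (2.9)$$
Summarizing, when matrix $R$ and vector $u := Q_1^T z \in \mathbb{R}^{sN}$ are computed from
$$Q^T [D^T \mid z] = \begin{bmatrix} R & u \\ 0 & * \end{bmatrix}, \qquad (2.10)$$
then, from (2.6), (2.8) and (2.9) we obtain
$$J_N = \|u\|^2 \qquad (2.11a)$$
and
$$\dot J_N = 2 \lambda^T \dot D (z - D^T \lambda), \qquad (2.11b)$$
where $\lambda$ is easily solved from
$$R \lambda = u, \qquad (2.11c)$$
since $R$ is lower triangular.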
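The evaluation scheme (2.10)-(2.11) can be illustrated on a tiny example. The sketch below is not the structured Householder procedure of the paper: it uses plain modified Gram-Schmidt on the columns of $D^T$, which yields an upper triangular $R$ (the lower triangular form mentioned above corresponds to a different ordering). It still satisfies $D^T = Q_1 R$, so $u = Q_1^T z$, $J_N = \|u\|^2$ and $R\lambda = u$ carry over unchanged. Function names and data are illustrative assumptions.

```python
import math

def thin_qr(cols):
    """Modified Gram-Schmidt on a list of column vectors.

    Returns (q_cols, R): orthonormal columns and an upper triangular R
    with cols[j] = sum_i R[i][j] * q_cols[i].
    """
    k = len(cols)
    q_cols, R = [], [[0.0] * k for _ in range(k)]
    for j, col in enumerate(cols):
        v = list(col)
        for i, q in enumerate(q_cols):
            R[i][j] = sum(a * b for a, b in zip(q, v))
            v = [a - R[i][j] * b for a, b in zip(v, q)]
        R[j][j] = math.sqrt(sum(a * a for a in v))
        q_cols.append([a / R[j][j] for a in v])
    return q_cols, R

def back_substitute(R, u):
    """Solve R lam = u for upper triangular R."""
    k = len(u)
    lam = [0.0] * k
    for i in range(k - 1, -1, -1):
        lam[i] = (u[i] - sum(R[i][j] * lam[j] for j in range(i + 1, k))) / R[i][i]
    return lam

# D for s = r = 1, p = q = 1, N = 2; the columns of D^T are the rows of D.
D = [[-1.0, 0.5, 0.0, 1.0, 0.3, 0.0],
     [0.0, -1.0, 0.5, 0.0, 1.0, 0.3]]
z = [1.0, 0.2, -0.4, 0.7, 0.1, 0.9]

q_cols, R = thin_qr(D)
u = [sum(a * b for a, b in zip(q, z)) for q in q_cols]   # u = Q_1^T z
J = sum(v * v for v in u)                                # (2.11a)
lam = back_substitute(R, u)                              # (2.11c)

# Consistency checks: J = (D z)^T lam and D (z - D^T lam) = 0.
Dz = [sum(d * v for d, v in zip(row, z)) for row in D]
proj = [sum(D[i][j] * lam[i] for i in range(len(D))) for j in range(len(z))]
constraint = [sum(d * (v - w) for d, v, w in zip(row, z, proj)) for row in D]
print(J, sum(a * b for a, b in zip(Dz, lam)))
```

With `lam` available, each gradient component follows from (2.11b) by one sparse product with $\dot D$, so no matrix inverse is formed at any point.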
The matrix $Q$ in (2.10) is not computed explicitly: by means of Householder matrices, $D^T$ is transformed into $\begin{bmatrix} R \\ 0 \end{bmatrix}$. Using the special Toeplitz and band structure of $D^T$ this can be done very efficiently. For details we refer to Ten Vregelaar (1987).
The computation of $R$ takes $O(N)$ operations, whereas matrix inversion, proposed in Eising, Linssen and Rietbergen (1983), is of order $O(N^2)$.
The computation of $u$ in (2.10) takes $O(N^2)$ operations. Alternatively, once $R$ is computed we can solve $u$ from $R^T u = Dz$, which takes $O(N)$ operations, since $R$ is a band matrix:
$$R = \begin{bmatrix} r_{1,1} & & & & \\ \vdots & \ddots & & & \\ r_{m+1,1} & \cdots & r_{m+1,m+1} & & \\ & \ddots & & \ddots & \\ & & r_{N,N-m} & \cdots & r_{N,N} \end{bmatrix} \quad (\text{all } r_{i,j} \text{ are } s \times s).$$
Therefore, if $N \gg ps^2 + (q+1)sr$ we determine $u$ from $R^T u = Dz$ rather than from (2.10). Besides, Q-R decomposition is known to be a numerically stable procedure.
3. Assumptions
From now on the noise vector $e$ in (1.6) is supposed to be random with zero mean, all its scalar components are stochastically independent and the variance-covariance matrix of $e$ is
$$\operatorname{var} e = \sigma^2 I,$$
where $\sigma$ is unknown. The latter assumption is made to avoid the problem pointed out in Solari (1969), where the stationary point of the likelihood for the SISO model without dynamics ($p = q = 0$) is a saddle point if both the variances on input and output are unknown. However, we may allow for different variances if their quotient is known.
For convenience the object function, which is random now, is multiplied by $\frac{1}{sN}$:
$$J_N(\theta) = \frac{1}{sN}\, z^T P(\theta)\, z. \qquad (3.1)$$
Furthermore, $\theta$ is redefined as a vector in $\mathbb{R}^\mu$, with $\mu = ps^2 + (q+1)sr$, by
$$\theta^T = [(\alpha_1)_{1*} \cdots (\alpha_1)_{s*} \; (\alpha_2)_{1*} \cdots (\alpha_p)_{s*} \; (\beta_0)_{1*} \cdots (\beta_q)_{s*}]. \qquad (3.2)$$
Here $M_{i*}$ and $M_{*j}$ denote row $i$ and column $j$ of any matrix $M$.
The 5 assumptions to be introduced below are more or less commonly used to derive asymptotic properties of estimators.
Let $\Theta$ denote the parameter space, then any least squares estimator is defined by
$$\hat\theta_N = \arg\min_{\theta \in \Theta} J_N(\theta). \qquad (3.3)$$
Assumption 1
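The parameter space $\Theta$ is a known convex and compact subset of $\mathbb{R}^\mu$, containing the unknown true parameter vector $\theta_0$.
Since $J_N$ is almost surely (a.s.) a $C^\infty$-function w.r.t. $\theta$, hence in particular continuous, compactness of $\Theta$ implies that $\hat\theta_N$ defined by (3.3) is indeed a random vector in the sense that it is measurable, cf. Bierens (1981), p. 53.
As an introduction for the next assumption we introduce the polynomial matrices
$$A(\lambda) = -I + \sum_{i=1}^{p} \alpha_i \lambda^i \quad \text{and} \quad B(\lambda) = \sum_{j=0}^{q} \beta_j \lambda^j. \qquad (3.4)$$
With $\lambda$ acting as the backward shift operator, the model equation (1.1) reads
$$A(\lambda)\,\eta_t + B(\lambda)\,\xi_t = 0, \qquad t = m+1, m+2, \ldots \qquad (3.5)$$
Assumption 2
For all $\theta \in \Theta$, $A(\lambda)$ is stable, i.e. the zeros of $\det A(\lambda)$ lie outside the closed unit disk.
We associate matrices $A$ and $B$ with the polynomial matrices $A(\lambda)$ and $B(\lambda)$, by defining
$$A = -I + \sum_{k=1}^{p} S^k \otimes \alpha_k \quad (sN \times sN), \qquad B = \sum_{k=0}^{q} S^k \otimes \beta_k \quad (sN \times rN), \qquad (3.6)$$
where $S$ is the $N \times N$ shift matrix
$$S = \begin{bmatrix} 0 & 1 & & \\ & \ddots & \ddots & \\ & & 0 & 1 \\ & & & 0 \end{bmatrix}. \qquad (3.7)$$
In (3.6), $\otimes$ denotes the Kronecker product for matrices. From (1.5b) it follows immediately that
$$D = [A \;\; C_1 \mid B \;\; C_2] \qquad (3.8)$$
with
$$C_1 = \begin{bmatrix} 0 & & \\ \alpha_m & & \\ \vdots & \ddots & \\ \alpha_1 & \cdots & \alpha_m \end{bmatrix} \quad (sN \times sm) \quad \text{and} \quad C_2 = \begin{bmatrix} 0 & & \\ \beta_m & & \\ \vdots & \ddots & \\ \beta_1 & \cdots & \beta_m \end{bmatrix} \quad (sN \times rm).$$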
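The block decomposition (3.8) can be verified directly in the SISO case, where the Kronecker products in (3.6) reduce to scalar multiples of powers of $S$. The sketch below builds $A$ and $B$ from the shift matrix and compares them with the corresponding column blocks of $D$ from (1.5b). Helper names and coefficient values are assumptions for illustration.

```python
def shift_power(n, k):
    """k-th power of the n x n shift matrix S (ones on the k-th superdiagonal)."""
    return [[1.0 if j == i + k else 0.0 for j in range(n)] for i in range(n)]

def poly_in_shift(coeffs, n):
    """sum_k coeffs[k] * S^k for scalar coefficients (SISO case of (3.6))."""
    out = [[0.0] * n for _ in range(n)]
    for k, c in enumerate(coeffs):
        Sk = shift_power(n, k)
        for i in range(n):
            for j in range(n):
                out[i][j] += c * Sk[i][j]
    return out

alpha, beta, N = [0.5, -0.2], [1.0, 0.3, 0.1], 5   # p = q = m = 2, illustrative
m = 2
A = poly_in_shift([-1.0] + alpha, N)   # A = -I + sum_k alpha_k S^k
B = poly_in_shift(beta, N)             # B = sum_k beta_k S^k

# D of (1.5b), filled in directly from its band structure.
width = m + N
a, b = [-1.0] + alpha, beta
D = [[0.0] * (2 * width) for _ in range(N)]
for i in range(N):
    for k in range(m + 1):
        D[i][i + k] = a[k]
        D[i][width + i + k] = b[k]

A_block = [row[:N] for row in D]                  # first N columns of the output part
C1 = [row[N:width] for row in D]                  # remaining m columns of the output part
B_block = [row[width:width + N] for row in D]     # first N columns of the input part
print(A_block == A, B_block == B)
print(C1[N - m], C1[N - 1])
```

The last line prints the first nonzero row of $C_1$ and its final row, which show the triangular corner $\alpha_m$ growing to $[\alpha_1 \cdots \alpha_m]$.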
Assumption 2 has some important consequences concerning $(DD^T)^{-1}$, which appears in the object function and gradient.
Lemma 1
(i) Some constants $\rho_1$ and $\rho_2$ exist with $0 < \rho_1 < \rho_2 < \infty$ such that
$$\rho_1 I \le (DD^T)^{-1} \le \rho_2 I \quad \text{for all } \theta \in \Theta \text{ and all } N \ge p+1.$$
(By definition: $M_1 \le M_2$ if $x^T M_1 x \le x^T M_2 x$ for all $x$; $M_1$ and $M_2$ are symmetric matrices.)
(ii) There exists a constant $\rho_3 < \infty$ such that
$$\|(DD^T)^{-1}\|_\infty \le \rho_3 \quad \text{for all } \theta \in \Theta \text{ and all } N \ge p+1.$$
(By definition: if $M$ is $n \times n$ then $\|M\|_\infty := \max_{i=1,\ldots,n} \sum_{j=1}^{n} |M_{ij}|$.)
Proof
(i) See (appendix of) Ten Vregelaar (1988), which contains a proof of the analogous results for $AA^T$ and $DD^T$. Then the result for $(DD^T)^{-1}$ follows immediately.
(ii) The matrix $DD^T$ is a block-Toeplitz and band matrix and it is positive definite. Hence it can be interpreted as the covariance matrix of an $s$-variate MA($m$) process. Then for $N \gg m$ we can approximate $(DD^T)^{-1}$ by the covariance matrix $\Sigma_{AR}$ of the corresponding $s$-variate AR($m$) process, see Mentz (1976). Suppose $\Sigma_{AR} = [\sigma(|i-j|)]$, then
$$\sigma(k) = \sum_{i=1}^{m} C_i x_i^k,$$
where $C_i$ are $m \times m$ matrices. Without loss of generality we assume $x_1, \ldots, x_m$ are different. Now $x_1, \ldots, x_m$ can be chosen such that $x := \max_i |x_i| < 1$, hence
$$\|\Sigma_{AR}\|_\infty \le 2 \sum_{k=0}^{\infty} \|\sigma(k)\| \le \text{const} \sum_{k=0}^{\infty} x^k = \frac{\text{const}}{1-x}.$$
Remark
The result $(DD^T)^{-1} \le \rho_2 I$ is a consequence of $\|(DD^T)^{-1}\|_\infty \le \rho_3$, since
$$\max_i |\lambda_i| \le \|(DD^T)^{-1}\|_\infty,$$
where $\lambda_i$ denote the eigenvalues of $(DD^T)^{-1}$, see Wilkinson (1965) p. 59.
Assumption 3
The input sequence is bounded: there exists a constant $M_1$ such that $\|\xi_i\| \le M_1$ for $i = 1, 2, \ldots$
Corollary 1
The sequence of outputs $\{\eta_i\}_{i=1}^{\infty}$ is bounded.
Proof
This well-known BIBO-stability result is an immediate consequence of Assumptions 2 and 3. □
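We prepare the next assumption by rewriting the vector $D\zeta$ in (1.5) as
$$D\zeta = (H + K)\,\theta - \begin{bmatrix} \eta_{m+N} \\ \vdots \\ \eta_{m+1} \end{bmatrix}, \qquad (3.9)$$
where $\theta$ is as defined in (3.2) and $H$ and $K$ are $sN \times \mu$ matrices given by
$$H = [(S \otimes I_s)\bar\eta \;\cdots\; (S^p \otimes I_s)\bar\eta \mid \bar\xi \;\; (S \otimes I_s)\bar\xi \;\cdots\; (S^q \otimes I_s)\bar\xi], \qquad (3.10a)$$
$$K = \ldots \qquad (3.10b)$$
with $I_s$ the $s \times s$ identity matrix,
$$\bar\eta = \begin{bmatrix} I_s \otimes \eta_{m+N}^T \\ \vdots \\ I_s \otimes \eta_{m+1}^T \end{bmatrix} \quad (sN \times s^2) \quad \text{and} \quad \bar\xi = \begin{bmatrix} I_s \otimes \xi_{m+N}^T \\ \vdots \\ I_s \otimes \xi_{m+1}^T \end{bmatrix} \quad (sN \times sr). \qquad (3.11)$$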
Assumption 4
The matrix $\frac{H^T (DD^T)^{-1} H}{sN}$ converges as $N \to \infty$, for all $\theta \in \Theta$. The limiting matrix, say $G(\theta)$, is positive definite on $\Theta$.
Because of Lemma 1, $sN \ge \mu$ is a necessary condition for Assumption 4 to hold. As will be seen in the next section, this assumption also implies a convergence result for $\mathbb{E} J_N(\theta)$, which is one of the tools for proving consistency.
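Generalizing Aoki and Yue (1970a), we can give an interpretation for the convergence of $\frac{H^T (DD^T)^{-1} H}{N}$ to a positive definite matrix in the SISO-case $s = r = 1$. The above defined $\theta$, $H$, $\bar\eta$ and $\bar\xi$ reduce to
$$\theta = [\alpha_1 \cdots \alpha_p \; \beta_0 \cdots \beta_q]^T, \qquad H = [S\bar\eta \;\cdots\; S^p\bar\eta \mid \bar\xi \;\; S\bar\xi \;\cdots\; S^q\bar\xi],$$
$$\bar\eta = [\eta_{m+N} \cdots \eta_{m+1}]^T \quad \text{and} \quad \bar\xi = [\xi_{m+N} \cdots \xi_{m+1}]^T.$$
Defining $v = D\zeta - A\bar\eta - B\bar\xi$, (1.5) and (3.8) imply
$$v = C_1 \begin{bmatrix} \eta_m \\ \vdots \\ \eta_1 \end{bmatrix} + C_2 \begin{bmatrix} \xi_m \\ \vdots \\ \xi_1 \end{bmatrix}.$$
For $\theta = \theta_0$ (notation: superscript 0), $A^0 \bar\eta = -B^0 \bar\xi - v^0$ holds. Then, using $AS^k = S^kA$ for $k = 1, 2, \ldots, m$,
$$A^0 H = [-SB^0\bar\xi \;\cdots\; -S^pB^0\bar\xi \mid A^0\bar\xi \;\cdots\; S^qA^0\bar\xi] + \delta^0,$$
where $\delta^0 := -[Sv^0 \cdots S^pv^0 \mid 0 \cdots 0]$. Hence
$$A^0 H = \Xi E^0 + \delta^0$$
with
$$\Xi := [\bar\xi \;\; S\bar\xi \;\cdots\; S^{p+q}\bar\xi]$$
and
$$E = \begin{bmatrix} 0 & & & -1 & & \\ -\beta_0 & \ddots & & \alpha_1 & \ddots & \\ \vdots & \ddots & 0 & \vdots & \ddots & -1 \\ -\beta_q & & -\beta_0 & \alpha_p & & \alpha_1 \\ & \ddots & \vdots & & \ddots & \vdots \\ & & -\beta_q & & & \alpha_p \end{bmatrix}, \quad (p+q+1) \times (p+q+1).$$
The effect of $\delta^0$ in $\frac{H^T (DD^T)^{-1} H}{N}$ vanishes, whence
$$\lim_{N \to \infty} \frac{H^T (DD^T)^{-1} H}{N} = (E^0)^T \lim_{N \to \infty} \frac{\Xi^T (A^0)^{-T} (DD^T)^{-1} (A^0)^{-1} \Xi}{N}\, E^0. \qquad (3.12)$$
The matrices $(A^0)^{-T}(A^0)^{-1}$ and $(DD^T)^{-1}$ can be bounded in the sense of Lemma 1. Therefore, provided the limits exist, the limit in the left hand side of (3.12) is positive definite if and only if
(i) $\lim_{N \to \infty} \frac{\Xi^T \Xi}{N} > 0$,
(ii) $E^0$ is regular.
Condition (i) could be a definition of persistency of excitation of order $p+q$ for the input sequence $\{\xi_i\}_{i=1}^{\infty}$ (see Aoki and Yue (1970a), p. 544), whereas the second condition is equivalent to the statement that the polynomials $A(\lambda)$ and $B(\lambda)$ in (3.4) are coprime (see Wolovich (1974), pp. 234-236). In turn, the latter is equivalent to the statement, in state-space terminology, that the system is controllable if it is observable.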
The last assumption requires the fourth moment of the noises to be uniformly bounded.
Assumption 5
Let $e_i$ denote (scalar) component $i$ of the noise vector $e$. There exists a constant $M_2$ such that
$$\mathbb{E}\, e_i^4 \le M_2 \quad \text{for } i = 1, 2, \ldots$$
4. Consistency
Consistency is obtained by applying the following argument: when the object function converges in some sense uniformly on a compact set to a continuous limit function which is uniquely minimal in the true value of the parameter vector, then any minimizing solution converges in that sense to the true value, see Bierens (1981), p. 54, 65.
In the sequel uniform convergence refers to convergence with respect to $\theta$ on the convex and compact $\Theta$.
Let us start by proving two lemmas.
Lemma 2
For all $\theta \in \Theta$,
$$\lim_{N \to \infty} \mathbb{E} J_N(\theta) = J(\theta),$$
where $J(\theta) := \sigma^2 + (\theta - \theta_0)^T G(\theta) (\theta - \theta_0)$.
Proof
Observe that the object function defined by (3.1) has mean
$$\mathbb{E} J_N = \sigma^2 + \frac{1}{sN}\, \zeta^T P \zeta = \sigma^2 + (\theta - \theta_0)^T \frac{(H+K)^T (DD^T)^{-1} (H+K)}{sN} (\theta - \theta_0) \qquad (4.1)$$
by (3.11). The lemma follows from Assumption 4, since $K$ has a finite number of nonzero elements. □
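The mean in (4.1) can be traced in two short steps. The following is a sketch that uses only $z = \zeta + e$, $\mathbb{E} e = 0$, $\operatorname{var} e = \sigma^2 I$, the fact that $P$ is an orthogonal projection of rank $sN$, and relation (3.9) evaluated at $\theta$ and at $\theta_0$. Here $\zeta$ is the true vector, so $D(\theta_0)\zeta = 0$, and the last term of (3.9) does not depend on $\theta$.

```latex
\begin{align*}
\mathbb{E}\, z^T P z
  &= \zeta^T P \zeta + 2\,\zeta^T P\, \mathbb{E} e + \mathbb{E}\, e^T P e \\
  &= \zeta^T P \zeta + \sigma^2 \operatorname{tr} P
   = \zeta^T P \zeta + \sigma^2 sN , \\[4pt]
\zeta^T P \zeta
  &= (D\zeta)^T (DD^T)^{-1} (D\zeta), \qquad
  D(\theta)\zeta = D(\theta)\zeta - D(\theta_0)\zeta = (H+K)(\theta - \theta_0) .
\end{align*}
```

Dividing the first identity by $sN$ and substituting the second gives both expressions in (4.1).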
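Lemma 3
The sequence of functions $\{\mathbb{E} J_N(\theta)\}$ is equicontinuous on $\Theta$, i.e. (see Rudin (1964)) for every $\varepsilon > 0$ there exists a $\delta > 0$ such that
$$|\mathbb{E} J_N(\theta_1) - \mathbb{E} J_N(\theta_2)| < \varepsilon \quad \text{for all } N \text{ and all } \theta_1, \theta_2 \in \Theta \text{ with } \|\theta_1 - \theta_2\| < \delta.$$
Proof
According to the mean value theorem,
$$|\mathbb{E} J_N(\theta_1) - \mathbb{E} J_N(\theta_2)| \le \left\| \frac{\partial}{\partial\theta} \mathbb{E} J_N(\bar\theta_N) \right\| \, \|\theta_1 - \theta_2\|$$
for some $\bar\theta_N \in \Theta$ with $\|\bar\theta_N - \theta_2\| \le \|\theta_1 - \theta_2\|$.
If the dot denotes $\frac{\partial}{\partial\theta_i}$ for arbitrary $\theta_i$, we have
$$\frac{\partial}{\partial\theta_i} \mathbb{E} J_N(\theta) = \frac{\zeta^T \dot P \zeta}{sN}.$$
We recall from (2.2) that $\dot P = D^+ \dot D P^\perp + P^\perp \dot D^T (D^+)^T$. Using Lemma 1 part (i), it is easy to verify that there exists a constant $k > 0$ such that
$$-kI \le \dot P \le kI \quad \text{holds for all } \theta \in \Theta \text{ and } N \ge \frac{\mu}{s}.$$
Hence
$$\left| \frac{\zeta^T \dot P \zeta}{sN} \right| \le k\, \frac{\zeta^T \zeta}{sN}.$$
By virtue of Assumption 3 and its Corollary 1, it follows that $\left\| \frac{\partial}{\partial\theta} \mathbb{E} J_N(\bar\theta_N) \right\| \le \bar k$ for some constant $\bar k$, which proves the lemma (take $\delta = \varepsilon / \bar k$). □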
Proposition 1
$\mathbb{E} J_N(\theta) \to J(\theta)$ as $N \to \infty$, uniformly.
Proof
The proposition is a consequence of Lemmas 2 and 3, cf. Dieudonné (1969) Theorem 7.5.6. □
Remark
The uniform limit function $J$ is continuous on $\Theta$, and since $\Theta$ is compact it is even uniformly continuous on $\Theta$.
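In order to obtain a convergence result for $J_N(\theta)$ we investigate
$$L_N(\theta) := J_N(\theta) - \mathbb{E} J_N(\theta).$$
Lemma 4
$L_N(\theta) \to 0$ a.s., for all $\theta \in \Theta$.
Proof
We have
$$L_N(\theta) = \frac{2}{sN}\, e^T P(\theta) \zeta + \frac{1}{sN}\, e^T P(\theta) e - \sigma^2. \qquad (4.2)$$
By application of a result in Whittle (1960), p. 302 and Assumption 5, we obtain from Assumption 3 and Corollary 1 that
$$\sum_{N=1}^{\infty} \mathbb{P}\left( \left| \frac{2}{sN}\, e^T P(\theta) \zeta \right| \ge \varepsilon \right) < \infty \quad \text{for all } \varepsilon > 0.$$
Due to the Borel-Cantelli Lemma, it follows that
$$\frac{2}{sN}\, e^T P(\theta) \zeta \to 0 \quad \text{a.s.}$$
On the other hand,
$$\mathbb{P}\left( \left| \frac{e^T P(\theta) e}{sN} - \sigma^2 \right| \ge \varepsilon \right) \le \frac{1}{s^2 N^2 \varepsilon^2} \operatorname{var}\left( e^T P(\theta) e \right) \le \frac{\text{const}}{\varepsilon^2\, sN}$$
holds, by consequence of another result in Whittle (1960) p. 302 and Assumption 5. Hence $\frac{e^T P(\theta) e}{sN}$ converges to $\sigma^2$ in probability. Now $\frac{P}{sN} \ge 0$ and $\operatorname{tr} \frac{P}{sN} = 1$ imply
$$\frac{e^T P(\theta) e}{sN} \to \sigma^2 \quad \text{a.s.,}$$
cf. Varberg (1968) Corollary 3. □
Lemma 5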
The sequence of random functions $\{L_N(\theta)\}$ is a.s. equicontinuous on $\Theta$, i.e. if $\{\Omega, \mathcal{F}, \mathbb{P}\}$ is the probability space involved, there exists a null set $E \subset \Omega$ (i.e. $\mathbb{P}(E) = 0$) such that for every $\omega \in \Omega - E$ the sequence $\{L_N(\theta, \omega)\}$ is equicontinuous on $\Theta$.
Proof
The argument is analogous to that given in the proof of Lemma 3. If the dot denotes $\frac{\partial}{\partial\theta_i}$ again, then
$$\left| \frac{e^T \dot P e}{sN} \right| \le k\, \frac{e^T e}{sN} \to k\, \frac{s+r}{s}\, \sigma^2 \quad \text{a.s.}$$
by the Kolmogorov strong law of large numbers ($\operatorname{var} e_i^2 \le \text{const}$ by Assumption 5), see Tucker (1967) p. 124. Then using the Cauchy-Schwarz inequality it follows
$$\left| \frac{e^T \dot P \zeta}{sN} \right| \le \sqrt{\frac{e^T e}{sN}}\, \sqrt{\frac{\zeta^T \dot P^2 \zeta}{sN}} \le k\, \sqrt{\frac{e^T e}{sN}}\, \sqrt{\frac{\zeta^T \zeta}{sN}} \le \text{const}\, \sqrt{\frac{e^T e}{sN}} \to \text{const}\; \sigma \sqrt{\frac{s+r}{s}} \quad \text{a.s.,}$$
see again the proof of Lemma 3.
It is obvious now that for the gradient vector $L_N'$, $\|L_N'(\theta)\|$ is a.s. bounded, uniformly on $\Theta$, i.e. there exist a constant $c$ and a null set $E$ such that for every $\omega \in \Omega - E$ there exists some integer $N_1(\omega)$ with
$$\|L_N'\| \le c \quad \text{for } N > N_1(\omega) \text{ and for all } \theta \in \Theta.$$
Applying the mean value theorem again, it follows that for every $\omega \in \Omega - E$ and every $\varepsilon > 0$ there exists a $\delta := \frac{\varepsilon}{c} > 0$ such that
$$|L_N(\theta_1, \omega) - L_N(\theta_2, \omega)| \le \|L_N'(\bar\theta_N, \omega)\| \, \|\theta_1 - \theta_2\| < \varepsilon$$
for $N > N_1(\omega)$ and for $\theta_1, \theta_2 \in \Theta$ with $\|\theta_1 - \theta_2\| < \delta$. □
Proposition 2
$L_N(\theta) \to 0$ a.s., uniformly, i.e. $\sup_{\theta \in \Theta} |L_N(\theta)| \to 0$ a.s.
Proof
Let $\varepsilon > 0$ and $\theta \in \Theta$ be arbitrary. Then, by virtue of Lemma 4 and the a.s. boundedness of $\|L_N'(\theta)\|$, uniformly on $\Theta$ (see the proof of Lemma 5), there exist a neighbourhood $U$ of $\theta$ and a null set $E$ such that for every $\omega \in \Omega - E$ there exists an integer $N_0$, satisfying
$$|L_N(\theta', \omega)| < \varepsilon \quad \text{for all } N > N_0 \text{ and all } \theta' \in U.$$
This follows by applying the mean value theorem. Since $\Theta$ is compact, the claim results from covering $\Theta$ by a finite subcover of the union of all neighbourhoods $U$. □
Now we are able to give the main result of this section.
Theorem 1
Under Assumptions 1-5, any sequence of estimators $\{\hat\theta_N\}$, defined by (3.3), is strongly consistent for the true parameter vector $\theta_0$, i.e.
$$\hat\theta_N \to \theta_0 \quad \text{a.s.}$$
Proof
Propositions 1 and 2 imply $J_N \to J$ a.s., uniformly. The limit function $J$, which has been defined in (4.1), has on $\Theta$ a unique minimum in $\theta_0$, see Assumption 4. Furthermore $J$ is continuous according to the remark at Proposition 1. The theorem is an application of Lemma 3.1.3 in Bierens (1981). □
Remark
An estimate for the unknown variance $\sigma^2$ is given by $J_N(\hat\theta_N)$: as a consequence of Theorem 1 and $J_N \to J$ a.s., uniformly,
$$J_N(\hat\theta_N) \to J(\theta_0) = \sigma^2 \quad \text{a.s.}$$
5. Asymptotic normality
In this section the asymptotic normality property of any sequence $\{\hat\theta_N\}$ defined by (3.3) is shown. We need some additional assumptions.
Assumption 1a
The true parameter vector $\theta_0$ is an interior point of the parameter space $\Theta$.
A common method is starting from the Taylor expansion of $J_N'(\hat\theta_N)$:
$$J_N'(\hat\theta_N) = J_N'(\theta_0) + H_N (\hat\theta_N - \theta_0). \qquad (5.1)$$
Here $H_N$ is the matrix of second derivatives evaluated in some mean value points $\bar\theta_N^i$:
$$(H_N)_{i,j} = \frac{\partial^2 J_N}{\partial\theta_j\, \partial\theta_i} (\bar\theta_N^i), \qquad i, j = 1, \ldots, \mu, \qquad (5.2a)$$
with
$$\|\bar\theta_N^i - \theta_0\| \le \|\hat\theta_N - \theta_0\| \quad \text{for } i = 1, \ldots, \mu. \qquad (5.2b)$$
From the assumption above and Theorem 1 it follows
$$\sqrt{sN}\, J_N'(\hat\theta_N) = 0 \quad \text{a.s., for } N \text{ sufficiently large.}$$
Then, by (5.1),
$$\sqrt{sN}\, J_N'(\theta_0) + H_N \sqrt{sN}\, (\hat\theta_N - \theta_0) = 0 \quad \text{a.s., for } N \text{ sufficiently large.} \qquad (5.3)$$
Let $J_N''$ denote the $\mu \times \mu$ matrix of second derivatives of $J_N$. It can be verified that $\{\mathbb{E} J_N''\}$ is bounded, uniformly on $\Theta$.
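Assumption 6
For all $\theta \in \Theta$, $\mathbb{E} J_N''$ converges as $N \to \infty$.
Proposition 3
As $N \to \infty$, $J_N' \to J'$ a.s., uniformly, and $J_N'' \to J''$ a.s., uniformly, hold.
Proof
Analogous to Proposition 1, it follows that $\mathbb{E} J_N''$ converges uniformly. By virtue of Theorem 8.6.3 in Dieudonné (1969), this implies $J$ is a $C^2$-function and $\mathbb{E} J_N' \to J'$ and $\mathbb{E} J_N'' \to J''$.
The proof of
$$J_N'(\theta) - \mathbb{E} J_N'(\theta) \to 0 \quad \text{a.s., uniformly}$$
and
$$J_N''(\theta) - \mathbb{E} J_N''(\theta) \to 0 \quad \text{a.s., uniformly}$$
is similar to that of Proposition 2: essential are the results
$$\left( \frac{\partial P}{\partial\theta_i} \right)^2 \le k_1 I, \qquad \left( \frac{\partial^2 P}{\partial\theta_j\, \partial\theta_i} \right)^2 \le k_2 I \qquad \text{and} \qquad \left( \frac{\partial^3 P}{\partial\theta_k\, \partial\theta_j\, \partial\theta_i} \right)^2 \le k_3 I,$$
for some constants $k_1$, $k_2$ and $k_3$, which follow from Lemma 1 part (i) after some calculations; the a.s. convergence of
$$\frac{1}{sN}\, e^T \frac{\partial P}{\partial\theta_i}\, e \quad \text{and} \quad \frac{1}{sN}\, e^T \frac{\partial^2 P}{\partial\theta_j\, \partial\theta_i}\, e$$
to zero (since $\operatorname{tr} \frac{\partial P}{\partial\theta_i} = \operatorname{tr} \frac{\partial^2 P}{\partial\theta_j\, \partial\theta_i} = 0$) is a consequence of Varberg (1968) Theorem 3. Compare also Lemma 4. The results of the proposition are obvious now. □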
Corollary 2
$$H_N \to H_0 := J''(\theta_0) > 0 \quad \text{a.s.}$$
Proof
By virtue of Theorem 1, (5.2) and Proposition 3 we obtain the a.s. convergence of $H_N$ to $J''(\theta_0)$. Since $J''(\theta_0) = 2\, G(\theta_0)$, see (4.1), it is positive definite by Assumption 4. □
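The identity $J''(\theta_0) = 2\,G(\theta_0)$ used in this proof follows from the quadratic form of $J$ in Lemma 2. The short derivation below is a sketch that assumes $G$ is symmetric and twice differentiable near $\theta_0$; the componentwise notation $d := \theta - \theta_0$ is introduced here only for the calculation.

```latex
\begin{align*}
J(\theta) &= \sigma^2 + \sum_{k,l} d_k\, G_{kl}(\theta)\, d_l ,
  \qquad d := \theta - \theta_0 , \\
\frac{\partial J}{\partial \theta_i}
  &= \sum_{l} G_{il}(\theta)\, d_l + \sum_{k} d_k\, G_{ki}(\theta)
   + \sum_{k,l} d_k\, \frac{\partial G_{kl}}{\partial \theta_i}(\theta)\, d_l , \\
\frac{\partial^2 J}{\partial \theta_j\, \partial \theta_i}(\theta_0)
  &= G_{ij}(\theta_0) + G_{ji}(\theta_0) = 2\, G_{ij}(\theta_0) .
\end{align*}
```

Every term that still carries a factor $d_k$ after the second differentiation vanishes at $\theta = \theta_0$, which is why the derivatives of $G$ do not appear in the result.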
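It remains to show asymptotic normality for $\sqrt{sN}\, J_N'(\theta_0)$, see (5.3). We define
$$S_N := \sqrt{sN}\; \lambda_N^T\, J_N'(\theta_0), \qquad (5.4)$$
where $\{\lambda_N\}$ is any sequence of normed vectors in $\mathbb{R}^\mu$. If $\lambda_N := [\lambda_{N,1}, \ldots, \lambda_{N,\mu}]^T$ then
$$S_N = z^T L z, \qquad (5.5a)$$
with
$$L := \frac{1}{\sqrt{sN}} \sum_{i=1}^{\mu} \lambda_{N,i}\, P^i(\theta_0). \qquad (5.5b)$$
Here $P^i$ denotes $\frac{\partial P}{\partial\theta_i}$, which is $(s+r)(m+N) \times (s+r)(m+N)$. Due to $\operatorname{tr} L = 0$ and $\zeta^T L \zeta = 0$,
$$\mathbb{E}\, S_N = 0$$
holds.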
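To denote the model equations as $D(\theta)\zeta = 0$ we choose now, instead of (1.5),
$$D = \begin{bmatrix} \gamma_0 & \cdots & \gamma_m & & \\ & \ddots & & \ddots & \\ & & \gamma_0 & \cdots & \gamma_m \end{bmatrix}, \quad sN \times (s+r)(m+N), \qquad (5.7a)$$
and
$$\zeta^T = [\eta_{m+N}^T \;\; \xi_{m+N}^T \;\cdots\; \eta_1^T \;\; \xi_1^T]. \qquad (5.7b)$$
Likewise the vectors $z$ and $e$ in $\mathbb{R}^{(s+r)(m+N)}$ are redefined in this way, hence (1.6) carries over. In (5.7a), $\gamma_k := [\alpha_k \;\; \beta_k]$ for $k = 0, 1, \ldots, m$ and $\alpha_0 := -I$.
Assumption 5 is replaced by the stronger
Assumption 5'
If $e = [e_1 \cdots e_{(s+r)(m+N)}]^T$ then
a) there exist a $\delta > 0$ and a constant $M_2 < \infty$ such that $\mathbb{E} |e_i|^{4+\delta} \le M_2$, for all $i = 1, 2, \ldots$
b) $\mathbb{E}\, e_i^3 = 0$ for $i = 1, \ldots, (s+r)(m+N)$.
In order to apply some central limit theorem, matrix $D$ in (5.7a) is partitioned as
$$D = [\Gamma \;\; C], \qquad (5.8a)$$
where
$$\Gamma = \begin{bmatrix} \gamma_0 & \cdots & \gamma_m & & \\ & \ddots & & \ddots & \\ & & \ddots & & \gamma_m \\ & & & \ddots & \vdots \\ & & & & \gamma_0 \end{bmatrix}, \quad sN \times (s+r)N, \qquad (5.8b)$$
and
$$C = \begin{bmatrix} 0 & & \\ \gamma_m & & \\ \vdots & \ddots & \\ \gamma_1 & \cdots & \gamma_m \end{bmatrix}, \quad sN \times (s+r)m. \qquad (5.8c)$$
With a tilde denoting the quantities that belong to $m + N^2$ measurements, we write
$$\tilde D = \tilde\Gamma + \tilde C, \qquad (5.9a)$$
where
$$\tilde\Gamma := \begin{bmatrix} \Gamma & & & 0 \\ & \ddots & & \vdots \\ & & \Gamma & 0 \end{bmatrix} \qquad (5.9b)$$
and
$$\tilde C := \begin{bmatrix} 0 & C & & \\ \vdots & & \ddots & \\ 0 & & & C \end{bmatrix}, \qquad (5.9c)$$
the number of block matrices being $N$.
Then, we obtain for $S_{N^2}$,
$$S_{N^2} = \tilde z^T \tilde L \tilde z = \tilde z^T \tilde M \tilde z + \tilde z^T \tilde R \tilde z. \qquad (5.10)$$
Here $\tilde R := \tilde L - \tilde M$ and $\tilde z$ is the vector in $\mathbb{R}^{(s+r)(m+N^2)}$ corresponding to $m + N^2$ measurements.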
4
SN
-;=== converges in distribution to the standard normal distribution.
""varSN
Proof
Consider SN2 defined in (S.10).
After some tedious calculations in which Lemma 1 plays a key role, it is obtained that
variTiz
~
0 as N -+ co •varSN
2 (S.9a) (S.9b) (S.9c) (S.lO)According to Bernstein's lemma (see e.g. Whittle (1964) pp. lOS-H>6) asymptotic normality of
zT
Mz
implies asymptotic normality of SN2 •-T - N
Z
MZ=
1:
Xi ,i=1 -T
-where Xi =Zj M Zj •
and the vectors
Zj
e R(s+r)N are defined by -z= _ZN
Z
It is now easy to verify that for 0 defined in Assumption 5' a,
N 2
1:
IE I Xj-IE Xi I ,-H);=1 -&2
-T - 1+1i12 < const N •
(varz
MZ)
hence tending to zero as N -7 00. Therefore by virtue of the Liapounov central limit theorem,
zT
Mz
is asymptotically normal.Now SN2. is asymptotically normal. For those N with
...fii
E IN we use a somewhat differentparti-tion of the matrix D.
0
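Corollary 3
$$\left[ \operatorname{var}\left( \sqrt{sN}\, J_N'(\theta_0) \right) \right]^{-1/2} \sqrt{sN}\, J_N'(\theta_0) \xrightarrow{d} N(0, I),$$
i.e. converges in distribution to the standard multivariate normal distribution. The square root is arbitrary.
Remark
By virtue of Assumption 4, $\operatorname{var}\left( \sqrt{sN}\, J_N'(\theta_0) \right) > 0$ for $N$ sufficiently large.
Proof
Immediate from the definition of $S_N$ and Proposition 4. □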
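Theorem 2
Under Assumptions 1-4, 1a, 5' and 6, any sequence $\{\hat\theta_N\}$ defined by (3.3) satisfies
$$\left[ \operatorname{var}\left( \sqrt{sN}\, J_N'(\theta_0) \right) \right]^{-1/2} H_0\, \sqrt{sN}\, (\hat\theta_N - \theta_0) \xrightarrow{d} N(0, I).$$
Proof
By virtue of (5.3) and Corollary 3,
$$-\left[ \operatorname{var}\left( \sqrt{sN}\, J_N'(\theta_0) \right) \right]^{-1/2} H_N\, \sqrt{sN}\, (\hat\theta_N - \theta_0) \xrightarrow{d} N(0, I) \qquad (5.11)$$
holds. Corollary 2 implies $H_N^{-1} \to H_0^{-1}$ a.s. and hence $H_0 H_N^{-1} \to I$ a.s. Thereby,
$$\left[ \operatorname{var}\left( \sqrt{sN}\, J_N'(\theta_0) \right) \right]^{-1/2} H_0 H_N^{-1} \left[ \operatorname{var}\left( \sqrt{sN}\, J_N'(\theta_0) \right) \right]^{1/2} \to I \quad \text{a.s.} \qquad (5.12)$$
The theorem follows from (5.11) and (5.12). □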
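The way (5.11) and (5.12) combine can be spelled out in one line. The sketch below uses the abbreviation $V_N := \operatorname{var}(\sqrt{sN}\, J_N'(\theta_0))$, which is introduced here for brevity, and relies on Slutsky's theorem together with the symmetry of the $N(0, I)$ distribution (so that the sign in (5.11) is immaterial).

```latex
\begin{align*}
V_N^{-1/2} H_0\, \sqrt{sN}\,(\hat\theta_N - \theta_0)
  &= \underbrace{\left[ V_N^{-1/2} H_0 H_N^{-1} V_N^{1/2} \right]}_{\to\; I \ \text{a.s. by (5.12)}}
     \;\underbrace{\left[ V_N^{-1/2} H_N\, \sqrt{sN}\,(\hat\theta_N - \theta_0) \right]}_{\xrightarrow{d}\; N(0,I) \ \text{by (5.11)}}
  \;\xrightarrow{d}\; N(0, I) .
\end{align*}
```

The factorisation is exact for every $N$ for which $H_N$ is invertible, which holds a.s. for $N$ sufficiently large by Corollary 2.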
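Remark
If $\operatorname{var}\left( \sqrt{sN}\, J_N'(\theta_0) \right)$ converges with limit $V(\theta_0)$, we obtain
$$\sqrt{sN}\, (\hat\theta_N - \theta_0) \xrightarrow{d} N\left( 0,\; H_0^{-1} V(\theta_0) H_0^{-1} \right).$$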
References
Anderson, B.D.O. (1985), Identification of scalar errors-in-variables models with dynamics, Automatica 21, pp. 709-716.
Aoki, M. and P.C. Yue (1970a), On a priori error estimates of some identification methods, IEEE Trans. Automatic Control AC-15, pp. 541-548.
Aoki, M. and P.C. Yue (1970b), On certain convergence questions in system identification, SIAM J. Control 8 (2), pp. 239-256.
Bierens, H.J. (1981), Robust Methods and Asymptotic Theory in Nonlinear Econometrics, Lect. Notes Econom. Math. No. 192, Springer, Berlin.
Dieudonné, J. (1969), Foundations of Modern Analysis, Academic Press, New York.
Eising, F., H.N. Linssen and H. Rietbergen (1983), System identification from noisy measurements of inputs and outputs, Systems & Control Letters 2, pp. 348-353.
Mentz, R.P. (1976), On the inverse of some covariance matrices of Toeplitz type, SIAM J. Appl. Math. 31 (3), pp. 426-437.
Rudin, W. (1964), Principles of Mathematical Analysis, McGraw-Hill, New York.
Scales, L.E. (1985), Introduction to Non-linear Optimization, MacMillan, London.
Söderström, T. (1981), Identification of stochastic linear systems in presence of input noise, Automatica 17, pp. 713-725.
Solari, M.E. (1969), The "maximum likelihood solution" of the problem of estimating a linear functional relationship, J.R. Statist. Soc. B 31 (2), pp. 372-375.
Ten Vregelaar, J.M. (1987), An algorithm for computing estimates for parameters of an ARMA-model from noisy measurements of inputs and outputs, Memorandum COSOR 87-13, Eindhoven University of Technology.
Ten Vregelaar, J.M. (1988), On estimating the parameters of a dynamic model from noisy input and output measurements, Memorandum COSOR 88-02, Eindhoven University of Technology.
Tucker, H.G. (1967), A Graduate Course in Probability, Academic Press, New York.
Varberg, D.E. (1968), Almost sure convergence of quadratic forms in independent random variables, Annals of Math. Stat. 39 (5), pp. 1502-1506.
Whittle, P. (1960), Bounds for the moments of linear and quadratic forms in independent variables, Theory of Prob. Appl. 5, pp. 302-305.
Whittle, P. (1964), On the convergence to normality of quadratic forms in independent variables, Theory of Prob. Appl. 9, 1, pp. 103-108.
Wilkinson, J.H. (1965), The Algebraic Eigenvalue Problem, Clarendon Press, Oxford.
Wolovich, W.A. (1974), Linear Multivariable Systems, Springer Verlag, New York.