On estimating the parameters of an ARMA-model from noisy
measurements of inputs and outputs
Citation for published version (APA):
Vregelaar, ten, J. M. (1986). On estimating the parameters of an ARMA-model from noisy measurements of inputs and outputs. (Memorandum COSOR; Vol. 8618). Technische Universiteit Eindhoven.
Document status and date: Published: 01/01/1986
EINDHOVEN UNIVERSITY OF TECHNOLOGY
Faculty of Mathematics and Computing Science

Memorandum COSOR 86-18

On estimating the parameters of an ARMA-model from noisy measurements of inputs and outputs

by

J.M. ten Vregelaar

Eindhoven, the Netherlands
November 1986
On estimating the parameters of an ARMA-model from noisy measurements of inputs and outputs

J.M. ten Vregelaar
University of Technology, Eindhoven

ABSTRACT

In this paper we discuss an estimation method for the unknown parameters of an ARMA-model when its inputs and outputs are measured with noise. In case the noise is normally distributed, the consistency and asymptotic normality of the estimator (in this case being the maximum likelihood estimator) are proved under some mild conditions, and a consistent expression for the asymptotic covariance matrix is given.
1. Introduction: description of the problem

In this paper we consider the following estimation problem (in system theory called an identification problem). Given the ARMA-model

\eta_t = \alpha_1 \eta_{t-1} + \dots + \alpha_p \eta_{t-p} + \beta_0 \xi_t + \dots + \beta_q \xi_{t-q}, \qquad t = 1, 2, \dots   (1)

with observations

y_t = \eta_t + e_t,   (2a)
x_t = \xi_t + f_t, \qquad t = 1, 2, \dots, N,   (2b)

the problem is to estimate the parameters \alpha_1, \dots, \alpha_p, \beta_0, \dots, \beta_q, where the orders p and q are supposed to be known. In general the true inputs \xi_t and outputs \eta_t are allowed to be r- and s-vectors respectively (the so-called MIMO-case, i.e. multiple input-multiple output).
The estimation is performed under the following assumptions for the measurement errors e_t, f_t:

E e_t = 0, \quad E f_t = 0, \qquad t = 1, 2, \dots, N;

e_1, \dots, e_N, f_1, \dots, f_N are independent; moreover, even all their (s+r)N components are independent;

VAR\, e_t = \sigma^2 I_s, \quad VAR\, f_t = \sigma^2 I_r,

where \sigma is unknown but fixed and I_s denotes the s \times s identity matrix.
When assuming zero initial conditions (\eta_0 = \dots = \eta_{1-p} = 0, \xi_0 = \dots = \xi_{1-q} = 0) we can rewrite model (1) as

A\eta + B\xi = 0,   (3)

where

A = \sum_{k=0}^{p} S^k \otimes \alpha_k, \qquad B = \sum_{l=0}^{q} S^l \otimes \beta_l

are sN \times sN and sN \times rN matrices respectively, and

\eta = (\eta_1^T, \dots, \eta_N^T)^T, \qquad \xi = (\xi_1^T, \dots, \xi_N^T)^T

are sN- and rN-vectors respectively. S is the N \times N shift matrix whose (i,j) element equals \delta_{i,j+1}, and \alpha_0 := -I_s.
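To make the construction concrete, the matrices S, A and B of (3) can be sketched numerically for the SISO case (s = r = 1, so the Kronecker products are trivial). The orders and coefficient values below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Build S, A, B of equation (3) for s = r = 1 and check A*eta + B*xi = 0
# for a trajectory simulated from model (1) with zero initial conditions.
N, p, q = 50, 2, 1
alpha = np.array([0.5, -0.3])   # alpha_1 .. alpha_p (illustrative)
beta = np.array([1.0, 0.4])     # beta_0 .. beta_q (illustrative)

S = np.eye(N, k=-1)             # shift matrix: (i, j) element = delta_{i, j+1}

# A = sum_{k=0}^p S^k alpha_k (with alpha_0 = -1), B = sum_{l=0}^q S^l beta_l
A = -np.eye(N)
for k in range(1, p + 1):
    A += alpha[k - 1] * np.linalg.matrix_power(S, k)
B = sum(beta[l] * np.linalg.matrix_power(S, l) for l in range(q + 1))

# Simulate model (1): eta_t = sum_i alpha_i eta_{t-i} + sum_j beta_j xi_{t-j}
rng = np.random.default_rng(0)
xi = rng.standard_normal(N)
eta = np.zeros(N)
for t in range(N):
    eta[t] = sum(alpha[i] * eta[t - 1 - i] for i in range(p) if t - 1 - i >= 0) \
           + sum(beta[j] * xi[t - j] for j in range(q + 1) if t - j >= 0)

residual = np.max(np.abs(A @ eta + B @ xi))
print(residual)                 # essentially zero: eta, xi satisfy (3)
```

Since S is lower triangular, A has diagonal -1 and is nonsingular, in line with the observation in section 2 that DD^T = AA^T + BB^T is nonsingular.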
The measurements (2) and assumptions are reformulated as

y = \eta + e, \quad E e = 0,   (4a)
x = \xi + f, \quad E f = 0,   (4b)
VAR \binom{e}{f} = \sigma^2 I_{(s+r)N}.   (4c)

Letting \theta := (\alpha_1, \dots, \alpha_p, \beta_0, \dots, \beta_q) be the s \times (ps + (q+1)r) matrix of parameters to be estimated, we can rewrite the model and measurements once more as

D(\theta_0)\zeta = 0,   (5a)
z = \zeta + n, \quad E n = 0, \quad VAR\, n = \sigma^2 I,   (5b)

where \theta_0 is the matrix of true values of the parameters,

D := (A \; B) \; (size \; sN \times (s+r)N), \qquad \zeta := \binom{\eta}{\xi}, \qquad z := \binom{y}{x}, \qquad n := \binom{e}{f},

with n the noise vector.
2. Method of estimation

Estimates for the parameters are obtained by the solution of the following constrained minimization problem:

\min_{\theta, \zeta} \| z - \zeta \|^2   (6a)

subject to

D(\theta)\zeta = 0   (6b)

(\|\cdot\| denotes the Euclidean vector norm). When assuming the normal distribution for n, the solutions of (6) give the maximum likelihood estimates for the parameters.

Application of the Lagrange multiplier method gives an unconstrained minimization problem for the parameter matrix \theta:

\min_\theta J_N(\theta),   (7)

where

J_N(\theta) := \frac{1}{sN} z^T Q(\theta) z \qquad with \qquad Q(\theta) := D^T (DD^T)^{-1} D

(the factor \frac{1}{sN} is added for convenience). This (s+r)N \times (s+r)N matrix satisfies Q = Q^T = Q^2, so Q is an orthogonal projection matrix with tr Q = rank Q = sN. Note that, since A is nonsingular, DD^T = AA^T + BB^T is nonsingular as well for all \theta.

In this paper we will concentrate on asymptotic properties of the so-called total least squares estimators introduced above. For this purpose we need some expressions for J_N' and J_N'', respectively the column vector of first derivatives of J_N with respect to \theta (\theta can be seen as a vector) and the matrix of second derivatives:

J_N' := \left( \frac{\partial J_N}{\partial \theta_i} \right), \qquad J_N'' := \left( \frac{\partial^2 J_N}{\partial \theta_i \partial \theta_j} \right)

(obviously Q and therefore J_N are infinitely continuously differentiable). Expressions for expectation and covariance matrix will be given here as well.
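The projection properties of Q(\theta) are easy to check numerically. The sketch below (SISO, illustrative coefficients) verifies Q = Q^T = Q^2 and tr Q = N, and evaluates J_N for one \theta and one data vector z.

```python
import numpy as np

# Verify that Q(theta) = D^T (D D^T)^{-1} D is an orthogonal projection
# with tr Q = sN (here s = 1), and evaluate J_N(theta) = z^T Q z / N.
# Coefficients and data are illustrative, not from the paper.
def model_matrices(theta, N, p, q):
    alpha, beta = theta[:p], theta[p:]
    S = np.eye(N, k=-1)
    A = -np.eye(N) + sum(alpha[i] * np.linalg.matrix_power(S, i + 1) for i in range(p))
    B = sum(beta[j] * np.linalg.matrix_power(S, j) for j in range(q + 1))
    return A, B

def J_N(theta, z, N, p, q):
    A, B = model_matrices(theta, N, p, q)
    D = np.hstack([A, B])                     # sN x (s+r)N
    Q = D.T @ np.linalg.solve(D @ D.T, D)     # projection onto the row space of D
    return (z @ Q @ z) / N, Q

N, p, q = 30, 1, 0
rng = np.random.default_rng(1)
z = rng.standard_normal(2 * N)
val, Q = J_N(np.array([0.6, 1.2]), z, N, p, q)
print(np.allclose(Q, Q.T), np.allclose(Q @ Q, Q), round(np.trace(Q)))  # True True 30
```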
By differentiating it follows that

J_N'(\theta) = \frac{1}{sN} z^T Q'(\theta) z   (8a)

(the right-hand side represents a vector; its i-th component is \frac{1}{sN} z^T Q^i(\theta) z := \frac{1}{sN} z^T \frac{\partial Q}{\partial \theta_i} z), and

J_N''(\theta) = \frac{1}{sN} z^T Q''(\theta) z   (8b)

(where the right-hand side denotes a matrix with (i,j) element \frac{1}{sN} z^T Q^{ij}(\theta) z := \frac{1}{sN} z^T \frac{\partial^2 Q}{\partial \theta_i \partial \theta_j} z). We gave an algorithm for the computation of J_N(\theta) and J_N'(\theta) for given \theta (cf. Ten Vregelaar '85).
Since J_N(\theta), J_N'(\theta) and J_N''(\theta) are random variables we can determine expectation and (co)variance. For an arbitrary distribution of the noise vector n satisfying (5b), we obtain

E J_N(\theta) = \sigma^2 + \frac{1}{sN} \bar\zeta^T Q(\theta) \bar\zeta,   (9a)
E J_N'(\theta) = \frac{1}{sN} \bar\zeta^T Q'(\theta) \bar\zeta,   (9b)
E J_N''(\theta) = \frac{1}{sN} \bar\zeta^T Q''(\theta) \bar\zeta,   (9c)

using E(z^T Q z) = tr(Q \, VAR\, z) + (E z)^T Q (E z), and tr \frac{\partial Q}{\partial \theta_i} = \frac{\partial}{\partial \theta_i} tr Q = 0 etc. Here \bar\zeta := E z denotes the vector of true values.

Assuming finite fourth moments for the components of n, var J_N(\theta), VAR J_N'(\theta) and the variances of the elements of J_N''(\theta) exist. If z has a multinormal distribution, the relation

cov(z^T M_1 z, z^T M_2 z) = 2 \, tr(M_1 \, VAR\, z \, M_2 \, VAR\, z) + 4 (E z)^T M_1 \, VAR\, z \, M_2 (E z)

holds for symmetric matrices M_1, M_2. Therefore for the normal case (n \sim N(0, \sigma^2 I)) it follows that

var J_N(\theta) = \frac{2\sigma^4}{sN} + \frac{4\sigma^2}{(sN)^2} \bar\zeta^T Q(\theta) \bar\zeta,   (10a)
(VAR J_N'(\theta))_{i,j} = \frac{1}{(sN)^2} \left( 2\sigma^4 \, tr\, Q^i Q^j + 4\sigma^2 \bar\zeta^T Q^i Q^j \bar\zeta \right),   (10b)
var (J_N''(\theta))_{i,j} = \frac{1}{(sN)^2} \left( 2\sigma^4 \, tr\, (Q^{ij})^2 + 4\sigma^2 \bar\zeta^T (Q^{ij})^2 \bar\zeta \right).   (10c)

When dealing with asymptotic theory we distinguish the normal case and the general case. This paper deals with the normal case.
3. Asymptotic results for the maximum likelihood estimator

Under some mild conditions we will prove the consistency and asymptotic normality of the total least squares estimator (being the maximum likelihood estimator) in case the noise vector n is multinormally distributed: n \sim N(0, \sigma^2 I). For notational convenience the results are derived for the single input-single output (SISO) system (s = r = 1). For the time being the parametrization is supposed to be free. It will be convenient to write

D(\theta)\bar\zeta = H_N (\theta - \theta_0),   (11a)

where

H_N := (S\bar\eta \;\; S^2\bar\eta \;\; \dots \;\; S^p\bar\eta \;\; \bar\xi \;\; S\bar\xi \;\; \dots \;\; S^q\bar\xi)   (11b)

is an N \times (p+q+1) matrix (cf. Aoki '70, p. 241; obviously \bar\eta = H_N \theta_0, so \eta = H_N \theta is equivalent to D(\theta)\bar\zeta = 0). As a consequence

\frac{1}{N} \bar\zeta^T Q(\theta) \bar\zeta = (\theta - \theta_0)^T M_N(\theta) (\theta - \theta_0)   (11c)

with

M_N(\theta) := \frac{1}{N} H_N^T (DD^T)^{-1} H_N.   (11d)
We impose the following smoothness conditions:

(i) let \Theta be a compact subset of R^{p+q+1} containing \theta_0 as an interior point;

(ii) the polynomial A(\lambda) = -1 + \sum_{i=1}^{p} \alpha_i \lambda^i has its zeros outside the unit circle for \theta \in \Theta;

(iii) the input sequence \{\xi_t\}_{t \geq 1} is uniformly bounded: \sup_t |\xi_t| < \infty;

(iv) M_N(\theta) \to M(\theta) as N \to \infty, uniformly in \theta, with M(\theta) > 0 for \theta \in \Theta (meaning M is positive definite on \Theta).

Condition (ii) expresses the stability of the system; (ii) and (iii) imply that the output sequence \{\eta_t\}_{t \geq 1} is uniformly bounded as well. Furthermore condition (ii) implies (the proof can be found in the appendix)

k_1 I \leq (DD^T)^{-1} \leq k_2 I \quad for all \; \theta \in \Theta, \qquad 0 < k_1 < k_2 < \infty,   (12)

which in addition to (iv) gives the nonsingularity of H_N^T H_N, expressing that \theta is uniquely determined given \eta = H_N \theta. We can qualify \theta_0 as the true parameter vector.
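Identity (11c) is purely algebraic and can therefore be checked to machine precision. The sketch below builds a noiseless trajectory \bar\zeta, forms H_N from it, and compares both sides of (11c) at an arbitrary trial \theta; all numeric values are illustrative assumptions (s = r = 1).

```python
import numpy as np

def shift_pow(N, k):
    return np.eye(N, k=-k)            # S^k: (S^k v)_t = v_{t-k}

# True system (illustrative): p = 1, q = 1, theta_0 = (alpha_1, beta_0, beta_1)
N = 40
theta0 = np.array([0.5, 1.0, -0.3])
rng = np.random.default_rng(2)
xi = rng.standard_normal(N)

# Noiseless output eta (zero initial conditions)
eta = np.zeros(N)
for t in range(N):
    eta[t] = (theta0[0] * eta[t-1] if t >= 1 else 0.0) \
           + theta0[1] * xi[t] + (theta0[2] * xi[t-1] if t >= 1 else 0.0)
zeta = np.concatenate([eta, xi])      # true vector (eta; xi)

# H_N = (S eta, xi, S xi), cf. (11b)
H = np.column_stack([shift_pow(N, 1) @ eta, xi, shift_pow(N, 1) @ xi])

def D_of(theta):
    A = -np.eye(N) + theta[0] * shift_pow(N, 1)
    B = theta[1] * np.eye(N) + theta[2] * shift_pow(N, 1)
    return np.hstack([A, B])

theta = np.array([0.3, 1.4, 0.1])     # arbitrary trial parameter
D = D_of(theta)
W = np.linalg.inv(D @ D.T)
delta = theta - theta0
lhs = zeta @ D.T @ W @ D @ zeta       # zeta^T Q(theta) zeta
rhs = delta @ (H.T @ W @ H) @ delta   # N * delta^T M_N(theta) delta
print(abs(lhs - rhs))                 # ~0 (machine precision)
```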
We will discuss the asymptotic properties of \hat\theta_N, defined by

J_N(\hat\theta_N) = \min_{\theta \in \Theta} J_N(\theta).   (13)

Since (DD^T)^{-1} > 0 there exists some regular N \times N matrix C(\theta) with (DD^T)^{-1} = C^T C. (We can choose C differentiable; moreover, if \dot C denotes \frac{\partial C}{\partial \theta_i}, it is possible for C to satisfy \dot C = -C \dot D D^T (DD^T)^{-1}, which will be of use in the sequel.) Hence (omitting the argument \theta in C and D)

J_N(\theta) = \frac{1}{N} \sum_{i=1}^{N} s_i(\theta),   (14a)

where

s_i(\theta) := t_i(\theta)^2, \qquad t_i(\theta) := c_i D z,   (14b)

c_i denoting row i of C. From CDz \sim N(CD\bar\zeta, \sigma^2 I) it follows that t_1(\theta), t_2(\theta), \dots, t_N(\theta) are independent, hence s_1(\theta), \dots, s_N(\theta) are as well. So in the normal case J_N(\theta) and J_N'(\theta) can be written as sums of independent random variables, which is useful in obtaining asymptotic results.
Lemma 3.1. J_N(\theta) \to J(\theta) a.s., uniformly in \theta, i.e. P(J_N(\theta) \to J(\theta) \; uniformly \; in \; \theta) = 1, where

J(\theta) := \sigma^2 + (\theta - \theta_0)^T M(\theta) (\theta - \theta_0).   (15)

Proof. From assumptions (ii), (iii) and (10a) it follows that var J_N(\theta) \to 0 uniformly in \theta. Chebyshev's inequality yields plim (J_N(\theta) - E J_N(\theta)) = 0 uniformly in \theta, i.e. \sup_{\theta \in \Theta} P(|J_N(\theta) - E J_N(\theta)| \geq \epsilon) \to 0 for any \epsilon > 0. Since J_N(\theta) is a sum of independent random variables, a.s. convergence follows (cf. Breiman '68, p. 45). Assumption (iv) guarantees the existence of the limiting mean of J_N(\theta). □
A consistency result is given by the following

Theorem 3.1. \hat\theta_N is (strongly) consistent for the true parameter vector \theta_0, i.e. \hat\theta_N \to \theta_0 a.s.

Proof. J_N(\theta) is twice continuously differentiable, so J is continuous on \Theta. Since M(\theta) > 0 (assumption (iv)), J is uniquely minimal in \theta_0. The theorem now follows from the lemma and theorem 3.1 in Linssen '80, p. 24. □
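The consistency statement can be illustrated by computing the estimator of (13) directly, here by numerical minimization of J_N with scipy's Nelder-Mead. The system, noise level and starting point are illustrative assumptions; the paper itself uses the algorithm of Ten Vregelaar '85 rather than a generic optimizer.

```python
import numpy as np
from scipy.optimize import minimize

# Sketch: the estimator defined by (13), computed by direct numerical
# minimization of J_N.  SISO, p = 1, q = 0; all values illustrative.
N = 150
theta0 = np.array([0.5, 1.0])         # true (alpha_1, beta_0)
sigma = 0.05
rng = np.random.default_rng(4)

xi = rng.standard_normal(N)
eta = np.zeros(N)
for t in range(N):
    eta[t] = (theta0[0] * eta[t-1] if t >= 1 else 0.0) + theta0[1] * xi[t]
z = np.concatenate([eta, xi]) + sigma * rng.standard_normal(2 * N)  # noisy (y; x)

S = np.eye(N, k=-1)
def J_N(theta):
    D = np.hstack([-np.eye(N) + theta[0] * S, theta[1] * np.eye(N)])
    return z @ D.T @ np.linalg.solve(D @ D.T, D @ z) / N

start = np.array([0.3, 0.8])
res = minimize(J_N, x0=start, method="Nelder-Mead")
print(res.x)          # close to theta0 = (0.5, 1.0) for small sigma
```

Note that the minimal value J_N(\hat\theta_N) also estimates \sigma^2, in line with (25) below.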
Showing asymptotic normality needs some preparations.

Lemma 3.2. J_N'(\theta) and J_N''(\theta) converge almost surely, uniformly on \Theta, and \lim J_N''(\theta_0) > 0.

Proof. As a consequence of assumption (ii), tr (Q^i)^2 \leq C_1 N and \bar\zeta^T (Q^i)^2 \bar\zeta \leq C_2 \|\bar\zeta\|^2, where C_1, C_2 are bounded functions of \theta (the appendix contains a proof). Assumptions (ii) and (iii) then imply var J_N'(\theta) \to 0 uniformly in \theta. As in the proof of the previous lemma, since J_N'(\theta) = \frac{1}{N} \sum_{i=1}^{N} s_i'(\theta) (cf. (14)), where s_1', \dots, s_N' are independent, almost sure convergence follows. The proof for J_N''(\theta) is similar. Actually (cf. Linssen '80, p. 26) we obtain J_N'(\theta_0) \to 0 and J_N''(\theta_0) \to J''(\theta_0) = 2M(\theta_0) > 0 from assumption (iv) and (15). □
The key to the asymptotic normality result is the following

Lemma 3.3. If the smallest eigenvalue of VAR(n^T Q'(\theta_0) n) tends to infinity for N \to \infty, then J_N'(\theta_0) is asymptotically normal; moreover

\left( N E_0 \left[ J_N'(\theta_0)(J_N'(\theta_0))^T \right] \right)^{-\frac{1}{2}T} \sqrt{N} \, J_N'(\theta_0) \to N(0, I) \quad (convergence \; in \; distribution).   (16)

By notation, E_0 refers to taking expectation for \theta = \theta_0, and A^{-\frac{1}{2}T} = ((A^{\frac{1}{2}})^{-1})^T, where in general for a positive semidefinite matrix A, A^{\frac{1}{2}} denotes a square root: (A^{\frac{1}{2}})^T A^{\frac{1}{2}} = A.

Proof. If \dot{} denotes \frac{\partial}{\partial \theta_i} again, it is easy to verify that

\dot Q = D^+ \dot D Q^\perp + Q^\perp \dot D^T (D^+)^T,   (17)

where Q^\perp := I - Q is, like Q, an orthogonal projection matrix and D^+ := D^T (DD^T)^{-1}. Therefore E J_N'(\theta_0) = 0 and VAR \sqrt{N} J_N'(\theta_0) = N E_0 [J_N' (J_N')^T]. Since z = \bar\zeta + n, cov(n^T Q'(\theta_0) n, \bar\zeta^T Q'(\theta_0) n) = 0 and \frac{\partial D}{\partial \theta_i} \bar\zeta = H_{N,i} (the i-th column of H_N), we obtain

N \, VAR\, J_N'(\theta_0) = \frac{1}{N} VAR(n^T Q'(\theta_0) n) + 4\sigma^2 M_N(\theta_0).   (18)

Assumption (iv) gives VAR \sqrt{N} J_N'(\theta_0) > 0 for N sufficiently large. Consider now, for a fixed (p+q+1)-vector \lambda,

\lambda^T \sqrt{N} J_N'(\theta_0) = \frac{1}{\sqrt{N}} \lambda^T (n^T Q'(\theta_0) n) + \frac{2}{\sqrt{N}} \lambda^T (\bar\zeta^T Q'(\theta_0) n).   (19)

Since the terms on the right-hand side of (19) are uncorrelated and \frac{2}{\sqrt{N}} \lambda^T (\bar\zeta^T Q'(\theta_0) n) \sim N(0, 4\sigma^2 \lambda^T M_N(\theta_0) \lambda), asymptotic normality of the first term suffices to obtain asymptotic normality of \lambda^T \sqrt{N} J_N'(\theta_0). In the appendix it is shown that the assumption in the lemma gives

\frac{1}{\sqrt{N}} \lambda^T (n^T Q'(\theta_0) n) \sim AN\left(0, \frac{\lambda^T VAR(n^T Q'(\theta_0) n) \lambda}{N}\right),

i.e. \frac{\lambda^T (n^T Q'(\theta_0) n)}{(\lambda^T VAR(n^T Q'(\theta_0) n) \lambda)^{1/2}} \to N(0, 1) in distribution. Applying the multivariate central limit theorem in Varadarajan '58 yields the lemma. □
Now we are able to prove

Theorem 3.2. Provided the assumption of lemma 3.3 holds, we have

\left( N E_0 [J_N'(J_N')^T] \right)^{-\frac{1}{2}T} E_0 J_N'' \, \sqrt{N} (\hat\theta_N - \theta_0) \to N(0, I).

Proof. As a consequence of theorem 3.1 and lemma 3.2, the limiting distributions of -\sqrt{N} J_N'(\theta_0) and \sqrt{N} J''(\theta_0)(\hat\theta_N - \theta_0) are equal (Linssen '80, p. 27, lemma 3.3). Lemma 3.3 implies

\left( N E_0 [J_N'(J_N')^T] \right)^{-\frac{1}{2}T} J''(\theta_0) \sqrt{N} (\hat\theta_N - \theta_0) \to N(0, I).   (20)

Since Q^{ij} := \frac{\partial^2 Q}{\partial \theta_i \partial \theta_j} = P^{ij} + (P^{ij})^T, where

P^{ij} = Q^\perp D^{jT} (DD^T)^{-1} D^i Q^\perp - D^+ (D^i D^+ D^j + D^j D^+ D^i) Q^\perp - D^+ D^i Q^\perp D^{jT} (D^+)^T   (21)

with D^i denoting \frac{\partial D}{\partial \theta_i}, it follows from (9c) that E_0 J_N''(\theta_0) = 2 M_N(\theta_0). Furthermore E_0 J_N'' (J''(\theta_0))^{-1} \to I and

\left( N E_0 [J_N'(J_N')^T] \right)^{-\frac{1}{2}T} E_0 J_N'' (J''(\theta_0))^{-1} \left( N E_0 [J_N'(J_N')^T] \right)^{\frac{1}{2}T} \to I.   (22)

The theorem now follows from Linssen '80, p. 26, lemma 3.2, with X_N the left-hand side of (22) and Y_N the left-hand side of (20). □
Another expression for the asymptotic covariance matrix of \sqrt{N}(\hat\theta_N - \theta_0) can be obtained as follows. We conclude from (18) that

N E_0 [J_N'(J_N')^T] = 4\sigma^4 \Delta_N(\theta_0) + 4\sigma^2 M_N(\theta_0),

where the (i,j) element of \Delta_N(\theta_0) is

\frac{1}{4\sigma^4 N} cov(n^T Q^i(\theta_0) n, n^T Q^j(\theta_0) n) = \frac{1}{2N} tr\, Q^i Q^j = \frac{1}{N} tr\, D^{iT} (DD^T)^{-1} D^j Q^\perp, \qquad i,j = 1, 2, \dots, p+q+1.   (23)

As a result,

\sqrt{N}(\hat\theta_N - \theta_0) \sim AN\left(0, \; \sigma^2 M_N(\theta_0)^{-1} \left( I + \sigma^2 \Delta_N(\theta_0) M_N(\theta_0)^{-1} \right) \right).   (24)
Finally, a consistent estimator for the asymptotic covariance matrix of \sqrt{N}(\hat\theta_N - \theta_0) is given in this section. According to (20),

\sqrt{N}(\hat\theta_N - \theta_0) \sim AN\left(0, \; (J''(\theta_0))^{-1} \, N E_0 [J_N'(J_N')^T] \, (J''(\theta_0))^{-1} \right)

holds. As a consequence of J_N''(\theta) \to J''(\theta) a.s. uniformly in \theta, J'' being continuous in \theta_0 and \hat\theta_N \to \theta_0 a.s., we obtain J_N''(\hat\theta_N) \to J''(\theta_0) a.s.

A consistent expression for N E_0 [J_N'(J_N')^T] can be found by considering again (cf. (14))

J_N(\theta) = \frac{1}{N} \sum_{i=1}^{N} s_i(\theta), \quad with \; s_1(\theta), \dots, s_N(\theta) \; independent.

Writing s_i(\theta) = z^T Q_i(\theta) z with Q_i := D^T c_i^T c_i D, it follows that E s_i'(\theta_0) = 0, since tr \dot Q_i = \frac{\partial}{\partial \theta_j} tr\, Q_i = \frac{\partial}{\partial \theta_j} 1 = 0 (for all \theta) and

\bar\zeta^T \dot Q_i(\theta_0) \bar\zeta = \bar\zeta^T \left\{ \dot D^T c_i^T c_i D + D^T c_i^T c_i \dot D + D^T \frac{\partial}{\partial \theta_j}(c_i^T c_i) D \right\} \bar\zeta = 0,

where \dot{} denotes \frac{\partial}{\partial \theta_j} for arbitrary j \in \{1, 2, \dots, p+q+1\} (each term vanishes because D(\theta_0)\bar\zeta = 0). Therefore

N E_0 [J_N'(J_N')^T] = \frac{1}{N} E_0 \sum_{i=1}^{N} s_i'(\theta_0) (s_i'(\theta_0))^T.

Furthermore, again from (14) we obtain s_i'(\theta_0) = 2 t_i(\theta_0) t_i'(\theta_0) and E t_i(\theta_0) = 0, implying

cov(t_i'(\theta_0), t_i(\theta_0)) = \frac{1}{2} E s_i'(\theta_0) = 0,

and because of normality t_i(\theta_0) and t_i'(\theta_0) are independent, i = 1, 2, \dots, N. (Actually t_i(\theta) and t_i'(\theta) are independent for all \theta.) Consequently

N E_0 [J_N'(J_N')^T] = \frac{4\sigma^2}{N} E_0 \sum_{i=1}^{N} t_i'(\theta_0) (t_i'(\theta_0))^T.

From the strong law of large numbers,

\frac{1}{N} \left| \sum_{i=1}^{N} t_i'(\hat\theta_N)(t_i'(\hat\theta_N))^T - E_0 \sum_{i=1}^{N} t_i'(\theta_0)(t_i'(\theta_0))^T \right| \to 0 \; a.s.

Then N E_0 [J_N'(J_N')^T] is consistently estimated by

4\hat\sigma^2 \, \frac{1}{N} \sum_{i=1}^{N} t_i'(\hat\theta_N)(t_i'(\hat\theta_N))^T, \qquad where \; \hat\sigma^2 := J_N(\hat\theta_N).   (25)

Obviously, from J_N(\theta) \to J(\theta) a.s. uniformly in \theta, J being continuous in \theta_0, and \hat\theta_N \to \theta_0 a.s., we obtain J_N(\hat\theta_N) \to J(\theta_0) = \sigma^2 a.s.

The (k,l) element of \sum_i t_i'(\hat\theta_N)(t_i'(\hat\theta_N))^T is

\hat\zeta^T D^{kT} (DD^T)^{-1} D^l \hat\zeta \Big|_{\theta = \hat\theta_N},

where \hat\zeta := Q^\perp z (the vector of residuals). Therefore we have proved
Theorem 3.3
A consistent expression for the asymptotic covariance matrix of
m
(9
N - (0 ) is given by4;2
ci
N ,,)-1MN
(8
)(j,,,',)-l where;2
is given by (25), A A IN" := IA' "(6 N ) A A 1 A T -1 A MN(B):= N(HN(B)) (DDT) HN (8)le=e
N CcL (lld)). Remark.For the very special case p
=
q=
0 (no dynamics). IN '(60 ) is a scalar and the assumption in lemma 3.2 is satisfied:a
40" 4 var nT !lfl Q({3o)n=
- - - - N -+ 00 if N -+ 00. V/J (t+
(3 From (24) it followsm(a
._fl ) AN (0 O" 2 N (1+fl2+
O"2N /JA /JO •tTt
/JOt
(cL Linssen '80. p. 51 (4.23)). 4. Discussion
The asymptotic results of the previous section follow from writing J_N(\theta) and J_N'(\theta) as sums of independent variables, which allows for applying some law of large numbers and some central limit theorem. In general (dropping the normality assumption for the noise), J_N(\theta) and the components of J_N'(\theta) can be written as sums of random variables, say

\sum_{i} \mu_i Y_i^2, \qquad where \; cov(Y_i, Y_j) = \sigma^2 \delta_{i,j}.

It is our aim to extend the asymptotic results to this general case. Furthermore we want to generalize the results to the MIMO-model, not assuming zero initial conditions, and allowing for relations between the parameters to be estimated.
Appendix

Following Aoki '70 we give some bounds on DD^T for the SISO-model (cf. (12)). As a consequence of the stability assumption (ii) (the polynomial A(\lambda) = -1 + \sum_{i=1}^{p} \alpha_i \lambda^i has its zeros outside the unit circle), we recall from Aoki '70, p. 245, that there exist constants 0 < p_1 \leq p_2 < \infty with

p_1 I \leq AA^T \leq p_2 I \quad for all \; \theta \in \Theta.

The stability assumption yields the lower bound. Since DD^T = AA^T + BB^T, it easily follows that

p_1 I \leq DD^T \leq p_3 I, \quad and hence \quad k_1 I \leq (DD^T)^{-1} \leq k_2 I.   (A1)

The orthogonal projection matrix Q being Q = D^T(DD^T)^{-1}D, for Q^\perp = I - Q we can write

Q^\perp = E^T (EE^T)^{-1} E, \qquad where \; E := (-B^T A^{-T} \;\; I) \; (size \; N \times 2N).

Since EE^T = I + B^T (AA^T)^{-1} B, we obtain

k_3 I \leq (EE^T)^{-1} \leq I, \qquad 0 < k_3 \leq 1.
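The uniform bounds (A1) can also be observed numerically: for a fixed stable \theta, the extreme eigenvalues of (DD^T)^{-1} stay in a fixed interval as N grows. The coefficients below are illustrative.

```python
import numpy as np

# For a stable SISO system the eigenvalues of (D D^T)^{-1} remain in a
# fixed interval [k1, k2] as N grows (cf. (A1)).  Illustrative values.
alpha1, beta0 = 0.5, 1.0
bounds = []
for N in (20, 40, 80):
    S = np.eye(N, k=-1)
    D = np.hstack([-np.eye(N) + alpha1 * S, beta0 * np.eye(N)])
    w = np.linalg.eigvalsh(np.linalg.inv(D @ D.T))
    bounds.append((w.min(), w.max()))
    print(N, round(w.min(), 4), round(w.max(), 4))
```

For this choice of coefficients, DD^T = AA^T + I with A = -I + 0.5 S, so all eigenvalues of (DD^T)^{-1} lie between roughly 1/3.25 and 1/1.25 for every N.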
Using (A1) we are able to prove (cf. lemma 3.2) the following

Lemma A.1. \dot Q^2 \leq C I_{2N}, \; 0 < C < \infty, where \dot Q = \frac{\partial Q}{\partial \theta_i} for any i.

Proof. From (17), \dot Q = P + P^T holds, where P = D^+ \dot D Q^\perp. For an arbitrary 2N-vector z we have z^T \dot Q^2 z = \|\dot Q z\|^2. From (A1) and \dot D^T \dot D \leq I_{2N} (in case of free parametrization) it follows that

\|Pz\|^2 \leq k_2 \|\dot D Q^\perp z\|^2 \leq k_2 \|Q^\perp z\|^2 \leq k_2 \|z\|^2.   (A2)

Furthermore we conclude from (A1) and (DD^T)^{-2} \leq k_2^2 I_N that

\|P^T z\|^2 = \|Q^\perp \dot D^T (D^+)^T z\|^2 \leq \|\dot D^T (D^+)^T z\|^2 \leq \|(D^+)^T z\|^2 \leq k_2^2 \|Dz\|^2 \leq C_1 \|z\|^2,   (A3)

where C_1 = k_2^2 \left( 1 + \sum_{i=1}^{p} |\alpha_i| + \sum_{j=0}^{q} |\beta_j| \right)^2.

Since \|\dot Q z\| \leq \|Pz\| + \|P^T z\|, \|\dot Q z\|^2 \leq C \|z\|^2 holds with C := (\sqrt{k_2} + \sqrt{C_1})^2, yielding the lemma. □
Finally, it remains to show (cf. lemma 3.3) that \frac{1}{\sqrt{N}} \lambda^T (n^T Q'(\theta_0) n) is asymptotically normal if n \sim N(0, \sigma^2 I), for any (fixed) vector \lambda. For this purpose we need

Lemma A.2. Let n \sim N(0, \sigma^2 I) and let \mu_1, \dots, \mu_{2N} be the eigenvalues of a symmetric matrix R_N. Then n^T R_N n is asymptotically normal if and only if

\frac{\max_i \mu_i^2}{\sum_{j=1}^{2N} \mu_j^2} \to 0 \quad for \; N \to \infty.

Proof. We can write R_N as R_N = P_N^T \Lambda_N P_N, where P_N is orthonormal and \Lambda_N = diag(\mu_1, \dots, \mu_{2N}). Then

n^T R_N n = \sum_{i=1}^{2N} \mu_i (P_N n)_i^2 =: \sum_{i=1}^{2N} \mu_i X_i^2.

Since P_N n \sim N(0, \sigma^2 I), X_1, \dots, X_{2N} are independent and identically distributed (actually \frac{X_i^2}{\sigma^2} \sim \chi_1^2). Now the lemma follows from the Lindeberg-Feller central limit theorem (cf. Serfling '80, p. 29). □
Let us define R_N := \sum_{i=1}^{p+q+1} \lambda_i Q^i(\theta_0), and suppose its eigenvalues are \mu_1, \dots, \mu_{2N}. Then

\max_i \mu_i^2 = \max_{w \neq 0} \frac{\|R_N w\|^2}{\|w\|^2}.

Since \|R_N w\| \leq \sum_{i=1}^{p+q+1} |\lambda_i| \, \|Q^i w\| \leq K \|w\|, where K does not depend on N (from (A1), cf. the proof of lemma A.1), we obtain \max_i \mu_i^2 \leq K^2. On the other hand,

\sum_{j=1}^{2N} \mu_j^2 = tr\, R_N^2 = \frac{1}{2\sigma^4} \lambda^T VAR(n^T Q'(\theta_0) n) \lambda,

which tends to infinity when the smallest eigenvalue of VAR\, n^T Q'(\theta_0) n does. As a consequence the eigenvalue condition in lemma A.2 is satisfied for R_N, yielding the asymptotic normality of \frac{1}{\sqrt{N}} \lambda^T (n^T Q'(\theta_0) n).

Remark. Note that

tr(Q^i)^2 = 2 \, tr\, D^{iT} (DD^T)^{-1} D^i Q^\perp \geq 2 k_1 k_3 \|D^i E^T\|^2.

For free parametrizations \|D^i E^T\|^2 \geq k_4 N holds (again, k_4 denoting some constant not depending on N), so the elements on the diagonal of VAR\, n^T Q'(\theta_0) n tend to infinity.
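In the static case p = q = 0 this linear growth is explicit: a short computation gives tr(\dot Q)^2 = 2N/(1+\beta_0^2)^2, so var\, n^T \dot Q n = 2\sigma^4 tr(\dot Q)^2 = 4\sigma^4 N/(1+\beta_0^2)^2, matching the remark at the end of section 3. The sketch below checks the closed form with a finite-difference \dot Q (the value of \beta is illustrative).

```python
import numpy as np

# Static case p = q = 0: D = (-I, beta I), Q = D^T D / (1 + beta^2).
# Compare tr(Qdot^2), computed by central differences, with the closed
# form 2N / (1 + beta^2)^2, which grows linearly in N.
beta, h = 0.7, 1e-6
for N in (10, 40):
    I = np.eye(N)
    Q = lambda b: np.hstack([-I, b * I]).T @ np.hstack([-I, b * I]) / (1 + b * b)
    Qdot = (Q(beta + h) - Q(beta - h)) / (2 * h)
    print(N, np.trace(Qdot @ Qdot), 2 * N / (1 + beta ** 2) ** 2)
```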
References

Aoki '70: Aoki, M. and Yue, P.C., On certain convergence questions in system identification, SIAM J. Control, vol. 8, no. 2, 239-256, 1970.

Breiman '68: Breiman, L., Probability, Addison-Wesley, Reading, Massachusetts, 1968.

Linssen '80: Linssen, H.N., Functional relationships and minimum sum estimation, Doctoral thesis, Techn. Univ. Eindhoven, 1980.

Serfling '80: Serfling, R.J., Approximation Theorems of Mathematical Statistics, John Wiley and Sons, New York, 1980.

Ten Vregelaar '85: Ten Vregelaar, J.M., Algoritme voor het schatten van de parameters in Arma-modellen met meetfouten op de in- en uitvoer (in Dutch), Memorandum COSOR 85-10, Techn. Univ. Eindhoven, 1985.

Varadarajan '58: Varadarajan, V.S., A useful convergence theorem, Sankhya, 20, 221-222, 1958.