SYSID '94

10th IFAC Symposium on System Identification

4-6 July 1994

Editors: Mogens Blanke and Torsten Söderström

Preprints Vol. 3

DYNAMIC TOTAL LINEAR LEAST SQUARES

BART DE MOOR

ESAT, Electrical Engineering Department, Katholieke Universiteit Leuven

Kardinaal Mercierlaan 94, B-3001 Leuven, Belgium. Tel: 32/16/220931; Fax: 32/16/221855

email: bart.demoor@esat.kuleuven.ac.be

Abstract: The relationships between linear least squares (equation error), total linear least squares (errors-in-variables) and the Frisch scheme are well understood for the static case. For the dynamic case, much work has been done to explore similar relationships. What seemed to be lacking so far is the solution to the so-called dynamic total least squares problem, which itself is a special case of a structured total least squares problem. In this paper, we present a survey of some recent results for this dynamic total least squares problem (which in fact corresponds to the $L_2$-optimal modelling of linear SISO systems). We discuss the most relevant properties and make some suggestions on how these results might be useful in the context of the Frisch scheme.

Key Words. Estimation Theory, System Identification, Frisch Scheme, errors-in-variables, total linear least squares, Riemannian singular value decomposition

1. INTRODUCTION

For the static case, the relationships between least squares (equation error in the $L_2$ norm), total least squares (errors-in-variables in the $L_2$ norm) [14] and the Frisch scheme are well understood by now (see e.g. [5] [6] [7]). The least squares solutions play a fundamental role in the construction of the solution set of the Frisch scheme, at least in the case where the inverse of the data covariance matrix is sign-similar to an elementwise positive matrix [15] [16] [17]. In this case, the solution of the Frisch scheme in the so-called solution space (the null space of the data covariance matrix) can be represented as a polyhedral cone, the vertices of which are the least squares solutions. By appropriate sign changes, these can all be 'reflected' to lie in the positive orthant (we refer to [10] for details). In this case, the total least squares solution is a convex combination (a linear combination with positive weights) of the least squares solutions. It corresponds to the Perron-Frobenius eigenvector of the inverse data covariance matrix and is contained in the polyhedral cone of the least squares solutions [7]. Other relations between the least squares and total least squares solutions are explored in [7]. In the case where the inverse of the data covariance matrix is not sign-similar to an elementwise positive matrix, the geometry of the solution set is much more involved, as demonstrated in [2].

Footnote 2: The following text presents research results obtained within the framework of the Belgian programme on interuniversity attraction poles (IUAP-17 and IUAP-50) initiated by the Belgian State - Prime Minister's Office - Science Policy Programming. The scientific responsibility is assumed by its author, who is a Senior Research Associate of the Belgian National Fund for Scientific Research. A preliminary version of this text was available as ESAT-SISTA TR-1994-19, Dept. of EE, K.U.Leuven, March 1994.

In the dynamic case, when the model is taken to be a SISO linear time-invariant system, the relationships between least squares methods (e.g. equation error in the $L_2$ norm) and linear dynamic errors-in-variables models (as e.g. explored in [1] [3]) are far less understood.

As a matter of fact, until recently there was no known solution to the so-called dynamic total least squares problem, which is the following (formulated and solved here for SISO systems): Let $w_k \in \mathbb{R}^2$, $k = 0, \ldots, N$, be a given vector sequence of data, where the scalar sequences $u_k$ and $y_k$ are defined by $w_k = \begin{pmatrix} u_k \\ y_k \end{pmatrix}$ (one could consider the first component of $w_k$ to be an input sequence $u_k$ and the second component to be an output sequence $y_k$, or the other way around). Find approximations $v_k$ of the $u_k$, and $z_k$ of the $y_k$, such that $v_k$ and $z_k$ are related by a linear model of given order $n$ with real coefficients:

$$\sum_{i=0}^{n} \alpha_i z_{k-i} + \sum_{i=0}^{n} \beta_i v_{k-i} = 0, \qquad k = n, \ldots, N, \tag{1}$$

and $v_k$ and $z_k$ minimize

$$J = \sum_{k=0}^{N} \left[ (y_k - z_k)^2 h_k + (u_k - v_k)^2 g_k \right], \tag{2}$$

where $g_k$ and $h_k$, $k = 0, \ldots, N$, are given positive weights, subject to

$$\sum_{i=0}^{n} \alpha_i^2 + \sum_{i=0}^{n} \beta_i^2 = 1. \tag{3}$$

It is assumed that $N \geq 3n + 1$ to make constraint (1) meaningful.

The minimization problem (2) with constraints (1) and (3) will be called the dynamic total linear least squares problem.
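The ingredients of the problem statement can be checked numerically for a candidate solution. The following is a minimal NumPy sketch, not the solution method of this paper: it only evaluates the left-hand side of the difference equation (1), the cost (2) and the normalization (3). The function names `model_residual`, `dtls_cost` and `normalize`, and the first-order example system, are hypothetical choices made for illustration.

```python
import numpy as np


def model_residual(alpha, beta, v, z):
    """Left-hand side of the difference equation (1) for k = n, ..., N."""
    alpha, beta, v, z = (np.asarray(x, dtype=float) for x in (alpha, beta, v, z))
    n = len(alpha) - 1          # model order
    N = len(z) - 1              # last time index
    return np.array([
        sum(alpha[i] * z[k - i] + beta[i] * v[k - i] for i in range(n + 1))
        for k in range(n, N + 1)
    ])


def dtls_cost(u, y, v, z, g, h):
    """Weighted sum of squares J of equation (2)."""
    u, y, v, z, g, h = (np.asarray(x, dtype=float) for x in (u, y, v, z, g, h))
    return float(np.sum((y - z) ** 2 * h + (u - v) ** 2 * g))


def normalize(alpha, beta):
    """Rescale the coefficients so that constraint (3) holds."""
    alpha, beta = np.asarray(alpha, dtype=float), np.asarray(beta, dtype=float)
    scale = np.sqrt(np.sum(alpha ** 2) + np.sum(beta ** 2))
    return alpha / scale, beta / scale


# Hypothetical first-order example (n = 1): z_k = 0.5 z_{k-1} + v_{k-1}.
alpha, beta = normalize([1.0, -0.5], [0.0, -1.0])
rng = np.random.default_rng(0)
N = 9                                   # satisfies N >= 3n + 1
v = rng.standard_normal(N + 1)
z = np.zeros(N + 1)
for k in range(1, N + 1):
    z[k] = 0.5 * z[k - 1] + v[k - 1]

# "Observed" data: the exact trajectory plus small perturbations.
du = 0.1 * rng.standard_normal(N + 1)
dy = 0.1 * rng.standard_normal(N + 1)
u, y = v + du, z + dy
g = h = np.ones(N + 1)

residual = model_residual(alpha, beta, v, z)   # zero: (v, z) satisfies (1)
J = dtls_cost(u, y, v, z, g, h)                # cost of this feasible candidate
```

Any pair $(v, z)$ with zero residual is feasible; the dynamic total linear least squares solution is the feasible pair, together with its coefficients, that makes `J` smallest.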

Let us observe that all the observations are treated 'symmetrically' in the sense that not only the outputs are modified, as is the case in equation error methods, but that also the inputs are modified. The weights $g_k$ and $h_k$ are god-given or specified by the user. In the case that $g_k = h_k = 1$, $\forall k$, the formulation corresponds to the statistical errors-in-variables problem, which is maximum likelihood when the input-output data of an unknown linear system are corrupted by additive white Gaussian noise and one wants to find from the observations the difference equation that models the system.

It is the purpose of this paper to present the solution to the dynamic total linear least squares problem and some of its properties. We hope that this contribution will clarify some of the relations in the terra incognita between least squares, total least squares and the Frisch scheme (errors-in-variables) for dynamic systems. We will not give proofs in this paper, but details can be found in [10] [11] [12].

This paper is organized as follows: In Section 2, we briefly recall the static total linear least squares problem and its solution via the singular value decomposition (SVD). We then show how the dynamic total least squares problem is a special case of a structured total least squares problem, and hence its solution can be obtained via a so-called Riemannian singular value decomposition (which is one of the major results of [10]). In Section 3, it is shown how the given data sequence can be decomposed into two sequences, the $L_2$-approximations and the residuals, which are orthogonal to each other (in diagonal inner products derived from the given weights). In Section 4, it is shown how certain block Hankel matrices $\tilde{W}_i$, constructed from the weighted residuals, are always rank deficient and hence the residuals themselves can be considered as being generated by a linear system. In Section 5, we discuss how one can find completions $\hat{W}_i^{com}$ of the block Hankel matrices $\hat{W}_i$ with the $L_2$-approximations, such that $\hat{W}_i^{com} \tilde{W}_i^T = 0$.

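We will use the notations

$$w_k = \begin{pmatrix} u_k \\ y_k \end{pmatrix}, \qquad \hat{w}_k = \begin{pmatrix} v_k \\ z_k \end{pmatrix} \qquad \text{and} \qquad \tilde{w}_k = \begin{pmatrix} (u_k - v_k)\, g_k \\ (y_k - z_k)\, h_k \end{pmatrix},$$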

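for the given data, the optimal approximations and the weighted residuals respectively. By $W_i$, we denote the $2i \times (N + i)$ block Hankel matrix constructed from the sequence $w_k$:

$$W_i = \begin{pmatrix} 0 & \cdots & 0 & 0 & w_0 & w_1 & \cdots & w_{N-1} & w_N \\ 0 & \cdots & 0 & w_0 & w_1 & \cdots & w_{N-1} & w_N & 0 \\ \vdots & & & & & & & & \vdots \\ w_0 & w_1 & \cdots & w_{N-1} & w_N & 0 & 0 & \cdots & 0 \end{pmatrix}. \tag{4}$$

Here, $i$ is a user-defined integer which is assumed to be larger than $n$ (the order in the difference equation (1)). The block Hankel matrix $\tilde{W}_i$ is constructed similarly from the weighted residuals $\tilde{w}_k$. Note that the first block row has $i - 1$ leading zero vectors. We will also use the data sequences $w$, $\hat{w}$ and $\tilde{w}$, all of which are in $\mathbb{R}^{(N+1) \times 2}$, where

$$w^T = \begin{pmatrix} w_0 & w_1 & \cdots & w_{N-1} & w_N \end{pmatrix},$$

and $\hat{w}$ and $\tilde{w}$ are constructed similarly. The sequences $u$, $y$, $v$ and $z$, all in $\mathbb{R}^{N+1}$, are defined as:

$$u^T = \begin{pmatrix} u_0 & u_1 & \cdots & u_{N-1} & u_N \end{pmatrix},$$

$$y^T = \begin{pmatrix} y_0 & y_1 & \cdots & y_{N-1} & y_N \end{pmatrix},$$

$$v^T = \begin{pmatrix} v_0 & v_1 & \cdots & v_{N-1} & v_N \end{pmatrix},$$

$$z^T = \begin{pmatrix} z_0 & z_1 & \cdots & z_{N-1} & z_N \end{pmatrix}.$$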

The vectors $a$ and $b$ contain the coefficients $\alpha_i$ and $\beta_i$ of the linear model. The transfer function associated with the difference equation (1) will be denoted by $b(z)/a(z)$.
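The zero-padded block Hankel matrix of (4), whose first block row has $i - 1$ leading zero vectors, can be assembled as in the following minimal NumPy sketch. The function name `block_hankel` and the small integer test sequence are illustrative assumptions; the samples are stored as the rows of an $(N+1) \times 2$ array, matching the convention used for $w$.

```python
import numpy as np


def block_hankel(w, i):
    """Zero-padded block Hankel matrix with i block rows.

    w : array of shape (N + 1, m), one sample per row (m = 2 in the text).
    Returns an (m * i) x (N + i) matrix; block row j (0-based) starts with
    i - 1 - j zero vectors and ends with j zero vectors.
    """
    w = np.asarray(w, dtype=float)
    n_samples, m = w.shape                      # n_samples = N + 1
    W = np.zeros((m * i, n_samples + i - 1))    # N + i columns
    for j in range(i):
        lead = i - 1 - j                        # number of leading zero vectors
        W[m * j: m * (j + 1), lead: lead + n_samples] = w.T
    return W


# Illustrative data: N = 3 (four samples in R^2), i = 2 block rows.
w = np.arange(8, dtype=float).reshape(4, 2)
W = block_hankel(w, 2)
```

The same routine applied to the weighted residuals gives the corresponding residual matrix; shifting one block row down moves every sample one column to the left, which is the block Hankel property.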

2. SOLUTION VIA A RIEMANNIAN SVD

The static total least squares problem for a given data matrix $A \in \mathbb{R}^{p \times q}$ can be formulated as

$$\min_{B,\, y} \|A - B\|_F \quad \text{subject to} \quad B y = 0, \quad y^T y = 1.$$

The two constraints ensure the rank deficiency of the approximating matrix $B$. As is well known, the solution can be calculated via the 'smallest' singular triplet of $A$ (see [14], [21]), i.e. the triplet $(u, \sigma, v)$ corresponding to the smallest singular value $\sigma$, which satisfies

$$A v = u \sigma, \qquad A^T u = v \sigma, \qquad u^T u = 1, \qquad v^T v = 1. \tag{5}$$

For $\sigma$ the smallest singular value of $A$, the matrix $B$ is given by a rank one update of $A$ as $B = A - u \sigma v^T$.
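For the static case, the rank one update maps directly onto a standard SVD routine. The following is a minimal NumPy sketch, assuming $p \geq q$ and the Frobenius norm; the function name `static_tls` and the random test matrix are illustrative choices, not taken from the paper.

```python
import numpy as np


def static_tls(A):
    """Rank-deficient approximation B of A from the smallest singular triplet.

    Assumes A has at least as many rows as columns (p >= q).
    Returns B together with the triplet (u, sigma, v).
    """
    A = np.asarray(A, dtype=float)
    # numpy returns the singular values in descending order,
    # so the last triplet is the 'smallest' one.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    u, sigma, v = U[:, -1], s[-1], Vt[-1, :]
    B = A - sigma * np.outer(u, v)      # rank one update of A
    return B, u, sigma, v


rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))
B, u, sigma, v = static_tls(A)
```

Here `v` plays the role of the vector $y$ in the constraints: `B @ v` vanishes up to rounding, and the Frobenius norm of `A - B` equals `sigma`.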

It turns out (see [10] [11] [13]) that the solution of the dynamic total least squares problem is given by the following Theorem:

Theorem 1

The vectors $a$ and $b$ that contain the coefficients of the difference equation (1) which is the solution of the dynamic total least squares problem, follow from the 'smallest' singular triplet of a generalized ('nonlinear') singular value decomposition of the form

$$\begin{pmatrix} Y & U \end{pmatrix} \begin{pmatrix} \bar{a} \\ \bar{b} \end{pmatrix} = (D_{\bar{a}} + D_{\bar{b}})\, u\, \tau, \qquad \begin{pmatrix} Y^T \\ U^T \end{pmatrix} u = D_u \begin{pmatrix} \bar{a} \\ \bar{b} \end{pmatrix} \tau,$$

in which $Y$ and $U$ are Hankel matrices of dimension $(N - n + 1) \times (n + 1)$ that are built up with the output and input data; $D_{\bar{a}}$, $D_{\bar{b}}$ and $D_u$ are positive definite matrices, the elements of which are certain quadratic functions of the components of $\bar{a}$, $\bar{b}$ resp. $u$. The vectors $u$ and $(\bar{a}^T\ \bar{b}^T)^T$ are normalized such that

$$u^T (D_{\bar{a}} + D_{\bar{b}})\, u = 1 \qquad \text{and} \qquad \begin{pmatrix} \bar{a}^T & \bar{b}^T \end{pmatrix} D_u \begin{pmatrix} \bar{a} \\ \bar{b} \end{pmatrix} = 1.$$

The minimum value of the object function (2) is given by the smallest singular value $\tau$. The vectors $a$ and $b$ from the difference equation (1) can be obtained from a simple scaling of $\bar{a}$ and $\bar{b}$ so that they satisfy (3). The data sequences $z$ and $v$ can be obtained from the smallest singular triplet and the original data (see [8] for detailed formulas).

Due to space limitations, we cannot provide the full details here (for which we refer to [10] [11] [12] [13]). But let us make the following remarks:

First note the resemblance between the SVD for the static case in (5) and the 'generalized' SVD in Theorem 1. Both are in terms of the given data ($A$ in the static case and the matrix $(Y\ U)$ in the dynamic case).

As a matter of fact, the SVD of Theorem 1 would be a well-known generalized SVD (the restricted SVD, see [9]) if $D_u$, $D_{\bar{a}}$ and $D_{\bar{b}}$ were constant matrices.

Because, on the one hand, these matrices are not constant, as they are a function of the singular vectors to be found, and because, on the other hand, they are always positive definite, we propose to call the generalized SVD of Theorem 1 a Riemannian singular value decomposition (see [13]).

In [10] we demonstrate that the dynamic total least squares problem is a special case of the more general structured total least squares problem (STLS), which can be solved via a Riemannian SVD. The main conclusion of Theorem 1 is that the transfer function $b(z)/a(z)$, associated with the optimal model (1) that solves the dynamic total least squares problem, can be obtained from a Riemannian SVD.

The intermediate steps in the proof of this result [10], which is obtained via the technique of Lagrange multipliers, are instrumental in the derivation of the properties of the dynamic total least squares solution to be stated below. To mention just one property: The difference of the Hankel matrices $Z$ and $V$, which contain the 'modified' output and input data of equation (1), and the Hankel matrices $Y$ and $U$ is a multilinear function of the singular triplet $(u, \tau, (\bar{a}^T\ \bar{b}^T)^T)$. Recall that the static case is similar, as the difference $A - B$ is a rank one matrix. In the dynamic case, however, the rank is larger than one.

A heuristic algorithm which is inspired by the method of inverse iteration is described in [10] [11] (which also contain many more details about the derivation and other additional properties). In [13] we will describe a continuous-time method (a gradient flow) which employs some ideas from differential geometry.

We will now turn to the enumeration of some of the properties that are satisfied by a solution to the dynamic total least squares problem.

3. ORTHOGONALITY OF $\hat{w}$ AND $\tilde{w}$

When, in optimization problems, the criterion to be optimized is a sum of squares, orthogonality is never far away. For instance, for the static total least squares problem, there is a property of orthogonality of the residuals and the data in the approximation matrix $B$, as the column and row spaces of both matrices are perpendicular:

$$(A - B) B^T = 0 \qquad \text{and} \qquad (A - B)^T B = 0.$$
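The static orthogonality property is easy to confirm numerically. The sketch below is an illustration under the assumption that $B$ is obtained by removing the smallest singular triplet of $A$; the variable names and the random test matrix are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))

# Static total least squares approximation: remove the smallest singular triplet.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
B = A - s[-1] * np.outer(U[:, -1], Vt[-1, :])
R = A - B                                    # residual matrix, rank one

row_space_check = R @ B.T                    # (A - B) B^T
col_space_check = R.T @ B                    # (A - B)^T B

# Inner product of the column-stacked matrices, vec(A - B)^T vec(B).
vec_inner = R.flatten(order="F") @ B.flatten(order="F")
```

All three quantities vanish up to rounding error, because the removed singular vectors are orthogonal to the singular vectors that remain in `B`.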


The required extensions of the past and the future can be obtained from a recursion, which is completely characterized by the following

Corollary 2

$$\operatorname{rank} \hat{W}_i^{com} = n + i, \qquad i > n.$$

Hence, the extensions can be calculated from the difference equation (1) once the solution to the dynamic total least squares problem has been obtained from the Riemannian SVD in Theorem 1.

6. CONCLUSIONS

We have presented the $L_2$-optimal solution to the so-called dynamic total least squares problem for SISO systems. There are nice properties that characterize the optimal solution, such as a 'dynamic' orthogonality of the approximating data and the residuals, and the fact that the residuals are also highly structured, as they are generated by a linear system.

Much work remains to be done to find the relations between the existing identification methods for linear dynamic systems in which all observations are assumed to have been corrupted by additive noise. It remains to be investigated how this particular solution to the dynamic errors-in-variables problem fits into 'Frisch-like' descriptions of the solution sets.

Acknowledgment

I would like to thank Berend Roorda from the Tinbergen Institute (Rotterdam) for some stimulating discussions which led to the results in this paper.

7. REFERENCES

[1] Anderson B.D.O., Deistler M. (1984). Identifiability in dynamic errors-in-variables models. J. Time Series Analysis, 5, 1-13.

[2] Anderson B.D.O., Deistler M. (1994). Solution set properties for errors-in-variables problems. Proc. of the 10th IFAC Symposium on System Identification, July 4-6 1994, Copenhagen, Denmark, Session on Errors-in-variables models, to appear.

[3] Beghelli S., Guidorzi R., Soverini U. (1990). The Frisch scheme in dynamic system identification. Automatica, 26, 171-176.

[4] Beghelli S., Castaldi P., Guidorzi R., Soverini U. (1994). The Frisch identification scheme: Properties of the solution in the dynamic case. Proc. of the 10th IFAC Symposium on System Identification, July 4-6 1994, Copenhagen, Denmark, Session on Errors-in-variables models, to appear.

[5] De Moor B., Vandewalle J. (1986). A geometrical approach to the maximal corank problem in the analysis of linear relations. Proc. of the 25th IEEE Conference on Decision and Control, Athens, Greece, 3, 1990-1996.

[6] De Moor B., Vandewalle J. (1986). The uniqueness versus the non-uniqueness principle in the identification of linear relations from noisy data. Proc. of the 25th IEEE Conference on Decision and Control, Athens, Greece, 3, 1663-1666.

[7] De Moor B., Vandewalle J. (1990). A unifying theorem for linear and total linear least squares identification. IEEE Transactions on Automatic Control, Vol. 35, 5, 563-566.

[8] Moonen M., De Moor B., Vandenberghe L., Vandewalle J. (1990). On- and off-line identification of linear state space models. International Journal of Control, Vol. 49, 1, 219-239.†

† Reprinted in Numerical Linear Algebra Techniques for Systems and Control, R.V. Patel, A.J. Laub, P. Van Dooren (Eds.), Reprints Book, IEEE Press, New York, 1993.

[9] De Moor B., Golub G.H. (1991). The restricted singular value decomposition: properties and applications. SIAM Journal on Matrix Analysis and Applications, Vol. 12, 3, 401-425.

[10] De Moor B. (1993). Structured total least squares and L2 approximation problems. Special issue of Linear Algebra and its Applications, on Numerical Linear Algebra Methods in Control, Signals and Systems (eds: Van Dooren, Ammar, Nichols, Mehrmann), Volume 188-189, 163-207.

[11] De Moor B. (1994). Total least squares for affinely structured matrices and the noisy realization problem. ESAT-SISTA Report 1993-21, Department of Electrical Engineering, Katholieke Universiteit Leuven, Belgium, 30 pp., May 1993. Accepted for publication in the IEEE Transactions on Signal Processing.

[12] De Moor B. (1994). L2-optimal linear system identification: Dynamic Total Linear Least Squares for SISO systems. Internal Report ESAT-SISTA, Department of Electrical Engineering, Katholieke Universiteit Leuven.

[13] De Moor B. (1994). The Riemannian SVD: Continuous-time algorithms that solve structured total least squares problems. Department of Electrical Engineering, Internal Report ESAT-SISTA, Katholieke Universiteit Leuven.

[14] Golub G., Van Loan C. (1983). Matrix Computations. Johns Hopkins University Press, Baltimore.

[15] Kalman R.E. (1982). Identification from real data. In: Current Developments in the Interface: Economics, Econometrics, Mathematics (edited by M. Hazewinkel and A.H.G. Rinnooy

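The orthogonality of the static problem, $(A - B) B^T = 0$ and $(A - B)^T B = 0$, implies in particular that

$$\operatorname{vec}(A - B)^T \operatorname{vec}(B) = 0.$$

(The operator $\operatorname{vec}(\cdot)$ stores the columns of the matrix between brackets in a long column vector.) A similar orthogonality property holds true for the dynamic total least squares problem, for which one can prove: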

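Theorem 2

$$\hat{w}^T \tilde{w} = 0 \qquad \text{or} \qquad \begin{pmatrix} v^T \\ z^T \end{pmatrix} \begin{pmatrix} G(u - v) & H(y - z) \end{pmatrix} = 0,$$

where $G$ and $H$ are diagonal matrices with the weights $g_k$ and $h_k$.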

This property can be exploited in the derivation of algorithms for a closely related problem (within Willems' 'behavior' context [18] [19]).

4. RANK DEFICIENCY OF THE RESIDUAL MATRIX

Let $i$ be a given integer for which it is assumed that $i > n$, where $n$ is the order in the difference equation (1). Consider the weighted residual matrix $\tilde{W}_i \in \mathbb{R}^{2i \times (N+i)}$ constructed from the sequence $\tilde{w}_k$ as in (4). Then one can show:

Theorem 3

If $\operatorname{rank}[a\ b] = 2$, then:

$$\operatorname{rank}(\tilde{W}_i) = n + i, \qquad \forall i > n.$$

The fact that $\operatorname{rank}[a\ b] = 2$ is quite natural since it implies that the transfer function of the system is not just a constant.

This Theorem has some far-reaching consequences.

It is well known that, when the rank of a block Hankel matrix behaves as in the Theorem for increasing $i$, the data in the matrix (which in this case are the column vectors of the sequence $\tilde{w}$ padded with the appropriate number of zeros) are generated by a linear system of order $n$ [8]. Hence, Theorem 3 really means that the vector sequence $\tilde{w}$ with the weighted residuals (and padded with zeros) is generated by a linear time invariant system. As a matter of fact, we can prove that this system is described as follows:

Corollary 1

The linear system that generates the weighted residuals $\tilde{w}$ is given by [...]

Here, $a^{rev}(z)$ and $b^{rev}(z)$ are the polynomials obtained from $a(z)$ and $b(z)$ by reversing the order of the coefficients.

From Theorems 2 and 3 we can now give the following interesting interpretation to the solution of the dynamic total linear least squares problem (where for simplicity we assume that the weights $g_k$ and $h_k$ are all 1): The given data $w_k$ (which can be 'arbitrary') are split into two sequences $\hat{w}_k$ and $\tilde{w}_k$, which are not only orthogonal to one another (Theorem 2) but which are also generated by two linear systems: The transfer function $b(z)/a(z)$ for $\hat{w}_k$ can be obtained from a Riemannian SVD as in Theorem 1, while that for $\tilde{w}_k$ is then given by $-a^{rev}(z)/b^{rev}(z)$.

One might wonder about the fact that the sequence of weighted residuals is highly structured, in the sense that it can be modelled by a linear system. However, compared to the static case this is not really surprising: There, the matrix of residuals $A - B \in \mathbb{R}^{p \times q}$ is a rank one matrix, hence can be 'modelled' with $q - 1$ linear relations. So also in the static case the matrix with residuals is highly structured. In the dynamic case however it is surprising that the residuals themselves also behave as a linear system.

5. ORTHOGONALITY OF THE WEIGHTED RESIDUAL MATRIX AND THE COMPLETED MATRIX WITH APPROXIMATIONS

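Let $i$ be a given integer satisfying $i > n$. Let $\hat{W}_i^{com} \in \mathbb{R}^{2i \times (N+i)}$ be a block Hankel matrix constructed from an extended sequence

$$(\hat{w}^{com})^T = \begin{pmatrix} \hat{w}_{-i+1} & \hat{w}_{-i+2} & \cdots & \hat{w}_{-1} & \hat{w}_0 & \cdots & \hat{w}_N & \hat{w}_{N+1} & \cdots & \hat{w}_{N+i-1} \end{pmatrix},$$

in which we call $(\hat{w}_{-i+1}, \ldots, \hat{w}_{-1})$ a past extension and $(\hat{w}_{N+1}, \ldots, \hat{w}_{N+i-1})$ a future extension.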

Theorem 4

There exists a unique extension $(\hat{w}_{-n}, \ldots, \hat{w}_{-1})$ in the past and an extension $(\hat{w}_{N+1}, \ldots, \hat{w}_{N+n})$ in the future such that

$$\hat{W}_i^{com}\, \tilde{W}_i^T = 0, \qquad \forall i > n. \tag{6}$$

This characterization of orthogonality of the optimal solution is quite remarkable. As a special case it implies the orthogonality of Theorem 2. But Theorem 4 says that in addition certain sums of products of vectors of $\hat{w}^{com}$ and $\tilde{w}$ (which, because of the block Hankel structure of $\hat{W}_i^{com}$ and $\tilde{W}_i$, are finite discrete convolutions) are zero. Hence this corresponds to a certain 'dynamic' orthogonality.

SYSID '94, Copenhagen, Denmark, Vol. 3.

Kan), D. Reidel Publishing Co., Dordrecht, 161-196.

[16] Kalman R.E. (1982). System identification from noisy data. Proc. Int. Symp. on Dynamical Systems (Gainesville, Florida 1981), edited by A. Bednarek, Academic Press.

[17] Kalman R.E. (1983). Identifiability and modeling in econometrics. In: P.R. Krishnaiah (Ed.). Developments in Statistics 4, New York, Academic Press.

[18] Roorda B. (1993). Global total least squares modelling of multivariable time series. Preprint Tinbergen Institute, Erasmus University Rotterdam.

[19] Roorda B. (1994). Algorithms for global total least squares modelling of finite multivariable time series. Preprint Tinbergen Institute, Erasmus University Rotterdam.

[20] Scherrer W., Deistler M. (1994). A structure theory for linear dynamic errors-in-variables models. Proc. of the 10th IFAC Symposium on System Identification, July 4-6 1994, Copenhagen, Denmark, Session on Errors-in-variables models, to appear.

[21] Van Huffel S., Vandewalle J. (1991). The Total Least Squares Problem: Computational Aspects and Analysis. Frontiers in Applied Mathematics 9, SIAM, Philadelphia, 300 pp.