
The instrumental variable method and related identification schemes

Citation for published version (APA):
Smets, A. J. (1970). The instrumental variable method and related identification schemes. (EUT report. E, Fac. of Electrical Engineering; Vol. 70-E-15). Technische Hogeschool Eindhoven.

Document status and date: Published: 01/01/1970
Document version: Publisher's PDF, also known as Version of Record (includes final page, issue and volume numbers)



Eindhoven University of Technology
Eindhoven, The Netherlands

THE INSTRUMENTAL VARIABLE METHOD AND RELATED IDENTIFICATION SCHEMES

by

A.J. Smets

TH-Report 70-E-15

November 1970

Submitted in partial fulfillment of the requirements for the degree of Ir. (M.Sc.) at the Eindhoven University of Technology. The work was carried out in the Measurement and Control Group under the directorship of Prof.dr.ir. P. Eykhoff. Advisor: ir. A.J.W. van den Boom.


THE INSTRUMENTAL VARIABLE METHOD AND RELATED IDENTIFICATION SCHEMES

A.J. Smets
Eindhoven University of Technology
Department of Electrical Engineering
Eindhoven, Netherlands

Summary

It is assumed that sampled input and output data of a linear process become available in time. These data will be used immediately to update the previous estimate of the parameter vector. Consequently, recursive estimation techniques will be used.

The instrumental variable (I.V.) method provides an estimator with quick convergence if the output noise level is not excessively high and if the instrumental variables are taken from the model output sequence. The model parameters are adjusted in an intermittent way (time-varying system) according to the estimates of the process parameters. If high noise powers occur, the initial model vector has a strong effect on the speed of convergence.

It is shown that the method indicated by Wong (18) to obtain the "optimal" I.V. matrix can be described in a much more elegant way. This leads to the filtering of input, output and model output data with a (noise-whitening) filter having the same parameters as those of the (inverse) noise filter. In this way the close analogy between the optimal I.V. method and the generalized least-squares method becomes apparent.


Contents

1. Introduction
2. Discussion of some estimation schemes
   2.1. Classical regression analysis
        2.1.1. Forward model
        2.1.2. Forward and backward model
   2.2. Generalized least-squares estimator
   2.3. Generalized least-squares estimator with extended matrix
3. The instrumental variable method
   3.1. Statement of the problem
   3.2. The optimal instrumental variable matrix Z*
   3.3. Realisation of the asymptotic optimal I.V. matrix (Wong)
   3.4. Alternative realisation of the asymptotic optimal I.V. matrix
   3.5. Simple construction of the optimal I.V. algorithm
4. Experimental results and discussion
   4.1. Estimation schemes
   4.2. The iterative algorithm
   4.3. The open-loop process with only one forward parameter
   4.4. The closed-loop process with only one forward parameter
   4.5. The open-loop process with two forward parameters
   4.6. The process description introduced by Åström
   4.7. The generalized extended matrix methods
   4.8. The sub-optimal I.V. method
5. Additional remarks and suggestions
List of symbols
Appendix I: Definitions, conventions
Appendix II: Recursive formulae
Appendix III: Exponential weighting of input and output data
Appendix IV
Literature


1. Introduction

The identification (including parameter estimation) of an industrial process is necessary if we want to have better knowledge about this process. This knowledge can be essential to design a specific regulator, or to improve the performance of an already regulated process (e.g. chemical plants). In biological processes, identification can lead to the estimation of so-called "standard" parameters, from which a model of the process studied can be formed. A patient, for instance, can then be compared with such a model, and the differences between them can give an indication of a disturbance in the patient himself.

As most processes have a "closed-loop" configuration (e.g. the human being reacting to a stimulus or regulating a certain action; biological processes), it is essential to have an estimator which can estimate quite accurately the open-loop parameters of such a process as well as its closed-loop parameters. The instrumental variable estimator is one of the few estimators capable of this.

Now, even if we could make a mathematical model of a process, this model can be too complex to handle (e.g. distillation columns and, in general, any chemical plant). In this case we will have to reduce this mathematical model, with our background knowledge and our physical intuition as a guide, into a more suitable, approximate model description. In this way we will only estimate those parameters of the process which are quite relevant.

The purpose of identification is, among others:

1) to design a regulator (to control an industrial process);
2) to estimate certain specific parameters (biological, economical);
3) depending on the aim, to estimate not only the parameters of the process, but also the parameters of the "noise" (where "noise" stands for all disturbances that make the real process output differ from the measured process output).

It will be clear that the quality of the estimates will depend on:

1) the estimator chosen;
2) the (approximated) model structure;
3) the noise power;
4) the number of observations available.


The instrumental variable (I.V.) algorithm will work well only if we have some a priori knowledge about the structure of the process (order of the difference equation describing the process). It is well known that a variety of methods have already been proposed to solve this "pure" identification of the process (cfr. Åström (9), Woodside (1)). So it will be assumed that the order of the process has been determined using one of these methods before the estimation of the process parameters starts.

In the following chapters we will give:

1) a few estimation schemes (chapter 2);
2) the I.V. estimator, as well as its optimized form (chapter 3);
3) the results obtained using some estimators discussed in chapters 2 and 3 (chapter 4);
4) some ideas about future work, especially on the I.V. estimator.

It will be assumed in chapter 4 that the I.V. algorithm only starts after some knowledge about the process parameters has been obtained. The reason for this is quite obvious, as will be seen in chapter 3 (when all model parameters are assumed to be zero, the model output will have no relation at all to the true process output).

This knowledge will be obtained by applying the least-squares (L.S.) estimator to a few (10 x m, where m is the total number of parameters to be estimated) input and output samples. Furthermore, we will assume that new input-output samples are used immediately to update the estimates. This necessitates an iterative form of the algorithm used. As the estimator will be applied to simulated processes, these simulations will have a transient (at the beginning of the simulation, the process output is generated without any knowledge about its past values; these past values are put equal to zero). To eliminate this transient we calculated the response of the process under study to a unit impulse, and if this response after T samples was smaller than 0.01 (e.g. x_T ≤ 0.01, where the index T indicates a delay of T times τ, with τ the sample period), we decided to reject the first T input and output observations generated.


It is the author's conviction that the use of instrumental variables in the field of parameter estimation will be applied more and more (in a way, the Tally estimate described by Peterka (20) is an I.V. approach). Indeed this method has two fundamental properties which make it quite appealing:

1) the background knowledge and the physical feeling can be used directly in the estimator;


2. Discussion of some estimation schemes

2.1 Classical regression analysis

2.1.1 Forward model

Consider the linear process P, of which the relation between input and output can be described by the following difference equation (D.E.):

x_k = Σ_{i=0}^{p} b_i u_{k-i}   (2.1)

Fig. 2.1: A linear process. The output is disturbed with additive noise.

Let the output x_k be disturbed by an additive noise signal n_k, e.g. y_k = x_k + n_k, where y_k is the observable disturbed output of the process. The sequences {u_k}_1^{N+p} and {y_k}_1^{N+p} are available, where N is considerably larger than the total number of parameters (in this case p+1) to be estimated.
In this case we can estimate the coefficients b_i (i = 0,1,...,p) by considering the following set of equations:

y_{p+k} = Σ_{i=0}^{p} b_i u_{p+k-i} + n_{p+k}   (k = 1,2,...,N)   (2.2)

Using matrix notation, the set of equations (2.2) can be written in a more convenient way:

z = U b + n   (2.3)

with:

z^T = (y_{p+1}, y_{p+2}, ..., y_{p+N}),
b^T = (b_0, b_1, ..., b_p),
n^T = (n_{p+1}, n_{p+2}, ..., n_{p+N}),

U = [ u_{p+1}   u_p        ...  u_1
      u_{p+2}   u_{p+1}    ...  u_2
      ...
      u_{p+N}   u_{p+N-1}  ...  u_N ]

According to classical regression analysis, the least-squares estimator of b is given by (e.g. Deutsch (2), Goldberger (3)):

b̂ = {U^T U}^{-1} U^T z   (2.4)

This is a linear estimator. If {n_k}_1^{N+p} is uncorrelated with {u_k}_1^{N+p}, this estimator gives an unbiased estimate of b:

E(b̂) = b.   (2.5)

If the characteristics of the noise sequence {n_k}_1^{N+p} are known in the form of its covariance matrix, the best linear unbiased least-squares estimator of b is given by (e.g. Goldberger (3)):

b̂ = {U^T Σ^{-1} U}^{-1} U^T Σ^{-1} z   (2.6)

where Σ = E(n n^T) is the covariance matrix of the noise {n_k}_1^{N+p}, and E(n_k) = 0. In this case we can prove that:

cov(b̂) = {U^T Σ^{-1} U}^{-1}   (2.7)

Eq. (2.6) is known as the generalized Gauss-Markov estimator; any choice for Σ other than E(n n^T) will give a covariance matrix of b̂ that is greater than the matrix given in eq. (2.7).

Already in 1960, Levin (4) used the estimator given in eq. (2.4) to estimate the points of the impulse response h(t). Indeed we can write:

y_k = x_k + n_k = Σ_{j=0}^{p} h_j u_{k-j} + n_k   (2.8)

if we suppose all h_i for i ≥ p+1 to be zero. In this way we will always have a truncation error (van den Boom & Melis (5)), but this truncation error can be made arbitrarily small by choosing p large enough.

Fig. 2.2: The impulse response of a first order system.

Levin supposed the sequence {n_k}_1^{N+p} to be a white noise sequence. In this case we get:

ĥ = {U^T U}^{-1} U^T z   (2.9)

and this is an unbiased linear estimator of h, where h^T = (h_0, h_1, ..., h_p).
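To make eqs. (2.4) and (2.9) concrete, the following minimal sketch (added here for illustration; it is not part of the original report) builds the matrix U from a simulated input sequence and computes the least-squares estimate of a first-order impulse response. The process, noise level and sequence lengths are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
p, N = 10, 500                        # truncation order and number of equations
h = 0.8 ** np.arange(p + 1)           # impulse response of a first order system (Fig. 2.2)

u = rng.standard_normal(N + p)                 # input sequence {u_k}
x = np.convolve(u, h)[: N + p]                 # noise-free output x_k, cf. eq. (2.8)
y = x + 0.1 * rng.standard_normal(N + p)       # observed output y_k = x_k + n_k

# Row k of U contains (u_{p+k}, u_{p+k-1}, ..., u_k), cf. the definition below eq. (2.3)
U = np.array([u[k : p + k + 1][::-1] for k in range(N)])
z = y[p : p + N]

h_hat = np.linalg.lstsq(U, z, rcond=None)[0]   # eq. (2.9): h_hat = (U^T U)^{-1} U^T z
print(np.round(h_hat[:4], 2))                  # should be close to (1, 0.8, 0.64, 0.512)
```

With white noise uncorrelated with the input, the estimate is unbiased (eq. (2.5)); the scatter of the individual coefficients is governed by eq. (2.9a) below.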

2.1.2 Forward and backward model (combined, generalized model)

The description of a linear process by its impulse response (forward model) is in a lot of cases not a practical one. Indeed, in general the points h_{p+i} (i > 0) of the impulse response are not zero at all, and thus there will always be a truncation error when estimating h (eq. (2.9)). Of course this error can be made arbitrarily small, but practical reasons limit this procedure (time-consuming manipulations with matrices having large dimensions; finite sequence length N). Furthermore, for white additive noise n the standard deviation σ in each parameter is given by:

σ = { (1/l) ψ_nn(0) / ψ_uu(0) }^{1/2}   (2.9a)

(van den Boom & Melis (5)), where ψ_nn and ψ_uu are the autocorrelation functions of n resp. u, and l is the total length of the observation sequence (e.g. l = (N+p)τ). It thus makes no sense to estimate those h_{p+i} that have a lower value than σ, as their estimate would be "lost" in the noise (Prinsen (6)); this would give no further information about the impulse response of the open-loop process.

A simple example will make it clear that there is much sense in estimating forward as well as backward parameters of a D.E.

Suppose the first order process described by the relation (with |a| < 1):

x_k = u_k + a u_{k-1} + a² u_{k-2} + ... + a^j u_{k-j} + ... = Σ_{i=0}^{∞} a^i u_{k-i}   (2.10)

We can then write:

a x_{k-1} = a u_{k-1} + a² u_{k-2} + ... + a^j u_{k-j} + ... = Σ_{i=1}^{∞} a^i u_{k-i}   (2.11)

or:

x_k − a x_{k-1} = u_k   (2.11a)

The D.E. (2.11a) describes the same physical linear process as eq. (2.10), but it is clear that it will be easier (when taking the necessary precautions) to estimate two parameters (1, a) than all points of the impulse response.

If however we describe the process P by the following D.E.:

x_k + Σ_{i=1}^{q} a_i x_{k-i} = Σ_{i=0}^{p} b_i u_{k-i}   (2.12)

it can be proved (Shaw (7)) that classical regression analysis in general gives biased estimates of both forward and backward parameters of the D.E. (2.12) if there is additive noise on the output.

Suppose the sequences {u_k}_1^{N+q} and {y_k}_1^{N+q} are given, where y_k = x_k + n_k. Then eq. (2.12) can be written:

y_k = Σ_{i=0}^{p} b_i u_{k-i} − Σ_{i=1}^{q} a_i y_{k-i} + n_k + Σ_{i=1}^{q} a_i n_{k-i}   (2.13)

The noise sequence {n_k}_1^{N+q} is considered to have elements taken from a stationary process with zero mean, and the elements n_k are uncorrelated with x_j and u_j (j = 1,2,...,N+q).

We define:

b^T = (b_0, b_1, ..., b_p, −a_1, −a_2, ..., −a_q),
y^T = (y_{q+1}, y_{q+2}, ..., y_{q+N}),
n^T = (n_{q+1}, n_{q+2}, ..., n_{q+N}),   (2.14)

and

Ω(u,y) = [ u_{q+1}  ...  u_{q+1-p} | y_q       ...  y_1
           ...
           u_{q+N}  ...  u_{q+N-p} | y_{q+N-1} ...  y_N ] = (U|Y)

and analogously:

Ω(u,x) = (U|X) and Ω(0,n) = (0|N)   (2.15)

(we will always consider the case q ≥ p; this is no essential restriction).

Now, the least-squares estimator of b is given by:

β = {Ω^T(u,y) Ω(u,y)}^{-1} Ω^T(u,y) y   (2.16)

As

y = Ω(u,y) b + n − Ω(0,n) b   (2.17)

we can combine eq. (2.16) with eq. (2.17) to give:

β = b + {Ω^T(u,y) Ω(u,y)}^{-1} {Ω^T(u,y) n − Ω^T(u,y) Ω(0,n) b}

and thus:

plim_{N→∞} Δβ = plim_{N→∞} (β − b)
             = plim_{N→∞} [ {Ω^T(u,y)Ω(u,y)/N}^{-1} {Ω^T(u,y) n − Ω^T(u,y) Ω(0,n) b}/N ]   (2.18a)

We will show that the expression (2.18a) is in general different from the zero vector, and thus β is an asymptotically biased estimator of b (Shaw (7), Evers (8)).

We suppose that u and n are taken from well-behaved stationary stochastic processes, so the time averages of their products are consistent estimators of their auto- and cross-correlation functions.

Let P = Ω^T(u,y) Ω(u,y) and consider the relevant parts of eq. (2.18a) one by one:

a) plim_{N→∞} P/N = Γ (a positive definite matrix);

b) plim_{N→∞} (1/N) Ω^T(u,y) Ω(0,n) =

   [ 0 | 0                             ]
   [---+------------------------------ ]
   [   | ψ_nn(0)    ...   ψ_nn(1-q)    ]
   [ 0 |   ...                ...      ]
   [   | ψ_nn(q-1)  ...   ψ_nn(0)      ]

with ψ_nn(k) = ψ_nn(−k), ψ_yn(k) = ψ_nn(k) and ψ_un(k) = 0 for all k, where ψ_ab(k) = E{a(j) b(j+k)};

c) plim_{N→∞} (1/N) Ω^T(u,y) n = (0, ..., 0, ψ_nn(1), ψ_nn(2), ..., ψ_nn(q))^T

and so:

plim_{N→∞} Δβ = Γ^{-1} ( 0, ..., 0, ψ_nn(1) + Σ_{i=1}^q a_i ψ_nn(1−i), ..., ψ_nn(q) + Σ_{i=1}^q a_i ψ_nn(q−i) )^T   (2.18b)

As Γ^{-1} is a non-singular matrix, plim_{N→∞} Δβ ≠ 0 (and in general each component of Δβ can be different from zero). The plim Δβ is only equal to the zero vector if:

ψ_nn(j) + Σ_{i=1}^q a_i ψ_nn(j−i) = 0   (j = 1,2,...,q).

This will only be the case when the noise sequence {n_k}_1^{N+q} is derived from a white noise sequence {ξ_k}_1^{N+q} in the following way:

n_k + a_1 n_{k-1} + ... + a_q n_{k-q} = ξ_k   (2.19)
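As an illustration (added here; not part of the original text), the condition above can be verified directly for the first-order case q = 1:

```latex
% For q = 1 the condition below eq. (2.18b) reduces to the single equation
\psi_{nn}(1) + a_1\,\psi_{nn}(0) = 0 .
% If n_k is generated according to eq. (2.19), n_k + a_1 n_{k-1} = \xi_k, then
\psi_{nn}(1) = E\{n_k n_{k+1}\} = E\{n_k(-a_1 n_k + \xi_{k+1})\} = -a_1\,\psi_{nn}(0),
% so the condition is satisfied and the L.S. bias vanishes asymptotically.
```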

Remark 1: This coincides with the special case considered by Åström (10):

{1 + A(z^{-1})} y_k = {b_0 + B(z^{-1})} u_k + ξ_k   (2.20)

The L.S. estimator will then yield asymptotically unbiased estimates of a_i and b_j (Åström (10), Evers (8)). Eq. (2.20) can be written as y_k = x_k + n_k with {1 + A} x_k = {b_0 + B} u_k, and so:

{1 + A(z^{-1})} n_k = ξ_k,

which is equal to eq. (2.19).

Remark 2: We have proved in a rather elaborate way, using the a priori knowledge about the structure of the process (order of the D.E.), that the use of the L.S. estimator in general will yield biased results. Rejecting this a priori knowledge of the order of the D.E. can in some cases lead to an asymptotically unbiased estimate of b.
Indeed, consider the D.E. of the following type (Åström (9)):

{1 + A} y_k = {b_0 + B} u_k + {1 + C} ξ_k   (2.21a)

where A = Σ_{i=1}^q a_i z^{-i}, B = Σ_{i=1}^p b_i z^{-i} and C = Σ_{i=1}^s c_i z^{-i}, and suppose 1 + D = 1/(1 + C) exists, e.g. D can be written as:

D = Σ_{i=1}^r d_i z^{-i}   (finite r).

We can rewrite eq. (2.21a):

{1 + A}{1 + D} y_k = {b_0 + B}{1 + D} u_k + ξ_k   (2.21b)

If we now write:

1 + A* = {1 + A}{1 + D} and b_0 + B* = {b_0 + B}{1 + D}   (2.22)

where A* = Σ_{i=1}^{q+r} a_i* z^{-i} and B* = Σ_{i=1}^{p+r} b_i* z^{-i}, and combine eq. (2.22) with eq. (2.21b), it is seen that:

{1 + A*} y_k = {b_0 + B*} u_k + ξ_k   (2.23)

Using the L.S. estimator on eq. (2.23) (e.g. estimating the parameters a_i* and b_i*), we will obtain consistent estimates of the a_i* and b_i*. The common part in the polynomials 1 + A* and b_0 + B* can then be eliminated, and an unbiased estimate of b is thus obtained (Valis (11), Åström & Eykhoff (12)). Note the correlation between the ideas of this remark and those of the following paragraph.

2.2 Generalized least-squares estimator

Fig. 2.3: The (symbolic) representation of a linear process described by the D.E. (2.24). The generation of the samples u_k* and y_k* is shown.

Suppose that the linear process P can be described by the following D.E. (eq. (2.20)):

{1 + A} y_k = {b_0 + B} u_k + {1 + C} ξ_k   (2.24)

where:

{1 + C} ξ_k = e_k   (2.25)

e.g. {1 + A} y_k = {b_0 + B} u_k + e_k.

As in remark 2 in paragraph 2.1, we suppose 1 + D = 1/(1 + C) exists, and:

D = Σ_{i=1}^r d_i z^{-i}

Eq. (2.25) can then be written as:

ξ_k = {1 + D} e_k   (2.26)

It has been shown in 2.1.2 that the normal least-squares estimator β of b, where b^T = (b_0, b_1, ..., b_p, −a_1, −a_2, ..., −a_q), is biased. Only in the case where e_k = ξ_k (cfr. eq. (2.20)) will the L.S. estimator be consistent.

If we filter the sequences {u_k}_1^{N+q}, {y_k}_1^{N+q} and {e_k}_1^{N+q} in the following way:

u_k* = u_k + d_1 u_{k-1} + ... + d_r u_{k-r} = {1+D} u_k
y_k* = y_k + d_1 y_{k-1} + ... + d_r y_{k-r} = {1+D} y_k   (2.27)
e_k* = e_k + d_1 e_{k-1} + ... + d_r e_{k-r} = {1+D} e_k = ξ_k,

then we can rewrite eq. (2.24) as follows:

{1 + A} y_k* = {b_0 + B} u_k* + ξ_k   (2.28)

It can be seen from eq. (2.28) that, when using the sequences {u_k*}_1^{N+q} and {y_k*}_1^{N+q}, a generalized L.S. estimator is given by:

β* = {Ω^T(u*,y*) Ω(u*,y*)}^{-1} Ω^T(u*,y*) y*   (2.29)

where β* is a consistent estimate of b.

Remark 1: The same idea has also been indicated by Eykhoff (14) when instrumenting the Markov estimate (eq. (2.6) applied on the matrix Ω(u,y) instead of U). He assumes that Σ^{-1} can be written in the following way:

Σ^{-1} = D^T D

(D is in this remark a matrix and not the polynomial D(z^{-1})). Using this notation, β can be written as:

β = {(D Ω(u,y))^T D Ω(u,y)}^{-1} (D Ω(u,y))^T (D y),

which means that the matrix D represents a "noise-whitening" filter. Applying this filter on the sequences {u_k}_1^{N+q} and {y_k}_1^{N+q} leads to an asymptotically unbiased estimate of b.

In practice, the parameters d_i of the (inverse) noise filter are not known; we will try to estimate d^T = (d_1, d_2, ..., d_r) consistently, and then filter the sequences {u_k}_1^{N+q} and {y_k}_1^{N+q} according to eq. (2.27).

The following procedure has been proposed by Clarke (13):

1) making use of eq. (2.16), a first estimate β_0 is found;

2) with this β_0 and the sequences {u_k}_1^{N+q} and {y_k}_1^{N+q} available, a sequence {ê_k}_1^{N+q} can be computed, where:

ê_k = y_k + â_1 y_{k-1} + ... + â_q y_{k-q} − b̂_0 u_k − ... − b̂_p u_{k-p}   (2.30)

3) having {ê_k}_1^{N+q} available, eq. (2.26) can be written in matrix notation:

ê = −Ê d + ξ   (2.31)

and thus an estimate of d (which need not be consistent, as we postulated the noise term in eq. (2.31) to be white noise; our hope is that the d̂ thus obtained will be a good first approximation of d) will be given by:

d̂ = −{Ê^T Ê}^{-1} Ê^T ê   (2.32)

4) filtering the sequences {u_k}_1^{N+q} and {y_k}_1^{N+q} with the estimates d̂_j (j = 1,2,...,r) will give the sequences {u_k*}_1^{N+q} and {y_k*}_1^{N+q};

5) a L.S. solution β* according to eq. (2.29) using the sequences {u_k*}_1^{N+q} and {y_k*}_1^{N+q} is computed, whereafter a new sequence {ê_k}_1^{N+q} will be computed.

If the so-called loss function V = Σ_1^{N+q} ê_k² is smaller for the last sequence than for the next to last, the procedure starts again from 3), else it is stopped. The next to last estimated parameter vector is chosen to be the estimate of the process parameter vector.

Fig. 2.4: Schematic diagram of the procedure proposed by Clarke.

The procedure described above is an explicit estimation scheme and it yields, after using N+q input-output pairs and some runs through the loop, estimates of the process parameters as well as the (inverse) noise parameters.

A recursive algorithm, which is almost the same as the method described by Clarke, has been given by Hastings-James & Sage (15).
Hastings-James & Sage as well as Clarke are not able to prove the convergence of their algorithms, although their results do indicate that they are indeed convergent in practice, even with noise/signal ratios of 7 (15).

Remark 2: By weighting the input and output values (e.g. exponentially, see Appendix III), it is possible to approximate the method of Clarke in a recursive way (Hastings-James & Sage (15)). This weighting is necessary. In an iterative scheme without weighting, past values of u*, y* and ê are held in the memory of the algorithm. Hence they will influence the updating of the parameter vector β. As these past values have been obtained from very poor (in the beginning; small N) estimates of β and d, retaining them in the memory could produce large deviations in the final estimates of β and d (large N). By weighting these past values, they will be gradually removed from the memory, and their influence on the updating of β and d will be eliminated. However, a consequence of this weighting is that the estimates of β and d for large N will have a lower bound on their standard deviations (appendix III).
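To fix the ideas, a minimal batch (non-recursive, unweighted) sketch of Clarke's loop is given below. It is an illustration added to this text, not the report's own program; the fixed number of iterations is an assumption (the report stops on the loss function V instead), and q ≥ p is assumed as above.

```python
import numpy as np

def omega(u, y, p, q):
    """Omega(u,y) of eq. (2.15): rows (u_k, ..., u_{k-p}, y_{k-1}, ..., y_{k-q})."""
    return np.array([np.r_[u[k - p:k + 1][::-1], y[k - q:k][::-1]]
                     for k in range(q, len(y))])

def clarke_gls(u, y, p, q, r, n_iter=5):
    """Batch sketch of Clarke's scheme: steps 1-5, eqs. (2.16), (2.30),
    (2.32), (2.27), (2.29), iterated a fixed number of times."""
    beta = np.linalg.lstsq(omega(u, y, p, q), y[q:], rcond=None)[0]  # step 1, eq. (2.16)
    d = np.zeros(r)
    for _ in range(n_iter):
        e = y[q:] - omega(u, y, p, q) @ beta                         # step 2, eq. (2.30)
        # E has rows (e_{k-1}, ..., e_{k-r}); eq. (2.32): d = -(E^T E)^{-1} E^T e
        E = np.column_stack([e[r - i:len(e) - i] for i in range(1, r + 1)])
        d = -np.linalg.lstsq(E, e[r:], rcond=None)[0]                # step 3
        one_plus_D = np.r_[1.0, d]                                   # filter 1 + D(z^{-1})
        u_s = np.convolve(u, one_plus_D)[:len(u)]                    # step 4, eq. (2.27)
        y_s = np.convolve(y, one_plus_D)[:len(y)]
        beta = np.linalg.lstsq(omega(u_s, y_s, p, q), y_s[q:],
                               rcond=None)[0]                        # step 5, eq. (2.29)
    return beta, d  # beta approximates (b_0..b_p, -a_1..-a_q); d the inverse noise filter
```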

2.3 Generalized least-squares estimator with extended matrix

Again, consider the process P described by the D.E.:

{1 + A} y_k = {b_0 + B} u_k + {1 + C} ξ_k   (2.20)

The sequences {u_k}_1^{N+q} and {y_k}_1^{N+q} are available. The sequence {ξ_k}_1^{N+q} is a white noise sequence. According to 2.2 we suppose:

e_k = {1 + C} ξ_k   (2.25)

and ξ_k = {1 + D} e_k   (2.26)

and so:

e_k = −d_1 e_{k-1} − ... − d_r e_{k-r} + ξ_k   (2.27)

Combining eq. (2.20) and (2.27) we get:

y_k = (u_k, u_{k-1}, ..., u_{k-p}, y_{k-1}, ..., y_{k-q}, e_{k-1}, ..., e_{k-r}) b' + ξ_k   (2.33)

where:

b'^T = (b_0, b_1, ..., b_p, −a_1, ..., −a_q, −d_1, ..., −d_r).   (2.34)

Writing eq. (2.33) in matrix notation we get:

y = (U | Y | E) b' + ξ = Ω(u,y,e) b' + ξ   (2.35)

The estimator:

β' = {Ω^T(u,y,e) Ω(u,y,e)}^{-1} Ω^T(u,y,e) y   (2.36)

gives a consistent estimate of b'. Indeed:

plim_{N→∞} [ {Ω^T(u,y,e) Ω(u,y,e)}^{-1} Ω^T(u,y,e) ξ ]
= (plim_{N→∞} Ω^T(u,y,e) Ω(u,y,e)/N)^{-1} · plim_{N→∞} Ω^T(u,y,e) ξ/N = 0   (2.37)

(note the analogy with Clarke).

The estimator given in eq. (2.36) can be put in a recursive form. The sequence {e_k}_1^{N+q} is not available, and thus we will compute an estimate of it using eq. (2.30). It will thus be clear that we will apply weighting on this recursive algorithm too (analogy with Hastings-James & Sage). Computing time for the algorithm of Hastings-James & Sage is about the same as the time needed to solve eq. (2.36), but now there is no need to filter the data explicitly (cfr. eq. (2.27)).

The results obtained using this algorithm are given in chapter 4, 4.7.1.

Remark 1: We will show the strong analogy between the method proposed by Clarke and the method described in this paragraph. The sequences {u_k}_1^{N+q}, {y_k}_1^{N+q} and {ê_k}_1^{N+q} are once again available.

a) The method given by Clarke:

d̂ = −(Ê^T Ê)^{-1} Ê^T ê + (Ê^T Ê)^{-1} Ê^T ξ   (2.38)

y_k* = y_k + Σ_{i=1}^r d̂_i y_{k-i};  u_k* = u_k + Σ_{i=1}^r d̂_i u_{k-i}   (2.39)

β = {Ω^T(u*,y*) Ω(u*,y*)}^{-1} Ω^T(u*,y*) (y* − ξ)   (2.40)

b) The extended matrix method: premultiplying eq. (2.35) with Ω^T(u,y,e) gives the following set:

Ω^T(u,y) y = Ω^T(u,y) Ω(u,y) β − Ω^T(u,y) (E d − ξ)   (2.41)

E^T y = E^T Ω(u,y) β − E^T (E d − ξ)   (2.42)

Now we know that ê = y − Ω(u,y) β, so eq. (2.42) can be written as:

E^T ê = −(E^T E) d̂   (2.43)

or:

d̂ = −(E^T E)^{-1} E^T ê   (2.44)

and it is seen that eq. (2.44) and eq. (2.38) are the same.
We know that:

E d = [ e_q       e_{q-1}  ...  e_{q+1-r} ] [ d_1 ]
      [ e_{q+1}   e_q      ...  e_{q+2-r} ] [ d_2 ]  =  D e'
      [ ...                               ] [ ... ]
      [ e_{q+N-1} ...           e_{q+N-r} ] [ d_r ]

where D is the N × (N+r−1) band matrix:

D = [ d_r  ...  d_1  0    ...  0
      0    d_r  ...  d_1  ...  0
      ...
      0    ...  0    d_r  ...  d_1 ]

and e'^T = (e_{q+1-r}, ..., e_{q+N-1}). Defining the vectors y' and ξ' in an analogous way as e', we get:

e' = y' − Ω'(u,y) β

and thus:

E d = D y' − D Ω'(u,y) β   (2.45)

The matrix Ω'(u,y) is defined as follows:

Ω'(u,y) = [ u_{q+1-r}  ...  u_{q+1-p-r} | y_{q-r}   ...  y_{1-r}
            ...
            u_q        ...  u_{q-p}     | y_{q-1}   ...  y_0
            ...
            u_{q+N-1}  ...  u_{q+N-1-p} | y_{q+N-2} ...  y_{N-1} ]

Substituting eq. (2.45) in eq. (2.41) we get:

Ω^T(u,y) (y + D y') = Ω^T(u,y) (Ω(u,y) + D Ω'(u,y)) β + Ω^T(u,y) ξ   (2.46)

Now the k-th element of y + D y' is y_{q+k} + d_1 y_{q+k-1} + ... + d_r y_{q+k-r} = y_{q+k}* (with eq. (2.39)). So it can be seen that:

y + D y' = y*   (2.47)

and in an analogous way:

Ω(u,y) + D Ω'(u,y) = Ω(u*,y*)   (2.48)

Eq. (2.46) can be rewritten:

β = {Ω^T(u,y) Ω(u*,y*)}^{-1} Ω^T(u,y) (y* − ξ)   (2.49)

As we can see, eq. (2.49) and eq. (2.40) correspond quite well. The only difference is that, while eq. (2.40) gives a L.S. solution of:

y* = Ω(u*,y*) β + ξ   (2.28b)

eq. (2.49) gives a sort of "instrumental variable" (see chapter 3) solution of this same equation, where the "instrumental variable matrix" Z is equal to Ω(u,y).

The results given in chapter 4, 4.7.2, are obtained by taking Ω^T(u,w) instead of Ω^T(u,y) in eq. (2.49). This means that the algorithm:

β = {Ω^T(u,w) Ω(u*,y*)}^{-1} Ω^T(u,w) y*

has been instrumented. The elements w_k are taken from the output sequence of the model of the process. The choice of the matrix Ω^T(u,w) will be discussed in chapter 3.

Remark 2: Another closely related method, proposed by Young (16), will be given below.
The extended matrix Ω(u,y,ξ) is used in this method. Indeed we can write eq. (2.20) in matrix form as follows:

y = Ω(u,y,ξ) b'' + ξ   (2.50)

where b''^T = (b^T, c^T) and c^T = (c_1, c_2, ..., c_s).

A L.S. solution of eq. (2.50) gives estimates of the parameters a_i and b_j as well as the parameters c_i, which are directly related to the D.E. description (2.20).
The filling of the matrix Ω(u,y,ξ̂) by calculated values of ξ̂_k can be done by using the relation:

ξ̂_k = y_k + â_1 y_{k-1} + ... + â_q y_{k-q} − b̂_0 u_k − ... − b̂_p u_{k-p} − ĉ_1 ξ̂_{k-1} − ... − ĉ_s ξ̂_{k-s}   (2.51)

(where ξ̂_k = 0 for all k ≤ 0; the ĉ_i are the estimates of the parameters c_i).

As the iterative solution proceeds, Ω will be filled with estimates of the c vector, and if the algorithm is convergent, the samples ξ̂_k will tend to the samples ξ_k. In this case β'' will be a better and better estimate of b, as well as ĉ of c.
The algorithm:

β'' = {Ω^T(u,y,ξ̂) Ω(u,y,ξ̂)}^{-1} Ω^T(u,y,ξ̂) y   (2.52)

as well as its modified form:

β'' = {Ω^T(u,w,ξ̂) Ω(u,y,ξ̂)}^{-1} Ω^T(u,w,ξ̂) y   (2.53)

have been instrumented. The results obtained are given for both algorithms in chapter 4, 4.7.3 resp. 4.7.4.
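A batch sketch of the extended matrix idea of eq. (2.36) follows (an added illustration, not the report's program; weighting and recursion are omitted, and the orders are assumed to satisfy q ≥ max(p, r)). The residual sequence is refreshed via eq. (2.30) after each pass; on the first pass the e-columns are zero, so the solution reduces to the ordinary L.S. estimate.

```python
import numpy as np

def extended_matrix_ls(u, y, p, q, r, n_iter=5):
    """Sketch of eq. (2.36): Omega(u,y,e) has rows
    (u_k..u_{k-p}, y_{k-1}..y_{k-q}, e_{k-1}..e_{k-r}); the unknown e_k are
    replaced by residuals computed from the current estimate, eq. (2.30)."""
    e = np.zeros(len(y))                           # residual estimates, start at zero
    beta_ext = None
    for _ in range(n_iter):
        Om_uy = np.array([np.r_[u[k - p:k + 1][::-1], y[k - q:k][::-1]]
                          for k in range(q, len(y))])
        Om_ext = np.array([np.r_[u[k - p:k + 1][::-1], y[k - q:k][::-1],
                                 e[k - r:k][::-1]]
                           for k in range(q, len(y))])
        beta_ext = np.linalg.lstsq(Om_ext, y[q:], rcond=None)[0]  # beta' of eq. (2.36)
        e[q:] = y[q:] - Om_uy @ beta_ext[:p + 1 + q]              # refresh e, eq. (2.30)
    return beta_ext   # (b_0..b_p, -a_1..-a_q, -d_1..-d_r)
```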

3. The instrumental variable method

3.1 Statement of the problem

Consider the process P described by the following D.E.:

{1 + A(z^{-1})} x_k = {b_0 + B(z^{-1})} u_k   (3.1)

where:

A(z^{-1}) = a_1 z^{-1} + a_2 z^{-2} + ... + a_q z^{-q}   (3.2)
B(z^{-1}) = b_1 z^{-1} + b_2 z^{-2} + ... + b_p z^{-p}   (3.3)

The roots of the polynomial 1 + A(z^{-1}) = 0 all lie inside the unit circle in the z-plane (i.e. the system is assumed to be stable).

Now suppose this process to be disturbed by an additive noise signal n_k (not necessarily "white" noise) with zero mean:

y_k = x_k + n_k   (3.4)

In this case, eq. (3.1) can be rewritten as a relation between the observable (disturbed) output y_k and the input u_k:

{1 + A} y_k = {b_0 + B} u_k + e_k, with e_k = {1 + A} n_k   (3.5)

It will be assumed that the noise n_k is the random process output from a general filter driven by a white noise source ξ_k, with the input-output relation:

{1 + F} n_k = {1 + G} ξ_k   (3.6)

Combining eq. (3.6) and eq. (3.5) we get (cfr. Hastings-James & Sage (15)):

{1 + A} y_k = {b_0 + B} u_k + [(1+A)(1+G)/(1+F)] ξ_k   (3.7)

and putting:

(1+A)(1+G)/(1+F) = 1 + C = 1 + c_1 z^{-1} + ... + c_s z^{-s}   (3.8)

we get (cfr. Åström, Bohlin & Wensmark (9)):

{1 + A} y_k = {b_0 + B} u_k + {1 + C} ξ_k   (3.9)

Let q ≥ max(p,s), which is not essential. If the sequences {u_k}_1^{N+q} and {y_k}_1^{N+q} are available, we can write the following set of equations:

y_{q+1} = b_0 u_{q+1} + ... + b_p u_{q+1-p} − a_1 y_q − ... − a_q y_1 + ξ_{q+1} + c_1 ξ_q + ... + c_s ξ_{q+1-s}
y_{q+2} = b_0 u_{q+2} + ... + b_p u_{q+2-p} − a_1 y_{q+1} − ... − a_q y_2 + ξ_{q+2} + c_1 ξ_{q+1} + ... + c_s ξ_{q+2-s}
...
y_{q+N} = b_0 u_{q+N} + ... + b_p u_{q+N-p} − a_1 y_{q+N-1} − ... − a_q y_N + ξ_{q+N} + c_1 ξ_{q+N-1} + ... + c_s ξ_{q+N-s}   (3.10)

Using matrix notation, eq. (3.10) becomes:

y = Ω(u,y) b + e   (3.11)

with:

y^T = (y_{q+1}, y_{q+2}, ..., y_{q+N}),
b^T = (b_0, b_1, ..., b_p, −a_1, ..., −a_q),

Ω(u,y) = [ u_{q+1}  ...  u_{q+1-p} | y_q       ...  y_1
           u_{q+2}  ...  u_{q+2-p} | y_{q+1}   ...  y_2
           ...
           u_{q+N}  ...  u_{q+N-p} | y_{q+N-1} ...  y_N ] = (U|Y)

Let:

e = Ξ c, with c^T = (1, c_1, c_2, ..., c_s) and e^T = (e_{q+1}, e_{q+2}, ..., e_{q+N}),

where Ξ is the N × (s+1) matrix:

Ξ = [ ξ_{q+1}  ξ_q       ...  ξ_{q+1-s}
      ξ_{q+2}  ξ_{q+1}   ...  ξ_{q+2-s}
      ...
      ξ_{q+N}  ξ_{q+N-1} ...  ξ_{q+N-s} ]   (3.12)

and define an N × m matrix Z, where m = p+q+1. We can then write the following expression:

β = {Z^T Ω(u,y)}^{-1} Z^T y = b + {Z^T Ω(u,y)}^{-1} Z^T e   (3.13)

We now assume that the elements of Z are chosen so that:

1) plim_{N→∞} (1/N) Z^T Ω(u,y) gives rise to a non-singular matrix;
2) plim_{N→∞} (1/N) Z^T e = 0 (zero vector)

(the definition of plim is given in appendix I).

In this case it can be proved, using Slutsky's theorem (appendix I), that:

plim_{N→∞} β = plim_{N→∞} {Z^T Ω(u,y)}^{-1} Z^T y = b.   (3.14)

A matrix Z which satisfies the conditions mentioned above is called an "instrumental variable" matrix (Reiersøl (17), Wong (18)).
The rate of convergence of the estimator β (which is indeed a vector random variable) to the unknown parameter vector b will depend heavily on the actual choice of Z. A matrix Z* which minimizes the asymptotic variance of the estimate exists (Wong (18)).
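As a concrete illustration of eq. (3.13) (added here; it anticipates the instrument choice Z = (U|W) of remark 6 in section 3.3), the sketch below generates a model output w_k from the measured input and a given set of model parameters, and solves the I.V. equations. It assumes model parameters reasonably close to the process parameters are already available, e.g. from an initial L.S. run.

```python
import numpy as np

def model_output(u, b, a):
    """Model M of remark 6, section 3.3: w_k = sum_i b_i u_{k-i} - sum_i a_i w_{k-i},
    driven by the measured input only (so w is uncorrelated with the output noise)."""
    w = np.zeros(len(u))
    for k in range(len(u)):
        w[k] = sum(bi * u[k - i] for i, bi in enumerate(b) if k >= i) \
             - sum(ai * w[k - 1 - i] for i, ai in enumerate(a) if k >= i + 1)
    return w

def iv_estimate(u, y, w, p, q):
    """I.V. solution of eq. (3.13) with Z = (U|W): lagged w's replace the
    lagged y's in the instrument matrix, so plim Z^T e / N = 0."""
    Om = np.array([np.r_[u[k - p:k + 1][::-1], y[k - q:k][::-1]]
                   for k in range(q, len(y))])
    Z = np.array([np.r_[u[k - p:k + 1][::-1], w[k - q:k][::-1]]
                  for k in range(q, len(y))])
    return np.linalg.solve(Z.T @ Om, Z.T @ y[q:])  # beta = (Z^T Omega)^{-1} Z^T y
```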

3.2 The optimal instrumental variable matrix Z*

The asymptotic covariance matrix of the estimator β is given by:

lim_{N→∞} E{(β−b)(β−b)^T} = lim_{N→∞} E{(Z^T Ω(u,y))^{-1} Z^T e e^T Z (Ω^T(u,y) Z)^{-1}}
                          = lim_{N→∞} {(Z^T Ω(u,x))^{-1} Z^T Φ Z (Ω^T(u,x) Z)^{-1}}   (3.15)

(Wong (18): assumptions on p. 42 and his appendix IV), where Φ = E(e e^T) is a positive definite matrix, and Ω(u,x) has the same construction as Ω(u,y) defined in 3.1, with the elements y_i replaced by the corresponding elements x_i (x_k is the unmeasurable system output), i.e. Ω(u,x) = (U|X).
It has been proved by Wong (18) that the rate of convergence in covariance is 1/N.
Let:

Z* = Φ^{-1} Ω(u,x)   (3.16)

then eq. (3.15) can be rewritten with Z replaced by Z*:

lim_{N→∞} {(Ω^T(u,x) Φ^{-1} Ω(u,x))^{-1} Ω^T(u,x) Φ^{-1} Φ Φ^{-1} Ω(u,x) (Ω^T(u,x) Φ^{-1} Ω(u,x))^{-1}}
= lim_{N→∞} {Ω^T(u,x) Φ^{-1} Ω(u,x)}^{-1}   (3.17)

And it has been proved by Wong (18) that:

(Z^T Ω(u,x))^{-1} Z^T Φ Z (Ω^T(u,x) Z)^{-1} ≥ (Ω^T(u,x) Φ^{-1} Ω(u,x))^{-1}   (3.18)

Any other choice of Z than the one given in eq. (3.16) will result in an asymptotic covariance matrix that is larger than the one obtained with Z = Z*. So the matrix Z* = Φ^{-1} Ω(u,x) will be called the "optimal instrumental variable matrix".

Now in practice such an optimal I.V. matrix cannot be constructed, because:

1) the sequence {x_k}_1^{N+q} is not available, and so we cannot construct a matrix Ω(u,x) = (U|X);
2) we do not know the parameters c_i, and so we cannot compute the matrix Φ.

However, the idea is clear. Whenever we know something about the parameters c_i and we have a large number of observations available, we will try to generate an I.V. matrix as "close" to Z* as possible.

3.3 Realization of the asymptotic optimal I.V. matrix (Wong (18))

This paragraph is based on the method given by Wong (18) and on the process description given in eq. (3.5). In the following paragraph, this same method, but now based on the process description given in eq. (3.9), will be considered. It will be seen that this latter scheme to obtain Z* without matrix inversion is much easier to follow (and to carry out in practice) than the former one. So, in reading this thesis, this paragraph can be skipped without any loss of continuity, for the essence of it will be described in section 3.4.

We will assume that we know the parameters of the process completely, i.e. b, Φ and Ω(u,x) are given, as well as the parameters of the general noise filter, i.e. f_i and g_j (i = 1,...,ν; j = 1,...,μ). So the discussion given here will be concerned with the idealized mathematical problem of computing approximately the matrix Z* without any matrix inversion. Rewrite eq. (3.6):

n_k + Σ_{i=1}^ν f_i n_{k-i} = ξ_k + Σ_{j=1}^μ g_j ξ_{k-j}   (k = ..., −1, 0, +1, ...)   (3.6)

where ξ_k is a sample taken from a "white" noise process with the properties:

E(ξ_k) = 0 and E(ξ_k ξ_j) = σ_ξ² δ_kj

(where δ_kj is the Kronecker delta).

Fig. 3.1: Noise filter.

If now in eq. (3.6) the initial conditions n_k = 0 (k ≤ 0) are used, this will give rise to an error in the sequence {n_k}_1^∞ derived from the sequence {ξ_k}_{−∞}^{+∞}, although this error will decrease as k grows (cfr. initial conditions in a D.E. are important only for transients). It has been assumed that the polynomial:

1 + f_1 z^{-1} + ... + f_ν z^{-ν} = 0

has its roots inside the unit circle in the z-plane (stable noise filter).

Remark 1: Note that if n_k = ξ_k (e.g. all f_i and g_i equal to zero) for all k, there will be no transient, and consequently no error.

From eq. (3.5) we see that:

e = 𝒩 a   (3.19)

where a^T = (1, a_1, ..., a_q) and:

𝒩 = [ n_{q+1}  n_q       ...  n_1
      n_{q+2}  n_{q+1}   ...  n_2
      ...
      n_{q+N}  n_{q+N-1} ...  n_N ]   (3.20)

Then eq. (3.19) can be rewritten as:

e = A n   (3.22)

where:

n^T = (n_1, n_2, ..., n_{q+N})   (3.23)

and A is the N × (N+q) band matrix (3.21a):

A = [ a_q  a_{q-1}  ...  a_1  1    0         ...  0
      0    a_q      ...       a_1  1         ...  0
      ...
      0    ...      0    a_q  a_{q-1}  ...  a_1  1 ]

Eq. (3.6) can be rewritten as:

F n = G ξ   (3.24)

where ξ^T = (ξ_1, ξ_2, ..., ξ_{q+N}), and F and G are the (N+q) × (N+q) lower triangular band matrices (3.21b,c) built in the same way from the noise-filter parameters f_1, ..., f_ν resp. g_1, ..., g_μ, with ones on the main diagonal.

Combining eq. (3.24) and eq. (3.22) we get:

e = A F^{-1} G ξ   (3.25)

Remark 2: F^{-1} always exists, because det(F) = 1. At the same time, F^{-1} is a lower triangular matrix (L.T.M.), as F itself is a L.T.M.
Substituting eq. (3.25) in Φ = E(e e^T) gives:

Φ = (A F^{-1} G)(A F^{-1} G)^T σ_ξ²   (3.26)

We normalize σ_ξ² to one (this introduces no restriction). From eq. (3.26) and eq. (3.16) it can be seen that:

(A F^{-1} G)(A F^{-1} G)^T Z* = Ω(u,x)   (3.26a)

We will define the matrices V*, H*, W* and T* using the following relations:

V* = A^T Z*        (a)
H* = F^{-T} V*     (b)
W* = G^T H*        (c)   (3.27)
T* = F^{-1} G W*   (d)
Ω(u,x) = A T*      (e)

Using the relations (3.27 a-e), the following scheme will be proposed for the evaluation of the N × m matrix Z* without applying any matrix inversion.

1. A and Ω(u,x) are known: generation of an (N+q) × m matrix T* according to eq. (3.27e). We write eq. (3.27e) explicitly:

Writing out A T* = Ω(u,x) = (U|X) element by element (3.28) results in:

Σ_{i=0}^q a_i t*_{k+q-i,j} = u_{q+1+k-j}   (k = 1,2,...,N; j = 1,2,...,p+1)   (3.29a)

and:

Σ_{i=0}^q a_i t*_{k+q-i,j} = x_{k+m-j}   (k = 1,2,...,N; j = p+2,p+3,...,m),   (3.29b)

with a_0 = 1.

These equations give rise to the filter scheme given in Fig. 3.2. This filter has only a backward part, for which the parameters are equal to those of the backward part of the process P.

Fig. 3.2: The filters giving the same D.E. as given by eq. (3.29a,b).

Remark 3: As A has the dimension N × (N+q), the equations (3.29) give a set of N equations with N+q unknown output values t*_{k,j} (for each j). By choosing q of the unknown output values (initial conditions) in such a way that transients of the noise filter are eliminated, we will get the optimal matrix Z*. Otherwise a suboptimal matrix Z will result.

2. T*, F and G are known: generation of an (N+q) × m matrix W* according to eq. (3.27d). Writing eq. (3.27d) explicitly, we obtain the following set of equations:

Σ_{i=0}^ν f_i t*_{k-i,j} = Σ_{i=0}^μ g_i w*_{k-i,j}   (k = 1,2,...,N+q; j = 1,2,...,m),   (3.30)

with f_0 = g_0 = 1. These equations describe a filter as given in Fig. 3.3, with the initial conditions:

t*_{0,j} = t*_{-1,j} = ... = 0 and w*_{0,j} = w*_{-1,j} = ... = 0   (j = 1,2,...,m).

Fig. 3.3: Equivalent filter of eq. (3.30).

3. W* and G are known: generation of an (N+q) × m matrix H* according to eq. (3.27c). Again, writing eq. (3.27c) explicitly gives the set of equations:

Σ_{i=0}^μ g_i h*_{k+i,j} = w*_{k,j}   (k = N+q, N+q−1, ..., 1; j = 1,2,...,m),   (3.31)

with g_0 = 1. These equations describe a filter as given in Fig. 3.4, with the initial conditions:

h*_{q+N+1,j} = h*_{q+N+2,j} = ... = h*_{q+N+μ,j} = 0   (j = 1,2,...,m).

So, if the sequence {w*_{k,j}}_1^{N+q} is used in reverse order (resulting from the transpose in eq. (3.27c)) as the input of the filter given in Fig. 3.4, the output will deliver the sequence {h*_{k,j}}_{N+q}^1.

Fig. 3.4: Equivalent filter of eq. (3.31).

4. F and H* are known: generation of an (N+q) × m matrix V* according to eq. (3.27b). Writing this equation explicitly, we obtain:

Σ_{i=0}^ν f_i h*_{k+i,j} = v*_{k,j}   (k = N+q, N+q−1, ..., 1; j = 1,2,...,m),   (3.32)

with f_0 = 1. This is once again a description of a filter like that given in Fig. 3.5, with the initial conditions:

h*_{q+N+1,j} = h*_{q+N+2,j} = ... = h*_{q+N+ν,j} = 0   (j = 1,2,...,m).

Fig. 3.5: Equivalent filter of eq. (3.32).

5. A and V* are known: generation of an N × m matrix Z* according to eq. (3.27a). Partition V* and A^T as:

V* = [ V̄* ]    A^T = [ Ā ]
     [ V̿* ]          [ A̿ ]

where V̄* contains the first q rows of V* and V̿* the remaining N rows, so that eq. (3.27a) splits into:

V̄* = Ā Z*   (3.33a)
V̿* = A̿ Z*   (3.33b)

Writing eq. (3.33b) explicitly gives the following set of equations:

Σ_{i=0}^q a_i z*_{k+i,j} = v*_{q+k,j}   (k = 1,2,...,N; j = 1,2,...,m),   (3.34)

with a_0 = 1. This is a description of the filter given in Fig. 3.6, with initial conditions:

z*_{N+1,j} = ... = z*_{N+q,j} = 0   (j = 1,2,...,m).

In general, the sequence {z*_{k,j}}_1^N found in this way will not satisfy eq. (3.33a). If, however, all initial conditions are the same as stated throughout this discussion, the sequence will indeed satisfy eq. (3.33a) (Wong (18)) and will contain the components of the optimal I.V. matrix.

Remark 4: The scheme given here has already been worked out by Wong (18), taking only a backward process. It has been given here using a general process description (forward and backward part). In this case it is clearly seen that all filters will be working on the sequence {y_k}_1^{N+q} as well as on the sequence {u_k}_1^{N+q}.

Remark 5: It has been indicated by Wong (18) that the matrix Z computed with the filters discussed before, using as initial conditions for the first filter t*_{1,j} = ... = t*_{q,j} = 0 (j = 1,2,...,m), is a "good" approximation of the asymptotic optimal I.V. matrix (Wong (18), p. 70).

Fig. 3.6: Equivalent filter of eq. (3.34).

Remark 6: The model output sequence {w_k}_1^{N+q} (Fig. 3.7) is an instrumental variable sequence (Wong (18), p. 90). Indeed, a matrix Z = (U|W), where the elements of W are the w_k corresponding with the y_k in the matrix Y, satisfies the condition:

plim_{N→∞} (1/N) Z^T e = 0

and, on an intuitive basis:

plim_{N→∞} {(1/N) Z^T Ω(u,y)}^{-1} exists.

Consequently Z = (U|W) is an I.V. matrix. It is this matrix that has been used in the estimation programs for which the results given in chapter 4, sections 4.3, 4.4, 4.5 and 4.6 hold.

Fig. 3.7: The process P in parallel with the model M; the model output w_k serves as instrumental variable.

3.4 Alternative realization of the asymptotic optimal I.V. matrix

We will assume that we know the parameters of the process completely, i.e. b, Φ and Ω(u,x) are given, and the process is described by eq. (3.9). The optimal I.V. matrix is Z* = Φ^{-1} Ω(u,x), where Φ = E(e e^T) and e has been defined in eq. (3.12). Writing e explicitly gives:

e = [ e_{q+1} ]   [ ξ_{q+1} + c_1 ξ_q       + ... + c_s ξ_{q+1-s} ]
    [ e_{q+2} ] = [ ξ_{q+2} + c_1 ξ_{q+1}   + ... + c_s ξ_{q+2-s} ]  =  C ξ   (3.35)
    [ ...     ]   [ ...                                           ]
    [ e_{q+N} ]   [ ξ_{q+N} + c_1 ξ_{q+N-1} + ... + c_s ξ_{q+N-s} ]

where C is the N × (N+s) band matrix of the parameters (1, c_1, ..., c_s) and ξ^T = (ξ_{q+1-s}, ..., ξ_{q+N}). Then:

Φ = E{e e^T} = C E{ξ ξ^T} C^T = C C^T (as we normalized σ_ξ² = 1)   (3.36)

Combining eq. (3.36) and eq. (3.16) we get:

C C^T Z* = Ω(u,x)   (3.37)

This can be solved without matrix inversion by introducing the matrix H in the following way:

Ω(u,x) = C H   (3.38a)

and

H = C^T Z*   (3.38b)

(note the similarity between eq. (3.37) and eq. (3.26a)).
Eq. (3.38a) and (3.27e) have the same structure, and so eq. (3.28) can be applied, where now the parameters a_i are replaced by the parameters c_i and the elements t_{ij} are replaced by the elements h_{ij}. So the resulting explicit equations from (3.38a) are given by:

Σ_{i=0}^s c_i h_{k+s-i,j} = u_{q+1+k-j}   (k = 1,2,...,N; j = 1,2,...,p+1)   (3.39a)

and:

Σ_{i=0}^s c_i h_{k+s-i,j} = x_{k+m-j}   (k = 1,2,...,N; j = p+2,p+3,...,m),   (3.39b)

with c_0 = 1. These equations produce the filter scheme proposed in Fig. 3.8. At the same time, remark 3 made in section 3.3 holds here also.

Fig. 3.8: Equivalent filter according to eq. (3.39).

Writing eq. (3.38b) explicitly gives the matrix relation (3.40),

which is equivalent to:

Σ_{i=0}^s c_i z*_{k+i,j} = h_{k+s,j}   (k = 1,2,...,N; j = 1,2,...,m),   (3.41)

with c_0 = 1. The initial conditions are:

z*_{N+1,j} = ... = z*_{N+s,j} = 0   (j = 1,2,...,m).

This is equivalent to the filter scheme given in Fig. 3.9 (note the similarity with point 5 in section 3.3).

Fig. 3.9: Filter giving the D.E. of (3.41) (j = 1,2,...,m).

It is seen that this method eliminates a lot of computation time and reduces possible untrue initial conditions. At the same time, however, we reject the information that the factor (1+A) is implied in the polynomial 1+C.

3.5 Simple construction of the optimal I.V. algorithm

It has been proved by Wong (18) that the optimal I.V. matrix Z* is given by:

Z* = Φ^{-1} Ω(u,x)   (3.16)

where, as defined in section 3.2, Φ = E{e e^T} and Ω(u,x) = (U|X).
Consider the equivalent noise term in the D.E. given in eq. (3.9):

e_k = ξ_k + c_1 ξ_{k-1} + ... + c_s ξ_{k-s}   (3.8)

and in matrix notation:

e = C ξ   (3.35)

where e, C and ξ are already defined in section 3.4.

Putting ξ_{q+1-s} = ξ_{q+2-s} = ... = ξ_q = 0 (initial conditions), we can write the following matrix form:

e = C* ξ*   (3.42)

where C* is the N × N lower triangular band matrix:

C* = [ 1                            0 ]
     [ c_1  1                         ]
     [ ...       ...                  ]
     [ c_s  ...  c_1  1               ]
     [      ...                       ]
     [ 0         c_s  ...  c_1  1     ]

and ξ*^T = (ξ_{q+1}, ξ_{q+2}, ..., ξ_{q+N}).

We will now evaluate the inverse of C*, denoted by D* = C*^{-1}. As C* is a L.T.M., D* also is a L.T.M.; D* exists as det(C*) = 1.

Consider the following matrix identity:

[ 1                        0 ]   [ 1                        0 ]
[ c_1  1                     ]   [ d_1  1                     ]
[ ...       ...              ] × [ ...       ...              ]  =  I   (3.43)
[ c_s  ...  c_1  1           ]   [ d_r  ...  d_1  1           ]
[ 0         c_s  ...  c_1  1 ]   [ 0         d_r  ...  d_1  1 ]

where c_0 = d_0 = 1, c_j = 0 (j > s) and d_j = 0 (j > r).

Eq. (3.43) gives rise to the following set of equations relating the c_j and d_j:

Σ_{i=1}^{j+1} c_{j+1-i} d_{i-1} = 0   (all j > 0)   (3.43a)

Example: j = 1 gives c_1 d_0 + c_0 d_1 = 0, or d_1 = −c_1.

Remark 1: The d_i resulting from eq. (3.43a) are exactly the coefficients of the polynomial {1 + C(z^{-1})}^{-1} = (1 + c_1 z^{-1} + ... + c_s z^{-s})^{-1}. Indeed, eq. (3.43a) could be obtained from inspection of the following relation between the c_j and d_i:

{1 + C(z^{-1})}{1 + D(z^{-1})} = 1   (3.43b)

Remark 2: In eq. (3.43) and (3.43a,b) it is assumed that all d_i (i > r) are small enough to be neglected.

Combining eq. (3.42) and eq. (3.16), we get:

Φ ≈ (D*)^{-1} (D*)^{-T}   (3.44)

and thus:

Z* ≈ D*^T D* Ω(u,x)   (3.45)

The optimal I.V. algorithm is then given by:

β = (Z*^T Ω(u,y))^{-1} Z*^T y = {(Ω^T(u,x) D*^T) D* Ω(u,y)}^{-1} (Ω^T(u,x) D*^T) D* y.

Now define:

Ω*(u,x) = D* Ω(u,x),  Ω*(u,y) = D* Ω(u,y)  and  y* = D* y;

we get:

β = (Ω*^T(u,x) Ω*(u,y))^{-1} Ω*^T(u,x) y*.   (3.46)

This relation implies that, by filtering the sequences {u_k}_1^{N+q} and {y_k}_1^{N+q} with a filter which only has a forward part, of which the parameter values are given by the inverse parameters of the equivalent noise filter (given in eq. (3.9)), we will arrive at an algorithm which is not only consistent, but has optimal properties with respect to the asymptotic covariance matrix.
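A compact sketch of this result follows (an added illustration; it assumes the inverse noise parameters d and a model output sequence w are already available, as discussed in remark 3 below):

```python
import numpy as np

def optimal_iv(u, y, w, d, p, q):
    """Eq. (3.46): filter u, y and the model output w with 1 + D(z^{-1})
    (the inverse of the equivalent noise filter), then solve the I.V.
    equations with the filtered model output as instruments."""
    one_plus_D = np.r_[1.0, d]
    u_s = np.convolve(u, one_plus_D)[:len(u)]       # u*_k
    y_s = np.convolve(y, one_plus_D)[:len(y)]       # y*_k
    w_s = np.convolve(w, one_plus_D)[:len(w)]       # filtered instruments
    rows = lambda a, c: np.array([np.r_[a[k - p:k + 1][::-1], c[k - q:k][::-1]]
                                  for k in range(q, len(c))])
    Om_s, Z_s = rows(u_s, y_s), rows(u_s, w_s)      # Omega*(u,y) and Omega*(u,w)
    return np.linalg.solve(Z_s.T @ Om_s, Z_s.T @ y_s[q:])
```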

Remark 3: As neither the sequence {x_k}_1^{N+q} nor the parameters d_i are known, we will introduce in practice the following approximations:

1) the matrix Ω(u,x) is replaced by the matrix Ω(u,w), where the elements w_i are taken from the model output sequence {w_k}_1^{N+q} (see Fig. 3.7). It has been shown that this matrix is a "good" approximation of the matrix Ω(u,x) (remark 6, section 3.3).

2) The parameters d_i can be estimated using the classical L.S. estimator. Indeed, the equivalent noise was given by:

e_k = {1 + C(z^{-1})} ξ_k   (3.47)

and thus:

ξ_k = {1 + D(z^{-1})} e_k   (3.48a)

or:

e_k = −d_1 e_{k-1} − d_2 e_{k-2} − ... − d_r e_{k-r} + ξ_k   (3.48b)

Now suppose the sequence {e_k}_1^{N+q} is available. Then eq. (3.48b) can be written in matrix notation:

e = −E d + ξ   (3.49)

with:

e^T = (e_{q+1}, ..., e_{q+N}),  d^T = (d_1, ..., d_r),  ξ^T = (ξ_{q+1}, ..., ξ_{q+N})

and:

E = [ e_q       e_{q-1}  ...  e_{q+1-r}
      ...
      e_{q+N-1} ...           e_{q+N-r} ]

An unbiased (L.S.) estimator of d is given by:

d̂ = −(E^T E)^{-1} E^T e   (3.50)

It is clear that the sequence {e_k}_1^{N+q} is not available at all. If, however, an estimate β_0 has been obtained (e.g. by applying a simple I.V. estimator on {u_k}_1^{N+q} and {y_k}_1^{N+q}), we can use this estimate to generate the sequence {e_k^0}_1^{N+q}:

e_k^0 = y_k + â_1 y_{k-1} + ... + â_q y_{k-q} − b̂_0 u_k − ... − b̂_p u_{k-p}   (k = 1,...,N+q)   (3.51)

(with initial conditions y_0 = y_{-1} = ... = y_{1-q} = u_0 = u_{-1} = ... = u_{1-p} = 0).

Taking now the estimator given in eq. (3.50) and applying it to the sequence {e_k^0}_1^{N+q}, we will get an estimate d̂_1 of d. With this d̂_1 we can filter the sequences {u_k}_1^{N+q}, {y_k}_1^{N+q} and {w_k}_1^{N+q}, yielding Ω_1*(u,w), Ω_1*(u,y) and y_1*, and now use the estimator (3.46), i.e.:

β_1* = (Ω_1*^T(u,w) Ω_1*(u,y))^{-1} Ω_1*^T(u,w) y_1*   (3.52)

Now again eq. (3.51) will give, with this β_1* instead of β_0, a new sequence {e_k^1}_1^{N+q}, and so on. Using a similar criterium as mentioned in section 2.2, the algorithm can be stopped.
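The residual-based estimation step of this loop can be sketched as follows (an added illustration under the report's zero initial conditions; the helper implements eqs. (3.51) and (3.50), and its output feeds the filtered estimator of eq. (3.52), sketched above after eq. (3.46)):

```python
import numpy as np

def estimate_d(u, y, beta, p, q, r):
    """Eqs. (3.51) and (3.50): rebuild the equivalent-noise sequence e_k from the
    current estimate beta = (b_0..b_p, -a_1..-a_q), then fit the inverse noise
    parameters d by least squares on e_k = -d_1 e_{k-1} - ... - d_r e_{k-r} + xi_k."""
    e = np.zeros(len(y))
    for k in range(len(y)):                    # eq. (3.51), zero initial conditions
        uu = [u[k - i] if k >= i else 0.0 for i in range(p + 1)]
        yy = [y[k - i] if k >= i else 0.0 for i in range(1, q + 1)]
        e[k] = y[k] - np.r_[uu, yy] @ beta
    # eq. (3.50): rows of E are (e_{k-1}, ..., e_{k-r})
    E = np.column_stack([e[r - i:len(e) - i] for i in range(1, r + 1)])
    return -np.linalg.lstsq(E, e[r:], rcond=None)[0]
```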

The method proposed in this section is closely related to the method introduced by Clarke (13), which is given in chapter 2, section 2.2. In practice we will apply a recursive scheme. After each iteration step the vectors β and d̂ are updated using the new information and an exponentially weighted part of the past information (cfr. chapter 2, section 2.3).

The results obtained with the method proposed in this paragraph are given in chapter 4, section 4.8.
