The estimation of parameters in functional relationship models

Citation for published version (APA):

Hillegers, L. T. M. E. (1986). The estimation of parameters in functional relationship models. Technische Universiteit Eindhoven. https://doi.org/10.6100/IR254536

DOI: 10.6100/IR254536

Document status and date: Published: 01/01/1986

Document Version: Publisher's PDF, also known as Version of Record (includes final page, issue and volume numbers)



THESIS

submitted to obtain the degree of doctor at the Technische Universiteit Eindhoven, on the authority of the Rector Magnificus, prof. dr. F.N. Hooge, to be defended in public before a committee appointed by the Board of Deans on Friday 19 December 1986 at 14.00 hours

by

Leo Thomas Maria Emmanuel Hillegers

This thesis has been approved by the supervisors (promotoren) prof. dr. R. Doornbos and prof. dr. P.C. Sander.

Co-supervisor (copromotor): dr. ir. H.N. Linssen.

Contents

Chapter 1. Introduction

Chapter 2. Relationship models and errors of observation
  2.1 The functional relationship model
      The standard form of the functional relationship model
      Variants of the functional relationship model
  2.2 The structural relationship model
  2.3 The ultrastructural relationship model
  2.4 The factor analysis model
  2.5 Temporally related experiments
  2.6 Model errors and control errors

Chapter 3. Estimating equations
  3.1 Estimating functions and estimating equations
      Theorem 3.1: $n^{1/2}(\hat\vartheta_n - \vartheta) \sim N(0, J^{-1} V J^{-T})$, asymptotically
      Theorem 3.2: $n^{1/2} V^{-1/2} J (\hat\vartheta_n - \vartheta) \sim N(0, I)$, asymptotically

Chapter 4. Linear functional relationship models
  4.1 The functional model $B(\beta)\xi = 0$
  4.2 The construction of the estimating equations
  4.3 The asymptotic variance matrix of the estimates
      4.3.1 The errors are not normally distributed
      4.3.2 The errors are normally distributed
  4.4 The functional model $B_i(\beta)\xi_i = 0$
  4.5 The functional model $a(\beta) + B(\beta)\xi = 0$
  4.6 The functional model $\eta = a(\beta) + B(\beta)\xi$
      4.6.1 General results
      4.6.2 a and B are freely parameterized
      4.6.3 a and B are freely parameterized and $\Omega = \sigma^2 W$
      4.6.4 The bivariate functional model $\eta = a + b\xi$
  4.7 The functional model $\eta = B(\beta)\xi$
      4.7.1 General results
      4.7.2 B is freely parameterized and $\Omega = \sigma^2 W$
      4.7.3 The bivariate functional model $\eta = b\xi$


Chapter 5. Linear structural relationship models
  5.1 The structural model $\eta = a(\beta) + B(\beta)\xi$
      5.1.1 Specification of the model
      5.1.2 The likelihood equations
      5.1.3 The asymptotic variance matrix if $\xi$ and $e$ are normal
      5.1.4 The asymptotic variance matrix in the non-normal case
      5.1.5 The effect of the distribution of $\xi$ on AVAR[vec($\hat\beta$, $\hat\omega$)]
      5.1.6 Elimination of $\mu$ and $\Psi$ from the likelihood equations
      5.1.7 a and B are freely parameterized
            Theorem 5.1.7: B-structural = B-functional
            B calculated by the method of moments
      5.1.8 a and B are freely parameterized and $\Omega = \sigma^2 W$
            B as a solution to an eigenvalue problem
      5.1.9 The bivariate structural model $\eta = a + b\xi$
  5.2 The structural model $\eta = B(\beta)\xi$
      5.2.1 Specification of the model
      5.2.2 The likelihood equations
      5.2.3 The asymptotic variance matrix
      5.2.4 Elimination of $\mu$ and $\Psi$ from the likelihood equations
      5.2.5 B is freely parameterized and $\Omega = \sigma^2 W$
      5.2.6 The bivariate structural model $\eta = b\xi$
      5.2.7 Comparison of b-structural and b-functional

Chapter 6. Non-linear functional relationship models, $f(\xi, \beta) = 0$
  6.1 Introduction
  6.2 Specification of the model
  6.3 The computation of $\hat\beta$
      algorithm, convergence, consistency
  6.4 Approximations to the variance matrix
      6.4.1 VAR($\hat\beta$) obtained from the approximate relation $a(\beta) + B(\beta)\xi = 0$
      6.4.2 VAR($\hat\beta$) obtained from the approximate relation $a + B\xi + C\beta = 0$
            example
  6.5 Reduction of bias in $\hat\beta$
      algorithm
      A comparison between the bias-reduced estimators $\hat\beta_{MC}$ and $\hat\beta_{EP}$

Chapter 7. Some functional relationship models with a more complex structure
  7.1 Replicated observations
  7.2 Equality constraints on the parameters $\beta$


Chapter 8. Example: phase separation in a polymer-solvent mixture
  8.1 Introduction
  8.2 The Flory-Huggins model
  8.3 Experimental data
  8.4 Parameter estimation

Appendix A. Matrix calculus

Appendix B. Some auxiliary theorems

Appendix C. List of symbols

References

Samenvatting (summary in Dutch)

Curriculum vitae


Chapter 1

Introduction

In the natural sciences and, to a lesser extent, in the social sciences it is common to model the behaviour of a system by specifying one or more functional relations between its variables. As an example, consider an enclosed amount of gas. Van der Waals' law,

$$\left(P + \frac{a}{V^2}\right)(V - b) = RT, \qquad (1.1)$$

relates the pressure $P$, the volume $V$ and the temperature $T$ of one mole of gas. The symbol $R$ denotes the (universal) gas constant and $a$ and $b$ are constants specific to the gas being studied. The law states that only two of the system variables $P$, $V$ and $T$ can be varied freely; the value of the third can then be derived with the help of equation (1.1). Before the law can be used as a predictive tool, the constants $a$ and $b$ - called parameters in this context - have to be determined. For that purpose a number of experiments is carried out: the state of the system is varied and the new values of pressure, volume and temperature are measured. If the measurements were exact, only two experiments would be necessary to determine the two parameters and there would be no statistical problem, merely a mathematical one of solving a set of two equations. However, in any practical situation random errors of measurement occur and the observed values deviate slightly from the 'true' values. As a consequence, estimates $\hat a$ and $\hat b$ for the parameters $a$ and $b$ based on these corrupted measurements have a random component. Computational methods for obtaining parameter estimates and for determining their statistical properties are the subject of this thesis.
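The remark that two exact experiments suffice can be made concrete with a small numerical sketch. It is an illustration only: the parameter values, the two states and the bisection solver are assumptions chosen for this example and are not taken from the thesis. Equation (1.1) is solved for $a$ given a trial $b$, and $b$ is then found such that both experiments imply the same $a$. Perturbing one pressure reading by 0.1 % shows how a measurement error propagates into the estimates.

```python
R = 8.314  # gas constant in J/(mol K)

def pressure(V, T, a, b):
    """Van der Waals pressure of one mole, from (P + a/V**2) * (V - b) = R*T."""
    return R * T / (V - b) - a / V**2

def a_given_b(P, V, T, b):
    """Value of a implied by one experiment (P, V, T) for a trial value of b."""
    return V**2 * (R * T / (V - b) - P)

def solve_two_experiments(exp1, exp2, iterations=200):
    """Find (a, b) from two experiments by bisection on b (illustrative solver)."""
    lo, hi = 0.0, 0.9 * min(exp1[1], exp2[1])  # keep V - b positive
    gap = lambda b: a_given_b(*exp1, b) - a_given_b(*exp2, b)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gap(lo) * gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    b = 0.5 * (lo + hi)
    return a_given_b(*exp1, b), b

# Assumed 'true' parameters and two exactly measured states (P, V, T).
A_TRUE, B_TRUE = 0.364, 4.27e-5
exp1 = (pressure(1e-3, 350.0, A_TRUE, B_TRUE), 1e-3, 350.0)
exp2 = (pressure(5e-3, 300.0, A_TRUE, B_TRUE), 5e-3, 300.0)
a_hat, b_hat = solve_two_experiments(exp1, exp2)

# The same computation with a 0.1 % error in the first pressure reading.
noisy1 = (exp1[0] * 1.001, exp1[1], exp1[2])
a_noisy, b_noisy = solve_two_experiments(noisy1, exp2)
```

With exact data the pair `(a_hat, b_hat)` reproduces the assumed parameters; with the perturbed reading the solution shifts, which is the random component referred to above.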

To formulate the problem more generally, we have a $p \times 1$ vector variable $\xi$ comprising the $p$ variables of the system being modelled. The values of the elements of $\xi$ satisfy $q$, $q \le p$, functional relationships:

$$f(\xi, \beta) = 0,$$

where $f$ is a $q \times 1$ vector function with $\xi$ and the parameter vector $\beta$ as arguments. The form of $f$ is known; the values of the elements of $\beta$, however, are unknown. Experiments have been carried out, $n$ in number and indexed by the subscript $i$, and the true values $\xi_i$ of the system variables are observed as $x_i$. Although we know that the $\xi_i$'s satisfy the $q$ relations $f(\xi_i, \beta) = 0$, $i = 1, 2, \ldots, n$, their actual values are unknown. Therefore, the $\xi_i$'s are parameters too. And because their number increases with $n$, they are called incidental parameters. With respect to the measurements $x_i$, it is assumed that each $x_i$ differs from $\xi_i$ by the random error vector $e_i$:

$$x_i = \xi_i + e_i, \quad i = 1, 2, \ldots, n.$$

The errors form a sample from a probability distribution $F(\omega)$, where $\omega$ is a known or partially known parameter vector parameterizing the distribution $F$. The errors are independent with expectation zero. Given the measurements $x_i$, $i = 1, 2, \ldots, n$, and the forms of the relations $f$ and the distribution $F$, the problem addressed in this thesis is to draw statistical conclusions regarding the values of the unknown parameters: the relation parameter vector $\beta$, the error parameter vector $\omega$ and the incidental parameter vectors $\xi_i$, $i = 1, 2, \ldots, n$.

This observational model is known in the literature by the term functional relationship model (Kendall & Stuart 1979) or errors-in-variables model (Wolter & Fuller 1982). In Chapter 2 a rigorous description of the model is given. In that chapter we also describe various particularizations of the functional relationship model with the purpose of demonstrating its generality and wide applicability. One of these particular forms is the regression analysis model, in which no incidental parameters are present, and another the fixed factor analysis model, featuring extra incidental parameters. In addition, the distinction between the functional and the structural relationship model (Chan & Mak 1984) is discussed. In the latter model it is assumed that the true values $\xi_i$ have been generated by some random mechanism $F_\xi$. Inference now relates to the unknown parameters - fixed in number - in the distribution $F_\xi$, not to the $\xi_i$'s as is the case in functional relationship models. With respect to inference regarding $\beta$ and $\omega$, the distinction between functional and structural is important, as will be demonstrated in subsequent chapters.

Chapter 3 is on estimating equations. These equations have the general form

$$g_n(\vartheta, z_1, z_2, \ldots, z_n) = 0,$$

where $g_n$ is a vector function of size $r \times 1$. The function $g_n$ has as arguments $\vartheta$ and $z_1, z_2, \ldots, z_n$: the $z_i$'s, $i = 1, 2, \ldots, n$, form a sample of $n$ independent random vector variables from a probability distribution that is parameterized by the $r \times 1$ parameter vector $\vartheta$ and - possibly - by other parameters. In later chapters equations of this type will be used frequently; $\vartheta$ is then replaced by $(\beta, \omega)$ and $(z_1, z_2, \ldots, z_n)$ by $(x_1, x_2, \ldots, x_n)$. It is proved that the solution $\hat\vartheta_n(z_1, z_2, \ldots, z_n)$ to the estimating equation $g_n = 0$ is consistent and asymptotically, i.e. as $n \to \infty$, normally distributed, provided that the corresponding estimating function $g_n$ meets some suitable requirements. An expression for the asymptotic variance matrix of $\hat\vartheta_n$ is derived.
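To give a feeling for what solving an estimating equation involves, the following sketch treats a scalar case. Everything in it is an assumption made for illustration: the estimating function $\psi(z, \vartheta) = 1/\vartheta - z$ (suited to the rate of an exponential sample), the Newton solver, and the plug-in variance of the 'sandwich' form $J^{-1} V J^{-T}$, with $J$ the mean derivative of $\psi$ and $V$ the mean of $\psi^2$. It is not the general multivariate construction of Chapter 3.

```python
import random

def psi(z, theta):
    """Estimating function for one observation (illustrative choice)."""
    return 1.0 / theta - z

def dpsi(z, theta):
    """Derivative of psi with respect to theta."""
    return -1.0 / theta**2

def solve_estimating_equation(zs, theta=1.0, steps=50):
    """Newton iterations on g_n(theta) = mean of psi(z_i, theta) = 0."""
    n = len(zs)
    for _ in range(steps):
        g = sum(psi(z, theta) for z in zs) / n
        dg = sum(dpsi(z, theta) for z in zs) / n
        theta -= g / dg
    return theta

def sandwich_variance(zs, theta):
    """Plug-in estimate of J^{-1} V J^{-T} for a scalar parameter."""
    n = len(zs)
    J = sum(dpsi(z, theta) for z in zs) / n
    V = sum(psi(z, theta) ** 2 for z in zs) / n
    return V / (J * J)

rng = random.Random(0)
sample = [rng.expovariate(2.0) for _ in range(20000)]  # assumed true rate 2
theta_hat = solve_estimating_equation(sample)
avar_hat = sandwich_variance(sample, theta_hat)
```

For this particular $\psi$ the root is the reciprocal of the sample mean, and the sandwich value should lie near the square of the rate; the variance of `theta_hat` itself is approximately `avar_hat / n`.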

In Chapter 4 the functional relationship model $B(\beta)\xi = 0$ with $\mathrm{VAR}(e) = \Omega(\omega)$ is discussed. The relation is linear in $\xi$. On the assumption of normally distributed errors, $e \sim N[0, \Omega(\omega)]$, we derive the likelihood equations for the parameters of the model: $\xi_1, \xi_2, \ldots, \xi_n$, $\beta$ and $\omega$. However, when solved, these equations yield inconsistent estimates for the parameters $\beta$ and $\omega$. This failure of the maximum-likelihood method is due to the presence of the incidental parameters $\xi_i$ (Neyman & Scott 1948). Fortunately, a simple modification of the likelihood equations suffices to solve this problem. The estimating equations $g_n(\beta, \omega, x_1, x_2, \ldots, x_n) = 0$ then obtained yield consistent estimates $\hat\beta_n(x_1, x_2, \ldots, x_n)$ and $\hat\omega_n(x_1, x_2, \ldots, x_n)$, also for non-normal errors. By applying the results of Chapter 3, it is proved that $\hat\beta_n$ and $\hat\omega_n$ are asymptotically normally distributed. Their asymptotic variance matrix is derived.
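The effect of incidental parameters on maximum likelihood can be seen in the classic paired-observations setting associated with Neyman & Scott. The sketch below is that textbook illustration, not the linear model of Chapter 4: each pair $(x_{i1}, x_{i2})$ has its own mean $\mu_i$ (the incidental parameter) and a common variance $\sigma^2$. The function name and the simulated values are assumptions for this example.

```python
import random

def neyman_scott(n, sigma=1.0, seed=0):
    """Simulate n pairs with individual means and a common variance sigma**2."""
    rng = random.Random(seed)
    ss = 0.0
    for _ in range(n):
        mu_i = rng.uniform(-5.0, 5.0)       # incidental parameter of pair i
        x1 = rng.gauss(mu_i, sigma)
        x2 = rng.gauss(mu_i, sigma)
        mean_i = 0.5 * (x1 + x2)            # ML estimate of mu_i
        ss += (x1 - mean_i) ** 2 + (x2 - mean_i) ** 2
    ml = ss / (2 * n)      # ML estimate of sigma**2, tends to sigma**2 / 2
    modified = ss / n      # divisor reduced by the n estimated means
    return ml, modified

ml_estimate, modified_estimate = neyman_scott(20000)
```

However large `n` becomes, `ml_estimate` settles near half the true variance, whereas the modified divisor gives a consistent estimate. The modification applied in Chapter 4 is of a similar spirit but is derived for the model $B(\beta)\xi = 0$.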

The formulation of the model is as general as possible: by allowing the relation matrix $B$ and the error variance matrix $\Omega$ to be parameterized by, respectively, $\beta$ and $\omega$, denoted by $B(\beta)$ and $\Omega(\omega)$, any particular linear functional relationship model is obtained by choosing the appropriate parameterization of $B$ by $\beta$ and $\Omega$ by $\omega$. This generality enables us to obtain results (estimating equations and asymptotic variance matrices for the estimates $\hat\beta_n$ and $\hat\omega_n$) for such relations as $a + B\xi = 0$, $\eta = B\xi$ and $\eta = a + B\xi$ in combination with various parameterizations of $\Omega$ by particularizing the general results derived for the model $B(\beta)\xi = 0$ with $\mathrm{VAR}(e) = \Omega(\omega)$. In the literature, all these linear functional relationship models are usually analyzed separately.

Chapter 5 deals with linear structural relationship models. The explicit forms $\eta = a(\beta) + B(\beta)\xi$ and $\eta = B(\beta)\xi$ are considered. For normally distributed $\xi$'s, $\xi \sim N[\mu, \Psi]$, and $e$'s, $e \sim N[0, \Omega(\omega)]$, the likelihood equations are derived for the parameters $\mu$, $\Psi$, $\beta$ and $\omega$. No incidental parameters are now present and the likelihood equations are consistent, i.e. the solutions $(\hat\mu, \hat\Psi, \hat\beta, \hat\omega)$ to these equations are consistent. The consistency is maintained if the normality assumption with respect to the distribution of the $\xi$'s and the $e$'s is not met. The mechanism by which the $\xi$'s have been generated, functional or structural, has no effect on the asymptotic behaviour of the estimates. It is proved that the asymptotic variance matrix of $(\hat\beta, \hat\omega)$ depends on this mechanism only via $\mu$ and $\Psi$, the first and second order moments of $\xi$.

From the likelihood equations the unknowns $\mu$ and $\Psi$ can be eliminated. A set of estimating equations in $\beta$ and $\omega$ then remains. These equations are compared with the estimating equations for $\beta$ and $\omega$ in the corresponding functional relationship model. It turns out that the structural and functional sets of estimating equations differ and yield different estimates $(\hat\beta, \hat\omega)$, except for the model $\eta = a + B\xi$ with $a$ and $B$ freely parameterized, i.e. $\beta = \mathrm{vec}(a, B)$. For the simple bivariate relation $\eta = \beta\xi$ the difference appears to be slightly in favour of the structural method in that $\mathrm{avar}(\hat\beta_{\text{structural}}) \le \mathrm{avar}(\hat\beta_{\text{functional}})$. No other relations have been investigated on this point.

Non-linear functional relationship models are dealt with in Chapter 6. Attention is restricted to the case where the error variance matrix $\Omega$ is known up to a scalar proportionality factor $\sigma^2$, that is, $\Omega = \Omega(\sigma^2) = \sigma^2 W$ with $W$ known. In this case, the estimating method applied in Chapter 4 to the linear functional relationship model is equivalent to the orthogonal least squares method. An algorithm of the Gauss-Newton type for solving the orthogonal least squares problem is given.
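For the simplest case, a straight line $\eta = a + b\xi$ with equal error variances in both coordinates (that is, assuming $W = I$), the orthogonal least squares problem has a well-known closed-form solution, sketched below. This closed form is the standard bivariate result and stands in for, rather than reproduces, the Gauss-Newton algorithm of Chapter 6; the function name and the data are illustrative.

```python
import math

def orthogonal_line_fit(xs, ys):
    """Fit eta = a + b*xi by minimizing the sum of squared perpendicular distances."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    # Closed-form slope of the orthogonal regression line (requires sxy != 0).
    b = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    a = my - b * mx
    return a, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]            # points exactly on eta = 1 + 2*xi
a_fit, b_fit = orthogonal_line_fit(xs, ys)
_, b_swapped = orthogonal_line_fit(ys, xs)   # roles of the variables exchanged
```

Unlike ordinary least squares, the fit treats both variables symmetrically: exchanging the roles of the two variables gives the reciprocal slope.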

An approximation to the variance matrix of $\hat\beta$ is obtained by approximating the non-linear relation $f(\xi, \beta) = 0$ by the relation $a(\beta) + B(\beta)\xi = 0$. The latter relation is linear in $\xi$ and the theory of Chapter 4 is applicable. Another approximation to the variance matrix of $\hat\beta$ is obtained by replacing $f(\xi, \beta) = 0$ by $a + B\xi + C\beta = 0$. This relation, which is linear in both $\xi$ and $\beta$, yields a variance matrix of $\hat\beta$ that is easier to calculate than the first approximation, but that is also less accurate. Almost all authors on non-linear functional relationship models describe the second approximation.

The estimator $\hat\beta$ is in general biased. Chapter 6 concludes with an algorithm for reducing this bias.

In Chapter 7 two characteristics of the model structure are described that - although of great practical value - have received little or no attention in the literature. These are:

(1) The presence of equality constraints $g(\beta) = 0$ on the relation parameter $\beta$.

(2) The presence in the model relation $f(\xi, \beta) = 0$ of state variables $\xi$ that have not been observed and for which no measurements $x$ are available.

The theory of the functional relationship model and the algorithms of Chapter 6 are extended to cover these cases.

The last chapter gives a worked out example of a non-linear functional relationship model. The system being modelled is a liquid mixture of a polymer in a solvent. Under certain conditions of temperature and polymer concentration, the mixture separates into two phases. The model describes this thermodynamic phenomenon. The example serves as a test for the various theories and algorithms put forward in the foregoing chapters.

Fairly complex multivariate calculus is needed to derive the many formulae presented in this thesis. Therefore, in Appendix A some less known matrix operators, such as the vec-operator and the Kronecker-operator, are reviewed. Appendix B contains some auxiliary theorems and in Appendix C the most frequently used symbols are listed.


Chapter 2

Relationship models and errors of observation

Summary

In this chapter we will formulate the functional relationship model that plays a central role in this thesis. To show its wide applicability, we will compare it with a number of other statistical relationship models (such as the factor analysis model and the moving average model) that are in common use in various fields of expertise. We will pay particular attention to the structure of the relations that exist or are hypothesized to exist between the variables describing the system being modelled. While reviewing the various models our interest also goes to the stochastic mechanism by which the observations are generated.

Methods for obtaining estimates for the parameters appearing in the models are not discussed, neither are questions associated with parameter identifiability. With respect to the functional relationship model and the structural relationship model this is done in subsequent chapters.

2.1 The functional relationship model

In a functional relationship model it is presupposed that the elements of a vector variable $\xi$, size $p \times 1$, are functionally related:

$$f(\xi, \beta) = 0. \qquad (2.1)$$

The function $f$ has as arguments not only $\xi$, but also the parameter vector $\beta$, called the relation parameter, whose value is unknown and has to be estimated. More than one relation may exist between the elements of $\xi$; $f$ is then a vector function, say of size $q \times 1$, $q \le p$.

A number of experiments has been carried out, say $n$, whereby the value of $\xi$ has varied. In each experiment $\xi$ is observed and its measured value is $x_i$, $i = 1, 2, \ldots, n$. Due to an error of measurement $e_i$ in experiment number $i$, the value $x_i$ deviates from the true value $\xi_{i0}$ of $\xi$:

$$x_i = \xi_{i0} + e_i, \quad i = 1, 2, \ldots, n. \qquad (2.2)$$

To clarify the concept of true value we could consider having available a very accurate instrument with which we could measure the variable $\xi$ with an error so small that it is negligible compared with the error of the instrument that yields the measurement $x$. Of course, it is up to the model builder to decide whether or not true values are appropriate for the system he wants to model.

Not the measurements $x_i$, but the true values $\xi_{i0}$ satisfy equation (2.1) for a particular and unique value $\beta_0$ of $\beta$, the true value of $\beta$. More precisely, (2.1) is then written as

$$f(\xi_{i0}, \beta_0) = 0, \quad i = 1, 2, \ldots, n. \qquad (2.3)$$

The $\xi_{i0}$'s are unknown and constitute additional parameters besides $\beta_0$. Usually, they are not of primary interest and, hence, are termed incidental parameters (Neyman & Scott 1948). Their number increases with $n$, a phenomenon that complicates the estimation procedure considerably, as we will see in Chapter 4.

It is assumed that from one experiment to the other the measurement errors are identically distributed and independent. The error has no systematic part, i.e. its expected value is zero, $\mathbb{E}(e_i) = 0$. We denote the variance matrix of $e_i$ by $\Omega$, $\Omega := \mathrm{VAR}(e_i)$. To allow for different degrees of knowledge regarding $\Omega$ we write $\Omega = \Omega(\omega)$, meaning that the elements of $\Omega$ are functions of the error parameter vector $\omega$. For example, if $\Omega$ is known to within a proportionality factor $\sigma^2$, we have $\Omega(\omega) = \Omega(\sigma^2) = \sigma^2 W$, with $W$ a known and non-negative definite symmetric matrix.

It is also possible that all elements of $\Omega$ are unknown. Then: $\omega = \mathrm{vec}\,\Omega$ with the constraining requirement that the symmetric elements of $\Omega$ are equal.

In (2.4) we now summarize these specifications and, henceforth, will refer to (2.4) as the functional relationship model in its standard form.

The functional relationship model, standard form (2.4)

  number of observations:              $n$
  incidental parameters, true values:  $\xi_{i0}$, $i = 1, 2, \ldots, n$
  relation parameter, true value:      $\beta_0$
  relations:                           $f(\xi_{i0}, \beta_0) = 0$, $i = 1, 2, \ldots, n$
  error parameter, true value:         $\omega_0$
  errors:                              $e_i$, $i = 1, 2, \ldots, n$, independent and identically distributed with $\mathbb{E}(e_i) = 0$ and $\mathrm{VAR}(e_i) = \Omega(\omega_0)$
  observations:                        $x_i$, $x_i = \xi_{i0} + e_i$, $i = 1, 2, \ldots, n$

The standard form may be adapted in various ways, either generalized or particularized, to accommodate the specific structure of the problem at hand. We discuss a few variants to show its generality.

o The explicit form of the functional relationship model

Quite often the vector variable $\xi$ is partitioned into two vectors $\xi_1$ and $\xi_2$ of smaller size, $\xi = \mathrm{vec}(\xi_1, \xi_2)$, such that the response / dependent variables $\xi_2$ are expressed explicitly in terms of the explanatory / independent variables $\xi_1$:

$$\xi_2 = f(\xi_1, \beta). \qquad (2.5)$$

The adjective explicit is used for the form (2.5). In (2.4) the relation between the variables $\xi$ is implicit. For the estimation problem this distinction is irrelevant.

Linssen 1980 and others refer to (2.5) by the term nuisance regression model. They regard the presence of the incidental parameters $\xi_1$ in (2.5) as a troublesome complication.

o The linear functional relationship model

If the function $f$ is linear in $\xi$, that is, if we can write

$$f(\xi, \beta) := B(\beta)\xi = 0, \qquad (2.6)$$

the model is called linear. Note that here the term linear refers to the linearity of $f$ in $\xi$; in regression models the linearity of $f$ in $\beta$ is of importance. The matrix $B$, size $q \times p$, is known as the relation matrix. Its elements are parameterized by $\beta$, as $\Omega$ is by $\omega$. Other linear variants are

$$f(\xi, \beta) := a(\beta) + B(\beta)\xi = 0, \qquad (2.7a)$$
$$f(\xi, \beta) := B(\beta)\xi_1 - \xi_2 = 0, \qquad (2.7b)$$
$$f(\xi, \beta) := a(\beta) + B(\beta)\xi_1 - \xi_2 = 0. \qquad (2.7c)$$

The linear form (2.7a) has an intercept. It is seen to be a particularization of (2.6) when the following replacements are carried out:

$$'\xi' = \mathrm{vec}(1_q, \xi), \qquad 'B' = (a, B), \qquad '\Omega' = \begin{pmatrix} 0_{q,q} & 0_{q,p} \\ 0_{p,q} & \Omega \end{pmatrix}.$$

Here, the quantities between quotes in the left-hand sides refer to the form (2.6), while the quantities in the right-hand sides refer to (2.7a). The symbol $1_q$ denotes the $q$-vector whose elements are all 1. Similarly, the symbols $0_{q,q}$, $0_{q,p}$ and $0_{p,q}$ denote matrices containing only 0's.

The form (2.7b) is the explicit variant of (2.6). It is obtained from (2.6) by the replacements

$$'B' = (B, -I), \qquad '\xi' = \mathrm{vec}(\xi_1, \xi_2).$$

The linear form (2.7c) is a combination of (2.7a) and (2.7b) and, hence, also a particularization of (2.6).
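That the explicit form (2.7b) is a special case of the implicit form (2.6) can be checked numerically. In the sketch below, which assumes the implicit relation matrix is built by appending a negative identity block to $B$ and stacking $\xi_1$ on top of $\xi_2$, the helper names and the numerical values are illustrative only.

```python
def matvec(M, v):
    """Product of a matrix (list of rows) and a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def implicit_relation_matrix(B):
    """Append a negative identity block to B, giving a q x (p1 + q) matrix."""
    q = len(B)
    return [list(row) + [-1.0 if j == i else 0.0 for j in range(q)]
            for i, row in enumerate(B)]

B = [[2.0, 0.5],
     [1.0, -1.0]]                 # explicit relation matrix, q = 2, p1 = 2
xi1 = [3.0, 4.0]                  # independent variables
xi2 = matvec(B, xi1)              # explicit form: xi2 = B xi1
stacked = xi1 + xi2               # vec(xi1, xi2)
residual = matvec(implicit_relation_matrix(B), stacked)
```

The residual of the implicit relation vanishes for any $\xi_1$, so results derived for the implicit form carry over directly to the explicit one.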

Yet another variant is the model with a partially disturbed design as studied by Carroll, Gallo & Gleser 1985:

$$x_1 = \xi_1. \qquad (2.8)$$

Of the independent variables $\xi_1$ and $\xi_2$ - forming the design - the variables $\xi_1$ are observed without error and the variables $\xi_2$ with error. As usual, the dependent variables $\xi_3$ are also observed with error. By writing $B_1(\beta)\xi_1 = B_1(\beta)x_1$ as $a(\beta)$ we immediately see that (2.8) is a particularization of (2.7c).

In addition, an inexhaustible source of different models is offered by the possibility of choosing a particular parameterization for $B$ and/or $\Omega$. Many of these models have indeed been studied in the literature, although separately, leading to a wealth of seemingly unrelated papers.

In Chapter 4 we will discuss the parameter estimation problem - that is, the construction of consistent estimators and their asymptotic variance matrix - for the general linear functional relationship model $B(\beta)\xi = 0$ and particularize the results of this model to a few of its derivatives. In Chapter 6 the non-linear functional relationship model $f(\xi, \beta) = 0$ is dealt with.

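o Replicated observations

Consider a model in which each $\xi_i$, $i = 1, 2, \ldots, n$, is observed $r$, $r \ge 2$, times:

$$x_{ij} = \xi_i + e_{ij}, \quad j = 1, 2, \ldots, r, \quad i = 1, 2, \ldots, n. \qquad (2.9a)$$

Let the model equation be

$$f(\xi_i, \beta) = 0, \quad i = 1, 2, \ldots, n. \qquad (2.9b)$$

This model too fits into the framework of the standard form (2.4). To make this clear, introduce the auxiliary variables $\xi_{ij}$, $\xi_{ij} = \xi_i$, $j = 1, 2, \ldots, r$, and rewrite (2.9) as

  observations:  $x_{ij} = \xi_{ij} + e_{ij}$, $j = 1, 2, \ldots, r$, $i = 1, 2, \ldots, n$
  relations:     $\xi_{ij} - \xi_{i1} = 0$, $j = 2, 3, \ldots, r$, and $f(\xi_{i1}, \beta) = 0$, $i = 1, 2, \ldots, n$

The agreement with (2.4) is now evident.

A worked out example will be given in Chapter 7.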

o Regression models, no incidental parameters

If in the explicit form $\xi_2 = f(\xi_1, \beta)$, see (2.5), the independent variables $\xi_1$ are observed without error, that is, $x_{1i} = \xi_{1i}$, $i = 1, 2, \ldots, n$, the model turns into a regression model. The incidental parameters have disappeared, which becomes clear when we write

$$x_{2i} = f(x_{1i}, \beta) + e_{2i}, \quad i = 1, 2, \ldots, n. \qquad (2.10)$$

Similarly, if $\xi_1$ is free of error in the linear variants $\xi_2 = B_1(\beta)\xi_1$ and $\xi_2 = a(\beta) + B_1(\beta)\xi_1$, see (2.7b) and (2.7c), these models are then regression models.

Thus we see that the time-honoured regression models are just a class within the functional relationship models. Because, moreover, a critical examination of the observational data often reveals that the regression assumption of error-free independent variables is not fulfilled, we advocate a shift in emphasis towards the functional relationship models.
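What goes wrong when the regression assumption is violated can be illustrated by a simulation. The sketch below is an assumption-laden example, not a result from the thesis: a straight line $\eta = a + b\xi$, fixed true values $\xi_i$ on a grid, and normally distributed errors in both coordinates. It compares the ordinary least squares slope with the well-known attenuated limit $b\, s_{\xi\xi} / (s_{\xi\xi} + \sigma_x^2)$, where $s_{\xi\xi}$ is the mean square spread of the true values.

```python
import random

def ols_slope(xs, ys):
    """Ordinary least squares slope of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def simulate(n=20000, a=1.0, b=2.0, sigma_x=2.0, sigma_y=1.0, seed=1):
    """Return the OLS slope and its attenuated large-sample value."""
    rng = random.Random(seed)
    xi = [10.0 * i / (n - 1) for i in range(n)]         # fixed true values
    xs = [v + rng.gauss(0.0, sigma_x) for v in xi]      # x = xi + e_x
    ys = [a + b * v + rng.gauss(0.0, sigma_y) for v in xi]
    m = sum(xi) / n
    s_xi = sum((v - m) ** 2 for v in xi) / n
    limit = b * s_xi / (s_xi + sigma_x ** 2)
    return ols_slope(xs, ys), limit

slope, limit = simulate()
```

With these settings the regression slope stays well below the true value 2 however many observations are taken, which is the practical motivation for treating the independent variable as subject to error.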

o Data reconciliation, no relation parameters

There are situations that can be described well by model equations that do not contain any unknown relation parameters $\beta$. The estimation of the $\xi$'s and of $\omega$ then remains as statistical problem. An example is found in the chemical process industry, see e.g. Tamhane & Mah 1985. Here, the variables are the values at certain places in flows of a fluid through a network of pipes. The relations are the well-known conservation laws of mass and energy. The problem is to reconcile a redundant set of flow measurements.
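A minimal reconciliation example, under assumptions that go beyond the text: a single node with one inflow and two outflows, a single linear mass balance $A\xi = 0$, and known error variances (a known diagonal $\Omega$). The weighted least squares adjustment is then $\hat\xi = x - \Omega A^T (A \Omega A^T)^{-1} A x$, which for one constraint reduces to scalar arithmetic. The function name and the flow values are hypothetical.

```python
def reconcile(x, variances, A):
    """Adjust measurements x so that the single balance sum(A[k] * xi[k]) = 0 holds.

    Assumes independent errors with known variances (diagonal Omega).
    """
    imbalance = sum(a * xk for a, xk in zip(A, x))            # A x
    scale = sum(a * a * v for a, v in zip(A, variances))      # A Omega A^T
    return [xk - v * a * imbalance / scale
            for xk, v, a in zip(x, variances, A)]

A = [1.0, -1.0, -1.0]             # inflow minus the two outflows
x = [10.3, 6.1, 3.9]              # measured flows, imbalance 0.3
variances = [0.04, 0.01, 0.01]    # assumed known error variances
xi_hat = reconcile(x, variances, A)
balance = sum(a * v for a, v in zip(A, xi_hat))
```

The least precise measurement receives the largest correction, and the adjusted flows satisfy the mass balance exactly.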

o $f$ and $\Omega$ are indexed

The specifications (2.4) of the standard form suggest that in each experiment the same set of variables is observed and that the true values of these variables satisfy the same functional relation. In practical situations more complexly structured models can be found.

Consider for example the following - admittedly rather artificial - model. See Figure 2.1.

[Figure 2.1]

In this example two sets of observations are available. The first set has true values $(\xi_{Ii}, \eta_{Ii})$, $i = 1, 2, \ldots, n_I$, which lie on a straight line through the origin. The second set has true values $(\xi_{IIi}, \eta_{IIi})$, $i = 1, 2, \ldots, n_{II}$, which lie on a circle with centre $(a, 0)$ and radius $r$, $a$ and $r$ being unknown parameters. The line and the circle touch. We thus have

$$\text{line:} \quad f_I(\xi_{Ii}, \eta_{Ii}, a, r) \equiv r\,\xi_{Ii} - (a^2 - r^2)^{1/2}\,\eta_{Ii} = 0, \quad i = 1, 2, \ldots, n_I, \qquad (2.13a)$$
$$\text{circle:} \quad f_{II}(\xi_{IIi}, \eta_{IIi}, a, r) \equiv (\xi_{IIi} - a)^2 + \eta_{IIi}^2 - r^2 = 0, \quad i = 1, 2, \ldots, n_{II}. \qquad (2.13b)$$

It is conceivable that the errors $e_I$ and $e_{II}$ have different variance matrices $\Omega_I$ and $\Omega_{II}$.

In the example there are two groups of experiments. This idea could be generalized further, even to the extent that each experiment, indexed by $i$, $i = 1, 2, \ldots, n$, represents one group and has its own set of variables (the vector $\xi_i$), observations (the vector $x_i$) and errors (the vector $e_i$ with variance matrix $\Omega_i$) and its own set of relations (defined by the vector function $f_i$). Accordingly, as a generalization of the standard form (2.4), we write

  relations:  $f_i(\xi_i, \beta) = 0$, $i = 1, 2, \ldots, n$
  errors:     $e_i$, $i = 1, 2, \ldots, n$, independent with $\mathbb{E}(e_i) = 0$ and $\mathrm{VAR}(e_i) = \Omega_i(\omega)$.

To prevent the model from falling apart into disjoint submodels, we require that all $f_i$'s are parameterized by the same relation parameter vector $\beta$ and all $\Omega_i$'s by the same error parameter vector $\omega$. However, not all elements of $\beta$ need to appear in all $f_i$'s, neither all elements of $\omega$ in all $\Omega_i$'s.

The linear variant of this model is discussed in Chapter 4, the non-linear one in Chapter 6. A real-life example is worked out fully in Chapter 8.

o Dependence of $\Omega$ on other parameters than $\omega$, $\Omega = \Omega(\omega, \beta, \xi)$

In the standard formulation (2.4) of the functional model the error variance matrix $\Omega$ is parameterized by the error parameter vector $\omega$, whose elements do not appear elsewhere in the model. The theory in Chapter 4 is based strictly on this assumption. However, there are functional relationship models in which the elements of $\Omega$ depend not only on $\omega$, but also on $\beta$ and/or on $\xi$.

A not uncommon example is a model with relative errors: the standard deviation of a scalar error $e_x$ in the observation $x$ of $\xi$ is proportional to the magnitude of $\xi$:

$$\mathrm{var}(e_x) = \omega\,\xi^2.$$

$\omega$ is the proportionality constant.

If a second variable $\eta$, related to $\xi$ by $\eta = a + b\xi$, is also observed with relative error $e_y$, we have

$$\mathrm{var}(e_y) = \omega\,(a + b\xi)^2.$$

The theory in this thesis does not cover the case $\Omega_i = \Omega_i(\omega, \beta, \xi_i)$. However, extension of the theory from $\Omega_i = \Omega_i(\omega)$ to $\Omega_i = \Omega_i(\omega, \beta)$ is possible.

o Equality constraints on the parameters

In the example given above the model equations (2.13) can alternatively be written as

$$\text{line:} \quad \eta = \xi \tan\varphi, \qquad (2.14a)$$
$$\text{circle:} \quad (\xi - a)^2 + \eta^2 = r^2, \qquad (2.14b)$$
$$\text{angle:} \quad r = a \sin\varphi. \qquad (2.14c)$$

The parameter $\varphi$ measuring the angle between the $\xi$-axis and the line (2.14a) has been introduced. At the same time equation (2.14c) has been added. It contains only relation parameters, no variables. In practical situations it might be handy and advantageous to specify the model in terms of a 'natural' set of parameters, without being forced to eliminate any - strictly speaking - superfluous parameters and to disturb by awkward substitutions a neat and elegant formulation of the model equations.
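The equivalence of the two formulations, and the fact that the line really touches the circle, can be verified numerically. The sketch below picks arbitrary values for the centre and the angle (both assumptions for illustration), derives the radius from the angle constraint, and evaluates the line and circle relations at the point of tangency, which is the foot of the perpendicular from the centre onto the line.

```python
import math

def line_residual(xi, eta, a, r):
    """Left-hand side of the line relation (2.13a)."""
    return r * xi - math.sqrt(a * a - r * r) * eta

def circle_residual(xi, eta, a, r):
    """Left-hand side of the circle relation (2.13b)."""
    return (xi - a) ** 2 + eta ** 2 - r * r

a = 5.0                              # centre of the circle is (a, 0)
phi = math.radians(30.0)             # angle between the xi-axis and the line
r = a * math.sin(phi)                # angle constraint (2.14c)

# Point of tangency, at distance a*cos(phi) from the origin along the line.
t = a * math.cos(phi)
xi_t, eta_t = t * math.cos(phi), t * math.sin(phi)

on_line = line_residual(xi_t, eta_t, a, r)
on_circle = circle_residual(xi_t, eta_t, a, r)
slope_gap = eta_t - xi_t * math.tan(phi)     # line in the form (2.14a)

# Distance from the centre to the line equals the radius for a tangent line.
distance = abs(r * a) / math.sqrt(r * r + (a * a - r * r))
```

All three residuals vanish up to rounding and the distance equals `r`, confirming that the constraint $r = a \sin\varphi$ carries the same information as the square-root coefficient in (2.13a).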

In Chapter 7 we will extend the theory derived for the functional relationship model in its standard form to cover the presence of extra relation parameters and extra equality constraints on these parameters.

Models with extra incidental parameters are conceivable too, see Paragraph 2.4 where the factor analysis model is formulated with such parameters. Details are given in Chapter 7.

It is not very likely that in a practical situation the need will arise to introduce extra error parameters constrained by relations of the type $f(\omega) = 0$. The mechanism of parameterizing $\Omega$ by $\omega$ goes a long way towards preventing this need. Also, our knowledge of the observational errors will usually be so limited as to not justify complex modelling of $\Omega$.

We will not discuss inequality constraints.

2.2 The structural relatIonshIp model

Thestructural relationship model is in its formuLation aLmost identicaL to the functionaL one. The distinction refers to the mechanism by which the true vaLues~are supposed to have been generated and to the ensuing set of parameters to be estimated. The term 'structuraL' is a rather unhappy one (Moran 1971], semanticaLLy devoid of meaning, but we wiLL stick to it as it is used wideLy in the Literature.

The 'structuralists' ascribe to the observer/experimenter a rather passive roLe. The observer cannot or does not want to influence the process/phenomenon he observes. He records the state of the process as weLL as possibLe. From this point of view each ~i is a realization of a random variabLe. Usually the normaL distribution is taken as its distribution:

~j V1 N(fl,ljI), i=1,2,...n, i.i.d. (215)

The expectation vector fl and the variance matrixljIare unknown parameters and are to be estimated.

On the other hand, the 'functionalists' do not mention how the ~'sare generated, either by a random mechanism or by a - possibLy weLL-pLanned - experimentaL design. Inference pertains to all the~i'SthemseLves, not to the parameters of the distribution from which thefS couLd have originated. The 'functionalists' therefore work conditionally; they condition upon the realized~'s.The number offS increases with n, hence the term 'incidentaL' parameter for

(22)

Chapter 2

'Structuralists' are found mainly among sociologists, biologists and econometricians; scientists, however, usually adhere to the functional viewpoint.

In the previous paragraph we have described some variants of the functional relationship model. In principle, all these models have their structural counterpart. In the literature, however, only the explicit linear forms $\xi_2 = a(\beta) + B(\beta)\xi_1$ and $\xi_2 = B(\beta)\xi_1$ of the structural model have been discussed. The problem of obtaining consistent estimates for the parameters $\mu$, $\Psi$, $\beta$ and $\omega$ of these two models will be studied in Chapter 5. The estimation is not particularly difficult, because the number of parameters is fixed and therefore the usual maximum-likelihood method can be applied. We will compare the results obtained for the corresponding functional and structural estimators.

2.3 The ultrastructural relationship model

Cox 1976 and Dolby 1976b introduced the ultrastructural model.

This model can be regarded as a synthesis of the functional and the structural model. In the functional model each $\xi$ stands alone and is an unknown incidental parameter, whereas in the structural model the $\xi$'s originate from a normal distribution $N(\mu, \Psi)$ and here the parameters $\mu$ and $\Psi$ are to be estimated. In the ultrastructural model the $\xi$'s are split up into groups. Within each group, indexed by $i$, $i=1,2,\dots,n$, the $\xi$'s are random with distribution $N(\mu_i, \Psi)$. From group to group the means $\mu_i$, $i=1,2,\dots,n$, are fixed unknown constants, that is, they play the same role as the $\xi_i$'s in the functional model.

To be more specific we now formulate in detail the linear bivariate ultrastructural model $\eta = a + b\xi$, see Figure 2.3.

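[Figure 2.3: the linear bivariate ultrastructural model $\eta = a + b\xi$, showing the group means $\mu_1, \dots, \mu_i$ and the true values $\xi_{ij}$.]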

Specification of the linear bivariate ultrastructural model (2.16):

number of groups: $n$
number of replicates per group: $r$
relation parameters, intercept and slope: $a$ and $b$
group means, incidental parameters: $\mu_i$, $i=1,2,\dots,n$
within group variance: $\psi$
independent variable: $\xi_{ij}$, $\xi_{ij} \sim N(\mu_i, \psi)$, $j=1,2,\dots,r$, $i=1,2,\dots,n$
dependent variable: $\eta_{ij}$, $\eta_{ij} = a + b\xi_{ij}$, $j=1,2,\dots,r$, $i=1,2,\dots,n$
error parameters: $\sigma_x^2$ and $\sigma_y^2$
error in observing $\xi_{ij}$: $e_{x,ij}$, $e_{x,ij} \sim N(0, \sigma_x^2)$, $j=1,2,\dots,r$, $i=1,2,\dots,n$
error in observing $\eta_{ij}$: $e_{y,ij}$, $e_{y,ij} \sim N(0, \sigma_y^2)$, $j=1,2,\dots,r$, $i=1,2,\dots,n$
observation of $\xi_{ij}$: $x_{ij}$, $x_{ij} = \xi_{ij} + e_{x,ij}$, $j=1,2,\dots,r$, $i=1,2,\dots,n$
observation of $\eta_{ij}$: $y_{ij}$, $y_{ij} = \eta_{ij} + e_{y,ij}$, $j=1,2,\dots,r$, $i=1,2,\dots,n$

The random variables $\xi_{ij}$, $e_{x,ij}$, $e_{y,ij}$ are statistically independent. The parameters to be estimated are $\mu_1, \mu_2, \dots, \mu_n$, $\psi$, $a$, $b$, $\sigma_x^2$, $\sigma_y^2$.

In connection with the problem as to whether all parameters in (2.16) are identifiable, Dolby 1976b and later Patefield 1978 and Gleser 1985 make a distinction between the replicated case ($r>1$) and the unreplicated case ($r=1$).

We now show that the ultrastructural model (2.16) can be reformulated either as a functional model or as a structural model. The idea involved is general enough to be applied to other ultrastructural models. Therefore, special theoretical treatment of these models seems unnecessary.


The ultrastructural model formulated as a functional one

Our target is the functional model $\eta = a + b\xi$ with replicated observations. Within each group the points $(x_{ij}, y_{ij})$, $j=1,2,\dots,r$, can be regarded as repeated observations of the point $(\mu_i, \nu_i)$. These points $(\mu_i, \nu_i)$, $i=1,2,\dots,n$, lie on the line $\eta = a + b\xi$. The correspondence between the two models is now established by the following replacements. The quantities between quotes refer to (2.9).

'$\beta$' = vec$(a, b)$, '$\omega$' = vec$(\psi, \sigma_x^2, \sigma_y^2, b)$

'$\xi_i$' = vec$(\mu_i, \nu_i)$, $i=1,2,\dots,n$

'$x_{ij}$' = vec$(x_{ij}, y_{ij})$, $j=1,2,\dots,r$, $i=1,2,\dots,n$

'$f(\xi_i, \beta)$' = $a + b\mu_i - \nu_i = 0$, $i=1,2,\dots,n$.

Note that the relation parameter $b$ also appears in $\Omega$. So here we have $\Omega = \Omega(\omega, \beta)$. Usually the elements in $\beta$ and $\omega$ form disjunct sets.
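To see why $b$ enters the error variance matrix, the composite observational error of the functional formulation can be written out. The following is only a sketch under the specification (2.16), with $\nu_i := a + b\mu_i$; the symbol $\delta_{ij}$ is introduced here purely for illustration and does not appear in the original text.

```latex
\begin{aligned}
\delta_{ij} &:= \xi_{ij} - \mu_i \sim N(0, \psi), \\
x_{ij} - \mu_i &= \delta_{ij} + e_{x,ij}, \\
y_{ij} - \nu_i &= b\,\delta_{ij} + e_{y,ij}, \\
\Omega &= \mathrm{VAR}\begin{pmatrix} x_{ij}-\mu_i \\ y_{ij}-\nu_i \end{pmatrix}
        = \begin{pmatrix} \psi + \sigma_x^2 & b\,\psi \\ b\,\psi & b^2\psi + \sigma_y^2 \end{pmatrix}.
\end{aligned}
```

Under these assumptions the within-group scatter of the true values is absorbed into the error term, which is why the slope $b$ shows up in both the relation and the error parameterization.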

The ultrastructural model formulated as a structural one

The idea is now to formulate the ultrastructural model as a multivariate structural one. Arrange the true values $\xi_{ij}$ into the $n$-vector $\xi_j$:

$$\xi_j := \mathrm{vec}(\xi_{1j}, \xi_{2j}, \dots, \xi_{nj}), \quad j=1,2,\dots,r.$$

Then, as in (2.15), $\xi_j$ has a multivariate normal distribution:

$$\xi_j \sim N(\mu, \Psi), \quad j=1,2,\dots,r,$$

with

$$\mu := \mathrm{vec}(\mu_1, \mu_2, \dots, \mu_n) \quad \text{and} \quad \Psi := \psi I.$$

Note that here the variance matrix $\Psi$ is known to within the scalar parameter $\psi$. In Chapter 5, where we will deal with structural models, it is assumed that $\Psi$ is freely parameterized. Hence,


2.4 The factor analysis model

The factor analysis model is equivalent to the parametric form of the linear functional relationship model. Anderson 1976 and 1984 has drawn attention to this fact. The model specifies that the true values $\xi_i$, vectors of size $p$, $i=1,2,\dots,n$, $n>p$, are not scattered throughout the entire $p$-dimensional space, but lie in a $q$-dimensional linear subspace thereof, $q<p$. In parametric form this is stated as

$$\xi_i = a + B\lambda_i, \quad i=1,2,\dots,n, \qquad (2.17a)$$

and

$$B^T B = I. \qquad (2.17b)$$

The $p$-vector $a$ is the position vector and the columns of the $p \times q$ matrix $B$ are the direction vectors of the hyperplane. The direction vectors have length 1 and are orthogonal to $a$ and to each other. Figure 2.4 depicts the situation for $p=3$ and $q=2$.

[Figure 2.4: $\xi_i = a + \lambda_{1i} b_1 + \lambda_{2i} b_2 = a + (b_1, b_2)\,(\lambda_{1i}, \lambda_{2i})^T = a + B\lambda_i$.]

In the jargon of factor analysis the direction vectors are called factors and the coordinates of these factors factor loadings. The $\lambda_i$'s are known as factor scores.

The parametric form (2.17) is characterized by the presence of extra incidental parameter vectors $\lambda_i$. In principle, they can all be eliminated, thus reducing (2.17a) to a set of $p-q$ linear relations between the elements of $\xi_i$. In Chapter 7 we will deal with extra incidental parameters in detail.

In the structural interpretation of (2.17) the $\lambda_i$'s are generated by a random mechanism, say by $N(\mu, \Psi)$. In factor analysis the terms fixed factor model and random factor model are used to

2.5 Temporally related experiments


In the models we have discussed so far, the states of the process being observed in experiment number $i$ and in experiment number $j$, $i \neq j$, were not related. The values of the variables making up the state vector $\xi_i$ were internally related, but were independent of $\xi_j$ for $i \neq j$. This need not necessarily be the case, as expressed by the following specifications:

model equation: $f(\xi_i, \xi_{i+1}, \beta) = 0$, $i=1,2,\dots,n$ (2.18a)
observations: $x_i$, $x_i = \xi_i + e_i$ (2.18b)
errors: $e_i$, i.i.d. with $\mathbb{E}(e_i)=0$ and $\mathrm{VAR}(e_i)=\Omega(\omega)$. (2.18c)

Listed below are some examples of models well known from the field of time series analysis (see e.g. Box & Jenkins 1971 or Kendall, Stuart & Ord 1983) exhibiting the structure (2.18) or extensions thereof. In these models the index $i$ sequences the experiments in order of time; the vectors $\xi$, $\eta$ and $\zeta$ denote variables; $A$, $B$ and $C$ are coefficient matrices containing unknown parameters; the model equations are formulated in terms of true values and do not contain any random error term.

o The autoregressive model (AR)

$$\xi_i = A_1\xi_{i-1} + A_2\xi_{i-2} + \dots + A_p\xi_{i-p}.$$

o The moving average model (MA)

$$\eta_i = B_0\xi_i + B_1\xi_{i-1} + \dots + B_q\xi_{i-q}.$$

o The autoregressive / moving average model (ARMA)

$$\eta_i = A_1\eta_{i-1} + A_2\eta_{i-2} + \dots + A_p\eta_{i-p} + B_0\xi_i + B_1\xi_{i-1} + \dots + B_q\xi_{i-q}.$$

o The dynamic linear model

$$\xi_i = A\xi_{i-1} + B\zeta_i \quad \text{(autoregressive + input)}$$

$$\eta_i = C\xi_i \quad \text{(output)}$$

Another example - describing a thermodynamic cascade process - is given by Britt & Luecke 1973.

Integral equations

its neighbours. A convolution integral is an example of such a model [Ameloot & Hendrickx 1982 give a practical application]:

$$\eta(t) = \int_{u=0}^{t} B(u, \beta)\,\xi(t-u)\,du, \quad t \ge 0. \qquad (2.19)$$

Here $\xi(t)$ and $\eta(t)$ are functions of time $t$ and $B(t, \beta)$ is the convolutor, parameterized by the unknown parameter vector $\beta$. The discrete analog of (2.19) is

$$\eta_i = \sum_{j=0}^{i} B_j(\beta)\,\xi_{i-j}, \quad i=0,1,2,\dots \qquad (2.20)$$

If $B_j=0$ for large enough values of $j$, say for $j>q$, then (2.20) is identical to the moving average model.
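The reduction of the discrete convolution to a moving average can be checked numerically. The sketch below assumes scalar variables and scalar coefficients, and a series that starts at $i=0$; the function names and the numbers are illustrative only.

```python
def discrete_convolution(kernel, xi):
    """eta[i] = sum_{j=0..i} kernel[j] * xi[i-j]; kernel entries beyond its length count as zero."""
    eta = []
    for i in range(len(xi)):
        total = 0.0
        for j in range(i + 1):
            if j < len(kernel):          # B_j = 0 for j > q
                total += kernel[j] * xi[i - j]
        eta.append(total)
    return eta


def moving_average(coeffs, xi):
    """MA(q) with scalar coefficients; terms with a negative time index are taken as zero."""
    q = len(coeffs) - 1
    return [
        sum(coeffs[j] * xi[i - j] for j in range(q + 1) if i - j >= 0)
        for i in range(len(xi))
    ]


xi = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0]   # true values of the input variable
b = [0.5, 0.3, 0.2]                    # B_0, B_1, B_2, so q = 2
eta_conv = discrete_convolution(b, xi)
eta_ma = moving_average(b, xi)
```

With a kernel that vanishes beyond lag $q$, both functions return the same sequence, which is the statement made above for (2.20).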

Differential equations

Yet another example featuring temporally related state variables is a boundary value problem:

$$\frac{d^2}{dt^2}\xi(t) + B_1(t, \xi, \beta)\,\frac{d}{dt}\xi(t) + B_0(t, \xi, \beta) = 0, \qquad \frac{d}{dt}\xi(0) = 0, \quad \xi(1) = 1, \quad 0 \le t \le 1. \qquad (2.21)$$

Here, the function $\xi(t)$ has to satisfy a second order differential equation. At the boundary $t=0$ the slope of $\xi(t)$ is given and at $t=1$ its value. At some intermediate points $\xi(t)$ is observed with error. The parameter $\beta$ has to be estimated. The differential quotients in (2.21) can be approximated by finite differences, transforming the differential equation into an algebraic equation of the form $f(\xi_{i-p}, \dots, \xi_i, \dots, \xi_{i+p}, \beta)=0$, an extension of (2.18).

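Unfortunately, the case of temporally or spatially related experiments as expressed by $f(\xi_i, \xi_{i+1}, \beta)=0$ does not fit into the standard form of the functional relationship model. The specification in (2.4) that is violated is the statistical independence of the observational errors. Here we have to take the composite error $(e_i, e_{i+1})$, and indeed $(e_i, e_{i+1})$ is correlated with $(e_{i+1}, e_{i+2})$:

$$\mathrm{COV}\left[\begin{pmatrix} e_i \\ e_{i+1} \end{pmatrix}, \begin{pmatrix} e_{i+1} \\ e_{i+2} \end{pmatrix}\right] = \mathbb{E}\begin{pmatrix} e_i e_{i+1}^T & e_i e_{i+2}^T \\ e_{i+1} e_{i+1}^T & e_{i+1} e_{i+2}^T \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ \Omega & 0 \end{pmatrix}.$$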

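However, if we arrange the experiments in large units with an overlap of one single experiment, the individual units behave more and more as unrelated when the unit size increases. To be more precise, let $n=NM$ where $N$ is the number of units and $M$ the unit size and define the observational error $e_i(M)$ as:

for unit 1: $e_1(M) = \mathrm{vec}(e_1, e_2, \dots, e_M)$

for unit 2: $e_2(M) = \mathrm{vec}(e_M, e_{M+1}, \dots, e_{2M-1})$

for unit $i$: $e_i(M) = \mathrm{vec}(e_{(i-1)M-i+2}, e_{(i-1)M-i+3}, \dots, e_{iM-i+1})$, $i=1,2,\dots,N$.

Note that two consecutive units have one experiment in common. We then have

$$\mathbb{E}[e_i(M)] = 0, \qquad \mathrm{VAR}[e_i(M)] = \begin{pmatrix} \Omega & 0 & \cdots & 0 \\ 0 & \Omega & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \Omega \end{pmatrix} = I_M \otimes \Omega,$$

$$\mathrm{COV}[e_i(M), e_{i+1}(M)] = \begin{pmatrix} 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \\ \Omega & 0 & \cdots & 0 \end{pmatrix}.$$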

The covariance between errors in neighbouring units consists mainly of zeros. The non-zero elements become negligible when $M \to \infty$. Hence, we conjecture that the asymptotic theory we will derive for the standard form (2.4) requires a slight modification only in order to be applicable to the form (2.18).
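The sparsity of the cross-covariance can be made concrete with a small index calculation. The sketch below only tracks which experiments belong to which unit, using the index formula for unit $i$ given above; the function names are illustrative assumptions and no random numbers are involved.

```python
def unit_indices(i, M):
    """Experiment numbers in unit i: (i-1)M-i+2, ..., iM-i+1 (1-based, as in the text)."""
    start = (i - 1) * M - i + 2
    return list(range(start, start + M))


def cross_cov_pattern(i, M):
    """Block pattern of COV[e_i(M), e_{i+1}(M)]: 1 marks a block equal to Omega, 0 a zero block."""
    unit_a = unit_indices(i, M)
    unit_b = unit_indices(i + 1, M)
    return [[1 if r == c else 0 for c in unit_b] for r in unit_a]


def nonzero_fraction(M):
    """Share of non-zero blocks in the cross-covariance of two consecutive units."""
    pattern = cross_cov_pattern(1, M)
    return sum(map(sum, pattern)) / float(M * M)


pattern_4 = cross_cov_pattern(2, 4)
fractions = [nonzero_fraction(M) for M in (2, 10, 100)]
```

Only the block in the last row and first column is non-zero, so the share of non-zero blocks is $1/M^2$ and shrinks quickly as the unit size grows.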

2.6 Model errors and control errors

One of the first problems confronting the builder of a statistical model concerns the specification of the stochastic mechanism by which the observations are generated. The subsequent inference problem depends crucially on the choice made.

In the functional and structural relationship models observational/measurement errors have been postulated to explain why the observations do not satisfy the model equations. In these models the idea is that if the process being observed could be 'frozen' and repeated measurements of the state variables were made, the mean values of these measurements would satisfy the model equations.

Model errors

Consider, however, as an example the relation between the length $\xi$ of the human foot and the height $\eta$ of the instep. It is clear that the individual $(\xi, \eta)$ pairs will not lie on a smooth curve and that the scattering is due to biological variability, not to errors of measurement. In this situation we can either condition on $\xi$ and try to find the relation between $\mathbb{E}(\eta \mid \xi)$ and $\xi$ or condition on $\eta$ and ask for the relation between $\mathbb{E}(\xi \mid \eta)$ and $\eta$. These two cases are commonly modelled by the introduction of a model error / disturbance / shock $e$:

$$\eta_i = f(\xi_i, \beta) + e_i, \quad \text{with } e_i \text{ i.i.d., } \mathbb{E}(e_i)=0 \text{ and } \mathrm{var}(e_i)=\Omega(\omega), \quad i=1,2,\dots,n, \qquad (2.22a)$$

or

$$\xi_i = g(\eta_i, \beta) + e_i, \quad \text{with } e_i \text{ i.i.d., } \mathbb{E}(e_i)=0 \text{ and } \mathrm{var}(e_i)=\Omega(\omega), \quad i=1,2,\dots,n. \qquad (2.22b)$$

In (2.22a) the measurements $\xi_i$ are treated as constants and the $\eta_i$'s as random variables, while in (2.22b) it is the other way round. The formulations (2.22) are indistinguishable from regression models with measurement errors. Hence, models with model errors are special cases of functional relationship models.

The simultaneous equations model popular among econometricians falls in this category; see Anderson 1976, Anderson 1984 and Freedman & Peters 1984.

Control errors

Lastly, we pay some attention to the so-called control errors. This type of error has been described by Berkson 1950 and Fedorov 1974. A simple example is the determination of the thermal expansion coefficient $b$ of an iron rod. The length $\eta$ of the rod is measured at different temperatures $\xi$. The assumed relation is $\eta = a + b\xi$. The peculiarity of this example is that the chosen temperature values are controlled by a thermostat; the realized values $\xi_i$, $i=1,2,\dots,n$, deviate from the settings $x_i$ by a control error $e_{\xi i}$. The lengths $\eta_i$ are measured as $y_i$ with measurement errors $e_{yi}$. For $i=1,2,\dots,n$ we thus have

$$\xi_i = x_i + e_{\xi i}, \quad \mathbb{E}(e_{\xi i})=0, \quad \mathrm{var}(e_{\xi i})=\sigma_\xi^2 \qquad (2.23a)$$

$$\eta_i = a + b\xi_i \qquad (2.23b)$$

$$y_i = \eta_i + e_{yi}, \quad \mathbb{E}(e_{yi})=0, \quad \mathrm{var}(e_{yi})=\sigma_y^2. \qquad (2.23c)$$

If the series of experiments was to be repeated, the $x_i$'s would remain unchanged, whereas the $\xi_i$'s would take random values: quite unlike the situation in the standard functional relationship model where the $x_i$'s are random and the $\xi_i$'s constant. From (2.23) $\xi_i$ and $\eta_i$ can be eliminated, yielding

$$y_i = a + b x_i + e_i, \qquad (2.24a)$$

with

$$e_i := b\,e_{\xi i} + e_{yi}, \quad \mathbb{E}(e_i)=0, \quad \mathrm{var}(e_i) = b^2\sigma_\xi^2 + \sigma_y^2 = \Omega(\beta, \omega). \qquad (2.24b)$$

We see that (2.24) has no incidental parameters; the specifications are those of a regression model. This fact illustrates the point, which we would like to stress again, that careful analysis of the sources of error and of the observational process is essential in building statistical models. In this case of control error in the independent variable it leads to a model much simpler than if measurement error had corrupted the observation of the independent variable.
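The practical difference between control error and measurement error in the independent variable can be illustrated by simulation. The sketch below is not taken from the text: the parameter values, the evenly spaced design and the use of ordinary least squares are assumptions chosen only to make the contrast visible.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    sxx = sum((u - mean_x) ** 2 for u in x)
    return sxy / sxx


rng = random.Random(12345)
a_true, b_true = 1.0, 2.0
sigma_x, sigma_y = 0.5, 0.2
n = 20000
grid = [3.0 * i / (n - 1) for i in range(n)]   # evenly spaced values on [0, 3]

# Control error: the settings x_i are fixed, the realized xi_i scatter around them.
xi_realized = [x + rng.gauss(0.0, sigma_x) for x in grid]
y_control = [a_true + b_true * xi + rng.gauss(0.0, sigma_y) for xi in xi_realized]
slope_control = ols_slope(grid, y_control)

# Measurement error: the true xi_i are fixed, the observed x_i scatter around them.
x_observed = [xi + rng.gauss(0.0, sigma_x) for xi in grid]
y_measured = [a_true + b_true * xi + rng.gauss(0.0, sigma_y) for xi in grid]
slope_measurement = ols_slope(x_observed, y_measured)
```

In the control-error case regressing $y$ on the settings recovers the slope, in line with (2.24); in the measurement-error case the same regression is attenuated towards zero, which is why that case needs the functional relationship machinery.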


Chapter 3

Estimating equations

Summary

The concepts of estimating function and estimating equation are introduced. It is proved that under regularity conditions the solution to an estimating equation is consistent and asymptotically normally distributed. A computable and consistent expression for the asymptotic variance matrix of the solution is derived.

3.1 Estimating functions and estimating equations

Estimating equations will play an essential role in this thesis, because the estimators $(\hat\beta, \hat\omega)$, which in the next chapter we will derive for the parameters $(\beta, \omega)$, are solutions to estimating equations. Likelihood equations, which arise by differentiating the log-likelihood with respect to the parameters and equating the derivatives to zero, are examples of estimating equations. Wilks 1962 introduced the concept of estimating equations and proved the asymptotic efficiency of maximum-likelihood estimators. In this chapter we will define regularity conditions for estimating functions and prove some properties of the estimators obtained by solving the associated estimating equations.

Let $Z_n := (z_1, z_2, \dots, z_n)$ be a sample of independent random vector variables. The probability distribution of $z_i$, $i=1,2,\dots,n$, depends on the $r \times 1$ parameter vector $\vartheta$ and - possibly - on other parameters not of interest here. The vector $\vartheta$ is allowed to vary within $\Theta$, a known compact subset of the $r$-dimensional space. Next, let $g_n(\vartheta, Z_n)$ be a vector function, also of size $r \times 1$, having $\vartheta$ and $Z_n$ as its arguments. The function $g_n(\vartheta, Z_n)$ is called an estimating function and the associated equation

$$g_n(\vartheta, Z_n) = 0$$

an estimating equation. The solution $\hat\vartheta_n(Z_n)$ to $g_n(\vartheta, Z_n)=0$ is an estimate for some unknown value $\vartheta=\vartheta_0$, the so-called true value of $\vartheta$, an interior point of $\Theta$.

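Now, $g_n(\vartheta, Z_n)$ is a regular estimating function and $g_n(\vartheta, Z_n)=0$ a regular estimating equation, if for all large values of $n$ and for all $\vartheta \in \Theta$ the following conditions are met.

CONDITIONS 3.1

(a) With probability 1 the estimating function $g_n(\vartheta, Z_n)$ is equicontinuously differentiable with respect to $\vartheta$.

(b) The expectation of $g_n(\vartheta, Z_n)$ has a unique zero in the true point $\vartheta=\vartheta_0$: $\mathbb{E}[g_n(\vartheta, Z_n)] = 0$ only for $\vartheta=\vartheta_0$.

(c) With probability 1 the jacobian of $g_n(\vartheta, Z_n)$ with respect to $\vartheta$ is bounded: $J_n(\vartheta, Z_n) := \partial g_n(\vartheta, Z_n) / \partial\vartheta$ is bounded with probability 1. The jacobians converge in probability: $\operatorname{plim}_{n\to\infty} J_n(\vartheta, Z_n) =: J(\vartheta)$, and $J_0 := J(\vartheta_0)$ is regular.

(d) The variance matrix of $g_n(\vartheta, Z_n)$ is of the order $n^{-1}$, i.e. the non-negative definite matrices $V_n(\vartheta) := n\,\mathrm{VAR}[g_n(\vartheta, Z_n)]$ converge as $n \to \infty$, say to $V(\vartheta)$. Furthermore, $g_n(\vartheta_0, Z_n)$ is asymptotically normal:

$$n^{1/2} g_n(\vartheta_0, Z_n) \sim N(0, V_0), \quad \text{asymptotically}.$$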

The concept of estimating equations is illustrated by the following example.

EXAMPLE

In non-linear regression models with normally distributed errors we have

$$z_i = f(\xi_i, \vartheta_0) + e_i, \quad e_i \sim N(0, \sigma^2), \quad i=1,2,\dots,n.$$

The independent variables $\xi_i$ are known, as are the measurements $z_i$ of the dependent variable. The true value $\vartheta_0$ of the parameter $\vartheta$ is unknown and has to be estimated. Both the maximum-likelihood method and the least-squares method yield as estimating equation for $\vartheta$

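$$g_n(\vartheta, Z_n) := \frac{\partial}{\partial\vartheta}\left[(2n)^{-1}\sum_{i=1}^{n}\bigl(z_i - f(\xi_i, \vartheta)\bigr)^2\right] = -\frac{1}{n}\sum_{i=1}^{n}\bigl[z_i - f(\xi_i, \vartheta)\bigr]\left[\frac{\partial f(\xi_i, \vartheta)}{\partial\vartheta}\right] = 0.$$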

The regularity Conditions 3.1 for $g_n$ are satisfied for 'well-behaved designs' $(\xi_1, \xi_2, \dots, \xi_n)$ and model functions $f(\xi, \vartheta)$, which appear in almost all practical situations.

0
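The estimating equation of this example can be solved numerically for a scalar parameter. The following sketch assumes the hypothetical model function $f(\xi, \vartheta) = \exp(-\vartheta\xi)$, a simulated data set and a plain bisection search; none of these choices come from the text.

```python
import math
import random

rng = random.Random(42)
theta_true = 0.8
n = 200
xi = [4.0 * i / (n - 1) for i in range(n)]                            # known design points
z = [math.exp(-theta_true * x) + rng.gauss(0.0, 0.02) for x in xi]    # noisy measurements


def g_n(theta):
    """Estimating function: derivative of (2n)^-1 * sum (z_i - f(xi_i, theta))^2 w.r.t. theta."""
    total = 0.0
    for x, zi in zip(xi, z):
        f = math.exp(-theta * x)
        df = -x * f                       # partial derivative of f with respect to theta
        total += (zi - f) * df
    return -total / n


def solve_by_bisection(func, low, high, tol=1e-10):
    """Find a root of func in [low, high], assuming a sign change over the interval."""
    f_low = func(low)
    while high - low > tol:
        mid = 0.5 * (low + high)
        f_mid = func(mid)
        if (f_mid < 0.0) == (f_low < 0.0):
            low, f_low = mid, f_mid
        else:
            high = mid
    return 0.5 * (low + high)


theta_hat = solve_by_bisection(g_n, 0.1, 3.0)
```

The root of $g_n$ coincides with the least-squares estimate; with small errors it lies close to the value used to generate the data.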

To prove some properties of the solution $\hat\vartheta_n$ to a regular estimating equation, we need the following lemma.

LEMMA 3.1

Let $\{x_n\}$ be a sequence of random vectors of size $r \times 1$, and let $\{A_n\}$ be a sequence of random matrices of size $r \times r$.

The sequence $\{x_n\}$ is asymptotically distributed like the random vector $x$, that is, $x_n$ and $x$ have asymptotically the same distribution $F_x$:

$$x_n \sim F_x, \quad \text{asymptotically}.$$

The sequence $\{A_n\}$ converges in probability to the identity matrix: $\operatorname{plim}_{n\to\infty} A_n = I$.

The sequence $\{A_n x_n\}$ then also has $F_x$ as its asymptotic distribution:

$$A_n x_n \sim F_x, \quad \text{asymptotically}.$$

PROOF

A proof of this lemma is given in Linssen 1980, page 26. □

We are now ready to prove the following theorem regarding the asymptotic distribution of $\hat\vartheta_n$.

THEOREM 3.1

Let $g_n(\vartheta, Z_n)$ be a regular estimating function and let $\hat\vartheta_n$ be a solution to the associated estimating equation, i.e. $g_n(\hat\vartheta_n, Z_n) = 0$.

Then, firstly, $\hat\vartheta_n$ is consistent for $\vartheta_0$: $\operatorname{plim}_{n\to\infty} \hat\vartheta_n = \vartheta_0$,

and, secondly, $\hat\vartheta_n$ is asymptotically normally distributed with variance matrix $n^{-1} J_0^{-1} V_0 J_0^{-T}$:

$$n^{1/2}(\hat\vartheta_n - \vartheta_0) \sim N(0, J_0^{-1} V_0 J_0^{-T}), \quad \text{asymptotically}.$$

PROOF OF FIRST PART

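Consider a sequence of actual observations $\{Z_n \mid n=1,2,3,\dots\}$ and the corresponding sequence of solutions $\{\hat\vartheta_n\}$. Since $\Theta$ is compact, the sequence $\{\hat\vartheta_n\}$ has at least one convergent subsequence $\{\hat\vartheta_m \mid m \in N$, $N$ is a subset of the set of integers$\}$ with limit point $\vartheta^*$.

Since the jacobians $J_m(\vartheta, Z_m)$ are bounded, we have for this limit point

$$g_m(\vartheta^*, Z_m) = g_m(\hat\vartheta_m, Z_m) + J_m(\hat\vartheta_m, Z_m)\cdot(\vartheta^* - \hat\vartheta_m) + O(\|\vartheta^* - \hat\vartheta_m\|^2)$$

$$= 0 + J_m(\hat\vartheta_m, Z_m)\cdot(\vartheta^* - \hat\vartheta_m) + O(\|\vartheta^* - \hat\vartheta_m\|^2) \to 0. \qquad (3.1)$$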

Condition 3.1b implies that

(3.2)

From (3.1) and (3.2) it follows that $\vartheta^* = \vartheta_0$. This is true for all limit points $\vartheta^*$ and - with probability 1 - for all actual sequences $\{Z_n\}$. Consequently, $\operatorname{plim}_{n\to\infty} \hat\vartheta_n = \vartheta_0$. □

PROOF OF SECOND PART

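In view of Condition 3.1a, Taylor's Theorem can be applied:

$$g_n(\hat\vartheta_n, Z_n) = 0 = g_n(\vartheta_0, Z_n) + A\cdot(\hat\vartheta_n - \vartheta_0).$$

Row $i$, $i=1,2,\dots,p$, of matrix $A$ equals row $i$ of the jacobian $J_n(\vartheta_n(i), Z_n)$, where $\vartheta_n(i)$ is some point near $\vartheta_0$ with $\|\vartheta_n(i) - \vartheta_0\| \le \|\hat\vartheta_n - \vartheta_0\|$.

From Condition 3.1d it immediately follows that

$$n^{1/2} A\cdot(\hat\vartheta_n - \vartheta_0) \sim N(0, V_0), \quad \text{asymptotically}. \qquad (3.3)$$

As $\hat\vartheta_n$ converges to $\vartheta_0$, the $\vartheta_n(i)$'s for $i=1,2,\dots,p$ do so too. Now, use Condition 3.1c: $\operatorname{plim}_{n\to\infty} A = J_0$.

$J_0$ is regular and, hence, from (3.3) and Lemma 3.1:

$$n^{1/2}(\hat\vartheta_n - \vartheta_0) \sim N(0, J_0^{-1} V_0 J_0^{-T}), \quad \text{asymptotically}.$$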

Like $\vartheta_0$, the matrices $J_0$ and $V_0$ will be unknown in practical situations. They can be estimated consistently by

$$\hat J_n = J_n(\hat\vartheta_n, Z_n) \quad \text{and} \quad \hat V_n = n\,\mathrm{VAR}[g_n(\hat\vartheta_n, Z_n)]. \qquad (3.4)$$

Using $\hat J_n$ and $\hat V_n$, we can construct a pivotal random variable, as shown in the next theorem.

THEOREM 3.2

Let $g_n(\vartheta, Z_n)$ be a regular estimating function and let $\hat\vartheta_n$ be the solution to the corresponding estimating equation.

Then

$$n^{1/2}\,\hat V_n^{-1/2}\,\hat J_n\,(\hat\vartheta_n - \vartheta_0) \sim N(0, I), \quad \text{asymptotically},$$

where $\hat J_n$ and $\hat V_n$ are defined in (3.4) and $\hat V_n^{1/2}$ denotes a square root of $\hat V_n$.

PROOF

According to Theorem 3.1 we have

$$n^{1/2}(\hat\vartheta_n - \vartheta_0) \sim N(0, J_0^{-1} V_0 J_0^{-T}), \quad \text{asymptotically},$$

and, hence,

$$n^{1/2}\,V_0^{-1/2}\,J_0\,(\hat\vartheta_n - \vartheta_0) \sim N(0, I), \quad \text{asymptotically}.$$

The regularity of $V_0$ is stated in Condition 3.1d.

Because $\hat J_n$ and $\hat V_n$ converge in probability to $J_0$ and $V_0$ respectively, we can apply Lemma 3.1 and get

$$n^{1/2}\,\hat V_n^{-1/2}\,\hat J_n\,(\hat\vartheta_n - \vartheta_0) \sim N(0, I), \quad \text{asymptotically}. \qquad \square$$

The standard normal distribution $N(0, I)$ does not contain a parameter. So Theorem 3.2 enables us to construct approximate confidence regions for $\vartheta_0$.
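For a scalar parameter the pivotal quantity leads directly to an approximate confidence interval. The sketch below assumes a hypothetical regression through the origin, $z_i = \vartheta\xi_i + e_i$, with estimating function $g_n(\vartheta) = n^{-1}\sum \xi_i (z_i - \vartheta\xi_i)$; the plug-in estimate of $n\,\mathrm{VAR}[g_n]$ built from the individual terms is likewise an assumption of this illustration, not a formula from the text.

```python
import math
import random

rng = random.Random(2024)
theta_true = 1.5
n = 2000
xi = [0.5 + 2.0 * i / (n - 1) for i in range(n)]           # known design points
z = [theta_true * x + rng.gauss(0.0, 0.3) for x in xi]     # observations

# The root of g_n(theta) = (1/n) * sum xi_i * (z_i - theta * xi_i) is available in closed form.
theta_hat = sum(x * zi for x, zi in zip(xi, z)) / sum(x * x for x in xi)

# Jacobian of g_n with respect to theta (it does not depend on theta in this model).
J_hat = -sum(x * x for x in xi) / n

# Plug-in estimate of n * VAR[g_n] from the individual independent terms (assumption).
residuals = [zi - theta_hat * x for x, zi in zip(xi, z)]
V_hat = sum((x * r) ** 2 for x, r in zip(xi, residuals)) / n

# n^(1/2) * V_hat^(-1/2) * J_hat * (theta_hat - theta_0) is approximately N(0, 1),
# so theta_hat has approximate standard error sqrt(V_hat / n) / |J_hat|.
std_err = math.sqrt(V_hat / n) / abs(J_hat)
ci_low = theta_hat - 1.96 * std_err
ci_high = theta_hat + 1.96 * std_err
```

The interval `(ci_low, ci_high)` is an approximate 95% confidence interval; in the vector case the same construction gives an ellipsoidal region based on the chi-square distribution.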

In the next chapter regular estimating equations are constructed to estimate the parameters $(\beta, \omega)$ of the functional relationship model.

Chapter 4

Linear functional relationship models

Summary

The linear functional relationship model $B(\beta)\xi = 0$ with $\mathrm{VAR}(e)=\Omega(\omega)$ is discussed in detail. Estimating equations are derived for the relation parameter vector $\beta$ and for the error parameter vector $\omega$. These equations are found by modifying the normal theory likelihood equations; the original likelihood equations yield inconsistent estimators for $\beta$ and $\omega$ due to the presence of the incidental parameters $\xi$. Sufficient conditions are given for the estimating equations to be regular. The solution $(\hat\beta, \hat\omega)$ is then consistent and asymptotically normal. An expression for the asymptotic variance matrix is given, both for normal and non-normal errors.

Results for various particular linear models are obtained by choosing the appropriate parameterization of $B$ by $\beta$ and $\Omega$ by $\omega$.

4.1 The functional model $B(\beta)\xi = 0$

The model states that the $p$ components of each of the incidental parameter vectors $\xi_1, \xi_2, \dots, \xi_n$ simultaneously satisfy a set of $q$ linear relations:

$$B(\beta_0)\xi_i = 0, \quad i=1,2,\dots,n, \quad B \text{ is } q \times p \text{ and is of full rank, } q<p. \qquad (4.1a)$$

The elements of the relation matrix $B$ are continuously differentiable functions of the relation parameter vector $\beta$, hence the notation $B(\beta)$, $\#(\beta) \le pq$.

Observations of the $\xi$'s are available as random $p \times 1$ vectors $x_1, x_2, \dots, x_n$, which deviate from their true value $\xi$ by the random observational error vector $e$:

$$x_i = \xi_i + e_i, \quad i=1,2,\dots,n. \qquad (4.1b)$$

We assume that the errors are independent and have expectation 0 and variance matrix $\Omega(\omega_0)$:

$$\mathbb{E}(e_i) = 0, \quad \mathrm{VAR}(e_i) = \Omega(\omega_0), \quad e_i \text{ and } e_j \text{ are independent for } i \neq j. \qquad (4.1c)$$

Like $B$ with respect to $\beta$, the elements of the error variance matrix $\Omega$ are continuously differentiable functions of the error parameter vector $\omega$, $\#(\omega) \le \tfrac{1}{2}p(p+1)$.

It is understood that the model given by (4.1) is valid for exactly one value $\beta=\beta_0$ and $\omega=\omega_0$, the unknown true values of $\beta$ and $\omega$.

The relation (4.1a) defines a $(p-q)$-dimensional hyperplane passing through the origin. The matrix $B$ has rank $q$ in an open neighbourhood of $\beta=\beta_0$. The parameterization of $B$ by $\beta$ is such that $\beta$ uniquely determines the hyperplane. Too many elements in $\beta$ may lead to a model that is indeterminate, see Anderson 1984. A similar restriction has to be imposed on the parameterization of $\Omega$ by $\omega$. In the notes following Theorem 4.2 the problem of parameter determinability is discussed more comprehensively.

4.2 The construction of the estimating equations

We will construct estimating equations for $\beta$ and $\omega$ via the likelihood equations. To that purpose we assume that the observational errors follow a normal distribution with a regular variance matrix $\Omega$:

$$e_i \sim N(0, \Omega), \quad i=1,2,\dots,n, \ \text{i.i.d.} \qquad (4.2)$$

Later, after the estimating equations have been constructed, the assumption of normality may be dispensed with and the assumption of regular $\Omega$ replaced by the regularity of $B\Omega B^T$. Nevertheless, the obtained estimating equations remain regular and will provide consistent and asymptotically normally distributed estimators for $\beta$ and $\omega$.

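On the assumption (4.2) the log-likelihood of the parameters $\xi$, $\beta$ and $\omega$ of the model (4.1) is given by

$$\log L(\xi, \beta, \omega, x) = -\tfrac{1}{2}np\log(2\pi) - \tfrac{1}{2}n\log|\Omega| - \tfrac{1}{2}\sum_{i=1}^{n}(x_i - \xi_i)^T\Omega^{-1}(x_i - \xi_i) \qquad (4.3a)$$

with

$$B\xi_i = 0, \quad i=1,2,\dots,n. \qquad (4.3b)$$

For fixed $\beta$ and $\omega$ the log-likelihood (4.3a), constrained by (4.3b), can be maximized with respect to the $\xi$'s, yielding for $\xi_i$:

$$\hat\xi_i = (I - \Omega G)\,x_i, \quad i=1,2,\dots,n, \qquad (4.4)$$

with

$$G(\beta, \omega) := B^T H B \quad \text{and} \quad H(\beta, \omega) := (B\Omega B^T)^{-1}. \qquad (4.5)$$

We may regard $B\xi=0$ as a hyperplane in the $p$-dimensional space. The point $\hat\xi_i$ is then the skew projection of $x_i$ onto this plane. $\Omega G$ represents the projection matrix onto the plane spanned by the columns of $\Omega B^T$. It is seen that $G\Omega G = G$ and, of course, $B\hat\xi_i = 0$.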
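The skew projection can be tried out in the smallest non-trivial case. The sketch below assumes $p=2$ and $q=1$, so that $B$ is a single row and $H$ a scalar, and uses the projection $\hat\xi = x - \Omega B^T H B x$ that follows from maximizing the normal log-likelihood under the constraint; the numerical values are illustrative only.

```python
def skew_projection(B, Omega, x):
    """Project x onto the line B xi = 0 along the direction Omega B^T (case p = 2, q = 1)."""
    omega_bt = [
        Omega[0][0] * B[0] + Omega[0][1] * B[1],
        Omega[1][0] * B[0] + Omega[1][1] * B[1],
    ]                                                  # Omega B^T, a 2-vector
    H = 1.0 / (B[0] * omega_bt[0] + B[1] * omega_bt[1])  # (B Omega B^T)^-1, a scalar here
    bx = B[0] * x[0] + B[1] * x[1]                     # B x, the violation of the constraint
    xi_hat = [x[0] - omega_bt[0] * H * bx, x[1] - omega_bt[1] * H * bx]
    quad_form = H * bx * bx                            # x^T G x with G = B^T H B
    return xi_hat, quad_form


B = [2.0, -1.0]                       # the line eta = 2 * xi written as B (xi, eta)^T = 0
Omega = [[1.0, 0.3], [0.3, 0.5]]      # error variance matrix (symmetric, positive definite)
x = [1.0, 3.0]                        # one observed point

xi_hat, quad_form = skew_projection(B, Omega, x)
xi_hat_again, _ = skew_projection(B, Omega, xi_hat)

# Weighted distance (x - xi_hat)^T Omega^-1 (x - xi_hat), using the explicit 2 x 2 inverse.
det = Omega[0][0] * Omega[1][1] - Omega[0][1] * Omega[1][0]
d = [x[0] - xi_hat[0], x[1] - xi_hat[1]]
weighted_dist = (Omega[1][1] * d[0] ** 2 - 2.0 * Omega[0][1] * d[0] * d[1] + Omega[0][0] * d[1] ** 2) / det
```

The projected point satisfies the constraint, projecting it again leaves it unchanged, and the weighted distance equals $x^T G x$, which is the quadratic form that appears in the reduced log-likelihood.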

Substitution of (4.4) into (4.3a) results in the 'reduced' log-likelihood

$$\log L^*(\beta, \omega, x) = -\tfrac{1}{2}np\log(2\pi) - \tfrac{1}{2}n\log|\Omega| - \tfrac{1}{2}\sum_{i=1}^{n} x_i^T G x_i. \qquad (4.6)$$

We note that for $\beta=\beta_0$ and $\omega=\omega_0$ the third term on the right-hand side of (4.6) is distributed as $-\tfrac{1}{2}\chi^2$ with $nq$ degrees of freedom.

To obtain the likelihood equations (4.7), the reduced log-likelihood (4.6) is differentiated with respect to the vectors $\beta$ and $\omega$ and the derivatives are equated to zero. The steps needed to arrive at (4.7) are rather complicated and involve differentiating matrix expressions with respect to a vector. The relevant differentiation rules are reviewed in Appendix A, together with some other probably less known operators from the matrix calculus. As an illustration, the derivation of (4.7) from (4.6) is also given in Appendix A.

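The likelihood equations are

$$\beta: \quad -n\,\dot B^T\,\mathrm{vec}\bigl[HB\,(S_n + \bar x_n \bar x_n^T)\,(I - G\Omega)\bigr] = 0 \qquad (4.7a)$$

$$\omega: \quad \tfrac{1}{2}n\,\dot\Omega^T\,\mathrm{vec}\bigl[G\,(S_n + \bar x_n \bar x_n^T)\,G - \Omega^{-1}\bigr] = 0 \qquad (4.7b)$$

where

$\bar x_n := \frac{1}{n}\sum_{i=1}^{n} x_i$

$S_n := \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar x_n)(x_i - \bar x_n)^T$

$\dot B := dB / d\beta$, size $pq \times \#(\beta)$

$\dot\Omega := d\Omega / d\omega$, size $p^2 \times \#(\omega)$.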

LEMMA 4.2

The likelihood equations (4.7) are inconsistent.

PROOF

As $n \to \infty$ the sample moment $S_n + \bar x_n \bar x_n^T$ tends to its expected value. In view of the model specifications (4.1b) and (4.1c) we have

$$\mathbb{E}(S_n + \bar x_n \bar x_n^T) = \frac{1}{n}\sum_{i=1}^{n}\xi_i\xi_i^T + \Omega(\omega_0).$$

Therefore, as $n \to \infty$ and if $(\beta_0, \omega_0)$ is inserted for $(\beta, \omega)$, the left-hand sides of (4.7) tend to the expected values

$$\beta: \quad 0 \qquad (4.8a)$$

$$\omega: \quad \tfrac{1}{2}n\,\dot\Omega_0^T\,\mathrm{vec}(G_0 - \Omega_0^{-1}) \qquad (4.8b)$$

with $B_0 := B(\beta_0)$, $\Omega_0 := \Omega(\omega_0)$ and $G_0$ and $H_0$ similarly defined. In general, expression (4.8b) is unequal to zero. Hence, even in the limit the true parameter values are no solution to the likelihood equations (4.7), i.e. these equations are inconsistent. □

Neyman & Scott 1948 and Kalbfleisch & Sprott 1970 attribute the inconsistency of the likelihood equations to the presence of the incidental parameters $\xi$, which increase in number at the same rate as the number of observations. To obtain consistent estimating equations, Neyman & Scott 1948 - and later also Chan & Mak 1983 - suggest subtracting the expectations (4.8) from the left-hand sides of (4.7) after replacement of $(\beta_0, \omega_0)$ by $(\beta, \omega)$. We do likewise and obtain as our estimating equations

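$$\beta: \quad \dot B^T\,\mathrm{vec}\bigl[HB\,(S_n + \bar x_n \bar x_n^T)\,(I - G\Omega)\bigr] = 0, \qquad (4.9a)$$

$$\omega: \quad \dot\Omega^T\,\mathrm{vec}\bigl[G - G\,(S_n + \bar x_n \bar x_n^T)\,G\bigr] = 0. \qquad (4.9b)$$

For convenience, the constant factors $-n$ and $\tfrac{1}{2}n$ are omitted and the sign of (4.7b) is inverted.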

We now pose the question which conditions model (4.1) has to meet so that the proposed procedure will lead to estimating equations (4.9) that are regular in the sense that they satisfy Conditions 3.1. These conditions are listed in the next theorem.

THEOREM 4.2

Consider the functional relationship model $B\xi = 0$ as specified in (4.1). Let the following conditions be satisfied.

With respect to the design:

(a) The $\xi$'s are bounded and span a $(p-q)$-dimensional space:

$\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n}\xi_i =: \bar\xi$, bounded,

$\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n}(\xi_i - \bar\xi)(\xi_i - \bar\xi)^T =: \Psi$, bounded and of rank $p-q$.

With respect to the true values $\beta_0$ and $\omega_0$ and the parameterization of $B$ and $\Omega$:

(b) $(\beta_0, \omega_0)$ is an interior point of a known compact set $\Theta$.

(c) The rows of $B(\beta_0)$ are orthogonal to the $\xi$'s: $B(\beta_0)\xi_i = 0$.

For all $(\beta, \omega) \in \Theta$:

(d) The matrix $B(\beta)$ has rank $q$.

(e) The elements of $B$ and $\Omega$ are continuously differentiable functions of, respectively, $\beta$ and $\omega$.

(f) The matrix $B\Omega B^T$ is regular.

(g) The set of equations for $\beta$ and $\omega$

$$\dot B^T\,\mathrm{vec}\bigl[HB\,(\bar\xi\bar\xi^T + \Psi + \Omega_0)\,(I - G\Omega)\bigr] = 0,$$

$$\dot\Omega^T\,\mathrm{vec}\bigl[G - G\,(\bar\xi\bar\xi^T + \Psi + \Omega_0)\,G\bigr] = 0,$$

is functionally independent and its solution $(\beta, \omega) = (\beta_0, \omega_0)$ is unique within $\Theta$.

With respect to the distribution of the observational error:

(h) The Strong Law of Large Numbers applies: $\operatorname{plim}_{n\to\infty}(S_n + \bar x_n \bar x_n^T) = \bar\xi\bar\xi^T + \Psi + \Omega_0$.

(i) The Central Limit Theorem applies: $n^{1/2}(S_n + \bar x_n \bar x_n^T)$ is asymptotically normal.

If the above conditions hold, the estimating equations for $(\beta, \omega)$:

$$\beta: \quad \dot B^T\,\mathrm{vec}\bigl[HB\,(S_n + \bar x_n \bar x_n^T)\,(I - G\Omega)\bigr] = 0, \qquad (4.9a)$$

$$\omega: \quad \dot\Omega^T\,\mathrm{vec}\bigl[G - G\,(S_n + \bar x_n \bar x_n^T)\,G\bigr] = 0, \qquad (4.9b)$$

are regular.

PROOF

If we read $\vartheta$ for $\mathrm{vec}(\beta, \omega)$ and $Z_n$ for $S_n + \bar x_n \bar x_n^T$, it is easily verified that the conditions (a) to (i) imply Conditions 3.1. Hence, the estimating equations (4.9) are regular. □

NOTES

o Condition (g) imposes a restriction on the parameterization of $B$ and $\Omega$. Too free a parameterization leads to a set of estimating equations that is functionally dependent. The parameters are then indeterminable, at least via this method. Nussbaum 1977 showed that for the simple bivariate functional relationship model $\eta = a + b\xi$ with normally distributed errors, see Paragraph 4.6.4, no consistent estimators exist for the parameters of this model, if these parameters are not identifiable in the corresponding structural relationship model in which $\xi$ is assumed to be random and normally distributed. It is conceivable that Nussbaum's result could be extended to the multivariate model considered here. Unfortunately, the reverse is not true. Identifiability of the parameters in a structural relationship model with normally distributed errors does not always guarantee that the estimating equations (4.9) for the parameters of the corresponding functional relationship model are regular. In Paragraph 5.2.7 a counter-example is given.

o The observational errors need not be normally distributed. Although in the case of non-normal errors more efficient estimators may exist, the estimating equations (4.9) can be applied and will yield consistent estimates.

o The regularity of $\Omega$ is not required. Unlike $\Omega$, $B\Omega B^T$ must be regular.

o Regarding the inconsistency of the likelihood equations (4.7) one parameterization of $\Omega$ is of particular interest, namely the case that $\Omega$ is known to within a scalar factor $\sigma^2$:

$$\Omega = \sigma^2 W, \quad W \text{ is known and symmetric}.$$

In this case $\dot\Omega = \mathrm{vec}\,W$ and the likelihood equations (4.7) can be written as:

$$\beta: \quad \dot B^T\,\mathrm{vec}\bigl[(BWB^T)^{-1}B\,(S_n + \bar x_n \bar x_n^T)\,(I - B^T(BWB^T)^{-1}BW)\bigr] = 0, \qquad (4.10a)$$

$$\sigma^2: \quad p\sigma^2 - \frac{1}{n}\sum_{i=1}^{n} x_i^T B^T (BWB^T)^{-1} B x_i = 0. \qquad (4.10b)$$

Equation (4.10a) clearly does not involve $\sigma^2$ and can be solved separately from (4.10b). Hence, in this case, the maximum-likelihood estimator $\hat\beta_{ML}$ is consistent. For $\beta=\beta_0$ the second term on the left-hand side of equation (4.10b) converges in probability to $q\sigma_0^2$. Hence, the maximum-likelihood estimator $\hat\sigma_{ML}^2$ is indeed inconsistent. The inconsistency of $\hat\sigma_{ML}^2$ is easily remedied by the estimator $\hat\sigma^2$:

$$\hat\sigma^2 := p\,\hat\sigma_{ML}^2 / q = \frac{1}{nq}\sum_{i=1}^{n} x_i^T \hat B^T (\hat B W \hat B^T)^{-1} \hat B x_i = \frac{1}{nq}\sum_{i=1}^{n}(x_i - \hat\xi_i)^T W^{-1}(x_i - \hat\xi_i), \qquad (4.11)$$

with $\hat B := B(\hat\beta_{ML})$ and $\hat\xi_i := \hat\xi_i(\hat\beta_{ML}, 1)$, see (4.4). The same estimator for $\sigma_0^2$ as (4.11) would have been obtained if the modified likelihood equations (4.9) had been used.
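The effect of the correction factor $p/q$ can be seen in the simplest instance of this case. The sketch below assumes the line through the origin $\eta = \beta\xi$, written as $B(\beta) = (\beta, -1)$ with $p=2$, $q=1$ and $W = I$; for this special case the equation for $\beta$ reduces to the quadratic $m_{12}\beta^2 + (m_{11} - m_{22})\beta - m_{12} = 0$ in the second moments $m_{jk}$ of the observations. The data, seed and parameter values are invented for illustration.

```python
import math
import random

rng = random.Random(7)
beta_true, sigma_true, n = 2.0, 0.5, 5000
p, q = 2, 1

# Second moments about the origin, i.e. the elements of S_n + xbar xbar^T.
m11 = m12 = m22 = 0.0
for i in range(n):
    xi = 1.0 + 4.0 * i / (n - 1)                        # fixed true values on [1, 5]
    x1 = xi + rng.gauss(0.0, sigma_true)                # observed first coordinate
    x2 = beta_true * xi + rng.gauss(0.0, sigma_true)    # observed second coordinate
    m11 += x1 * x1 / n
    m12 += x1 * x2 / n
    m22 += x2 * x2 / n


def mean_sq_orth_resid(beta):
    """B M B^T / (B B^T) for B = (beta, -1): mean squared orthogonal distance to the line."""
    return (beta * beta * m11 - 2.0 * beta * m12 + m22) / (beta * beta + 1.0)


# Both roots of the quadratic; keep the one with the smaller residual (assumes m12 != 0).
disc = math.sqrt((m11 - m22) ** 2 + 4.0 * m12 * m12)
roots = [((m22 - m11) + sign * disc) / (2.0 * m12) for sign in (1.0, -1.0)]
beta_hat = min(roots, key=mean_sq_orth_resid)

sigma2_ml = mean_sq_orth_resid(beta_hat) / p         # normal-theory ML estimate
sigma2_corrected = mean_sq_orth_resid(beta_hat) / q  # corrected estimate, p / q times the ML one
```

In this simulation the slope estimate is close to the generating value, the ML variance estimate settles near $(q/p)\sigma_0^2$, and the corrected estimate near $\sigma_0^2$.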

o Consider again the case $\Omega = \sigma^2 W$ with $W$ known. Application of the generalized least squares method (GLS) entails the minimization over $\xi_i$, $i=1,2,\dots,n$, and $\beta$ of the weighted sum of squares

$$\sum_{i=1}^{n}(x_i - \xi_i)^T W^{-1}(x_i - \xi_i)$$

subject to the constraints

$$B(\beta)\xi_i = 0, \quad i=1,2,\dots,n.$$

It is easily verified that the solution $\hat\beta_{GLS}$ to this minimization problem satisfies (4.10a). Hence, in this case, the generalized least squares estimator $\hat\beta_{GLS}$ is identical to the normal theory maximum-likelihood estimator $\hat\beta_{ML}$ (Nussbaum 1976). In view of the previous note, $\hat\beta_{GLS}$ is consistent.

4.3 The asymptotic variance matrix of the estimates

The estimates $(\hat\beta_n, \hat\omega_n)$ obtained by solving equations (4.9) are consistent for $(\beta_0, \omega_0)$ and asymptotically normally distributed. By applying Theorem 3.1 we will derive expressions for the asymptotic variance matrix of $(\hat\beta_n, \hat\omega_n)$, first for the general case of non-normal errors, then for normal errors.

4.3.1 The errors are not normally distributed

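In order to apply Theorem 3.1, expressions are needed for the matrices $J_0$ and $V_0$. Denote the estimating functions on the left-hand sides of (4.9a) and (4.9b) by $g_{\beta,n}$ and $g_{\omega,n}$:

$$g_{\beta,n}(\beta, \omega, X_n) := \dot B^T\,\mathrm{vec}\bigl[HB\,(S_n + \bar x_n \bar x_n^T)\,(I - G\Omega)\bigr],$$

$$g_{\omega,n}(\beta, \omega, X_n) := \dot\Omega^T\,\mathrm{vec}\bigl[G - G\,(S_n + \bar x_n \bar x_n^T)\,G\bigr],$$

where $X_n := (x_1, x_2, \dots, x_n)$.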

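Furthermore, let $\vartheta := \mathrm{vec}(\beta, \omega)$ and $g_n := \mathrm{vec}(g_{\beta,n}, g_{\omega,n})$. The asymptotic jacobian $J(\vartheta)$ is defined as

$$J(\vartheta) = \operatorname{plim}_{n\to\infty}\ \partial\bigl[g_n(\vartheta, X_n)\bigr] / \partial\vartheta.$$

The operations of taking the probability limit and differentiating with respect to $\vartheta$ may be interchanged. Hence

$$J(\vartheta) = \partial\bigl[\operatorname{plim}_{n\to\infty}\ g_n(\vartheta, X_n)\bigr] / \partial\vartheta.$$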

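To obtain $J_0 := J(\zeta_0)$, the following operations are performed in succession on both $g_{\beta,n}$ and $g_{\omega,n}$:

(1) Replace $S_n+\bar x_n\bar x_n^T$ by $\bar\xi\bar\xi^T+\Psi+\Omega_0$.

(2) Differentiate with respect to $\beta$ and $\omega$, following the differentiation rules given in Appendix A.

(3) Evaluate at the true point $(\beta_0, \omega_0)$ and observe that $B_0\bar\xi=0$, $B_0\Psi=0$ and $G_0\Omega_0G_0=G_0$.

After some effort, this yields for the four blocks $J_{\beta\beta}$, $J_{\beta\omega}$, $J_{\omega\beta}$ and $J_{\omega\omega}$ of $J_0$:

$$J_{\beta\beta} = \dot B_0^T \left[\,(\bar\xi\bar\xi^T+\Psi)\otimes H_0\,\right] \dot B_0, \tag{4.15a}$$

$$J_{\beta\omega} = -\,\dot B_0^T \left[\,(I-\Omega_0G_0)\otimes H_0B_0\,\right] \dot\Omega_0, \tag{4.15b}$$

$$J_{\omega\beta} = 0, \tag{4.15c}$$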
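As a worked instance of steps (1) to (3), the block $J_{\omega\omega}$ can be sketched as follows. The sketch assumes $H=(B\Omega B^T)^{-1}$, $G=B^THB$ and $\dot\Omega=\partial\operatorname{vec}\Omega/\partial\omega^T$; these definitions are not restated in this section and are taken here as assumptions, so the result should be read as a consistency check rather than as a quotation.

$$
\begin{aligned}
\mathrm{d}G &= -\,G\,(\mathrm{d}\Omega)\,G, \\
\mathrm{d}\left[\,G-GMG\,\right] &= \mathrm{d}G-(\mathrm{d}G)\,MG-GM\,(\mathrm{d}G), \qquad M:=\bar\xi\bar\xi^T+\Psi+\Omega_0, \\
&= -\,G_0(\mathrm{d}\Omega)G_0 + G_0(\mathrm{d}\Omega)G_0\Omega_0G_0 + G_0\Omega_0G_0(\mathrm{d}\Omega)G_0 \\
&= G_0\,(\mathrm{d}\Omega)\,G_0 ,
\end{aligned}
$$

where the third line uses $G_0\bar\xi=0$ and $G_0\Psi=0$, and the last line uses $G_0\Omega_0G_0=G_0$. Since $\operatorname{vec}[G_0(\mathrm{d}\Omega)G_0]=(G_0\otimes G_0)\operatorname{vec}\mathrm{d}\Omega$ and the factor $G-GMG$ itself vanishes at the true point, this would give

$$J_{\omega\omega} = \dot\Omega_0^T\,(G_0\otimes G_0)\,\dot\Omega_0 .$$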

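Note that $J_{\beta\omega}\neq J_{\omega\beta}^T$.

We now turn our attention to the asymptotic variance matrix $V(\zeta)$ of $g_n$:

$$V(\zeta) = \lim_{n\to\infty} n\,\operatorname{VAR}\left[\,g_n(\zeta, X_n)\,\right] = \lim_{n\to\infty} n\,\mathbb{E}\left(g_n g_n^T\right).$$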

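We are interested in $V_0 := V(\zeta_0)$. For $\zeta=\zeta_0$ the estimating functions can be written as

$$g_{\beta,n}(\beta_0,\omega_0,X_n) = \dot B_0^T\left[\,I\otimes H_0B_0\,\right]\left[\frac{1}{n}\sum_{i=1}^{n}(\xi_i\otimes e_i)\right] + \dot B_0^T\left[\,(I-\Omega_0G_0)\otimes H_0B_0\,\right]\left[\frac{1}{n}\sum_{i=1}^{n}(e_i\otimes e_i)\right], \tag{4.16a}$$

$$g_{\omega,n}(\beta_0,\omega_0,X_n) = \dot\Omega_0^T\left[\,G_0\otimes G_0\,\right]\left[\operatorname{vec}\Omega_0-\frac{1}{n}\sum_{i=1}^{n}(e_i\otimes e_i)\right], \tag{4.16b}$$

which results by replacing $S_n+\bar x_n\bar x_n^T$ by

$$\frac{1}{n}\sum_{i=1}^{n}(\xi_i+e_i)(\xi_i+e_i)^T$$

and noting that $B_0\xi_i=0$ and $G_0\Omega_0G_0=G_0$.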
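The passage from the estimating functions to (4.16a) rests on the identity $\operatorname{vec}(ABC)=(C^T\otimes A)\operatorname{vec}B$ together with $\operatorname{vec}(ab^T)=b\otimes a$. A sketch for a single observation $x_i=\xi_i+e_i$, assuming (as is not restated in this section) that $G=B^THB$ with $H$ and $\Omega$ symmetric, so that $G_0\xi_i=0$ follows from $B_0\xi_i=0$:

$$
\begin{aligned}
H_0B_0\,x_ix_i^T\,(I-G_0\Omega_0) &= H_0B_0\,e_i\,(\xi_i+e_i)^T(I-G_0\Omega_0), \\
\operatorname{vec}\left[H_0B_0\,e_i\xi_i^T(I-G_0\Omega_0)\right] &= (I-\Omega_0G_0)\xi_i\otimes H_0B_0e_i = (I\otimes H_0B_0)(\xi_i\otimes e_i), \\
\operatorname{vec}\left[H_0B_0\,e_ie_i^T(I-G_0\Omega_0)\right] &= \left[(I-\Omega_0G_0)\otimes H_0B_0\right](e_i\otimes e_i).
\end{aligned}
$$

Averaging over $i$ and premultiplying by $\dot B_0^T$ gives the two terms of (4.16a). The same steps applied to $G_0-G_0x_ix_i^TG_0$, with $\operatorname{vec}G_0=(G_0\otimes G_0)\operatorname{vec}\Omega_0$, give (4.16b).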

From (4.16) it is clear that the first four moments of the error $e$ are needed in order to calculate $\mathbb{E}(g_ng_n^T)$. For the first and second moment we have

$$\mathbb{E}(e) = 0, \tag{4.17a}$$

$$\mathbb{E}(e\otimes e) = \operatorname{vec}\Omega_0. \tag{4.17b}$$

To denote the third and fourth moment of $e$ the matrices $M_{e3}$ and $M_{e4}$ are introduced:

$$M_{e3} := \mathbb{E}(ee^T\otimes e^T) = \mathbb{E}(e^T\otimes ee^T), \tag{4.18a}$$

$$M_{e4} := \mathbb{E}(ee^T\otimes ee^T) - (\operatorname{vec}\Omega_0)(\operatorname{vec}\Omega_0)^T - (I+U)(\Omega_0\otimes\Omega_0). \tag{4.18b}$$

$M_{e3}$ measures the skewness of the distribution of $e$ and $M_{e4}$ its excess. For a normally distributed $e$ the matrices $M_{e3}$ and $M_{e4}$ vanish, see Theorem B1 in Appendix B.
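For scalar errors ($p=1$) the definitions (4.18) reduce to familiar quantities, which gives a quick way to see what $M_{e3}$ and $M_{e4}$ measure. The sketch below assumes that $U$ is the commutation matrix, which equals 1 for $p=1$, so that $M_{e3}=\mathbb{E}e^3$ and $M_{e4}=\mathbb{E}e^4-3(\mathbb{E}e^2)^2$; this reading of $U$ is an assumption, and the function name is illustrative. It compares sample versions for normal errors with those for centred exponential errors.

```python
import random

def moment_matrices_scalar(es):
    """Sample analogues of M_e3 and M_e4 in (4.18) for scalar errors (p = 1).

    Assumes E(e) = 0 and U = 1, so every Kronecker product is an ordinary
    product: M_e3 = E e^3 and M_e4 = E e^4 - (E e^2)^2 - 2 (E e^2)^2.
    """
    n = len(es)
    m2 = sum(e * e for e in es) / n
    m3 = sum(e ** 3 for e in es) / n
    m4 = sum(e ** 4 for e in es) / n
    return m3, m4 - m2 * m2 - 2.0 * m2 * m2

rng = random.Random(0)
normal_errors = [rng.gauss(0.0, 1.0) for _ in range(200000)]
skewed_errors = [rng.expovariate(1.0) - 1.0 for _ in range(200000)]  # mean zero

me3_normal, me4_normal = moment_matrices_scalar(normal_errors)
me3_skewed, me4_skewed = moment_matrices_scalar(skewed_errors)
```

For the normal sample both quantities are close to zero, in line with the statement that $M_{e3}$ and $M_{e4}$ vanish under normality. For the centred exponential distribution the population values are $M_{e3}=2$ and $M_{e4}=9-3=6$, and the sample values fluctuate around these.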

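The four blocks $V_{\beta\beta}$, $V_{\beta\omega}$, $V_{\omega\beta}$ and $V_{\omega\omega}$ of $V_0$ can now be calculated, using (4.16), (4.17), (4.18) and the independence of the errors:

$$
\begin{aligned}
V_{\beta\beta} ={}& \dot B_0^T\left[(\bar\xi\bar\xi^T+\Psi)\otimes H_0\right]\dot B_0 + \dot B_0^T\left[(\Omega_0-\Omega_0G_0\Omega_0)\otimes H_0\right]\dot B_0 \\
&+ \dot B_0^T\left(\bar\xi\otimes H_0B_0\right)M_{e3}\left[(I-G_0\Omega_0)\otimes B_0^TH_0\right]\dot B_0 \\
&+ \dot B_0^T\left[(I-\Omega_0G_0)\otimes H_0B_0\right]M_{e3}^T\left(\bar\xi^T\otimes B_0^TH_0\right)\dot B_0 \\
&+ \dot B_0^T\left[(I-\Omega_0G_0)\otimes H_0B_0\right]M_{e4}\left[(I-G_0\Omega_0)\otimes B_0^TH_0\right]\dot B_0,
\end{aligned}
\tag{4.19a}
$$

$$
\begin{aligned}
V_{\beta\omega} = V_{\omega\beta}^T ={}& -\,\dot B_0^T\left(\bar\xi\otimes H_0B_0\right)M_{e3}\left(G_0\otimes G_0\right)\dot\Omega_0 \\
&- \dot B_0^T\left[(I-\Omega_0G_0)\otimes H_0B_0\right]M_{e4}\left(G_0\otimes G_0\right)\dot\Omega_0,
\end{aligned}
\tag{4.19b}
$$

$$
V_{\omega\omega} = 2\,\dot\Omega_0^T\left(G_0\otimes G_0\right)\dot\Omega_0 + \dot\Omega_0^T\left(G_0\otimes G_0\right)M_{e4}\left(G_0\otimes G_0\right)\dot\Omega_0.
\tag{4.19c}
$$

With (4.15) and (4.19) we have obtained expressions for $J_0$ and $V_0$. Theorem 3.1 can now be applied to give us the asymptotic distribution of the estimates $\hat\beta_n$ and $\hat\omega_n$: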
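The structure of $J_0$ already simplifies this application. As a sketch, assume that Theorem 3.1 has the usual sandwich form for estimating equations, with asymptotic variance matrix $J_0^{-1}V_0J_0^{-T}$; this is an assumption, since the theorem is stated in Chapter 3 and not repeated here. Because $J_{\omega\beta}=0$, the matrix $J_0$ is block upper triangular, and, provided $J_{\beta\beta}$ and $J_{\omega\omega}$ are nonsingular,

$$
J_0^{-1} = \begin{pmatrix} J_{\beta\beta}^{-1} & -\,J_{\beta\beta}^{-1}J_{\beta\omega}J_{\omega\omega}^{-1} \\ 0 & J_{\omega\omega}^{-1} \end{pmatrix}.
$$

Under that assumption the block for $\hat\omega_n$ would be $J_{\omega\omega}^{-1}V_{\omega\omega}J_{\omega\omega}^{-T}$, which does not involve $V_{\beta\beta}$ or $V_{\beta\omega}$, whereas the block for $\hat\beta_n$ involves all blocks of $V_0$ through the off-diagonal term of $J_0^{-1}$.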
