
Bayesian analysis of heteroscedasticity in regression models


Tilburg University

Bayesian analysis of heteroscedasticity in regression models

Chowdhury, S.R.; Vandaele, W.H.

Publication date:

1969

Document Version

Publisher's PDF, also known as Version of record


Citation for published version (APA):

Chowdhury, S. R., & Vandaele, W. H. (1969). Bayesian analysis of heteroscedasticity in regression models. (EIT Research Memorandum). Stichting Economisch Instituut Tilburg.



A BAYESIAN ANALYSIS OF HETEROSCEDASTICITY IN REGRESSION MODELS

BY

S. R. CHOWDHURY AND W. VANDAELE

1. Introduction

In this paper we examine heteroscedasticity in the regression model by means of Bayesian analysis. We set up two types of models, one linear and the other ratio, and examine the posterior distributions of the unknown variances.

If the form of heteroscedasticity is completely known, we can find the best linear unbiased estimates of the parameters by Aitken's generalised least squares method. But before applying such a method we should first of all be able to test for the presence of heteroscedasticity in the original model. The usual Bartlett's test† of homogeneity of variances cannot be applied because we have only one sample at our disposal.

Goldfeld, S. M. and R. E. Quandt {5} have also tackled this problem and have given a parametric and a nonparametric test to compare the ratio and linear models.

† BARTLETT, M.S. "The Problem in Statistics of Testing Several Variances", Proceedings of the Cambridge Philosophical Society. Vol. 30, 1934.


2. Theory

Let us consider the regression model:

$$y_t = \beta_1 x_{1t} + \beta_2 x_{2t} + \dots + \beta_n x_{nt} + u_t, \qquad t = 1, \dots, T \tag{1}$$

where:

- $y_t$ is the observation of the dependent variable at time $t$;

- $x_{1t}, \dots, x_{nt}$ are the observations on the $n$ explanatory variables at time $t$; these values are nonstochastic and identical in repeated samples; the first variable is the usual constant and takes the value 1;

- $u_t$ is the disturbance at time $t$;

- there are $T$ independent observations on the dependent and explanatory variables;

- $T > n$.

It is assumed that, for $t, t' = 1, \dots, T$:

- $E(u_t) = 0$; (2)

- $E(u_t u_{t'}) = 0$, $t \neq t'$; (3a)

- $E(u_t^2) = \lambda_t^2 \sigma^2$, where $\sigma^2$ is unknown, but $\lambda_t^2$ is known; (3b)

- the error terms are normally and independently distributed. (4)

Bayesian analysis.

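To apply the Bayesian analysis, we first make a transformation of (1):

$$\frac{y_t}{\lambda_t} = \beta_1 \frac{x_{1t}}{\lambda_t} + \beta_2 \frac{x_{2t}}{\lambda_t} + \dots + \beta_n \frac{x_{nt}}{\lambda_t} + \frac{u_t}{\lambda_t}, \qquad t = 1, \dots, T \tag{5}$$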

Model (5) may be called a ratio model.

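Now, for $t = 1, \dots, T$:

$$E\left(\frac{u_t}{\lambda_t}\right) = 0, \quad \text{because of (2)} \tag{6}$$

$$E\left(\frac{u_t^2}{\lambda_t^2}\right) = \sigma^2, \quad \text{because of (3b)} \tag{7}$$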

Because of (7), the model (5) is a homoscedastic one.
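The effect of the transformation (5) can be checked on simulated data. The following is a minimal sketch, not taken from the paper: the sample size, the coefficient values, the value of sigma and the choice of the known weights `lam` (here equal to the second regressor) are all hypothetical, and the function name `ratio_transform` is introduced only for illustration.

```python
import random
import statistics

def ratio_transform(y, X, lam):
    """Divide y_t and every regressor x_it by the known weight lam_t, as in (5)."""
    y_r = [y_t / l_t for y_t, l_t in zip(y, lam)]
    X_r = [[x / l_t for x in row] for row, l_t in zip(X, lam)]
    return y_r, X_r

rng = random.Random(0)                      # fixed seed so the run is repeatable
T, sigma = 2000, 0.5                        # hypothetical sample size and sigma
beta = [2.0, 3.0]                           # hypothetical coefficients
lam = [1.0 + 0.01 * t for t in range(T)]    # hypothetical known weights lam_t
X = [[1.0, l_t] for l_t in lam]             # constant plus one regressor equal to lam_t
u = [rng.gauss(0.0, l_t * sigma) for l_t in lam]   # sd of u_t is lam_t * sigma, as in (3b)
y = [sum(b * x for b, x in zip(beta, row)) + u_t for row, u_t in zip(X, u)]

y_r, X_r = ratio_transform(y, X, lam)
# Disturbance of the ratio model: u_t / lam_t
u_r = [y_t - sum(b * x for b, x in zip(beta, row)) for y_t, row in zip(y_r, X_r)]

half = T // 2
print("original disturbances, variance by half:",
      statistics.pvariance(u[:half]), statistics.pvariance(u[half:]))
print("ratio-model disturbances, variance by half:",
      statistics.pvariance(u_r[:half]), statistics.pvariance(u_r[half:]))
```

In the original model the disturbance variance grows through the sample, while in the ratio model both halves should be close to sigma squared, in line with (7).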

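For simplicity, we write the ratio model (5) as

$$y_{t,\lambda} = \beta_1 x_{1t,\lambda} + \beta_2 x_{2t,\lambda} + \dots + \beta_n x_{nt,\lambda} + u_{t,\lambda}, \qquad t = 1, \dots, T \tag{8}$$

Under the assumption (4), the likelihood function of the sample is given by

$$\ell(\beta_1, \dots, \beta_n, \sigma \mid y) \propto \frac{1}{\sigma^T} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{t=1}^{T} \left( y_{t,\lambda} - \beta_1 x_{1t,\lambda} - \dots - \beta_n x_{nt,\lambda} \right)^2 \right\}$$

Or in matrix notation,

$$\ell(\beta, \sigma \mid y) \propto \frac{1}{\sigma^T} \exp\left\{ -\frac{1}{2\sigma^2} (y_\lambda - X_\lambda \beta)'(y_\lambda - X_\lambda \beta) \right\} \tag{9}$$

where

$$\beta' = (\beta_1, \dots, \beta_n), \qquad y_\lambda' = (y_{1,\lambda}, \dots, y_{T,\lambda}),$$

$$X_\lambda = \begin{pmatrix} x_{11,\lambda} & x_{21,\lambda} & \dots & x_{n1,\lambda} \\ \vdots & \vdots & & \vdots \\ x_{1T,\lambda} & x_{2T,\lambda} & \dots & x_{nT,\lambda} \end{pmatrix}$$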

Throughout this paper we shall use the symbol $Q(\beta, a, A)$ to denote a quadratic form in variables $\beta$ centred at $a$ and with matrix $A$, namely

$$Q(\beta, a, A) = (\beta - a)' A (\beta - a)$$

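We can now write (9) as

$$\ell(\beta, \sigma \mid y) \propto \frac{1}{\sigma^T} \exp\left\{ -\frac{1}{2\sigma^2} \left[ Q(\beta, \hat\beta, Z) + \nu S^2 \right] \right\} \tag{10}$$

where

$$Z = X_\lambda' X_\lambda, \qquad \hat\beta = Z^{-1} X_\lambda' y_\lambda, \qquad \nu = T - n, \qquad S^2 = \frac{(y_\lambda - X_\lambda \hat\beta)'(y_\lambda - X_\lambda \hat\beta)}{\nu}$$

It can be seen that $\hat\beta$ and $S^2$ are the usual least squares estimates of $\beta$ and $\sigma^2$.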
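The quantities entering (10) can be computed directly from the transformed data. The sketch below is one possible way to do so in plain Python; the helper names `solve` and `ls_summary` and the three-observation data set are hypothetical, and the normal equations are solved by ordinary Gaussian elimination, which is a convenience choice rather than anything prescribed by the paper.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]   # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ls_summary(y, X):
    """Return (beta_hat, nu, S2) for the rows of X and the observations y."""
    T, n = len(X), len(X[0])
    # Z = X'X and X'y
    Z = [[sum(X[t][i] * X[t][j] for t in range(T)) for j in range(n)] for i in range(n)]
    Xty = [sum(X[t][i] * y[t] for t in range(T)) for i in range(n)]
    beta_hat = solve(Z, Xty)                         # beta_hat = Z^{-1} X'y
    nu = T - n                                       # degrees of freedom
    resid = [y[t] - sum(X[t][i] * beta_hat[i] for i in range(n)) for t in range(T)]
    S2 = sum(e * e for e in resid) / nu              # residual sum of squares / nu
    return beta_hat, nu, S2

# Hypothetical data: a constant and one regressor, three observations
beta_hat, nu, S2 = ls_summary([0.0, 1.0, 0.0], [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
print(beta_hat, nu, S2)
```

For the ratio model, `y` and `X` would be the transformed observations of (8).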

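Using Bayes's theorem, the likelihood function in (10) is combined with a prior distribution $p(\beta, \sigma)$ of the parameters $\beta$ and $\sigma$ to yield a joint posterior distribution $p(\beta, \sigma \mid y)$ for these parameters, that is

$$p(\beta, \sigma \mid y) = K \, p(\beta, \sigma) \, \ell(\beta, \sigma \mid y) \tag{11}$$

where:

$$K^{-1} = \int_R p(\beta, \sigma) \, \ell(\beta, \sigma \mid y) \, d\beta \, d\sigma$$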

Clearly the form of our posterior distribution of $\beta$ and $\sigma$ will depend on what prior distribution we adopt.

Jeffreys, H. ({6}, pp. 179-192), Savage, L.J.† and Box, G.E.P. & G.C. Tiao†† suggested that in situations where little is known about $\beta$ and $\sigma$, the prior distributions of $\beta$ and $\log \sigma$ should be taken as locally independent and uniform. In the literature this type of prior is usually known as a "noninformative prior".

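In our case, we also adopt such prior distributions†††, that is:

$$p(\beta) \propto k_1, \qquad -\infty < \beta < \infty$$

$$p(\log \sigma) \propto k_2 \quad \text{or} \quad p(\sigma) \propto \frac{1}{\sigma}, \qquad 0 < \sigma < \infty$$

The joint prior distribution of $\beta$ and $\sigma$ is

$$p(\beta, \sigma) = p(\beta) \, p(\sigma)$$

$$p(\beta, \sigma) \propto \frac{1}{\sigma}, \qquad 0 < \sigma < \infty \tag{12}$$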

† SAVAGE, L.J. "Bayesian Statistics". In Decision and Information Processes. New York, Macmillan and Co., 1952.

†† BOX, G.E.P. and G.C. TIAO. "A further look at robustness via Bayes theorem", Biometrika. Vol. 49, 1962, nr 3/4, pp. 419-433.

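Substituting (10) and (12) in (11), the joint posterior distribution of $\beta$ and $\sigma$ is:

$$p(\beta, \sigma \mid y) \propto \frac{1}{\sigma^{T+1}} \exp\left\{ -\frac{1}{2\sigma^2} \left[ Q(\beta, \hat\beta, Z) + \nu S^2 \right] \right\}$$

Integrating this joint density function over $\beta$, by the properties of the multivariate normal distribution, we get the marginal posterior distribution of $\sigma$:

$$p(\sigma \mid y) = \int_{-\infty}^{\infty} p(\beta, \sigma \mid y) \, d\beta \propto \sigma^{-(\nu+1)} \exp\left\{ -\frac{1}{2\sigma^2} \nu S^2 \right\} \tag{13}$$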

It is to be noted that the expression $\nu S^2$ is just the residual sum of squares in the least squares regression.
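The integration step behind (13) can be spelled out as follows. This is a sketch under one assumption that the text does not state explicitly, namely that the matrix Z of cross-products of the transformed regressors is positive definite (the transformed regressor matrix has full column rank n), so that the normal integral over the coefficient vector exists.

```latex
\begin{align*}
\int_{\mathbb{R}^n} \exp\Big\{-\frac{1}{2\sigma^2}\,Q(\beta,\hat\beta,Z)\Big\}\,d\beta
  &= (2\pi)^{n/2}\,\sigma^{n}\,|Z|^{-1/2}, \\
p(\sigma \mid y)
  &\propto \sigma^{-(T+1)}\,\sigma^{n}\,\exp\Big\{-\frac{\nu S^2}{2\sigma^2}\Big\}
   = \sigma^{-(\nu+1)}\,\exp\Big\{-\frac{\nu S^2}{2\sigma^2}\Big\},
  \qquad \nu = T - n .
\end{align*}
```

The factor $(2\pi)^{n/2}|Z|^{-1/2}$ does not depend on $\sigma$ and is absorbed in the constant of proportionality.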

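It can be seen that the marginal posterior distribution of $\sigma$, i.e., $p(\sigma \mid y)$ in (13), is an inverted-gamma-2 normalized density function (Raiffa, H. and R. Schlaifer {9}, p. 228):

$$f_{i\gamma 2}(\sigma \mid S, \nu) = \frac{2 \, e^{-\frac{1}{2}\nu S^2/\sigma^2} \left( \frac{1}{2}\nu S^2/\sigma^2 \right)^{\frac{1}{2}\nu + \frac{1}{2}}}{\Gamma\left(\frac{1}{2}\nu\right) \left(\frac{1}{2}\nu S^2\right)^{\frac{1}{2}}}, \qquad \sigma \geq 0; \quad S, \nu > 0 \tag{14}$$

Its first two moments are:

$$\text{Mean:} \quad \mu_1 = S \sqrt{\tfrac{1}{2}\nu} \; \frac{\Gamma\left(\frac{1}{2}\nu - \frac{1}{2}\right)}{\Gamma\left(\frac{1}{2}\nu\right)}, \qquad \nu > 1 \tag{15}$$

$$\text{Variance:} \quad \mu_2 = S^2 \frac{\nu}{\nu - 2} - \mu_1^2, \qquad \nu > 2 \tag{16}$$

The mode is:

$$S \sqrt{\frac{\nu}{\nu + 1}} \tag{17}$$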
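The moment formulas (15) to (17) are easy to evaluate numerically. The sketch below uses the log-gamma function for the ratio of gamma functions; the function name `inverted_gamma2_summary` is hypothetical. As a check it is applied to the disturbance variance reported for distribution 1.1 in table 1 (S squared equal to .1616), taking 11 degrees of freedom, which assumes 16 annual observations for 1951-1966 and five coefficients.

```python
import math

def inverted_gamma2_summary(S2, nu):
    """Mean (15), variance (16) and mode (17) of sigma; requires nu > 2."""
    if nu <= 2:
        raise ValueError("the variance (16) exists only for nu > 2")
    S = math.sqrt(S2)
    # Gamma(nu/2 - 1/2) / Gamma(nu/2), computed on the log scale for stability
    gamma_ratio = math.exp(math.lgamma((nu - 1) / 2) - math.lgamma(nu / 2))
    mean = S * math.sqrt(nu / 2) * gamma_ratio
    variance = S2 * nu / (nu - 2) - mean ** 2
    mode = S * math.sqrt(nu / (nu + 1))
    return mean, variance, mode

# S2 from table 1, row 1.1; nu = 11 is an assumption (T = 16, n = 5)
mean, variance, mode = inverted_gamma2_summary(0.1616, 11)
print(round(mean, 4), round(variance, 4), round(mode, 4))
```

Under these assumptions the mean and variance agree, to the four decimals printed in table 1, with the entries .4323 and .0106 reported for distribution 1.1.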


In most econometric problems heteroscedasticity is due to the variances of the disturbance terms being dependent on the explanatory variables. The most frequent form of heteroscedasticity results from the standard deviation being proportional to the values of one of the explanatory variables (Fisher, G. {2}, p. 156; Glejser, H. {3}, p. 3; Goldberger, A. {4}, p. 245; Johnston, J. {7}, p. 210).

In view of the above, we take for the $\lambda_t$'s in (5) the values of one of the $n$ explanatory variables.† The posterior distribution of $\sigma$, its mean and variance are calculated. This procedure is repeated for all the explanatory variables. The $n$ posterior distributions, their means and variances can be analysed from the point of view of heteroscedasticity. Theoretically, we can conclude that the model with the sharpest posterior distribution $p(\sigma \mid y)$ is the least heteroscedastic in comparison to the other $n-1$ posterior distributions, and one will naturally choose that one if the criterion is homoscedasticity. It is to be noted that our original model is obtained by taking each of the $\lambda_t$'s equal to unity, and is included within the $n$ models. In the next section, as illustrations, we will analyse two numerical examples.
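The comparison procedure described above can be sketched as a loop over the explanatory variables. The code below is an illustration only: the function names and the twelve-observation data set are hypothetical, the least squares fit uses plain Gaussian elimination, and a regressor containing a zero is skipped because the ratio model cannot be formed for it.

```python
import math

def _solve(A, b):
    """Gaussian elimination with partial pivoting for A x = b."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def posterior_variance(y, X):
    """Posterior variance (16) of sigma after a least squares fit of y on X."""
    T, n = len(X), len(X[0])
    Z = [[sum(X[t][i] * X[t][j] for t in range(T)) for j in range(n)] for i in range(n)]
    Xty = [sum(X[t][i] * y[t] for t in range(T)) for i in range(n)]
    b = _solve(Z, Xty)
    nu = T - n
    S2 = sum((y[t] - sum(X[t][i] * b[i] for i in range(n))) ** 2 for t in range(T)) / nu
    mean = math.sqrt(S2 * nu / 2) * math.exp(math.lgamma((nu - 1) / 2) - math.lgamma(nu / 2))
    return S2 * nu / (nu - 2) - mean ** 2

def sharpest_ratio_model(y, X):
    """Use each column of X in turn as the weights; return (best column, all variances)."""
    variances = {}
    for i in range(len(X[0])):
        lam = [row[i] for row in X]
        if any(l == 0 for l in lam):
            continue  # ratio model undefined when a divisor is zero
        y_r = [y_t / l for y_t, l in zip(y, lam)]
        X_r = [[x / l for x in row] for row, l in zip(X, lam)]
        variances[i] = posterior_variance(y_r, X_r)
    return min(variances, key=variances.get), variances

# Hypothetical data: disturbance proportional to the second regressor
ts = range(1, 13)
X = [[1.0, float(t), float((t - 1) ** 2)] for t in ts]
y = [1.0 + 2.0 * t + 0.1 * t * (-1) ** t for t in ts]
best, variances = sharpest_ratio_model(y, X)
print(best, variances)
```

Column 0 is the constant, so the entry for index 0 corresponds to the original model with all weights equal to unity.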


3. Illustrations

3.1 As a first illustration, we choose a rate of inventory formation equation based on United Kingdom figures from 1951-1966:

$$N_t = \beta_1 + \beta_2 (N_{-1}/V'_{-1})_t + \beta_3 H_t + \beta_4 V_t + \beta_5 K_t + u_t$$

Explanation of symbols†

N - o N~V'1ti ti . the dependent variable represents the inventory changes, their rate of change being expressed as a percentage of lagged total expen-diture less inventory changes and net invisibles ;

ti

N - inventory changes ; H- labour cost per unit ;

V' - total expenditure less inventory changes and

net invisibles ;

ti

K- gross profits per unit of output ;

ti

K - 4 K x 100.

There are five explanatory variables in this equation (constant included), so there will be five posterior distributions, which are indexed 1.1 to 1.5. The posterior distribution indexed by 1.1 is based on the original equation (all $\lambda_t$'s equal to 1). The other posterior distributions are derived by dividing by the values of the explanatory variables; e.g. the posterior distribution indexed by 1.2 is derived from the ratio model where the $\lambda_t$'s take the values of $(N_{-1}/V'_{-1})_t$ at different time periods.

The calculated results are given in table 1.

The values of the Poisson distribution are used to determine the posterior density functions $p(\sigma \mid y)$. This was possible because the cumulative inverted-gamma-2 distribution is related to the cumulative Poisson distribution (Raiffa, H. and R. Schlaifer {9}, p. 228). The whole work has been done on an IBM 1620 II computer.
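One way to see the link with the Poisson distribution: under (13) the quantity $\nu S^2/(2\sigma^2)$ follows a gamma distribution with shape $\nu/2$, and for integer shape the upper tail of a gamma distribution equals a cumulative Poisson probability. The sketch below covers only that case (even degrees of freedom); the paper does not say how its calculations were organised, so the function names and the treatment here are assumptions. The example values are those of distribution 2.1 in table 1, with 12 degrees of freedom assumed (17 observations, five coefficients).

```python
import math

def poisson_cdf(k, m):
    """P(N <= k) for N distributed as Poisson with mean m."""
    term = total = math.exp(-m)
    for j in range(1, k + 1):
        term *= m / j
        total += term
    return total

def sigma_posterior_cdf(x, S2, nu):
    """P(sigma <= x | y) under (13), valid for even nu only."""
    if nu % 2 != 0:
        raise ValueError("this sketch handles even degrees of freedom only")
    # P(sigma <= x) = P(Gamma(nu/2) >= nu*S2/(2*x^2)) = P(Poisson <= nu/2 - 1)
    return poisson_cdf(nu // 2 - 1, nu * S2 / (2 * x * x))

# S2 from table 1, row 2.1; nu = 12 is an assumption (T = 17, n = 5)
for x in (2.0, 2.8, 4.0):
    print(x, sigma_posterior_cdf(x, 7.0823, 12))
```

Differencing these cumulative probabilities on a fine grid gives an approximation to the posterior density that is plotted in the figures.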

From table 1 it is seen that the posterior distribution nr 1.4 has the lowest posterior variance.

In figure 1 the graphs of the posterior distributions are given. From the figure we see that the posterior distribution nr 1.4 is the sharpest one, which is as we expected.

The posterior variance of the original model is 38 times the variance of the sharpest distribution:

$$\frac{\text{posterior variance of } p(\sigma \mid y) \text{ nr 1.1}}{\text{posterior variance of } p(\sigma \mid y) \text{ nr 1.4}} = \frac{.0106551}{.0002805} = 37.986.$$

We can safely infer from the above that the original model is significantly heteroscedastic in comparison with the model with the sharpest distribution. We note that this kind of analysis may also show that the original model is the least heteroscedastic in comparison to the other ratio models.

Table 1.

Posterior                         Disturbance   Values of the regression coefficients
distribution  Mean    Variance  R†     variance S²   β1        β2       β3     β4      β5

Example 1: Rate of inventory formation

Nr. 1.1       .4323   .0106    .9763    .1616      1.1008  -1.5007  .1876   .3584    .6304
    1.2       .6070   .0210    .9503    .3186      1.6772  -1.3304  .2631   .3969    .6345
    1.3       .1926   .0021    .8981    .0321      1.7023  -1.6460  .4197   .3780   1.1690
    1.4       .0701   .0002    .9554    .0042      1.2344  -1.5450  .2346   .3655    .7546
    1.5      1.1982   .0818    .9760   1.2417      1.9477  -1.4946  .3339   .4019    .8810

Example 2: Induced investment

    2.1      2.8434   .4138    .8634   7.0823     15.5816  -1.1585  .2732  -.1220  -1.0369
    2.2       .5455   .0152    .9798    .2606     17.5651  -1.4527  .3168  -.1304  -1.3249
    2.3
    2.4       .1767   .0015    .9789    .0273     14.7125   -.8898  .2916  -.0363  -1.1560
    2.5      1.4530   .1080    .9882   1.8495     12.0189   -.6294  .2336  -.1478   -.3059

† Multiple correlation coefficient, adjusted for degrees of freedom.

3.2 As a second illustration, we choose an induced investment equation. This equation is based on United Kingdom figures from 1950-1966:

$$I_t = \beta_1 + \beta_2 (Z_{-1} - T_Z)_t + \beta_3 r_{-1/2,t} + \beta_4 W_t + \beta_5 P_{ai,-1,t} + u_t$$

Explanation of symbols†

I - induced investment ;

Z - non-labour income ;

T_Z - p(T/Z) . change of tax rate on non-labour income ;

(Z_{-1} - T_Z) - difference in change of tax rate on labour income and the change in lagged non-labour income itself ;

r - Bank rate ;

W - registered wholly unemployed ;

P_ai - price index of autonomous investment.

In this example the posterior distribution indexed by 2.3 cannot be derived, because one of the values of $r_{-1/2}$ is exactly zero.

The results of the calculations are also given in table 1 above, and the posterior distributions are plotted in figure 2.

Here we find that the $p(\sigma \mid y)$ indexed by 2.4 is the sharpest.

The posterior variance of the original model is now 259 times the variance of the sharpest distribution.

4. Conclusion

The analysis given in the preceding pages is a simple way of detecting and correcting heteroscedasticity where heteroscedasticity is in the form of standard deviations being proportional to one of the explanatory variables.

It is a comparative procedure based on the posterior distributions, and not a direct test. Nevertheless it is quite reasonable to make such an analysis for heteroscedasticity. With a high-speed computer this type of analysis will not involve much additional work.

5. References

{ 1} ANDERSON, T.W. An Introduction to Multivariate Statistical Analysis. New York, John Wiley & Sons, 1958, 374 pp.

{ 2} FISHER, Gordon R. "Iterative Solutions and Heteroscedasticity in Regression Analysis", Review of the International Statistical Institute. Vol. 30, 1962, nr 2, pp. 153-159.

{ 3} GLEJSER, H. Testing Heteroscedasticity in Regression Disturbances. Warsaw, mimeographed paper presented at the joint European meeting of the Econometric Society and the Institute of Management Sciences, September 1966, 13 pp.

{ 4} GOLDBERGER, Arthur S. Econometric Theory. New York, John Wiley & Sons, 1964, pp. 231-246.

{ 6} JEFFREYS, H. Theory of Probability. Oxford, Clarendon Press, 1961, 3rd edition, 459 pp.

{ 7} JOHNSTON, J. Econometric Methods. London, McGraw Hill Book, 1963, pp. 207-211.

{ 8} MALINVAUD, Edmond. Méthodes Statistiques de l'Économétrie. Paris, Dunod, 1964, pp. 261-265.

{ 9} RAIFFA, Howard and Robert SCHLAIFER. Applied Statistical Decision Theory. Boston, Division of Research, Graduate School of Business Administration, Harvard University, 1961, 356 pp.

{ 10 } ROTHENBERG, T. A Bayesian analysis of simultaneous equations systems. Rotterdam, Econometric Institute Report 6315, 1963, 20 pp.

{ 11 } RUTEMILLER, Herbert C. and David A. BOWERS. "Estimation in a Heteroscedastic Regression Model", Journal of the American Statistical Association. Vol. 63, June 1968, nr 322, pp. 552-557.

{ 12 } SAVAGE, Leonard J. The Foundations of Statistics. New York, John Wiley and Sons, 1954, 294 pp.

{ 13 } TIAO, George C. and Arnold ZELLNER. "Bayes's
