Fixed T robust LM tests for Time Persistence using a Predictive Model for the Initial Conditions

Technical Supplement to Master’s Thesis

Ties van den Ende (s1885677) January 15, 2016

1 Introduction

In this paper we will develop LM tests that are, if applied simultaneously, capable of pointing out the type of time persistence at hand, if any. Existing tests are based on initial assumptions, restricting and influencing the outcomes. Therefore, we aim to create tests that are, similar to Anselin, Bera, Florax, and Yoon [1996], robust to the presence of either type of time persistence, circumventing the need for initial assumptions. More formally, we consider the following combined hypothesis test:

H0: No state dependence and no serial correlation;
H1A: State dependence, regardless of serial correlation;
H1B: Serial correlation, regardless of state dependence. (1.1)

To construct two LM tests associated with these combined hypotheses, we need the score vector and information matrix of a general dynamic panel data model with serial correlation, evaluated at H0. Therefore, we will adhere to the maximum likelihood approach by Hsiao, Pesaran, and Tahmiscioglu [2002]. After these have been derived, applying a modification to the 'standard' LM statistic following Bera and Yoon [1993] will yield the relevant test statistics.

2 The model

We consider the following general dynamic panel data model with fixed individual effects and serial correlation, hence allowing for both types of time persistence, for $i = 1, \ldots, N$ and $t = 2, \ldots, T$:
\[
y_{it} = \gamma y_{i,t-1} + x_{it}'\beta + u_{it}, \qquad u_{it} = \mu_i + \nu_{it}, \qquad \nu_{it} = \rho\nu_{i,t-1} + \epsilon_{it}, \tag{2.1}
\]
where $y = (y_{12}, \ldots, y_{NT})'$ is a vector of dependent variables of individuals over time, $y_{-1}$ is the observable vector of dependent variables of individuals over time, lagged once, $X = (x_{12}, \ldots, x_{NT})'$ is a matrix of $k$ exogenous explanatory variables of individuals over time, $\mu_i$ is the fixed individual-specific effect, $\nu = (\nu_{12}, \ldots, \nu_{NT})'$ are disturbances, $\nu_{-1}$ are the disturbances, lagged once, $\epsilon_{it} \sim \mathrm{IIN}(0, \sigma^2)$, and $\gamma$, $\beta$, $\rho$ and $\sigma^2$ are parameters to be estimated. Throughout we will assume stationarity, i.e. $|\gamma| < 1$ and $|\rho| < 1$. We assume the initial observations $y_{i1}$ are observable.

We define
\[
V_\rho = \begin{pmatrix}
1 & \rho & \rho^2 & \cdots & \rho^{T-2} \\
\rho & 1 & \rho & \cdots & \rho^{T-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\rho^{T-2} & \rho^{T-3} & \rho^{T-4} & \cdots & 1
\end{pmatrix}.
\]
We find $E[\nu\nu'] = \Omega = \sigma^2(I_N \otimes V_\rho)$. Under normality of the disturbances, the loglikelihood function of model 2.1 is given by:

\[
L(\theta) = \text{constant} - \frac{1}{2}\log|\Omega| - \frac{1}{2\sigma^2}\sum_{i=1}^{N}\left(y_i - \gamma y_{i,-1} - X_i\beta - \iota_{T-1}\mu_i\right)'V_\rho^{-1}\left(y_i - \gamma y_{i,-1} - X_i\beta - \iota_{T-1}\mu_i\right),
\]
where $\theta' = (\gamma, \beta', \rho, \sigma^2)$, $\iota_{T-1}$ is a $(T-1)$-vector of ones, $y_i$ has elements $y_{it}$, $y_{i,-1}$ has elements $y_{i,t-1}$ and $X_i$ has rows $x_{it}'$. To cope with the individual effects, we consider the first-difference transform of model 2.1.
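As a numerical aside (a minimal NumPy sketch with arbitrary illustrative sizes and values, not taken from the text), $V_\rho$ and $\Omega$ can be built directly from their definitions:

```python
import numpy as np

def V_rho(rho: float, T: int) -> np.ndarray:
    """(T-1) x (T-1) matrix with (s, t) element rho^|s-t|, as defined above."""
    idx = np.arange(T - 1)
    return rho ** np.abs(idx[:, None] - idx[None, :])

# illustrative sizes and parameter values (assumptions for the sketch)
N, T, rho, sigma2 = 3, 5, 0.5, 2.0
Omega = sigma2 * np.kron(np.eye(N), V_rho(rho, T))  # E[nu nu'] = sigma^2 (I_N (x) V_rho)
print(V_rho(rho, T)[0])  # first row: 1, 0.5, 0.25, 0.125
```

The Kronecker structure reflects cross-sectional independence: the same $(T-1)\times(T-1)$ AR(1) correlation block repeats for each individual.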

Let us define $\mathbf{D} = I_N \otimes D$, where $D$ is a $(T-2)\times(T-1)$ matrix:
\[
D = \begin{pmatrix}
-1 & 1 & 0 & \cdots & 0 & 0 \\
0 & -1 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -1 & 1
\end{pmatrix}.
\]

Now consider the first-difference transform of model 2.1:
\[
\Delta y = \gamma\Delta y_{-1} + \Delta X\beta + \Delta\nu, \qquad \Delta\nu = \rho\Delta\nu_{-1} + \Delta\epsilon, \tag{2.2}
\]
where $\Delta y = \mathbf{D}y$ has elements $y_{it} - y_{i,t-1}$ for $i = 1, \ldots, N$ and $t = 3, \ldots, T$, and similar transformations hold for $\Delta y_{-1}$, $\Delta X$, $\Delta\nu$, $\Delta\nu_{-1}$ and $\Delta\epsilon$. This conveniently cancels out the individual effect, solving the incidental parameter problem it induces. The transformation does, however, bring about another complication, namely the influence of the assumptions regarding the initial observation of the dependent variable. To make matters more explicit, we will elaborate on the (influence of) the initial observations.
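A minimal numerical check (plain NumPy, arbitrary sizes) that premultiplying by $D$ removes the time-constant individual effect:

```python
import numpy as np

def diff_matrix(T: int) -> np.ndarray:
    """(T-2) x (T-1) first-difference matrix D defined above."""
    return np.eye(T - 2, T - 1, k=1) - np.eye(T - 2, T - 1)

T = 5
D = diff_matrix(T)
rng = np.random.default_rng(0)
y_i = rng.normal(size=T - 1)   # one individual's y_{i2}, ..., y_{iT}
mu_i = 3.7                     # a time-constant individual effect
# adding mu_i to every period leaves the differenced series unchanged
assert np.allclose(D @ (y_i + mu_i), D @ y_i)
```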

3 Initial observations

This discussion regarding the initial observations is, to a large extent, derived from Hsiao et al. [2002]. It starts by noting that model 2.2 is well defined for $t = 3, \ldots, T$, but not for $t = 2$, since we lack observations on $\Delta y_{i1}$. Therefore, we need information on the process before the periods under consideration. Let us formalize this in the following assumption:

Assumption 1. (i) The process has started from the $(1-m)$th period, $m = 0, 1, \ldots$, and has then evolved according to model 2.1; (ii) the start of the process, $y_{i,1-m}$, is treated as exogenous.

For $m > 0$, starting from $y_{i,1-m}$ and by continuous substitution of model 2.2, we can write $\Delta y_{i2}$ as
\[
\Delta y_{i2} = \gamma^m\Delta y_{i,2-m} + \sum_{j=0}^{m-1}\gamma^j\Delta x_{i,2-j}'\beta + \sum_{j=0}^{m-1}\gamma^j\Delta\nu_{i,2-j}.
\]
Then we find
\[
\eta_{i2} \equiv E[\Delta y_{i2} \mid \Delta y_{i,2-m}, \Delta x_{i2}, \ldots, \Delta x_{i,3-m}] = \gamma^m\Delta y_{i,2-m} + \sum_{j=0}^{m-1}\gamma^j\Delta x_{i,2-j}'\beta.
\]

Since $x_{i,2-j}$ is unknown for $j > 0$, $\eta_{i2}$ is unknown. Treating $\eta_{i2}$ as an extra parameter will once more result in the incidental parameter problem. Therefore, Hsiao et al. [2002] propose a predictive model for $\eta_{i2}$ based on assumptions on (the data generating process of) the observable explanatory variables. Let $\Delta X_i = (\Delta x_{i2}, \ldots, \Delta x_{iT})'$ and let $q_{i2} = \eta_{i2} - E[\eta_{i2} \mid \Delta X_i]$. First of all, we assume that $m$ is sufficiently large ($m \to \infty$) for the process to have reached stationarity. Furthermore, we assumed throughout that the explanatory variables are strictly exogenous. Then the marginal distribution of $\Delta y_{i2}$ conditional on $\Delta X_i$ is given by:
\[
\Delta y_{i2} = \pi'\Delta X_i\iota_k + \xi_{i2}, \qquad \xi_{i2} = q_{i2} + \sum_{j=0}^{\infty}\gamma^j\Delta\nu_{i,2-j},
\]
where $\pi$ is a $(T-1)$-vector of unknown parameters. Now that we have dealt with the initial observations, we will derive the score vector and information matrix corresponding to the transformed model.

4 Score vector and information matrix

First, let us define $\Delta\nu_i^* = (\xi_{i2}, \Delta\nu_i')'$ for $i = 1, \ldots, N$ and $\Delta\nu^* = (\Delta\nu_1^{*\prime}, \ldots, \Delta\nu_N^{*\prime})'$. Next, consider $E[\Delta\nu\Delta\nu'] = \mathbf{D}\Omega\mathbf{D}' = \sigma^2\{I_N \otimes \Sigma\}$, where $\Sigma$ is the $(T-2)\times(T-2)$ matrix given by $\Sigma = DV_\rho D'$. Under H0, we find
\[
\Sigma|_{H_0} = \hat\Sigma = \begin{pmatrix}
2 & -1 & 0 & \cdots & 0 \\
-1 & 2 & -1 & \cdots & 0 \\
0 & -1 & 2 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 2
\end{pmatrix}.
\]

Furthermore, define $E[\Delta\nu^*\Delta\nu^{*\prime}] = \Omega^* = \sigma^2\{I_N \otimes \Sigma^*\}$. Under H0, we find $\Omega^* = \sigma^2\{I_N \otimes \hat\Sigma^*\}$, where
\[
\hat\Sigma^* = \begin{pmatrix} 2 & -e_1' \\ -e_1 & \hat\Sigma \end{pmatrix},
\]
with $e_1$ the first unit vector of length $T-2$; that is, $\hat\Sigma^*$ is $\hat\Sigma$ bordered by a first row and first column equal to $(2, -1, 0, \ldots, 0)$,

and therefore, under H0, $\Omega^{*-1} = \frac{1}{\sigma^2}\left\{I_N \otimes \hat\Sigma^{*-1}\right\}$, $\frac{\partial\Omega^*}{\partial\sigma^2} = I_N \otimes \hat\Sigma^*$ and $\frac{\partial\Omega^*}{\partial\rho} = \sigma^2\left\{I_N \otimes \frac{\partial\hat\Sigma^*}{\partial\rho}\right\}$, where
\[
\frac{\partial\hat\Sigma^*}{\partial\rho} = \left.\frac{\partial\Sigma^*}{\partial\rho}\right|_{H_0} = \begin{pmatrix}
-2 & 2 & -1 & \cdots & 0 \\
2 & -2 & 2 & \cdots & 0 \\
-1 & 2 & -2 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & -2
\end{pmatrix}.
\]
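Both $\hat\Sigma$ and the banded pattern of the $\rho$-derivative can be verified mechanically (a NumPy sketch; the derivative is approximated by central finite differences, and the sizes are illustrative):

```python
import numpy as np

T = 6
idx = np.arange(T - 1)
V = lambda rho: rho ** np.abs(idx[:, None] - idx[None, :])   # V_rho
D = np.eye(T - 2, T - 1, k=1) - np.eye(T - 2, T - 1)         # first-difference matrix
Sigma = lambda rho: D @ V(rho) @ D.T                         # Sigma = D V_rho D'

S0 = Sigma(0.0)   # under H0: 2 on the diagonal, -1 on the first off-diagonal
assert np.allclose(np.diag(S0), 2) and np.allclose(np.diag(S0, 1), -1)

h = 1e-6          # dSigma/drho at rho = 0 via central finite differences
dS = (Sigma(h) - Sigma(-h)) / (2 * h)
# same banded pattern as displayed: -2, then 2, then -1 on successive (off-)diagonals
assert np.allclose(np.diag(dS), -2) and np.allclose(np.diag(dS, 1), 2)
assert np.allclose(np.diag(dS, 2), -1)
```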

In this case, we are concerned with the following loglikelihood:
\[
L(\theta) = \text{constant} - \frac{1}{2}\log|\Omega^*| - \frac{1}{2}\Delta\nu^{*\prime}\Omega^{*-1}\Delta\nu^*,
\]
where $\theta' = (\pi', \gamma, \beta', \rho, \sigma^2)$. Now define $\Delta\tilde y_{i,-1} = (0, \Delta y_{i2}, \ldots, \Delta y_{i,T-1})'$ with $\Delta\tilde y_{-1} = (\Delta\tilde y_{1,-1}', \ldots, \Delta\tilde y_{N,-1}')'$, $\Delta\tilde X_i = (0_k, \Delta x_{i3}, \ldots, \Delta x_{iT})'$ with $0_k$ a $k$-vector of zeros, $\Delta\tilde X = (\Delta\tilde X_1', \ldots, \Delta\tilde X_N')'$,
\[
\Delta\tilde X_i^* = \begin{pmatrix} (\Delta X_i\iota_k)' \\ 0_{(T-2)\times(T-1)} \end{pmatrix}
\]
and $\Delta\tilde X^* = (\Delta\tilde X_1^{*\prime}, \ldots, \Delta\tilde X_N^{*\prime})'$. We find

\[
\begin{aligned}
\frac{\partial L(\theta)}{\partial\pi} &= \Delta\tilde X^{*\prime}\Omega^{*-1}\Delta\nu^* \\
\frac{\partial L(\theta)}{\partial\gamma} &= \Delta\tilde y_{-1}'\Omega^{*-1}\Delta\nu^* \\
\frac{\partial L(\theta)}{\partial\beta} &= \Delta\tilde X'\Omega^{*-1}\Delta\nu^* \\
\frac{\partial L(\theta)}{\partial\rho} &= -\frac{N}{2}\operatorname{tr}\!\left(\Sigma^{*-1}\frac{\partial\Sigma^*}{\partial\rho}\right) + \frac{1}{2\sigma^2}\Delta\nu^{*\prime}\left\{I_N \otimes \Sigma^{*-1}\frac{\partial\Sigma^*}{\partial\rho}\Sigma^{*-1}\right\}\Delta\nu^* \\
\frac{\partial L(\theta)}{\partial\sigma^2} &= -\frac{N(T-1)}{2\sigma^2} + \frac{1}{2\sigma^4}\Delta\nu^{*\prime}\left\{I_N \otimes \Sigma^{*-1}\right\}\Delta\nu^*.
\end{aligned}
\]

Setting $\frac{\partial L(\theta)}{\partial\pi} = 0$ and solving for $\pi$ yields an estimator of $\pi$, which under H0 is given by
\[
\hat\pi = \left(\Delta\tilde X^{*\prime}\left\{I_N \otimes \hat\Sigma^{*-1}\right\}\Delta\tilde X^*\right)^{-1}\Delta\tilde X^{*\prime}\left\{I_N \otimes \hat\Sigma^{*-1}\right\}\Delta y,
\]
setting $\frac{\partial L(\theta)}{\partial\beta} = 0$ and solving for $\beta$ yields an estimator of $\beta$, which under H0 is given by
\[
\hat\beta = \left(\Delta\tilde X'\left\{I_N \otimes \hat\Sigma^{*-1}\right\}\Delta\tilde X\right)^{-1}\Delta\tilde X'\left\{I_N \otimes \hat\Sigma^{*-1}\right\}\Delta y,
\]
and setting $\frac{\partial L(\theta)}{\partial\sigma^2} = 0$ and solving for $\sigma^2$ yields an estimator of $\sigma^2$, which under H0 is given by
\[
\hat\sigma^2 = \frac{\Delta\hat\nu'\left\{I_N \otimes \hat\Sigma^*\right\}^{-1}\Delta\hat\nu}{N(T-1)},
\]
where $\Delta\hat\nu = \Delta y - \Delta\tilde X^*\hat\pi - \Delta\tilde X\hat\beta$ are the restricted estimation residuals. Then under H0 we find the following estimates of elements of the score vector:

\[
\begin{aligned}
\hat d(\hat\theta)_\gamma &= \frac{1}{\hat\sigma^2}\Delta\tilde y_{-1}'\left\{I_N \otimes \hat\Sigma^{*-1}\right\}\Delta\hat\nu \\
\hat d(\hat\theta)_\rho &= \frac{N(T-1)}{T} + \frac{1}{2\hat\sigma^2}\Delta\hat\nu'\left\{I_N \otimes \hat\Sigma^{*-1}\frac{\partial\hat\Sigma^*}{\partial\rho}\hat\Sigma^{*-1}\right\}\Delta\hat\nu,
\end{aligned}
\]
where we used that $\operatorname{tr}\!\left(\hat\Sigma^{*-1}\frac{\partial\hat\Sigma^*}{\partial\rho}\right) = -\frac{2(T-1)}{T}$.
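The trace identity can be checked mechanically for any $T$ (a NumPy sketch; $\hat\Sigma^*$ and its $\rho$-derivative are the $(T-1)\times(T-1)$ banded matrices given above):

```python
import numpy as np

for T in range(3, 12):
    n = T - 1   # Sigma*-hat is (T-1) x (T-1)
    S = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    dS = (-2 * np.eye(n) + 2 * (np.eye(n, k=1) + np.eye(n, k=-1))
          - (np.eye(n, k=2) + np.eye(n, k=-2)))
    # tr(Sigma*-hat^{-1} dSigma*-hat/drho) = -2(T-1)/T
    assert np.isclose(np.trace(np.linalg.solve(S, dS)), -2 * (T - 1) / T)
```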

Next, we will derive the information matrix $J(\theta) = -E\!\left[\frac{\partial^2 L(\theta)}{\partial\theta_j\partial\theta_l'}\right]$ for $j, l = 1, \ldots, 5$. Therefore, we will use that $E[z'Ak] = \operatorname{tr}\left(E[z'Ak]\right) = \operatorname{tr}\left(A\,E[kz']\right)$ for $N(T-1)$-vectors $z$, $k$ and a conformable matrix $A$. Now let us define $\Delta\tilde X_{i,-1} = (0_k, \Delta x_{i2}, \ldots, \Delta x_{i,T-1})'$, $\Delta\tilde X_{-1} = (\Delta\tilde X_{1,-1}', \ldots, \Delta\tilde X_{N,-1}')'$, $\Delta\tilde\nu_i = (0, \Delta\nu_i')'$ and $\Delta\tilde\nu = (\Delta\tilde\nu_1', \ldots, \Delta\tilde\nu_N')'$. Furthermore, let $E[\Delta\nu^*\Delta\tilde\nu'] = \sigma^2\{I_N \otimes Q\}$. We observe
\[
\begin{aligned}
J(\theta)_{[\gamma,\gamma]} &= E[\Delta\tilde y_{-1}'\Omega^{*-1}\Delta\tilde y_{-1}] = E[(\Delta\tilde X_{-1}\beta)'\Omega^{*-1}(\Delta\tilde X_{-1}\beta)] + E[\Delta\tilde\nu'\Omega^{*-1}\Delta\tilde\nu] \\
&= (\Delta\tilde X_{-1}\beta)'\Omega^{*-1}(\Delta\tilde X_{-1}\beta) + \operatorname{tr}\left\{\Omega^{*-1}E[\Delta\tilde\nu\Delta\tilde\nu']\right\}.
\end{aligned}
\]
Similarly,
\[
J(\theta)_{[\gamma,\rho]} = N\cdot\operatorname{tr}\left\{\Sigma^{*-1}\frac{\partial\Sigma^*}{\partial\rho}\Sigma^{*-1}Q\right\}
\qquad\text{and}\qquad
J(\theta)_{[\gamma,\sigma^2]} = \frac{N}{\sigma^2}\operatorname{tr}\left\{\Sigma^{*-1}Q\right\}.
\]

We obtain estimates of elements of the information matrix:
\[
\begin{aligned}
\hat J(\hat\theta)_{[\gamma,\pi]} &= \hat J(\hat\theta)_{[\pi,\gamma]} = \frac{1}{\hat\sigma^2}(\Delta\tilde X_{-1}\hat\beta)'\left\{I_N \otimes \hat\Sigma^{*-1}\right\}\Delta\tilde X^* \\
\hat J(\hat\theta)_{[\gamma,\gamma]} &= \frac{1}{\hat\sigma^2}(\Delta\tilde X_{-1}\hat\beta)'\left\{I_N \otimes \hat\Sigma^{*-1}\right\}(\Delta\tilde X_{-1}\hat\beta) + \frac{N(T+1)(T-2)}{T} \\
\hat J(\hat\theta)_{[\gamma,\beta]} &= \hat J(\hat\theta)_{[\beta,\gamma]} = \frac{1}{\hat\sigma^2}(\Delta\tilde X_{-1}\hat\beta)'\left\{I_N \otimes \hat\Sigma^{*-1}\right\}\Delta\tilde X \\
\hat J(\hat\theta)_{[\gamma,\rho]} &= \hat J(\hat\theta)_{[\rho,\gamma]} = \frac{N(T-1)(T-2)}{T} \\
\hat J(\hat\theta)_{[\gamma,\sigma^2]} &= \hat J(\hat\theta)_{[\sigma^2,\gamma]} = 0
\end{aligned}
\]
and
\[
\begin{aligned}
\hat J(\hat\theta)_{[\pi,\pi]} &= \frac{1}{\hat\sigma^2}\Delta\tilde X^{*\prime}\left\{I_N \otimes \hat\Sigma^{*-1}\right\}\Delta\tilde X^* \\
\hat J(\hat\theta)_{[\pi,\rho]} &= \hat J(\hat\theta)_{[\rho,\pi]} = 0 \\
\hat J(\hat\theta)_{[\beta,\pi]} &= \hat J(\hat\theta)_{[\pi,\beta]} = \frac{1}{\hat\sigma^2}\Delta\tilde X'\left\{I_N \otimes \hat\Sigma^{*-1}\right\}\Delta\tilde X^* \\
\hat J(\hat\theta)_{[\beta,\beta]} &= \frac{1}{\hat\sigma^2}\Delta\tilde X'\left\{I_N \otimes \hat\Sigma^{*-1}\right\}\Delta\tilde X \\
\hat J(\hat\theta)_{[\beta,\rho]} &= \hat J(\hat\theta)_{[\rho,\beta]} = 0 \\
\hat J(\hat\theta)_{[\rho,\rho]} &= \frac{N}{2}\operatorname{tr}\!\left(\hat\Sigma^{*-1}\frac{\partial\hat\Sigma^*}{\partial\rho}\hat\Sigma^{*-1}\frac{\partial\hat\Sigma^*}{\partial\rho}\right) \\
\hat J(\hat\theta)_{[\rho,\sigma^2]} &= \hat J(\hat\theta)_{[\sigma^2,\rho]} = -\frac{N(T-1)}{\hat\sigma^2 T} \\
\hat J(\hat\theta)_{[\sigma^2,\sigma^2]} &= \frac{N(T-1)}{2\hat\sigma^4} \\
\hat J(\hat\theta)_{[\beta,\sigma^2]} &= \hat J(\hat\theta)_{[\sigma^2,\beta]} = 0,
\end{aligned}
\]
where we used that under H0 the expectation of the score vector equals zero (see Appendix A) and therefore $\operatorname{tr}\{\hat\Sigma^{*-1}Q\} = 0$. Moreover, using Kruiniger [2000], we find
\[
\frac{1}{2}\operatorname{tr}\!\left(\hat\Sigma^{*-1}\frac{\partial\hat\Sigma^*}{\partial\rho}\hat\Sigma^{*-1}\frac{\partial\hat\Sigma^*}{\partial\rho}\right) = \frac{(T-2)(T-1)T + 2}{T^2}.
\]
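The half-trace used in $\hat J(\hat\theta)_{[\rho,\rho]}$ can be cross-checked numerically: with $\hat\Sigma^*$ and $\partial\hat\Sigma^*/\partial\rho$ the banded matrices defined under H0, it equals $((T-2)(T-1)T + 2)/T^2$ for every $T$ tried below (a NumPy sketch):

```python
import numpy as np

for T in range(3, 10):
    n = T - 1
    S = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)    # Sigma*-hat
    dS = (-2 * np.eye(n) + 2 * (np.eye(n, k=1) + np.eye(n, k=-1))
          - (np.eye(n, k=2) + np.eye(n, k=-2)))             # d Sigma*-hat / d rho
    M = np.linalg.solve(S, dS)
    half_trace = 0.5 * np.trace(M @ M)
    assert np.isclose(half_trace, ((T - 2) * (T - 1) * T + 2) / T**2)
```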

Following Breusch and Pagan [1980] in constructing LM statistics, we define
\[
\begin{aligned}
\hat\psi_{\gamma\gamma} &\equiv \hat J(\hat\theta)_{\gamma\gamma|\beta,\sigma^2} = \hat J(\hat\theta)_{[\gamma,\gamma]} - \hat J(\hat\theta)_{[\gamma,\beta]}'\hat J(\hat\theta)_{[\beta,\beta]}^{-1}\hat J(\hat\theta)_{[\beta,\gamma]} - \hat J(\hat\theta)_{[\gamma,\sigma^2]}\hat J(\hat\theta)_{[\sigma^2,\sigma^2]}^{-1}\hat J(\hat\theta)_{[\sigma^2,\gamma]} \\
\hat\psi_{\rho\rho} &\equiv \hat J(\hat\theta)_{\rho\rho|\beta,\sigma^2} = \hat J(\hat\theta)_{[\rho,\rho]} - \hat J(\hat\theta)_{[\rho,\beta]}'\hat J(\hat\theta)_{[\beta,\beta]}^{-1}\hat J(\hat\theta)_{[\beta,\rho]} - \hat J(\hat\theta)_{[\rho,\sigma^2]}\hat J(\hat\theta)_{[\sigma^2,\sigma^2]}^{-1}\hat J(\hat\theta)_{[\sigma^2,\rho]} \\
\hat\psi_{\rho\gamma} &\equiv \hat J(\hat\theta)_{\rho\gamma|\beta,\sigma^2} = \hat J(\hat\theta)_{[\rho,\gamma]} - \hat J(\hat\theta)_{[\rho,\beta]}'\hat J(\hat\theta)_{[\beta,\beta]}^{-1}\hat J(\hat\theta)_{[\beta,\gamma]} - \hat J(\hat\theta)_{[\rho,\sigma^2]}\hat J(\hat\theta)_{[\sigma^2,\sigma^2]}^{-1}\hat J(\hat\theta)_{[\sigma^2,\gamma]}.
\end{aligned}
\]
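The $\hat\psi$ quantities above are Schur complements: the information for $(\gamma, \rho)$ after partialling out the nuisance block $(\beta, \sigma^2)$. A quick numerical check of the underlying identity, with a generic positive-definite matrix standing in for $\hat J(\hat\theta)$ (the matrix itself is made up for the sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))
J = A @ A.T + 4 * np.eye(4)   # a generic positive-definite "information matrix"
# order parameters so that index 0 plays the role of gamma, 1..3 the nuisance block
psi = J[0, 0] - J[0, 1:] @ np.linalg.solve(J[1:, 1:], J[1:, 0])
# the Schur complement is the reciprocal of the corresponding element of J^{-1}
assert np.isclose(psi, 1 / np.linalg.inv(J)[0, 0])
```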

In constructing the test statistics, we need elements of the inverse of the information matrix. Therefore, let us define $\hat Z(\hat\theta) = \hat J(\hat\theta)^{-1}$ and let $\hat Z(\hat\theta)_{[\gamma,\gamma]}$, $\hat Z(\hat\theta)_{[\gamma,\rho]}$ and $\hat Z(\hat\theta)_{[\rho,\rho]}$ be the elements of the inverse of the information matrix corresponding to $[\gamma,\gamma]$, $[\gamma,\rho]$ and $[\rho,\rho]$. Following Anselin et al. [1996] and Bera and Yoon [1993], we propose modified LM tests for testing $H_0: \rho = 0$ against $H_1: \rho \neq 0$ in the presence of $\gamma$, and for testing $H_0: \gamma = 0$ against $H_1: \gamma \neq 0$ in the presence of $\rho$, independent of the number of time periods at hand:
\[
LM_\rho = \frac{\left\{\hat d(\hat\theta)_\rho - \hat Z(\hat\theta)_{[\gamma,\rho]}^{-1}\hat Z(\hat\theta)_{[\gamma,\gamma]}\hat d(\hat\theta)_\gamma\right\}^2}{\hat Z(\hat\theta)_{[\rho,\rho]}^{-1} - \hat Z(\hat\theta)_{[\gamma,\rho]}^{-2}\hat Z(\hat\theta)_{[\gamma,\gamma]}} \xrightarrow{d} \chi^2_1 \tag{4.1}
\]
\[
LM_\gamma = \frac{\left\{\hat d(\hat\theta)_\gamma - \hat Z(\hat\theta)_{[\gamma,\rho]}^{-1}\hat Z(\hat\theta)_{[\rho,\rho]}\hat d(\hat\theta)_\rho\right\}^2}{\hat Z(\hat\theta)_{[\gamma,\gamma]}^{-1} - \hat Z(\hat\theta)_{[\gamma,\rho]}^{-2}\hat Z(\hat\theta)_{[\rho,\rho]}} \xrightarrow{d} \chi^2_1. \tag{4.2}
\]
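The modification in (4.1)-(4.2) follows the generic Bera and Yoon [1993] recipe: the score for the parameter under test is corrected by its projection on the score of the locally misspecified nuisance parameter, and the denominator is adjusted accordingly. A minimal sketch of that recipe (the numerical inputs are made-up stand-ins for the estimated scores and conditional information elements):

```python
def adjusted_lm(d_a, d_b, psi_aa, psi_bb, psi_ab):
    """Bera-Yoon adjusted LM statistic for parameter a in the presence of b:
    the score for a is corrected by its projection on the score for b."""
    num = (d_a - psi_ab / psi_bb * d_b) ** 2
    den = psi_aa - psi_ab ** 2 / psi_bb
    return num / den

# made-up stand-ins for d-hat(theta-hat)_rho, d-hat(theta-hat)_gamma and the psi-hat's
lm_rho = adjusted_lm(d_a=3.0, d_b=1.0, psi_aa=10.0, psi_bb=8.0, psi_ab=2.0)
reject = lm_rho > 3.84   # chi-squared(1) critical value at the 5% level
```

Under H0 the statistic is compared with a $\chi^2_1$ critical value, 3.84 at the 5% level.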


5 Conclusion

In this paper we have adhered to the approach of Hsiao et al. [2002] of extending the first-differenced dynamic panel data model with a predictive model for the initial observations. We have constructed two fixed-T LM tests that are robust to misspecification regarding the source(s) of time persistence. If applied simultaneously, these LM tests are capable of pointing out the type of time persistence at hand, if any.

References

L. Anselin, A. Bera, R. Florax, and M. Yoon. Simple diagnostic tests for spatial dependence. Regional Science and Urban Economics, 26:77–104, 1996.

A. Bera and M. Yoon. Specification testing with locally misspecified alternatives. Econometric Theory, 9:649–658, 1993.

T. Breusch and A. Pagan. The Lagrange multiplier test and its applications to model specification in econometrics. Review of Economic Studies, 47:239–253, 1980.

C. Hsiao, M. Pesaran, and A. Tahmiscioglu. Maximum likelihood estimation of fixed effects dynamic panel data models covering short time periods. Journal of Econometrics, 109:107–150, 2002.

H. Kruiniger. Maximum likelihood and GMM estimation of dynamic panel data models with fixed effects. Working paper, Queen Mary University of London, 2000.

Z. Yang. Initial-Condition Free Estimation of Fixed Effects Dynamic Panel Data Models. Working paper, Singapore Management University, 2014.


Appendices

A Review of proof of consistency by Hsiao et al. [2002]

In their paper, Hsiao et al. [2002] derive a minimum distance estimator based on the likelihood of the first-difference transform of a dynamic panel data model, utilizing assumptions regarding the initial observations, and prove that it is consistent. Here, we will elaborate on their proof based on the method proposed in footnote 13 of their paper. First, we will define the model and associated assumptions. Consider

\[
y_{it} = \alpha_i + \gamma y_{i,t-1} + u_{it}, \qquad i = 1, \ldots, N, \; t = 2, \ldots, T,
\]
where $y_{it}$ is some dependent variable of interest, $y_{i,t-1}$ is the dependent variable lagged once, $\alpha_i$ is the individual-specific fixed effect, the disturbances $u_{it}$ are normally distributed with mean zero and variance $\sigma_u^2$, and $\gamma$ and $\sigma_u^2$ are parameters to be estimated. In this simplified model without explanatory variables we assume that the initial observation $y_{i1}$ is observable. The individual effect induces the incidental parameter problem, which in turn will lead to inconsistent results when maximizing the associated likelihood. Therefore, consider the first-difference transform of the model:
\[
\Delta y_{it} = \gamma\Delta y_{i,t-1} + \Delta u_{it},
\]
where $\Delta y_{it} = y_{it} - y_{i,t-1}$ and similar transformations hold for $\Delta y_{i,t-1}$ and $\Delta u_{it}$. This model is well defined for $t \geq 3$ but not for $t = 2$, since $\Delta y_{i1}$ requires knowledge of $y_{i0}$, which is unavailable. We assume that the process has been going on for a long time, reaching stationarity, and by repeated substitution we find
\[
\Delta y_{i2} = \sum_{j=0}^{\infty}\gamma^j\Delta u_{i,2-j}.
\]

Now define $\Delta u_i = (\Delta y_{i2}, \Delta u_{i3}, \ldots, \Delta u_{iT})'$. We find $E[\Delta u_i\Delta u_i'] = \Omega = \sigma_u^2\,\tilde\Omega$, where
\[
\tilde\Omega = \begin{pmatrix}
\frac{2}{1+\gamma} & -1 & 0 & \cdots & 0 \\
-1 & 2 & -1 & \cdots & 0 \\
0 & -1 & 2 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 2
\end{pmatrix}.
\]
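Under stationarity, the MA representation of $\Delta y_{i2}$ pins down the $(1,1)$ element of $E[\Delta u_i\Delta u_i']$: $\mathrm{Var}(\Delta y_{i2}) = 2\sigma_u^2/(1+\gamma)$. A quick numerical check (truncating the infinite sum; $\sigma_u^2 = 1$ and the value of $\gamma$ are assumptions for the sketch):

```python
import numpy as np

# Var(Delta y_i2) from the MA representation sum_j gamma^j Delta u_{i,2-j}, using
# E[Delta u_t Delta u_s] = sigma_u^2 (2 if t = s, -1 if |t - s| = 1, 0 otherwise)
gamma, J = 0.4, 400   # J terms approximate the infinite sum; sigma_u^2 = 1
g = gamma ** np.arange(J)
var = 2 * (g @ g) - 2 * gamma * (g[:-1] @ g[:-1])
assert np.isclose(var, 2 / (1 + gamma))   # the (1,1) element of E[Delta u_i Delta u_i']
```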

We will evaluate the following expectation:
\[
E\!\left[\sum_{i=1}^{N}\left(\frac{\partial\Delta u_i}{\partial\gamma}\right)'\Omega^{-1}\Delta u_i\right]
= \operatorname{tr}\!\left\{\sum_{i=1}^{N}\Omega^{-1}E\!\left[\Delta u_i\left(\frac{\partial\Delta u_i}{\partial\gamma}\right)'\right]\right\}.
\]

Therefore, we note that
\[
E\!\left[\Delta u_i\left(\frac{\partial\Delta u_i}{\partial\gamma}\right)'\right]
= E\!\left[\begin{pmatrix}\Delta y_{i2}\\ \Delta y_{i3}-\gamma\Delta y_{i2}\\ \vdots\\ \Delta y_{iT}-\gamma\Delta y_{i,T-1}\end{pmatrix}\begin{pmatrix}0 & \Delta y_{i2} & \cdots & \Delta y_{i,T-1}\end{pmatrix}\right]
= \sigma_u^2\begin{pmatrix}
0 & \frac{2}{1+\gamma} & \frac{\gamma-1}{1+\gamma} & \cdots & \frac{(\gamma-1)\gamma^{T-4}}{1+\gamma}\\
0 & -1 & 2-\gamma & \cdots & -(1-\gamma)^2\gamma^{T-5}\\
0 & 0 & -1 & \ddots & \vdots\\
\vdots & \vdots & \vdots & \ddots & 2-\gamma\\
0 & 0 & 0 & \cdots & -1
\end{pmatrix}.
\]

Now if we set $\gamma = 0$, we find
\[
E\!\left[\sum_{i=1}^{N}\left(\frac{\partial\Delta u_i}{\partial\gamma}\right)'\Omega^{-1}\Delta u_i\right] = N\cdot\operatorname{tr}\left\{\tilde\Omega^{-1}B\right\} = 0,
\]
where $B$ is the $(T-1)\times(T-1)$ matrix obtained by evaluating the matrix above at $\gamma = 0$ and dividing by $\sigma_u^2$ (recall $\Omega = \sigma_u^2\tilde\Omega$): its first row is $(0, 2, -1, 0, \ldots, 0)$ and its $s$th row, for $s \geq 2$, has $-1$ in position $s$, $2$ in position $s+1$ and $-1$ in position $s+2$, with zeros elsewhere. Consequently, the claim made in footnote 13 of Hsiao et al. [2002] is justified. If we, however, treat $\Delta y_{i2}$ as exogenous and start analyzing the model from $t = 3$ onwards, we essentially truncate $\tilde\Omega$ and $E[\Delta u_i(\partial\Delta u_i/\partial\gamma)']$ by omitting their first row and column. As a result, we find
\[
E\!\left[\sum_{i=1}^{N}\left(\frac{\partial\Delta u_i}{\partial\gamma}\right)'\Omega^{-1}\Delta u_i\right] = -N\,\frac{T-2}{T-1},
\]
which actually corresponds to the observed bias in Yang [2014].
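Both traces can be verified numerically (a NumPy sketch with an illustrative $T = 7$; $B$ is the $\gamma = 0$ value of $E[\Delta u_i(\partial\Delta u_i/\partial\gamma)']$ divided by $\sigma_u^2$):

```python
import numpy as np

T = 7
n = T - 1
# Omega-tilde at gamma = 0: tridiagonal with 2 on the diagonal and -1 beside it
Om = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
# B: the gamma = 0 value of E[Delta u_i (d Delta u_i / d gamma)'] / sigma_u^2
B = np.zeros((n, n))
B[0, 1], B[0, 2] = 2.0, -1.0
for s in range(1, n):
    B[s, s] = -1.0
    if s + 1 < n:
        B[s, s + 1] = 2.0
    if s + 2 < n:
        B[s, s + 2] = -1.0

assert np.isclose(np.trace(np.linalg.solve(Om, B)), 0.0)   # the footnote-13 claim

# treating Delta y_i2 as exogenous drops the first row and column of both matrices
tr_trunc = np.trace(np.linalg.solve(Om[1:, 1:], B[1:, 1:]))
assert np.isclose(tr_trunc, -(T - 2) / (T - 1))            # the bias noted in Yang [2014]
```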
