Master’s Thesis
Misspecification Tests in Dynamic Panel
Data Models and Improvements through
Bootstrapping
Tim Gabel
Student number: 6149812
Date of final version: September 8, 2015
Master’s programme:
Econometrics
Specialisation: Econometrics
Supervisor: Prof. dr. J.F. Kiviet
Second reader: Dr. M. Pleus
Statement of Originality
This document is written by Student Tim Gabel who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.
Abstract
This paper investigates the finite sample properties of several misspecification tests for linear dynamic panel data models after generalised method of moments estimation. The tests examined are the overidentifying restrictions test and the $m_1$ and $m_2$ tests of Arellano and Bond (1991). Several model designs are considered to examine the behaviour in a broad landscape. Overall the $m_1$ and $m_2$ tests perform well, but the overidentifying restrictions test shows poor finite sample behaviour in almost all cases considered. A bootstrap method for panel data is used to mitigate these shortcomings. Especially for the overidentifying restrictions test the bootstrap provides some promising finite sample improvements.
1 Introduction
If it were possible to predict the future or, put more mildly, to make an educated guess about present or future events, it would be through the use of data. The increase in the collection of this form of information over the last decades has stimulated the use of new forms of quantitative analysis. Panel data analysis in particular has gained popularity over the last 40 years, due to the large quantities of data that have become available. The major advantage of panel data compared to single indexed variables is that panels provide an opportunity to deal with unobserved heterogeneity that is correlated with the explanatory variables. One of the most prominent methods to process this information is the Generalised Method of Moments (GMM) estimation technique. This method has been widely used in economic analysis, for example of labour participation, cross-country growth convergence and government behaviour, among many others. The claimed flexibility, generality, robustness, efficiency and ease of use are the main features that give this technique its reputation.
The most commonly used implementations of the GMM techniques are the ones introduced by Arellano and Bond (1991). However, since these techniques first appeared, many practical problems have come to light. Examples are weakness of instruments, size distortion of the test statistics, vulnerability due to the abundance of internal instruments and insignificant improvements of the 2-step GMM estimator compared to the 1-step GMM estimator. In applied research it frequently happens that the significance and plausibility of coefficient estimates are taken as a benchmark to select models and techniques. However, using invalid instruments or imposing wrong coefficient restrictions will, in most cases, lead to standard errors that are relatively small. These standard errors then give misleading information about the precision of the estimators, and hence seriously biased estimators are often wrongly assumed to be precise.
It is a well known fact that the validity of GMM estimation results depends strongly on the validity of the moment conditions. In a more general context, the estimation results are only reliable if the model is correctly specified. To check whether this is the case, several tests have been constructed. A test for general misspecification of the model is proposed by Sargan (1958) and Hansen (1982); this test checks the validity of the overidentifying restrictions. Another common phenomenon that points to model misspecification is serial correlation in the error terms. Error serial correlation will invalidate subsets of the moment conditions and will bias the coefficient estimates. Therefore it is important to identify serial correlation in the idiosyncratic error terms. Arellano and Bond (1991) proposed first and second order serial correlation tests of the first differenced errors, called the $m_1$ and $m_2$ test, respectively. An alternative introduced by the same authors is the Sargan's difference test, which examines nested hypotheses concerning serial correlation in a sequential way. The Sargan's difference test is a more refined version of the overidentifying restrictions test by Sargan (1958) and Hansen (1982). As indicated earlier, these tests have become part of the standard tools in applied research.
Under misspecifications that cause serial correlation in the error terms, it is expected that the probability of rejecting the null hypothesis of no misspecification tends to one as the sample size goes to infinity. However, this does not always happen: some studies show relatively poor finite sample behaviour. Bowsher (2002) and Windmeijer (2005) show that the overidentifying restrictions test does not reject the null often enough; only when the number of time series observations is very low does this test perform well. This means the power of the overidentifying restrictions test can be very low in finite samples. Bowsher (2002) makes use of homoskedastic models. Windmeijer (2005) implements heteroskedasticity but does not include a lagged dependent variable. Furthermore, Yamagata (2008) introduces a variation of the $m_j$ test of Arellano (1993). This joint test, called the $m^2_{2,p}$ test, tests for second to $p$-th order error serial correlation and can serve as a general misspecification test. Yamagata (2008) compares the $m_2$ test, the Sargan's difference test and the $m^2_{2,p}$ test using a homoskedastic model and a simple heteroskedastic model, with several forms of serial correlation. Arellano and Bond (1991) perform some Monte Carlo simulations to examine the finite sample properties of the $m_1$ and $m_2$ tests using a model with heteroskedasticity and an exogenous regressor. However, these simulations are on too small a scale to produce precise results. From this we may conclude that knowledge about the performance of serial correlation tests, especially the $m_1$ and $m_2$ tests, when they have to deal with heteroskedasticity, skewness and serial correlation, is still scarce.
These inference methods are all based on asymptotic approximations. The use of the bootstrap, to produce bootstrap critical values, is an alternative that can be used in cases where the standard asymptotic critical values do not work well. Hall and Horowitz (1996) introduce a block bootstrap providing asymptotic refinement for dependent data. Gonçalves and Kilian (2004) and Godfrey (2005) both apply the Wild bootstrap to obtain improved inference in time series, whereas Cameron, Gelbach and Miller (2008) use an adjusted version of the Wild bootstrap for clustered data. Kapetanios (2008) compares different resampling schemes for bootstrapping in a panel data setting. However, there is still little literature on bootstrapping misspecification tests in a panel data setting.
Therefore this paper examines the behaviour of the $m_1$, $m_2$ and overidentifying restrictions tests in a Monte Carlo simulation study which uses a data generating process (DGP) that includes a lagged dependent variable and an extra explanatory variable. Heteroskedastic, skewed and serially correlated error terms are introduced to investigate the robustness of the tests and their behaviour under serial correlation, respectively. Furthermore, it is examined how the tests behave when an extra lagged explanatory variable in the DGP is not taken into account in the estimation procedure. A bootstrap method for panel data GMM is discussed and examined for its potential to improve the tests in circumstances where the standard test procedures fail to produce accurate inference.
The $m_1$ and $m_2$ tests perform satisfactorily in all cases, except for the one with a combination of heteroskedasticity and skewness in the error term distribution. The overidentifying restrictions test shows poor finite sample behaviour, especially when $N$ is relatively small and $T$ relatively large. The bootstrap method improves the overidentifying restrictions test substantially in the small $N$, large $T$ case. For the $m_2$ test the improvements are less substantial.
This paper is structured as follows. In Section 2 the basic GMM results are derived; this introduces the notation used in this paper and supports the later derivations. Furthermore, the derivations of the relevant test statistics are given and the bootstrap technique is described. Section 3 introduces the model and the assumptions made. In Section 4 the Monte Carlo simulation is explained. Section 5 provides the results of the Monte Carlo simulation and Section 6 summarizes the paper and concludes.
2 Panel GMM estimator and test statistics
In this section the basic results for Instrumental Variable (IV) and GMM estimation techniques in panel data are discussed. This is done to clarify the notation used and to introduce the estimation procedure and inference techniques. First the general model and its estimators are introduced. Thereafter, the overidentifying restrictions test and the $m_1$ and $m_2$ tests are described. Throughout this paper linear models are considered.
2.1 Basic model and the resulting estimators
We start with defining the basic model and the orthogonality conditions used for the IV and GMM procedures. One of the main advantages of panel data is the opportunity to deal with time-constant unobserved heterogeneity. This is included in the model through the term $\alpha_i$. The basic panel data model is
$$y_{it} = \alpha_i + x_{it}'\beta + u_{it}, \qquad (1)$$
with $i = 1, \ldots, N$, $t = 1, \ldots, T$, where $x_{it}$ is a $K \times 1$ vector of regressors, $\beta$ is a $K \times 1$ vector of coefficients, and $y_{it}$ and $u_{it}$ are scalars. To deal with the unobserved heterogeneity $\alpha_i$ in models where no restrictions are imposed on its correlation with the regressors, the model is transformed by taking first differences, giving
$$\Delta y_{it} = \Delta x_{it}'\beta + \Delta u_{it}, \qquad (2)$$
with $i = 1, \ldots, N$, $t = 2, \ldots, T$, $\Delta y_{it} = y_{it} - y_{i,t-1}$, $\Delta x_{it} = x_{it} - x_{i,t-1}$ and $\Delta u_{it} = u_{it} - u_{i,t-1}$. By stacking the remaining $T-1$ observations the model can be written as
$$\Delta y_i = \Delta X_i\beta + \Delta u_i \;\Leftrightarrow\; \dot{y}_i = \dot{X}_i\beta + \dot{u}_i, \qquad (3)$$
where $\dot{X}_i$ is a $(T-1) \times K$ matrix of first differenced regressors and $\dot{y}_i$ and $\dot{u}_i$ are $(T-1) \times 1$ vectors consisting of first differenced terms,
$$\dot{y}_i = \begin{pmatrix} \dot{y}_{i2} \\ \vdots \\ \dot{y}_{iT} \end{pmatrix}, \qquad \dot{u}_i = \begin{pmatrix} \dot{u}_{i2} \\ \vdots \\ \dot{u}_{iT} \end{pmatrix}, \qquad \dot{X}_i = \begin{pmatrix} \dot{x}_{i2}' \\ \vdots \\ \dot{x}_{iT}' \end{pmatrix}.$$
Using this notation the moment conditions to be exploited can be expressed as
$$E[Z_i'(\dot{y}_i - \dot{X}_i\beta)] = 0, \qquad (4)$$
where $Z_i$ is a $(T-1) \times L$ matrix of instruments, with $L$ the total number of instruments available. The validity of the individual instruments depends on the correlation between the explanatory variables and $u_{i,t}$ and $u_{i,t-1}$. This will be discussed in more detail in Section 3.
In the special case $L = K$ the sample moments
$$\frac{1}{N}\sum_{i=1}^N Z_i'(\dot{y}_i - \dot{X}_i\hat\beta) = 0 \qquad (5)$$
can be solved, giving the panel IV estimator
$$\hat\beta_{PIV} = \left(\sum_{i=1}^N Z_i'\dot{X}_i\right)^{-1}\left(\sum_{i=1}^N Z_i'\dot{y}_i\right). \qquad (6)$$
In the more general case, when $L > K$, finding an estimator comes down to minimizing the following quadratic form with respect to $\beta$,
$$Q_N(\beta) = \left(N^{-1}\sum_{i=1}^N (\dot{y}_i - \dot{X}_i\beta)'Z_i\right) W_N \left(N^{-1}\sum_{i=1}^N Z_i'(\dot{y}_i - \dot{X}_i\beta)\right).$$
Here $W_N$ is an $L \times L$ weighting matrix which is discussed in more detail later on. After some algebra, minimization yields the panel GMM estimator
$$\hat\beta_{PGMM} = \left[\left(\sum_{i=1}^N \dot{X}_i'Z_i\right) W_N \left(\sum_{i=1}^N Z_i'\dot{X}_i\right)\right]^{-1}\left(\sum_{i=1}^N \dot{X}_i'Z_i\right) W_N \left(\sum_{i=1}^N Z_i'\dot{y}_i\right). \qquad (7)$$
Assumption (4) is essential for the consistency of this estimator. The exact form of the instrument matrix is discussed in detail in Section 3.
It is convenient to rewrite (7) in a more compact form by stacking the $N$ blocks. This results in
$$\hat\beta_{PGMM} = \left(\dot{X}'ZW_NZ'\dot{X}\right)^{-1}\dot{X}'ZW_NZ'\dot{y}, \qquad (8)$$
with $\dot{y} = (\dot{y}_1' \ldots \dot{y}_N')'$, $\dot{X} = (\dot{X}_1' \ldots \dot{X}_N')'$ and $Z = (Z_1' \ldots Z_N')'$. Assuming the Central Limit Theorem (CLT) for the sample moment conditions holds,
$$\frac{1}{\sqrt{N}}\sum_{i=1}^N Z_i'\dot{u}_i \xrightarrow{d} N(0, S).$$
To make use of this CLT, independence over $i$ is assumed. By combining this CLT and (7) it can be shown that $\hat\beta_{PGMM}$ is consistent and asymptotically normally distributed. A consistent estimator of the asymptotic variance matrix of $\hat\beta_{PGMM}$ is
$$\hat{V}[\hat\beta_{PGMM}] = (\dot{X}'ZW_NZ'\dot{X})^{-1}\dot{X}'ZW_N(N\hat{S})W_NZ'\dot{X}(\dot{X}'ZW_NZ'\dot{X})^{-1}. \qquad (9)$$
The asymptotic variance matrix of the moment conditions, $S$, can be consistently estimated by
$$\hat{S} = \frac{1}{N}\sum_{i=1}^N Z_i'\hat{\dot{u}}_i\hat{\dot{u}}_i'Z_i, \qquad (10)$$
where $\hat{\dot{u}}_i = \dot{y}_i - \dot{X}_i\hat\beta$ is a $(T-1) \times 1$ first differenced residual vector. Now two cases can be distinguished. In both cases the asymptotically optimal choice for $W_N$ is an expression whose probability limit is proportional to the inverse of the asymptotic variance matrix $S$. First, consider the case where $u_i$ is homoskedastic, $\dot{u}_i \sim N(0, \sigma_u^2 H)$, where $H$ is defined in Section 3. Then the optimal choice is proportional to the inverse of $\sum_{i=1}^N Z_i'HZ_i$. This results in the panel GMM estimator
$$\hat\beta_{PGMM} = \left(\dot{X}'Z(Z'HZ)^{-1}Z'\dot{X}\right)^{-1}\dot{X}'Z(Z'HZ)^{-1}Z'\dot{y}. \qquad (11)$$
In the second case we have $\dot{u}_i \sim N(0, \dot\Omega_i)$, where $\dot\Omega_i$ is a $(T-1) \times (T-1)$ band matrix, and the errors may now be heteroskedastic. The result is the panel GMM estimator
$$\hat\beta_{PGMM} = \left(\dot{X}'ZS^{-1}Z'\dot{X}\right)^{-1}\dot{X}'ZS^{-1}Z'\dot{y}. \qquad (12)$$
In practice $\dot\Omega$, and so $S^{-1}$, is unknown, hence it is not possible to use the above estimator. To overcome this problem the two-step procedure can be used. In the first round use $W_N^{(0)}$ to obtain $\hat\beta^{(1)}_{PGMM}$. This estimator is not optimal if $W_N^{(0)} \neq S^{-1}$, but under the assumptions made it is consistent. Now form the consistent residuals
$$\hat{\dot{u}}^{(1)} = \dot{y} - \dot{X}\hat\beta^{(1)}_{PGMM}, \qquad (13)$$
which make it possible to obtain an expression for $\hat{S}^{-1}$. Because of consistency we now have $\operatorname{plim} \hat{S}^{-1} = S^{-1}$. Using $W_N^{(1)} = \hat{S}^{-1}$ and substituting this into (12) gives the two-step GMM estimator
$$\hat\beta^{(2)}_{PGMM} = \left(\dot{X}'ZW_N^{(1)}Z'\dot{X}\right)^{-1}\dot{X}'ZW_N^{(1)}Z'\dot{y}. \qquad (14)$$
This estimator will converge to estimator (12), thus it is also asymptotically optimal.
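The one-step/two-step procedure above can be condensed into a short numerical sketch. The following is a minimal NumPy illustration, not code from the thesis; the function names and the list-of-per-unit-arrays data layout are assumptions made for exposition, and `W0` plays the role of $W_N^{(0)}$.

```python
import numpy as np

def panel_gmm(Xd_list, yd_list, Z_list, W):
    """Generic panel GMM estimator (8): (Xd'Z W Z'Xd)^{-1} Xd'Z W Z'yd,
    built from per-unit blocks Xd_i, yd_i, Z_i."""
    XZ = sum(Xd.T @ Z for Xd, Z in zip(Xd_list, Z_list))   # sum_i Xd_i' Z_i
    Zy = sum(Z.T @ yd for yd, Z in zip(yd_list, Z_list))   # sum_i Z_i' yd_i
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ Zy)

def two_step_gmm(Xd_list, yd_list, Z_list, W0):
    """Two-step GMM: first step with W0, second with W1 = S_hat^{-1}, eq. (14)."""
    b1 = panel_gmm(Xd_list, yd_list, Z_list, W0)
    N = len(Z_list)
    # S_hat from first-step residuals, equation (10)
    S = sum(Z.T @ np.outer(yd - Xd @ b1, yd - Xd @ b1) @ Z
            for Xd, yd, Z in zip(Xd_list, yd_list, Z_list)) / N
    return panel_gmm(Xd_list, yd_list, Z_list, np.linalg.inv(S))
```

Note that when $L = K$ (exact identification) the estimator does not depend on the weighting matrix, so the one-step and two-step results coincide; with $L > K$ they generally differ.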
2.2 Test procedures for misspecification
The first test discussed is the Sargan-Hansen test of overidentifying restrictions, called OIR from now on. The OIR test can be used for over-identified models with more moment conditions ($L$) than coefficients ($K$). The test, originally developed by Sargan (1958) and Hansen (1982), is based on the asymptotic normality of the moment conditions under the null hypothesis of no misspecification. In other words, $E[Z_i'\dot{u}_i] = 0$ should hold. This assumption can be used to construct the test statistic
$$OIR = \left(\sum_{i=1}^N \hat{\dot{u}}_i'Z_i\right)(N\hat{S})^{-1}\left(\sum_{i=1}^N Z_i'\hat{\dot{u}}_i\right), \qquad (15)$$
where $\hat{\dot{u}}_i = \dot{y}_i - \dot{X}_i\hat\beta^{(2)}_{PGMM}$ and $\hat{S}$ is given in (10). Heteroskedasticity and correlation of $u_i$ over $t$ for a given $i$ are allowed, provided the moment conditions still hold.

Using the result of asymptotic normality, it is relatively straightforward to derive that the OIR statistic is $\chi^2(L-K)$ distributed under the null. The instrument matrix $Z$ does not need to be the optimal set of instruments; here $L$ just refers to the number of columns in $Z$, provided $L > K$. This test can be computed when the two-step estimator is used. If the one-step GMM estimator is used, some modifications are needed.
Even when the test has good size control, insignificant test statistics are certainly not a guarantee that the population moment conditions are valid. As explained earlier, the test checks only $L-K$ moment conditions, upon assuming that $K$ instruments are valid anyway. In case of rejection, there is no indication which instruments cause the rejection (Arellano and Bond (1991) and Cameron and Trivedi (2005)).
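As a minimal illustration of (15) — again a NumPy sketch with hypothetical names, not the thesis's own code — the statistic only needs the per-unit residual vectors and instrument blocks; the resulting value would be compared with the $\chi^2(L-K)$ critical value.

```python
import numpy as np

def oir_stat(resid_list, Z_list):
    """Sargan-Hansen OIR statistic (15) from two-step residuals and instruments.
    resid_list[i] is the (T-1)-vector of residuals of unit i, Z_list[i] its Z_i."""
    g = sum(Z.T @ u for u, Z in zip(resid_list, Z_list))                    # sum_i Z_i' u_i
    NS = sum(Z.T @ np.outer(u, u) @ Z for u, Z in zip(resid_list, Z_list))  # N * S_hat
    return float(g @ np.linalg.solve(NS, g))
```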
The second test procedure is the $m_j$ test, with $j = 1, 2$, of Arellano and Bond (1991). Details of the following derivation can be found in Arellano (2003). The $m_j$ statistics are tests of significance of the average $j$-th order autocovariance $r_j$,
$$r_j = \frac{1}{T-1-j}\sum_{t=2+j}^T r_{tj}, \qquad (16)$$
with $r_{tj} = E[\dot{u}_{it}\dot{u}_{i(t-j)}]$. The sum starts at $t = 2+j$ because the model is in first differences, so that $\dot{u}_{i2}$ is the first available residual. It is important to note that if a lagged dependent variable is included, as is done in Section 3, the starting point becomes $t = 3+j$ and hence the factor $1/(T-2-j)$ should be used. The sample counterpart of $r_{tj}$, based on the first-differenced residuals $\hat{\dot{u}}_{it}$, is
$$\hat{r}_{tj} = \frac{1}{N}\sum_{i=1}^N \hat{\dot{u}}_{it}\hat{\dot{u}}_{i(t-j)}, \qquad (17)$$
where $\hat{\dot{u}}_{it} = \dot{y}_{it} - \dot{x}_{it}'\hat\beta$, and $\hat{r}_j$ denotes the corresponding average of the $\hat{r}_{tj}$. The test statistic is given by
$$m_j = \frac{\hat{r}_j}{SE(\hat{r}_j)}. \qquad (18)$$
To form an expression for $SE(\hat{r}_j)$ we start with writing the first-differenced residuals as $\hat{\dot{u}}_{it} = \dot{u}_{it} - \dot{x}_{it}'(\hat\beta - \beta)$. Here $\dot{x}_{it}$ and $\hat\beta - \beta$ denote the vectors of right-hand-side variables and parameter estimation errors, respectively. This in turn can be rewritten as $\hat\beta - \beta = P_N N^{-1}\sum_{i=1}^N Z_i'\dot{u}_i$, which gives an implicit expression for $P_N$. Some algebra results in the explicit expression
$$P_N = N\left(\sum_{i=1}^N \dot{X}_i'Z_i\, W_N \sum_{i=1}^N Z_i'\dot{X}_i\right)^{-1}\sum_{i=1}^N \dot{X}_i'Z_i\, W_N. \qquad (19)$$
The other building blocks are
$$g_N' = \frac{1}{N(T-1-j)}\sum_{i=1}^N\sum_{t=2+j}^T \left(\dot{u}_{it}\dot{x}_{i(t-j)}' + \dot{u}_{i(t-j)}\dot{x}_{it}'\right) \qquad (20)$$
and
$$\zeta_{ji} = \frac{1}{T-1-j}\sum_{t=2+j}^T \dot{u}_{it}\dot{u}_{i(t-j)}. \qquad (21)$$
Under the null hypothesis, $H_0: r_j = 0$, the estimated residual autocovariance may now be written as
$$\sqrt{N}\,\hat{r}_j = \begin{pmatrix} 1 & -g_N'P_N \end{pmatrix}\frac{1}{\sqrt{N}}\sum_{i=1}^N \begin{pmatrix} \zeta_{ji} \\ Z_i'\dot{u}_i \end{pmatrix} + o_p(1). \qquad (22)$$
Using a standard CLT and the information above we may conclude that under the null $r_j = 0$,
$$\hat{V}(\hat{r}_j)^{-\frac12}\sqrt{N}\,\hat{r}_j \xrightarrow{d} N(0, 1),$$
where
$$\hat{V}(\hat{r}_j) = \begin{pmatrix} 1 & -g_N'P_N \end{pmatrix}\frac{1}{N}\sum_{i=1}^N \begin{pmatrix} \hat\zeta_{ji}^2 & \hat\zeta_{ji}\hat{\dot{u}}_i'Z_i \\ Z_i'\hat{\dot{u}}_i\hat\zeta_{ji} & Z_i'\hat{\dot{u}}_i\hat{\dot{u}}_i'Z_i \end{pmatrix}\begin{pmatrix} 1 \\ -P_N'g_N \end{pmatrix}. \qquad (23)$$
See Arellano and Bond (1991) for a proof of the asymptotic normality results. This provides an expression for $SE(\hat{r}_j) = \sqrt{\hat{V}(\hat{r}_j)}$, and because of the asymptotic normal distribution it is straightforward that $m_j \xrightarrow{d} N(0, 1)$. The $m_j$ criterion is rather flexible, in that it can be defined in terms of any consistent GMM estimator, not necessarily an efficient one. However, the asymptotic power of the $m_j$ test will depend on the efficiency of the estimators used (Arellano (2003) and Arellano and Bond (1991)).
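The numerator $\hat{r}_j$ of (18) is straightforward to compute. The sketch below is a hypothetical Python helper, not thesis code; it assumes the residual columns correspond to $t = 3, \ldots, T$ as in Section 3, so that the average uses the factor $1/(T-2-j)$. It delivers only the numerator; the denominator $SE(\hat{r}_j)$ requires the full expression (23).

```python
import numpy as np

def r_hat(resid, j):
    """Average j-th order residual autocovariance, eqs. (16)-(17).
    resid : (N, P) matrix of first-differenced residuals, columns t = 3,...,T
    (so P = T-2 and the average runs over P-j terms)."""
    P = resid.shape[1]
    r_tj = (resid[:, j:] * resid[:, :P - j]).mean(axis=0)  # r_hat_{tj}, eq. (17)
    return float(r_tj.mean())                              # average over t
```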
These tests are compared with the Wald test, which tests whether the coefficients are equal to a certain value. The Wald test statistic is
$$W = (R\hat\beta_{PGMM} - r)'\left(R\hat{V}[\hat\beta_{PGMM}]R'\right)^{-1}(R\hat\beta_{PGMM} - r), \qquad (24)$$
where $R$ is an $h \times K$ matrix and $r$ an $h \times 1$ vector of constants, $h$ is the number of restrictions tested and $\hat{V}[\hat\beta_{PGMM}]$ is given in (9). Under the null, $W \xrightarrow{d} \chi^2(h)$.
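Statistic (24) translates directly into code; the following one-liner (an illustrative helper with hypothetical names, not from the thesis) takes the coefficient estimate, its estimated variance matrix from (9) and the restrictions $(R, r)$.

```python
import numpy as np

def wald_stat(beta_hat, V_hat, R, r):
    """Wald statistic (24) for H0: R beta = r; chi-squared with h df under H0."""
    d = R @ beta_hat - r                                  # deviation from H0
    return float(d @ np.linalg.solve(R @ V_hat @ R.T, d))
```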
2.3 Wild bootstrap
The bootstrap uses the estimation sample as if it were the population. Repeatedly resampling the data results in several bootstrap samples, and in each bootstrap round test statistics based on these bootstrap samples can be computed. This process yields bootstrap empirical distributions of the test statistics. Hence, bootstrap estimates of the $\alpha$-level critical values of the $m_1$, $m_2$ and OIR tests are the $1-\alpha$ quantiles of the bootstrap empirical distributions of the tests.

The bootstrap technique considered is a Wild bootstrap for panel data. Even when the residuals are heteroskedastic and correlated over $t$ for given $i$, this bootstrap should provide an asymptotic refinement in a linear model if the panel is short (Cameron and Trivedi (2005)). Let
$$\xi_{it} = \begin{cases} -(\sqrt{5}-1)/2 & \text{with probability } (\sqrt{5}+1)/(2\sqrt{5}), \\ (\sqrt{5}+1)/2 & \text{otherwise,} \end{cases}$$
hence $E(\xi_{it}) = 0$, $E(\xi_{it}^2) = 1$ and $E(\xi_{it}^3) = 1$. Essential for the validity of the bootstrap procedure is that the assumptions $E(\xi_{it}) = 0$ and $E(\xi_{it}^2) = 1$ hold. The third condition, concerning the third moment of the distribution of $\xi_{it}$, is suggested by Liu (1988), where it is shown that if $E(\xi_{it}^3) = 1$, the wild bootstrap enjoys second-order properties and the first three moments of the test statistic used are estimated correctly to $O(T^{-1})$. The following steps describe the Wild bootstrap procedure.
(i) Use the GMM residuals $\hat{\dot{u}}_{it}$ and generate
$$\dot{u}^*_{it} = \xi_{it}\hat{\dot{u}}_{it}, \qquad (25)$$
where $i = 1, \ldots, N$ and $t = 3, \ldots, T$.
(ii) Using (25), pseudo data are generated by
$$\dot{y}^*_{it} = \dot{x}_{it}'\hat\beta + \dot{u}^*_{it}, \qquad (26)$$
where the actual estimation sample starting values are used for the bootstrap sample starting values (Li and Maddala (1996)). The original $x_{it}$ are used in this procedure. One assumption used in (26) is that the $x_{it}$ are strictly exogenous. This is taken into account and is explained in Section 4.
(iii) By making use of the bootstrap sample the regression model is estimated and the associated values of the test statistics $m_1^*$, $m_2^*$ and $OIR^*$ are calculated. The bootstrap OIR becomes
$$OIR^* = \left(\sum_{i=1}^N \hat{\dot{u}}_i^{*\prime}Z_i^*\right)(N\hat{S}^*)^{-1}\left(\sum_{i=1}^N Z_i^{*\prime}\hat{\dot{u}}_i^*\right), \qquad (27)$$
where $\hat{\dot{u}}_i^* = \dot{y}_i^* - \dot{X}_i\hat\beta^*_{PGMM}$, $\hat{S}^* = N^{-1}\sum_{i=1}^N Z_i^{*\prime}\hat{\dot{u}}_i^*\hat{\dot{u}}_i^{*\prime}Z_i^*$ and $Z_i^*$ is the instrument matrix created using the bootstrap sample. The bootstrap $m_j$ test becomes
$$m_j^* = \frac{\hat{r}_j^*}{SE(\hat{r}_j^*)}, \qquad (28)$$
where $\hat{r}_j^*$ is the analogue of $\hat{r}_j$ computed from the bootstrap residuals given above.
(iv) Now repeat (i), (ii) and (iii) $B$ times, such that $B$ values of the test statistics are obtained. These values can be used to acquire the $1-\alpha$ quantiles of the bootstrap empirical distributions of the tests, which are the bootstrap estimates of the $\alpha$-level critical values, as explained previously (Godfrey and Tremayne (2005)).
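The four steps above can be sketched as one loop. The code below is a minimal NumPy illustration with hypothetical names (it is not the thesis's implementation); `stat_fn` stands in for whichever of the three test statistics is being bootstrapped, and the regressors are kept fixed, reflecting the strict exogeneity assumed in step (ii).

```python
import numpy as np

# Two-point distribution of Liu (1988) used for the wild bootstrap weights:
A = -(np.sqrt(5) - 1) / 2                 # drawn with probability P
B = (np.sqrt(5) + 1) / 2                  # drawn with probability 1 - P
P = (np.sqrt(5) + 1) / (2 * np.sqrt(5))   # so that E(xi)=0, E(xi^2)=E(xi^3)=1

def wild_bootstrap_crit(resid, xd, theta_hat, stat_fn, B_reps=199, alpha=0.05, seed=0):
    """Bootstrap alpha-level critical value: the 1-alpha quantile of the
    statistic recomputed on B_reps wild-bootstrap samples (steps (i)-(iv)).

    resid : (N, T-2) first-differenced GMM residuals
    xd    : (N, T-2, K) first-differenced regressors, kept fixed
    stat_fn(yd_star, xd) recomputes the test statistic on the pseudo data.
    """
    rng = np.random.default_rng(seed)
    stats = np.empty(B_reps)
    for b in range(B_reps):
        xi = np.where(rng.random(resid.shape) < P, A, B)  # weights xi_it
        u_star = xi * resid                               # step (i), eq. (25)
        yd_star = xd @ theta_hat + u_star                 # step (ii), eq. (26)
        stats[b] = stat_fn(yd_star, xd)                   # step (iii)
    return np.quantile(stats, 1 - alpha)                  # step (iv)
```

The constants `A`, `B`, `P` satisfy the three moment conditions on $\xi_{it}$ exactly, which can be checked by direct computation.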
3 The model
3.1 Model and assumptions
To examine the behaviour of the OIR, $m_1$ and $m_2$ tests under different conditions, the following model is used:
$$y_{it} = \alpha_i + \gamma y_{i,t-1} + \beta x_{it} + u_{it}, \qquad (29)$$
with $i = 1, \ldots, N$, $t = 2, \ldots, T$, where the $\alpha_i$ are individual specific effects which are constant over time but differ across $i$, such that $\alpha_i \sim iid(0, \sigma_\alpha^2)$. Through this term unobserved heterogeneity can be captured. The $x_{it}$ are considered to be predetermined, and hence the current value and lagged values are uncorrelated with the current error term $u_{it}$. The same holds for the lagged dependent variable, $y_{i,t-1}$, which can also be classified as predetermined. Notice that $u_{it}$ is an error term which can take several forms, as explained in Section 4. The focus in this paper lies on micro panels: panels with relatively few time-series observations $T$ and a large number of cross-sectional observations $N$. Therefore, the asymptotic approximations will be for $N \to \infty$ and finite $T$. In order to proceed, the following assumptions are made.
Assumption 1. $\{y_{i1}, \ldots, y_{iT}, x_{i1}, \ldots, x_{iT}\}_{i=1}^N$ is a sequence of independently and identically distributed random variables.

Assumption 2. (i) $\{u_{it}\}_{t=1}^T$ is an independently distributed sequence, with mean zero and strictly positive variance $\sigma_u^2$. That is, for all $i, j$ and $t \neq s$,
$$E[u_{it}] = 0, \qquad E[u_{it}^2] = \sigma_u^2, \qquad E[u_{it}u_{js}] = 0.$$
(ii) The classification of the two regressors implies, for all $i, t, s$ and $l \geq 0$,
$$E[x_{it}u_{i,t+l}] = 0, \qquad E[y_{i,t-1}u_{i,t+l}] = 0.$$
(iii) For the coefficient of the lagged dependent variable it holds that $|\gamma| < 1$.
The first assumption is necessary to use the standard iid CLT of the previous section. Although a weaker form could be used, namely the "independently but not necessarily identically distributed" case, the stronger one is employed for ease of computation and exposition. The second assumption can be divided into three sub-assumptions: (i) says the error terms are cross-sectionally uncorrelated, but allows for heteroskedasticity and skewness; furthermore, this assumption is needed for the validity of the bootstrap; (ii) is needed in order to form the moment conditions; and (iii) makes sure that the $y_{it}$ process is stable.
We now write (29) in a form such that the techniques of the previous section can be applied. Combining the two explanatory variables into one $1 \times 2$ vector gives $\tilde{x}_{it} = (y_{i,t-1} \;\; x_{it})$. Define the $2 \times 1$ vector of coefficients $\theta = (\gamma \;\; \beta)'$. This results in the following model:
$$y_{it} = \tilde{x}_{it}\theta + \alpha_i + u_{it}. \qquad (30)$$
To get rid of the individual specific effects, first differences of (30) are taken. This results in
$$\dot{y}_{it} = \dot{\tilde{x}}_{it}\theta + \dot{u}_{it}, \qquad (31)$$
with $i = 1, \ldots, N$, $t = 3, \ldots, T$, $\dot{y}_{it} = y_{it} - y_{i,t-1}$, $\dot{\tilde{x}}_{it} = \tilde{x}_{it} - \tilde{x}_{i,t-1}$ and $\dot{u}_{it} = u_{it} - u_{i,t-1}$ as before.
Now stack the observations over $T$, which gives
$$Dy_i = D\tilde{X}_i\theta + Du_i \;\Leftrightarrow\; \dot{y}_i = \dot{\tilde{X}}_i\theta + \dot{u}_i, \qquad (32)$$
where $\dot{y}_i = (\dot{y}_{i3} \cdots \dot{y}_{iT})'$, $\dot{\tilde{X}}_i = (\dot{\tilde{x}}_{i3}' \cdots \dot{\tilde{x}}_{iT}')'$ and $\dot{u}_i = (\dot{u}_{i3} \cdots \dot{u}_{iT})'$. The first-difference matrix $D$, which here has dimension $(T-2) \times (T-1)$ and operates on the observations stacked over $t = 2, \ldots, T$, is of the form
$$D = \begin{pmatrix} -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & -1 & 1 \end{pmatrix}. \qquad (33)$$
The techniques from Section 2 can be applied to the model written as in the right-hand side of (32).
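The matrix $D$ of (33) is easy to build and to check numerically; the helper below is an illustrative NumPy sketch (hypothetical name, not thesis code) that maps the stacked levels for $t = 2, \ldots, T$ into the first differences for $t = 3, \ldots, T$.

```python
import numpy as np

def diff_matrix(T):
    """First-difference matrix D of (33): (T-2) x (T-1), mapping
    (y_{i2},...,y_{iT}) to (y_{i3}-y_{i2},...,y_{iT}-y_{i,T-1})."""
    D = np.zeros((T - 2, T - 1))
    for r in range(T - 2):
        D[r, r] = -1.0      # subtract the previous observation
        D[r, r + 1] = 1.0   # add the current observation
    return D
```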
3.2 The instrument matrix
Now we need to form a matrix of instrumental variables $Z_i$. The explanatory variables are all predetermined, which means that the current values and lagged values are uncorrelated with the current error term. Using the fact that the equation is in first differences, this suggests that all lagged values $y_{i,s-1}$ and $x_{is}$, with $s < t$, can be used as instruments for the explanatory variables at time $t$:
$$E[y_{i,s-1}\dot{u}_{it}] = 0, \quad s < t, \qquad (34)$$
$$E[x_{is}\dot{u}_{it}] = 0, \quad s < t, \qquad (35)$$
with $t = 3, \ldots, T$. First, form the instrument matrix for the lagged values of $y_{it}$, which is block diagonal with zeros elsewhere:
$$Z_{yi} = \begin{pmatrix} y_{i1} & & & \\ & y_{i1} \;\; y_{i2} & & \\ & & \ddots & \\ & & & y_{i1} \cdots y_{i,T-2} \end{pmatrix}. \qquad (36)$$
This is a $(T-2) \times (T-2)(T-1)/2$ or, say, $(T-2) \times T_y$ matrix. Secondly, we can construct the instrument matrix for $x_{it}$ by the same principle:
$$Z_{xi} = \begin{pmatrix} x_{i1} \;\; x_{i2} & & & \\ & x_{i1} \;\; x_{i2} \;\; x_{i3} & & \\ & & \ddots & \\ & & & x_{i1} \cdots x_{i,T-1} \end{pmatrix}, \qquad (37)$$
where the size of the matrix is $(T-2) \times (T-2)(T+1)/2 = (T-2) \times T_x$. These are all valid instruments under Assumption 2, so to obtain the complete instrument matrix they are combined, resulting in
$$Z_i = [Z_{yi} \;\; Z_{xi}], \qquad (38)$$
a $(T-2) \times L$ matrix with $L = T_y + T_x$. Further instruments are left out of consideration in this paper. In addition to the first two assumptions a third one is made.
Assumption 3. $\operatorname{rank}(Z_i'\dot{\tilde{X}}_i) = 2$.

This assumption is an identification condition (Yamagata (2008) and Cameron and Trivedi (2005)).
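The block-diagonal structure of (36)-(38) can be sketched as follows (an illustrative NumPy helper with hypothetical names, not thesis code); the column counts $(T-2)(T-1)/2$ and $(T-2)(T+1)/2$ follow from summing the number of instruments available per period.

```python
import numpy as np

def z_blocks(y_levels, x_levels):
    """Instrument matrix Z_i = [Z_yi  Z_xi] of (36)-(38) for one unit.

    y_levels, x_levels : arrays (y_{i1},...,y_{iT}) and (x_{i1},...,x_{iT}).
    Row t-3 (for t = 3,...,T) holds y_{i1},...,y_{i,t-2} and x_{i1},...,x_{i,t-1}.
    """
    T = len(y_levels)
    rows = T - 2
    Zy = np.zeros((rows, (T - 2) * (T - 1) // 2))   # T_y columns
    Zx = np.zeros((rows, (T - 2) * (T + 1) // 2))   # T_x columns
    cy = cx = 0
    for r, t in enumerate(range(3, T + 1)):
        ny, nx = t - 2, t - 1                        # instruments available at time t
        Zy[r, cy:cy + ny] = y_levels[:ny]
        Zx[r, cx:cx + nx] = x_levels[:nx]
        cy += ny
        cx += nx
    return np.hstack([Zy, Zx])
```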
3.3 The weighting matrices
To perform two-step GMM we need weighting matrices for both steps. In Assumption 2 we assumed that the $u_{it}$ are cross-sectionally and serially uncorrelated, but that they may display some heteroskedastic behaviour. This suggests $u_i \sim (0, \Omega_i)$, where
$$\Omega_i = \begin{pmatrix} \sigma_{i1}^2 & 0 & \cdots & 0 \\ 0 & \sigma_{i2}^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{iT}^2 \end{pmatrix}. \qquad (39)$$
Using (32), this suggests that $\dot{u}_i = Du_i \sim (0, \dot\Omega_i)$, where $\dot\Omega_i = D\Omega_iD'$. Hence, using the CLT as explained in Section 2, we end up with
$$N^{-\frac12}\sum_{i=1}^N Z_i'\dot{u}_i \xrightarrow{d} N(0, V), \qquad (40)$$
where $V = \operatorname{plim} \frac{1}{N}\sum_{i=1}^N Z_i'\dot\Omega_iZ_i$. The weighting matrix used for the optimal GMM estimator should be proportional to the inverse of $V$. To obtain the one-step GMM estimator assume $\Omega_i = \sigma_\epsilon^2 I$. This is incorrect under heteroskedasticity, but it will be corrected in the second step. Under this assumption the weighting matrix is
$$W_N^{(0)} = \left(\frac{1}{N}\sum_{i=1}^N Z_i'HZ_i\right)^{-1}, \qquad (41)$$
where
$$H = \begin{pmatrix} 2 & -1 & 0 & \cdots & 0 \\ -1 & 2 & -1 & \ddots & \vdots \\ 0 & -1 & 2 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & -1 \\ 0 & \cdots & 0 & -1 & 2 \end{pmatrix}. \qquad (42)$$
In model (32), this weighting matrix results in the one-step GMM estimator
$$\hat\theta^{(1)} = \left(\dot{X}'ZW_N^{(0)}Z'\dot{X}\right)^{-1}\dot{X}'ZW_N^{(0)}Z'\dot{y}, \qquad (43)$$
where the matrices are stacked as in (8). In the simulation this estimator is called AB1. Using the residuals, as explained in Section 2, results in the second-step weighting matrix
$$W_N^{(1)} = \left(\frac{1}{N}\sum_{i=1}^N Z_i'\hat{\dot{u}}_i^{(1)}\hat{\dot{u}}_i^{(1)\prime}Z_i\right)^{-1}. \qquad (44)$$
With the help of this matrix the two-step GMM estimator, named AB2 in the simulation, can be computed using (14).
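The one-step weighting matrix of (41)-(42) can be sketched directly (an illustrative NumPy helper with hypothetical names, not thesis code). Note that $H$ equals $DD'$ for the first-difference matrix $D$ of (33), which is exactly the covariance structure of homoskedastic, serially uncorrelated errors after first differencing.

```python
import numpy as np

def h_matrix(m):
    """Tridiagonal matrix H of (42), with m = T-2 in the setting of Section 3:
    2 on the diagonal and -1 on the first off-diagonals (H = D D')."""
    return 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)

def w0(Z_list, m):
    """One-step weighting matrix W_N^{(0)} of (41) from per-unit instrument blocks."""
    H = h_matrix(m)
    return np.linalg.inv(sum(Z.T @ H @ Z for Z in Z_list) / len(Z_list))
```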
4 Simulation Design
The DGP is the same as in Yamagata (2008). It concerns a panel Autoregressive Distributed Lag model of order 1 and 0 (an ARDL(1,0) model) and can be written as in (29), namely
$$y_{it} = \alpha_i + \gamma y_{i,t-1} + \beta x_{it} + u_{it}, \qquad (45)$$
where the DGP used for $x_{it}$ is
$$x_{it} = \rho x_{i,t-1} + \pi u_{i,t-1} + v_{it}, \qquad (46)$$
both with $i = 1, \ldots, N$ and $t = -49, \ldots, T$, and the $\alpha_i$ are drawn from a $N(0, \sigma_\alpha^2)$ distribution. A grid of parameter values and different values for $T$ is used to examine the behaviour of the different tests in a wide landscape. The chosen values are $\gamma \in \{0.1, 0.5, 0.9\}$, $\beta = 1 - \gamma$, and $\pi$ and $\rho$ are set to 0.5. This choice of $\beta$ means the so-called Long Run Multiplier (LRM) is $\beta/(1-\gamma) = 1$. The long run elasticity is equal to one in this way, which is an economically relevant case (Harvey (1981)). Furthermore, under no misspecification of the model we have $u_{it} = \epsilon_{it}$, with $\epsilon_{it} \sim iid\,N(0, \sigma_\epsilon^2)$ and $v_{it} \sim iid\,N(0, \sigma_v^2)$. To get starting values for $x_{i1}$ and $y_{i1}$, the process starts at $t = -49$ with $x_{i,-49} = 0$ and $y_{i,-49} = 0$ and runs until $T$; then the first 50 observations are discarded. This scheme provides the values for $x_{i1}$ and $y_{i1}$ (Yamagata (2008)).
Using the same approach as in Kiviet (1995), Bun and Kiviet (2006) and Yamagata (2008), the signal-to-noise ratio under the null, $u_{it} = \epsilon_{it}$, is controlled through $\sigma_v^2$. The signal is defined as $\sigma_s^2 = \operatorname{var}(y^*_{it} - \epsilon_{it})$, with $y^*_{it} = y_{it} - \alpha_i/(1-\gamma)$, and the signal-to-noise ratio is $\bar\omega = \sigma_s^2/\sigma_\epsilon^2$. The variance of $v_{it}$ is then
$$\sigma_v^2 = \frac{1}{\beta^2}\,\sigma_\epsilon^2\left(\frac{1+\bar\omega}{a_1} - b_1\right), \qquad (47)$$
where
$$a_1 = \frac{1+\gamma\rho}{(1-\rho^2)(1-\gamma^2)(1-\gamma\rho)} \qquad \text{and} \qquad b_1 = 1 + (\beta\pi-\rho)^2 + \frac{2(\beta\pi-\rho)(\gamma+\rho)}{1+\gamma\rho}.$$
The variance of the fixed effect is chosen such that the impact of the two variance components $\alpha_i$ and $\epsilon_{it}$ on $\operatorname{var}(y_{it})$ is constant across the several cases considered. The variance of the individual effect $\alpha_i$ is
$$\sigma_\alpha^2 = (1-\gamma)^2 a_1 b_1 \sigma_\epsilon^2. \qquad (48)$$
Furthermore, the signal-to-noise ratio $\bar\omega$ is set to 3 (Yamagata (2008)). For a more detailed derivation see Sarafidis et al. (2009).
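This calibration can be cross-checked numerically. The helper below is a hypothetical Python sketch (not thesis code) of the reconstructed formulas (47)-(48), with $\beta = 1-\gamma$; one verifiable implication, stated in Section 4 for the alternative DGP, is that for $\pi = 0$ and $\sigma_\epsilon^2 = 1$ expression (48) collapses to $(1-\gamma)/(1+\gamma)$, i.e. (55) with $\lambda = 1$.

```python
def calibrate(gamma, rho, pi, omega_bar=3.0, sigma_eps2=1.0):
    """Signal-to-noise calibration (47)-(48): returns (sigma_v^2, sigma_alpha^2).
    Assumes beta = 1 - gamma, so that the long run multiplier equals one."""
    beta = 1.0 - gamma
    a1 = (1 + gamma * rho) / ((1 - rho**2) * (1 - gamma**2) * (1 - gamma * rho))
    b1 = (1 + (beta * pi - rho)**2
            + 2 * (beta * pi - rho) * (gamma + rho) / (1 + gamma * rho))
    sigma_v2 = sigma_eps2 * ((1 + omega_bar) / a1 - b1) / beta**2   # eq. (47)
    sigma_a2 = (1 - gamma)**2 * a1 * b1 * sigma_eps2                # eq. (48)
    return sigma_v2, sigma_a2
```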
We define two categories of misspecification cases. The first, the base cases, consists of different forms of serial correlation in the error terms. The second category contains cases in which, for example, the error terms are heteroskedastic or skewed. The different designs are given below.
(i) The first case is the situation where there is no misspecification, hence
$$u_{it} = \sigma_\epsilon\epsilon_{it}, \qquad \sigma_\epsilon = 1.$$
(ii) The error terms follow an AR(1) model, where $\sigma_\epsilon$ is chosen such that the variance of $u_{it}$ is equal to one,
$$u_{it} = \rho_1 u_{i,t-1} + \sigma_\epsilon\epsilon_{it}, \qquad (49)$$
where $\sigma_\epsilon^2 = 1 - \rho_1^2$. The parameter $\rho_1$ is set to 0.2.
(iii) Now the error terms follow an MA(1) model. Again the variance of $u_{it}$ is equal to one, in order to make the different cases comparable:
$$u_{it} = \sigma_\epsilon(\epsilon_{it} + \theta_1\epsilon_{i,t-1}), \qquad (50)$$
with $\sigma_\epsilon^2 = (1+\theta_1^2)^{-1}$. $\theta_1 = 0.2$ is considered in this case.
Cases (i)-(iii) are the base cases. The following designs are the misspecifications that will be combined with the base cases. Hence, including the case of no misspecification, we end up with twelve different designs.
(a) The errors contain unconditional heteroskedasticity. For ease of computation $\pi = 0$. This does create a difference between the previous designs and the heteroskedasticity design, but because the estimation method remains the same some comparison can be made. The design is such that $\epsilon_{it}$ is heteroskedastic over both $t$ and $i$. Start with $\operatorname{var}(\epsilon_{it}) = \delta x_{it}^2$, such that $\epsilon_{it} = \sqrt{\delta}\,\eta_{it}x_{it}$, with $\eta_{it} \sim N(0, 1)$. To make the design of heteroskedasticity comparable to the other designs, the average variance of the error term should equal one. This is done by specifying $\delta$ as
$$\delta = \frac{1-\rho^2}{\sigma_v^2}. \qquad (51)$$
See Appendix A for further explanation.
(b) Skewness is introduced by making use of a gamma distribution for the error terms:
$$\epsilon_{it} \sim \Gamma(1/16, 4) - 1/4. \qquad (52)$$
In this way the distribution of the error terms is positively skewed. The skewness is exaggerated for a more visible effect. The mean and variance of the error term are, however, still zero and one.
(c) In the last case the heteroskedasticity from (a) and skewness from (b) are combined into one design. Let $\epsilon_{it} = \sqrt{\delta}\,\eta_{it}x_{it}$ and $\eta_{it} \sim \Gamma(1/16, 4) - 1/4$.
Finally, an alternative DGP is considered, and it is examined how the misspecification tests perform relative to the more specific Wald test of the null hypothesis that the coefficient β_1 equals zero. The error terms are assumed to be white noise, as in case (i). The DGP now contains an extra explanatory variable x_{i,t−1}, which is not taken into account in the estimation:

y_{it} = \alpha_i + \gamma y_{i,t-1} + \beta_0 x_{it} + \beta_1 x_{i,t-1} + u_{it}, \qquad (53)

where the DGP used for x_it is the same as in (46). For γ the same grid of values is used. However, for the other coefficients β_0 + β_1 = 1 − γ holds. This results in (β_0 + β_1)/(1 − γ) = 1, which means the LRM is again equal to one. Furthermore, we take β_0 = β_1 = β/2 and, for ease of computation, again π = 0. This specification investigates how the tests behave in the case of a relevant omitted variable, and whether a more specific coefficient test is a better alternative. The relevant formulas become

\sigma_v^2 = \frac{2}{\beta^2}\left[(1-\gamma^2)\bar\omega - \gamma^2\right]\frac{(1-\rho)(1-\gamma\rho)}{1+\gamma} \qquad (54)

and

\sigma_\alpha^2 = \mu^2\,\frac{1-\gamma}{1+\gamma}, \qquad (55)

with μ = 1. The formula for σ_α² comes from Kiviet (1995), and it can be shown that if π = 0, (48) becomes (55). Detailed derivations can be found in Appendix B.
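The coefficient restrictions of this design can be illustrated with a short sketch (the helper name is hypothetical) confirming that the long-run multiplier equals one for every γ on the grid:

```python
def lrm(gamma):
    """Long-run multiplier of x on y in DGP (53) under the design restrictions."""
    beta = 1.0 - gamma               # beta_0 + beta_1 = beta = 1 - gamma
    beta0 = beta1 = beta / 2.0
    return (beta0 + beta1) / (1.0 - gamma)

for g in (0.1, 0.5, 0.9):
    print(g, lrm(g))                 # the LRM equals one for every gamma
```
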
Monte Carlo results using these specifications may show that some misspecifications cause the tests to have rejection frequencies under the null that differ strongly from the nominal level, or rejection frequencies under the alternative that are relatively low. Using the bootstrap method discussed in Section 2.3, we attempt to remedy these shortcomings by resampling data from the true DGP. First the specific misspecifications that cause problems are identified; then the bootstrap technique is used to solve the problems. Because the Wild bootstrap, as explained in Section 2.3, needs strictly exogenous x_it, again π = 0 is used. This makes x_it strictly exogenous, so that more instruments could be used. For simplicity, however, the instrument matrix is kept the same as in Section 3.2. Of course the y_{i,t−1} are still predetermined.

In all cases a nominal significance level of five percent is used, α = 0.05. Furthermore, the cases N ∈ {100, 200} and T ∈ {5, 7, 9} are considered, and the number of Monte Carlo repetitions is R = 5000. The number of bootstrap replications is B = 199. MacKinnon (2002) reports that such a small B is not recommended in applied research, but the sampling errors associated with this value tend to cancel out in Monte Carlo experiments.
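Since the details of Section 2.3 are not reproduced here, the following is only a generic sketch of how a Wild bootstrap p-value with B = 199 could be computed for some test statistic. All helper names are hypothetical, and the actual resampling scheme of Section 2.3 (e.g. rebuilding y_it recursively from the resampled disturbances) may well differ:

```python
import numpy as np

def wild_bootstrap_pvalue(stat_fn, resid, B=199, seed=0):
    """Generic Wild bootstrap p-value (sketch only).

    stat_fn : maps an (N, T) matrix of disturbances to a scalar statistic
    resid   : (N, T) residual matrix from the original estimation
    """
    rng = np.random.default_rng(seed)
    stat0 = abs(stat_fn(resid))
    count = 0
    for _ in range(B):
        # Rademacher weight per observation: keeps each residual's magnitude
        # (hence any heteroskedasticity) but imposes the null of no serial
        # correlation on the pseudo-disturbances
        w = rng.choice([-1.0, 1.0], size=resid.shape)
        if abs(stat_fn(w * resid)) >= stat0:
            count += 1
    return (count + 1) / (B + 1)

# toy statistic: standardised first-order autocovariance of the disturbances
def toy_stat(e):
    return (e[:, 1:] * e[:, :-1]).mean() / e.var()

rng = np.random.default_rng(42)
p = wild_bootstrap_pvalue(toy_stat, rng.normal(size=(100, 5)))
print(p)        # a p-value in (0, 1]
```
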
5 Results
In this section the finite sample behaviour of the m1, m2 and OIR tests is examined for the designs described in Section 4. Where these tests fail to provide reasonable size control, a bootstrap method is used to try to mitigate the problems. In this section m21 denotes the m2 test after one-step GMM and m22 the m2 test after two-step GMM; the same convention holds for m11 and m12. In Table 1, rows (a) and (d), the actual significance levels of the m21, m22 and OIR tests are given. Note that these are the actual significance levels for the chosen parameter value combination, so they may differ when other parameter values are used. The actual significance levels of the m21 and m22 tests are satisfactory for all combinations of N and T. However, the OIR test tends to reject the null too infrequently. Especially when T becomes larger and N stays relatively small, this test shows poor rejection frequencies under the null. Overall the test becomes worse as T increases, due to an increase in the number of moment restrictions (Yamagata, 2008). These findings are consistent with those in Bowsher (2002), Windmeijer (2005) and Yamagata (2008).
Next, consider the results for the rejection frequencies under the alternative, given in rows (b), (c), (e) and (f). The m21 and m22 tests have good size control, hence we can speak of power for these tests; for the OIR test this is not the case. The model is estimated in first differences, hence the m11 and m12 tests should reject, which happens in almost one hundred percent of the cases. In general the rejection frequency is increasing in N and T. The rejection frequencies of the m21, m22 and OIR tests are smaller when the errors are AR(1) than when they are MA(1). Especially the OIR test for N = 100 displays very weak rejection rates under both error term specifications. Furthermore, the rejection frequency of the OIR test decreases with an increase in T, whereas the power of the m21 and m22 tests is increasing in T.
Table 1: Rejection frequencies under the null and the alternative of tests: A dynamic panel ARDL(1,0) model with predetermined regressors. The ε_it are standard normally distributed.

                 m11                  m12                  m21                  m22                  OIR
u_it     γ \ T   5     7     9        5     7     9        5     7     9        5     7     9        5     7     9
N=100
(a) ε_it  0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.048 0.054 0.053    0.050 0.053 0.054    0.033 0.014 0.000
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.048 0.052 0.052    0.048 0.052 0.053    0.037 0.016 0.000
          0.9    0.996 1.000 1.000    0.983 1.000 1.000    0.048 0.052 0.052    0.046 0.051 0.053    0.052 0.020 0.000
(b) AR(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.175 0.290 0.386    0.176 0.292 0.385    0.087 0.038 0.000
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.175 0.294 0.394    0.176 0.293 0.392    0.108 0.049 0.000
          0.9    0.989 1.000 1.000    0.957 1.000 1.000    0.143 0.260 0.357    0.129 0.248 0.353    0.115 0.055 0.000
(c) MA(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.336 0.550 0.704    0.339 0.551 0.705    0.143 0.061 0.000
          0.5    1.000 1.000 1.000    0.999 1.000 1.000    0.334 0.553 0.709    0.334 0.553 0.708    0.159 0.071 0.000
          0.9    0.986 1.000 1.000    0.940 0.998 1.000    0.280 0.509 0.677    0.254 0.495 0.674    0.150 0.069 0.000
N=200
(d) ε_it  0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.051 0.052 0.051    0.048 0.050 0.051    0.039 0.039 0.023
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.050 0.052 0.050    0.049 0.051 0.050    0.043 0.041 0.025
          0.9    1.000 1.000 1.000    1.000 1.000 1.000    0.048 0.051 0.049    0.047 0.050 0.050    0.069 0.058 0.038
(e) AR(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.289 0.471 0.632    0.291 0.472 0.633    0.241 0.260 0.202
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.296 0.490 0.663    0.296 0.487 0.663    0.289 0.311 0.246
          0.9    1.000 1.000 1.000    0.997 1.000 1.000    0.259 0.447 0.623    0.242 0.434 0.616    0.362 0.402 0.332
(f) MA(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.576 0.829 0.938    0.575 0.828 0.939    0.404 0.425 0.328
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.581 0.837 0.946    0.578 0.837 0.945    0.443 0.475 0.382
          0.9    1.000 1.000 1.000    0.991 0.999 1.000    0.532 0.809 0.933    0.498 0.796 0.930    0.481 0.544 0.452

R = 5000 simulation replications. Design parameter values: β = 1 − γ, π = ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
Table 2: Rejection frequencies under the null and the alternative of tests: A dynamic panel ARDL(1,0) model with predetermined regressors. The ε_it are heteroskedastic.

                 m11                  m12                  m21                  m22                  OIR
u_it     γ \ T   5     7     9        5     7     9        5     7     9        5     7     9        5     7     9
N=100
(a) ε_it  0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.046 0.045 0.046    0.044 0.045 0.046    0.017 0.003 0.000
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.046 0.048 0.051    0.046 0.048 0.051    0.019 0.006 0.000
          0.9    0.999 1.000 1.000    0.997 1.000 1.000    0.058 0.050 0.050    0.058 0.049 0.050    0.022 0.004 0.000
(b) AR(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.076 0.101 0.125    0.077 0.101 0.126    0.022 0.006 0.000
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.085 0.119 0.151    0.085 0.117 0.150    0.026 0.007 0.000
          0.9    0.998 1.000 1.000    0.994 1.000 1.000    0.092 0.120 0.154    0.089 0.118 0.154    0.029 0.007 0.000
(c) MA(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.090 0.127 0.164    0.087 0.133 0.163    0.026 0.009 0.000
          0.5    1.000 1.000 1.000    0.999 1.000 1.000    0.099 0.148 0.196    0.097 0.147 0.194    0.029 0.010 0.000
          0.9    0.997 1.000 1.000    0.994 0.998 1.000    0.110 0.150 0.195    0.106 0.152 0.194    0.034 0.007 0.000
N=200
(d) ε_it  0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.038 0.038 0.040    0.039 0.043 0.042    0.029 0.020 0.010
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.040 0.046 0.043    0.040 0.045 0.045    0.030 0.024 0.011
          0.9    1.000 1.000 1.000    1.000 1.000 1.000    0.048 0.048 0.043    0.047 0.047 0.043    0.043 0.033 0.014
(e) AR(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.097 0.136 0.177    0.096 0.140 0.184    0.048 0.040 0.019
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.113 0.165 0.224    0.115 0.166 0.228    0.048 0.043 0.018
          0.9    1.000 1.000 1.000    0.999 1.000 1.000    0.123 0.176 0.246    0.124 0.176 0.247    0.090 0.064 0.027
(f) MA(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.125 0.190 0.255    0.132 0.198 0.267    0.054 0.045 0.023
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.146 0.228 0.315    0.145 0.230 0.318    0.057 0.045 0.023
          0.9    1.000 1.000 1.000    0.991 0.999 1.000    0.157 0.246 0.337    0.156 0.244 0.337    0.094 0.068 0.029

R = 5000 simulation replications. Design parameter values: β = 1 − γ, π = ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
Now consider the case of heteroskedasticity in the error terms as specified in (51). In Table 2, rows (a) and (d), it can be seen that the heteroskedasticity does affect the actual significance levels of the m21 and m22 tests. However, the effect is very small, and the actual significance levels of both tests are still satisfactory for all combinations of N and T. The significance level of the OIR test deteriorates and approaches zero. In contrast to the rejection frequencies under the null, the heteroskedasticity does have a substantial effect on the rejection frequencies of the m21, m22 and OIR tests under the alternative: they decrease substantially, and for the OIR test they even approach zero. Under serial correlation the rejection frequencies of the m11 and m12 tests seem hardly affected by heteroskedasticity.
The results for the case of a skewed distribution of the error terms can be found in Table 3. Rows (a) and (d) show that skewness affects the actual significance levels of the m21 and m22 tests a little more than heteroskedasticity does. However, the actual significance level of the OIR test improves compared with the heteroskedastic case. The rejection frequencies of the m21, m22 and OIR tests are on average the same as in case (i). The rejection frequencies of the m11 and m12 tests become smaller, although they are still satisfactory.
Turning now to the results in Table 4, where heteroskedasticity and skewness are combined: the rejection frequencies under the null of the m21 and m22 tests are still within an acceptable range, but they leave room for improvement. The rejection frequencies under the null of the OIR test are far from acceptable, however. Only for T = 5 and N = 200, i.e. small T and large N, does this test have reasonable finite sample properties. For the other combinations of T and N serious problems occur, e.g. over-rejection when N = 200 and under-rejection for T = 9 and N = 100. From rows (b), (c), (e) and (f) it can be concluded that the m21 and m22 tests have relatively low rejection frequencies in this case. For large T and N the rejection frequencies are acceptable, but when T and N are small the rejection frequencies under the alternative do not differ much from those under the null. The OIR test does have a reasonable rejection frequency under the alternative in most cases, but given the problems that occur under the null hypothesis these rejection frequencies are not of much use for interpretation in practice. Overall, the m21, m22 and OIR tests do not provide reliable inference when heteroskedasticity and skewness of the error term distribution are both present. The rejection frequencies of the m11 and m12 tests deteriorate further but are still at an acceptable level; however, their rejection frequencies under the null should be investigated to conclude this with certainty.
Finally, an extra lagged explanatory variable, x_{i,t−1}, is included in the DGP, which the estimation method does not take into account. It is investigated how well the m2 and OIR tests detect this form of misspecification in comparison with the more specific Wald test of the null hypothesis β_1 = 0. The results are given in Table 5; the rejection frequencies of the Wald test under the null are displayed in Appendix C, Table 8. In both tables W1 is the Wald test after one-step GMM and W2 the Wald test after two-step GMM. The rejection frequencies under the null in Table 8 show that W1 has satisfactory size control, whereas W2 needs a larger N to display correct rejection frequencies under the null hypothesis. The results in Table 5 show that only for small γ, and thus large β_1, does the m2 test detect the misspecification at a reasonable rate. For smaller β_1 this test does not recognise the serial correlation caused by the omitted x_{i,t−1}. The OIR test performs a lot better, but also fails to detect the misspecification for small β_1. Moreover, the OIR test again fails to provide precise inference when T becomes relatively large and N stays small. The Wald test outperforms both tests, especially W1, which does not come as a surprise since it is a test more appropriate for the situation.
To address the problems discussed, the bootstrap method explained in Section 2.3 is used. Because of the substantial computational requirements, a selection of cases is made and the number of Monte Carlo repetitions is reduced to R = 1000. The earlier results and the bootstrap results for the selected cases can be found in Tables 6 and 7.
Table 3: Rejection frequencies under the null and the alternative of tests: A dynamic panel ARDL(1,0) model with predetermined regressors. The ε_it are skewed.

                 m11                  m12                  m21                  m22                  OIR
u_it     γ \ T   5     7     9        5     7     9        5     7     9        5     7     9        5     7     9
N=100
(a) ε_it  0.1    0.777 0.925 0.970    0.779 0.925 0.970    0.024 0.032 0.038    0.021 0.032 0.038    0.028 0.009 0.000
          0.5    0.775 0.924 0.970    0.776 0.924 0.970    0.024 0.031 0.036    0.022 0.031 0.036    0.039 0.009 0.000
          0.9    0.751 0.920 0.969    0.755 0.921 0.969    0.061 0.038 0.040    0.050 0.040 0.040    0.011 0.002 0.000
(b) AR(1) 0.1    0.648 0.885 0.960    0.640 0.884 0.960    0.196 0.371 0.484    0.193 0.373 0.484    0.070 0.145 0.000
          0.5    0.640 0.884 0.959    0.633 0.882 0.959    0.194 0.376 0.489    0.190 0.376 0.490    0.056 0.094 0.000
          0.9    0.619 0.871 0.956    0.623 0.870 0.956    0.181 0.353 0.474    0.182 0.353 0.474    0.027 0.024 0.000
(c) MA(1) 0.1    0.608 0.870 0.956    0.597 0.868 0.956    0.325 0.537 0.668    0.323 0.538 0.668    0.076 0.136 0.000
          0.5    0.601 0.867 0.955    0.592 0.865 0.955    0.311 0.538 0.667    0.314 0.537 0.667    0.064 0.093 0.000
          0.9    0.579 0.855 0.952    0.578 0.853 0.952    0.275 0.509 0.661    0.283 0.513 0.660    0.030 0.026 0.000
N=200
(d) ε_it  0.1    0.958 0.990 0.998    0.957 0.990 0.998    0.030 0.041 0.045    0.029 0.039 0.044    0.013 0.013 0.021
          0.5    0.957 0.989 0.998    0.957 0.989 0.998    0.029 0.039 0.046    0.029 0.039 0.044    0.015 0.016 0.027
          0.9    0.952 0.989 0.998    0.954 0.989 0.998    0.044 0.041 0.044    0.038 0.040 0.044    0.020 0.017 0.020
(e) AR(1) 0.1    0.925 0.985 0.998    0.922 0.984 0.998    0.366 0.565 0.687    0.358 0.564 0.687    0.143 0.164 0.220
          0.5    0.921 0.985 0.998    0.920 0.984 0.998    0.367 0.572 0.699    0.360 0.574 0.699    0.147 0.138 0.143
          0.9    1.000 1.000 1.000    0.997 1.000 1.000    0.259 0.447 0.623    0.242 0.434 0.616    0.192 0.101 0.046
(f) MA(1) 0.1    0.907 0.985 0.997    0.899 0.985 0.997    0.556 0.771 0.873    0.549 0.771 0.872    0.184 0.190 0.226
          0.5    0.902 0.985 0.997    0.895 0.985 0.997    0.557 0.775 0.875    0.549 0.776 0.875    0.182 0.150 0.137
          0.9    0.883 0.982 0.996    0.883 0.982 0.996    0.508 0.762 0.870    0.513 0.763 0.872    0.204 0.121 0.058

R = 5000 simulation replications. Design parameter values: β = 1 − γ, π = ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
Table 4: Rejection frequencies under the null and the alternative of tests: A dynamic panel ARDL(1,0) model with predetermined regressors. The ε_it are heteroskedastic and skewed.

                 m11                  m12                  m21                  m22                  OIR
u_it     γ \ T   5     7     9        5     7     9        5     7     9        5     7     9        5     7     9
N=100
(a) ε_it  0.1    0.552 0.727 0.823    0.538 0.722 0.822    0.026 0.030 0.032    0.022 0.028 0.032    0.247 0.595 0.000
          0.5    0.546 0.724 0.821    0.534 0.720 0.821    0.023 0.024 0.027    0.018 0.023 0.027    0.263 0.600 0.000
          0.9    0.508 0.710 0.815    0.504 0.708 0.814    0.056 0.035 0.029    0.049 0.034 0.027    0.095 0.215 0.000
(b) AR(1) 0.1    0.496 0.697 0.803    0.479 0.693 0.802    0.058 0.086 0.116    0.046 0.079 0.112    0.238 0.494 0.000
          0.5    0.487 0.693 0.801    0.467 0.685 0.800    0.060 0.098 0.135    0.051 0.094 0.133    0.221 0.471 0.000
          0.9    0.452 0.676 0.794    0.441 0.669 0.792    0.062 0.089 0.132    0.055 0.085 0.132    0.095 0.211 0.000
(c) MA(1) 0.1    0.495 0.705 0.808    0.473 0.700 0.808    0.065 0.099 0.142    0.054 0.096 0.139    0.240 0.481 0.000
          0.5    0.485 0.699 0.805    0.464 0.692 0.804    0.068 0.111 0.163    0.060 0.108 0.161    0.218 0.453 0.000
          0.9    0.450 0.681 0.798    0.438 0.674 0.797    0.069 0.101 0.159    0.063 0.099 0.159    0.087 0.219 0.000
N=200
(d) ε_it  0.1    0.781 0.886 0.935    0.774 0.885 0.935    0.028 0.033 0.035    0.020 0.029 0.031    0.093 0.364 0.807
          0.5    0.778 0.887 0.935    0.770 0.884 0.934    0.023 0.028 0.031    0.020 0.024 0.029    0.095 0.372 0.807
          0.9    0.759 0.882 0.933    0.759 0.882 0.933    0.040 0.029 0.032    0.034 0.027 0.031    0.076 0.261 0.676
(e) AR(1) 0.1    0.754 0.882 0.934    0.742 0.881 0.933    0.083 0.143 0.193    0.068 0.130 0.180    0.157 0.454 0.819
          0.5    0.751 0.881 0.933    0.740 0.879 0.932    0.094 0.174 0.242    0.085 0.165 0.235    0.133 0.407 0.792
          0.9    0.728 0.877 0.930    0.722 0.874 0.930    0.093 0.177 0.259    0.082 0.169 0.252    0.123 0.332 0.730
(f) MA(1) 0.1    0.749 0.886 0.943    0.736 0.884 0.943    0.098 0.167 0.235    0.090 0.161 0.220    0.159 0.441 0.808
          0.5    0.741 0.884 0.942    0.730 0.882 0.942    0.112 0.204 0.287    0.105 0.199 0.285    0.134 0.388 0.784
          0.9    0.720 0.876 0.940    0.712 0.873 0.939    0.106 0.201 0.298    0.095 0.203 0.296    0.118 0.320 0.722

R = 5000 simulation replications. Design parameter values: β = 1 − γ, π = ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
Table 5: Comparison of the m2 and OIR tests with the Wald test using the alternative DGP. The ε_it are standard normally distributed.

                 W1                   W2                   m21                  m22                  OIR
u_it     γ \ T   5     7     9        5     7     9        5     7     9        5     7     9        5     7     9
N=100
(a) ε_it  0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.106 0.210 0.328    0.134 0.236 0.337    0.997 0.956 0.000
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.053 0.047 0.048    0.053 0.047 0.048    0.846 0.565 0.000
          0.9    0.573 0.827 0.939    0.669 0.918 0.991    0.050 0.057 0.051    0.053 0.058 0.050    0.111 0.042 0.000
N=200
(b) ε_it  0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.237 0.470 0.667    0.303 0.540 0.714    1.000 1.000 1.000
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.049 0.053 0.058    0.050 0.055 0.060    1.000 1.000 0.999
          0.9    0.872 0.982 0.998    0.888 0.985 0.999    0.052 0.053 0.050    0.053 0.055 0.051    0.309 0.307 0.226

R = 5000 simulation replications. Design parameter values: β = 1 − γ, β_0 = β_1 = β/2, π = 0, ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
Table 6: Wild bootstrap to improve the rejection frequencies under the null of tests.

                                          Original                  Bootstrap
                                          m21    m22    OIR         m21    m22    OIR
Homoskedasticity, N=100, γ = 0.5
  T = 5                                   0.048  0.048  0.037       0.059  0.075  0.056
  T = 7                                   0.052  0.052  0.016       0.080  0.074  0.078
  T = 9                                   0.052  0.053  0.000       0.133  0.122  0.036
Skewness + heteroskedasticity, N=100, γ = 0.5
  T = 5                                   0.023  0.018  0.263       0.037  0.039  0.114
  T = 7                                   0.024  0.023  0.600       0.047  0.049  0.313
  T = 9                                   0.027  0.027  0.000       0.093  0.100  0.404
Skewness + heteroskedasticity, N=200, γ = 0.5
  T = 5                                   0.023  0.020  0.095       0.033  0.035  0.070
  T = 7                                   0.028  0.024  0.372       0.033  0.033  0.138
  T = 9                                   0.031  0.029  0.807       0.067  0.069  0.381

R = 1000 simulation replications and B = 199 bootstrap replications. Design parameter values: β = 1 − γ, π = ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
Table 7: Wild bootstrap to improve the rejection frequencies under the alternative of tests.

                                          Original                  Bootstrap
                                          m21    m22    OIR         m21    m22    OIR
Homoskedasticity, γ = 0.5
  N=100, T=5    AR(1)                     0.175  0.176  0.108       0.153  0.194  0.186
                MA(1)                     0.334  0.334  0.159       0.296  0.356  0.252
  N=100, T=9    AR(1)                     0.394  0.392  0.000       0.138  0.167  0.080
                MA(1)                     0.709  0.708  0.000       0.373  0.427  0.116
Skewness + heteroskedasticity, γ = 0.5
  N=100, T=5    AR(1)                     0.060  0.051  0.221       0.056  0.052  0.113
                MA(1)                     0.068  0.060  0.218       0.051  0.059  0.105
  N=200, T=5    AR(1)                     0.094  0.085  0.133       0.084  0.103  0.125
                MA(1)                     0.112  0.105  0.134       0.082  0.105  0.121
Omitted x_{i,t−1}, γ = 0.5
  N=100, T=9    ε_it                      0.048  0.048  0.000       0.163  0.165  0.339
  N=200, T=9    ε_it                      0.058  0.060  0.999       0.118  0.115  1.000

R = 1000 simulation replications and B = 199 bootstrap replications. Design parameter values for homoskedasticity and skewness plus heteroskedasticity: β = 1 − γ, π = ρ = 0.5, ω̄ = 3, σ_ε = 1. Design parameter values for omitted x_{i,t−1}: β = 1 − γ, β_0 = β_1 = β/2, π = 0, ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
The Wild bootstrap for the case of homoskedastic error terms shows that this bootstrap technique does not work properly for the m2 test. The rejection frequencies under the null and under the alternative of serial correlation show no improvements, and in most cases they even deteriorate. This test, however, already provides satisfactory results without the use of bootstrapping. The rejection frequencies of the OIR test under homoskedasticity do improve substantially. Especially when T becomes larger, the original results show rejection frequencies of zero, whereas the Wild bootstrap fixes these finite sample problems and provides proper rejection frequencies.
The combined case of heteroskedasticity and skewness shows that the rejection frequencies under the null are improved for both the m2 and OIR tests. Those of the OIR test are not close to 0.05, but when N increases the rejection frequencies move closer to this desired value. The results for the rejection frequencies under the alternative are disappointing, however. The m21 and OIR tests show no improvements in this area, and in most cases the results are even worse. The rejection frequencies of the m22 test show signs of improvement, though very minimal ones.
Turning to the case of an omitted explanatory variable, in the last two rows of Table 7, the Wild bootstrap increases the rejection frequencies under the alternative substantially. For both the m2 test and the OIR test the rejection frequency more than triples compared with the original results for N = 100. For N = 200 the bootstrap is as good as or better than the original results.
Overall it may be concluded that for the OIR test, in the case of small N and large T, substantial improvements are obtained. This is a promising result, because this case is precisely the situation in which the OIR test without bootstrapping does not work at all. The bootstrap for the m2 test does not provide an improvement when heteroskedasticity is present; however, under homoskedasticity it might increase the rejection frequencies under the alternative in some cases.
6 Conclusion
With the increase of data storage and the use of panel data models gaining in popularity, it is of great importance to have reliable misspecification tests. These tests should behave properly in a panel data setting and under several kinds of misspecification. This paper examined the finite sample properties of the m1, m2 and OIR tests within a GMM framework and explored the ability of the bootstrap to provide improved critical values for the test statistics.
The Monte Carlo simulations show that the OIR test performs poorly when T becomes relatively large. These results are in accordance with the findings in Windmeijer (2005). Only when N is large, T is small and no heteroskedasticity or skewness is present can this test provide reliable inference in finite samples. Because of the poor behaviour in panels with relatively many time periods, it can be questioned whether the OIR test, in its basic form, should be used for panel data inference at all.
The m1 and m2 tests, with the focus on the m2 test, provide satisfactory rejection frequencies for all combinations of N and T under homoskedasticity, both under the null and under mild AR(1) and MA(1) alternatives. Skewness is not much of a problem for these tests. The real weakness is heteroskedasticity, which diminishes the rejection frequencies under the alternative hypothesis.
The Wild bootstrap for panel data shows some potential for improving the finite sample properties of the OIR test in the case of small N and large T. Using this bootstrap method might make the OIR test a reliable inference method for panel data models. However, further research that investigates different combinations of nuisance parameters should be carried out before this can be said with certainty.
The Wild bootstrap does improve the rejection frequencies of the m2 test under the null, and it improves the rejection frequencies under the alternative in some cases. However, it fails to mitigate the problems caused by heteroskedasticity. In most other situations considered, the original test performs well. Combining this with the bootstrap results, one may conclude that using the Wild bootstrap for the m2 test in panel data can provide improvements, although not as substantial as those it provides for the OIR test.
Further research can expand the cases considered in this paper and explore even further the landscape in which the bootstrap provides improvements for the m2 and OIR tests. Varying certain nuisance parameters could give more insight into the strength of this method.
Acknowledgement
My gratitude goes out to Jan F. Kiviet for helpful comments and suggestions and to Bobby Witte for the philosophical conversations that provided me with insights and ideas.
References
Arellano, M., Bond, S.R., 1991. Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. The Review of Economic Studies 58, 277-297.
Arellano, M., 2003. Panel Data Econometrics. Oxford University Press.
Bowsher, C.G., 2002. On testing overidentifying restrictions in dynamic panel data models. Economics Letters 77, 211-220.
Bun, M.J.G., Kiviet, J.F., 2006. The effects of dynamic feedbacks on LS and MM estimator accuracy in panel data models. Journal of Econometrics 132, 409-444.
Cameron, A.C., Gelbach, J.B., Miller, D.L., 2008. Bootstrap-based improvements for inference with clustered errors. The Review of Economics and Statistics 90(3), 414-427.
Cameron, A.C., Trivedi, P.K., 2005. Microeconometrics: methods and applications. Cambridge university press.
Godfrey, L.G., Tremayne, A.R., 2005. The wild bootstrap and heteroskedasticity-robust tests for serial correlation in dynamic regression models. Computational Statistics & Data Analysis 49, 377-395.
Gonçalves, S., Kilian, L., 2004. Bootstrapping autoregressions with conditional heteroskedasticity of unknown form. Journal of Econometrics 123, 89-120.
Hall, P., Horowitz, J.L., 1996. Bootstrap critical values for tests based on generalized-method-of-moments estimators. Econometrica 64, 891-916.
Hansen, L., 1982. Large sample properties of generalized method of moments estimators. Econometrica 50, 1029-1054.
Harvey, A.C., 1981. The econometric analysis of time series. MIT press.
Kapetanios, G., 2008. A bootstrap procedure for panel data sets with many cross-sectional units. Econometrics Journal 11, 377-395.
Kiviet, J.F., 1995. On bias, inconsistency, and efficiency of various estimators in dynamic panel data models. Journal of Econometrics 68, 53-78.
Kiviet, J.F., Pleus, M., Poldermans, R., 2014. Accuracy and efficiency of various GMM inference techniques in dynamic micro panel data models. Amsterdam School of Economics, Discussion paper.
Li, H., Maddala, G.S., 1996. Bootstrapping time series models. Econometric Reviews 15, 115-158.
Liu, R.Y., 1988. Bootstrap procedures under some non-IID models. Annals of Statistics 16, 1696-1708.
MacKinnon, J.G., 2002. Bootstrap inference in econometrics. Canadian Journal of Economics 35, 615-645.
Sarafidis, V., Yamagata, T., Robertson, D., 2009. A test of cross section dependence for a linear dynamic panel model with regressors, unpublished manuscript.
Sargan, J.D., 1958. The estimation of economic relationships using instrumental variables. Econometrica 26, 393-415.
Windmeijer, F., 2005. A finite sample correction for the variance of linear efficient two-step GMM estimators. Journal of Econometrics 126, 25-51.
Yamagata, T., 2008. A joint serial correlation test for linear panel data models. Journal of Econometrics 146, 135-145.
Appendix
A Heteroskedasticity parameter

Start with var(ε_it) = λx_it², such that ε_it = √λ η_it x_it, with η_it ∼ N(0, 1). Specify λ such that the average variance of the disturbances is one:

\frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T} \lambda x_{it}^2 = 1, \qquad \lambda = \frac{1}{\mathrm{var}(x_{it})} = \frac{1}{\sigma_x^2}. \qquad (56)

When π = 0 the DGP for x_it becomes

x_{it} = \rho x_{i,t-1} + v_{it}. \qquad (57)

Hence

\sigma_x^2 = \frac{\sigma_v^2}{1-\rho^2}, \qquad (58)

which results in

\lambda = \frac{1-\rho^2}{\sigma_v^2}. \qquad (59)
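The stationary AR(1) variance in (58) can be checked by simulation (an illustrative sketch, not part of the derivation; names are hypothetical):

```python
import numpy as np

rho, sigma_v, n = 0.5, 1.0, 200_000
rng = np.random.default_rng(7)

x = np.empty(n)
x[0] = rng.normal(0.0, sigma_v / np.sqrt(1 - rho**2))   # stationary start
for t in range(1, n):
    x[t] = rho * x[t - 1] + rng.normal(0.0, sigma_v)

print(x.var())                      # close to sigma_v^2 / (1 - rho^2) = 4/3
print((1 - rho**2) / sigma_v**2)    # the resulting lambda of (59): 0.75
```
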
B Derivation alternative DGP

Start with the DGP with x_{i,t−1} included,

y_{it} = \gamma y_{i,t-1} + \beta_0 x_{it} + \beta_1 x_{i,t-1} + \alpha_i + u_{it}, \qquad (60)

which can be rewritten as

y_{it} - \frac{\alpha_i}{1-\gamma} = y_{it}^* = \gamma y_{i,t-1}^* + \beta_0 x_{it} + \beta_1 x_{i,t-1} + u_{it}. \qquad (61)

Derivation of the variance gives

\mathrm{var}(y_{it}^*) = \gamma^2 \mathrm{var}(y_{it}^*) + (\beta_0\rho+\beta_1)^2\sigma_x^2 + \beta_0^2\sigma_v^2 + \sigma_u^2 + 2\gamma(\beta_0\rho+\beta_1)\,\mathrm{cov}(y_{i,t-1}^*, x_{i,t-1}), \qquad (62)

with

\mathrm{cov}(y_{i,t-1}^*, x_{i,t-1}) = \frac{\beta_0+\rho\beta_1}{(1-\gamma\rho)(1-\rho^2)}\,\sigma_v^2. \qquad (63)

Substituting this and \sigma_x^2 = \sigma_v^2/(1-\rho^2) into (62) gives

\mathrm{var}(y_{it}^*) = \frac{1}{1-\gamma^2}\left(\frac{(\beta_0\rho+\beta_1)^2}{1-\rho^2} + \beta_0^2 + \frac{2\gamma(\beta_0\rho+\beta_1)(\beta_0+\rho\beta_1)}{(1-\gamma\rho)(1-\rho^2)}\right)\sigma_v^2 + \frac{\sigma_u^2}{1-\gamma^2}, \qquad (64)

hence

\sigma_s^2 = \frac{1}{1-\gamma^2}\left(\frac{(\beta_0\rho+\beta_1)^2}{1-\rho^2} + \beta_0^2 + \frac{2\gamma(\beta_0\rho+\beta_1)(\beta_0+\rho\beta_1)}{(1-\gamma\rho)(1-\rho^2)}\right)\sigma_v^2 + \frac{\gamma^2\sigma_u^2}{1-\gamma^2}. \qquad (65)

Let \beta_0 = \beta_1 = \beta/2 and \sigma_u^2 = 1; setting \sigma_s^2 = \bar\omega and rewriting (65), we get

\sigma_v^2 = \frac{2}{\beta^2}\left[(1-\gamma^2)\bar\omega - \gamma^2\right]\frac{(1-\rho)(1-\gamma\rho)}{1+\gamma}. \qquad (66)
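The variance expression of (62)-(63) can be checked numerically by simulating (61) at the design values γ = ρ = 0.5, β_0 = β_1 = 0.25, σ_v² = σ_u² = 1 and comparing the simulated variance of y*_it with the closed form (an illustrative sketch; names are hypothetical):

```python
import numpy as np

gamma, rho, b0, b1, sv, su = 0.5, 0.5, 0.25, 0.25, 1.0, 1.0
rng = np.random.default_rng(3)
n, burn = 400_000, 1_000

# simulate x_it and y*_it of (61) as one long stationary series
x = np.zeros(n + burn)
y = np.zeros(n + burn)
for t in range(1, n + burn):
    x[t] = rho * x[t - 1] + rng.normal(0.0, sv)
    y[t] = gamma * y[t - 1] + b0 * x[t] + b1 * x[t - 1] + rng.normal(0.0, su)
y = y[burn:]

# closed form from (62) and (63), with sigma_x^2 = sv^2 / (1 - rho^2)
sx2 = sv**2 / (1 - rho**2)
cov = (b0 + rho * b1) * sv**2 / ((1 - gamma * rho) * (1 - rho**2))
var_y = ((b0 * rho + b1)**2 * sx2 + b0**2 * sv**2 + su**2
         + 2 * gamma * (b0 * rho + b1) * cov) / (1 - gamma**2)

print(var_y)      # 2.0 at these parameter values
print(y.var())    # simulated counterpart, close to 2.0
```
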
C Wald rejection frequencies
Table 8: Rejection frequencies of the Wald test under the null hypothesis β_1 = 0.

                 W1                   W2
u_it     γ \ T   5     7     9        5     7     9
N=100
(a) ε_it  0.1    0.086 0.092 0.099    0.187 0.308 0.525
          0.5    0.090 0.106 0.114    0.194 0.315 0.539
          0.9    0.072 0.078 0.084    0.171 0.286 0.503
N=200
(b) ε_it  0.1    0.071 0.079 0.080    0.114 0.178 0.247
          0.5    0.075 0.076 0.083    0.122 0.179 0.252
          0.9    0.067 0.073 0.075    0.114 0.174 0.249

R = 5000 simulation replications. Design parameter values: β = 1 − γ, β_0 = β_1 = β/2, π = 0, ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.