Master’s Thesis
Misspecification Tests in Dynamic Panel
Data Models and Improvements through
Bootstrapping
Tim Gabel
Student number: 6149812
Date of final version: September 8, 2015
Master’s programme:
Econometrics
Specialisation: Econometrics
Supervisor: Prof. dr. J.F. Kiviet
Second reader: Dr. M. Pleus
Statement of Originality
This document is written by Student Tim Gabel who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.
Abstract
This paper investigates the finite sample properties of several misspecification tests for linear dynamic panel data models after generalised method of moments estimation. The tests examined are the overidentifying restrictions test and the $m_1$ and $m_2$ tests of Arellano and Bond (1991). Several model designs are considered to examine the behaviour in a broad landscape. Overall the $m_1$ and $m_2$ tests perform well, but the overidentifying restrictions test shows poor finite sample behaviour in almost all cases considered. A bootstrap method for panel data is used to mitigate these shortcomings. Especially for the overidentifying restrictions test the bootstrap provides some promising finite sample improvements.
1 Introduction
If it were possible to predict the future or, put more mildly, to make an educated guess about present or future events, it would be through the use of data. The increase in the collection of this form of information over the last decades has stimulated the use of new forms of quantitative analysis. Panel data analysis in particular has gained popularity over the last 40 years, due to the large quantities of data that have become available. The major advantage of panel data compared to single indexed variables is that panels provide an opportunity to deal with unobserved heterogeneity that is correlated with the explanatory variables. One of the most prominent methods to process this information is the Generalised Method of Moments (GMM) estimation technique. This method has been widely used in economic analysis, for example of labour participation, cross-country growth convergence and government behaviour, among many others. The claimed flexibility, generality, robustness, efficiency and ease of use are the main features that give this technique its reputation.
The most commonly used implementations of the GMM techniques are the ones introduced by Arellano and Bond (1991). However, since these techniques first appeared, many practical problems have come to light. Examples are weakness of instruments, size distortion of the test statistics, vulnerability due to the abundance of internal instruments and insignificant improvements of the 2-step GMM estimator compared to the 1-step GMM estimator. In applied research it frequently happens that the significance and plausibility of coefficient estimates are taken as a benchmark to select models and techniques. However, using invalid instruments or imposing wrong coefficient restrictions will, in most cases, lead to standard errors that are relatively small. These standard errors then give misleading information about the precision of the estimators, and hence seriously biased estimators are often wrongly assumed to be precise.
It is a well known fact that the validity of GMM estimation results depends strongly on the validity of the moment conditions. In a more general context, the estimation results are only reliable if the model is correctly specified. To check whether this is the case, several tests have been constructed. A test for general misspecification of the model is proposed by Sargan (1958) and Hansen (1982); this test checks the validity of the overidentifying restrictions. Another common phenomenon that points to model misspecification is serial correlation in the error terms. Error serial correlation will invalidate subsets of the moment conditions and will bias the coefficient estimates. Therefore it is important to identify serial correlation in the idiosyncratic error terms. Arellano and Bond (1991) proposed first and second order serial correlation tests of the first differenced errors, called the $m_1$ and $m_2$ test, respectively. An alternative introduced by the same authors is the Sargan's difference test, which examines nested hypotheses concerning serial correlation in a sequential way. The Sargan's difference test is a more refined version of the overidentifying restrictions test by Sargan (1958) and Hansen (1982). As indicated earlier, these tests have become part of the standard tools in applied research.
Under misspecifications that cause serial correlation in the error terms, it is expected that the probability of rejecting the null hypothesis of no misspecification tends to one as the sample size goes to infinity. However, this does not always happen: some studies show relatively poor finite sample behaviour. Bowsher (2002) and Windmeijer (2005) show that the overidentifying restrictions test does not reject the null often enough; only when the number of time series observations is very low does this test perform well. This means the power of the overidentifying restrictions test can be very low in finite samples. Bowsher (2002) makes use of homoskedastic models. Windmeijer (2005) implements heteroskedasticity but does not include a lagged dependent variable. Furthermore, Yamagata (2008) introduces a variation of the $m_j$ test of Arellano (1993). This joint test, called the $m^2_{2,p}$ test, tests for second to $p$-th order error serial correlation and can serve as a general misspecification test. Yamagata (2008) compares the $m_2$ test, the Sargan's difference test and the $m^2_{2,p}$ test using a homoskedastic model and a simple heteroskedastic model, with several forms of serial correlation. Arellano and Bond (1991) perform some Monte Carlo simulations to examine the finite sample properties of the $m_1$ and $m_2$ tests using a model with heteroskedasticity and an exogenous regressor. However, these simulations are on too small a scale to produce precise results. From this we may conclude that knowledge about the performance of serial correlation tests, especially the $m_1$ and $m_2$ tests, when they have to deal with heteroskedasticity, skewness and serial correlation, is still scarce.
These inference methods are all based on asymptotic approximations. The use of the bootstrap, to produce bootstrap critical values, is an alternative that can be used in cases where the standard asymptotic critical values do not work well. Hall and Horowitz (1996) introduce a block bootstrap providing asymptotic refinement for dependent data. Gonçalves and Kilian (2004) and Godfrey (2005) both apply the Wild bootstrap to obtain improved inference in time series, whereas Cameron, Gelbach and Miller (2008) use an adjusted version of the Wild bootstrap for clustered data. Kapetanios (2008) compares different resampling schemes for bootstrapping in a panel data setting. However, there is still little literature on bootstrapping misspecification tests in a panel data setting.
Therefore this paper examines the behaviour of the $m_1$, $m_2$ and overidentifying restrictions tests in a Monte Carlo simulation study which uses a data generating process (DGP) that includes a lagged dependent variable and an extra explanatory variable. Heteroskedastic, skewed and serially correlated error terms are introduced to investigate the robustness of the tests and their behaviour under serial correlation, respectively. Furthermore, it is examined how the tests behave when an extra lagged explanatory variable in the DGP is not taken into account in the estimation procedure. A bootstrap method for panel data GMM is discussed and examined for its potential to improve the tests in circumstances where the standard test procedures fail to produce accurate inference.
The $m_1$ and $m_2$ tests perform satisfactorily in all cases, except for the one with a combination of heteroskedasticity and skewness in the error term distribution. The overidentifying restrictions test shows poor finite sample behaviour, especially when $N$ is relatively small and $T$ relatively large. The bootstrap method improves the overidentifying restrictions test substantially in the small $N$, large $T$ case. For the $m_2$ test the improvements are less substantial.
This paper is structured as follows. In Section 2 the basic GMM results are derived; this introduces the notation used in this paper and supports the later derivations. Furthermore, the derivations of the relevant test statistics are given and the bootstrap technique is described. Section 3 introduces the model and the assumptions made. In Section 4 the Monte Carlo simulation is explained. Section 5 provides the results of the Monte Carlo simulation and Section 6 summarizes the paper and concludes.
2 Panel GMM estimator and test statistics
In this section the basic results for Instrumental Variable (IV) and GMM estimation techniques in panel data are discussed. This is done to clarify the notation used and to introduce the estimation procedure and inference techniques. First the general model and its estimators are introduced. Thereafter, the overidentifying restrictions test and the $m_1$ and $m_2$ tests are described. Throughout this paper linear models are considered.
2.1 Basic model and the resulting estimators
We start with defining the basic model and the orthogonality conditions used for the IV and GMM procedures. One of the main advantages of panel data is the opportunity to deal with time-constant unobserved heterogeneity. This is included in the model through the term $\alpha_i$. The basic panel data model is
$$y_{it} = \alpha_i + x_{it}'\beta + u_{it}, \qquad (1)$$
with $i = 1, \ldots, N$, $t = 1, \ldots, T$, where $x_{it}$ is a $K \times 1$ vector of regressors, $\beta$ is a $K \times 1$ vector of coefficients, and $y_{it}$ and $u_{it}$ are scalars. To deal with the unobserved heterogeneity $\alpha_i$ in models where no restrictions are imposed on its correlation with the regressors, the model is transformed by taking first differences, giving
$$\Delta y_{it} = \Delta x_{it}'\beta + \Delta u_{it}, \qquad (2)$$
with $i = 1, \ldots, N$, $t = 2, \ldots, T$, $\Delta y_{it} = y_{it} - y_{i,t-1}$, $\Delta x_{it} = x_{it} - x_{i,t-1}$ and $\Delta u_{it} = u_{it} - u_{i,t-1}$. By stacking the remaining $T-1$ observations the model can be written as
$$\Delta y_i = \Delta X_i\beta + \Delta u_i \;\Leftrightarrow\; \dot{y}_i = \dot{X}_i\beta + \dot{u}_i, \qquad (3)$$
where $\dot{X}_i$ is a $(T-1) \times K$ matrix of first differenced regressors and $\dot{y}_i$ and $\dot{u}_i$ are $(T-1) \times 1$ vectors consisting of first differenced terms,
$$\dot{y}_i = \begin{pmatrix} \dot{y}_{i2} \\ \vdots \\ \dot{y}_{iT} \end{pmatrix}, \qquad \dot{u}_i = \begin{pmatrix} \dot{u}_{i2} \\ \vdots \\ \dot{u}_{iT} \end{pmatrix}, \qquad \dot{X}_i = \begin{pmatrix} \dot{x}_{i2}' \\ \vdots \\ \dot{x}_{iT}' \end{pmatrix}.$$
Using this notation the moment conditions to be exploited can be expressed as
$$E[Z_i'(\dot{y}_i - \dot{X}_i\beta)] = 0, \qquad (4)$$
where $Z_i$ is a $(T-1) \times L$ matrix of instruments, with $L$ the total number of instruments available. The validity of the individual instruments depends on the correlation between the explanatory variables and $u_{i,t}$ and $u_{i,t-1}$. This will be discussed in more detail in Section 3.
In the special case $L = K$ the sample moments
$$\frac{1}{N}\sum_{i=1}^N Z_i'(\dot{y}_i - \dot{X}_i\hat\beta) = 0 \qquad (5)$$
can be solved, giving the panel IV estimator
$$\hat\beta_{PIV} = \left(\sum_{i=1}^N Z_i'\dot{X}_i\right)^{-1}\left(\sum_{i=1}^N Z_i'\dot{y}_i\right). \qquad (6)$$
In the more general case, when $L > K$, finding an estimator comes down to minimizing the following quadratic form with respect to $\beta$,
$$Q_N(\beta) = \left(N^{-1}\sum_{i=1}^N (\dot{y}_i - \dot{X}_i\beta)'Z_i\right) W_N \left(N^{-1}\sum_{i=1}^N Z_i'(\dot{y}_i - \dot{X}_i\beta)\right).$$
Here $W_N$ is an $L \times L$ weighting matrix which is discussed in more detail later on. After some algebra, minimization yields the panel GMM estimator
$$\hat\beta_{PGMM} = \left[\left(\sum_{i=1}^N \dot{X}_i'Z_i\right) W_N \left(\sum_{i=1}^N Z_i'\dot{X}_i\right)\right]^{-1}\left(\sum_{i=1}^N \dot{X}_i'Z_i\right) W_N \left(\sum_{i=1}^N Z_i'\dot{y}_i\right). \qquad (7)$$
Assumption (4) is essential for the consistency of this estimator. The exact form of the instrument matrix is discussed in detail in Section 3.
It is convenient to rewrite (7) in a more compact form by stacking the $N$ blocks. This results in
$$\hat\beta_{PGMM} = \left(\dot{X}'ZW_NZ'\dot{X}\right)^{-1}\dot{X}'ZW_NZ'\dot{y}, \qquad (8)$$
with $\dot{y} = (\dot{y}_1' \ldots \dot{y}_N')'$, $\dot{X} = (\dot{X}_1' \ldots \dot{X}_N')'$ and $Z = (Z_1' \ldots Z_N')'$. Assuming the Central Limit Theorem (CLT) for the sample moment conditions holds,
$$\frac{1}{\sqrt{N}}\sum_{i=1}^N Z_i'\dot{u}_i \xrightarrow{d} N(0, S).$$
To make use of this CLT, independence over $i$ is assumed. By combining this CLT and (7) it can be shown that $\hat\beta_{PGMM}$ is consistent and asymptotically normally distributed. A consistent estimator of the asymptotic variance matrix of $\hat\beta_{PGMM}$ is
$$\hat{V}[\hat\beta_{PGMM}] = (\dot{X}'ZW_NZ'\dot{X})^{-1}\dot{X}'ZW_N(N\hat{S})W_NZ'\dot{X}(\dot{X}'ZW_NZ'\dot{X})^{-1}. \qquad (9)$$
The asymptotic variance matrix of the moment conditions, $S$, can be consistently estimated by
$$\hat{S} = \frac{1}{N}\sum_{i=1}^N Z_i'\hat{\dot{u}}_i\hat{\dot{u}}_i'Z_i, \qquad (10)$$
where $\hat{\dot{u}}_i = \dot{y}_i - \dot{X}_i\hat\beta$ is a $(T-1) \times 1$ first differenced residual vector. Now two cases can be distinguished. In both cases the asymptotically optimal choice for $W_N$ is an expression whose probability limit is proportional to the inverse of the asymptotic variance matrix $S$. First, consider the case where $u_i$ is homoskedastic, $\dot{u}_i \sim N(0, \sigma_u^2 H)$, where $H$ is defined in Section 3. Then the optimal choice is proportional to the inverse of $\sum_{i=1}^N Z_i'HZ_i$. This results in the panel GMM estimator
$$\hat\beta_{PGMM} = \left(\dot{X}'Z(Z'HZ)^{-1}Z'\dot{X}\right)^{-1}\dot{X}'Z(Z'HZ)^{-1}Z'\dot{y}. \qquad (11)$$
In the second case we have $\dot{u}_i \sim N(0, \dot\Omega_i)$, where $\dot\Omega_i$ is a $(T-1) \times (T-1)$ band matrix, and the errors may now be heteroskedastic. The result is the panel GMM estimator
$$\hat\beta_{PGMM} = \left(\dot{X}'ZS^{-1}Z'\dot{X}\right)^{-1}\dot{X}'ZS^{-1}Z'\dot{y}. \qquad (12)$$
In practice $\dot\Omega$, and so $S^{-1}$, is unknown, hence it is not possible to use the above estimator. To overcome this problem the two-step procedure can be used. In the first round use $W_N^{(0)}$ to obtain $\hat\beta^{(1)}_{PGMM}$. This estimator is not optimal if $W_N^{(0)} \neq S^{-1}$, but under the assumptions made it is consistent. Now form the consistent residuals
$$\hat{\dot{u}}^{(1)} = \dot{y} - \dot{X}\hat\beta^{(1)}_{PGMM}, \qquad (13)$$
which make it possible to obtain an expression for $\hat{S}^{-1}$. Because of consistency we now have $\operatorname{plim} \hat{S}^{-1} = S^{-1}$. Using $W_N^{(1)} = \hat{S}^{-1}$ and substituting this into (12) gives the two-step GMM estimator
$$\hat\beta^{(2)}_{PGMM} = \left(\dot{X}'ZW_N^{(1)}Z'\dot{X}\right)^{-1}\dot{X}'ZW_N^{(1)}Z'\dot{y}. \qquad (14)$$
This estimator will converge to estimator (12), thus it is also asymptotically optimal.
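The one-step/two-step procedure above can be condensed into a short numerical sketch. The following is a minimal NumPy illustration, not code from the thesis; the function names and the list-of-per-unit-arrays data layout are assumptions made for exposition, and `W0` plays the role of $W_N^{(0)}$.

```python
import numpy as np

def panel_gmm(Xd_list, yd_list, Z_list, W):
    """Generic panel GMM estimator (8): (Xd'Z W Z'Xd)^{-1} Xd'Z W Z'yd,
    built from per-unit blocks Xd_i, yd_i, Z_i."""
    XZ = sum(Xd.T @ Z for Xd, Z in zip(Xd_list, Z_list))   # sum_i Xd_i' Z_i
    Zy = sum(Z.T @ yd for yd, Z in zip(yd_list, Z_list))   # sum_i Z_i' yd_i
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ Zy)

def two_step_gmm(Xd_list, yd_list, Z_list, W0):
    """Two-step GMM: first step with W0, second with W1 = S_hat^{-1}, eq. (14)."""
    b1 = panel_gmm(Xd_list, yd_list, Z_list, W0)
    N = len(Z_list)
    # S_hat from first-step residuals, equation (10)
    S = sum(Z.T @ np.outer(yd - Xd @ b1, yd - Xd @ b1) @ Z
            for Xd, yd, Z in zip(Xd_list, yd_list, Z_list)) / N
    return panel_gmm(Xd_list, yd_list, Z_list, np.linalg.inv(S))
```

Note that when $L = K$ (exact identification) the estimator does not depend on the weighting matrix, so the one-step and two-step results coincide; with $L > K$ they generally differ.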
2.2 Test procedures for misspecification
The first test discussed is the Sargan-Hansen test of overidentifying restrictions, called OIR from now on. The OIR test can be used for over-identified models with more moment conditions ($L$) than coefficients ($K$). The test, originally developed by Sargan (1958) and Hansen (1982), is based on the asymptotic normality of the moment conditions under the null hypothesis of no misspecification. In other words, $E[Z_i'\dot{u}_i] = 0$ should hold. This assumption can be used to construct the test statistic
$$OIR = \left(\sum_{i=1}^N \hat{\dot{u}}_i'Z_i\right)(N\hat{S})^{-1}\left(\sum_{i=1}^N Z_i'\hat{\dot{u}}_i\right), \qquad (15)$$
where $\hat{\dot{u}}_i = \dot{y}_i - \dot{X}_i\hat\beta^{(2)}_{PGMM}$ and $\hat{S}$ is given in (10). Heteroskedasticity and correlation of $u_i$ over $t$ for a given $i$ are allowed, provided the moment conditions still hold.

Using the result of asymptotic normality, it is relatively straightforward to derive that the OIR statistic is $\chi^2(L-K)$ distributed under the null. The instrument matrix $Z$ does not need to be the optimal set of instruments; here $L$ just refers to the number of columns in $Z$, provided $L > K$. This test can be computed when the two-step estimator is used. If the one-step GMM estimator is used, some modifications are needed.
Even when the test has good size control, insignificant test statistics are certainly not a guarantee that the population moment conditions are valid. As explained earlier, the test checks only $L-K$ moment conditions, upon assuming that $K$ instruments are valid anyway. In case of rejection, there is no indication which instruments cause the rejection (Arellano and Bond (1991) and Cameron and Trivedi (2005)).
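As a minimal illustration of (15) — again a NumPy sketch with hypothetical names, not the thesis's own code — the statistic only needs the per-unit residual vectors and instrument blocks; the resulting value would be compared with the $\chi^2(L-K)$ critical value.

```python
import numpy as np

def oir_stat(resid_list, Z_list):
    """Sargan-Hansen OIR statistic (15) from two-step residuals and instruments.
    resid_list[i] is the (T-1)-vector of residuals of unit i, Z_list[i] its Z_i."""
    g = sum(Z.T @ u for u, Z in zip(resid_list, Z_list))                    # sum_i Z_i' u_i
    NS = sum(Z.T @ np.outer(u, u) @ Z for u, Z in zip(resid_list, Z_list))  # N * S_hat
    return float(g @ np.linalg.solve(NS, g))
```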
The second test procedure is the $m_j$ test, with $j = 1, 2$, of Arellano and Bond (1991). Details of the following derivation can be found in Arellano (2003). The $m_j$ statistics are tests of significance of the average $j$-th order autocovariance $r_j$,
$$r_j = \frac{1}{T-1-j}\sum_{t=2+j}^T r_{tj}, \qquad (16)$$
with $r_{tj} = E[\dot{u}_{it}\dot{u}_{i(t-j)}]$. The sum starts at $t = 2+j$ because the model is in first differences, so that $\dot{u}_{i2}$ is the first available residual. It is important to note that if a lagged dependent variable is included, as is done in Section 3, the starting point becomes $t = 3+j$ and hence the factor $1/(T-2-j)$ should be used. The sample counterpart of $r_{tj}$, based on the first-differenced residuals $\hat{\dot{u}}_{it}$, is
$$\hat{r}_{tj} = \frac{1}{N}\sum_{i=1}^N \hat{\dot{u}}_{it}\hat{\dot{u}}_{i(t-j)}, \qquad (17)$$
where $\hat{\dot{u}}_{it} = \dot{y}_{it} - \dot{x}_{it}'\hat\beta$, and $\hat{r}_j$ denotes the corresponding average of the $\hat{r}_{tj}$. The test statistic is given by
$$m_j = \frac{\hat{r}_j}{SE(\hat{r}_j)}. \qquad (18)$$
To form an expression for $SE(\hat{r}_j)$ we start with writing the first-differenced residuals as $\hat{\dot{u}}_{it} = \dot{u}_{it} - \dot{x}_{it}'(\hat\beta - \beta)$. Here $\dot{x}_{it}$ and $\hat\beta - \beta$ denote the vectors of right-hand-side variables and parameter estimation errors, respectively. This in turn can be rewritten as $\hat\beta - \beta = P_N N^{-1}\sum_{i=1}^N Z_i'\dot{u}_i$, which gives an implicit expression for $P_N$. Some algebra results in the explicit expression
$$P_N = N\left(\sum_{i=1}^N \dot{X}_i'Z_i\, W_N \sum_{i=1}^N Z_i'\dot{X}_i\right)^{-1}\sum_{i=1}^N \dot{X}_i'Z_i\, W_N. \qquad (19)$$
The other building blocks are
$$g_N' = \frac{1}{N(T-1-j)}\sum_{i=1}^N\sum_{t=2+j}^T \left(\dot{u}_{it}\dot{x}_{i(t-j)}' + \dot{u}_{i(t-j)}\dot{x}_{it}'\right) \qquad (20)$$
and
$$\zeta_{ji} = \frac{1}{T-1-j}\sum_{t=2+j}^T \dot{u}_{it}\dot{u}_{i(t-j)}. \qquad (21)$$
Under the null hypothesis, $H_0: r_j = 0$, the estimated residual autocovariance may now be written as
$$\sqrt{N}\,\hat{r}_j = \begin{pmatrix} 1 & -g_N'P_N \end{pmatrix}\frac{1}{\sqrt{N}}\sum_{i=1}^N \begin{pmatrix} \zeta_{ji} \\ Z_i'\dot{u}_i \end{pmatrix} + o_p(1). \qquad (22)$$
Using a standard CLT and the information above we may conclude that under the null $r_j = 0$,
$$\hat{V}(\hat{r}_j)^{-\frac12}\sqrt{N}\,\hat{r}_j \xrightarrow{d} N(0, 1),$$
where
$$\hat{V}(\hat{r}_j) = \begin{pmatrix} 1 & -g_N'P_N \end{pmatrix}\frac{1}{N}\sum_{i=1}^N \begin{pmatrix} \hat\zeta_{ji}^2 & \hat\zeta_{ji}\hat{\dot{u}}_i'Z_i \\ Z_i'\hat{\dot{u}}_i\hat\zeta_{ji} & Z_i'\hat{\dot{u}}_i\hat{\dot{u}}_i'Z_i \end{pmatrix}\begin{pmatrix} 1 \\ -P_N'g_N \end{pmatrix}. \qquad (23)$$
See Arellano and Bond (1991) for a proof of the asymptotic normality results. This provides an expression for $SE(\hat{r}_j) = \sqrt{\hat{V}(\hat{r}_j)}$, and because of the asymptotic normal distribution it is straightforward that $m_j \xrightarrow{d} N(0, 1)$. The $m_j$ criterion is rather flexible, in that it can be defined in terms of any consistent GMM estimator, not necessarily an efficient one. However, the asymptotic power of the $m_j$ test will depend on the efficiency of the estimators used (Arellano (2003) and Arellano and Bond (1991)).
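The numerator $\hat{r}_j$ of (18) is straightforward to compute. The sketch below is a hypothetical Python helper, not thesis code; it assumes the residual columns correspond to $t = 3, \ldots, T$ as in Section 3, so that the average uses the factor $1/(T-2-j)$. It delivers only the numerator; the denominator $SE(\hat{r}_j)$ requires the full expression (23).

```python
import numpy as np

def r_hat(resid, j):
    """Average j-th order residual autocovariance, eqs. (16)-(17).
    resid : (N, P) matrix of first-differenced residuals, columns t = 3,...,T
    (so P = T-2 and the average runs over P-j terms)."""
    P = resid.shape[1]
    r_tj = (resid[:, j:] * resid[:, :P - j]).mean(axis=0)  # r_hat_{tj}, eq. (17)
    return float(r_tj.mean())                              # average over t
```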
These tests are compared with the Wald test, which tests whether the coefficients are equal to a certain value. The Wald test statistic is
$$W = (R\hat\beta_{PGMM} - r)'\left(R\hat{V}[\hat\beta_{PGMM}]R'\right)^{-1}(R\hat\beta_{PGMM} - r), \qquad (24)$$
where $R$ is an $h \times K$ matrix and $r$ an $h \times 1$ vector of constants, $h$ is the number of restrictions tested and $\hat{V}[\hat\beta_{PGMM}]$ is given in (9). Under the null, $W \xrightarrow{d} \chi^2(h)$.
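Statistic (24) translates directly into code; the following one-liner (an illustrative helper with hypothetical names, not from the thesis) takes the coefficient estimate, its estimated variance matrix from (9) and the restrictions $(R, r)$.

```python
import numpy as np

def wald_stat(beta_hat, V_hat, R, r):
    """Wald statistic (24) for H0: R beta = r; chi-squared with h df under H0."""
    d = R @ beta_hat - r                                  # deviation from H0
    return float(d @ np.linalg.solve(R @ V_hat @ R.T, d))
```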
2.3 Wild bootstrap
The bootstrap uses the estimation sample as if it were the population. Repeatedly resampling the data results in several bootstrap samples, and in each bootstrap round test statistics based on these bootstrap samples can be computed. This process yields bootstrap empirical distributions of the test statistics. Hence, bootstrap estimates of the $\alpha$-level critical values of the $m_1$, $m_2$ and OIR tests are the $1-\alpha$ quantiles of the bootstrap empirical distributions of the tests.

The bootstrap technique considered is a Wild bootstrap for panel data. Even when the residuals are heteroskedastic and correlated over $t$ for given $i$, this bootstrap should provide an asymptotic refinement in a linear model if the panel is short (Cameron and Trivedi (2005)). Let
$$\xi_{it} = \begin{cases} -(\sqrt{5}-1)/2 & \text{with probability } (\sqrt{5}+1)/(2\sqrt{5}), \\ (\sqrt{5}+1)/2 & \text{otherwise,} \end{cases}$$
hence $E(\xi_{it}) = 0$, $E(\xi_{it}^2) = 1$ and $E(\xi_{it}^3) = 1$. Essential for the validity of the bootstrap procedure is that the assumptions $E(\xi_{it}) = 0$ and $E(\xi_{it}^2) = 1$ hold. The third condition, concerning the third moment of the distribution of $\xi_{it}$, is suggested by Liu (1988), where it is shown that if $E(\xi_{it}^3) = 1$, the wild bootstrap enjoys second-order properties and the first three moments of the test statistic used are estimated correctly to $O(T^{-1})$. The following steps describe the Wild bootstrap procedure.
(i) Use the GMM residuals $\hat{\dot{u}}_{it}$ and generate
$$\dot{u}^*_{it} = \xi_{it}\hat{\dot{u}}_{it}, \qquad (25)$$
where $i = 1, \ldots, N$ and $t = 3, \ldots, T$.
(ii) Using (25), pseudo data are generated by
$$\dot{y}^*_{it} = \dot{x}_{it}'\hat\beta + \dot{u}^*_{it}, \qquad (26)$$
where the actual estimation sample starting values are used for the bootstrap sample starting values (Li and Maddala (1996)). The original $x_{it}$ are used in this procedure. One assumption used in (26) is that the $x_{it}$ are strictly exogenous. This is taken into account and is explained in Section 4.
(iii) By making use of the bootstrap sample the regression model is estimated and the associated values of the test statistics $m_1^*$, $m_2^*$ and $OIR^*$ are calculated. The bootstrap OIR becomes
$$OIR^* = \left(\sum_{i=1}^N \hat{\dot{u}}_i^{*\prime}Z_i^*\right)(N\hat{S}^*)^{-1}\left(\sum_{i=1}^N Z_i^{*\prime}\hat{\dot{u}}_i^*\right), \qquad (27)$$
where $\hat{\dot{u}}_i^* = \dot{y}_i^* - \dot{X}_i\hat\beta^*_{PGMM}$, $\hat{S}^* = N^{-1}\sum_{i=1}^N Z_i^{*\prime}\hat{\dot{u}}_i^*\hat{\dot{u}}_i^{*\prime}Z_i^*$ and $Z_i^*$ is the instrument matrix created using the bootstrap sample. The bootstrap $m_j$ test becomes
$$m_j^* = \frac{\hat{r}_j^*}{SE(\hat{r}_j^*)}, \qquad (28)$$
where $\hat{r}_j^*$ is the analogue of $\hat{r}_j$ computed from the bootstrap residuals given above.
(iv) Now repeat (i), (ii) and (iii) $B$ times, such that $B$ values of the test statistics are obtained. These values can be used to acquire the $1-\alpha$ quantiles of the bootstrap empirical distributions of the tests, which are the bootstrap estimates of the $\alpha$-level critical values, as explained previously (Godfrey and Tremayne (2005)).
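The four steps above can be sketched as one loop. The code below is a minimal NumPy illustration with hypothetical names (it is not the thesis's implementation); `stat_fn` stands in for whichever of the three test statistics is being bootstrapped, and the regressors are kept fixed, reflecting the strict exogeneity assumed in step (ii).

```python
import numpy as np

# Two-point distribution of Liu (1988) used for the wild bootstrap weights:
A = -(np.sqrt(5) - 1) / 2                 # drawn with probability P
B = (np.sqrt(5) + 1) / 2                  # drawn with probability 1 - P
P = (np.sqrt(5) + 1) / (2 * np.sqrt(5))   # so that E(xi)=0, E(xi^2)=E(xi^3)=1

def wild_bootstrap_crit(resid, xd, theta_hat, stat_fn, B_reps=199, alpha=0.05, seed=0):
    """Bootstrap alpha-level critical value: the 1-alpha quantile of the
    statistic recomputed on B_reps wild-bootstrap samples (steps (i)-(iv)).

    resid : (N, T-2) first-differenced GMM residuals
    xd    : (N, T-2, K) first-differenced regressors, kept fixed
    stat_fn(yd_star, xd) recomputes the test statistic on the pseudo data.
    """
    rng = np.random.default_rng(seed)
    stats = np.empty(B_reps)
    for b in range(B_reps):
        xi = np.where(rng.random(resid.shape) < P, A, B)  # weights xi_it
        u_star = xi * resid                               # step (i), eq. (25)
        yd_star = xd @ theta_hat + u_star                 # step (ii), eq. (26)
        stats[b] = stat_fn(yd_star, xd)                   # step (iii)
    return np.quantile(stats, 1 - alpha)                  # step (iv)
```

The constants `A`, `B`, `P` satisfy the three moment conditions on $\xi_{it}$ exactly, which can be checked by direct computation.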
3 The model
3.1 Model and assumptions
To examine the behaviour of the OIR, $m_1$ and $m_2$ tests under different conditions, the following model is used:
$$y_{it} = \alpha_i + \gamma y_{i,t-1} + \beta x_{it} + u_{it}, \qquad (29)$$
with $i = 1, \ldots, N$, $t = 2, \ldots, T$, where the $\alpha_i$ are individual specific effects which are constant over time but differ across $i$, such that $\alpha_i \sim iid(0, \sigma_\alpha^2)$. Through this term unobserved heterogeneity can be captured. The $x_{it}$ are considered to be predetermined, and hence the current value and lagged values are uncorrelated with the current error term $u_{it}$. The same holds for the lagged dependent variable, $y_{i,t-1}$, which can also be classified as predetermined. Notice that $u_{it}$ is an error term which can take several forms, as explained in Section 4. The focus in this paper lies on micro panels: panels with relatively few time-series observations $T$ and a large number of cross-sectional observations $N$. Therefore, the asymptotic approximations will be for $N \to \infty$ and finite $T$. In order to proceed, the following assumptions are made.
Assumption 1. $\{y_{i1}, \ldots, y_{iT}, x_{i1}, \ldots, x_{iT}\}_{i=1}^N$ is a sequence of independently and identically distributed random variables.

Assumption 2. (i) $\{u_{it}\}_{t=1}^T$ is an independently distributed sequence, with mean zero and strictly positive variance $\sigma_u^2$. That is, for all $i, j$ and $t \neq s$,
$$E[u_{it}] = 0, \qquad E[u_{it}^2] = \sigma_u^2, \qquad E[u_{it}u_{js}] = 0.$$
(ii) The classification of the two regressors implies, for all $i, t, s$ and $l \geq 0$,
$$E[x_{it}u_{i,t+l}] = 0, \qquad E[y_{i,t-1}u_{i,t+l}] = 0.$$
(iii) For the coefficient of the lagged dependent variable it holds that $|\gamma| < 1$.
The first assumption is necessary to use the standard iid CLT of the previous section. Although a weaker form could be used, namely the "independently but not necessarily identically distributed" case, the stronger one is employed for ease of computation and exposition. The second assumption can be divided into three sub-assumptions: (i) says the error terms are cross-sectionally uncorrelated, but allows for heteroskedasticity and skewness; furthermore, this assumption is needed for the validity of the bootstrap; (ii) is needed in order to form the moment conditions; and (iii) makes sure that the $y_{it}$ process is stable.
We now write (29) in a form such that the techniques of the previous section can be applied. Combining the two explanatory variables into one $1 \times 2$ vector gives $\tilde{x}_{it} = (y_{i,t-1} \;\; x_{it})$. Define the $2 \times 1$ vector of coefficients $\theta = (\gamma \;\; \beta)'$. This results in the following model:
$$y_{it} = \tilde{x}_{it}\theta + \alpha_i + u_{it}. \qquad (30)$$
To get rid of the individual specific effects, first differences of (30) are taken. This results in
$$\dot{y}_{it} = \dot{\tilde{x}}_{it}\theta + \dot{u}_{it}, \qquad (31)$$
with $i = 1, \ldots, N$, $t = 3, \ldots, T$, $\dot{y}_{it} = y_{it} - y_{i,t-1}$, $\dot{\tilde{x}}_{it} = \tilde{x}_{it} - \tilde{x}_{i,t-1}$ and $\dot{u}_{it} = u_{it} - u_{i,t-1}$ as before.
Now stack the observations over $T$, which gives
$$Dy_i = D\tilde{X}_i\theta + Du_i \;\Leftrightarrow\; \dot{y}_i = \dot{\tilde{X}}_i\theta + \dot{u}_i, \qquad (32)$$
where $\dot{y}_i = (\dot{y}_{i3} \cdots \dot{y}_{iT})'$, $\dot{\tilde{X}}_i = (\dot{\tilde{x}}_{i3}' \cdots \dot{\tilde{x}}_{iT}')'$ and $\dot{u}_i = (\dot{u}_{i3} \cdots \dot{u}_{iT})'$. The first-difference matrix $D$, which here has dimension $(T-2) \times (T-1)$ and operates on the observations stacked over $t = 2, \ldots, T$, is of the form
$$D = \begin{pmatrix} -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & -1 & 1 \end{pmatrix}. \qquad (33)$$
The techniques from Section 2 can be applied to the model written as in the right-hand side of (32).
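The matrix $D$ of (33) is easy to build and to check numerically; the helper below is an illustrative NumPy sketch (hypothetical name, not thesis code) that maps the stacked levels for $t = 2, \ldots, T$ into the first differences for $t = 3, \ldots, T$.

```python
import numpy as np

def diff_matrix(T):
    """First-difference matrix D of (33): (T-2) x (T-1), mapping
    (y_{i2},...,y_{iT}) to (y_{i3}-y_{i2},...,y_{iT}-y_{i,T-1})."""
    D = np.zeros((T - 2, T - 1))
    for r in range(T - 2):
        D[r, r] = -1.0      # subtract the previous observation
        D[r, r + 1] = 1.0   # add the current observation
    return D
```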
3.2 The instrument matrix
Now we need to form a matrix of instrumental variables $Z_i$. The explanatory variables are all predetermined, which means that the current values and lagged values are uncorrelated with the current error term. Using the fact that the equation is in first differences, this suggests that all lagged values $y_{i,s-1}$ and $x_{is}$, with $s < t$, can be used as instruments for the explanatory variables at time $t$:
$$E[y_{i,s-1}\dot{u}_{it}] = 0, \quad s < t, \qquad (34)$$
$$E[x_{is}\dot{u}_{it}] = 0, \quad s < t, \qquad (35)$$
with $t = 3, \ldots, T$. First, form the instrument matrix for the lagged values of $y_{it}$, which is block diagonal with zeros elsewhere:
$$Z_{yi} = \begin{pmatrix} y_{i1} & & & \\ & y_{i1} \;\; y_{i2} & & \\ & & \ddots & \\ & & & y_{i1} \cdots y_{i,T-2} \end{pmatrix}. \qquad (36)$$
This is a $(T-2) \times (T-2)(T-1)/2$ or, say, $(T-2) \times T_y$ matrix. Secondly, we can construct the instrument matrix for $x_{it}$ by the same principle:
$$Z_{xi} = \begin{pmatrix} x_{i1} \;\; x_{i2} & & & \\ & x_{i1} \;\; x_{i2} \;\; x_{i3} & & \\ & & \ddots & \\ & & & x_{i1} \cdots x_{i,T-1} \end{pmatrix}, \qquad (37)$$
where the size of the matrix is $(T-2) \times (T-2)(T+1)/2 = (T-2) \times T_x$. These are all valid instruments under Assumption 2, so to obtain the complete instrument matrix they are combined, resulting in
$$Z_i = [Z_{yi} \;\; Z_{xi}], \qquad (38)$$
a $(T-2) \times L$ matrix with $L = T_y + T_x$. Further instruments are left out of consideration in this paper. In addition to the first two assumptions a third one is made.
Assumption 3. $\operatorname{rank}(Z_i'\dot{\tilde{X}}_i) = 2$.

This assumption is an identification condition (Yamagata (2008) and Cameron and Trivedi (2005)).
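The block-diagonal structure of (36)-(38) can be sketched as follows (an illustrative NumPy helper with hypothetical names, not thesis code); the column counts $(T-2)(T-1)/2$ and $(T-2)(T+1)/2$ follow from summing the number of instruments available per period.

```python
import numpy as np

def z_blocks(y_levels, x_levels):
    """Instrument matrix Z_i = [Z_yi  Z_xi] of (36)-(38) for one unit.

    y_levels, x_levels : arrays (y_{i1},...,y_{iT}) and (x_{i1},...,x_{iT}).
    Row t-3 (for t = 3,...,T) holds y_{i1},...,y_{i,t-2} and x_{i1},...,x_{i,t-1}.
    """
    T = len(y_levels)
    rows = T - 2
    Zy = np.zeros((rows, (T - 2) * (T - 1) // 2))   # T_y columns
    Zx = np.zeros((rows, (T - 2) * (T + 1) // 2))   # T_x columns
    cy = cx = 0
    for r, t in enumerate(range(3, T + 1)):
        ny, nx = t - 2, t - 1                        # instruments available at time t
        Zy[r, cy:cy + ny] = y_levels[:ny]
        Zx[r, cx:cx + nx] = x_levels[:nx]
        cy += ny
        cx += nx
    return np.hstack([Zy, Zx])
```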
3.3 The weighting matrices
To perform two-step GMM we need weighting matrices for both steps. In Assumption 2 we assumed that the $u_{it}$ are cross-sectionally and serially uncorrelated, but that they may display some heteroskedastic behaviour. This suggests $u_i \sim (0, \Omega_i)$, where
$$\Omega_i = \begin{pmatrix} \sigma_{i1}^2 & 0 & \cdots & 0 \\ 0 & \sigma_{i2}^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{iT}^2 \end{pmatrix}. \qquad (39)$$
Using (32), this suggests that $\dot{u}_i = Du_i \sim (0, \dot\Omega_i)$, where $\dot\Omega_i = D\Omega_iD'$. Hence, using the CLT as explained in Section 2, we end up with
$$N^{-\frac12}\sum_{i=1}^N Z_i'\dot{u}_i \xrightarrow{d} N(0, V), \qquad (40)$$
where $V = \operatorname{plim} \frac{1}{N}\sum_{i=1}^N Z_i'\dot\Omega_iZ_i$. The weighting matrix used for the optimal GMM estimator should be proportional to the inverse of $V$. To obtain the one-step GMM estimator assume $\Omega_i = \sigma_\epsilon^2 I$. This is incorrect under heteroskedasticity, but it will be corrected in the second step. Under this assumption the weighting matrix is
$$W_N^{(0)} = \left(\frac{1}{N}\sum_{i=1}^N Z_i'HZ_i\right)^{-1}, \qquad (41)$$
where
$$H = \begin{pmatrix} 2 & -1 & 0 & \cdots & 0 \\ -1 & 2 & -1 & \ddots & \vdots \\ 0 & -1 & 2 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & -1 \\ 0 & \cdots & 0 & -1 & 2 \end{pmatrix}. \qquad (42)$$
In model (32), this weighting matrix results in the one-step GMM estimator
$$\hat\theta^{(1)} = \left(\dot{X}'ZW_N^{(0)}Z'\dot{X}\right)^{-1}\dot{X}'ZW_N^{(0)}Z'\dot{y}, \qquad (43)$$
where the matrices are stacked as in (8). In the simulation this estimator is called AB1. Using the residuals, as explained in Section 2, results in the second-step weighting matrix
$$W_N^{(1)} = \left(\frac{1}{N}\sum_{i=1}^N Z_i'\hat{\dot{u}}_i^{(1)}\hat{\dot{u}}_i^{(1)\prime}Z_i\right)^{-1}. \qquad (44)$$
With the help of this matrix the two-step GMM estimator, named AB2 in the simulation, can be computed using (14).
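The one-step weighting matrix of (41)-(42) can be sketched directly (an illustrative NumPy helper with hypothetical names, not thesis code). Note that $H$ equals $DD'$ for the first-difference matrix $D$ of (33), which is exactly the covariance structure of homoskedastic, serially uncorrelated errors after first differencing.

```python
import numpy as np

def h_matrix(m):
    """Tridiagonal matrix H of (42), with m = T-2 in the setting of Section 3:
    2 on the diagonal and -1 on the first off-diagonals (H = D D')."""
    return 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)

def w0(Z_list, m):
    """One-step weighting matrix W_N^{(0)} of (41) from per-unit instrument blocks."""
    H = h_matrix(m)
    return np.linalg.inv(sum(Z.T @ H @ Z for Z in Z_list) / len(Z_list))
```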
4 Simulation Design
The DGP is the same as in Yamagata (2008). It concerns a panel Autoregressive Distributed Lag model of order 1 and 0 (an ARDL(1,0) model) and can be written as in (29), namely
$$y_{it} = \alpha_i + \gamma y_{i,t-1} + \beta x_{it} + u_{it}, \qquad (45)$$
where the DGP used for $x_{it}$ is
$$x_{it} = \rho x_{i,t-1} + \pi u_{i,t-1} + v_{it}, \qquad (46)$$
both with $i = 1, \ldots, N$ and $t = -49, \ldots, T$, and the $\alpha_i$ are drawn from a $N(0, \sigma_\alpha^2)$ distribution. A grid of parameter values and different values for $T$ is used to examine the behaviour of the different tests in a wide landscape. The chosen values are $\gamma \in \{0.1, 0.5, 0.9\}$, $\beta = 1 - \gamma$, and $\pi$ and $\rho$ are set to 0.5. This choice of $\beta$ means the so-called Long Run Multiplier (LRM) is $\beta/(1-\gamma) = 1$. The long run elasticity is equal to one in this way, which is an economically relevant case (Harvey (1981)). Furthermore, under no misspecification of the model we have $u_{it} = \epsilon_{it}$, with $\epsilon_{it} \sim iid\,N(0, \sigma_\epsilon^2)$ and $v_{it} \sim iid\,N(0, \sigma_v^2)$. To get starting values for $x_{i1}$ and $y_{i1}$, the process starts at $t = -49$ with $x_{i,-49} = 0$ and $y_{i,-49} = 0$ and runs until $T$; then the first 50 observations are discarded. This scheme provides the values for $x_{i1}$ and $y_{i1}$ (Yamagata (2008)).
Using the same approach as in Kiviet (1995), Bun and Kiviet (2006) and Yamagata (2008), the signal-to-noise ratio under the null, $u_{it} = \epsilon_{it}$, is controlled through $\sigma_v^2$. The signal is defined as $\sigma_s^2 = \operatorname{var}(y^*_{it} - \epsilon_{it})$, with $y^*_{it} = y_{it} - \alpha_i/(1-\gamma)$, and the signal-to-noise ratio is $\bar\omega = \sigma_s^2/\sigma_\epsilon^2$. The variance of $v_{it}$ is then
$$\sigma_v^2 = \frac{1}{\beta^2}\,\sigma_\epsilon^2\left(\frac{1+\bar\omega}{a_1} - b_1\right), \qquad (47)$$
where
$$a_1 = \frac{1+\gamma\rho}{(1-\rho^2)(1-\gamma^2)(1-\gamma\rho)} \qquad \text{and} \qquad b_1 = 1 + (\beta\pi-\rho)^2 + \frac{2(\beta\pi-\rho)(\gamma+\rho)}{1+\gamma\rho}.$$
The variance of the fixed effect is chosen such that the impact of the two variance components $\alpha_i$ and $\epsilon_{it}$ on $\operatorname{var}(y_{it})$ is constant across the several cases considered. The variance of the individual effect $\alpha_i$ is
$$\sigma_\alpha^2 = (1-\gamma)^2 a_1 b_1 \sigma_\epsilon^2. \qquad (48)$$
Furthermore, the signal-to-noise ratio $\bar\omega$ is set to 3 (Yamagata (2008)). For a more detailed derivation see Sarafidis et al. (2009).
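This calibration can be cross-checked numerically. The helper below is a hypothetical Python sketch (not thesis code) of the reconstructed formulas (47)-(48), with $\beta = 1-\gamma$; one verifiable implication, stated in Section 4 for the alternative DGP, is that for $\pi = 0$ and $\sigma_\epsilon^2 = 1$ expression (48) collapses to $(1-\gamma)/(1+\gamma)$, i.e. (55) with $\lambda = 1$.

```python
def calibrate(gamma, rho, pi, omega_bar=3.0, sigma_eps2=1.0):
    """Signal-to-noise calibration (47)-(48): returns (sigma_v^2, sigma_alpha^2).
    Assumes beta = 1 - gamma, so that the long run multiplier equals one."""
    beta = 1.0 - gamma
    a1 = (1 + gamma * rho) / ((1 - rho**2) * (1 - gamma**2) * (1 - gamma * rho))
    b1 = (1 + (beta * pi - rho)**2
            + 2 * (beta * pi - rho) * (gamma + rho) / (1 + gamma * rho))
    sigma_v2 = sigma_eps2 * ((1 + omega_bar) / a1 - b1) / beta**2   # eq. (47)
    sigma_a2 = (1 - gamma)**2 * a1 * b1 * sigma_eps2                # eq. (48)
    return sigma_v2, sigma_a2
```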
We define two categories of misspecification cases. The first, the base cases, consists of different forms of serial correlation in the error terms. The second category contains cases in which, for example, the error terms are heteroskedastic or skewed. The different designs are given below.
(i) The first case is the situation where there is no misspecification, hence
$$u_{it} = \sigma_\epsilon\epsilon_{it}, \qquad \sigma_\epsilon = 1.$$
(ii) The error terms follow an AR(1) model, where $\sigma_\epsilon$ is chosen such that the variance of $u_{it}$ is equal to one,
$$u_{it} = \rho_1 u_{i,t-1} + \sigma_\epsilon\epsilon_{it}, \qquad (49)$$
where $\sigma_\epsilon^2 = 1 - \rho_1^2$. The parameter $\rho_1$ is set to 0.2.
(iii) Now the error terms follow an MA(1) model. Again the variance of $u_{it}$ is equal to one, in order to make the different cases comparable:
$$u_{it} = \sigma_\epsilon(\epsilon_{it} + \theta_1\epsilon_{i,t-1}), \qquad (50)$$
with $\sigma_\epsilon^2 = (1+\theta_1^2)^{-1}$. $\theta_1 = 0.2$ is considered in this case.
Cases (i)-(iii) are the base cases. The following designs are the misspecifications that will be combined with the base cases. Hence, including the case of no misspecification, we end up with twelve different designs.
(a) The errors contain unconditional heteroskedasticity. For ease of computation $\pi = 0$. This does create a difference between the previous designs and the heteroskedasticity design, but because the estimation method remains the same some comparison can be made. The design is such that $\epsilon_{it}$ is heteroskedastic over both $t$ and $i$. Start with $\operatorname{var}(\epsilon_{it}) = \delta x_{it}^2$, such that $\epsilon_{it} = \sqrt{\delta}\,\eta_{it}x_{it}$, with $\eta_{it} \sim N(0, 1)$. To make the design of heteroskedasticity comparable to the other designs, the average variance of the error term should equal one. This is done by specifying $\delta$ as
$$\delta = \frac{1-\rho^2}{\sigma_v^2}. \qquad (51)$$
See Appendix A for further explanation.
(b) Skewness is introduced by making use of a gamma distribution for the error terms:
$$\epsilon_{it} \sim \Gamma(1/16, 4) - 1/4. \qquad (52)$$
In this way the distribution of the error terms is positively skewed. The skewness is exaggerated for a more visible effect. The mean and variance of the error term are, however, still zero and one.
(c) In the last case the heteroskedasticity from (a) and skewness from (b) are combined into one design. Let $\epsilon_{it} = \sqrt{\delta}\,\eta_{it}x_{it}$ and $\eta_{it} \sim \Gamma(1/16, 4) - 1/4$.
Finally, an alternative DGP is considered, and it is examined how the misspecification tests perform relative to the more specific Wald test of the null hypothesis that the coefficient β_1 equals zero. The error terms are assumed to be white noise, as in case (i). The DGP now contains an extra explanatory variable x_{i,t−1}, which is not taken into account in the estimation:

y_{it} = \alpha_i + \gamma y_{i,t-1} + \beta_0 x_{it} + \beta_1 x_{i,t-1} + u_{it}, \qquad (53)

where the DGP used for x_it is the same as in (46). For γ the same grid of values is used. However, for the other coefficients β_0 + β_1 = 1 − γ holds. This results in (β_0 + β_1)/(1 − γ) = 1, which means the LRM is again equal to one. Furthermore, we take β_0 = β_1 = β/2 and, for ease of computation, again π = 0. This specification investigates how the tests behave in the case of a relevant omitted variable, and whether a more specific coefficient test is a better alternative. The relevant formulas become

\sigma_v^2 = \frac{2}{\beta^2}\left[(1-\gamma^2)\bar\omega - \gamma^2\right]\frac{(1-\rho)(1-\gamma\rho)}{1+\gamma} \qquad (54)

and

\sigma_\alpha^2 = \mu^2\,\frac{1-\gamma}{1+\gamma}, \qquad (55)

with μ = 1. The formula for σ_α² comes from Kiviet (1995), and it can be shown that if π = 0, (48) becomes (55). Detailed derivations can be found in Appendix B.
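The coefficient restrictions of this design can be illustrated with a short sketch (the helper name is hypothetical) confirming that the long-run multiplier equals one for every γ on the grid:

```python
def lrm(gamma):
    """Long-run multiplier of x on y in DGP (53) under the design restrictions."""
    beta = 1.0 - gamma               # beta_0 + beta_1 = beta = 1 - gamma
    beta0 = beta1 = beta / 2.0
    return (beta0 + beta1) / (1.0 - gamma)

for g in (0.1, 0.5, 0.9):
    print(g, lrm(g))                 # the LRM equals one for every gamma
```
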
Monte Carlo results using these specifications may show that some misspecifications cause the tests to have rejection frequencies under the null that differ strongly from the nominal level, or rejection frequencies under the alternative that are relatively low. Using the bootstrap method discussed in Section 2.3, we attempt to remedy these shortcomings by resampling data from the true DGP. First the specific misspecifications that cause problems are identified; then the bootstrap technique is used to solve the problems. Because the Wild bootstrap, as explained in Section 2.3, needs strictly exogenous x_it, again π = 0 is used. This makes x_it strictly exogenous, so that more instruments could be used. For simplicity, however, the instrument matrix is kept the same as in Section 3.2. Of course the y_{i,t−1} are still predetermined.

In all cases a nominal significance level of five percent is used, α = 0.05. Furthermore, the cases N ∈ {100, 200} and T ∈ {5, 7, 9} are considered, and the number of Monte Carlo repetitions is R = 5000. The number of bootstrap replications is B = 199. MacKinnon (2002) reports that such a small B is not recommended in applied research, but the sampling errors associated with this value tend to cancel out in Monte Carlo experiments.
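Since the details of Section 2.3 are not reproduced here, the following is only a generic sketch of how a Wild bootstrap p-value with B = 199 could be computed for some test statistic. All helper names are hypothetical, and the actual resampling scheme of Section 2.3 (e.g. rebuilding y_it recursively from the resampled disturbances) may well differ:

```python
import numpy as np

def wild_bootstrap_pvalue(stat_fn, resid, B=199, seed=0):
    """Generic Wild bootstrap p-value (sketch only).

    stat_fn : maps an (N, T) matrix of disturbances to a scalar statistic
    resid   : (N, T) residual matrix from the original estimation
    """
    rng = np.random.default_rng(seed)
    stat0 = abs(stat_fn(resid))
    count = 0
    for _ in range(B):
        # Rademacher weight per observation: keeps each residual's magnitude
        # (hence any heteroskedasticity) but imposes the null of no serial
        # correlation on the pseudo-disturbances
        w = rng.choice([-1.0, 1.0], size=resid.shape)
        if abs(stat_fn(w * resid)) >= stat0:
            count += 1
    return (count + 1) / (B + 1)

# toy statistic: standardised first-order autocovariance of the disturbances
def toy_stat(e):
    return (e[:, 1:] * e[:, :-1]).mean() / e.var()

rng = np.random.default_rng(42)
p = wild_bootstrap_pvalue(toy_stat, rng.normal(size=(100, 5)))
print(p)        # a p-value in (0, 1]
```
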
5 Results
In this section the finite sample behaviour of the m1, m2 and OIR tests is examined for the designs described in Section 4. Where these tests fail to provide reasonable size control, a bootstrap method is used to try to mitigate the problems. In this section m21 denotes the m2 test after one-step GMM and m22 the m2 test after two-step GMM; the same convention holds for m11 and m12. In Table 1, rows (a) and (d), the actual significance levels of the m21, m22 and OIR tests are given. Note that these are the actual significance levels for the chosen parameter value combination, so they may differ when other parameter values are used. The actual significance levels of the m21 and m22 tests are satisfactory for all combinations of N and T. However, the OIR test tends to reject the null too infrequently. Especially when T becomes larger and N stays relatively small, this test shows poor rejection frequencies under the null. Overall the test becomes worse as T increases, due to an increase in the number of moment restrictions (Yamagata, 2008). These findings are consistent with those in Bowsher (2002), Windmeijer (2005) and Yamagata (2008).
Next, consider the results for the rejection frequencies under the alternative, given in rows (b), (c), (e) and (f). The m21 and m22 tests have good size control, hence we can speak of power for these tests; for the OIR test this is not the case. The model is estimated in first differences, hence the m11 and m12 tests should reject, which happens in almost one hundred percent of the cases. In general the rejection frequency is increasing in N and T. The rejection frequencies of the m21, m22 and OIR tests are smaller when the errors are AR(1) than when they are MA(1). Especially the OIR test for N = 100 displays very weak rejection rates under both error term specifications. Furthermore, the rejection frequency of the OIR test decreases with an increase in T, whereas the power of the m21 and m22 tests is increasing in T.
Table 1: Rejection frequencies under the null and the alternative of tests: A dynamic panel ARDL(1,0) model with predetermined regressors. The ε_it are standard normally distributed.

                 m11                  m12                  m21                  m22                  OIR
u_it     γ \ T   5     7     9        5     7     9        5     7     9        5     7     9        5     7     9
N=100
(a) ε_it  0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.048 0.054 0.053    0.050 0.053 0.054    0.033 0.014 0.000
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.048 0.052 0.052    0.048 0.052 0.053    0.037 0.016 0.000
          0.9    0.996 1.000 1.000    0.983 1.000 1.000    0.048 0.052 0.052    0.046 0.051 0.053    0.052 0.020 0.000
(b) AR(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.175 0.290 0.386    0.176 0.292 0.385    0.087 0.038 0.000
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.175 0.294 0.394    0.176 0.293 0.392    0.108 0.049 0.000
          0.9    0.989 1.000 1.000    0.957 1.000 1.000    0.143 0.260 0.357    0.129 0.248 0.353    0.115 0.055 0.000
(c) MA(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.336 0.550 0.704    0.339 0.551 0.705    0.143 0.061 0.000
          0.5    1.000 1.000 1.000    0.999 1.000 1.000    0.334 0.553 0.709    0.334 0.553 0.708    0.159 0.071 0.000
          0.9    0.986 1.000 1.000    0.940 0.998 1.000    0.280 0.509 0.677    0.254 0.495 0.674    0.150 0.069 0.000
N=200
(d) ε_it  0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.051 0.052 0.051    0.048 0.050 0.051    0.039 0.039 0.023
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.050 0.052 0.050    0.049 0.051 0.050    0.043 0.041 0.025
          0.9    1.000 1.000 1.000    1.000 1.000 1.000    0.048 0.051 0.049    0.047 0.050 0.050    0.069 0.058 0.038
(e) AR(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.289 0.471 0.632    0.291 0.472 0.633    0.241 0.260 0.202
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.296 0.490 0.663    0.296 0.487 0.663    0.289 0.311 0.246
          0.9    1.000 1.000 1.000    0.997 1.000 1.000    0.259 0.447 0.623    0.242 0.434 0.616    0.362 0.402 0.332
(f) MA(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.576 0.829 0.938    0.575 0.828 0.939    0.404 0.425 0.328
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.581 0.837 0.946    0.578 0.837 0.945    0.443 0.475 0.382
          0.9    1.000 1.000 1.000    0.991 0.999 1.000    0.532 0.809 0.933    0.498 0.796 0.930    0.481 0.544 0.452

R = 5000 simulation replications. Design parameter values: β = 1 − γ, π = ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
Table 2: Rejection frequencies under the null and the alternative of tests: A dynamic panel ARDL(1,0) model with predetermined regressors. The ε_it are heteroskedastic.

                 m11                  m12                  m21                  m22                  OIR
u_it     γ \ T   5     7     9        5     7     9        5     7     9        5     7     9        5     7     9
N=100
(a) ε_it  0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.046 0.045 0.046    0.044 0.045 0.046    0.017 0.003 0.000
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.046 0.048 0.051    0.046 0.048 0.051    0.019 0.006 0.000
          0.9    0.999 1.000 1.000    0.997 1.000 1.000    0.058 0.050 0.050    0.058 0.049 0.050    0.022 0.004 0.000
(b) AR(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.076 0.101 0.125    0.077 0.101 0.126    0.022 0.006 0.000
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.085 0.119 0.151    0.085 0.117 0.150    0.026 0.007 0.000
          0.9    0.998 1.000 1.000    0.994 1.000 1.000    0.092 0.120 0.154    0.089 0.118 0.154    0.029 0.007 0.000
(c) MA(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.090 0.127 0.164    0.087 0.133 0.163    0.026 0.009 0.000
          0.5    1.000 1.000 1.000    0.999 1.000 1.000    0.099 0.148 0.196    0.097 0.147 0.194    0.029 0.010 0.000
          0.9    0.997 1.000 1.000    0.994 0.998 1.000    0.110 0.150 0.195    0.106 0.152 0.194    0.034 0.007 0.000
N=200
(d) ε_it  0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.038 0.038 0.040    0.039 0.043 0.042    0.029 0.020 0.010
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.040 0.046 0.043    0.040 0.045 0.045    0.030 0.024 0.011
          0.9    1.000 1.000 1.000    1.000 1.000 1.000    0.048 0.048 0.043    0.047 0.047 0.043    0.043 0.033 0.014
(e) AR(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.097 0.136 0.177    0.096 0.140 0.184    0.048 0.040 0.019
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.113 0.165 0.224    0.115 0.166 0.228    0.048 0.043 0.018
          0.9    1.000 1.000 1.000    0.999 1.000 1.000    0.123 0.176 0.246    0.124 0.176 0.247    0.090 0.064 0.027
(f) MA(1) 0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.125 0.190 0.255    0.132 0.198 0.267    0.054 0.045 0.023
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.146 0.228 0.315    0.145 0.230 0.318    0.057 0.045 0.023
          0.9    1.000 1.000 1.000    0.991 0.999 1.000    0.157 0.246 0.337    0.156 0.244 0.337    0.094 0.068 0.029

R = 5000 simulation replications. Design parameter values: β = 1 − γ, π = ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
Now consider the case of heteroskedasticity in the error terms as specified in (51). In Table 2, rows (a) and (d), it can be seen that the heteroskedasticity does affect the actual significance levels of the m21 and m22 tests. However, the effect is very small, and the actual significance levels of both tests are still satisfactory for all combinations of N and T. The significance level of the OIR test deteriorates and approaches zero. In contrast to the rejection frequencies under the null, the heteroskedasticity does have a substantial effect on the rejection frequencies of the m21, m22 and OIR tests under the alternative: they decrease substantially, and for the OIR test they even approach zero. Under serial correlation the rejection frequencies of the m11 and m12 tests seem hardly affected by heteroskedasticity.
The results for the case of a skewed distribution of the error terms can be found in Table 3. Rows (a) and (d) show that skewness affects the actual significance levels of the m21 and m22 tests a little more than heteroskedasticity does. However, the actual significance level of the OIR test improves compared with the heteroskedastic case. The rejection frequencies of the m21, m22 and OIR tests are on average the same as in case (i). The rejection frequencies of the m11 and m12 tests become smaller, although they are still satisfactory.
Turning now to the results in Table 4, where heteroskedasticity and skewness are combined: the rejection frequencies under the null of the m21 and m22 tests are still within an acceptable range, but they leave room for improvement. The rejection frequencies under the null of the OIR test are far from acceptable, however. Only for T = 5 and N = 200, i.e. small T and large N, does this test have reasonable finite sample properties. For the other combinations of T and N serious problems occur, e.g. over-rejection when N = 200 and under-rejection for T = 9 and N = 100. From rows (b), (c), (e) and (f) it can be concluded that the m21 and m22 tests have relatively low rejection frequencies in this case. For large T and N the rejection frequencies are acceptable, but when T and N are small the rejection frequencies under the alternative do not differ much from those under the null. The OIR test does have a reasonable rejection frequency under the alternative in most cases, but given the problems that occur under the null hypothesis these rejection frequencies are not of much use for interpretation in practice. Overall, the m21, m22 and OIR tests do not provide reliable inference when heteroskedasticity and skewness of the error term distribution are both present. The rejection frequencies of the m11 and m12 tests deteriorate further but are still at an acceptable level; however, their rejection frequencies under the null should be investigated to conclude this with certainty.
Finally, an extra lagged explanatory variable, x_{i,t−1}, is included in the DGP, which the estimation method does not take into account. It is investigated how well the m2 and OIR tests detect this form of misspecification in comparison with the more specific Wald test of the null hypothesis β_1 = 0. The results are given in Table 5; the rejection frequencies of the Wald test under the null are displayed in Appendix C, Table 8. In both tables W1 is the Wald test after one-step GMM and W2 the Wald test after two-step GMM. The rejection frequencies under the null in Table 8 show that W1 has satisfactory size control, whereas W2 needs a larger N to display correct rejection frequencies under the null hypothesis. The results in Table 5 show that only for small γ, and thus large β_1, does the m2 test detect the misspecification at a reasonable rate. For smaller β_1 this test does not recognise the serial correlation caused by the omitted x_{i,t−1}. The OIR test performs a lot better, but also fails to detect the misspecification for small β_1. Moreover, the OIR test again fails to provide precise inference when T becomes relatively large and N stays small. The Wald test outperforms both tests, especially W1, which does not come as a surprise since it is a test more appropriate for the situation.
To address the problems discussed, the bootstrap method explained in Section 2.3 is used. Because of the substantial computational requirements, a selection of cases is made and the number of Monte Carlo repetitions is reduced to R = 1000. The earlier results and the bootstrap results for the selected cases can be found in Tables 6 and 7.
Table 3: Rejection frequencies under the null and the alternative of tests: A dynamic panel ARDL(1,0) model with predetermined regressors. The ε_it are skewed.

                 m11                  m12                  m21                  m22                  OIR
u_it     γ \ T   5     7     9        5     7     9        5     7     9        5     7     9        5     7     9
N=100
(a) ε_it  0.1    0.777 0.925 0.970    0.779 0.925 0.970    0.024 0.032 0.038    0.021 0.032 0.038    0.028 0.009 0.000
          0.5    0.775 0.924 0.970    0.776 0.924 0.970    0.024 0.031 0.036    0.022 0.031 0.036    0.039 0.009 0.000
          0.9    0.751 0.920 0.969    0.755 0.921 0.969    0.061 0.038 0.040    0.050 0.040 0.040    0.011 0.002 0.000
(b) AR(1) 0.1    0.648 0.885 0.960    0.640 0.884 0.960    0.196 0.371 0.484    0.193 0.373 0.484    0.070 0.145 0.000
          0.5    0.640 0.884 0.959    0.633 0.882 0.959    0.194 0.376 0.489    0.190 0.376 0.490    0.056 0.094 0.000
          0.9    0.619 0.871 0.956    0.623 0.870 0.956    0.181 0.353 0.474    0.182 0.353 0.474    0.027 0.024 0.000
(c) MA(1) 0.1    0.608 0.870 0.956    0.597 0.868 0.956    0.325 0.537 0.668    0.323 0.538 0.668    0.076 0.136 0.000
          0.5    0.601 0.867 0.955    0.592 0.865 0.955    0.311 0.538 0.667    0.314 0.537 0.667    0.064 0.093 0.000
          0.9    0.579 0.855 0.952    0.578 0.853 0.952    0.275 0.509 0.661    0.283 0.513 0.660    0.030 0.026 0.000
N=200
(d) ε_it  0.1    0.958 0.990 0.998    0.957 0.990 0.998    0.030 0.041 0.045    0.029 0.039 0.044    0.013 0.013 0.021
          0.5    0.957 0.989 0.998    0.957 0.989 0.998    0.029 0.039 0.046    0.029 0.039 0.044    0.015 0.016 0.027
          0.9    0.952 0.989 0.998    0.954 0.989 0.998    0.044 0.041 0.044    0.038 0.040 0.044    0.020 0.017 0.020
(e) AR(1) 0.1    0.925 0.985 0.998    0.922 0.984 0.998    0.366 0.565 0.687    0.358 0.564 0.687    0.143 0.164 0.220
          0.5    0.921 0.985 0.998    0.920 0.984 0.998    0.367 0.572 0.699    0.360 0.574 0.699    0.147 0.138 0.143
          0.9    1.000 1.000 1.000    0.997 1.000 1.000    0.259 0.447 0.623    0.242 0.434 0.616    0.192 0.101 0.046
(f) MA(1) 0.1    0.907 0.985 0.997    0.899 0.985 0.997    0.556 0.771 0.873    0.549 0.771 0.872    0.184 0.190 0.226
          0.5    0.902 0.985 0.997    0.895 0.985 0.997    0.557 0.775 0.875    0.549 0.776 0.875    0.182 0.150 0.137
          0.9    0.883 0.982 0.996    0.883 0.982 0.996    0.508 0.762 0.870    0.513 0.763 0.872    0.204 0.121 0.058

R = 5000 simulation replications. Design parameter values: β = 1 − γ, π = ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
Table 4: Rejection frequencies under the null and the alternative of tests: A dynamic panel ARDL(1,0) model with predetermined regressors. The ε_it are heteroskedastic and skewed.

                 m11                  m12                  m21                  m22                  OIR
u_it     γ \ T   5     7     9        5     7     9        5     7     9        5     7     9        5     7     9
N=100
(a) ε_it  0.1    0.552 0.727 0.823    0.538 0.722 0.822    0.026 0.030 0.032    0.022 0.028 0.032    0.247 0.595 0.000
          0.5    0.546 0.724 0.821    0.534 0.720 0.821    0.023 0.024 0.027    0.018 0.023 0.027    0.263 0.600 0.000
          0.9    0.508 0.710 0.815    0.504 0.708 0.814    0.056 0.035 0.029    0.049 0.034 0.027    0.095 0.215 0.000
(b) AR(1) 0.1    0.496 0.697 0.803    0.479 0.693 0.802    0.058 0.086 0.116    0.046 0.079 0.112    0.238 0.494 0.000
          0.5    0.487 0.693 0.801    0.467 0.685 0.800    0.060 0.098 0.135    0.051 0.094 0.133    0.221 0.471 0.000
          0.9    0.452 0.676 0.794    0.441 0.669 0.792    0.062 0.089 0.132    0.055 0.085 0.132    0.095 0.211 0.000
(c) MA(1) 0.1    0.495 0.705 0.808    0.473 0.700 0.808    0.065 0.099 0.142    0.054 0.096 0.139    0.240 0.481 0.000
          0.5    0.485 0.699 0.805    0.464 0.692 0.804    0.068 0.111 0.163    0.060 0.108 0.161    0.218 0.453 0.000
          0.9    0.450 0.681 0.798    0.438 0.674 0.797    0.069 0.101 0.159    0.063 0.099 0.159    0.087 0.219 0.000
N=200
(d) ε_it  0.1    0.781 0.886 0.935    0.774 0.885 0.935    0.028 0.033 0.035    0.020 0.029 0.031    0.093 0.364 0.807
          0.5    0.778 0.887 0.935    0.770 0.884 0.934    0.023 0.028 0.031    0.020 0.024 0.029    0.095 0.372 0.807
          0.9    0.759 0.882 0.933    0.759 0.882 0.933    0.040 0.029 0.032    0.034 0.027 0.031    0.076 0.261 0.676
(e) AR(1) 0.1    0.754 0.882 0.934    0.742 0.881 0.933    0.083 0.143 0.193    0.068 0.130 0.180    0.157 0.454 0.819
          0.5    0.751 0.881 0.933    0.740 0.879 0.932    0.094 0.174 0.242    0.085 0.165 0.235    0.133 0.407 0.792
          0.9    0.728 0.877 0.930    0.722 0.874 0.930    0.093 0.177 0.259    0.082 0.169 0.252    0.123 0.332 0.730
(f) MA(1) 0.1    0.749 0.886 0.943    0.736 0.884 0.943    0.098 0.167 0.235    0.090 0.161 0.220    0.159 0.441 0.808
          0.5    0.741 0.884 0.942    0.730 0.882 0.942    0.112 0.204 0.287    0.105 0.199 0.285    0.134 0.388 0.784
          0.9    0.720 0.876 0.940    0.712 0.873 0.939    0.106 0.201 0.298    0.095 0.203 0.296    0.118 0.320 0.722

R = 5000 simulation replications. Design parameter values: β = 1 − γ, π = ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
Table 5: Comparison of the m2 and OIR tests with the Wald test using the alternative DGP. The ε_it are standard normally distributed.

                 W1                   W2                   m21                  m22                  OIR
u_it     γ \ T   5     7     9        5     7     9        5     7     9        5     7     9        5     7     9
N=100
(a) ε_it  0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.106 0.210 0.328    0.134 0.236 0.337    0.997 0.956 0.000
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.053 0.047 0.048    0.053 0.047 0.048    0.846 0.565 0.000
          0.9    0.573 0.827 0.939    0.669 0.918 0.991    0.050 0.057 0.051    0.053 0.058 0.050    0.111 0.042 0.000
N=200
(b) ε_it  0.1    1.000 1.000 1.000    1.000 1.000 1.000    0.237 0.470 0.667    0.303 0.540 0.714    1.000 1.000 1.000
          0.5    1.000 1.000 1.000    1.000 1.000 1.000    0.049 0.053 0.058    0.050 0.055 0.060    1.000 1.000 0.999
          0.9    0.872 0.982 0.998    0.888 0.985 0.999    0.052 0.053 0.050    0.053 0.055 0.051    0.309 0.307 0.226

R = 5000 simulation replications. Design parameter values: β = 1 − γ, β_0 = β_1 = β/2, π = 0, ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
Table 6: Wild bootstrap to improve the rejection frequencies under the null of tests.

                                          Original                  Bootstrap
                                          m21    m22    OIR         m21    m22    OIR
Homoskedasticity, N=100, γ = 0.5
  T = 5                                   0.048  0.048  0.037       0.059  0.075  0.056
  T = 7                                   0.052  0.052  0.016       0.080  0.074  0.078
  T = 9                                   0.052  0.053  0.000       0.133  0.122  0.036
Skewness + heteroskedasticity, N=100, γ = 0.5
  T = 5                                   0.023  0.018  0.263       0.037  0.039  0.114
  T = 7                                   0.024  0.023  0.600       0.047  0.049  0.313
  T = 9                                   0.027  0.027  0.000       0.093  0.100  0.404
Skewness + heteroskedasticity, N=200, γ = 0.5
  T = 5                                   0.023  0.020  0.095       0.033  0.035  0.070
  T = 7                                   0.028  0.024  0.372       0.033  0.033  0.138
  T = 9                                   0.031  0.029  0.807       0.067  0.069  0.381

R = 1000 simulation replications and B = 199 bootstrap replications. Design parameter values: β = 1 − γ, π = ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
Table 7: Wild bootstrap to improve the rejection frequencies under the alternative of tests.

                                          Original                  Bootstrap
                                          m21    m22    OIR         m21    m22    OIR
Homoskedasticity, γ = 0.5
  N=100, T=5    AR(1)                     0.175  0.176  0.108       0.153  0.194  0.186
                MA(1)                     0.334  0.334  0.159       0.296  0.356  0.252
  N=100, T=9    AR(1)                     0.394  0.392  0.000       0.138  0.167  0.080
                MA(1)                     0.709  0.708  0.000       0.373  0.427  0.116
Skewness + heteroskedasticity, γ = 0.5
  N=100, T=5    AR(1)                     0.060  0.051  0.221       0.056  0.052  0.113
                MA(1)                     0.068  0.060  0.218       0.051  0.059  0.105
  N=200, T=5    AR(1)                     0.094  0.085  0.133       0.084  0.103  0.125
                MA(1)                     0.112  0.105  0.134       0.082  0.105  0.121
Omitted x_{i,t−1}, γ = 0.5
  N=100, T=9    ε_it                      0.048  0.048  0.000       0.163  0.165  0.339
  N=200, T=9    ε_it                      0.058  0.060  0.999       0.118  0.115  1.000

R = 1000 simulation replications and B = 199 bootstrap replications. Design parameter values for homoskedasticity and skewness plus heteroskedasticity: β = 1 − γ, π = ρ = 0.5, ω̄ = 3, σ_ε = 1. Design parameter values for omitted x_{i,t−1}: β = 1 − γ, β_0 = β_1 = β/2, π = 0, ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.
The Wild bootstrap for the case of homoskedastic error terms shows that this bootstrap technique does not work properly for the m2 test. The rejection frequencies under the null and under the alternative of serial correlation show no improvements, and in most cases they even deteriorate. This test, however, already provides satisfactory results without the use of bootstrapping. The rejection frequencies of the OIR test under homoskedasticity do improve substantially. Especially when T becomes larger, the original results show rejection frequencies of zero, whereas the Wild bootstrap fixes these finite sample problems and provides proper rejection frequencies.
The combined case of heteroskedasticity and skewness shows that the rejection frequencies under the null are improved for both the m2 and OIR tests. Those of the OIR test are not close to 0.05, but when N increases the rejection frequencies move closer to this desired value. The results for the rejection frequencies under the alternative are disappointing, however. The m21 and OIR tests show no improvements in this area, and in most cases the results are even worse. The rejection frequencies of the m22 test show signs of improvement, though very minimal ones.
Turning to the case of an omitted explanatory variable, in the last two rows of Table 7, the Wild bootstrap increases the rejection frequencies under the alternative substantially. For both the m2 test and the OIR test the rejection frequency more than triples compared with the original results for N = 100. For N = 200 the bootstrap is as good as or better than the original results.
Overall it may be concluded that for the OIR test, in the case of small N and large T, substantial improvements are obtained. This is a promising result, because this case is precisely the situation in which the OIR test without bootstrapping does not work at all. The bootstrap for the m2 test does not provide an improvement when heteroskedasticity is present; however, under homoskedasticity it might increase the rejection frequencies under the alternative in some cases.
6 Conclusion
With the increase of data storage and the use of panel data models gaining in popularity, it is of great importance to have reliable misspecification tests. These tests should behave properly in a panel data setting and under several kinds of misspecification. This paper examined the finite sample properties of the m1, m2 and OIR tests within a GMM framework and explored the ability of the bootstrap to provide improved critical values for the test statistics.
The Monte Carlo simulations show that the OIR test performs poorly when T becomes relatively large. These results are in accordance with the findings in Windmeijer (2005). Only when N is large, T is small and no heteroskedasticity or skewness is present can this test provide reliable inference in finite samples. Because of the poor behaviour in panels with relatively many time periods, it can be questioned whether the OIR test, in its basic form, should be used for panel data inference at all.
The m1 and m2 tests, with the focus on the m2 test, provide satisfactory rejection frequencies for all combinations of N and T under homoskedasticity, both under the null and under mild AR(1) and MA(1) alternatives. Skewness is not much of a problem for these tests. The real weakness is heteroskedasticity, which diminishes the rejection frequencies under the alternative hypothesis.
The Wild bootstrap for panel data shows some potential for improving the finite sample properties of the OIR test in the case of small N and large T. Using this bootstrap method might make the OIR test a reliable inference method for panel data models. However, further research that investigates different combinations of nuisance parameters should be carried out before this can be said with certainty.
The Wild bootstrap does improve the rejection frequencies of the m2 test under the null, and it improves the rejection frequencies under the alternative in some cases. However, it fails to mitigate the problems caused by heteroskedasticity. In most other situations considered, the original test performs well. Combining this with the bootstrap results, one may conclude that using the Wild bootstrap for the m2 test in panel data can provide improvements, although not as substantial as those it provides for the OIR test.
Further research can expand the cases considered in this paper and explore even further the landscape in which the bootstrap provides improvements for the m2 and OIR tests. Varying certain nuisance parameters could give more insight into the strength of this method.
Acknowledgement
My gratitude goes out to Jan F. Kiviet for helpful comments and suggestions and to Bobby Witte for the philosophical conversations that provided me with insights and ideas.
References
Arellano, M., Bond, S.R., 1991. Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. The Review of Economic Studies 58, 277-297.
Arellano, M., 2003. Panel Data Econometrics. Oxford University Press.
Bowsher, C.G., 2002. On testing overidentifying restrictions in dynamic panel data models. Economics Letters 77, 211-220.
Bun, M.J.G., Kiviet, J.F., 2006. The effects of dynamic feedbacks on LS and MM estimator accuracy in panel data models. Journal of Econometrics 132, 409-444.
Cameron, A.C., Gelbach, J.B., Miller, D.L., 2008. Bootstrap-based improvements for inference with clustered errors. The Review of Economics and Statistics 90(3), 414-427.
Cameron, A.C., Trivedi, P.K., 2005. Microeconometrics: methods and applications. Cambridge university press.
Godfrey, L.G., Tremayne, A.R., 2005. The wild bootstrap and heteroskedasticity-robust tests for serial correlation in dynamic regression models. Computational Statistics & Data Analysis 49, 377-395.
Gonçalves, S., Kilian, L., 2004. Bootstrapping autoregressions with conditional heteroskedasticity of unknown form. Journal of Econometrics 123, 89-120.
Hall, P., Horowitz, J.L., 1996. Bootstrap critical values for tests based on generalized-method-of-moments estimators. Econometrica 64, 891-916.
Hansen, L., 1982. Large sample properties of generalized method of moments estimators. Econometrica 50, 1029-1054.
Harvey, A.C., 1981. The econometric analysis of time series. MIT press.
Kapetanios, G., 2008. A bootstrap procedure for panel data sets with many cross-sectional units. Econometrics Journal 11, 377-395.
Kiviet, J.F., 1995. On bias, inconsistency, and efficiency of various estimators in dynamic panel data models. Journal of Econometrics 68, 53-78.
Kiviet, J.F., Pleus, M., Poldermans, R., 2014. Accuracy and efficiency of various GMM inference techniques in dynamic micro panel data models. Amsterdam School of Economics, Discussion paper.
Li, H., Maddala, G.S., 1996. Bootstrapping time series models. Econometric Reviews 15, 115-158.
Liu, R.Y., 1988. Bootstrap procedures under some non-IID models. Annals of Statistics 16, 1696-1708.
MacKinnon, J.G., 2002. Bootstrap inference in econometrics. Canadian Journal of Economics 35, 615-645.
Sarafidis, V., Yamagata, T., Robertson, D., 2009. A test of cross section dependence for a linear dynamic panel model with regressors, unpublished manuscript.
Sargan, J.D., 1958. The estimation of economic relationships using instrumental variables. Econometrica 26, 393-415.
Windmeijer, F., 2005. A finite sample correction for the variance of linear efficient two-step GMM estimators. Journal of Econometrics 126, 25-51.
Yamagata, T., 2008. A joint serial correlation test for linear panel data models. Journal of Econometrics 146, 135-145.
Appendix
A Heteroskedasticity parameter

Start with var(ε_it) = λx_it², such that ε_it = √λ η_it x_it, with η_it ∼ N(0, 1). Specify λ such that the average variance of the disturbances is one:

\frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T} \lambda x_{it}^2 = 1, \qquad \lambda = \frac{1}{\mathrm{var}(x_{it})} = \frac{1}{\sigma_x^2}. \qquad (56)

When π = 0 the DGP for x_it becomes

x_{it} = \rho x_{i,t-1} + v_{it}. \qquad (57)

Hence

\sigma_x^2 = \frac{\sigma_v^2}{1-\rho^2}, \qquad (58)

which results in

\lambda = \frac{1-\rho^2}{\sigma_v^2}. \qquad (59)
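The stationary AR(1) variance in (58) can be checked by simulation (an illustrative sketch, not part of the derivation; names are hypothetical):

```python
import numpy as np

rho, sigma_v, n = 0.5, 1.0, 200_000
rng = np.random.default_rng(7)

x = np.empty(n)
x[0] = rng.normal(0.0, sigma_v / np.sqrt(1 - rho**2))   # stationary start
for t in range(1, n):
    x[t] = rho * x[t - 1] + rng.normal(0.0, sigma_v)

print(x.var())                      # close to sigma_v^2 / (1 - rho^2) = 4/3
print((1 - rho**2) / sigma_v**2)    # the resulting lambda of (59): 0.75
```
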
B Derivation alternative DGP

Start with the DGP with x_{i,t−1} included,

y_{it} = \gamma y_{i,t-1} + \beta_0 x_{it} + \beta_1 x_{i,t-1} + \alpha_i + u_{it}, \qquad (60)

which can be rewritten as

y_{it} - \frac{\alpha_i}{1-\gamma} = y_{it}^* = \gamma y_{i,t-1}^* + \beta_0 x_{it} + \beta_1 x_{i,t-1} + u_{it}. \qquad (61)

Derivation of the variance gives

\mathrm{var}(y_{it}^*) = \gamma^2 \mathrm{var}(y_{it}^*) + (\beta_0\rho+\beta_1)^2\sigma_x^2 + \beta_0^2\sigma_v^2 + \sigma_u^2 + 2\gamma(\beta_0\rho+\beta_1)\,\mathrm{cov}(y_{i,t-1}^*, x_{i,t-1}), \qquad (62)

with

\mathrm{cov}(y_{i,t-1}^*, x_{i,t-1}) = \frac{\beta_0+\rho\beta_1}{(1-\gamma\rho)(1-\rho^2)}\,\sigma_v^2. \qquad (63)

Substituting this and \sigma_x^2 = \sigma_v^2/(1-\rho^2) into (62) gives

\mathrm{var}(y_{it}^*) = \frac{1}{1-\gamma^2}\left(\frac{(\beta_0\rho+\beta_1)^2}{1-\rho^2} + \beta_0^2 + \frac{2\gamma(\beta_0\rho+\beta_1)(\beta_0+\rho\beta_1)}{(1-\gamma\rho)(1-\rho^2)}\right)\sigma_v^2 + \frac{\sigma_u^2}{1-\gamma^2}, \qquad (64)

hence

\sigma_s^2 = \frac{1}{1-\gamma^2}\left(\frac{(\beta_0\rho+\beta_1)^2}{1-\rho^2} + \beta_0^2 + \frac{2\gamma(\beta_0\rho+\beta_1)(\beta_0+\rho\beta_1)}{(1-\gamma\rho)(1-\rho^2)}\right)\sigma_v^2 + \frac{\gamma^2\sigma_u^2}{1-\gamma^2}. \qquad (65)

Let \beta_0 = \beta_1 = \beta/2 and \sigma_u^2 = 1; setting \sigma_s^2 = \bar\omega and rewriting (65), we get

\sigma_v^2 = \frac{2}{\beta^2}\left[(1-\gamma^2)\bar\omega - \gamma^2\right]\frac{(1-\rho)(1-\gamma\rho)}{1+\gamma}. \qquad (66)
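The variance expression of (62)-(63) can be checked numerically by simulating (61) at the design values γ = ρ = 0.5, β_0 = β_1 = 0.25, σ_v² = σ_u² = 1 and comparing the simulated variance of y*_it with the closed form (an illustrative sketch; names are hypothetical):

```python
import numpy as np

gamma, rho, b0, b1, sv, su = 0.5, 0.5, 0.25, 0.25, 1.0, 1.0
rng = np.random.default_rng(3)
n, burn = 400_000, 1_000

# simulate x_it and y*_it of (61) as one long stationary series
x = np.zeros(n + burn)
y = np.zeros(n + burn)
for t in range(1, n + burn):
    x[t] = rho * x[t - 1] + rng.normal(0.0, sv)
    y[t] = gamma * y[t - 1] + b0 * x[t] + b1 * x[t - 1] + rng.normal(0.0, su)
y = y[burn:]

# closed form from (62) and (63), with sigma_x^2 = sv^2 / (1 - rho^2)
sx2 = sv**2 / (1 - rho**2)
cov = (b0 + rho * b1) * sv**2 / ((1 - gamma * rho) * (1 - rho**2))
var_y = ((b0 * rho + b1)**2 * sx2 + b0**2 * sv**2 + su**2
         + 2 * gamma * (b0 * rho + b1) * cov) / (1 - gamma**2)

print(var_y)      # 2.0 at these parameter values
print(y.var())    # simulated counterpart, close to 2.0
```
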
C Wald rejection frequencies
Table 8: Rejection frequencies of the Wald test under the null hypothesis β_1 = 0.

                 W1                   W2
u_it     γ \ T   5     7     9        5     7     9
N=100
(a) ε_it  0.1    0.086 0.092 0.099    0.187 0.308 0.525
          0.5    0.090 0.106 0.114    0.194 0.315 0.539
          0.9    0.072 0.078 0.084    0.171 0.286 0.503
N=200
(b) ε_it  0.1    0.071 0.079 0.080    0.114 0.178 0.247
          0.5    0.075 0.076 0.083    0.122 0.179 0.252
          0.9    0.067 0.073 0.075    0.114 0.174 0.249

R = 5000 simulation replications. Design parameter values: β = 1 − γ, β_0 = β_1 = β/2, π = 0, ρ = 0.5, ω̄ = 3, σ_ε = 1. The nominal significance level is 5%.