Examining the performance of Lochner and Moretti’s Wald test for different significance levels

Academic year: 2021

Dick van der Sluijs 10274111 University of Amsterdam


1. Introduction

In empirical research, endogeneity of regressors is a recurrent topic. Durbin (1954), Wu (1973) and Hausman (1978) proposed tests to detect endogeneity, of which the Durbin-Wu-Hausman (DWH) test is the best known. However, this test has serious shortcomings in nonlinear models. Therefore, Lochner and Moretti (2011) derived a new test that is robust to a certain type of nonlinearity. They extensively analyzed the behavior of this general Wald test for different degrees of nonlinearity and endogeneity, and their results are encouraging. Additionally, Lochner and Moretti showed that this test performs well even when only a single instrument is available.

If their test proves to be valid, researchers have a new tool to detect endogeneity in this specific class of nonlinear models. Because many of the relations that are analyzed in practice are nonlinear, a robust endogeneity test is an important finding. To make things more concrete, consider the relationship between earnings and schooling. One could assume that this relationship is linear, or in other words, that the parameter of schooling is homogeneous. This would mean that every year of schooling has the same effect on earnings. However, such a model might be misspecified. Lochner and Moretti (2011) assume that the parameter of schooling is heterogeneous. To be able to estimate the marginal effect of every year of schooling, they construct indicator functions that correspond to the years of schooling. However, if schooling is endogenous, these indicator functions are endogenous as well. To get a consistent estimator for the coefficients of the indicator functions, one would then need as many instrumental variables as there are indicator functions. In practice, researchers often have only a limited number of valid and strong instruments. Because of this problem, Lochner and Moretti (2011) derived an estimator that differs from the standard IV estimator: a weighted average estimator. With this estimator and a new multiple-step test, endogeneity can be tested with just one instrument. If this newly derived test behaves well, it could be highly useful in practice.

Lochner and Moretti (2011) examined the test's performance with Monte Carlo simulations at a nominal significance level of five percent. It is important to note that they do not analyze the behavior of the test at other significance levels. The five percent level is chosen because it is the most widely used. However, for large datasets one might want to use the test with a smaller significance level. Additionally, if the test is not examined at other significance levels, it remains unknown how closely the finite sample distribution follows the asymptotic chi-squared(1) distribution. This matters most in the tails of the chi-squared(1) distribution. In the worst case, it is pure coincidence that the simulations and the chi-squared(1) distribution give the same five percent significance level, and the finite sample distribution is completely different from the asymptotic distribution at other significance levels. Such findings would limit the usability of the test suggested by Lochner and Moretti (2011). For this reason it is important to examine the behavior of this test at both smaller and larger significance levels, to provide better insight into how the test behaves.

Therefore the purpose of this paper is to examine the performance of the derived Wald test for nominal significance levels in the range [0.01, 0.2]. The objective is to explore how the actual significance level deviates from the nominal significance level as the nominal level varies. From that, a conclusion can be drawn about the behavior of the finite sample distribution in comparison to the asymptotic distribution. I examine this using Monte Carlo simulations. The performance is examined in both linear and nonlinear models.

The remainder of the paper is organized as follows. Section two describes the theoretical background of the derived test and the difference between the actual and nominal significance level. Section three gives the methods, where the setup of the Monte Carlo simulations is described. Section four gives the results of the simulations, and section five concludes.

2. Literature review

Before describing the test derived by Lochner and Moretti (2011), it is important to explain the standard Durbin-Wu-Hausman test. First consider the standard linear model:

$$y = X\beta + \varepsilon,$$

where $X$ is an $n \times k$ matrix containing $k$ regressors, $\beta$ is a $k \times 1$ vector of unknown coefficients and $\varepsilon$ is the disturbance term, which is independently and identically distributed with $\varepsilon_i \sim (0, \sigma^2)$ for $i = 1, \dots, n$. With only exogenous variables, one can use Ordinary Least Squares (OLS) to find an unbiased, consistent and efficient estimator. However, if there are endogenous variables, that is, variables that are correlated with the disturbance term, the OLS estimator becomes biased and inconsistent (Hausman, 1978, p. 1260).

When endogenous regressors are suspected, a consistent estimator can be found with either the Generalized Method of Moments (GMM) estimator or the Instrumental Variable (IV) estimator. The IV estimator is a special case of the GMM estimator, and it is efficient under homoskedasticity. However, both methods give inefficient estimates under the null hypothesis of exogeneity (Baum, Schaffer & Stillman, 2003, p. 21). For that reason, under exogeneity, the OLS estimator is preferred to the IV and GMM estimators.

With that, the importance of an endogeneity test becomes clear. Durbin (1954) proposed a test to see whether the model is misspecified because of endogenous variables. Additionally, Wu (1973) proposed four other tests for the same misspecification. Hausman (1978) built on this and proposed the test as it is used today. This test is based on a vector of contrasts. The DWH test is defined as:

$$H = (\hat{\beta}_{OLS} - \hat{\beta}_{IV})' \left[\mathrm{Var}(\hat{\beta}_{IV}) - \mathrm{Var}(\hat{\beta}_{OLS})\right]^{+} (\hat{\beta}_{OLS} - \hat{\beta}_{IV}) \xrightarrow{d} \chi^2(\rho) \quad (1)$$

where $[\cdot]^{+}$ denotes the pseudo-inverse of the difference between the variances of the estimators $\hat{\beta}_{OLS}$ and $\hat{\beta}_{IV}$, the OLS and IV estimators, and $\rho$ is the number of possibly endogenous variables under the alternative hypothesis. As the equation shows, a larger difference between the two estimators increases the first and third term. Furthermore, if the sample variances of the two estimators get closer to each other, the middle term gets smaller, and because the pseudo-inverse of this middle term is used, the statistic gets larger. The statistic therefore becomes large when the estimates differ while their variances are more or less the same. In other words, the DWH test tests the consequences of employing different estimation methods on the same equation (Baum et al., 2003, p. 20). Extensive research has been done on the performance of the DWH test in linear models (Nakamura & Nakamura, 1985). However, this study focuses on endogeneity in a specific kind of nonlinear model.
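To make the mechanics of statistic (1) concrete, the following is a minimal sketch for the special case of a single possibly endogenous regressor, where all terms are scalars. The function names and the numbers are hypothetical and chosen only for illustration; the chi-squared(1) tail probability is computed from the standard normal distribution.

```python
from statistics import NormalDist


def dwh_scalar(b_ols, b_iv, var_ols, var_iv):
    """Scalar version of statistic (1) for one possibly endogenous regressor."""
    contrast = b_ols - b_iv
    var_diff = var_iv - var_ols
    # Pseudo-inverse of a scalar: 1/x if x is nonzero, otherwise 0.
    pinv = 0.0 if var_diff == 0 else 1.0 / var_diff
    return contrast * pinv * contrast


def chi2_1_pvalue(stat):
    """Upper-tail probability of a chi-squared(1) variable via the normal CDF."""
    if stat <= 0:
        return 1.0
    return 2.0 * (1.0 - NormalDist().cdf(stat ** 0.5))


# Hypothetical estimates and variances, for illustration only.
h = dwh_scalar(b_ols=0.10, b_iv=0.16, var_ols=0.0004, var_iv=0.0013)
p = chi2_1_pvalue(h)
print(round(h, 3), round(p, 4))
```

The sketch shows the two effects described above: the statistic grows with the squared contrast and with a shrinking variance difference.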

Nonlinear models give more accurate descriptions of certain relationships. To make this clearer, again consider the thoroughly studied relationship between earnings and years of schooling (Hungerford & Solon, 1987, p. 176). The parameter of schooling can be assumed to be homogeneous. The equation following from this assumption is:

$$y_i = s_i \beta^{L} + x_i' \gamma^{L} + v_i \quad (2)$$

where $s_i$ is the years of schooling, $x_i$ contains the exogenous variables and $y_i$ is earnings. $\beta^{L}$ and $\gamma^{L}$ are the coefficients, and $v_i$ is independently and identically distributed with $v_i \sim (0, \sigma^2)$ for $i = 1, \dots, n$. With such a linear model, one assumes that a year at primary school contributes as much to earnings as a year of a master's degree. Obviously, this is not realistic. Using a model with indicator functions allows one to estimate the marginal effect of every year of schooling. For example, consider the following model:

$$y_i = \sum_{j=1}^{S} D_{ij} \beta_j + x_i' \gamma + \varepsilon_i \quad (3)$$

where $S$ is the maximum number of years of schooling, $D_{ij}$ is an indicator function with $D_{ij} = 1$ if $s_i \geq j$, and $\beta_j$ is the corresponding coefficient. $x_i$ is a $k \times 1$ vector of exogenous variables, with the $k \times 1$ vector $\gamma$ containing the corresponding unknown coefficients. $\varepsilon_i$ is independently and identically distributed with $\varepsilon_i \sim (0, \sigma^2)$ for $i = 1, \dots, n$. With equation (3), one can easily estimate the marginal effect $\beta_j$ of every year of schooling $j$. However, a practical problem arises if schooling is assumed to be endogenous. As said in the introduction, to get a consistent estimator for those marginal effects, as many instruments as indicator functions are needed. Because in practice it is hard to find valid and strong instruments, Lochner and Moretti (2011) give a way to get a consistent estimator with just one instrument. They show that both the OLS and the IV estimates of equation (2) converge to a weighted average of the marginal effects. These weights differ between the two estimators and can be estimated from the regression in equation (4):

$$D_{ij} = s_i \omega_j + x_i' \alpha_j + \psi_{ij}, \quad \forall j \in \{1, \dots, S\} \quad (4)$$

where $\omega_j$ is the weight to be estimated and $x_i$ contains the exogenous variables. Finally, they give a generalization of the Hausman test, in which the 2SLS estimate of the misspecified linear model is compared with the weighted average of the estimated coefficients of the indicator functions. In this test, the weights are estimated with 2SLS and the coefficients of the indicator functions are estimated with OLS.
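The indicator functions and the weighted average of marginal effects can be illustrated with a short sketch. The helper names, the weights and the marginal effects below are hypothetical and serve only as an example. The sketch also shows that the indicators of one individual sum to his or her years of schooling, so that the linear model in equation (2) is the special case of equation (3) in which all marginal effects are equal.

```python
def indicator_row(s, max_s):
    """D_ij for j = 1..max_s, with D_ij = 1 if s >= j and 0 otherwise."""
    return [1 if s >= j else 0 for j in range(1, max_s + 1)]


def weighted_average(weights, betas):
    """Weighted average sum_j w_j * beta_j of the marginal effects."""
    return sum(w * b for w, b in zip(weights, betas))


# An individual with three years of schooling when the maximum is five.
row = indicator_row(3, 5)
print(row, sum(row))

# Hypothetical weights and marginal effects, for illustration only.
avg = weighted_average([0.2, 0.3, 0.5], [0.10, 0.04, 0.02])
print(round(avg, 3))
```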

As mentioned in the introduction, the DWH test does not perform well in nonlinear models. This is studied extensively by Lochner and Moretti (2011). With Monte Carlo simulations they found that, holding endogeneity constant, the rejection rate of the Hausman test increases rapidly with the degree of nonlinearity. For example, when a dataset without endogeneity and nonlinearity is simulated, the DWH test rejects the null hypothesis with fraction 0.049. This is close to the expected five percent. However, if the degree of nonlinearity is increased to 0.1, 0.5 and 1, the fraction of rejections of the null hypothesis is 0.054, 0.172 and 0.434, respectively.

Due to the unstable behavior of the DWH test, Lochner and Moretti (2011) suggest a new test, which consists of multiple steps and was outlined above. First, they compute the estimator $\hat{\beta}^{L}_{2SLS}$ that follows from the model:

$$y_i = s_i \beta^{L} + x_i' \gamma^{L} + v_i \quad (5)$$

From this, $\hat{\beta}^{L}_{2SLS}$ is defined as:

$$\hat{\beta}^{L}_{2SLS} = (\hat{s}' M_x \hat{s})^{-1} \hat{s}' M_x y \quad (6)$$

In this equation, $M_x$ is the matrix that projects onto the space orthogonal to $x$, defined as $M_x = I - x(x'x)^{-1}x'$. $\hat{s}$ is the projection of schooling on the instruments and exogenous variables, defined as $\hat{s}_i = x_i' \hat{\theta}_x + z_i' \hat{\theta}_z$. Again, $x_i$ is the vector of exogenous variables and $z_i$ is the vector of instruments. In other words, $\hat{s}$ is the fitted value from the first stage of the 2SLS estimator. $y$ is the dependent variable.

Next, the final test statistic is defined as:

$$W_N = \frac{\left(\hat{\beta}^{L}_{2SLS} - \hat{W}' \hat{B}\right)^2}{\hat{G}' \hat{V} \hat{G}} \xrightarrow{d} \chi^2(1) \quad (7)$$

$W = (W_1 \; \dots \; W_S)'$ is the vector with the weights for the levels of schooling, obtained from the 2SLS estimation of equation (8):

$$D_{ij} = s_i \omega_j + x_i' \alpha_j + \psi_{ij}, \quad \forall j \in \{1, \dots, S\} \quad (8)$$

$B = (\beta_1 \; \dots \; \beta_S)'$ is the vector with the coefficients for the years of schooling, estimated with OLS from equation (3). $\hat{V}$ is the estimated variance of the full parameter vector. $\hat{G}$ is defined as:

$$\hat{G} = \left(-\hat{W}' \;\; 1 \;\; 0 \;\; (-\hat{\beta}_1 \;\; 0) \; \dots \; (-\hat{\beta}_{S^*} \;\; 0)\right)' \quad (9)$$

$W_N$ is the test statistic for the null hypothesis that $\hat{B} \xrightarrow{p} B$. The numerator of the Wald statistic is the squared difference between the linear 2SLS estimate and the weighted average of the OLS estimates, which have the same probability limit under the null hypothesis.

Both the DWH test and the new Wald test follow a chi-squared distribution asymptotically. Lochner and Moretti (2011) run the simulation only for a model with one instrumental variable. For that reason the Wald test has one degree of freedom.
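Given the ingredients of equations (7) and (9), the statistic itself is a short computation. The sketch below is only an illustration under stated assumptions: the parameter vector is assumed to be ordered as the coefficients of the indicator functions, then the two coefficients of the linear model, then one weight and one coefficient per weight regression, and there is assumed to be a single exogenous regressor. All function names and numbers are hypothetical.

```python
def gradient_vector(weights, betas):
    """Gradient of (beta_2sls - W'B) with respect to the assumed parameter order."""
    g = [-w for w in weights]      # derivatives with respect to beta_1..beta_S
    g += [1.0, 0.0]                # linear-model slope and its exogenous coefficient
    for b in betas:
        g += [-b, 0.0]             # derivatives with respect to (omega_j, alpha_j)
    return g


def quadratic_form(g, v):
    """Compute g' V g for a vector g and a square matrix v given as nested lists."""
    n = len(g)
    return sum(g[r] * v[r][c] * g[c] for r in range(n) for c in range(n))


def wald_statistic(beta_2sls, weights, betas, v):
    """Squared difference over its estimated variance, as in equation (7)."""
    diff = beta_2sls - sum(w * b for w, b in zip(weights, betas))
    return diff ** 2 / quadratic_form(gradient_vector(weights, betas), v)


# Hypothetical inputs: two schooling levels and a diagonal variance matrix.
weights = [0.5, 0.5]
betas = [0.1, 0.3]
v = [[0.01 if r == c else 0.0 for c in range(8)] for r in range(8)]
w_n = wald_statistic(0.28, weights, betas, v)
print(round(w_n, 3))
```

The statistic would then be compared with a critical value of the chi-squared(1) distribution.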

Using this test in the Monte Carlo simulations, Lochner and Moretti (2011) show that their derived test is stable under different degrees of nonlinearity. While the test seems to be robust, it should be taken into account that its performance is only examined at a five percent significance level. Before giving the setup of the simulations, it is important to give background information on the difference between nominal and actual significance levels.

There is a difference between a nominal and an actual significance level. The nominal significance is the chosen significance level. The critical values are based on the asymptotic distribution, whereas the actual significance is the type I error probability in finite samples when using a certain critical value. Lochner and Moretti (2011) show that the nominal and actual significance levels of the derived Wald test are equal at a five percent significance level (p. 29).

This study examines the behavior of the test at other significance levels. Only significance levels close to zero are interesting to examine, because in general, when testing a null hypothesis, one wants only a small chance of making a type I error. Therefore the study focuses on the interval [0.01, 0.2]. Lochner and Moretti show that the fraction of rejections has the expected value at a five percent nominal significance level. As stated in the introduction, this can be a coincidence. The actual significance level can deviate substantially from the nominal significance level, which results in over- or under-rejection.

As explained by Wahlby, Jonsson and Karlsson (2001), there are various reasons why an actual significance level differs from a nominal significance level. In nonlinear models specifically, this can be caused by the approximation of random effects. Also, if the residual error is heteroskedastic, the actual significance level can deviate from the nominal level. Thirdly, if the underlying residual error distribution is lognormal rather than normal, the deviation gets larger as the variance increases. Another feature that has a major influence on the actual significance level is the sample size. When the sample size increases, the actual significance level usually comes closer to the nominal level (Wahlby, Jonsson & Karlsson, 2001, p. 249).

As mentioned in the introduction, the behavior of the test is examined by running Monte Carlo simulations. With these simulations, it is possible to control certain characteristics of the simulated dataset, which makes it possible to rule out some of the causes described above. For example, with Monte Carlo simulations one can make sure that the residual error is homoskedastic and normally distributed. Different sample sizes are generated to see how the actual significance level behaves.

Based on the outcome of the simulations, a conclusion about the test's performance will be given. First, the setup of the simulations is described.


3. Method

In this section the method is described to examine if the actual significance level deviates from the nominal significance level. Monte Carlo simulation is a widely used method where multiple samples are examined. These samples are generated from a known distribution. Therefore, the advantage of using Monte Carlo simulations is that the user can examine outcomes of the simulations, knowing the distribution and characteristics of the data. In this section the parameters and functions are defined.

Looking at the relationship between earnings and schooling, it is assumed that every individual maximizes his or her utility. A modification of Card's (1995) model is used to determine the optimal years of schooling per individual. The utility function is given as:

$$\text{Maximize: } V_i(s_i) = \log[y_i(s_i)] - C_i(s_i),$$

where $y_i(s_i)$ is the earnings function for a given level of schooling $s_i$, defined as:

$$\log[y_i(s_i)] = a + b s_i + \kappa \, \iota(s_i \geq J) + \varepsilon_i.$$

$C_i(s_i)$ is the cost function for a given level of schooling:

$$C_i(s_i) = c + (d z_i + \eta_i) s_i + \frac{k_2}{2} s_i^2 + \kappa \, \iota(s_i \geq J).$$

$\kappa$ is the degree of nonlinearity in the range [0, 1]. $\iota(s_i \geq J)$ is the indicator function that is one if the level of schooling is greater than or equal to $J$, and zero otherwise. $\eta_i$ and $\varepsilon_i$ are the disturbance terms, generated from a bivariate normal distribution with given correlation $\rho$.

After generating these disturbances, the utility-maximizing level of schooling $s_i \in \{0, 1, 2, \dots, S\}$ is determined for each individual. With this level of schooling and the disturbance term $\varepsilon_i$, $\log[y_i(s_i)]$ is calculated. Because with the given parameters the optimal levels of schooling do not always cover the whole range $\{1, \dots, S\}$, the matrix containing the dummy variables is transformed so that it only contains linearly independent columns, i.e. only the years of schooling that are generated. Additionally, levels of schooling that exceed twenty years are set to twenty. With the generated dataset, equations (2), (3) and (4) are estimated:

$$y_i = s_i \beta^{L} + x_i' \gamma^{L} + v_i \quad (2)$$

$$y_i = \sum_{j=1}^{S} D_{ij} \beta_j + x_i' \gamma + \varepsilon_i \quad (3)$$

$$D_{ij} = s_i \omega_j + x_i' \alpha_j + \psi_{ij}, \quad \forall j \in \{1, \dots, S\} \quad (4)$$

where equation (3) is estimated with OLS, and equations (2) and (4) are estimated with IV.
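The data generating step can be sketched as follows. This is a minimal illustration, not the code used for the simulations: the function names are hypothetical, the default parameter values are the ones listed in this section, and the value of the instrument is set to an arbitrary number because its distribution is not given in the text. With the earnings and cost functions as written above, the term with the degree of nonlinearity appears in both and cancels in the schooling choice.

```python
import random
from math import sqrt


def draw_disturbances(rho, sigma_eps, sigma_eta, rng):
    """Draw (eps, eta) from a bivariate normal distribution with correlation rho."""
    e1 = rng.gauss(0.0, 1.0)
    e2 = rng.gauss(0.0, 1.0)
    eps = sigma_eps * e1
    eta = sigma_eta * (rho * e1 + sqrt(1.0 - rho ** 2) * e2)
    return eps, eta


def net_utility(s, z, eta, b=0.04, d=0.01, k2=0.003):
    """Part of V_i(s) that varies with s: a, c and eps are constant, kappa cancels."""
    return (b - d * z - eta) * s - 0.5 * k2 * s ** 2


def optimal_schooling(z, eta, max_s=20):
    """Utility-maximizing schooling level on the grid 0, 1, ..., max_s."""
    return max(range(max_s + 1), key=lambda s: net_utility(s, z, eta))


def log_earnings(s, eps, kappa, a=1.5, b=0.04, threshold=12):
    """Log earnings with a jump of size kappa at the threshold J."""
    return a + b * s + kappa * (1 if s >= threshold else 0) + eps


rng = random.Random(0)
eps, eta = draw_disturbances(0.0, sqrt(0.25), sqrt(0.00005), rng)
z = 0.0  # hypothetical instrument value; its distribution is not given in the text
s = optimal_schooling(z, eta)
print(s, round(log_earnings(s, eps, kappa=0.5), 3))
```

Choosing the schooling level on a grid that stops at twenty also implements the rule that levels above twenty years are set to twenty.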


After this, $\log[y_i(s_i)]$ is regressed on $s_i$ to obtain both the OLS and the IV estimates. For every replication, the OLS and 2SLS estimates are saved in a vector. Additionally, the vector with the differences between the variances of the OLS and 2SLS estimators is saved. From these vectors, the standard DWH statistic is calculated for every replication and saved in a column vector.

For the derived Wald test, things are done differently. The 22 equations given in equations (2), (3) and (4) are estimated jointly. The structure of the matrices is defined as:

$$Y_i = (y_i \;\; y_i \;\; D_i')'$$

$$X_i = \begin{pmatrix} X_{1i} & 0 \\ 0 & I_{S^*+1} \otimes X_{2i} \end{pmatrix}$$

$$Z_i = \begin{pmatrix} X_{1i} & 0 \\ 0 & I_{S^*+1} \otimes Z_{2i} \end{pmatrix}$$

where $X_{1i} = (D_i')$, $X_{2i} = (s_i \;\; x_i')$ and $Z_{2i} = (z_i' \;\; x_i')$. $\otimes$ denotes the Kronecker product, here of the $(S^*+1) \times (S^*+1)$ identity matrix and the given matrix. $X$ and $Z$ are the matrices that stack $X_i$ and $Z_i$ for all individuals. The dimensions of $X$ and $Z$ are $n(S^*+2) \times (3S^*+2)$, where $S^*$ is the adjusted maximum years of schooling. From these matrices, all estimators can be obtained with one regression:
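The block structure and the stated dimensions can be checked with a small sketch for a single individual. The helper names are hypothetical. The sketch assumes that the exogenous part of the second block is a single column (for example a constant), which is what the stated width of 3S* + 2 columns implies.

```python
def kron_identity(m, row):
    """Kronecker product of the m x m identity matrix with a 1 x p row vector."""
    p = len(row)
    out = []
    for r in range(m):
        line = [0] * (m * p)
        line[r * p:(r + 1) * p] = row
        out.append(line)
    return out


def stacked_block(x1_row, x2_row):
    """Block-diagonal matrix with x1_row on top and I kron x2_row below it."""
    s_star = len(x1_row)
    lower = kron_identity(s_star + 1, x2_row)
    width_lower = (s_star + 1) * len(x2_row)
    top = list(x1_row) + [0] * width_lower
    return [top] + [[0] * s_star + line for line in lower]


# Hypothetical individual: S* = 3, two years of schooling, constant equal to 1.
block = stacked_block([1, 1, 0], [2, 1])
print(len(block), len(block[0]))
```

For S* = 3 the block has S* + 2 = 5 rows and 3S* + 2 = 11 columns, in line with the dimensions given above.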

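$$\hat{\Theta} = \left[X'Z(Z'Z)^{-1}Z'X\right]^{-1} X'Z(Z'Z)^{-1}Z'Y$$

From that, the residuals are $\hat{U} = Y - X\hat{\Theta}$, with $\hat{U}_i$ the block of residuals of individual $i$.

The projection of $X$ on $Z$ is defined as:

$$\hat{X} = Z\hat{\Gamma} = Z(Z'Z)^{-1}Z'X$$

From these results, the variance of the full parameter vector is estimated by computing the matrix:

$$\hat{V} = (\hat{X}'\hat{X})^{-1} \hat{\Gamma}' \hat{\Lambda} \hat{\Gamma} (\hat{X}'\hat{X})^{-1}$$

where $\hat{\Lambda}$ is defined as:

$$\hat{\Lambda} = \sum_{i=1}^{N} Z_i' \hat{U}_i \hat{U}_i' Z_i$$

Then the Jacobian vector of the difference between the two estimators is defined as:

$$\hat{G} = \left(-\hat{W}' \;\; 1 \;\; 0 \;\; (-\hat{\beta}_1 \;\; 0) \; \dots \; (-\hat{\beta}_{S^*} \;\; 0)\right)'$$

Finally the statistic can be calculated:

$$W_N = \frac{\left(\hat{\beta}^{L}_{2SLS} - \hat{W}' \hat{B}\right)^2}{\hat{G}' \hat{V} \hat{G}} \xrightarrow{d} \chi^2(1)$$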

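For every replication, the statistic is saved in a vector. After the simulation, this vector is evaluated with twenty indicator functions, defined as:

$$\iota\left(W_N \geq \chi^2_{\alpha}(1)\right)$$

where $\alpha$ takes the values 0.01, 0.02, ..., 0.2 and $\chi^2_{\alpha}(1)$ is the corresponding critical value of the chi-squared(1) distribution. This gives a matrix with one row per replication and one column per significance level. Finally, the mean of each column is calculated. The result is a row vector that contains, for every nominal significance level, the fraction of replications in which the null hypothesis is rejected. This is done for both the Wald test and the DWH test.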
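The step from a vector of simulated statistics to the twenty rejection fractions can be sketched as follows. The function names are hypothetical. The critical values of the chi-squared(1) distribution are obtained as squared standard normal quantiles, and squared standard normal draws are used here only as stand-ins for the simulated test statistics.

```python
import random
from statistics import NormalDist


def chi2_1_critical(alpha):
    """Upper-alpha critical value of chi-squared(1): squared normal quantile."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2


def rejection_fractions(stats, alphas):
    """Fraction of statistics at or above the critical value, per alpha."""
    fractions = []
    for alpha in alphas:
        critical = chi2_1_critical(alpha)
        rejected = sum(1 for w in stats if w >= critical)
        fractions.append(rejected / len(stats))
    return fractions


alphas = [round(0.01 * k, 2) for k in range(1, 21)]

# Stand-in statistics: squared standard normal draws follow chi-squared(1) exactly.
rng = random.Random(1)
stand_in = [rng.gauss(0.0, 1.0) ** 2 for _ in range(2500)]
fractions = rejection_fractions(stand_in, alphas)
print(list(zip(alphas, fractions))[:3])
```

For a test whose finite sample distribution matches the asymptotic one, each fraction should lie close to its nominal level, which is the comparison made in the results.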

To generate the dataset, the parameters are set as in Lochner and Moretti (2011): $a = 1.5$, $b = 0.04$, $c = 0$, $d = 0.01$, $k_2 = 0.003$, $\sigma^2_{\varepsilon} = 0.25$, $\sigma^2_{\eta} = 0.00005$, $J = 12$ and $S = 20$. The sample size $N$ is 1000, and 2500 replications are used per degree of nonlinearity. The estimated parameter is $\hat{b}$.

4. Results

For the twenty different nominal significance levels, the actual significance levels of both tests are examined. As said above, this is done with 2500 replications per simulation. The simulation is done for sample sizes of 1000, 500 and 50. Fractions that deviate significantly from the nominal significance level are marked with an asterisk. First, the results for n = 1000 are discussed.
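The text does not state which test was used to decide whether a fraction deviates significantly. One common way to judge such a deviation, shown here purely as an assumption and not as the procedure behind the asterisks, is to compare the simulated fraction with the nominal level using the binomial standard error implied by the number of replications. The function name is hypothetical; the two fractions are taken from Table 1 (degree of nonlinearity 0.5, nominal level 5 percent).

```python
from math import sqrt


def size_z_score(actual, nominal, replications):
    """Standardized deviation of a rejection fraction from its nominal level."""
    se = sqrt(nominal * (1.0 - nominal) / replications)
    return (actual - nominal) / se


# Wald test and DWH test at the 5 percent level, 2500 replications.
z_wald = size_z_score(0.054, 0.05, 2500)
z_dwh = size_z_score(0.166, 0.05, 2500)
print(round(z_wald, 2), round(z_dwh, 2))
```

Under this approximation the Wald fraction lies within about one standard error of the nominal level, while the DWH fraction lies far outside any conventional band.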

As can be seen from Table 1, without added nonlinearity the fraction of rejections of the null hypothesis lies close to the nominal significance level. This is as expected. Additionally, for nominal significance levels below eight percent, the fraction of rejections lies for both tests at or below the expected theoretical fraction. For example, at a nominal significance level of 3 percent, the DWH test and the Wald test reject the null hypothesis with fractions 0.024 and 0.026, respectively. For eight percent and above, both tests slightly over-reject. For example, at a nominal significance level of seventeen percent, the DWH test and the Wald test reject with fractions 0.182 and 0.181. However, none of these values deviate significantly from the nominal significance level. For that reason, no conclusion can be drawn.

With a small degree of nonlinearity of 0.1, a few observations can be made. First, the DWH test rejects at or above the nominal significance level for all tested nominal significance levels, and the Wald test does so from four percent onwards. This would indicate that, with the degree of nonlinearity set to 0.1, both tests tend to reject the null hypothesis more often than expected. Secondly, for most of the tested nominal significance levels, the actual level of the Wald test lies closer to the nominal level than that of the DWH test. For the Wald test, none of the deviations are significant. For the DWH test, the actual significance level deviates significantly from the nominal level at some nominal significance levels, for example at 7 percent.

For the degree of nonlinearity set to 0.5, the difference gets larger. For the DWH test, the actual significance level clearly does not come close to the nominal significance level. For a nominal significance level of 1 percent, the DWH test rejects the null hypothesis with fraction 0.056. For a nominal significance level of 5 percent, the DWH test gives a 0.165 actual significance level. At the ten percent nominal level, which some studies still accept as a proper significance level, the simulated significance level is 0.25. For the DWH test with a degree of nonlinearity of 0.5, all tested actual significance levels deviate significantly from their nominal significance levels.

The Wald test remains controlled. The actual significance level lies close to the nominal significance level over the entire range of tested nominal significance levels. The maximum deviation from the nominal significance level is no more than 0.0056. It can also be seen that the deviation tends to get smaller as the nominal significance level gets larger. Finally, it is noteworthy that the test does not significantly under- or over-reject, with the exception of the one percent level, where Table 1 marks the fraction 0.006 as significant.
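The size of the largest deviation can be checked against the appendix. The sketch below copies the Wald column for a degree of nonlinearity of 0.5 from Table 1 (n = 1000) and computes the largest absolute deviation from the nominal levels. Because the table is rounded to three decimals, the result is 0.006, which is in line with the reported 0.0056. The variable and function names are hypothetical.

```python
# Wald test, degree of nonlinearity 0.5, n = 1000 (column copied from Table 1).
nominal = [round(0.01 * k, 2) for k in range(1, 21)]
wald_k05 = [0.006, 0.016, 0.031, 0.044, 0.054, 0.066, 0.076, 0.083, 0.092, 0.100,
            0.108, 0.121, 0.133, 0.143, 0.154, 0.160, 0.170, 0.179, 0.190, 0.202]


def max_abs_deviation(actual, target):
    """Largest absolute difference between paired actual and target values."""
    return max(abs(a - t) for a, t in zip(actual, target))


deviation = max_abs_deviation(wald_k05, nominal)
print(round(deviation, 3))
```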

If the degree of nonlinearity is set to 1, that is, with full nonlinearity, the DWH test is uncontrolled for all significance levels that were tested. The test rejects the null hypothesis far more often than is acceptable. The actual significance level deviates significantly from the nominal significance level for all tested levels. At a 0.01 significance level, the DWH test rejects about twenty percent of the time. At a significance level of 0.05, the DWH test rejects over 41 percent of the time. It is noteworthy that with the degree of nonlinearity set to both 0.5 and 1, the actual significance levels of the DWH test follow a curved rather than a straight line, as can be seen in Graphs 3 and 4. Again, the Wald test keeps following the tail of the asymptotic distribution. At a nominal significance level of 1 percent the Wald test rejects with fraction 0.007. At a five percent significance level this fraction is 0.052. Finally, there is no indication that the Wald test performs better with a degree of nonlinearity of 0 than with a degree of nonlinearity of 0.5.

For a sample size of 500, the results are similar. With the degree of nonlinearity set to 0.5, the DWH test rejects the null hypothesis more often than expected from the asymptotic distribution: at a five percent significance level, the actual significance level is above ten percent. At a one percent significance level, the actual fraction of rejections is 2.5 percent. While the actual significance levels lie a lot closer to the nominal significance level with the sample size set to 500, the DWH test still behaves uncontrolled. The Wald test, however, again proves to be controlled.

For the small sample of size 50, both tests behave uncontrolled. For both the DWH test and the Wald test, and for all tested degrees of nonlinearity, some actual significance levels deviate significantly from the nominal significance level. This happens mostly at small significance levels. Additionally, both tests tend to under-reject at small significance levels and to over-reject at nominal significance levels around 14 percent.

5. Discussion

The DWH test and its importance in econometrics have been discussed. Additionally, the differences between nominal and actual significance levels have been described. Lochner and Moretti (2011) showed that the DWH test performs poorly in nonlinear models with indicator functions. When the degree of nonlinearity increases, the DWH test rejects the null hypothesis more often than is expected from the asymptotic distribution. This study shows that this is not only the case at a five percent significance level, but that the DWH test also behaves uncontrolled for all tested nominal significance levels in the range from 0.01 to 0.2. This holds for the large samples, but not for the small sample. It should be noted that with a small degree of nonlinearity, say 0.1, the DWH test still performs in a controlled way. Without nonlinearity, the DWH test performs better than expected from the asymptotic distribution, for all tested sample sizes. However, this is insignificant.

The recommended Wald test, derived by Lochner and Moretti, proves to be controlled for this specific form of nonlinearity. For different degrees of nonlinearity and for all tested significance levels, the derived Wald test follows the tail of the asymptotic distribution. This is the case for all tested sample sizes. Also, without nonlinearity, the Wald test performs as expected. It is noteworthy that the actual significance level gets closer to the nominal significance level for larger sample sizes, as is expected from the theoretical background. Also, the actual fraction of rejections tends to be lower than the nominal fraction at small significance levels. At higher nominal significance levels, this is the other way around. However, this is not significant. For both tests, the actual significance levels deviate more from the nominal significance level for smaller sample sizes. Again, nothing can be concluded due to insignificance.

Further research could look deeper into the behavior of the Wald test under the alternative hypothesis. Under the alternative hypothesis, the Wald test should ideally reject with fraction one. For that reason, it could be interesting to see how the test behaves at different significance levels under the alternative hypothesis.


6. Appendix

Table 1: Fraction of rejections of the null hypothesis by nominal significance level, n = 1000:

Nominal    DWH test   Wald test  DWH test   Wald test  DWH test   Wald test  DWH test   Wald test
level      k=0        k=0        k=0.1      k=0.1      k=0.5      k=0.5      k=1        k=1
0.01       0.010      0.008      0.010      0.009      0.056*     0.006*     0.206*     0.007
0.02       0.016      0.018      0.022      0.017      0.092*     0.016      0.281*     0.017
0.03       0.024      0.026      0.032      0.029      0.114*     0.031      0.336*     0.026
0.04       0.033      0.037      0.044      0.040      0.142*     0.044      0.385*     0.038
0.05       0.041      0.050      0.057      0.051      0.166*     0.054      0.418*     0.052
0.06       0.054      0.056      0.072      0.064      0.185*     0.066      0.448*     0.062
0.07       0.070      0.068      0.084*     0.076      0.202*     0.076      0.480*     0.069
0.08       0.081      0.081      0.097*     0.085      0.217*     0.083      0.504*     0.082
0.09       0.093      0.094      0.106*     0.095      0.234*     0.092      0.522*     0.093
0.10       0.103      0.102      0.115      0.109      0.250*     0.100      0.539*     0.102
0.11       0.115      0.114      0.128*     0.119      0.266*     0.108      0.554*     0.110
0.12       0.125      0.124      0.136      0.129      0.280*     0.121      0.570*     0.121
0.13       0.136      0.133      0.149*     0.138      0.294*     0.133      0.588*     0.130
0.14       0.149      0.147      0.157      0.150      0.314*     0.143      0.605*     0.140
0.15       0.161      0.158      0.171*     0.160      0.326*     0.154      0.620*     0.149
0.16       0.172      0.171      0.180*     0.170      0.337*     0.160      0.637*     0.158
0.17       0.182      0.181      0.187      0.182      0.348*     0.170      0.651*     0.166
0.18       0.191      0.190      0.196      0.193      0.360*     0.179      0.661*     0.177
0.19       0.203      0.198      0.206      0.204      0.373*     0.190      0.674*     0.188
0.20       0.213      0.207      0.216      0.215      0.385*     0.202      0.684*     0.201

Table 2: Fraction of rejections of the null hypothesis by nominal significance level, n = 500:

Nominal    DWH test   Wald test  DWH test   Wald test  DWH test   Wald test  DWH test   Wald test
level      k=0        k=0        k=0.1      k=0.1      k=0.5      k=0.5      k=1        k=1
0.01       0.010      0.008      0.008      0.010      0.025*     0.005*     0.090*     0.006
0.02       0.021      0.018      0.017      0.016      0.050*     0.018      0.144*     0.018
0.03       0.031      0.027      0.025      0.023      0.068*     0.026      0.183*     0.030
0.04       0.039      0.040      0.037      0.032      0.088*     0.038      0.216*     0.040
0.05       0.051      0.054      0.049      0.040*     0.103*     0.047      0.239*     0.050
0.06       0.061      0.061      0.058      0.052      0.120*     0.057      0.263*     0.066
0.07       0.072      0.068      0.066      0.063      0.132*     0.068      0.285*     0.074
0.08       0.084      0.079      0.075      0.071      0.147*     0.082      0.302*     0.081
0.09       0.096      0.094      0.082      0.081      0.159*     0.093      0.320*     0.093
0.10       0.103      0.102      0.090      0.098      0.170*     0.104      0.337*     0.104
0.11       0.110      0.111      0.101      0.106      0.182*     0.112      0.361*     0.116
0.12       0.121      0.122      0.112      0.119      0.197*     0.122      0.380*     0.125
0.13       0.130      0.133      0.124      0.126      0.205*     0.132      0.396*     0.138
0.14       0.139      0.144      0.132      0.136      0.218*     0.142      0.411*     0.149
0.15       0.147      0.156      0.142      0.148      0.234*     0.147      0.425*     0.160
0.16       0.157      0.165      0.157      0.156      0.249*     0.157      0.436*     0.171
0.17       0.168      0.175      0.167      0.164      0.261*     0.166      0.451*     0.183
0.18       0.185      0.185      0.178      0.173      0.273*     0.179      0.469*     0.197
0.19       0.196      0.195      0.192      0.184      0.283*     0.185      0.487*     0.208
0.20       0.206      0.207      0.201      0.199      0.294*     0.197      0.501*     0.217


Table 3: Fraction of rejections of the null hypothesis by nominal significance level, n = 50:

Nominal    DWH test   Wald test  DWH test   Wald test  DWH test   Wald test  DWH test   Wald test
level      k=0        k=0        k=0.1      k=0.1      k=0.5      k=0.5      k=1        k=1
0.01       0.002*     0.000*     0.001*     0.000*     0.002*     0.000*     0.005*     0.000*
0.02       0.008*     0.004*     0.006*     0.002*     0.008*     0.001*     0.015      0.003*
0.03       0.016*     0.012*     0.010*     0.006*     0.015*     0.009*     0.024      0.010*
0.04       0.027*     0.022*     0.018*     0.013*     0.022*     0.016*     0.039      0.016*
0.05       0.035*     0.034*     0.028*     0.025*     0.032*     0.024*     0.052      0.027*
0.06       0.047*     0.047*     0.038*     0.037*     0.043*     0.030*     0.064      0.038*
0.07       0.059      0.062      0.049*     0.049*     0.052*     0.041*     0.078      0.055*
0.08       0.066*     0.074      0.063*     0.063*     0.061*     0.054*     0.092      0.066*
0.09       0.076*     0.086      0.075*     0.079      0.072*     0.066*     0.105      0.078
0.10       0.087      0.100      0.087      0.096      0.082*     0.078*     0.116      0.090
0.11       0.097      0.115      0.100      0.111      0.094*     0.091*     0.130*     0.103
0.12       0.110      0.128      0.113      0.121      0.105      0.110      0.146*     0.115
0.13       0.123      0.143      0.123      0.137      0.118      0.122      0.158*     0.128
0.14       0.135      0.154      0.135      0.152      0.132      0.135      0.173*     0.142
0.15       0.145      0.170*     0.148      0.166      0.142      0.150      0.185*     0.155
0.16       0.156      0.184*     0.162      0.185*     0.153      0.163      0.196*     0.171
0.17       0.167      0.197*     0.173      0.197*     0.166      0.176      0.207*     0.181
0.18       0.178      0.210*     0.184      0.208*     0.177      0.185      0.220*     0.191
0.19       0.190      0.225*     0.197      0.221*     0.186      0.198      0.231*     0.203
0.20       0.199      0.233*     0.206      0.233*     0.193      0.209      0.243*     0.214

Graph 1: Nominal and actual significance level for k=0, n=1000.

[Figure: actual significance level (vertical axis) against nominal significance level (horizontal axis) for the DWH test and the Wald test, with a 45-degree line.]


Graph 2: Nominal and actual significance level for k=0.1, n=1000.

[Figure: actual significance level (vertical axis) against nominal significance level (horizontal axis) for the DWH test and the Wald test, with a 45-degree line.]

Graph 3: Nominal and actual significance level for k=0.5, n=1000.

[Figure: actual significance level (vertical axis) against nominal significance level (horizontal axis) for the DWH test and the Wald test, with a 45-degree line.]


Graph 4: Nominal and actual significance level for k=1, n=1000.

[Figure: actual significance level (vertical axis) against nominal significance level (horizontal axis) for the DWH test and the Wald test, with a 45-degree line.]

Graph 5: Nominal and actual significance level for k=0, n=500.

[Figure: actual significance level (vertical axis) against nominal significance level (horizontal axis) for the DWH test and the Wald test, with a 45-degree line.]


Graph 6: Nominal and actual significance level for k=0.1, n=500.

[Figure: actual significance level (vertical axis) against nominal significance level (horizontal axis) for the DWH test and the Wald test, with a 45-degree line.]

Graph 7: Nominal and actual significance level for k=0.5, n=500.

[Figure: actual significance level (vertical axis) against nominal significance level (horizontal axis) for the DWH test and the Wald test, with a 45-degree line.]


Graph 8: Nominal and actual significance level for k=1, n=500.

[Figure: actual significance level (vertical axis) against nominal significance level (horizontal axis) for the DWH test and the Wald test, with a 45-degree line.]

Graph 9: Nominal and actual significance level for k=0, n=50.

[Figure: actual significance level (vertical axis) against nominal significance level (horizontal axis) for the DWH test and the Wald test, with a 45-degree line.]


Graph 10: Nominal and actual significance level for k=0.1, n=50.

[Figure: actual significance level (vertical axis) against nominal significance level (horizontal axis) for the DWH test and the Wald test, with a 45-degree line.]

Graph 11: Nominal and actual significance level for k=0.5, n=50.

[Figure: actual significance level (vertical axis) against nominal significance level (horizontal axis) for the DWH test and the Wald test, with a 45-degree line.]


Graph 12: Nominal and actual significance level for k=1, n=50.

[Figure: actual significance level (vertical axis) against nominal significance level (horizontal axis) for the DWH test and the Wald test, with a 45-degree line.]


References

Baum, C. F., Schaffer, M. E., & Stillman, S. (2003). Instrumental variables and GMM: Estimation and testing. Stata Journal, 3(1), 1-31.

Card, D. (1995). Using geographic variation in college proximity to estimate the return to schooling. In L. N. Christofides, E. K. Grant, & R. Swidinsky (Eds.), Aspects of Labor Market Behaviour: Essays in Honour of John Vanderkamp. Toronto: University of Toronto Press.

Durbin, J. (1954). Errors in variables. Revue de l'Institut International de Statistique, 23-32.

Hausman, J. A. (1978). Specification tests in econometrics. Econometrica: Journal of the Econometric Society, 1251-1271.

Hungerford, T., & Solon, G. (1987). Sheepskin effects in the returns to education. The Review of Economics and Statistics, 175-177.

Lochner, L., & Moretti, E. (2011). Estimating and testing non-linear models using instrumental variables (No. w17039). National Bureau of Economic Research.

Nakamura, A., & Nakamura, M. (1985). On the performance of tests by Wu and by Hausman for detecting the ordinary least squares bias problem. Journal of Econometrics, 29(3), 213-227.

North, B. V., Curtis, D., & Sham, P. C. (2002). A note on the calculation of empirical P values from Monte Carlo procedures. American Journal of Human Genetics, 71(2), 439.

Ulrika, W. E., Jonsson, N., & Karlsson, M. O. (2001). Assessment of actual significance levels for covariate effects in NONMEM. Journal of Pharmacokinetics and Pharmacodynamics, 28(3), 231-252.

Wu, D. M. (1973). Alternative tests of independence between stochastic regressors and disturbances. Econometrica: Journal of the Econometric Society, 733-750.
