
Returns to education in the Netherlands


Academic year: 2021


UNIVERSITEIT VAN AMSTERDAM

Returns to education in the Netherlands

Name: Maaike Boot

Student number: 10105743

Field of study: Labor economics/economics

Supervisor: David Smerdon

Table of contents

Table of contents
List of tables
1. Introduction
1.1 Background
1.2 Overall Research Aim and Individual Research Objectives
1.3 Focus and value of this research
1.4 Structure of this research
2. Literature review and previous research on the topic
2.1 Estimating the returns to education with OLS regression
2.1.1 Endogenous schooling variable
2.2 Estimating the returns to education with IV regression
2.2.1 The reasoning behind IV regression
2.2.2 Caution related to IV regression
2.2.3 Relevance of the instrument
2.2.4 Quality of the instrument
2.2.5 Validity of the instrument
2.3 Instruments used in previous research
2.3.1 The season of birth instrument
2.3.1.1 Quality and validity check
2.3.2 The family background instrument
2.3.2.1 Quality and validity check
2.3.3 The changes in compulsory schooling laws instrument
2.3.3.1 Quality and validity check
3. Methodology
3.1 Research strategy
3.1.1 IV estimates and the season of birth instrument
3.1.2 IV estimates and the family background instrument
3.2 Data collection
3.2.1 Season of birth variable
3.2.2 Years of schooling variable
3.2.3 Family background variable
3.2.4 Net hourly wage variable
3.2.5 Respondents
3.3 Framework for data analysis
4. Findings
4.1 Brief introduction
4.2 Experimental procedure
4.2.1 OLS regression
4.2.2 IV regression
4.2.2.1 IV regression in general
4.2.2.1.1 Criteria for a reliable instrument
4.2.2.2 IV regression using the season of birth instrument
4.2.2.3 IV regression using the family background instrument
4.3 Results
4.3.1 Data
4.3.2 Effect of family background on schooling
4.3.3 Effect of season of birth on schooling
4.3.4 OLS estimates
4.3.5 IV estimates – Family background
4.3.6 IV estimates – Season of birth
4.4 Discussion
5. Conclusion
6. Bibliography

List of tables

Table 1 – Descriptive statistics, means and standard deviations
Table 2 – OLS estimates of the schooling function – family background
Table 3 – OLS estimates of the schooling function – season of birth
Table 4 – OLS estimates of the returns to schooling
Table 5 – IV estimates of the returns to schooling – family background
Table 6 – Checking the criteria for valid instruments
Table 7 – IV estimates of the returns to schooling – season of birth

1. Introduction

1.1 Background

What is the economic return of an additional year of schooling? Many researchers have tried to find an answer to this question for various countries over a long period of time.

Estimating the returns to education is important because knowing how these returns have changed tells us whether the level of investment in education is in line with its rate of return. The level of investment in education has been debated for years. The central question seems to be: have we expanded our educational investment too much, or not enough? In other words, do we invest too much or too little because our expectations of the rate of return are too high or too low? To answer these questions, we need reliable estimates of the returns to education.

Researchers have tried to obtain reliable returns to education by estimating the Mincer equation, in which earnings are regressed on an individual's years of education. The earliest available estimates mostly relied on the Ordinary Least Squares method (hereafter OLS). Empirical research shows a positive relationship between years of schooling and wages.
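
As a point of reference, the Mincer equation is usually written in the following textbook form. This is a sketch rather than the exact specification estimated in this research: the symbols and the experience terms $E_i$ and $E_i^2$ are assumptions taken from the standard formulation, and the set of controls used here may differ.

```latex
\ln w_i = \alpha + \beta S_i + \gamma_1 E_i + \gamma_2 E_i^2 + \varepsilon_i
```

Here $w_i$ denotes the wage of individual $i$, $S_i$ the years of schooling and $\varepsilon_i$ the error term, so that $\beta$ is interpreted as the approximate proportional wage gain from one additional year of schooling.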

However, in recent decades several researchers have raised doubts about the reliability of the OLS estimates of the returns to schooling. One of the main reasons for these doubts is that several other factors might affect earnings, which makes it more difficult to measure the effect of years of education on earnings and may bias the estimates. One could think of the influence of family background (for example parental education and job level), IQ, experience, the minimum school leaving age and sibling composition. In the debate on the reliability of the OLS estimates, the schooling variable is regarded as endogenous, because ability measures are mostly unobserved or unavailable and because both schooling and earnings are likely to be correlated with the unobserved variable ‘ability’.

Researchers have proposed several ways to counter the problems of missing information on ability and errors in the measurement of schooling. Instrumental Variables (hereafter IV) regression is one of the most used methods to obtain more reliable estimates of the returns to schooling. Different instruments have been used to address the endogeneity of the schooling variable. Angrist and Krueger (1991) were among the first to address the problems of the OLS estimates, using season of birth as an instrument for schooling. Other researchers, such as Leigh and Ryan (2008) and Levin and Plug (1999), used changes in compulsory schooling laws and family background variables as instruments for measuring the returns to schooling.

The conclusion most often found in the mainstream literature is that the OLS estimates of the returns to schooling are significantly biased downwards and that estimating the returns to education with the IV method leads to significantly higher returns.


1.2 Overall research aim and research objectives

The main question that will be answered in this research is: how did the returns to education change in the Netherlands from 1994 to 2000? The main objective is to focus on the returns estimated by OLS regression and IV regression by using two different instruments and to see if these results are in line with the conclusions found in previous research.

1.3 Focus and value of this research

The focus of this research is to look at the changes in the returns to education in the Netherlands by applying two different research methods.

Research on the returns to education in the Netherlands is sparse. Hartog et al. (1993) estimated the returns to education for the period from 1962 to 1989, and Levin and Plug (1999) and Plug (2001) estimated the returns to education for the Netherlands in 1994, but more recent research seems to be non-existent.

Plug (2001) used the instruments most often applied in the existing literature to address the problems with the OLS estimates mentioned earlier. Plug (2001) states that three instruments seem to be most important in previous research: the season of birth variable, family background variables and changes in compulsory schooling laws. To be able to answer the research question, this research follows the approach of Plug (2001), although changes in compulsory schooling laws will not be used as an instrument. The main objective of Plug (2001) is to see whether the results from previous research conducted in other countries also apply to Dutch data. Plug (2001) states that if the same methods applied to Dutch data lead to the same conclusions as found in other studies, we can take the IV estimates more seriously.

As Plug (2001) focused on the returns to education in 1994, this research focuses on the change in the returns to education in the period from 1994 to 2000. To measure this change, the returns to education in 2000 will be estimated to see whether the rate of return increased, remained stable or decreased. This comparison is possible because this research uses the same data source as Plug’s (2001) study. Unfortunately, it is not possible to conduct the research on a more recent time period, since the surveys conducted after 2000 contain no data on the family background variables. This would make it impossible to compare the OLS estimates to the IV estimates that use family background variables as instruments.

To estimate the change in the returns to education in this period, the two approaches applied in previous research will be used: OLS regression and IV regression. Both methods are performed because this helps to answer the question whether, in the Netherlands in 2000 as well, the OLS estimates are biased downwards in comparison to the IV estimates of the returns to schooling. The season of birth variable is used as the first instrument for the IV estimates. Secondly, the family background variables are applied as an instrument, since Plug (2001) mentions these variables as the only instrument that fulfills the criteria for a suitable instrument. The three most important criteria for a reliable instrument are quality, validity and relevance. The meaning of these criteria will be explained later in this research.

As mentioned before, there is not much research on the returns to education in the Netherlands. This research therefore adds value by measuring the returns to education in a time period that has not been explored yet.

1.4 Structure of this paper

In the next section, previous research on the returns to education in various countries will be described. The focus lies on the criticism of the OLS estimates and the alternatives that have been proposed to address this criticism.

The research methodology that is used to estimate the returns to education in the Netherlands and the data necessary to conduct the research will be explained further in the third section.

The fourth section consists of the findings, including an introduction of the empirical methodology and a conclusion about the results.

2. Literature review and previous research on the returns to education

2.1 Estimating the returns to education with OLS regression

From past literature, including Hartog et al. (1993), it follows that there seems to be a positive relationship between years of schooling and wages when estimating the returns to education with OLS regression.

Hartog et al. (1993) were among the first to conduct an OLS regression on the returns to education in the Netherlands, covering the period from 1962 to 1989. They observed that returns dropped over time: in 1962 the returns were 11%, compared to 5% in 1985. The returns remained stable in the period from 1985 to 1989. Plug (2001) finds an OLS estimate of the returns to education in 1994 of 3.4% for the male population.

However, many researchers, including Uusitalo (1997) and Leigh and Ryan (2008), seem to agree with the main reasons mentioned by Plug (2001) why the schooling parameter estimated by OLS does not reveal the exact returns to education. First, the OLS estimates could be biased because schooling is an endogenous variable, which means that the schooling variable is correlated with the error term. The main reason for this endogeneity is that ability might affect both earnings and years of schooling directly, while information on ability is usually unobserved or unavailable. Ability is then an omitted variable, which leads to bias in the OLS estimates. Second, the schooling measure itself is subject to measurement error.

2.1.1 Endogenous schooling variable

The number of years of schooling is one of the explanatory variables in the earnings function of an individual. However, the number of years an individual decides to go to school might also be influenced by the ability characteristics of that individual (Kalwij, 1996). This ability, which includes for example intelligence and motivation, is also likely to influence the earnings of an individual when controlling for schooling. This means that, if we cannot observe ability, schooling might be an endogenous variable, which would lead to upward biased OLS estimates. In other words, we overestimate the returns to one more year of education when it is easier for high-ability individuals to undertake education (Leigh and Ryan, 2008). On the other hand, the OLS estimates may also be biased downwards when low-ability people compensate by finishing more years of education, or because of errors in the measurement of the years of schooling.
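
The two directions of bias described above can be illustrated with a small simulation. The sketch below is not based on the survey data used in this research: all parameter values (a true return of 5%, the effect of ability on schooling and wages, the size of the reporting error) and the function names are assumptions chosen purely for illustration.

```python
import random


def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_bias(n=20000, true_return=0.05, seed=0):
    """Return OLS slopes under omitted ability and under measurement error."""
    rng = random.Random(seed)
    ability = [rng.gauss(0, 1) for _ in range(n)]
    # Assumption: high-ability individuals obtain more schooling.
    schooling = [12 + a + rng.gauss(0, 2) for a in ability]

    # Case 1: ability raises log wages directly but is left out of the regression.
    wage_ability = [2 + true_return * s + 0.10 * a + rng.gauss(0, 0.3)
                    for s, a in zip(schooling, ability)]
    omitted = ols_slope(schooling, wage_ability)

    # Case 2: no ability effect, but schooling is reported with random error.
    wage_plain = [2 + true_return * s + rng.gauss(0, 0.3) for s in schooling]
    reported = [s + rng.gauss(0, 2) for s in schooling]
    attenuated = ols_slope(reported, wage_plain)
    return omitted, attenuated


omitted, attenuated = simulate_bias()
print(f"true return 0.050, omitted ability {omitted:.3f}, "
      f"measurement error {attenuated:.3f}")
```

In this stylized setting the first slope lies above the true return and the second below it, which matches the argument that the sign of the overall OLS bias is not clear in advance.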

2.2 Estimating the returns to education with IV regression

To address the bias in the OLS estimates of the returns to schooling mentioned above, several studies have looked for alternative ways to estimate more reliable returns to schooling. It seems surprising that the conclusion most often found in the literature is that the OLS estimates are significantly biased downwards compared to alternative estimators that account for the endogeneity of the schooling variable. Upward biased OLS estimates would be expected because information on ability is missing and ability also influences the earnings of an individual, as explained by Kalwij (1996). One of the main objectives of this research is therefore to see whether the OLS estimates are also biased downwards when the methods of previous research are applied to Dutch data from 2000. The alternative to the potentially biased OLS estimates that is used most in previous literature is the Instrumental Variables approach.

2.2.1 The reasoning behind IV regression

If the independent variable, the years of schooling, is correlated with the error term in the earnings function, the independent variable is endogenous. Consequently, the OLS estimates are inconsistent, that is, they may not be close to the true value of the regression coefficient even when the sample size is very large (Stock and Watson, 2012). The correlation between the independent variable and the error term may result from different sources, including omitted variables, measurement errors in the variables and simultaneous causality. In the case of estimating the returns to education, it seems that ability is the omitted variable. However, since ability is usually unobservable, we cannot include it in a multiple regression to obtain more reliable estimates. This is the point where IV regression can be used to obtain a consistent estimator of the unknown coefficients of the earnings function. IV regression uses additional variables, the instruments, to isolate the part of the independent variable that is uncorrelated with the error term, which in turn permits consistent estimation of the regression coefficients, in this case the rate of return to education.
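
The idea of isolating the part of schooling that is uncorrelated with the error term can be sketched as a two-stage procedure with a single instrument. The code below is a minimal illustration on simulated data, not the estimation carried out in this research: the binary instrument `z`, the parameter values and the function names are all hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(x, y):
    """Sample covariance (divided by n) of two equally long lists."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def ols(x, y):
    """Intercept and slope of a simple regression of y on x."""
    slope = cov(x, y) / cov(x, x)
    return mean(y) - slope * mean(x), slope


def two_stage_least_squares(z, s, y):
    """2SLS slope with one instrument z for one endogenous regressor s."""
    # First stage: keep only the variation in schooling explained by z.
    a1, b1 = ols(z, s)
    s_hat = [a1 + b1 * zi for zi in z]
    # Second stage: regress log wages on the fitted schooling values.
    _, beta_iv = ols(s_hat, y)
    return beta_iv


def simulate(n=20000, true_return=0.05, seed=1):
    """Simulated data in which unobserved ability drives schooling and wages."""
    rng = random.Random(seed)
    ability = [rng.gauss(0, 1) for _ in range(n)]
    # Hypothetical binary instrument that shifts schooling but not wages.
    z = [float(rng.random() < 0.5) for _ in range(n)]
    s = [12 + 2 * zi + a + rng.gauss(0, 2) for zi, a in zip(z, ability)]
    y = [2 + true_return * si + 0.10 * a + rng.gauss(0, 0.3)
         for si, a in zip(s, ability)]
    return z, s, y


z, s, y = simulate()
print(f"OLS {ols(s, y)[1]:.3f}, IV {two_stage_least_squares(z, s, y):.3f}")
```

With one instrument, the two-stage slope equals the ratio `cov(z, y) / cov(z, s)`, so the instrument only identifies the return through the variation in schooling that it induces.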


2.2.2 Caution related to IV regression

There are, however, important cautions to keep in mind when estimating the returns to education with the IV method. First, IV regression produces estimates with larger standard errors and wider confidence intervals than OLS. This is the main reason why researchers prefer the OLS estimates over the IV estimates if it is likely that there is no bias in the OLS estimates.

This is also one of the three criteria mentioned by Plug (2001) to decide whether an instrument produces more reliable estimates than its OLS counterpart. If all three criteria mentioned below are satisfied, we are able to conclude that the IV estimates are more reliable than the OLS estimates. A second point of caution relates to over-identification, which occurs when there are more instruments than endogenous variables. In general, the more instruments are used, the higher the quality of the instrument set. In other words, adding instruments increases the correlation between the instrument set and the years of schooling, which is assessed with the first-stage F-statistic. However, a problem related to over-identification is that using more than one instrument increases the small-sample bias of the IV estimates. It is therefore best to have as few instruments as possible and to make sure that they are strongly correlated with the endogenous variable.

2.2.3 Relevance of the instrument

If instruments are used for the schooling variable in an IV regression in the expectation that this produces more reliable estimates than OLS, it must first be established that it is relevant to estimate the returns to education by IV regression. In other words, we need to be sure that there is a significant bias in the OLS estimates, such that it is worth risking an estimate with larger standard errors and wider confidence intervals. The endogeneity of the schooling variable can be tested with a Hausman test. It is, however, important to note that this criterion is mentioned by Plug (2001) but not by many other researchers, who mainly focus on the following two criteria.
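
For a single schooling coefficient, the Hausman test is commonly written as below. The notation is an assumption for illustration and is not taken from Plug (2001); $\hat\beta_{OLS}$ and $\hat\beta_{IV}$ denote the two estimates of the return to schooling.

```latex
H = \frac{\left(\hat\beta_{IV} - \hat\beta_{OLS}\right)^2}
         {\widehat{\mathrm{Var}}(\hat\beta_{IV}) - \widehat{\mathrm{Var}}(\hat\beta_{OLS})}
    \sim \chi^2(1) \quad \text{under } H_0
```

Under the null hypothesis that schooling is exogenous, both estimators are consistent and OLS is efficient, so a large value of $H$ suggests that the OLS and IV estimates differ by more than sampling error alone would explain.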

2.2.4 Quality of the instrument

The second criterion mentioned by Plug (2001) is the quality of an instrument. This criterion requires that the correlation between the instrument and the schooling variable is not too weak. An instrument is called weak if it explains little of the variation in the schooling variable. The problem with a weak instrument is that the normal distribution provides a poor approximation of the sampling distribution of the IV estimator, even if the sample size is large. If the instruments are weak, the IV regression is no longer reliable because the IV estimator can be badly biased in the direction of the OLS estimator. Moreover, 95% confidence intervals constructed for the IV estimator can contain the true value of the coefficient far less than 95% of the time. Bound et al. (1995) mention that a weak correlation between the instruments and schooling can result in IV estimates that are even more biased than the OLS estimates. To test whether this criterion is satisfied, Bound et al. (1995) propose an F-test on the instruments that are excluded from the earnings equation, which indicates whether the instruments are statistically significant.
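
A first-stage F-statistic for a single instrument can be computed as in the sketch below. The data are simulated, and the instrument strengths, sample size and function names are hypothetical. The threshold of 10 mentioned in the comment is a commonly cited rule of thumb and is not taken from the studies discussed here.

```python
import random


def first_stage_f(z, s):
    """F-statistic for a single instrument z in the regression of s on z."""
    n = len(z)
    mean_z, mean_s = sum(z) / n, sum(s) / n
    szz = sum((a - mean_z) ** 2 for a in z)
    sss = sum((b - mean_s) ** 2 for b in s)
    szs = sum((a - mean_z) * (b - mean_s) for a, b in zip(z, s))
    r_squared = szs ** 2 / (szz * sss)
    # With one instrument, F equals the squared t-statistic of its coefficient.
    return r_squared / (1 - r_squared) * (n - 2)


def simulate_schooling(strength, n=5000, seed=2):
    """Schooling that depends on a binary instrument with the given strength."""
    rng = random.Random(seed)
    z = [float(rng.random() < 0.5) for _ in range(n)]
    s = [12 + strength * zi + rng.gauss(0, 2) for zi in z]
    return z, s


strong_f = first_stage_f(*simulate_schooling(strength=2.0))
weak_f = first_stage_f(*simulate_schooling(strength=0.02))
# Rule of thumb (assumption): instruments with F below about 10 are treated as weak.
print(f"strong instrument F = {strong_f:.1f}, weak instrument F = {weak_f:.1f}")
```

With several instruments the same logic applies, but the F-test is then a joint test on all excluded instruments in the schooling equation.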

2.2.5 Validity of the instrument

The third and last criterion is the validity of an instrument. This criterion means that the instrument only affects earnings through schooling and is thus uncorrelated with the error term. If the instrument is not exogenous, and thus correlated with the error term, the IV regression fails to provide a consistent estimator. To test whether the instrument is valid, Plug (2001) proposes Sargan’s over-identification test.
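
A Sargan-type statistic can be sketched as follows for one endogenous regressor and two instruments: the IV residuals are regressed on the instruments, and the sample size times the R-squared of that regression is compared with a chi-squared distribution with one degree of freedom. The simulated instruments `z1` and `z2`, the parameter values and the helper functions are hypothetical and only serve to illustrate the mechanics.

```python
import random


def ols(columns, y):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Normal equations A b = c, solved by Gaussian elimination with pivoting.
    A = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    c = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        c[p], c[pivot] = c[pivot], c[p]
        for r in range(p + 1, k):
            factor = A[r][p] / A[p][p]
            for q in range(p, k):
                A[r][q] -= factor * A[p][q]
            c[r] -= factor * c[p]
    b = [0.0] * k
    for p in reversed(range(k)):
        b[p] = (c[p] - sum(A[p][q] * b[q] for q in range(p + 1, k))) / A[p][p]
    return b


def fitted(coefs, columns):
    """Fitted values for coefficients returned by ols()."""
    return [coefs[0] + sum(b * col[i] for b, col in zip(coefs[1:], columns))
            for i in range(len(columns[0]))]


def sargan(z1, z2, s, y):
    """Sargan statistic n * R^2 with two instruments and one endogenous regressor."""
    s_hat = fitted(ols([z1, z2], s), [z1, z2])       # first stage
    alpha, beta = ols([s_hat], y)                    # second stage
    # IV residuals are computed with actual schooling, not fitted schooling.
    resid = [yi - alpha - beta * si for yi, si in zip(y, s)]
    explained = fitted(ols([z1, z2], resid), [z1, z2])
    mean_r = sum(resid) / len(resid)
    ess = sum((e - mean_r) ** 2 for e in explained)
    tss = sum((r - mean_r) ** 2 for r in resid)
    return len(resid) * ess / tss


def simulate(direct_effect, n=5000, seed=3):
    """direct_effect > 0 lets z2 affect wages directly, so z2 is invalid."""
    rng = random.Random(seed)
    z1 = [rng.gauss(0, 1) for _ in range(n)]
    z2 = [rng.gauss(0, 1) for _ in range(n)]
    ability = [rng.gauss(0, 1) for _ in range(n)]
    s = [12 + 0.8 * a + 0.8 * b + c + rng.gauss(0, 1.5)
         for a, b, c in zip(z1, z2, ability)]
    y = [2 + 0.05 * si + 0.10 * c + direct_effect * b + rng.gauss(0, 0.3)
         for si, b, c in zip(s, z2, ability)]
    return z1, z2, s, y


CRITICAL_5_PERCENT = 3.841  # chi-squared critical value, 1 degree of freedom
valid_stat = sargan(*simulate(direct_effect=0.0))
invalid_stat = sargan(*simulate(direct_effect=0.10))
print(f"valid instruments: {valid_stat:.2f}, invalid instrument: {invalid_stat:.2f}")
```

The test needs more instruments than endogenous variables, which is why two instruments are simulated here. A statistic above the critical value indicates that at least one instrument is correlated with the error term, without showing which one.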

2.3 Instruments used in previous studies

Levin and Plug (1999) focus on the three instruments most used in previous literature: the season of birth variable, family background variables and changes in compulsory schooling laws. Levin and Plug (1999) discuss whether the results of estimating the returns to education with these instruments in previous studies are comparable to their results based on Dutch data. As mentioned before, there is not much research on the returns to education in the Netherlands, which means that most results from other studies are based on foreign data.

2.3.1 The season of birth instrument

Angrist and Krueger (1991) were among the first to address the possible bias in the OLS estimates of the returns to education. They tried to measure more reliable returns to education in the United States by using an individual's season of birth as an instrument to address the endogeneity of the schooling variable. They assume that season of birth is related to educational attainment because of school start age policies and compulsory school attendance laws. In other words, there is a relation between date of birth and years of schooling, since students born in different months of the year start school at different ages and the laws on the minimum school leaving age require students to attend school until they reach a specified birthday. This means that children who are born earlier in the year are older when they start school than children who are born later in the year. Consequently, the children who start school at an older age reach the minimum school leaving age after fewer years of education than the children born later in the year. Angrist and Krueger (1991) reach the same conclusion: children born earlier in the year have less schooling on average because they reach the school-leaving age earlier than individuals born later in the year. However, they do not find a significant difference between their OLS estimate of 0.072 and their IV estimate of 0.080. It is difficult to say whether this means that it is not relevant to estimate the returns to education by IV regression with season of birth as an instrument. In fact, the absence of a difference between the OLS and IV estimates could not have been established without running the IV regression.

Plug (2001) tried to replicate these results by applying the research method of Angrist and Krueger (1991) to Dutch data. However, Plug (2001) found a different mechanism: season of birth influences school performance not through compulsory schooling and the minimum school leaving age but through relative age effects. He found that people born later in the year, after the first of October, have a schooling career that is almost one year longer than people born earlier in the year. Here too, there seems to be no large difference between the OLS and IV estimates: the OLS estimate found by Plug (2001) is 3.4%, and the IV estimate with season of birth as an instrument is almost the same.

Leigh and Ryan (2008) implemented the same research method on Australian data to find more reliable returns to education in Australia, using season of birth as an instrument. Their OLS estimate of the return to one additional year of schooling, based on annual pre-tax income, is 13%, but the IV approach yields an estimate of 8%, which suggests that their OLS estimates are strongly biased upwards. These results contrast with those of Angrist and Krueger (1991), Levin and Plug (1999) and Plug (2001), who conclude that there is no significant difference between the OLS estimates and the IV estimates when season of birth is used as an instrument. They also contradict the general finding that OLS estimates are biased downwards.

In summary, three of the four studies mentioned find no significant difference between the two estimation methods. One study, that of Leigh and Ryan (2008), finds upward biased OLS estimates. These differences between the four studies make it even more relevant to measure the returns to education with IV regression.

2.3.1.1 Quality and validity check

Assuming that it is relevant to estimate the IV returns to education, it is also necessary to make sure that the instrument satisfies the quality and validity conditions before drawing conclusions about the reliability of the results. In other words, the season of birth variable needs to be correlated with the schooling variable but uncorrelated with the error term. The season of birth instrument seems to be correlated with the schooling variable in the sense that people born just before the school entry date spend one more year in school than children born just after the school entry date. The validity condition needs to be handled with more care. This condition might be violated when the season of birth instrument affects earnings not only through the quantity of schooling but also through the quality of schooling. If earnings are also affected through the quality of schooling, this implies that children learn more from being the youngest or oldest in the class. Plug (2001) found that the season in which an individual is born influences school performance: people born earlier in the year are older than their classmates and thus more mature, on average get better marks, have fewer difficulties in school and often achieve a higher level of education. If the effect found by Plug (2001) is real, the instrument is not valid. Although Angrist and Krueger (1991) state that they found enough evidence that season of birth has no effect on earnings other than through schooling, Levin and Plug (1999) find strong evidence for an influence of season of birth on earnings by performing the test of over-identifying restrictions. Since this research mainly follows the results found by Plug (2001), it is concluded from the literature review that season of birth is not a valid instrument.

2.3.2 The family background instrument

After the attempt of Angrist and Krueger (1991), several other instruments have been used to address the endogeneity problems in the OLS estimates. One of the most important is the family background variable. This instrument is used by, among others, Uusitalo (1999), who estimated the returns to education in Finland. Using data on the profession, income, education and socioeconomic status of the parents, he found an OLS estimate of 9% but an IV estimate of 11-13%. In other words, the IV estimates are approximately 60% higher than the OLS estimates, which is a significant difference.

The family background variable is also used by Plug (2001) and Levin and Plug (1999). They show that the level of the father’s job and education have significant effects on the educational attainment of an individual. The OLS estimates found by Plug (2001) are 3.4% for males and 4% for females, whereas the IV approach with family background variables as instruments yields estimated returns to education of 5.1% for men and 4.7% for women. According to Levin and Plug (1999), the use of family background instruments in an IV setting significantly increases the estimated returns to education, from 3.4% for OLS to 5% for IV. The conclusion drawn from these results is the same as that of Uusitalo (1999), namely that the OLS estimates are significantly biased downwards compared to IV estimates that use family background as an instrument. This implies that the relevance criterion for a reliable instrument is satisfied and that, based on this criterion, there is enough reason to estimate the returns to education with the IV approach.

2.3.2.1 Quality and validity check

It is also important to verify whether the family background instruments satisfy the other two criteria. There seems to be a correlation between the parents’ education and job level and the years of schooling. Levin and Plug (1999) and Uusitalo (1999) demonstrated this correlation between the instrument and the schooling variable by performing an F-test of joint significance. In other words, they found that the family background variable is a strong instrument. According to Levin and Plug (1999), the validity condition is also satisfied, as shown by the results of their Sargan over-identification test: they find that the family background instrument is not correlated with the error term. From these results, it can be concluded that family background is a reliable instrument.


2.3.3 The changes in compulsory schooling laws instrument

In contrast to the season of birth and family background instruments, a change in compulsory schooling laws is a natural experiment that is outside the control of the researcher. Because of the randomization in this variable, we can assume that it is uncorrelated with the error term. Without a natural experiment, there is no guarantee that the instrument is uncorrelated with the error term, and it might be endogenous. One of the main objectives of this research is the comparison between the OLS estimates and the IV estimates of the returns to education using season of birth and family background variables as instruments. For this reason, not much attention is paid to changes in compulsory schooling laws as an instrument. However, since it is mentioned as one of the most used instruments in previous literature, it is worth giving a brief description of the instrument and how it is used in previous research.

The instrument is used by, among others, Leigh and Ryan (2008) to estimate more reliable returns to education in Australia, and by Harmon and Walker (1995), who applied the same research method to data from the United Kingdom. Leigh and Ryan (2008) found an OLS estimate of a 13% return to an additional year of schooling, compared to an IV estimate of 12%. In contrast to this upward biased OLS estimate, Harmon and Walker (1995) find downward biased OLS estimated returns of 6% and remarkably higher IV estimates of 15%. The two studies thus contradict each other on the direction of the bias. Because of this difference in the direction of the bias and the significant difference between the OLS and IV estimates, it seems relevant to estimate the returns to education by applying the IV method with changes in compulsory schooling laws as an instrument. The endogeneity of the schooling parameter, and hence the satisfaction of the relevance criterion, is shown by Harmon and Walker (1995), who performed a Hausman test.

2.3.3.1 Quality and validity check

Levin and Plug (1999) show that the increase in the minimum school leaving age is correlated with schooling by performing an F-test of joint significance, so that the quality condition is satisfied. They also state that changes in compulsory schooling laws have no direct effect on earnings, and thereby assume that the instrument is uncorrelated with the error term.


3. Methodology

3.1 Research strategy

According to the findings in the existing literature, the returns to education estimated by OLS may not be accurate because of endogeneity bias and measurement error in the schooling variable. To tackle these problems, several researchers have estimated the returns to education by applying an IV regression to the earnings function. The general view in previous studies is that the OLS estimates of the returns to education are biased downwards.

To see whether the OLS estimates of the returns to education in the Netherlands are also biased downwards in the period from 1994 to 2000 in comparison with the IV estimates, the research methodology consists of these two regression methods. The first step is to estimate the return to an extra year of schooling by applying OLS regression to the traditional Mincerian equation. The second step is to produce IV estimates using two different instruments for the schooling variable: season of birth and family background.

3.1.1 IV estimates and the season of birth instrument

According to the studies of Angrist and Krueger (1991) and Plug (2001), it seems relevant to measure the returns to education in the United States and the Netherlands using season of birth as an instrument. They find no significant difference between the OLS and IV estimates, but this can only be established by running the IV regression.

This research verifies whether these results also apply to the returns to education in the Netherlands in 2000. In addition to this relevance check, this research examines whether the criteria of quality and validity are satisfied, since Plug (2001) states that the season of birth instrument is not valid because it is correlated with the error term. Angrist and Krueger (1991), however, state that they found enough evidence that season of birth has no effect on earnings other than through schooling and is thus not correlated with the error term. Since these views conflict, it is interesting to see whether the validity condition is satisfied when tested on the Dutch data.

3.1.2 IV estimates and the family background instrument

Uusitalo (1999) and Plug (2001) both show the relevance of estimating the returns to education in Finland and the Netherlands with the IV approach when using the family background instrument, since they find highly significant differences between the OLS and IV estimates. Levin and Plug (1999), as well as Uusitalo (1999), show that the family background instruments satisfy all three criteria for reliable IV estimates: family background is correlated with the schooling variable but not with the error term. More importantly, and one of the fundamental bases for this research, Levin and Plug (1999) state that the family background instrument, namely the parental education level and job level, performs best on the three criteria in comparison with season of birth or changes in compulsory schooling laws as instruments.

The family background variables, the education- and job level of the father, are therefore used as a second instrument.

3.2 Data collection

The data used for running the regressions are available from the OSA labor market survey. This survey includes data on school duration, earnings, family background and month of birth. These datasets are also used by Plug (2001) and Levin and Plug (1999). The OSA began conducting the survey in 1985 and every survey contains between 4000 and 5000 respondents. The main focus of the survey is to obtain more information on the labor market situation of the Dutch population, including individuals from 16 to 66 years old. The survey is conducted every second year and includes respondents from previous years as well as new respondents. For example, the 2000 data set includes 320 respondents who filled in the survey for the first time in 1985. In addition, 1798 new respondents are included in the 2000 survey.

The dataset from 2000 will be the main research object. However, since not every survey year includes information on the variables needed for this research, the datasets from previous years will be used to fill up the missing data on family background variables. It is possible to fit all variables into one dataset because of the identification numbers which are assigned to the respondents and households. The variables that will be used in this research are the month of birth variable, the variables that give information on the years of schooling, the variables which include information on the job- and education level of the father and the variable including the net hourly wage. As control variables, the age and the square of the age are used. A short explanation on the main variables and the treatment of them in the final data set is given in the next section.

3.2.1 Season of birth variable

The season of birth variable is easily derived from the variable ‘month of birth’ in the existing data set. Since this variable is available in the 2000 survey, no missing values are reported. However, since it is especially interesting to measure the effects of the season of birth instead of the month of birth on the years of schooling, such that it is possible to keep them comparable to previous studies, dummy variables are created for three different seasons. A respondent is included in the first dummy when he is born in the last three months of the year, included in the second dummy when he is born in the first three months of the year and included in the third dummy when he is born in April, May or June. Individuals born in July, August or September are used as a reference point.
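The dummy coding described above can be sketched as follows. This is a minimal illustration, not the code used in this research; the function name `season_dummies` and the integer coding of the month (1 to 12) are assumptions.

```python
def season_dummies(month_of_birth):
    """Return the dummies (q1, q2, q3) for a month of birth coded 1..12.

    July-September is the reference season, so it maps to (0, 0, 0).
    The function name and the 1..12 coding are illustrative assumptions.
    """
    if not 1 <= month_of_birth <= 12:
        raise ValueError("month_of_birth must be between 1 and 12")
    q1 = int(month_of_birth in (10, 11, 12))  # October-December
    q2 = int(month_of_birth in (1, 2, 3))     # January-March
    q3 = int(month_of_birth in (4, 5, 6))     # April-June
    return q1, q2, q3


print(season_dummies(11))  # a November birth falls in the first dummy
print(season_dummies(8))   # an August birth falls in the reference season
```

Leaving one season out as the reference avoids perfect collinearity with the intercept in the first-stage regression.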

3.2.2 Years of schooling variable

The years of schooling variable is constructed by taking the years of schooling after primary school.

The years of schooling are measured in years of schooling completed, with or without a degree, and only the years of education after primary school are counted. The years of education are only taken into account when it concerns full-time education. The years of schooling are measured by subtracting the starting year from the ending year, but only if the individual finished this education. Furthermore, education obtained abroad is excluded. From 1985 to 1990, there is insufficient data on the years of schooling. However, from 1994 onwards, information on the years of schooling is available. The data from these datasets is used to fill up the missing values for the respondents included in the 1985 to 1990 surveys.

3.2.3 Family background variables

The OSA survey contains rich information on the family background variables. There is information available on the highest education obtained by the father when the respondent was twelve years old. The education levels are divided into four levels: education unknown, low education, intermediate education and high education. The same holds for the job level of the father. However, this variable is more difficult to divide into the four levels, since the job levels are coded based on the classifications of the SBC (standard job classifications) from Statistics Netherlands. These classifications are not ranked by the level of the profession, which made it necessary to use my own judgment to organize them into job level unknown, low job level, intermediate job level and high job level. Since only the 1985, 1988, 1996, 1998 and 2000 datasets contain full information on the job and education level and the 1990 and 1992 datasets contain only information on the job level, all data sets are used to fill up the missing values.

3.2.4 Net hourly wage variable

The dependent variable is the log of the net hourly wage. The net wage per hour is calculated by using the net wage, the payment period and the hours worked per week as defined by the contract. From this net wage per hour variable, the log of the net wage per hour is created and this variable is used for the regression. In the creation of this variable, no distinction is made between people working under a fixed hours contract and people working under a part-time contract, for example freelancers. This could be a problem when people have fewer hours on their contract than the hours of work they actually perform. This could lead to higher weekly, monthly or yearly earnings and therefore a higher net hourly wage, because the earnings are divided by the hours on the contract when this variable is created. Instead of avoiding these incorrect wage measures by making the distinction between those two types of workers, people with an extremely high wage, higher than €400 per hour, are removed from the dataset. The same problem could result from respondents with more than one job. They could report a higher income, because of their second job, than is normal according to the hours on their contract, resulting in an incorrectly high wage per hour.
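The construction of the wage variable can be sketched as below. This is a hedged illustration only: the payment period labels, the conversion factors in `WEEKS_PER_PERIOD` and the function name are assumptions, since the coding in the OSA files is not described here. Only the €400 cut-off comes from the text.

```python
import math

# Assumed conversion from payment period to number of weeks (hypothetical coding).
WEEKS_PER_PERIOD = {"week": 1.0, "four_weeks": 4.0, "month": 52.0 / 12.0, "year": 52.0}
MAX_HOURLY_WAGE = 400.0  # observations above this hourly wage are dropped


def log_net_hourly_wage(net_wage, period, contract_hours_per_week):
    """Return the log net hourly wage, or None if the observation is dropped."""
    if net_wage <= 0 or contract_hours_per_week <= 0:
        return None  # the log wage is undefined for these observations
    weekly_wage = net_wage / WEEKS_PER_PERIOD[period]
    hourly_wage = weekly_wage / contract_hours_per_week
    if hourly_wage > MAX_HOURLY_WAGE:
        return None  # treated as an implausibly high wage
    return math.log(hourly_wage)


print(log_net_hourly_wage(400.0, "week", 40.0))    # log of 10 per hour
print(log_net_hourly_wage(50000.0, "week", 10.0))  # None: above the cut-off
```

Because the divisor is the contract hours, any unreported extra hours or second-job income inflate the computed hourly wage, which is the measurement problem discussed above.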

3.2.5 Respondents

The 2000 survey consists of 4185 respondents. To keep the research comparable to the study of Plug (2001), all respondents are removed who did not participate in the labor force or are unemployed. Secondly, the respondents who did not respond to the main variables mentioned above, are also removed from the survey. Finally, only the male participants are maintained in the survey to make the results more comparable to other studies. The reason why other studies mainly focused on the male population is because the careers of females are often disturbed by childbirth. In the end, 874 male respondents remain, which is less than the number of male respondents studied by Plug (2001).

3.3 Framework for data analysis

The data will be analyzed and the regressions will be run in the statistical programs STATA and SPSS. The latter is mainly used to fill up the 2000 dataset with all the missing values.

4. Findings

4.1 Brief introduction

The main objectives of this study are to estimate the returns to education in the Netherlands in 2000 and to see whether the rate of return increased or decreased since 1994. The two methods used to estimate the returns to education are the OLS regression model and the IV regression model. The mathematics of the application of these two regression models on the earnings function are described in this section. Furthermore, this section will provide the results of running the two regressions on the 2000 data set and the results of testing the quality and validity of the instruments used.

4.2 Empirical procedure

The first approach to estimate the returns to education is to estimate the simple Mincer equation, in which the wage is a function of the years of schooling, by applying the OLS regression model. This model is shortly described below.

4.2.1 OLS regression

A simple model of schooling and earnings can be expressed as follows:

$$\log Y_i = \beta_0 + \beta_1 S_i + \alpha X_i + u_i$$

where $S_i$ is the years of education, $Y_i$ is a measure of earnings and $X_i$ is a vector of control variables, such as age and age squared. $u_i$ is the error term. $\beta_1$ is the causal effect of education: it gives the expected percentage increase in earnings if a randomly selected member of the population were to receive an additional year of schooling. Although $\beta_1$ is described as the rate of return to an additional year of education, it is important to mention that this is only an estimate of the benefits of education, without subtracting the cost of education (in tuition fees and lost wages). The OLS estimates of $\beta_1$ are only consistent if $u_i$ and $S_i$ are uncorrelated. A correlation between them would mean that the unobserved variables affecting the years of schooling also affect the earnings of an individual. A solution for dealing with this possible correlation is to identify a set of variables that affect schooling but affect earnings only through schooling. These variables are then used to produce IV estimates.

4.2.2 IV Regression

We have the basic model in which the earnings of an individual depend on the years of education. However, as mentioned, there are unobserved factors that could influence the years of educational attainment as well as earnings, which makes the independent variable endogenous. This correlation between the years of education and the error term is probably due to the fact that ability characteristics are omitted from the earnings function. Consequently, this leads to inconsistent OLS estimates of the returns to education. Because it is not possible to include the omitted variable in the regression function, since ability is mostly unobserved or difficult to measure, the IV research method is applied. Two instruments are used that try to isolate the part of the independent variable that is uncorrelated with the error term from the part that is correlated with it. The next section describes the econometrics behind the IV regression in general and the IV regression applied to the above-mentioned instruments.

4.2.2.1 IV Regression in general

If the regression model is only based on one regressor, the explanatory variable, we can write the regression model as follows:

$$Y_i = \beta_0 + \beta_1 X_i + u_i, \quad i = 1, \dots, n$$

where $u_i$ is the error term representing the omitted factors that determine $Y_i$. If $X_i$ and $u_i$ are correlated, the OLS estimates are inconsistent. Instrumental variables estimation uses an additional, instrumental variable $Z$ to isolate that part of $X$ that is uncorrelated with $u_i$.

If the instrument $Z$ satisfies the two conditions of quality and validity, we can estimate the coefficient $\beta_1$ using an IV estimator.

The first stage is to separate the independent variable into two components: a component that may be correlated with the error term and another component that is uncorrelated with the error term. The regression related to the first stage is as follows:

$$X_i = \pi_0 + \pi_1 Z_i + v_i$$

where $\pi_0$ is the intercept, $\pi_1$ is the slope and $v_i$ is the error term. $\pi_0 + \pi_1 Z_i$ is the part of $X_i$ that can be predicted by $Z_i$ and is uncorrelated with the error term $u_i$. $v_i$ is the problematic component of $X_i$, that is, $v_i$ is correlated with the error term $u_i$. The idea behind this stage is to separate the problem-free component of $X_i$ from the error term $v_i$.

OLS is used to estimate $\hat{\pi}_0$ and $\hat{\pi}_1$ in the following model:

$$\hat{X}_i = \hat{\pi}_0 + \hat{\pi}_1 Z_i$$

In the second stage, $Y_i$ is regressed on $\hat{X}_i$. The result of this last stage is that we obtain IV estimators for $\beta_0$ and $\beta_1$.
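For the case of a single instrument, the two stages collapse into a ratio of covariances. The short derivation below is a standard textbook result added for illustration; the notation $s_{ZY}$ and $s_{ZX}$ for sample covariances is an assumption and does not appear elsewhere in this text.

```latex
\begin{align*}
\operatorname{Cov}(Z_i, Y_i)
  &= \beta_1 \operatorname{Cov}(Z_i, X_i) + \operatorname{Cov}(Z_i, u_i) \\
\beta_1
  &= \frac{\operatorname{Cov}(Z_i, Y_i)}{\operatorname{Cov}(Z_i, X_i)}
  \quad \text{if } \operatorname{Cov}(Z_i, u_i) = 0
  \text{ and } \operatorname{Cov}(Z_i, X_i) \neq 0 \\
\hat{\beta}_1^{IV}
  &= \frac{s_{ZY}}{s_{ZX}}
\end{align*}
```

The first condition used in the second line is instrument validity and the second is instrument quality. If the covariance between the instrument and the regressor is close to zero, the ratio becomes unstable, which is why weak instruments are a concern.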

4.2.2.1.1 Criteria for a reliable instrument

Before we apply IV it is essential to make sure that the three conditions for a reliable instrument hold. Even though statistical tools can be used to check whether the quality and validity conditions hold, expert judgment still plays a very important role. In practice, it is difficult to find instruments that satisfy the following three criteria:

(1) Instrument quality: $\operatorname{corr}(Z_i, X_i) \neq 0$. This criterion implies that the instrument is correlated with the independent variable.

(2) Instrument validity: $\operatorname{corr}(Z_i, u_i) = 0$. This criterion implies that the instrument is not correlated with the error term.

(3) Relevance for using an IV regression. This criterion is satisfied when the schooling variable is correlated with the error term.

4.2.2.2 IV regression using the season of birth instrument

In the first case, the IV regression is conducted by using season of birth as an instrument for the schooling variable. The simple earnings equation is stated as follows:

$$\log Y_i = \beta_0 + \beta_1 S_i + u_i$$

where $Y_i$ is the earnings of an individual, $\beta_0$ is the intercept, $S_i$ is the years of schooling, $\beta_1$ is the slope, which tells us how much the earnings increase when an individual obtains one more year of schooling, and $u_i$ is the error term.

The coefficients in the earnings equation can only be estimated by using IV regression if the coefficients are either exactly identified or over identified. The coefficients are said to be exactly identified when the number of instruments is equal to the number of endogenous variables. The coefficients are over identified when the number of instruments exceeds the number of endogenous variables. In this case the coefficients are over identified because we create three dummy variables for the season of birth. The fourth season of birth is used as a reference point.

The first stage of the IV estimates implies the following function:

$$S_i = \pi_0 + \pi_1 Q_1 + \pi_2 Q_2 + \pi_3 Q_3 + v_i$$

where $S_i$ is the years of schooling obtained, $\pi_0$ is the intercept and $\pi_1$, $\pi_2$ and $\pi_3$ are the coefficients that tell us how the length of schooling depends on the season of birth.

We can estimate the coefficients of this function by applying OLS. That is, we obtain the following function, in which $\hat{\pi}_0$, $\hat{\pi}_1$, $\hat{\pi}_2$ and $\hat{\pi}_3$ are OLS estimates.

$$\hat{S}_i = \hat{\pi}_0 + \hat{\pi}_1 Q_1 + \hat{\pi}_2 Q_2 + \hat{\pi}_3 Q_3$$

In the second stage, we regress $\log Y_i$ on $\hat{S}_i$, from which we obtain IV estimates for $\beta_1$ and $\beta_0$, and thereby an estimate of the returns to education.
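The two stages can be illustrated on simulated data. The sketch below does not use the OSA survey: the data generating process, the true return of 0.08 and all function names are assumptions chosen for illustration. It omits the age controls and uses the fact that, with only an intercept and season dummies in the first stage, the fitted schooling value equals the mean schooling within each season.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """OLS slope of y on x with an intercept: cov(x, y) / var(x)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, true_return=0.08, seed=0):
    """Synthetic data (not the OSA survey) with an omitted ability variable."""
    rng = random.Random(seed)
    season, schooling, log_wage = [], [], []
    for _ in range(n):
        q = rng.randrange(4)           # 0 = reference season, 1..3 = Q1..Q3
        ability = rng.gauss(0.0, 1.0)  # unobserved, raises schooling and wages
        s = 10.0 + 1.0 * q + ability + rng.gauss(0.0, 1.0)
        y = 1.0 + true_return * s + 0.5 * ability + rng.gauss(0.0, 0.3)
        season.append(q)
        schooling.append(s)
        log_wage.append(y)
    return season, schooling, log_wage


def two_stage_least_squares(season, schooling, log_wage):
    # First stage: fitted schooling is the mean schooling within each season.
    groups = {}
    for q, s in zip(season, schooling):
        groups.setdefault(q, []).append(s)
    season_mean = {q: mean(v) for q, v in groups.items()}
    s_hat = [season_mean[q] for q in season]
    # Second stage: regress the log wage on the fitted schooling values.
    return slope(s_hat, log_wage)


season, schooling, log_wage = simulate()
beta_ols = slope(schooling, log_wage)
beta_iv = two_stage_least_squares(season, schooling, log_wage)
print(f"OLS: {beta_ols:.3f}  IV: {beta_iv:.3f}  (true value 0.08)")
```

In this simulated setup the omitted ability term pushes the OLS estimate above the true value, while the IV estimate stays close to it. The direction of the OLS bias in real data can differ, for example when measurement error in schooling dominates.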

4.2.2.3 IV regression using the family background instrument

In the second case, the IV regression is conducted by using family background as an instrument for the schooling variable. The simple earnings equation is again stated as follows:

$$\log Y_i = \beta_0 + \beta_1 S_i + u_i$$

where $Y_i$ is the earnings of an individual, $\beta_0$ is the intercept, $S_i$ is the years of schooling, $\beta_1$ is the coefficient that tells us how much the earnings increase when an individual obtains one more year of schooling, and $u_i$ is the error term.

This time, the job level and the education level of the father are used as instruments for schooling. More than one instrument is used to address the endogeneity of schooling, which means that the coefficients are over identified and that IV can be used to obtain estimates of the coefficients. Dummy variables are created to be able to use the family background as an instrument. For the job level as well as for the education level, three dummy variables are created: a dummy for education level unknown, one for an intermediate education level and one for a high education level. The low education level is used as the reference point. The same applies to the dummies corresponding to the job level.

In the first stage, the following function is estimated by applying OLS regression:

$$S_i = \pi_0 + \pi_1 Job_u + \pi_2 Job_i + \pi_3 Job_h + \pi_4 Educ_u + \pi_5 Educ_i + \pi_6 Educ_h + v_i$$

where $S_i$ is the years of schooling, $Educ_u$, $Educ_i$ and $Educ_h$ are the dummy variables for the education level of the father (unknown, intermediate and high), $Job_u$, $Job_i$ and $Job_h$ are the dummy variables for the job level of the father (unknown, intermediate and high) and $v_i$ is the error term. If we estimate this function with OLS, we get the following estimation of the coefficients.

$$\hat{S}_i = \hat{\pi}_0 + \hat{\pi}_1 Job_u + \hat{\pi}_2 Job_i + \hat{\pi}_3 Job_h + \hat{\pi}_4 Educ_u + \hat{\pi}_5 Educ_i + \hat{\pi}_6 Educ_h$$

where $\hat{\pi}_0, \hat{\pi}_1, \hat{\pi}_2, \hat{\pi}_3, \hat{\pi}_4, \hat{\pi}_5, \hat{\pi}_6$ are the OLS estimates.

In the second stage, we regress $\log Y_i$ on $\hat{S}_i$ to obtain consistent estimates of the return to education when we use family background as an instrument for the schooling parameter.

4.3 Results

The next step is to apply the mathematics of the two models above to the Dutch 2000 dataset. The results of this data analysis are found below.

4.3.1 Data

Table 1 - Descriptive statistics, means and standard deviations

| Variables | Means | Standard deviations |
|---|---|---|
| Family background | | |
| Education unknown | 0.224 | |
| Low education | 0.216 | |
| Intermediate education | 0.452 | |
| High education | 0.108 | |
| Job level unknown | 0.183 | |
| Low job level | 0.399 | |
| Intermediate job level | 0.256 | |
| High job level | 0.161 | |
| Season of birth | | |
| October – December | 0.229 | |
| January – March | 0.244 | |
| April – June | 0.264 | |
| July – September | 0.263 | |
| Education | | |
| Length of schooling | 6.983 | 3.735 |
| Other | | |
| Wage | 23.399 | 10.004 |
| Log wage | 3.075 | 0.429 |
| Age | 41.910 | 9.480 |
| N | 874 | |

Table 1 presents the descriptive statistics of the variables used for the male population. It shows the average years of schooling, the average net hourly wage, the average age of the respondents, the percentage of respondents born in each season and the descriptive statistics of the family background variables. For example, the average wage earned is 23.399 guilder per hour, the average length of education is 6.983 years and the average age of the respondents is 41.91 years.

These results differ from the descriptive statistics and means produced by Plug (2001). Plug found that the average years of schooling in 1994 for males is 5.375. From 1994 to 2000, the average years of education an individual obtained therefore increased by more than 1.5 years. This increase might result from people truly increasing their years of schooling in this period, but it might also come from a difference in how the years of schooling are measured. For example, in this research all years of education the respondent obtained are used, without requiring that the individual obtained a degree as well. However, some of the respondents have a total of 20 years of education. On the one hand, this might seem a bit unrealistic, and the question is whether a more reliable view of the years of education is obtained if we exclude all respondents with more than a certain number of years of schooling. On the other hand, it is difficult to set this boundary. Besides, even when the years of schooling are counted only when the individual obtained a degree, the average years of schooling changes only by a very small amount. Therefore all the years of schooling obtained by the individual are counted, regardless of whether a degree was received. This is in line with the assumption that an individual who obtains three more years of education after high school, but does not obtain a degree, should earn more than an individual who drops out immediately after high school.

As the results also show, almost half of all the respondents have a father who completed an intermediate level of education and the largest part of the population has a father with a low job level.

4.3.2 Effect of family background on schooling

The first step in producing IV estimates of the returns to schooling is to measure the effects of the chosen instruments on the years of education. The chosen instruments are the family background variables and the season of birth of an individual. Table 2 shows the effects of the job- and education level of the father on the years of schooling.

As mentioned before, dummy variables are created for the two variables to indicate if the job- or education level is unknown, low, intermediate or high. As the table shows, individuals with a father who has a high job level attain on average 1.794 more years of education than individuals with a father who has a low job level. The same applies to an individual with a highly educated father, but here the difference is even larger: they obtain on average 2.740 more years of education than individuals with a low educated father. It is notable that the coefficient on the dummy for education unknown is not significantly different from zero and that the coefficient on the dummy for job level unknown is significant only at the 10% level. However, because both variables are correlated with the years of schooling, it might be more dangerous to exclude these dummies. If we exclude the education unknown and job level unknown dummies from our regression model, they would be included in the error term, and because these two dummies are also correlated with the other dummies and the years of schooling variable, the other coefficients on the explanatory variables would be biased.

Age and age squared are used as control variables. The reason why the age variable is used instead of controlling for experience is the same as the reason Plug (2001) mentioned: “We prefer the age variable over experience because of the possible mismeasurement of experience variables derived from education variables that are measured with error (Plug, 2001, p.525)”.

Table 2 - OLS estimates of the schooling function, family background

| Years of schooling | Coefficient | Standard deviation | P-value | Pearson correlation |
|---|---|---|---|---|
| Constant | -0.485 | 1.838 | 0.792 | |
| Family background variables | | | | |
| Education unknown | 0.458 | 0.368 | 0.213 | -0.053 |
| Education intermediate | 0.989*** | 0.320 | 0.002 | 0.025 |
| Education high | 2.740*** | 0.495 | 0.000 | 0.242 |
| Job level unknown | 0.567* | 0.345 | 0.100 | 0.010 |
| Job level intermediate | 0.717** | 0.305 | 0.019 | 0.014 |
| Job level high | 1.794*** | 0.388 | 0.000 | 0.223 |
| Controls | | | | |
| Age | 0.342*** | 0.090 | 0.000 | -0.095 |
| Age squared | -0.448*** | 0.108 | 0.000 | -0.112 |
| Adjusted R-square | 0.112 | | | |
| N | 874 | | | |

Pearson correlation: correlation between the explanatory variables and the dependent variable.
*Significance at 10% level, **significance at 5% level, ***significance at 1% level.

The adjusted R-square is the percentage of the variance in the dependent variable, explained by the independent variables. In other words, it tells us if the explanatory variables are good at predicting the values of the dependent variable. The closer the value of the adjusted R-square is to 1, the better the variables are in predicting the value of the dependent variable. Because the adjusted R-square value found is 0.112, this means that the family background variables explain 11.2% of the variance in the values of the years of schooling variable. However, because of the high correlation between the dependent and the independent variable and the significance of almost all the variables, the low adjusted R-square does not have to be a problem.

4.3.3 Effect of season of birth on schooling

The above step is the first step in estimating the returns to education with IV regression. The same step is applied to the season of birth instrument. These results are shown in table 3.

These results show that individuals born at the end of the year obtain only 0.173 more years of education than individuals born in the months of July, August or September, and that this difference is not significant. This is much weaker than the result found by Plug (2001). He found that people born just after the cutoff date, that is, in the months of October, November or December, obtain almost one more year of education compared to people born just before the cutoff date, that is, in the months of July, August or September. The same conclusion is also found by Angrist and Krueger (1991), who state that people born earlier in the year are older when they start with education than people born later in the year and thereby have the possibility to leave school after fewer years of schooling.

Table 3 – OLS estimates of the schooling function, season of birth

| Years of education | Coefficient | Standard deviation | P-value | Pearson correlation |
|---|---|---|---|---|
| Constant | 1.492 | 1.901 | -2.239 | |
| Season of birth variables | | | | |
| Q1 – October – December | 0.173 | 0.358 | 0.629 | 0.001 |
| Q2 – January – March | 0.282 | 0.352 | 0.422 | 0.011 |
| Q3 – April – June | 0.348 | 0.345 | 0.314 | 0.024 |
| Controls | | | | |
| Age | 0.316*** | 0.094 | 0.001 | -0.095 |
| Age squared | -0.432*** | -0.906 | 0.000 | -0.112 |
| Adjusted R-Square | 0.021 | | | |
| N | 874 | | | |

Pearson correlation: correlation between the explanatory variables and the dependent variable.
*Significance at 10% level, **significance at 5% level, ***significance at 1% level.

This conclusion by Angrist and Krueger (1991) holds because of compulsory schooling laws and a minimum school leaving age in the United States. In the Netherlands, however, a child can enter school at the age of 4, while school becomes compulsory when the child reaches the age of 5. It is common for a child to start primary school upon reaching the age of 4. Moreover, the child does not have to wait until the new school year starts, but can enter school immediately. The minimum school leaving age in the Netherlands is 16, but the child is required to finish the complete school year. This could explain why the results of Angrist and Krueger (1991) are not found in this research. Assuming that all children start at the age of four, they all obtain the same years of education when they reach the age of sixteen. The small differences found in table 3 arise because children are required to finish the school year in which they turn 16. It seems logical that an individual born in October has more months to complete than someone born in July when he turns 16, but this does not seem to explain the effects of the other seasons of birth on years of schooling.

Plug (2001) found the same results as Angrist and Krueger (1991), but attributes these results to a different explanation. The reason why people born later in the year obtain more years of education is the relative age effect. These people are older when they start school, are more mature and consequently receive better marks. In this research, however, no evidence of this relative age effect is found. This is surprising, since the compulsory schooling laws have not changed since the first OSA survey.

Another notable conclusion from the results in table 3 is that the coefficients on the season of birth dummy variables are not significantly different from zero. In general, insignificant coefficients are not a direct reason to exclude variables from the regression. However, because all the coefficients on the dummy variables are insignificant, this might create doubts about the reliability of the regression. The correlation of nearly zero between the season of birth variable and the years of schooling strengthens these doubts about the reliability of the regression model including season of birth as an instrument.

Finally, the adjusted R-square is only 0.021. This tells us that the season of birth variables, together with the age controls, explain only 2.1% of the variation in the years of schooling variable. Together with the insignificance of the season of birth dummies and the almost non-existent correlation between the season of birth and the years of schooling, we might worry about the reliability of this instrument.

4.3.4 OLS estimates

Before taking the second step of the IV regression, the returns to education are first estimated with OLS regression, such that we can compare the IV outcomes to the OLS outcome. The results of the regression of the log of the net hourly wage on the years of education are shown in table 4.

A return to education of 3.8% is found in 2000 when we use the OLS regression model and control for the age of the individual. This result is highly significant, and the adjusted R-square shows that the years of education variable, together with the age controls, explains 22.2% of the variance of the dependent variable.
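As a worked reading of this coefficient, the log-linear form of the earnings function implies that the coefficient on schooling is only approximately a percentage change. The calculation below is a standard property of log-linear models, added for illustration and not reported in the regression output; $Y(S)$ denotes earnings at $S$ years of schooling, holding age fixed.

```latex
\[
\frac{Y(S+1)}{Y(S)} = e^{\beta_1}
\quad\Longrightarrow\quad
100\,\bigl(e^{0.038} - 1\bigr) \approx 3.87\%
\]
```

For a coefficient of this size the approximation error is small, so reading 0.038 as a return of 3.8% per additional year of schooling is reasonable.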

The Pearson correlation coefficient shows that there is a positive relationship between the years of schooling and the log of the net hourly wage.

The next step is to see if the OLS estimates are indeed biased downwards. The instruments chosen for schooling are the family background variables and the season of birth variable. The effect of the instruments on the years of schooling is already measured in step one, which means that in step two the OLS estimates of the schooling function are used to obtain the IV estimates of the returns to education. First, the family background variables are used to instrument for schooling. These results are presented in table 5.

Table 4 – OLS estimates of the returns to schooling

| | Coefficient | Standard error | P-value | Pearson correlation |
|---|---|---|---|---|
| Intercept | 1.574*** | 0.192 | 0.000 | |
| Returns | | | | |
| Year of schooling | 0.038*** | 0.003 | 0.000 | 0.293 |
| Controls | | | | |
| Age | 0.046*** | 0.010 | 0.000 | 0.265 |
| Age squared | -0.037*** | 0.012 | 0.002 | 0.285 |
| Adjusted R-square | 0.222 | | | |
| N | 874 | | | |

Dependent variable is the log of the net wage per hour.
*Significance at 10% level, **significance at 5% level, ***significance at 1% level.

4.3.5 IV estimates – Family background

The estimated returns to education measured with IV regression seem to be in line with the conclusion most found in previous studies, namely that the OLS estimates are biased downwards. The OLS estimates of the returns to education are 3.8%, but if the returns to education are estimated while instrumenting for schooling with family background variables, we find a significant return of 5.7%. This would mean that the OLS estimates are indeed biased downwards and that estimating the returns to education by IV regression produces more reliable estimates. Before making this conclusion, it is, however, important to check if the criteria for a valid instrument hold. If the criteria are not satisfied, we may conclude that the IV estimates are even more biased than the OLS estimates.

The first criterion that needs to be satisfied is related to the quality of the instrument. This means that the instrument is correlated with the years of schooling variable. According to Plug (2001) we can check the quality of the instrument by computing the first-stage F-statistic. This statistic tests the hypothesis that the coefficients on the instruments, in this case the dummies for the family background variables, are equal to zero in the first stage of the IV regression. If the statistic is higher than 10, there is no need to worry about weak instruments. First, the two family background instruments are analyzed separately to see if both are strong instruments. Secondly, the two variables are combined to see if the instruments remain strong when they are used at the same time. These results are shown in Table 7.

The job level of the father seems to be a strong instrument, since the First-stage F-statistic is higher than 10. The same holds for the education level of the father. For this instrument we also find a First-stage F-statistic which is higher than 10.
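The first-stage F-statistic can be sketched for the simplest case, in which the first stage contains only an intercept and the instrument dummies. This is an illustrative simplification, not the STATA procedure used in this research: the age controls are omitted, and the function name and the toy numbers are assumptions.

```python
def first_stage_f(schooling_by_category):
    """F-statistic for H0: all instrument dummies have a zero coefficient.

    `schooling_by_category` maps each instrument category to the schooling
    values observed in it. One category is the reference, so the number of
    tested restrictions is the number of categories minus one.
    """
    all_values = [s for values in schooling_by_category.values() for s in values]
    n = len(all_values)
    k = len(schooling_by_category)  # intercept plus (k - 1) dummies
    grand_mean = sum(all_values) / n
    # Restricted model: intercept only.
    rss_restricted = sum((s - grand_mean) ** 2 for s in all_values)
    # Unrestricted model: fitted values are the category means.
    rss_unrestricted = 0.0
    for values in schooling_by_category.values():
        category_mean = sum(values) / len(values)
        rss_unrestricted += sum((s - category_mean) ** 2 for s in values)
    q = k - 1
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / (n - k))


# Toy numbers for illustration only (not taken from the survey).
weak = first_stage_f({"low": [1.0, 2.0, 3.0], "high": [2.0, 3.0, 4.0]})
strong = first_stage_f({"low": [1.0, 2.0, 3.0], "high": [11.0, 12.0, 13.0]})
print(weak, strong)
```

The statistic compares how much of the variation in schooling is removed by adding the instrument dummies with the variation that remains, which is why a larger gap between category means produces a larger value.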

Table 5 – IV estimates of the returns to schooling – family background

| | Coefficient | Standard error | P-value |
|---|---|---|---|
| Intercept | 1.541*** | 0.195 | 0.000 |
| Returns | | | |
| Year of schooling | 0.057*** | 0.011 | 0.000 |
| Controls | | | |
| Age | 0.040*** | 0.010 | 0.000 |
| Age squared | -0.029** | 0.013 | 0.022 |
| Adjusted R-square | 0.020 | | |
| N | 874 | | |

Dependent variable is the log of the net wage per hour.
*Significance at 10% level, **significance at 5% level, ***significance at 1% level.

Since the F-statistic on the combined instruments, the job level and the education level of the father, is also higher than 10, we conclude that family background is a strong instrument for schooling and thereby satisfies the quality criterion.
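The first-stage F-statistic described above can be sketched as follows. The data are synthetic and the variable names (`father_edu`, `father_job`, `unrelated`) are hypothetical; the sample size of 874 is borrowed from Table 5 only to make the scale comparable. The sketch uses continuous instruments and no control variables for brevity, whereas the analysis in this study uses dummies.

```python
import numpy as np

def first_stage_f(endog, exog, instruments):
    """F-statistic for H0: all instrument coefficients are zero in the first stage.

    Compares the residual sum of squares of the first stage with and
    without the instruments. `exog` must be a 2-D array of exogenous
    regressors including the constant.
    """
    def rss(X):
        coef = np.linalg.lstsq(X, endog, rcond=None)[0]
        resid = endog - X @ coef
        return float(resid @ resid)

    X_full = np.column_stack([exog, instruments])
    q = X_full.shape[1] - exog.shape[1]   # number of instruments tested
    dof = len(endog) - X_full.shape[1]    # residual degrees of freedom
    rss_restricted, rss_full = rss(exog), rss(X_full)
    return ((rss_restricted - rss_full) / q) / (rss_full / dof)

# Synthetic sample (hypothetical, not the thesis data)
rng = np.random.default_rng(1)
n = 874
const = np.ones((n, 1))
father_edu = rng.normal(size=n)
father_job = rng.normal(size=n)
unrelated = rng.normal(size=n)            # deliberately irrelevant variable
schooling = 12 + 1.0 * father_edu + 0.8 * father_job + rng.normal(size=n)

f_edu = first_stage_f(schooling, const, father_edu)
f_job = first_stage_f(schooling, const, father_job)
f_both = first_stage_f(schooling, const, np.column_stack([father_edu, father_job]))
f_weak = first_stage_f(schooling, const, unrelated)

for name, f in [("education", f_edu), ("job", f_job), ("both", f_both), ("unrelated", f_weak)]:
    print(f"{name:>10}: F = {f:.1f}  (rule of thumb: F > 10)")
```

The instruments are tested one at a time and then jointly, which mirrors the order of the checks in the text. The irrelevant variable shows what a weak instrument looks like under the same rule of thumb.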

The second criterion that needs to be satisfied relates to the validity of the instrument. Plug (2001) proposes the Overidentifying Restrictions Test, also known as the J-statistic, to check that there is no correlation between the instruments and the error term. This test can only be applied when there are more instruments than endogenous variables. It is therefore possible to apply the Overidentifying Restrictions Test to the family background variables, since both the education level and the job level of the father instrument for the years of schooling. The p-value presented in Table 6 is not smaller than 0.05, so the null hypothesis that both instruments are exogenous cannot be rejected at the 5% significance level. In other words, the test gives no evidence that family background is correlated with the error term, and the validity criterion is also considered satisfied.
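A minimal sketch of an overidentifying restrictions test is given below. It computes the Sargan form of the statistic (sample size times the R-square from regressing the 2SLS residuals on all instruments), which is one common variant of the J-statistic and is an assumption here, since the text does not state which variant was used. The p-value formula only covers the case of two instruments and one endogenous regressor, which gives one degree of freedom. The data and all names are synthetic and hypothetical.

```python
import math
import numpy as np

def lstsq_coef(y, X):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def sargan_test(y, endog, exog, instruments):
    """Sargan statistic and p-value for one endogenous regressor and two instruments."""
    Z = np.column_stack([exog, instruments])
    fitted = Z @ lstsq_coef(endog, Z)
    beta = lstsq_coef(y, np.column_stack([exog, fitted]))
    # 2SLS residuals use the observed regressor, not its first-stage fit
    resid = y - np.column_stack([exog, endog]) @ beta
    resid_fit = Z @ lstsq_coef(resid, Z)
    r2 = 1 - np.sum((resid - resid_fit) ** 2) / np.sum((resid - resid.mean()) ** 2)
    j_stat = max(len(y) * r2, 0.0)
    # Chi-square survival function with 1 degree of freedom
    p_value = math.erfc(math.sqrt(j_stat / 2))
    return j_stat, p_value

def simulate(n, direct_effect, rng):
    """Synthetic data; direct_effect != 0 makes father_job an invalid instrument."""
    father_edu = rng.normal(size=n)
    father_job = rng.normal(size=n)
    schooling = 12 + father_edu + 0.8 * father_job + rng.normal(size=n)
    log_wage = (1.5 + 0.06 * schooling + direct_effect * father_job
                + rng.normal(scale=0.3, size=n))
    return log_wage, schooling, np.column_stack([father_edu, father_job])

rng = np.random.default_rng(2)
n = 2000
const = np.ones((n, 1))

y, s, z = simulate(n, direct_effect=0.0, rng=rng)   # both instruments exogenous
j_valid, p_valid = sargan_test(y, s, const, z)

y, s, z = simulate(n, direct_effect=0.3, rng=rng)   # one instrument affects wages directly
j_invalid, p_invalid = sargan_test(y, s, const, z)

print(f"exogenous instruments: J = {j_valid:.2f}, p = {p_valid:.3f}")
print(f"invalid instrument:    J = {j_invalid:.2f}, p = {p_invalid:.3g}")
```

A p-value above 0.05 means the null hypothesis of exogenous instruments is not rejected, which is the reading applied to Table 6. The test has power only against instruments that disagree with each other, so a non-rejection is supportive evidence rather than proof of validity.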

It is important to note that the adjusted R-square of 0.020 is very low. This might be a problem, since it might indicate that the years of schooling variable predicts the earnings of an individual poorly when family background is used as an instrument. This is in line with the results found earlier: family background variables explain only a small part of the variance in the years of schooling (adjusted R-square of 0.112), and the years of schooling variable in the OLS regression does not explain the major part of the earnings of an individual either (adjusted R-square of 0.222). On the other hand, the years of schooling variable is significant at the 1% level, which may remove a large part of the concern about the low adjusted R-square.
