
Human capital effects on wage growth: wage differences using NLSY79 panel data.


Academic year: 2021


Human capital effects on wage growth:

wage differences using NLSY79 panel data

By

M.A.R. Moerkercken v/d Meulen (5831687)

Amsterdam, September 21, 2014

Supervisor

dr. J.C.M. van Ophem

Abstract

We analyze the effects of human capital on wage growth, using the NLSY79 panel data. Education, cognitive ability and noncognitive ability are used as the primary proxies for human capital. Using the largest interval of observed wages and taking into account the different moments at which these wages were observed, we find that education in particular has a positive impact on wage growth, which continues and even increases long after employees have finished their education. Cognitive ability is significant in most of our models as well, in agreement with the theoretical prediction. Noncognitive ability seems to be of less importance in determining wage growth. Further, we find that the effects of the determinants of wage growth (including non-human-capital variables) may not be constant over time and may vary within a particular variable.

1 Introduction

In the past couple of decades economists have attempted to show what determines differences in personal incomes. Theoretical research into this topic has increasingly been replaced by empirical research, because of the growing amount of available data. These data are collected by numerous businesses, government authorities and institutions specialized in collecting data on a substantial number of people.

It is remarkable that, despite the high interest in wage determination, little research has been conducted on the wage growth per employee over a certain time span. The effects of training and job tenure on wage growth (e.g. Brown, 1989 and Bartel and Borjas, 1981) are among the few that have been analyzed, together with Lazear (1974), who discussed the effect of age and work experience on wage growth. Another topic concerning wage growth which has already been discussed is the relationship between work experience and education: to what extent “does the slope of experience-earnings profiles vary with educational attainment?” (e.g. Brunello and Comi, 2004 and Farber and Gibbons, 2006).

Today’s knowledge about how wage growth is affected by other components of human capital, such as cognitive ability and noncognitive ability, is still in its early stage. The intention of this thesis is therefore to broaden the knowledge about what causes the wage of an individual to grow (or possibly decline) over time. The focus will in particular be on the human capital effects. In contrast to most studies, which examine human capital effects on wage returns using data at one particular moment in time or take only one aspect of human capital into account, this thesis tries to identify the different effects of human capital on wage growth.

Human capital is defined as the stock of skills, knowledge, habits and social competencies one obtains through education, training and experience. Our measures of human capital will be education, cognitive ability and noncognitive ability. The latter could also be defined independently of human capital as psychological capital (e.g. Goldsmith et al., 1997).

So far most surveys have observed wages at one certain point in time and have used control variables of the individual at that same time. Others have used an average of the respondents’ labour market wages from the first until the last observation (e.g. Deming, 2009). Because so little information is available on wage growth itself, the literature covered here consists mainly of empirical and theoretical work on wage determinants at one certain moment in time. We thereby assume that these empirical and theoretical results are relevant and approximately hold for time intervals as well.

Mincer (1958) and Becker (1962) suggested that human capital has a critical effect on real wages. Future real income may be influenced by investing more or less in people’s resources. A few ways to invest in an individual’s capital are schooling, on-the-job training, medical care, vitamin consumption and acquiring information about the economic system (Becker, 1962).

In short, this thesis mainly discusses the human capital determinants of annual wage growth over a period of possibly more than 30 years. Using ordinary least squares we regress the annual wage growth (between the first observed wage and the last observed wage in the panel data) on several measures of human capital (education, cognitive ability, noncognitive ability and training). Based on the related literature we add various control variables that may affect wage growth as well. Further, the determinants early in the employees’ careers are compared with those one or two decades later. To check whether there are differences in the wage growth effects within the sample, some break tests will be conducted.


The structure of this thesis is as follows: in part 2 the previous literature of the determinants of wages will be discussed. Next in part 3 the data of our variables from the NLSY79 are described. The methodology in part 4 explains the estimation methods. The results will be presented in part 5 and in part 6 we will conclude and briefly discuss the results.

2 Literature

In this section we summarize the findings of previous researchers on wage/salary returns to individual human capital, demographic and geographical characteristics. Nearly all surveys have used ordinary least squares (some with two-stage least squares) or panel data estimation methods (fixed effects vs. random effects models).

In 1962 Becker provided the simple model:

Y = X + rC,

in which Y is defined as earnings, X are the earnings when there is no investment in human capital, C measures total investment costs and r is the average rate of return (on human capital). Although this was (presumably) not the first time such a model was used to explain (theoretically) the impact of all kinds of economic and non-economic factors on personal incomes, it has been widely used by many researchers thereafter and will be used in this thesis. The general theory established by Mincer (1958) and Becker (1962) tells us that differences in earnings between workers/employees may simply be the result of some persons investing more in themselves than others. Since people who appear to have a higher ability in terms of personality, persistence and intelligence tend to invest more in themselves than others do, the distribution of earnings is unequal and skewed. Apart from the differences in abilities, Mincer (1958) also provides a theoretical framework (using a simple model) in combination with empirical findings on distributional differences in incomes. In this framework the variation in earnings is explained by differences in the amount of training as well as in occupation, age and education. Next, (more recent) empirical findings relating to human capital effects on wages will be discussed.

To give an impression of the interest in the returns to education among economists, Ashenfelter et al. (1999) discussed a total of 27 important surveys on the returns to education just in the period 1991-1999, and there are actually many more. They provide an analytical review of previous estimates of the return in earnings as a result of education. After correcting for publication bias, differences between estimation methods become much smaller than is sometimes reported. Most important is that education indeed adds significantly to the returns of individuals. The overall average return to education is 3.5% in 1974 and has increased since then at a rate of 2 percentage points per decade. Noteworthy is that in the United States (the data in this thesis are from the U.S. as well) the rates of return appear to be somewhat higher than elsewhere, partly because of a stronger increase in returns over the years compared to the other countries included in those surveys.

The increased returns to education in the 1980s occurred largely for workers with higher levels of cognitive ability (Blackburn and Neumark, 1993). They provide a supply-side explanation which states that education and cognitive ability have become less correlated over time (for the entire population, not per individual). A growing scarcity of higher educated employees with a high level of cognitive ability is the result. This, however, is only a hypothesis, for which, as they state, it is “difficult to assess the existence or importance of such supply-side changes” at the moment. Because the sample used in our survey is constant, we cannot test this hypothesis ourselves. The AFQT scores are constant over time, just as the education variable (see data description).

One of the oldest theories about which factors influence individual wage inequality is that the distribution of abilities is related to the distribution of income (Mincer, 1958).

A distinction has to be made between two kinds of ability. First there is cognitive ability (cognitive skills). These are the skills measurable by standardized tests, such as IQ tests. Intellectual ability, reasoning ability, problem solving and the ability to distinguish relationships are some of the components most tests contain.

On the other hand there are the non-cognitive skills. These skills are considered to be someone’s socio-behavioral development, for example in terms of adaptability, self-restraint and motivation (Jackson, 2013). Deming (2009) found evidence of significant effects of non-cognitive skills of adults on earnings. As Heckman (1999) points out: “The preoccupation with cognition and academic “smarts” as measured by test scores to the exclusion of social adaptability and motivation causes a serious bias in the evaluation of human capital interventions”. Heckman et al. (2006) show that cognitive ability and noncognitive ability both explain economic (and social) success. The effects on wages of cognitive ability and of the noncognitive ability measures (motivation, persistence and self-esteem) are similar, and for some outcomes the noncognitive measures even play a more substantial role (Heckman et al., 2006).

A problem with noncognitive ability (often labelled as social ability) is stated by Heckman and Rubinstein (2001): “No single factor has yet emerged to date in the literature on noncognitive skills, and it is unlikely that one will ever be found, given the diversity of traits subsumed under the category of noncognitive skills.”

One possibility to account for noncognitive ability is via a specific measure of social ability: how shy someone is according to him- or herself. The effect of this social ability on income differs significantly between entrepreneurs and wage employees. Social ability benefits entrepreneurial incomes more than employees’ wages. A suggested explanation is that social ability matters more for the performance of entrepreneurs than for that of employees, as part of the work of entrepreneurs is interacting with many different persons inside and outside the firm (Hartog et al., 2010). Here we focus on wages and thereby on employees, so we expect the effects of these noncognitive skills to be somewhat smaller.

Among the first to study the effects of noncognitive ability (psychological capital) on wages were Goldsmith et al. (1997). Their measures of noncognitive ability were based on Rosenberg’s Self-Esteem Scale (SES)¹ and Rotter’s Internal-External (I-E) Locus of Control Scale². Brockner (1988) found self-esteem to influence productivity positively in two ways. First, high self-esteem workers tended to have less need of direction from supervisors, resulting in a more effective way of employing their time. Second, high self-esteem workers were willing to consider a wider range of solutions to problems.

Goldsmith et al. (1997) offer evidence that, apart from conventional human capital measures, psychological capital measures of self-esteem (using Locus of Control as an indirect measure) are a significant determinant of wages as well. More detailed information about both tests is given in the data description.

Cawley et al. (2001) studied the effect of cognitive ability on wages. They show that without the human capital measure education, cognitive ability explains 14% to 19.9% of the variance in wages (change in R²). Once human capital measures (education and work experience) are controlled for, this share decreases to 0.7% to 2.9%.

The pattern of decreasing explanatory power of cognitive ability when education is added to the model agrees with previous findings of Blackburn and Neumark (1993) and Murnane et al. (1995). They reported that people with high ability tend to have a high return to education and that 38-100% of the increase in the return to education is due to increasing abilities, so the reverse effect is true as well. Cawley et al. (2001), who use the same dataset as this thesis, find a high correlation between education and cognitive ability. This collinearity poses an identification problem because, as an example, there are no high-ability white male high school dropouts (low value for highest completed grade) to whom the graduates could be compared. Some ability-education combinations are simply not observed.

Another identification problem occurs when both age and time trend variables are included in the model, as these will, by definition, be correlated. Working with time intervals reduces this correlation, as there are multiple intervals related to each possible age variable.

¹ Rosenberg (1965)
² Rotter (1966)

Although ability and education are seen as the most influential factors for the wage/income distribution, they cannot explain all the differences. Other social and economic factors seem to intervene and thereby distort the relation between ability and income (Mincer, 1958).

The human capital stock could, besides education and ability, be extended by so-called ‘on-the-job training’. For employers this is of importance as it increases their employees’ capabilities (learning new skills, improving old skills) and thereby possibly their productivity. Productivity improvement in the future is only possible at a cost (1. deferral of earnings for the period of training; 2. cost of educational services), since there would otherwise be an unlimited demand for training. Therefore firms only let their employees attend on-the-job training if they expect the future marginal productivity of workers to increase sufficiently (see Becker, 1962).

There is reasonable evidence of an effect of on-the-job training among employees (Brown, 1989). An average training effect of 11 percent is found.

Various other surveys have been published that describe other wage determinants.

Lazear (1974) found the age at which employees start working to be of significant importance in the determination of wage growth, controlling for work experience (in weeks) over the period for which the growth was observed (1966-1969). So here age is not the same as labour market experience, with which it is sometimes confused. The difference between the natural logarithms of wages in 1969 and 1966 was regressed on age in 1966 and a few more regressors. The negative sign of the age coefficient is explained by the fact that older employees are less likely to invest in human capital. A difference with our model is that we are able (although only for the 1988-2010 interval) to control for human capital investments (on-the-job training), which should result in less biased and probably smaller estimates of the age coefficient. Furthermore, our interval of wage growth determination is much wider (up to 32 years), which makes it less likely for the age factor to be significant, as the effect of differences in age fades over time. For older employees an age difference of a couple of years presumably affects the likelihood of investing in human capital less than the same age difference does for younger employees. The age effect is therefore expected to be rather moderate.

Brunello and Comi (2004), who also implemented a wage growth model, found work experience to have stronger effects among higher educated employees than among employees with lower education. They defined work experience as the number of weeks one has worked during the interval in which wages have increased (or decreased). Because our method of estimating wage growth differs from theirs (different wage intervals to measure wage growth), it is not useful to include this measure in our model. We have to take this into account in our conclusion.

Since the average age at the time of the first observed wage is 19.3 years, we assume that respondents have no real work experience at that time. Besides that, there is no data available which could indicate any kind of work experience before the date of the first observed wage.

Earnings of married men and never married men seem to differ in favour of the married men (Hill, 1979). According to Becker (1991) the productivity hypothesis explains this observation: when a man is married he will have greater opportunities to specialize in his labour market activities, because his wife is more likely to specialize in ‘home production’. Another argument is that being married does not cause higher earnings. Instead a certain selection takes place, in which men who are already more productive have a higher chance of getting married than less productive men. Chun and Lee (2001) however contradict this last statement and find the earnings difference attributable to the degree of specialization within the household.

Various studies have shown that having children and the number of children in the household have a significant effect on wages. Women’s wages in particular have been widely discussed. A 4 percent ‘penalty’ for one child and a 12 percent ‘penalty’ for having two or more children among women is found by Waldfogel (1997), and 7 percent per child by Budig and England (2001). The hypothesis put forward by Waldfogel to explain these differences is in the form of employer perceptions (i.e. discrimination) or employee adjustments, such as occupational downgrading or changing jobs after childbirth.

Despite the movement of mothers into paid employment in recent decades, the findings of Avellar and Smock (2003) do not indicate a decline of the motherhood ‘penalty’ of lower wages. This suggests that combining paid work and family is still more of a mother’s struggle than that of both parents.

3 Data description

The panel data used in this thesis is obtained from the National Longitudinal Survey of Youth 1979 (NLSY79), part of the National Longitudinal Surveys (NLS) program conducted by the U.S. Bureau of Labor Statistics. This survey includes an approximately equal number of women and men born in the years 1957-1964, with ages ranging from 14 to 22 at the initial survey in 1979. The sample consists of 6,111 US civilian youths, 5,295 minority and economically disadvantaged civilian youths and 1,280 youths on active duty in the military, making a total of 12,686 respondents. Approximately 16% of the respondents are Hispanic, 25% are black and the other 59% consist of other ethnic groups.

Until 1994 the participants were interviewed annually and since then until 2010 biennially.

The information gathered from the interviews can be grouped into the following main topics:

• Labor market behaviour

• Educational experiences (high school, college, training)

• Family background

• Armed Services Vocational Aptitude Battery

• High school information

• Government program participation

• Family life

• Health

Only a small part of the data has been used in this thesis. A short description of each variable taken into account follows. Table 1 gives an overview of all variables together with the maximum and minimum number of responses returned in the corresponding year.

To determine the wage growth (WGrowth), the wages and salary variable is used. This contains the total amount of wages and salary reported by the respondents over the past calendar year. It excludes money received from military service and includes wages, salary, commissions and tips from all jobs, before deductions for taxes or anything else.³

The Armed Forces Qualification Test (AFQT) scores are used as a proxy measure for intelligence (cognitive ability). The AFQT is part of the Armed Services Vocational Aptitude Battery (ASVAB) multiple choice test, which contains nine sections. The NLSY79 respondents participated in the AFQT test in 1980 (reported in 1981, see Table 1). The AFQT consists of the following four sections: arithmetic reasoning, word knowledge, paragraph comprehension and mathematics knowledge. It has been used extensively as a measure of cognitive skills in the literature (Heckman, Stixrud and Urzua, 2006). The test score is a raw score of total obtained points, which is then converted into a percentile score ranging from 1 to 99. So for example a score of 63 means that the participant has a better test score than 63 percent of the other participants of the test.⁴

³ Not included is income from own farms or businesses.

⁴ The revised version of 2006 will be used here, as it is rounded to 3 decimals and thereby the

The AFQT has proven to be a good proxy for the total ASVAB test as the predictive power of the AFQT in explaining wage variations hardly differs from the ASVAB test (Cawley et al., 2001).
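The percentile interpretation of the AFQT score given above (a score of 63 means scoring better than 63 percent of the other participants) can be illustrated with a minimal sketch. The function name `percentile_score` and the clipping to the 1-99 range are assumptions made for illustration only; the official AFQT percentiles are derived from the test's own norming procedure, which is not reproduced here.

```python
from typing import Sequence


def percentile_score(raw: float, others: Sequence[float]) -> int:
    """Percent of the other participants with a strictly lower raw score.

    Hypothetical helper: the result is clipped to the 1..99 range that the
    text reports for the AFQT percentile score.
    """
    if not others:
        raise ValueError("need at least one other participant")
    below = sum(1 for score in others if score < raw)
    pct = round(100 * below / len(others))
    return min(99, max(1, pct))


if __name__ == "__main__":
    # Invented raw scores 0..99 for one hundred other participants.
    others = list(range(100))
    print(percentile_score(63, others))
```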

For the education variable the “highest completed grade” is used. This variable ranges from 0 to 20 indicating, as the name already suggests, the highest completed grade of the respondent. 0 refers to the fact that the respondent has not completed any grade so far. 1st grade until 12th grade includes elementary school, middle school and high school. 13th until 20th grade is defined by college (university). The same description holds for “highest completed grade mother” and “highest completed grade father”.

As a measure for the noncognitive ability (sociability trait at adulthood) of the employees we have three variables which at first sight might be expected to correlate quite strongly. First there is the ‘shy’ variable. The respondents were given the following question and corresponding answers: “Thinking about yourself as an adult, would you describe yourself as:

1. Extremely shy ($Social_i = 1$)

2. Somewhat shy ($Social_i = 2$)

3. Somewhat outgoing ($Social_i = 3$)

4. Extremely outgoing” ($Social_i = 4$)

The term $Social_i$ returns in the methodology.

All respondents were adults at the time of the interview, since it was held in 1985.

Another method for capturing the noncognitive abilities of the respondent is Rotter’s Internal-External Locus of Control Scale. The Locus of Control, the general outlook on life of a person, is believed by psychologists to influence one’s self-esteem. Those who believe that they are masters of their own fates are called internalizers. Externalizers are those who believe that their lives are controlled by some outside force, that they have little influence on what happens to them and that they therefore bear little responsibility (Goldsmith et al., 1997).

The Control Scale consists of 23 statements on which the respondents have to reveal their personal perceptions. If an internal statement is selected one point is given. A total of 23 points could be obtained: 0 for most external to 23 for most internal. The minimum score observed in the NLSY79 is 0 (for only one respondent) and the maximum score observed in the NLSY79 is 16. The test scores are from 1979.

The third way of measuring noncognitive ability described here is the already mentioned Rosenberg’s Self-Esteem Scale (SES). The scale is a Likert-type scale, with questions answered on a four-point scale from strongly agree to strongly disagree. This scale contains ten questions to measure to what extent one has a favorable or unfavorable attitude towards oneself. Rosenberg’s Scale ranges from 0 to 30, with 0 indicating a low level of self-esteem and 30 a high level of self-esteem. The scores have been reported from the year 1980.

The respondents were asked in each year how many children they had. These children could be biological children, adopted children and/or stepchildren and had to be included in the same household as the respondent.

The vocational/technical program or on-the-job training variable (TRAI) is a dummy variable in which respondents answered “yes” or “no” to the following question: “Did you attend a training program or any on-the-job training designed to help people find a job, improve job skills, or learn a new job?”. Unlike all other variables described here, this on-the-job training question was first asked in 1988, and the year 1989 was skipped. So to include this variable in the model, some modification is necessary (see the methodology section).

Our marital variable is based on the question whether each person was married at the time of the interview and if so, whether their spouse was living in the same household at the time.

The region of current residence of the respondents is distinguished by living in the 1) north-eastern, 2) northern-central, 3) southern or 4) western part of the United States.

Ethnicity is distinguished by three possibilities: being black, Hispanic, or non-black and non-Hispanic. This is, together with gender and age, the only variable which is observed for all 12,686 possible respondents (in 1979).

Instruments

As we will make use of several IV estimation methods to account for the possible endogeneity of start wage, education and cognitive ability, a short explanation of the included instruments is given here.

In 1988 all respondents had to declare whether they lived with both their biological parents until the age of 18. This serves as an indicator of the probability of turmoil (divorce or death of a parent) in the respondent’s childhood.

Access to reading/studying material is represented by the presence of one or more newspapers, magazines and library cards at the age of 14. It might inspire the child to learn more, thereby affecting ability, education and start wage.

Next, we expect the number of siblings and older siblings to have no direct effect on wage growth and to have a significant relationship with the three potentially endogenous variables. The data on the number of siblings and older siblings were collected in 1979, in the childhood or early adulthood of the respondent. After this year the number of siblings could have increased or decreased, but these possibilities are not taken into account: the 1979 numbers are treated as fixed.

The last instrument consists of the answer to the following question: “When you were a child, was any language, other than English, spoken in your home?” The answer is recorded as “yes” or “no”.

4 Methodology

Since we look at wage growth, we use the NLSY79 panel data to estimate by ordinary least squares what determines the average annual growth rate of wages. The wage/salary variable of the 12,686 individuals over 24 years contains many missing values (per person and per year), without a clear pattern in why some are missing and others are not. In line with previous literature (both wage and wage growth literature), in which a linear relationship between the independent variables and the dependent variable (wages/salary/income) is assumed, we follow common practice and start with OLS.

The panel data covers the years 1979 until 2010. As a start we use the complete interval for estimating the development of the wages of each respondent. Later on, this period is cut into three periods of approximately 10 years to investigate the possibility of nonlinearities, whether the effects in the complete period differ from those in shorter time periods and, more specifically, whether the effects change over time.

First an explanation of the dependent variable follows. Next the independent variables are explained based on the literature. Then the method by which the data is adapted, interpreted and modelled is given.

The key issue in this thesis is the development of wages. Although panel data is available, we restrict ourselves to the use of a single modified variable per respondent. Two observed wages are used to construct it: the first observed wage and the last observed wage in the total interval of 32 years. Many respondents did not answer the question of how much they earned in the past year, so we define an observed wage as any positive value reported for that earnings question.⁵ For instance, for the first respondent in the dataset only the wages in 1979 and 1981 are observed, so these form the basis of the average annual wage growth variable. The last observed wage is divided by the first observed one, the ratio is raised to the power of one over the number of years between the two observations, and 1 is subtracted to obtain the average annual wage growth $WGrowth_i$:

$$WGrowth_i = \left( \frac{Wage_{i,end_i}}{Wage_{i,start_i}} \right)^{1/\Delta t_i} - 1$$

where $Wage_{i,end_i}$ is the last observed wage and $Wage_{i,start_i}$ is the first observed wage of individual $i$, and $\Delta t_i$ is the difference in years between $t_{start_i}$ (first observed wage in interval for individual $i$) and $t_{end_i}$ (last observed wage in interval for individual $i$).
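The construction of the wage growth variable can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `average_annual_wage_growth`, the `{year: wage}` input layout and the wage amounts in the example are hypothetical and not taken from the NLSY79 files; only the rule that a wage counts as observed when it is positive, and the formula itself, come from the text.

```python
from typing import Mapping, Optional


def average_annual_wage_growth(
    wages: Mapping[int, Optional[float]],
) -> Optional[float]:
    """Return (last / first) ** (1 / dt) - 1 for one respondent.

    `wages` maps a survey year to the reported wage (None if missing).
    Only strictly positive wages count as observed. Returns None when
    fewer than two years have an observed wage.
    """
    observed = sorted(
        (year, wage)
        for year, wage in wages.items()
        if wage is not None and wage > 0
    )
    if len(observed) < 2:
        return None
    (t_start, w_start), (t_end, w_end) = observed[0], observed[-1]
    dt = t_end - t_start  # difference in years between the two observations
    return (w_end / w_start) ** (1.0 / dt) - 1.0


if __name__ == "__main__":
    # Invented amounts for a respondent observed only in 1979 and 1981.
    example = {1979: 10000.0, 1980: None, 1981: 12100.0}
    print(average_annual_wage_growth(example))
```

With these invented amounts the wage ratio is 1.21 over two years, so the average annual growth is about 0.10, that is, roughly 10 percent per year.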

The intervals between first and last observed wages fluctuate greatly between respondents. For example, the first respondent has only provided his/her wage in 1979 and in 1981, whereas the second respondent gave his/her wage level for the period 1979-1992 and for 2002-2010, so there is no data available for 1993-2000. Due to an economic trend, which causes wages to grow over time (because of inflation for example), we have to control for these individual-invariant differences. The correction for these differences is explained later.

As this thesis is mainly interested in human capital effects, we start with the AFQT scores, which are added to account for the cognitive ability of each participant. Note that in the literature section we already described a potential issue in the form of multicollinearity: correlation of ability with age and level of education (e.g. Heckman et al., 2006 and Hartog et al., 2010). We suspect that older respondents with a high level of education perform better on the AFQT than they would have done on the same test at an earlier age. Therefore the ability variable has been adapted slightly (see Appendix), which gives the $AFQT_i$ variable.

As a measure for ‘sociability’ (as an adult) we include the variables which account for whether the respondent thinks her- or himself to be a shy person:

$Soc2_i = 1$ if $Social_i = 2$, else $= 0$

$Soc3_i = 1$ if $Social_i = 3$, else $= 0$

$Soc4_i = 1$ if $Social_i = 4$, else $= 0$

So 3 dummies are included and $Soc1_i$ is the reference dummy.
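The dummy coding above can be written out as a short Python sketch. The function name `sociability_dummies` and the dictionary output are assumptions chosen for illustration; only the coding rule (three dummies, with the answer "extremely shy" as the reference category) comes from the text.

```python
from typing import Dict


def sociability_dummies(social: int) -> Dict[str, int]:
    """Map the 1..4 'shy' answer to the dummies Soc2, Soc3 and Soc4.

    Social = 1 (extremely shy) is the reference category, so it is
    represented by all three dummies being 0.
    """
    if social not in (1, 2, 3, 4):
        raise ValueError("Social must be 1, 2, 3 or 4")
    return {f"Soc{k}": int(social == k) for k in (2, 3, 4)}


if __name__ == "__main__":
    for answer in (1, 2, 3, 4):
        print(answer, sociability_dummies(answer))
```

Leaving out a dummy for the first category avoids perfect collinearity with the intercept; each estimated dummy coefficient is then read relative to respondents who describe themselves as extremely shy.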

It is known that this variable is quite subjective: what exactly is the definition of noncognitive ability? Here we follow the method of Hartog et al. (2010), who use this same ‘sociability’ variable. Another way of accounting for these individual characteristics is reported by Cawley et al. (2001). They come up with various behavioural characteristics representing the level of social skills. As they note: “Those who attain higher education levels and higher wages tend to be those who are characterized at a young age as having the self-discipline to follow the rules, to show up at school on time, and to not abuse drugs or alcohol”. They obtained a social skills level by ‘measuring’ to what extent the respondents have abused drugs or alcohol, have attacked their teachers or have been absent from or late to school, along with some more variables related to this kind of behaviour.

Because these data were not available in exactly this form in the NLSY79 and far fewer respondents answered these questions than the “shy” question, the procedure of Hartog et al. is copied.

$\mathit{Locus}_i$ and $\mathit{SES}_i$ represent the Locus of Control and self-esteem levels, respectively, of individual $i$. Although Goldsmith et al. (1997) use Locus of Control as an indirect measure of self-esteem, the correlation appears to be rather small in our sample: -0.29 (N=6,557). So a high score for self-esteem corresponds more often than average with a low score for Locus of Control (externalizers). The correlations between the ‘shy’ variable and $\mathit{Locus}_i$/$\mathit{SES}_i$ are even lower (in absolute terms).

Therefore all three measures of noncognitive ability will be included in the model. Furthermore the educational part of the survey is represented by the highest completed grade variable of the respondent. Due to some irregularities in the data and for simplification the Educi variable is defined by: the highest completed grade


account that it is possible that some have finished their education when their first wage is observed, while others may still be students for a couple of years and at the same time earning wage.

Following Cawley et al. (2001), the cognitive ability effect is presented with and without the education variable. Cognitive ability and noncognitive ability are also left out in turn to measure the explanatory power of these important determinants.

The presence of training is reflected by the following variable:

\[ \mathit{Trai}_i = \frac{\sum_{t=start_i}^{end_i} \mathit{TRAI}_{i,t}}{\Delta t} \]

$start_i$ = time at which the start wage is observed (in years).

$end_i$ = time at which the end wage is observed (in years).

$\mathit{TRAI}_{i,t}$ = dummy variable indicating whether individual $i$ attended training at time $t$.⁶

$\Delta t$ = difference in time between the observation of the start wage and the observation of the end wage (in years).

It follows that $\mathit{Trai}_i$ lies between 0 and 1. For example, when individual $i$ has $\mathit{Trai}_i$ equal to 0.5, he or she attended on-the-job training in half the years of the observed interval. The assumption is that training attended before the observed interval does not have an extra direct effect on wage growth, but only an indirect effect via the start wage. So pre-interval training is ignored here, as is, for obvious reasons, training attended after the interval.

Since we only have data on on-the-job training from 1988 onwards, some shorter time intervals will appear. Therefore the coefficients will be estimated with and without the training variable $\mathit{Trai}_i$ (apart from the shorter intervals nothing changes).

Control variables

The two most common characteristics in individual micro-economic research are gender and ethnicity (together with age, which is defined later). $\mathit{Gend}_i$ is a dummy that equals 1 if individual $i$ is a woman and 0 otherwise. $\mathit{Black}_i$ and $\mathit{Hispanic}_i$ are two dummies ($\mathit{Ethn}_i$) for individual $i$ being either black or Hispanic. Other ethnic groups are left out for identification.

To control for the differences in intervals between the respondents, time-interval dummies are included in the model. These dummies control for inflation and other macroeconomic variation over time between the observations. There are a total of 276 possible intervals within this way of estimating wage growth (see Appendix). Therefore we include 275 dummies in the regression, leaving out the 1979-1980 interval for identification. In theory we thus have 275 time-interval dummies, but in practice this number turns out to be smaller (depending on the method of estimation), because many intervals are not observed after deleting the non-relevant observations.

Apart from time effects, wages in one region may have grown faster than in one or more others.⁷ To control for this possibility we add 15 residence dummies. The dummies in the model are: $\mathit{NorthC}_i$ if respondent $i$'s residence was in the north-central part at the time of the first observed wage and at the time of the last observed wage. $\mathit{South}_i$ is defined similarly for the southern part and $\mathit{West}_i$ for the western part of the United States. To account for those who moved to another region during their individual observed wage interval, we include dummies for each such move. So if the respondent lived in the north-central part at the time of the first observed wage and in the southern part at the time of the last observed wage, he or she is ‘attached’ to the specific ‘north-central south’ dummy. The other 11 residence dummies are defined similarly. The north-eastern region is the reference category (those who lived in the north-eastern part of the country at the time of both the first and the last observed wage).

⁶ If there was no response in a certain year, this year is ignored in determining the $\mathit{Trai}_i$ variable.
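The dummy counts can be checked with a short enumeration. The sketch below assumes the NLSY79 survey years were annual from 1979 to 1994 and every two years from 1996 to 2010, and it assumes four census regions; the variable names are hypothetical.

```python
from itertools import combinations, product

# Assumed survey years: annual through 1994, every two years afterwards.
SURVEY_YEARS = list(range(1979, 1995)) + list(range(1996, 2011, 2))

# Every (start, end) pair of survey years with start < end is one interval.
intervals = list(combinations(SURVEY_YEARS, 2))
# The 1979-1980 interval is the reference category.
time_dummies = [pair for pair in intervals if pair != (1979, 1980)]

REGIONS = ["NorthEast", "NorthCentral", "South", "West"]
# One category per (region at start wage, region at end wage) pair.
residence_pairs = list(product(REGIONS, repeat=2))
# Staying in the north-east for both observations is the reference category.
residence_dummies = [p for p in residence_pairs if p != ("NorthEast", "NorthEast")]

print(len(intervals), len(time_dummies), len(residence_dummies))  # 276 275 15
```

Under these assumptions, the 24 survey years give 276 possible intervals, and the 16 start/end region pairs give 15 dummies once the reference category is dropped, matching the counts in the text.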

Following the model of Lazear (1974) we add age at the time the first wage is reported and its squared value⁸: $\mathit{Age}_{i,start_i}$ and $\mathit{Age}^2_{i,start_i}$. As mentioned in the literature part, we doubt whether these two will have a significant effect on wage growth. Therefore they will be removed from the model if they appear to be insignificant in the start model. There is no theoretical basis to assume that in the modified models $\mathit{Age}_{i,start_i}$ and $\mathit{Age}^2_{i,start_i}$ will have a stronger effect (in contrast to the other regressors).

To control for marriage effects we apply the same technique as for on-the-job training:

\[ \mathit{Marr}_i = \frac{\sum_{t=start_i}^{end_i} \mathit{MARR}_{i,t}}{\Delta t} \]

$\mathit{MARR}_{i,t}$ = dummy variable indicating whether individual $i$ was married with the spouse in the same household at time $t$.⁹

The other variables are defined as in the $\mathit{Trai}_i$ equation. In the break test discussed later we check whether the marriage variable has a different effect for women than for men. To account for having children during the period of the observed wage growth, the following variable has been used:

\[ \mathit{CHILD}_i = \frac{\sum_{t=start_i}^{end_i} \#\mathit{Children}_{t,i}}{\Delta t} \]

$\#\mathit{Children}_{t,i}$ = number of children of individual $i$ at time $t$.

$\mathit{CHILD}_i$ = the average number of children of individual $i$ during the time interval $\Delta t$.

⁷ Cawley et al. (2001) found region of residence to explain some part of the variance in wages.
⁸ Lazear (1974) did not include the squared value of age.
⁹ The same procedure as for the training variable is followed here with respect to the availability

Because $\mathit{CHILD}_i$ does not have to be an integer, it is not intuitive to work with, so it is modified a bit further. When $\mathit{CHILD}_i$ is smaller than or equal to 0.5, individual $i$ is assumed to have had on average 0 children in the corresponding interval. If $\mathit{CHILD}_i$ lies between 0.5 and 1.5, we say that the individual has had on average 1 child in that period. When $\mathit{CHILD}_i$ is bigger than 1.5, it is interpreted as the individual having had two or more children (on average) between start time and end time. Two dummies are added, $\mathit{Child1}_i$ and $\mathit{Child2}_i$:

$0 \le \mathit{CHILD}_i \le 0.5$ → no dummy (for identification)

$0.5 < \mathit{CHILD}_i \le 1.5$ → $\mathit{Child1}_i$

$\mathit{CHILD}_i > 1.5$ → $\mathit{Child2}_i$

Again, just as for $\mathit{Marr}_i$, we will distinguish in the break test between the ‘child’ effect on women and on men.
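The thresholds above translate directly into code. The following is a minimal sketch; the function name `child_dummies` is hypothetical.

```python
def child_dummies(child_avg):
    """Map the average number of children to the pair (Child1, Child2).

    Averages of at most 0.5 form the reference group (0, 0).
    """
    if child_avg < 0:
        raise ValueError("the average number of children cannot be negative")
    child1 = int(0.5 < child_avg <= 1.5)  # on average one child
    child2 = int(child_avg > 1.5)         # on average two or more children
    return child1, child2
```

Note that the boundary values 0.5 and 1.5 fall in the lower category, in line with the inequalities above.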

The final control variable which is hypothesized to have a possible effect on wage growth is the start wage, $\mathit{Wage}_{i,start_i}$. Is a relatively high start wage accompanied by higher wage growth, or is it the other way around? As in the definition of $\mathit{WGrowth}_i$, $\mathit{Wage}_{i,start_i}$ is the first wage observed for individual $i$ in the interval 1979-2010. The logarithm of the start wage will be used as a regressor: $\log(\mathit{Wage}_{i,start_i})$.

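To summarize, we have the following equation to be estimated:

\[
\begin{aligned}
\mathit{WGrowth}_i ={}& \beta_0 + \beta_1 \log(\mathit{Wage}_{i,start_i}) + \beta_2 \mathit{AFQT}_i + \beta_3 \mathit{Educ}_i + \sum_{k=4}^{6} \beta_k \mathit{Soc}_{i,k} \\
&+ \beta_7 \mathit{Locus}_i + \beta_8 \mathit{SES}_i + \beta_9 \mathit{Gend}_i + \sum_{k=10}^{11} \beta_k \mathit{Ethn}_{i,k} + \beta_{12} \mathit{Age}_{i,start_i} \\
&+ \beta_{13} \mathit{Age}^2_{i,start_i} + \beta_{14} \mathit{Marr}_i + \sum_{k=15}^{16} \beta_k \mathit{Child}_{i,k} + \sum_{k=17}^{31} \beta_k \mathit{Resid}_{i,k} \\
&+ \sum_{k=32}^{33} \beta_k \mathit{EducPar}_{i,k} + \sum_{k=35}^{309} \beta_k T_{i,k} + (\beta_{34} \mathit{Trai}_i) + \varepsilon_i
\end{aligned}
\]

$T_i$ are the time-interval dummies and $\mathit{EducPar}_i$ are the highest completed grades of the parents.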

Pitfalls

Following the methods of Blackburn and Neumark (1993) and Van Praag et al. (2009) and the three sources of bias associated with OLS estimates of the return to schooling (Harmon and Walker, 1995)¹⁰, the education variable is possibly endogenous. The OLS estimate might therefore be biased and inconsistent. A general and often used approach to obtain a consistent estimate for $\mathit{Educ}_i$ is the two-stage least squares method¹¹. $\mathit{Educ}_i$ is instrumented and the coefficient will be compared with the OLS variant. The instruments collected from the dataset, all family background variables, are: 1) presence of newspaper(s) in the household at age 14, 2) presence of magazine(s) in the household at age 14, 3) presence of a library card in the household at age 14, 4) number of siblings in the household (in 1979), 5) number of older siblings (in 1979), 6) foreign language spoken (in 1979), 7) living with both biological parents until age 18, 8) highest grade of the father and 9) highest grade of the mother.¹² One of the two requirements of good instruments is that they must be uncorrelated with the error term, meaning that they should not influence wage growth in a direct way (instrument exogeneity). This is tested by the Hansen J statistic. The other requirement is that the instruments need to be relevant: they must have a significant effect on the endogenous variable. The strength (joint significance) of all the instruments is tested with an F-test¹³. A statistic which only tests for the relevance of the excluded instruments (the 7 or 9 instruments just listed) is the Kleibergen-Paap LM statistic (underidentification test), which is provided as well.

Note that with two-stage least squares the bias in the coefficients of the endogenous regressors might not disappear. The bias could become even larger in the case of weak instruments (Hahn and Hausman, 2002).

Several studies have been published discussing instrumental variable estimation methods other than 2SLS which theoretically should provide less biased estimates when there are weak instruments (e.g. Stock and Yogo, 2001, Hahn and Hausman, 2002). Hahn and Hausman (2002) describe ‘possible cures’ to the weak instruments problem, which will be partly followed here (in case of weak instruments). They suggest using the limited information maximum likelihood (LIML) estimation method¹⁴, the Fuller estimation method¹⁵ and the Jackknife-2SLS (JK2SLS)¹⁶ method, which will be used here next to the two-stage least squares method.

¹⁰ One of the three sources is “ability bias”, which is ‘tackled’ in this framework by including a proxy of cognitive ability (AFQT).
¹¹ Developed more or less independently by Theil (1953a, b), Basmann (1957) and Sargan (1958).
¹² Blackburn and Neumark (1993) and Van Praag et al. (2009) included a couple more instruments, but here we restrict ourselves to 7 or 9, because not all of the instruments they included are available in the NLSY79.
¹³ The Anderson-Rubin F-statistic (1949).
¹⁴ Introduced by Anderson and Rubin (1949).
¹⁵ Fuller (1977).

More theoretical explanation and previous results will follow later.

Blackburn and Neumark (1993) added two variables indicating the educational level of the mother and the father to the instruments. Van Praag et al. (2009), however, argue that these two family background variables may influence labour market performance (returns) in a direct way, besides the effect on education, and thereby fail the condition that the instruments must be uncorrelated with the error term (instrument exogeneity). They suggest excluding parental education from the instruments and using it as a control variable in the OLS model. Depending on the significance of the educational level of the parents (highest completed grade of the parents) in the regression of wage growth on the independent variables, we decide whether to include parental education as control variables or to use them as instruments in explaining the educational level of the respondents.

Further, Blackburn and Neumark (1993) argue to use the family background variables to instrument for the test scores of ability as well, as this variable might be measured with error (see Blackburn and Neumark, 1993). Therefore we implement the same procedure (2SLS) as for the (endogenous) variable education (with the same instruments). The hypothesis of endogeneity for the education and ability variables is checked with a Hausman test.

A high start wage may or may not have a positive effect on future wage growth. When someone already has a high start wage (first observed wage) because his or her cognitive ability is above average or because he or she has completed a high level of education, which is likely based on the discussed literature, we might assume that his or her wage growth is higher than average as well (based on the same theory). This means that wage growth and start wage are possibly affected by the same determinants (causing inconsistent and biased estimators), and so we expect start wage to be endogenous. A common procedure to account for this inconsistency is again 2SLS, so we will apply the same approach as for the (endogenous) variables AFQT and education. LIML, Fuller and JK2SLS are run with exactly the same instruments as for AFQT and education.

IV estimation methods

1. 2SLS

The 2SLS estimator and its variance are defined as follows:

\[ \hat{\beta}_{2SLS} = (X' P_Z X)^{-1} (X' P_Z Y) \]

with $X$ being the matrix of endogenous and exogenous regressors, $Y$ the $N \times 1$ vector of the dependent variable (wage growth), $Z$ the matrix with all exogenous covariates and the instruments, and $P_Z = Z(Z'Z)^{-1}Z'$ the projection matrix of $Z$.

The variance of the 2SLS estimator:

\[ V_{2SLS} = N (X' P_Z X)^{-1} \left( X'Z (Z'Z)^{-1} \hat{S} (Z'Z)^{-1} Z'X \right) (X' P_Z X)^{-1} \]

in case of heteroskedastic errors (assumed here), with $\hat{S} = N^{-1} \sum_{i=1}^{N} \hat{u}_i^2 z_i z_i'$ and $\hat{u}_i = y_i - x_i' \hat{\beta}_{2SLS}$.

2. LIML

This estimator has a much more cumbersome derivation than the 2SLS estimator and was therefore not preferred by statisticians in the past, and still is not. LIML is a linear combination of the OLS and 2SLS estimates (with the weights depending on the data), and the weights happen to be such that they (approximately) eliminate the 2SLS bias (in theory) when dealing with weak instruments.

The error terms of the first-stage regression and of the second-stage regression, where wage growth is regressed on the predicted regressors (together the reduced form equation), are assumed to be normally distributed. However, consistency and asymptotic normality of the estimator do not rely on this assumption. We have the following well-known likelihood function

\[ L(\beta, \pi, \Omega) = \sum_{i=1}^{N} \left( -\frac{1}{2} \ln|\Omega| - \frac{1}{2} \begin{pmatrix} Y_i - \beta Z_i'\pi \\ X_i - Z_i'\pi \end{pmatrix}' \Omega^{-1} \begin{pmatrix} Y_i - \beta Z_i'\pi \\ X_i - Z_i'\pi \end{pmatrix} \right), \]

with $\Omega$ being the variance-covariance matrix in the reduced form equation. Another way of obtaining the LIML estimator is to express it as a k-class estimator

\[ \hat{\beta}_{liml} = (X'(I - k_{liml} M_Z) X)^{-1} (X'(I - k_{liml} M_Z) Y), \]

where $M_Z = I - Z(Z'Z)^{-1}Z'$ and $k_{liml}$ is equal to the minimum eigenvalue $\lambda$:

\[ \lambda = \min_{\beta} \frac{(Y - X\beta)'(Y - X\beta)}{(Y - X\beta)' M_Z (Y - X\beta)} \]

The variance of the LIML estimator differs from the 2SLS variant only because the residuals entering $\hat{S}$ are different.

The k-class estimators are instrumental variables estimators in which the regressors used in the second stage are a weighted average of the actual endogenous variables and their first-stage fitted values: $X^* = (1 - k)X + k\hat{X}$. In the OLS and 2SLS cases $k$ is 0 and 1 respectively. For the LIML method $k$ is a stochastic value.

3. Fuller

The Fuller estimator is defined in the same way as the LIML estimator except for the value of $k$. In Fuller’s case $k$ is set as $k = \lambda - \alpha/(N - L)$, where $L$ is the number of instruments and $\alpha$ is a self-specified positive constant. Davidson and MacKinnon (1993) suggest $\alpha = 1$ as a good choice (to give approximately unbiased estimators), and we have followed their suggestion.

As with LIML, the Fuller variance differs from the 2SLS variance because of different residuals.

4. Jackknife

In the case of the Jackknife estimator all observations except observation $i$ are used to estimate the first-stage parameters:

\[ \hat{\pi}_{-i} = (Z_{-i}' Z_{-i})^{-1} Z_{-i}' X_{-i} \]

Four different versions of the Jackknife estimator are described by Poi (2006). We restrict ourselves to the one which yields the best results in Monte Carlo simulations, called the Unbiased Jackknife Instrumental Variables Estimator 1 (UJIVE1), defined by:

\[ \hat{\beta}_{UJIVE1} = (\hat{X}'X)^{-1} \hat{X}'y \]

with the $i$-th row of $\hat{X}$ defined as $z_i \hat{\pi}_{-i}$.

The heteroskedasticity-robust variance estimator is:

\[ \widehat{Var}(\hat{\beta}_{UJIVE1}) = (\hat{X}'X)^{-1} \left( \sum_{i=1}^{N} \hat{\epsilon}_i^2 \hat{x}_i \hat{x}_i' \right) (X'\hat{X})^{-1} \]

where $\hat{\epsilon}_i^2 = (y_i - x_i \hat{\beta}_{UJIVE1})^2$.
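The four point estimators can be illustrated with a compact NumPy sketch. It is not the Stata code behind the reported results: all function names are hypothetical, `Z` is assumed to contain the exogenous regressors plus the excluded instruments, and variance estimation is left out. The k-class function uses $X^* = (1 - k)X + kP_ZX$, which is algebraically the same as the $(I - kM_Z)$ form, and the jackknife fits use the standard leave-one-out identity based on the leverage $h_i = z_i(Z'Z)^{-1}z_i'$ rather than refitting the first stage for every observation.

```python
import numpy as np

def project(Z, A):
    """Fitted values from regressing each column of A on Z, i.e. P_Z A."""
    return Z @ np.linalg.lstsq(Z, A, rcond=None)[0]

def k_class(y, X, Z, k):
    """k-class estimate with X* = (1 - k) X + k P_Z X (k=0: OLS, k=1: 2SLS)."""
    X_star = (1.0 - k) * X + k * project(Z, X)
    return np.linalg.solve(X_star.T @ X, X_star.T @ y)

def liml_k(y, X_endog, X_exog, Z):
    """Smallest eigenvalue of (W' M_Z W)^{-1} (W' M_X1 W), with W = [y, X_endog]."""
    W = np.column_stack([y, X_endog])
    R_exog = W - project(X_exog, W)  # residuals after the exogenous regressors
    R_all = W - project(Z, W)        # residuals after all columns of Z
    eig = np.linalg.eigvals(np.linalg.solve(R_all.T @ R_all, R_exog.T @ R_exog))
    return float(eig.real.min())

def fuller_k(lam, n_obs, n_instruments, alpha=1.0):
    """Fuller's modification of the LIML value of k."""
    return lam - alpha / (n_obs - n_instruments)

def jackknife_fitted(X, Z):
    """Leave-one-out first-stage fits z_i * pi_{-i} for every observation."""
    fitted = project(Z, X)
    # Leverage h_i = z_i (Z'Z)^{-1} z_i' for each row of Z.
    h = np.einsum("ij,ij->i", Z @ np.linalg.inv(Z.T @ Z), Z)
    return (fitted - h[:, None] * X) / (1.0 - h[:, None])

def jackknife_iv(y, X, Z):
    """Jackknife IV estimate (X_hat' X)^{-1} X_hat' y with leave-one-out fits."""
    X_hat = jackknife_fitted(X, Z)
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
```

A LIML estimate would then be `k_class(y, X, Z, liml_k(y, X_endog, X_exog, Z))`, and the Fuller estimate follows by passing that value through `fuller_k` first.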

Monte Carlo results

The Monte Carlo results of various researchers have led to some mixed conclusions. Stock and Yogo (2002) found in their Monte Carlo setup that LIML is “far superior” to 2SLS in terms of coverage rates when they have to deal with weak instruments. Even when their measure of strength of the instruments was weak the coverage rates of the LIML approximated the nominal rate quite closely. These findings approximately correspond to the findings of Poi (2006).

Another result relating to LIML was obtained by Hahn et al. (2004). They demonstrate that the ‘moment problem’¹⁷ causes problems for the LIML estimator in the weak instruments situation (finite sample (second-order) bias). The inter-quartile range (IQR) of the LIML estimator often far exceeds the IQR of the 2SLS estimator. In addition, the root mean-squared errors of the 2SLS estimators are often considerably lower than those of LIML.

With respect to the Fuller estimator and the JK2SLS estimator (both have finite sample moments), they both give significantly lower IQR values than the LIML estimator does. In terms of MSE the Fuller estimator does better than the JK2SLS when a small number of instruments is used. When the number of instruments increases the JK2SLS appears to be the better estimator.

Although the approximate second-order MSEs of the Fuller estimator and of the JK2SLS estimator (Hahn et al., 2004) predict the Fuller estimator to have a lower MSE, in practice this advantage is overestimated and is only one half of the theoretical value. The same is true for the 2SLS, whose MSE is lower in the Monte Carlo results than predicted by its theoretical approximation. All in all Hahn et al. (2004) conclude that the 2SLS, JK2SLS and the Fuller estimator perform better than the LIML estimator and that 2SLS seems to perform better in terms of MSE than predicted. The Monte Carlo evidence of Angrist et al. (1999) and Blomquist and Dahlberg (1999) shows some mixed results for the JK2SLS compared to the 2SLS estimator. There is potential for bias reduction, but also for an increase of the variance. So the JK2SLS may not necessarily have an improved MSE. Poi (2006) provides evidence that, in terms of IQR and coverage rates, the JK2SLS could be an improvement over the 2SLS when there are weak instruments.

Multiple intervals

So far we have restricted ourselves to one interval of more than 30 years and have assumed a linear relationship in which wage growth is constant each year. The time-interval dummies control for different tendencies of the market over time (a booming economy or a depression in the extreme cases). They do not, however, control for non-linearities of the wage growth per individual over time. It could be the case that wages grow faster at the start of someone’s career, or the other way around. Another possibility is that some individual characteristics have different effects in different periods in time. For example, after the Perry Preschool program¹⁸ cognitive test scores appeared to have fallen greatly within a few years, in contrast to the improved social skills (self-discipline and self-control), which persisted for at least 20 years. So there is some evidence that our noncognitive ability measures become (more) important in explaining wage growth over time and that AFQT’s coefficient will be less powerful.

¹⁷ The fact that LIML has no finite sample moments (see Mariano and Sawa (1972) and Sawa

The complete interval 1979-2010 is separated into three new intervals of approximately 10 years: 1979-1988, 1989-1998 and 2000-2010 (1999 is not observed). These intervals are estimated in the same manner as the complete one, with OLS and 2SLS (depending on the results of the other estimation methods). The only variable which will be treated differently in the three intervals is self-esteem. Self-esteem has been measured at three different moments: 1980 (N=11,992), 1987 (N=10,340) and 2006 (N=7,370). In the total interval the self-esteem measure of 1980 has been used, being the most frequently observed of the three.

Self-esteem does not appear to be time-invariant, as the correlations between the three measurements are 0.4549 (’80 and ’87), 0.3188 (’80 and ’06) and 0.4143 (’87 and ’06). By relating the self-esteem measure of 1980 to wage growth in the period 1979-1988, the 1987 measure to wage growth in the 1989-1998 period and the 2006 measure to wage growth in the interval 2000-2010, we correct as much as possible for the time-varying nature of self-esteem and thereby obtain better estimates.

Split sample groups

In addition to the different intervals in years, the effects of human capital on different groups of people within the sample will be discussed.

First we distinguish those with a high wage level from those with lower wages. Because it is impossible to have a straightforward definition of who has a high wage level, we suggest the following definition of wage of individual i:

\[ \mathit{Wage}_i = \frac{\mathit{Wage}_{i,start_i}/(1 + \mathit{DRate}_{start_i}) + \mathit{Wage}_{i,end_i}/(1 + \mathit{DRate}_{end_i})}{2} \]

where $\mathit{DRate}_{start_i}$ is equal to the rate at which the wage index in the year in which the first wage is observed changed compared to the 1979 wage index¹⁹ (see Table 2): $(\mathit{WageIndex}_{start_i} - \mathit{WageIndex}_{1979})/\mathit{WageIndex}_{1979}$. $\mathit{DRate}_{end_i}$ is defined in a similar manner. Discounting the wages to 1979 values is done to assign more weight to the wages observed earlier, as wages have increased over time (by approximately 263%) due to the macro-economic factors already discussed.
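The wage-level definition can be written out as a small function. This is a sketch only: the function names are hypothetical and the index values in the usage note are made up for illustration, not taken from Table 2.

```python
def discount_rate(index_year, index_1979):
    """Relative change of the wage index compared with the 1979 index."""
    return (index_year - index_1979) / index_1979

def average_discounted_wage(wage_start, index_start, wage_end, index_end, index_1979):
    """Mean of the start and end wage after discounting both to 1979 values."""
    d_start = discount_rate(index_start, index_1979)
    d_end = discount_rate(index_end, index_1979)
    return (wage_start / (1 + d_start) + wage_end / (1 + d_end)) / 2
```

With a made-up index of 100 in 1979 and 200 in the end year, a start wage of 10 observed in 1979 and an end wage of 30 give (10/1 + 30/2)/2 = 12.5.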

Next, all computed wages are sorted from low to high and split into two samples: WageLow and WageHigh. These samples will be compared using the described estimation methods. The Chow break test will test for significant differences in the coefficients. The same procedure (sorting) will be repeated for AFQT (corrected), educational level and the self-esteem measure. The sample will also be split between men and women to test for non-constant parameters here. This could be interesting, as there might be differences in the strength of human capital in the determination of wage growth among individuals with different levels of human capital. Further, we are interested in whether the effects of human capital on wage growth are constant between men and women and whether the differences in wage level determination (see literature) can be observed for wage growth as well.

¹⁸ The Perry Preschool program was a social experiment conducted in the early 1960s in Ypsilanti (see Schweinhart et al. (1993))
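A split-sample comparison of this kind can be sketched as follows. The code implements the classical Chow F statistic under homoskedastic errors and assumes the sample is split at the median; both are assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares of an OLS regression of y on X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return float(resid @ resid)

def median_split(values):
    """Boolean mask that is True for observations above the sample median."""
    values = np.asarray(values)
    return values > np.median(values)

def chow_f(y, X, in_high):
    """Classical Chow F statistic for equal coefficients in two subsamples."""
    in_high = np.asarray(in_high, dtype=bool)
    k = X.shape[1]
    rss_pooled = rss(y, X)
    rss_split = rss(y[in_high], X[in_high]) + rss(y[~in_high], X[~in_high])
    dof = len(y) - 2 * k
    return ((rss_pooled - rss_split) / k) / (rss_split / dof)
```

Under the null hypothesis of equal coefficients the statistic follows an F distribution with $k$ and $N - 2k$ degrees of freedom, where $k$ is the number of regressors.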

5

Results

After deleting all individuals in the dataset for which not all data were available in the complete interval (using the self-esteem measure of 1980 and with the exception of the training variable), we are left with a total of 4,796 respondents. Some time-interval dummies are only observed once. Because Stata is not always able to compute all the regression results if this is the case (singularity of the variance-covariance matrix), the corresponding dummies and individuals have been deleted as well. If we included them nevertheless, these once-observed variables would drive the $R^2$ value of most regressions up to more than 0.9. They do not influence the values of the coefficients and the corresponding standard errors. When the instruments are excluded, the number of complete observations increases to 5,510.

First we checked whether the AFQT correction resulted in significant changes of (primarily) the AFQT and education coefficients. In Table 5²⁰ the coefficients of the OLS regression are presented with and without corrected AFQT. From this we see that the coefficients and standard errors have hardly changed. The AFQT coefficient has dropped a bit and that of education increased slightly. AFQT is significant at the 1% level; a 1 point increase in one’s score brings a 0.04/0.03% higher yearly wage growth. Highest completed education is significant as well, though at the 5% level. The education effect increases after the correction for AFQT from 0.060% to 0.067% (for every higher grade completed) and becomes significant at the 1% level. Concerning the other three human capital variables, only self-esteem is significant when the AFQT score is corrected. Per point increase on the Rosenberg Scale, annual wage growth is 0.11 percent higher. The other measures of noncognitive ability are of the expected sign, but not significant. In the third and fourth column we check what happens after excluding AFQT or education from the model. Without a cognitive ability measure the education effect increases further to 0.077%. This is in accordance with the findings of Ashenfelter et al. (1999), who found education to be partly determined by (cognitive) ability. The correlation between education and cognitive ability has clearly not been removed completely, as is also shown in the Appendix in the description of the corrected AFQT score. When excluding education we see a similar pattern for AFQT, which increases to 0.05%.

Further significant variables are start wage, gender, residence and grade father. The negative sign for start wage is difficult to interpret. It is possible that part of the individuals started reporting their wage when having a full-time job, while others already reported the wage of a part-time job²¹. In this way those with higher start wages are more likely to show less growth over their entire career than the ‘part-time providers’. Another possibility is suggested by macro-economic empirical findings. Mankiw et al. (1992) provided evidence of convergence of per capita income at the cross-country level. The log difference (1960-1985) of GDP per working-age person was regressed on the initial level of income in 1960 and a couple of other variables. A significantly negative coefficient of ln(Y60) indicates convergence. This would suggest that the negative coefficient for our start wage variable could be explained by a convergence of wages within a country.

²⁰ The regression results of the time-interval dummies and the residence dummies for those who

Women experience an annual wage growth that is almost 3.5 percent lower than that of men. In the north-central and southern parts of the U.S. wages have clearly grown less than in the north-eastern part. Highest completed grade of the father is significant in determining wage growth, so we leave this variable (just as grade mother) in the regression, instead of using it as an instrument. Start age and start age squared are significant in none of the regressions. We do not expect these variables to have more influence in the upcoming regressions and leave them out hereafter.

For testing the hypothesis of start wage, AFQT and education to be endogenous, the 2SLS, LIML, Fuller and Jackknife estimates are presented in Table 6. The OLS results show that in this sample start wage, AFQT, education and self-esteem are significant (at least up to 5%) and approximately the same as in Table 5. Gender, residence and grade father show similar results.

Instrumenting for the three possibly endogenous regressors, we see that the effect of AFQT increases for 2SLS, LIML and Fuller, but it is only significant at the 10% level. Education increases as well for 2SLS and LIML and stays significant at the 1% level. The significance of the self-esteem variable disappears in all four methods. For the 2SLS, LIML and Fuller estimates the coefficients are all relatively similar to the OLS estimates, except for the educational level of the parents. Remarkably, the grade of the mother now becomes significant (and negative) and grade father insignificant.

When we compare the four instrumental variables estimation methods, we notice 2SLS to be the most efficient estimator, as its standard errors are lower than all other comparable ones. This is in line with the findings of Poi (2006) for the jackknife variant. For determining which of the estimation methods is best we follow Hahn et al. (2004), who used the mean-squared error as a criterion (we use the root of the MSE (RMSE)). Based on the RMSE, 2SLS is the preferred instrumental variables estimation method, then Fuller, followed by LIML and JK2SLS. Especially the JK2SLS gives deviating results. Apart from gender none of the coefficients is significant and the RMSE of this estimation method is considerably larger than the RMSE of the other methods. A slightly different sample has also been used²², for which the RMSEs of 2SLS, LIML, Fuller and JK2SLS were 0.09465, 0.17124, 0.1154 and 0.1015 respectively. This demonstrates that the 2SLS, Fuller and JK2SLS estimators are indeed preferred over LIML, which is in accordance with Hahn et al. (2004). The fact that we here use fewer instruments (7 instead of 9) could explain why Fuller gives a lower RMSE in Table 6.

The 2SLS and LIML first-stage regression results are given in Table 7. The corresponding F-values (27.82 and 121.38) indicate that AFQT and education are explained strongly by the instruments and start wage weakly (4.80). The underidentification test of Kleibergen-Paap, for relevance of the excluded instruments, has a p-value of 0.2284, not satisfying the condition of strong instruments. The Hansen J statistic (overidentification test of all instruments) shows the instruments to satisfy the exogeneity condition (p-value = 0.6022). In the other sample, just mentioned, the p-value of the Kleibergen-Paap statistic turned out to be 0.0726, satisfying this condition at the 10% level, and the first-stage F-statistics were 13.86 (start wage), 29.79 (AFQT) and 94.29 (education). So in one sample the instruments are relatively strong in explaining the ‘endogenous’ variables and in the sample used here the instruments are weak. Nevertheless, in both samples 2SLS is, based on RMSE, the preferred estimator, which does not match the discussed theoretical literature, but does match the Monte Carlo results of Hahn et al. (2004).

The Hausman test²³ does not reject the null hypothesis of exogeneity for any of the four instrumental variables methods (just as Blackburn and Neumark (2003) concluded). All p-values of the Hausman test are 1.0000 (see the last line of Table 6). This means that OLS is the preferred estimation method.
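For reference, the Hausman statistic compares the IV and OLS coefficient vectors, weighted by the difference of their covariance matrices: $H = (\hat{\beta}_{IV} - \hat{\beta}_{OLS})'(V_{IV} - V_{OLS})^{-1}(\hat{\beta}_{IV} - \hat{\beta}_{OLS})$. The sketch below computes only the statistic; the function name is hypothetical, the inputs in the usage note are made up, and a generalized inverse is used as a precaution in case the covariance difference is singular.

```python
import numpy as np

def hausman_statistic(b_iv, b_ols, V_iv, V_ols):
    """Hausman statistic from IV and OLS estimates and their covariance matrices."""
    diff = np.asarray(b_iv, dtype=float) - np.asarray(b_ols, dtype=float)
    V_diff = np.asarray(V_iv, dtype=float) - np.asarray(V_ols, dtype=float)
    # pinv guards against a singular difference of covariance matrices.
    return float(diff @ np.linalg.pinv(V_diff) @ diff)
```

With made-up inputs `b_iv = [1.0, 2.0]`, `b_ols = [1.0, 1.5]`, `V_iv = 0.5 * I` and `V_ols = 0.25 * I` the statistic equals 1.0. A statistic close to zero, as when IV and OLS nearly coincide relative to their variances, corresponds to a p-value close to 1.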

Next, our assumption is that only start wage is endogenous. Therefore start wage is still instrumented (with the same instruments) and tested in the next section of the thesis.

²² The residence dummies of those who moved during the observed wage interval were not included and the parental education variables were used as instruments instead of second-stage regressors.
²³ The default standard errors (homoskedastic errors) have been used to perform this test, because

Now the possibility of non-linearities over the different time intervals is studied. In order to compare the different intervals the sample has been reduced to 2,656 individuals. In particular the 2000-2010 interval contained fewer observations than the early years of the survey. The descriptive statistics in Table 4 show that wage growth is on average rather high in the first years of employment (38% annual growth) compared with later in the career. This high growth is probably the result of the fact that people in the early stage of their career are more likely to have part-time or side jobs. Therefore we expect the coefficient results from 1989 onwards to be more reliable with respect to the distinction between part-time and full-time jobs. The marital and children variables have developed as expected.

In Table 8 we see the results of the OLS and 2SLS (only start wage is instrumented) regressions. We notice that AFQT is highly significant in the OLS case; especially in the first years of working, cognitive abilities seem important for wage development. 2SLS, although it is only significant in the second period, reflects a relatively constant effect of cognitive ability (even negative in the first period). Contrary to this observation is the education effect, for which it takes more time to sense a positive influence on wage growth (approximately 1% annually). The 2SLS is here roughly in accordance with OLS in the second period, but not in the last period. People with a higher self-esteem also experience higher wage growth later on in time in the OLS variant. In the 2SLS variant the contrary is true; at the start of the career differences in self-esteem tend to have more impact on wage growth than later on in careers. Again the shyness of people does not seem to determine any wage growth. Noteworthy is that in the separated-interval version Locus of Control is significant for 2 out of the 3 periods, especially among employees of around 27 years of age. Those who find themselves to be always in control of their lives suffer lower wage growth. Again the 2SLS is significant in the first part of the career, while OLS is significant later on.

Interesting is that the convergence of wages among men and women over the last few decades is depicted here (OLS). The 2SLS estimates of the first and second period show a similar pattern in which gender is a significant determinant. The third period however does not resemble the OLS coefficient.

Proceeding with some other control variables, the marital variable is of the expected sign in the second and third period in the OLS and in the first and second period in the 2SLS. The effect of having children on growth of wages in the 2SLS case seems to correspond slightly more to the past literature in the beginning of the career. This is probably because the children are then on average younger and require more care from the parents. Wage growth in the north-eastern U.S. has in particular been higher in the 80s.

The Hausman test again does not reject the null hypothesis of exogeneity of the start wage regressor in any of the three periods (see Table 3), so OLS is preferred in interpreting the determinants of wage growth in different time intervals. Furthermore we see that in the middle time period the instruments do not even satisfy the exogeneity criterion (p-value = 0.0641). Despite the fact that OLS is preferred over 2SLS, the other statistics of the 2SLS are presented in Table 3 as well.

Checking the possibility of a clear break point in the sample for wage level, the lower wage level and upper wage level are obtained as described in the methodology. From the full sample of 4,842 observations, we now have the samples “lower” and “upper”, both with 2,421 observations, with regression results in Table 9 together with the Chow-break test statistic24. The Chow-break test confirms the hypothesis of a certain break in the parameters across the sample at the 1% level. Under the null hypothesis of constant parameters the test follows the F(K, N1 + N2 − 2K) distribution. K is here 201, the number of explanatory variables in the complete sample. In the sub-samples we end up with fewer variables, but using the higher K value gives a stricter condition for rejecting the null hypothesis, so we prefer this one. N1 and N2 are the sample sizes in the two subsets. This will also hold for the tests conducted in the other split samples.
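The Chow-break statistic with its F(K, N1 + N2 − 2K) reference distribution can be illustrated with a short Python sketch. It uses synthetic data and hypothetical names (`chow_statistic`, `make_sample`), and it assumes the same K regressors in both sub-samples, which is a simplification of the choice of K described above.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def chow_statistic(X1, y1, X2, y2):
    """Chow F statistic for the null of equal coefficients in two sub-samples.
    Returns the statistic and its degrees of freedom (K, N1 + N2 - 2K)."""
    k = X1.shape[1]
    n1, n2 = len(y1), len(y2)
    rss_pooled = rss(np.vstack([X1, X2]), np.concatenate([y1, y2]))
    rss_split = rss(X1, y1) + rss(X2, y2)
    df2 = n1 + n2 - 2 * k
    f_stat = ((rss_pooled - rss_split) / k) / (rss_split / df2)
    return f_stat, (k, df2)

# Synthetic illustration (assumed data, not the NLSY79 sample)
rng = np.random.default_rng(0)

def make_sample(n, slope):
    """One regressor plus a constant, with the given slope."""
    x = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    y = 1.0 + slope * x + rng.normal(scale=0.5, size=n)
    return X, y

X_lo, y_lo = make_sample(200, slope=2.0)   # "lower" sub-sample
X_hi, y_hi = make_sample(200, slope=0.5)   # "upper" sub-sample, different slope
f_break, dof = chow_statistic(X_lo, y_lo, X_hi, y_hi)

X_a, y_a = make_sample(200, slope=2.0)     # two sub-samples with the same slope
X_b, y_b = make_sample(200, slope=2.0)
f_same, _ = chow_statistic(X_a, y_a, X_b, y_b)

print(f"with a break:    F{dof} = {f_break:.2f}")
print(f"without a break: F{dof} = {f_same:.2f}")
```

The statistic compares the pooled residual sum of squares with the sum over the two separate regressions, so a large value indicates that allowing different coefficients per sub-sample improves the fit by more than chance would suggest.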

The two-stage least squares results, using the same instruments for start wage as before, are presented as well, but again the null hypothesis of exogeneity is not rejected (see Table 3; tested for lower wages only). So for the remaining regressions we continue with just the OLS method.

Education and cognitive ability effects in the upper ‘class’ are considerably smaller than for the total sample and the impact of gender has shrunk as well, implying a possible break point. Further, what stands out in the regression results is that Locus of Control is significant among the upper wage ‘class’ and of the expected sign, while it is not in the lower ‘class’. Self-esteem is now no longer significant.

Splitting the AFQT level into two samples of equal size gives a Chow-break test statistic of 2.04, which means that the constant parameters hypothesis is rejected (significant at the 1% level). Table 10 shows that educational level and cognitive ability (in OLS) are more influential for those who scored in the lower half of the sample on the AFQT (corrected), but education is still highly significant in the upper ‘class’ and AFQT is not significant in both samples. Being a woman restrains wage growth relative to men more in the upper AFQT level than in the lower one. Residence also seems to have affected those with a lower AFQT score more than those who scored in the upper half of the sample. What is remarkable is the significance and negative sign of the educational level of the mother. No explanation for this result is found here.

24 In particular the split-up of wage level has to be interpreted carefully. The specification of wage level in the methodology is such that it is composed of the same variables as the dependent variable (wage growth), with a slight modification. Endogenous sampling for this variable is potentially the case here, causing a higher chance of a break point.


The Chow-break test statistic for education is 1.27, making it still significant at the 5% level. The educational level split (Table 10) shows a pattern quite similar to that of the AFQT split in terms of the AFQT variable and gender. The education coefficient is however a bit lower and almost equal between the lower educated and higher educated employees. We now see that self-esteem, although its coefficient is approximately the same for lower educated people, is significant in the higher educated sample. Region of residence, in contrast to the split sample of AFQT, has more impact on the higher educated.

The self-esteem level break-up (Table 11) seems to have the least effect on the estimated parameters. This is also reflected by the Chow test statistic of 0.85 (not significant at the 5% level). Self-esteem has a larger effect in the lower part of the sample than among the people with a relatively high self-esteem, and region of residence affects those with a higher self-esteem more.

The last split sample is the break-up into men and women (Table 11). The constant parameters hypothesis is rejected at the 1% level (the Chow test statistic is 1.64). Cognitive ability seems to have slightly more effect on women, while men experience relatively more influence of education. Furthermore we notice that, in line with the literature, being married has a positive (significant) effect on growth of returns among men. Having children has more impact on women’s wage growth than it has on men’s wage growth. Note that we have not controlled for full-time and part-time jobs. The most plausible explanation is that when there are (young) children in the household, women tend to switch to part-time jobs more often and thereby have a higher probability of smaller wage growth.25

Data on attending any kind of training as described in the data description were only available since 1988. OLS results with and without this variable are given in Table 12. Although AFQT was thus far significant in the complete interval and the three separated intervals, it is not for this particular subset. Thus, the cognitive regressor seems rather sensitive to the choice of the sample. The education variable appears to be more robust, being significant in both columns of Table 12. Self-esteem is also no longer significant, although it is of the expected sign and has almost the same coefficient as in the complete interval. For the other two measures of noncognitive ability once more no sign of significance is detected. Gender is again significant. More important is that attending any on-the-job training is significant at the 1% level, which agrees with the findings of Brown (1989). The education coefficient decreases by almost 0.1% after including training, but remains highly significant. A possible explanation is that higher educated employees are more likely to attend some kind of job-related training.

25 Note that in the break-up regressions the R2’s of OLS are more than 0.8. This is the result of including the time-interval dummies that are observed only once, because it would otherwise be impossible to compare the split samples with the complete sample.
