
The Effect of School Type on Higher Education Attainment

Oliver Budd

University of Amsterdam

July, 2017

Abstract

In this analysis the aim is to estimate the causal relationship between the type of secondary school an individual attends (selective vs. non-selective) and the probability that they obtain a degree from a higher education institution. A longitudinal dataset from the 1970 British Cohort Study is used, comprising 7,121 individuals born in England and Wales. Students in the data attended either a selective grammar school or a non-selective comprehensive school. A linear probability model and a probit model with a large set of controls were estimated, and subsequently a probit model with the same controls was used for propensity score matching. While the regression results showed a significant advantage in university chances from attending a selective school, matching results revealed an average treatment effect of selective schooling that was not significantly different from zero.

I would like to thank my supervisor Hessel Oosterbeek for his help in the process of this


Statement of Originality

This document is written by Student Oliver Budd who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Contents

1 Introduction
2 Secondary Education in the United Kingdom
2.1 History of Grammar and Comprehensive Schools
2.2 How Grammars and Comprehensives Operate
2.3 Two Systems One Country
3 Data
3.1 The 1970 British Cohort Study
3.2 Key Variables and Descriptive Statistics
3.3 Missing Observations
4 Empirical Methods
4.1 Linear Probability Model
4.2 Probit Model
4.3 Propensity Score Matching
5 Results
5.1 OLS Results
5.2 Probit Results
5.3 Matching Results
6 Unobservable Selection and Coefficient Stability
7 Results: Russell Group University Access
7.1 OLS and Probit Results
7.2 Matching Results
8 Conclusion
References
Appendix


List of Tables

1 School Type
2 School Type and Higher Education Destinations
3 Split Descriptive Statistics
4 OLS Results
5 Probit Results: Marginal Effects
6 PSM Balancing Table
7 Matching Results
8 Unobservable Selection and Coefficient Stability
9 OLS Results (Russell Group)
10 Probit Results (Russell Group): Marginal Effects
11 Matching Results (Russell Group)
12 Russell Group Universities
13 Sutton Trust 13 Universities

List of Figures

1 PSM Density Graph
2 PSM Histogram
3 PSM Kernel Density Graphs
4 Russell Group Overlap Graph


1 Introduction

Where to send your child for school is a difficult decision for many parents around the world. A parent who is invested in their child’s future will want to make sure they give their child the best possible start in life. While many countries operate a single comprehensive school system, where children of all backgrounds and abilities are taught together, many families in the United Kingdom have the choice between a comprehensive system and a selective one. These selective (grammar) schools require a child to pass an exam to enter, and are often in areas with much higher house prices.1 The decision for many British parents (the vast majority of whom cannot afford private education) then becomes whether to send their children to the local comprehensive school or, for those that can, a selective (and often less conveniently located) grammar school. In today’s world, where most people consider a university degree essential to success in the labour market, access to higher education is at the heart of this choice of school type. Proponents of the grammar school would argue that attending a selective school increases a student’s chances of acceptance not only to higher education in general but to the best universities when their time at school comes to an end. At first thought this might seem like a credible statement, but is that really the case? To illustrate, a report published by the Sutton Trust in 2013 showed what has been known for a long time, that grammar schools are dominated by middle to upper class children from wealthier backgrounds (Cribb et al. 2013). The report stated that “less than 3% of entrants to grammar schools are entitled to free school meals – an important indicator of social deprivation.” It is logical that children from wealthier backgrounds and social classes are more likely to attend university (and elite universities) regardless of the school that they went to. This can be via the mechanism of parental education (Ermisch and Protanzo 2010, Chevalier 2004) or of household wealth itself (Williams 2004, Karagiannaki 2012). This paper therefore aims to address the following question: does a grammar school education really provide better access to higher education?

Answering this question is important for a few different reasons. Firstly, it should inform the government’s stance on grammar schools. It has been well publicized in recent months that the Conservative Party (whose MPs tend to be private and grammar school educated) has planned to expand the number of grammar schools in the UK. The party has argued that grammar schools are better for social mobility whilst giving equal chances for all in the school system. While this paper will focus on higher education rather than the social mobility aspect, it could be argued that attending university is an important mechanism in social mobility. If going to a grammar school provides no advantage in reaching tertiary education, this would make a strong case against the reintroduction of the selective system, and promote the further reduction of the number of grammar schools in the country. This would also be applicable not just to the UK but to other developed countries with selective school systems. Furthermore, from a social perspective, an answer to this question is important because of the aforementioned “decision” that many families make in deciding if they want to try and send their child to a grammar school. For one, the requirement of passing the 11-plus grammar school entrance exam will often mean the need to hire a private tutor.2 This is often not an issue for those parents with grammar school aspirations who can easily afford this (hence the domination of grammar schools by middle and upper class families). For many lower income families however, the hiring of a private tutor is a potentially large financial burden. Seeing as almost all students in the UK end up taking out student loans for university3, if grammar school provides no advantage in being accepted to university, this could lead many families to forgo hiring a private tutor, and some families to never attempt to enter their child into the selective system in the first place. If it can be shown that school type provides no advantage in university attainment, it could influence authorities as well as individual households to have faith in the comprehensive system. This would help to reduce the social division that the selective system breeds, as well as spread resources to improve all schools around the country.4

1 Lloyds Bank published a report in 2016 showing that average property prices were much

Previous empirical work in this area has used similar data and a variety of methods to derive the effects of school selectiveness on dependent variables such as test scores, social mobility and labour market outcomes. The closest study to this specific topic is Boliver and Swift (2011). The authors use data from the 1958 National Child Development Study (NCDS) to investigate whether the introduction of comprehensive schools reduced social mobility in Great Britain. Despite being in the field of sociology rather than economics, the authors use familiar empirical techniques. They first estimate a series of logistic regression models, then perform the same exercise controlling for estimated propensity score. Using whether or not an individual moved up a certain income quartile as dependent variables, they found some small benefits of grammar schools to low income children relative to high income children, but concluded that overall comprehensive schools were as good for social mobility as selective ones.

Using data from the “East Riding database” in the UK, Clark (2009) exploits the 11-plus entrance exam to estimate the effect of attending a selective school on test scores. The author uses a regression discontinuity design, using the jump in probability of treatment at the passing threshold of the exam. Clark finds small effects of selective school attendance on test scores. In a more recent paper, Clark (2015) uses data from the “Aberdeen Children of the 1950s” study to assess longer run outcomes such as completed years of education, fertility, and income. However, Clark’s study only examines the effect of the upper track in the selective system relative to the lower track, rather than comparing two different systems. The empirical strategy used by the author is instrumental variables. He notes that the probability of assignment to an “elite school” (equivalent to grammar school) varies sharply at certain cut-offs. Using dummy variables for these cut-offs as instruments, he estimates two-stage least squares models to identify a causal effect. The author finds that elite school attendance on average leads to one additional year of total education. There was however no statistically significant effect found on labour market outcomes.

2 Currently around 25% of all state educated students between 11 and 16 in England and Wales have received some kind of private tutoring (Kirby 2016).

3 Tuition fee loan take up was an estimated 93% in 2014/15 (Bolton 2016).

4 The government could potentially invest up to 320 million in a grammar school expansion.

Using RD methods similar to Clark (2009), Abdulkadiroglu, Angrist and Pathak (2011) assess the effect of attending an exam school in the US on test scores such as the SAT. The authors use data from Boston and New York schools which require an entrance exam before admission (exam schools). They exploit the cutoff in these exams for a regression discontinuity design. The authors find little to no advantage of an exam school education on test scores, with only small effects on some outcomes, such as the 10th grade ELA score.5

One final example of analyzing grammar schools in the UK is that of Galindo-Rueda and Vignoles (2004). This paper uses the 1958 NCDS to assess whether there is any advantage in test scores between grammar and comprehensive school students. The empirical strategy employed is to use the political affiliation of the LEA (local education authority) that a child’s school is in. The authors argue that political affiliation/voting history influenced the rate at which schools moved towards the comprehensive system. Matching methods were also employed.6 Galindo-Rueda and Vignoles find that students at the higher end of the ability/test score spectrum did better in grammar schools than comprehensive schools, but found no evidence of any difference for mid to low ability students.

While previous studies have focused mainly on short run effects such as test scores and longer run effects such as social mobility and wages, I have been unable to find any empirical studies that focus specifically on higher education attainment (which is an important mechanism in the path to higher wages), and only a few that have included this as an addition to other outcomes (Clark 2009, Clark 2015).7 This analysis would be an interesting contribution to the literature, because I believe university education to be far more important to an individual’s future outcomes than the school they attended when they were a teenager. Most students are far more engaged from an academic perspective once they reach university, and future employers will be more concerned with performance in university than with what an individual has accomplished in high school. University is a more important driver for social mobility and labour market outcomes, especially now when more people in the UK are attending university than ever before. It can be said that ascertaining the effect of type of school attended on higher education access adds an important dimension in understanding the possible benefits (or detriments) of a selective school system. Additionally, much of the previous literature (including publications discussed above) has used older data, most commonly the 1958 National Child Development Study. In this analysis newer data from 1970 is used, which could provide better insights into the school systems today. A notable change over this period is the sharp decline in the number of grammar schools in the UK. Just 163 exist in England today, compared to over 1,200 in the 1960s. This nationwide transition from a selective to a comprehensive system meant that many secondary modern schools (the lower track in the selective system) were rebranded as new comprehensive schools, with no actual changes being made. The older data therefore may not truly reflect the comprehensive system at its full capacity. The new data from 1970 could provide a more accurate outlook on the comprehensive school system, and a fairer comparison between comprehensive and grammar schools.

5 ELA stands for the English Language Arts test administered in New York.

6 The Labour party has been strongly in favour of the removal of the grammar system, while most of the grammar schools left today are in Conservative dominated areas.

7 Note: Clark 2015 analyses total years of education as opposed to university attainment, and uses data from Scotland, which has a slightly different school system to England and Wales (the two countries used in this analysis).

The main findings of this analysis were that ordinary least squares and probit estimates showed a significant advantage of the grammar school in obtaining a university degree. This result was contradicted however by propensity score matching estimates, which showed no significant difference between the two types of schools. Additionally, a measure of “elite” universities is investigated in section 7, to see whether grammar schools provide better access to top universities in the UK. This analysis yielded similar results to that of the main outcome. The rest of the thesis is structured as follows: section 2 explains the context of the education system in the United Kingdom, section 3 describes the data used, section 4 describes the empirical methods, section 5 presents the results, section 6 presents a heuristic for robustness to omitted variable bias, section 7 presents brief results for “elite” university access, and section 8 concludes.

2 Secondary Education in the United Kingdom

2.1 History of Grammar and Comprehensive Schools

The setting of this research is the state funded education system of the United Kingdom around the 1980s and 1990s. Seeing as the cohort in my dataset were all born in 1970, most of them would be attending secondary school around 1981, and those that attended university would be doing so around the late 80s to early 90s. The 1944 Education Act in Britain introduced the modern concept of the grammar school. A so-called “tripartite” system was implemented. This consisted of grammar schools, secondary modern schools and secondary technical schools. The grammar school was the more academically focused “upper track”, while the secondary modern school and secondary technical school were the so-called “lower tracks” for less academically gifted students. Local Education Authorities (LEAs) could submit their own proposals for the reorganization of secondary schooling in their area, within the framework of the Education Act (Butler 1944). This decentralized nature of the education system allowed individual LEAs to make their own decisions about their schools. Ultimately, this led to the coexistence of comprehensive and selective systems around the UK, although few comprehensive schools were initially founded (Pischke and Manning 2006).

By the 1950s and 60s, the previous consensus about the benefits of the selective system had begun to falter. In 1965, Circular 10/65, a new Department of Education policy regarding the reorganization of the schooling system, declared the government’s plans to end the selective system. It ordered LEAs to start phasing out the grammar school system and replace it with comprehensive schools, citing social division and unfair advantages to better off children as the main flaws of the incumbent system. This policy led to a steep decline in the number of grammar schools in the UK. The comprehensive system, which started as a small experiment of the 1944 Education Act in a few areas, was starting to become the dominant system in the UK. By the late 1970s, the time period the data used in this study reflects, only a small fraction of British students were in grammar schools (Bolton 2016).

2.2 How Grammars and Comprehensives Operate

Grammar schools, just like comprehensive schools, are state funded and run, meaning in theory there should be no extra school fees or associated costs if a child is accepted to a grammar school. The main argument of proponents of the grammar school is social mobility. These schools should theoretically allow intelligent, less privileged students to succeed through a better academic program with brighter peers. At the core of the grammar school concept is how students are selected for eligibility through the 11-plus exam. The 11-plus exam was taken by some students in their last year of primary school, usually at the age of 10.8 The test is designed to see if a student has the ability to succeed in a grammar school environment. The test, which can vary in content depending on where a student lives, commonly comprises up to four sections: verbal reasoning, non-verbal reasoning, maths and English. The test has been heavily criticized, mostly for giving tutored and wealthier students an unfair advantage. Acceptance rates after taking the test vary by area, and some grammar schools have far more applicants than others. Grammar schools in the areas of Kingston and Sutton, for example, have acceptance rates of around 3%, while some areas that have fully maintained the selective system have acceptance rates of around 30% (Eleven Plus Exams 2017). Grammar schools claim to offer better opportunities for the brightest students, as well as a better academic program, often citing their strong exam results and university destinations of their graduates.9 A goal of this paper is to see how much of this success can be attributed to the schools rather than the predetermined characteristics of the students themselves.

8 While it is still used in some schools today, most current grammar schools have their own

Comprehensive schools on the other hand are more straightforward. Students apply for a place at a number of comprehensive schools in their area based on personal preference, and are then allotted a place based on proximity to the school and demand. Students are admitted to comprehensive schools regardless of academic ability or other possible factors. This is the schooling arrangement for the vast majority of school children in the UK as well as in the dataset used in this study, which can be seen from table 1 in section 3.

2.3 Two Systems One Country

The power that LEAs have over the way they run schools in the UK meant that the government could not get rid of grammar schools altogether. This has led to an interesting mix of the two systems that is closely related to the country’s two main political parties. The more left-wing Labour party has traditionally been more in favour of equal education for all - it was a Labour government that issued the decree to end the selective system. In areas that tend to vote Labour, the comprehensive school was introduced much more quickly than in Conservative-voting areas, many of which host today’s remaining grammar schools. All of this is interesting from a research point of view, both in terms of empirical methods (Galindo-Rueda and Vignoles 2004) and in terms of the data that one can use. Using data from the UK, a researcher can analyze the difference between two systems that are operating at the exact same time in the exact same political, economic and social environment. If this were not the case, one might have to compare data from two separate time periods or two separate countries, which could call the validity of the results into question. More specifically, the dataset allows for the analysis of British students that are applying to the same universities in the same time frame.

3 Data

3.1 The 1970 British Cohort Study

The data used in this research is taken from the 1970 British Cohort Study (BCS70). The study was organized and conducted by the Centre for Longitudinal Studies, and is the successor to the 1958 National Child Development Study (NCDS) that many related papers have used. The initial plan was to use both datasets and compare the results, but information on higher education attendance was unavailable in the NCDS. The BCS70 is a longitudinal study, meaning that the data is panel data. The study collects information on a variety of different characteristics in several waves. Surveys were conducted at birth and ages 5, 10, 16, 26, 30, 34, 38 and 42. For the purposes of this paper however the waves of interest are birth, age 5, age 10, age 16 and age 42.10 The study aims to follow the lives of 17,000 individuals all born in the same week in 1970 in England, Scotland and Wales. For a variety of reasons, such as widespread industrial action in the 1970s protesting government educational reforms, the response rate in later waves was significantly reduced (Dodgeon et al. 1992).11 Students that attended school in Scotland were also removed, because Scotland operates a slightly different education system, with different school leaving exams (Pischke and Manning 2006). Once this was accounted for, along with the removal of all individuals that did not attend a grammar or comprehensive school (i.e. private, special needs and secondary modern school students), the final sample in the dataset consists of 7,121 respondents.

9 Grammar schools consistently outperform comprehensives overall in end of year exam

3.2 Key Variables and Descriptive Statistics

The BCS70 contains a vast amount of data on a variety of characteristics in areas such as health and education as well as social and economic status. Over 15,000 variables were narrowed down to a selection that was assessed as covering the important areas that influence a child’s school and university destinations.

Table 1: School Type (N = 7,121)

School type      N        Frequency
Grammar          508      0.071
Comprehensive    6,613    0.929

Table 1 shows simple descriptives on the school type split in the data. Of the 7,121 students in the dataset, 93%, or 6,613, attended a comprehensive school and the remaining 7% attended a grammar school. As previously stated, the vast majority of state educated children in the UK attend comprehensive schools.

Table 2: School Type and Higher Education Destinations (N = 7,121)

                 University       Russell Group    Sutton Trust 13  Oxbridge
School type      Mean    N        Mean    N        Mean    N        Mean    N
Grammar          .380    193      .100    51       .065    33       .020    10
Comprehensive    .202    1134     .041    272      .019    124      .003    19

Table 2 shows the higher education destinations for the students in the dataset, separated by school type. Four outcomes for higher education were selected: University, Russell Group, Sutton Trust 13 and Oxbridge. University is a variable describing whether or not an individual had obtained a university degree. Russell Group and Sutton Trust 13 are two variables describing whether an individual obtained a degree from a top university. The Russell Group is an association of 24 UK universities that are considered the best institutions in the country. The Sutton Trust 13 is a similar, somewhat overlapping, group of 13 institutions in the UK. Oxbridge is a variable for students that obtained a degree from either Oxford or Cambridge, widely regarded as the two most prestigious universities in the UK. Both Oxford and Cambridge are members of the Russell and Sutton Trust groups.12 In the dataset, 38% of grammar school students obtained a university degree, compared to around 20% for comprehensive school students. 10% and 6.5% of grammar school students obtained a Russell Group and a Sutton Trust degree respectively, while 4.1% and 1.9% of comprehensive school students obtained degrees from universities in those groups. Lastly, 2% of grammar school students obtained a degree from either Oxford or Cambridge, compared to 0.3% of comprehensive school students. The original plan of this paper was to see whether grammar schools provided better access to top universities, but it can be seen in the table that the outcomes describing top institutions suffer from a low number of observations. For example, only 2% of the students in the dataset have a degree from a Sutton Trust university, and only 29 out of 7,121 have a degree from either Oxford or Cambridge. Estimates for only one of these outcomes (Russell Group) are therefore presented, in a separate section from the main results. A glance at table 2 would suggest an advantage in higher education chances for grammar school students; however, table 3 shows that there are many other differing characteristics of the individuals in the data.

10 The university from which individuals had obtained a degree was recorded in the sweep at age 42.

Table 3 shows descriptive statistics, split by school type. Panel A, which covers variables recorded at birth, includes gender, the age of the individual’s mother at time of birth, the marital status of the individual’s parents, as well as employment and social class dummy variables. The social class variables are the Registrar-General’s Social Classes, a scale based on occupation (Bland 1979). The variables record whichever of the individual’s parents was in the highest social class. The most prevalent social class in the data is class III - manual. Around 43% of parents are in this group. Panel B includes the person per room ratio (a common indicator of household wealth), dummy variables for the highest qualifications of the individual’s parents, as well as several standardized test scores including an English picture vocabulary test (EPVT), and the child’s estimated reading age. Around 38% of parents have no qualifications, while only around 14% have a degree or higher. Finally, Panel C shows a maths test score, a vocabulary test score and estimated reading age at age 10.

The last column displays the two-sided p-value of the t-test for the means of the two groups. In Panel A, we can see that grammar school students have significantly older mothers, as well as parents who left school at an older age. Unsurprisingly, there is a large social class disparity between students of the two different schools. Grammar school students are significantly more likely to have parents in the top two social classes, while comprehensive school families are significantly more likely to be in the bottom three social classes. This strongly confirms the dominance of grammar schools by middle and upper class students discussed previously. None of the other variables in Panel A have significantly different means. At age 5, grammar school students had significantly fewer persons in their household, and consequently lower person per room ratios. Looking at parental qualifications, comprehensive school students are far more likely to have parents with no qualifications, while grammar school students are in turn far more likely to have parents with a university degree or higher. Test scores displayed in Panels B and C show that grammar school students (before they entered grammar school) performed significantly better in every test, and had significantly higher reading ages at both age 5 and age 10. These significant differences in test scores are driven not only by individual ability, but are arguably also a product of the social classes and academic qualifications of the students’ parents. Grammar school students are markedly better off than their comprehensive counterparts before they are even put into one of the two school types, which makes the true advantage (or lack of advantage) of the grammar school an interesting question to answer.
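The comparison of group means reported in the last column of table 3 can be approximated from the reported means and standard deviations alone. The sketch below is a hedged illustration rather than the exact test behind the table: it assumes an unequal-variance (Welch) statistic, a large-sample normal approximation for the p-value, and group sizes equal to the full group counts in table 1 (508 and 6,613). The number of non-missing observations per variable is not reported, so the result need not reproduce the tabulated p-value. The function name is hypothetical.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic (unequal variances) from summary statistics,
    with a two-sided p-value based on the standard normal approximation."""
    # Standard error of the difference in means.
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t_stat = (mean1 - mean2) / se
    # P(|Z| > |t|) for a standard normal Z equals erfc(|t| / sqrt(2)).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return t_stat, p_value


# Mother's age at birth: grammar vs. comprehensive (means and SDs from table 3;
# group sizes are ASSUMED to be the full group counts from table 1).
t_stat, p_value = welch_t_from_summary(26.55, 5.321, 508, 25.782, 5.326, 6613)
print(f"t = {t_stat:.2f}, two-sided p = {p_value:.4f}")
```

With samples this large the normal approximation and the exact t distribution give very similar p-values, which is why the simpler form is used here.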

3.3 Missing Observations

As discussed in section 3.1, the BCS70 has its fair share of issues surrounding missing observations in the data. In order to maximize the number of individuals that could be used for analysis, explanatory variables that had observations missing were assigned the sample mean of the variable, a common method in empirical literature. Dummy variables were created for each variable that had observations missing. The dummy is made equal to one when an observation is missing, and zero otherwise. These dummy variables were subsequently used in all regressions that their corresponding explanatory variable was included in.
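The imputation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the thesis: missing values are represented as `None`, the function name `impute_with_indicator` is hypothetical, and the example column is invented rather than taken from the BCS70.

```python
from statistics import mean


def impute_with_indicator(values):
    """Replace missing entries (None) with the mean of the observed values
    and return a 0/1 dummy flagging which entries were imputed."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a variable with no observed values")
    fill = mean(observed)
    imputed = [fill if v is None else v for v in values]
    missing_dummy = [1 if v is None else 0 for v in values]
    return imputed, missing_dummy


# Hypothetical "mother's age" column with two missing entries.
mother_age = [24, None, 31, 27, None, 22]
age_filled, age_missing = impute_with_indicator(mother_age)
print(age_filled)   # imputed column, to be used as a regressor
print(age_missing)  # missing-value dummy, included alongside it
```

Including the dummy next to the imputed variable lets the regression absorb any systematic difference in outcomes between individuals with and without a recorded value.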


Table 3: Split Descriptive Statistics N=7121 Grammar Comprehensive P-Value Mean (SD) Mean (SD) Panel A: Birth Female 0.521 (0.500) 0.524 (0.499) 0.911 Twin 0.015 (0.121) 0.016 (0.127) 0.836 Mother’s age 26.55 (5.321) 25.782 (5.326) 0.003 Birth weight 3346.332 (495.339) 3304.769 (521.182) 0.098 Parents married 0.866 (0.341) 0.878 (0.327) 0.419 Parents divorced/separated 0.016 (0.125) 0.014 (0.12) 0.825 Mother a widow 0.002 (0.044) 0.0007 (0.027) 0.364 Single mother 0.033 (0.18) 0.039 (0.194) 0.529 Born in a Hospital 0.785 (0.411) 0.781 (0.414) 0.805 Father employed 0.975 (0.157) 0.97 (0.17) 0.589 Mother employed 0.053 (0.225) 0.054 (0.226) 0.954

Age father left school 17.359 (7.13) 15.955 (3.876) 0.000

Age mother left school 16.74 (6.87) 15.622 (2.582) 0.000

Social class of parents

I – Professional 0.108 (0.311) 0.039 (0.193) 0.000

II – Managerial and technical 0.209 (0.407) 0.123 (0.329) 0.000

III – Non-manual 0.157 (0.365) 0.137 (0.344) 0.194

III – Manual 0.313 (0.464) 0.433 (0.495) 0.000

IV – Partly-skilled 0.11 (0.313) 0.149 (0.356) 0.017

V – Unskilled 0.02 (0.139) 0.05 (0.218) 0.002

Panel B: Age 5

Person per room ratio 0.81 (0.235) 0.874 (0.277) 0.000

Number of persons in household 4.467 (0.951) 4.57 (.1.116) 0.049

Parents qualifications No qualifications 0.271 (0.381) 0.384 (0.438) 0.000 Vocational qualifications 0.114 (0.275) 0.135 (0.308) 0.142 O-Level or equivalent 0.23 (0.372) 0.23 (0.379) 0.989 A-Level or equivalent 0.101 (0.272) 0.082 (0.246) 0.087 SRN 0.009 (0.077) 0.017 (0.117) 0.156 Certificate of Education 0.026 (0.145) 0.018 (0.121) 0.188 Degree or higher 0.248 (0.399) 0.134 (0.305) 0.000 Test scores

Standardised EPVT score 0.057 (1.193) -0.196 (1.277) 0.000

Standardised human figure drawing score 0.223 (0.948) -0.02 (1.025) 0.000

Standardised copy designs score 0.422 (0.857) 0.081 (0.933) 0.000

Estimated reading age (at age 5) 5.23 (0.636) 5.089 (0.762) 0.000

Panel C: Age 10 (test scores)

Standardised vocabulary test score 0.461 (0.372) 0.124 (0.757) 0.000

Estimated reading age (at age 10) 10.699 (1.168) 10.158 (1.204) 0.000


4

Empirical Methods

For this analysis, the empirical strategy proceeds as follows. First, a linear probability model (LPM) is estimated, followed by a probit model with the same specification. Using the same set of covariates in a probit specification, propensity score matching is then used to match similar individuals based on their probability of being treated.

4.1

Linear Probability Model

Suppose that the true relationship between obtaining a university degree and having attended a grammar school is:

$$y_i = \beta_0 + \beta_1 \text{Grammar}_i + \varepsilon_i \qquad (1)$$

Here the outcome y_i is a dummy variable indicating whether individual i has obtained a university degree, Grammar_i is a dummy variable indicating whether individual i attended a grammar school, and the error term ε_i includes a number of omitted variables and some random error. A simple regression of equation (1) can be expected to produce heavily upward-biased estimates, because the error term ε_i contains factors that are positively correlated with both the outcome y_i and the treatment variable Grammar_i. Family characteristics such as parents’ education and socioeconomic status, as well as individual characteristics such as ability and motivation, are positively related not only to the probability of grammar school attendance but also to the probability of attending university. We therefore estimate equation (2), where X_i is a vector of control variables:

$$y_i = \beta_0 + \beta_1 \text{Grammar}_i + X_i'\beta_2 + \varepsilon_i \qquad (2)$$

The most important variables to include in X_i are those that influence both the dependent variable and the treatment variable. The identifying assumption of this model is therefore that, conditional on X_i, grammar and comprehensive school students are comparable, so that the coefficient on Grammar_i can be given a causal interpretation. The variables in X_i are household characteristics such as social class, parents’ education and household wealth, as well as the significant differences between comprehensive and grammar school students identified in table 3. The large amount of data in the BCS70 allows many such controls to be included.

Arguably the most important confounder in any research attempting to identify the causal effect of an education variable is individual ability (Griliches 1977). If grammar school students were on average more able before they entered grammar school (a reasonable claim considering the entrance exam requirements of grammar schools), the estimates will be subject to a potentially very strong upward bias. Previous literature has attempted to control for ability in a variety of ways (Card 1999, Nordin 2005). An obvious but often criticized approach is to use test scores as a proxy for individual ability (Grove, Wasserman and Grodner 2006). Panels B and C in table 3 show a clear difference in test scores between grammar school and comprehensive school students across all areas. While controlling for these scores could remove much of this bias, the identifying assumption of this model remains very difficult to defend. For this reason, further methods are used after the linear probability model.
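The upward bias discussed in this section can be illustrated with a small simulation. The sketch below is not based on the BCS70: the data-generating process, the variable names and all parameter values are assumptions chosen for illustration. Ability raises both the chance of grammar school attendance and the chance of a degree, while the true effect of grammar school is set to zero. The controlled coefficient is obtained by partialling ability out of both variables, which is equivalent to including it as a regressor.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals from regressing y on x (with an intercept)."""
    b = slope(x, y)
    a = sum(y) / len(y) - b * sum(x) / len(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20000
ability = [rng.gauss(0, 1) for _ in range(n)]
# Selection: more able pupils are more likely to attend grammar school.
grammar = [1 if a + rng.gauss(0, 1) > 1.5 else 0 for a in ability]
# The outcome depends on ability only, so the true effect of grammar is zero.
degree = [1 if rng.random() < min(max(0.25 + 0.1 * a, 0.0), 1.0) else 0
          for a in ability]

# Analogue of equation (1): no controls.
naive = slope(grammar, degree)
# Analogue of equation (2) with ability as the only control.
controlled = slope(residuals(ability, grammar), residuals(ability, degree))
```

In this setup `naive` is clearly positive even though grammar school has no effect, while `controlled` is close to zero. With real data ability is not observed directly, which is why the identifying assumption is hard to defend.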

4.2

Probit Model

The main difference between the LPM and the probit model is that the probit model passes the linear index through the cumulative distribution function of the standard normal distribution, so that predicted probabilities always lie between zero and one. There are a few main arguments in favour of using a probit model rather than a linear probability model. The most important in this context is the possible bias that can arise when using the LPM: previous literature has shown that LPM estimates can be biased and inconsistent (Amemiya 1977, Horrace and Oaxaca 2006). The two models could therefore provide different results, although this is not anticipated.

In much the same way as the linear probability model, a probit model will also be estimated with the same vector of controls Xi.

The probit model is estimated by:

$$P(Y_i = 1 \mid \text{Grammar}_i, X_i) = \Phi(\beta_0 + \beta_1 \text{Grammar}_i + X_i'\beta_2) \qquad (3)$$

Here the left-hand side is the probability that outcome Y_i equals one, conditional on the treatment variable Grammar_i and the controls, Φ is the cumulative distribution function of the standard normal distribution, and X_i is the same vector of controls as in equation (2) (Bliss 1934, Fisher 1935). Although the potential issues with OLS regression of a binary dependent variable are well documented (Aldrich and Nelson 1984), a large enough number of individuals from each school type attend university that the estimates from the two models are expected to be very similar. To allow the probit estimates to be compared with those of the LPM, marginal effects are reported.
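Because the probit coefficient on the grammar school dummy is not itself a change in probability, it has to be converted into a marginal effect. One common way to do this for a dummy variable is to average, over individuals, the change in predicted probability when the dummy moves from 0 to 1. The sketch below illustrates that calculation; the index values and the coefficient of 0.2 are hypothetical, and the source does not state which marginal effect definition was used.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def grammar_marginal_effect(index_without_grammar, beta_grammar):
    """Average change in P(Y=1) when the Grammar dummy moves from 0 to 1.

    index_without_grammar holds, for each individual, the linear index
    excluding the Grammar term.
    """
    diffs = [std_normal_cdf(xb + beta_grammar) - std_normal_cdf(xb)
             for xb in index_without_grammar]
    return sum(diffs) / len(diffs)

# Hypothetical index values and probit coefficient, for illustration only.
index_values = [-1.5, -1.0, -0.5, 0.0]
effect = grammar_marginal_effect(index_values, 0.2)
```

The resulting `effect` is well below the coefficient of 0.2, which shows why raw probit coefficients cannot be compared directly with LPM coefficients.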

4.3

Propensity Score Matching

Using a probit model with the same vector of regressors X_i, the technique of propensity score matching can be implemented. To understand this method, consider the impact of grammar school attendance on outcome Y_i (obtaining a university degree) for individual i:

$$\delta_i = Y_{1i} - Y_{0i} \qquad (4)$$

The treatment impact δ_i is the difference between the potential outcome from attending grammar school, Y_1i, and the potential outcome from attending comprehensive school, Y_0i. The difficulty is that one of these outcomes is always counterfactual: Y_0i, for example, is unobservable for all grammar school students. To illustrate this, consider the average treatment effect on the treated (ATET):

$$ATET = E(Y_1 \mid \text{Grammar} = 1) - E(Y_0 \mid \text{Grammar} = 1) \qquad (5)$$

As previously stated, the second term in the equation is unobserved. What is observed, however, is the corresponding term for untreated individuals (comprehensive school students), E(Y_0 | Grammar = 0). The observed difference in mean outcomes between the two groups can therefore be computed, and it decomposes as:

$$E(Y_1 \mid \text{Grammar} = 1) - E(Y_0 \mid \text{Grammar} = 0) = ATET + SB \qquad (6)$$

The final term, SB = E(Y_0 | Grammar = 1) − E(Y_0 | Grammar = 0), is the selection bias term. For this difference to estimate a treatment effect successfully, the selection bias term must be zero. In non-experimental data such as the BCS70, selection bias is an issue. In the grammar school example, selection bias arises from the previously discussed characteristics, such as family background and ability, that make some students more likely to attend grammar school than others.
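The decomposition in equation (6) can be checked on a toy example in which both potential outcomes are known for everyone, something that is never possible with real data. The six individuals below are entirely hypothetical.

```python
# Each tuple is (grammar, y0, y1): treatment status and both potential outcomes.
people = [
    (1, 1, 1),
    (1, 0, 1),
    (0, 0, 0),
    (0, 0, 1),
    (0, 1, 1),
    (0, 0, 0),
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Average treatment effect on the treated, equation (5).
atet = mean(y1 - y0 for g, y0, y1 in people if g == 1)

# Observed difference in means: left-hand side of equation (6).
naive = (mean(y1 for g, y0, y1 in people if g == 1)
         - mean(y0 for g, y0, y1 in people if g == 0))

# Selection bias: difference in untreated outcomes between the two groups.
selection_bias = (mean(y0 for g, y0, y1 in people if g == 1)
                  - mean(y0 for g, y0, y1 in people if g == 0))
```

Here the observed difference is 0.75, of which 0.5 is the ATET and 0.25 is selection bias, because the treated group would have done better than the untreated group even without treatment.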

A method that attempts to circumvent this selection bias problem is propensity score matching (PSM). The main idea of PSM is to find a comparable group of non-treated individuals (comprehensive school students) whose pre-treatment characteristics are similar to those of their treated counterparts. Rosenbaum and Rubin (1983) suggest the use of propensity scores:

$$\rho(X_i) = \Pr(\text{Grammar} = 1 \mid X_i) \qquad (7)$$

The propensity score ρ(X_i) is the probability of individual i being treated (attending grammar school) conditional on a set of observable characteristics X_i. The steps for using these scores for matching are as follows:[13]

1. Estimation of propensity scores
2. Selection of a matching algorithm
3. Common support
4. Estimation of treatment effect

1. To estimate propensity scores, one must decide on a model and on the set of variables on which the scores are conditioned. Caliendo and Kopeinig note that the model choice (between logit and probit) is not too critical in the case of a binary treatment variable. Hence a probit model is chosen and estimated as explained in section 4.2. More important, however, is the choice of variables. This is closely related to the conditional independence assumption (CIA) of matching, which requires that the potential outcomes be independent of treatment participation given the set of variables X_i (or conditional on the propensity score). That is to say, participation is as good as random conditional on the covariates. This is the identifying assumption of the PSM method. As mentioned in section 4.1, the large amount of data available in the BCS70 allows a rich set of variables to be included in this set. These variables are the same controls used in estimating the linear probability and probit models.

[13] The following draws heavily on Caliendo and Kopeinig (2005), which sets out a guide for

2. The next step in PSM is the choice of matching algorithm, meaning the specific method used to match individuals based on their propensity scores. The most common and straightforward matching algorithm is nearest neighbour matching, which matches each treated individual to the individual in the comparison group with the closest propensity score. Because comprehensive school students greatly outnumber grammar school students (approximately a 13:1 ratio), I match three untreated (comprehensive) observations to each treated observation, thereby taking advantage of as much of the data as possible. This oversampling also increases the efficiency of the estimates by lowering their variance (Smith 1997).
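A minimal sketch of three-nearest-neighbour matching on the propensity score is given below. The propensity scores are hypothetical, and the sketch assumes matching with replacement (a comprehensive student may serve as a match for more than one grammar student), which the text does not specify.

```python
def nearest_neighbours(treated_scores, control_scores, k=3):
    """For each treated score, return the indices of the k closest controls.

    Matching is done with replacement: the same control can be returned
    for several treated individuals.
    """
    matches = []
    for p in treated_scores:
        ranked = sorted(range(len(control_scores)),
                        key=lambda j: abs(control_scores[j] - p))
        matches.append(ranked[:k])
    return matches

# Hypothetical propensity scores for two grammar and seven comprehensive students.
treated = [0.30, 0.10]
controls = [0.05, 0.09, 0.12, 0.28, 0.31, 0.35, 0.60]
matches = nearest_neighbours(treated, controls, k=3)
```

The first grammar student is matched to the controls with scores 0.28, 0.31 and 0.35, and the second to those with scores 0.05, 0.09 and 0.12. The control with a score of 0.60 is never used.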

3. The common support assumption is the second key assumption of PSM:

$$0 < P(D = 1 \mid X) < 1 \qquad (8)$$

Here D is the treatment indicator (D = 1 for grammar school attendance). The assumption states that for each value of X, there is a positive probability of being treated and of being untreated. This ensures that there is enough overlap (or common support) in characteristics between treated and untreated individuals to find satisfactory matches. To check this assumption, one can visually inspect the distributions of propensity scores of the two groups (Lechner 2000b). If the two assumptions (common support and CIA) are satisfied, Rosenbaum and Rubin state that treatment assignment is said to be "strongly ignorable".

Figures 1 and 2 (on page 17) show two graphs used to assess the common support assumption. Figure 1 shows the propensity score distributions for grammar and comprehensive school students as density plots, while figure 2 shows the same distributions in a histogram. The graphs show that there is ample overlap between the two groups, meaning that suitable matches can be made.
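The analysis assesses common support visually. As a complement, a simple numerical check compares the minimum and maximum propensity scores of the two groups and flags observations outside the overlapping range. The sketch below illustrates that alternative with hypothetical scores; it is not the procedure used in this analysis.

```python
def common_support(treated_scores, control_scores):
    """Return the (low, high) range of scores covered by both groups."""
    low = max(min(treated_scores), min(control_scores))
    high = min(max(treated_scores), max(control_scores))
    return low, high

def on_support(score, bounds):
    """True if the score lies inside the common support range."""
    low, high = bounds
    return low <= score <= high

# Hypothetical propensity scores.
treated = [0.10, 0.30, 0.55]
controls = [0.02, 0.08, 0.25, 0.40]

bounds = common_support(treated, controls)
treated_off_support = [p for p in treated if not on_support(p, bounds)]
```

In this example the common range is 0.10 to 0.40, so the treated individual with a score of 0.55 has no comparable control and would be dropped under this rule.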

4. The final step in PSM is the estimation of a treatment effect. This method produces the ATE as well as the ATET. The ATE is computed by taking the average difference between the observed outcome and the estimated counterfactual outcome for each individual, while the ATET does the same only for treated individuals. Both of these treatment effects are reported in the results section; however, the ATE is the more important indicator, because the question this analysis is trying to answer is whether or not grammar schools provide an overall advantage in higher education access. If one only takes into account the effect on those who are treated (a small percentage of the population in this case), this question might not be properly answered. It is therefore more important to know the effect of grammar schooling across the whole population in order to conclude whether or not it is beneficial.
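The difference between the ATE and the ATET can be made concrete with a short sketch. Each individual has an observed outcome and an imputed counterfactual, here assumed to be the average outcome of their matched neighbours from the other group. All values are hypothetical.

```python
def treatment_effects(units):
    """Compute (ATE, ATET) from matched data.

    units is a list of (treated, observed_outcome, imputed_counterfactual).
    """
    effects = []
    for treated, observed, counterfactual in units:
        # Treated: observed Y1 minus imputed Y0.
        # Untreated: imputed Y1 minus observed Y0.
        if treated:
            effect = observed - counterfactual
        else:
            effect = counterfactual - observed
        effects.append((treated, effect))
    ate = sum(e for _, e in effects) / len(effects)
    treated_effects = [e for t, e in effects if t]
    atet = sum(treated_effects) / len(treated_effects)
    return ate, atet

# Two hypothetical grammar students and six hypothetical comprehensive students.
units = [
    (1, 1, 0.5), (1, 1, 1.0),
    (0, 0, 0.5), (0, 0, 0.0), (0, 1, 1.0),
    (0, 0, 0.0), (0, 0, 0.0), (0, 1, 1.0),
]
ate, atet = treatment_effects(units)
```

In this toy example the ATET is 0.25 while the ATE is 0.125, showing that the two can differ when the treated group is small and benefits more from treatment than the rest of the population would.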

Figure 1: PSM Density Graph


5

Results

5.1

OLS Results

The table below shows the linear probability model results for the outcome of attending university. The outcome variable is a binary variable taking the value of 1 if the individual has obtained a university degree. Column one displays results for the most basic regression of the outcome variable on the treatment variable of attending a grammar school. The coefficient of 0.178 can be interpreted as grammar school attendance increasing an individual’s probability of obtaining a university degree by 17.8 percentage points. The coefficient is significant at the 1% level. For the reasons explained in section 4.1, namely omitted variable bias, this estimate is likely to be heavily upward biased. Column two includes a set of covariates recorded at birth in 1970: a dummy variable for being female, a dummy indicating whether or not the individual’s parents were married, the age of the individual’s mother at birth, and a set of dummy variables for the social class of the individual’s parents. Also included are the dummies indicating missing observations, as explained in section 3.3.

Table 4: OLS Results

University Attendance

(1) (2) (3) (4)

Grammar 0.178*** 0.131*** 0.098*** 0.072***

(0.022) (0.021) (0.021) (0.021)

Birth Covariates No Yes Yes Yes

Age 5 Covariates No No Yes Yes

Age 10 Covariates No No No Yes

Observations 7,121 7,121 7,121 7,121

R-squared 0.012 0.069 0.127 0.162

Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1

The coefficient in column two drops to 0.131 and remains highly significant. The change between columns one and two is the largest change in coefficient size in table 4. It was mainly caused by the inclusion of the social class variables and the mother’s age variable, all of which are highly significant. This is to be expected, as social class is a key issue in the grammar school debate: it heavily influences grammar school attendance, and seems to influence university attendance in the same way.

The inclusion of age 5 covariates in column three reduces the coefficient of interest by a somewhat smaller amount (0.033). The covariates added in this column are the person per room ratio of the individual’s household, a set of dummy variables describing the highest qualification of the individual’s parents (see panel B of table 3), estimated reading age at age 5, and one of the cognitive test scores described in table 3. This test score, the "standardised copy designs score", is aimed at capturing cognitive ability. It was the only age 5 test score that significantly impacted the coefficient, and the result is robust to the inclusion of all three age 5 test scores together. Most of the change in the coefficient can be attributed to the person per room ratio of the household, parental education, and the test score. Estimated reading age at age 5 was significant but had a negligible effect on the treatment coefficient. The coefficient of 0.098 remains significant at the 1% level, despite the inclusion of further key determinants of grammar school attendance, such as parental education and age 5 cognitive ability.

The last column presents the results for the regression with two extra controls measured at age 10. These controls are the individual’s estimated reading age at age 10 and a maths test score. One would expect these two covariates (particularly the maths score) to have a significant impact on the coefficient.

Column four shows a significant decrease in the magnitude of the coefficient.[14] As expected, this change was mostly driven by the inclusion of the age 10 maths test score, which would be expected to be a good predictor of both grammar school attendance and university attendance. While this is a generalization that will not hold in every case, strong mathematical ability is advantageous in both pursuits (in the 11-plus exam, for example). Despite the inclusion of all these controls, the coefficient in column four remains significant at the 1% level. It can be interpreted as grammar school attendance increasing the probability of obtaining a university degree by 7.2 percentage points, all else held constant. This is a surprising result, considering that much previous literature has found only a small difference, if any, between the two school types. It would have been expected that most, if not all, of the effect shown in column one would be absorbed by the covariates added in each subsequent column, and that previously discussed factors such as family background and test scores would make the effect in column four disappear completely. Not only the significance but also the magnitude of the coefficient is surprising. The result suggests that students with the same social background, parental education and cognitive ability (proxied by test scores) would receive a noteworthy benefit from attending grammar school. The overall result is, however, not particularly convincing, as differences between the two groups of students that could not be included in the regression may have led to an upward bias (see section 4.1). For this reason, a probit model is estimated to be used for propensity score matching.


Table 5: Probit Results: Marginal Effects University Attendance

(1) (2) (3) (4)

Grammar 0.178*** 0.126*** 0.088*** 0.051***

(0.022) (0.021) (0.02) (0.019)

Birth Covariates No Yes Yes Yes

Age 5 Covariates No No Yes Yes

Age 10 Covariates No No No Yes

Observations 7,121 7,121 7,121 7,121

Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1

5.2

Probit Results

This section briefly examines the probit model results in table 5. As previously explained, the results of the probit model (which should be similar to the OLS results) are less important than the use of the probit model in propensity score matching later on. Table 5 shows results for the same estimations as in table 4, but using a probit model rather than a linear probability model. In the first two columns the probit model (showing marginal effects) provides results that are extremely similar to the OLS estimates. Column three shows a slightly different coefficient from its counterpart in table 4, but the two are still very similar. The final column, on the other hand, shows a marginal effect of 0.051, which is smaller in magnitude than the corresponding estimate of the linear probability model (0.072). As discussed in section 4.2, this difference could be the result of the bias and inconsistency that can arise from the use of the linear probability model. Since the probit model may be better suited to a binary outcome variable, the coefficient in column four of table 5 is also interpreted. The probit model predicts that grammar school attendance increases the probability of obtaining a university degree by 5.1 percentage points, all else held constant. While this is a smaller effect than in table 4, the main takeaway remains the same: the coefficient remains significant at the 1% level through all four columns, and the probit results still show a notable benefit from grammar school attendance.[15]

[14] A Hausman test using the suest command rejected the equality of the coefficients for Grammar across the models in columns three and four.


5.3

Matching Results

This section presents the matching results for the outcome of obtaining a university degree. As described in section 4.3, the propensity scores are estimated with a probit specification and conditioned on the same variables as in the previous two models. Figures 1 and 2 showed that there was sufficient overlap for matching to be performed. What must next be established is whether the matching has been successful once the algorithm has been implemented. This is best demonstrated using table 6 and figure 3. Figure 3 shows the kernel densities of the propensity scores of the two groups before and after matching. The before-matching graph is similar to figure 1 and uses the entire sample. The after-matching graph shows the distributions only for matched observations, meaning that the comprehensive distribution now reflects only those individuals who were matched to treated individuals (grammar school students). Figure 3 shows that the nearest-neighbour matching algorithm has created a comparison group of comprehensive school students whose propensity score kernel density is visually similar to that of the treatment group. From this perspective the matching can be said to have been successful.

This conclusion is further supported by table 6, known as a balancing table. Table 6 shows the means of the treatment and control groups (grammar and comprehensive), along with the p-value of the t-test for the difference in means, before and after matching. It is designed to show whether or not the groups have been balanced by matching. The p-values in bold are those that were less than 0.1 before matching, meaning that the means are significantly different from each other at the 10% level (although the vast majority are significant at the 1% level). The table shows that for every previously unbalanced variable used in the estimation of the propensity scores, the hypothesis of equal means can no longer be rejected after matching. Taking the Degree or higher variable as an example, the unmatched row repeats what was shown in table 3: 24.8% of grammar school students have a parent with a university degree or higher, while this is the case for only 13.4% of comprehensive school students. In the matched sample, 26% of comprehensive students have a parent with a degree or higher, and there is no significant difference in the means of the two groups. Table 6 shows that matching has successfully balanced all of the necessary variables, and that a suitable comparison group has been created.
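The kind of test behind each p-value in the balancing table can be sketched from summary statistics. The example below uses the unmatched means and standard deviations of mother's age from table 3. The group sizes (508 grammar and 6,613 comprehensive students) are inferred from the sample sizes reported in the text, and the unequal-variance form of the test and the normal approximation to the p-value are assumptions of this sketch, so the result need not reproduce the reported p-value of 0.003 exactly.

```python
from math import erf, sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t-statistic for a difference in means with unequal variances."""
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se

def two_sided_p_normal(t):
    """Two-sided p-value using the standard normal approximation."""
    cdf = 0.5 * (1.0 + erf(abs(t) / sqrt(2.0)))
    return 2.0 * (1.0 - cdf)

# Mother's age, unmatched sample (table 3); group sizes are inferred.
t_stat = welch_t(26.55, 5.321, 508, 25.782, 5.326, 6613)
p_value = two_sided_p_normal(t_stat)
```

The test rejects equal means at the 1% level before matching, in line with the unmatched row for mother's age in table 6.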


Table 6: PSM Balancing Table

Columns: Variable, Sample, Mean (Treated), Mean (Control), P-Value

Female Unmatched 0.521 0.524 0.911

Matched 0.521 0.506 0.609

Married Unmatched 0.866 0.878 0.419

Matched 0.866 0.863 0.879

Mothers Age Unmatched 26.55 25.786 0.003

Matched 26.55 26.531 0.908

Social Class I Unmatched 0.108 0.039 0.000

Matched 0.108 0.097 0.563

Social Class II Unmatched 0.209 0.123 0.000

Matched 0.209 0.225 0.534

Social Class III Unmatched 0.157 0.137 0.194

Matched 0.157 0.16 0.896

Social Class III Manual Unmatched 0.313 0.433 0.000

Matched 0.313 0.317 0.898

Social Class IV Unmatched 0.11 0.149 0.017

Matched 0.11 0.099 0.575

Person per room ratio Unmatched 0.81 0.874 0.000

Matched 0.81 0.814 0.771

Vocational qualifications Unmatched 0.114 0.135 0.142

Matched 0.114 0.114 0.974

O-Level or equivalent Unmatched 0.23 0.23 0.990

Matched 0.23 0.216 0.523

A-Level or equivalent Unmatched 0.101 0.082 0.087

Matched 0.101 0.102 0.959

SRN Unmatched 0.009 0.017 0.156

Matched 0.009 0.012 0.680

Certificate of Education Unmatched 0.026 0.018 0.188

Matched 0.026 0.029 0.763

Degree or higher Unmatched 0.248 0.134 0.000

Matched 0.248 0.26 0.635

Standardised copy designs score Unmatched 0.422 0.081 0.000

Matched 0.422 0.411 0.844

Estimated reading age (at age 5) Unmatched 5.23 5.089 0.000

Matched 5.23 5.232 0.960

Standardised vocabulary test score Unmatched 0.461 0.124 0.000

Matched 0.461 0.446 0.736

Age 10 maths test score Unmatched 50.398 45.056 0.000


Table 7 displays the average treatment effect, as well as the average treatment effect on the treated, for three different specifications, comparable to those in tables 4 and 5. The total number of observations in each column is 1,524, because each of the 508 treated individuals is matched with their three nearest neighbours based on propensity score (probability of being treated). Column one shows the results for propensity score matching based on only the birth variables. The ATE offers a similar result to column two of tables 4 and 5. Column two of table 7 adds the age 5 variables. The ATE and ATET are significant at the 1% level in both columns one and two. The results in the first two columns are not of real interest, and only the results in column three will be interpreted. This is because one of the main assumptions of PSM (the conditional independence assumption) is unlikely to hold when the propensity scores are not conditioned on all relevant variables. Column three includes all the variables specified in both the probit and LPM models.

Table 7: Matching Results

University Attendance

(1) (2) (3)

ATE 0.122*** 0.0684*** 0.0253

(0.0241) (0.0219) (0.0162)

ATET 0.131*** 0.105*** 0.0476**

(0.022) (0.0235) (0.0235)

Birth Covariates Yes Yes Yes

Age 5 Covariates No Yes Yes

Age 10 Covariates No No Yes

Observations 1,524 1,524 1,524

Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1

Starting with the ATET, the coefficient can be interpreted as follows: the average grammar school student[16] is 4.76 percentage points more likely to obtain a university degree than if they had not attended a grammar school. This coefficient is significant at the 5% level. The ATET is the treatment effect only for grammar school students, so this means that grammar schools were significantly beneficial for their own students. As explained in section 4.3, however, it is more important to examine the average treatment effect. The ATE is the effect of the treatment across the entire population and shows whether or not grammar schools are effective in general. Column three shows an ATE of 0.0253: the average student in the dataset would be 2.53 percentage points more likely to obtain a university degree if they attended a grammar school. This coefficient, however, is not significantly different from zero. Recalling section 4.3, three nearest neighbours were used for each treated individual. The result is robust to the use of one, two or four nearest neighbours in the matching algorithm.[17] It is also robust to restricting the sample to the observations without any missing values (3,344), as described in section 3.3. The result therefore contradicts the results of the probit and linear probability models: matching on propensity scores shows no significant difference between grammar and comprehensive schools in the probability of obtaining a higher education degree.
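The significance statements for column three of table 7 can be checked approximately from the reported point estimates and standard errors. The sketch below uses a two-sided z-test with a standard normal reference distribution, which is an assumption of this illustration; the reported stars may be based on a slightly different reference distribution.

```python
from math import erf, sqrt

def two_sided_p(estimate, std_error):
    """Two-sided p-value for H0: effect = 0, using a normal approximation."""
    z = abs(estimate / std_error)
    cdf = 0.5 * (1.0 + erf(z / sqrt(2.0)))
    return 2.0 * (1.0 - cdf)

# Column three of table 7.
p_ate = two_sided_p(0.0253, 0.0162)   # average treatment effect
p_atet = two_sided_p(0.0476, 0.0235)  # average treatment effect on the treated
```

Under this approximation the ATE has a p-value above 0.10 and the ATET a p-value between 0.01 and 0.05, consistent with the absence of stars on the ATE and the two stars on the ATET.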

6

Unobservable Selection and Coefficient Stability

This section presents a check of the robustness of the results to omitted variable bias, based on the article "Unobservable Selection and Coefficient Stability: Theory and Evidence" (Oster 2014). The author argues that simply observing coefficient movements after the inclusion of extra controls is not sufficient for assessing omitted variable bias; coefficient movements and R-squared movements must be considered together. The main idea behind observing coefficient movements is that bias stemming from observed controls is informative about bias from unobserved controls. Oster argues, however, that this is not implied by the assumptions of the linear model, and that making this statement assumes that observables and unobservables share the same covariance properties. What is therefore crucial is the movement in R-squared after the inclusion of a control. The paper develops a formal method for a robustness check based on these arguments, the results of which are displayed in table 8.[18]

Table 8: Unobservable Selection and Coefficient Stability

                     Baseline Effect               Controlled Effect             δ for β=0
Treatment Variable   Coefficient (s.e.)   R2       Coefficient (s.e.)   R2       (Rmax=1)
Grammar School       0.178*** (0.022)     0.012    0.072*** (0.021)     0.162    0.12124

The coefficient of interest in this table is δ, described as a coefficient of proportionality.[19] It can be interpreted as "the degree of selection on unobservables relative to observables which would be necessary to explain away the result" (Oster 2014). "Explain away the result" means that the treatment effect β equals zero. As an example, a value of δ = 2 would mean that the unobservables would need to be twice as important as the observables to produce a treatment effect of zero. δ = 1 suggests that the observables are at least as important as the unobservables. Calculating δ requires an assumption about Rmax, defined as the R2 of a hypothetical regression that includes observed as well as unobserved controls. A common choice is Rmax = 1 (Altonji, Elder and Taber 2005).

[17] While the point estimates vary slightly, from around 0.019 to around 0.028, none of these coefficients are significantly different from zero.

[18] Note that this method is only applicable to a linear model, so it can only be applied to the regressions in table 4.

Table 8 reports the results of the robustness check. The first set of columns, under Baseline Effect, displays the coefficient, standard error and R-squared for the regression in column one of table 4. The next set of columns shows the same for the regression with all additional controls included (column four of table 4). Finally, the last column displays the calculated δ value for β = 0. The value of 0.12124 suggests that the estimated effect is highly sensitive to selection on unobservables: formally, the unobservables would only need to be approximately 1/8 as important as the observables to produce a treatment effect of zero. This result is consistent with the conclusion of the matching results in table 7. Oster shows that effects that are confirmed by external data (i.e. pre-existing literature) generally have values of δ > 1. Previous literature on this topic suggests little if any impact of selective/grammar school attendance on later outcomes such as degree attainment. The δ value is consistent with this, and with what was previously discussed regarding the potential omitted variable bias in the OLS and probit results in tables 4 and 5. This issue appears to have been addressed by propensity score matching: the estimated (and highly significant) treatment effect of 0.072 in the linear model is not supported by the matching results.
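As a rough cross-check on the reported δ, one can use a simple linear approximation to the bias-adjusted coefficient, β* ≈ β_c − δ (β_b − β_c)(Rmax − R_c)/(R_c − R_b), where the subscripts b and c denote the baseline and controlled regressions, and solve for the δ at which β* = 0. This approximation and the function name `oster_delta_approx` are assumptions of the sketch below; it is not the exact estimator used to produce table 8, so small differences from 0.12124 are expected.

```python
def oster_delta_approx(beta_base, r2_base, beta_ctrl, r2_ctrl, r_max=1.0):
    """Approximate delta for which the bias-adjusted effect equals zero."""
    numerator = beta_ctrl * (r2_ctrl - r2_base)
    denominator = (beta_base - beta_ctrl) * (r_max - r2_ctrl)
    return numerator / denominator

# Inputs taken from table 8 (columns one and four of table 4).
delta = oster_delta_approx(0.178, 0.012, 0.072, 0.162)
```

The approximation gives a δ of roughly 0.12, close to the value reported in table 8 and far below the benchmark of δ = 1.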

7

Results: Russell Group University Access

Because only a small number of individuals fall into the three groups of elite higher education institutions (table 2), results for these measures were not included in the main results. This section briefly discusses results for Russell Group university access.

7.1

OLS and Probit Results

Tables 9 and 10 show the OLS and probit results for the outcome of obtaining a degree from a Russell Group university, using the same methods and controls as in the main results. The OLS results in table 9 show a smaller effect of grammar school attendance than in table 4. This is somewhat surprising, as it might be thought that grammar schools provide better preparation for the very best universities. On the other hand, students at top universities may have been more able to begin with. The low number of Russell Group graduates in the sample, however, makes any inference from the coefficients difficult. The coefficients are significant at the 1% level in the first three columns and at the 5% level in the fourth, and the reduction in the coefficient is less substantial than in table 4, moving only slightly as controls are added. The coefficient in column four can be interpreted as grammar school attendance increasing the probability of obtaining a degree from a Russell Group university by 2.7 percentage points, all else held constant.

Table 9: OLS Results (Russell Group) Russell Group Attendance

(1) (2) (3) (4)

Grammar 0.059*** 0.045*** 0.036*** 0.027**

(0.014) (0.013) (0.013) (0.013)

Birth Covariates No Yes Yes Yes

Age 5 Covariates No No Yes Yes

Age 10 Covariates No No No Yes

Observations 7,121 7,121 7,121 7,121

R-squared 0.005 0.027 0.047 0.061

Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1

Table 10: Probit Results (Russell Group): Marginal Effects Russell Group Attendance

(1) (2) (3) (4)

Grammar 0.059*** 0.035*** 0.021** 0.009

(0.014) (0.011) (0.009) (0.006)

Birth Covariates No Yes Yes Yes

Age 5 Covariates No No Yes Yes

Age 10 Covariates No No No Yes

Observations 7,121 7,121 7,121 7,121

Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1

Table 10 shows that while the probit results in the first two columns are similar to the results in table 9, they are notably different in the final two columns. This could again be attributed to the low number of observations with Russell Group degrees. The final column predicts that grammar school attendance increases the probability of obtaining a degree from a Russell Group university by 0.9 percentage points, all else held constant. This coefficient, however, is not significantly different from zero.

7.2

Matching Results

Figure 4 shows the overlap assumption graph for propensity score matching with the Russell Group outcome. The graph is almost identical to figure 1, with enough overlap for matches to be made.

Figure 4: Russell Group Overlap Graph


Figure 5 shows the kernel density graphs for before and after matching. It can be seen from this graph that the distributions of propensity scores after matching are very similar. Furthermore, all of the variables used in the matching process were balanced after matching, with none of the t-tests concluding that the means of the grammar and comprehensive groups were significantly different from each other.

Table 11 shows matching results for the Russell Group outcome. Column three shows the ATE and the ATET when matching is conditioned on the full set of control variables. Neither the ATE nor the ATET is significant at any conventional level. The magnitudes of the coefficients are also very small. The interpretation of the ATE is that the average student in the dataset would be 1.1 percentage points more likely to obtain a degree from a Russell Group university if they attended a grammar school, although the coefficient is not significantly different from zero. Despite the potentially low statistical power of these matching results (considering that only 51 grammar school students attended a Russell Group university), the overall matching results in table 7 suggest that even a larger sample may have yielded similar results in terms of statistical significance.

Table 11: Matching Results (Russell Group)

Russell Group Attendance

                      (1)         (2)         (3)
ATE                   0.0401***   0.0231**    0.0110
                      (0.0135)    (0.0121)    (0.00916)
ATET                  0.0445***   0.0398***   0.00938
                      (0.0141)    (0.0151)    (0.0154)
Birth Covariates      Yes         Yes         Yes
Age 5 Covariates      No          Yes         Yes
Age 10 Covariates     No          No          Yes
Observations          1,524       1,524       1,524

Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
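The difference between the ATE and the ATET reported in Table 11 can be illustrated with a small sketch. It assumes one-nearest-neighbour matching with replacement on an already estimated propensity score, which may differ from the exact matching estimator used in this analysis, and the seven units in `toy_units` are invented purely for illustration.

```python
def nearest(pscore, pool):
    """Return the unit in pool whose propensity score is closest to pscore."""
    return min(pool, key=lambda unit: abs(unit[1] - pscore))

def match_effects(units):
    """Return (ATE, ATET) from one-nearest-neighbour matching with replacement.

    Each unit is a tuple (treated, propensity_score, outcome).
    """
    treated = [u for u in units if u[0] == 1]
    controls = [u for u in units if u[0] == 0]
    # Treated units: observed outcome minus the matched control's outcome
    effects_treated = [y - nearest(p, controls)[2] for _, p, y in treated]
    # Control units: matched treated unit's outcome minus the observed outcome
    effects_controls = [nearest(p, treated)[2] - y for _, p, y in controls]
    atet = sum(effects_treated) / len(effects_treated)
    ate = (sum(effects_treated) + sum(effects_controls)) / len(units)
    return ate, atet

# Hypothetical data: (grammar school dummy, propensity score, Russell Group degree)
toy_units = [
    (1, 0.80, 1), (1, 0.60, 0), (1, 0.30, 1),
    (0, 0.75, 1), (0, 0.55, 0), (0, 0.35, 0), (0, 0.10, 0),
]

ate, atet = match_effects(toy_units)
print(f"ATE = {ate:.3f}, ATET = {atet:.3f}")
```

The ATET averages the matched differences over treated units only, while the ATE also imputes a treated counterfactual for every control unit and averages over the whole sample, which is why the two estimates in a table such as Table 11 need not coincide.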


8 Conclusion

In this analysis, OLS and probit results showed a 7.2 percentage point and a 5.1 percentage point increase, respectively, in the probability of obtaining a university degree for grammar school students. Both of these results were significant at the 1% level. Matching results, however, showed that grammar school students had no significant advantage in university access, both for universities in general and for "elite" universities as characterized by the Russell Group.

The main policy implication of this result concerns the potential reform of the UK state school system. There are currently still 163 grammar schools in England and Wales, and the incumbent Conservative Prime Minister has publicly declared her desire to create more grammar schools throughout the country. My results strongly support the claim that this would not be a beneficial decision. While no claims can be made from this analysis about social mobility directly (one of the main arguments the government has made for the reintroduction of grammar schools), equal university access would support the claim that comprehensive schools are as good for social mobility (at least for those who go to university) as grammar schools. Complete abolition of grammar schools would lead to a less divisive state school system where the better schools are no longer dominated by students from more fortunate backgrounds.

Lastly, it is important to mention some shortcomings of the analysis. The initial intention was to compare two datasets, the 1958 NCDS and the BCS70. This would have allowed me to compare results from two slightly different time periods in the UK education system. However, the 1958 study lacked the key outcome variables needed for this analysis. Furthermore, it would have been interesting to be able to properly analyze the alternative measures of an "elite" university degree: Russell Group, Sutton Trust, and Oxford & Cambridge. This was made difficult, however, by the low number of students who attended institutions in these groups. For example, only 29 of the 7,121 individuals in the dataset attended either Oxford or Cambridge. Future research could use more recent data, in which a greater number of individuals attend university, to allow a more equal comparison.


References

Aldrich JH, Nelson FD. 1984. Linear Probability, Logit, and Probit Models. SAGE Publications 45(1).

Altonji JG, Elder TE, Taber CR. 2005. Selection on Observed and Unobserved Variables: Assessing the Effectiveness of Catholic Schools. Journal of Political Economy 113(1): 151-184.

Amemiya T. 1977. The Maximum Likelihood and the Nonlinear Three-Stage Least Squares Estimator in the General Nonlinear Simultaneous Equation Model. Econometrica 44(4): 955-968.

Abdulkadiroglu A, Angrist JD, Pathak PA. 2014. The Elite Illusion: Achievement Effects at Boston and New York Exam Schools. Econometrica 82(1): 137-196.

Asthana A, Campbell D. 2017. Theresa May paves way for new generation of grammar schools. Available: https://www.theguardian.com/uk-news/2017/mar/06/theresa-may-paves-way-for-new-generation-of-grammar-schools. Last accessed May 2017.

Bland R. 1979. Measuring "Social Class": A Discussion of the Registrar-General's Social Classification. Sociology 13(2): 283-291.

Bliss CI. 1934. The Method of Probits. Science 79(2037): 38-39.

Boliver V, Swift A. 2011. Do comprehensive schools reduce social mobility? The British Journal of Sociology 62(1): 90-110.

Bolton P. 2016. Student Loan Statistics. House of Commons Library: 3-33.

Butler RA. 1944. An Act to reform the law relating to education in England and Wales. United Kingdom Parliament.

Caliendo M, Kopeinig S. 2005. Some Practical Guidance for the Implementation of Propensity Score Matching. IZA Discussion Paper No. 1588: 1-32.

Card D. 1999. The causal effect of education on earnings. Handbook of Labor Economics 3(A): 1801-1863.

Chevalier A. 2004. Parental Education and Child's Education: A Natural Experiment. IZA Discussion Paper Series 1153: 1-44.

Clark D. 2009. Selective Schools and Academic Achievement. The B.E. Journal of Economic Analysis & Policy 10(1): 1-40.

Clark D. 2015. The Long-Run Effects of Attending an Elite School: Evidence from the UK. American Economic Journal: Applied Economics 8(1): 150-76.

Cribb J, Jesson D, Sibieta L, Skipp A, Vignoles A. 2013. Poor Grammar: Entry into Grammar Schools for disadvantaged pupils in England. Sutton Trust: 3-21.

Dodgeon B, Shepherd P, Butler N, Johnson J. 1992. A Guide to the BCS70 16-year Head Teacher Questionnaire Data available at the UK Data Archive. Centre for Longitudinal Studies: 4-33.
