

2.4 Explaining student test scores

2.4.1 A review of the economic literature on the determinants of international differences in student test scores

The economic literature on the determinants of international differences in student test scores has grown quickly over the last two decades due to the enormous progress in data collection and availability. Hanushek and Woessmann (2010) review this literature and discuss three groups of determinants: (i) student and family background characteristics; (ii) school inputs; and (iii) institutional features of schools and education systems.

This section draws heavily on Hanushek and Woessmann (2010), Hanushek and Woessmann (2014), and Hanushek, Piopiunik and Wiederhold (2014).31

Methodological challenges

Estimating causal effects on student test scores poses methodological challenges. The concern most often voiced by economists is that the estimates obtained cannot be given a causal interpretation. This is because the indicators may be correlated with unobserved student characteristics (such as individual ability), so that we cannot be sure that we are capturing the effect of the factor in question, for example government spending, and not the effect of unobserved characteristics.32 But the problem can also be due to causation running in the reverse direction or to error in measuring the indicators. In all these cases, causality cannot be established and hence research cannot inform policymakers on which educational policies work and which do not.33

In the light of the above, and following Hanushek and Woessmann (2010), we distinguish between (i) descriptive studies that document associations between variables but do not necessarily identify causal links; and (ii) studies that use more elaborate strategies to identify causal effects of particular variables. Below we briefly summarise the findings of the descriptive studies and discuss a few leading examples of studies that aim to identify causal effects.

30 We focus on mean test scores. For a discussion of the determinants of within-country inequality in test scores, see e.g. Van de Werfhorst and Mijs (2010) and OECD (2010a).

31 A comprehensive review of all relevant literature in all relevant fields (sociology, economics, psychology, etc.) is beyond the scope of this chapter. Instead, we focus on the recent and most relevant literature in economics. For an earlier account of the factors that are considered effective in fostering education, see for example Scheerens and Bosker (1997).

32 A similar problem occurs when students self-select into private schools, in which case it is unclear whether observed differences are caused by the type of school or are simply due to different types of students going to private schools.

33 Endogeneity is by no means the only concern when estimating effects on test scores. Psychologists, for example, often point to measurement problems (reliability and validity). To some extent, the two concerns are related, as measurement error in the explanatory variables constitutes one type of endogeneity.

Do student and family background characteristics affect student test scores?

Student and family background characteristics have long been viewed as an important determinant of student performance. In fact, the influential 1966 Coleman report, published by the U.S. Government, already demonstrated statistically that educational outcomes were much more a reflection of a student’s friends and family than of the inputs supplied by the government (Coleman et al. 1966, Fukuyama 2014). In a cross-national context, the extent to which student test scores are explained by student and family background provides an indication of the inequality of opportunity of children from different social backgrounds. Table 2.4 in Section 2.1.1 showed the proportion of variation in PISA test scores that is explained by socioeconomic status (as a measure of inequality). It revealed that inequality is lowest in Norway, Finland and Canada, where socioeconomic status explains less than 10% of the variation in student test scores, and highest in the Slovak Republic, Hungary and France, where it explains up to 25%. In addition to inequality, the extent to which student and family background characteristics explain test scores also reflects the (lack of) intergenerational mobility in a society (Hanushek and Woessmann 2010).
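The ‘proportion of variation explained’ used here is simply the R² of a regression of test scores on socioeconomic status. The sketch below illustrates the statistic on simulated data; the SES index, the coefficients and the noise level are all hypothetical and are not taken from PISA.

```python
# Minimal sketch: share of test-score variance explained by a single
# background variable, i.e. the R^2 of a univariate OLS regression.
# All numbers below are hypothetical, chosen only for illustration.
import random

random.seed(0)
n = 1000
ses = [random.gauss(0, 1) for _ in range(n)]               # hypothetical SES index
score = [500 + 30 * s + random.gauss(0, 80) for s in ses]  # assumed data-generating process

def r_squared(x, y):
    """R^2 of a univariate OLS regression = squared correlation of x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov ** 2 / (var_x * var_y)

print(f"share of variance explained: {r_squared(ses, score):.2f}")
```

With these assumed parameters the true share is 30² / (30² + 80²) ≈ 0.12, of the same order as the country shares reported in Table 2.4.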

Hanushek and Woessmann (2010) provide a detailed overview of descriptive studies that estimate the relationship between student/family background variables and student test scores. In summary, these studies confirm the two observations above:

1 Student test scores differ substantially by student and family background.

2 The extent to which test scores differ by student and family background varies substantially across countries.

As explained above, while informative, the descriptive studies do not identify the underlying causal mechanisms. To address this shortcoming, several authors have employed more involved identification strategies to obtain better estimates of causal effects.

The role of parents

Starting from the observation that better-educated parents have better-educated children (Haveman and Wolfe 1995), Holmlund et al. (2011) investigate why this is the case. Is it because parents with higher ability have more able children (nature)? Or is it because education generates resources that help parents in fostering the education of their children (nurture)?

Based on their reading of the empirical literature and the various identification strategies employed, Holmlund et al. (2011) conclude that more than half the educational persistence across generations is explained by nature (inherited abilities), with the remainder being explained by nurture.

The causal effect of parental schooling, while small, constitutes a large part of the nurture component.

Being younger than classmates

Bedard and Dhuey (2006) explore the fact that, due to the use of a single cut-off date for school enrolment, children enter school at somewhat different ages. They show that these initial age differences have long-lasting effects on student performance. The youngest members of a cohort score lower than the older members in grades 4 and 8 (age 9-10 and 13-14, respectively) and are less likely to go on to high-end universities. In short, being young compared to classmates early in life puts children at a disadvantage which persists into adulthood.

Peer effects in primary schools

Ammermueller and Pischke (2009) estimate peer effects in primary schools in six European countries. In particular, they ask whether, in addition to a child’s own personal and family characteristics, the characteristics of the child’s peers (classmates) also affect the child’s performance. The difficulty in estimating peer effects is that schools and classrooms are typically not formed randomly, so that the estimated effects may capture unobserved characteristics of pupils rather than peer effects.

Ammermueller and Pischke (2009) argue that, within a given school (and only in primary schools, where pupils have not yet been allocated to different educational tracks), the allocation of children over different classes is fairly random. Looking at the within-school variation in the characteristics of classmates across different classes, Ammermueller and Pischke (2009) find moderately large peer effects. Hence, perhaps unsurprisingly, the development of young children’s cognitive skills depends to some extent on their peers in class.

Do school inputs matter for test scores?

According to the heuristic model presented in Chapter 1, the public sector provides inputs that are used to produce output. More output should then result in better outcomes. In line with this model, empirical research has focused a lot of attention on estimating the effects of educational inputs on student test scores. From a public policy perspective, this avenue of research is particularly important, as it informs policymakers on whether education policy does indeed contribute to the development of cognitive skills, and if so by how much.

In Section 2.2 we compared countries in terms of various measures of educational inputs. Hanushek and Woessmann (2010) provide a review of descriptive studies that analyse the relationship between school inputs and student test scores. The school input measures studied include expenditure per student, class size, availability of instructional material and teacher characteristics. In summary, the studies find that:

1 Quantitative measures of school inputs such as expenditure per student and class size do not explain the cross-country differences in student test scores.

2 By contrast, several studies document positive associations of student test scores with the quality of instructional material and the quality of teachers.

As before, these descriptive studies, while informative, do not necessarily indicate causality. Several studies have attempted to identify causal effects using more elaborate identification strategies.

The effect of class size

Hanushek and Woessmann (2010) point out that most progress has been made in estimating the effects of class size on student performance.

Woessmann and West (2006), for example, investigate the variation in class size that is caused by natural fluctuations in the size of subsequent age cohorts of a school (similar to Hoxby 2000). As long as differences in the number of students per age cohort within a given school are random, this strategy yields good causal estimates.

In an alternative attempt to isolate exogenous variation in class size, Woessmann (2005), following Angrist and Lavy (1999), uses the fact that ten Western European countries impose maximum class-size rules. As long as the number of students in a school is not close to a multiple of this maximum, average class size in a school increases with the number of students in a given school year group. But when the number of students increases beyond a multiple of the maximum class size, average class size will drop.

It is this latter variation that is exploited to obtain causal estimates of the effect of class size.
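The discontinuity exploited by Woessmann (2005) and Angrist and Lavy (1999) can be made concrete with a little arithmetic. Below is a minimal sketch assuming a hypothetical cap of 40 students per class; the actual maximum class-size rules differ across the ten countries.

```python
# Predicted average class size under a maximum class-size rule:
# enrolment is split over the smallest number of classes that respects
# the cap. Average class size rises with enrolment until the cap forces
# an extra class to be opened, at which point it drops sharply; this
# discontinuous variation is what identifies the class-size effect.
import math

MAX_CLASS_SIZE = 40  # hypothetical cap; real rules vary by country

def predicted_class_size(enrolment: int) -> float:
    n_classes = math.ceil(enrolment / MAX_CLASS_SIZE)
    return enrolment / n_classes

for e in (40, 41, 80, 81):
    print(f"enrolment {e:3d} -> average class size {predicted_class_size(e):.1f}")
# 40 students fit in one class of 40; the 41st student forces a split
# into two classes of about 20.5 students each.
```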

Finally, the third attempt to obtain random variation uses only the class- size variation between different school subjects for a given student (Altinok and Kingdon 2012, Dee 2005). In this case, the causal effect is identified simply from comparing the performance of the same student in different subjects (and corresponding differently sized classes).
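The within-student strategy can be illustrated with simulated data: because both subjects share the same student, differencing scores across subjects removes any student-level confounder such as ability. The data-generating process below (coefficients, noise levels) is entirely hypothetical.

```python
# Sketch of within-student, between-subject identification (in the
# spirit of Dee 2005): unobserved ability affects both scores and class
# sizes, but cancels out when the two subjects are differenced.
import random

random.seed(1)
TRUE_EFFECT = -0.5  # assumed effect of one extra classmate on the score
diffs = []
for _ in range(500):
    ability = random.gauss(0, 10)  # unobserved confounder
    # able students sort into smaller classes in both subjects
    size_maths = 30 - 0.5 * ability + random.gauss(0, 2)
    size_science = 30 - 0.5 * ability + random.gauss(0, 2)
    score_maths = 500 + ability + TRUE_EFFECT * size_maths + random.gauss(0, 3)
    score_science = 500 + ability + TRUE_EFFECT * size_science + random.gauss(0, 3)
    diffs.append((size_maths - size_science, score_maths - score_science))

# OLS slope on the differenced data: ability drops out of both differences
dx = [d for d, _ in diffs]
dy = [d for _, d in diffs]
mx, my = sum(dx) / len(dx), sum(dy) / len(dy)
slope = sum((a - mx) * (b - my) for a, b in zip(dx, dy)) / sum((a - mx) ** 2 for a in dx)
print(f"estimated class-size effect: {slope:.2f}")
```

A naive levels regression on these data would overstate the class-size effect, because able students sit in smaller classes; the differenced estimate recovers the assumed effect of −0.5.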

The estimated effects are similar in all three approaches. In short, the studies find no strong effects of class size on student performance in most countries. Smaller classes seem to be most beneficial in countries with low teacher quality. The latter finding suggests that only the relatively capable teachers do as well when teaching large classes as when teaching small classes (Hanushek and Woessmann 2010).

The importance of teacher quality

Progress has been more limited in estimating the causal effects of teacher quality on student test scores. Hanushek, Piopiunik and Wiederhold (2014) estimate the effect of the cognitive skills of teachers on student performance.

They exploit the idea that, in countries where non-teacher public-sector employees are paid more (relative to their private-sector counterparts), the public sector attracts workers (including teachers) with higher levels of cognitive skills. Using this variation in teacher cognitive skills, the authors find that such skills are indeed an important determinant of international differences in student test scores.

Do institutional features of schools and education systems affect test scores?

Governments not only provide educational inputs (spending, materials, teachers) but also shape the institutional design of the education sector.

According to North (1990), “Institutions are the rules of the game in society or, more formally, are the humanly devised constraints that shape human interaction. In consequence they structure incentives in human exchange, whether political, social, or economic…” A large body of literature has focused on estimating the effect of institutional features of schools and education systems on student test scores. Hanushek and Woessmann (2014) review this literature and discuss the following three institutional features: (a) accountability measures; (b) school autonomy; and (c) competition from private schools.34

Accountability

The accountability device that has received most attention is the presence of curriculum-based external exit examinations. These are examinations where a decision-making authority external to the school has full responsibility for or gives final approval of the content of examinations. In the absence of such external examinations (i.e. when teachers in each school are responsible for examination content), the performance of students cannot easily be compared across classes and schools. External examinations facilitate monitoring of the performance of students, teachers and schools.

They also increase the scope for students to signal their achievements to future employers or higher education institutions. Both imply more accountability (Woessmann et al. 2007, Hanushek and Woessmann 2014).

Hanushek and Woessmann (2014) provide an overview of descriptive studies that estimate the relationship between external exit examination systems and student test scores. Based on these studies, they draw the following conclusions:

1 Students in countries that have external exit examination systems score substantially higher than students in countries without external examination systems.

2 The evidence suggests that the effect may well be larger than a whole grade-level equivalent.

Again, the descriptive studies do not definitively establish causality. Jürges, Schneider and Büchel (2005) use the fact that, in some German secondary school tracks, the federal states with centrally set exit examinations use them in maths but not in science. This allows the authors to compare the test scores in maths (a subject with central exit examinations) to the test scores in science (a subject without central exit examinations) for one and the same student. The results confirm the positive and substantial effect of central exit examinations on test scores (although the size of the effect is smaller than in the descriptive studies).

34 In addition, Hanushek and Woessmann (2014) discuss school tracking (which has been studied mostly in terms of the equity of student test scores) and pre-school education.

Hanushek and Woessmann (2014) also briefly review studies that analyse other accountability mechanisms (the results reported here stem first and foremost from the study by Woessmann et al. 2007). The first mechanism involves the monitoring of teacher-led lessons by the school principal or internal staff, or by external inspectors. The results show positive associations of student test scores with both internal and external monitoring of teachers.

The second accountability mechanism uses assessments of student achievement to compare the performance of schools within the same district or country. Being able to compare schools increases accountability.

The empirical results show that student test scores are indeed higher when schools use assessments to compare themselves to district or national performance.

The third accountability mechanism uses assessments of student achievement to decide on students’ retention or promotion, thus creating incentives for students to perform well. The results indicate that students perform substantially better in countries where a larger share of schools use assessments for retention or promotion (Woessmann et al. 2007).

Finally, the fourth accountability mechanism uses assessments of student achievement to group students for instructional purposes. According to Woessmann et al. (2007), the share of a country’s schools that use assessments to group students provides a crude measure of the extent of tracking that goes on within schools. In contrast to the other accountability mechanisms, this type of accountability is associated with lower student test scores (Woessmann et al. 2007). This is consistent with earlier results in the literature. The negative effects of ability grouping or tracking could reflect the notion that lower-ability students suffer disproportionately from slower learning environments that leave them far behind high-ability students (Hanushek and Woessmann 2006).

Table 2.5 documents for each country the presence or absence of each of the accountability mechanisms described above (in 2012). The variable ‘External exams’ indicates the extent to which standards-based external examinations for students in secondary education exist in the system. The variables ‘Assessments used for retention/promotion’, ‘Assessments used to compare schools’ and ‘Assessments used to group students’ indicate the percentages of students in schools whose principal reported that assessments of 15-year-old students are used to decide on students’ retention or promotion, to compare schools to district or national performance and to group students for instructional purposes, respectively. The variable ‘Monitoring of lessons by principal’ (‘Monitoring of lessons by external inspectors’) indicates the percentage of students in schools whose principal reported that the principal or senior staff (inspectors or other externals) have monitored maths teachers by observing lessons.

School autonomy

Another institutional feature discussed by Hanushek and Woessmann (2014) is school autonomy. Schools, they argue, have an informational advantage over regional and national authorities when it comes to their own students, staff and local conditions. Giving schools more responsibility for school policy and management could therefore improve student outcomes. On the other hand, when the interests of schools are not aligned with improving student outcomes, more autonomy may instigate opportunistic behaviour, unless schools are held accountable for the performance of their students.

Hanushek and Woessmann (2014) provide an overview of descriptive studies that estimate the impact of school autonomy on student test scores. Based on these studies, they conclude the following:

1 Students score substantially higher in schools that have autonomy when it comes to process and personnel decisions (such as the decision to purchase supplies, allocate the budget, and hire and reward teachers (within a given budget), and the choice of textbooks and methods of instruction).

2 By contrast, students score lower in schools that have autonomy in setting the budget and choosing the subject matter to be taught in class (two areas of decision-making that, according to Hanushek and Woessmann 2014, give rise to opportunistic behaviour but do not involve much superior local knowledge).

3 In addition to these average effects, the descriptive studies also suggest important cross-country differences. The effect of, for example, school autonomy over teacher salaries on student test scores seems negative in systems that do not have external exit examinations, but positive in systems that do. The size of the effects indicates that moving from the worst possible system (with autonomy but without external examinations) to the best (with autonomy and with external examinations) improves test scores by no less than three quarters of a standard deviation. Similar interactions have been found for school autonomy in determining course content and teacher influence on resource funding. Intuitively, these results suggest that autonomy leads to lower test scores if it is not accompanied by external examinations that hold schools accountable for test scores, but to higher test scores if it is.

As before, these findings do not necessarily imply causality. Hanushek, Link and Woessmann (2013) attempt to address some of the concerns by constructing a panel dataset and exploiting variation over time. Their results suggest that school autonomy affects test scores negatively in developing and low-performing countries, but positively in developed and high-performing countries. They also confirm that granting schools autonomy works better in combination with accountability (central exit examinations) that limits opportunistic behaviour.

Table 2.5 documents for each country the extent to which schools are autonomous in the areas of formulating the budget, establishing the starting salaries of teachers, determining course content and hiring teachers (in 2012).

The variables ‘Formulating budget’, ‘Establishing starting salaries’, ‘Determining course content’ and ‘Hiring teachers’ indicate the percentages of students in schools where the principal reports that schools have a main/substantial responsibility for the following aspects of school policy and management, respectively: formulating the school budget, establishing teachers’ starting salaries, determining course content and appointing teachers.

Competition from private schools

A third institutional feature discussed by Hanushek and Woessmann (2014) relates to the scope for private-sector involvement in schools, and the degree to which state schools face competition from private schools. According to Hanushek and Woessmann (2014), it is important to distinguish between the private (as opposed to public) operation of schools and the private (as opposed to public) funding of schools. Privately operated schools are managed directly or indirectly by a non-government organization such as a church, trade union or business. By contrast, private funding of schools refers to funding from private contributions such as
