Under-advice gap in primary education in the Netherlands: a Blinder-Oaxaca decomposition analysis.
Nihal Chehber (12455245) MSc Economics, Public Policy
Supervisor: Prof. Jeroen Hinloopen Second reader: Prof. Bas ter Weel
Internship at CPB Netherlands Bureau for Economic Policy Analysis
I would like to express my sincere gratitude and warm appreciation to the following persons who have given me much of their time to help me shape this thesis.
Prof. Jeroen Hinloopen, my thesis supervisor, for always being available to discuss my concerns and giving me precious suggestions.
Maria Zumbuehl, my supervisor at CPB, from whom I learned so much; this thesis would not look the way it does without her help and expertise.
Rik Dillingh, for his valuable critique and brilliant thoughts.
I would also like to thank my colleagues at CPB who were always ready to answer my questions and who made me feel welcome during these difficult times of the pandemic.
Finally, a big thank you to my parents for their continuous support and to my partner for his patience and encouragement.
Statement of originality
This document is written by Nihal Chehber who declares to take full responsibility for the contents of this document.
I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it.
UvA Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.
The aim of this study is to identify the factors that contribute to the gap in the probability of being under-advised between different groups of pupils in their last year of primary school in the Netherlands. Previous studies show that pupils with low socioeconomic status and pupils from a migrant background are more likely to receive a school advice that is lower than the test advice. We use a Blinder-Oaxaca decomposition to break down this under-advice gap into a part that is attributable to differences in observed characteristics, such as background characteristics, school characteristics, cognitive and non-cognitive skills, and a residual part that is due to differences in returns to characteristics. We find that pupils of low-educated parents are significantly more likely to get under-advised. While 35% of this gap can be attributed to the difference in average observed characteristics, the largest part of the gap remains unaccounted for. Pupils of low-income parents are more likely to get under-advised than pupils of high-income parents, and this gap is almost entirely driven by differences in parental education. We find no evidence for a significant under-advice gap between pupils with and without a migrant background. Although pupils in low urbanized areas have on average better cognitive and non-cognitive skills, they are more likely to get under-advised. Girls are more likely to be under-advised than boys, but most of the gap remains unexplained by differences in observed characteristics.
Table of Contents
1 Introduction
2 Theoretical framework
2.1 The Dutch context
2.2 Inequality in the school advice
2.3 Literature on the objective-subjective assessment gap
2.3.1 Role of the pupil
2.3.2 Teacher bias
2.3.3 Pupils’ attributes
3 Data
3.1 Dataset
3.2 Variables
3.2.1 Outcome variables
3.2.2 Independent variables
3.3 Descriptive statistics
4 Empirical strategy
5 Results
5.1 Regression analysis results
5.2 Blinder-Oaxaca decomposition results
5.2.1 Decomposition by parental education
5.2.2 Decomposition by parental income
5.2.3 Decomposition by migrant background
5.2.4 Decomposition by gender
5.2.5 Decomposition by urbanization
5.3 Robustness checks
5.3.1 Overfitting
5.3.2 Logistic decomposition
5.3.3 Adding sample weights
6 Conclusion
7 Bibliography
8 Appendix
1 Introduction
A heated debate erupted in the Netherlands when it was shown that, during the transition from primary to secondary school, some groups of pupils are more often allocated to lower ability tracks in secondary education. In particular, compared to pupils with a high socioeconomic status, those from low socioeconomic backgrounds are more likely to receive a school advice that is lower than the test advice, despite having the same score on the final test (Inspectorate of Education, 2014, 2017, 2018, 2019). This higher likelihood of receiving a lower school advice also concerns children with a migrant background and those from low urbanized areas.
The fact that school advice appears to be related to students' socioeconomic status and migrant background, both of which are beyond their control, raises concerns about inequality in opportunity.
This issue grew even more serious in 2015 when a decision was made to render the school advice decisive, giving the final test a much less important role in the transition from primary to secondary education (Toetsbesluit PO, 2014). While this resolution is based on the belief that the teacher has a better understanding of the student’s interests and capabilities, others view the teacher’s decision-making process as a black box that is susceptible to bias. In a highly tracked educational system such as that of the Netherlands, the transition from primary to secondary education is a critical moment in the pupil’s school career (Zeedyk et al., 2003). A poor and inadequate transition not only leads to a decline in academic performance and school attendance (Gutman & Midgley, 2000) but can also have a long-term detrimental impact on mental health (Waters et al., 2012). Therefore, school advice inequalities need to be carefully investigated and addressed.
The literature is not conclusive on the reasons for the under-advice gap. Some studies do indicate the existence of teacher bias, especially against pupils from low socioeconomic backgrounds, without indicating the nature of the bias (Timmermans et al., 2015). Other studies find that teachers consider other variables in formulating their advice, such as non-cognitive skills and home situation, which happen to be correlated with background characteristics of the pupils (Borghans et al., 2018).
Here also, the decision-making process of the teacher remains unclear as there are no objective evaluation criteria available to all teachers to evaluate non-cognitive skills. This study aims to identify the factors that contribute to the under-advice gap. Using the Blinder-Oaxaca decomposition technique, we break down the gap in the probability of getting under-advised between two groups into a part that is attributable to differences in known characteristics, such as background characteristics, school characteristics, cognitive and non-cognitive skills, and a residual part that is due to differences in returns to characteristics. We decompose the under-advice gap by parental education, parental income, migrant background, degree of urbanicity and gender. We also investigate the probability of still being under-advised after reevaluation¹. To the best of our knowledge, this is the first study to decompose the under-advice gap between different groups of pupils in the Netherlands. Our central research question is as follows:
What accounts for the gap in the probability of being under-advised between pupils from different backgrounds? How much of the gap can be attributed to differences in observed characteristics and how much of the gap remains unexplained?
We use data from the third round of the COOL cohort study (2013/2014), linked with CBS microdata, on 5197 pupils. Data include measures of cognitive skills, non-cognitive skills, school advice, test advice, background variables and school characteristics. In the first stage of the analysis, we perform ordinary least squares (OLS) regressions to identify variables that strongly predict under-advice. In the second stage, we carry out a Blinder-Oaxaca decomposition to split the under-advice gap into an explained and an unexplained part. Results show that under-advising is not random but is systematically related to a number of pupils’ background characteristics, mainly parental education, parental income, gender and level of urbanization. Results also indicate that, in addition to cognitive skills, teachers use information on non-cognitive skills when giving the school advice.
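To make the second stage concrete, the following is a minimal sketch of a twofold Blinder-Oaxaca decomposition on simulated data. It is an illustration only, not the specification used in this thesis: it assumes a single covariate (a hypothetical skill score), a linear probability model fitted by OLS, and the coefficients of the comparison group as the reference. The function names and all simulated values are made up for the example.

```python
import random
from statistics import fmean


def fit_line(x, y):
    """OLS intercept and slope for y = a + b * x (single covariate)."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def twofold_decomposition(x_a, y_a, x_b, y_b):
    """Split mean(y_a) - mean(y_b) into an explained and an unexplained part.

    Group b's coefficients serve as the reference (an assumption of this sketch).
    """
    a_int, a_slope = fit_line(x_a, y_a)
    b_int, b_slope = fit_line(x_b, y_b)
    xa_bar, xb_bar = fmean(x_a), fmean(x_b)
    # Part due to different average characteristics, priced at group b's returns.
    explained = (xa_bar - xb_bar) * b_slope
    # Part due to different returns (including the intercept), at group a's means.
    unexplained = (a_int - b_int) + xa_bar * (a_slope - b_slope)
    return {
        "gap": fmean(y_a) - fmean(y_b),
        "explained": explained,
        "unexplained": unexplained,
    }


def simulate(rng, n, skill_mean, base, slope):
    """Simulate a skill score and a 0/1 under-advice indicator (hypothetical values)."""
    skills, outcomes = [], []
    for _ in range(n):
        s = rng.gauss(skill_mean, 1.0)
        p = min(max(base + slope * s, 0.0), 1.0)
        skills.append(s)
        outcomes.append(1.0 if rng.random() < p else 0.0)
    return skills, outcomes


rng = random.Random(0)
x_low, y_low = simulate(rng, 2000, -0.3, 0.20, -0.05)    # e.g. low-educated parents
x_high, y_high = simulate(rng, 2000, 0.3, 0.12, -0.03)   # e.g. high-educated parents
result = twofold_decomposition(x_low, y_low, x_high, y_high)
print(result)
```

Because each group's regression includes an intercept, the explained and unexplained parts add up exactly to the raw gap in means. The split between the two parts depends on which group's coefficients are taken as the reference, which is why that choice is flagged as an assumption here.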
Pupils of low-educated parents are 7.3 percentage points more likely to get under-advised than pupils of high-educated parents. 35% of this gap can be attributed to differences in average observed characteristics, mainly non-cognitive skills, parental involvement and birth year, which favor pupils of high-educated parents. Pupils of low-income parents are 3.7 percentage points more likely to get under-advised than pupils of high-income parents, and this gap is almost entirely driven by differences in parental education. We find no evidence for a significant under-advice gap between pupils with and without a migrant background.
Although pupils in low urbanized areas have on average better cognitive and non-cognitive skills, they are more likely to get under-advised. Girls are more likely to be under-advised than boys, but most of the gap remains unaccounted for by differences in observed characteristics.
1 When the test advice is higher than the school advice, the pupil is eligible for a reevaluation of the school advice, but it is not necessarily adjusted. Only about a third of pupils eligible for a reevaluation have their school advice adjusted (Swart et al., 2019).
The remainder of the paper proceeds as follows. Section 2 explains the Dutch primary education system, reviews the existing literature and outlines our hypotheses. Section 3 then describes the dataset, the variables used and provides descriptive statistics. Section 4 lays out the empirical strategy. Section 5 presents our findings and outlines a number of robustness checks. Section 6 concludes and addresses policy implications.
2 Theoretical framework
2.1 The Dutch context
In the Netherlands, primary education runs from the age of 4 to the age of 12, with the first two years spent in kindergarten. Starting from the third year, kids learn how to read and write and their progress is from then on continuously evaluated through a pupil monitoring system². By the end of the 8th grade in primary school, it is mandatory for pupils to take a test (Eindtoets Basisonderwijs) in order to evaluate their language and numeracy skills. There are many versions of the test available for the primary school to choose from, the most popular one being De Centrale Eindtoets van het College voor Toetsen en Examens (CvTE), Centrale Eindtoets (CET) for short. This test consists of two main sections, namely language and arithmetic, and an additional optional section, world orientation, which evaluates the pupil’s knowledge in geography, history, nature and technology. The final score of the CET (referred to as the standard score) is a number between 501 and 550. Each score interval corresponds to a certain secondary school track, either within pre-vocational secondary education (VMBO), senior general secondary education (HAVO), or pre-university secondary education (VWO). The secondary school level inferred from the final test result constitutes what is called the test advice. Vmbo lasts four years and prepares for secondary vocational education (MBO). It is divided into different levels, each with a particular combination of vocational training and theoretical education. These are vmbo-basic-vocational (vmbo-b), vmbo-senior-vocational (vmbo-k), vmbo-mixed (vmbo-g) and vmbo-theoretical (vmbo-t). Havo (5 years) and vwo (6 years) are general education programs leading to university. A havo diploma allows students to attend universities of applied sciences (HBO) while a vwo diploma grants access to research universities (WO). In addition to the advice for one of these main tracks, combined advices also exist, such as vmbo-b/k and havo/vwo.
Before the first of March and thus before taking the final test, every child receives their teacher’s advice³ about the secondary school track that best fits their academic level. This school advice is a subjective expert evaluation of the pupil’s abilities based on his/her achievements, attitudes and interests. It reflects the teacher’s expectation of the pupil’s future achievement level during secondary education (de Boer et al., 2010). While formulating their recommendation, teachers are expected to carefully assess the child’s cognitive and non-cognitive abilities, relying on objective and measurable learning performance and following a clear decision-making procedure with no implicit assumptions (Inspectie van het Onderwijs, 2018).
2 Most schools in the Netherlands evaluate the child’s progress by keeping a record of their homework, tests, intermediate assessments and extracurricular activities.
3 We loosely use the term teacher’s advice to refer to the school advice. The school advice is not entirely decided by one individual. The process involves other members of the primary school staff, among whom is the main teacher.
The test advice and the school advice together determine the level of secondary education that is suitable for the pupil. However, since the school year 2014/2015, the school advice has been decisive, giving the final test a much less important role in the transition from primary to secondary education (Toetsbesluit PO, 2014). If the test advice is lower than the school advice, nothing changes and the school advice stands. Conversely, if the test advice is higher than the school advice, the pupil is eligible for a reevaluation of his/her school advice, but the advice is not necessarily adjusted. Making the school advice the main guiding principle for secondary school track placement is justified on the grounds of the teacher’s professional judgment. The teacher interacts every day with the child and knows more about the student’s background, difficulties, interests and capabilities. Unlike the test score, which is viewed as a snapshot of the pupil’s performance at a particular moment, the teacher’s assessment is held to reflect a more accurate picture of the pupil’s overall performance.
Furthermore, this policy change prevents secondary schools from rejecting pupils based on their final test scores (Lubbe, 2005). In this context, studies have shown that the teacher recommendation is an important element that reflects the pupils’ abilities and strongly predicts their future performance (Feron et al., 2016; Lenhard & Schröppel, 2014; Geert Driessen, 2006), which fulfills the meritocratic principle (Luyten & Bosker, 2004).
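The comparison rule described in this section can be sketched in a few lines of code. This is an illustration under stated assumptions, not the coding used in the thesis: the tracks are ranked in the order in which they are listed above, treating that listing as a strict ranking is an assumption, and combined advices such as vmbo-b/k or havo/vwo are left out. The function names are hypothetical.

```python
# Tracks from lowest to highest, in the order listed in the text (assumed strict ranking).
TRACKS = ["vmbo-b", "vmbo-k", "vmbo-g", "vmbo-t", "havo", "vwo"]
RANK = {track: level for level, track in enumerate(TRACKS)}


def compare_advice(school_advice, test_advice):
    """Classify a pupil by comparing the school advice with the test advice."""
    school_level, test_level = RANK[school_advice], RANK[test_advice]
    if school_level < test_level:
        return "under-advised"   # school advice lower than test advice
    if school_level > test_level:
        return "over-advised"    # school advice higher than test advice
    return "matched"


def eligible_for_reevaluation(school_advice, test_advice):
    """A pupil is eligible only when the test advice exceeds the school advice."""
    return compare_advice(school_advice, test_advice) == "under-advised"


for school, test in [("havo", "vwo"), ("vwo", "havo"), ("vmbo-t", "vmbo-t")]:
    print(school, test, compare_advice(school, test),
          eligible_for_reevaluation(school, test))
```

Eligibility does not imply adjustment: the function only flags pupils whose school advice may be reconsidered, in line with the rule that the advice is not necessarily changed.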
2.2 Inequality in the school advice
The preeminence of the school advice, however, has raised concerns regarding inequality of opportunity when it was shown that some groups of kids are more likely to receive a school advice that is lower than their test advice and thus be eligible for a reevaluation. This concerns differences between children with low- and high-educated parents, between kids with low- and high-income parents, differences according to the level of urbanization and differences between children with and without a migrant background (Inspectorate of Education, 2014, 2017, 2018, 2019). For the same final test score, children of less-educated parents are much more likely to receive a recommendation lower than their test advice compared to children with highly educated parents. The higher probability of under-advice for children of low-educated parents has been amply documented in the Dutch literature (Claassen & Mulder, 2003; de Boer et al., 2010; A. Timmermans et al., 2013; Timmermans et al., 2015). Children of low-income parents are also more likely to receive a school advice lower than the test advice (Feron et al., 2016). Pupils with a migrant background are as likely as those without a migrant background to have their school advice reevaluated when the income and the level of education of parents are controlled for (Swart et al., 2019). There is thus little evidence for the existence of an ethnicity or racial effect, and socioeconomic effects appear dominant (Inspectie van het onderwijs, 2014). Differences also exist according to the level of urbanization. Pupils in less urbanized areas have a higher chance of receiving a school advice that is lower than their test advice compared to pupils living in large cities (Inspectorate of Education, 2017). One of the explanations put forward is that in non-urban areas the supply of higher secondary school types is limited, which makes schools more likely to give an advice lower than the test advice (Oomens et al., 2019).
Hypothesis 1: parental education, parental income and degree of urbanicity are expected to be important predictors of under-advice. We do not expect a higher chance of under-advice for pupils with a migrant background, once parental education and income are controlled for.
Only about a third of pupils eligible for a reevaluation have their school advice adjusted (Swart et al., 2019). Children with a non-western migrant background have a higher chance of adjustment compared with those with no migrant background. The chance of adjustment is even higher for children of non-western migrant origin with highly educated parents. On the other hand, pupils of low-educated parents, pupils of low-income parents and children in less urbanized areas are less likely to get an adjustment of the school advice compared with kids of high-educated parents, kids of high-income parents and kids in urban areas, respectively.
Hypothesis 2: we expect a lower probability of readjustment for pupils of low-educated parents, pupils of low-income parents and kids in low urbanized areas but a higher readjustment probability for pupils with a non-western migrant background.
The gap between teacher assessment and test score is not particular to the Netherlands. Studies have demonstrated the existence of disparities between the teacher’s subjective assessment of different groups of students with similar academic profiles. Given the same performance, boys are often graded less favorably than girls (Cornwell et al., 2013; Driessen & Langen, 2011; Feron et al., 2016; Lavy, 2008; Reeves et al., 2001). Students from racial and ethnic minorities with the same academic performance as those from racial and ethnic majorities are more likely to get a lower teacher’s recommendation for secondary school tracks. This is the case for Portuguese students in Luxembourg (Klapproth et al., 2012) and for Turkish and Italian students in Germany (Kristen, 2002). In England, black Caribbean pupils more often than white pupils receive a grade from the teacher that is lower than their final test score (Burgess & Greaves, 2013). Nevertheless, results regarding the effect of ethnic background on teacher recommendation remain mixed, as some studies find no effect (Arnold et al., 2007; Driessen et al., 2008a) while others report a positive one (Driessen, 1990; Kristen & Dollmann, 2009). Timmermans et al. (2015) point out that the effect of migrant background on teacher recommendation in the Netherlands has changed over time, which could explain the inconsistent findings of different studies. In the ‘80s, teacher recommendation was favorable towards pupils with a migrant background compared to natives, due to positive discrimination from teachers or to the high aspirations of immigrant parents for their kids. Claassen & Mulder (2003) then show that this trend reversed around 2000, while Driessen et al. (2008a) find that the gap in the teacher’s advice between pupils with and without migrant backgrounds no longer exists. Other studies also reveal substantial variation across schools and classrooms, with some teachers overestimating and others underestimating pupils, resulting in a null average impact.
Still, the evidence for inequality in school advice is most consistent with respect to socioeconomic status. Research in Flanders and France shows that pupils from a low socioeconomic background are more likely to get a recommendation for lower-level educational tracks than pupils from a higher socioeconomic background, even with equal performance (Boone & Van Houtte, 2013; Kieffer, 2004). In Switzerland, participants were asked to make tracking decisions for fictitious pupils with the same grades but different socioeconomic status (SES)⁴. Results show that higher tracks were advised more often to children from high SES than to kids from low SES, even for pupils who are borderline cases for higher tracks (marginally below the threshold to follow a higher track) (Batruch et al., 2019). This result could be interpreted as offering high SES students the benefit of the doubt while denying it to low SES students, leading to the reproduction of social inequality (DiTomaso, 2015). There is in fact abundant literature on the role of school in reproducing social inequality based on the work of Bourdieu and Boudon (Boudon, 1974), arguing that the school, through teaching style and curriculum, promotes the culture and the ideology of the upper class, which limits the success of lower-class students (de Boer et al., 2010).
2.3 Literature on the objective-subjective assessment gap
We distinguish between three main processes in the literature when seeking to explain the disparity in teacher track recommendation. First, one strand of literature focuses on the students themselves and the reasons that make their scores differ from their daily performance in class. Second, a large body of literature examines the role of teacher bias. The third mechanism is that teachers base their track recommendations on their evaluation of some attributes they view as important for future success, and these traits happen to be correlated with background characteristics, thereby leading to inequality in school advice.
4 In this study, socioeconomic status is depicted through names (Louis vs. Bryan), parental occupation (mother: director of marketing vs. waitress) and extracurricular activity (traveling to London vs. local amusement park).
2.3.1 Role of the pupil
Burgess & Greaves (2013) suggest that students with low socioeconomic status and some ethnic groups might take school tests more seriously because of cultural reasons or because they perceive the rate of return of the test to be higher than other groups do. Indeed, Connor et al. (2001, 2004) suggest that students from disadvantaged social classes expect higher beneficial outcomes from education and that minority ethnic students and their parents value education more highly than white students and view it as crucial to achieve upward social mobility. A different explanation relies on the theory of stereotype threat, which states that people in a group subject to stereotypes and social stigma may underachieve in tests because they get stressed and concerned about confirming the stereotypes about them, which negatively affects their performance (Steele & Aronson, 1995). Applying this theory to explain why some groups of children receive a lower school advice than the test advice suggests that they deal with stereotype threat in their everyday interactions with the teacher in class but not during the final test. However, feeling at risk of confirming stereotypes throughout the whole school year but not during the final test does not seem like a plausible scenario (Burgess & Greaves, 2013).
2.3.2 Teacher bias
Another possibility is that, consciously or unconsciously, teachers have inaccurate or biased expectations towards some groups of children. Geven et al. (2018) distinguish between statistical and taste-based teacher bias. We speak of statistical bias when the teacher holds high expectations of children from a certain group (for instance children of highly educated parents) because, on average, this group truly has a higher achievement level compared to other groups. Statistical bias can take place in a rational decision-making process when the teacher has doubts about the child’s abilities and the average group achievement level appears to be a good proxy for that individual pupil’s competencies. Statistically biased recommendations do reflect the actual average achievement level of the group, but for an individual pupil within that group the teacher’s evaluation can be inaccurate.
Taste-based bias, on the other hand, occurs when the teacher systematically under- or over-evaluates certain pupils just because they belong to a certain group. In this case, we talk about discrimination based on stereotypes or simply due to a personal preference of the teacher. Glock et al. (2013) show that teachers have stereotypical expectations about Turkish students, who are overrepresented in the lower school tracks in Germany. Giving teachers information that confirmed the stereotypes about Turkish students activated the stereotypical expectations of the teachers, which led them to biased judgments. Results from other studies reveal that, controlling for academic achievement, teachers have biased judgments and hold lower expectations for ethnic minorities (Hughes et al., 2005; McKown & Weinstein, 2008; Ready & Wright, 2011; Riley & Ungerleider, 2012; Shepherd, 2011), for students with lower socioeconomic status (Auwarter & Aruguete, 2008; Ready & Chu, 2015; A. C. Timmermans et al., 2015; Tobisch & Dresel, 2017) and for boys, especially in literacy (Holder & Kessels, 2017). The evidence is most consistent in showing that teachers have low expectations of students from low socioeconomic status (Wang et al., 2018). Most Dutch studies investigate the presence of teacher bias by looking at the extent to which other background variables predict teacher recommendation above and beyond student scores (Boer et al., 2006; de Boer et al., 2010; Luyten & Bosker, 2004). Teacher recommendation was found to be associated with variables other than school performance, suggesting the presence of a bias. The bias is in favor of kids of high SES, girls, and pupils whose parents have high aspirations for them. Nevertheless, the extent of the teacher bias outlined in these studies needs to be interpreted with care because calculating teacher bias is not straightforward. Other elements that determine teacher expectations might be overlooked when estimating the bias.
2.3.3 Pupils’ attributes
When asked about the determinants of the school advice, teachers do not explicitly mention socioeconomic status, gender, or ethnicity. Oomens et al. (2019) show that for 531 schools interviewed in 2018 in the Netherlands, the school advice is mainly based on test scores of the pupils in previous groups (98%), behavioral characteristics (97%), test results in group 8 (89%), care record of the child (78%) and home situation of the student (46%). Moreover, 56% of primary schools affirm that parents often pressure the teachers to issue a higher recommendation for their kids and that the pressure coming from high-educated parents is especially apparent when it comes to school advice adjustment. Another study, conducted by the Dutch Inspectorate of Education among 118 primary schools, finds that all schools base their advice on information from the student tracking system and on behavioral characteristics such as motivation, perseverance and concentration. In addition, almost all schools indicate that they take into account the role of a stable home and the involvement and support of parents (Inspectie van het onderwijs, 2014). Looking more closely at how teachers formulate their recommendations, it was found that when the test results and academic performance clearly point to a particular track, the school advice goes along. However, when the cognitive performance of the pupil does not clearly match a certain level, and the teacher has doubts, the school advice is based on the subjective assessment of factors such as motivation, homework attitude, family situation, study skills, medical details and how the teacher views the possibilities for support at home (Inspectie van het Onderwijs, 2018). The assessment in this case is hardly based on any specific agreed-on procedure. This means that the outcome of the pupil is largely dependent on the subjective opinion and the implicit assumptions of the person assessing them, which might be subject to bias.
This leads Geven et al. (2018) to hypothesize that teachers base the school advice on personality traits they believe to be factors of success and that these traits happen to be associated with certain groups of pupils, leading to inequality in school advice. Indeed, Boone & Van Houtte (2013) reveal that primary school teachers in Flanders take into account self-reliance, capacity to plan and punctuality in formulating their school advice, and that these traits are more characteristic of pupils from high socioeconomic status. A qualitative study based on interviews with 50 randomly selected primary school teachers in Flanders finds that, instead of basing the tracking decision on pre-defined rational criteria, most teachers rely on data collected intuitively pertaining to the pupils’ non-cognitive skills, in particular student motivation, work ethic and behavioral engagement in the classroom (Vanlommel & Schildkamp, 2018). Using data on 5316 children in grade 6 in Dutch primary schools, Timmermans et al. (2016) show that, given the same academic record, teachers’ expectations are higher for students perceived as having superior work habits and higher self-confidence, and that teachers tend to associate these characteristics more with girls than with boys. In other studies, the discrepancy in the teacher’s assessment of girls and boys becomes insignificant when the teacher’s evaluation of behavior (Bennett et al., 1993) and non-cognitive skills (Cornwell et al., 2013) are controlled for. Non-cognitive differences can thus explain the higher chance of boys attending a secondary school track lower than their test advice (Driessen & Langen, 2011; Korpershoek, 2016). Using data on 7883 pupils from the Dutch Primary Education (PRIMA) study of 2002/2003, Geert Driessen et al. (2008) find that, controlling for school performance and background characteristics, pupils with a better study attitude, higher self-confidence and those who display greater effort get a slightly higher school advice.
In the Netherlands, a CPB report reveals that even before children go to school, inequalities can be detected in work attitude and behavior with respect to socioeconomic status, migrant background and gender (Zumbuehl & Dillingh, 2020). Based on questionnaires and administrative data from de Onderwijs Monitor Limburg (OML), Borghans et al. (2018) explore the role of non-cognitive skills in explaining the gap in educational outcomes (final test score, school advice) between pupils from low and high socioeconomic backgrounds. 8th-grade pupils from high socioeconomic backgrounds have on average a higher test score and a higher school advice than pupils from low socioeconomic backgrounds. Here, socioeconomic status is measured by the education level of parents.
Non-cognitive skills were captured by a number of statements, and the children themselves or their parents indicated on a 5-point scale the degree to which each statement describes them. Results indicate that there is an association between socioeconomic status and non-cognitive skills and that the level of non-cognitive skills is correlated with the school advice. Compared with children with high socioeconomic status, children from low socioeconomic backgrounds score on average higher in neuroticism and concentration problems and lower in conscientiousness, openness to experience, collaboration, persistence, school motivation and work attitude. Furthermore, scoring one level higher on the 5-point scale for conscientiousness, openness, collaboration and achievement orientation is associated, ceteris paribus, with a 3%, 7%, 8% and 7% higher school advice, respectively. Scoring one point lower in neuroticism is associated with a 4% higher school advice. Extraversion was found to be weakly negatively correlated with school advice, and no clear link was established between extraversion and socioeconomic status.
Using data from the sixth measurement of PRIMA for the school year 2004/2005 and the first measurement of COOL5-18 in 2007/2008⁵ for 78348 pupils, Timmermans et al. (2013) perform a regression analysis for each school advice category to identify variables that predict the degree of under- and over-advice within that category. The dependent variable indicates the extent to which the school advice deviates from the test advice, either positively (over-advice) or negatively (under-advice). Within most advice categories, several variables appear to be significant predictors of under- and over-advice, namely cognitive capacities, educational level of parents, parents’ involvement, motivation of the pupil, self-efficacy and the teacher’s evaluation of the student’s performance compared to his or her classmates (‘vergelijking prestaties’). Contrary to expectation, higher cognitive capacities are found to be associated with more under-advising when looking within each school advice category. Considering the whole sample, however, there is a positive association between cognitive abilities and over-advice. This shift in the sign of cognitive abilities is explained by the fact that within a particular school advice (say HAVO), pupils who are over-advised have lower cognitive abilities than pupils who are under-advised. The pupil’s motivation and self-efficacy seem to be significant predictors only for the advice category HAVO and higher. Variables on the student’s personality traits, such as autonomy, work attitude and popularity in the classroom, are not systematically associated with under- and over-advice within the different advice categories.
5 In our study, we use the third measurement of COOL5-18 for the school year 2013/2014
Hypothesis 3: unfavorable non-cognitive skills and a lack of parental involvement are associated with a higher chance of under-advice. We expect the under-advice gap to be attributable to differences in non-cognitive skills to some extent.
Overall, the literature reveals that there is a gap in the probability of being under-advised between pupils with different socioeconomic status, between girls and boys and between pupils in low and high urbanized areas. Some studies investigate the possibility of teacher bias while other studies try to identify factors that could explain this gap. These factors mainly include behavioral characteristics and non-cognitive skills (work habits, self-confidence, effort, motivation, persistence, conscientiousness, collaboration, etc.), home situation and parental involvement. By decomposing the probability of under-advice into an explained and an unexplained part, we identify at the same time the extent to which under-advising can be attributed to differences in observed characteristics (explained part) and the extent to which the under-advice gap results from differences in returns to characteristics (unexplained part).
The latter means that for two groups of pupils, girls and boys or pupils from low and high SES for instance, the teacher would give different recommendations even though they have exactly the same observed characteristics, which suggests the presence of a bias. The extent of the unexplained part depends on whether or not we include all variables that are important predictors of under-advice. Of course, we cannot claim that we have included all essential factors, but we do include several variables that have been shown to be important predictors of under-advice.
For the analysis, this paper uses data from the third round of the cohort study COOL of 2013/2014, linked with CBS microdata. The COOL study follows students aged 5 to 18 to track their educational careers and measure their cognitive development, socio-emotional development and social skills. These data are collected using tests and questionnaires administered to teachers, students and their parents. The final COOL sample consists of a representative part and an additional part of disadvantaged schools⁶. In this paper we limit the COOL dataset to pupils who were in 5th grade (aged 9) and we use variables measuring their cognitive and non-cognitive skills. These data are then merged with CBS microdata comprising the school advice and the test advice for the same pupils at age 12, along with background variables. The final sample consists of 5197 pupils from 311 schools.
3.2.1 Outcome variables
We use two main outcome variables. The first one, under-advice, is a binary variable indicating under-advice when the school advice is two or more tracks lower than the test advice⁷. When pupils are under-advised, they are eligible for a reevaluation, which can lead to a readjustment of the school advice. However, not all under-advised pupils have their school advice readjusted. We therefore use a second outcome variable, persistent under-advice, to account for the effects of the readjustment in our analysis. This binary variable indicates under-advice when the final school advice⁸ is still two or more levels lower than the test advice.
The school advice and test advice in our dataset consist of nine advice categories as follows:
• Vmbo basis (vmbo-b)
• Vmbo basis/kader (vmbo-b/k)
• Vmbo kader (vmbo-k)
• Vmbo kader/gemengde theoretische (vmbo-k/gt)
• Vmbo gemengde theoretische (vmbo-gt)
• Vmbo gemengde theoretische/havo (vmbo-gt/havo)
• Havo
• Havo/vwo
• Vwo
6 The additional sample of disadvantaged schools is chosen based on the school score indicating the social ethnic composition of the student population in a school. Additional schools with a higher score are added to the representative sample.
7 By definition, a pupil is under-advised when the school advice they get is lower than their test advice. This is the case for instance with a vwo test advice and a havo/vwo school advice. Other papers however use a stricter definition of under-advice: the school advice is two or more levels lower than the test advice (Timmermans et al., 2013). If a pupil receives a vwo test advice and a havo/vwo school advice, one could argue that this pupil is not clearly under-advised since the school advice is combined. We thus define under-advice as a school advice that is two or more tracks lower than the test advice, in order to consider pupils who are clearly under-advised. An example would be a test advice of vwo and a school advice of havo.
8 The final school advice is the school advice after a potential readjustment.
Table 1 shows the frequencies and percentages of each advice category for the test and school advice. The most common test advice is vwo (19%) and the most common school advice is vmbo-gt (20%). Table 2 shows the number of pupils who are under-advised and persistently under-advised. 31% of under-advice cases and 30% of persistent under-advice cases take place within the vmbo-gt level. Note that due to ceiling effects, pupils with a vwo or a havo/vwo school advice cannot be under-advised. This is because the two highest test advices a pupil can get are havo/vwo and vwo. Thus, no matter how high the score on the final test is, a vwo or a havo/vwo school advice cannot be two or more levels lower than the test advice. Comparing the numbers of under-advised pupils with those persistently under-advised shows that 241 pupils (4.6% of the total sample) got their school advice readjusted⁹.
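The outcome definition and the ceiling effect described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for this thesis: the ordering of the nine categories (with havo and vwo as separate levels) and the names `ADVICE_LEVELS`, `is_under_advised` and `is_persistently_under_advised` are assumptions made for the example.

```python
# Assumed ordering of the nine advice categories, from lowest to highest track.
ADVICE_LEVELS = [
    "vmbo-b", "vmbo-b/k", "vmbo-k", "vmbo-k/gt", "vmbo-gt",
    "vmbo-gt/havo", "havo", "havo/vwo", "vwo",
]
RANK = {level: position for position, level in enumerate(ADVICE_LEVELS)}


def is_under_advised(school_advice, test_advice, min_gap=2):
    """True when the school advice is at least `min_gap` levels below the test advice."""
    return RANK[test_advice] - RANK[school_advice] >= min_gap


def is_persistently_under_advised(final_school_advice, test_advice):
    """Same rule, applied to the school advice after a potential readjustment."""
    return is_under_advised(final_school_advice, test_advice)


# Ceiling effect: school advices that can never be two levels below any test advice.
never_under_advised = [
    school for school in ADVICE_LEVELS
    if not any(is_under_advised(school, test) for test in ADVICE_LEVELS)
]
print(never_under_advised)  # ['havo/vwo', 'vwo']
```

With this rule, a vwo test advice combined with a havo school advice counts as under-advice, while a vwo test advice combined with a havo/vwo school advice does not.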
Table 1: frequencies of test and school advice per advice category
9 Readjusted to match the test advice or readjusted to be one track lower than the test advice.
Table 2: frequencies of test and school advice per advice category
3.2.2 Independent variables
The variables used to predict under-advice can be grouped into five categories: background variables, regional characteristics, school characteristics, relative cognitive skills and non-cognitive skills.
3.2.2.1 Background characteristics
• Parents’ educational level: we classify pupils into three groups according to the highest level attained by at least one of their parents: at most mbo 2, mbo 3-4/havo/vwo and hbo/wo¹⁰.
• Parents’ income: we use a continuous variable indicating percentile groups of the household’s disposable income, compared to all households in the Netherlands.
• Migrant background: we classify pupils into 4 categories: those with no migrant background, those with a western migrant background, those with a non-western migrant background 1st generation and those with a non-western migrant background 2nd generation.
• Gender: binary variable that takes the value 1 for females.
• Birth year: the year the pupil was born.
• Household type: binary variable that takes the value 1 if the household consists of a couple with kids and the value 0 otherwise.
10 Mbo means secondary vocational education. Pupils go to mbo after completing vmbo. Mbo has 4 levels: mbo1 (assistant training), mbo 2 (basic vocational training), mbo 3 (professional training) and mbo 4 (middle-management training).
• Kids in household: this variable indicates the number of kids living in the household.
• Father’s age at birth & mother’s age at birth: these two continuous variables indicate the age of the father and the mother at the time the pupil was born.
• Special care: a binary variable indicating whether or not the pupil needs special care. This is the case when the pupil has a physical or mental disability or when he/she has learning problems.
• Parents’ involvement: continuous variable reported by the teacher on the parents’ level of involvement in school and in their support for the child’s learning process, on a scale of 1 to 5¹¹.
Parental education, parental income and migrant background are also coded into binary variables so that they can be used as group variables for the Blinder-Oaxaca decomposition. Table 3 shows how variables are coded and indicates the 0 and 1 values of the binary variables.
Table 3: summary statistics for background variables
11 A score of 5 means very high involvement.
3.2.2.2 Regional characteristics
• Province: the province where the pupil lives.
• Degree of urbanicity: we categorize the regions where pupils live into five categories according to the level of urbanicity. We also use a binary variable that takes the value 1 for highly urban areas.
Table 4: summary statistics for regional variables
3.2.2.3 School characteristics
• School score: the school score provides an indication of the socio-ethnic composition of a school's student population; the higher the score, the more diverse the socio-ethnic composition of the school.
• Denomination: the ideological vision on which the school is based.
• Type of final test: the type of final test that the pupil has taken in the last year of primary school.
Table 5: summary statistics for school variables
3.2.2.4 Cognitive skills
Our dataset includes test scores on language and math in grade 5 that are part of the students’ tracking system (leerling- en onderwijsvolgsysteem, or LOVS for short) and scores on the non-school cognitive capacities test (nscct) in grade 5. These tests were thus taken 3 years before the school advice. We assume that the cognitive abilities of pupils remained stable over time. The LOVS tests assess vocabulary (woordenschat), technical reading (Drie-Minuten-Toets), reading comprehension (begrijpend lezen) and math (rekenen/wiskunde). The nscct aims to estimate the pupil’s learning potential. It consists of five parts, namely figure composition, exclusion, series of numbers, categories and analogies.
Table 6: summary statistics for cognitive skills
For the analysis, we convert cognitive skills into deciles within each school advice category. The position of the pupil in these tests thus tells us about the performance of the pupil compared to other pupils with the same school advice, and not compared to all pupils in the sample. This helps us detect higher-ability pupils within vmbo-b, for instance, and lower-ability pupils within vwo. If we categorized all pupils in the sample into deciles according to their scores on cognitive tests, then all vmbo pupils would be ranked lower than vwo pupils and we would not be able to detect pupils with high and low cognitive abilities within each advice category. This choice is furthermore justified by the existence of non-linearities in the evaluation of pupils’ cognitive skills. For the teacher to move a pupil from vmbo-b to vmbo-k, a slight improvement in math might be enough, but for the transition from havo to vwo a larger improvement might be required. We call this variable relative cognitive skills.
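The following Python sketch illustrates what ranking within advice categories does, compared with ranking over the pooled sample. It is only an illustration under stated assumptions: the function name `within_group_deciles`, the toy scores, and the rule for ties (broken by position in the data) are hypothetical and not taken from the COOL data or from the code used in this thesis.

```python
from collections import defaultdict


def within_group_deciles(scores, groups):
    """Return a decile (1..10) per score, ranked only against scores in the same group.

    Ties are broken by position in the input, which is an assumption of this sketch.
    """
    members = defaultdict(list)
    for index, group in enumerate(groups):
        members[group].append(index)

    deciles = [0] * len(scores)
    for indices in members.values():
        ordered = sorted(indices, key=lambda i: scores[i])
        size = len(ordered)
        for rank, index in enumerate(ordered):
            deciles[index] = rank * 10 // size + 1
    return deciles


# Toy data: ten vmbo-b pupils with low scores and ten vwo pupils with high scores.
scores = list(range(1, 11)) + list(range(101, 111))
advice = ["vmbo-b"] * 10 + ["vwo"] * 10

relative = within_group_deciles(scores, advice)           # ranked within advice category
pooled = within_group_deciles(scores, ["all"] * len(scores))  # ranked over the whole sample

print(relative[:10])  # vmbo-b pupils span deciles 1 to 10
print(pooled[:10])    # the same pupils all fall in the bottom half
```

In the pooled ranking every vmbo-b pupil ends up in the lower deciles, whereas the within-category ranking still separates the stronger from the weaker pupils in each track.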
3.2.2.5 Non-cognitive skills
The COOL dataset includes measures of some non-cognitive abilities relating to the pupil’s behavior, relationships with the teacher and other students, and motivation. On a scale of 1 to 5, the teacher evaluated each pupil’s behavior, working attitude, popularity in class and performance relative to real abilities¹². The teacher also assessed the teacher-student relationship in terms of dependence, conflict and closeness. A detailed description of non-cognitive skills can be found in appendix A-1.
Table 7: summary statistics for non-cognitive skills
12 The variable performance indicates to what extent the teacher considers the pupil’s performance at school to reflect the pupil’s skills and abilities. A higher score on performance indicates that the pupil is doing his/her best.
3.3 Descriptive statistics
Figure 1 shows the raw differences in the under-advice probability between groups of pupils by gender, parents’ education, migrant background, parents’ income and level of urbanicity. Females, pupils of low-educated parents, pupils of low-income parents and pupils in low urban areas are on average more often under-advised than males, pupils of high-educated parents, pupils of high-income parents and pupils in high urban areas respectively. The differences by parents’ education and degree of urbanicity are statistically significant at the 5% significance level. Pupils from a non-western migrant background are on average slightly more often under-advised than pupils without a migrant background, but they are less likely to get a persistent under-advice. This means that pupils from a non-western migrant background are more likely to get their school advice adjusted than pupils without a migrant background.
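A raw gap in under-advice rates between two groups can be tested in several ways, and the text does not state which test underlies the significance levels reported here. As a hedged illustration, the sketch below implements a standard two-sided two-proportion z-test; the function name and the counts are hypothetical and do not come from the sample.

```python
import math


def two_proportion_z_test(events_1, n_1, events_2, n_2):
    """Two-sided z-test for equality of two proportions, using the pooled variance."""
    p_1, p_2 = events_1 / n_1, events_2 / n_2
    pooled = (events_1 + events_2) / (n_1 + n_2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / standard_error
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 180 of 1000 pupils under-advised in one group, 130 of 1000 in the other.
z, p_value = two_proportion_z_test(180, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A p-value below 0.05 would correspond to a gap that is significant at the 5% level in the sense used above.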
Figure 1: Percent under-advice by background variables
Figures 2, 3, 4, 5 and 6 display box plots describing the distribution of non-cognitive skills and cognitive skills between the different groups of pupils¹³. We want to see whether there are differences between different groups of pupils in cognitive and non-cognitive skills that could justify the systematic under-advice. The distribution of cognitive skills is shown within each school advice category. We show the box plots separately for the math and language tests taken in grade 5, and for the total score on non-school cognitive tests. For all the cognitive tests, the box plots move to the right as the school advice level gets higher, which means that on average, pupils in higher advice categories have higher cognitive skills.
Differences are most apparent by migrant background (figure 2): in every school advice category, the distribution of cognitive skills of pupils from a migrant background is to the left of that of pupils without a migrant background. This reveals that, in each track, pupils with a migrant background score systematically lower in math, language and non-school cognitive tests than pupils without a migrant background. Taking the language test for instance, in each advice category, more than 75 percent of the pupils without a migrant background scored higher than the median score of pupils with a migrant background. A possible explanation for these discrepancies is that children from non-western families do not typically speak Dutch at home or with their parents. Differences are less pronounced when we consider non-cognitive skills, but the distribution for pupils with a migrant background displays higher variation, resulting in lower means for them. However, pupils with a migrant background score on average better in self-efficacy and motivation (see Appendix A-2 for the differences in means of cognitive and non-cognitive skills between groups of pupils).
13 Outliers are removed from the box plots to avoid distorting the scale.
The distributions of cognitive and non-cognitive skills of pupils with low- and high-educated parents are fairly similar, except that they are wider for pupils of low-educated parents, indicating higher variation (see figure 3). As a result, the average scores of pupils of high-educated parents are slightly higher than those of pupils of low-educated parents. Only for motivation and relation with the teacher do pupils with a migrant background score higher on average.
The box plots for cognitive skills of pupils of low-income parents are slightly to the left of those of pupils of high-income parents, especially in language and non-school cognitive skills, which means they perform relatively poorly on these tests compared to pupils of high-income parents. The non-cognitive skills of pupils of low-income parents are skewed towards lower scores, which makes their average lower than that of pupils of high-educated parents, except in self-efficacy and motivation.
Females score on average lower in math and higher in non-school cognitive skills and non-cognitive skills relative to males. Pupils in low urban areas score on average higher in cognitive and non-cognitive skills than pupils in high urban areas.
Figure 2: box plots of cognitive and non-cognitive skills by migrant background
Figure 3: box plots of cognitive and non-cognitive skills by parental education
Figure 4: box plots of cognitive and non-cognitive skills by parental income
Figure 5: box plots of cognitive and non-cognitive skills by gender
Figure 6: box plots of cognitive and non-cognitive skills by urbanization
4 Empirical strategy
First, we perform an OLS regression to determine the predictive power of our independent variables for under-advice. The aim is to identify which variables are significantly associated with under-advice. Our regression of interest can be written as:
𝑦 = 𝑋𝛽 + 𝜀
Where 𝑋 is a vector of our independent variables, 𝛽 is the vector of the corresponding coefficients plus a constant and 𝜀 is the error term.
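Because the outcome is binary, this OLS regression is a linear probability model, so each coefficient reads as a change in the probability of under-advice. The sketch below illustrates the simplest case, a regression on a constant and one binary regressor, where the slope equals the difference in under-advice rates between the two groups. The data and the helper `ols_simple` are hypothetical and serve only as an illustration.

```python
def ols_simple(x, y):
    """OLS of y on a constant and one regressor; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical toy sample: 4 girls (1 under-advised) and 6 boys (1 under-advised).
female = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
under_advice = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]

intercept, slope = ols_simple(female, under_advice)
print(intercept)  # under-advice rate of boys, 1/6
print(slope)      # difference in rates between girls and boys, 1/4 - 1/6
```

With more regressors the same logic applies conditionally: a coefficient is the change in the under-advice probability associated with a one-unit change in the regressor, holding the other variables constant.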
Since our predictors can be grouped into 5 categories, namely background variables, regional characteristics, school characteristics, cognitive skills and non-cognitive skills, we use the Lubotsky & Wittenberg (2006) estimator to summarize the information from the coefficients of all variables within each category¹⁴. This method includes all variables in the regression and constructs a weighted sum of the coefficients of each variable within a given category. We end up having a single coefficient for each category that summarizes the information from all the variables within that category. The statistical significance associated with each coefficient of a category represents the joint significance of the variables within that category. We use this method as a way of simplifying the interpretation and visualization of the results. An explanation of how the method works following Lubotsky & Wittenberg (2006) can be found in Appendix A-3.
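As a rough illustration of the weighted sum, the sketch below assumes the commonly cited form of the Lubotsky & Wittenberg (2006) index, in which each coefficient is weighted by the covariance of its variable with the outcome relative to that of a chosen base variable. Appendix A-3 remains the reference for the method as applied in this thesis; the function name and the numbers are hypothetical.

```python
def lubotsky_wittenberg_index(coefficients, cov_with_outcome, base=0):
    """Weighted sum of coefficients: sum_j b_j * cov(y, x_j) / cov(y, x_base).

    Assumed form of the estimator; `base` selects the variable used for normalization.
    """
    base_cov = cov_with_outcome[base]
    return sum(b * cov / base_cov for b, cov in zip(coefficients, cov_with_outcome))


# Hypothetical category with two variables.
coefficients = [0.2, 0.1]       # OLS coefficients from the full regression
cov_with_outcome = [0.5, 0.25]  # cov(y, x_1), cov(y, x_2)

index = lubotsky_wittenberg_index(coefficients, cov_with_outcome)
print(round(index, 6))  # 0.2 * 1.0 + 0.1 * 0.5 = 0.25
```

The choice of base variable only rescales the index, which is why the summary coefficient of a category is interpreted in units of that variable.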
14 To account for the sampling error we bootstrap using 6000 repetitions.
Our main goal is to determine how much of the gap in the probability of getting a reevaluation between the different groups of pupils can be explained by differences in observable characteristics and how much of the gap remains unexplained. We therefore perform the Blinder-Oaxaca decomposition, which divides the gap between two groups into a part that is attributable to differences in known characteristics (background characteristics, regional variables, school characteristics, cognitive and non-cognitive skills) and a residual part that is due to differences in returns to characteristics. This method is often used to analyze labor market outcomes between men and women, where the unexplained part is interpreted as discrimination. We need to be careful with this interpretation because the unexplained part can also reflect the effects of differences in unobserved variables. We use the two-fold decomposition based on pooled linear parameter estimates first proposed by Oaxaca & Ransom (1994). For standard errors, we use the bootstrap technique with 6000 repetitions. Suppose there are two groups A (pupils with high-educated parents) and B (pupils with low-educated parents) that are mutually exclusive, an outcome variable Y (under-advice) and a list of independent variables.
Allow the model to be linear and the regression coefficients to differ for both groups of interest:
$$Y_j = X_j\beta_j + \varepsilon_j, \quad E[\varepsilon_j] = 0, \quad j \in \{A, B\}$$
Where X is a vector of independent variables and a constant, 𝛽 is a vector containing slope parameters and the intercept and 𝜀 is a normally distributed disturbance term with zero mean. The difference in the average outcome between the two groups, which is the raw gap in under-advice is:
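$$\Delta = E[Y_A] - E[Y_B]$$
$$\Delta = E[X_A\beta_A + \varepsilon_A] - E[X_B\beta_B + \varepsilon_B]$$
$$\Delta = E[X_A\beta_A] + E[\varepsilon_A] - E[X_B\beta_B] - E[\varepsilon_B]$$
$$\Delta = E[X_A]\beta_A - E[X_B]\beta_B$$
because $E[\varepsilon_j] = 0$ and $E[\beta_j] = \beta_j$.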
The estimated decomposition is:
$$\hat{\Delta} = \bar{X}_A\hat{\beta}_A - \bar{X}_B\hat{\beta}_B$$
By adding and subtracting $\bar{X}_B\hat{\beta}_A$, the counterfactual mean of under-advice that pupils in group B would have had under the coefficient structure of group A, the equation can be rearranged as follows:
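$$\hat{\Delta} = \bar{X}_A\hat{\beta}_A - \bar{X}_B\hat{\beta}_B + \bar{X}_B\hat{\beta}_A - \bar{X}_B\hat{\beta}_A$$
$$\hat{\Delta} = (\bar{X}_A - \bar{X}_B)\hat{\beta}_A + \bar{X}_B(\hat{\beta}_A - \hat{\beta}_B)$$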
The first term of the decomposition, $E = (\bar{X}_A - \bar{X}_B)\hat{\beta}_A$, represents the endowments effect and reflects mean differences in the independent variables between the two groups. For instance, this would be the case if pupils of high-educated parents have on average higher cognitive and non-cognitive skills than pupils of low-educated parents, which contributes to explaining the gap in the probability of getting a reevaluation. Since the above decomposition is defined from the perspective of group B, the endowments effect calculates the expected change in the mean outcome for group B if group B had the same predictors as group A.
The second component, $C = \bar{X}_B(\hat{\beta}_A - \hat{\beta}_B)$, the coefficients effect, reflects differences in the coefficients between the two groups. This element measures, for group B, the differences in returns to characteristics due to membership in this group: what is the expected change in group B’s mean outcome if group B had the same coefficients (returns to characteristics) as group A? Differences in returns to characteristics are present when the two groups are evaluated differently by the teacher despite having the same skills and background characteristics. This can be interpreted as a bias of the teacher towards one group. However, one should be careful with the bias interpretation, because this component can also reflect the effect of unobserved characteristics not included in the model.
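The two components can be checked numerically. The sketch below is a minimal illustration on simulated data, under several assumptions: it uses a single regressor, it takes group A’s coefficients as the reference (as in the derivation above, rather than the pooled coefficients used for the main results), and it uses 200 bootstrap repetitions instead of 6000 to keep the example fast. All names and parameter values are hypothetical.

```python
import random
import statistics


def ols_simple(x, y):
    """OLS of y on a constant and one regressor; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def oaxaca_twofold(group_a, group_b):
    """Two-fold decomposition with group A's coefficients as reference.

    Each group is a list of (x, y) pairs. Returns (gap, explained, unexplained).
    """
    xa, ya = zip(*group_a)
    xb, yb = zip(*group_b)
    const_a, slope_a = ols_simple(xa, ya)
    const_b, slope_b = ols_simple(xb, yb)
    mean_xa, mean_xb = sum(xa) / len(xa), sum(xb) / len(xb)
    gap = sum(ya) / len(ya) - sum(yb) / len(yb)
    # Endowments effect: the constant has mean 1 in both groups, so only x contributes.
    explained = (mean_xa - mean_xb) * slope_a
    # Coefficients effect: difference in intercepts plus difference in slopes at B's mean.
    unexplained = (const_a - const_b) + mean_xb * (slope_a - slope_b)
    return gap, explained, unexplained


def simulate_group(rng, size, skill_mean, base_rate, skill_effect):
    """Simulated pupils: a skill score x and a binary under-advice outcome y."""
    rows = []
    for _ in range(size):
        x = rng.gauss(skill_mean, 1.0)
        probability = min(max(base_rate + skill_effect * x, 0.0), 1.0)
        rows.append((x, 1 if rng.random() < probability else 0))
    return rows


def bootstrap_se_explained(group_a, group_b, repetitions, rng):
    """Bootstrap standard error of the explained part, resampling within each group."""
    draws = []
    for _ in range(repetitions):
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        draws.append(oaxaca_twofold(resample_a, resample_b)[1])
    return statistics.stdev(draws)


rng = random.Random(0)
group_a = simulate_group(rng, 500, skill_mean=0.3, base_rate=0.12, skill_effect=-0.03)
group_b = simulate_group(rng, 500, skill_mean=-0.3, base_rate=0.17, skill_effect=-0.03)

gap, explained, unexplained = oaxaca_twofold(group_a, group_b)
se_explained = bootstrap_se_explained(group_a, group_b, repetitions=200, rng=rng)
print(f"gap = {gap:.4f}, explained = {explained:.4f}, unexplained = {unexplained:.4f}")
print(f"bootstrap s.e. of explained part = {se_explained:.4f}")
```

Because each group regression includes a constant, the explained and unexplained parts add up exactly to the raw gap, which is the identity derived above.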
The main results are from a linear Blinder-Oaxaca decomposition, but since both outcome variables are binary variables, we also report the results for the logistic decomposition as a robustness check.
5.1 Regression analysis results
Table 8 presents the contribution of predictors, individually and grouped, to the probability of being under-advised (columns 1 and 2) and of being persistently under-advised (columns 3 and 4). We report coefficients of the independent variables individually (columns 2, 4) and as groups using the Lubotsky & Wittenberg method (columns 1, 3). School characteristics, non-cognitive skills, relative cognitive skills, having low-educated parents and being female are highly significant predictors of under-advice (columns 1, 3). Better non-cognitive skills are associated with a lower under-advice probability, which is in line with hypothesis 3. This hypothesis also states that parental involvement is an important variable, but we find it to be significant only at the 10% significance level.
Relative cognitive skills are positively associated with under-advice, which might seem surprising at first, but this is due to the way we coded cognitive skills. Pupils with a havo test advice who got a vmbo-gt school advice, for instance (and are thus under-advised), have better cognitive skills than other pupils within the vmbo-gt school advice category¹⁵. That is why cognitive skills are positively associated with under-advice. Looking at the regression with the complete list of variables in column 2, significant predictors include gender, parents’ education, birth year, school denomination, and scores on math, reading comprehension and the non-school cognitive tests figure composition and exclusion. For non-cognitive skills, significant predictors include performance, behavior and working attitude.
Pupils in schools with a religious orientation are more likely to get under-advised than pupils in other types of schools. A high score on performance and working attitude and a low score on behavior are associated with a lower under-advice probability. Nevertheless, all these predictors only explain 6.2% of the variation in the variable under-advice. With persistent under-advice as the outcome variable (columns 3, 4), school characteristics, relative cognitive skills and parental education remain significant predictors. The significant cognitive skills are math, figure composition, categories and analogies. Just as with under-advice, performance and behavior are significant non-cognitive variables for persistent under-advice. Note that migrant background is positively associated with under-advice and negatively associated with persistent under-advice. This means that pupils with a migrant background have a higher chance of getting under-advised but are more likely to get a readjustment of the school advice, which agrees with our second hypothesis. However, migrant background is not a significant predictor of either under-advice or persistent under-advice. These regression results partially confirm our first hypothesis that parental education is an important predictor of under-advice, and migrant background is not. The degree of urbanicity and parental income are however not significant. Note that, controlling for parental education, parental income is positively associated with under-advice. An interpretation could be that, holding parental education constant, richer parents can afford to pay for final test preparation, which helps pupils get a higher score and thus a test advice that is higher than the school advice.
15 There is a high correlation between the score on the final test and the scores on cognitive tests taken 3 years before.
Table 8: Results of the regression analysis.
Under-advice Persistent under-advice
(1) (2) (3) (4)
-0.0288 (0.0191) Background
Migrant background Parents’ income
Special care Birth year
Mother’s age at birth Father’s age at birth
Kids in household Parents’ involvement
Degree of urbanicity Very urban
Moderately urban Slightly urban
Not urban Province
(0.0119) 0.0242 (0.0170) -0.00107 (0.0125)
(0.0122) 0.0242 (0.0185) -0.00107 (0.0124) 0.00796 (0.0154) 0.0109 (0.0153) 0.0245**
(0.0108) -0.00134 (0.00143) 0.000149 (0.00110) 0.00268 (0.00610) -0.0145*
(0.00842) -0.0290 (0.0283) -0.000536 (0.0363) 0.0243 (0.0368) -0.00740 (0.0498) 0.0962 (0.0746)
(0.00999) -0.00389 (0.0144) 0.00273 (0.0111)
0.0194 (0.0119) 0.0548***
(0.0100) -0.00389 (0.0166) 0.00273 (0.0110) -0.00353 (0.0133) 0.00824 (0.0136) 0.0129 (0.00953) -0.000841 (0.00126) 0.000591 (0.00100) 0.00723 (0.00561) -0.0111 (0.00714) -0.0254 (0.0255) 0.00221 (0.0336) 0.0238 (0.0313) 0.0109 (0.0406) 0.105 (0.0695)