Patterns of Development in Children’s Scientific Reasoning: Results from a Three-Year Longitudinal Study




Journal of Cognition and Development




To cite this article: Ard W. Lazonder, Noortje Janssen, Hannie Gijlers & Amber Walraven (2020): Patterns of Development in Children’s Scientific Reasoning: Results from a Three-Year Longitudinal Study, Journal of Cognition and Development, DOI: 10.1080/15248372.2020.1814293

To link to this article: https://doi.org/10.1080/15248372.2020.1814293

© 2020 The Author(s). Published with license by Taylor & Francis Group, LLC. Published online: 08 Sep 2020.



Patterns of Development in Children’s Scientific Reasoning: Results from a Three-Year Longitudinal Study

Ard W. Lazonder^a, Noortje Janssen^a, Hannie Gijlers^b, and Amber Walraven^a

^a Radboud University, Netherlands; ^b University of Twente, Netherlands

ABSTRACT

Scientific reasoning refers to the thinking skills involved in conceiving and conducting an investigation. This study examined how proficiency in performing these skills develops during the upper-elementary school years. A sample of 157 children (age 7–10) took a performance-based scientific reasoning test in three consecutive years. Four distinct developmental patterns emerged from their annual test scores, which were independent of prior domain knowledge and sociodemographic characteristics except gender. Developmental patterns in scientific reasoning and reading comprehension, but not math, were related such that many children with a high entry level or accelerated growth in scientific reasoning also performed better and progressed more in reading comprehension. These results indicate that scientific reasoning develops differently in same-age children, largely independent of personal characteristics but generally comparable with reading comprehension.

Introduction

Scientific reasoning refers to the set of thinking skills involved in conceiving and conducting an investigation, which includes making hypotheses or predictions, doing experiments, interpreting results, evaluating data characteristics and drawing conclusions (Pedaste et al., 2015). These skills start developing around age 4, mostly through play or by observing others (Akman & Güchan Özgül, 2015; Buchsbaum, Gopnik, Griffiths, & Shafto, 2011; Cook, Goodman, & Schulz, 2011) and are cultivated further in elementary education and beyond, where both direct instruction and inquiry-based teaching methods provide rich opportunities for children to advance in scientific reasoning (Sandoval, Sodian, Koerber, & Wong, 2014).

Studies investigating the early development of scientific reasoning have established what children of a certain age can and cannot do with minimal support (e.g., Köksal-Tuncer & Sodian, 2018; Schulz, Gopnik, & Glymour, 2007) and how they respond to instructional interventions (e.g., Lazonder & Kamp, 2012; Schalk, Edelsbrunner, Deiglmayr, Schumacher, & Stern, 2019). Research portraying children’s progress in scientific reasoning over the years is scarce and predominantly relies on cross-sectional designs to infer how scientific reasoning develops (e.g., Koerber, Mayer, Osterhaus, Schwippert, & Sodian, 2015). As longitudinal research more accurately captures developmental growth at both the group and the individual level, the present study tested a sample of upper-elementary children in three consecutive years. The study contributes to the literature by using a longitudinal design and practical inquiry tasks (instead of written tests) to determine whether and which differential developmental patterns exist within a group of peers.

Development of scientific reasoning

The cross-sectional study by Koerber et al. (2015) provided evidence that upper-elementary children gradually improve in scientific reasoning. Whether this developmental growth applies equally to all component skills remains unknown because Koerber et al. collapsed the scores obtained for separate scientific reasoning skills into a single outcome measure. Cross-sectional studies that did differentiate between component skills point to some interesting variations, which are summarized below.

The skill of hypothesizing or making predictions is difficult for elementary schoolchildren and hardly improves between first and sixth grade (Klahr, Fay, & Dunbar, 1993; Penner & Klahr, 1996; Piekny & Maehler, 2013). Few sixth-graders predict the outcomes of an experiment, and those who do often stick to a single prediction (Klahr et al., 1993) or propose known or plausible conjectures (Lazonder & Kamp, 2012). Designing experiments to test these predictions is easier and the skills involved develop earlier, usually around preschool age (Cook et al., 2011; Van der Graaf, Segers, & Verhoeven, 2018). Some studies concluded that developmental growth in experimentation levels off between first and seventh grade (Penner & Klahr, 1996; Piekny & Maehler, 2013) whereas other studies found either a significant linear increase across these grade levels (Veenman, Wilhelm, & Beishuizen, 2004), a nonsignificant increase (Croker & Buchanan, 2011; Kanari & Millar, 2004) or an upward curvilinear trend (Tschirgi, 1980).

Once data are collected, children have to interpret the results and consider data characteristics before conclusions can be drawn. Third-graders can interpret visual and numerical data (Krummenauer & Kuntze, 2019; Masnick, Klahr, & Knowles, 2017; Watson & Moritz, 1998), but make more mistakes than sixth-graders do (Klahr et al., 1993). Third-graders also demonstrate an intuitive awareness of the number and variability of observations when reasoning with data (English & Watson, 2013; Masnick & Klahr, 2003), but evaluate these characteristics less often and less proficiently compared to sixth-graders (Masnick & Morris, 2008). Data characteristics also affect children’s skill in drawing conclusions. While preschoolers can already draw correct conclusions from perfectly covarying data (Van der Graaf et al., 2018), children of all ages are ill versed in drawing conclusions based on imperfect and non-covariation evidence (Koerber, Sodian, Thoermer, & Nett, 2005; Piekny & Maehler, 2013).

Some of these cross-sectional findings have been substantiated and refined by longitudinal evidence. Piekny, Gruber, and Maehler (2014) corroborated that the skills of experimenting and drawing conclusions start developing at preschool age. Bullock and Ziegler (1999) additionally found a quadratic developmental trend in experimentation from grade 3 through 6 (cf. Tschirgi, 1980). Higher skill levels in grade 3 were associated with more rapid growth, and although third graders with low experimentation skills improved considerably in subsequent years, they never caught up with their initially more competent peers. Detailed developmental patterns for the remaining scientific reasoning skills have not been established in longitudinal research.

To conclude, the abovementioned findings indicate that the component skills of scientific reasoning develop with substantial variation. Part of this variability is likely due to methodological and substantive study features (Koerber & Osterhaus, 2019) as well as the nonlinear nature of children’s development in general. Another portion of this variance might be explained by children’s cognitive and sociodemographic characteristics, which are considered next.

Factors influencing developmental change

Few cross-sectional studies examined what accounts for developmental differences in scientific reasoning. Only Koerber et al. (2015) did, and found that intelligence, text comprehension and parental education were associated with developmental change, whereas gender was not. Similar conclusions arise from the longitudinal survey by Blums, Belsky, Grimm, and Chen (2017), while Bullock and Ziegler (1999), in contrast, found that developmental growth was not related to any of these factors.

Supplementary evidence comes from studies investigating children’s scientific reasoning at a single point in time. Reading comprehension affected scientific reasoning in all of these studies (e.g., Mayer, Sodian, Koerber, & Schwippert, 2014; Schiefer, Golle, Tibus, & Oschatz, 2019; Van de Sande, Kleemans, Verhoeven, & Segers, 2019), which seems at least partly due to the use of written tests. Another reason is that reading comprehension requires linguistic inferencing, which is also involved in scientific reasoning – for instance to create a mental representation of the inquiry process or experimental outcomes (Van der Graaf et al., 2018; Van de Sande et al., 2019). Mathematical skillfulness also correlates with scientific reasoning (Koerber & Osterhaus, 2019; Kuntze, 2004; Schlatter, Molenaar, & Lazonder, 2020; Tajudin & Chinnappan, 2015). This relationship likely exists because number sense and basic arithmetic operations in particular are needed to interpret numerical data (e.g., to recognize and calculate a pattern in scores) and evaluate data characteristics (e.g., to identify outliers).

Prior domain knowledge is another factor worth considering. As scientific reasoning usually takes place in a particular content area, knowledge of the topic of investigation facilitates the reasoning process. Specifically, prior domain knowledge is positively associated with the number of predictions made and the quality of the conclusions drawn (Lazonder, Wilhelm, & Hagemans, 2008; Wilhelm & Beishuizen, 2003). Whether domain knowledge has any differential effects when children’s scientific reasoning becomes increasingly sophisticated remains to be shown.

Research questions and hypotheses

This longitudinal study examined (1) which developmental patterns can be identified in children’s scientific reasoning, (2) how these developmental patterns are related to children’s cognitive and sociodemographic characteristics, and (3) to what extent these patterns accord with children’s developmental trajectories in reading comprehension and math.

In line with Koerber et al. (2015), who also studied a sample of second, third and fourth graders, children were expected to make significant progress in scientific reasoning. The overall course of development was expected to be linear; whether this developmental pattern applies equally to all children was an open question this study sought to answer. While some will indeed make steady progress over the years, others may alternate periods of accelerated growth with times of stagnation or even regression. Theoretically, such nonlinear growth patterns are indicative of children who progress to higher developmental levels under optimal learning conditions. Linear growth, in contrast, occurs when the circumstances at home or in school insufficiently stimulate a child to push the limits of their cognitive capacity (Fischer, 2008).

Secondly, developmental patterns were expected to be independent of children’s initial age because the age range in upper-elementary classrooms is rather small. No gender effects were expected to be found either (e.g., Bullock & Ziegler, 1999) whereas higher levels of both parental education and prior domain knowledge were predicted to be associated with more rapid developmental growth (Blums et al., 2017; Koerber et al., 2015). Finally, developmental patterns in scientific reasoning were predicted to resemble those in reading comprehension and math. Although these pattern matches have not been examined before, reading and math growth is often nonlinear (Scammacca, Fall, Capin, Roberts, & Swanson, 2020) and might correspond with the hypothesized nonlinear developmental patterns in scientific reasoning because performance of these skills is consistently related during the elementary years (e.g., Koerber & Osterhaus, 2019; Schlatter et al., 2020).

Method

Participants

The original sample comprised 170 children, 92 boys and 78 girls, enrolled in a large-scale three-year longitudinal research project. Parental consent and ethical approval were obtained before the start of the project in 2016. At that time, the participating children had a mean age of 8.51 years (SD = 0.80, range 7–10 years). Their parents’ highest level of schooling attained was classified as either high (higher education; 71%), middle (upper-secondary education; 21%) or low (lower-secondary education; 8%). Regarding children’s background, the majority of the sample (85%) originated from Dutch families; the 26 children with a migration background (15%) had at least one parent who was born outside the Netherlands. As 13 children left school during the runtime of the project, we report the data of the 157 children who completed all tests. The dropout group did not differ from the final sample in terms of child characteristics and initial test scores, χ2 < 5.47, p > .140.

All children attended weekly science lessons as part of their regular curriculum. These lessons were thematically organized in 8-week units on science topics not addressed in the longitudinal assessment reported here. The children’s science teachers were not informed about the specifics of the assessments used in this study to prevent them from teaching children to the test.

Materials

Scientific reasoning test

Children’s command of five scientific reasoning skills was assessed by a validated performance-based test that was orally administered in order to minimize the possible confounding influence of reading and writing skills. The 15 items on this test addressed the scientific reasoning skills of predicting, experimenting, interpreting results, evaluating the properties of data and drawing conclusions (see Table 1) in the context of a practical inquiry with real equipment. Each skill was assessed by three items of increasing difficulty, which were distributed over four cycles of investigation. Children processed these cycles individually under the supervision of a test administrator who followed a script that specified what assignments to give and questions to ask at every stage of the inquiry (see Figure 1). Children responded orally or by performing a certain action (e.g., setting up an experiment). The test administrator recorded their spoken answers and actions so children neither had to read nor write.

In order to avoid possible carryover effects, children received a different version of this test each year. These three versions were identical except for the topic of inquiry and required children to investigate the influence of four dichotomous input variables on a continuous outcome measure. The ramps version had children examine how the distance a ball rolls down an inclined plane is affected by the mass of the ball (heavy or light) and the slope’s angle (steep or shallow), length (long or short) and surface (rough or smooth). The cars version addressed how far rubber-band powered cars travel and how this distance is influenced by the perimeter of the back wheels, the thickness of the back axle, the diameter of the rubber band, and its winding around the back axle. In the bouncing balls version, children dropped balls of different masses and compositions from different heights on different surfaces to find out how these dimensions influence the number of bounces.
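Each test version thus spans a small factorial design space. As a rough illustration (a hypothetical encoding for the ramps version; the dictionary below is not part of the study materials, though the variable names and levels come from the text), the 16 possible setups can be enumerated:

```python
from itertools import product

# Hypothetical encoding of the ramps version's four dichotomous input variables
variables = {
    "mass":    ["heavy", "light"],
    "angle":   ["steep", "shallow"],
    "length":  ["long", "short"],
    "surface": ["rough", "smooth"],
}

# Every possible ramp setup: 2 x 2 x 2 x 2 = 16 combinations
setups = [dict(zip(variables, combo)) for combo in product(*variables.values())]
print(len(setups))  # 16
```

Children never see this full design space at once; during the test they manipulate one setup at a time, which is what makes controlling variables nontrivial for them.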

Five handouts were used to ensure uniform conditions for children to interpret results, evaluate data and draw conclusions. The top-half of each handout displayed an experiment designed with the equipment available to the child; the bottom-half presented the results of that experiment and, where appropriate, a conclusion. Both the experimental setup and the outcomes were clarified by the administrator before asking the child to either interpret, evaluate or draw conclusions from the evidence given.

Table 1. Sample items of the scientific reasoning test.

Predicting
- What do you think would happen if I did the same test again? What would the outcome be? And why?
- Your classmate who comes next will do the same test. Suppose that s/he asks you to predict what the outcome of that test will be. What would you say?

Experimenting
- Was this a good test to find out whether it makes a difference if the bouncing ball is light or heavy? And why?
- Can you make a fair test to find out whether it makes a difference if the surface is smooth or rough?

Interpreting
- Okay, you have now seen where the cars came to a stop [...] Please explain in your own words what the outcome of this test is.
- Here you see two characters, Boris and Angie, who have also done some tests with the ramps [...] Could you explain in your own words what came out of Boris and Angie’s tests?

Evaluating
- Here you see Boris and Angie’s reply. If you look at their scores in the table, who do you believe most? And why?
- Suppose both Boris and Angie would do one additional test. Angie thinks hers will show a larger difference than Boris’. Would you agree? And why?

Concluding
- I wanted to test my idea that solid balls bounce more often than hollow balls. You have observed the outcomes. Do you think I should change my idea? And why?
- If you consider the 12 outcomes of Boris and Angie together, could you explain, in one sentence, how the size of the back wheels affects the distance cars travel?


The test administrator recorded children’s actions and answers on the script for later coding and analysis. A rubric was developed to specify what counted as a correct (1 point) or incorrect response (0 points). To illustrate, the first experimenting item asked children to design an unconfounded comparison with the equipment at hand. In the ramps version of the test, this meant that children had to change the ramp surface while keeping constant the mass of the ball and the slope’s angle and length. One point was awarded if this was the case; no points were given if two or more input variables were manipulated or when the surface of both ramps was identical. Children could earn 3 points maximum for each scientific reasoning skill and, hence, 15 points in total. Interrater agreement was high, Cohen’s κ = .84. An initial validation study (Lazonder & Janssen, 2019) further showed that the test scores conform well to a two-parameter Item-Response Theory model. The test has an expected a posteriori (EAP) reliability of .59, which is comparable with related instruments (Koerber et al., 2015; Mayer et al., 2014) and generally considered acceptable in science education research (Taber, 2018).

Figure 1. Fragment of the test administrator script showing the first inquiry cycle of the ramps version of the test. Items 1 to 4 measure experimenting, interpreting, concluding and predicting, respectively. Icons on the left indicate which type of action the administrator had to perform (hand = act, callout = speak, pen = write).
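The rubric’s criterion for the experimenting items (exactly one variable changed, all others held constant) can be sketched as a small check. This is an illustrative reconstruction of the scoring logic, not the authors’ actual scoring procedure, and the setup encoding is hypothetical:

```python
def score_experiment(setup_a, setup_b, target):
    """Return 1 point if the two setups form an unconfounded comparison
    on the target variable (only that variable differs), else 0."""
    differing = [v for v in setup_a if setup_a[v] != setup_b[v]]
    return 1 if differing == [target] else 0

# Fair test: only the surface differs between the two ramps
ramp_a = {"mass": "heavy", "angle": "steep", "length": "long", "surface": "rough"}
ramp_b = {"mass": "heavy", "angle": "steep", "length": "long", "surface": "smooth"}
print(score_experiment(ramp_a, ramp_b, "surface"))  # 1

# Confounded test: mass was changed along with the surface
ramp_c = {"mass": "light", "angle": "steep", "length": "long", "surface": "smooth"}
print(score_experiment(ramp_a, ramp_c, "surface"))  # 0
```

Note that the rule also yields 0 when the two setups are identical, matching the rubric’s "surface of both ramps was identical" case.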

Prior domain knowledge tests

Three separate tests measured children’s initial knowledge of the four input variables in the version of the scientific reasoning test they were about to take. Each variable was assessed by a single item that tapped children’s ideas about whether and how that variable affected the outcome measure – which was exactly what they had to investigate during the scientific reasoning test. For example, one item in the cars version of the prior domain knowledge test was: “You can change four features of these cars, for instance the back axle can be thick or thin. Will that make a difference in how far cars travel?” If a child answered that it did have an effect, the test administrator continued by asking what that difference would be so as to elicit the child’s ideas about the direction of the effect. All items were read aloud and children responded orally to the test administrator, who recorded their responses. Items were binary scored as either correct or incorrect; the total test score represents the number of input variables a child was familiar with beforehand.

Reading comprehension test

Children’s ability to understand written information was assessed by a standardized progress-monitoring test (Weekers, Groenen, Kleintjes, & Feenstra, 2011). The test contained expository and narrative texts of approximately 200 to 700 words. Comprehension of these texts was determined by multiple-choice items, 50 in total, that asked children to arrange sentences in logical order, complete sentence fragments or answer inferential questions about (parts of) the text. Correctly answered items were awarded 1 point. Children took a different version of this test each year which was adapted to their age and reading level. Raw scores were converted to standardized proficiency scores that can be meaningfully compared across tests and among individuals. Population averages for the current sample are in the range of 26 to 56 points.

Math test

A standardized progress-monitoring test (De Vos, 2006) was used to assess children’s proficiency in performing numerical operations. This test was chosen because numeracy was assumed to facilitate scientific reasoning (in particular the interpretation and evaluation of data) and basic arithmetic operations, unlike word problems, make minimal demands on children’s reading comprehension and reasoning abilities. The test contained 200 problems (e.g., 6 + 13; 84 − 12; 3 × 4; 16 ÷ 2) of increasing difficulty. Children received 1 point for each correctly solved item; national median scores ranged from 51 points for the youngest children in 2016 to 87 points for the oldest children in 2018.

Procedure

Demographic characteristics of the sample were obtained from the school administration. Test data were collected over three years, from 2016 to 2018. Children took the reading comprehension test and the math test in January each year. Both tests were administered and scored by their classroom teachers according to the standardized procedures provided by the test publishers. The scientific reasoning test was administered annually in February and March. In order to prevent testing effects, children took a different version of this test each year; the three test versions were randomly assigned and administered similarly. Children were tested individually by a test administrator in a quiet space in or outside their classroom. The administrator first explained the purpose of the activity and introduced the equipment at hand, making sure that each child understood the outcome measure and the four input variables that could be manipulated in the investigation. Following the administration of the prior domain knowledge test, the child completed the scientific reasoning test in approximately 20 min. Testing was guided by the administrator, who adhered to the script to determine which questions to ask or assignments to give at designated moments during the child’s inquiry.

Results

Identifying developmental patterns in scientific reasoning

Children’s annual achievements on the scientific reasoning test were analyzed to distinguish possible differential patterns of development. Descriptive statistics showed that our sample obtained an overall mean score of 5.45 during the first test administration in 2016, which increased to 6.85 and 7.92 in the next two years, respectively. Repeated-measures ANOVA with polynomial planned contrasts evidenced that children made significant progress overall, F(2, 312) = 82.33, p < .001, η2p = .35, as well as between adjacent years, F(1, 156) = 150.46, p < .001, η2p = .49, that did not level off, F(1, 156) = 1.21, p = .291, η2p = .01. Despite this steady linear progress, there was also considerable individual variation. The 2016 scores ranged from 1 to 11 points (SD = 2.20) and this high level of divergence was maintained in 2017 (SD = 2.31, range 2–13) and 2018 (SD = 2.24, range 2–13). Variation was also apparent in the improvement of test performance over the years: some children eventually exceeded their initial scores by 10 points whereas others made hardly any progress at all.
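With three equally spaced measurement occasions, polynomial planned contrasts reduce to weighting each child’s annual scores with linear (−1, 0, 1) and quadratic (1, −2, 1) coefficients and testing the resulting contrast scores against zero. A minimal sketch with invented scores (not the study’s data):

```python
import numpy as np

# Hypothetical scores of six children in 2016, 2017 and 2018
scores = np.array([[3, 5, 7], [4, 5, 8], [5, 6, 6],
                   [2, 4, 6], [6, 7, 9], [4, 6, 7]], dtype=float)

# Orthogonal polynomial contrast weights for three equally spaced occasions
linear    = np.array([-1, 0, 1])   # steady growth
quadratic = np.array([1, -2, 1])   # acceleration or leveling off

def contrast_t(data, weights):
    """One-sample t-test of the per-child contrast scores against 0."""
    c = data @ weights                              # contrast score per child
    t = c.mean() / (c.std(ddof=1) / np.sqrt(len(c)))
    return t, len(c) - 1                            # t value and df

t_lin, df = contrast_t(scores, linear)     # large t: reliable linear trend
t_quad, _ = contrast_t(scores, quadratic)  # near 0: no curvature in these data
```

In a full repeated-measures ANOVA the contrasts are tested as F statistics (F = t²), but the logic is the same: a significant linear contrast indicates overall growth, a significant quadratic contrast indicates a growth spurt or plateau.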

In order to give a more precise account of these individual differences, k-Means cluster analysis was performed to classify children according to their annual total test scores. As we were primarily interested in developmental patterns (as opposed to the magnitude of scores), we started with a two-cluster solution and gradually increased the number of clusters until no new pattern in scores emerged. Figure 2 shows that the three-cluster solution yielded two different patterns compared to the two-cluster solution, and that the four-cluster solution added yet another pattern. A five-cluster solution did not, as it merely split the lowest-performing cluster in two based on the magnitude of children’s scores. We therefore decided to continue our analyses with four clusters, which had a pooled within-cluster SD of 1.47. The Euclidean distance between cluster centers (d) ranged from 2.96 to 8.23.
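The clustering step itself is standard k-Means on each child’s three annual total scores. The sketch below re-implements the idea on toy data with plain NumPy (Lloyd’s algorithm with a deterministic initialization); it is not the authors’ analysis script, the data are invented, and the pooled-SD convention shown is one common choice:

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Minimal Lloyd's algorithm; rows of X are children, columns are years."""
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)]  # deterministic init
    for _ in range(n_iter):
        # assign each child to the nearest cluster center
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

# Hypothetical annual scores (2016, 2017, 2018) for eight children
X = np.array([[3, 4, 5], [4, 4, 6], [4, 8, 8], [5, 8, 7],
              [6, 6, 9], [6, 6, 9], [8, 9, 9], [8, 10, 9]], dtype=float)

centers, labels = kmeans(X, k=4)

# One convention for a pooled within-cluster SD over all scores
ss_within = sum(((X[labels == j] - centers[j]) ** 2).sum() for j in range(4))
pooled_sd = np.sqrt(ss_within / (X.size - 4))
```

Increasing k and inspecting whether the new cluster traces a genuinely different trajectory (rather than splitting an existing one by score magnitude) mirrors the stopping rule described in the text.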

Repeated-measures ANOVA was used to examine whether and how scientific reasoning developed in each of these four clusters. The results in Table 2 indicate significant effects of time, which means that the total test scores improved over the years, irrespective of developmental pattern. Polynomial planned contrasts further showed that children who were labeled as low achievers annually improved their scores on the test, F(1, 31) = 36.63, p < .001, η2p = .54, and this progress was monotonic, F(1, 31) = 0.12, p = .728, η2p = .00. The early- and late-bloomers displayed a more erratic development. Although they uniformly achieved significant overall gains (see Table 2), their developmental trajectories diverged. Scores of the early-bloomers greatly improved from 2016 to 2017 and then stabilized in 2018, whereas the late-bloomers made this growth spurt one year later. Polynomial planned contrasts substantiated that the early-bloomers made significant progress, F(1, 40) = 61.86, p < .001, η2p = .61, that was more pronounced during the first years of the study, F(1, 40) = 78.06, p < .001, η2p > .66. Scores of the late-bloomers also increased significantly over time, F(1, 47) = 43.33, p < .001, η2p = .48, and were characterized by an upward curvilinear trend, F(1, 47) = 80.78, p < .001, η2p = .63. Visual inspection of Figure 2 suggests that the high achievers consistently did well on the scientific reasoning test without much improvement. Yet the polynomial planned contrasts showed that these children too made significant progress, F(1, 35) = 20.18, p < .001, η2p = .37, and evidenced a curvilinear improvement over the years, F(1, 35) = 11.54, p = .002, η2p = .25.

Figure 2. Results of cluster analyses to identify developmental patterns in children’s scientific reasoning. Final cluster centers (score range 0–15) are plotted per year for the two-, three-, and four-cluster solutions.

Table 2. Annual performance on the scientific reasoning test by developmental pattern.

                             2016          2017          2018
                           M     SD      M     SD      M     SD      F      df        p     η2p
Low achievers (n = 32)    3.13  1.43    4.06  1.41    5.22  1.50    3.10   10, 118   .002   .21
Early-bloomers (n = 41)   4.39  1.36    7.78  1.31    7.51  1.60    8.81   10, 154   .001   .36
Late-bloomers (n = 48)    6.23  1.61    5.96  1.22    8.94  1.69    8.28   10, 182   .001   .31
High achievers (n = 36)   7.69  1.37    9.47  1.34    9.42  1.65    4.63   10, 134   .001   .26

Explaining developmental patterns in scientific reasoning

The next set of analyses considered whether and how the observed developmental patterns in scientific reasoning were related to the cognitive and sociodemographic characteristics presented in Table 3. As most of these data were nominal or ordinal in nature, chi-square tests of independence were performed to examine these associations. A significant relationship was found between developmental pattern and gender, χ2(3, N = 157) = 8.80, p = .032. The cell counts in Table 3 show that, although boys and girls were equally represented in most developmental patterns, the boy-girl ratio in the group of high achievers was rather disproportionate. Post hoc tests based on adjusted residuals indicated that there were significantly more boys in this group than expected, z = 2.86, p = .004. Odds ratios further showed that high achievers were 3.26 times more likely to be boys than girls (95% CI [1.41, 7.51], z = 2.78, p = .006) compared to the other three patterns combined. No significant relationships were found regarding children’s age, χ2(9, N = 157) = 7.55, p = .580, and parental education, χ2(6, N = 157) = 5.82, p = .444.
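These three statistics can be reproduced directly from the gender counts in Table 3. The sketch below recomputes them with plain NumPy using textbook formulas (it is not the authors’ analysis code):

```python
import numpy as np

# Gender x developmental pattern counts from Table 3
# (columns: low achievers, early-bloomers, late-bloomers, high achievers)
observed = np.array([[17, 18, 23, 27],    # boys
                     [15, 23, 25,  9]])   # girls

row = observed.sum(axis=1, keepdims=True)   # 85 boys, 72 girls
col = observed.sum(axis=0, keepdims=True)   # cluster sizes
n = observed.sum()                          # 157 children

# Pearson chi-square test of independence, df = (2-1)(4-1) = 3
expected = row * col / n
chi2 = ((observed - expected) ** 2 / expected).sum()   # ~8.80

# Adjusted residuals: (O - E) / sqrt(E * (1 - row/n) * (1 - col/n))
adj = (observed - expected) / np.sqrt(expected * (1 - row / n) * (1 - col / n))
# adj[0, 3] ~ 2.86: more boys among the high achievers than expected

# Odds of being a boy among high achievers vs. the other three patterns
odds_ratio = (observed[0, 3] / observed[1, 3]) / \
             (observed[0, :3].sum() / observed[1, :3].sum())   # ~3.26
```

The values match the reported χ2 = 8.80, the adjusted residual z = 2.86 for boys among high achievers, and the odds ratio of 3.26.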

Mixed-design ANOVA was performed to determine whether developmental patterns in scientific reasoning were associated with children’s understanding of the topic they had to investigate. Results showed no significant within-subject difference in prior domain knowledge test scores, F(2, 304) = 2.04, p = .132, η2p = .01, which means that the ratio between the number of known and unknown factors was comparable across the three versions of the scientific reasoning test. As the effect of the between-subject factor “developmental pattern” was also nonsignificant, F(3, 152) = 1.21, p = .310, η2p = .02, children with different developmental trajectories had comparable initial knowledge of the causal factors they had to investigate. The nonsignificant prior knowledge × developmental pattern interaction, F(6, 304) = 1.59, p = .150, η2p = .03, further indicated that this result applied to all three test administrations.

Comparing developmental patterns in scientific reasoning

Of final interest was whether developmental patterns in scientific reasoning would match the pathways along which proficiency in reading comprehension and math is achieved. Developmental patterns for these school subjects were identified by k-Means cluster analysis using the exact same procedure as for scientific reasoning. The optimal cluster solutions in Figure 3 convey that three distinct patterns could be distinguished based on children's annual reading comprehension scores. These patterns (pooled SD = 2.07, 23.70 ≤ d ≤ 49.30) resembled the ones found in scientific reasoning, except that the pattern typical of the early-bloomers was not identified in the reading comprehension data.

Table 3. Characteristics of the children by developmental pattern in scientific reasoning.

                          Low achievers   Early-bloomers   Late-bloomers   High achievers
                          (n = 32)        (n = 41)         (n = 48)        (n = 36)
Characteristic            #      %        #      %         #      %        #      %
Gender
  Boys                    17     53       18     44        23     48       27     75
  Girls                   15     47       23     56        25     52        9     25
Age
  7 years                  6     19        3      7         5     10        2      6
  8 years                 11     34       17     42        19     40       17     47
  9 years                 11     34       18     44        18     39       16     44
  10 years                 4     13        3      7         6     13        1      3
Parental education
  High                    21     66       33     80        29     60       28     78
  Middle                   8     25        6     15        13     27        6     17
  Low                      3      9        2      5         6     13        2      6
Prior domain knowledge    M      SD       M      SD        M      SD       M      SD
  2016                    2.88   0.91     2.73   0.81      2.91   0.97     2.86   0.87
  2017                    2.88   1.10     3.10   0.94      2.72   0.80     3.33   0.72
  2018                    2.91   0.93     3.05   0.67      3.06   0.85     3.08   0.87

The cross-tabulation in Table 4 revealed that the developmental patterns in scientific reasoning and reading comprehension were related, χ²(6, N = 149) = 22.24, p = .001. Post hoc analysis of adjusted residuals indicated that the overrepresentation of children classified as low achievers in both subjects contributed to this overall association, z = 3.68, p < .001, as did the underrepresentation of late-bloomers in reading comprehension in the low-achieving scientific-reasoning group, z = −2.82, p = .004. In addition, among the high achievers in scientific reasoning were relatively many children who either performed or progressed well in reading comprehension (late-bloomers: z = 2.19, p = .029; high achievers: z = 2.46, p = .014) and relatively few low achievers in reading comprehension, z = −3.49, p < .001. All other adjusted residuals were smaller than 1.96 and hence not statistically significant at the .05 level.
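Both the omnibus χ² and the adjusted residuals can be reproduced from the reading comprehension counts in Table 4. A sketch in Python (NumPy only; the adjusted residuals follow the standard formula, not necessarily the exact software the authors used):

```python
import numpy as np

# Observed counts from Table 4 (reading comprehension x scientific reasoning):
# rows: low achievers, late-bloomers, high achievers (reading comprehension)
# cols: low achievers, early-bloomers, late-bloomers, high achievers (scientific reasoning)
obs = np.array([[28, 30, 32, 17],
                [ 0,  6, 11, 11],
                [ 0,  4,  3,  7]], dtype=float)

n = obs.sum()                          # N = 149
row = obs.sum(axis=1, keepdims=True)   # row totals
col = obs.sum(axis=0, keepdims=True)   # column totals
exp = row @ col / n                    # expected counts under independence

# Pearson chi-square with df = (3 - 1) * (4 - 1) = 6
chi2 = ((obs - exp) ** 2 / exp).sum()

# Adjusted residuals: z = (O - E) / sqrt(E * (1 - row/N) * (1 - col/N))
z = (obs - exp) / np.sqrt(exp * (1 - row / n) * (1 - col / n))

print(round(chi2, 2))     # 22.24, as reported
print(round(z[0, 0], 2))  # 3.68: low achievers in both subjects
```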

K-Means cluster analysis on children's annual math test scores returned two developmental patterns with a similar slope but a different intercept. As increasing the number of clusters did not yield any additional patterns, the two-cluster solution displayed in Figure 3 (pooled SD = 14.59, d = 57.90) was cross-tabulated with the scientific reasoning patterns. The data in Table 4 point to a rather scattered distribution of developmental patterns. Although relatively few high achievers in scientific reasoning were low achievers in math and vice versa, this tendency was not statistically significant, χ²(3, N = 150) = 7.43, p = .059.
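The clustering procedure itself can be sketched with a minimal Lloyd's-algorithm k-Means in pure NumPy. The annual scores below are fabricated for illustration (two groups with the same slope but different intercepts); they are not the study's data, and the group means and cluster count are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fabricated annual test scores (children x years 2016-2018):
# one low-scoring and one high-scoring group, same slope, different intercept.
low = rng.normal(loc=[40, 50, 60], scale=5, size=(75, 3))
high = rng.normal(loc=[70, 80, 90], scale=5, size=(75, 3))
scores = np.vstack([low, high])

def kmeans(x, k, n_iter=100, seed=0):
    """Minimal Lloyd's algorithm: returns (cluster centers, labels)."""
    r = np.random.default_rng(seed)
    centers = x[r.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # assign each child to the nearest cluster center
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([x[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

centers, labels = kmeans(scores, k=2)
print(np.sort(centers[:, 0]))  # 2016 intercepts of the two clusters (roughly 40 and 70)
```

On well-separated data like this, the two recovered cluster centers trace parallel trajectories, matching the "similar slope, different intercept" pattern described above.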

[Figure 3: line graphs of final cluster centers per year (2016–2018); left panel: reading comprehension (low achievers, late-bloomers, high achievers); right panel: math (low achievers, high achievers).]

Figure 3. Classification of children based on their annual scores on the reading comprehension and math tests. Maximum scores on these tests were 147 and 200 points, respectively. National standards for the present sample were lower and are reported in the method section.

Table 4. Cross-tabulation of developmental patterns in scientific reasoning, reading comprehension and math.

                                        Scientific reasoning
                        Low achievers   Early-bloomers   Late-bloomers   High achievers   Total
Reading comprehension
  Low achievers              28               30              32              17           107
  Late-bloomers               0                6              11              11            28
  High achievers              0                4               3               7            14
  Total                      28               40              46              35           149
Math
  Low achievers              20               22              23              12            77
  High achievers              9               18              24              22            73
  Total                      29               40              47              34           150


Discussion

This study provides new evidence about the development of scientific reasoning in upper-elementary education. Instead of inferring children's course of development from cross-sectional comparisons or collapsing longitudinal data into a single developmental pattern, a more fine-grained approach was used to portray individual differences in development. This approach yielded four distinct developmental patterns, which turned out to be largely independent of children's cognitive and sociodemographic characteristics and generally paralleled their course of development in reading comprehension.

So how does scientific reasoning develop? The derived developmental patterns point to four intuitively meaningful possibilities. Children classified as low achievers were mediocre from the start and progressed slowly over the years, while their initially high-achieving peers did not seem to make much progress either. Both linear patterns suggest that the learning conditions in school were insufficiently responsive to the needs of 43% of the children (Fischer, 2008). The remaining children, by contrast, made a growth spurt at some point during the study, which is not uncommon for this age group (Schwartz, 2009) and might indicate developmental growth under optimal conditions. These patterns refine the results of Koerber et al. (2015) by showing that overall linear progress in scientific reasoning is made by some, but not all, children within a class. Our results also extend those of Piekny et al. (2014), who examined scientific reasoning during the preschool years, and broaden Bullock and Ziegler's (1999) longitudinal analysis of a single component skill.

Results regarding the second research question indicate that the four developmental patterns are independent of age, parental education and prior domain knowledge. The lack of association with children's age was expected because the majority of the sample (82%) shared the same birth year, which diminishes the chance that any significant age-related differences will show. Whether age will affect the course of development in more heterogeneous samples is an interesting question future research might address. A similarly unbalanced distribution was found for parental education: 7 out of 10 children came from highly educated families. Their overrepresentation in the sample was unforeseen and is probably the reason why developmental patterns in scientific reasoning were independent of parental education. Another reason could be that this characteristic has a marginal influence on developmental growth that only shows in studies with very large samples. Bullock and Ziegler (1999), for example, found no effects of parental education in their sample of 753 children, whereas Koerber et al. (2015) did in a sample that was twice as large.

Another unpredicted outcome is that developmental patterns in scientific reasoning were independent of prior domain knowledge. This result supplements the empirical evidence undergirding our hypothesis, which indicated that domain knowledge affects scientific reasoning performance at a single point in time (e.g., Lazonder et al., 2008; Wilhelm & Beishuizen, 2003). The present study adds that this effect does not necessarily apply to the development of scientific reasoning over the years. Future research might examine whether this conclusion holds if children complete a content-matched version of the test each year, although this would be a purely academic endeavor because children in science classes hardly ever investigate the same topic more than once on separate occasions. A related conclusion is that prior domain knowledge has no differential effect due to children's increasing proficiency in scientific reasoning. If this were the case, the prior domain knowledge of children with high achievements or accelerated growth would have differed from that of children who performed less proficiently or underwent a more linear development.

As for gender, previous developmental studies of scientific reasoning established that girls and boys make similar progress (Bullock & Ziegler, 1999; Koerber et al., 2015). The present findings generally concur with this conclusion, except that relatively few girls were high achievers. Research on gender differences offers a plausible explanation as to why the best-performing children were mainly boys. Despite numerous policy efforts, males continue to outperform females in science from fourth grade onward (Hyde & Linn, 2006), and although the magnitude of this difference is small, gender inequality is more substantial among high achievers, with a male/female ratio in excess of 2:1 (Reilly, Neumann, & Andrews, 2015). The present findings accord with this international trend.

Conclusions regarding the third research question state that developmental patterns in scientific reasoning generally parallel those in reading comprehension but not those in math. The former result is consistent with our hypotheses and indicates that, despite the imperfect match, the course of development in both skills bears considerable resemblance. This conclusion extends existing evidence regarding the linguistic nature of scientific reasoning (Van de Sande et al., 2019) by suggesting that children's development in this area can be predicted by the reading comprehension progress-monitoring data available in schools. One explanation for these pattern matches is that scientific reasoning and reading comprehension share a set of core problem-solving processes and sense-making strategies (Cervetti, Pearson, Bravo, & Barber, 2006; Siler, Klahr, Magaro, Willows, & Mowery, 2010). A related possibility is that both reading comprehension and scientific reasoning draw on general language comprehension processes, in particular when scientific reasoning is measured through an interactive dialogue (cf. Fuchs, Fuchs, Compton, Hamlett, & Wang, 2015). Alternatively, reading comprehension could be conditional for the advancement of scientific reasoning (Hapgood, Magnusson, & Palincsar, 2004) or mediate the effect of executive functions on this development (Van der Graaf et al., 2018). Which explanation holds true should be determined by future studies.

The absence of any correspondence with developmental growth in math could be due to the type of skill being assessed. The math test involved numerical operations, a skill set relevant to scientific reasoning (Koerber & Osterhaus, 2019) that turned out to develop without much inter-individual variation other than the magnitude of scores. As other math skills play a role in scientific reasoning too (Gallenstein, 2005; Krummenauer & Kuntze, 2019), it seems worthwhile to replicate the present study with skills from different math strands. Number sense, spatial ability and measurement qualify for inclusion because these skills underlie data interpretation and evaluation and, hence, might contribute to their development (e.g., Koerber & Osterhaus, 2019). An alternative possibility could be that math and scientific reasoning develop similarly because they share the same underlying general ability, for instance abstract reasoning with symbolic representations. As this ability is relevant to some, but not all, scientific reasoning skills, the odds of finding matching developmental patterns will probably be lower than with reading comprehension.

This study has some obvious limitations, which include the homogeneous sample of ages and demographics as well as the choice of a math test that may not have tapped the most relevant skill. Moreover, because some potentially influential characteristics could not be taken into account, our results paint an incomplete picture. Future longitudinal studies should address one or more characteristics that descriptive research has shown to influence scientific reasoning, such as inhibitory control (Van der Graaf et al., 2018), nonverbal intelligence (Schiefer et al., 2019; Veenman et al., 2004) and problem-solving ability (Mayer et al., 2014).

Despite these limitations, the present study offers some implications for practice. As the achievements and developmental patterns in scientific reasoning vary greatly among same-age children, elementary science teachers face the challenge of responding to the needs of individual children in their classes. Differentiated forms of teaching and learning thus seem called for, in particular because 43% of our sample did not seem to take full advantage of the instruction they received. We therefore recommend that teachers base their instructional adaptations either on children's actual performance in scientific reasoning or on their progress in reading comprehension. Age and gender are poor predictors of who should receive additional teacher guidance, although more girls could be encouraged to excel, for instance by stimulating their interest in science.

Acknowledgements

We are grateful to Merel Boon, who scored part of the data to determine interrater reliability. We also wish to thank Robin Willemsen and Joep van der Graaf for helpful feedback on a preliminary version of this paper.

Disclosure statement

No potential conflict of interest was reported by the authors.

Funding

This work was supported by the Netherlands Initiative for Education Research (NRO) [405-15-546].

ORCID

Ard W. Lazonder http://orcid.org/0000-0002-5572-3375
Noortje Janssen http://orcid.org/0000-0002-8628-0466
Hannie Gijlers http://orcid.org/0000-0003-2406-6541

References

Akman, B., & Güchan Özgül, S. (2015). Role of play in teaching science in the early childhood years. In K. C. Trundle & M. Saçkes (Eds.), Research in early childhood science education (pp. 237–259). Dordrecht, the Netherlands: Springer.

Blums, A., Belsky, J., Grimm, K., & Chen, Z. (2017). Building links between early socioeconomic status, cognitive ability, and math and science achievement. Journal of Cognition and Development, 18(1), 16–40. doi:10.1080/15248372.2016.1228652

Buchsbaum, D., Gopnik, A., Griffiths, T. L., & Shafto, P. (2011). Children’s imitation of causal action sequences is influenced by statistical and pedagogical evidence. Cognition, 120(3), 331–340. doi:10.1016/j.cognition.2010.12.001

Bullock, M., & Ziegler, A. (1999). Scientific reasoning: Developmental and individual differences. In F. E. Weinert & W. Schneider (Eds.), Individual development from 3 to 12: Findings from the

Cervetti, G., Pearson, P. D., Bravo, M. A., & Barber, J. (2006). Reading and writing in the service of inquiry-based science. In R. Douglas, M. Klentschy, K. Worth, & W. Binder (Eds.), Linking science and literacy in the K-8 classroom (pp. 221–244). Arlington, VA: NSTA Press.

Cook, C., Goodman, N. D., & Schulz, L. E. (2011). Where science starts: Spontaneous experiments in preschoolers' exploratory play. Cognition, 120(3), 341–349. doi:10.1016/j.cognition.2011.03.003

Croker, S., & Buchanan, H. (2011). Scientific reasoning in a real world context: The effect of prior belief and outcome on children's hypothesis testing strategies. British Journal of Developmental Psychology, 3(3), 409–424. doi:10.1348/026151010X496906

De Vos, T. (2006). Schoolvaardigheidstoets hoofdrekenen [School proficiency test mental arithmetic]. Amsterdam, the Netherlands: Boom test uitgevers.

English, L., & Watson, J. (2013). Beginning inference in fourth grade: Exploring variation in measurement. In V. Steinle, L. Ball, & C. Bardini (Eds.), Mathematics education: Yesterday, today and tomorrow (pp. 274–281). Melbourne, Australia: MERGA.

Fischer, K. W. (2008). Dynamic cycles of cognitive and brain development: Measuring growth in mind, brain, and education. In A. M. Battro, K. W. Fischer, & P. Léna (Eds.), The educated brain (pp. 127–150). Cambridge: Cambridge University Press.

Fuchs, L. S., Fuchs, D., Compton, D. L., Hamlett, C. L., & Wang, A. Y. (2015). Is word-problem solving a form of text comprehension? Scientific Studies of Reading, 19(3), 204–223. doi:10.1080/10888438.2015.1005745

Gallenstein, N. L. (2005). Engaging young children in science and mathematics. Journal of Elementary Science Education, 17(2), 27–41. doi:10.1007/BF03174679

Hapgood, S., Magnusson, S. J., & Palincsar, A. S. (2004). Teacher, text, and experience: A case of young children’s scientific inquiry. The Journal of the Learning Sciences, 13(4), 455–505. doi:10.1207/s15327809jls1304_1

Hyde, J. S., & Linn, M. C. (2006). Gender similarities in mathematics and science. Science, 314(5799), 599–600. doi:10.1126/science.1132154

Kanari, Z., & Millar, R. (2004). Reasoning from data: How students collect and interpret data in science investigations. Journal of Research in Science Teaching, 41(7), 748–769. doi:10.1002/tea.20020

Klahr, D., Fay, A. L., & Dunbar, K. (1993). Heuristics for scientific experimentation: A developmental study. Cognitive Psychology, 25(1), 111–146. doi:10.1006/cogp.1993.1003

Koerber, S., Mayer, D., Osterhaus, C., Schwippert, K., & Sodian, B. (2015). The development of scientific thinking in elementary school: A comprehensive inventory. Child Development, 86(1), 327–336. doi:10.1111/cdev.12298

Koerber, S., & Osterhaus, C. (2019). Individual differences in early scientific thinking: Assessment, cognitive influences, and their relevance for science learning. Journal of Cognition and Development, 20(4), 510–533. doi:10.1080/15248372.2019.1620232

Koerber, S., Sodian, B., Thoermer, C., & Nett, U. (2005). Scientific reasoning in young children: Preschoolers’ ability to evaluate covariation evidence. Swiss Journal of Psychology, 64(3), 141–152. doi:10.1024/1421-0185.64.3.141

Köksal-Tuncer, O., & Sodian, B. (2018). The development of scientific reasoning: Hypothesis testing and argumentation from evidence in young children. Cognitive Development, 48, 135–145. doi:10.1016/j.cogdev.2018.06.011

Krummenauer, J., & Kuntze, S. (2019). Primary students' reasoning and argumentation based on statistical data. Paper presented at the Eleventh Congress of the European Society for Research in Mathematics Education (CERME11), Utrecht, the Netherlands.

Kuntze, S. (2004). Wissenschaftliches Denken von Schülerinnen und Schülern bei der Beurteilung gegebener Beweisbeispiele aus der Geometrie [Students' scientific thinking in the evaluation of geometry proof examples]. Journal für Mathematik-Didaktik, 25(3–4), 245–268. doi:10.1007/BF03339325

Lazonder, A. W., & Janssen, N. (2019). Development and initial validation of a performance-based scientific reasoning test for children. Manuscript submitted for publication.

Lazonder, A. W., & Kamp, E. (2012). Bit by bit or all at once? Splitting up the inquiry task to promote children's scientific reasoning. Learning and Instruction, 22(6), 458–464. doi:10.1016/j.learninstruc.2012.05.005


Lazonder, A. W., Wilhelm, P., & Hagemans, M. G. (2008). The influence of domain knowledge on strategy use during simulation-based inquiry learning. Learning and Instruction, 18(6), 580–592. doi:10.1016/j.learninstruc.2007.12.001

Masnick, A. M., & Klahr, D. (2003). Error matters: An initial exploration of elementary school children’s understanding of experimental error. Journal of Cognition and Development, 4(1), 67–98. doi:10.1080/15248372.2003.9669683

Masnick, A. M., Klahr, D., & Knowles, E. R. (2017). Data-driven belief revision in children and adults. Journal of Cognition and Development, 18(1), 87–109. doi:10.1080/15248372.2016.1168824

Masnick, A. M., & Morris, B. J. (2008). Investigating the development of data evaluation: The role of data characteristics. Child Development, 79(4), 1032–1048. doi:10.1111/j.1467-8624.2008.01174.x

Mayer, D., Sodian, B., Koerber, S., & Schwippert, K. (2014). Scientific reasoning in elementary school children: Assessment and relations with cognitive abilities. Learning and Instruction, 29, 43–55. doi:10.1016/j.learninstruc.2013.07.005

Pedaste, M., Mäeots, M., Siiman, L. A., De Jong, T., Van Riesen, S. A. N., Kamp, E. T., . . . Tsourlidaki, E. (2015). Phases of inquiry-based learning: Definitions and the inquiry cycle. Educational Research Review, 14, 47–61. doi:10.1016/j.edurev.2015.02.003

Penner, D. E., & Klahr, D. (1996). The interaction of domain-specific knowledge and domain-general discovery strategies: A study with sinking objects. Child Development, 67(6), 2709–2727. doi:10.1111/1467-8624.ep9706244829

Piekny, J., Gruber, D., & Maehler, C. (2014). The development of experimentation and evidence evaluation skills at preschool age. International Journal of Science Education, 36(2), 334–354. doi:10.1080/09500693.2013.776192

Piekny, J., & Maehler, C. (2013). Scientific reasoning in early and middle childhood: The development of domain-general evidence evaluation, experimentation, and hypothesis generation skills. British Journal of Developmental Psychology, 31(2), 153–179. doi:10.1111/j.2044-835X.2012.02082.x

Reilly, D., Neumann, D. L., & Andrews, G. (2015). Sex differences in mathematics and science achievement: A meta-analysis of National Assessment of Educational Progress assessments. Journal of Educational Psychology, 107(3), 645–662. doi:10.1037/edu0000012

Sandoval, W. A., Sodian, B., Koerber, S., & Wong, J. (2014). Developing children's early competencies to engage with science. Educational Psychologist, 49(2), 139–152. doi:10.1080/00461520.2014.917589

Scammacca, N., Fall, A.-M., Capin, P., Roberts, G., & Swanson, E. (2020). Examining factors affecting reading and math growth and achievement gaps in grades 1–5: A cohort-sequential longitudinal approach. Journal of Educational Psychology, 112(4), 718–734. doi:10.1037/edu0000400

Schalk, L., Edelsbrunner, P. A., Deiglmayr, A., Schumacher, R., & Stern, E. (2019). Improved application of the control-of-variables strategy as a collateral benefit of inquiry-based physics education in elementary school. Learning and Instruction, 59, 34–45. doi:10.1016/j.learninstruc.2018.09.006

Schiefer, J., Golle, J., Tibus, M., & Oschatz, K. (2019). Scientific reasoning in elementary school children: Assessment of the inquiry cycle. Journal of Advanced Academics, 30(2), 144–177. doi:10.1177/1932202X18825152

Schlatter, E., Molenaar, I., & Lazonder, A. W. (2020). Individual differences in children's development of scientific reasoning through inquiry-based instruction: Who needs additional guidance? Frontiers in Psychology, 11, Article 904. doi:10.3389/fpsyg.2020.00904

Schulz, L. E., Gopnik, A., & Glymour, C. (2007). Preschool children learn about causal structure from conditional interventions. Developmental Science, 10(3), 322–332. doi:10.1111/j.1467-7687.2007.00587.x

Schwartz, M. (2009). Cognitive development and learning: Analyzing the building of skills in classrooms. Mind, Brain, and Education, 3(4), 198–208. doi:10.1111/j.1751-228X.2009.01070.x

Siler, S., Klahr, D., Magaro, C., Willows, K., & Mowery, D. (2010). Predictors of transfer of experimental design skills in elementary and middle school children. In V. Aleven, J. Kay, & J. Mostow (Eds.), Intelligent tutoring systems (pp. 198–208). Berlin, Germany: Springer.

Taber, K. S. (2018). The use of Cronbach’s alpha when developing and reporting research instruments in science education. Research in Science Education, 48(6), 1273–1296. doi:10.1007/s11165-016-9602-2


Tajudin, N. M., & Chinnappan, M. (2015). Exploring relationship between scientific reasoning skills and mathematics problem solving. In M. Marshman, V. Geiger, & A. Bennison (Eds.), Mathematics education in the margins (pp. 603–610). Sunshine Coast, Australia: MERGA.

Tschirgi, J. E. (1980). Sensible reasoning: A hypothesis about hypotheses. Child Development, 51(1), 1–10. doi:10.1111/1467-8624.ep12325377

Van de Sande, E., Kleemans, M., Verhoeven, L., & Segers, E. (2019). The linguistic nature of children's scientific reasoning. Learning and Instruction, 62, 20–26. doi:10.1016/j.learninstruc.2019.02.002

Van der Graaf, J., Segers, E., & Verhoeven, L. (2018). Individual differences in the development of scientific thinking in kindergarten. Learning and Instruction, 56, 1–9. doi:10.1016/j.learninstruc.2018.03.005

Veenman, M. V. J., Wilhelm, P., & Beishuizen, J. J. (2004). The relation between intellectual and metacognitive skills from a developmental perspective. Learning and Instruction, 14(1), 89–109. doi:10.1016/j.learninstruc.2003.10.004

Watson, J. M., & Moritz, J. B. (1998). The beginning of statistical inference: Comparing two data sets. Educational Studies in Mathematics, 37(2), 145–168. doi:10.1023/A:1003594832397

Weekers, A., Groenen, I., Kleintjes, F., & Feenstra, H. (2011). Wetenschappelijke verantwoording papieren toetsen Begrijpend lezen voor groep 7 en 8 [Scientific underpinning of the written reading comprehension tests for grade 5 and 6]. Arnhem, the Netherlands: Cito.

Wilhelm, P., & Beishuizen, J. J. (2003). Content effects in self-directed inductive learning. Learning
