
Learning and Individual Differences
journal homepage: www.elsevier.com/locate/lindif

Profiling children's reading comprehension: A dynamic approach

Sophie Gruhn a,⁎, Eliane Segers a,b, Jos Keuning c, Ludo Verhoeven a,d

a Radboud University, Nijmegen, The Netherlands
b University of Twente, Enschede, The Netherlands
c Cito, Institute for Educational Measurement, Arnhem, The Netherlands
d University of Curaçao, Curaçao

ARTICLE INFO

Keywords: Reading comprehension; Primary school; Individual differences

ABSTRACT

To profile children's reading comprehension, we developed a dynamic approach in which componential abilities (orthographic knowledge, vocabulary, sentence-integration) were assessed within the same texts and provided with feedback, in addition to the global comprehension of these texts. In 275 Dutch third to fifth graders, we investigated to what extent the response accuracy for questions on componential abilities on first attempts and after feedback predicted global text comprehension within the same texts as well as the prospective development in a standardized reading comprehension test. We found that global text comprehension was increased by each correctly answered question on a componential ability on first attempts and by each correctly answered sentence-integration question after feedback. The accuracy on first attempts also explained unique variance of the growth in the standardized reading comprehension test. A dynamic approach may thus help to arrive at a better understanding of the profiles of children's reading comprehension.

1. Profiling children's reading comprehension: A dynamic approach

Reading comprehension is important for educational success (Hakkarainen et al., 2013). Efficient instruction is, thus, highly relevant. Reading comprehension is a complex interaction of word-, sentence-, and text-level processes. This is why comprehension problems have various origins (Perfetti & Stafura, 2014). Individual instructional needs can be derived from a profile of the strengths and weaknesses in the underlying componential abilities in reading comprehension (Cain & Oakhill, 2006). Moreover, examining which children are at risk for a low responsiveness to instruction takes a more preventative approach (Compton et al., 2012; Vaughn & Fuchs, 2003). Standardized reading comprehension tests do not identify the underlying componential abilities in reading comprehension and children's ability to learn (Cain & Oakhill, 2012; Compton et al., 2012). The alternative of assessing each componential ability with isolated tests does not consider how the components interact and depend on each other (Perfetti & Adlof, 2012; Sabatini et al., 2016). Thus, isolated measures may only be seen as a proxy of interactive reading comprehension processes. A dynamic approach in which the componential abilities (orthographic knowledge, vocabulary, sentence-integration) are assessed within the same texts and the responsiveness to feedback after mistakes is measured may provide a better insight into the required focus and intensity of instruction (Den Ouden et al., 2019). It is not yet clear to what extent this may indeed yield a better understanding of the variation in reading comprehension in perspective of its growth than traditional reading comprehension tests. Therefore, we examined in the present study how the componential abilities of reading comprehension on the first attempt and after feedback predict global text comprehension within the same texts and growth in a standardized reading comprehension test.

1.1. Interaction of componential abilities in reading comprehension

Reading comprehension is a complex interaction of several components, which can broadly be separated into lower-order and higher-order abilities (Perfetti, 1999; Perfetti & Stafura, 2014).


Lower-order componential abilities refer to knowledge at the lexical level, and higher-order componential abilities entail competences at sentence and text level. The specificity with which the phonological, orthographic, and semantic information of a word is stored in the mental lexicon and the interconnectional strength between each informational node determine the ease of lexical retrieval during reading and, consequently, the availability of cognitive capacities for higher-order processes (Perfetti & Hart, 2002). By this, the quality of lexical representations directly impacts reading comprehension (Richter et al., 2013; Verhoeven & Van Leeuwe, 2008).

Higher-order processes are based on the input from lower-order componential abilities and aim at the construction of meaning beyond single words (Perfetti, 1999; Perfetti & Stafura, 2014). In order to understand the text, the reader has to connect all identified words and phrases to represent the literal text meaning, which is referred to as textbase (Van Dijk & Kintsch, 1983). Since not all information is directly expressed in the text, the reader has to interpret the link between adjacent phrases and sentences (e.g., cohesive ties, semantic mapping) and has to infer implicitly provided information across the whole text by integrating background knowledge (Cain & Oakhill, 2014). Via these so-called sentence-integration abilities, the reader constructs a more abstract and elaborate mental representation of the situation described in the text (situation model), which goes beyond the literal meaning (Van Dijk & Kintsch, 1983). Lower-order and higher-order componential abilities can be seen as interdependent (Cain et al., 2003; Cain & Oakhill, 2014; Daugaard et al., 2017). Several studies found that both uniquely predict reading comprehension abilities (Oakhill & Cain, 2012; Silva & Cain, 2015).

1.2. Profiling of individual differences in reading comprehension

The dissociation between decoding and language comprehension has often been used to profile children with reading comprehension difficulties (Aaron, 1991; Kleinsz et al., 2017). Children can have a relatively specific problem with either decoding or language comprehension only, or they may perform low on both (Bishop & Snowling, 2004). This is aligned with the simple view of reading that considers reading comprehension as the product of decoding and language comprehension (Gough & Tunmer, 1986). Language comprehension problems may be due to a variety of causes at word, sentence, or text level (problems in vocabulary, (morpho-)syntax, sentence-integration, comprehension monitoring, or story structure knowledge) and, thus, ask for more fine-grained profiling (Clarke et al., 2010; Landi & Ryherd, 2017). Children with comprehension problems might show weaknesses at only one or two levels (Cain & Oakhill, 2006; Colenbrander et al., 2016; Nation et al., 2004). This suggests the existence of various profiles.

Profiles (i.e., performance patterns across word, sentence, and text level) can describe the instructional needs of a child. Therefore, it is important to assess componential abilities in reading comprehension throughout primary school (Sabatini et al., 2014). This may focus on lexical quality (i.e., orthographic knowledge, vocabulary) and sentence-integration because they are key components of reading comprehension, vary among individuals with different reading comprehension abilities, and are responsive to instruction (Elleman, 2017; Elleman et al., 2009; Oakhill & Cain, 2012; Perfetti & Stafura, 2014; Therrien, 2004). Such selection criteria of components for assessments were suggested by Perfetti and Adlof (2012).

1.3. Problems in current assessment practice

Standardized reading comprehension tests usually measure the final product of comprehension (Kintsch, 2012; Van den Broek, 2012). Such assessments are useful to identify struggling children (Van den Broek, 2012). However, they do not provide information on what leads to weak performances (Cain & Oakhill, 2012; Mislevy & Sabatini, 2012).

Furthermore, the results may depend on the choice of tests, as tests vary in their reliance on componential abilities (Colenbrander et al., 2017; Keenan et al., 2008). Therefore, standardized reading comprehension tests should be used with caution when it comes to determining instructional needs. Another limitation in the current practice is that developmental delays are often only identified over time when children's progress-rate is too flat or stagnates (Compton et al., 2012; Vaughn & Fuchs, 2003). Children first need to fall behind before instructions are intensified, which has been criticized as a wait-to-fail approach. Identifying these children earlier as well as establishing their required intensity of instruction is desirable (Fuchs et al., 2012).

To find out what children need to improve on reading comprehension, isolated tests on componential abilities of reading comprehension (e.g., reading proficiency and vocabulary tests) are often administered. However, reading comprehension cannot be considered as just the sum of its parts (Perfetti & Adlof, 2012; Sabatini et al., 2014). Conclusions about higher-order componential abilities are constrained by their dependence on lower-order componential abilities. Vice versa, higher-order componential abilities may be used to compensate for difficulties in the fundamental skills (Cain et al., 2003; Nation & Snowling, 1998). In this respect, we should consider reading comprehension as a complex construct with its processes being dependent on the interrelatedness of underlying components. This interrelatedness does not take place in a context-free interplay. The processing behavior is influenced by task demands and text features in interaction with individual differences (Eason et al., 2012; Francis et al., 2018; Wang et al., 2017). Assessing the componential abilities in isolation limits this complexity (Sabatini et al., 2014; Sabatini et al., 2016). Therefore, isolated measures may not be comparable with the actual online processes.

1.4. Dynamic approach to reading comprehension assessment

The problems articulated above indicate a need for assessments which provide more information for instruction and consider reading comprehension as an interactive process. The changing view of the complexity of reading comprehension and the need to individually adapt instructions led to more dynamic approaches to assessments (Sabatini et al., 2016; Sabatini et al., 2020).

The first aspect of a dynamic approach is to assess the componential abilities in reading comprehension in a more interactive way. This has been done by measuring the components within the same text (Den Ouden et al., 2019; Sabatini et al., 2015; Sabatini et al., 2016). Sabatini et al. (2015) proposed a reading component battery where two text-level tasks were assessed within the same paragraphs. The word- and sentence-level measures were not integrated into these paragraphs. The performance of sixth graders on each subtask uniquely predicted their score on a reading comprehension test (see Sabatini et al., 2014). This test was based, however, on different texts than the component battery. In contrast, word-, sentence-, and text-level skills were assessed within the same text in another assessment of Sabatini et al. (2016). They showed that the overall performance of kindergarteners to third graders on the assessment was predicted by their prior background knowledge and the knowledge acquired during reading or listening, even after controlling for grade. The assessment was limited, though, to only one text, and decoding was not measured despite its relevance for the profiling of readers (Bishop & Snowling, 2004; Perfetti, 1999). Across a larger number of texts, Den Ouden et al. (2019) assessed word-, sentence-, and text-level components within each text. They found that the word-level abilities of fourth graders, summarized across texts, predicted the global comprehension of such texts. How the performance at sentence level explained global text comprehension was not examined, due to the limited reliability of that task.

The second aspect of a dynamic approach is to provide feedback after mistakes during the assessment. Assessments which include forms of instruction are considered dynamic (Sternberg & Grigorenko, 2002). The responsiveness to instruction (e.g., the ability to answer correctly after feedback) indicates the learning potential. This can estimate the required intensity of instruction during classroom activities and can help identify children with a low responsiveness to instruction earlier (Fuchs et al., 2012; Gustafson et al., 2014). To date, there are only a few attempts to measure children's learning potential in reading comprehension by providing instruction during assessment (e.g., Dörfler et al., 2017; Elleman et al., 2011; Navarro & Mora, 2011; Sabatini et al., 2020). To evaluate the benefit of such assessments, it should be considered how much unique variance in educational achievements they explain on top of traditional tests (Caffrey et al., 2008).

The computerized Global-Integrated Scenario-Based Assessment (GISA) examined how K-12 students solve a complex reading task via several underlying subtasks and make use of strategic hints (Sabatini et al., 2020). The total performance was strongly correlated with results on other reading comprehension tests. It has not been examined how the responsiveness to hints was related to achievements at GISA or other test results, and the subtasks rather focused on higher-order componential abilities. Also, Dörfler et al. (2017) and Elleman et al. (2011) focused on only one level of componential abilities. In both studies, inferential hints were provided after incorrectly answered sentence-integration questions. Sixth graders who received person-mediated inferential hints during practice answered similar questions on new texts better than their peers who did not receive hints (Dörfler et al., 2017). Moreover, Elleman et al. (2011) found that the amount of person-mediated feedback required by second graders, together with the transfer to new questions, explained unique variance in standardized reading comprehension test scores on top of standardized decoding and vocabulary measures. The amount of unique explained variance was, however, not allocated to the accuracy on the first attempt or after feedback.

Others assessed a broader range of componential abilities and provided feedback (Den Ouden et al., 2019; Navarro & Mora, 2011). In Navarro and Mora's (2011) assessment, pre-determined one-on-one interactions took place between primary- or secondary-education students and an instructor. The latter gave feedback following guidelines and evaluated the students on a checklist. The total assessment score uniquely predicted, on top of a standardized reading comprehension test, how teachers evaluated the students' achievements. The unique effect of the responsiveness to feedback was not examined. Besides the assessment being rather time-consuming, the abilities at word and sentence level were not integrated into texts. In contrast, Den Ouden et al. (2019) provided computerized feedback on word-level components within the same texts and related the accuracy on first attempts and after feedback to the global text comprehension of the same texts. They did not find, however, an effect of the accuracy after feedback on global text comprehension over and above the accuracy on first attempts. As summarized scores across texts were used for the analysis, considering the relationships at text level may yield different results.

1.5. The present study

We have articulated that a dynamic approach to reading comprehension assessment at word, sentence, and text level could lead to optimal profiling of children's instructional needs. In this approach, the componential abilities are assessed within the same text on the first attempt and after feedback has been provided. Existing assessments followed this approach only to a limited extent. They assessed componential abilities only at one or two levels within the same text (Den Ouden et al., 2019; Sabatini et al., 2015) or not over a larger number of texts and without feedback (Sabatini et al., 2016). Others who provided feedback did not do so on componential abilities at several levels (Den Ouden et al., 2019; Dörfler et al., 2017; Elleman et al., 2011; Sabatini et al., 2020) or not on components within the same texts (Navarro & Mora, 2011). It is not yet clear how componential abilities within the same text without and with feedback may be used for the profiling of children's reading comprehension and whether such a dynamic approach provides a better understanding of the variation in children's reading comprehension than traditional reading comprehension tests.

In the present study, Dutch third to fifth graders' orthographic knowledge, vocabulary knowledge, and sentence-integration abilities were assessed within the same texts on the first attempt and, when a mistake was made, on a second attempt after feedback had been provided. Additionally, the global text comprehension of the same text was evaluated. Prior to and after this assessment, the standardized reading comprehension test from the Dutch monitoring system was conducted at schools. The following questions were addressed:

1) To what extent do the responses to the orthographic knowledge, vocabulary, and sentence-integration questions on first attempts and after feedback uniquely predict global text comprehension within the same text?

2) To what extent do the responses to the orthographic knowledge, vocabulary, and sentence-integration questions on first attempts and after feedback uniquely predict the growth in a standardized reading comprehension test?

With respect to the first question, we expected that the probability of answering the global text comprehension questions correctly would be increased by each correctly answered componential ability question on the first or second attempt, in contrast to an incorrect response. We assumed a larger increase for correct first than second attempts. Regarding the second question, we hypothesized that the higher the accuracy on the componential ability questions on the first or second attempts, the higher the growth in the standardized reading comprehension test. Larger effects were anticipated for first than second attempts.

2. Methods

2.1. Procedure

Dutch third to fifth graders took a computerized assessment with a dynamic approach in autumn 2018. This was conducted within three weeks during lecture times in five sessions. The sessions were scheduled and guided by teachers under researchers' instructions. About three months prior to and after the assessment, participants' reading comprehension was assessed with a standardized test as part of the Dutch monitoring system.

2.2. Participants

In our study, 407 Dutch third to fifth graders participated (age: 8–11 years). They came from 22 classrooms across six schools in a suburban municipality in the east of the Netherlands. One school with two classes and three classrooms from two other schools (n = 69 participants) dropped out during the study and were excluded. To keep the same sample across analyses, we excluded participants without complete observations on the standardized reading comprehension tests (n = 25) or assessment (n = 42). The observations were missing at random (Little's MCAR test: χ2(26) = 28.76, p = .32). We did not exclude children with language-related or developmental problems to reflect an authentic classroom in the Netherlands. The final sample consisted of 275 children (third graders n = 91, fourth graders n = 78, fifth graders n = 106) from 17 classrooms across five schools. The study sample performed slightly higher than the national sample on a standardized reading comprehension test of the Dutch monitoring system at the end of the former school year (t(8045) = −5.22, p = .01, d = 0.29) and midterm of the current grade (t(8754) = −4.39, p = .01, d = 0.25). The national sample was based on different participants at pretest (n = 7772) and posttest (n = 8481). The study sample's growth in the reading comprehension test (t(274) = 11.10, p < .001, d = 0.46) was comparable to the national sample's growth (t(16251) = 36.62, p < .01, d = 0.58). We obtained active parental consent. The study was approved by the Ethics Committee of the Faculty of Social Sciences of our university (ECSW-2018-064) and complied with APA ethical standards.

2.3. Materials and measures

2.3.1. Computerized assessment with a dynamic approach

The assessment consisted of 25 texts. They were equally divided into five sessions. Per text, seven questions had to be answered in a fixed order (see Fig. 1). First, six questions had to be answered on componential abilities (three orthographic knowledge questions, two vocabulary questions, one sentence-integration question). Feedback was provided on accuracy. A hint was presented after an incorrect response. The children could then answer the question again (see Fig. 2). If the second attempt was incorrect, the correct answer was shown. Finally, one question on global text comprehension had to be answered. Feedback was only provided on accuracy, i.e., there were no hints and no second attempt. After all questions on one text were answered, the children continued with the next text at their own pace. The text remained visible while the questions were answered. There was no time restriction.
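A minimal sketch of this two-attempt flow in R, assuming hypothetical helper functions (get_answer, show_hint, show_solution); the actual assessment was a bespoke computerized application, so this only illustrates the feedback loop and the 2/1/0 item scoring described later in Section 2.3.1.3:

ask_componential_question <- function(get_answer, show_hint, show_solution) {
  if (get_answer()) {
    return(2)          # correct on the first attempt
  }
  show_hint()          # accuracy feedback plus a hint
  if (get_answer()) {
    return(1)          # correct on the second attempt
  }
  show_solution()      # still incorrect: the correct answer is shown
  0
}

# Example: a child who misses the first attempt but succeeds after the hint
ask_componential_question(
  get_answer    = local({ n <- 0; function() { n <<- n + 1; n == 2 } }),
  show_hint     = function() message("hint shown"),
  show_solution = function() message("correct answer shown")
)

The global text comprehension question would skip the hint and the second attempt, as it only received accuracy feedback.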

2.3.1.1. Adapted assessment difficulty. To take the variance in reading abilities across grades into account, a different assessment version was administered in third, fourth, and fifth grade. From an item bank of 38 texts, three assessment versions were developed with 25 texts each. The difficulty of the assessed texts increased with grade, with a partial overlap of texts between assessment versions (see Supplement A). The text difficulty was determined in a pilot study. The text order was constant within grades but random across grades. The text length was on average 165.36 words (SD = 48.54) in third grade, 183.48 words (SD = 56.25) in fourth grade, and 189.08 words (SD = 56.12) in fifth grade.

2.3.1.2. Question types. The assessment included four question types:

Orthographic knowledge questions. In order to measure the quality of orthographic representations for three words from the text, the child had to type each word after auditory presentation while the words were covered in the text. If a word was spelled incorrectly, the correct form was flashed for 3 s. After the correct form disappeared, the child could make a second attempt. Cronbach's alphas were 0.92 to 0.94, which indicated very high reliability.

Vocabulary questions. The quality of semantic representations for two words from the text was measured via multiple-choice questions. Successively for each word, the child had to choose the correct definition from three options. The word was printed in bold in the text. If the response was incorrect, a picture was provided to resemble the meaning before a new response could be given. The reliability of the task was high (Cronbach's alphas = 0.82 to 0.87).

Sentence-integration questions. To evaluate sentence-integration abilities, the child had to answer one multiple-choice question. This required the integration of explicitly provided information across several phrases or sentences. There were four answer options. To direct the reader's attention, the text part which required integration was highlighted. If the answer was incorrect, another part was highlighted, which helped to find the correct answer. After incorrect second attempts, a brief explanation of the incorrect answer was shown. Cronbach's alphas were between 0.78 and 0.88, which displayed acceptable to high reliability.

Global text comprehension questions. The overall text comprehension was measured with one multiple-choice question. The participants had to select from four answer options the sentence which best reflected the main idea of the text or the best heading of the text. The task proved to be sufficiently reliable with Cronbach's alphas of 0.76 to 0.85.

More information about the development of the assessment can be found in Den Ouden et al. (2019), and example questions are provided in Supplement B.

2.3.1.3. Operationalization. For each participant, the response accuracy on the first and second attempt after feedback for each question type was operationalized in two ways.

2.3.1.3.1. Text level. We summarized the response accuracy on first and second attempts for all items of each question type at the text level because we could use only one score respectively for the prediction of global text comprehension of a text. Each item was scored as 2 if it was correct on the first attempt, as 1 if it was correct on the second attempt, and as 0 if it was incorrect on the second attempt. Sum scores for all items of a question type were built by either only considering the first attempts (i.e., correct second attempts scored as 0) or both the first and second attempts (i.e., correct second attempts scored as 1), as exemplified in Table 1. A similar approach was taken by Elleman et al. (2011).

Accuracy on the first attempts: Sum score for the first attempts

Fig. 1. Order, amount, and type of questions for each text in the computerized assessment with a dynamic approach.

Fig. 2. Feedback loop for the orthographic knowledge, vocabulary, and sentence-integration questions in the computerized assessment with a dynamic approach.

Table 1
Example for the sum score options for one text in the assessment with a dynamic approach.

Scoring and sum score option | Orthographic knowledge | Vocabulary | Sentence-integration | Global text comprehension
Item scoring for first attempts | Item 1 = 0, Item 2 = 0, Item 3 = 2 | Item 1 = 0, Item 2 = 0 | Item 1 = 0 | Item 1 = 1
Item scoring for first and second attempts | Item 1 = 1, Item 2 = 0, Item 3 = 2 | Item 1 = 1, Item 2 = 1 | Item 1 = 1 | No feedback
Sum score for first attempts | 2 | 0 | 0 | 1
Sum score for first and second attempts | 3 | 2 | 1 | No feedback

Note. Columns show the performance on each question type for one text. Item scoring for first attempts (first attempt correct = 2, second attempt correct or incorrect = 0); item scoring for first and second attempts (first attempt correct = 2, second attempt correct = 1, second attempt incorrect = 0).


Accuracy after feedback: Difference between the sum scores for the first attempts and the sum scores for the first and second attempts

2.3.1.3.2. Person level. We calculated the percentages of correct responses on the first and second attempts for each question type across all texts because only one score could be used respectively to predict the standardized reading comprehension test score of a participant. A similar approach was followed by Dörfler et al. (2017).

Accuracy on the first attempts: Percentage of correct responses on the first attempts

Accuracy after feedback: Percentage of correct responses from all second attempts (i.e., questions which required feedback)

Due to the different assessment versions, the percentages of correct first attempts were not directly comparable between grades. However, in a pilot study, we found that the responses of third to fifth graders to each item from all assessment versions (item bank) could well be described by the one parameter logistic model (OPLM; Verhelst & Glas, 1995). With this model, it is possible to estimate the ability on the same scale with every possible subset of items from the item bank. The scores on the three assessment versions can thus be transformed into so-called ability scores, which are comparable even if different assessment versions are being administered. The ability scores can further be transformed into so-called bank scores, which represent the expected percentage of correct responses of test-takers if they had been assessed with all items from the item bank (Hambleton et al., 1991). In our study, we used these bank scores for the percentage of correct first attempts. The percentage of correct second attempts was based on the raw scores.
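As an illustration of these two operationalizations, the following R sketch (not the authors' code) derives text-level sum scores and person-level percentages from item-level scores, assuming a hypothetical long-format data frame with the 2/1/0 item scoring defined above. Note that in the study the person-level first-attempt percentages were OPLM-based bank scores rather than raw proportions; raw proportions are used here only for illustration.

library(dplyr)

# Toy item-level data (hypothetical column names): one row per participant x
# text x item, with score 2 = correct first attempt, 1 = correct second
# attempt, 0 = incorrect after feedback
items <- tibble::tribble(
  ~participant, ~text, ~question_type,           ~score,
  "p1",         "t1",  "orthographic knowledge",  2,
  "p1",         "t1",  "orthographic knowledge",  0,
  "p1",         "t1",  "orthographic knowledge",  1,
  "p1",         "t1",  "vocabulary",              2,
  "p1",         "t1",  "vocabulary",              1,
  "p1",         "t1",  "sentence-integration",    0
)

# Text level: sum scores per participant, text, and question type
text_level <- items %>%
  group_by(participant, text, question_type) %>%
  summarise(
    sum_first        = sum(ifelse(score == 2, 2, 0)),  # second attempts count as 0
    sum_first_second = sum(score),                     # second attempts count as 1
    .groups = "drop"
  )
# "Accuracy after feedback" at text level is sum_first_second - sum_first

# Person level: percentages across all texts per participant and question type
person_level <- items %>%
  group_by(participant, question_type) %>%
  summarise(
    pct_first_correct  = 100 * mean(score == 2),
    # correct second attempts out of all items that required feedback
    pct_after_feedback = 100 * mean(score[score < 2] == 1),
    .groups = "drop"
  )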

2.3.2. Standardized reading comprehension test

The Dutch primary school curriculum requires the regular assessment of reading comprehension. For this purpose, the Dutch National Institute for Educational Measurement (Cito) offers a student progress monitoring system with assessments for the midterm and end of a grade. As the assessments are based on item-response-theory calibrated item banks, the scores across grades are estimated on one ability scale and can be used to measure growth. In the standardized reading comprehension test, the children had to read short texts and answered one or more multiple-choice questions with four answer options per text. The questions tapped into comprehension, interpretation, evaluation, and summary of information from texts. The reliability of the assessment was high (Cronbach's alphas ≥ 0.83; Feenstra et al., 2010; Tomesen et al., 2018).

2.4. Data analysis

The analyses were conducted with the package lme4 in R (Bates, Mächler, et al., 2015; R Core Team, 2018). To answer the first research question, we investigated with a generalized linear mixed effect model how the accuracy on the first attempts and after feedback for each componential ability question of a text affected the probability to answer the respective global text comprehension question correctly. The accuracy on the first attempts and after feedback were highly correlated. Therefore, we used two separate models with either the sum scores for the first attempts only or both the first and second attempts per predictor. By adding random intercepts for participants and texts, we took the dependence of observations within participants (ICC = 0.19) and texts (ICC = 0.08) into account. The low dependence of observations within classes, schools, and grades (ICC ≤ 0.02) was not modeled. Random slopes and the random correlation terms of experimental predictors were added if they significantly improved the model fit (p < .05; Barr et al., 2013). We tested the significance of fixed effects with the function mixed of the package afex (Singmann et al., 2018). This computes likelihood ratio tests between models with the predictor of interest and models without the predictor of interest. It is examined whether the addition of a predictor significantly increases the model fit (see Footnote 4). Odds ratios of 1.68, 3.47, and 6.71 were considered small, medium, and large effects, respectively (Chen et al., 2010).
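The following R sketch shows the general form of this model for the sum scores on first attempts, using simulated data and hypothetical column names; it is a minimal illustration, not the authors' code (which additionally tested random slopes and correlation terms).

library(lme4)
library(afex)

# Simulated stand-in for the real data: one row per participant x text
set.seed(1)
n_part <- 40; n_text <- 25
texts_long <- data.frame(
  participant   = factor(rep(seq_len(n_part), each = n_text)),
  text          = factor(rep(seq_len(n_text), times = n_part)),
  ortho_first   = sample(0:6, n_part * n_text, replace = TRUE),
  vocab_first   = sample(0:4, n_part * n_text, replace = TRUE),
  sentint_first = sample(0:2, n_part * n_text, replace = TRUE)
)
texts_long$global_correct <- rbinom(
  nrow(texts_long), 1, plogis(-0.5 + 0.3 * texts_long$sentint_first)
)

# Logistic mixed model with random intercepts for participants and texts
m_first <- glmer(
  global_correct ~ ortho_first + vocab_first + sentint_first +
    (1 | participant) + (1 | text),
  data = texts_long, family = binomial,
  control = glmerControl(optimizer = "bobyqa")
)
exp(fixef(m_first))   # fixed effects expressed as odds ratios

# Likelihood ratio test per fixed effect, analogous to the afex::mixed approach
lrt <- mixed(
  global_correct ~ ortho_first + vocab_first + sentint_first +
    (1 | participant) + (1 | text),
  data = texts_long, family = binomial, method = "LRT"
)
lrt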

To partial out the effect of the accuracy after feedback, we compared the fit of a model including the sum scores for the first and second attempts for all question types (baseline) to models from which, respectively, the second attempts for the question type of interest were excluded (i.e., sum score for only the first attempts as fixed effect), while keeping the models comparable otherwise. By this, the accuracy after feedback effect was investigated on top of the accuracy on the first attempts for the same question type as well as the accuracy on the first attempts and after feedback for all other question types. An example is presented below:

Fixed predictors of the baseline model: sum score for the first and second attempts on the orthographic knowledge questions + sum score for the first and second attempts on the vocabulary questions + sum score for the first and second attempts on the sentence-integration questions

Fixed predictors of the comparison model for vocabulary questions: sum score for the first and second attempts on the orthographic knowledge questions + sum score for the first attempts on the vocabulary questions + sum score for the first and second attempts on the sentence-integration questions

Since the models were not nested, likelihood ratio tests were not possible, but goodness-of-fit statistics were compared. The goodness-of-fit is often described with Akaike's information criterion (AIC) and the Bayesian information criterion (BIC), which indicate the amount of unexplained variance corrected for the number of predictors in the model (Burnham & Anderson, 2002; Field et al., 2012). Thus, the better fitting model has smaller AIC and BIC values compared to another model. BIC can be considered similar to the Bayes factor (albeit being on a different scale), which quantifies the evidence for an alternative hypothesis in contrast to the null hypothesis (Burnham & Anderson, 2002; Kass & Raftery, 1995). BIC differences can, therefore, be used to test hypotheses. A worse fit of the model without second attempts for the question type of interest (higher BIC) than the baseline model was considered positive evidence that correct second attempts after feedback for the question type under investigation increased the probability of a correct response to the global text comprehension question over and above all other effects. Model differences in BIC between 2 and 6 were considered positive evidence, between 6 and 10 strong positive evidence, and above 10 very strong positive evidence (Kass & Raftery, 1995).
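Continuing the sketch above, the BIC comparison for one question type (here vocabulary) could look as follows; the *_both columns, holding the sum scores for first and second attempts, are simulated only so the code runs and are not the real data.

# Hypothetical sum scores for first and second attempts (first-attempt score
# plus up to a few items answered correctly after feedback, capped at the maximum)
n_obs <- nrow(texts_long)
texts_long$ortho_both   <- pmin(texts_long$ortho_first   + sample(0:2, n_obs, TRUE), 6)
texts_long$vocab_both   <- pmin(texts_long$vocab_first   + sample(0:2, n_obs, TRUE), 4)
texts_long$sentint_both <- pmin(texts_long$sentint_first + sample(0:1, n_obs, TRUE), 2)

baseline <- glmer(
  global_correct ~ ortho_both + vocab_both + sentint_both +
    (1 | participant) + (1 | text),
  data = texts_long, family = binomial
)
# Identical model, except that for vocabulary only the first attempts count
no_vocab_feedback <- update(baseline, . ~ . - vocab_both + vocab_first)

# BIC difference (comparison minus baseline): 2-6 = positive, 6-10 = strong,
# > 10 = very strong evidence for an accuracy after feedback effect
BIC(no_vocab_feedback) - BIC(baseline)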

To answer the second research question, we investigated with a linear mixed effect model how the posttest scores on the standardized reading comprehension test were predicted by the accuracy on the first attempts and after feedback for each componential ability question in the assessment over and above the pretest scores on the standardized reading comprehension test. In step 1 of the analyses, we added the reading comprehension scores at pretest as a predictor. In step 2, the accuracy on the first attempt and after feedback for each componential ability question was entered into the model. Four participants were excluded because they did not require feedback on some question types. The same procedure for the inclusion of random effects and significance testing was used as in the analysis for the first research question. Although the dependence of observations was relatively high in classes (ICC = 0.35) and schools (ICC = 0.19), we included only random intercepts for classes to maintain model parsimony and convergence. Standardized estimates of 0.2, 0.3, and 0.5 were interpreted as small, medium, and large effects, respectively (Field et al., 2012).
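A comparable R sketch for this growth model, again with simulated data and hypothetical column names (the paper additionally tested random slopes where they improved model fit):

library(lme4)

set.seed(2)
n <- 271
persons <- data.frame(
  classroom         = factor(rep(1:17, length.out = n)),
  pre_rc            = rnorm(n, 167, 30),
  ortho_first_pct   = rnorm(n, 60, 17),
  ortho_fb_pct      = rnorm(n, 69, 13),
  vocab_first_pct   = rnorm(n, 74, 14),
  vocab_fb_pct      = rnorm(n, 81, 14),
  sentint_first_pct = rnorm(n, 61, 19),
  sentint_fb_pct    = rnorm(n, 64, 23)
)
persons$post_rc <- 0.7 * persons$pre_rc + 0.3 * persons$sentint_first_pct + rnorm(n, 40, 15)

# Center the continuous predictors, as in the paper
pred_cols <- setdiff(names(persons), c("classroom", "post_rc"))
persons[pred_cols] <- scale(persons[pred_cols], scale = FALSE)

# Step 2 model: pretest plus first-attempt and after-feedback accuracy for each
# question type, with a random intercept for classrooms
m_growth <- lmer(
  post_rc ~ pre_rc +
    ortho_first_pct + ortho_fb_pct +
    vocab_first_pct + vocab_fb_pct +
    sentint_first_pct + sentint_fb_pct +
    (1 | classroom),
  data = persons
)
summary(m_growth)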

Footnote 4. Since the difference between the deviances of two models is chi-square distributed, comparisons of the model fit are based on chi-square tests (Field et al., 2012). The difference in the number of parameters between the compared models is indicated by the degrees of freedom.


For all analyses, we centered continuous predictors. If necessary, bound optimization by quadratic approximation was used. If not reported differently, model assumptions were met (Field et al., 2012; Hartig, 2018; Nieuwenhuis, Te Grotenhuis, & Pelzer, 2012).

3. Results

3.1. Descriptive statistics

On average, one assessment session was finished by the participants in 22.47 min (SD = 16.44). The descriptive statistics for the accuracy on the first attempts and after feedback for each question type are provided in Table 2. The scores on the standardized reading comprehension test at pretest (M = 166.84, SD = 30.08) and posttest (M = 180.62, SD = 30.32) were strongly correlated with each other (r = 0.77) and with the percentage of correct responses to the global text comprehension questions (pretest: r = 0.66, posttest: r = 0.71). The reading comprehension test scores were strongly correlated with the percentage of correct first attempts for the componential ability questions (pretest: r = 0.58–0.63, posttest: r = 0.62–0.72) but weakly to moderately with the performance after feedback (pretest: r = 0.24–0.40, posttest: r = 0.28–0.40). These correlations were respectively stronger for vocabulary and sentence-integration questions than for orthographic knowledge questions. The percentage of correct responses to global text comprehension questions correlated strongly with the percentage of correct first attempts of the other question types. This was strongest for sentence-integration questions (r = 0.83), followed by vocabulary questions (r = 0.75), and lowest for orthographic knowledge questions (r = 0.61). For the percentage of correct attempts after feedback, the correlations with global text comprehension were all moderate and slightly stronger for sentence-integration questions (r = 0.42) than for the other question types (r = 0.35). All correlations were significant (p < .001). The inter-correlations were only weak when sum scores for texts were used, due to the different scoring scale. A detailed overview is provided in Supplement C.

3.2. Prediction of global text comprehension by componential abilities within texts

The probability to correctly answer the global text comprehension question of a text was increased by each correctly answered orthographic knowledge, vocabulary, or sentence-integration question on the first attempt within the same text, as indicated by significant effects of the sum scores for the first attempts in Table 3. Thus, the higher the accuracy on the first attempts was, the higher was the probability of a correct response to the global text comprehension question. Although the increase of the probability by a single correctly answered component question can be considered small (OR = 1.07–1.31), this can sum up to large effects if all six questions were answered correctly on the first attempt.

In the second model, the sum scores for the first and second attempts were used as predictors. This unifies the accuracy on the first attempts and after feedback. A higher sum score for the first and second attempts on the vocabulary or sentence-integration questions led to an increased probability to answer the global text comprehension question correctly. The effect of orthographic knowledge questions was only marginal (p = .053). The difference between the models with and without the second attempts in Table 3 indicated the effect of the accuracy after feedback. For example, the increase in odds ratios was 0.15 by each correctly answered sentence-integration question on the second attempt. To quantify the evidence for an effect of the accuracy after feedback, the goodness-of-fit was compared between a baseline model with the second attempts in the sum scores for all question types and an identical model besides excluding the second attempts for the question type under investigation. The goodness-of-fit statistics AIC and BIC of each model are provided in Table 4. Including the second attempts on the sentence-integration questions led to a BIC decrease of 8.50 compared to the model without second attempts. This can be considered strong evidence that a higher accuracy after feedback for the sentence-integration questions increased the probability to correctly answer the global text comprehension question. However, including second attempts for the orthographic knowledge or vocabulary questions did not lead to a decrease of the BIC larger than or equal to 2. Thus, there was no evidence for an accuracy after feedback effect for these question types.

3.3. Prediction of growth in reading comprehension by componential abilities

There was a moderate autoregressive effect (b = 0.43) for the standardized reading comprehension test scores in Step 2, as presented in Table 5. On top of this, the percentage of correct first attempts for the orthographic knowledge, vocabulary, and sentence-integration questions each explained unique variance in the posttest scores of the standardized reading comprehension test in Step 2 with small effect sizes (b = 0.10–0.20). The higher the accuracy was on the first attempts, the higher was the growth in the standardized reading comprehension test from pretest to posttest. For incorrectly answered decoding, vocabulary, and sentence-integration questions, the accuracy after feedback did not significantly influence the growth. The performance on the componential ability questions in the assessment overall explained 12% of the variance in the posttest scores on the standardized reading comprehension tests. Despite a strong correlation between the percentages of correct first attempts for the vocabulary and sentence-integration questions, all VIF values were below 3, which is not considered evidence of problematic multicollinearity (Field et al., 2012; Hair et al., 1995).

Table 2
Means and standard deviations for each question type in the assessment with a dynamic approach.

Question type | % correct, first attempts, M (SD) | % correct, second attempts after feedback, M (SD) | Sum score per text, first attempts, M (SD) | Sum score per text, first and second attempts, M (SD)
Orthographic knowledge | 60.38 (17.10) | 68.87 (12.95) | 3.77 (1.96) | 4.51 (1.44)
Vocabulary | 73.94 (13.57) | 81.47 (14.32) | 2.76 (1.35) | 3.24 (0.90)
Sentence-integration | 61.13 (19.03) | 64.49 (23.26) | 1.32 (0.95) | 1.52 (0.73)
Global text comprehension | 57.79 (19.52) | n.a. | 1.25 (0.97) | n.a.

Note. n = 275 (n = 274 for the percentage of correct second attempts for the vocabulary questions, n = 272 for the percentage of correct second attempts for the sentence-integration questions). The percentage of correct first attempts was estimated on one scale for the different assessment versions with item response theory. The percentage of correct second attempts was based on the raw data and the number of all incorrect first attempts. For the sum scores, we summed up the item scores for each question type for the first attempts (0 = incorrect, 2 = correct) or first and second attempts (0 = incorrect after feedback, 1 = correct after feedback, 2 = correct without feedback). The possible score ranges were 0–6 for the orthographic knowledge questions, 0–4 for the vocabulary questions, and 0–2 for the sentence-integration and global text comprehension questions. n.a. = not available.


4. Discussion

We examined how a dynamic approach to reading comprehension assessment could be used to profile the underlying componential abilities in reading comprehension. Dutch third to fifth graders' componential abilities were assessed within the same texts and provided with feedback, in addition to the global comprehension of these texts.

Aligned with our hypotheses, the accuracy on the first attempts for the orthographic knowledge, vocabulary, and sentence-integration questions each uniquely predicted global text comprehension and growth in a standardized reading comprehension test. The higher the accuracy on the first attempts was for each question type, the higher was the probability to correctly answer the global text comprehension question of a text and the higher was the growth in the standardized reading comprehension test. As expected, correctly answering the sentence-integration question after receiving feedback additionally increased the probability to answer the global text comprehension question correctly within the same text. Against our hypotheses, however, this was not found for the orthographic knowledge and vocabulary questions. When the growth in the standardized reading comprehension test was predicted, no accuracy after feedback effect was significant for any question type, in contrast to our expectations.

4.1. Prediction of variance in reading comprehension by the first attempts' accuracy

The fact that the accuracy on the first attempts for the orthographic knowledge and vocabulary questions predicted the global text comprehension is in line with the findings of Den Ouden et al. (2019). Together with the significant effect of the accuracy on the first attempts for sentence-integration questions in this study, the findings support the construct validity and internal consistency of the assessment. Moreover, we found evidence for its suitability for profiling. Each of the three componential abilities in reading comprehension could explain unique variance in the global text comprehension of a text. The performance pattern on the first attempts, thus, provides an insight into the individuals' strengths and weaknesses, which can indicate the required focus of instruction. Additionally, further evidence for the relevance of both lower-order and higher-order components in reading comprehension is provided (Silva & Cain, 2015).

Also, the findings on the prediction of growth in a standardized reading comprehension test support the usefulness of the assessment for profiling. The accuracy on the orthographic knowledge, vocabulary, and sentence-integration questions on the first attempt each predicted small but unique additional variance in prospective achievements in a standardized reading comprehension test on top of its autoregressive effect.

Table 3
Prediction of probability to correctly answer a global text comprehension question.

Sum scores for first attempts
Fixed effects: Predictor | B (SE) | OR | 95% CI for OR [LL, UL] | χ2(df) model comparison
Intercept | 0.54 (0.11) | 1.71 | [1.37, 2.13] |
Orthographic knowledge question | 0.07 (0.03) | 1.07 | [1.01, 1.13] | 5.98 (1)⁎
Vocabulary question | 0.14 (0.03) | 1.15 | [1.09, 1.22] | 25.65 (1)⁎⁎⁎
Sentence-integration question | 0.27 (0.05) | 1.31 | [1.19, 1.45] | 20.98 (1)⁎⁎⁎
Random effects (s2): participants, intercept = 0.58; texts, intercept = 0.35, sentence-integration question = 0.06, orthographic knowledge question = 0.01

Sum scores for first and second attempts
Fixed effects: Predictor | B (SE) | OR | 95% CI for OR [LL, UL] | χ2(df) model comparison
Intercept | 0.53 (0.11) | 1.68 | [1.35, 2.09] |
Orthographic knowledge question | 0.07 (0.04) | 1.08 | [1.00, 1.16] | 3.76 (1)
Vocabulary question | 0.23 (0.04) | 1.26 | [1.16, 1.37] | 30.15 (1)⁎⁎⁎
Sentence-integration question | 0.38 (0.07) | 1.46 | [1.28, 1.67] | 23.87 (1)⁎⁎⁎
Random effects (s2): participants, intercept = 0.58; texts, intercept = 0.35, sentence-integration question = 0.09, orthographic knowledge question = 0.01

Note. n = 275. Sum scores for first attempts: R2m = 0.038, R2c = 0.26. Sum scores for first and second attempts: R2m = 0.039, R2c = 0.26. The random correlation term was removed for model convergence (Bates, Kliegl, et al., 2015). s2 = variance, B = unstandardized estimate, OR = odds ratio, CI = confidence interval.
⁎ p < .05.
⁎⁎⁎ p < .001.

Table 4
Accuracy after feedback effect: Goodness-of-fit statistics for the model comparisons.

Models | AIC | BIC
Baseline model with sum scores for first and second attempts on the orthographic knowledge, vocabulary, and sentence-integration questions | 7999.7 | 8054.3
Models compared to the baseline model:
Model without the second attempts in the sum scores on the orthographic knowledge questions | 7995.8 | 8050.4
Model without the second attempts in the sum scores on the vocabulary questions | 8000.9 | 8055.6
Model without the second attempts in the sum scores on the sentence-integration questions | 8008.1 | 8062.8

Note. The models without the second attempts in the sum scores on a particular question type were otherwise identical to the baseline model. AIC = Akaike's information criterion, BIC = Bayesian information criterion.


Thus, each of the three components explains variance not only in comprehension within the same texts but also in the development of general reading comprehension ability. In the latter, we tested the contribution of each component in quite a strict way because the performance on the standardized reading comprehension test at pretest may already cover a lot of the variance in the componential abilities (Richter et al., 2013; Silva & Cain, 2015). The assessment seemed to tap into additional abilities which were not covered by the standardized reading comprehension test.

4.2. Prediction of variance in reading comprehension by the accuracy after feedback

Over and above the first attempts, we found an effect of the accuracy after feedback on global text comprehension for the sentence-integration questions but not for the orthographic knowledge and vocabulary questions. This means that only the accuracy after feedback for sentence-integration questions could differentiate between children of various abilities in global text comprehension. It can be concluded that more fine-grained profiling of sentence-integration abilities was possible. Children who could answer the questions correctly after the relevant text part was highlighted did not have difficulties with the integration itself but with identifying the proper information. In contrast, children who did not answer correctly after feedback had problems with integrating the information even after being directed to it. This may reflect a qualitative difference, which may ask for a different instructional focus. Similar subgroups were also reported previously (Cain et al., 2001; McMaster et al., 2012).

It is unclear why the accuracy after feedback on the orthographic knowledge and vocabulary questions did not differentiate between children's global text comprehension. Similar non-significant findings were also reported by Den Ouden et al. (2019), although they considered aggregated scores across texts instead of within texts. One possible explanation is that the word-level feedback was not sufficiently integrated into the text. The pictures provided after incorrect vocabulary questions were not matched with the text context (Carney & Levin, 2002), and the feedback after incorrect orthographic knowledge questions just focused on the form. Processing the word-level feedback thus required the reader to shift the focus away from the text, whereas highlighting sentences was directly linked to the text (Sweller, 2011).

Other reasons might lie in the nature of the feedback (Hattie & Timperley, 2007; Shute, 2008). While the correct answer was previewed briefly for the orthographic knowledge questions, compensation was offered in the vocabulary and sentence-integration questions (i.e., bypassing reading via pictures, taking over the information-selection step). This may cause dissociations in what the accuracy after feedback measured and how this mattered for global text comprehension. While sentence-integration feedback helps to overcome an obstacle in the comprehension process, the accuracy for word-level tasks after feedback may rather evaluate a lexical-knowledge level. The word-level feedback was adapted from training studies and might still be useful to identify different ability levels with respect to the required intensity of instruction (Gruhn, Segers, & Verhoeven, 2019, 2020). In the context of differentiating reading comprehension abilities, feedback should focus less on knowledge-strengthening and more on the processing level, such as via meta-cognition and active learner-involvement (e.g., Ebadi et al., 2018; Rittle-Johnson, 2006).

Finally, the interaction of learners with feedback has been shown to depend on many external influences (Maier et al., 2016; Timmers et al., 2013). This might cause a lot of individual variation in how the feedback was processed and how it impacted the subsequent response behavior. Inspecting this more deeply could reveal even more fine-grained profiles (Grassinger & Dresel, 2017; Nakai & O'Malley, 2015).

We did not find an effect of the accuracy after feedback for the prediction of growth in the standardized reading comprehension test for any of the question types, i.e., children who answered correctly or incorrectly after feedback did not differ in their reading comprehension development. As the accuracy after feedback for the orthographic knowledge and vocabulary questions also did not differentiate children's global text comprehension within the same texts, it is not surprising that this effect was not found for the comprehension of other texts in the standardized test over time.

Table 5
Prediction of standardized reading comprehension test scores at posttest.

Step 1
Fixed effects: Predictor | b | B (SE) | 95% CI for B [LL, UL] | χ2(df) model comparison
Intercept | | 182.68 (2.56) | [177.67, 187.68] |
Pretest reading comprehension | 0.72 | 0.73 (0.07) | [0.59, 0.87] | 33.56 (1)⁎⁎⁎
Random effects (s2): classrooms, intercept = 83.91, pretest reading comprehension = 0.06; residuals = 287.04

Step 2
Fixed effects: Predictor | b | B (SE) | 95% CI for B [LL, UL] | χ2(df) model comparison
Intercept | | 181.97 (2.04) | [177.98, 185.97] |
Pretest reading comprehension | 0.43 | 0.43 (0.06) | [0.31, 0.56] | 26.12 (1)⁎⁎⁎
Orthographic knowledge first attempt | 0.10 | 0.18 (0.08) | [0.02, 0.33] | 5.01 (1)⁎
Orthographic knowledge after feedback | −0.05 | −0.11 (0.08) | [−0.28, 0.06] | 1.76 (1)
Vocabulary first attempt | 0.19 | 0.43 (0.14) | [0.16, 0.69] | 10.11 (1)⁎⁎
Vocabulary after feedback | −0.02 | −0.05 (0.07) | [−0.19, 0.10] | 0.37 (1)
Sentence-integration first attempt | 0.20 | 0.32 (0.09) | [0.15, 0.49] | 13.06 (1)⁎⁎⁎
Sentence-integration after feedback | 0.03 | 0.04 (0.05) | [−0.06, 0.13] | 0.64 (1)
Random effects (s2): classrooms, intercept = 50.68, pretest reading comprehension = 0.03; residuals = 222.85

Note. n = 271 (four participants were excluded because they did not require feedback on some question types). Step 1: R2m = 0.53, R2c = 0.68; Step 2: R2m = 0.65, R2c = 0.74. s2 = variance, b = standardized estimate, B = unstandardized estimate, CI = confidence interval.
⁎ p < .05.
⁎⁎ p < .01.
⁎⁎⁎ p < .001.


This may also explain why the above-described accuracy after feedback effect for sentence-integration questions did not remain significant; the effect was probably too small. Also, in the study of Elleman et al. (2011), the performance on sentence-integration questions including the responsiveness to feedback explained only a small amount of variance in a standardized reading comprehension test. This might have been significant because, in contrast to our study, they did not control for the accuracy on the first attempts.

4.3. Limitations and future investigations

This study has some limitations due to the type of data the assessment entailed. The first and second attempts for different question types of a text were not independent. Therefore, we used random person and text effects in our analyses. For the prediction of growth in the standardized reading comprehension test, however, we had to summarize the performance for each question type because there was only one score available as dependent variable. Model comparisons of goodness-of-fit had to be conducted to circumvent the correlations between first and second attempts at text level. The models were not identical in their underlying covariance structure but differed with respect to the coding of one parameter (sum score with or without second attempts). Since the range of the scores with or without the second attempts was comparable, this is a very small difference.

Despite observing a large variety of different profiles in our study (see Supplement D), we could not provide empirical evidence for them. As about two-thirds of the participants did not perform lower than −1 SD from the grade level mean on any question type, the remaining sample was considered too small to evidence, with a cluster analysis, more fine-grained profiles than those commensurate with the simple view of reading. Follow-up research should include more same-aged children with reading comprehension problems or should increase the assessment's sensitivity toward differences between higher-skilled children. It is also not clear if the componential measures within the same text can indeed capture the interactive online processes in reading comprehension better than isolated tests. In future work, the profiles identified with our dynamic approach or with isolated tests could be compared with respect to congruency and the effectiveness of profile-adapted instructions. Other feedback forms, as well as their influence on response behavior, may be considered to better understand the role of feedback for profiling and to individualize the assessment (Van der Linden & Glas, 2002).

5. Conclusion and implications

We found evidence for the suitability of a dynamic approach for the profiling of children's instructional needs in reading comprehension in terms of the focus and intensity of instruction. The accuracy for orthographic knowledge, vocabulary, and sentence-integration questions on the first attempt predicted the global text comprehension within texts and the growth in a standardized reading comprehension test. The individual performance pattern across the three componential abilities can provide an insight into the required focus of instruction in reading comprehension. Assessing the componential abilities within the same text seemed to tap into additional skills which are not captured by a standardized reading comprehension test. Moreover, the accuracy after feedback for sentence-integration questions further differentiated children's reading comprehension abilities. This difference seemed to indicate a different instructional focus in sentence-integration (i.e., identifying vs. integrating information). Although the accuracy after feedback on the orthographic knowledge and vocabulary questions did not differentiate children's reading comprehension abilities, it may still inform on the required intensity of word-level instruction. More process-related feedback may contribute to the profiling of reading comprehension.

As the presented assessment takes a rather restrictive view of what reading comprehension at the text level entails (i.e., identifying the main message), it may not replace the tasks of existing standardized tests. It should rather be seen as a supplemental tool to profile the individual reader better than standardized and isolated tests can. Although the increasing role of assessment in children's education can be criticized, the present assessment can contribute to the efficiency of instruction at schools.

Author note

Sophie Gruhn, Behavioural Science Institute, Radboud University, Nijmegen, The Netherlands; Eliane Segers, Behavioural Science Institute, Radboud University, Nijmegen, The Netherlands, Instructional Technology, University of Twente, Enschede, The Netherlands; Jos Keuning, Cito, Institute for Educational Measurement, Arnhem, The Netherlands; Ludo Verhoeven, Behavioural Science Institute, Radboud University, Nijmegen, The Netherlands, Faculty of Arts, University of Curaçao, Curaçao.

Funding

This work was supported by The Netherlands Initiative for Education Research (Nationaal Regieorgaan Onderwijsonderzoek) [grant number NWO 405-15-548].

Declaration of competing interest

None.

Acknowledgements

This work was based on a cooperation between the Radboud University (Ludo Verhoeven, Eliane Segers, Sophie Gruhn), Cito and Twente University (Theo Eggen, Jos Keuning, Marije den Ouden), Expertisecentrum Nederlands (Nicole Heister-Swart), and Kennisinstituut voor Taalontwikkeling (ITTA, Femke Scheltinga).

Data accessibility statement

Data available on request from the authors.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.lindif.2020.101923.

References

Aaron, P. G. (1991). Can reading disabilities be diagnosed without using intelligence tests? Journal of Learning Disabilities, 24(3), 178–186. https://doi.org/10.1177/002221949102400306.

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278.https://doi.org/10.1016/j.jml.2012.11.001.

Bates, D., Kliegl, R., Vasishth, S., & Baayen, H. (2015). Parsimonious mixed models. arXiv:1506.04967. Retrieved from https://arxiv.org/abs/1506.04967.

Bates, D., Mächler, M., Bolker, B. M., & Walker, S. C. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01.

Bishop, D. V. M., & Snowling, M. J. (2004). Developmental dyslexia and specific language impairment: Same or different? Psychological Bulletin, 130(6), 858–886. https://doi.org/10.1037/0033-2909.130.6.858.

Burnham, K. P., & Anderson, D. R. (2002). Model selection and multimodel inference: A practical information-theoretic approach (2nd ed.). New York, NY: Springer.

Caffrey, E., Fuchs, D., & Fuchs, L. S. (2008). The predictive validity of dynamic assessment: A review. The Journal of Special Education, 41(4), 254–270. https://doi.org/10.1177/0022466907310366.


Cain, K., & Oakhill, J. (2006). Profiles of children with specific reading comprehension difficulties. British Journal of Educational Psychology, 76(4), 683–696. https://doi.org/10.1348/000709905X67610.

Cain, K., & Oakhill, J. (2012). Reading comprehension development from seven to fourteen years: Implications for assessment. In J. P. Sabatini, E. R. Albro, & T. O’Reilly (Eds.). Measuring up: Advances in how to assess reading ability (pp. 59–75). Lanham, MD: Rowman & Littlefield Education.

Cain, K., & Oakhill, J. (2014). Reading comprehension and vocabulary: Is vocabulary more important for some aspects of comprehension? L'Année Psychologique, 114(4), 647–662. https://doi.org/10.4074/S0003503314004035.

Cain, K., Oakhill, J. V., Barnes, M. A., & Bryant, P. E. (2001). Comprehension skill, inference-making ability and their relation to knowledge. Memory & Cognition, 29(6), 850–859. https://doi.org/10.3758/BF03196414.

Cain, K., Oakhill, J. V., & Elbro, C. (2003). The ability to learn new word meanings from context by school-age children with and without language comprehension difficulties. Journal of Child Language, 30(3), 681–694. https://doi.org/10.1017/S0305000903005713.

Carney, R. N., & Levin, J. R. (2002). Pictorial illustrations still improve students' learning from text. Educational Psychology Review, 14(1), 5–26. https://doi.org/10.1023/A:1013176309260.

Chen, H., Cohen, P., & Chen, S. (2010). How big is a big odds ratio? Interpreting the magnitudes of odds ratios in epidemiological studies. Communications in Statistics - Simulation and Computation, 39(4), 860–864. https://doi.org/10.1080/03610911003650383.

Clarke, P. J., Henderson, L. M., & Truelove, E. (2010). The poor comprehender profile: Understanding and supporting individuals who have difficulties extracting meaning from text. Advances in Child Development and Behavior, 39, 79–129. https://doi.org/10.1016/B978-0-12-374748-8.00003-2.

Colenbrander, D., Kohnen, S., Smith-Lock, K., & Nickels, L. (2016). Individual differences in the vocabulary skills of children with poor reading comprehension. Learning and Individual Differences, 50, 210–220. https://doi.org/10.1016/j.lindif.2016.07.021.

Colenbrander, D., Nickels, L., & Kohnen, S. (2017). Similar but different: Differences in comprehension diagnosis on the Neale Analysis of Reading Ability and the York Assessment of Reading for Comprehension. Journal of Research in Reading, 40(4), 403–419. https://doi.org/10.1111/1467-9817.12075.

Compton, D. L., Gilbert, J. K., Jenkins, J. R., Fuchs, D., Fuchs, L. S., Cho, E., ... Bouton, B. (2012). Accelerating chronically unresponsive children to tier 3 instruction: What level of data is necessary to ensure selection accuracy? Journal of Learning Disabilities, 45(3), 204–216.https://doi.org/10.1177/0022219412442151.

Daugaard, H. T., Cain, K., & Elbro, C. (2017). From words to text: Inference making mediates the role of vocabulary in children’s reading comprehension. Reading & Writing, 30, 1773–1788.https://doi.org/10.1007/s11145-017-9752-2.

Den Ouden, M., Keuning, J., & Eggen, T. (2019). Fine-grained assessment of children's text comprehension skills. Frontiers in Psychology, 10, 1–12. https://doi.org/10.3389/fpsyg.2019.01313.

Dörfler, T., Golke, S., & Artelt, C. (2017). Evaluating prerequisites for the development of a dynamic test of reading competence: Feedback effects on reading comprehension in children. In D. Leutner, J. Fleischer, J. Grünkorn, & E. Klieme (Eds.). Competence assessment in education: Research, models and instruments. Methodology of educational measurement and assessment (pp. 487–503). Cham, Switzerland: Springer.

Eason, S. H., Goldberg, L. F., Young, K. M., Geist, M. C., & Cutting, L. E. (2012). Reader-text interactions: How differential text and question types influence cognitive skills needed for reading comprehension. Journal of Educational Psychology, 104(3), 515–528. https://doi.org/10.1037/a0027182.

Ebadi, S., Weisi, H., Monkaresi, H., & Bahramlou, K. (2018). Exploring lexical inferencing as a vocabulary acquisition strategy through computerized dynamic assessment and static assessment. Computer Assisted Language Learning, 31(7), 790–817. https://doi.org/10.1080/09588221.2018.1451344.

Elleman, A. M. (2017). Examining the impact of inference instruction on the literal and inferential comprehension of skilled and less skilled readers: A meta-analytic review. Journal of Educational Psychology, 109(6), 761–781. https://doi.org/10.1037/edu0000180.

Elleman, A. M., Compton, D. L., Fuchs, D., Fuchs, L. S., & Bouton, B. (2011). Exploring dynamic assessment as a means of identifying children at risk of developing comprehension difficulties. Journal of Learning Disabilities, 44(4), 348–357. https://doi.org/10.1177/0022219411407865.

Elleman, A. M., Lindo, E. J., Morphy, P., & Compton, D. L. (2009). The impact of vocabulary instruction on passage-level comprehension of school-age children: A meta-analysis. Journal of Research on Educational Effectiveness, 2(1), 1–44. https://doi.org/10.1080/19345740802539200.

Feenstra, H., Kleintjes, F., Kamphuis, F., & Krom, R. (2010). Leerling en onderwijsvolgsysteem. Begrijpend lezen. Groep 3 t/m 6 [Pupil and instruction monitoring system. Reading comprehension. Grade 1 to 4]. Arnhem, The Netherlands: Cito.

Field, A., Miles, J., & Field, Z. (2012). Discovering statistics using R. London, England: Sage Publications.

Francis, D. J., Kulesz, P. A., & Benoit, J. S. (2018). Extending the simple view of reading to account for variation within readers and across texts: The complete view of reading (CVRi). Remedial and Special Education, 39(5), 274–288. https://doi.org/10.1177/0741932518772904.

Fuchs, D., Fuchs, L. S., & Compton, D. L. (2012). Smart RTI: A next-generation approach to multilevel prevention. Exceptional Children, 78(3), 263–279. https://doi.org/10.1177/001440291207800301.

Gough, P. B., & Tunmer, W. E. (1986). Decoding, reading, and reading disability. RASE: Remedial & Special Education, 7(1), 6–10. https://doi.org/10.1177/074193258600700104.

Grassinger, R., & Dresel, M. (2017). Who learns from errors on a class test? Antecedents and profiles of adaptive reactions to errors in a failure situation. Learning and Individual Differences, 53, 61–68. https://doi.org/10.1016/j.lindif.2016.11.009.

Gruhn, S., Segers, E., & Verhoeven, L. (2020). Moderating role of reading comprehension in children's word learning with context versus pictures. Journal of Computer Assisted Learning, 36(1), 29–45. https://doi.org/10.1111/jcal.12387.

Gruhn, S., Segers, E., & Verhoeven, L. (2019). The efficiency of briefly presenting word forms in a computerized repeated spelling training. Reading & Writing Quarterly: Overcoming Learning Difficulties, 35(3), 225–242. https://doi.org/10.1080/10573569.2018.1526725.

Gustafson, S., Svensson, I., & Fälth, L. (2014). Response to intervention and dynamic assessment: Implementing systematic, dynamic and individualised interventions in primary school. International Journal of Disability, Development and Education, 61(1), 27–43.https://doi.org/10.1080/1034912X.2014.878538.

Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (1995). Multivariate data analysis. Englewood Cliffs, NJ: Prentice Hall.

Hakkarainen, A., Holopainen, L., & Savolainen, H. (2013). Mathematical and reading difficulties as predictors of school achievement and transition to secondary education. Scandinavian Journal of Educational Research, 57(5), 488–506. https://doi.org/10.1080/00313831.2012.696207.

Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage Publications.

Hartig, F. (2018). DHARMa: Residual diagnostics for hierarchical (multi-level/mixed) regression models (Version 0.2.0). Retrieved from https://CRAN.R-project.org/package=DHARMa.

Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81–112.https://doi.org/10.3102/003465430298487.

Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90(430), 773–795. Retrieved from https://www.jstor.org/stable/2291091.

Keenan, J. M., Betjemann, R. S., & Olson, R. K. (2008). Reading comprehension tests vary in the skills they assess: Differential dependence on decoding and oral comprehension. Scientific Studies of Reading, 12(3), 281–300. https://doi.org/10.1080/10888430802132279.

Kintsch, W. (2012). Psychological models of reading comprehension and their implications for assessment. In J. P. Sabatini, E. R. Albro, & T. O'Reilly (Eds.). Measuring up: Advances in how to assess reading ability (pp. 21–37). Lanham, MD: Rowman & Littlefield Education.

Kleinsz, N., Potocki, A., Ecalle, J., & Magnan, A. (2017). Profiles of French poor readers: Underlying difficulties and effects of computerized training programs. Learning and Individual Differences, 57, 45–57. https://doi.org/10.1016/j.lindif.2017.05.009.

Landi, N., & Ryherd, K. (2017). Understanding specific reading comprehension deficit: A review. Language and Linguistics Compass, 11(2), Article e12234, 1–24. https://doi.org/10.1111/lnc3.12234.

Maier, U., Wolf, N., & Randler, C. (2016). Effects of a computer-assisted formative assessment intervention based on multiple-tier diagnostic items and different feedback types. Computers & Education, 95, 85–98. https://doi.org/10.1016/j.compedu.2015.12.002.

McMaster, K. L., Van den Broek, P., Espin, C. A., White, M. J., Rapp, D. N., Kendeou, P., ... Carlson, S. (2012). Making the right connections: Differential effects of reading intervention for subgroups of comprehenders. Learning and Individual Differences, 22(1), 100–111. https://doi.org/10.1016/j.lindif.2011.11.017.

Mislevy, R. J., & Sabatini, J. P. (2012). How research on reading and research on assessment are transforming reading assessment (or if they aren't, how they ought to). In J. P. Sabatini, E. R. Albro, & T. O'Reilly (Eds.). Measuring up: Advances in how to assess reading ability (pp. 119–134). Lanham, MD: Rowman & Littlefield Education.

Nakai, Y., & O'Malley, A. L. (2015). Feedback to know, to show, or both? A profile approach to the feedback process. Learning and Individual Differences, 43, 1–10. https://doi.org/10.1016/j.lindif.2015.08.028.

Nation, K., Clarke, P., Marshall, C. M., & Durand, M. (2004). Hidden language impairments in children: Parallels between poor reading comprehension and specific language impairment? Journal of Speech, Language, and Hearing Research, 47(1), 199–211 (doi: 1092-4388/04/4701-0199).

Nation, K., & Snowling, M. J. (1998). Individual differences in contextual facilitation: Evidence from dyslexia and poor reading comprehension. Child Development, 69(4), 996–1011.https://doi.org/10.1111/j.1467-8624.1998.tb06157.x.

Navarro, J.-J., & Mora, J. (2011). Analysis of the implementation of a dynamic assessment device of processes involved in reading with learning-disabled children. Learning and Individual Differences, 21(2), 168–175.https://doi.org/10.1016/j.lindif.2010.11.008.

Nieuwenhuis, R., Te Grotenhuis, M., & Pelzer, B. (2012). influence.ME: Tools for detecting influential data in mixed effects models. The R Journal, 4(2), 38–47.

Oakhill, J. V., & Cain, K. (2012). The precursors of reading ability in young readers: Evidence from a four-year longitudinal study. Scientific Studies of Reading, 16(2), 91–121. https://doi.org/10.1080/10888438.2010.529219.

Perfetti, C., & Adlof, S. M. (2012). Reading comprehension: A conceptual framework from word meaning to text meaning. In J. P. Sabatini, E. R. Albro, & T. O’Reilly (Eds.). Measuring up: Advances in how to assess reading ability (pp. 3–20). Lanham, MD: Rowman & Littlefield Education.

Perfetti, C., & Stafura, J. (2014). Word knowledge in a theory of reading comprehension. Scientific Studies of Reading, 18(1), 22–37. https://doi.org/10.1080/10888438.2013.827687.

Perfetti, C. A. (1999). Comprehending written language: A blueprint of the reader. In C. M. Brown, & P. Hagoort (Eds.). The neurocognition of language (pp. 167–208). Oxford, England: Oxford University Press.

Perfetti, C. A., & Hart, L. (2002). The lexical quality hypothesis. In L. Verhoeven, C. Elbro, & P. Reitsma (Eds.). Precursors of functional literacy (pp. 189–213). Amsterdam, The Netherlands: John Benjamins.
