Dynamic testing in practice: shall I give you a hint?

Bosma, T.

Citation

Bosma, T. (2011, June 22). Dynamic testing in practice: shall I give you a hint? Retrieved from https://hdl.handle.net/1887/17721

Version: Not Applicable (or Unknown)

License: Licence agreement concerning inclusion of doctoral thesis in the Institutional Repository of the University of Leiden

Downloaded from: https://hdl.handle.net/1887/17721

Note: To cite this publication please use the final published version (if applicable).


CHAPTER 5

Bridging the Gap Between Diagnostic Assessment and Classroom Practice

The contents of this chapter are published in:

Bosma, T., & Resing, W. C. M. (2008). Journal of Cognitive Education and Psychology, 7, 174-196.


Abstract

This study was designed to investigate teachers' opinions of and responses to reports and recommendations based on dynamic assessment or traditional assessment as part of a psycho-diagnostic procedure. One hundred six typical first grade elementary school children participated, as well as their 18 teachers, distributed over an experimental, a semi-control, and a control condition. Children were administered either a dynamic test (Learning Potential Test for Inductive Reasoning) or a static test (Raven's Progressive Matrices). Teachers were observed and interviewed and asked to estimate the learning potential of each child, prior to assessment as well as after recommendations, to assess possible changes and determine their responses to the provided reports and recommendations. Results showed teachers' appreciation for specific contents of the dynamic assessment reports (e.g., learning potential, need for and type of instruction). Teachers rated most recommendations as applicable and even changed some aspects of their teaching practices in response to recommendations, suggesting that dynamic assessment provides the tools to link assessment to teaching practice.

Acknowledgements

We thank J. Harmony, L. Hoek, A. Huisman, J. Kemper and N. Warmerdam for helping us collect and code data.


Introduction

The primary purpose of this study was to investigate teachers' evaluations of, and gain insight into their opinions regarding, two types of psycho-educational reports. Our attention was focused in particular on possible changes in teachers' educational practice after reports and recommendations based on static versus dynamic assessment were provided. Teachers' evaluations were studied through a (semi-)diagnostic process within the regular school setting, including classroom observations and assessments of typically behaving elementary school children in their own classroom environment.

Teachers are often capable of adapting their educational practices to the idiosyncratic needs of pupils in their classrooms, for example by offering the curriculum in smaller steps to children with learning difficulties or by offering instruction in more compact and abstract forms to gifted children. Nevertheless, sometimes these adaptations or changes in classroom instruction and feedback appear not to have the expected effect.

In these instances, teachers might consult an educational (or school) psychologist and ask advice about the specialized types and contents of instruction a particular child needs, including explanations for a child's disappointing responses to instruction and intervention (Bosma & Resing, 2006; Grigorenko, 2009). In response to such a referral, a psychologist will often start a psychodiagnostic process, which includes a cycle of diagnostic phases, from information collection (intake) to formulating and testing hypotheses (De Bruyn, Ruijssenaars, Pameijer & Van Aarle, 2003; Pameijer, 2006), ideally resulting in conclusions and practical educational recommendations (Resing, Ruijssenaars & Bosma, 2002). A written report is usually one of the important means of communicating the results of the psychodiagnostic process, the answers to the referral question, and recommendations for instruction and interventions to address the needs of a particular child.


Although the process and reporting outlined above are often considered to be an ideal form of psychodiagnostic assessment, Hagborg and Aiello-Coultier (1994) have pointed out that there is a discrepancy in the perceived usefulness of diagnostic reports by teachers. These reports often lack the practical and concrete recommendations teachers expect (Haywood & Lidz, 2007). One possible reason for this discrepancy lies in the function of the report; psychologists often use reports to clarify and analyze problems, whereas teachers perceive them as a confirmation of experienced problems or as a source of practical recommendations. The purpose of the diagnostic instruments used by educational psychologists might be related to this discrepancy; most instruments have not been specifically developed for the purpose of guiding interventions and instruction (e.g., Lidz, 1991; Resing, 2006; Sternberg & Grigorenko, 2002; Tzuriel, 2000a). Although assessment has long been, and still is, the main "product" of educational psychologists, the conventional instruments chosen to be administered in an assessment, such as intelligence tests, have mainly been developed for classification of children, for identification or for determining eligibility for special education services, or for clarification of learning difficulties (e.g., Elliott, 2003). These traditional instruments focus mostly on quantitative assessment elements (scores) and children's deficits, whereas information about learning processes, specific learning strategies, and other, more qualitative aspects is hardly taken into account (e.g., Lidz, 1992; Lidz & Elliott, 2000; Resing, 1990, 1998; Tzuriel, 2000a).

To address this incompleteness of information from traditional instruments, various dynamic testing and assessment measures have been developed, all of which have in common that an intervention or training phase is part of the testing process (e.g., Elliott, Grigorenko & Resing, 2010; Lidz & Elliott, 2000; Resing et al., 2002; Sternberg & Grigorenko, 2002). In such an intervention phase, a child's response to instruction and feedback on cognitive tasks (e.g., solving analogies, block design tasks, or a simple math task) is observed, mostly by comparing pre- and posttest scores, as well as by the amount of help a child received during the intervention phase.

Some dynamic assessment procedures follow a rather clinical approach in which the intervention phase is guided by an experienced clinician and by the specific characteristics and needs of a child (e.g., Lidz, 1992, 2002; Tzuriel, 2000a, 2001), whereas others include a more structured and standardized training. A very promising approach in using the amount of provided feedback as an indication of a child's potential is the graduated prompts approach (e.g., Campione & Brown, 1987; Elliott et al., 2010; Peña, 2000; Resing, 1993, 2000, 2006). In this dynamic assessment approach, feedback is provided according to a hierarchically structured hints system, which is based on a cognitive task analysis (e.g., Sternberg, 1985). Both the type and amount of instruction, including the strategies the child uses to solve the problems, have been shown to provide valuable information that can be used to inform classroom recommendations (e.g., Bosma & Resing, 2006; Haywood & Lidz, 2007; Resing et al., 2002). The Learning Potential Test for Inductive Reasoning (LIR; Resing, 1990, 1997, 2000), which was used in the present study, is an example of such a dynamic test with a structured hint protocol.

Results of various studies (e.g., Hessels-Schlatter, 2002a; Lidz, 2002; Resing, 1990, 1997; Tzuriel, 2000a) focusing on the effects and usability of dynamic tests, from clinical to structured approaches, have demonstrated insights into children's learning potential, their need for instruction, and their response to feedback. A reversal transfer task added to the LIR dynamic testing procedure, in which children were asked to construct a problem similar to those they had previously encountered in the dynamic test and to instruct the experimenter (in the role of child), giving hints regarding how to solve it, has been shown to provide additional information for guiding recommendations (Bosma & Resing, 2006). Conclusions from studies of dynamic testing regarding improved insight into children's potential and for guiding classroom recommendations are certainly promising. However, to date, dynamic tests have not been fully incorporated into psychologists' practice (Elliott, 2003; Grigorenko, 2009; Jeltova et al., 2007).

The shortcomings of psychological reporting have been the focus of several studies. In addition to investigating the readability of psychological reports (e.g., Harvey, 1997; Witt, Moe, Gutkin, & Andrews, 1984), preferences for specific types of reports also have been investigated. A comparison of teachers' ratings of observational versus test-based reporting and recommendations concerning a child referred with behavior problems showed that teachers rated both reports as equally helpful (Salvagno & Teglasi, 1987). However, teachers preferred interpretative information over factual information and rated concrete recommendations, which included a verbatim statement a teacher could use in actual practice, as most useful in both reports. These outcomes reflect a common and general critique of psychological reports: that recommendations are often too few and not sufficiently concrete (e.g., Bosma & Resing, 2006; Hagborg & Aiello-Coultier, 1994; Haywood & Lidz, 2007; Tzuriel, 2000a).

Nevertheless, psychological reports are still the most common way to communicate results of psychodiagnostic procedures. We have concentrated our study, therefore, on teachers' opinions regarding recommendations based on either dynamic or traditional (standard) assessment. Our first specific focus was on the type of information teachers considered helpful for guiding classroom practices and for planning interventions. Research by Delclos, Burns, and Vye (1993) revealed earlier that information conveyed in reports based on dynamic assessment is more appealing to teachers, because this kind of reporting, in principle, provides insights into children's learning characteristics, realistic expectations, and suggestions for interventions.

Hulburt (1995) investigated preschool teachers' ratings of information in different types of reports regarding applicability in planning interventions in preschool settings. Information in curriculum-based and dynamic assessment reports was preferred most, especially information on skills and monitoring progress, understanding a child's difficulties, and recommendations. In standard reports, the diagnosis was considered to be an important aspect of the psychodiagnostic process, and even more interestingly, information on learning processes and teaching strategies, as specified in reports based on dynamic assessment, was rated as most valuable compared to all other information. These outcomes are consistent with the results of an evaluation by special education coordinators in the UK (Freeman & Miller, 2001) regarding norm-referenced, criterion-referenced, and dynamic assessment reporting. Their results revealed that, although reports based on dynamic assessment were relatively unfamiliar to the coordinators, they rated the information as potentially useful in understanding student difficulties and in constructing individual education plans. Based on the results of the above studies, we expected that information related to dynamic assessment (e.g., need for instruction and type of help) would be rated by teachers as highly valuable in planning interventions.

The majority of studies to date regarding report preferences have focused on evaluating written reports on individual, imaginary children with whom teachers had not actually worked in their classroom. In the present study, therefore, we recruited teachers and a portion of their pupils who took part in a simplified but controlled diagnostic process, including assessment, observations, and a written psycho-diagnostic report. Several weeks after recommendations were provided, teachers were asked to rate the information conveyed in these reports and the recommendations.

A second interest in our study was teacher expectations regarding the learning potential of the children in their classroom. Teachers sometimes tend to underestimate children's potential, and dynamic assessment results can be used to demonstrate their actual learning potential (e.g., Hessels, 1997, 2000). Delclos, Burns, and Kulewicz (1987) reported in this context that information derived from dynamic assessment results might lead to changes in teacher expectations of a child's abilities and to more realistic expectations of his or her learning potential. We expected, therefore, that teachers would change their estimation of a child's learning potential in response to the dynamic assessment reports and that teachers receiving only standard reports would show smaller changes in these expectations than teachers receiving dynamic assessment reports.

Although teachers might rate information as potentially useful for guiding their classroom practice, they may or may not implement these recommendations and will not necessarily make changes in their daily classroom practice. Development of expertise, stages of development, and the ability to reflect on experience are some factors that explain differences among teachers in changing their teaching (Richardson & Placier, 2001). To investigate whether teachers changed their practice in response to the reported results of assessment and to the recommendations provided, we compared observations of teaching style and teacher-child interaction prior to the assessment to observations several weeks after recommendations were given.

Method

Participants

One hundred and six first grade elementary school children (56 boys and 50 girls) with a mean age of 7 years and 6 months (SD = 5.2 months; range = 67 to 94 months) and their 18 teachers (all female), with 5 to 10 or more years of teaching experience, participated in this study. Children and teachers were recruited from 14 elementary schools in the South-West of the Netherlands. Schools were approached by telephone and letter. When directors and teachers agreed to participate, 8 children per classroom were selected from an alphabetical (classroom) name list (every third child on the list), after which parental permission for participation was requested. Of the 144 children selected, 106 children were allowed to participate (on average 6 out of 8 children).

Design and Procedure

To obtain a global measure of children's general cognitive-intellectual abilities, Raven's Progressive Matrices, black and white version (Raven, Court, & Raven, 1979) was administered as a pretest. Depending on their Raven PM score, gender, and age, the children were assigned to an experimental, a semi-control or a control condition.

Teachers of children in the experimental condition also participated in the semi-control condition. In this way, differences in teaching practice towards individual children could be measured, as well as changes in estimations/expectations of learning potential. Teachers in the "pure" control condition did not have any children in the experimental condition, did not receive any recommendation based on dynamic assessment, and functioned as a comparison group to teachers who did have information about dynamic assessment. The distribution of teachers and children over the three conditions and the administrative order of the various measurements are displayed in Table 5.1.

Before the formal assessment took place, teachers completed for every participating child a Dutch version of the School Behavior Checklist (SCHOBL-R; Bleichrodt, Resing & Zaal, 1993; Resing, Bleichrodt & Dekker, 1999), including an additional, newly constructed Learning Potential scale. In addition, experimenters (five psychology students) conducted an interview with the teacher and performed classroom observations. During the interview, teachers were asked to rate the learning potential of each participating child in their classroom on a scale varying from low potential (0) to high potential (6).

In the next few weeks the dynamic assessment measure (LIR) was administered to children in the experimental condition, including a pretest, 3 training sessions, a posttest, and a reversal task. Based on the outcomes of the assessment, questionnaire, and classroom observation, a written report, including recommendations, was handed to each teacher. Reports with specific recommendations based on the dynamic test were provided only for children in the experimental condition. All reports were written by the experimenters under supervision of, and edited by, the first author to ensure similar content, structure, and style. Two to three weeks later classroom observations and questionnaires were repeated, including a second teacher interview and completion of learning potential ratings by the teachers.

Table 5.1. Design of Study

Condition     Teachers   Children   Teacher measures I                       Test           Report     Teacher measures II
Experimental  12         36         Interview, questionnaire, observations   Raven's + LIR  dynamic    Interview, questionnaire, observations
Control 1     (same 12)  34         Interview, questionnaire, observations   Raven's        standard   Interview, questionnaire, observations
Control 2     6          36         Interview, questionnaire, observations   Raven's        standard   Interview, questionnaire, observations

Note: the 12 teachers in the experimental condition were the same teachers as in Control 1 (the semi-control condition).


Measures

Raven's Progressive Matrices (Raven PM). To get an indication of the children's general level of cognitive-intellectual ability before the experiment started, the Standard black and white version of the Raven PM was administered. Raw scores were used for blocking participants over the conditions, and individual results were conveyed in the reports for children from all three conditions.

Learning Potential Test for Inductive Reasoning (LIR). To obtain dynamically administered information on the children's general level of cognitive functioning, the verbal analogy subtest and training of the LIR were administered. The LIR is a standardized dynamic test based on a graduated prompts approach (e.g., Campione & Brown, 1987) for children aged 6 to 8, with a pretest-training-posttest design aimed at assessing the learning potential of a child in the domain of inductive reasoning. The training of the LIR consists of a standardized series of hierarchically structured hints, based on task analyses of inductive reasoning tasks (e.g., Sternberg, 1985), and meta-cognitive hints; it starts with general meta-cognitive hints (e.g., "what do you have to do?") and moves to more concrete, cognitive task-specific hints (e.g., "why do A and B belong together, do you think?"). In the most specific hint, the examiner explains the solution fully to the child.

Measures of learning potential were defined as the test score after training and, more importantly, the minimum number of hints a child needed to solve analogical reasoning problems on his or her own, until a fixed learning criterion was reached (e.g., 4 problems solved without any help). Besides giving a correct answer, the child also was asked to verbalize and justify his or her answer. The number and types of hints, as well as the verbalizations, were considered measures of learning potential and were viewed as providing central information in formulating recommendations for the experimental, dynamically assessed group of children.
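To make the graduated prompts logic concrete, the sketch below shows, in Python, how such a training session could be administered and scored. It is a minimal illustration only: the item and hint structures, the ask_child callback, and the consecutive-run reading of the learning criterion are our own hypothetical assumptions, not the actual LIR protocol.

```python
# Minimal sketch of a graduated-prompts session (hypothetical item and
# hint structures; the actual LIR protocol is more elaborate).
from dataclasses import dataclass, field

@dataclass
class Item:
    prompt: str                                 # e.g., "hand : glove :: foot : ?"
    answer: str
    hints: list = field(default_factory=list)   # ordered general -> task-specific;
                                                # the last hint explains the solution fully

def administer(items, ask_child, criterion=4):
    """Give graduated hints until each item is solved; return total hints used.

    Fewer hints before reaching the learning criterion is read as higher
    learning potential. The criterion is interpreted here (our assumption)
    as a run of consecutive items solved without any hint.
    """
    total_hints = 0
    unaided_run = 0
    for item in items:
        used = 0
        while ask_child(item.prompt) != item.answer and used < len(item.hints):
            print(item.hints[used])  # present the next, more specific hint
            used += 1
        total_hints += used
        unaided_run = unaided_run + 1 if used == 0 else 0
        if unaided_run >= criterion:
            break  # e.g., 4 problems solved without any help
    return total_hints
```

The key outcome mirrors the measure described above: the fewer hints a child needs before reaching the criterion, the higher the estimated learning potential.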

Reversal Task. In addition to the LIR, a reversal task was developed to stimulate and provoke children to demonstrate their understanding of the trained inductive reasoning principle. The child had to construct four analogies for the examiner and had to instruct the examiner regarding how to solve the analogy problems. In a previous study by Bosma and Resing (2006), this reversal task was administered as part of the learning potential test to provide additional information about transfer and teaching skills as well as recommendations for instruction and educational challenges needed.

The number of correctly constructed analogies and the number of described relations were scored during administration and included in the psychological reports of children in the experimental group.

Dutch School Behavior Checklist Revised (SCHOBL-R). In order to measure typical classroom behaviors of children, the SCHOBL-R was completed by all teachers. The questionnaire consists of four main scales covering Extraversion, Agreeableness, Attitude Towards School Work, and Emotional Stability. Teachers had to choose which one of two statements fitted the child best. To measure "learning potential" separately, 13 new items were added to the questionnaire, all having the same format, regarding a child's ability to learn and need for help (e.g., "child needs a lot of help/can work on his own"). A separate factor analysis on these 13 items showed strong unidimensionality, with a mean factor loading of .908 (varying from .851 to .957) and a high internal consistency (α = .98). The questionnaire, with the five scales, was administered at the start of the experiment. The parallel version, including the learning potential scale as well, was given after psychological reports and recommendations were provided. SCHOBL-R scores will be presented separately from those on the learning potential rating scale.
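As a side note, an internal consistency figure like the one reported above can be reproduced in a few lines. The sketch below computes Cronbach's alpha from an item-score matrix; the simulated data are purely hypothetical stand-ins for the 13 learning potential items.

```python
# Cronbach's alpha: alpha = k/(k-1) * (1 - sum(item variances) / var(total)).
# The simulated item matrix below is a hypothetical stand-in.
import numpy as np

def cronbach_alpha(scores):
    """scores: (n_children, k_items) array of item scores."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)      # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of the sum scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(0)
trait = rng.normal(size=(106, 1))               # one underlying dimension
items = (trait + 0.3 * rng.normal(size=(106, 13)) > 0).astype(float)
print(round(cronbach_alpha(items), 2))          # high alpha for near-unidimensional items
```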

Teachers' Interview. Teachers were interviewed twice, each time for about 10 minutes, using structured interviews that were audiotaped and transcribed. The first interview took place at the start of the diagnostic process, and the second 2 to 3 weeks after the teachers received the psychodiagnostic reports. Topics in the first interview related to teachers' practical experience with psychodiagnostic reports, writing individual educational plans, and the types of recommendations they preferred in these reports. In the second structured interview, teachers were asked to rate the practical value of the information provided in the psycho-diagnostic report and recommendations. They were asked to place their responses on a 3-point scale (little practical use, neutral, a lot); specifically, the value of information was rated regarding, for example, test results and scores, need for instruction, type of help needed, observations, and recommendations. Additional comments or points of discussion were audiotaped and transcribed.

Observations. To observe teacher-child interaction in the classroom, parts of the Mediated Learning Experience (MLE) Rating Scale, developed by Lidz (1991, 2003) and translated into Dutch by Van der Aalsvoort (1994), were used by the experimenters. The following teacher-child interactions were observed and rated on 4-point scales: Intentionality (involvement of teacher), Task Regulation (type of task instruction), Praise and Feedback (frequency of positive feedback), Challenge (challenging children's zone of proximal development, or ZPD) and Informing Change (informing the child about his or her achievement). At the same time, the teaching practices of the teacher were observed using Lidz's (2003) "Observing Teaching Interactions." The components on this scale are similar to those on the MLE Rating Scale: Intentionality, Task Regulation, Praise and Feedback, Challenge, and Change. The scale's component of contingent responsivity (Lidz, 2003), meaning the response to and balancing of children's needs, was observed as well. All these observations together required at least an hour. To acquire a good sense of the classroom atmosphere and to practice the use of the observation scales, experimenters were supervised during their first couple of observations.

Reports and Recommendations. To communicate the results of the assessment and recommendations, all teachers received a printed report about the children involved in the experiment. Reports and recommendations were different for experimental- and control-group children. Reports of children in the experimental condition included the following information: a summary of observed teaching practice and teacher-child interactions, the teacher's estimations of learning potential (derived from interview and questionnaires), and test results. Results of the dynamic test and reversal task were prominently reported, regarding the children's need for and response to hints, the type of hints, their ability to apply the learned problem solving principles in the posttest measures, and observed behaviors during testing. Recommendations regarding children in the experimental group all had comparable formats, but contents were child-specific and focused on positive feedback, need for either challenging tasks or help (within the ZPD), the type of help the child profited from (general versus more task-specific help), and targeting task behavior.

The standard reports provided for children in the control and semi-control conditions were similar in structure and content to the dynamic testing reports, except for test results: only the score on the Raven PM was reported and no dynamic testing results were provided. Recommendations for the control condition focused mainly on positive feedback and observed behaviors.

Results

Before investigating the research questions, we examined children's initial levels of cognitive functioning and age, and checked whether experimental- and control-group children differed significantly in these respects. A one-way ANOVA with condition as the independent variable (three levels: experimental, control same teacher, and control different teacher) and age and Raven's score as dependent variables showed no significant effect of condition on either age or Raven's score.

Further, we investigated the effects of administering the LIR on children's reasoning ability in the experimental condition. A one-way repeated measures analysis was conducted with scores on verbal analogies specified as the within-subject variable, measured at two times of testing (pretest-posttest). The results indicated a significant effect for time of testing, Wilks's λ = .29, F(1,35) = 85.14, p ≤ .001, η2 = .71. After the dynamic testing procedure children achieved significantly higher scores on verbal analogies (M = 22.9, SD = 3.7) compared to their pretest scores (M = 16.7, SD = 4.8). Over the 3 training sessions children needed a considerable number of prompts to reach this effect (M = 24.4, SD = 15.7), and differed strongly from each other in this respect (range: 3-79 prompts).
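For readers who want to verify this kind of result: a within-subject factor with only two levels reduces to a paired comparison, where F equals the squared paired t statistic, η2 = F / (F + df), and Wilks's λ = 1 - η2 (here, .29). The sketch below, with hypothetical score vectors standing in for the 36 children's data, shows the computation in Python.

```python
# For a two-level within-subject factor, F = t^2 from the paired t test,
# partial eta^2 = F / (F + df), and Wilks's lambda = 1 - eta^2.
# The score vectors are hypothetical stand-ins for the 36 children.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pre = rng.normal(16.7, 4.8, size=36)
post = pre + rng.normal(6.2, 2.5, size=36)   # mean gain of about 6 points

t, p = stats.ttest_rel(post, pre)
df = len(pre) - 1
F = t ** 2
eta_sq = F / (F + df)                        # .71 in the reported analysis
wilks = 1 - eta_sq                           # .29 in the reported analysis
print(f"F(1,{df}) = {F:.2f}, p = {p:.3g}, eta2 = {eta_sq:.2f}, lambda = {wilks:.2f}")
```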


Rather strong Pearson correlations were found between the number of prompts needed before reaching independent problem solving and posttest scores (r = -.76, p ≤ .001). Controlling for the effect of the pretest score, a partial correlation coefficient revealed a still significant relationship between the number of prompts and the posttest score (r = -.62, p ≤ .001).
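The first-order partial correlation used here can be computed directly from the three pairwise correlations. The sketch below implements the standard formula; the simulated prompt, pretest, and posttest vectors are hypothetical, chosen only to produce a negative partial correlation like the one reported.

```python
# First-order partial correlation r_xy.z between prompts (x) and posttest (y),
# controlling for pretest (z). Vectors below are hypothetical stand-ins.
import numpy as np

def partial_corr(x, y, z):
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

rng = np.random.default_rng(2)
pretest = rng.normal(16.7, 4.8, size=36)
prompts = 40 - 1.2 * pretest + rng.normal(0, 8, size=36)
posttest = pretest + 10 - 0.2 * prompts + rng.normal(0, 2, size=36)
print(round(partial_corr(prompts, posttest, pretest), 2))  # negative, as expected
```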

The results of the reversal task for children in the experimental condition revealed a significant negative Pearson correlation between the number of hints needed by the child and his or her ability to construct correct new analogies (r = -.40, p ≤ .05) and to describe full analogical relations (r = -.38, p ≤ .05). Comparable correlations appeared to exist between children's verbal analogy scores at the posttest and their ability to construct analogies (r = .40, p ≤ .05) as well as their descriptions of analogical relations (r = .41, p ≤ .05) during the reversal task.

To address the question of whether teachers change their estimation of the learning potential of the children in their classroom as a consequence of recommendations based on dynamic versus static diagnostic information, teacher estimations of the child's learning potential obtained during the first and second interview were compared and analyzed. A repeated measures analysis with condition (experimental, semi-control, control 2) specified as the between-subjects factor, across two measurement moments (time of estimation), and estimation of learning potential specified as the within variable revealed no significant effect of either time of estimation, Wilks's λ = .98, F(1,103) = 1.82, p = .18, η2 = .02, or condition, Wilks's λ = .96, F(2,103) = 2.19, p = .12, η2 = .04. In all conditions, teachers rated children's learning potential as high to above average before the intervention took place (M = 4.1, SD = 1.2), as well as at their second estimation (M = 4.2, SD = 1.1), several weeks after the reports and recommendations were received.

A second measure of teachers' estimations of children's learning potential was obtained from the scores on the SCHOBL-R. In all three conditions, teachers rated children's learning potential as high to above average before the intervention took place (M = 52.7, SD = 16.9), as well as at their second rating (M = 55.7, SD = 18.1), several weeks after the reports and recommendations were discussed. A repeated measures analysis, again with condition (experimental, semi-control, control 2) specified as the between-subjects factor, across two measurement moments (time of estimation), and the learning potential scale estimation specified as the within variable revealed a significant main effect for time of measurement, Wilks's λ = .88, F(1,103) = 14.15, p ≤ .001, η2 = .12; however, there was no significant interaction effect, Wilks's λ = .98, F(1,103) = 1.11, p = .33, η2 = .02, indicating that teachers' ratings of children's learning potential increased by nearly the same amount in all three conditions. Similar repeated measures analyses with the scales Extraversion, Attitude Towards School Work, Agreeableness, and Emotional Stability, each as a separate within variable, did not show any significant results for time or condition.
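A mixed (between x within) repeated measures design like this one can be run in a few lines with the pingouin package, assuming it is available. The long-format ratings below are hypothetical stand-ins for the SCHOBL-R learning potential scale; only the design (3 conditions between, 2 time points within) mirrors the analysis reported above.

```python
# Sketch of the 3 (condition, between) x 2 (time, within) analysis using
# pingouin's mixed_anova (assumes pingouin is installed). The ratings are
# hypothetical stand-ins for the SCHOBL-R learning potential scale.
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(3)
n = 106
condition = rng.choice(["experimental", "semi-control", "control 2"], size=n)
pre = rng.normal(52.7, 16.9, size=n)
post = pre + rng.normal(3.0, 5.0, size=n)     # small overall increase over time

df = pd.DataFrame({
    "child": np.tile(np.arange(n), 2),
    "condition": np.tile(condition, 2),
    "time": np.repeat(["pre", "post"], n),
    "rating": np.concatenate([pre, post]),
})

# Reports the within effect (time), between effect (condition), and interaction.
print(pg.mixed_anova(data=df, dv="rating", within="time",
                     subject="child", between="condition"))
```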

Bivariate analyses demonstrated significant Spearman correlations between interview and learning potential questionnaire ratings at the first measurement (rs = .78, p ≤ .001) and second measurement (rs = .82, p ≤ .001). Scores on the Attitude Towards School Work scale correlated significantly with the learning potential scale at the first (rs = .72, p ≤ .001) and second measurement (rs = .70, p ≤ .001), as well as with the learning potential estimations during the interviews (rs = .59, p ≤ .001; rs = .62, p ≤ .001, for the first and second interview, respectively). Further, a significant but not very high relationship between the Emotional Stability scale and the learning potential scale of the questionnaire was found at the second measurement (rs = .20, p = .04). These analyses indicate that learning potential estimations on both measures (interview and questionnaire) relate strongly to one another as well as to a child's task behavior (the Attitude Towards School Work scale) and, to a lesser extent, to the level of emotional stability of the child.

Additionally, significant relationships were found between learning potential estimations (interviews and questionnaires) and learning potential scores on the dynamic test (LIR). The lower the number of prompts a child needed, the higher the learning potential ratings of the teacher on both the second interview and the second questionnaire (rs = -.61, p ≤ .001). These correlations were comparable to those with estimations at both the first interview and questionnaire (rs between -.43 and -.46, p = .005). Significant correlations were also found between the posttest score on the LIR and the learning potential ratings during the second interview (rs = .54, p ≤ .01) and second questionnaire (rs = .41, p ≤ .01), as well as with estimations during both the first interview and questionnaire (rs = .31, p = .07 and rs = .39, p = .02, respectively). In addition, significant correlations between the scores on the Attitude Towards School Work scale as well as scores on the Emotional Stability scale of the questionnaire and the number of prompts during the LIR (rs = -.35, p = .04 and rs = -.36, p = .03, respectively) were found, but only for the second rating, after recommendations were provided.

Further, scores on the Attitude Towards School Work scale correlated significantly with the LIR posttest score (rs = .36, p = .03). Other scales of the questionnaire revealed no significant relationships with the LIR. Significant, moderate correlations around .40 were also found between Raven's scores and learning potential estimations during the first interview and first questionnaire, and between .50 and .55 after recommendations were provided.

In sum, results showed that the learning potential ratings—questionnaire and interview—relate significantly to each other as well as to the learning potential measures of the dynamic test. The relationships sometimes became stronger at the second measurement, 2 to 3 weeks after recommendations were given. The scales Attitude Towards School Work and Emotional Stability also revealed strong relationships with both learning potential ratings as well as with the number of prompts provided during training, all indicating that teachers’ estimations of children’s learning potential do relate to their task behavior as well as to their emotional state.

To address the question of whether, compared to observations prior to assessment, changes in teacher practices and behaviors could be observed several weeks after recommendations were provided, we examined the observations of the teachers in the experimental condition (who were the same teachers as in the semi-control condition) compared to teachers in the second control condition, who did not receive any dynamic assessment information. Separate repeated measures analyses were conducted with these two teacher conditions (experimental, control 2) specified as the between-subjects factor, across two measurement moments (time), for each of the six scales of Lidz's Observing Teaching Interactions rating scale, separately specified as within variables. Because it was our expectation that teacher practices would remain at least at the same level and possibly improve after our intervention, we chose to analyze the data at the 0.10 level (one-tailed) of significance.

In Table 5.2 the mean scores and standard deviations of the observations before and after recommendations were discussed with teachers are shown for each of the six subscales separately. The repeated measures analysis revealed one significant interaction effect, for the subscale Task Regulation, Wilks's λ = .84, F(1,16) = 3.08, p = .099, η2 = .161, indicating that after the recommendations were discussed, teachers increased their teaching of task regulating behaviors towards dynamically tested children, as is depicted in Figure 5.1.

Table 5.2. Teacher observations per condition before and after recommendations were given

                                        Pretest         Posttest
Observation scale        Condition      M       SD      M       SD
Intent                   Experimental   3.1     .52     3.3     .62
                         Control 2      2.5     .59     2.5     .55
Task regulation          Experimental   2.3     .49     3.1     .52
                         Control 2      2.0     .00     2.2     .41
Praise                   Experimental   2.8     .87     2.9     .67
                         Control 2      2.3     .52     2.5     .55
Challenge                Experimental   2.3     .62     3.0     .43
                         Control 2      2.3     .82     2.7     .52
Change                   Experimental   1.5     .52     1.8     .72
                         Control 2      1.3     .52     1.5     .84
Contingent responsivity  Experimental   2.5     .80     3.0     .85
                         Control 2      2.8     .41     2.8     .41

Further, significant main effects were found for the subscales Task Regulation, Wilks's λ = .68, F(1,16) = 7.59, p ≤ .01, η2 = .32, Challenge, Wilks's λ = .62, F(1,16) = 9.91, p ≤ .01, η2 = .38, and Change, Wilks's λ = .78, F(1,16) = 4.57, p ≤ .05, η2 = .22, indicating that teachers in both conditions were observed as performing more task regulating activities, as more frequently challenging children's ZPD, and as more frequently explaining children's progress, 2 to 3 weeks after recommendations were given.

Figure 5.1. Pre- and post-intervention scores for observed teacher practice on Task Regulation.

On the observation subscales Intent, Praise, and Contingent Responsivity, positive changes also were observed (see Table 5.2); however, the changes were not significant for Intent, Wilks's λ = .94, F(1,16) = 1.07, p = .32, η2 = .07, Praise, Wilks's λ = .85, F(1,16) = 2.84, p = .11, η2 = .15, and Contingent Responsivity, Wilks's λ = .92, F(1,16) = 1.46, p = .25, η2 = .083.

These results indicate that the two groups of teachers differed from each other in the level of Task Regulating behaviors shown; teachers who received dynamic testing results and recommendations were observed to use more instructions regarding planning behaviors and strategies, and requested more justifications and evaluations of answers, than teachers in the control condition. Teachers in both the experimental and control groups also improved on the subscales Challenge and Change several weeks after recommendations were provided.


Figure 5.2. Pre- and post-intervention scores for observed teacher-child interaction on Praise.

Besides studying the presumed changes in general teaching practices, we also examined whether teacher-child interactions changed in response to the provided recommendations. Although we observed teacher-child interactions in all three conditions, specific recommendations based on dynamic assessment regarding instruction, learning potential, and challenge were provided only to teachers of children in the experimental condition. We therefore expected that teacher-child interactions in the experimental group would specifically improve compared to teacher-child interactions in the second control condition, in which teachers received standard reports and no information about dynamic assessment.

Results of the five repeated measures analyses conducted with the two teacher conditions (experimental and control 2) specified as the between-subjects factor, across two measurement moments (time of observation), and each of the five scales of the observed teacher-child interaction specified as within variables indeed revealed significant interaction effects for the two subscales Praise, Wilks's λ = .92, F(1,70) = 6.09, p ≤ .02, η2 = .08, and Challenging ZPD, Wilks's λ = .92, F(1,70) = 5.88, p ≤ .02, η2 = .08, as can be seen in Figures 5.2 and 5.3. In the experimental group an increased number of teacher-child interactions showing positive feedback (Praise) was observed compared to the control group, although the latter group of teachers already used frequent praise at the first observation; the number of observed interactions in which children were challenged in their ZPD also increased more in the experimental group compared to the second control group.

Figure 5.3. Pre- and post-intervention scores for observed teacher-child interaction on Challenging the ZPD.

Repeated measures analyses were also conducted with the three teacher conditions (experimental, semi-control and control 2) specified as the between-subjects factor, across two measurement moments (time of observation), for each of the five scales of the observed teacher-child interaction specified as within variables. Because it was our expectation that observed teacher-child interactions would improve after our intervention, we chose to analyze the data at the 0.10 level (one-tailed) of significance.

In Table 5.3 the mean scores and standard deviations of the first and second observations are shown for each of the five subscales separately; as can be seen, all observed interactions changed in a positive direction in all conditions.

Results of the repeated measures analyses revealed significant main effects for all subscales: Intentionality, Wilks's λ = .95, F(1,103) = 5.66, p ≤ .02, η2 = .05, Task Regulation, Wilks's λ = .97, F(1,103) = 3.56, p ≤ .06, η2 = .03, Praise, Wilks's λ = .93, F(1,103) = 7.43, p ≤ .01, η2 = .07, Challenge, Wilks's λ = .83, F(1,103) = 21.48, p ≤ .01, η2 = .17, and Change, Wilks's λ = .86, F(1,103) = 17.12, p ≤ .01, η2 = .14, indicating that teachers in all conditions were observed as showing significantly improved teaching, instructing, and feedback on the MLE scale after receiving and discussing the recommendations.

Around the time of the second observations, teachers were also interviewed to gather their opinions regarding the information (results, observations and suggestions) that they found valuable in our reports and recommendations. They were asked to rate the level of practical use (little, neutral, a lot) of the provided information regarding type of instruction, type of help, estimation of ZPD, explanation of learning experience, observations, and observed interactions with the child. The majority of teachers evaluated the information per category as neutral to positive (see Table 5.4). Rated positively were recommendations regarding type and amount of help (41%), insights regarding learning potential (59%), feedback from observations (47%), and recommended interactions (47%), which are all prominent components of the dynamic assessment reports. The reported Raven's scores were rated less positively; the majority of the teachers (75%) rated these scores as less useful or neutral.


Table 5.3. Mean scores and standard deviations for observed teacher-child interaction, per condition

                                   Pretest         Posttest
                  Condition        M       SD      M       SD
Intentionality    Experimental     1.97    .70     2.19    .53
                  Semi-control     1.94    .55     2.03    .58
                  Control 2        1.86    .35     2.00    .00
Task regulation   Experimental     1.56    .77     1.81    .98
                  Semi-control     1.62    .89     1.65    1.04
                  Control 2        1.97    .79     2.31    .62
Praise            Experimental     1.08    .69     1.58    .81
                  Semi-control     1.03    .72     1.12    .69
                  Control 2        1.39    .69     1.44    .61
Challenge         Experimental     1.44    .88     2.03    .17
                  Semi-control     1.44    .86     1.79    .59
                  Control 2        1.33    .54     1.33    .51
Change            Experimental     .03     .17     .28     .56
                  Semi-control     .00     .00     .21     .59
                  Control 2        .39     .49     .58     .81

During the same interview, we also explored the value of the information regarding classroom observations and interactions. Although many teachers rated the recommendations from the observations as "neutral," they also commented that the observations strongly confirmed their own knowledge and experience. One teacher from the control group, who had 11 years of experience, commented, for example: "It is a confirmation of my own experience and practice: neutral." Another teacher, from the experimental condition, with 9 years of experience, responded: "The observational information you gave me was very informative. It reminds you how you do handle a child. Especially in combination with the recommendations they were very useful." Yet another teacher, with 6 years of experience, from the experimental condition, commented: "What I especially liked was that you were neutral. You didn't know any history of the children, and observed the child in the classroom as it was."

The value of understanding learning potential also was explained by several teachers. One teacher, with 5 years of experience, from the experimental group, commented: "Neutral. It may sound conceited, but I think when you have a lot of experience, you know at what level you could rank a child. I did not come across anything unexpected." A positive comment came from a teacher from the experimental condition with 8 years of experience: "The information was very relevant. I'll keep it in mind and will definitely transfer the information to the teacher in the next class."

Whereas teachers in the control group gave overall neutral responses and hardly any negative ones, more variance in ratings was found within the experimental group. Two teachers in the experimental condition answered the majority of all questions negatively, which affected the mean scores. On the question of whether the recommendations provided practical insights to help start an individual educational plan for each child, teachers in both conditions responded that there was no need for special plans for their participating children because they behaved as typical children in their classes; however, the information would be consulted if future questions or reasons for starting educational planning arose.

Table 5.4. Teachers' Ratings of Recommendations in Percentages, per Condition

                                                Useful (%)
Recommendations                 Condition       Little   Neutral   A lot
Type of instruction             Experimental    29.4     11.8      23.5
                                Control 2       0        35.3      0
Type of help                    Experimental    23.5     11.8      29.4
                                Control 2       0        23.5      11.8
Estimation of ZPD               Experimental    12.5     31.3      18.8
                                Control 2       0        31.3      6.2
Insight in learning potential   Experimental    5.9      17.6      41.2
                                Control 2       0        17.7      17.6
Explanation learning            Experimental    25       25        12.5
                                Control 2       0        18.8      18.8
Observations                    Experimental    11.8     17.6      35.3
                                Control 2       0        23.5      11.8
Recommended interaction         Experimental    6.7      20        33.3
                                Control 2       0        26.7      13.3
Raven's scores                  Experimental    31.3     25        6.2
                                Control 2       6.3      12.5      18.8


Our final investigation concerned teachers in the experimental condition, all of whom participated in the semi-control condition as well. They were asked specifically whether the perceived differences in reports (standard and dynamic) made a difference in their teaching practice. Seventy-five percent responded positively to this question.

They explained further that the dynamic assessment reports were more elaborate and gave additional specific information about the children. Three teachers responded neutrally or negatively regarding the difference between reports and recommendations, perceiving both types of recommendations as equally helpful. One teacher with 5 years of experience explained: "The dynamic recommendations were more elaborate; however, I don't need such an elaborate report. The [standard] recommendations based on just observations were useful for me." To the contrary, one of the teachers who responded positively, having 8 years of experience, explained: "The dynamic recommendations provided a detailed picture of the child and how he developed during the training sessions; the other [standard] recommendations provide results of one moment and are more general."

Discussion

The main focus of our study was to investigate teachers' responses to reports and recommendations based on the results of a dynamic assessment within a regular education setting, in which teachers were interviewed, observed, and received reports about the assessment of several children in their classroom. One of our questions related to the expectations of teachers regarding the learning potential of their students. We hypothesized that, in response to the recommendations from our reports based on dynamic assessment, teachers would change their estimation of a child's learning potential, and that teachers in the control condition who received only standard reports would show less change in their expectations. In order to measure teachers' estimations of learning potential for individual children, we developed a specific learning potential scale in addition to the Dutch School Behavior Checklist, and teachers were asked during the two interviews to estimate the learning potential of each participating child.

The results of the interviews and questionnaires demonstrated that teachers rated the learning potential of their students as high average to above average already at the start of our experiment. Several weeks after recommendations were provided, we once more gathered the estimations of learning potential during the second interview and administered the parallel version of the questionnaire. Although the estimations increased slightly on average after the assessment and recommendations, teachers in the experimental and the two control conditions did not differ in their estimations. These results are not very surprising, since teachers had no referral question or concerns about the learning potential of the participating children prior to the assessment, and their ratings of learning potential at the start of the experiment were also relatively high in all three conditions. Previous studies that did show changes in expectations of learning potential all focused on dynamic assessment of children with learning difficulties (e.g., Delclos et al., 1993) or children from disadvantaged backgrounds, both populations in which teachers might have difficulty estimating cognitive abilities (Hessels-Schlatter, 2002a) or even underestimate abilities (Hessels, 1997, 2000). We, on the other hand, focused on the learning potential of typical children in elementary education.

In spite of the fact that teachers' estimations of learning potential were not significantly affected by the type of reports and recommendations they received, we did find that the two new measures for gathering teachers' estimations of children's learning potential showed potential for future use. The learning potential estimations during the interviews and the ratings on the learning potential scale correlated significantly with each other. Scores on both measures also related significantly to the posttest scores of the LIR, as well as to the number of prompts children needed. Besides this relation with the dynamic test scores, teachers' ratings of task behavior, as measured with the SCHOBL-R, were related to both learning potential estimations. Taking into account that task behavior and cognitive potential often show moderate to strong relationships, this finding is consistent with the literature. High-achieving children often show on-task behaviors and systematic approaches, and need relatively little additional instruction, whereas low-achieving children more frequently need instruction and external task regulation (e.g., Siegler & Alibali, 2005).

Another focus of our study was whether we could detect the responses of the teachers to the recommendations that were provided in the reports (e.g., instructions, suggested interactions, or feedback) in their actual teaching practices. A positive response was expected especially from teachers in the experimental condition.

Classroom observations based on Lidz’s Observing Teaching Interactions (1992, 2003) with a specific focus on teaching and interactions related to dynamic testing were conducted at the start of the assessment procedure and repeated several weeks after discussing recommendations with teachers. General teaching practice was rated average on most variables in both conditions at the first observation; however, we did find at the second observation that, compared to teachers in the pure control condition, teachers in the experimental condition did show more frequent task regulating activities and instructions in their general classroom teaching. They were observed to give more instructions regarding planning and useful strategies and more frequently asked for justifications and evaluations of answers. This effect is promising since it seems to indicate that teachers respond to our specific recommendations, such as adapting instructions to the needs of children and organizing their lessons accordingly, which were exclusively and elaborately mentioned in the dynamic assessment reports. The observations also revealed that teachers in both the experimental and second (pure) control group more frequently challenged children within their ZPD and that they increased their informed feedback about children’s achievements, further indicating their response to provided recommendations. This effect is explicable by the fact that these two matters were described as part of the general observations in both the standard and the dynamic assessment reports. The fact that teachers in both conditions changed their teaching behaviors on these two matters shows that they noticed the information as important enough to implement, at least during our observations.


During the classroom observations, we also observed the specific interaction between teacher and child by using the MLE Rating Scale (Lidz, 1992, 2003). Again, we focused on teachers in the experimental condition, compared to teachers in the second control condition, because the latter group did not receive any information about dynamic assessment. It appeared that teachers who received dynamic assessment results and recommendations challenged children's ZPD significantly more frequently.

This result fits with our expectations, since considering the ZPD of children was especially emphasized in the dynamic reports in relation to children’s achievements on the dynamic test. Depending on the child’s learning potential (the number of prompts and posttest score), we recommended to what extent children should be offered challenging or less challenging tasks. Teachers apparently applied these recommendations, at least during our observations. The observed teacher-child interactions also revealed that teachers in the experimental condition increased their use of positive feedback significantly, whereas teachers in the second (pure) control condition remained on the same level. Although frequent use of praise and feedback was recommended in both static and dynamic reports, teachers in the experimental condition seemed to have especially profited from the reported accounts of children’s responses to prompts and feedback during training sessions and, based on these applied examples, increased their praise and feedback.

Although these results regarding classroom practice and teacher-child interactions do meet some of our expectations, we had also hoped to find that teachers in the experimental condition in particular would respond to our recommendations. However, we found that teachers in the experimental, semi-control, and second control conditions all improved their teaching when they were observed in interactions with individual children, on all five observed variables. Their intentional involvement with children increased, as did task regulating activities, feedback, informing children about their achievements, and taking the ZPD into account. The improvement in all three conditions was not expected, but it might well be that teachers in all conditions applied to some extent the information conveyed by our dynamic as well as static reports, or an observer effect may have affected our results (Bogdan & Biklen, 2003). People do change their behaviors when observed, and teachers in all conditions might have changed and improved their teaching during the second observation, especially because they understood what we were looking for. We tried to overcome this bias by remaining in the classroom for at least an hour, but this might not have been enough.

Another reason for not finding particular or different responses from teachers in the experimental condition lies perhaps in the fact that implementing interventions or changes takes time; our observations took place 2 or 3 weeks after the reports were handed out, which might have been too short a period to effect changes in instruction or adaptations of teaching practices. According to Witt (1986), changes might even be resisted by teachers. Every implementation or change will have its effect on the whole classroom, and teachers are likely to choose the least intrusive interventions.

Apparently, task regulating behaviors, an increased amount of feedback and praise, and challenging the ZPD were ideas teachers were ready to integrate into their general classroom teaching and interactions with children, whereas adapting their curriculum on matters such as increasing their involvement, informing children about their achievements, and balancing the needs of children might have been more complex to implement in a few weeks.

The findings of changes in teacher practices and the use of learning potential estimations are interesting in relation to our main research purpose: acquiring insight into teachers' opinions, appreciations, and preferences regarding dynamic assessment or static reports and recommendations. Teachers rated the practical content (e.g., recommended type of instructions, description of learning potential) of the received reports overall as neutral to positive, which indicates that our reports did at least measure up to the reports teachers usually read, and for some teachers even surpassed these in quality. A couple of teachers who responded less positively valued the information, yet had expected to receive even more hands-on information. On the other hand, the majority of teachers (in both conditions) responded that the received recommendations did provide practical insights on which they could start an individual educational plan; however, since all participating children were typically behaving and developing children, there was no need to start any plan for them at that moment.

Promising, and expected, findings (see Delclos et al., 1993; Hulburt, 1995) were especially teachers' positive evaluations regarding their understanding of a child's learning potential and regarding the type and amount of help that was recommended for a child. The content of these two recommendations is the essence of dynamic testing with a graduated prompts approach: through the intervention phase the response to prompts and feedback can be determined very specifically and, combined with the scores on the posttest and/or pretest, can provide an accurate description of a child's learning potential (Jeltova et al., 2007; Resing, 1990, 2000). Teachers' comments demonstrated further that they valued the information as relevant for their practice and would pass the information on to future teachers. Other teachers recognized the value of this information, yet felt they already had a good grasp of the learning potential based on their own experience. Again, both explanations are understandable in the context that all these teachers dealt with typically achieving elementary school children.

The multiple testing and training sessions of dynamic assessment and our classroom observations created ample opportunity for observing children and formulating recommendations regarding feedback and interactions. Teachers rated the accounts of observed behaviors during testing and in the classroom as positive, as well as the recommended interactions. This leads us to conclude that these aspects of dynamic assessment, which we, though to a lesser extent, also added to the standard reports, are apparently valuable for teachers. Teachers commented especially that the provided observations and recommendations were a confirmation of what they noticed and experienced in their daily practice. To some teachers, the information even provided new insights into their own actions and interactions with children, and they appreciated the fact that the observations were conducted from a neutral standpoint.


A final important question was whether teachers would notice a difference between the two reports and would prefer one over the other. Because teachers in our experimental condition also participated in the semi-control condition, they had received both types of reports and were able to compare their practical value. While several previous studies did not find a clear preference (Delclos et al., 1993; Hulburt, 1995), the majority of teachers in our study perceived the dynamic assessment reports as valuable and regarded these reports as making a difference in their teaching practice, compared to the standard reports they received for the children in the semi-control condition. They explained that especially the reported information and recommendations about need for instruction and response to feedback were valuable and useful regarding their children and teaching practice. As Haywood and Lidz (2007) noted, teachers wish to improve their understanding of children's actions and want to know how they can improve children's functioning. Our dynamic assessment procedure, including the graduated prompts approach, appeared to form a link between assessment and intervention, in offering specific recommendations for guiding teachers in their classroom practice. Hence, this finding might be of particular relevance for educational psychologists. Since teachers do recognize and value the additional information dynamic assessment offers them, educational psychologists might consider incorporating dynamic assessment into their psychoeducational practices.

Although we did find promising results, our study of a simplified but controlled diagnostic process had limitations. First, our sample of teachers and children was not very large, which obliges us to interpret our findings carefully and to avoid generalizing from them. Second, because the participating children were all typically developing and attending regular elementary schools, we did not start the process, as would be usual for a psychodiagnostic procedure, from a referral question; as a result, teachers did not express any concerns to be addressed by the assessment, and the resulting recommendations were supplemental and hardly necessary. Third, we did not conduct a follow-up to measure changes and opinions after a longer period of time. We have already adjusted the designs of forthcoming studies to address these limitations and to explore further the link between dynamic assessment and intervention.

Since various dynamic assessment instruments and procedures (e.g., Lidz, 1991; Lidz & Haywood, 2007; Resing, 1990, 2000; Tzuriel, 2000a) have been developed with the expectation that results of these measures would provide additional insights into children's cognitive abilities and would guide practical classroom recommendations, we would like, carefully, to conclude that our study offers evidence that dynamic assessment provides at least a first tool to bridge the gap between assessment and classroom practice. The teachers expressed their appreciation of the contents of the dynamic assessment reports and commented that they regarded the recommendations as applicable to their teaching practices; furthermore, they changed some aspects of their teaching practices in response to the recommendations that were provided. These findings should encourage the incorporation of dynamic assessment into psycho-educational practices, as well as further investigation into the links this approach provides between assessment and intervention.
