Detecting rules with the Balance Scale Task : converging evidence from empirical and simulation studies

Academic year: 2021

Detecting rules with the Balance Scale Task:

Converging evidence from empirical and simulation studies

Y. K. Kunkels

University of Amsterdam

10193634

prof. dr. H. L. J. van der Maas

2902 words

Index

Abstract

Detecting rules on the Balance Scale Task

Experiment 1 - Simulation study

Experiment 2 - Empirical study

Conclusions and discussion

Bibliography

Abstract

In the current study, we investigate two topics related to the Piagetian balance scale task. The first is an evaluation of latent class analysis (LCA), a statistical technique for assigning rules to children on the balance scale task. A hotly debated issue is whether LCA is prone to finding small, trivial classes. We address this issue by replicating a simulation experiment that supposedly demonstrates this shortcoming of LCA. The second topic is whether children can actually learn the most advanced rule, Rule 4, on the balance scale task after relevant instruction. We conducted an empirical study to test this claim. Results from both studies are interpreted and discussed in context.

Detecting rules on the Balance Scale Task

When assessing proportional reasoning in children, the most commonly used task is the well-known Balance Scale Task (BST). Results from tasks like the BST are often used to shed light on questions concerning cognitive reasoning in children (Siegler, 1976, 1981). One such question is whether children use rules in acquiring knowledge, as predicted by the symbolic view, or whether they merely show behavior consistent with rules, as the neural network view holds (Shultz & Takane, 2007). Other questions that the BST can help answer are whether children are consistent in their use of rules, and whether transitions from one rule to another are continuous and gradual, or discontinuous and occur in distinct jumps (Jansen & van der Maas, 2002).

A commonly used method for assigning children to specific rules is the rule assessment methodology (RAM) devised by Siegler (1981). Although this method is widely used, it is not without its limitations. Strauss and Levin (1981) identified one such limitation: RAM can produce spurious detection of rules. This limitation arises from the arbitrary choice of the classification criterion, which cannot be tested in a proper statistical manner. Limitations of this kind can be avoided by using a modern statistical technique called latent class analysis (LCA). LCA divides the sample into a limited number of latent classes. Each class is identified by a specific pattern of probabilities, which gives the chance of a specific response on each item. This pattern of probabilities is then interpreted as a cognitive strategy or rule. LCA has been found to be well suited to data from the BST (Jansen & van der Maas, 1997, 2002; Quinlan, van der Maas, Jansen, Booij & Rendell, 2007).

When comparing the more traditional RAM to modern LCA, one can observe numerous advantages of LCA. Examples are the statistical fit measures that LCA offers, which allow for proper hypothesis testing; the ability to model error processes; the use of a criterion that is independent of the number of items and not arbitrarily chosen; and the ability to detect unexpected response patterns (Jansen & van der Maas, 2002). Of course, RAM also enables researchers to detect new rules. But such a procedure requires visual inspection of the responses, and is by no means unconditional. Rule detection with LCA, on the other hand, is unconditional, and new rules are derived automatically.

One question for which the detection of rules is important is the diagnosis of Rule 4 on the BST. Diagnosing Rule 4 is a precarious endeavor, as many conflict items of the BST can also be solved by adding weight and distance. On this point, studies using RAM did not reach the desired outcome, as the use of an addition rule was not discovered (Siegler, 1976, 1981). This RAM finding is contradicted by studies indicating that children do use the addition rule (Normandeau, Larivée, Roulin & Longeot, 1989; Wilkening & Anderson, 1982). LCA added to this contradictory evidence by also showing that a large proportion of children use the addition rule (Jansen & van der Maas, 1997, 2002).

Although LCA has some clear advantages over RAM, some scholars have not embraced this statistical technique and instead hold on to the more familiar RAM. For example, Shultz and Takane (2007) argue that LCA is sensitive to large numbers of items. Small item sets are indeed preferable, because they minimize the risk of rule switching within one test session, which is as difficult to handle for LCA as it is for RAM. But research by Boom, Hoijtink, and Kunnen (2001) shows that, with modern software, LCA is well capable of handling large item sets. In another paper, Siegler and Chen (2002) responded to the criticism that the 80% criterion in RAM is arbitrary by arguing that the alpha level of .05 used in LCA is equally arbitrary. On this point, van der Maas and Straatemeier (2008) argue convincingly that an arbitrary criterion based on the goodness of fit between model and data should be preferred to an arbitrary criterion based on model parameters. They support this claim by pointing out that the alpha level is independent of the results yielded by an LCA model. In RAM, on the other hand, the arbitrary 80% criterion is satisfied when less than 20% of the participants fall within the 'no rule' category. In contrast with the alpha level criterion, the 80% criterion was not found to be supported by statistical theory.

Another interesting point in this discussion is that all the advantages of RAM put forward by Siegler and Chen (2002) can be incorporated in LCA. This is done by using a very restricted confirmatory LCA (van der Maas, Quinlan, & Jansen, 2007). The restricted confirmatory LCA was found to be very similar to the fundamental RAM model, hence its adequate performance when compared to RAM. These findings support earlier notions that RAM, as a statistical method, can be improved by applying statistical techniques from LCA (Clogg, 1995). Another commonplace argument against LCA is its perceived tendency to discover small, unreliable classes. Research by van der Maas, Quinlan, and Jansen (2007) has already shown that LCA can detect small classes reliably. The current study aims to support these findings by investigating the claims of Dandurand and Shultz (2013) regarding the small, unreliable classes that LCA supposedly finds. To this end, a simulation study is performed to show that such stern criticism is unfounded. In the current article, we also investigate the common but as yet untested intuition that children acquire the torque rule through explicit verbal instruction that includes relevant examples. To this end, an empirical study is performed that compares a group explicitly taught the torque rule with a torque rule naive group.
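The latent class model described in this section can be illustrated with a short sketch. Under the usual local independence assumption, the probability of a response pattern within a class is the product of the item probabilities, and Bayes' rule then gives the probability that a child with that pattern belongs to each class. The Python code below is a minimal sketch under stated assumptions, not the analysis used in this study: the function names, the two classes, their sizes, and the conditional probabilities are all hypothetical values chosen for illustration.

```python
from math import prod


def pattern_likelihood(pattern, cond_probs):
    """Probability of a 0/1 response pattern within one class.

    Assumes local independence: items are independent given the class.
    cond_probs[j] is the probability of a correct response on item j.
    """
    return prod(p if x == 1 else 1.0 - p for x, p in zip(pattern, cond_probs))


def posterior_class_probs(pattern, class_sizes, class_cond_probs):
    """Posterior class membership probabilities for one response pattern (Bayes' rule)."""
    joint = [
        size * pattern_likelihood(pattern, probs)
        for size, probs in zip(class_sizes, class_cond_probs)
    ]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical two-class model with four items (illustrative values only).
class_sizes = [0.5, 0.5]
class_cond_probs = [
    [0.95, 0.95, 0.95, 0.95],  # class 0: mostly correct on every item
    [0.95, 0.95, 0.05, 0.05],  # class 1: mostly incorrect on the last two items
]

# A child who solves the first two items and fails the last two.
posterior = posterior_class_probs([1, 1, 0, 0], class_sizes, class_cond_probs)
```

In this hypothetical example the pattern is far more probable under the second class, so nearly all posterior probability goes to that class. In an actual LCA the class sizes and conditional probabilities are not given but estimated from the data.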

Experiment 1 - Simulation study

Method

To investigate claims that LCA is bound to find small, unreliable classes, we used the simulation setup of Dandurand and Shultz (2013). Dandurand and Shultz simulated six hypothetical BST conflict items. Of these items, three could be solved using either the torque rule or the addition rule. The other three items could be solved using the torque rule only. Three different classes responded to the items: a torque rule class, an addition rule class, and a small random class. The class sizes were .48, .48, and .04, respectively. We then attempted to replicate their results. Ten simulations were performed, each containing 500 cases. The response patterns were then analyzed using LCA in the R statistical software program.
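The setup above can be sketched in a few lines of Python. This is a hedged illustration rather than the code used in the study, which was run in R: the six items, the function names, the 5% response error, and the assumption that the random class answers correctly with probability .5 are all hypothetical choices. Only the class sizes (.48, .48, .04) and the 500 cases per simulation come from the text.

```python
import random


def _compare(left, right):
    """Return the predicted outcome given a quantity for each side."""
    if left > right:
        return "left"
    if right > left:
        return "right"
    return "balance"


def torque_rule(lw, ld, rw, rd):
    """Compare weight x distance on each side (the correct answer)."""
    return _compare(lw * ld, rw * rd)


def addition_rule(lw, ld, rw, rd):
    """Compare weight + distance on each side."""
    return _compare(lw + ld, rw + rd)


# Hypothetical conflict items: (left weight, left distance, right weight, right distance).
ITEMS = [
    (3, 1, 1, 2), (2, 4, 3, 2), (1, 4, 2, 1),  # addition gives the torque answer
    (2, 2, 1, 3), (3, 3, 2, 4), (1, 4, 2, 3),  # addition gives a wrong answer
]


def p_correct(rule_class, item, accuracy):
    """Probability of a correct response for a class on an item."""
    if rule_class == "random":
        return 0.5  # assumption: the random class responds at chance
    rule = torque_rule if rule_class == "torque" else addition_rule
    return accuracy if rule(*item) == torque_rule(*item) else 1.0 - accuracy


def simulate(n_cases=500, sizes=(0.48, 0.48, 0.04), accuracy=0.95, seed=1):
    """Draw n_cases (class label, 0/1 response pattern) pairs from the mixture."""
    rng = random.Random(seed)
    classes = ("torque", "addition", "random")
    data = []
    for _ in range(n_cases):
        cls = rng.choices(classes, weights=sizes)[0]
        row = [int(rng.random() < p_correct(cls, item, accuracy)) for item in ITEMS]
        data.append((cls, row))
    return data


data = simulate()
```

Calling `simulate` with ten different seeds would give ten data sets of 500 cases each. Because class membership is drawn at random, the observed class proportions only approximate .48, .48, and .04 in any single data set.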

Results

This simulation tests the hypothesis that LCA is bound to find small, unreliable classes. Figure 1 shows the conditional probabilities of a correct response for each class and item. As expected, the torque class scores nearly perfectly on each of the items. The addition class also performs as expected, scoring nearly perfectly on the first three items and failing the last three items. The small random class performs near chance level. Figure 2 shows the 95% confidence intervals for the three classes. As can be observed, the torque class and the addition class show very narrow confidence intervals. The small random class, on the other hand, shows greater variability. In order to assess the standard deviations of the conditional probabilities in this small random class, their averages per class are plotted in Figure 3. As can be seen in this plot, the standard deviations in the small random class decrease as its size increases. To put the variability of the random class into perspective, the standard deviations of the conditional probabilities of the other classes, each taken in turn as the smallest class, were plotted as well. Figure 4 shows the results of taking the addition class as the smallest class. This plot shows clearly lower variability than the small random class: variability remains constantly low, irrespective of class size. Figure 5 shows a similar result for the torque class, as the variability again remains low irrespective of class size.
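One way to see why a small random class should show the greatest variability is a rough binomial approximation: the standard error of an estimated proportion p based on n responses is sqrt(p(1 - p) / n), which is largest at p = .5 and shrinks as n grows. The sketch below is only a back-of-the-envelope illustration under strong assumptions. It treats class membership as known, which LCA does not, and the function name, the 5% response error, and the larger class size of .20 are hypothetical.

```python
from math import sqrt


def binomial_se(p, n):
    """Standard error of an estimated proportion p based on n independent responses."""
    return sqrt(p * (1.0 - p) / n)


# With 500 cases, a class of relative size .04 holds about 20 simulated children.
se_random = binomial_se(0.5, 20)         # random class at chance: about 0.112
se_rule_error = binomial_se(0.95, 20)    # rule class with 5% response error: about 0.049
se_rule_exact = binomial_se(1.0, 20)     # rule class without response error: 0.0
se_random_large = binomial_se(0.5, 100)  # random class of relative size .20: about 0.05
```

Under these assumptions a systematic class has little or no sampling variability in its conditional probabilities even when it is small, whereas a random class of the same size has a much larger standard error that decreases only as the class grows.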

Fig. 2.: 95% confidence intervals for the three classes

When comparing Figure 3 to Figures 4 and 5, one can see that variability is greatest in the random class. As the classes in these plots were all simulated as the smallest class, it seems that the variability in Figure 3 stems mostly from the randomness of the class rather than from its size.

Fig. 3.: Standard deviations of conditional probabilities in the small random class. The three lines indicate different levels of response accuracy, ranging from 1 (no response error) to .9 (10% response error).

Fig. 4.: Standard deviations of conditional probabilities in the small addition rule class. The three lines indicate different levels of response accuracy, ranging from 1 (no response error) to .9 (10% response error).

Fig. 5.: Standard deviations of conditional probabilities in the small torque rule class. The three lines indicate different levels of response accuracy, ranging from 1 (no response error) to .9 (10% response error).

Experiment 2 - Empirical study

Method

Sample

For the current experiment, 40 Dutch-speaking children between 6 and 16 years old, of whom 21 were female, were recruited. The mean age of the participants was 9.18 years (SD = 2.38). Participants were recruited at Science Centre Nemo, a science museum in the city of Amsterdam. Participants were notified of the experiment through various means of communication, e.g. flyers, posters, and the Science Centre Nemo website. Correspondence with participants about the experiment used the working title "Verborgen raadsels oplossen" ("Solving hidden riddles") instead of the experimental name "Detecting rules on the balance scale task". This was done to make the experiment more attractive and easier to understand for young participants. Participants received a small token of appreciation for their participation, mostly small toys worth between €1 and €2. Participants were randomly assigned to either the experimental condition or the control condition.

Materials

The experiment was conducted using a 28-item pen-and-paper form of the BST. The version of the BST used in the current experiment can be found in the appendices. This task was split into two similar halves, each consisting of 14 items, to create a pre-manipulation and a post-manipulation task. Each half contained six 'conflict' items. Instructions were given before the BST in the form of a one-minute video, ensuring consistent instructions for each participant. Instruction type was manipulated after completion of the first part of the BST.

In the experimental condition, participants received genuine instructions about the torque rule and its application, with relevant examples. In the control condition, they received a mock video instruction on the topic of 3D drawing, with relevant examples. Performance on the BST was measured as the number of correctly solved items. In the exit interview, participants gave their opinions on the BST items and on their performance, described how they tried to solve the items, and reported whether they had experience with BST-like items, as well as their age. The exit interview was audio-recorded for storage and ease of access.

Procedure

Before the experiment commenced, an information brochure about the experiment was handed out, and informed consent was obtained from the parents or guardians of the participant. Both documents can be found in the appendices. The participant was seated behind a standard laptop with access to the video instructions. The experiment started when the first instruction video was played. This video explained the general course of the experiment and how to use the answer form. Thereafter, participants were presented with the first half of the BST. After completing the first half of the BST, participants were shown a second instruction video. Depending on the group they were assigned to, they received either a genuine torque rule instruction video or a mock video instruction. After the second video instruction ended, both groups of participants received the second half of the BST. This was followed by a 5-item exit interview (see appendix). When the exit interview was completed, the experiment was finished. Participants could then choose one of various small toys as an informal reward for their participation.

Results

Sample

Participants were assigned to the control or experimental condition at random. Twenty-one participants were assigned to the experimental condition and 19 participants to the control condition. As all participants were native Dutch speakers between 6 and 16 years of age, none were excluded from the analysis, resulting in a sample of 40 participants.

Analysis

During the analysis we tested the hypothesis that the group explicitly taught the torque rule would show a clear improvement in the number of correct items on the post-manipulation task compared to the pre-manipulation task. We also tested the hypothesis that the torque rule naive group would not show such an improvement between the pre-manipulation and post-manipulation tasks. First, an analysis was run on the data of all items. Levene's test was used to check whether the variances in both groups were equal. This test did not reject equality of variances (F = 2.69, p = .052). Thereafter, a repeated measures ANOVA was conducted to assess the effect of explicit torque rule instruction on the number of correct items. There was a significant effect of group (experimental vs. control), F(1,1) = 9.097, p = .004. There was also a significant effect of measurement moment (pre-manipulation vs. post-manipulation), F(1,1) = 6.73, p = .015. The interaction between group and measurement moment was also significant, F(1,1) = 4.71, p = .033. Figure 6 shows the interaction plot of these results. As expected, one can observe a clear improvement in the experimental group. The control group shows an unexpected decline in performance on the BST. This decline might be explained by the distracting nature of the mock video instruction interfering with focused attempts at solving BST items.

Fig. 6.: BST scores of all items of both groups over pre-manipulation and post-manipulation measurements

After the analysis with the data of all items, another analysis was run, albeit with only conflict items this time. Four conflict items in both the pre-manipulation and post-manipulation task were investigated.

Again, Levene's test was used to compare the variances in both groups. The test did not reject equality of variances (F = 0.38, p = .77). A repeated measures ANOVA was used to assess the effect of explicit torque rule instruction on the number of correct items. The results show a significant effect of measurement moment, F(1,1) = 11.44, p = .0012. No significant effects were found for group or for the interaction. Figure 7 shows the interaction plot of these results. Again, one can observe a distinct improvement in the experimental group. In contrast to the previous analysis, however, the control group also shows an increase in performance on the BST, although this increase is not as strong as the increase in the experimental group.

Fig. 7.: BST scores on conflict items of both groups over pre-manipulation and post-manipulation measurements

As the current study shows, performance on the BST can be improved by instruction that uses relevant examples. This intervention not only improved performance on the BST as a whole; it also improved performance on BST conflict items, indicating that participants acquired a true torque rule.

Conclusions and discussion

The current study aimed to answer two main questions. The first was whether LCA is prone to finding small, trivial classes. The second was whether children can acquire a genuine torque rule through instruction with relevant examples. We found that the issues with LCA were not as dire and clear-cut as the current debate would lead one to expect. We could not replicate findings suggesting that the smallness of classes severely hampers LCA's performance. Our results indicate that smallness might play a minor role, but one that is subordinate to the role of randomness. With systematic classes, whether small or large, LCA performed reliably. LCA also performed reliably with large random classes. However, problems might arise in LCA when classes are both random and very small. In such a scenario, the standard deviations of the conditional probabilities might grow beyond what is deemed acceptable.

To assess the acquisition of a genuine torque rule after instruction, an empirical study was performed. This study showed that such an intervention leads to improved performance on the BST. Further analysis of the BST conflict items revealed that the intervention improved performance on these items as well. It can therefore be concluded that instruction including relevant examples can bring about an increase in BST performance on all items, including conflict items.

Bibliography

Boom, J., Hoijtink, H., & Kunnen, S. (2001). Rules in the balance: Classes, strategies, or rules for the balance scale task? Cognitive Development, 16, 717–735.

Clogg, C. C. (1995). Latent class models. In G. Arminger, C. C. Clogg, & M. E. Sobel (Eds.), Handbook of statistical modeling for the social and behavioral sciences (pp. 311–359). New York: Plenum Press.

Dandurand, F., & Shultz, T. R. (2013). A comprehensive model of development on the balance scale task. Cognitive Systems Research, xxx, xxx-xxx.

Faul, F., Erdfelder, E., Buchner, A., & Lang, A.-G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41, 1149–1160.

Jansen, B. R. J., & van der Maas, H. L. J. (1997). Statistical test of the rule assessment methodology by latent class analysis. Developmental Review, 17, 321–357.

Jansen, B. R. J., & van der Maas, H. L. J. (2002). The development of children's rule use on the balance scale task. Journal of Experimental Child Psychology, 81, 383–416.

Normandeau, S., Larivée, S., Roulin, J. L., & Longeot, F. (1989). The balance-scale dilemma: Either the subject or the experimenter muddles through. Journal of Genetic Psychology, 150, 237–250.

van der Maas, H. L. J., Quinlan, P. T., & Jansen, B. R. J. (2007). Towards better computational models of the balance scale task: A reply to Shultz and Takane. Cognition, 103, 473–479.

van der Maas, H. L. J., & Straatemeier, M. (2008). How to detect cognitive strategies: Commentary on 'Differentiation and integration: Guiding principles for analyzing cognitive change'. Developmental Science, 11(4), 449–453.

Quinlan, P. T., van der Maas, H. L. J., Jansen, B. R. J., Booij, O., & Rendell, M. (2007). Re-thinking stages of cognitive development: An appraisal of connectionist models of the balance scale task. Cognition, 103(3), 413–459.

Shultz, T. R., & Takane, Y. (2007). Rule following and rule use in simulations of the balance-scale task. Cognition, 103, 460–472.

Siegler, R. S. (1981). Developmental sequences within and between concepts. Monographs of the Society for Research in Child Development, 46(2, No. 189).

Siegler, R. S., & Chen, Z. (2002). Development of rules and strategies: Balancing the old and the new. Journal of Experimental Child Psychology, 81, 446–457.

Strauss, S., & Levin, I. (1981). Commentary on Siegler's "Developmental Sequences within and between Concepts." Monographs of the Society for Research in Child Development, 46(2, No. 189), 75–81.

Wilkening, F., & Anderson, N. H. (1982). Comparison of the two rule-assessment methodologies for studying cognitive development and structure. Psychological Bulletin, 92, 215–237.

Appendices - Simulation Study

[Figures omitted: standard deviations of conditional probabilities in the small torque rule class (7 plots)]

[Figures omitted: standard deviations of conditional probabilities in the small addition class (6 plots)]

[Figures omitted: standard deviations of conditional probabilities in the small random class (5 plots)]

Appendices - Empirical Study

Information brochure:

Dear participant, parents, and/or guardians,

Thank you for registering for the study 'Regel ontdekking bij de Balans Schaal Taak' ('Rule discovery on the Balance Scale Task'). This study is carried out by Mr. Y. K. Kunkels and Prof. Dr. H. L. J. van der Maas as part of the Psychology programme at the University of Amsterdam.

Before the study begins, it is important that you are informed about the procedure followed in this study. Please read the text below carefully and do not hesitate to ask questions about it if anything is unclear. The researcher will gladly answer any questions.

Participation

First of all, we would like to point out that you may only participate if you are younger than 16 but older than 6. If you do not meet this requirement, we thank you for registering, but unfortunately you cannot take part in the study.

Purpose of the study

The purpose of this study is to investigate how people discover rules when they complete the Balance Scale Task. With the results we hope to answer several scientific questions that have so far remained untested.

Procedure of the study

The study consists of three parts. In the first part you will receive instructions about the Balance Scale Task questions, after which you may answer these 28 questions. You will be asked to indicate to which side a balance will tip when the support is removed. You can answer the questions by circling your chosen answer on the task form. In the second part of the study you will be shown a second instruction video, after which you may answer the Balance Scale Task questions once more. In the third part we will discuss how it went. You will be asked how you tried to solve the questions, and some demographic details will be requested. The study will take at most 30 minutes.

Confidentiality of data

Your answers are strictly anonymous and will be treated confidentially. The results will be used solely for scientific purposes and will not be provided to third parties.

Voluntary participation

We would like to stress that participation in the study is entirely voluntary. You have the right to end the study at any moment, without giving a reason.

Insurance

Because this study poses no risks to your health or safety, the conditions of the UvA's regular liability insurance apply.

Reward

This study has no formal reward.

Contact

If you have any questions or remarks after the study, you can get in touch via the following e-mail address: ykkunkels@gmail.com

Consent form

Dear participant,

This form belongs to the information brochure you received about the study in which you are taking part. By signing this form you declare that you have read and understood the participant information. By signing you also indicate that you agree with the procedure as described in the information brochure.

Participant / parent / guardian

"I have read and understood the information and give permission for participation in the study and for the use of the data obtained through it. I retain the right to withdraw this consent without giving a reason. I also retain the right to stop the experiment at any moment I wish."

Signed in duplicate:

Date:

………... ………

Researcher

"I have provided information and explanation about the study. I declare that I am willing to answer any further questions about the study to the best of my ability."

Date:

………... ………

Exit interview questions:

A. "Wat vond je van de raadsels?" / "What do you think of the items?"

B. "Lukte het om ze op te lossen?" / "Did you manage to solve the items?"

C. "Hoe heb je ze opgelost?" / "How did you solve them?"

D. "Heb je eerder dit soort raadsels opgelost?" / "Have you ever solved items like this before?"
