
UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl)

UvA-DARE (Digital Academic Repository)

The hare or the tortoise? Modeling optimal speed-accuracy tradeoff settings

van Ravenzwaaij, D.

Publication date

2012


Citation for published version (APA):

van Ravenzwaaij, D. (2012). The hare or the tortoise? Modeling optimal speed-accuracy tradeoff settings.


Conclusion

In this dissertation, I have examined forced–choice decision–making from both a theoretical and an empirical perspective. Theoretical issues include optimality of performance, bias in decision–making, explaining findings from the IQ world, and parameter recovery. Empirical issues include performance decrements following alcohol consumption, the measurement of implicit prejudice, and risk–taking behavior in decision–making. The main message that ties all chapters together is the following: in order to properly interpret your findings and draw psychologically meaningful conclusions, it is essential to view response time data through the lens of a cognitive process model. In what follows, I will first summarize the main conclusions of this dissertation. Then, I will discuss the importance of model validation, as well as the validity of the diffusion model. I will finish with some concluding remarks about the importance of modeling in psychology.

9.1 Summary of Results

IQ

Researchers in the IQ world have discovered a number of key findings that relate RT to general intelligence, or g. These key findings include:

1. The fact that RT distributions are right–skewed.

2. The worst performance rule, meaning that relatively slow RTs correlate relatively strongly with g.

3. The fact that g correlates more strongly with the standard deviation of RT than with the mean RT.

4. The linear relation between the standard deviation of RT and the mean RT.

5. Linear Brinley plots, meaning that across several tasks that vary in difficulty, the mean RT of a group of low–g people is a constant multiple of the mean RT of a group of high–g people.


In chapter 2 it was demonstrated that all of the above key findings can be accounted for by a single DDM parameter: drift rate, or the speed of information processing. The drift rate parameter of the DDM provides an elegant, quantitative, and unifying account of these previously disparate empirical phenomena. This result showcases how a model–based analysis can result in a perspective on cognitive performance that is more insightful than a summary of the behavioral data alone.
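To make the role of drift rate concrete, the following is a minimal simulation sketch and not the analysis from chapter 2. It assumes an unbiased diffusion process simulated with simple Euler steps; the function names and all parameter values (drift rates of 0.3 and 0.1, boundary separation 0.12, non–decision time 0.3 s, scaling s = 0.1) are hypothetical choices for illustration. Only the drift rate differs between the two simulated groups.

```python
import numpy as np

def simulate_ddm(v, a, ter, n_trials, s=0.1, dt=0.001, max_t=5.0, seed=0):
    """Simulate an unbiased diffusion process (start at a/2) with Euler steps.

    Returns (rt, correct) for the trials that reached a boundary within max_t.
    All settings are illustrative assumptions, not values from the thesis.
    """
    rng = np.random.default_rng(seed)
    x = np.full(n_trials, a / 2.0)          # accumulated evidence per trial
    t = np.zeros(n_trials)                  # elapsed decision time per trial
    active = np.ones(n_trials, dtype=bool)  # trials still between the boundaries
    for _ in range(int(max_t / dt)):
        if not active.any():
            break
        n_active = int(active.sum())
        x[active] += v * dt + s * np.sqrt(dt) * rng.standard_normal(n_active)
        t[active] += dt
        active &= (x > 0.0) & (x < a)
    done = ~active
    return t[done] + ter, x[done] >= a

def summarize(rt):
    """Mean, standard deviation, and skewness of a response time sample."""
    m, sd = rt.mean(), rt.std()
    skew = ((rt - m) ** 3).mean() / sd ** 3
    return {"mean": m, "sd": sd, "skew": skew}

# Hypothetical groups: identical except for the drift rate.
high_g = summarize(simulate_ddm(v=0.3, a=0.12, ter=0.3, n_trials=4000, seed=1)[0])
low_g = summarize(simulate_ddm(v=0.1, a=0.12, ter=0.3, n_trials=4000, seed=2)[0])

for label, stats in (("high drift", high_g), ("low drift", low_g)):
    print(label, {k: round(float(val), 3) for k, val in stats.items()})
```

Under these assumptions, both simulated RT distributions should come out right–skewed, and lowering the drift rate should raise the mean and the standard deviation of RT together, in line with findings 1 and 4 above.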

Equivalence

Chapter 3 discusses formal models of decision making from two perspectives: optimality in terms of performance and biological plausibility. In this chapter, the DDM is discussed as a representative of an optimally performing model and both the LCA and the FFI are discussed as representatives of biologically plausible models. We scrutinized an attempt by Bogacz et al. (2006) to reduce neural inhibition models like the LCA and the FFI to the DDM. We have shown that both for the LCA and the FFI there are complications that have to be overcome before each of these models can be reduced to the DDM.

For both the LCA and the FFI, we showed that model equivalence with the DDM is compromised by the necessity to truncate negative accumulator activation values at zero. This complication could be overcome by assuming baseline activity of at least half the threshold firing rate. For the FFI, the resulting non–truncated version of the model is identical to the DDM.

For the LCA, there is the additional problem that the proposed model equivalence leads to a DDM boundary separation with across–trial variability, a condition which the DDM does not allow. Our simulations showed that the DDM outperforms the LCA; the models cannot be fully equated.

Bias

How do people incorporate prior knowledge about the correct response in their decisions? Chapter 4 examines decision making in situations where one of the two response alternatives is positively biased. When modeled by the DDM, bias in a decision situation can be reflected by a shift in starting point or by a shift in drift rate criterion.

This chapter thoroughly investigates a dissociation in optimal performance that was proposed by Hanks et al. (2011). When the decision maker is aware of the difficulty of the choice prior to the decision (i.e., when across–trial stimulus difficulty is fixed), optimal performance dictates a shift in starting point to accommodate bias (see e.g., W. Edwards, 1965). On the other hand, when the decision maker is not aware of the difficulty of the choice prior to the decision (i.e., when across–trial stimulus difficulty is variable), a decision maker should shift his or her drift rate criterion in addition to a shift in starting point (see e.g., Yang et al., 2005). Our results show that both for fixed and variable across–trial stimulus difficulty, optimal performance can be accomplished by shifting just the starting point.

The second part of this chapter presents an empirical study aimed to uncover how people implement bias in situations of fixed and variable across–trial stimulus difficulty in practice. We also tested selective influence of the across–trial drift rate variability parameter, or η. The results show that there is no difference in the way people implement bias between conditions of fixed and variable across–trial stimulus difficulty. Additionally, across–trial drift rate variability η is selectively influenced by the across–trial stimulus difficulty manipulation.
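As a hedged illustration of the two ways in which bias can enter the DDM, the standard first–passage result for a diffusion process with drift v, scaling s, boundaries at 0 and a, and starting point z can be written down directly. The symbol v_c for the drift rate criterion is an assumption introduced here for illustration, and the expression ignores across–trial variability.

```latex
% Probability of terminating at the upper (biased) boundary, for v \neq 0:
P(\text{upper} \mid v, a, z) = \frac{1 - e^{-2 v z / s^{2}}}{1 - e^{-2 v a / s^{2}}}

% Unbiased starting point z = a/2 reduces this to:
P(\text{upper} \mid v, a, z = a/2) = \frac{1}{1 + e^{-v a / s^{2}}}

% Starting point bias: move z toward a (z > a/2).
% Drift rate criterion bias: replace v by v + v_c with v_c > 0.
```

Both manipulations raise the probability of the biased response. The difference is that a shift in starting point mostly affects fast decisions, whereas a shift in drift rate criterion acts throughout the accumulation process.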


Technique

In chapter 5, three different implementations of the DDM were investigated: the EZ model, fast–dm, and DMAT. These fit routines were compared to one another by their ability to recover both the mean structure and individual differences in parameter values. The results show that if researchers are willing to assume that their data have actually been generated by a DDM process, all three implementations of the DDM are capable of recovering the model parameters with reasonable accuracy. The routines differ only in the specifics: DMAT requires a relatively large number of data points, whereas EZ and fast–dm provide useful estimates even with as few as 80 trials per condition. EZ and DMAT are better able than fast–dm to reflect experimental effects on parameters, with DMAT being the best at giving an unbiased estimate of parameter means. EZ is superior to both fast–dm and DMAT in terms of recovering individual differences in parameter values.

If researchers are not willing to assume that their data are generated by a DDM process, interpretation of the diffusion model parameter estimates becomes less straightforward. Each of the DDM implementations was able to provide decent DDM parameter estimates for the corresponding parameters in the LCA model, but when the data were generated by the linear ballistic accumulator model, parameter correspondence degenerated substantially. Thus, the validity of DDM parameters does not depend on the correctness of all assumptions of the DDM, but on the correctness of a set of relatively general assumptions that are shared between the DDM and the LCA model but not the linear ballistic accumulator model.

Alcohol

The nature of alcohol–induced performance decrements was the topic of chapter 6. In an empirical test, participants returned to the lab on three separate occasions. In each session, the participants consumed either a placebo dose, a moderate dose (blood alcohol content: 0.5 g/l), or a high dose (blood alcohol content: 1 g/l). Subsequently, performance of the participants on a perceptual discrimination task was tested. While it was expected that performance would deteriorate with increasing amounts of alcohol, the key question was at what stage of alcohol intake both the cognitive and the motor components of decision making would be affected.

The results showed that participants both slowed down and made more mistakes with progressively larger doses of alcohol. Our diffusion model analysis revealed that the relatively poor performance following alcohol intake is partly caused by a lower drift rate, signifying a decrease in the rate of information processing, and partly caused by an increase in non–decision time, signifying a deterioration of motor processes. Specifically, the negative effects of alcohol on motor processes manifest themselves at a higher alcohol dose than the negative effects on cognitive performance.

Prejudice

Measuring something as sensitive as racial prejudice is not easy. Simply asking someone how he or she feels about Moroccan people may result in socially desirable answers that do not reflect someone’s true opinion. Given this pitfall, it is tempting to explore alternative measures that may be less sensitive to the problems of explicit self–report. The most popular one is the Implicit Association Test, or IAT, which attempts to measure prejudice implicitly (hence the name). In a simple categorization task, the mean RT in a compatible block (in which Dutch names and positive attributes require one button press and Moroccan names and negative attributes require another) is compared to the mean RT in an incompatible block (in which Moroccan names and positive attributes require one button press and Dutch names and negative attributes require another). The difference in performance between these two blocks is called the IAT–effect and is supposed to be an implicit measure of racial prejudice that is free of the problems of social awareness that make self–report such a problematic measure.

In chapter 7, I examine a different explanation of the IAT–effect: in–group/out–group membership. Three name IATs were used in a between–subjects design: a Dutch–Moroccan, a Dutch–Finnish, and a Finnish–Moroccan IAT. If in–group/out–group membership causes the IAT–effect, effects should be present for both the Dutch–Moroccan and the Dutch–Finnish IATs, but not for the Finnish–Moroccan IAT. If, on the other hand, implicit racial prejudice causes the IAT–effect, effects should be present for both the Dutch–Moroccan and the Finnish–Moroccan IATs, but not for the Dutch–Finnish IAT.

The results show that in the name–race IAT, the IAT–effect may be mistakenly attributed to the presence of an implicit racial prejudice. The results showed no effect when Moroccan names were contrasted with Finnish names, and an equivalent effect when Dutch names were contrasted with either Moroccan or Finnish names. This suggests that the racially–charged Moroccan names were processed in a similar fashion as the racially–neutral Finnish names. Thus, while the idea of implicit measurement has intuitive appeal, caution must be exercised in accepting the interpretation of such measures at face value.

Risk

The 8th and final chapter of this dissertation dealt with decision making in a task that examines risk–taking behavior: the Balloon Analogue Risk Task, or BART. The chapter examines four cognitive process models that attempt to decompose performance on the BART into meaningful psychological components. The different models include parameters that quantify the rate at which the decision maker learns during the task, the amount of risk the decision maker takes, and the behavioral consistency. In a parameter recovery study, I investigate whether the four models can accurately recover their parameters. The results showed that only a 2–parameter model with risk parameter γ+ and behavioral consistency parameter β could recover its parameters adequately.

A second aim of this chapter was to empirically validate this BART model. Using the same design as in chapter 6, the effects of alcohol on BART performance were investigated. The BART data were analyzed with a Bayesian hierarchical implementation of the 2–parameter BART model. This analysis showed that alcohol consumption leads to a modest increase in risk taking and a modest decrease in behavioral consistency.
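The roles of the two parameters can be sketched in a few lines. The block below assumes the commonly used formulation of the 2–parameter BART model, in which a target number of pumps ω = −γ+ / ln(1 − p) is derived from the risk parameter γ+ and a believed burst probability p, and the probability of pumping on pump opportunity l is 1 / (1 + exp(β(l − ω))). This formulation, the function name, and all numerical values are assumptions for illustration and are not taken from the text above.

```python
import numpy as np

def pump_probabilities(gamma_plus, beta, p_burst, max_pumps=30):
    """Return (omega, probs) under the assumed 2-parameter BART formulation.

    omega: target number of pumps implied by risk parameter gamma_plus.
    probs: probability of pumping at pump opportunities 1..max_pumps,
           where beta controls how sharply behavior follows the target.
    """
    omega = -gamma_plus / np.log(1.0 - p_burst)
    opportunities = np.arange(1, max_pumps + 1)
    probs = 1.0 / (1.0 + np.exp(beta * (opportunities - omega)))
    return omega, probs

# Hypothetical parameter values; p_burst is the believed burst probability.
omega_cautious, probs_cautious = pump_probabilities(gamma_plus=0.9, beta=0.6, p_burst=0.15)
omega_risky, probs_risky = pump_probabilities(gamma_plus=1.4, beta=0.6, p_burst=0.15)

print("target pumps (cautious, risky):", round(float(omega_cautious), 2),
      round(float(omega_risky), 2))
print("P(pump) at opportunity 8 (cautious, risky):",
      round(float(probs_cautious[7]), 3), round(float(probs_risky[7]), 3))
```

In this sketch a larger γ+ moves the target number of pumps upward (more risk taking), while a larger β makes the pump probability drop more sharply around the target (more consistent behavior).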

9.2 Model Validity

This thesis discusses both theoretical work and applications of cognitive models for decision making, in particular for the diffusion model for speeded forced–choice response time tasks. The value of this work is dependent on the validity of the diffusion model. For a model to be valid, it has to pass at least three tests (see Figure 9.1). In the next three subsections, I will briefly discuss how the diffusion model passes each of these tests.

Figure 9.1: Flowchart of the three tests for model validity. A model is valid if its parameters can be recovered, if its parameters can be manipulated selectively, and if it provides a good fit to data. A "no" on any of the three tests means that the model is not valid.

Parameter Recovery

As a first step towards validity, a model must be capable of recovering its parameters. This means that when a model generates data based on a chosen set of parameter values and the model is then used to estimate parameters for this generated dataset, the resulting parameter estimates should be close to the original set of parameter values.
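The recovery procedure described above can be sketched as follows. This is a minimal illustration, not the procedure of chapter 5: it generates data from an unbiased diffusion process with hypothetical parameter values and then estimates the parameters with the closed–form EZ equations, chosen here only because they fit in a few lines. The simulation settings (step size, number of trials, scaling s = 0.1) are assumptions.

```python
import numpy as np

def simulate_ddm(v, a, ter, n_trials, s=0.1, dt=0.001, max_t=5.0, seed=0):
    """Generate (rt, correct) from an unbiased diffusion process via Euler steps."""
    rng = np.random.default_rng(seed)
    x = np.full(n_trials, a / 2.0)
    t = np.zeros(n_trials)
    active = np.ones(n_trials, dtype=bool)
    for _ in range(int(max_t / dt)):
        if not active.any():
            break
        n_active = int(active.sum())
        x[active] += v * dt + s * np.sqrt(dt) * rng.standard_normal(n_active)
        t[active] += dt
        active &= (x > 0.0) & (x < a)
    done = ~active
    return t[done] + ter, x[done] >= a

def ez_diffusion(pc, vrt, mrt, s=0.1):
    """EZ estimates of drift rate, boundary separation, and non-decision time.

    pc: proportion correct (must not be exactly 0, 0.5, or 1),
    vrt, mrt: variance and mean of the correct response times in seconds.
    """
    logit = np.log(pc / (1.0 - pc))
    x = logit * (logit * pc**2 - logit * pc + pc - 0.5) / vrt
    v = np.sign(pc - 0.5) * s * x**0.25
    a = s**2 * logit / v
    y = -v * a / s**2
    mean_decision_time = (a / (2.0 * v)) * (1.0 - np.exp(y)) / (1.0 + np.exp(y))
    return v, a, mrt - mean_decision_time

# Step 1: generate data from known (hypothetical) parameter values.
true_v, true_a, true_ter = 0.2, 0.12, 0.3
rt, correct = simulate_ddm(true_v, true_a, true_ter, n_trials=5000, seed=3)

# Step 2: estimate the parameters from the generated data.
v_hat, a_hat, ter_hat = ez_diffusion(
    pc=correct.mean(), vrt=np.var(rt[correct], ddof=1), mrt=rt[correct].mean()
)

# Step 3: compare estimates with the generating values.
print("true     :", true_v, true_a, true_ter)
print("recovered:", round(float(v_hat), 3), round(float(a_hat), 3),
      round(float(ter_hat), 3))
```

With many trials the estimates should land close to the generating values. Repeating the comparison with far fewer trials gives a feel for why the number of observations matters for recovery.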

The diffusion model passes this test with flying colors. In chapter 5 I discuss how different implementations of the diffusion model are each capable of recovering their parameters. The model’s ability to recover its parameters is also discussed by Ratcliff and Tuerlinckx (2002) and by Vandekerckhove and Tuerlinckx (2007). A requirement for the diffusion model to reliably recover its parameters is that there should be sufficient observations. This is particularly important for the full diffusion model with dispersion parameters for drift rate η, starting point sz, and non–decision time Ter.

Specific Influence

The second test for model validity is that of specific influence (e.g. Riefer, Knapp, Batchelder, Bamber, & Manifold, 2002). When researchers selectively manipulate a psychological process, this must be reflected in the model parameter that represents this process. Additionally, no other model parameter should be affected. As an example, a researcher could manipulate response caution by including two conditions in his or her experiment. In one condition, participants are instructed to respond as quickly as possible (the speed condition). In the other condition, participants are instructed to respond as accurately as possible (the accuracy condition). When the model is fit to the data, participants should have a higher estimate for boundary separation a in the accuracy condition than in the speed condition.
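The speed–accuracy example can be illustrated with the closed–form predictions of an unbiased diffusion process without across–trial variability. The sketch below is an assumption–laden illustration: the function name and the parameter values (drift rate 0.2, non–decision time 0.3 s, boundary separations 0.08 and 0.16, scaling s = 0.1) are hypothetical, and only boundary separation a differs between the two conditions.

```python
import math

def ddm_predictions(v, a, ter, s=0.1):
    """Predicted accuracy and mean RT for an unbiased diffusion process.

    Accuracy is 1 / (1 + exp(-a v / s^2)); mean decision time is
    (a / 2v) * tanh(a v / (2 s^2)); non-decision time ter is added on top.
    """
    k = a * v / s**2
    accuracy = 1.0 / (1.0 + math.exp(-k))
    mean_rt = ter + (a / (2.0 * v)) * math.tanh(k / 2.0)
    return accuracy, mean_rt

# Hypothetical conditions: same drift rate and non-decision time,
# lower boundary separation under speed stress than under accuracy stress.
speed_condition = ddm_predictions(v=0.2, a=0.08, ter=0.3)
accuracy_condition = ddm_predictions(v=0.2, a=0.16, ter=0.3)

print("speed condition    (accuracy, mean RT):", speed_condition)
print("accuracy condition (accuracy, mean RT):", accuracy_condition)
```

A larger boundary separation raises both predicted accuracy and predicted mean RT while drift rate and non–decision time stay fixed, which is the pattern a selective manipulation of response caution should produce.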

Voss et al. (2004) confirmed that parameters of the diffusion model do indeed map very specifically onto psychological processes. In particular, they found that stimulus difficulty maps onto drift rate v, speed–accuracy manipulations map onto boundary separation a, the speed of motor responses maps onto non–decision time Ter, and reward rate manipulations map onto starting point z. In similar work by Ratcliff (2002), specificity of influence was demonstrated for both drift rate v and boundary separation a. In chapter 4, I show how variability in stimulus difficulty maps onto the standard deviation of drift rate η.

Fitting Data

The last test a model must pass in order to be valid is that it must be able to fit data. This is probably the easiest test for the diffusion model to pass, as it has been shown to fit data in a wide range of experimental paradigms, including, but not limited to, brightness discrimination, letter identification, lexical decision, recognition memory, and signal detection (e.g., Ratcliff, 1978; Ratcliff, Gomez, & McKoon, 2004; Ratcliff et al., 2006b; Klauer et al., 2007; Wagenmakers, Ratcliff, et al., 2008; Dutilh et al., 2009; Ratcliff et al., 2010). In chapters 6 and 7 I show two other instances where the diffusion model is successful in fitting data.

Apart from being able to fit real data, a model must not be capable of fitting all kinds of data. A model that is capable of fitting all kinds of data is probably overparameterized. Ratcliff (2002) demonstrated that the diffusion model is incapable of fitting a number of synthetic data patterns. Examples include normally distributed response times, error response times that are exactly 100 ms shorter than corresponding correct response times, reduced accuracy with response times kept constant, and data in which high accuracy conditions have relatively high response times. As such, we can conclude that the diffusion model is capable of fitting data, but is unlikely to be overparameterized.

Interim Conclusion

The diffusion model is a useful model that can recover its parameters, has parameters that map onto specific psychological processes, and can fit data, without being overparameterized. As such, the DDM passes three necessary criteria for model validity. Although more desirable criteria exist for a cognitive process model (such as neurological plausibility, see chapter 3), the DDM has proved to be among the most useful models in experimental psychology.

9.3 Concluding Remarks

Decision making is strongly influenced by the speed–accuracy trade–off. In this dissertation, I have tackled the issue of speed versus accuracy from both a theoretical and an empirical viewpoint and have been drawn time and time again to the same inescapable conclusion: speeded decision making cannot be thoroughly investigated without a formal quantitative process model. Only by transforming relatively uninformative behavioral data into meaningful psychological processes can there be hope for a more complete understanding of the factors that influence performance in human decision making.
