Effect of urgency, complexity and reaction time on accuracy over four different decision tasks

Academic year: 2021


Effect of Urgency, Complexity and Reaction Time on Accuracy over Four Different Decision Tasks

Tom Meurs

Bachelorthesis

Student-ID: 10358951

University of Amsterdam

Supervisor: Leendert van Maanen

Wordcount: 3955


Table of contents

Abstract

Introduction

Materials and Methods

Analysis

Discussion


Abstract

A wide range of decision-making models has been described. Within the accumulation-to-bound framework, making decisions under time pressure has recently gained attention. In this thesis, decision making under time pressure was investigated over four tasks: a lexical decision task, a math task, a flanker task and a flash task. It was hypothesized that reaction time, time pressure and complexity have main and interaction effects on accuracy. Thirty students performed 700 trials over the four tasks. Mixed-model regression analyses of accuracy on reaction time, time pressure and complexity were performed, with task and subject as random effects. Reaction time and urgency positively affect accuracy. Furthermore, there was an interaction effect of complexity and reaction time on accuracy. It was concluded that urgency influences accuracy by collapsing boundaries, while changing complexity influences the evidence-gathering process. Further research should indicate how these results fit in drift diffusion models.


Introduction

During our professional and personal lives, we often have to make decisions under time pressure. Big decisions, like which bachelor's program to choose, and small decisions, like which shoes to wear, have to be made by a certain deadline. Depending on how complex and urgent a decision is, people may decide differently than they would for easy decisions where time does not seem to play a role. The aim of this study is to look for effects of urgency on decision making in different tasks with two complexity levels.

Decision making is often modeled by the accumulation-to-bound model (Mulder, Van Maanen & Forstmann, 2014). In this mathematical model, a decision variable accumulates information until a certain boundary is reached (see Figure 1). The rate at which information is gathered is called the drift rate. Within the accumulation-to-bound framework, diffusion models are the most popular type of model in current research. Diffusion models assume that noisy information is gradually sampled from the environment. These models allow for the possibility that certain stimuli are a priori more likely. Furthermore, these models imply a speed-accuracy trade-off that depends on just one parameter: the boundary setting (Bogacz, Brown, Moehlis, Holmes & Cohen, 2006). Finally, it is assumed that some portion of the reaction time is not accounted for by the decision process but by other processes, for instance the motor activity needed to push a button to indicate the decision. These models are not only intuitive, but also seem to have a neuroscientific basis (Gold & Shadlen, 2007). Gathering evidence and making a decision seem to be related to different brain regions and individual neurons (Cook & Maunsell, 2002), which makes the models more valid.

Recently, collapsing boundary models have become popular (Hawkins et al., 2015). Diffusion models typically use fixed boundaries, where the amount of evidence needed to make a decision does not change over time. But it is thought that, as time passes, an urgency signal makes people decide more quickly. This is modeled by collapsing boundaries (see Figure 2), since less evidence is then needed to make a decision. Another way to model it is to add an urgency signal to the accumulated evidence, which has the same effect as collapsing boundaries, since in both cases the boundaries are reached earlier (Ratcliff, Smith, Brown & McKoon, 2016).

Figure 1. Schematic overview of accumulation to bound model from Mulder et al. (2014). A decision variable accumulates evidence and then a decision is made when a certain threshold is reached. Non-decision time is the time that is not accounted for by the decision process.

Figure 2. Taken from Hawkins et al. (2015). A drift diffusion model where over time, the boundaries collapse. Less evidence is then needed to make a decision.
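The difference between fixed and collapsing boundaries can be illustrated with a small simulation. The following is a minimal sketch, not the model used in this thesis: the drift value, the linear shape of the collapse, the floor of 0.1, the time step and the function names are all assumptions chosen for illustration. Each trial is simulated twice on the same noise sequence, once per boundary type, so that the two decision times can be compared directly.

```python
import random


def fixed_bound(t):
    """Fixed boundary: the required evidence stays constant over time."""
    return 1.0


def collapsing_bound(t):
    """Hypothetical linear collapse from 1.0 down to a floor of 0.1."""
    return max(0.1, 1.0 - 0.3 * t)


def simulate_trial(drift, bound_at, seed, dt=0.001, max_steps=3000):
    """Simulate one accumulation-to-bound trial.

    Returns (decision_time_in_seconds, hit_upper_bound).
    """
    rng = random.Random(seed)  # same seed -> same noise sequence
    evidence = 0.0
    for step in range(1, max_steps + 1):
        t = step * dt
        # Euler step: deterministic drift plus Gaussian noise scaled by sqrt(dt).
        evidence += drift * dt + (dt ** 0.5) * rng.gauss(0.0, 1.0)
        if abs(evidence) >= bound_at(t):
            return t, evidence > 0
    # No boundary reached within the time limit: report the limit itself.
    return max_steps * dt, evidence > 0


n_trials = 100
fixed = [simulate_trial(1.0, fixed_bound, seed) for seed in range(n_trials)]
collapsing = [simulate_trial(1.0, collapsing_bound, seed) for seed in range(n_trials)]

print("mean decision time, fixed bound:     ", sum(t for t, _ in fixed) / n_trials)
print("mean decision time, collapsing bound:", sum(t for t, _ in collapsing) / n_trials)
print("proportion upper bound, fixed:       ", sum(up for _, up in fixed) / n_trials)
print("proportion upper bound, collapsing:  ", sum(up for _, up in collapsing) / n_trials)
```

Because the collapsing boundary never lies above the fixed one, a given evidence path crosses it at the same moment or earlier, which is the sense in which less evidence is needed as time passes.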

Another important aspect is the complexity of the evidence on which the decision is based. Sequential sampling theory (Stone 1960, as cited in Palmer, Huk & Shadlen, 2005) states that a stimulus is internally represented by a random variable which is noisy and varies over time. Repeated samples from this representation are compared with a criterion before a decision is made. Palmer, Huk and Shadlen (2005) therefore argued, and showed, that the difficulty of the perceptual stimuli affected both reaction time and accuracy. That is, if stimulus strength was high, reaction time was fast and accuracy was high. If stimulus strength was low, reaction time was slow and accuracy was low.
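The relation between stimulus strength, speed and accuracy can be made concrete with the standard closed-form results for a diffusion process with unit noise, an unbiased starting point and symmetric boundaries. The sketch below is only an illustration under those assumptions: the drift rate stands in for stimulus strength, the boundary value of 1.0 is hypothetical, and non-decision time is left out.

```python
import math


def accuracy(drift, bound):
    """Probability of reaching the correct boundary first."""
    return 1.0 / (1.0 + math.exp(-2.0 * drift * bound))


def mean_decision_time(drift, bound):
    """Mean time until either boundary is reached (decision time only)."""
    if drift == 0:
        # Limit of (bound / drift) * tanh(drift * bound) as drift goes to 0.
        return bound ** 2
    return (bound / drift) * math.tanh(drift * bound)


# Hypothetical drift rates: a low value for weak stimuli, a high value for strong ones.
for label, drift in [("weak stimulus", 0.5), ("strong stimulus", 2.0)]:
    print(label,
          "accuracy:", round(accuracy(drift, 1.0), 3),
          "mean decision time:", round(mean_decision_time(drift, 1.0), 3))
```

With the boundary held constant, a higher drift rate gives both a higher accuracy and a shorter mean decision time, which matches the pattern described above.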

So urgency and difficulty, or complexity, seem to affect accuracy and reaction time. According to the drift diffusion model, reaction time and accuracy covary in a specific way once a boundary level is set. Since the difficulty of perceptual stimuli influences reaction time and accuracy, one could argue that not only stimulus strength but any factor that determines how easily the evidence for a decision can be discriminated affects reaction time and accuracy in the same way. So if it is easy to discriminate evidence for a certain decision, reaction time should be fast and accuracy should be high. Complexity is therefore thought to reduce accuracy and to lengthen reaction time: the more complex the decision is across trials within a task, the more time is needed and the less accurate the response is. This is because hard or complex stimuli slow down the acquisition of evidence: according to sequential sampling theory, the internal random variable is noisier and more samples are needed before the criterion is met. Since the complexity levels within tasks differ between tasks, an interaction effect of reaction time, complexity and task on accuracy is expected. Furthermore, since accuracy and reaction time are linked through the boundary level, and urgency changes a fixed boundary into a collapsing boundary, an interaction effect of urgency and reaction time on accuracy is expected. Also, urgency is expected to shorten reaction times and reduce accuracy, since less evidence is needed to reach a boundary. It is then easier to make mistakes, and less time is needed to make a decision.

Participants completed four different tasks. Within each task, there was an easy and a hard condition for complexity, and a quick and a slow deadline for urgency. The difference between the complexity levels and between the deadlines was task dependent. Participants had to make a binary decision. Reaction time and accuracy were recorded. It is thought that participants will be slower and less accurate in the hard condition than in the easy condition when corrected for task. Also, participants will be faster and less accurate in the quick deadline condition than in the slow deadline condition, again corrected for task. Finally, an interaction effect of urgency and reaction time on accuracy is expected, as well as an interaction effect of urgency, complexity and reaction time on accuracy.

Materials and Methods

Participants. Thirty healthy participants (23 female, mean age = 21.43, SD = 1.1) performed four tasks: a lexical decision task, a flanker task, a calculation task and a flash task. Participants were recruited through the Psychology faculty at the University of Amsterdam, and received a credit for participation. Psychology students need 15 credits to pass the first year of their bachelor. According to self-report, no subject had a history of a neurological, major medical or psychiatric disorder. All participants were native Dutch speakers. Informed consent was obtained from all participants.

Materials. Visual stimuli were generated on a personal computer running Windows, using a resolution of 1920x1200 pixels. The lexical decision task has been described in Ratcliff, Gomez and McKoon (2004) and the flanker task in Kopp, Rist and Mattler (1996). The flash task is based on a paradigm from Brunton, Botvinick and Brody (2013).

During the flash task, participants were instructed to maintain fixation on a cross in the middle of the screen and, when the flashes appeared, to decide which of the two points flashed more often. Participants had to indicate their decision by pressing the left or right arrow button. Afterwards, a new trial began. There were two difficulty levels. At the hard difficulty level, one point was flashing at 60 Hz and the other at 40 Hz. At the easy difficulty level, one point was flashing at 70 Hz and the other at 30 Hz. Prior studies indicate an accuracy of around 75% for the hard level and 85% for the easy level (Brunton et al., 2013). During each trial, the points were presented for a maximum of 2500 ms. The points were grey, 3x3 pixels, 10 cm apart and 15 cm from the bottom of the screen.

During the calculation task, participants got 100 arithmetic exercises. All items were multiple-choice, and there was just one addition, subtraction, division or multiplication operator present. For example: what is 160/5? Answer A: 32. Answer B: 35. Participants had to indicate their decision with a left or right button press.

Crucially, a horizontal bar was presented during every task, 5 cm below the stimuli. The length of the bar decreased, showing the time left in that trial. The idea was to make participants aware of the urgency of their decision. Maass et al. (in prep.) found that this manipulation produces the results expected of a time manipulation: more time urgency makes people decide more quickly and less accurately.

Participants were asked beforehand about any history of medical, neurological or psychiatric disorders. Participants also reported their sex, age and handedness.

Procedure. Participants were informed that they would perform four tasks. Every task started with a training block, followed by two blocks, one with urgency and one without. In the urgency condition participants were instructed to answer within 1 second in the flash task, 0.5 seconds in the flanker task, 0.6 seconds in the lexical decision task and 3 seconds in the math task. In the no-urgency condition participants were instructed to answer within 1.5 seconds in the flash task, 0.75 seconds in the flanker task, 0.9 seconds in the lexical decision task and 4.5 seconds in the math task. All tasks therefore had a ratio of 2/3 between the urgent and the non-urgent deadline. A decreasing horizontal bar showed the time left. Between tasks a screen appeared telling participants that they could relax and pause for 30 seconds.
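The deadline scheme can be written down as a small lookup table, which also makes the 2/3 ratio easy to check. This is a sketch for illustration only; the dictionary layout and the names `DEADLINES` and `deadline_ratio` are assumptions, while the deadline values are those stated in the Procedure.

```python
import math

# Deadlines in seconds per task and urgency condition, as stated in the Procedure.
DEADLINES = {
    "flash": {"urgency": 1.0, "no_urgency": 1.5},
    "flanker": {"urgency": 0.5, "no_urgency": 0.75},
    "lexical decision": {"urgency": 0.6, "no_urgency": 0.9},
    "math": {"urgency": 3.0, "no_urgency": 4.5},
}


def deadline_ratio(task):
    """Ratio of the urgent deadline to the non-urgent deadline for one task."""
    deadlines = DEADLINES[task]
    return deadlines["urgency"] / deadlines["no_urgency"]


ratios = {task: deadline_ratio(task) for task in DEADLINES}
for task, ratio in ratios.items():
    # math.isclose guards against floating-point rounding in the division.
    print(task, round(ratio, 3), math.isclose(ratio, 2 / 3))
```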

Half of the participants started all the tasks in the urgency condition. The other half started the tasks in the no-urgency condition. The training block consisted of 6 trials. Within the lexical decision task and the random dot motion task, the urgency and no-urgency blocks both consisted of 100 trials. The calculation task and the flash task consisted of two blocks of 50 trials.

Within a block, easy and hard problems were randomly selected. The distinction between easy and hard problems was different for each task. In the lexical decision task, easy problems contained real words and hard problems contained pseudo words. In the calculation task, hard problems contained division and multiplication, and easy problems contained subtraction and addition. In the flash task, hard problems contained points flashing at 60 Hz and 40 Hz, so the difference between the two points was more difficult to see. Easy problems contained points flashing at 70 Hz and 30 Hz. In the flanker task, easy problems had flanking arrows that all pointed in one direction, with only the middle arrow pointing in the same or the opposite direction. In hard problems, the flanking arrows could also point in different directions from one another.

[...] answering incorrectly and/or not within the time limit, they got zero points. After each trial participants got feedback about their answer: whether it was correct and within the time limit, and whether they received a point. After each block, they got feedback about how many points they had collected during the block. At the end of the test they received the short questionnaire. Finally, they were debriefed about the purpose of the research and were allowed to ask further questions.

Easy and hard problems were balanced within blocks by random selection. Furthermore, the order of the tasks and the order of the urgency and no-urgency conditions were counterbalanced to account for practice and boredom effects. The sequence of the tasks was: lexical decision task, random dot motion task, calculation task and flash task. While the sequence of the tasks remained unchanged, subjects started at different points in the sequence. In addition, each subject was assigned to start the tasks in either the urgency or the no-urgency condition. So every subject could start the experiment in 1 of 8 starting conditions: 1 of the 4 tasks, combined with urgency or no-urgency.
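The eight starting conditions can be enumerated by rotating the fixed task sequence and crossing it with the two urgency orders. The sketch below is an illustration under assumptions: the task names follow the sequence listed in the paragraph above, and the helper `assign`, which cycles through the conditions by participant number, is hypothetical, since the text does not describe how subjects were allocated.

```python
from itertools import product

# Fixed task sequence, named as in the paragraph above.
TASK_SEQUENCE = ["lexical decision", "random dot motion", "calculation", "flash"]
URGENCY_ORDERS = [("urgency", "no-urgency"), ("no-urgency", "urgency")]


def task_order(start_index):
    """Rotate the fixed sequence so that it begins at the given task."""
    return TASK_SEQUENCE[start_index:] + TASK_SEQUENCE[:start_index]


# 4 possible starting tasks x 2 urgency orders = 8 starting conditions.
STARTING_CONDITIONS = [
    (task_order(start), urgency_order)
    for start, urgency_order in product(range(len(TASK_SEQUENCE)), URGENCY_ORDERS)
]


def assign(participant_number):
    """Hypothetical allocation rule: cycle through the starting conditions."""
    return STARTING_CONDITIONS[participant_number % len(STARTING_CONDITIONS)]


for tasks, urgency_order in STARTING_CONDITIONS:
    print(" -> ".join(tasks), "| first block:", urgency_order[0])
```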

Analysis

First we look at some descriptive statistics. Then we check, for a few participants, whether fatigue and learning effects were present. Finally we use mixed regression modeling to regress accuracy on reaction time, urgency and complexity. Participant 1 was left out of the analysis because this was a pilot session to check whether the test worked correctly. Participants 2, 3 and 4 completed 200 extra trials of the flash task. They are included since we think this did not influence the data set much (see Learning and fatigue effects). So the analysis was carried out over 29 participants.

Descriptive statistics

Figure 3 shows the mean reaction time (MRT) and accuracy (acc) for the different conditions.

Figure 3. Reaction time and accuracy for the different conditions on four different tasks: easy and hard for complexity and short and long for urgency. Yellow is long deadline, red is short deadline. Error bars with standard error of the mean are shown as well.

For the flanker task, mean response time was about equal across the hard and easy conditions and across the short and long deadline conditions. Accuracy was higher in the easy condition than in the hard condition. Accuracy did not seem to differ between the long and short deadlines. In the flash task, people responded more slowly under the long deadline than under the short deadline, with little difference between the easy and hard conditions. Accuracy was higher under the short deadline than under the long deadline, for both hard and easy items. For the lexical decision task, people responded more slowly and more accurately under the long deadline than under the short deadline. Although responses to pseudo words were typically slower than responses to words, pseudo words were classified more accurately than normal words. On the math task, people responded more quickly under the long deadline than under the short deadline, and also scored better. They also responded more quickly in the easy than in the hard condition. Overall, participants took less time in the short deadline condition and were more accurate in the easy condition.

Learning and fatigue effects

Figure 2 shows four plots for different participants. Participant 3 completed 200 more trials of the flash task. Participant 12 was chosen because this participant had the same order of tasks but a different sequence of hard and easy conditions within tasks. Participant 20 had the same order of manipulations as participant 12, to check for person-dependent effects. Participant 8 had a different sequence of tasks and difficulty. Within the tasks and the two conditions there are no particular downward or upward trends. There are a few data points which seem quite high, for instance for participant 8 in the flanker task. It is presumed that these are a few trials in which participants were still getting acquainted with the task, and that they did not influence the data much. Therefore they were not removed from the regression analysis.

Figure 2. Four plots for different participants. The x-axis shows the item number, the y-axis the mean reaction time. Blue is the math task, red is the flash task, black is the flanker task and green is the lexical decision task. Triangles are the hard items, circles the easy ones.


Regression Analysis

To understand how accuracy is affected by reaction time, complexity, urgency and task, a regression analysis was performed. Since participants performed all conditions, a mixed model is proposed with participants as random effect, reaction time, complexity, urgency and task as fixed effects, and accuracy as dependent variable. Since accuracy is 0 or 1, a logit model is proposed:

$$A_{si} = \beta_0 + S_{0s} + \beta_1 T_{1i} + \beta_2 T_{2i} + \beta_3 T_{3i} + \beta_4 C_i + \beta_5 U_i + \beta_6 RT_i + \beta_7 U_i C_i + \dots + \varepsilon_{sit} \qquad (1)$$

Where A = accuracy (0 = incorrect, 1 = correct), RT = reaction time, U = urgency (0 = quick, 1 = slow) and C = complexity (0 = easy, 1 = hard). T1 = flanker, T2 = flash, T3 = lexical decision and T4 = math, whereby the task variables are dummies (0 or 1). S is the random intercept for participants. Indices were given for subject s = 1,...,30 and item i = 1,...,200 for the lexical decision, flanker and flash tasks, and i = 1,...,104 for the math task. ε is normally distributed with mean 0 and variance 1. The ellipsis stands for all the different interaction effects between task, complexity and urgency. The mixed model is analyzed using the R package lme4, described in Barr, Levy, Scheepers and Tily (2013). Since model 1 did not converge, a model with task as random effect was proposed, with a random intercept for both task and participants, and no random slopes. The model then looks like:

$$A_{sit} = \beta_0 + S_{0s} + T_{0t} + \beta_1 RT_i + \beta_2 U_i + \beta_3 C_i + \beta_4 U_i C_i + \beta_5 U_i RT_i + \beta_6 C_i RT_i + \beta_7 U_i C_i RT_i + \varepsilon_{sit} \qquad (2)$$

With the same specifications as in model 1, but now T is the random intercept for task. Indices were given for task t = 1,...,4. This model did converge. Results are shown in Table 1. In a logit model, a β-value translates into a change in accuracy when it is multiplied by the subject's probability of responding correctly and the probability of responding incorrectly. For example, β2 (urgency) in model 2 is -0.18, which corresponds to a change in accuracy of -0.18*P(correct for subject s)*P(incorrect for subject s) for subject s. Urgency, reaction time, complexity and the interaction of complexity and reaction time are significant. Quantile residuals showed near normality except for the tails (see Figure 3; Dunn, 2004).
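The multiplication rule for logit coefficients follows from differentiating the logistic function: if p = 1/(1 + exp(-η)) and the predictor enters η with coefficient β, then the derivative of p with respect to that predictor is β·p·(1 - p). The sketch below evaluates this for the urgency coefficient of -0.18 reported above. The baseline accuracies are hypothetical values chosen for illustration, and for a binary predictor such as urgency the derivative is only an approximation of the actual change.

```python
def marginal_effect(beta, p_correct):
    """Approximate change in P(correct) per unit change in the predictor.

    Derivative of the logistic function: beta * p * (1 - p).
    """
    return beta * p_correct * (1.0 - p_correct)


BETA_URGENCY = -0.18  # urgency coefficient of model 2, as reported in the text

# Hypothetical baseline accuracies for a subject.
for p in (0.5, 0.8, 0.95):
    print("P(correct) =", p, "-> change in accuracy:",
          round(marginal_effect(BETA_URGENCY, p), 4))
```

The same coefficient therefore implies the largest change in accuracy for subjects near 50% correct and a much smaller change for subjects who are already close to ceiling.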

Figure 3. Left: quantile residuals. Right: theoretical quantiles plotted against sample quantiles. The more horizontal the line, the more closely the quantiles follow a normal distribution, which corresponds to a well-fitting model (Dunn, 2004). Since the line is approximately horizontal on the left and diagonal on the right, the model fits the data well and the assumptions underlying the model are reasonably met.

Now a simpler model (3) without interaction effects is proposed, to test whether model 2 is a good model. This model is nested in model 2 through the restriction β4 = β5 = β6 = β7 = 0. Additionally, a fourth model is proposed without complexity (β3 = 0) and a fifth without urgency (β2 = 0):

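$$A_{sit} = \beta_0 + S_{0s} + T_{0t} + \beta_1 RT_i + \beta_2 U_i + \beta_3 C_i + \varepsilon_{sit} \qquad (3)$$

$$A_{sit} = \beta_0 + S_{0s} + T_{0t} + \beta_1 RT_i + \beta_2 U_i + \varepsilon_{sit} \qquad (4)$$

$$A_{sit} = \beta_0 + S_{0s} + T_{0t} + \beta_1 RT_i + \varepsilon_{sit} \qquad (5)$$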

Results for models 3, 4 and 5 are shown in Table 1. All β-values seem to be significant. Models 2, 3, 4 and 5 are nested and can be compared by means of a likelihood ratio test (Pinheiro & Bates, 2000; see Table 2). Model 2 seems to be the best fit, so interaction effects and main effects of complexity and urgency seem to play an important role (χ2(4) = 44.722, p < .001).
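The likelihood ratio test compares twice the difference in log-likelihood between a full and a restricted model with a χ2 distribution whose degrees of freedom equal the number of restrictions (four when comparing model 2 with model 3). The sketch below recomputes the p-value for the reported statistic. It is an illustration, not the analysis code of this thesis: the function names are assumptions, and the closed-form survival function used here only holds for an even number of degrees of freedom.

```python
import math


def likelihood_ratio_statistic(loglik_restricted, loglik_full):
    """Test statistic: twice the gain in log-likelihood of the full model."""
    return 2.0 * (loglik_full - loglik_restricted)


def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of df.

    Uses the closed form exp(-x/2) * sum_{j < df/2} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j  # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total


# Statistic and degrees of freedom as reported for model 2 versus model 3.
p_value = chi2_sf_even_df(44.722, 4)
print("p-value:", p_value, "| below .001:", p_value < 0.001)
```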

So, urgency, complexity and reaction time negatively affect accuracy. Complexity positively interacts with reaction time on accuracy.

Discussion

The overarching goal of this study was to look for effects of urgency on decision making in different tasks with two complexity levels. Overall, complexity led to less accurate and slower responses. Urgency seemed to cause less accurate and quicker responses. The complexity manipulation did not seem to have the expected effect in the lexical decision task, where pseudo words were recognized more accurately and more quickly than normal words. The urgency manipulation did not seem to have the expected effect in the flash task, where accuracy was higher under the short deadline.

Furthermore, complexity seems to interact with reaction time, with a positive effect on accuracy. So in the hard condition, reaction time seems to have a bigger effect on accuracy than in the easy condition. The effects on decision making in different tasks with two complexity levels thus depend on urgency, reaction time, and the interaction between reaction time and complexity.

Table 1. Regression analysis for models 2, 3, 4 and 5. Parameter estimates are shown. Sd = standard deviation, Urg = urgency, RT = reaction time, cplx = complexity. P-values from a t-test are given in brackets. In model 2 the interactions urgency*reaction time, urgency*complexity and urgency*complexity*reaction time are not significant (at a significance level of α = 0.05). In model 5 reaction time is not significant. All other parameters are significant.

Table 2. Likelihood-ratio test (LR-test), whereby Model 2 is compared to Model 3, Model 3 to Model 4, etc. Df is the number of free parameters in the model, logLik is the log-likelihood of the model, Chisq is the χ2-value of the LR-test with Df_L degrees of freedom. P(>Chisq) gives the p-value for the test statistic. Model 2 seems better than Model 3 (α = 0.01). Model 3 seems better than Model 4, and Model 4 better than Model 5.

While these results seem fairly straightforward, they could be biased since the full model was not fitted. Also, the model could be misspecified, since reaction time is now taken as an independent variable, but it is also possible that reaction time depends on accuracy. If someone is less accurate, their reaction time could be slower. That makes reaction time not exogenous but endogenous, which means that the error terms are correlated with reaction time. OLS estimation will then give inconsistent and inefficient results. One way to fix this is to try IV regression, in which certain instrumental variables (IV) are chosen to model reaction time. Instruments have to be strongly correlated with reaction time, but exogenous with respect to accuracy. Such instruments are yet to be found. So far only non-decision time seems to be such a variable (Mulder & Van Maanen, 2013), but it is not expected to differ between the conditions. Therefore a different solution should be tried. One option is structural equation modeling, which can account for endogeneity.

The complexity and urgency manipulations might be invalid. The complexity manipulation in the lexical decision task is very different from, for instance, the one in the math task. Discriminating pseudo words from normal words seems to be different from finding solutions to math problems: different cognitive processes are involved. For the urgency manipulation, the ratio between deadlines was the same across tasks, but since the cognitive processes are so different this might not be enough. The differing results of the manipulations on accuracy and reaction time in the descriptive analyses (Figure 3) suggest that this might be the case. It is therefore hard to tell whether the manipulations in the different tasks are comparable. This might result in invalid and unreliable conclusions. More urgency and complexity manipulations should be checked to see which manipulations give the expected results.

From sequential sampling theory it was reasoned that if it is easy to discriminate evidence for a certain decision, reaction time should be fast and accuracy should be high. Complexity did seem to interact with reaction time, and to have an effect on its own. For the drift diffusion model it is therefore possible that complexity has an effect on evidence gathering: complexity could give more weight to evidence and thereby strengthen the effect of reaction time, since the boundary is reached more quickly. Another way to model this is through collapsing boundaries, but complexity has to do with stimulus strength, and therefore with the way evidence from those stimuli is gathered. Complexity might therefore influence the evidence parameter in the drift diffusion model. Urgency did seem to have an effect on accuracy, but did not seem to change the relationship between complexity, reaction time and accuracy, or between reaction time and accuracy. This is quite surprising, as urgency has to do with collapsing boundaries, and according to the drift diffusion model the relationship between accuracy and reaction time should therefore be changed by urgency. This could mean that parameters other than the ones in the drift diffusion model are involved, so that urgency influences the boundary setting but also some other parameter, with a zero net effect on the relationship between accuracy and reaction time. Another possibility is that urgency does not give rise to collapsing bounds, but more research is needed to confirm this conclusion.

So, people making complex and urgent decisions do differ from people making easier and more relaxed decisions. Complex decisions change the relationship between reaction time and accuracy. Urgency has a negative effect on accuracy. Therefore, people should consider how complex and urgent their decision is, so that they understand their decision process better.


References

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255-278.

Bogacz, R., Brown, E., Moehlis, J., Holmes, P., & Cohen, J. D. (2006). The physics of optimal decision making: A formal analysis of models of performance in two-alternative forced-choice tasks. Psychological Review, 113(4), 700.

Brunton, B. W., Botvinick, M. M., & Brody, C. D. (2013). Rats and humans can optimally accumulate evidence for decision-making. Science, 340(6128), 95-98.

Cook, E. P., & Maunsell, J. H. (2002). Attentional modulation of behavioral performance and neuronal responses in middle temporal and ventral intraparietal areas of macaque monkey. The Journal of Neuroscience, 22(5), 1994-2004.

Dunn, P. K. (2004). Occurrence and quantity of precipitation can be modelled simultaneously. International Journal of Climatology, 24(10), 1231-1239.

Gold, J. I., & Shadlen, M. N. (2007). The neural basis of decision making. Annual Review of Neuroscience, 30, 535-574.

Kopp, B., Rist, F., & Mattler, U. (1996). N200 in the flanker task as a neurobehavioral tool for investigating executive control. Psychophysiology, 33(3), 282-294.

Mulder, M. J., & Van Maanen, L. (2013). Are accuracy and reaction time affected via different processes? PLoS One, 8(11), e80222. doi: 10.1371/journal.pone.0080222

Mulder, M. J., Van Maanen, L., & Forstmann, B. U. (2014). Perceptual decision neurosciences: A model-based review. Neuroscience, 277, 872-884.

Mulder, M. J., Wagenmakers, E. J., Ratcliff, R., Boekel, W., & Forstmann, B. U. (2012). Bias in the brain: A diffusion model analysis of prior probability and potential payoff. The Journal of Neuroscience, 32(7), 2335-2343.

Palmer, J., Huk, A. C., & Shadlen, M. N. (2005). The effect of stimulus strength on the speed and accuracy of a perceptual decision. Journal of Vision, 5(5), 1-1.

Pinheiro, J. C., & Bates, D. M. (2000). Linear mixed-effects models: Basic concepts and examples. Mixed-effects models in S and S-Plus, 3-56.

Ratcliff, R., Smith, P. L., Brown, S. D., & McKoon, G. (2016). Diffusion decision model: Current issues and history. Trends in Cognitive Sciences, 20(4), 260-281.

Ratcliff, R., Gomez, P., & McKoon, G. (2004). A diffusion model account of the lexical decision task. Psychological Review, 111(1), 159.
