Collapsing Boundaries

Bachelor Thesis

Niklas Frerichs (10768238)


Abstract

In the present study, we sought to test whether introducing deadlines into an expanded-judgment decision task would induce collapsing decision criteria, and whether the shape of these decision criteria can be altered by the duration of the deadline. This study adds to the existing evidence for dynamic decision criteria in dynamic task environments.


Introduction

The process by which a human comes to a decision is a complex one. When deciding whether or not to take a job offer, one must integrate information from many different sources (job descriptions, financial considerations, advice from friends, etc.) to ultimately form a decision that leads to a corresponding action. Decisions usually have to be made within a given time window, so they are also influenced by the time one needs to make them. In the case of the job offer, this could be a deadline the organization sets for responding to the offer. Such complexity can hardly be adequately addressed in a laboratory setting. Therefore, research on human decision making has mainly focused on simple perceptual decision-making tasks. Typically these are two-alternative forced-choice (2AFC) tasks, in which participants are presented with a noisy stream of sensory information and are required to choose between two different interpretations. Earlier research has led to a conceptualization of the decision process underlying such tasks called the drift diffusion model (DDM).

During the decision-making process, task-relevant information is encoded in early sensory areas and subsequently delivered to other brain areas, where it accumulates over time (Thura et al., 2012). A decision is reached when the accumulated information favoring one interpretation over the other reaches a certain threshold. This conceptualization focuses on the case of a choice between two alternatives. Here, decision makers are assumed to accumulate noisy evidence until it reaches a preset criterion, at which point they commit to the corresponding choice. In recent years, many studies that investigated the distributions of response times in perceptual decision tasks have found data supporting this idea of decision making (Palmer et al., 2005; Ratcliff, 2002; Reddi & Carpenter, 2000). But evidence in accordance with the DDM comes not only from studies of behavioral output (e.g., response time).

Physiological data from neurological studies with humans have shown EEG, MEG, and fMRI patterns in perceptual decision-making tasks that are best explained by a DDM (Ratcliff et al., 2009; Philiastides & Sajda, 2006). The standard assumption of the DDM is that there is a static decision criterion (SDC). This SDC is the stable threshold that needs to be reached to elicit a decision; stable here means that the value of the threshold does not change as decision time passes.

However, much of the earlier research has been done using static environments. A new line of research claims that in dynamic environments, a different decision model yields better results concerning economic optimality (Cisek, Puskas, & El-Murr, 2009; Shadlen & Kiani, 2013; Thura, Beauregard-Racine, Fradet, & Cisek, 2012). While the standard DDM features a SDC, the new model assumes a dynamic decision criterion (DDC). When using a DDC, the threshold that needs to be reached to elicit a decision is not stable but changes over time. That is, the required amount of information favoring one decision over the other changes as time passes. It is important to note that the DDC claim is bound to dynamic environments: only in dynamic environments is a DDC expected to be used. So what makes an environment dynamic as opposed to stable?

In the broadest definition, a dynamic environment can be considered one where one or more variables of the environment change within or between trials. Given the simplistic design of the 2AFC tasks that are used to study the DDM, there are not many variables that can be varied without compromising the task's suitability for studying the DDM. For example, the DDM only applies to situations where information accumulates in favor of one hypothesis over the other. To study the DDM, there can therefore only be two choice options per trial. Having varying numbers of answer possibilities across trials would make the environment dynamic but also useless for studying the DDM. There are, however, two binary characteristics of the 2AFC task that can be altered to create a dynamic environment while leaving the task itself intact: first, whether the task difficulty varies between trials, and second, whether the reward for a correct answer on a trial depends on the response time. When the reward for a correct answer depends on the response time, this introduces within-trial variation, since the reward given for a correct answer changes as a function of the time spent on a trial. A deadline creates a special form of time dependency of the reward (details follow in the next paragraph).

In a stable environment, the difficulty of the decision task does not vary across trials and the reward for a correct answer is independent of the response time on a trial. The latter means that it makes no difference to the reward for a correct decision whether a decision maker answers after one second, five seconds, or 20 minutes. In dynamic environments, however, the task difficulty changes between trials and/or the reward for a correct answer depends on the response time. The task-difficulty condition can be changed by having different signal-to-noise ratios across trials. The signal-to-noise ratio is a measure of signal strength relative to background noise. In the 2AFC context, it specifies the ratio between the evidence indicating the correct answer (the signal) and the evidence favoring no answer or indicating the wrong answer (the noise).

The response-time dependence condition can be changed in two ways. One option is to introduce a gradual dependence, where, for example, the reward gradually becomes smaller as time passes. The other is to create a binary dependence, where the reward size is fixed at a specific value until a certain amount of time (a deadline) has passed. After the deadline, the reward size is reduced to zero.

Now that the possibilities for making an environment dynamic have been discussed, I will explain why such dynamics in the environment should theoretically elicit a DDC. Before considering the theoretical explanations, it is important to note that the assumption of reward rate optimization (RRO) underlies each of the three explanations. This assumption holds that a rational agent will always try to maximize their reward rate (Simen, Contreras, Buck, Hu, Holmes, & Cohen, 2009). Reward rate is defined as the expected number of units of reward per unit of time (Drugowitsch, Moreno-Bote, Churchland, Shadlen, & Pouget, 2012). Furthermore, this reward rate is maximized not for every single trial but over a series of trials (Boehm et al., 2015).

Based on the RRO assumption, it follows that in an environment with stable task difficulty, economic optimality can be assured by using a SDC (Boehm et al., In Preparation). Since the task difficulty is stable, the expected time for making a decision is the same across trials. In this case, there is a certain optimal stable threshold that yields the highest reward rate over the whole sequence of trials. When the task difficulties vary, however, a DDC yields better results (Boehm et al., In Preparation). With varying task difficulties, the expected time for making a decision is no longer stable, since easier trials take the participant less time than more difficult trials. Given that the reward for a correct answer is the same for each trial, a participant can lose a lot of time on a difficult trial without getting more reward for answering correctly. By using a DDC with collapsing bounds, the decision maker ensures that they do not lose too much time on hard trials and can instead invest that time in easier trials. Investing time in easier trials is more promising because the probability of obtaining the reward within a short amount of time is higher there (Shadlen & Kiani, 2013; Cisek et al., 2009; Thura et al., 2012).

Creating a gradual response-time dependence of the reward size should also elicit a DDC (Busemeyer & Rapoport, 1988; Drugowitsch et al., 2012). Introducing sampling costs to an otherwise stable environment creates a gradual response-time dependence. Sampling costs are the costs that arise from delaying the decision for at least one time step; they are subtracted from the reward and can therefore be considered reductions in the size of the reward. In an environment where the sampling costs per time unit increase as time passes, a collapsing DDC yields the best results. The cause lies in the trade-off between the probability of answering correctly and the costs of waiting. While the probability of answering correctly increases steadily over time, the costs of increasing that probability (getting more information) rise ever more rapidly as time passes. As a consequence, the expected actual reward (reward minus sampling costs) decreases ever more rapidly over time. A SDC does not cope with this gradual time dependency of the actual reward; a collapsing DDC does, and would therefore result in a higher reward rate.

Frazier and Yu (2010) explained in their theoretical paper why a binary response-time dependence of the reward size should lead to a collapsing DDC that reaches its minimum at the time of the deadline. The reasoning is similar to the sampling-costs explanation. With increasing time, more and more information is accumulated, which increases the probability of answering correctly. But with increasing time, the probability of the deadline occurring also increases, and answering after the deadline results in no reward. Therefore, the expected reward (obtained by combining the two probabilities) decreases over time and becomes zero at the moment the deadline occurs. A DDC takes the decreasing expected reward into account while a SDC does not, so a DDC is better suited to increasing the reward rate.
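This reasoning can be made concrete with a small numeric sketch (my own illustration, not taken from Frazier and Yu; the accuracy curve and the deadline distribution below are invented assumptions):

```python
import numpy as np
from scipy.stats import norm

# Toy version of the deadline argument: accuracy grows with deliberation
# time, but so does the probability that an uncertain deadline has already
# passed; their combination, the expected reward, collapses toward zero.
t = np.linspace(0.01, 2.0, 200)                # deliberation time (s)

p_correct = 0.5 + 0.4 * (1 - np.exp(-2 * t))   # assumed accuracy curve: 0.5 -> 0.9
p_deadline_passed = norm.cdf(t, loc=1.5, scale=0.2)  # assumed deadline ~ N(1.5 s, 0.2 s)

expected_reward = p_correct * (1 - p_deadline_passed)  # reward normalized to 1

# Expected reward first rises (accuracy gains dominate), then collapses to ~0
# as responding after the deadline becomes almost certain:
print(np.round(expected_reward[::40], 3))
```

A decision criterion that collapses toward the deadline tracks this falling expected reward, which is the intuition behind Frazier and Yu's result.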

Empirical data supporting the use of a collapsing DDC in an environment with varying task difficulties has been found in several studies (Drugowitsch et al., 2012; Shadlen & Kiani, 2013). There is also neurophysiological data from studies with monkeys that supports the idea of a collapsing DDC in such environments: a DDM with a DDC fit the firing patterns of neurons better than a DDM with a SDC (Ditterich, 2006; Hanks et al., 2014). Data supporting the use of a collapsing DDC in an environment with increasing sampling costs has also been found (Drugowitsch et al., 2012; Boehm et al., In Preparation). A study by Gluth, Rieskamp, & Büchel (2013) additionally provides neurological data from humans that supports this idea. In their study, participants had to make decisions under either low or high sampling costs. Under high sampling costs, participants showed an EEG pattern that was best explained by a collapsing DDC (Gluth et al., 2013). It is important to note that all of the studies mentioned above featured experimental set-ups designed to assess whether participants were using a dynamic or a static decision criterion. Accordingly, these studies compared the evidence in their data for dynamic DCs with the evidence for static DCs. Our study, however, attempts a parametric manipulation of the shape of the decision bounds. By creating three conditions with different deadline durations, we systematically seek to induce decreasing bounds with a different slope per condition. Specifically, we test whether the decision bounds collapse more rapidly as the deadlines become shorter. We are thereby able to assess whether the task environment can induce decreasing DCs with different shapes, rather than merely assessing whether it can induce a dynamic DC. Another advantageous feature of our study is that we use a 2AFC task based on the expanded judgment paradigm developed by Brown, Steyvers, & Wagenmakers (2009). Using an expanded judgment task allows us to plot the participants' estimated decision criteria directly, instead of having to do complex model fitting.

In summary, there is a theoretical framework explaining why varying task difficulty should induce a collapsing DDC, and there are theoretical explanations of why a gradual response-time dependency of the reward size (sampling costs) and a binary response-time dependency of the reward size (deadlines) should elicit a collapsing DDC. Furthermore, there is a considerable body of empirical data supporting the idea of a collapsing DDC in environments with varying task difficulties and in environments with increasing sampling costs. Importantly, there is as yet no empirical study that analyzed whether introducing deadlines to simple perceptual decision tasks also induces a collapsing DDC. Further, no study has attempted a parametric manipulation of the shape of the decision bounds, which would enable the comparison of different shapes among decreasing DCs. For this reason, the goal of the present study is to investigate experimentally whether deadlines induce a collapsing DDC, and whether the decision bounds collapse more rapidly as the deadlines become shorter.


Methods

Experimental Paradigm

In this study, participants are exposed to two stimuli, each of which can be seen as consisting of a sequence of sensory events. Every single sensory event can be categorized as either positive, indicating the target, or as indicating the distractor. In this study, the two stimuli are blocks that appear either above or beneath a horizontal line in a flickering manner (details follow in the Methods). Since the task is to decide whether the left or the right block appears above the line more frequently, the response-relevant event in this case is a block appearing above the line. Accordingly, a block appearing beneath the line is considered a response-irrelevant event. In every trial, one of the two stimuli is the target, while the other is the distractor; that is, one is programmed to appear above the line more frequently than the other. The probability with which the target stimulus appears above the line is given by ΘT; for the distractor it is ΘD. In this experiment we use a fixed trial difficulty. Therefore, ΘT and ΘD are constants with ΘT = 0.35 and ΘD = 0.23.

The events that make up each stimulus are sampled independently, based on the theta values. Therefore, there are three types of outcomes a participant might encounter. The outcomes can be denoted by a random variable X with values x ∈ {1, 0, −1}. The occurrence of a response-relevant event for the target but not the distractor is represented by a 1. A participant observing X = 1 receives evidence indicating that the corresponding stimulus is the target. The probability of this happening is denoted by p, with p = ΘT*(1 − ΘD). Conversely, the occurrence of a response-relevant event for the distractor but not the target is coded as −1. In this case the participant samples X = −1, which is evidence indicating that the corresponding stimulus is the distractor. The probability of this happening is denoted by q, with q = ΘD*(1 − ΘT). Finally, there is the possibility of sampling a response-relevant event for both or for neither stimulus; that is, both blocks appear above the line or both appear beneath it. X = 0 represents this situation, in which the evidence does not favor either of the two stimuli. The probability of this happening is given by r, with r = ΘT*ΘD + (1 − ΘT)*(1 − ΘD).

Sequential Sampling Model

The presented perceptual decision-making task can be seen as a sequential sampling situation in which the participant has to decide between two competing hypotheses (Rapoport & Burkheimer, 1971, as cited in Boehm et al., In Preparation). Assume that H1 stands for the hypothesis that the left stimulus is the target, while H2 stands for the hypothesis that the right stimulus is the target. From each hypothesis follows a likelihood function λi(x) for observing a certain value of X. Since H1 being true is equivalent to H2 being false and vice versa, only one likelihood function needs to be taken into consideration.

\[
\lambda_1(x) = \lambda_2(-x) =
\begin{cases}
p = \Theta_T (1 - \Theta_D) & \text{if } x = 1 \\
q = \Theta_D (1 - \Theta_T) & \text{if } x = -1 \\
r = \Theta_T \Theta_D + (1 - \Theta_T)(1 - \Theta_D) & \text{if } x = 0
\end{cases}
\tag{1}
\]

Assuming that the participant holds an unbiased belief at time point t = 0 (denoted by π(0) = 0.5), the participant starts from the hypothesis that both stimuli are equally likely to be the target and subsequently updates this belief with every observed discrete event x(t) at time steps t ∈ {1, ..., N}. According to Bayes' rule, the updating process proceeds as follows:

\[
\pi(t) = \frac{\pi(t-1)\,\lambda_1(x_t)}{\pi(t-1)\,\lambda_1(x_t) + \bigl(1 - \pi(t-1)\bigr)\,\lambda_2(x_t)}
\tag{2}
\]
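A minimal simulation sketch of this sampling-and-updating scheme (my own illustration, not the thesis code; the static criterion of 0.9 used below is an arbitrary assumption):

```python
import random

THETA_T, THETA_D = 0.35, 0.23  # event probabilities used in the experiment

# Outcome probabilities from the experimental paradigm:
p = THETA_T * (1 - THETA_D)                             # X = 1
q = THETA_D * (1 - THETA_T)                             # X = -1
r = THETA_T * THETA_D + (1 - THETA_T) * (1 - THETA_D)   # X = 0

def likelihood_h1(x):
    """Likelihood lambda_1(x) under H1 (left stimulus is the target); Eq. (1)."""
    return {1: p, -1: q, 0: r}[x]

def sample_event():
    """Draw one discrete event X in {1, 0, -1}, assuming H1 is true."""
    return random.choices([1, 0, -1], weights=[p, r, q])[0]

def update_belief(pi, x):
    """One Bayesian update of the belief in H1; Eq. (2), with lambda_2(x) = lambda_1(-x)."""
    num = pi * likelihood_h1(x)
    return num / (num + (1 - pi) * likelihood_h1(-x))

# Simulate one trial: start unbiased and update until the belief is extreme
# enough (a static criterion of 0.9, chosen arbitrarily for illustration).
pi = 0.5
for t in range(1, 500):
    pi = update_belief(pi, sample_event())
    if pi > 0.9 or pi < 0.1:
        print(f"decision at step {t}, belief in H1 = {pi:.3f}")
        break
```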

Experimental Study

The aim of this study was to investigate whether deadlines can induce a collapsing DDC. Furthermore, we were interested in whether different deadlines result in collapsing DDCs with different slopes. To this end, the study uses a between-subjects design with three different deadline conditions.

Participants

The sample consisted of 28 participants, all of whom had normal or corrected-to-normal vision. Participants received partial course credit for their participation. The Ethics Review Board of the University of Amsterdam gave ethical approval for the study, and participants completed an informed consent form before beginning the experimental procedure.

Experimental Procedure

Given the between-subjects design of the study, each participant performed only one of the three conditions. Participants were randomly assigned to conditions.

Every condition started with two instructional screens that explained the decision-making task. The instructions were followed by ten test trials intended to acquaint participants with the experimental task. During these trials, participants only received feedback on whether they had answered correctly. After a correct answer, the word "Correct!" was shown in green letters; after a wrong answer, the word "Wrong!" was shown in red letters. If a participant did not answer before the deadline passed, the words "Too Late!" were presented in orange letters. The ten test trials were followed by a third informational screen that explained the point scheme and informed the participants about the presence of a deadline. Next came ten more test trials that served to familiarize the participants with the point scheme. Here, participants received feedback about the accuracy of their answer as well as about the reward earned for their performance. Participants received a reward of 750 points for each correct decision and a penalty of −750 points for each incorrect decision. Failing to answer before the deadline resulted in a penalty of −1000 points. These trials were followed by a fourth informational screen that informed the participant about the block and trial scheme of the study.

Participants had to complete 16 experimental blocks of 50 trials each. After each block, the payoff for that block was presented; positive payoffs were shown in green letters, negative payoffs in red letters. Furthermore, an optional message appeared when participants seemed to be answering too quickly, advising them to take more time for their decisions. After each block, participants could take an optional break. When a participant had completed all 16 blocks, the participant's total payoff score was shown.

The three conditions were identical except for the duration of the deadline. In condition one, the deadline was one second; in condition two, 1.5 seconds; and in condition three, 2.5 seconds.
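For reference, the design parameters reported above can be collected in a small configuration sketch (the identifier names are mine; the values are those stated in the text):

```python
# Design parameters as reported in the Methods.
DEADLINES_S = {"condition_1": 1.0, "condition_2": 1.5, "condition_3": 2.5}
REWARD_CORRECT = 750              # points for a correct decision
PENALTY_INCORRECT = -750          # points for an incorrect decision
PENALTY_TOO_LATE = -1000          # points for missing the deadline
N_BLOCKS = 16                     # experimental blocks per participant
TRIALS_PER_BLOCK = 50             # trials per block
THETA_T, THETA_D = 0.35, 0.23     # target / distractor event probabilities
```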

Experimental Task and Apparatus

Participants performed the task at a viewing distance of 70 cm from the screen, sitting alone in a small, dimly lit room. The experiment was programmed in PsychoPy, version 1.85.0rc1 (Peirce, 2007, 2009). The stimuli were presented at a 60 Hz refresh rate and a resolution of 1920 x 1080 pixels on an Asus VG236 23-inch screen. An illustration of the set-up of the experimental task can be found in Figure 1.

Each trial started with the presentation of a fixation cross for 300 ms. After that, two horizontal lines were shown, with blocks appearing either above or beneath each of these lines. The blocks were 4.5° wide and 1.7° high and appeared in a flickering manner, tied to the 60 Hz frame rate of the display. A single flash of a block (above or beneath the line) consisted of 3 frames, which equals 16.667 ms * 3 = 50 ms. After being presented for 50 ms (three frames), the block vanished for the subsequent 50 ms and then appeared again (either above or beneath the line). The probability that a block appeared above or beneath the line was determined by the values of ΘT and ΘD, which were set to 0.35 and 0.23. Entering these values into the formulas given in the Experimental Paradigm section yields a probability of p = 0.27 for observing X = 1, q = 0.15 for observing X = −1, and r = 0.58 for observing X = 0. The flickering of the blocks continued until the participant gave a response or the deadline passed.
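A simplified sketch of one trial's stimulus stream under these timing parameters (framework-agnostic Python, not the actual PsychoPy implementation; function and variable names are mine):

```python
import random

FRAME_MS = 1000 / 60      # one frame at 60 Hz = 16.667 ms
FLASH_FRAMES = 3          # a flash lasts 3 frames = 50 ms
BLANK_FRAMES = 3          # followed by a 50 ms blank

def stimulus_stream(theta, deadline_ms):
    """Yield (onset_ms, above_line) flashes for one block until the deadline.

    Each 100 ms cycle shows the block for 50 ms, above the line with
    probability theta, then blanks it for 50 ms.
    """
    onset = 0.0
    cycle_ms = (FLASH_FRAMES + BLANK_FRAMES) * FRAME_MS
    while onset < deadline_ms:
        yield onset, (random.random() < theta)
        onset += cycle_ms

# Example: the target block (theta = 0.35) in the 1-second deadline condition.
for onset, above in stimulus_stream(theta=0.35, deadline_ms=1000):
    print(f"{onset:6.1f} ms: block {'above' if above else 'beneath'} the line")
```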


Figure 1: Setup of the experimental task. Participants performed the judgment task in which two blocks independently appeared either above or beneath the line. Participants had to decide which of the two blocks appeared above the line more frequently.

Participants were asked to press either the "q" or the "p" key to indicate which block they thought appeared above the line more frequently: the "q" key for the left block and the "p" key for the right block. The participant's response was followed by a blank screen shown for 200 ms. After that, a feedback screen was presented for 500 ms, informing the participant about the accuracy of the decision as well as the reward earned for it. The words and colors used were identical to those used during the test trials mentioned above.


Results

A total of 28 persons participated in this study, all of whom had normal or corrected-to-normal vision. Eight participants were assigned to the 1sec. condition, nine participants were assigned to the 1.5sec. condition, and the remaining eight participants were assigned to the 2.5sec. condition. No participant had to be excluded from the study. The mean response time per condition, as well as the mean accuracy per condition, increased as the deadlines became longer. The averaged response time was 0.56s (se = .002) for the 1sec. condition, 0.83s (se = .003) for the 1.5sec. condition, and 1.31s (se = .006) for the 2.5sec. condition. The mean accuracy was 0.58 (se = .006) for the 1sec. condition, 0.63 (se = .006) for the 1.5sec. condition, and 0.66 (se = .006) for the 2.5sec. condition.

Figure 2 shows the empirical estimates of the participants' decision criterion per condition. The x-axis shows the binned response times of the participants, while the y-axis displays the amount of evidence in favor of the target stimulus. The dots in each graph show the relative frequency with which all participants in the respective condition responded at different time steps and values of observed evidence; the darker the dot, the more frequently that timestep-evidence combination was recorded. Each black line is the regression line fitted through the data of a particular participant, one line per participant. These regression lines are crude linear approximations of the decision criteria, not factual representations of them; for the purpose of this study, however, they serve as empirical estimates of the participants' decision criteria. The red line represents the averaged estimated decision criterion per condition, derived by averaging the intercepts and slopes of all regression lines in that condition.
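A sketch of how such an averaged criterion estimate can be computed (the data layout and values below are hypothetical; numpy's least-squares polyfit stands in for whatever fitting routine was actually used):

```python
import numpy as np

# Hypothetical per-participant data: response times (s) and the evidence
# value observed at the moment of each response.
participants = {
    "p01": (np.array([0.31, 0.44, 0.52, 0.61]), np.array([0.62, 0.66, 0.71, 0.69])),
    "p02": (np.array([0.28, 0.39, 0.55, 0.72]), np.array([0.70, 0.68, 0.64, 0.61])),
}

intercepts, slopes = [], []
for rts, evidence in participants.values():
    slope, intercept = np.polyfit(rts, evidence, deg=1)  # per-participant line
    slopes.append(slope)
    intercepts.append(intercept)

# Averaged estimated decision criterion for the condition (the red line):
print(f"evidence = {np.mean(intercepts):.3f} + {np.mean(slopes):.3f} * t")
```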

In each condition, there are participants with an increasing estimated decision criterion as well as participants with a decreasing one. However, most of the estimated decision criteria have a positive slope; this is especially the case in the 1sec. condition. Looking at the three averaged decision criteria, it becomes apparent that all of them have a positive slope and that the magnitude of the slope decreases as the deadlines become longer. This is in sharp contrast with the hypothesis of collapsing boundaries. According to the theory, an upcoming deadline should induce a decreasing DDC, and a shorter deadline should elicit a steeper (more negative) slope than a longer one.

Another perceivable pattern is that the magnitudes of the slopes of the DCs vary to a great degree. While some DCs collapse steeply, others increase steeply, and still others collapse or increase only slightly. This pattern of heterogeneity can be found in each condition. Interestingly, the level of heterogeneity appears to decrease as deadlines become longer: the 1sec. condition shows more heterogeneity than the 1.5sec. condition, and the 1.5sec. condition shows more heterogeneity than the 2.5sec. condition. Furthermore, the intercepts of the three averaged estimated DCs increase as deadlines become longer; that is, the mean intercept per condition tends to decrease as the deadline becomes shorter. This is in contrast with the theory of collapsing bounds, which predicts a stable starting value of the DC (represented by the intercept of the regression line) across conditions. On the other hand, a DC starting value (intercept) that decreases with the duration of the deadline is actually in line with the predictions of the traditional model of static decision criteria.

Figure 2: Empirical estimates of the participants' decision criterion per condition (panels, left to right: 1sec., 1.5sec., and 2.5sec. conditions). The x-axis shows the binned response times of the participants, while the y-axis displays the amount of evidence in favor of the target stimulus. The averaged estimated decision criterion per condition is shown by the red line.

Figure 3 shows the density distribution of the response times per condition. The x-axis shows the different RT bins, while the y-axis indicates the proportion of the data that falls within a given RT bin. In the first condition, for example, the bar of the 9th RT bin has a height of roughly 0.025, meaning that 2.5 percent of the recorded response times fell within that bin. The last fifth of the RT bins within a given condition can be considered the tail of the respective distribution: for the 1sec. condition, these are the last two RT bins; for the 1.5sec. condition, the last three; and for the 2.5sec. condition, the last five. A qualitative inspection of Figure 3 shows that in the 1sec. condition the proportion of data decreases rapidly across the last fifth of the RT bins, in the 1.5sec. condition it decreases less rapidly, and in the 2.5sec. condition it decreases only slightly. This pattern indicates that the tails of the response time distributions become shorter as the deadlines become shorter, which shows that the participants adjusted their decision behavior to the different deadlines. This can be seen as evidence of a successful manipulation of the deadline variable.

Figure 3: Empirical density distributions of the response times per condition. The x-axis shows the different RT bins, while the y-axis indicates the proportion of the data that falls within a given bin.

The hypothesis of this study was that the slope of the decision criterion in the 1sec. condition (β1) is smaller than the slope of the decision criterion in the 1.5sec. condition (β2), which in turn is smaller than the slope in the 2.5sec. condition (β3). Accordingly, our alternative hypothesis is given by the order restriction β1 < β2 < β3, while our null hypothesis stated that the slopes are not ordered in this way. To test the alternative hypothesis, we initially planned to fit a null model and an order-restricted model to the data and then compare the fit of these models by computing the Bayes factor, which indicates to what degree one hypothesis is more likely than the other, given the data. The latest version of the BayesFactor R package (BayesFactor: Computation of Bayes Factors for Common Designs, version 0.9.12; Morey & Rouder, 2015) does, however, not currently allow computing Bayes factors for order-restricted models. We therefore first computed the Bayes factor for the full model (which allows any ordering of the slopes) against the null model, and subsequently used Markov chain Monte Carlo (MCMC) sampling to approximate the Bayes factor for the order-restricted model against the full model. Due to numerical problems, however, this Bayes factor was calculated to be zero. A method that enables the computation of a Bayes factor suited to our desired analysis has yet to be developed.

To circumvent the numerical problems, we reverted to an alternative analysis in which we computed least-squares estimates of the regression slopes for each participant and subsequently used a one-sided Bayesian t-test to compare a) the slopes of the 1sec. condition and the 1.5sec. condition, and b) the slopes of the 1.5sec. condition and the 2.5sec. condition. Specifically, we tested the probability of obtaining the data under the null hypothesis against the probability of obtaining the data under the alternative hypothesis. Test a) yielded a Bayes factor of 1.21, specifying that, given the data, the null hypothesis is 1.21 times more likely to be true than the alternative hypothesis (β1 < β2). Test b) yielded a Bayes factor of 4.38, indicating that the data were 4.38 times more likely under the null hypothesis than under the alternative hypothesis (β2 < β3). Reversing the direction of the comparison, and thus comparing the probability of obtaining the data under the alternative hypothesis to that under the null hypothesis, yielded a Bayes factor of 0.83 for test a) and of 0.23 for test b), meaning the data were 0.83 and 0.23 times as likely under the alternative hypothesis as under the null hypothesis, respectively. A Bayes factor smaller than 1 indicates that the data are more likely under the competing hypothesis. Our analysis therefore revealed Bayes factors stating that, given the data, the alternative hypothesis is less likely to be true than the null hypothesis. (Although classical hypothesis testing usually gives the null hypothesis the preferred status and merely considers evidence against it, a brief interpretation of the Bayes factors specifying the evidence against the alternative follows.) Jeffreys (1961) formulated a categorization for interpreting Bayes factors of different values: Bayes factors ranging from 1 to 3.15 can be considered evidence that is "barely worth mentioning", and Bayes factors ranging from 3.15 to 10 can be considered "substantial" evidence (Jeffreys, 1961). According to Jeffreys' categorization, the comparison of the slopes β1 and β2 therefore yielded evidence for the null hypothesis that is barely worth mentioning, while the comparison of the slopes β2 and β3 yielded substantial evidence for the null hypothesis.
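For illustration, the kind of default two-sample Bayesian t-test used in this analysis can be sketched via the JZS Bayes factor of Rouder, Speckman, Sun, Morey, and Iverson (2009), computed by numerical integration (a two-sided sketch with a unit-scale Cauchy prior; the thesis used the one-sided test in the BayesFactor R package, and the slope values below are invented):

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def jzs_bf01(t, nx, ny):
    """Two-sided JZS Bayes factor BF01 (null over alternative) for a
    two-sample t statistic, following Rouder et al. (2009)."""
    n_eff = nx * ny / (nx + ny)      # effective sample size
    df = nx + ny - 2

    # Marginal likelihood under H0 (up to a constant shared with H1):
    null_like = (1 + t**2 / df) ** (-(df + 1) / 2)

    # Marginal likelihood under H1: integrate over the prior scale g.
    def integrand(g):
        return ((1 + n_eff * g) ** -0.5
                * (1 + t**2 / ((1 + n_eff * g) * df)) ** (-(df + 1) / 2)
                * (2 * np.pi) ** -0.5 * g ** -1.5 * np.exp(-1 / (2 * g)))

    alt_like, _ = quad(integrand, 0, np.inf)
    return null_like / alt_like

# Invented per-participant slope estimates for two deadline conditions:
slopes_1s  = np.array([0.9, 1.4, -0.2, 2.1, 1.0, 0.6, 1.8, 0.3])
slopes_15s = np.array([0.5, 1.1, 0.8, -0.4, 1.3, 0.2, 0.9, 0.7, 1.0])

t_stat, _ = stats.ttest_ind(slopes_1s, slopes_15s)
bf01 = jzs_bf01(t_stat, len(slopes_1s), len(slopes_15s))
print(f"t = {t_stat:.3f}, BF01 = {bf01:.2f}")  # BF01 > 1 favors the null
```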

Discussion

The goal of our study was to investigate whether deadlines can induce collapsing boundaries and, more specifically, to test whether shorter deadlines cause more steeply collapsing boundaries than longer deadlines. Accordingly, our hypothesis stated that the slope of the collapsing boundary in the 1sec. condition (β1) is smaller than the slope of the collapsing boundary in the 1.5sec. condition (β2), which in turn is smaller than the slope in the 2.5sec. condition (β3). To test this hypothesis (β1 < β2 < β3), we set up a between-subjects experimental design in which participants performed a perceptual decision-making task under three different deadline conditions (1 second; 1.5 seconds; 2.5 seconds).

Qualitative observations of the density distributions of the response times per condition indicated that the participants did adjust their decision behavior to the different deadlines, indicating a successful manipulation of the independent deadline variable. A visual analysis of the averaged estimated decision criteria showed that, in each condition, the averaged decision criterion has a positive slope. Furthermore, the magnitude of the slope of the three averaged DCs decreased for longer deadlines. A quantitative Bayesian analysis showed that, against our prediction, the order restriction of the slopes (β1 < β2 < β3) was not supported by the data.

The fact that, in each condition, the averaged DC had a positive slope is in contrast with the idea of collapsing boundaries. According to the theory, an upcoming deadline should induce a decreasing DDC. Finding increasing bounds, as we did, can be seen as an indicator that the participants did not use the predicted decreasing DC. Such an interpretation of the slope findings should, however, be made with caution. It may be that the manner in which we computed the evidence in favor of one hypothesis over the other (at any given time step) created a tendency to find positive slopes in the data, a tendency that depends more on the set-up of the experimental task and the evidence computation than on the actual decision behavior of the participants.

The reasoning goes as follows. The amount of evidence is computed at every discrete time step, and on every single trial the evidence at t = 0 has a value of 0.5 (favoring neither hypothesis). Even if the evidence accumulates in only one direction as time passes (favoring one hypothesis), the amount of evidence will necessarily be very small at the very early time steps. It is therefore highly probable that a response made at 10 ms carries a lower value of evidence (pointing in one direction) than a response made at 100 ms. Assume, for instance, that we define the interval in which the early responses fall as ranging from 1 ms to 300 ms, and call it the "very early interval". Owing to the described accumulation process, responses made early in the "very early interval" will strongly tend to carry less evidence in favor of the chosen hypothesis than responses made later in that interval. From this process alone, we may register an increasing DC for the responses in the "very early interval"; such an increasing DC may thus be more influenced by the experimental set-up than by the decision behavior of the participants. It is important to note that this dynamic only holds for the "very early interval". After the initial accumulation of evidence in the "very early interval" has passed, it becomes less likely that the evidence at response is higher at t = x than at t = x − 1. This follows from the collapsing-bounds idea: the evidence may not increase between, say, t = 500 and t = 520, while the decision criterion decreases between the two time steps to such a degree that the same evidence value that did not meet the criterion at t = 500 does meet it at t = 520 and elicits a response. If there are many responses in the "very early interval" and fewer responses afterwards, the pattern found in the "very early interval" may strongly influence the registered pattern across the respective condition as a whole. This could result in finding an overall increasing DC, while the DC actually only predominantly increases in the "very early interval" and predominantly decreases afterwards.

To correct for this influence of the experimental set-up, we could choose to exclude responses that fall within the "very early interval" from our analysis. There is, however, no principled way to choose the end point of the "very early interval", so it would have to be chosen arbitrarily. This creates the risk of trying different end points until one is found that seems reasonable and results in finding a decreasing DC for the data outside that interval. One possibility that might solve both the problem of the dynamics within the "very early interval" and the arbitrariness of its end point could be to create trials with non-uniform priors, that is, trials where the evidence starts at a value that already favors one hypothesis over the other. One way to accomplish this would be to state the prior probability of one of the hypotheses being true before the start of each trial.

Another finding that contradicts the prediction of collapsing bounds is that the magnitude of the slope of the three averaged DCs decreases as deadlines become longer. That is, the averaged DC in the 1sec. condition increases more rapidly than the averaged DC in the 1.5sec. condition, which in turn increases more rapidly than the DC in the 2.5sec. condition. This finding may result from the fact that a very short deadline forces the participant to use all incoming information, while a longer deadline may allow the participant to let some of the information slide.

To conclude, in the present work we tested the hypotheses that introducing deadlines to a perceptual decision task induces a collapsing DC and that the slope of the DC is steeper for shorter deadlines than for longer deadlines. To this end, we exposed participants to expanded perceptual decision tasks based on Brown et al. (2009) under different deadline conditions and subsequently conducted a Bayesian analysis in which we compared the slopes of the DCs between the three conditions. The Bayesian analysis did not support our predicted order restriction of the slopes. For the slopes to be ordered as specified in our alternative hypothesis (β1 < β2 < β3), participants would have had to use an optimal decision criterion: for shorter deadlines, the decision bound must collapse more rapidly than for longer deadlines in order to yield the highest reward rate. As described, we did not find such an order restriction of the slopes in our main analysis. This might be due to the fact that in our study the participants received virtually no training in the task environment prior to the experimental trials. Previous studies have shown that participants are able to adapt their decision criterion so that it approximates an optimal decision criterion for the respective task environment (Boehm et al., In Preparation; Drugowitsch et al., 2012; Gluth et al., 2013; Shadlen & Kiani, 2013). However, studies that incorporated training before the experimental trials indicate that participants tend to require extensive practice with the task environment before approximating an optimal decision criterion (Balci et al., 2011; Simen et al., 2009; Starns & Ratcliff, 2012; Boehm et al., In Preparation). It may therefore be that the participants in our study did not use nearly optimal decision criteria because they did not have enough practice in the task environment. Using sub-optimal decision criteria may have caused the slopes of the decision criteria in the different conditions not to be ordered as specified in the alternative hypothesis.

The qualitative observation of the data was thus not in line with our hypothesis and the theory of collapsing bounds, and the quantitative Bayesian analysis yielded results that point against the attempted parametric manipulation of the decision criteria. The fact that we did not try to correct for the described problem with the "very early interval" does compromise the strength of possible interpretations of our findings, but this study still gives insight into possible dynamics occurring during human decision making on perceptual decision-making tasks. In particular, comparing these findings with future research that tackles the "very early interval" problem in some principled way, or that specifically addresses processes during very short deadlines, might create interesting insights.


References

Balci, F., Simen, P., Niyogi, R., Saxe, A., Hughes, J.A., Holmes, P., & Cohen, J.D. (2011). Acquisition of decision making criteria: Reward rate ultimately beats accuracy. Attention, Perception, & Psychophysics, 73(2), 640–657.

Boehm, U., van Maanen, L., Evans, N., Brown, N., & Wagenmakers, E.-J. (In Preparation). On the relationship between reward rate and dynamic decision criteria.

Boehm, U., Hawkins, G.E., Brown, S., van Rijn, H., & Wagenmakers, E.-J. (2015). Of monkeys and men: Impatience in perceptual decision-making. Psychonomic Bulletin & Review. doi:10.3758/s13423-015-0958-5

Busemeyer, J.R., & Rapoport, A. (1988). Psychological models of deferred decision making. Journal of Mathematical Psychology, 32(2), 91–134.

Cisek, P., Puskas, G.A., & El-Murr, S. (2009). Decisions in changing conditions: The urgency-gating model. Journal of Neuroscience, 29(37), 11560–11571.

Ditterich, J. (2006). Evidence for time-variant decision making. The European Journal of Neuroscience, 24(12), 3628–3641.

Drugowitsch, J., Moreno-Bote, R., Churchland, A.K., Shadlen, M.N., & Pouget, A. (2012). The cost of accumulating evidence in perceptual decision making. Journal of Neuroscience, 32(11), 3612–3628.

Gluth, S., Rieskamp, J., & Büchel, C. (2013). Classic EEG motor potentials track the emergence of value-based decisions. NeuroImage, 79, 394–403.

Hanks, T.D., Kiani, R., & Shadlen, M.N. (2014). A neural mechanism of speed-accuracy tradeoff in macaque area LIP. eLife, 3, e02260. doi:10.7554/eLife.02260

Jeffreys, H. (1961). The Theory of Probability (3rd ed.). Oxford: Oxford University Press.

Palmer, J., Huk, A.C., & Shadlen, M.N. (2005). The effect of stimulus strength on the speed and accuracy of a perceptual decision. Journal of Vision, 5, 376–404.

Philiastides, M.G., & Sajda, P. (2006). Temporal characterization of the neural correlates of perceptual decision making in the human brain. Cerebral Cortex, 16, 509–518.

Ratcliff, R. (2002). A diffusion model account of response time and accuracy in a brightness discrimination task: Fitting real data and failing to fit fake but plausible data. Psychonomic Bulletin & Review, 9, 278–291.

Reddi, B.A., & Carpenter, R.H. (2000). The influence of urgency on decision time. Nature Neuroscience, 23, 3098–3108.

Ratcliff, R., Philiastides, M.G., & Sajda, P. (2009). Quality of evidence for perceptual decision making is indexed by trial-to-trial variability of the EEG. Proceedings of the National Academy of Sciences, 106, 6539–6544.

Shadlen, M.N., & Kiani, R. (2013). Decision making as a window on cognition. Neuron, 80(3), 791–806.

Starns, J.J., & Ratcliff, R. (2012). Age-related differences in diffusion model boundary optimality with both trial-limited and time-limited tasks. Psychonomic Bulletin & Review, 19, 139–145.

Thura, D., Beauregard-Racine, J., Fradet, C.-W., & Cisek, P. (2012). Decision making by urgency gating: Theory and experimental support. Journal of Neurophysiology, 108(11), 2912–2930.
