Environmental influences on effort-based decision making


by Rik Janssen

Student Number: 10080619 Tutor: Jasper Winkel


Abstract

Psychological research is largely dependent on computerized tasks to analyze decision making. However, behavior in lab settings could differ from behavior in real life. Using a virtual reality headset, we assessed environmental influences on effort-based decision making (EBDM). Thirty-three participants completed an EBDM-task in three different environments, varying in degree of realism. Point of Indifference (POI) values were calculated for each participant and analyzed using a one-way repeated measures ANOVA. No significant differences in POI values were found between conditions. This suggests that EBDM considerations are not easily influenced by the specific presentation of the task and that results of these tests are probably reliable.


The Influence of Environment on Effort-Based Decision Making

Psychological research nowadays is strongly dependent on the use of computerized tasks. For the most part, these tasks are two-dimensional. But are decisions made in these not-so-realistic tasks similar to the ones made in real life? Up to now, not much research has been done to address this question. But thanks to recent developments in computer technology, new possibilities have arisen to study the effect of task realism on behavior. With the use of Virtual Reality (VR), current laboratory settings can be compared to more realistic task environments, while keeping the behavioral considerations (such as effort and reward) stable. Therefore, in this research, we looked at choice behavior across different environmental settings, using a VR-headset. In this way, we hope to uncover the relationship between realism in behavioral tasks and decision making, information that is highly useful for any kind of psychological research administered with computerized tasks. If behavior is not stable across two-dimensional test settings and more realistic test settings, it is important to examine how the lab setting influences human behavior, so that this effect can be taken into account when designing future experiments.

In this research, we specifically looked at effort-based decision making (EBDM), a type of decision making in which one considers whether to exert a certain amount of effort for a certain reward. This type of decision making is widely studied in combination with the neurotransmitter dopamine. Phillips, Walton and Jhou (2007) propose that dopamine plays a key role in cost-benefit considerations: they specifically suggest that dopamine is responsible for overcoming the costs in these considerations. Pardo et al. (2012) found that administration of a dopamine antagonist made mice more inclined to choose the low-effort and low-reward route in a maze, whereas in a control condition they would prefer the high-effort and high-reward route, showing that a lowering of dopaminergic activity negatively influences effort-willingness. In humans, such a link between dopaminergic pathways and effort-willingness has also been studied. Wardle, Treadway, Mayo, Zald and De Wit (2011) studied the effects of amphetamine on human decision making. Amphetamines inhibit reuptake of dopamine, causing dopamine levels in the synaptic cleft to rise. Wardle et al. (2011) found that amphetamine increased the amount of effort participants were willing to exert for a reward. These results suggest that dopamine pathways in the brain are responsible for effort considerations in such a way that higher levels of dopamine provoke more high-effort choices.

But in what circumstances can such a dopaminergic release be expected? Could dopamine levels differ according to one’s surroundings? Dopamine plays a key role in learning, motivation and reward coding. For instance, dopamine is broadly active in the brain when new associations are learnt: this is so in various structures of the brain, such as the nucleus accumbens and the cerebral cortex (Wise, 2004). Dopaminergic release can be expected in any type of situation that requires learning of new associations. A novel environment can be one of these. Li, Cullen, Anwyl and Rowan (2003) found that long-term potentiation (LTP) was facilitated by dopamine release when mice were in a novel environment. By genetically manipulating dopamine receptors in mice, Tran, Uwano, Kimura, Hori, Katsuki, Nishijo and Ono (2008) found that dopamine played a crucial role in encoding spatial information in novel environments: blocking of dopamine pathways prevented signaling of a novel environment and impeded spatial learning. Horvitz (2000) found that dopamine neurons responded to salient and arousing changes in the environment and to salient events and objects. These findings all point towards environmental factors, and specifically novel environments, as influencing dopamine release in the brain. So both novel environments and salient objects in the environment are possible triggers of higher dopamine levels.

Furthermore, midbrain dopamine neurons are involved in the coding of reward. A reliably predicted upcoming reward triggers dopamine bursts, as does presentation of an unexpected reward (Schultz, 2002; Ludvig, Sutton & Kehoe, 2008). Absence of a predicted reward, however, inhibits dopamine activity. This is called reward prediction error: dopamine release expresses the difference between the expected reward and the received reward. Dopamine is therefore considered to code different types of reward, in order to regulate motivation (Samejima, Ueda, Doya & Kimura, 2005; Berridge, 2007). Also, reward-related stimuli tend to be more salient than other stimuli, a process in which dopaminergic neurotransmission is needed (Berridge, 2007). Dopamine, in these situations, does not code reward according to objective value. For instance, when a reward is expected to be delayed, fewer dopamine neurons show activation upon presentation of the stimulus (Schultz, 2002).

Although research up to now has linked environment with dopamine and dopamine with decision making, the link between test setting and choice behavior remains largely unexplored, even though the results of such an investigation could be of great importance for research in the psychological sciences. In our research we used an EBDM-task as a behavioral measure and used Virtual Reality to adjust task realism. In this way, the influence of realism on choices can be assessed. We expected to see an increase in effort-willingness when people are involved in an EBDM-task with a realistic representation of effort, compared to when they complete a less realistic, two-dimensional EBDM-task with a low-realism representation of effort. Furthermore, we wanted to know whether reward representation can influence decision making. This could indicate that dopaminergic reward pathways can be influenced by the initial representation of the reward, and that dopamine does not code for the objective value of a reward. Therefore we compared decisions made in an environment with a realistic and highly salient representation of reward with decisions made in the same environment with an abstract representation of reward. If choice behavior differs across these settings, this demonstrates that representation of reward is another important influence in the administration of computerized behavioral tests, which would support Berridge’s (2007) hypothesis that salience of objects influences dopamine activity.

We assessed these effects using three conditions, with each participant completing all three in counterbalanced order. In the baseline condition participants were asked to complete an EBDM-task, which was computerized and administered in a virtual lab setting. In the effort condition, the EBDM-task was completely translated into a high-resolution virtual task in a completely novel, simulated environment, with a more realistic representation of effort in the task (namely getting a mine cart to move). In the reward condition participants were in the same VR lab setting as in the baseline condition, but saw a three-dimensional, more realistic representation of a monetary reward while completing the two-dimensional task. For each person in each condition, willingness to choose the high-effort and high-reward (HE/HR) option as compared to the low-effort and low-reward (LE/LR) option was assessed. With the role of dopamine in mind, we expected participants in the effort condition to be more willing to choose the high-effort option than in the baseline condition. Moreover, we also expected to see an increase in this willingness in the reward condition as opposed to the baseline condition. The reward condition was more exploratory, testing whether salience of reward can modify motivation, which could indicate that manner of presentation modifies the dopaminergic response to reward. To assess whether our effort and reward virtual environments really did feel more real than the baseline condition, participants rated the realism of each environment after completing the experiment.

Method

Participants

In total 51 people participated in the experiment. Two participants were excluded because they were not able to complete all three conditions, and one participant was excluded due to changes in input device sensitivity settings between trials. Participants were also excluded if they showed a ceiling effect in at least two different conditions. Conditions consisted of 13 trials each. A ceiling effect was defined as choosing the LE/LR route once or not at all within a condition. Thirteen participants were excluded this way. Conversely, choosing the LE/LR route on every trial led to a POI approaching 1.25 (> 1.24), showing that the participant never chose according to the number of coins, but only chose the easier route. If this occurred in more than one condition, it was defined as a floor effect. Two participants were excluded because of floor effects.
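The two choice-based exclusion rules described above can be sketched in code. This is a minimal illustration and not the script used in the study: the function names, the "HE"/"LE" choice labels and the dictionary layout are assumptions made for the example, and the POI values are taken as given (see Point of Indifference).

```python
# Illustrative sketch of the ceiling/floor exclusion rules.
# All names and the data layout are hypothetical, not from the original analysis.

FLOOR_POI = 1.24        # a POI above this value counts as a floor effect
MAX_LE_FOR_CEILING = 1  # LE/LR chosen once or not at all counts as a ceiling effect


def has_ceiling_effect(choices):
    """True if the LE/LR route was chosen at most once within one condition."""
    return choices.count("LE") <= MAX_LE_FOR_CEILING


def has_floor_effect(poi):
    """True if the POI of one condition approaches the 1.25 maximum."""
    return poi > FLOOR_POI


def should_exclude(choices_by_condition, poi_by_condition):
    """Exclude when either effect occurs in at least two conditions."""
    n_ceiling = sum(has_ceiling_effect(c) for c in choices_by_condition.values())
    n_floor = sum(has_floor_effect(p) for p in poi_by_condition.values())
    return n_ceiling >= 2 or n_floor >= 2


# Hypothetical participant who always chose HE/HR in two of the three conditions.
choices = {
    "baseline": ["HE"] * 13,
    "reward": ["HE"] * 13,
    "effort": ["HE"] * 8 + ["LE"] * 5,
}
pois = {"baseline": 0.0, "reward": 0.0, "effort": 0.4}
print(should_exclude(choices, pois))  # True: ceiling effect in two conditions
```

A participant with a ceiling effect in only one condition, or a floor effect in only one condition, would be retained under this reading of the rules.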

Of the remaining 33 participants, 45.5% were female, 15.2% were students and mean age was 23 (SD = 3.2). Participants were not paid for their attendance, but could win a payment equal to their mean number of coins across conditions, with each virtual coin representing 10 eurocents. One participant was awarded this money.


Figure 1. VR environment in baseline condition.

Figure 2. VR environment in the reward condition.

Task

Participants completed three conditions in counterbalanced order, with conditions consisting of 13 trials each, all implemented in virtual reality. Participants were instructed to power a mine cart over a track by making pumping motions with a bicycle pump. At the beginning of each trial, participants were given a choice between a HE/HR route and a LE/LR route. Participants were specifically instructed to choose whatever they felt like and told that there was no ‘correct’ answer. Color-coding in the presentation of the tracks informed participants of the amount of effort a route would require: green sections of the track required no pumping input, orange sections required medium effort and red sections required high-effort pumping.

In the baseline condition (see Figure 1), the different route options and coin rewards were displayed on two different computer screens, one on the left side of the virtual room and one on the right. The coins were displayed abstractly on these screens as stacked orange bars. After a route was chosen with a mouse click, a third display in the middle of the room showed a power bar and progress within the chosen track. Participants were only able to track their progress on this screen: no visible cart was moving.

In the realistic reward condition, participants were in the same virtual environment. Choices were again represented on two different screens, but now rewards were realistically represented as three-dimensional stacks of golden coins on the left and right side of a desk in front of the participant and no longer on the computer screens. After the choice was made, the coins of the chosen route would fly into a chest in front of the participant. Participants then drove the cart the same way as in the baseline condition, and saw their progress on the middle screen.

In the effort condition, participants found themselves in a mine cart inside a room with two screens displaying the different route options and abstract coins as orange bars. After a route was selected, a large door would open and participants drove themselves into a natural outdoor environment, using the same bicycle pump, which they now saw integrated into their cart. Some sections of the outside tracks were overgrown with either grass (medium effort) or shrubs (high effort). In each condition, coin rewards were adjusted according to participants’ choice behavior (see Reward Modifier).

Before the start of the experiment, participants completed at least one test trial in every environment, to make sure they understood the amount of effort the different colors represented, knew how to operate the cart and knew where to look for effort, reward and progress information. For an illustrative movie of the way participants completed trials in each condition, see boxed text. After completion of the VR-task, participants filled out a digital questionnaire about the task and the environments, rating realism of each environment on a 1-5 scale.

A short illustrative movie of the three conditions and the input device can be found online:

https://www.youtube.com/watch?v=B87NGwa5jlU

Effort Input Device

For this experiment we created a custom-made input device, using a bicycle pump with a computer mouse attached to it. The goal of this input device was to make participants expend considerable effort and to enhance their immersion by mimicking the handle on the virtual mine cart. In order to make this bicycle pump an appropriate input device for a computer, we attached a long strip of aluminum to the handle of the pump, and attached the computer mouse (Logitech G300) to the body of the pump. Consequently, when the handle is moved up and down, the aluminum strip moves along the fixed computer mouse. In this manner, the computer mouse registered the motions of the pump.

Reward Modifier

With each HE choice the reward modifier reduces the difference between track rewards, and with each LE choice it increases that difference. For each trial the reward for both tracks is calculated by taking the difference between the total effort values of the two tracks (one for green, two for orange and four for red sections) and multiplying this difference by the reward modifier. The outcome is then added to ten for the high-effort track and subtracted from ten for the low-effort track.

The reward modifier can take a value between 0, at which there is no difference between rewards, and 1.25, which is the maximum possible reward (20 coins) divided by the maximum possible difference between tracks (16). Each condition starts with a reward modifier value of 0.625, which is the maximum reward modifier value divided by two. For each HE choice a value is subtracted from the reward modifier and for each LE choice a value is added to it. The added or subtracted value grows with each consecutive choice of the same effort type: it is 0.02 for the first, 0.05 for the second, 0.1 for the third and 0.2 for the fourth or any later consecutive choice of the same type of effort. With each switch in effort type this value drops back to 0.02. The minimum value of the reward modifier has been set to 0 to prevent the LE choice from yielding a higher reward than the HE choice. The maximum value of the reward modifier has been set to 1.25 to avoid scores that are higher than 20 coins.
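The adaptive procedure described in this section can be illustrated with a short sketch. It is not the code that ran in the experiment: all names are hypothetical, the effort difference is assumed to be the HE total minus the LE total, the recorded value is assumed to be the modifier in effect when the choice is made, and rewards are left unrounded because the text does not state whether coins were rounded.

```python
# Minimal sketch of the reward modifier staircase, under the assumptions
# stated in the accompanying text. All names are illustrative.

STEP_SIZES = (0.02, 0.05, 0.1, 0.2)  # 1st, 2nd, 3rd, 4th-or-later same choice
EFFORT_VALUE = {"green": 1, "orange": 2, "red": 4}
MOD_MIN, MOD_MAX, MOD_START = 0.0, 1.25, 0.625
BASE_REWARD = 10


def track_effort(sections):
    """Total effort value of a track given its colored sections."""
    return sum(EFFORT_VALUE[color] for color in sections)


def rewards(he_sections, le_sections, modifier):
    """Return (HE reward, LE reward) for one trial."""
    bonus = (track_effort(he_sections) - track_effort(le_sections)) * modifier
    return BASE_REWARD + bonus, BASE_REWARD - bonus


def update_modifier(modifier, choice, previous_choice, streak):
    """Apply one choice ("HE" or "LE"); return (new modifier, new streak)."""
    streak = streak + 1 if choice == previous_choice else 1
    step = STEP_SIZES[min(streak, len(STEP_SIZES)) - 1]
    modifier += -step if choice == "HE" else step
    return min(MOD_MAX, max(MOD_MIN, modifier)), streak


def run_staircase(choices):
    """Return the modifier value in effect on each trial of one condition."""
    modifier, previous, streak = MOD_START, None, 0
    history = []
    for choice in choices:
        history.append(modifier)
        modifier, streak = update_modifier(modifier, choice, previous, streak)
        previous = choice
    return history


print([round(m, 3) for m in run_staircase(["HE", "HE", "LE", "HE"])])
# [0.625, 0.605, 0.555, 0.575]
print(rewards(["red", "red"], ["green", "green"], 0.625))
# (13.75, 6.25)
```

In this sketch, thirteen consecutive LE choices drive the modifier to its 1.25 cap within seven trials, which matches the floor-effect criterion given under Participants.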

Point of Indifference

As a measure of the relationship between reward and perceived effort, the point of indifference (POI) for each subject was determined per condition. Assigning POI values has been demonstrated to be a reliable method for measuring individual differences in subjective effort (Westbrook, Kester, & Braver, 2013). The POI is reached when the subject no longer expresses a preference for a specific option. At this point the subject will choose the HE/HR option just as often as the LE/LR option.

The POI can take on a value between 0 and 1.25. To determine the POI value for each condition, the values of the reward modifier (the reward per unit of effort) over the last four trials are averaged. When the POI is low, the subject needs less reward to choose the HE/HR option. A higher POI means that the subject needs a higher reward to choose the HE/HR option. When the POI is equal to zero the subject chooses the HE/HR option without consideration of the effort required.

Figure 4. Mean experienced realism of the three virtual environments on a 1-5 scale. Error bars: standard error of the mean, normalized between subjects to reflect within-subject comparison.

The differences in POI values over conditions represent the differences in cost-benefit considerations. The representation of the effort and reward could account for these differences. When reward representation is constant, differences in the POI values are due to perceived effort. When effort representation is held constant, differences in the POI values can be attributed to perceived reward.
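As a small worked illustration of the POI calculation, the sketch below averages the reward modifier over the last four trials of one condition. The function name and the example series are hypothetical; the series is one possible sequence of modifier values for 13 trials and does not come from the study data.

```python
# Sketch of the POI calculation: mean reward modifier over the last four trials.
# The function name and the example series are illustrative assumptions.
from statistics import fmean


def point_of_indifference(modifier_per_trial, n_last=4):
    """Average the reward modifier over the final n_last trials of a condition."""
    if len(modifier_per_trial) < n_last:
        raise ValueError("need at least n_last trials")
    return fmean(modifier_per_trial[-n_last:])


# Hypothetical modifier values for the 13 trials of one condition.
modifier_series = [
    0.625, 0.605, 0.555, 0.575, 0.555, 0.505, 0.525,
    0.505, 0.525, 0.505, 0.525, 0.505, 0.525,
]
print(f"{point_of_indifference(modifier_series):.3f}")  # 0.515
```

A participant who alternates between the two options at the end of a condition, as in this example, ends up with a POI close to the modifier value around which the choices oscillate.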

Results

To assess whether our effort and reward virtual reality environments were actually experienced as more realistic than the baseline condition, participants were asked to rate each environment on a 1-5 scale, with one being not at all realistic and five being completely realistic. Results of a one-way repeated measures ANOVA showed a significant difference in experienced realism over conditions, F(2, 58) = 16.248, p < .001 (see Figure 4). On average, realism of the baseline condition was rated 3.03 on a scale of 1-5, the reward condition 3.17 and the effort condition 3.60. Contrasts revealed that this difference was not significant between the baseline and reward condition, but participants rated the effort condition as significantly more realistic than the baseline condition, F(1, 29) = 24.578, p < .001.

Figure 5. Mean Point of Indifference scores across conditions. Error bars: standard error of the mean, normalized between subjects to reflect within-subject comparison.

To investigate whether choice behavior differed between conditions, POI values were calculated for each participant in each condition. A lower POI value stands for a higher willingness to exert effort, and a higher POI value for a lower willingness to exert effort. Choice behavior of the 33 participants was compared across conditions in a within-subjects design, using a one-way repeated measures ANOVA. POI values did not differ significantly between conditions, F(2, 64) = 0.958, p = .389.
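For readers who want to see where the F value and its degrees of freedom come from, the following sketch computes a one-way repeated measures ANOVA by partitioning the sums of squares. It is a textbook formulation, not the statistics package used for the reported analysis; it does not test or correct for sphericity, the function name is hypothetical, and the example scores are arbitrary numbers, not study data.

```python
# Sketch of a one-way repeated measures ANOVA via sums of squares.
# Function name and example scores are illustrative; no sphericity correction.


def rm_anova_oneway(scores):
    """scores: one list per subject, holding one value per condition.

    Returns (F, df_conditions, df_error).
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of conditions
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_cond - ss_subj  # condition-by-subject residual

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Arbitrary example: 3 subjects measured in 3 conditions.
example = [[1, 2, 3], [2, 3, 5], [3, 4, 4]]
f_value, df1, df2 = rm_anova_oneway(example)
print(f"F({df1}, {df2}) = {f_value:.3f}")  # F(2, 4) = 9.000
```

With 33 subjects and three conditions the same formulas give 2 and 64 degrees of freedom, as in the result reported above. The p value would follow from the upper tail of the F distribution with those degrees of freedom, which this sketch leaves out.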

Discussion

This research focused on effort-based decision making in three different environments, varying in degree of realism. We expected to see differences in choice behavior according to environment, with more realistic environments provoking more HE/HR choices. We expected this increase in effort-willingness to be the result of higher dopamine levels in the brain, specifically in response to a new environment and a salient representation of reward. This research aimed to find out in what way computerized tasks, and the way they are presented, influence dopamine and, with that, behavioral preferences.


Our outside environment was rated as more realistic than our simple, two-dimensional environment, which demonstrates that our goal of comparing a more realistic and a less realistic environment was achieved. Nonetheless, a specific effect of environment on decision making was not found. This suggests that the specific presentation of a computerized choice paradigm may not influence the behavioral outcome, and that results of differently administered tests are to a certain extent comparable to each other. Given that psychological research is largely dependent on these tasks, this non-significant result is actually a rather positive outcome for our scientific field.

However, such a conclusion must be handled with caution. Due to the fairly small sample size, only effects greater than 0.5 (a medium effect size according to Cohen, 1992) could be detected in this research. So medium to smaller effect sizes would remain undetected in the current experimental setup. This is in itself not very alarming, but bear in mind that the difference in experienced realism between conditions in this research was significant but not large (only a 0.57 increase on a 1-5 scale). Consequently, when realism differences between settings are greater, a large effect (missed in this research) could still be present.

Moreover, our greatest concern before drawing a conclusion from this research is that we still cannot be sure whether real-life decisions are comparable to these computerized tasks. We tried our best to create a virtual environment that mimics the actual world, but the huge gap in realism between digitally created environments and reality cannot (yet?) be overcome. We feel that the use of a virtual reality headset is already a great improvement in task realism compared to normal two-dimensional setups, and probably in the near future far more realistic environments can be projected onto VR-headsets with better tracking and resolution.


Additionally, a couple of other factors could have played a role in the current outcome of the analysis. Three of the participants mentioned choosing easier routes in the effort condition, so that they could pay better attention to their surroundings and take a look around instead of driving the cart through a tough route. Also, chances of winning a reward for exerted effort in this task were very low. There was only a one in 54 (51 participants + 3 pilot participants) chance of winning money, so motivation of the participants possibly depended more on social factors (helping out a friend, not appearing lazy) or other personal factors (liking a challenge, not wanting to get bored) than on actual cost-benefit considerations.

Since effort-willingness did not differ across conditions, we unfortunately cannot conclude anything about the role of dopamine in computerized tasks. In future research, the administration of 11C-labeled raclopride and the use of a PET scanner would probably yield some very interesting results. Failure of raclopride to bind to dopaminergic receptors is assumed to be due to endogenous dopamine occupying the receptor sites. Raclopride can be seen on a PET scan, and is therefore used to evaluate dopamine activity in the brain. Using this technique, other environmental influences on dopaminergic activity can be assessed, for instance the influence of the presence of other people or of natural versus lab settings. This could be used in combination with a VR-headset, enabling easy switching between environments.

Because of the sample size, the size of the manipulation and motivational factors, this research cannot be taken to rule out all effects of environment on choice behavior in the wider population. Nevertheless, it is reassuring that test presentation probably has no large influence on measurements of choice behavior, and that computer administration of effort-based decision making tasks is probably fairly reliable.


References

Berridge, K. C. (2007). The debate over dopamine’s role in reward: the case for incentive salience. Psychopharmacology, 191(3), 391-431.

Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155-159.

Horvitz, J. C. (2000). Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events. Neuroscience, 96(4), 651-656.

Li, S., Cullen, W. K., Anwyl, R., & Rowan, M. J. (2003). Dopamine-dependent facilitation of LTP induction in hippocampal CA1 by exposure to spatial novelty. Nature Neuroscience, 6(5), 526-531.

Ludvig, E. A., Sutton, R. S., & Kehoe, E. J. (2008). Stimulus representation and the timing of reward-prediction errors in models of the dopamine system. Neural Computation, 20(12), 3034-3054.

Pardo, M., López-Cruz, L., Valverde, O., Ledent, C., Baqi, Y., Müller, C. E., Salamone, J. D., & Correa, M. (2012). Adenosine A2A receptor antagonism and genetic deletion attenuate the effects of dopamine D2 antagonism on effort-based decision making in mice. Neuropharmacology, 62(5), 2068-2077.

Phillips, P. E., Walton, M. E., & Jhou, T. C. (2007). Calculating utility: preclinical evidence for cost–benefit analysis by mesolimbic dopamine. Psychopharmacology, 191(3), 483-495.

Samejima, K., Ueda, Y., Doya, K., & Kimura, M. (2005). Representation of action-specific reward values in the striatum. Science, 310(5752), 1337-1340.


Tran, A. H., Uwano, T., Kimura, T., Hori, E., Katsuki, M., Nishijo, H., & Ono, T. (2008). Dopamine D1 receptor modulates hippocampal representation plasticity to spatial novelty. The Journal of Neuroscience, 28(50), 13390-13400.

Wardle, M. C., Treadway, M. T., Mayo, L. M., Zald, D. H., & de Wit, H. (2011). Amping up effort: effects of d-amphetamine on human effort-based decision-making. The Journal of Neuroscience, 31(46), 16597-16602.

Westbrook, A., Kester, D., & Braver, T. S. (2013). What is the subjective cost of cognitive effort? Load, trait, and aging effects revealed by economic preference. PLoS One, 8(7), e68210.

Wise, R. A. (2004). Dopamine, learning and motivation. Nature reviews