Reinforcement learning in context of aversive emotional psychophysiological stimuli

Academic year: 2021


Title: Reinforcement learning in context of aversive emotional psychophysiological stimuli

Author: Isabela Lara Uquillas

Collaboration: Chih-Chung Ting

Supervisor: Jan Engelmann

Ethics approval: Economics & Business Ethics Committee (University of Amsterdam) EC20170314120328

Sponsor: Center for Research in Experimental Economics and political Decision-making (CREED) at the University of Amsterdam

Disclosure of interest: The author reports no conflicts of interest


Decision making is a complex process that takes place in every individual, as decisions are ubiquitous and can have significant influences on life. It has been widely studied due to its relevance in everyday life and its prevalence in humans and other organisms. The importance of decision making and its mechanisms has been characterized in the literature as early as 1954 (Edwards, 1954). Despite this long-standing interest in the topic, little is known about the effect of incidental emotions on the decision-making process. Consequentialist economic models assume that incidental emotions do not influence choices; however, there has recently been growing interest in the effect of incidental emotions on the decision-making process (Loewenstein & Lerner, 2003; Blanchette, 2010).

One of the most popular models of decision making was proposed by Rangel and colleagues (2008) and posits that decision making is a continuous multi-step process consisting of actions and their evaluation. Only recently have experiments in neuroeconomics begun to identify the neural mechanisms underlying emotional distortions of choice processes (Fehr & Rangel, 2011). Evidence suggests that multiple areas and pathways are involved (Basten et al., 2010; Wyart et al., 2012; Rushworth et al., 2012; Phelps et al., 2014) and that they are further affected by multiple emotional and cognitive components (Heilman et al., 2010; Starcke & Brand, 2012; Payzan-LeNestour et al., 2013; Mitchell, 2011). Furthermore, these areas and processes are also affected by the individual’s pre-existing values as well as their sensitivity to reward and punishment, namely their capacity to distinguish between these choice domains and learn from them (Hee Kim et al., 2015).

To observe decision making in action, many paradigms have been used, each of which allows us to observe not just the decision taking place but also the different processes and components that come into play at different stages of that process (Yu, 2015). To focus on one particular aspect of the choice process, namely learning about the outcomes of our decisions to update future expectations, a reinforcement learning task was selected and adapted from Palmintieri and colleagues (2006). Specifically, subjects learn to associate neutral stimuli with a specific (high or low) probability of monetary reward or punishment. This is important because it has previously been shown that learning through rewards and punishments can be differentially affected by emotional manipulations (Engelmann et al., 2015; Cavanagh et al., 2011); therefore, by using this task we can observe the differences between decisions made in a loss (punishment) and a gain (reward) domain.

Additionally, we aim to observe the effect of incidental emotions on decision making. For this, researchers have employed emotion induction techniques which have been implemented in a variety of ways (Lench et al., 2011); however, many of these are only quantifiable through self-report and can potentially induce a variety of co-existing emotions. Limiting research to a specific and validated induced emotion can help to unravel the different variables that come into play in this complex process. To do this, incidental emotions will be defined as incidental anticipatory anxiety evoked by the threat of electrical stimulation to the forearm through a threat-of-shock protocol (Schmitz & Grillon, 2012). Furthermore, to evaluate the effect of this manipulation, skin conductance measurements will be performed and modeled.

Based on the previous literature, the proposed study aims to decipher to what extent incidental emotions affect specific decision-making processes. To gain a better understanding of emotion’s distortionary influences on the cognitive processes involved in decision making, this research aims to observe the effect of incidental anxiety on learning by combining methods from experimental economics and psychology. This study will focus on learning, which will be evaluated through an instrumental learning task involving decisions over cues associated with monetary gains and losses, while participants are exposed to incidental anxiety induction, as has been done previously (Engelmann et al., 2015). By combining this task with this emotion induction paradigm, we will be able to measure not just the effect of incidental anxiety on decision making but also how people evaluate different options under different conditions and how each outcome affects subsequent choices by updating information, thus improving the chances of maximizing gains.

For this purpose, participants will attend a lab session after completing a battery of questionnaires. During the lab session, they will perform multiple tasks during which we will record the choices made, the reaction times for those choices, and skin conductance responses throughout the whole experiment, which will allow us to observe the effect of induced anxiety and its potential effect on decision-making processes. Additionally, assessments will take place once the task has been completed in order to see if the participants were able to deduce the outcomes associated with particular options presented to them throughout the task.


Ultimately, it is expected that participants will deduce the associations between stimuli and their beneficial or negative outcomes to improve their decisions during the task, and that they will report their preference for each option implicitly and explicitly after the task has been completed. Furthermore, anxiety, operationalized as threat of shock, is expected to interact with domain in order to influence learning. We do not hypothesize on the directionality of this interaction, as there is evidence that anxiety could enhance performance in an emotion-congruent manner (Robinson et al., 2013) as well as findings suggesting that the attentional bias associated with anxiety would be detrimental to performance (White et al., 2015; Petzold et al., 2010). Moreover, emotion-congruent learning has been observed in positive affect manipulations, and thus it follows that an effect of a similar magnitude but opposite directionality could be observed in negative affect conditions such as anxiety (Carpenter et al., 2013). Behaviorally, it is expected that participants will be able to maximize gains to a greater degree in the control baseline condition than in the induced anxiety condition. We further expect that skin response modeling will validate our anxiogenic experimental manipulation, such that participants will have a higher skin conductance response in the incidental anxiety blocks compared to neutral affect blocks (Bach & Friston, 2013). Finally, we expect participants’ reporting of their implicit and explicit preferences for particular stimuli to be less accurate in the incidental anxiety condition and more negative towards negative outcomes, such that it correlates with the gains and losses, since it should reflect the predictions used during the task itself.

Method

Participants

42 students from the participant pool of the Center for Research in Experimental Economics and political Decision-making (CREED) at the University of Amsterdam participated in the study in exchange for monetary compensation. The sample consisted of right-handed participants with no history of psychiatric disorders and no electronic implants: 20 men (47.6%) and 22 women (52.4%), with an average age of 24.21 years (SD=3.14). One participant was excluded from all further analysis because their session was interrupted by a Windows 10 update reminder.

Materials

All tasks performed by the participants were presented on an LED screen (1366x768px) using Cogent 2000. Participants responded through their keyboard in accordance with each task’s instructions, namely with the spacebar, enter and arrow keys. To administer shocks for the emotion induction paradigm (i.e. threat of shock), a DS5 - Bipolar constant current stimulator was used. Shocks were administered with a wrist electrode attached with Velcro and were triggered by the accompanying MATLAB code. Additionally, a custom-made amplifier with a pair of sintered Ag/AgCl finger electrodes, purchased from the University of Amsterdam’s Technical Support Social & Behavioural Sciences (TOP) department, was used to record skin conductance. Manipulation, recording and specifications for skin conductance measurements were applied through a Vsrrp98 V10.0 xml driver, which would further convert the data to MATLAB. Due to time constraints and the scope of this report, the results obtained from skin conductance will not be discussed further. Symbols used for the learning task were drawn from the Agathodaimon font, as was done in the original task (Palmintieri, 2006). For further information and details regarding all the materials and methodologies used, refer to the Supplementary Materials section.

Measures

Questionnaires were sent to participants before the task for them to complete in their own time. In the instructions presented to them, they were encouraged to respond to all questionnaires in a single sitting and without distractions. In compensation for successfully completing all parts of the questionnaire, participants were rewarded with a 10 euro endowment for the behavioral task in the second half of the experiment, to be performed in the lab.

Demographics. Participants were asked a few basic questions, including age, gender, study background, handedness, history of mental illness and presence of any implanted electronic devices. The latter three constituted grounds for exclusion from further participation; if that was the case, the questionnaire would end without proceeding to the rest of the inventories.

PANAS. The Positive and Negative Affect Schedule (PANAS; Watson, Clark, & Tellegen, 1988) is used to measure both positive and negative affect at a given time scale. Participants rate a series of adjectives from 1, very slightly or not at all, to 5, extremely, based on the extent to which they have felt each one in the given time scale of “At this moment / Last week I felt…”. Specific items are scored to calculate the positive and negative affect subscales.

BDI. Beck’s Depression Inventory (BDI; Beck, Steer, & Brown, 1996) is used to determine the existence and severity of depression in an individual through a 21-item inventory. For each item, participants select the statement with which they identify best. There are 4 statements per item, ranging in severity. The higher the score, the more severe the depression.

BAI. Beck’s Anxiety Inventory (BAI; Beck et al., 1988) is used to assess anxiety based on the symptomatology the participant observes. It consists of 21 common symptoms of anxiety. The participant indicates the frequency of each symptom during the past month on a 4-point Likert-type scale ranging from Not at all to Severely - it bothered me a lot, which correspond to scores 0 and 3; more severe reported symptoms correspond to higher scores.

ERQ. The Emotion Regulation Questionnaire (ERQ; Gross & John, 2003) assesses emotion regulation through 10 statements describing emotional management, for which participants indicate to what degree they agree. Participants respond on a 7-point Likert-type scale ranging from 1, strongly disagree, to 7, strongly agree. A higher score corresponds to higher emotion regulation.

Tasks

Multiple tasks were used during the experiment; they are described below (Figure 1 and Figure 2).

Calibration Task. To implement the threat of shock paradigm, a calibration round was performed for each participant. Participants were prompted by the script to press a key to receive a shock and then rated each shock on an on-screen scale ranging from 1, not painful at all, to 10, extremely painful. The intensity of the shocks presented was based on the participant’s subjective responses and ranged from 2.5mA to 25mA in steps of 2.5mA. After a participant had rated three consecutive shocks as a 7 or higher, the screen would close and their responses would be recorded. The last shock intensity rated as 7 or higher by the participant would be used for the subsequent learning task.
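The stopping rule of the calibration can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_intensity` and the example sequence of intensities are hypothetical, and the rule by which the script chose the next intensity from the participant’s responses is not described here, so the sketch only decides when to stop and which intensity to keep.

```python
from typing import Iterable, Optional, Tuple

THRESHOLD = 7            # ratings at or above this count towards stopping
CONSECUTIVE_NEEDED = 3   # number of consecutive ratings >= THRESHOLD required


def select_intensity(trials: Iterable[Tuple[float, int]]) -> Optional[float]:
    """Return the intensity (mA) to use in the learning task.

    `trials` is a sequence of (intensity_mA, rating) pairs in the order the
    shocks were delivered. Returns None if the stopping rule is never met.
    """
    streak = 0
    for intensity_ma, rating in trials:
        if not 1 <= rating <= 10:
            raise ValueError("ratings are given on a 1-10 scale")
        if rating >= THRESHOLD:
            streak += 1
            if streak == CONSECUTIVE_NEEDED:
                # The calibration ends here, so this is the last shock
                # rated 7 or higher.
                return intensity_ma
        else:
            streak = 0
    return None


# Hypothetical session: the intensities shown are illustrative only.
example = [(2.5, 3), (5.0, 6), (7.5, 7), (7.5, 8), (7.5, 7)]
chosen = select_intensity(example)
```

In this hypothetical session the third consecutive rating of 7 or higher occurs at 7.5mA, so that intensity would be carried into the learning task.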


Practice Task. Participants performed a short practice of the main experimental task, which allowed them to familiarize themselves with the task setup and to ask any questions they might have before the main experimental task took place. As the practice was intended only to introduce the learning task itself, no shocks were delivered and no monetary gains or losses were accrued. Moreover, the symbols used were unrelated to those of the main task, so that there would not be any carryover effects.

Learning Task. The main experimental task follows a reinforcement learning paradigm in which two symbols are presented to the left and right sides of a fixation cross. Participants have 2.5 seconds to choose one of the sides using the arrow keys on the keyboard. They receive feedback corresponding to the symbol they chose; if no symbol is chosen within the time limit, they receive the detrimental option’s outcome without any indication of which symbol it was associated with. During the task they should be able to learn, by accumulating evidence, that some symbols generally represent beneficial responses whereas others represent detrimental ones (Figure 3). Additionally, trials are divided into gain and loss domain blocks: in the former, the neutral option is the worse outcome, as it represents no gain, whereas in the latter the neutral option is optimal because it does not inflict a monetary loss on the participant. Finally, certain blocks have shocks administered throughout whereas others do not include any shocks; this is shown to the participant on screen with a distinctive green or blue frame around the task throughout the whole experiment, and in text form before each block. Each symbol corresponded to either gain or loss and to either shock or safe; this resulted in 4 different symbol pairs. Before the task started, participants were made aware only of the shock and safe conditions. Each block consisted of 3 trials and there were 8 blocks per condition; the order of these was pseudorandomized. In total, a full round of the reinforcement learning task included 96 trials, 24 trials per condition. During this task the participant’s responses, reaction times and money earned or lost were recorded.
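The block structure described above (2 emotion conditions x 2 domains, 8 blocks per condition, 3 trials per block) can be sketched as follows. This is a minimal illustration, not the task code: the names `build_schedule`, `EMOTIONS` and `DOMAINS` are hypothetical, and a plain seeded shuffle stands in for the pseudorandomization, whose actual constraints are not specified here.

```python
import random
from collections import Counter
from itertools import product

EMOTIONS = ("safe", "shock")
DOMAINS = ("gain", "loss")
BLOCKS_PER_CONDITION = 8
TRIALS_PER_BLOCK = 3


def build_schedule(seed: int = 0):
    """Return a list of (block_index, emotion, domain) tuples, one per trial."""
    rng = random.Random(seed)
    # One entry per block: each of the 4 conditions appears 8 times.
    blocks = [
        condition
        for condition in product(EMOTIONS, DOMAINS)
        for _ in range(BLOCKS_PER_CONDITION)
    ]
    # Assumption: an unconstrained shuffle of block order.
    rng.shuffle(blocks)
    return [
        (block_index, emotion, domain)
        for block_index, (emotion, domain) in enumerate(blocks)
        for _ in range(TRIALS_PER_BLOCK)
    ]


schedule = build_schedule()
trials_per_condition = Counter((emotion, domain) for _, emotion, domain in schedule)
```

With these counts the schedule contains 32 blocks and 96 trials, 24 in each of the four conditions, matching the totals reported for one round of the task.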

Preference Task. Participants were shown pairs of the symbols, from which they had to pick the one they preferred and their responses were recorded. There would not be any shocks, feedback nor monetary gains or losses from this task.

Valence Rating Task. An individual symbol would be shown and participants rated it from 1, very negative, to 10, very positive. Each symbol would be shown 4 times in a randomized order and the participant’s responses to the symbols would be recorded. As in the preference task, there were no shocks, feedback, or monetary gains or losses in this task.

Exit Questionnaire

Participants were given a questionnaire regarding their particular strategies for the tasks, with manipulation checks regarding their emotional state during the task. Additionally, it included a recognition task featuring symbols that were not present in the task itself, for which participants had to indicate whether or not they had seen them before.

Procedure

All procedures were approved by the Economics & Business Ethics Committee at the University of Amsterdam prior to recruitment and data collection. Recruitment took place through the departmental participant pool of the Center for Research in Experimental Economics and political Decision-making (CREED). Participants were asked to fill in a questionnaire battery prior to the experimental session. The battery consisted of screening questions, demographic questions, and the questionnaire measures described previously (i.e. PANAS, BAI, BDI, ERQ). Additionally, all the symbols to be used in the main experimental task were shown for at least 60 seconds so that participants would be familiar with them. Furthermore, for completing these questionnaires, participants were awarded 10 euros, which were to be their endowment and starting balance for the learning task that was to take place during the experimental session in the lab.

Once participants had completed the questionnaire, they attended individually scheduled sessions. After participants arrived, they were asked to read and give their written consent for participation in the experiment. Additionally, they were thanked for completing the questionnaire and once again reminded that the questionnaire completion earned them a 10 euro starting balance for the task. Finally, recording and stimulator electrodes were placed and secured. Skin conductance electrodes were placed on the pinky and ring fingers of the left hand and secured with tape; similarly, the shock stimulator was secured to the left wrist by a Velcro band. Additionally, participants were given a pillow on which to rest their arm to make it more comfortable. To improve signal conductivity and reduce impedance, all electrodes were placed with conductive gel on alcohol-cleaned locations. Participants were further informed that the shocks would be delivered only to their wrist, that the electrodes on their fingers would not deliver any shocks, and that all responses during the tasks were to be made using their right hand.

First, they underwent the threat of shock calibration as well as a practice round for the learning task. Once these were completed, participants were reminded that there would be shocks in the following task, that their final payout would be determined by their performance and that there was a short self-paced break halfway through all the trials. Afterwards, skin conductance response recording started and the participants engaged with the main experimental learning task.

After the task ended there was a second calibration round to account for habituation and sensitization effects. The intensity reached through this calibration would then be used for the second iteration of the learning task. This round had the same task as before; however, the symbols were novel to the participant. This was told to the participants in the instructions for the task and was repeated to them verbally before the task started. Additionally, participants were reminded that their performance would determine their final payment. Like in the previous round, there was one self-paced break halfway through the task. After the learning task was completed, there was a third calibration to note any further habituation or sensitization in the participants.

Once the last calibration was completed, the shock stimulator was removed and skin conductance recording was stopped. Participants were then asked to complete assessments regarding the symbols presented in the second session. The first assessment consisted of a preference task and the second of a valence rating task. After these two tasks were completed, skin conductance electrodes were removed and the participant was asked to complete an exit questionnaire, including the recognition task, on a separate laptop while the researcher calculated the participant’s payout. Once the participant was done, they were paid and their questions were answered before they left.

Payment Scheme

Payment was calculated based on the choices the participants made. Each choice outcome would directly translate to their final payout such that:

In which the gain amount is the sum of all the +0.50 euro outcomes the participant achieved, loss amount is the sum of the -0.50 outcomes. Additionally, each item correctly recognized during the exit questionnaire resulted in +0.5 euro winnings for a maximum bonus of 2 euros from 4 items presented.
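As a hedged illustration of how these components could combine, the sketch below assumes that the final payout is the 10 euro endowment plus the gain amount, plus the (negative) loss amount, plus the recognition bonus. That combination is an assumption made for this example, and the function name `payout` and its parameters are hypothetical.

```python
ENDOWMENT_EUR = 10.0          # starting balance earned with the questionnaires
OUTCOME_EUR = 0.50            # size of a single gain or loss outcome
RECOGNITION_BONUS_EUR = 0.50  # bonus per correctly recognized item
RECOGNITION_ITEMS = 4         # items presented in the recognition task


def payout(n_gain_outcomes: int, n_loss_outcomes: int, n_recognized: int) -> float:
    """Combine endowment, task outcomes and recognition bonus (assumed rule)."""
    if not 0 <= n_recognized <= RECOGNITION_ITEMS:
        raise ValueError("at most 4 recognition items can be correct")
    gain_amount = n_gain_outcomes * OUTCOME_EUR     # sum of +0.50 outcomes
    loss_amount = n_loss_outcomes * -OUTCOME_EUR    # sum of -0.50 outcomes
    bonus = n_recognized * RECOGNITION_BONUS_EUR    # at most 2 euros
    return ENDOWMENT_EUR + gain_amount + loss_amount + bonus


# Hypothetical participant: 30 gains, 10 losses, all 4 items recognized.
example_payout = payout(30, 10, 4)  # 10 + 15 - 5 + 2 = 22.0
```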

Results

Results from the main task’s measures were analyzed using the MATLAB statistics package, whereas results from the questionnaires and post-testing tasks were analyzed using IBM SPSS 24.

Calibration Testing

Participants underwent three different rounds of calibration. This analysis includes data from 40/41 participants due to data misplacement. In each of these rounds, the final measurement was the one to be used in the experimental task to evoke anxiety as our manipulation. On average, participants rated their last shock as 7.70 (SD=1.14), 7.83 (SD=0.93) and 7.85 (SD=0.83) for the three calibrations in chronological order (Figure 4). A one-way repeated measures ANOVA showed no significant differences in the subjective ratings participants gave to the shocks used throughout the task; F(2,78)=0.471, p=0.471, ηp2=0.012. These ratings correspond to an average intensity of 7.31mA (SD=5.26), 8.31mA (SD=5.29) and 10.06mA (SD=5.70) for the three calibrations in chronological order (Figure 4). Similarly, another one-way repeated measures ANOVA was used to compare the intensities between the different calibration rounds. A significant difference was found in the objective measure of the calibrations, namely their recorded intensity in mA; F(2,78)=16.91, p<0.001, ηp2=0.302. This suggests that habituation was indeed taking place, as the average intensity increases progressively across calibrations; however, this was accounted for by each subsequent calibration, as there were no significant differences in the subjective ratings of the shocks used for the manipulation. The perceived discomfort therefore remains stable across calibrations, and with it so should our manipulation.

Questionnaires

Questionnaire results seem to show that participants had a higher propensity for positive affect (M=33.88, SD=7.59) than negative affect (M=21.71, SD=7.04) on the PANAS. Additionally, the BDI showed that participants were minimally depressed on average (M=9.20, SD=7.84) and the BAI showed a tendency towards minimal state anxiety (M=9.88, SD=8.66) on average across participants. Finally, the ERQ shows that participants are, on average, good emotion regulators (M=42.49, SD=6.84).

Reinforcement Learning Task

Prior to analyzing the results of the main experimental task, trials that were too fast (<50ms) or took too long (>3s) were eliminated. To analyze the results from the main experimental task, a 2x2 repeated measures ANOVA was used to test the effect of both experimental manipulations on the different measures of the reinforcement learning task, namely performance and reaction time (RT). The independent variables were the manipulations defined previously: emotion induced (safe or anxiety) and the decision domain (gain or loss) in which the decision took place. Based on this, there seems to be a marginally significant main effect of decision domain (F(1,40)=3.56, p=0.067), by which participants responded faster in the gain domain (M=895.05, SD=20.42) than in the loss domain (M=1028.50, SD=25.14). On the other hand, emotion induction had a highly significant main effect (F(1,40)=72.9, p<0.001) on reaction time. In this case, participants responded faster in the anxious condition (M=950.5, SD=44.68) than in the safe condition (M=973.05, SD=23.22). There was also a significant main effect of subject (F(40,40)=5.62, p<0.001), which seems to indicate high inter-subject variability. Furthermore, there was a significant interaction between emotion induction and subject (F(40,40)=2.47, p=0.003) in terms of reaction time. All other interactions were not significant (p>0.100). Finally, there were no significant main effects or interactions in terms of the performance of participants on the reinforcement learning task (p>0.200).
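The trial exclusion applied before this analysis can be sketched as a simple filter. The field name `rt_ms`, the helper names and the example trials are hypothetical; only the two cut-offs (faster than 50ms, slower than 3s) come from the text.

```python
MIN_RT_MS = 50     # trials faster than this are treated as too fast
MAX_RT_MS = 3000   # trials slower than this are treated as too slow


def keep_trial(rt_ms: float) -> bool:
    """True if the reaction time falls inside the accepted window."""
    return MIN_RT_MS <= rt_ms <= MAX_RT_MS


def filter_trials(trials):
    """Drop trials whose reaction time is outside the accepted window."""
    return [trial for trial in trials if keep_trial(trial["rt_ms"])]


# Hypothetical trials, for illustration only.
raw_trials = [
    {"trial": 1, "rt_ms": 32.0},    # too fast, removed
    {"trial": 2, "rt_ms": 910.5},   # kept
    {"trial": 3, "rt_ms": 3400.0},  # too slow, removed
    {"trial": 4, "rt_ms": 50.0},    # exactly at the lower bound, kept
]
clean_trials = filter_trials(raw_trials)
```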

To better understand these relationships, they were further analyzed using a 2x2x24 ANOVA which factors in emotion induction, domain and trial-by-trial results. This approach takes into account the learning taking place on a trial-by-trial basis and therefore accounts for its variance, while also making it possible to measure and contrast the effects of the experimental manipulations on the task at different points in time.

Performance. First, participants’ performance was assessed across the emotion induction manipulation and the domain manipulation, as well as on a trial-by-trial basis (Figure 5A). There seem to be no main effects of either the emotion induction

other hand, there is a main effect of trial number (F(40, 7572)=145.11, p<0.001). No pairwise analyses were performed, but visual inspection of the data seems to show an increasing trend of performance over trials, which suggests that learning is taking place, particularly in the anxiety and gain condition (Figure 5A). There were multiple significant interactions. Perhaps the most notable was that between both experimental manipulations, emotion and domain (F(1,7572)=6.62, p=0.01), such that anxiety improved learning in the gain domain (M=0.7713, SD=0.03) but not in the loss domain (M=0.7176, SD=0.02). Additionally, there seems to be an important effect of inter-individual differences despite its main effect being non-significant (F(40, 7572)=0.59, p=0.9689), since all its interactions with the tested variables are significant. Namely, there seem to be significant interactions between participants and their response to domain (F(40, 7572)=8.46, p<0.001), elicited emotion (F(40,7572)=8.31, p<0.001) and trial number (F(40,7572)=1.5, p=0.022).

Reaction Times. Reaction times were also assessed across the same independent variables described above, namely the emotion, domain and trial number in which the decisions took place (Figure 5B). In this case, there seems to be no significant main effect of domain (F(1,7572)=0.92, p=0.343) nor of the emotion manipulation (F(1,7572)=0.00, p=0.953). However, there was a highly significant main effect of trial number (F(1,7572)=99.89, p<0.001). As with performance, pairwise comparisons were not performed for the trial main effect; however, visual inspection of the data seems to suggest that reaction time decreases over trials, consistent with learning taking place, particularly in the anxiety and gain condition (Figure 5B). Additionally, there was a significant interaction between domain and trial number (F(1, 7572)=14.95, p<0.001). Finally, as previously described for performance, there seems to be a remarkable effect of inter-subject differences on this dependent variable. In this case there is a highly significant main effect of subject on reaction time (F(40,7572)=6.18, p<0.001) and all its interactions with the tested variables are highly significant as well (Domain: F(40,7572)=3.59, p<0.001; Emotion: F(40,7572)=2.15, p<0.001; Trial: F(40,7572)=3.76, p<0.001).


To analyze the effects of the independent variables on the post-task preference and rating tasks, a 2x2x2 repeated measures MANOVA was used. The independent variables were those which defined each symbol presented: emotion induced (safe or anxiety), decision domain (gain or loss) and probability of success (75% or 25%). Success was operationalized as either earning 50 cents in the gain domain or not losing 50 cents in the loss domain. This analysis seems to show that there is a highly significant main effect of domain (F(2,39)=43.64, p<0.001, ηp2=0.69) and of probability of success (F(2,39)=42.396, p<0.001, ηp2=0.685) across dependent measures. Emotion induction does not seem to have a significant main effect overall (F(2,39)=1.25, p=0.30, ηp2=0.06). All interactions are non-significant (p>0.150). To further analyze these results, they will now be described according to their respective dependent measures.

Preference Task. The probability of choosing a specific symbol during the preference task was compared across the different contexts in which each symbol was presented, namely emotion induction (safe or anxious), decision domain (gain or loss) and probability of success (75% or 25%) (Figure 6). There seems to be a significant main effect of domain (F(1,40)=68.187, p<0.001, ηp2=0.630), by which participants were more likely to select symbols from the gain domain (M=0.609, SE=0.013) than from the loss domain (M=0.391, SE=0.013). Additionally, the main effect of probability of success for each symbol is also significant (F(1,40)=81.057, p<0.001, ηp2=0.670), such that participants on average selected symbols with a 75% probability of success more often (M=0.650, SE=0.017) than those with just a 25% probability (M=0.350, SE=0.017). On the other hand, the main effect of emotion induction is non-significant (F(1,40)=0.039, p=0.845, ηp2=0.001). In addition, all interactions were non-significant (p>0.100).
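For readers who want to check the reported effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom, since ηp2 = SS_effect / (SS_effect + SS_error) and F = (SS_effect / df_effect) / (SS_error / df_error). The short sketch below applies this identity to the two significant preference-task effects; the function name `partial_eta_squared` is hypothetical and the code is not part of the original analysis pipeline.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the preference task results reported above.
eta_domain = partial_eta_squared(68.187, 1, 40)
eta_probability = partial_eta_squared(81.057, 1, 40)
```

Rounded to three decimals these give 0.630 and 0.670, in agreement with the values reported for domain and probability of success.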

Rating Task. Similarly, the average rating each participant gave to each symbol was compared across the different contexts in which the symbol was presented during the main task: emotion induction, decision domain and probability of success (Figure 6). Based on this analysis, there seems to be a highly significant main effect of domain (F(1,40)=82.579, p<0.001, ηp2=0.674), such that symbols presented in the gain domain (M=6.602, SE=0.162) were on average rated significantly more favorably than those in the loss domain (M=4.30, SE=0.158). Similarly, there was a significant main effect of probability of success (F(1,40)=70.044, p<0.001, ηp2=0.637) where symbols with a

with just a 25% probability of success (M=4.361, SE=0.188). Once again, the main effect of emotion was non-significant (F(1,40)=1.019, p=0.319, ηp2=0.025). Finally, there were no significant interactions (p>0.100).

Exit Questionnaire

The last measure recorded was participants’ responses to the exit questionnaire. In this questionnaire they were asked about their subjective experience during the tasks. Due to delayed implementation of the questionnaire, only data from 35/41 subjects is available and will be analyzed. A repeated measures 2x7 ANOVA was used to observe the difference in feelings (anxiety, fear, surprise, disgust, sadness, happiness, anger) between the shocks themselves and the blocks they were presented in; there was a highly significant main effect of emotion (F(6,204)=12.932, p<0.001, ηp2=0.276). Pairwise analyses show that the most common emotion was surprise (M=4.26, SD=0.30), followed by anxiety (M=3.91, SD=0.35) and fear (M=3.50, SD=0.35). Anxiety was not significantly different from surprise or fear; however, it was felt significantly more (p<0.05) than disgust (M=2.61, SD=0.33), sadness (M=2.18, SD=0.25), happiness (M=2.01, SD=0.18) and anger (M=2.77, SD=0.32). There was also a main effect between the emotions felt during the blocks and during the shocks themselves (F(1,204)=23.74, p<0.001, ηp2=0.41). Furthermore, there was a significant interaction between emotions and whether they were elicited during the shocks themselves or during the blocks (F(6,204)=3.06, p=0.007, ηp2=0.083), which seems to show that all emotions were felt more throughout the block than during the shock itself.

In separate additional questions, participants rated their valence during shock blocks as leaning negative (M=6.88, SD=3.707) and their arousal as leaning towards excited (M=4.59, SD=1.258) via mannequin questions. They also completed a recognition task for the symbols, on which average performance was 3.86 (SD=0.35) out of 4 possible points. Finally, participants were awarded an average of 20.95 euros (SD=4.79) for their participation.

Discussion

Several conclusions can be drawn from these results. First, learning does seem to take place, as evidenced by the significant main effect of trial number on both reaction time and accuracy during the learning task. This suggests that participants developed a strategy and became more efficient with practice as the task progressed. Furthermore, assessments after the learning task showed a clear pattern of preference and valence towards the symbols based on their corresponding outcomes during the task, and participants were capable of reporting these both implicitly (binary preference choice) and explicitly (rating task).

Secondly, our experimental manipulations seem to have no main effects on the learning task, as evidenced by both reaction time and accuracy. Preliminary SCR evidence (not reported) seems to show a significant difference between the anxious and safe conditions. Furthermore, habituation was taken into account through multiple calibrations, none of which showed significant differences in the discomfort participants felt with the shock used throughout the task. These results seem to suggest that our threat-of-shock protocol affects the participants' alertness and arousal; however, this autonomic change does not seem to affect learning, or is not strong enough to do so in the current manipulation. Additionally, the lack of a main effect of domain on learning seems to show that learning occurs equally well in both domains, which has also been found previously by Guitart-Masip and colleagues (2012).

Despite the lack of main effects of our experimental manipulations, there were two highly significant interactions worth noting. In terms of reaction time, there was a highly significant interaction between domain and trial, which points towards domain having an effect on the trial-by-trial learning taking place during the task. Since there was no main effect of domain on reaction time, this interaction could reflect a difference in the rate at which participants learn in the two domains. If this is the case, it should become apparent when pairwise comparisons for this interaction are tested. Additionally, a highly significant interaction between the two manipulations can be observed in performance. This interaction suggests that anxiety improves reward learning but not punishment learning. This does not fully match the emotion-congruent interaction hypothesized earlier, as anxiety was expected to improve punishment learning and not reward learning. An alternative explanation is that learning is enhanced by the autonomic arousal changes brought about by the anxiety manipulation. It has previously been shown that arousal and stress hormones can contribute to improved encoding, which would in turn improve learning (Cahill & Alkire, 2003). However, determining whether this is the cause of the obtained results requires further testing.

Alternative explanations as to why our emotion induction paradigm might not have yielded the expected results in the decision making task can be found in the post-test tasks and questionnaire. As observed in the exit questionnaire, participants were not particularly anxious when the threat of shock was taking place. Ultimately, it would be interesting to examine why some effects were observed in reaction time but not in performance, or vice versa. There is evidence to suggest that there could be a motor difference between the two types of processing and responses (Wrase et al., 2007). Additionally, without further testing it is hard to assert whether a shorter reaction time is due to arousal or to learning and practice effects. It is noteworthy that reaction times showed an unexpected spike at the beginning of each set of trials. This unusual increase in reaction time on the first trial may be an attentional artifact of the task used; if the spike were taken into account, main effects of both domain and emotion could become apparent. The spike could be further exacerbated by task-switching demands arising from the different experimental manipulations and strategies used by the participants.

Moreover, further analyses are needed that, due to the scope of this report, were not conducted and/or reported. Examples are skin conductance modeling and the pending pairwise comparisons. Additionally, further analyses such as a median split would make it possible to take into account the intersubject variability, which proved to have highly significant main effects and interactions in the main task. By reducing the amount of noise in the data, it would be possible to paint a clearer picture of the interaction between the experimentally manipulated variables. Another idea would be to covary out some of the questionnaires or to divide the questionnaires into their base components, for instance the ERQ into reappraisal and suppression elements; this in turn would make it easier to tune covariates or regressors for further analysis.

The behavioral evidence collected points towards learning taking place and being affected differentially by anxiety and domain in both reaction time and performance measurements. These findings could improve our approach to learning and teaching so as to make these processes more efficient. They are also supported by evidence that stress hormones could aid and improve a person's memory encoding (Cahill & Alkire, 2003), as could be the case in the current study. Furthermore, if extended and applied in imaging studies, this task paradigm could allow us to better understand the effects of these two manipulations in terms of the interactions between their brain correlates and their corresponding processes within reinforcement learning models.

References

Bach, D. R., & Friston, K. J. (2013). Model-based analysis of skin conductance responses: Towards causal models in psychophysiology. Psychophysiology, 50(1), 15-22.

Basten, U., Biele, G., Heekeren, H. R., & Fiebach, C. J. (2010). How the brain integrates costs and benefits during decision making. Proceedings of the National Academy of Sciences, 107(50), 21767-21772.

Blanchette, I., & Richards, A. (2010). The influence of affect on higher level cognition: A review of research on interpretation, judgement, decision making and reasoning. Cognition & Emotion, 24(4), 561-595.

Cahill, L., & Alkire, M. T. (2003). Epinephrine enhancement of human memory consolidation: interaction with arousal at encoding. Neurobiology of Learning and Memory, 79(2), 194-198.

Carpenter, S. M., Peters, E., Västfjäll, D., & Isen, A. M. (2013). Positive feelings facilitate working memory and complex decision making among older adults. Cognition & Emotion, 27(1), 184-192.

Edwards, W. (1954). The theory of decision making. Psychological Bulletin, 51(4), 380.

Engelmann, J. B., Meyer, F., Fehr, E., & Ruff, C. C. (2015). Anticipatory anxiety disrupts neural valuation during risky choice. Journal of Neuroscience, 35(7), 3085-3099.

Fehr, E., & Rangel, A. (2011). Neuroeconomic foundations of economic choice: recent advances. The Journal of Economic Perspectives, 25(4), 3-30.

Guitart-Masip, M., Huys, Q. J., Fuentemilla, L., Dayan, P., Duzel, E., & Dolan, R. J. (2012). Go and no-go learning in reward and punishment: interactions between affect and effect. Neuroimage, 62(1), 154-166.

Heilman, R. M., Crişan, L. G., Houser, D., Miclea, M., & Miu, A. C. (2010). Emotion regulation and decision making under risk and uncertainty. Emotion, 10(2), 257.

Kim, S. H., Yoon, H., Kim, H., & Hamann, S. (2015). Individual differences in sensitivity to reward and punishment and neural activity during reward and avoidance learning. Social Cognitive and Affective Neuroscience, 10(9), 1219-1227.

Lench, H. C., Flores, S. A., & Bench, S. W. (2011). Discrete emotions predict changes in cognition, judgment, experience, behavior, and physiology: a meta-analysis of experimental emotion elicitations.

Loewenstein, G., & Lerner, J. S. (2003). The role of affect in decision making. Handbook of affective science, 619-642.

Mitchell, D. G. (2011). The nexus between decision making and emotion regulation: a review of convergent neurocognitive substrates. Behavioural Brain Research, 217(1), 215-231.

Payzan-LeNestour, E., Dunne, S., Bossaerts, P., & O’Doherty, J. P. (2013). The neural representation of unexpected uncertainty during value-based decision making. Neuron, 79(1), 191-201.

Pessiglione, M., Seymour, B., Flandin, G., Dolan, R. J., & Frith, C. D. (2006). Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans. Nature, 442(7106), 1042-1045.

Petzold, A., Plessow, F., Goschke, T., & Kirschbaum, C. (2010). Stress reduces use of negative feedback in a feedback-based learning task. Behavioral Neuroscience, 124(2), 248.

Robinson, O. J., Vytal, K., Cornwell, B. R., & Grillon, C. (2013). The impact of anxiety upon cognition: perspectives from human threat of shock studies. Frontiers in Human Neuroscience, 7.

Rushworth, M. F., Kolling, N., Sallet, J., & Mars, R. B. (2012). Valuation and decision-making in frontal cortex: one or many serial or parallel systems? Current Opinion in Neurobiology, 22(6), 946-955.

Schmitz, A., & Grillon, C. (2012). Assessing fear and anxiety in humans using the threat of predictable and unpredictable aversive events (the NPU-threat test). Nature Protocols, 7(3), 527-532.

Starcke, K., & Brand, M. (2012). Decision making under stress: a selective review. Neuroscience & Biobehavioral Reviews, 36(4), 1228-1248.

White, S. F., Geraci, M., Lewis, E., Leshin, J., Teng, C., Averbeck, B., ... & Blair, K. S. (2016). Prediction error representation in individuals with generalized anxiety disorder during passive avoidance. American Journal of Psychiatry, 174(2), 110-117.

Wrase, J., Kahnt, T., Schlagenhauf, F., Beck, A., Cohen, M. X., Knutson, B., & Heinz, A. (2007). Different neural systems adjust motor behavior in response to reward and punishment. Neuroimage, 36(4), 1253-1262.

Wyart, V., De Gardelle, V., Scholl, J., & Summerfield, C. (2012). Rhythmic fluctuations in evidence accumulation during decision making in the human brain. Neuron, 76(4), 847-858.

Yu, A. J. (2015). Decision-making tasks. Encyclopedia of Computational Neuroscience, 931-937.

Figures

Figure 1. Procedural overview. Calibration rounds are marked in purple, whereas the main experimental task is marked in red. Set-up and practice tasks are marked in green, whereas post-testing is highlighted in blue. Questionnaires completed before arrival at the lab are not colored.

Figure 2. Tasks used throughout the experiment, namely the Calibration Task (A), Practice Task (B), Reinforcement Learning Task (C), Preference Task (D) and Rating Task (E). Items marked in blue show the participant’s actions.

Figure 3. Stimuli example and corresponding probabilities. Each round had 8 symbols in total. Each symbol pair corresponded to either the gain or loss domain and to either the anxious or safe (emotion) condition. Each pair contained a good symbol, which led to the preferable outcome 75% of the time. In the gain domain, this was the symbol that resulted in gaining 0.50 euro 75% of the time. In the loss domain, it was the symbol that incurred losses just 25% of the time.
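The outcome contingencies described in this caption can be illustrated with a short Python sketch. It is not the code used in the experiment (which was written in Matlab). The names sample_outcome and expected_value are hypothetical, the 25% preferable-outcome rate for the other symbol of each pair is taken from the rating-task results, and the size of a loss is assumed here to equal the 0.50 euro gain, which the caption does not state.

```python
import random

# Probability of the preferable outcome for the good and the bad symbol of a pair.
P_PREFERABLE = {True: 0.75, False: 0.25}
# Gain size from the caption; the loss size is an ASSUMPTION (same magnitude).
AMOUNT_EUR = 0.50


def sample_outcome(domain, is_good_symbol, rng=random):
    """Draw one monetary outcome (in euro) for a chosen symbol."""
    if domain not in ("gain", "loss"):
        raise ValueError("domain must be 'gain' or 'loss'")
    preferable = rng.random() < P_PREFERABLE[is_good_symbol]
    if domain == "gain":
        # Preferable outcome in the gain domain: winning money.
        return AMOUNT_EUR if preferable else 0.0
    # Preferable outcome in the loss domain: avoiding a loss.
    return 0.0 if preferable else -AMOUNT_EUR


def expected_value(domain, is_good_symbol):
    """Long-run average outcome (in euro) of repeatedly choosing a symbol."""
    if domain not in ("gain", "loss"):
        raise ValueError("domain must be 'gain' or 'loss'")
    p = P_PREFERABLE[is_good_symbol]
    if domain == "gain":
        return p * AMOUNT_EUR
    return -(1 - p) * AMOUNT_EUR


for domain in ("gain", "loss"):
    for is_good in (True, False):
        print(domain, "good" if is_good else "bad", expected_value(domain, is_good))
```

Under these assumptions the good and bad symbols differ by the same expected amount (0.25 euro per choice) in both domains, so the two domains differ in valence but not in how informative the feedback is.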

Figure 4. Averaged calibration data. The discomfort ratings given by participants are plotted as blue bars (SD shown as blue glow bars) and show no significant differences. Actual intensity is plotted in orange with orange glow error bars and shows a significant increasing trend.

Figure 5. Raw data traces averaged across participants, showing trial-by-trial changes in performance and reaction time by emotion and domain. Although no pairwise analyses were performed for the 2x2x24 ANOVAs for reaction time and performance, there seem to be trends visible in the data. Different lines represent the different conditions.

Figure 6. Average post-test results. Ratings are plotted in orange, with their standard deviation shown as an orange glow. Similarly, preference choices are plotted in blue, with the standard deviation shown as a blue glow. The two measures overlap, and both the implicit (preference) and explicit (rating) measures show learning of the preferable symbol. Different conditions are separated by a dotted line. Each marker represents the average measure for that condition’s symbol.

Supplementary Materials

Table of contents

Appendix - Hardware
    DS5 - Bipolar constant current stimulator
    Custom-made amplifier
    PC
    Other (alcohol swabs, gel, tape)

Appendix - Software
    Matlab
    ToS_Calibration
    ToS_Learning_v3
    ToS_PostTestBehaveGraded
    rating_task
    Vsrrp
    NI_Controller

Appendix - Supporting Documents
    Instructions and Consent Form.
    Researcher’s Checklist.
    Payment Slip.

Appendix - Participant Communication
    Questionnaires.
    Invitation.
    Reminder.

Appendix - Questionnaires
    Before arrival.
    PANAS.
    BDI.
    BAI.
    Exit Questionnaire.

Appendix - Stimuli

Appendix - Hardware

DS5 - Bipolar constant current stimulator

Device used to administer shocks. Input and output voltages can be regulated as desired, and shocks are administered with a wrist electrode attached with Velcro. Settings for the device include backlight, current output and input, among others, and were applied in the accompanying Matlab code. In order to reduce impedance, the skin surface where the electrodes are applied can be cleaned with alcohol; impedance can be further reduced using electrode gel.

Custom-made amplifier

Amplifier purchased from the University of Amsterdam’s Technical Support for Social and Behavioral Sciences (TOP) department. It included a pair of sintered Ag/AgCl EMG electrodes connected to a custom-made amplifier with an input resistance of 1GΩ and a bandwidth of 5-1000Hz (6dB/oct). Electrodermal activity (Skin Conductance Level; SCL) was measured with a sine-wave-shaped excitation voltage (1V pk-pk, 50Hz). The SCL circuit measures the current flowing through the skin from the output electrode to a GND electrode and converts this current to a conductance value. Manipulation, recording and specifications were applied through an xml file fed directly into the program as a driver.

PC

A computer was required to run the experiment and its software (Matlab), present the experiment (screen), record responses and interface with the researcher (mouse and keyboard). Additionally, it must allow for the connection, manipulation, control and recording of all devices involved.

Other (alcohol swabs, gel, tape)

Practical issues also required preparation and additional materials. The electrodes used for recording with the aforementioned devices required conducting gel to improve the signal and tape to hold the skin conductance electrodes in place. Additionally, alcohol swabs were used to clean the electrodes as well as the participant’s skin surface to improve the signal.

Appendix - Software

Matlab

Matlab 2017 was used with a purchased personal student license. It requires the Data Acquisition Toolbox for interfacing with the shocker. Additionally, all stimulus presentation and tasks were implemented using Cogent 2000, developed by the Cogent 2000 team at the FIL and the ICN, and Cogent Graphics, developed by John Romaya at the LON at the Wellcome Department of Imaging Neuroscience (www.vislab.ucl.ac.uk/cogent.php). Code used in the task consists of the following:

• ToS_Calibration

o Used to calibrate the shock for each participant, to take into account that perception of pain and skin resistance can vary across participants. This is done by calling the functions shock_in_block and shock_setup, which set the input parameters for the shocker device; this experiment’s parameters set the input voltage to 5V and the output to 25mA for all participants. The screen shows the instruction “Press enter to be shocked”, after which the participant presses enter and receives the shock. After the shock is administered, participants rate their pain perception of it on a scale from 1, Not painful, to 10, Extremely painful. The intensity of the shocks increases and decreases in steps of 10% of the maximum output indicated in the parameters. After the second shock, if the participant rates a shock as less than 7, the intensity will be increased by one step. Conversely, if after the first shock the participant rates a shock as higher than 9, the intensity will be decreased by one step. The minimum intensity is 2.5mA, namely 10% of the maximum intensity, which is 25mA. Once two consecutive shocks are rated as higher than 7, the intensity of the last shock administered is the value that will be used throughout the subsequent task.

o Requires:

▪ Data Acquisition Toolbox

▪ Input parameters provided by shock_in_block and shock_setup

o Output:

▪ A mat file with the intensity and subjective rating of the participant. The name of this file is Sub[participant number]_calib_[iteration]_times.mat

• ToS_Learning_v3

o The task was adapted from code used and provided by Stefano Palminteri (2006). This script presents the learning task itself and records the participant’s responses and results. The task presents a pair of symbols on the left and right sides of the fixation cross; these symbols are randomly assigned for each participant to represent an advantageous and a disadvantageous option in either the loss or gain domain. The task consists of 96 trials, grouped in anxiogenic and anxiolytic blocks of 3 trials each. There is a pause halfway through the session, namely after 48 trials have been completed. The inter-trial interval is jittered and lasts between 1 and 6 seconds. Additionally, the script sends triggers to the shocking device in anxiogenic blocks, as well as markers to the skin conductance response amplifier for later analysis.

o Requires:

▪ Data Acquisition Toolbox

o Output:

▪ Two mat files (Sub[participant #]_Session[session #].mat) listing the symbols to be used in 2 separate iterations of the task

▪ Two mat files showing the participant’s responses, accumulated monetary reward for each run (first half and second half of the session) and reaction times. This file is called Sub[participant#]_ToS_Session[session#]run_[run#].mat

• ToS_PostTestBehaveGraded

o This script presents the participant with a series of binary choices with no anxiety or domain manipulations in order to obtain their implicit internal ratings for the symbols shown in a specified session. Pairs of symbols are shown and the participant selects their preferred symbol. There are no monetary gains or losses, nor feedback, in this task.

o Requires:

▪ Mat file with stimuli presented during the learning task (created by ToS_SCR)

o Output:

▪ A mat file (PostTest_[participant#]) with the participant’s preference in each of the binary preference choices.

• rating_task

o This script asks the participant for their valence rating after each symbol is presented individually. Each symbol is presented and rated on 4 different occasions in a randomized order using a number scale ranging from 1, very negative, to 10, very positive. The symbols presented are those from a specified session.

o Requires:

▪ Mat file with stimuli presented during the learning task (created by ToS_SCR)

o Output:

▪ A mat file with the ratings given to each symbol in each iteration. The file name is RatingData_Sub[participant#].
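The staircase described under ToS_Calibration earlier in this list can be sketched as follows. This is a Python illustration rather than the original Matlab script, and it makes several simplifying assumptions that the description does not settle: the first shock is delivered at the minimum intensity, the adjustment rules are applied after every shock, a rating of exactly 7 leaves the intensity unchanged, and the stopping rule is checked before any adjustment. The function name calibrate and its parameters are hypothetical.

```python
def calibrate(ratings, max_ma=25.0, n_steps=10, start_step=1):
    """Replay a sequence of pain ratings (1-10) through the calibration staircase.

    Returns (final_intensity_ma, delivered), where final_intensity_ma is None
    if the ratings run out before two consecutive shocks are rated above 7.
    """
    step_ma = max_ma / n_steps      # one step = 10% of the maximum output
    step = start_step               # ASSUMPTION: start at the minimum (2.5 mA)
    delivered = []
    previous = None
    for rating in ratings:
        intensity = step * step_ma
        delivered.append(intensity)
        # Stop once two consecutive shocks are rated higher than 7.
        if previous is not None and previous > 7 and rating > 7:
            return intensity, delivered
        if rating > 9:
            step = max(1, step - 1)          # too painful: one step down
        elif rating < 7:
            step = min(n_steps, step + 1)    # not painful enough: one step up
        previous = rating
    return None, delivered


final, history = calibrate([3, 5, 6, 8, 8])
print(final)    # 10.0
print(history)  # [2.5, 5.0, 7.5, 10.0, 10.0]
```

In this example the intensity rises by one 2.5mA step after each rating below 7 and is then held at 10mA, the level at which two consecutive shocks were rated above 7.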

Vsrrp

Vsrrp98 V10.0 was provided by the UvA with the purchase of a pair of sintered Ag/AgCl electrodes connected to a custom-made bipolar amplifier with an input resistance of 1GΩ and a bandwidth of 5-1000Hz (6dB/oct). The software makes use of an xml driver for recording, analysis and conversion of data from vsrrp files to mat format for analysis in Matlab. The xml code used was:

• SCL - Debug

o Used to record Skin Conductance Level from a pair of Ag/AgCl electrodes taped to the ring and index fingers of the participant, creating a single file with all the data and allowing for its division into blocked segments according to the desired design and the markers sent during the task. Furthermore, this driver allows for conversion to mat files after recording.

o Output:

▪ Vsrrp and mat file with the name specified at the beginning of recording

NI_Controller

Software used by the interface to connect the shocker (DS5) to the main experimental PC. It doesn’t require any inputs besides proper hardware connections.
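To make the trial structure of ToS_Learning_v3 described earlier in this appendix more concrete, the following Python sketch builds one possible session schedule. It is not the original Matlab/Cogent code. The description fixes the 96 trials, the blocks of 3 trials, the pause after trial 48 and the 1 to 6 second jitter; the random block order, the uniform jitter distribution and the even split of 24 trials per emotion-by-domain condition are assumptions made here, and build_schedule is a hypothetical name.

```python
import random

N_TRIALS = 96
BLOCK_SIZE = 3
ITI_RANGE_S = (1.0, 6.0)


def build_schedule(seed=None):
    """Return a list of trial dicts for one session of the learning task."""
    rng = random.Random(seed)
    blocks = []
    for emotion in ("anxious", "safe"):
        # ASSUMPTION: 24 gain and 24 loss trials within each emotion condition.
        domains = ["gain", "loss"] * (N_TRIALS // 4)
        rng.shuffle(domains)
        for start in range(0, len(domains), BLOCK_SIZE):
            block = [(emotion, d) for d in domains[start:start + BLOCK_SIZE]]
            blocks.append(block)
    rng.shuffle(blocks)  # ASSUMPTION: threat and safe blocks in random order

    schedule = []
    for block_index, block in enumerate(blocks, start=1):
        for emotion, domain in block:
            trial_number = len(schedule) + 1
            schedule.append({
                "trial": trial_number,
                "block": block_index,
                "emotion": emotion,
                "domain": domain,
                # ASSUMPTION: jitter drawn uniformly between 1 and 6 seconds.
                "iti_s": rng.uniform(*ITI_RANGE_S),
                "pause_after": trial_number == N_TRIALS // 2,
            })
    return schedule


schedule = build_schedule(seed=1)
print(len(schedule), "trials in", schedule[-1]["block"], "blocks")
```

Because 48 is a multiple of the block size, the mid-session pause always falls between two blocks rather than inside one.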

Appendix - Supporting Documents

Documents used during the lab session with the participants. All documents were retrieved by the researcher for archiving.

Instructions and Consent Form.

This form is given to the participants at the start of the experiment and includes basic information about the research conducted as well as instructions to complete the task and the participant’s payout information; finally, it includes the informed consent for participants to sign if they agree to take part in the experiment.

Researcher’s Checklist.

This document is for the sole use of the researcher and aids in keeping to protocol and standardizing procedure used on each participant.

Payment Slip.

Once all tasks are completed, participants will be handed their payment as calculated in the instruction and consent. They will sign their acceptance of payment as record of its

Appendix - Participant Communication

In order to standardize communication with participants, templates were used for the three times communication took place. Other communication and questions posed by the participants (e.g. location, time clarification) were answered at the researcher’s discretion.

Questionnaires.

Participants who signed up for sessions were contacted individually and asked to take part in the questionnaire via an attached link. Participants who accessed the questionnaire through a mass email were contacted individually once the questionnaire was completed to schedule their sessions. Participants who did not fill in the questionnaire to completion were not eligible to take part in the main experimental task.

Invitation.

After the participants completed the questionnaire, they were invited to the lab at their selected time via the CREED system or asked for their slot preference given the time slots available when the email was sent.

Dear participant,

Thanks for completing the questionnaire for experiment 1711 - "Decision making and electric shock". You are now eligible to participate in the second half of the experiment. As this experiment is done individually, we arrange time slots ourselves so it better suits your schedule. Currently, we have all time slots open during this week (10:00-18:00). The experiment takes approximately 1:30 hours so please let me know what time and date would suit you best and I will confirm its availability and your attendance.

All best, Isabela L.

Reminder.

Approximately 6 hours before the experiment took place, participants were sent an email reminder with the date, time and location of the experiment.

Dear [Participant Name],

You have signed up for the session on [day], [month] [date] at [time] for experiment 1711 - "Decision making and electric shock"

It will take place in E7.20 in the E2 building 7th floor (Not in the PPLE side).

Please let me know if you have any questions, want to reschedule or aren't sure about the room location.

All best, Isabela L.

Appendix - Questionnaires

Before arrival.

These questionnaires were sent to participants before their arrival at the scheduled lab session. Participants who did not complete the questionnaires were not eligible to perform the second half of the experiment in the lab. This battery of tests includes demographic questions as well as the PANAS, BDI, BAI and ERQ, and was administered online via Qualtrics. They are listed below in their original formats.

Exclusion Questions

PANAS.

BDI.

BAI.

ERQ.

Exit Questionnaire.

This questionnaire was completed by participants before leaving the lab. It includes manipulation checks, questions about strategies used and a recognition task for symbols that weren’t present in the experiment.

Appendix - Stimuli

The symbols used during the task are part of the Agathodaimon font. The whole font is shown below. Symbols not used are light gray. Symbols used for the main task are shown in black. Symbols used for the recognition task are shown in red.

[Font sample: Agathodaimon glyphs for the characters A-Z and a-z.]
