
Information or motivation : an fMRI investigation into the effects of positive versus negative feedback


Academic year: 2021


Information or Motivation: An fMRI investigation into the effects of positive versus negative feedback

RESEARCH MASTER’S PSYCHOLOGY INTERNSHIP THESIS

Graduate School of Psychology

Anne-Wil Kramer

Main supervisor: Prof. Dr. H.M. Huizenga

Additional supervisors: Dr. J. Cousijn, Dr. H. Larsen, Dr. I. Visser, Dr. P.J.F. Snellings, & Dr. M.H.T. Zeguers.

Abstract

This study assessed the differential effects of feedback valence on probabilistic learning. A cognitive view proposes that negative feedback carries more informative value, which promotes learning. A motivational account proposes that motivation is required for longer tasks and is achieved by providing positive feedback. Learning is driven by prediction errors (PEs), which are used to update the representation of the expected value of an action. PEs can be positive, when the outcome is better than expected, or negative, when the outcome is worse than expected. We used a mixed probabilistic-reinforcement learning design in which participants had to learn pseudo-words within two conditions: a negative condition in which negative and blank feedback was provided, and a positive condition in which positive and blank feedback was given. Results indicate that participants learned faster at the beginning of the task in the negative condition, but that the effect of feedback valence differed between participants. Moreover, results revealed that positive PEs are mainly represented in (parts of) the ACC as well as in both dorsal and ventral striatum, and that negative PEs are mainly represented in the OFC, insula and IFG. Furthermore, results indicate that the superior temporal gyrus was activated more in response to blank than to positive feedback during positive PEs. No overall conclusion can be drawn as to whether negative feedback facilitates better performance than positive feedback in probabilistic learning.

Feedback-learning and its neural representations

What happens in the brain when humans receive feedback, and how do we learn from feedback? Feedback, or reinforcement, is everywhere around us. In schools, higher education, and work environments, feedback is used as an important communication and learning tool. Multiple theories exist within the reinforcement learning (RL) literature about how humans learn from feedback. One of the most central assumptions within this framework is that people learn from the consequences of their actions.

Sometimes, the consequence of one’s action is different from what is expected. This is called a prediction error (PE): the difference between an expected outcome and an actual outcome. Prediction errors can be positive, when the outcome is better than expected, or negative, when the outcome is worse than expected (Sutton & Barto, 1998). This signal is generally thought of as the engine of learning, because it is used to update expectations to make more accurate predictions, while also updating the expected value of chosen options (Gläscher, Daw, Dayan, & O'Doherty, 2010). The expected value associated with a chosen option will increase after positive PEs and decrease after negative PEs (Van den Bos et al., 2012).
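The update described above can be illustrated with a simple delta rule. This is a minimal sketch, not the reinforcement model fitted in this study: the function name `update_value`, the learning rate of 0.3 and the outcome sequence are all assumptions chosen for illustration.

```python
def update_value(expected_value, outcome, learning_rate=0.3):
    """Return (prediction_error, new_expected_value) for a single trial.

    The learning rate is a hypothetical value, not an estimate from the data.
    """
    # Positive when the outcome is better than expected, negative when worse.
    prediction_error = outcome - expected_value
    # Move the expectation a fraction of the way towards the outcome.
    new_value = expected_value + learning_rate * prediction_error
    return prediction_error, new_value


value = 0.0
for outcome in [10, 10, 0, 10]:  # hypothetical sequence of +10 and blank (0) feedback
    pe, value = update_value(value, outcome)
    print(f"outcome={outcome:>3}  PE={pe:+.2f}  updated value={value:.2f}")
```

In this sketch the expected value rises after each positive PE and falls after the negative PE on the third trial, which is the pattern described in the paragraph above.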

Prediction errors can clearly be registered within the brain. BOLD signals in the ventral striatum and parahippocampal gyrus exhibit response characteristics consistent with dopaminergic input (Delgado, Nystrom, Fissell, Noll, & Fiez, 2000; Delgado, Gillis, & Phelps, 2008; Knutson, Adams, Fong, & Hommer, 2001; Knutson, Taylor, Kaufman, Peterson, & Glover, 2005; Van den Bos et al., 2009) that correlate with positive as well as negative PEs (Haruno & Kawato, 2006; McClure, Berns, & Montague, 2003; O’Doherty, Dayan, Friston, Critchley, & Dolan, 2003). The ventral striatum specifically is thought to play a central role in the RL functional connectivity network, and shows a strong functional connectivity pattern with the medial prefrontal cortex (mPFC). It exhibits BOLD responses correlated with PEs, and the mPFC subsequently uses this information to update the expected value associated with a chosen option (van den Bos et al., 2008). This relation between PEs and subsequent learning is also supported by studies demonstrating that the representation of PEs in the ventral striatum is associated with individual differences on probabilistic learning tasks (Pessiglione et al., 2006; Schönberg et al., 2007).

Two behavioural theories can be distinguished within the RL literature. First, according to a cognitive account, negative reinforcement is more informative because it serves as a source of information necessary to correct incorrect responses (Kulhavy & Stock, 1989). Negative feedback thus improves performance on RL tasks. Second, a motivational account proposes that positive reinforcement promotes people’s motivation to learn. According to this view, when a task takes longer and requires motivation to complete, positive feedback would be preferred over the more informative negative feedback and would improve performance.

However, the aforementioned studies used mixed reinforcement schemes in which positive feedback was given in response to correct answers, and negative feedback in response to incorrect answers within the same condition. Such a design makes it difficult to distinguish between the differential contributions of feedback valence to learning. The concept of blank feedback may offer a solution to this problem. Blank feedback can be described as not receiving feedback, but it can hold informative value at the same time. For example, when people know that they receive either positive or blank feedback, they will eventually infer that blank feedback means ‘incorrect’. Then, what happens when people receive either positive-blank feedback or negative-blank feedback? This can be tested in unmixed designs with separate conditions: one in which positive feedback is given to correct answers and blank feedback to incorrect answers, and one in which blank feedback is given to correct answers and negative feedback to incorrect answers.

This study attempted to integrate the two behavioural accounts of feedback valence and re-test the differential contribution of feedback valence to both learning and the end point of learning. Furthermore, we tried to test the differential contributions of the ventral-striatal network implicated in RL in response to receiving positive, blank or negative feedback. We expected that people would perform better and would show more activation in the cortico-striatal network when receiving negative-blank feedback as compared to positive-blank feedback in the beginning of the task, because of the more informative value of negative feedback. However, we expected people to perform better and show more activation in the aforementioned network when receiving positive-blank feedback as compared to negative-blank feedback at the end of the task, when motivation is required.

Materials and Methods

Participants

A total of 30 participants aged between 21 and 31 years (M = 24, SD = 3) took part in this study, 12 of whom were male. Most of them were recruited via a convenience sample from an earlier study. Additionally, some senior students of the University of Amsterdam were asked to participate. None of the participants reported having dyslexia or any psychiatric disorder. All participants gave written informed consent after reading the complete research description, which was approved by the ethical committee of the Psychology Department of the University of Amsterdam. Participants were compensated for their participation.

Pseudo Word Task (PWT)

To measure feedback-induced brain activity, an event-related fMRI pseudo word task (PWT) was designed in the program Presentation. Within this task, participants had to learn the correct spelling of pseudo words. On each trial, they were presented with two spellings (e.g., strauk and strouk; see Appendix A for all stimuli) together with a picture, and had to learn, given probabilistic feedback (65%), which of the two spellings was correct. The idea was that, after being presented with the same word pair a few times, participants would learn that one of the pseudo word spellings was most often correct and would thus choose that word consistently throughout the task. The stimuli for the PWT were one-syllable pseudo words with two alternative homophone spellings that include the letter (combination)s au/ou, ij/ei, or g/ch. Criteria for pseudo word construction were as follows: 1) five or six letters; 2) including one of the above letter combinations; 3) following Dutch language rules; and 4) differing by at least one letter from existing Dutch words. Twenty-four pseudo word pairs were constructed for the purpose of this study.

Task Design

We used a repeated-measures design in which all participants completed two testing days, with feedback valence (positive or negative) varying between days. In the positive condition, participants received positive feedback (+10) when they pressed the button corresponding to the ‘correct’ pseudo word, and blank feedback (0) otherwise. It was also indicated that the computer sometimes errs, and that positive feedback might thus be given to incorrect words and blank feedback to correct words. In the negative condition, participants received negative feedback (-10) when they chose the ‘incorrect’ pseudo word and blank feedback otherwise. Again, it was indicated that sometimes negative feedback could be given to correct words and blank feedback to incorrect words.

One testing day consisted of three blocks of the PWT, all in one valence condition. The first block served as practice and was conducted outside the MRI scanner on a computer in a quiet room. The second and third testing blocks took place inside the MRI scanner, and the order of those blocks was counterbalanced. Within a testing block, four pseudo word pairs were presented 24 times each, resulting in a total of 96 trials per block. One of the arbitrary pseudo word spellings had a higher chance (65 percent) of being followed by positive or negative feedback throughout the block. Participants were instructed to try to choose, on each trial, the spelling with the highest chance of positive feedback or the lowest chance of negative feedback, depending on valence condition.
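The block structure and the probabilistic feedback described above can be sketched as follows. This is an illustrative reconstruction, not the Presentation script used in the study: the names `feedback` and `build_block`, the random shuffling of trials and the fixed seed are assumptions, while the counts (4 pairs, 24 repetitions) and the 65% validity come from the text.

```python
import random

N_PAIRS = 4          # pseudo word pairs per block (from the text)
N_REPETITIONS = 24   # presentations per pair (from the text)
VALIDITY = 0.65      # probability that feedback matches the choice (from the text)


def feedback(chose_correct, condition, validity=VALIDITY, rng=random):
    """Return the feedback shown on one trial: +10, 0, or -10."""
    # With probability `validity` the feedback reflects the actual choice;
    # otherwise the "computer errs" and signals the opposite.
    congruent = rng.random() < validity
    signalled_correct = chose_correct if congruent else not chose_correct
    if condition == "positive":
        return 10 if signalled_correct else 0
    if condition == "negative":
        return 0 if signalled_correct else -10
    raise ValueError("condition must be 'positive' or 'negative'")


def build_block(rng):
    """Return a shuffled list of pair indices, one entry per trial.

    Shuffling is an assumption; the text does not describe the trial order.
    """
    trials = [pair for pair in range(N_PAIRS) for _ in range(N_REPETITIONS)]
    rng.shuffle(trials)
    return trials


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
trials = build_block(rng)
outcomes = [feedback(True, "positive", rng=rng) for _ in trials]
print(len(trials), "trials;", outcomes.count(10), "rewarded when always choosing correctly")
```

Even a participant who always chooses the ‘correct’ spelling receives blank feedback on roughly a third of trials in this sketch, which illustrates why several repetitions per word pair are needed before a preference can emerge.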

After participants completed the three blocks, a dictation task was conducted which served as a retention measure. In this task, participants were auditorily (through headphones) presented with the pseudo words they just ‘learned’ in the scanner. After hearing a pseudo word, they typed the spelling they thought was ‘correct’.

Figure 1. Task design: trial example negative condition (first) and positive condition.

Additional tasks

After the PWT and dictation task, participants completed five other measures, to control for individual differences that might influence results on the PWT. First, the Sensitivity to Punishment and Sensitivity to Reward Questionnaire (SPSRQ; Carver & White, 1994) was used. This test was included to measure potential individual differences in participants’ sensitivity to the differential feedback. In this test, questions such as “Can the opportunity to earn money motivate you strongly to do things?” are asked. Participants answer with ‘yes’ or ‘no’.

Second, we used the One-minute-test (één-minuut-test, EMT) to measure technical Dutch reading skills. Participants had one minute to read as many words of increasing difficulty as they could. Third, we used the Klepel non-word reading task to measure potential individual differences in participants’ ability to decode (non-existing) words. This was especially important since we used pseudo words: technical reading skills are a prerequisite for reading pseudo words, and differences in these skills may contribute to differences in performance on the PWT. In this test, participants are asked to read a list of non-words in two minutes.

Fourth, the synonyms subtest from the General Aptitude Test Battery (GAT-B) was used to measure vocabulary and verbal IQ. In this test, participants indicate whether given words are synonyms or antonyms of a certain target word. Individual differences on this test might contribute to individual differences on the PWT, since differences in verbal IQ will most likely contribute to differences in the speed of learning spelling rules and encoding the value of blank feedback.

Finally, the folding patterns subtest from the GAT-B was included to measure individual differences in nonverbal IQ. Individual differences here may also contribute to differences in the speed of learning rules and encoding the value of blank feedback. In this test, participants saw a folding pattern together with four geometrical figures. They had to choose which of these figures could be formed when folding the folding pattern in a designated way.

fMRI acquisition

fMRI data were collected using a 3T MRI scanner (Philips Achieva, Amsterdam, The Netherlands). Visual stimuli were back-projected onto an IPS LCD screen (31.55" diagonal) using a DLP beamer, and participants viewed the stimuli through a mirror on the head coil. A structural T1 scan was acquired at the start of each first session (T1 turbo field echo, TR = 8281 ms, TE = 3.8 ms, slice thickness = 1 mm, FOV = 240 x 220 mm, in-plane resolution = 240 x 240 mm, flip-angle = 8°). During the PWT, BOLD signal was measured with a T2* gradient-echo echo-planar imaging (EPI) sequence (TR = 2000 ms, TE = 27.63 ms, number of slices = 37, slice thickness = 3 mm, interslice gap = 0.3 mm, FOV = 240 x 240 mm, in-plane resolution = 80 x 80 mm, flip-angle = 76.1°).

Results

The aim of this study was to examine differences in brain and behavioural responses to positive-blank versus negative-blank feedback, and to identify brain regions showing BOLD responses for these contrasts as well as overall BOLD responses to feedback (prediction errors). Due to technical issues, one participant had to be excluded from further analysis, leaving a total of 29 participants. The reinforcement model showed that [..] in the [..] condition. Imaging analysis revealed [..] effect.

Behavioural results

We divided the data into 8 bins of 12 trials each to look at the learning rate as well as at the end-point of learning. A repeated measures analysis of variance (ANOVA) revealed no significant effect of valence (F(1, 28) = 0.53, p = 0.475) or block (F(1, 28) = 1.61, p = 0.215). A significant effect was found for bin (F(7, 22) = 8.45, p < .001), indicating that ‘percentage correct’ increased over bins, which reflects a learning effect. Although there were no significant effects of valence or block, we observed a steeper learning rate in the first block in the negative-blank condition than in the positive-blank condition.

Additionally, we observed a higher end-point of learning in both blocks for the negative-blank condition in terms of percentage correct at the end of the PWT (Fig 2).
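The binning used for these learning curves can be sketched as follows, assuming that each block's 96 trials are split in presentation order into 8 consecutive bins of 12 trials. The function name `accuracy_per_bin` and the toy response vector are hypothetical and do not reproduce the study data.

```python
def accuracy_per_bin(correct, n_bins=8):
    """Return the percentage of correct choices in each consecutive bin.

    `correct` is a list of 0/1 values, one per trial, in presentation order.
    """
    if len(correct) % n_bins != 0:
        raise ValueError("number of trials must be divisible by n_bins")
    bin_size = len(correct) // n_bins
    return [
        100.0 * sum(correct[start:start + bin_size]) / bin_size
        for start in range(0, len(correct), bin_size)
    ]


# Toy block of 96 trials: chance-level choices early on, consistent choices later.
toy_block = [0, 1] * 24 + [1] * 48
print(accuracy_per_bin(toy_block))
```

The first value of the returned list corresponds to the start of learning and the last value to the end-point of learning within a block.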

Figure 2. Learning rates for both valence conditions in the two blocks.

To investigate potential between-subjects differences, we performed a multilevel regression analysis with the following regression model:

y(ij) = b0 + b1 * valence(ij) + b2 * block(ij) + b3 * bin(ij) + b4 * valence * block + u(1j) + e(ij)

In this equation, y(ij) denotes the score on the dependent variable (percentage accurate) at timepoint i in person j, valence(ij) denotes the valence condition in which timepoint i is contained for person j (coded as 1 for the positive-blank condition and 2 for the negative-blank condition), bin(ij) denotes timepoint i for person j, and block(ij) denotes the block in which timepoint i is contained for person j (block can be 1 or 2). The term u(1j) denotes the random effect for person j, and the term e(ij) denotes the residual at timepoint i in person j. The residuals were correlated according to an auto-regressive structure, Wald Z = 8.719, p < .001.

Results revealed a significant effect of the first block, β = 0.09, t(211.496) = 3.16, p < .01, indicating that participants learned faster in the first block as compared to the second block. Furthermore, there was a significant random effect for valence, β = 0.01, Wald Z = 3.06, p < .01, indicating that the effect of valence condition on accuracy significantly differed between participants. No significant interaction effect between block and valence was found.

fMRI pre-processing

FMRI data processing was carried out using FEAT (FMRI Expert Analysis Tool) Version 6.00, part of FSL (FMRIB's Software Library, www.fmrib.ox.ac.uk/fsl). Registration of the functional data to the high resolution structural images was carried out using FLIRT (Andersson 2007a, 2007b). The following pre-statistics processing was applied: motion correction using MCFLIRT (Jenkinson, 2002); slice-timing correction using Fourier-space time-series phase-shifting; non-brain removal using BET (Smith, 2002) with thresholds depending on manually assessed quality; spatial smoothing using a Gaussian kernel of FWHM 5 mm; grand-mean intensity normalisation of the entire 4D dataset by a single multiplicative factor; high-pass temporal filtering (Gaussian-weighted least-squares straight line fitting, with sigma = 45.0 s, 1/90 Hz). Time-series statistical analysis/prewhitening was carried out using FILM with local autocorrelation correction (Woolrich, 2001).
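The two high-pass filter values reported above are mutually consistent if the Gaussian sigma is taken to be half the cutoff period, which is the relation assumed in this short worked example. The function name `highpass_sigma` is hypothetical; the TR of 2 s comes from the acquisition parameters.

```python
def highpass_sigma(cutoff_s, tr_s):
    """Return (sigma in seconds, sigma in volumes) for a high-pass cutoff period.

    Assumes sigma = cutoff / 2, which matches the reported 45 s for a 90 s cutoff.
    """
    sigma_s = cutoff_s / 2.0
    sigma_volumes = sigma_s / tr_s
    return sigma_s, sigma_volumes


sigma_s, sigma_volumes = highpass_sigma(cutoff_s=90.0, tr_s=2.0)
print(f"cutoff 1/90 Hz -> sigma = {sigma_s} s = {sigma_volumes} volumes at TR = 2 s")
```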

The time series model for the first-level analysis included the following six regressors: the onset of the negative feedback in both blocks, the onset of the positive feedback in both blocks and the onset of the missing trials in both blocks.

fMRI results: whole brain analysis

The second level analysis averaged first-level contrast estimates over sessions across subjects, and was carried out using FLAME (FMRIB's Local Analysis of Mixed Effects) stage 1 with automatic outlier detection (Beckmann, 2003; Woolrich 2004; Woolrich, 2008). Across all participants, the average difference between positive and negative PEs (main effect positive > negative) correlated significantly with BOLD responses in the medial PFC, (para)cingulate gyrus, putamen bilaterally (part of dorsal striatum), and the right accumbens (Fig 3, Table 1), indicating that those regions were more activated in response to positive than to negative PEs, regardless of valence condition. The average difference between negative and positive PEs (main effect negative > positive) correlated significantly with BOLD responses in the medial PFC (OFC), insula, and IFG (Fig 3, Table 1), indicating that those regions were more activated in response to negative than to positive PEs, regardless of valence condition. Note that the vectors for negative feedback events contained all negative prediction errors regardless of feedback valence condition (i.e. also blank feedback in the positive condition) and the vectors for positive feedback events contained all positive prediction errors regardless of condition (i.e. also blank feedback in the negative condition). Furthermore, the positive > negative PEs main effect was thresholded at Z = 3.8 (Fig 3) to extract only the four main clusters among more than 30 significant (small) clusters for display purposes.
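The way feedback events were assigned to the positive and negative PE vectors can be made explicit with a small lookup. This is a sketch of the labelling rule stated above, not the actual analysis code: the names `PE_LABEL`, `pe_label` and `group_onsets` and the onset times are hypothetical, and events from both conditions are mixed in one list only for illustration (in the study each session contained a single condition).

```python
from collections import defaultdict

# (valence condition, feedback shown) -> PE label used in the analysis
PE_LABEL = {
    ("positive", 10): "positive",   # +10 in the positive condition
    ("positive", 0): "negative",    # blank in the positive condition
    ("negative", 0): "positive",    # blank in the negative condition
    ("negative", -10): "negative",  # -10 in the negative condition
}


def pe_label(condition, feedback):
    """Return 'positive' or 'negative' for one feedback event."""
    try:
        return PE_LABEL[(condition, feedback)]
    except KeyError:
        raise ValueError(f"unexpected event: {condition!r}, {feedback!r}") from None


def group_onsets(events):
    """Group feedback onsets (in seconds) by PE label."""
    onsets = defaultdict(list)
    for condition, feedback, onset in events:
        onsets[pe_label(condition, feedback)].append(onset)
    return dict(onsets)


events = [  # hypothetical onsets
    ("positive", 10, 4.0),
    ("positive", 0, 12.0),
    ("negative", 0, 20.0),
    ("negative", -10, 28.0),
]
print(group_onsets(events))
```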

Whole brain regression analysis for valence differences across participants between sessions revealed some significant results. The contrast negative condition > positive condition, which investigated potential differences in BOLD responses to negative > positive PEs between feedback valence conditions, revealed significant differences in the occipital lobe and the fusiform gyrus specifically (Fig 4, Table 1), indicating that negative PEs were more pronounced in those regions in the negative condition than in the positive condition. The contrast negative condition > positive condition, which looked at differences in BOLD responses to positive > negative PEs between feedback valence conditions, revealed a significant difference in the superior temporal gyrus, indicating that positive PEs were more pronounced in this region in the negative condition than in the positive condition. In other words, this region was more activated in the negative condition after positive feedback (i.e. blank in this context) than in the positive condition after positive feedback (Fig 4, Table 1). The contrasts positive condition > negative condition revealed no significant differences for either positive or negative PEs between feedback valence conditions. Additionally, no significant differences were found between blocks within sessions, nor between blocks across sessions.

Table 1. Brain regions activated by prediction errors: main effects and condition differences.

Cluster size (voxels)   Brain region                                         Hemisphere   MNI x   MNI y   MNI z   Zmax

Negative > positive PEs, negative > positive condition
2199                    Occipital cortex                                     R            33      18      37      6.91*
1770                    Occipital fusiform gyrus, lateral occipital cortex   L            57      19      32      6.56*

Negative > positive PEs, main effect
1247                    mPFC                                                 R/L          8       24      40      5.36*
603                     mPFC (OFC), insula, IFG                              R            46      18      2       4.79*
425                     mPFC (OFC), insula, IFG                              L            -32     28      0       4.34*

Positive > negative PEs, negative > positive condition
299                     Superior temporal gyrus                              R            64      -20     2       3.57*

Positive > negative PEs, main effect
1076                    mPFC, paracingulate gyrus                            L/R          6       52      -8      5.75**
2785                    Cingulate gyrus                                      L/R          -4      -32     48      5.7**
859                     Putamen                                              L            -18     10      -10     6.36**
489                     Putamen, accumbens                                   R            12      10      -14     6.26**

MNI coordinates of maximum Z-scores are shown for each cluster. *Significant at the Region of interest level, Z > 2.6, corrected p < .05. **Significant at the Region of interest level, Z > 3.8, cluster-corrected p < .01. L = left, R = right, MNI = Montreal Neurological Institute. IFG = inferior frontal gyrus, mPFC = medial prefrontal cortex, OFC = orbitofrontal cortex.

Figure 3. Negative > positive PEs main effects; medial PFC (OFC), insular cortex, IFG (left). Positive > negative PEs main effects; medial PFC, (para)cingulate gyrus, precuneus, putamen, accumbens (right).

Figure 4. Negative PEs, contrast negative > positive (left); occipital lobe. Positive PEs, contrast negative > positive (right); superior temporal gyrus.

Taken together, the behavioural results indicate that participants learned faster in the first block than in the second block and that the effects of valence condition on learning differed between participants. The fMRI results showed activation in different brain regions for processing negative and positive PEs respectively. In addition, results indicate that there was more overall activation in response to positive > negative PEs than to negative > positive PEs. No ROI analyses were carried out due to time limitations of the current project.

Discussion

This study investigated the effect of negative-blank versus positive-blank feedback on probabilistic learning and the end-point of learning. The behavioural results indicate that participants learned faster in the first block as compared to the second block, regardless of feedback valence. Neuroimaging analyses revealed that different brain regions encode positive and negative PEs.

Behavioural results

The behavioural results showed no main effect of valence on learning speed. The results indicate, however, that the effect of feedback valence differed between participants. This might explain why no overall valence effect was found; individual differences in sensitivity to reward and sensitivity to punishment are well documented (Santesso, Dzyundzyak & Segalowitz, 2011; Caseras & Torrubia, 2003). This was also the reason why we included the SPSRQ (Carver & White, 1994). For example, Boksem and colleagues (2006) reported that subjects scoring high on the behavioural inhibition scale (BIS), which is similar to the SP scale of the SPSRQ, showed higher error-related negativity (ERN), reflecting better inhibitory control. Based on the current data, we could further distinguish two groups by participants’ SPSRQ score to see whether one group, which may be more sensitive to reward, shows a valence effect of positive-blank over negative-blank feedback, whereas the opposite may be true for a second group that may be more sensitive to punishment. This was, however, beyond the scope of the current paper.

Another behavioural outcome, that learning was more pronounced in the first block than in the second block regardless of feedback, is not in line with our hypothesis that this would be true for the first block of the negative condition but that the opposite would be true for the second block of the positive condition. A failure to find better performance in the second block of the positive condition might be due to a lack of motivation, since the first block alone took 11 to 13 minutes, after which a second block had to be completed for a similar amount of time. Together with participants reporting that they thought the task was rather boring and the observation that some of them were falling asleep, especially in the second block, this might have resulted in a lack of motivation throughout the second block regardless of feedback condition. While we intended to make the task considerably uninteresting in the second block to test the motivational hypothesis, this result implies that the required motivation was at least not provided by positive-blank feedback. A possible solution may be to use more encouraging positive feedback, such as text saying ‘well done!’ or a green thumb pointing upwards. We, however, chose ‘+10’, ‘0’ and ‘-10’ to avoid possible confounding activation due to associations people may have with coloured, recognizable pictures signalling correct/incorrect responses. Another option is to shorten the task so that participants may be more motivated in the second block.

Taken together, the behavioural results do not provide direct evidence for a cognitive or motivational account of feedback learning. With caution, we may say that those results lean more towards a cognitive account, because we hypothesised that no motivation is yet required in the first block, and in the first block of the negative condition we observed a steeper learning rate and higher end-point compared to the first block of the positive condition. It would be interesting to compare dictation scores of both conditions to see whether feedback valence influenced retention rates. This was, however, beyond the time scope of the current paper.

fMRI results: positive and negative PEs

Consistent with earlier studies, positive as well as negative PEs correlated with BOLD responses in a network of areas, including the medial PFC and regions of the striatum (Van den Bos et al., 2012; McClure et al., 2003; O’Doherty et al., 2003). Interestingly, however, we found some regions that were activated by negative PEs but not by positive PEs and vice versa. For positive PEs, the (para)cingulate gyrus (part of the anterior cingulate gyrus/cortex), bilateral putamen (part of dorsal striatum) and the right accumbens (part of ventral striatum) showed increased activity. Regions that were only activated by negative PEs were the insula, orbitofrontal cortex (OFC) and inferior frontal gyrus (IFG). This result is consistent with a meta-analysis by Garrison, Erdeniz, and Done (2013) showing that striatal areas are activated by rewarding reinforcers, and that the insula is activated by aversive reinforcers.

Activation found in the (para)cingulate gyrus, putamen and accumbens for positive PEs is also consistent with other studies showing similar results for tasks in which subjects had to choose between actions with probabilistic rewards, and which distinguished the involvement of the striatum (Elliot et al., 1999) and anterior cingulate cortex (ACC) (Bush et al., 2002). In addition, Rogers and colleagues (2004) showed that the ACC and striatum were associated with good outcomes (winning) as compared to bad outcomes (loss), a result that the current findings reflect. A meta-analysis showed that the (para)cingulate gyrus and right accumbens were more activated by positive than negative reinforcement (Silverman, Jedd & Luciana, 2015), consistent with the current results. Thus, the current results confirm the involvement of (part of) the ACC, striatum and right accumbens in positive PEs.

Activation of the insula and OFC for negative PEs corresponds with earlier research demonstrating a link between the OFC and insula, representing punishing outcomes and error awareness (O’Doherty et al., 2003). Additionally, a study of subjects with OFC lesions showed that those people failed to perform above chance level on a probabilistic feedback-learning task (Tsuchida, Doll & Fellows, 2010). The authors concluded that the OFC is of main importance for interpreting feedback and for flexible stimulus-reinforcement learning in humans. Within this line of thought, our finding that the OFC was activated in response to negative PEs and not positive PEs indicates that subjects interpreted negative feedback, regardless of condition (i.e. negative or blank), in a different way than positive feedback and that they experienced it as signalling that they had made an incorrect response. Activation of the IFG has furthermore been associated with a decrease in the probability of making a risky choice in a probabilistic choice task (Christopoulos et al., 2009). IFG activation in response to negative PEs may then suggest that when people receive negative feedback, regardless of condition, this decreases their tendency to choose that option again, thereby going for a safer option. In summary, the current results confirm the involvement of the insula, OFC and IFG in negative PEs.

The mPFC was activated in response to positive as well as negative PEs. While the function of the mPFC remains in dispute, a simulation study by Alexander and Brown (2011) suggests a general function, namely that the mPFC is concerned with learning and predicting the likely outcomes of actions, whether they are good or bad. This view is in accord with the earlier proposed mechanism that the mPFC uses striatal information to update the expected value associated with a chosen option (van den Bos et al., 2008). The OFC, however, which is part of the mPFC, was more activated during negative than positive PEs, indicating that this part of the mPFC may not be independent of valence and may signal negative PEs specifically.

fMRI results: condition differences

Regarding the activation of the superior temporal gyrus during positive PEs, which was more pronounced in the negative than positive condition, some studies have found that activation of this region is associated with feelings of (social) acceptance and positive feedback (Guyer, Choate, Pine & Nelson, 2011; Allison, Puce & McCarthy, 2000). We may therefore conclude that blank feedback in the negative condition was correctly interpreted as signalling a correct response. Interestingly, it thus seems that the blank feedback in the negative condition elicited more positive feelings than the positive feedback in the positive condition.

The aforementioned meta-analysis by Silverman et al. (2015) did not find any brain region that was activated more by negative than by positive feedback. In the current study, only areas in the occipital lobe were found to be more activated during negative PEs in the negative condition than in the positive condition. This result, however, is probably due to the absence of an inter-stimulus interval between the stimulus and the feedback. The pictures accompanying the pseudo-words in the PWT may by accident have elicited a higher BOLD response in the occipital lobe in the negative feedback condition, since they were different from the pictures in the positive feedback condition. This BOLD response may subsequently have become intermingled with the BOLD response to the onset of the feedback, resulting in what looks like a higher BOLD response to the negative feedback in the negative condition within the occipital lobe. A possible solution is to use the same pictures for both conditions, and to use a jittered inter-stimulus interval with a duration of at least 1000 ms. Another option would be to model out the onset of the stimuli, but that was beyond the scope of the current paper.

Limitations

A possible limitation of the current study is that different words were used in the positive and the negative condition. For the positive condition, only pseudo-words with ou/au were used, whereas for the negative condition, only pseudo-words with ei/ij were used. We first considered combining ou/au and ei/ij words within one condition, but after piloting with more than ten people we concluded that this would make the task too difficult, since the pilot results showed no learning above chance level. However, it might be that learning words with ei/ij is somehow easier than learning words with ou/au, resulting in faster learning within the negative condition. In a replication study, it is therefore recommended to counterbalance the words used over feedback valence conditions.
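As a minimal sketch of the recommended counterbalancing, the assignment of word sets to feedback conditions could alternate across participants. The function name `assign_word_sets` and the alternation by participant index are hypothetical; other schemes (for example randomised assignment within blocks of participants) would serve equally well.

```python
def assign_word_sets(participant_index):
    """Alternate which word set is paired with which feedback condition.

    Even-numbered participants receive the pairing used in the current
    study (ei/ij in the negative condition, ou/au in the positive
    condition); odd-numbered participants receive the reverse pairing.
    """
    word_sets = ("ei/ij", "ou/au")
    if participant_index % 2 == 0:
        negative, positive = word_sets
    else:
        positive, negative = word_sets
    return {"negative": negative, "positive": positive}


# Example: pairings for the first four (hypothetical) participants
pairings = [assign_word_sets(i) for i in range(4)]
```

With such a scheme, any difference in difficulty between the two word sets is spread evenly over the two feedback conditions.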

Within this line of thought, it may also be that some words, regardless of whether they contain ei/ij or au/ou, were easier to learn than others. We could therefore analyse the learning rate and end-point of learning per word pair, distinguish words that were easier or harder to learn and, for example, use the easier words in a continuation study. In addition, the percentage of trials on which congruent feedback was given may have been too low for participants to fully learn all the words. This percentage was 65%, meaning that on 35% of trials the feedback did not match the response (for example, participants chose the correct word but feedback indicated otherwise). This may have led participants to focus on only a subset of the word pairs, resulting in a slower learning rate and a lower percentage correct at the end of the task. We decided to use 65% after piloting with more than ten highly motivated research master students, after which we concluded that 70% or 75% was too easy. These students may, however, not be representative of the larger population. It is therefore recommended to re-test this with a more representative sample before deciding what percentage to use.
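The 65% congruency can be illustrated with a minimal sketch of how feedback might be drawn on a single trial. The function name `feedback` and its arguments are hypothetical, and the mapping assumes that blank feedback signals a correct response in the negative condition and an incorrect response in the positive condition; this is not the task code used in the study.

```python
import random


def feedback(condition, correct, p_congruent=0.65, rng=random):
    """Return the feedback shown on one trial.

    With probability p_congruent the feedback matches the response;
    otherwise it signals the opposite outcome.
    """
    # Decide whether this trial's feedback is congruent with the response
    signalled_correct = correct if rng.random() < p_congruent else not correct
    if condition == "positive":
        # Positive condition: positive feedback or blank
        return "positive" if signalled_correct else "blank"
    if condition == "negative":
        # Negative condition: blank or negative feedback
        return "blank" if signalled_correct else "negative"
    raise ValueError("condition must be 'positive' or 'negative'")


# Example: feedback for a correct response in the negative condition
example = feedback("negative", correct=True, rng=random.Random(0))
```

Raising `p_congruent` to 0.70 or 0.75 in such a simulation corresponds to the easier settings that were considered during piloting.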

Finally, people with lower (verbal) IQ, lower technical reading skills and/or lower decoding skills may have experienced more overall difficulty in learning the pseudo-words. This is why we also administered the synonyms and folding patterns subtests of the GAT-B as well as the EMT and Klepel. We could thus distinguish different groups based on their GAT-B and/or EMT and Klepel scores and analyse results separately per group. This was, however, beyond the time frame of the current project.

Conclusion

Taken together, the fMRI results do not confirm the hypothesis that, according to a cognitive view, the striatal-cortical network would be more activated in the negative condition than in the positive condition, especially at the beginning of a probabilistic feedback task, nor that, according to a motivational account, the opposite would hold at the end of this task. The behavioural results also do not confirm this, but do lean more towards a cognitive than a motivational view. What the current results do show is that positive PEs are mainly represented in (parts of) the ACC as well as in both dorsal and ventral striatum, and that negative PEs are mainly represented in the OFC, insula and IFG. Furthermore, results indicate that the superior temporal gyrus was activated more in response to blank than to positive feedback during positive PEs, and that during negative PEs parts of the occipital lobe were more activated in the negative condition. Notable findings include the specificity of the OFC for negative PEs and the indication that blank feedback in the negative condition may have elicited more positive feelings than the positive feedback in the positive condition. Future research may therefore continue to make use of the blank feedback paradigm and investigate specific differences between blank and positive feedback when both signal correct responses.

Literature

Alexander, W. H., & Brown, J. W. (2011). Medial prefrontal cortex as an action-outcome predictor. Nature neuroscience, 14(10), 1338-1344.

Allison, T., Puce, A., & McCarthy, G. (2000). Social perception from visual cues: role of the STS region. Trends in cognitive sciences, 4(7), 267-278.

Andersson, J. L., Jenkinson, M., & Smith, S. (2007). Non-linear optimisation. FMRIB technical report TR07JA1. University of Oxford FMRIB Centre: Oxford, UK.

Andersson, J. L., Jenkinson, M., & Smith, S. (2007). Non-linear registration, aka Spatial normalisation. FMRIB technical report TR07JA2. FMRIB Analysis Group of the University of Oxford, 2.

Beckmann, C. F., Jenkinson, M., & Smith, S. M. (2003). General multilevel linear modeling for group analysis in FMRI. Neuroimage, 20(2), 1052-1063.

Boksem, M. A., Tops, M., Wester, A. E., Meijman, T. F., & Lorist, M. M. (2006). Error-related ERP components and individual differences in punishment and reward sensitivity. Brain research, 1101(1), 92-101.

Bush, G., Vogt, B. A., Holmes, J., Dale, A. M., Greve, D., Jenike, M. A., & Rosen, B. R. (2002). Dorsal anterior cingulate cortex: a role in reward-based decision making. Proceedings of the National Academy of Sciences, 99(1), 523-528.

Carver, C. S., & White, T. L. (1994). Behavioral inhibition, behavioral activation, and affective responses to impending reward and punishment: The BIS/BAS Scales. Journal of personality and social psychology, 67(2), 319.

Caseras, X., Avila, C., & Torrubia, R. (2003). The measurement of individual differences in behavioural inhibition and behavioural activation systems: a comparison of personality scales. Personality and individual differences, 34(6), 999-1013.

Christopoulos, G. I., Tobler, P. N., Bossaerts, P., Dolan, R. J., & Schultz, W. (2009). Neural correlates of value, risk, and risk aversion contributing to decision making under risk. Journal of Neuroscience, 29(40), 12574-12583.

Delgado, M. R., Gillis, M. M., & Phelps, E. A. (2008). Regulating the expectation of reward via cognitive strategies. Nature neuroscience, 11(8), 880-881.

Delgado, M. R., Nystrom, L. E., Fissell, C., Noll, D. C., & Fiez, J. A. (2000). Tracking the hemodynamic responses to reward and punishment in the striatum. Journal of neurophysiology, 84(6), 3072-3077.

Elliott, R., Rees, G., & Dolan, R. J. (1999). Ventromedial prefrontal cortex mediates guessing. Neuropsychologia, 37(4), 403-411.

Garrison, J., Erdeniz, B., & Done, J. (2013). Prediction error in reinforcement learning: a meta-analysis of neuroimaging studies. Neuroscience & Biobehavioral Reviews, 37(7), 1297-1310.

Gläscher, J., Daw, N., Dayan, P., & O'Doherty, J. P. (2010). States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron, 66(4), 585-595.

Guyer, A. E., Choate, V. R., Pine, D. S., & Nelson, E. E. (2011). Neural circuitry underlying affective response to peer feedback in adolescence. Social cognitive and affective neuroscience, 7(1), 81-92.

Haruno, M., & Kawato, M. (2006). Different neural correlates of reward expectation and reward expectation error in the putamen and caudate nucleus during stimulus-action-reward association learning. Journal of neurophysiology, 95(2), 948-959.

Jenkinson, M., Bannister, P., Brady, M., & Smith, S. (2002). Improved optimization for the robust and accurate linear registration and motion correction of brain images. Neuroimage, 17(2), 825-841.

Jenkinson, M., & Smith, S. (2001). A global optimisation method for robust affine registration of brain images. Medical image analysis, 5(2), 143-156.

Knutson, B., Adams, C. M., Fong, G. W., & Hommer, D. (2001). Anticipation of increasing monetary reward selectively recruits nucleus accumbens. J Neurosci, 21(16), RC159.

Knutson, B., Taylor, J., Kaufman, M., Peterson, R., & Glover, G. (2005). Distributed neural representation of expected value. Journal of Neuroscience, 25(19), 4806-4812.

Kulhavy, R. W., & Stock, W. A. (1989). Feedback in written instruction: The place of response certitude. Educational Psychology Review, 1(4), 279-308.

McClure, S. M., Berns, G. S., & Montague, P. R. (2003). Temporal prediction errors in a passive learning task activate human striatum. Neuron, 38(2), 339-346.

O'Doherty, J. P., Dayan, P., Friston, K., Critchley, H., & Dolan, R. J. (2003). Temporal difference models and reward-related learning in the human brain. Neuron, 38(2), 329-337.

Pessiglione, M., Seymour, B., Flandin, G., Dolan, R. J., & Frith, C. D. (2006). Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans. Nature, 442(7106), 1042-1045.

Rogers, R. D., Ramnani, N., Mackay, C., Wilson, J. L., Jezzard, P., Carter, C. S., & Smith, S. M. (2004). Distinct portions of anterior cingulate cortex and medial prefrontal cortex are activated by reward processing in separable phases of decision-making cognition. Biological psychiatry, 55(6), 594-602.

Santesso, D. L., Dzyundzyak, A., & Segalowitz, S. J. (2011). Age, sex and individual differences in punishment sensitivity: Factors influencing the feedback‐related negativity. Psychophysiology, 48(11), 1481-1489.

Schönberg, T., Daw, N. D., Joel, D., & O'Doherty, J. P. (2007). Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making. Journal of Neuroscience, 27(47), 12860-12867.

Silverman, M. H., Jedd, K., & Luciana, M. (2015). Neural networks involved in adolescent reward processing: an activation likelihood estimation meta-analysis of functional neuroimaging studies. NeuroImage, 122, 427-439.

Smith, S. M. (2002). Fast robust automated brain extraction. Human brain mapping, 17(3), 143-155.

Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction (Vol. 1, No. 1). Cambridge: MIT press.

Tsuchida, A., Doll, B. B., & Fellows, L. K. (2010). Beyond reversal: a critical role for human orbitofrontal cortex in flexible learning from probabilistic feedback. Journal of Neuroscience, 30(50), 16868-16875.

Van den Bos, W., Cohen, M. X., Kahnt, T., & Crone, E. A. (2012). Striatum-medial prefrontal cortex connectivity predicts developmental changes in reinforcement learning. Cerebral Cortex, 22, 1247-1255. doi:10.1093/cercor/bhr198

Van den Bos, W., Güroğlu, B., van den Bulk, B. G., Rombouts, S. A. R. B., & Crone, E. A. (2009). Better than expected or as bad as you thought? The neurocognitive development of probabilistic feedback processing. Frontiers in Human Neuroscience, 3(December), 52. doi:10.3389/neuro.09.052.200


286-16

301.

Woolrich, M. W., Behrens, T. E., Beckmann, C. F., Jenkinson, M., & Smith, S. M. (2004). Multilevel linear modelling for FMRI group analysis using Bayesian inference. Neuroimage, 21(4), 1732-1747.

Woolrich, M. W., Ripley, B. D., Brady, M., & Smith, S. M. (2001). Temporal autocorrelation in univariate linear modeling of FMRI data. Neuroimage, 14(6), 1370-1386.

Appendix A: Stimuli for the Pseudo-Word Task and Dictation Task

Practice blocks stimuli:

Sluig Sluich Strag Strach

Knoog Knooch Braag Braach

Test blocks stimuli (also used in dictation task):

Negative condition

Grijk Greik Stijk Steik

Snijp Sneip Plijf Pleif

Positive condition

Graus Grous Knaup Knoup
