
Can patterns of brain activity be used to predict

meaningful as well as meaningless decisions?

21st September 2017

Alastair Haigh
10531998

Supervisor: Yair Pinto
Co-assessor: Marte Otten

MSc Brain & Cognitive Sciences, Cognitive Neuroscience Track
University of Amsterdam


Abstract

Since the experiments of Libet in the early 1980s demonstrated that certain patterns of brain activity reliably precede a subject's spontaneous decision to make a movement, there has been considerable interest in the phenomenon, particularly in its implications for various concepts of free will. Among the many variants of the original experiment has been the use of multi-voxel pattern analysis (MVPA) of fMRI data to give above-chance predictions of a subject's choice of whether to move the left or right hand up to ten seconds in advance. So far, however, the decisions made by subjects in these experiments have been relatively meaningless, and it is not known whether similar patterns of predictive activity also occur prior to more meaningful decisions. Might free will only be an illusion for decisions of no importance? To try to answer that question, subjects were asked to respond to two categories of "meaningful" decision: moral dilemmas, and whether to trust another individual in a game to win money; as well as the "meaningless" choice of whether to move the left or right hand. fMRI data were recorded with the intention of applying MVPA to find any predictive patterns occurring in the run-up to the three categories of decision. Unfortunately, owing to a technical error, the fMRI data were not suitable for this purpose, so the question remains, as yet, unanswered.


Introduction

1. General introduction

In a materialistic and deterministic universe, what does it mean to freely perform an action? When we make a decision, could we have done otherwise, or is this sense of free will illusory? This question of how much control we have over the thousands of large and small decisions we make every day, and the role of consciousness as a causal agent in them, has been the subject of philosophical speculation and debate for centuries.

If the origin of both our actions and our sense of freely performing them lies in the brain, then the tools of neuroscience ought to be able to shed light on this problem. The question can be framed this way: in the time leading up to a decision being taken or an action being performed, does brain activity associated with the decision or action occur before the subject is consciously aware of having made a choice? Further, does it matter what kind of decision is being made? Are we more or less "in control" when a choice is arbitrary than when it is a matter of life and death? The first steps towards answering these questions were made in the early 1980s using electroencephalography (EEG) recording (Libet, 1983). Since then, although work continues and experimental techniques have become more sophisticated, the core question of how much control our conscious self has over our actions remains open.

2. The Libet Experiments

The seminal experiment, conducted by Libet and colleagues (1983), was very simple: wearing an EEG cap, subjects sat looking at a clock-face with a single rotating hand. They were asked to move their arm spontaneously at a time of their own choosing and to remember, using the clock, the time at which they first felt the urge to move (a variable usually known in the literature as "W"). The movements were supposed to be spontaneous and unplanned, so they occurred immediately after a subject felt that urge. The experimenters found that a characteristic negative electrical potential reliably began around 350 ms prior to subjects' reported urge to move (W), with the actual movement taking place around 200 ms after W (Figure 1). This so-called "readiness potential" has been observed in numerous replications and modifications of the original experiment; its existence and timing are not in serious doubt (Shibasaki & Hallett, 2006). Interpretation of the results, and their implications for philosophical concepts of free will and humans' moral and legal responsibilities, is however far from settled. Libet (and many since) insisted that the readiness potential is evidence that spontaneous decisions, which feel subjectively as if they occur at a time of our own conscious choosing, in fact originate unconsciously; that our feeling of direct control is only an illusion and consciousness nothing more than a spectator.


Figure 1: Timeline showing the two stages of the readiness potential (RP1, RP2), the subject's reported time of wanting to move (W) and movement onset (0 ms).

Regardless of its interpretation, Libet's readiness potential occurs, from the perspective of human experience, only a very short time before the execution of any decision; we may not have immediate conscious control, according to Libet, but the events we witness and feel in control of are within a few hundred milliseconds of being "live". More recent experimental results throw doubt on this picture, however. Soon et al. (2008) used multi-voxel pattern analysis (MVPA, an application of machine learning; see Box 1) to predict, from fMRI data, subjects' choices of left or right in a Libet-style experiment. They found not only that better-than-chance prediction of left/right hand choice was possible (around 60%), but also that predictions could be made from brain data recorded up to ten seconds before subjects' reported urge to move (W).

Although this result is striking, its interpretation has been, as with Libet's results, subject to debate. Subjects in the experiment were asked to behave spontaneously in their movement, in both its timing and the choice of left or right hand, and to ignore previous choices; in other words, to be as unpredictable as possible. Humans, however, are notoriously poor at producing genuinely unpredictable sequences (e.g. Wagenaar, 1972; Schulz, Schmalbach, Brugger & Witt, 2012). Work by Lages and Jaworska (2012) also provides evidence of this: they replicated the experiment of Soon et al. (2008) without using an MRI scanner. Instead of using voxel data in their pattern classification analysis, they used responses from the trials immediately prior to the current one. If subjects' response sequences were truly random then by definition no prediction would be possible. However, their analysis was able to "predict" subjects' choices with accuracy around 60%, a result very similar to that of Soon et al. (2008), using no brain data at all.

Even if the "anti-free-will" interpretation of Libet-style experiments and those using fMRI is correct, that is, that decision outcomes can be predicted before the subject is consciously aware of having made them, this interpretation can currently only be applied to such trivial, meaningless decisions as whether to move the left or right hand. Variations have been carried out involving more complex tasks, such as whether to add or subtract two numbers (Soon et al., 2013), but the choice itself remains arbitrary and meaningless to the subject.

The aim of the present study is therefore twofold. Firstly, having replicated, using fMRI, the results of Soon et al. (2008), we extend the research by asking subjects to respond to two categories of more meaningful decision and attempt to find predictive patterns in the voxel data during the seconds leading up to these more meaningful choices. Secondly, we ignore the brain data and use subjects' prior responses to make the same sort of predictions. The two categories of more meaningful decision chosen for this study are moral dilemmas involving sacrificing the lives of others, and the decision whether to trust a stranger to return money in one-shot trials. There has been prior research using fMRI into brain areas involved in both these categories of choice, outlined in the following two sections, but to our knowledge the technique of pattern classification has not yet been applied to them in an attempt to predict decisions.

Box 1: Multi-Voxel Pattern Analysis (MVPA)

The technique of MVPA is central to this study. It differs from traditional univariate analysis in its use of fMRI data: instead of treating each voxel as an independent data point, multiple voxels are considered together as patterns of activity. Combining voxel data in this way can potentially detect differences between conditions that would be missed by traditional methods. In its simplest form, the relative activity of two voxels would be compared between two experimental conditions (e.g. looking at objects in two categories). In practice, however, experiments may use anywhere from tens to thousands of voxels, as well as multiple conditions (Tong & Pratte, 2012).

Whatever the number of voxels and conditions, MVPA uses a pattern-classification algorithm which is trained using one set of data and then tested using another set, with accuracy assessed by a process of cross-validation (for more detail see Methods section).
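As a concrete illustration, the train/test/cross-validate loop at the heart of MVPA can be sketched in a few lines. The following is a minimal, hypothetical example using synthetic "voxel" patterns rather than real fMRI data; the array sizes, effect size and use of scikit-learn are all illustrative choices, not values or tools from any particular study.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic "voxel" patterns: 100 trials x 50 voxels, with two
# conditions whose mean activity differs slightly across voxels
# (a stand-in for real trial-wise fMRI data).
n_trials, n_voxels = 100, 50
labels = np.repeat([0, 1], n_trials // 2)
data = rng.normal(size=(n_trials, n_voxels))
data[labels == 1] += 0.5  # small condition effect spread over voxels

# Train a linear classifier on the multi-voxel patterns and estimate
# its accuracy by 10-fold cross-validation, as in a typical MVPA
# pipeline: each fold trains on 9/10 of the trials and tests on the rest.
clf = SVC(kernel="linear")
scores = cross_val_score(clf, data, labels, cv=10)
print(f"mean cross-validated accuracy: {scores.mean():.2f}")
```

With real data, the rows would be voxel patterns extracted from preprocessed fMRI volumes, and accuracy would be assessed against an empirical chance distribution rather than taken at face value.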

MVPA is a relatively recent technique, first used by Haxby and colleagues (2001), in an experiment in which brain data were used to differentiate between categories of objects presented to subjects. Since then various other aspects of stimulus perception have been successfully detected using MVPA, including complex scenes, low-level percepts such as orientation, as well as auditory and olfactory information (reviewed in Tong & Pratte, 2012). This ability to predict what a subject is perceiving has been referred to as "brain-reading", in contrast with "mind-reading", in which non-stimulus-driven, private activity is decoded. This may be an imagined visual scene, the resolution of an ambiguous image, or the current dominant image in a case of binocular rivalry (Tong & Pratte, 2012).

The use of MVPA in the present study does not quite fit into either category of brain- or mind-reading. The patterns being sought here, rather than representing the contents of consciousness or the response to a stimulus, instead reliably occur before a conscious, self-directed action is performed. MVPA has been applied in the study of diverse cognitive processes, including unconscious and subliminal processes (Axelrod et al., 2015; Sterzer et al., 2008), as well as having been used to predict subjects' choices before they were aware of having made them (Soon et al., 2008; Soon et al., 2013).


3. Moral Dilemmas

In moral psychology research one of the emblematic ethical dilemmas is the so-called Trolley Problem (Foot, 1967). It is a thought-experiment in which the subject is asked to imagine standing by a railway track. A runaway train (or trolley) is approaching and will kill five men working farther along the track unless the subject pulls a lever to divert the train onto another track, on which a single man is working. The subject must decide whether to take action and save the lives of five people at the cost of the life of one person who would otherwise survive. Since the introduction of the trolley problem, a variety of similar, related problems have been devised and tested experimentally (Bruers & Braeckman, 2014). One early variant asked whether subjects would be prepared to push a very large man from a bridge to halt the speeding train and save five lives. Although the outcomes of this situation and the original problem are the same, one death versus five deaths, the problems differ in that the subject is asked either to let someone die or to actively kill them (Thomson, 1976).

Activity in the brain when faced with moral dilemmas of this sort has been the subject of research since the beginning of this century. Greene and colleagues (2001) used fMRI in an experiment to compare reactions to the two variants of the trolley dilemma described above: "impersonal" moral dilemmas (like pulling a lever) vs. "personal" dilemmas (like pushing someone from a bridge). They found that emotion-associated areas such as the medial frontal gyrus, posterior cingulate gyrus and angular gyrus were more active during personal dilemmas, whereas areas associated with working memory, such as the right middle frontal gyrus and bilateral parietal lobe, were more active during impersonal dilemmas. These findings, together with differences in reaction times, were taken by the authors as evidence supporting a dual-process model of moral dilemma computation. The two processes are, roughly, a fast, intuitive response and a slower, more deliberative one. Longer reaction times are explained under this model by the latter process coming into conflict with the former (Greene et al., 2004). In general, people tend to choose to pull the lever in the original trolley scenario but choose not to push the man off the bridge in the second, even though the outcomes in terms of lives saved or lost are the same in both scenarios. This discrepancy may therefore be explained by the two problem categories being processed by different brain networks (Greene et al., 2004).

Further evidence for the idea that the brain processes these two categories of dilemma in different regions or networks comes from a study involving patients with damage to the ventromedial prefrontal cortex (VMPFC). Compared with healthy controls, these patients were more likely to make "utilitarian" judgements in personal moral dilemmas; that is, they would push the man off the bridge because the most lives would be saved that way. The researchers concluded that the VMPFC is essential in making fast, intuitive judgements about personal moral dilemmas (Koenigs et al., 2007; Greene, 2007). Further research has shown that subjects under conditions of cognitive load take longer to make "utilitarian" judgements in personal dilemma scenarios, but are not affected in terms of reaction time when deciding not to act (the "nonutilitarian" choice). Subjects under cognitive load are also less likely to make utilitarian choices (Greene et al., 2008). These findings support the idea of two parallel processes that may be somewhat in conflict with each other.

In an effort to help standardise future research in this area, Lotto and colleagues (2014) published a set of 60 moral dilemmas organised into four categories based on two criteria. The first criterion distinguishes "incidental" from "instrumental" dilemmas, equivalent to the "impersonal" and "personal" categories described above: those in which death occurs as an incidental side effect of the protagonist's action (as in the original trolley problem) vs. those in which death occurs as a direct result of the action (as in the footbridge variant). The second distinguishes dilemmas in which the protagonist's own life is at risk from those in which it is not ("self" vs. "other"). These dilemmas were tested on 120 subjects, providing data including yes/no response rates, degree of moral acceptability, emotional arousal and so on.

Moral dilemmas are an example of a deeply meaningful decision category, in complete contrast with the trivial, arbitrary decisions that are typically investigated in Libet-style experiments. It is of interest, therefore, whether patterns of brain activity leading up to those decisions, before subjects have had the chance to think about the problem, are at all predictive of their choices. If such patterns are to be found, do they show any differences across the four dilemma categories? Further, are there similarities and differences between patterns seen before moral decisions and before decisions in the Libet task? Any similarities or differences that are found may help in differentiating decision types with respect to concepts of free will. Is it meaningful to call a decision "free" if it is meaningless (like those in Libet-style tasks)? If moral decisions such as these dilemmas are made via separate processes depending on their type, are both of those processes as "free" as each other? Are they equally predictable (or not)?

4. The Trust Game

As well as moral choices, the other category of meaningful decision investigated in the present study is an economic one: the trust game. This game was designed to test the assumption in economics that individuals will act rationally in their own self-interest (Berg et al., 1995). There are two players, one of whom is given an amount of money and must decide how much of it to give to the other player, whereupon that amount is tripled. The second player can then decide to give back some of the tripled money. For both players to maximise their benefits, the first player should send all the money and the second player should return half. If the game is played just once, anonymously, it is instead rational for the first player to send no money, and for the second player to keep any money that is sent. Under experimental conditions, however, even those of strict anonymity, people do not behave in this way; rather, they are frequently prepared to trust an anonymous playing partner to send or return money (Berg et al., 1995).

Real human interactions, moreover, are usually not conducted in complete anonymity. If players are given a chance to interact face-to-face, this may affect their assessment of each other's intentions, and therefore their willingness to trust a stranger, even if the game is only played once and so, from a strictly economic point of view, the rational choice would be never to trust an opponent. Frank, Gilovich and Regan (1993), for example, gave subjects 30 minutes to talk to each other before playing a one-shot version of the prisoner's dilemma, a game similar to the trust game in which trust and cooperation are rewarded. They found that subjects who interacted were significantly more accurate at predicting their partner's decisions than those who did not interact.

What makes people choose to cooperate in one-off encounters, running the risk of being cheated with no possibility of punishment for the defector? Frank et al. (1993) proposed an evolutionary model stating that such cooperative behaviour could be explained if people were capable of detecting other "cooperators". Following an experimental test of this model's assumptions, the researchers concluded that people do have a better-than-chance ability to accurately detect others' intent to cooperate. This led Brosig (2002) to test several hypotheses arising from this finding. She found that willingness to cooperate in anonymous interactions differed reliably between individuals; and that "cooperators" were able, significantly more often than chance, to identify fellow cooperators when given the chance to interact.

Judgements about a stranger's character (as opposed to their current emotional state), including their trustworthiness, can be formed very quickly: as little as 39 ms can be sufficient for an impression of a stranger's personality to be formed from looking at their face with a neutral expression (Bar et al., 2005). This "ability" seems absurd if there is no actual connection between personality and neutral facial appearance. That such judgements are nonetheless formed has been explained via an "overgeneralisation hypothesis", whereby evolved reactions to facial expressions are triggered by neutral faces that bear some resemblance to them (Zebrowitz, 2004). Regardless of their accuracy, we appear to be hard-wired to make rapid, unconscious judgements about others' personalities, including their trustworthiness, based on facial appearance.

A variety of brain areas have been associated with differing aspects of trust, from these unconscious first impressions to the gradual building of a trusting relationship. Of greatest relevance to the present study are those systems which process cues that might indicate a stranger's trustworthiness. Winston et al. (2002), for example, found activation in the amygdala, orbitofrontal cortex and superior temporal sulcus when subjects made facial trustworthiness judgements. Krueger et al. (2007) found an association between decisions to trust and activation in the septal area and hypothalamus, the site of release of oxytocin, a hormone that has been shown to increase trust (Kosfeld et al., 2005).

Although judgements about the trustworthiness of a stranger may take place very rapidly and occur unconsciously, they are presumably at least partially stimulus-driven. This contrasts with the spontaneous, endogenous choices involved in a typical Libet-style task, implying a different mechanism of action. It is therefore of interest, as with the moral dilemmas, to find out whether any predictive patterns that occur preceding Libet-type decisions also occur preceding decisions of trust, and whether they occur before the subject has had a chance to respond to the stimulus. In other words, are such judgements entirely stimulus-driven, or is there evidence of a preceding bias?

To summarise, the primary research question of this study is whether similar patterns of brain activation occur prior to decisions of three different categories, one meaningless and the other two meaningful. A secondary question is whether those patterns, if present, are predictive of subjects' choices in the three decision categories.


Materials and methods

1. Subjects

Nineteen healthy subjects were recruited (13 female, mean age 25.1 years, s.d. 2.3) and took part in return for financial compensation. All subjects but one were native Dutch speakers also fluent in English, the language in which the experiment was conducted (the remaining subject was a native English speaker). No subjects had a history of medical, psychiatric or neurological conditions and none were currently taking medication. Informed consent was obtained from all subjects before the experiment began.

2. fMRI data acquisition

Data were acquired using a Philips 3-tesla whole-body MRI scanner at the Spinoza Centre for Neuroimaging, Amsterdam. Anatomical imaging used a T1-weighted 3D sequence: TR 7.0 ms; TE 3.27 ms; flip angle 8°; FOV (mm) 208 ✕ 252 ✕ 252; slice thickness 0.9 mm. Functional imaging used a 2D gradient-echo sequence: TR 2375 ms; TE 9 ms; flip angle 76.1°; FOV (mm) 224 ✕ 121 ✕ 224; slice thickness 3 mm; number of echoes 3; number of slices 37; voxel dimensions (mm) 2.8 ✕ 2.8 ✕ 3.

3. Behavioural paradigms

All testing took place in the scanner. A 27-inch LCD monitor with a resolution of 1920 ✕ 1080 pixels was placed behind the scanner, visible to subjects via a mirror on the head coil. The first nine subjects completed two runs of 1.5 hours each, on different days. After feedback from some of these subjects that they had found the time in the scanner uncomfortably long, it was decided that the final ten subjects would instead complete three runs of 1 hour each. Each run consisted of a mixture of the three tasks (Libet, Trust and Dilemmas) presented in a pseudorandom order.

i. Libet task

At the beginning of each trial a clock-face appeared on screen, with the numbers 1-12 in black on white. Beginning at a pseudorandomly determined angle, a red hand rotated clockwise, once every 2.56 s. Subjects were instructed to allow the hand to complete one full rotation before pressing a button with either their left or right index finger (any presses before a full rotation were ignored by the computer). Subjects were told to press a button whenever they felt like it, to keep watching the clock face without using it to plan when to press (e.g. at 12 o'clock), and to remember when they first felt the urge to press. After the left or right button was pressed, the clock hand continued to rotate for between 1 and 4 seconds before disappearing; this was in order to prevent afterimages of the hand affecting subjects' memory of when they felt the urge to move. The hand then reappeared, stationary and pointing in a pseudorandom direction. Subjects could use the left and right buttons to move it to the position it was in when they felt the urge to move. Once that position was set, the trial was over.


ii. Dilemma task

A set of 75 dilemma scenarios was used. Sixty of the 75 were taken from Lotto et al. (2014), which provides a set of dilemmas in five categories: self-instrumental, other-instrumental, self-incidental, other-incidental and "filler". Scenarios in all categories except "filler" involved sacrificing one or more person(s) to save others, with the numbers differing between scenarios. The self/other distinction refers to whether the protagonist of the scenario is himself or herself at risk. The instrumental/incidental distinction refers to whether others may die as a direct result of the protagonist's choice or as an indirect consequence. "Filler" dilemmas were scenarios in which no lives were at risk but which involved some moral choice, such as whether to steal. Further dilemmas were taken from Koenigs et al. (2007). There were 15 dilemma scenarios in each category. Each dilemma was presented as a series of six slides. The first three slides explain the scenario, including (except in the case of "filler" dilemmas) the number of people who will die if no action is taken and the number who will die if action is taken. The next two slides present a possible action; the final slide briefly restates the possible action and cues the subject to press buttons representing "yes" and "no" (the assignment of buttons remained constant during each session and was counterbalanced across subjects). Each dilemma's text was edited such that each slide consisted of between 13 and 19 words, and each slide was presented for a length of time equal to 0.5 s per word. The text was presented above a blurred background photograph representing in some way the subject of the dilemma, to provide subjects with extra context (Figure 2).

Figure 2: Example slide showing the first part of a dilemma scenario. Behind the text is a blurred contextually relevant image.

iii. Trust game

Stimuli for this task were taken from the UK game show "Golden Balls". The show begins with four contestants. During the game's first two rounds, two players are eliminated, leaving two contestants in the final round playing head-to-head for a cash jackpot. The final round is structured as a prisoner's dilemma. This classic of game theory research requires that both players simultaneously choose to cooperate or defect, with each player therefore unaware of the other's choice. In the game show, if both players cooperate they each receive half the jackpot. If one cooperates and the other defects, the defector wins all the money. If they both defect, neither contestant wins anything. Before they make their decisions, the two players are given a chance to talk to each other, usually to convince their opponent that they can be trusted to cooperate. Unsurprisingly, however, on many occasions contestants renege on their promise in the hope of winning the entire jackpot.

Screenshots were taken of each contestant and used as the background for each slide. Each trial consisted of the presentation of four slides. The first slide showed the contestant's face, unblurred; the other three showed fragments of speech used by that contestant during the game, consisting of between 1 and 27 words each (mean 10.4 words). Because it was decided to use the trust game format rather than the prisoner's dilemma, some alterations were made to the stimuli, namely changing some of the wording used by contestants (Figure 3).

Figure 3: Example second slide from a trust game trial. The wording of the text has, in this case, been altered from the original to reflect the change from a Prisoner's Dilemma scenario to a Trust Game scenario; the original line was "Can I trust you to split [the money]?"

The game's format was explained to subjects before they entered the scanner, and ran as follows. On each trial, after viewing the four slides, subjects had to choose whether to give a small or a large amount of money (2 points or 6 points, equivalent to 20 cents and 60 cents respectively) to the other contestant (i.e. the face on the screen). The other contestant then returned either a small or a large amount (dependent on whether they chose to defect or cooperate on the show): a "small amount" consisted of half the amount given by the subject, and a "large amount" of three times the small amount, i.e. one and a half times the amount given (see Table 1). It was therefore in the subject's interest to give a large amount if they trusted their opponent, and a small amount if they did not. After the subject had decided, by pressing a button with their left or right index finger, feedback was given in the form of the amount of money returned, indicating whether the subject's trust or mistrust was warranted. Points earned by subjects during these trials were added to the money paid to them for taking part in the experiment, at a rate of 10 cents per point.
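As a sanity check, the payoff rules can be expressed directly in code. The following sketch is illustrative (the function name is hypothetical, not from the experiment's scripts); it reproduces the subject's payoffs from the rules above.

```python
def subject_payoff(given: int, contestant_cooperates: bool) -> float:
    """Points returned to the subject at the end of one trial.

    A cooperating contestant returns 3/2 of the amount given;
    a defecting contestant returns only 1/2 of it.
    """
    return given * (1.5 if contestant_cooperates else 0.5)

# Rows: contestant cooperates / defects; columns: subject gives 6 / 2.
matrix = [[subject_payoff(g, coop) for g in (6, 2)] for coop in (True, False)]
print(matrix)  # [[9.0, 3.0], [3.0, 1.0]], matching Table 1
```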

Table 1: payoff matrix for the Trust Game in points. Points won were later converted to money and paid to subjects at a rate of 10 cents per point.

                           Subject (plays first)
                           give 6 (cooperate)    give 2 (defect)
Contestant (plays second)
  return 3/2 (cooperate)           9                    3
  return 1/2 (defect)              3                    1

4. Data analysis

Owing to a technical error during the setup of the scanner, data acquired were not suitable for multi-voxel pattern analysis, therefore that aspect of the experiment had to be abandoned.

Libet's W, the time at which subjects felt the urge to move, was recorded for each trial by converting the angle of the clock-hand set by subjects into a time; the difference between that time and the subject's subsequent button-press was then calculated. Data from trials in which less than three seconds elapsed between the button-press and the subject's setting of the clock were removed: given that there is a variable delay of 1-4 seconds between these two events, any setting of the clock-hand within three seconds is assumed to have been done either accidentally or without due care.
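The angle-to-time conversion can be sketched as follows. This is an illustrative reconstruction, not the original analysis code, and the function and variable names are hypothetical; only the 2.56 s rotation period is taken from the task description.

```python
ROTATION_PERIOD = 2.56  # seconds per full revolution of the clock hand

def w_before_press(angle_at_press: float, angle_set_by_subject: float) -> float:
    """Seconds between the reported urge (W) and the button press.

    Angles are in degrees, measured clockwise. Because the hand rotates
    clockwise, the angle the subject set lies behind the angle at the
    moment of the press; the clockwise distance from the set angle to
    the press angle, as a fraction of 360, gives the fraction of one
    rotation period that elapsed.
    """
    delta = (angle_at_press - angle_set_by_subject) % 360.0
    return delta / 360.0 * ROTATION_PERIOD

# A hand set 28.1 degrees behind the press corresponds to a W of about
# 0.2 s before the movement.
print(round(w_before_press(90.0, 61.9), 3))
```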

Behavioural data were analysed for predictive power using custom scripts in MATLAB (The MathWorks Inc.). A multivariate pattern classification analysis was performed using a linear support vector machine (SVM), with all runs combined for each subject to produce an uninterrupted sequence of left/right responses. The "labels" set consisted of each response in the sequence, and the "data" set consisted of the two previous responses. The first two trials from each run were excluded because they necessarily lacked the corresponding data, i.e. two previous responses. To overcome the problem of unbalanced response sets (i.e. more of one response category than the other), for each response set a random, balanced subset was extracted, on which 10-fold cross-validated pattern classification was performed. In this procedure 9/10 of the data were used to train the classifier and the remaining 1/10 used as the test set; the process was repeated 10 times so that all data were used for both training and testing.
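The procedure can be sketched as follows, here in Python with scikit-learn rather than the custom MATLAB scripts actually used. The response sequence is synthetic, with an alternation bias standing in for a real subject's non-randomness; the sequence length, bias strength and seed are arbitrary assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

# Synthetic left/right (0/1) response sequence with sequential
# dependence: people trying to "be random" tend to over-alternate.
n = 300
responses = np.zeros(n, dtype=int)
for i in range(1, n):
    responses[i] = 1 - responses[i - 1] if rng.random() < 0.7 else responses[i - 1]

# Each trial's label is its response; its features are the two
# preceding responses, so the first two trials are dropped.
labels = responses[2:]
features = np.column_stack([responses[1:-1], responses[:-2]])

# Balance the classes by subsampling the more frequent response, then
# run 10-fold cross-validated classification with a linear SVM.
n_min = min(np.bincount(labels))
idx = np.concatenate([
    rng.choice(np.flatnonzero(labels == c), n_min, replace=False)
    for c in (0, 1)
])
scores = cross_val_score(SVC(kernel="linear"), features[idx], labels[idx], cv=10)
print(f"mean prediction accuracy: {scores.mean():.2f}")
```

Because the synthetic sequence is predictable from its own history, the classifier scores well above chance despite seeing no "brain data", which is exactly the point made by Lages and Jaworska (2012).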

Prediction accuracy was calculated, during each fold, by dividing the number of correct predictions by the total number of predictions, producing a number between 0 and 1. The mean of this was taken across the 10 folds. As mentioned above, no participant produced exactly equal numbers of left and right responses, so the cross-validation process was repeated 100 times. On each repetition, a new index was created to divide the data into 10 segments; the amount of data used was determined by the less frequent response category, so that on each repetition a balanced set of left and right responses was created. A balanced set was important because a heavily unbalanced set could cause the classifier simply to predict, each time, the more likely response.
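The pipeline described above can be sketched in Python using scikit-learn's linear SVM in place of the original MATLAB implementation. The function name and the 0/1 response coding are illustrative assumptions, not the authors' code:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def balanced_cv_accuracy(responses, n_repeats=100, n_folds=10):
    """Predict each response (0 = left, 1 = right) from the two
    preceding responses, using a linear SVM with n_folds-fold
    cross-validation on a freshly drawn balanced subset each repeat."""
    responses = np.asarray(responses)
    # Data: the two previous responses; labels: the current response.
    # The first two trials have no such history and are excluded.
    X = np.column_stack([responses[:-2], responses[1:-1]])
    y = responses[2:]
    n_min = min(np.sum(y == 0), np.sum(y == 1))  # rarer class count
    accs = []
    for _ in range(n_repeats):
        # Draw a balanced subset: n_min trials from each class.
        idx = np.concatenate([
            rng.choice(np.flatnonzero(y == c), n_min, replace=False)
            for c in (0, 1)
        ])
        scores = cross_val_score(SVC(kernel="linear"), X[idx], y[idx],
                                 cv=n_folds)
        accs.append(scores.mean())
    return float(np.mean(accs))
```

On a perfectly alternating sequence such a classifier reaches ceiling accuracy, whereas on a balanced random sequence it hovers around 0.5, which is why the balanced subsampling step matters.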

In order to produce a probability distribution against which prediction accuracies could be compared, a bootstrapping operation was performed in which labels and data were randomly shuffled and the pattern classification process described above was performed on these random pairings. This was repeated 1000 times for each participant. Results of the classification analysis with the correctly matched data and labels were then compared to the bootstrapped distributions to give a p value for each. Specifically, the p value was defined as the proportion of the distribution lying above the value of the prediction; thus, for example, if 1/20 of the bootstrapped distribution lay above a given prediction, the p value for that prediction would be 0.05.
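The shuffling procedure (a permutation test, referred to above as bootstrapping) can be sketched as follows; `accuracy_fn` is a hypothetical placeholder standing in for the full classification pipeline:

```python
import numpy as np

def permutation_p_value(observed_acc, accuracy_fn, labels, data,
                        n_perm=1000, seed=None):
    """p value = fraction of label-shuffled accuracies lying above
    the accuracy obtained with the correctly matched labels."""
    rng = np.random.default_rng(seed)
    null = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rng.permutation(labels)  # break the label/data pairing
        null[i] = accuracy_fn(data, shuffled)
    return float(np.mean(null > observed_acc))
```

As in the text's definition, if 1/20 of the null distribution lies above the observed accuracy, the returned p value is 0.05.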

Responses to those dilemmas in the present study that were taken from Lotto et al. (2014) were compared with the response rates reported in that study, to see whether subjects in the present study responded similarly to those in the original.

Affirmative responses made by subjects to the dilemmas were compared with the trade-off in lives involved in each dilemma; that is, how many lives would be saved versus lost if an affirmative response was given. Are subjects more likely to take action when the trade-off is greater?

The four categories of dilemma were compared in terms of affirmative responses: are subjects more likely to take action in an "incidental" dilemma (one in which deaths occur as a secondary consequence of the action taken) than in an "instrumental" dilemma (one in which, by taking action, the subject directly causes the death of others)? Are subjects more likely to take action if their own life is at risk?

Responses by subjects in the trust game were compared with actual outcomes from the game show to determine whether subjects showed a better-than-chance likelihood of correctly choosing whether or not to trust each contestant.


Results

1. Libet's W

The mean time difference across subjects between their reported feeling of the urge to move and the time registered by their subsequent button-presses, widely referred to as W, was 959.4 ms (s.d. 604.7 ms). The frequency distribution of all values of W is shown in Figure 4. Values are not normally distributed (Shapiro-Wilk W = 0.96, p < 0.01).
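The normality check reported above can be reproduced with SciPy's Shapiro-Wilk implementation. The values below are simulated, right-skewed stand-ins for the recorded W values (roughly matching the reported mean), not the actual data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Right-skewed stand-in for the W values (ms); mean ~= 2.5 * 384 = 960 ms
w_values = rng.gamma(shape=2.5, scale=384.0, size=500)

# A small p rejects the null hypothesis that the values are normal
stat, p = stats.shapiro(w_values)
print(f"Shapiro-Wilk W = {stat:.3f}, p = {p:.2g}")
```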

Figure 4: Frequency distribution of W values (ms) for all trials and all participants.

2. Prediction accuracy from behavioural data

The prediction accuracy produced by the SVM pattern classifier was compared to a bootstrapped permutation distribution for each subject. Results are shown in Figure 5.


Figure 5: Mean prediction accuracies (red lines; pink lines = 1 s.d.) from pattern classification analysis for each subject, overlaid on the bootstrapped probability distribution calculated for each.

Of the 19 subjects, 8 gave a set of responses for which the pattern classifier was able to provide a prediction accuracy significantly greater than the mean of the permutation distribution (p < 0.05, i.e. less than 5% of bootstrapped permutations lay above the prediction in question). Under the null hypothesis of predictive accuracy no better than chance, fewer than one of the 19 subjects' response sequences would be expected to give a predictive accuracy significantly different from the mean.

The mean lateralisation index in responses during Libet trials, calculated per subject as the difference between the number of left and right responses divided by the total number of responses, was 0.54 toward the left (s.d. 0.48). The mean absolute lateralisation index was 0.63 (s.d. 0.33).
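The lateralisation index used here and in Figure 6 reduces to a one-line computation (a sketch; the "L"/"R" response coding is an assumption):

```python
def lateralisation_index(responses):
    """Signed index: (left - right) / total. Positive values mean a
    leftward bias; abs() of this gives the absolute index in Figure 6."""
    n_left = sum(1 for r in responses if r == "L")
    n_right = len(responses) - n_left
    return (n_left - n_right) / len(responses)
```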


Figure 6: Absolute lateralisation index (the difference between the number of left and right responses divided by the total number of responses) vs. classifier prediction accuracy, combined across all experimental conditions. Each point represents a subject. The vertical line shows the lateralisation-index threshold used by Soon et al. (2008) in screening subjects. Red markers show subjects with sequences significantly above chance (see Figure 5).

There was a significant correlation across subjects between prediction accuracy and absolute lateralisation index for all responses (Pearson's r = 0.943, n = 19, p < 0.001).

In their study, Soon et al. (2008) pre-screened their subjects, rejecting those whose spontaneously produced left/right sequences had a lateralisation index above 0.3. This resulted in acceptance to the full experiment of 14 out of 36 subjects (38.9%). Applying the same criterion to our subjects would have seen 7 of 19 accepted (36.8%).

3. Dilemma responses

There was no significant correlation between the affirmative response rates to the dilemmas taken from Lotto et al. (2014) and the response rates reported by Lotto et al. (2014) for their own subjects.

There was also no significant correlation between affirmative response rates to dilemmas in the present study and the trade-off in deaths if no action is taken versus deaths if action is taken; that is, from the data we cannot conclude that a greater trade-off was associated with a greater likelihood of an affirmative response.

In a general linear model analysis, there were no significant main effects of the types of dilemma (incidental/instrumental or self/other) on the affirmative response rate. Nor was there any significant interaction effect between the two dilemma types.

The mean lateralisation index across subjects for dilemma trials was 0.18 toward the right (s.d. 0.21).

4. Trust game responses

The mean "correct" response rate across subjects, 54.7%, was significantly higher than chance (t(55) = 2.69, p < 0.01); that is, the rate at which subjects gave the larger amount of money to contestants who were generous in return, and withheld money from contestants who withheld in return.
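The comparison against chance is a one-sample t-test, which can be sketched with SciPy. The rates below are constructed stand-ins (the reported t(55) implies 56 observations), not the actual data:

```python
import numpy as np
from scipy import stats

# Constructed stand-in: 56 correct-response rates centred above 0.5
rates = np.array([0.5] * 28 + [0.6] * 28)

# One-sided test of the mean rate against the 50% chance level
t, p = stats.ttest_1samp(rates, popmean=0.5, alternative="greater")
print(f"t({len(rates) - 1}) = {t:.2f}, p = {p:.2g}")
```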

The mean lateralisation index across subjects for trust game trials was 0.02 toward the right (s.d. 0.41).


Discussion

Owing to the unfortunate situation of the data being in a format unsuitable for the proposed tests, this section is shorter and considerably less interesting than it might otherwise have been. Nevertheless, it was possible to test a number of hypotheses using the behavioural data alone: pattern classification was applied to the response data, dilemma responses were compared to previous work, and trust game responses were tested against chance.

1. Libet task

The mean value of W, the time difference between subjects' feeling of the urge to move and their actual movement, was 959.4 ms. This is considerably longer than the value for W originally reported by Libet of 204 ms (Libet et al., 1983), as well as subsequent replications (e.g. 354 ms by Haggard & Eimer, 1999; 122 ms by Trevena & Miller, 2002). Had subjects also been recorded with EEG, their reported feeling of the urge to move might have roughly coincided with the appearance of the readiness potential (RP), or even preceded it: Libet et al. (1983) reported an RP onset time of 800 ms.

Not only is the mean value of W recorded in the present study much larger than other values found in the literature, but the values are also not normally distributed around a mode, as would be expected (see Figure 4). It is not obvious why an unexpected result like this should occur; the subjects appeared to have understood the task, and it is unlikely they would consistently input effectively random values for the times at which they felt the urge to move. This unexpected result may therefore be due to an undetected error in recording or calculation.

2. Dilemma task

Affirmative response patterns produced by our subjects did not correlate with those found by Lotto et al. (2014). There could be a number of reasons for this. Firstly, our sample size was much smaller: 19 subjects compared with 120. Secondly, in the Lotto study the dilemmas were written in Italian and answered by Italian subjects; ours were English translations of the originals, answered by subjects all but one of whom spoke English as a second language, albeit in all cases fluently. It is therefore possible that not all subjects fully understood every dilemma (and indeed some of the situations presented in the dilemmas were somewhat unrealistic).

Dilemmas used in the present study were divided into four categories based on the criteria of "instrumental" vs. "incidental" (whether they involve directly killing someone or allowing someone to die as an indirect consequence) and "self" vs. "other" (whether or not the life of the dilemma's protagonist is at risk). However, there was no apparent relationship between these factors and the likelihood of subjects choosing to take action in a dilemma scenario: for our subjects it did not seem to matter whether the situation involved direct or indirect killing, nor whether their own life was at risk. Nor did it seem to matter how many lives they stood to save by taking action. A failure on the part of subjects to fully comprehend what exactly was being asked in each dilemma situation may go some way to explaining these rather puzzling results. Furthermore, in the absence of brain data, our behavioural results can offer no evidence for or against different processes taking place within the brain when subjects respond to dilemmas of different types (Greene et al., 2001).

3. Trust game

Subjects gave significantly more "correct" responses on trust game trials than would be expected by chance; that is, they were more likely to give a larger amount of money to contestants who then gave a larger amount in return, and smaller amounts to contestants who returned less. Unfortunately, we cannot say whether different patterns of brain activity were associated with decisions to trust or not to trust; nor whether any such activity preceded either the moments at which decisions were made or stimulus presentation entirely; nor whether such patterns contained predictive information.

Previous work has shown that people are capable of making above-chance predictions of an opponent's subsequent choices in games if they are given the chance to interact beforehand (Frank et al., 1993; Brosig, 2002). Of course, our subjects did not experience any genuine interaction with the game-show contestants used in this part of the study. However, they were able to see the faces of their opponents and read genuine fragments of dialogue taken from a situation in which the two contestants were trying to convince one another that they could be trusted. It is possible, therefore, that our subjects were looking for trustworthiness in faces that were themselves trying to convey it. In over half of the trials, however, trust was not warranted: 55.4% of trials featured contestants who reneged on their promises to cooperate. Our subjects appear, then, to have been able to differentiate between genuine and false appeals to their trust, from the faces and speech fragments of game-show contestants, at a rate significantly better than chance.

4. Prediction using pattern classification

Eight subjects out of 19 produced sequences of responses containing sufficient information to allow the pattern classification algorithm to achieve statistically significant predictive accuracy. Although this is less striking than the 10 out of 12 subjects producing such predictive sequences reported by Lages and Jaworska (2012), it is nonetheless around eight times greater than the number expected by chance (fewer than one out of 19 when α = 0.05). This predictability could be a result of the difficulty that people have in producing genuinely random sequences (Wagenaar, 1972; Schulz, Schmalbach, Brugger & Witt, 2012), although the way in which trial types were ordered complicates matters (see following section).

An alternative explanation for the eight significantly predictive sequences may be found by comparing predictability with the overall balance between left and right responses. These two variables were strongly correlated; that is, the greater a subject's lateralisation index (left or right bias), the more predictable their sequences were. This is not surprising: if the prediction each time is simply the more frequently occurring response, then this "prediction" will become more "accurate" the greater the degree of bias. It may be, then, that, as the results of Lages and Jaworska (2012) suggest, there can be some degree of predictability in sequences of left/right responses made spontaneously and without external influence. The response sequences produced by subjects in the current study, however, were subject to external influence, and since the trials were pseudorandomly ordered, they appear not to be predictable, at least when there is sufficient balance in the ratio of left to right responses.

5. Limitations of the present study and suggestions for future work

It is regrettable that the fMRI data recorded during this study were not suitable for use, leaving the key hypotheses untested. However, there were also a number of other methodological issues which deserve attention and possible modification in any future work of this kind.

General limitations and Libet trials

It is important that data collection and (at least preliminary) analysis happen as close together in time as possible. In the case of this study, pre-processing the MRI data as they became available would have revealed their unsuitability for further analysis, allowing adjustments to be made and the majority of sessions to be saved. Similarly, for Libet trials, analysis of the behavioural data revealed unusually long values for W; had this been known during the scanning sessions, it would have been possible to make sure each subject fully understood the task and was correctly inputting the time they felt the urge to move.

The experimental design meant that several behavioural hypotheses were untestable. Because the three trial conditions were interleaved (Libet, dilemma, trust), it was not possible to exactly replicate the experiment of Lages and Jaworska (2012), whose subjects carried out only successions of Libet tasks. Libet tasks require that subjects act as spontaneously as possible and therefore (at least try to) produce a sequence of responses that is not predictable. The pattern classifier was unable to make significant predictions from the response sequences gathered (a mixture of Libet, trust game and dilemma trials), whereas, had the Libet tasks been carried out as single blocks, it is possible the results of Lages and Jaworska (2012) could have been replicated. The opposite is true in the case of the dilemma task and the trust game: subjects were asked to take the questions seriously and respond truthfully and to the best of their ability; response sequences would therefore be expected to be at least somewhat dependent on question sequence. Had the trial categories been segregated into different blocks, several other questions could have been addressed. For example, in the trust game, would subjects be less likely to trust their current opponent if the previous opponent had just cheated them?


The surprising amount of left- or right-hand bias shown by our subjects on Libet trials demonstrates the importance of pre-screening subjects for experiments of this type. Soon et al. (2008) screened 36 subjects, from whom they selected 14 using the criterion of a lateralisation index of less than 0.3. Applying the same criterion to results from the present study would exclude 12 of 19 subjects. Interestingly, the two acceptance rates are very similar (38.9% and 36.8%). None of the remaining seven subjects' response sequences proved to be significantly predictable, underscoring the need for balanced sets of responses in this sort of study (see Figures 5 and 6). In pre-screening, subjects would be asked to produce a series of spontaneous left/right sequences, and only those who met some criterion of left/right balance would be selected. This should be done without telling potential subjects that they are being screened, or explicitly saying that their sequences should be as balanced as possible; only that they should act as spontaneously as they can, which is what the nature of the experiment requires.

Dilemma trials

As mentioned in the previous section, it is possible that our subjects may have had trouble understanding and responding appropriately to some of the dilemma scenarios. It would be advisable, therefore, as part of the practice and familiarisation session carried out before the subject enters the scanner, to require that each subject demonstrates an understanding of the dilemma scenarios presented and can explain his or her choice to the experimenter. Furthermore, and especially when subjects' first language is not English, it may be better practice to allow subjects to read through each slide at their own pace, pressing a button to advance to the next one.

Trust trials

If trust game trials had been run as single blocks, it would have been possible to look for sequence effects. For example, would subjects be less likely to trust a contestant if they had just been "cheated" by the previous contestant? Do subjects become more or less likely to trust their opponents as the trials progress? Similarly with the dilemma task: do previous responses affect (that is, predict) subsequent ones, and are there trends of increasing or decreasing likelihood of giving an affirmative or negative answer as trials progress?

Some of our subjects reported that they found the mechanics of the trust game somewhat difficult to understand. Visual feedback was given after each trial telling the subject what their opponent had decided to do and how many points were being returned (i.e. whether the opponent had decided to return money generously or not). However, it seems that this feedback was not always clear to subjects. One way of addressing this problem would be, as recommended above, to run all trials of the same category in the same session, allowing the subject to concentrate on the task and keep the mechanics of the game in mind. A simplification of the game itself may also have helped. If the game were played as a prisoner's dilemma, the opponent would be in the same position as the player, making exactly the same type of decision at the same time. The game mechanic used in the present study was sequential (a decision made by the subject followed by a decision made by the opponent) and asymmetrical (the decisions made by subject and opponent were not of the same kind, e.g. in amounts of money and in knowledge of the opponent's choice); a simultaneous and symmetrical format, as was used on the game show the stimulus materials were taken from, may have been simpler for subjects to understand and engage with, without losing the essence of the game, which is to decide whether or not to trust a stranger. Finally, using the original format of the game show would have allowed contestants' quotes taken from the show to be used without alteration.

Conclusion

It is unfortunate that on this occasion the central question of this study could not be answered: namely, whether predictive patterns of brain activity of the sort reported by Soon et al. (2008) in the seconds preceding spontaneous and meaningless decisions are also to be found preceding two categories of more meaningful decision. The question therefore remains open. If predictive patterns of activity are not found preceding meaningful decisions, that would suggest that the patterns found by Soon et al. (2008) and others may occur only when a decision is spontaneous and meaningless, and that when we make decisions that matter to us, our deliberation occurs in real time, is fully conscious and cannot therefore, in principle, be predicted from prior activity. On the other hand, it could be that such patterns are found whenever a self-timed decision is made. Even for meaningful decisions in which the subject genuinely deliberates and tries to answer honestly, the patterns may represent a bias whose effect on the outcome depends on how "difficult" the question is; that is, its effect is greater the weaker a subject's reasons for giving one answer over another, a meaningful but obvious decision being essentially unaffected (and therefore perhaps not predictable in the same way).

Whether predictive patterns of brain activity are unique to certain types of decision or are more widespread is a question very much worth asking, because it speaks directly to common notions of free will, moral responsibility and justice. Can notions of retribution against criminals be justified, for example, when an offender's action was already a deterministic fait accompli before reaching consciousness (Shariff et al., 2014)? Should we treat everyone the way most people would treat those whose personalities have changed radically through illness or trauma? A robust answer in either direction could therefore have major practical consequences. It would also be of great philosophical value, perhaps finally answering questions of how free humans really are when making different kinds of decision, and what role, if any, consciousness plays in making them.


References

Axelrod, V., Bar, M., Rees, G. & Yovel, G. (2015) Neural correlates of subliminal language processing. Cereb. Cortex, 25, 2160-2169.

Bar, M., Neta, M. & Linz, H. (2006). Very first impressions. Emotion, 6, 269-278.

Berg, J., Dickhaut, J. & McCabe, K. (1995). Trust, reciprocity and social history. Game. Econ. Behav., 10, 122-142.

Brosig, J. (2002). Identifying cooperative behavior: Some experimental results in a prisoner's dilemma game. J. Econ. Behav. Organ., 47, 275-290.

Bruers, S. & Braeckman, J. (2014). A review and systematization of the trolley problem. Philosophia, 42, 251-269.

Frank, R. H., Gilovich, T. & Regan, D. T. (1993). The evolution of one-shot cooperation: An experiment. Ethol. Sociobiol., 14, 247-256.

Foot, P. (1967). The problem of abortion and the doctrine of the double effect. Oxford Review, 5, 5-15.

Greene, J. D. (2004). Why are VMPFC patients more utilitarian? A dual-process theory of moral judgment explains. Trends Cogn. Sci., 8, 323-324.

Greene, J. D., Sommerville, R. B., Nystrom, L. E., Darley, J. M. & Cohen, J. D. (2001). An fMRI investigation of emotional engagement in moral judgment. Science, 293, 2105-2108.

Greene, J. D., Nystrom, L. E., Engell, A. D., Darley, J. M. & Cohen, J. D. (2004). The neural bases of cognitive conflict and control in moral judgment. Neuron, 44, 389-400.

Greene, J. D., Morelli, S. A., Lowenberg., K., Nystrom, L. & Cohen, J. D. (2008). Cognitive load selectively interferes with moral judgment. Cognition, 107, 1144-1154.

Haggard, P., & Eimer, M. (1999). On the relation between brain potentials and the awareness of voluntary movements. Exp. Brain Res., 126, 128–133.

Haxby J. V., Gobbini M. I., Furey M. L., Ishai A., Schouten J. L., & Pietrini P. (2001). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293, 2425–2430.

Koenigs, M., Young, L., Adolphs, R., Tranel, D., Cushman, F., Hauser, M. & Damasio, A. (2007). Damage to the prefrontal cortex increases utilitarian moral judgments. Nature, 446, 908-911.

Kosfeld, M., Heinrichs, M., Zak, P. J., Fischbacher, U. & Fehr, E. (2005). Oxytocin increases trust in humans. Nature, 435, 673-676.

Krueger, F., McCabe, K., Moll, J., Kriegeskorte, N., Zahn, R., Strenziok, M., Heinecke, A. & Grafman, J. (2007). Neural correlates of trust. PNAS, 104, 20084-20089.

Lages, M., & Jaworska, K. (2012). How predictable are “spontaneous decisions” and “hidden intentions”? Comparing classification results based on previous responses with multivariate pattern analysis of fMRI BOLD signals. Front. Psychol., 3, 1–8.


Libet, B., Gleason, C. A., Wright, E. W., & Pearl, D. K. (1983). Time of conscious intention to act in relation to onset of cerebral activity (readiness-potential). Brain, 106, 623–642.

Lotto, L., Manfrinati, A. & Sarlo, M. (2014). A new set of moral dilemmas: Norms for moral acceptability, decision times and emotional salience. J. Behav. Decis. Making, 27, 57-65.

Schulz, M.-A., Schmalbach, B., Brugger, P. & Witt, K. (2012). Analysing humanly generated random number sequences: A pattern-based approach. PLoS ONE, 7, e41531.

Shariff, A. F., Greene, J. D., Karremans, J. C., Luguri, J. B., Clark, C. C., Schooler, J. W., Baumeister, R. F. & Vohs, K. D. (2014). Free Will and Punishment: A Mechanistic View of Human Nature Reduces Retribution. Psychol. Sci., 25, 1-8.

Shibasaki, H. & Hallett, M. (2006). What is the Bereitschaftspotential? Clin. Neurophysiol., 117, 2341-2356.

Soon, C. S., Brass, M., Heinze, H.-J., & Haynes, J.-D. (2008). Unconscious determinants of free decisions in the human brain. Nat. Neurosci., 11, 543–545.

Soon, C. S., He, A. H., Bode, S., & Haynes, J.-D. (2013). Predicting free choices for abstract intentions. PNAS, 110, 6217–6222.

Sterzer, P., Haynes, J.-D. & Rees, G. (2008). Fine-scale activity patterns in high-level visual areas encode the category of visual objects. J. Vis., 8, 1-12.

Thomson, J. J. (1976). Killing, letting die, and the trolley problem. The Monist, 59, 204-217.

Tong, F. & Pratte, M. S. (2012). Decoding patterns of brain activity. Annu. Rev. Psychol., 63, 483-509.

Trevena, J. A., & Miller, J. (2002). Cortical movement preparation before and after a conscious decision to move. Consciousness and Cognition, 11, 162–190.

Wagenaar, W. A. (1972). Generation of random sequences by human subjects: A critical survey of literature. Psychol. Bull., 77, 65-72.

Winston, J. S., Strange, B. A., O'Doherty, J. & Dolan, R. J. (2002). Automatic and intentional brain responses during evaluation of trustworthiness of faces. Nat. Neurosci., 5, 277-283.

Zebrowitz, L. A. (2004). The origin of first impressions. J. Cult. Evol. Psychol., 2, 93-108.
