On the Domain Generality of Higher-Order Theory of Mind: Transfer Between the Marble Drop Game and the False Belief Task

Academic year: 2021

On the Domain Generality of Higher-Order

Theory of Mind

Transfer Between the Marble Drop Game and the False Belief Task

MSc Thesis (Afstudeerscriptie)

written by Jotte Kuilder

(born September 25th, 1994 in Hengelo, The Netherlands) under the supervision of Dr. Jakub Szymanik and Drs. Gert-Jan Munneke, and submitted to the Board of Examiners in partial fulfillment of

the requirements for the degree of

MSc in Logic

at the Universiteit van Amsterdam.

Date of the public defense: August 30, 2017

Members of the Thesis Committee: Prof. Benedikt Löwe, Dr. Jakub Szymanik, Drs. Gert-Jan Munneke, Prof. Rineke Verbrugge

Abstract

Theory of Mind (ToM) is the ability to attribute mental states to others and to oneself. In higher-order ToM this ability is applied recursively in two or more levels. This thesis investigates whether higher-order ToM is a domain general cognitive capacity by looking at transfer between two different cognitive domains operationalized with the False Belief Task (FBT) and the Marble Drop Game (MDG). Such transfer would be indicative of a common element to both tasks. The experiments conducted in this thesis were also used to test the validity of a specific transfer model constructed using the cognitive model PRIM. Two studies were conducted in which transfer was investigated. We found significant transfer in the first experiment and a trend towards transfer in the second experiment. Reaction times on the MDG were related to the use of ToM. The second experiment shows that the transfer is not mediated by working memory capacity, which increases the likelihood that it is indicative of higher-order ToM. The use of ToM across different cognitive domains corroborates its domain generality. Hence we provide support for the hypothesis that higher-order ToM is a domain general cognitive capacity. Moreover, results are consistent with the PRIM model. In order to make more robust judgments about the predictions of the PRIM model, this research should be repeated in a laboratory setting, preferably with longer training phases.

Acknowledgements

First and foremost, I would like to thank my supervisors Jakub Szymanik and Gert-Jan Munneke. Jakub, thank you for always being supportive of my often unfocused interests and questions. Your composure and sense of purpose have made for an almost completely stress-free process. Gert-Jan, thank you for getting me up to speed on statistics, being incredibly involved and helping me stick to my deadlines by creating them.

Secondly, I would like to thank Rineke Verbrugge and Burcu Arslan for taking the time to read and comment on my initial proposal. Thanks to Rineke and Ben Meijering for providing me with the files for the Marble Drop Game that was used in the experiments.

Thanks to Benedikt Löwe for guiding me throughout the Master’s program, always providing wise advice, and for agreeing to chair the committee. I would also like to thank the committee members Rineke Verbrugge, Maria Aloni and Sonja Smets for taking the time to read this thesis.

I would like to thank my mother and brother for putting up with a great number of phone calls, especially in the last weeks. Lastly I would like to thank my friends for being there, specifically Renske, Floris, Tina and Carlijn for working alongside me to keep me motivated.

Contents

1 Introduction

1.1 Theory of Mind

1.2 Domain generality of Theory of Mind

1.3 Transfer between Cognitive Tasks

1.4 The Marble Drop Game

1.4.1 Strategies in the Marble Drop Game

1.4.2 Theory of Mind in the Marble Drop Game

1.5 Research questions and predictions

2 Experiment 1

2.1 Methods

2.1.1 Participants

2.1.2 Materials

2.1.3 Procedure

2.2 Results

2.3 Discussion

3 Experiment 2

3.1 Introduction

3.2 Methods

3.2.1 Participants

3.2.2 Materials

3.2.3 Procedure

3.3 Results

3.4 Discussion

4 General discussion

4.1 Critical analysis

4.2 Future research

4.3 Conclusion

References

A Instructions Experiment 1

B Instructions Experiment 2

Chapter 1

Introduction

1.1 Theory of Mind

One of the origins of interest in Theory of Mind (ToM) is the idea that having a ToM is a uniquely human quality that is at the root of our social interactions and communities (Penn & Povinelli, 2007; Sperber & Wilson, 1995). ToM is defined as the ability to think about what other people know, want, intend and believe. Specifically, this refers to the ability to attribute mental states to others and oneself, as well as to understand that someone else’s mental state may be different from our own (Apperly, 2010). Hence both thinking “I will watch the series Game of Thrones tonight” and “You know nothing, Jon Snow” are instances of ToM. The first example displays awareness of one’s own intention, which is an instance of ToM. In the second example a mental state pertaining to knowledge is attributed to Jon Snow.

This ability is a necessity for many actions that we perform daily. In order to study for a test it is necessary to attribute mental states to ourselves, such as knowledge of the subject material and desire to obtain a good grade. Another example would be a conversation about a party that took place last weekend at which one of the two speakers was not present. In this case it is necessary for both speakers to understand that the other speaker has a different mental state in order for the conversation to be successful. Not insignificantly, this thesis would be very difficult to understand if the writer did not understand that the reader does not have the same knowledge as the writer. Another clear example is playing a game, since in almost all multi-player games strategy is based on the fact that the other player has different mental states than oneself. It is this scenario that this thesis is concerned with. We will first provide some more information on ToM before outlining how game-play and ToM will interact in this thesis.

It may seem trivial that we possess the capacity of ToM, but research has shown that there is an age before which children do not understand that there is a difference between what they know and what other people know. Specifically, they do not understand that they may have knowledge that other people lack. A verbal first-order False Belief Task (FBT) is first passed by typically developing children at the age of 4 on average (Sullivan, Zaitchik, & Tager-Flusberg, 1994; Wimmer & Perner, 1983; Perner, Leekam, & Wimmer, 1987). In the original task devised by Wimmer and Perner (1983), the child is presented with a story about Maxi. Maxi is in the kitchen and puts a chocolate bar in a drawer. Then Maxi leaves the kitchen. After Maxi leaves, his mother enters the kitchen and removes the chocolate bar from the drawer and puts it in the cupboard. Then mother leaves. Now the participant is told that Maxi returns to get the chocolate and is asked: “Where will Maxi look for the chocolate?”.

The correct answer to this question is ‘in the drawer’ since Maxi does not know that the chocolate was moved by his mother. What is important for succeeding in this task is the realization that Maxi’s knowledge is different from that of the participant. Moreover the participant must realize that this knowledge leads Maxi to look in a different place than the participant would look himself/herself. The participant believes that Maxi believes that the chocolate remains located in the drawer. Interestingly, children and adults with autism are often unable to successfully perform this task. Because of this inability, a lack of ToM has been suggested to be the cause of the disorder (Baron-Cohen, Leslie, & Frith, 1985). However, other (more recent) theories exist such as impaired closed-world reasoning (Stenning & Van Lambalgen, 2012) and diminished counterfactual predictions and perceptual presence of others’ mental states (Palmer, Seth, & Hohwy, 2015).
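The belief tracking that the first-order FBT asks for can be illustrated with a minimal sketch. It assumes a simple event encoding of the Maxi story (the tuple format and the function name `run_story` are our own illustrative choices, not part of the original task): a character's belief about the chocolate is updated only when that character is present while it is moved.

```python
def run_story(events, characters):
    """Track where the object really is and where each character believes it is."""
    location = None
    present = {name: False for name in characters}
    belief = {name: None for name in characters}
    for kind, actor, place in events:
        if kind == "enter":
            present[actor] = True
        elif kind == "leave":
            present[actor] = False
        elif kind == "put":
            location = place
            # Only characters who are in the room observe the move.
            for name in characters:
                if present[name]:
                    belief[name] = place
    return location, belief


# Hypothetical encoding of the Maxi story as (event, actor, place) tuples.
story = [
    ("enter", "Maxi", None),
    ("put", "Maxi", "drawer"),
    ("leave", "Maxi", None),
    ("enter", "Mother", None),
    ("put", "Mother", "cupboard"),
    ("leave", "Mother", None),
    ("enter", "Maxi", None),
]

location, belief = run_story(story, ["Maxi", "Mother"])
print(location)        # cupboard (the zero-order answer)
print(belief["Maxi"])  # drawer (the first-order answer)
```

The gap between the real location and Maxi's recorded belief is exactly what the participant has to represent in order to answer correctly.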

Situations in which first-order ToM is applied are numerous. However, higher-order ToM exists and is not uncommon. Higher-order ToM has more embeddings than first-order ToM. “You believe that your mother believes that your sister hates classical music” is a sentence that shows the use of second-order ToM. In this sentence the mother believing that the sister hates classical music is first-order ToM. The second embedding saying that that is what you believe makes the sentence display second-order ToM. An example of when second-order False Belief is used in daily life is in the use of sarcasm. When you sarcastically tell your sister that she is looking great today, you don’t want your sister to believe that you think that she is looking great today, which requires

second-Mary talked to the driver of the ice cream truck. There is a second embedding necessary for this reasoning, which is why the task requires second order ToM.

Figure 1.1: The second-order False Belief Task as described by Perner and Wimmer (1985).

1.2 Domain generality of Theory of Mind

Not only has there been interest in exactly what aspects of ToM exist in human cognition, but also in how this capacity functions. In many areas of cognitive science, debates have been sparked about domain generality and domain specificity of cognitive capacities (e.g. Barrett & Kurzban, 2006; Chomsky, 1988; Sperber & Wilson, 1995; Kanwisher, 2000; Evans, 2003). A cognitive capacity is considered to be domain specific if it is an automatic skill that is only used in very specific situations or tasks. Moreover, these domain specific capacities are often considered modules that are located in distinct locations in our brains (Fodor, 1992). An example of such a module is face perception (Kanwisher, 2000).

A domain general capacity is a capacity that takes more computational power, takes more time and can be applied in many different situations (Evans, 2003). Hence it is a slow and deliberate mechanism that does not require a specific type of input. An example of such a capacity is statistical learning (see Frost, Armstrong, Siegelman, & Christiansen, 2015). Our cognitive capacity to track occurrences and distill patterns from these occurrences is something that can be used in learning mathematics, but also the perfect serve in tennis or what reaction is desired in a social interaction.

The debates on domain generality or specificity of cognition initially started with Fodor’s (1983) proposal that ToM is a domain specific module. Since ToM is a capacity that is used continuously throughout the day, it seems likely that using ToM is not a slow process with a large cognitive demand. Hence researchers have theorized that ToM is a domain specific cognitive capacity (e.g. Fodor, 1983, 1992; Sperber & Wilson, 1995). Some components of ToM have been empirically demonstrated to be automatic processes. One example of such a capacity is level-1 perspective taking. Level-1 perspective taking is the ability to realize that people with different lines of sight see different things.

Samson et al. (2010) conducted a study in which participants were asked to say how many dots they were able to see in a room in which an avatar was placed. Sometimes the avatar saw the same number of dots as the participant, and sometimes the avatar only saw part of them as displayed in Figure 1.2. The study showed that reaction times were slower if the perspective of the avatar

Figure 1.2: An example of stimuli in Samson et al.’s study on level-1 perspective taking (2010)

However, whereas for example Fodor (1992) claims that all aspects of ToM are domain specific, there are also researchers who investigate the possibility that ToM is a mixture of domain general and domain specific faculties (e.g. Apperly, 2010; McKinnon & Moscovitch, 2007).

In this thesis we will consider Apperly’s theory as an account of domain general ToM (2010). He proposes a ‘two-systems’ account for ToM (in his terminology, mindreading). He developed his theory in an attempt to reconcile two aspects of ToM, namely that on the one hand ToM is highly flexible, but that on the other hand it is often fast and efficient. He concludes that the diversity in cognitive effort and flexibility is a reflection of the variability of the means by which ToM functions. Hence he distinguishes two types of cognitive processes: lower-level ToM and higher-level ToM. The former is very efficient, but inflexible (a domain specific module). The latter is slow and inefficient, but also very flexible (a domain general process). Lower-level ToM consists of capacities that are downwards modularized. Modularization means that when a certain capacity has been used over and over for the same tasks, then this capacity will become a domain specific module as defined previously.

For example, Apperly claims that level-1 perspective taking is a domain specific module, whereas level-2 perspective taking is not. Level-2 perspective taking is defined as the ability to realize that other people may see the same things in different ways. For example, when sitting across from someone at a table, what looks like a 6 to you will look like a 9 to them. In Apperly’s theory, higher-order ToM as defined in this thesis is considered a higher-level process (2010). His theory predicts that higher-order ToM is a slow capacity that can be applied in different domains. Hence reaction times will be slow and it is possible that ToM is used in the MDG.

As a representative of the alternative view that ToM is a domain specific module, we will consider Leslie’s theory (1987, 1994; 2004). In his account, ToM (including higher-order ToM) is an innate cognitive module. The ToM mechanism constructs agent-centered descriptions of situations. This means that agents are placed in relation to information. The theory predicts that the module is not transferred across cognitive domains and that the module is efficient. Hence reaction times will be fast and ToM is not used in the MDG.

This thesis is concerned with the question whether higher-order ToM is domain general or domain specific. To answer this question, experiments were conducted in different cognitive domains. If higher-order ToM is a domain specific faculty, it would not manifest in two completely different domains. Hence considering if ToM is being used in two different domains will shed light on the question whether higher-order ToM is a domain general or domain specific faculty. One domain considered in this thesis is false belief and the other domain is that of games. Specifically we will investigate the use of ToM in the Marble Drop Game (MDG), a game for which strategies have been extensively studied and ToM is considered to be prevalent. To assess whether ToM is used in the MDG, we will consider transfer from the Marble Drop Game to the FBT. We assume that if such transfer is found, it must be the case that higher-order ToM was being used in the MDG. Wierda and Arslan (2014) simulated a model that predicts the occurrence of such transfer. Hence this thesis will also assess the validity of this specific model.

Research on ToM contributes to enhancing the dynamics of both social interaction and communication. Answering the question whether higher-order ToM is domain general or domain specific will contribute to a better understanding of ToM, which will in turn help improve understanding of how our communities arise. Moreover, it will allow for more directed research on developmental disorders (such as autism) and cognitive deficits due to brain damage that are causing diminished function of ToM. This will aid in creating treatments that might restore function or prevent social impairment.

Scientifically, uncovering the specifics of the mechanism that is ToM will allow for directed research into the evolution of ToM. Cognitive function could be compared to other animals in order to track its development. Additionally, the cognitive mechanisms used in ToM could be compared with for example those used in language development, which would aid in verifying theories that propose co-evolution of ToM and language. Moreover, it could aid in distilling which element of human cognition possibly makes humans distinct from other highly intelligent animals.

In the next sections of the first chapter we will first discuss transfer phenomena and their relation to domain generality. Then we will discuss the MDG

1.3 Transfer between Cognitive Tasks

Cognitive transfer is a phenomenon in which training on one task leads to improved performance on a different task. Its tradition goes back to the ancient Greeks, when Plato first formulated the idea of formal discipline in ‘The Republic’ (1987). Formal discipline is the idea that by learning Latin and Mathematics, the brain is trained as a muscle, which would lead to improvement on other unrelated tasks. In 1922, Thorndike criticized this idea and introduced a theory of identical elements to discuss transfer. His theory claims that transfer only occurs when knowledge elements are identical. So for example it is easier to learn Italian when one knows how to speak French, since there are many common elements to the two languages that are derived from Latin. This theory was later revisited to increase its precision (e.g. Singley & Anderson, 1985; Kieras & Bovair, 1986). In these accounts the production rule is introduced as the element of transfer. This means that some rule is learned in order to complete a task, which gives a certain output. If this same rule can be applied in a different task, there is transfer between the two tasks.

There are several problems with taking procedural knowledge to be the source of transfer (see Taatgen, 2013). This is why other accounts tried to use declarative knowledge as a source of transfer (e.g. Hummel & Holyoak, 1997; Forbus, Gentner, & Law, 1995). These theories look at analogical transfer. The idea is that knowledge about a previous task is retrieved from declarative memory and adapted to a new situation.

Unfortunately neither of these two approaches is able to account for far transfer phenomena. Far transfer occurs when transfer is observed between two tasks that are seemingly unrelated. An example of such transfer is that training task-switching improves performance on the Stroop task, a working memory task and also the Raven’s test (see Karbach & Kray, 2009). To be able to account for such phenomena, Taatgen (2013) introduces a different model of cognitive transfer, which takes smaller elements as a basis for transfer. This model is called PRIM and will be discussed below.

Please recall that this thesis looks at transfer from the MDG to the FBT. This transfer is categorized as far transfer. The appearance of far transfer is a strong indication of domain generality, since it requires some cognitive element to be used in two distinct cognitive domains. The existence of such an element is inconsistent with the definition of domain specificity. Moreover, it is predicted by the definition of domain generality. Hence finding far transfer between two tasks shows that there is a domain general cognitive element that is used in both tasks.

PRIM

In this section we will discuss the primitive elements theory PRIM developed by Taatgen in 2013. The computer model used in Wierda and Arslan (2014) is based on this theory. It is an extension of the cognitive architecture ACT-R. Both of these cognitive architectures were designed to model transfer phenomena.

In ACT-R, complex tasks are solved by using production rules in procedural memory. The primitive elements theory breaks these production rules down to even smaller building blocks. PRIMs are primitive elements of cognition that can move, compare or copy information between different modules. These modules include a manual module, a declarative memory module, a working memory module, a visual module and a task control module as depicted in Figure 1.3.

1.4 The Marble Drop Game

The MDG is a game that was recently developed. It is formally equivalent to the centipede game, which has long been subject to study in the field of game theory (see Aumann, 1995). However, the interface of the game is designed to be more intuitive and hence easier to play. An example of the centipede game is provided in Figure 1.4. In the game two players A and B take alternate turns. At each node, one player can decide to exit or continue the game. If the player exits, a certain payoff is reached. At some point the game ends and the last player may decide between two different payoffs. The goal is to obtain the largest possible payoff.

Figure 1.4: An example of the centipede game.

In the case of the MDG, there is a white marble that drops onto levers which are controlled by the players. The payoffs are represented by colored marbles that lie in bins into which the white marble may fall. The payoff is determined by which bin is reached. Instead of trying to obtain the highest possible number, the goal is to obtain the darkest possible marble. The choice to exit is depicted by opening the left lever, and the choice to continue playing is depicted by opening the right lever. The setup is depicted in Figure 1.5. In this figure, the highest payoff for the player controlling the orange levers is placed in the far right bin.

Figure 1.5: One instance of a second-order Marble Drop Game (Meijering et al., 2012)

1.4.1 Strategies in the Marble Drop Game

There are several different strategies that could be applied in the MDG. Some of these strategies involve first or second order ToM. Some strategies do not involve ToM at all. An example of a strategy that does not use ToM is a risk-averse strategy. In this strategy the player makes decisions based on perceived risk of being disadvantaged. Knowledge of the specifics of the strategies detailed below will be assumed in later chapters.

Backward Induction

The original approach to solving extensive form games with perfect information is the strategy called Backward Induction (Aumann, 1995). It has been proven

they will make the best choice for themselves, assuming common knowledge of rationality, which in turn implies that they will play the Backward Induction strategy. Hence, if anyone deviates from the Backward Induction strategy, the other player will not be able to change their strategy.

Experimental studies in behavioral economics have shown that the Backward Induction outcome is often not reached (McKelvey & Palfrey, 1992). In the game of Figure 1.4, Backward Induction would yield the strategy (a, e) for player A and (c) for player B, where the reasoning would be: at the last node, playing e yields more for player A. Since player A will play e at the last node, player B should exit the game at the previous node by playing c. Then since player B believes that player A will play e and hence would play c, player A should exit the game at the start by playing a. The oddity is that a result of e would be better for both player A and player B.
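The reasoning above can be made concrete with a minimal sketch of Backward Induction on a three-node game tree. The payoffs of Figure 1.4 are not reproduced in the text, so the numbers below are hypothetical and were chosen only to match the described pattern (e beats f for A, c beats e for B, a beats c for A, and e is better than a for both players). The node names and the `Node` class are likewise illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

Payoff = Tuple[int, int]  # (payoff for player A, payoff for player B)


@dataclass
class Node:
    name: str
    player: int  # 0 = player A, 1 = player B
    moves: Dict[str, Union["Node", Payoff]]


def backward_induction(node, choices):
    """Solve the subgame at `node`, recording the chosen move at every decision node."""
    if isinstance(node, tuple):  # a leaf: the payoff pair itself
        return node
    # Solve every subgame first, then let the mover pick what is best for them.
    outcomes = {move: backward_induction(child, choices)
                for move, child in node.moves.items()}
    best = max(outcomes, key=lambda move: outcomes[move][node.player])
    choices[node.name] = best
    return outcomes[best]


# Hypothetical payoffs, NOT taken from Figure 1.4.
last = Node("A2", 0, {"e": (4, 3), "f": (3, 6)})
middle = Node("B1", 1, {"c": (1, 4), "d": last})
root = Node("A1", 0, {"a": (2, 1), "b": middle})

choices = {}
outcome = backward_induction(root, choices)
print(choices)  # {'A2': 'e', 'B1': 'c', 'A1': 'a'}
print(outcome)  # (2, 1)
```

With these assumed payoffs the sketch reproduces the strategy (a, e) for player A and (c) for player B, while the outcome of e, here (4, 3), would have been better for both players than the induced outcome (2, 1).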

Forward Induction

A different strategy is the strategy of Forward Induction. This strategy is to rationalize the opponent’s past behavior in order to assess his future moves (Stalnaker, 1998; Ghosh, Heifetz, & Verbrugge, 2016). So for example, if player A would start by playing b in Figure 1.4, player B should rationalize this move and then adjust his strategy to a new best response. Suppose that a player plays a strategy that is not optimal assuming common knowledge of rationality, then their opponent might be able to rationalize the choice by attributing a strategy that is optimal against a suboptimal strategy of theirs. If Forward Induction is consistently applied, the player will play an Extensive-Form Rationalizable strategy (Pearce, 1984; Ghosh et al., 2016). Such a strategy would allow for partial cooperation. Clearly this strategy requires ToM, since strategy is based on the expected strategy of the opponent. Since the games involved in this research are not very long, there is not much opportunity for rationalization. Hence this strategy will not be considered in the current research.

Forward Reasoning + Backtracking

Another distinct strategy is Forward Reasoning with Backtracking. A player using this strategy first attempts to find which trapdoor should be opened to reach the highest possible payoff (forward reasoning), after which the player uses backtracking to find out whether the goal is reachable. Backtracking is the procedure of jumping back to previous decision points (Meijering, Van Rijn, Taatgen, & Verbrugge, 2012). In this strategy ToM is used, since finding out whether the optimal solution can be reached requires predicting the opponent’s moves. Using Forward Reasoning in combination with Backtracking is computationally cheaper than Backward Induction for games of length three (Bergwerff, Meijering, Szymanik, Verbrugge, & Wierda, 2014; Szymanik, Meijering, & Verbrugge, 2013). The games of length three are referred to as second-order MDGs in this thesis.

Own Payoff = Forward Reasoning - Backtracking

Of course there are also less complicated strategies that may be used to play the game. One such strategy is to simply be concerned only with one’s own payoff. This strategy is equivalent to Forward Reasoning without Backtracking. The player finds the maximum payoff and plays such that it will reach that payoff. This means for instance that in Figure 1.4, player B would play strategy d. Since the player completely disregards the opponent, this strategy does not require ToM.
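As a minimal sketch, the Own Payoff strategy can be written down for a linear game in which exiting at node i leads to bin i and continuing at the final node leads to the last bin. The list encoding, the function name `own_payoff_move`, and the payoff values are illustrative assumptions; the payoffs of Figure 1.4 are not given in the text.

```python
def own_payoff_move(bins, node, player):
    """Forward Reasoning without Backtracking.

    bins[i] is the (A, B) payoff pair reached by exiting at node i;
    the last bin is reached by continuing at the final node.
    player is 0 for A and 1 for B.
    """
    still_ahead = range(node, len(bins))
    # Pick the bin with the highest own payoff and ignore the opponent entirely.
    target = max(still_ahead, key=lambda i: bins[i][player])
    return "exit" if target == node else "continue"


# Hypothetical payoffs, NOT taken from Figure 1.4.
bins = [(2, 1), (1, 4), (4, 3), (3, 6)]

print(own_payoff_move(bins, node=1, player=1))  # continue (player B plays d)
print(own_payoff_move(bins, node=0, player=0))  # continue (player A plays b)
```

Under these assumed payoffs player B continues because B's highest payoff lies further down the game, even though a rational player A would never let the marble reach that bin. No prediction of the opponent enters the computation, which is why the strategy involves no ToM.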

Adaptive heuristics

Game-theoretic strategies are not the only strategies that exist. Another way of learning and adapting strategy is explained by the theory of adaptive heuristics. This theory does not require the use of ToM, but still allows for a player to learn while playing iterated games. An adaptive heuristic is a rule of behavior that is myopic (not concerned with others, but simply with maximal gain), but leads to seemingly rational behavior. Such heuristics are not fully rational, but come close to mimicking rational behavior since they adapt to previous game play.

The adaptive heuristic relevant to the MDG is Regret Matching. This heuristic is defined as follows: “Switch next period to a different action with a probability that is proportional to the regret for that action, where regret is defined as the increase in payoff had such a change always been made in the past” (Hart, 2005, p. 1405). Simply put, the player will change strategies if a different strategy would have given a better result. As such a player learns from previous games, without explicitly defining a different strategy through reasoning. Interestingly, this strategy converges to the set of correlated equilibria. Hence, in the long term, players using this strategy are indistinguishable from fully rational players (Hart, 2005). The number of trials that participants complete in the experiments conducted in this thesis is not enough for this equilibrium to occur.
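A minimal sketch of the switching rule in the quoted definition is given below. The payoff table, the action names and the normalizing constant `mu` are hypothetical. `mu` is an assumption of this sketch and merely has to be large enough for the probabilities to sum to at most one. Regret is computed as the average gain over all past periods that the alternative action would have yielded in those periods where the current action was played.

```python
def regret(history, payoff, current, alternative):
    """Average payoff increase had `alternative` always been played instead of `current`."""
    gain = sum(payoff[(alternative, opp)] - payoff[(own, opp)]
               for own, opp in history if own == current)
    return max(gain / len(history), 0.0)  # only positive regret counts


def switch_probabilities(history, payoff, actions, mu):
    """Probability of each action in the next period under Regret Matching."""
    current = history[-1][0]  # the action played in the last period
    probs = {k: regret(history, payoff, current, k) / mu
             for k in actions if k != current}
    probs[current] = 1.0 - sum(probs.values())
    return probs


# Hypothetical payoffs for the player, indexed by (own action, opponent action).
payoff = {
    ("exit", "left"): 2, ("exit", "right"): 2,
    ("continue", "left"): 1, ("continue", "right"): 4,
}
# Hypothetical history of (own action, opponent action) pairs.
history = [("exit", "right"), ("exit", "right"), ("exit", "left"), ("exit", "right")]

probs = switch_probabilities(history, payoff, ["exit", "continue"], mu=5)
print(probs)  # {'continue': 0.25, 'exit': 0.75}
```

In this example continuing would have gained 2 + 2 - 1 + 2 = 5 over four periods, an average regret of 1.25, so with `mu = 5` the player switches with probability 0.25. The rule uses only the player's own past payoffs, which is why it does not involve ToM.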

Verbrugge (2015) conducted the only study concerned with this question. They asked their participants “When you made your choices in these games, what did you think about the ways the computer would move when it was about to play next” (Halder et al., 2015, p. 3). They next classified the answers to this question according to markers indicating the presence of ToM. Of the 48 participants in their experiment, 5 participants were classified as zero-order players, 27 participants were classified as first-order players and the 16 remaining participants were classified as second-order players. Interestingly, second-order players made fewer mistakes than first-order players and first-order players made fewer mistakes than zero-order players. Since we are also concerned with the use of ToM in the MDG, we will return to these classifications in the general discussion.

Besides experimental research, there have also been attempts to theoretically model behavior in the MDG. One theoretical paper (Wierda & Arslan, 2014) uses the cognitive architecture PRIM to model transfer from the MDG to the FBT. If such transfer occurs it shows that there is some component of the MDG that is also used in the FBT. Assuming that the FBT is a valid measure of ToM, it is likely that the presence of transfer between the two tasks would imply that ToM is used in the MDG.

Even though some researchers have argued that the FBT measures something other than ToM (Stenning & Van Lambalgen, 2012; Bloom & German, 2000), there is evidence that success in the FBT requires more than counterfactual reasoning (Peterson & Bowler, 2000) or linguistic ability (Arslan, Hohenberger, & Verbrugge, 2012). Hence we will use the FBT as a valid measure of ToM. Since this research is testing fully developed adults, task requirements such as counterfactual reasoning or linguistic recursion should not be a determining factor of performance. There is however some evidence that performance on the FBT is mediated by working memory capacity (e.g. Mutter, Alcorn, & Welsh, 2006; Davis & Pratt, 1995). This effect will be controlled for in the experimental design.

To return to Wierda and Arslan’s (2014) study, their model assumes a Forward Reasoning plus Backtracking algorithm for solving the MDG. Given the experimental evidence on default strategies in the MDG, this assumption is justifiable. For the FBT, the model stores all the story facts in declarative memory and builds an internal representation as the story is presented. At the end of the story, it backtracks through this representation to first find the zero-order answer and then checks which character observed which action. Although the tasks seem very different, there are some combinations of PRIMs that are used in both tasks. In the model, 20 trials were run in four different conditions. Each trial consisted of 3 blocks of 40 tasks. The conditions were

1. MDG-MDG-MDG
2. FBT-FBT-FBT
3. MDG-FBT-MDG
4. FBT-MDG-FBT

Here MDG means a block of 40 MDGs and FBT means a block of 40 FBTs. The second block of condition 1 was compared to the last block of condition 3. The second block of condition 2 was compared to the last block of condition 4. To examine transfer, log reaction times of correct trials were considered. They found an 11.14% decrease in reaction time from the second block of condition 1 to the last block of condition 3. Moreover, they found a 44.2% decrease in reaction time from the second block of condition 2 to the last block of condition 4. The reaction times were only significantly different for transfer from the MDG to the FBT.

Wierda and Arslan (2014) suggest that this difference is due to the difference in complexity of the two tasks. They posit that the MDG requires more use of working memory and hence gets more training in specific working memory strategies. These working memory strategies hence gain more weight in the declarative memory and can be more easily retrieved. Moreover, the size of the workspace is determined by the size of the buffers of each specific module it connects to. Since working memory is employed more in the MDG, working memory capacity limits performance on the MDG more than performance on the FBT. Secondly they suggest that transfer found in this model may be inflated by learning effects that occur in the third block. One direction of transfer has already been tested, namely transfer from the False Belief Task to the Marble Drop Game. Arslan et al. (2014) found that there was no transfer from the FBT to the MDG, which is consistent with the predictions of PRIM.

There are still some open questions in this field of research. Firstly, transfer from the MDG to the FBT has not yet been experimentally tested in human subjects in order to verify whether the predictions that PRIM has made are also empirically plausible. Secondly, the use of ToM in the MDG has not been studied extensively. In order to contribute to the debate on domain generality of ToM, this thesis will investigate the use of ToM in the MDG. Moreover, in doing so it will test the predictions made by Wierda & Arslan (2014) and thereby help to validate the PRIM model.

1.5 Research questions and predictions

This thesis is primarily concerned with the question:

Is higher-order ToM a domain general capacity?

Since domain generality predicts that higher-order ToM will be utilized across different domains, we will investigate the use of ToM in the MDG. The MDG does not require the use of ToM to be solved. Yet players are hypothesized to employ higher-order ToM whilst playing the game. If higher-order ToM were a domain specific cognitive faculty, then it would not be used in this game. Secondly, in accordance with domain generality, PRIM theory predicts that there is a common element to the MDG and the FBT that will create transfer from the MDG to the FBT. Hence, to answer the main research question we will strive to answer two subquestions:

• Is higher-order ToM being employed by players of the MDG?

• Does the cognitive model PRIM correctly predict transfer between the MDG and the FBT?

We hypothesize that indeed higher-order ToM is a domain general capacity. Moreover, higher-order theory of mind is employed by players of the MDG, and hence transfer will occur from the MDG to the FBT. Given the more complex nature of the MDG, transfer from the FBT to the MDG will not occur.

To verify these hypotheses, subjects will be presented with a number of higher-order MDGs, as well as a higher-order FBT. They will be placed in two conditions. Either they will first be presented with the MDGs, or they will first complete the FBT. Reaction times and accuracy will be measured in each task. By comparing performance on the FBT before and after playing the MDG, transfer can be measured. Similarly, if performance on the MDG improves after completion of the FBT, transfer occurs. After completing the MDGs, participants are asked to describe their strategy. The specifics of the experimental design will be provided in the next two chapters. The experimental setup will test the following predictions:

1. Higher-order ToM is a domain general cognitive capacity.

• Prediction: Higher-order ToM is used in the MDG. Reaction times will be slower when higher-order ToM is being employed.

2. Results will be consistent with PRIM theory.

• Prediction: Transfer from the MDG to the FBT will be found, but not vice versa.

• Prediction: Transfer from MDG to FBT. This means better accuracy and/or lower reaction times on the FBT after completing the MDG. Moreover, strategy description for the MDG will include statements indicative of higher-order ToM.

It is important to note that the three hypotheses generate similar predictions. This means that they are not exclusive and that they cannot be fully distinguished on the basis of the experimental results of this thesis.

Chapter 2

Experiment 1

2.1 Methods

2.1.1 Participants

Participants were recruited through Facebook. Out of 102 participants, 15 were excluded for failing control questions and hence did not complete the survey. Amongst the remaining 87 participants, age varied from 17 to 75 (M=31.79, SD=14.45). The sample consisted of 31 men and 56 women.

2.1.2 Materials

The measures considered in this experiment are accuracy and reaction time on the FBT, accuracy and reaction time on the MDGs, a strategy specification evaluated for presence of ToM, mental fitness, mental fatigue, transfer, gender, age and education. Accuracy on the MDG is a dependent variable evaluated by the number of correct responses on each instance of the MDG. The computer is programmed to behave rationally through Backward Induction. Hence correct answers are answers that obtain the darkest possible marble taking into account the rational decision of the computer (for example Backward Induction and Forward Reasoning + Backtracking would both yield the correct answer). The participants are not given this information explicitly, as can be seen in the instructions in Appendix A. The correct responses are averaged for each level of the game. Reaction time is a dependent variable measured by the last click on the page. Reaction times are averaged for each level of the game. Accuracy and reaction time are assessed separately for the zeroth-, first- and second-order MDGs.
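As a rough illustration of what Backward Induction computes in this setting, the sketch below solves an abstract version of the game. It is not the code that drove the computer opponent. The representation is an assumption made for the example: a list of bins, each holding a pair of marble darkness ranks (participant, computer), with the two players alternating at the trapdoors, the participant moving first, and all payoffs distinct. The function and variable names are hypothetical.

```python
def backward_induction(bins, first_mover=0):
    """Solve a sequential stop-or-continue game by Backward Induction.

    bins[i] is a pair (participant_payoff, computer_payoff), for example
    marble darkness ranks. At trapdoor i the mover either stops (the marble
    falls into bin i) or continues; after the last trapdoor the marble ends
    in the final bin. Movers alternate, starting with `first_mover`
    (0 = participant, 1 = computer). Payoffs are assumed to be distinct.
    """
    outcome = len(bins) - 1  # bin reached if every mover continues
    choices = []
    # Work from the last trapdoor back to the first.
    for i in range(len(bins) - 2, -1, -1):
        mover = (first_mover + i) % 2
        if bins[i][mover] > bins[outcome][mover]:
            outcome = i
            choices.append("stop")
        else:
            choices.append("continue")
    choices.reverse()
    return outcome, choices


# Hypothetical second-order game: three trapdoors, four bins.
example_bins = [(2, 3), (1, 1), (4, 2), (3, 4)]
final_bin, plan = backward_induction(example_bins)
```

In this hypothetical game the participant should let the marble pass the first trapdoor: a rational computer will also continue at the second trapdoor, after which the participant can stop at the third bin and obtain the darkest marble available to them.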

Similarly, accuracy on the FBT is a dependent variable evaluated as binary (correct, incorrect) and reaction time is a dependent variable measured by the last click on the page. The strategy specification given by the participant will provide a measure of what level of ToM was employed by the player. In the survey, after completing the marble drop games, participants are asked to explain how they came to their decisions. These explanations are independently assessed for presence of ToM by two students not related to the research. They are given two criteria that have to be met for the strategy to be classified as using ToM:

1. The strategy description mentioned the computer.

2. Intentional verbs were used in the description with regard to the computer.

The ToM variable created from these classifications is an independent variable.

Next, mental fatigue is measured as decline in mental fitness. Mental fitness is a covariate assessed by asking the participants “How would you rate your current mental fitness?”. The participants rated their mental fitness on a scale of 1 through 10. There are two measures of mental fatigue. The first is measured as a change in mental fitness after playing the MDGs. The second is measured as a change in mental fitness after completing the FBT. Mental fatigue is used as a covariate.

Transfer is operationalized by comparing performance across two different conditions as described below. Transfer between two different tasks implies common elements in the two tasks, such that training on one task improves performance on the second. Lastly, gender, age and education will be considered as covariates.

The research design is experimental and operates on accuracy and reaction time. The experiment consists of two different conditions, administered between subjects, see Figure 2.1. In the first condition participants first play a series of MDGs, after which they complete a second-order FBT. In the second condition they complete the same tasks, but in reverse order. The participants give a strategy description after completing the MDGs. Moreover, they rate their mental fitness at several stages in the experiment. The exact details of the design are given in the procedure. Next, the relation between the research design and the hypotheses will be discussed.

Figure 2.1: Temporal flow of the experimental conditions

Accuracy and reaction times on both the MDG and the FBT can be compared between conditions. If there is a significant difference in accuracy or reaction time on the FBT such that accuracy is higher or reaction time is lower in the MD-FBT condition, there is transfer from the MDG to the FBT. This would be consistent with the predictions of the PRIM model, as well as the hypothesis that higher-order ToM is being used in the MDG. Moreover, this would show that ToM can be applied across different domains and is hence a domain-general capacity.

Similarly, increased accuracy or lowered reaction times on the MDG in the FBT-MD condition would show transfer from the FBT to the MDG. If data is consistent with the PRIM model, then such transfer will not be found. Moreover this transfer is not predicted by any of our hypotheses.

The responses to the question asking about strategy specification in the MDG will allow for a classification of ToM usage for each participant. Comparing accuracy and reaction times for the MDG with respect to their usage of ToM will provide information on the domain generality of higher-order ToM. Reaction times are predicted to increase with the level of ToM displayed in the strategy specification. Moreover, accuracy is also expected to increase with the level of ToM, since the strategies that employ ToM are more accurate and the games are designed to induce the use of higher-order ToM.

Lastly, there are several covariates that may influence accuracy and reaction times on the MDG and the FBT. These variables are age, gender, education, mental fitness and mental fatigue.

2.1.3 Procedure

The experiment was conducted using the online survey software Qualtrics. Participants were randomly assigned to one of two conditions. In both conditions participants passed control questions to check whether their screen settings and perception were sufficient to perceive color differences that were necessary for the MDG. Moreover, all participants passed control questions about the MDG and the FBT to verify that they had understood the instructions. Participants that did not answer the control questions correctly were automatically excluded from the survey. The excluded participants formed 14.7% of the original sample. In the first condition (MD-FBT) participants were first presented with a series of MDGs. In this sequence they played four zeroth-order games (see Figure 2.2), eight first-order games (see Figure 2.3) and eight second-order games (see Figure 2.4). Meijering and Verbrugge generously allowed for the use of the original game as presented in Meijering et al. (2012). After each game, the participants were presented with correct/incorrect feedback on their performance. After completing this task, they were presented with a second-order FBT, which was modeled after the task created by Perner & Wimmer (1985) presented in chapter 1. The survey instructions are included in Appendix A. The MDGs are included in Appendix C.

Figure 2.3: A first-order MDG (Meijering et al., 2012).

In the second condition (FBT-MD) participants were first presented with the second-order FBT. Next they were asked to complete the same series of MDGs as the participants in the first condition. Again participants were presented with correct/incorrect feedback on their performance for each of the games.

In both conditions, participants were asked to provide a strategy specification after completing the series of MDGs. They were asked to answer the open question “How did you decide on which lever to open in the previous games?”. After completing the series of MDGs and the second-order FBT, participants were asked to answer demographic questions concerning age, gender and education. In both conditions participants were asked to rate their mental fitness on a scale from 1 to 10 before the first task, in between the first and the second task, and after the second task.

Whilst completing the survey, participants were unable to save progress and return to the survey at a later time. Hence, the participants completed the experiment without interruption. Moreover, participants were barred from completing the survey more than once based on IP-address, so there was no problem with false data points due to repeated partaking.

2.2 Results

Data was collected between May 15th and May 18th. An independent-samples t-test showed that there was no significant difference in reaction time between conditions for each level of the MDG. Results were filtered on reaction time. Reaction times of over 15 seconds in the zeroth-order games, over 20 seconds in the first-order games and over 40 seconds in the second-order games were excluded. These boundaries were chosen after visual identification of outliers on the basis of a histogram of reaction times. For the zeroth-order games in the FBT-MD condition (M=5.617, SD=1.776) and MD-FBT condition (M=5.799, SD=1.835) we found t(80) = −.453, p > .05. For the first-order games in the FBT-MD condition (M=9.163, SD=3.315) and MD-FBT condition (M=9.041, SD=2.744) we found t(80) = .178, p > .05. For the second-order games in the FBT-MD condition (M=12.284, SD=7.750) and MD-FBT condition (M=13.083, SD=8.492) we found t(80) = −.444, p > .05, see Figure 2.5.
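The filtering and comparison described above can be sketched as follows. This is a minimal illustration, not the analysis script used for the thesis. It assumes that the cutoffs are applied to per-participant mean reaction times and that a pooled-variance (Student) t-test is used, which would be consistent with the integer degrees of freedom reported. The data values and all function names are invented for the example.

```python
from math import sqrt
from statistics import mean, variance

# Cutoffs in seconds per MDG order, as stated in the text for Experiment 1.
RT_CUTOFFS = {0: 15.0, 1: 20.0, 2: 40.0}


def filter_reaction_times(rts, order):
    """Keep reaction times that do not exceed the cutoff for this order."""
    return [rt for rt in rts if rt <= RT_CUTOFFS[order]]


def pooled_t_test(a, b):
    """Student's t statistic and degrees of freedom for two independent samples."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Invented per-participant mean reaction times (seconds), zeroth-order games.
fbt_md = filter_reaction_times([4.2, 5.9, 6.3, 17.5, 5.1], order=0)
md_fbt = filter_reaction_times([5.5, 6.8, 4.9, 6.1], order=0)
t, df = pooled_t_test(fbt_md, md_fbt)
```

The p-value would then follow from the t distribution with `df` degrees of freedom, which a statistics package can supply.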

When looking at accuracy, the participants previously excluded due to slow reaction times were no longer excluded. A second independent-samples t-test confirmed that there was no significant difference in accuracy between conditions for each level of the MDG. For the zeroth-order games in the FBT-MD condition (M=.938, SD=.150) and MD-FBT condition (M=.974, SD=.096) we found t(81) = −1.387, p > .05. For the first-order games in the FBT-MD condition (M=.703, SD=.260) and MD-FBT condition (M=.756, SD=.247) we found t(85) = −.973, p > .05. For the second-order games in the FBT-MD condition (M=.568, SD=.223) and MD-FBT condition (M=.599, SD=.211) we found t(85) = −.674, p > .05, see Figure 2.6.

Figure 2.6: Second-order MDG accuracy per condition

Hence there is no significant decrease in reaction time, nor increase in accuracy on the second-order MDGs after first completing the FBT. This confirms the prediction that there would be no transfer from the FBT to the MDG. The absence of transfer from the FBT to the MDG reinforces the results found by Arslan et al. (2014).

An independent-samples t-test showed that participants performed significantly better on the second-order MDGs when using ToM (M=.645, SD=.237) than when they did not (M=.543, SD=.199); t(53.7) = −2.025, p = .048, effect-size r = .27. Equality of variance is not assumed. Moreover, an independent-samples t-test shows a significant difference between reaction times on the second-order MDGs when using ToM (M=17.17, SD=9.17) or not using ToM (M=10.02, SD=5.99); t(43.5) = −3.824, p < .001, effect-size r = .50. Equality of variance is not assumed.
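For readers who wish to check such figures, the sketch below shows how a t-test without the equal-variance assumption (Welch's test) and the effect size r can be computed from summary statistics. It is an illustration under assumptions: the thesis does not state which formula produced the effect sizes, although the reported values agree with the conventional conversion r = sqrt(t² / (t² + df)), and the group sizes in the Welch example are hypothetical. The function names are invented.

```python
from math import sqrt


def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t and Welch-Satterthwaite df from means, SDs and group sizes."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def effect_size_r(t, df):
    """Conventional conversion of a t statistic to the effect size r."""
    return sqrt(t ** 2 / (t ** 2 + df))


# Effect sizes recomputed from the reported t values and degrees of freedom.
r_accuracy = round(effect_size_r(-2.025, 53.7), 2)
r_reaction_time = round(effect_size_r(-3.824, 43.5), 2)

# Hypothetical groups of 10 with equal SDs: df reduces to n1 + n2 - 2 = 18.
t_demo, df_demo = welch_t(0.0, 1.0, 10, 1.0, 1.0, 10)
```

With unequal variances or group sizes the Welch-Satterthwaite degrees of freedom become fractional, which is why values such as 53.7 and 43.5 appear above.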

For the two following analyses, participants who failed any of the control questions of the FBT were excluded. Accuracy on the FBT was analyzed with an independent-samples t-test that compared accuracy on the FBT in the FBT-MD condition (M=.889, SD=.318) and the MD-FBT condition (M=1.00, SD=.000), see Figure 2.7. The difference in accuracy with respect to condition was significant: t(44) = −2.345, p = .024, effect-size r = .33. For this test, equality of variance was not assumed. However, since there was a lack of variance in one condition, mediating effects from for instance mental fitness or gender could not be investigated.

An independent-samples t-test showed no significant difference in reaction times on the FBT in the FBT-MD condition (M=5.848, SD=2.268) and the MD-FBT condition (M=5.968, SD=2.227); t(72) = −.225, p > .05, see Figure 2.8. Participants with a reaction time of over 15 seconds were excluded in this analysis. These results partially conflict with the hypothesis that transfer would be found from the MDG to the FBT. In fact reaction times were not significantly different, whereas they were predicted to be lower after playing the MDG. Yet accuracy was significantly better in the MD-FBT condition than in the FBT-MD condition, which confirms the hypothesis that participants would perform better on the FBT after completing a series of MDGs. Due to a lack of variance in accuracy it was impossible to see whether this significant difference was influenced by effects of mental fitness, age or gender.

An additional effect was found in analysis of the data. An independent-samples t-test showed that men (M=.646, SD=.237) performed better than women (M=.531, SD=.192) on the second-order MDGs, see Figure 2.9; t(51) = 2.257, p = .028, effect size r = .30. Equality of variance was not assumed.

2.3 Discussion

The results show support for the hypothesis that higher-order ToM is a domain general process. Moreover, they provide evidence that ToM is employed by players of the MDG. Lastly, results are partially consistent with the predictions of the cognitive model PRIM.

No transfer was found from the FBT to the MDG. Figure 2.6 shows that there is no significant difference in accuracy on the MDG dependent on condition. Moreover, there was no significant difference in reaction time between conditions (see Figure 2.5). The similarity of conditions is consistent with the predictions made by the PRIM model and hence confirms hypothesis 2.

Secondly, results showed that participants performed significantly more accurately on the FBT if they were placed in the condition where they first completed the MDG (see Figure 2.7). This means that training on the MDG improved accuracy on the FBT. This supports hypotheses 1 and 3. This transfer shows that there is a common element to the MDG and the FBT. This element might be second-order ToM. Hence presence of transfer is consistent with higher-order ToM being used across different domains, which supports hypothesis 1. Moreover, it shows that higher-order ToM is possibly employed by players of the MDG. Unfortunately it was impossible to verify mediating effects such as mental fitness because of a lack of variance in the accuracy on the FBT.

Surprisingly, the transfer apparent in accuracy on the FBT does not manifest as a decrease in reaction time (see Figure 2.8). This is inconsistent with hypothesis 2. The PRIM model predicted a decrease in reaction time on the FBT after training with the MDG. Wierda and Arslan (2014) did mention that the transfer effect predicted might be inflated due to learning effects. Since this experiment only included one FBT, such learning effects could not occur.

Also, participants who provided strategy descriptions containing statements expressive of ToM performed better than their counterparts. This is consistent with the strategy descriptions provided in the first chapter. Those descriptions show that it is highly unlikely to perform above chance-level if ToM is not used. The only strategy that performs well without extensive learning, but does not use ToM is Backward Induction. Yet we have seen that this strategy is definitely not preferred by inexperienced players. Hence it is expected that the use of ToM

conditions all participants answered correctly on the FBT, no connections with other variables could be investigated. Statistical analysis requires variance in the data in order to be fruitful. Hence the level of the FBT will be elevated to third order. This will create more variation, whilst still measuring higher-order ToM.

Moreover, analysis of the strategy specifications provided by the participants did not include a distinction between first- and second-order ToM. In order to be able to properly assess the predictions made by the hypothesis that higher order ToM is domain general, this distinction is necessary. Hence in the next experiment this distinction will be made. Lastly, the fact that there was training in the MDG, but not in the FBT could have masked transfer effects. Hence in the next experiment an additional first-order FBT will be administered to provide training. Moreover, the training in the MDG will be diminished by only presenting half of the zeroth- and first-order games to match the short training stage of the FBT. The details of all the changes in the experiment and the rationale behind these changes will be provided in the introduction of the next chapter.

Chapter 3

Experiment 2

3.1 Introduction

To extend our first experiment, several of the assumptions made in Wierda and Arslan (2014) should be considered. Firstly, since the authors believe that the difference in transfer each way is caused by the complexity of working memory strategies, a working memory test will be added. Differences in working memory limitations may mask transfer effects.

There are different theories about what working memory is. One theory states that it is about executive function, another about primary and secondary memory, and yet another that it is about binding. The executive function theory states that working memory capacity is in fact about “using attention to maintain or suppress information” (Engle, 2002, p.20). This theory proposes that working memory capacity is not necessarily about storage, but about maintaining attention in the face of distraction. Simply put, it is about being able to focus.

The second theory states that working memory capacity consists of two components, namely active maintenance of information (in primary memory) and controlled retrieval from more long-term storage (secondary memory) (Unsworth

task, a recall n-back task and an updating task. The study showed that all tasks are very closely related and that the interrelations of the different tasks were most consistent with the binding hypothesis. A binding task will be used in this experiment. Given that this task requires remembering arbitrary bindings in serial order, it fits the PRIM models of both the MDG and the FBT. In the FBTs the model stores story facts linked with who knows which information at what time in working memory. In the MDG the model stores payoff information linked to the players in working memory. Hence the models both require maintenance of temporary bindings in working memory. In the binding task, participants are presented with pairs of words and numbers, which they are asked to remember. They are then presented with either a number or a word and asked what corresponded to it. They get a choice between several of the used words or numbers. The time they receive for memorizing is limited. Their time for answering is not.

In order to prevent the ceiling effects encountered in the first experiment, the order of the FBT will be increased to third-order. Moreover to match the training phases for both tasks a first-order FBT will be presented first. It should be noted that increasing the complexity of the FBT does not change the predictions of the model. The only difference is that there needs to be another level of recursion.

The PRIM model of the FBT functions as follows (Arslan, Taatgen, & Verbrugge, 2013, p. 4):

1. Retrieve a story fact that has an action verb in its slots.

2. Check the time slot of the retrieved story fact and if it is not the latest fact, request the latest one.

3. Request a retrieval of one of the reasoning levels from declarative memory.

4. If the kth-order reasoning (0 < k ≤ n) is retrieved, determine whose knowledge the question is about and give the answer by reasoning as if that person employs (k−1)th-order reasoning. Based on the success and/or feedback, the model will strengthen successful strategy chunks, or will add or strengthen an alternative strategy if the current one failed.
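The recursive character of step 4 can be made concrete with a small sketch. This is not the PRIM model, which works through retrieval from declarative memory and learns from feedback; it only illustrates how a kth-order question can be answered by handing a (k−1)th-order question to the person the question is about. The data structure, the names and the example story are hypothetical, and the sketch makes the simplifying assumption that everyone who observes a story fact is aware of the other observers of that fact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoryFact:
    time: int             # position of the fact in the story
    location: str         # where the object is according to this fact
    observers: frozenset  # characters who witnessed or were told this fact


def answer(chain, facts):
    """Answer 'where does chain[0] think that chain[1] thinks ... the object is?'.

    An empty chain is the zero-order question: report the latest fact.
    Otherwise restrict the story to what the first person knows and let
    that person answer the remaining, lower-order question.
    """
    if not facts:
        return None
    if not chain:
        return max(facts, key=lambda fact: fact.time).location
    agent, rest = chain[0], chain[1:]
    known = [fact for fact in facts if agent in fact.observers]
    return answer(rest, known)


# Hypothetical story: both characters see the object at the park; Anna alone
# sees it move to the church; Ben is told about the move while Anna is absent.
story = [
    StoryFact(1, "park", frozenset({"Anna", "Ben"})),
    StoryFact(2, "church", frozenset({"Anna"})),
    StoryFact(3, "church", frozenset({"Ben"})),
]
zero_order = answer([], story)
first_order = answer(["Anna"], story)
second_order = answer(["Anna", "Ben"], story)
```

In this story the zero-order and first-order answers are the church, while the second-order answer is the park: Anna does not know that Ben was informed, so she expects him to hold the outdated belief.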

The change would be that the model learns to retrieve third-order reasoning at step 3. This means that using a third-order FBT increases the weight of the third-order reasoning chunk in declarative memory. However, this does not affect the weight of the second-order reasoning chunk in declarative memory. Hence it does not affect the predictions made by the PRIM model.

3.2 Methods

3.2.1 Participants

Participants were recruited through MTurk and received payment of 3 USD for completing the survey. A total of 25 out of 123 participants were excluded from completing the survey for failing either the color check or a comprehension question. These participants did not receive payment. Amongst the remaining 98 participants, age varied from 21 to 70 (M=33.49, SD=9.82). The sample consisted of 60 men and 38 women. The MTurk workers had an overall approval rating of at least 95% and were US residents.

3.2.2 Materials

Accuracy on the third-order FBT is a dependent variable measured similarly to accuracy on the second-order FBT in the first experiment. However, there are two third-order test questions that are averaged. Reaction time is also a dependent variable measured and averaged for both questions in a similar fashion to the first experiment. The data collected from the MDG is analyzed in the same manner as in the first experiment. The difference is that in this case there were only 2 instances of zeroth-order games and 4 instances of first-order games. Moreover, the participants are explicitly told that the computer follows a rational strategy in the instructions.

The strategy specification question is evaluated as being zeroth-, first-, or second-order. The answers are classified by the author. The classification criteria used in the previous experiment now hold for first-order ToM. In order to be classified as second-order ToM, the description should mention the computer’s account of the participant’s actions in addition to the criteria for first-order ToM. Again, this measure of ToM is an independent variable.

Lastly, instead of mental fitness and fatigue, we measure working memory capacity. Working memory capacity is measured using a binding task. For each of the nine trials, the answers are averaged to give a trial score, indicating what percentage was answered correctly. These trial scores are then averaged to create a general measure of working memory capacity. It will function as an independent variable to see if there is any relation between working memory capacity and either of the tasks.
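The two-step averaging described above can be written out as a short sketch. The function name and the input format (one list of correct/incorrect answers per trial) are assumptions made for the example, not the format of the collected data.

```python
def working_memory_score(trials):
    """Average the per-trial proportions of correct answers.

    `trials` holds one list per binding-task trial; each inner list
    contains True for a correct answer and False for an incorrect one.
    """
    trial_scores = [sum(answers) / len(answers) for answers in trials]
    return sum(trial_scores) / len(trial_scores)


# Hypothetical participant: one 2-item trial and one 3-item trial.
example_score = working_memory_score([[True, False], [True, True, True]])
```

Note that averaging trial scores, rather than pooling all answers, gives every trial the same weight regardless of how many items it contains. In the example the pooled proportion would be 4 out of 5, whereas the trial-based score is .75.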

The experimental design is similar to the design of the first experiment, with a working memory capacity test added at the end of the experiment (see Figure 3.1). The relation between the experimental design and the hypotheses of this

Figure 3.1: Temporal flow of the experimental conditions

3.2.3 Procedure

The procedure was generally the same as in the first experiment (see Figure 3.1). The comprehension questions and color/darkness check excluded 20.3% of the participants. The sequence of MDGs now consisted of two zeroth-order games, four first-order games, and eight second-order games. Instead of one second-order FBT, participants were presented with a first-order and a third-order FBT. The stories were accompanied by pictures to facilitate remembrance of the story facts. The first-order FBT was modeled after the task created by Hollebrandse, Van Hout and Hendriks (2014). The third-order FBT was adapted from Valle et al. (2015). All materials can be found in Appendix B.

After completing the series of MDGs and the two FBTs, participants had to complete a binding task. This task consisted of one 2-item trial, two 3-item trials, three 4-item trials, two 5-item trials and one 6-item trial. Each trial consisted of several pairs of numbers and words. Each pair was presented for 2 seconds, with an interstimulus interval of 1 second. Next the participants were presented either with a number or a word and had to pick which word or number corresponded to it. There was no time constraint on recall. The words and numbers were always chosen from the same pool, namely

peanut, glasses, tree, lunch, material, snake and

48, 44, 23, 75, 53, 19.

These words and numbers were randomly chosen. Each pairing can be found in Appendix B.

3.3 Results

Data was collected on the 31st of June. An independent-samples t-test showed that there was no significant difference in reaction time between conditions for each level of the MDG. Results were filtered on reaction time. Reaction times of over 10 seconds in the zeroth-order games, over 20 seconds in the first-order games and over 30 seconds in the second-order games were excluded. For the zeroth-order games in the FBT-MD condition (M=4.187, SD=1.683) and MD-FBT condition (M=4.585, SD=1.759) we found t(90) = 1.108, p > .05. For the first-order games in the FBT-MD condition (M=6.369, SD=3.255) and MD-FBT condition (M=7.235, SD=3.353) we found t(90) = 1.254, p > .05. For the second-order games in the FBT-MD condition (M=5.038, SD=3.061) and MD-FBT condition (M=6.009, SD=2.768) we found t(90) = 1.583, p > .05, see Figure 3.2.

A second independent-samples t-test confirmed that there was no significant difference in accuracy between conditions for each level of the MDG. For the zeroth-order games in the FBT-MD condition (M=.935, SD=.195) and MD-FBT condition (M=.939, SD=.195) we found t(101) = .093, p > .05. For the first-order games in the FBT-MD condition (M=.523, SD=.239) and MD-FBT condition (M=.505, SD=.207) we found t(101) = −.407, p > .05. For the second-order games in the FBT-MD condition (M=.465, SD=.134) and MD-FBT condition (M=.452, SD=.148) we found t(101) = −.494, p > .05, see Figure 3.3.

Figure 3.3: Mean accuracy on the second-order MDGs per condition

racy on the second-order MDGs after first completing the FBT. This confirms the prediction that there would be no transfer from the FBT to the MDG. The absence of transfer from the FBT to the marble drop game reinforces the results found by Arslan et al. (2014).

Accuracy on the third-order FBT was analyzed with an independent-samples t-test that compared accuracy on the FBT in the FBT-MD condition (M=.196, SD=.326) and the MD-FBT condition (M=.296, SD=.394), see Figure 3.5. The difference in accuracy with respect to condition was not significant: t(93.4) = 1.398, p > .05. For this test, equality of variance was not assumed.

Another independent-samples t-test showed no significant difference in reaction times on the FBT in the FBT-MD condition (M=16.530, SD=11.899) and the MD-FBT condition (M=16.933, SD=12.764), see Figure 3.4; t(100) = .165, p > .05. Participants with an average reaction time of over 75 seconds were excluded in this analysis. These results conflict with the prediction that transfer would be found from the MDG to the FBT. In fact reaction times were not significantly different, whereas they were predicted to be lower after playing the MDG. Moreover, there is no significant change in accuracy between conditions.

We found no significant relationship between working memory capacity and accuracy on the MDG (see Figure 3.6), r = .09, p > .05. Nor was there a relationship between working memory capacity and accuracy on the FBT (see Figure 3.7), F(61, 36) = 1.23, p > .05.

Figure 3.7: Mean working memory capacity divided by accuracy on the third-order FBT

Next, an independent-samples t-test shows that participants who displayed markers of first-order ToM (M=7.244, SD=3.471) in their strategy description had significantly longer reaction times on the second-order MDGs than those who did not display any markers (M=4.844, SD=2.480); t(28.2) = −3.007, p = .005, effect-size r = .49. Equality of variance was not assumed. It was not possible to include participants displaying markers of higher-order ToM in this analysis because there were only two such participants.

Lastly, the reaction times on the MDGs were similar for all the different orders, as can be seen in Figure 3.8. This is odd, since the computational complexity of the games increases with the order. However, the complexity would only increase minimally in case participants are using myopic strategies. Hence, a possible explanation for this phenomenon is that only few players were using ToM.

3.4 Discussion

The results of the second experiment show little support for the hypothesis that higher-order ToM is a domain general process, since they provide only marginal evidence that higher-order ToM is employed by players of the MDG. Out of 103 players of the MDG, the strategy descriptions of only two participants were indicative of higher-order ToM. Moreover, the transfer apparent in the first experiment did not appear in the second. Lastly, results are partially consistent with the predictions of the cognitive model PRIM.

No transfer was found from the FBT to the MDG. Figure 3.3 shows that there is no significant difference in accuracy on the MDG dependent on condition. Moreover, there was no significant difference in reaction time between conditions (see Figure 3.2). Reaction times are displayed in seconds in each figure. The similarity of conditions is consistent with the predictions made by the PRIM model and hence corroborates hypothesis 2.

Secondly, results showed no significant transfer from the MDG to the FBT. There was no significant difference in accuracy (see Figure 3.5) nor in reaction times (see Figure 3.4) on the third-order FBT. This contradicts hypotheses 1 and 3. Yet in both cases, especially with accuracy, there seems to be a trend towards transfer, although it does not reach significance. Possible reasons for the lack of significance are given in the general discussion. One explanation is that the lack of significant transfer might be due to low usage of ToM in the MDG. If that is the case, it means that higher-order ToM is not being used in the MDG. This would in turn undermine the hypothesis that higher-order ToM is a domain general cognitive capacity, since ToM would not be used across different cognitive domains. The lack of decrease in reaction times is inconsistent with the predictions made by the PRIM model. Therefore it does not confirm the hypothesis that the PRIM model is an accurate model of cognitive transfer.

Also, participants who provided strategy descriptions containing statements expressive of first-order ToM did not perform better than participants who did not employ ToM. Both types of players performed at chance level.

The theory that describes higher-order ToM as a domain general mechanism predicts that reaction times will increase with the order of ToM. Results show that participants whose strategy was classified as using first-order ToM indeed had longer reaction times on the second-order MDGs than those who did not employ ToM. This result is consistent with the hypothesis that ToM is a domain general cognitive capacity. It could not be determined whether this increase held for users of higher-order ToM, since only two participants displayed markers of higher-order ToM in their strategy description.

There was no relationship between working memory capacity and accuracy on the second-order MDGs, as can be seen in Figure 3.6. Nor was there a relationship between working memory capacity and accuracy on the third-order FBT (see Figure 3.7). This is consistent with the idea that transfer between the two tasks is not mediated by working memory capacity. Since no relation was found, we did not consider working memory as a covariate in the previous analyses.


Chapter 4

General discussion

The results of both experiments are consistent with the hypothesis that higher-order ToM is a domain general cognitive capacity. In the first experiment, accuracy on the second-order FBT increased after playing a series of MDGs. This corroborates that there is a common element to both tasks, which is consistent with all hypotheses. The presence of transfer from the MDG to the FBT is consistent with the predictions of the PRIM model, albeit in the form of accuracy and not reaction time. The indication of a common element between the MDG and the FBT is also consistent with higher-order ToM being a domain general cognitive capacity, assuming that this common element is higher-order ToM. This assumption will be addressed later in the discussion. Under that assumption, the common element shows that higher-order ToM is being used in the MDG. Hence it is employed across domains, which is consistent with higher-order ToM being domain general and inconsistent with higher-order ToM being domain specific. Hence it strengthens hypothesis 1.

Unfortunately, this transfer effect was not reproduced in the second experiment. The accuracy on the second-order MDGs was at chance level and the provided strategy descriptions rarely indicated use of higher-order ToM. This may have resulted in a decrease of transfer, which explains why the transfer effect is no longer apparent. The results do still show a trend towards transfer


increased reaction times on higher-order MDGs corroborate the hypothesis that higher-order ToM is a domain general cognitive capacity.

Thirdly, the lack of transfer (both in accuracy and reaction times) found from the FBT to the MDG in both experiments is consistent with the second hypothesis as well as previous research. The PRIM model of the MDG developed by Wierda and Arslan (2014) predicted that there is no significant transfer from the FBT to the MDG. This was already corroborated by Arslan et al. (2014), and is now also corroborated by the results found in this thesis. The study by Arslan et al. (2014) provided much more training on the FBT: they presented children with 6 different stories before presenting them with the MDG. Hence they would have been even more likely to find transfer effects. Nevertheless, the complete lack of any trend towards transfer in this thesis is a valuable confirmation of this null result.

Lastly, there was one result inconsistent with our hypotheses and predictions. The PRIM model predicts that reaction times on the FBT would decrease after playing the MDG. However, this prediction was not verified by the results in either experiment. There was no significant difference between reaction times on the second-order FBT after playing the MDG and before playing the MDG. The same holds true for the third-order FBT. Therefore this result does not corroborate the hypothesis that the predictions made by PRIM theory are correct. It is important to note that the lack of transfer in reaction times might be due to experimental flaws rather than to a genuine absence of transfer.

4.1 Critical analysis

In the following paragraphs, the possible shortcomings of the experiments will be discussed. Let us begin by addressing the general validity of online survey research. Since the internet is a far less controlled setting than a research lab, one has to be careful with how such research is conducted. One concern was the attentiveness of the participants. However, research shows that MTurk workers are often more attentive than lab participants (Hauser & Schwarz, 2016). Additionally, participants are more engaged in tasks that involve the discovery of a rule, such as the MDG (Crump, McDonnell, & Gureckis, 2013).

Moreover, many online experiments have been replicated in the lab to show that results from online research are reliable (Berinsky, Huber, & Lenz, 2012; Crump et al., 2013). What these papers do show is that the experimenter has to make sure that task comprehension is sufficient by including comprehension questions. Since such questions were included in the current research, reliability should not be a problem. Moreover, in the current research, participants had to complete a color check to ensure that the color differences relevant to the MDG were perceived sufficiently well. Given that there were many control questions throughout the surveys and that subjects were automatically excluded after answering incorrectly, it is highly unlikely that the sample was contaminated by participants answering randomly.
