Beliefs dynamics in psychological games : a learning perspective

Academic year: 2021


Beliefs dynamics in psychological games

A learning perspective

Enrico Mattia Salonia 11793813

University of Amsterdam

Thesis for Msc Economics: Track in Behavioral Economics & Game Theory

UVA FEB MSc ECO 15 ECTS

Final version 15/08/2018

Supervisor: Dr. Aljaz Ule


Statement of Originality

This document is written by Enrico Mattia Salonia who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

A day will come when we will be able to deduce the laws of the social science from the principles of psychology.

Pareto, 1971

Abstract

Nowadays there is no risk in saying that one of the main aims of human behaviour is psychological reward. Economic theory, in its first and massive development over the past century, failed to take such rewards into account and opted to focus on material and concrete satisfaction, often for the sake of simplicity in its analysis. Psychological games are an innovative framework for developing arguments that consider emotion-driven behaviour. Although there is ample experimental evidence that a causal relation between feelings and behaviour exists, explanations have always relied on superficial psychological insights and have never been included in comprehensive models. This thesis aims at placing belief-dependent motivations in the debate on the psychological foundation of the social sciences and at introducing a learning perspective.


Contents

1 Emotions in game theory
1.1 Introduction and agenda
1.2 Motivation
1.3 Informal introduction to psychological games
1.4 Beliefs, emotions and actions
1.5 Experimental evidence

2 An introduction to the formal analysis
2.1 The basic model
2.2 Psychological games in the literature
2.3 An example of guilt game

3 Learning dynamics
3.1 Learning in psychological games
3.2 The Psychological Rule
3.3 Beliefs Dynamics in the guilt game
3.4 Emotions and cooperation in a prisoners' dilemma

4 Conclusions

References


Chapter 1

Emotions in game theory

1.1 Introduction and agenda

This thesis aims at presenting recent literature about what are called "psychological games", games in which players' belief hierarchies enter their utility function, and suggesting how the formal framework can be extended through a learning model.

In this first chapter, the introduction and motivation of the thesis are provided, and psychological games are placed in the scientific debate about emotion-driven behaviour as opposed to rational normative behaviour. Next, we present a brief literature review of the main experimental findings. In the second chapter, an introduction to the formal model is given, together with one example. In the third chapter, a learning perspective is suggested, and an update rule, which we refer to as the "psychological rule", for finding stable equilibria in infinitely repeated psychological games is proposed. In conclusion, we examine the primary results and provide some suggestions on how the learning perspective can be explored further.

1.2 Motivation

Colman (2003) suggests that "orthodox conceptions of rationality are evidently internally deficient and inadequate for explaining human interaction. Psychological game theory, based on nonstandard assumptions, is required to solve these problems, and some suggestions along these lines have already been put forward. [...] Such motivations and others, in various combinations, can add many layers of complexity to a game-theoretic analysis of the payoff".

Psychological games have been criticised for their analytical complexity, as underlined by Colman (2003) and Battigalli & Dufwenberg (2009), which could take the theory away from what actually happens. This thesis aims at suggesting possible steps toward the simplification of the analysis by introducing learning into the theory.

There is an extensive literature justifying and motivating the psychological game theory approach.¹ The main argument in support of research in this field is that traditional game theory is not a rich enough toolbox to adequately describe many psychological and social aspects of motivation and behaviour, among which we could recall emotional care for the intentions, opinions and emotions of others. Several intriguing examples involving various emotions can be found in Battigalli & Siniscalchi (1999) and Geanakoplos et al. (1989); some of them will be presented in this thesis. To go further in the motivation for this approach, we propose a brief excursion into the foundations of economic theory.

We may identify at the base of the "material" foundation of economics an implicit assumption of consequentialism. According to consequentialism, "[...] since Locke, pleasure was taken to be some sort of internal impression. [...] Pleasure cannot be an internal impression, for no internal impression could have the consequences of pleasure" (Anscombe, 1958).

In classical game theory, players' utility depends only on the consequence determined by action profiles through the outcome function (see Definition 1 in Section 2.1 for a formal statement). Although economic utility may be considered a subjective and psychological value, and therefore difficult to measure, economics has used monetary, and in general measurable, amounts as a proxy for utility. Therefore, an economist who agrees with consequentialism would assume that any evaluation of any social fact must be based exclusively on the concrete and observable outcome, and not on any other parameter, such as intentions. The direct consequence of assuming consequentialism in economic analysis is that what should matter is the "outcome" (see Definition 1), which is indeed the most common word used to identify the result of the unfolding of players' moves in classical game theory. However, considerable empirical evidence in the field of experimental economics should convince social scientists that, sometimes, intentions and feelings are a potential source of explanation of behaviour, aside from the visible and concrete outcome.

¹ The interested reader should consult, among others, Attanasi & Nagel (2008), Battigalli & Dufwenberg (2009), Geanakoplos et al. (1989), Neicu (2012).

As an example of findings which can justify this position, we could recall those about reciprocity (Nowak & Sigmund, 1998), but also the evidence of a tendency to cooperate even when cooperation is a dominated action. Indeed, one of the most intriguing questions economists have tried to answer in the past decades is why people cooperate (Axelrod & Hamilton, 1981; Colman, 2003). There are many ways to answer this question, such as introducing repetition or biological relatedness (Van Veelen et al., 2012), but psychological explanations could certainly fit experimental results. Moreover, it has been argued that "positive emotions could also sustain cooperation" (Greene, 2014). Feelings could indeed be biological signals that drive behaviour toward the most evolutionarily sustainable choice and make people better off. To consider the concepts mentioned above in economic theory, mere philosophical consequentialism should be abandoned, as is done in psychological games.

Robust and replicated empirical findings of the past years should lead us to think that agents' intentions and feelings should be included in economic models, as they are a significant source of explanations of human actions. Moreover, the significant and shared evidence that emotion and cognition are not two different systems that work separately (Damasio, 1994) should be a basis for developing models in which both psychological and exclusively rational motivations are taken into account. Therefore, the foundation of social science must be enlarged to consider these multiple aspects and deliver more comprehensive explanations.

1.3 Informal introduction to psychological games

The growing literature on psychological games is significant evidence that the view expressed in the previous section is shared, as these games provide formal frameworks to reason rigorously about how emotional and non-material rewards shape behaviour and interactions among agents. While the term "psychological game" was probably first used by Emile Borel (Fréchet, 1953) in one of his French publications at the beginning of the twentieth century, Geanakoplos et al. (1989) developed the first explicit formal model. This model lays its foundations on Harsanyi's development of Bayesian games, as well as on other results on the mathematical object called "hierarchies of beliefs" that were obtained in that period (Brandenburger & Dekel, 1993; Mertens & Zamir, 1985).

It must be noticed that the tools of classical game theory may still be suitable for analysing specific effects of feelings and emotions on behaviour. The famous Fehr & Schmidt (1999) model, as an example, can indeed be placed within the traditional framework and adequately addresses the issue of preferences about the distribution of resources. In this case, the increase in utility is given exclusively by a psychological and non-material effect, since in this model less inequality in the distribution of resources does not bring any material advantage to agents. However, if intentions, for example, and not only outcomes matter, we need to use psychological game theory, which takes into account what the literature calls "belief-dependent motivation". It has been accepted in some branches of economics and game theory that belief-dependent motivations can express feelings (Attanasi & Nagel, 2006; Battigalli & Dufwenberg, 2009; Geanakoplos et al., 1989; Perea, 2012). One of the most famous examples is the model by Rabin (1993), a particular case of a psychological game, which incorporates intentions and not just reciprocity itself.

Psychological games are also an excellent tool to see how emotions interact with other variables, since they are certainly not the only human feature that plays a role in defining behaviour. Within the formal framework of psychological games, standard results of economic theory could be replicated under assumptions that adhere better to reality, but some results can also turn out to be false in the light of the interaction between psychological and material motivations. This makes the model a good step forward in investigating the interaction between these two features of human beings.

In the next section, the relation between beliefs and behaviour is examined to better understand why and how psychological games work.

1.4 Beliefs, emotions and actions

Elster (1998) wrote a prolific survey about the impact of emotions in economic theory. In his review, the author argues that even if economics has referred to some particular emotions like envy or guilt, these concepts are only superficially related to what emotion theory considers proper definitions. This difference exists because research in the latter field has not focused on how emotions affect behaviour, an aspect that is instead of primary interest in the extensive and recent economic literature. Communication between economics and emotion theory has therefore been tricky, since the word "emotion" meant different concepts in these two disciplines. Moreover, economists only recently began to consider how the study of emotions can help to understand behaviour, a fact that can explain the lack of this perspective within economics in the past.

For our purpose, it is important to underline that emotions differ from other human sensations like hunger or sleepiness in that expectations and beliefs trigger them. This difference is one of the reasons why psychological game theory, which incorporates emotions as belief-dependent motivations, is not only an elegant formal model but also a good fit to reality. Whether these emotions are under the control of the agent is an unresolved issue. Certainly people can partly choose the expression of their psychological state, for example by crying or screaming to express anger and disappointment respectively. Numerous studies exploited the possibility that emotions can be controlled. In the particular field of psychological game theory, Kolpin (1992), whose aim was to refine psychological equilibrium concepts, assumes that a player can control beliefs.

In traditional game theory, beliefs affect behaviour through a single channel: what a player thinks the other will play is taken into account to determine his own behaviour. In psychological game theory, instead, there are two channels through which beliefs affect behaviour: the traditional impact, which makes some strategies more compelling than others under certain beliefs, and the psychological impact, that is, the fact that payoffs depend on beliefs. Both in economics and in psychology the relationships between beliefs and actions have been widely studied, even if not all of this literature refers directly to psychological games.

A theoretical assumption generally confirmed by experiments is that first-order beliefs regarding others' actions are correlated with an individual's behaviour in a strategic environment (Attanasi & Nagel, 2006; Charness & Dufwenberg, 2006; Dufwenberg & Gneezy, 2000). As an example, in dilemmas that introduce a social trade-off, this correlation could be explained by the fact that if someone expects the other to cooperate, he will be more willing to cooperate. Regarding this possible correlation, there are two competing causal theories to account for experimental results:


Reaction Theory, mainly coming from economists, suggests that beliefs cause actions (Attanasi & Nagel, 2008).

Projection Theory, mainly coming from psychologists, argues the contrary, that actions cause beliefs (Attanasi & Nagel, 2008).

It seems that the "reaction effect" is the stronger one when these two theories are tested (Attanasi & Nagel, 2006), but there is still not much evidence in the literature to support this possibility. Both views constitute a valuable framework for reasoning about the games that we will consider. Since beliefs have multiple impacts on behaviour in psychological games, it is useful to understand which of these two theories better explains how interaction with multiple psychological components unfolds. Hence, in a later section references to these two theories will be made to describe precisely the dynamic relationship between actions and beliefs. As we will see, it is reasonable to consider the two effects as interdependent.

In the next section, some critical facts about experimental evidence that will form our intuition on the theoretical model are presented.

1.5 Experimental evidence

Some recent experimental findings that directly relate to psychological games can help justify and explain some main features of the formal model we will develop. Since there are infinitely many intuitive ways in which a psychological game can be described and played, basing the analysis on stylized empirical facts is a good starting point. The primary experimental references for this section are the review and the experiment conducted by Attanasi & Nagel. They made their subjects play the following trust game repeatedly:

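Player one chooses between D and C (payoffs are listed as player one, player two):
    D: the game ends with payoffs (1, 1);
    C: player two chooses between T and S:
        T: payoffs (0, 4);
        S: payoffs (2, 2).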

The payoffs are indicated at the final histories of the game. Player one can play C = Continue or D = Dissolve, while player two can play T = Take or S = Share.
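As a benchmark for what follows, a minimal sketch of how the trust game above is solved when players care only about material payoffs. The dictionary layout and the function name are illustrative assumptions, not part of the original experiment.

```python
# Trust game from the text: payoffs are (player one, player two).
TERMINAL_PAYOFFS = {
    ("D",): (1, 1),       # player one dissolves
    ("C", "T"): (0, 4),   # player one continues, player two takes
    ("C", "S"): (2, 2),   # player one continues, player two shares
}


def backward_induction():
    """Solve the game assuming only material payoffs matter (no emotions)."""
    # Player two moves last and picks the action with the higher own payoff.
    best_reply_two = max(("T", "S"), key=lambda a: TERMINAL_PAYOFFS[("C", a)][1])
    # Player one anticipates that reply when comparing Continue with Dissolve.
    value_of_continue = TERMINAL_PAYOFFS[("C", best_reply_two)][0]
    value_of_dissolve = TERMINAL_PAYOFFS[("D",)][0]
    choice_one = "C" if value_of_continue > value_of_dissolve else "D"
    return choice_one, best_reply_two


if __name__ == "__main__":
    print(backward_induction())
```

Under these purely material payoffs player two would take and player one would therefore dissolve, which is the prediction that belief-dependent motivations are meant to qualify.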


The main aim of Attanasi & Nagel in these papers was to test specific hypotheses to guide the development of psychological game theory models. We report here some of their results, on which we will base our analysis in the next sections.

Hypothesis 1. (First-order belief monotonicity) In a repeated game of trust, if at a certain stage $t$ of the repeated game player one expects that player two chooses Share with probability $\alpha_t$, and two effectively chooses Share (Take), then $\alpha_{t+1} \ge \alpha_t$ ($\alpha_{t+1} \le \alpha_t$), i.e. next period one's guess that Share will be chosen again does not decrease (increase) if this action is (is not) chosen in the period in play.

Hypothesis 2. (Second-order belief monotonicity) If at a certain stage $t$ of the repeated game of trust player two states that player one expects she will choose Share with probability $\beta_t$, and if (Continue, Share) is the action profile of the period in play, then $\beta_{t+1} \ge \beta_t$, and if (Dissolve, Take) is the action profile, then $\beta_{t+1} \le \beta_t$, i.e. next period two's statement of one's guess that Share will be chosen again does not decrease (increase) if the first (second) profile is played in the period in play. In other words:

H2a: next period two's statement of one's guess that Share will be chosen does not decrease if (Continue, Share) is the action profile of the period in play;

H2b: next period two's statement of one's guess that Share will be chosen does not increase if (Dissolve, Take) is the action profile of the period in play.

As the authors underline, these two hypotheses are a direct test of Projection Theory: if they were true, behaviour would cause expectations. The experimental evidence allows us to accept Hypothesis 1 in every repetition of the repeated game except the last one, since there the incentive to choose Take is strong for player two and consequently player one is more likely to choose Dissolve. The results also allow us to accept H2b in all periods of the repeated game, while H2a is accepted in every period except the last one. This result is an essential basis for the assumptions that we will consider later on. However, the authors also suggest that it is possible to divide the belief-updating process into multiple steps: one could argue that when beliefs change, the psychological payoff changes, and consequently the strategy must be based on the new payoff (Reaction Theory); nevertheless, the first change in belief is determined by the observation of someone else's action (Projection Theory).
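To make the monotonicity requirement of Hypothesis 1 concrete, the following sketch checks a sequence of stated beliefs against observed actions and reports the stages at which the hypothesis would be violated. The function name and the sample numbers are hypothetical; they are not data from Attanasi & Nagel.

```python
def first_order_monotonicity_violations(beliefs, actions):
    """Return the stages t at which Hypothesis 1 fails.

    beliefs[t] is player one's stated probability of Share at stage t.
    actions[t] is player two's observed action at stage t ("S" or "T").
    """
    violations = []
    for t in range(len(beliefs) - 1):
        if actions[t] == "S" and beliefs[t + 1] < beliefs[t]:
            # Share was observed, yet the belief in Share decreased.
            violations.append(t)
        elif actions[t] == "T" and beliefs[t + 1] > beliefs[t]:
            # Take was observed, yet the belief in Share increased.
            violations.append(t)
    return violations


if __name__ == "__main__":
    # Hypothetical session: beliefs move in the direction of the observation.
    print(first_order_monotonicity_violations([0.5, 0.6, 0.6, 0.4], ["S", "S", "T"]))
    # Hypothetical session in which both updates go the wrong way.
    print(first_order_monotonicity_violations([0.5, 0.4, 0.7], ["S", "T"]))
```

The same check applies to Hypothesis 2 if the belief sequence is replaced by player two's stated second-order beliefs and the actions by the relevant action profiles.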


To present the next relevant tested hypothesis, we must explain a specific treatment of the experiment. In the treatment called "Questionnaire Transmission Treatment" (QTT), player two had to fill in what Attanasi & Nagel call a "hypothetical payoff scheme" (HPS). In this treatment player two could state, in case he had chosen Take and consequently received 4, how much he wanted to give back to the participant paired with him, given that participant's guess of the probability of player two choosing Share. Each player two therefore had to state, for every possible guess of player one, how much he wanted to give back to him. The HPS was then transmitted to the player one with whom player two would be paired in the next repetition of the game. Therefore, the questionnaire could have been used to give a false signal to the new player one regarding player two's sensitivity and willingness to play Share. The questionnaire looked like this:

| One's possible assessment of Share | Your reimbursement |
|---|---|
| 0% | |
| 10% | |
| 20% | |
| 30% | |
| 40% | |
| 50% | |
| 60% | |
| 70% | |
| 80% | |
| 90% | |
| 100% | |

Table 1.1: Taken from Table 2 of Attanasi & Nagel (2006)

Hypothesis 3. (Signalling) Player two uses his filled HPS in QTT as a "false" signal; in other words, player two shows some sensitivity to feelings only because he knows his filled HPS will be transmitted to some randomly chosen player one.

This hypothesis is rejected, and the comments on the experiment report that only one participant (2.5% of all participants) admits that he used his HPS to give a false signal to his co-players. This result is essential, as in the model we will develop we will not consider players to be able to act in order to give a false signal. Consequently, we rule out a possible effect of "belief manipulation".

These introductory paragraphs should constitute a good informal background to face the theoretical models that we will present in the following sections. In chapter two, in fact, the primary models of psychological games will be analysed, both to give the reader a historical perspective on their development and to instruct him about the main features of the theory.


Chapter 2

An introduction to the formal analysis

2.1 The basic model

Before presenting the models of psychological games, we define here a general framework and notation from which we will depart to present the results of the literature and, in general, to work throughout the thesis.

We start with the classical definition of a finite game:

Definition 1. A finite game is a structure $G = \langle I, Y, (S_i)_{i \in I}, g, (v_i)_{i \in I} \rangle$ where: $I$ is the set of players, usually denoted by $i$ or $j$; $S_i$ is the set of actions for player $i$; $g : \times_{i \in I} S_i \to Y$ is the outcome function and $Y$ is the set of outcomes; $v_i : Y \to \mathbb{R}$ is the von Neumann-Morgenstern utility function. From the outcome function and the utility function we obtain the payoff function for every player $i$, that is $u_i = v_i \circ g : \times_{i \in I} S_i \to \mathbb{R}$.

Apart from the definition, we introduce here two minimal assumptions:

Assumption 1. Each player $i$ knows $S_i$, $S_{-i}$ and his own payoff function $u_i$.

Assumption 2. Each player is rational, in the sense that he behaves according to expected utility theory.
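The ingredients of Definition 1 can be written out for the trust game of Section 1.5 in strategic form. This is only a sketch: the outcome labels ("dissolve", "share", "take") and the variable names are our own illustrative choices.

```python
from itertools import product

# Action sets S_i for the two players of the trust game.
ACTIONS = {1: ("C", "D"), 2: ("S", "T")}


def outcome(profile):
    """Outcome function g: maps an action profile to an outcome in Y."""
    a1, a2 = profile
    if a1 == "D":
        return "dissolve"  # player two's action is irrelevant after Dissolve
    return "share" if a2 == "S" else "take"


# Utility functions v_i over outcomes, using the payoffs of the trust game.
UTILITY = {
    1: {"dissolve": 1, "share": 2, "take": 0},
    2: {"dissolve": 1, "share": 2, "take": 4},
}


def payoff(i, profile):
    """Payoff function u_i, obtained by composing v_i with g."""
    return UTILITY[i][outcome(profile)]


# Full payoff table over S = S_1 x S_2.
PAYOFF_TABLE = {
    profile: (payoff(1, profile), payoff(2, profile))
    for profile in product(ACTIONS[1], ACTIONS[2])
}

if __name__ == "__main__":
    for profile, values in PAYOFF_TABLE.items():
        print(profile, values)
```

Keeping $g$ and $v_i$ separate, rather than storing payoffs directly, mirrors the definition and makes it visible that in a classical game utilities depend on action profiles only through outcomes.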

Since players cannot peer into other players' minds and learn others' beliefs, the component of incomplete information must be taken into account. Moreover, in psychological games the problem of information is amplified by the fact that beliefs, as mentioned in the previous chapter, have multiple impacts on the outcome of the game. A rigorous analysis of uncertainty is, therefore, a necessary path to fully understand the impact of belief-dependent motivations. For this purpose, the basic structure must be enlarged.

In general, we can say that the outcome function $g$ and the utility function $v_i$ can depend on a vector of not commonly known parameters $\theta$. Moreover, $\theta$ is composed of the subvectors $\theta_i$, $i \in I$, such that each player $i$ is the only one who knows the true value of $\theta_i$. At this point we can say that $\theta = (\theta_i)_{i \in I}$ and that $\Theta$ is the set of all possible values of $\theta$, with $\Theta_i$ the set of possible values of $\theta_i$. This last definition is useful to characterise the payoff function under incomplete information, which is written in parametrized form as $u_i : \Theta \times Y \to \mathbb{R}$, where $S = \times_{i \in I} S_i$. Although rather pedantic and repetitive, this brief discussion will help identify the effect of beliefs in the games we will discuss, as we hope the reader will notice.

The next fundamental objects that must be defined are "beliefs" and "hierarchies of beliefs".¹ We start by assuming that the space of uncertainty from player $i$'s point of view regarding player $-i$, $X_{-i} = S_{-i} \times \Theta_{-i}$, is a compact Polish space.² Player $i$ assigns probabilities to events in the Borel sigma-algebra $\mathcal{B}$ of $X_{-i}$ according to some countably additive probability measure. The probability measures over our general and finite domain of uncertainty are defined as:

$$\Delta(X_{-i}) := \left\{ \alpha \in \mathbb{R}_+^{X_{-i}} : \sum_{x \in X_{-i}} \alpha(x) = 1 \right\}$$

Player $i$'s beliefs are described by what is called a "conditional probability system" $(X_{-i}, \mathcal{B}, \mathcal{C})$, where $\mathcal{C}$ is the set of observable events. A belief for player $i$ is a measure over the domain of uncertainty, denoted by $\alpha_i \in \Delta(X_{-i})$.

¹ A reader interested in the complete construction of hierarchies of beliefs should consult Battigalli & Siniscalchi (1999).

² The mathematical framework here is kept general for the sake of completeness; we will stick to second-order beliefs in our applications. However, in order to deal with such beliefs, the construction of the complete hierarchy must be presented.

It is crucial to notice that the beliefs of a generic player $i$ are not known by other players, so the domain of uncertainty $X_{-i}$ should account for this and become larger. In fact, if we only consider exogenous uncertainty and beliefs, and not strategies, every player $i$ is characterised by a tuple $(\theta_i, \alpha_i) \in \Theta_i \times \Delta(\Theta_{-i})$. Thus, in the case of two players $i$ and $j$, player $j$ will have to form beliefs about $(\theta_i, \alpha_i)$, and not only about $\theta_i$. These beliefs are a joint probability measure on a Cartesian product, $\beta_j \in \Delta(\Theta_i \times \Delta(\Theta_j))$, and are called "second-order beliefs". By computing the marginal distribution on $\Theta_i$ we can recover $j$'s first-order beliefs $\alpha_j \in \Delta(\Theta_{-j})$, where in this case $\Delta(\Theta_{-j}) = \Delta(\Theta_i)$. By reintroducing strategies and reasoning recursively, we can define the domains of beliefs of $k$-th order for player $i$ in a game with $n$ players, which are:

$$X_{-i}^{k} = X_{-i}^{k-1} \times \Delta(X_i)^{k-1}$$

These mathematical objects are called "hierarchies of beliefs" in the literature. These structures are extensively used in the field of epistemic game theory and become even more critical in psychological games, which were born within this discipline.

For our purpose, belief hierarchies must be "coherent". To see what this means, notice that the domain of the second-order belief includes the domain of the first-order belief. For such a structure to be coherent, we need the marginal of the second-order belief on $\Theta_{-i}$ to be exactly the first-order belief, as we indicated before. More generally, a belief hierarchy is coherent if it is a sequence $(\alpha_i^1, \alpha_i^2, \dots)$ such that for every $k$:

$$\operatorname{marg}_{\Theta_{-i}^{k-2}} \alpha_i^k = \alpha_i^{k-1}$$
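For the two-player case with finitely many types, the coherence requirement between first- and second-order beliefs can be checked directly. In this sketch a second-order belief is a dictionary over pairs (type of the opponent, opponent's first-order belief); the type labels and the probabilities are hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from math import isclose


def marginal_on_types(second_order):
    """Marginalise a belief over pairs (theta, alpha) onto the types theta."""
    marginal = defaultdict(float)
    for (theta, _alpha), prob in second_order.items():
        marginal[theta] += prob
    return dict(marginal)


def is_coherent(first_order, second_order, tol=1e-9):
    """Check that the marginal of the second-order belief is the first-order belief."""
    marginal = marginal_on_types(second_order)
    support = set(first_order) | set(marginal)
    return all(
        isclose(first_order.get(t, 0.0), marginal.get(t, 0.0), abs_tol=tol)
        for t in support
    )


if __name__ == "__main__":
    # Hypothetical second-order belief: the opponent's first-order beliefs
    # are written as tuples of probabilities so that they can be dict keys.
    second_order = {
        ("low", (0.5, 0.5)): 0.1,
        ("low", (0.9, 0.1)): 0.2,
        ("high", (0.5, 0.5)): 0.7,
    }
    print(is_coherent({"low": 0.3, "high": 0.7}, second_order))
    print(is_coherent({"low": 0.5, "high": 0.5}, second_order))
```

The comparison uses a small tolerance because sums of floating-point probabilities are rarely exact.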

The set of all belief hierarchies for player $i$ is denoted by $B_i := \times_{k \ge 0} \Delta(X_{-i}^k)$, and the set of all belief hierarchies of every player is $B := \times_{i \in I} B_i$. From now on and throughout the thesis, $\alpha_i$ denotes the first-order belief of player $i$, while $\beta_i$ denotes the second-order belief of player $i$. We will never use beliefs of higher order. The letter $\alpha$ without any subscript will instead refer to the entire hierarchy.

Since in this framework we allow for mixed strategies, we denote a mixed strategy of player $i$ by $\sigma_i \in \Delta(S_i)$. Consistently with the current literature on epistemic game theory (Dekel & Siniscalchi, 2015; Perea, 2012), the opponents' mixed strategies $\sigma_{-i}$ are interpreted as the beliefs of player $i$ over the domain of mixed strategies of all the other players $-i$. The expected utility of a specific strategy is $u_i(\sigma_i, \sigma_{-i})$. The reader should notice that, since a belief over the opponents' strategies is $\alpha_i \in \Delta(S_{-i})$, we can express expected utility from $i$'s perspective as $u_i(\sigma_i, \alpha_i)$.
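A minimal sketch of the expected utility $u_i(\sigma_i, \alpha_i)$ for a finite two-player game, using player one's material payoffs in the trust game of Section 1.5. The dictionary representation of strategies and beliefs is an assumption made for illustration.

```python
def expected_utility(payoffs, sigma_i, alpha_i):
    """Expected payoff of mixed strategy sigma_i under first-order belief alpha_i.

    payoffs maps (own action, opponent action) to a number;
    sigma_i and alpha_i map actions to probabilities.
    """
    return sum(
        p_own * p_other * payoffs[(s_own, s_other)]
        for s_own, p_own in sigma_i.items()
        for s_other, p_other in alpha_i.items()
    )


# Player one's material payoffs in the trust game.
PAYOFFS_ONE = {("C", "S"): 2, ("C", "T"): 0, ("D", "S"): 1, ("D", "T"): 1}

if __name__ == "__main__":
    belief = {"S": 0.75, "T": 0.25}  # hypothetical first-order belief
    print(expected_utility(PAYOFFS_ONE, {"C": 1.0}, belief))
    print(expected_utility(PAYOFFS_ONE, {"D": 1.0}, belief))
```

With these payoffs Continue is preferred to Dissolve whenever player one assigns probability above one half to Share, since Continue yields twice that probability and Dissolve yields 1.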


We will use the notation and the concepts introduced in this section throughout the thesis; some extensions will enlarge the model when necessary.

What follows in the next section is a brief literature review about psychological games and the main findings on formal structures in which hierarchies of beliefs enter the domain of utility functions. The literature is presented both for the sake of completeness and because it is a good starting point for the subsequent analysis.

2.2 Psychological games in the literature

In presenting the literature, we decided to follow a chronological point of view. This perspective makes the reader acquainted with the development of the theory until now and is practical also from a pedagogical standpoint.

The first paper to start from is a brief note about what are called "information-dependent games". These are "non-cooperative games in which the player's payoff varies with his or other player's information about the play of the game" (Gilboa & Schmeidler, 1988). As an example, the utility of gossip is strictly related to who else knew the same information first. The authors add to the classical game structure the set $S$ of prediction profiles, that is, the set of all possible action profiles $S = \times_{i \in I} S_i$, and include the prediction profile in the domain of the utility function, which is now $v_i : Y \times S \to \mathbb{R}$. The rest of the paper investigates whether there are "informationally consistent plays".

Even if psychology is not mentioned at all in this work by Gilboa and Schmeidler, we may interpret prediction profiles as subjective expectations of what is going to happen during the game. This framework could indeed model different psychological motivations, the most straightforward of which is "surprise". In fact, Geanakoplos et al. (1989) started from "information-dependent games" to define psychological games and investigate "sequential rationality" in them. To understand their model we have to extend ours.

The set of mixed strategies of player $i$ is denoted by $\Gamma_i := \Delta(S_i)$, and $\Gamma_{-i} := \times_{j \neq i} \Gamma_j$. The main insight of this paper is the introduction of the above-mentioned "belief-dependent motivations" in utility functions, which are now defined not only on the set of outcomes but also on the set of beliefs of player $i$: $v_i : Y \times B_i \to \mathbb{R}$.³ Following this extension we can modify the basic game definition:

Definition 2. A psychological game à la Geanakoplos et al. is a structure $G = \langle I, Y, (S_i)_{i \in I}, g, (v_i)_{i \in I} \rangle$ where $v_i : Y \times B_i \to \mathbb{R}$.

Here it is important to set the literature aside and reason about what the introduction of $B_i$ in the domain of $v_i$ means and determines.

As we noticed before, in a game with incomplete information we can derive the payoff function $u_i : \Theta \times Y \to \mathbb{R}$ from the outcome function $g$ and the utility function $v_i$. One could ask how the new structure of Geanakoplos et al. changes the payoff function. To answer, the reader should realize that before we could have written the payoff function as $u_i : \Theta \times Y \to \mathbb{R}$, but now, given the enriched domain of $v_i$, we cannot express the payoff of player $i$ like this any more, since there $g$ and $v_i$ do not consider beliefs. Even if the space of uncertainty in the payoff function $u_i$ previously also contained uncertainty about beliefs, the latter was not taken into account in the utility function. Moreover, it is important to notice that the set which enters the domain of $v_i$ is not $\Theta$: if player $i$ were uncertain about $-i$'s beliefs, which is safe to assume, he would not know his own utility function. Instead, since $B_i$ also contains beliefs about others' beliefs, player $i$ is able to use this information, together with the known outcome function $g$, to compute his expected utility, since it is not a function of someone else's beliefs, but of his belief about others' beliefs.

Players' beliefs may reflect their disagreement over the strategies that will be played. However, Geanakoplos et al. introduce the following assumption:

Assumption 3. In equilibrium, all beliefs are assumed to conform to a commonly held view of reality.

³ The reader should note that in the previous version of the model we had $v_i : Y \to \mathbb{R}$. Gilboa & Schmeidler add the set $S$, while Geanakoplos et al. include $B_i$.

If $\hat{\sigma}$ is the equilibrium profile, each player $i$ believes (with probability 1) that his opponents follow $\hat{\sigma}_{-i}$, that each opponent $j \neq i$ believes that his opponents follow $\hat{\sigma}_{-j}$, and so on. This profile of beliefs is denoted by $\psi(\hat{\sigma}) = (\psi_1(\hat{\sigma}), \dots, \psi_n(\hat{\sigma})) \in B$. The equilibrium notion defined in the light of these considerations, based on the classical concept by Nash (1951), is the following:

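Definition 3. A psychological Nash equilibrium (Geanakoplos et al., 1989) of a game $G$ is a pair $(\hat{\sigma}, \hat{\alpha}) \in \Gamma \times B$ such that:

1. $\hat{\alpha} = \psi(\hat{\sigma})$

2. $\forall i \in I$, $\forall \sigma_i \in \Gamma_i$: $u_i(\hat{\alpha}_i, (\sigma_i, \hat{\sigma}_{-i})) \le u_i(\hat{\alpha}_i, \hat{\sigma})$, where $\hat{\sigma} = (\hat{\sigma}_i, \hat{\sigma}_{-i})$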

The main idea this model aims to capture is that "Emotional reactions often depend on expectations", as Geanakoplos et al. (1989) underline. This definition of the utility function is also in accordance with the review by Elster (1998), in which he states that "by far the most common way of modeling the interaction between emotions and interests is to view the former as psychic cost or benefits that enter into the utility function on a par with satisfaction derived from material rewards".

Finally, this model was recently extended by Battigalli & Dufwenberg (2009), in the light of recent work in the field of interactive epistemology (Aumann, 1999; Battigalli & Siniscalchi, 1999). Their analysis is based on an enriched domain of utility functions, which allows for updated beliefs and others' beliefs. The model by Geanakoplos et al., in fact, only allowed the initial beliefs of player $i$ to enter $i$'s utility function, as we already mentioned above.

Following the notation of the previous section, we could say that what the authors call the "psychological payoff function" is $v_i : Y \times B \times S_{-i} \to \mathbb{R}$, where $B$ is the set of all belief hierarchies for every player. They also introduce a definition of what they call a "psychological sequential equilibrium", which is a generalisation of the concept by Kreps & Wilson (1982), and prove its existence under some assumptions. More about this work will be illustrated later on, as this is the central framework this thesis will consider.

As expressed in the first chapter, these payoff functions embody the assumption that beliefs directly determine the payoffs of the game. This assumption makes the payoffs endogenous, in the sense that behaviour at a certain point of the extensive form of a sequential game could influence expectations, and consequently the payoffs at the terminal histories of the game. This mechanism reflects the fact that psychological games abandon the assumption of consequentialism mentioned above (see Section 1.2), since the payoff is determined not only by the award of a material reward, but also by an exclusively psychological, and therefore immaterial, satisfaction. A representative case of how the analysis of psychological games can differ from the results of classical game theory is presented in the next section. This example aims to acquaint the reader with the main features and complexities of a psychological game, as the upcoming discussion will focus on an attempt to overcome these complexities.

2.3 An example of guilt game

In the field of psychological games, significant attention has been devoted to the emotion of "guilt" (Battigalli & Dufwenberg, 2007; Dufwenberg & Gneezy, 2000). To continue with the analysis, we have to enrich the model by Geanakoplos et al., following in the footsteps of Battigalli & Dufwenberg (2009). They aim to capture "dynamic psychological effects", like sequential reciprocity, that were previously ruled out by the fact that beliefs could not be updated. Since we will consider a sequential game, it is necessary to add some notation and define what a sequential equilibrium in a psychological game is.

An extensive form is a tuple $\langle I, H \rangle$ where $H$ is the finite set of feasible histories, denoted by $h = (s^1, s^2, \dots, s^\ell)$. The set of feasible actions for player $i$ at history $h$ is $S_i(h)$. The set of terminal histories is indicated with $Z$, and $\zeta(s) \in Z$ indicates the terminal history induced by $s$.

Definition 4. A psychological game à la Battigalli & Dufwenberg based on the extensive form $\langle I, H \rangle$ is a structure $G = \langle I, H, (S_i)_{i \in I}, g, (v_i)_{i \in I} \rangle$ where $v_i : Z \times B \times S_{-i} \to \mathbb{R}$.

Moreover, a behaviour strategy is a vector $\rho_i = (\rho_i(\cdot \mid h))_{h \in H \setminus Z} \in \times_{h \in H \setminus Z} \Delta(S_i(h))$⁴. It is important to stress how the authors interpret these behaviour strategies: as already mentioned in this thesis, following the standard interpretation of epistemic game theory, they exclude actual randomisation. Mixed strategies are rather used to characterise dominated actions and best replies⁵ so, in this case, $\rho_i$ can be interpreted as a vector of conditional first-order beliefs of player $-i$ regarding $i$'s behaviour.

⁴ By $A \setminus B$ we mean the relative complement of $B$ in $A$, that is $\{x \in A \mid x \notin B\}$. In this case $H \setminus Z$ represents the set of all histories except the terminal ones.

An assessment is then a pair $(\rho, \alpha) = (\rho_i, \alpha_i)_{i \in I}$ where $\rho$ is a behaviour strategy and $\alpha$ is a belief hierarchy, as defined in the previous sections. We are ready for the equilibrium definition of Battigalli & Dufwenberg (2009):

Definition 5. An assessment $(\rho, \alpha)$ is a sequential equilibrium if $\forall i \in I$, $h \in H \setminus Z$, $s_i^* \in S_i(h)$:

\[ s_i^* \in \arg\max_{s_i \in S_i(h)} \mathbb{E}_{s_i, \alpha_i}[v_i \mid h] \]

This definition implicitly states that in equilibrium players hold common and correct beliefs about each other's beliefs. The requirement is not very different from condition 1 of the psychological Nash equilibrium by Geanakoplos et al.

For the sake of completeness, we also report the existence theorem of page 17 in Battigalli & Dufwenberg (2009):

Theorem 1. If the psychological payoff functions are continuous, there exists at least one sequential equilibrium assessment.

Here we will discuss an example of a trust game with guilt illustrated in the original paper. The game is the same used by Attanasi & Nagel in their experiment.

⁵ The procedures of exclusion of dominated strategies from this point of view usually involve some kind of "rationalization operator", such as the one defined in 3.1.

Example 1. The extensive form of our guilt game is the following:

Player 1 chooses D or C.
  D: the game ends with payoffs (1, 1).
  C: player 2 chooses T or S.
    T: payoffs (0, 4).
    S: payoffs (2, 2).

Again, player one can play C = Continue or D = Dissolve, while player two can play T = Take or S = Share. The payoffs in this tree are "material payoffs", and do not represent the belief-dependent motivations of players. Indeed, this is what the authors call the "material payoff game". The corresponding psychological guilt game is the one below⁶:

Player 1 chooses D or C.
  D: the game ends with payoffs (1, 1).
  C: player 2 chooses T or S.
    T: payoffs (0, $4 - 2\theta_2^g \beta_2$).
    S: payoffs (2, 2).

where $\theta_2^g$ is the sensitivity to guilt of player two (Attanasi & Nagel, 2008). Moreover, we have to recall what $\alpha_1$ and $\beta_2$ are. The first is the belief of player one that two will play Share (S) if he plays Continue (C), that is his first-order belief $\alpha_1 = P(S \mid C)$. The second is two's belief regarding $\alpha_1$, that is his second-order belief $\beta_2 = E(\alpha_1)$. The payoff at the history reached by strategies (C, T) may seem arbitrary. Intuitively, player two receives the original material payoff, which is 4, minus the psychological suffering due to guilt, which is proportional to his belief regarding one's expectations $\beta_2$ and to his guilt sensitivity $\theta_2^g$. We decided to stick to this payoff as it is commonly used in the literature to capture this type of feeling.

The analysis of this game depends on the value of $\theta_2^g$. We have that $v_2(\zeta(S), \beta_2, C) = 2$ and $v_2(\zeta(T), \beta_2, C) = 4 - 2\theta_2^g \beta_2$⁷. In general $v_2(\zeta(S), \beta_2, C) < v_2(\zeta(T), \beta_2, C)$ if:

\[ 2 < 4 - 2\theta_2^g \beta_2 \;\Rightarrow\; 0 < 1 - \theta_2^g \beta_2 \;\Rightarrow\; \theta_2^g \beta_2 < 1 \;\Rightarrow\; \beta_2 < \frac{1}{\theta_2^g} \]

• $0 < \theta_2^g < 1$

In this case the condition $\beta_2 < 1/\theta_2^g$ is always satisfied, so two will always choose to Take (T) everything for himself, for every $\beta_2$. We have the sequential psychological equilibrium (D, T).

• $\theta_2^g > 1$

This time whether two will choose T or S depends on the relation between $\theta_2^g$ and $\beta_2$.
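As an illustration of the case analysis above, player two's comparison can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function names are ours and purely hypothetical, player one is assumed to have chosen C, $\beta_2$ is assumed to lie in $[0, 1]$, and the payoffs are those of Example 1.

```python
def player_two_payoff(action: str, theta: float, beta2: float) -> float:
    """Psychological payoff of player two after player one chose C (Example 1)."""
    if action == "S":
        return 2.0
    if action == "T":
        # Material payoff 4 minus the guilt term 2 * theta * beta2.
        return 4.0 - 2.0 * theta * beta2
    raise ValueError("action must be 'T' or 'S'")


def player_two_best_reply(theta: float, beta2: float) -> str:
    """Return 'T', 'S' or 'indifferent' by comparing the two payoffs."""
    take = player_two_payoff("T", theta, beta2)
    share = player_two_payoff("S", theta, beta2)
    if take > share:
        return "T"
    if take < share:
        return "S"
    return "indifferent"
```

With a guilt sensitivity above 1, say $\theta_2^g = 2$, the threshold is $\beta_2 = 1/2$: the sketch returns T below it and S above it, in line with the condition $\beta_2 < 1/\theta_2^g$.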

⁶ It is important to notice that this game is only one possible strategic formulation of the guilt game. As will be mentioned below, the payoff and the structure are taken from the literature, but other forms are possible.

⁷ Since in this game $v_i : Z \times B \times S_{-i} \to \mathbb{R}$, $v_i$ is a function of the terminal history determined by the strategy $s$, that is $\zeta(s)$, of the relevant belief from $i$'s hierarchy of beliefs, and of the strategy $s_{-i}$ of the opponent. So we will have $v_i(\zeta(s), \alpha, s_{-i})$.


In general player one is uncertain about the value of $\beta_2$ (but could also be uncertain about $\theta_2^g$), so he will have a belief $\alpha_1$ that will guide his choice. This case is a clear example of how the high number of unknown parameters makes psychological games a problematic tool to handle, and some may say that this complexity is not representative of reality.

The primary aim of this thesis is to argue that repetition and learning are possible solutions to the problem of complexity in the analysis of psychological games. In fact, whatever prior beliefs the players have, the actions implemented at a specific node could constitute information about the beliefs, and therefore also about the feelings, of other players, and thus ease the task of calculating optimal behaviour in the following periods of the repeated game.

In the next chapter, a learning perspective in repeated psychological games is introduced, and a discussion about how it may overcome the complex analysis of the illustrated game is conducted.


Chapter 3

Learning dynamics

3.1 Learning in psychological games

The concepts of psychological Nash equilibrium and sequential equilibrium presented above allow us to consider as equilibrium profiles only those whose assessments contain "correct" beliefs, as expressed by assumption 3. However, the assumption about correct beliefs is quite strong. Battigalli & Dufwenberg abandon this assumption and analyse non-equilibrium paths in dynamic psychological games, relying on past results in the field of interactive epistemology (Aumann, 1999; Battigalli & Siniscalchi, 1999). The notation is quite complex, and even if the predictions are sharp, it is possible that other, simpler theories explain the same behaviour.

To proceed in this section, we must slightly change the formal context. Here the game $G = \langle I, Y, (S_i)_{i \in I}, g, (v_i)_{i \in I} \rangle$ is repeated an infinite number of times. At the end of each time $t$ each player $i$ receives a message $m_i^t = f(s_{-i}^t)$ that is a function of the opponent's move. Players use this message to update their beliefs. In our case we assume what is called "perfect monitoring":

Assumption 4. Players can directly observe the action implemented by their opponents, so the message space coincides with the strategy space $S_{-i}$. This message is the basis on which players update their beliefs.

Another critical assumption that we have to make is that players behave "myopically" (Battigalli et al., 1992).


Assumption 5. At each time $t$ players try to maximise their current expected utility and do not think about the future.

This assumption, even if strong, may characterise a repeated situation in which players are not "rational" enough, in the sense of classical game theory, to consider future earnings in their present choice. It could adhere better to reality, since the behaviours we want to capture here, driven by "guilt" or other emotions, are rarely the result of a calculation of present and future utility. Moreover, in support of this assumption, we recall what was mentioned in Chapter 1 about hypothesis 3 by Attanasi & Nagel. Experimental results did not support this hypothesis, so we cannot conclude that people act to signal; they just try to do their best at every time $t$. If people were able to give signals, they would exploit this ability to manipulate their opponent and sacrifice some present payoff to increase their intertemporal utility. Our model will not consider this possibility.

To summarise, we will base our model here on assumptions 1, 4 and 5, while we will abandon assumption 2.

As already mentioned, learning theories could ease the analysis of psychological games. Both Battigalli & Dufwenberg and Neicu (2012), in his recent review, argue that it might be worthwhile to introduce a learning or bounded rationality perspective in psychological games. Indeed, by removing assumption 2 of complete rationality, theoretical predictions that fit experimental data better could be developed. As Battigalli & Dufwenberg underline, the main issue that could arise in this context is that players must learn not only the strategies played by others but also their beliefs. This last process is unlikely to take place, since it is difficult for someone to directly observe the beliefs of someone else.

Among available models of learning, two that could turn out to be the most useful for our purposes are:

• Reinforcement learning, in which strategies are reinforced in proportion to their payoffs. In our case of belief learning, since payoffs are endogenous and directly depend on beliefs, players should be able to reinforce beliefs that turn out to be correct after every stage of the game, so as to be fully aware of their own payoff at every terminal history and correctly adjust their strategy.

• Markov learning, in which the predicted outcome is a stochastically stable state. In this context, one could say that players experiment with various strategies based on their recent experience. This process could lead to a stable pattern in a fairly small number of periods (Huck et al., 1999).
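To make the first of these two models concrete, a basic cumulative-payoff reinforcement scheme, in the spirit of Roth and Erev, can be sketched as follows. We do not commit to a specific reinforcement model here, so this choice, the function names and the initial propensities are our own assumptions; the sketch also assumes non-negative payoffs and at least one strictly positive propensity.

```python
import random


def choice_probabilities(propensities: dict) -> dict:
    """Choose each action with probability proportional to its propensity."""
    total = sum(propensities.values())
    return {action: q / total for action, q in propensities.items()}


def reinforce(propensities: dict, chosen: str, payoff: float) -> dict:
    """Add the realised payoff to the propensity of the action just played."""
    updated = dict(propensities)
    updated[chosen] += payoff
    return updated


def choose(propensities: dict, rng: random.Random) -> str:
    """Draw one action according to the current propensities."""
    actions = list(propensities)
    weights = [propensities[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]
```

Starting from equal propensities, an action that earns a positive payoff becomes more likely in the next period, which is the sense in which strategies are "reinforced" by their payoffs.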

What we argue in this thesis is that if we define a rule according to which beliefs react to others' moves and are updated accordingly at each stage of the game, we will be able to trace the dynamics of beliefs and find the optimal behaviour of every player. The rules we are talking about are "psychological rules": if we introduce a particular emotion in a game, then the rule is shaped by imagining how an agent will react to his opponent's behaviour. This reaction need not be consistent with definitions of full rationality, as it is a rule on how emotions change that does not take into account the possibility that an individual's will or intentions play a role.

Although the traditional updating rule has been Bayes' rule, we here imagine a more natural psychological pattern, according to which the belief that an action will be implemented increases if that action is chosen. A "psychological rule" may be considered a kind of learning model: players update their beliefs according to some psychological common-sense explanation by observing others' past choices. The learning is here implicit, in the sense that players update their beliefs that some strategy is played, change their expectations and consequently change the payoffs¹ at specific terminal histories and, if necessary, change their strategy.

¹ This is the element of discontinuity with standard learning models: here payoffs are endogenous and can change, while they are exogenous in standard models.

This rule may be considered a reinforcement learning process because beliefs about a particular event are reinforced if that event actually happens. Moreover, by updating their beliefs, players also change their reward, since beliefs are a part of the payoff of a specific outcome, and consequently they will behave in the next stage so as to reach the outcome that gives the highest payoff in the light of the new beliefs. On the other hand, the rule also has some features of a Markov learning process. The long-run equilibrium could be interpreted as a stable state reached after the many stages needed to learn the opponents' actual beliefs and the optimal strategy to play. It is important to notice that to use this rule it is not necessary to assume that players can observe the beliefs of others. In fact, its implicit construction is such that players make a simple association between the strategy that has just been played and the possible belief of a player, without necessarily observing it directly.

From now on we will refer to this rule using the expression "psychological rule", which could be a promising way to analyse repeated psychological games, abandoning the assumption that players are entirely rational and update beliefs coherently with Bayes' rule.

A brief excursion into the field of cognitive science could justify the claim that such a psychological rule has more predictive power than Bayes' rule. Within this field, it is indeed common to describe particular kinds of intelligent behaviour by assuming that agents have rules of the form "if ... then ...". In particular, the main assumption that relates to our thesis is that people have mental rules, but also procedures for using these rules to search a space of possible solutions when facing a problem. Moreover, "rule-based systems have also been of practical importance in suggesting how to improve learning and how to develop intelligent machine systems" (Thagard, 1996).

The assumptions under which Bayes' rule should work are instead rather strong and lie at a more complex level. To apply Bayes' rule, in fact, an agent should have in his mind a representation of statistical correlations and conditional probabilities, and the capacity to conduct probabilistic computations. It is not difficult to see that the assumptions behind the psychological rule are substantially weaker.

Since we no longer use the strong assumption 2, and assumptions 4 and 5 were introduced only because of repetition and would have been necessary in any similar framework, we could say that we are using fewer postulates. The fact that fewer assumptions are necessary to find a stable way of playing may lead to stronger predictions, which hopefully could also fit experimental results. Even if it is not necessary that players know exactly which rule other players use, one could say that if every player assumes that other agents also learn in the same way, then in the limit, if it exists, he will undoubtedly hold correct beliefs about others' beliefs, which is one of the conditions for an assessment to be an equilibrium.

To give a formal definition of the psychological rule, we have to introduce some notation and concepts of learning dynamics. In particular, we will take into account the repetition of finite games, sequences of action profiles and beliefs. Moreover, we will use the notion of "adaptive learning" as it will be defined in the following paragraph.

A sequence of action profiles is indicated as $(s^t)_{t=1}^{\infty}$ with $(s_i^t)_{i \in I} \in S$. To quote a classical reference in the literature on learning dynamics in games, we could say that $(s_i^t)$ is "consistent with adaptive learning if player n eventually chooses only strategies that are nearly best-replies to some probability distribution over his competitors' joint strategies, where near zero probability is assigned to strategies that have been not played for a sufficiently long time" (Milgrom & Roberts, 1991). In order to have a precise definition of this concept we have to introduce the rationalization operator $U$, as it was called by Milgrom & Roberts. In our notation we have that:

\[ U_i(S) := S_i \cap \arg\max_{\sigma_i \in \Delta(S_i)} u_i(\sigma_i, \alpha_i) \]

$U_i$ is the set of all undominated mixed strategies of player $i$ for all possible beliefs. Now we are ready for the formal definition of adaptive learning, which we adapt from Battigalli (2018):

Definition 6. A trajectory $(s^t)_{t=1}^{\infty}$ is consistent with adaptive learning if there exists a $T$ such that, for all $t > T$, $s^t \in U(s^t : t > T)$.

This definition just states what we said in the previous paragraph. For a trajectory $(s^t)_{t=1}^{\infty}$ to be consistent with adaptive learning, a time $T$ must exist such that at every time $t$ after $T$ the action $s_i^t$ of player $i$ is undominated.

In the next sections, we introduce this approach, which could be used to analyse repeated psychological games from a learning perspective. Two examples are provided to help the reader fully understand what this approach means and what it implies for the analysis.

3.2 The Psychological Rule

In this section, we exploit everything that has been used in the previous paragraphs to give a formal denition of the learning process.

To capture the idea of what happens in a game where there is a psychological rule, it is useful to briefly recall the static concept of conjectural equilibrium (Hahn, 1973), which is convenient to represent the stable state that is reached by a dynamic process like learning. The definition by Battigalli et al. (1992) of conjectural equilibrium in a static game² states that:

Definition 7. A profile of strategies constitutes a conjectural equilibrium of $G$ if, for each $i$, we can find a belief $\alpha_i \in \Delta(S_i)$ such that:

1. $s_i \in U_i(S)$

2. $p(m_i \mid s_i, \alpha_i) = 1$

where $p(m_i \mid s_i, \alpha_i)$ is the subjective probability of receiving message $m_i$ given beliefs $\alpha_i$ and playing strategy $s_i$. The informal interpretation of the first condition is that "player i maximises the expected payoff of the current period", while the second condition states that the subjective probability of player $i$ must be correct. As Battigalli et al. explain, the informal interpretation of this static equilibrium concept captures the idea that "the observed long-run frequencies of messages should not induce i to change his belief and choice", so that players can form correct beliefs about the opponent's moves³ after a certain number of hypothetical periods and maximise their utility given these beliefs. As we will see, the psychological rule is the dynamic representation of this idea.

We can now proceed with the formal definition of "psychological rule":

Definition 8. A psychological rule for a player $i$ is a function $R_i : M_i \times B_i^{t-1} \to B_i^t$.

This definition recalls the intuition behind the static conjectural equilibrium: the rule is a function that maps the message received and the previous belief hierarchy to the updated belief hierarchy. Of course, there may be different explicit rules, and it is beyond the scope of this thesis to investigate the functional properties that the function could have. However, the examples in the upcoming sections illustrate what we imagine a rule could be and explain its connection to the concept of "adaptive learning". We hope that the examples, together with the connections to this concept, could constitute a good starting point for further research.
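To fix ideas, the signature of a psychological rule can be sketched in Python as a function from a message and the previous beliefs to the updated beliefs. This is only one hypothetical instance: the truncation of the hierarchy to first- and second-order beliefs, the names `Beliefs` and `make_step_rule`, and the linear step towards 0 or 1 are all our assumptions, not properties required by the definition.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Beliefs:
    """A belief hierarchy truncated at the second order (an assumption)."""
    first_order: float   # alpha_i
    second_order: float  # beta_i


# R_i : M_i x B_i^{t-1} -> B_i^t
Rule = Callable[[str, Beliefs], Beliefs]


def make_step_rule(raising_message: str, step: float = 0.1) -> Rule:
    """Build a rule that moves beliefs towards 1 after `raising_message`
    and towards 0 after any other message, by a fraction `step`."""

    def rule(message: str, previous: Beliefs) -> Beliefs:
        target = 1.0 if message == raising_message else 0.0

        def move(x: float) -> float:
            return x + step * (target - x)

        return Beliefs(move(previous.first_order), move(previous.second_order))

    return rule
```

Any rule of this form keeps beliefs in $[0, 1]$ as long as the step lies in $[0, 1]$, and it satisfies the qualitative requirement that a belief increases when the corresponding action is observed.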

² Only in this definition do we consider one-shot, non-repeated games within this section, just to give the reader a useful informal interpretation of the conjectural equilibrium concept.

³ We recall that since perfect monitoring is assumed, the message is the opponent's action itself.


Psychological rules may be a convenient tool to find how agents will behave when the game is repeated infinitely many times, but it is important to notice that this approach deviates from the standard practice in epistemic game theory. Until now, finitely repeated psychological games have in fact been analysed by means of interactive epistemology, as Battigalli & Dufwenberg suggest in the last paragraph of their article. Nevertheless, this thesis aims to focus on psychological, everyday repeated situations: the equilibrium is a stable state derived from a "psychological rule", and not the outcome of an entirely computational process.

It may seem that rules could be extremely subjective, and it may then be argued that the stable equilibrium could vary with different rules. While this is true, if psychological rules are consistent with explanations from the psychology literature, then they allow these theories to be tested from a normative perspective, which may provide valuable information when trying to explain behaviour driven by belief-dependent motivations, like the "guilt" or "sympathy" that we will model in the following sections.

In the next two sections, the concept of psychological rule is presented in two examples, a guilt game and a prisoners' dilemma.

3.3 Beliefs Dynamics in the guilt game

The main complexity in repeated psychological games is the fact that belief-dependent motivations enter utility functions. If we allow for a complete belief hierarchy and belief revision, then in every stage agents will play a different game. This feature implies that the argument of backward induction can no longer be used to demonstrate that the equilibrium of the stage game will also be the equilibrium of the repeated game. Finding equilibria in these games may, therefore, be a very demanding task.

To give an example that provides intuition for this reasoning, we here present an analysis of an infinitely repeated trust game with guilt, similar to the one previously illustrated. For our purpose we will reformulate the sequential game as a one-shot game with this form:

      T                                  S
D   1, 1                                 1, 1
C   0, $4 - 2\theta_2^g \beta_2^t$       2, 2


This reformulation is plausible: player two's choice affects the outcome only if player one chooses C, precisely as in the sequential game. Indeed, the analysis of the stage game is the same as that provided in the previous section.
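For the one-shot form, the two conditions of Definition 3 can be checked mechanically for pure-strategy profiles. The sketch below is only illustrative and rests on our own assumptions: the function names are hypothetical, the guilt term is taken to be $2\theta_2^g \beta_2$ as in Example 1, and equilibrium beliefs are degenerate, so that $\alpha_1$ is 1 if player two shares and 0 otherwise, with $\beta_2 = \alpha_1$.

```python
def payoffs(s1: str, s2: str, theta: float, beta2: float) -> tuple:
    """Payoffs (player one, player two) of the one-shot guilt game."""
    if s1 == "D":
        return (1.0, 1.0)
    if s2 == "S":
        return (2.0, 2.0)
    # Profile (C, T): player two bears the guilt term 2 * theta * beta2.
    return (0.0, 4.0 - 2.0 * theta * beta2)


def is_pure_psychological_nash(s1: str, s2: str, theta: float) -> bool:
    """Check Definition 3 for the pure profile (s1, s2)."""
    # Condition 1: beliefs conform to the profile being played.
    alpha1 = 1.0 if s2 == "S" else 0.0
    beta2 = alpha1
    u1, u2 = payoffs(s1, s2, theta, beta2)
    # Condition 2: no unilateral deviation pays, holding beliefs fixed.
    best_u1 = max(payoffs(d, s2, theta, beta2)[0] for d in ("C", "D"))
    best_u2 = max(payoffs(s1, d, theta, beta2)[1] for d in ("T", "S"))
    return u1 >= best_u1 and u2 >= best_u2
```

Under these assumptions the profile (D, T) passes the check for any guilt sensitivity, while (C, S) passes only when $\theta_2^g \ge 1$, since with correct beliefs player two compares 2 with $4 - 2\theta_2^g$.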

The message spaces for the two players are $M_1 = S_2$ and $M_2 = S_1$, because of perfect monitoring. The psychological rule is a pattern according to which $\alpha_1$ and $\beta_2$ evolve in time⁴. In this case it could be argued that the expectation of player one that player two will share (S) will increase if in $(t-1)$ player two chose S. In the same way we can say that the expectations of player two regarding $\alpha_1$ will increase if he plays S and player one continues (C). Our rules are synthesized by the following difference equations:

\[ \alpha_1^{t+1} > \alpha_1^t \ \text{if } m_1 = S \quad\text{and}\quad \alpha_1^{t+1} < \alpha_1^t \ \text{if } m_1 = T \tag{3.1} \]

\[ \beta_2^{t+1} > \beta_2^t \ \text{if } m_2 = C \quad\text{and}\quad \beta_2^{t+1} < \beta_2^t \ \text{if } m_2 = D \tag{3.2} \]

If we assume players update their beliefs according to equations 3.1 and 3.2 defined above, then the analysis of the game becomes tractable. Again, we have to reason case by case:

• $0 < \theta_2^g < 1$

Here for every $\beta_2^1$, and so also for every $\alpha_1^1$, player two will choose T in the first stage of the game. According to the rule, $\alpha_1^2 < \alpha_1^1$, whatever $\alpha_1^1$ is and whatever action player one takes. In words, the expectation of player one that player two will play S decreases at every stage of the game. Since player two will always choose T we can say that $\Delta\alpha_1^t < 0$⁵. This means that whatever the starting value $\beta_2^1$ is, at a certain stage of the game player one will start playing D always, which means that the value of $\beta_2^t$ will start decreasing as time passes, so also $\Delta\beta_2^t < 0$. In the long run we have that:

\[ \lim_{t \to \infty} \alpha_1^t = \lim_{t \to \infty} \beta_2^t = 0 \]

That is indeed a condition for equilibrium, in which players will continuously play (D, T).

⁴ We recall that in section 2.1 we defined the first-order belief of player $i$ as $\alpha_i$, while the second-order belief is $\beta_i$. The superscript $t$ in $\alpha_i^t$ indicates the round we are talking about.

⁵ By $\Delta\alpha_1^t$ we mean the change over time of $\alpha_1$.


• $\theta_2^g > 1$

The initial relation between $\theta_2^g$ and $\beta_2^1$ will tell us what will happen in the first stage of the game, but the initial uncertainty will not overcome the psychological rule. In fact, if in stage 1 $\alpha_1^1$, $\beta_2^1$ and $\theta_2^g$ are such that the outcome is (D, T), then the same reasoning as in the previous case applies. On the other hand, it can be that in stage 1 agents play (C, S). In this case, according to the psychological rule, $\Delta\alpha_1^t > 0$, $\Delta\beta_2^t > 0$ and in the long run:

\[ \lim_{t \to \infty} \alpha_1^t = \lim_{t \to \infty} \beta_2^t = 1 \]

The repeated outcome will be in this case (C, S).

If in the first stage something different from what has been considered happens, according to the psychological rule sooner or later both players will start playing the same pattern of moves that they would play in the two stable states, and in the long run they will reach one of these two equilibria.
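The two cases discussed above can be reproduced with a small simulation. Equations 3.1 and 3.2 only fix the direction in which beliefs move, so the sketch below has to pick a specific rule: we assume, purely for illustration, that each belief moves a fixed fraction `step` towards 1 or 0. The function name, the tie-breaking choices (D and T when indifferent) and the guilt term $2\theta_2^g\beta_2$ taken from Example 1 are likewise our assumptions. Players are myopic, as in Assumption 5.

```python
def simulate_guilt_game(theta, alpha1, beta2, rounds=200, step=0.2):
    """Simulate the repeated one-shot guilt game under a linear belief rule.

    alpha1: player one's belief that player two shares (S).
    beta2:  player two's belief about alpha1.
    Returns the history of action profiles and the final beliefs.
    """
    history = []
    for _ in range(rounds):
        # Player one compares 2 * alpha1 (from C) with 1 (from D).
        move_one = "C" if 2.0 * alpha1 > 1.0 else "D"
        # Player two compares 2 (from S) with 4 - 2 * theta * beta2 (from T).
        move_two = "S" if 4.0 - 2.0 * theta * beta2 < 2.0 else "T"
        history.append((move_one, move_two))
        # Equation 3.1: alpha1 rises after S and falls after T.
        alpha1 += step * ((1.0 if move_two == "S" else 0.0) - alpha1)
        # Equation 3.2: beta2 rises after C and falls after D.
        beta2 += step * ((1.0 if move_one == "C" else 0.0) - beta2)
    return history, alpha1, beta2
```

With a low guilt sensitivity the simulated play settles on (D, T) with both beliefs approaching 0, while with a high guilt sensitivity and optimistic initial beliefs it settles on (C, S) with both beliefs approaching 1. Other step sizes or non-linear rules satisfying 3.1 and 3.2 may change the path, and possibly which stable state is reached from mixed starting points.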

The two figures below represent all the possible one-shot outcomes of the game as a function of beliefs and the parameter $\theta_2^g$. In particular, the figure on the right tells us which will be the outcome of the game for $\theta_2^g = 12$. The psychological rule tells us that if the starting point is in the upper-right quadrant or in the lower-left one, there will be no possibility of changing quadrant, and the equilibrium when $t \to \infty$ will be (C, S) in the first case and (D, T) in the second. If instead the starting point is in one of the other two quadrants, many things can happen. Basins of attraction for the two fixed equilibria can, of course, be found if the psychological rule is well defined. The figure on the left, instead, tells us when player two will play S or T given his beliefs and his guilt aversion parameter⁶.

⁶ The two figures are plotted using the results obtained in Example 1.

Figure 3.1: Player two's actions as a function of $\beta_2$ and guilt aversion $\theta_2^g$.

Figure 3.2: Outcome as a function of beliefs $\alpha_1$, $\beta_2$, for $\theta_2^g = 12$.

As the reader can notice, it is not necessary to state explicitly what will happen at every time $t$ of the repeated game. In fact, the psychological rule may be useful to see how people accustomed to common situations will behave even if they are not able to consider the unknown parameters of the game (e.g. other players' beliefs). They have probably faced the same game in the past and learned what to expect and how to play.

This idea is closely related to the fact that in this example the sequence of action profiles $s^t$ implemented by every player is consistent with adaptive learning. In fact, the beliefs about actions that are generated in this game will take the probability of (C, S) to be 0 in the first case and the probability of (D, T) to be 0 in the second. Since players maximise current utility, they will choose only strategies that are best replies to the generated beliefs, so they will start playing the same maximising action and converge to one of the two equilibria.

3.4 Emotions and cooperation in a prisoners' dilemma

As we mentioned in the introduction, psychological games could constitute a new theoretical framework to explain cooperation in social dilemmas. The explanation of particular non-rational actions may, in fact, be the psychological reward behind the material payoff. An interesting example of a psychological explanation of why people cooperate may be provided by taking into account a prisoners' dilemma. Geanakoplos et al. had already given an example of how "surprise" could be included in the model as an explanation of cooperation. Here we will analyse an infinitely repeated prisoners' dilemma with "sympathy" and "guilt".

It is not the first time that this point of view on "sympathy" is taken. In fact, Colman (2003), while explaining the relation between behavioural game theory and psychological game theory, states that "one of Camerer's examples of behavioural game theory is Rabin's (1993) fairness equilibrium, based on payoff transformations. According to this approach, a player's payoff increases by a fixed proportion of a co-player's payoff if the co-player acts kindly or helpfully and decreases by of the co-player's payoff if the co-player acts meanly or unhelpfully. [...]. According to Camerer, this may help to explain cooperation in the Prisoner's Dilemma game".

Let's consider a standard game with equal gains from switching. Here $\alpha_1^t$ is the first-order belief of player one at time $t$ that player two will cooperate, while $\beta_1^t$ is the second-order belief of player one, that is, the belief regarding the first-order belief of player two that player one will cooperate. Below is a matrix representation of the "material payoff game":

      C       D
C   2, 2    0, 3
D   3, 0    1, 1

Thanks to the first-order belief we can model "sympathy" or "satisfaction" for cooperating: if players expect the other one to cooperate with a certain probability, they will receive a psychological reward if the outcome they expect occurs, and this reward is proportional to their expectation. "Guilt" can instead be modelled exactly as before: the more a player thinks he has let down the other's expectation that he would cooperate, the more he feels guilty.

The payoffs for the psychological game are illustrated in the following matrix:

\[
\begin{array}{c|cc}
 & C & D \\ \hline
C & 2 + \alpha_1^t,\; 2 + \alpha_2^t & 0,\; 3 - \beta_2^t \\
D & 3 - \beta_1^t,\; 0 & 1,\; 1
\end{array}
\]


Before thinking about the repeated game it may be worth analysing the first stage. From the perspective of player one:

\[
EU_1(C) = 2\alpha_1^1 + (\alpha_1^1)^2
\]
\[
EU_1(D) = 3\alpha_1^1 - \alpha_1^1\beta_1^1 + 1 - \alpha_1^1 = 2\alpha_1^1 - \alpha_1^1\beta_1^1 + 1
\]

He will be indifferent when:

\[
2\alpha_1^1 + (\alpha_1^1)^2 = 2\alpha_1^1 - \alpha_1^1\beta_1^1 + 1 \;\Rightarrow\; (\alpha_1^1)^2 + \alpha_1^1\beta_1^1 - 1 = 0
\]

Because of symmetry, the same condition will hold for player two.
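The first-stage comparison can also be checked numerically. The following is a minimal sketch rather than part of the original analysis: the function names (`eu_cooperate`, `eu_defect`, `indifference_alpha`, `best_reply`) are hypothetical, and resolving ties in favour of D is an arbitrary assumption. It evaluates the two expected utilities of player one and solves the indifference condition for its positive root in $\alpha_1^1$.

```python
import math


def eu_cooperate(alpha: float) -> float:
    """Expected utility of C for player one: alpha * (2 + alpha) + (1 - alpha) * 0."""
    return alpha * (2.0 + alpha)


def eu_defect(alpha: float, beta: float) -> float:
    """Expected utility of D for player one: alpha * (3 - beta) + (1 - alpha) * 1."""
    return alpha * (3.0 - beta) + (1.0 - alpha)


def indifference_alpha(beta: float) -> float:
    """Positive root of alpha^2 + alpha * beta - 1 = 0 for a given beta."""
    return (-beta + math.sqrt(beta * beta + 4.0)) / 2.0


def best_reply(alpha: float, beta: float) -> str:
    """Action with the higher first-stage expected utility.

    Ties are resolved as D; this tie-breaking is an assumption of the sketch.
    """
    return "C" if eu_cooperate(alpha) > eu_defect(alpha, beta) else "D"


if __name__ == "__main__":
    for beta in (0.0, 0.5, 1.0):
        threshold = indifference_alpha(beta)
        print(f"beta = {beta:.1f}: indifferent at alpha = {threshold:.3f}")
```

Under these formulas the threshold equals 1 when $\beta_1^1 = 0$ and $(\sqrt{5} - 1)/2 \approx 0.618$ when $\beta_1^1 = 1$, so a stronger second-order belief lowers the first-order belief needed for cooperation to be the best reply.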

The figure on the left represents the utility of cooperating and the utility of defecting. The figure on the right instead identifies the regions defined by $\alpha_i^1$ and $\beta_i^1$ that lead player $i$ to cooperate or defect. The shaded area between expected utilities indicates the range covered by $U(D)$ for any possible value of $\beta$.[7]

Figure 3.3: Utilities as a function of beliefs.
Figure 3.4: Cooperation and defection regions.

[7] Figures are plotted using the results about expected payoff obtained in this section.

The fact that the game is repeated infinitely many times allows players to update beliefs and coordinate on certain equilibria among the infinitely many that could be considered by means of interactive epistemology. Again, we can simplify this analysis by introducing a relatively simple, common-sense psychological rule according to which beliefs change as the repetition goes on. We could say that the expectation that the opponent will cooperate increases if he cooperated in the previous stage, and decreases otherwise:

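\begin{align}
\alpha_1^{t+1} > \alpha_1^{t} \text{ if } m_1 = C \quad &\text{and} \quad \alpha_1^{t+1} < \alpha_1^{t} \text{ if } m_1 = D \tag{3.3} \\
\alpha_2^{t+1} > \alpha_2^{t} \text{ if } m_2 = C \quad &\text{and} \quad \alpha_2^{t+1} < \alpha_2^{t} \text{ if } m_2 = D \tag{3.4} \\
\beta_1^{t+1} > \beta_1^{t} \text{ if } m_1 = C \quad &\text{and} \quad \beta_1^{t+1} < \beta_1^{t} \text{ if } m_1 = D \tag{3.5} \\
\beta_2^{t+1} > \beta_2^{t} \text{ if } m_2 = C \quad &\text{and} \quad \beta_2^{t+1} < \beta_2^{t} \text{ if } m_2 = D \tag{3.6}
\end{align}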

where $m_i$ indicates the message received by player $i$. According to equations 3.3, 3.4, 3.5 and 3.6, and depending on the beliefs at $t = 1$, players may end up in a chain of defection or in a chain of cooperation. Since they behave following these rules, if the starting conditions are such that they start by cooperating, that is, both players are in the blue region of the figure above, they will continuously repeat the same strategy. The same holds if their initial beliefs are both in the red region. It may also be that one of the players starts by cooperating and his opponent starts by defecting. In this case, one of the two may at a certain point be convinced to cooperate, but the opposite may also happen. By repeating the analysis of the previous section, we have that:

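• If players can develop a chain of cooperation:

\[
\lim_{t\to\infty} \alpha_1^t = \lim_{t\to\infty} \alpha_2^t = \lim_{t\to\infty} \beta_1^t = \lim_{t\to\infty} \beta_2^t = 1
\]

In this case, the game they play after some amount of time is the following:

\[
\begin{array}{c|cc}
 & C & D \\ \hline
C & 3,\; 3 & 0,\; 2 \\
D & 2,\; 0 & 1,\; 1
\end{array}
\]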

The players will coordinate on the equilibrium (C, C), even if (D, D) is still a Nash equilibrium.


• If instead the starting conditions are such that cooperation is impossible, at a certain point beliefs will start decreasing stage by stage:

\[
\lim_{t\to\infty} \alpha_1^t = \lim_{t\to\infty} \alpha_2^t = \lim_{t\to\infty} \beta_1^t = \lim_{t\to\infty} \beta_2^t = 0
\]

This evolution will lead the payoffs to be identical to those of the traditional prisoners' dilemma:

\[
\begin{array}{c|cc}
 & C & D \\ \hline
C & 2,\; 2 & 0,\; 3 \\
D & 3,\; 0 & 1,\; 1
\end{array}
\]

So the only equilibrium will be (D, D).
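The two limiting cases can be verified with a short enumeration of pure-strategy equilibria. This is only an illustrative sketch: the helper names `stage_game` and `pure_nash` are hypothetical, and, as in the discussion above, beliefs are held fixed at their limit values while unilateral deviations are checked, which is a simplification of the equilibrium notion used for psychological games.

```python
from itertools import product

ACTIONS = ("C", "D")


def stage_game(alpha1, beta1, alpha2, beta2):
    """Psychological payoffs (player one, player two) for each action profile."""
    return {
        ("C", "C"): (2 + alpha1, 2 + alpha2),
        ("C", "D"): (0, 3 - beta2),
        ("D", "C"): (3 - beta1, 0),
        ("D", "D"): (1, 1),
    }


def pure_nash(game):
    """Profiles where neither player gains by a unilateral deviation.

    Beliefs are treated as fixed parameters of the payoff table (assumption).
    """
    equilibria = []
    for a1, a2 in product(ACTIONS, repeat=2):
        u1, u2 = game[(a1, a2)]
        best1 = all(u1 >= game[(d, a2)][0] for d in ACTIONS)
        best2 = all(u2 >= game[(a1, d)][1] for d in ACTIONS)
        if best1 and best2:
            equilibria.append((a1, a2))
    return equilibria


if __name__ == "__main__":
    print("beliefs -> 1:", pure_nash(stage_game(1, 1, 1, 1)))
    print("beliefs -> 0:", pure_nash(stage_game(0, 0, 0, 0)))
```

With all beliefs equal to 1 the enumeration returns both (C, C) and (D, D); with all beliefs equal to 0 it returns only (D, D), matching the two cases described above.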

A complete analysis could identify the two basins of attraction of cooperation and defection as a function of initial beliefs and the psychological rule, but this is outside the scope of this thesis. However, it is clear that this way of analysing a psychological game is easy to deal with, is based on reasonable intuition, and could give the insight that cooperation in the long term may also be a function of the initial psychological disposition of players, as it is reasonable to conclude in some situations.
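To make the dependence on initial beliefs concrete, the dynamics can be simulated for individual starting points. Equations 3.3 to 3.6 only fix the direction in which beliefs move, so the sketch below has to assume a specific rule: each belief moves a fraction `step` of the way toward 1 after a message C and toward 0 after a message D. This rule, the parameter values, the function names, and the reading of $m_i$ as the opponent's observed action are all assumptions made for illustration; players are taken to choose the myopic best reply from the first-stage analysis.

```python
def best_reply(alpha, beta):
    """Myopic best reply: C if alpha*(2+alpha) > alpha*(3-beta) + (1-alpha), else D."""
    return "C" if alpha * (2 + alpha) > alpha * (3 - beta) + (1 - alpha) else "D"


def update(belief, message, step):
    """ASSUMED rule: move a fraction `step` toward 1 after C, toward 0 after D.

    For beliefs strictly between 0 and 1 this satisfies the strict
    inequalities required by equations 3.3 to 3.6.
    """
    target = 1.0 if message == "C" else 0.0
    return belief + step * (target - belief)


def simulate(alpha1, beta1, alpha2, beta2, step=0.2, periods=50):
    """Return the sequence of action profiles and the final beliefs."""
    history = []
    for _ in range(periods):
        a1 = best_reply(alpha1, beta1)
        a2 = best_reply(alpha2, beta2)
        history.append((a1, a2))
        # m_i is the message received by player i, read here as the opponent's action.
        m1, m2 = a2, a1
        alpha1, beta1 = update(alpha1, m1, step), update(beta1, m1, step)
        alpha2, beta2 = update(alpha2, m2, step), update(beta2, m2, step)
    return history, (alpha1, beta1, alpha2, beta2)


if __name__ == "__main__":
    for start in (0.9, 0.3):
        history, beliefs = simulate(start, start, start, start)
        print(start, history[-1], [round(b, 4) for b in beliefs])
```

Under these assumptions, players who both start with beliefs of 0.9 cooperate in every period while their beliefs approach 1, and players who both start at 0.3 defect in every period while their beliefs approach 0. Running `simulate` over a grid of starting points would be one way to approximate the basins of attraction for this particular rule.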

This example does not differ from the first one, in that the way it unfolds is consistent with adaptive learning. The beliefs described above, in fact, lead players to abandon, from a specific time onward, the idea that D (in the first case) or C (in the second) will ever be played again. Given these beliefs and the assumption that players maximise their current payoff, they will continuously play the best reply to the opponent's action consistent with their beliefs, which is indeed the concept of adaptive learning.

It is important to notice that even if the two long-run equilibria, (C, C) and (D, D), have been found by other theories, the analysis proposed here is rather new. In fact, where in the past it has been proposed that cooperation may be reached thanks to repetition, preferences for fairness, or biological relatedness, here the same outcome is explained by a different causal theory, that of emotions, in this particular case "sympathy" and "guilt". Again, as Colman states, "according to Camerer, this [psychological game theory] may help to explain cooperation in the Prisoner's Dilemma game".


Chapter 4

Conclusions

To recall the first part of the thesis, we will focus here on how the questions raised in the introduction were answered in the formal model.

In classical game theory, the intuition for why players choose specific actions was mostly based on the hypothesis that people reason strategically to maximise material payoff. Even in bounded rationality contexts, this "consequentialism" assumption was not abandoned, as the only, though significant, addition was the fact that people are not able to compute everything that is needed. As already mentioned above, some models of fairness (Fehr & Schmidt, 1999) or reciprocity (Rabin, 1993) have been proposed in the last few years, but the fact that people receive a psychological reward based on how the final allocation was reached, or on how it corresponds to a particular ideal of equity, was only superficially mentioned. As this literature showed, a comprehensive framework that allows taking into account infinitely many types of non-material reward is that of psychological games. By now, a significant number of economists consider that belief-dependent motivation, which is taken into account in psychological games, is relevant to economic behaviour (Battigalli & Dufwenberg, 2009).

However, in the recent literature, these games have only been treated from a static point of view, without abandoning the assumption of complete rationality. Since psychologically driven behaviour has been shown to be often the result of a heuristic choice, a bounded rationality framework is a better fit when theoretical predictions are compared with experimental results. Moreover, as already mentioned in the literature review section, experiments explicitly designed to test the predictions of psychological games have brought out some stylised facts that could
