
University of Amsterdam

Faculty of Economics and Business

Imitation with memory:

The best action rule

Liselotte Siteur

10662340

Bachelor thesis econometrics and operations research

Supervised by Dr. Dávid Kopányi

26 June 2016

This document is written by Liselotte Siteur who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Acknowledgements

I wish to thank my supervisor, Dávid Kopányi, for giving us a jump start, for being open to questions and suggestions, and most of all for introducing us to this interesting field of research.

In addition to that, I would like to thank the entire econometrics department of the University of Amsterdam, for all doors that have been open these past years and for the opportunity given to develop myself.

I must also express my sincere gratitude and love to my parents and brother for providing me with unfailing support in any kind of way and continuous encouragement throughout the years to look at the bigger picture. Their support and trust in me resulted in confidence to embark on the econometrics program and to succeed with flying colours. Thank you.

Abstract

The ‘Best action’ rule is an imitation behaviour in which firms imitate the quantity with the highest average profit. In this thesis the ‘Best action’ rule is analysed with reference to the ‘Imitate the best’ model of Alós-Ferrer (2004). Market dynamics simulations are performed to investigate the distribution of the long-run market outcome and the influence of the memory length. In a market without memory, the Walras quantity is reached because of the importance of relative payoff. Of particular interest is the movement towards the cartel quantity as memory length increases, which results from the high importance of absolute payoff. When players in the same market have different memory lengths, the player with the shortest memory leads the rest of the market and gains the highest average profit.

Contents

1 Introduction

2 Development in Imitation research

2.1 ‘Imitate the best’ model without memory

2.2 Memory

2.3 Imitate the best average

3 Simulation model

4 Simulation results and analysis

4.1 Homogeneous memory length

4.2 Heterogeneous memory length

5 Conclusion

References

Appendices

I ‘Average Difference’ extended Table

II Distributions heterogeneous K

i All players memory

1 Introduction

“The past cannot be changed. The future is yet in your power.” - Mary Pickford

This philosophy is true in many ways and can be put into practice in any field of business. However, there are different ways to approach the future. One can continue in the same way, or learn from the past and take fate into one’s own hands. Classical game theory analyses games that are played by fully rational players. Players choose the strategy that is known to be the best action for reaching their objectives. Decisions are based only on the expectation of rationality rather than on previous experiences. In contrast, evolutionary game theory focuses on games in which each player plays according to some behavioural pattern and therefore ‘learns’ from the past (Weibull, 1995).

An application of evolutionary game theory is the behaviour of imitation. Firms only base their production decision on observed results from other firms in the market. In this case, the strong assumption of full rationality of all firms can be dropped. It is no longer assumed that firms are aware of the exact specifications of the economy, but they are only interested in results from previous periods to base their decisions on.

Next to the relaxation of full rationality, imitation theory has two more explicit advantages according to Pingle and Day (1996). First, they argue that it is closer to reality because decision costs are low: no extensive research is needed to reach a decision. Second, the unrealistic assumption that players are aware of the exact market parameters, for example the demand function, is dropped.

Imitation takes several forms and is therefore not a single straightforward rule. This thesis focuses on the ‘Imitate the best average’ rule, in which firms adopt the quantity that had the best average payoff over the most recent periods. This behavioural rule is also known as the ‘Best action rule’. Kandori, Mailath and Rob (1993) motivate the choice of this strategy by the eagerness of firms to find long-run best solutions to their production decisions.

The ‘Best action rule’ is an alteration to the ‘Imitate the best’ model. Vega-Redondo (1997) analyses this model in which players choose the quantity that performed best in the most recent period. This imitation strategy will lead to the Walrasian equilibrium in the long run. The Walrasian quantity is described as the quantity that maximizes a firm’s profit under the assumption that it takes the price as given.

realised in the previous period (see Matthey, 2010 for experimental evidence). To extend the imitation theory, models with memory are introduced. With this introduction, the behaviour of firms will change since they no longer only focus on profits in comparison to other players, but as well in comparison to themselves (Apesteguia, Huck, & Oechssler, 2007).

The paper of Alós-Ferrer (2004) extends the research of Vega-Redondo by incorporating memory in the model. This implies that the quantity that once earned the best profit within the memory interval will be adopted. He proves that all quantities between the Cournot and the Walrasian outcomes can be reached in the long run. In the Cournot equilibrium no player wants to deviate given the actions of the other players, which is also a common result of models with full rationality.

Following the approach of Alós-Ferrer (2004), this thesis analyses the dynamics of the model in which firms imitate the quantity with the best average profit. The focus is on the long-run equilibrium strategies and the importance of memory in this model. The purpose of this research is to find out how production decisions evolve under the best action rule and how important the introduction and length of memory are. In addition, the influence of different memory lengths between players in the same market is analysed. This imitation model is studied by means of computer simulations.

The research starts in Section 2 with a review of previous studies on imitation within the field of evolutionary game theory. In Section 3 the exact model is specified for simulation. Section 4 reports the simulation results, includes some analytical computations to verify the stated theorems, and concludes with a comparison with the theories of Section 2.

2 Development in Imitation research

A good imitation strategy states that no one should imitate an individual or quantity that performed worse than one’s own (Schlag, 1998). Ellison and Fudenberg (1995) introduce the best action rule, under which players eventually adopt the quantity that is best on average. They propose that learning by observing can lead to efficiency in the long run.

This section describes the ‘Imitate the best’ strategy model. From this closely related model, a comparison is made with the ‘Imitate the best average’ model. This analysis starts with the development in research of the ‘Imitate the best’ model. In addition to this, the concept of memory is introduced to investigate possible outcomes for different memory lengths. This section concludes with a comparison with the best action rule after which a proposal is made for a theorem concerning the ‘Imitate the best average’ model.

2.1 ‘Imitate the best’ model without memory

A model that is closely related to the model of ‘Imitating the best average’ is ‘Imitating the best’. The difference lies in the behavioural rule. In the ‘Imitate the best’ model the quantity that generated the highest profit in any of the previous periods is imitated, instead of the quantity with the highest average profit. A brief development of this imitation model is outlined below. The Walrasian equilibrium turns out to be particularly important and is therefore discussed first.

The Walras quantity $q^W$ is defined by the following inequality, which must hold for every quantity $q$:

$$P(N q^W)\,q^W - C(q^W) \ge P(N q^W)\,q - C(q), \tag{1}$$

in which $P(\cdot)$ is the inverse demand function, $C(\cdot)$ the individual cost function and $N$ the number of firms. For this equilibrium it holds that, when the price is taken as given, the Walras quantity is always the best to play. In other words, under the assumption that the price corresponding to the Walrasian equilibrium is given, a firm choosing any quantity other than that of Walras cannot earn a higher profit.

The Walrasian equilibrium is found to be very important within imitation research. Schaffer (1989) proposed a model in which two firms play a simple imitation game within a market with an infinite population, deciding which quantity to play under identical constant marginal costs. Since he introduces a game within an infinite population, he secures a fixed price. His results show that the evolutionarily stable quantity is reached due to two different phenomena. On the one hand, the firms try to maximize their profit; on the other hand, spiteful behaviour can be distinguished. Spite is the behaviour of envying one’s opponent and therefore longing for a result at least as good. Schaffer concludes that the Walrasian quantity is globally evolutionarily stable, since it is stable against any kind of mutation. However, the assumption of an infinite population is rather strong.

Therefore, Rhode and Stegeman (1995) relax the condition of infinitely many firms, which makes the price no longer exogenous. They model a duopoly with linear demand and quadratic costs. In this duopoly, an imitation strategy is played in which firms imitate the currently most successful output. Including the possibility of mutation results in convergence to the Walrasian equilibrium in the long run. Rhode and Stegeman reach their conclusion under fewer assumptions; however, their result is not applicable in every situation.

Vega-Redondo (1997) studies this phenomenon in more detail to find a more general theorem. He still models imitation as choosing the quantity that gave the highest profit in the last period. Two types of model are considered: first a model without mutation, and then a model with mutation. He states that in a model without mutation, hence pure imitation, only one step is needed to reach a stochastically stable state, since all firms show the same behaviour and therefore choose the same quantity. In the following periods no other choices can be made and all future decisions are fixed. The dynamics show no further evolution, and the Walrasian quantity is not reached in the long run if no firm is in that state at the beginning. Vega-Redondo develops the model by including mutation, which makes the dynamics richer. He introduces mutation as the possibility of choosing a random quantity. In other words, it is the opportunity to try a new quantity, which occurs with a small probability. The ability to try random new quantities makes this a model of trial and error: because firms can choose an arbitrary production level, they are able to experiment. He finds, however, that imitation allows firms to correct their mistakes when a decrease in profits is observed. This property of mutation gives evolution a significant role in the model.

The analysis of his model with mutation is based on the result that

$$P((n-k)q + k q^W)\,q^W - C(q^W) \ge P((n-k)q + k q^W)\,q - C(q) \tag{2}$$

with $q^W$ the Walrasian equilibrium and $q \ne q^W$. This inequality holds as long as the

when some firms play the Walrasian quantity and others play a different quantity, the Walrasian quantity generates higher profits. Bringing this in the light of imitation this means that a firm will always imitate the Walrasian quantity if possible.

Vega-Redondo (1997) then found that if the possibility to mutate is included, the outcome can be that one will always deviate at a certain point and eventually end up in the Walrasian equilibrium since only behaviour that tends towards this equilibrium will be adopted. In addition he found that, when firms are in a Walrasian state and a mutation occurs, this ‘mistake’ is corrected since this mutation will lower profits relative to other players and in the next period, everyone will play the Walras quantity again. Therefore, he concludes that the Walras quantity is globally stable in this model. By the introduction of mutation, it is no longer necessary to start in a specific state to study the stochastically stable Walras quantity. For simplicity, it is assumed that all firms start with the same quantity.

The model of Vega-Redondo focuses on spite, longing for the highest relative payoff (Hamilton, 1970). Even though it is not optimal in absolute payoff, it may be linked to survival. The stronger the position of the firm in relation to other firms, the more support for a predatory campaign, or simply more chance to survive (Vega-Redondo, 1997).

The introduction of mutation makes imitation games evolutionarily relevant. When firms simply imitate the best quantity, it can be concluded that the Walrasian quantity is most important due to the relative difference in payoffs. Absolute payoffs, however, do not play a role in this model: the actual value of profit is not important, only its value in comparison with other players.

2.2 Memory

Memory, the capacity to remember previously played quantities, puts absolute profits into perspective. The introduction of memory shifts the imitation rule of ‘Imitate the best’ to an absolute measurement rather than a relative one. Including memory allows firms to detect and correct mutations that are not profitable. This makes it more difficult to deviate from equilibria, because spiteful behaviour, which implies relative advantages, is no longer important (Alós-Ferrer & Ania, 2009). To explain the concept of memory fully and to investigate its implications, the ‘Imitate the best’ model is used as a clear example.

Alós-Ferrer (2004) introduces memory in his model, which is based on the framework of Vega-Redondo (1997). The introduction of memory changes the imitation behaviour: firms now select the best quantity over the memory length, that is, the quantity matching the highest profit gained in memory. As a result, firms focus more on the absolute result than on the result relative to other players. The analysis of Alós-Ferrer concludes with the proposition that for K ≥ 1 (memory length) and N > 2 (number of firms, which implies no duopoly) the set of stochastically stable states consists of all quantities that lie between the Cournot and Walrasian outcomes, in which Cournot is defined as the quantity played when all players follow the best response function. For this quantity the following inequality holds:

$$P(N q^C)\,q^C - C(q^C) \ge P((N-1)q^C + q)\,q - C(q) \tag{3}$$

This means that no player wants to deviate since any deviation will lower his profits. Firms also take into consideration that the price will be affected due to their actions, in contrast to equation (2).

Alós-Ferrer (2004) proves the proposition on the stable states by focusing on two concepts. The first is the difference in payoff between a mutant and the firms that do not mutate. Here x is the quantity everyone is imitating and y is the quantity to which some firm mutates. The expression

$$D(x, y) = P((N-1)x + y)\,(y - x) + C(x) - C(y) \tag{4}$$

gives the difference in payoffs between the firm playing y and the firms playing x. If this expression is positive, the mutant earns a higher profit.

The second is the absolute gain from a mutation:

$$M(x, y) = P((N-1)x + y)\,y - P(Nx)\,x + C(x) - C(y) \tag{5}$$

Whenever this expression is positive, the mutation leads to a higher absolute payoff than the firm earned before, when it played the imitation quantity x like every other firm.

Alós-Ferrer (2004) illustrated his proof by distinguishing two movements. He first shows that a mutant will get higher profits than those not mutating only if the mutation happens in the direction of the Walrasian quantity. On the other hand, a mutant will get a higher profit than before if the mutation happens in the direction of the Cournot quantity. This combination of processes leads to a set of stochastically stable states between the Cournot and Walras outcomes.
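The two movements can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it evaluates D(x, y) and M(x, y) from equations (4) and (5) under the linear specification that Section 3 introduces (P(Q) = 400 − 0.8Q, C(q) = 4q) with N = 5 firms as in Section 4. The function names and the chosen test quantities are illustrative assumptions.

```python
# Assumed parameters: linear demand and cost from Section 3, N = 5 from Section 4.
A, B, C_MARGINAL, N = 400.0, 0.8, 4.0, 5


def price(total_output):
    """Linear inverse demand P(Q) = a - bQ."""
    return A - B * total_output


def cost(q):
    """Cost function with constant marginal cost, C(q) = cq."""
    return C_MARGINAL * q


def relative_gain(x, y):
    """D(x, y), equation (4): mutant's profit minus a non-mutant's profit."""
    return price((N - 1) * x + y) * (y - x) + cost(x) - cost(y)


def absolute_gain(x, y):
    """M(x, y), equation (5): mutant's profit minus its own profit before mutating."""
    return price((N - 1) * x + y) * y - price(N * x) * x + cost(x) - cost(y)


x = 90.0  # hypothetical starting quantity between Cournot (82.5) and Walras (99)
for y in (85.0, 95.0):
    print(f"x={x}, y={y}: D={relative_gain(x, y):.1f}, M={absolute_gain(x, y):.1f}")
```

Starting from x = 90, a mutation up to 95 (towards Walras) gives D > 0 but M < 0, while a mutation down to 85 (towards Cournot) gives D < 0 but M > 0, which matches the two directions described above.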

(Alós-Ferrer & Ania, 2009). It does change the dynamics of the movement but the same conclusion concerning the stable states can be drawn.

In conclusion, when memory is introduced the importance of absolute payoffs becomes significant. It creates more movement in the direction of the Cournot quantity. However, the memory length does not play a role in the convergence to a stable state.

2.3 Imitate the best average

The applicability of the ‘Imitate the best’ rule has been questioned. The rule is considered very naive, since one could copy behaviour that has been most successful in history but turns out not to have the best profits in the long run (Alós-Ferrer, 2004). This behaviour will be copied until it is removed from the memory, which is not optimal. The ‘Imitate the best average’ model can ease this problem, as there is a chance that the average profit of the mutated quantity will drop below that of previous periods.

Both models, when memory is included, consider relative and absolute payoffs in production decisions. The best action rule leans towards an absolute long-term strategy, whereas the ‘Imitate the best’ model focuses on the short run, since only the highest profit ever earned by each quantity in memory is used in the imitation decision.

The model without memory is identical to that of Vega-Redondo (1997), since the maximum over the average of one period is equal to the best played quantity in that period. This results in the following theorem.

Theorem 1. For an imitation model without memory (K = 0) following the best action rule, the only evolutionarily stable state is the Walras quantity.

When memory is introduced, the relevance of ‘spite’ declines and therefore a less aggressive equilibrium could be predicted. However, opinions about the specific result are divided.

Possajennikov (2003) proposes that in symmetric games the introduction of imperfections leads to a decrease in information. Imperfections are defined as players observing realised quantities instead of intended quantities. He states that reducing rationality in imitation games by reducing information will still lead to rational behaviour. This theory is in line with the results of Alós-Ferrer (2004).

On the other hand, Bigoni (2010) proposes an “Imitative” behaviour, resulting from players giving most attention to quantities played in the most recent round. In relation to the article of Vega-Redondo (1997), Bigoni mentions an expected evolutionary state far from the Walras quantity.

Since the absolute payoff is expected to be of great importance, and taking this together with the theory of Bigoni, the stable state of the ‘Imitate the best average’ game is expected to lie below Cournot, or at least a neighbourhood of stochastically stable states is expected to lie around the Cournot quantity. From this expectation, the following conjecture is proposed.

Conjecture 1. For an imitation model with memory (K > 0) following the best action rule, the set of evolutionarily stable states will lie around the Cournot quantity.

In this section, an expectation is formed about the model corresponding to the best action rule, using the results from the ‘Imitate the best’ model. The most important feature of the model without memory is spite towards competitors. Players only choose the best relative quantity and do not mind the absolute payoff, which leads to the highly competitive Walras quantity. The introduction of memory shifts the dynamics towards longing for the best absolute payoff. The main difference between the two models is that, when playing the best action rule instead of imitating the best quantity once played, the relative payoff loses importance and therefore a less competitive outcome is predicted.

3 Simulation model

To form a proper foundation for the proposed theories, a study is conducted to obtain the dynamics of the economy when firms follow the best action rule. The base of the model is similar to the models of Vega-Redondo (1997) and Alós-Ferrer (2004), but the imitating behaviour is different and some extra assumptions are made. In this study the dynamics are simulated for a fixed number of firms and different memory lengths. This results in a pattern and corresponding distribution of the imitation quantity. To perform the simulation, the imitation rule is first defined in detail. Thereafter some specific modelling assumptions are stated.

This study considers a market for a homogeneous product in which $N \ge 2$ firms operate. Consider this market to have a linear inverse demand function $P : \mathbb{R}_+ \to \mathbb{R}_+$ with $P(Q) = a - bQ$ and $P'(\cdot) < 0$, in which all firms are symmetric with identical constant marginal costs $c$. For simplicity, firms only choose their quantity output from a finite set $\Gamma = \{0, \delta, 2\delta, \ldots, \nu\delta\}$ for $\delta > 0$ and $\nu \in \mathbb{N}$. The Walrasian quantity $q^W$, the Cournot quantity $q^C$ and the cartel quantity $q^O$ have to be part of this set.

Two different models are considered: first a model with memory and second a model without memory. In each model two types of behaviour can be distinguished: imitation and mutation. Imitation is the behavioural rule in which firms imitate the quantity with the highest average profit. Mutation is the occurrence of a random firm playing a random quantity. The probability of mutation is denoted by ε with ε ∈ [0, 1). Memory length is denoted by the symbol K ≥ 0. When K = 0 only the most recent period enters the production decision; this is also referred to as the model without memory. For K > 0, firms take K more periods into consideration besides the most recent one. The information space is then of length K + 1.

Imitation is defined as imitating the quantity with the best average profits, therefore the ‘Best action rule’. Imitation occurs with probability 1 − ε. For defining the exact imitating behaviour some additional notation is required.

Let $q_{t,i} \in \Gamma$ denote the production of firm i in period t, for i = 1, ..., N. Next, $q_t = [q_{t,1}, \ldots, q_{t,N}] \in \Gamma^N$ is the vector of individual productions in period t. Finally, let $q_t^K = [q_t; q_{t-1}; \ldots; q_{t-K}] \in \Gamma^{N(K+1)}$ be the collection of the production vectors of the last K+1 periods.

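The best action rule can be described as follows: $q_{t,i} \in B_{t-1}$ for $i \in \{1, \ldots, N\}$, where

$$B_{t-1} = \left\{ q \;\middle|\; q \in q^K_{t-1} \text{ and } \frac{\sum_{i=t-K-1}^{t-1} \pi_i(q)}{\sum_{i=t-K-1}^{t-1} I_{q_i}(q)} \ge \frac{\sum_{i=t-K-1}^{t-1} \pi_i(q_j)}{\sum_{i=t-K-1}^{t-1} I_{q_i}(q_j)} \;\; \forall q_j \in q^K_{t-1} \right\}$$

and with

$$I_{q_i}(q) = \begin{cases} 1 & \text{when } q \text{ is an entry of } q_i \\ 0 & \text{when } q \text{ is not an entry of } q_i \end{cases}$$

Thus $I_{q_i}(q)$ is an indicator function showing whether a quantity q is played by any firm in a specific period i, and $\pi_i(q)$ denotes the profit in period i for a firm playing the quantity q.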

This imitation process can simply be described as choosing, from all the quantities that have been played in the last K+1 periods, the quantity that has the highest average payoff over these periods, where the average is taken only over the periods in which the quantity was played.
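The rule can be sketched in a few lines of code. This is an illustrative reading of the definition above, not the implementation used for the thesis: the function names and the list-of-lists representation of memory are assumptions, and the profit function uses the linear specification given later in this section.

```python
from collections import defaultdict

# Assumed parameters: linear demand P(Q) = 400 - 0.8Q and marginal cost 4.
A, B, C_MARGINAL = 400.0, 0.8, 4.0


def profit(q, total_output):
    """Profit of a firm producing q when aggregate output is total_output."""
    return (A - B * total_output - C_MARGINAL) * q


def best_action_set(memory):
    """Return the set of quantities with the highest average profit in memory.

    `memory` is a list of production vectors, one per remembered period
    (the last K + 1 periods). For each quantity, profits are summed over the
    periods in which it was played and divided by the number of those periods.
    """
    profit_sum = defaultdict(float)
    periods_played = defaultdict(int)
    for quantities in memory:
        total = sum(quantities)
        for q in set(quantities):  # count each distinct quantity once per period
            profit_sum[q] += profit(q, total)
            periods_played[q] += 1
    averages = {q: profit_sum[q] / periods_played[q] for q in profit_sum}
    best = max(averages.values())
    return {q for q, avg in averages.items() if avg == best}


# Five firms at 99; in the latest period one firm mutates down to 90.
with_memory = [[99, 99, 99, 99, 99], [99, 99, 99, 99, 90]]   # K = 1
without_memory = [[99, 99, 99, 99, 90]]                      # K = 0
print(best_action_set(with_memory), best_action_set(without_memory))
```

In this hypothetical example the mutation to 90 is imitated when the previous period is remembered, because the average for 99 is pulled down by its earlier low profit, whereas with K = 0 the quantity 99 still earns the most in the single observed period.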

Next to imitation, mutation is also part of the simulation process; it occurs with probability ε. In that case $q_{t,i}$ is randomly selected from Γ. All players have an independent mutation probability, so in theory every player can mutate simultaneously. For very small ε, however, the probability that two players mutate in the same period is proportional to $\varepsilon^2$ and therefore very small. A player that does not mutate simply chooses $q_{t,i}$ from $B_{t-1}$, for $i \in \{1, \ldots, N\}$.

For the actual simulation, a fixed linear demand function is chosen to be able to compare the results for different values of the memory length and the number of firms. The linear inverse demand function is specified as $p(Q) = 400 - 0.8Q$, where Q is the aggregate output. For each firm i, the cost function $C(q_i) = 4q_i$ is used. Γ is chosen to range from 0 to closely above the Walrasian quantity. For these parameters it holds that the Walrasian quantity lies at $q^W = 99$, the Cournot quantity at $q^C = 82.5$ and the cartel quantity at $q^O = 49.5$.
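These three benchmark quantities can be reproduced from the stated demand and cost functions. The sketch below assumes N = 5 firms, the market size used in Section 4, and uses the standard closed-form expressions for linear demand with constant marginal cost. The grid step δ = 0.5 and the upper bound of 100 are assumptions made here only to check inequalities (1) and (3) numerically.

```python
# Parameters from the text; N = 5 is the number of firms used in Section 4.
A, B, C_MARGINAL, N = 400.0, 0.8, 4.0, 5

q_walras = (A - C_MARGINAL) / (B * N)          # price equals marginal cost
q_cournot = (A - C_MARGINAL) / (B * (N + 1))   # every firm plays a best response
q_cartel = (A - C_MARGINAL) / (2 * B * N)      # joint profit is maximised


def profit(q, total_output):
    """Profit of a firm producing q when aggregate output is total_output."""
    return (A - B * total_output - C_MARGINAL) * q


# Assumed grid: delta = 0.5, from 0 up to 100 (just above the Walras quantity).
grid = [0.5 * k for k in range(201)]
TOL = 1e-9  # tolerance for floating-point rounding

# Inequality (1): at the Walrasian price no quantity earns more than q_walras.
walras_ok = all(
    profit(q_walras, N * q_walras) >= profit(q, N * q_walras) - TOL for q in grid
)
# Inequality (3): no unilateral deviation from q_cournot earns more.
cournot_ok = all(
    profit(q_cournot, N * q_cournot) >= profit(q, (N - 1) * q_cournot + q) - TOL
    for q in grid
)

print(round(q_walras, 2), round(q_cournot, 2), round(q_cartel, 2))
print(walras_ok, cournot_ok)
```

The closed forms follow from setting price equal to marginal cost (Walras), solving the symmetric first-order condition (Cournot) and maximising joint profit (cartel).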

In the simulation, a fixed number of firms start with a random quantity from Γ, which results in corresponding profits. Each firm then chooses its next quantity from the played quantities using the best action rule. However, some firms may deviate from that by mutating to a random different quantity. Thereafter the profits are calculated and the firms face a new, similar decision. Repeating this game over and over again generates the market dynamics that are analysed.
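The process described above can be sketched as a short simulation loop. This is a minimal illustration under stated assumptions, not the code used for the thesis: the mutation probability ε = 0.01, the grid step δ = 0.5, the upper bound of 100, the independent random starting quantities, the untruncated linear price and the random tie-breaking are all assumptions, and the function names are hypothetical.

```python
import random

A, B, C_MARGINAL = 400.0, 0.8, 4.0          # demand and cost from the text
N, K = 5, 3                                 # five firms; K = 3 is one simulated length
EPSILON, DELTA, Q_MAX = 0.01, 0.5, 100.0    # assumed values, not given in the text
GRID = [DELTA * k for k in range(int(Q_MAX / DELTA) + 1)]


def profit(q, total_output):
    """Profit under linear demand (price not truncated at zero, an assumption)."""
    return (A - B * total_output - C_MARGINAL) * q


def best_actions(memory):
    """Quantities in memory with the highest average profit over periods played."""
    sums, counts = {}, {}
    for quantities in memory:
        total = sum(quantities)
        for q in set(quantities):
            sums[q] = sums.get(q, 0.0) + profit(q, total)
            counts[q] = counts.get(q, 0) + 1
    averages = {q: sums[q] / counts[q] for q in sums}
    best = max(averages.values())
    return [q for q, avg in averages.items() if avg == best]


def simulate(periods, rng):
    """Run the dynamics and return the imitated quantity of every period."""
    memory = [[rng.choice(GRID) for _ in range(N)]]  # random starting quantities
    imitated = []
    for _ in range(periods):
        target = rng.choice(best_actions(memory))    # ties broken at random
        imitated.append(target)
        # Each firm mutates independently with probability EPSILON.
        current = [
            rng.choice(GRID) if rng.random() < EPSILON else target
            for _ in range(N)
        ]
        memory = (memory + [current])[-(K + 1):]     # keep the last K + 1 periods
    return imitated


path = simulate(2000, random.Random(0))
print(len(path), min(path), max(path))
```

Collecting `path` over many replications and discarding an initial burn-in would give a histogram of imitated quantities of the kind discussed in Section 4.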

Now the entire model is specified and the simulation process is explained. From here the actual dynamics can be generated and interpreted. It will be analysed whether variations in K influence the imitation results. The focus will be on the differences between the simulation results and the distributions of outcomes.

4 Simulation results and analysis

This section presents the results from simulations of the proposed model according to the specific parameters from Section 3. The market features five players to gain general knowledge of the dynamics. The simulation outcomes show a common movement that supports the proposals stated in Section 2.

The analysis starts by looking at the distributions over homogeneous memory length of the imitation behaviour; thereafter techniques derived from Alós-Ferrer (2004) are used to explain the position of distributions. Second, the behaviour for the heterogeneous memory length is analysed, in which not all players have the same memory length. This section concludes with a comparison of the simulation results and the expectations formed in Section 2. Furthermore, possible imperfections of the current methods are discussed and suggestions for further research are provided.

4.1 Homogeneous memory length

In the case of a homogeneous memory length, all players have the exact same behavioural rule. The memory length is the same and therefore, the imitation quantity is the same for all players. All players follow one particular imitated quantity, excluding the probability of mutation.

The histograms displayed in Figure 1 represent the imitation simulations. Each simulation of the dynamics consists of 20 000 iterations. In addition, this process is replicated 1000 times to average out extreme outcomes due to mutation effects and the dependence on the starting point. All imitated quantities, from the 10 000th iteration on, are displayed in a histogram. Note that mutations are only displayed if they are adopted in the next period and hence become part of the set of imitated quantities. Together they form the distribution of outcomes for different memory lengths. In Figure 1(b)-(f) the Walrasian ($q^W$), Cournot ($q^C$) and cartel ($q^O$) quantities are also indicated.

In Figure 1a, the result is displayed when no players have memory. The largest peak can be found at exactly the Walrasian quantity. When excluding memory from the model it can be concluded that the behaviour is equal to that given in the paper of Vega-Redondo (1997). The obtained results match the description of behavioural patterns, however, quite a few observations do not completely match the Walrasian quantity.

This phenomenon can be explained by the occurrence of two events. First, it is possible that no mutation happens at the exact Walrasian quantity and the game evolves to the closest quantity. Second, it can emerge from a situation in which two different firms mutate during the same period. If both mutate to a quantity under the Walras quantity, these mutations will not be adopted. If one of the players mutates above the Walras quantity and one deviates below, then there are two options: either prices become negative and the lowest quantity will be imitated, or, if prices stay positive, the highest quantity will be imitated. Therefore, a peak is visible at the Walras quantity, but also slight deviations to the left and right.

Figure 1: (a) K=0, (b) K=1, (c) K=3, (d) K=5, (e) K=50, (f) K=500

In Figure 1(b)-(f) the simulation results for various memory lengths are shown. These results will be discussed using the definitions and meanings of the Walras, Cournot and cartel quantities, respectively.

Recall the Walrasian quantity $q^W$ from equation (1). The implication of this quantity most used by Vega-Redondo (1997) is that players playing the Walrasian quantity are best off in a specific period relative to their fellow players. In this case, only the relative payoff is of interest. However, when memory is introduced and the average is taken, this quantity is no longer stable.

When all players play the Walrasian quantity and one deviates above it, the Walrasian quantity is still stable. In this case of linear demand, however, when a player mutates to a lower quantity the payoff of all players will rise, and that of the players playing Walras will rise the most. Nevertheless, the average of the new quantity, which is counted only in that period, will be bigger than the average of the Walras quantity, which had a very low profit in previous periods. Therefore, the lower quantity will be adopted. This will not hold for all quantities below Walras, but when the memory length K increases, the interval in which mutations are successful will increase too.

The Cournot quantity is characterised by all players playing their best response to the other players and is stated in equation (3). Similar to the Walras quantity, the Cournot quantity is found to be unstable. When all players are playing the Cournot quantity, this situation appears to be stable, since the property of being the best response quantity implies that a mutation from this equilibrium should not be adopted. However, it rarely happens that the memory of all players consists only of the Cournot quantity. Even if only previous mutations that were not adopted are in the memory, a following mutation can destabilise this equilibrium. When the memory length increases, the number of mutations in memory will increase and therefore the probability of remaining at the Cournot quantity will fall. This equilibrium can also vanish when different firms mutate in the same period. Therefore the Cournot outcome is also unstable, but its destabilisation depends on the length of the memory.

players decide together on the production decision. This quantity can also be seen as the quantity at which absolute most profit can be found for the players altogether.

It is now assumed that all players are playing the cartel quantity. When a player mutates to a lower quantity, the profit of the other players will rise and therefore also the average of the cartel quantity. The profit of the other quantity will lie beneath that of the cartel and will, therefore, never be adopted, since the average profit of the cartel is always higher. When a player mutates to a higher quantity the following process happens. First, all players will adopt this quantity, if it is not extremely high, since its profits will be higher; but as all players adopt this quantity, its average profit will become lower. The memory length, however, will determine how long it takes for the average profit of the adopted quantity to fall below the average profit of the cartel quantity. The larger the memory length K and the higher the mutation, the faster the unprofitable mutation is corrected. This behaviour is found very clearly in the figures. As the memory length K increases, the imitation behaviour centres more closely on the cartel quantity. Moreover, it seems that only quantities above cartel will be imitated, but for an increase in K the distributions become thinner. For K=1 the distribution is still more centred around the Cournot quantity. This is expected, since a small memory gives more weight to relative best payoffs, in contrast with the absolute payoffs that are realised with long memory lengths.

The results in the distribution of the imitation behaviour are not only based on reasoning but can also be supported by a more analytical approach. In Section 2 the analysis of Alós-Ferrer (2004) was discussed, in which two functions were introduced: the difference in profits between a mutant and a non-mutant, and the difference in the profit of the mutating firm before and after the mutation occurs. However, this technique is not used in exactly the same form. The important information for imitation decisions now lies in the change of average profits during a mutation. Therefore, a new function reflecting the ‘Average Difference’ is introduced.

\[
AD(x, y, K) = \pi_y\big((N-1) \cdot x + y\big) - \frac{\pi_x\big((N-1) \cdot x + y\big) + K \cdot \pi_x(Nx)}{K+1}
\]

Here x represents the quantity all players were playing over the last K+1 periods, y the quantity to which a firm mutates, and πx(·) and πy(·) the profits of players playing x and y, respectively, given the total production level in the market. If AD > 0, a mutation to y will be adopted immediately, since its average profit is higher. However, it is also interesting to know what happens if y was adopted and is now being played in each successive period. Therefore, the ‘Average Difference’ function is expanded to:

\[
AD(x, y, K, i) = \frac{\pi_y\big((N-1) \cdot x + y\big) + i \cdot \pi_y(Ny)}{1+i} - \frac{\pi_x\big((N-1) \cdot x + y\big) + (K-i) \cdot \pi_x(Nx)}{K+1-i}
\]

Here i is the number of periods after the mutation at which the average difference is measured, under the assumption that all players keep playing the mutated quantity. For a quantity x to be stable, the function AD should be below zero for at least one i ≤ K, since there is then a point at which the mutation is ‘corrected’ and players return to the stable state. After a mutation is corrected, the value of this function will not change until the period in which the initial mutation occurred drops out of the memory.
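As an illustration, the ‘Average Difference’ function can be evaluated numerically. The sketch below is not the code used for this research. It assumes a linear market in which a firm producing q earns a profit proportional to q·(S − Q) at total output Q. The values N = 5 and S = 495 are assumptions, inferred from the Walras (99), Cournot (82.5) and cartel (49.5) quantities that appear in this section, and the function names are hypothetical.

```python
# Sketch of the 'Average Difference' function under an assumed linear market.
# N and S are assumptions inferred from the quantities quoted in the text.
N = 5        # number of firms (assumed)
S = 495.0    # (demand intercept - marginal cost) / demand slope (assumed)


def profit(q, total):
    """Profit, up to a positive scale factor, of a firm producing q at total output `total`."""
    return q * (S - total)


def average_difference(x, y, K, i=0):
    """AD(x, y, K, i): average profit of y minus average profit of x.

    Period 0 is the mutation period (one firm plays y, N - 1 firms play x).
    In the i periods after it, all firms are assumed to play y.
    For i = 0 this reduces to AD(x, y, K).
    """
    if not 0 <= i <= K:
        raise ValueError("i must satisfy 0 <= i <= K")
    mixed = (N - 1) * x + y  # total output in the mutation period
    avg_y = (profit(y, mixed) + i * profit(y, N * y)) / (1 + i)
    avg_x = (profit(x, mixed) + (K - i) * profit(x, N * x)) / (K + 1 - i)
    return avg_y - avg_x


if __name__ == "__main__":
    # Setting of the example in the text (x = 55, K = 5), for a mutation to y = 70.
    for i in range(6):
        print(i, round(average_difference(55, 70, 5, i), 2))
```

A positive value means that the mutated quantity y has the higher average profit in period i and is played, a negative value means that players return to x.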

To demonstrate the shape and functionality of the AD function, an example is given in Figure 2. For a fixed x = 55, it shows the difference between the average profit of a mutation to a different quantity and the average profit of all players playing 55. Here K = 5 and the lines represent all possible periods in memory, i ∈ {0, ..., 5}. The figure shows that for i = 0 all quantities above 55 will be adopted; however, in the next period (for i = 2) only quantities up to 75 are played again. For i > 2 no quantity above 55 will be imitated anymore and all players will return to 55.

Since this is only an example, an extension of these results can be found in Table 1. For various K and i, it specifies the quantity at which a mutation to neither a lower nor a higher quantity will be adopted. For i = 0 and K = 0 the Walras quantity is observed, and for i = 0 and increasing K the quantity goes to the Cournot equilibrium. For increasing i, however, the quantity drops. In Appendix I a more extensive version of this table is displayed. It includes values for very large K and i, for which the results tend to 49.5, which is equal to the cartel quantity in this model. The results displayed in the histogram of Figure 1 are therefore in line with the quantities resulting from analysing the ‘Average Difference’ function.

| K \ i | 0 | 1 | 5 | 100 | 1000 |
|---|---|---|---|---|---|
| 0 | 99.0 | 0 | 0 | 0 | 0 |
| 1 | 90.0 | 70.71 | 0 | 0 | 0 |
| 5 | 84.86 | 63.46 | 59.4 | 0 | 0 |
| 100 | 82.64 | 61.95 | 53.09 | 55.24 | 0 |
| 1000 | 82.51 | 61.88 | 53.04 | 49.7 | 55.02 |

Table 1: Stable quantity for various K (rows) and i (columns)
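The entries of Table 1 can be reproduced with a short calculation. The text does not state how they were obtained, so the sketch below rests on assumptions: a quantity x is treated as stable when y = x is a stationary point of AD in y, so that neither a slightly lower nor a slightly higher mutation raises the average profit, and profit is taken to be proportional to q·(S − Q) with N = 5 and S = 495, values inferred from the reported quantities. Setting the derivative of AD(x, y, K, i) with respect to y to zero at y = x then gives x = S(1 + i) / [(N + 1) + 2Ni − (1 + i)/(K + 1 − i)]. The function name is hypothetical.

```python
# Sketch: closed-form stable quantity implied by the 'Average Difference' function.
# Assumptions (not stated in this section): linear market with profit q * (S - Q),
# N = 5 firms and S = 495, inferred from the reported Walras, Cournot and cartel quantities.
N = 5
S = 495.0


def stable_quantity(K, i):
    """Quantity x at which dAD/dy = 0 at y = x, for memory length K and period i <= K."""
    if not 0 <= i <= K:
        raise ValueError("i must satisfy 0 <= i <= K")
    denominator = (N + 1) + 2 * N * i - (1 + i) / (K + 1 - i)
    return S * (1 + i) / denominator


if __name__ == "__main__":
    values = [0, 1, 5, 100, 1000]
    for K in values:
        # Combinations with i > K are not defined and are printed as "-".
        row = [f"{stable_quantity(K, i):.2f}" if i <= K else "-" for i in values]
        print(K, row)
```

Under these assumptions the function returns 99.0 for K = i = 0 and approaches S/(2N) = 49.5 when both K and i grow large, in line with Table 1.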

4.2 Heterogeneous memory length

In the previous section it was assumed that all players have equal memory length; here heterogeneous memory length is discussed. Due to the resulting behavioural differences, it turns out that the players do not have the same imitated quantity. This can lead to different equilibria for different combinations of memory lengths. To investigate this property, the changes in results are discussed. Two cases with varying memory lengths are considered: one in which all players have at least some memory and one in which players without memory exist.

First, a market is considered with varying positive memory lengths. The quantities for one of the players are given in Figure 3; the results for all other players are given in Appendix II.i. The distribution for K=1 seems rather similar to the figures for homogeneous memory. For the other players, however, the results are not in line with the homogeneous case of their memory length, and their distribution is now similar to that of K=1. It can be concluded that the behaviour is similar for all players within the market, but that for each individual memory length it is not comparable with the results found in a market in which all players have that memory length.

Figure 3: Results for heterogeneous memory. (a) Market K=[1 2 10 50 100], player K=1. (b) Market K=[0 1 2 50 100], player K=1.

An explanation of this phenomenon can be found in the difference in memory. Mainly the smallest memory in the market seems to be important. It was previously shown that a mutation to a higher value will be adopted in most situations, but that, depending on the memory length of the player, this movement will be corrected. Now the player with the shortest memory has the least inclination to return to the lower quantity. Players with a long memory, however, will remember the lower quantity and will return. Since a part of the players returns, the high quantity will generate more profit, and its average profit will not fall much or may even increase. This results in a relatively high average profit for the high quantity, and eventually all players will adopt it. Therefore, the player with the smallest memory length is found to be the most important.
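The difference between short and long memory can be illustrated with a minimal sketch of the imitation decision itself. It is an illustration under stated assumptions, not the simulation code behind Figure 3. Each period is summarised by the profit earned with every quantity played in it, a player with memory K averages these profits over the last K + 1 periods and imitates the quantity with the highest average, and profit is assumed to be proportional to q·(S − Q) with N = 5 and S = 495 (inferred values). The names `period_profits`, `best_action` and `history` are hypothetical.

```python
# Sketch of the 'Best action' choice for players with different memory lengths.
# Assumptions: linear market with profit q * (S - Q); N and S are inferred, not given in the text.
from collections import defaultdict

N = 5
S = 495.0


def period_profits(quantities):
    """Map each distinct quantity played in one period to the profit it earned."""
    total = sum(quantities)
    return {q: q * (S - total) for q in set(quantities)}


def best_action(history, K):
    """Quantity with the highest average profit over the last K + 1 periods of `history`."""
    profits = defaultdict(list)
    for period in history[-(K + 1):]:
        for q, p in period.items():
            profits[q].append(p)
    return max(profits, key=lambda q: sum(profits[q]) / len(profits[q]))


# Ten periods at 55, one period with a single mutant at 70, then two periods at 70.
history = (
    [period_profits([55] * N)] * 10
    + [period_profits([55] * (N - 1) + [70])]
    + [period_profits([70] * N)] * 2
)

if __name__ == "__main__":
    for K in (1, 10):
        print(f"K={K}: imitates {best_action(history, K)}")
```

With this history the player with K = 1 only remembers periods in which 70 was played and keeps it, while the player with K = 10 still remembers the higher profits at 55 and returns to it.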

The difference in behaviour between the players, which is pictured in the figures in Appendix II.i, is mostly that smaller quantities are more often imitated by the players with long memory, which is in line with the process described before. In addition, when heterogeneous memory is introduced, the distribution of imitated quantities shifts to the right in comparison with the distribution for the smallest memory in the homogeneous market. This unexpected phenomenon has an explanation similar to the reasoning for all players following the player with the smallest memory. The imitation of low quantities by players with long memory improves the average profit of the high quantities played by players with short memory. Because of this, these high quantities, which would not have been stable before, are now more attractive. Therefore, heterogeneous memory leads to a shift of the distribution to the right.

In the next example a market is considered in which a player has no memory. Figure 3b shows the distribution for a player in such a market, from which it becomes clear that the outcome tends to the Walrasian quantity. The player with no memory is fully responsible for this. Since this player only chooses high quantities, great disadvantages are created for other players who do not adopt them. If each firm plays quantities close to the Walrasian outcome, a mutation to a significantly lower quantity will be adopted by all players with memory. Therefore, a bump can be identified between the cartel quantity and the Cournot quantity. The players who play the lower quantity create an extra advantage for the average profit of the quantity around Walras and therefore make it extra appealing. Appendix II.ii shows the distributions of the imitated quantities for every player in this market. It can be observed that the bump varies over the different memory lengths. This is most likely caused by the fact that players with shorter memory are more likely to forget the disadvantages of playing a lower quantity and will adopt mutations more frequently.

In both examples, the results show the leading role of the player with the smallest memory and the similar behaviour of the other players in the market. It is therefore interesting to see the exact difference in profit between these players. To simplify this analysis, a comparison is made between two groups of players with different memory lengths. To compare these groups, the number of firms is set to N=6 for simplicity.

The difference in profit between players in the same market with different memory lengths is shown in Figure 4. It shows the difference in payoff over the last 10 000 iterations between group 1, with a variable memory length, and group 2, with a fixed memory of K=5. The results show that the players with the smallest memory length have the highest profit in comparison with the group with longer memory. This arises from the fact that players with longer memory adopt lower quantities more often and therefore produce less. Nevertheless, the difference in profit is very small and the absolute payoffs are therefore almost equal. Only when a comparison is made between a group of players without memory and the group with a fixed memory of 5 is a big difference found. This also follows from the fact that players with longer memory adopt lower quantities more often, but in this case the effect is more extreme.

Now a comparison is made between the outcomes of the simulations and the theoretical predictions in Section 2. According to both the simulation results and the results found by analysing the ‘Average Difference’ function, the Walras quantity is indeed reached when no players have memory. However, the possibility of drifting off to a quantity close to the Walrasian outcome was not mentioned. This movement is only possible if multiple firms are allowed to mutate at the same time, and in the long run firms will always return to the Walras quantity. This is in line with the stated theory.

Next, it was conjectured that for a market with memory the distribution of results would lie around the Cournot quantity. This conjecture was not found to be correct, since it only holds for a memory of one. Alós-Ferrer (2004) proposed an argument for the increased relevance of the absolute payoff in contrast with the relative payoff. Whereas in the ‘Imitate the best’ model of Alós-Ferrer a combination of both was found, with the ‘Best action’ rule the relevance shifts to the absolute payoff as the memory length increases. This resulted in the introduction of the cartel quantity in this research, which was not relevant for the results of Alós-Ferrer.

In Section 2 it was briefly mentioned that the memory length was not influential for the ‘Imitate the best’ model. According to Alós-Ferrer and Ania (2009), it would change the dynamics, but the resulting stochastically stable states would remain the same. Because the length of the memory matters greatly when an average is taken, this outcome does not match our results. There are stochastically stable states, but the process will never converge to a unique state, and the market will always keep moving.

The results of this research are based on simulations and are hence only an approximation. Most questionable in this research is the criterion for a quantity to be included in the histogram. If a mutation is adopted but corrected immediately after, this observation is taken into account, although it does not correspond to a stable state. Only taking the last observation is not useful either, since the market is never found exactly stable and this would therefore exclude useful information. It would be best to find a criterion under which quantities are counted only when they are stable, in the sense that they are not merely mutated quantities. However, the advantage of the current approach is that the distributions show how often a specific state is reached. This outcome can therefore be more useful in practice than a strict theorem only about the location of the stable states.

Due to the strong assumption of a linear model, this research has very low generality. In addition, the use of fixed market parameters results in a potential argument rather than formal evidence. To increase the applicability of these theories, more testing is needed. It would therefore be most useful to look for an analytical proof, but also to extend the research by varying the market parameters and relaxing the conditions of a linear model. Future research could also investigate whether mutating to a quantity close to the imitated quantity would change the results, since this would be more realistic than mutating to a random quantity. A different extension could be to include the option to choose the memory length in the game, in addition to the quantity.

The phenomenon of actually reaching the cartel quantity is valid in real-life imitation games. When a player deviates from his own best response to the other players, this behaviour is adopted. Thereafter, however, the entire market will return to the cartel quantity, since profits are higher. This gives a validation for the occurrence of a cartel, and this imitation rule can therefore be seen as an argument for legal cartels to exist.


5 Conclusion

The ‘Best action’ rule is an imitation strategy in which firms imitate the quantity that made the best average profit over the last K periods. In this thesis the closely related ‘Imitate the best’ model is used to form expectations. Without memory, this model results in the Walrasian quantity, since decisions are based only on relative profits. When memory is introduced, a set of stable quantities between the Cournot and Walrasian outcomes is found. Memory shifts the attention to the absolute payoff rather than the relative payoff. Since the ‘Best action’ rule is strongly focused on the absolute payoff, it was proposed that the stable quantities of a market with memory would lie around the Cournot outcome.

As for the model without memory, the simulation results were in line with the Walras quantity, as expected. The simulations for models including memory did not meet the expectations. In the simulation for K = 1, the distribution lies around the Cournot quantity. However, when the memory length increases, the distribution of the imitated quantities slowly shifts to the cartel quantity. When all players play the cartel quantity, the absolute maximum profit for each player is reached. Therefore, the increase in memory results in an increase of the importance of the absolute payoff, rather than the difference between players. This market can be described as highly uncompetitive. The distribution results are also in line with the analysis of the ‘Average Difference’ function. The quantities at which a player does not want to deviate, given his memory length and the duration of the mutation, are in line with the simulated distributions for the specific memory lengths.

Allowing heterogeneous memory across players creates very different results. The resulting distributions are most similar to the distributions of the smallest memory in a homogeneous market, although they seem to deviate slightly to the right. This is explained by the fact that players with long memory adopt smaller quantities, whereas players with shorter memory adopt higher quantities, which become more profitable than in the homogeneous case. This results in the highest average profit for the players with the smallest memory.

The imitation behaviour of adopting the quantity with the best average profit results in a dynamic market with a lot of movement, and therefore no fixed solutions. The increase in memory results in less competitive outcomes and a high absolute payoff. By imitating the best average, a possible justification is found for playing the cartel quantity.


References

Alós-Ferrer, C. (2004). Cournot versus Walras in dynamic oligopolies with memory. International Journal of Industrial Organization, 22 (2), 193-217.

Alós-Ferrer, C., & Ania, A. (2009). Robustness of perfectly competitive equilibria to memory in imitative learning. (Working paper)

Apesteguia, J., Huck, S., & Oechssler, J. (2007). Imitation - theory and experimental evidence. Journal of Economic Theory, 136 , 217-235.

Bigoni, M. (2010). What do you want to know? Information acquisition and learning in experimental Cournot games. Research in Economics, 64 (1), 1-17.

Ellison, G., & Fudenberg, D. (1995). Word-of-mouth communication and social learning. The Quarterly Journal of Economics, 110 (1), 93-125.

Hamilton, W. D. (1970). Selfish and spiteful behavior in an evolutionary model. Nature, 228 , 1218-1225.

Kandori, M., Mailath, J., & Rob, R. (1993). Learning, mutation, and long run equilibria in games. Econometrica, 61 (1), 29-56.

Matthey, A. (2010). Imitation with intention and memory: An experiment. The Journal of Socio-Economics, 39 , 585-594.

Pickford, M. (n.d.). BrainyQuote.com. Retrieved May 12, 2016, from http://www.brainyquote.com/quotes/authors/m/mary pickford

Pingle, M., & Day, R. (1996). Modes of economizing behavior: Experimental evidence. Journal of Economic Behavior and Organization, 29 , 191-209.

Possajennikov, A. (2003). Imitation dynamic and Nash equilibrium in Cournot oligopoly with capacities. International Game Theory Review , 5 (3), 291-305.

Rhode, P., & Stegeman, M. (1995). Non-Nash equilibria of Darwinian dynamics (with applications to duopoly). International Journal of Industrial Organization, 19 (3), 415-453.

Schaffer, M. (1988). Evolutionarily stable strategies for a finite population and a variable contest size. Journal of Theoretical Biology, 132 , 469-478.

Schlag, K. (1998). Why imitate, and if so, how? Journal of Economic Theory, 78 (1), 130-156.

Vega-Redondo, F. (1997). The evolution of Walrasian behavior. Econometrica, 65 (2), 375-384.

Weibull, J. (1995). Evolutionary game theory. Cambridge, Massachusetts: The MIT Press.


Appendices

I ‘Average Difference’ extended table

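| K \ i | 0 | 1 | 2 | 3 | 5 | 10 | 25 | 100 | 200 | 1000 | 1 000 000 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 99.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 90.0 | 70.71 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 87.35 | 66.0 | 64.56 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 86.09 | 64.56 | 60.61 | 61.87 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 85.34 | 63.87 | 59.4 | 58.23 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 84.86 | 63.46 | 58.81 | 57.11 | 59.4 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 84.51 | 63.19 | 58.46 | 56.57 | 56.04 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 84.26 | 63.0 | 58.23 | 56.25 | 55.0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 84.06 | 62.86 | 58.07 | 56.04 | 54.49 | 0 | 0 | 0 | 0 | 0 | 0 |
| 9 | 83.9 | 62.75 | 57.95 | 55.89 | 54.2 | 0 | 0 | 0 | 0 | 0 | 0 |
| 10 | 83.77 | 62.66 | 57.86 | 55.77 | 54.0 | 57.32 | 0 | 0 | 0 | 0 | 0 |
| 15 | 83.37 | 62.39 | 57.59 | 55.47 | 53.56 | 52.27 | 0 | 0 | 0 | 0 | 0 |
| 20 | 83.16 | 62.26 | 57.46 | 55.34 | 53.39 | 51.86 | 0 | 0 | 0 | 0 | 0 |
| 25 | 83.03 | 62.19 | 57.39 | 55.27 | 53.31 | 51.7 | 55.96 | 0 | 0 | 0 | 0 |
| 30 | 82.95 | 62.13 | 57.34 | 55.22 | 53.26 | 51.62 | 51.14 | 0 | 0 | 0 | 0 |
| 35 | 82.88 | 62.1 | 57.31 | 55.19 | 53.22 | 51.57 | 50.74 | 0 | 0 | 0 | 0 |
| 40 | 82.84 | 62.07 | 57.28 | 55.16 | 53.19 | 51.54 | 50.59 | 0 | 0 | 0 | 0 |
| 45 | 82.8 | 62.05 | 57.26 | 55.14 | 53.17 | 51.52 | 50.52 | 0 | 0 | 0 | 0 |
| 50 | 82.77 | 62.03 | 57.25 | 55.13 | 53.16 | 51.5 | 50.47 | 0 | 0 | 0 | 0 |
| 100 | 82.64 | 61.95 | 57.18 | 55.06 | 53.09 | 51.43 | 50.34 | 55.24 | 0 | 0 | 0 |
| 200 | 82.57 | 61.91 | 57.15 | 55.03 | 53.06 | 51.4 | 50.3 | 49.75 | 55.12 | 0 | 0 |
| 500 | 82.53 | 61.89 | 57.13 | 55.01 | 53.05 | 51.38 | 50.28 | 49.71 | 49.61 | 0 | 0 |
| 1000 | 82.51 | 61.88 | 57.12 | 55.01 | 53.04 | 51.37 | 50.28 | 49.7 | 49.6 | 55.02 | 0 |
| 10000 | 82.5 | 61.88 | 57.12 | 55.0 | 53.04 | 51.37 | 50.27 | 49.7 | 49.6 | 49.52 | 0 |
| 1.0·10^10 | 82.5 | 61.87 | 57.11 | 55.0 | 53.04 | 51.37 | 50.27 | 49.7 | 49.6 | 49.52 | 49.5 |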

Table I: Stable quantity for various K and i

Note: For K and i equal, the quantities might seem larger than expected. This results from the fact that when K = i, the last period in which all firms played the ‘old’ quantity is removed from the memory. Only the profit of the mutation in combination with this ‘old’ quantity is part of the average.


II Distributions heterogeneous K

II.i All players memory

Figure I: Histogram of imitated quantity for different memory lengths. Market: K=[1 2 10 50 100]. Panels: (a) K=1, (b) K=2, (c) K=10, (d) K=50, (e) K=100.

II.ii No memory for some player

Figure II: Histogram of imitated quantity for different memory lengths. Market: K=[0 1 2 50 100]. Panels: (a) K=0, (b) K=1, (c) K=2, (d) K=50, (e) K=100.

Note: The indication of the Walrasian quantity has been left out, since the peak of the distributions coincides with this value.
