
On the Effect of Local Interaction in Heterogeneous Markets

Friso Cuijpers

Student number: 10216995

Date of final version: July 27, 2015

Master’s programme: Econometrics

Specialisation: Mathematical Economics

Supervisor: M. J. van der Leij

Second reader: J. Tuinstra


Statement of Originality

This document is written by Student Friso Cuijpers who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Contents

1 Introduction
2 The Model
  2.1 Cournot Model
    2.1.1 Global Interaction
    2.1.2 Local Interaction
  2.2 Learning Rules
    2.2.1 Global Interaction
    2.2.2 Local Interaction
  2.3 Moran Process
    2.3.1 Global Interaction
    2.3.2 Local Interaction
3 Results
  3.1 Global Interaction
  3.2 Circle Network
  3.3 Grid Network
  3.4 Star Network
4 Discussion
  4.1 Reproduction of reinforcement learners
  4.2 Reinforcement learning parameters
5 Conclusion
6 References
7 Appendix A


Chapter 1

Introduction

Firms compete with each other every day, and many economic models try to explain and predict how firms behave. One of the most used models is the Cournot model (Cournot, 1838), which models competition among firms on quantity. One of the major assumptions in the Cournot model is the rationality of all agents. This assumption seems unrealistic, as it would mean that every agent knows what all the other agents are going to do and uses that information to react to them perfectly. In real life it seems more realistic that firms learn what their competitors are probably going to do based on the past; one of the first to do research in this area was Winter (1964). Winter introduces the term "routines": a routine is the set of strategic characteristics of a firm. Firms using routine A could outperform firms using routine B; over time firms using routine B will switch to routine A and routine B will go extinct, like a form of natural selection. Nowadays we speak of learning rules instead of routines, and still a lot of research is being done on agents using learning rules.

Andreoni and Miller (2002) show for a dictator game that agents are heterogeneous, that is, players use different learning rules. If different learning rules are used in practice, which learning rules are used and which ratio of learning rules is evolutionary stable? Droste, Hommes and Tuinstra (2002) and Ochea (2010) study firms competing in a Cournot market using two different learning rules. Firms can switch between learning rules based on past profits. They are interested to find which set of learning rules is evolutionary stable and try to find this by modelling the problem with replicator dynamics. They assume that all firms compete equally strongly with all other firms, that is, they interact globally, and since they use replicator dynamics, they also assume an infinitely large market. One could question whether such global interaction is realistic; bakeries within a town compete more strongly with other bakeries nearby than with a bakery at the other end of town, and the number of bakeries within a town is certainly not infinite.

This paper will extend upon Droste's and Ochea's research by modelling a Cournot market in a finite line, grid and star network with firms choosing between two learning rules. The paper focuses on two learning rules: myopic best response and reinforcement learning. Firms only compete with their direct neighbours and can switch between the learning rules. Strategic interaction with neighbours or connections only is called local interaction. The paper tries to answer the questions: does adding a network structure have an effect on the evolutionary stable set of learning rules in a Cournot market with heterogeneous players, and if so, why and how? Why would it be interesting to look at competition in a finite market with a network structure? Besides making the model more realistic, it is also to be expected that the results change. Even though no research has been done on Cournot competition with learning rules in a lattice network in the past, a lot of research has been done on cooperation in networks, and those studies typically show that local interaction leads to different results than global interaction. Some network structures can support the evolution of cooperation (Goyal, 2007). Tieman, van der Laan and Houba (2001) look at a prisoners' dilemma set by Bertrand competition in a lattice. Firms can only choose to ask the Nash equilibrium price or the cartel price and locally interact with their eight nearest neighbours. They find that even though the Nash equilibrium price is favoured with global interaction, with local interaction in a lattice approximately 50% of all firms ask the cartel price. Ohtsuki et al. (2006) find that natural selection in a finite network favours cooperation if the benefit of cooperation divided by the cost exceeds the average number of neighbours. Hence, the introduction of local interaction in a network structure can lead to different results than expected with global interaction.

Even though firm competition using myopic best response and reinforcement learning might not be exactly the same as a prisoners' dilemma, there is a strong connection to the prisoners' dilemma. Waltman and Kaymak (2008) study firms competing in a Cournot market and using a reinforcement learning rule, more specifically a Q-learning rule, to learn what their competitors are going to do. Waltman and Kaymak show that firms generally learn to collude with each other, but do not reach the maximum joint profit. A group of reinforcement learners learns to produce a quantity near the cartel quantity every period. The best reaction to the cartel quantity is not the cartel quantity; the best reaction is the profit maximizing quantity, thus a firm using the best reaction within a group of colluding reinforcement learners will gain more profits than the reinforcement learners. A myopic best response learner produces the best reaction to the aggregate quantity of last period, so a myopic best response learner within a group of reinforcement learners will probably outperform the reinforcement learners. As the myopic best response learner outperforms the others, it is more likely that the reinforcement learners switch to myopic best response, and thus the number of myopic best response learners increases. However, Theocharis (1960) has shown that the produced quantities in a market with 3 or more firms consisting solely of firms using the myopic best response rule are unstable, assuming that the myopic best response players play simultaneously. Leleno (1994) shows that markets consisting solely of myopic best response players converge to the Nash equilibrium if the produced quantities are updated sequentially. The big fluctuations in the quantities typically lead to lower payoffs, so when close to the cartel (or Nash) equilibrium it becomes tempting to switch to the myopic best response rule, but if all firms switch the market becomes unstable and payoffs decrease, which is much like a prisoners' dilemma.

Why focus on myopic best response and reinforcement learning? The reason to focus on only two learning rules is simplicity: the model becomes hard to track when there are three or more learning rules. The main reason to focus on these two particular learning rules is to create a setting that is as realistic as possible. Bosch-Domènech and Vriend (2003) try to find and explain imitation of other players in Cournot markets. When that failed, Bosch-Domènech and Vriend did show which learning rule(s) fitted the behaviour of players best. They show that a reinforcement learning rule with discounting fits individual choices best. This paper also focuses on myopic best response as it is simple and realistic in fairly stable markets.

Using replicator dynamics implies assuming an infinitely large market, so an alternative is needed for the model. Instead, the dynamics will be modelled by a Moran process; a detailed introduction to Moran processes is given by Nowak (2006). In short, a Moran process assumes that each period every player has an equal chance to die and all the players that are left have a chance to be selected for reproduction, proportional to their payoffs. If a player using learning rule A is chosen for reproduction, then the new player will also use learning rule A. Moran processes in a network are also explained by Nowak. In my model every player will still have an equal chance to die, but only the players that have a direct connection to the dead player have a chance to be chosen for reproduction, proportional to their payoffs.

We model strategic interaction in a network structure in the same way as Bramoullé, Kranton and D'Amours (2014). The model defines the connections of a player as the direct neighbours of that player; for this to hold for all players we need to assume that all borders of the network are connected to each other. Connecting the borders creates a circle for the line network and a torus for the grid network. The idea is to model a Cournot market around every firm, such that every firm has its own Cournot market with its own price and aggregate quantity. This also means that two neighbours are both in each other's market, but can both ask a different price. This is normally not the case in a Cournot market, where every firm asks the same price. One could say that the location of a firm also determines the price in our model, which is something that is seen in practice as well. A sandwich is more expensive at a train station than at the local supermarket.

The markets with global interaction and a grid network both converge to a market full of reinforcement learners. The main reason is that the produced quantity of a myopic best response learner is unstable if it has too many myopic best response learner neighbours. The market with a circle network converges to the opposite, a market full of myopic best response learners. Every player has two neighbours in a circle network and a market with 3 myopic best response learners converges to the Nash equilibrium. The myopic best response learners outperform the reinforcement learners for this reason. A star network converges to a fraction of reinforcement learners of 22% on average. Every single repetition converges to a fraction of 0 or 1, but the average of 2500 repetitions converges to 22%. The centre firm in a star network always produces (almost) nothing. The myopic best response learners in the periphery produce the maximum quantity every period, as they respond to 0 every period. The reinforcement learners can learn to pick high quantities, but fail to do so every period because they pick a quantity stochastically. So the myopic best response learners take over the market more often, and thus the market converges to a fraction of reinforcement learners that is smaller than 1/2 on average. Clearly, network structures have an effect on the evolutionary stable set of learning rules.

The rest of the paper is organized as follows. Section 2 discusses the model and learning rules, Section 3 analyses the model output, Section 4 is the discussion and Section 5 contains the conclusion.


Chapter 2

The Model

This chapter contains an explanation of the models used. First the Cournot model is discussed, then the learning rules and finally the Moran process. The model in this paper will be a combination of the three: a Cournot market in which firms using certain learning rules compete, and the dynamics of firms switching between learning rules will be modelled with a Moran process. The paper investigates whether local interaction has an effect on the market, thus the cases with local interaction need to be compared to the case with global interaction. Since a finite market is considered, it is impossible to compare the local interaction cases with a model that uses replicator dynamics. Instead, three models will be introduced: one with global interaction and two with local interaction, all with a Cournot market, firms using the same learning rules and dynamics modelled with a Moran process. All models are programmed and results are obtained by running computer simulations in Matlab. For every network structure we run 2500 simulations.

2.1 Cournot Model

2.1.1 Global Interaction

The Cournot model and its parameter values are based on the model introduced by Ochea (2010). Consider an oligopoly Cournot market with $N$ firms. $q_i$ is the quantity produced by firm $i$ and $Q$ is the aggregate quantity, $Q = \sum_{j=1}^{N} q_j$. $P(Q)$ denotes the inverse demand function; $P(Q)$ is a twice continuously differentiable function with $P(Q) \geq 0$ and $P'(Q) \leq 0$ for every $Q$. The cost function is $C(q_i)$ for every firm $i$ and is twice continuously differentiable, with $C(q_i) \geq 0$ and $C'(q_i) \geq 0$ for every $q_i$. $Q_{-i}$ denotes the aggregate quantity minus $q_i$, $Q_{-i} = \sum_{j \neq i} q_j = Q - q_i$. Every firm tries to maximize their profits, $\pi_i = q_i P(Q) - C(q_i)$. The quantity that maximizes their profit is given by the reaction function $R$, $q_i = R(Q_{-i})$, if the second-order condition, $2P'(Q) + q_i P''(Q) - C''(q_i) \leq 0$, is met.
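For readers who want the intermediate step (a sketch using only the assumptions above; this derivation is not spelled out in the original text), the reaction function follows from the first-order condition of firm $i$'s profit maximization problem, taking $Q_{-i}$ as given:

$$ \frac{\partial \pi_i}{\partial q_i} = P(Q) + q_i P'(Q) - C'(q_i) = 0, \qquad Q = q_i + Q_{-i}. $$

Solving this condition for $q_i$ defines $q_i = R(Q_{-i})$; the second-order condition stated above guarantees that the stationary point is indeed a profit maximum.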

We need to look at a dynamical Cournot model to implement learning rules. The dynamical Cournot model is almost the same as the Cournot model explained above; only a subscript $t$ needs to be added.

$$ q_{i,t}^{*} = R(Q_{-i,t}) \qquad (2.1) $$

The considered Cournot model is a linear Cournot model. The inverse demand function and cost function used are:

$$ P(Q_t) = a - bQ_t \qquad (2.2) $$

$$ C(q_{i,t}) = cq_{i,t} \qquad (2.3) $$

We impose $q_{i,t} < \frac{a}{Nb}$ to ensure that all prices are positive. The condition on $q_{i,t}$ can be seen as a budget or technological constraint. With this inverse demand function and cost function, the profit function and reaction function are:

$$ \pi_{i,t} = q_{i,t}(a - bQ_t) - cq_{i,t} \qquad (2.4) $$

$$ R(Q_{-i,t}) = \begin{cases} \dfrac{a - c - bQ_{-i,t}}{2b}, & \text{if } Q_{-i,t} \leq \dfrac{a-c}{b} \\ 0, & \text{if } Q_{-i,t} > \dfrac{a-c}{b} \end{cases} \qquad (2.5) $$

Values for $a$, $b$ and $c$ are: $a = 17$, $b = 1$ and $c = 10$. These are the same parameter values as used by Ochea (2010). For these parameter values all conditions are met.
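As a concrete illustration, here is a minimal Python sketch of the reaction function (2.5) with these parameter values (an illustrative sketch only; the author's simulations were written in Matlab, and the function and variable names below are my own):

```python
# Illustrative sketch of the linear reaction function (2.5); not the original Matlab code.
a, b, c = 17.0, 1.0, 10.0   # parameter values used throughout the paper

def reaction(Q_minus_i):
    """Best reply to the competitors' aggregate quantity Q_{-i}."""
    if Q_minus_i <= (a - c) / b:
        return (a - c - b * Q_minus_i) / (2 * b)
    return 0.0

# In a symmetric N-firm Cournot-Nash equilibrium every firm produces q* = (a-c)/((N+1)b).
N = 25
q_star = (a - c) / ((N + 1) * b)
print(reaction((N - 1) * q_star))   # prints q*: the Nash quantity is a fixed point of R
```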

2.1.2 Local Interaction

To model Cournot competition in networks, a market is considered with N players. Figure 2.1 shows every network structure that is considered and for each network structure an example of firm i and its neighbours. Every firm i will compete with its k direct neighbours, see Figure 2.1. For the circle, k = 2. For the lattice, k = 8. k depends on the position of firm i for the star network. Note that for the circle and grid network the edges have to be connected for every firm to have k neighbours.

Figure 2.1: Firm i (black dot) and its neighbours (grey dots) in several networks. Top left shows the circle network. Top right shows the grid network. Bottom left shows the star network with firm i in the periphery. Bottom right shows the star network with firm i in the centre.

The way to model the Cournot competition in a network structure will be as in Bramoullé, Kranton and D'Amours (2014). To define the neighbours of firm $i$, let $g_{i,j} = 1$ hold if firms $i$ and $j$ are neighbours, let $g_{i,j} = 0$ hold if firms $i$ and $j$ are not neighbours, and let $g_{i,i} = 1$ always hold. Every firm $i$ competes in a standard Cournot market with its $k$ neighbours; a standard Cournot market can be modelled around every firm $i$, but some notation needs to be adjusted so it can be used for all $i$. Consider an oligopoly Cournot market with $k+1$ firms. $q_i$ is the quantity produced by firm $i$ and $Q_i$ is the local aggregate quantity, $Q_i = \sum_{j=1}^{N} g_{i,j} q_j$. $P_i(Q_i)$ denotes the inverse demand function; $P_i(Q_i)$ is a twice continuously differentiable function with $P_i(Q_i) \geq 0$ and $P_i'(Q_i) \leq 0$ for every $Q_i$. The cost function is $C(q_i)$ for every firm $i$ and is twice continuously differentiable, with $C(q_i) \geq 0$ and $C'(q_i) \geq 0$ for every $q_i$. $Q_{-i}^{i}$ denotes the local aggregate quantity minus $q_i$, $Q_{-i}^{i} = \sum_{j \neq i} g_{i,j} q_j = Q_i - q_i$. Every firm tries to maximize their profits, $\pi_i = q_i P_i(Q_i) - C(q_i)$. The quantity that maximizes their profit is given by the reaction function $R$, $q_i = R(Q_{-i}^{i})$, if the second-order condition, $2P_i'(Q_i) + q_i P_i''(Q_i) - C''(q_i) \leq 0$, is met.

As before, we need a dynamical Cournot model to allow the implementation of learning rules.

$$ q_{i,t}^{*} = R(Q_{-i,t}^{i}) \qquad (2.6) $$

The considered Cournot model is a linear Cournot model. The inverse demand function and cost function used are:

$$ P_i(Q_{i,t}) = a - bQ_{i,t} \qquad (2.7) $$

$$ C(q_{i,t}) = cq_{i,t} \qquad (2.8) $$

We impose $q_{i,t} < \frac{a}{(k+1)b}$ for the circle and grid network to ensure that all prices are positive. For the star network $q_{i,t} < \frac{a}{Nb}$ is imposed, because all prices must be positive and the centre firm competes with all firms. The condition on $q_{i,t}$ can be seen as a budget or technological constraint. With this inverse demand function and cost function, the profit function and reaction function are:

$$ \pi_{i,t} = q_{i,t}(a - bQ_{i,t}) - cq_{i,t} \qquad (2.9) $$

$$ R(Q_{-i,t}^{i}) = \begin{cases} \dfrac{a - c - bQ_{-i,t}^{i}}{2b}, & \text{if } Q_{-i,t}^{i} \leq \dfrac{a-c}{b} \\ 0, & \text{if } Q_{-i,t}^{i} > \dfrac{a-c}{b} \end{cases} \qquad (2.10) $$

Values for $a$, $b$ and $c$ are: $a = 17$, $b = 1$ and $c = 10$. For these parameter values all conditions are met.
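As an illustration of how such a local market can be encoded (a sketch under my own assumptions; the helper names `circle_adjacency` and `local_reaction` are not from the thesis), the adjacency weights $g_{i,j}$ can be stored in a matrix and the local reaction (2.10) computed from it:

```python
# Illustrative sketch; not the original Matlab implementation.
import numpy as np

a, b, c = 17.0, 1.0, 10.0

def circle_adjacency(N):
    """g[i, j] = 1 if firms i and j are neighbours on the circle, with g[i, i] = 1."""
    g = np.eye(N)
    for i in range(N):
        g[i, (i - 1) % N] = g[i, (i + 1) % N] = 1.0
    return g

def local_reaction(i, q, g):
    """Best reply (2.10) of firm i to its local aggregate Q^i_{-i} = sum_j g[i, j] q_j - q_i."""
    Q_local_minus_i = g[i] @ q - q[i]
    if Q_local_minus_i <= (a - c) / b:
        return (a - c - b * Q_local_minus_i) / (2 * b)
    return 0.0

g = circle_adjacency(25)
q = np.full(25, (a - c) / (4 * b))   # symmetric Nash quantity of a 3-firm local market
print(local_reaction(0, q, g))       # prints the same quantity: it is a fixed point
```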

2.2 Learning Rules

2.2.1 Global Interaction

Firms that perfectly use the reaction function given in (2.5) or (2.10) are rational firms. Rationality is a strong assumption that is almost always violated. More realistic is to think of firms using certain learning rules to determine what their competitors are going to do based on past behaviour. This paper will focus on two learning rules: myopic best response and reinforcement learning.

Myopic best response means using the reaction function to react to the $Q_{-i}$ of the last period, $Q_{-i,t-1}$.

So myopic best response is defined by the following reaction function:

$$ q_{i,t} = R(Q_{-i,t-1}). \qquad (2.11) $$

The myopic best response rule is very simple to use for firms and seems fairly realistic.

Reinforcement learners will try to learn the best action, $q_{i,t}^{*}$, for a certain state of the game. Reinforcement learning will be modelled as in Waltman and Kaymak (2008). With reinforcement learning a player observes the state $s_t$ and, based on the past and $s_t$, the player can pick an action $a$ from a finite action space $A$. Note that state $s_t$ can be defined as the aggregate quantity minus the quantity of firm $i$ at $t-1$, $Q_{-i,t-1}$. The player picks an action $a$ based on the Q-values. Q-values are the weight each action has, based on the state and the past. The Q-values are updated each period as follows:

$$ V_{i,t+1}(s_t, a) = (1 - \alpha)V_{i,t}(s_t, a) + \alpha\pi_{i,t}, \qquad (2.12) $$

in which $\alpha$ is the learning rate. A player picks an action $a$ with a probability that depends on the Q-values. The probability that a player picks action $a$ is:

$$ Pr(a) = \frac{\exp(V_{i,t}(s_t, a)/\beta)}{\sum_{a' \in A} \exp(V_{i,t}(s_t, a')/\beta)}, \qquad (2.13) $$

in which $\beta$ denotes the experimentation tendency. The higher $\beta$, the higher the chance that a player chooses to experiment. Experimenting means that the player picks an action that does not have the highest Q-value.

Reinforcement learning was originally made for discrete states and actions; the formulas above do not work for continuous spaces. To fix this the actions and states are put into a lookup table. The actions are produced quantities and the lookup table for the actions will range from 0 to $\frac{a}{Nb}$ in 30 steps. $\frac{a}{Nb}$ is chosen because the prices stay positive if every firm produces this quantity or less. The reinforcement learners are not able to pick every possible quantity, but the steps are small enough for it to be viable. The states are the aggregate quantity minus the quantity of firm $i$ at time $t-1$, $Q_{-i,t-1}$. The states in the lookup table will range from 0 to $\frac{(N-1)a}{Nb}$ in 30 steps. $\frac{(N-1)a}{Nb}$ is chosen because the prices stay positive if every firm produces $\frac{a}{Nb}$ or less. Clearly, the actual $Q_{-i,t-1}$ can be any number between 0 and $\frac{a}{b}$, because the myopic best response learners can pick any quantity. The reinforcement learners will look up the state that is closest to the actual $Q_{-i,t-1}$ in the lookup table.
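A minimal Python sketch of the update rule (2.12) and the choice probabilities (2.13) on these 30-step grids (an illustration under my own assumptions: the baseline values of $\alpha$ and $\beta$ are not reported in the text, so the ones below are placeholders, and all names are my own):

```python
# Illustrative sketch of the discretized reinforcement learning rule; not the original code.
import numpy as np

a, b, N = 17.0, 1.0, 25
n_steps = 30
alpha, beta = 0.25, 2.0              # placeholder values; the baseline values are not reported

actions = np.linspace(0.0, a / (N * b), n_steps)            # possible quantities
states = np.linspace(0.0, (N - 1) * a / (N * b), n_steps)   # possible values of Q_{-i,t-1}
V = np.zeros((n_steps, n_steps))                            # Q-values V[state, action]
rng = np.random.default_rng(0)

def choose_action(Q_minus_last):
    """Map Q_{-i,t-1} to the nearest state and draw an action from the softmax (2.13)."""
    s = int(np.argmin(np.abs(states - Q_minus_last)))
    p = np.exp(V[s] / beta)
    p /= p.sum()
    k = rng.choice(n_steps, p=p)
    return s, k, actions[k]

def update(s, k, profit):
    """Update rule (2.12) for the chosen state-action pair, given the realized profit."""
    V[s, k] = (1 - alpha) * V[s, k] + alpha * profit
```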

2.2.2 Local Interaction

Only some notation needs to be changed in order to model the learning rules for local interaction, as the learning is only based on the neighbours of firm i.

Myopic best response is:

$$ q_{i,t} = R(Q_{-i,t-1}^{i}). \qquad (2.14) $$

Reinforcement learning is almost exactly the same with local interaction, as firm $i$ tries to learn its own optimal quantity based on its own past. The only things that change are the bounds of the lookup tables for the states and actions. Reinforcement learning is:

$$ V_{i,t+1}(s_t, a) = (1 - \alpha)V_{i,t}(s_t, a) + \alpha\pi_{i,t}, \qquad (2.15) $$

$$ Pr(a) = \frac{\exp(V_{i,t}(s_t, a)/\beta)}{\sum_{a' \in A} \exp(V_{i,t}(s_t, a')/\beta)}. \qquad (2.16) $$

For the circle and grid network, the actions lookup table will range from 0 to $\frac{a}{(k+1)b}$ in 30 steps and the states lookup table will range from 0 to $\frac{ka}{(k+1)b}$ in 30 steps. For the star network, the actions lookup table will range from 0 to $\frac{a}{Nb}$ in 30 steps and the states lookup table will range from 0 to $\frac{(N-1)a}{Nb}$ in 30 steps.

2.3 Moran Process

2.3.1 Global Interaction

The Moran process will be modelled as introduced by Nowak (2006). To keep it general, let's call myopic best response learners 'A players' and reinforcement learners 'B players' in this section. Every period all players have an equal chance to die and all living players can be chosen for reproduction to replace the dead player, with a chance proportional to their payoffs in that period. Offspring of an A (B) player will be an A (B) player. All players have chance $\frac{1}{N}$ to die every period. If player $j$ dies, player $i$ (with $i \neq j$) has chance $\frac{\exp(\pi_{i,t})}{\sum_{l \neq j} \exp(\pi_{l,t})}$ to be chosen for reproduction.


2.3.2 Local Interaction

The Moran process changes a little bit when looking at local interaction; this will also be modelled as introduced by Nowak (2006). Every period all players have an equal chance to die and the neighbours of the dead player can be chosen for reproduction to replace the dead player, with a chance proportional to their payoffs in that period. Note that for the Moran process it does not seem interesting to look at an A player dying if it is surrounded by all A players, as the A player will be replaced by an A player with chance 1, but skipping this period would be incorrect, as the period does have an effect on the produced quantities, which in turn can affect future chances. All players have chance $\frac{1}{N}$ to die every period. If player $j$ dies, player $i$ (with $i \neq j$) has chance $\frac{g_{i,j}\exp(\pi_{i,t})}{\sum_{l \neq j} g_{l,j}\exp(\pi_{l,t})}$ to be chosen for reproduction.

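To make the timing concrete, here is a minimal Python sketch of one death-birth step of this local Moran process (an illustrative sketch, not the author's Matlab code; `g` is the adjacency matrix with $g_{i,i} = 1$, `is_RL` marks which firms are reinforcement learners, and global interaction corresponds to `g` being the all-ones matrix):

```python
# Illustrative sketch of one death-birth step; not the original implementation.
import numpy as np

rng = np.random.default_rng(0)

def moran_step(is_RL, profits, g):
    """One Moran step: a uniformly chosen firm dies and inherits the learning rule of a
    neighbour chosen with probability proportional to exp(profit), as described above."""
    N = len(is_RL)
    dead = rng.integers(N)               # every firm dies with chance 1/N
    weights = g[dead].copy()
    weights[dead] = 0.0                  # the dead firm itself cannot reproduce
    weights = weights * np.exp(profits)  # g_{i,j} * exp(pi_{i,t}) for every candidate parent
    parent = rng.choice(N, p=weights / weights.sum())
    new_is_RL = is_RL.copy()
    new_is_RL[dead] = is_RL[parent]      # offspring uses the parent's learning rule
    return new_is_RL
```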

Chapter 3

Results

This chapter contains the results and analysis of the results. First we look at global interaction. Later we look at how local interaction with the different network structures compares to global interaction.

3.1 Global Interaction

This section contains the results for global interaction. First we will look at the fraction of reinforcement learners for some repetitions and the average fraction of reinforcement learners over all repetitions. Then we will explain what happens to the fraction of reinforcement learners by looking at a firm that is a reinforcement learner at t = 0 and a firm that is a myopic best response learner at t = 0.

Figure 3.1a shows the fraction of reinforcement learners over time for repetition 1 and 2. Figure 3.1b shows the average fraction of reinforcement learners over time. It is the average over 2500 repetitions, of which two are seen in figure 3.1a.

The overview in Figure 3.1 shows that the fraction of reinforcement learners converges to 1 over time. To understand why, let's look at one of the repetitions in detail: repetition 1. It is unnecessary to look at all 1000 periods to understand what is happening, so let's focus on the first 100 periods.

A firm is more likely to be chosen for reproduction when it has higher profits. Figures 3.2a and 3.2b show the produced quantities of firm 1 and 2. Firm 1 is a myopic best response learner till period 60 and a reinforcement learner from period 60 onward. Firm 2 is a reinforcement learner for all periods.

(a) The fraction of reinforcement learners over time for some repetitions.
(b) The average fraction of reinforcement learners over time. This is the average over 2500 repetitions.

Figure 3.1: Results overview for global interaction. The market contains 25 players.

(a) The produced quantity of firm 1 over time. (b) The produced quantity of firm 2 over time.

Figure 3.2: Produced quantities of firm 1 and 2 for repetition 1. Firm 1 is a myopic best response learner at t = 0 and firm 2 is a reinforcement learner at t = 0.

Figure 3.2a shows that firm 1 bounces between the maximum and minimum quantity during the first 35 periods. The produced quantities of firm 1 are so unstable because the fraction of reinforcement learners is too low. Theocharis (1960) has shown that a market consisting solely of myopic best response learners is unstable if the market has 4 or more firms. It is to be expected that the quantities of myopic best response learners are unstable when the fraction of reinforcement learners is too low. Figure 3.3b shows the profits of firm 1 and firm 2 over time. Firm 1 generates profits that are either 0 or negative during the first 35 periods, because it produces too much or nothing. So firm 2 outperforms firm 1 during these first 35 periods. Firm 2 is the reinforcement learner and generates more profits, so it is more likely that the fraction of reinforcement learners increases during the first 35 periods.
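To see where this instability threshold comes from in the linear model (a sketch of the standard argument, not reproduced from Theocharis's paper), write the simultaneous best-response dynamics in deviations $x_{i,t} = q_{i,t} - q^{*}$ from the Nash quantity $q^{*} = \frac{a-c}{(N+1)b}$, ignoring the non-negativity kink in (2.5):

$$ x_{i,t+1} = -\frac{1}{2}\sum_{j \neq i} x_{j,t}, \qquad \text{i.e.} \qquad x_{t+1} = -\tfrac{1}{2}(J - I)\,x_t, $$

with $J$ the all-ones matrix. The eigenvalues of $-\tfrac{1}{2}(J - I)$ are $-\tfrac{N-1}{2}$ (once) and $+\tfrac{1}{2}$ (with multiplicity $N-1$), so deviations die out only for $N = 2$, keep a constant amplitude for $N = 3$, and grow without bound for $N \geq 4$. This is the instability that the myopic best response learners run into while the fraction of reinforcement learners is low.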

(a) The fraction of reinforcement learners over time. (b) The profits of firm 1 and 2 over time.

Figure 3.3: The fraction of reinforcement learners on the left and the profits of firm 1 and 2 on the right. Firm 1 is a myopic best response learner at t = 0 and firm 2 is a reinforcement learner at t = 0.

The produced quantities become a little more stable after the first 35 periods. From this point on the profits of the myopic best response learners and reinforcement learners are almost equal. Still the market is likely to converge to a market full of reinforcement learners. The first reason why the market is more likely to converge to a full market of reinforcement learners is that the fraction of reinforcement learners is so high after 35 periods that it is very likely that a reinforcement learner is chosen for reproduction. This does not depend on their profits, as their profits are almost equal at this point. The second reason is that if a myopic best response learner is chosen for reproduction several times, then the fraction of reinforcement learners goes down and the produced quantities of myopic best response learners start to become unstable again. Then the reinforcement learners are more likely to be chosen for reproduction, just like at the beginning.

3.2 Circle Network

This section contains the results for a circle network. We look at an overview of the results first. This overview shows the fraction of reinforcement learners over time for two repetitions and the average fraction of reinforcement learners over time. Later we explain the results seen in this overview by looking at four firms. These four firms form the border between a group of reinforcement learners and a group of myopic best response learners.

In figure 3.4a we see the fraction of reinforcement learners over time for repetition 1 and repetition 2. Figure 3.4b shows the average fraction of reinforcement learners over time for a circle network and for global interaction.

(a) The fraction of reinforcement learners over time for some repetitions.
(b) The average fraction of reinforcement learners over time. This is the average over 2500 repetitions.

Figure 3.4: Results overview for a circle network. The market contains 25 players.

Figure 3.4b shows that the fraction of reinforcement learners converges to 0 over time for a circle network, whereas the fraction of reinforcement learners converges to the opposite, 1, for global interaction. To understand why, let us look at one of the repetitions in detail: repetition 7. It is unnecessary to look at all 1000 periods to understand what is happening, so let's focus on the first 200 periods. Figure 3.5 shows the fraction of reinforcement learners over time for repetition 7.

Figure 3.5: For a circle network. The fraction of reinforcement learners over time for repetition 7. The market contains 25 players.

Figure 3.6 shows a schematic view of firms 6 to 11 of repetition 7. We focus on firms 7, 8, 9 and 10. The focus lies on those 4 firms as they are either surrounded by firms using the same learning rules as themselves or form the border between the two learning rules.

Figure 3.6: From left to right, these are firms 6 to 11. The red dots are reinforcement learners at t=0 and the white dots are myopic best response learners at t=0.

Figure 3.7 shows the produced quantities of firm 7 (figure 3.7a) and firm 8 (figure 3.7b). Firm 6 becomes a myopic best response learner at period 32. Firm 7 becomes a myopic best response learner at period 34. Firm 8 becomes a myopic best response learner at period 55. The produced quantities of firm 7 and 8 directly start converging towards the Nash equilibrium quantity around the time that firm 8 becomes a myopic best response learner.

(a) The produced quantity of firm 7 over time. (b) The produced quantity of firm 8 over time.

Figure 3.7: Produced quantities of firm 7 and 8 for repetition 7.

(a) The produced quantity of firm 9 over time. (b) The produced quantity of firm 10 over time.

Figure 3.8: Produced quantities of firm 9 and 10 for repetition 7.

Figure 3.8a shows the produced quantities of firm 9 and figure 3.8b shows the produced quantities of firm 10. Firm 9 and 10 are myopic best response learners for all 200 periods. Their produced quantities start converging towards the Nash equilibrium around period 55, when firm 8 becomes a myopic best response learner. Remember that Theocharis (1960) has shown that a market consisting of 4 or more myopic best response learners is unstable. The markets around every player consist of 3 players: the player and his two neighbours. A market with 3 or fewer myopic best response learners converges towards the Nash equilibrium.


(a) The profits of firm 7 and 10 over time. (b) The profits of firm 8 and 9 over time.

Figure 3.9: Profits of firm 7, 8, 9 and 10 for repetition 7.

Figure 3.9a shows the profits of firm 7 and firm 10. Figure 3.9a shows that firm 10 outperforms firm 7 almost every period. So the myopic best response learner surrounded by myopic best response learners outperforms the reinforcement learner surrounded by reinforcement learners.

Figure 3.9b shows the profits of firm 8 and firm 9. Figure 3.9b shows that firm 9 outperforms firm 8 in the first 34 periods on average, but firm 8 outperforms firm 9 after period 34. So the myopic best response learner outperforms the reinforcement learner when there are more reinforcement learners behind him, but the reinforcement learner outperforms the myopic best response learner when it is surrounded by myopic best response learners. This will converge to a fraction of reinforcement learners of 0, because there are only two options for firm 8: either one of his neighbours dies and firm 8 is most likely to be chosen for reproduction, or firm 8 dies himself. If a neighbour of firm 8 dies and firm 8 is chosen for reproduction, then firm 8 is back in the same situation as during periods 0 to 34. Firm 8 and his new reinforcement learner neighbour are outperformed by the myopic best response learners and thus it is very likely that one of them quickly becomes a myopic best response learner. The other option is that firm 8 dies. Firm 8 is surrounded by myopic best response learners, so he will become a myopic best response learner with chance 1 if he dies.

3.3 Grid Network

Here we discuss the results for a grid network. Let us look at the fraction of reinforcement learners over time for two repetitions and the average fraction of reinforcement learners over time first. Second we will explain what we saw in the first part by looking at two extra repetitions.

Figure 3.10a shows the fraction of reinforcement learners over time for repetitions 1 and 2. Figure 3.10b shows the average fraction of reinforcement learners over time for the grid network and for global interaction.

(a) The fraction of reinforcement learners over time for some repetitions.

(b) The average fraction of reinforcement learners over time. This is the average over 2500 repetitions.

Figure 3.10: Results overview for a grid network. The market contains 25 players.

The overview in figure 3.10 shows that a market with a grid network gives results very similar to a market with global interaction. The main difference is that the convergence is a bit slower.

We look at two artificial examples to analyse why the market converges to a full market of reinforcement learners. It was impossible to find perfect examples from the 2500 repetitions to analyse what happens to separate firms in the grid network. That is why we look at two artificial examples. The artificial examples are two extra repetitions in which the Moran process is turned off. The artificial examples are set up in the following way. The focus lies on one firm (firm 13). The neighbours of that firm are all myopic best response learners at the start. Every 20 periods one neighbour becomes a reinforcement learner. All firms that are not firm 13 or a neighbour of firm 13 are a myopic best response learner or a reinforcement learner with chance $\frac{1}{2}$. We look at two situations: when firm 13 is a myopic best response learner and when firm 13 is a reinforcement learner. This way, we can analyse the behaviour of both learning rules when the market converges.

Let’s look at the situation that firm 13 is a myopic best response learner first. Figure 3.11 shows the results for firm 13, when firm 13 is a myopic best response learner. 3.11a shows the fraction of reinforcement learners, 3.11b shows the produced quantities of firm 13 and 3.11c shows the profits of firm 13.

Produced quantities of firm 13 are chaotic at first, bouncing between the minimum and maximum value every period. Quantities stabilize as the fraction of reinforcement learners rises, comparable to global interaction. Notice that profits also rise when the quantities stabilize, but are rarely positive.


(a) The fraction of reinforcement learners around firm 13.

(b) The produced quantities of firm 13 over time.

(c) The profits of firm 13 over time.

Figure 3.11: Results of a myopic best response learner in a grid network. The market contains 25 players.

Let’s now look at the situation that firm 13 is a reinforcement learner. The results with firm 13 as a reinforcement learner are shown in figure 3.12. 3.12a shows the fraction of reinforcement learners, 3.12b shows the produced quantities of firm 13 and 3.12c shows the profits of firm 13.

(a) The fraction of reinforcement learners around firm 13.

(b) The produced quantities of firm 13 over time.

(c) The profits of firm 13 over time.

Figure 3.12: Results of a reinforcement learner in a grid network. The market contains 25 players.

Produced quantities of firm 13 are hard to analyse as they fluctuate a lot. The reinforcement learners stochastically pick a quantity. Slowly adding more neighbours that stochastically pick quantities makes the quantities very volatile at first. The profits of firm 13 stabilize as the fraction of reinforcement learners around firm 13 goes up. Still the profits are mostly negative when the fraction of reinforcement learners is high.

The quantities of the myopic best response learners are chaotic when the fraction of reinforcement learners around them is low. The reinforcement learners outperform the myopic best response learners when the produced quantities of the myopic best response learners are chaotic. The produced quantities of a myopic best response learner do stabilize when the fraction of reinforcement learners around him rises. At this point a firm is more likely to be replaced by a reinforcement learner if it dies, because there are a lot of reinforcement learners. The produced quantities of a myopic best response learner will start to become chaotic again if a myopic best response learner does reproduce a few times, creating an advantage for the reinforcement learners. So this market is likely to converge to a market full of reinforcement learners.

We also see that convergence is a bit slower for the grid network compared to global interaction. For global interaction we see that the produced quantities of myopic best response learners start stabilizing when the fraction of reinforcement learners is around 0.8. The produced quantities of a myopic best response learner in a grid network are unstable if there are three or more myopic best response learners (including himself) in its market. This translates to a fraction of reinforcement learners of 2/3, which is lower than 0.8. The fraction that is needed for the quantities of myopic best response learners to become stable is higher for global interaction, because every myopic best response learner has to compete with all other myopic best response learners, while a myopic best response learner in a network only competes with the myopic best response learners that are also its neighbours. Profits of the reinforcement learners and myopic best response learners are almost equal once the quantities of the myopic best response learners have stabilized. At this point the fraction of reinforcement learners is higher for global interaction. So if a myopic best response learner dies it has a higher chance to be replaced by a reinforcement learner in a market with global interaction than in a market with local interaction in a grid network.

3.4 Star Network

This section contains the results for a star network. Figure 3.13 shows an overview of the results for a star network. Figure 3.13a shows the fraction of reinforcement learners over time for repetitions 1 and 2 and figure 3.13b shows the average fraction of reinforcement learners over time for a star network and for global interaction. First we will discuss the overview in figure 3.13. Then we will explain the results in the overview by looking at two examples of a myopic best response learner and a reinforcement learner in the periphery.

Figure 3.13b shows that the fraction of reinforcement learners converges to 0.2232 over time for a star network, whereas the fraction of reinforcement learners converges to 1 for global interaction. To understand why, let's look at 2 of the repetitions in detail: repetitions 1 and 6. It is unnecessary to look at all 1000 periods to understand what is happening, so let us focus on the first 100 periods.

(a) The fraction of reinforcement learners over time for some repetitions.
(b) The average fraction of reinforcement learners over time. This is the average over 2500 repetitions.

Figure 3.13: Results overview for a star network. The market contains 25 players.

First, why does the fraction in this network structure not converge to 0 or 1 like the other network structures? The star network is special, because any firm in the periphery that dies is replaced by a firm that uses the learning rule that the centre firm uses with chance 1. A star network typically converges much faster than other networks for this reason. It does not converge to 0 or 1 over all repetitions because there is a chance that the vast majority of firms, including the centre firm, is of a certain type, let's say reinforcement learners. In that case it is very likely to converge to a market with all reinforcement learners, because dead firms in the periphery are always replaced by a reinforcement learner and a dead central firm is likely to be replaced by a reinforcement learner because the vast majority in the periphery is a reinforcement learner.

First we look at repetition 1. We focus on a reinforcement learner in the periphery, firm 2, and a myopic best response learner in the periphery, firm 3. We do not focus on firm 1, because we want to explain why the market converges to a fraction of 0.2232 reinforcement learners over all repetitions. It is already explained why the market converges to 0 or 1 if the vast majority is of the same type. But most of the time the numbers of reinforcement learners and myopic best response learners are almost equal, and the centre firm is a reinforcement learner at t=0 approximately half of the time. So to understand why the market is still more likely to converge to 0 than 1, we need to check which learning rule is more likely to be chosen for reproduction if the centre firm dies.

In repetition 1 the centre player is a myopic best response learner for all periods. Figure 3.14 shows the results for repetition 1. 3.14a shows the fraction of reinforcement learners over time, 3.14b shows the produced quantities of firms 2 and 3 and 3.14c shows the profits of firms 2 and 3.

(a) The fraction of reinforcement learners over time.
(b) The produced quantities of firm 2 and 3 over time.
(c) The profits of firm 2 and 3 over time.

Figure 3.14: Results of repetition 1 for a star network. The market contains 25 players.

Firm 1 (the centre firm) produces nothing, because it has to compete with all 24 firms in the periphery. Therefore, firm 3 (the myopic best response learner) produces the maximum quantity every period. Firm 2 (the reinforcement learner) fails to learn the optimal quantity and thus produces almost random quantities. The reinforcement learners have trouble in the star network, because they only have a short time to learn and the used parameter values are more optimal for global interaction or a grid network and less optimal for a star network. See Appendix A for results with different parameters for the reinforcement learners. Firm 3 has a profit higher than or equal to firm 2's profit in every period. So if the centre player dies, it is more likely that firm 3 is chosen for reproduction than firm 2. So if we start with a myopic best response learner in the centre, this explains why the fraction of reinforcement learners is more likely to converge to 0.

Let's look at repetition 6 now. In repetition 6 the centre player is a reinforcement learner for all periods. We focus on firms 17 and 19, both firms in the periphery. Firm 17 is a myopic best response learner for all periods and firm 19 is a reinforcement learner for all periods. The results for repetition 6 are shown in figure 3.15. 3.15a shows the fraction of reinforcement learners over time, 3.15b shows the produced quantities of firms 17 and 19 and 3.15c shows the profits of firms 17 and 19.

(a) The fraction of reinforcement learners over time.

(b) The produced quantities of firm 17 and 19 over time.

(c) The profits of firm 17 and 19 over time.

Figure 3.15: Results of repetition 6 for a star network. The market contains 25 players.

Firm 1 (the centre firm) produces almost nothing, because it has to compete with all 24 firms in the periphery. Firm 1 is a reinforcement learner, so it picks its quantities stochastically and does not produce zero every period like a myopic best response learner in the centre would. Firm 17 (the myopic best response learner) produces the maximum quantity every period. Firm 19 (the reinforcement learner) produces almost random quantities. Firm 17 outperforms firm 19 in every period. So firm 17 is more likely to be chosen for reproduction than firm 19, if the centre firm dies. So if we start with a reinforcement learner in the centre, this explains why the fraction of reinforcement learners is more likely to converge to 0.

So together with repetition 1 we have now shown that the market is more likely to converge to 0 than 1, both in the situation with a myopic best response learner in the centre and in the situation with a reinforcement learner in the centre.


Chapter 4

Discussion

In this chapter some choices and alternatives to those choices will be discussed.

4.1 Reproduction of reinforcement learners

Let's say that a firm that has been a myopic best response learner from the start dies and a reinforcement learner is chosen for reproduction. In my model this new reinforcement learner will have an empty Q-value matrix. An alternative is that the new reinforcement learner gets a copy of the Q-value matrix of the firm that was chosen for reproduction. This choice could be problematic because the reinforcement learners need time to learn, and with this choice every new reinforcement learner has to start learning from scratch.

Why did we choose to give new reinforcement learners an empty Q-value matrix instead of a copy of the Q-value matrix of their parent? The myopic best response learner only reacts to the previous period and does not use any periods from before period t − 1. The firm can only copy the Q-value matrix if it remembers all past decisions of its neighbour or if it gets a copy of the Q-values from its neighbour when it switches to reinforcement learning. We assume that the myopic best response learner does not remember the past of its neighbours, as it does not even use its own past, and we assume the neighbour won't give a copy of his Q-values, as they are competitors. How will this affect the results? Reinforcement learners that had some time to learn perform better than reinforcement learners that still have to start learning; therefore the speed of the convergence of the average fraction of reinforcement learners will probably depend on this decision. We are certain that this choice will only have an effect on the speed of convergence and not too much on the actual average fraction of reinforcement learners to which the market converges.

Another choice concerning the Q-values is the following. If a reinforcement learner dies, it keeps its Q-values for when it becomes a reinforcement learner later in time. The alternatives would be that a reinforcement learner copies the Q-values of its parent or that the firm gets an empty Q-value matrix when it becomes a reinforcement learner for a second time.

Why we did not choose to copy the Q-values is explained above. That leaves one alternative. The Q-values are the weights of every action in some possible states, based on the actions a firm picked in the past. Overwriting these weights with zeroes means that the firm forgets its past. We chose not to overwrite the weights with zeroes, as it seems impossible for a firm to truly forget its past. The downside of my choice is that a reinforcement learner that picked the wrong actions several times, and became a myopic best response learner because of it, has a bad Q-value matrix when it becomes a reinforcement learner again.

4.2 Reinforcement learning parameters

The parameters for reinforcement learning are the same for every network structure. The parameters were chosen with a grid network in mind; therefore they are not optimal for other networks. However, the results remain almost the same when the parameters are optimized for every network separately. Global interaction and a grid network give almost equal results, so we do not need to look at those separately. A circle network converges to a full market of myopic best response learners, because three myopic best response learners (a firm and its two neighbours) will learn to produce the Nash equilibrium quantity. This does not depend on the reinforcement learning parameters.

A star network on average converges to a stable fraction of reinforcement learners. This fraction does change when we optimize the reinforcement learning parameters for the star network. Results for a star network with better parameters are found in Appendix A. Compared to the results in the results chapter, the reinforcement learners are better able to learn that they need to pick a high quantity. Because of this, the market converges to a full market of reinforcement learners more often. The fraction the market converges to on average is 0.2232 in the results chapter and 0.3112 in Appendix A. The fraction of reinforcement learners to which the market converges on average will always be below 0.5 for enough repetitions, as the myopic best response learners in the periphery will always outperform the reinforcement learners in the periphery. So the average fraction of reinforcement learners to which the market converges does change, but why and how it converges to that fraction does not change.


Chapter 5

Conclusion

This thesis tries to answer the following questions: Does adding a network structure have an effect on the evolutionary stable set of learning rules in a Cournot market with heterogeneous players, and if so, why and how? To answer these questions we looked at firms using reinforcement learning and myopic best response learning and four network structures: global interaction, a grid, a circle and a star. For every network we were interested to see how the results related to global interaction.

Let us start with global interaction, so we can compare global interaction with the other network structures later on. The market converges to a market full of reinforcement learners. The produced quantities of the myopic best response learners are unstable while the fraction of reinforcement learners is low. Theocharis (1960) has shown why: the produced quantities are unstable in a market that consists only of myopic best response learners and has 4 or more firms. The myopic best response learners have lower profits than the reinforcement learners when their quantities are unstable, so the fraction of reinforcement learners rises. The produced quantities of the myopic best response learners stabilize when the fraction of reinforcement learners is high enough. This market will converge to a market full of reinforcement learners, because if a player dies, it is more likely to be replaced by a reinforcement learner, as there are a lot of reinforcement learners. Plus, the produced quantities of the myopic best response learners will destabilize if the myopic best response learners reproduce too much.

The circle network market converges to the opposite, a market full of myopic best response learners. Every firm has two neighbours in the circle network. Three myopic best response learners in a market learn to produce the Nash equilibrium quantity. This property makes the myopic best response learners unbeatable by the reinforcement learners, thus the market converges to a market full of myopic best response learners.

Let us look at the grid network now. Adding an extra dimension to a circle network creates a grid network. The grid network market converges to a market full of reinforcement learners. The behaviour is almost the same as with global interaction. Every firm has eight neighbours. Produced quantities of a myopic best response learner are unstable if the fraction of reinforcement learners around that myopic best response learner is too low. So the fraction of reinforcement learners in the whole network goes up at first. This market will converge to a market full of reinforcement learners. As before with global interaction, a player that dies is very likely to be replaced by a reinforcement learner, because the majority of its neighbours are reinforcement learners. The myopic best response learners' produced quantities destabilize if the myopic best response learners reproduce too much. Basically, the grid network has almost the same results as global interaction because a myopic best response learner can have too many myopic best response neighbours.

The circle and grid network are both symmetric networks, as every player has the same number of neighbours. The star network is asymmetric, as the centre firm has all the firms in the periphery as neighbours, while the firms in the periphery only have the centre firm as a neighbour. The star network market converges to a stable fraction of reinforcement learners of 0.2232 on average. The centre firm is a reinforcement learner 50% of the time, and if a firm in the periphery dies, it will be replaced by a firm using the same learning rule as the centre firm with chance 1. To understand why the fraction converges to 0.2232 on average we need to look at the firms in the periphery. The centre firm produces (almost) nothing every period, since it has to compete with all firms in the periphery. The myopic best response learners in the periphery produce the maximum quantity every period, which is the profit maximizing quantity. The reinforcement learners in the periphery cannot produce the maximum quantity every period, as they pick a quantity stochastically. So if the centre firm dies it is more likely to be replaced by a myopic best response learner than by a reinforcement learner. That is why the fraction of reinforcement learners converges to a value smaller than 0.5 on average.

To answer the research questions: adding a network structure can certainly have an effect. There seem to be two main factors that drive the change of results when markets in a network are compared to global interaction. These two factors are the number of neighbours and asymmetry in the network. The effect of the number of neighbours is clearly visible when comparing global interaction with a circle network. The effect of asymmetry in the network is visible when comparing global interaction with the star network. So if someone wants to do this kind of research on learning rules in practice, he or she will need to make sure to identify and use the proper network structure. In practice these will often be determined by the shape of a city. Practical network structures can be a grid network, like New York, or a star network in which firms in the periphery compete with their two neighbours in the periphery as well, think of cities with an old city centre like Amsterdam.


Chapter 6

References

- James Andreoni and John Miller. 2002. “Giving According to GARP: An Experimental Test of Consistency of Preferences for Altruism”. Econometrica 70(2), pp. 737-753.

- Antoni Bosch-Domènech and Nicolaas J. Vriend. 2003. "Imitation of successful behaviour in Cournot markets". The Economic Journal 113, pp. 495-524.

- Yann Bramoullé, Rachel Kranton and Martin D'Amours. 2014. "Strategic Interaction and Networks". American Economic Review 104(3), pp. 898-930.

- Augustin Cournot. 1838. "Recherches sur les Principes Mathématiques de la Théorie des Richesses".

- Edward Droste, Cars Hommes, and Jan Tuinstra. 2002. “Endogenous fluctuations under evo-lutionary pressure in Cournot competition”. Games and Economic Behavior 40, pp. 232-269.

- Sanjeev Goyal. 2007. "Connections, An Introduction to the Economics of Networks". Princeton University Press. pp. 63-86.

- Joanna M. Leleno. 1994. "Adjustment process-based approach for computing a Nash-Cournot equilibrium". Computers & Operations Research 21(1), pp. 57-65.

- Martin A. Nowak. 2006. "Evolutionary Dynamics, exploring the equations of life". The Belknap Press of Harvard University Press. pp. 93-166.

- Marius Ochea. 2010. "Essays on Nonlinear Evolutionary Game Dynamics". Doctoral thesis, Universiteit van Amsterdam. Tinbergen Institute Research Series 468, pp. 89-102.

- Hisashi Ohtsuki, Christoph Hauert, Erez Lieberman and Martin A. Nowak. 2006. “A simple rule for the evolution of cooperation on graphs and social networks”. Nature 441, pp. 502-505.

- Sidney G. Winter Jr. 1964. "Economic "Natural Selection" and the Theory of the Firm". Yale Economic Essays 4(1), pp. 225-272.

- R. D. Theocharis. 1960. "On the stability of the Cournot solution on the oligopoly problem". Review of Economic Studies 27, pp. 133-134.

- Alexander F. Tieman, Gerard van der Laan and Harold Houba. 2001. "Bertrand Price Competition in a Social Environment". De Economist 149(1), pp. 33-51.

- Ludo Waltman and Uzay Kaymak. 2008. "Q-learning agents in a Cournot oligopoly model". Journal of Economic Dynamics and Control 32, pp. 3275-3293.


Chapter 7

Appendix A

The behaviour of the reinforcement learners for a star network in Section 3 seems pretty random. The behaviour seems random because the parameter values for reinforcement learning were set with a grid network in mind; that is why the parameter values are far from optimal in a star network. This appendix contains results for a star network with different parameter values for reinforcement learning, to show that the reinforcement learners are capable of smarter behaviour and that the end results do not change that much.

These results are obtained by running computer simulations with different parameter values for reinforcement learning; all other parameters remain unchanged. The new parameter values are α = 0.4, such that the past weighs more when updating the Q-values, and β = 1, to make it less likely that firms start experimenting with quantities.

Figure 7.1a shows the average fraction of reinforcement learners over time. It is an average over 2500 repetitions. An example of the fraction of reinforcement learners over time during one repetition is seen in figure 7.1b. Figure 7.1b shows the fraction of reinforcement learners over time for repetition 1.

(a) The average fraction of reinforcement learners over time.
(b) The fraction of reinforcement learners over time.

Figure 7.1: Results for a star network using different parameters. Figure a is the average fraction of reinforcement learners over time. It is the average over 2500 repetitions. Figure b is the fraction of reinforcement learners of repetition 1.


The average fraction of reinforcement learners converges to 0.3112, see figure 7.1a. In Section 3 the star network converged to an average fraction of reinforcement learners of 0.2232. The market converges to a higher fraction of reinforcement learners with the new parameters. However the fraction is still lower than 0.5.

Let us look at two firms in the periphery, a reinforcement learner and a myopic best response learner, to understand why the fraction goes up and to see whether the reinforcement learners make smarter decisions. We focus on firm 2 and firm 3. Firm 2 is a reinforcement learner till period 104. Firm 3 is a myopic best response learner for all periods. Figure 7.2a shows the produced quantities of firm 2 and firm 3 over time. The profits of firm 2 and 3 are seen in figure 7.2b.

(a) The produced quantities of firm 2 and 3 over time.

(b) The profits of firm 2 and 3 over time.

Figure 7.2: Results for a star network using different parameters. Figure a shows the produced quantities of firm 2 and firm 3. Figure b shows the profits gained by firm 2 and firm 3.

The centre firm is a myopic best response learner and produces nothing every period. The best reaction to this would be to produce the maximum quantity every period, like firm 3 (the myopic best response learner) does, see figure 7.2a. Firm 2 learns that he should pick high quantities, but cannot produce the maximum quantity every period as he picks his quantities stochastically (figure 7.2a). The myopic best response learner gains profits that are higher than or equal to the profits of the reinforcement learner. Therefore, it is more likely that the centre firm is replaced by a myopic best response learner than by a reinforcement learner, if it dies. That is why the market converges to an average fraction of reinforcement learners that is smaller than 0.5. The average fraction of reinforcement learners to which the market converges is higher with these parameters than in Section 3, because the reinforcement learners learn to pick higher quantities, gain higher profits and thus are more likely to be chosen for reproduction compared to the reinforcement learners in Section 3.
