
Learning in a two-stage two-dimensional spatial duopoly


Academic year: 2021


Master’s Thesis

Learning in a two-stage two-dimensional spatial duopoly

Lucas Zuurveld

Student number: 10220380

Date of final version: October 30, 2016

Master’s programme: Econometrics

Specialisation: Free Track

Supervisor: dr. D. Kopányi

Second reader: prof. dr. J. Tuinstra

Abstract

This paper studies learning in the two-dimensional Hotelling model of spatial competition, where duopolistic firms choose their location in the unit square in the first stage and compete in prices in the second stage. We consider various learning methods, including myopic best response learning, least squares learning and reinforcement learning. When myopic best response learners repeatedly play the second stage, the process converges to the unique price equilibrium for any fixed combination of locations. A process with least squares learners in the second stage leads to a wide variety of self-sustaining equilibria, which in some cases include the price equilibrium. Learners with naïve expectations about the location and price of the competitor, who play the whole game repeatedly, converge to 2-cycles in which the firms swap their locations and prices. However, when the learners have naïve expectations only about the location and play the price equilibrium in the second stage, a subgame-perfect Nash equilibrium is reached with a probability of ¼. Even for perceived duopolists, who use reinforcement learning in the first stage and least squares learning based on the last few observations in the second stage, the process frequently converges to a subgame-perfect Nash equilibrium.

Contents

1. Introduction
2. The market model
2.1 Price equilibria
2.2 Location equilibria
3. Learners in the price stage
3.1 Myopic best response learners
3.2 Perceived monopolist LS learners
3.2.1 The learning mechanism
3.2.2 Self-sustaining equilibrium
3.2.3 Simulation results
3.3 Perceived duopolist LS learners
3.3.1 The learning mechanism
3.3.2 Self-sustaining equilibrium
3.3.3 Simulation results
3.4 Comparing prices and profits
4. Learners with naïve expectations
4.1 The learning mechanisms
4.2 Simulation results
4.2.1 Type 1 learners
4.2.2 Type 2 learners
4.2.3 Type 3 learners
4.2.4 Comparing prices and profits
4.2.5 Adaptive learning
4.2.6 Infrequent location updates
5. Reinforcement and LS learners
5.1 Reinforcement learning with price equilibrium
5.2 Reinforcement learning and LS learning
5.2.1 The learning mechanism
5.2.2 Simulation results
6. Conclusion and discussion
7. Appendix A
8. Appendix B

[…] a two-dimensional space and it is unlikely that firms can differentiate their product by only one characteristic. Tabuchi (1994) extends Hotelling’s model of spatial duopoly to a two-dimensional product space. He presupposes a uniform distribution of consumers over a rectangle, quadratic transportation costs and unit demands. The marginal costs of the duopolistic firms are assumed to be constant. Tabuchi (1994) proves that, in the subgame-perfect Nash equilibria of the game, firms choose to maximize differentiation in one characteristic and minimize differentiation along the other. One of the major assumptions in the model is the rationality of the firms.

Conlisk (1996) argues why it is realistic to incorporate bounded rationality in economic models. Firms have to make a tradeoff between the benefits of better decision making and the effort cost of decisions. Information costs may be high, so it is reasonable to assume that the firms are not fully aware of the complex market structure and/or the behaviour of the competitor. Firms will realise that they have limited information about the environment in which they operate, since the outcomes will not always coincide with their predictions. A reasonable response would be that firms use the information resulting from their actions to learn about the market environment. We consider Tabuchi’s model where firms are assumed to be boundedly rational agents. They apply different learning methods and use simple rules of thumb to choose location and price.

This paper is focused on the following questions: Will maximal differentiation on one dimension and minimum differentiation along the other arise as long-run outcome of a process where firms are learning about the market structure or the optimal choice? If not, will the learning methods lead to other equilibria? And will different learning rules lead to different outcomes?

Firms try to learn from past observations. A widely used learning method in the literature on economic models is least squares (LS) learning. Kirman (1975) studies this misspecified learning rule in a simple duopoly setting. The firms use their own past price-quantity observations to estimate the demand function by a linear model. The price for the next period is chosen based on the parameter estimates. Cooper et al. (2009) consider the case where firms also include the past price observations of the competitor in the regression model. Both variants are studied as learning methods in the second stage of the game. LS learning is linked to reinforcement learning in the first stage of the game. A reinforcement learning rule similar to the one used in Kirman and Vriend (2001) is applied. Reinforcement learners do not need to make many assumptions about the market structure. They repeat actions that led to better outcomes in the past.

A more rational learner is the myopic best response learner. A myopic best response learner has a correct perception of the market structure, but does not know how its competitor will act. The myopic best response rule is studied at the second stage of the game separately and as learning method in the whole game. Two other types of learners with naïve expectations about the location of the competitor, but with more rational beliefs about the price of the competitor, are considered as well.

The learning methods we consider lead to different outcomes. Separate analyses of the second stage of the game, where locations are kept fixed over the periods, show that the naïve best response learners move towards the unique price equilibrium for any pair of locations. With LS learners, the process converges most of the time to one of a wide variety of self-sustaining equilibria. This term was introduced by Brousseau and Kirman (1992) and indicates a situation in which the perceived and the actual demands coincide at the prices chosen by the firms, but not at other prices. However, perceived duopolist LS learners, who use only the last few observations for their parameter estimates, almost always move to the price equilibrium.

The fully myopic best response learners, who compete in location in the first stage while anticipating the subsequent price competition in the second stage, move to a 2-cycle in which the firms swap prices and locations. The 2-cycle locations consist of the randomly drawn initial locations and locations around the centre of the grid. The more rational learners with naïve expectations about the location of the competitor, who play the best response price to the myopic best response price of the competitor, give similar outcomes. However, when they deviate from the initial locations, they generally choose the midpoints of the sides instead of locations around the centre. A process with firms that have naïve expectations about the competitor’s location and fully rational expectations about the price moves to a 2-cycle in approximately 75% of the simulation runs and to a subgame-perfect Nash equilibrium (SPNE) in approximately 25% of the runs. A process with perceived duopolists who use reinforcement learning in the first stage, and LS learning with only the last few observations in the second stage, frequently converges to an SPNE.

Several researchers use models related to the horizontal differentiation model of Hotelling (1929). One of the most famous variations of Hotelling’s location model is the circle model of Salop (1979). Irmen and Thisse (1998) extend Hotelling’s model to a multi-dimensional characteristic space. In that paper, the transportation cost is assumed to be the square of the weighted Euclidean distance. The consumers are distributed over the unit […]

[…] product differentiation. d’Aspremont et al. (1979) show that there exists no price equilibrium in Hotelling’s one-dimensional horizontal differentiation model under a linear transportation cost when the duopolists locate close to one another. Existence and uniqueness of a price equilibrium are guaranteed, however, when transportation costs are quadratic instead of linear. As d’Aspremont et al. (1979) show, firms then maximize product differentiation in order to relax price competition. Tabuchi (1994) shows that the principle of maximal differentiation does not hold in a two-dimensional space. Instead, equilibria are identified where the two firms choose to maximize differentiation in one characteristic, while differentiation is minimized along the other.

The remainder of the paper is organized as follows. Section 2 presents the market model by outlining its assumptions, after which we determine price equilibria and SPNEs. We discuss the learning rules and the corresponding simulation results for the second stage of the game in Section 3. Section 4 analyses learners in the whole game who are fully aware of the market structure. In Section 5, we consider learners who are not fully aware of the market structure. Section 6 contains the conclusion and discussion. The appendix contains the proofs and some simulation results for the case of heterogeneous firms.

2. The market model

We consider a situation where two firms simultaneously choose their locations in the first stage, anticipating the subsequent price competition in the second stage. Firm 1 locates at $(x_1, y_1) \in [0,1]^2$ and Firm 2 locates at $(x_2, y_2) \in [0,1]^2$. The positioning of a firm can be regarded as an urban location or as a nonnegative valuation on two horizontal product characteristics. In the second stage, the firms simultaneously choose their prices $p_1$ and $p_2$, knowing each other’s location. Consumers, each of whom purchases one unit of the good, are uniformly distributed over the unit square $[0,1]^2$. The transportation cost is assumed to be the squared Euclidean distance between the consumer, located at $(x, y)$, and the firm. A consumer buying at firm $i$ enjoys a utility equal to

$$U_i(x, y) = S - p_i - (x_i - x)^2 - (y_i - y)^2, \tag{1}$$

where $S$ denotes the consumers’ reservation price, which is high enough to ensure that the full market is covered. A consumer buys from the firm that gives him a higher utility and is indifferent between Firm 1 and Firm 2 if

$$p_1 + (x_1 - x)^2 + (y_1 - y)^2 = p_2 + (x_2 - x)^2 + (y_2 - y)^2. \tag{2}$$

The firms can produce the product at the same constant marginal cost, which is normalized to zero without loss of generality. Both firms maximize their profit,

$$\Pi_1 = p_1 D_1, \qquad \Pi_2 = p_2 (1 - D_1),$$

with respect to location and then price, where $D_1 = \iint_{C_1} dx\,dy$ and

$$C_1 = \{(x, y) \mid p_1 + (x_1 - x)^2 + (y_1 - y)^2 \le p_2 + (x_2 - x)^2 + (y_2 - y)^2\}. \tag{3}$$

The set 𝐶1 denotes the consumers who are better off when they buy at Firm 1.
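As a rough numerical illustration of (3), the demand of Firm 1 can be approximated by placing consumers on a fine grid over the unit square and counting those who weakly prefer Firm 1. This is only a sketch: the function name `demand_firm1` and the grid resolution `n` are assumptions made for this example and do not come from the thesis.

```python
def demand_firm1(p1, p2, loc1, loc2, n=200):
    """Approximate D1 from (3) as the share of grid consumers in the set C1.

    Consumers sit at the midpoints of an n-by-n grid on the unit square;
    this discretisation is an assumption of the sketch.
    """
    (x1, y1), (x2, y2) = loc1, loc2
    count = 0
    for i in range(n):
        x = (i + 0.5) / n
        for j in range(n):
            y = (j + 0.5) / n
            # Full cost (price plus quadratic transportation cost) at each firm.
            cost1 = p1 + (x1 - x) ** 2 + (y1 - y) ** 2
            cost2 = p2 + (x2 - x) ** 2 + (y2 - y) ** 2
            if cost1 <= cost2:
                count += 1
    return count / (n * n)


# Firms at the midpoints of the bottom and top sides, charging equal prices.
d1 = demand_firm1(1.0, 1.0, (0.5, 0.0), (0.5, 1.0))
print(d1, 1.0 - d1)
```

With symmetric locations and equal prices the indifference line is $y = 0.5$, so the market is split equally between the two firms.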

The second stage of the game will be analysed first. In this stage the firms maximize their profit with respect to price, given their locations. Caplin and Nalebuff (1991) show that for any given locations of firms and for any log-concave density function of consumers in $\mathbb{R}^n$, a unique price equilibrium exists.

The location competition can be divided into two types, an asymmetric characteristics competition and a dominated characteristics competition. In an asymmetric characteristics competition each of the firms has a higher valuation on one of the two characteristics. A dominated characteristics competition is defined as competition between firms when one firm has a higher valuation on both characteristics. In both cases we can also distinguish between a characteristic $x$ dominance competition and a characteristic $y$ dominance competition. The former is a competition in which the absolute product differentiation on characteristic $x$ is greater than or equal to the absolute product differentiation on characteristic $y$, $|x_1 - x_2| \ge |y_1 - y_2|$. For a characteristic $y$ dominance competition, it is the other way around, so $|y_1 - y_2| \ge |x_1 - x_2|$. In total there are 8 possible cases of location-combinations, next to the special cases in which the firms have the same valuation on at least one of the characteristics. The 8 standard types of location competition are described below and depicted in Figure 1.

Asymmetric characteristics competition

Firm 1 has a higher valuation on 𝑥, Firm 2 has a higher valuation on 𝑦: (𝑥1− 𝑥2) > 0, (𝑦2− 𝑦1) > 0

1. Characteristic 𝑥 dominance competition: (𝑥1− 𝑥2) ≥ (𝑦2− 𝑦1)

2. Characteristic 𝑦 dominance competition: (𝑥1− 𝑥2) ≤ (𝑦2− 𝑦1)

Firm 1 has a higher valuation on 𝑦, Firm 2 has a higher valuation on 𝑥: (𝑥2− 𝑥1) > 0, (𝑦1− 𝑦2) > 0

3. Characteristic 𝑥 dominance competition: (𝑥2− 𝑥1) ≥ (𝑦1− 𝑦2)

4. Characteristic 𝑦 dominance competition: (𝑥2− 𝑥1) ≤ (𝑦1− 𝑦2)

Dominated characteristics competition

Firm 1 dominant: 𝑥1> 𝑥2, 𝑦1 > 𝑦2

5. Characteristic 𝑥 dominance competition: (𝑥1− 𝑥2) ≥ (𝑦1− 𝑦2)

6. Characteristic 𝑦 dominance competition: (𝑥1− 𝑥2) ≤ (𝑦1− 𝑦2)

Firm 2 dominant: 𝑥2> 𝑥1, 𝑦2> 𝑦1

7. Characteristic 𝑥 dominance competition: (𝑥2− 𝑥1) ≥ (𝑦2− 𝑦1)

8. Characteristic 𝑦 dominance competition: (𝑥2− 𝑥1) ≤ (𝑦2− 𝑦1)

It is easy to verify that (2) can be written as

$$y = \frac{p_2 - p_1 + (x_2^2 - x_1^2) + (y_2^2 - y_1^2)}{2(y_2 - y_1)} - \frac{x_2 - x_1}{y_2 - y_1}\,x. \tag{4}$$

The indifference line, which is defined by (4), makes for each of the specified competitions an angle with the horizontal axis of

$$\alpha = \tan^{-1}\!\left(\frac{x_1 - x_2}{y_2 - y_1}\right). \tag{5}$$
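For completeness, the step from (2) to (4) can be spelled out as follows, assuming $y_1 \neq y_2$. Expanding the squares in (2) cancels the $x^2$ and $y^2$ terms on both sides, after which the equation is linear in $x$ and $y$:

```latex
\begin{aligned}
p_1 + x_1^2 - 2x_1 x + y_1^2 - 2y_1 y &= p_2 + x_2^2 - 2x_2 x + y_2^2 - 2y_2 y \\
2(y_2 - y_1)\,y &= p_2 - p_1 + (x_2^2 - x_1^2) + (y_2^2 - y_1^2) - 2(x_2 - x_1)\,x \\
y &= \frac{p_2 - p_1 + (x_2^2 - x_1^2) + (y_2^2 - y_1^2)}{2(y_2 - y_1)} - \frac{x_2 - x_1}{y_2 - y_1}\,x
\end{aligned}
```

The slope of this line is $(x_1 - x_2)/(y_2 - y_1)$, which is the tangent of the angle $\alpha$ in (5).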

Following Vandenbosch and Weinberg (1995), let us define some special price levels. We define $p_1^u$ as the price for which the indifference line passes through $(1,0)$, and $p_1^l$ as the price for which it passes through $(0,1)$. We determine $p_1^n$ and $p_1^m$ by solving (4) with respect to $p_1$, where $(x, y)$ is equal to $(0,0)$ and $(1,1)$, respectively. This results in the following price equations:

$$p_1^u = p_2 + (x_2^2 - x_1^2) + (y_2^2 - y_1^2) - 2(x_2 - x_1), \tag{6}$$

$$p_1^m = p_2 - 2[(y_2 - y_1) + (x_2 - x_1)] + (x_2^2 - x_1^2) + (y_2^2 - y_1^2), \tag{7}$$

$$p_1^n = p_2 + (x_2^2 - x_1^2) + (y_2^2 - y_1^2), \tag{8}$$

$$p_1^l = p_2 - 2(y_2 - y_1) + (x_2^2 - x_1^2) + (y_2^2 - y_1^2). \tag{9}$$

Figure 1: The 8 standard types of location competition.

Consider competition 1, for which $p_1^l < p_1^n \le p_1^m < p_1^u$. The grid can be divided into three regions. Figure 2, which is similar to Figure 3 in Vandenbosch and Weinberg (1995), depicts the three different regions. The indifference line passes through region $R_1$ if the price of Firm 1 is between $p_1^m$ and $p_1^u$. Let $z_1$ be the distance from the horizontal axis to the point where the indifference line meets the right side of the square. Then $z_1$ is defined by:

$$z_1 = \frac{p_1^u - p_1}{p_1^u - p_1^m}. \tag{10}$$

At $p_1 = p_1^u$, $z_1 = 0$ and at $p_1 = p_1^m$, $z_1 = 1$. When $p_1^m \le p_1 \le p_1^u$, the demand for Firm 1 is $\frac{1}{2} z_1^2 \cot(\alpha)$, since the area of the triangle is equal to $\frac{1}{2}(\text{base})(\text{height})$ and $\cot(\alpha) = \frac{\text{base}}{\text{height}}$. In a similar way, the demand can be obtained when the indifference line passes through $R_2$ or $R_3$.

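This results in the following piecewise-defined demand function for competition 1, consisting of a concave, a linear and a convex part.

$$D_1 = \begin{cases}
0 & \text{if } p_1 > p_1^u \\
\frac{1}{2} z_1^2 \cot(\alpha) & \text{if } p_1^m \le p_1 \le p_1^u \\
z_2 (1 - \cot(\alpha)) + \frac{1}{2} \cot(\alpha) & \text{if } p_1^n \le p_1 \le p_1^m \\
1 - \frac{1}{2} z_3^2 \cot(\alpha) & \text{if } p_1^l \le p_1 \le p_1^n \\
1 & \text{if } p_1 < p_1^l
\end{cases}$$

with $z_1 = \dfrac{p_1^u - p_1}{p_1^u - p_1^m}$, $z_2 = \dfrac{p_1^m - p_1}{p_1^m - p_1^n}$ and $z_3 = \dfrac{p_1^l - p_1}{p_1^l - p_1^n}$.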

The demand curve and the corresponding profit curve for a given 𝑝2 are plotted in Figure 3.

The demand functions for competitions 2 to 8 have a similar structure and can be derived in a similar way. They can be found in Appendix I.
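The piecewise demand function for competition 1 can be sketched in code as follows. The function names are illustrative assumptions, and the sketch assumes the strict inequality $x_1 - x_2 > y_2 - y_1 > 0$, so that $p_1^m > p_1^n$ and the definition of $z_2$ does not divide by zero.

```python
def special_prices(p2, loc1, loc2):
    """Price levels (6)-(9): values of p1 at which the indifference line hits a corner."""
    (x1, y1), (x2, y2) = loc1, loc2
    base = p2 + (x2**2 - x1**2) + (y2**2 - y1**2)
    p_u = base - 2 * (x2 - x1)                 # line through (1, 0)
    p_m = base - 2 * ((y2 - y1) + (x2 - x1))   # line through (1, 1)
    p_n = base                                 # line through (0, 0)
    p_l = base - 2 * (y2 - y1)                 # line through (0, 1)
    return p_u, p_m, p_n, p_l


def demand_competition1(p1, p2, loc1, loc2):
    """Demand of Firm 1 in competition 1 (assumes x1 - x2 > y2 - y1 > 0)."""
    (x1, y1), (x2, y2) = loc1, loc2
    p_u, p_m, p_n, p_l = special_prices(p2, loc1, loc2)
    cot_alpha = (y2 - y1) / (x1 - x2)          # from (5)
    if p1 > p_u:
        return 0.0
    if p1 >= p_m:                              # region R1
        z1 = (p_u - p1) / (p_u - p_m)
        return 0.5 * z1**2 * cot_alpha
    if p1 >= p_n:                              # region R2, linear part
        z2 = (p_m - p1) / (p_m - p_n)
        return z2 * (1 - cot_alpha) + 0.5 * cot_alpha
    if p1 >= p_l:                              # region R3
        z3 = (p_l - p1) / (p_l - p_n)
        return 1 - 0.5 * z3**2 * cot_alpha
    return 1.0


loc1, loc2 = (0.8, 0.2), (0.2, 0.4)            # example locations of competition type 1
p_u, p_m, p_n, p_l = special_prices(0.5, loc1, loc2)
for p1 in (p_u, p_m, p_n, p_l):
    print(round(p1, 4), round(demand_competition1(p1, 0.5, loc1, loc2), 4))
```

Evaluating the function at the four special price levels shows that the pieces join continuously: the demand rises from 0 at $p_1^u$ to 1 at $p_1^l$.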

For the special case where $x_1 = x_2$, the equation for the indifferent consumer can be written as:

$$y = \frac{y_1^2 - y_2^2 + p_1 - p_2}{2y_1 - 2y_2}. \tag{11}$$

This results in a linear demand curve, as for the case where $y_1 = y_2$. The demand curves for these special cases are given in the appendix as well.

Since the locations can be interpreted as the characteristics of a product, the firms are allowed to choose the same location. We make the assumption that the firm with the lowest price serves the entire market in the case that both firms have the same location. When both firms ask the same price for their good, the market is shared equally.

2.1 Price equilibria

This market game has a unique price equilibrium, given any pair of different locations. For competition 1, the equilibrium can be located in one of the three areas $R_1$, $R_2$ or $R_3$. In $R_2$, the demand functions for the two firms are linear in prices. So, the profit functions are quadratic in prices, which results in unique solutions for the first order conditions. For competition 1, the reaction functions are given by:

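$$p_1(p_2) = -\frac{x_1^2}{2} + x_1 + \frac{x_2^2}{2} - x_2 - \frac{y_1^2}{2} + \frac{y_1}{2} + \frac{y_2^2}{2} - \frac{y_2}{2} + \frac{p_2}{2}, \tag{12}$$

$$p_2(p_1) = \frac{x_1^2}{2} - \frac{x_2^2}{2} + \frac{y_1^2}{2} - \frac{y_1}{2} - \frac{y_2^2}{2} + \frac{y_2}{2} + \frac{p_1}{2}. \tag{13}$$

When $p_1$ in (13) is substituted by the right-hand side of (12), the following price equilibrium can be derived:

$$p_1^* = -\frac{x_1^2}{3} + \frac{4x_1}{3} + \frac{x_2^2}{3} - \frac{4x_2}{3} - \frac{y_1^2}{3} + \frac{y_1}{3} + \frac{y_2^2}{3} - \frac{y_2}{3}, \tag{14}$$

$$p_2^* = \frac{x_1^2}{3} + \frac{2x_1}{3} - \frac{x_2^2}{3} - \frac{2x_2}{3} + \frac{y_1^2}{3} - \frac{y_1}{3} - \frac{y_2^2}{3} + \frac{y_2}{3}. \tag{15}$$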

These are the equilibrium prices provided that they belong to the intervals defining $R_2$, which holds when:

$$p_1^* \in [p_1^n(p_2^*),\, p_1^m(p_2^*)].^{1} \tag{16}$$

¹ Similar restrictions on the interval of Firm 2’s price can be derived. These restrictions are effectively the same as (16).

A detailed derivation of the equilibrium is given in Appendix II. In a similar way the price equilibrium that corresponds to the linear piece of the demand function can be derived for competitions 2 to 8.
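A small numerical check can make the relation between (12) to (16) concrete. The sketch below evaluates the equilibrium prices (14) and (15) for one example pair of locations of competition type 1, confirms that they are a fixed point of the reaction functions (12) and (13), and tests condition (16). The function names and the example locations are assumptions chosen for illustration.

```python
def reaction_firm1(p2, loc1, loc2):
    """Reaction function (12) of Firm 1 on the linear part of demand."""
    (x1, y1), (x2, y2) = loc1, loc2
    return (-x1**2 / 2 + x1 + x2**2 / 2 - x2
            - y1**2 / 2 + y1 / 2 + y2**2 / 2 - y2 / 2 + p2 / 2)


def reaction_firm2(p1, loc1, loc2):
    """Reaction function (13) of Firm 2 on the linear part of demand."""
    (x1, y1), (x2, y2) = loc1, loc2
    return (x1**2 / 2 - x2**2 / 2
            + y1**2 / 2 - y1 / 2 - y2**2 / 2 + y2 / 2 + p1 / 2)


def equilibrium_prices(loc1, loc2):
    """Candidate equilibrium prices (14) and (15)."""
    (x1, y1), (x2, y2) = loc1, loc2
    p1 = (-x1**2 + 4 * x1 + x2**2 - 4 * x2 - y1**2 + y1 + y2**2 - y2) / 3
    p2 = (x1**2 + 2 * x1 - x2**2 - 2 * x2 + y1**2 - y1 - y2**2 + y2) / 3
    return p1, p2


def in_linear_region(loc1, loc2):
    """Condition (16): p1* lies between p1^n(p2*) and p1^m(p2*)."""
    (x1, y1), (x2, y2) = loc1, loc2
    p1, p2 = equilibrium_prices(loc1, loc2)
    p_n = p2 + (x2**2 - x1**2) + (y2**2 - y1**2)    # (8)
    p_m = p_n - 2 * ((y2 - y1) + (x2 - x1))         # (7)
    return p_n <= p1 <= p_m


loc1, loc2 = (0.8, 0.2), (0.2, 0.4)
p1_star, p2_star = equilibrium_prices(loc1, loc2)
print(p1_star, p2_star, in_linear_region(loc1, loc2))
```

For these locations the candidate prices are $1.72/3$ and $1.88/3$, and condition (16) holds, so the equilibrium lies on the linear part of the demand curve.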

Assume now the special case where $x_1 = x_2$ and (without loss of generality) $y_1 < y_2$. The demand for Firm 1 in the price equilibrium is given by (11). This results in quadratic profit functions for both firms, for which the reaction functions are given by:

$$p_1(p_2) = -\frac{y_1^2}{2} + \frac{y_2^2}{2} + \frac{p_2}{2}, \tag{17}$$

$$p_2(p_1) = \frac{y_1^2}{2} - y_1 - \frac{y_2^2}{2} + y_2 + \frac{p_1}{2}. \tag{18}$$

When $p_1$ in (18) is substituted by the right-hand side of (17), the following price equilibrium can be derived:

$$p_1^* = -\frac{y_1^2}{3} - \frac{2y_1}{3} + \frac{y_2^2}{3} + \frac{2y_2}{3}, \tag{19}$$

$$p_2^* = \frac{y_1^2}{3} - \frac{4y_1}{3} - \frac{y_2^2}{3} + \frac{4y_2}{3}. \tag{20}$$

A detailed derivation of the equilibrium is given in Appendix III. In a similar way the equilibrium can be calculated for a competition where $y_1 = y_2$. When both players locate at exactly the same place, the price equilibrium is $p_1^* = 0$, $p_2^* = 0$, because a firm will undercut the competitor by an arbitrarily small amount whenever the competitor charges a positive price. Once both firms offer the product for free, neither of them can become better off by changing its price.

2.2 Location equilibria

The first stage of the game involves the simultaneous choice of the locations. The location decisions depend on the corresponding price equilibrium. Tabuchi (1994) identifies SPNE locations for an identical two-stage two-dimensional spatial competition, where consumers are uniformly distributed in a rectangle. The proof is based on the following lemmas about the best response location:

Lemma 1

Given firm $j$’s location $(x_j, y_j) \in C_1$, firm $i$ $(\neq j)$ locates either at a corner or at a midpoint of one side.

Lemma 2

If one firm locates at a corner, then the other firm locates at a midpoint of one side which is farthest from the corner.

$$((x_1^*, y_1^*), (x_2^*, y_2^*)) = ((0.5, 0), (0.5, 1)), \tag{23}$$

$$((x_1^*, y_1^*), (x_2^*, y_2^*)) = ((0.5, 1), (0.5, 0)). \tag{24}$$

So, firms maximize differentiation in one characteristic and minimize differentiation along the other.

The equilibrium price in an SPNE is 1 for both firms. The equilibrium price can be calculated by substituting the locations of (23) into the price equilibrium equations (19) and (20). Because of symmetry, the price equilibrium is the same for every SPNE. So, any pair of locations in (21) to (24), with $p_1^* = 1$ and $p_2^* = 1$, forms an SPNE.
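The substitution can be written out explicitly. With the locations of (23), $x_1 = x_2 = 0.5$, $y_1 = 0$ and $y_2 = 1$, so that (19) and (20) apply and give:

```latex
p_1^* = -\frac{0^2}{3} - \frac{2 \cdot 0}{3} + \frac{1^2}{3} + \frac{2 \cdot 1}{3} = 1,
\qquad
p_2^* = \frac{0^2}{3} - \frac{4 \cdot 0}{3} - \frac{1^2}{3} + \frac{4 \cdot 1}{3} = 1.
```

At these prices, (11) places the indifferent consumer at $y = 0.5$, so each firm serves half of the market.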

3. Learners in the price stage

After discussing the market model, let us now turn to learning. In this section we focus on learning in the second stage of the game, keeping locations fixed. We consider two types of learning methods: myopic best response and least squares learning. When firms apply the myopic best response rule, they play their best response to the price of the competitor in the previous period. LS learners estimate the parameters of their perceived demand function based on past price-quantity observations and choose the profit-maximizing price given the estimated function. Two types of LS learners will be studied. We analyse the learner who uses only its own price as explanatory variable in the linear regression model. Thereafter, we discuss the learner who uses also the price of the competitor as explanatory variable. In the last part of this section, the prices and profits of the different learners are compared.

The two locations are drawn from the discrete uniform distribution on the set $L = \left\{\frac{0}{100}, \frac{1}{100}, \ldots, \frac{100}{100}\right\}^2$. Hence, there is a positive probability that both players have equal valuations on one or both of the characteristics. Initial prices are drawn from the uniform distribution on the set $P = [0, 2]$.

3.1 Myopic best response learners

Since the demand function of a firm is known, the best response price, and hence the myopic best response price, can be found numerically. The best response price is either the maximum price for which the full market is covered, or a certain price level in the concave, linear or convex part of the demand curve. When the best response price is in one of these parts, the derivative of the profit function has to be zero. Therefore, the first order conditions are solved for the different parts of the profit function. The corresponding profits are calculated and compared with the profit the firm would get if it asked the maximum price for which it captures the full market. The price that results in the highest profit is the best response price, $p_i = p_i^{BR}(p_j, (x_i, y_i), (x_j, y_j))$, where $p_j$ and $(x_j, y_j)$ are the price and location of the competitor, respectively. When both firms are located at the same place, the best response is to undercut the competitor by an arbitrarily small price unit, which is defined as $10^{-5}$, unless the other firm charges zero, in which case zero is the best response price. The prediction is that when the method converges, it leads to the price equilibrium. Myopic best response learning is implemented as follows:

1. (a). (𝑥1, 𝑦1) and (𝑥2, 𝑦2) are randomly drawn from the set 𝐿.

(b). 𝑝1,1 and 𝑝2,1 are randomly drawn from the set 𝑃.

2. In period 𝑡 ≥ 2 the firms ask the price 𝑝𝑖,𝑡= 𝑝𝑖𝐵𝑅(𝑝𝑗,𝑡−1, (𝑥𝑖, 𝑦𝑖), (𝑥𝑗, 𝑦𝑗)).

3. The process stops when the absolute price change is smaller than $\delta_1$ for both firms, $\max_i \{|p_{i,t} - p_{i,t-1}|\} < \delta_1$, or when period $T$ is reached.
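The steps above can be sketched for the special case $x_1 = x_2$, $y_1 < y_2$, where demand is linear and the reaction functions (17) and (18) are available in closed form. This is a simplified illustration, not the thesis implementation: the general case requires the full piecewise demand function, and the function names, example locations and starting prices are assumptions. In this special case the profit of a firm increases in its own price up to the highest price at which it still serves the full market and is a concave parabola beyond that price, so comparing the profits of the two candidate prices reduces to taking the larger of the two.

```python
def br_firm1(p2, y1, y2):
    """Best response of Firm 1 when x1 = x2 and y1 < y2."""
    interior = (p2 + y2**2 - y1**2) / 2                   # reaction function (17)
    full_market = p2 + (y2**2 - y1**2) - 2 * (y2 - y1)    # highest price with D1 = 1
    return max(interior, full_market)


def br_firm2(p1, y1, y2):
    """Best response of Firm 2 when x1 = x2 and y1 < y2."""
    interior = (p1 + 2 * (y2 - y1) + y1**2 - y2**2) / 2   # reaction function (18)
    full_market = p1 - (y2**2 - y1**2)                    # highest price with D2 = 1
    return max(interior, full_market)


def simulate(y1, y2, p1, p2, delta=1e-5, max_periods=20_000):
    """Both firms respond to the competitor's price of the previous period."""
    for t in range(2, max_periods + 1):
        new_p1, new_p2 = br_firm1(p2, y1, y2), br_firm2(p1, y1, y2)
        change = max(abs(new_p1 - p1), abs(new_p2 - p2))
        p1, p2 = new_p1, new_p2
        if change < delta:        # stopping criterion of step 3
            break
    return p1, p2, t


p1, p2, periods = simulate(y1=0.2, y2=0.9, p1=1.7, p2=0.3)
print(p1, p2, periods)
```

For this example the prices settle close to the values given by (19) and (20), $2.17/3$ and $2.03/3$, within a few dozen periods.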

The model is simulated with both firms myopically playing the best response to their competitor. The simulations are interrupted after period $T = 20{,}000$ if the stopping criterion with the threshold value $\delta_1 = 10^{-5}$ has not been satisfied before. The process converges to the unique price equilibrium for all location-combinations of Firms 1 and 2. For 1,833 out of the 2,000 runs, the price equilibrium belonged to the linear part of the demand curve.

Figure 4 shows the distribution of end prices (i.e. equilibrium prices) by a scatterplot (left) and a histogram (right). It shows that there is a wide variety of equilibrium prices for different location pairs. The mean end price is 0.4765. Figure 5 depicts a typical time series for the prices (left) and the corresponding distribution of the market at the end prices (right). The prices are more volatile in the first few periods. However, the process moves quickly to the equilibrium.

The model is separately simulated 2,000 times with SPNE locations. All of the runs ended up in the price equilibrium.

3.2 Perceived monopolist LS learners

In this subsection we relax the assumption that firms know the demand function. We consider the LS learner who assumes that it is a monopolist in a market with linear demand. Kirman (1983) studies this learning rule in a Bertrand duopoly. He claims that it is reasonable to assume that firms ignore the prices of other goods in an oligopolistic setting.

3.2.1 The learning mechanism

When firms perceive themselves to be monopolists in a market with linear demand, the perceived demand is given by:

$$D_i^p(p_i) = \alpha - \beta p_i + \varepsilon_i, \tag{25}$$

where $\alpha$ and $\beta$ are unknown parameters and $\varepsilon_i$ is a normally distributed error term with mean zero. Note that this description of the market structure is incorrect, since the demand of a firm also depends on the price of the competitor and is not linear in its own price on every price interval. The perceived demand function implies the perceived expected profit:

$$E[\Pi_i] = E[(\alpha - \beta p_i + \varepsilon_i) p_i] = (\alpha - \beta p_i) p_i,$$

which is maximized for:

$$p_i = \frac{\alpha}{2\beta}. \tag{26}$$

Figure 4: Scatterplot (left) and histogram (right) of the end prices for the myopic best response learner.

Figure 5: Typical time series of prices (left) and the corresponding distribution of the market at the end of the process (right), for the myopic best response learner.

At the end of each period, the firms estimate the parameters by ordinary LS regression, based on the data observed up to that moment. This results in the following parameter estimators (see, for example, Heij et al., 2004):

$$b_{i,t} = -\frac{\sum_{k=1}^{t} [p_{i,k} - \bar{p}_{i,t}][D_{i,k} - \bar{D}_{i,t}]}{\sum_{k=1}^{t} [p_{i,k} - \bar{p}_{i,t}]^2} \tag{27}$$

and

$$a_{i,t} = \bar{D}_{i,t} + b_{i,t}\,\bar{p}_{i,t}, \tag{28}$$

with $\bar{D}_{i,t} = \frac{1}{t}\sum_{k=1}^{t} D_{i,k}$ and $\bar{p}_{i,t} = \frac{1}{t}\sum_{k=1}^{t} p_{i,k}$.
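The estimators (27) and (28) and the perceived optimal price (26) can be computed as in the following sketch. The function name `ls_estimates` and the three example observations are assumptions made for illustration; the observations are chosen to lie exactly on the line $D = 1 - 0.5p$.

```python
def ls_estimates(prices, demands):
    """Return (a, b) from (27) and (28) for the perceived demand D = a - b * p."""
    t = len(prices)
    p_bar = sum(prices) / t
    d_bar = sum(demands) / t
    s_pp = sum((p - p_bar) ** 2 for p in prices)
    s_pd = sum((p - p_bar) * (d - d_bar) for p, d in zip(prices, demands))
    b = -s_pd / s_pp          # (27); requires at least two distinct prices
    a = d_bar + b * p_bar     # (28)
    return a, b


a, b = ls_estimates([0.4, 0.8, 1.2], [0.8, 0.6, 0.4])
perceived_optimal_price = a / (2 * b)     # (26)
print(a, b, perceived_optimal_price)
```

Since the example data are exactly linear, the estimates recover the intercept and slope of that line up to rounding, and the perceived optimal price is close to 1.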

In order to make sure that firms cannot set extremely high prices based on the parameter estimates, a maximum price of 2 is imposed. This can be considered as a price limit set by the government. Note that the equilibrium price is lower than 2 for any pair of locations. The estimates are economically sensible if $a_{i,t} > 0$ and $b_{i,t} > 0$. When these conditions hold, the firm uses the parameter estimates to maximize its perceived expected profit. When the conditions do not hold, the firm plays the maximum price if its previous price was the maximum price. Otherwise, a random price is drawn from the admissible price range, as in the initial two periods. An extra stopping criterion is implemented, because it may occur that the absolute price difference is smaller than the threshold value $\delta_1$ while the absolute difference between the perceived and actual demand is still too large. Simulations showed that when this is the case, the price can change substantially later in the process. Thus we apply a stronger convergence criterion than other papers on least squares learning. Perceived monopolist LS learning is implemented as follows:

1. (a). (𝑥1, 𝑦1) and (𝑥2, 𝑦2) are randomly drawn from the set 𝐿.

(b). 𝑝𝑖,1 and 𝑝𝑖,2 are randomly drawn from the set 𝑃.

2. At the end of period 2 the firms use OLS formulas (27) and (28) to obtain parameter estimates 𝑎𝑖,2 and 𝑏𝑖,2.

3. (a). In period $t \ge 3$ the firms ask the price $p_{i,t} = \min\left\{\frac{a_{i,t-1}}{2b_{i,t-1}}, 2\right\}$ if $a_{i,t-1} > 0$ and $b_{i,t-1} > 0$. Otherwise, $p_{i,t} = 2$ if $p_{i,t-1} = 2$. In any other case, $p_{i,t}$ is drawn from the uniform distribution on the set $P$.

(b). After the realization of demands, the firms update their parameter estimates using (27) and (28).

4. The process stops when the absolute price change is smaller than $\delta_1$ and the absolute difference between the perceived and actual demand is smaller than $\delta_2$ for both firms, $\max_i \{|p_{i,t} - p_{i,t-1}|\} < \delta_1 = 10^{-5}$ and $\max_i \{|a_{i,t-1} - b_{i,t-1} p_{i,t} - D_{i,t}|\} < \delta_2 = 10^{-3}$, or when period $T$ is reached.

Note that the parameter estimates, and hence the perceived optimal price, are more volatile at the beginning of the process, since each observation has a higher relative weight in the initial phase.
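The learning mechanism described above can be sketched as a small simulation. To keep the example short, it uses the special case $x_1 = x_2$, $y_1 < y_2$, in which the actual demand follows from (11); the general case needs the full piecewise demand function. The class and function names, the running-sum form of the OLS formulas and the example locations are assumptions of this sketch, not the implementation used for the reported results.

```python
import random

P_MAX = 2.0  # maximum admissible price


def actual_demands(p1, p2, y1, y2):
    """Actual demands when x1 = x2 and y1 < y2: Firm 1 serves consumers below the line (11)."""
    y_hat = (p2 - p1 + y2**2 - y1**2) / (2 * (y2 - y1))
    d1 = min(1.0, max(0.0, y_hat))
    return [d1, 1.0 - d1]


class LSLearner:
    """Perceived monopolist who regresses own demand on own price, D = a - b * p."""

    def __init__(self):
        self.n = 0
        self.sum_p = self.sum_d = self.sum_pp = self.sum_pd = 0.0
        self.a = self.b = 0.0
        self.last_price = None

    def observe(self, price, demand):
        """Add one observation and update the estimates (27) and (28) via running sums."""
        self.n += 1
        self.sum_p += price
        self.sum_d += demand
        self.sum_pp += price * price
        self.sum_pd += price * demand
        self.last_price = price
        var = self.sum_pp - self.sum_p**2 / self.n
        if self.n >= 2 and var > 0:
            cov = self.sum_pd - self.sum_p * self.sum_d / self.n
            self.b = -cov / var
            self.a = self.sum_d / self.n + self.b * self.sum_p / self.n

    def choose_price(self, rng):
        """Step 3(a); with no sensible estimates yet, this yields the random initial prices."""
        if self.a > 0 and self.b > 0:
            return min(self.a / (2 * self.b), P_MAX)
        if self.last_price == P_MAX:
            return P_MAX
        return rng.uniform(0.0, P_MAX)


def simulate(y1, y2, seed=0, delta1=1e-5, delta2=1e-3, max_periods=20_000):
    rng = random.Random(seed)
    firms = [LSLearner(), LSLearner()]
    for t in range(1, max_periods + 1):
        prices = [f.choose_price(rng) for f in firms]
        demands = actual_demands(prices[0], prices[1], y1, y2)
        # Step 4: small price change and small gap between perceived and actual demand.
        converged = t >= 3 and all(
            abs(p - f.last_price) < delta1 and abs(f.a - f.b * p - d) < delta2
            for f, p, d in zip(firms, prices, demands)
        )
        for f, p, d in zip(firms, prices, demands):
            f.observe(p, d)
        if converged:
            break
    return prices, t


prices, periods = simulate(y1=0.2, y2=0.9)
print(prices, periods)
```

Different seeds generally lead to different end prices, and a run may also stop only because the maximum number of periods is reached.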

3.2.2 Self-sustaining equilibrium

Brousseau and Kirman (1992) show that for a specific duopoly where firms use LS learning based on a misspecified model, the process will not converge. The price changes become smaller over time, since the relative weight of new observations decreases. The learning mechanism stops at some point. The process could be close to a so-called self-sustaining equilibrium (SSE) at that point. An SSE is an outcome in which the actual and the perceived demands of each firm coincide at the prices chosen by the firms.

Figure 6 shows the perceived and actual demands for a firm. The current price of the firm is denoted by 𝑝. 𝐷𝑝 and 𝐷𝑎 are respectively the perceived and actual demands for this price, given the price of the competitor. The left panel depicts the situation where the actual demand and perceived demand do not coincide at the given price 𝑝. It follows that the residual for this time period is not equal to zero. So, in the next period, when the observation (𝑝, 𝐷𝑎) is added to the regression data, the parameter estimates will be adjusted. In the right panel of the figure, the actual demand and perceived demand coincide (𝐷𝑝= 𝐷𝑎). When this is the case for both perceived monopolist LS learners, an SSE is reached. Neither firm will adjust its price in the next period, since the residual is equal to zero for the last observation. If the stopping criterion of the procedure is satisfied because the process is close to a regular SSE, the process is said to have converged to an SSE.

When both firms act as perceived monopolist LS learners, the following conditions have to hold in a regular SSE:

$$p_i^+ = \frac{a_i^+}{2b_i^+} \quad \text{for } i = 1, 2, \tag{29}$$

$$D_i(p_i^+, p_j^+) = a_i^+ - b_i^+ p_i^+ \quad \text{for } i = 1, 2. \tag{30}$$

The conditions given in (29) state that the price in an SSE should be the optimal price in the perception of the firms. The conditions in (30) state that the actual and perceived demands coincide. The coefficients 𝑎𝑖+ and 𝑏𝑖+ can be expressed in terms of the SSE prices, 𝑝𝑖+ and 𝑝𝑗+. This result is summarized in Proposition 1.

Proposition 1

Given the prices $p_1^+$ and $p_2^+$, an SSE is formed if the coefficients of the perceived demand function for firm $i$ are given by:

$$a_i^+ = 2D_i(p_i^+, p_j^+), \quad i = 1, 2, \tag{31}$$

$$b_i^+ = \frac{D_i(p_i^+, p_j^+)}{p_i^+}, \quad i = 1, 2. \tag{32}$$

The proof can be found in Appendix IV. We can find coefficients for which the model is in an SSE for any price vector $(p_1, p_2)$. Thus any admissible price vector may occur in an SSE, in particular the price equilibrium for a given pair of locations.
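As a quick check of Proposition 1 (the full proof is in the appendix), the coefficients (31) and (32) can be substituted into the SSE conditions (29) and (30). Writing $D_i$ for $D_i(p_i^+, p_j^+)$ and assuming $D_i > 0$ and $p_i^+ > 0$:

```latex
\frac{a_i^+}{2 b_i^+} = \frac{2 D_i}{2 D_i / p_i^+} = p_i^+,
\qquad
a_i^+ - b_i^+ p_i^+ = 2 D_i - \frac{D_i}{p_i^+}\, p_i^+ = D_i .
```

Both conditions hold, so the perceived demand line passes through the observed point and has its perceived profit maximum at that price.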

3.2.3 Simulation results

The model is simulated 2,000 times with two perceived monopolist LS learners. The locations, $(x_1, y_1)$ and $(x_2, y_2)$, are randomly drawn for each simulation run separately. So, the equilibrium prices differ across the runs. For 1,497 runs, the process ended up close to an SSE. For 459 runs, the process led to a situation where the prices of both firms increased (in general exponentially) over time. Such a process settles down only because a maximum price is imposed.² It seems that whether the process ends up this way depends more on the initial prices than on the locations. For the remaining runs, the process was not close enough to an SSE. The smaller the distance between the firms, the higher the probability that a process will not converge before period 20,000. For the majority of the runs that did not converge, one firm set too high prices and was driven out of the market.

Figure 7 depicts the wide variety of end prices for the converged runs.³ The mean end price is 0.8512, which is much higher than the mean equilibrium price. The differences between the end prices and equilibrium prices are plotted in Figure 8. Because of observations outside the linear part, the steepness of the demand curve is in general underestimated in comparison with that linear part. This results in higher end prices than in the equilibrium. The mean absolute difference between the end price and the equilibrium price is 0.4072.

² When no maximum price is imposed, firms will choose extremely high prices during the process. The process will move to dynamics with random prices, because the economic sensibility restrictions do not hold at a certain moment.

³ Note that in the remainder of this thesis, end prices refer to prices at the end of the process for the converged […]

Figure 9 depicts a typical time series of a process which converged to an SSE (left) and of a process with exponentially increasing prices (right). In Figure 10, the perceived demand curves are plotted for a process where one firm learns that there is no market. This was the case for 26 simulation runs. The perceived demand curve of Firm 1 moves slowly to the x-axis, while the perceived demand curve of Firm 2 already intersects with the actual demand curve at the chosen price. Firm 1 does not ask a lower price, because the estimate 𝑎1 is so small that it expects very little demand at any price level.

Because of symmetry, there is no difference in the outcomes between the 8 standard types of location competition. These competitions only differ in mathematical terms. However, the special cases, where firms have the same valuation on at least one characteristic, are fundamentally different. Therefore, we run 2,000 simulations where we set one randomly drawn characteristic value at the same level for both firms.⁴ Of these runs, 76.10% converged to an SSE with a mean end price of 0.8238 and a mean absolute difference with the equilibrium price of 0.5020, and 19.05% led to the situation where the prices of both firms increased over time. For the majority of the remaining runs, one firm learns that there is no market.

⁴ For these simulations, the firms are not allowed to have the same valuation on both characteristics.

Figure 7: Scatterplot (left) and histogram (right) of the end prices, for the perceived monopolist LS learner.

Figure 8: Distribution of the difference between the end price and equilibrium price, for the perceived monopolist LS learner.

We also run 2,000 simulations where firms are located at the same position. For these simulations, we use the weaker stopping criterion that we applied for the myopic best response learner. The stopping criterion is satisfied for 85.80% of the runs. 1,274 runs converged to the situation where the absolute difference between the end prices of the firms is smaller than 10⁻³. Because the price differences are so small, the firm that covers the full market changes over time. Note that such processes are not close to an SSE, since the differences between the perceived and actual demands are large. For the other runs where the stopping criterion is satisfied, one firm learns that there is no market. The mean end price is 0.7808 for the converged runs.

Simulations where the randomly drawn locations are kept fixed across the runs show that the process can lead to many different end prices. The end prices for the converged runs are in general quite symmetrically distributed. However, the number of runs which converged to an SSE, the mean end prices and the mean absolute difference between the end prices and the equilibrium prices vary for different location pairs. In the remaining analysis we focus on SPNE locations, as it is important to know how LS learning performs under equilibrium locations.

⁵ The perceived demand curves for Firm 1 are plotted for 𝑡=500,1000,…,3000,3500. The price levels and remaining demand curves are given as they are at the end of the simulation run, since they remained approximately unchanged after 𝑡=500.

Figure 9: Typical time series of a process which converged to an SSE (left) and with increasing prices (right), for the perceived monopolist LS learner. The locations for Firm 1 and 2 are respectively (0.75,0.33), (0.63,0.41) (left) and (0.64,0.54), (0.31,0.16) (right).

Figure 10: The development of the perceived demand curve for Firm 1 (left) and the perceived demand curve for Firm 2 (right),⁵ for a process where Firm 1 learns that there is no market. The

The 2,000 simulation runs with SPNE locations also converged either to an SSE (71.05%) or to the situation where the maximum price is played (28.90%).⁶ Two of the SSEs formed the price equilibrium.⁷

Figure 11 shows that there is still a big variety in SSE prices when simulations are executed with SPNE locations only. The scatterplot shows that end prices are quite close to each other when they are bigger than one. The mean end price is 0.9002, and the mean absolute difference between the end price and equilibrium price is 0.1939. So, end prices are on average closer to the equilibrium price than in the case with randomly drawn locations. Remarkably, the end prices are below the equilibrium price for the majority of the runs, whereas it was the other way around with randomly drawn locations.

3.3 Perceived duopolist LS learners

In this subsection, firms take their competitor into account when estimating the demand function. So, firms include also the past price observations of the competitor in their linear regression model. This means that the description of the market structure is only correct on the price interval for which the demand curve is linear.

⁶ 1 run did not satisfy the stopping criterion before period 20,000. For this run the process was close to an SSE.

⁷ A process is considered to have converged to the price equilibrium when the absolute difference between the end price and the equilibrium price is smaller than 10⁻³ for both firms.

Figure 11: Scatterplot (left) and histogram (right) of the end prices, for the perceived monopolist LS learner with SPNE locations.

3.3.1 The learning mechanism

When a firm assumes that the demand for its good depends linearly on its own price and the price of its competitor, the perceived demand is given by:

$$D_i^p(p) = \alpha - \beta_i p_i + \beta_j p_j + \varepsilon_i, \tag{33}$$

where 𝑝𝑗 denotes the price of the competitor. Although a perceived duopolist LS learner takes the price of the competitor into account, its description of the market structure is incorrect, since the linear dependency does not hold in general for the full price range. Similar to the case of the perceived monopolist, the best response price can be derived as:

$$p_i = \frac{\alpha + \beta_j p_j}{2\beta_i}. \tag{34}$$

The OLS estimators are given by:

$$B_{i,t} = (X_{i,t}' X_{i,t})^{-1} X_{i,t}' Q_{i,t}, \tag{35}$$

where

$$X_{i,t} = \begin{pmatrix} 1 & -p_{i,1} & p_{j,1} \\ \vdots & \vdots & \vdots \\ 1 & -p_{i,t} & p_{j,t} \end{pmatrix}, \quad Q_{i,t} = \begin{pmatrix} D_{i,1} \\ \vdots \\ D_{i,t} \end{pmatrix} \quad \text{and} \quad B_{i,t} = \begin{pmatrix} a_{i,t} \\ b_{i,t} \\ b_{j,t} \end{pmatrix}.$$

Firms use naïve expectations about the price of the competitor. So, they play the best response price given their perceived demand function. The same restrictions on the perceived best response are imposed as for the case of the perceived monopolist. This results in the following implementation of the perceived duopolist.

1. (a). (𝑥1, 𝑦1) and (𝑥2, 𝑦2) are randomly drawn from the set 𝐿.

(b). 𝑝𝑖,1,𝑝𝑖,2 and 𝑝𝑖,3 are randomly drawn from the set 𝑃.

2. At the end of period 3 the firms use OLS formula (35) to obtain parameter estimates 𝑎𝑖,3, 𝑏𝑖,3 and 𝑏𝑗,3.

3. (a). In period 𝑡 ≥ 4 the firms ask the price $p_{i,t} = \max\left\{\min\left\{\frac{a_{i,t-1} + b_{j,t-1}\,p_{j,t-1}}{2 b_{i,t-1}},\, 2\right\},\, 0\right\}$ if $a_{i,t-1} > 0$ and $b_{i,t-1} > 0$. Otherwise, $p_{i,t} = 2$ if $p_{i,t-1} = 2$. In any other case, $p_{i,t}$ is drawn from the uniform distribution on the set 𝑃.

(b). After the realization of demands, the firms update their parameter estimates using (35).

4. The process stops when the absolute price change is smaller than 𝛿1 and the absolute difference between the perceived and actual demand is smaller than 𝛿2 for both firms: $\max_i \{|p_{i,t} - p_{i,t-1}|\} < \delta_1 = 10^{-5}$ and $\max_i \{|a_{i,t-1} - b_{i,t-1}\,p_{i,t} + b_{j,t-1}\,p_{j,t-1} - D_{i,t}|\} < \delta_2 = 10^{-3}$, or when period 𝑇 = 20,000 is reached.
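The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the thesis: the function `standin_demand` is a hypothetical linear demand clipped to the unit market, used here in place of the actual two-dimensional Hotelling demand, and the initial prices are passed in by the caller rather than drawn in step 1. All function and parameter names are assumptions.

```python
import numpy as np

P_MAX = 2.0                    # upper bound of the price set P = [0, 2]
DELTA_1, DELTA_2 = 1e-5, 1e-3  # stopping tolerances from step 4


def standin_demand(p_own, p_rival, slope=0.5):
    """HYPOTHETICAL demand (not the thesis's demand): linear, clipped to [0, 1]."""
    return float(np.clip(0.5 - slope * (p_own - p_rival), 0.0, 1.0))


def ols(own, rival, demand):
    """Estimate (a, b_i, b_j) in D = a - b_i * p_i + b_j * p_j, as in (35)."""
    X = np.column_stack([np.ones(len(own)), -np.asarray(own), np.asarray(rival)])
    coef, *_ = np.linalg.lstsq(X, np.asarray(demand), rcond=None)
    return coef


def next_price(coef, p_own_prev, p_rival_prev, rng):
    """Step 3(a): clipped perceived best response with the stated fallbacks."""
    a, b_i, b_j = coef
    if a > 0 and b_i > 0:
        return max(min((a + b_j * p_rival_prev) / (2 * b_i), P_MAX), 0.0)
    if p_own_prev == P_MAX:
        return P_MAX
    return rng.uniform(0.0, P_MAX)


def simulate(init_1, init_2, demand=standin_demand, T=20_000, seed=0):
    """Run the learning process from three initial prices per firm."""
    rng = np.random.default_rng(seed)
    prices = [list(init_1), list(init_2)]
    demands = [[demand(p, q) for p, q in zip(init_1, init_2)],
               [demand(q, p) for p, q in zip(init_1, init_2)]]
    while len(prices[0]) < T:
        # Estimates based on all observations up to the previous period.
        coefs = [ols(prices[i], prices[1 - i], demands[i]) for i in (0, 1)]
        prev = (prices[0][-1], prices[1][-1])
        new = [next_price(coefs[i], prev[i], prev[1 - i], rng) for i in (0, 1)]
        real = [demand(new[0], new[1]), demand(new[1], new[0])]
        for i in (0, 1):
            prices[i].append(new[i])
            demands[i].append(real[i])
        price_gap = max(abs(new[i] - prev[i]) for i in (0, 1))
        demand_gap = max(
            abs(coefs[i][0] - coefs[i][1] * new[i]
                + coefs[i][2] * prev[1 - i] - real[i])
            for i in (0, 1))
        if price_gap < DELTA_1 and demand_gap < DELTA_2:   # step 4
            break
    return (prices[0][-1], prices[1][-1]), len(prices[0])


end_prices, periods = simulate([0.9, 1.1, 1.0], [1.0, 0.95, 1.2])
```

In this example all observations lie on the linear part of the stand-in demand, so the regression recovers it exactly and the prices approach the stand-in equilibrium price of 1. With the thesis's demand function, observations outside the linear part would distort the estimates, which is the mechanism behind the variety of SSEs reported below.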

3.3.2 Self-sustaining equilibrium

We will now describe the SSE prices for the perceived duopolist LS learners. In a regular SSE, the following conditions have to hold:

Given the prices 𝑝1+ and 𝑝2+, an SSE is formed if the coefficients of the perceived demand function are, two periods in a row, given by:

$$a_i^+ = 2D_i(p_i^+, p_j^+) - b_j^+ p_j^+, \quad i = 1,2, \tag{38}$$

$$b_i^+ = \frac{D_i(p_i^+, p_j^+)}{p_i^+}, \quad i = 1,2. \tag{39}$$

Appendix V provides the proof. So, as for the perceived monopolist LS learner, we can find coefficients for which the model is in an SSE for any price vector (𝑝1, 𝑝2). Thus any admissible price vector may occur in an SSE, in particular the price equilibrium for a given location pair.
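As an informal check of (38) and (39), and not a substitute for the proof in Appendix V, the coefficients can be recovered in two lines. The sketch assumes that the SSE conditions have the same form as for the perceived monopolist: the SSE price is the perceived best response (34), and the perceived demand equals the actual demand. The shorthand $D_i^+$ for $D_i(p_i^+, p_j^+)$ is introduced here for brevity.

```latex
% Assumed SSE conditions for firm i, with D_i^+ := D_i(p_i^+, p_j^+):
p_i^+ = \frac{a_i^+ + b_j^+ p_j^+}{2 b_i^+}
\quad\Longrightarrow\quad
a_i^+ + b_j^+ p_j^+ = 2 b_i^+ p_i^+ ,
\qquad
a_i^+ - b_i^+ p_i^+ + b_j^+ p_j^+ = D_i^+ .

% Substituting the first condition into the second:
2 b_i^+ p_i^+ - b_i^+ p_i^+ = D_i^+
\quad\Longrightarrow\quad
b_i^+ = \frac{D_i^+}{p_i^+},
\qquad
a_i^+ = 2 D_i^+ - b_j^+ p_j^+ .
```

Under these two conditions alone the cross-price coefficient $b_j^+$ is not pinned down, which fits the observation that many different coefficient combinations can support the same price vector.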

3.3.3 Simulation results

The model is simulated with two perceived duopolist LS learners, where the locations are drawn randomly for each simulation run. For 98.85% of the 2,000 simulations, the process converged to an SSE.⁸

In Figure 12, the distribution of the end prices is given. As the scatterplot shows, the prices are more concentrated along the diagonal than for the perceived monopolist LS learner. The average end price and absolute difference with the equilibrium price are respectively 0.8496 and 0.3763 for the converged runs, so both are only slightly lower than for the perceived monopolist LS learner. Figure 13 shows that the majority of the SSE prices are higher than the equilibrium price, as for the perceived monopolist LS learner. However, there is a peak around zero: firms quite often come close to the equilibrium price. For 76 runs the price equilibrium is achieved. For any process that converged to the price equilibrium, the indifference line passes through 𝑅2. When the three initial prices of the firms and the price equilibrium belong to the linear part of the demand curve, the process converges to the price equilibrium. However, when only one of the first few observations is not in the linear part, but close to it, the price equilibrium can still be achieved. Whether the process converges to the price equilibrium depends on two other factors besides the initial prices: the angle 𝛼 and the distance between the firms. When 𝛼 = 45°, the demand curve does not contain a linear part. The more 𝛼 deviates from 45°, the bigger the linear part of the demand curve is. The smaller the distance between the firms, the steeper the demand curve, and so the smaller the price interval that belongs to the linear part of the demand curve. The probability of convergence to the price equilibrium is positively related to the length of the linear part of the demand curve, divided by the length of the price interval 𝑃.

⁸ For 11 runs the maximum price was asked by at least one of the firms, 7 runs were close to an SSE. 5 runs were

Figure 14 depicts a typical time series of a firm’s price for a process which converged to an SSE (left) and to the price equilibrium (right). The special price levels for the firm considered are given in the graphs. Note that the demand belongs to the linear part when there are two special price levels below and two special price levels above the price level of the firm.

Figure 12: Scatterplot (left) and histogram (right) of the end prices, for the perceived duopolist LS learner.

Figure 13: Distribution of the difference between the end price and equilibrium price, for the perceived duopolist LS learner.

The model is also simulated for the special cases, where one characteristic value is the same for both firms.⁹ For these cases, the description of the market structure is correctly specified by the perceived duopolist LS learner. The process converged to an SSE for 98.50% of the runs. However, the price equilibrium is not always reached, despite the correct specification of the market structure. The reason is that an observation lies outside the linear part when the demand is zero or one for a firm, so with such observations a firm cannot learn the parameters of the linear part correctly. This leads to end prices different from the equilibrium price. The mean end price and the mean difference with the equilibrium price are respectively 0.7837 and 0.4329. So, the mean difference with the equilibrium price is even bigger than for the simulations where no characteristic value is set at the same value. The mean end price is lower, because of stronger competition.

We also run 2,000 simulations where firms are located at the same position. For these simulations, we use the weaker stopping criterion we applied for the myopic best response learner. For 1,555 runs, the process converged to end prices with an absolute difference smaller than 10⁻³. The mean end price for these runs is 0.6435.

Simulations where randomly drawn locations are kept fixed over the simulation runs show that many different SSEs can be achieved, as for the perceived monopolist LS learner. Next we discuss the results for SPNE locations, for which the market structure is correctly specified by the perceived duopolist LS learner. The model is simulated 2,000 times. The process almost always converged to an SSE.¹⁰ For 848 of the runs, the process ended up in an SPNE. Figure 15 shows the distribution of end prices for the converged runs. The end price was higher than the equilibrium price for almost every firm whose price did not converge to the SPNE prices. The mean end price and mean absolute difference with the equilibrium price are respectively 1.1185 and 0.1188. So, the end prices are on average closer to the equilibrium price than for the perceived monopolist LS learner, and much closer than in the case where locations are randomly drawn.

⁹ For these simulations, the firms are not allowed to have the same value on both characteristics.

¹⁰ Except for 5 runs, for which the maximum price was asked by one of the firms.

Figure 14: Typical time series of a firm’s price for a process which converged to an SSE (left) and to the price equilibrium (right), for the perceived duopolist LS learner. The locations for Firm 1 and 2 are respectively (0.37,0.23), (0.18,0.55) (left) and (0.43,0.39), (0.92,0.53) (right).

We also run simulations with randomly drawn locations, where both firms use only the last 25 observations to estimate the parameters. The stopping criterion mentioned earlier is not suitable for these simulations, since a firm can later change its price by more than 𝛿1 when earlier observations are truncated. Instead, the runs are stopped before period 20,000 when $\max_i \{|p_{i,t} - p_i^*|\} < \delta_1$, where 𝑝𝑖∗ is the equilibrium price for firm 𝑖 given the pair of locations.
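Estimating from a truncated sample only changes which rows enter the regression. The sketch below illustrates this under the same linear specification as (33); the function names, the `window` parameter and the example data are assumptions made for illustration.

```python
import numpy as np


def windowed_ols(own, rival, demand, window=25):
    """OLS estimate of (a, b_i, b_j) from the most recent `window` observations."""
    own = np.asarray(own, dtype=float)[-window:]
    rival = np.asarray(rival, dtype=float)[-window:]
    demand = np.asarray(demand, dtype=float)[-window:]
    X = np.column_stack([np.ones(len(own)), -own, rival])
    coef, *_ = np.linalg.lstsq(X, demand, rcond=None)
    return coef


def at_price_equilibrium(prices, eq_prices, delta_1=1e-5):
    """Stopping rule for the truncated sample: max_i |p_i,t - p_i*| < delta_1."""
    return max(abs(p - q) for p, q in zip(prices, eq_prices)) < delta_1


# Illustrative data: the first 15 observations follow one linear relation,
# the last 25 another; only the latter should drive the estimate.
rng = np.random.default_rng(1)
own = rng.uniform(0.0, 2.0, 40)
rival = rng.uniform(0.0, 2.0, 40)
demand = np.where(np.arange(40) < 15,
                  0.9 - 0.1 * own + 0.3 * rival,
                  0.5 - 0.4 * own + 0.2 * rival)
coef = windowed_ols(own, rival, demand)
```

Because old observations drop out, early data from outside the linear part of the demand curve stop influencing the estimates once the process stays near one region.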

For 98.70% of the 2,000 simulation runs, the process converged to the price equilibrium.¹¹ For the majority of the runs, the process approximately visited one or a few SSEs before it converged. When the process is in an SSE, the actual and perceived demand curves intersect in one point only. Truncation of older observations leads to changes in the perceived demand curve, and so to changes in the price. But when a process moves to the price equilibrium, the approximation of the actual demand curve around that equilibrium price becomes better. At a certain period, the perceived demand function coincides with the linear part of the actual demand function, if the price equilibrium is located there. When the price equilibrium is located outside the linear part, the actual and perceived demands are tangent to each other at the moment of convergence. In Figure 16 typical time series of a firm’s price and the corresponding demand curves are plotted for a process where the price equilibrium is in the linear part (upper) and not in the linear part (lower).

Separate simulations, where one characteristic value is set at the same randomly drawn level for both firms, converged to the price equilibrium for 1,997 of the 2,000 runs. The mean end price is 0.3449 for these runs. For the simulations where firms are located at the same place, all of the 2,000 runs satisfied the stopping criterion before period 𝑇 = 20,000. The process converged to the price equilibrium for all of the 2,000 simulation runs where firms are located at the SPNE locations.

¹¹ 22 of the non-converged runs were dominated by random prices, because of a negative parameter estimate for 𝛼. For 4 runs, the maximum price was asked by both firms.

Figure 15: Scatterplot (left) and histogram (right) of the end prices, for the perceived duopolist LS learner with SPNE locations.

3.4 Comparing prices and profits

In Table 1 the end prices and profits are compared for the different learning methods we considered in this section. The equilibrium prices and profits are approximately twice as high for SPNE locations. Since the distance between firms is on average smaller when the locations are randomly drawn, the price competition is stronger. This results in lower prices and profits. The mean end price and profit are much higher for the LS learners than for the more rational myopic best response learner, when locations are randomly drawn. In that case the perceived monopolist chooses the highest end prices, while the more rational perceived duopolist generates the highest profits. Remarkably, the mean end price and profit are almost equal for the two LS learners when locations are randomly drawn, while they are very different for SPNE locations. For SPNE locations, the average price and profit are lower than in the equilibrium for the perceived monopolist and higher for the perceived duopolist.

Figure 16: Typical time series of a firm’s price for a process which converged to the price equilibrium in the linear part (upper left) and to the price equilibrium out of the linear part (lower left), for the perceived duopolist LS learner using the last 25 observations. The corresponding perceived demand curves of Firm 1 are plotted in the panels on the right. The locations for Firm 1 and 2 are respectively (0.63,0.43), (0.77,0.11) (upper) and (0.01,0.78), (0.41,0.39) (lower).

4. Learners with naïve expectations

Next we turn to learning in the whole game, not only in the pricing stage. We consider different learning methods which differ in the degree of rationality they require from firms. In this section we focus on naïve learners who know the market structure. In Section 5 we will relax the assumption of knowing the market structure.

4.1 The learning mechanisms

Before we formally describe the behaviour of the different learners, we define 𝑝𝑖∗((𝑥𝑖, 𝑦𝑖), (𝑥𝑗, 𝑦𝑗)) as the equilibrium price for firm 𝑖 given the locations (𝑥𝑖, 𝑦𝑖) and (𝑥𝑗, 𝑦𝑗) of firm 𝑖 and its competitor respectively. Remember from Section 3 that 𝑝𝑖𝐵𝑅(𝑝𝑗, (𝑥𝑖, 𝑦𝑖), (𝑥𝑗, 𝑦𝑗)) denotes the best response price. 𝐿𝐵𝑅𝑖 ((𝑥𝑗, 𝑦𝑗), 𝑝𝑖, 𝑝𝑗) denotes the best response location of firm 𝑖, given the location of the competitor and the price levels 𝑝𝑖 and 𝑝𝑗. The best response location is determined numerically by comparing the profits of all locations in the set $L = \{\tfrac{0}{50}, \tfrac{1}{50}, \dots, \tfrac{50}{50}\}^2$. The best response price is determined in the same way as in Section 3. The three different types of naïve learners behave as described below.

type 1: naïve best response location and naïve best response price

1st stage:

𝐿𝑖,𝑡 = (𝑥𝑖,𝑡, 𝑦𝑖,𝑡) =

𝐿𝐵𝑅𝑖,𝑡 ((𝑥𝑗,𝑡−1, 𝑦𝑗,𝑡−1), 𝑝𝑖𝐵𝑅(𝑝𝑗,𝑡−1, (𝑥𝑖,𝑡, 𝑦𝑖,𝑡), (𝑥𝑗,𝑡−1, 𝑦𝑗,𝑡−1)), 𝑝𝑗,𝑡−1) (40)

2nd stage:

𝑝𝑖,𝑡 = 𝑝𝑖,𝑡𝐵𝑅(𝑝𝑗,𝑡−1, (𝑥𝑖,𝑡, 𝑦𝑖,𝑡), (𝑥𝑗,𝑡, 𝑦𝑗,𝑡)) (41)

| Locations | Learning rule | 𝑝̅𝑖𝐸𝑁𝐷 | 𝑣𝑎𝑟(𝑝𝑖𝐸𝑁𝐷) | 𝜋̅𝑖𝐸𝑁𝐷 | 𝑣𝑎𝑟(𝜋𝑖𝐸𝑁𝐷) |
|---|---|---|---|---|---|
| random locations | myopic best response learner/equilibrium | 0.4765 | 0.0525 | 0.2410 | 0.0148 |
| random locations | perceived duopolist LS learner | 0.8496 | 0.0671 | 0.4258 | 0.0209 |
| random locations | perceived monopolist LS learner | 0.8512 | 0.0529 | 0.4206 | 0.0283 |
| SPNE locations | myopic best response learner/equilibrium | 1.0000 | 0.0000 | 0.5000 | 0.0000 |
| SPNE locations | perceived duopolist LS learner | 1.1185 | 0.0219 | 0.5588 | 0.0056 |
| SPNE locations | perceived monopolist LS learner | 0.9002 | 0.0485 | 0.4471 | 0.0123 |

Table 1: Mean end prices and profits for different learning rules. 𝑝𝑖𝐸𝑁𝐷 and 𝜋𝑖𝐸𝑁𝐷 are respectively the

type 2:

1st stage:

𝐿𝑖,𝑡 = (𝑥𝑖,𝑡, 𝑦𝑖,𝑡) =

𝐿𝐵𝑅𝑖,𝑡 ((𝑥𝑗,𝑡−1, 𝑦𝑗,𝑡−1), 𝑝𝑖𝐵𝑅(𝐸1[𝑝𝑗,𝑡], (𝑥𝑖,𝑡, 𝑦𝑖,𝑡), (𝑥𝑗,𝑡−1, 𝑦𝑗,𝑡−1)), 𝐸1[𝑝𝑗,𝑡]) (42)

with 𝐸1[𝑝𝑗,𝑡] = 𝑝𝑗𝐵𝑅(𝑝𝑖,𝑡−1, (𝑥𝑗,𝑡−1, 𝑦𝑗,𝑡−1), (𝑥𝑖,𝑡, 𝑦𝑖,𝑡)).

2nd stage:

𝑝𝑖,𝑡 = 𝑝𝑖,𝑡𝐵𝑅(𝐸2[𝑝𝑗,𝑡], (𝑥𝑖,𝑡, 𝑦𝑖,𝑡), (𝑥𝑗,𝑡, 𝑦𝑗,𝑡)), (43)

with 𝐸2[𝑝𝑗,𝑡] = 𝑝𝑗𝐵𝑅(𝑝𝑖,𝑡−1, (𝑥𝑗,𝑡, 𝑦𝑗,𝑡), (𝑥𝑖,𝑡, 𝑦𝑖,𝑡)).

The type 2 learners have naïve expectations about the location of the competitor, like the type 1 learners. However, the type 2 learners think one step further about the competitor’s price. They know that it is reasonable to assume that the competitor will update its price when it is faced with a location update of its rival. Therefore, a type 2 learner expects the competitor to play the best response price to the learner’s own price in the previous period, given the learner’s new location. Based on these expectations, the type 2 learners determine the optimal location. In the second stage, they play the best response price to the myopic best response price of the competitor, given the known locations. So, the type 2 learners have so-called level-2 beliefs about the price in the cognitive hierarchy approach, presented in Stahl and Wilson (1995).

type 3: naïve best response location and equilibrium price 1st stage:

𝐿𝑖,𝑡 = (𝑥𝑖,𝑡, 𝑦𝑖,𝑡) =

𝐿𝐵𝑅𝑖,𝑡 ((𝑥𝑗,𝑡−1, 𝑦𝑗,𝑡−1), 𝑝𝑖∗((𝑥𝑖,𝑡, 𝑦𝑖,𝑡), (𝑥𝑗,𝑡−1, 𝑦𝑗,𝑡−1)) , 𝑝𝑗∗((𝑥𝑗,𝑡−1, 𝑦𝑗,𝑡−1), (𝑥𝑖,𝑡, 𝑦𝑖,𝑡))) (44)

2nd stage:

𝑝𝑖,𝑡= 𝑝𝑖,𝑡∗ ((𝑥𝑖,𝑡, 𝑦𝑖,𝑡), (𝑥𝑗,𝑡, 𝑦𝑗,𝑡)) (45)

The type 3 learners are myopic best response learners in the first stage, given that they are rational in the second stage. When the second stage starts, the firms are in general faced with a new location of the competitor. Based on the new locations, the firms choose the equilibrium price. Note that these equilibrium prices differ from the equilibrium prices the type 3 learners take into account in the first stage, because the latter correspond to the previous location of the competitor.

The lemmas of Tabuchi (1994) discussed in Section 2 give some insight into the best response location, given that both firms play the equilibrium price in the second stage of the game. The simulations, however, showed very clear behaviour of the firms in terms of location decisions. It seems that the best response location is the midpoint of the side that is farthest from the location of the competitor. Figure 17 depicts the areas for which the different midpoints of the sides are the best response location, given that the price equilibrium is played in the second stage of the game. So, the type 3 learners will always choose a midpoint of one side.
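The rule observed in the simulations can be written down directly. The sketch below only encodes the verbal rule (the midpoint of the side farthest from the competitor); it does not compute profits and does not reproduce the exact regions of Figure 17. The function name and the tie-breaking order are assumptions.

```python
def farthest_side_midpoint(rival_location):
    """Midpoint of the side of the unit square that is farthest from the rival."""
    x, y = rival_location
    # Distance from the rival to each side, paired with that side's midpoint.
    candidates = [
        (x, (0.0, 0.5)),        # left side,   x = 0
        (1.0 - x, (1.0, 0.5)),  # right side,  x = 1
        (y, (0.5, 0.0)),        # bottom side, y = 0
        (1.0 - y, (0.5, 1.0)),  # top side,    y = 1
    ]
    # Ties are broken by list order; this is an arbitrary choice.
    return max(candidates, key=lambda c: c[0])[1]


# A rival near the bottom edge pushes the firm to the top midpoint.
choice = farthest_side_midpoint((0.4, 0.1))
```

Comparing distances to the sides gives the same ranking as comparing Euclidean distances to the four midpoints, so either reading of the rule selects the same location.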

The three different types of learners with naïve expectations are implemented as follows:

1. (a). (𝑥1,1, 𝑦1,1) and (𝑥2,1, 𝑦2,1) are randomly drawn from the set 𝐿.

(b). 𝑝1,1 and 𝑝2,1 are randomly drawn from the set 𝑃 = [0,2].

2. (a). In the first stage of period 𝑡 ≥ 2, the firms determine the optimal location in the set 𝐿 numerically, using (40),(42) or (44).

(b). In the second stage of period 𝑡 ≥ 2, the firms determine the price using (41), (43) or (45).

3. The process stops when both firms do not change location and change their price by less than 𝛿1, in comparison with 2 periods ago: $(x_{1,t}, y_{1,t}, x_{2,t}, y_{2,t}) = (x_{1,t-2}, y_{1,t-2}, x_{2,t-2}, y_{2,t-2})$ and $\max_i \{|p_{i,t} - p_{i,t-2}|\} < \delta_1 = 10^{-5}$, or when period 𝑇 = 250 is reached.¹²

Figure 17: Areas for which the midpoint of the sides with the same color is the best response location.

4.2 Simulation results

This subsection contains the simulation results for the three different learners with naïve expectations about the location of the competitor. This is followed by a comparison of the end prices and profits. Thereafter, the situation is considered where the learners use an adaptive updating algorithm for their location choices. Finally, the model is analysed where the learners update their locations infrequently.

4.2.1 Type 1 learners

The 2,000 simulations for the type 1 learners show that the process converged in 99.85% of the cases to a 2-cycle in which the firms interchange their locations and prices.¹³ In the 2-cycle, a firm plans to imitate the location choice and undercut the price of the competitor. But in the second stage of a period, the firm is confronted with the fact that the competitor, who had the same game plan, has relocated to the firm’s own previous location.
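The stopping rule from Section 4.1 compares the current period with the situation two periods earlier, which is what detects such a 2-cycle. A minimal sketch, assuming the histories are stored as lists with one entry per period (the data layout and function names are assumptions):

```python
def reached_two_cycle(locations, prices, delta_1=1e-5):
    """Stopping rule: same locations and nearly the same prices as 2 periods ago.

    locations[t] = ((x1, y1), (x2, y2)) and prices[t] = (p1, p2).
    """
    if len(locations) < 3:
        return False
    same_locations = locations[-1] == locations[-3]
    price_gap = max(abs(p - q) for p, q in zip(prices[-1], prices[-3]))
    return same_locations and price_gap < delta_1


def firms_swap(locations):
    """True if the firms exchanged locations between the last two periods."""
    (a1, a2), (b1, b2) = locations[-2], locations[-1]
    return a1 == b2 and a2 == b1


# Illustrative history of a 2-cycle in which locations and prices are swapped.
locs = [((0.2, 0.4), (0.6, 0.8)), ((0.6, 0.8), (0.2, 0.4)), ((0.2, 0.4), (0.6, 0.8))]
prs = [(0.5, 0.3), (0.3, 0.5), (0.5, 0.3)]
stopped = reached_two_cycle(locs, prs)
```

Exact equality of locations is reasonable here because locations are restricted to the grid 𝐿; on a continuous location space a tolerance would be needed.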

For 903 of the runs, the location 2-cycle did not consist of the initial locations of the firms. This means that in a certain period it was more attractive for a firm to relax the price competition a little and differentiate its product than to imitate the competitor’s location and undercut its price. For 680 runs, only one of the initial locations became a 2-cycle point. In Figure 18, the initial locations which did not become a point of the 2-cycle and the locations which replaced them as 2-cycle point are plotted. The figure shows that locations at the borders are replaced by locations closer to the centre of the grid.

¹² The simulation results discussed in Subsection 4.2 show that the process converges to a specific 2-cycle in which firms interchange location and price. Based on this fact, we use a different stopping criterion than in the previous section.

¹³ For 3 simulation runs, the process did not converge within 250 periods. Both firms were located at the same place period after period and undercut each other’s previous prices.

Figure 18: Initial locations which did not become a point of the 2-cycle (left) and locations which replaced initial locations as 2-cycle point (right), for the type 1 learner.

The prices converged to a 2-cycle in which the firms swap their prices.¹⁴ Figure 19 (left) shows that the difference between the two 2-cycle prices can be quite large. Figure 19 (right) shows the right-skewed distribution of the end prices (i.e. 2-cycle prices). The mean end price is 0.4228. The end price is quite often close to the equilibrium price, as depicted in Figure 20. The mean absolute difference with the equilibrium price is 0.0566.

Figure 21 depicts typical price dynamics of a process for which the initial locations became both cycle points (upper) and for which only one of the initial locations became a 2-cycle point (lower). The corresponding locations are plotted in the right panels of the figure. The black arrows denote the movements of the firms in the 2-cycle. The green arrow in the lower left panel denotes the movements before the location 2-cycle is reached.

Simulations where initial locations and prices are randomly drawn from a small neighbourhood of an SPNE show that an SPNE is not even locally stable. For these runs, the process converged to a 2-cycle consisting of the initial locations.

¹⁴ For 16 simulation runs, the location combination was symmetric along the indifference curve which divided the market equally. This resulted in the equilibrium price for both firms.

Figure 19: Scatterplot (left) and histogram (right) of the end prices, for the type 1 learner.

Figure 20: Distribution of the difference between the end price and equilibrium price, for the type 1 learner.

4.2.2 Type 2 learners

The process converged to a 2-cycle, as for the type 1 learner.¹⁵ In the 2-cycle, a firm plans to imitate the competitor’s location and to undercut its own previous price by two times the minimum price unit, because, based on its beliefs, the firm expects that the competitor will undercut the firm’s price by the minimum price unit. In the second stage of a period, a firm is confronted with the fact that the competitor has relocated to the firm’s own previous location.

For 806 simulation runs, the location 2-cycle did not consist of the two initial locations of the firms. For 177 of them, neither firm was located at its initial location. The initial locations which did not become a 2-cycle point were quite randomly distributed over the grid, as shown in Figure 22 (left). The 2-cycle points which were not initial locations are plotted in the right panel of Figure 22. The replacing location is in general the midpoint of the side which is the farthest from the previous location of the competitor in the period that the replacing location is chosen for the first time. The type 2 learners differentiate their product more than the type 1 learners when they deviate from the imitation strategy. The incentive to differentiate the product is higher for the type 2 learners, since they expect that the competitor will then increase its price.

¹⁵ For 43 of the 2,000 simulation runs, the process did not satisfy the stopping criterion before period 𝑇=250. The firms were positioned at the same location as each other in both phases of the 2-cycle. They undercut their own previous price by two times the minimum price unit. Logical reasoning tells us that these processes would end up with prices equal to zero if they were not interrupted.

Figure 21: Typical price dynamics (left) and the corresponding locations (right) of a process for which the initial locations became both 2-cycle points (upper) and for which only one of the initial locations became a 2-cycle point (lower), for the type 1 learner. The black line in the lower left panel denotes the period in which Firm 2 moved for the first time to the ‘new’ 2-cycle location.

For 74 simulation runs, the process ended up in a so-called 2-cycle SPNE, a situation where the 2-cycle consists of an SPNE location combination. The firms alternate between these locations and play the equilibrium price in every period. In total there are 158 runs for which the process converged to the price equilibrium. This occurred when the location combination was symmetric along the indifference curve which divided the market equally. In general the process converged to a 2-cycle in which the firms swap their prices. Figure 23 shows the distribution of the 2-cycle prices (i.e. end prices). The scatterplot and histogram illustrate that the 2-cycle prices are much more concentrated along the diagonal and that the distribution is more symmetric than for the type 1 learner. The peaks at 0.5 and 1 in the right panel of Figure 23 can be explained by the fact that 0.5 and 1 are the equilibrium prices when the firms are located at midpoints of adjacent sides and of opposite sides respectively. The end price is on average 0.6143. Figure 24 shows that the end prices are really close to the equilibrium price. The average absolute difference between them is only 0.0169.

Figure 25 depicts typical price dynamics of a process for which both initial locations became cycle points (upper) and for which only one of the initial locations became a 2-cycle point (lower). The corresponding locations are plotted in the right panel of the figure. The green dashed arrow in the lower left panel denotes the first movement of Firm 1.

Figure 22: Initial locations which did not become a point of the 2-cycle (left) and locations which replaced initial locations as 2-cycle point (right), for the type 2 learner.

As for the type 1 learners, an SPNE is not locally stable. Simulation runs where initial locations and prices are randomly drawn from a small neighbourhood of an SPNE converge to 2-cycles consisting of the initial locations.

Figure 24: Distribution of the difference between the end price and equilibrium price, for the type 2 learner.

Figure 25: Typical price dynamics (left) and the corresponding locations (right) of a process for which the initial locations became both 2-cycle points (upper) and for which only one of the initial locations became a 2-cycle point (lower), for the type 2 learner.
