Academic year: 2021

MSc Econometrics, Specialization Mathematical Economics
Supervisor: prof. dr. C.H. Hommes
Second marker: dr. I.L. Salle

Stability of Equilibria with Genetic Algorithm As Learning Mechanism: Learning to Optimize versus Learning to Forecast

by

Rashied Sheikkariem

Abstract

We study whether heterogeneous agents learn to coordinate on stationary perfect-foresight equilibria in a general-equilibrium environment. Agents’ individual learning is updated by genetic algorithm. We consider two genetic algorithm alternatives. In the first, agents learn to optimize based on beliefs they form about first-period consumption. In the second, agents learn to forecast based on beliefs they form about the future price of the consumption good. We find that the heterogeneous agents are only able to coordinate towards a steady state or a period-two cycle. Furthermore, we find no evidence of coordination towards higher-order equilibria or chaotic behavior.


“One general law, leading to the advancement of all organic beings, namely, multiply, vary, let the strongest live and the weakest die.”

— Charles Darwin, On the Origin of Species


Contents

1 Introduction 1

2 The economic environment 5

3 Genetic algorithm learning 12

3.1 Genetic algorithm 12

3.1.1 Reproduction 13

3.1.2 Crossover and mutation 13

3.1.3 Election 15

3.2 Genetic algorithm application in the OLG economy 16

3.2.1 Learning to optimize 17

3.2.2 Learning to forecast 19

3.3 Advantages of genetic algorithm learning 22

4 Simulation results 24

4.1 Experimental design 24

4.2 Results learning-to-optimize 25

4.3 Results learning-to-forecast 29

5 Concluding remarks 33

References 35


1 Introduction

The possibility of multiple stationary perfect-foresight equilibria in certain classes of general-equilibrium models is well established, and the nature of these stationary equilibria is an important issue in macroeconomics: it matters for studying and predicting the future behavior of economic variables. Over the years, researchers have shown that some of these stationary equilibria may represent a steady state, while others can represent periodic and even chaotic behavior. Chaotic behavior implies that the dynamics are sensitively dependent on initial conditions and therefore unpredictable in the long run, which is problematic in economic applications.

For example, Grandmont (1985) studied a standard version of the overlapping generations (OLG) model under perfect foresight with a constant supply of fiat money. His experimental economy involves a perishable consumption good produced from labor supplied by consumers. Furthermore, his experimental economy consists of homogeneous risk-averse agents who all have identical utility and live for two periods. In every time period there are two types of agents living in the economy, one "young" and one "old". Young agents are able to save part of their income by holding a non-negative amount of money and can transfer these savings to the second period of life. Consumption in the first and second periods of life is not necessarily a gross substitute. Using this economic environment, Grandmont analyzed the nature of stationary equilibria under the perfect-foresight assumption. With his research, Grandmont (1985) showed that stationary perfect-foresight equilibria may exist in which the equilibrium dynamics are characterized by steady-state, periodic or chaotic trajectories for real money balances. His research has been used as a benchmark by many researchers to study this type of dynamics (Bullard and Duffy, 1998; Heemeijer et al., 2012; Hommes et al., 2013).

For instance, Bullard and Duffy (1998) used this economic environment to investigate whether agents may learn to coordinate on stationary perfect-foresight cycles in a general-equilibrium environment. They replace the perfect-foresight assumption with a genetic-algorithm learning scheme and examine how a population of heterogeneous adaptive agents learns in such an environment. Agents are differentiated by the forecast rule that they use to predict the next period's price, and each agent solves the same optimization problem based on its individual forecast rule. Specifically, agents update their forecast rules by a genetic algorithm, a stochastic search algorithm for numerical optimization

introduced by Holland (1975). With their research, Bullard and Duffy have shown that the stationary equilibria on which the artificial adaptive agents coordinate are always simple, either a steady state or low-order cycles, and never any of the higher-order cycles or chaotic behavior found by Grandmont (1985) under the perfect-foresight assumption. Bullard and Duffy used a specific genetic algorithm structure to forecast the price, which may make it likely that agents will learn to coordinate on low-order cycles. This does not mean that the higher-order cycles and chaotic behavior found by Grandmont can be ruled out in general; it remains an open question whether higher-order cycles and even chaotic behavior can be observed when using a more general genetic algorithm structure.

The main topic of this paper is to investigate whether a population of artificial, heterogeneous adaptive agents can learn to coordinate on stationary perfect-foresight equilibria in a general-equilibrium environment. The agents' learning will also be modeled using a genetic algorithm. The genetic algorithm describes the evolution of a population of agents' rules, which are represented by chromosomes: strings of finite length written over a binary alphabet. Agents update their rules by retaining and improving upon the individual rules that have performed well in the past. This type of learning is referred to as social learning. The performance of each chromosome in a given environment is evaluated through its fitness function, which measures the value of profit or utility resulting from the behavior prescribed by the chromosome. The updating occurs by using a set of four genetic operators: reproduction, crossover, mutation and election (Arifovic, 1995). Reproduction makes copies of individual chromosomes. The creation of new rules is accomplished through the application of crossover and mutation. Crossover randomly exchanges parts of chromosomes, while mutation changes a bit value at a randomly chosen position in the binary string. Finally, the election operator tests newly generated rules: if their performance is better than that of their parents, they are placed into the new population of chromosomes.

The behavior of genetic algorithms and other computer-based adaptive algorithms has been studied in a number of different economic environments, among others by Miller (1989), Marimon, McGrattan and Sargent (1989), Arifovic (1995), Bullard and Duffy (1998) and Lux and Schornstein (2005). Some of the questions examined in these studies relate to adaptive agents' ability to learn Nash equilibrium behavior, equilibrium selection in environments with a multiplicity of equilibria, and the computation of equilibria in economies for which it is hard to obtain analytical solutions. These algorithms have also been used in examining the observed behavior of human subjects in laboratory experiments; see, for example, Crawford (1991), Miller and Andreoni (1990a) and Arthur (1991). The results of these studies show that computer-based adaptive algorithms can perform better than models with rational expectations in explaining some of the regularities observed in experimental economies (Arifovic, 1995).

This paper considers two different alternatives of the genetic algorithm, depending on the two different economic factors on which agents form their beliefs. In the first alternative, agents learn to optimize based on the beliefs they form about the first-period consumption (learning-to-optimize). In the second alternative, agents learn how to forecast based upon the beliefs they form about the future price of the consumption good (learning-to-forecast). Based upon these alternatives, the two following questions will be addressed in this paper. First, is chaotic behavior relevant with genetic algorithm learning agents? Second, is there a difference in the behavior of the dynamics between agents who are learning to optimize and agents who are learning to forecast? In order to answer the research questions, this paper adapts the economic environment from Grandmont (1985). The economy consists of artificial agents, which are differentiated by the individual beliefs they form about an economic factor. The structure of the genetic algorithm follows Arifovic (1995).


The main results of this paper show that a population of artificial agents is able to coordinate on stationary perfect-foresight equilibria in a general-equilibrium environment. When coordination occurs, it occurs only towards a monetary steady state or a period-two cycle. Furthermore, the results show no evidence of coordination towards higher-order equilibria or chaotic behavior. We can therefore conclude that chaotic behavior is not relevant with genetic algorithm learning, not even in our case, where the genetic algorithm has a general structure. This was also suggested by the results of Arifovic (1995) and Bullard and Duffy (1998). Finally, based on the dynamics of the two GA alternatives, the results also suggest that it is easier for agents to learn when they are forecasting a price than when they are making an economic decision.

The rest of the paper is organized as follows. The description of the two-period overlapping generations model is presented in the second section. In the third section, the genetic algorithm and the two alternatives applied to the OLG economic environment are presented. The results generated from the computer simulations are discussed in the fourth section. Finally, in section five, concluding remarks are given.

2 The economic environment

This section gives a detailed description of the economic environment used in this paper, which is based on the overlapping generations (OLG) model.

The basic OLG model (Samuelson, 1958) represents an economic environment in which agents live a finite time, so that each agent's life overlaps with at least one period of another agent's life. In the basic model agents live for two periods. In the first period of life they are referred to as the young; in the second period of life, as the old. The economy consists of one single good, which cannot endure for more than one period. Agents receive a strictly positive amount of initial endowments of the consumption good in each period of life. By holding fiat money, agents can save between periods. At the end of the second period of life, agents consume all savings and endowments. Furthermore, all agents have identical utility, which is a function of consumption in both periods. These basic properties form the baseline for the experimental economy used by Grandmont (1985), and also for the experimental economy used in this paper, which is described in the following.

The economy consists of overlapping generations of two-period-lived agents in which time t ≥ 1 is discrete. At each time step t, the economy consists of a population of N agents. The population consists of a generation of N/2 young agents and a generation of N/2 old agents. All agents can consume a single perishable consumption good. Young agents can save by holding a positive amount of fiat money, which they can transfer to the next period when old. The supply of fiat money M > 0 is constant in every time step. Each agent i of generation t lives over two consecutive periods, t and

t + 1, and consumes c^1_{i,t} in his first period of life (Young) and c^2_{i,t} in his second period of life (Old). The strictly positive initial endowments of the perishable consumption good are given by ω^1 when Young and ω^2 when Old. The agents' identical preferences are given by the following utility function:

U_{i,t} = \frac{(c^1_{i,t})^{1-\rho_1}}{1-\rho_1} + \frac{(c^2_{i,t})^{1-\rho_2}}{1-\rho_2},   (1)

where ρ_1, ρ_2 ∈ (0, ∞) represent the coefficients of relative risk aversion of young agents and old agents, respectively. Each agent born at time t

maximizes its utility subject to the following budget constraints

c^1_{i,t} \le \omega^1 - s_{i,t},   (2)

c^2_{i,t} \le \omega^2 + \frac{m_{i,t}}{p^e_{t+1}},   (3)

 

where s_{i,t} denotes the amount the agent saves in his first period of life. Agents transfer these savings from period t to period t + 1 in terms of fiat money, denoted by m_{i,t} = p_t s_{i,t}; p_t and p^e_{t+1} denote the price of the consumption good in period t and the expected price in period t + 1, respectively. The first-order condition for the utility maximization problem is given by

\frac{(c^1_{i,t})^{-\rho_1}}{p_t} = \frac{(c^2_{i,t})^{-\rho_2}}{p^e_{t+1}}.   (4)

Combining the first-order condition with the two budget constraints one obtains the following equation:

\frac{(\omega^1 - s_{i,t})^{-\rho_1}}{p_t} = \frac{(\omega^2 + s_{i,t}\, p_t / p^e_{t+1})^{-\rho_2}}{p^e_{t+1}}.   (5)

Rewriting this with respect to the first-period consumption one obtains

c^1_{i,t} + (c^1_{i,t})^{\rho_1/\rho_2}\, \beta_t^{(\rho_2+1)/\rho_2} = \omega^1 + \omega^2 \beta_t,   (6)

where \beta_t = p^e_{t+1}/p_t denotes the gross inflation factor between date t and t + 1. From the compactness of the budget set and the strict concavity of the

utility function, it follows that the consumption decision and therefore the savings decision of an agent of time t are determined uniquely (Bullard and Duffy, 1998). Since it is not possible to form a closed-form solution, the first-period consumption will be obtained by numerical methods. Once the optimal consumption and savings amounts are calculated it becomes

possible to define the perfect-foresight equilibrium dynamics for prices using the market clearing condition

S_t = \frac{M}{p_t},   (7)

where S_t denotes the aggregate savings at time t. Using this condition and the fact that the supply of fiat money M is constant, one can derive the following first-order difference equation that characterizes all perfect-foresight equilibria in this economic environment:

P_t = \frac{S_{t+1}}{S_t}\, P_{t+1}.   (8)

Following Grandmont (1985), the compacted form of this difference equation is given by

P_t = \Phi[P_{t+1}].   (9)

Any sequence of prices {P_t} that satisfies equation (9) is defined as a perfect-foresight equilibrium, and a price level P such that P = Φ(P) is called a steady-state equilibrium of equation (9). From Gale (1973) it follows that this model can have at most two steady-state equilibria: one in which aggregate savings are zero and agents consume their total endowments in both periods, and possibly a second one in which aggregate savings are positive and the steady-state rate of return on savings and the gross inflation factor are both equal to unity. Grandmont (1985) limited his research to the backward perfect-foresight dynamics, that is, sequences of prices that solve map (9) in time-reversed direction, because the forward perfect-foresight dynamics, that is, iterates of the map (9), may not be defined uniquely, depending on the properties of the map Φ(.). Grandmont's well-known main result was to show that periodic equilibria of any order, and even chaos, could exist as long-run outcomes in the backward perfect-foresight dynamics without abandoning the classical assumptions of utility maximization and market clearing. A periodic equilibrium of order k consists of a sequence of k prices, {P_1, P_2, ..., P_k}, such that P_j = Φ^k(P_j), j = 1, 2, ..., k, where Φ^k denotes the kth iterate of the map Φ.

Fig. 1 Grandmont's bifurcation diagram (1985, p. 1030). The diagram is taken from Bullard and Duffy (1998), who replicated Grandmont's (1985, p. 1030) Figure 4.

Grandmont showed with his study that these two steady states are not the only stationary equilibria this economy might have. His analysis showed that when the offer curve is sufficiently backward bending, in other words when

the coefficient of relative risk aversion of the old agents, ρ_2, is large enough and the coefficient of relative risk aversion of the young agents, ρ_1, is less than or equal to unity, then in addition to the two steady states the economy may also exhibit periodic as well as chaotic equilibria. These results can be seen in Fig. 1.

As mentioned before, Bullard and Duffy (1998) showed that agents are only able to coordinate on a steady state or low-order cycles if the perfect-foresight assumption is replaced with a genetic algorithm-learning scheme. The results of their experiments are given in Fig. 2 below.

Fig. 2 Limiting learning dynamics, Bullard and Duffy (1998, p. 31). Taken from Bullard and Duffy (1998): limiting learning dynamics, ten replications at each old-agent relative risk aversion; convergence values of the last 50 iterations of each replication plotted.


3 Genetic algorithm learning

This section gives a description of the genetic algorithm (GA) and its application to the two-period overlapping generations model. First, genetic algorithm learning is explained. Second, a detailed description is given of the two genetic algorithm alternatives, depending on the economic factor on which agents form their beliefs. The presentation of the structure of the genetic algorithm applications follows Arifovic (1995). Finally, the section concludes with a discussion of the advantages of genetic algorithm learning.

3.1 Genetic algorithm

Genetic algorithms were introduced by Holland (1975) as a stochastic search algorithm for numerical optimization. This approach uses operations similar to the genetic processes of biological organisms to develop better solutions of an optimization problem from an existing population of randomly initialized rules. Typically, the proposed rules are encoded in strings (chromosomes) using a binary alphabet (see Dawid, 1999). Each string represents an agent's belief about an economic factor.

In other words, the GA describes the evolution of a population of rules, representing different possible beliefs, in response to experience. After each time period, half of the individuals in the population have completed their life cycle and exit the economy. At this stage, the individual beliefs of these agents are updated using four genetic operators:


reproduction, crossover, mutation and election. The implementation of these four genetic operators used in this paper is described below.

3.1.1 Reproduction

The reproduction operator selects a string of the population at time t depending on its relative fitness. Strings with higher fitness values have a higher probability of being selected for reproduction. This is called the roulette-wheel selection process. The probability that a string will be selected is given by

Prob_{i,t} = \frac{\mu_{i,t}}{\sum_j \mu_{j,t}},   (10)

where \mu_{i,t} denotes the fitness value of string i at time t and the sum runs over all strings in the population.

When a string is selected for reproduction, an exact copy is made. Once N copies are made, they enter into a mating pool to undergo application of the other genetic operators.
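The roulette-wheel reproduction step can be sketched as follows; the bit strings and fitness values are illustrative placeholders, not quantities from the model:

```python
import random

def reproduce(population, fitness, rng):
    """Roulette-wheel selection: draw len(population) copies, each string i
    selected with probability fitness[i] / sum(fitness)."""
    total = sum(fitness)
    pool = []
    for _ in range(len(population)):
        r = rng.uniform(0.0, total)          # spin the wheel
        acc = 0.0
        for chrom, fit in zip(population, fitness):
            acc += fit
            if r <= acc:
                pool.append(chrom[:])        # exact copy enters the mating pool
                break
    return pool

rng = random.Random(0)
strings = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1]]
pool = reproduce(strings, [2.0, 1.0, 3.0], rng)
```

Fitter strings are copied more often on average, but every string retains a positive selection probability, which is what keeps the search stochastic.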

3.1.2 Crossover and mutation

The main idea of the crossover and mutation operators is to create new beliefs by mixing portions of existing strings and by changing individual bits with small probability.

Once the mating pool after reproduction is complete, two strings are selected at random. This randomly selected pair will be referred to as the parents. The pair undergoes the crossover operator, which swaps genetic material between the parents to generate two new strings, referred to as the offspring. Here the paper follows the algorithm used by Arifovic (1995). The algorithm starts by randomly selecting an integer number k in the range [1, l], where l represents the string length. Two new strings are then constructed by combining the genetic material to the left of position k from one parent with that to the right of position k from the other parent, and vice versa. The crossover operator is applied to each pair with probability pcross; with probability 1 - pcross the offspring are exact copies of their parents. As an example, suppose that we have a pair of strings of length l = 8 and that the crossover operator is to be performed on these two strings. Suppose also that the randomly chosen point is k = 4. The two parents are divided at this point:

1 1 0 1 | 0 1 1 0        0 1 0 0 | 0 0 1 1

The portions of the strings to the right of the dividing point are swapped, creating the following two offspring:

1 1 0 1 | 0 0 1 1        0 1 0 0 | 0 1 1 0


The mutation operator randomly changes the value of a position within the string with a given probability pmut > 0. Each bit value b = 1, 0 of the two strings that result from the crossover operator is replaced with the bit 1 - b with probability pmut. The probability pmut is independent and identical across positions. With probability 1 - pmut the bit value is not mutated.
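The crossover and mutation operators can be sketched as follows, reusing the two parent strings of the worked example above; pcross and pmut are free parameters of the GA:

```python
import random

def crossover(parent1, parent2, pcross, rng):
    """With probability pcross, swap the tails of the two parents after a
    random cut point k in [1, l]; otherwise return exact copies."""
    if rng.random() >= pcross:
        return parent1[:], parent2[:]
    k = rng.randint(1, len(parent1))         # cut point, as in Arifovic (1995)
    return parent1[:k] + parent2[k:], parent2[:k] + parent1[k:]

def mutate(string, pmut, rng):
    """Flip each bit b to 1 - b independently with probability pmut."""
    return [1 - b if rng.random() < pmut else b for b in string]

rng = random.Random(7)
p1 = [1, 1, 0, 1, 0, 1, 1, 0]   # first parent from the worked example
p2 = [0, 1, 0, 0, 0, 0, 1, 1]   # second parent from the worked example
o1, o2 = crossover(p1, p2, pcross=0.6, rng=rng)
m1, m2 = mutate(o1, 0.01, rng), mutate(o2, 0.01, rng)
```

Note that crossover only recombines existing genetic material (the multiset of bits across the pair is unchanged); mutation is what introduces genuinely new bit values into the population.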

The pair of strings resulting from the crossover and mutation operators are possible candidates to enter the new population with updated beliefs and are referred to as the offspring. Before the offspring enter the new population, they are subject to the election operator.

3.1.3 Election

The election operator tests the newly generated offspring before they are allowed to enter the new population. This step is necessary to avoid a decrease in the fitness value of the overall population due to the genetic alteration of strategies. In particular, the standard application of the genetic algorithm would allow the offspring to enter the population even if they were not strictly better, in terms of fitness, than the parents from which they were created. However, we think of our economic decision makers as being somewhat less naive. Therefore this paper adapts the algorithm used by Arifovic (1994): each pair of offspring is compared to the corresponding pair of parents in terms of fitness, and only those among the offspring that are at least as fit as one of their parents are allowed to enter the new population.

The possible outcomes of the election operator are the following. If both offspring have a fitness value higher than each parent's fitness value, both offspring replace the parents and enter the new population. If both offspring have a fitness value lower than each parent's fitness value, the parents enter the new population rather than the offspring. If only one of the offspring has a fitness value higher than that of each parent, it replaces the parent with the lowest fitness value and enters the new population together with the parent with the higher fitness value.

Once the election operator has been applied, the population with updated beliefs is created. The genetic algorithm is repeated until the number of iterations equals the number of time steps T.
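A minimal sketch of the election step, under one reasonable reading of the comparison rules above; the bit-counting fitness is a toy stand-in for the utility- or forecast-error-based fitness used in the model:

```python
def elect(parents, offspring, fitness):
    """Compare the offspring pair with the parent pair: an offspring enters
    the new population only by beating the weakest remaining parent."""
    ranked = sorted(parents, key=fitness)            # weakest parent first
    survivors = []
    for child in sorted(offspring, key=fitness, reverse=True):
        if ranked and fitness(child) > fitness(ranked[0]):
            ranked.pop(0)                            # child replaces the weakest
            survivors.append(child)
    return survivors + ranked                        # always a pair of strings

fit = lambda s: sum(s)   # toy fitness: number of 1-bits (hypothetical)
new_pair = elect([[1, 0, 0, 0], [1, 1, 0, 0]],
                 [[1, 1, 1, 0], [0, 0, 0, 0]], fit)
```

In this example the fitter offspring displaces the weaker parent, while the unfit offspring is discarded, matching the "only one offspring is better" case described above.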

3.2 Genetic algorithm application in the OLG economy

At each time step t ≥ 1, there are two populations of chromosomes in the economy: one is referred to as the population of generation t, the young, and the other as the population of generation t - 1, the old. A population of generation t consists of N/2 chromosomes. As mentioned before, the chromosomes represent the agents' individual beliefs about a specific economic factor. Agents learn to improve their individual beliefs by undergoing the four GA operators. This type of learning is referred to as social learning.

This paper will consider two different alternatives of the genetic algorithm. The alternatives are based upon the two different economic factors on which agents form their beliefs. In the first alternative, the population of chromosomes represents a population of agents who form beliefs about the first-period consumption c^1_{i,t}. In the second alternative, agents form beliefs about the expected future price of the consumption good p^e_{t+1}. In both alternatives, each chromosome is a string of finite length l, written over the binary alphabet {0, 1}. Each string is decoded by a specific code in order to get a real-valued number for the economic factor on which agents form their beliefs. This is explained separately for both alternatives in the following subsections.

3.2.1 Learning to optimize

Agents form beliefs about the first-period consumption c^1_{i,t}. Given these beliefs, they will learn how to optimize their utility. In order to get a real-valued number for the first-period consumption, each binary string is decoded by the following formula:

x_{i,t} = \sum_{k=1}^{l} b^k_{i,t}\, 2^{k-1},   (11)

with b^k_{i,t} denoting the value (0 or 1) at the kth position of the ith string. After a string is decoded, the integer value x_{i,t} is normalized in such a way that the value for the first-period consumption does not exceed the initial endowment of agents born in period t, ω^1. Thus the first-period consumption of agent i of generation t is given by

c^1_{i,t} = \frac{x_{i,t}}{K},   (12)

where K is the normalizing constant that restricts the decoded value to positive real numbers in the admissible range. Once the first-period consumption is determined, the individual savings of agent i of generation t can be obtained by the following:

s_{i,t} = \omega^1 - c^1_{i,t}.   (13)
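The decoding and normalization in equations (11)-(13) can be sketched as follows. The particular choice of K, which maps the largest binary value onto the endowment ω^1, is an assumption, since the text leaves the exact value of K unspecified:

```python
def decode(bits):
    """Eq. (11): integer value of the binary string; bits are read
    least-significant first, so b^k carries weight 2^(k-1)."""
    return sum(b * 2 ** k for k, b in enumerate(bits))

def first_period_consumption(bits, omega1):
    """Eqs. (12)-(13): normalize so that 0 <= c1 <= omega1, then save
    the remainder of the endowment."""
    K = (2 ** len(bits) - 1) / omega1   # assumed normalization constant
    c1 = decode(bits) / K
    return c1, omega1 - c1              # consumption and savings

c1, s = first_period_consumption([1, 1, 1, 1, 1, 1], omega1=10.0)
```

With this K, the all-ones string decodes to exactly the endowment (zero savings) and the all-zeros string to zero consumption, so every chromosome maps to a feasible consumption choice.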

Once the optimal consumption and savings values are determined, it is possible to compute the price of output in period t using the market clearing condition (7). The aggregate savings of the young agents and the nominal per capita money supply, h, determine the price of the consumption good at time t. This is given by the following equation:

p_t = \frac{(N/2)\, h}{\sum_{i=1}^{N/2} s_{i,t}}.   (14)

Each agent of generation t carries his individual savings from period t to period t + 1 in terms of money. Once the price of the consumption good is known, the nominal money balances are computed from the price and the individual savings as follows:

m_{i,t} = p_t\, s_{i,t}.   (15)

Next, the second-period consumption at time t + 1 of agent i of generation t can be obtained by the following:

c^2_{i,t} = \frac{m_{i,t}}{p_{t+1}} + \omega^2.   (16)

Before the second-period consumption can be computed, the value of the price of the consumption good at time t + 1 is needed. This price is determined by the population of agents entering the economy at time t + 1 and making their savings decisions at that time.

Once the second-period consumption is computed, it is possible to compute the utility (1) of agent i of generation t, which represents the offspring's potential fitness value. As mentioned before, the fitness value is used to determine whether or not the offspring will enter the new population. This is determined by comparing the potential fitness of the offspring with the fitness of both parents, evaluated at the end of time step t. At this stage, the four genetic operators have been applied in order to create the population with updated beliefs about the first-period consumption. The population with updated beliefs enters the economy at time t + 2. The genetic algorithm is repeated to create the next population with updated beliefs about first-period consumption until the maximum number of iterations, T, has been reached.

3.2.2 Learning to forecast

Agents form beliefs about the expected future price of the consumption good p^e_{t+1}. Given these beliefs, they will learn how to forecast by minimizing the forecasting error. In order to get a real-valued number for the expected price of the consumption good, each binary string is decoded by the following formula:

x_{i,t} = \sum_{k=1}^{l} b^k_{i,t}\, 2^{k-1},   (17)

with b^k_{i,t} denoting the value (0 or 1) at the kth position of the ith string. After a string is decoded, the integer value x_{i,t} is normalized in such a way that the value of the expected price of consumption is restricted to positive real values in the interval [0, 16]. The forecast for the price of the consumption good of agent i of generation t is then given by

p^e_{i,t+1} = \frac{x_{i,t}}{K},   (18)

where K is the normalizing constant. Using the decoded value for the expected price (18), the computer algorithm then numerically solves equation (6) in order to compute agent i's first-period consumption, c^1_{i,t}, and the price of the consumption good, p_t, at time t. Once the first-period consumption is determined, the savings of agent i of generation t can be obtained by the following:

s_{i,t} = \omega^1 - c^1_{i,t}.   (19)
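The numerical solution of equation (6) for the first-period consumption can be sketched with bisection, treating the gross inflation factor β as given: the left-hand side of (6) is increasing in c^1, so a simple interval-halving search suffices. All parameter values below are illustrative:

```python
def solve_c1(beta, omega1, omega2, rho1, rho2, tol=1e-10):
    """Find c1 satisfying eq. (6):
    c1 + c1**(rho1/rho2) * beta**((rho2+1)/rho2) = omega1 + omega2*beta."""
    rhs = omega1 + omega2 * beta
    lhs = lambda c: c + c ** (rho1 / rho2) * beta ** ((rho2 + 1) / rho2)
    lo, hi = 0.0, rhs                 # lhs(0) = 0 <= rhs and lhs(rhs) >= rhs
    while hi - lo > tol:              # bisection on the increasing lhs
        mid = 0.5 * (lo + hi)
        if lhs(mid) < rhs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c1 = solve_c1(beta=1.0, omega1=2.0, omega2=1.0, rho1=0.5, rho2=2.0)
```

Bisection is used here only for transparency; any scalar root-finder applied to lhs(c) - rhs would serve equally well, since monotonicity guarantees a unique solution on [0, ω^1 + ω^2 β].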

Each agent of generation t carries his individual savings from period t to period t + 1 in terms of money. The nominal money balances are computed from the price of the consumption good and the individual savings in the following way:

m_{i,t} = p_t\, s_{i,t}.   (20)

Next, the second-period consumption at time t + 1 of agent i of generation t can be obtained by the following:

c^2_{i,t} = \frac{m_{i,t}}{p^e_{i,t+1}} + \omega^2.   (21)

The second-period consumption can be computed using the decoded value for the price of the consumption good at time t + 1 given by (18). Once the value for the second-period consumption is computed, it is possible to compute the utility (1) of agent i of generation t. In this alternative of the genetic algorithm, agents are learning to forecast by minimizing the

forecasting error. Therefore the offspring's fitness value is determined by the forecasting error, which is given by

\varepsilon_{i,t} = p_{t+1} - p^e_{i,t+1}.   (22)

The actual price, p_{t+1}, is determined by the population of agents who enter the economy at time t + 1 and forecast the future price of the consumption good at time t + 2.


Thus, the forecasting error represents the offspring's potential fitness value and is used as the actual decision rule to determine whether an offspring will or will not enter the new population. This is determined by comparing the potential fitness of the offspring with the fitness of both parents, evaluated at the end of time step t. At this stage, the four genetic operators have been applied in order to create the population with updated beliefs about the future price of the consumption good. The population with updated beliefs enters the economy at time t + 2. The genetic algorithm is repeated to create the next population with updated beliefs about the future price of the

consumption good until the maximum number of iterations, T, has been reached.

3.3 Advantages of genetic algorithm learning

This paper interprets genetic algorithm learning as a useful model of trial-and-error learning in an economy with heterogeneous agents. This learning model is chosen because of the many important advantages it has relative to other models in the literature. These advantages are explained in this section.

First, the genetic algorithm is population based and beliefs are initially heterogeneous across agents in the economy. This feature is not often modeled in competitive general-equilibrium environments in the literature to date. The population-based genetic algorithm can be regarded as a global search algorithm, whereas representative-agent-type search schemes are essentially local. A second advantage is that only weak informational requirements are imposed on agents in the economy. In particular, agents only need to know their own utility and beliefs about a specific economic factor in order to make a decision. A third advantage is that the genetic algorithm offers a natural model for experimentation by agents with alternative decision or forecasting rules. This is an important characteristic of learning, which is also rarely modeled in competitive general-equilibrium environments in the literature to date. A fourth advantage is that the heterogeneity of beliefs allows parallel processing to be an important feature of the economy. In other words, some agents can try one decision or forecast rule while other agents try different rules, with the better rules propagating and the less good ones dying out. This gives a close representation of what happens in actual economies, where communication between agents encourages successful strategies to be quickly copied and unsuccessful ones to be discarded. The fifth advantage is that the approach to learning studied here can be applied even in complicated problems such as the one studied in this paper. Finally, the initial heterogeneity of the population allows the system to be initialized randomly. This makes it possible to obtain some sense of the global properties of the learning system, as opposed to the local analysis that is often used in the learning literature.

These advantages are the reasons this paper chooses the genetic algorithm over several alternative methods of modeling learning behavior. This does not mean that alternative methods are not interesting, but the desirable features of the genetic algorithm are well suited to the problem examined in the present paper and thus provide sufficient reason to prefer it over the alternatives.


4 Simulation results

This section of the paper describes the design of the computational experiment and the results drawn from the simulations. First, a description is given of the experimental design, followed by the results of the computer simulations of both the learning-to-optimize and the learning-to-forecast structures of the genetic algorithm.

4.1 Experimental design

The paper makes use of computational experiments in which simulations are conducted for the two GA alternative structures applied to Grandmont's example economy. The simulations for both genetic algorithm alternatives, learning to optimize and learning to forecast, are conducted using two randomly generated initial populations, one starting at t = 0 and the other starting at t = 1. For both alternatives the nominal money supply is held constant and set equal to M = 100. Furthermore, the population size is set equal to N = 100; thus, at every time step t a population of N/2 = 50 agents enters the economy. The simulations are conducted for T = 1200 periods. The values for the agents' endowments are set at $(\omega_1, \omega_2) = (2, 0.5)$ (Grandmont, 1985; Bullard and Duffy, 1998; Hommes et al., 2013). For both GA alternatives we adopt the value for the length of the bit string l, representing an agent's belief, from Arifovic (1995); it is set equal to l = 30. The values for the crossover and mutation operators are set at pcross = 1 and pmut = 1/l, respectively. These values have been chosen on the basis of the optimal values recommended by Grefenstette (1986) and Bäck (1993), and in part because the election operator used in this paper ensures that strings resulting from the crossover and mutation operators with particularly low fitness values will not enter the new population.

The experiment is organized around the relative risk aversion coefficient for the old agents. This value is initially set at $\rho_2 = 2$ and then increased up to 16 in increments of 0.1. Furthermore, the relative risk aversion coefficient for the young agents is held constant at $\rho_1 = 0.5$. For every incremented value of $\rho_2$ the last 50 of the 1200 iterations are plotted in a bifurcation diagram in order to see how the dynamics of aggregate savings per capita change as $\rho_2$ increases.
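A minimal sketch of these GA settings follows. The binary decoding rule (18) is not reproduced in this section, so the `decode` mapping below, with an assumed upper bound `upper`, is illustrative only; the single-point crossover and bit-flip mutation follow standard GA conventions rather than code from the paper.

```python
import random

L = 30            # length of the bit string representing a belief
P_CROSS = 1.0     # crossover probability
P_MUT = 1.0 / L   # per-bit mutation probability

def decode(bits, upper):
    """Map a length-L bit string to a real value in [0, upper].
    Illustrative stand-in for the decoding rule (18)."""
    return int("".join(map(str, bits)), 2) / (2**L - 1) * upper

def crossover(a, b):
    """Single-point crossover, applied with probability P_CROSS."""
    if random.random() < P_CROSS:
        cut = random.randrange(1, L)
        return a[:cut] + b[cut:], b[:cut] + a[cut:]
    return a[:], b[:]

def mutate(bits):
    """Flip each bit independently with probability P_MUT."""
    return [1 - x if random.random() < P_MUT else x for x in bits]
```

With pcross = 1 every pair of parents is always recombined, while pmut = 1/l flips one bit per string on average, so the election operator is what keeps very unfit offspring out of the population.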

4.2 Results learning-to-optimize

The simulation results for the learning-to-optimize structure clearly show that the population of artificial agents learns to coordinate towards perfect-foresight stationary equilibria. In particular, when the coefficient of relative risk aversion for the old agents is low ($\rho_2 < 4$), the bifurcation diagram in Fig. 3 shows convergence towards the monetary steady state, representing a positive amount of aggregate savings. This is consistent with the results found by Grandmont (1985) and the experiments done by Bullard and Duffy (1998), illustrated in Fig. 1 and Fig. 2, respectively. When the value of $\rho_2$ is around 4, the monetary steady state loses its stability. When this value increases further ($\rho_2 > 4$), the bifurcation diagram shows some evidence suggesting convergence towards a two-cycle type of behavior, although this evidence is not sufficient to conclude so with certainty. Furthermore, the bifurcation diagram shows no clear evidence of convergence towards perfect-foresight equilibria with chaotic or higher-order periodic behavior.

Fig. 3 Bifurcation diagram. The last 50 of the 1200 iterations are plotted for the learning-to-optimize GA structure.

To get more insight into the dynamics, we computed time series for the price and aggregate savings per capita for several single values of $\rho_2$, namely $\rho_2$ = 3, 6 and 13.

We took these values to see whether our simulation results are consistent with the theoretical predictions made by Grandmont (1985) and the results obtained from the experiments done by Bullard and Duffy (1998). For $\rho_2 = 3$, both the theoretical predictions and the results from Bullard and Duffy show convergence towards the monetary steady state. For $\rho_2 = 6$, both the theoretical predictions and the results from Bullard and Duffy show convergence towards a period-two cycle; the results from Bullard and Duffy also show convergence towards the monetary steady state for this value of $\rho_2$. Finally, for $\rho_2 = 13$ the theoretical predictions show convergence towards higher-order cycles and chaotic behavior (see Fig. 1), whereas the predictions from Bullard and Duffy show convergence towards a steady state and a period-two cycle (see Fig. 2).

Fig. 4 Time series, computed for two different initial states for each value of $\rho_2$. Top: $\rho_2 = 3$; middle: $\rho_2 = 6$; bottom: $\rho_2 = 13$.

For all $\rho_2 > 2$ the numerical simulations of the time series only result in two possible outcomes: slow convergence towards the monetary steady state and slow convergence towards a two-cycle type of behavior, as illustrated in Fig. 4. The simulations conducted for $\rho_2 = 3$ confirm the results suggested by the bifurcation diagram for low values of $\rho_2$ ($\rho_2 < 4$), namely convergence towards the monetary steady state. Furthermore, the simulations conducted for $\rho_2 = 6$ and 13 show convergence towards a two-cycle type of behavior. Thus, when the value of $\rho_2$ increases beyond 4, a period-doubling bifurcation occurs: the existing steady state loses its stability and the dynamics converge towards a two-cycle type of behavior. Furthermore, we observe a difference in the dynamics for higher values of $\rho_2$. The simulations conducted for $\rho_2 = 13$ show that when $\rho_2$ increases further ($\rho_2 > 10$), the upper and lower bounds of the two-cycle slowly drift towards higher and lower values, respectively. Here, the upper bound of the two-cycle represents a state in which agents save almost all of their initial endowment, and the lower bound represents a state in which agents consume almost all of their initial endowment and save only a small amount.

To sum up, for $\rho_2 = 3$ the simulation results show the same type of behavior as predicted by Grandmont (1985) and Bullard and Duffy (1998), whereas for $\rho_2 = 6$ and 13 the simulation results show a different type of behavior than predicted: for these values of $\rho_2$ we only observe convergence towards a two-cycle type of behavior. Finally, the bifurcation diagram and time series show no evidence of convergence towards the higher-order periodic equilibria or chaotic behavior that exist under perfect foresight.
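The bifurcation diagrams in Fig. 3 (and Fig. 5 below) follow the recipe of Section 4.1: sweep $\rho_2$ from 2 to 16 in steps of 0.1 and plot the last 50 of 1200 iterations. A sketch of that sweep, with a hypothetical `simulate(rho2, T)` standing in for the full GA economy:

```python
import numpy as np

# rho2 grid from the experimental design: 2.0, 2.1, ..., 16.0
RHO_GRID = np.round(np.arange(2.0, 16.0 + 0.05, 0.1), 1)

def bifurcation_points(simulate, rho_values, T=1200, keep=50):
    """For each rho2, run the economy for T periods and keep the last
    `keep` values of aggregate savings per capita; the resulting
    (rho2, savings) pairs are the points of the bifurcation diagram."""
    points = []
    for rho2 in rho_values:
        series = simulate(rho2, T)            # hypothetical model call
        points.extend((float(rho2), float(s)) for s in series[-keep:])
    return points
```

Keeping only the tail of each run discards the transient, so a steady state appears as a single point per $\rho_2$ and a two-cycle as two points.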

4.3 Results learning-to-forecast

The simulation results for the learning-to-forecast structure also show that the population of artificial agents learns to coordinate towards perfect-foresight stationary equilibria. For low values of $\rho_2$ ($\rho_2 < 4$) the bifurcation diagram in Fig. 5 shows convergence towards the monetary steady state, representing a positive amount of aggregate savings. This is consistent with the results found by the learning-to-optimize model, Grandmont (1985) and Bullard and Duffy (1998). As before, the monetary steady state loses its stability when the value of $\rho_2$ is around 4. When the value of $\rho_2$ increases further ($\rho_2 > 4$), the bifurcation diagram also shows some evidence suggesting convergence towards a two-cycle type of behavior, although this evidence is again not pronounced. When this value increases further, such two-cycle behavior vanishes and we observe convergence towards a steady state. Here, the bifurcation diagram also shows no evidence of convergence towards equilibria with chaotic or higher-order periodic behavior. To get more insight into the dynamics we also computed time series for the price and aggregate savings per capita for several single values of $\rho_2$: $\rho_2$ = 3, 5 and 13. Here, we choose a value of 5 instead of 6 because the two-cycle type of behavior vanishes for higher values of $\rho_2$ ($\rho_2 > 5.8$), as illustrated in Fig. 5.

Fig. 5 Bifurcation diagram. The last 50 of the 1200 iterations are plotted for the learning-to-forecast GA structure.³

³ Incomplete bifurcation diagram. Due to time limitations we limit the bifurcation


Fig. 6 Time series, computed for two different initial states for each value of $\rho_2$. Top: $\rho_2 = 3$; middle: $\rho_2 = 5$; bottom: $\rho_2 = 13$.

For all $\rho_2 > 2$ the numerical simulations of the time series only result in two possible outcomes: fast convergence towards the monetary steady state and fast convergence towards a two-cycle type of behavior, as illustrated in Fig. 6. The simulations conducted for $\rho_2 = 3$ confirm the results suggested by the bifurcation diagram for low values of $\rho_2$ ($\rho_2 < 4$), namely convergence towards the monetary steady state. Furthermore, the simulations conducted for $\rho_2 = 5$ show convergence towards a two-cycle type of behavior. Here, we also observe a period-doubling bifurcation type of dynamics when the value of $\rho_2$ increases beyond 4: the existing steady state loses its stability and the dynamics converge towards a two-cycle type of behavior, as illustrated in Fig. 6. Furthermore, the simulations conducted for $\rho_2 = 13$ show convergence towards a steady state in which agents save almost all of their initial endowments. The value of this steady state is equal to the value of the upper bound of the two-cycle found in the learning-to-optimize model for high values of $\rho_2$. Here, based upon the results of previous research and the results drawn from the numerical simulations of the learning-to-optimize model for high values of $\rho_2$, we believe that the dynamics also converge towards a two-cycle type of behavior. The missing values for the lower bound of the two-cycle could be due to limitations of the numerical solving.
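One simple, hedged way to label the tail behavior of a simulated savings series as a steady state or a two-cycle is to compare successive and alternating values; the tolerance below is an arbitrary illustration, not a value used in the paper.

```python
def classify_tail(series, keep=50, tol=1e-3):
    """Classify the last `keep` iterates of a series as a steady state,
    a two-cycle, or neither. A steady state has all tail values equal
    (within tol); a two-cycle has constant even- and odd-indexed
    subsequences that differ from each other."""
    tail = series[-keep:]
    if all(abs(x - tail[0]) < tol for x in tail):
        return "steady state"
    evens, odds = tail[0::2], tail[1::2]
    if (all(abs(x - evens[0]) < tol for x in evens)
            and all(abs(x - odds[0]) < tol for x in odds)
            and abs(evens[0] - odds[0]) >= tol):
        return "two-cycle"
    return "neither"
```

Under such a check, the learning-to-forecast run at $\rho_2 = 13$ would be labeled a steady state even if the underlying attractor were a two-cycle whose lower branch the numerics failed to resolve.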

To sum up, for $\rho_2 = 3$ the simulation results show the same type of behavior as predicted by Grandmont (1985) and Bullard and Duffy (1998), whereas for $\rho_2 = 5$ and 13 the simulation results show a different type of behavior than predicted: for these values of $\rho_2$ we only observe convergence towards a two-cycle type of behavior. Finally, the bifurcation diagram and time series show no evidence of convergence towards the higher-order periodic equilibria or chaotic behavior that exist under perfect foresight.

5 Concluding remarks

The main topic of this paper was to investigate whether a population of artificial, heterogeneous adaptive agents is able to learn by genetic algorithm to coordinate towards perfect-foresight stationary equilibria in a general-equilibrium economic environment. To this end we conducted simulations for two general genetic algorithm alternatives, learning-to-optimize and learning-to-forecast. We studied the results generated from the simulations by comparing them to the theoretical results for the same economic environment under perfect foresight (Grandmont, 1985), as well as to those generated by an experimental economic environment in which agents learn by a specific genetic algorithm structure, updating their forecast rule based upon past prices (Bullard and Duffy, 1998).

The results of the numerical simulations for both genetic algorithm alternatives suggest that the population of artificial, heterogeneous adaptive agents, as presented in this paper, is only able to coordinate towards two possible types of perfect-foresight equilibria: a steady state and a period-two cycle, even when learning by a genetic algorithm with a general structure. Furthermore, the results show no evidence of convergence towards the higher-order periodic equilibria or chaotic behavior that exist under perfect foresight or naive expectations.


The findings of this paper provide no evidence of non-convergence of the genetic algorithm. We can therefore suggest that chaotic behavior is not relevant under genetic algorithm learning, not even when the genetic algorithm has a general structure, as was also suggested by the results of Arifovic (1995). Based on these findings we can rule out convergence towards the chaotic behavior that exists under perfect foresight while agents learn by genetic algorithm. Furthermore, the results suggest that it is easier for an agent to learn to forecast than to learn to make an economic decision: the dynamics of the learning-to-optimize alternative converge slowly towards the perfect-foresight equilibria, whereas the dynamics of the learning-to-forecast alternative converge quickly.


References

Arifovic, J., 1994. Genetic algorithm learning and the cobweb model. Journal of Economic Dynamics and Control 18, 3-28.

Arifovic, J., 1995. Genetic algorithms and inflationary economies. Journal of Monetary Economics 36, 219-243.

Arthur, B., 1991. Designing economic agents that act like human agents: A behavioral approach to bounded rationality. American Economic Review: Papers and Proceedings of the 103rd Annual Meeting of the American Economic Association, 353-359.

Bäck, T., 1993. Optimal mutation rates in genetic search. In: S. Forrest (ed.), Proceedings of the Fifth Annual Conference on Genetic Algorithms. San Mateo: Morgan Kaufmann.

Bullard, J. and J. Duffy, 1998. Learning and the stability of cycles. Macroeconomic Dynamics 2, 22-48.

Crawford, V.P., 1991. An 'evolutionary' explanation of Van Huyck, Battalio and Beil's experimental results on coordination. Games and Economic Behavior 3, 25-59.

Dawid, H., 1999. Adaptive Learning by Genetic Algorithms: Analytical Results and Applications to Economic Models, 2nd ed. Berlin: Springer-Verlag.

Gale, D., 1973. Pure exchange equilibrium of dynamic economic models. Journal of Economic Theory 6, 12-36.

Grandmont, J.M., 1985. On endogenous competitive business cycles. Econometrica 53, 995-1045.

Grefenstette, J.J., 1986. Optimization of control parameters for genetic algorithms. IEEE Transactions on Systems, Man, and Cybernetics 16, 122-128.

Heemeijer, P., Hommes, C., Sonnemans, J., Tuinstra, J., 2012. An experimental study on expectations and learning in overlapping generations models. Studies in Nonlinear Dynamics and Econometrics 16(4), 1-47.

Holland, J.H., 1975. Adaptation in Natural and Artificial Systems. Ann Arbor: University of Michigan Press.

Hommes, C., Sorger, G., Wagener, F., 2013. Consistency of linear forecasts in a nonlinear stochastic economy. In: Global Analysis of Dynamic Models in Economics and Finance: Essays in Honour of Laura Gardini, 229-287.

Lucas, R.E., Jr., 1986. Adaptive behavior and economic theory. Journal of Business 59, 401-426.

Lux, T. and S. Schornstein, 2005. Genetic learning as an explanation of stylized facts of foreign exchange markets. Journal of Mathematical Economics 41, 169-196.

Marimon, R., McGrattan, E., Sargent, T.J., 1989. Money as a medium of exchange in an economy with artificially intelligent agents. Journal of Economic Dynamics and Control 14, 329-373.

Miller, J.H., 1989. The coevolution of automata in the repeated prisoner's dilemma. Working paper no. 89-003, Santa Fe Institute, Santa Fe, NM.

Miller, J.H. and J. Andreoni, 1990a. A coevolutionary model of free riding behavior: Replicator dynamics as an explanation of experimental results. Working paper 90-01-004, Santa Fe Institute, Santa Fe, NM.

Samuelson, P.A., 1958. An exact consumption-loan model of interest with or without the social contrivance of money. Journal of Political Economy 66, 467-482.
