
Neural networks as meta-models of multi-agent models

Layout: typeset by the author using LaTeX.

Neural networks as meta-models of multi-agent models

Jelle van den Broek
11882425

Bachelor thesis
Credits: 18 EC
Bachelor Kunstmatige Intelligentie

University of Amsterdam
Faculty of Science
Science Park 904
1098 XH Amsterdam

Supervisor
P. (Peter) Fratrič
Informatics Institute
Faculty of Science
University of Amsterdam
Science Park 904
1098 XH Amsterdam

June 26th, 2020


Abstract

Simulation is a frequently used method to predict the future state of complex systems, such as multi-agent systems. However, a downside of running simulations is that they can be extremely memory-intensive and time-consuming. A way to reduce the time and memory spent on running simulations is to construct simpler models of the simulation, so-called meta-models, that approximate the outcome of the simulation. Such a meta-model can be constructed with a neural network. In this paper a simulation meta-model is constructed using a neural network to predict the outcome of a multi-agent system, namely the public goods game with diverse tolerance as described by M. Perc. In this game, every time step each agent chooses a strategy that affects its own and other agents' payoffs. After each round some of the agents adapt their strategy based on how large their payoff was relative to that of other agents. The main question of this project is to what degree of precision the total payoff of the next time step (or later time steps) in the public goods game with diverse tolerance can be approximated using a simulation meta-model constructed with a neural network.


Contents

1 Introduction
2 Background
  2.1 Agent-Based Modelling
  2.2 Machine learning
  2.3 Meta-models
3 The model and meta-models
  3.1 The public goods game with diverse tolerance
  3.2 The neural network
4 Experiment
  4.1 Data generation
  4.2 Examining the data
  4.3 Training and validation
  4.4 Validation in the context of ABM
5 Discussion
  5.1 Predicting 1 time step ahead
  5.2 Predicting multiple time steps ahead
  5.3 Transfer learning
  5.4 Future Research
6 Conclusion


Chapter 1

Introduction

Simulation is a frequently used method to predict the future state of complex systems. An example of a complex system is a multi-agent system. Complex multi-agent systems are currently used in healthcare, marketing, manufacturing and the military for predicting future situations or for decision making. Simulations often achieve better results than mathematical models because mathematical models typically impose stronger assumptions in order to be (approximately) solvable, while simulators constructed as agent-based models make assumptions only on the micro-level of an agent. However, a downside of running simulations is that they can be extremely memory-intensive and time-consuming. A way to reduce the time and memory spent on running simulations is to construct simpler models of the simulation that predict its outcome. Such models of a simulation are called simulation meta-models. A simulation meta-model offers a trade-off between accuracy and the amount of computational power needed for predicting future states of a system.

Such a meta-model can be constructed with a neural network. In this paper a simulation meta-model is constructed using a neural network to predict the outcome of a multi-agent system, namely the public goods game with diverse tolerance [1]. In this game, every time step each agent chooses a strategy that affects its own and other agents' payoffs. After each round agents may adapt their strategy based on how large their payoff was relative to that of other agents. Subsequently their payoffs are calculated and a new round starts. The combined payoff of all agents indicates how well the system as a whole is doing. The simulation meta-model constructed in this paper is tasked with predicting this total score of all agents n rounds further. The performance of the meta-model is evaluated in terms of accuracy and computational complexity.


The main question of this project is to what degree of precision the total payoff of the next time steps in the public goods game with diverse tolerance can be approximated using a simulation meta-model constructed with a neural network. This question is extended into multiple subquestions. Firstly, is the simulation meta-model able to predict the overall payoff a predefined number of time steps ahead? Secondly, what is the best architecture for the neural network? And lastly, is this neural network still able to predict the total payoff if the dominant strategies of the agents change?


Chapter 2

Background

2.1 Agent-Based Modelling

An agent-based model is a computational model used for simulation consisting of elementary entities called agents. An agent is an individual autonomous element that can interact with other agents and its surroundings [2]. A model with multiple agents is called a multi-agent model. At the start of an agent-based model all agents are placed into the environment with a set of rules they have to follow. The current state of an agent is updated depending on the past states of the agent, the states of other agents, or the environment [3]. If randomness is included in the model, then Monte Carlo simulation can be used to generate a distribution of future states. This probability distribution can be used for analysis of the behavior of the agents, as will be shown later.

2.2 Machine learning

Machine learning algorithms are algorithms that construct mathematical models to approximate or predict solutions to the problem they are trained on. Machine learning algorithms improve themselves over time by training on data. They are therefore mainly used for optimization problems where a certain unknown function has to be approximated. Machine learning algorithms cannot be programmed directly to mimic such an unknown function [4]. Therefore certain parameters in the mathematical model are changed based on training data to approximate the function as well as possible. There are different ways to train a machine learning algorithm. In this paper supervised learning is used, where the computer is presented with input data (usually called X) and the data that it should output given that input (called Y). During training the model tries to adjust its parameters in such a way that, given the input data X, it outputs answers as similar as possible to the desired output Y. The error between the actual output and the desired output is called the loss or cost. The machine learning algorithm tries to minimize this cost. A few examples of supervised machine learning algorithms are support vector machines, decision trees and neural networks.

2.3 Meta-models

Simulations can often be used for solving complex problems where the exact solution is not known. A simulation model is used for obtaining the result. However, such a model can be very computationally expensive. When something small changes in the problem, like a parameter change or a time step, the model needs to be simulated again to calculate the new outcome. Simulating the big model again would cost too much time, therefore a simpler model is constructed. Such a model is called a meta-model. These models allow results to be obtained more efficiently than with big simulations [5]. Meta-models can be constructed with machine learning algorithms. In this paper a meta-model is constructed for a multi-agent-based model. In multi-agent models agents very often have a lot of interactions with each other or their environment. All these interactions have consequences and these consequences need to be calculated. This quickly becomes very time-consuming and computationally expensive. Therefore a meta-model can be created with machine-learning algorithms to get approximately the same result with fewer calculations, and it can perhaps be reused for other purposes [6].


Chapter 3

The model and meta-models

3.1 The public goods game with diverse tolerance

The multi-agent-based model used in this paper is the public goods game with diverse tolerance. A simplified version, the public goods game, is a simple multi-agent game very similar to the prisoner's dilemma. At the start of the game all agents are divided into X groups of size G. Each agent in the game can either cooperate or defect. This choice affects the payoff of each agent in that agent's group. By choosing to cooperate the agent contributes one point to the common pool of its groups. Defectors, however, contribute nothing to the common pool. The payoff of an agent is the sum of the common pools of its groups multiplied by the synergistic factor r, minus the amount the agent contributed to all pools. Therefore the payoff of a cooperator in group g, $P_c^g$, is calculated by equation 3.1, and the payoff of a defector by equation 3.2:

$$P_c^g = \frac{r N_c}{G} - 1 \qquad (3.1)$$

$$P_d^g = \frac{r N_c}{G} \qquad (3.2)$$

with $N_c$ being the number of cooperators in group g.

If there is only one contributor in the group, the synergistic factor changes to 1 for that group. As shown above, the payoff of a defector will always be higher than the payoff of a cooperator in the same group, since defectors do not contribute anything to the common pool. However, if every agent became a defector, nothing would be added to the common pool and the payoff of every agent would be 0. Therefore the best payoff for an individual agent is not directly linked to the best payoff for the group, and this creates a dynamic multi-agent system.


The actual game used in this paper, the public goods game with diverse tolerance, is a bit more complex. In this game two extra strategies are added for the agents to choose from. In addition to cooperators (C) and defectors (D) there are now also loners (L) and tolerant agents (M). Loners do not participate in the group and always get a payoff of 1 in that group. Tolerant agents change their strategy in the middle of a round based on how many defectors there are in the group: if there are too many defectors they act as a loner, otherwise they act as a cooperator. There are as many different levels of tolerance as there are agents in a group. A tolerant agent can decide how many defectors it tolerates, ranging from 0 to G-1. Tolerant agent $M_2$, for example, acts as a contributor as long as there are fewer than 2 defectors in its group. Tolerant agents do, however, pay an extra γ amount of points. The number of contributors in a group, tolerant agents included, can be calculated by equation 3.3:

$$T = N_c + \sum_{i=0}^{G-1} \delta_i N_{M_i} \qquad (3.3)$$

with $\delta_i$ being 1 if there are fewer than i defectors in the group and 0 otherwise. The payoff for each strategy in group g is calculated by the equations below:

$$P_c^g = \frac{rT}{T + N_D} - 1 \qquad (3.4)$$

$$P_d^g = \frac{rT}{T + N_D} \qquad (3.5)$$

$$P_l^g = 1 \qquad (3.6)$$

$$P_{M_i}^g = \delta_i P_c^g + (1 - \delta_i) - \gamma \qquad (3.7)$$
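To make the payoff calculation concrete, the Python sketch below evaluates equations 3.3-3.7 for a single group. It is illustrative only: the strategy labels, the function name group_payoffs, and the handling of edge cases (the single-contributor rule and groups without participants) are assumptions and not taken from the simulation code used for this thesis.

    R, GAMMA = 2.8, 0.35   # parameter values used later in the experiments

    def group_payoffs(strategies, r=R, gamma=GAMMA):
        """Per-group payoffs for strategies such as ['C', 'D', 'L', 'M2', 'M4']."""
        g = len(strategies)
        n_c = strategies.count('C')
        n_d = strategies.count('D')
        # delta_i = 1 if there are fewer than i defectors in the group (eq. 3.3)
        delta = [1 if n_d < i else 0 for i in range(g)]
        n_m = [strategies.count('M%d' % i) for i in range(g)]
        t = n_c + sum(delta[i] * n_m[i] for i in range(g))   # eq. 3.3
        r_eff = 1.0 if t == 1 else r     # assumed: single-contributor rule
        denom = max(t + n_d, 1)          # assumed: guard for groups of loners only
        p_c = r_eff * t / denom - 1      # eq. 3.4
        p_d = r_eff * t / denom          # eq. 3.5
        payoffs = []
        for s in strategies:
            if s == 'C':
                payoffs.append(p_c)
            elif s == 'D':
                payoffs.append(p_d)
            elif s == 'L':
                payoffs.append(1.0)      # eq. 3.6
            else:                        # tolerant agent M_i
                i = int(s[1:])
                payoffs.append(delta[i] * p_c + (1 - delta[i]) - gamma)  # eq. 3.7
        return payoffs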

The game is played on an N×N board with periodic boundaries, with an agent on each tile. Therefore there are $N^2$ agents in one simulation. Each agent is part of 5 overlapping cross-shaped groups of size 5, so one group of an agent consists of the agent itself, the agents vertically adjacent and the agents horizontally adjacent. At the start of the game the strategy of each agent is initialized uniformly at random, so an agent has a 1/8 chance to start with each strategy (C, D, L, $M_0$, $M_1$, $M_2$, $M_3$, $M_4$). After the initialization the time steps are executed. Each time step consists of first calculating the payoff of every agent, which is the sum of all the payoffs it gets in all its groups. Secondly, some of the agents are randomly selected and each of them picks a random adjacent agent. Subsequently the selected agents have a probability to adopt the strategy of their neighbour based on the payoffs of both agents. The probability that agent 1 adopts the strategy of agent 2 is given by the Fermi function 3.8:

$$\frac{1}{1 + \exp(2(P_1 - P_2))} \qquad (3.8)$$

with $P_1$ and $P_2$ being the payoff of agent 1 and agent 2 respectively. Although it is more likely that an agent adopts a better strategy, it is also possible that an agent adopts a strategy that is worse than its current one.
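The sketch below illustrates one such imitation step on a periodic board, using the Fermi rule of equation 3.8. The array layout, the update fraction and the synchronous copy are assumptions made for illustration; the actual simulation code may differ.

    import numpy as np

    rng = np.random.default_rng(0)

    def neighbours(i, j, n):
        # the four neighbours of cell (i, j) on an n x n board with periodic boundaries
        return [((i - 1) % n, j), ((i + 1) % n, j), (i, (j - 1) % n), (i, (j + 1) % n)]

    def fermi(p1, p2):
        # probability that agent 1 adopts the strategy of agent 2 (eq. 3.8)
        return 1.0 / (1.0 + np.exp(2.0 * (p1 - p2)))

    def update_strategies(strategy, payoff, fraction=0.5):
        # one time step of imitation: a random fraction of agents may copy a random neighbour
        n = strategy.shape[0]
        new_strategy = strategy.copy()
        for _ in range(int(fraction * n * n)):
            i, j = rng.integers(0, n, size=2)
            k, l = neighbours(i, j, n)[rng.integers(0, 4)]
            if rng.random() < fermi(payoff[i, j], payoff[k, l]):
                new_strategy[i, j] = strategy[k, l]
        return new_strategy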

To observe how the system is doing, heatmaps are used to inspect the performance of individual agents, and the total payoff is used to see how well the whole system is doing. The heatmaps show the payoff of every agent in the grid (figure 3.1).

Figure 3.1: Heatmap showing the payoff of every agent

The total payoff is the sum of the payoffs of all agents combined. The total payoff is tracked during the game. An example of how the total payoff progresses in one simulation is shown in figure 3.2.


Figure 3.2: Total payoff of the model during one simulation

At the start of the game all strategies are generated uniformly at random and therefore the total payoff varies strongly. After around 100 time steps certain strategies become more prevalent than others, making the system more stable. Under different parameters different strategies thrive. The dominant strategies in a simulation are mainly determined by the values of r and γ. Figure 3.3 shows the dominant strategies in the r-γ plane [1].


Running the game can be computationally very expensive. The more agents there are in the simulation, the more calculations are needed to simulate the game. Every time a time step passes, the payoff of every agent has to be calculated again, requiring a lot of computational power. If there are more agents in the game, more agents need to update for the game to stabilize; therefore more agents need to update every time step or more time steps are needed. Thus adding more agents to the model might seem inefficient for running this simulation. However, if the board is too small one dominant strategy can, with some luck, take over the board entirely, causing the board to become stale and not suitable for drawing data from. Therefore a balance between the quality of the data and the computational power needed to run the simulation has to be found.


3.2 The neural network

Since the exact function that maps a board state to the outcome of running the simulation one time step is unknown, the outcome can be approximated using Monte Carlo simulation. Running all the Monte Carlo steps needed for long simulations costs a lot of time [7]. Therefore, as a surrogate for running these steps, this paper uses a convolutional neural network that, based on the heatmap of a board state, approximates the total payoff N time steps further. Figure 3.4 shows the aim of the meta-model.

Figure 3.4: Aim of the meta-model

A neural network (NN) is a powerful machine learning model that is inspired by the neurons in the human brain. It is a very flexible method that can handle many different inputs and can model any continuous function. A neural network consists of multiple layers that in turn consist of nodes called neurons [8]. Figure 3.5 shows a schematic overview of a basic neural network with its layers.


Figure 3.5: A basic neural network

The first layer of a neural network is called the input layer. The input layer takes the input data (X) and transforms the data into a vector or matrix. The layers between the input layer and the last layer are called the hidden layers. In the hidden layers the main part of the computation happens. With the values of the input layer, the values of the neurons in the first hidden layer are calculated using weights and biases. The input vector is multiplied by the weight matrix (w), creating a new vector. After the new vector is calculated, a non-linear activation function is applied to all the values in the new vector, creating the vector of the first hidden layer, a. By applying a non-linear activation function the neural network is able to model non-linear functions. The activation function used for every layer in the neural network is the ReLU function as described in equation 3.9. The values of vector a are calculated as described in equation 3.10 [9]:

$$\mathrm{ReLU}(x) = \max(0, x) \qquad (3.9)$$

$$a_j = \mathrm{ReLU}\left(\sum_{i=0}^{D} w_{ji} x_i\right) \qquad (3.10)$$

A hidden layer where these calculations happen is called a dense layer. After all the values in the second layer are calculated, the values are put in a vector and used in the same way to calculate the values of the nodes in the next layer. These calculations continue until the vector reaches the last layer, the output layer. The output layer outputs the solution calculated by the neural network. Based on how far the solution is off, the neural network adjusts its weights and biases using backpropagation to make the cost as low as possible [10]. The cost functions used in this paper are the mean squared error and the mean absolute error. The mean squared error is calculated with equation 3.11 and the mean absolute error with equation 3.12:

$$MSE = \frac{\sum_{i=0}^{N} (X_i - Y_i)^2}{N} \qquad (3.11)$$

$$MAE = \frac{\sum_{i=0}^{N} |X_i - Y_i|}{N} \qquad (3.12)$$

with N being the number of data points in the data set.
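For reference, equations 3.9-3.12 can be written out in a few lines of NumPy; the bias term is omitted here, as in equation 3.10.

    import numpy as np

    def relu(x):                        # eq. 3.9
        return np.maximum(0.0, x)

    def dense_layer(x, w):              # eq. 3.10: a_j = ReLU(sum_i w_ji * x_i)
        return relu(w @ x)

    def mse(pred, target):              # eq. 3.11
        return np.mean((np.asarray(pred) - np.asarray(target)) ** 2)

    def mae(pred, target):              # eq. 3.12
        return np.mean(np.abs(np.asarray(pred) - np.asarray(target)))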

It is an emergent property of the public goods game with diverse tolerance that large groups of agents with the same strategy form. To help the neural network recognize these groups, and where they are located, convolutional layers are used. A convolutional neural network (CNN) is a specialized neural network that works best for grid-like data such as image data. A convolutional neural network utilizes a linear operation called convolution instead of solely using matrix multiplications. With convolution the neural network is able to recognize patterns very quickly, and it is therefore mainly used in image recognition. CNNs are for example able to recognize numbers, dogs and cats, and they are on a similar level as humans in some image recognition areas [11]. Convolutional layers can also remove noise from the data by smoothing it with weighted averages of the surrounding nodes. Therefore CNNs are useful for detecting groups of agents with the same strategy on a grid-like board and will aid the neural network considerably. After the convolutional layers, max pooling layers are used to reduce the time needed to train the neural network; moreover, agents with a low payoff often change strategies quickly, making their payoff less important for the future.


By changing the number of hidden layers and the size of the hidden layers, neural networks can learn very complex functions, because each new layer adds a number of parameters (weights) to the neural network. The sizes of the input layer and output layer are determined by the dimensions of the input data and the expected answer. Since the input data is a 50x50 heatmap, the input layer has 2500 nodes. The output of the neural network should be the total payoff N time steps ahead, therefore the output layer consists of only 1 node. The number of hidden layers and the number of neurons in the hidden layers is a bit trickier to decide, since they do not directly handle raw input and output. The hidden layers should be capable of modelling the update function of the board, so multiple hidden layers are needed, since the update function is quite complex and involves a lot of randomness. Adding extra layers, especially convolutional layers, is very memory-intensive and time-consuming during training, so models with few layers are preferred over networks with many layers. The hidden layers start with convolutional layers to remove noise from the data and to locate large groups of agents with the same strategy. The number of layers is mainly determined by experimenting and keeping what works. For simple edge detection and smoothing in images only three layers with 100 neurons were needed; since this is primarily what is needed here, only three convolutional layers were used. The three convolutional layers are followed by dense layers to model the main computations of the update function. The number of dense layers and the number of neurons in these layers were determined by experimentation. The neural network performed best with 3 or more dense layers, and adding more than three layers did not significantly improve the neural network, therefore the final network has 3 dense layers.
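A Keras sketch of an architecture consistent with this description is given below: three convolutional layers with max pooling, followed by three dense layers and a single output node. The filter counts, kernel sizes and dense-layer widths are assumptions, since the exact values are not reported above.

    import tensorflow as tf
    from tensorflow.keras import layers

    def build_meta_model(board_size=50):
        # 50x50 payoff heatmap in, predicted total payoff N time steps ahead out
        return tf.keras.Sequential([
            tf.keras.Input(shape=(board_size, board_size, 1)),
            layers.Conv2D(16, 3, activation='relu', padding='same'),
            layers.MaxPooling2D(),
            layers.Conv2D(32, 3, activation='relu', padding='same'),
            layers.MaxPooling2D(),
            layers.Conv2D(32, 3, activation='relu', padding='same'),
            layers.MaxPooling2D(),
            layers.Flatten(),
            layers.Dense(100, activation='relu'),
            layers.Dense(100, activation='relu'),
            layers.Dense(100, activation='relu'),
            layers.Dense(1),
        ])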


Chapter 4

Experiment

4.1 Data generation

The data needed to train the neural networks was collected by sampling the public goods game with diverse tolerance. The neural network has to approximate the future total payoff based on the heatmap of the payoffs of the current board. Therefore one data point consists of the heatmap of the current payoffs and the total payoff 1, 2, 3, 4, 5, 10, 20 and 30 time steps ahead. Before collecting data points the game was first run for 100 time steps so the system could stabilize. After the game stabilized, a data point was saved every four time steps. To keep the data set as general as possible, multiple simulations were run to give room for different patterns in the heatmap, depending on the prevalence of strategies in specific sub-regions. However, running a simulation is computationally expensive and time-consuming; running 1 complete simulation takes around 2 minutes (127 seconds). Therefore only 40 simulations were run, keeping data generation time-efficient while still yielding a general data set. 250 data points were collected from every simulation, making the data set consist of 10000 data points. After the entire data set was collected, it was divided into a training set of 8000 randomly selected data points and a test set of the remaining 2000 data points. This ensures that there is enough data to train on, making the set sufficiently general, while still having enough data in the test set. Before the data was used for training, the heatmaps were normalized by giving the agents' payoffs a zero mean and a standard deviation of 1. A fixed set of parameters was used for the simulations run in this paper to train the neural network more effectively. The size of the board should not be too large, because that would make running the simulations and gathering data extremely computationally expensive. Therefore a board size of 50x50 was chosen, which still ensures that no strategy can take over the board. Each time step half of the agents are updated. By updating many agents each step, the total payoff changes faster, and therefore fewer simulations are needed to collect data. For the parameters r and γ the values 2.8 and 0.35 were chosen respectively. These parameters were kept fixed so there was more data for the same parameters, making it easier and more efficient to train the neural network.
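The data pipeline described above could be sketched as follows. Here run_simulation is a placeholder standing in for the actual game code, and the per-heatmap normalization is one possible reading of the normalization described above; both are assumptions.

    import numpy as np

    HORIZONS = (1, 2, 3, 4, 5, 10, 20, 30)

    def normalise(heatmap):
        # zero mean, standard deviation 1 (assumed to be applied per heatmap)
        return (heatmap - heatmap.mean()) / heatmap.std()

    def collect_dataset(run_simulation, n_runs=40, warmup=100, stride=4,
                        points_per_run=250, horizons=HORIZONS):
        # run_simulation(n_steps) is assumed to return the payoff heatmap and the
        # total payoff of every time step; it is a stand-in for the game code
        X, y = [], []
        for _ in range(n_runs):
            n_steps = warmup + stride * points_per_run + max(horizons)
            heatmaps, totals = run_simulation(n_steps)
            for k in range(points_per_run):
                t = warmup + stride * k
                X.append(normalise(heatmaps[t]))
                y.append([totals[t + h] for h in horizons])
        X, y = np.array(X), np.array(y)
        idx = np.random.permutation(len(X))     # random 8000 / 2000 split
        return (X[idx[:8000]], y[idx[:8000]]), (X[idx[8000:]], y[idx[8000:]])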

4.2 Examining the data

The task of the neural network is to approximate the total payoff of the next time step of the simulation. To approximate this payoff, the neural network is trained with the current board state and the payoff in future states. The payoff in a future state is not always the same for the same board, because a lot of random elements are involved in updating the board, such as which agents are eligible for changing strategies and the Fermi function 3.8. Consequently it is impossible for the neural network to always make perfect approximations.

A given board state starts with a certain total payoff. After a time step the payoffs of the agents change, and therefore the total payoff also changes. With the data obtained before, the average distance between the total payoffs of consecutive time steps can be calculated. For one time step ahead, the total payoff changes on average by an absolute value of 94.4. This means that if the average error of the neural network is higher than 94.4, the neural network would not be very useful, since simply taking the total payoff of the starting board would give better predictions. The same applies for multiple time steps ahead. Table 4.1 shows the Average Absolute Total Payoff Change for all time steps analyzed in this paper.

time steps ahead    Average Absolute Total Payoff Change
1                   94.4
2                   135.9
3                   171.0
4                   205.5
5                   236.6
10                  392.5
20                  606.3
30                  766.1

Table 4.1: Average Absolute Total Payoff Change per number of time steps ahead
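This baseline is simply the error of a "no change" prediction and could be computed as in the sketch below (array names are illustrative):

    import numpy as np

    def average_absolute_change(totals_now, totals_future):
        # mean |total payoff h steps ahead - current total payoff|, i.e. the
        # error made by predicting that the total payoff does not change
        return np.mean(np.abs(np.asarray(totals_future) - np.asarray(totals_now)))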


Because of randomness both in the decision mechanism of the agents, expressed in equation 3.8, and in which agents are chosen to reconsider their strategy, the total payoff is not the same for each Monte Carlo simulation. By simulating a certain board state multiple times and collecting all the total payoffs of the future state, a probability distribution over the possible payoffs can be constructed. As shown in the QQ-plot 4.2, most quantiles of the empirical distribution intersect the quantiles of a normal distribution on the straight line, which means the distribution is very similar to a normal distribution.

Figure 4.1: Distribution
Figure 4.2: QQ-plot

The best possible approximation the neural network could make, to reduce the mean absolute error or mean squared error on average, would be to predict the mean of this normal distribution for each input. A problem while training the neural network is that it cannot be trained on these mean values, since it is only trained on the value of the distribution that happens to end up in the training set, which occurs only with a certain probability while generating the training set. If the neural network were to be trained only on the mean values, these means would first have to be calculated. That would require approximately 1000 times more data points and would make the model computationally inefficient.


According to the QQ-plot below (figure 4.4), the distribution two time steps ahead is still very similar to a Gaussian. This distribution has, however, a higher standard deviation than the distribution one time step ahead: since more time steps have passed, more randomness is involved. Therefore it becomes harder for the neural network to approximate the mean and thus to obtain a low mean squared error. The standard deviation of the probability distribution becomes higher the more time steps pass, so the error of the neural network will be higher when approximating more time steps ahead.


4.3 Training and validation

The neural network was trained to minimize the mean squared error between its approximation and the simulated total payoff. It was trained for 100 epochs with a batch size of 16 to accelerate the training. After 100 epochs the mean squared error stopped decreasing for all time steps ahead. Most training curves look like the figures below.

Figure 4.5: Training curve 1 time step ahead

Figure 4.6: Training curve 2 time steps ahead

Figure 4.7: Training curve 5 time steps ahead

Figure 4.8: Training curve 10 time steps ahead
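A sketch of this training set-up, reusing the hypothetical build_meta_model function and data arrays from the earlier sketches, is given below; the choice of optimizer is an assumption, since only the loss, the number of epochs and the batch size are specified above.

    model = build_meta_model()
    model.compile(optimizer='adam', loss='mse', metrics=['mae'])
    # X_train has shape (8000, 50, 50); the columns of y_train follow HORIZONS,
    # so column 0 is the total payoff 1 time step ahead
    history = model.fit(X_train[..., None], y_train[:, 0],
                        validation_data=(X_test[..., None], y_test[:, 0]),
                        epochs=100, batch_size=16)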


Table 4.2 shows the minimum mean absolute error and mean squared error for all time steps.

time steps ahead    MAE    MSE      Average Absolute Total Payoff Change
1                   90     13200    94.4
2                   115    20700    135.9
3                   135    28500    171.0
4                   150    35000    205.5
5                   175    48000    236.6
10                  240    89000    392.5
20                  295    133000   606.3
30                  320    152100   766.1

Table 4.2: Performance of the neural networks

All the results are better than the Average Absolute Total Payoff Change; therefore the neural network's approximations are better than simply taking the original total payoff. The difference between the absolute change and the mean absolute error is smaller for fewer time steps ahead, so the neural network seems unable to give a significant improvement for small numbers of time steps. However, the difference between the Average Absolute Total Payoff Change and the MAE becomes bigger when approximating the total payoff more time steps ahead, so the neural network seems significantly more useful for those time steps. When the neural network was asked to make predictions on a game with different parameters it performed rather poorly, with a mean absolute error over 1300. Therefore the neural network does not seem able to predict the future total payoff for different dominant strategies.


4.4 Validation in the context of ABM

A big problem of the experiment is that the neural network is not trained on the mean but on simulated data that can lie in the tail of the probability distribution of possible total payoffs. If such a simulated total payoff lies in the tail of the probability distribution, the mean squared and absolute error can still be high even when the network makes a decent approximation. For all networks, the highest mean absolute errors occur when the neural network predicts a total payoff at one end of the tail of the probability distribution while the simulated total payoff was at the other end. An example is shown in figure 4.9.

Figure 4.9: Worst probability distribution 1 time step ahead

Therefore solely using the mean squared error or mean absolute error to test the performance of the neural network might not be the fairest way. The probability distribution of possible total payoffs one time step ahead, given a certain board state, is very similar to a Gaussian. Therefore it might be better to check whether the prediction of the neural network lies inside the interval [µ − σ, µ + σ], because such predictions are still very good predictions of the total payoff. It might also be interesting to see whether a large part of the mean absolute error comes from the simulated payoff lying outside of that interval. Since it is computationally not feasible to create a probability distribution for all board states, this test can only be done for a few random board states per neural network. Therefore 50 board states were randomly selected from the test set and used to construct these probability distributions. These distributions were then categorized based on whether the prediction was inside the interval and whether the simulated total payoff was inside the interval. If both are inside the interval it is notated as (1, 1); if only the prediction is inside the interval it is notated as (1, 0); if only the simulated total payoff is inside the interval it is notated as (0, 1); and if both are outside the interval it is notated as (0, 0). Table 4.3 shows how many of the 50 distributions fall into each of these four categories.

Time steps ahead    (1,1)    (1,0)    (0,1)    (0,0)
1                   21       11       13       5
5                   17       8        16       9
10                  17       10       16       7
30                  14       11       12       13

Table 4.3: Number of distributions (out of 50) per category

As shown in the table above, the neural network for 1 time step ahead predicts the total payoff inside the interval [µ − σ, µ + σ] 32 out of 50 times, which would give it an accuracy of 64%. The neural network 5 time steps ahead has an accuracy of 50%, the neural network 10 time steps ahead an accuracy of 54%, and the neural network 30 time steps ahead an accuracy of 50%. Although this is based on a small sample, it gives another indication of how well the neural networks are actually performing.
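The categorisation used for table 4.3 could be implemented as in the sketch below, where samples are the Monte Carlo total payoffs gathered for one board state (the function and variable names are illustrative):

    import numpy as np

    def categorise(prediction, simulated, samples):
        # place one board state in a cell of table 4.3: each flag is 1 if the value
        # lies inside [mu - sigma, mu + sigma] of the sampled future total payoffs
        mu, sigma = np.mean(samples), np.std(samples)
        inside = lambda v: int(mu - sigma <= v <= mu + sigma)
        return inside(prediction), inside(simulated)

    # e.g. collections.Counter(categorise(p, s, mc) for p, s, mc in zip(preds, sims, runs))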

Evaluating the neural network this way might also not be the most accurate approach, since the original total payoff is often inside the interval when the number of time steps is small, which makes it easier for the neural network to make a prediction inside the interval. Therefore, to get the most accurate evaluation of the neural network, the difference between the MAE and the Average Absolute Total Payoff Change should be combined with this interval test, while keeping in mind how much time it takes to construct and evaluate the neural network, to see how efficient it actually is.


Chapter 5

Discussion

5.1 Predicting 1 time step ahead

The neural network does not seem able to make significant predictions of the total payoff 1 time step ahead. Although it performs better than just using the starting total payoff as a prediction, it only improves on that prediction by a very small amount on average. Since the standard deviation of the probability distribution of possible future payoffs and the average absolute total payoff change are about the same size, the starting total payoff often lies within the probability distribution. The starting total payoff is therefore not a bad prediction, which makes it hard for the neural network to improve on it. This small improvement might be too small to be worthwhile, and therefore the neural network is not suitable for predicting the total payoff 1 time step ahead.

5.2 Predicting multiple time steps ahead

The neural networks trained on the total payoff more time steps ahead perform better than the neural network trained on the total payoff 1 time step ahead. Since the mean of the distribution is on average further away from the starting payoff when more time steps pass, the original payoff becomes a worse prediction, leaving more room for the neural network to improve on it. The mean absolute error becomes bigger for neural networks approximating payoffs further into the future. That does not mean the neural network is performing worse, because the standard deviation of the probability distribution of possible future payoffs also grows, making the mean harder to approximate. Simulating a large number of time steps costs a significant amount of time, and therefore a neural network that is able to approximate the total payoff for these time steps is extra beneficial. However, analyzing these neural networks with Monte Carlo simulations and gathering data is more expensive for larger numbers of time steps. The neural network that predicts the total payoff 30 time steps ahead seems to be the best compromise, because it is accurate enough to give a good indication of whether the total payoff will increase or decrease and by roughly how much, while it is still analyzable and efficient enough to gather data for.

5.3 Transfer learning

A few experiments, in which the NN gave very poor predictions for different dominant strategies, suggest that a NN trained on one configuration of dominant strategies cannot be reused on another configuration of dominant strategies. A new configuration of dominant strategies often contains strategies that the neural network has never seen before, so it cannot estimate how those strategies will change over time. The neural network only sees the heatmap with the payoffs of every agent and not the specific strategies, and therefore it might even confuse strategies it was trained on. For the neural network to be able to make predictions for models with other parameters, it needs data from those models to train on.

5.4 Future Research

A possible extension of this work would be to train a more general meta-model on all possible parameters r and γ. This would require a lot more data and therefore a lot of extra time, making it too time-consuming for this project.

The same experiment could be tried with a bigger convolutional neural network with more convolutional and dense layers. Although this was experimented with in this paper, the full effect of adding more neurons or more hidden layers has not been explored sufficiently because of hardware limitations. By adding more layers the function might be approximated better.

Another way to train the neural network is to use as desired output not the simulated data but the mean of all possible total payoffs obtained from Monte Carlo simulations. This is, however, extremely time-consuming, making such a model computationally inefficient, which might nullify the purpose of the meta-model. Nevertheless, such a model could be used for analyzing other models.


Chapter 6

Conclusion

In this thesis the possibility of using a neural network as a meta-model for the public goods game with diverse tolerance is explored. A convolutional neural network is able to make decent approximations of the total payoff of the game. Due to the randomness in the game's update function it is hard to observe exactly how well the neural network is performing without using too much computation. The neural network is not suitable for approximating the total payoff 1 time step ahead, since the possible updated total payoffs do not differ much from the original total payoff, making it hard for the neural network to make a significantly better prediction.

However, the neural network does seem able to give better approximations of the total payoff more time steps ahead, and it can give a significant indication of whether the total payoff will increase or decrease in the future and by what amount. Constructing and validating the neural network for many time steps is, however, computationally expensive. Therefore the neural network excels at approximating the total payoff 30 time steps ahead, striking a nice balance between significant approximation and the computational power and time needed.


Bibliography

[1] Matjaž Perc. “Stability of subsystem solutions in agent-based models”. In: European Journal of Physics 39.1 (2017), p. 014001.

[2] Charles M Macal and Michael J North. "Agent-based modeling and simulation". In: Proceedings of the 2009 Winter Simulation Conference (WSC). IEEE. 2009, pp. 86–98.

[3] Stuart Russell and Peter Norvig. “Artificial intelligence: a modern approach”. In: (2002).

[4] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep learning. MIT press, 2016.

[5] Pramudita Satria Palar et al. "On the use of surrogate models in engineering design optimization and exploration: The key issues". In: Proceedings of the Genetic and Evolutionary Computation Conference Companion. 2019, pp. 1592–1602.

[6] Henri Pierreval. “A metamodeling approach based on neural networks”. In: Int. Journal in Computer Simulation 6.3 (1996), p. 365.

[7] Yinhao Zhu and Nicholas Zabaras. "Bayesian deep convolutional encoder-decoder networks for surrogate modeling and uncertainty quantification". In: Journal of Computational Physics 366 (2018), pp. 415–447.

[8] Shahzad Khan. Alpaydin Ethem. Introduction to Machine Learning (Adaptive Computation and Machine Learning Series). 2004.

[9] Christopher M Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics). 2007.

[10] Kevin P Murphy. Machine learning: a probabilistic perspective. MIT press, 2012.

[11] Robert Geirhos et al. "Comparing deep neural networks against humans: object recognition when the signal gets weaker". In: arXiv preprint arXiv:1706.06969 (2017).
