
Social multiplexity in a generalized Axelrod model of cultural dissemination


Academic year: 2021


University of Amsterdam

MSc Theoretical Physics

Master Thesis

Social multiplexity in a generalized

Axelrod model of cultural dissemination

Author:

Arjen Aerts

Supervisor:

dr. Diego Garlaschelli


Abstract

Social multiplexity is a ubiquitous feature of human social life. This thesis investigates the effect of social multiplexity on cultural dynamics in terms of cultural convergence, using a generalized version of the Axelrod model which incorporates network multiplexity and bounded confidence. This is mostly a computational study; where possible, analytical results are established as well. First, the effect of having multiple networks on a phase transition in the end-state, with the confidence threshold as control parameter, is studied for the following scenarios: random graphs, network-culture assortativity, updating networks and realistic initial cultures. Second, using the same scenarios, the model dynamics are explicitly analyzed for some values of the threshold. Third, attention is paid to the effect of cultural evolution on the underlying social networks in the presence of network updating. It is found that the effect of multiplexity differs between treatments, but in most cases promotes cultural convergence. An important mechanism is that local differences in connectivity between the layers lead indirectly to more cultural convergence, while increasing the time to reach the end-state. When assortativity is present, the effect of multiplexity becomes non-monotone. Network updating reduces cultural convergence and induces network dynamics that strongly depend on the confidence threshold. In the case of realistic initial cultures more diverse behavior is shown, with an additional phase between full cultural divergence and full cultural convergence. This phase turns out to be unstable when multiplexity is present. Moreover, in the second phase transition non-equilibrium behavior in the dynamics is shown to result from competition between the layers. Finally, in most cases the layer networks did not resemble realistic social networks.

Contents

1 Introduction
2 Literature review
2.1 Original Axelrod model
2.2 Extensions of the Axelrod model
2.2.1 Networks
2.2.2 Other extensions
2.3 Social multiplex
3 The model
3.1 Network-culture assortativity
3.1.1 Properties of the network
3.2 Generalized Axelrod model
3.3 End-state
4 Simulation set-up
4.1 Routes of investigation
4.1.1 Clusterings of the culture: global and layer dependent
4.1.2 Random initial culture
4.1.3 Structured initial culture
4.2 End-state analysis
4.2.1 Cluster size entropy
4.3 Dynamical analysis
4.3.1 Network observables
4.3.2 Variation of information
5 Results and discussion
5.1 Treatment 0: trivial multiplex
5.1.1 End-state results
5.1.2 Discussion
5.2 Treatment 1: random culture and random, static networks
5.2.1 End-state results
5.2.2 Dynamical results
5.2.3 Discussion
5.3 Treatment 2: network-culture assortativity
5.3.1 End-state results
5.3.2 Discussion
5.4.1 End-state results
5.4.2 Dynamical results
5.4.3 Discussion
5.5 Treatment 4: structured culture
5.5.1 End-state results
5.5.2 Dynamical results
5.5.3 Discussion


1 Introduction

The existence of systems with multiple layers of networks interacting with each other can have a profound impact on the dynamics of the system. Indeed, in 2003 a large power failure hit Italy, and it was particularly severe because the electricity network was coupled to the internet communication network [Buldyrev et al., 2010]. Multiple interacting networks or, more specifically, multiplices (i.e. networks that have different types of connections in different layers) are ubiquitous features of complex systems and social systems are no exception [Boccaletti et al., 2014].

In social science, multiplexity is an important area of study, since interaction in social systems often takes place within different (evolving) social environments. However, due to their complexity, social multiplex systems have not been studied extensively, apart from some recent examples [Quattrociocchi et al., 2014, Palchykov et al., 2014]. On the other hand, much work has been done on studying the Axelrod model, a stochastic cellular automaton based on principles from social science that describes cultural dynamics (i.e. the evolution of culture), where culture is represented as a stylized object [Axelrod, 1997, Castellano et al., 2000, Klemm et al., 2003b].

This thesis will mainly investigate the effect of social multiplexity on cultural dynamics in terms of cultural convergence, using a generalized version of the Axelrod model. In addition, some attention will be paid to the effect of this cultural evolution on the underlying social networks. The thesis uses an interdisciplinary statistical-physics approach to understand these aspects. Studying cultural dynamics is important for understanding the causes of cultural diversity and collective social behavior.

The generalized Axelrod model that is used here incorporates the presence of network multiplexity and bounded confidence, in addition to optional features: network assortativity, updating networks and empirically realistic initial cultures. It aims to model the interaction of social and cultural dynamics, with an association between social networks and subcultures. Since it is difficult to study the model analytically even in its simplest form, simulation will be used to study the model. An advantage of this is that many different scenarios can be studied in a similar way. In addition, analytical results will be established where possible.

Note that an example of social multiplexity has already been studied in the context of the Axelrod model by coupling the Axelrod model with social resource sharing dynamics [Huang and Liu, 2010]; this is a form of social multiplexity that does not imply having multiple networks. In the current work, social multiplexity is not of this kind, because the multiplexity is part of the Axelrod model and implies having multiple networks. When referring to social multiplexity, it is always meant here that the multiplexity concerns multiple networks (or layers).

For different regions of the model’s parameter space, both the end-state and the dynamical behavior are investigated in terms of the effect of having multiple layers. More specifically, the effect of multiplexity on the phase transition (with the confidence threshold as control parameter) is studied for the following scenarios: random graphs, network assortativity, updating networks and realistic initial cultures. In addition, the effect of the Axelrod model on the structure of the networks is also investigated when the networks update.

The rest of the thesis is outlined as follows. In Section 2 parts of the literature on the Axelrod model are reviewed. Then, Section 3 will present the model, while we elaborate on the simulation set-up in Section 4. The results and discussion are covered in Section 5. More specifically, Subsection 5.1 finds that the generalized Axelrod model introduces a compartmentalization that has some effect on its dynamical behavior. Subsection 5.2 shows that in the simplest case multiplexity leads to more cultural convergence, but this originates from multiple effects, some of which counteract cultural convergence. The most important mechanism is that local differences in connectivity between the layers lead indirectly to more cultural convergence, while increasing the time to reach the end-state. In addition, Subsection 5.3 finds that assortativity largely promotes cultural convergence, but introduces non-monotonicity in the effect of multiplexity. Updating networks, investigated in Subsection 5.4, reduce cultural convergence by decreasing opportunities for interaction with culturally more distant individuals at later times, so that the system settles down faster. Furthermore, for some values of the confidence threshold there is an indication of the formation of realistic networks over time. Then, in Subsection 5.5 it is found that the extra phase that is created when using the structured initial culture becomes less stable when multiplexity increases. Also, the effect of multiplexity on cultural convergence depends heavily on the confidence threshold. Moreover, for some parameter values the system sometimes shows non-monotonic dynamical behavior, paving the way for systematic non-equilibrium dynamics in the generalized Axelrod model. Finally, Section 6 concludes the thesis.

2 Literature review

The model that will be discussed in Section 3 builds on several modeling paradigms. In this section their place in the literature is discussed and some notation will be introduced.

2.1 Original Axelrod model

Cultural dynamics should strictly be viewed as a (generalized) opinion model (i.e. a model that describes the evolution of agents’ opinions over time). In the literature, however, a distinction is often made between opinion dynamics and cultural dynamics, where the former concerns opinion models that use scalar variables, while the latter treats opinion models that have a vector of such variables [Castellano et al., 2009]. Prominent examples of (scalar) opinion models are the Voter model [Clifford and Sudbury, 1973] and the Deffuant model [Deffuant et al., 2000], where the former treats an opinion as a binary variable (similar to the Ising model) and the latter treats an opinion as a continuous variable, taking values in the unit interval.


1997]. It generalizes the voter model in two ways: first, it has more than one opinion variable (called cultural features and the full set of cultural features is a cultural vector) and, secondly, it allows more than two values per cultural feature (called cultural traits). The dynamics (which will be outlined later in this subsection) are based on the assumption that similar (in terms of the cultural vectors) individuals interact more (homophily) and interaction leads to higher similarity (social influence), which are fundamental principles in social science. Much like the dynamics used for the Ising model (e.g. Glauber, Metropolis), this constitutes a relaxation process, whereby the system converges steadily towards an equilibrium.

In the original Axelrod model the most fundamental object is the cultural configuration (or culture), which consists of elements that live in a cultural space. Formally, a cultural space is a finite, discrete space (and, therefore, it is compact); an element from this space is called a cultural vector $v = \{v^1, \dots, v^F\}$, where $F$ is the dimension of the space, $v^k$ is called a cultural feature and $v^k \in \{1, \dots, q\}$ for $k = 1, \dots, F$ and $q$ is some positive integer (i.e. each feature has $q$ possible traits). A cultural feature could be any cultural property, for example the agent’s religion or its taste for wine. The state-space of the model (i.e. the culture) then consists of $N$ actors (or agents), where each agent $i$ is associated to a cultural vector $v_i$. Note that the number of possible states the system can be in is $q^{FN}$, which is finite, but very large for even moderate values of the parameters.

Define on this space a metric (or cultural distance) between agents $i$ and $j$ by $d_{ij} := d(v_i, v_j) := \sum_{k=1}^{F} d^k(v_i^k, v_j^k)/F$, where $d^k \in [0, 1] =: I$ is the cultural distance between two features (i.e. it is normalized; infinite distances are not possible). Note that by definition it also holds that $d_{ij} \in I$. In the original Axelrod paper [Axelrod, 1997], the variables were of nominal type, which means that $d^k(v_i^k, v_j^k) = 1$ if $v_i^k \neq v_j^k$ and zero otherwise. This is convenient for modeling purposes and realistic if there is no order in the variables. In this case, the cultural space has no boundaries and no center. However, if there is order it is more realistic and necessary for empirical research (i.e. questionnaires) to use ordinal variables, so that the distance between $v_i^k$ and $v_j^k$ is a function of $|v_i^k - v_j^k|/(q - 1)$.
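The distance definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the nominal feature distance is taken to be 0 for equal traits and 1 for different traits (so that identical vectors are at distance zero), and the ordinal distance uses the identity as the otherwise unspecified function of $|v_i^k - v_j^k|/(q - 1)$.

```python
from typing import Callable, Sequence


def nominal_feature_distance(a: int, b: int) -> float:
    """Nominal feature distance: 0 if the traits agree, 1 if they differ."""
    return 0.0 if a == b else 1.0


def ordinal_feature_distance(a: int, b: int, q: int) -> float:
    """Ordinal feature distance |a - b| / (q - 1).

    Assumption: the identity is used as the function of the scaled
    difference; requires q >= 2.
    """
    return abs(a - b) / (q - 1)


def cultural_distance(
    v_i: Sequence[int],
    v_j: Sequence[int],
    feature_distance: Callable[[int, int], float] = nominal_feature_distance,
) -> float:
    """Average of the feature distances over all F features."""
    if len(v_i) != len(v_j):
        raise ValueError("cultural vectors must have the same dimension F")
    return sum(feature_distance(a, b) for a, b in zip(v_i, v_j)) / len(v_i)
```

For ordinal variables one could pass, for example, `lambda a, b: ordinal_feature_distance(a, b, q)` as the `feature_distance` argument.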

In the original Axelrod model [Axelrod, 1997], it is assumed that the agents are organized on a two-dimensional lattice. At every time step the following actions occur in the indicated order: (i) An agent $i$ is selected at random; (ii) A neighbor of $i$, $j$, is selected at random; (iii) The agents interact with probability equal to their similarity $o_{ij} = 1 - d_{ij}$ (homophily); and (iv) Interaction consists of $i$ copying a random feature $v_j^k$ of $j$ for $k$ such that $v_i^k \neq v_j^k$ (social influence).
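Steps (i) to (iv) can be sketched as a single update function. This is a minimal illustration, not the implementation used in the thesis: the name `axelrod_step`, the list-of-lists representation of the culture and the neighbor lists are assumptions, and nominal features are assumed so that the similarity is the fraction of equal features.

```python
import random
from typing import List


def axelrod_step(culture: List[List[int]], neighbors: List[List[int]],
                 rng: random.Random) -> bool:
    """Perform one update of the original dynamics; return True if a trait was copied."""
    i = rng.randrange(len(culture))            # (i) pick an agent at random
    if not neighbors[i]:
        return False                           # isolated agent: nothing happens
    j = rng.choice(neighbors[i])               # (ii) pick a random neighbor of i
    n_features = len(culture[i])
    differing = [k for k in range(n_features) if culture[i][k] != culture[j][k]]
    overlap = 1 - len(differing) / n_features  # similarity o_ij = 1 - d_ij
    # (iii) interact with probability o_ij; no interaction if already identical
    if differing and rng.random() < overlap:
        k = rng.choice(differing)              # (iv) copy one differing feature of j
        culture[i][k] = culture[j][k]
        return True
    return False
```

Repeatedly calling `axelrod_step` until no pair of neighbors has an overlap strictly between zero and one would bring the system to an absorbing state.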

Note that the Axelrod model is a stochastic dynamical system (i.e. stochastic process) that starts at time $t = 0$ from initial condition $(v_i(0))_{i \in \{1, \dots, N\}} =: v(0)$ and its dynamics are specified by the updating mechanism. More specifically, it is a time-homogeneous Markov Chain that is absorbing, which means that the system always ends up in an absorbing state, a state in which the system will remain for all later time. The culture will converge to an absorbing state at an end-time $t = T$ in which every agent has the same cultural vector (full cultural convergence), to a state with multiple clusters of identical agents with different cluster sizes (partial cultural convergence or cultural divergence) or to a state with $N$ singleton clusters (full cultural divergence). In the original Axelrod model, $v(0)$ is generated by sampling each cultural feature of each cultural vector from a discrete uniform distribution (taking values in $\{1, \dots, q\}$), which will be referred to as the random culture. Although some attempts have been made to study the Axelrod model analytically, using Markov process theory [Lanchier, 2012] or a master equation approach [Castellano et al., 2000, Vazquez and Redner, 2007, Vilone et al., 2002], these typically rely heavily on approximations and cannot capture the full extent of the model.
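As a small illustration, the random culture and a coarse classification of a configuration can be sketched as follows. The helper names are hypothetical, and the classification is only a proxy: it counts distinct cultural vectors and ignores the network, whereas the clusters discussed in the text also depend on adjacency.

```python
import random
from typing import List


def random_culture(n: int, n_features: int, q: int,
                   rng: random.Random) -> List[List[int]]:
    """Sample every feature of every agent uniformly from {1, ..., q}."""
    return [[rng.randint(1, q) for _ in range(n_features)] for _ in range(n)]


def classify_configuration(culture: List[List[int]]) -> str:
    """Coarse label based only on the number of distinct cultural vectors."""
    distinct = len({tuple(v) for v in culture})
    if distinct == 1:
        return "full convergence"
    if distinct == len(culture):
        return "full divergence"
    return "partial convergence"
```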

When analyzing the Axelrod model, one usually studies observables that condense the state-space information. In the original paper [Axelrod, 1997], the author distinguished between cultural regions, which are groups of adjacent nodes with identical features, and cultural zones, which are groups of adjacent nodes that have positive overlap in their cultural features (i.e. the possibility to interact). When talking about a clustering of the set of agents, we say that a condition has to hold between a pair of nodes (e.g. a cultural distance of zero), but what is really meant is that two nodes are part of the same cluster if there is a path between the nodes such that each consecutive pair on this path satisfies this condition. Note that in the case of having a cultural distance of zero, each pair in the corresponding cluster has this property because it is an equivalence relation, but in general this is not the case.
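The clustering rule described here (two nodes belong to the same cluster if a path of adjacent pairs connects them, each pair satisfying the condition) amounts to finding connected components in a filtered graph. The sketch below assumes a hypothetical `clusters` helper that takes neighbor lists and a symmetric pairwise predicate; it is one possible way to compute cultural regions and zones, not necessarily the one used in the thesis.

```python
from typing import Callable, List


def clusters(neighbors: List[List[int]],
             related: Callable[[int, int], bool]) -> List[List[int]]:
    """Group nodes connected by paths of adjacent pairs that satisfy `related`."""
    n = len(neighbors)
    seen = [False] * n
    result = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, component = [start], []
        while stack:                      # depth-first search over allowed links
            i = stack.pop()
            component.append(i)
            for j in neighbors[i]:
                if not seen[j] and related(i, j):
                    seen[j] = True
                    stack.append(j)
        result.append(sorted(component))
    return result


# Four agents on a line 0-1-2-3 with two nominal features each.
culture = [[1, 1], [1, 1], [1, 2], [2, 2]]
line = [[1], [0, 2], [1, 3], [2]]
regions = clusters(line, lambda i, j: culture[i] == culture[j])
zones = clusters(line, lambda i, j: any(a == b for a, b in zip(culture[i], culture[j])))
```

In this toy example agents 0 and 3 share no feature, yet they end up in the same cultural zone through the chain 0-1-2-3, which illustrates why positive overlap does not define an equivalence relation on pairs.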

It was Axelrod’s observation that cultural convergence/divergence depends on the initial state of the system and whether the boundaries of the cultural zones dissolve before the process of homophily and social influence (i.e. the dynamics) settles down. Boundaries between cultural regions within cultural zones tend to dissolve as well, due to the randomness inherent in the updating mechanism. The initial state (and therefore the dynamics between cultural zones) changes as a result of parameter changes; Axelrod, therefore, used these concepts to explain cultural convergence/divergence for different parameters. In particular he looked at $q$, $F$, the number of neighbors on the lattice and $N$.

Applying this reasoning to the system, it is clear that when q is large there are more cultural zones initially and thus more cultural regions in the end-state, hence more cultural diversity. Similarly, when F is large there are fewer cultural zones initially, so more cultural convergence. When the number of neighbors is large, there are also fewer cultural zones in the beginning, hence more convergence. Finally, the effect of system size is most interesting. For small N , there are not many different cultures to begin with, so the number of domains in the end-state is small. However, for large N there is another effect that counteracts this one. If the system is large it takes a long time for it to settle down. Therefore, there are more opportunities for the boundaries (both regions and zones) to dissolve, resulting in fewer regions in the end-state. The effect of varying N is therefore non-monotone.

et al., 2009]. [Castellano et al., 2000] studied a (nonequilibrium) phase transition with $q$ as a control parameter for various values of $F$. Clearly, the two phases of such a one-dimensional phase transition are the ordered state (full cultural convergence) and the disordered state (full cultural divergence). It was observed that the phase transition is continuous for $F = 2$, but discontinuous for larger $F$. In terms of dynamics, it was seen that the number of active links (i.e. pairs of agents that have $0 < d_{ij} < 1$) showed non-monotonic behavior, increasing rapidly and then decreasing again; this effect is especially pronounced around the phase transition. In general, studying such a phase transition is interesting in itself, but looking at the behavior for different values of a parameter also makes it easy to identify the interesting parameter regions (i.e. around the critical value). Another study [Guerra et al., 2010] showed that consensus is reached much faster for most single cultural features than for the entire culture. Presumably, there is monotonicity in most of the cultural features but not in the culture as a whole. A separation of time-scales occurs, with a few bottleneck features.

2.2 Extensions of the Axelrod model

There have been extensions of the Axelrod model, incorporating cultural drift [Klemm et al., 2003a, Klemm et al., 2005], mass media [Gonzalez-Avella et al., 2005] and the role of dimensionality [Klemm et al., 2003c]. Cultural drift implies that there is a (small) probability that agents change the values of their cultural features without social interaction; clearly, this mechanism destabilizes culturally divergent states. Secondly, introducing mass media means there is a global field that influences all agents simultaneously, so that the Axelrod model does not consist of local interactions only. Finally, Axelrod suggested a 2D lattice as an interaction structure, having a geographical space in mind [Axelrod, 1997], but when thinking of the interaction structure more as a social space, it becomes interesting to consider the effect of dimensionality. Some other extensions will be discussed in detail below, as they will be important for the model discussed in Section 3.

2.2.1 Networks

Since the state space consists of finitely many agents, any graph (or network) structure can be imposed on it. Clearly, such a network then represents a social space (i.e. a social network), so that only agents with social ties can interact. Formally, a graph is an ordered pair $(V, E)$, where $V$ is the set of nodes $\{1, \dots, N\}$ and $E$ is a set of edges or links, where each link is related to two nodes (i.e. a link is an object that links two nodes). (Note that in this thesis only unweighted, undirected graphs with no self-links are considered.) A network can also be represented by an adjacency matrix $A$, which is an $N$ by $N$ matrix, with elements $a_{ij} := A[i, j] = 1$ if $i \neq j$ and they share a link, but zero otherwise. Note that $A$ is symmetrical and has a diagonal of zeros.

Before going further, some network measures are defined (most of the notation and definitions in this paragraph and the next are from chapters 1 and 3, respectively, of [Barrat et al., 2008]). Let $k_i$ be the degree (i.e. number of links) of node $i$. First, the link density $L$ is defined as the total number of links $E$ divided by the total possible number of links, namely $N(N - 1)/2$. Secondly, the degree distribution $P(k)$ of a network is defined as the probability that a randomly picked node $i$ has degree $k$. Note that by definition the average degree is $\langle k \rangle = 2E/N$. Thirdly, the clustering coefficient is defined as

$$C = \frac{1}{N} \sum_{i=1}^{N} \frac{\sum_{jl} a_{ij} a_{jl} a_{li}}{k_i (k_i - 1)}.$$

For each node $i$ the number of links between the neighbors of $i$ is divided by the maximum possible number of such links; the clustering coefficient is the average of this quantity over all nodes. Finally, the number of connected components is the number of groups of mutually reachable nodes such that there is no link between any of the groups; a measure of the connected components is the size of the giant component (or largest connected component) $G$. When used in this thesis, network measures will usually be normalized if this is not already true by definition (e.g. $G$ is divided by $N$ such that $G' := G/N = 1$ when the network has only one connected component).
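The measures defined above can be computed directly from an adjacency structure. The following sketch stores the graph as a list of neighbor sets; the function names are hypothetical, and nodes with fewer than two links are assumed to contribute zero to the clustering coefficient, a convention the text does not specify.

```python
from typing import List, Set


def link_density(adj: List[Set[int]]) -> float:
    """Number of links divided by the maximum N(N-1)/2."""
    n = len(adj)
    links = sum(len(nb) for nb in adj) // 2   # every link is stored twice
    return links / (n * (n - 1) / 2)


def clustering_coefficient(adj: List[Set[int]]) -> float:
    """Average over nodes of sum_{jl} a_ij a_jl a_li / (k_i (k_i - 1))."""
    total = 0.0
    for nb in adj:
        k = len(nb)
        if k < 2:
            continue                          # assumed convention: contributes 0
        # ordered pairs (j, l) of neighbors of i that are linked to each other
        closed = sum(1 for j in nb for l in nb if l in adj[j])
        total += closed / (k * (k - 1))
    return total / len(adj)


def giant_component_fraction(adj: List[Set[int]]) -> float:
    """Size of the largest connected component divided by N."""
    n = len(adj)
    seen = [False] * n
    largest = 0
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, size = [start], 0
        while stack:
            i = stack.pop()
            size += 1
            for j in adj[i]:
                if not seen[j]:
                    seen[j] = True
                    stack.append(j)
        largest = max(largest, size)
    return largest / n
```

For a triangle plus one isolated node, for instance, these functions give a link density of 0.5, a clustering coefficient of 0.75 and a normalized giant component of 0.75.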

One of the simplest types of networks is a regular lattice, as used in the original Axelrod model. Another network model is the Random Graph (RG), which will be used later on and is generated as follows. Each pair of nodes $(i, j)$ shares a link with probability $p$, independently over all pairs. Such a graph has several properties. We have that $\langle E \rangle = N(N - 1)p/2$, so $\langle k \rangle = (N - 1)p$. In addition, when $\langle k \rangle$ is larger than one, the graph will have a giant connected component, but when $\langle k \rangle$ is smaller than one, it will consist of many small subgraphs. Note that this result holds for $N \to \infty$ and is approximately true for large $N$. Furthermore, $\langle C \rangle = p$ and its degree distribution is approximately the Poisson distribution with parameter $\langle k \rangle$. This means that the random graph has a narrow degree distribution (i.e. most nodes have $k_i \approx \langle k \rangle$). Graphs that have a large clustering coefficient are called small-world graphs, while graphs that have a fat-tailed (i.e. broad) degree distribution are called scale-free graphs. A typical measure of the extent to which a degree distribution is fat-tailed is the kurtosis (or scaled, centered fourth moment) of this distribution. A kurtosis larger than three typically indicates fat tails. Small-world and scale-free graphs are typically referred to as complex networks and many real networks are complex. Specifically, most social networks are small-world, although it is not clear whether they are typically scale-free [Klemm et al., 2003b].
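The RG generation rule can be sketched as follows. The function name `random_graph` and the neighbor-set representation are illustrative assumptions; the quadratic loop over all pairs is only meant to mirror the definition, not to be efficient for large $N$.

```python
import random
from itertools import combinations
from typing import List, Set


def random_graph(n: int, p: float, rng: random.Random) -> List[Set[int]]:
    """Link every pair of nodes independently with probability p."""
    adj: List[Set[int]] = [set() for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j)   # undirected: store the link in both directions
            adj[j].add(i)
    return adj


graph = random_graph(200, 0.05, random.Random(42))
mean_degree = sum(len(nb) for nb in graph) / len(graph)
```

With these parameters the mean degree should lie close to $(N - 1)p = 9.95$, up to sampling fluctuations.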

In line with the general interest in (complex) networks from the statistical physics community [Albert and Barabasi, 2002], there has been some work studying the Axelrod model on RG’s, small-world networks and scale-free networks [Guerra et al., 2010, Klemm et al., 2003b]. Typically, it is found that these types of networks facilitate cultural convergence compared to regular lattices (while keeping the number of links fixed), for distinct reasons. As the amount of randomness is larger in the case of an RG compared to a lattice, there is a larger probability that traits spread. A larger clustering coefficient means that networks are better connected locally, which will facilitate the Axelrod dynamics (which consists of local interactions); clearly the clustering coefficient should not be too large, since otherwise different clusters will become distinct cultural regions, enhancing cultural diversity. Finally, when a network has a fat-tailed degree distribution there are some nodes, called hubs, with many links that are efficient in the spreading of traits. Note that the results for the small-world and scale-free networks depend on the specific network models used.

2.2.2 Other extensions

Three other extensions will also be important. First, the principle of Bounded Confidence (BC) has been used in the Axelrod model [Flache and Macy, 2008, De Sanctis and Galla, 2009]. In the original Axelrod model it was assumed that agents could interact if they had positive cultural overlap. However, in reality agents may only interact when the overlap exceeds a threshold $\theta$. Note that if $\theta = 0$ the original Axelrod model is recovered. Such a parameter may very well be specific to the individual, so using a single threshold represents a simplification. However, part of an individual’s level of trust may be caused by certain macro events, so that it is similar across agents. In addition, it appears that BC reduces cultural convergence and makes the model immune to cultural drift [De Sanctis and Galla, 2009]. Finally, the threshold may be used to define a cultural graph, as follows: for each pair of nodes $i$ and $j$, $a_{ij} = 1$ if $o_{ij} > \theta$ and $a_{ij} = 0$ otherwise [Valori et al., 2011].
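The cultural graph can be built directly from a culture and a threshold. The sketch below assumes nominal features, so that the overlap is the fraction of equal features, and uses a hypothetical function name `cultural_graph`.

```python
from typing import List, Set


def cultural_graph(culture: List[List[int]], theta: float) -> List[Set[int]]:
    """Link agents i and j whenever their cultural overlap strictly exceeds theta."""
    n = len(culture)
    adj: List[Set[int]] = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            equal = sum(a == b for a, b in zip(culture[i], culture[j]))
            overlap = equal / len(culture[i])
            if overlap > theta:
                adj[i].add(j)
                adj[j].add(i)
    return adj
```

The connected components of this graph are the cultural components referred to in the text; raising the threshold can only remove links, so the number of components cannot decrease.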

Since the interaction probability depends on the cultural distance between two neighbors and the confidence level θ only enters the model through this probability, the effect of BC (i.e. a higher θ) on the Axelrod model is that there are more cultural components (that is, connected components in the cultural graph) at t = 0 and typically there will be more cultural zones as well (where the cultural zone is generalized to BC).

Secondly, co-evolution of network and agents (i.e. dynamical networks) has been implemented in the context of the Axelrod model [Pfau et al., 2013, Centola et al., 2007]. It has been shown that this mechanism stabilizes the dynamics under cultural drift under some conditions [Centola et al., 2007], although the updating rule is different from the one that will be used in this thesis.

Thirdly, in the Axelrod model the dynamics start with an initial culture $v(0)$ that is completely random, but in reality cultures can be more complicated [Valori et al., 2011]. It is possible to run the Axelrod model on these realistic cultures [Stivala et al., 2014, Valori et al., 2011] or to generate them artificially [Stivala et al., 2014], so-called structured cultures. When using empirical data it is necessary to incorporate BC in the Axelrod model and it then becomes interesting to study the phase transition for varying $\theta$ (it is impossible to vary $q$ in the case of empirical data). It is typically observed that the phase transition in terms of $\theta$ is much less steep for realistic cultures than for random cultures, for which it is discontinuous (at least for large $F$, similar to the $q$-phase transition when $\theta = 0$).


should be the end-result of the Axelrod model, since a realistic culture is the result of a long term phenomenon and the Axelrod model is long term as well. If the Axelrod model results in diversity or convergence, then the resulting culture will most likely not be completely consistent with such a realistic culture. However, one must realize that in reality there are additional processes such as a growing population and some additional features (which are not incorporated in the specific model at hand) that have other effects. Such effects would then explain why an empirically realistic culture is different from the end-state.

2.3 Social multiplex

In most models of cultural dynamics, agents interact via a specific interaction network (i.e. they are located on a network and can only interact with neighbors). Such an interaction network should be a social network since the process of social influence takes place via social ties, which are then the links. In reality, a social environment can often be subdivided into several distinct social networks (e.g. work, sports, family), which is termed social multiplexity. Multiple self theory lends support to the assertion that agents’ behavior is dependent on their social environment [McConnell, 2011]. In addition, using the full (or aggregate) social network when social multiplexity is present, leads to inaccurate results for dynamic processes [Cozzo et al., 2013].

More generally, there has been an increased interest in the study of multilayer networks, which are collections of networks where the nodes in each layer are associated to nodes in other layers [Boccaletti et al., 2014]. An example of this in the context of opinion models is [Quattrociocchi et al., 2014]. A multiplex network is a special case of a multilayer network, where each node is only associated to its counterparts in the other layers (i.e. for practical purposes, the nodes are the same in each layer). A model of cultural transmission has been studied in this setting [Palchykov et al., 2014].

Formally, a multiplex is represented by a multigraph $G$, which is an ordered pair $(V, E)$, where $V$ is as before and $E$ is a multiset of unordered pairs of edges $\{G_1, \dots, G_M\}$ (i.e. there are $M$ layers). Again, this can be represented using the adjacency matrix, except that there is an adjacency matrix $A^{\alpha}$ for each layer number $\alpha$. Each cultural agent/actor can now be associated to a node in this multigraph. It now depends on which set of unordered edges or layer $G_{\alpha}$ one looks at, whether two agents have a social tie.

Thus far, we have not assumed any dependence between this multigraph and the cultural space discussed earlier. With only one social network, it is clear that every cultural feature belongs to that social space. However, this is not clear in the case of a social multigraph. Therefore, there needs to be a correspondence between the set of features and the set of layers. That is, it should be determined for each feature to what layer(s) it belongs. As an example consider an agent’s preference for beer; intuitively, such a cultural feature should be associated to a social network regarding friendship relationships. The combination of the culture and the social multigraph will be referred to as the social multiplex.

3 The model

The model that will be used, also referred to as the generalized Axelrod model, has the property that the social and cultural spaces are not one-to-one. Because of this, some additional notation has to be introduced. First, let $\beta_{\alpha}$ be the subset of features that contributes to layer $\alpha$ (from now on, we write layer $\alpha$ when we mean layer $G_{\alpha}$). Second, define a cultural subvector of an agent $i$ as the collection of traits $(v_i^k)_{k \in \beta}$, obtained by restricting to a specific set of features, $\beta$. The dimension of the subspace is $|\beta|$. Then, the cultural distance in this subspace, or subcultural distance, is defined as $d^{\beta}_{ij} := \sum_{k \in \beta} d^k(v_i^k, v_j^k)/|\beta|$. Finally, the subcultural overlap is $o^{\beta}_{ij} = 1 - d^{\beta}_{ij}$.

3.1 Network-culture assortativity

The social network in each layer may be determined independently of the initial culture. Alternatively, one can let the social networks be generated by the cultural features associated to that layer; in reality, an agent typically has social ties to agents that are culturally similar [Valori et al., 2011]. More specifically, the probability of a link between node $i$ and $j$ is $p^{\alpha}_{ij} = f(o^{\beta_{\alpha}}_{ij})$ for some increasing function $f : I \to I$. Note that in the latter case, the only input to the model (once the system parameters are specified) is the culture. In the case of multiplex networks this is a very convenient way of generating the networks; it solves the problem of having to handpick a specific network for each layer, which would be arbitrary. Furthermore, assortativity is complementary to the multiplex approach since each layer is associated to a subset of features, so that there is an assumed relation between the social and cultural spaces.
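A layer network with network-culture assortativity could be generated along the following lines. This is a sketch under assumptions: the function names are hypothetical, features are nominal, `beta` is a list of zero-based feature indices standing in for $\beta_{\alpha}$, and the increasing function $f$ is left as a parameter because this section does not fix a specific choice.

```python
import random
from typing import Callable, List, Sequence, Set


def subcultural_overlap(v_i: Sequence[int], v_j: Sequence[int],
                        beta: Sequence[int]) -> float:
    """Fraction of the features in beta on which the two agents agree."""
    return sum(v_i[k] == v_j[k] for k in beta) / len(beta)


def assortative_layer(culture: List[List[int]], beta: Sequence[int],
                      f: Callable[[float], float],
                      rng: random.Random) -> List[Set[int]]:
    """Link i and j with probability f(overlap restricted to beta)."""
    n = len(culture)
    adj: List[Set[int]] = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p_link = f(subcultural_overlap(culture[i], culture[j], beta))
            if rng.random() < p_link:
                adj[i].add(j)
                adj[j].add(i)
    return adj
```

As a usage example, `assortative_layer(culture, [0, 1], lambda o: o, random.Random(0))` would use the identity as an illustrative choice of $f$ on the first two features.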

3.1.1 Properties of the network

If the network is generated using the initial culture, it is of interest to know what the structure of this network is. Here, the focus is on one layer, so $F$ corresponds to the dimension of the subspace associated to that layer. First, assume that the generation process of the initial culture is stochastic. Then, a priori, the cultural vector of agent $i$ is a random vector $V_i$ that consists of $F$ random variables $V_i^k$. Then we get for the probability of having a link

$$p_{ij} = f\left(\frac{1}{F}\sum_{k=1}^{F} P(V_i^k = V_j^k)\right). \tag{1}$$

Note that even if all $V_i$ have the same distribution so that $p_{ij}$ is independent of $i$ and $j$, this does not constitute a RG, as the probabilities for different links are not independent, hence the joint probability of all the links together does not factorize. For example, if two nodes both have large cultural overlap with a third node, they are likely to have a larger than average cultural overlap with each other, so the clustering coefficient of the graph is expected to be higher than that of an RG.

For collections of $V_i$ that have the same distribution, the argument of $f$ in (1) can be interpreted as the average cultural overlap between those agents after the culture has been generated. Intuitively, the reason that the resulting (sub)graph is not a RG is that the actual cultural overlaps are not exactly equal to this average, but fluctuate around it. Assuming certain regularity properties of the culture generation process it holds that all cultural overlaps (and hence all cultural distances) converge to their average as $F \to \infty$. Therefore, if $F$ is large enough, the resulting graph will approximately be a RG. How large $F$ has to be depends on the specific generation process. Note that the condition that the random variables $V_i^k$ are independent over $k$ is sufficient but not necessary for the cultural distances to converge.

3.2 Generalized Axelrod model

The generalized Axelrod model is similar to the original Axelrod model (which includes the fact that the cultural features are of nominal type), but there are some modifications. Before any pair of neighbors is selected randomly, a layer $\alpha$ is selected with probability $1/M$. The interaction probability is modified to $r_{ij} = o_{ij}\,\Theta(o_{ij} - \theta)$, where $\theta$ is a predetermined threshold and $\Theta(\cdot)$ is the Heaviside function, defined as $\Theta(x) = 0$ if $x \le 0$ and $1$ if $x > 0$ (Bounded Confidence). Furthermore, interaction consists of the original agent copying one of the features $v_i^{\beta_\alpha}$ of his neighbor, in which the two differ. Finally, an optional feature of the model is that if the interaction is successful, the original individual updates its links with respect to all other agents in that layer according to the same rule that generated its links in that layer at $t = 0$ (see previous subsection). More specifically, if agent $i$ has a successful interaction in layer $\alpha$, then agent $i$'s links in layer $\alpha$ are deleted and with probability $p_{ij}^{\alpha} = f(o_{ij}^{\beta_\alpha})$ it will have a link with agent $j$ for all $j \neq i$, where $f$ is the same function as used in the generation of the multiplex at $t = 0$. Note that updating of the network according to this rule essentially implements network assortativity dynamically. Also, if updating is included, the generalized Axelrod model operates on the entire social multiplex (i.e. the state-variable is the social multiplex), instead of just the culture.
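A single update of the rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the simulations: the data layout (a list of trait lists for the culture, one list of neighbor sets per layer, one list of feature indices per layer), the order in which the agent and its neighbor are drawn, and all names are hypothetical. The default $f(o) = o$ is only one possible choice of the link function.

```python
import random


def overlap(v_i, v_j, features):
    """Fraction of the given features on which two cultural vectors agree."""
    return sum(1 for k in features if v_i[k] == v_j[k]) / len(features)


def axelrod_step(culture, layers, betas, theta, rng,
                 update_links=False, f=lambda o: o):
    """Attempt one interaction; return True if a trait was copied.

    culture: list of trait lists (one per agent)
    layers:  layers[alpha][i] is the set of neighbors of agent i in layer alpha
    betas:   betas[alpha] is the list of feature indices mapped to layer alpha
    """
    n_agents, n_features = len(culture), len(culture[0])
    alpha = rng.randrange(len(layers))            # each layer has probability 1/M
    i = rng.randrange(n_agents)                   # assumed: agent drawn uniformly
    neighbours = sorted(layers[alpha][i])
    if not neighbours:
        return False
    j = rng.choice(neighbours)

    # Interaction probability uses the overlap in the FULL cultural space.
    o_ij = overlap(culture[i], culture[j], range(n_features))
    r_ij = o_ij if o_ij > theta else 0.0          # bounded confidence

    # Only features of the selected layer on which the two differ can be copied.
    differing = [k for k in betas[alpha] if culture[i][k] != culture[j][k]]
    if not differing or rng.random() >= r_ij:
        return False
    k = rng.choice(differing)
    culture[i][k] = culture[j][k]

    if update_links:
        # Delete all links of agent i in this layer and redraw them with f.
        for other in layers[alpha][i]:
            layers[alpha][other].discard(i)
        layers[alpha][i] = set()
        for other in range(n_agents):
            if other == i:
                continue
            o_sub = overlap(culture[i], culture[other], betas[alpha])
            if rng.random() < f(o_sub):
                layers[alpha][i].add(other)
                layers[alpha][other].add(i)
    return True
```

Note that the sketch treats a pair without differing features in the selected layer as a failed interaction, which is an interpretation of the rule rather than something stated explicitly in the text.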

The updating of the social network makes sense, since the cultural dynamics are long term and are therefore expected to be on the same time scale as social network formation. Also, the specific updating rule is intuitive: if a node changes one of its features this is a significant event, so it makes sense that an agent reevaluates its surroundings in the corresponding social environment. Note that if a specific feature is associated to multiple layers, then it may be reasonable to update the links of this node in all corresponding layers, since the agent has changed culturally in all these layers. However, since the actual interaction took place in a specific layer, it seems that the dynamic process should only apply to the social environment where the interaction took place.


only operates on the level of the social network, so that the generalized Axelrod model extends these principles to a multiplex context. In contrast, the fact that links are generated (and updated) based on the subcultural similarity might also be regarded as homophily, but this time it only regards the cultural subspace. However, in the first case the object of change is the culture, while in the second case the object of change is the social network.

From the dynamical rules it follows that one feature that causes coupling is the fact that $r_{ij}$ depends on the full cultural distance; if it depended only on the subspace distance and each feature mapped to only one layer, there would be $M$ independent Axelrod models. In addition, if one feature maps to multiple layers, this also introduces a natural coupling between layers (strictly speaking, the object that associates features to layers is then a correspondence, since it would not be well-defined as a mapping).

3.3 End-state

In the case of the original Axelrod model, the end-state (i.e. the absorbing state) is a state where between adjacent nodes $i$ and $j$ the cultural overlap ($o_{ij}$) is either 1 or 0. In the case of BC this translates into the same thing, but with $o_{ij} = 1$ or $o_{ij} \le \theta$. Clearly, there can be no more exchange of cultural features in such a state. However, when there are multiple layers, the notion of adjacency is different; two nodes can be considered adjacent if $a_{ij}^{\alpha} = 1$ for at least one $\alpha$. If for $i$ and $j$, $d_{ij}^{\beta_\alpha} = 0$, there can be no interaction between them anymore. Therefore, an absorbing state in the multiplex is characterized by the following: for all pairs of nodes $(i, j)$ that have $o_{ij} > \theta$, there should be no $\alpha$ such that $a_{ij}^{\alpha} = 1$ and $d_{ij}^{\beta_\alpha} > 0$.
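The characterization of the absorbing state translates directly into a check over all linked pairs. The sketch below assumes the same hypothetical data layout as before (a list of trait lists, one list of neighbor sets per layer, one list of feature indices per layer); the function name is illustrative.

```python
def is_absorbing(culture, layers, betas, theta):
    """Return True if no further interaction is possible in the multiplex.

    A pair (i, j) can still interact if its full overlap exceeds theta and
    there is a layer in which the two are linked and differ on at least one
    feature mapped to that layer.
    """
    n_features = len(culture[0])
    for alpha, adjacency in enumerate(layers):
        for i, neighbours in enumerate(adjacency):
            for j in neighbours:
                equal = sum(1 for a, b in zip(culture[i], culture[j]) if a == b)
                o_full = equal / n_features
                differs_in_layer = any(
                    culture[i][k] != culture[j][k] for k in betas[alpha]
                )
                if o_full > theta and differs_in_layer:
                    return False
    return True
```

For example, two agents that agree on the features of the only layer in which they are linked form an absorbing configuration, even if they differ on features that belong to another layer.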

4 Simulation set-up

Before delving into the specifics, some general remarks are in order regarding the simulations. The approach taken to study the Axelrod model is empirically realistic, which has several implications. First, it makes no sense to change q as if it were a control parameter, like temperature, because q is a property of the system. The same holds for F . Of course, it is still interesting to look at different values of q and F , but not at a phase transition with respect to these parameters. Second, the control parameter θ can be viewed as an inverse measure of trust or confidence in a society; the larger the value of θ, the less confidence there is. It may therefore be more convenient to look at the control parameter ω = 1 − θ, which can be viewed as the (normalized) level of confidence in a society.

The critical threshold of a phase transition, $\omega_d$, can be defined as the ω such that the standard error or standard deviation of the order parameter averaged over different runs is largest [Klemm et al., 2003b]. In addition, the Cluster Size Entropy (CSE), which is defined in Subsection 4.2, is


when they do not, emphasis is on the CSE.

It is also more interesting to look at the order parameter as a function of confidence ω for other reasons. Changing the value of ω from 0 to 1 has the property that when ω = 0, there will always be full cultural divergence. Furthermore, if q is such that for the normal Axelrod model (i.e. ω = 1) there is cultural convergence, changing ω from 0 to 1 means going from cultural diversity to cultural convergence. Note that for small q this is usually the case and that small q corresponds to a realistic cultural system. Also note that q is the main factor in determining the network topology in the case of network assortativity, as is explained in Subsection 3.1.1, so that varying q would have a double effect on the model, which is undesirable.

Similarly, most realistic systems have large F . Therefore, we will generally look at systems with small q, large F and changing ω. As the threshold is always compared to the cultural overlap and there are only F + 1 possible values for the overlap, there are only F relevant values for ω to study. The collection of disjoint sets for which each ω is equivalent is [0, 1/F ], (1/F, 2/F ], ..., ((F − 2)/F, (F − 1)/F ], ((F − 1)/F, 1]. Each set is referred to as an equivalence class and an element from such a class is a representative. We will typically use the midpoint value as a representative, rounded to two decimal places.
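The equivalence classes and their representatives can be enumerated as in the following sketch. The function names are illustrative, and the small tolerance used to decide class membership at the boundaries is an implementation choice that guards against floating-point rounding.

```python
import math


def omega_classes(F):
    """Return (lower, upper, representative) for the F equivalence classes.

    Class 0 is [0, 1/F]; class k > 0 is (k/F, (k+1)/F]. The representative
    is the midpoint rounded to two decimal places.
    """
    classes = []
    for k in range(F):
        lower, upper = k / F, (k + 1) / F
        classes.append((lower, upper, round((lower + upper) / 2, 2)))
    return classes


def class_index(omega, F):
    """Index of the equivalence class that contains omega."""
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")
    # Upper bounds are inclusive, hence the ceiling; the tolerance keeps
    # values such as k/F from being pushed into the next class by rounding.
    return max(0, math.ceil(omega * F - 1e-12) - 1)


classes = omega_classes(36)
```

For F = 36 this gives 36 classes, the first of which has representative 0.01.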

A small number of agents (N = 100) is used, since the complexity of the model means running times are long. Furthermore, the correspondence between features and layers will be a symmetric mapping, which means that if there are M layers, there are M · Z features for some positive integer Z, and Z features map to each layer. In terms of the cultural parameters, we take (F, q) = (36, 6) for all simulations. The value of F is convenient since it is large enough to be empirically realistic, has nice numerical properties (there are 9 pairs (M, Z) whose product is 36) and is not too large (which would cause running times to be even longer). The value of q is chosen to be small, so that for ω = 1 the model reaches cultural convergence and a phase transition exists. Finally, note that some of the simulations have also been done using other, similar values of (N, F, q), but the results were not qualitatively different from the results obtained using the original parameter values.
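The admissible layer configurations and a symmetric mapping can be sketched as follows. The text does not specify which features are assigned to which layer, so assigning contiguous blocks of Z features to each layer is an assumption made purely for illustration, as are the function names.

```python
def layer_configurations(F):
    """All pairs (M, Z) of positive integers with M * Z == F."""
    return [(M, F // M) for M in range(1, F + 1) if F % M == 0]


def symmetric_mapping(F, M):
    """Assign Z = F / M features to each of the M layers.

    Assumption: layer alpha receives the contiguous block of feature
    indices alpha * Z, ..., (alpha + 1) * Z - 1.
    """
    if F % M != 0:
        raise ValueError("M must divide F for a symmetric mapping")
    Z = F // M
    return [list(range(alpha * Z, (alpha + 1) * Z)) for alpha in range(M)]


configurations = layer_configurations(36)
betas = symmetric_mapping(36, 4)
```

For F = 36 the first function returns the nine pairs (1, 36), (2, 18), (3, 12), (4, 9), (6, 6), (9, 4), (12, 3), (18, 2) and (36, 1).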

4.1 Routes of investigation

Even when (N, F, q) is fixed, in addition to the mapping from features to layers, there are still many degrees of freedom for which we can study the phase transition in ω. First, one can vary the number of layers. Second, the function f that determines the connection probability has not been determined. Third, the network can be static or dynamic. Finally, the initial culture can be generated in multiple ways.

As increasing the number of layers increases the multiplexity of the system, this lies at the heart of the investigation and will be done in every treatment. In terms of the remaining degrees of freedom, we distinguish between the following treatments per degree of freedom: $f(o_{ij}^{\beta_\alpha}) = p$, where $p \in [0, 1]$, versus $f(o_{ij}^{\beta_\alpha}) = o_{ij}^{\beta_\alpha}$; random culture versus structured culture; and static networks versus updating networks. This would give rise to eight treatments in total, but only four of these are investigated. The simplest treatment is the one where the networks are generated by the RG algorithm, the culture is random and the network does not update. In the second treatment we do the same except we add assortativity. In the third treatment, we do the same as in the second except that networks are allowed to update. Finally, the last treatment is the opposite of the first. In this way, the complexity increases at every step.

With respect to the outcome of the generalized Axelrod model, several observations have to be made. First, the system can be studied by looking only at the end-state or by explicitly looking at the dynamics, both of which will be done and are explained in the next two subsections. Second, the dimensionality of the system is so high that the only way to study it is to study clusterings (e.g. cultural regions) and, more specifically, the vector of cluster sizes. Indeed, it is not important what values the specific traits in each cultural vector in each cluster have, nor which vector is in which cluster, since these details provide no information regarding cultural convergence/divergence. Next, it will be discussed what type of clusterings will be used (both in the end-state and dynamical analysis).

4.1.1 Clusterings of the culture: global and layer dependent

When analyzing the generalized Axelrod model, some clusterings give important information regarding the structure of the culture. On the global level (i.e. the level of the social multiplex) it is not convenient to take into account which agents are connected to each other since this information is encoded at the layer level. Below, both global and layer dependent clusterings are defined for an arbitrary pair (i, j). (Note again that when we say a property has to hold between pairs of nodes, we mean that between each pair of nodes in the cluster there is a path such that each consecutive pair on the path has the property.)

Global clusterings

• cultural domain: $d_{ij} = 0$. The cultural domain could theoretically consist of multiple collections of nodes that are not linked within any of the social networks. However, for most parameter values this event is extremely unlikely (especially if there is network updating)

• cultural component: $d_{ij} < \omega$. The cultural component can be seen as clusters of nodes that have the possibility (if there would be a link in the appropriate layer) to interact and converge culturally; with network updating, such an interaction will most likely become possible over time in any of the layers. Note that the cultural component really is a connected component with respect to the cultural graph


Layer dependent clusterings

• cultural region α: $d_{ij}^{\beta_\alpha} = 0$ and $a_{ij}^{\alpha} = 1$. This is a direct analogue of the cultural region defined for the original Axelrod model

• cultural zone α: $d_{ij} < \omega$ and $a_{ij}^{\alpha} = 1$. This is a direct analogue of the cultural zone defined for the original Axelrod model; note that the cultural distance should depend on the full cultural space, since interaction depends on the full space. When the number of cultural regions equals the number of cultural zones in all layers, the system will be in its end-state, since no more dynamics can take place

It is important to note that each cultural component will always encapsulate at least one cultural zone in a specific layer, since cultural zones have a stronger requirement. The extra requirement $a_{ij}^{\alpha} = 1$ can only increase the number of cultural zones per cultural component (if there was no such requirement there would trivially be one cultural zone per cultural component, since the two would be the same). One implication that follows from this is that there will always be at least as many cultural zones as cultural components in total.
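Since all clusterings above are defined through paths of pairs that satisfy a pairwise property, they can be computed as connected components. The following sketch uses a simple union-find structure for this; the function names and the data layout (a list of trait lists, and a list of neighbor sets for one layer) are assumptions made for illustration.

```python
def count_clusters(n, related):
    """Number of connected components of the graph on n nodes in which
    i and j are joined whenever related(i, j) is True."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if related(i, j):
                parent[find(i)] = find(j)
    return len({find(x) for x in range(n)})


def distance(v_i, v_j):
    """Cultural distance in the full cultural space (nominal features)."""
    return sum(1 for a, b in zip(v_i, v_j) if a != b) / len(v_i)


def cultural_components(culture, omega):
    """Clusters of agents connected by pairs with d_ij < omega."""
    return count_clusters(
        len(culture),
        lambda i, j: distance(culture[i], culture[j]) < omega,
    )


def cultural_zones(culture, adjacency, omega):
    """Clusters of agents connected by linked pairs with d_ij < omega."""
    return count_clusters(
        len(culture),
        lambda i, j: j in adjacency[i]
        and distance(culture[i], culture[j]) < omega,
    )
```

Because the zone criterion adds the link requirement to the component criterion, `cultural_zones` can never return fewer clusters than `cultural_components` for the same culture and ω.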

4.1.2 Random initial culture

In this subsubsection some aspects of the random initial culture with regard to the network structure are discussed in more detail; the next discusses the structured culture used in the last treatment. In the case of the random culture, a priori each cultural vector $V_i$ is a vector of $F$ independent random variables that are uniformly distributed over the discrete values $\{1, \dots, q\}$. Then, using the results from 3.1.1, we have that the probability of having a link is $p_{ij} = f(1/q) = 1/q$. Therefore, the resulting graph will have a relatively simple structure. However, it is not clear what F will be large enough for the network to be like an RG.

We simulated the network generation process for many values of the parameters (N, F, q) (note that here F is the dimension of the subspace associated to a layer, which means that F = Z). In general, the resulting graph has properties (i.e. link density, degree distribution, clustering coefficient, connected component) that closely resemble those of the RG when Z is large. When Z is small the resulting graph has a large clustering coefficient. Both observations are in agreement with the discussion in 3.1.1. For (N, Z, q) = (100, 36, 6), the properties already closely resemble those of an RG. Note, however, that as M increases Z decreases, so the F corresponding to each layer becomes smaller. When there are many layers, Z is small, so the networks associated to each layer will have a relatively large clustering coefficient. Therefore, when assortativity is present, M has two effects on the dynamics: it influences the Axelrod dynamics and it affects the underlying topology of the network. Also, it was found that when Z > 1 (N = 100, q = 6) the graph is connected (i.e. G = 1). For Z = 1, this is not the case, but this corresponds to M = 36 and as we show later this case shows trivial dynamical behavior when there is assortativity. Finally, N should be large relative to q to obtain a connected graph, which is similar to the RG case, where N p should be much larger than 1 to obtain the result.
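The network generation experiment can be reproduced in outline with the sketch below, which draws a random culture and then generates one assortative layer with $f(o) = o$. It is a minimal illustration, not the code used for the reported simulations; the function names and the seed are arbitrary.

```python
import random


def random_culture(N, F, q, rng):
    """N cultural vectors of F independent traits, uniform on {1, ..., q}."""
    return [[rng.randint(1, q) for _ in range(F)] for _ in range(N)]


def assortative_layer(culture, features, rng, f=lambda o: o):
    """Link each pair (i, j) with probability f(subcultural overlap)."""
    n = len(culture)
    adjacency = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            equal = sum(1 for k in features if culture[i][k] == culture[j][k])
            if rng.random() < f(equal / len(features)):
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def link_density(adjacency):
    """Number of links divided by the number of possible links."""
    n = len(adjacency)
    return sum(len(neighbours) for neighbours in adjacency) / (n * (n - 1))


rng = random.Random(1)
culture = random_culture(100, 36, 6, rng)
layer = assortative_layer(culture, range(36), rng)
density = link_density(layer)
```

For (N, Z, q) = (100, 36, 6) the measured link density should lie close to the expected value 1/q, in line with the result $p_{ij} = 1/q$ for the random culture.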

4.1.3 Structured initial culture

The type of structured culture used in this thesis is based on the prototype evolution algorithm in [Stivala et al., 2014], which initializes the culture by generating the cultural vectors around a few fundamental cultural vectors, called prototypes. This algorithm is inspired by theories from social science which postulate that most individuals fall in a certain cultural category and the prototypes are then the most typical members of such a category (note that a prototype does not need to be an actual individual).

Specifically, in each layer two prototypes are generated by the requirement that they both have $f$ cultural traits in common with a third superprototype (which is just a randomly generated cultural vector), while the remaining traits are generated randomly. Around each prototype $N/2$ agents are generated by letting each agent have $g$ traits in common with the prototype while the rest is generated randomly. Letting $b = g/Z$, they will on average have an overlap with the prototype of $o = b + (1 - b)/q$. Therefore, this algorithm will produce two spherical shells of average radius $r_0 = 1 - o$ (while the radius of the outer sphere is $r = 1 - b$). Note that the prototypes are not included in the initial culture.

To compute the distance between the two prototypes (i.e. between the centers of the shells), $R$, let first $B = f/Z$. For each cultural feature three things can happen: both prototypes have cultural traits obtained from the superprototype (so they are equal to each other); only one of the prototypes has a trait obtained from the superprototype; or neither prototype has a trait obtained from the superprototype. Properly accounting for all the probabilities, this means that for each cultural feature separately the probability that the two prototypes have equal traits is

$$O = B^2 + 2B(1 - B)/q + (1 - B)^2/q = B^2 + (1 - B^2)/q. \tag{2}$$

Averaging over all cultural features gives that the average overlap between the prototypes is exactly the same expression, again denoted by O. If this is used as an estimate for the actual overlap it should be taken into account that it is the average overlap and especially for small Z there will be large fluctuations. For large Z the difference will be small; simulations confirm this. Since in our case Z will typically be large, this is a good approximation. The distance between the spherical shells will then (on average) be R = 1 − O. In addition, it should be noted that if B = 1, only one spherical shell will be created with N agents, since the prototypes are the same in that case. On the other hand, if b = 0, then the algorithm is the same as the algorithm that generates the random culture. However, even though it is a special case, the random culture algorithm can in principle generate any of the structured cultures (typically, however, the probability of generating one with realistic values of B and b is very small if F , q and N are non-trivial).
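Expression (2) can be checked numerically with a small Monte Carlo sketch. It assumes that each prototype copies $f$ traits from the superprototype at positions chosen uniformly at random and independently for the two prototypes, and that all other traits are drawn uniformly from $\{1, \dots, q\}$; these details, like the function names, are assumptions and may differ from the algorithm in [Stivala et al., 2014].

```python
import random


def make_prototype(superprototype, f, q, rng):
    """Copy f randomly chosen traits from the superprototype and draw the
    remaining traits uniformly from {1, ..., q}."""
    Z = len(superprototype)
    copied = set(rng.sample(range(Z), f))
    return [
        superprototype[k] if k in copied else rng.randint(1, q)
        for k in range(Z)
    ]


def expected_overlap(B, q):
    """Average overlap between the two prototypes according to (2)."""
    return B ** 2 + (1 - B ** 2) / q


def simulated_overlap(Z, f, q, trials, rng):
    """Average overlap between two prototypes over many independent draws."""
    total = 0.0
    for _ in range(trials):
        superprototype = [rng.randint(1, q) for _ in range(Z)]
        first = make_prototype(superprototype, f, q, rng)
        second = make_prototype(superprototype, f, q, rng)
        total += sum(1 for a, b in zip(first, second) if a == b) / Z
    return total / trials


rng = random.Random(42)
estimate = simulated_overlap(Z=36, f=18, q=6, trials=2000, rng=rng)
theory = expected_overlap(18 / 36, 6)
```

For $B = 1/2$ and $q = 6$ the expression gives $O = 0.375$, and the simulated average should agree with this up to sampling noise; individual draws fluctuate considerably more, as noted above.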


An expression similar to (2) can also be derived for the average distance between two subcultural vectors generated around the same prototype, by just replacing $B$ by $b$ in that equation. This can give important information on the network structure of the nodes around each shell. Using the results from 3.1.1 it follows that $p_{ij} = b + (1 - b)/q = 1/q + (1 - 1/q)b$ (note that this expression reduces to the one for a random culture if $b = 0$, as expected).

For the following, some notation has to be introduced. Since the above discussion was about a single layer, the spherical shell (or shell) for one layer will from now on be referred to as a subshell. Denote by $G_i^{\alpha}$ subshell $i$ in layer $\alpha$, where $i \in \{1, 2\}$. The prototype around which this subshell is generated will then be referred to as the subprototype. In addition, the subcultural graph in layer $\alpha$ is defined as the cultural graph with respect to the corresponding subspace (i.e. with $d_{ij}$ replaced by $d_{ij}^{\beta_\alpha}$).

Now, the question becomes how to select f and g. The most diverse behavior occurs when for some values of ω, there are two subcultural components (that is, if there was only one layer no interaction would occur between agents from the two subshells). In addition, the radius r should not be too small, since otherwise there would be trivial dynamical behavior in each subshell (in the most extreme scenario both subshells would consist of N/2 identical subprototypes). Finally, the average distance between subcultural vectors in each subshell should be equal to the distance between the subshells themselves. This should be true by construction, except for the fact that there is a maximal distance in the cultural space, namely 1.

These three requirements are surely fulfilled if the following conditions are satisfied:

$$R \ge 4r + \delta; \quad r \ge \epsilon; \quad R + 2r \le 1, \tag{3}$$

where $\delta$ controls how many threshold values should exhibit the required behavior outlined in the previous paragraph and $\epsilon$ determines how non-trivial the dynamics within the subshells will be. The parameter $r$ is set directly by $g$, while $R$ is set randomly by $f$ (since $R$ is the average distance between the prototypes), so some margin should be included in setting $R$. In practice (i.e. for significant values of $\delta$ and $\epsilon$ and with $Z \le 36$), the conditions in (3) yield no feasible pair $(f, g)$, even if one replaces $r$ by $r_0$. However, these conditions are based on extremely small probabilities.

For example, the probability that two cultural subvectors are generated in the two subshells that have the smallest distance possible (i.e. ‘lie on the line between the subprototypes’) and also have the highest distance possible to any other agent within their respective shells, is extremely small. Therefore some simulations were performed that computed the following quantities for each generated structured culture:

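$$d_{\min} = \min_{i \in G_1^{\alpha},\, j \in G_2^{\alpha}} d_{ij}^{\beta_\alpha}; \quad d_{\max} = \max_{i \in G_1^{\alpha},\, j \in G_2^{\alpha}} d_{ij}^{\beta_\alpha}; \quad d_{\min}^{k} = \min_{i \in G_k^{\alpha},\, j \in G_k^{\alpha}} d_{ij}^{\beta_\alpha}$$

and it should at least hold that $d_{\min} \ge d_{\min}^{k} + \delta$ for $k = 1, 2$ and $d_{\max} \le 1$. Based on the results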


generation process is stochastic). Some additional constraints are due to the fact that there are multiple layers, which is discussed next.

In the case of a multiplex, there are cultural subspaces associated to each layer, so it becomes necessary to match the cultural subvectors in each layer. The straightforward way to do this is to have a mapping between subprototypes across layers. In effect one then has multiple prototypes, each composed of different subprototypes, and the collection of agents associated to each prototype depends on this mapping. In general there will be $2^M$ such prototypes. For more than two layers, the situation becomes increasingly complex, so to keep things simple we will only use M = 1 and M = 2 for the treatment with realistic cultures. When M = 1, there are simply two subprototypes, which are also the prototypes.

Now, consider the case M = 2. In one extreme case, subprototype 1 in layer 1 would be matched to subprototype 1 in layer 2 (i.e. the subcultural vectors that are generated around subprototype 1 in layer 1 are matched to those generated around subprototype 1 in layer 2) and subprototype 2 in layer 1 is matched to subprototype 2 in layer 2 (or vice versa), which corresponds to no mixing. In the other extreme case, one will have that only half of the subcultural vectors generated around subprototype 1 in layer 1 will correspond to the subcultural vectors around subprototype 1 in layer 2, while the other half will correspond to the subcultural vectors around subprototype 2 in layer 2 (and similarly for the other half of the subcultural vectors associated to the subprototypes).

Letting $\Pi_{ij}$ be the prototype with subprototype $i$ in layer 1 and $j$ in layer 2, this scenario would divide the cultural vectors in four shells (cultural vectors around $\Pi_{11}$, $\Pi_{12}$, $\Pi_{21}$ and $\Pi_{22}$ respectively) of size $N/4$, and corresponds to the case of full mixing. Intermediate cases would also divide the cultural vectors in these classes, but then those of $\Pi_{11}$ and $\Pi_{22}$ (non-mixed prototypes) have size $N(2 - \mu)/4$, while the others (mixed prototypes) have size $N\mu/4$, where $\mu \in [0, 1]$ is the mixing coefficient; if $\mu = 0$, there is no mixing, while $\mu = 1$ corresponds to full mixing. Denote by $G_{ij}$ the shell associated to $\Pi_{ij}$, while $G_{ij}^{\alpha}$ is used for the same shell when restricting to the subcultural vectors associated to layer $\alpha$ only. In the sequel, $G_{11}$ and $G_{22}$ will be referred to as non-mixed shells, while $G_{12}$ and $G_{21}$ will be called mixed shells, for obvious reasons. Finally, note that even though positive $\mu$ could consistently be used with M = 1, this has no relevance, so in the case M = 1, it always holds that $\mu = 0$.

Since the values of f and g should be the same for M = 1 and M = 2 and because, if M = 2, both layers should have subshells of equal radius, f and g have to be even numbers. In addition, to allow for all possible behavior, it is desirable that R is such that even shells that have one subprototype in common have some distance between them. Using the results of the simulations, it turns out that the combination (f, g) = (0, 28) satisfies all the requirements with some margin. These parameter values will be used for generating the structured culture in Subsection 5.5.


4.2 End-state analysis

For the end-state analysis the only quantity that will be investigated is the collection of domain sizes. To compress this information into one number that also has the property of being an order parameter (i.e. one value before the phase transition, another after the phase transition), we typically use the normalized number of domains, which is denoted by $N_D$. Another order parameter is the normalized size of the largest cluster, denoted by $S_D^{\max}$. Moreover, a quantity that is typically only non-zero at the phase transition and gives more information on the distribution of cluster sizes is the CSE. Finally, the number of time-steps needed to achieve convergence ($T$) will also be studied.

As the Axelrod model is stochastic, one can reliably study its dynamics only by employing many runs. The number of runs ($K$) will be 100 for each parameter set. The resulting quantities will then be averages (e.g. $\langle N_D \rangle$) and in this thesis the average always implies the average over multiple runs, unless stated otherwise. In addition, the standard deviation of the normalized number of domains, $SD_D = \sqrt{\langle (N_D - \langle N_D \rangle)^2 \rangle}$, and the corresponding standard error, $SE_D = SD_D/\sqrt{K}$, will be computed. (For $S_D^{\max}$, the standard deviation is denoted as $SD_D^{\max}$ and similarly for the standard error.) We look at a phase transition in terms of ω for various values of M. Effectively, therefore, we look at a two-dimensional phase transition. However, we are mostly interested in the difference between having 1 layer (M = 1 or singleplex) and having multiple layers (M > 1 or multiplex).
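The run statistics defined above can be computed as in the following sketch. It follows the definition in the text, in which the standard deviation is the square root of the average squared deviation over the K runs (that is, without a K − 1 correction); the function name and the example values are illustrative.

```python
import math


def run_statistics(values):
    """Return (mean, standard deviation, standard error) over K runs.

    The standard deviation divides by K, following the definition
    SD = sqrt(<(x - <x>)^2>), and the standard error is SD / sqrt(K).
    """
    K = len(values)
    if K == 0:
        raise ValueError("at least one run is required")
    mean = sum(values) / K
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / K)
    return mean, sd, sd / math.sqrt(K)


# Hypothetical values of the normalized number of domains for four runs.
mean, sd, se = run_statistics([0.10, 0.30, 0.10, 0.30])
```

Locating the critical threshold then amounts to repeating this for every representative value of ω and selecting the one with the largest standard deviation or standard error.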

4.2.1 Cluster size entropy

For the cultural domains in the end-state, $D$, the number of domains $N_D$ and the largest domain $S_D^{\max}$ compress a full set of cluster sizes into one number. Outside of the phase transition, this is usually a trivial compression, but around the phase transition, much information is lost. For a specific clustering, the CSE is the weighted entropy over the distribution of cluster sizes, that is

$$\mathrm{CSE} = -\sum_{s} W_s \log W_s,$$

where $W_s$ is the probability that an element (agent) belongs to a cluster of size $s$ [Gandica et al., 2011]. The CSE has a value of zero when only one type of cluster size is present and the more cluster sizes are present, the higher it becomes; at the phase transition, when there usually is some degree of scale invariance, it reaches its maximum (i.e. the distribution over cluster sizes is then closest to being uniform). It is weighted, since the size of the cluster is incorporated into the probabilities. Otherwise, it would not be a useful measure, since e.g. a clustering of two clusters, where one is a singleton and the other comprises the rest, would give rise to an entropy of log 2, while it is supposed to give a very low entropy.
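The CSE of a given clustering can be computed from the list of cluster sizes, as in the sketch below. The function name is illustrative and the natural logarithm is assumed.

```python
import math
from collections import Counter


def cluster_size_entropy(cluster_sizes):
    """Weighted entropy over the distribution of cluster sizes.

    W_s is the fraction of agents that belong to a cluster of size s,
    so a size s that occurs c times contributes c * s agents.
    """
    n_agents = sum(cluster_sizes)
    agents_per_size = Counter()
    for s in cluster_sizes:
        agents_per_size[s] += s
    return sum(
        -(m / n_agents) * math.log(m / n_agents)
        for m in agents_per_size.values()
    )


singleton_and_rest = cluster_size_entropy([1, 99])
four_equal_weights = cluster_size_entropy([1] * 10 + [2] * 5 + [5] * 2 + [10])
```

The first example is the clustering discussed above: one singleton and one cluster with the remaining 99 agents give a CSE far below log 2. In the second example four cluster sizes each contain the same number of agents, so the CSE equals log 4.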


This measure can be normalized as follows. The largest possible value of the entropy would occur when all $W_s$ have the same value and when there are as many cluster sizes as possible. These two requirements are constrained by the fact that the total number of agents is $N$ and that clusters are discrete objects (i.e. half a cluster is not possible). From this it follows that $\sqrt{N}$ is an upper bound for the maximum entropy. For example, if N = 100 a maximal entropy is obtained by having 10 clusters of 1, 5 clusters of 2, 2 clusters of 5 and 1 cluster of 10, but the remaining 60 agents cannot be evenly distributed over the remaining 6 cluster sizes, so there is less entropy compared to the case where agents are evenly distributed over the cluster sizes.

4.3 Dynamical analysis

When studying the system dynamically, it makes sense to study the ‘most interesting’ case. Since each system is investigated for different ω we therefore choose values of ω close to $\omega_c$. Also note that it would not make sense to sample each time step, since too much data would be obtained. Therefore, only once every $Y$ time-steps a measurement is taken from the system. Unless otherwise stated, $Y = N$; this is equivalent to one (attempted) update per agent on average. Just a few runs are observed for each parameter set; no averages over runs will be performed for the dynamical case.

For the purpose of studying the Axelrod dynamics many observables can be computed over time. All the observables discussed at the beginning of this section, such as the cultural components, will be shown (that is, the number of clusters in each case); in addition, to look at the difference between the layer dependent measures, the variation of information (to be explained later in this subsection) is computed between both the cultural zones and cultural regions for a pair of layers. Denote the clustering of cultural components by $D(\omega)$, cultural regions in layer $i$ by $D_i$ and cultural zones in layer $i$ by $D_i(\omega)$, where the dependence on ω is used to indicate the explicit dependence of the measure on ω (note, however, that the other measures also depend on ω indirectly since the Axelrod dynamics depends on it). The normalized number of clusters is then denoted by $N_X$ for a clustering $X$. Finally, if $X$ is a clustering, denote by $X[n]$ its $n$th cluster.

Since the number of observables grows fast when the number of layers increases, only the cases M = 1 and M = 2 are investigated. Presumably, many of the insights in the dynamical behavior that are obtained by analyzing just two layers can also be applied to more than two layers.

4.3.1 Network observables

The Axelrod model could also have an effect on the social multiplex if the network updates over time. Correlations may develop between the layers as a corollary to the Axelrod dynamics. This will be investigated dynamically by measuring the correlation every $Y$ time-steps. The correlation between two unweighted, undirected networks can simply be computed as the correlation between the corresponding adjacency lists (i.e. the extent to which a link between node $i$ and $j$ is present in layer 1 is matched by the same link in layer 2). Formally, the correlation coefficient between layer $\alpha$ and $\gamma$ is

$$\rho_{\alpha,\gamma} = \frac{\langle a_{ij}^{\alpha} a_{ij}^{\gamma} \rangle - \langle a_{ij}^{\alpha} \rangle \langle a_{ij}^{\gamma} \rangle}{\sqrt{(1 - \langle a_{ij}^{\alpha} \rangle)\langle a_{ij}^{\alpha} \rangle}\,\sqrt{(1 - \langle a_{ij}^{\gamma} \rangle)\langle a_{ij}^{\gamma} \rangle}},$$

where it should be noted that $\langle a_{ij} \rangle = \langle a_{ij}^2 \rangle$, since $a_{ij}$ is a binary variable. The normalized correlation is then obtained by dividing the correlation by two and adding 0.5.
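The correlation coefficient between two layers can be computed as in the following sketch, which averages over all unordered pairs of distinct nodes. The layers are assumed to be stored as lists of neighbor sets, and the function names are illustrative.

```python
import math
from itertools import combinations


def layer_correlation(layer_a, layer_b):
    """Correlation between the link indicators of two undirected layers."""
    n = len(layer_a)
    pairs = list(combinations(range(n), 2))
    a = [1 if j in layer_a[i] else 0 for i, j in pairs]
    b = [1 if j in layer_b[i] else 0 for i, j in pairs]
    mean_a = sum(a) / len(pairs)
    mean_b = sum(b) / len(pairs)
    mean_ab = sum(x * y for x, y in zip(a, b)) / len(pairs)
    # For binary variables the variance is (1 - mean) * mean.
    denominator = math.sqrt((1 - mean_a) * mean_a) * math.sqrt(
        (1 - mean_b) * mean_b
    )
    if denominator == 0:
        raise ValueError("correlation is undefined for an empty or complete layer")
    return (mean_ab - mean_a * mean_b) / denominator


def normalized_correlation(rho):
    """Map the correlation from [-1, 1] to [0, 1]."""
    return rho / 2 + 0.5


layer_1 = [{1}, {0}, set()]          # only the link 0-1 is present
layer_2 = [{2}, {2}, {0, 1}]         # the links 0-2 and 1-2 are present
rho_same = layer_correlation(layer_1, layer_1)
rho_opposite = layer_correlation(layer_1, layer_2)
```

Identical layers give a correlation of 1, while the two complementary layers in the example give −1, which corresponds to a normalized correlation of 0.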

From the beginning we have assumed that the different layers represent distinct social networks. It is therefore interesting to investigate whether the layers actually have the properties of social networks. It was already shown that for some configurations, the initial layer looks like a RG, so that this does not resemble a realistic social network. A structured initial culture may result in layers that show properties of social networks like the small world and scale-free properties. To study this, the size of the largest connected component G, the link density L, the clustering coefficient C and the kurtosis of the degree distribution κ (as defined in Subsubsection 2.2.1) are computed for each layer every Y time-steps. In general, if a network measure X corresponds to layer α, this is denoted as $X_\alpha$. Clearly, when the network does not update, the network properties stay the same, and since they do not vary with the threshold they are the same for all runs.
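The four layer observables listed above could be computed along the following lines. This is only a sketch: the precise definitions are given in Subsubsection 2.2.1, which is not reproduced here, so the code assumes that G is the fraction of nodes in the largest connected component, L is the fraction of possible links that are present, C is the average local clustering coefficient, and κ is the (non-excess) fourth standardized moment of the degree distribution. The adjacency-dictionary representation and all function names are hypothetical.

```python
from collections import deque


def largest_component_fraction(adj):
    """G: fraction of nodes in the largest connected component (BFS)."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best / len(adj)


def link_density(adj):
    """L: number of links divided by the number of node pairs."""
    n = len(adj)
    links = sum(len(nb) for nb in adj.values()) / 2
    return links / (n * (n - 1) / 2)


def average_clustering(adj):
    """C: mean local clustering; nodes with degree < 2 contribute 0 (assumption)."""
    total = 0.0
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            continue
        nbs = list(nb)
        closed = sum(1 for i in range(k) for j in range(i + 1, k)
                     if nbs[j] in adj[nbs[i]])
        total += closed / (k * (k - 1) / 2)
    return total / len(adj)


def degree_kurtosis(adj):
    """kappa: fourth standardized moment of the degree distribution (assumption)."""
    degs = [len(nb) for nb in adj.values()]
    n = len(degs)
    mean = sum(degs) / n
    m2 = sum((d - mean) ** 2 for d in degs) / n
    m4 = sum((d - mean) ** 4 for d in degs) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all degrees are equal")
    return m4 / m2 ** 2


# Example layer: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
layer = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
G = largest_component_fraction(layer)
L = link_density(layer)
C = average_clustering(layer)
kappa = degree_kurtosis(layer)
```

In a simulation these functions would be called for each layer every Y time-steps; for static networks a single evaluation suffices.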

4.3.2 Variation of information

A measure of discrepancy between two clusterings is the Variation of Information (VI) [Meila, 2003]. If A is a set and $X = \{X_1, \ldots, X_k\}$ and $Y = \{Y_1, \ldots, Y_l\}$ are such that $X_i \cap X_j = \emptyset$ for all $i \neq j$ and $\bigcup_{i=1}^{k} X_i = A$ (similarly for Y), then X and Y are clusterings (or partitions) of A. In addition, let $N = |A|$ and let $p_i = |X_i|/N$, while $q_j = |Y_j|/N$. Note that $p_i$ is the probability that a randomly picked element of A is in $X_i$ (similarly for $q_j$). The VI between X and Y can then be defined as

\[
VI(X; Y) = H(X) + H(Y) - 2 I(X, Y),
\]

where H(X) is the entropy of X, defined by

\[
H(X) = -\sum_{i=1}^{k} p_i \log(p_i)
\]

(similarly for H(Y)) and I(X, Y) is the mutual information, defined by

\[
I(X, Y) = \sum_{i=1}^{k} \sum_{j=1}^{l} r_{ij} \log\left(\frac{r_{ij}}{p_i q_j}\right),
\]

where $r_{ij} = |X_i \cap Y_j|/N$ is the joint probability of randomly selecting an element in A that is both in $X_i$ and $Y_j$. It is easily seen that if the clusterings are the same, $r_{ii} = p_i = q_i$ (and $r_{ij} = 0$ for $i \neq j$), so that $I(X, Y) = H(X) = H(Y)$, which implies $VI(X; Y) = 0$. Similarly, if the two clusterings are completely independent, then $I(X, Y) = 0$, so $VI(X; Y) = H(X) + H(Y)$. The VI can be conveniently normalized by dividing by log(N), since this is the maximum value that H(X) or H(Y) can have, which will be done in the sequel. Note that this measure can then only be compared for systems that have the same N.
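The definition above translates directly into code. The sketch below assumes that each clustering is represented as a list of cluster labels, one per element of A, which is a representation chosen here for illustration; the function name `variation_of_information` is hypothetical. Terms with $r_{ij} = 0$ are skipped, following the usual convention that $0 \log 0 = 0$.

```python
from collections import Counter
from math import log


def variation_of_information(x_labels, y_labels, normalize=True):
    """VI(X; Y) = H(X) + H(Y) - 2 I(X, Y) for two clusterings of the same set.

    x_labels[v] and y_labels[v] are the cluster labels of element v.
    With normalize=True the result is divided by log(N).
    """
    n = len(x_labels)
    if n != len(y_labels):
        raise ValueError("both clusterings must cover the same elements")
    if n < 2:
        return 0.0  # a single element admits only one clustering
    p = Counter(x_labels)                 # cluster sizes |X_i|
    q = Counter(y_labels)                 # cluster sizes |Y_j|
    r = Counter(zip(x_labels, y_labels))  # overlaps |X_i ∩ Y_j| (non-zero only)
    h_x = -sum((c / n) * log(c / n) for c in p.values())
    h_y = -sum((c / n) * log(c / n) for c in q.values())
    mutual = sum((c / n) * log((c / n) / ((p[i] / n) * (q[j] / n)))
                 for (i, j), c in r.items())
    vi = h_x + h_y - 2 * mutual
    return vi / log(n) if normalize else vi


# Identical clusterings versus two independent bisections of four elements.
vi_same = variation_of_information([0, 0, 1, 1], [5, 5, 7, 7])
vi_independent = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

For the two independent bisections of four elements, H(X) = H(Y) = log 2 and I(X, Y) = 0, so the normalized VI equals 2 log 2 / log 4 = 1.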

It is always the case that cultural zones $D_i(\omega)$ have partly the same structure, since all $D_i(\omega)$ are a refinement of $D(\omega)$, as was explained in Subsubsection 4.1.1. Here, this means that when comparing two layers, the matrix with entries $r_{ij}$ consists of blocks on the diagonal and is zero everywhere else. To get a consistent measure of the variation of information, one has to take this into account. One way to do this is to compute the (normalized) VI separately for each cultural component and then compute the weighted average over all cultural components, where the weight is the relative size of the component. The resulting measure is normalized and called the Weighted VI (WVI).
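A possible implementation of the weighted average described above is sketched here. The text does not specify how the VI of a single cultural component is normalized; this sketch assumes division by the logarithm of the component's own size, and assigns a VI of zero to components with a single node. The label-list representation and the names `normalized_vi` and `weighted_vi` are hypothetical.

```python
from collections import Counter, defaultdict
from math import log


def normalized_vi(x_labels, y_labels):
    """VI of two clusterings (label lists), divided by log of the number of elements."""
    n = len(x_labels)
    if n < 2:
        return 0.0
    p, q = Counter(x_labels), Counter(y_labels)
    r = Counter(zip(x_labels, y_labels))
    h_x = -sum((c / n) * log(c / n) for c in p.values())
    h_y = -sum((c / n) * log(c / n) for c in q.values())
    mutual = sum((c / n) * log((c / n) / ((p[i] / n) * (q[j] / n)))
                 for (i, j), c in r.items())
    return (h_x + h_y - 2 * mutual) / log(n)


def weighted_vi(component, zones_1, zones_2):
    """Weighted VI between the cultural zones of two layers.

    component[v]: label of the cultural component containing node v.
    zones_1[v], zones_2[v]: cultural-zone label of node v in layer 1 and layer 2.
    Assumption: each component's VI is normalized by log(component size).
    """
    members = defaultdict(list)
    for node, comp in enumerate(component):
        members[comp].append(node)
    n = len(component)
    total = 0.0
    for nodes in members.values():
        x = [zones_1[v] for v in nodes]
        y = [zones_2[v] for v in nodes]
        total += (len(nodes) / n) * normalized_vi(x, y)
    return total


# Six nodes: a component of four nodes split differently in each layer,
# and a component of two nodes with identical zones in both layers.
component = [0, 0, 0, 0, 1, 1]
zones_layer_1 = [0, 0, 1, 1, 2, 2]
zones_layer_2 = [0, 1, 0, 1, 2, 2]
wvi = weighted_vi(component, zones_layer_1, zones_layer_2)
```

In this example the first component contributes a normalized VI of 1 with weight 4/6 and the second contributes 0, so the WVI is 2/3.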

5 Results and discussion

As was outlined in Subsection 4.1, this simulation study focuses on four treatments. The results of these will be shown and discussed in Subsections 5.2, 5.3, 5.4 and 5.5. Note that Subsection 5.2 is fundamental, while the next two subsections build on the results presented there; Subsection 5.5 uses the results obtained in the previous subsections but also constitutes a different approach to studying multiplexity and therefore differs somewhat from the rest. Table 1 presents an overview of the four treatments. Before the actual treatments are discussed, Subsection 5.1 will go into a trivial version of the model that is discussed in treatment 1 to show some differences compared to the singleplex that already arise from the generalized model structure itself. Finally, it was observed that all of the networks are connected at all times t, so G = 1 for any case we have considered; it will not be shown each time in the results below.

Treatment   Assortativity   Updating   Culture
1           no              no         random
2           yes             no         random
3           yes             yes        random
4           yes             yes        structured

Table 1: Characteristics of the four treatments

5.1 Treatment 0: trivial multiplex

In this subsection a useful intermediate case between a singleplex and a multiplex is discussed, namely a multiplex that has the same graph in each layer (i.e. a trivial multiplex). The main result is that there is more cultural convergence for a trivial multiplex compared to a singleplex, due to the compartmentalization of the generalized Axelrod model.

5.1.1 End-state results

In Figure 1 results are shown for a singleplex and two multiplices (M = 2 and M = 36) that have the same RG graph in each layer (i.e. trivial multiplices). Note that it is not possible to include assortativity in the trivial multiplex condition, since the networks would then be generated by the cultural subspaces and will typically be different from each other (the same holds, of course, when networks update).

Clearly, there are differences between the three cases. First, the trivial multiplex condition shows more convergence (i.e. lower values of $\langle N_D \rangle$) for all ω than the singleplex, although the difference is small for M = 2. Second, $\langle T \rangle$ clearly increases with M, especially for M = 36.

Figure 1: Average number of domains $\langle N_D \rangle$ (left) and average number of time steps $\langle T \rangle$ (right) in the end-state as a function of ω for M = 1, M = 2 and M = 36 (treatment 0)

5.1.2 Discussion

A trivial multiplex implies that if two agents are connected in one layer they are connected in every layer and vice versa. If an interaction is successful in a singleplex, the active agent randomly chooses a feature from the set of all features in which the two differ. However, when such an interaction occurs in a trivial multiplex, the distribution over features that differ is not the same, since the probability to pick any such feature is 1/M times the probability of randomly selecting that feature out of the list of features for which the two differ in the cultural subspace associated to that layer.

In addition, if for some pair (i, j) layer α is selected, it may even occur that $d^{\beta_\alpha}_{ij} = 0$, so that the probability of selecting a feature for which the two differ is not just different, it is zero. Therefore, if a pair of nodes has the same cultural overlap in the two cases, their propensity to interact is the same, but if the pair has a layer with subcultural distance equal to zero (in the case of the trivial multiplex), then there is a large probability that, even though the interaction would have been successful, there can be no cultural influence. However, in a singleplex such an interaction would always occur, unless the agents are culturally identical (i.e. $d_{ij} = 0$), in which case there is no need for further interaction anyway. This feature introduces a kind of compartmentalization in the generalized Axelrod model, which is not present in the ordinary Axelrod model on a singleplex.
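The compartmentalization argument can be made concrete with a small sketch. It computes, for a pair of agents whose interaction is already assumed to be successful, the probability that a trait can actually be copied. The code assumes that the F features are split into M contiguous blocks of Z = F/M features, one block per layer, and that the layer is chosen uniformly at random with probability 1/M; the block layout and both function names are hypothetical and serve only as an illustration.

```python
def influence_probability_singleplex(traits_i, traits_j):
    """Probability that a successful interaction changes a trait on a singleplex.

    A differing feature can always be picked unless the agents are identical.
    """
    return 1.0 if any(a != b for a, b in zip(traits_i, traits_j)) else 0.0


def influence_probability_trivial_multiplex(traits_i, traits_j, n_layers):
    """Same probability on a trivial multiplex with n_layers layers.

    Assumptions: layer alpha owns the contiguous feature block
    [alpha * Z, (alpha + 1) * Z) and is selected with probability 1 / n_layers.
    Influence is only possible if the two agents differ inside the selected block.
    """
    n_features = len(traits_i)
    if n_features % n_layers != 0:
        raise ValueError("the number of features must be divisible by the number of layers")
    z = n_features // n_layers
    open_layers = 0
    for layer in range(n_layers):
        block = range(layer * z, (layer + 1) * z)
        if any(traits_i[k] != traits_j[k] for k in block):
            open_layers += 1
    return open_layers / n_layers


# Two agents that differ only in the last of four features.
agent_i = [1, 2, 3, 4]
agent_j = [1, 2, 3, 5]
p_single = influence_probability_singleplex(agent_i, agent_j)
p_multi = influence_probability_trivial_multiplex(agent_i, agent_j, n_layers=2)
```

Here the singleplex always permits influence, whereas with two layers the first subspace is identical for both agents, so influence is possible in only half of the successful interactions.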

Globally, this means that in the trivial multiplex case it will typically happen that cultural zones (the cultural zones are the same in both layers since the network is the same) form with some pairs of agents (i, j) that have $d^{\beta_\alpha}_{ij} = 0$ for some α. Cultural convergence within such zones will then take longer on account of these pairs, increasing the possibility that the cultural zone merges with another cultural zone. The more culturally similar the zones become, the more likely the existence of such pairs is. Because there is a larger probability that cultural zones merge, there is more cultural convergence by arguments similar to those in [Axelrod, 1997] (see Subsection 2.1), even without having different networks in different layers.

The more layers there are, the larger the probability of encountering a pair of nodes (i, j) in layer α that has $d^{\beta_\alpha}_{ij} = 0$ while $d_{ij} > 0$, since Z is smaller. It seems, therefore, that compartmentalization causes cultural convergence and is associated with longer running times, as was observed in the simulation results.

5.2 Treatment 1: random culture and random, static networks

In this section, the simulation results with respect to the first treatment are discussed. To be consistent with the result in 4.1.2, for the RG we set p = 1/q. It will be shown that multiplexity typically leads to more cultural convergence, but this originates from multiple effects, some of which counteract cultural convergence. The most important mechanism is that the cultural zones, by not overlapping perfectly between different layers, interact indirectly to produce more cultural convergence, while increasing the time to reach the end-state.

5.2.1 End-state results

In Figures 2 and 3, the end-state results are shown (i.e. the observables for each M). Note that, according to both $CSE_D$ and $SE_D$, the phase transition is at $\omega_c = 0.68$, although for M = 1 $SE_D$ is large at ω = 0.71 as well, while for M = 36 the same holds with respect to ω = 0.65. There seems to be a clear hierarchy at $\omega_c$, where the different scenarios are ordered almost perfectly according to $\langle N_D \rangle$ (i.e. if $M > M'$, then $\langle N_D \rangle < \langle N'_D \rangle$); the only exception is the pair (9, 12).

Note that these are averages and since the standard error is in the range 0.01 − 0.025 the ordering result is not statistically significant for large M , especially since the large standard errors occur at large M . Most likely, the differences between the largest values of M are small, so that the number of runs K should be even bigger to establish a statistically significant difference. Nonetheless, it
