Computational Agents in the Embodied Communication Game


Pieter de Bie

January 2009 - November 2009

Master's Thesis Artificial Intelligence

Department of Artificial Intelligence

University of Groningen, Groningen, The Netherlands

External supervisors:

Prof. Simon Kirby

Dr. Thomas Scott-Phillips

Language Evolution and Computation Research Unit

School of Philosophy, Psychology and Language Sciences

University of Edinburgh

Internal supervisor:

Dr. Bart Verheij

Department of Artificial Intelligence

University of Groningen


Abstract

Language learning is an important research topic in Artificial Intelligence: understanding how human communication is learned helps in understanding human communication in general.

Different approaches have been used for the study of emergence of human communication systems. Some of these studies use computer simulations or robots to create artificial languages while others focus on creating new communication mechanisms using human participants. However, so far no attempt has been made to combine the two.

One example of the human language approach is the Embodied Communication Game (ECG). This game was created to investigate how human participants can create a signalling system without using a channel dedicated for communication. In the ECG two participants have to use movement of a stick man figure both for communication and for travelling in the game. The ECG is relatively simple to understand and restricted in terms of possible moves. It was therefore chosen as the basis for a new game in which participants and computer agents can cooperate.

The ECG was analysed in terms of the possible moves in the game as well as the strategies used by human participants. The strategies were classified based on specific properties. Using this analysis a new game was created, called the Simplified Embodied Communication Game, or SECG. The SECG is a simplified version of the ECG in which computer agents can participate, while still keeping the properties that make human strategies possible. Based on the analysis of strategies, agents have been created which use different signalling methods. Three primary types of signalling methods have been identified.

Two of these types of signalling methods have been tested with agents to show the viability of doing quantitative experiments on the emergence of signalling systems. Though no significant results were found, the same method can be used for different hypotheses. This experiment showed that some of the properties of strategies are still not fully understood, including the effect of gradually emerging strategies, segmentation problems and the role of dialogue.

Chapter 1

Introduction

1.1 Language Evolution

Language comprehension is one of the main research areas in Artificial Intelligence: one of the requirements of a complete artificial, intelligent machine is the ability to understand and produce language. Human language is different from other communication systems in that it is the only known natural communication system that is both learnt and symbolic (Deacon, 1997; Oliphant, 2002). A symbol in this case is a sign that refers to something in an arbitrary way. This can be contrasted with non-symbolic (iconic) systems, where the sign is attached to the meaning, for example facial expressions. Innate signs are used in animal communication and can be created through natural selection. This differs from the learned symbols in human language, which children usually learn from their parents. Communication systems in computers are not natural but rather created by the developer of the system, for example the TCP protocol.

Because of this uniqueness of the human communication system, it is of interest to language evolution researchers how such a system could have evolved.

This research is done from multiple approaches (Galantucci, 2009), which can be grouped as ‘Experimental Semiotics’.

One type of research is done using computer simulations. In these simulations, such as the one by Levin (1995), computer agents are forced into interaction sessions and some kind of learning mechanism is used to create a signalling system.

The other approach uses novel situations in which human participants are required to create a new communication system in order to be successful (see Galantucci (2005); Fay et al. (2004); Healey et al. (2007); Scott-Phillips et al. (2009)). These situations can include games where the participants have to cooperate to score points, or problems which participants are forced to solve that can only be solved by combining information from both players.

Both of these types of research contribute to the research in human language emergence. Galantucci and Steels (2008) for instance compare both methods and conclude that both methods have their merits. The experiments involving human participants, for example, focus more on prerequisites for communication (which are equally important) than on the specific communication systems themselves. Furthermore, in human experiments repair strategies play a large role, which has so far been absent from simulations. However, so far no attempt has been made to reconcile both research methods in a single experiment.

1.2 Use of Agents

Agents have been used in some computer simulations investigating language emergence. For example, Levin (1995) uses simulated animals to investigate symbol-meaning mappings. Quinn (2001) uses simulated robots to investigate innate, iconic communication schemes. Steels (2003) uses interacting robots to create new languages.

These systems have some advantages when compared to experiments involving only human participants. For example, experimental variables can be varied precisely. Furthermore, the experiments can be repeated easily. The use of agents also formally defines the complete system.

It therefore makes sense to combine the use of agents with the use of human participants to study the capability of creating communication schemes in humans. This type of experiment has several advantages:

• One side of the experiment is kept constant, which reduces variability between trials and makes significant effects easier to detect;

• Strategies not normally used by humans can be implemented and the effects of those strategies tested;

• The implementation of the agent explicitly shows the strategy used. There is no need to ask participants how they play the game;

• Slight variations on existing strategies can be created to test only one particular property of a signal.

1.3 The Embodied Communication Game

One particularly interesting experiment is done with the Embodied Communication Game, created by Scott-Phillips et al. (2009). In order to study the emergence of language, Scott-Phillips et al. created a game in which two participants have to communicate in order to successfully score points. However, their experiment focusses not on the creation of new signalling systems, but rather on how participants can create a communication system in situations where no means of communication is given. The ECG therefore has no predefined communication channel. Rather, the game is kept simple and all communication done in the game has to be performed through the movement of stick man figures.

This simplicity makes the game extremely suitable for implementing agents.

Some of the advantages of the ECG over other games (which will be discussed in the next chapter) are that it has two participants, of which one can easily be replaced with an agent; the game is simple to understand; it is easy to create a successful agent strategy for the game; and the expressivity in the game is limited to discrete moves. Finally, experiments have already been conducted with the ECG and the results from these experiments can be used as a starting point for the creation of agents.

In the ECG, participants have to use movement in the game as a means for communication. The precise way this movement is enacted differs between participants and thus different strategies are used between pairs.

1.4 Research Question

The research question is: “Can human strategies in the ECG be explicitly modelled as a communicative agent?”, which is part of the general question of how human communication has emerged.

This question will be researched by analysing and simplifying the ECG and its strategies. Some of the existing strategies will be adjusted in order to qualitatively measure the effects of two properties of the signals. The two properties chosen are efficiency and repetitiveness. Precise definitions of these two properties will be given in Chapter 6.

These two properties will be compared to a base implementation which does not feature either of the properties. Any differences in performance between the groups can be attributed to that specific property, as long as care is taken not to change the strategies in any other important way.

1.5 Structure of the thesis

The next chapter in this thesis visits relevant earlier research. Of this research, two experiments are the most important for this work. These are the Embodied Communication Game and the Tacit Communication Game. Both will be analysed further in Chapter 3. The strategies used in the ECG will be analysed further in Chapter 4.

Based on these two games, a new, simplified game will be introduced and compared to the earlier games in Chapter 5. In Chapter 6, some possible strategies for this game will be described. Based on the idea of investigating efficiency and repetitiveness, four agents will be created that explore different combinations of these parameters. Finally, these agents will be tested in Chapter 8.

The results of this research and its relevancy to language and other games are discussed in Chapter 9.


Chapter 2

Literature Study

Clark (1996) states that in order to study the use of language, one has to look not only at how language is used as a whole, but also how language is used between language users, as everybody uses language in a slightly different way.

Language is more than just knowing the meaning of words and understanding the grammar. Understanding of meaning also involves an understanding about the speaker’s intention (Grice, 1957). Therefore, to study language one has to not only look at the system itself, but also at the process of language use between two speakers (Pickering and Garrod, 2004). It also means that context is important when using language (Clark, 1996). This type of research, where not only the language system itself is taken into account, but also the context and interlocutor interactions, can be called experimental pragmatics (Noveck and Reboul, 2008). Experimental pragmatics focusses on language emergence for example by studying the use of new sign systems in specific populations.

Recently, another field of research has been used to study language emergence (Galantucci, 2009), where new languages are created in the laboratory. This experimental condition allows for more precise testing than what was previously possible. Galantucci (2009) calls this experimental semiotics. Experimental semiotics does not focus on the precise languages used, but more on how the communication systems are created, what the requirements for these systems are and what effects these systems have. The systems themselves are not required to be fully grown languages, but can be relatively simple signalling systems. Human communication in general is studied rather than only spoken conversation. Experimental semiotics also differs in the object of study: where experimental pragmatics studies specific emergence of preexisting forms of communication, experimental semiotics studies completely new systems.

Studying language emergence in controlled conditions offers several advantages. For one, the whole language process from the start can be monitored and controlled, so there is complete information about the history of the emerging communication system. Furthermore, researching language emergence in an experimental setting allows systematic manipulation of different variables, including the type of communication system, the communication medium and the specific population.

Experimental semiotics can be approached from two sides. One is by using human participants and forcing them to adopt a new signalling system in order to solve some kind of problem. The other is to use computational agents, using computer simulations or robots to create new systems.

This chapter is split up into three parts. In the first, research on experimental pragmatics in general is briefly demonstrated by showing examples of research focussing on newly created sign systems. The next section deals with the use of computer simulations in language emergence. Finally, experimental semiotics using human participants will be discussed.

2.1 Sign systems

Research on the emergence of novel communication systems for humans has a long history. One example of this type of research is the study of sign languages in isolated populations (Kegl et al., 1999). Sandler et al. (2005) look at the specific grammar structures that emerge in a community with a high occurrence of deafness. Senghas et al. (2004) look at a sign system that emerged in a community of deaf Nicaraguans, created when deaf, previously home-schooled children were able to join a new elementary school for special education. Specifically, they look at how fundamental language properties such as discreteness and combinatorial patterning emerge.

Another, related, type of research is done by looking at how sign systems emerge in families where deaf children are raised by parents without exposing the children to existing sign systems (Goldin-Meadow and Feldman, 1977; Goldin- Meadow and Mylander, 1998). Interactions with the children were videotaped and the syntax and semantics of the new systems were analysed, for example to study whether the child or the parent was the inventor of the system.

2.2 Computer simulations

Emergence of communication has also been studied in computer simulations. Levin (1995) uses a signal-meaning mapping in a simulation of animal communication. Animals are modelled as having an interior state and an observable state. These states are abstract and are represented as a matrix of numbers. The interior states can be explained as wants like hunger and cold, while the exterior states can be thought of as physical signs like raised tails.

In these simulations there is no ontogenetic learning. Instead, all learning is done with a genetic algorithm, which mutates the animal's genotype. This genotype consists of two mapping matrices, one which maps an internal state to an observable state and one which does the reverse.

The fitness of the animals is calculated by letting each animal interact with a number of other animals. One animal acts as a model with a certain interior state. It uses the mapping matrix to map this state to an observable state. The other animal then observes this state and maps this to an internal state using another mapping matrix. The first animal’s real interior state is then compared to the observed interior state by the second animal. The less difference between the two, the higher the fitness. The motivation for this is that if the second animal can better understand what the first animal needs, then it can help the animal more efficiently and thus get a higher fitness.
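The fitness calculation described above can be illustrated with a minimal sketch. It is not Levin's implementation: the linear matrix-vector mapping, the absolute-error measure, the mutation parameters and all names (`Animal`, `interaction_fitness`, `mutate`) are assumptions made for illustration only.

```python
import random


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class Animal:
    """Hypothetical animal whose genotype consists of two mapping matrices."""

    def __init__(self, send, receive):
        self.send = send        # interior state -> observable state
        self.receive = receive  # observable state -> inferred interior state


def interaction_fitness(model, observer, interior):
    """Score one encounter; a smaller reconstruction error gives a higher score."""
    observable = mat_vec(model.send, interior)
    inferred = mat_vec(observer.receive, observable)
    error = sum(abs(a - b) for a, b in zip(interior, inferred))
    return -error


def mutate(matrix, rate=0.1, scale=0.5, rng=random):
    """Return a copy in which each entry is perturbed with probability `rate`."""
    return [[x + rng.gauss(0, scale) if rng.random() < rate else x for x in row]
            for row in matrix]
```

In this sketch an observer whose receiving matrix inverts the model's sending matrix recovers the interior state exactly and obtains the maximum score of zero.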

The communication in Levin’s experiment is very limited. There is no possibility for the creation of any new signals: they are predefined in the form of the observable states. There is a fixed number of observable states in a fixed order. The same is true for the observation of signals; the model is built to derive an interior state from the observed state of an animal.

Quinn (2001) studies emergence of communication with the use of simulated Khepera robots. In his experiments communication is not predefined: there is no restricted set of meanings or signals, nor is there a channel through which the robots can communicate. Khepera robots are simple: they have only two motor-driven wheels and eight proximity sensors. Their behaviour is controlled by an artificial neural network. These networks are adapted by a genetic algorithm, so there is no form of ontogenetic learning.

Each simulated encounter involves two robots. After some predefined duration, or when the robots part from each other, a fitness score is calculated.

The fitness score is defined as the distance the robots travelled by center of mass, which is the average distance of both robots from the starting point. This score is used in the genetic algorithm to produce a new generation of simulated robots with slight mutations. Only the most successful robots are considered for the next generation.

This means that some sort of co-operation is necessary: if both robots just drive away randomly, they will go too far apart. If on the other hand both robots follow each other, they will arrive at a deadlock where nobody moves.

As an additional penalty in the system, the fitness score is reduced whenever the robots collide.
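A minimal sketch of such a fitness score is given below. It measures the displacement of the midpoint between the two robots and subtracts a fixed amount per collision. The subtractive form of the penalty, its size and the function names are assumptions; the text only states that the score is reduced when the robots collide.

```python
import math


def centre_of_mass(p, q):
    """Midpoint of two (x, y) positions."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def pair_fitness(start_a, start_b, end_a, end_b,
                 collisions=0, collision_penalty=0.5):
    """Distance travelled by the pair's centre of mass, minus a collision penalty.

    The penalty of 0.5 per collision is a hypothetical value.
    """
    start = centre_of_mass(start_a, start_b)
    end = centre_of_mass(end_a, end_b)
    travelled = math.hypot(end[0] - start[0], end[1] - start[1])
    return travelled - collision_penalty * collisions
```

Two robots that drive away in opposite directions leave the centre of mass where it started and therefore score nothing in this sketch, which illustrates why some form of co-operation is needed.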

Quinn has performed various experiments using this set-up. In most of these trials, the result is a system that comes into existence in multiple stages, the end result of which is a form of communication through which the robots are able to move large distances while staying together. These stages will be described below.

At first the robots just move straight away from their initial position: this results in a high fitness score if the robots are (accidentally) aligned approximately the same way.

In some of the cases this mechanism would produce a collision. The lower fitness in the case of a collision resulted in the emergence of a simple collision avoidance mechanism, where one robot backs away if the other robot is too close.

This second stage, however, can result in a deadlock if the robots come at each other head-first, in which case both stop moving. This is resolved by having two different types of agents – one which backs away in the presence of a deadlock and one which moves forward in the same situation. In a situation where robots of the two different types meet, one backs away as the other moves jerkily after it. Of course, using this solution the deadlock could only be resolved in simulations involving different types of robots – if two robots of the same type meet, there is either a deadlock or a collision.

This combination of two different types of robots is maintained for a long time. Because the situations with a deadlock can potentially end up with a high fitness score, there is an evolutionary pressure to arrive in just such a situation, rather than to avoid it. The robots start the simulation by moving in circles, which increases the chance of having a collision, rather than moving away in a straight line.

The polymorphism is in the end resolved by having a single genotype that can act as both the leader and the follower. This happens in the following way: both robots start to turn when they start the simulation. The first robot to become aligned with the other robot moves very close to the other robot and starts to oscillate slightly. The second robot, as it starts to align, detects the oscillation of the other robot on its side-sensors and starts to move backwards. The oscillating robot then stops oscillating and starts following the second robot. The actions of the first robot act as a signal for the second robot. Quinn (2001) confirms this by analysis of the network: if an object is perceived in close proximity before the robot is aligned, the robot starts to move away.

Figure 2.1: An example interaction in Quinn’s simulations (reproduced from Quinn (2001)): (i) Both agents rotate anti-clockwise; (ii) Agent B becomes aligned first and moves toward A; (iii) Agent B then remains close to A, moving backward and forward, staying between 0.25 and 2.0 cm (approx.) from A; (iv) Once agent A becomes aligned it reverses and is followed by B.
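The role allocation described above can be summarised as a simple decision rule. The sketch below is a discrete abstraction and not Quinn's evolved neural network: the event names `"near"` and `"aligned"` and the function `choose_role` are hypothetical.

```python
def choose_role(events):
    """Decide a robot's role from the order in which it experiences events.

    `events` is a sequence of strings in order of occurrence:
    "near"    - another robot is perceived in close proximity
    "aligned" - the robot has become aligned with its partner
    """
    for event in events:
        if event == "near":
            # Signal perceived before alignment: reverse away and lead.
            return "leader"
        if event == "aligned":
            # Aligned first: approach, oscillate as a signal, then follow.
            return "follower"
    return "undecided"
```

A robot that perceives its partner before becoming aligned leads, while a robot that becomes aligned first produces the signal and follows.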

Floreano et al. (2007) perform another experiment with robots. Their experiment differs from Quinn’s in that a communication channel has already been given: the robots are equipped with a blue light and a camera to observe light from other robots. The robots also have an infra-red sensor capable of finding a food source. These inputs are coupled to the blue light and their motors through an artificial neural network. When this network is trained using a genetic algorithm, populations evolve that use the light as a signal for other robots to indicate the presence of a food source.

Steels (2003) uses robots situated in the neighbourhood of a whiteboard with a set of geometric figures. These figures are the topics in conversations between agents. In this “Talking Heads” experiment, one of the robots acts as a speaker by describing the current topic in specific concepts. The other robot, called the ‘hearer’, has to guess what this topic is, based on the words uttered by the speaker. This requires the speaker to be able to conceptualize a topic and create words to express the selected concepts. The hearer has to do the reverse and combine the concepts to a specific topic. The robots have pan-tilt cameras that can be used for visual sensing and for pointing, which are used for explicit feedback. In contrast to the systems described earlier, this type of system is symbolic; there is no relation between the meaning of the words and the form of the words, as the words for the concepts are created randomly.

2.3 Emergence of Symbolic Communication

Computer simulations of communication emergence are different from how human communication emerges. For one, most of the simulations described above use an iconic communication scheme while human communication is symbolic (Deacon, 1997). The robots in the Talking Heads experiment are programmed to speak, listen, interpret and play the game, which are all acts that humans are not programmed to do, but rather have learned along the way. Experiments using human participants therefore use different communication systems.

Kirby et al. (2008) use an iterated learning model where participants are shown a limited number of examples from an alien language. These examples consist of a word and a picture which represents the meaning for this word. The pictures consist of combinations of colours, shapes and arrows. Once the participants have learned this language, they are asked to create words for pictures they have not seen so far. The first participant is given a random combination of letters and pictures; the next participant receives the output of the first participant as input, and so on. Several of these transmission chains are created and the results compared. The result of this experiment is that the error rate decreases significantly, but the expressivity of the language decreases too. When the output from participants is filtered by removing pictures with words attached to them that have already been used, before using the dictionary as input for the next participant, expressivity remains and the result is a systematic language.
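The transmission chain and the filtering step can be sketched as follows. This is an illustration under assumptions, not the procedure of Kirby et al.: a language is modelled as a dictionary from meanings to words, the `learner` argument stands in for a human participant, and counting distinct words is only one possible measure of expressivity.

```python
def filter_homonyms(language):
    """Drop every meaning whose word was already used by an earlier meaning."""
    seen = set()
    kept = {}
    for meaning, word in language.items():
        if word not in seen:
            seen.add(word)
            kept[meaning] = word
    return kept


def expressivity(language):
    """Number of distinct words in the language (an assumed measure)."""
    return len(set(language.values()))


def run_chain(initial, learner, generations, use_filter=False):
    """Pass a language down a chain; each output is the next learner's input.

    `learner` is a hypothetical function that takes the training
    dictionary and returns the language that the participant produces.
    """
    language = dict(initial)
    history = [language]
    for _ in range(generations):
        training = filter_homonyms(language) if use_filter else language
        language = learner(training)
        history.append(language)
    return history
```

With `use_filter=True`, each learner is trained only on pairs in which every word refers to a single picture, which corresponds to the filtered condition described above.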

Fay et al. (2004) use a form of iterated pictionary with a limited set of meanings. Participants are put into groups, called communities. Each member of the community plays the pictionary game exactly once with each other member of the community. In each encounter one participant is given a meaning, which he has to represent as a symbol on paper. The other player then interprets this symbol as one of the previously given meanings. The goal of the game is to get the highest possible accuracy rate in determining the meaning of a symbol.

Possible meanings are given in advance and are known by all players, but there is no convention of symbols. That means that there is more freedom in the game than with Levin’s simulation, in which the possible signals were also predefined.

Because the pairs are not isolated, knowledge from the game spreads within the community. The results show that after all players have played with each other, there is a symbol system used by almost all players in the community, even though there never was a co-ordinated effort to do so. This results in a higher recognition accuracy in the later sessions compared to the earlier sessions, even though the players still play with players they have never seen before.
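The pairing scheme within a community, in which every member plays exactly once with every other member, amounts to a round-robin schedule. A minimal sketch follows; the function name and the member labels are hypothetical, and the order of the encounters is not specified in the text.

```python
from itertools import combinations


def community_schedule(members):
    """Return every unordered pair of community members exactly once."""
    return list(combinations(members, 2))
```

For a community of n members this yields n(n - 1)/2 encounters, so each player meets n - 1 different partners.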

Similar pictionary-like tasks exist (Garrod et al., 2007; Healey et al., 2002, 2007; Selten and Warglien, 2007). Healey et al. (2007) for example use a shared digital virtual whiteboard to allow participants to communicate with each other.

Participants are grouped into pairs and each participant is given a 30-second piece of piano music. By using the whiteboard the participants have to decide whether they both have the same piece of music. The whiteboard is the only method the participants can use to communicate. They are furthermore not allowed to use letters or numbers.

Galantucci (2005) investigates the emergence of communication with humans in an experiment in which a pair of human participants try to score as many points as possible in a game. This is only possible if they communicate with each other.

The participants are given access to a computer that runs a program showing a house made up of four rooms in a two-by-two layout. Each room has its own symbol (for example a star), which is drawn in the center. The players start in different rooms and they cannot see in which room the other player is. A turn finishes as soon as the players finish their own turns while being in the same room. Finishing a turn successfully results in a certain number of points. Points are deducted for each minute the players are playing the game, so the faster the turns are finished, the higher the final score. This task is similar to earlier games studying how participants are able to co-ordinate in a restricted context (Garrod and Anderson, 1987).

Figure 2.2: An example of a trial in the Tacit Communication Game. Taken from De Ruiter et al. (2007).

To assist in playing the game, the players are given a means to communicate: each player has a digital drawing tablet on which they can draw signals. These signals are then shown on their own and the other player’s screen. To make sure that the participants do not use existing symbols, the drawings are distorted: the y-component changes linearly with time. This results in symbols that look stretched out, as if the players are drawing on a moving piece of paper.

Reliable communication in Galantucci’s game is not trivial; one pair failed to create any type of system because one player signalled inconsistently. Successful pairs invent symbol systems that indicate in which location the player is. Almost all pairs are able to achieve communication, but different types of symbol systems are used:

1. Some pairs used signals based on the location of their room, for example by giving each room a number and drawing the corresponding number of lines.

2. Other pairs tried to reference the symbol that was in each room, for example by basing their signal on the number of edges of the symbol.

3. Finally some pairs used systems that were based on the layout of the rooms; indicating whether they were on the left side or right side of the house.

2.3.1 The Tacit Communication Game

De Ruiter et al. (2007) investigate the origin of intentions. As studying first-person intentions is hard (“Why am I doing this?”), they focus on third-person intentions: how does a sender generate an intention in a receiver? This has a close relation to linguistic communication, as linguistic communication is achieved when a recipient recognises the intention with which a communicative act is produced (Grice, 1957; Sperber and Wilson, 1995).

De Ruiter’s model is based on the expectations of the Sender of a signal. The Sender sends a signal depending on how he expects the Receiver to interpret that signal. This means that the Sender actively selects a signal that he expects will most probably evoke the correct response from the Receiver.

To test this hypothesis the Tacit Communication Game, or TCG, was created. In this two-player game each player has a distinct role: there is a Sender and a Receiver. Each player has a token of one of three forms (square, triangle, circle). There is a 3x3 grid shared by both players in which they can move their token.

The players can move their token and also rotate it to change the token’s orientation. The goal in the game is to put both tokens in a predetermined location and orientation on the grid.

The experiment is done using different types of trials, which can be categorized as communicative and non-communicative. This is done so that MRI scans of the players during the non-communicative trials can be subtracted from those made during the communicative ones. In the first type of non-communicative trial the players are asked to move their tokens to the correct position. Both players are given the target position of the tokens, so no communication is necessary. In these trials MRI scans are made of the motor movement.

In the second type of non-communicative trial the Sender is also asked to move his token to the target location of the other player’s token. This is done so that motor movement during the communicative and non-communicative trials will show up identically on the MRI scans.

In the communicative trials only the Sender can see the goal position and orientation of the token of the other player. To score in the game he has to convey the location and orientation of that token to the other player. As the other player can see the movements of the Sender before acting herself, this information can be embedded in the movement of the Sender. This is exactly what was done in the non-communicative trials, so the players knew in advance how to communicate this information. Crucially, the Sender has to decide not only how to move his token to his goal location, but also how to communicate to the Receiver where she should place her token and in which orientation, using the same means. The Receiver then has to understand which movements are used for communication and which serve to move the Sender’s token, and also understand what the communicative moves mean.

This trial is depicted in Figure 2.2. The following excerpt from De Ruiter et al. (2007) explains the trial in detail:

1. Sender and Receiver view their tokens.

2. The Sender (not the Receiver) sees the goal configuration.

3. The Sender signals his readiness to move by pressing the start button – his shape moves to the center of the board and the goal configuration disappears.

4. The Sender moves his token on the game board by means of a multi-button controller. The movements of the Sender’s token are visible to the Receiver. The double arrow indicates that the Sender moved back and forth between those two positions.

5. The Receiver signals her readiness to move by pressing the start button – her shape moves to the center of the board.

6. The Receiver moves her token on the game board. The movements of the Receiver’s token are visible to the Sender.

7. Sender and Receiver receive feedback indicating whether they were correct (green box) or incorrect (red box) in matching their token to the goal configuration.

The results of this experiment are somewhat two-fold. In instances where both players had the same token or the Receiver would not have to rotate her token, the task was easy and had a 95% success rate. In trials where the Receiver would have to rotate her token (the ‘unconventional’ cases), success rate dropped to about 40%, which is still above chance level. In these cases the players would have to agree on a signalling system to determine how the token should be oriented.

Unfortunately, De Ruiter et al. focus mostly on the aspects of brain activity within the experiment and there is not much data on how the participants behaved in the unconventional case. Still, the experiment shows how communication can occur in an embodied way, without the use of a separate communication channel such as the one provided in Galantucci’s experiment. It also shows that communication can emerge between the players.

However, it differs from other experiments in that the system created here is iconic: the target location of the token is signalled by actually travelling to that location and the orientation of a token is signalled by oscillating in the right direction. This is different from human communication, where the signal is arbitrary with respect to the meaning attached to it. Furthermore, the participants were directed to move in specific ways, rather than having freedom of choosing their own system.

2.3.2 The Embodied Communication Game

The previous studies investigate how communication systems evolve once communication is established, assuming that the method for communication already exists. In the pictionary tasks for example the whiteboard is already given. The robots in the Talking Heads experiment are programmed to use sound for speech.

The method of choosing a channel for communication so far has not been studied extensively (Scott-Phillips et al., 2009). An exception to this is the study by Quinn (2001), but the communication system created there is innate and iconic, rather than learnt and symbolic, like human communication.

There are three ways in which this aspect has been avoided in previous studies. First of all, one obvious way to circumvent the problem of how signals are recognised is by providing a means of communication. Once the means of communication are defined, all inputs through that channel must be recognised as a signal. This is for instance the case with pictionary-like experiments. While these experiments show how different parties can converge on a set of arbitrary symbols, they do not address the issue of how a signal can be detected, or which signals are more detectable than others.

The same is true in the case where the set of meanings and symbols is already established. If the symbols and meanings are set, there is an obvious difference between signals and non-signals. This can be seen in the experiment by Levin (1995), where a predefined array of symbols was used.

Finally, to a lesser extent, it is also possible that the roles of sender or receiver are already defined: if someone expects to be the receiver of a message, he or she is bound to interpret any incoming data as a signal.

Most previous studies have focussed on the content of a signal. According to Scott-Phillips (2010), there is another issue: how do users know what is a signal? Or, alternatively, how do you signal signalhood? This is the difference between informative and communicative intentions. The first is the intention that the receiver understands the signal, the second is the intention that the receiver realises that there is an informative communication (Grice, 1957; Sperber and Wilson, 1995).

To investigate the creation of communication systems without a predefined means, a new game was created. In this game, embodied communication is necessary to successfully complete a game. It is therefore called the Embodied Communication Game, or ECG. This means that there is no communication device through which the players can signal. All communication in the game has to be done through a channel that can also be interpreted as a move in the game. It is also not implied in the game instructions that communication is possible or necessary.

The game focusses on communicating the intent to communicate: how to signal signalhood. Communication has to be done through movement in the game and thus the movement is co-opted (Gould and Vrba, 1982) to be used as communication. This means that one of the first steps the participants have to take is to find out that movement can be used as a form of communication. This is different from the previous experiments, in which the means of communication has already been given.

In the ECG the players are situated in a box with four squares, similar to Galantucci’s game, but the players are able to see in which room each player is. It is only possible to move between squares, not within. Each square is assigned one of four colours, but the colour assignments are different for each player. The players cannot see the colours of the rooms of the other player and the colours of the rooms change with every game turn. A colour can be present multiple times in the same box, or be completely absent. At least one colour is always common between both players.

The players are given the task to end their turns in rooms with the same colour. Each player signals that they have finished moving by pressing the space bar, and a turn ends when both players have done so. After a turn is finished, it is no longer possible to move until the next turn has started. Only at the end of each turn can the players see the colour assignments of the rooms of the other player.

This means that the players lack information to complete the task: which colours are available to both players and on which colour can they end their turn? They have no interaction except for their movement in the game. This movement must also be used to direct the avatar to a specific square in the box. Therefore movement must be used for two different goals in order to succeed in the game: for travelling within the world as well as for communication.

Participants are able to successfully play the game by creating strategies where a common colour is negotiated. This usually happens by the players establishing dialogue and allowing each other to respond to proposals for a specific colour. If the listening player has the signalled colour, the player travels to that colour and ends the turn. Otherwise, another colour is suggested. Such a strategy often is only established after a more crude system is in place, for example by travelling to one specific colour if this colour is available. Signalhood can then be signalled in exceptional conditions, for example when one of the players does not have this default colour. Such a player can start oscillating or otherwise moving in erratic ways. Players report this as meaning “No red!” or “Not plan A!” (Scott-Phillips et al., 2009).

Signals in the ECG are symbolic, in contrast to the signals used in the TCG: movement is often done in a particular pattern, for example oscillations or circles, that have no relation to the specific colour they signal.

2.4 Conclusion

In the previous sections different ways of studying emergence of communication have been addressed. While these experiments address interesting questions, the specific methods used allow many degrees of freedom. For example, in the ECG each pair is able to create a different strategy, allowing many different strategies to exist, which can be hard to compare. This can make changing experimental variables challenging, though not impossible. Healey et al. (2007) for example change the type of communication device and the composition of communities in pictionary-like tasks.

Computer simulations have less freedom and are easier to reproduce; however, they tend to investigate different issues than the human based tasks (Galantucci and Steels, 2008; Steels, 2006). For example, they often ignore the prerequisites for communication. So far there has not been an effort to combine both methods and use computer agents that interact with human players.


Chapter 3

Analysis of the ECG and the TCG

The previous chapter described some of the existing games briefly. To understand which elements of the game should be kept, which can be simplified and which properties can cause problems for an agent it is necessary to investigate the workings of these games in detail. With this knowledge it is possible to build a new game that is playable by both agents and humans and still provides opportunities to quantitatively measure certain properties of the game.

Further investigation of the games should also make clear what the possible space of strategies is, which kind of strategies are currently used by the players and which strategies are more successful than others.

This chapter consists of two sections, describing the game mechanics of the Embodied Communication Game (ECG) by Scott-Phillips et al. (2009) and the Tacit Communication Game by De Ruiter et al. (2007) in detail.

In the next chapter the strategies used by the players in the Embodied Communication Game will be analysed. In this analysis the properties of different strategies will be described in order to determine what features a new, simplified game should have. Finally, in Chapter 5 these results are combined to create a new game.

3.1 Terminology

Before discussing the games in detail, it is useful to establish some common terminology, as some confusion can arise by the different use of the same terms in both games. The following terms have been taken from Scott-Phillips et al. (2009) and De Ruiter et al. (2007), but some have been redefined to avoid confusion between the two games.

Grid: a playing field for a player

Box: a grid for a single player in the game

Square: a single element of the grid (also called a quadrant in the ECG)

Location: one specific square in the grid where the player is located

Token/Avatar: the movable object the player has within the box

Orientation: a specific rotation of the player’s token

Position: the position of a player in the game; consists of a location and, if applicable, an orientation

Situation: description of the current game board and the position of all players

Turn: a sequence of moves by a player, ending with the ‘finish turn’ move

Round: a single run of the game. It consists of the turns of both players and the setup and end of the turns.

3.2 Analysis of the Embodied Communication Game

While the ECG is simpler than the games it is based on, such as the game created by Galantucci (2005), it is still a game with many degrees of freedom.

1. The game is continuous; players can move at any time

2. Pauses in the game can have meanings

3. Both players move in the same time period, making later moves depend on the earlier moves

The ECG is played between two players. There is no predetermined role for either of the players; both have exactly the same status in the game, though in some trials one player takes on a leading role, as we will see later.

Each player has a game board, which is called a box. The box consists of 4 squares, each with its own colour. There are 4 possible colours for each square: red, yellow, green and blue. The square colours are assigned randomly; there is no dependence between the colours of squares or between the colours in different turns. There is only one restriction in the assignments of colours, namely that at least one colour should be in common in the boxes of both players. The starting location of the players is also random.

Each player is situated in the box through the use of a stick-man. This stick-man is always located in one of the four squares, but is allowed to travel from square to square. Movement only happens between squares; it is not possible to travel within a square. Movement between squares is animated and takes 1 second. Movement is done using the arrow keys. During the movement of the stick-man it is not possible for the player to make any other moves.

The players play the game for a total of 40 minutes and are asked to score as many points as possible. A point is scored when the players end their turn on squares of the same colour. Ending the turn is signalled by pressing the space bar. After ending the turn it is not possible for the player to do any other moves until the next round starts.

A player can see the location of the other player in their box, but cannot see the colours assigned to the squares of the box of the other player. After both players have ended their turns, the colours of all squares are revealed. This way players are able to see if they scored a point and which colours were available to the other player. The players can also see whether they scored a point as a message is shown, either stating “Oh dear, no points this time!” or “Yes! You scored a point!”. After both players have again pressed the space bar, the game continues with the next round.

Both players gain a point when they end on squares of the same colour. The goal in the game is to get as many points in succession as possible: if the players fail to score in a round, their score is reset to 0. The final score is the maximum points scored in succession throughout the game.
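The scoring rule can be illustrated with a short Python sketch. The function name and the list of round outcomes are illustrative assumptions, not part of the ECG software; the sketch only shows that the final score is the longest run of consecutive successful rounds.

```python
def final_score(round_results):
    """Return the longest run of consecutive successful rounds.

    round_results: iterable of booleans, True when both players ended
    the round on squares of the same colour (hypothetical input format).
    """
    best = 0
    streak = 0
    for scored in round_results:
        # A failed round resets the running score to 0.
        streak = streak + 1 if scored else 0
        best = max(best, streak)
    return best


print(final_score([True, True, False, True, True, True, False]))  # 3
```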

3.2.1 Possible game layouts

In the ECG, each player has 4 squares with one of four colours each. Furthermore, both players start at random in one of the four squares. Therefore, there are a total of 4⁴ · 4⁴ · 4 · 4 = 4¹⁰ = 2²⁰ = 1048576 potential starting situations possible for each turn. However, the game does not use start conditions in which it is impossible to score a point. This happens for example if one player’s box would be completely red, and the other player’s box completely blue. With the algorithm used in the ECG (described below), there are in total 1019584 different starting situations.
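Both counts can be checked by brute-force enumeration. The Python sketch below is an illustration only; the one-letter colour codes and variable names are assumptions made for the example. It enumerates every colouring of the two boxes and keeps the pairs that share at least one colour.

```python
from itertools import product

COLOURS = "RYGB"  # red, yellow, green, blue (codes chosen for this sketch)

# Every way to colour the four squares of one box: 4**4 = 256 boxes.
boxes = list(product(COLOURS, repeat=4))

# All combinations of two boxes and two starting squares.
total = len(boxes) ** 2 * 4 * 4

# Keep only box pairs that share at least one colour,
# so that scoring a point is possible.
scorable_pairs = sum(1 for a, b in product(boxes, repeat=2) if set(a) & set(b))
scorable = scorable_pairs * 4 * 4

print(total)     # 1048576
print(scorable)  # 1019584
```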

A point is gained if a turn ends while both players are on squares of the same colour. This means that both boxes need to have at least one colour in common in order to have a way to score a point. A situation describes the position of both players. As there are 4 possible positions for each player, there are 16 situations in total.

A successful situation is a situation that, given a specific colour assignment to the boxes, will result in a point if the game ends in that situation.

For example, Figure 3.1(a) shows an unsuccessful situation: the left player is on a yellow square and the right player on a green square, so they won’t score a point if the round is ended at this point. Figure 3.1(b) shows a successful situation, as both players are currently positioned on a yellow square. With this board layout, there are four successful situations possible: two situations in which both players end on yellow, one in which both end on red and one in which both end on blue. Therefore, by pure chance, there would be a 25% chance of scoring a point (4 out of 16 situations).
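Counting successful situations for a given layout is a small enumeration over the 16 situations. In the Python sketch below the layout is hypothetical (it is not read off Figure 3.1); it was merely chosen so that it also has two yellow matches, one red match and one blue match.

```python
from itertools import product


def successful_situations(box_a, box_b):
    """List the (square_a, square_b) index pairs that would score a point."""
    return [(i, j) for i, j in product(range(4), repeat=2)
            if box_a[i] == box_b[j]]


# Hypothetical colour assignments, not the layout shown in Figure 3.1.
left = ["yellow", "yellow", "red", "blue"]
right = ["yellow", "red", "blue", "green"]

wins = successful_situations(left, right)
print(len(wins), len(wins) / 16)  # 4 0.25
```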

In the ECG every round has at least one successful situation available. This is enforced by the following algorithm to generate the boxes:

r1 <- GenerateRandomBox()
r2 <- GenerateRandomBox()
c <- GetRandomColour()
SetColor(r1, RandomPosition(), c)
SetColor(r2, RandomPosition(), c)

That is, two boxes with random colours are generated. Then a new colour is chosen at random and in each box one of the squares is set to this colour.
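A minimal Python rendering of this procedure is sketched below. The function names, the list representation of a box and the optional `rng` argument are assumptions for illustration; the original ECG implementation is not shown in the source.

```python
import random

COLOURS = ["red", "yellow", "green", "blue"]


def generate_random_box(rng):
    """Colour each of the four squares independently at random."""
    return [rng.choice(COLOURS) for _ in range(4)]


def generate_round(rng=random):
    """Generate two boxes that are guaranteed to share at least one colour."""
    r1 = generate_random_box(rng)
    r2 = generate_random_box(rng)
    c = rng.choice(COLOURS)
    # Overwrite one randomly chosen square in each box with the shared colour.
    r1[rng.randrange(4)] = c
    r2[rng.randrange(4)] = c
    return r1, r2


box1, box2 = generate_round(random.Random(42))
print(box1, box2)
```

Passing a seeded `random.Random` instance makes a generated round reproducible, which is convenient when comparing strategies on identical layouts.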

(a) Unsuccessful situation

(b) Successful situation

Figure 3.1: Example of successful and unsuccessful situation. The left image shows the position of the first player, the right image the position of the other player

This algorithm is different from choosing one of the possible board layouts at random. In particular there are fewer rounds where the number of possible successful situations is low. As a result of this, two players will successfully end a turn by chance in 30% of the rounds. This differs from the 26% average chance with the completely random mechanism.
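The two chance levels can be reproduced exactly. The Python sketch below is an illustration under the assumption that "by chance" means both players end on a uniformly random square; all names are chosen for this example. For the ECG procedure it uses the fact that, once the shared colour is fixed, the two boxes are generated independently, so the chance of a match is the sum over colours of the squared probability that a random square has that colour.

```python
from fractions import Fraction
from itertools import product

N = 4  # four colours and four squares per box
boxes = list(product(range(N), repeat=N))
counts = {box: [box.count(k) for k in range(N)] for box in boxes}


def matching_pairs(a, b):
    """Number of (square of a, square of b) pairs with equal colours, out of 16."""
    return sum(x * y for x, y in zip(counts[a], counts[b]))


# (1) Uniform choice among all box pairs that share at least one colour.
pair_matches = [matching_pairs(a, b) for a, b in product(boxes, repeat=2)]
valid = [m for m in pair_matches if m > 0]
uniform = Fraction(sum(valid), 16 * len(valid))

# (2) Procedure with a forced shared colour c in one random square per box.
algorithm = Fraction(0)
for c in range(N):
    totals = [0] * N  # how often each colour occurs over all forced boxes
    for box, i in product(boxes, range(N)):
        forced_box = box[:i] + (c,) + box[i + 1:]
        for k in range(N):
            totals[k] += forced_box.count(k)
    # Probability that a random square of a forced box has colour k.
    shares = [Fraction(t, len(boxes) * N * N) for t in totals]
    algorithm += sum(s * s for s in shares) / N

print(round(float(uniform), 3))     # 0.257
print(algorithm, float(algorithm))  # 19/64 0.296875
```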

3.2.2 Movement in the ECG

Moves in the game can be modelled relatively simply. A player is situated in a 2 x 2 grid of squares. Movement is only allowed between squares, not within. A player is only allowed to move to direct neighbours of a square. Therefore a player can always perform one of exactly two positional changes from any position: move in the horizontal direction, or move in the vertical direction.

The other option the player has is to end his turn; after his turn has ended, no other moves are available to him until the next round starts. Therefore the game can be represented as a list of signals like this:

Player  0     1     0     1     0     0

Time    0     1673  3241  3534  4235  9245

Move    H     V     H     F     V     F

Where H is a horizontal move, V is a vertical move and F means the end of the turn. As will be demonstrated in the next chapter, finishing a turn can have a communicative function.
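Such a move list can be replayed mechanically. The Python sketch below encodes the table above and replays it on a 2 x 2 grid; the `Event` type, the (column, row) coordinates and the starting squares are assumptions made for illustration, and the time values are copied from the table without interpreting their unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    player: int  # 0 or 1
    time: int    # timestamp as listed in the table
    move: str    # "H", "V" or "F"


LOG = [Event(0, 0, "H"), Event(1, 1673, "V"), Event(0, 3241, "H"),
       Event(1, 3534, "F"), Event(0, 4235, "V"), Event(0, 9245, "F")]


def replay(log, start):
    """Replay a log from starting squares given as (column, row) pairs in {0, 1}."""
    pos = {p: list(xy) for p, xy in start.items()}
    finished = set()
    for ev in sorted(log, key=lambda e: e.time):
        if ev.player in finished:
            raise ValueError("no moves are allowed after finishing the turn")
        if ev.move == "H":
            pos[ev.player][0] ^= 1  # switch to the other column
        elif ev.move == "V":
            pos[ev.player][1] ^= 1  # switch to the other row
        elif ev.move == "F":
            finished.add(ev.player)
        else:
            raise ValueError(f"unknown move {ev.move!r}")
    return {p: tuple(xy) for p, xy in pos.items()}, finished


# Hypothetical starting squares: both players begin in the top-left square.
final, finished = replay(LOG, {0: (0, 0), 1: (0, 0)})
print(final, finished)  # {0: (0, 1), 1: (0, 1)} {0, 1}
```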

Alternatively, the game can be described as both players choosing one of four actions at each point in time: move horizontally, move vertically, finish the turn, or do nothing.

That is, every decision in the game is based on the colours of the squares, the starting positions of the players and the previous moves done by both players.


Figure 3.2: Example screens in the TCG. 1) The initial state, displaying the tokens used by the players. 2) The goal state as seen by the Sender.

3.3 Analysis of the Tacit Communication Game

The TCG by De Ruiter et al. (2007) has fewer possible states, but gives the player more choices for movement. The next section will expand on the short description of the game in the previous chapter. For ease of exposition, the same convention as De Ruiter et al. used will be used in this description: the Sender is addressed as ‘he’ and the Receiver as ‘she’.

3.3.1 Game Explanation

The players are equipped with buttons near the hands and on the shoulders. There are 7 buttons in total: four to change the location of the token, two to change the orientation of the token and one to signal the start or end of turn.

The game consists of a game board (a 3 by 3 grid of squares) and a token for both players. This token changes between rounds, but can be seen as the representation of the player within the game. In the initial phase the token of the Sender is displayed below the grid and the token of the Receiver is displayed above the grid. Both tokens are also displayed inside the grid, each token with a different location and orientation. The goal in the game is for both players to match the location and orientation of the ‘goal tokens’ that are displayed in this initial phase. These first two phases can be seen in Figure 3.2.

The game is divided into two turns, which are played sequentially (rather than having both players move at the same time, as with the ECG). The Sender is always the first player that moves and is always able to see the goal position and orientation of both tokens. Each player has 5 seconds to move before the turn ends, but has unlimited time to plan the moves before starting the 5 second movement phase.

In the control experiments, the Receiver can also see the target position for her token, making it easy to complete the game successfully for both players. In one of the control experiments, the Sender is asked to go directly to his target position. In the other control experiment he is asked to first stop at the goal position of the token of the Receiver. This is done so that the MRI measures can be set as a baseline.

In the final ‘communicative’ trials the Receiver cannot see her own target position or orientation. The Sender is still allowed to see both target positions and orientations for the tokens. He can then take all the time necessary to plan his moves. After signalling his readiness by pressing one of the buttons, the token of the Sender is placed in the middle of the grid and the turn starts. He then has 5 seconds to convey the target location and orientation of the Receiver’s token and end at his own target location. The rest of this section only refers to this communicative trial.

The moves done by the Sender are viewed in real-time by the Receiver. After the Sender finishes his turn, the Receiver has unlimited time to prepare her own movement. When the Receiver is ready, she can signal her readiness by pressing one of the buttons. Her token is then put in the middle of the screen and she also has 5 seconds to reach her own target position.

3.3.2 Differences with the ECG

The Sender encodes some information in his moves which will help the Receiver to the right location. He also moves to his own goal location. In this regard the game is similar to the ECG: movement is used both for communication and for moves within the game.

The Receiver moves only after all the moves of the Sender have been shown. The Receiver must extract the information about her own location from the movements of the Sender and then move to the position that was hopefully communicated correctly by the Sender.

The game differs from the ECG in that the players move in a specific order, making the moves of the Sender independent of the moves of the Receiver. It is similar to the ECG in that the game is continuous: there is an arbitrary amount of time between each move and these time periods can have meanings.

Because of the way the game is set up, the Sender already knows in part how to play the game; he is instructed to move his token to the goal location of the Receiver’s token. The Sender does not know how to convey information about the goal orientation of the Receiver, so he will have to create some kind of strategy for that. This is very different from the ECG, which is set up as a game in which communication is not encouraged or even mentioned; the players in the ECG have to find out for themselves that the only way to succeed in the game is with the use of communication.

The TCG also differs from the ECG in that the roles in the TCG have been predefined. The game has been explained to the players and the Sender always starts before the Receiver. The Sender is instructed how to play the game and how to convey information to the Receiver.

Finally, the TCG uses iconic signals while the signals in the ECG are purely symbolic: in the TCG the participants travel to a specific square to signal that that square is the goal location for the Receiver. Orientations for tokens are often signalled by oscillation in a specific direction.

3.3.3 Possible Game Situations

The TCG has a board based on a 3x3 grid design. Every player has one of 3 tokens. It is not clear how many orientations each token has. If we assume that the circle has 1 orientation, the rectangle 2 and the triangle 4, there are a total of 7 different token/orientation pairs. Each player has a token somewhere on the board. Presumably, the tokens cannot overlap each other, but players can have the same type of token and the same orientation. There are therefore 9 ∗ 7 ∗ 8 ∗ 7 = 3528 different board layouts possible.
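The count of 3528 layouts can be reproduced by enumeration. The Python sketch below relies on the same assumptions as the text (1, 2 and 4 orientations per token shape, and tokens that cannot share a square); the variable names are chosen for this example.

```python
from itertools import product

# Assumed number of distinguishable orientations per token shape.
ORIENTATIONS = {"circle": 1, "rectangle": 2, "triangle": 4}

# Every (shape, orientation) pair a token can show: 1 + 2 + 4 = 7.
token_states = [(shape, o) for shape, n in ORIENTATIONS.items() for o in range(n)]

squares = list(product(range(3), repeat=2))  # the 3x3 grid

# Sender and Receiver occupy different squares; token states are unrestricted.
layouts = [
    (s_sq, s_tok, r_sq, r_tok)
    for s_sq, r_sq in product(squares, repeat=2) if s_sq != r_sq
    for s_tok, r_tok in product(token_states, repeat=2)
]

print(len(token_states), len(layouts))  # 7 3528
```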

3.3.4 Movement in the TCG

A round in the game is successful if both players end their turn on the goal positions. As the Sender always knows in advance on what position he should end, we can assume that the Sender will always end on the correct position.

This means that success of the game is defined by whether or not the Receiver ends her turn on the correct position. This depends both on how the Sender has sent the information regarding the position and on how well the Receiver is able to interpret this information.

There are 9 squares in the grid. As the Sender occupies one square, there are 8 locations left for the Receiver to choose from. As there are 1, 2 and 4 possible orientations per token type, an average round will have 7/3 possible orientations available. This means that the Receiver has to choose from 8 ∗ 7/3 = 18⅔ different end positions on average. A Receiver who chooses at random will then score with a chance of 1/18⅔ = 3/56 ≈ 0.054.
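The arithmetic behind this chance level can be reproduced with exact fractions. This is a small sketch under the same assumptions as the text: one square is taken by the Sender, the orientation counts are 1, 2 and 4, and each token type is equally likely. The variable names are illustrative only.

```python
from fractions import Fraction

# Assumed orientation counts per token type (an assumption, as in the text).
ORIENTATIONS = {"circle": 1, "rectangle": 2, "triangle": 4}

# Squares left for the Receiver once the Sender occupies one of the 9.
FREE_SQUARES = 8

# Mean number of orientations, assuming each token type is equally likely.
mean_orientations = Fraction(sum(ORIENTATIONS.values()), len(ORIENTATIONS))

# Average number of distinct end positions (square and orientation).
end_positions = FREE_SQUARES * mean_orientations

# Chance of ending on the goal position when choosing uniformly at random.
chance = 1 / end_positions

print(mean_orientations)        # 7/3
print(end_positions)            # 56/3
print(chance)                   # 3/56
print(round(float(chance), 3))  # 0.054
```

Using `Fraction` keeps the intermediate value 56/3 exact, so the rounding happens only once, at the final step.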

The moves of the Sender within a turn depend only on his own previous moves. The Receiver reacts to the moves made by the Sender, but not the other way around. As the Sender has made all his moves before the Receiver is allowed to move, both can (and are encouraged to) plan their moves fully ahead.

(Sender)

Time    0    1673    4241    6347
Move    0    1       2       4

(Receiver)

Time    0    8032    9132    9842
Move    0    1       5       3

There are more moves available than in the ECG. Because the game has a 3x3 grid of squares, from the center four moves (up, down, left, right) are available. On the sides three moves are available and in the corners two. The player also has the option to rotate the token clockwise or counter-clockwise. A player can rotate a token both ways, but each token has a limited set of possible rotational positions. For example, the circle always has the same rotation.
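The number of translation moves per square follows directly from the grid boundaries. The sketch below assumes a 3x3 grid indexed by row and column from 0 to 2. The function name and the move labels are hypothetical, and rotation moves are left out because the number of rotational positions per token is not known for certain.

```python
# Row/column offsets for the four translation moves (labels are illustrative).
DIRECTIONS = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}


def translation_moves(row, col, size=3):
    """Return the translation moves that keep the token on the board."""
    moves = []
    for name, (d_row, d_col) in DIRECTIONS.items():
        new_row, new_col = row + d_row, col + d_col
        if 0 <= new_row < size and 0 <= new_col < size:
            moves.append(name)
    return moves


print(translation_moves(1, 1))  # centre: ['up', 'down', 'left', 'right']
print(translation_moves(0, 1))  # side:   ['down', 'left', 'right']
print(translation_moves(0, 0))  # corner: ['down', 'right']
```

This reproduces the counts in the text: four moves from the centre, three from a side square and two from a corner.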

3.4 Conclusion

A player in the TCG has more options for choosing a move than a player in the ECG, being allowed to move in at most 4 directions in addition to being able to rotate the token. This is offset by the fact that a player in the ECG has a larger array of starting situations. Players in the ECG are more likely to score by chance than players in the TCG are.

The ECG has a dedicated move to finish the turn, while in the TCG the turn ends after 5 seconds. This is possible because back-and-forth communication is necessary for fully successful play only in the ECG. The TCG has two different roles, a Sender and a Receiver, which act in a set order. This means that no dialogue between the participants within a round is possible.

As the Sender in the TCG has complete information about the game situation (namely, the target positions of both tokens), no dialogue is necessary; the Sender only has to communicate information to the Receiver. In the ECG, on the other hand, both players have limited information as they only know the colours of their own squares. Dialogue is necessary to negotiate a colour that both players are able to end their turns on.

The differences between the two games are highlighted in the strategies used by the players in each game. Unfortunately, not a lot is known about the strategies in the TCG. In the next chapter the strategies used in the ECG will be analysed.

Chapter 4

Strategies in the Embodied Communication Game

Strategies in the ECG have already been analysed by Scott-Phillips et al. (2009).

This chapter will first recount some of that analysis and then proceed to some more specific analysis. The analysis done by Scott-Phillips focusses mostly on two different types of strategy: imposed and evolved strategies. These results are discussed first.

However, more information about the strategies can be obtained by looking more closely at the data. In section 4.2 some examples of strategies will be given.

In section 4.3 the strategies will be analysed by extracting common properties and describing each in detail. In Chapter 5 these properties will be used to simplify the ECG.

4.1 Analysis by Scott-Phillips

In the experiment done with the ECG, 7 of the 12 pairs were able to create a partial or complete strategy for the game. Scott-Phillips categorised the pairs into two groups, based on how the strategy arose. One group (with 5 of the 7 successful pairs) used an emerging communication scheme, where the strategy was created on the basis of interaction between both players. In the other group, with 2 pairs, one of the participants imposed a system on the other participant. This analysis can be found online¹. Both groups will be discussed separately.

4.1.1 Emerged strategies

All of the systems that emerged (the first group) followed the same general pattern:

1. The players move randomly and happen to have a successful round by ending on the same colour.

2. In the next round, both players again finish on the same colour.

¹ http://ling.ed.ac.uk/~simon/ecg/
