Towards Personalised Gaming via Facial Expression Recognition

Academic year: 2021

Paris Mavromoustakos-Blom

MSc Thesis under the supervision of

Sander Bakkes,

submitted to the Board of Examiners in partial fulfillment of the requirements for the degree of

MSc. in Artificial Intelligence of the

University of Amsterdam.

Abstract

In this thesis we propose an approach for personalising the space in which a game is played (i.e., levels) dependent on classifications of the user’s facial expression – to the end of tailoring the affective game experience to the individual user. Our approach is aimed at online game personalisation, i.e., the game experience is personalised during actual play of the game. A key insight of this research is that game personalisation techniques can leverage novel computer vision-based techniques to unobtrusively infer player experiences automatically based on facial expression analysis. Specifically, to the end of tailoring the affective game experience to the individual user, in this thesis we (1) leverage the proven INSIGHT facial expression recognition SDK as a model of the user’s affective state, and (2) employ this model for guiding the online game personalisation process.

Two different methods which aim at online and unobtrusive user affective state assessment are introduced. The first approach performs in-game adaptations based on the human player’s facial expression analysis during gameplay. The second approach additionally uses user head pose estimation in order to accurately estimate the human player’s emotional state under circumstances in which the first approach could falter, such as user head movement or sharp changes in illumination.

User studies that validate the game personalisation methods in the actual video game INFINITE MARIO BROS. reveal that the first method provides an effective basis for converging to an appropriate affective state for the individual human player. Furthermore, by using the second method the approach can be made more robust to noise, while it also achieves faster convergence compared to the first method.

Acknowledgements

First, I would like to thank my supervisor Sander Bakkes for his consistent support and advice throughout the course of this research. His recommendations and guidance have been valuable to my academic career. Indeed, my collaboration with him has led to part of this thesis work being published at the 2014 Artificial Intelligence and Interactive Digital Entertainment (AIIDE) conference. Furthermore, I would like to thank my family and friends for their support during the last few years; I would not have made it without them. Finally, my thanks to all of the people who participated in my experiments, both friends and strangers.

Contents

1 Introduction
    1.1 Research Questions
    1.2 Thesis Outline
2 Related Work
    2.1 Game Personalisation
    2.2 Player Experience Analysis
    2.3 Facial Expression Recognition
3 Domain Description
4 Personalised Gaming via Facial Expression Recognition
    4.1 Emotion Tracking
    4.2 Gradient Ascent Optimisation
    4.3 Experiments & Results
        4.3.1 Online personalisation – Pilot study
        4.3.2 Online personalisation – Pairwise tests
5 Personalised Gaming via Facial Expression Recognition and Head Pose Detection
    5.1 Issues observed in the first approach
    5.2 Solutions to the observed issues
    5.3 Improvements to the game adaptation system
        5.3.1 Classifier Features
        5.3.2 Training
        5.3.3 Testing
        5.3.4 Game Personalisation
    5.4 Experiments & Results
        5.4.1 Online personalisation – Pilot study
        5.4.2 Online personalisation – Pairwise tests
6 Discussion
    6.1 Adapting to the individual human player
    6.2 Difficulty setting convergence
7 Conclusion
    7.1 Answers to the research questions
    7.2 General Conclusion
    7.3 Future Work

Chapter 1

Introduction

Ideally, artificial intelligence (AI) in games provides satisfactory and effective game experiences for players regardless of gender, age, capabilities, or experience (Charles et al., 2005); it allows for the creation of personalised games, where the game experience is continuously tailored to fit the individual player. Indeed, we are now at a point where modern computer technology, simulation, and AI have opened up the possibility that more can be done with regard to on-demand and just-in-time personalisation (Riedl, 2010). However, achieving the ambition of creating personalised games requires the development of novel techniques for assessing, online and unobtrusively, which game adaptations are required for optimizing the individual player’s experience.

The goal of this research is to generate game spaces (i.e., levels) online, such that the spaces optimise player challenge for the individual player. A major challenge to this end is that in online gameplay only implicit feedback on the appropriateness of the personalisation actions is available, i.e., the AI can only observe the player interacting with the game, while not being provided with labels on the player experience. Still, methods for tailoring the affective game experience to the individual user require an indication of how appropriate the provided experience is to the player. However, explicitly asking for player feedback during gameplay is usually too intrusive, and would gravely affect the game experience. It is thus of the essence to use as much implicit feedback as possible, to obtain a model of the player experience that is as accurate as possible.

A key insight of this thesis is that game personalisation techniques can leverage novel computer vision-based techniques to unobtrusively infer player experiences automatically based on facial expression analysis. Specifically, to the end of tailoring the affective game experience to the individual user, in this thesis we (1) leverage the established INSIGHT facial expression recognition SDK as a model of the user’s affective state (Sightcorp, 2014), and (2) employ this model for guiding the online game personalisation process.

As such, we consider challenge to be a cognitive state that might incorporate affective patterns that could be expressed through the face. We focus purely on attaining an appropriate challenge level through online learning from affective signals, which is a relatively challenging task. The approach operates by adjusting procedural parameters that control the intended challenge level (per content type) within the game. This provides the expressiveness to tailor the intended challenge level to specific users (by adapting specific content in a distinct manner). Specifically, we will control the intended challenge level based on measured affective states; we do not make assumptions on the relationship between affect and challenge.

Taking all the aforementioned goals and assumptions into consideration, we have developed two different methods, which both independently provide us with an online and unobtrusive game personalisation system, based on facial expression recognition. The first version only requires an estimation of the human player’s emotional state in order to decide which action to take towards increasing or decreasing the game difficulty in the future. The second version, on the other hand, keeps track of both the emotional state and the head pose metrics of the human player, while it also creates an ‘emotional profile’ for each individual user by training a classifier based on these calculations. Given a new user, the classifier is able to estimate the optimal game difficulty setting online, based on the user’s observed reactions.

1.1 Research Questions

Aiming at online and unobtrusive game personalisation, and given the fact that we will be provided with implicit feedback on player affective state only, we need to set certain requirements for our research. If our system manages to fulfil those, we can safely claim that our research goals have been achieved. Therefore, in this thesis, we set the following two research questions:

1. Can we provide online and unobtrusive game personalisation based solely on facial expression analysis?

The first question clearly depicts the core research topic of this thesis. While previous approaches such as Bakkes et al. (2012) have shown that online and unobtrusive game personalisation is feasible using implicit feedback only, we aim at achieving the same goal by explicitly using facial expression recognition data. We consider this to be a quite challenging task, given that such data can be noisy, being affected by lighting conditions, player head pose and general player behaviour. Furthermore, we acknowledge that each individual human player is expected to behave and/or react towards the game in different ways, such as through facial expressions, hand gestures or verbally. We would like to investigate whether facial expression recognition provides adequate information to achieve effective game personalisation.

2. Can online and unobtrusive game personalisation, based on facial expression analysis, be improved by employing head pose detection?

The second research question expands the first one in terms of how increasing the number of facial features tracked by our system could improve its efficiency. Players might express their emotions during gameplay in a variety of ways, including facial expressions but also head movement, hand gestures and/or verbal expressions. We will introduce a new dataset in our second approach of the system, enhanced with head pose detection data, which we will compare to the first approach in terms of efficiency and tolerance to noise.

In answering these two research questions, we will specifically investigate: (1) a system’s ability to converge to the appropriate difficulty setting for the individual human player, and (2) the qualitative effect of the system on player satisfaction.

1.2 Thesis Outline

This thesis is divided into several chapters. Chapter 1 contains an introduction with a brief summary of the goals of this research, and poses two research questions which will be answered by the end of this thesis. Chapter 2 includes related work from the fields of adaptive gaming, player experience analysis and facial expression analysis. Chapter 3 contains a description of the domain in which our research takes place, namely the INFINITE MARIO BROS. game. Next, Chapters 4 and 5 contain implementation details, experiments and results for each of the two approaches. We have decided to analyse each approach in a separate chapter, and discuss our findings on both of them in Chapter 6. The latter also contains a comparison of the two implementations, builds upon the results, and proposes some possible improvements to the system. Finally, answers to the research questions and insight on future work are provided in Chapter 7.

Chapter 2

Related Work

In this chapter, we will present work related to our research, from the fields of Game Personalisation, Player Experience Analysis and Facial Expression Recognition. Various ideas deriving from the following work have influenced the way we decided to design our system.

2.1 Game Personalisation

Game personalisation is motivated by a significantly increased involvement and extensive cognitive elaboration when subjects are exposed to content of personal relevance (Petty and Cacioppo, 1979); they will exhibit stronger emotional reactions (Darley and Lim, 1992). Particularly, a positive effect on player satisfaction is indicated, i.e., game personalisation raises player loyalty and enjoyment, which in turn can steer the gaming experience towards a (commercial) success (Teng, 2010). Indeed, the perspective of AI researchers to increase the engagement and enjoyment of the player is one that is consistent with the perspective of game designers (Riedl, 2010), i.e., personalisation methods are regarded as instrumental for achieving industry ambitions (Molyneux, 2006). Tailoring the game experience to the individual player particularly benefits from the use of player models, and requires components that use these models to adapt part of the game (Bakkes et al., 2012).

Our research follows the emerging trend of employing AI methods for adapting the game environment itself (as opposed to, more typically, adapting the behaviour of the game characters) (Bakkes et al., 2014). In our investigation, we choose to focus on personalising the game space to the individual player with respect to experienced challenge. Related work with regard to this scope is discussed next.

2.2 Player Experience Analysis

We build on the novel perspective that computer vision techniques can automatically infer gameplay experience metrics (Tan and Pisan, 2012; Tan et al., 2012), a field broadly categorised into qualitative and quantitative methods.

Qualitative methods involve the collection and analysis of subjective data for games; this often includes direct observations, interviews and think-aloud protocols. These methods are most common amongst game practitioners and usually require formal playtest sessions in artificial play environments (Tan et al., 2012). Although these methods have been shown to usually reflect accurate states, they have several shortcomings. Firstly, they might inhibit true play experiences, as players might not be totally at ease when someone is watching or questioning them. Players might not be able to properly self-articulate their play experiences concurrently during gameplay and might not remember important details when post-game interviews are performed. Secondly, the sessions often require a lot of time and resources to conduct and analyze. Hence there is a need for more efficient, accurate and versatile (i.e., able to be conducted in non-laboratory settings) ways to perform player experience analysis.

These reasons have driven much research towards quantitative methods that work on objective data. Quantitative methods have the potential to represent true player experiences in the game and are able to continuously capture a more diverse body of information. Common approaches include telemetry and psychophysiology.

Telemetry primarily deals with the logging of player in-game interactions to build player models, and several studies have been performed (Zammitto et al., 2010; Medler et al., 2011; Moura et al., 2011; Gagne et al., 2011). The advantage of telemetry over qualitative methods is that it is non-disruptive and that it can continuously capture objective gameplay statistics in non-laboratory settings. However, the data is limited to the in-game actions available to the player and events in the game world. Hence these “virtual observations” do not capture full experiences and might not even represent the true experiences of the player in real life. For example, a player might take a long time to clear a level, yet be experiencing a high level of arousal in real life, be having fun exploring the level, or simply be stimulated by the aesthetics.

Psychophysiology is the other main branch of quantitative player experience research. It consists of methods to infer psychological states from physiological measurements, which commonly include electrodermal activity (EDA), electromyography (EMG), electrocardiogram (ECG), electroencephalography (EEG), body temperature and pupil dilations. Current work (Mandryk et al., 2006; Nacke and Lindley, 2008; Yannakakis and Hallam, 2009; Nacke et al., 2010; Zammitto et al., 2010; Drachen et al., 2010) mostly involves inferring emotional valence and arousal by employing a combination of these measurements. Amongst them, EDA and EMG seem to be the most popular, as they correspond accurately to the emotional dimensions of arousal and valence respectively (Russell, 1980). Similar to telemetry, physiological measurements are able to capture player experiences continuously in real time. In addition, physiological data represent the real-life experiences of the player. Unfortunately, most current approaches rely on expensive specialised equipment that is obtrusive and usually only viable in controlled laboratory settings. As such, we propose to investigate a video-based approach to capture data in a way that is more efficient, versatile, and does not affect natural gameplay.

2.3 Facial Expression Recognition

The first step in any facial expression analysis system is to recognise facial expressions, a fairly mature domain in computer vision with techniques that boast a high level of accuracy and robustness (Bartlett et al., 1999; Michel and El Kaliouby, 2003; Buenaposada et al., 2007; McDuff et al., 2011). For example, Buenaposada et al. (2007) have reported an 89% recognition accuracy in video sequences in unconstrained environments with strong changes in illumination and face locations.

In terms of using facial expression recognition for the analysis of user experiences, a limited number of studies have been performed in non-game applications (Branco, 2006; Zaman and Shrimpton-Smith, 2006). Branco (2006) showed some encouraging results evaluating positive and negative expressions of users of an online shopping website. Zaman and Shrimpton-Smith (2006) evaluated an automated facial expression analysis system to infer emotions that users had whilst performing common computer usage tasks. They generally reported a high level of correlation between the system’s findings and human expert analyses. In other domains, general emotion detection based on facial expression recognition (Ghijsen, 2004; Baltrusaitis et al., 2011) has also shown promising results.

In our research, we take the distinct focus of balancing the game’s challenge level by adapting the content that is placed within the game environment dependent on facial expression analysis. Particularly, we focus on procedural content generation (cf. Togelius et al. (2011); Yannakakis (2011)) for tailoring the player experience. Our distinct focus in this matter is to assess, online and unobtrusively, which game adaptations are required for optimizing the individual player’s experience while the game is being played, so as to have assessments of the experienced player challenge impact the procedural process (cf. Bakkes et al. (2014)).

Figure 2.1 shows the INSIGHT facial expression recognition SDK, which is designed to detect and classify seven basic emotions with a reported accuracy of 93.2%, alongside tracking of head pose and of head and eye gaze (Sightcorp, 2014).

Figure 2.1: The INSIGHT SDK, capable of estimating 7 basic emotions along with head pose, gaze and eye gaze metrics.

Chapter 3

Domain Description

We consider a typical video game: INFINITE MARIO BROS. (Persson, 2009), an open-source clone of the classic video game SUPER MARIO BROS. It can be regarded as an archetypal platform game; despite its relatively straightforward appearance it provides a diverse and challenging gameplay experience. We build upon a version of INFINITE MARIO BROS. that has been extended to procedurally generate entire Mario levels. These extensions have been made by Shaker et al. (2011, 2012, 2013), Pedersen et al. (2009b,a), and Togelius et al. (2010).

Figure 3.1: A snapshot of the INFINITE MARIO BROS. game, currently at the Tubes chunk at maximum difficulty level.

Figure 3.2: A snapshot of the INFINITE MARIO BROS. game, currently at the Cannons chunk at maximum difficulty level.

We have made two further enhancements to the 2011 Mario AI Championship game engine of INFINITE MARIO BROS. First, it is now able to procedurally generate segments of Mario levels while the game is in progress (Figure 3.6). Second, we can now inject short chunks of specific game content: (1) a straight chunk, containing enemies and jumpable blocks, (2) a hill chunk, also containing enemies, (3) a chunk with tubes, containing enemy plants, (4) a jump, and (5) a chunk with cannons. Each chunk can have six distinct implementations, stemming from a per-chunk parameter value ∈ [0, 5]. The challenge level of each chunk increases monotonically with the parameter value. In online gameplay, the only action that the personalisation algorithm can take is to output a vector of five integers (chunk parameters) ∈ [0, 5] to the procedural process, which in turn generates the next level segment. While the action space is relatively modest in size, its resulting expressiveness ranges from overly easy to exasperatingly hard level segments. In order to provide users with an unpredictable game experience, we randomise the sequence in which chunks are positioned inside a game segment. Figures 3.1 to 3.5 show snapshots of the various chunk types at their respective maximum difficulty level.
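The action space described above can be illustrated with a short Python sketch. It is not taken from the thesis code base: the names `CHUNK_TYPES`, `make_action` and `segment_order` are hypothetical, and clamping out-of-range values to [0, 5] is an assumption made here for illustration.

```python
import random

# Hypothetical labels for the five chunk types listed in the text.
CHUNK_TYPES = ("straight", "hill", "tubes", "jump", "cannons")
MIN_LEVEL, MAX_LEVEL = 0, 5  # per-chunk parameter range from the text


def make_action(levels):
    """Build the five-integer action, clamping each value to [0, 5].

    Clamping is an assumption of this sketch; the text only states that
    each chunk parameter lies in [0, 5].
    """
    return {
        chunk: max(MIN_LEVEL, min(MAX_LEVEL, int(levels[chunk])))
        for chunk in CHUNK_TYPES
    }


def segment_order(rng=random):
    """Return the chunk types in a random order for the next segment."""
    order = list(CHUNK_TYPES)
    rng.shuffle(order)
    return order
```

For example, `make_action({"straight": 7, "hill": -1, "tubes": 3, "jump": 0, "cannons": 5})` keeps the in-range values and clamps the other two, and `segment_order()` returns a permutation of the five chunk types.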

Figure 3.3: A snapshot of the INFINITE MARIO BROS. game, currently at the Jump chunk at maximum difficulty level.

Figure 3.4: A snapshot of the INFINITE MARIO BROS. game, currently at the Straight chunk at maximum difficulty level.

Figure 3.5: A snapshot of the INFINITE MARIO BROS. game, currently at the Hills chunk at maximum difficulty level.

Figure 3.6: Our enhanced version of INFINITE MARIO BROS. During gameplay it generates short new level segments of specific content on-the-fly, on the basis of classifications of the facial expression.

Chapter 4

Personalised Gaming via Facial Expression Recognition

The goal of this approach is to generate game spaces (i.e., levels) online, such that the spaces optimise player challenge for the individual player. To this end, a key insight is that game personalisation techniques can leverage novel computer vision-based techniques to unobtrusively infer player experiences automatically based on facial expression analysis. We perform emotion tracking with the established INSIGHT facial expression recognition SDK (Sightcorp, 2014), and gradient ascent optimisation of the individual game experience.

Below, we will analyse how Emotion Tracking is performed, and how the data gathered is fed into the Gradient Ascent Optimisation (GAO) algorithm, where game adaptations take place. Finally, pairwise tests and their results are presented, providing an overall assessment of the system.

4.1 Emotion Tracking

In our approach, player emotions are tracked with the INSIGHT facial expression recognition SDK (Sightcorp, 2014) throughout the duration of a game session, yet they are taken into account in real time and are chunk-specific. As such we are not measuring, e.g., general happiness, but instead can map (the parameters that generated) specific game content to specific affective states. We hereby assume that the classification probability of an affective stance indicates how strongly it is expressed by the player.

INSIGHT classifies facial expressions at approximately 15 frames per second. For each frame, it outputs a probability distribution over seven distinct emotions, namely (1) neutrality, (2) happiness, (3) disgust, (4) anger, (5) fear, (6) sadness, and (7) surprise. Depending on the progress of the player through the Mario game, a game chunk is typically interacted with for 2 to 10 seconds, resulting in a total of 30 to 150 classifications for each game chunk separately. The resulting probability distributions are averaged at the end of each chunk into an estimate of a player’s emotional stance; this estimate is relatively insensitive to classification noise of the facial expression system (which may occur in individual frames). INSIGHT has an average accuracy of 93.2% over all classified emotions (Sightcorp, 2014).
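The per-chunk averaging described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the thesis implementation: the function name `average_distribution`, the ordering of `EMOTIONS`, and the representation of each frame as a list of seven probabilities are all hypothetical, since the output format of the INSIGHT SDK is not specified here.

```python
# Assumed ordering of the seven emotions named in the text.
EMOTIONS = ("neutral", "happy", "disgust", "angry", "fear", "sad", "surprise")


def average_distribution(frames):
    """Average per-frame probability distributions over one chunk.

    `frames` is a list of per-frame distributions, each a list of seven
    probabilities in the order given by EMOTIONS.
    """
    if not frames:
        raise ValueError("no frames were classified for this chunk")
    n = len(frames)
    return [sum(frame[i] for frame in frames) / n for i in range(len(EMOTIONS))]


# At roughly 15 frames per second, a chunk played for 2 to 10 seconds
# yields about 30 to 150 frames to average over.
frames_min, frames_max = 15 * 2, 15 * 10
```

Because every frame contributes equally, a single misclassified frame among 30 or more changes each averaged probability by at most 1/30, which is one way to read the claim that the estimate is relatively insensitive to per-frame noise.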

1 This chapter is based on the following publication: Mavromoustakos-Blom, P., Bakkes, S., Tan, C., Whiteson, S., Roijers, D., Valenti, R., and Gevers, T. (2014). Towards personalised gaming via facial expression recognition. In Proceedings of the 2014 Artificial Intelligence and Interactive Digital Entertainment (AIIDE) conference.

There are two events at which assessments of the player’s affective state are used to adapt the game; namely (1) when the next level segment needs to be generated, and (2) when the game resets due to player death. To this end, we take into consideration not only player assessments made during actual play of the game, but also those made in between in-game deaths of the human player, as we observed that during this observational period many game players express high emotional activity. Furthermore, we particularly consider that, following our experience with the target domain, most game players tend to maintain a relatively neutral facial expression during gameplay, with most emotional ‘bursts’ occurring when human players experience an in-game death. Figure 4.1 supports this intuition; it illustrates that ‘neutral’ is the dominant affective stance, as measured for one player over the course of a gameplay session of approximately ten minutes, with bursts of anger, happiness, and sadness being measured as well.

Figure 4.1: Classifications of the facial expressions of one human participant, over the course of a ten-minute game playing session. We observe that the dominant affective stance is ‘neutral’.

During actual play of the first segment of each game session, we calculate the variance in classification of each emotional stance e, from which we derive a factor $\alpha = 1 - \mathrm{var}(e_1)$. This factor $\alpha$ is employed as a baseline for the gradient ascent algorithm; it aims at assessing the “emotional expressiveness” of each individual player. In our approach, the lower the variance in affective states (i.e., an observed player is not very expressive), the larger the gameplay adaptations to specific chunks when emotional bursts actually do occur (in either direction of intended challenge), considering that $\mathrm{var}(e_1)$ is higher when a user shows high emotionality levels. Factor $\alpha$ is then

4.2 Gradient Ascent Optimisation

To map classifications of the human player’s facial expressions to appropriate in-game challenge levels, we employ a Gradient Ascent Optimisation (GAO) technique. It is employed for optimising the challenge levels for each content type in the game (i.e., for each chunk) such that human interactions with the content yield affective stances that have positive valence (i.e., happiness), while minimising affective stances that have negative valence (i.e., neutrality and anger).

Our implementation of GAO is relatively straightforward (Algorithm 1). After a game segment has been completed by the human player, the probability-distribution vector of the measured emotional stances is retrieved for each individual chunk. The emotions taken into consideration for the present experiments are (1) neutrality, (2) happiness, and (3) anger; our preliminary trials with the Mario game suggested that these emotions were most likely to be expressed by human players (cf. Figure 4.1).

Algorithm 1 Facial Expression-based Gradient Ascent Optimisation

procedure GAOPTIMIZE(e_t, e_{t−1})    ▷ Emotion vectors of current and previous segment
    α ← (1 − Var(e_1))    ▷ Calculate α, scale to action space
    for each chunk do
        if playerDies(t) then
            φ ← round(5 ∗ α ∗ e_t[Anger])
            chunk.decreaseChallengeLevel(φ)
        else if segmentFinished(t) then
            if e_t[Neutral] > 0.8 ∗ α then
                chunk.increaseChallengeLevel(1)
            else
                ε ← max_e |e_t − e_{t−1}|    ▷ e denotes the emotion attaining the maximum
                nextAction ← round(ε ∗ α)
                if e ∈ {angry, neutral} then
                    nextAction ← −nextAction
                nextChallengeLevel ← previousChallengeLevel + nextAction
    return nextChallengeLevel

At each iteration of GAO, when a game segment is finished, the emotion vectors (of each individual chunk) of the recently finished segment $S_t$ and of the previously completed segment $S_{t-1}$ are fed into the algorithm. For each emotion that is taken into consideration (neutrality, happiness and anger), the difference between its value in the current iteration ($e_t$) and in the previous iteration ($e_{t-1}$) is obtained. Next, the maximum of the three differences is determined, namely $\max_e (e_t - e_{t-1})$. Since the three emotions are equally weighted, the maximum value calculated could be considered as the “most significant” change in emotional status of the user between two game segments. This value is used to determine the challenge level of the next segment, $S_{t+1}$. Since emotions are probabilistic estimates, their difference between two segments satisfies $D_e \in [-1, 1]$. In order to determine the next segment’s challenge level per chunk, $D_e$ has to be scaled up to the action space of the employed procedural level generator of the Mario game, namely $[0, 5]$. Thus, the action $a_t = \mathrm{round}(5 D_e)$, $a_t \in [-5, 5]$, is calculated and defines the change in challenge level that will be presented in the next segment, where negative values define a drop in challenge and positive values define a respective increase. To summarize, the challenge level of a chunk in the next segment will be $d_{S_{t+1}} = d_{S_t} + a_t$. This calculation will be individually applied to all chunks within a

estimate of an emotion is higher in timestep $t$ compared to $t - 1$. However, we condition on which emotion is the one defining $a_t$, for the reason that an increase in “negative” emotions (neutrality and anger) should generate a decrease in game difficulty. That is why in these cases, we consider $a_t$ to be $-a_t$.
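The update rule described in this section can be sketched in Python as follows. This is a hedged illustration rather than the thesis implementation: the names `gao_action` and `next_level`, the dictionary representation of the emotion vectors, and the clamping of the result to [0, 5] are assumptions. The sketch follows the prose (signed differences, scaled by 5) and leaves out the factor α and the heuristics.

```python
EMOTIONS = ("neutral", "happy", "angry")  # emotions considered by GAO
NEGATIVE = {"neutral", "angry"}           # treated as negative in the text
MAX_LEVEL = 5                             # upper bound of the action space


def gao_action(e_t, e_prev):
    """Return the signed change in challenge level for one chunk.

    `e_t` and `e_prev` map emotion names to averaged probabilities for
    the current and the previous segment (assumed representation).
    """
    diffs = {name: e_t[name] - e_prev[name] for name in EMOTIONS}
    # Emotion with the largest change between the two segments.
    dominant = max(EMOTIONS, key=lambda name: diffs[name])
    # Scale the difference to the action space. Python's round() rounds
    # halves to the nearest even integer; the thesis does not say which
    # rounding mode it uses.
    action = round(MAX_LEVEL * diffs[dominant])
    # A rise in a negative emotion should lower the challenge level.
    if dominant in NEGATIVE:
        action = -action
    return action


def next_level(level, action):
    """Apply the action and keep the result in [0, 5] (assumed clamp)."""
    return max(0, min(MAX_LEVEL, level + action))
```

For instance, if happiness rises from 0.1 to 0.5 while neutrality falls and anger stays level, `gao_action` returns +2; if anger rises from 0.1 to 0.7, it returns -3.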

In order to tailor GAO to the specific target domain, heuristic values are introduced for special occasions; all heuristic values follow from experimentation. Generally, as mentioned, users tend to show highly neutral expressions during gameplay, especially in gameplay settings of low challenge level. In order to prevent “stalling” the game at a certain challenge level due to a lack of expressed emotionality, we introduce a heuristic threshold τ = 0.8α. The threshold is derived from our observations on player behaviour in the Mario game (Figure 4.1). If, by the end of a game segment, the level of neutrality of a player during a chunk was higher than the threshold τ, the level generator will force an increase in challenge by a unit measure (+1) in the next segment’s respective chunk. This heuristic corresponds to the insight that the possibility of failure (and the positive affect that is provided by overcoming an obstacle) is an important factor in an appropriate game experience (Juul, 2013).

On the other hand, lasting, excessively high challenge levels may impose an unpleasant experience on game players. In order to avoid player abandonment resulting from an inappropriately high challenge level, a second heuristic is applied to emotions observed during in-game death. A threshold $\varphi = 5\alpha \times \varepsilon_{\mathrm{anger}}$ is introduced regarding the anger measurement during death. The chunk in which death happened will instantly drop by $\mathrm{round}(\varphi)$ units of challenge level in an attempt to reduce player anger and boost player progress in the game. Note that $\varepsilon_{\mathrm{anger}} \in [0, 1]$ is multiplied by 5 in order to directly map the emotion probability scale onto the game challenge scale.
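The two heuristics above can be sketched in Python as follows. The function names and signatures are hypothetical and serve illustration only; the constants 0.8 and 5 come from the text, while the use of a strict comparison against the threshold is an assumption.

```python
def neutral_stall_adjustment(neutral_level, alpha, factor=0.8):
    """Return +1 if neutrality during a chunk exceeded tau = 0.8 * alpha.

    Otherwise return 0, meaning the regular gradient step applies.
    """
    tau = factor * alpha
    return 1 if neutral_level > tau else 0


def death_adjustment(anger_level, alpha, max_level=5):
    """Return the change in challenge level after an in-game death.

    The drop is round(phi) with phi = 5 * alpha * anger, so the result
    is zero or negative for anger_level in [0, 1] and alpha in [0, 1].
    """
    phi = max_level * alpha * anger_level
    return -round(phi)
```

With α = 1.0, a chunk played at a neutrality level of 0.9 is pushed up by one unit, and a death with an anger estimate of 0.6 lowers the chunk by three units. A less expressive measurement scale (smaller α) lowers both the stall threshold and the size of the drop.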

Participant   Easy     Normal   Hard
1             P        P        S
2             P        P        N
3             S        P        P
4             P        P        N
5             P        P        P
6             P        P        S
7             P        S        S
8             S        S        N
9             P        S        P
10            S        P        P
Totals        70% P    70% P    40% P
              30% S    30% S    30% S
              0% B     0% B     0% B
              0% N     0% N     30% N

Table 4.1: Pairwise preferences of participants on the first approach, per initial challenge level. The legend is as follows: ‘P’ indicates a preference for the personalised system, ‘S’ indicates a preference for the static system, ‘B’ indicates that both are preferred equally, and ‘N’ indicates that neither is preferred; both are


4.3 Experiments & Results

Here we discuss the experiments that validate our approach in the actual video game INFINITE MARIO BROS.

4.3.1 Online personalisation – Pilot study

In the pilot study, we analyse the personalisation system’s performance by observing one human participant interact with the system under controlled experimental conditions. The participant is placed in a room with stable lighting conditions, and is instructed to interact with the personalised Mario game as she would at home, while attempting to refrain from blocking the face (e.g., by moving a hand through the hair, drinking coffee, etc.). The participant will interact with the game for ten minutes, starting at an initial challenge level of ‘easy’ (all parameter values being ‘1’). Our hypothesis is that when facial expressions can be classified accurately, our online personalisation method will converge to a challenge level that yields an appropriate affective state for the user.

Figures 4.2–4.7 illustrate the obtained results. For all chunks (Figures 4.3–4.7), we observe the general trend where the algorithm decreases the per-chunk challenge levels (Figure 4.7) in the face of user anger, and increases the challenge levels in the face of user neutrality or happiness. The online personalisation method thus operates as expected. For instance, Figure 4.2 reveals that the challenge level for the cannons chunk (Figure 4.7) is initially increased because of high neutrality levels. However, later in the game, high anger levels cause a drop in the challenge level. When the angry emotion finally disappears, the challenge level becomes stable as well. Furthermore, in Figure 4.3 we observe that the online personalisation method appears stable in the face of classification noise. That is, after approximately 1000 classified frames, the human player suddenly expresses a ‘mix’ of emotions, which in practice indicates that the player is talking or moving too much. As expected, the associated challenge level (see Figure 4.2) remains stable in the face of this noise from the facial expression classifier.


4.3.2 Online personalisation – Pairwise tests

In this experiment, we investigate how human participants experience the personalised game under actual game playing conditions, in comparison with a realistic (baseline) static game.² To this end, in accordance with procedures employed by Shaker et al. (2011), we query for pairwise preferences (i.e., “is system A preferred over system B?”), a methodology with numerous advantages over rating-based questionnaires (e.g., no significant order of reporting effects) (Yannakakis and Hallam, 2011). We perform pairwise tests of a static system s, with a fixed difficulty level, and a personalised system p. The experiment follows a within-subjects design composed of two randomised conditions (first s then p, or inversely), each condition consisting of a series of three sequentially performed pairwise tests, in randomised order. A pairwise test compares the static system with the personalised system, both starting at one of the three available challenge levels (easy, normal, or hard).

The experiment is performed by ten human participants. To minimise the impact of user fatigue on the experimental results, each of the three game-playing sessions is ended after a maximum of 4 level segments (i.e., approximately three minutes of play). After completing a pair of two games, we query the participant’s preference through a 4-alternative forced choice (4-AFC) questionnaire protocol (i.e., s is preferred to p, p is preferred to s, both are preferred equally, or neither is preferred; both are equally unpreferred). The question presented to the participant is: “For which game did you find the challenge level more appropriate?”.

Figure 4.3: Facial expressions in Straight chunk

² The static levels are built from chunks of a predetermined challenge level, in random order of occurrence for each new segment, so as to ensure variation and playability. Given this randomisation, the experimental trials are sufficiently short to prevent players from easily noticing possible chunk repetitions.


Figure 4.4: Facial expressions in Hills chunk

Table 4.1 lists the pairwise preferences as reported by the human participants. The results reveal that when both gaming systems are set to an initial challenge level of ‘easy’, a significant majority (p = 0.037) of human participants prefers the personalised system over the static system (70% over 30%). Furthermore, we observe that when both gaming systems are set to an initial challenge level of ‘normal’, a significant majority (p = 0.037) of human participants prefers the personalised system over the static system (also 70% over 30%). When both gaming systems are set to an initial challenge level of ‘hard’, a plurality (40%) of the human participants prefers the personalised system over the static system (30%), with the remaining 30% of the participants preferring neither; both are equally unpreferred.

From these results we may conclude that, generally, more human participants prefer the personalised system than the static system. When the initial challenge level is ‘easy’ or ‘normal’, this is a significant majority. When the initial challenge level is ‘hard’, it is a plurality (40% against 30%). We also conclude that facial expression analysis provides adequate information to perform effective game personalisation.
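The totals reported in Table 4.1 follow directly from the individual responses. The short sketch below recomputes them; the answers are transcribed from the table (participants 1 to 10, in order), while the data layout and the function name are our own illustrative choices.

```python
from collections import Counter

# Responses of participants 1..10 per initial challenge level (Table 4.1).
TABLE_4_1 = {
    "easy": "PPSPPPPSPS",
    "normal": "PPPPPPSSSP",
    "hard": "SNPNPSSNPP",
}


def preference_percentages(responses: str) -> dict:
    """Return the percentage of P, S, B and N answers in a response string."""
    counts = Counter(responses)
    total = len(responses)
    return {label: round(100 * counts[label] / total) for label in "PSBN"}


for level, responses in TABLE_4_1.items():
    print(level, preference_percentages(responses))
```

The same tally can be applied to any other 4-AFC question, since it only assumes the four answer labels.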


Figure 4.5: Facial expressions in Tubes chunk


Chapter 5

Personalised Gaming via Facial Expression Recognition and Head Pose Detection

The goal of the second approach is to extend the first one by introducing several new features to the game personalisation system. Observations made regarding the first approach’s results have led us to this second implementation, which aims at tackling the apparent issues of the first one, while also improving the efficiency of the system and further increasing user satisfaction. Below, we first point out the critical issues that this version of our system needs to address and then analyse the implementation details of the second approach. Finally, another set of pairwise tests is presented, along with the results obtained.

5.1 Issues observed in the first approach

Below, we point out the most important issues faced in the first approach of our system. Proposals to solve them are provided in the following sections of this chapter.

1. Gameplay conditions

The first evident issue with the first approach is that it is almost impossible to control the conditions under which the home user is playing a video game. Lighting conditions cannot be stable throughout the entire course of a game session, while users tend to use hand gestures, change their head pose or body position with respect to the computer screen, or talk while playing. Under these conditions the emotion estimations produced by INSIGHT can be inaccurate. As observed in Figure 4.3, sudden movement or a change in illumination can produce false estimations which are far from informative about the user’s true emotional state during that period and, consequently, lead GAO to wrong in-game adaptations. What is needed is a way of recognising the user’s emotional state even when the estimations of INSIGHT are likely to be false.

2. User emotion expression

Second, the results in Table 4.1 show a drop in the system’s effectiveness when users start a game session at the hard difficulty setting. In this case GAO is expected to adapt the game difficulty, that is, to decrease it, when the user expresses high levels of anger. However, 30% of the users who participated in the experiments were unable to complete a full session of either the static or the personalised version, due to poor system adaptations. We can assume that each user may express his/her anger or frustration towards the game in different ways, other than a simple facial expression. As a consequence, GAO cannot detect anger as an emotion and take the required actions towards game adaptation. We would like to develop a system which is able to detect user anger or frustration even when it is not conveyed by the user’s facial expressions.

In order to deal with both issues mentioned above, we have decided to expand the first approach in two ways. First, we retrieve more data from the INSIGHT SDK, which are expected to better describe the emotional state of the individual player. Second, we train a classifier on this ‘upgraded’ dataset, which we expect to produce an accurate estimate of the user’s emotional state under noisy conditions.

5.2 Solutions to the observed issues

The issues mentioned in the previous section define two main points where the second approach will be improved with respect to the first one. Below, we will discuss the possible solutions, and how we expect this new approach to improve the efficiency of our system.

1. Emotion tracking under noisy conditions

As a solution to the first issue mentioned in Section 5.1, we keep track of the entire emotion estimation vector output by INSIGHT. Since changes in illumination, player head movement and pose changes with respect to the computer screen can heavily affect the estimations of the SDK, we believe that tracking only three emotions (neutralness, anger and happiness) can lead to false in-game adaptations under noisy conditions.

Therefore, we have decided to use all seven basic emotion estimates in our new system. We have often observed how an ‘angry’ expression can be misjudged as ‘sad’ by the INSIGHT SDK when a player tilts his head forward during gameplay, even by a slight margin. In this case, our in-game adaptations based on user anger would be inappropriate. However, by keeping track of sadness along with anger, we could still calculate an accurate estimate of the player’s emotional state.

2. Alternative emotional state estimation

Although tracking all seven basic emotions could already enable more accurate emotional state estimation, we recognise the need to understand users’ interactions with the game even when these are not obvious through facial expressions. For example, a user tilting his head upwards after experiencing in-game death could be a sign of frustration. To address the second issue mentioned in Section 5.1, we have decided to enrich the data used by our system by adding head pose detection metrics, which the INSIGHT SDK is capable of measuring.

In more detail, the data tracked will now consist of a vector of emotions, alongside head pitch, roll & yaw. The above measurements are averaged over the course of a game segment, in the same way the vector of emotions is. By adding this expansion to our system, we believe our tracked data can be more descriptive in terms of emotional state of the individual user, even if he/she is moving with respect to the computer screen or changing his/her head pose, without explicitly describing his/her emotional state through a facial expression.

The above-mentioned improvements alone are not adequate given the way our system has been implemented. With three times as much data as the first implementation kept track of, expanding the mathematical calculations and defining new heuristic values would be a difficult task. For that reason, we introduce a classifier which receives the new dataset as input and undertakes the task of determining the in-game adaptations needed. In the next section, we discuss how the classifier has been implemented and what enhancements it brings to our system.

5.3 Improvements to the game adaptation system

In this section, we will discuss the core features of our second approach towards a game personalisation system. We will describe how we introduce the improvements mentioned in the previous section into this approach, and how we implement the classifier which will replace the mathematical calculations and heuristic values used in the first approach.

As mentioned before, the new dataset retrieved from INSIGHT is fed into a classifier, in order to create a platform which is expected to model individual human player behaviour and estimate their affective state online. The classification method chosen is a Random Forest Classifier (RFC), which enables us to retrieve a probability distribution over all possible output classes given an unknown input. Furthermore, its computational efficiency is considered to be appropriate for an online setting.

5.3.1 Classifier Features

An important task regarding our classification method is selecting the proper features on which the classifier will be trained. Given the need for online game adaptations, the classification should be as fast and as computationally cheap as possible, meaning that the number of features should be minimal while still informative.

However, while in the first approach we opted to keep track of three emotions only (neutralness, happiness and anger), we have now chosen to follow all seven emotions that INSIGHT provides us with. By doing so, we obtain a complete description of a player’s emotional state even if his/her movements or alterations in the surrounding environment cause false estimates. Accompanied by the other features, even noisy emotion estimates could produce an accurate prediction of the player’s emotional state.

Apart from the vector of emotions, we have added head pitch, roll & yaw as classifier features. We believe that head pose measurements can be of critical importance in this approach, because they can help translate noisy emotion estimates into accurate ones. For example, when a player is nodding forwards with respect to the computer screen, his/her eyebrows move downwards and thus INSIGHT would classify him/her as ‘angry’. However, given that his/her head pitch has changed, we can discard anger as his/her affective state.

Lastly, we have added two more features to improve the classifier’s efficiency: the current difficulty level, and a Likert estimate for each chunk within a segment. Tracking the current difficulty level can help discriminate spontaneous from consistent emotional bursts, assuming that harder difficulty levels can cause persistent frustration, while lower difficulty levels tend to be met with higher average neutralness by human players. The Likert estimate is the feature that takes part in the calculations performed for online game adaptations. It is an integer value in the span [1,2,...,5], with 1 meaning ‘too easy’, 5 meaning ‘too challenging’ and 3 representing ‘optimal challenge level’.
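The feature set described in this section can be assembled as in the following sketch. It is an illustration under stated assumptions, not the code of our system: the emotion labels are placeholder names for the seven INSIGHT estimates, the per-frame dictionary layout is hypothetical, and the function names are our own. The Likert estimate is kept separate from the features, as the class label of a training instance.

```python
from statistics import mean

# Assumed names for the seven emotion estimates and the head pose metrics.
EMOTIONS = ["neutral", "happy", "sad", "angry", "surprised", "scared", "disgusted"]
HEAD_POSE = ["pitch", "roll", "yaw"]


def segment_features(frames: list, difficulty: int) -> list:
    """Average the per-frame measurements and append the current difficulty.

    `frames` is a list of dicts holding one value per emotion and head pose
    metric; the result has 7 + 3 + 1 = 11 entries.
    """
    keys = EMOTIONS + HEAD_POSE
    features = [mean(frame[key] for frame in frames) for key in keys]
    features.append(float(difficulty))
    return features


def training_instance(frames: list, difficulty: int, likert: int) -> tuple:
    """Pair the feature vector with the player's Likert label (1..5)."""
    if likert not in (1, 2, 3, 4, 5):
        raise ValueError("Likert estimate must be an integer between 1 and 5")
    return segment_features(frames, difficulty), likert
```

Averaging keeps the feature vector at a fixed, small size regardless of how many frames a chunk contains, which suits the requirement of cheap online classification.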

5.3.2 Training

In order to train our RFC, we introduced human players to the personalised game, and asked them to finish 10 segments of INFINITE MARIO BROS. at each difficulty level ([1,2,...,5]). After completing each segment, the users had to manually determine the Likert estimate for each chunk separately. In total, we have created a training set of approximately 1250 instances, each labelled with a Likert estimate. We have selected both female and male players with varying skill levels to train the RFC, so as to include as much variety in our data as possible during the training phase.

Figure 5.1 illustrates the average Likert value assigned by the players participating in the training phase. One can observe that the average user tends to consider the hardest possible ‘Jump’ chunk (5) as approximately the optimal challenge level, while for most other chunks the optimal challenge level (a Likert value of 3) seems to lie between difficulty levels 3 and 4.

Figure 5.1: Average chunk-specific Likert preference during system training.

5.3.3 Testing

The testing phase represents the actual game adaptation mechanism, where unknown instances are acquired from human players and the expected output of the RFC is a probability distribution over the Likert estimate of the individual player’s affective state for each chunk. The RFC is fed an unknown instance at the end of each segment, or right after in-game death. The Likert probability distribution estimated online by the classifier is immediately used to perform game adaptations as described below.

Algorithm 2 Game Personalisation using a Random Forest Classifier

1: procedure RFCPERSONALISATION(In)    ▷ In: unknown instance
2:     P_L ← classifyInstance(In)
3:     for each chunk do
4:         E_likert ← Σ_i L[i] · P_L[i]
5:         E_normalised ← E_likert · 1.5 − 4.5
6:         E_normalised ← round(E_normalised)
7:         newDifficulty ← previousDifficulty − E_normalised
8:     return newDifficulty

5.3.4 Game Personalisation

Given the probability distribution $P_{L_i}$ over all possible Likert classes $L_i \in \{1, 2, 3, 4, 5\}$, we calculate a final Likert estimate value $E_{likert} = \sum_i L_i \cdot P_{L_i}$. Using this value, game adaptations take place by adjusting the game difficulty for the next game segment, for each chunk individually. However, before actually calculating the next game difficulty setting, we normalise $E_{likert}$ to calculate $E_{normalised} = E_{likert} \cdot 1.5 - 4.5$. This normalisation adjusts the minimum and maximum increase/decrease applied to game difficulty to lie in the span $[-3, 3]$. By doing this, we avoid increasing/decreasing game difficulty by extreme values (-5, -4, +4, +5) when the Likert estimate is close to its limits (1 or 5), an adaptation which we consider too steep to take in one single step of the personalisation algorithm. The maximum achievable increase/decrease is now $previousDifficulty \pm 3$, which is a more realistic approach. Algorithm 2 contains the individual steps taken towards online game adaptations.
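The adaptation step can be sketched in a few lines of Python. This is a minimal sketch under explicit assumptions: the Likert estimate is taken as the expected value of the class distribution, halves are rounded upwards, and the result is clamped to the difficulty range [1, 5]. The clamping and the function names are illustrative additions and are not part of Algorithm 2.

```python
import math

LIKERT_CLASSES = [1, 2, 3, 4, 5]


def expected_likert(probabilities: list) -> float:
    """Expected Likert value under the classifier's class distribution."""
    return sum(l * p for l, p in zip(LIKERT_CLASSES, probabilities))


def next_difficulty(previous: int, probabilities: list,
                    lowest: int = 1, highest: int = 5) -> int:
    """Compute the next difficulty of one chunk from the Likert distribution."""
    e_likert = expected_likert(probabilities)
    e_normalised = e_likert * 1.5 - 4.5            # maps [1, 5] onto [-3, 3]
    step = int(math.floor(e_normalised + 0.5))     # assumed half-up rounding
    candidate = previous - step                    # 'too easy' raises difficulty
    return max(lowest, min(highest, candidate))    # assumed clamping to [1, 5]
```

For example, a distribution concentrated on the class ‘too easy’ (1) raises a chunk from difficulty 1 to 4, whereas a distribution concentrated on ‘optimal’ (3) leaves the difficulty unchanged.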

5.4 Experiments & Results

In this section we will discuss the experiments that took place regarding the second approach. First, we will describe the setup and results of the pilot study and second, we will present and analyse the results of the pairwise tests that have taken place.

5.4.1 Online personalisation – Pilot study

We ran a pilot study in order to demonstrate how head movement can affect the emotion estimates calculated by INSIGHT. Figures 5.2 and 5.3 refer to the same INFINITE MARIO BROS. session, and illustrate how strongly these two factors are correlated. Steep variations, detected mostly in pitch and roll movement, can cause emotional ‘pseudo-bursts’ which do not mirror the true emotional state of the user. For example, one can see how roll variation in Figure 5.2 causes spikes of surprise in Figure 5.3, while positive values of head pitch seem to cause an increase in the sadness estimate. From this, we conclude that tracking head pose metrics can enhance our classifier’s efficiency, given the effect head pose has on emotion estimations.

However, as mentioned before, by keeping track of both emotions and head movement of the user, we are aiming at training a system that will be able to predict an accurate estimate of the user’s affective state even through ’noisy’ emotion analysis.


Figure 5.2: Pitch, roll and yaw tracking during an INFINITE MARIO BROS. session.

     Most challenging   Most immersive   Most frustrating
P    71%                57%              67%
S    29%                23%              33%
N    0%                 4%               0%
B    0%                 16%              0%

Table 5.1: Participants’ preferences starting at ‘normal’ game difficulty settings. The legend is as follows: ‘P’ indicates a preference for the personalised system, ‘S’ indicates a preference for the static system, ‘B’ indicates that both are preferred equally, and ‘N’ indicates that neither is preferred; both are equally unpreferred.

5.4.2 Online personalisation – Pairwise tests

In order to assess the second approach we ran a set of pairwise tests, in which human participants were asked to complete three segments of two different versions of INFINITE MARIO BROS.: a static (baseline) version versus the second personalised version. After completing both tasks at three distinct difficulty levels (easy, normal, hard), the participants were asked to answer the following questions:

• For which game did you find the challenge level more appropriate?
• Which game did you find more challenging?
• Which game did you find more immersive?
• Which game did you find more frustrating?


Figure 5.3: Emotion tracking during an INFINITE MARIO BROS. session.

     Most challenging   Most immersive   Most frustrating
P    55%                39%              0%
S    33%                39%              83%
N    0%                 11%              0%
B    12%                11%              17%

Table 5.2: Participants’ preferences starting at ‘hard’ game difficulty settings. The legend is as follows: ‘P’ indicates a preference for the personalised system, ‘S’ indicates a preference for the static system, ‘B’ indicates that both are preferred equally, and ‘N’ indicates that neither is preferred; both are equally unpreferred.

A total of 25 human players participated in this experiment. The answers available to them again follow the 4-AFC protocol, as in the experiments on the first approach, meaning that participants could choose P over S, S over P, both equally preferred, or both equally unpreferred. The experiment is similar to the one run for the first approach; however, we considered it appropriate to introduce three more questions, which give us further insight into the participants’ affective state and decision making during gameplay.

As Table 5.6 illustrates, 72% of the participants preferred the personalised version over the static one (20%) when starting at an ‘easy’ game difficulty setting, with the remaining 8% preferring neither version. When starting at ‘normal’ game difficulty levels, 64% of the participants preferred the personalised version over the static one (20%), with 12% preferring both versions equally and 4% preferring neither of the two. Lastly, when starting at ‘hard’ game difficulty levels, 44% of the participants preferred the personalised version, against 12% who preferred the static version, 8% who preferred both versions equally and 36% who preferred neither version.

     Most challenging   Most immersive   Most frustrating
P    95%                100%             75%
S    5%                 0%               25%
N    0%                 0%               0%
B    0%                 0%               0%

Table 5.3: Participants’ preferences starting at ‘easy’ game difficulty settings. The legend is as follows: ‘P’ indicates a preference for the personalised system, ‘S’ indicates a preference for the static system, ‘B’ indicates that both are preferred equally, and ‘N’ indicates that neither is preferred; both are equally unpreferred.

Result      Male   Female   Average hours spent gaming/week
Abandoned   40%    60%      6.5
Finished    90%    10%      23

Table 5.4: Demographics on user abandonment regarding the second approach. ‘Abandoned’ means that the participant was unable to complete all the requested (3) segments on all difficulties, while ‘Finished’ means that the participant successfully completed the entirety of the experiment.

What we observe from the results of this pairwise test is a preference of the participants for the personalised version of the game in all three settings. Again, as in the first approach, human players seem to favour a personalised gaming experience over a static one. However, even though the majority is statistically significant, one can observe a portion of the participants choosing neither of the two versions when starting at hard game difficulty. This implies that our system adapts game difficulty efficiently at easy and normal starting difficulty settings, but drops in performance during hard difficulty game sessions. We could assume that user anger, which should be the main adaptation factor in this case, might not be modelled correctly in some cases, or that the system should be fine-tuned so as to bring about sharper in-game adaptations when anger is detected.

A significance test was run on the above-mentioned results, and the majority was found to be significant in all three cases (starting at easy, normal or hard), with p-values of 0.00011, 0.00082 and 0.00587 respectively.

In order to be able to explain the high percentage of participants choosing neither of the two versions when starting at ’hard’ game difficulty levels, we have analysed the demographic information they have provided us with.

Tables 5.1–5.3 list the answers given by the participants to the secondary questions (see Section 5.4.2). As one can observe, when starting at easy or normal game difficulty settings, the personalised version is consistently considered to be the most challenging and immersive, but also the most frustrating version. However, when starting at hard game difficulty settings, the personalised version is equally preferred to the static one as ‘most immersive’, while still being considered the most challenging one. An important observation in this case is that the static version dominates over the personalised one as ‘most frustrating’, with a percentage of 83% over 0%. We believe this phenomenon could derive from the fact that the personalised version would most of the time adapt to the player’s frustration and decrease game difficulty, whereas the static version is not designed to adapt and thus maximises potential player frustration.

Version        Difficulty setting   Male   Female   Average hours spent gaming/week
Static         Easy                 100%   0%       0
               Normal               50%    50%      5
               Hard                 40%    60%      8
Personalised   Easy                 100%   0%       7.5
               Normal               60%    40%      5
               Hard                 35%    65%      6.5

Table 5.5: Demographics on user abandonment regarding the second approach. Comparison between the Personalised and Static version of the game regarding users who abandoned the experiment.

As shown in Tables 5.4 and 5.5, users who abandoned the game at any game difficulty setting spend markedly fewer hours gaming per week, on average, than users who were able to complete three segments of the game at all game difficulty levels. As a consequence, we could state that our system is partially dependent on player skill level, although game abandonment could occur for reasons other than lack of player skill. Another important observation is that the majority of participants who abandoned the game at easy and normal difficulty levels are male, while at hard difficulty levels female participants tend to abandon the game more often. From this, we could conclude that our system detects user frustration more accurately in male players, and thus adapts hard difficulty game sessions more efficiently for them, preventing user abandonment.

In conclusion, a significant majority of the participants has consistently preferred the personalised version over the classical (static) version of the game, with a slightly higher percentage of user preference than in the first approach. Also, regarding system convergence, the results have shown that this second approach can achieve accurate adaptation to the individual human player faster than the first approach. Lastly, we can conclude that the second approach has led to a system which is more tolerant to noise in facial expression analysis data.


Participant   Easy    Normal   Hard
1             P       P        N
2             P       P        P
3             S       P        P
4             P       P        N
5             P       B        S
6             P       P        P
7             P       P        N
8             P       S        N
9             P       P        P
10            N       N        N
11            P       P        P
12            P       P        P
13            S       S        P
14            S       N        N
15            P       P        P
16            S       P        P
17            P       S        B
18            P       S        N
19            P       P        S
20            P       P        N
21            P       P        P
22            P       P        P
23            P       P        S
24            S       S        N
25            N       N        N
Totals        72% P   64% P    44% P
              20% S   20% S    12% S
              0% B    12% B    8% B
              8% N    4% N     36% N

Table 5.6: Pairwise preferences of participants on the second approach, per initial challenge level. The legend is as follows: ‘P’ indicates a preference for the personalised system, ‘S’ indicates a preference for the static system, ‘B’ indicates that both are preferred equally, and ‘N’ indicates that neither is preferred; both are equally unpreferred.


Chapter 6

Discussion

In this chapter we discuss the results obtained from our experiments for each approach individually, and also compare the two approaches.

6.1 Adapting to the individual human player

The main point of research regarding our system is how it adapts to the individual human player’s emotional state. The potential of adapting a video game can lead to novel in-game experiences which the user may not have witnessed before.

However, there are some critical points where an adaptive system could falter and not fulfil its purpose. For example, we would like the first personalised version to respond to player frustration (anger) and reduce the in-game difficulty when it is expressed, but in fact a user’s temporary anger does not automatically imply the need to reduce game difficulty. In other words, the game should not be excessively lenient towards human players by reducing game difficulty sharply and thereby causing the challenge experienced by the player to drop. Smooth in-game adaptations are considered optimal in this case.

In Figures 6.1 and 6.2 we illustrate how the first and second versions of our system adapt the game to the same human player, when starting at hard in-game difficulty settings in a four-segment game session. The second version favours smoother game difficulty adaptations, but does not decrease difficulty below level 3. The cause might be that the majority of users who contributed to the second system’s training set placed the optimal difficulty settings between difficulty levels 3 and 4 (see Figure 5.1). On the other hand, the first system allows steeper adaptations in game difficulty, which might not be optimal but can lead to lower in-game difficulty (between 1 and 2), which might be preferred by this particular user.

We believe that both systems could be improved in order to provide users with more realistic adaptations. Primarily, the first version’s parameters (see Algorithm 1’s φ factor) could be tuned so as to smooth out the steep adaptations observed and retain high user challenge levels. Moreover, the second version could be trained further by enriching the training set with more instances acquired from human players, creating the opportunity to model players whose preferred difficulty settings do not lie in the range [3, 4].

6.2 Difficulty setting convergence

Another interesting point of this research work is whether (and how) in-game difficulty settings converge to the appropriate levels for the individual human player. Generally, human players tend to gain game skills throughout a game session; nevertheless, our system should be able to adapt immediately.

Figure 6.1: Game difficulty adaptation in the first version (4 segments) starting at hard difficulty settings.

Our pilot study regarding the first system (see Figure 4.2) has shown how it can cause difficulty setting convergence for the individual human player in the course of a 10-segment game session. However, we believe that convergence could be achieved faster if our system could determine and apply a model for the individual player given the very first segment’s facial expression analysis.

In Figures 6.3 and 6.4 we can observe how the first and second versions adapt in-game difficulty for the same individual human player in a four-segment session when starting at easy game difficulty. Comparing the two graphs, it is apparent that the second version converges to the appropriate game difficulty setup after approximately 12 iterations of the algorithm, whereas the first system has not managed to converge to the optimal setup in the same time. The latter means that either the user’s emotions are not yet stable across consecutive segments, or the user’s neutralness levels are still high. However, the difficulty settings determined by the first system by the end of the session are heading towards the setup the second version has converged to.

We could improve the first version’s convergence speed in various ways. First of all, we could lower the neutralness threshold used as a heuristic (see Algorithm 1) so that the algorithm allows a faster increase in game difficulty when the player reaches or exceeds that threshold. Secondly, when the aforementioned threshold is exceeded, the game has been designed to increase its difficulty by a minimum amount (1 level). This calculation could be altered so that the difficulty increase derives from the ’distance’ of the player’s actual neutralness level from the threshold: a larger distance would cause a sharper increase in difficulty.
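The distance-based increase proposed above could look roughly as follows. This is a minimal sketch under stated assumptions, not the calculation used in Algorithm 1: we assume neutralness is a score in [0, 1], and the `gain` and `max_step` parameters are hypothetical tuning knobs introduced only for illustration.

```python
import math


def difficulty_increase(
    neutralness: float,
    threshold: float,
    gain: float = 10.0,
    max_step: int = 3,
) -> int:
    """Return how many difficulty levels to add after a segment.

    Below the threshold no increase is applied. At the threshold the
    increase is the minimum of 1 level, and it grows with the distance
    of the neutralness level above the threshold, capped at `max_step`.
    `gain` and `max_step` are assumed values, not taken from the thesis.
    """
    if neutralness < threshold:
        return 0
    distance = neutralness - threshold
    step = 1 + math.floor(gain * distance)
    return min(step, max_step)


print(difficulty_increase(0.5, threshold=0.5))   # at the threshold: minimum step
print(difficulty_increase(0.75, threshold=0.5))  # further above: larger step
```

Capping the step is a design choice of this sketch: it keeps a single very neutral segment from causing the kind of steep adaptation discussed earlier in this chapter.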

Figure 6.2: Game difficulty adaptation in the second version (4 segments) starting at hard difficulty settings.

6.3 Computational efficiency

Given the online setting, an important factor is the computational cost and speed of each system’s calculations. The cost should be minimised in order to allow computationally ’undetected’ adaptations between game segments.

Comparing the two versions of the game personalisation system, one can observe a difference in algorithmic speed. Table 6.1 shows that the second version requires 6 times as long as the first to calculate the appropriate game difficulty adaptations between two consecutive game segments.

This difference derives from the fact that the second version uses a trained classifier to predict the Likert value of the current difficulty setup for each user, which is then used to calculate the next difficulty setup, whereas the first version only requires raw facial expression analysis data to perform the same calculations directly. Algorithmically, retrieving an estimate from a trained classifier such as the RFC is computationally more expensive than applying simple mathematical operations using heuristics. However, the second system is still considered capable of performing online in-game adaptations.

System    Average adaptation time (ms)
First     5
Second    30

Table 6.1: Comparison between the first and second version’s difficulty adaptation calculation time between two consecutive segments.
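Averages such as those in Table 6.1 can be obtained with a simple timing harness. The sketch below is an assumption about how such a measurement could be set up, not a description of how the reported figures were obtained; the function name and the stand-in adaptation routine are hypothetical.

```python
import time
from typing import Callable


def average_call_time_ms(adapt: Callable[[], object], repetitions: int = 100) -> float:
    """Return the average wall-clock time of one call to `adapt`, in ms.

    `adapt` stands for one difficulty adaptation calculation between two
    consecutive segments (hypothetical placeholder).
    """
    if repetitions <= 0:
        raise ValueError("repetitions must be positive")
    start = time.perf_counter()
    for _ in range(repetitions):
        adapt()
    elapsed_seconds = time.perf_counter() - start
    return 1000.0 * elapsed_seconds / repetitions


def dummy_adaptation() -> int:
    # Stand-in workload; a real measurement would call the actual system.
    return sum(range(1000))


print(f"{average_call_time_ms(dummy_adaptation):.3f} ms per adaptation")
```

Averaging over many repetitions reduces the influence of timer resolution and of one-off delays on the reported figure.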

Figure 6.3: Game difficulty adaptation in the first version (4 segments) starting at easy difficulty settings.

Figure 6.4: Game difficulty adaptation in the second version (4 segments) starting at easy difficulty settings.

Chapter 7

Conclusion

In this chapter, we provide our answers to the research questions posed in Section 1.1. Moreover, we propose suggestions for future work that would expand the current system.

7.1 Answers to the research questions

In this section we will present our answers to the research questions posed in this thesis:

1. Can we provide online and unobtrusive game personalisation based solely on facial expression analysis?

Both approaches taken have shown that online and unobtrusive game personalisation is feasible based solely on facial expression analysis. The amount of data used has been minimal, as we calculate average emotion estimates over the course of a game chunk. In addition, simple mathematical calculations have been used to perform in-game adaptations, allowing this procedure to take place during actual gameplay. Lastly, participants in our experiments were never asked for any explicit feedback during gameplay, which is why our system can be considered unobtrusive.

2. Can online and unobtrusive game personalisation - based on facial expression analysis - be improved by employing head pose detection?

Looking at Tables 4.1 and 5.6, we can observe that introducing head pose detection slightly increases users’ preference for the personalised version of the game over the classical one. Moreover, as illustrated in Figures 6.3 and 6.4, using head pose detection also accelerates the system’s convergence to the appropriate game difficulty setting. Taking these two facts into consideration, we can conclude that enriching our dataset with head pose detection metrics was a step towards system improvement.
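The averaging of emotion estimates over a game chunk, mentioned in the answer to the first research question, can be sketched as follows. This is an illustrative assumption about the data layout: each frame is taken to be a mapping from an emotion label to a score, and the labels and scores used here are made up and do not come from the facial expression recognition SDK.

```python
from typing import Dict, Iterable


def average_emotions(frames: Iterable[Dict[str, float]]) -> Dict[str, float]:
    """Average per-frame emotion scores over one game chunk.

    An emotion missing from a frame contributes 0 for that frame.
    Returns an empty dict when the chunk contains no frames.
    """
    totals: Dict[str, float] = {}
    frame_count = 0
    for frame in frames:
        frame_count += 1
        for emotion, score in frame.items():
            totals[emotion] = totals.get(emotion, 0.0) + score
    if frame_count == 0:
        return {}
    return {emotion: total / frame_count for emotion, total in totals.items()}


# Made-up frame scores, for illustration only.
chunk = [
    {"neutral": 0.5, "happy": 0.25},
    {"neutral": 1.0, "happy": 0.75},
]
print(average_emotions(chunk))
```

Reducing a whole chunk to one small dictionary of averages is what keeps the amount of data handled between segments minimal.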

7.2 General Conclusion

In conclusion, user studies that validated the game personalisation methods in the actual video game INFINITE MARIO BROS. revealed that the first method provides an effective basis for converging to an appropriate affective state for the individual human player. Furthermore, with the second method the approach became more robust to noise, while it also achieved faster convergence compared to the first method. As such, we may draw the overall conclusion that online and unobtrusive game personalisation is feasible using facial expression analysis alone, while head pose detection can contribute to an even more effective game adaptation mechanism.

7.3 Future Work

Below, we will discuss several key fields of our research that could be the core of future work regarding this system.

1. Training set

First of all, in order to further improve the efficiency of the RFC used in the second system, we would aim to enlarge the training dataset. This could be done by gathering more facial expression analysis data from users of various ages, sexes and skill levels, which would then be added to the existing dataset and bring even more variety to the possible player models.

2. Framework adaptability

We would also like to fully parameterise our current system so as to create an independent framework which could be ’attached’ to an unknown video game and, given the correct input by game designers, automatically determine the adaptations needed to maximise player satisfaction and challenge level.

3. Metrics

Lastly, we could further expand the input data extracted from human players during gameplay, by using devices such as KINECT or by keeping track of the player’s heart rate through a Webcam Pulse Monitor (WPM). A richer set of input data could provide users with more effective game personalisation and even more variety in game experience.
