Response of Bartle types to a menial task gamified using points and levels

Academic year: 2021


Mentor Palokaj

Universiteit van Amsterdam Spui 21

1012 WX Amsterdam +31 (0)20 525 9111

mentor@palokaj.co

ABSTRACT

The Bartle types (Bartle, 1996) are a set of player types that stem from Multi User Dungeon games. The Bartle test has since been applied far outside its original context, in the fields of game design and gamification. This study examined the effect that Bartle types have on susceptibility to the gamification intervention of awarding points, in order to determine whether the Bartle types are a useful tool in designing games and gamification interventions.

A system was built on a web technology stack (Bearnes, 2016) that let participants perform a task inspired by the Stroop test (Stroop, 1935) and randomly assigned the gamification intervention.

The overall population gamified with a points reward system showed a 34% (p ≈ 0.015) difference in engagement, as measured by the voluntarily continued duration of the task. The interaction levels analyzed per Bartle type yielded a 20% difference for Explorers (p ≈ 0.028) and a 47% difference for Socializers (p ≈ 0.015). The Killer and Achiever samples did not show a statistically significant difference.

The population on the ‘interacting with’ axis (combined Explorer and Socializer populations) showed an overall mean difference of 53% (p ≈ 0.001). The types on the ‘acting on’ axis (combined Killer and Achiever populations) showed a 28% difference (p ≈ 0.002), but only in a normalized data set from which outliers were removed. The results indicate that points and levels are an intervention that can be used to increase engagement without alienating, and thus decreasing the engagement of, Bartle types not traditionally thought to respond to points and levels. Further research is needed to determine if the same can be said for the response of Bartle types to other forms of gamification.

In addition, Explorer and Socializer types on the ‘interacting with’ axis of the matrix have a greater response to points and levels than the types on the ‘acting on’ axis. This challenges the idea that, in game and gamification design, all Bartle types need to be addressed with separate game mechanics.

Categories and Subject Descriptors

K.8.0 [General]: Games; H.1.2 [User/Machine Systems]: software psychology;

General Terms

Experimentation, Human Factors.

Keywords

Bartle, gamification, game design.

1. Introduction

This research project tests whether the Bartle types differ in how strongly they engage with a task, relative to the overall population, when this task is gamified with an Achiever-targeted intervention, specifically reward through experience points and levels.

The Achiever type was the focus of this research project since it is stereotyped as having a predisposition to respond well to points and levels as a game mechanic. This mechanic is interesting since it can be widely applied in commerce, education and other fields. Customer loyalty programs, for example, can be argued to appeal to Achiever types. Likewise, changing the education grading system from averages to points and levels may very well appeal better to this Bartle type.

In essence the goal is to examine whether the Bartle matrix can be used to divide subjects by magnitude of response to the gamification intervention.

1.1.1 Hypothesis and goals

The hypothesis is that overall subjects engage longer when gamified but that Achiever types stay engaged longest in a non-game task when motivated with a non-game mechanic geared towards their Bartle type, where the others show lesser effects.

One major concern is finding out not only whether Achievers respond well, but also whether the other types are alienated by this approach. Previously the possibility of changing the educational grading system to points and levels was mentioned. If this approach were to alienate one of the Bartle types, it would put a group of students at a disadvantage.

1.1.2 Methods: the gamified task

Specifically, subjects were asked to name the color of a printed sentence, a task inspired by the Stroop test (Stroop, 1935). This was gamified using experience points and levels, awarded for continued engagement. Subjects were asked to start this repetitive task and were allowed to quit whenever they desired. Engagement was measured as the amount of time subjects spent performing the task.

1.1.3 Pre-testing significance

The research software was tested against a conveniently selected group of participants to gauge whether any effects significant enough to research can be found, and to examine bias between types.

1.2 The Bartle test of gamer psychology

The Bartle test of gamer psychology classifies gamers as being part of one of four gamer types (Bartle, 1996). Each type has a preference for particular game dynamics.

1.2.1 The Bartle test groups players into four types

The Bartle types (figure 1) can be visualized in quadrants where:

• The vertical axis separates those who prefer interacting with the game world or with other players

• The horizontal axis separates those who prefer acting on something versus interacting with something

These types are not fully mutually exclusive. A player will have a dominant type and a lesser influence of other types.

Figure 1: The Bartle matrix (Kyatric, 2013)

These four types have different preferences in games, as illustrated by quotes from Bartle’s original paper on the matter (Bartle, 1996):

Table 1: Quotes describing Bartle types (Bartle, 1996)

Type Quote

Killers “Players use the tools provided by the game to cause distress to (or, in rare circumstances, to help) other players.”

Achievers “Players give themselves game-related goals, and vigorously set out to achieve them.”

Socializers “Players use the game's communicative facilities, and apply the role-playing that these engender”

Explorers “Players try to find out as much as they can about the virtual world”

Bartle describes a balanced MUD (multi-user dungeon) game as one that keeps the number of players in each category in balance over time.

1.2.2 The Bartle test was intended for MUD players

MUD (multi-user dungeon) games are real-time multiplayer game worlds that started out as text based (Bartle, 2004). They are the precursor to the now widespread game genre of Massively Multiplayer Online Role Playing Games, known as MMORPGs for short (Castronova, 2008; Stuart, 2007).

The Bartle test was created based on the paper "Hearts, Clubs, Diamonds, Spades: Players Who Suit MUDs" (Bartle, 1996) by MUD community members Erwin Andreasen and Brandon Downey (Andreasen, 2009; Bartle, 2004).

1.2.3 The test is often used in non-MUD fields

While intended for MUD games, the Bartle test has been popularized as a classification tool for gamers and people in general. It has been suggested as a tool to improve education (Edtechteacher, 2014), gamification (Thegwailo, 2015), gamification of education (Hanus & Fox, 2015) and many more.

1.2.4 Extending the Bartle test is criticized

When looking at the Bartle test as a MUD playing classifier it may appear limited. Instead it is often applied as a set of player archetypes. Bartle himself and others have criticized blind application of the test (Bartle, 2012; Kyatric, 2013).

Specifically, on gamification Bartle himself cautions against indiscriminately implementing the four Bartle types without considering the underlying principles. One might stimulate unintended behavior or alienate player types (Kyatric, 2013).

1.3 Gamification

Gamification is the application of game mechanics and game psychology in non-game contexts (Deterding, Dixon, Khaled, & Nacke, 2011; Huotari & Hamari, 2012).

1.3.1 Goal of gamification

There are a number of motivations to apply gamification to a task or interaction. Most of these aim to facilitate a form of behavioral change, such as engagement or motivation to complete a task (Hamari, 2013, 2015). Examples of such behavioral changes include applications in learning, productivity and realigning motivational patterns (Scott, Ghinea, & Arachchilage, 2014; Zichermann & Cunningham, 2011).

1.3.2 Case study: speed camera lottery

The latter can be exemplified by a speed camera in Sweden that creates a raffle that partially distributes the fines of all drivers violating the speed limit to one of those driving below the prescribed speed (Volkswagen, 2010).

Figure 2: the speed camera lottery (Volkswagen, 2010).

The behavioral changes during this experiment were significant:

• Average speed before: 32 km/h

• Average speed during: 25 km/h

This experiment used effective reward and competition mechanisms often found in games and turned sticking to the speed limit into a competition.

1.3.3 When is something a game?

One might argue that the above example is nothing more than the application of operant conditioning (Skinner, 1938), specifically positive reinforcement, where the subject is rewarded for a certain behavior, which should increase this behavior.

It has been proposed that human beings, especially when younger, use play as a way to explore the world (Hirsh-Pasek, Golinkoff, & Eyer, 2004). Indeed some believe it is an essential part of human culture and society (Huizinga, 1938).

In this paper, game design elements in gamification refer to psychological reward, punishment and conditioning, and the use thereof to induce a playful interaction. What sets gamified activities apart from play and games as such is that play is said to be “not serious” and “outside ordinary life” (Huizinga, 1938). Many gamified tasks, however, are certainly serious and part of ordinary life.

The line then between a game and gamified activity is the mode in which one engages with the activity. Specifically, activities engaged with for their own sake qualify as games.

2. Methods and result access

The test has three phases that respectively measure and/or poll 1) demographics, 2) Bartle type and 3) test engagement.

The system used for testing is a platform built using the web technologies PHP, HTML5, CSS3, jQuery and MySQL. The full code, history and results are available on GitHub (Palokaj, 2016). An overview of all elements of the test system interface is available in appendix 13.1.

Figure 3: ungamified test

Variables saved to the MySQL database:

• ID – a unique identification number for this entry

• bartle_type – the calculated Bartle type

• gamified – whether or not this entry was gamified

• interactions – number of task interactions

• interactions_correct – number of correct responses

• Killer_quotient – percentage of Killer answers

• Achiever_quotient – percentage of Achiever answers

• Explorer_quotient – percentage of Explorer answers

• Socializer_quotient – percentage of Socializer answers

• gender – reported gender

• age – reported age

• email – reported email (optional)

• timestamp – timestamp of the entry creation
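
The stored variables map naturally onto a small record type. The sketch below is illustrative only: the field names follow the list above (lower-cased), while the `Entry` class, the Python types, the sample values and the tie-breaking rule in `dominant_type` are assumptions made for this example, not the study's PHP/MySQL implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """One stored test entry; field names follow the variable list."""
    id: int
    bartle_type: Optional[str]      # None if no type was stored
    gamified: Optional[bool]        # None if the assignment was not stored
    interactions: int
    interactions_correct: int
    killer_quotient: float          # quotients are percentages (0-100)
    achiever_quotient: float
    explorer_quotient: float
    socializer_quotient: float
    gender: str
    age: int
    email: Optional[str]            # optional in the survey
    timestamp: str

    def dominant_type(self) -> str:
        """Return the type with the highest quotient.

        Ties go to the first type in the order below (an arbitrary assumption).
        """
        quotients = {
            "Killer": self.killer_quotient,
            "Achiever": self.achiever_quotient,
            "Explorer": self.explorer_quotient,
            "Socializer": self.socializer_quotient,
        }
        return max(quotients, key=quotients.get)


# Hypothetical entry, not taken from the study data.
example = Entry(1, "Explorer", True, 90, 85, 20.0, 15.0, 45.0, 20.0,
                "f", 27, None, "2016-04-24 12:00:00")
print(example.dominant_type())
```

Keeping the four quotients next to the single `bartle_type` label makes it possible to revisit the classification later without collecting new data.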

3. Preliminary test

In order to gauge whether there are any biases in the color categorization task, a pretest was run with a pool of subjects conveniently selected through Facebook.

3.1.1 Test setup

Participation entered the subjects into a raffle for 6 months of Netflix or Spotify service. The request to participate was shared on Facebook.

• Deployment code of the test: git commit hash 829f7371f5512861cca3857faf071baf9958c872

• Deployment URL: https://www.skillcollector.com/bartle

• Test timespan: April 24th 2016 – May 1st 2016

Figure 4: gamified test

3.1.2 Test results

The MySQL database supports data exports to Microsoft Excel compatible CSV files. The test results are available in CSV, JSON, PDF and SQL formats on the aforementioned GitHub repository.

Table 2: Distribution of Bartle types, percentages rounded to no decimals

Metric Value

Number of participants 153

Number of gamified entries 82 (54%)

Number of Killer types 24 (16%)

Number of Achiever types 17 (11%)

Number of Socializer types 35 (23%)

Number of Explorer types 77 (50%)

Mean engagements across types 86.5

3.1.3 Data analysis

This preliminary test uses mean data analysis to draw its conclusions.

Means were generated with the SQL ‘AVG()’ function. Standard deviations (σ) with the ‘STD()’ function.
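
For readers who want to reproduce these two statistics outside the database: MySQL's `STD()` returns the population standard deviation, so the closest Python counterparts are `statistics.fmean` and `statistics.pstdev`. The interaction counts below are invented for illustration and are not study data.

```python
from statistics import fmean, pstdev

# Hypothetical interaction counts for one group (not study data).
interactions = [40, 55, 60, 75, 120]

mean_engagement = fmean(interactions)   # counterpart of SQL AVG()
sigma = pstdev(interactions)            # counterpart of MySQL STD(), population sigma

print(f"mean = {mean_engagement:.1f}, sigma = {sigma:.1f}")
```

Using the sample standard deviation (`statistics.stdev`) instead would give slightly larger values, which matters for the small groups in this pretest.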

Table 3: Engagement data of the ungamified entry sample

Metric (group: ungamified) Value σ

Number of Killer types 10 -

Number of Achiever types 6 -

Number of Socializer types 14 -

Number of Explorer types 38 -

Mean engagements Killer types 71.9 32.4

Mean engagements Achiever types 54.8 26.7

Mean engagements Socializer types 90.9 82.1

Mean engagements Explorer types 67.3 63.2

Table 4: Engagement data of the gamified entry sample

Metric (group: gamified) Value σ

Number of Killer types 13 -

Number of Achiever types 9 -

Number of Socializer types 21 -

Number of Explorer types 39 -

Mean engagements Killer types 175.1 342.3

Mean engagements Achiever types 93.1 61.6

Mean engagements Socializer types 96.5 55.7

Mean engagements Explorer types 75.8 58.4

Mean engagements across types 98.6 150.3

Table 5: Engagement differences between gamified and ungamified Bartle types rounded to 2 decimals

Bartle Type Difference in engagement as factor

Killer 2.44x

Achiever 1.70x

Socializer 1.06x

Explorer 1.13x
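
The factors in Table 5 are the gamified mean divided by the ungamified mean for each type. A minimal sketch of that calculation, using the means reported in Tables 3 and 4 (the function and variable names are chosen for this example):

```python
# Mean engagements per Bartle type, copied from Tables 3 and 4.
ungamified = {"Killer": 71.9, "Achiever": 54.8, "Socializer": 90.9, "Explorer": 67.3}
gamified = {"Killer": 175.1, "Achiever": 93.1, "Socializer": 96.5, "Explorer": 75.8}


def engagement_factor(gamified_mean, ungamified_mean):
    """Ratio of gamified to ungamified mean engagement."""
    return gamified_mean / ungamified_mean


factors = {
    bartle_type: round(engagement_factor(gamified[bartle_type], ungamified[bartle_type]), 2)
    for bartle_type in ungamified
}
print(factors)
```

A factor of 1.00 would mean no difference between the groups; a factor of 1.70 corresponds to 70% more interactions in the gamified group.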

3.1.4 Interpretation of preliminary data

The data gathered from the pretest leads to a number of observations and possible interpretations.

3.1.4.1 Software performance was stable

The software was stable and performed as expected. There were however three unexplained NULL entries. These could be the result of many things, such as the use of older browsers that handle JavaScript code differently than expected.

The JavaScript randomization performed as expected, gamifying 54%, a result close to the theoretical statistic of 50%.
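
Whether 54% is plausibly close to 50% can be checked with a normal approximation to the binomial distribution. The sketch below uses the counts from Table 2 (82 gamified entries out of 153); the choice of approximation and the function name are assumptions made for this illustration and were not part of the study's analysis.

```python
from math import erfc, sqrt


def split_z_and_p(gamified_count, total, p=0.5):
    """z-score and two-sided p-value for a fair split (normal approximation)."""
    expected = total * p
    sd = sqrt(total * p * (1 - p))
    z = (gamified_count - expected) / sd
    p_value = erfc(abs(z) / sqrt(2))   # two-sided tail probability
    return z, p_value


z, p_value = split_z_and_p(82, 153)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

A z-score below 1.96 in absolute value means the observed split is within what a fair 50/50 assignment would be expected to produce at the 5% level.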

3.1.4.2 The biggest subject group was Explorer type

Half of the participants were typed as Explorers. The Achiever group that is most of interest to the hypothesis was the smallest group.

The next biggest group were the Socializers. Both these groups fall in the part of the matrix concerned with interaction rather than action. Possible reasons for this are that these types might be:

• More active users of Facebook

• Predisposed to clicking links found on Facebook

• Highly motivated to win Spotify or Netflix subscriptions

3.1.4.3 Achievers engage least in the task

When not motivated by points and levels the Achievers appear to be predisposed to low engagement. In fact, of all groups the Achievers had the lowest mean interactions, well below the mean engagement across Bartle types.

Interestingly the Socializers engaged longest in the task when ungamified.

3.1.4.4 Achievers engagement mean increases

While Achievers do not engage long in the ungamified task, the gamification intervention brings their engagement numbers well above the ungamified mean and close to the gamified mean. In this data the intervention increased engagement interactions by a factor of 1.7, which seems to be in line with the overall hypothesis.

3.1.4.5 Killers mean engagement increases more

While the hypothesis expects Achievers to engage more and longer when motivated by points and levels, the Killer gamified and ungamified groups showed the biggest difference.

In fact, the 2.44 factor difference is well above the 1.7 difference observed in the Achievers. The standard deviation indicates that the data points are relatively far apart though, so this observation is very preliminary.

3.1.4.6 ‘Acting on’ vs ‘Interacting with’

The Killers and Achievers are both in the ‘acting on’ side of the matrix rather than the ‘interacting with’ side.

The susceptibility of these types to gamification by points seems to indicate that types concerned with influencing the (game) world around them respond well to having the results of their actions visualized.

4. Final data collection

The final round of data collection was used to supplement the data from the preliminary test.

4.1 Sample characteristics

Due to the nature of conveniently selected participants there are a number of caveats to consider concerning the data.

4.1.1 Target number of participants

The pretest resulted in more entries for certain types than others. The smallest group were the Achievers at 17 entries. It has been suggested that valid statistical analysis requires between 10 and 30 subjects depending on the type of research (Corder & Foreman, 2009).

The target set for the final data collection round was a minimum of 30 entries per Bartle type, a criterion which was satisfied with the smallest group being the Achievers at n=43.

4.1.2 Source of participants

The original intention was to use personal networks for the pretest and the Amazon Mechanical Turk system for mass data collection in the second round.

Due to restrictions placed by the Amazon company, using this service outside the United States was not an option (Amazon, 2016).

Instead subjects were selected through two channels:

• A newly created Facebook page

• The email list of skillcollector.com

The Facebook page was promoted with a minor advertising budget of €18. The email list contained 5,781 subscribers of which 4.3% clicked the link to the survey.

4.1.3 Test setup

Participation entered the subjects into a raffle for 6 months of Netflix or Spotify service. This was made clear on the Facebook page and in the email sent to Skill Collector subscribers.

• Deployment code of the test: git commit hash 8cbb842f0e67e49eef4426238059d2468a278975

• Deployment URL: https://www.skillcollector.com/bartle

• Test timespan: May 11th 2016 – May 20th 2016

5. Analysis and results of final data

The test results are available in CSV, JSON, PDF and SQL formats on the aforementioned GitHub repository.

A Shapiro-Wilk analysis for all samples concluded:

• The distribution of the data is not normal

• T-tests are not appropriate (they assume a normal distribution)

• Non-parametric analysis needs to be used

5.1 Complete data

For all samples a Shapiro-Wilk analysis was done, which found significances of 3.9856E-29, 0.000003, 1.4534E-15, 4.2985E-14 and 0.000001 for the full, Achiever, Explorer, Killer and Socializer samples respectively. Thus in all cases a Mann-Whitney test was used.
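
To illustrate what the Mann-Whitney test computes, the sketch below ranks the pooled observations, derives the U statistic and estimates a two-sided p-value with a normal approximation. It is a simplified teaching version: it applies no tie or continuity correction, so statistical packages may report slightly different p-values. The function names and the two small samples are invented for this example and are not study data.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Rank values starting at 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1          # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U, two-sided p) using the normal approximation."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u = min(u_a, n_a * n_b - u_a)
    mu = n_a * n_b / 2
    sigma = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mu) / sigma
    return u, erfc(abs(z) / sqrt(2))


# Hypothetical interaction counts (not study data).
ungamified_sample = [12, 30, 45, 51, 60]
gamified_sample = [48, 70, 88, 95, 140]
u, p = mann_whitney_u(ungamified_sample, gamified_sample)
print(f"U = {u}, p = {p:.3f}")
```

Because the test works on ranks rather than raw values, a single extreme interaction count influences the result far less than it would influence a comparison of means.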

5.1.1 Full sample

Data integrity and histograms are available in the appendix section 11.2.

Table 6: Distribution of Bartle types, percentages rounded to no decimals

Metric Value

Number of participants 369

Number of gamified entries 185 (50%)

Number of Killer types 53 (14%)

Number of Achiever types 43 (12%)

Number of Socializer types 73 (20%)

Number of Explorer types 200 (50%)

Mean engagements across types 83.4

Note that the discrepancy in percentages is an artifact of NULL entries and rounding off.

5.1.2 Gamified vs ungamified overview

This analysis starts at a general level and drills down to more advanced statistical analysis. It begins with a high-level ‘at a glance’ overview of the data based on overall means.

Means were generated with the SQL ‘AVG()’ function. Standard deviations (σ) with the ‘STD()’ function.

Table 7: Engagement data of the ungamified entry sample

Metric (group: ungamified) Value σ

Number of Killer types 21 -

Number of Achiever types 18 -

Number of Socializer types 36 -

Number of Explorer types 97 -

Mean engagements Killer types 71.9 49.6

Mean engagements Achiever types 72.8 53.1

Mean engagements Socializer types 67.8 63.4

Mean engagements Explorer types 75.9 71.4

Mean engagements across types 73.4 65.7

Table 8: Engagement data of the gamified entry sample

Metric (group: gamified) Value σ

Number of Killer types 30 -

Number of Achiever types 22 -

Number of Socializer types 34 -

Number of Explorer types 98 -

Mean engagements Killer types 114 234.6

Mean engagements Achiever types 105.4 86.4

Mean engagements Socializer types 98.3 68.2

Mean engagements Explorer types 91.6 72.6

Mean engagements across types 98.2 116.6

The statistical significance of the effect observed between the gamified and ungamified groups was 0.001. This means the effect of the intervention is statistically significant.

See appendix section 11.3 for detailed analysis results.

5.1.3 Achiever sample analysis

The significance of the difference observed between the gamified and ungamified groups was p = 0.411, thus p > 0.05. This means the effect of the intervention is not statistically significant within this group.

See appendix section 11.3 for detailed analysis results.

5.1.4 Explorer sample analysis

The significance of the difference observed between the gamified and ungamified groups was p = 0.028, thus p ≤ 0.05. This means the effect of the intervention is statistically significant within this group.

See appendix section 11.3 for detailed analysis results.

5.1.5 Killer sample analysis

The significance of the difference observed between the gamified and ungamified groups was p = 0.893, thus p > 0.05. This means the effect of the intervention is not statistically significant within this group.

See appendix section 11.3 for detailed analysis results.

5.1.6 Socializer sample analysis

The significance of the difference observed between the gamified and ungamified groups was p = 0.015, thus p ≤ 0.05. This means the effect of the intervention is statistically significant within this group.

See appendix section 11.3 for detailed analysis results.

5.2 Analysis of manipulated data by type

The dataset contained some extreme values that might skew results for the smaller groups. The below analysis removes outliers as defined per heading.

A Shapiro-Wilk analysis yielded a significance of 0.000126 and 0.643418 for the achiever and killer samples respectively. In both cases a Mann-Whitney test was used.

5.2.1 Achiever manipulated data analysis

The outliers found by SPSS exploration are four data points with case numbers 36, 26, 35 and 11. These correspond to user ids 294, 205, 291 and 118. These are data points with more than 200 interactions.

In this analysis only entries with fewer than 200 interactions are considered, so as to exclude the outliers.
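
The exclusion rule amounts to a simple threshold filter on the interaction count. A minimal sketch, with invented user ids and counts (not study data) and a hypothetical helper name:

```python
# Hypothetical (user_id, interactions) pairs; not study data.
entries = [(1, 54), (2, 412), (3, 199), (4, 230), (5, 200)]


def exclude_outliers(rows, threshold):
    """Keep only rows whose interaction count is strictly below the threshold."""
    return [(user_id, count) for user_id, count in rows if count < threshold]


kept = exclude_outliers(entries, threshold=200)
print(kept)
```

Note that an entry with exactly 200 interactions is dropped under this strict comparison; the text does not say how such a boundary case was treated, so this is an assumption.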

See appendix section 11.4 for detailed outlier analysis.

The significance of the difference observed between the gamified and ungamified groups was p = 0.657127, thus p > 0.05. This means the effect of the intervention is not statistically significant within this group.

See appendix section 11.4 for detailed analysis results.

5.2.2 Killer manipulated data analysis

The outliers found by SPSS exploration are six data points with case numbers 12, 41, 18, 45, 27 and 15. These correspond to user ids 78, 252, 106, 270, 179 and 96. These are data points with more than 159 interactions.

In this analysis only entries with fewer than 159 interactions are considered, so as to exclude the outliers.

See appendix section 11.4 for detailed outlier analysis.

The significance of the difference observed between the gamified and ungamified groups was p = 0.890, thus p > 0.05. This means the effect of the intervention is not statistically significant within this group.

See appendix section 11.4 for detailed analysis results.

5.3 Analysis of manipulated data by axis

The above data seems to indicate that the types on the ‘interacting with’ axis of the Bartle matrix have a statistically significant response, whereas the ‘acting on’ types do not.

The below sections analyze the effects and significance as separated by the axes. A Shapiro-Wilk analysis yielded a significance of 7.6073E-18, 1.9851E-17 and 3.7576E-7 for the ‘acting on’, ‘interacting with’ and normalized ‘acting on’ samples respectively. In all cases a Mann-Whitney test was used.

5.3.1 ‘Acting on’ data analysis

The ‘acting on’ axis concerns the Killer and Achiever Bartle types. The significance of the difference observed between the gamified and ungamified groups was p = 0.488, thus p > 0.05. This means the effect of the intervention is not statistically significant within this group.

See appendix section 11.4 for detailed analysis results.

5.3.2 ‘Interacting with’ data analysis

The ‘interacting with’ axis concerns the Explorer and Socializer Bartle types.

The significance of the difference observed between the gamified and ungamified groups was p = 0.001, thus p ≤ 0.05. This means the effect of the intervention is statistically significant within this group.

See appendix section 11.4 for detailed analysis results.

5.3.3 Normalized ‘acting on’ data analysis

The ‘acting on’ axis concerns the Killer and Achiever Bartle types. Normalization removed the outliers identified by case ids 147, 222, 4, 13, 193, 226, 52, 209, 80, 265, 167, 121, 263, 251 and 255, which correspond to entry ids 203, 306, 5, 15, 273, 310, 66, 290, 107, 362, 233, 168, 360, 342 and 346. These are data points with more than 200 interactions.

In this analysis only entries with fewer than 200 interactions are considered, so as to exclude the outliers.

The significance of the difference observed between the gamified and ungamified groups was p = 0.002, thus p ≤ 0.05. This means the effect of the intervention is statistically significant within this group.

See appendix section 11.4 for detailed analysis results.

6. Interpretation of final data

The analysis allows for a number of interpretations and observations.

6.1 Overview of significant effects

This table displays the results of comparing gamified and ungamified samples of separate groups.

Table 9: Overview of statistically significant effects. Mean interactions are for the ungamified sample. Mean difference is the factor of difference between the ungamified and gamified groups.

Sample  Significant in  Mean interactions  Mean difference

Explorer  Raw data  76  1.20

Socializer  Raw data  67  1.47

‘Interacting with’  Raw data  72  1.53

‘Acting on’  Normalized  61  1.28

6.1.1 Overall significant effect on engagement

Overall the engagement means for the ungamified and gamified samples when looking at the total population of all Bartle types show a statistically significant result.

Specifically, the mean increase from 73.4 to 98.2 was significant at a p value of 0.015.

6.1.2 Effects within raw data

When examining the statistical significance of the difference between groups it was found that:

• Achiever data showed no significant effect

• Killer data showed no significant effect

• Explorer data did show a significant effect

• Socializer data did show a significant effect

When looking at the ‘acting on’ and ‘interacting with’ axes it was found that:

• The ‘acting on’ axis showed no significant effect

• The ‘interacting with’ axis showed a significant effect

6.1.3 Effects within normalized data

When examining the statistical significance of the difference between groups it was found that:

• Achiever data showed no significant effect

• Killer data showed no significant effect

When looking at the axes:

• The ‘acting on’ axis showed a significant effect

6.2 Significant observations

Based on the data from this experiment one may observe a number of trends.

6.2.1 Points and levels increase engagement

The overall data indicates that points and levels are a way to significantly increase engagement in an otherwise nonproductive task. This is the case for all statistically significant correlations found.

6.2.2 Difference ‘acting on’ and ‘interacting with’

While populations on both sides of the horizontal axis of the Bartle matrix respond, the ‘interacting with’ population shows a larger difference in engagement than the ‘acting on’ population, with differences of 53% and 28% respectively.

6.2.3 Unexpected differences between types and axes

The hypothesis that Achievers respond best to points and levels as a motivating game mechanic seems to be contradicted by the significant correlations. There was no specific statistically significant data for the Achievers, but the ‘acting on’ axis engagement difference was lower than that of the ‘interacting with’ axis, even though traditionally points are thought to be more compatible with ‘acting on’ types.

6.2.4 Outliers in the ‘acting on’ population

Without normalization the Killer, Achiever and ‘acting on’ groups did not show statistical significance. A cause for this was the large concentration of outliers in these groups. This could imply that the response of these types is mediated by another factor.

7. Discussion

This study found a significant effect of gamification using points on engagement times, and some statistically significant effects when this data is dissected based on the Bartle matrix. There are however a number of factors that need to be addressed in relation to this study.

7.1 Practical applications

The results of this study seem to confirm the effectiveness of the points and level game mechanic. Application in other fields is indeed already widely present. Commercial rewards programs for example often use a similar dynamic. This study can be interpreted as a confirmation of the hypothetical effectiveness of this approach. In short:

• Points and levels can be used to increase engagement

• Points and levels do not seem to alienate non ‘acting on’ Bartle types

• Achiever types do not respond significantly better than other types

One may thus conclude that, if the conclusions of this study can be generalized to the human population at large, the application of points and levels can benefit all those the intervention is applied to, from the perspective of the Bartle types.

In addition, from a game design perspective one cannot simply say that all Bartle types need to be addressed with separate game mechanics. Indeed, Achievers, while stereotypically associated with points and levels, seem to respond less than Explorer and Socializer types.

7.2 Confounding factors

There are a number of factors one might argue limit the conclusions one can draw from the data of this particular study.

7.2.1 Subject bias

The subjects used in this study came from possibly highly biased populations. The main sources were:

• Personal social medium (Facebook) profile

• Facebook advertising

• Personal blog email list

There is a convincing argument to be made that these channels have inherent biases. It is for example the case that a large proportion of the subjects fell in the Socializer and Explorer group. One might hypothesize the social nature of the subject source is the cause of this.

Likewise, the individuals from these data sources have been selected over time by direct and indirect interaction with the profile/blog owner, which most likely biased the type of person participating in this study.

7.2.2 Sample size

It is possible that there is a threshold at which effects of the Bartle matrix on engagement mediated by points become visible. This study found significant overall effects in the study subjects, which one could argue is linked to the fact that this group is by definition larger than the subgroups.

7.2.3 Bartle type vs coefficient

The Bartle test scores subjects per type, and labels a participant based on the type with the highest points. This is a rather binary approach and does not take into account the fact that subjects exhibit traits from all types.

7.2.4 Intervention and task bias

The possibility exists that the chosen intervention of points and levels is a game mechanic that appeals to all Bartle types in such a way that the effect matches the current data.

In addition, it is possible that the task appealed to a particular subject group for reasons outside of their Bartle type, thereby skewing the results.

8. Future research

Although the data did not show a link between the Bartle matrix and the engagement response mediated by gamification, a number of avenues remain to be explored.

8.1 With the current data set

The current data set collected from this experiment allows for further research.

8.1.1 Demographic analysis

The current data set contains demographic data that is currently unused. Effects of gender and age can be examined without the need for extra data.

8.1.2 Separation based on Bartle coefficient

Likewise, the Bartle coefficient for all subjects has been recorded. It is possible that the lack of a found effect in the Killer and Achiever groups is due to the binary nature of the Bartle test in discounting the traits that did not score highest. One could approach this data set in a more complex manner and weigh a subject’s interactions based on their multiple coefficients.
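One possible way to weigh subjects by their coefficients is a weighted mean of engagement per type and condition, sketched below. The records, the field layout, and the choice of a weighted mean are all assumptions made for illustration; this analysis was not performed in this study.

```python
from collections import defaultdict

# Hypothetical records: (condition, engagement in seconds, coefficient per type).
subjects = [
    ("gamified", 120.0, {"Achiever": 0.5, "Explorer": 0.3, "Killer": 0.1, "Socializer": 0.1}),
    ("gamified", 60.0, {"Achiever": 0.1, "Explorer": 0.5, "Killer": 0.1, "Socializer": 0.3}),
    ("ungamified", 80.0, {"Achiever": 0.4, "Explorer": 0.4, "Killer": 0.1, "Socializer": 0.1}),
]


def weighted_means(subjects):
    """Weighted mean engagement per (type, condition), weighted by coefficient."""
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for condition, engagement, coeffs in subjects:
        for bartle_type, weight in coeffs.items():
            weighted_sum[(bartle_type, condition)] += weight * engagement
            weight_total[(bartle_type, condition)] += weight
    return {
        key: weighted_sum[key] / weight_total[key]
        for key in weighted_sum
        if weight_total[key] > 0
    }


means = weighted_means(subjects)
print(means[("Achiever", "gamified")])
```

With this approach every subject contributes to every type, in proportion to how strongly they scored on it, instead of contributing only to the type that scored highest.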

8.2 Outside the current data set

Aside from the current data set there are many possibilities for research in this area.

8.2.1 Different sample size and source

While this study did not find a full correlation between Bartle types and engagement, it is entirely possible that a bigger sample taken from a more diverse source does show an effect. Replication of this study on a bigger scale or using different subjects can present an interesting research opportunity.

8.2.2 Different gamification mechanics

One interesting avenue is researching gamification interventions other than points and levels. As discussed before, it is possible that all Bartle types respond well to points and levels, and therefore show no significant differences between them. Using different game mechanics and analyzing the responses of the Bartle types may produce more significant effects.

8.2.3 Gamifying a different task

By applying game mechanics to a different task one might possibly find that the Bartle matrix does separate groups of subjects with statistically significantly differing engagement levels.

8.2.4 Using different subject classifications

The Bartle matrix may not be a useful tool to analyze gamification related behavior at all. Instead there are many models that could be used to correlate personality types with susceptibility to gamification. Some common personality tests and/or traits that could be used are:

• Introvert vs extrovert individuals

• Myers-Briggs Type Indicator (MBTI)

• MMORPG class preferences

Many others are of course possible.

8.2.5 Using extrapolated classifications

From a business perspective it may be interesting to test gamification effects based on subject traits that do not rely on an explicit test. Examples of classifications that could be used are:

• Google Analytics Interest Groups

• Google Analytics Market Segments

• In game/application behavior

• Classifications based on self reported user profiles

The classifications above would not require breaking a user’s usual experience, but can still determine susceptibility to certain gamification mechanics if this classification turns out to have a statistically significant effect.

9. Acknowledgements

My gratitude goes out to those who helped complete this research project. Specifically, I thank the individuals below for their invaluable help.

Liesbeth van den Berg, for support on statistics and general mental support. Without her this research would not have arrived at the conclusions that it did.

Daniel Buzzo, for helping shape this research while allowing me to maintain full autonomy in the development, execution and reporting. His feedback inspired me to expand or refocus the research efforts where needed.

Frank Nack, for logistical support and insights that provided much needed reassurance during the course of this project.

10. References

Amazon. (2016). FAQs | Help | Requester | Amazon Mechanical Turk. Retrieved May 21, 2016, from https://requester.mturk.com/help/faq#do_support_outside_us

Andreasen, E. (2009). Erwin’s MUD resources page. Retrieved March 18, 2016, from http://www.andreasen.org/mud.shtml

Bartle, R. (1996). Hearts, clubs, diamonds, spades: Players who suit MUDs. Journal of MUD Research, 1(1), 19.

Bartle, R. (2004). Designing virtual worlds. New Riders.

Bartle, R. (2012). Player Type Theory: Uses and Abuses | Richard BARTLE. Retrieved March 18, 2016, from https://www.youtube.com/watch?v=ZIzLbE-93nc

Bearnes, B. (2016). How To Install Linux, Apache, MySQL, PHP (LAMP) stack on Ubuntu 16.04. Retrieved from https://www.digitalocean.com/community/tutorials/how-to-install-linux-apache-mysql-php-lamp-stack-on-ubuntu-16-04

Castronova, E. (2008). Synthetic worlds: The business and culture of online games. University of Chicago Press.

Corder, G. W., & Foreman, D. I. (2009). Nonparametric Statistics: An Introduction. Nonparametric Statistics for Non-Statisticians: A Step-by-Step Approach, 1–11.

Deterding, S., Dixon, D., Khaled, R., & Nacke, L. (2011). From game design elements to gamefulness: defining gamification. In Proceedings of the 15th international academic MindTrek conference: Envisioning future media environments (pp. 9–15). ACM.

Edtechteacher. (2014). Use the Four Gamer Types to Help Your Students Collaborate – from Douglas Kiang on Edudemic.

Hamari, J. (2013). Transforming homo economicus into homo ludens: A field experiment on gamification in a utilitarian peer-to-peer trading service. Electronic Commerce Research and Applications, 12(4), 236–245.

Hamari, J. (2015). Do badges increase user activity? A field experiment on the effects of gamification. Computers in Human Behavior.

Hanus, M. D., & Fox, J. (2015). Assessing the effects of gamification in the classroom: A longitudinal study on intrinsic motivation, social comparison, satisfaction, effort, and academic performance. Computers & Education, 80, 152–161.

Hirsh-Pasek, K., Golinkoff, R. M., & Eyer, D. (2004). Einstein never used flash cards: How our children really learn--and why they need to play more and memorize less. Rodale.

Huizinga, J. (1938). Homo ludens: proeve eener bepaling van het spel-element der cultuur. Haarlem: Tjeenk Willink.

Huotari, K., & Hamari, J. (2012). Defining gamification: a service marketing perspective. In Proceeding of the 16th International Academic MindTrek Conference (pp. 17–22). ACM.

Kyatric. (2013). Bartle’s Taxonomy of Player Types (And Why It Doesn't Apply to Everything). Retrieved April 4, 2016, from http://gamedevelopment.tutsplus.com/articles/bartles-taxonomy-of-player-types-and-why-it-doesnt-apply-to-everything--gamedev-4173

Palokaj, M. (2016). Bartle Platform. Retrieved from https://github.com/actuallymentor/bartle-platform/

Scott, M. J., Ghinea, G., & Arachchilage, N. A. G. (2014). Assessing the Role of Conceptual Knowledge in an Anti-Phishing Educational Game. In Advanced Learning Technologies (ICALT), 2014 IEEE 14th International Conference on (p. 218). IEEE.

Skinner, B. F. (1938). The behavior of organisms: An experimental analysis.

Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology, 18(6), 643.

Stuart, K. (2007). MUD, PLATO and the dawn of MMORPGs | Technology | The Guardian. Retrieved April 4, 2016, from http://www.theguardian.com/technology/gamesblog/2007/jul/19/mudvsplatowh

Thegwailo. (2015). Using the Bartle Test to Gamify Your Life | saga learning system. Retrieved April 4, 2016, from https://sagalearning.wordpress.com/2015/03/04/using-the-bartle-test-to-gamify-your-life/

Volkswagen. (2010). The Speed Camera Lottery | The Fun Theory. Retrieved April 7, 2016, from http://www.thefuntheory.com/speed-camera-lottery-0

Zichermann, G., & Cunningham, C. (2011). Gamification by design: Implementing game mechanics in web and mobile apps. O’Reilly Media, Inc.

11. Appendix

A high resolution version of this article in PDF is available on GitHub: https://github.com/actuallymentor/bartle-platform

11.1 Test software screenshots

Figure 3: ungamified test

Figure 4: gamified test

Figure 6: instructions and demographics for participants

Figure 7: Bartle test interface

Figure 8: gamified pre-test instructions

Figure 8: ungamified pre-test instructions

Figure 9: final ‘thank you’ screen

11.2 Data integrity and distribution

11.2.1 Full data set

Using a t-test preceded by an F-test, we can determine whether the gamification by points showed a significant effect. Over the entire dataset:

• 11 were NULL entries

• 12 non NULL entries scored a ratio below 0.8 for interactions versus correct interactions

• 4 non NULL entries scored a ratio below 0.6 for interactions versus correct interactions

To eliminate entries from participants who answered randomly rather than actively engaging in the task, entries with a ratio below 0.8 were removed.
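The cleaning step described above can be sketched as follows. The records and field names are hypothetical, `None` stands in for a NULL entry, and the ratio is assumed to be correct interactions divided by total interactions; only the 0.8 threshold comes from the text.

```python
# Hypothetical records; None stands in for a NULL entry.
entries = [
    {"id": 1, "interactions": 50, "correct": 48},
    {"id": 2, "interactions": 40, "correct": 22},
    {"id": 3, "interactions": None, "correct": None},
    {"id": 4, "interactions": 30, "correct": 27},
]

MIN_RATIO = 0.8  # threshold stated in the text


def accuracy_ratio(entry):
    """Correct interactions divided by total interactions (assumed definition).

    Returns None for NULL entries and for entries without any interactions.
    """
    total, correct = entry["interactions"], entry["correct"]
    if not total or correct is None:
        return None
    return correct / total


def keep(entry):
    """Keep only non-NULL entries whose ratio is not below the threshold."""
    ratio = accuracy_ratio(entry)
    return ratio is not None and ratio >= MIN_RATIO


cleaned = [entry for entry in entries if keep(entry)]
kept_ids = [entry["id"] for entry in cleaned]
print(kept_ids)  # [1, 4]
```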

11.2.1.1 Data distribution

Diagram 1: Histogram for ungamified sample

Diagram 2: Histogram for gamified sample

11.2.2 Achiever sample analysis

Since the main hypothesis centers on Achievers they are a main focus of analysis. Of the 44 Achievers:

• 3 were NULL entries

• All non NULL entries scored a ratio above 0.8 for interactions versus correct interactions

The NULL entries were discarded.

The ratio indicates that the remainder of subjects actively engaged with the task. Simply hitting buttons randomly in the task would have resulted in a ratio closer to 0.5.

11.2.2.1 Data distribution

Diagram 3: Histogram for ungamified Achiever types

Diagram 4: Histogram for gamified Achiever types

11.2.3 Killer sample analysis

The preliminary data analysis saw a relatively large mean engagement increase for Killers. The Killer type is also on the ‘acting on’ axis of the Bartle matrix.

• 2 were NULL entries

• 2 non NULL entries scored a ratio below 0.8 for interactions versus correct interactions

• 1 non NULL entry scored a ratio below 0.6 for interactions versus correct interactions

The NULL entries were discarded.

To eliminate entries from participants who answered randomly rather than actively engaging in the task, entries with a ratio below 0.8 were removed.

11.2.3.1 Data distribution

Diagram 5: Histogram for ungamified Killer types

Diagram 6: Histogram for gamified Killer types

11.2.4 Explorer sample analysis

The preliminary data analysis indicated a relatively small difference compared to the ‘acting on’ Bartle types.

• 2 were NULL entries

• 2 non NULL entries scored a ratio below 0.8 for interactions versus correct interactions

• 1 non NULL entry scored a ratio below 0.6 for interactions versus correct interactions

The NULL entries were discarded. Entries with a ratio below 0.8 were also discarded.

11.2.4.1 Data distribution

Diagram 7: Histogram for ungamified Explorer types

Diagram 8: Histogram for gamified Explorer types

11.2.5 Socializer sample analysis

The preliminary data analysis indicated a relatively small difference compared to the ‘acting on’ Bartle types.

• 2 were NULL entries

• 2 non NULL entries scored a ratio below 0.8 for interactions versus correct interactions

• 2 non NULL entries scored a ratio below 0.6 for interactions versus correct interactions

The NULL entries were discarded. Entries with a ratio below 0.8 were also discarded.

11.2.5.1 Data distribution

Diagram 9: Histogram for ungamified Socializer types

Diagram 10: Histogram for gamified Socializer types

11.3 Mann-Whitney Tests

The Mann-Whitney tests were executed using IBM SPSS software. The diagrams and table sets below are screenshots of said software.
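For readers without access to SPSS, the test can be approximated with a short standard-library sketch. This is not the procedure SPSS follows: it uses the normal approximation without tie or continuity correction, which is rough for small samples, and the durations below are hypothetical rather than study data.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U, two-sided p) using the normal approximation."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:n1])
    u1 = rank_sum_a - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p


# Hypothetical engagement durations in seconds (not the study data).
ungamified = [30, 45, 50, 62, 70]
gamified = [55, 80, 95, 110, 130]
u, p = mann_whitney_u(ungamified, gamified)
print(u, p)
```

A rank-based test of this kind does not assume normally distributed durations, which suits skewed engagement data such as that shown in the histograms.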

11.3.1 Overall sample

Table Set 1: Mann-Whitney test for overall sample

11.3.2 Achiever sample

Table Set 2: Mann-Whitney test for Achiever sample

11.3.3 Explorer sample

Table Set 3: Mann-Whitney test for Explorer sample

11.3.4 Killer sample

Table Set 4: Mann-Whitney test for Killer sample

11.3.5 Socializer sample

11.4 Manipulated data analysis

The Mann-Whitney tests were executed using IBM SPSS software. The diagrams and table sets below are screenshots of said software.

11.4.1 Achiever sample

Diagram 11: Boxplot of Achiever sample distribution

Table Set 6: Mann-Whitney test of Achiever sample without outliers

11.4.2 Killer sample

Diagram 12: Boxplot of Killer sample distribution

Table Set 7: Mann-Whitney test of Killer sample without outliers

11.5 Data analysis by axis

11.5.1 ‘Acting on’ axis

Table Set 8: Mann-Whitney test of ‘acting on’ sample

11.5.2 ‘Interacting with’ axis

Table Set 9: Mann-Whitney test of ‘interacting with’ sample

11.5.3 Normalized ‘acting on’ axis

Table Set 10: Mann-Whitney test of normalized ‘acting on’ sample
