
Algorithmic curation on Facebook

A thesis on the concept of a filter bubble on the Facebook news feed.

Master thesis: Computer-mediated Communication University of Groningen, Faculty of Arts Supervisor: Dr. G.J. Mills

Date: 13-08-2018

Author: Maurice van Iersel

Hora Siccamasingel 350 9721HZ Groningen

m.e.van.iersel@student.rug.nl


This page has been intentionally left blank.



Abstract

Social media platforms such as Facebook use algorithms to decide what information their users get to see, in order to prevent information overload and increase engagement with the platform. For each individual user on Facebook the news feed, the centre of the Facebook experience, is algorithmically curated based on individual preferences. As a result of algorithmic curation, users are shown the information that is preferred and relevant to them personally, whilst irrelevant information is hidden. Users can get trapped in their individual “filter bubble”, which is associated with effects such as selective exposure and polarization.

However, it is unknown if, and how, such an individual filter bubble could be created by Facebook. No known earlier research has concretely defined and measured the concept of a filter bubble on the platform. To address this knowledge gap it was conceptualized that a filter bubble could be created on Facebook by algorithms that filter and promote or demote posts on the news feed based on relevance. Several basic mechanisms were conceptualized that could affect this relevance of posts, and these mechanisms were examined in three experimental conditions. The main conclusion is that a filter bubble on Facebook is a dynamic concept that operates more on a group level than on the individual level. It was found that Facebook users are capable of dictating their own relevance to others in their network. Facebook users are also partially responsible for the information that is propagated in their network. This highlights the importance of a user’s social connections on Facebook. Additionally, it was found that Facebook independently clustered posts, and independently changed the sequentiality and chronology of posts. Overall the results show that other-selected preferences and independent mechanisms used by the algorithms have a more consistent effect in curating posts on the news feed than self-selected preferences. Therefore the notion of a filter bubble on Facebook should be revisited in order to create a better understanding of what it encompasses.


Table of contents

1. Introduction

2. Conceptual framework
2.1 Algorithmic curation of information
2.2 A filter bubble
2.2.1 Dynamics of a filter bubble
2.2.2 Effects of a filter bubble
2.3 A filter bubble on Facebook?
2.3.1 The Facebook news feed as an individual filter bubble
2.3.2 Curation of the Facebook news feed
2.4 Conceptualizing possible mechanisms of algorithmic curation on Facebook
2.4.1 Liking
2.4.2 Individual affinity
2.4.3 Symmetrical affinity
2.4.4 Algorithmic bias
2.5 Research questions

3. Overview of the experiments
3.1 Common methods used in all experiments

4. Experiment 1
4.1 Methods
4.1.1 The network setup on Facebook
4.1.2 Procedure
4.1.3 Data collection & analysis
4.1.4 Material
4.2 Results
4.3 Summary of results

5. Experiment 2
5.1 Methods
5.1.1 The network setup on Facebook
5.1.2 Manipulating affinity
5.1.3 Filtering and promotion/demotion of posts
5.1.4 Procedure & hypotheses
5.1.5 Data collection: calculating the filtering and promotion/demotion of posts
5.1.6 Analysis
5.1.7 Material
5.2 Results
5.3 Summary of results

6. Experiment 3
6.1 Methods
6.1.1 The setup on Facebook
6.1.2 Procedure & hypotheses
6.1.3 Data collection
6.1.4 Analysis
6.1.5 Material
6.2 Results
6.3 Summary of results

7. Discussion
7.1 Findings based on the research questions
7.2 Findings based on a qualitative observation of the data
7.3 Main conclusion
7.3.1 Implications
7.4 Limitations & Strengths
7.5 Suggestions for future research

Bibliography

List of appendices
Appendix A: Instructions for common methods
Appendix B: Protocol experiment 1
Appendix C: Collected screenshots for experiment 1
Appendix D: Protocol experiment 2 (network 1)
Appendix E: Protocol experiment 2 (network 2)
Appendix F: Protocol experiment 2 (network 3)
Appendix G: Collected screenshots for experiment 2
Appendix H: Data sheet for experiment 2
Appendix I: Protocol experiment 3
Appendix J: Collected screenshots for experiment 3
Appendix K: Data sheet for experiment 3


1. Introduction

The digital revolution of the early 1990s gave rise to the internet. Nowadays it is almost impossible to imagine life without it. The internet serves all kinds of purposes, from filling in tax forms to watching videos on YouTube or chatting with someone on the other side of the world. It is a medium that connects people and information, essentially based on an ever-growing database. The internet has also profoundly changed the way in which information is produced, distributed and consumed. Platforms on the internet have overtaken the role of traditional media when it comes to this distribution, production and consumption of information (Caplan & Boyd, 2016, p.2). A growing number of people no longer rely on traditional media, but on social media platforms for news consumption (Barthel, Shearer, Gottfried & Mitchell, 2015, in: Caplan & Boyd, 2016, p.2). And one of the more important social media platforms is Facebook.

Facebook is a social media platform that allows people to connect, communicate and share information in “stories” (also called “posts”) on the Facebook news feed. Facebook has over 2 billion users [1]. Every user has an average of 338 social connections [2], and every minute nearly half a million posts are added to the Facebook news feed [3]. If people rely on Facebook for their news consumption, then these numbers, and even the sheer size of Facebook’s current user base, emphasize the importance of Facebook in this process of news consumption.

However, there are caveats to keep in mind when relying on Facebook for news consumption. Facebook uses algorithms to “personalize” the information its users get to see on the news feed (Pariser, 2011b). This means that there are systems in place that could be capable of evaluating all actions of each individual user, subsequently providing these users with new information based on that evaluation. And this evaluation is a continuous process. This potentially leads to a situation where each individual user only sees the information that he or she wants to see, a situation also referred to as the “filter bubble” (Pariser, 2011a, p.9; Pariser, 2011b). Within such a filter bubble a user sees information that is systematically selected according to that user’s personal preferences. And such personalization of information is considered beneficial. For example, in a broad sense, social media platforms such as Facebook want to keep their users engaged with the platform to generate revenue. The more preferred information Facebook shows to its users, the more likely it is to keep them engaged (Pariser, 2011c). And in the process users are shown more of what they want to see.

It is however important to look at a filter bubble from a different perspective. When news is consumed only through a platform such as Facebook, the platform acts as the “keyhole” or “frame” through which a user interprets the world. By using algorithms to personalize information this frame becomes smaller, because users only see more of what they want to see. Users can get stuck in their own individual filter bubble of preferred information. In this context, the algorithms that select this information can have a substantial effect on public life (Caplan & Boyd, 2016, p.2). The use of algorithms to control how information is distributed can even be considered worrisome from the point of view of democracy (Bozdag & van den Hoven, 2015, p.254). This is exacerbated by several issues. First, more than 75% of the users on Facebook are unaware that there are algorithms deciding what information they get to see (Hamilton, Sandvig, Karahalios & Eslami, 2014, p.636). Based on Facebook’s user base this translates to nearly 1.5 billion people affected by a lack of algorithmic awareness, emphasizing the potential scale of the effects of a filter bubble. Next, the majority of the users that are aware of the algorithms do not know how to influence them (Hamilton et al., 2014, p.636). Additionally, Facebook does not disclose how its algorithms work. It is Facebook that ultimately decides what its users get to see, but it is unknown how exactly this is decided.

[1] https://newsroom.fb.com/news/2017/06/two-billion-people-coming-together-on-facebook/
[2] https://www.brandwatch.com/blog/47-facebook-statistics-2016/
[3] https://zephoria.com/top-15-valuable-facebook-statistics/

With Facebook being an important platform for news consumption, the question whether Facebook creates a filter bubble also becomes important. Due to the secrecy surrounding the algorithms on Facebook it is unknown if and how Facebook creates a filter bubble. Lazer (2015, p.1090) states that it is impossible for a single person to understand such “social algorithms”. This might explain why much of the research on the concept of a filter bubble has focused on the general implications of the phenomenon, rather than on how the concept can be concretely defined and measured on a platform such as Facebook. That does not mean that these elements are not important. The lack of knowledge about these elements emphasizes that it is necessary to start investigating and attempt to create a better understanding of how Facebook could create a filter bubble. This thesis therefore attempts to address this knowledge gap by exploring the following question:

“Does Facebook create an individual filter bubble?”

In order to investigate this question a conceptual framework is developed and research questions are derived from this framework. These are presented in chapter 2. To investigate the research questions a sequence of three experimental studies was conducted. An overview of these experiments and the common methods used in these experiments are presented in chapter 3. The methods and results of each experiment are presented in chapters 4, 5 and 6. In chapter 7 the findings are discussed, the main conclusion and implications are presented, the limitations and strengths are discussed and suggestions for future research are presented.



2. Conceptual framework

This thesis draws on the idea of creating a conceptual framework, as described by Miles & Huberman (1994, in: Maxwell, 2012, p.39-40). In this chapter the key ideas, theories and concepts for this thesis are discussed. At the end of the chapter the research questions derived from the conceptual framework are presented.

2.1 Algorithmic curation of information

The internet is a vast database of information, and its growth rate is extraordinary. If all of human history from the year 0 up to 2003 were recorded, it would take up around 5 exabytes of raw data (Pariser, 2011a, p.11). Nowadays, the amount of information on the internet grows by 5 exabytes of raw data every 2 days. To put this into perspective: more than 90% of the total amount of information on the internet has been created since 2016, and a big part of this growth is specifically attributed to social media platforms [4].

A social media platform can broadly be defined as an application on the internet that builds on the principles of Web 2.0 and allows for the creation and exchange of User Generated Content (Kaplan & Haenlein, 2010, p.61). Within the definition of Kaplan & Haenlein (2010), the term “Web 2.0” refers to the evolution of the internet towards a more participatory and collaborative environment for its users, created by new technologies. The term “User Generated Content” is an umbrella term for all the different kinds of content that can be created and exchanged by users on the internet. Nowadays there is a plethora of different social media platforms that allow people to connect and create or exchange many different types of content. Some of the most common and popular social media platforms include YouTube, Twitter, Instagram and Facebook.

Nearly all of the social media platforms are owned by companies, and companies have financial interests.

From a financial perspective, the more time users spend on a specific platform, the more revenue that platform generates (Pariser, 2011c). It is therefore in the best interest of the companies that own these social media platforms to keep their users engaged with the platform as much as possible. However, the total amount of information on these platforms is growing quickly, and not all information is equally relevant for each user. In order to maintain user engagement it is considered important that each individual user sees information that is relevant for them personally. Otherwise users could become deluged by irrelevant information and the engagement with the platform could start to decline. Therefore relevance of information has to be treated subjectively. The CEO of Facebook, Mark Zuckerberg, once concretely exemplified this perspective by stating:

“A squirrel dying in your front yard may be more relevant to your interests right now than people dying in Africa” [5].

In order to find out what information is relevant for each individual user, social media platforms started to use algorithms to evaluate preferences for information. These algorithms then selectively provide the user with information that is preferred. This process is conveniently called the “personalization” of information (Pariser, 2011b; Caplan & Boyd, 2016, p.2). Metaphorically, the algorithms try to pick the gems out of the information rubble for each individual user. Such algorithms try to prevent users from being overwhelmed with information that is not relevant to them personally (Lazer, 2015, p.1090). This process of personalizing information can also be called algorithmic curation: the organizing, selecting and presenting of information to the user (Rader & Gray, 2015, p.173).

[4] https://blog.microfocus.com/how-much-data-is-created-on-the-internet-each-day/
[5] https://www.nytimes.com/2011/05/23/opinion/23pariser.html

The algorithmic curation of information is not a new phenomenon. A multitude of online services, e.g. Netflix and Amazon, were already using algorithms to curate content recommendations to increase engagement and sales (Pariser, 2011a, p.7-8). The basic principle that drives this curation is the notion: “if you like this, you’ll like that” (Pariser, 2011c). This leads to irrelevant content being filtered and relevant content being shown for each individual. And this algorithmic curation yields positive results. For example, when a user watches comedy movies on Netflix, the algorithms recommend more comedy movies to that user, in turn increasing the time that user spends watching Netflix and resulting in more rental sales. Buying a new smartphone on Amazon results in additional and compatible accessories for that smartphone being recommended, again increasing total sales.
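The “if you like this, you’ll like that” principle can be sketched as a simple co-occurrence recommender. This is purely illustrative and not the actual logic of Netflix or Amazon; the item names and scoring are invented:

```python
# Illustrative sketch of "if you like this, you'll like that": recommend items
# that co-occur with what a user already liked. The item names and scoring are
# invented; this is not the actual logic of Netflix or Amazon.

from collections import Counter

# Each set: the items one (hypothetical) user has liked.
histories = [
    {"comedy_a", "comedy_b", "drama_x"},
    {"comedy_a", "comedy_c"},
    {"comedy_b", "comedy_c", "drama_x"},
]

def recommend(liked, histories, n=2):
    """Rank unseen items by how strongly they co-occur with liked items."""
    scores = Counter()
    for history in histories:
        overlap = len(liked & history)
        if overlap:                          # this user shares some taste
            for item in history - liked:     # candidate items not yet seen
                scores[item] += overlap
    return [item for item, _ in scores.most_common(n)]

print(recommend({"comedy_a", "comedy_b"}, histories))  # ['drama_x', 'comedy_c']
```

The more a candidate item co-occurs with the user’s existing preferences, the higher it is ranked, which is exactly the narrowing dynamic the filter-bubble argument is concerned with.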

But there are important differences between services such as Netflix and social media platforms such as Twitter and Facebook when it comes to algorithmic curation of information. Services such as Netflix curate information in an environment where there are no other users and no user generated content. But social media platforms such as Twitter and Facebook curate information in an environment that consists mainly of other users and user generated content. Algorithmic curation on social media platforms can therefore extend beyond the curation of just content for the individual, because the environment in which the curation takes place suggests that there is more than only content that could be curated. For example, on social media platforms users create and exchange content, and it is possible that the algorithms curate this exchange, essentially curating the communication between users. This raises the question: what effects could this algorithmic curation of information have?

2.2 A filter bubble

On the one hand algorithmic curation can be considered positive, because it is helpful for the user (Pariser, 2011a, p.11-12). The rate of information growth on the internet and the ease of being connected to that information lead to a notion called “attention crash”, which means that users become unable to attend to all available information. In that context algorithmic curation acts as a helping hand, offering to select and present relevant information without additional effort on the part of the user. And the financial benefit of keeping users engaged through algorithmic curation leads to an arguable win-win situation for both social media platforms and their users.

On the other hand algorithmic curation can be considered negative, because it ultimately leaves users only with information they want to see, not necessarily what they need to see (Pariser, 2011b). For example, when a user chooses to hear more news about guns and nothing about healthcare, this user will be presented with more news about guns whilst the news about healthcare is edited out by algorithms. Algorithmic curation creates a situation where preferred information is shown and unwanted information is hidden from each individual user. This individual universe of preferred information is called a “filter bubble” (Pariser, 2011a, p.9).


2.2.1 Dynamics of a filter bubble

While not an exhaustive list, at least three dynamics have been coined with regard to a filter bubble that could affect how users encounter diverse information and ideas (Pariser, 2011a, p.9-10):

Firstly, users are alone in a filter bubble. When the selected and presented information is tailored to the preferences of each individual user, it is almost impossible to know what information is within the same reference frame of other users. It is unknown what is in the filter bubble of another user.

Secondly, users are unaware of a filter bubble. This could be attributed to the era that precedes the use of algorithms to curate information, where users were able to (partially) select their own criteria for the filtering of information (e.g. when someone chose to read a sports newspaper, it was logical that that newspaper provided information about sports). But algorithms do not explain why they choose to display certain information and hide other information. The user has no knowledge of the criteria used by the algorithms to select and display information. This lack of knowledge makes it hard to see how biased the presented information is, which renders users unaware of this bias.

Thirdly, users are unable to avoid being drawn into a filter bubble. In the context of traditional media, such as television or newspapers, the consumer of the information chooses to consume information based on specific criteria (e.g. the above mentioned example about a sports newspaper). But the consumers are also able to opt out of those specific criteria, for example by choosing a different source or medium. When using a platform that uses algorithms to curate information, there is no such choice. On such platforms users are unavoidably subject to algorithmic curation.

2.2.2 Effects of a filter bubble

The creation of a filter bubble through algorithmic curation could also lead to several effects. At least three different levels of effects can be distinguished, which are the system level, the individual level and the group level.

Effect on the system level: feedback loops

Over time the use of algorithms to curate information changes the interaction between the user and the algorithms: from actively choosing information to consume, to passively consuming information chosen by algorithms (Rader & Gray, 2015, p.175). Consuming information that is selected by algorithms simultaneously provides the same algorithms with signals about what information is preferred next. This is an example of a “feedback loop”, a phenomenon where the previous output of an algorithm influences the next input for that algorithm (Steck, 2011, in: Rader & Gray, 2015, p.175). What this means is that, over time, the process of algorithmic curation and the feedback loops embedded in this process decrease the diversity of the information that is presented to a user. The algorithms only select more information that has preferred characteristics. As a result the corpus of information that is selected by algorithms on a system level can be affected (Rader & Gray, 2015, p.175).
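Such a feedback loop can be illustrated with a minimal simulation. The topics, starting scores and boost factor below are all invented; the point is only the dynamic: whatever gets shown gets clicked, which raises its score, so the feed locks onto the same few topics:

```python
# Minimal simulation of a curation feedback loop (all values invented).
# The feed shows the top-scoring topics; being shown earns clicks, which
# raise a topic's score for the next round.

scores = {"sports": 1.0, "politics": 1.0, "music": 1.0, "tech": 1.1}

def feed(scores, k=2):
    """The algorithm's output: the k highest-scoring topics."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

shown_over_time = []
for _ in range(5):
    shown = feed(scores)
    shown_over_time.append(shown)
    for topic in shown:       # feedback: what was shown gets clicked,
        scores[topic] *= 1.5  # which raises its score for the next round

print(shown_over_time[0], shown_over_time[-1])  # the same two topics, every round
```

After the first round the two initially favoured topics are boosted and never displaced, so "music" and "politics" are never shown again: the output of one round fully determines the input of the next.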

Effect on the individual level: selective exposure

By design algorithmic curation presents users with a narrow selection of the available information based on individual preferences. This information is arguably getting less and less diverse, a process that is amplified by feedback loops. This can also have an impact on the behavior of the user (Rader & Gray, 2015, p.181).

Users are selectively exposed to new and diverse information due to algorithmic curation. Selective exposure can be defined as mainly coming in contact with information that supports one’s own ideas and opinions (Frey, 1986, in: Liao & Fu, 2013, p.2359). A practical consequence of this selective exposure is that it becomes increasingly harder for users to consume information that is not preferred (Pariser, 2011c). Users can retreat into an online environment that only shows information they agree with, shielding them from information they do not agree with. This polarizes individual points of view. It also threatens individual learning capability and creativity (Pariser, 2011a, p.94).

Effect on the group level: polarization

In the “real world”, people have diverse discussions and express different points of view. But on the internet, the use of algorithms to curate information decreases possible encounters with opposing points of view due to selective exposure. Users choose what they get and don’t get to see. Users on the internet can become fragmented based on such subjective preferences (Chan & Fu, 2017, p.267), in turn creating and polarizing groups of users with similar interests and disinterests. This process is also referred to as “cyberbalkanization”: the creation of sub-groups of users on the internet with a prejudiced approach to alternate content (Bozdag & van den Hoven, 2015, p.249).

On the group level the creation and polarization of subgroups due to selective exposure is believed to be worrisome for public issues, because it can create distorted points of view between political subgroups (Stroud, 2017, p.543), which in turn threatens democratic discussion (Sunstein, 2007, in: Bozdag & van den Hoven, 2015, p.249). Such subgroups act as echo chambers, where personal ideas and beliefs are strengthened and opposing points of view are avoided. And such polarized subgroups can also be seen on present-day social media platforms. For example, there is evidence that Twitter users from the United Kingdom have a tendency to selectively talk with politically like-minded individuals, both inside and outside their personal Twitter network (Krasodomski-Jones, 2016, p.33-34). On Facebook, users have been found to selectively consume and share politically like-minded information in their networks, which in turn leads to a distorted view of reality (An, Quercia & Crowcroft, 2014, p.13 & 22).

2.3 A filter bubble on Facebook?

Social media platforms are becoming increasingly important for news consumption (Barthel et al., 2015, in: Caplan & Boyd, 2016, p.2). Because Facebook is globally the biggest and most widely used social media platform [6], it is reasonable to assume that Facebook is an important, if not the most important, social media platform for news consumption. Since Facebook also uses algorithms to curate the information on the platform, the questions can be asked how Facebook could create a filter bubble, and what it could look like.

2.3.1 The Facebook news feed as an individual filter bubble

The place where all information on Facebook is aggregated, shown and consumed is called the “news feed”. This is where a potential filter bubble on Facebook is located, because the news feed can be considered the centre of the Facebook experience: it is the first thing a user sees when he or she logs on to the platform and it contains all new information. The news feed is described as a regularly updating list of information contained in “stories” from the social connections of a user (Bucher, 2012, p.1167). Due to the interface design of the Facebook news feed these stories, or “posts”, are shown in a sequential list from top to bottom. Posts can


The posts on this news feed are curated by Facebook’s algorithms to suit individual preferences. Whenever an average user logs on to Facebook, an average of 1500 posts is available to be shown on the news feed, but the algorithms only select an average of 300 of those posts to be shown (Rader & Gray, 2015, p.174). This translates into an average of 80% of the available posts being filtered at any given time for an average user. The posts that remain after filtering are presented in a specific order, because the interface of the news feed shows posts in a sequence from top to bottom. This order is also determined by algorithms, where posts are artificially organized in a specific sequence based on how relevant they are for a specific user. This process is also referred to as “ranking” (Rader & Gray, 2015, p.173). The most relevant posts for a user are promoted and can be found at the top of the news feed. Consequently, posts that are less relevant are demoted and shown lower on the news feed. Posts with the least relevance are filtered out. The news feed on Facebook could therefore function as a filter bubble, created by algorithms that curate the news feed by filtering and promoting or demoting posts based on relevance.
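The filtering and ranking described above can be sketched as follows. The relevance scores are randomly faked here, because the real scoring is exactly what Facebook does not disclose; only the shape of the process (rank ~1500 candidates, show the top ~300) comes from the figures cited above:

```python
# Hedged sketch of news feed curation: ~1500 candidate posts are ranked by a
# relevance score and only the top 300 are shown. The scores are randomly
# faked; the real scoring is undisclosed.

import random

random.seed(42)  # reproducible fake data
candidates = [{"id": i, "relevance": random.random()} for i in range(1500)]

ranked = sorted(candidates, key=lambda p: p["relevance"], reverse=True)
news_feed = ranked[:300]   # shown, most relevant at the top (promotion/demotion)
filtered = ranked[300:]    # never shown

print(len(news_feed), len(filtered))   # 300 1200
print(len(filtered) / len(candidates)) # 0.8, i.e. the ~80% filtered
```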

Image 2.1: An example of an ordinary news feed.


2.3.2 Curation of the Facebook news feed

Each user on Facebook can have different subjective preferences regarding the relevance of posts. The question can be asked, how do the Facebook algorithms know what posts are relevant to which user in order to curate the news feed?

In order to assess the relevance of posts for each individual user it is assumed that a “relevance score” for each separate post is calculated by the algorithms (Bucher, 2012, p.1167). Using these relevance scores the algorithms promote posts with high relevance and demote or filter posts with low relevance (Bucher, 2012, p.1169). A post with high relevance is shown high on the news feed, and vice versa.

However, it is unclear how such a relevance score for each post is exactly calculated. The details of the current algorithms on Facebook are not disclosed. There is theory on a previous iteration of the algorithms that Facebook used (called “EdgeRank”), which describes that there are at least three general components that Facebook can use to determine the relevance scores of posts: time decay, weight and affinity (Bucher, 2012, p.1167). The relevance score is calculated for each post by multiplying the values of these components. Image 2.2 contains an example of how the formula could be written.

Image 2.2: The original EdgeRank formula (Bucher, 2012, p.1168)

Time decay

One of the components that supposedly determines whether posts on the news feed are relevant is time decay. This component is based on the age of a post, and the way it works is indicated by the name: the older a post is, the less relevant it becomes. The relevance of posts decays with the passing of time. More recent posts are assumed to be more relevant than older posts (Bucher, 2012, p.1167).

Weight


A user can like a post with the click of the “like” button. Liking can be considered the lowest level of interaction. It requires the least amount of cognitive and physical effort.

A user can also comment on a post. Commenting can be considered the intermediate level of interaction. Compared to liking it requires more cognitive and physical effort because it requires additional actions on the platform (Kim & Yang, 2017, p.441), such as typing messages.

Finally, a user can also share a post. Sharing can be considered the highest level of interaction. Sharing a post causes this post to appear on the personal page of the user that shared it, whereas liking and commenting on a post does not have this effect. This implies that sharing posts could become a part of strategic behavior for self-presentation (Kim & Yang, 2017, p.442). For example, when information is disclosed by a user itself there is little warrant that there is a credible connection between the claim of self-presentation and the actual self (Walther, van der Heide, Hamel & Shulman, 2009, p.232). However, the information contained in a shared post from another user is not self-disclosed, nor editable. A shared post that appears on a user’s personal page could thus be considered a form of “testimonial” rather than a self-disclosure, which is more valuable in terms of warranting the presented information (Walther et al., 2009, p.232). It becomes important to consider what information is shared. Additionally, it is possible to add a comment to a shared post. The combination of an increased commitment to assess the value of a post regarding the self, and the possibility of adding a comment to this post, makes sharing the most cognitively and physically demanding of the three mentioned types of user interaction (Kim & Yang, 2017, p.443).

Overall, a post that has received little interaction arguably has less interactional weight than a post that has received a lot of interaction, which explains the use of the term “weight”. Based on the mentioned levels of interaction, sharing has more interactional weight than both comments and likes, and comments have more interactional weight than likes. It is assumed that an increased interactional weight for a post increases the relevance of that post (Bucher, 2012, p.1167-1168).

Affinity

The last mentioned component that supposedly determines the relevance score of a post is called affinity. In everyday lives people have a preference for all kinds of different things and different people, and this also applies on Facebook. At least two different types of affinity on the Facebook news feed can be distinguished:

Firstly, just as in everyday life, a user on Facebook can have more or less affinity with specific other users (e.g. more affinity with close family members and less affinity with distant friends), and the algorithms take this into account when calculating relevance scores. Concretely, the algorithms assume that certain users are more important than others (Bucher, 2012, p.1168), and this importance is based on affinity. This is exemplified by a press statement about the news feed, which states that the algorithms will prioritize posts on the news feed from users that “you care about” [7]. Whether a user has more or less affinity with another user is determined by the relationship with that other user (Bucher, 2012, p.1167). It is assumed that more affinity with a user leads to more relevance for posts from that user.

Secondly, a user can also have more or less affinity with different types of information (e.g. favoring news about guns and disfavoring news about healthcare). Another Facebook press statement states that the algorithms will prioritize posts on the news feed with information that aligns with information that has been

7. https://newsroom.fb.com/news/2016/06/news-feed-fyi-helping-make-sure-you-dont-miss-stories-from-friends/


clicked on or interacted with previously, to make the news feed more informative.[8] Therefore, posts containing a type of information that has been interacted with in the past become more relevant, because the affinity with that information is increased.
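Taken together, the components discussed in this section are commonly summarized in the publicly reported EdgeRank formulation as a sum over the "edges" (interactions) attached to a post. The formula below is a reconstruction of that public description, not Facebook's actual current algorithm:

```latex
\text{relevance}(\text{post}) \;=\; \sum_{e \,\in\, \text{edges}} u_e \cdot w_e \cdot d_e
```

where \(u_e\) is the affinity between the viewing user and the creator of edge \(e\), \(w_e\) is the interactional weight of the edge type, and \(d_e\) is a time-decay factor.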

2.4 Conceptualizing possible mechanisms of algorithmic curation on Facebook

The previous section elaborates on what theoretical components the algorithms on Facebook could use to curate posts on the news feed based on relevance. However, there is no evidence to support that these components are actually used by Facebook to curate the news feed. To create a better understanding of a possible filter bubble on Facebook, the question can be asked what some of the concrete, basic mechanisms could be that Facebook uses to curate posts on the news feed. The mentioned theoretical components provide the opportunity to start conceptualizing some of these basic mechanisms.

2.4.1 Liking

The weight component suggests that Facebook could use different types of interactions to determine whether posts are relevant and curate the news feed according to interactional weight. It is safe to assume that any type of interaction with a post increases the interactional weight of that post, consequently increasing the relevance of that post. Based on the mentioned levels of interaction, a like will probably yield the least weight and thus the least relevance, whereas a share will yield the most weight and thus the most relevance. Therefore a like also seems less important than a comment and a share.

However, based on the level of effort it takes to perform a like, comment or share, it is far more probable that users like posts than that they comment on or share posts. In 2015 the like function was a major source of data for Facebook, with an average of more than 4 million likes generated per minute,[9] the equivalent of 6 billion likes per day. For comparison purposes, nowadays each minute 510,000 comments are posted (734.4 million per day).[10] In contrast to the seemingly little importance of a like, these usage numbers could mean that the like function is relatively important, despite it being physically and cognitively the least demanding interaction (Kim & Yang, 2017, p.441). Could it be possible that Facebook uses likes to curate the news feed?

2.4.2 Individual affinity

The press releases from Facebook suggest that affinity plays a role in curating the news feed. Facebook could use different types of affinity to determine whether posts are relevant and curate the news feed. Affinity is subjective, and a user can supposedly have more or less affinity with other users, or have more or less affinity with different types of information.

Every time a post from a user is interacted with, the affinity with that user increases (Bucher, 2012, p.1169). Because the affinity with that other user increases, the posts from that other user become more relevant. It could be possible that Facebook uses such affinity with other users, based on past interaction with posts, to curate the news feed.


Additionally, posts can contain a specific type of information. Interacting with a post could also lead to an increased affinity with the type of information contained within that post, and this increased affinity could make posts with that type of information more relevant. It could therefore be possible that Facebook also uses affinity with information based on such past interaction to curate the news feed.

2.4.3 Symmetrical affinity

The affinity with another user is based on the “relationship” with that other user. A relationship logically refers to two or more “things” being connected in a specific state. The relationship between two users on Facebook can be considered symmetrical in its basic form based on the bilateral fundaments of that relationship: it is established by a friend request which must be accepted by the other, and this relationship can be broken from both sides (Li & Sun, 2014, p.1271).

If the relationship between two users can be considered symmetrical, then it could be possible that Facebook also considers the affinity between users symmetrical. In that case, interacting with posts from another user could not only increase the affinity with the user that created those posts, but could increase the affinity between the users symmetrically. This implies that the effect of interacting with posts from another user could extend beyond the individual, because it could also affect the curation of the news feed of the user whose posts are interacted with. If so, that would mean that the algorithms curate the news feed on a group level rather than the individual level. The question can be asked: does Facebook treat this affinity symmetrically when curating the news feed?
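The distinction between individual and symmetrical affinity can be made concrete in a small sketch. The data structure and the increment of 1.0 per like are illustrative assumptions, not Facebook's actual mechanism:

```python
# Sketch contrasting the two readings of affinity. An affinity table maps
# (viewer, author) pairs to a score; liking a post raises affinity either
# one-way (individual affinity, RQ2) or both ways (symmetrical affinity, RQ4).
from collections import defaultdict

def like_asymmetric(affinity, viewer, author):
    # Only the viewer's affinity with the author grows.
    affinity[(viewer, author)] += 1.0

def like_symmetric(affinity, viewer, author):
    # The tie grows in both directions: the viewer's like also changes
    # how the viewer's own posts rank on the author's news feed.
    affinity[(viewer, author)] += 1.0
    affinity[(author, viewer)] += 1.0

affinity = defaultdict(float)
like_symmetric(affinity, "C", "B")  # C likes a post from B
print(affinity[("C", "B")], affinity[("B", "C")])  # 1.0 1.0
```

Under the symmetric reading, B's news feed would now also rank C's posts higher, even though B performed no interaction.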

2.4.4 Algorithmic bias

It is also claimed that Facebook can act independently from its users to curate the news feed, based on bias embedded in the system.[11] This means that the algorithms on Facebook could independently "steer" users towards specific groups on the platform. This claim can be explained in the context in which it is made: the online political debate in the United States.

The online political debate in the US is one of the most typical examples of the creation and polarization of subgroups on the internet (Chan & Fu, 2017, p.267; Bozdag & van den Hoven, 2015, p.263). There are two major subgroups in this context: conservatives and liberals. On Facebook, the groups of conservative users and liberal users in the United States represent the two biggest groups that selectively choose to consume political information that is in line with their own views (Mitchell, Gottfried, Kiley & Matsa, 2014, p.7). In these groups the conservative users are mostly connected to other conservative users and see mostly conservative posts on the news feed. Liberal users are mostly connected to other liberal users and see mostly liberal posts on the news feed. It is in this context that the claim is made that Facebook’s algorithms are responsible for placing users in these specific groups based on bias embedded in the algorithms. This has become an extensive debate in the United States. For example, it is claimed that Facebook is biased to hide conservative posts in favor of liberal posts on the news feed, independent from the user.[12] Facebook is essentially blamed for placing its users in a subgroup where mostly liberal posts are shown, without knowledge or consent from its users. Such an algorithmic bias can arguably be extended to any context and is capable of placing users in all kinds of different subgroups by curating the news feed based on criteria that are chosen by the algorithm, rather than by the user.

11. https://www.nbcnews.com/tech/social-media/gop-accuses-facebook-censorship-conservative-media-flourishes-online-n865276
12. https://www.lifesitenews.com/news/facebook-uses-its-heft-to-support-liberal-causes-while-suppressing-conserva


However, Bakshy, Messing & Adamic (2015, p.1130-1133) argue that the algorithms on Facebook only supplement the individual choices of the user. The most important factors that determine how the algorithms curate the news feed are the individual connections to other users, and the previous interaction with posts that contain specific information. The users are responsible for placing themselves into groups, and the curation done by the algorithm itself can be considered negligible. An et al. (2014, p.13) also argue that on Facebook, it is a deliberate choice by users themselves to only consume preferred information.

But the results from the study by Bakshy et al. (2015) also clearly show that the algorithms filter and order posts on the Facebook news feed before they are shown to the user. Therefore the algorithms are capable of independently curating the news feed, regardless of the claims that they do not act as such. When posts are curated before they are shown, that user is arguably presented with a limited choice to begin with, amplifying the creation of feedback loops (Sandvig, 2015). Within the context of political polarization in the US, it is argued that the algorithms on Facebook have a slight preference for conservative posts over liberal posts (Pariser, 2015). This could hint at a slight algorithmic bias existing for conservative posts over liberal posts, placing the users in a subgroup where conservative posts are deemed more relevant than liberal posts. This raises the question, does Facebook independently curate the news feed? For example, considering a new, politically unaffiliated user on Facebook, could it be possible that Facebook independently makes conservative posts more relevant than liberal posts?

2.5 Research questions

This thesis aims to acquire more insight about the notion of a filter bubble on Facebook by exploring the main question “does Facebook create an individual filter bubble?”.

The place where such an individual filter bubble on Facebook is potentially located is the news feed. Facebook uses algorithms to curate posts on the news feed for each individual user, based on relevance. To explore the main question, several mechanisms are conceptualized that Facebook could use to curate posts on the news feed based on relevance, which are derived from theoretical components of an earlier Facebook algorithm called "EdgeRank". Facebook could use likes to determine how the news feed should be curated.

Facebook could also use affinity to curate the news feed, where a distinction is made between affinity with other users and affinity with specific information. It could also be possible that Facebook treats the affinity between users symmetrically when curating the news feed. Additionally it is possible that Facebook acts independently from the user when curating the news feed. This leads to the following research questions:

RQ1: Does Facebook use likes to curate the news feed?

RQ2: Does Facebook use affinity with other users to curate the news feed?

RQ3: Does Facebook use affinity with information to curate the news feed?

RQ4: Does Facebook treat affinity symmetrically between users?

RQ5: Does Facebook independently curate the news feed?



3. Overview of the experiments

In order to investigate the research questions a sequence of three experimental studies was conducted.

Experiment 1:

In the first experiment it is investigated if Facebook uses likes to curate the news feed (RQ1). This experiment involved setting up a single line network of four different users on Facebook. In this network it was observed whether and how posts were propagated in the network if these posts were liked by different users. This experiment had two additional purposes: it was examined whether liking was a suitable interaction that could be used to manipulate relevance in subsequent experiments, and additionally it was examined what type of content could be used in subsequent experiments, by investigating whether there is a difference in propagating a post containing text or a post containing a picture.

Experiment 2:

In the second experiment it is investigated if Facebook uses affinity with users to curate the news feed (RQ2), if Facebook uses affinity with information to curate the news feed (RQ3), whether Facebook treats affinity symmetrically (RQ4) and whether Facebook independently curates the news feed (RQ5). This experiment involved setting up multiple networks of three interconnected users on Facebook, in which the different research questions were progressively addressed in a specific sequence. During the experiment posts were added to the network, and the relevance of posts in the network was manipulated by liking specific posts.

Multiple hypotheses were tested, based on the culmination of manipulations in the network. It was investigated how Facebook curates the news feed by looking at how Facebook filters posts on the news feed, and how Facebook promotes/demotes posts on the news feed.

Experiment 3:

In the third experiment it is investigated if Facebook independently curates the news feed (RQ5). For this experiment a single line network of three users was created. No interaction between the users took place.

One of the users added solely liberal posts to the network, whereas another user added solely conservative posts to the network. On the central user it was examined how Facebook orders the posts from these users on the news feed. It was tested if Facebook independently curates the news feed by showing conservative posts higher on the news feed than liberal posts, based on their position on the news feed. 



3.1 Common methods used in all experiments

For all experiments a series of methods were used that overlap. To avoid repetition in chapters 4-6, this section discusses these common methods. Detailed instructions for these methods can be found in Appendix A.

The access to Facebook

Desktop computers at the University of Groningen were used in the experiments. An "F-account" on Windows was used to operate multiple computers simultaneously. Windows Internet Explorer was used as the web browser to access Facebook, set to InPrivate mode. An example of the research environment can be found in Image 3.1.

The creation of users on Facebook

In an attempt to examine the Facebook news feed from a minimally biased perspective, new Facebook users were created. However, Facebook does not allow a physical person to have more than one genuine user account. Therefore artificial Facebook users were created. The creation of such artificial users proved to be problematic for several reasons. For example, Facebook blocked users due to spam detection, suspicious activity and time delay between login attempts. This had a profound effect on how the experiments could be conducted because it dictated the number of usable accounts. It also meant that the experiments had to be conducted in a short timeframe, otherwise the users would be blocked from access. This limitation is elaborated on in the discussion.

Overall, a total of 16 new artificial users could be created with the help of a fake name generator[13] and the e-mail services of Yandex and Gmail for verification purposes. Each of the artificial Facebook users was only used on one designated computer at the university.

Adding posts to a network

In each experiment posts were added to the newly created Facebook networks. To add posts to a network, they were posted on the personal page of a user. An example of a personal page is shown in Image 3.2. The area where the posts were posted on a personal page is highlighted in red.

Image 3.1: An example of the research environment.


Liking posts

In some of the networks manipulations took place by liking posts. Posts are liked on the personal page of a user. If user A needs to like a post from user B, user A navigates to user B’s personal page and likes the designated post by clicking the like button underneath the post.

Capturing news feeds

During the experiments news feeds of users were “captured” by making screenshots. A program called

“Greenshot” was used to make screenshots. The news feed shows a sequential list of posts that are 14 selected by the algorithms, but not all posts are visible in one screenshot. The news feed was scrolled down whilst making screenshots of all posts. A captured news feed of a user therefore consists out of a collection of screenshots. An example of a screenshot with posts on the news feed is shown in Image 3.3.


14. https://getgreenshot.org/

Image 3.2: An example of a personal page. The highlighted area in red shows where the posts were posted to the news feed.


Image 3.3: An example of a screenshot of a news feed.


4. Experiment 1

In this first experiment it is investigated if Facebook uses likes to curate the news feed (RQ1). To examine whether Facebook uses likes to curate the news feed, a new network was created in which posts were liked by different users to see how these posts are propagated in this network. Overall this experiment has two additional purposes. First, it is investigated if liking is an interaction that Facebook uses to determine whether posts are relevant. If that is the case then liking could be used as a form of interaction in subsequent experiments to manipulate the relevance of posts. Second, it is investigated what type of content could be used in subsequent experiments, by examining whether there is a difference in propagating a liked post that only contains text, or a liked post that only contains a photo.

4.1 Methods

4.1.1 The network setup on Facebook

For this experiment a single line network of four new users was created on Facebook. A line network provides the opportunity to investigate how posts could be propagated to other users that have no direct connection to the user that added the posts to the network. The users in the network were called users A, B, C and D, and the order of connection was A→B, B→C, C→D. Image 4.1 contains an illustration of the network structure used for this experiment.

4.1.2 Procedure

The experiment was conducted in a sequence of 3 steps. A step-by-step protocol for this experiment can be found in Appendix B. The procedure was as follows:

Step 1:

User A adds new posts to the network.

The news feed is captured on all users.

Step 2:

User B likes the posts.

The news feed is captured on all users.

Step 3

User C likes the posts.

The news feed is captured on all users.


Image 4.1: An illustration of the network structure used in experiment 1.


4.1.3 Data collection & analysis

After each step the news feed was captured on all users (a link to the screenshots can be found in Appendix C). It was observed in the screenshots whether the posts that were added to the network were visible for each user in the network after each step. A table was created for this observation. If a post was visible on the news feed of a user after a specific step this was noted with a “+” sign. If a post was not visible the cell in the table was left blank. The created table for this observation is presented as a result.

4.1.4 Material

Only two posts were prepared for this experiment: one containing a picture of static, and one containing a string of text which says “Hallo!” (see Appendix B).

4.2 Results

Table 4.1 contains the result of the observation of the first experiment.

In step 1, user A added the two different posts to the network. After these posts were added, both posts were visible on user A and user B. Neither of the posts were visible on user C and user D.

In step 2, user B liked the posts. Both posts remained visible for user A and user B, but both posts became visible on user C. The posts remained invisible for user D.

In step 3, user C liked the posts. This caused the posts to remain visible for user A and user B, and now both posts also became visible on user D. However, both posts became invisible for user C.

No differences in propagation were observed between the post with a picture and the post with a text. Both types of content were treated equally.

Reading guide: after user A added the post with the picture and the post with the text to the network in step 1, both posts were visible for user A and user B. Both posts were invisible for user C and user D.

Table 4.1: Results of observing visibility of the text and picture post on all users in each step in experiment 1.

Step  Post     User A  User B  User C  User D
1     Picture  +       +
      Text     +       +
2     Picture  +       +       +
      Text     +       +       +
3     Picture  +       +               +
      Text     +       +               +


4.3 Summary of results

The results show that adding a post to the network seems to propagate this post to the news feeds of direct connections, but not to the news feeds of distant connections. The post becomes visible on 2 news feeds. When this post is liked by a near connection (in this case user B), the post is propagated to a distant connection from the source (in this case user C), increasing the total number of news feeds on which the post is visible from 2 to 3. When this distant connection then likes the post, the post is propagated to an even further distant connection (in this case user D). Within the context of the examined network, liking a post does appear to make a post more relevant, because it is propagated through the network when it is liked. There was no difference between propagating the post with a text and the post with a picture. Both types of content could therefore be used in subsequent experiments. Because liking increased the relevance of the liked posts by propagating these posts in the network, liking can also be used as an interaction to make posts more relevant in subsequent experiments.
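The propagation rule suggested by these results can be expressed as a small descriptive simulation. It models only the observed reach extension (one hop beyond the author and beyond each liker); it is not Facebook's actual mechanism, and it does not capture the unexpected disappearance of the posts from user C's own feed in step 3.

```python
# Descriptive model of propagation in the line network A-B-C-D from
# experiment 1: a post reaches the author's direct connections, and each
# like extends its reach one hop past the liker.
LINE = ["A", "B", "C", "D"]

def visible_to(author, likers):
    reach = set()
    for i, user in enumerate(LINE):
        if user == author or user in likers:
            reach.add(user)
            if i > 0:                 # neighbour on one side
                reach.add(LINE[i - 1])
            if i < len(LINE) - 1:     # neighbour on the other side
                reach.add(LINE[i + 1])
    return reach

print(sorted(visible_to("A", set())))       # step 1: ['A', 'B']
print(sorted(visible_to("A", {"B"})))       # step 2: ['A', 'B', 'C']
print(sorted(visible_to("A", {"B", "C"})))  # step 3 reach: ['A', 'B', 'C', 'D']
```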



5. Experiment 2

In the second experiment it is investigated whether Facebook uses affinity with other users to curate the news feed (RQ2), and subsequently whether Facebook uses affinity with information to curate the news feed (RQ3), whether Facebook treats affinity symmetrically (RQ4) and whether Facebook independently curates the news feed (RQ5). To examine these questions several new triadic networks were created in which posts were added to the network in randomized sequences. At different moments during the experiment affinity was manipulated by liking posts. The news feed was examined on different users. It was investigated how Facebook filters or promotes/demotes posts on the news feed. The experiment was designed in such a way that allowed different research questions to be investigated progressively, without bias from previous manipulations.

5.1 Methods

5.1.1 The network setup on Facebook

Multiple networks with three new interconnected users (closed triads) were created on Facebook to investigate the research questions. A triad was used because it provides the opportunity to view the news feed from different perspectives in a network, and this allows multiple research questions to be addressed within the same network. It is also the smallest network structure that makes it possible to differentiate between having affinity and not having affinity with another user. For this experiment the aim was to create 10 different networks, but only 3 different networks could be successfully created due to difficulties encountered when making new user accounts on Facebook. The users within the created networks were called users A, B and C, and they were connected to each other randomly. Image 5.1 contains an illustration of the network structure used in this experiment.

5.1.2 Manipulating affinity

To study RQ2, RQ3 and RQ4 the affinity was manipulated for this experiment. Both affinity with other users and affinity with information can be increased by a user interacting with posts. Based on the results of experiment 1, liking can be used as an interaction to manipulate the relevance of posts. Therefore the liking of posts was used as an interaction to manipulate affinity. When posts are liked, affinity is increased and subsequently relevance is increased. Affinity is the independent variable that is manipulated in this experiment, and it can be increased by liking posts or remain unaffected by not liking posts.

Image 5.1: An illustration of the network structure used in experiment 2.


5.1.3 Filtering and promotion/demotion of posts

Facebook allegedly curates the news feed by promoting posts with higher relevance, whilst demoting posts with lower relevance, and also filtering the posts with the least relevance. In this experiment it is investigated how Facebook curates the news feed by looking at how Facebook filters posts, and by looking at how Facebook promotes or demotes posts. The filtering and the promotion/demotion of posts are the two dependent variables in this experiment.
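These two dependent variables can be illustrated with a minimal sketch, in which the relevance scores and the filtering threshold are purely hypothetical:

```python
# Toy curation step: posts below an assumed relevance threshold are
# filtered out entirely; the remaining posts are ordered by descending
# relevance, i.e. promoted or demoted relative to each other.
def curate(posts, threshold=0.2):
    """posts: list of (post_id, relevance) pairs. Returns the ordered feed."""
    shown = [(pid, rel) for pid, rel in posts if rel >= threshold]  # filtering
    shown.sort(key=lambda p: p[1], reverse=True)  # promotion/demotion
    return [pid for pid, _ in shown]

print(curate([("p1", 0.9), ("p2", 0.1), ("p3", 0.5)]))
# ['p1', 'p3'] -- p2 is filtered; p1 is promoted above p3
```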

5.1.4 Procedure & hypotheses

The experiment was conducted in a sequence of 12 steps. These steps were identical for the three networks in this experiment. Within these steps multiple hypotheses were formulated and tested based on the culmination of performed manipulations in the network. These hypotheses are explained within the procedure.

A visualization is made of each hypothesis to aid the interpretation. Within this visualization the user whose news feed is examined is colored green. The users whose posts, or the posts with specific information, are compared for the hypothesis are colored purple. If a manipulation of affinity took place, this is indicated by an arrow. The expectation is indicated by using a "+", "-" or "=" sign. The procedure of the experiment and the hypotheses based on the procedure were as follows:

Step 1:

All users (A, B and C) add new posts with random text to the network in a randomized sequence.

Step 2:

User C likes all posts from user B.

Step 3:

All users (A, B and C) add new posts with random text to the network in a randomized sequence.

After step 3 the news feed is captured on user C.

At this point, user C only liked posts from user B. Therefore user C increased the affinity with user B. User A is left out of any interaction. If Facebook uses affinity with other users to curate the news feed (RQ2), then the posts from user B should be more relevant, thus promoted more than posts from user A on the news feed of user C. H1 is visualized in Image 5.2.

H1: If a user (C) likes posts from a particular other user (B), the posts from B are promoted more than the posts from another user whose posts are not liked (A), on the news feed of C.

Image 5.2: Visualization of H1.


Step 4:

User C likes all posts from user B.

After step 4 the news feed is captured on user B.

At this point, user C has liked posts from user B. Therefore user C increased affinity with user B. User A is left out of any interaction. If Facebook treats affinity symmetrically and uses it to curate the news feed (RQ4), then the posts from user C should be more relevant, thus promoted more than posts from user A on the news feed of user B. H2 is visualized in Image 5.3.

H2: If a user (C) likes posts from a particular other user (B), the posts from C are promoted more than the posts from another user that did not like any posts (A), on the news feed of B.

Step 5:

User B likes all posts from user C.

Step 6:

All users (A, B and C) add new posts with random text to the network in a randomized sequence.

After step 6 the news feed is first captured on user B.

At this point user B has liked posts from user C. Therefore user B increased the affinity with user C. User A is left out of any interaction. If Facebook uses affinity with other users to curate the news feed (RQ2), then the posts from user C should be more relevant, thus promoted more than posts from user A on the news feed of user B. H3 is visualized in Image 5.4.

H3: If a user (B) likes posts from a particular other user (C), the posts from C are promoted more than the posts from another user whose posts are not liked (A), on the news feed of B.

Image 5.3: Visualization of H2.

Image 5.4: Visualization of H3.


Then the news feed is captured on A.

User A has been left out of any interaction. User A has not liked any posts nor received any likes on posts. Therefore user A should have equal affinity with both users B and C. If Facebook uses affinity with other users to curate the news feed (RQ2), then the posts from users B and C should be treated equally relevant. There should be no difference between the promotion/demotion of posts from users B or C on the news feed of user A. H4 is visualized in Image 5.5.

H4: If a user (A) has not liked any posts nor received any likes on its posts, then there is no difference in the promotion of posts from other users (B and C) on the news feed of A.

Step 7:

All users (A, B and C) like all posts in the network, including their own, to normalize the network.

Step 8:

All users (A, B and C) add new posts with random text to the network in a randomized sequence.

Step 9:

All users (A, B and C) like all posts in the network, including their own, again to normalize the network.

Step 10:

Users B and C add new posts with conservative news links and liberal news links to the network in a randomized sequence.

After step 10 the news feed is captured on user A.

At this point the network should be normalized and the affinity should be equal between all users. Only users B and C added posts with conservative news links and liberal news links to the network. User A remains neutral. If Facebook independently curates the news feed (RQ5), then conservative posts should be more relevant than liberal posts. Therefore the posts with conservative news links should be promoted more than posts with liberal news links on the news feed of user A. H5 is visualized in Image 5.6.

H5: Posts with conservative news links are promoted more than posts with liberal news links.

Image 5.5: Visualization of H4.

Image 5.6: Visualization of H5.


Step 11:

User A likes all posts with a conservative news link.

Step 12:

Users B and C add new posts with conservative news links and liberal news links to the network in a randomized sequence.

After step 12 the news feed is captured on user A.

At this point user A only liked posts with conservative news links.

Therefore A increased affinity with conservative news links. If Facebook uses affinity with information to promote posts on the news feed (RQ3), then posts with conservative news links should be promoted more than posts with liberal news links on the news feed of A. H6 is visualized in Image 5.7.

H6: If a user (A) likes posts with a specific type of information (conservative news links), then the posts with conservative news links are promoted more than posts with other types of information (liberal news links), on the news feed of A.


Image 5.7: Visualization of H6.


The steps in the procedure and the hypotheses are summarized in Table 5.1.

A step-by-step protocol for each network can be found in Appendices D, E and F.

5.1.5 Data collection: calculating the filtering and promotion/demotion of posts

For the hypotheses the news feed was captured on different users during the experiment (a link to the screenshots can be found in Appendix G). The dependent variables in this experiment are (1) the filtering of posts and (2) the promotion/demotion of posts. To examine whether posts were filtered and promoted/demoted for each hypothesis, data was collected and calculated for each post in each of the three networks.

For each hypothesis the total amount of data differed, because posts were added to the networks in incremental steps, and the hypotheses were tested at different moments during the experiment. For example, for H1 the networks consisted of 30 posts, whereas for H3 the networks consisted of 45 posts. A different spreadsheet was created for each hypothesis, and in these spreadsheets the data for each network was stored and calculated separately (see Appendix H).


Table 5.1: The procedure and hypotheses for experiment 2

Step  Action                                                                                           Hypothesis tested after step
1     All users (A, B and C) add new posts with random text to the network.
2     C likes all posts from B.
3     All users (A, B and C) add new posts with random text to the network.                            H1
4     C likes all posts from B.                                                                        H2
5     B likes all posts from C.
6     All users (A, B and C) add new posts with random text to the network.                            H3, H4
7     A, B and C like all posts from each other and their own.
8     All users (A, B and C) add new posts with random text to the network.
9     A, B and C like all posts from each other and their own.
10    Users B and C add new posts with conservative news links and liberal news links to the network.  H5
11    A likes all posts with conservative news links.
12    Users B and C add new posts with conservative news links and liberal news links to the network.  H6

Reading guide: As the first step new posts with random text were added to the network by all users. As the second step C likes all posts from B. As the third step new posts with random text are added to the network by all users. Between the third and fourth step H1 is tested.
