Automated Meme Magic: An Exploration into the Implementations and Imaginings of Bots on Reddit


Academic year: 2021

Share "Automated Meme Magic: An Exploration into the Implementations and Imaginings of Bots on Reddits"

Copied!
78
0
0

Bezig met laden.... (Bekijk nu de volledige tekst)

Hele tekst

“Automated Meme Magic: An Exploration into the Implementations and Imaginings of Bots on Reddit”

Table of Contents

Acknowledgments
Abstract
1 Introduction
1.2 Research Questions
1.2.1 Why Reddit
1.2.2 Bots
1.3 Outline
2 Bot Research
2.1 Functional Bots
2.2 Harmful Bots
2.2.1 The Rise of Socialbots
2.2.2 The Rise of Political Bots and Computational Propaganda
2.3 Sockpuppets and Cyborgs
3 Reddit
3.1 Background on Reddit Structure, History, and Culture
3.2 Reddit’s Toxic Technocultures, Gamergate, The Fappening
3.2.1 /r/The_Donald
3.2.2 Reddit Bots
3.3 Independent Research by Redditors
3.3.1 Reddit Transparency
4 Methodology and Findings
4.1 Google Search Findings
4.2 Bot Lists
5 Discussion
5.1 Automation, Technology, Culture, and Economics
5.2 Taxonomy of Bots
5.2.1 Visible Bots
5.2.2 Invisible Bots
5.3 Suggestions
6 Conclusion
Works Cited


Acknowledgments

There are several people I would like to acknowledge and thank for their support,

encouragement, and insights. First, I would like to thank all the professors and students I have had the pleasure of working with over the course of this program. Marc Tuters, my thesis supervisor, who encouraged me to pursue this research. Sal Hagen, for writing various SQL scripts which helped me find interesting data. My mom, for the lifelong love and support she has given me over the years. To my sisters and brother, who encourage me to follow my dreams. To my father, who is no longer with us, for the unconditional pride he had in me. And to Laura, for talking to me every day, for challenging me, loving me, and being patient with me, and for the amount of growth I have experienced because of you.


Abstract

Allegations of bots being used as deceptive, persuasive, manipulative, unseen, networked machines, seeded inside digital environments to control, guide, subvert, or otherwise alter public discourse, are a prevalent topic across new media, political science, human-computer interaction, journalism, computational propaganda, science and technology studies, and many other areas of interest. Recent instances where bots have arisen and caused alarm are typically situated around political elections, but they have also been seen in areas related to cryptocurrencies. The effects of bots are most commonly seen via social networking sites, where they are capable of exploiting homophilic algorithms and directing content toward particular groups of people. Essentially, visibility is a means of shifting normative discourse: maintaining popularity or controlling the circulation of a particular piece of information is susceptible to manipulation. Visibility is also directly linked to profit.

This thesis will attempt to present the history of bot research, classifications for different bots which display specific attributes, and background on the content aggregator site Reddit.com. A first focal point of this thesis will be the use of social bots, political bots, and Search Engine Optimization (SEO) strategies, which entail the use of marketing techniques that seem at odds with Reddit’s behavioral policy. The second focus will revolve around particularities of Reddit and how bots are used on the site.

Discussion points will focus on a taxonomy of the observed bots on Reddit, comments and SEO models, Reddit’s culture and internal governance, and the political, economic, and cultural implications of bot and bot-like activity.

Less research has considered Reddit as a point of investigation. Hopefully, this thesis will act as a stepping stone toward further research on an increasingly prevalent online environment and topic.


1 Introduction

Is it possible that I did not write this thesis? I, Jonathan Murthy? Or, perhaps, is it possible that someone else wrote it? Is it possible that something else wrote it? If you had an infinite number of monkeys typing at an infinite number of typewriters, typing words at random, could they not write this thesis? Could my digital profile be fabricated and used to gain credibility? Could my style of writing be derived from a corpus of previously consumed works around a particular area of interest in order to mimic natural, knowledgeable language? How much would it take to convince you that I am a human being presenting credible information? In thinking about these questions (despite their hyperbolic nature), we can then ask, ‘what is required, technically speaking, to mediate exploitable abstractions between you and me’, and (perhaps more importantly) ‘why would I do this?’ Sowing seeds of doubt about my and this thesis’ authenticity is not what this thesis is about; rather, these ponderings act as an image through which to enter a world of algorithmically mediated communication, automation, and online identities. This is then compounded by the circulation of misinformation, fake news, visibility manipulation, directed marketing, and other issues which concern public discourse around digital media. While there are other factors that can contribute to these same issues, I will be focusing on what are colloquially referred to as ‘bots’ and ‘botnets’.

But what exactly are bots and what are they capable of doing? In computing, a bot is “an autonomous program on a network (especially the Internet) that can interact with computer systems or users, especially one designed to respond or behave like a player in an adventure game” (Google.com). The term ‘bot’ is a shortening of the word robot, derived from the Czech ‘robota’ meaning “forced labor” (Google.com), which seems to be in reference to the programmability of bots (Geiger, 2014). These three notions (that bots interact with both humans and computers, that they are designed to mimic human behavior, and that they can automate tasks) make for a precarious state of affairs regarding what we see online and what information gets circulated (Woolley and Howard, 2016; Howard et al., 2017; Forelle, 2015).

Various industries and institutions recognize this reliance on automated tasks within networked systems, and there seems to be a growing department in every market for the ability to automate data-heavy tasks (Geiger, 2014). This translates into how visibility, virality, and amounts of engagement are transformed into profit by marketing firms under the banner of Search Engine Optimization (SEO) (Heder, 2018). Economics is not the sole reason for the deployment and use of bots, however, as there are social, political, experimental, and other motivations for their use. According to the 2016 Incapsula Bot Traffic Report, bots make up 51.8% of internet traffic, with 22.9% of total traffic classified as “good” bots and 28.9% classified as “bad” bots (www.incapsula.com). But what determines the moral signifier of ‘bad’ or ‘good’?

This thesis will attempt to present, to the best of my knowledge, the ways in which bots are implemented and deployed in online environments. Information regarding bots on Twitter and Facebook will be presented, and I will contribute original research involving bot activity on Reddit. Allegations of bots being used to control and manipulate normative discourse and circulate misinformation are a prevalent topic in many different areas of research and study (Woolley and Howard, 2017). A majority of that research has been focused on Facebook and Twitter, especially when we consider events like the 2016 US presidential election, where Donald Trump relied heavily on an online campaign and the use of social media (politico.com). Other online spaces, such as Reddit, have not been given the same amount of attention despite what seems to be equally dubious activity, and even what many consider to be the largest gathering space of Trump supporters in the subreddit /r/The_Donald (Zannettou, 2017; qz.com; thehill.com). Reddit has also been observed to have a techno-libertarian sentiment, a particular culture of self-governance, a controversial history, and what many consider to be an easily manipulable voting system, all of which make it an interesting point of observation (Massanari, 2015). For these reasons, I have chosen to focus on bot activity on Reddit in order to explore its particularities and bring an area of study to a less publicized site of observation. The rest of this introduction will cover some of the research questions and objectives of this thesis, as well as an outline for the following chapters and sections.

1.2 Research Questions

While investigations into various methods of manipulation on Facebook and Twitter are more prolific, other sites that have equally, if not more, precarious policies and governance systems are overlooked. The research presented in this thesis is investigative and exploratory in nature, first attempting to answer the question of ‘how do bots operate on Reddit as opposed to Facebook and Twitter?’ This breaks down into questions about the capabilities, functionalities, and imaginings of bots in a broad view. From there we will move into the particularities of how Reddit as a site functions both technically and culturally. What is it about Reddit, and bots in general, that allows them to operate in the way that they do? What are the implications of their activity and what can be done to lessen harmful abuse? I will now present brief summaries of the theoretical reasoning for choosing Reddit as a site of observation and bots as an object of investigation. Longer, more in-depth backgrounds will be given on both Reddit and bots in later chapters.

1.2.1 Why Reddit

While Reddit does not seem to receive as much attention as social networking sites such as Facebook and Twitter, it is still the 4th most visited site in the U.S. and the 6th most visited site worldwide, with 58.5% of traffic coming from the U.S., followed by 7.6% from the UK and 6.1% from Canada (www.alexa.com). Reddit is also mired in controversial events, including but not limited to Pizzagate, Gamergate, and The Fappening (thenewyorker.com; Massanari, 2015).

Pizzagate is a conspiracy theory accusing the Clintons of being part of a child sex trafficking operation centered around a Washington D.C. pizzeria by the name of Comet Ping Pong. The theory developed on 4chan’s /pol/ board after leaked emails from the Clinton campaign circulated, and it eventually made its way onto /r/The_Donald in November of 2016, in the lead-up to the US presidential election (Malmgren, 2017; digitalmethods.net). The event culminated when an unidentified man in his late 20s opened fire on the pizzeria; the theory was subsequently revealed to be false (snopes.com). This event demonstrated how the spread of misinformation can have violent implications, and it raised questions about the role anonymous and pseudonymous sites like 4chan and Reddit play in the spread of misinformation. Reddit in particular is considered to have a playful and facetious but scientific and logical sentiment, where information is shared quite virally but can be erroneous, antagonistic, and even unlawful (Massanari, 2015; Milner, 2015).

Gamergate refers to the harassment and alienation of women in the video games industry, which initially began as the harassment of Zoe Quinn but took on a sentiment of misogyny and sexism underneath a growing distrust in video game journalism and the industry as a whole (Massanari, 2015). The Fappening refers to the circulation of private celebrity photos which were stolen from Apple’s iCloud service (Massanari, 2015). Those photos circulated on Reddit with such a high amount of activity that one subreddit moderator described “insane traffic” due to the hack (Massanari, 2015). Both Gamergate and The Fappening will be elaborated on in a later chapter as well, but what these events demonstrate is a proclivity toward spreading information with an ideological identity, one most closely associated with the American alt-right. All three of these events had dedicated subreddits where they were discussed and where information particular to the event was shared. These subreddits have since been banned, but only after a substantial amount of time had passed.

/r/The_Donald is considered to be the largest pro-Trump online community and, along with other subreddits such as /r/bitcoin, has been accused of vote brigading and vote nudging, phenomena where a group of individuals vote in a particular way in order to gain visibility on Reddit’s message board (gizmodo.com; medium.com). Vote brigading is strictly against Reddit’s policy, while vote nudging is a more common practice which exploits how Reddit weighs earlier votes against later votes, giving content which is voted up early a longer lifespan (reddit.com). By abusing this exploit, a piece of content’s lifespan can increase dramatically and garner a great deal of attention.

The /r/The_Donald subreddit has also been at odds with the Reddit administration for some time, in a tenuous relationship where free speech, hate speech, censorship, and authenticity hang in the balance. Many calls for banning /r/The_Donald have circulated, but Steve Huffman (/u/spez), current CEO of Reddit, has been reluctant to levy that punishment despite having banned other problematic subreddits (vox.com). /u/spez has also received a large degree of hate from /r/The_Donald’s community involving censorship, the suspension of accounts, the purposeful downvoting of /r/The_Donald’s content, and the editing of user comments. This last event involved a script which swapped mentions of /u/spez’s account name with the name of the person making the comment. This redirected anger aimed at /u/spez toward the original commenters, and the event left a bad impression on an already tenuous relationship. The Reddit administrative team has changed some of its interface and homepage to include /r/popular alongside the previous /r/all page, in order to satisfy both the calls for removing /r/The_Donald and /r/The_Donald’s claims of censorship. Other filtering options like “best” and “hot” have been implemented for similar reasons, in order to limit or better curate the content that rises to the top of other pages (reddit.com). More on Reddit’s background, history, and structure will be presented in a later chapter.
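The early-vote weighting that vote nudging exploits can be illustrated with a sketch of the “hot” ranking formula from the version of Reddit’s codebase that was open-sourced; this is an approximation for illustration, and the algorithm actually deployed on the site may differ:

```python
from datetime import datetime, timezone
from math import log10

# Epoch used in Reddit's open-sourced ranking code (Dec 8, 2005, 07:46:43 UTC).
REDDIT_EPOCH = datetime(2005, 12, 8, 7, 46, 43, tzinfo=timezone.utc)

def hot(ups: int, downs: int, created: datetime) -> float:
    """Approximate 'hot' rank: the vote score is log-scaled, so the first
    10 votes count as much as the next 90, while submission age contributes
    linearly -- which is why early votes matter so much."""
    score = ups - downs
    order = log10(max(abs(score), 1))
    sign = 1 if score > 0 else -1 if score < 0 else 0
    seconds = (created - REDDIT_EPOCH).total_seconds()
    return round(sign * order + seconds / 45000, 7)
```

Because the vote term is logarithmic while the age term grows linearly, a post needs roughly ten times as many net upvotes to out-rank a comparable post submitted 12.5 hours (45,000 seconds) later. Nudging a post with a handful of coordinated early upvotes therefore buys far more visibility than the same votes cast hours later.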

These examples demonstrate the technical aspects of Reddit’s infrastructure and cultural sentiment: how information circulates and how content can be controlled. Reddit has a history of controversy and seems to be an intermediary space between mainstream media sites and darker corners of the web such as 4chan (thehill.com; Malmgren, 2017). It is important to understand how events like these occur on Reddit, and which functionalities are vulnerable to exploitation. How do Reddit’s ranking algorithms, technical nature, and culture contribute to and influence the way information is shared? This leads me into the next aspect of this thesis: the role of computation and automation (i.e. bots).

1.2.2 Bots

In order to begin exploring the operationalization of bots in digital spaces, a distinction should be made here about what bots are, what they do, what they are capable of, and where they are placed in the discussion over social media, misinformation, automation, data, and networked activity. Bots are capable of performing infrastructural tasks, and these functional bots are helpful in their imaginings. Some of the ways in which bots are used include scraping, crawling, or cleaning websites; capturing and performing web analytics; assisting in customer service (Geiger, 2014); assisting in data research; and many other administrative tasks. But given their utile capacities and the right environment, their abuse seems almost inevitably to follow, especially in cultures where the hacking of systems is a cultural pillar.

This is where the notion of harmful or malicious bots can begin to be explored within the context of Reddit. The use of automated systems on social networking sites seems to be responsible for the purposeful swaying, manipulating, guiding, drowning out, or altering of the public discourse around political elections online (Woolley, 2017). Even functional bots have been seen erring on the side of spreading misinformation, as during the Boston Marathon bombing (Cassa et al., 2013). Those bots were considered benign and were more a product of a faulty architecture, spreading false information about suspects. Similarly, Microsoft’s experimental Twitter chatbot, Tay, after digesting a corpus of tweets from other users, began tweeting hateful, racist, misogynistic, pro-Hitler content (Neff and Nagy, 2016). More deliberate bots attempt to hide their identity in order to astroturf and present themselves as contributing to a normative discussion. Social bots are by definition deceptive, seeming human while in actuality being automated (Ferrara et al., 2014; Boshmaf, 2011). When social bots take on a specifically political sentiment, they are considered political bots. Political bots have been seen to populate online spaces with partisan ideologies governing what kinds of things they post (Howard and Woolley, 2016). Numerous investigations into the influence of political bots have been undertaken in the wake of the U.S. presidential election, as well as in other countries around the world, to gauge the extent to which these environments were infiltrated (Forelle et al., 2017; Howard et al., 2017; Schafter et al., 2017; Woolley, 2017).

Bots are not just capable of circulating misinformation; they are also capable of reinforcing cultural sentiments. As Lawrence Lessig argues in his book Code and Other Laws of Cyberspace, code acts as an infrastructure with the ability to enforce social norms and governmental rules in a way that is omnipresent, omnitemporal, and automated (Lessig, 1999). Because an automated process can run indefinitely, its presence can allow a normative discourse to emerge that is encouraged by that process. We will see examples of this in a later chapter on /r/The_Donald.

1.3 Outline

The following chapters and sections of this thesis will review some of the classifications and methodologies of bot research, namely: functional bots, harmful bots, social bots, political bots, and sockpuppets. This chapter on bot research is informed by publications from accredited academic institutions. The sites of investigation where bots are present include Wikipedia, Twitter, Facebook, and Reddit. The section on sockpuppets will elaborate on SEO strategies and involve various social media accounts which are operated by humans, but where activity is bought and sold.

The following chapter will present background and history on Reddit as well as its infrastructure. This will include aspects of Reddit’s site specificity, content policies, and controversial events. Specifically, research regarding Gamergate, The Fappening, and /r/The_Donald will be presented. Another section will present an ethnographic study of the requesting and creation of bots.

Reddit is considered to have a high degree of internal governance, and has produced non-accredited research regarding bots on the site. Reddit has also released its 2017 Transparency Report, in which it documents many suspicious accounts and the subreddits in which they were most active.

Original research is also presented in a Methodology and Findings chapter, where particular subreddits are scraped in order to identify bot accounts. Google BigQuery is used to perform the data scrape, against a dataset of Reddit posts and comments dating back to 2007. Those accounts are then examined individually in order to gain a sense of the type of activity they partake in, as well as whether there are specific attributes to the communities those bots are found in.
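As a rough sketch of this workflow (the BigQuery table name below is an assumption based on the public Reddit mirrors available at the time, and the username heuristic is only a first pass before manual inspection, since it misses bots with unmarked names such as AutoModerator):

```python
import re

# Hypothetical query against a public Reddit mirror on BigQuery; the
# dataset/table naming is an assumption and may have changed since.
QUERY = """
SELECT author, COUNT(*) AS n_comments
FROM `fh-bigquery.reddit_comments.2017_01`
WHERE subreddit = 'The_Donald'
GROUP BY author
ORDER BY n_comments DESC
"""

# First-pass heuristic: flag usernames that advertise automation by
# ending in "bot" (optionally followed by digits or underscores).
# This under-counts: many bots, like AutoModerator, carry no such marker.
BOT_NAME = re.compile(r"bot[\d_]*$", re.IGNORECASE)

def looks_like_bot(author: str) -> bool:
    """True if the username itself suggests an automated account."""
    return bool(BOT_NAME.search(author))
```

Accounts that survive this filter (in either direction) still need to be inspected by hand, since naming conventions prove neither automation nor its absence.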

The final two sections of this thesis will include a discussion and conclusion, where a classification and taxonomy of the bots found on Reddit will be presented. I will also address some of the more problematic aspects of Reddit’s culture and structure, as well as how SEO and growth-hacking business models encourage the exploitation of digital architectures.

Little research has been done on bots and bot-like activity on Reddit. It is highly encouraged that future research build off what is presented here in order to gain better insight into an increasingly prevalent digital environment and an equally prevalent digital artifact. Due to the subject matter and Reddit as a site of observation, some of the content presented will contain explicit language or subject matter. The names of accounts mentioned have been removed for privacy, unless they are immediately related to a bot account.

2 Bot Research

This section of the thesis will cover a broad range of topics surrounding the research of bots, the methods used for detecting them, and some of the socio-technological and political implications therein. It will draw on a collection of papers published with academic accreditation, in order to present background on the topic as well as to situate the work done in this thesis within the academic community. This is an important aspect of this thesis because independent research will be presented later on.

Some of the earlier methods for detection and classification have proven, over the evolution of these technologies, not to have the same efficacy they had in the early years (Ferrara, 2014). There seems to be what Ferrara et al. refer to as a technological “arms race” between developers, academics, and those who work to detect and prevent the abuse of these systems on one side, and those who wish to abuse those systems on the other (Ferrara, 2014; Boshmaf, 2011). Because of this, methodologies for bot detection have progressed hand in hand with methods for concealing bot activity.

The following section will review some of the literature on researching, classifying, detecting, and observing the effects of bots in different online environments. This will include the distinction between benign or functional bots and the deployment of malicious or harmful bots. From there we will move into research on the specific categorization of socialbots and various methodologies for determining whether certain accounts are in fact automated. From social bots, we will look at the deployment of bots that seem to be related to political elections or politics in general; these are referred to as political bots (Woolley, 2015). The last section of this chapter will focus on accounts that are operated by human users. While this seems at odds with the topic of this thesis, these sockpuppet accounts (accounts which are owned and operated by humans, but produce inauthentic activity) are instrumental for astroturfing campaigns, for concealing the manipulation of visibility, and for encouraging such activity online in what Herder refers to as emerging “black markets” for the buying and selling of likes, comments, upvotes, and other activities. This chapter represents a small portion of the research done on bot-related issues and encourages further investigation.


2.1 Functional Bots

Within the potential of automated software lies a seemingly inherent threat: that automation can not only replace us, but trick us into not knowing who or what we are interacting with. Alan Turing remarked on the capabilities required for an automated piece of software, mediated through communicative technologies, to deceive humans into thinking it too was human, through his eponymous “Turing Test” (Turing, 1950). But while the Turing Test concerned a specific type of imagined technology, Turing is also responsible for using automation to decipher German communications during WWII. Like any technology, automated software agents are tools, much like the systems they inhabit. A system will rarely, if ever, be unexploitable, and the tools which constitute that system may be used against it. The aim here is not to completely vilify the system or the digital entities therein, but rather to present some of the implementations of bots that contribute to and assist in problem solving, information gathering, and system maintenance.

Automation allows a greater number of tasks to be performed at speeds orders of magnitude greater than humans are capable of. This includes assisting data researchers, data scientists, and other data-intensive professions in their work. The affordances and capabilities of bots allow researchers to scrape vast datasets, scanning for particularities in keywords or content, clustering associated entities, and overall allowing for the refinement of seemingly unintelligible data. This allows researchers to perform analysis and visualize information in comprehensible ways; if those same researchers were to look for patterns manually, it would take many times the amount of work an automated script could accomplish.

Bots are also capable of performing administrative tasks for particular websites, providing chat-supported customer service, and aggregating content. The administrative capacity of bots allows us to see the ways certain websites are able to remain functional through the flagging of illicit content, the performing of updates, and the oversight of website operations. Chatbots provide customer service by responding through natural language processing in order to assist customers in matters related to the website. Content aggregators are capable of collecting news stories and related pieces of media for personal or analytical purposes. Looking at how and in what capacities functional bots operate gives us a better sense of the capabilities of automated software and the tasks they are most commonly associated with.

An example of this taking place is how Wikipedia operates (Geiger, 2011). Bots account for a large portion of edits to Wikipedia pages and have even been seen to play a larger role in less developed countries (Niederer and van Dijck, 2010). The extent to which bots perform edits was on full display in late 2006, when the English version of Wikipedia held its third annual election for its Arbitration Committee. An editor bot by the name of AntiVandalBot was nominated to the committee due to its large number of edits, much to the chagrin of the other members (Geiger, 2011). AntiVandalBot’s candidacy was later revoked, but this brought into question the role bots play within these systems. This brings Geiger, in a paper titled “Lives of Bots”, to consider the bot not only as a digital artifact or object, but as an actor as well (Geiger, 2011; Hegelich and Janetzko, 2016). Niederer and van Dijck’s 2010 “Wisdom of the Crowd or Technicity of Content? Wikipedia as a Sociotechnical System” also calls into question the extent to which editors on Wikipedia frame information, as well as what kinds of geographical, political, and economic influences can be derived from using editor bots.

As these technologies become more rooted in the infrastructure of various systems, it becomes imperative to understand how they can be used to exploit those systems. From here, the potentialities of automated software agents become more precarious, and imaginings for the abuse and deployment of malicious or harmful bots begin to take shape.

2.2 Harmful Bots

While the functionalities of bots allow them to contribute to research, information gathering, and the general upkeep of massive repositories and resources like Wikipedia, those same functionalities also allow for a degree of abuse. Even on Wikipedia, where in certain countries the amount of bot traffic is staggeringly larger than that of its human counterparts, the role and influence of these digital entities is questioned (Niederer and van Dijck, 2010). As has been noted, content aggregator bots on Twitter have been responsible for spreading misinformation, and machine-learning chatbots have also exhibited a disposition for society’s more hateful nature (Cassa et al., 2013; Neff and Nagy, 2016). But these are still examples of benign bots producing unintended and potentially harmful consequences due to environmental and functional design. There are also instances where abuse by automated software agents is far more intentional and deliberate. Examples of malicious bots include distributed denial-of-service (DDoS) attacks, botnets used for identity theft, “click fraud, cyber-sabotage,” malware, etc. (Howard and Kollanyi, 2016).

The line between intentional and unintentional abuse becomes blurred when it comes to the classification and implementation of bots, especially considering the ability of bots to mimic human behavior. This is where we enter Boshmaf et al.’s classification of social bots. Social bot is a general term that connotes a piece of software serving the function of interacting with humans who occupy the same online spaces as the bot in some socially related way (friends, followers, etc.). Examples of these are chatbots, but also political bots (Ferrara et al.; Howard and Woolley, 2017). While chatbots can be used by a variety of companies to interact with customers, when accounts on socially networked systems like Twitter and Facebook post under the guise of human authorship, especially in relation to contentious public discourses like politics or economics, nefarious motivations can begin to be surmised. These bots exploit our desire for sociality and familiarity, working in tandem with an algorithmic infrastructure which tends to feed back upon itself. Bots that take part in politically motivated activity have been referred to as political bots (Howard and Woolley, 2017).

In a 2000 paper titled “Mindlessness and Machines: Social Responses to Computers”, Nass and Moon performed a study to see what kinds of social signifiers are projected onto computers and how unconscious that projection is. They observed social signifiers like gender, race, and ingroup/outgroup formation playing a role in the way machines were perceived, and detected a level of “mindlessness” in interactions with computers (Nass and Moon, 2000). When we take bots into this kind of consideration, alongside the ways in which media technologies have changed how information is disseminated (Marwick and Lewis, 2017), the role social and political bots play in an increasingly homophilic and information-saturated environment becomes a much more tenuous one.


2.2.1 The Rise of Socialbots

The prevalence of social media as an arena of discussion in the last decade coincides with social media becoming a space where information is shared. Various actors, including governmental and non-governmental organizations, have made use of social media; in the 2008 US presidential election, the Obama campaign raised 500 million dollars online (Vargas, 2008). This section will cover a particular type of bot that has emerged over that same period, coinciding with the increased usage of social media networks: the socialbot (Boshmaf et al., 2011). In 2010, Ratkiewicz et al. published a paper titled “Detecting and Tracking the Spread of Astroturf Memes in Microblog Streams”. The paper highlights Twitter’s unique, data-rich environment and its susceptibility to political astroturfing (astroturfing being the appearance of spontaneous grassroots behavior that is in fact orchestrated by an individual or organization) (Ratkiewicz et al., 2010). In this paper, Ratkiewicz et al. outline a method for detecting the “truthiness” of a meme, where by meme they refer to a piece of content contained in a tweet (e.g. URL, hashtag, body, etc.). From there, they created network graphs charting the diffusion of those memes. This research is largely influenced by Metaxas and Mustafaraj’s work on Twitter accounts which artificially inflated the occurrences of specific URLs in the 2009 Massachusetts special election. Other work cited includes Grier et al.’s work detecting spam on Twitter, Boyd et al.’s work on manually classifying Twitter accounts, and Benevenuto et al.’s work on a machine learning system that, at the time of publishing, had an 87% accuracy rate for detecting good or bad accounts.

This research showed that bots were instrumental in circulating content on Twitter and delineated a uniform pathology for bot activity (Ratkiewicz et al., 2010). One meme in circulation involved Chris Coons, a Democratic candidate for the U.S. Senate in Delaware, around whom a network of about 10 bot accounts was found. Notably, this was in 2010; we shall see that since then both the number of accounts in bot networks has increased rapidly and the topologies of their activities have become less patterned.

The work done by these studies of astroturfing, spamming, and the ways in which information circulates in various online spaces provides a groundwork for the study of socialbots. There are some key features of the socialbot which allow it to be an effective, albeit insidious, tool to deploy on whatever online social networking site the perpetrator chooses. The primary factor is that it is designed to interact with human users and is meant to conceal its identity as an autonomous agent (Boshmaf et al., 2011). Interactions can include directly posting comments, sharing links, liking content, and any number of other actions on a given social networking site. These interactions can be used to influence human users and to mimic activity in an effort to fabricate grassroots digital activism (Misener, 2011; Boshmaf et al., 2011; Ratkiewicz et al., 2010).

In 2011, Boshmaf et al. at the University of British Columbia in Vancouver articulated the socialbot and socialbot network with particular consideration to social networking sites, or “Online Social Networks” (OSNs), where socialbots are considered most prevalent and dangerous (Boshmaf et al., 2011). In order to test their hypothesis, a network of socialbots was deployed to gauge its effectiveness. They first identified four vulnerabilities in these sites: ineffective CAPTCHAs, fake user accounts and profiles, crawlable social metrics, and exploitable application programming interfaces (APIs) (Boshmaf et al., 2011). Socialbots are defined as “computer programs that control OSN accounts and mimic real users” (Boshmaf et al., 2011). The ideas that socialbots control the systems they emerge in, and that they do so by pretending to be human users, make up a large portion of the current dialectic on how information is presented and shared online.

By revealing the vulnerabilities of OSNs, Boshmaf et al. sought to see how effective a concentrated, “large-scale” attack by a Socialbot Network (SbN) could be (Boshmaf et al., 2011). At the time of the study, deploying bots in order to gauge the effects and depth of penetration, infiltration, or other propagation throughout a social networking site proved to be an effective method for exploring the implications and effects of bot activity; the SbN deployed on Facebook reported an infiltration rate of 80%. But there is also the matter of being able to classify, identify, and detect bots. As methods for bot detection become more complex, primarily through data-driven methodologies including machine learning and network analysis, the deployment of bots too becomes more complex and multifaceted, leading to sustained efforts to develop new methodologies (Ferrara et al., 2014; Boshmaf et al., 2011).

In 2014, researchers at Indiana University developed the BotOrNot? protocol (later named Botometer) in order to gauge the likelihood of a particular account on Twitter being, aptly, a bot or not (Ferrara et al., 2014). This method was built on some of the methods and discussion put forth by Boshmaf and his group, specifically remarking on the interaction with humans, the emulation and manipulation of behavior, as well as their history (Lee et al., 2011; Boshmaf et al., 2011). Botometer breaks the socialbot down into six different classes of attributes to investigate: network, user, friends, timing, content, and sentiment (Ferrara et al., 2014). In order to test this method, Ferrara’s team built a corpus from a 2010 study at Texas A&M University, where a honeypot method was implemented in order to attract and obtain bot accounts on Twitter (Ferrara et al., 2014; Caverlee et al., 2010). The honeypot method employed fake Twitter accounts which post nonsensical tweets that would deter a seemingly normal human from following them (Caverlee et al., 2010). This method also quickly became outdated as socialbots started being able to mimic a greater range of human behavior, including circadian rhythms, and as accounts appeared which were piloted by both automated scripts and human users (Ferrara et al., 2014). These accounts are referred to as cyborgs by Ferrara due to their combination of human and automated components. Ferrara’s team collected the 200 most recent tweets by this previously identified group of bots, and 100 tweets mentioning them, from the Twitter Search API (Ferrara et al., 2014). This resulted in over 2.9 million tweets from 15 thousand manually verified socialbot accounts, along with 10 thousand human accounts (Ferrara et al., 2014).
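To make the feature-class approach concrete, the following is a minimal illustrative sketch, not Ferrara et al.’s actual implementation, of how scores over Botometer’s six feature classes might be combined into a single bot-likelihood score. The feature values, the equal weighting, and the 0.5 threshold are all hypothetical simplifications; Botometer itself uses trained classifiers over many hundreds of underlying features.

```python
# Toy scorer over the six Botometer feature classes. Inputs are assumed to be
# pre-computed scores in [0, 1], where higher means more bot-like.

FEATURE_CLASSES = ("network", "user", "friends", "timing", "content", "sentiment")

def bot_score(features: dict) -> float:
    """Average whichever of the six feature-class scores are supplied."""
    present = [features[k] for k in FEATURE_CLASSES if k in features]
    if not present:
        raise ValueError("no recognised feature classes supplied")
    return sum(present) / len(present)

def classify(features: dict, threshold: float = 0.5) -> str:
    """Label an account 'bot' or 'human' against a (hypothetical) threshold."""
    return "bot" if bot_score(features) >= threshold else "human"
```

For example, an account with highly regular timing and repetitive content, `classify({"timing": 0.9, "content": 0.9})`, would be labelled a bot under this toy scheme, while low scores across the classes yield a human label.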

Some of the more recent efforts at classifying and detecting bots come from various universities, research teams, and other organizations. One current project aimed at bot detection is a small group of collaborating DiscoverText users who have assembled Bot Or Not: A Briefing Book, which acts as a growing repository of shared bot detection work (Shulman et al., 2018). DiscoverText allows for textual analysis of suspicious accounts by human coders, who are briefed with the methodologies found within the Bot or Not: Briefing Book (Shulman et al., 2018). Some of these methodologies are derived from DFRLab, MIT Technology Review, How-To Geek, The New York Times, botcheck.me, and Mashable. The majority of the work done by DiscoverText relies on manually coding suspicious tweets or accounts, with an emphasis on metadata and on linguistic or rhetorical abnormalities (Shulman et al., 2018). From the data generated by the human coders, various data-driven and machine learning methods can be used to create visualizations and perform analysis. Shulman et al. published their findings to datadrivenjournalism.net, remarking on how “bots are not as easy to spot as many may assume” and how continued research is still necessary to reduce the effects of harmful or exploitative algorithms in digitally mediated society (www.datadrivenjournalism.net).

2.2.2 The Rise of Political Bots and Computational Propaganda

The use of socialbots to exploit or abuse vulnerable social media ecosystems and social media users is nowhere more widely discussed and deliberated than in the political realm (Woolley et al., 2016). The implementation of this type of bot with this particular intention is more commonly captured in the term put forth by Samuel Woolley: the political bot. Both Woolley and Howard classify the use of political bots as computational propaganda in order to emphasize the production and circulation of content through digital communicative technologies. Political bots are considered to be “algorithms that operate over social media, written to learn from and mimic real people so as to manipulate public opinion across a diverse range of social media” (Howard and Woolley, 2016). Bots can also hinder the advancement of public policy by creating what seem to be grassroots movements, deploy various ‘bombs’ to occupy search engine spaces or control a piece of content’s visibility (Ratkiewicz et al., 2011b), and polarize political discussion (Conover et al., 2011). While the use and deployment of automated agents to influence the outcomes of political elections is not entirely a new phenomenon (e.g. the 2010 US primaries and 2008 US Presidential election mentioned above), the 2016 US Presidential election, and elections elsewhere around the world, have caused a shift in the research of automated software agents in the political sphere (Shulman et al., 2018).

One example of the use of political bots comes from the first 2016 U.S. Presidential debate, where Kollanyi et al. collected pro-Trump and pro-Clinton hashtags and analyzed the amount of traffic generated by each of those hashtags on Twitter (Kollanyi et al., 2016). This method gathered over 9 million tweets with the purpose of “discerning how bots are being used to amplify political communications” (Kollanyi et al., 2016). Their findings showed that traffic on pro-Trump hashtags nearly doubled that on pro-Clinton hashtags, and that roughly one third of the pro-Trump hashtag traffic came from automated accounts (i.e. bots), compared to about one fifth of the pro-Clinton hashtag traffic (Kollanyi et al., 2016). Their findings also remark on how Twitter appeared to be an overall pro-Trump space rather than a pro-Clinton one in which many human users engaged in the political discussion (Kollanyi et al., 2016).
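The kind of accounting Kollanyi et al. perform can be sketched as follows. This is a simplified illustration, not their actual pipeline: it assumes tweets have already been reduced to (author, hashtag) pairs and that a set of bot-flagged account names is available from a separate detection step.

```python
from collections import Counter

def hashtag_traffic(tweets, bot_accounts):
    """tweets: iterable of (author, hashtag) pairs.
    bot_accounts: set of author names already flagged as automated.
    Returns {hashtag: (total_tweets, share_from_flagged_accounts)}."""
    totals, from_bots = Counter(), Counter()
    for author, tag in tweets:
        totals[tag] += 1
        if author in bot_accounts:
            from_bots[tag] += 1
    return {tag: (totals[tag], from_bots[tag] / totals[tag]) for tag in totals}
```

On a toy sample such as `[("a", "#MAGA"), ("b", "#MAGA"), ("bot1", "#MAGA"), ("c", "#ImWithHer")]` with `{"bot1"}` flagged, the function reports that one third of the #MAGA traffic is automated, the same kind of per-hashtag share Kollanyi et al. report at scale.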

Another research study, conducted in 2017 on the tweeting habits of voters in the state of Michigan, considers the same political and technological implications (Howard et al., 2017). In this study, Howard et al. looked at how extremist, sensationalist, or conspiratorial misinformation spread compared to professionally researched and legitimate political news (Howard et al., 2017). Their findings showed that fake news, or “junk news”, outperformed more credible news sources (Howard et al., 2017). Interestingly enough, the sharing of content from credible news sources reached “its lowest point the day before the election” (Howard et al., 2017). In the UK, Howard and Kollanyi studied the referendum on EU membership using the hashtags #StrongerIn and #Brexit (Howard and Kollanyi, 2016). The results of this study showed that while there was not a distinctly or discernibly high number of automated accounts, those accounts were still used strategically (Howard and Kollanyi, 2016). Specifically, they showed that less than 1% of the accounts tweeting with the hashtags associated with the UK referendum generated roughly “32% of all Twitter traffic about Brexit” (Howard and Kollanyi, 2016). The implementation of political bots is not only a European or American phenomenon, but has been researched in other countries as well. Political bots have been observed influencing the 2014 Japanese general election (Schafer et al., 2017). The methodology used in this research took posting patterns as the primary attribute of its sampling and analysis (Schafer et al., 2017). Some of the findings they point to involve the use of cheap technologies to influence social media and a difficulty in identifying bot accounts using purely statistical analysis (Schafer et al., 2017). One more example of research done on political bots involves tweeting habits around Venezuelan politics (Forelle et al., 2015). Much like the UK example, bots tweeting about Venezuelan politics account for only a small percentage (10%) of overall political communication on Twitter, but are used strategically and by the radical opposition (Forelle et al., 2015).

Howard and Woolley explore some of the contemporary research issues surrounding political bots in their paper “Political Communication, Computational Propaganda, and Autonomous Agents”. This article is derived from papers submitted to the 2016 International Communication Association meetings in Fukuoka, Japan. The burgeoning networked apparatus, the rise of big-data infrastructure, the Internet of Things (IoT), and the growing ubiquity and use of digital media for news and information all contribute to the spread of damaging political communication online (Woolley and Howard, 2016; Cisco, 2014). Some of the ethical deliberations on how to combat the use of political bots and lessen their impact are included in this article as well. Marechal proposes the normalization of state-sponsored auditing at the algorithmic level, while Mittelstadt considers that the “burden of auditing these systems for political bias lies on the shoulders of the platforms themselves” (Marechal, 2016; Mittelstadt, 2016; Woolley & Howard, 2016). Guilbeault sees the emergence of political bots as an opportunity to open discussions about policy and the theoretical implications of both the “innovation and intervention” of digital communicative technologies (Guilbeault, 2016; Woolley & Howard, 2016). Sandvig argues that algorithmic auditing will become an increasingly important area of research in both the social and computer sciences (Sandvig et al., 2016).

2.3 Sockpuppets and Cyborgs

A common thread through all of the studies presented in this chapter thus far is an analytical difficulty in the ways bots are researched, namely that of the human element and the topography of behavior. This last section will present some of the research regarding “black markets” built on marketing techniques that boost engagement metrics, and thus visibility, of various pieces of content on different online platforms (Heder, 2018).

In a paper titled “A Black Market for Upvotes and Likes”, Heder explores the microtask/freelancer website microworkers.com (Heder, 2018). In that paper, Heder looks at listings posted to the site, where payments are made in exchange for likes, comments, upvotes, watching videos, and other microtasks (Heder, 2018). Microtasks are menial tasks performed online, for which microworkers are compensated anywhere between 0.15 USD and 1 USD (Heder, 2018). It should be noted that microworkers.com is not exclusively used for the buying and selling of likes and upvotes (some of the tasks conducted include research surveys, software testing, and data processing); however, Heder shows that 89.7% of microtasks on the site were related to online promotion (Heder, 2018). Heder collected 1,856,316 microtasks and 7,426 campaigns between February 22, 2016 and February 22, 2017, categorized by platform and specific activity. Some of the platform categories include Amazon, Instagram, Alibaba, Reddit, Facebook, Smartphone (iOS and Android), Twitter, and Google, among others. There are also 23 categories of activity, including following, liking/upvoting, sharing, tweeting/retweeting, commenting, signing up, voting, solving CAPTCHAs, and other actions associated with online activity and social media. An example of an actual job posting, where the client requests the creation of 230 Gmail accounts via www.fakenamegenerator.com, can be seen in Image 1.

Image 1: Posting from microworkers.com for generating Gmail Accounts

The total amount the client of this job posting pays in order to generate these accounts is $40.48 (Heder, 2018).

Heder then looked at the totals of these campaigns, including how much money was spent on specific platforms. The Smartphone platform had the most campaigns at 1,102, with 231,892 tasks and a total budget of $27,702.07 (Heder, 2018). Reddit came in at number 7, behind Google+, Twitter, Facebook, YouTube, and others, with 539 campaigns, 77,104 tasks, and a budget of $5,434.15 (Heder, 2018).
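From these totals, the unit economics of this black market can be worked out directly. The division below uses the figures reported above; the rounding and the per-unit framing are mine, not Heder’s.

```python
# Back-of-envelope cost per task, computed from Heder's reported campaign totals.

campaigns = {
    # platform: (total_budget_usd, task_count)
    "Smartphone": (27702.07, 231892),
    "Reddit": (5434.15, 77104),
}

def price_per_task(budget: float, tasks: int) -> float:
    """Average price paid per microtask for a platform's campaigns."""
    return budget / tasks

for platform, (budget, tasks) in campaigns.items():
    print(f"{platform}: ${price_per_task(budget, tasks):.4f} per task")

# The Gmail-account job in Image 1 works out similarly cheap:
print(f"Gmail accounts: ${40.48 / 230:.4f} per account")
```

The result, roughly twelve cents per Smartphone task, seven cents per Reddit task, and under eighteen cents per fabricated Gmail account, underlines how inexpensive manufactured engagement is at this scale.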

Heder uses the term sock puppet to describe the phenomenon of many accounts being owned and operated by one or a few human users. One of the ways in which Heder describes the hypothetical detection of these methods is by looking at how many accounts are logged in or being used from a single server, IP address, or other location-based data.
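That hypothetical detection method can be sketched as a simple grouping exercise. This illustration assumes access to (account, IP) login records, which in practice only the platform itself holds, and the threshold of three accounts per IP is arbitrary; shared IPs (universities, mobile carriers) mean this is a coarse first-pass signal, not proof of sockpuppetry.

```python
from collections import defaultdict

def flag_sockpuppets(login_records, min_accounts=3):
    """login_records: iterable of (account, ip) pairs.
    Returns {ip: set_of_accounts} for every IP used by at least
    `min_accounts` distinct accounts."""
    by_ip = defaultdict(set)
    for account, ip in login_records:
        by_ip[ip].add(account)
    return {ip: accts for ip, accts in by_ip.items() if len(accts) >= min_accounts}
```

A platform-side analyst would then treat each flagged IP's account cluster as a candidate sockpuppet group and inspect its posting and voting behavior.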

3 Reddit

This section of the thesis will cover some of the academic contributions as well as some independent research done on Reddit.com. One of the reasons for choosing Reddit has to do with what seems to be a lack of research done (or at least attention given) regarding bots on Reddit, as well as Reddit’s culture, politics, or technicity (Massanari, 2015). It should be noted that while the literature regarding bots on Reddit is smaller in comparison to that on Twitter and Facebook, there have been some papers dedicated to Reddit as a primary site of observation, involving Gamergate, The Fappening, the 2016 U.S. presidential election, and the requesting and creation of bots on Reddit. It is unclear why bots on Reddit are not a more prevalent topic in the literature; perhaps some of it has to do with the technical difficulties of studying bots generally, and with particular aspects of Reddit which make it a difficult platform to study. One such aspect is referred to as ‘vote fuzzing’, which obfuscates the actual number of votes from the ones depicted (Reddit.com). This is done to prevent spam bots, but it also prevents the observation of voting habits. Despite these reflections, there is indeed research done on Reddit, specifically regarding the rise of the alt-right, Pizzagate, Gamergate, toxic technocultures, and the circulation and spread of memes, images, and sentiments delineating an anti-feminist, anti-globalist, and technocratic worldview (Massanari, 2015). As we shall see later, Reddit has a system of self-governance which can result in slow or nonexistent administrative action, but which also allows Reddit users (redditors) to act on their own.

3.1 Background on Reddit Structure, History, and Culture

Reddit was founded in 2005 by Steve Huffman and Alexis Ohanian. In 2006, Reddit was sold to Conde Nast Publications, and in 2011 it became a subsidiary of Conde Nast’s parent company, Advance Publications. Reddit later became independently operated, although Advance Publications remained the primary shareholder.

Reddit is a message-board-styled content aggregator where users are able to post links in the form of photos, videos, web pages, etc. from other content-hosting sites on the Internet. Users are also able to post plain text posts, which are considered the only content originally hosted or produced by Reddit. Users are able to vote a particular piece of content up or down in order to show favor, support, agreement, etc. Users can then comment on the original post or reply to other users’ comments, where a parent comment can branch into different threads of replies. Users are able to form communities around particular areas of interest in the form of a subreddit. As of November 17, 2017, Reddit had 1,179,342 subreddits (statista.com), and as of February 2018, 542 million monthly visitors and 234 million unique users (alexa.com).

In the introduction to her 2015 book Participatory Culture, Community, and Play: Learning from Reddit, Adrienne Massanari examines how Reddit functions as a participatory space with particular affordances that make it unique among social networking sites like Facebook and Twitter. One of the primary aspects of Reddit (and, in general, of research around participatory cultures) which Massanari elaborates on involves an inaccurate representation of the democratic potential of online participatory cultures, which researchers in the past have seemed to think too highly of (Massanari, 2015). That research does not properly describe the ways in which space and attention are negotiated in complex and nuanced ways and are simultaneously co-created by a combination of users, designers, administrators, and moderators (Massanari, 2015).

Another aspect which sets Reddit apart from other social networking sites is its account creation and verification. Both Facebook and Twitter aim for their users’ accounts to have some semblance to a real person in real life, whose actions, check-ins, and other methods of engaging the real world are recorded. While Reddit is not completely anonymous (as opposed to 4chan and 8chan), it is pseudonymous: the account and user do not share the same level of verisimilitude. Furthermore, where Facebook and Twitter operate through the network created through socializing (i.e. followers and friends), Reddit does not facilitate the same network of users and accounts; although some form of following is available, it is not the primary metric from which (to borrow a term from Bruno Latour) a network of associations forms. The way information or content flows through the site is also different. Reddit was originally designed as a content aggregator, as opposed to sites which host original content (e.g. video or images), though it also allows for text submissions alongside the aforementioned submission types. Submitted content is then voted upon in order to determine what is considered noteworthy or what should be seen by the larger audience.

Reddit imagines itself as a democratic space, where speech is not suppressed, users have the ability to decide what content rises to the top of an otherwise ubiquitous banality, everyone and anyone can find or build their own community, and everyone has an equal voice. After a piece of content is submitted to Reddit, users are able to upvote or downvote that submission as they see fit. The post is then given a ‘karma’ score, which is an approximation of upvotes minus downvotes. Posts that are voted on approvingly remain on the front page of the subreddit and may make their way onto Reddit’s front page, which is a collection of the top posts from various subreddits. Karma is also perceived to be a form of validation: if enough people voted positively for a post, it must be true, valid, or otherwise good. This ideology is supposed to allow remarkable, unique, or newsworthy content to be pushed to the top of a subreddit’s page, where it will be seen by a greater number of people, while less worthy content is relegated to the unseen. The same is true of comments, although there are filters which allow the user to select ‘controversial’, ‘hot’, ‘new’, or ‘best’ comments. Despite this seemingly democratic meritocracy, where only worthy content and commentary can survive, there have been examples of how the technical affordances of Reddit, alongside its user base and culture, have propagated problematic information, materials, and images.
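The ranking logic behind this front-page dynamic can be illustrated with the ‘hot’ formula from Reddit’s formerly open-source codebase. The constants below, the 45,000-second divisor and the December 2005 epoch offset, come from that published code; Reddit’s current production ranking, complicated further by vote fuzzing, may differ.

```python
from datetime import datetime, timezone
from math import log10

def epoch_seconds(date):
    """Seconds since the Unix epoch for a timezone-aware datetime."""
    return (date - datetime(1970, 1, 1, tzinfo=timezone.utc)).total_seconds()

def hot(ups, downs, date):
    """Reddit's historical 'hot' rank: the log of the net score plus a
    submission-time bonus, so newer posts outrank equally-scored older ones."""
    score = ups - downs
    order = log10(max(abs(score), 1))
    sign = 1 if score > 0 else -1 if score < 0 else 0
    seconds = epoch_seconds(date) - 1134028003  # offset from Reddit's 2005 epoch
    return round(sign * order + seconds / 45000, 7)
```

Two consequences of the formula are worth noting: because the score enters logarithmically, the jump from 10 to 100 net upvotes matters as much as the jump from 100 to 1,000, and because submission time enters linearly, a day-old post needs roughly a hundredfold score advantage to outrank a fresh one.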

In Reddit’s content policy, there are several key aspects which describe allowable content in a legal albeit laissez-faire system of governance. It prohibits content that is illegal, contains involuntary pornography, is “sexual or suggestive” of minors, “incites violence”, “threatens, harasses, bullies or encourages others to do so”, shares personal information, “impersonates someone”, solicits goods and services, or is spam (reddit.com). The prohibited behavior section cites only “asking for votes or engaging in vote manipulation”, “breaking Reddit or doing anything that interferes with normal use of Reddit”, and “creating multiple accounts to evade punishment or avoid restrictions” (reddit.com). Reddit’s enforcement policy includes warnings, “temporary or permanent suspension of accounts”, “restrictions of accounts and communities”, “removal of content”, and the “banning of Reddit communities”. Reddit’s sentiment of self-governance can be seen expressly in the section on moderation within communities, where both rules and enforcement are decided upon by a subreddit’s moderators.

In the following sections, several case studies of controversial events in Reddit’s history will be presented. These events represent instances where Reddit’s policies and culture allowed for the circulation of sensitive information and for emerging toxic ideologies.

3.2 Reddit’s Toxic Technocultures, Gamergate, The Fappening

In a 2015 paper titled “#Gamergate and The Fappening: How Reddit’s Algorithm, Governance, and Culture Support Toxic Technocultures”, Massanari explores how Reddit’s ranking algorithm, its relatively low barrier to entry in terms of account creation, the history of message board sites, geek masculinity and culture, and the way in which Reddit governs its own internal politics contributed to a massive anti-feminist movement (Massanari, 2015). It is also arguable that these methods of harassment became appropriated by online cultures to assist in the harassment of other individuals, primarily through doxxing (the publishing of personal information, which is expressly against Reddit’s content policy). Massanari is also interested in how non-human agents act in these spaces.

There are several components of the Reddit platform which have contributed to the types of cultures emergent on the site. First, the ability to create pseudonymous accounts with little verification encourages a playful demeanor in how users engage with each other. Second, the ability to create subreddits allows niche communities to form. It is from this last point that Massanari considers nerd or geek culture to come into the foreground, along with Reddit’s message board aesthetics reminiscent of The WELL (Whole Earth ‘Lectronic Link) (Massanari, 2015). Aspects of nerd and geek culture, along with ideas of geek masculinity, which are embedded in technological expertise, the valorization of knowledge, and the sharing of information, play a large role in the circulation of information regarding both Gamergate and The Fappening (Massanari, 2015; Bourdieu, 1977; Coleman, 2013). Massanari also cites the work of Lori Kendall (2011) on how nerds view themselves as adept and joyful in their mastery of computer-related activities (Kendall, 2011). This sentiment is then embodied in revenge fantasies, as seen in the 1984 film Revenge of the Nerds, and, as Ryan Milner observed, spaces like Reddit are racialized and gendered, assuming a white male centrality (Milner, 2013).

Gamergate began when a jilted ex-lover, Eron Gjoni, posted on the SomethingAwful forums an account of the breakup between him and game developer Zoe Quinn. The post included the defamation of Quinn, who had recently published a game titled Depression Quest to the Steam Greenlight service, as well as the accusation that Quinn’s game only received attention and positive reviews due to an intimate relationship with a games journalist (Massanari, 2015). The post also cited alleged Facebook messages exchanged between Quinn and Gjoni. The original post was removed, but was revived on 4chan, where it took on a new life as an argument calling into question the ethics of games journalism, though soon thereafter it became associated with the harassment of women in different aspects of the video game industry (Stuart, 2014). Much like Gamergate, The Fappening helped merge gaming, geek, and technocentric cultures with anti-feminists, misogynists, and men’s rights activists (Massanari, 2015).

The Fappening was the result of a huge data breach in which many celebrities had their personal photographs hacked from Apple’s iCloud service and posted to 4chan (Massanari, 2015). Within the first 24 hours, /r/thefappening had received 100,000 new subscribers, until the subreddit was eventually banned. The images circulated through different subreddits, with Reddit administrators even commenting on the substantial increase in Reddit’s traffic due to the propagation of the intimate celebrity photographs.

Massanari uses both Gamergate and The Fappening as an entry point into a toxic technoculture that has fomented on Reddit, as well as a technical structure which is susceptible to abuse. Because Reddit is a pseudonymous space, where groups can create their own communities, where content is voted upon and in that process visibility and attention are prized, and where, lastly, moderation is kept to a minimum in order to preserve an idyllic sense of authenticity, groups engage in homophilic activities which tend to support a white male point of view (Massanari, 2015; Auerbach, 2014). There is little support from the moderators or administrators to levy punishments on anybody who violates Reddit’s policies, and often, if a complaint is received, it is met with the counsel of ‘if you don’t like the way things are on this subreddit, make your own and enforce your rules there.’


2.2.2 /r/The_Donald

This coalescing of anti-feminists, misogynists, and men’s rights activists, coupled with a techno-libertarian sentiment that apotheosizes networked technologies, added to an emerging alt-right movement on Reddit which converged on the controversial subreddit /r/The_Donald. Tim Squirrell, in an article for Quartz, uses a massive dataset of Reddit comments and posts from the Google BigQuery service to create what he calls a “taxonomy of trolls” on /r/The_Donald (www.qz.com; Google). Squirrell classifies the following smaller communities as composing the alt-right gathering on /r/The_Donald: 4chan shitposters, anti-progressive gamers, men’s rights activists, anti-globalists, and white supremacists. Squirrell’s study is similar in scope to, and largely influenced by, one conducted by the Alt-Right Open Intelligence Initiative at the University of Amsterdam, which tracked the usage of the word ‘cuck’ on /r/The_Donald (wiki.digitalmethods.net). In the UvA study, the implementation of the Google BigQuery service, along with the observation of Reddit as an open-source, data-rich resource, growing in popularity, and seemingly a gathering place of the alt-right, has allowed for a particularly interesting site of observation.

Other deliberations for Reddit’s consideration, in particular /r/The_Donald, have to do with the positioning of /r/The_Donald within Reddit as an aggregator, incubator, and disseminator of misinformation, propaganda, or fake news (Zannettou et al., 2017). In an article titled “The Web Centipede: Understanding How Web Communities Influence Each Other Through the Lens of Mainstream and Alternative News Sources”, published in 2017, a study which tracked URLs from 99 different news sites on Reddit, 4chan, and Twitter showed these sites’ tendencies and capabilities to circulate information and the extent of their influence. /r/The_Donald was chosen by Zannettou et al. due to its positioning and history as a hub of alt-right activity. The research reported that 99% of URLs shared on /r/The_Donald were from alternative news sources (Zannettou et al., 2017). Using the Hawkes process (a point process method which allows for temporal consideration), the report shows that /r/The_Donald “is the only platform that has the greatest alternative URL weights for all of its inputs” (Zannettou et al., 2017).


Reddit is considered to act as an intermediary space between the deeper and darker corners of the internet (e.g. 4chan) and mainstream social networking sites (e.g. Facebook and Twitter) (Bradshaw, 2017). In a 2017 article for theHill.com, Samantha Bradshaw, a member of the Computational Propaganda Project at the University of Oxford, considers Reddit to be a fertile space in which to test the virality of content and a space where “coordinated information campaigns” occur (www.thehill.com). The article goes on to explain that most of the US investigation of Russian manipulation on social media is focused primarily on ad space, which Reddit manually verifies so as not to carry prohibited content, including contentious political views, making it less likely for automated agents or manipulators to buy ad space (reddit.com). In a post in /r/announcements, /u/spez shares thoughts about Russian influence on Reddit and Reddit’s internal investigation into these matters (reddit.com). In the post, /u/spez remarks on how “ads from Russia are blocked entirely” (reddit.com). He then mentions that indirect propaganda is a more contentious issue, particularly because of Reddit’s stance on authentic content.

Despite this, automated software agents populate much of the Reddit space. And because Reddit has a strong white male geek culture, along with subreddits like /r/The_Donald which promote an extremely partisan political ideology, the experimentation with and implementation of automated software agents can still allow for abuse.

2.2.3 Reddit Bots

This last section on Reddit will cover an ethnographic study in which the sentiments surrounding the requesting and creation of bots on Reddit are discussed. While there does seem to be a well-deserved alarmist response to the use of bots on Reddit, the ways in which Reddit mediates both the request/creation of bots and interaction with its API allow for an encouraging space in which programmers and amateurs exchange information and share ideas (Long et al., 2017). In a paper titled “‘Could You Define That in Bot Terms?’: Requesting, Creating and Using Bots on Reddit”, a study was conducted analyzing the sentiments of “2,284 submissions [and 14,822 comments] across three [subreddits] dedicated to the request, creation and discussion of bots on Reddit” (Long et al., 2017). The paper's key findings on how bots are requested and created mostly delineate a misunderstanding between developers and users on Reddit of what bots are capable of (Long et al., 2017). One of the main motivations of the study is that, while there is much research investigating the effects of bots in different online spaces and on public discourse, little investigates why and how bots are created (Long et al., 2017). Reddit provides an interesting opportunity to research these questions because there is an active community which openly discusses the creation of bots (Long et al., 2017).

In this research /r/requestabot, /r/botrequests, and /r/botwatch were selected for observation due to their subject matter (Long et al., 2017). The data, collected between 2012 and 2016, made up the entire dataset. Two stages of qualitative analysis were then undertaken: first, an analysis of the types of topics and functionalities involved in bot requests, and then a thematic analysis of what predicates the larger discussion of bots on Reddit.

One of the key findings is that the majority of requested bots (705 of the 2,284 submissions) were administrative, with the second most requested type being Play/Humour (278) and the third Functional/Quality (206) (Long et al., 2017). The most common functionality of requested bots involves querying and responding to particular keywords (686), followed by automatically posting content from other sources (220) and “querying user accounts” (105) (Long et al., 2017).

The thematic elements that come out of the second stage of this research largely have to do with knowledge and are broken into five themes: “knowledge and skills, technical infeasibility, legitimate and valuable bots, inappropriate and annoying bots, and the value of building a bot” (Long et al., 2017). Of these categories, the first two primarily deal with technical aspects of bot creation and implementation (i.e. how does one go about doing this or that, and is ‘x’ possible). Overall, the findings show that there is a disconnect between requesters and developers, where requesters do not possess the ability or knowledge to manifest their bot idea and in turn overestimate the resources and feasibility of their ideas. A particular example is bots that are able to respond to comments in natural ways. This requires a more sophisticated bot that, rather than being triggered by a keyword and posting an automated response, would be able to understand the contextual elements in a post and respond appropriately. An example cited in this study is the request for a bot that is able to respond to fallacies (Long et al., 2017). What is refreshing, though, is that there is an awareness of exploitative or malicious bots (Long et al., 2017). Requests that have ambiguous motivations or do not fully describe the intent of a bot are called into question. In one instance, a redditor asks another, “Please explain why you want this? It can be easily abused” (Long et al., 2017). As far as malicious bots go, anything that seemed to violate Reddit's terms, policies, or norms would be identified as such, with warnings of bans and other punishments from the more seasoned bot creators.
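The keyword-triggered pattern that dominates these requests is simple to express. The sketch below shows only the matching logic such a bot might use; the trigger phrases are hypothetical, and the surrounding plumbing of streaming comments from Reddit's API (e.g. with a wrapper library such as PRAW) is omitted.

```python
# Hypothetical trigger phrases mapped to canned replies; a real bot would
# check each new comment from an API comment stream against this table.
RESPONSES = {
    "remindme": "I will message you with a reminder!",
    "good bot": "Thank you, kind human.",
}

def reply_for(comment_text):
    """Return a canned reply if the comment contains a trigger keyword,
    or None so the bot stays silent."""
    lowered = comment_text.lower()
    for keyword, reply in RESPONSES.items():
        if keyword in lowered:
            return reply
    return None
```

By contrast, the fallacy-detecting bot requested in the study would need to model the meaning of a comment rather than scan it for substrings, which is precisely the kind of request the study found developers judging infeasible.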

Long et al. conclude by stating that the study of bots, and especially of a space where bots are requested, created, and openly discussed, is a valuable resource for understanding an online community (Long et al., 2017). Building bots requires a working knowledge of what a platform is capable of doing, and being able to observe the requests of users allows us to see what a community desires of it.

This last section will cover independent research done by members of the Reddit community. As observed above, there is a degree of technical expertise regarding the request and creation of bots on Reddit. However, that study somewhat overlooks the sentiment toward malicious bots, and it only covers bots which are discussed openly on the platform rather than ones that could already be running. The articles covered in this section will be mostly focused on /r/The_Donald and /r/bitcoin.

3.2 Independent Research by Redditors

The use of bots or sockpuppets for nefarious activity on Reddit has not gone unnoticed by academics, but it is also a prevalent topic and point of discussion for many Redditors. We have seen that there is an active community centered around the requesting, creation, and discussion of bots, but there are also independent investigations into how Reddit’s ranking algorithm is manipulated by malicious software agents and those who deploy them.

In an archived post from 2016 on the subreddit /r/TheoryOfReddit, an account named /u/MittRomneysCampaign shares findings and observations from two experiments conducted in 2012 and 2008 (www.reddit.com). Both experiments tested whether a particular type of post with a particular type of sentiment could make it to the top of certain subreddits and, ideally, to the front page. The 2012 experiment was organic, using the MittRomneysCampaign handle to lament Barack Obama's re-election.

Image 2: MittRomneysCampaign top comment

The poster extrapolates that the account name, along with the content of the comment, allowed the comment to reach the top of the thread. In the 2008 experiment, the content of the comment dealt with a scientific observation that appeared adversarial to the general culture of Reddit (as the poster sees it), and even questioned the collective intelligence of the site. In this situation, the poster boosted the post and used proxies to hide the manipulation. From the standpoint of the original poster, upvotes represent a truth or approval gauge rather than support for quality content. The poster goes on to claim that most redditors vote with this in mind: in simple terms, highly upvoted content is good or true, downvoted content is bad or false. But the truth of the matter is, according to the original poster, that quality content can get pushed to the bottom and never see the light of day. And it does not take much to determine the fate of a post in its nascent stages.

/u/MittRomneysCampaign uses these examples to illustrate the types of SEO tactics involved in gaming Reddit's ranking algorithm. The user also identifies several different ways the voting system can be gamed: forced front paging, vote nudging, reverse nudging, and vote brigading/puppeteering. Forced front paging seems like an obvious activity to detect if a post strikes the viewer as particularly suspicious, but vote nudging, reverse nudging, and brigading/puppeteering are more common and less easy to identify.

The poster posits a correlative relationship between upvotes, views, and, subsequently, economic incentive, and points to how Reddit weighs recent submissions and upvotes more heavily than older ones.
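The poster's claim that a post's fate is decided in its nascent stages is consistent with Reddit's historical “hot” ranking, which was open-sourced for several years. The sketch below reproduces that published formula (the live algorithm has since changed and may differ): net votes are logged, so the first ten upvotes count as much as the next ninety, and roughly every 12.5 hours of age costs a post one of those log units.

```python
import math
from datetime import datetime, timedelta, timezone

# Epoch used by Reddit's formerly open-source ranking code.
REDDIT_EPOCH = datetime(2005, 12, 8, 7, 46, 43, tzinfo=timezone.utc)

def hot(ups, downs, date):
    """Historical 'hot' score: sign * log10(max(|net|, 1)) + recency bonus.

    Because votes enter logarithmically while recency enters linearly,
    a few early votes (nudged by hand or by bot) move a new post far
    more than the same votes would move an established one.
    """
    net = ups - downs
    order = math.log10(max(abs(net), 1))
    sign = 1 if net > 0 else -1 if net < 0 else 0
    seconds = (date - REDDIT_EPOCH).total_seconds()
    return round(sign * order + seconds / 45000, 7)
```

For two posts submitted at the same moment, 100 net upvotes beat 10 by exactly one log unit, and a post only 13 hours younger overcomes that gap with a tenth of the votes.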
