
80% Toxic

THE POLITICS OF MACHINE LEARNING CONTENT MODERATION TOOLS

JORDEN (YARDEN) SKOP


Jorden (Yarden) Skop
MA Thesis

University of Amsterdam, Graduate School of Humanities
Program: Media Studies (Research)

Supervisor: Dr. Bernhard Rieder
Second Reader: Dr. Alex Gekker

28 June 2019

Cover image: screenshot of Perspective API’s “authorship” demo, retrieved on 27 June 2019.

Acknowledgments

This thesis would not be what it is without the help, inspiration and intellectual guidance of different people. First, I would like to thank my supervisor, Bernhard Rieder; I truly enjoyed going through this journey with such a guide. Every meeting was a precious lesson in critical thinking and in how to conduct productive, original, and rigorous research. I would also like to deeply thank the key informants interviewed in this thesis, all of them very busy people who took the time to answer my questions with much patience. I am grateful to my dear friend Merav, who reminded me that I should not strive for perfection, but for excellence. To my partner, who was a beacon of love and support when times were hard, thank you. Lastly, I owe much gratitude to my parents, who helped me take a break from my demanding career and move to Amsterdam to pursue this master’s, an old dream of mine.


Table of Contents

Acknowledgments
Abstract
Introduction: The Fabrics of Moderation
Chapter 1: How Platforms Govern and are Governed
1.1 On Content Moderation and its foundations
1.1.1 The Legal Foundations of Moderation
1.1.2 Platform Governance
1.1.3 The Technology in Question and its Framing
1.1.4 Types of Hateful and Problematic Speech
1.1.5 Moderation Guidelines
Chapter 2: Methodology – How to Study Complex and Experimental Digital Tools
2.1 Qualitative Grey Literature Analysis
2.2 The Case studies: The Tools in Question
2.2.1 Perspective API
2.2.2 The Coral Project – Talk
2.2.3 The New York Times – Moderator
2.2.4 Troll Patrol
2.3 Semi-structured Interviews as a Method for Studying Digital Tools
2.3.1 Interview Analysis and Organization of the Research Material
2.4 Small Scale Algorithmic Auditing
Chapter 3: Findings and Discussion – Automated Moderation as a Layered and Contextual Phenomenon
3.1 Findings and Reflections
3.1.1 Models as the Vocabulary of Complaint
3.1.2 Technical Transparency and its costs
3.1.3 Strategic Partnerships
3.1.4 How Important Is Accuracy?
3.1.5 Applications
3.1.6 Centralization-Generalization
3.2 Platformization of News Websites
Conclusions
Bibliography
Appendix 1: A List of Analysis material
1.1 List of Interviews
1.2 Materials for the Small-Scale Experiment with Perspective
Appendix 2: Interview Outline


Abstract

A heated public debate rages worldwide around online content moderation these days; most of it is focused on the large social media platforms, such as Facebook, Twitter and YouTube, which host much of the public conversation. Due to the scale of communication on platforms, many of them depend increasingly on automated tools for moderating content. Some platform moguls promise that this is the future of moderation, as Mark Zuckerberg, CEO of Facebook, did before the American Senate. This thesis is interested in the automation of textual content moderation using machine learning technology. It examines specific cases of this automation through a combined methodology of interviews with experts, qualitative analysis of code documentation and algorithmic auditing, balancing theory against the empirical investigation of specific uses of algorithmic techniques. This research is exploratory and embraces the principles of grounded theory, deriving theory from the examination of empirical data.

To that end I examine two case studies: Perspective API, a tool developed by Jigsaw (Google-Alphabet), and Troll Patrol, a research effort by Amnesty International to monitor and predict harassment of women on Twitter. This research explores the ways these tools are used, by what actors, and what underlying principles stand behind their development. Furthermore, it delves into how automated moderation creates an assemblage with the human labor of moderation and examines the socio-technical processes created through it. It also aims to develop the vocabulary that will allow further analysis of these tools.

This research finds that content moderation is a highly contextual and layered phenomenon and that automated tools provide only a partial answer to the challenges it poses. The tools are an assemblage in which the classifying algorithm is only part of a larger process, and their effectiveness is highly dependent on access to quality data and on a rigorous labeling process done by humans, preferably professionals. The research also sheds light on the entangled and changing relationship between news websites and tech platforms and determines that the automation of moderation on those websites is part of a process of platformization.

Keywords: Platform studies, Machine learning, Content moderation, Algorithmic auditing, Platformization


Introduction: The Fabrics of Moderation

On April 10, 2018 Mark Zuckerberg appeared before the U.S. Senate, mainly to account for data breaches on Facebook following the Cambridge Analytica scandal. His hearing was a chance to learn which issues surrounding Facebook trouble American legislators, and many of the questions directed at him were essentially questions about content moderation policies on the largest social media network in the world. Zuckerberg, the CEO and founder of Facebook, repeated a certain promise to senators probing him about moderation on his platform: “we're developing A.I. tools that can identify certain classes of bad activity proactively and flag it for our team at Facebook” (“Transcript of Mark Zuckerberg’s Senate Hearing”). Zuckerberg’s bottom line was that in five years those A.I. tools would be good enough to identify harmful behaviors on the world’s largest social network. When given this answer, most senators did not inquire any further about this promised A.I.

In the time that has passed since the hearing, Facebook has notably banned some figures it regards as hate mongers, among them right-wing commentator Alex Jones, known for his brand “Infowars”. Louis Farrakhan, an outspoken black nationalist minister who has been criticized for spreading anti-Semitic remarks, was also banned (Isaac and Roose n.p). Before Jones was banned from Facebook in May 2019, his content had already been removed by Facebook, as well as Apple, Google, and Spotify, in August 2018. On Facebook, one of Jones’ four pages had had 1.7 million followers before it was removed. The ban was attributed to what the platform termed “using dehumanizing language to describe people who are transgender, Muslims and immigrants, which violates our hate speech policies” (Facebook Newsroom n.p). On YouTube, a channel with 2.4 million subscribers was terminated (Nicas n.p).

In addition, scholars and journalists accuse YouTube of fostering extremist and radicalizing channels of micro-celebrities, which create an “alternative influence network” (Lewis). The company recently declared that it will remove videos that advocate bigotry. As per YouTube’s announcement, the company updated its hate speech policy to specifically prohibit “videos alleging that a group is superior in order to justify discrimination, segregation or exclusion based on qualities like age, gender, race, caste, religion, sexual orientation or veteran status” (YouTube official blog), for example, videos that promote or glorify Nazi ideology. It also promised to remove content “denying that well-documented violent events, like the Holocaust or the shooting at Sandy Hook Elementary, took place” (ibid). The company did not specify how it will identify such videos and whether computational techniques will be involved in the process.

These examples expose only a small portion of the events and controversies surrounding content policies in large Silicon Valley platform companies in recent years. This culmination is testimony to a change we are witnessing in the public view of content moderation. It is important to note this shift, because views towards strict content moderation guidelines were not always so forthcoming. Seven years ago, Twitter’s head in the UK prided it for being “the free speech wing of the free speech party” (Halliday n.p). In their earlier days, many platforms did not anticipate content moderation to be such a significant issue. Facebook and Twitter, for example, started with a rather homogenous class of users that shared similar values of freedom. The expression “information wants to be free”,1 was an ethos shared by designers and developers (Gillespie Custodians 117), echoing the hacker ethic. Given this homogeneity among the early adopters of social media, this freedom was not just an ethos, but a reality. The early days of the World Wide Web in the 90’s were filled with techno-utopianism and a promise that this new network would decentralize control, flatten organizations and even harmonize people (Negroponte qtd. by Turner 1).2 Social media pioneers echoed these messages, both in spirit and in practice, promising to use the internet to connect people even more - moderation seemed unnecessary.

In recent years, there has been a heated public debate worldwide around the issue of moderation; most of the debate is focused on the large social media platforms, such as Facebook, Twitter, and YouTube, and it spans many different topics. In his book devoted to online moderation, Custodians of the Internet, Tarleton Gillespie claims that “platforms must, in some form or another, moderate” (5). Gillespie offers to look at moderation as central to what platforms do and establishes that “moderation is, in many ways, the commodity that platforms offer” (13). Platforms are not like the rest of the internet, where porn, spam, and fake news roam free; they offer something else.

A recent survey published by The Anti-Defamation League in December 2018, conducted among a representative group of 1,134 Americans, revealed that more than 80% want the government to strengthen laws against online hate and harassment. A similar number of respondents want platform companies to provide more options to filter hateful or harassing content (ADL survey). This survey gives a quantitative description of a feeling, also evident in the popular media, that there is a growing public demand for more content moderation online.

This new consideration of platforms’ responsibility for the content shared on them has arisen around many topics, but initially the most pressing one was terrorism (Gillespie Custodians 37). One of the early instances of users, governments and advertisers pressuring social media companies to actively moderate terrorist and extremist content was the rise of ISIS, a terrorist organization known for its savvy use of social media to gain followers not only in the Middle East or in Muslim countries, but also in Western countries (Farwell). Twitter, for example, actively suspended thousands of accounts and YouTube removed videos of beheadings and other acts of violence (Press Association; Walker). Automated tools were part of this process (Reuters), which for the time being seems to have diminished ISIS’ presence on large platforms (Broderick and Hall). But, from a liberal Western perspective, ISIS propaganda might be a more clear-cut case than others. White supremacy, anti-Semitism and misogyny seem harder for platforms to recognize as such and act upon, as YouTube’s chief product officer, Neal Mohan, admitted in a recent interview with The New York Times (Roose n.p).3

1 This phrase, coined by Stewart Brand, is actually only one part of two opposing statements: “On the one hand information wants to be expensive, because it’s so valuable. The right information in the right place just changes your life. On the other hand, information wants to be free, because the cost of getting it out is getting lower and lower all the time. So you have these two fighting against each other” (Levy n.p).

2 For a fascinating and more detailed account of how Silicon Valley entrepreneurs were influenced by the network metaphor and ideas of personal freedom and echoed them in their products, see Turner.

In order to broaden the focus beyond American or Eurocentric concerns, it is worth looking at a specific case, where criticism of sloppy content moderation was evident not only in popular media, but was also issued by an international governance body, the United Nations. One recent example is Myanmar, a country where recent events are a warning sign of how hate speech on social media, namely Facebook, can escalate tensions and contribute to a genocide. It is worth going through the case in some detail, because it is an example of what platform companies are dealing with around issues of moderation, and it demonstrates the growing call for stronger restrictions on content on social media. It is also a good entry point into the key points around which the debate in this research will revolve.

The United Nations’ Human Rights Council report, released in September 2018, established that there were indeed crimes against humanity in Myanmar, targeted at Rohingya Muslims who are mainly concentrated in the western provinces of the country. A section of the report is devoted to the dissemination of hate speech against Muslims, including on social media platforms. Myanmar is a unique case due to the fact that Facebook entered the country very swiftly at the beginning of the decade, after its citizens had been blocked from the internet for many years. Until 2011, few people in Myanmar were able to afford internet access due to high costs. According to the UN report, Facebook is by far the most common social media platform in Myanmar: “The relative unfamiliarity of the population with the internet and with digital platforms and the easier and cheaper access to Facebook have led to a situation in Myanmar where Facebook is the Internet” (UN report 340).

3 Mohan answered a question comparing the take-down of ISIS material from platforms to the lack of treatment of white supremacist materials and accounts: “In the case of violent extremism and limiting those videos on the platform, the reason it’s different than what we’re talking about here is that those [ISIS] videos took on a particular form. They were often designed for propaganda purposes and recruitment purposes. So, they had things like branding and logos, both visually and in terms of the music they might use. Those formed a set of finite clues we could use to bring that content down... In the case of something like this [white supremacy], the challenges are harder because the line, as you can imagine, is sometimes blurry between what clearly might be hate speech versus what might be political speech that we might find distasteful and disagree with, but nonetheless is coming from, you know, candidates that are in elections and the like”.

In its report, The International Fact-Finding Mission to Myanmar examined specific Facebook accounts that were influential in spreading hate speech and had high engagement and large followings. The report mentions various examples of hateful posts, memes and videos which incite violence against Muslims in Myanmar. Some of them are based on the spread of misinformation, claiming that Muslims threaten the Buddhist character of the nation. Rohingya were also referred to as terrorists by Myanmar authorities (322-329). The document states that Facebook became an instrument for spreading “hate speech, including advocacy of national, racial or religious hatred that constitutes incitement to discrimination, hostility or violence”. It adds that “it is unsurprising that propagators of hate speech resort to Facebook to wage hate campaigns, amplify their message, and reach new audiences” (340).

The report details a series of attempts by local civil society organizations to approach Facebook’s administration with proof of the proliferation of hate speech on the platform. Facebook claimed it did not have enough content moderators who understand the Burmese language and its nuances, as well as the context in which posts and comments are made. The UN mission itself experienced a slow and ineffective response from Facebook regarding alerts about a post targeting a human rights activist, where comments included explicit calls for his death (341-342).

The mission concludes unequivocally that the prevalence of hate speech on Facebook significantly contributed to the climate leading to the violence against the Rohingya, and condemns Facebook for not providing data about the spread of hate speech on the platform, “which is imperative to assess the problem and the adequacy of its response” (343). Advocates in the news media claimed that Facebook was keener on expanding into a promising untapped new market than on dealing with these core issues (McLaughlin n.p.).

In the aftermath of the Rohingya crisis, in which 700,000 people were displaced and fled to neighboring Bangladesh, an estimated 10,000 were murdered and thousands raped and tortured, Zuckerberg admitted to the US Senate that the company was “too slow to deal with the hate and violence in places like Myanmar [...]. The challenges we face in a country that has fast come online are very different than those in other parts of the world, and we are investing in people, technology, and programs to help address them as effectively as possible” (UN report 343).

This is of course an extreme example, but it sheds light on the real-life dangers of hate speech on social media, as well as on the difficulties of moderation, partly due to the absence of established coping mechanisms for these issues in platform companies. Besides the more extreme examples of bigotry and hate speech that I have mentioned so far, there are also more mundane situations of daily antagonistic behavior between users online. Users do not want to be harassed or receive insults from others. Sometimes a discussion about a contentious topic can become a slandering match, even without crossing a legal boundary. Those expressions are also something platforms are trying to deal with in different ways, though they are not discussed as much in the mass media due to their routine nature and banality.

The process I am pointing at is a move from moderating only that which is overtly illegal to enforcing stricter guidelines on speech and its effects. That means the debate is no longer just legal but also revolves around freedom of speech. Moderation can be seen as exclusionary if it is judged as censorship of free speech, or it can be seen as inclusionary, if one considers the consequences of no moderation at all, as seen above in regard to Myanmar. Perhaps the only people who will communicate in spaces that are not moderated are those who are willing to tolerate incivility, which, in practice, means excluding anyone who is sensitive to the mundane rude behaviors of people online. As Milner writes about Reddit, “The populism of the collective is really antagonistic mob rule” (118). So, the question becomes: what kind of public discourse do we accept on the internet? As the examples above show, much of this shift has to do with the move to web 2.0 (O’Reilly), the participatory web, and the rise of platforms as its key organizing structure.

I personally started exploring the topic of online moderation when participating in a DMI (Digital Methods Initiative) Summer School in 2018, in a project dedicated to analyzing results from a research project by Amnesty. The human rights organization was using machine learning to measure violence against women politicians and journalists on Twitter. In the course of the research, called Troll Patrol, Amnesty curated a data set of close to 300,000 tweets targeting women and harnessed more than 6,500 volunteers worldwide to label it. In collaboration with a software company called Element AI, these labels were used to train an algorithm that analyzed 14.5 million tweets mentioning women. This was a human rights organization’s attempt at dealing with the issue using machine learning. I wanted to study how software companies and platforms were using this kind of computational technology to tackle online hate, mainly through the detection and moderation of the category of hate speech.

Besides measuring the volume of harassment towards women, Amnesty’s research was using machine learning to create a tool that can automatically identify abusive tweets. This effort is in line with the attempts by social media companies. As Zuckerberg promised, A.I. is the up-and-coming solution to issues of problematic content. Given the sheer volume of user-generated content across platforms, platform companies need some kind of automation to detect and review all of the problematic content. These attempts to automate content moderation and eliminate negative behaviors online are easy to dismiss as “solutionist” (Morozov), that is, as trying to solve a complex social problem with the so-called “magic” of code and machine learning (now too easily called Artificial Intelligence). Nevertheless, if moderation is central to the workings of platforms, and as this process of automating moderation is accelerating, it is necessary to look at it with true curiosity rather than dismissing it.

There is already a considerable body of research on content moderation practices. Some of it is focused on specific platforms such as Reddit (Squirrel; Massanari) and Facebook (Myers West), while some deals with larger issues of policy and regulation (Gillespie Custodians; Citron-Keats). There is also some literature on delegating moderation to users through mechanisms like flagging, a function which has become ubiquitous on social media and on any website hosting user comments (Crawford and Gillespie). Mass media coverage has critiqued the working conditions of human content moderators on large platforms (Newton; Chen). However, there is a gap in new media research covering automated tools for moderation, especially tools that deal with language. Therefore, I chose to focus on these tools in my research. I also wanted to explore how issues of policy and moral values like moderation are reflected and embedded in computational techniques, and the implications of translating moderation regimes into computational decision-making processes.

And so, the research questions I set out to answer were:

What decisions and logic stand behind the automation of online moderation?

How are tools for automatic online content moderation conceived by their creators (designers, developers, domain experts)?

In what domains and in what way do these tools operate in real life situations today? How do they affect these domains?

It is important to note at this stage that the larger project at hand in this paper is not to justify moderation by platforms. I accept it as an existing practice for dealing with user content. Nor is it to condemn it as censorship of free speech. I am looking at the fabrics of moderation, and specifically at one relatively new and experimental technique of moderation - the automatic detection and filtering of hate speech and, more broadly, negative speech. This research investigates the ways it is used, by what actors, and what underlying principles stand behind it. Furthermore, it delves into how automated moderation creates an assemblage with the human labor of moderation, and examines the techno-cultural processes created through it. It also aims to develop the vocabulary and understanding that will allow further analysis of these tools. As can be derived from the above description of my research interests, this is exploratory research grounded in real-life tools and practices.

The main challenge for me was finding an entry point or the means to operationalize my research queries - a case study of how commercial companies were dealing with these issues in real life. International platform companies are not known for transparency when it comes to content moderation policies and enforcement (Roberts 2), especially not the largest social media platforms, which were getting most of the fire for hate proliferation. But there was one large platform company that was transparent about developing that kind of technology – Google-Alphabet, and specifically a “tech for good” branch in the company called Jigsaw, which is working on a kind of negative speech detector and filter called Perspective API.4 Conversation AI, the research group inside Jigsaw that was working on this free, open-source tool, had also published a lot of documentation, code and references on GitHub. They also collaborate with other bodies like The New York Times, the Wikimedia Foundation and an open-source project for content moderation on news sites called Coral. This ecosystem created around the tool and its development, which is still ongoing, has allowed me an entry point into the topic. Perspective also has a public demo, allowing anyone to test how it functions and what kind of issues might arise while using it. The availability of the material is the reason I chose Perspective API as the main case study in this research.

Following that, and in order to develop a grounded view of the tool, I chose to look at its implementations. One of them is Talk, a moderation platform for content and news websites, which uses Perspective as part of its moderation processes. Talk is used by dozens of news outlets and journalistic websites in the U.S. and Europe, notably The Wall Street Journal, The Washington Post, New York Magazine and its sister websites, and nu.nl.

Another implementation of Perspective is Moderator, The New York Times’5 moderation system, designed especially for them with Jigsaw. The NYT was one of Jigsaw’s most important collaborators, and provided them with crucial data: a decade of user comments and how they were moderated, including the tags that the professional moderators gave them.

A smaller case study is Troll Patrol, Amnesty’s research project with which I collaborated at the DMI Summer School; it is used mainly for comparison with Perspective.

Studying developers’ documentation alone was not enough to answer my research questions. GitHub and other online material provided me with a lot of information, but they smoothed over differences and failures. That is part of the reason I also aspired to interview the tools’ developers, project managers and other professionals who collaborated in making them, such as the NYT’s community manager, or those who use them, like Coral’s head. Through the interviews, a clearer picture of the challenges in the process of creating and using the tools emerged.

4 https://www.perspectiveapi.com/#/home
5 From now on referred to as the NYT.

I opened my introduction with issues of moderation on Facebook and other large social media platforms – issues which are also frequently discussed in the mass media. The case studies, however, led me to a different domain – news websites. These publishers are also dealing with the question of what the best practices are for moderating reader comments on their websites. Although this domain is less publicly debated, research done by The Guardian on its own comments revealed that comments on news outlets can be just as toxic and abusive. The study shows that of the 10 writers who received the most abusive comments, 8 were women and 2 were black men, despite the fact that most of the opinion writers were actually white men (Gardiner et al. n.p). Some news outlets have decided to close the comment sections on many articles altogether, due to the difficulty of moderating such large amounts of user-generated content (Long n.p.).

Let us go back to the metaphor I used for describing moderation, that of fabric. It is useful because, like a fabric, moderation is a web made of different threads – techniques and actors – woven together in different ways, creating an outfit. Through the course of this research I discovered that using algorithms weaves another thread into the fabric, one which works in concert with human moderators and with the different systems and protocols that are built around them. This thread has its own characteristics and texture, and that is what I set out to explore.

The first chapter lays out the theoretical and factual framework, discussing first the legal ground on which content moderation on platforms grew. It also explores different concepts of platform governance in order to problematize the issue of moderation. Another key concept is platformization, or what makes websites into platforms. The chapter will also discuss the computational techniques used for automated content moderation, from a media studies perspective.

The second chapter presents the case studies chosen to explore the topic, and the mixed methods used to analyze them: semi-structured interviews with key informants, qualitative ‘grey literature’ analysis, mainly of GitHub documentation, and a small experiment influenced by the (emerging) tradition of algorithmic auditing.

The third chapter presents the findings and discussion, divided into two parts: more descriptive findings emerging from the GitHub documentation, and themes arising from the interviews, organized into reflections informed by the theoretical framework. The last part of the chapter takes a closer look at the specific domain in which the tools I examined operate – news websites – and delves in more detail into the implications they have for that domain. I claim that the use of the tools accelerates a process of platformization of news sites, by allowing more comment data from users to flow through them, and by creating lasting user profiles. This is enabled by additional algorithmic control and ordering mechanisms, afforded by the use of machine learning to structure the workflow. The underlying logic is that of connectivity as a value for a vibrant public discourse. I also point out the entangled situation platforms and news websites are in. On the one hand, news websites are trying to gain more autonomy, partially by drawing readers from platforms back to their websites. On the other hand, these websites need a way of dealing with reader comments, and so again they need the technological abilities platform companies offer, in order to moderate more comments faster and more efficiently.


Chapter 1: How Platforms Govern and are Governed

1.1 On Content Moderation and its foundations

Online content moderation is a multifaceted subject that brings together moral, technical and legal issues, and can be approached in many ways. This chapter lays the foundations of this research, both factually and theoretically. The first part introduces the reader to current literature about content moderation, both academic and from the news and popular media. It also delineates the legal foundations in the U.S. regarding (selective) content moderation. It then moves on to explain the proposed theoretical framework of platform governance as a means of examining the issues arising around content moderation. The chapter also offers an analysis of the algorithmic techniques used in Perspective from a media studies point of view. This approach searches for a middle ground between broad theorization and a grounded empirical investigation of specific uses of information ordering algorithms (Rieder 101).

1.1.1 The Legal Foundations of Moderation

The founding legal and legislative principles which created the premise of the internet as a free speech haven are usually portrayed by scholars as related to two decisions in American law. One is the 1997 landmark ruling of the American Supreme Court in the Reno v. ACLU case, which ruled that internet speech deserves the same free speech protections as other spoken or written speech. Justice John Paul Stevens wrote in the majority opinion that the internet’s capacity to allow any individual with a phone line to reach mass audiences, made the network even more valuable, perhaps, than its broadcast equivalent (Marwick n.p).

Another landmark was section 230 of U.S. telecommunications law, known as the ‘safe harbor provisions’. According to this provision, intermediaries which provide access to the internet or other network services, cannot be made liable for the speech of their users. Like the telephone companies, intermediaries do not have to police their users’ speech or behavior. At the same time, the second part of section 230 determines that if an intermediary does decide to police its users, it will not lose this safe harbor protection (Gillespie Custodians 30). Social media platforms are still acting according to this safe harbor when choosing what to moderate and what not to moderate. Grimmelmann claims that the underlying policy of section 230 is actually to encourage moderation by taking away the threat of liability due to problematic moderation (103). It is important to note that section 230 works in lockstep with the Digital Millennium Copyright Act which was devised to provide this safe harbor in case of copyright infringement of content (as opposed to hate speech or harassment) (Jeong 44).


But, as I have shown, this broad agreement about social media platforms falling under the category of mere intermediaries is shifting (Gillespie Custodians 33). That is partly due to the fact that these rules were crafted long before these entities or similar ones existed, and the technologies platforms use today are different from the ones the law was intended for. Another reason is the internationality of platforms’ users; these rules were crafted in the U.S., but other countries have different rules regarding the responsibility of publishers and intermediaries. The ‘safe harbor provisions’ enabled platforms to develop as they have, since they provided them with crucial protection from having to control all user content.

Many other countries have laws in place that allow citizens to sue others over behaviors deemed an attack on them online or offline - libel, for example. In some countries it is possible to sue the company publishing the content, even if it is a social media network. With regard to hate speech, European policymakers were faster to react to the changing internet landscape portrayed in the introduction. Germany is spearheading the fight against hate speech online with a law from January 2018 that determines that platforms which fail to remove “obviously illegal” content (as defined by the law) within 24 hours may face fines of up to 50 million euros (Gollatz et al.). Both Germany and France have laws prohibiting the promotion of Nazi ideology, anti-Semitism and white supremacy (Gillespie Custodians 37). In 2016 the European Commission persuaded Facebook, Microsoft, Twitter and YouTube to sign a “Code of conduct on countering illegal hate speech online”. It determines that these platforms will develop more extensive tools for identifying hate speech and will respond to take-down requests within 24 hours. In 2018 Instagram and Snapchat joined the initiative (The EU code of conduct n.p).

Although the legal aspects are important as the underlying infrastructure of moderation, this research is also interested in expressions and issues beyond the question of legality. In order to frame the discussion about these expressions in broader terms, the next section will explore the notion of governance, specifically as it pertains to platforms and their relationship with users and other stakeholders.

1.1.2 Platform Governance

I would like to problematize the issue of content moderation by examining it through the framework of platform governance. The era we live in is referred to by some as “Platform Capitalism” (Srnicek). In it, platforms have become a key organizing factor in our financial, public, personal and social lives.

The term platform is strategically deployed by digital companies associated with Silicon Valley because it allows them to place emphasis on being “merely” a platform, a mediator, as opposed to content creators. The term allows these companies to elide tensions inherent to their services - tensions caused by their intervention in the delivery of content and their pretense of neutrality (Gillespie, “Politics” 348). The work of moderation lies at the center of this tension.

Almost a decade has passed since the publication of Gillespie’s pivotal article pointing to the discursive work platforms undertake, and platform companies have gained even more financial and organizing power. To give a current example of the elusive use of the term ‘platform’ we can go back to Zuckerberg’s hearing in the American Senate, where he repeatedly said that Facebook is “a platform for all ideas” (Transcript of Mark Zuckerberg’s Senate Hearing).

Therefore, it is important to provide a working definition of platforms today, not just as a discursive term but as material and technical objects. In their recently published book The Platform Society, van Dijck et al. define an online platform as a socio-technical architecture that fosters and organizes interactions between users, whether they are end users, corporate entities or public bodies. These software architectures are programmed toward “the systematic collection, algorithmic processing, circulation, and monetization of user data” (4). They also discuss the “platform ecosystem”, an “assemblage of networked platforms, governed by a particular set of mechanisms that shapes everyday practices” (ibid). They draw the boundaries of a Western ecosystem, led by a handful of massive tech companies: Google-Alphabet, Apple, Facebook, Amazon and Microsoft. These companies provide many infrastructural services that shape and design the ecosystem’s distribution of data flows.

Srnicek discusses the rise of these conglomerates from a Marxist point of view as part of a move from an economy of manufacturing as a driving force of growth to an economy of data extraction and creation (6). He calls this era Platform Capitalism and examines the labor and class relations enacted by it. Srnicek’s approach is also important for considering the monopolistic relationships large platforms have both with users and other “non-platform” cultural producers, such as news outlets, a main case study in this research.

Many of the platform companies operate in a legal gray area in many aspects (Van Dijck et al. 4). The issues surrounding content moderation are only a fragment of the frictions that occur in the intersections of platform companies, societies, and governments.

Part of these clashes arise because platforms have complex economic structures, and can be seen as multi-sided markets which “consist of a platform that brings together at least two distinct groups of end-users” (Rieder and Sire 199). Platforms are not only comprised of the company on one side and users on the other; another key actor in today’s field, especially on social media platforms, are advertisers. On free-service platforms such as Google and Facebook, advertisers generate the main source of revenue, meaning they can act as a pressure group which might also affect the circulation of user content or its censorship. One example of the power advertisers hold over content circulation is the YouTube “Adpocalypse” that took place in 2017, in which large companies such as Coca-Cola and Amazon pulled ads from the platform after discovering that they were being paired with extremist content and hate speech (Dunphy n.p). Advertisers may very well be one of the key reasons moderation has become so important to platforms and why they have responded to public pressure and used the solution of de-platforming hate figures, as in the examples I gave in the introduction.

Going back to the element I set out to explore - the question of platform governance - it is clear that the policy arena is fragmented, with responsibility over the social and political roles platforms play divided between different stakeholders - platform companies, as the architects and creators of online environments; users, as the individuals making decisions about their specific behavior in online environments; and public servants (governments and legislators), as the entities setting the overall ground rules for the interactions (Helberger et al.). I would add advertisers to the mix as a pressure group.

Gorwa, as other scholars have done, borrows the term governance from the field of political science and explains how it applies to the platform ecosystem. He explores the changes in the term, which was initially associated with domestic governments’ capacity to enforce rules, maintain order and build functioning institutions. The term later developed into a more flexible understanding, expanding beyond local forms into the global organization, structure and regulation of life. In addition, it symbolizes how different structures can enforce or create power relations and conflicts (3). The term is also used by scholars of the online world, and particularly in the field of platform studies. Gorwa identifies the terms in which platforms are debated - such as constraining and/or encouraging certain behaviors through the use of content policies, terms of service, algorithms, and interfaces - as governance mechanisms (4). As Grimmelmann notes, moderation is part of the “governance mechanisms that structure participation in a community to facilitate cooperation and prevent abuse” (47).

Moderation as a governance problem is also a problem of collective action - the implications of millions of people gathering virtually in one space and communicating with each other. Bratton’s view of platforms can help in better understanding this. He defines platforms as resembling markets - distributing interfaces and users globally. He also defines them as resembling states, because “their programmed coordination of that distribution reinforces their governance of the interactions that are exchanged and capitalized through them” (42). This process of drawing many actors into a shared infrastructure involves centralization and decentralization at the same time. Users have some autonomy, but their behavior and their conditions of communication with others are also standardized (Bratton 46). Part of this standardization is enacting governance through the different processes of moderation.

As the scale of platforms’ dominance becomes unprecedented, developing automated tools to deal with dangerous content becomes more pressing. This dynamic can be seen as a way to avoid external governance by governments and legislators. As suggested by Helberger et al., governments should outline parameters for cooperative responsibility and provide clear guidance to platforms on targets, but also accept that platforms need “space” to find technical and organizational measures to comply with those targets (11).

Duguay et al. are critical of this space given to platforms. They develop the idea of ‘patchwork platform governance’ to describe the “uneven, retroactive development of governance policies” by platforms (3). They claim that platform owners and policy leaders respond to mostly voluntary governance obligations and create “uneven, often contradictory, rules, standards, technical features, and loose rhetoric about desirable social conventions” (4). If this is the case, which I argue to be true in some instances and for some users, delegating decisions of content moderation to algorithmic systems may help platforms avoid public scrutiny of those decisions. The use of artificial intelligence, machine learning, big data and automation lends those decisions an aura of objectivity and accuracy (boyd and Crawford 63; Gillespie “The Relevance” 168) on the one hand and of opacity (Burrell) on the other. This opacity makes it harder to formulate clear criticism, especially for actors that are not versed in computer engineering and software development.

Another key concept for understanding what is at stake in content moderation is platformization. Helmond, one of the scholars who developed the idea of platformization, defines it as the “extension of social media platforms into the rest of the web and their drive to make external web data ‘platform ready’” (1). This process is usually referred to as primarily concerned with platforms extending into the web and pulling web data back into the platform (Nieborg and Helmond 197). Platformization is a dual process, in which platforms extend outwards, into other websites, platforms, and apps, as well as inwards, utilizing third-party integrations that function within the boundaries of the core platform (202).

At the center of the shift from social network sites to social media platforms, Helmond places APIs (Application Programming Interfaces). Since one of the tools this research examines is an API, specifically a web API, it is important to explain what the term means. A web or remote API is an “interface provided by an application that lets users interact with or respond to data or service requests from another program, other applications, or websites. APIs facilitate data exchange between applications, allow the creation of new applications, and form the foundation for the ‘Web as a platform’ concept” (Murugesan 36; qtd. by Helmond). In Helmond’s view, once social network sites offer APIs, they become social media platforms because they are enacting their programmability, and the API can be a key entry point for critically inquiring into the consequences of said programmability (4).

Using Perspective API, external developers access data and/or functionality by making API calls, which represent “specific operations to perform a task” (Helmond 5). Websites using Perspective can insert it into their own content moderation systems, and each comment sent to the API is assessed and given a score. In the case of Perspective, the default is that the comment data is not stored by Google.
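
To make this concrete, the following is a minimal sketch of such an API call, based on the public Comment Analyzer endpoint and request fields as documented in Perspective’s API reference around the time of writing; the placeholder key, the example comment and the exact fields should be treated as illustrative rather than as the definitive interface.

```python
# A minimal sketch of a call to Perspective's Comment Analyzer endpoint, based on the
# public API reference as it stood in 2019. The key is a placeholder and field names
# may have changed since; treat this as an illustration, not authoritative documentation.
import requests

API_URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
           "comments:analyze?key=YOUR_API_KEY")  # placeholder key

payload = {
    "comment": {"text": "You are a complete idiot."},
    "languages": ["en"],
    "requestedAttributes": {"TOXICITY": {}},  # one of the attribute models discussed in 1.1.3
    "doNotStore": True,                       # ask that the comment not be stored
}

response = requests.post(API_URL, json=payload)
result = response.json()

# The summary score is a probability-like value between 0 and 1, e.g. 0.8 ("80% toxic").
print(result["attributeScores"]["TOXICITY"]["summaryScore"]["value"])
```

In an integration like Talk’s, a score returned this way would be one input into the moderation workflow discussed in the following chapters.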

A parallel process to platformization is that of infrastructuralization. According to Plantin et al., both concepts are part of a dual development in which the boundaries between infrastructure and software, in this case platforms, have become blurred. By articulating this process they seek to “highlight the tensions arising when media environments increasingly essential to our daily lives (infrastructures) are dominated by corporate entities (platforms)” (3). One stream of infrastructure studies elaborates the phenomenology and sociology of infrastructure, highlighting the role of its human elements, such as work practices and organizational culture (4). From that perspective, moderation itself, as a set of guidelines and agreements, can be seen as infrastructure - an infrastructure for civility. Websites or platforms offering more civil interactions between users can also portray this as one of their features.

1.1.3 The Technology in Question and its Framing

It is also crucial for this research to delineate the point of view from which to look at the technologies of automated content moderation. There is a need to explore the technicity of the process from a media studies perspective in order to develop a vocabulary for these processes that corresponds with the platform studies approach I have chosen. I will do this by examining particular algorithmic techniques (Rieder). These techniques are seen as “diverse and general at the same time: while every technique implies a particular way of doing things, they can often be applied to a wide array of domains” (103). So even though I will not engage in reading and analyzing code, I will explain the computational processes enacted by these tools and emphasize the crucial role of data in the process.

The technologies at the center of this research are generally part of the field of Natural Language Processing (NLP). I will illustrate the technology that Perspective is based on, as it is the main algorithmic technique this research examines. As explained in its API reference, Perspective relies on a Convolutional Neural Network (CNN), a deep learning technology that is applied in this case to NLP but can also be used for image classification, recommender systems, and many other cases. The network is trained with word-vector inputs. These are vector representations of words called “word embeddings”. Word embedding is a way of encoding words in a particular vector space. Instead of representing a word as a word, it presents it in terms of its embeddings, so essentially the word is replaced by what it is normally surrounded by (Mikolov et al.).6
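
As an illustration of this idea, the sketch below trains tiny word embeddings on an invented toy corpus using the gensim library’s Word2Vec implementation, in the spirit of Mikolov et al.; the corpus, parameters and outputs are placeholders and have nothing to do with the data Perspective was actually trained on.

```python
# An illustrative toy, not Perspective's training data or pipeline: training tiny word
# embeddings with gensim's Word2Vec, so that a word's vector comes to reflect the
# contexts it normally appears in. Corpus and parameters are invented for illustration.
from gensim.models import Word2Vec

corpus = [
    ["this", "comment", "is", "toxic", "and", "rude"],
    ["this", "comment", "is", "kind", "and", "helpful"],
    ["rude", "and", "toxic", "replies", "ruin", "discussions"],
    ["kind", "and", "helpful", "replies", "improve", "discussions"],
]

model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, epochs=200, seed=1)

print(model.wv["toxic"][:5])                   # a dense vector standing in for the word
print(model.wv.most_similar("toxic", topn=2))  # words that appear in similar contexts
```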

The data the network was trained on was manually labeled by humans for different attributes - the first and primary one is ‘Toxicity’. Other attributes include ‘Identity attack’, ‘Insult’, ‘Profanity’, ‘Threat’, ‘Sexually explicit’ and ‘Flirtation’. These attributes are also called models (“Perspective API Reference”). There is a separate classifier for each attribute. The human labelers were asked, for example, to rate whether a comment is very toxic, toxic, slightly toxic, hard to say, or not toxic (“Annotation Instructions for Toxicity with Sub-Attributes”). This human labeling of hundreds of thousands of comments was used to train the algorithm. The learning part of the algorithm thus consists in trying to identify the specific features that characterize the decisions made by the human labelers. After the learning process, the algorithm is able to classify new comments which were not labeled. Now, when there is a call to the API, it processes the data and the comment is rated based on the model it is set on. The scores are on a scale of 0 to 1, and they represent a probability. The toxicity model, for example, predicts the probability of the comment being perceived as toxic by other users. What the machine learns, then, are human labels, as it is trying to imitate human judgment.
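
To ground this description, here is a minimal, self-contained sketch of the general technique: a small convolutional network over word embeddings, trained on binary human labels and producing a probability-like score. It is not Perspective’s actual architecture or data; the vocabulary size, layer sizes and toy inputs are invented for illustration.

```python
# A minimal sketch of a CNN text classifier over word embeddings. This is NOT
# Perspective's actual architecture or data; vocabulary size, layer sizes and the
# toy inputs below are invented placeholders.
import numpy as np
from tensorflow.keras import layers, models

VOCAB_SIZE = 20000   # hypothetical vocabulary size
MAX_LEN = 100        # comments padded/truncated to 100 tokens
EMBED_DIM = 128      # dimensionality of the word embeddings

model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),              # word index -> embedding vector
    layers.Conv1D(64, kernel_size=5, activation="relu"),  # convolve over windows of words
    layers.GlobalMaxPooling1D(),                          # keep the strongest signal per filter
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),                # probability of the "toxic" label
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Toy placeholders: integer-encoded comments and human labels (1 = toxic, 0 = not toxic).
X = np.random.randint(0, VOCAB_SIZE, size=(8, MAX_LEN))
y = np.array([0, 1, 0, 0, 1, 1, 0, 1])
model.fit(X, y, epochs=1, verbose=0)

print(model.predict(X[:1], verbose=0))  # a score between 0 and 1, not a yes/no decision
```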

The theoretical basis I will be utilizing to analyze this technique is mainly part of a growing body of literature referred to as “critical algorithm studies” - broadly speaking, the application of humanistic and social scientific approaches to algorithms (Seaver 1).7 Seaver proposes to see “algorithms as culture”, because, like culture, they are “enacted by practices which do not heed a strong distinction between technical and non-technical concerns but rather blend them together. In this view, algorithms are not singular technical objects that enter into many different cultural interactions, but are rather unstable objects, culturally enacted by the practices people use to engage with them… they are composed of collective human practices” (5).8

6 This is of course a very simplified version of a much more complex technology, suitable for the purposes of this research. For further reading, consult the article.

7 The academic discourse around algorithms has been partly polarized, with growing scrutiny of ‘algorithmic bias’ (O’Neil; Noble; Pasquale). This is of course a crucial debate, but since the specific technological tool I am discussing does not make life-altering decisions, such as credit scoring or the classification of job applicants, the debate in this thesis will not revolve around that much.

8 Paul Dourish adds that common concerns with algorithms’ identity and evolution might lead towards an approach to algorithm studies that puts aside the question of “what an algorithm is as a topic of conceptual study and instead adopt a strategy of seeking out and understanding algorithms as objects of professional practice for computer scientists, software engineers, and system developers” (9).

Rieder offers an approach to the analysis of software “that sits between broad theorizing and the empirical investigation of concrete applications of information ordering algorithms” (101), in order to problematize the work of these algorithms. Techniques such as giving prominence to trending topics or calculating the compatibility of users on dating sites delegate and express cultural, and thus highly ambiguous, tasks as mechanical procedures (101). The specific algorithmic technique he discusses is the naïve Bayes classifier, used among other things for spam filtering, but whose mechanism can also apply to the algorithmic techniques used in Perspective. As Rieder puts it, “techniques such as Bayes classifiers propose the means to derive decision models from the encounter between some data, a purpose, and a mechanism for feedback” (110). If we draw an analogy between the process Rieder describes for the Bayes classifier and what happens in Perspective, we can claim that the tool derives – or ‘learns’ – optimal parameters from the relation between data (comments generated by online users), feedback (the labels, e.g. ‘very toxic’/‘toxic’/‘non-toxic’, given by crowdsource workers to each comment), and a purpose (show the moderator the toxicity score, or notify the user their comment violated community guidelines). Similar to the Bayes classifier, Perspective will answer the question of whether or not a comment should be rejected not with a yes or no, but rather with 0.7 (112). Parallel to Seaver, who writes about enactments, Rieder emphasizes that “even purely computational procedures rely on extensive work that is not calculative” (113). In this case the most crucial part of that work is labeling the data in a manner that allows models to be built upon it.
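
The sketch below illustrates Rieder’s triad with the naïve Bayes classifier he discusses, using scikit-learn; the comments and labels are invented placeholders, and the resulting number is merely an example of the kind of score such a classifier returns, not anything Perspective would output.

```python
# A toy illustration of Rieder's triad of data, feedback and purpose, using a naive
# Bayes classifier (via scikit-learn). The comments and labels are invented; the point
# is only that the output is a probability rather than a yes/no decision.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

comments = [                                   # data: user-generated comments
    "you are an idiot go away",
    "thanks for the thoughtful article",
    "nobody wants you here you idiot",
    "interesting point, I had not considered that",
]
labels = [1, 0, 1, 0]                          # feedback: human labels, 1 = toxic

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(comments)         # comments as word counts
classifier = MultinomialNB().fit(X, labels)    # 'learning': derive a decision model

new_comment = vectorizer.transform(["you are an idiot"])
# purpose: hand the moderator a score rather than a verdict
print(classifier.predict_proba(new_comment)[0][1])
```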

Burrell, also discussing spam filters, writes about the opacity of algorithms and how it stems from three sources: intentional corporate self-protection and concealment; the fact that writing and reading code is still a specialist skill set; and a “mismatch between mathematical optimization in high-dimensionality characteristic of machine learning and the demands of human-scale reasoning and styles of semantic interpretation” (2). She examines machine learning and specifically its application in neural networks for character recognition. She explains the layered structure of neural networks, some of the layers hidden, and their inter-connectivity, which allows for a more complex modeling of language than a technique like the Bayes classifier. The algorithm assigns weights, and in the case of text classification these weights can be given to specific words or compound phrases. In Perspective, the ‘learner’ part of the algorithm, trained on data labelled by humans as ‘toxic’ or ‘non-toxic’ and as ‘rejected’ or ‘not rejected’, then creates a matrix of weights to classify the new input data (5-6).

The opacity in the process Burrell describes, character recognition, is evident when the algorithm does not break the task into subtasks readily intelligible to humans. She also notes that handwriting recognition is not a conscious task in humans either, so there is a kind of opacity in the human process of character recognition as well (7). This is also true of human content moderation: sometimes it is hard to say exactly what in the words or meaning of a sentence one is offended by, and it can be a matter of nuance. All of the considerations and processes that I have delineated above are important in trying to understand how Perspective ‘thinks’, and I will elaborate on that in the discussion chapter.

1.1.4 Types of Hateful and Problematic Speech

Let us go back to the phenomenon of hate speech proliferation on platforms, in order to distinguish between different categories. In mass media especially, there is a lot of overlap between different categories. Journalists and politicians tend to include hate speech, harassment, and even misinformation and fake news as part of the same group of expressions and behaviors social media platforms should eliminate. I claim it is important to distinguish what kind of speech these tools are supposed to identify.

As I mentioned in the introduction, my initial interest was hate speech, a term that also has a legal definition in many countries and protects certain classes such as women, LGBTQ people, racial minorities, and ethnic identities. Most platforms also have their own definition of hate speech or hateful speech, which leans partly on the legal definitions but might add to them. In a chapter of his book dedicated to reviewing and critically analyzing platforms’ community guidelines, Gillespie notes that the guidelines regarding hate speech echo U.S. legal language and that this legal tone takes over the otherwise casual tone of these documents (Custodians 59).

There is also another form of speech I would like to discuss: harassing or abusive speech that is not necessarily directed at a member of a minority but is meant to distress, annoy, or insult participants in a discussion. Leiter defines these expressions as ‘tortious harms’, “giving rise to causes of action for torts such as defamation and infliction of emotional distress”, and ‘dignitary harms’, which constitute “harms to individuals that are real enough to those affected and recognized by ordinary standards of decency, though not generally actionable” (155), to which he also refers as low-value speech (156).

For the purpose of this research, the guiding definitions of unwanted speech are those offered by the case studies themselves. As I will explain more thoroughly in the discussion chapter, Perspective has different models for defining what unwanted speech the algorithm identifies. Amnesty focused on sexism and misogyny and defined it as problematic or abusive speech. As will be elaborated later on, these tools do not specifically identify hate speech, each due to its own set of considerations.

Following the design of the tools, I will also not look at private communication in instant messaging within platforms, but only at publicly posted expressions. It is also important to distinguish these correspondences from what is usually termed online harassment, which may lead to behaviors like doxxing (exposing private information that might disclose an individual’s address, phone number, or workplace and escalate the harassment to the offline realm). Online harassment is a topic that deserves an elaborate exploration of its own, but it is not within the scope of this research.

1.1.5 Moderation Guidelines

Today, platforms use a variety of tools and techniques to moderate user content. A very common mechanism embedded in them is the flagging or reporting option (Crawford and Gillespie). This function allows users to notify the platform about any content they feel violates the terms of service or community guidelines. Usually, flagging mechanisms lead the user to a window where they can indicate what category the offensive or problematic content falls under. The report is sent to the platform, which decides what to do with the post or comment. YouTube was probably the first platform to offer a flagging button, in 2005, while Twitter only added a report button in 2013 (Gillespie, Custodians 88). News websites like the NYT and The Guardian also offer a flagging and reporting option for their comment sections. This mechanism has become synonymous with the basic affordances of social media, such as liking, commenting, or sharing. Flags are a “symbolic linchpin” in maintaining a platform’s self-regulation and avoiding government oversight and intervention. They also provide a rhetorical warrant for content removal by platforms because they represent users’ concerns and objections (Crawford and Gillespie 412). The flagged content is usually reviewed by human moderators, algorithms, or a combination of both. On many platforms and websites hosting user-generated content, flagging still works as the first indicator that content is problematic. Still, it is not always clear to users why one flag will result in the deletion of content or the suspension of a user, while another will be ignored.
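
The flagging workflow described above can be sketched as a simple data structure and queue: the user selects a category, the report is stored, and it waits for review by a human moderator or an algorithm. This is a purely illustrative sketch; the categories, field names, and routing logic are invented and do not correspond to any platform’s real implementation.

```python
# Purely illustrative flagging workflow: a user's flag records a category
# and joins a review queue for a later human or automated decision. The
# categories and routing are invented and reflect no platform's real system.
from dataclasses import dataclass
from queue import Queue

CATEGORIES = {"hate_speech", "harassment", "spam", "other"}

@dataclass
class FlagReport:
    comment_id: str
    flagged_by: str
    category: str
    note: str = ""

review_queue: Queue = Queue()

def flag_content(comment_id: str, user: str, category: str, note: str = "") -> None:
    """Record a user's flag and place it in the moderation review queue."""
    if category not in CATEGORIES:
        category = "other"
    review_queue.put(FlagReport(comment_id, user, category, note))

flag_content("c-1024", "user42", "harassment", "threatening reply")
print(review_queue.get())  # next report awaiting human or automated review
```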

With hundreds of millions to billions of users on the largest platforms, an unimaginable amount of content must be reviewed in some way every day. Even with the use of flagging or automatic detection, guidelines must be established in order for the human moderators to decide what to delete or filter. Many of these policies are purposely secret (Roberts), and there is public criticism regarding how these rules are made and enforced. For example, in 2017 a journalistic exposé revealed Facebook’s moderation guidelines for its human moderators. It contained many examples that sparked controversy. One of these was the instruction to delete comments like “Someone shoot Trump”, because heads of state are a protected category, while comments like “To snap a bitch’s neck, make sure to apply all your pressure to the middle of her throat” or “fuck off and die” may be permissible because they are not regarded as particularly credible threats. Other permissible phrases mentioned in the Facebook documents exposed by The Guardian included “Little girl needs to keep to herself before daddy breaks her face,” which the platform defines as “generic or not credible” (Hopkins n.p.).


Other similar policy documents regarding the moderation of political speech in different national contexts, exposed by the NYT in 2018, revealed a lack of consistency and coherence in the guidelines. Complex localized political issues, such as what is considered a hate group that should be banned, or what speech is dangerous, are distilled into simple yes-or-no rules determined by lawyers or engineers. Their enforcement and real-time decision making, however, are outsourced to companies employing mostly unskilled workers (Fisher n.p.). According to the NYT report, those moderators sometimes rely on Google Translate to translate posts and have seconds to recall the numerous rules and apply them to hundreds of posts each day. At the time of the publication, Facebook said it had increased the number of moderators from 7,500 to 15,000 worldwide.

Due to the ongoing criticism, platform companies have started issuing reports on content moderation. For example, Facebook published clearer and more detailed community standards and, as of May 2018, a quarterly community standards enforcement report (Bickert). YouTube, as I have mentioned earlier, recently made changes to its hate speech policy that were communicated to the public (YouTube official blog). As for Twitter, a senior official recently promised that the platform will publish ‘case studies’ explaining the decisions behind banning high-profile accounts (Bell).

In addition, criticism has been directed at the poor working conditions of the human moderators of large platform companies. At Facebook, for example, the moderators are employed by external companies and are paid a fraction of what regular Facebook employees are paid. They have only seconds to make moderation decisions, and their work essentially consists of watching and reading horrid examples of violence, racism and cruelty day in, day out. According to some accounts, these employees have reported developing PTSD-like symptoms after they leave the company (Newton).

This chapter presented the conceptual and legal background of content moderation policies on large social media platforms. It problematized the issue of content moderation through the framework of platform governance and the process of platformization in order to lay the groundwork for the analysis of the case studies in the discussion chapter. The framework of platform studies is necessary because this is the cultural logic prevalent in recent debates about content moderation. This chapter also gave the theoretical and practical background of the computational technologies used to automatically analyze user content.

The next chapter will elaborate on the methodology of this research. I will present the case studies chosen in order to comprehensively explore the topic of automated content moderation and the mixed research methods applied: namely, qualitative ‘grey literature’ analysis, semi-structured interviews with experts, and small-scale algorithmic auditing.


Chapter 2: Methodology

How to Study Complex and Experimental Digital Tools

After situating my research within the theoretical framework proposed, I would like to present the methodology chosen to answer my research questions, now informed by this framework, which are:

What decisions and logic stand behind the automation of online moderation?

How are tools for automatic online content moderation conceived by their creators (designers, developers, domain experts)?

In what domains and in what way do these tools operate in real-life situations today? How do they affect these domains?

One of the challenges in new media research, and specifically in humanities research of technological and computational tools, is the need to construct suitable methodologies for examining these phenomena, most of them new, fluid, and technically complex. The tools I am examining are themselves experimental and undergo constant change by their creators. When it comes to automated online moderation tools, a quick Google search shows there are many companies in the market offering different services. Tech media covers large platform companies’ attempts at developing these tools as an answer to the growing public discontent around issues of content moderation I have mentioned earlier. But moderation, human or automatic, is purposely not transparent (Roberts 2). In some of the better-case scenarios, like Google’s Perspective, we are left with an API, public code documentation if there is any, and the interface of the tool itself. In worst-case scenarios, we do not even know there is an automatic tool involved in the moderation process.
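
To give a sense of what ‘being left with an API’ looks like in practice, the sketch below queries Perspective’s publicly documented comments:analyze endpoint for a toxicity score. It assumes a valid API key issued by Google and reflects the request format as publicly documented at the time of writing; the API version (v1alpha1) and field names may have changed since.

```python
# A minimal sketch of requesting a toxicity score from the Perspective API.
# The key below is a placeholder; running this requires a real key and
# network access, and the endpoint/fields may have changed since writing.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder, not a real key
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       "comments:analyze?key=" + API_KEY)

payload = {
    "comment": {"text": "thanks for sharing, I learned something new"},
    "languages": ["en"],
    "requestedAttributes": {"TOXICITY": {}},
}

response = requests.post(URL, json=payload).json()
# The summary score is a value between 0 and 1, not a yes/no verdict.
score = response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
print(f"Toxicity score: {score:.2f}")
```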

This research is conducted under the umbrella of ‘Grounded Theory’ (Glaser and Strauss), meaning “the discovery of theory from data”. Due to its exploratory nature, I did not come to it with any preconceived theoretical hypothesis, but rather chose my case studies based on my ability to gather enough data on them. I first studied them and then chose suitable theories to analyze them. This is not to say that the research was less academically rigorous, only that the process was not as linear as the structure of the paper suggests. It entailed an ongoing interplay between analysis, data collection, and modifications to existing theories, grounding them in the collected data (Strauss and Corbin 273).

This chapter first presents the case studies and then the mixed methods chosen to pursue this research, explaining how they will be utilized to answer the research questions.


2.1 Qualitative Grey Literature Analysis

‘Grey literature’ is a broad term that includes any material the public reads outside of journals, books, and daily newspapers. This material is usually not archived by libraries. Everyday examples of grey literature are weather reports, police records, and street maps (“Collecting Grey Literature” 38). The advent of the internet has blurred the lines between official indexed scientific publications and grey scientific literature even further. For the purposes of this research, the main grey source I have used is the code repository GitHub, a collaborative code-hosting platform that offers different services, with a clear topical focus on sharing and discussing coding projects.

Conversation AI published a vast array of material on GitHub, and so it was the first and most easily available resource. The materials included code, code documentation, data that was used for training models, and general information provided by the research team about their work. Other grey materials include Perspective’s demo, Troll Patrol’s research report, and Talk’s webinar. Another example of this kind of material is the Kaggle competitions opened in order to improve Perspective, which include explanations of the issues the research group encountered. Analyzing these materials from a qualitative perspective can reveal a lot about the decisions and logic behind the development of the tools, as well as their strengths and weaknesses. Therefore, these materials are crucial for my analysis. One of the main reasons for choosing the said case studies was the amount of grey material available on them. Without it, conducting this kind of grounded and exploratory research would have been very hard.

GitHub is not yet a common resource for new media research, and so some consideration of how to analyze it was required. Given the abundance of material in the case of Perspective, it was also important to select what material to use. I cannot say that my use of GitHub was exhaustive, but rather that it helped me gain basic knowledge of Perspective and informed my interview questions in the first exploratory stages of the research. After conducting the interviews, the material on GitHub helped me identify the corresponding themes in the interviews.

Public code documentation presents itself as a semi-finished product and sometimes hides the deliberations and failures along the way. That is why another method was necessary in order to substantiate and expand on the public material. Given that Perspective involved extensive collaboration with experts from different fields, interviewing these experts was chosen as a method. I will now specify the tools that will be studied as the case studies in this research, and then explain the two other complementary methods: semi-structured interviews and small-scale algorithmic auditing.
