
Tweeting in the Bubble: An Analysis of Filter Bubbles from the 2016 Presidential Elections

SUBMITTED IN PARTIAL FULFILLMENT FOR THE DEGREE OF MASTER OF SCIENCE

Kellie English
11293802

MASTER INFORMATION STUDIES
Human Centered Multimedia
FACULTY OF SCIENCE
UNIVERSITY OF AMSTERDAM

August 21, 2017

1st Supervisor: Dr. Stevan Rudinac
2nd Supervisor: Dr. Damian Trilling


Tweeting in the Bubble

An Analysis of Twitter Sources and American Politics

Kellie English

University of Amsterdam
Student number: 11293802

english.kellie@gmail.com

ABSTRACT

Since the 2016 United States presidential elections, the world of politics has changed. Social media has proven itself to be a catalyst for this change: take, for instance, the use of Twitter by President Trump. One of the most widely cited issues with social media is that of filter bubbles. Despite the arguments in popular culture against personalization algorithms, research thus far has been inconclusive in determining the existence of filter bubbles. This thesis further explores the existence of filter bubbles by looking at cross-ideological communication on Twitter, within the context of the current political climate in the USA.

Keywords

Twitter, alternative facts, media bias, social media, social bubble, filter bubble

1. INTRODUCTION

The 2016 US Presidential elections, and the resulting inauguration of President Donald Trump, provide a fascinating view into the power of social media in the current political and social climate. Trump's win left many in the center and on the left reeling, searching for reasons behind the success of Trump's campaign versus that of Clinton's. The number one culprit? Social media, specifically the prevalence of filter bubbles and fake news that pervaded these social sites and networks.

Filter bubbles, or the creation of personalized and therefore homogeneous internet browsing experiences, are currently under scrutiny due to concerns of users seeing only similar opinions online, thereby eliminating any sort of dissent and convincing the user that their opinion is correct, regardless of facts [DiFranzo & Gloria-Garcia 2017]. Filter bubbles are defined as a sort of cherry-picked version of the internet, in which a user is only exposed to articles they are already highly likely to interact with: for instance, a liberal who sees only articles from the New Yorker and posts from a Black Lives Matter group [Baer 2016]. Further under scrutiny is the idea of fake news, or false articles and information, that is able to invade these echo chambers and filter bubbles. These fake articles tend to favor one bias or another, supporting those opinions that are prevalent within each bubble, where users (who are thus supported in their beliefs) are unlikely to check whether the facts in these articles hold true [Silverman 2016].

Increasingly, the use of social media has cemented itself in American politics, especially in the case of the current administration and Twitter. In the past, both politicians and voters turned to social websites in order to voice their concerns, grow support, and define themselves and their ideas of how the government should be run. Today, the Internet, with its ever-evolving continuous news cycles, is providing a platform for politicians and leaders to communicate directly with their constituents, without filters and, oftentimes, without the approval of their own advisors.

1.1 Tweeting and Politics

Twitter takes full advantage of this 24-hour news cycle [Bhattacharya & Ram 2012]. The growth of this social media platform has led to a multitude of research, encompassing everything from sentiment analysis to the use of microblogging in political affairs [Jungherr 2016]. Twitter allows users to share their thoughts in 140 characters or less, in real time [Vergeer & Franses 2016]. This means that the tweets submitted can be tied to certain events or actions that are occurring throughout the world. The Twittersphere consists of over 300 million active users [Aslam 2017]. Anyone with Internet access can create an account, and can then unleash their opinions onto the world, with few, if any, regulations. Conversations in the Twittersphere tend to reflect current events, with politics being especially popular [Vergeer & Franses 2016, Isaac & Ember 2016].

Twitter users are prone to sharing political opinions and thoughts, in addition to news [Weller et al. 2014]. Tweets are less narrative, and more stream of consciousness, with users sharing up-to-date information as it occurs, annotated with their thoughts and opinions (for instance, in the case of the three election debates in 2016) [Isaac & Ember 2016]. These real events are tied together with the user's interpretation, whether or not the user is an expert in the subject. Thus, events are framed by the users, connected and networked to other events by users, and are given significance by users, rather than news media telling users what is important and what they should be paying attention to [Weller et al. 2014]. Users are able to act with agency within the platform, behaving as curators and 'gatekeepers' of information, rather than as passive consumers. They are able to define meaning in events for themselves, both on a personal level and on a global scale [Weller et al. 2014].

The conversations on Twitter are organized according to topics, denoted by hashtags [Trilling 2015, Kong et al. 2014]. For instance, over the course of President Trump's election, Twitter (as well as other social media platforms) made use of the hashtag 'Make America Great Again' (#makeamericagreatagain and #maga). A Google search for this hashtag reveals tweets, as well as relevant articles and other information regarding the election. By filtering through trending topics, Twitter can also give insight into the topics that are important in any given time period.

With the polarized reaction to the results of the most recent presidential election in the United States, popular culture has become imbued with the idea of fake news. Indeed, the idea was so prevalent that the Oxford English Dictionary declared the Word of the Year for 2016 to be 'post-truth'. The OED defines post-truth as forgoing 'objective' facts in favor of emotional arguments and personal beliefs [Oxford English Dictionary Online 2016].

2. RELATED WORK

For the purposes of this research, we tried to examine the communities, or 'bubbles', that form around different ideologies, conversations, topics, and beliefs. It is important to examine whether or not there is a general discourse between different groups, or whether there exist self-contained communities continually bolstering their pre-defined belief systems.

The prevalence of social media in everyday life adds to concerns of users, and therefore voters, being manipulated by the content that they are exposed to. Some authors even go so far as to claim that social media is 'destroying' democracy, or claim that it is the sole reason for Donald Trump's election [El-Bermway 2016, Baer 2016]. However, research suggests that this is merely speculation on behalf of people who are unhappy with the current political climate; these claims remain unsubstantiated.

2.1 The 2016 Presidential Elections

According to Allcott & Gentzkow, Trump would have lost the election had it not been for the prevalence of fake news spread through social media sites [Allcott & Gentzkow 2017]. Indeed, a 2017 study done by Buzzfeed Research showed:

...1) 62 percent of US adults get news on social media (Gottfried and Shearer 2016); 2) the most popular fake news stories were more widely shared on Facebook than the most popular mainstream news stories (Silverman 2016); 3) many people who see fake news stories report that they believe them (Silverman and Singer-Vine 2016); and 4) the most discussed fake news stories tended to favor Donald Trump over Hillary Clinton (Silverman 2016). [Allcott & Gentzkow 2017, p. 2]

Despite this, only 14 percent of US adults used social media platforms as their main resource when it came to important news and information [Allcott & Gentzkow 2017]. Users who are searching for political information online are, generally speaking, already predisposed to be politically inclined or already have an interest in politics [Kruikemeier et al. 2014].

There remains a difference, though, between conventional news coverage and that of social media. The former is verifiable and backed by the reputation of the organization. Further, these organizations tend to have readerships that will force their hand when it comes to making certain that stories are true. Social media, on the other hand, is a literal free-for-all. With a few notable exceptions, users are free to post anything they desire, despite any statements to the contrary.

It is well known that certain newspapers and news outlets have particular political leanings. Likewise, it can be assumed that the readership of any given news medium will reflect (at least to some degree) the bias of that organization. A reader of the notoriously left-wing New Yorker will most likely also believe in more liberal policies, though they may not necessarily be a Democrat. Thus, any stories or content distributed by these media organizations may be considered within a frame: the selection and subsequent promotion of a certain series of facts over others. In other words, the framing, or contextualizing, of certain stories ensures certain features take center stage, while the rest are left along the periphery [Entman 1993].

The Internet allows for the dispersal of that power: news media are no longer the sole proprietors of information, according to Groshek and Tandoc's study on gatekeeping, or distributing information, via Twitter (2017). Users are no longer the passive consumers they once were, and are now much more likely to create content as well as consume it. Users can now act in the role of media gatekeepers, processing and analyzing news themselves rather than passively allowing someone to do this for them. Indeed, the 2017 study discovered that, while journalists may still hold some prestige in terms of followers, they no longer wield the influence they once did; journalists are merely a part of the conversation, rather than the leaders of the discussion [Groshek & Tandoc 2017, p. 208].

2.2 The Modern Day Media

One of the main concerns with social media is that there is no way to verify the content that users post; it is not fact-checked or filtered by anything other than the current mood of the user. The concept of 'fake news' came into play in the early 2000s, when concerns grew over users being able to cherry-pick the stories and articles they wanted to read, rather than being exposed to an entire news program. This would supposedly allow the user to only find and read stories that supported their already established beliefs, while allowing them to ignore any other stories that supported contrary beliefs or issues [Allcott & Gentzkow 2017]. Allcott & Gentzkow state that a single user can, in some cases, have a viral posting that garners more engagement than traditional, and far more popular, media outlets [Allcott & Gentzkow 2017]. If users subscribe to a particular political ideology, they are more likely to believe a fake news headline that supports their viewpoint. Further, if a user's social media connections are very 'ideologically segregated', that user is predisposed to believing headlines that support their own ideology [Allcott & Gentzkow 2017, p. 4]. This could then be the locus for the spread of fake news within different social bubbles: a user with a certain perspective posts something, which is picked up and supported by others who follow the same belief pattern.

Studies of news consumption during the election period have found otherwise, however. According to one study, entitled 'Is Fake News a Fake Problem', consumption of fake news remained relatively stable throughout the course of the election, while regular news consumption fluctuated depending on how close election day was [Nelson 2017]. Further, users mostly accessed fake news via social media, and these made up a very small percentage of the total number of social media users [Nelson 2017]. This leads the author to speculate that the idea of fake news has been spread out of proportion with the actual scope of the problem.

2.3 Social Exposure

A 2013 paper by Hosanagar et al. argued that the personalisation of the internet will create 'fragmentation', allowing users to exist within a filtered worldview without influence from alternate opinions. Recently, concern has spread over echo chambers forming via social media, supported by new algorithms that show only those articles a user is most likely to engage with, essentially eliminating all opposing positions [Neubaum & Krämer 2017]. One of the reasons that this is dangerous is that users are likely to use their immediate communities as a judge of the entire country's political climate.

According to Conover et al., users searching for political information online look for posts, articles and blogs that already reflect their preexisting beliefs (2011). Further exacerbating the problem is the filtering process that takes place. Though all of this data is available to everyone, at any time, the information a user interacts with is much more selective. Based on everything from complex algorithms that determine relevant pieces, to a user's digital footprint, data is filtered in such a way that is beneficial to both the user and the businesses the user interacts with. However, this is cause for some alarm. Studies have shown that social media users form 'bubbles', interacting, for the most part, with content that is shared by like-minded people with whom the user already agrees (thus perpetuating their beliefs) [Conover et al. 2012].

Additionally, users will retweet other users who share their own political beliefs. Mentions, however, run the gamut: the study describes these as a 'bridge' that connects users of different political leanings [Conover et al. 2011]. Users are thus retweeting other users already within their communities. Despite this, the users' mentions could indicate that they are not creating homogeneous bubbles around themselves; rather, the wide variety of bias indicated in the mentions suggests that users are indeed exposed to opinions contrary to their own [Conover et al. 2011].

Selective exposure theory, or the theory that, out of the overwhelming amount of content available online, users will choose to interact with the content that confirms their predetermined beliefs, has been examined as the basis through which users on social media choose their news [Morgan et al. 2013]. A study done in 2016 examined the role of social media in forming and perpetuating Islamophobic groups online [Puschmann et al. 2016]. While the subject area does not have direct relevance for this study, the research showed a few things. First, though traditional media was popular among all groups, right-wing, populist groups were more likely to rely on sources from non-traditional places (such as blogs, social media, or less-reliable websites) [Puschmann et al. 2016]. Further, while social media sites are used by everyone equally, there is some suggestion that, based on the sharing patterns of users, like-minded (and like-partied) individuals have overlapping use patterns across various websites, suggesting that an individual who believes in the alt-right would frequent mainly blogs, social media pages and other websites that further perpetuate their belief system [Puschmann et al. 2016]. This type of personalized content can create issues, especially when a person is continually backing up their personal beliefs, no matter the theories behind them.

2.4 Filter Bubbles

Social networks now mediate information to users via their platforms: users can obtain news stories from Twitter, Facebook, Instagram and even Snapchat. These platforms are subject to algorithms that change the content based on what a user is most likely to engage with. While useful for users who want to filter out noise from their newsfeeds, this content sorting can also create filter bubbles, through which a user is exposed only to viewpoints and ideas that they already support and agree with [Bakshy et al. 2015].

With the 24-hour news cycle, the daily headlines are chosen based on 'clickability': what will get the most interaction from the most users, rather than what is particularly important. The traction of a story is more useful to online publications than the actual content provided. What is 'shareable' and dynamic is more valuable than exact, factual statements. This is exactly the logic that determines what content users will see on their social media feeds. Take, for instance, Facebook's news feed algorithm. Machine learning allows users' news feeds, content, and even search engines to be highly personalized. This creates concerns of filter bubbles being formed: algorithmically enhanced separation of demographics based on their beliefs [Flaxman et al. 2016]. Because these filtering processes only pair like sources, it is possible that users who are, for instance, particularly liberal might only see other sources that are also deemed 'liberal'. Clicking on and following Bernie Sanders, for instance, leads the algorithm to suggest Elizabeth Warren, not Warren Buffett.

Though this may seem trivial, it enables users to form bubbles around themselves, filtering out any information they disagree with, no matter the veracity of that information. In actuality, however, how significant the filter bubbles are is questionable. Facebook users are simply more likely to interact with articles and posts that substantiate their positions, which then allows one to assume that users who solely obtain their information from social media are, indeed, part of a filter bubble [Allcott & Gentzkow 2017].

A 2015 study showed that there was some cross-ideological communication occurring [Barberá et al. 2015]. One study found that fragmentation caused by these systems is not always a given. For instance, a personalized recommender system has the potential to expose users to far more content than they would otherwise engage with [Hosanagar et al. 2013]. In another study, researchers found that the use of social media was not actually a factor that contributed to the creation of filter bubbles. In fact, users who frequented social media sites regularly reported being exposed to far more alternate views, ideologies and opinions than they otherwise would have [Groshek & Tandoc 2017, pp. 1400-1].

Despite the use of personalization algorithms, users still see a certain percentage of opposing viewpoints and opinions across their social media platforms [Neubaum & Krämer 2017]. It is arguable that social media could even be exposing users to a greater variety of opinions, thereby 'bursting' the filter bubble [DiFranzo & Gloria-Garcia 2017]. Those voters who are not participants in any social media sites are much more likely to be exposed to any opposing views, rather than creating and perpetuating homogeneous bubbles of information [DiFranzo & Gloria-Garcia 2017].

In an interview shortly after the elections, Mark Zuckerberg was quoted as saying that Facebook had little to do with the election results. Despite the fact that there were identifiable 'fake news' articles that had been shared through the social networking site, Zuckerberg was adamant that voters chose a candidate based on their predefined beliefs and 'lived experiences' rather than articles they may or may not have interacted with throughout the course of the 2016 campaign [Newton 2016]. Facebook also points to an internal research project which showed that echo chambers and filter bubbles have less of an impact than popular culture would have one believe, stating that a user's news feed changed very little after the adoption of tailored news feeds [Bakshy et al. 2015].

Ultimately, multiple studies have shown that users see more alternative opinions than popular culture would care to admit [Parkinson 2016]. Users are exposed on a daily basis to at least a portion of different viewpoints and ideologies [Bakshy et al. 2015]. Arguably, this could create a more global user, rather than an insular one: by being a part of a larger social network, users can be exposed to a great number of positions and viewpoints that would not otherwise be available to them [Flaxman et al. 2016]. Unfortunately, this is nearly impossible to measure. While attempts have been made, by measuring the political leanings of certain articles or measuring users' exposure to certain viewpoints, measurements are based on users self-reporting rather than anything concrete [Bakshy et al. 2015].

3. PROBLEM STATEMENT

In an analysis of what went viral in the election campaign, Darwish et al. state that the result of the election, the election of Trump, was reflected on social media. Even his catchphrase, 'Make America Great Again', was more popular on these platforms than any of those attributed to Clinton [Darwish et al. 2017]. For instance, the majority of the days leading up to the election showed more tweets that were Trump-positive than Clinton-positive. Interestingly, the mainstream media was more likely to publish an anti-Trump article than one promoting him. Twitter sentiment did, overall, reflect the results of the actual election [Hamling & Agrawal].

We look to examine filter bubbles on Twitter, in the context of the current political climate in the USA. It seems, from popular culture, that the country is polarized for and against the current president. Is it possible to trace this on Twitter, and what would the potential significance of this be? We will examine the communities within which these tweets are shared, and examine the conversations happening around these tweets, in an attempt to understand the manner in which untrustworthy information is spread. Finally, we will try to define the social 'bubbles' that exist online, by looking at whether or not there is cross-ideological communication occurring.

3.1 Research Questions

1. Is it possible to identify social 'bubbles' of homogeneous opinions that have formed, based on the relationships between user accounts?

2. Are these 'bubbles' closed groups, or do members have cross-party communication?

4. METHODOLOGY

In order to examine the prevalence of social bubbles on Twitter, we used a collection of tweets originating during the course of the 2016 Presidential elections, and beyond. Because politics in the United States involve fairly well-defined groups (Democrats and Republicans), we assumed that it would be possible to identify, at the very least, these groups within the dataset.

4.1 The Dataset

The dataset for this research contains over 250 million tweets, collected between July 2016 and April 2017. These tweets were obtained from the Twitter streaming API based on common hashtags, locations, and mentions that connected them to the events occurring during and after the 2016 United States Presidential elections.
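For illustration, a collection of this kind could be set up against the streaming API roughly as follows. This is a minimal sketch using the tweepy library as it existed at the time; the credentials, tracking terms, and output file are placeholders, not the actual collection setup used for this dataset.

```python
import json
import tweepy

# Placeholder credentials; any real collection would use its own keys.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")

class ElectionListener(tweepy.StreamListener):
    """Append every matching tweet, as raw JSON, to a file."""
    def on_status(self, status):
        with open("election_tweets.jsonl", "a") as f:
            f.write(json.dumps(status._json) + "\n")

    def on_error(self, status_code):
        # Disconnect on rate limiting (HTTP 420) instead of reconnecting.
        return status_code != 420

stream = tweepy.Stream(auth=auth, listener=ElectionListener())
# Hypothetical tracking terms; the thesis used hashtags, locations and
# mentions tied to the 2016 elections.
stream.filter(track=["#maga", "#imwithher", "Trump", "Clinton"])
```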

For the purposes of this research, and in an attempt to scale down the number of tweets, we used only those tweets obtained during January 2017, spanning President Trump's inauguration and first weeks in office. In total, approximately 72,000 individual tweets were used.

4.2 PEW Research and Bias Measure

In 2014, PEW Research Center published a report outlining the news consumption habits of adults in the United States. PEW's data collection focused on surveying adults in the United States, and included both those who had declared themselves to be biased (either liberal or conservative), as well as those who were more neutral or refused to answer. The adults surveyed listed their most-used news sources, and answered questions about which sources they used from a list of the most popular news sources compiled by PEW. Participants were asked how much they trusted the articles published by each organization in the list [Mitchell & Matsa 2014]. The results can be seen in Figure 1.

Sources are organized along a spectrum: the sources that are used by both liberals and conservatives appear in the center, with trust that is fairly evenly distributed throughout. This can be seen in Figure 2. The sources that appear on the edges have a much more dramatic distribution, with any measure of trust being either very limited or completely missing from the opposing side [Mitchell & Matsa 2014].

4.3 Bias Measure

When considering a dataset that engulfs users from various political parties, it can be assumed that they will have various political leanings: either liberal or conservative, Democrat or Republican. What is henceforth referred to as 'bias' is the degree to which a user's retweets, hashtags and sources are considered to have either a liberal or conservative bend, based on both data from the PEW Research Center as well as sources that cite the trends from both liberal and conservative sides during the course of the election.


Figure 1: Trust measure of News Sources

Figure 2: Ideological Placement of News Sources

Figure 3: Frequency of Tweeted Biased Sources

The varying degrees measure the extent to which a user can be deemed either conservative, liberal, or mixed/neutral.

4.4 Determining User Bias

For the sake of clarity, users are described using one of two terms: url-users (users who had tweeted links to sources) and non-url users (users who did not).

Initially, in an attempt to determine user bias in accordance with the PEW news source measure, each url-user was organized into a list, along with the root of the URL they tweeted (e.g. nytimes.com). Each of these roots was then counted.

A bias measure was created using the data PEW provided. Each news source was assigned a score of either liberal or conservative, based on the maximum number given in the table in Figure 1. Sources trusted by liberals received a positive score, with the greatest being NPR at 0.72. Likewise, sources trusted by conservatives received a negative score, with the lowest being Fox News at -0.88. The top URLs were then analyzed according to the bias measure, and plotted based on bias and frequency using a scatterplot, which can be seen in Figure 3.
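The root extraction and scoring step can be sketched as follows. This is illustrative only: the tweet list, the subset of PEW-derived scores, and the helper names are assumptions, not the code actually used.

```python
from collections import Counter
from urllib.parse import urlparse

# Illustrative subset of the PEW-derived bias scores: positive values are
# sources trusted by liberals, negative values sources trusted by conservatives.
PEW_BIAS = {"npr.org": 0.72, "nytimes.com": 0.62, "cnn.com": 0.30,
            "foxnews.com": -0.88, "breitbart.com": -0.78}

def url_root(url):
    """Reduce a tweeted link to its root domain, e.g. 'nytimes.com'."""
    netloc = urlparse(url).netloc.lower()
    return netloc[4:] if netloc.startswith("www.") else netloc

# Hypothetical (user, url) pairs extracted from the tweets.
tweets = [("u1", "https://www.nytimes.com/2017/01/20/some-article"),
          ("u2", "http://foxnews.com/politics/another-article")]

# Count each root, then attach a bias score where PEW covers the source.
root_counts = Counter(url_root(url) for _, url in tweets)
scored = {root: PEW_BIAS[root] for root in root_counts if root in PEW_BIAS}
print(root_counts, scored)
```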

The graph shows that the majority of sources tweeted were not news sources: the top URLs tweeted were from twitter.com, presumably leading to other tweets (though not, in fact, official 'retweets'). The most frequent URLs, excluding social media, are listed in Table 1.

Conservatives, as described in the PEW report, favor only a few sources, while liberals were across the board, with sources ranging from 'very liberal' (the New Yorker) to only 'mildly liberal' (CNN).

The majority of users, approximately 70%, were non-url users.

4.5 Hashtags and Mentions Measure

A list of unique hashtags and mentions was gathered from the dataset. For each user, a vector of values was created indicating either that the user had not used the hashtag or mention (0), or that they had (1). This was then transformed into a sparse matrix, which was L2 normalized after removing any zero vectors.
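A minimal sketch of this vectorization, assuming scikit-learn and SciPy; the toy user data and variable names are hypothetical.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.preprocessing import normalize

# Hypothetical per-user hashtag/mention sets; the real vocabulary covered
# every unique hashtag and mention in the dataset.
users = {"u1": {"#maga", "@FoxNews"},
         "u2": {"#imwithher", "@HillaryClinton"},
         "u3": {"#maga", "#imwithher"}}

vocab = sorted(set().union(*users.values()))
index = {tag: j for j, tag in enumerate(vocab)}

rows, cols = [], []
for i, tags in enumerate(users.values()):
    for tag in tags:
        rows.append(i)
        cols.append(index[tag])

# Binary user x feature matrix: 1 if the user used the hashtag/mention, else 0.
X = csr_matrix((np.ones(len(rows)), (rows, cols)),
               shape=(len(users), len(vocab)))

# Drop any all-zero rows (users with no hashtags or mentions), then
# L2-normalize each remaining row to unit length.
X = X[np.asarray(X.sum(axis=1)).ravel() > 0]
X_norm = normalize(X, norm="l2", axis=1)
```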


Table 1: Top URLs in Entire Dataset

hillaryclinton.com
bit.ly
foreignpolicy.com
amp.twing
moby.to
politi.co
goo.gl
theguardian.com
truthfeed.com
bikers4trump.com
washingtonpost.com
redflagnews.com

Figure 4: 3 Cluster Centers, Using Affinity Propagation

4.5.1 Affinity Propagation

Using this matrix, we first applied affinity propagation clustering to automatically determine the number of clusters. The algorithm works by measuring the similarities between different data points, thereby allowing researchers to easily identify clusterings and groupings within the dataset without having to predefine the number of clusters [Wikipedia 2017a].

This resulted in 3 estimated cluster centers, with homogeneity at 0.872, as seen in Figure 4.
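Such a clustering could be reproduced along these lines (a sketch assuming scikit-learn; X_norm refers to the normalized matrix from the previous sketch, and the reference labels needed for a homogeneity score are left as a placeholder):

```python
from sklearn.cluster import AffinityPropagation

# X_norm: the L2-normalized user x hashtag/mention matrix built above.
# Affinity propagation chooses the number of clusters itself; scikit-learn's
# implementation expects a dense array here.
ap = AffinityPropagation()
labels = ap.fit_predict(X_norm.toarray())
print("estimated clusters:", len(ap.cluster_centers_indices_))

# The homogeneity of 0.872 reported above would be computed against some
# reference labeling, e.g.:
# from sklearn.metrics import homogeneity_score
# homogeneity_score(reference_labels, labels)  # reference_labels: placeholder
```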

4.5.2 Principal Component Analysis

We then used Principal Component Analysis (PCA), again on the normalized matrix, in order to reduce its dimensionality. PCA puts emphasis on the differences, or variations, between data points, and is used to reveal patterns within the dataset [Wikipedia 2017c]. PCA is frequently used in conjunction with KMeans, as any attempt to cluster high-dimensional data directly would be ineffective.

For PCA, we used 200 components, which together explained 0.73 of the variance.
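In scikit-learn terms, this step might look as follows (a sketch; X_norm is the normalized matrix from the sketches above):

```python
from sklearn.decomposition import PCA

# Reduce the normalized matrix to 200 components; scikit-learn's PCA
# requires a dense array. The component count matches the setup above.
pca = PCA(n_components=200)
X_reduced = pca.fit_transform(X_norm.toarray())

# Proportion of the total variance retained (reported as 0.73 in the text).
print(pca.explained_variance_ratio_.sum())
```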

4.5.3 KMeans Clustering

KMeans clustering was applied to the resulting transformation, using K=3. KMeans separates the dataset, in an unsupervised manner, into K clusters, based on the distance between points in a P-dimensional space. Since the number of clusters must be pre-assigned, KMeans has a high rate of error if used alone. However, when used with other algorithms, like affinity propagation and PCA, this error is reduced substantially [Wikipedia 2017b].

Figure 5: Clustering Based on Hashtags and Mentions

KMeans works by redefining the mean of each cluster until convergence, that is, until there is very little change. Each point is assigned to the cluster whose mean it is closest to. In the case of this dataset, then, each user was assigned to a cluster with other users, based on their distance in a high-dimensional space where each dimension is the usage of a hashtag.
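A sketch of this step, assuming scikit-learn and the PCA output from the previous section (X_reduced is the hypothetical variable name carried over from that sketch):

```python
from sklearn.cluster import KMeans

# Cluster users in the 200-dimensional PCA space, with K fixed at 3
# (the number of centers suggested by affinity propagation).
kmeans = KMeans(n_clusters=3, random_state=0)
cluster_labels = kmeans.fit_predict(X_reduced)
```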

4.6 Visualizing the Hashtag / Mention Clusters with T-SNE

T-SNE, or t-distributed stochastic neighbor embedding, was applied for the visualization, with 2 components. T-SNE is used to further reduce the dimensionality of big datasets, in order to easily visualize the results. Dimensionality is reduced based on the probability of similarities between datapoints [Wikipedia 2017d]. The KL divergence after 100 iterations with early exaggeration was 0.515. The result was plotted in a scatterplot, shown in Figure 5. The colors are derived from the three KMeans clusters. As is visible from the graph, the clusters are fairly clearly separated, meaning that there is some differentiation between the users within each of the clusters.
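The visualization could be reproduced roughly as follows (a sketch assuming scikit-learn and matplotlib; default parameters stand in for whatever settings were actually used, and X_reduced and cluster_labels come from the sketches above):

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Embed the PCA-reduced user vectors in 2 dimensions for plotting.
tsne = TSNE(n_components=2, random_state=0)
X_2d = tsne.fit_transform(X_reduced)

# Color each user by their KMeans cluster, as in Figure 5.
plt.scatter(X_2d[:, 0], X_2d[:, 1], c=cluster_labels, s=5, cmap="viridis")
plt.title("Hashtag / mention clusters (T-SNE projection)")
plt.show()

# The final KL divergence of the embedding is available after fitting:
print(tsne.kl_divergence_)
```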

The top hashtags for each of these clusters can be found in Table 2, showing that there is a wide variety of hashtags and user mentions for each of the clusters. Though certain topics are prevalent, there are no defined bubbles of primarily conservatives or primarily liberals. Indeed, the third cluster appears to be unrelated to politics at all.

4.7 Checking the Bias Measure

It is evident from Table 2 that the clusters are less homogeneous than we initially expected. Therefore, we decided to try to create groupings from the data based on users' URL usage, for the url-users.

A sparse matrix was again created in the same manner described above, but using the URLs listed by PEW rather than hashtags. For each user, a vector was created, with a value of 0 for each URL they did not use, and a count for each of the URLs they did use. Similarly to the matrix described above, this was then multiplied by the trust measure depicted in Figure 1 and described in Section 4.4, where news sources were valued from 0.8 to -0.8, based on how much conservatives or liberals trusted each source.

Table 2: Top Hashtags and Mentions for Each Cluster

Cluster 1 (48%): FoxFriendsFirst, elsolarverde, HillaryClinton, Moodeey3, Wisconsin, SheriffClarke, KeithOlbermann, steinhauserNH1, MikeNellis, FoxNews, Khan, TrendReport, DrMartyFox, ScottWalker, laurencristmann, ananavarro, realDenaldTrump, LiarHillary, DiamondandSilk

Cluster 2 (10%): TheTrumpPuppet, FoxFriendsFirst, MuchWowShibes, ananavarro, NewDay, sudoues, ME, India, FoxNews, HallieJackson, MyManJimmyJack, angelinthepine, missimpolitic, lame, Congress, seanhmann, spweber54, ABCNews24, Freight, Pontifex ar

Cluster 3 (40%): TrumpStupid, ChatRevolve, Oooooo Donna, LESM, twitt3rpoll, SenateGOP, Takfiri, Ladycashmere, orlandosentinel, SebGorka, msnbc, bipartisan, LAPSEU, Beamaxed, HouseGOP, TriangleSecrets, TextBook, taliban, TPP

Each user in the matrix was scored based on this. Because of discrepancies between different sources, and because even PEW allows that their 'gently liberal' category of sources is really quite neutral, users that scored between 0 and 0.3 received a bias of 'Neutral'. Users that scored above 0.3 received a bias of 'Liberal', while users that scored below 0 received a bias of 'Conservative'.
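These thresholds translate directly into a small classification function (a sketch; the behavior at the exact boundary values is not specified above, so the comparisons below are one reasonable reading):

```python
def bias_label(score):
    """Map a user's PEW-weighted URL score to a bias label."""
    if score > 0.3:
        return "Liberal"
    if score < 0:
        return "Conservative"
    return "Neutral"  # scores between 0 and 0.3

assert bias_label(0.5) == "Liberal"
assert bias_label(-0.4) == "Conservative"
assert bias_label(0.1) == "Neutral"
```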

For each of these lists, the top URLs, the top mentions and the top hashtags were determined. The most frequently used are included in Table 3.

Table 3: Top URLs and Hashtags in User Groups by URL Bias

Liberal URL Users. Top URLs: twitter, hillaryclinton, rsbn. Top Hashtags: Dump Trump, Donald Trump.

Conservative URL Users. Top URLs: hs, twitter, youtube. Top Hashtags: Make America Great Again, Republicans For Hillary.

Neutral URL Users. Top URLs: twitter, hillaryclinton, bit. Top Hashtags: release your taxes, Ban The Burqa.

Because only the users who had tweeted URLs were now classified, there remained a large number of non-url users who were as yet unclassified. These non-url users were then analyzed to find their hashtag usage and mention usage. These results were compared with the hashtags and mentions from the url-users, in order to determine the political leanings of the non-url users (assuming that users of a similar political bend would use the same hashtags and mentions). The hashtags and mentions were split into two separate lists, with potential biases for each non-url user. These potential biases were then compared, to see if any of the potential classifications matched.

4.8 Results

In the entire dataset, only 30% of users had used a URL in their tweets. The bias measures were run on this sample. After applying the bias measure described in Section 4.7, 9.4% of users had a conservative bias, with a score of -0.1 or less; 23.1% of users had a liberal bias, with a score of 0.3 or greater; and the remaining 67.5% had a neutral bias, with scores between 0 and 0.29.

Out of the same sample, 70% of users were identified as non-url users. In an attempt to assign potential bias to these non-url users, each user's mentions were analyzed. If a non-url user had mostly conservative mentions, they were placed in a 'potentially conservative' group, and so on for both liberal and neutral mentions.
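This mention-based assignment can be sketched as a majority vote (illustrative; the mention-to-bias lookup is a hypothetical structure standing in for the groups derived from the url-users):

```python
from collections import Counter

def potential_bias(mentions, mention_bias):
    """Assign a non-url user the bias that dominates their mentions.

    mentions: the accounts a user mentioned; mention_bias: a lookup from
    mention to the bias group it was associated with among url-users.
    """
    counts = Counter(mention_bias[m] for m in mentions if m in mention_bias)
    if not counts:
        return None  # no overlap with the classified mentions
    return counts.most_common(1)[0][0]

# Hypothetical lookup and usage example.
mention_bias = {"@FoxNews": "Conservative", "@HillaryClinton": "Liberal"}
print(potential_bias(["@FoxNews", "@FoxNews", "@HillaryClinton"], mention_bias))
```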

To check these potential biases, each undefined user's hashtags were compared with the hashtag lists from the defined users. A user was classified based on the group their hashtags had the most in common with.

These two potential bias groups were compared, to see if any of the users were classified the same way based on both hashtags and mentions. In total, less than 1% of the entire dataset received the same classification from both.

4.9 Hashtag Bubbles

Finally, we examined hashtag usage alone, to see if it was a better determiner of user bias. We took a small sample of 4000 users to test this theory on. Within this dataset, political leaning was identified using a manually coded list of the top 10 hashtags for each political group (conservatives and liberals). In total, 893 users were identified as conservative, 570 users as liberal and 2547 users as neutral, i.e. as not having tweeted any of the top biased hashtags.

A list of the top 100 most common hashtags in this dataset was created, and each individual tag was manually coded as either 'liberal', 'conservative' or 'neutral', based on searches of the hashtags and corresponding explanations.

Users were then classified based on their usage of these hashtags.
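A sketch of this hashtag-based classification (the coded hashtag entries shown are illustrative, not the full manually coded list of 100 tags):

```python
# Manually coded top hashtags (illustrative entries only).
HASHTAG_BIAS = {"maga": "conservative", "trumppence16": "conservative",
                "imwithher": "liberal", "dumptrump": "liberal",
                "election2016": "neutral"}

def classify_by_hashtags(hashtags):
    """Label a user by majority vote over their coded hashtags; users whose
    tags never appear in the coded list stay neutral."""
    votes = [HASHTAG_BIAS[h.lower()] for h in hashtags
             if h.lower() in HASHTAG_BIAS]
    if not votes:
        return "neutral"
    return max(set(votes), key=votes.count)

print(classify_by_hashtags(["MAGA", "TrumpPence16", "Election2016"]))
```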

These new groups were then compared to another subset of unclassified users. The unclassified users' mentions were again examined, and the unclassified users were then reclassified accordingly (based on the same methods described above).

For each of these reclassified users, the top URLs and top hashtags were checked, again manually, to see if the PEW source biases could be found in the appropriate sources, and to examine whether known conservative or liberal hashtags would be found in the top hashtags.

Within each of these groups, the bias was verified in two ways. First, the top URLs from each group were checked and, though there was certainly some significant overlap, the majority of the groups were tweeting URLs along political lines.

In addition to checking the URLs, the top hashtags for each of these groups were also identified, with results that were entirely unsurprising. The top conservative hashtags included 'Trump' and 'MAGA', while the top liberal hashtags consisted of 'ImWithHer' and 'Hillary'. Top neutral tags included both major candidates' names, as well as some of the major events that took place.

These results can be found in Table 4.

5. CONCLUSION

Based on this research, it appears that there is more of a bipartisan conversation occurring on Twitter than popular culture would have us believe.

Though it may indeed be possible to identify a user's opinions based on their hashtag usage (which is much more closely related to content), it is nearly impossible to identify users' political bias based on their URL usage and their mentions on Twitter.

This, along with the relatively low percentage of users who even tweet URLs, is consistent with gatekeeping theory: users are using Twitter with more agency than other platforms, behaving as their own curators rather than relying on news sources.

From the user mentions, it is clear that there is more of a global conversation occurring on Twitter than the popular media would like to admit, though the nature of this conversation, whether it is positive or negative, constructive or critical, must be examined further.

Table 4: URLs & Hashtags in Hashtag Bubbles

Liberal (523 users). Hashtags: Hillary Clinton, Never Trump, real Donald Trump, Republicans For Hillary, Family, Nadel Paris, obplan, Iowa For Hillary, Iowa, Iowa State Fair. URLs: hillary clinton, red flag news, the guardian, freak out nation, washington post.

Conservative (486 users). Hashtags: real Donald Trump, Hillary Clinton, CNN, Trump, Never Trump, el solar verde, MAGA, Trump Pence 16, Republicans For Hillary, Iowa. URLs: hillary clinton, donald j trump, washington post, info wars, truth feed.

Neutral (4679 users). Hashtags: Trump, Hillary Clinton, MAGA, Trump Pence 16, Never Trump, Republicans For Hillary, Iowa, family, Nadel Paris, obplan. URLs: hillary clinton, amp, moby, google, the guardian, politico.

6. DISCUSSION AND FUTURE WORK

Though the research described here reaches potential conclusions, it is possible to extend the findings by looking at other aspects of the metadata Twitter provides.

In this paper, the metadata used consisted only of the hashtags, mentions or retweets, and the URLs embedded within each tweet. Of those URLs, the ones linking to social media were not fully examined. By looking at each of these links in more depth (rather than just the links to the root URLs), researchers could find more specifics regarding the types of information that users are sharing among themselves.

Further, it would be prudent to examine the text, the tweets themselves, to try to provide more depth and background to what is being examined. Any future work should unequivocally include a more in-depth textual analysis of both the users' tweets and the users' profile descriptions. One of the ways in which a user can be determined to be conservative or liberal is according to the content of the tweets themselves: whether they are for or against certain policies, public figures, or trends can be a very strong indication of their political leanings.

Another aspect that future research should look into is whether or not there are geographical ties to certain users and opinions. A long-held assumption in the United States is that certain areas (for instance, Massachusetts) are more liberal-leaning, while others (Alabama) are more conservative. It would be interesting to see the rate at which people are tweeting from different locations according to certain biases. Further, this would add fuel to the cities-versus-countryside debates, allowing for an analysis of the locations themselves.

Finally, future work should include a timeline of events. It would be interesting to show, rather than a single day, the rate of biases over the course of weeks or months, given certain events. For instance, using a text-based analysis, future researchers could attempt to identify more central users as either liberal-leaning or conservative-leaning, and plot these against different events over a period of time to determine if certain events are catalysts for users to change their opinions and beliefs.

7. REFERENCES

[Allcott & Gentzkow 2017] Allcott, H. & Gentzkow, M. (2017). Social media and fake news in the 2016 election. National Bureau of Economic Research.

[Aslam 2017] Aslam, S. (2017). Twitter by the numbers: Stats, demographics, and fun facts. https://www.omnicoreagency.com/twitter-statistics/

[Baer 2016] Baer, D. (2016). The 'filter bubble' explains why Trump won and you didn't see it coming. http://nymag.com/scienceofus/2016/11/how-facebook-and-the-filter-bubble-pushed-trump-to-victory.html

[Bakshy et al. 2015] Bakshy, E., Messing, S. & Adamic, L.A. (2015). Exposure to ideologically diverse news and opinion on Facebook. Science, 348(6239), 1130-1132.

[Barberá et al. 2015] Barberá, P., Jost, J.T., Nagler, J., Tucker, J.A. & Bonneau, R. (2015). Tweeting from left to right: Is online political communication more than an echo chamber? Psychological Science, 26(10), 1531-1542.

[Bhattacharya & Ram 2012] Bhattacharya, D. & Ram, S. (2012). Sharing news articles using 140 characters: A diffusion analysis on Twitter. In Advances in Social Networks Analysis and Mining (ASONAM), 2012 IEEE/ACM International Conference on (pp. 966-971).

[Conover et al. 2011] Conover, M., Ratkiewicz, J., Francisco, M.R., Gonçalves, B., Menczer, F. & Flammini, A. (2011). Political polarization on Twitter. ICWSM, 133, 89-96.

[Conover et al. 2012] Conover, M.D., Gonçalves, B., Flammini, A. & Menczer, F. (2012). Partisan asymmetries in online political activity. EPJ Data Science, 1(1), 6.

[Darwish et al. 2017] Darwish, K., Magdy, W. & Zanouda, T. (2017). Trump vs. Hillary: What went viral during the 2016 US presidential election. arXiv preprint arXiv:1707.03375.

[DiFranzo & Gloria-Garcia 2017] DiFranzo, D. & Gloria-Garcia, K. (2017). Filter bubbles and fake news. XRDS: Crossroads, The ACM Magazine for Students, 23(3), 32-35.

[El-Bermway 2016] El-Bermway, M.M. (2016). Your filter bubble is destroying democracy. https://www.wired.com/2016/11/filter-bubble-destroying-democracy/

[Entman 1993] Entman, R.M. (1993). Framing: Toward clarification of a fractured paradigm. Journal of Communication, 43(4), 51-58.

[Flaxman et al. 2016] Flaxman, S., Goel, S. & Rao, J.M. (2016). Filter bubbles, echo chambers, and online news consumption. Public Opinion Quarterly, 80(S1), 298-320.

[Groshek & Tandoc 2017] Groshek, J. & Tandoc, E. (2017). The affordance effect: Gatekeeping and (non)reciprocal journalism on Twitter. Computers in Human Behavior, 66, 201-210.

[Hamling & Agrawal] Hamling, T. & Agrawal, A. (n.d.). Sentiment analysis of tweets to gain insights into the 2016 US election.

[Hosanagar et al. 2013] Hosanagar, K., Fleder, D., Lee, D. & Buja, A. (2013). Will the global village fracture into tribes? Recommender systems and their effects on consumer fragmentation. Management Science, 60(4), 805-823.

[Isaac & Ember 2016] Isaac, M. & Ember, S. (2016). For Election Day influence, Twitter ruled social media. https://www.nytimes.com/2016/11/09/technology/for-election-day-chatter-twitter-ruled-social-media.html

[Jungherr 2016] Jungherr, A. (2016). Twitter use in election campaigns: A systematic literature review. Journal of Information Technology & Politics, 13(1), 72-91.

[Kong et al. 2014] Kong, S., Mei, Q., Feng, L., Ye, F. & Zhao, Z. (2014). Predicting bursts and popularity of hashtags in real-time. In Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval (pp. 927-930).

[Kruikemeier et al. 2014] Kruikemeier, S., van Noort, G., Vliegenthart, R. & de Vreese, C.H. (2014). Unraveling the effects of active and passive forms of political Internet use: Does it affect citizens' political involvement? New Media & Society, 16(6), 903-920.

[Mitchell & Matsa 2014] Mitchell, A., Gottfried, J., Kiley, J. & Matsa, K.E. (2014). Political polarization and media habits. Pew Research Center.

[Morgan et al. 2013] Morgan, J.S., Lampe, C. & Shafiq, M.Z. (2013). Is news sharing on Twitter ideologically biased? In Proceedings of the 2013 Conference on Computer Supported Cooperative Work (pp. 887-896).

[Nelson 2017] Nelson, J. (2017). Is 'fake news' a fake problem? https://www.cjr.org/analysis/fake-news-facebook-audience-drudge-breitbart-study.php

[Neubaum & Krämer 2017] Neubaum, G. & Krämer, N.C. (2017). Opinion climates in social media: Blending mass and interpersonal communication. Human Communication Research.

[Newton 2016] Newton, C. (2016). Zuckerberg: The idea that fake news on Facebook influenced the election is 'crazy'. https://www.theverge.com/2016/11/10/13594558/mark-zuckerberg-election-fake-news-trump

[Oxford English Dictionary Online 2016] Post-truth. (2016). In Oxford English Dictionary Online. Oxford University Press. https://en.oxforddictionaries.com/word-of-the-year/word-of-the-year-2016

[Parkinson 2016] Parkinson, H.J. (2016). Click and elect: How fake news helped Donald Trump win a real election. https://www.theguardian.com/commentisfree/2016/nov/14/fake-news-donald-trump-election-alt-right-social-media-tech-companies

[Puschmann et al. 2016] Puschmann, C., Ausserhofer, J., Maan, N. & Hametner, M. (2016). Information laundering and counter-publics: The news sources of islamophobic groups on Twitter. In Tenth International AAAI Conference on Web and Social Media.

[Silverman 2016] Silverman, C. (2016). This analysis shows how viral fake election news stories outperformed real news on Facebook. https://www.buzzfeed.com/craigsilverman/viral-fake-election-news-outperformed-real-news-on-facebook

[Trilling 2015] Trilling, D. (2015). Two different debates? Investigating the relationship between a political debate on TV and simultaneous comments on Twitter. Social Science Computer Review, 33(3), 259-276.

[Vergeer & Franses 2016] Vergeer, M. & Franses, P.H. (2016). Live audience responses to live televised election debates: Time series analysis of issue salience and party salience on audience behavior. Information, Communication & Society, 19(10), 1390-1410.

[Weller et al. 2014] Weller, K., Bruns, A., Burgess, J., Mahrt, M. & Puschmann, C. (2014). Twitter and Society (Vol. 89). P. Lang.

[Wikipedia 2017a] Wikipedia. (2017). Affinity propagation. https://en.wikipedia.org/w/index.php?title=Affinity_propagation&oldid=794220915 [Online; accessed 24 August 2017]

[Wikipedia 2017b] Wikipedia. (2017). K-means clustering. https://en.wikipedia.org/w/index.php?title=K-means_clustering&oldid=794828195 [Online; accessed 24 August 2017]

[Wikipedia 2017c] Wikipedia. (2017). Principal component analysis. https://en.wikipedia.org/w/index.php?title=Principal_component_analysis&oldid=795776902 [Online; accessed 24 August 2017]

[Wikipedia 2017d] Wikipedia. (2017). t-distributed stochastic neighbor embedding. https://en.wikipedia.org/w/index.php?title=T-distributed_stochastic_neighbor_embedding&oldid= [Online; accessed 24 August 2017]
