Dissemination of political information on Twitter : an empirical social network analysis


Academic year: 2021


An empirical social network analysis

Petro Tolochko (10391134)

University of Amsterdam

Master Thesis

Graduate School of Communication

Research Master’s programme Communication Science

Dr. D.C. Trilling

January 22, 2015



Abstract

Access to the Internet and social networking services allows people to generate and disseminate information on an unprecedented scale, regardless of geographical and temporal differences. However, patterns of information dissemination through social networking services have not been a major research focus in communication science. The present study proposes to view these services as social networks, and to treat the information flow within them as if it were propagating through a social network. Based on existing social network theory and previous communication science research related to social networking services, it is expected that individual characteristics of social network users (e.g., number of followers, number of friends, number of messages) influence the way information propagates. The study primarily utilises social network analysis and automated content analysis methods to determine which of the aforementioned characteristics determine the position of the individual in the network and facilitate the flow of information.

Acknowledgement: This work was carried out on the Dutch national e-infrastructure with the support of SURF Foundation.



Introduction

The widespread penetration of the Internet in general, and the use of social networking services in particular, has created a new way of generating, disseminating and consuming information. People enjoy the ability to access news (often right after the newsworthy content has appeared) and to share information with one another with the click of a button and on an unprecedented scale. However, not only is the mode of information consumption changing rapidly; the amount of data being generated is also growing exponentially. For example, it is estimated that 90% of all information in human history was generated after the year 2010 (e.g., Dragland, 2013), and that 37 percent of all internet users have participated in the creation or dissemination of information online in one way or another (Purcell, Rainie, Mitchell, Rosenstiel, & Olmstead, 2010).

The availability of immense quantities of data provides researchers from various scientific fields with an opportunity to test existing theories developed to explain social reality, as well as to come up with new theories pertaining to social phenomena online. Since the core idea of the internet is to facilitate and ultimately change the way people communicate, it is no surprise that, with the advance of online social networking technologies, research on human communication enjoys wide attention from researchers. Social Networking Services (further SNSs) such as Facebook and Twitter are arguably the biggest aggregators of such data. Social scientists immediately recognised the huge scientific potential of the data that SNSs generate, and have been interested in it since the early days of SNSs – examples range from the use of online social networking services and its relation to social capital (Ellison, Steinfield, & Lampe, 2007) to the identification of potential suicidal behaviour online (Huang, Goh, & Liew, 2007).

Social networking services, as the name aptly implies, are effectively social networks in the sociological sense of the term, designed to promote and facilitate the flow of information between individuals, social groups, corporations, etc. Therefore, it seems only appropriate to apply social network analysis techniques to investigate the social phenomena related to social networking services. The analytical interest of the present paper is the spread of political information through social networks, i.e., “How does information spread through online social networks?”, “Who is responsible for the information spread?”, etc. The news about the crash of Malaysia Airlines Flight 17 (MH17) is taken as the case of analysis, since it both has shock value (i.e., has a higher probability of being disseminated through a network) and has very strong political connotations, therefore making it a prime example of political information spread. Twitter was chosen as the social network on which the analyses would be performed for two reasons: first, it is one of the most popular SNSs today, with an active user base approaching 300 million (twitter.com, 2014), and second, Twitter’s application programming interface gives social researchers easy access to a relatively large amount of Twitter’s data.

Thus, this paper’s aim is twofold. First, a large emphasis is placed on the descriptive analysis of the Twitter communication network in the case of the MH17 incident, i.e., describing how the network structure is organised and how this structure influences the flow of political information within the network. This will provide valuable insights into how the flow of information is structured, what the instrumental characteristics of the information propagators are, etc. The second aim of this paper is to test Mark Granovetter’s tie strength theory (e.g., Granovetter, 1973) and its applicability to online networks. While there have been some attempts to draw parallels between tie strengths in online and offline social networks, the emphasis was mainly on the amassing of social capital (e.g., Grabowicz, Ramasco, Moro, Pujol, & Eguiluz, 2012), leaving the question of tie strength and information diffusion open. This paper will attempt to provide clarity on the role of weak and strong ties in the process of


Theoretical framework

The present study is concerned with social networking services and social networks. The similarity of these terms may cause confusion. Social networking services such as Twitter and Facebook are online services that allow people to communicate via the Internet, and will further be referred to as Social Networking Services (SNSs). Social networks, on the other hand, are social structures in which individuals are connected by social ties such as friendships, discourses, etc. While SNSs could be represented as social networks (and indeed this paper will argue for such a representation), it is nevertheless important to keep this distinction in mind when reading this section.

The first part of this section will concern itself with information and news dissemination on social media and social networking services in general; the second part will focus on how information flows are propagated through social networks and on the research pertaining to this area. Lastly, the third part will try to approach information dissemination through social networking services from a social network point of view, specifically focusing on Twitter.

Social Networking Services and information propagation

The way in which information propagates through human society is not a particularly new topic for the social sciences. One of the first and arguably most famous attempts to scientifically explain the human communication process was Lasswell’s communication model. Lasswell proposed to view communication as a linear process with five subsequent steps of analysis: Who sent the message, What was meant by the message, Through which channel the message was sent, Who was the recipient of the message, and What was the effect of the message (Lasswell, 1948). While the basic principles of this traditional model hold true, the technological development of media no longer allows for assumptions of linearity or verticality (the top-down approach) of communication. Media consumers no longer simply “consume” information, but rather actively participate in the process of its generation (citizen journalism) and dissemination (e.g., sharing information on SNSs) (Shao, 2009).

Even though the effects and patterns of information diffusion on social networking services are attracting wide interest from social scientists, online communication is a relatively new phenomenon, and there is still a large theory gap in this area. Much of the research on news dissemination through social networking services uses “sharing” as the operationalisation of information transmission through SNSs (the present study will use the same operationalisation). Sharing refers to an activity in which users of an SNS include a hyperlink to some already existing piece of information on their personal account. This may be manifested in different ways: it may simply mean the manual insertion of a link to, e.g., a news article, and most of the time media outlets such as online newspapers include a sharing functionality so that readers can post a link to an article on an SNS of their choice. On Twitter, for example, the sharing functionality is implemented by “retweeting” someone else’s tweet so that it is shown on one’s own feed.

Various factors behind the success of information spread on SNSs have been reported: the source of the message has a significant influence on the popularity of the message (Romero, Tan, & Ugander, 2011), as do the frequency of the posts and the novelty of the information (Morales, Borondo, Losada, & Benito, 2014). Messages that induce positive feelings are the ones most likely to be retweeted (Bakshy, Hofman, Mason, & Watts, 2011). Prior experience with social media also significantly increases the odds that a person will share the information (e.g., Lee & Ma, 2012).

This news sharing behaviour has been largely studied from the point of view of users’ motivations and intentions. Uses and gratifications theory, which aims to explain what factors motivate people to choose between various media and channels, became a popular approach to analysing online behaviour in general and news sharing on SNSs in particular (e.g., Ruggiero, 2000; Diddi & LaRose, 2006). For example, it has been shown that socialisation and information seeking behaviour are prime drivers behind the sharing process (Weeks & Holbert, 2013).

Consumers became important decision makers when it comes to what information is disseminated, effectively making them a second filter (after the editors) in news gatekeeping (e.g., Singer, 2014). Selective exposure theory is often applied to information sharing behaviour online. Selective exposure to information is the idea that people prefer to consume and disseminate information based on their pre-existing views on the issue rather than the quality of the information itself (e.g., Kastenmüller, Greitemeyer, Jonas, Fischer, Frey, & Dieter, 2010). However, there is no clear consensus: while some claim that pre-existing ideological preferences affect media consumption patterns on SNSs (e.g., Stroud, 2008), many studies argue that the availability of content and the ease of switching between various outlets negate the effect of selective exposure, producing a more homogeneous pattern of news consumption and dissemination (Horrigan, Garrett, & Resnick, 2004; Rainie & Smith, 2012).

The above paragraphs focused mainly on the psychological aspect of information sharing online, and through SNSs in particular. While psychological theories of news dissemination are undeniably very important, they describe only half of the issue at hand. They concern themselves with what one may call the “functional” factors of information dissemination, i.e., what motivates people to share, how they perceive information diffusion, etc. To investigate the “structural” factors of this phenomenon it is useful to refer to the sociological research done in this area, more specifically, the theories relating to social networks and the information diffusion in them. The next two parts of this section will do exactly that: describe how information is transmitted through a social network, and attempt to combine the theory on social networks with the research on SNSs.


Social networks and information propagation

Social scientists quickly recognised that social reality is more than simply the sum of individual characteristics, but rather a synergetic relationship between them, and therefore could be very well represented as a network structure (e.g., Durkheim, 1894). A social network is an interconnected web of nodes (individuals, organisations, states, etc.) that are joined by ties [1] (e.g., friendships in the case of individuals, treaties in the case of states, etc.). A graphic representation of a social network – a sociogram – includes nodes and the ties between them. Social network analysis draws on the ideas of graph theory for a mathematical representation of society, and enjoys a wide range of applications in the social sciences (as well as the natural sciences and humanities), from determining the causes and the rate of epidemic spread (Klovdahl, 1985) to the importance of marriages of convenience in Renaissance Italy (Padgett & Ansell, 1993).

Social networks and social network analysis play an incredibly important role in the research on information diffusion. Whether a piece of news or an idea dies out quickly or deeply penetrates various social circles is fundamentally tied to the structure of a social network. Diffusion of information such as gossip, rumours, or pieces of news has long been known to be influenced by the social network structure (e.g., Kerckhoff & Back, 1965; Friedkin, 1982). The actual spread of information within a social network depends mainly on two criteria: the presence of a “connection” between people (corporations, countries, etc.), and the “communicative event” (verbal communication, letter, etc.) (e.g., Kossinets, Kleinberg, & Watts, 2008). The first criterion is necessary but insufficient for information flow, while the second cannot exist without the first. In a study examining how information spreads within an email network, it was discovered that most communication networks have a dense core with relatively short distances between the nodes and a high frequency of “communicative events”, where information spreads very fast, and a network periphery, which information has a low probability of reaching (Holme, 2005). The same study also concludes that the dynamics of the information flow are impeded by a very skewed contacts per edge distribution. A different study investigating human communication networks, namely a database of phone logs, found that networks maintain the information flow after the removal of important nodes; however, the flow is disturbed when “hubs” – densely connected subnetworks – are removed (Onnela et al., 2007).

[1] A “tie” in the social network analysis context refers to any type of a relationship between two nodes when

Among the most important structural factors of information flow through a network are network ties. The ties in a social network can be either strong or weak: strong ties may be represented by, e.g., close friendships, and weak ties by, e.g., acquaintances or coworkers. Intuitively, strong ties were thought to contribute to the information flow in a network much more significantly than weak ties, and a network of people connected with weak ties was thought to have a lower density than a network of people connected with strong ties. However, one of the most influential works on social networks in general and information flows in particular – Mark Granovetter’s “The Strength of Weak Ties” (Granovetter, 1973) – showed that the reverse is true. Granovetter empirically showed that weak ties play a significant role in the information diffusion process since they “bridge” various communities. While strong ties connect like-minded people with the same interests and beliefs, they rarely go beyond the social circle, in effect creating a very homogeneous network, making it difficult for new information to enter, and consigning the network to “provincial” information (Granovetter, 1983). Weak ties, conversely, connect less similar individuals, linking various social circles together and therefore allowing for a more unconstrained flow of information through the network.

For offline social networks, the idea that weak ties provide a significant contribution to information flows was subsequently tested extensively with empirical data. It was shown, for example, that while strong ties contribute to better information flow within a social subsystem, weak ties contribute to a much larger degree to the information flow outside of the social subsystem (Friedkin, 1982). Another important discovery was that the information flow in a social network depends not only on the type of channel (strong or weak tie), but also on the relative number of channels; since weak ties are more numerous than strong ties and constitute a larger portion of the ties in the network, their sheer number is instrumental in disseminating information through a network (Friedkin, 1982; Granovetter, 1983). Ideas, as a form of information, are also more likely to be disseminated through weak ties. Interestingly enough, people with more weak ties are more likely not only to be propagators of ideas, but also to generate more and better quality ideas (Burt, 2004). In an analysis of a society-wide network of mobile phone logs, Onnela and colleagues (2007) have shown that the removal of strong ties from a real-life communication network does not significantly harm information flows through the network, whilst the removal of weak ties results in a collapse of the said network. A different study has shown that weak ties are more likely to lead to successful information searches than strong ties (Dodds, Muhamad, & Watts, 2003).

The next segment will discuss the research done on social networks and social network analysis in relation to online social networks.

SNS as a social network

The topic of information propagation through online social networks has begun to attract attention from academia. A considerable amount of research has been done on information diffusion within the blogosphere. An analysis of URLs, for example, was performed to see how certain (viral) videos travel through blogs (Adar & Adamic, 2005), or what topics and topic properties have the propensity to be retransmitted (Gruhl, Guha, Liben-Nowell, & Tomkins, 2004). Email networks have also proved to be a valuable source of data and have replicated, to some extent, information transfer in offline networks (Liben-Nowell & Kleinberg, 2008).


A prime example of information diffusion through online social networks is the spread of rumours. Rumours are of particular interest to researchers since they make it possible to track and model human communication networks. A considerable body of research has been carried out in this particular area, resulting in important findings about the nature of information diffusion online. A study investigating the spread of rumours in online social networks such as Twitter and Orkut found that rumours have a much higher rate of dissemination in these real-life online networks than in mathematically modelled ones (Doerr, Fouz, & Friedrich, 2012). Doerr and colleagues also found that nodes with small degree values, counterintuitively, facilitate information diffusion through the network, since they are among the fastest transmitters. By developing and investigating rumour propagation models, or “gossip algorithms”, researchers discovered that information travels in a social network not unlike an epidemic, with nodes “infecting” neighbouring nodes with information (Nekovee, Moreno, Bianconi, & Marsili, 2007), thus creating an information cascade.

The concept of information cascades is also central to the process of information flows through social networks. An information cascade is a phenomenon that occurs when a person makes a decision to engage in a behaviour (e.g., sharing of information in a network) based on the behaviour of observable others, even if such behaviour contradicts his/her pre-existing beliefs (Easley, 2010, p. 483). Before the emergence of internet technology, research on information cascades was limited by the available data; however, SNSs such as Twitter, which provide large-scale data on human communication, are making population-wide cascade research possible (e.g., Baños, Borge-Holthoefer, & Moreno, 2013). Studying the online social networks of the Digg and Twitter websites, Lerman and Ghosh (2010) came to the conclusion that the main mechanism of information propagation is very similar in these two networks: users create dense subnetworks where information spreads initially, and then it travels to the periphery. However, they also found that the network structure is instrumental in the way information travels. In networks with high density, information spreads incredibly quickly through the dense core of the network, but once the saturation of the core has been reached, the propagation slows down. In networks with lower density, the initial pace of information propagation is much lower, but the pace does not slow down after the saturation has been reached, and the information travels further into the network periphery. A different study looked into Twitter information cascades formed by the URL mentions of Twitter accounts (Galuba, Aberer, Chakraborty, Despotovic, & Kellerer, 2010). In this paper the researchers tracked 15 million URLs and how they spread through the Twitter network, finding that the user activity and the frequency of mentions follow a power-law distribution, as is often the case with real-life networks (Barabási & Albert, 1999), and communication networks in particular (Csanyi & Szendroi, 2004). Interestingly, they found that a power-law distribution also applies to information cascades: the main informational supercascade is composed of subcascades that follow power-law distributions in their number and size (Galuba et al., 2010).

One of the properties of information cascades in online social networks is that they are often shallow but wide (e.g., Leskovec, McGlohon, Faloutsos, Glance, & Hurst, 2007), meaning that information is very likely to be transmitted from the seed node to the neighbouring nodes, but every subsequent step has a diminished likelihood of happening. The depth of penetration of a cascade is often similar to the average path length of the network (Baños et al., 2013). The same study finds that the properties of the seed of the cascade are not the sole criteria for the success of the cascade; intermediate nodes also play an important part in the dissemination of information. For example, mathematical modelling revealed that nodes that do not occupy topologically important positions are instrumental in cascade success [2].

Research has also been carried out on the role of network ties in information propagation in online social networks. Applying and testing Granovetter’s tie strength theory in online networks is, however, more complicated. On SNSs, tie strengths do not always enjoy a qualitative distinction. For example, Facebook simply has a “friends” category; whether these contacts are in fact friends, or people whom one has met only once, is an almost unanswerable question for an outside researcher. A recent paper (De Meo, Ferrara, Fiumara, & Provetti, 2014), in fact, argues that on Facebook most ties are weak. Since it is impossible to directly translate Granovetter’s ideas about weak and strong ties to Facebook, tie strength has been re-operationalised in the aforementioned study: strong ties are the ones that connect a user to other users in a community, and weak ties are the ones that bridge communities. Nevertheless, the findings on the relationship between tie strength and information flow are beginning to be applied and tested in the context of online social networks, bridging the theoretical gap. A recent study of the strength of ties on Facebook, for example, relying on the existing theory, used 74 Facebook variables to predict the tie strength of its users (Gilbert & Karahalios, 2009). Interestingly enough, time from the last conversation and time from the first conversation proved to have the most predictive power. A different study on Facebook ties used an experimental design to establish how information is diffused through an online social network (Bakshy, Rosenn, Marlow, & Adamic, 2012). The results were somewhat in line with those for offline networks: strong ties are more likely to influence people to share information that they would probably not share otherwise; weak ties, however, are a strong predictor of novel information sharing through a network, and also have access to heterogeneous information, as opposed to strong ties. The authors’ conclusion is that information on Facebook is mostly driven by contagion (disseminated through weak ties).

[2] Nodes with out-degree distribution much lower than maximum play a very important role in the

In recent years, Twitter in particular has become something of a focus of analysis, since its relatively straightforward structure and application programming interface allow researchers to design testable models and provide them with a significant amount of data. Studies that focus on information diffusion on Twitter report that the vast majority of information hardly propagates through the network at all: it has been observed, for example, that ~71% of all messages never leave the user’s Twitter feed (Cheng, Evans, & Singh, 2009).


There have been several operationalisations of tie strength on Twitter. A paper by Grabowicz and colleagues (2012) considered Twitter @mentions as an indicator of tie strength: the more @mentions were exchanged between two users, the stronger the tie. The structure of the data used in the present study renders such an operationalisation impossible, since it focuses on information about retweets rather than @mentions. Researchers have also noted that the strength of a tie is dependent on the “cost” associated with its acquisition (high cost – strong tie, low cost – weak tie), and that operationalising tie strength in an online social network is problematic since all of the links are “low cost” and are therefore homogenised (e.g., Xiang, Neville, & Rogati, 2010). Twitter, however, does have a distinction in the “cost” associated with a tie. It distinguishes social ties in two categories: “Friends” and “Followers”. Twitter “Friends” are social ties that require a specific action from the user (selecting a “friend”, thus being a “high cost” connection), while “Followers” do not require the user to take any action (“followers” select the user they want to follow, thus being a “low cost” connection). This study proposes to operationalise strong and weak ties with these immediately available Twitter metrics. Strong ties in the Twitter network are operationalised as the people that the user is following (“friends” in Twitter nomenclature). Weak ties are operationalised as the followers of the user, since a follower, while still being a social link, does not require the user to perform any action. Although this operationalisation is based on Granovetter’s concept of strong and weak ties, it is not a direct translation of this concept. “Friends” and “Followers” could also be called active and passive ties, respectively, with passive ties corresponding to Granovetter’s weak ties, and active ties corresponding to strong ties.

The aim of the present study is two-fold. The first aim is to provide a topological and structural description of political information diffusion on Twitter (based on a dataset of retweets of information pertaining to the crash of Malaysia Airlines flight MH17), and to compare it to the existing findings on offline as well as online social networks. Furthermore, as one of the goals of this explorative research, the study aims to identify the most prominent generators and propagators of information in the network. Based on the nature of their occupation, it is expected that international news media will dominate the role of information generators.

The second aim is to investigate whether social network tie strength has an effect on information propagation in this network. Based on the previous research in this area (e.g., Friedkin, 1982), both strong and weak ties are expected to positively influence information flow through an online social network (H1 and H2); it is also expected (based on, e.g., Granovetter, 1983; Bakshy, Rosenn, Marlow, & Adamic, 2012) that weak ties will make a stronger contribution to information diffusion than strong ties (H3).

Thus, apart from its descriptive nature, this study poses the following research question and hypotheses:

RQ1: Who are the most prominent information generators and propagators; in what aspects are they different or similar?

H1: User’s position in a social network will be positively correlated with the number of followers he/she has;

H2: User’s position in a social network will be positively correlated with the number of other users he/she follows (“friends”);

H3: Number of followers (weak ties) will contribute more to the position in a social network than the number of friends (strong ties).

Method

To answer the research question and test the hypotheses, an analysis was performed on a large sample of retweets related to the Malaysia Airlines Flight 17 incident on July 17, 2014. This incident makes an appropriate case for studying the dissemination of political information since it is inherently political in nature.


The data gathering and the subsequent analysis were performed using scripts written in the R and Python programming languages. The NetworkX (Python) and igraph (R) packages were used for social network analysis.

Sample

The tweets were collected by querying the Twitter streaming API using DMI-TCAT (Borra & Rieder, 2014). The API was queried from June 17 to September 22. Only tweets containing the “mh17” keyword, or those that included “#mh17”, were collected. This resulted in a datafile containing ~4.7 × 10^6 tweets. The file contained the following information: Twitter id, Twitter username, time of the tweet, user’s follower count, user’s friend count, and the message of the tweet. The file also contained metadata (e.g., geographical location, timezone, etc.) that was not of interest for this particular study. The text of the tweet variable followed the same syntax throughout the file: [tweet message] in case it was the original tweet, or [RT @User [message]] in case of a retweet. Retweets were of particular interest since they are a prime example of information spread on Twitter. Messages that were retweeted comprised 44.44% of the datafile (~1.94 × 10^6). To test the tie strength hypothesis, nodes in prominent network positions (degree centrality) were identified, and their Twitter characteristics were collected. Effectively, this study is concerned with two networks: a network of retweets, where the retweets are ties and their strength is homogeneous, and the Twitter network itself (in which the retweet network is nested), where the tie strength is operationalised as “Friends” and “Followers”.
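The split between original tweets and retweets described above can be illustrated with a minimal Python sketch. It is not the script used in the study: the regular expression, the function names `is_retweet` and `retweet_share`, and the sample messages are all assumptions made for illustration, and the pattern simply treats any text beginning with `RT @username` as a retweet.

```python
import re

# Assumed marker: a retweet's text starts with "RT @username".
RETWEET_PREFIX = re.compile(r"^RT @\w+")


def is_retweet(text):
    """Return True if the tweet text begins with the 'RT @User' marker."""
    return RETWEET_PREFIX.match(text.strip()) is not None


def retweet_share(texts):
    """Return the fraction of texts that are retweets (0.0 for empty input)."""
    texts = list(texts)
    if not texts:
        return 0.0
    return sum(is_retweet(t) for t in texts) / len(texts)


# Hypothetical sample messages, not taken from the dataset.
sample = [
    "Breaking: #mh17 crash reported",
    "RT @newsdesk: Breaking: #mh17 crash reported",
    "RT @reader1: RT @newsdesk: Breaking: #mh17 crash reported",
    "Thoughts with the families #MH17",
]
share = retweet_share(sample)
```

Applied to the full datafile, a function of this kind would yield the proportion of retweeted messages reported above, provided the same definition of a retweet is used.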

Model

A Python script was created to first split the retweeted messages from the original ones and then to build a graph out of the retweets. As was noted above, a retweet follows the syntax [RT @User [message]]. For simplification, the user who retweeted the message will be called User 1, and the user whose message was retweeted will be called User 2 (numeration from left to right), thus: [User 1] [RT @User 2 [message]]. A directed graph was built with User 2 (the original sender of the message) as the source node, and with User 1 (the one who retweeted the original message) as the target node. However, there are cases where a retweeted message is itself a retweet of an even earlier message originated by User 3: [User 1] [RT @User 2 [RT @User 3 [message]]]. In this case (and in cases when even more users are present in the chain), the direction of the graph always goes from the oldest (original) user through everyone who retweeted the message, and ends at User 1 (the most recent retweet). A graphical representation is provided in Figure 1 below.
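The chain-to-arcs rule described in this paragraph can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the original script: the names `retweet_chain` and `chain_to_arcs` are hypothetical, and the parser assumes that nested retweets appear as consecutive `RT @username` prefixes (optionally followed by a colon) at the start of the text.

```python
import re

# Assumed form of one retweet prefix: "RT @username", optional colon, spaces.
RT_PREFIX = re.compile(r"RT @(\w+):?\s*")


def retweet_chain(retweeter, text):
    """Return users ordered from the original author to the latest retweeter."""
    users = []
    rest = text.strip()
    while True:
        match = RT_PREFIX.match(rest)
        if match is None:
            break
        users.append(match.group(1))
        rest = rest[match.end():]
    # Prefixes are listed newest first, so reverse them and append the
    # account that posted this tweet (User 1) as the final node.
    return list(reversed(users)) + [retweeter]


def chain_to_arcs(chain):
    """Turn an ordered chain of users into directed (source, target) arcs."""
    return list(zip(chain, chain[1:]))


chain = retweet_chain("user1", "RT @user2: RT @user3: message")
arcs = chain_to_arcs(chain)
```

For the three-user case, the chain runs User 3, User 2, User 1, which gives one arc from User 3 to User 2 and one from User 2 to User 1. An original tweet without any `RT` prefix yields a single-user chain and therefore no arcs.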

The direction of the graph – from source messages through intermediate nodes, and finally to the last person who retweeted the message – makes theoretical sense, since it mimics the flow of information through the network (e.g., Morales et al., 2014). The resulting network of retweets consisted of 1078515 nodes (users), and 1679783 arcs (retweeted messages between users).³

Figure 1. A graphical representation of a retweet cascade. This figure shows a single retweet with two users (left), and two retweets with three users (right).

Variables

Dependent variable. After the graph was constructed, social network analysis was performed to determine network centrality. Centrality is used in social network analysis to identify the most important nodes in a network (Newman, 2010), which are the information propagators in this particular case. Degree centrality, that is, the number of ties (degrees) a node has to other nodes, was used as the centrality measure. However, since the network in question is directed, i.e., there is a strict theoretical distinction between those who produce and those who propagate information, there are two centrality measures: out-degree centrality and in-degree centrality. Out-degree centrality is the number of ties directed away from the node, and in-degree centrality is the number of ties directed towards the node. In this case, out-degree centrality identifies the most prominent Twitter accounts that produced information that was later transmitted, and in-degree centrality identifies the most prominent accounts that actually transmitted (retweeted) this information.
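As an illustration of the two measures, the following sketch counts in- and out-degrees from a list of (source, target) arcs. The account names and the function name are invented for the example; it is not the study's own code.

```python
from collections import Counter


def degree_counts(arcs):
    """Count ties leaving each node (out-degree) and arriving at it (in-degree)."""
    out_degree = Counter()
    in_degree = Counter()
    for source, target in arcs:
        out_degree[source] += 1  # source was retweeted by someone
        in_degree[target] += 1   # target retweeted someone
    return in_degree, out_degree


# Toy network: "news" is retweeted three times, "alice" retweets two accounts.
toy_arcs = [("news", "alice"), ("news", "bob"), ("news", "carol"), ("blog", "alice")]
in_degree, out_degree = degree_counts(toy_arcs)
```

In this toy network "news" behaves like an originator (out-degree 3, in-degree 0), while "alice" behaves like a propagator (in-degree 2, out-degree 0).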

Independent variables. Twitter user metrics such as number of followers, number of friends and number of tweets were used as independent variables. To obtain these metrics, a Python script was written that queried the Application Programming Interface and downloaded the data.

However, the data collection was not performed on every user in the network: the Twitter API has a restriction of 180 calls per 15 minutes, and collecting the data from the full sample with these restrictions would have taken approximately 113 days. Therefore, it was decided to collect the data only for the nodes that had 10 degrees or more. From a theoretical standpoint this data reduction should not affect the analysis, since the focus is on the most prominent actors in the network, and by excluding the nodes with fewer than 10 degrees the analysis is performed on the most important nodes. This resulted in two samples, in-degrees and out-degrees (15646 nodes and 17245 nodes, respectively). When querying the API, some data was lost because some Twitter accounts no longer existed or were otherwise unreachable: 1044 (6.67%) and 900 (5.21%) cases were missing from the in- and out-degree samples, respectively. Language of the post was also planned as an independent variable, but was dropped from the analysis. A preliminary analysis showed that ~ 86% of all posts were in English, with other languages having a relative frequency of 1% or less; therefore, including this variable in the models would not yield any meaningful results.

³ An “arc” refers to a directional relationship between the nodes of the network.
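The trade-off behind this data reduction can be sketched with a rough calculation. The function names are hypothetical, and the number of API calls needed per account is an assumption, since the text does not state it. With one call per account and no pauses, the full sample would need roughly 62 days; the estimate of approximately 113 days given above is larger, so the actual script presumably needed more than one call per account or incurred overhead that the text does not describe.

```python
def collection_days(n_accounts, calls_per_window=180, window_minutes=15,
                    calls_per_account=1):
    """Days needed to query every account under a fixed rate limit.

    calls_per_account is an assumption; the source does not report it.
    """
    windows = n_accounts * calls_per_account / calls_per_window
    return windows * window_minutes / (60 * 24)


def select_prominent(degrees, threshold=10):
    """Keep only the nodes whose degree is at least `threshold`."""
    return {node: degree for node, degree in degrees.items() if degree >= threshold}


full_sample_days = collection_days(1_078_515)
# Upper bound for the reduced sample, assuming the two samples do not overlap.
reduced_sample_days = collection_days(15_646 + 17_245)
prominent = select_prominent({"a": 9, "b": 10, "c": 321})
```

Under the same assumptions, the reduced sample of at most 32,891 accounts can be collected in about two days, which shows why the threshold of 10 degrees makes the collection feasible.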

Results

Network description

The graph built from the retweet data resulted in a network with 1,078,515 nodes (Twitter accounts) and 1,679,783 ties (retweets). The out-degree distribution ranged from 0 to 11,970 (M = 1.56, SD = 38.36), while the in-degree distribution ranged from 0 to 321 (M = 1.56, SD = 2.94). As discussed in the previous sections, out-degrees represent the direction of the message travelling from the originator of the message to the one who retweeted it, and therefore nodes with high out-degrees are more likely to be the ‘originators’. Conversely, nodes with higher in-degrees represent the ‘propagators’ of information. The difference between the maxima of the out- and in-degree distributions (11,970 vs. 321, respectively) may seem strange at first. However, it can be explained by the fact that every time a message is retweeted more than once, the ‘propagator’ inadvertently acquires an additional out-degree; therefore the number of out-degrees in the network grows with the network size disproportionately to the number of in-degrees.
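The identical means of the two distributions follow from the structure of a directed graph: every arc adds exactly one out-degree and one in-degree, so both means equal the number of arcs divided by the number of nodes. A small sketch (with a hypothetical helper and toy data) makes this explicit and reproduces the reported mean from the reported counts.

```python
from collections import Counter


def mean_degrees(arcs, n_nodes):
    """Return (mean out-degree, mean in-degree) for a directed graph."""
    out_degree = Counter(source for source, _ in arcs)
    in_degree = Counter(target for _, target in arcs)
    return sum(out_degree.values()) / n_nodes, sum(in_degree.values()) / n_nodes


# Toy graph with four nodes and three arcs.
toy_arcs = [("u2", "u1"), ("u2", "u3"), ("u4", "u1")]
toy_out_mean, toy_in_mean = mean_degrees(toy_arcs, n_nodes=4)

# Counts reported for the supernetwork.
supernetwork_mean = 1_679_783 / 1_078_515
```

Rounded to two decimals, the supernetwork value is 1.56, matching the mean reported for both distributions; only the spread (SD and maximum) differs between them.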

The network density, that is, the ratio of the observed to the maximum possible number of edges in the network, is extremely low in the full graph: ~ 1.44 × 10^-6. However, the complete graph (hereafter referred to as the supernetwork) is an unconnected network, which means that it contains multiple connected components, and this may explain the sparsity of the network. A connected component is a set of nodes that are connected to one another by ties, so that any node can be reached from any other node in the component (e.g., in a situation where A is connected to B, B is connected to C, and X is connected to Y, ABC and XY are two connected components of the ABCXY supernetwork). The supernetwork of retweets consisted of 68,254 different connected components. The largest component consisted of 897,768 nodes and 1,565,197 ties, which comprised 83.24% of the supernetwork's nodes and 92.95% of its ties. Only the 20 largest components (0.03% of all components) exceeded 100 nodes (405 to 109 nodes, after the largest component), and only 1.2% of all components were larger than 10 nodes. The rest were mostly composed of two to three nodes and the ties between them. The largest component (and its comparison to the supernetwork) will be the analytical focus of the present study.
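Both quantities in this paragraph can be sketched in a few lines. The density function uses the standard formula for a directed graph without self-loops, m / (n(n − 1)), which reproduces the reported value from the reported counts. The component search ignores the direction of the arcs (i.e., it finds weakly connected components); that choice is an assumption, as the text does not say how direction was handled, and the function names are illustrative.

```python
def directed_density(n_nodes, n_arcs):
    """Observed arcs divided by the n(n - 1) possible arcs of a directed graph."""
    return n_arcs / (n_nodes * (n_nodes - 1))


def weak_components(arcs):
    """Group nodes into components, ignoring arc direction (union-find)."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for source, target in arcs:
        parent[find(source)] = find(target)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(groups.values(), key=len, reverse=True)


supernetwork_density = directed_density(1_078_515, 1_679_783)
# The ABCXY example from the text: A-B, B-C and X-Y.
components = weak_components([("A", "B"), ("B", "C"), ("X", "Y")])
```

For the ABCXY example the sketch returns the two components {A, B, C} and {X, Y}, and the density evaluates to roughly 1.44 × 10^-6 for the supernetwork.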

Largest component. As was mentioned above, 897,768 nodes and 1,565,197 ties constituted this network. Since components have mostly the same properties as their supernetworks, it is a directed graph as well. The range of the in- and out-degree distributions was the same as in the supernetwork; however, the means and standard deviations were different: 0 to 11,970 (M = 1.74, SD = 42.04) for out-degrees and 0 to 321 (M = 1.74, SD = 3.18) for in-degrees. Tables 1 and 2 provide degree distributions for the supernetwork, its three largest components, a randomly generated graph with the same node and edge count as the largest component, and a scale-free network.


The degree distribution in both the supernetwork and the largest component seems to follow an approximate power law. In fact, it is a known phenomenon that real-life social networks follow a power law distribution (Barabási & Albert, 1999; Csányi & Szendrői, 2004). In this particular case, relatively few people are responsible for relatively large amounts of content creation (posting tweets) and information dissemination (retweeting). Figure 2 below shows the out-degree distribution of a) the supernetwork, b) the largest component, c) the Barabási–Albert model for scale-free networks (Barabási & Albert, 1999) with the same node and edge count as the largest component, and d) the Erdős–Rényi model (Erdős & Rényi, 1961) with the same node and edge count as the largest component.
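The contrast between the two reference models can be illustrated with a deliberately simplified sketch. It is not the generator used for Figure 2 (the text does not name one): the preferential-attachment function adds a single undirected tie per new node, the random model draws node pairs uniformly, and all names and the seed are arbitrary choices for the example.

```python
import random
from collections import Counter


def preferential_attachment_degrees(n_nodes, rng):
    """Each new node attaches to one existing node chosen proportionally to its degree."""
    endpoints = [0, 1]            # every tie contributes both of its endpoints
    degree = Counter(endpoints)   # start from a single tie between nodes 0 and 1
    for new_node in range(2, n_nodes):
        chosen = rng.choice(endpoints)  # high-degree nodes appear more often here
        degree[chosen] += 1
        degree[new_node] += 1
        endpoints += [chosen, new_node]
    return degree


def uniform_random_degrees(n_nodes, n_ties, rng):
    """Place ties between uniformly chosen pairs of distinct nodes."""
    degree = Counter()
    for _ in range(n_ties):
        a, b = rng.sample(range(n_nodes), 2)
        degree[a] += 1
        degree[b] += 1
    return degree


rng = random.Random(42)
n = 5000
skewed = preferential_attachment_degrees(n, rng)
flat = uniform_random_degrees(n, n - 1, rng)
print(max(skewed.values()), max(flat.values()))
```

Both toy networks have the same number of nodes and ties, yet the preferential-attachment network produces a few very highly connected hubs, whereas the largest degree in the uniformly random network stays close to the mean. This is the qualitative difference between the scale-free and random distributions discussed above.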


Table 1

In-degree distribution

                                  N        Min  1st Q  Mean  Median  3rd Q  Max  SD
Supernetwork                      1078515  0    1      1.56  1       1      321  2.94
Component A                       897768   0    1      1.74  1       2      321  3.18
Component B                       405      0    1      1.36  1       2      7    0.99
Component C                       332      0    1      1     1       1      2    0.10
Barabási–Albert scale-free model  1078515  0    3      3     3       3      3    0.004
Erdős–Rényi random model (A)      897768   0    1      1.74  2       3      12   1.32

Table 2

Out-degree distribution

                                  N        Min  1st Q  Mean  Median  3rd Q  Max    SD
Supernetwork                      1078515  0    0      1.56  0       0      11970  38.36
Component A                       897768   0    0      1.74  0       0      11970  42.04
Component B                       405      0    0      1.38  0       0      64     0.99
Component C                       332      0    0      0.99  0       0      329    0.10
Barabási–Albert scale-free model  1078515  0    0      3     0       2      45890  85.75


Figure 2. Cumulative frequencies of out-degree distributions from the Supernetwork (top left), Largest component (top right), Scale-free network (bottom left), Random graph (bottom right)


As can be seen in Figure 2, the out-degree distributions of the supernetwork and the largest component are virtually the same (mainly because the largest component comprises ~ 88% of the supernetwork), and are strikingly similar to the distribution of the Barabási–Albert scale-free network (power law distribution). The three aforementioned distributions are, however, dissimilar to that of the random network model, which is much closer to a normal distribution. The in-degree distribution graphs for the same networks (provided in Figure A.1 in the Appendix for the sake of space) resemble the out-degree distributions. Another noteworthy distribution is that of the components themselves. As was pointed out, only 1.2% of all components within the network contain 10 or more nodes. Once again, this distribution resembles a power law distribution (presented in Figure A.2 in the Appendix). This seems to support the idea that most real-world (not mathematically synthesised) networks approach power law distributions, and that a small number of people on Twitter are responsible for the vast majority of its activity, whether it be the generation or the transmission of information. A comparison with randomly generated networks strengthens this claim.

As was pointed out, the density of the supernetwork was extremely low. Unsurprisingly, the density of the largest connected component was higher, but it still remained at a very low level: ~ 1.94 × 10^-6. This indicates very low cohesion in the network of retweets, which is to be expected, since not everyone retweets everyone else’s messages; rather, users transmit information that they find relevant or whose source they know. The transitivity of a network is a probabilistic measure of a triad being connected, and is most often described as the probability of a friend of a friend being a friend. In the case of Twitter retweets, it represents the likelihood of User A retweeting User C if User A retweeted User B and User C retweeted User B. The transitivity is low in both the supernetwork and the largest component: 0.08% each. This may seem like an extremely small probability; however, to put this number in perspective, the transitivity of a random network of the same size is 2.74 × 10^-6, or ~ 0.0003%. The average path length of both the supernetwork and the largest component is ~ 9.22 steps. In smaller networks, such as the second largest component, the average path length is very low: 1.29. Compare this to Stanley Milgram’s famous “Small World” experiment (Milgram, 1967), in which the average path length in a real-world social network is hypothesised to be approximately 6. Milgram’s work, however, is not directly generalisable, since it most likely underestimated the real mean path length, and most other small-world networks have a higher average path length (e.g., Kleinfeld, 2002).
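Transitivity can be made concrete with a short sketch. It uses the common definition for undirected graphs (the share of connected triples that are closed into triangles) and therefore ignores the direction of the retweet arcs; whether the reported 0.08% was computed this way is an assumption, and the function name is illustrative.

```python
from itertools import combinations


def transitivity(edges):
    """Share of connected triples (paths of length two) that are closed."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # self-loops do not form triples
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    triples = 0
    closed = 0
    for adjacent in neighbours.values():
        # Every pair of neighbours of a node forms a triple centred on that node.
        for u, v in combinations(adjacent, 2):
            triples += 1
            if v in neighbours[u]:
                closed += 1  # each triangle is counted once per centre node
    return closed / triples if triples else 0.0


# A triangle A-B-C with one extra tie C-D: 3 of the 5 triples are closed.
toy_transitivity = transitivity([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
```

In the toy graph the value is 0.6, far above the 0.0008 reported for the retweet network, which illustrates how rarely retweet triads are closed in the observed data.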

Prominent actors. Since the network includes more than a million nodes, it would be impossible to provide a complete description of its actors. Therefore, only the hundred nodes with the highest in-degree and the hundred nodes with the highest out-degree will be discussed further. The top 100 in-degree nodes (i.e., information propagators) ranged from 95 to 321 in-degrees (M = 135.40, SD = 74.01). Interestingly enough, almost every account in the top 100 list is either a private account or a commercial account whose main purpose lies outside the media sphere. For example, the top node, with 321 in-degrees, is the Twitter account of the Qatar Airways website qatarflights.com (@qatarflights). Admittedly, this is of little surprise, since air travel agencies would have a significant interest in a large-scale air travel disaster. Only 4 accounts (4%) could be considered news related: @mh17news, an ad hoc account created specifically to cover the MH17 crash (267 in-degrees; last tweet 17/08/2014); @1NEWS2NEWS, a news aggregator account (235 in-degrees); @DesiNewsyTweets, an automated Indian news aggregator (107 in-degrees); and @QxNews, another news aggregator (108 in-degrees).

The top 100 out-degree nodes (i.e., information generators) ranged from 1386 to 11970 (M = 3037, SD = 1889.62) in their out-degree distribution. Unlike the top in-degree nodes, of which only 4% were media-related accounts, 46% of the top 100 out-degree nodes are either international or local media outlets (excluding journalists’ private accounts). The top node in the network (@Suara_generasi), with by far the highest out-degree score (11970), however, belongs to Mahathir Mohamad, a former Malaysian Prime Minister. The second highest node (9906 out-degrees) is another ad hoc account created to cover the news pertaining to the MH17 crash (@MH17Newss, note the difference from the similar account described previously). Table 3 below shows some of the most well-known media outlets that feature in the top 100 out-degree list, as well as their in-degree scores. A similar table for the most prominent actors in the top 100 in-degree list is provided in Table A.1 in the Appendix.

Table 3

Network position and degree scores of prominent news outlets in the network

                     Out-degrees  Network position  In-degrees  Network position
                                  (out-degrees)                 (in-degrees)
CNN Breaking         6827         4                 3           77140
BBC World            6137         10                11          11429
Reuters              5243         11                0           948711
BBC Breaking         5185         12                0           1007921
Wall Street Journal  4750         13                15          6278
Russia Today         4135         20                4           54070
Associated Press     3771         26                2           132079
New York Times       3550         29                2           129755
NBC News             2471         45                2           121400
TIME                 2193         51                1           260451

As can be clearly seen from Table 3, the most important news outlets (including those not shown in the table) score extremely high on out-degrees (~ top 1% of the network out-degree distribution), but very low on in-degrees (~ bottom 1% of the network in-degree distribution). While the out-degree distribution could possibly be overestimated for reasons discussed above,⁴ this overestimation is present for every node in the network; therefore it does not introduce any analytical inaccuracies for the present study. International news outlets like CNN or Reuters are clearly the ‘creators’ of original information. These data support the conjecture made earlier, namely, that nodes (Twitter accounts) responsible for the creation of information will score higher on out-degrees than those responsible for information dissemination through the network. Conversely, nodes that are mostly responsible for information diffusion will have a much higher in-degree. This also shows that the most prominent ‘creators’ of information are not active in the dissemination of other information (low in-degree scores), and that the ‘propagators’ are not very active in creating original information. The result is a somewhat dualistic model of a network with two distinct roles: the ‘creators’ and the ‘propagators’.

⁴ With every additional retweet the “propagator” inadvertently acquires an additional out-degree, therefore the number of out-degrees in the network grows with the network size disproportionately to the number of in-degrees.

Role of network ties in information dissemination.

As was already mentioned, the Twitter API restrictions prevented the analyses from being performed on all the available data; therefore, a decision was made to gather Twitter data only for nodes with 10 or more in- or out-degrees. Hence, this section is divided into two parts: first, the top in-degree nodes are analysed, and then the top out-degree nodes. The data followed a count distribution with overdispersion; therefore, a negative binomial regression model was used to analyse them.
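Overdispersion means that the variance of a count variable exceeds its mean, which violates the equal mean and variance assumption of a Poisson model and motivates the negative binomial model. A quick check, using the means and standard deviations reported for the two samples in this section, is sketched below; the function name is illustrative.

```python
def variance_to_mean_ratio(mean, sd):
    """Ratio above 1 indicates overdispersion relative to a Poisson model."""
    return sd ** 2 / mean


# In-degree sample: M = 18.09, SD = 14.55; out-degree sample: M = 76.52, SD = 293.74.
in_degree_ratio = variance_to_mean_ratio(18.09, 14.55)     # roughly 11.7
out_degree_ratio = variance_to_mean_ratio(76.52, 293.74)   # roughly 1128
```

Both ratios are far above 1, which is consistent with the choice of a negative binomial rather than a Poisson regression. Note that both samples are truncated at 10 degrees, so these ratios are only a rough indication.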

In-degrees model. A total of 15646 nodes were analysed, with the in-degree distribution ranging from 10 to 321 (M = 18.09, SD = 14.55). Descriptives for the number of followers (weak ties), the number of friends (strong ties), and the number of Twitter posts are provided in Table 4 below.


The independent variables were highly skewed; therefore, a logarithmic transformation was performed to normalise their distributions. The natural logarithm, applied as ln(x + 1) (see Table 4), was used for the transformation; therefore, “an increase in a variable x” should actually be read as “variable x being multiplied by e (~ 2.718)”.

To test the posed hypotheses, a negative binomial regression model was estimated with the in-degree network position as the dependent variable and Friends, Followers, and Posts as the independent variables (In-degrees model). The model fit was acceptable, with adjusted pseudo-R² = .033. The log-likelihood was significantly improved over the baseline model: χ² = 480.55, p < .001. The results will be interpreted using incidence rate ratios (IRR). Table 5 provides the regression results for both models (negative binomial for in-degrees; negative binomial for out-degrees).

Hypothesis 1 (a user’s position in a social network will be positively correlated with the number of followers he/she has) was confirmed: the IRR for the number of followers was 1.03, p < .001. Therefore, with an increase in followers, the count of in-degrees (network position) is expected to increase by approximately 3%.

Hypothesis 2 (a user’s position in a social network will be positively correlated with the number of other users he/she follows) was not confirmed. The “Friends” variable had an IRR of 0.98, p < .001, indicating that with an increase in Friends, the expected count of the dependent variable decreases by approximately 2%.

Table 4

Descriptives for the independent variables (in-degree model)

                   Min  1st Q  Mean   Median  3rd Q  Max          SD         N
Followers          0    227    4693   498     1053   14.7 × 10^6  14 × 10^4  15646
Friends            0    209    1031   426     926    282600       5465.33    15646
Posts              0    8704   35080  21340   43250  1.18 × 10^6  48842.88   15646
ln(Followers + 1)  0    5.65   6.38   6.32    7.04   16.53        1.37       15647
ln(Friends + 1)    0    5.54   6.19   6.16    6.91   12.62        1.20       15647
ln(Posts + 1)      0    9.37   9.97   10.12   10.77  14.06        1.33       15647

Hypothesis 3 (the number of followers (weak ties) will contribute more to the position in a social network than the number of “friends”) was confirmed. As is evident from the results, Followers do indeed contribute to the network position, while Friends only seem to hinder it.
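The link between the regression coefficients and the incidence rate ratios used in this section can be sketched as follows. In a negative binomial model with a log link, the IRR is the exponential of the coefficient; because the predictors were log-transformed, it describes the change in the expected count when the raw predictor is multiplied by e. The helper names are hypothetical, the "+ 1" inside the logarithm is ignored for simplicity, and the coefficients are the rounded values from Table 5, so recomputed IRRs may differ from the table in the last digit.

```python
import math


def irr(coefficient):
    """Incidence rate ratio: multiplier on the expected count per unit of ln(x)."""
    return math.exp(coefficient)


def rate_multiplier(coefficient, factor):
    """Multiplier on the expected count when the raw predictor is multiplied by `factor`."""
    return factor ** coefficient


followers_in = irr(0.03)     # in-degrees model, Followers
friends_in = irr(-0.02)      # in-degrees model, Friends
followers_out = irr(0.25)    # out-degrees model, Followers
friends_out = irr(-0.08)     # out-degrees model, Friends

# Effect of doubling (rather than multiplying by e) the number of followers
# in the out-degrees model.
doubling_effect = rate_multiplier(0.25, 2)
```

Read this way, multiplying an account's followers by e is associated with an expected out-degree that is about 28% higher, whereas doubling the followers corresponds to an increase of roughly 19%.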

Table 5

Regression models

           In-Degrees                 Out-Degrees
           Coefficient  S.E.   IRR    Coefficient  S.E.   IRR
Followers  0.03         0.004  1.03   0.25         0.004  1.28
Friends    -0.02        0.004  0.98   -0.08        0.004  0.92
Posts      0.05         0.004  1.06   -0.08        0.005  0.92
N          15646                      17245
Pseudo R²  0.032                      0.24
χ²         480.55                     4938.32

Note: All coefficients are significant at p < .001.

Out-degrees model. There were 17245 nodes with 10 or more out-degrees. Out-degrees ranged from 10 to 11970 (M = 76.52, SD = 293.74). Descriptives for the independent variables are provided in Table 6 below.


Table 6

Descriptives for the independent variables (out-degree model)

                   Min  1st Q  Mean    Median  3rd Q  Max           SD           N
Followers          0    1062   133500  5541    34300  59.67 × 10^6  1.06 × 10^6  17245
Friends            0    165    4280    512     1523   938100        21788.09     17245
Posts              0    4050   35710   15650   43060  4.5 × 10^6    75922.52     17245
ln(Followers + 1)  0    7.26   8.94    8.83    10.60  17.93         2.46         17246
ln(Friends + 1)    0    5.34   6.33    6.35    7.42   13.86         2.01         17246
ln(Posts + 1)      0    8.61   9.48    9.80    10.75  15.34         1.92         17246

As in the previous model, the independent variables were highly skewed; therefore, a logarithmic transformation was performed to normalise their distributions. The natural logarithm was used for the transformation; therefore, “an increase in a variable x” should again be read as “variable x being multiplied by e (~ 2.718)”. A negative binomial regression (Out-degrees model) was estimated to investigate the effects of Twitter friends and followers on the out-degree network position. Network position was used as the dependent variable, and friends and followers as the independent variables. The Out-degrees model had a good fit: adjusted pseudo-R² = .24. Followers had a significant positive effect on the network position (IRR = 1.28, p < .001), which means that with an increase in Followers, the expected count of the dependent variable would increase by 28%. Friends, however, did not have a positive effect (IRR = 0.92, p < .001); with an increase in Friends, the expected count of the dependent variable would decrease by approximately 8%. These effects share the same directionality as the ones described for the in-degree model; therefore, Hypothesis 1 (a user’s position in a social network will be positively correlated with the number of followers he/she has) and Hypothesis 3 (the number of followers (weak ties) will contribute more to the position in a social network than the number of “friends”) were confirmed, but contrary to the initial expectations Hypothesis 2 (a user’s position in a social network will be positively correlated with the number of other users he/she follows) was not confirmed.

Conclusion and Discussion

The present research concerned itself with online human communication networks and the spread of political information through these networks. The Twitter SNS was chosen as a prime example of such a network. The patterns of information flow were modelled using data pertaining to the crash of Malaysia Airlines Flight MH17 in July 2014. The study mainly employed social network analysis to answer the posed questions.

The present paper had two principal goals it strived to achieve: first, an attempt was made to provide a relatively holistic description of the network of communication flows, and the role of the network structure in the dissemination of information. Second, it asked whether the theories developed to explain the flow of information through the offline social networks are applicable to their online counterparts.

Structurally, the modelled online Twitter network bears many similarities to both real-life social networks and (to some extent) mathematically modelled ones. For example, nearly every frequency distribution approaches a power law, which is theorised to be a common occurrence in social networks. It was discovered that the network has a dualistic structure with two clearly defined roles: “information generators” and “information propagators”. Prominent network actors that generated information (mainly international news outlets) had a structurally different network position from those who propagated the information (mainly private accounts). The “originators” had a very high out-degree centrality score (i.e., they had a very high likelihood of being retweeted by another Twitter account) but an extremely low in-degree score (i.e., they had a very low likelihood of retweeting other Twitter accounts’ tweets). Conversely, the “propagators” were very likely to retransmit the messages of others, but highly unlikely to create their own messages or to have their own messages retweeted by others. There were virtually no accounts that were equally likely to both generate their own messages and retweet the messages of other Twitter accounts (i.e., that scored high on both in- and out-degrees), which supports the idea that these roles are not only clearly defined, but also mutually exclusive. While the analysis did not directly support all of the posed hypotheses, it nevertheless provided some valuable insights.

Unsurprisingly, the number of posts a Twitter account has positively influences both the in- and out-degree centrality. The number of Twitter posts is directly related to the activity on the social networking website, and therefore the effect is expected. The number of followers proved to be a significant positive predictor for out-degree centrality, meaning that weak ties do indeed contribute to information diffusion in an online network, as is the case in their offline counterparts. The number of friends (strong ties) did have a significant effect on both out- and in-degree network centrality; however, the directions were opposite to those expected in the initial hypotheses: with an increase in the number of Twitter friends an account has, the actor is expected to play a lesser role in both the generation and the dissemination of information through the network. With some reservations, it could be said that Granovetter’s tie strength theory is applicable to online communication networks. As is clearly seen from the analysis results, weak ties do indeed positively influence one’s network position, although, unlike in offline networks, strong ties only seem to hinder the position of an actor in the network, thereby rendering the actor less effective as either the “generator” or the “propagator” of information. Admittedly, these findings are based on a single type of information in a specific communication network, the Twitter SNS; however, they may indicate a trend that warrants further investigation.

The basic patterns of political information spread in a network have been identified in this paper. A new piece of (successfully transmitted) information is most likely to originate from an established, well-known channel (e.g., an international news outlet like the BBC). When the […] “propagators”, mostly private, non-professional Twitter accounts that simply relay the information onwards. The originators rarely transmit any information (e.g., the BBC is highly unlikely to transmit AP’s piece of news), and the propagators rarely create any (successfully transmitted) piece of information (e.g., User_1 is highly unlikely to create information that will be widely shared in a network). Accounts with more Twitter followers are more likely to be instrumental in the transmission of information, i.e., with a very high probability, a new piece of information travels from its point of origin down the network via the network’s weak ties.

The study has a number of limitations that could be addressed in future research. Due to Twitter API restrictions it was impossible to conduct analyses on the complete dataset. Only 32,891 nodes (~ 3.5% of the full sample) were used for the negative binomial regression models. While the selection of the subsample should not, in theory, introduce a serious error into the analysis, since the research focused on the most prominent actors in the networks (which, with a highly skewed distribution, is exactly what the subsample represents), it may nevertheless account for some loss of information. This problem may be resolved in future either by allocating significantly more time to obtaining the data from Twitter, or by using a much smaller network, for which gathering a complete sample would be more feasible. Also, this research focused only on the structural side of the political information flow in the network, and did not consider the messages themselves. A continuation of this study could add a textual analysis (e.g., content analysis) to the method in order to distinguish what kind of information is more likely to be disseminated. Lastly, the description provided in the study is focused on one specific network with one specific type of information. Studies utilising comparative designs that juxtapose various online networks with different information flows are a necessary next step in this area of research, in order to tell whether the findings here can be generalised to the mode of information spread across various online social networks.


To sum up, even with the aforementioned limitations, the results presented in the paper offer an important insight into how information spreads in an online network. The study provided the reader with an overview of this process, and gave suggestions as to what can be done to further improve the theoretical knowledge of online human communication. The findings described in this paper are in no way an exhaustive representation of the processes that drive information diffusion online. The exponential growth of communication networks, facilitated by the constant advance in technologies, not only allows social scientists to investigate the new ways in which people interact, but also to uncover the fundamental principles of human communication. Social scientists from various schools and disciplines should therefore continue to investigate these fundamental yet ever-changing phenomena.



Literature

Adar, E., & Adamic, L. A. (2005, September). Tracking information epidemics in blogspace. In Web intelligence, 2005. Proceedings. The 2005 IEEE/WIC/ACM international conference on (pp. 207-214). IEEE.

Bakshy, E., Hofman, J. M., Mason, W. A., & Watts, D. J. (2011, February). Everyone's an influencer: quantifying influence on twitter. In Proceedings of the fourth ACM international conference on Web search and data mining (pp. 65-74). ACM.

Bakshy, E., Rosenn, I., Marlow, C., & Adamic, L. (2012, April). The role of social networks in information diffusion. In Proceedings of the 21st international conference on World Wide Web (pp. 519-528). ACM.

Baños, R. A., Borge-Holthoefer, J., & Moreno, Y. (2013). The role of hidden influentials in the diffusion of online information cascades. EPJ Data Science, 2(1), 1-16.

Barabási, A. L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509-512.

Barabási, A. L. (2007). Structure and tie strengths in mobile communication networks. Proceedings of the National Academy of Sciences, 104(18), 7332-7336.

Brown, J. J., & Reingen, P. H. (1987). Social ties and word-of-mouth referral behavior. Journal of Consumer research, 350-362.

Burt, R. S. (2004). Structural holes and good ideas. American Journal of Sociology, 110(2), 349-399.

Cheng, A., Evans, M., & Singh, H. (2009, June). Inside Twitter: An in-depth look inside the Twitter world. Report of Sysomos, June, Toronto, Canada.

Csányi, G., & Szendrői, B. (2004). Structure of a large social network. Physical Review E, 69(3), 036131.


De Meo, P., Ferrara, E., Fiumara, G., & Provetti, A. (2014). On Facebook, most ties are weak. Communications of the ACM, 57(11), 78-84.

Diddi, A., & LaRose, R. (2006). Getting hooked on news: Uses and gratifications and the formation of news habits among college students in an Internet environment. Journal of Broadcasting & Electronic Media, 50(2), 193-210.

Dragland, Å. (SINTEF). (2013, May 22). Big Data, for better or worse. Retrieved November 13, 2014 from http://www.sintef.no/home/Press-Room/Research-News/Big-Data--for-better-or-worse

Durkheim, E. (1894). Les règles de la méthode sociologique. Revue Philosophique de la France et de l'Étranger, 37, 465-498.

Doerr, B., Fouz, M., & Friedrich, T. (2012). Why rumors spread so quickly in social networks. Communications of the ACM, 55(6), 70-75.

Easley, D., & Kleinberg, J. (2010). Networks, crowds, and markets: Reasoning about a highly connected world. Cambridge University Press.

Ellison, N. B., Steinfield, C., & Lampe, C. (2007). The benefits of Facebook “friends:” Social capital and college students’ use of online social network sites. Journal of Computer-Mediated Communication, 12(4), 1143-1168.

Erdős, P., & Rényi, A. (1961). On the strength of connectedness of a random graph. Acta Mathematica Hungarica, 12(1), 261-267.

Friedkin, N. E. (1982). Information flow through strong and weak ties in intraorganizational social networks. Social networks, 3(4), 273-285.

Galuba, W., Aberer, K., Chakraborty, D., Despotovic, Z., & Kellerer, W. (2010, June). Outtweeting the twitterers-predicting information cascades in microblogs. In Proceedings of the 3rd conference on Online social networks (pp. 3-3). USENIX Association.


Gilbert, E., & Karahalios, K. (2009, April). Predicting tie strength with social media. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 211-220). ACM.

Grabowicz, P. A., Ramasco, J. J., Moro, E., Pujol, J. M., & Eguiluz, V. M. (2012). Social features of online networks: The strength of intermediary ties in online social media. PloS one, 7(1), e29358.

Granovetter, M. S. (1973). The strength of weak ties. American Journal of Sociology, 78(6), 1360-1380.

Granovetter, M. (1983). The strength of weak ties: A network theory revisited. Sociological theory, 1(1), 201-233.

Gruhl, D., Guha, R., Liben-Nowell, D., & Tomkins, A. (2004, May). Information diffusion through blogspace. In Proceedings of the 13th international conference on World Wide Web (pp. 491-501). ACM.

Holme, P. (2005). Network reachability of real-world contact sequences. Physical Review E, 71(4), 046119.

Horrigan, J. B., Garrett, K., & Resnick, P. (2004). The Internet and democratic debate. Pew Internet & American Life Project.

Huang, Y. P., Goh, T., & Liew, C. L. (2007, December). Hunting suicide notes in web 2.0 - Preliminary findings. In Ninth IEEE International Symposium on Multimedia Workshops (ISMW'07) (pp. 517-521). IEEE.

Kastenmüller, A., Greitemeyer, T., Jonas, E., Fischer, P., & Frey, D. (2010). Selective exposure: The impact of collectivism and individualism. British Journal of Social Psychology, 49(4), 745-763.

Kerckhoff, A. C. (1965). Nuclear and extended family relationships: A normative and behavioral analysis. Social Structure and the Family: Generational Relations, 93-112.

Kerckhoff, A. C., & Back, K. W. (1965). Sociometric patterns in hysterical contagion. Sociometry, 28, 2-15.

Klovdahl, A. S. (1985). Social networks and the spread of infectious diseases: the AIDS example. Social science & medicine, 21(11), 1203-1216.

Kleinfeld, J. S. (2002). Six degrees of separation: Urban myth? Psychology Today, 35(2).

Kossinets, G., Kleinberg, J., & Watts, D. (2008, August). The structure of information pathways in a social communication network. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 435-443). ACM.

Lasswell, H. D. (1948). The structure and function of communication in society. The communication of ideas, 37.

Lerman, K., & Ghosh, R. (2010). Information Contagion: An Empirical Study of the Spread of News on Digg and Twitter Social Networks. ICWSM, 10, 90-97.

Leskovec, J., McGlohon, M., Faloutsos, C., Glance, N. S., & Hurst, M. (2007, April). Patterns of Cascading behavior in large blog graphs. In SDM (Vol. 7, pp. 551-556).

Liben-Nowell, D., & Kleinberg, J. (2008). Tracing information flow on a global scale using Internet chain-letter data. Proceedings of the National Academy of Sciences, 105(12), 4633-4638.

Milgram, S. (1967). The small world problem. Psychology Today, 2(1), 60-67.

Mislove, A., Lehmann, S., Ahn, Y. Y., Onnela, J. P., & Rosenquist, J. N. (2011). Understanding the Demographics of Twitter Users. ICWSM, 11, 5th.

Morales, A. J., Borondo, J., Losada, J. C., & Benito, R. M. (2014). Efficiency of human activity on information spreading on Twitter. Social Networks, 39, 1-11.

Nekovee, M., Moreno, Y., Bianconi, G., & Marsili, M. (2007). Theory of rumour spreading in complex social networks. Physica A: Statistical Mechanics and its Applications, 374(1), 457-470.

Onnela, J. P., Saramäki, J., Hyvönen, J., Szabó, G., Lazer, D., Kaski, K., ... & Barabási, A. L. (2007). Structure and tie strengths in mobile communication networks. Proceedings of the National Academy of Sciences, 104(18), 7332-7336.

Padgett, J. F., & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400-1434. American Journal of Sociology, 98(6), 1259-1319.

Purcell, K., Rainie, L., Mitchell, A., Rosenstiel, T., & Olmstead, K. (2010). Understanding the participatory news consumer: How Internet and cell phone users have turned news into a social experience. Pew Internet & American Life Project.

Rainie, L., & Smith, A. (2012). Social networking sites and politics. Washington, DC: Pew Internet & American Life Project. Retrieved November 20, 2014.

Romero, D. M., Tan, C., & Ugander, J. (2011). Social-Topical Affiliations: The Interplay between Structure and Popularity. CoRR.

Ruggiero, T. E. (2000). Uses and gratifications theory in the 21st century. Mass communication & society, 3(1), 3-37.

Shao, G. (2009). Understanding the appeal of user-generated media: a uses and gratification perspective. Internet Research, 19(1), 7-25.

Singer, J. B. (2014). User-generated visibility: Secondary gatekeeping in a shared media space. New Media & Society, 16(1), 55-73.

Stroud, N. J. (2008). Media use and political predispositions: Revisiting the concept of selective exposure. Political Behavior, 30(3), 341-366.

Weeks, B. E., & Holbert, R. L. (2013). Predicting dissemination of news content in social media: A focus on reception, friending, and partisanship. Journalism & Mass Communication Quarterly, 90(2), 212-232.

Xiang, R., Neville, J., & Rogati, M. (2010, April). Modeling relationship strength in online social networks. In Proceedings of the 19th international conference on World wide web (pp. 981-990). ACM.


Appendix


Figure A.1. Cumulative frequencies of in-degree distributions from the Supernetwork (top left), Largest component (top right), Scale-free network (bottom left), Random graph (bottom right)

Table A.1

Network position and degree scores of prominent information propagators in the network

User              Out-degrees   Network position (out-degrees)   In-degrees   Network position (in-degrees)
@qatarflights     3             43322                            321          1
@PortalKritis     20            9367                             284          2
@mh17news         152           1498                             267          3
@Novorossiyan     618           316                              255          4
@wavetossed       432           493                              249          5
@FalseFlagBot_    0             220060                           240          6
@1NEWS2NEWS       0             213237                           235          7
@Pauljaine        30            6551                             233          8
@ihsanushshabri   1             114173                           232          9
@prayag           0             217410                           220          10
