
Texting with the stars : parasocial interactions in a social media world : a big data study


Academic year: 2021


"Texting With The Stars": Parasocial Interactions In A Social Media World - A Big Data Study

Keshet Katz

University of Amsterdam

10866981

Master’s Thesis

Graduate School of Communication Research University of Amsterdam

Supervised by: Prof. Dr. Ed S. H. Tan and Dr. Damian Trilling

June 24th, 2016


Abstract

This study takes a social approach to describing and understanding the interaction between celebrities and fans as it is performed on two social platforms: Instagram and Twitter. Social platforms have increasingly become a central part of online communication in particular, and of human communication in general. The study set out to examine how these new forms of communication have influenced Fan to Celebrity interactions. Automated content analysis, based on a supervised machine learning approach, was used to explore 2.7 million textual messages sent on those platforms, including fan communication directed to 38 celebrities. To explore parasocial interaction in online social media, Fan to Celebrity communication was compared with peer interaction across the two social platforms. The textual analysis focused on four main categories: love expression, compliments, references to physical appearance, and direct address language.

The study found a difference in communication with celebrities across platforms, suggesting that the virtual environment affects users’ interactions. Although the hypotheses were not supported, the results suggest that parasocial interaction on social platforms is a field worth exploring further.

Keywords: social platforms, parasocial interaction, fans, celebrities, automated content analysis, supervised machine learning


Interaction between fans and celebrities is not a new phenomenon. Back in 1840, Franz Liszt, a Hungarian composer, was considered to be the 'world's first rock star’. Liszt was adored by women who threw their underwear onto the stage during his performances. The same enthusiasm was later seen in ‘Beatlemania’ and ‘Bieber fever’. Such ‘personal encounter’ interactions are available to only a handful of fans. Textual communication has opened opportunities to many more. Studies found that through letter correspondence (‘fan mail’), fans pursued interaction in the form of feedback (Simmons, 2009), love letters (Rother, 2009), and attention seeking (Ferris, 2001). Interaction with celebrities was found to play a predominant role in identity building. During adolescence and young adulthood, personal identification with a celebrity was found to be a very important factor in the identity-development phase, as identification shifts away from parental models (Cramer, 2001). It is therefore of interest to understand how further technological advances have affected fan to celebrity interaction.

Social media platforms, like earlier forms of online communication, have pushed limits and permanently changed paradigms that were once in place. Geographic location has stopped being an issue for textual communication: as citizens of the internet, people are no longer bound by physical proximity and can interact with others at any point on the globe. Message delivery time shortened greatly with the invention of electronic mail.

The development of online interactive platforms has dramatically changed textual communication. First, the Bulletin Board System gave users who shared a common interest the opportunity to converse anonymously. Next, social networks were used to connect offline acquaintances online. The following development enabled users to create profiles, invite friends, organize groups, and browse other users' profiles, but still under the assumption that a rich online community can exist only between people who truly have common bonds. ‘Myspace’ and ‘Facebook’ followed, focusing more on the individual and letting users express themselves.

The newer platforms, Twitter and Instagram, now adapted for hand-held devices as well, have broken an additional barrier. No longer built around an existing ‘circle of friends’, they give every user in the network, at least on paper, equal access to follow other users, as long as a profile is public. Social media, inherently designed to facilitate human connections, have altered the way individuals interact with one another (Wallace, Wilson, & Miloch, 2011) and enlarged the opportunities available to fans all over the world to interact with other fans and with celebrities.

Over the years, online communication has become increasingly widespread and has influenced, among other things, fan culture, fan community, and fan interactions with celebrities. Now that technological advances enable instant communication, literally in the palm of our hands, and social networks have seemingly broken all barriers, it is interesting to examine what effect, if any, this has had on fan-celebrity relationships.

Fan to celebrity interaction can be placed under the umbrella of parasocial interaction. Parasocial interaction, originally hypothesized by Horton and Wohl (1956), describes the relationship between media users and media figures. Modern social media platforms encourage parasocial interaction in two ways: 1) these platforms provide an increased possibility for reciprocation; 2) the content shared on them is often of a personal nature, much more so than in other media venues. The illusion of “knowing” the persona is strengthened with every new piece of public information the fan receives, more so when the information is of a private nature. Social platforms provide seemingly direct personal information, without any mediator or agent, from the celebrity to the fan. One can assume that the perceived personal access that social media present will reinforce the misconception of closeness to the celebrity and increase parasocial feelings.

This study examines how the use of Twitter and Instagram affects fan to celebrity interaction, as compared with peer to peer interaction on the same platforms.

Theoretical Framework

The mid-twentieth century is known as an epoch of considerable development in Information and Communication Technology (ICT), as mass media became available to the general public (Preston, 2001). One noteworthy characteristic of the ‘new’ mass media of that era (radio, television, and movies) is that they created the illusion of an intimate relationship with the performer. First operationalized by Horton and Wohl (1956), parasocial interaction (PSI) has been defined as an apparent face-to-face interaction between media characters and audience members. Although the interaction is one-sided, most often with a person of higher status, the individual feels as if he knows the figure as a friend or colleague, even if the figure hardly knows him, if at all (Stever, 2009). Horton and Wohl (1956) described this relationship from the fan’s perspective:

“They ‘know’ such a persona in somewhat the same way they know their chosen friends: through direct observation and interpretation of his appearance, his gestures and voice, his conversation and conduct in a variety of situations. His appearance is a regular and dependable event, to be counted on, planned for, and integrated into the routines of daily life. His devotees ‘live with him’ and share the small episodes of his public life – and to some extent even of his private life away from the show.” (p. 220).

With the development of the new media field, PSI has expanded from a narrow focus on identification with media figures to a multifaceted structure. Giles described two dimensions, which sprang from two disciplines, as a way to examine interaction with celebrities: personal attachment and celebrity worship (Maltby & Giles, 2008). Giles concluded that celebrity worship should always be considered within the context of parasocial phenomena.

While ordinary text-based exchanges are not considered parasocial, because they are reciprocated (Maltby & Giles, 2008), some interesting possibilities for textual communication emerged in the early days of internet usage. Celebrities and other media figures created their own websites, and later opened social media profiles, in order to enter into a dialogue with fans. The opportunity for reciprocation did not begin with the invention of the internet, as fans have always written letters to their idols and received debatable acknowledgment in return. The popularity of social networks has toppled the private/public distinction in such a way that all fans may feel that their idol is communicating with them personally. Parasocial interaction research cannot ignore the importance of online interaction, which became common when the personal computer turned into a household item and grew even more with the introduction of smartphones and social media (Digital Trends, 2016).

Fan culture and celebrity interaction

“Celebrity” and “fame” are not easily defined concepts. Braudy (1986), in his book ‘The Frenzy of Renown’, defined 'fame' using four factors: the person, the accomplishment, the publicity, and posterity. Caughey (1984) described celebrities as those who appear in "television, movies, radio, books, magazines, and newspapers". Spitzberg and Cupach (2008) defined celebrity as fame associated with any person who is in the public eye. Epstein (2005) argues that, while not mutually exclusive, fame and celebrity are also not interchangeable. According to his theory, fame is something one earns, while celebrity is something the individual cultivates.

The dictionary definition of "fan" describes it as a shortened version of the word fanatic, commonly used to refer to a person who is enthusiastically devoted to something or somebody, such as a band, a sports team, a genre, a book, a movie, or an entertainer. This research treats as fans all audience members who voluntarily and actively communicate about famous people or celebrities on at least one social platform.

A group of fans who express interest in the same subject or persona is referred to as a ‘fandom’ or ‘fan culture’. According to Henry Jenkins (2012), fan culture is a culture produced by fans and other amateurs in addition to the official media product. Matt Hills’ meta-analysis of fan culture pointed to a theoretical shift: a move towards a general theory of media fandom, rather than theory focused on fans of one particular media product (Hills, 1971). This study follows that trend in general terms and does not focus on a specific fandom.

With technological advances, the fan can quickly and actively pursue parasocial relationships. Although PSI is traditionally considered one-sided, it can seemingly lead people to respond behaviorally as if they were in an actual social relationship. PSI may, to some degree, be more active than passive, and the shift from passive to active might lead to a wider range of manifestations of PSI social behavior. Active fans participate in creating PSI opportunities; their actions and interactions suggest that they pursue a parasocial relationship that supports their preconceived image of their preferred celebrity (Sanderson, 2008).

Developing and expressing strong emotions, or “crushes”, toward well-known media figures is nothing new. Documented “crushes” go as far back as 1930, with the likes of Greta Garbo being the object of affection (Blumer, 1933). Soldiers sent Donna Reed love letters with drawings of broken hearts during WWII (Rother, 2009); a more recent example is Elvis Presley (Fraser & Brown, 2002). Scholars have found that expressed affection among fans is correlated with higher relational satisfaction (Floyd et al., 2005) as part of relational maintenance (Sanderson, 2009).

Many media figures compete for audiences' attention, but some attract more fans than others. One reason audiences might develop a relationship with one celebrity and not another is sex appeal. Stever's (1991) study of celebrity appeal found that sex appeal is one of four factors that influence audiences' affection for a particular personality. Rubin, Perse, and Powell (1985), who developed the PSI Scale to measure the extent to which television viewers develop parasocial relationships with newscasters, found that respondents with high PSI scores tended to find their favorite performer socially attractive (Sanderson, 2008). This study focuses on the textual communication fans direct to celebrities on social media, without personal access to the fans. The celebrity's social attractiveness in the eyes of fans is measured by fans' textual references to physical appearance.

Social interaction theorists contend that the human needs for affiliation and self-esteem are basic physiological needs that can only be satisfied through interaction with others (Stark, 1996). From this perspective, complimenting, acknowledging, and expressing appreciation are ways of communicating reinforcement in a text-based medium. Those gestures fuel the development and maintenance of interpersonal interaction. In the field of Language and Social Interaction, the term ‘compliment’ commonly refers to complimenting as ‘sequences of interactions, the social actions of both complimenting and responding to compliments’. Complimenting sequences are organized as “adjacency pairs” (Schegloff & Sacks, 1973; Spickard, Heritage & Garfinkel, 1987). The ‘adjacency pair’ structure is a normative framework for actions: when one speaker produces a first pair part that initiates some course of action (for example, a compliment), the next speaker should immediately produce an appropriate second pair part (in this case, a compliment response). Compliments may appear in textual interaction with celebrities for two reasons. The first is parasocial interaction: parasocial interaction, like ordinary social interaction, includes compliments and acknowledgment as part of appropriate text-based communication. Second, compliments can be employed as an attention-grabbing tactic. Compliments are expected to be answered, and an appropriate response can be short, simple, and general (e.g., 'Thanks'), so there is an increased chance of eliciting an answer.

A fundamental aspect of PSI is that the interaction should be a natural one. When Horton and Wohl (1956) first operationalized PSI, they mentioned ‘direct address’ as a way to close the gap between the persona and the audience:

Sometimes the ‘actor’ – whether he is playing himself or performing in a fictional role – is seen engaged with others; but often he faces the spectator, uses the mode of direct address, talks as if he were conversing personally and privately. The audience, for its part, responds with something more than mere running observation; it is, as it were, subtly insinuated into the program’s action and internal social relationships and, by dint of this kind of staging, is ambiguously transformed into a group which observes and participates in the show by turns. (p. 225).

Later studies found that behavioral aspects of PSI are most likely to occur with a direct address by media figures (Rubin and Perse, 1987). Caughey’s (1984) work focuses, among other things, on the investments individuals put into their relationships with celebrities, and on their perceptions of the extent to which they and their idols share a special connection or bond. On social media, the celebrity interacts directly with the audience through text, without visible mediating agents such as outside editing, which elicits PSI. As mentioned before, social media have transferred parasocial interaction from the private sphere (e.g., privately writing fan mail) to the public sphere, which makes it possible to observe those interactions. Using direct address language may also be regarded as a manifestation of unfounded beliefs that fans have a special relationship or connection with the celebrity (McCutcheon, Lange & Houran, 2002). Alternatively, the use of direct address can be considered appropriate behavior on social networks, as one user "talks" to another.

Social media

Social media, inherently designed to facilitate human connections, have altered the way individuals interact with one another (Wallace, Wilson, & Miloch, 2011), influencing the very nature of communication and expression (Sutton, 2012). The illusion of “knowing” the persona is strengthened with every new piece of public information the fan receives, more so when the information is of a private nature. One can assume that the perceived personal access that social media present will reinforce the misconception of closeness to the persona. Research exploring people's expression of parasocial interaction on a sports personality's blog found that the use of information and communication technologies is reconfiguring parasocial relationships, as fans take an active role in soliciting and communicating (Sanderson, 2008).

While online social networks democratize the distribution of thoughts and, to some degree, fame, previous research has shown that, in terms of number of followers, the top users on Twitter are mostly celebrities and people who receive mass media attention (Kwak, Lee, Park, & Moon, 2010). Social network sites (SNS) give fans the sense of actually “being there” with the celebrity. As such, they are possibly the most intimate form of media communication used nowadays by celebrities to connect with their fans (Stever and Lawson, 2013). From the fan's perspective, computer-mediated communication (CMC) and SNS platforms have increased the possibility that a celebrity will respond to one of his fans and, if that interaction happens, it will be public. While celebrities use Twitter and Instagram to reach out to fans and make their relationship more “real”, fans still face the same restricted access to this person that they have always had (Stever and Lawson, 2013). According to Frederick, Lim, Clavio and Walsh (2012), even though social platforms have opened many doors, mediating barriers still exist. The media user cannot see, touch, or have a face-to-face conversation with the celebrity. Therefore, traditional mediated interaction patterns may still manifest.

The social network platforms Twitter and Instagram both fall into the category of microblogging. These services provide an easy form of communication, through text and pictures, for sharing day-to-day activities, opinions, etc. (Boyd, 2014). Both services employ a social-networking model based on "following": users can follow any other user without needing permission from the followed user (unlike friendship on Facebook, which requires approval). A 'following' relationship does not require reciprocation as long as the profile is public. The design of these social network platforms means that the process of following a close friend is the same as that of following a celebrity.

This study set out to examine interaction between fans and celebrities on contemporary forms of communication and social network platforms. Communication directed towards celebrities is compared with interaction on those platforms directed to peers.

Instagram is structured so that public communication between users occurs through comments on posts. A post uploaded by user X contains a photo, possibly with a text caption. Other users who choose to follow user X receive that post and are able to interact with its content. Interaction with the post can take the form of a ‘like’ (simply by using the proper button) or a comment, which is a textual reaction to the post. This study looks at two modes of communication. The first, ‘Fan to Celebrity’, covers fans' comments on posts made by a celebrity. In this mode, user X is one of the well-known personas chosen for this study. The second mode, ‘Peer to Peer’, covers comments made on the platform from one ‘regular’ user to another. In this mode, the writer of the post can be any Instagram user, with no specific distinction.

In order to explore communication through social networks more generally, a second platform is examined in the study: Twitter. On Twitter, public communication between users works differently than on Instagram. To interact with a specific user on Twitter, one needs to use that user's username preceded by an ‘at’ (@) symbol. For example, in order to interact with a user whose username (or Twitter handle) is ‘EXAMPLE’, the tweet needs to contain @EXAMPLE as part of the text.
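As an illustration of the addressing convention described above, the following minimal Python sketch extracts the handles mentioned in a tweet. It is not the collection code used in this study (the tweets were gathered with DMI-TCAT); the function names are hypothetical, and the pattern assumes that a handle consists of 1 to 15 letters, digits, or underscores.

```python
import re
from typing import List

# Assumption: a handle is 1-15 letters, digits, or underscores. The "@" must
# not directly follow a word character (that would be an e-mail address), and
# the handle must not be followed by further word characters.
MENTION = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9_]{1,15})(?![A-Za-z0-9_])")


def mentions(tweet: str) -> List[str]:
    """Return the handles addressed in a tweet, without the leading '@'."""
    return MENTION.findall(tweet)


def is_directed_at(tweet: str, handle: str) -> bool:
    """True if the tweet mentions the given handle (case-insensitive)."""
    return handle.lower() in (m.lower() for m in mentions(tweet))


print(mentions("I love you @EXAMPLE, please follow me!"))
print(is_directed_at("great show tonight @example", "EXAMPLE"))
```

The case-insensitive comparison is a design choice of this sketch, made so that a fan who types a handle in a different case is still counted as addressing the same account.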

Social media are speculated to promote, and act as a vessel for, PSI behaviors. This study sets out to show that, when users interact with celebrities, certain social behaviors will be heightened in comparison to “regular” interactions, as a way to compensate for the one-sidedness of the parasocial relationship and to attract the celebrity's attention.

The following hypotheses are proposed:

H1: Comments on posts made by a celebrity will have more expression of affection and love than comments made on posts by peers.

H2: There will be more compliments in comments made on posts by celebrities than on posts made by peers.

H3: Comments on a post made by a celebrity will have more reference to physical appearance than comments on posts made by peers.

H4: Comments on a post made by a celebrity will have more usage of direct address than comments on posts made by peers.

Hypotheses H1-H4 refer to all celebrities as one class. Cohen (1999) based his typology on the idea that different types of media figures affect parasocial interactions differently. The following hypotheses explore how different characteristics of celebrities might affect fans' interactions with them.

Auter (1992) concluded that relationships with media figures who address the audience directly, e.g., newscasters, presenters, and comedians, result in higher PSI. In most western cultures, idolized figures originate from sport, entertainment, and music. These domains receive broad exposure in the mass media and are further accessible via films, television, video, sport events, and concerts. Past studies demonstrated that PSI occurs between fans and professional athletes (Brown & Basil, 1995). YouTubers are considered ‘authentic’ celebrities, who share more of their personal life with the audience and seem more accessible and relatable (Smith, 2016). On social platforms, Stever and Lawson (2013) found variation among celebrities in the way Twitter was used. This study examines a variety of celebrities from several domains, such as singers, sports stars, and YouTubers.

Giles and Maltby (2004) found that emotional autonomy and attachment to celebrities increase during adolescence with a negative correlation between attachment to parents and celebrities, illustrating that as parents become de-idealized for adolescents, media persons take over some of the functions that parents had fulfilled in childhood (Cramer, 2001).

Given that audiences tend to gravitate toward figures who are similar to them and with whom they can identify, this study assumes that there is a correlation between the age of a celebrity and the age of his fans. Therefore, the following hypothesis was formulated:

H5: The use of love expression in comments on posts made by celebrities will decrease as the celebrity’s age increases.

Studies focusing on objectification (Tiggemann & Lynch, 2001; Grogan, 2007) found that women are objectified more often than men. Therefore, the following hypothesis was formulated:

H6: Reference to physical appearance will be more present in comments made on posts by female celebrities than on posts made by male celebrities.

Because PSI varies across different classes of celebrities and across different uses of social networks, the following research question was formulated:

RQ1: Is there any connection between the reason a celebrity gained fame and the way fans interact with that celebrity on social networks?

Method

Social media textual communication is remarkably vast at present. Even when limited by restrictive parameters, the possible database is enormous. The sheer volume of textual communication dictated the big data aspect of the study. Given the focus on textual communication, content analysis was the method of choice, though limitations of time and money ruled out hand coding a large sample. Combining these factors, the hypotheses were tested using automated content analysis techniques, specifically supervised machine learning.

Sample

As mentioned above, this study investigates two kinds of relationships: Fan to Celebrity and Peer to Peer. All of the celebrities chosen for the sample earned their fame by appearing in the different media venues that Caughey (1984) listed. They all have accomplishments in their respective fields and publicity. One can argue that maintaining an active and public social media account can be seen as cultivating celebrity, as per Epstein's (2005) definition; all celebrities in the sample have an active public account on at least one of the researched platforms. In an attempt not to skew the results, the celebrities chosen vary in age, gender, number of followers, and the reasons they became famous. Thirty-eight celebrities were chosen as a sample of today’s popular English-speaking personas (see appendix A). Comments directed to them were collected for the Fan to Celebrity mode of communication (n_Instagram = 295,141; n_Twitter = 1,211,439).

Data for the second mode of communication, Peer to Peer, consisted of comments on posts made by regular users. To collect the data, a list was made of users with a public account who had commented as fans in the Fan to Celebrity mode. Comments on their posts were collected for the Peer to Peer sample (n = 1,217,206). Thus, a total of 2.7 million text messages were collected for this study.

Instagram comments were collected using the Instagram API (see appendix B and C for the code). Tweets containing the @user_name of the sampled celebrities were collected through the Twitter streaming API using DMI-TCAT (Borra & Rieder, 2014).

Supervised machine learning

Supervised machine learning is apt for coding implicit variables in a large dataset when it is possible to hand code a small subset, but the full dataset of interest is too large for manual coding (Boumans & Trilling, 2016). Three classification methods were tested: support vector machines, logistic regression, and Naïve Bayes. The Naïve Bayes classifier was chosen because it performed best with a training set containing a small number of cases for some of the variables. To train the classifier, a codebook was constructed, operationalizing three speculated PSI behaviors: (1) affection / admiration; (2) attention seeking / intimacy seeking / recognition seeking; and (3) impressing peers. In order to capture the language and terms used on social platforms, the variables were operationalized with careful attention to how the different behaviors are expressed online. All variables are binary: a comment either contains the given parasocial behavior or does not. For example, the variable ‘physical appearance’ is defined as a comment that mentions the way its object looks (e.g., hot, cute, gorgeous) or focuses on a specific body feature (e.g., hypnotizing eyes). The explicit behavior of requesting attention was operationalized as three possible actions witnessed in online communication: (1) asking for virtual attention (e.g., a like on various social networks); (2) presenting private contact details (e.g., a phone number) in the hope that the celebrity will use them for personal connection; and (3) asking for interaction in real life. For the full codebook, see appendix D.
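To make the classification step more concrete, the sketch below trains a small multinomial Naïve Bayes classifier on a handful of hand-coded comments for one binary variable. It is a self-contained illustration written in plain Python, not the scikit-learn code used in this study; the class name, the whitespace tokenization, the add-one smoothing, and the example comments are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def _log_score(self, tokens, label):
        total_words = sum(self.word_counts[label].values())
        n_docs = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / n_docs)  # log prior
        for tok in tokens:
            if tok in self.vocab:  # words never seen in training are ignored
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
        return score

    def predict(self, text):
        tokens = text.lower().split()
        return max(self.class_counts, key=lambda c: self._log_score(tokens, c))


# Hypothetical hand-coded comments: 1 = contains a love expression, 0 = does not.
train_texts = [
    "i love you so much", "love you queen", "marry me i love you",
    "nice goal last night", "when is the new album out", "first comment",
]
train_labels = [1, 1, 1, 0, 0, 0]

clf = TinyNaiveBayes().fit(train_texts, train_labels)
print(clf.predict("love you"))        # coded as containing a love expression
print(clf.predict("new album when"))  # coded as not containing one
```

One such binary classifier would be trained per codebook variable, which mirrors the binary coding described above.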

A subset of 10,000 comments was used as a training sample. The training sample was composed of comments directed to all 38 celebrities, to avoid biasing the classifier training toward one particular fan group. Ten percent of the hand-coded subset was coded by two coders to ensure the reliability of the codebook (for intercoder results, see appendix E).
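The text does not state which agreement coefficient was computed for the double-coded comments. As a hedged illustration only, the sketch below computes two common choices for a binary variable, simple percent agreement and Cohen's kappa, on hypothetical codes from two coders; neither the data nor the choice of coefficient is taken from the study.

```python
def percent_agreement(coder_a, coder_b):
    """Share of items on which both coders assigned the same code."""
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    categories = set(coder_a) | set(coder_b)
    p_expected = sum(
        (coder_a.count(c) / n) * (coder_b.count(c) / n) for c in categories
    )
    # Note: undefined (division by zero) when p_expected equals 1.
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical codes for ten comments (1 = behavior present, 0 = absent).
coder_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
coder_2 = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

print(percent_agreement(coder_1, coder_2))
print(round(cohens_kappa(coder_1, coder_2), 3))
```

Kappa is lower than raw agreement because two coders who mostly assign the code 0 would agree often by chance alone, which matters for behaviors that occur in only a few percent of comments.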

The Python scikit-learn machine learning library (Pedregosa et al., 2011) was used for training and testing the classifiers. For a binary classifier, precision is the fraction of the selected cases that are truly relevant, while recall is the fraction of all relevant cases that have been identified. When testing and assessing a classifier, a tradeoff had to be made between precision and recall. A decision was made to prefer precision, and precision therefore received a higher weight when the thresholds for qualifying variables were set.
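The two measures defined above can be illustrated with a few lines of plain Python. This is a minimal sketch on hypothetical hand codes and classifier output, not the evaluation code of the study; the function name is an assumption.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (1) of a binary variable."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly selected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # wrongly selected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical hand codes (y_true) and classifier output (y_pred).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

precision, recall = precision_recall(y_true, y_pred)
print(round(precision, 2), round(recall, 2))
```

In this example the classifier selects three comments, two of them correctly, and misses two of the four relevant comments. A classifier tuned for precision, as preferred here, accepts such misses in exchange for fewer false positives.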

To ensure the validity of the study, only variables that passed the following thresholds were included: intercoder agreement of over 0.7, precision of over 0.75, and recall of over 0.5 (see appendix E and F for intercoder and machine learning results). The four remaining variables, ‘love expression’, ‘compliments’, ‘reference to physical appearance’, and ‘direct address’, were coded automatically across the entire dataset (N = 2,723,786) using the trained classifier.

Results

Because a binary classifier was used, each variable's mean represents the proportion of cases that include the measured behavior in each interaction condition (see Table 1). For example, love expressions occur in 3% of the collected messages directed at celebrities on Twitter. Though the behaviors were measured in a small share of cases, ranging from 0.1% to 11%, it is important to understand the magnitude of these social platforms. Those 3% of love expressions shared on Twitter represent 363,430 expressions of affection directed, over the course of two weeks, towards the sampled celebrities alone. That means an average of 740 love expressions directed to a single celebrity on Twitter on a daily basis. The hypotheses were tested to examine whether parasocial interaction occurs.
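The reading of a binary variable's mean as a proportion can be made explicit with a short sketch. For a variable coded 0 or 1, the mean equals the proportion p of cases coded 1, and the (population) standard deviation equals the square root of p(1 - p). The data below are hypothetical and chosen only to illustrate the relation.

```python
import math


def binary_mean_sd(codes):
    """Mean and population standard deviation of a 0/1 coded variable."""
    p = sum(codes) / len(codes)          # mean = proportion of 1s
    return p, math.sqrt(p * (1 - p))     # SD of a Bernoulli variable


# Hypothetical coding of 100 comments, 4 of which contain the behavior.
codes = [1] * 4 + [0] * 96
mean, sd = binary_mean_sd(codes)
print(round(mean, 2), round(sd, 2))
```

With samples as large as those in this study, the difference between the population and sample formulas for the standard deviation is negligible.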

Table 1

Mean and standard deviation by communication mode and platform

                       Fan to Celebrity                 Peer to Peer
                       Twitter         Instagram
                       M       SD      M       SD       M       SD
Love Expressions       0.03    0.15    0.04    0.2      0.02    0.15
Compliments            0.01    0.1     0.08    0.28     0.07    0.25
Physical Appearance    0.001   0.14    0.03    0.18     0.02    0.14
Direct Address         0.09    0.28    0.11    0.31     0.07    0.26

N = 2,723,786


Comparison between modes of communication

Love expression in Instagram comments. In order to test H1 (“Comments on posts made by a celebrity will have more expression of affection than comments on posts made by peers”), a chi-square test of independence was performed to examine the relation between love expressions and mode of communication. The relation between these variables was significant, χ²(1, N = 1,512,347) = 3065, p < .001. On Instagram, comments on posts made by celebrities are slightly more likely to contain love expressions than comments on posts made by peers, ϕ = -0.05, p < .001. In addition to the difference in means between the modes of communication, there is also a difference in variance, as illustrated in Figure 1. While communication to peers seems to stay constant, the Fan to Celebrity mode shows a range in the degree of interaction. As will be presented later in this section, characteristics of the targeted celebrity influence the interaction. The variety of celebrities included in this sample could explain the wide variance in the Fan to Celebrity mode.
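For readers unfamiliar with the test used above, the following sketch computes the Pearson chi-square statistic (without continuity correction) and the phi coefficient for a 2 x 2 table of communication mode by presence of a behavior. The counts are hypothetical and are not the study's data; the function name is an assumption.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square and phi for the 2 x 2 table

                      behavior present   behavior absent
        mode 1              a                  b
        mode 2              c                  d
    """
    n = a + b + c + d
    det = a * d - b * c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * det ** 2 / denom
    phi = det / math.sqrt(denom)  # the sign depends on row and column order
    return chi2, phi


# Hypothetical counts: 40 of 1,000 comments in mode 1 and 20 of 1,000
# comments in mode 2 contain the behavior.
chi2, phi = chi_square_2x2(40, 960, 20, 980)
print(round(chi2, 2), round(phi, 3))
```

Because the sign of phi only reflects how the rows and columns are ordered, the direction of an effect has to be read from the group proportions, while the absolute value of phi indicates its size.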

Figure 1. Mean of the variable ‘love expression’ compared between modes of communication. This figure contains data from one platform – Instagram.


Compliments in Instagram comments. In order to test the second hypothesis, H2 (“There would be more compliments in comments made on posts by celebrities than on posts made by peers”), a chi-square test of independence was performed to examine the relation between compliments and the mode of communication. The relation between these variables was significant, χ²(1, n = 1,512,347) = 712.9, p < .001. On Instagram, comments on celebrities' posts are slightly more likely to contain compliments than comments on posts made by peers, ϕ = -0.02, p < .001. The differences and distribution are similar to those in Figure 1, but for reasons of conciseness, the figures for this and the following hypotheses were omitted from the text. All figures can be found in Appendix K (see Figure K1).

Reference to physical appearance in Instagram comments. In order to test H3 (“Comments on a post made by a celebrity will contain more references to physical appearance than comments on posts made by peers”), a chi-square test was performed to examine the relation between references to physical appearance and the mode of communication. The relation between these variables was significant, χ²(1, n = 1,512,347) = 1226.4, p < .001. On Instagram, comments on posts made by celebrities are marginally more likely to contain references to physical appearance than comments on posts made by peers, ϕ = -0.03, p < .001 (see Figure K2).

Direct address language in Instagram comments. In order to test H4 (“Comments on a post made by a celebrity will have more usage of direct address language than comments on posts made by peers”), a chi-square test of independence was performed to examine the relation between direct address and the mode of communication. The relation between these variables was significant, χ²(1, n = 1,512,347) = 4497.1, p < .001. On Instagram, comments on posts made by celebrities are slightly more likely to contain direct address language than comments on posts made by peers, ϕ = -0.05, p < .001 (see Figure K3).
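The relation between the χ² statistics and the ϕ coefficients reported in this subsection can be checked with a short calculation. The following is a minimal sketch and not the analysis code used in the study; the function names `chi2_2x2` and `phi_from_chi2` and the example cell counts are assumptions made for illustration. It uses the standard identity |ϕ| = √(χ²/N) for a 2 × 2 table without continuity correction.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and signed phi
    for the 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    diff = a * d - b * c
    chi2 = n * diff ** 2 / denom
    phi = diff / math.sqrt(denom)
    return chi2, phi


def phi_from_chi2(chi2, n):
    """Magnitude of phi recovered from a reported chi-square and sample size."""
    return math.sqrt(chi2 / n)


# Chi-square values reported for H1-H4 on Instagram (N = 1,512,347).
N_INSTAGRAM = 1512347
reported = {"H1": 3065, "H2": 712.9, "H3": 1226.4, "H4": 4497.1}
for name, chi2 in reported.items():
    print(name, round(phi_from_chi2(chi2, N_INSTAGRAM), 3))

# Hypothetical cell counts, only to show the table-based computation.
print(chi2_2x2(10, 20, 30, 40))
```

Under these assumptions the recovered magnitudes fall roughly between 0.02 and 0.05, in line with the coefficients reported above; the sign of ϕ depends only on how the rows and columns of the table are ordered.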

Comparison between platforms

Within the “Fan to Celebrity” mode, differences between platforms were observed as well. Very small but significant differences were found between Instagram and Twitter with regard to love expression and direct address. Love expressions are used slightly more often when fans communicate with celebrities on Instagram than on Twitter, χ²(1, n = 295,141) = 1279, p < .01, ϕ = -0.03, p < .01. Direct address is used slightly more often when fans communicate with celebrities on Instagram than when they communicate on Twitter, χ²(1, n = 1,211,439) = 1220.9, p < .01, ϕ = -0.03, p < .001 (see Figure L1 and Figure L2 in Appendix L).

A larger effect was found for compliments and references to physical appearance in communication with celebrities on the different platforms. Compliments are used moderately more often when fans communicate with celebrities on Instagram than on Twitter, χ²(1, n = 295,141) = 51289, p < .01, ϕ = -0.18, p < .001 (see Figure 2). References to physical appearance are used moderately more often when fans communicate with celebrities on Instagram than on Twitter, χ²(1, n = 1,211,439) = 30275, p < .01, ϕ = -0.14, p < .001 (see Figure 3).

Figure 2. Mean of the variable ‘Compliments’ compared between platforms. This figure contains data from one mode of communication - Fan to Celebrity.

Figure 3. Mean of the variable ‘physical appearances’ compared between platforms. This figure contains data from one mode of communication - Fan to Celebrity.

Comparison between celebrities

Hypotheses five and six speculated that different characteristics of the targeted celebrities might influence fans' interaction with them. Those hypotheses were checked for both platforms, Instagram and Twitter.

Age of celebrity. In order to test H5 (The use of love expression in comments on posts made by celebrities will decrease with the increase of the celebrity age), a simple linear regression was calculated. The age of the celebrity was found to significantly predict fans' usage of love expression, both on Instagram, b* = -.001, p < .001, R² = .002, F(1, 295,139) = 705.8, p < .001 (see Figure 4), and on Twitter, b* = -.001, p < .001, R² = .0002, F(1, 1,211,437) = 328.6, p < .001 (see Figure 5).
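How little variance these regressions explain can be illustrated by recovering R² from the reported F statistics. This is a sketch for illustration only, not part of the original analysis; the helper name `r_squared_from_f` is an assumption. It uses the standard relation F = (R²/df1) / ((1 - R²)/df2), which rearranges to R² = F·df1 / (F·df1 + df2).

```python
def r_squared_from_f(f_value, df_model, df_resid):
    """Share of explained variance implied by an F statistic
    and its degrees of freedom."""
    explained = f_value * df_model
    return explained / (explained + df_resid)


# F statistics and degrees of freedom as reported for H5.
r2_instagram = r_squared_from_f(705.8, 1, 295139)
r2_twitter = r_squared_from_f(328.6, 1, 1211437)

print(f"Instagram: R^2 = {r2_instagram:.4f}")
print(f"Twitter:   R^2 = {r2_twitter:.5f}")
```

Both values are well below one percent of explained variance, which agrees (up to rounding) with the R² values reported above.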

Figure 4. Mean of the variable ‘love expression’ by celebrity age. This figure contains data from one platform – Instagram.


Figure 5. Mean of the variable ‘love expression’ by celebrity age. This figure contains data from one platform – Twitter.

Gender. In order to test H6 (Reference to physical appearance will be more present in comments made on posts by female celebrities than on posts made by male celebrities), a chi-square test of independence was performed to examine the relation between reference to physical appearance and the gender of the celebrity. The relation between these variables was significant on both platforms. On Instagram, comments on posts made by female celebrities (M = 0.04, SD = 0.13) are slightly more likely to contain references to physical appearance than comments on posts made by male celebrities (M = 0.02, SD = 0.2), χ²(1, n = 295,141) = 1438, p < .01, ϕ = -0.07, p < .01. A smaller but similarly significant effect was found between female (M = 0.003, SD = 0.06) and male (M = 0.0009, SD = 0.03) celebrities on Twitter, χ²(1, n = 1,211,439) = 911.54 .1, p < .01, ϕ = -0.03, p < .01 (see Figure 3).

Fame domain. In order to answer RQ1, an analysis of variance (ANOVA) was conducted. It showed that love expression directed at celebrities differs significantly among the types of celebrities, F(3, 295,137) = 502.2, p < .001. Post hoc analyses using the Tukey HSD test indicated that the average number of love expressions directed at singers or actors was significantly different from the number directed at sports celebrities and YouTube personas. Given those results, the celebrities were divided into two broader categories: ‘entertainers’ and ‘non-entertainers’.

A chi-square test of independence was performed to examine the relation between love expressions and whether they were directed at ‘entertainers’ or ‘non-entertainers’. The relation between these variables was significant on both platforms. On Instagram, comments on posts made by entertainers (M = 0.05, SD = 0.22) are more likely to contain expressions of love than comments on posts made by non-entertainer celebrities (M = 0.02, SD = 0.15), χ²(1, n = 295,141) = 335.24, p < .001, ϕ = 0.02, p < .001. On Twitter, the difference between entertainers (M = 0.03, SD = 0.17) and non-entertainer celebrities (M = 0.02, SD = 0.14) was smaller, χ²(1, n = 1,211,439) = 1381.1, p < .001, ϕ = 0.07, p < .001 (see Figure L1).

Discussion and Conclusions

This study set out to explore online textual communication in different modes and on different platforms, looking at the tactics commenters employ from a parasocial perspective. As shown in the results section, all hypotheses are supported when statistical significance is used as the only indication. With a large sample, as in this study, the high N can lead to p values close to zero even when the effect is very small (Trilling, accepted for publication). For that reason, other criteria need to be applied to determine whether the hypotheses are supported. Kramer, Guillory and Hancock (2014), in their study of emotional contagion through Facebook, concluded that given the massive scale of social networks, even small effects can have large aggregated consequences. They emphasized this conclusion by arguing that their effect size of d = 0.001 would have corresponded to hundreds of thousands of emotion expressions in Facebook status updates per day. Twitter and Instagram are social networks of substantial size as well: as of the first quarter of 2016, Twitter averaged 310 million monthly active users (The Statistics Portal, 2016), and according to Instagram's official statistics, the platform has over forty million active users who produce an average of eighty million photos per day (Instagram, 2016). In interpreting the results of this study, a decision was made to err on the side of caution. A traditional effect size threshold of 0.1 (Cohen, 1988) was used to determine whether to accept or reject each hypothesis. However, following the interpretation of earlier Big Data studies of social behavior on social networks (Bond et al., 2012; Kramer et al., 2014), smaller effect sizes were inspected and interpreted as well.
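The decision rule described in this paragraph, significance combined with a minimum effect size, can be sketched as follows. The function name `is_supported` and the default `alpha` of 0.05 are assumptions made for illustration; only the 0.1 threshold and the listed coefficients come from the text. The sketch checks the magnitude of the effect only and does not test its direction.

```python
def is_supported(p_value, effect_size, alpha=0.05, min_effect=0.1):
    """Return True only if the result is statistically significant AND
    the absolute effect size reaches the minimum threshold."""
    return p_value < alpha and abs(effect_size) >= min_effect


# (label, upper bound on the reported p value, reported phi)
results = [
    ("H1 love expression, Instagram", 0.001, -0.05),
    ("H2 compliments, Instagram", 0.001, -0.02),
    ("Compliments, Instagram vs. Twitter", 0.001, -0.18),
]

for label, p, phi in results:
    verdict = "meets threshold" if is_supported(p, phi) else "below threshold"
    print(f"{label}: {verdict}")
```

With very large samples almost every comparison passes the significance check, so in practice the effect size criterion is what separates the results.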

Predominantly, in this study, one mode of communication, Fan to Celebrity, shows more characteristics of PSI behavior than the Peer to Peer mode of communication (see Table 1). Each variable (i.e. behavior) is discussed further below with reference to the different platforms and the characteristics of the celebrities.

The first behavior to be looked at is love expression. H1 was not supported as the effect size was lower than the threshold. Percentage-wise, comments directed towards celebrities, on both platforms, contained more love expression than comments directed to peers. These results are consistent with a long tradition of love expressed towards famous people, be that by fan mail, screaming fans at music concerts or holding up signs at sports events (Rother, 2009).

H5 inspected the relation between the age of the celebrities and the amount of love expression directed at them in online communication. The graphs suggest a negative linear relationship between age and love expression on both platforms (see Figure 4 and Figure 5). H5 was not supported due to the low effect size. The direction of the results, with younger celebrities receiving expressions of love more often than older celebrities, is consistent with the studies of Giles and Maltby (2004) and Stever (2009), who found that emotional attachment to celebrities increases during adolescence, if one assumes that the age of the fans progresses along with that of the celebrities because of similarity and identification.

Affection expressed towards a celebrity seems to be affected by the domain of fame for which the celebrity is known. RQ1 cannot be answered definitively due to the low effect size; on both platforms, entertainers (i.e. actors and singers) received more love expression.

An additional behavior is referring to physical appearance. Hypothesis 3 predicted that references to physical appearance would be more frequent when directed at celebrities than at peers. The hypothesis was not supported, due to the small effect size. This suggests that a parasocial relationship is not the main reason for using physical appearance references in online textual communication; other reasons might contribute to the use of that sort of language. The statistical comparison between platforms, which is both significant and meaningful, strengthens the conclusion that reference to physical appearance has less to do with PSI and more to do with the nature of a particular social platform (see Figure 3). Instagram is a visual social network, and this study examined the comments posted in response to its photos. It is therefore understandable that reactions to photographs, many of which show a human body, elicit more remarks regarding physical appearance than posts on a mainly text-based platform like Twitter.

A similar behavior examined in this study is the use of compliments. Like expressions of love, compliments are theorized as an intimate act that communicates closeness and can be used to elicit a response. H2 was rejected as the effect size was almost zero. The behavior of this variable closely resembles that of physical appearance. As with physical appearance, while the original hypothesis was not supported, significant and meaningful


two possible reasons: First, as mentioned with regard to physical appearance, the photographic nature of Instagram can encourage complimentary remarks. Second, aside from their physique, users share photos of their food, homes, pets, etc. (Hu, Manikonda & Kambhampati, 2014), all of which can be complimented.

Hypothesis 6 looked at gender, anticipating that female celebrities would receive more attention regarding physical appearance. As mentioned above, comments with reference to physical appearance are a rarity on Twitter; even so, female celebrities are on the receiving end of three times more comments that contain references to physical appearance than male celebrities. Hypothesis 6 was not supported on either platform. Be that as it may, the direction of the results suggests that using reference to physical appearance in comments might have less to do with PSI and more with social norms that lead to women being the subject of most of them. One explanation could be, as theorized, that women are objectified more often than men. Another explanation might be that while female commenters feel comfortable referring to both male and female celebrities, most male commenters do not feel comfortable referring to the male physique in a public forum. More research is needed to explain the discrepancy definitively.

The final behavior examined is the use of direct address language. Direct address is theorized, among other things, as a way to create and reinforce intimacy. Hypothesis 4 speculated that comments directed at celebrities would contain more direct address language than comments directed at peers. The hypothesis was not supported, though communication with celebrities on both platforms contains more direct address language than peer communication. That could suggest that using direct address language in textual PSI is equivalent to talking back to a figure on the screen, as Horton and Wohl (1956) originally suggested. The direct appeal through text is a direct response to the unmediated communication occurring through the different platforms. The higher absolute use of direct address when communicating with celebrities suggests that it could be an attention-seeking tactic.

Online social media has, in a way, leveled the playing field. Technically, users can communicate with each other equally, whether celebrity or peer. There are no mediating agents between celebrities and their fans as in other fields of interaction. This study set out to explore fan-to-celebrity interaction in these new circumstances. Though the hypotheses are not confirmed by the statistical values, all measured behaviors were more present in one mode than in the other, indicating that on social media, interaction with celebrities differs from interaction with peers even when the two are structurally the same. The source of that difference is assumed to be social media users' increased use of social cues when communicating with celebrities, as a way of keeping a mostly one-sided relationship alive. The hypotheses, which suggest a higher usage of social behaviors, are not supported by this study, but the direction of the results should be explored in future research: since all social behaviors were higher when interacting with celebrities, the pattern does not seem to be random. Had the hypotheses been supported, a new sub-theory of PSI on social platforms could have been suggested, such as a relationship of a more ‘casual’ type. This study found that a single celebrity Twitter user receives an average of approximately 3000 textual interactions per day that include parasocial behavior. Easy accessibility has made it easier to engage with celebrities and has increased the number of fans participating in parasocial relations. But the lower investment demanded from fans might have diluted the strong PSI attachment described by Horton and Wohl more than half a century ago.


This study cast a wide net to examine the new interaction, exploring interactions with over thirty celebrities from different fields and of various ages. The findings are not strong enough to support a definite conclusion, but they do suggest an interesting phenomenon that is worth exploring further. Parasocial interactions were found to be stronger with entertainers and with younger individuals, so support for the hypotheses might be found by looking at interaction with young entertainers.

Limitations and future research

A successful social media site produces large quantities of content every day. As mentioned above, Instagram users share 80 million new images every day, with each picture generating additional content in the form of textual comments. This study examined 2.7 million unique textual responses, a minuscule share of the textual communication occurring on social media.

The study tried to observe parasocial behavior through textual comments. The surprising outcome of this study is the small percentage of comments in which parasocial behavior was present. A casual perusal of social media gives the illusion that it occurs more often, which led to the idea for the study in the first place. This discrepancy has a few possible explanations. The first might be a psychological one, regarding recognition memory. Recognition memory, a subcategory of declarative memory, is the ability to recognize previously encountered events. Once this sort of comment has been noticed on social media, further encounters of that sort are perceived as more prominent than other comments.

This study is the first to use supervised machine learning (SML) to research parasocial interaction on social platforms. Although there is reason to believe that SML methods have a place in this research field, some improvements to the process could be made in further research. The small percentage of relevant cases affected the efficiency of the classifier training, which resulted in low recall. That, in turn, caused an even lower number of cases to be coded than are actually present in the database. A training sample of 10,000 cases was used to train the classifier; the sample size was chosen according to the standards of previous studies. In the training sample, the percentage of comments that contained the coded behaviors (variables) was low, ranging from 0.01% to 4.4% of the sample. The variable with the highest number of cases in the training sample had 398 cases. The small number of cases available for training affected the accuracy of the classifier and rendered some of the variables invalid for the study. The low recall score signifies that not all relevant cases in the sample were coded. The course of action chosen for this study was a top-down approach, with a set list of variables determined by theory and observation. Some of the proposed variables were found to be used rarely, as discussed above. A different kind of problematic variable were those that tried to capture more complex concepts such as “disclosure” and “closeness”. Most comments that fit the description of those variables consist of a long text (by social platform standards, long means more than 15 words), which affected the classifier's abilities: most of those cases were missed and not coded.
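The effect of rare positive cases on classifier evaluation can be made concrete with a small calculation. The counts below are hypothetical and chosen only for illustration; they are not results from this study, and the function names are assumptions. The sketch shows how a classifier for a rare behavior can reach high accuracy while still missing most relevant comments.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true positives, false positives
    and false negatives (0.0 when a denominator is empty)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def accuracy(tp, fp, fn, tn):
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical hold-out set: 2,000 comments, 80 of which (4%) contain
# the behavior. The classifier finds 30 of them and raises 10 false alarms.
tp, fn, fp, tn = 30, 50, 10, 1910

precision, recall = precision_recall(tp, fp, fn)
print(f"accuracy  = {accuracy(tp, fp, fn, tn):.3f}")
print(f"precision = {precision:.3f}")
print(f"recall    = {recall:.3f}")
```

In this hypothetical case most decisions are correct, yet well over half of the relevant comments go uncoded, which is the pattern the low recall scores described above point to.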

As can be seen in the introduction, social media networks are a fast-changing world. New and different venues are always being developed. At the time of writing, Twitter and Instagram are among the most popular social networks in the Western world, but new platforms like Snapchat are growing daily in popularity. The nature of the fan-celebrity relation may be reshaped by the rise of a new platform.

Using SML might help research keep pace with new platforms, since important information can be extracted with it. Future research should try to resolve the limitations presented above. The size of the training sample should be set by the number of relevant cases present and not by an absolute number. SML is suitable for straightforward concepts that can be simply defined, and less suitable for complex concepts and nuances. The research findings suggest that future studies would be wise to focus on celebrities who have young fans and celebrities from the entertainment field. These groups are the subject of more parasocial behavior, and additional research might reveal new and interesting connections. The study found a significant difference in interactional behavior between social media platforms; future studies should take this into consideration.
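The recommendation to size the training sample by the number of relevant cases can be turned into a simple planning calculation. This is a sketch under stated assumptions: the function name `required_training_size` is hypothetical, and using 398 positive cases as the target is only an illustrative choice based on the best-represented variable in this study. Prevalence is given as cases per 10,000 comments so that the arithmetic stays in integers.

```python
def required_training_size(target_positive_cases, positives_per_10k):
    """Number of comments to hand-code so that, at the given prevalence,
    about `target_positive_cases` relevant cases are expected."""
    if positives_per_10k <= 0:
        raise ValueError("prevalence must be positive")
    # Ceiling division using integers only.
    return -(-target_positive_cases * 10_000 // positives_per_10k)


# Prevalence range reported for the training sample: 0.01% to 4.4%,
# i.e. 1 to 440 relevant cases per 10,000 comments.
for per_10k in (440, 100, 1):
    size = required_training_size(398, per_10k)
    print(f"{per_10k:>3} per 10,000 -> code about {size:,} comments")
```

For the rarest variables, the required sample grows to millions of comments, which suggests that a fixed training sample of 10,000 cases cannot serve all variables equally well.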

Literature

Auter, P. J. (1992). Psychometric: TV that talks back: An experimental validation of a parasocial interaction scale. Journal of Broadcasting & Electronic Media, 36(2), 173–181. doi:10.1080/08838159209364165

Blumer, H. (1933). Movies and conduct (p. 51). New York: The Macmillan Company.

Bond, R. M., Fariss, C. J., Jones, J. J., Kramer, A. D., Marlow, C., Settle, J. E., & Fowler, J. H. (2012). A 61-million-person experiment in social influence and political mobilization. Nature, 489(7415), 295–298. doi:10.1038/nature11421

Borra, E., & Rieder, B. (2014). Programmed method: developing a toolset for capturing and analyzing tweets. Aslib Journal of Information Management, 66(3), 262–278. doi:10.1108/AJIM-09-2013-0094

Boumans, J. W., & Trilling, D. (2016). Taking stock of the toolkit: An overview of relevant automated content analysis approaches and techniques for digital journalism scholars. Digital Journalism, 4(1), 8–23. doi:10.1080/21670811.2015.1096598

Yale University Press. Psicologia Em Revista, 20(3). doi:10.5752/10054

Braudy, L. (1986). The frenzy of renown. New York & Oxford: Oxford University Press.

Brown, W. J., & Basil, M. D. (1995). Media celebrities and public health: Responses to ‘Magic’ Johnson's HIV disclosure and its impact on AIDS risk and high-risk behaviors. Health Communication, 7(4), 345–370. doi:10.1207/s15327027hc0704_4

Caughey, J. L. (1984). Social relations with media figures. In Imaginary social worlds: A cultural approach (pp. 31–76). Lincoln and London: University of Nebraska Press.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Vol. 2. Lawrence Erlbaum Associates, Hillsdale, NJ. doi:10.4324/9780203771587

Cohen, J. (1999). Favorite characters of teenage viewers of Israeli serials. Journal of Broadcasting & Electronic Media, 43(3), 327–345. doi:10.1080/08838159909364495

Cramer, P. (2001). Identification and Its Relation To Identity Development. Journal of Personality, 69(5), 667–688. doi:10.1111/1467-6494.695159

Digital Trends. (2016). The History of Social Networking. Retrieved from http://www.digitaltrends.com/features/the-history-of-social-networking/

Epstein, J. (2005). Celebrity culture. Hedgehog Review, 7(1).

Ferris, K. O. (2001). Through a Glass, Darkly: The Dynamics of Fan‐Celebrity Encounters. Symbolic Interaction, 24(1), 25–47. doi:10.1525/si.2001.24.1.25

Floyd, K., Hess, J. A., Miczo, L. A., Halone, K. K., Mikkelson, A. C., & Tusing, J. K. (2005). Human affection exchange: VII. Further evidence of the benefits of expressed affection. Communication Quarterly, 53, 285–303. doi:10.1080/01463370500101071

Fraser, B. P., & Brown, W. J. (2002). Media, Celebrities, and Social Influence: Identification With Elvis Presley. Mass Communication and Society, 5(2), 183–206.

Frederick, E. L., Lim, C. H., Clavio, G., & Walsh, P. (2012). Why we follow: An examination of parasocial interaction and fan motivations for following athlete archetypes on Twitter. International Journal of Sport Communication, 5(4), 481–502.

Giles, D. C., & Maltby, J. (2004). The role of media figures in adolescent development: relations between autonomy, attachment, and interest in celebrities. Personality and Individual Differences, 36(4), 813–822. doi:10.1016/s0191-8869(03)00154-5

Grogan, S. (2007). Body image: Understanding body dissatisfaction in men, women and children. Routledge.

Hills, M. (1971). Fan Cultures. doi:10.4324/9780203361337

Horton, D., & Wohl, R. R. (1956). Mass communication and para-social interaction: Observations on intimacy at a distance. Psychiatry, 19(3), 215–229.

Hu, Y., Manikonda, L., & Kambhampati, S. (2014, June). What We Instagram: A First Analysis of Instagram Photo Content and User Types. In ICWSM.

Instagram. (2016). Press News. Retrieved from https://www.instagram.com/press/?hl=en

Jenkins, H. (2012). Textual poachers: Television fans and participatory culture. Routledge.

Kramer, A. D. I., Guillory, J. E., & Hancock, J. T. (2014). Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences, 111(24), 8788–8790. doi:10.1073/pnas.1320040111

Kwak, H., Lee, C., Park, H., & Moon, S. (2010). What is Twitter, a social network or a news media? Proceedings of the 19th International Conference on World Wide Web - WWW ’10 (pp. 591–600). ACM. doi:10.1145/1772690.1772751

Maltby, J., & Giles, D. (2008). Toward the Measurement and Profiling of Celebrity Worship. A Psychological and Behavioral Analysis, 271–286.

McCutcheon, L. E., Lange, R., & Houran, J. (2002). Conceptualization and measurement of celebrity worship. British Journal of Psychology, 93(1), 67–87. doi:10.1348/000712602162454

Journal of Social and Personal Relationships, 16, 731–750.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., . . . Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.

Preston, P. (2001). Reshaping communications: Technology, information and social change. Sage. doi:10.4135/9781446222164

Rother, L. (2009). Dear Donna: A pinup so swell she kept GI mail. The New York Times.

Rubin, A. M., Perse, E. M., & Powell, R. A. (1985). Loneliness, parasocial interaction, and local television news viewing. Human Communication Research, 12, 155–180. doi:10.1111/j.1468-2958.1985.tb00071.x

Rubin, A. M., & Perse, E. M. (1987). Audience activity and soap opera involvement: A uses and effects investigation. Human Communication Research, 14, 246–268.

The Statistics Portal. (2016). Number of Monthly Active Twitter Users Worldwide From 1st Quarter 2010 to 1st Quarter 2016 (in Millions). Retrieved from http://www.statista.com/statistics/282087/number-of-monthly-active-twitter-users/

Tiggemann, M., & Lynch, J. E. (2001). Body image across the life span in adult women: the role of self-objectification. Developmental Psychology, 37(2), 243–253. doi:10.1037/0012-1649.37.2.243

Trilling, D. (accepted for publication). Big Data, Analysis of. In: Matthes, J. (ed.), International Encyclopedia of Communication Research Methods. Hoboken, NJ: Wiley.

Sanderson, J. (2008). “You are the type of person that children should look up to as a hero”: Parasocial interaction on 38pitches.com. International Journal of Sport Communication, 1, 337–360.

Sanderson, J. (2009). “You Are All Loved so Much”: Exploring Relational Maintenance Within the Context of Parasocial Relationships. Journal of Media Psychology, 21(4), 171–182. doi:10.1027/1864-1105.21.4.171

Schegloff, E. A., & Sacks, H. (1973). Opening up Closings. Semiotica, 8(4). doi:10.1515/semi.1973.8.4.289

Simmons, C. (2009). Dear radio broadcaster: fan mail as a form of perceived interactivity. Journal of Broadcasting & Electronic Media, 53(3), 444–459.

Smith, D. R. (2016). “Imagining others more complexly”: celebrity and the ideology of fame among YouTube’s “Nerdfighteria.” Celebrity Studies, 1–15. doi:10.1080/19392397.2015.1132174

Spickard, J., Heritage, J., & Garfinkel, H. (1987). Garfinkel and Ethnomethodology. Sociological Analysis, 48(2), 188. doi:10.2307/3711210

Spitzberg, B. H., & Cupach, W. R. (2008). Fanning the Flames of Fandom. A Psychological and Behavioral Analysis, 287–322. doi:10.1093/med:psych/9780195326383.003.0013

Stark, R. (1996). Religion as Context: Hellfire and Delinquency One More Time. Sociology of Religion, 57(2), 163. doi:10.2307/3711948

Stever, G. S. (1991). The Celebrity Appeal Questionnaire. Psychological Reports, 68, 859–866. doi:10.2466/pr0.1991.68.3.859

Stever, G. (2009). Parasocial and social interaction with celebrities: Classification of media fans. Journal of Media Psychology, 14(3), 1–39. doi:10.1037/e694892011-001

Stever, & Lawson (2013). Twitter as a way for celebrities to communicate with fans: Implications for the study of parasocial interaction. North American Journal of Psychology, 15(2), 339.

Sutton, W. A. (2012). Conclusion: What the future holds for sport marketing researchers and scholars. In N. L. Lough & W. A. Sutton (Eds.), Handbook of sport marketing research (pp. 419–426). Morgantown, WV: Fitness Information Technology.

Wallace, L., Wilson, J., & Miloch, K. (2011). Sporting Facebook: A content analysis of NCAA organizational sport pages and Big 12 athletic department pages. International Journal of Sport Communication, 4, 422–444.


Appendices

Appendix A

List of the celebrities included in the study and their relevant characteristics

Table A1

Celebrities in the study and their relevant characteristics

Name                Age  Type      Gender  Followers on Twitter  Followers on Instagram
Lizzy Greene        13   Actor     Female  ---                   578000
Maddie Ziegler      14   Singer    Female  1020000               5300000
Rowan Blanchard     15   Actor     Female  385000                3500000
Sabrina Carpenter   17   Actor     Female  743000                5500000
Kira Kosarin        18   Actor     Female  129000                1000000
Ariel Winter        18   Actor     Female  469000                1500000
Maisie Williams     19   Actor     Female  1210000               1900000
Jack Griffo         19   Actor     Male    2200                  955000
Dove Cameron        20   Actor     Female  104000                5100000
Zendaya             20   Actor     Female  6530000               21900000
Bethany Mota        20   Internet  Female  2840000               5500000
Laura Marano        21   Actor     Female  1800000               3300000
Ross Lynch          21   Singer    Male    2600000               2700000
Justin Bieber       22   Singer    Male    81000000              60200000
Debby Ryan          23   Actor     Female  4090000               4700000
Ariana Grande       23   Singer    Female  1000000               61600000
Selena Gomez        23   Singer    Female  45000000              67700000
Dan Howell          24   Internet  Male    2710000               2600000
Nick Jonas          24   Singer    Male    10000000              6000000
Ed Sheeran          25   Singer    Male    32000                 5700000
Taylor Swift        26   Singer    Female  77000000              68100000
Tyler Oakley        27   Internet  Male    5000000               6100000
Ronda Rousey        29   Sport     Female  2600000               7600000
Phil Lester         29   Internet  Male    2490000               2000000
Shaun White         29   Sport     Male    1500000               407000
Lady Gaga           30   Singer    Female  59000000              15100000
Katy Perry          31   Singer    Female  80000000              42100000
Lindsey Vonn        31   Sport     Female  443000                701000
LeBron James        31   Sport     Male    3120000               18600000
Cristiano Ronaldo   31   Sport     Male    42000000              50400000
Serena Williams     35   Sport     Female  1400000               2800000
Hank Green          36   Internet  Male    680000                382000


Appendix B

Python code 1

Automatically coded comments retrieved for the ‘Fan to Celebrity’ mode of communication, using the Instagram API.

# %% celeblist
katyperry = '407964088'
taylorswift = '11830955'
ladygaga = '184692323'
justinbieber = "6860189"
sabrinacarpenter = "8713286"
rowanblanchard = "7857420"
selenagomez = "460563723"
arianagrande = "7719696"
teddysphotos = "185546187"
nickjonas = "189396108"
zendaya = "9777455"
Rossr5 = "14124022"
dovecemeron = "145312309"
leomessi = "427553890"
cristiano = "173560420"
neymarjr = "26669533"
serenawilliams = "15503147"
lindseyvonn = "24551101"
shaunwhite = "244523904"
rondarousey = "29320272"
kingjames = "19410587"
lizzy_greene = "814327057"
kirakosarin = "54668139"
lauramarano = "33511106"
debbyryan = "9429520"
jackgriffo = "22613552"
jacenorman7 = "275635898"
maddieziegler = "194193935"
arielwinter = "12126220"
maisie_williams = "35306961"
bethanynoelm = "11432541"
danisnotonfire = "12017431"
amazingphil = "33142819"
johngreenwritesbooks = "343005985"
tyleroakley = "17448582"

from urllib.request import urlopen
import json

(39)

import csv

ACCESSTOKEN="766984346.8207a31.c35c671150144dd9bd2a3e3773f5f46f"
VIPIDS=[taylorswift]

import string
printable = set(string.printable)

commentlist=[]
cleancomments=[]
usernamelist=[]
fullnamelist=[]
idlist=[]
commenteridlist=[]
picturelevel_id=[]
picturelevel_user=[]

for vipid in VIPIDS:
    # first page of the account's recent media
    vipdata=urlopen("https://api.instagram.com/v1/users/"+vipid+"/media/recent/?access_token="+ACCESSTOKEN).read()
    vipdatadict=json.loads(vipdata.decode("utf-8"))
    for picture in vipdatadict['data']:
        thispicture_id=picture['id']
        thispicture_user=picture['user']
        for comment in picture['comments']['data']:
            commentlist.append(comment['text'])
            usernamelist.append(comment['from']['username'])
            fullnamelist.append(comment['from']['full_name'])
            commenteridlist.append(comment['from']['id'])
            idlist.append(comment['id'])
            picturelevel_id.append(thispicture_id)
            picturelevel_user.append(thispicture_user)
    # second page, reached through the pagination link
    nexturl1 = vipdatadict['pagination']['next_url']
    vipdata1=urlopen(nexturl1).read()
    vipdatadict1=json.loads(vipdata1.decode("utf-8"))
    for picture in vipdatadict1['data']:

        thispicture_id=picture['id']
        thispicture_user=picture['user']
        for comment in picture['comments']['data']:
            commentlist.append(comment['text'])
            usernamelist.append(comment['from']['username'])
            fullnamelist.append(comment['from']['full_name'])
            commenteridlist.append(comment['from']['id'])
            idlist.append(comment['id'])
            picturelevel_id.append(thispicture_id)
            picturelevel_user.append(thispicture_user)
…..

import string
printable = set(string.printable)
commentlistclean=[]

# keep only printable ASCII characters in each comment
for text in commentlist:
    s = text
    sclean="".join([c for c in s if c in printable])
    commentlistclean.append (sclean)

import time
COMMENTNUM= len(commentlistclean)
timestr = time.strftime("%d%m-%H%M")
# NAMES is expected to hold a label for the output file; it is not defined in this listing
filename = 'D:/thesis/DATA/MOD1/'+timestr+'--'+NAMES+'-clean'+':'+str(COMMENTNUM)+'.csv'
print (filename)
output=zip(picturelevel_user,picturelevel_id,idlist,usernamelist,fullnamelist,commenteridlist,commentlist,commentlistclean)

with open(filename ,mode="w",encoding="utf-8", newline='' ) as fo:
    writer=csv.writer(fo)
    writer.writerows (output)

ORIGNALFILE= NAMES+"-"+timestr+"-"+str(COMMENTNUM)

import csv
allcomments=[]
picturelevel_user=[]
picturelevel_id=[]
idlist=[]
usernamelist=[]
fullnamelist=[]
commenteridlist=[]

with open(filename, 'r', newline='', encoding='ISO-8859-1') as fi:
    reader=csv.reader(fi)
    next(reader)
    for row in reader:
        pleveluser=row[0]
        plevelid=row[1]


        listid=row[2]
        username=row[3]
        fullname=row[4]
        commenterid=row[5]
        comments=row[7]
        picturelevel_user.append(pleveluser)
        picturelevel_id.append(plevelid)
        idlist.append(listid)
        usernamelist.append(username)
        fullnamelist.append(fullname)
        commenteridlist.append(commenterid)
        allcomments.append(comments)

print(allcomments[3])
print (len(allcomments))

# in scikit-learn 0.20 and later, train_test_split lives in sklearn.model_selection
from sklearn.cross_validation import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn import svm
from sklearn.linear_model import LogisticRegression
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn import metrics
# in scikit-learn 0.23 and later, joblib is imported directly: import joblib
from sklearn.externals import joblib

# vectorizer is the fitted vectorizer from the training step; it is not defined in this listing
bigdataset_features = vectorizer.transform(allcomments)
# load the saved pipeline that includes both the vectorizer
# and the classifier and predict
# classifier = joblib.load('class.pkl')
#predict = classifier.predict(X_test)
#return predict

#%% 1.1
logreg = LogisticRegression()
logreg = joblib.load('D:/thesis/ML/1.1-0905-1144.pkl')
predictions11 = logreg.predict(bigdataset_features)
print (predictions11)

#%% 1.2
logreg = LogisticRegression()
logreg = joblib.load('D:/thesis/ML/1.2-0905-1147.pkl')
predictions12 = logreg.predict(bigdataset_features)
print (predictions12)

#%% 1.4
logreg = LogisticRegression()
logreg = joblib.load('D:/thesis/ML/1.4-0905-1151.pkl')
predictions14 = logreg.predict(bigdataset_features)


print (predictions14)

#%% 2.3
logreg = LogisticRegression()
logreg = joblib.load('D:/thesis/ML/2.3-0905-1153.pkl')
predictions23 = logreg.predict(bigdataset_features)
print (predictions23)

#%% 3.2
logreg = LogisticRegression()
logreg = joblib.load('D:/thesis/ML/3.2-0905-1154.pkl')
predictions32 = logreg.predict(bigdataset_features)
print (predictions32)

# check that all columns have the same length before writing
print(len(picturelevel_user))
print(len(picturelevel_id))
print(len(idlist))
print(len(usernamelist))
print(len(fullnamelist))
print(len(commenteridlist))
print(len(allcomments))
print(len(predictions11))
print(len(predictions12))

import time
timestr = time.strftime("%d%m-%H%M")
filename2 = 'D:/thesis/DATA/AOUTOCODED/MODE1/'+timestr+'--'+ORIGNALFILE+'CODED'+'.csv'
print (filename2)
output=zip(picturelevel_user,picturelevel_id,idlist,usernamelist,fullnamelist,commenteridlist,allcomments,predictions11,predictions12,predictions14,predictions23,predictions32)

with open(filename2 ,mode="w",encoding="utf-8", newline='' ) as fo:
    writer=csv.writer(fo)
    writer.writerows (output)
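The listing above downloads each page of results with its own copy of the same block and elides the further copies ('…..'). As an illustration only, and not the code used in the study, the same collection step could be written as a single loop that follows the 'next_url' link until none is left. The names fetch_json and collect_comments and the max_pages parameter are hypothetical; the response layout ('data', 'comments', 'pagination') is assumed to match the one the listing reads.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Download a URL and decode its JSON body (needs network access)."""
    return json.loads(urlopen(url).read().decode("utf-8"))


def collect_comments(first_url, fetch=fetch_json, max_pages=None):
    """Follow 'next_url' links and gather one record per comment.

    Assumes the response layout read in the listing above: a 'data' list
    of media items, each with comments under ['comments']['data'], and an
    optional 'pagination' dict holding 'next_url'.
    """
    records = []
    url = first_url
    pages = 0
    while url and (max_pages is None or pages < max_pages):
        page = fetch(url)
        for picture in page.get("data", []):
            for comment in picture.get("comments", {}).get("data", []):
                records.append({
                    "picture_id": picture["id"],
                    "comment_id": comment["id"],
                    "username": comment["from"]["username"],
                    "text": comment["text"],
                })
        # A missing 'next_url' marks the last page and ends the loop.
        url = page.get("pagination", {}).get("next_url")
        pages += 1
    return records
```

Passing the download function as the fetch argument keeps the loop independent of the network, so it can be tried on stored responses before being pointed at a live endpoint.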


Appendix C Python code 2

Automatically coded comments retrieved for ‘Peer to Peer’ mode of communication, using the Instagram API.

import csv

VIPIDS=[]

# read the peer account ids from the sixth column of the celebrity file
with open("D:/thesis/DATA/MOD1/CELEBFILE.csv", 'r', newline='', encoding='ISO-8859-1') as fi:
    reader=csv.reader(fi)
    next(reader)
    for row in reader:
        PEER=row[5]
        VIPIDS.append(PEER)

print(VIPIDS)
print (len(VIPIDS))

NAMES = 'XXXX'

from urllib.request import urlopen
import json
from pprint import pprint
import csv

ACCESSTOKEN="766984346.8207a31.c35c671150144dd9bd2a3e3773f5f46f"
commentlist=[]
usernamelist=[]
fullnamelist=[]
idlist=[]
commenteridlist=[]
picturelevel_id=[]
picturelevel_user=[]

for vipid in VIPIDS:
    try:
        vipdata=urlopen("https://api.instagram.com/v1/users/"+vipid+"/media/recent/?access_token="+ACCESSTOKEN).read()
        vipdatadict=json.loads(vipdata.decode("utf-8"))
        for picture in vipdatadict['data']:
            thispicture_id=picture['id']
            thispicture_user=picture['user']
            for comment in picture['comments']['data']:
                commentlist.append(comment['text'])
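The listing breaks off inside the try block, so the handler the author used is not shown. Purely as an illustration of why a try block is useful when looping over many peer accounts, the sketch below skips any account whose download fails and records its id, instead of stopping the whole run. The names fetch_all and fetch_one are hypothetical, and the choice of exceptions to catch is an assumption.

```python
from urllib.error import URLError


def fetch_all(account_ids, fetch_one):
    """Call fetch_one(account_id) for each id, skipping ids that fail.

    Returns (results, failed): results maps each successful id to whatever
    fetch_one returned; failed lists the ids that raised an error.
    """
    results = {}
    failed = []
    for account_id in account_ids:
        try:
            results[account_id] = fetch_one(account_id)
        except (URLError, KeyError, ValueError) as err:
            # HTTPError is a subclass of URLError; KeyError covers a
            # response lacking an expected field; ValueError covers
            # a body that is not valid JSON.
            print("skipping", account_id, "-", repr(err))
            failed.append(account_id)
    return results, failed
```

Keeping the failed ids makes it possible to report how many peer accounts could not be read, for example because they were private or had been deleted.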
