
THE EFFECT OF INFORMATION FRAMING IN PEER-TO-PEER CHARITY CROWDFUNDING BY THOM HOLLEMAN June 13, 2020


Academic year: 2021


THE EFFECT OF INFORMATION FRAMING IN

PEER-TO-PEER CHARITY CROWDFUNDING

BY

THOM HOLLEMAN

University of Groningen

ABSTRACT

The growth of donation behavior among individuals has attracted academics from a range of research domains who strive to shed light on the underlying mechanisms that drive this behavior. Traditional research has explicated these mechanisms by means of experiments and qualitative research. However, the rise of crowdsourcing has also enabled beneficiaries to create a campaign for themselves in which they collect money for a self-determined cause, as opposed to the more common phenomenon in which an individual donates to a charity or non-profit organization. Because this crowdsourcing takes place online, insights from traditional research conducted in the presence of advocates may not be valid in online environments, as it has been shown that these advocates may provoke socially desirable responses. Hence, we propose to investigate the effect of the only point of contact an individual has on such crowdsourcing websites: the presented information frame in terms of its topics, emotionality, sentiment and the use of graphical cues. The key challenge here is to structure the unstructured fundraiser texts by means of Latent Dirichlet Allocation and Multidimensional Sentiment Analysis. That is, as (logistic) regression requires numerical values, the information frames need to be converted into metrics. Using this structured, tidy data, we find that there is no single best way to describe the beneficiaries’ concerns in fundraiser texts; instead, different ways of doing so increase donation behavior. For example, topics on the urgency of the beneficiary’s concerns negatively affect donation behavior, although this effect can be positively moderated by expressing trust, and this moderation in turn differs across fundraiser categories. Among others, expressing anger combined with addressing a substantial number of topics has positive effects on the elicited donation behavior towards beneficiaries, whereas expressing other emotions such as disgust should often be avoided. Future beneficiaries can use the insights of this research to make their crowdfunding fundraiser a success, whereas future academics can use them to understand donation behavior or as a foundation for their own subsequent research.


Contents

1 INTRODUCTION
2 LITERATURE REVIEW
2.1 Topic
2.2 Emotion
2.3 Sentiment
2.4 Image/Video
2.5 Control variables
3 DATA
3.1 Data source
3.2 Appropriate data volume
3.3 Data collection
3.4 Data cleaning
3.5 Data descriptives
4 METHODOLOGY
4.1 Latent Dirichlet Allocation
4.2 Multidimensional Sentiment Analysis
4.3 Data suitability and operationalization
4.4 Hypothesis testing
5 RESULTS
5.1 Fundraiser category distinction
5.2 Selecting a number of topics
5.3 Interpretation of topics
5.4 Multidimensional Sentiment Analysis
5.5 Hypothesis testing
5.6 Explored relationships
6 DISCUSSION & CONCLUSION


1 INTRODUCTION

“Light has come into the world, and every man must decide whether he will walk in the light of creative altruism or the darkness of destructive selfishness. Life’s persistent and most urgent question is ‘What are you doing for others?’”

- Martin Luther King Jr. (1963)

Donating is becoming ever more popular in the current era. Despite the decreasing trend in the number of donors in the USA, the sum of their charitable giving reached 427 billion USD in 2018, up from 389 billion USD 3 years earlier (Giving USA, 2019; Osili & Zarins, 2018). Also, the number of charity and non-profit organizations in the US amounted to roughly 1.5 million in 2018, a 10% increase from 2015 (Giving USA, 2019). These figures are not the only indication of the importance of the charity sector, as it has been, and still is, a widely studied domain among academics. Numerous authors have noted the importance of academics’ research efforts to “consciously seek out cross-disciplinary inputs” (Smith, D., 1975, p. 265). In answer to this call to action, researchers have been attempting to get a grasp on the stimuli and cues of this donation behavior¹ in a broad set of disciplines, including social and biological psychology, neurology, economics, and marketing (Bekkers & Wiepking, 2011), presenting insights into e.g. appropriate social media tactics (Guo & Saxton, 2017), expected psychological gratifications of donating (Algoe & Haidt, 2009; Jason, Thomson, & Navarro, 2014; Xilin & Xiaofei, 2017), and socio-demographic characteristics of donors (Andreoni & Scholz, 1998).

Also, a substantial body of research in these domains has been dedicated to the role of information framing in donation behavior, especially in marketing literature. Lindenberg (2006) describes this information framing as the decision on which topics, emotionality and sentiment the textual and/or graphical message presents. Literature on framing techniques indicates that one’s tendency to donate can be increased by information framing by means of, among others, soliciting (Benson & Catt, 1978), mood inducement (Cunningham, 1980), and statements of efficacy (Carroll & Kachersky, 2019). Insight into the effects of information framing is of essential importance for charities and beneficiaries², as they strive to maximize the number and sum of donations made by individuals.

¹ In this research, the terms “donation behavior” and “donate” refer to the act of giving monetary endowments to

The problem here is that the (textual) way of describing the beneficiary’s concerns is of the essence in maximizing donations. One of the phenomena in donation behavior studies that provides insight into the effects of information framing is the widely cited “identifiable victim effect” (Jenni & Loewenstein, 1997; Small, Loewenstein, & Slovic, 2007). This effect refers to individuals’ tendency to donate to beneficiaries that are described (i.e. “framed”) using affective and personal information about the beneficiaries, rather than statistical information and topics (Small et al., 2007). Other authors state the importance of awareness-of-need statements, as these positively influence individuals’ tendency to engage in donation behavior (Guy & Patton, 1989). Besides, graphical cues such as images and videos have also been shown to prompt donation behavior, but different graphical content has different effects in different environments (Chang & Lee, 2009). For example, images of child poverty have a direct, positive effect on individuals’ donation behavior, and when these images are accompanied by text, they also increase donation behavior when the text pertains to identifiable-victim topics (Lewit, Terman, & Behrman, 1997).

Akin to topics and graphical cues, emotionality and sentiment in information frames have also been shown to influence donation behavior, indicating that it is not only what you say, but also how you say it (Horberg, Oveis, & Keltner, 2011). Emotions in information frames refer to the expressed feelings of the beneficiary in texts (e.g. joy, sadness, and anger), even when the text is not written by the beneficiary himself. The sentiment of an information frame involves the way a scenario is described: “This person can’t recover if you don’t donate” reflects a negative sentiment, whereas “This person can recover if you donate” reflects a positive sentiment of the same message. Negative emotions (anger) and negative sentiment in information frames elicit individuals’ donation behavior when the donation is used to alleviate the troubles of beneficiaries that were treated unfairly. This implies that charities that pursue goals within this “fairness” moral foundation (Caruana & Crane, 2008; Goenka & van Osselaer, 2019), such as Amnesty International striving for equal rights and freedom, should incorporate sadness and/or a negative sentiment in their communication in order to increase the amount of received donations. However, when an objective is built upon a “care” moral foundation (i.e. humanitarian or individual welfare and recovery), positive emotions such as happiness, compassion and gratitude invoke a larger degree of donation behavior (Caruana & Crane, 2008; Horberg, Oveis, & Keltner, 2011).

² Congruent with the definitions of Bekkers & Wiepking (2011), the term “beneficiary” refers to the recipient of

Two remarks on the previously discussed studies are in order. (1) Many of these studies take place in an individual-to-charity (I2C) environment, rather than an individual-to-beneficiary (I2B) environment. This implies that donors (hereafter: individuals) donate to a charity (I2C) rather than to a single beneficiary (I2B) (Bekkers & Wiepking, 2011). Even when studies facilitate an I2B environment, these environments are often characterized by the physical presence of the beneficiary or an advocate by means of economic or social games in lab experiments (Cialdini & Goldstein, 2004; Smith, Windmeijer, & Wright, 2014). Here, a beneficiary or advocate asks the individual for a donation, or the individual may know or guess that the experiment is about donation behavior. These settings may lead to socially desirable behavior, which is a known source of invalid experimental results (Malhotra, 2010). Malhotra argues that socially desirable behavior is lowest when “perceived anonymity is high”, which is especially prevalent in online settings (Malhotra, 2010). In short, the dissimilarity between settings where social interaction exists and settings where it does not prevents us from generalizing findings of I2C studies to I2B studies.

(2) Many of these studies do not account for different ways of information framing in textual messages. As discussed, information frames have different effects when the composition of their parts is altered: when e.g. the emotion in the information frame changes while the other components are kept constant, the information frame may have a different effect than it would have had with the original emotion. Studies often incorporate only one or a few information frames or alter a single component, ignoring the wide variety of compositions of information frames and their potential effects on donation behavior.

to provide insights on the effect of topics, emotionality, sentiment, and the use of images/videos on the individuals’ donation behavior as a group. The question that this paper thus strives to answer reads as follows: “To what extent do information frames in user-generated content influence donation behavior in an individual-to-beneficiary environment with low social interaction?” By answering the proposed research question, we aim to explore the information framing techniques which increase the total sum of donations, the average volume of a donation and the probability of achieving the monetary donation goal.

techniques to increase the elicited donation among all fundraisers and do not have to differentiate between the moral foundations of the fundraisers.

Using a Latent Dirichlet Allocation (LDA) text mining method, we structure the unstructured texts of the fundraisers. This text analysis method “discovers” the latent probability of occurrence of extracted topics in corpora, based on the probability of occurrence of words for a given topic (Büschken & Allenby, 2016), which we call topic intensity. The emotions expressed in these fundraiser texts are measured by means of a multidimensional sentiment analysis (MDSA), which produces scores on 8 different emotions (fear, joy, anger, etc.) as well as their negated versions (negated fear, negated joy, etc.), where negated emotions are emotions expressed by unigrams such as “unhappy” (negated joy) or bigrams such as “not scared” (negated fear). This results in document-specific emotion-intensity values, indicating the ratio between the non-emotional and emotional words in the individual fundraiser’s text. The sentiment is also assessed by rating the sentiment of the words in the fundraiser text, after which the mean sentiment per document is calculated. Both LDA and MDSA are elaborated more extensively in the methodology section. Our data does not include metrics relating to psychological effects (Bekkers & Wiepking, 2011). Hence, we do not include these as explanatory variables; instead, we argue that individuals’ elicited donation behavior is a product of direct and moderating effects of components of the presented information frame. The metrics resulting from the LDA and MDSA text mining methods are used to assess their effect on donation behavior by means of (logistic) regressions.
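
To make the emotion and sentiment metrics more tangible, the following Python sketch scores a single toy sentence. It is an illustration under stated assumptions, not the procedure used in this study: the lexicons, the negator list, and the function names are hypothetical, intensity is assumed to be the share of tokens that carry a given (negated) emotion, and the mean sentiment is assumed to be taken over rated words only.

```python
import re

# Hypothetical toy lexicons for illustration only; NOT the lexicons used in this study.
EMOTION_WORDS = {"scared": "fear", "afraid": "fear", "happy": "joy",
                 "grateful": "joy", "furious": "anger"}
NEGATED_UNIGRAMS = {"unhappy": "joy"}   # single words that already negate an emotion
NEGATORS = {"not", "no", "never"}       # first word of a negating bigram
WORD_SENTIMENT = {"happy": 1.0, "grateful": 1.0, "recover": 0.5,
                  "scared": -1.0, "furious": -1.0, "unhappy": -1.0}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def emotion_intensity(text):
    """Share of tokens per (negated) emotion; an assumed definition of intensity."""
    tokens = tokenize(text)
    counts = {}
    for i, token in enumerate(tokens):
        if token in NEGATED_UNIGRAMS:
            label = "negated_" + NEGATED_UNIGRAMS[token]
        elif token in EMOTION_WORDS:
            # A bigram such as "not scared" counts towards the negated emotion.
            negated = i > 0 and tokens[i - 1] in NEGATORS
            label = ("negated_" if negated else "") + EMOTION_WORDS[token]
        else:
            continue
        counts[label] = counts.get(label, 0) + 1
    return {label: n / len(tokens) for label, n in counts.items()}


def mean_sentiment(text):
    """Mean score over the rated words of one document (0.0 if none are rated).

    For brevity, negators are ignored here, so "not scared" still scores -1.0.
    """
    scores = [WORD_SENTIMENT[t] for t in tokenize(text) if t in WORD_SENTIMENT]
    return sum(scores) / len(scores) if scores else 0.0


doc = "We are not scared, but unhappy and grateful."
print(emotion_intensity(doc))
print(mean_sentiment(doc))
```

Each of the three emotion-bearing words makes up one of the eight tokens, so each detected label receives the same intensity in this toy sentence. A real lexicon would be far larger and might weight words rather than count them.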

The results from the (logistic) regressions indicate that different information frame compositions are appropriate for different metrics of donation behavior (only the average donation volume and probability of goal achievement are analyzed, as the total sum of donations was found to be correlated with the probability of goal achievement). Also, the results show that some combinations of topics with emotion, sentiment or the use of images/videos may decrease the elicited donation behavior towards the beneficiary. It is thus highly dependent on the concerns and topics the fundraisers’ creator decides to start a

We aim to contribute to existing academic literature by examining the effect of different compositions of information frames in fundraiser texts, shedding light on specific topics, emotionality, sentiment and the addition of images and/or videos to fundraiser texts that increase the total sum of donations, average donation volume, and probability that the fundraiser’s monetary goal is achieved. Applications of this study’s findings extend to (new) beneficiaries that start a fundraiser, who could use these findings to make their call for assistance a success. Also, facilitating platforms can incorporate and/or communicate these findings to new beneficiaries in order to help them create their own fundraiser texts.

The second chapter discusses the research framework, as well as the accompanying literature review. Here, we discuss the variables proposed in the research framework and their expected effects, derived from existing literature. Next, in the third chapter, the data collection and descriptive characteristics of the data are highlighted. The fourth chapter sheds light on the methodology of this research, discussing the essence and assumptions of the text analysis methods used, and how we satisfy those assumptions. Subsequently, in chapter 5, we lay out and explicate our results, followed by the conclusions and discussion in chapter 6.

2 LITERATURE REVIEW

Current literature presents knowledge about more general, abstract information framing techniques or components of information frames, providing us with a general understanding of their influence on donation behavior. Our exploratory research is designed to identify different compositions of information frames that increase or decrease the elicited donation behavior. Hence, in favor of the face validity of our results, we outline existing academic theories and findings from which we can derive our hypotheses; we discuss theories and findings about the individual components of information frames and their direct effects on individuals’ donation behavior (figure 2.1). Findings in other research domains that can contribute to our understanding of information framing techniques are discussed as well. In the following subsections, we formulate hypotheses that are to be tested by analyzing the data. Other relationships that are not hypothesized but are of interest

Figure 2.1: Research framework (effects that are to be explored are indicated with “Ex”)

2.1 Topic

First, because text mining methods such as LDA have not been broadly used in the donation behavior research domain, we reach out to another domain where text mining methods have been used more extensively to indicate how LDA can contribute to our understanding of the susceptibility of people’s behavior to different topics in information frames. Hence, we briefly discuss topics that emerge from Latent Dirichlet Allocation (LDA) analyses on Facebook posts in subsection 2.1.1 to outline the susceptibility of desired behavior to distinct topics. Subsequently, we discuss four donation behavior-prompting topic categories that are observable in information frames in subsections 2.1.2 - 2.1.5. These four categories originate from the extensive literature review of empirical studies of donation behavior by Bekkers & Wiepking (2011), who elaborate eight determinants of donation behavior, derived from a vast number of earlier articles. Several determinants proposed in their research derive from (expected) psychological effects (e.g. self-image and “warm glow”). We do not elaborate on these determinants, as our research revolves around the components of information frames; only determinants that relate to information frames are discussed. Lastly, findings of authors that were not included in Bekkers & Wiepking’s (2011) literature review, but that can contribute to our understanding of information frames and their effect on donation behavior, are also discussed.

2.1.1 Susceptibility of behavior on topics

behavior). This study shows that Facebook posts addressing topics on, among others, medical concerns, house and family relationships, and asking for support/prayers receive the most audience feedback in terms of comments, whereas the topics that decrease audience feedback include religiousness/Christianity.

Even though the desired effect of topics in this domain differs from donation behavior, these information frames also strive to prompt desired behavior, and can thus contribute to our preliminary understanding of the effect of topics in texts on donation behavior. We do not assume these topics to have a similar effect on donation behavior, as the research domain is different and no monetary endowments are involved in “audience feedback”. Instead, we have discussed these studies to indicate that one’s behavior can indeed be prompted by the use of distinct topics in texts, supporting the key idea of our research. Besides, the topics found in the previously discussed studies can help in assessing the face validity of the topics that emerge from the LDA conducted in our research.

2.1.2 Awareness and urgency of need

Individuals need to be aware of the need of a beneficiary for donation behavior to be elicited (Bekkers & Wiepking, 2011). In line with Bekkers & Wiepking’s (2011) findings, Guy & Patton (1989) state that people’s awareness of the beneficiary’s need and of the urgency of that need is of crucial importance to the intention to donate. Indeed, a number of authors find compelling evidence for these two elements: stating the beneficiary’s need and the urgency of this need has a highly positive influence on the likelihood that an individual donates (Das, Kerkhof, & Kuiper, 2008; Rodriguez, 2005). This is consistent with the findings of several other authors, who state that merely the urgency with which help is needed positively influences the likelihood of donation behavior (Levitt & Kornhaber, 1977; Schwartz, 1974; Staub & Baer, 1974). This sensitivity of individuals to awareness-of-need and urgency statements is predominantly due to feelings of responsibility and fear of blame (Gillette & Hopkins, 1989). These feelings, in turn, are subject to the ex post evaluation that is elaborated in the efficacy section.

addressing the beneficiary’s concerns and the urgency of required donations, an increased degree of donation behavior towards the beneficiary is elicited.

2.1.3 Efficacy

Efficacy refers to the degree to which the individual believes his or her donation makes a significant contribution to the alleviation of the beneficiary’s concerns (Gneezy, Keenan, & Gneezy, 2014; Smith & McSweeney, 2007). Bekkers & Wiepking (2011) describe this mechanism in their literature review. Although they find little significant evidence for the existence of this efficacy mechanism, subsequent research provides more insight. Individuals tend to avoid a charity that allots a substantial share of monetary donations to overhead costs; informing individuals that the overhead costs are taken care of, so that the perceived efficacy is high, results in an increased donation rate of 80% (Carroll & Kachersky, 2019; Smith & McSweeney, 2007). This determinant of perceived efficacy thus rests on how much of the monetary donation is used for overhead costs or, put differently, how much of the monetary donation actually “reaches” the beneficiary.

Another determinant of efficacy is intertwined with the identifiable victim effect theory by Jenni & Loewenstein (1997). They argue that a determinant of perceived donation efficacy is the ex post and ex ante evaluation proposed by Gillette & Hopkins (1989). These evaluations occur at the moment the decision on whether or not to donate is made. In ex post evaluation, the individual decides whether or not to donate only after (ex post) the beneficiary’s risk-producing event. This decision is based on an evaluation of the ability to help the beneficiary: what is the probability that the beneficiary recovers from the risk-producing event when the individual donates (Gillette & Hopkins, 1989)? With ex ante evaluation, the individual decides whether or not to donate based on the probability of prevention of the risk-producing event, so before (ex ante) the beneficiary is at risk: what is the probability that the beneficiary can be protected from exposure to the risk-producing event when the individual donates (Gillette & Hopkins, 1989; Jenni & Loewenstein, 1997)?

Following the ex post and ex ante evaluation theory, statements of efficacy are thus important for provoking donation behavior, as these indicate how and/or to what extent donations

donation behavior, leading to a higher total donation volume, sum and, in turn, a higher probability of achieving the monetary donation goal.

2.1.4 Mood inducement

Bekkers & Wiepking (2011) also mention the influence of positive mood on donation behavior. We propose that psychological benefits (expected moods) differ from induced expected moods: psychological benefits emerge from the donor’s intrinsically expected feelings when (s)he donates, whereas induced expected moods are moods that potential donors are “told” to expect by a beneficiary when they decide (not) to donate. Stating that donating will bring the individual into a good mood increases donation behavior, particularly when beneficiaries had no influence on the harm that was done to them (Benson & Catt, 1978), thus in ex post evaluation scenarios. Contrary to this theory, Cunningham (1980) finds that people tend to engage in donation behavior when a feeling of guilt is provoked; guilt-inducing phrases like “Imagine how you would feel when you don’t donate” put people in a bad mood, leading to increased donation behavior. This tendency is especially prevalent in ex post evaluations (Jenni & Loewenstein, 1997). Here, expected psychological detriments such as feelings of responsibility and blame increase one’s tendency to engage in donation behavior (Gillette & Hopkins, 1989).

For our research, these studies do not provide us with an expected effect that is generally agreed upon. We therefore propose to latently explore whether or not there are mood-inducing topics emerging from the LDA analysis and whether or not these have significant positive or negative effects on donation behavior.

In conclusion, there is a body of literature showing that topics in texts lead to certain

Preliminarily, we expect that the I2B fundraiser information frames that positively influence donation behavior display topics such as awareness-of-need statements, urgency statements, and efficacy statements. For the mood-inducing topics, we cannot presume a direction due to the contradictory nature of the discussed studies concerning these topics; we plan to find these effects, if any, latently from our dataset. The hypotheses therefore read as follows:

• H1a: Intensity of topics on awareness-of-need statements in I2B user-generated fundraiser texts has a positive effect on the elicited donation behavior towards the beneficiary

• H1b: Intensity of topics on urgency statements in I2B user-generated fundraiser texts has a positive effect on the elicited donation behavior towards the beneficiary

• H1c: Intensity of topics on efficacy statements in I2B user-generated fundraiser texts has a positive effect on the elicited donation behavior towards the beneficiary

Again, due to the exploratory nature of this research, other specific topics that were not found in previous studies but are of influence nonetheless are to be discovered by the LDA and subsequent analyses. Hence, we propose that, besides the topics mentioned in the theory and hypotheses of this section, there are other topics that may influence the elicited donation behavior towards the beneficiary.
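
Topic intensity, the quantity that H1a to H1c refer to, can be illustrated with a small collapsed Gibbs sampler for LDA. The Python sketch below is a didactic assumption and not the implementation used in this study: the function name, the hyperparameters `alpha` and `beta`, the number of iterations, and the three toy documents are all hypothetical. It returns, for every document, a vector of topic proportions that sums to one.

```python
import random
from collections import defaultdict


def lda_topic_intensity(docs, n_topics, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    doc_topic = [[0] * n_topics for _ in docs]                # tokens of doc d in topic k
    topic_word = [defaultdict(int) for _ in range(n_topics)]  # count of word w in topic k
    topic_total = [0] * n_topics                              # tokens assigned to topic k
    assignments = []

    def update(d, word, k, step):
        """Add (step=+1) or remove (step=-1) one token from the count tables."""
        doc_topic[d][k] += step
        topic_word[k][word] += step
        topic_total[k] += step

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = [rng.randrange(n_topics) for _ in doc]
        assignments.append(z_d)
        for word, k in zip(doc, z_d):
            update(d, word, k, +1)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                update(d, word, assignments[d][i], -1)
                # Conditional probability of each topic for this token (unnormalized).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][word] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                new_k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = new_k
                update(d, word, new_k, +1)

    # Smoothed share of each topic within each document ("topic intensity").
    return [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha)
         for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]


toy_docs = [
    ["surgery", "hospital", "cancer", "hospital"],
    ["dog", "shelter", "rescue", "dog"],
    ["hospital", "cancer", "surgery"],
]
for row in lda_topic_intensity(toy_docs, n_topics=2, n_iter=50, seed=1):
    print([round(p, 3) for p in row])
```

In a regression, each column of the returned matrix would serve as one explanatory variable; which column corresponds to which theme has to be interpreted afterwards from the most probable words per topic.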

2.2 Emotion

There is a vast body of literature indicating the positive effects of positive emotions on donation behavior. Donation behavior towards psychologically distant (out-group) beneficiaries (e.g. individuals helping beneficiaries that are unknown to them) is more likely to occur when love or a combination of love and hope is framed (Cavanaugh, Bettman, & Luce, 2015). The Affect Infusion Model (Forgas, 1995) describes an interpersonal relationship between two individuals growing stronger when positive emotions are experienced, which in turn provokes donation behavior (Forgas, 2002). Besides, Waugh & Fredrickson (2006) found an even further increase in the tendency to share resources within the relationship when the emergence of the relationship was based on positive experiences and emotions.

moral foundations, which in turn are based on the Moral Foundation Theory (Graham, Nosek, Haidt, Oyer, & Koleva, 2011). This widely cited theory proposes a framework of five psychological foundations on which cultures build their moral behavior: care, fairness, loyalty, respect, and purity. Non-profit organizations ground their activities on either the care foundation (e.g. Red Cross and Make-A-Wish Foundation) or the fairness foundation (e.g. Amnesty International and Center for Constitutional Rights) (Goenka & van Osselaer, 2019). It has been shown that positive emotions (happiness, compassion, and gratitude) can positively influence the effect of medical topics on donation behavior (Caruana & Crane, 2008; Goenka & van Osselaer, 2019) in care-founded fundraisers. Negative emotions such as shame, embarrassment, and sadness influence this effect of topics on donation behavior as well, although there is no particular moral foundation they should accompany; they have a small but significant moderating effect on the effect of topics on donation behavior for both care and fairness moral foundations (Horberg, Oveis, & Keltner, 2011). This implies that charities or beneficiaries should ensure congruency between the emotion in their texts and the objective/moral foundation of those texts.

Following the discussed literature, we expect to find a positive, direct effect of positive emotions on the elicited donation behavior towards the beneficiary. Besides, based on existing literature, we assume the existence of moderating effects of emotionality on the effect of topic on elicited donation behavior, although we cannot specify which topic effects are moderated by the emotionality in the information frame or what the direction of this moderating effect is. Hence, we strive to explore whether or not there are moderating effects on the topics that emerge from the LDA analysis and which emotionality is appropriate for which topic. The hypothesis is therefore as follows:

• H2: Positive emotionality in I2B user-generated fundraiser texts has a positive effect on the elicited donation behavior towards the beneficiary

2.3 Sentiment

According to Das et al. (2008), an example of a negative sentiment (loss frame) in a

objective (Das, Kerkhof, & Kuiper, 2008; Rothman, Bartels, Wlaschin, & Salovey, 2006). This insight does not imply that a negative sentiment is always inferior to a positive one, because negative sentiments have more effect on donation behavior in fairness-founded objectives and positively moderate the effect of topics describing beneficiaries with personal information (Horberg et al., 2011). Also, Chou & Murnighan (2013) find that in a blood-donation experiment (resting on a care foundation) a negative sentiment prompted significantly more donors, whereas they also find that the effects of statements of urgency are not moderated by the sentiment of an information frame.

These contradictory findings prevent us from determining the appropriate sentiment for care-founded fundraisers. Besides, previous studies also show that the sentiment used does not moderate the effect of all topics on donation behavior. We thus aim to explore the direction (positive/negative) of the effect of positive and negative sentiment on donation behavior in our data. Also, we aim to find whether or not sentiment moderates the effect of the topics that emerge latently from our LDA analysis on elicited donation behavior and, if so, what the appropriate sentiment (positive/negative) is for these topics.

2.4 Image/Video

Akin to the information frame components discussed earlier, graphical cues in the form of images and videos have been shown to affect donation behavior. For instance, charity advertisements are more likely to provoke donation behavior when eyes are in the image (Powell, Roberts, & Nettle, 2012) or when negative frames such as child poverty are presented (Lewit, Terman, & Behrman, 1997). The latter relates to image valence, which has been the subject of a wide variety of academic research. The valence of an image or video refers to the degree to which it positively or negatively presents objects, people, and/or sceneries (Frijda, Manstead, & Bem, 2009). Image valence not only has a direct influence on donation behavior (Chang & Lee, 2009), but also moderates the information framing effects of accompanying texts (Perrine & Heather, 2000; Thornton, Kirchner, & Jacobs, 1991). The popularity of negative images (demonstrating a more positive influence on donation behavior than positive images),

In light of the discussed literature, we acknowledge the existence of a mechanism of different framing techniques in images or videos and its effect on desired behavior, in our case donation behavior. However, in favor of the feasibility of this study, we account for this mechanism by merely recognizing the presence of graphical cues by means of a binary dummy variable (yes/no). This implies that we aim to explore the direction of the direct and moderating effects (if any) of the presence of graphical cues on donation behavior.
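
One way to read the direct and moderating effects of such a dummy is as a main effect plus interaction terms in a logistic regression. The specification below is an illustrative sketch with assumed notation, not necessarily the exact model estimated later: $y_i$ indicates whether fundraiser $i$ achieved its monetary goal, $T_{ik}$ is the intensity of topic $k$ in its text, $G_i$ equals 1 if an image or video is present and 0 otherwise, and all coefficient symbols are introduced here for illustration only.

```latex
\[
\log\frac{\Pr(y_i = 1)}{1 - \Pr(y_i = 1)}
  = \beta_0 + \sum_{k=1}^{K} \beta_k T_{ik} + \gamma G_i
  + \sum_{k=1}^{K} \delta_k \,(T_{ik} \times G_i)
\]
```

Under this reading, a nonzero $\gamma$ corresponds to a direct effect of graphical cues, and a nonzero $\delta_k$ to a moderating effect of graphical cues on the effect of topic $k$.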

2.5 Control variables

Using data stemming from the GoFundMe platform, Sisco & Weber (2019) found that the gender of the fundraiser-creator has no significant effect on the elicited donation behavior towards the beneficiary. However, as other researchers argue that gender does have an effect on donation behavior (Bolton & Katok, 1995; Lo & Tashiro, 2013; Mesch, Brown, Moore, & Hayat, 2011; Piper & Schnepf, 2008), and in order to minimize the unexplained variance in our analyses’ results, we control for the gender of the fundraiser-creator nonetheless: we are not interested in the effect of the fundraiser-creator’s gender on the elicited donation behavior towards the beneficiary, but do account for its possible effects for the sake of this research’s explanatory magnitude.

3 DATA

3.1 Data source

The data that will be used for the analyses stem from the online platform gofundme.com (hereafter: GFM). This platform provides beneficiaries the ability to create and post a fundraising campaign, including the text describing their concerns, monetary donation goal, and appended images and/or videos. Donations can subsequently be made by individuals, creating an Individual-to-Beneficiary environment. Beneficiaries can categorize their fundraiser within a range of 19 cause-categories, among which medical treatments, funerals of acquaintances, animal well-being, educational purposes, and (local) community incentives.

3.2 Appropriate data volume

11-14 words on average (Boot, Tjong Kim Sang, Dijkstra, & Zwaan, 2019), therefore logically need more observations than a dataset consisting of Shakespeare books in order to create a valid dataset. There’s thus no clear guideline on the appropriate number of observations that are necessary for the LDA to create valid results. Instead, we discuss other authors that have used this method and derive an appropriate number of observations for our data.

Netzer & Lemaire (2016) used LDA to predict loan defaults from crowd-sourced loan applications. In their research, they used 19,446 observations, containing 3.5 million words. Jian, Gang, Chunxiu & Kerry (2019) used 21,852 consumer reviews as observations for their LDA approach. Because they do not state the number of words present in their dataset, we refer to the word-count levels used in the research of Risselada, de Vries & Verstappen (2018), who state that short and long consumer reviews consist of 70 and 170 words, respectively. By multiplying the number of reviews in the dataset of Jian et al. (2019) by the mean word-count derived from Risselada et al. (2018) (= 120), we estimate that Jian et al.’s (2019) dataset comprises 2.6 million words. A last reference level of word-counts of a valid dataset volume for LDA is the work of Zhao, Du, & Buntine (2017), who compare several datasets in their research, jointly consisting of roughly 1.5 million words. Following these discussed researches, we argue that a dataset comprising a number of words between 1.5 and 3.5 million is appropriate for LDA.
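The word-count estimate above can be retraced with a few lines of arithmetic. The following is a minimal Python sketch rather than part of the original analysis; the variable names and the helper `within_reference_range` are illustrative assumptions, while the figures are the ones cited in this section.

```python
# Reference corpus sizes (in words) taken from the studies cited above.
netzer_lemaire_words = 3_500_000       # stated by Netzer & Lemaire (2016)
zhao_words = 1_500_000                 # approximate joint size in Zhao et al. (2017)

# Jian et al. (2019) report reviews, not words, so the size is estimated.
jian_reviews = 21_852
mean_review_words = (70 + 170) / 2     # midpoint of short and long reviews = 120
jian_words = jian_reviews * mean_review_words

lower = min(netzer_lemaire_words, zhao_words, jian_words)
upper = max(netzer_lemaire_words, zhao_words, jian_words)


def within_reference_range(n_words):
    """Check whether a corpus size falls inside the range spanned by the references."""
    return lower <= n_words <= upper


print(round(jian_words))  # 2622240, i.e. roughly 2.6 million words
```

Under these assumptions, a corpus of about 2.1 million word-tokens, as obtained after cleaning in section 3.4, falls inside the range.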

3.3 Data collection

derived from an English lexicon, meaning that non-English words cannot be rated for the emotion or sentiment they represent.

In order to maximize the share of English-written fundraisers in the dataset, we use one city per state in the USA as search term. On the GFM platform, search results are sorted in descending order, based on the sum of donations and the time since the last donation was made to the fundraiser. The website also provides only 1,000 results per search query. To avoid missing the URLs of fundraisers with few or no donations, which appear at the bottom of the search results, we use, for each of the 50 states in the USA, one city as search term that returns no more than 999 and no fewer than 100 search results without selecting a category or adding cause-related keywords. Cities that return more than 999 or fewer than 100 results are not used for URL extraction. The threshold of 100 results is set to avoid using search terms that result in few to no fundraisers, which would otherwise have caused the dataset to be invalidly small. The IP address of the querying computer also influences the search results: different results show up when using, e.g., a Dutch IP address than when using an IP address in the USA. Again, to ensure the highest share of fundraisers written in English in the scraped data, the IP address is moved to Kansas City by using a Virtual Private Network (VPN), as Kansas City is known for being the most central (moderately large) city in the USA.

The URLs of all fundraisers that result from the appropriate search terms are extracted by using Tension Software’s Cross-Navigation URL-extractor. Subsequently, the extracted URLs are used as input in Kuaiyi Technology’s scraping software “ScrapeStorm”. By creating a custom-made scraping “profile”, this software collects the required bodies of data from the URLs as specified in the scraping profile and reports these as a .csv file. This results in 26,225 fundraisers, created in the period from 2011-02-24 until 2020-02-19.

3.4 Data cleaning

observational unit forms a table” (Wickham, 2014, p. 4). Besides the creation of tidy data following Wickham’s definitions, we execute more data cleaning practices in favor of the subsequent statistical analyses, following Berger et al.’s (2020) guidelines.

3.4.1 Outliers

The first step taken to create tidy data is the removal of cases with outliers in the “donation goal” variable. We distinguish two sorts of outliers in the dataset: interpretable and uninterpretable outliers. Interpretable outliers are larger than the third interquartile range (IQR3 = 150,000) but are still meaningful observations; it may very well be that some fundraisers need larger sums of money, e.g. for expensive medical treatments. Observations that are larger than the IQR3 of their variable and are unlikely to be real (i.e., fundraisers with a donation goal of 1-100 million dollars) are removed from the dataset (n = 20). Imputation is not appropriate here, as the observations were correctly collected by the scraping software; it is the values themselves that are absurd. The removal of uninterpretable outliers decreases the number of observations in the dataset to 26,205.

3.4.2 Non-English texts and bilingualism

As stated, there’s a high likelihood of the dataset containing at least some fundraisers completely written in non-English languages. These fundraisers are detected by using the “textcat” package and subsequently removed from the dataset. We only use English fundraisers in our dataset for the sake of the interpretability of resulting topics that emerge from the Latent Dirichlet Allocation approach. This leaves the data to consist of 23,851 observations.

In some cases, the creator of the fundraiser created a bilingual fundraiser text; that is, (s)he has written in English and added a non-English translation of this text in a new paragraph. To deal with this, all texts in the dataset are broken down (unnested) into sentences (tokens), and the language is detected per sentence. The language detection is based on versions 2 and 3 of Google’s “Compact Language Detector” (CLD2 and CLD3, respectively). CLD2 differs from CLD3 in that CLD2 is a Bayesian classifier, while CLD3 detects language with a neural network. By using these two methods, we can ascertain the correctness of the

by the researcher. Tokens detected with languages other than English are removed from the dataset, after which the remaining tokens are renested.
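The unnest, detect, filter, and renest steps can be illustrated with a short Python sketch. It is not the R workflow of this study: the two detector functions are toy stand-ins for CLD2 and CLD3 built on hypothetical word lists, and the rule that a sentence is kept only when both detectors label it English is an assumption made for illustration.

```python
import re


def split_sentences(text):
    """Naive unnesting: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def keep_english(text, detectors):
    """Keep sentences that every detector labels 'en', then renest them."""
    kept = [s for s in split_sentences(text) if all(d(s) == "en" for d in detectors)]
    return " ".join(kept)


# Hypothetical word lists; real detectors use trained models instead.
EN_WORDS = {"the", "is", "my", "we", "need", "help", "for", "and"}
ES_WORDS = {"el", "la", "es", "mi", "necesitamos", "ayuda", "para", "y"}


def _words(sentence):
    return re.findall(r"[a-záéíóúñ]+", sentence.lower())


def toy_detector_a(sentence):
    """Stand-in for the first detector: majority vote over the word lists."""
    words = set(_words(sentence))
    return "en" if len(words & EN_WORDS) > len(words & ES_WORDS) else "other"


def toy_detector_b(sentence):
    """Stand-in for the second detector: English words present, no Spanish words."""
    words = _words(sentence)
    has_en = any(w in EN_WORDS for w in words)
    has_es = any(w in ES_WORDS for w in words)
    return "en" if has_en and not has_es else "other"


text = "We need help for my brother. Necesitamos ayuda para mi hermano."
result = keep_english(text, [toy_detector_a, toy_detector_b])
print(result)
```

Requiring agreement between detectors is a conservative choice: it lowers the chance of keeping a non-English sentence at the cost of occasionally dropping an English one.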

3.4.3 Inactive campaigns

Fundraisers on the GFM platform that were posted, e.g., one day ago are highly likely to have a lower total sum of donations than fundraisers that have been collecting donations for months; it takes time to achieve the donation goal of a fundraiser. Incorporating such fundraisers in the dataset would cause the analysis to show invalid estimates of the total sum of donations and the probability of goal achievement. For that reason, we account for the dependency of the total sum of donations and the probability of goal achievement on whether the fundraiser is still active. We do so by only incorporating “inactive” fundraisers in the dataset, where a fundraiser is marked as inactive when its last donation was made more than two months before the reference date of 2020-04-24. This does not mean that the fundraiser itself is inactive on GFM; inactive campaigns are merely labeled as such in the dataset. The removal of active campaigns shrinks the dataset to 21,287 observations.
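The inactivity rule can be sketched as follows. This is an illustrative Python example, not the code used in the study; the function names are hypothetical, and the day-clamping in `months_before` is a simplification that keeps the date valid for short months.

```python
from datetime import date

REFERENCE_DATE = date(2020, 4, 24)  # reference date stated in the text


def months_before(d, months):
    """Return the date a number of calendar months before d (day clamped to 28)."""
    total = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(total, 12)
    return date(year, month_index + 1, min(d.day, 28))


CUTOFF = months_before(REFERENCE_DATE, 2)


def is_inactive(last_donation):
    """A fundraiser counts as inactive when its last donation predates the cutoff."""
    return last_donation < CUTOFF


print(CUTOFF)  # 2020-02-24
```

Only fundraisers for which `is_inactive` returns `True` would be retained for the analyses.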

3.4.4 Non-ASCII characters

Some fundraisers (n = 4) lack data points and instead show non-ASCII characters (HTML code). This occurs because these fundraisers were closed during the period between the URL extraction and the web scraping of the fundraisers. Closing the fundraiser causes the donation goal to disappear from the webpage, leading to a useless observation, as this metric is required to determine a component of the dependent variable: the probability of goal achievement. Therefore, fundraisers showing non-ASCII characters are dropped from the dataset.

3.4.5 Stop-words


3.4.6 Digits, punctuation, whitespace, decapitalization, and stemming

Some other requirements that need to be satisfied before conducting an LDA are the removal of digits, punctuation, and whitespace. These textual characters lack explanatory power in the model and thus need to be eliminated. Here, whitespace is defined differently from “regular” spaces: whitespace is a product of using the tab or enter key, whereas “regular” spaces are placed by using the space key. Hence, “regular” spaces are kept within the dataset. Besides, decapitalization needs to be conducted for the LDA to be able to recognize the similarity of terms. Stemming, in contrast, is not a requirement for a valid LDA analysis, but rather a choice made by the researcher. Not stemming would (sometimes) lead to better interpretable words in the extracted topics. However, not stemming would cause the LDA to miss similarities between words: e.g., the word “recovery” would not contribute to the same topic-word probability of occurrence as “recovering”, even though they essentially express the same concept. Hence, stemming will be used so that similar concepts contribute to the same topic-word probability.
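The character-level cleaning described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the R code used in the study: the function name `clean_text` is hypothetical, and stemming is deliberately left out because it requires a dedicated stemmer.

```python
import re
import string


def clean_text(text):
    """Lowercase, drop digits and punctuation, and collapse tabs and newlines."""
    text = text.lower()                                   # decapitalization
    text = re.sub(r"\d+", " ", text)                      # remove digits
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    text = text.translate(table)                          # remove ASCII punctuation
    return re.sub(r"\s+", " ", text).strip()              # tabs/newlines become single spaces


print(clean_text("Recovery\tcosts: $5,000!\nPlease HELP."))
```

Single spaces between words are preserved, in line with the distinction between whitespace and “regular” spaces made above.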

3.4.7 Character threshold

Although there’s no conclusive guideline for the minimum number of characters for an LDA, we adopt the arguments of Maier et al. (2018), who state that an LDA is especially efficient in longer texts. However, they do not provide a clear guideline on a minimum character count per document either. Therefore, cases with fewer than 125 characters (including spaces) in their text are marked as useless and dropped from the dataset, as they do not significantly contribute to the topic models that result from the LDA. This threshold of 125 characters is determined by adding 5% of the mean character count (x̅ = 1,473) of all fundraisers to the number of characters in the fundraiser with the fewest characters (51). This is done only after the removal of stop-words, digits, punctuation and whitespace, because these words and characters have little to no explanatory power in the LDA topic model. Eliminating the fundraisers that do not satisfy the minimum character count did not result in significant changes in the preliminary topic model. This final cleaning measure provides a dataset of 19,678 observations containing 2,113,920 word-tokens. Considering the word-count range proposed in section 3.2, we consider this dataset sufficiently large to be valid.
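The derivation of the 125-character threshold can be written out as a short calculation. The Python sketch below only restates the arithmetic from the text; the names `threshold` and `long_enough` are illustrative assumptions.

```python
mean_chars = 1473   # mean character count of all fundraisers
min_chars = 51      # character count of the shortest fundraiser

# 51 + 0.05 * 1473 = 124.65, which rounds to 125 characters.
threshold = round(min_chars + 0.05 * mean_chars)


def long_enough(text):
    """Keep a document only if it has at least `threshold` characters, spaces included."""
    return len(text) >= threshold


print(threshold)  # 125
```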

3.4.8 Gender determination

The determination of the gender of the fundraisers’ creator is done by use of the

and ipums datasets. The first dataset comprises the USA’s Social Security Administration’s data on first names since 1880. For observations where gender cannot be determined by using the ssa dataset, the ipums dataset is used to determine gender, which entails USA Census data with known names of people born since 1930. In total, 12,974 creators (68.9%) are labeled as female and 5,844 (31.1%) as male, leaving 860 observations for which neither the ssa nor the ipums dataset could determine gender. The decision on whether or not to proportionally allocate gender to these gender-lacking observations (based on the gender distribution in the part of the dataset where gender could be determined) will be made based on (significant) changes in control variable estimates and will be evaluated in the results section.
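The two-step lookup can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the dictionaries below are tiny hypothetical stand-ins for the ssa and ipums name data, and `infer_gender` is an illustrative helper, not a function of the package used in the study.

```python
def infer_gender(first_name, primary, fallback):
    """Look a first name up in the primary source, then in the fallback source."""
    key = first_name.strip().lower()
    return primary.get(key) or fallback.get(key)   # None when both sources fail


# Hypothetical miniature name tables, for illustration only.
ssa_names = {"mary": "female", "james": "male"}
ipums_names = {"ayesha": "female"}

for name in ["Mary", "Ayesha", "Xyz"]:
    print(name, infer_gender(name, ssa_names, ipums_names))
```

Names that return `None` correspond to the observations for which gender could not be determined.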

3.5 Data descriptives

The data used range over a period from 2011-02-24 until 2020-02-19 and across all states of the USA, and comprise 19,678 observations consisting of 2,113,920 word-tokens. The volume of the data is therefore sufficient for an LDA compared to former researches conducting LDA, as elaborated in section 3.2. Of these fundraisers, 3,508 (17.8%) met or exceeded their monetary goal, whereas 16,170 (82.2%) did not. Interpretable outliers that are kept in the data may distort the means of these fundraisers. We again keep the interpretable outliers in the dataset for the analyses, but exclude from the mean calculations (table 3.1) the observations whose goal is larger than the third IQR, where IQR3 = 10,000 for successful fundraisers (n = 821) and IQR3 = 150,000 for unsuccessful fundraisers (n = 110).

This total number of fundraisers is broken down into two groups: fundraisers that achieved their monetary goal and fundraisers that did not, referred to as successful and unsuccessful fundraisers, respectively. As can be seen in table 3.1, the degree to which fundraisers’ goals are realized equals 36.7% on average. A notable figure in table 3.1 is the degree to which successful fundraisers realized their donation goal. On average, these fundraisers realized 163.2% of their monetary goal, implying that beneficiaries of these successful fundraisers collected more money than they asked for.

similar, successful fundraisers created by men receive donations that sum up to more than twice the donation goal. Women, on the other hand, receive donations that sum up to 1.3 times their donation goal. For unsuccessful fundraisers, however, women realize a slightly higher percentage of their donation goal than men do.

Table 3.1: Descriptive statistics of fundraisers in the dataset

Group    Fundraisers    Cumulative fundraiser share (%)    Mean donation sum    Mean donation goal    Achieved (%)

In terms of the categories in which beneficiaries can place their fundraiser, some categories are represented considerably more than others; the categories “medical”, “funerals”, “other”, and “accidents” jointly represent roughly 69% (n = 13,595) of the dataset, whereas the 13 other categories comprise the other 31% (n = 5,152) of the observations (table 3.1). The categories “celebration”, “competition”, and “funerals” appear to have substantially higher achievement rates compared with other categories. On the contrary, “business”, “nonprofit”, “missions”, and “travel” show the lowest achievement rates. In the LDA, prevalent topics of the “general” dataset will be compared with prevalent topics in individual categories to check whether or not there is a different topic prevalence. If any differences show up between those, we plan to conduct the LDA, MDSA, and subsequent analyses individually per category for the sake of exploring more in-depth topics.

Lastly, concerning the presence of images and/or videos, 2,287 (11.6%) beneficiaries did not append an image or video to their fundraiser’s text. Of these fundraisers without appended images and/or videos, 1,804 (78.9%) did not achieve their donation goal, whereas 483 (21.1%) did. The group that did append at least one image or video counts 17,391 beneficiaries, with a mean of 4 (3.9) images/videos and a standard deviation of 5 (5.3). Here, 14,366 (82.6%) did not achieve their donation goal, whereas 3,025 (17.4%) did. The similarity of the figures of these two groups may indicate the absence of a correlation between fundraiser goal achievement and the appending of images and/or videos. However, we can’t conclude whether or not this is a significant relationship and plan to analyze this effect by use of the regression model(s).

4 METHODOLOGY

Two methods will be used to determine the topic(s), emotion, and sentiment of the presented information frame: Latent Dirichlet Allocation (LDA) to determine the presented topics (4.1) and Multidimensional Sentiment Analysis (MDSA) to determine the emotionality and sentiment of the information frame (4.2).

4.1 Latent Dirichlet Allocation

4.1.1 Method

In this paragraph, we base our methodology of the LDA on the articles of Berger et al. (2020) and Büschken & Allenby (2016) as these articles thoroughly discuss the LDA method and outline key elements and assumptions of this method.

The text analysis method “Latent Dirichlet Allocation” is used regularly among text-mining academics. LDA is a suitable method for topic modeling, as it extracts topics more accurately from larger textual documents than other text analysis methods such as Poisson Factorization (Berger et al., 2020). In LDA, documents are seen as “bags of words” in which words contribute to latent topics in the text (Büschken & Allenby, 2016). This, however, does not mean that words belong to a single topic, but rather have a different probability

for one topic, but a different probability of occurrence for other topics. For each topic, the 10 words that present the highest probability of occurrence in that given topic are used to determine the character of the topic.

Topics are defined as word distributions that commonly co-occur and thus have a certain probability of appearing in a topic. The LDA model thus presumes a fixed number k of topics among all observations’ documents, creating a latent topic pool, and assumes that each document d is characterized by a combination of topics $z_{dn}$, noted as $\theta_d$ (Berger et al., 2020). For all words w present in the data there is a probability of occurrence per topic, but only those that have a high probability of occurrence characterize the topics in question. This denotes the existence of different combinations of topics $\theta_d$ among documents as a result of different probabilities of topic occurrence, as well as the underlying word-probabilities $w_{dn}$ per topic.

LDA mimics a data-generating process in which a writer chooses a topic to write about, and then chooses the words to express this topic (Berger et al., 2020, p. 11). Accordingly, the nth word is presumed to be generated given the chosen topic, whereas the probability of occurrence of this word stems from a matrix of word-topic probabilities defined as $p(w_{dn} \mid z_{dn}, \Phi)$, where $\Phi$ is a matrix of word-topic probability vectors $\{\phi_{m,t}\}$ for word m and topic t (Büschken & Allenby, 2016, p. 3). Hence, the following formula will be used for the LDA:

$$p(w_{dn} = m \mid z_{dn} = t, \Phi) = p(m \mid \phi_t) = p(\phi_{m,t}) \tag{4.1}$$

As topics t and words m each take more than two values in the model, they are drawn from multinomial (discrete) distributions, whose parameters are in turn given Dirichlet priors (Büschken & Allenby, 2016, p. 3), defined by:

$$p(\theta_d) \sim \mathrm{Dirichlet}(\alpha), \quad p(\phi_t) \sim \mathrm{Dirichlet}(\beta) \tag{4.2}$$
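The data-generating process that LDA presumes can be made concrete with a small simulation. The Python sketch below is an illustration under stated assumptions, not the estimation procedure used in this study: it draws the Dirichlet vectors by normalizing gamma draws, uses a toy vocabulary, and stores the word-topic probabilities as one list per topic (`phi[t][m]`).

```python
import random


def sample_dirichlet(concentration, size, rng):
    """Draw from a symmetric Dirichlet by normalizing independent gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [x / total for x in draws]


def generate_document(vocab, n_topics, n_words, alpha, beta, seed=0):
    """Simulate one document: topic mixture theta, topic-word matrix phi, words."""
    rng = random.Random(seed)
    phi = [sample_dirichlet(beta, len(vocab), rng) for _ in range(n_topics)]
    theta = sample_dirichlet(alpha, n_topics, rng)
    words = []
    for _ in range(n_words):
        t = rng.choices(range(n_topics), weights=theta)[0]       # topic of word n
        m = rng.choices(range(len(vocab)), weights=phi[t])[0]    # word given topic
        words.append(vocab[m])
    return theta, phi, words


vocab = ["surgery", "hospital", "funeral", "family", "school", "tuition"]
theta, phi, words = generate_document(vocab, n_topics=2, n_words=20, alpha=0.5, beta=0.5)
print(theta)
print(" ".join(words))
```

Estimating an LDA model amounts to reversing this process: given only the observed words, the topic mixtures and word-topic probabilities are inferred.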


4.1.2 Selecting an appropriate number of topics

A disadvantage of an LDA is the fact that there’s no universal consensus upon an ideal number of topics, as this heavily depends on the data that is used in the research (Berger, et al., 2020). There are, however, four measures that can indicate an appropriate number of topics in the extracted latent topic pool and which will be applied to the dataset to assist in deciding the number of topics to be extracted.

The first measure is proposed by Arun et al. (2010), who describe their measure as reaching a minimum at an appropriate number of topics. This measure is based on the Kullback-Leibler (KL) divergence, which is a measure of how one probability distribution ($\theta_d$ under k topics) differs from another with a different number of topics (k+1) (Arun, Suresh, Veni Madhavan, & Narasimha, 2010). The second measure is the method of Cao et al. (2009). They propose this method as it indicates the best fitting model based on topic (cluster) density, so that “the similarity will be as large as possible in the intracluster, but as small as possible between inter-clusters.” (Cao et al., 2009, p. 1778), referring to the word-topic vector $\phi_t$. According to this measure, the ideal number of topics is where the measure reaches a minimum. The third measure is presented by Griffiths & Steyvers (2004). They base their measure on the changes in the posterior distribution when one topic is added to vector $\theta_d$. The number of topics at which the plotted distribution changes reach a minimum indicates the appropriate number of topics for this measure. The last measure is the metric of Deveaud, SanJuan & Bellot (2014), who use the Jensen-Shannon divergence, which is maximized at an appropriate number of topics.
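Two of these measures build on divergences between probability distributions. The Python sketch below shows only these building blocks, the Kullback-Leibler and Jensen-Shannon divergences for discrete distributions; it does not reproduce the full metrics of Arun et al. (2010) or Deveaud et al. (2014), and the two example distributions are made up for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q); requires q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetrized KL against the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Two made-up word distributions over a four-word vocabulary.
topic_a = [0.70, 0.20, 0.05, 0.05]
topic_b = [0.10, 0.10, 0.40, 0.40]

print(kl_divergence(topic_a, topic_b))
print(js_divergence(topic_a, topic_b))
```

Unlike the KL divergence, the Jensen-Shannon divergence is symmetric and bounded by log 2, which makes it convenient for comparing pairs of topics.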

Ideally, these four measures individually indicate the same number of topics. However, in case of the measures showing inconsistent results, not all measure results will be used and instead topic interpretation will be used as decisive means to determine the appropriate number of topics. That is, the LDA will then be run with the different numbers of topics as proposed by the different measures whereas the researcher manually checks whether or not there are any meaningful differences in interpretation when selecting a higher number of topics.

4.1.3 Interpreting topics

manually interpreted by the researcher, after which it receives a name, based on the words included in the group while also accounting for within-topic “weights” of the words, indicated by the probability of occurrence in vector $\phi_t$.

4.2 Multidimensional Sentiment Analysis

Besides topic modelling with LDA, the emotion and sentiment of fundraiser texts are also part of the presented information frame and will be analyzed by a Multidimensional Sentiment Analysis (MDSA). The MDSA approach is proposed by Chapman (2020), who states that an MDSA is an adapted version of the sentiment analysis described by Berger et al. (2020) and comprises sentiment and emotion, rather than only a unidimensional sentiment. Although Berger et al.’s (2020) method is relatively easy to implement, being able to determine emotion combined with sentiment can provide more in-depth knowledge about a document’s emotional dimensions (Chapman, 2020). The emotions on which documents are scored are based on Plutchik’s wheel of emotions (Plutchik, 1980). Hence, an MDSA results in document-specific scores on sentiment (positive, negative) and 8 emotions (anger, anticipation, disgust, fear, joy, sadness, surprise, and trust), as well as the negated versions of the 8 emotions: where the unigram “scared” increases the emotion score of “fear”, the bigram “not scared” increases the negated emotion score “negated fear”.

Table 4.1: Emotion valence

Emotion        Source                                   Valence     Negated valence
Anger          Ekman, 1999                              Negative    Positive
Anticipation   Macleod & Byrne, 1996                    Undefined   Undefined
Disgust        Izard, 1977                              Negative    Positive
Fear           Izard, 1977                              Negative    Positive
Joy            Watkins, Emmons, Greaves, & Bell, 2017   Positive    Negative
Sadness        Ekman, 1999                              Negative    Positive
Surprise       Noordewier & Breugelmans, 2013           Negative    Positive

A note of criticism on Berger et al.’s sentiment analysis (2020) and Chapman’s MDSA (2020) is, however, that the dictionaries from which the sentiment and emotion scores are taken may be subject to invalidity due to differences in linguistic characteristics between regional dialects and research domains (Chapman, 2020). That is, the word “sick” may have a negative meaning in medical documents, but a positive meaning in documents written by youth. Dictionaries should thus ideally be created specifically for the domain they’re used in and be adjusted to the regional linguistic characteristics of the creator(s) of the text that will be analyzed. This is, however, unfeasible in numerous situations, and the researcher should therefore account for imperfections in the dictionaries, as these may invoke misinterpretations of the documents’ emotional dimensions (Chapman, 2020). A dictionary in which the probability of invalidity is thought to be low is the National Research Council Canada sentiment dictionary (NRC) (Mohammad & Turney, 2013). This dictionary was created by means of crowdsourcing through Amazon’s Mechanical Turk without selecting “raters” on demographic and geographic characteristics. This way, invalid emotion and sentiment scores due to regional linguistic dialects, different intellectual levels, and one’s (professional) background can be substantially decreased, resulting in a dictionary in which the emotion and sentiment scores are generally agreed upon (Mohammad & Turney, 2013).

The emotion scores are calculated as the ratio between the number of words with emotional value on a given emotion and the total number of words in the document. The negated emotion scores are the opposite of the regular ones, recognized by a preceding “un-” in unigrams and “not” in bigrams. For this reason, it is expected that the negated emotions have lower emotion scores (ratios) than the “normal” emotions.

4.3 Data suitability and operationalization

The content of the scraped fundraisers is appropriate for our analyses because, firstly, the provided information includes the self-written text of beneficiaries. These self-written texts of the fundraisers on GFM will be used to determine three components of the information frame of the fundraiser: topic (intensity), emotion and sentiment. Secondly, we are able to determine the presence of images and videos, as beneficiaries can append images and/or videos to their fundraiser. Thirdly, other provided metrics (the number and total sum of realized donations, the number of (unique) donors, and the monetary goal of the fundraiser) enable us to shed light on the elicited donation behavior towards the beneficiary. Lastly, the name of the creator of the fundraiser is used to determine the gender of the fundraiser’s creator, which is necessary for the control variable “gender” in the research framework.

By using the LDA and MDSA methods, the data can be structured in a way that is appropriate for further statistical analyses. We discuss the endeavors applied to each variable that is used to answer the proposed research question in following subsections.

4.3.1 Information frame - Topic

The individual observations’ documents (texts) are characterized by their own vector $\theta_d$, containing probability estimates for every topic resulting from the LDA. Individual observations receive an estimate with a range of [0,1] on every topic that emerges from the LDA, representing the topic intensities. Because these intensities are interpreted as probabilities of occurrence based on the topic-word probabilities of occurrence, the topic intensity values of the k topics in the document-specific topic pool $\theta_d$ sum to 1. This introduces perfect multicollinearity in the topic intensity scores, as the intensity score of topic k can be calculated from the other k − 1 scores with the formula $\theta_{d,k} = 1 - \sum_{i=1}^{k-1} \theta_{d,i}$. Mean-centering is not appropriate here: it does not solve multicollinearity and only eases interpretation when a zero value has no meaningful interpretation. That is not the case for topic intensity, where a value of 0 means that the topic is not addressed in the text at all. Hence, we leave out topic k in the (logistic) regression analyses. By leaving out this topic intensity, we solve
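The redundancy of the last topic intensity can be shown with a short Python sketch. The function names and the example vector are illustrative assumptions; the point is only that the omitted intensity is fully determined by the remaining ones.

```python
def drop_baseline_topic(theta_d):
    """Drop the last topic intensity so the remaining columns are not collinear."""
    if abs(sum(theta_d) - 1.0) > 1e-9:
        raise ValueError("topic intensities are expected to sum to 1")
    return theta_d[:-1]


def recover_baseline(reduced):
    """The omitted intensity follows from the others: 1 minus their sum."""
    return 1.0 - sum(reduced)


theta_d = [0.5, 0.3, 0.2]                 # made-up intensities for k = 3 topics
reduced = drop_baseline_topic(theta_d)
print(reduced, recover_baseline(reduced))
```

In a regression, the coefficients of the retained topics are then read relative to the omitted topic.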

4.3.2 Information frame – Emotion and sentiment

The emotion and sentiment of a document are determined by use of the sentimentr package in RStudio. For the emotion determination, the ratio ([0,1]) between the number of words with emotional value on a given emotion and the total number of words is calculated over the entire document, rather than per sentence or unigram. The reasoning for this is that sentences in the document may have different lengths: when determining emotion scores on sentence level, each sentence receives an emotion score on every (negated) emotion, after which the mean of these scores would be calculated. This is not representative for the data, as long sentences with high emotion scores may get cancelled out in the mean when many other sentences have low to no emotion scores. Calculating the emotion scores on document level instead provides a representative ratio of emotions over the entire document. Words preceded by “not” contribute to the negated versions of the given emotions, as do words that would have contributed to a given emotion but start with “un”. Words contributing to negated emotion scores do not contribute to the “normal” emotion score.

The negated emotion scores are subtracted from the “normal” emotion scores. By doing so, the regression analyses include fewer variables, which helps to avoid distorted estimates, limits the loss of degrees of freedom, and reduces the difference between R2 and adjusted R2. We do acknowledge that this decision penalizes the completeness of the model, but it increases the simplicity of the model (Little, 1970).
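The document-level scoring and the subtraction of negated scores can be sketched as follows. This Python example is a simplified illustration, not the sentimentr implementation: the fear lexicon is a hypothetical three-word stand-in for the NRC dictionary, and negation is detected only through a preceding “not” or an “un” prefix, as described above.

```python
import re

# Hypothetical miniature lexicon; the study uses the NRC dictionary instead.
FEAR_WORDS = {"scared", "afraid", "terrified"}


def fear_scores(text, lexicon=FEAR_WORDS):
    """Return (normal, negated, net) fear ratios over all words of a document."""
    tokens = re.findall(r"[a-z']+", text.lower())
    normal = negated = 0
    for i, tok in enumerate(tokens):
        if tok in lexicon:
            if i > 0 and tokens[i - 1] == "not":
                negated += 1          # bigram such as "not afraid"
            else:
                normal += 1
        elif tok.startswith("un") and tok[2:] in lexicon:
            negated += 1              # unigram such as "unafraid"
    n = len(tokens)
    if n == 0:
        return 0.0, 0.0, 0.0
    return normal / n, negated / n, (normal - negated) / n


print(fear_scores("We are scared and terrified but not afraid today"))
```

In this example, two of the nine words count towards fear and one towards negated fear, so the net score equals one ninth.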

The sentimentr package in RStudio is also used for the sentiment determination, for which each individual document is parsed into word-tokens. These word-tokens are analyzed for the sentiment they represent, where the sentiment score has a range of [-5,5]. These scores also stem from the NRC lexicon. The word-tokens are then renested into their original observation by averaging the sentiment of the word-tokens, thus providing each document with a mean sentiment score that ranges from -5 to +5. For sentiment scores smaller than 0, the document is considered to represent a negative sentiment, whereas a sentiment score larger than 0 represents a positive sentiment.

4.3.3 Information frame – Image/video presence

and/or videos in the fundraiser is collected during the scraping of the data. The independent variable image/video presence will thus be a binary dummy (yes/no), wherein a 1 will be assigned when the number of images and/or videos is larger than 0.

4.3.4 Donation behavior

For donation behavior, we test the correlation of both the total sum of donations and the average donation volume with the probability of goal achievement, as we expect that higher values of these two DV metrics ultimately increase the probability of goal achievement. The correlation between the total sum of donations and the probability of goal achievement is .32, indicating a moderate degree of correlation. A regression analysis on the percentual degree of goal completion (from which the dummy “completion” is derived, taking the value 1 where this degree is ≥ 1) shows a highly significant effect of the total sum of donations (p < .01). Hence, we decide not to regress the information frames’ components on the total sum of donations. For the average donation volume, a low degree of correlation is found (.08), and no significant effect was found in the regression analysis on the percentual degree of goal completion. We therefore decide to conduct two analyses to investigate the effect of the information frames’ components on donation behavior: their effect on the average donation volume by means of a linear regression analysis, and their effect on the probability of goal achievement by means of a logistic regression.
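As an illustration of the correlation check described above, the following Python sketch computes a Pearson correlation between a continuous metric and the binary completion dummy. The function name `pearson` and the five example fundraisers are hypothetical and serve only to show the calculation; they are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical example values, not taken from the dataset.
total_donations = [120.0, 900.0, 40.0, 1500.0, 300.0]
completion = [0, 1, 0, 1, 0]  # 1 = monetary goal achieved

r = pearson(total_donations, completion)
```

With one binary variable, the Pearson coefficient is equivalent to the point-biserial correlation, so the same function covers both donation metrics.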

The average donation volume is calculated by dividing the total sum of donations by the total number of donations. We do not divide by the number of unique donors (the number of unique individuals that have donated), because individuals can donate more than once to a single fundraiser, which would distort the average donation volume.

Using the percentual degree of goal achievement, which is calculated by dividing the total sum of donations by the monetary fundraiser goal, we can label fundraisers as successful where the degree of goal achievement is ≥ 1. The “completion” dummy variable is set to 1 for successful fundraisers.
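The two dependent variables defined above can be derived as in the following Python sketch. The function names are hypothetical, and returning 0.0 for a fundraiser without any donations is an assumption, since the text does not state how such cases are handled.

```python
def average_donation_volume(total_sum, n_donations):
    """Total sum of donations divided by the total number of donations."""
    if n_donations == 0:
        return 0.0  # assumption: fundraisers without donations get an average of 0
    return total_sum / n_donations


def goal_achievement(total_sum, goal):
    """Degree of goal achievement: total sum of donations divided by the monetary goal."""
    return total_sum / goal


def completion_dummy(total_sum, goal):
    """1 if the degree of goal achievement is >= 1, otherwise 0."""
    return 1 if goal_achievement(total_sum, goal) >= 1 else 0
```

For a hypothetical fundraiser that collected 1,200 from 30 donations against a goal of 1,000, the average donation volume is 40, the degree of goal achievement is 1.2, and the completion dummy is 1.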

4.4 Hypothesis testing

The research framework in the literature section describes an information frame of a

presence of images/videos), the effect of the information frame of the fundraiser on the elicited donation behavior towards that fundraiser can be estimated. The effect of an information frame on the average donation volume will be assessed by running a linear regression model with all combinations of topics, emotion, sentiment and the use of images/videos.

The intensity of each topic is calculated per individual fundraiser and has a range of [0,1], and each element of the emotion vector likewise has a range of [0,1]. The explanatory variables “sentiment” and “use of images/videos” are both binary dummies. A binary logistic regression with dummy variables is appropriate for estimating the effects of the fundraisers’ information frame on the probability of achieving the monetary goal of the fundraiser. In addition, because the gender of the fundraiser’s creator has been derived from the creator’s name, the effect of gender on the elicited donation behavior can be controlled for. This is again a dummy variable, where 0 represents a female and 1 a male creator.

The introduced binary logistic regression results in a vector $\theta_\pi$, where:

$$\pi_i = P\left(Y_i = 1 \mid \alpha_{(i)} + \beta_i x_i + \varepsilon_{(i)}\right), \quad i = 1, \dots, N \tag{4.3}$$

where:

$\pi_i$ = probability that the fundraiser’s goal is achieved, with range [0,1]

$Y_i$ = binary completion variable, with

$$Y_i = \begin{cases} 1 & \text{if } \pi_i > 1 - \pi_i \\ 0 & \text{if } \pi_i \le 1 - \pi_i \end{cases}$$

$\alpha_{(i)}$ = intercept

$\beta_i$ = parameter slope (estimate) $i$, for $i = 1, \dots, N$

$x_i$ = independent variable $i$, for $i = 1, \dots, N$

$\varepsilon_{(i)}$ = error term
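To make the pieces of equation 4.3 concrete, the following Python sketch evaluates a fitted binary logistic model for one fundraiser. It assumes the standard logit link, under which the probability is the logistic function of the linear predictor; the function names and any coefficient values passed in are hypothetical. The last helper expresses an estimate as a percentage change in the odds, using the exponent of the estimate minus 1.

```python
import math


def predicted_probability(alpha, betas, xs):
    """Logistic function of the linear predictor alpha + sum(beta_i * x_i)."""
    linear_predictor = alpha + sum(beta * x for beta, x in zip(betas, xs))
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def classify(pi):
    """Completion variable: 1 if pi > 1 - pi (that is, pi > 0.5), otherwise 0."""
    return 1 if pi > 1 - pi else 0


def percent_change_in_odds(beta):
    """Percentage change in the odds of goal achievement per unit increase in x."""
    return (math.exp(beta) - 1) * 100
```

Under this reading, a positive estimate raises the odds of goal achievement and a negative estimate lowers them, while an estimate of 0 leaves the odds unchanged.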

A binary logistic regression analysis results in estimates that are not directly interpretable. To be able to interpret these, the difference between 1 and the exponent of the estimate

Preliminarily, we do not expect similar outcomes for the different dependent variables, as the literature review outlines the contradictory findings of earlier research. In addition, we strive to explore which relationships between explanatory and dependent variables hold, which do not, and whether there are any other relationships that have been unknown thus far.

5 RESULTS

5.1 Fundraiser category distinction

Until now, little attention has been paid to the categories of the fundraisers. Initially, the key idea in answering the research question was to extract topics from all fundraisers without distinguishing the categories they are posted in. The test measures indicate that the model fits best when a total of 12 topics is extracted (appendix 1). After extraction of these 12 topics, however, the resulting topic pool showed topic-word distributions containing words that can more or less be interpreted as the categories in which the fundraisers can be posted (appendix 2). That is, topics such as “surgery” (topics 3 and 4), “animals” (topic 5), “school” (topic 10), and “religion” (topic 7) emerged from the LDA. These topics are expected to have little explanatory power in the model. Moreover, as we strive to uncover more specific topics rather than the categories we already knew in the first place, we decide to divide the data into different datasets based on the category of the fundraiser and to re-run the LDA.

As discussed in subsection 3.2, a dataset that is used for LDA needs a sufficiently large set of words contained in the observations. As there are 19 different categories, the individual category subsets of the original dataset no longer meet the word threshold for validity. Although this decision causes a limitation for our research, we continue by analyzing the effect of a fundraiser’s information frame on the elicited donation behavior using the two categories that are most represented in the dataset in terms of their number of tokens: “Medical” (n=6,224, containing 711,051 tokens) and “Other” (n=2,219, containing 218,503 tokens). Although the category “Funerals” has more observations (n=3,011), the sum of its tokens is lower (n=191,408). As we value the number of tokens more than the number of observations, because the LDA determines the topics based on tokens, we prefer the “Other” category over the “Funerals” category. The “Medical” and “Other” fundraisers jointly represent 42.9% of the observations (n=8,443) and 42.3% of the word tokens (n=902,459) from the general dataset. Two categories are chosen in order to be able to see whether or not findings can be
