
Predicting kickstarter campaign funding percentage through sentiment expressed within YouTube comments


Academic year: 2021




Bram Walda 11423285 University of Amsterdam

‘Predicting Kickstarter Campaign Funding Percentage Through Sentiment Expressed within YouTube Comments’

Date of Submission: 22 June 2018 (Final Version)
Qualification: MSc Business Administration
Track: Digital Business
Name: B.J.J. Walda
Student Number: 11423285
Email: bram.walda@hotmail.com
Institution: University of Amsterdam
Supervisor: Andreas Alexiou
Word count: 12,184


Statement of Originality

This document is written by Bram Walda, who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Acknowledgements

I would like to thank a few people who have helped me greatly during the process of writing this research project. In particular, I would like to thank my thesis supervisor, Andreas Alexiou, who helped me create the outline of the project and provided thorough feedback and insights. Furthermore, I would like to thank Lexalytics for giving me the ability to use up to 50,000 sentiment analysis queries within the demo environment of Semantria. Additionally, I would like to thank Edo Santema and Sunny de Blok, who have been very helpful during the process of writing and shaping the outline of this project.


Abstract

Due to increased barriers to obtaining traditional funding and the rise of social media, crowdfunding has gained considerable popularity among entrepreneurs as a method of financing their ideas. This research pioneers in examining whether sentiment analysis can be used as a predictive and explanatory tool for future events in crowdfunding. Social media are recognized as an influencing factor in individuals' decisions to fund crowdfunding campaigns. This research examines the use of sentiment analysis of YouTube comments as a tool to explain and predict the funding percentage of Kickstarter campaigns. A YouTube comment scraper is used to mine comments together with their metadata for a sample of 85 Kickstarter campaigns. The textual data are assessed with the Semantria sentiment analyser to examine their predictive and explanatory power. Moderating variables, which are considered proxies for user attention, are added to examine whether engagement reinforces the proposed relation. Current academic literature indicates that sentiment expressed on social media can be used as a predictor for product sales and revenues. This research examines whether this is generalizable to crowdfunding campaigns.

The results indicate that the sentiment expressed within YouTube comments has insignificant explanatory power for Kickstarter campaign funding percentage. Hence, there is no significant correlation between expressed sentiment and funding percentage. Adding moderating variables, such as the number of likes and the number of views, does not reinforce the relation. This conflicts with previous academic research and therefore imposes new limitations on sentiment analysis and its possible application within the domain of crowdfunding.


Table of Contents

1. INTRODUCTION
1.1 RESEARCH CONTEXT
1.2 RESEARCH OBJECTIVE & METHOD
1.3 STRUCTURE OF THE RESEARCH
2. LITERATURE REVIEW
2.1 CROWDFUNDING
2.1.1 Types of crowdfunding and platforms
2.1.2 Why are individuals funding?
2.1.3 Prediction of campaign success
2.1.4 Kickstarter as a crowdfunding platform
2.2 ONLINE SOCIAL NETWORKS
2.2.1 Rise of Online social networking and social media
2.2.2. Social media Monitoring and Analysis
2.2.3 Predictive use of social media
2.2.4 Social networks and user behaviour
2.2.5 YouTube as social network
2.3 SENTIMENT ANALYSIS
2.3.1. Sentiment Analysis method
2.3.2. Sentiment
2.3.3 Sentiment and user behaviour
2.3.4 Sentiment Analysis Tools
2.4 HYPOTHESES AND CONCEPTUAL MODEL
2.4.1 Hypotheses
2.4.2 Conceptual Model
3. METHODOLOGY & DATA COLLECTION
3.1 RESEARCH APPROACH
3.2 DATA COLLECTION
3.2.1. Kickstarter Campaign Data
3.2.2. YouTube Comments Data Set
3.2.3 Sentiment Scores Data set
3.3 CLEANING THE DATA
3.3.1. File formats
3.3.2. Kickstarter data sets
3.3.3 YouTube data set
3.3.4 Sheet consolidation
3.4 SAMPLING METHOD
3.5 ANALYSIS OF THE DATA
3.5.1. Sentiment Analysis
3.5.2. Statistical Analysis
4. RESULTS
4.1 SAMPLE CHARACTERISTICS
4.2 DESCRIPTIVE STATISTICS ON CATEGORIES
4.3 REGRESSION ANALYSIS
4.4 MULTIPLE REGRESSION ANALYSIS
5. DISCUSSION
5.1. KEY FINDINGS
5.2 THEORETICAL AND PRACTICAL IMPLICATIONS
5.3 LIMITATIONS AND FUTURE RESEARCH
6. CONCLUSION
REFERENCES
APPENDICES
APPENDIX 1 – META-DATA OVERVIEW KICKSTARTER CAMPAIGN DATASET
APPENDIX 2 – SENTIMENT KICKSTARTER CAMPAIGN DATASET
APPENDIX 3 – NORMALITY CHECK DEPENDENT VARIABLE
APPENDIX 4 – NORMALITY CHECK INDEPENDENT VARIABLES


1. Introduction

1.1 Research Context

There have always been barriers to raising financial capital in traditional ways. These barriers increased after the most recent financial crisis, when traditional funds dried up and angel investors stopped investing for a period (Block & Sandner, 2009). This allowed crowdfunding to gain popularity as a way of financing projects, ideas, products or services. This relatively new way of funding tries to obtain capital from a large audience, where each individual pledges a small amount (Belleflamme et al., 2010; Freedman & Nutting, 2015). Crowdfunding originates from the concept of crowdsourcing, which was elaborated on by Jeff Howe and Mark Robinson (2008). Crowdsourcing is based on obtaining information from a large group of people as input into a task or project. It gives the general public the ability to contribute to the development of a campaign, whereas crowdfunding gives the general public the ability to support a campaign financially (Belleflamme et al., 2010). Crowdfunding first gained awareness in 2003, when a Boston musician and computer programmer created the platform ArtistShare (Freedman & Nutting, 2015). What started as a platform where artists could ask their fan base for donations to produce digital records grew into a fundraising platform for video, photo, film and more. Thanks to the success of ArtistShare, other funding platforms were launched, among them the well-known platforms Indiegogo in 2008 and Kickstarter in 2009 (Freedman & Nutting, 2015).

Social media play an essential role in crowdfunding. The ability to quickly and easily connect to a potential user base that can generate feedback, help to fund the campaign or provide possible solutions is a massive advantage for the entrepreneur (Agrawal et al., 2015). The number of funding platforms that can be used both by individuals willing to pledge and by entrepreneurs starting campaigns exploded around 2010. According to Statista, there were over 1200 crowdfunding platforms around the world in 2014, and this number is only increasing (Statista, 2014).

The rise of social media had a great effect on the expansion of crowdfunding platforms. The definition of social media has been researched exhaustively over the last couple of years; however, it was already defined by Kaplan and Haenlein (2010) as “a group of internet-based applications that build on the ideological and technological foundations of Web 2.0, and that allow the creation and exchange of User-Generated Content”. Social media is based on three elements: content, community, and Web 2.0 (Ahlqvist et al., 2008). Web 2.0 refers to the fact that internet users are stimulated to participate on the internet rather than solely gather information. User generated content (UGC) is therefore a term that has been established during this internet age. Apart from generating content, internet users also engage in publishing, sharing, commenting, voting, and recommending (Moisseyev, 2013). All these aspects of engagement are very welcome for entrepreneurs who are interested in funding their project: they can validate what a large group of people think of their project before actually starting production. With over 3.2 billion social media users, over 5 billion mobile users and over 3 billion mobile social media users worldwide, the impact and reach of raising funds via social media could be enormous (Kemp, 2018).

Over the last decade, social media usage by businesses has grown immensely; it has been at the top of firms’ agendas since 2008/2009 (Kaplan & Haenlein, 2010). Firms try to reach consumers via social media, but also to control their brand image, increase their brand awareness and provide customer support. Accompanying this increase, demand for social media analysis and monitoring has risen. One of these analysis methods is intended to measure the sentiment of opinions and was developed around 2010. The essence of this method lies in opinion mining and getting to know what people think, which has always been an essential part of our information gathering (Pang & Lee, 2008).

As Lassen, La Cour and Vatrapu (2017) mention in their research about predictive analytics with social media data, the use of sentiment analysis and similar methods to analyse UGC via social media is growing. The ultimate goal is to predict future events with data resulting from the widespread adoption of social media, such as movie revenues (Asur & Huberman, 2010), Apple iPhone sales (Lassen et al., 2014), election outcomes (Tsakalidis et al., 2015), and stock prices (Karabulut, 2015). Predicting a future outcome could be of high value in many sectors. Several factors in the field of crowdfunding campaign prediction have been studied, for example the use of social data from Twitter (Etter et al., 2013), the analysis of project updates (Xu et al., 2014), and the project design (Greenberg et al., 2013). However, an analysis of the sentiment expressed online over the course of many campaigns to predict the funding outcome is absent from current academic literature. This research pioneers in the field of predictive analytics by trying to explain and predict the level of funding of crowdfunding campaigns with sentiment expressed on the internet.

1.2 Research Objective & Method

This research pioneers in trying to explain and predict funding percentage with sentiment; the same holds for using YouTube as a source to assess whether individuals feel positive or negative towards a campaign. Kickstarter does not require a video to launch a campaign. However, it encourages project initiators to upload one, since projects with a compelling video tend to succeed at a much higher rate (Kickstarter, 2018). Since the comments below these clips on YouTube often express sentiment, this could be an indicator of the success of a Kickstarter campaign (Siersdorfer et al., 2010). By scraping the comments on a Kickstarter campaign's video from YouTube, the relation between sentiment and the success of the campaign is examined. Success is not measured as a binary outcome (successful or not) but as the percentage of funding the campaign received relative to its funding goal. Data on the individual Kickstarter campaigns are obtained by downloading a Kickstarter data set provided on data.world (WebRobots, 2018). Additionally, it is examined whether this relation is strengthened by the number of likes, views, and comments. These factors are considered proxies for user attention and are therefore assumed to have a reinforcing effect. Based on this setup, the research question is formulated in two parts: first, examining the relation between sentiment and the funding percentage, and second, examining whether this relation is reinforced by engagement.
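The dependent variable described above, the funding received relative to the funding goal, can be illustrated with a minimal sketch. The function name, the field names and the two example campaigns are hypothetical and are not taken from the thesis data set; the sketch only assumes that a pledged amount and a goal amount are available per campaign.

```python
def funding_percentage(pledged: float, goal: float) -> float:
    """Return the pledged amount as a percentage of the funding goal."""
    if goal <= 0:
        raise ValueError("goal must be positive")
    return pledged / goal * 100.0


# Hypothetical campaigns (made-up values, not taken from the data set).
campaigns = [
    {"name": "campaign_a", "pledged": 5000.0, "goal": 10000.0},
    {"name": "campaign_b", "pledged": 15000.0, "goal": 10000.0},
]

for campaign in campaigns:
    pct = funding_percentage(campaign["pledged"], campaign["goal"])
    print(f"{campaign['name']}: {pct:.1f}% funded")
```

Unlike a binary success flag, this measure distinguishes a campaign that barely misses its goal from one that receives almost nothing, and one that just reaches its goal from one that is heavily overfunded.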

(1a.) To what extent does a positive sentiment of user generated YouTube comments on Kickstarter videos have a positive effect on the funding percentage of this Kickstarter campaign, and (1b.) to what extent does engagement in terms of likes, comments and views reinforce this relationship?

1.3 Structure of the Research

The structure of this thesis is as follows. First, an extensive review of previous academic literature is provided, covering all aspects that are central to the research question. The section that follows explains the research design and theoretical framework, followed by an elaboration of the sample. Data collection and data analysis methods are then described, followed by the results. Finally, a discussion and a conclusion are provided, including the limitations, the academic and managerial implications, and suggestions for future research. Additional appendices, which provide a better understanding of the data collection and data analysis, can be found at the end of the report.


2. Literature Review

The goal of this section is to enhance the understanding of the concepts used and to provide a discussion of previously written academic literature and proposed theories.

2.1 Crowdfunding

Crowdfunding is a method of collecting many small contributions to fund or capitalize on a specific campaign, often via online crowdfunding platforms (Freedman & Nutting, 2015). This way of funding ventures, ideas, products, and enterprises is relatively new in the realm of capital financing. Between the financial crisis of 2008-2009 and 2015, access to bank loans for small borrowers dried up. This created room for crowdfunding to develop. Together with low interest rates, this helped multiple peer-to-peer (P2P) financing types gain traction, and individuals were stimulated to participate in P2P financing as lenders (Bruton et al., 2015).

2.1.1 Types of crowdfunding and platforms

Four different types of crowdfunding are considered: debt-based, equity-based, donation-based and reward-based crowdfunding (Hebert, 2015). Debt-based crowdfunding is built on the concept of borrowing money from peers and paying it back with interest. P2P platforms enable small borrowers to access many small portions of financing in exchange for a percentage of the raised amount and a fee as commission (Freedman & Nutting, 2015). Equity crowdfunding, sometimes called ‘crowd-investing’, is a form of crowdfunding in which funders receive compensation in the form of the fundraiser’s equity-based revenue or profit-share arrangements (Wilson & Testoni, 2014). The problematic aspect of equity crowdfunding is that investors enter into agreements at a very early stage, so there are no guarantees. Ventures often lack track records, experience and assets, which makes valuation and evaluation very difficult (Harrison, 2013). Donation-based crowdfunding is, as the name suggests, a model where backers of crowdfunding campaigns provide the entrepreneur with donations. There is not always a reward for backing (Hebert, 2015). This is the only one of the four types that cannot be compared well to traditional venture capital, since it lacks the assembly of risk capital for entrepreneurial activities (Frydrych et al., 2014). The last type is the reward-based model, where backers contribute portions of capital to the campaign in exchange for rewards. In most cases, the reward is the item or service that will be produced by the entrepreneur. This is therefore also the form most often seen on crowdfunding platforms like Kickstarter and Indiegogo (Frydrych et al., 2014; Hebert, 2015). An example is the brand Pebble, which sold itself to Fitbit for $23 million in 2017. It created multiple Kickstarter campaigns in which the pledger would receive a watch for a pledge above a specified amount (Weinberger & Canales, 2018).

Comparing the above-mentioned forms of crowdfunding shows that the goals of the entrepreneur and the funder differ (Mollick & Kuppuswamy, 2014). The goal of the entrepreneur is to raise funding; what is exchanged, however, differs. The funder could receive a reward, interest or equity in the venture, or could support the realization of product market availability (Mollick & Kuppuswamy, 2014). Entrepreneurs who try to fund their idea by means of crowdfunding depend highly on platforms like Kickstarter and Indiegogo. The number of platforms has increased dramatically; in 2015 there were 375 crowdfunding platforms in the US alone. Globally, the platforms accounted for a funding volume of 34.4 billion U.S. Dollars in 2015, where just a little over 20% of the campaigns were successful (meaning they reached their funding goal) (Statista, 2016). The statistics show that the number of campaigns that are funded, or attempted to be funded, through crowdfunding is increasing over the years. As Freedman & Nutting (2015) argue, this could be the result of the accessibility of the mass public via the internet and well-known crowdfunding platforms.

2.1.2 Why are individuals funding?

Familiarity with the concept of crowdfunding and the different funding models raises the question of why individuals are motivated to fund in a P2P environment, since doing so entails some risk. The behaviour can be explained partly by the statements of Mollick & Kuppuswamy (2014), where the goal of the backer is either to realize product market availability or to receive a reward, interest or equity. Gerber, Hui and Kuo (2012) also argue that funders are motivated to participate because of feelings of connectedness towards a community with related principles and ideals. Since most funders are amateur funders, it is suggested that the need to feel part of a community is highly essential when funding (Gerber et al., 2012). Crowdfunding platforms thrive on the idea of being a social network, and therefore on creating an online community around the creation of new ventures. Backing behaviour is provoked by empathy and sympathy, guilt, happiness, and identity. This implies that backing behaviour is driven by the interpersonal connections between the funder and the creator (Gerber et al., 2012). According to Lassen, La Cour and Vatrapu (2014), social media play an important role in users' attention and potentially influence every step towards the actual action of purchasing or funding. This is described in the AIDA model, which will be elaborated on extensively in a later section.

2.1.3 Prediction of campaign success

The fact that connectedness and the interpersonal connection between the funder and the creator drive funding provides evidence that crowdfunding could be characterized as a social network. A previous study by Herkman & Brusse (2012) examined the relation between the number of clicks on a Kickstarter project and the networks of the entrepreneur on social media. A large but sparse network is suggested to have the best success rate. Mollick & Kuppuswamy (2014) also suggest that the number of social connections of the entrepreneur is positively related to the success of a crowdfunding campaign. In their research, they indicate that an entrepreneur having many friends on Facebook has a positive effect on the success rate of the campaign. More studies try to predict the success of a crowdfunding campaign based on several factors. Lu, Xie, Kong and Yu (2014) analyse the hidden connections between the fundraising results and the promotion campaigns on social media. They indicate that early promotional activities strongly correlate with the final outcomes of the campaign. This also indicates that social media is an essential medium to use when trying to fund a Kickstarter campaign: when a crowdfunding campaign starts promoting on social media early, the level of success will be higher. Greenberg, Pardo, Hariharan and Gerber (2013) examine the predictive power of sentiment expressed in the text of the crowdfunding campaign design. They created a tool with which entrepreneurs can assess their project design, using machine-learning classifiers. Overall, they were able to predict campaign success with 68% accuracy. This could be especially interesting for entrepreneurs who do not have any experience in starting a creative venture, since they could assess their campaign design with the help of the tool Greenberg et al. (2013) developed. Sawhney, Tran and Tuason (2015) show that language and campaign properties can moderately predict whether a campaign will be successful or not. They highlight the level of persuasion and how it can be used to influence the campaign outcome. Campaign pitches could be checked accordingly and improved when the predicted outcome is unfavourable.


The importance of social media in crowdfunding is also emphasized by Moisseyev (2013), who investigated the connection between social media assets and campaign results. He found that social media should be used very extensively after launching the campaign. Likes, for instance, are a strong indicator, as they affect every result that he researched. He indicates that 546 likes can be sufficient to fund a campaign (Moisseyev, 2013).

Multiple researchers have succeeded in predicting the outcomes, and explaining the level of funding, of crowdfunding campaigns. The methods used include assessing campaign design and examining the underlying connections between social media and the campaign. This research pioneers in using sentiment: sentiment expressed in YouTube comments on Kickstarter videos is suggested to be an explanatory factor.

2.1.4 Kickstarter as a crowdfunding platform

Kickstarter is one of the best-known crowdfunding platforms at the moment, with pledged funding of over 3.6 billion U.S. Dollars in 2017 and over 144,000 successful projects up to 2018 (Statista, 2018). It is the platform used most for crowdfunding purposes; consequently, Kickstarter is used in this research. It is focused on creativity and merchandising and sees its mission as helping to bring creative projects to life (Kickstarter, 2018). Anyone around the globe with an active internet connection is able to back projects on Kickstarter; however, it is only open to creators in 19 countries (all being western ones) (Kickstarter, 2018). Kickstarter generates revenue by applying a 5% fee to the total amount of funds raised. Additionally, its payment processor applies a fee of 3-5% (Kickstarter, 2018).

Several research projects have been conducted on Kickstarter as a crowdfunding platform. Kuppuswamy & Bayus (2008) found that rewards-based and donation-based campaigns on Kickstarter receive most funding in their first and their last week. Funding over time follows a bathtub shape. Most initial interest occurs at the beginning of the campaign; at a later stage, when the deadline draws near, individuals are motivated to fund in order to realize market availability. Mitra & Gilbert (2014) examined the level of persuasion in the Kickstarter campaign design. They found that the language has predictive power, explaining 58.86% of the variance of successful funding. This has implications for both backers and project creators. These findings are supported by Sawhney et al. (2015), who find a similar result. The results of both studies indicate that whether a campaign is successful or not can be predicted with some accuracy.

2.2 Online Social Networks

2.2.1 Rise of Online social networking and social media

Since it is crucial for users to feel a connection towards a community, social media and social networking platforms are very important to the concept of crowdfunding. The power of social media is exemplified by the Exploding Kittens campaign on Kickstarter in 2015. The creators, who introduced “a strategic kitty-powered card game based on Russian Roulette”, reached 87,825% of their funding goal, raising $8.7 million. The key to success was their social media usage: uploading videos, gifs, and images. The Facebook page had over 3.1 million likes and the Twitter account had over 480,000 followers. The founders reached their campaign goal within 20 minutes (Fontein, 2015).

The success of Kickstarter partly relies on the fact that it is an online social network itself and on the need for connectedness towards a community of funders. An online social network is defined as ‘a platform that allows people with similar interests to come together and share information, photos and videos’ (Rouse, 2018). By now, almost all internet users are directly or indirectly making use of social media and social networks. According to Ahlqvist et al. (2008), social media consist of three elements: content, user communities and Web 2.0 technologies. The rise of social media network websites in the Web 2.0 era enabled users to interact. Besides gathering information, they also contribute, participate and express themselves. The increasing involvement of the user enables a thorough assessment of an entrepreneur’s offer by a large group of individuals (Ahlqvist et al., 2008). The ultimate goal of a business that makes use of social media is obviously to benefit from it in one way or another. Platforms like Facebook, Twitter, Snapchat, YouTube and others can be used to create business value, for example by driving traffic and increasing customer retention and satisfaction through correctly branding the product to the market (Culnan et al., 2010). Businesses could also use social media as a customer service and support tool, which companies have widely adopted over the last couple of years. Adding a call to action or a link to purchasing pages could drive sales, and social media platforms could be used as a source to align product development with market needs (Culnan et al., 2010). In their research on social media usage by businesses, Culnan et al. (2010) stress that ‘the value comes not from the platform itself but how a particular social media platform is used, as any given platform can be used for a variety of purposes’.

The UGC that is generated in vast amounts contains very rich data about brand opinion, sentiment, satisfaction levels, retention indicators and more (Hanna et al., 2011). The need for social media monitoring and analysis, and therefore for tools, grew accordingly. Today there are numerous players in the market that claim to have the best social media monitoring tools, with integrated metrics and powerful analytics engines. In their research about social media monitoring tools, Stavrakantonakis et al. (2012) indicate that enterprises utilize many different methods and tools, both online and offline, to keep track of what their customers think and need. The use of online sources for collecting this information has advantages such as relatively low cost and time savings. Businesses are therefore increasingly adopting online sources over offline sources such as surveys and interviews (Stavrakantonakis et al., 2012).

2.2.2. Social media Monitoring and Analysis

Many online tools have been developed in recent years, providing added value by offering access to real customers’ opinions and engagement levels on an (almost) real-time basis. Because of the nature of these tools, there is hardly any need for sampling: the population as a whole can be taken into account, and the process is a lot faster since data collection and data analysis are integrated (Stavrakantonakis et al., 2012). In essence, over the last decade tools and methods have shifted from an explanatory point of view towards a predictive one. ‘Predictive analytics entail the application of data mining, machine learning and statistical modelling to arrive at predictive models of future observations as well as suitable methods for ascertaining the predictive power of these models in practice’ (Shmueli & Koppius, 2011). Explanatory statistical models are based more on relationships, and predictive models more on associations (Lassen et al., 2014). For businesses and in research it is obviously more interesting to predict future outcomes than to explain past behaviour; research on prediction therefore grew accordingly.

2.2.3 Predictive use of social media

Since social media generates vast amounts of data that often contain opinions and attitudes, it could be used to predict future events; however, this comes with some challenges. Several research projects have been conducted on the proposition that social media data can be used as a predictor of real-world events. Overall, many projects have succeeded, such as Asur & Huberman (2010), who predicted box office revenue with sentiment, and studies predicting disease outbreaks (Schmidt, 2012), product sales (Lassen et al., 2014; Dijkman & Ipeirotis, 2015; Voortman, 2015), and stock price movements (Karabulut, 2015). The reason why social media is so appropriate for future event prediction is emphasized by Asur & Huberman (2010), who mention that its data is a result of the opinions and thoughts of the collective population. This argument is also supported by Kalampokis, Tambouris and Tarabanis (2013), who mention that ‘social media is a vital component of the web as it enables the production of data that reflects personal opinions, thoughts and behaviours.’

Predictive analytics are valuable tools for businesses in multiple areas. A market study on social media monitoring tools by Fraunhofer IAO identifies multiple uses: (1) reputation management, (2) event detection, (3) competitor analysis, (4) trend and market research, (5) influencer detection and (6) product and innovation management (Kasper et al., 2010). Social media analysis tools go one step further and process the data into usable metrics. Sentiment analysis, or opinion analysis, is one reliable indicator of this process. According to Stavrakantonakis et al. (2012), the analysis should provide metrics about (1) reputation, (2) market insights, (3) identification of conversations to join, (4) competitor analysis and (5) product support or service development.

2.2.4 Social networks and user behaviour

As Lassen et al. (2014) mention in their research about predicting iPhone sales, the research field of predictive analytics is built on the assumption that social media actions such as likes, tweets, comments and ratings are proxies for user attention. Lassen et al. (2014) suggest the AIDA model to delineate the relationship between social media actions and the probability of taking action in the form of purchasing or other actions. The AIDA model stands for (1) attention, (2) interest, (3) decision and (4) action (Lassen et al., 2014). The AIDA model was initially proposed as a marketing tool; used for social media, it can represent any action to be taken across a wide range of social media marketing strategies and applications (Hassan et al., 2016).

UGC affects each step in the AIDA model. First, in the attention/awareness phase, individuals read media and social media information, which can result in awareness of an event or product. In the second step of the model, reviews and user ratings could generate liking, knowledge and interest. During the decision/preference phase, the social influence processes of identification and conformity could have an effect. Finally, in the action phase, the individual executes the specific action (Lassen et al., 2014). After having purchased or funded, hence having taken the actual action, many individuals evaluate their action and express their thoughts on social media. Here essential content is generated about a product, service or campaign. According to multiple studies this is called electronic word of mouth (e-WOM) (Gruen et al., 2006) (Chu & Kim, 2015) (Lee & Youn, 2009).

It is proposed that the AIDA model also holds for the funding/backing action in the crowdfunding process. Because of the similarity between marketing campaigns for product sales and the marketing of a crowdfunding campaign, UGC is expected to affect the decision process of an individual who is considering a funding/backing decision.

2.2.5 YouTube as social network

With over a billion active users, YouTube is the second largest online social platform, just after Facebook. Together these users watch circa a billion hours of video, generating billions of views (YouTube, 2018). YouTube is a public video-sharing website where people can view and share videos with varying engagement levels. From casual viewing to maintaining social connections via video, all is possible on YouTube (Lange, 2007). In 2005 the platform started off with just the simple functionality of sharing videos; since then, however, functionalities and features have been added. YouTube also offers users a personal profile page (channel page) and enables friending, following, sharing, subscribing, commenting and liking, amongst others. It can therefore be classified as a typical online social media platform and can be called a content community, which these days is actively used by youth and young adults to project identities and affiliate with particular social groups (Lange, 2007) (Smith et al., 2012). YouTube is a compelling factor in the AIDA model since it often reflects the (2) interest phase of the model. When people are getting interested in a product or service, they often turn to YouTube as a medium to watch reviews (Reino & Hay, 2016). There are many crowdfunding reviews online and accessible via YouTube (YouTube, 2018). Several YouTube studies relate to sentiment analysis; however, these are not related to the current topic. Wollmer et al. (2013) assess the sentiment that is spoken within the video to build a system that can analyse spoken reviews. Bermingham, Conway, McInerny, O’Hare & Smeaton (2009) explore the potential for online radicalisation, but are not able to provide any significant evidence. Additionally, Thelwall and Sud (2011) examined users’ sentiment to examine the level of participation on YouTube within the comment section. Therefore this study will pioneer in the field of sentiment analysis based on YouTube comments.

Businesses are actively using social media to market their brand, product or service. Over the last decade the consumer-marketer relationship has evolved from simply providing content to a two-way conversation in which users actively advocate or oppose brands. Marketing via YouTube is very effective, and a hot topic these days is the concept of ‘viral marketing’ (Mohr, 2014). Monitoring what people think of your YouTube clips, or of product placements in other videos, is important to determine reputation and gather market insights, and businesses could potentially join conversations to support the consumer or engage them with the brand.

2.3 Sentiment Analysis

An important part of our information gathering behaviour is to determine what people think, what opinion they hold, and whether it has a positive or a negative charge. A method that has been developed over recent years is aiming to classify and score textual data according to the sentiment it encompasses.

2.3.1. Sentiment Analysis method

Sentiment analysis is a method that is being widely adopted as an opinion analyser and is one of the most studied research fields in predictive analytics at the moment. Yi & Nasukawa (2003) are pioneers in this research field and describe the method as follows: ‘A technique to detect favourable and unfavourable opinions toward specific subjects (such as organizations and their products) within large numbers of documents offers enormous opportunities for various applications’. The essential factor in sentiment analysis is to identify how sentiment is expressed in human-written text (Yi et al., 2003). The main idea of performing a sentiment analysis is building a scheme that can process subjective information efficiently and creating an algorithm that can accurately classify text as positive, negative or neutral (Go et al., 2009). Whether the text is positive, negative or neutral can be determined in several ways. The researcher can specify a list of words whose use indicates that the text is positive, negative or neutral; in this case a human determines the sentiment. Other techniques make use of algorithms to determine the sentiment, which tends to yield a higher accuracy regarding prediction metrics. Such an algorithm counts the number of positive and negative words but weighs them according to the position in the sentence and the word itself (some words are stronger than others) (Guyt, 2018). One of the most advanced techniques currently in use first transforms all words into tokens and builds a model out of them, which is run on a set of records with a known sentiment. This known sentiment is often manually prepared or based on other aspects such as customer ratings. The model is trained on a preferably large training set; the result is a set of values that can determine whether a positive, a negative or a neutral text is analysed (Guyt, 2018). Current research is mainly focussed on the creation of an algorithm that provides the highest prediction accuracy.

2.3.2. Sentiment

The concept of sentiment, and why people express their sentiment and share their opinion online, has to be explored. According to Liu (2017), the main aspect of sentiment analysis is studying the opinion that expresses or implies the positive or negative sentiment of an individual. The term opinion is used as ‘a broad concept that covers sentiment, evaluation, appraisal or attitude, and its associated with information such as opinion target and the person who holds the opinion’ (Liu, 2017). The term sentiment, however, implies only the positive or negative feeling that is expressed in the opinion. To define the concept of sentiment, Liu (2017) proposes ‘Sentiment is the underlying feeling, attitude, evaluation or emotion associated with an opinion. It is represented as a triple, (y, o, i), where y is the type of the sentiment, o is the orientation of the sentiment and i is the intensity of the sentiment’.

2.3.3 Sentiment and user behaviour

Opinions and sentiment have been important to humans for a long time. People asked for others’ opinions in choosing which auto-mechanic to go to, or explained why they would vote for a specific person in elections. The internet made it very easy to distribute these opinions to others, and therefore to influence others’ views (Pang & Lee, 2008). Sentiment affects the way people deal with messages on the internet. It has been found that when textual messages are emotionally charged, people tend to share them more quickly than neutral messages, hence they are more engaged (Stieglitz & Dang-Xuan, 2013). This finding is in line with Scholz, Dorner, Landherr and Probst (2013), who indicate that when UGC is positively loaded, the awareness of the particular product or service increases. Additionally, it is suggested that positive sentiment expressed on social media affects sales and purchases positively; Voortman (2015) discovered this for car sales.

2.3.4 Sentiment Analysis Tools

Little previous research has been conducted on the differences between publicly available sentiment analysis tools. This is partly because the literature on sentiment analysis is aimed at increasing the accuracy of the analysis itself, hence researchers build their own analysis tool, and partly because the choice of tool is simply a personal choice of the researcher. Nonetheless, Abbasi, Hassan and Dhar (2014) conducted a benchmark analysis of sentiment analysis tools, in which a detailed performance evaluation is provided and the accuracy of each tool is tested. They claim that researchers who use these analysis tools can rely on their research to substantiate the choice of tool. All tools included in the study score between 40% and 67% accuracy, which is not considered high but is acceptable (Abbasi et al., 2014). Currently, the highest achieved accuracy scores are around 90%. These scores are reached by using POS tagging techniques or lexicon-based techniques (El-Din, 2016). POS (part-of-speech) tagging is the task of identifying nouns, verbs, adjectives, adverbs and more. These tags are then used in the Natural Language Processor (NLP) (Ramanujam, 2014). The lexicon-based technique relies on sentiment dictionaries grounded in research on emotion. It works on the assumption that the collective polarity of textual data is the sum of the polarities of its phrases or words (Devika et al., 2016). Both techniques are known to be able to achieve relatively high accuracy scores and are widely used in online accessible tools.
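
The lexicon-based assumption described above, that the polarity of a text is the sum of the polarities of its words, can be illustrated with a minimal sketch. The tiny lexicon and its weights are invented for illustration only; they do not correspond to any published sentiment dictionary, nor to the tools benchmarked in the cited studies.

```python
import re

# Hypothetical toy lexicon: word -> polarity weight (illustrative values only).
TOY_LEXICON = {"great": 1.0, "love": 1.0, "good": 0.5, "bad": -0.5, "awful": -1.0}


def lexicon_polarity(text, lexicon=TOY_LEXICON):
    """Sum the polarity weights of all known words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0.0) for word in words)


def label(score):
    """Map a summed polarity score to a positive, negative or neutral label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Words that are missing from the lexicon contribute nothing to the sum, which is why a comment without any opinion words ends up with the label neutral in this sketch.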

2.4 Hypotheses and Conceptual Model

By thoroughly assessing the theories and discussions in previous literature, hypotheses can be formulated to test the relationship between sentiment expressed online on YouTube and the funding percentage a Kickstarter campaign receives. All these hypotheses are derived from the aforementioned literature.

2.4.1 Hypotheses

According to Lassen et al. (2014), UGC affects all four stages of the AIDA model. According to this model, funders go through four steps before taking the action of funding or purchasing. These steps are (1) attention, (2) interest, (3) decision and (4) action (Lassen et al., 2014). Because social media has become an extension of people’s daily lives, they post many of their opinions, attitudes and emotions online. Gathering this data is very valuable since it contains information for entrepreneurs and businesses (Asur & Huberman, 2010). This data also affects individuals in their journey to purchasing or, in the context of this research, funding. Voortman (2015) and Scholz et al. (2013) indicate that when the displayed messages are positively loaded, this will have a positive effect on the action that will be taken. During their path towards funding, individuals will encounter many social media posts, comments, shares and likes of others. These social media activities may carry sentiment or an attitude towards the topic of the post. Hence, it is suggested that when overall sentiment is positive, this will have a positive effect on funding behaviour. Hypothesis 1 therefore reflects part 1a of the main research question and is formulated as:

H1a: The percentage of funding is higher when the overall sentiment expressed in the comments of a Kickstarter YouTube video is more positively classified.

Subsequently, according to Lassen et al. (2014), when individuals are commenting on, liking and sharing social media content they are more engaged. These social media actions are a proxy for user attention. Additionally, UGC affects many people during their journey towards taking action (according to the AIDA model) through reading others’ opinions, post-funding reviews and post-purchase reviews, and through whether or not they identify with the campaign (Lassen et al., 2014). It is known that individuals are motivated to fund a campaign when they feel connected to a community (Gerber et al., 2012). Funding is partly driven by sympathy, guilt, happiness and identity, argue Gerber et al. (2012). It is proposed that when a YouTube video receives more likes, comments and views, more people are engaged and feel sympathy towards the campaign. This is suggested to reinforce the relation between the sentiment expressed in the comments and the funding percentage, hence the relation between the independent variable and the dependent variable. This results in the following hypotheses, which represent part 1b of the central research question:

H2a: A higher number of likes on a YouTube video of a Kickstarter campaign leads to a stronger effect of the expressed sentiment within the comments on the funding percentage of the Kickstarter campaign.

H2b: A higher number of comments on a YouTube video of a Kickstarter campaign leads to a stronger effect of the expressed sentiment within the comments on the funding percentage of the Kickstarter campaign.

H2c: A higher number of views on a YouTube video of a Kickstarter campaign leads to a stronger effect of the expressed sentiment within the comments on the funding percentage of the Kickstarter campaign.

The metrics mentioned above are chosen because they are relatively easy to extract from YouTube. Additionally, according to Robertson (2014), these measures are a proxy for video ‘success’. When a video is ‘more successful’, it is assumed to have a positive effect on the proposed relation between sentiment and the funding percentage.

2.4.2 Conceptual Model

The conceptual model represents the hypotheses that are derived from the literature. Building on the understanding created in the previous sections of this thesis project, the research examines a possible correlation between the sentiment that is expressed in YouTube comments on Kickstarter campaign videos and the funding percentage the specific campaign receives. The dependent variable in this model is the funding percentage, whereas the independent variable is the sentiment that is expressed in the textual data. By analysing the level of engagement in the form of number of views, number of comments and number of likes, a possible moderation effect is explored. Please refer to figure 1 for the conceptual model.

Figure 1. Conceptual Model

3. Methodology & Data Collection

In this chapter of this research project, the methods and techniques that are used to execute this research project are elaborated upon. Firstly the approach to the research is discussed after which the data collection methods and data analysing methods will be explained. Finally, an overview of how the data will be analysed is provided.

3.1 Research approach

Where most research on sentiment analysis focuses on reaching a higher accuracy through lexical chaining, more complex neural networks or comprehensive training sets, this thesis project has a different objective (Lassen et al., 2014). This project focuses on a different field of prediction, namely short-term crowdfunding projects, using a social platform that is not frequently studied in relation to UGC or micro-blogging, namely YouTube. Using the sentiment expressed in the comments of Kickstarter YouTube videos, the relation between that sentiment and the funding percentage that a Kickstarter campaign obtained is examined. Sentiment analysis is a relatively new research technique to collect and analyse textual data on the internet and assess the textual content using machine-learning techniques. It is therefore also considered to be a technique capable of triangulating qualitative and quantitative analysis methods. Since individuals often encounter the social media platform YouTube when assessing products online, YouTube could potentially be used as an indicator of future success. By scraping comment data from YouTube, a dataset of textual data was compiled in order to analyse the sentiment regarding Kickstarter campaigns. Assessing textual data from YouTube has not been explored extensively; this research adds to the academic literature by doing so for crowdfunding campaigns. By combining the sentiment data with meta-data of the Kickstarter campaign, a possible effect of the sentiment score on the Kickstarter funding percentage can be examined. Furthermore, it is examined whether the number of likes, comments and views has a moderating effect on this relationship.

3.2 Data Collection

3.2.1. Kickstarter Campaign Data

To get information about all Kickstarter campaigns that were started in 2017, a dataset of Web Robots was accessed (Web Robots, 2018). Web Robots is a start-up that materialized in October 2013 when the two founders, Tomas Vitulskis and Paulius Jonaitis, quit their corporate careers and started creating something of their own. The dataset they made available is extracted from every Kickstarter campaign on Kickstarter.com and provides all meta-data of the campaigns. They started the crawling in April 2014; all subsequent data sets are available on their website¹. Because the research is focussed on finding a relationship between expressed sentiment and the funding percentage, the success status of the campaign, number of backers, funding percentage, funding goal and category, among others, are the most important data to be able to access. Please refer to Appendix 1 for a full overview of all the meta-data that is collected. The full dataset consisted of 24.726 Kickstarter campaigns that were started in 2017. The year 2017 was chosen because it is the most recent year for which full data is available.

¹ https://webrobots.io/kickstarter-datasets/

3.2.2. YouTube Comments Data Set

Out of the 24.726 campaigns, a sample of 85 Kickstarter campaigns that suit the requirements for sentiment analysis of their YouTube comments is filtered. The sampling technique used will be explained in a later section of this chapter. For each of the 85 campaigns that were used to conduct this research, a YouTube URL (Uniform Resource Locator) is searched for. This URL points to the specific YouTube video of a campaign. To collect meta-data of the YouTube videos, all information is scraped from YouTube. The comments were scraped using a YouTube scraper that is available on GitHub; this scraper is developed by ‘philbot9’ and is able to scrape the following information:

− Comment ID
− Username
− Date
− Timestamp
− Number of Likes
− Comment Text
− Replies

Depending on how many comments were scraped, this took between 20 seconds and 8 minutes per video. To be consistent, as little time as possible was left between the scraping of video 1 and video 85. All videos were scraped between 2018-05-14 15:40 and 2018-05-14 17:08. Eventually, 85 documents containing all comments were obtained via the YouTube scraper, all in .csv format, which needed to be cleaned.

3.2.3 Sentiment Scores Data set

To extract the sentiment from all the comments in the documents downloaded with the YouTube scraper, an Excel plug-in needed to be downloaded. This plug-in is developed by Lexalytics, a provider of text analysis software (SaaS and on-premise). Lexalytics is a language processor and text analyser that provides multiple solutions to analyse text and conduct sentiment analysis (Lexalytics, 2018). It provides a product for sentiment analysis of surveys, social media and reviews, called Semantria. Semantria can be used as an Excel plug-in; to run it, the researcher needs to sign up at Semantria. When downloading the demo package, the researcher is provided with a limited number of queries to run. By contacting the Sales department of Lexalytics, the researcher obtained a student limit that was applied to the account, which enabled the researcher to run up to 50.000 queries. The tool of Lexalytics is used because of its Excel plug-in, which enables a simplified method of data analysis. Semantria generates the sentiment scores by communicating with a database via an API that is provided in the installation of Semantria. Here the textual data is processed by an algorithm that is considered a ‘black box’, since the way the algorithm functions is not publicly available. Nevertheless, it adds some extra columns to the Excel file: Detected language, Detected language score, Document Sentiment and Document Sentiment +/-. Please refer to Appendix 2 for an example of the outcomes. An average sentiment per YouTube video is calculated to test whether there is a relation between the sentiment of a Kickstarter YouTube video and the funding percentage.

3.3 Cleaning the Data

3.3.1. File formats

Because the datasets are downloaded in .csv format and contain only raw data, the data need to be cleaned and consolidated. A .csv (comma-separated values) file is a plain-text file in which the fields of each record are separated by a delimiter; when such a file is opened in Excel without conversion, all data that would normally be spread over the columns end up in column A. With the ‘Text to Columns’ button on the ‘Data’ tab in Excel, the text can be split into columns. This takes three steps, in which the delimiter that separates the data must be specified. The file can then be saved as an .xlsx file, which makes it possible to analyse the data per column. The files downloaded with the YouTube scraper were .csv files, which were converted to .xlsx files; however, they contained blank rows. Those rows were removed in order to perform a proper sentiment analysis and not spend the queries received from Lexalytics on blank rows.

3.3.2. Kickstarter data sets

Because the Kickstarter data sets are provided by month, all monthly files were first consolidated into one file that contained all 2017 campaigns. Because most of the campaigns ran in more than one month, this resulted in many duplicate records. Those duplicate records were removed by using an Excel function that highlights all duplicated IDs and then removing the highlighted records.
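
The consolidation and de-duplication step was carried out in Excel; the same idea can be sketched in a few lines of Python. The field name `id` and the choice to keep the most recent monthly snapshot of each campaign are assumptions made for this illustration, since the text above does not state which copy of a duplicated record was retained.

```python
def deduplicate_campaigns(monthly_files):
    """Merge monthly record lists, keeping one record per campaign id.

    monthly_files: list of lists of dicts, ordered from oldest to newest month.
    The key name "id" is an assumption about the dataset layout.
    """
    latest = {}
    for records in monthly_files:
        for record in records:
            # A later month overwrites an earlier snapshot of the same campaign.
            latest[record["id"]] = record
    return list(latest.values())
```

Keeping the latest snapshot would retain the final pledged amount of a campaign that appears in several monthly files.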

The files scraped by Web Robots provide a lot of time and date data; however, all of these are displayed in Unix timestamp format. This format is defined as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC) on Thursday 1 January 1970, the so-called Unix epoch (Epochconverter, 2018). To convert a Unix timestamp to a readable date/time, the following formula was used:

=([Cell to convert] / 86400) + 25569 + (-5/24)

(TheSpreadsheetPage, 2008) With the converted dates and times, the runtime of a Kickstarter campaign, launch date, deadline date, and others can be analysed.
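
The Excel formula above can be read as three terms: the timestamp in days (seconds divided by 86400), the Excel serial number of 1 January 1970 (25569), and a fixed offset of minus five hours. The sketch below reproduces this arithmetic in Python. Interpreting the last term as a UTC-5 time zone offset is an assumption, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EXCEL_EPOCH_OFFSET_DAYS = 25569  # Excel serial number of 1970-01-01
UTC_OFFSET_HOURS = -5            # fixed offset taken from the formula above


def unix_to_excel_serial(unix_seconds):
    """Replicate the spreadsheet formula: days since the Excel epoch, shifted by the offset."""
    return unix_seconds / 86400 + EXCEL_EPOCH_OFFSET_DAYS + UTC_OFFSET_HOURS / 24


def unix_to_local_datetime(unix_seconds):
    """Return the same moment as a timezone-aware datetime at the fixed offset."""
    tz = timezone(timedelta(hours=UTC_OFFSET_HOURS))
    return datetime.fromtimestamp(unix_seconds, tz=tz)
```

A fixed offset does not account for daylight saving time, so converted times can be one hour off for part of the year.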

3.3.3 YouTube data set

The YouTube scraper provides all the comments that are posted beneath a YouTube video, even those posted after the deadline date of the Kickstarter campaign. To make sure the prediction of the funding percentage is based only on comments posted prior to the deadline date, the comments posted later are filtered out. The Kickstarter data set provides the deadline date. After converting the YouTube comment timestamp from a Unix timestamp to a normalized date format, the two dates are compared.

3.3.4 Sheet consolidation

By consolidating all data into one workable sheet, all Kickstarter campaigns in the sample are displayed as one record each, with their meta-data, average sentiment, average sentiment score, funding percentage, number of likes, number of comments and number of views. This makes analysing the data as easy as possible because the data can be imported or copied directly into SPSS, which will be used for the statistical analysis.
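
The deadline filter described in section 3.3.3 can be sketched as follows. The sketch assumes that both the comment timestamp and the campaign deadline are available as Unix timestamps in seconds, and the field name `timestamp` is hypothetical. Comparing the raw timestamps directly sidesteps any time zone conversion.

```python
def comments_before_deadline(comments, deadline_unix):
    """Keep only comments posted at or before the campaign deadline.

    comments: list of dicts with a Unix "timestamp" key (hypothetical field name).
    deadline_unix: campaign deadline as a Unix timestamp in seconds.
    """
    return [c for c in comments if c["timestamp"] <= deadline_unix]
```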

3.4 Sampling Method

To generate a representative sample that is also workable for this research project, the Kickstarter dataset has been examined and filtered. As already mentioned, only 2017 campaigns are in scope, so that the YouTube comments can be recognized as having been posted in 2017; it is difficult to assess the exact posting moment of a YouTube comment after a year. For consistency reasons, only Kickstarter campaigns that were started in the US are selected. The Semantria sentiment analyser is able to determine a sentiment score in multiple languages; however, to limit the possibility that sentiment is expressed differently across languages, only English will be analysed. To make sure that some internet traffic was generated about the specific campaign, a cut-off point of more than 1000 backers is applied. It is assumed that when a campaign received more than 1000 funding transactions, at least some internet traffic would be noticeable. Applying both the geographical and the numerical filter resulted in a set of campaigns with N = 279. For each of these campaigns it was checked manually whether a YouTube video of the campaign exists and whether it is usable for sentiment extraction purposes. Combined, the filtering and checks result in a list of 85 campaigns, which will be used to examine whether there is a relation between sentiment and the funding percentage.
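
The geographical and numerical filters described above amount to a simple selection rule, sketched here under the assumption that each campaign record carries a `country` and a `backers_count` field (both field names are illustrative). The subsequent manual check for a usable YouTube video is not automated in this sketch.

```python
def select_sample_candidates(campaigns, country="US", min_backers=1000):
    """Return campaigns from the given country with more than min_backers backers."""
    return [
        c for c in campaigns
        if c["country"] == country and c["backers_count"] > min_backers
    ]
```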

3.5 Analysis of the Data

3.5.1. Sentiment Analysis

To analyse the sentiment expressed in the comments on YouTube, a sentiment analyser is used. This tool is used as an Excel plug-in, which communicates with a database of Lexalytics via an API and enables the analysis of text directly in Excel. From the sentiment score (between -2 and 2) and the sentiment label (positive, neutral, negative) assigned to each comment, the average sentiment per Kickstarter YouTube video is generated (Lexalytics, 2018). The outcomes collected from the Semantria sentiment analysis are: # of positive comments, # of negative comments, # of neutral comments and sentiment score. The outcomes that have been computed manually are the average sentiment score per YouTube video, the sentiment score where neutral is not taken into consideration, and the PN ratio. These outcomes are in line with previous research, where the PN ratio is computed similarly (Asur & Huberman, 2010) (Lassen et al., 2014). The formula used to compute the PN ratio is:

PN ratio = positive comments / negative comments

The sentiment score where neutral is not taken into consideration is computed because neutral comments are presumed to lack subjectivity. The average sentiment score without consideration of neutral scores is calculated by taking the average of all sentiment scores of the comments that are classified as having either a positive or a negative sentiment. All these outcomes are consolidated together with the Kickstarter campaign meta-data in a single sheet, to simplify the statistical analysis, as mentioned before.
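
The three manually computed outcomes per video can be sketched as below. The input format, a list of (label, score) pairs per video, is an assumption about how the Semantria output could be represented. Returning `None` when a video has no negative comments is a choice made for this illustration, since the handling of a zero denominator in the PN ratio is not described in the text.

```python
def summarise_video(comments):
    """Compute average score, average non-neutral score and PN ratio for one video.

    comments: list of (label, score) tuples, where label is "positive",
    "neutral" or "negative" and score is the numeric sentiment score.
    """
    positives = sum(1 for lab, _ in comments if lab == "positive")
    negatives = sum(1 for lab, _ in comments if lab == "negative")
    scores = [score for _, score in comments]
    non_neutral = [score for lab, score in comments if lab != "neutral"]
    return {
        "avg_score": sum(scores) / len(scores) if scores else None,
        "avg_score_no_neutral": sum(non_neutral) / len(non_neutral) if non_neutral else None,
        # PN ratio = positive comments / negative comments
        "pn_ratio": positives / negatives if negatives else None,
    }
```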

3.5.2. Statistical Analysis

The statistical analyses of this thesis project are executed in SPSS. The consolidated datasheet, compiled from the Kickstarter campaign details, the YouTube meta-data and the sentiment scores, is loaded into SPSS. First, a check is done for missing values and the sample characteristics will be computed; this will be explained extensively in the analysis of the data. All campaigns will be divided into three categories: Technology, Games and Socio-Cultural. Both the dependent and the independent variables will be checked for normality, and means and correlations will be elaborated on. For the dependent variable, a Z-score will be computed to check for outliers.

By means of a One-way ANOVA, the means of the different categories will be compared. Additionally, the means and correlations of the variables will be displayed and explained. With a regression analysis the significance of the model will be tested; the conditions for the regression analysis are that no heteroscedasticity or multicollinearity is present in the model and that the variables are normally distributed. The regression analysis, together with the correlations between the different variables, will indicate whether there is a relation between the independent and the dependent variable. It will also indicate whether the moderating variables affect the suggested relationship. The tests will be substantiated with visual plots and statistics.

4. Results

This section presents the results of the statistical tests that are executed to test whether there is a relation between the sentiment in YouTube comments and the funding percentage of Kickstarter campaigns, hence to test hypothesis 1. Additionally, the moderations are tested, hence H2a, H2b and H2c. Descriptive statistics, as well as an elaboration on the tests and conditions, are provided. After data collection, cleaning and processing of the textual YouTube comment data through the sentiment analysis tool, a workable sheet was consolidated. All the data are loaded into the statistical program SPSS to perform the statistical analysis.

4.1 Sample Characteristics

In total, the sample taken from the Kickstarter data set consisted of 85 campaigns, hence the gross sample is 85 (N=85). As displayed in the conceptual model, sentiment is the independent variable in this research and the funding percentage is the dependent variable. The numbers of likes, views and comments are all suggested to be moderators of this proposed relation.

Since there are YouTube videos with a low number of comments, all cases with fewer than 8 comments are not taken into consideration. It is assumed that with 7 or fewer comments, the sentiment score will not be representative of the overall sentiment, since too few individuals expressed their sentiment. Therefore 4 cases are not taken into consideration, which represents 4,7% of the sample. The net sample is 81 (N=81).

As explained in the process of data collection, all campaigns were launched in 2017. The sample is divided into categories, where 45,7% of the sample belongs to the category ‘Technology’, 23,5% belongs to the category ‘Games’ and the remaining 30,9% falls in the category ‘Socio-cultural’. The category Socio-cultural comprises the Kickstarter categories art, comics, design, discover, fashion, film, music and publishing.

The dependent variable is checked for normality with the Shapiro-Wilk test (p < .05), from which it can be concluded that funding percentage deviates significantly from a normal distribution (please refer to Appendix 3 for a further visual elaboration). The skewness of the dependent variable is 3.076 (SE = .267) and the kurtosis is 10.603 (SE = .529), which are both quite high. This is due to the nature of the sample, in which all campaigns are successful; some campaigns exceed their funding goal by several thousand percent. The variable funding percentage is also checked for outliers: 3 outliers are found on the positive side (Z > 3), and removing them leaves a remaining sample of N=78. Removing the outliers is justified since they partly cause the skewness and kurtosis values to be relatively high. After removal, the mean and standard deviation of the funding percentage are 775,74% and 863,97%.
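
The outlier rule (Z > 3 on the positive side) can be sketched as follows. The sketch standardises with the sample standard deviation, which is an assumption about how the Z-scores were computed, and the function names are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardise values using the sample mean and sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def drop_positive_outliers(values, threshold=3.0):
    """Remove observations whose Z-score exceeds the threshold on the positive side."""
    return [v for v, z in zip(values, z_scores(values)) if z <= threshold]
```

Because the mean and standard deviation are themselves inflated by extreme values, a Z-score rule of this kind only flags outliers in samples that are large enough for a single observation to reach the threshold.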

The normality of the independent variables is also checked, which shows that only the sentiment score where neutral is not taken into consideration is normally distributed according to the Shapiro-Wilk test (p > .05). Hence all other variables deviate significantly from a normal distribution (please refer to Appendix 4 for an elaboration of these numbers).

4.2 Descriptive Statistics on categories

A One-way ANOVA test and a Post Hoc test with Bonferroni correction are conducted to test whether there is a significant difference in means between the different categories. The One-way ANOVA shows that there is a difference in means between groups (F=7,254, p=.001) for the dependent variable funding percentage; please refer to table 1. The Post Hoc test with Bonferroni correction shows that the mean of the category Technology differs significantly from that of the category Games (p = .001), whereas its difference from the category Socio-Cultural falls just short of significance at the .05 level (p = .051). This difference in means applies only to the dependent variable funding percentage.

Table 1. Descriptives for One-way ANOVA and Post Hoc, Bonferroni Correction

| Funding percentage | N | Mean | Std. Deviation | F | Sig |
|---|---|---|---|---|---|
| Technology | 34 | 1144,51% | 1064,52% | 7,254 | .001* |
| Games | 19 | 307,79% | 255,99% | | |
| Socio-Cultural | 25 | 629,86% | 631,93% | | |
| Total | 78 | 775,74% | 863,97% | | |

Significance at * p < 0.01

These results suggest that the category in which a Kickstarter campaign falls is related to the funding percentage. More specifically, campaigns in the category ‘Technology’ have a significantly higher funding percentage than campaigns in the category ‘Games’ (p = .001), while the difference with the category ‘Socio-Cultural’ falls just short of significance (p = .051). These results are not directly linked to one of the hypotheses, but they do suggest that technology campaigns overall receive a higher funding percentage.
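The F statistic of a one-way ANOVA can be recomputed from group sizes, means and standard deviations alone, so the summary statistics in Table 1 allow a rough cross-check. The sketch below is in Python rather than SPSS; the function names are illustrative. It computes only F and its degrees of freedom, because the p-value requires the F distribution, which the Python standard library does not provide. The `bonferroni` helper shows the generic correction for raw p-values and is not applied to the values reported above.

```python
def anova_from_summary(groups):
    """One-way ANOVA F statistic from (n, mean, sd) tuples, one per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each raw p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# (N, mean, standard deviation) of funding percentage, copied from Table 1.
table_1 = [
    (34, 1144.51, 1064.52),  # Technology
    (19, 307.79, 255.99),    # Games
    (25, 629.86, 631.93),    # Socio-Cultural
]

f_stat, df1, df2 = anova_from_summary(table_1)
print(round(f_stat, 3), df1, df2)  # 7.254 2 75
```

The recomputed value agrees with the reported F of 7,254, so the descriptives in Table 1 and the ANOVA result are internally consistent.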

4.3 Regression Analysis

A bivariate analysis is conducted to investigate the separate correlations between the dependent and the independent variables, using Pearson correlations in SPSS. As mentioned, the sentiment score (neutral out of consideration) and the PN-ratio are the independent variables in this research; both indicate the level of sentiment that is expressed.

Table 2. Means (M), Standard Deviations (SD) and Pearson correlations for all variables

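| Variables | M | SD | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1. Funding Percentage | 775,74% | 863,97% | (-) | | | | | | | |
| 2. Sentiment Score (neutral out of consideration) | ,2949 | ,2491 | .006 | (-) | | | | | | |
| 3. #total_comments | 288,90 | 715,47 | -.093 | -.242* | (-) | | | | | |
| 4. #views | 198240,35 | 5666384,54 | -.003 | -.313** | .662** | (-) | | | | |
| 5. #likes | 3310,03 | 10442,60 | -.090 | -.218 | .909** | .781** | (-) | | | |
| 6. PNratio | 4,52 | 6,56 | .141 | .605* | -.071 | -.165 | -.127 | (-) | | |
| 7. Technology_Dummy | .4359 | .4991 | .378** | -.150 | .016 | .101 | .056 | -.171 | (-) | |
| 8. Games_Dummy | .2436 | .4320 | -.309** | -.097 | .166 | -.017 | .070 | -.108 | -.499** | (-) |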

**. Correlation is significant at the 0.01 level (2-tailed)

*. Correlation is significant at the 0.05 level (2-tailed)
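To illustrate how a Pearson correlation and its significance relate to the sample size, the sketch below computes r from raw data and converts a given r into a t statistic with n - 2 degrees of freedom. It is written in Python rather than SPSS, and the function names are illustrative. Two assumptions are made: every correlation in Table 2 is based on all N = 78 campaigns, and 1.99 is used as an approximate two-tailed 5% critical value for 76 degrees of freedom.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Assumption: every correlation in Table 2 is based on all N = 78 campaigns.
N = 78
T_CRITICAL = 1.99  # approximate two-tailed 5% critical value for 76 df

# Correlations with funding percentage, copied from Table 2.
reported = [
    ("Sentiment Score", 0.006),
    ("PNratio", 0.141),
    ("Technology_Dummy", 0.378),
]
for label, r in reported:
    t = t_statistic(r, N)
    print(label, round(t, 2), abs(t) > T_CRITICAL)
```

Under these assumptions, only the Technology_Dummy correlation exceeds the critical value, which is in line with the significance flags in Table 2.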

Table 2 shows that the independent variable ‘2. Sentiment score (neutral out of consideration)’ has only a very weak correlation with the dependent variable ‘1. Funding percentage’ (r = .006). The same holds for the independent variable PNratio, although its correlation is higher (r = .141). The direction of both correlations is in line with hypothesis 1, which suggests that sentiment affects the funding percentage positively. The PN-ratio correlates more strongly with the funding percentage than the sentiment score does; nevertheless, neither correlation is significant. Therefore, H1 cannot be supported: no significant relation is found between the sentiment expressed in the YouTube comments of Kickstarter campaigns and the funding percentage of those campaigns. Additionally, table 2 shows negative correlations between funding percentage and #total_comments, #views and #likes. These correlations are not significant either. Hence, it can be concluded that there is no direct effect of the moderating variables on the funding percentage: an increase in the number of views, comments or likes is not associated with a change in the funding percentage. Whether the moderating variables affect the
