Do online word-of-mouth effects differ across platforms?

Academic year: 2021

A comparison between eWOM found on Twitter and Reddit

Huub Kuiper

Master Thesis

Faculty of Economics & Business

MSc Marketing Intelligence

Huub Kuiper

Address: Westerbinnensingel 19, Groningen

Phone Number: +31652734093

E-Mail:

h.kuiper.7@student.rug.nl

Student Number: S2707101

1st Supervisor: dr. E. de Haan

2nd Supervisor: dr. A.E. Vomberg

Management Summary

This research examines the use of electronic word-of-mouth (eWOM) to predict stock returns. Substantial research has already been done on eWOM found on the social media platform Twitter. The intention of this research is therefore to extend that knowledge by also analyzing eWOM found on the blogging platform Reddit. In addition, this research aims to find whether Twitter, Reddit or a combination of both sources performs best in predicting stock returns. First of all, Reddit does seem to have incremental value in predicting stock returns. However, this is only to a slight extent, and the Twitter eWOM used in this research still outperforms eWOM found on Reddit with respect to predicting stock returns. Furthermore, this research finds little evidence for an incremental value of combining the

Table of contents

1. Introduction
2. Theoretical Framework
2.1. Literature Review
2.1.1. Characteristics of Twitter and Reddit
2.1.2. Effect of Customer Satisfaction Surveys on Stock Returns
2.1.3. Effect of eWOM on Stock Returns
2.2. Hypotheses
3. Research Design
3.1. Plan of Analysis
3.2. Data Collection
3.2.1. Overall Specifications of the Dataset
3.2.2. Customer Satisfaction Data
3.2.3. eWOM and Sentiment Analysis
3.2.4. Firm Performance Data
4. Results
4.1. Representativeness, Reliability and Validity of the Sample
4.2. Relationships between Predictors
4.3. Effects on Stock Return
5. Conclusions and Recommendations
5.1. Conclusions
5.2. Recommendations
6. References

1. Introduction

The internet hosts a great number of platforms and communities, each with its own characteristics. Moreover, the information that can be found there about firms and products can be relevant for researchers, firms, investors and many others. Hence, the aim of this research is to analyze the word-of-mouth found on two of these platforms and to find which of them has the highest predictive value for stock returns.

In the field of predicting stock returns, many researchers have tried to find and refine variables that improve these predictions. This research compares the effect of multiple customer satisfaction variables on stock performance. One of the earlier methods to assess customer satisfaction is the customer survey, in which customers are explicitly asked for their opinion of a certain product or firm. A frequently used example is the American Customer Satisfaction Index (ACSI). For instance, Fornell et al. (2006) state that customer satisfaction is statistically significant with respect to market valuation. On the other hand, changes in customer satisfaction are not reflected in stock prices. Other research on this subject by Aksoy et al. (2008) argued that portfolios with high customer satisfaction levels and positive changes in customer satisfaction will outperform portfolios with negative changes in customer satisfaction. Besides, Aksoy et al. found that the effect of positive customer satisfaction information is initially undervalued in the stock market, but that the market adjusts for this information in the long run.

satisfaction have a positive relationship with word-of-mouth. However, one should note that Anderson measures “normal” word-of-mouth, in the sense of asking how many people someone informed about their experience with a certain firm, rather than observing the WOM directly, as can be done by analyzing online data. In recent research, De Haan (2020) compares customer satisfaction in terms of ACSI scores to the effect of online word-of-mouth (eWOM). To be more precise, De Haan (2020) collected sentiment data from the micro-blogging platform Twitter by scraping and analyzing more than 8.4 million tweets. Since this data collection approach does not involve directly asking people for their opinion, it might lead to less biased results. Also, De Haan (2020) finds that the ACSI scores and the eWOM measures are only slightly correlated and therefore argues that both measures contain unique information about the customer base. Hence, both measures are relevant.

To summarize, prior research has used different kinds of measures to assess the effect of customer satisfaction on stock performance. The first is customer satisfaction constructed from survey information such as ACSI data, which has been shown to be a helpful measure for assessing the value of a stock. The second is eWOM measured with Twitter data, which also proved useful in predicting stock value.

Nevertheless, Twitter accounts for only a small percentage of all word-of-mouth that can be observed online. Besides, it is important to note the specific characteristics of the Twitter platform. For example, Twitter restricted its users to posts of a maximum of 140 characters (280 since November 2017), whereas other media platforms enable their user base to share more extensive posts. This is likely to result in different platform usage and different forms of eWOM. Thus, it is interesting to find out whether the addition of another platform with different dynamics either leads to a different outcome or complements the earlier mentioned WOM indicators. For this research that additional platform is Reddit.

Considering the required data, the aim is to expand the existing dataset where possible by scraping data from Reddit. Furthermore, the size of the dataset should be sufficient to compare the WOM effects of both platforms. However, since the volume of Twitter posts is vastly higher on average due to the platform dynamics, the volume does not necessarily have to be equal across both platforms. After collection of the data, the goal is to derive the effect of different sentiments per platform. By extension, this research examines whether the effects of negative and positive WOM found on Twitter correlate with the effects of negative and positive WOM found on Reddit. Lastly, the different WOM measures are compared to find which shows the strongest effects.

To conclude, the outcome of this study can give insight into possible differences in word-of-mouth between Twitter and Reddit. By comparing the predictive values of both platforms, one might gain insight into specific relationships, and predictive models can be tailored to each stock's most relevant eWOM predictor.

2. Theoretical Framework

2.1. Literature Review

2.1.1. Characteristics of Twitter and Reddit

measuring the weight of different authors and accounting for the source of the tweets. They found that, by accounting for these differences, the Twitter valence sentiment reflected abnormal stock returns better than when all authors were weighted equally.

Secondly, the eWOM of the blogging website Reddit is analyzed. This medium does not impose any restrictions on the length of messages and can therefore be regarded as more suitable for in-depth analyses. This notion is supported by Choi et al. (2016), who found that Reddit posts are significantly longer and are generated at a slower pace than posts on Twitter. Besides that, the number of followers of an author does not affect the reach of a message as much as on Twitter. Instead, the reach is mostly influenced by the number of followers of a subpage and by the number of up- or downvotes a post gets. Other research, by Priya et al. (2019), argued that with respect to news updates Reddit is more useful for deeper insights than Twitter. Last of all, Elliot et al. (2018) found that subpages with more followers had a higher concentration of participation among fewer members, whereas older subpages showed more dispersion of participation among members. Hence, we could argue that the influence and reach of an individual author is larger on this platform.

2.1.2. The effects of Customer Satisfaction surveys on stock returns

The ACSI survey gives a broad view of the customer satisfaction of the average American household across multiple industries. Hence, customer satisfaction is often measured by these ACSI scores in prior research. First of all, both Fornell et al. (2006) and Peng et al. (2015) showed that by investing in stocks of firms that perform well on the ACSI it is possible to consistently outperform the market. Therefore, it can be argued that good ACSI scores reflect positively on stock returns. Besides, Aksoy et al. (2008) reinforce this relationship and show that companies with high and increasing customer satisfaction yield significantly higher stock returns than firms with low and decreasing customer satisfaction. Finally, Ittner & Larcker (1998) also found a positive relationship between ACSI scores and stock return.

2.1.3. The effect of electronic word-of-mouth on stock returns

Multiple studies have shown that eWOM generated on the micro-blogging platform Twitter can predict stock movements. Xun & Guo (2017) found that in the airline industry there was a significant positive relationship between the eWOM provided by Twitter and stock returns. Likewise, Smailović (2013) argued that the sentiment found on Twitter could predict stock movements a few days in advance. Furthermore, similar results were found on an hourly level by Deng et al. (2018). Therefore, we can assume that positive sentiment results in a positive stock return in the short and very short run and that negative sentiment results in a negative stock return.

In other research, Ranco et al. (2015) found a significant dependence between Twitter sentiment and abnormal returns during peaks in Twitter volume. To be more precise, peaks in Twitter volume often arise during firm events, such as quarterly reports or other events that do not recur periodically. From this research one might conclude that the predictive value of Twitter sentiment is higher during events.

Besides, the amplified effect of volume is also discussed by Tirunillai & Tellis (2012). They found in their research that the volume of word-of-mouth regardless of the

2.2. Hypotheses

Since extensive research has already been done on this subject, some relationships can be expected between the newly collected Reddit eWOM and the other customer satisfaction measures. First of all, because of the different dynamics of both platforms, as discussed by Priya et al. (2019) and Choi et al. (2016), it is expected that both sources of information complement each other. Additionally, Priya et al. (2019) define Twitter as a source for direct information and Reddit as a source for in-depth information; their findings are used as a basis for the hypotheses in this research. Since Priya et al. argued that Twitter is more suitable for acquiring short-term, superficial information and Reddit is more suitable for in-depth analysis, the first hypothesis is as follows:

H1: eWOM found on Reddit performs better in predicting yearly stock returns than Twitter.

The motivation for the first hypothesis is that Twitter is a better predictor of short-term changes in stock return due to the higher frequency of information generation, as mentioned by Choi et al. (2016), whereas Reddit is a better predictor of long-term changes in stock return. Secondly, with respect to news gathering, Reddit is considered a complementary source besides Twitter. Therefore, the second hypothesis is:

3. Research Design

3.1. Plan of Analysis

Earlier research in this area focused on the eWOM effect found on Twitter. However, Twitter is only one of many large social media platforms and therefore does not account for all online word-of-mouth. In addition, other social media platforms might also have added value with respect to the effect of eWOM. Therefore, the aim of this research is to extend previous research on this subject with another substantial social media platform, Reddit. By extension, the additional value of this research lies in the comparison of different user bases and platform dynamics. A question one might ask is whether one of the two platforms has a higher predictive value than the other, or whether both sources have complementary value.

Firstly, this research compares a couple of models to investigate these kinds of relationships. Since this research tries to elaborate on the research by De Haan (2020), a similar methodology has been applied. In this way, it is easier to compare the results and evaluate the possible added value of this research. Hence, the first model used takes the following form:

$$X_{mit} = \alpha_0 + \sum_{m=1}^{M} a_m \cdot X_{mit-1} + e_{mit}$$

In this model variable X denotes the value of predictor m of firm i in time t.

Additionally, by applying this model, it is possible to measure to what extent different predictors are able to predict one another. Initially this regression is run without the lagged dependent variable, which gives an indication of the extent to which one predictor is able to predict another. Thereafter, the lagged dependent variable is included to find how well a variable can predict itself. If a lagged variable predicts its own future values well, new observations add relatively little information.
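As an illustration of the lagged set-up described above, the following is a minimal sketch for a single predictor. The panel values, the column names `neg_reddit` and `neg_twitter`, and the helper functions are assumptions made for this example and are not taken from the thesis data or code.

```python
# Hypothetical firm-year panel; all numbers are invented for illustration.
panel = [
    {"firm": "A", "year": 2013, "neg_reddit": 0.10, "neg_twitter": 0.08},
    {"firm": "A", "year": 2014, "neg_reddit": 0.20, "neg_twitter": 0.10},
    {"firm": "A", "year": 2015, "neg_reddit": 0.30, "neg_twitter": 0.15},
    {"firm": "A", "year": 2016, "neg_reddit": 0.40, "neg_twitter": 0.20},
    {"firm": "B", "year": 2015, "neg_reddit": 0.30, "neg_twitter": 0.12},
    {"firm": "B", "year": 2016, "neg_reddit": 0.25, "neg_twitter": 0.20},
]


def lagged_pairs(rows, target, predictor):
    """Pair each firm-year value of `target` with last year's `predictor`."""
    by_key = {(row["firm"], row["year"]): row for row in rows}
    pairs = []
    for (firm, year), row in by_key.items():
        previous = by_key.get((firm, year - 1))
        if previous is not None:  # the first observed year of a firm has no lag
            pairs.append((previous[predictor], row[target]))
    return pairs


def ols_simple(pairs):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


pairs = lagged_pairs(panel, target="neg_twitter", predictor="neg_reddit")
intercept, slope = ols_simple(pairs)
```

Adding the lagged dependent variable as a second regressor would require a multiple regression, which this single-predictor sketch deliberately leaves out.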

The second part of the methodology focuses on the ability of the eWOM predictors to predict stock returns on a yearly basis. In order to explore that relationship, the following model will be used:

$$Y_{it} = \beta_{0mi} + \beta_{1m} \cdot X_{mit-1} + \beta_{2m} \cdot Y_{it-1} + \sum_{t=2014}^{2017} \beta_{tm} \cdots$$

In the above equation Y denotes the stock return for firm i in year t. Furthermore, to assess the predictive values of the different customer satisfaction variables, the lagged predictor variable m is included in the model, as well as the stock return of the year before. Lastly, dummy variables per year are added to control for the influence of a certain year. The above model is applied both to the values of the predictors at t-1 and to the growth of each predictor. The growth of a predictor is calculated as described in the formula below:

$$\Delta X_{mi} = \frac{X_{mit} - X_{mit-1}}{X_{mit}}$$
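A small sketch of the growth calculation follows. It uses the current-period value as the denominator, as written in the formula above, which differs from the more common convention of dividing by the previous period. The function name and the example values are assumptions for illustration.

```python
def growth(current, previous):
    """Relative change of a predictor, expressed relative to the current value."""
    if current == 0:
        raise ValueError("growth is undefined when the current value is zero")
    return (current - previous) / current


# Hypothetical share of negative posts rising from 0.20 to 0.25.
example_growth = growth(current=0.25, previous=0.20)
```

With these invented inputs the growth is 0.05 / 0.25 = 0.2, whereas dividing by the previous value would give 0.25.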

Moreover, in contrast to the research by De Haan (2020), this research does not compare the predictor values on a firm level with the predictor values on an industry level. Since the dataset does not contain multiple firms for all industries, such a comparison would not be reliable and the measure is therefore excluded.

Because the aim of this research is to find which model performs best in predicting stock return, all regressions have been compared by analyzing the Akaike weights. This is in line with the research on Akaike weights conducted by Wagenmakers & Farrell (2004). Earlier research on this subject by De Haan (2020) and De Haan, Verhoef & Wiesel (2015) also used Akaike weights to compare the fit of different models. The equation below was used to calculate the Akaike weight:

$$w(AIC_m) = \frac{\exp\left(-\frac{1}{2}\left(AIC_m - \min(AIC)\right)\right)}{\sum_{k=1}^{K} \exp\left(-\frac{1}{2}\left(AIC_k - \min(AIC)\right)\right)}$$

AIC indicates the Akaike information criterion of the model shown in equation 2 for both predictor $X_{mit-1}$ and predictor $\Delta X_{mi}$. Furthermore, $\min(AIC)$ denotes the minimum value

Lastly, to investigate whether both eWOM predictors have complementary value, a trial-and-error approach will be used. To test the hypothesis of a possible complementary value of Reddit eWOM in combination with Twitter eWOM, the positive and negative eWOM of both sources will all be included in one regression, along with the ACSI predictor. However, the net sentiment predictors of both platforms will not be included, to prevent correlation with the negative and positive sentiment shares of both platforms. Subsequently, the complete model will be run and the worst-performing predictors will be deleted one by one, with the regression re-run after each deletion. This process will be repeated until the AIC of the most recent model deteriorates in comparison with the model before. Ultimately, if the model with the lowest AIC contains both a Reddit and a Twitter sentiment variable, the second hypothesis of this research is not rejected and it can be concluded that both predictors have complementary value.
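The deletion procedure can be sketched as a backward elimination loop. The text does not state how the worst-performing predictor is identified, so this sketch assumes it is the predictor whose removal yields the lowest AIC. The predictor names and the toy AIC function are hypothetical stand-ins; in practice `aic_of` would fit the regression for the given subset and return its AIC.

```python
def backward_eliminate(predictors, aic_of):
    """Drop one predictor at a time while doing so lowers the AIC.

    `aic_of` is any callable that returns the AIC of the model that
    contains exactly the given tuple of predictors.
    """
    current = list(predictors)
    best_aic = aic_of(tuple(current))
    while len(current) > 1:
        # AIC of every model that has exactly one predictor fewer.
        candidates = [
            (aic_of(tuple(p for p in current if p != drop)), drop)
            for drop in current
        ]
        candidate_aic, drop = min(candidates)
        if candidate_aic >= best_aic:
            break  # removing anything further makes the fit worse
        current.remove(drop)
        best_aic = candidate_aic
    return current, best_aic


def toy_aic(subset):
    """Invented AIC: two useful predictors, plus a penalty per parameter."""
    score = 100.0 + 2.0 * len(subset)
    if "twitter_pos" in subset:
        score -= 5.0
    if "reddit_neg_growth" in subset:
        score -= 3.0
    return score


all_predictors = ["acsi", "twitter_pos", "twitter_neg", "reddit_pos", "reddit_neg_growth"]
selected, final_aic = backward_eliminate(all_predictors, toy_aic)
```

Passing the AIC computation in as a callable keeps the loop independent of the regression library that is eventually used.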

De Haan (2020) already showed that eWOM found on Twitter does not have an incremental predictive effect in calculating the ACSI score of the following year. By extension, the current ACSI score does very well in predicting next year’s ACSI score. Likewise, the ACSI score does not show much incremental value in calculating the eWOM of Twitter in the following year. This research tries to find whether the relationship between the eWOM of Reddit and the ACSI scores shows similar results. Besides, it also aims to find the dynamics between the eWOM found on Twitter and Reddit, as well as their combined predictive value with respect to the ACSI scores.

3.2. Data Collection

3.2.1. Overall specifications of the dataset

data was not included in the research by De Haan (2020) and was collected specifically for this research by scraping the Reddit platform.

Since the Reddit platform has relatively longer posts than Twitter, but in a smaller quantity, there is less eWOM volume available than on Twitter. Consequently, the data in this research is collected on a yearly basis. Furthermore, the dataset covers 20 firms across 7 industries: airlines, apparel, consumer shipping, department stores, internet firms, restaurants and specialty stores. Lastly, the observation period used for this research runs from 2013 until 2017.

3.2.2. Customer Satisfaction Data

Over the years several measures of customer satisfaction have been developed. As mentioned earlier in this research, the ACSI is a frequently used index score to measure intra-industry satisfaction scores of customers. Moreover, the ACSI is obtained through customer surveys and the sample is designed to be a good representation of the U.S. population. The original dataset collected by De Haan (2020), which is the basis of the dataset for this research, contained ACSI data of 46 firms that satisfied a couple of important criteria. Firstly, since the dataset had to contain data over the same time period as the collected eWOM data, the selected firms needed to be present in the ACSI data and also be sufficiently referred to on Twitter. Furthermore, since the sentiment regarding a certain firm was collected with only the firm’s brand as keyword, the selected firms needed to qualify as monobrand firms. To be more precise, Mizik and Jacobson (2008, p. 20-21) define a monobrand firm as a firm “in which a single brand represents the bulk of [its] business”.

Some additional criteria have to be met, since this research also deals with eWOM collected via Reddit. First of all, all firms need to have an accessible subreddit. Because the scraping tool used for this research has some limitations, a firm needs to have its own specific subreddit to scrape, and that subreddit has to be publicly accessible. For example, the firm Amazon has the subreddit “r/Amazon”, which is entirely devoted to Amazon-related posts. Firms without a subreddit or firms with private subreddits cannot be used for this research. As a result, more private industries such as banks and insurance companies did not meet these criteria and were therefore excluded from the dataset. Lastly, firms need to have a certain amount of activity on their subreddit, otherwise there would be insufficient data to analyze.

In conclusion, these additional criteria reduced the dataset used by De Haan to a final sample of 20 firms over the period 2013-2017. The sample covers 7 industries and 83 firm-year observations.

3.2.3. eWOM and Sentiment Analyses

collect the Twitter data, the package “Twitterscraper” was used, a Python package developed by Taspinar (https://github.com/taspinar/twitterscraper). However, since not all firms fit the requirements for the data collection on Reddit, this research does not include all data used by De Haan (2020), and only 5 million tweet observations remained for 20 companies.

The second source of eWOM used in this research was collected from the blogging website “Reddit”. In order to collect this data, the “Universal Reddit Scraper” developed by JosephLai241 (https://github.com/JosephLai241/Universal-Reddit-Scraper) has been used. With the aid of this package, the subreddits of 20 firms have been scraped. Not all firms have active subreddits and some firms have private communities. Hence, it was not possible to match all firms with the corresponding Twitter and ACSI data. Furthermore, since each scrape consists of a maximum of 1000 posts, all firms were scraped a total of six times. To be more precise, three of the six scrapes focused on the “top” posts, defined by the number of upvotes a post has. The other three scrapes focused on the most “controversial” posts, defined by the number of comments a post has. Since these scrapes are random, the combined dataset contained duplicates. Therefore, duplicate posts were removed after binding the data. As a result, only 14,432 distinct observations remained, covering 20 companies over a timespan of 5 years.
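The binding and de-duplication step can be sketched as follows. This is a minimal illustration that assumes each scraped post carries a unique identifier under the key `id`; the field names and example posts are invented and do not reflect the scraper's actual output format.

```python
def merge_scrapes(scrapes):
    """Combine several scrapes and keep the first copy of each post id."""
    unique = {}
    for scrape in scrapes:
        for post in scrape:
            unique.setdefault(post["id"], post)
    return list(unique.values())


# Hypothetical results of one "top" and one "controversial" scrape.
top = [
    {"id": "a1", "title": "Great service"},
    {"id": "b2", "title": "Late delivery"},
]
controversial = [
    {"id": "b2", "title": "Late delivery"},
    {"id": "c3", "title": "Refund denied"},
]

posts = merge_scrapes([top, controversial])
```

Keying on an identifier rather than on the title avoids discarding distinct posts that happen to share the same title.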

As can be observed, there is a vast difference between the number of tweets and the number of Reddit posts. This is the result of a couple of factors. First of all, the Reddit scraper has more restrictions than the Twitter scraper. In particular, the limit of 1000 posts per scrape impedes the data collection process. Secondly, the Reddit scraper can only perform one distinct scrape per day, which also restricts the collection process.

Furthermore, there are some complications due to the differences between both platforms. Whereas Twitter has many individual messages (tweets) that can be scraped, the number of Reddit posts is significantly smaller, especially for less renowned companies. This observation is also backed by Choi et al. (2016). Moreover, some firms choose to communicate with their audience by means of private Reddit communities, which cannot be scraped.

have been selected with respect to the most comments and upvotes. By implementing this selection method, it is possible to still get a relatively good impression of eWOM on Reddit.

The sentiment analyses have been conducted in a similar fashion as De Haan (2020) did for the Twitter data. The sentiment of the Reddit posts has been analysed with the “qdap” package for R developed by Goodrich et al. (2018). Since the collection of Reddit posts was restricted with respect to volume, this research does not include volume effects for either Twitter or Reddit. This research does, however, consider the Reddit polarity scores per firm per year. The Reddit polarity scores that were calculated are the mean, the standard deviation, the share of negative posts and the share of positive posts.
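The four firm-year measures could be derived from per-post polarity scores roughly as in the sketch below. It assumes that posts with a polarity of exactly zero count as neither positive nor negative and that the sample standard deviation is used; neither choice is stated in the text. The scores are invented.

```python
from statistics import mean, stdev


def summarise_polarity(scores):
    """Mean, standard deviation and sentiment shares for one firm-year."""
    n = len(scores)
    return {
        "mean": mean(scores),
        # Sample standard deviation; undefined for one post, so 0.0 is used.
        "sd": stdev(scores) if n > 1 else 0.0,
        "share_negative": sum(s < 0 for s in scores) / n,
        "share_positive": sum(s > 0 for s in scores) / n,
    }


# Hypothetical polarity scores of four post titles for one firm-year.
scores_by_firm_year = {("Amazon", 2016): [-0.5, 0.0, 0.5, 0.5]}

summary = {
    key: summarise_polarity(values)
    for key, values in scores_by_firm_year.items()
}
```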

Furthermore, with respect to the Twitter data the full tweets were used to calculate polarity. On the other hand, for the Reddit sentiment, only the post title was considered in calculating the polarity scores. This is due to the fact that not all posts have a textual content, but rather have graphic content. However, nearly all posts do have a title.

The polarity is calculated by the R package “qdap”, which labels the words in a text as either positive or negative. The decision of whether a word is positive or negative is based on a sentiment dictionary compiled by Hu and Liu (2014). Besides, the context of the sentence is also assessed by analysing the words before and after a word. In this manner it is possible to recognize negating words such as “not” and “don’t”, amplifying words like “very” and deamplifying words such as “barely”. By accounting for these words, it is possible to get a clear image of the sentiment in certain posts.
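To make the idea of context-aware polarity concrete, the sketch below implements a heavily simplified version in Python. It is not qdap's implementation: the word lists, the context window, the weight of 0.8 and the scaling by the square root of the word count are all assumptions chosen for illustration.

```python
import math

# Tiny illustrative word lists; a real analysis would use a full dictionary.
POSITIVE = {"good", "great", "fast"}
NEGATIVE = {"bad", "slow", "broken"}
NEGATORS = {"not", "don't", "never"}
AMPLIFIERS = {"very", "really"}
DEAMPLIFIERS = {"barely", "hardly"}


def polarity(text, before=4, after=2, weight=0.8):
    """Score a text by summing polarized words adjusted for their context."""
    words = text.lower().split()
    total = 0.0
    for i, word in enumerate(words):
        if word in POSITIVE:
            base = 1.0
        elif word in NEGATIVE:
            base = -1.0
        else:
            continue
        # Words shortly before and after the polarized word form its context.
        context = words[max(0, i - before):i] + words[i + 1:i + 1 + after]
        scale = 1.0
        scale += weight * sum(w in AMPLIFIERS for w in context)
        scale -= weight * sum(w in DEAMPLIFIERS for w in context)
        # An odd number of negators flips the sign of the word.
        negations = sum(w in NEGATORS for w in context)
        sign = -1.0 if negations % 2 else 1.0
        total += sign * base * max(scale, 0.0)
    return total / math.sqrt(len(words)) if words else 0.0
```

For example, `polarity("not good")` is negative, and `polarity("barely good")` is positive but smaller than `polarity("really good")`.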

After the polarity values for the Reddit data were calculated, they were combined with the dataset of De Haan (2020) containing the sentiment values for Twitter.

3.2.4. Firm performance data

Similar to the ACSI and Twitter data, the dataset provided by De Haan was re-used for the firm performance data as well. This dataset contains a collection of thirteen different firm performance measures. Furthermore, the data was matched with the observations found for both Twitter and Reddit. In addition, the following variables were

margin, EBT margin, cash flow, ROA, ROE and the market share. Furthermore, Yahoo Finance was consulted for the stock performance data, consisting of market value, volume, times traded and the stock return. However, since only the stock return lies within the scope of this research, only this firm performance measure has been used in the analyses. In conclusion, the collected data used for this research span the period 2013-2017.

4. Results

4.1. Representativeness, Reliability and Validity of the Sample

First of all, table 1 shows all the firms in the sample along with the number of tweets and Reddit posts collected for each firm. As can be seen in table 1, the volume of Reddit observations is considerably smaller than the volume of Twitter observations. This may lead to relatively less accurate sentiment results, because some observations are distorted by the relatively small number of analysed posts. However, since the Reddit sample consists of the posts that received the most responses, the sentiment in the Reddit posts reached a vast number of people despite the seemingly low number of posts.

Furthermore, the number of industries in the dataset has been reduced to seven. It should therefore be noted that results that contradict earlier research may be due to the reduced size of the dataset. Since some industries are not accounted for, differences with respect to earlier research might also be industry-specific.

4.2. Relationships between predictors

lagged dependent variable, the eWOM variable did not lose all its significance. These findings contradict earlier results and imply that eWOM does have incremental value in predicting the ACSI a year in advance. Moreover, with respect to the R-squared values of both models, the second model, which includes the “old” ACSI level, shows a large increase in R-squared. This result implies that there is considerable autocorrelation and that “new” results do not add much information.

Furthermore, considering the correlation between the eWOM variables of Twitter and Reddit, a significant relationship has been found between the negative Twitter sentiment and the negative sentiment on Reddit the year before. According to the model, an increase of 1 percent in the share of negative Reddit sentiment results in a small increase of 0.055% in the share of negative Twitter sentiment. In addition, even after including the dependent variable as a lagged variable, the negative Reddit sentiment remains a significant predictor of the negative Twitter sentiment the year after. However, the magnitude of the effect is reduced to 0.039% per 1% increase in negative Reddit sentiment. In comparison to the R-squared values found for the ACSI scores, the R-squared increases relatively less after including the lagged dependent variable. Hence, these results do imply that there is some autocorrelation, but it can be concluded that the “new” eWOM values contain relatively more information than “new” ACSI values.

For the rest of the analysed predictors, few significant relationships were found. For example, for the positive sentiment in both the Twitter and the Reddit eWOM, this research does not find any evidence that any of the predictors, including the lagged dependent variable, can predict the positive sentiment for either platform. Only the negative Reddit sentiment can be predicted to some extent by its own lagged variable.

4.3. Effects on stock return

The second part of the analyses focuses on the effects of the customer satisfaction predictors on the stock return. The results of each regression can be found in table 3. Of all models that have been tested, only two show a minor significant relationship. Firstly, there is a mildly significant relationship between the net Twitter sentiment and the stock return a year later. This effect is an increase of 1.067 percentage points per percentage point increase in net Twitter sentiment. Secondly, a small significant effect was found

sentiment and the stock return is only mildly significant, the magnitude of the effects is also nearly negligible.

Additionally, the Akaike weights have been calculated for all models to find which predictor performs relatively best in predicting stock returns. Given that the Akaike weight for the net Twitter sentiment is 22.92% and the Akaike weight for the second-best performing model, negative sentiment growth on Reddit, is 14.94%, it can be concluded that the model with the net Twitter sentiment outperforms the model with the Reddit sentiment. Thus, the hypothesis that Reddit performs better in predicting yearly stock returns than Twitter is rejected.
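For reference, Akaike weights as described by Wagenmakers & Farrell (2004) can be computed from a set of AIC values as in the sketch below. The AIC values in the example are invented, since only the resulting weights are reported here.

```python
import math


def akaike_weights(aics):
    """Convert AIC values into weights that sum to one."""
    best = min(aics)
    # Relative likelihood of each model compared with the best model.
    relative = [math.exp(-0.5 * (aic - best)) for aic in aics]
    total = sum(relative)
    return [value / total for value in relative]


# Hypothetical AIC values for two competing models.
weights = akaike_weights([100.0, 102.0])
```

Because the weights are normalised over the compared set, adding or removing a model changes the weight of every other model.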

Lastly, considering the possible complementary value of Reddit and Twitter sentiment, table 4 depicts the optimal models after a stepwise regression. To be more specific, the initial model contained all predictor variables, and after a stepwise deletion process the best-performing model remained. In table 4 each column represents the optimal outcome, with respect to model fit (AIC), for different circumstances. From this table it can be concluded that only the model that does not include the lagged dependent variable shows a mildly significant relationship, and this model contains only one eWOM predictor variable. Furthermore, the optimal model that controls for years and the lagged dependent variable contains both a predictor for positive Twitter sentiment in the year before and the change in Reddit sentiment since the year before. Moreover, this model outperforms the models in which only one of the two eWOM sources is present. Thus, this is, to some extent, in line with the hypothesis that both sources of eWOM complement one another. However, since the relationships are not significant and only appear in one of the four tested models, this research yields only little evidence to support the claim that Reddit sentiment has incremental value with respect to Twitter sentiment.

5. Conclusions and Recommendations

5.1. Conclusions

Furthermore, with respect to the stock return in a given year, the model containing the growth of negative Reddit sentiment performed second best, after the model containing the net Twitter sentiment. Thus, the hypothesis that Reddit is better at predicting yearly stock returns is rejected. However, because the Reddit model based on negative sentiment growth came out second best, it cannot be considered totally irrelevant.

In conclusion, this research found little evidence for a complementary value of including both Twitter and Reddit sentiment in a predictive model. Only one of the four models showed evidence for this complementary value. Hence, the hypothesis of a complementary value is not rejected, but it is also not strongly supported by the results of this research.

5.2. Recommendations

This research has a couple of limitations. First and foremost is the limited amount of data available from Reddit. For data collection, Twitter is more suitable than Reddit, for two reasons. Firstly, it is harder to collect data from Reddit because the available scraping tools restrict the volume of data that can be collected. Hence, the difference between Twitter and Reddit volume in this dataset is quite large. Secondly, the eWOM on Twitter is centred in one place, whereas on Reddit it is spread over thousands of sub-websites, called subreddits. Therefore, choices have to be made regarding specific collection points. In this research, all data was collected from the subreddit specific to the firm. In future research, an overall collection method could be applied, in which the whole website is scraped without a focus on any specific sub-page. Another possibility is to consult only subreddits that focus on investing and business. However, it is important to note that these alternative methods can be very time-consuming due to the limitations of the current scraping packages for Reddit.

Research Recommendations


different characteristics of the userbase can lead to new insights and more eWOM predictors might increase the predictive value of eWOM data as a whole.

Secondly, the time interval used in this research is yearly. However, especially when analysing variables that might impact stock return, many practitioners are interested in the effects over other (shorter) time intervals. Also, the eWOM predictors might differ from one another in this respect. By extension, a social media platform like Twitter can be considered more dynamic than Reddit, so the effects of Twitter might also be stronger in the short run. This research does not account for such differences in effects over time.

Thirdly, the only firm performance variable considered in this research was stock return. However, other firm performance variables discussed by De Haan (2020) also showed significant relationships with the eWOM and ACSI variables he discussed. Likewise, eWOM found on Reddit might affect variables other than stock return alone.


References

Anderson, E. W. (1998). “Customer Satisfaction and Word of Mouth,” Journal of Service Research, 1(1), 5–17.

Aksoy, Lerzan, Bruce Cooil, Christopher Groening, Timothy L. Keiningham, and Atakan Yalçın (2008), “The Long-Term Stock Market Valuation of Customer Satisfaction,” Journal of Marketing, 72 (4), 105-22.

Choi, D., Matni, Z., & Shah, C. (2016), “What social media data should I use in my research?: A comparative analysis of Twitter, YouTube, Reddit, and the New York Times comments.” Proc. Assoc. Info. Sci. Tech., 53: 1-6.

De Haan, E. (2020), “Satisfaction Surveys or Online Sentiment: Which Best Predicts Firm Performance?” MSI Working Paper Series.

De Haan, E., Verhoef, P. C., & Wiesel, T. (2015). The predictive ability of different customer feedback metrics for retention. International Journal of Research in Marketing, 32(2), 195-206.

Deng, S., Huang, Z., Sinha, A.P., & Zhao, H. (2018), “The Interaction between Microblog Sentiment and Stock Return: An Empirical Examination” MIS Quarterly, 42(3): 895-918.

Elliot, P., Connor, H., Jinjie, Y., & Tyler, R. (2018). “The effects of group size and time on the formation of online communities: Evidence from reddit.” Social Media Society, (2018).

Fornell, Claes, Sunil Mithas, Forrest V. Morgeson III, and Mayuram S. Krishnan (2006), “Customer Satisfaction and Stock Prices: High Returns, Low Risk,” Journal of Marketing, 70 (1), 3-14.

Goodrich, B., Kurkiewicz, D., & Rinker, T. (2018). Package ‘qdap’. Tech. Rep., 2018.[Online].

Ittner, C. D., & Larcker, D. F. (1998). “Are nonfinancial measures leading indicators of financial performance? An analysis of customer satisfaction.” Journal of Accounting Research, 36, 1-35.

Ittner, C., Larcker, D., & Taylor, D. (2009). “Commentary—The stock market's pricing of customer satisfaction.” Marketing Science, 28(5), 826-835.

Luo, X. (2007). “Consumer Negative Voice and Firm-Idiosyncratic Stock Returns.”


Luo, X. (2009), “Quantifying the Long-Term Impact of Negative Word of Mouth on Cash Flows and Stock Prices” Marketing Science, 28(1), 148-165.

Peng, C. L., Lai, K. L., Chen, M. L., & Wei, A. P. (2015). “Investor sentiment, customer satisfaction and stock returns.” European Journal of Marketing.

Priya, S., Sequeira, R., Chandra, J., & Dandapat, S. (2019). “Where should one get news updates: Twitter or Reddit.” Online Social Networks and Media, 9, 17-29.

Ranco, G., Aleksovski, D., Caldarelli, G., Grčar, M., & Mozetič, I. (2015). “The Effects of Twitter Sentiment on Stock Price Returns.” PLoS ONE, 10(9), e0138441.

Ruan, Y., Durresi, A., & Alfantoukh, L. (2018). “Using Twitter trust network for stock market analysis.” Knowledge-Based Systems, 145, 207-218.

Smailović, J., Grčar, M., Lavrač, N., & Žnidaršič, M. (2013, July). “Predictive sentiment analysis of tweets: A stock market application.” International Workshop on Human-Computer Interaction and Knowledge Discovery in Complex, Unstructured, Big Data (pp. 77-88). Springer, Berlin, Heidelberg.

Sprenger, T., Tumasjan, A., Sandner, P., & Welpe, I. (2014). “Tweets and trades: The information content of stock microblogs.” European Financial Management, 20(5), 926-957.

Sul, H., Dennis, A., & Yuan, L. (2017). “Trading on twitter: Using social media sentiment to predict stock returns.” Decision Sciences, 48(3), 454-488.

Tirunillai, S., & Tellis, G. J. (2012). “Does chatter really matter? Dynamics of user-generated content and stock performance.” Marketing Science, 31(2), 198-215.

Wagenmakers, E. J., & Farrell, S. (2004). “AIC model selection using Akaike weights.” Psychonomic Bulletin & Review, 11(1), 192-196.


Table 1

Overview of all the 20 firms in the dataset

| Firm Name | Twitter Account | Tweets | Subreddit | Posts |
|---|---|---|---|---|
| Amazon | Amazon | 378341 | Amazon | 2374 |
| American Air | Americanair | 397747 | Americanairlines | 190 |
| Bestbuy | Bestbuy | 131234 | Bestbuy | 302 |
| Costco | Costco | 22625 | Costco | 423 |
| Delta | Delta | 403670 | Delta | 97 |
| Domino's | Dominos | 259222 | Dominos | 311 |
| Dunkin' Donuts | Dunkindonuts | 108770 | Dunkindonuts | 175 |
| Ebay | Ebay | 118619 | Ebay | 1054 |
| Google | Google | 265869 | Google | 1731 |
| Home Depot | Homedepot | 59774 | Homedepot | 369 |
| McDonald's | Mcdonalds | 354079 | Mcdonalds | 2247 |
| Nike | Nike | 93800 | Nike | 417 |
| Papa John's | Papajohns | 148388 | Papajohns | 702 |
| Sears | Sears | 37120 | Sears | 281 |
| Southwest Airlines | Southwestair | 396366 | Southwestairlines | 448 |
| Starbucks | Starbucks | 508268 | Starbucks | 1122 |
| Target | Target | 204211 | Target | 437 |
| United Airlines | United | 445588 | Unitedairlines | 846 |
| UPS | UPS | 128395 | UPS | 336 |


Table 2

Model Estimates with predictors as Dependent Variable (n = 83)

Dependent variable (T + 1): ACSI  ACSI  Pos. Sent. R.  Pos. Sent. R.  Neg. Sent. R.
Intercept: 85.871****  28.595****  0.110  0.132  0.403**
ACSI: 0.709****  0.001  0.001  -0.002
Pos. Sent. R.: 5.765  -0.581  0.082
Neg. Sent. R.: -5.140  -1.427  -0.000
Pos. Sent. T.: -35.695  -31.421**  0.058  0.018  -0.390
Neg. Sent. T.: -128.982***  -71.624**  -0.555  -0.587  0.272
R2: .18  .722  .037  .051  .063
Adjusted R2: .123  .697  -0.0291  -.32  -.002
Incl. Lag DV: No  Yes  No  Yes  No

Dependent variable (T + 1): Neg. Sent. R.  Pos. Sent. T.  Pos. Sent. T.  Neg. Sent. T.  Neg. Sent. T.
Intercept: 0.275*  0.085  0.132  0.039  0.010
ACSI: -0.001  0.000  0.001  -0.000  0.000
Pos. Sent. R.: 0.135  0.005  0.082  0.021  0.016
Neg. Sent. R.: 0.298***  -0.045  -0.000  0.055****  0.039***
Pos. Sent. T.: -0.344  0.018  -0.045
Neg. Sent. T.: -0.285  -0.587  0.416***
R2: 0.22  .036  .051  .188  .385
Adjusted R2: .151  -.013  -.032  .147  .332
Incl. Lag DV: Yes  No  Yes  No  Yes


Table 3

Model Estimates with Stock Return as Dependent Variable

| Predictor | Stock Return T + 1 |
|---|---|
| ACSI | 0.007 (7.21%) |
| ∆ ACSI | -0.727 (4.16%) |
| Net Twitter Sentiment | 1.067* (22.92%) |
| ∆ Net Twitter Sentiment | 0.002 (3.03%) |
| Positive Twitter | 1.438 (10.60%) |
| ∆ Positive Twitter | -0.084 (6.67%) |
| Negative Twitter | -2.747 (8.47%) |
| ∆ Negative Twitter | -0.005 (2.96%) |
| Net Reddit Sentiment | -0.183 (4.06%) |
| ∆ Net Reddit Sentiment | -0.000 (2.96%) |
| Positive Reddit | 0.177 (3.84%) |
| ∆ Positive Reddit | -0.000 (4.63%) |
| Negative Reddit | -0.135 (3.54%) |
| ∆ Negative Reddit | -0.000* (14.94%) |
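The percentages in parentheses in Table 3 add up to roughly 100%, which is consistent with Akaike weights as described by Wagenmakers and Farrell (2004). This section does not state that explicitly, so that reading is an assumption. Under that assumption, the sketch below shows how such weights follow from a set of AIC values. The AIC values in the example are hypothetical and are not taken from this research.

```python
import numpy as np

def akaike_weights(aics):
    """Akaike weights: w_i = exp(-0.5 * d_i) / sum_k exp(-0.5 * d_k), with d_i = AIC_i - min(AIC)."""
    aics = np.asarray(aics, dtype=float)
    delta = aics - aics.min()          # AIC difference to the best (lowest-AIC) model
    relative = np.exp(-0.5 * delta)    # relative likelihood of each model
    return relative / relative.sum()   # normalise so the weights sum to 1

# Hypothetical AIC values for three competing single-predictor models (assumption).
example_aics = [10.0, 12.0, 15.0]
weights = akaike_weights(example_aics)
print(np.round(weights * 100, 2))  # weights expressed as percentages
```

A weight can be read as the relative support for a model within the compared set, which is why ranking models by weight gives the same order as ranking them by AIC.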


Table 4

Models with multiple predictors

ACSI: 0.007
∆ ACSI: -1.124
Positive Twitter: 1.268  1.291
∆ Positive Twitter
Negative Twitter
∆ Negative Twitter
Positive Reddit
∆ Positive Reddit
Negative Reddit
∆ Negative Reddit: 0.000  0.000*  0.000
Lag DV: Yes  No  Yes  No
Year Dummies: Yes  Yes  No  No
AIC: -2.858  1.392  -6.230  0.348
