
Using chatter to predict stock price movements: a sentiment analysis for the AEX

Academic year: 2021

Abstract

This thesis examines whether sentiment in Twitter messages can predict stock price movements and abnormal returns. Based on prior research, it expects a positive relationship between positive sentiment and abnormal returns. In a sample of AEX firms between October 2013 and November 2013, the thesis finds no such relation for sentiment, but does find one for message volume.


Redmar de Boer
Student number: 5732131
Referee: V. Malinova


I Introduction

Imagine someone who intends to hurt a company or an individual. Such a person can spread false information on the Internet, either for personal gain or simply out of a grudge against the other party. Before the rise of social media, it took considerable time and effort to reach even a handful of people. Nowadays it is possible to share your thoughts with millions around the world by clicking a single button. Someone can, for example, manipulate investors by using social media to spread false negative information after taking a short position (Chen et al., 2012).

Social media platforms such as Facebook, Twitter, and YouTube are becoming more popular every day and have merged into people's daily routines. Because of its ease of use, speed, and reach, social media is quickly changing the public discourse in society and setting trends and agendas in topics that range from the environment and politics to technology and the entertainment industry (Asur & Huberman, 2010).

With over 1 billion users on Facebook¹ and more than 500 million users on Twitter², social media are interesting to monitor. The chatter on these platforms can be examined for multiple purposes. Recent research suggests that very early indicators can be extracted from social media to predict changes in various economic and commercial indicators (Bollen et al., 2011). For example, Choi (2009) finds early indicators of disease infection rates in Google search queries. Gruhl et al. (2005) use social media chatter to predict book sales. Furthermore, Asur and Huberman (2010) find that the public sentiment about movies expressed on Twitter can be used to predict box office receipts.

According to Bollen et al. (2011), not only news influences stock prices; public mood states and sentiment may play an equally important role. Human emotions play a significant role in human decision making, and Nofsinger (2005) finds that financial decisions are significantly driven by emotion and mood. Chatter may not be as innocent as it looks.

These examples show that the chatter on social media contains valuable information and that the sentiment extracted from it can be used to predict various economic indicators. This thesis uses the chatter on Twitter to examine whether the sentiment within these messages can predict the stock price movements of the 25 AEX-listed companies, and whether abnormal returns can be earned with this information. The main research question is: is it possible to predict stock price movements and abnormal returns by examining sentiment in Twitter chatter about the 25 AEX-listed companies? In addition to the main research question, this thesis answers the following sub-questions:

¹ http://www.statisticbrain.com/facebook-statistics/

² http://www.statisticsbrain.com/twitter-statistics/

• Which factors play a role in stock price movement prediction?

• What is Twitter?

• How can you extract sentiment out of chatter?

The motivation for this thesis comes from the literature. According to the efficient market hypothesis, financial markets are informationally efficient, meaning that market prices reflect all known information (Fama, 1970). Yet microbloggers can send and receive information about an event before it is covered by, for example, the news. The chatter is fast: everyone with an account can comment on a message and share it again. Research shows that financial markets do not always follow the efficient market hypothesis (Malkiel, 2003). This makes social media, and Twitter in particular, very interesting to examine. If sentiment analysis works for the AEX, it might be possible to predict future stock price movements, which is very valuable information. The sentiment analysis in this thesis examines microblog messages, whereas previous research focused more on Internet stock message boards.

The following section discusses the prior literature on the efficient market hypothesis, Twitter, blog posts, the predictability of stock prices, stock reactions to news, and sentiment. The thesis then describes the data, the methods, and the findings. It closes with a discussion of these findings and the conclusion.

II Literature Review

2.1 Efficient Market Hypothesis

Predicting future stock prices is not new; the topic has been examined by numerous researchers. According to Fama (1964) and his Efficient Market Hypothesis (EMH), financial markets are informationally efficient: market prices reflect all known information. The EMH comes in three forms: weak, semi-strong, and strong. In the weak form, only historical data are reflected in the current price. The semi-strong form embeds historical and currently public information in the price. The strong form includes historical, public, and private information in the price (Fama, 1964). Given this main principle of the EMH, it is believed that the market reacts instantaneously to any given news and that investors cannot earn excess profits from trading strategies based on publicly available information (Fama, 1970, 1991).

According to other research, however, financial markets do not always comply with the EMH. Malkiel (2003) examines the predictability of stock prices and finds that predictable patterns in stock returns can appear over time and even persist for short periods. Furthermore, he suggests that markets cannot be perfectly efficient, or there would be no incentive to uncover the information that gets reflected in market prices (Malkiel, 2003).

Tetlock (2008) finds that firms' stock prices underreact to the textual information in news stories, which is not in line with the EMH. The findings of Tetlock (2008) and Malkiel (2003) suggest that it might be possible to use news stories to earn excess profits, and thus that the EMH does not always hold.

2.2 What is Twitter?

Twitter³ is an information network and microblogging service that allows users to share messages of up to 140 characters, known as tweets. These tweets typically consist of personal information about the users, news, or links to media content. Since its introduction in 2006, Twitter's popularity has increased steadily. With 9100⁴ new tweets every second, it is one of the biggest media platforms on the Internet. Users can not only share their own updates; they can also follow other people and forward messages of other users. These forwarded messages are called retweets. The more a topic is retweeted by others, the more relevant it is.

There are different kinds of users on Twitter, but according to Java et al. (2007), most of the messages they post are just daily chatter. This is relevant because most daily chatter contains no valuable information. For a successful analysis, the messages with valuable information about the target companies must be selected with the use of keywords.

Because of its growing popularity, the amount of research on social media is expanding as well. According to Chen et al. (2014), social media are becoming increasingly popular and provide a good platform to share investment opinions or feelings on stocks. By applying sentiment analysis, it is possible to collect the social mood. Nofsinger (2005) finds that social mood determines the types of decisions made by consumers, investors, and corporate managers: the pieces of information we receive from others influence our own decisions. Bollen (2010) asks whether this also applies to societies at large. Behavioral economics tells us that emotions can affect individual behavior and decision making, and this social mood is also reflected in consumer behavior. When society is optimistic, investors may be more willing to take on additional debt and increase spending (Nofsinger, 2005). Collecting the social mood attracts a lot of attention from corporations because of its huge potential for viral marketing (Asur & Huberman, 2010). Because of its huge reach, news organizations even use Twitter to filter news updates through the community (Asur & Huberman, 2010).

³ www.twitter.com

⁴ http://www.statisticbrain.com/twitter-statistics/

2.3 Blog postings and sales

Prior research focuses more on blog postings on Internet financial message boards than on other social media platforms such as Twitter. This research nevertheless offers valuable insights that can be extended to the examination of Twitter messages, for example the studies of Chen (2012), Chang (2009), Tumarkin (2001), Tomkins (2005), and Antweiler (2004). The authors reach different outcomes: some find a relation between stock prices and blog posts, and others do not. The quality of the messages and the choice of explanatory variable play an important role.

Antweiler (2004) examines whether Internet stock messages contain financially relevant information, that is, whether they indicate that a stock is a good buy or a good sell. He finds useful information on the stock message boards: a positive shock to message posting predicts negative returns on the next day, and message posting helps to predict volatility. Financially relevant information is present, but the stock messages reflect public information extremely rapidly (Antweiler, 2004). Chen (2012) also finds a social media effect, and this effect is stronger for articles that receive more attention. Tomkins (2005) reports similar findings: with sales rank data from Amazon and blogs, it is possible to predict spikes in consumer purchase decisions.

These results differ from the findings of Tumarkin (2012) and Chang (2009). Using an event study, Tumarkin (2012) examines whether blog postings help to predict stock returns and/or trading volume, and finds that the relation between Internet message board postings and abnormal returns is statistically insignificant and consistent with market efficiency. Chang (2009) also finds no link between increased blog chatter and an increase in sales, but warns that blogs may not be as innocent as they used to be: if there is a link between blog posts and sales, someone might manipulate information to exploit its predictive power.

2.4 Predicting stock prices

Predicting sales of a book or movie is not the same as predicting stock market prices; other variables are involved in stock price valuation. But with the growing popularity of social media such as Twitter, the valuable information within is being tapped. According to Bloomberg (2010), there are investors who attribute their trading success to the information they find on social media platforms. Financial professionals have also developed Twitter-based trading systems that examine sentiment to look for investment opportunities (Bloomberg, 2010).

Microbloggers, such as Twitter users, are exposed to the most recent information on all stocks. They believe that it is possible to earn excess profits with this advantage, which contradicts the EMH (Fama, 1964), under which market prices reflect all known information. Bollen (2011) adds that news might be unpredictable, but that very early indicators can be extracted from online social media. These indicators can help to predict changes in various economic and commercial indicators, and this might work for the stock market as well by examining positive and negative mood in Twitter messages. Bollen (2011) finds that changes in the public mood state can be tracked from the content of large-scale Twitter feeds.

Sprenger (2010) also finds valuable information in microblogs that is not yet fully incorporated in current market indicators. The possible link between online search behavior and important market outcomes is of considerable interest to business practitioners (Zhang, 2011).

2.5 Stock reaction to news

A large number of articles examine the reaction of stock prices to news. Chan (2014) finds that the influence of news articles on stocks has two facets: fundamentals and emotion. With fundamentals, people use the qualitative and quantitative information in the latest news articles to adjust their investment decisions. With emotion, a positive or negative mood influences their decisions (Chan, 2014).

Cutler et al. (1989) report that macroeconomic performance news can explain approximately one-third of the variance in stock returns. Gidófalvi (2001) points out that short-term stock price movements can be predicted with financial news articles. A short-lived predictive power of general financial news on future stock prices is also present in the research of Tetlock et al. (2007). Chan (2014) adds that stock prices underreact to news.

According to Schumaker (2009), many information types, such as rumors and scandals, can move stock prices, but financial news articles are more stable and trustworthy. Engelberg et al. (2012) show that these financial news articles generate profitable trading opportunities after the news release day.

In more recent research, Yu et al. (2013) examine the role of social media attributes instead of the more conventional news sources. They find that social media attributes have a stronger relationship with stock performance than conventional media attributes do.

2.6 Sentiment

With the information available, sentiment analysis of Twitter messages might help to predict the stock prices of the 25 AEX-listed companies. The choice for sentiment analysis comes from prior research in this field. Asur and Huberman (2010) use a computer program⁵ to label sentiment as positive, neutral, or negative. With this program, they predict box office revenues for movies.

⁵ LingPipe linguistic analysis package

According to Gilbert et al. (2010), emotions can influence our decisions, including stock market investment decisions. The more uncertain people are about the future, the more they refrain from investing and trading. Other studies in behavioral economics, such as the research by Delong et al. (1990), Schumaker (2012), and Li (2006), find a relation between emotion and investment decisions. They discover that the sentiment in financial reports or news articles affects stock returns. Zhang (2011) uses mood words such as hope, fear, happy, positive, negative, and upset to measure the collective emotion: all tweets containing such words are counted and divided into two groups, positive and negative.
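A mood-word count of the kind described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the split of the six mood words into a positive and a negative set is assumed here and is not taken from Zhang (2011), and the function name and sample tweets are hypothetical.

```python
import re

# Assumed split of the mood words named in the text; the source does not
# state which word belongs to which group.
POSITIVE_WORDS = {"hope", "happy", "positive"}
NEGATIVE_WORDS = {"fear", "negative", "upset"}


def count_mood_tweets(tweets):
    """Count tweets that contain at least one positive or negative mood word."""
    counts = {"positive": 0, "negative": 0}
    for tweet in tweets:
        # Lower-case the tweet and split it into alphabetic words.
        words = set(re.findall(r"[a-z]+", tweet.lower()))
        if words & POSITIVE_WORDS:
            counts["positive"] += 1
        if words & NEGATIVE_WORDS:
            counts["negative"] += 1
    return counts


# Hypothetical example tweets, not taken from the thesis data.
sample_tweets = [
    "I hope the AEX closes higher today",
    "Investors fear another sell-off",
    "Happy with the results, but upset about the outlook",
    "Lunch was great",
]
print(count_mood_tweets(sample_tweets))
```

A tweet that contains words from both sets is counted in both groups in this sketch; whether the original study handles such tweets the same way is not stated in the text.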

III Methodology and data

3.1 Data limitations

There are some limitations when collecting Twitter chatter. First, it is not possible to use Twitter's own search engine to search for messages on the target companies. Twitter only allows users to search messages that are at most 6 to 9 days old, and it grants the right to search the entire database of tweets to only a handful of companies and institutions. This makes Twitter's search engine unsuitable for collecting Twitter chatter.

Second, because the data cannot be searched on Twitter itself, a script or monitoring program is needed to collect the messages. Such a program also helps to analyze them. Collecting Twitter data is very time-consuming, and it is important to use proper search queries so that the collected messages are of high quality and provide value-relevant information.

Third, the data cover Dutch companies only: the sample consists of the 25 AEX-listed companies. This makes the sample small, but sentiment is collected for 61 days, giving 1525 sentiment observations in total.

3.2 Data

Because of these data limitations, a monitoring website is used to collect the data. Most of the problems described in the previous section are eliminated by using Twiqs⁶, a free online monitoring program. The website gives researchers and students the opportunity to examine Twitter messages. Twiqs is developed by the Netherlands eScience Center⁷. Its purpose is to give researchers and students access to daily communication on Twitter back to 16 December 2010.

⁶ www.twiqs.nl

⁷ www.twiqs.nl/faq

The software behind Twiqs allows a continuous search of new tweets based on keywords in the messages or on the names of the users who send them (Sang & van den Bosch, 2014). Twiqs also has its limitations: it only searches Dutch tweets, because it is difficult to examine the content of a tweet in a foreign language. Additionally, Twiqs classifies each message as having positive, neutral, or negative sentiment.

Similar to the research of Bollen (2011), tweets are filtered using a general mood prediction, which is built into the search engine of Twiqs. The sample comprises sentiment for the 25 AEX-listed companies, collected on a daily basis. For a full list of the companies and their daily sentiment scores, see Appendix A1. Keywords make it possible to retrieve company-relevant messages; here the keywords consist of the name of the company. Twiqs uses these keywords to search the 1.5 million to 2 million tweets available in its database. It is also possible to narrow the search by putting a hashtag in front of the keyword. Hashtags are used on Twitter to categorize and group messages. All tweets containing the same hashtag are grouped, and Twitter displays the top ten groups on its homepage as trending topics. With this filter, only messages containing the hashtag are returned; without it, the search returns all messages, with or without a hashtag. This research makes no distinction between the two: both are included in the data sample.

In total 61 days are monitored, which yields 1525 sentiment observations. The non-trading days are then removed from the sample, as they cannot be used to predict stock prices. For every search query, Twiqs gives a sentiment score between -100 and +100. A score of +100 (-100) means that 100% of the messages returned by the search query are positive (negative). The sentiment observations can be divided into the following groups:

Table 1: Sentiment scores and corresponding sentiment categories

Type of Sentiment (S)     Value
Very Positive (VP)        +51 and up
Positive (P)              between +21 and +50
Slightly Positive (SP)    between +1 and +20
Neutral (NT)              score of 0
Slightly Negative (SN)    between -1 and -20
Negative (N)              between -21 and -50
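
The bands of Table 1 can be written as a small lookup function. The sketch below is illustrative only: the function name is hypothetical, scores are assumed to be integers between -100 and +100, the lower bound of the Negative band is read as -50, and the "VN" band for scores below -50 is an assumption made by symmetry with "VP", since Table 1 as printed does not list it.

```python
def sentiment_category(score):
    """Map a sentiment score between -100 and +100 to a Table 1 label."""
    if not -100 <= score <= 100:
        raise ValueError("score must lie between -100 and +100")
    if score > 50:
        return "VP"  # Very Positive: +51 and up
    if score > 20:
        return "P"   # Positive: +21 to +50
    if score > 0:
        return "SP"  # Slightly Positive: +1 to +20
    if score == 0:
        return "NT"  # Neutral
    if score >= -20:
        return "SN"  # Slightly Negative: -1 to -20
    if score >= -50:
        return "N"   # Negative: -21 to -50
    # Assumption: a Very Negative band mirroring Very Positive;
    # it does not appear in Table 1.
    return "VN"


for example in (75, 21, 0, -20, -21):
    print(example, sentiment_category(example))
```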


There is no distinction between October and November: the data do not differ between the two months. The portfolios stay almost the same, and therefore the data are grouped into a single time period.

After examining the sentiment scores, companies are placed into a sentiment portfolio. For a full list of companies per portfolio, see Appendix B1 and B2. All companies with a positive ('+') sentiment score are placed in the Positive portfolio 'P', those with a negative ('-') score in the Negative portfolio 'N', and those with a neutral score of zero in the last portfolio, 'NT'. There are only three portfolios because the number of companies in the other sentiment categories, based on the mean, is too low to examine. Therefore, companies are divided into a positive, negative, or neutral portfolio.
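
The portfolio assignment can be illustrated with a short sketch. It assumes that a company is classified by the sign of its mean daily sentiment score over the sample period, which is one reading of "based on the mean" above; the firm names and scores are hypothetical and are not the thesis data.

```python
from statistics import mean


def assign_portfolio(daily_scores):
    """Return 'P', 'N' or 'NT' from the sign of the mean daily score."""
    average = mean(daily_scores)
    if average > 0:
        return "P"
    if average < 0:
        return "N"
    return "NT"


# Hypothetical daily sentiment scores for three made-up firms.
scores_by_firm = {
    "FirmA": [10, 30, -5],
    "FirmB": [0, 0, 0],
    "FirmC": [-20, 5, -15],
}
portfolios = {firm: assign_portfolio(s) for firm, s in scores_by_firm.items()}
print(portfolios)
```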

Table 2: The AEX companies divided into sentiment portfolios

Portfolio based on Sentiment    #Firms
P                               15
NT                              4
N                               6
Total                           25

The table shows that most of the companies belong to the 'P' portfolio, which suggests that most of the chatter contains positive sentiment. Zhang (2011) explains this by suggesting that people prefer optimistic to pessimistic words when using social media.

3.3 Methodology

An event study is difficult for this subject, since it is hard to determine which part of a price movement is attributable to chatter and which part to, for example, news articles. It is possible, however, to consider the impact of the balance of emotion. This thesis uses an index similar to the Governance Index of Gompers & Metrick (2003), called the Sentiment Index ("S") from now on. It is the sum of one point for the existence (or absence) of each sentiment. There are three categories of sentiment: positive, negative, and neutral.

There are 25 companies in the data sample. These are the AEX-listed companies during the period from 1-10-2013 to 30-11-2013. See Appendix A1 for the full list and their sentiment scores. The sentiment is collected on a daily basis, and the companies are assigned to their sentiment categories. It is also possible to add subindices such as the number of followers. However, this research does not use followers, because it is too difficult to match the followers to the tweets during the data period: changes in the number of followers for October and November cannot be retrieved. This is why the number of followers is not included.

The companies are divided into three portfolios according to their sentiment scores. For the abnormal returns, all stock prices during the sample period are collected⁸ and examined. The non-trading days are removed from the observations, as they cannot be used to predict stock prices. These removed values could, however, be used to examine the sentiment during the weekend and then to predict stock prices on Mondays. See Appendix C for a full list of percentage changes in stock prices. To calculate abnormal returns, the four-factor model of Carhart (1997) is estimated by:

(1)    $R_t = \alpha + \beta_1 \, \mathrm{RMRF}_t + \beta_2 \, \mathrm{SMB}_t + \beta_3 \, \mathrm{HML}_t + \beta_4 \, \mathrm{Momentum}_t + \varepsilon_t$

where $R_t$ is the excess return to some asset in month $t$, $\mathrm{RMRF}_t$ is the month $t$ value-weighted market return minus the risk-free rate, and $\mathrm{SMB}_t$, $\mathrm{HML}_t$ and $\mathrm{Momentum}_t$ are the month $t$ returns on zero-investment factor-mimicking portfolios to capture size, book-to-market, and momentum.
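
Equation (1) is a linear regression, so alpha and the four betas can be estimated by ordinary least squares. The sketch below shows one way to do this with only the Python standard library; it is not the estimation code used in this thesis, the function names are hypothetical, and the data are synthetic and noise-free so that the known coefficients should be recovered. In practice a numerical or statistics library would normally be used instead of the hand-written solver.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def four_factor_ols(excess_returns, rmrf, smb, hml, momentum):
    """Return [alpha, beta1, beta2, beta3, beta4] of equation (1) via OLS."""
    rows = [[1.0, a, b, c, d] for a, b, c, d in zip(rmrf, smb, hml, momentum)]
    k = 5
    # Normal equations: (X'X) coefficients = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, excess_returns)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic, noise-free data (NOT the thesis data) with known coefficients.
rng = random.Random(0)
true_coefs = [0.001, 1.2, 0.3, -0.2, 0.1]
factors = [[rng.gauss(0.0, 0.01) for _ in range(60)] for _ in range(4)]
returns = [
    true_coefs[0] + sum(beta * f[t] for beta, f in zip(true_coefs[1:], factors))
    for t in range(60)
]
estimates = four_factor_ols(returns, *factors)
print([round(value, 6) for value in estimates])
```

The first element of the result is the intercept, which corresponds to alpha in equation (1), the abnormal return that is not explained by the four factors.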

I follow the research of Zhang (2011). The expectation is that increased market performance, small firms, high book-to-market firms, and firms with recent high returns will provide additional returns.

For the computation of the risk factors I again follow Zhang (2011) and Gompers & Metrick (2003). The risk factor for market performance is the return of the overall market relative to the risk-free rate, Rm-Rf. The risk factor for size (SMB) is the return difference between portfolios of "small" and "big" stocks. The risk factor for book-to-market (HML) is the return difference between portfolios of "high" and "low" book-to-market stocks. The risk factor for momentum (UMD) is the return difference between a portfolio of stocks with high returns in the past month and a portfolio of stocks with low returns in the past month.

Similar to the research of Zhang (2011), I also create a new variable, called SENT. The companies are sorted into five equal groups, ranging from the companies with the highest number of tweets to the companies with the lowest. The companies in the Q1 portfolio receive the most attention on Twitter, and the companies in Q5 the least. The SENT variable is constructed by subtracting the return of Q5 from the return of Q1.
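
The construction of SENT can be sketched as follows. The code is a minimal illustration under assumptions: firms are ranked on a single message-volume figure, the number of firms divides evenly into five groups, group returns are simple averages, and all firm names, volumes, and returns are made up.

```python
def sent_spread(volume_by_firm, return_by_firm, groups=5):
    """Return (mean return of Q1 minus mean return of Q5, list of groups)."""
    # Rank firms from highest to lowest message volume.
    ranked = sorted(volume_by_firm, key=volume_by_firm.get, reverse=True)
    size = len(ranked) // groups  # assumes the firms split evenly
    portfolios = [ranked[i * size:(i + 1) * size] for i in range(groups)]

    def mean_return(firms):
        return sum(return_by_firm[f] for f in firms) / len(firms)

    return mean_return(portfolios[0]) - mean_return(portfolios[-1]), portfolios


# Hypothetical data for 25 made-up firms; F01 has the highest volume.
firms = [f"F{i:02d}" for i in range(1, 26)]
volumes = {firm: 1000 - 10 * i for i, firm in enumerate(firms)}
returns = {firm: 0.001 * (i + 1) for i, firm in enumerate(firms)}

sent, portfolios = sent_spread(volumes, returns)
print("Q1:", portfolios[0])
print("Q5:", portfolios[-1])
print("SENT (Q1 - Q5):", round(sent, 4))
```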

⁸ http://beurs.fd.nl/analyse/amsterdam/aex/

3.4 Hypotheses

According to Zhang (2011), when people express a lot of hope, fear, and worry, the Dow Jones goes down the next day, and when they express less hope, fear, and worry, the Dow Jones goes up. This is in line with the results of Sprenger (2010): if people are optimistic about the market, it goes up. The expectation is therefore that a stock with positive sentiment will show an increased stock price the day after.

• Hypothesis 1: A stock with positive sentiment is followed by an increase in stock price.

Next to the sentiment in the chatter, the number of tweets is also worth considering. The more people talk about a subject, the more it spreads on Twitter and, in theory, the more attention it receives. According to Zhang (2011), an increase in the intensity of ticker searches should be accompanied by increased buying pressure and, with it, an increase in stock price. To test this, I use the SENT variable. Antweiler and Frank (2004) find that message volume can predict next-day trading volume. This is similar to the research of Zhang (2011) and Asur & Huberman (2010), who find a positive relationship between search intensity and return. I expect to find a positive relationship as well.

• Hypothesis 2: Companies within the top group of SENT have higher abnormal returns than companies in the lowest group.

According to the EMH, markets are informationally efficient: they reflect all known information. However, the EMH has been challenged several times. Malkiel (2003) examines the predictability of stock prices and finds that the market does not always follow the EMH. In other research, Chan (2003) finds that stock prices underreact to news. With Twitter, information spreads very fast, which might leave some room to profit from the information in the tweets. Schumaker (2012) and Li (2006) find a relation between sentiment and stock prices. This relation is also visible in the research of Sprenger (2010), who finds that the messages may contain new information not yet reflected in market prices. Since there are examples of inefficient markets, I expect to find a positive effect of the sentiment within the tweets on stock prices.

• Hypothesis 3: A portfolio consisting of companies with positive sentiment can outperform the market.


IV Results

4.1 Descriptive statistics

In total there are 263,933 stock-related tweets and 1525 sentiment observations for the 25 AEX-listed companies. After removing the non-trading days from the sample, 206,100 tweets and 1075 sentiment observations remain. The daily message volume ranges from 2567 to 7776 messages, with an average of 4793 and a standard deviation of 933. Overall daily sentiment appears to be positive, with an average of 75 and a standard deviation of 83. The summary statistics are displayed in Table 3.

Table 3: Summary statistics on the data

Variable         Obs    Mean       Std. Dev.    Min        Max
Sentiment        43     75.186     82.89        -94        301
volume           43     4793.02    932.73       2567       7776
return stocks    43     0.03240    0.1567       -0.2723    0.4197

There appears to be a weak positive correlation (0.0387) between message volume and returns, and a weak negative correlation (-0.0263) between sentiment and returns.

4.2 Positive sentiment and stock price movement

Hypothesis 1: A stock with positive sentiment is followed by an increase in stock price.

This thesis starts the empirical analysis by examining the predictive power of sentiment. The positive-sentiment stocks are selected from the sample data. The correlation between positive sentiment and the stock price movement one day after the measurement is weakly negative, at -0.1931. PS contains the positive sentiment days and dAEX the stock movements on the day after the event.

         |      PS     dAEX
    -----+------------------
      PS |  1.0000
    dAEX | -0.1931   1.0000

Using OLS, there is no reason to suggest that sentiment in Twitter messages has predictive power in this data sample. With a t-score of -1.26, the coefficient is not statistically significant at the 5% or 1% level, so Hypothesis 1 is rejected.
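
The reported t-score can be related to the correlation between PS and dAEX. In a regression with a single explanatory variable, the t-statistic of the slope equals r·sqrt(n - 2)/sqrt(1 - r²), where r is the sample correlation and n the number of observations. The sketch below applies this formula with r = -0.1931 and assumes n = 43, the number of observations shown in Table 3; the critical value of about 2.02 for a two-sided test at the 5% level with 41 degrees of freedom is an approximate table value.

```python
import math


def correlation_t_statistic(r, n):
    """t-statistic for a correlation r based on n observations."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


R_PS_DAEX = -0.1931      # correlation between PS and dAEX from the text
N_OBS = 43               # assumption: observation count taken from Table 3
CRITICAL_5PCT = 2.02     # approximate two-sided 5% critical value, 41 d.f.

t_value = correlation_t_statistic(R_PS_DAEX, N_OBS)
print("t =", round(t_value, 2))
print("significant at 5%:", abs(t_value) > CRITICAL_5PCT)
```

Under these assumptions the absolute t-value stays below the critical value, which is consistent with the conclusion that sentiment shows no significant predictive power in this sample.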

According to this result, the EMH holds: market prices reflect all known information. This is in line with the findings of Tumarkin (2012), but not with the expectation. There are a couple of possible explanations. First, the data sample may not be solid enough. The sample contains only Dutch tweets, which cover a small percentage of all tweets worldwide. Foreign tweets are not included because of the difficulties in extracting sentiment from these messages. They may still have an influence, but it is not recorded by this thesis. Further research could collect foreign tweets as well as Dutch ones; the data might then show a predictive power of Twitter messages.

Second, Twitter is still an emerging platform. Not everyone uses Twitter to tweet about stocks, and only a fraction of all messages contains stock-related information. Most of the chatter contains personal information, which is not valuable for predicting stock prices.

4.3 Message volume

The expectation for the volume test is a positive relationship between message volume and stock returns. To examine this relationship, the companies are split into five groups of five companies each, with Q1 as the group with the highest number of messages and Q5 as the group with the lowest. On average, the Q1 and Q5 portfolios generate returns without risk adjustment of 0.54% and 0.64% respectively. The measurement follows Zhang (2011): the return of the quintile with the lowest volume (Q5) is subtracted from that of the quintile with the highest volume (Q1). The implied return without risk adjustment is -0.11%, so firms in the Q1 quintile earn 11 basis points less than those in Q5.

However, the implied return with risk adjustment is almost 0.01% higher for Q1 firms than for Q5 firms. Alpha in Table 4 shows the daily abnormal return in percent.

!

Portfolio Raw Returns (%) Alpha Rm-Rf SMB HML WML R-squared (%) Q1 0.0053

0.0079*

(0.005) 5.1469 (0.7036) 0.0375* (0.0857)

-0.0072*

(0.01651) 0.0598* (0.1366) 57,3 Q2 0.11

0.0035*

(0.0078) 5.8962 (1.072) -0.0221* (0.1305)

0.0043*

(0.0251) -0.0352* (0.2080) 43,5 Q3 0.0037

-0.0015*

(0.007) 4.1596 (0.9735) -0.0132* (0.1186) 0.0025* (0.0228)

-0.0210*

(0.1889) 31,7 Q4 0.0057 0.0014* (0.0040) 4.2505 (0.5504) 0.0141* (0.0670) -0.0027* (0.0129) 0.0225* (0.1068) 60,0 Q5 0.0064 -0.0015* (0.0047) 5.3203 (0.6500) -0.0488* (0.0792) 0.0094* (0.1525) -0.0777* (0.1262) 63,3

(14)

Table 4: Q1 contains the firms with highest message volume and Q5 those with lowest mes-sage volume. The abnormal returns are obtained by the regression of returns on three fac-tors from Fama and French 9 (1993) and a momentum factor from Carhart (1997): excess re

-turn on the market, the re-turn difference between small and big stock, and the re-turn difference between portfolio’s of high and low book-to market stocks. !

* Significant at the 5% level!

!

! The implied return of Q1-Q5 suggest that there is evidence to support the null hypothesis: companies within the top group of SENT have higher abnormal returns than companies in the lo-west group. This result is in line with the expectation and with prior research. Asur & Huberman (2010) and Zhang (2011) finds a positive relationship between volume and returns. But it is difficult to exploit these market inefficiencies due to reasonable transaction costs (Sprenger, 2010).!

Although the model supports the null hypothesis, the explanatory variables explain at most around two-thirds of the variation in returns (R-squared between 31.7% and 63.3% in Table 4), so there is room for improvement. Retweets can be added to examine the reach of the messages: a retweeted message reaches more people and can influence their stock market investment decisions.

The number of followers is also something to consider. If a user has many followers, their messages spread to more individuals, which might also affect decision making. The probability of a retweet rises as well, because more people see the message in their timeline.

Another improvement would be to include trending topics. Trending topics consist of tweets with a hashtag, and hashtags make tweets visible to a greater audience (Kumar, 2012). This might also help to extend the model and increase its overall quality.

4.4 The ability of positive sentiment portfolios to outperform the market

When dividing the companies into positive, neutral and negative portfolios based on their sentiment, the positive portfolio shows abnormal returns of 0.18% without risk-adjustment and the negative portfolio 0.09%. On this basis, a portfolio of positive-sentiment firms earns 0.09% more than a negative portfolio.

Table 4 (continued): difference between the highest-volume and lowest-volume quintiles

| Portfolio | Raw returns (%) | Alpha | Rm-Rf | SMB | HML | WML | R-squared (%) |
|---|---|---|---|---|---|---|---|
| Q1-Q5 | -0.11 | 0.0023* (0.0068) | -0.1734* (0.9402) | 0.0863* (0.1145) | -0.0166* (0.0221) | 0.1375* (0.1825) | 2 |
| Implied return Q1-Q5 | -0.11 | 0.0094 | | | | | |
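The split into positive (P), neutral (NT) and negative (N) portfolios can be illustrated with the mean sentiment scores from Appendix B1. The thesis does not state the threshold it used; the cutoff of 0.5 below is purely an assumption, chosen because it reproduces the grouping listed in Appendix B2 for the firms shown here. The function name `classify` is hypothetical.

```python
# Mean sentiment scores for a subset of firms, taken from Appendix B1.
mean_sentiment = {
    "Philips": 12.27907,
    "Heineken": 10.34884,
    "Unilever": 2.813953,
    "Ahold": 0.4651163,
    "IngGroup": 0.0465116,
    "Gemalto": -0.3023256,
    "ReedElsevier": -0.627907,
    "Ziggo": -0.8139535,
    "TNTexpress": -2.906977,
}

CUTOFF = 0.5  # assumption: the threshold is not stated in the thesis


def classify(score, cutoff=CUTOFF):
    """Return 'P', 'NT' or 'N' for a firm's mean sentiment score."""
    if score > cutoff:
        return "P"
    if score < -cutoff:
        return "N"
    return "NT"


portfolios = {"P": [], "NT": [], "N": []}
for firm, score in mean_sentiment.items():
    portfolios[classify(score)].append(firm)

for label, members in portfolios.items():
    print(label, members)
```

A different cutoff would move the firms with scores close to zero, such as Ahold or ReedElsevier, between portfolios, so the portfolio returns are sensitive to this choice.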

9 Data on the three Fama & French factors and the Carhart momentum factor come from http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/Data_Library/f-f_factors.html and http://ycharts.com/companies/„companySymbol”/price_to_book_value

Table 5: Raw returns are the abnormal returns without risk-adjustment.
* Significant at the 5% level

After risk-adjustment, the P portfolio performs worse than the negative one. With these data, Jensen’s alpha shows that the P portfolio outperforms the market by 0.000004. However, the data do not provide enough statistical evidence to accept the null hypothesis.

Contrary to the expectation, this thesis does not find a relationship between sentiment and stock prices such as that reported by Schumaker (2012) and Li (2006). Unlike the findings of Sprenger (2010), the messages in this sample appear to contain no new information: their content is already reflected in market prices.

One explanation for this result is that the sentiment measure includes messages of low interest. Although specific firm-related keywords are used to filter the messages, there may be some noise from unrelated messages. Manually going through almost 250,000 tweets and assigning a sentiment value without the use of a script or program is nearly impossible.

Another explanation is that the firms differ in popularity. Companies such as KPN and Ziggo are very active on Twitter and frequently interact with their customers. This popularity also has a downside, because users can tweet, for example, about the loss of connection for their mobile services. This might create negative sentiment, while the effect on the stock price is uncertain.

V Conclusion

Prior research suggests that Twitter holds valuable information for sentiment analysis and for various predictive measurements. This thesis tries to find a relationship between stock price movement and sentiment, but does not find one. There is little to no evidence that sentiment predicts future stock returns. This is in line with the EMH: the market reacts instantaneously to any given news, and investors cannot earn excess profits from trading strategies based on publicly available information (Fama, 1970, 1990).

Table 5, data for the positive (P), neutral (NT) and negative (N) portfolios:

| Portfolio | Raw returns (%) | Alpha | Rm-Rf | HML | SMB | WML | R-squared (%) |
|---|---|---|---|---|---|---|---|
| P | 0.1843 | -0.0005* (0.0101) | 16.1211 (1.4115) | 0.0017* (0.0331) | -0.0090* (0.1719) | -0.0144* (0.2739) | 76.73 |
| NT | 0.0472 | -0.002* (0.0036) | 3.6628 (0.5091) | 0.0137* (0.0119) | -0.0710* (0.0620) | -0.1131* (0.0988) | 58.02 |
| N | 0.0925 | 0.0052* (0.0073) | 4.9895 (1.0160) | -0.0092* (0.0238) | 0.0476* (0.1238) | 0.0758* (0.1972) | 37.63 |

However, this thesis does find a relationship between message volume and stock prices. The findings of Tomkins (2005) also show this relation between volume and stock prices. But transaction costs make it hard to exploit these market inefficiencies (Sprenger, 2010).

For future research, this thesis suggests adding more explanatory variables such as retweets, followers and trending topics. These variables are not present in the data sample from Twiqs, and collecting them through Twiqs is a very time-consuming task. Adding these variables may increase the quality of the regression.

Prior research is more successful in finding a sentiment relation when examining sales or popularity, for example of a book or movie, than when predicting stock price movements. The quality of the chatter is an important factor here. Zhang (2011) suggests that the chatter is more likely to characterize the behavior of naive individual investors than that of institutional investors.

Twitter continues to grow in popularity and the way people use it might change. With adjustments, it might be possible to find a relation between stock price movement and sentiment in the future.

References

Antweiler, W., & Frank, M. (2004). Is all the talk just noise? The information content of internet stock message boards. Journal of Finance, 59, 1259-1293.

Asur, S., & Huberman, B.A. (2010). Predicting the future with social media. HP Laboratories.

Bloomberg (2010). Hedge fund will track Twitter to predict stock moves. Bloomberg, online edition, December 22.

Bollen, J. (2010). Twitter mood as a stock market predictor. IEEE, 44(10), 91-94.

Carhart, M. (1997). On persistence in mutual fund performance. Journal of Finance, LII, 57-82.

Chan, W.S. (2003). Stock price reaction to news and no-news: drift and reversal after headlines. Journal of Financial Economics, 70, 223-260.

Chang, A., & Dhar, V. (2009). Does chatter matter? The impact of user-generated content on music sales. Journal of Interactive Marketing, 23, 300-307.

Chen, H., et al. (2012). Customers as advisors: the role of social media in financial markets. SSRN, 1-46.

Chen et al. (2014). The effect of news and public mood on stock movements. Information Sciences, 278, 826-840.

Cutler et al. (1989). What moves stock prices? Journal of Portfolio Management, 15, 56-63.

Da, Z., Engelberg, J., & Gao, P. (2011). In search of attention. Journal of Finance, 66(5), 1461-1499.

DeLong et al. (1990). Noise trader risk in financial markets. Journal of Political Economy, 98, 703-738.

Engelberg et al. (2012). How are shorts informed? Short sellers, news, and information processing. Journal of Financial Economics, 105, 260-278.

Fama, E. (1964). The behavior of stock market prices. Technical report, Graduate School of Business, University of Chicago.

Fama, E., & French, K. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, XXXIII, 3-53.

Fama, E., & French, K. (1997). Industry costs of equity. Journal of Financial Economics, XLIII, 153-194.

Fama, E.F., & MacBeth, J. (1973). Risk, return and equilibrium: empirical tests. Journal of Political Economy, LXXXI, 607-636.

Gidófalvi, G. (2001). Using news articles to predict stock price movements. Department of Computer Science and Engineering, University of California, San Diego.

Gompers, P., & Metrick, A. (2003). Corporate governance and equity prices. Quarterly Journal of Economics, 118(1), 107-155.

Java et al. (2007). Why we Twitter: understanding microblogging usage and communities. Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, 56-65.

Kumar, A., & Sebastian, T.M. (2012). Sentiment analysis on Twitter. Journal of Computer Science Issues, 9(4)-3.

Li, F. (2006). Do stock market investors understand the risk sentiment of corporate annual reports? Working paper.

Luo, X. (2007). Consumer negative voice and firm-idiosyncratic stock returns. Journal of Marketing, 71 (July), 75-88.

Malkiel, B. (2003). The Efficient Market Hypothesis and its critics. Journal of Economic Perspectives, 17(1), 59-82.

McAllister, L. (2011). The relationship between online chatter and firm value. Marketing Letters, 23(1), 1-11.

Nofsinger, R. (2005). Social mood and financial economics. Journal of Behavioral Finance, 6(3), 144-160.

Sang, E., & van de Bosch, A. (2014). Dealing with big data: the case of Twitter. Computational Linguistics in the Netherlands Journal, volume 3.

Schumaker, R., & Chen, H. (2009). Textual analysis of stock market prediction using breaking financial news: the AZFin text system. ACM Transactions on Information Systems, 27(2), article 12.

Schumaker et al. (2012). Evaluating sentiment in financial news articles. Decision Support Systems, 53, 458-464.

Sprenger, T., & Welpe, I. (2010). Tweets and trades: the information content of stock messages. http://ssrn.com/abstract=1702854

Tetlock, P.C. (2007). Giving content to investor sentiment: the role of media in the stock market. Journal of Finance, 62, 1139-1168.

Tetlock, P.C., & Macskassy, S. (2008). More than words: quantifying language to measure firms’ fundamentals. Journal of Finance, 63(3), 1437-1467.

Tomkins et al. (2005). The predictive power of online chatter. KDD, 21-24.

Tumarkin, R., & Whitelaw, R. (2001). News or noise? Internet postings and stock prices. Financial Analysts Journal, 57(3), 41-51.

Varian, H., & Choi, H. (2012). Predicting the present with Google Trends. The Economic Record, 88, 2-9.

Vega, C. (2006). Stock price reaction to public and private information. Journal of Financial Economics, 82, 103-133.

Wintoki, M., Zhang, Z., & Joseph, K. (2011). Forecasting abnormal stock returns and trading volume using investor sentiment: evidence from online search. International Journal of Forecasting, 27, 1116-1127.

Yu et al. (2013). The impact of social media and conventional media on firm equity value: a sentiment analysis approach. Decision Support Systems, forthcoming.

Zhang, X., Gloor, P., & Fuehres, H. (2011). Predicting stock market indicators through Twitter: “I hope it is not as bad as I fear”. Procedia - Social and Behavioral Sciences, 26, 55-62.

Appendix A1

Sentiment scores for the AEX-listed companies on trading days, part 1

A1, part 2

B1

Summary statistics of all companies

    Variable     |  Obs        Mean   Std. Dev.    Min    Max
    -------------+--------------------------------------------
    Aegon        |   43    4.348837    18.47764    -39     51
    Ahold        |   43    .4651163    24.76302   -100     50
    AirFranceKLM |   43    8.651163     12.0631    -33     29
    AkzoNobel    |   43    4.767442    18.78555    -67     64
    Gemalto      |   43   -.3023256    18.28548    -50    100
    -------------+--------------------------------------------
    ArcelorMit~l |   43     5.55814    19.07667    -67     64
    ASML         |   43     7.27907    11.18103    -11     43
    Imtech       |   43    4.604651    16.13653    -35     44
    Corio        |   43    -2.55814    11.63595    -33     25
    DSM          |   43    7.744186    13.36679    -21     50
    -------------+--------------------------------------------
    Fugro        |   43    3.860465    28.99309   -100    100
    Heineken     |   43    10.34884    8.061914    -10     34
    IngGroup     |   43    .0465116    4.700322     -9     12
    Kpn          |   43   -6.604651    7.339195    -23     13
    Philips      |   43    12.27907    12.68544    -13     58
    -------------+--------------------------------------------
    PostNL       |   43          -6    4.735881    -16      7
    RandstadHo~g |   43    8.906977    4.560775      2     20
    ReedElsevier |   43    -.627907    5.627413    -13     12
    RoyalDutch~l |   43    1.883721    8.350105    -20     20
    SBMOffshore  |   43     2.27907     28.3266   -100    100
    -------------+--------------------------------------------
    TNTexpress   |   43   -2.906977    8.294617    -40     15
    UnibailRod~o |   43    .2790698    12.68731    -56     33
    Unilever     |   43    2.813953    16.72075    -53     32
    WoltersKlu~r |   43    8.883721    13.74463    -14     42
    Ziggo        |   43   -.8139535    5.933183    -13     15

B2

| Portfolio | Company name |
|---|---|
| P | Philips, Heineken, Randstad, WoltersKluwer, AirFranceKLM, DSM, ASML, Arcelor, Akzo, Imtech, Aegon, Fugro, Unilever, Shell, SBM |
| NT | Ahold, ING, UnibailRodamco, Gemalto |
| N | ReedElsevier, Ziggo, TNTExpress, |

C1: Percentage changes in stock prices

C2
