Effect of sentiment in news on the Swedish stock market

Academic year: 2021

BA Thesis, Economics and finance

By Daan Uittenhout 11057777

Statement of Originality

This document is written by Student Daan Uittenhout who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Abstract

If news articles have an effect on the market, the way in which they are structured, as well as the words chosen, are powerful tools to influence markets. This article examines the effect of Reuters’ financial news on the Swedish stock market between 2012 and 2015. The research makes use of a sentiment analysis to quantify the effects of the news. The results show evidence that there is a small but significant effect of news articles on the Swedish market. The conclusion is therefore that news does influence the market, but which elements of the news affect the market is still unclear.

Contents

Abstract
Introduction
Literature review
Methodology
Data
Textual data
Algorithm
Sentiment analysis
Processed data
Stock market data
Control variables data
Variables
Explanatory variables
Control variables
Dependent variables
Regressions
Statistics
Results
Conclusion
Bibliography
Appendix

Introduction

News articles can influence their readers’ emotions positively, negatively or not at all (Lin, Yang, & Chen, 2007). The news articles of Reuters’ financial section rank among the most frequently read articles in the world; therefore, the way in which these news articles are formulated and structured could affect the sentiments of a significant number of readers. These readers include private investors, employees of investment funds and other institutions that influence the markets. The price, turnover and spread of market indexes are partly determined by the behaviour of their actors. These actors make decisions on the basis of their knowledge, rationality or sentiment. The sentiment of these actors could be influenced, among other things, by the news articles they read. In this thesis the effect of these news articles on market actors will be examined by researching the relationship between the news and the market. To quantify the sentiment found in news articles, a Natural Language Processing (NLP) algorithm is used; this algorithm performs an analysis of the sentiments found in texts and is therefore called a sentiment analysis.

If a relationship between news articles and markets can be detected, this could be used in forecasting models, to predict prices, liquidity or other factors. It could also imply that a critical review of the vocabulary and textual structure of news articles is desirable, in order to prevent a negative impact of news articles on the market.

Prior studies have shown that there is a relation between firm specific news and those firms’ returns (Heston & Sinha, 2017). However, prior research on news articles and feeds mainly focused on domestic news in native languages. According to Seargeant and Swann (2013), the number of non-native English speakers in the world is growing rapidly. Newman, Fletcher, Kalogeropoulos, Levy, and Nielsen (2017) showed that Reuters’ news articles are read across the globe, through a range of news channels. Therefore, this research will focus on the question of how worldwide financial news, written in English, affects the Swedish stock market index. Sweden ranked first among the non-native English speaking countries during the researched period (footnote 1); therefore, if these English news articles affect markets where English is not the native language, the Swedish market is an appropriate market to research.

The research question of this thesis therefore is: does the sentiment in news feeds derived from Reuters’ financial news affect the index price or the turnover of the Swedish stock exchange?

The stock market data analysed in this research incorporates the prices, spreads and turnovers of all trading days in Sweden from 2012 until 2017. The news data analysed consists of all financial news feeds published by Reuters in the analysed period, approximately seven million news feeds. An algorithm that was specifically written for this database performs a sentiment analysis on the news data. The results of the sentiment analysis are statistically tested against the stock market data. The stock market data is the index of the thirty largest firms in Sweden, which, taken together, give a good representation of the Swedish stock exchange. The news data is obtained from the largest news agency in the world, Thomson Reuters; for this research the news from the financial section is used.

A sentiment analysis is performed to quantify the sentiment in news feeds. The resulting data is examined for irregularities and trends. Based on this examination, the data is tested for shocks and changes.

The null hypothesis will be that the news feeds have no effect on the Swedish stock market index, and logically the alternative hypothesis will be that the news feeds do have an effect on the stock market. This is tested with a multiple regression model that is estimated using ordinary least squares (OLS).

The news variable will be measured in average polarity, total polarity and log polarity (footnote 2). The log polarity is included in order to examine whether a relative change in the news sentiment changes the market. The market will be measured in price and turnover (liquidity). The change in price and turnover is measured, as well as the relative change.

The control variables for this regression are independent variables that do not correlate with the explanatory variables, or other control variables, to avoid multicollinearity. However, they do correlate with the dependent variables, to control for omitted variable bias. The chosen control variables consist of the treasury inflation protected securities (TIPS), the gross domestic product, expected inflation rate, the London interbank offered rate (Libor) and the exchange rate between the US dollar and the Swedish Krona.

The research in this thesis shows that there is a relation between market index prices and news, and between market turnover and news. The index price was significantly affected by the amount of news with a sentiment. The turnover was affected by the amount of news and more significantly affected by the sentiment of this news.

The implications of this research are that stock markets are affected by news in two ways. The amount of news with a sentiment leads to higher turnovers and higher prices. The sentiment in news also leads to higher turnover. This implies that the structure and vocabulary of news influence markets; therefore, authors should pick their words carefully.

Literature review

As has been seen in the introduction, prior research has dealt with aspects of the relation between news and the market. In this chapter a number of studies that describe this previous research will be discussed, as a basis for the method and focus of the research in this thesis.

The commonly applied tool to quantify news is Natural Language Processing (NLP), which can be done in many ways. The technique used in most of the articles is a sentiment analysis (Xing, Cambria, & Welsch, 2018). The research described in this literature review leads to contradicting conclusions. These contradictions are due to the fact that the factors that influence stock prices could not all be controlled. The research was conducted in various markets and in different periods, with different NLP techniques. These differences resulted in different conclusions, because the markets are different and, furthermore, the way in which the news is formulated, structured and distributed differs between studies.

In the studies discussed, news is of a textual form, i.e. articles or feeds. Depending on the words used, this news can contain sentiment, which is measured by polarity (Nasukawa & Yi, 2003). This polarity measurement is based on how positively or negatively words are interpreted (Hu, Duan, Chen, Pei, & Lu, 2005), i.e. the sentimental value of words. This measurement lies between minus one and one: if a news article is positive it has a polarity that tends to one, while the polarity of a negative news article tends to minus one.

One assumption that underlies research on sentiment is that people, including investors, have a bounded rationality; this was argued by Simon (1955), for which he received the Sveriges Riksbank Prize. This implies that people are not completely rational and could be influenced by their sentiment (Kahneman, 2003). One of the channels through which their sentiment could be affected is the news they read (Lin et al., 2007). Therefore, the trading decisions of investors could be affected indirectly, through their sentiment, by the news they read.

The trading decisions that investors make affect the supply and demand for stocks, and therefore the whole market (Marshall, 2005). The economic mechanism based on the literature discussed above, is shown in the chart below (Figure 1).

Figure 1 The indirect effect of news on the market

That sentiment in financial news has an impact on investors’ sentiment is argued by Tetlock (2007). Namouri, Jawadi, Ftiti, and Hachicha (2018) state that investor sentiment is a crucial determinant of stock market returns. Therefore, it is assumed in this thesis that a relation between the news and stock markets exists. The question of which market parameters are influenced by which news factors has also been studied in the existing literature. The results of these studies are used to determine which effects should be researched in this thesis.

First of all, Tetlock (2011) found that positive news, i.e. news with a high polarity, has a negative relation with total market returns. The effect that news with a high value of absolute polarity changes the value of the stock market as a whole was also found by Baker and Wurgler (2007). This would imply a negative relationship between news and the stock market.

A second effect that is mentioned in the literature is the effect of firm specific news on the stock prices of those firms. Boudoukh, Feldman, Kogan, and Richardson (2013) showed that relevant news has a larger effect than irrelevant news. Heston and Sinha (2017) found that news in general affects stock prices, even if it is neutral or close to neutral news. The effect of firm specific news on the stocks of these firms is larger if the firm has low transparency (Baker & Wurgler, 2007). Heston and Sinha (2017) also concluded that there is a positive effect after positive firm specific news. On the other hand, Ahmad, Han, Hutson, Kearney, and Liu (2016) found that negative firm specific news negatively affects stock prices the next trading day. These arguments concerning firm specific news imply a positive relation between news and stock prices.

The discussed effects of firm specific news on their stocks on the one hand, and on the stock market as a whole on the other hand are, therefore, at variance according to the existing literature. On the basis of these contradictions one of the effects that will be researched in this thesis is the effect of news on the stock market price, without a priori assuming a directional relationship. For this purpose, news will be measured in both polarity and amount, and the stock market will be measured in total returns.

A third effect is the effect of news on the market turnover. Turnover or liquidity is, according to Liu (2015), correlated with investors’ sentiment. Liu (2015) found that if investor sentiment increases, the market turnover is higher. According to Baker and Stein (2004), market liquidity could be used as an indicator of investor sentiment. In a later paper Baker and Wurgler (2007) state that the origin of this sentiment is hard to determine, because there are many factors affecting this sentiment and its level is hard to measure. However, they do assume a relation between the factors influencing investor sentiment and market turnover. On the basis of this literature the effect of news on the stock market turnover will be researched in this thesis. One a priori assumption, based on the existing literature, will be that a high absolute polarity and a large amount of news have a positive relation with market turnover.

The last effect to be discussed here is how the effect of news on the prices of the stock market develops over time. The discussion in the literature focuses on the question of whether or not the news effect is incorporated in the price over time. According to Li, Xie, Chen, Wang, and Deng (2014), sentiment analysis does improve the predictability of stock prices. However, the effect of sentiment in the news diminishes over time; consequently, a day with a high polarity has a stronger effect on the market in the first days than later on (Zhang & Skiena, 2010). This argument supports the efficient market hypothesis, which predicts that when information becomes more available, prices adjust accordingly. Namouri et al. (2018) argue the opposite, viz. that the efficient market hypothesis does not hold; they claim that there is a continuous mispricing in the market because not all information is reflected in the price, so the market could be beaten by exploiting this mispricing.

This discussion shows that it is not clear whether the effect of the news is part of a continuous mispricing or of a temporary mispricing which will be corrected by the market. Therefore, only the effect of the news of the previous trading day will be researched in this thesis.

As will be discussed in the methodology chapter, the research will also contain a number of control variables. The first control variable will be treasury inflation protected securities (TIPS). This control variable is selected because it was shown by Wasserfallen (1989) that TIPS represent a rate that moves in accordance with the risk-free rate and that they correlate with the stock market, while they are controlled for inflation. The second control variable is expected inflation: Schmeling and Schrimpf (2011) state that expected inflation influences the stock market. They argue that an increase in expected inflation results in higher stock prices. The conditional relationship between inflation and the stock market is, according to Schmeling and Schrimpf (2011), bidirectional; however, Fama (1983) argues that the causality is unidirectional. Based on the literature an effect can be assumed, while the direction of the effect is still being discussed. In spite of this discussion, the expected inflation will be used to control the regression.

The third control variable is the Gross Domestic Product (GDP). Changes in the GDP indicate whether the economy of a country is growing or shrinking (Marshall, 2005). If the GDP increases, investment increases, as it is a function of the GDP; the demand for stocks then rises, and the price will consequently increase as well (Levine & Zervos, 1999). The existence of the positive effect of GDP on the stock market can be assumed on the basis of the literature, and GDP can therefore be used to control the effect of the news on the stock market price.

The fourth control variable for the stock market price and turnover is the exchange rate between the Swedish Krona and the US Dollar. The choice of this control variable is based on a study by Bahmani-Oskooee and Sohrabian (1992), who argue that there is a relationship between the exchange rate and the stock price.

The last control variable chosen is the London interbank offered rate (Libor), which is commonly seen as a benchmark for interest rates (Filipović & Trolle, 2013), and therefore provides an indication of a minimum return.

On the basis of the prior research discussed in this chapter it can be concluded that the sentiment in news articles does have an effect on (stock) market prices and liquidity, but the direction and magnitude of the effect differ between studies. The prior research mainly focused on the effect of domestic news on individual stocks, while the effect of international news on stock markets as a whole has not yet been researched extensively. On the basis of, and in addition to, the existing research, this thesis will focus on the effect of Reuters’ global financial news, broadly read in Sweden, on the Swedish stock market as a whole, measured in prices and turnovers, making use of the factors and control variables discussed in these previous studies. In the next chapter the methodological aspects of these effects will be explained.

Methodology

Data

Textual data

The textual data is the data containing the news feeds and the date and time when they were published. This data is mined from the financial section of Reuters’ website (footnote 3).

Algorithm

To process the textual data in the form of news feeds into quantitative data, a Natural Language Processing algorithm was constructed. This algorithm was programmed in Python and analyses sentimental words to quantify the textual data. The algorithm also counts the number of feeds containing words with sentiment.

Sentiment analysis

The analysis executed by the algorithm is called a sentiment analysis. This analysis splits a sentence up into words and assigns a value between minus one and one to each word. These values are called polarities: a negative polarity means that the analysed word has a negative meaning, and a positive polarity means the opposite. If a word is neutral, its polarity is zero, meaning that it does not affect the total polarity of the sentence. These polarity values are based on lexical databases, in which linguists have assigned values to words (Fellbaum, 1998). The polarity of a sentence is the weighted average of the polarities of all its words with a polarity that differs from zero. The process described in the flowchart (Figure 2) is repeated for all the news feeds in the period 2012-2017.
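The scoring step described above can be illustrated with a minimal sketch. It is not the algorithm written for this thesis: the lexicon entries, the equal weighting of words and the names `TOY_LEXICON` and `sentence_polarity` are assumptions made purely for illustration.

```python
import re

# Hypothetical toy lexicon: word -> polarity in [-1, 1]. The thesis relies on
# lexical databases; these four entries are invented for illustration only.
TOY_LEXICON = {
    "gain": 0.6,
    "strong": 0.5,
    "loss": -0.7,
    "weak": -0.4,
}


def sentence_polarity(sentence, lexicon=TOY_LEXICON):
    """Average polarity of the words whose polarity differs from zero.

    Equal weights are an assumption; the thesis does not state its weights.
    Returns 0.0 when no word in the sentence carries sentiment.
    """
    words = re.findall(r"[a-z']+", sentence.lower())
    scores = [lexicon.get(word, 0.0) for word in words]
    nonzero = [score for score in scores if score != 0.0]
    if not nonzero:
        return 0.0
    return sum(nonzero) / len(nonzero)


# Invented example feeds, one positive, one negative and one neutral.
feeds = [
    "Strong quarter brings gain for Swedish banks",
    "Weak krona deepens loss",
    "Stockholm exchange opens on Monday",
]
polarities = [sentence_polarity(feed) for feed in feeds]
```

Neutral words are dropped before averaging, so a long neutral sentence with a single sentimental word receives the polarity of that word; whether this matches the weighting actually used cannot be determined from the text.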

Processed data

The data created by the algorithm consists of the following:

• the daily average polarity;

• the daily total polarity;

• the daily number of news feeds with a polarity different from zero.

The processed data is examined graphically and is statistically tested for irregularities, such as shocks.
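As a hedged illustration of how the three daily series could be derived from scored feeds, the sketch below groups (day, polarity) pairs per day. The function name `daily_aggregates`, the sample values and the choice to average only over feeds with a polarity different from zero are assumptions, not details taken from the thesis.

```python
from collections import defaultdict
from datetime import date


def daily_aggregates(scored_feeds):
    """Group (day, polarity) pairs into per-day sentiment measures.

    Only feeds with a polarity different from zero are used (an assumption).
    Returns {day: (average polarity, total polarity, number of feeds)}.
    """
    by_day = defaultdict(list)
    for day, polarity in scored_feeds:
        if polarity != 0.0:
            by_day[day].append(polarity)
    result = {}
    for day in sorted(by_day):
        values = by_day[day]
        total = sum(values)
        result[day] = (total / len(values), total, len(values))
    return result


# Invented sample: three feeds on 2 January 2012, one on 3 January 2012.
sample = [
    (date(2012, 1, 2), 0.5),
    (date(2012, 1, 2), -0.25),
    (date(2012, 1, 2), 0.0),
    (date(2012, 1, 3), 0.25),
]
aggregates = daily_aggregates(sample)
```

Days on which no feed carries sentiment do not appear in the result; how such days were treated in the actual data set is not stated.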

Graph 1 average daily polarity plotted over time

The daily average polarity was plotted over time (Graph 1), and a drop in the polarity at the beginning of 2016 can be observed. To test whether the drop is statistically significant, a t-test was performed. This t-test assumes that the variances do not differ, because the data before and after January 1, 2016 are from the same source and were analysed in the same way. The null hypothesis is that the mean of the daily average polarity before January 1, 2016 does not differ from the mean after January 1, 2016. The alternative hypothesis is that the two means do differ.

The difference in the sample means differs significantly from zero, t(1502) = 60.21, p < .001 (see appendix, Table 4). Consequently, the daily average polarity data is divided into two periods.
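The test statistic of a two-sample t-test assuming equal variances can be sketched as follows. The function name `pooled_t_test` and the two short samples are hypothetical; with the thesis data the degrees of freedom equal the two sample sizes combined minus two, which is the 1502 reported above.

```python
import math
from statistics import mean, variance


def pooled_t_test(sample_a, sample_b):
    """Two-sample t statistic assuming equal variances.

    Returns (t statistic, degrees of freedom).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(sample_a) - mean(sample_b)) / standard_error
    return t_stat, df


# Invented daily average polarities before and after a break date.
before = [0.12, 0.14, 0.13, 0.15]
after = [0.05, 0.06, 0.04, 0.07]
t_stat, df = pooled_t_test(before, after)
```

A large positive statistic indicates that the mean of the first sample is higher than that of the second; the p-value follows from the t distribution with `df` degrees of freedom.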

Graph 2 total daily polarity plotted over time

[Graphs 1 and 2: the horizontal axes run from 02/01/2012 to 02/01/2017; the vertical axis of Graph 1, average daily polarity, runs from 0.00 to 0.20, and the vertical axis of Graph 2 runs from 0 to 350]

The daily total polarity was also plotted over time (Graph 2), and a drop in the polarity at the beginning of 2016 is also observed in this data. To test whether this drop is statistically significant, a t-test is performed. This t-test assumes that the variances do not differ, because the data is from the same source and was analysed in the same way. The null hypothesis is that the mean of the daily total polarity before January 1, 2016 does not differ from the mean after January 1, 2016. The alternative hypothesis is that the two means do differ.

The difference in the sample means significantly differs from zero, t(1502) = 34.14, p < .001 (see appendix, Table 5), which leads to the conclusion that the daily total polarity data will also be divided up in two periods.

Graph 3 news feeds with sentiment plotted over time

To analyse the decrease in the total polarity, the total amount of news feeds with a polarity different from zero was tested (Graph 3), in order to see whether the decrease is the result of fewer news articles with a polarity different from zero. The null hypothesis is that the mean of the amount of news articles analysed before January 1, 2016 does not differ from the mean of the amount of news articles after January 1, 2016. The alternative hypothesis is that the two means do differ.

The difference in the sample means significantly differs from zero, t(1502) = 16.59, p < .001 (see appendix, Table 6), which leads to the conclusion that the news feed data will also be divided up in two periods.

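The decrease in the total polarity can thus partly be attributed to a smaller number of news feeds with sentiment. The decrease in the daily average polarity, which does not depend on the number of feeds, is logically explained by the fact that the news became more negative. The implication of these results is that only the data of the first period will be analysed, as this period contains more data points.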

Stock market data

The data of the Swedish market is gathered from the official stock exchange website of Nasdaq Nordic (footnote 4). The data retrieved is from the OMX Stockholm 30 Index, which contains the stocks of the thirty largest companies listed on the Stockholm exchange (see appendix, Graph 4). No irregularities were observed graphically in this data, and it is therefore not tested.

4 Source: http://www.nasdaqomxnordic.com/index/index_info?Instrument=SE0000337842 (16/06/2018)

Control variables data

The data for the control variables is retrieved from multiple sources. TIPS and expected inflation are retrieved from the Federal Reserve Bank of St. Louis (footnote 5). The data on the Gross Domestic Product (GDP) and the exchange rates are collected from the World Bank's data bank (footnote 6).

Variables

Explanatory variables

The explanatory variables, or regressors, are based on the polarity and the amount of news feeds. The variables based on polarity are the daily average polarity and the daily total polarity. To take the incubation time of the news into account, the explanatory variables are based on the news of the previous trading day, denoted as (t-1). To examine the relative change, the natural logarithm of the polarity is calculated. The variables based on the amount of news feeds with a polarity different from zero are the amount of news of the previous trading day (t-1) and the relative change of the amount of news. The variables eventually used are thus based on the average daily polarity, the total daily polarity and the daily amount of news feeds with a polarity different from zero.
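The construction of a previous-day regressor and its natural logarithm can be sketched as below. The function name `lagged_regressors`, the dictionary keys and the sample values are hypothetical, and the handling of non-positive values (for which the logarithm is undefined) is an assumption, since the thesis does not describe it.

```python
import math


def lagged_regressors(series):
    """Pair each trading day t with the value of day t-1 and its logarithm.

    `series` is a list of daily values in chronological order. The first day
    has no predecessor and is skipped. The logarithm is set to None when the
    lagged value is not positive.
    """
    rows = []
    for t in range(1, len(series)):
        previous = series[t - 1]
        log_previous = math.log(previous) if previous > 0 else None
        rows.append({"t": t, "lag": previous, "log_lag": log_previous})
    return rows


# Invented daily average polarities for four consecutive trading days.
avg_pol = [0.10, 0.12, 0.08, 0.11]
rows = lagged_regressors(avg_pol)
```

In a regression on the logarithm of the lagged value, a coefficient measures the response to a relative rather than an absolute change in polarity, which is the purpose of the log transformation described above.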

Control variables

TIPS are controlled for inflation, which has two advantages: the first is that multicollinearity between TIPS and the exchange rate is avoided, and the second is that expected inflation can also be used as a control variable. Inflation is an effective control variable, because there is no statistical evidence in prior research that inflation correlates with sentiment. Based on the literature, there is no reason to assume that the exchange rate correlates with sentiment; therefore, the exchange rate is a logical control variable. The Libor is a rate determined by banks, i.e. it is independent of investor sentiment, and is therefore an effective control variable. According to the literature, multicollinearity is to be expected between the exchange rate and the inflation control variables.

The control variables (i.e. TIPS, expected inflation, exchange rates, Libor and GDP) are all used in the regressions with their real, unmanipulated values. In order to check whether the economic reasoning holds, the regressors will be tested for multicollinearity. A rule of thumb is used here, which states that if the variance inflation factor (VIF) is above 5, multicollinearity could be present.
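The variance inflation factor of a regressor equals 1 / (1 - R²), where R² comes from regressing that regressor on all other regressors. The following is a minimal sketch in plain Python; the helper names (`solve`, `r_squared`, `variance_inflation_factors`) and the example columns are assumptions for illustration and do not reproduce the software used for the thesis.

```python
def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def r_squared(y, columns):
    """R-squared of an OLS regression of y on the columns plus a constant."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(design[i][a] * design[i][b] for i in range(n))
            for b in range(k)] for a in range(k)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in design]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def variance_inflation_factors(columns):
    """VIF of each column: 1 / (1 - R^2) from regressing it on the others."""
    vifs = []
    for j, target in enumerate(columns):
        others = [col for i, col in enumerate(columns) if i != j]
        vifs.append(1.0 / (1.0 - r_squared(target, others)))
    return vifs


# Invented regressors: the second column nearly doubles the first one.
correlated = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 11]]
flagged = [vif > 5 for vif in variance_inflation_factors(correlated)]
```

Under the rule of thumb above, both invented columns would be flagged. The sketch fails with a division by zero when one regressor is an exact linear combination of the others, which is the limiting case of multicollinearity.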

Dependent variables

In order to measure the stock market, the price of its index will be the dependent variable. The index price incorporates all the stocks traded on the exchange. To see whether the number of trades and the liquidity of the exchange change, the market turnover also functions as a dependent variable to measure the effect on the stock market. The dependent variables will be measured in relative changes to see the effect of the regressors. This makes the results comparable with research done on other markets.

Regressions

The first model explains the effect of polarity on the index price:

$$\ln(\text{IndexPrice}) = \beta_0 + \beta_1 \ln(\text{AvgPol}_{t-1}) + \beta_2 \text{TIPS} + \beta_3 \text{Libor} + \beta_4 \text{ExchangeRate} + \varepsilon$$

The second model also explains the effect of the polarity on the price; however, the expected inflation, $E(\pi)$, is added in order to analyse whether more of the variance is explained and whether the expected multicollinearity occurs:

$$\ln(\text{IndexPrice}) = \beta_0 + \beta_1 \ln(\text{AvgPol}_{t-1}) + \beta_2 \text{TIPS} + \beta_3 \text{Libor} + \beta_4 \text{ExchangeRate} + \beta_5 E(\pi) + \varepsilon$$

The third model explains the effect of the amount of news articles with a polarity different from zero on the index price:

$$\ln(\text{IndexPrice}) = \beta_0 + \beta_1 \text{NewsAmount}_{t-1} + \beta_2 \text{TIPS} + \beta_3 \text{Libor} + \beta_4 \text{ExchangeRate} + \varepsilon$$

The fourth and fifth models have the change in turnover as the dependent variable, which is regressed on, respectively, the total polarity of the previous day and the amount of news articles of the previous day (the control variables of the fourth model are those reported in Table 3):

$$\Delta\text{Turnover} = \beta_0 + \beta_1 \text{TotPol}_{t-1} + \beta_2 \text{GDP} + \beta_3 E(\pi) + \varepsilon$$

$$\Delta\text{Turnover} = \beta_0 + \beta_1 \text{NewsAmount}_{t-1} + \beta_2 \text{GDP} + \beta_3 E(\pi) + \varepsilon$$

In each model $\beta_0$ is the constant reported in the regression tables and $\varepsilon$ is the error term.

5 Source: https://fred.stlouisfed.org/series/T5YIFR (16/06/2018)

Figure 3 explains the relation between the variables.

Figure 3 indirect effect of news on the stock market

Based on the existing literature and the assumed economic mechanisms, the first hypothesis will be that polarity has an effect on the stock prices, and therefore on the market as a whole, and that it will consequently affect the index price. The direction of the relation is hard to predict on the basis of the existing, contradicting literature, and therefore the alternative hypothesis will be that the regressor coefficient differs from zero.

The second hypothesis is that polarity affects the turnover in the stock market. On the basis of the literature the expected effect is that a high polarity leads to a high turnover. The alternative hypothesis will be that the regressor coefficient is larger than zero.

The third hypothesis takes the amount of news with a polarity different from zero as the regressor. The hypothesis will be that this news affects the stock market price. The direction of the effect is difficult to predict, so the alternative hypothesis will be that the regressor coefficient differs from zero.

The last hypothesis will be that the amount of news with a polarity different from zero has a positive relation with the turnover, so the alternative hypothesis will be that the regressor coefficient is larger than zero.

The formal notation of these four hypotheses is given in the table below.

Table 1 Hypotheses

Hypothesis    H0       H1
First         β = 0    β ≠ 0
Second        β = 0    β > 0
Third         β = 0    β ≠ 0
Fourth        β = 0    β > 0

Statistics

Table 2 Summary of results of regression models 1, 2 and 3

                      Model 1       Model 2       Model 3
                      b/se          b/se          b/se
AvgPol                0.021         0.021*
                      (0.01)        (0.01)
ExchangeRate          0.099***      0.138***      0.098***
                      (0.00)        (0.00)        (0.00)
TIPS                  0.080***      0.112***      0.082***
                      (0.00)        (0.01)        (0.00)
Libor                 -0.166***     -0.149***     -0.167***
                      (0.00)        (0.00)        (0.00)
expected_inflation                  0.137***
                                    (0.02)
NEWS_t-1                                          0.000**
                                                  (0.00)
constant              13.651***     13.005***     13.599***
                      (0.03)        (0.08)        (0.02)
R-sqr                 0.877         0.887         0.878
dfres                 986           985           986
BIC                   -2985.0       -3055.8       -2989.1
* p<0.05, ** p<0.01, *** p<0.001

The first model as a whole is significant at an alpha of 0.05, with a p-value smaller than 0.001 (Table 2). The coefficient of the daily average polarity has a p-value of 0.058, so there is insufficient statistical evidence to assume a relation between the average polarity and the price of the index at an alpha of 0.05. The control variables do have a significant effect on the dependent variable (see appendix, Table 7). The variance inflation factors of all variables are below the value of 5 (see appendix, Table 9); this indicates that there is no reason to assume multicollinearity. The coefficient of the explanatory variable is 0.02096; this implies that if the average polarity increases by one percent, the price of the index would increase by 0.02096 percent (see appendix, Table 8).

The second model shows that polarity has a significant effect on the price of the index at an alpha of 0.05. However, theoretically multicollinearity could occur when both the exchange rate and expected inflation are included in the regression; this is plausible here, since VIF values are larger than 5 (see appendix, Table 11). Adding expected inflation raises the R-squared by 0.01, from 0.877 to 0.887 (Table 2).

The part of the variance of the index price explained by the independent variables is the R-squared. To determine which part of the regression was explained by the polarity, a Shapley value analysis was performed. This resulted in an explanation of 0.676 percent (see appendix, Table 10).
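A Shapley value decomposition attributes to each regressor its average marginal contribution to the R-squared over all orderings in which the regressors can enter the model. The sketch below illustrates the idea in plain Python; it is not the routine used for the thesis, and the function names and the small data set are invented for illustration.

```python
from itertools import permutations


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def r_squared(y, columns):
    """R-squared of an OLS regression of y on the columns plus a constant."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(design[i][a] * design[i][b] for i in range(n))
            for b in range(k)] for a in range(k)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in design]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def shapley_r_squared(y, columns):
    """Average marginal R-squared contribution of each regressor."""
    shares = [0.0] * len(columns)
    orderings = list(permutations(range(len(columns))))
    for order in orderings:
        included, previous = [], 0.0
        for j in order:
            included.append(j)
            current = r_squared(y, [columns[i] for i in included])
            shares[j] += current - previous
            previous = current
    return [share / len(orderings) for share in shares]


# Invented data with two regressors.
y = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8]
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
shares = shapley_r_squared(y, [x1, x2])
```

By construction the shares add up to the R-squared of the full model, so each share can be read as the part of the explained variance attributed to one regressor. The number of orderings grows factorially, which is unproblematic for the five regressors of the second model.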

The third regression model is also significant as a whole at an alpha of 0.05, with a p-value smaller than 0.001. The coefficient of the amount of news has a p-value of 0.006, so there is sufficient statistical evidence to assume a relation between the amount of news feeds and the price of the index at an alpha of 0.05 (see appendix, Table 9).

The coefficient of the explanatory variable is 0.0000133. Since the dependent variable is measured in logarithms, this implies that one extra news feed with a polarity different from zero is associated with an increase in the price of the index of approximately 0.00133 percent, i.e. 100 × 0.0000133 (see appendix, Table 12).
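To make the estimation of the first model concrete, the following minimal sketch fits a regression of a log price on a log regressor with ordinary least squares via the normal equations. The data are synthetic and noise-free, generated with a coefficient of 0.02 chosen only to resemble the order of magnitude reported for the first model; the function names are hypothetical and the control variables are left out for brevity.

```python
import math


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def ols(y, columns):
    """OLS coefficients [constant, b1, b2, ...] for y on the given columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(design[i][a] * design[i][b] for i in range(n))
            for b in range(k)] for a in range(k)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(k)]
    return solve(xtx, xty)


# Synthetic, noise-free data (not thesis data):
# ln(price) = 7 + 0.02 * ln(average polarity of the previous day)
avg_pol_lag = [0.05, 0.08, 0.10, 0.12, 0.15, 0.18]
log_avg_pol = [math.log(value) for value in avg_pol_lag]
log_price = [7.0 + 0.02 * value for value in log_avg_pol]

constant, elasticity = ols(log_price, [log_avg_pol])
```

Because both sides are in logarithms, the recovered slope is an elasticity: a one percent increase in the regressor goes together with a change of roughly `elasticity` percent in the price.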

Table 3 summary of results regression model 4 and 5

The fourth regression is to test the effect of polarity on the turnover in the market. The model had a p-value of 0.017(see appendix, Table 13), hence it is assumed to be significant with an alpha of 0.05 (Table 3). The R-Squared is 0.01365, which means that little of the variance of the change in turnover is explained by the variables (see appendix, Table 13). Of the total variance explained (R-squared), the largest part is explained by the total polarity (see appendix, Table 14).

The coefficient of the explanatory variable is 2.59656; this implies that if the total polarity increases by one point, the change of the turnover increases by 2.59656 percentage points (Table 3). The coefficients of the control variables do not significantly differ from zero and no multicollinearity is assumed (see appendix, Table 13).

* p<0.05, ** p<0.01, *** p<0.001 BIC 12707.1 12709.9 dfres 740 740 R-sqr 0.014 0.010 (697.76) (689.49) constant 173.556 191.268 (45.05) GDP -17.713 (0.14) NEWS_t-1 0.360* (260.20) (260.58) expected_inflation -160.532 -171.976 (40.18) control_GDP -18.502 (0.86) TotPol 2.597** b/se b/se Model 4 Model 5

The last regression tests whether the amount of news with a polarity different from zero affects the change in turnover. The p-value of the total regression is 0.0618, so at an alpha of 0.05 there is insufficient statistical evidence that the turnover is affected by the three variables combined (see appendix, Table 15). To test whether the amount of news in itself has an effect on the turnover, a partial F-test was performed, which shows that the amount of news has an effect on the turnover (see appendix, Table 16).
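A partial F-test compares the residual sum of squares of a restricted model (control variables only) with that of the full model (control variables plus the tested variable). The following Python sketch illustrates the calculation on synthetic data. The variable names, the simulated effect size and the helper functions are assumptions made for this example and do not reproduce the thesis data or its software.

```python
import random


def ols_ssr(columns, y):
    """Residual sum of squares of an OLS fit of y on an intercept plus the given columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    # Augmented normal equations [X'X | X'y], solved by Gaussian elimination.
    m = [[sum(r[p] * r[q] for r in design) for q in range(k)]
         + [sum(r[p] * yi for r, yi in zip(design, y))] for p in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * k
    for i in reversed(range(k)):
        beta[i] = (m[i][k] - sum(m[i][c] * beta[c] for c in range(i + 1, k))) / m[i][i]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(design, y))


def partial_f(controls, tested, y):
    """Partial F statistic for adding the `tested` columns to a model with `controls`."""
    n = len(y)
    q = len(tested)                            # number of restrictions tested
    df_resid = n - (1 + len(controls) + q)     # residual df of the full model
    ssr_restricted = ols_ssr(controls, y)
    ssr_full = ols_ssr(controls + tested, y)
    f_stat = ((ssr_restricted - ssr_full) / q) / (ssr_full / df_resid)
    return f_stat, q, df_resid


# Synthetic data (assumption): only the news variable affects the outcome.
random.seed(1)
n = 744
gdp = [random.gauss(0, 1) for _ in range(n)]
inflation = [random.gauss(0, 1) for _ in range(n)]
news = [random.gauss(0, 1) for _ in range(n)]
turnover_change = [0.4 * x + random.gauss(0, 1) for x in news]

f_stat, q, df_resid = partial_f([gdp, inflation], [news], turnover_change)
print(f"partial F({q}, {df_resid}) = {f_stat:.2f}")
```

With one tested variable and 740 residual degrees of freedom, the 5 percent critical value of the F distribution is about 3.85, so a larger statistic indicates that the tested variable adds explanatory power beyond the control variables.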

Results

The first regression shows that a relative change in the daily average polarity of the previous trading day’s news does not affect the price of the index. However, when expected inflation is added to the regression, the effect of average polarity on the market price is significant, but it is accompanied by assumed multicollinearity. The variance of the price explained by the average polarity is relatively low.

The second regression shows that news with polarity moves the market, and it does not matter whether the news is positive or negative.

The third regression shows that a higher total polarity leads to a higher turnover, so if the sentiment increases, the value of trades increases as well.

The last regression model as a whole is not significant at an alpha of 0.05, although the partial F-test indicates that the amount of news with a polarity different from zero has an effect on the turnover by itself.

Conclusion

The results show that the amount of news has a larger effect on the price, and the total polarity has a larger effect on the turnover. This implies that if there are a lot of news articles with sentiment, the price tends to increase, but the variation explained is low. If the total polarity is high, i.e. if there are more positive news articles than negative ones and their polarities add up to a high level, the turnover increases. This would imply that a lot of positive news stimulates investors to trade. The variances explained by the researched explanatory variables are relatively low. This is because an indirect effect was researched, and many other factors could influence the markets as well. The control variables for the change in turnover were not significant and therefore not optimal for this regression; the choice of control variables mainly focused on the price of the index. Because only the average polarity was computed and tested, the effect of negative and positive news could not be distinguished.

The research in this thesis has shown that news with sentiment influences stock markets. Further research is necessary to analyse the effect of the negative polarity and the positive polarity separately. This would give a clearer insight into which type of news affects the market more. Future research could also focus on specific words and analyse how they influence the market. These words could include the names of stocks, in order to research firm-specific news, or the names of countries, in order to analyse country-specific news. Finally, how news affects the volatility of the market is also an interesting topic for future research.

Bibliography

Ahmad, K., Han, J., Hutson, E., Kearney, C., & Liu, S. (2016). Media-expressed negative tone and firm-level stock returns. Journal of Corporate Finance, 37, 152-172.

Bahmani-Oskooee, M., & Sohrabian, A. (1992). Stock prices and the effective exchange rate of the dollar. Applied Economics, 24(4), 459-464. doi:10.1080/00036849200000020

Baker, M., & Stein, J. C. (2004). Market liquidity as a sentiment indicator. Journal of Financial Markets, 7(3), 271-299.

Baker, M., & Wurgler, J. (2007). Investor sentiment in the stock market. Journal of Economic Perspectives, 21(2), 129-152.

Boudoukh, J., Feldman, R., Kogan, S., & Richardson, M. (2013). Which news moves stock prices? a textual analysis. Retrieved from

Fama, E. F. (1983). Stock returns, real activity, inflation, and money: reply. American Economic Review, 73(3), 471-472.

Fellbaum, C. (1998). WordNet: Wiley Online Library.

Filipović, D., & Trolle, A. B. (2013). The term structure of interbank risk. Journal of Financial Economics, 109(3), 707-733.

Heston, S. L., & Sinha, N. R. (2017). News vs. Sentiment: Predicting Stock Returns from News Stories. Financial Analysts Journal, 73(3), 67-83.

Hu, Y., Duan, J., Chen, X., Pei, B., & Lu, R. (2005). A new method for sentiment classification in text retrieval. Paper presented at the International Conference on Natural Language Processing.

Kahneman, D. (2003). Maps of bounded rationality: Psychology for behavioral economics. American Economic Review, 93(5), 1449-1475.

Levine, R., & Zervos, S. (1999). Stock market development and long-run growth: The World Bank.

Li, X., Xie, H., Chen, L., Wang, J., & Deng, X. (2014). News impact on stock price return via sentiment analysis. Knowledge-Based Systems, 69, 14-23. doi:https://doi.org/10.1016/j.knosys.2014.04.022

Lin, K. H.-Y., Yang, C., & Chen, H.-H. (2007). What emotions do news articles trigger in their readers? Paper presented at the Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval.

Liu, S. (2015). Investor Sentiment and Stock Market Liquidity. Journal of Behavioral Finance, 16(1), 51-67. doi:10.1080/15427560.2015.1000334

Marshall, A. (2005). From Principles of Economics. In Readings In The Economics Of The Division Of Labor: The Classical Tradition (pp. 195-215): World Scientific.

Namouri, H., Jawadi, F., Ftiti, Z., & Hachicha, N. (2018). Threshold effect in the relationship between investor sentiment and stock market returns: a PSTR specification. Applied Economics, 50(5), 559-573.

Nasukawa, T., & Yi, J. (2003). Sentiment analysis: Capturing favorability using natural language processing. Paper presented at the Proceedings of the 2nd international conference on Knowledge capture.

Newman, N., Fletcher, R., Kalogeropoulos, A., Levy, D. A., & Nielsen, R. K. (2017). Reuters institute digital news report 2017.

Schmeling, M., & Schrimpf, A. (2011). Expected inflation, expected stock returns, and money illusion: What can we learn from survey expectations? European Economic Review, 55(5), 702-719. doi:https://doi.org/10.1016/j.euroecorev.2010.09.003

Seargeant, P., & Swann, J. (2013). English in the world: history, diversity, change: Routledge.

Simon, H. A. (1955). A behavioral model of rational choice. The Quarterly Journal of Economics, 69(1), 99-118.

Tetlock, P. C. (2007). Giving content to investor sentiment: The role of media in the stock market. The Journal of Finance, 62(3), 1139-1168.

Tetlock, P. C. (2011). All the news that's fit to reprint: Do investors react to stale information? The

Wasserfallen, W. (1989). Macroeconomics news and the stock market: Evidence from Europe. Journal of Banking & Finance, 13(4), 613-626. doi:https://doi.org/10.1016/0378-4266(89)90033-2

Xing, F. Z., Cambria, E., & Welsch, R. E. (2018). Natural language based financial forecasting: a survey. Artificial Intelligence Review, 1-25.

Zhang, W., & Skiena, S. (2010). Trading Strategies to Exploit Blog and News Sentiment. Paper presented at the ICWSM.

Appendix

Graph 4 closing price of the index plotted over time (vertical axis: 0 to 2,000,000; horizontal axis: 02/01/2012 to 02/01/2017)

                                Average Polarity 2012-2015    Average Polarity 2016-2017
Mean                            0.135604753                   0.073296422
Variance                        0.000346746                   0.000383128
Observations                    1000                          504
Pooled Variance                 0.00035893
Hypothesized Mean Difference    0
df                              1502
t Stat                          60.20504019
P(T<=t) one-tail                0
t Critical one-tail             1.64586875
P(T<=t) two-tail                0
t Critical two-tail             1.961544645

Table 4 t-Test: Two-Sample Assuming Equal Variances
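The t statistic in Table 4 can be reproduced from the reported means, variances and numbers of observations. The following Python sketch is an illustration that assumes the standard pooled-variance formula implied by the table title; the function name `pooled_t` is chosen here and does not come from the thesis.

```python
from math import sqrt


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic assuming equal variances, from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t_stat = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return pooled_var, df, t_stat


# Summary statistics as reported in Table 4 (average polarity per period).
pooled_var, df, t_stat = pooled_t(
    0.135604753, 0.000346746, 1000,   # 2012-2015
    0.073296422, 0.000383128, 504,    # 2016-2017
)
print(f"pooled variance = {pooled_var:.8f}, df = {df}, t = {t_stat:.3f}")
```

The result matches the pooled variance of 0.00035893, the 1502 degrees of freedom and the t statistic of about 60.205 shown in the table.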

                                Average Polarity 2012-2015    Average Polarity 2016-2017
Mean                            147.8942                      55.65312
Variance                        3471.631                      396.108
Observations                    1001                          503
Pooled Variance                 2443.727
Hypothesized Mean Difference    0
df                              1502
t Stat                          34.14091
P(T<=t) one-tail                7.1E-190
t Critical one-tail             1.645869
P(T<=t) two-tail                1.4E-189
t Critical two-tail             1.961545

                                News feeds (with polarity ≠ 0) 2012-2015    News feeds (with polarity ≠ 0) 2016-2017
Mean                            1060.625375                                 764.9363817
Variance                        138000.3285                                 43140.5059
Observations                    1001                                        503
Pooled Variance                 106296.1801
Hypothesized Mean Difference    0
df                              1502
t Stat                          16.59409005
P(T<=t) one-tail                3.25812E-57
t Critical one-tail             1.64586875
P(T<=t) two-tail                6.51624E-57
t Critical two-tail             1.961544645

Table 6 t-Test: Two-Sample Assuming Equal Variances

Table 7 model 1

Source      SS            df     MS             Number of obs = 991
                                                F(4, 986)     = 1765.69
Model       19.7435314    4      4.93588286     Prob > F      = 0.0000
Residual    2.75630841    986    .002795445     R-squared     = 0.8775
                                                Adj R-squared = 0.8770
Total       22.4998399    990    .022727111     Root MSE      = .05287

price_of_i~x    Coef.        Std. Err.    t         P>|t|    [95% Conf. Interval]
AvgPol          .020906      .0109987     1.90      0.058    -.0006776    .0424896
ExchangeRate    .0985435     .0023332     42.24     0.000    .093965      .1031221
TIPS            .0799624     .0041664     19.19     0.000    .0717864     .0881385
Libor           -.1658183    .004793      -34.60    0.000    -.1752239    -.1564128
_cons           13.65067     .0266962     511.33    0.000    13.59828     13.70305

Table 8 regression output model 1

Variable        VIF     1/VIF
TIPS            1.54    0.648371
Libor           1.39    0.720157
ExchangeRate    1.21    0.825620
AvgPol          1.04    0.960868
Mean VIF        1.30

Table 9 test for multicollinearity model 1

Regressor       Shapley %R2
AvgPol          0.676
ExchangeRate    38.4744
TIPS            27.2518
Libor           33.5977

Table 10 Shapeley values model 1

Table 11 regression output model 2

Source      SS            df     MS             Number of obs = 991
                                                F(4, 986)     = 1773.94
Model       19.7547861    4      4.93869653     Prob > F      = 0.0000
Residual    2.74505372    986    .00278403      R-squared     = 0.8780
                                                Adj R-squared = 0.8775
Total       22.4998399    990    .022727111     Root MSE      = .05276

price_of_i~x    Coef.        Std. Err.    t         P>|t|    [95% Conf. Interval]
news_t1         .0000133     4.79e-06     2.77      0.006    3.87e-06     .0000227
ExchangeRate    .0982828     .0022992     42.75     0.000    .093771      .1027946
TIPS            .0815983     .0041826     19.51     0.000    .0733905     .0898062
Libor           -.167398     .0048353     -34.62    0.000    -.1768868    -.1579093
_cons           13.59863     .0189145     718.95    0.000    13.56151     13.63575

Table 12 regression output model 3

Source      SS            df     MS             Number of obs = 744
                                                F(3, 740)     = 3.41
Model       15210429.2    3      5070143.06     Prob > F      = 0.0171
Residual    1.0994e+09    740    1485614.63     R-squared     = 0.0136
                                                Adj R-squared = 0.0096
Total       1.1146e+09    743    1500087.83     Root MSE      = 1218.9

turnover_c~e    Coef.        Std. Err.    t        P>|t|    [95% Conf. Interval]
TotPol          2.59656      .8600537     3.02     0.003    .9081242     4.284996
GDP             -20.35575    44.41062     -0.46    0.647    -107.5416    66.83007
expected_i~n    -160.0174    259.986      -0.62    0.538    -670.4155    350.3806
_cons           156.8516     685.5582     0.23     0.819    -1189.019    1502.722

Table 13 regression output and VIF values model 4

Regressor             Shapley %R2
TotPol                46.9421
GDP                   1.8498
expected_inflation    51.2082

Table 14 Shapley values model 4

Table 15 regression output model 5

Table 16 partial F test model 5
