
Investor sentiment : the sentiment-return relationship analysed with a search-based measure : a cross-market analysis between the S&P 500 and cryptocurrencies


Academic year: 2021


Investor sentiment: the sentiment-return relationship analysed with a search-based measure.

A cross-market analysis between the S&P 500 and cryptocurrencies

Erwin Kruisbrink
Student Number: 10643680

MSc. Economics

Track: Behavioural Economics & Game Theory
Supervisor: prof. dr. J.H. (Joep) Sonnemans


Abstract

This research attempts to create a direct measure of investor sentiment based on search engine data. This measure is used to analyse the sentiment-return relationship in financial markets. More specifically, the measure is used to try to predict financial returns in both the traditional stock market and the cryptocurrency market. The literature has generated mixed results on the feasibility of using such a direct measure of investor sentiment. In addition, a cross-market analysis is conducted to examine which markets are more prone to investor sentiment. Based on this research, it cannot be concluded that a search-based measure for investor sentiment is feasible to analyse the sentiment-return relationship. Moreover, the results show that the constructed negative sentiment index has no real predictive power for returns in either market. Neither do the results provide evidence that the cryptocurrency market is more prone to investor sentiment than the traditional stock market.

JEL Codes: G12, G17, G40, G41

Statement of Originality

This document is written by Student Erwin Kruisbrink who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it.


Table of contents

1 Introduction
2 Literature review
2.1 Traditional Finance theories
2.2 Behavioural Finance
2.3 Cryptocurrency
2.4 Measuring Sentiment
2.4.1 Market measures
2.4.2 Survey measures
2.4.3 Other measures
2.4.4 Search engine data
3 Methodology and Data
3.1 Investor sentiment index
3.2 Other data
3.3 Model specification
3.4 Hypotheses
4 Results
4.1 Sentiment-return relationship
4.2 Cross-market analysis
4.3 Limitations
5 Discussion and conclusion
6 References
7 Appendix
A: Summary statistics
B: Test for Stationarity


1 Introduction

Stock market prediction has always attracted much attention from both academic researchers and investors; it is possibly one of the most researched subjects in finance. But can the stock market really be predicted? Early research on stock market prediction was based on the Random Walk Theory and the Efficient Market Hypothesis (EMH). According to the EMH, stock market prices are largely driven by new information rather than by present and past prices, and all investors act as rational agents. Since new information is unpredictable, stock market prices will follow a random walk pattern and cannot be predicted (Fama, 1970). However, despite its early theoretical and empirical success, some of the most interesting events in the history of stock markets are still unexplained by this model. The Great Crash of 1929, the Nifty Fifty bubble of the early 1970s, the Black Monday crash of October 1987, and the Dot.com bubble of the 1990s all show abnormal changes in stock prices that seem unjustified by fundamentals. This suggests that investors are not as rational as is assumed by the EMH.

As early as 1936, Keynes argued that markets fluctuate wildly under the influence of investors’ ‘animal spirits’, which move prices away from fundamentals. More recently, De Long, Shleifer, Summers, and Waldmann (1990) formalized this idea and showed that investors are not rational agents. Investors are subject to sentiment when making financial decisions, where investor sentiment is defined as a belief about future cash flows and investment risks that is not justified by available information. These irrational investors, or noise traders, can cause large changes in prices and higher volatility in financial markets.

To further examine noise trading based on sentiment, it is interesting to analyse which markets are sensitive to sentiment. Baker and Wurgler (2006) argue that there is a bigger role for sentiment in younger, unprofitable, highly volatile growth stocks with low capitalizations. This follows theoretically, as these categories are harder to value, making biases and valuation mistakes more likely, and as these categories are more difficult to arbitrage (higher transaction costs). The best recent examples of markets where prices and volatility fluctuated strongly are the market for internet stocks at the end of the 1990s and the current cryptocurrency market. The description above from Baker and Wurgler [...] a big role in these markets. This thesis will try to examine the role of investor sentiment across markets. More specifically, the role of investor sentiment in the current cryptocurrency market will be compared to its role in the traditional stock market.

In contrast to some decades ago, when the traditional finance theories were still widely accepted, papers in behavioural finance nowadays emphasize the role of behavioural and emotional factors in financial decision-making. As a consequence, measuring investor and social mood has become a key research issue in financial prediction. As Baker and Wurgler (2007) put it, ‘the question is no longer, as it was a few decades ago, whether investor sentiment affects stock prices, but rather how to measure investor sentiment and quantify its effects’.

Recently, a new way of measuring sentiment has emerged next to the traditional market and survey measures: measuring investor sentiment based on search engine data. One big advantage of search data is that such a measure reveals sentiment instead of inquiring about it. Most relevant papers try to measure investor sentiment based on search volume for individual stock ticker symbols or company names and then evaluate the individual returns of these companies (Preis, Reith, & Stanley, 2010; Bijl, Kringhaug, Molnár, & Sandvik, 2016). Besides generating contradictory results, these studies seem to capture merely investor attention for individual stocks rather than broad investor sentiment. Other researchers do use aggregate search terms to capture society-wide investor sentiment. Although these studies (Preis, Moat, & Stanley, 2013; Da, Engelberg, & Gao, 2014) find significant correlations, one can argue that their methodologies are not entirely clean. These papers are discussed in Section 2.4.4, and to address their issues this thesis uses a slightly different methodological approach.

As the results of the relevant literature are mixed, this research tries to find more evidence for the sentiment-return relationship with use of a search-based measure for investor sentiment. The main research question of this thesis is therefore whether a search-based measure for investor sentiment can predict returns on financial markets. Next to that, to gain further knowledge about the effects of investor sentiment in different markets, a cross-market analysis between the traditional stock market (S&P 500) and the market for cryptocurrencies will be conducted.


Based on the literature, it is expected that negative sentiment will be correlated with negative returns on the same day. In line with behavioural finance, the days following the negative sentiment should show a return reversal; that is, returns will increase again and undo the initial decrease. Concerning the cross-market analysis, it is expected that investor sentiment will play a bigger role in the market for cryptocurrencies than in the traditional stock market, which would be in line with Baker and Wurgler (2006).

However, based on this research, it cannot be concluded that a search-based measure for investor sentiment is feasible to analyse the sentiment-return relationship. Moreover, the results show that the constructed negative sentiment index has no real predictive power for returns on both markets. Neither does it provide evidence that the cryptocurrency market is more prone to investor sentiment than the traditional stock market.

This research adds to the mixed results of the literature. It differs from the existing literature in its alternative methodological approach and in the cross-market analysis, as the cryptocurrency market has not been examined in this context before. Moreover, this thesis tests the feasibility of using search-based data in financial applications. Search data has the potential to be used in more economic models, as it directly reveals the sentiment of an entire population of agents.

This thesis is organized as follows. Section 2 reviews the relevant literature and related studies. Section 3 describes the methodology and data used. Section 4 presents the results of the data analysis. Finally, Section 5 presents the conclusions of this thesis.

2 Literature review

In this section the relevant literature and related research will be examined and reviewed to better understand the theoretical framework of the effect of investor sentiment on financial markets. First, the most important traditional finance theories will be discussed. Subsequently, the scientific shift to behavioural finance theories such as limits to arbitrage and investor sentiment will be described. Then, the cryptocurrency market will be [...] investor sentiment, in Section 2.4 the several ways of measuring investor sentiment will be discussed and analysed. Finally, the literature on the sentiment-return relationship with a search-based measure will be reviewed.

2.1 Traditional Finance theories

Traditional finance theories assume that markets, and thus prices, are perfectly efficient. This idea is formulated in the Efficient Market Hypothesis (EMH) by Fama (1970), one of the most important theories in the financial literature. The EMH states that asset prices fully reflect all available information. All agents in the market are rational, and the best strategy to follow is a fully random diversified portfolio where only systematic risk remains. As only news can change prices and news is unpredictable, asset prices follow a random walk (Fama, 1970).
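The random-walk implication can be illustrated with a small simulation. The sketch below is not part of the analysis in this thesis; it assumes that daily log returns are independent normal draws (the drift, volatility, seed and function names are illustrative choices) and checks that yesterday's return is essentially uncorrelated with today's.

```python
import math
import random


def simulate_log_returns(n_days, mu=0.0, sigma=0.01, seed=42):
    """Draw independent normal log returns: the unpredictable 'news'."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n_days)]


def prices_from_returns(returns, start_price=100.0):
    """Accumulate log returns into a price path."""
    prices = [start_price]
    for r in returns:
        prices.append(prices[-1] * math.exp(r))
    return prices


def lag1_autocorrelation(xs):
    """Sample correlation between x_t and x_(t-1)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[t] - mean) * (xs[t - 1] - mean) for t in range(1, n))
    return cov / var


returns = simulate_log_returns(5000)
prices = prices_from_returns(returns)
rho = lag1_autocorrelation(returns)
print(f"lag-1 autocorrelation of returns: {rho:.4f}")
```

Under these assumptions the lag-1 autocorrelation is close to zero, which is what it means for returns to be unpredictable from past prices; return predictability would show up as a value clearly different from zero.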

However, there are periods in history in which stock markets behaved in a manner that does not match the EMH. Over the last century the world has seen bubbles form and burst where the changes in asset prices were not justified by fundamentals. Among others, the Great Crash of 1929, the Nifty Fifty bubble of the early 1970s, the Internet bubble of the 1990s, and perhaps also the current cryptocurrency bubble cannot be explained by traditional finance theories.

During the past decades more researchers, based on observed market anomalies, started to challenge the assumptions of the EMH. Shiller (1981) showed empirically that variations in discounted future cash flows (dividends) are smaller than variations in asset prices. This indicates that investors are not fully rational and that prices can thus move away from fundamentals.

Nevertheless, the EMH can still hold in the presence of irrational investors. Less rational speculators, who move prices away from fundamentals by buying when prices are high and selling when prices are low, are eliminated from the market by rational speculators (arbitrageurs). The less rational speculators make no profit, until only rational speculators remain and prices stabilize at the fundamental level (De Long et al., 1990). This process, arbitrage, is one of the fundamental concepts in finance and is defined as “the simultaneous purchase and sale of the same, or essentially similar, security in two different markets for advantageously different prices” (Sharpe, Alexander, & Bailey, 1998).


Irrespective of the presence of these less rational traders, or ‘noise traders’, the market can still be efficient. Black (1986) came up with the term ‘noise traders’, defined as investors who trade on ‘noise’, that is, unjustified information, instead of on information about fundamentals. Black argues that noise traders are even important for financial markets, as without them trading would be impossible.

Nonetheless, De Long et al. (1990) showed that when arbitrage is limited, asset prices that are away from fundamentals can persist, and consequently noise traders do not get eliminated from the market. Shleifer and Vishny (1997) argue that, although textbook arbitrage requires no capital and involves no risk, in reality arbitrage requires capital and is risky. They show that professional arbitrage, which is mostly carried out by a small group of professional arbitrageurs using other people’s capital, can become ineffective in restoring prices to fundamental values when prices diverge too much from these fundamentals (Shleifer & Vishny, 1997).

Next to this theoretical research challenging the EMH, there is also empirical evidence contradicting it. According to the EMH, asset prices follow a random walk and thus cannot be predicted. In fact, multiple studies find anomalies in the market and are able to predict returns to some extent. Among others, French, Schwert and Stambaugh (1987), Fama and French (1988), and Campbell and Shiller (1988) find that simple models with measures such as valuation ratios can predict stock prices and that predictability increases at longer horizons.

To conclude, Shiller (2003) states that the existence of anomalies implies that either financial markets are inefficient or traditional asset pricing models are incorrect. This has resulted in a new view on financial markets and created a new research orientation: behavioural finance, that is, finance from a broader social science perspective including psychology and sociology (Shiller, 2003). In the next section some topics of this branch of research will be discussed, specifically the limits to arbitrage and the psychological factors affecting investors’ behaviour.

2.2 Behavioural Finance


[...] finance theories were not able to explain. These theories were built around two pillars: limits to arbitrage and investor sentiment. In this section these two main topics of behavioural finance will be discussed.

2.2.1 Limits of arbitrage

As described before, arbitrage is essential in traditional finance theories that assume perfect markets, as it theoretically brings prices back to fundamental values and keeps markets efficient. In these theories, arbitrage opportunities should exist only for a short time, or practically not at all, as rational arbitrageurs are constantly looking for these opportunities and the noise traders get eliminated from the market.

Theoretically, an arbitrageur buys a cheaper asset and sells a more expensive one; the net future cash flows are zero throughout the process and the arbitrageur profits immediately. This process does not involve any risk. Not even capital is required in the traditional model of arbitrage, as the market described has a very large number of arbitrageurs who each take an extremely small position against the mispricing (Shleifer & Summers, 1990).

However, in reality, there are not that many arbitrageurs with the knowledge and information to take part in this process. It is more common that only a few specialized, professional investors take larger positions with other people's capital. This agency relationship causes arbitrage to require capital in reality, in contrast with the theoretical version of arbitrage. In addition, arbitrage can become risky when prices move fast. When the value of the assets the arbitrageur delivers differs from the value of the assets delivered to him, losses may occur. This is called risk arbitrage; in reality, most arbitrage trades are risky and require capital (Shleifer & Vishny, 1997). Shleifer and Vishny (1997) argue that the agency relationship causes arbitrageurs to be more cautious in exploiting trade opportunities, as these trades involve risk and the arbitrageurs depend on the capital of others. Bad performance, judged on past returns, leads to less capital being allocated. Hence, this significantly limits the effectiveness of arbitrage in achieving market efficiency.

Another source of risk is the fact that noise tends not to be random. Theoretically, noise traders form beliefs based on random unjustified information. When some noise traders form beliefs about higher prices and others about lower prices, they cancel each other out. However, Shleifer and Summers (1990) show that many trading strategies are correlated, causing aggregate changes in demand. They argue that biases in information processing affect investors in the same manner.

Furthermore, the beliefs of noise traders, and the horizons of these beliefs, are hard to predict. De Long et al. (1990) give an example of this problem. An arbitrageur takes a short position when an asset is overpriced. As the arbitrageur is dependent on the capital of others, there is no infinite horizon to close this position, and there is a risk that the asset is still overpriced when the position has to be closed. This means that the arbitrageur depends on the beliefs about the value at the moment the position is closed. If noise traders have pushed the price even further from its fundamental value by then, the arbitrageur loses money on the trade. This is called noise trader risk (De Long et al., 1990). As Keynes already stated in 1931: “There is nothing so disastrous as a rational investment policy in an irrational world.”
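Noise trader risk can be made concrete with a small numerical illustration. The prices below are hypothetical and chosen only for this example: an arbitrageur shorts an overpriced asset and must close the position at a fixed horizon, whatever the price is at that moment.

```python
def short_position_pnl(entry_price, exit_price, quantity=1):
    """Profit or loss of a short position: sell at entry, buy back at exit."""
    return (entry_price - exit_price) * quantity


fundamental_value = 100.0  # hypothetical fundamental value of the asset
entry_price = 120.0        # the asset is overpriced when the short is opened

# Case 1: the mispricing has corrected by the time the position is closed.
pnl_corrected = short_position_pnl(entry_price, fundamental_value)

# Case 2: noise traders have pushed the price further from fundamentals.
pnl_worsened = short_position_pnl(entry_price, 140.0)

print(pnl_corrected)  # 20.0
print(pnl_worsened)   # -20.0
```

In both cases the arbitrageur was right about the fundamental value; the outcome depends entirely on where noise trader beliefs stand when the position has to be closed.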

To conclude, there are limits to arbitrage, as it is risky and requires capital. This makes it possible for mispricing of assets to arise and persist for longer periods of time.

2.2.2 Sentiment

There are two types of traders in the market: rational traders who arbitrage and sentimental traders who cause mispricing. As shown in the previous section, this mispricing can persist due to limits to arbitrage. However, where does the sentiment that causes mispricing come from, and how does it affect prices? What makes some stocks more vulnerable to shifts in investor sentiment?

Investor sentiment is defined as an unjustified belief about future cash flows or risks or, to put it differently, the propensity to speculate (Baker & Wurgler, 2007). It is argued that unjustified beliefs come from psychological biases that investors have. Shiller (2003) states that concepts like optimism, self-attribution, limited attention and the disposition effect cause investors to under- or overreact to news. These biases might be even stronger in cases of uncertainty. When uncertainty, or subjectivity of valuations, is high, there are bigger roles for overconfidence, representativeness and conservatism. In addition, differences of opinion are bigger in times of uncertainty, even when investors have the same information (Baker & Wurgler, 2007).


That being the case, it is interesting to determine which assets attract a higher propensity to speculate than others. That is, for which assets is it harder to determine the fundamental value? Baker and Wurgler (2006) perform an empirical study on the cross-sectional effect of investor sentiment on stocks. They find that companies that are young, small, highly volatile, unprofitable, non-dividend paying, distressed, or with extreme growth potential have stocks that are most sensitive to investor sentiment (Baker & Wurgler, 2006). All these characteristics lead to a higher subjectivity in valuation. In their paper one year later, they show that the stocks that are hardest to arbitrage are also the stocks that are difficult to value (Baker & Wurgler, 2007). This means that sensitivity to investor sentiment is highly correlated with limits to arbitrage.

The Internet bubble of the 1990s provides anecdotal evidence for the conclusions of Baker and Wurgler. The future of the internet was not yet clear and companies were young and unprofitable, which made these stocks hard to value and thus prone to investor sentiment. Nowadays, a parallel story can be told about the emerging cryptocurrency market. Is cryptocurrency going to deliver on its promises and change the way we think about money and banking forever, like the internet completely changed our lives? Or is it just a speculative bubble? One thing is for sure: the true value of the cryptocurrency market is very hard to unravel, which makes it a market where investor sentiment can play a big role.

In the following part, the cryptocurrency market will be examined. Are cryptocurrencies really currencies or rather assets? How does the price development compare to historical bubbles?

2.3 Cryptocurrency

In this section the market for cryptocurrency will be examined. First, as the purpose of cryptocurrencies and the way they work remain unclear to many, their background and application will be briefly discussed. Next, an attempt will be made to analyse the fundamental value of cryptocurrency as an asset class. Finally, the recent development of the market will be observed and it will be shown that the market is prone to investor sentiment.


2.3.1 What are cryptocurrencies?

Cryptocurrencies are a new currency or asset class that enables decentralized applications. They propose a shift away from conventional financial infrastructures and create a system with decentralized organisation and transparency, based on technological solutions such as peer-to-peer connectivity and cryptographic algorithms.

These decentralized applications enable services like payments, storage or computing without a central operator. These services obviously already exist; the only new thing about them is the absence of a central party. Most aspects of centralized services are better than their decentralized counterparts. In general, decentralized services are slower, more expensive, and less scalable. Moreover, they offer a worse user experience and have volatile and uncertain governance. The only dimension where decentralized applications are better is censorship resistance. This means that only people who prefer to make this trade-off will use the decentralized applications themselves.

The most famous cryptocurrency is Bitcoin, a decentralized application for payments. Bitcoin, designed by Nakamoto (2008), is an electronic financial mechanism resembling an established currency system. It has its own money creation and transaction administration relying on a decentralized organizational structure.

Bitcoins can be sent to other users on a peer-to-peer network directly, without the need for trusted intermediaries that verify the transaction. Instead, transactions are verified cryptographically by entities using large amounts of processing power. These entities compete for the right to verify the latest transactions and receive Bitcoin as compensation. The full history of transactions is stored in a chain, often called the Blockchain (Nakamoto, 2008).

Nowadays there are many different cryptocurrencies besides Bitcoin. In June 2018, around 400 cryptocurrencies already existed with market capitalisations of at least 10 million US dollars. These alternative cryptocurrencies differentiate themselves by either providing different services or proposing solutions for problems with the Bitcoin network: higher speed of transaction validation, better energy efficiency and a more robust algorithm (European Central Bank, 2015). However, as Bitcoin is still the biggest player in the market, with a market capitalisation more than twice that of the number two, Ethereum, the focus here will be mainly on Bitcoin.


2.3.2 Currency or asset?

There are two possible reasons why people would buy a cryptocurrency such as Bitcoin. The first is the use of Bitcoin as an alternative payment system, in which users can easily and cheaply make financial transactions without a central operator. The second is buying Bitcoin as an asset, speculating on increasing prices in order to sell at a profit. Whether Bitcoin is a currency or an asset is still an ongoing discussion, although more and more researchers conclude that Bitcoin is an asset.

A currency, according to Laidler (1969), can be used as a medium of exchange, as a way to store value, or as a unit of account to compare the value of goods and services. An asset, in contrast, is defined as a resource with economic value that an individual, corporation or country owns with the expectation that it will provide a future benefit.

Weber (2014) argues that 70% of all existing Bitcoins in 2014 were held in dormant accounts, that is, accounts without any limitations on withdrawals that have not shown any activity for a long time. This shows that Bitcoins are not generally used as a medium of exchange. However, Bitcoins can still be owned to store value or be used as a unit of account, and thus be a currency. For these functions to be fulfilled, Bitcoin should have a certain level of confidence among its users. In other words, Bitcoin cannot be a reliable unit of account or store of value when prices fluctuate a lot. In fact, the price of Bitcoin fluctuates far more than that of other currencies. In Figure 1 the historical volatility of the BTC price in US dollars is plotted against the volatility of the US dollar in euros. Here one sees that the volatility of this fiat currency pair is around 0.5%, while the price of BTC fluctuates wildly, with a volatility of around 5% in recent years. This means that Bitcoin is not reliable as a unit of account or store of value and thus is not a currency.


Figure 1: Volatility of Bitcoin/USD against the volatility of USD/EUR. Graph source: https://www.buybitcoinworldwide.com/nl/volatiliteits-index/
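As a rough illustration of how a volatility figure of this kind can be computed, the sketch below takes the sample standard deviation of daily log returns over a trailing window. The 30-day window, the function names and the price path are assumptions made for this example; the exact method used by the graph source is not specified here.

```python
import math
import statistics


def daily_log_returns(prices):
    """Log return between each pair of consecutive daily prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def rolling_volatility(prices, window=30):
    """Sample standard deviation of daily log returns over each trailing window."""
    rets = daily_log_returns(prices)
    return [
        statistics.stdev(rets[end - window:end])
        for end in range(window, len(rets) + 1)
    ]


# Hypothetical price path whose log return alternates between +5% and -5%.
prices = [100.0]
for day in range(60):
    step = 0.05 if day % 2 == 0 else -0.05
    prices.append(prices[-1] * math.exp(step))

vols = rolling_volatility(prices, window=30)
print(len(vols), round(vols[0], 4))
```

A series with daily swings of about 5% yields a volatility of roughly 5% on this measure, which is the order of magnitude reported for Bitcoin above; swings of about 0.5% would yield roughly 0.5%.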

This high volatility follows from the uncertainty about the fundamental value of a Bitcoin. Researchers have tried in several ways to find the factors that drive the price of a Bitcoin. Kristoufek (2013) shows that future cash-flow models, purchasing power parity, or uncovered interest rate parity cannot be applied to Bitcoin to discover a fundamental value. Alternatively, Hayes (2017) tries to describe the supply side of the price through the cost of production of Bitcoins, which comes down to electricity costs. However, he concludes that current prices are far higher than these costs, meaning that the demand side drives this high price.

To conclude, there are no convincing factors driving prices other than individual investors’ demand. As the future adoption of new technology is hard to predict, the short-run driver of cryptocurrency prices will be the demand of individual investors speculating on price increases. In the long run, the value of a cryptocurrency will depend on how much the decentralized service it enables will be used.

Given that the price of a Bitcoin is higher than its fundamental value, it is interesting to look at how this price developed. The price development of cryptocurrencies such as Bitcoin since their origination, as seen in Figure 2, leads many people to think it is a bubble. When comparing the price development with, for example, the tech stocks of the Internet bubble, one sees the similarities. One big difference is the speed and the height of the price development. Where the Internet bubble took around six years to form and burst, it took the Bitcoin price one year to multiply by a factor of twenty.


Figure 2: Historical development of the price of Bitcoin and Nasdaq stocks compared. Graph source: Morgan Stanley

Moreover, not only do the prices behave similarly; the idea of a new-era technology that would change the way we think about the world is also a parallel. Shiller (2000) argues that ‘new era’ stories become attached to a bubble and then acquire increasing plausibility and investor enthusiasm as the market continues to achieve high returns. Irrational speculative bubbles are characterised by an unusual kind of buzz or social epidemic created by social psychological biases and imperfect information. This leads to market sentiment with overoptimistic expectations about an asset’s fundamental value. Under these kinds of circumstances, price and fundamental value drift apart (Shiller, 2000).

In conclusion, cryptocurrencies are young, highly volatile, unprofitable, non-dividend paying, and have extreme growth potential, and thus closely match the description of the assets that are most sensitive to investor sentiment according to Baker and Wurgler (2006). So, whether or not the cryptocurrency market is experiencing a bubble, it is clear that the market for cryptocurrencies is prone to investor sentiment. That means it should be possible to capture price development in a measure for investor sentiment. The next section will get back to investor sentiment and will examine the different ways of measuring it.

2.4 Measuring Sentiment

In this section the literature on how to measure investor sentiment will be reviewed. Different options, such as market-based measures, survey-based measures and other measures, will be discussed. Furthermore, the choice for the search-based measure used in this research will be explained.

As described in Section 2.2, investors are subject to sentiment when making financial decisions, where investor sentiment is defined as a belief about future cash flows and investment risks that is not justified by available information. In other words, investor sentiment is the propensity to speculate. Clearly, there is no perfect measure for an attitude or belief; a proxy is needed in order to research the effects of investor sentiment. In behavioural finance research several measures of investor sentiment can be found.

One way of approaching this problem is the so-called ‘bottom up’ approach, that is, examining psychological biases in individual investor behaviour to understand how investors under- or overreact to past returns or fundamentals. However, most measures used, including the one in this thesis, are ‘top down’. This approach focuses on aggregate sentiment and examines its effect on the market.

2.4.1 Market measures

Historically, investor sentiment was measured through market-based measures such as trading volume, the closed-end fund discount, IPO first-day returns, IPO volume, option-implied volatilities (VIX) or mutual fund flows. An example of research using such proxies as a measure for investor sentiment is the paper by Baker and Wurgler (2006). They build an index including these six proxies and find that companies that are young, small, highly volatile, unprofitable, non-dividend paying, distressed, or with extreme growth potential have stocks that are most sensitive to investor sentiment. Although these measures are readily available at high frequency, their disadvantage is that they are the outcome not just of investor sentiment but also of other economic conditions. In effect, an input-output relationship is measured with an output measure.


2.4.2 Survey measures

Another way of measuring investor sentiment is through survey-based indices such as the Consumer Sentiment Index from the University of Michigan or the UBS/Gallup Index of Investor Optimism. However, surveys have a couple of disadvantages compared to other measures of sentiment. Surveys are often conducted only monthly or quarterly, which is a low frequency compared to other measures. Moreover, surveys are expensive and time-consuming to conduct, especially with response rates decreasing worldwide. In addition, there is often little incentive to answer survey questions carefully or truthfully, especially when questions are sensitive (Tourangeau & Yan, 2007).

2.4.3 Other measures

More recently, researchers have found multiple other innovative ways of measuring investor sentiment. Here, some of these measures will be briefly described.

Tetlock (2007) quantitatively measures the interactions between the media and the stock market using daily content from a popular Wall Street Journal column. He finds that high media pessimism predicts downward pressure on market prices followed by a reversion to fundamentals, and that unusually high or low pessimism predicts high market trading volume.

Bollen, Mao and Zeng (2011) derive collective public mood, or investor sentiment, from large-scale Twitter feeds. By analysing the content of daily tweets, they measure mood in six different dimensions. With these mood states they are able to predict the direction of change in DJIA closing values with an accuracy of 86.7%.

Even more unorthodox, Hirshleifer and Shumway (2003) examine the sentiment-return relationship with weather as a proxy for sentiment. They find that sunny morning weather is significantly correlated with stock returns, again showing that the pricing of stocks is not a fully rational process. In the same category falls the paper of Edmans, García and Norli (2007). They use international football results as a proxy for investor mood and find significant market declines after football losses.

Another, more serious, measure that emerged in the last decade is a search-based measure. That is, search engine data is used to examine what people search for, and a measure of investor sentiment is based on this data. This has a couple of advantages compared to

the market and survey measures discussed before. Namely, search data is available on a daily basis. Furthermore, the data is freely available, and based on a large sample.

Moreover, a search-based measure reveals sentiment instead of inquiring about it. Take the example of a person answering a survey question about the likelihood of losing his job: as this is a sensitive topic, the answer might not be fully truthful. Aggregate search volume for terms such as ‘unemployed’ and ‘job search’, by contrast, reveals concern about job loss. Finally, search-based measures have the advantage of capturing the behaviour of individual investors, who are more prone to social, cognitive and emotional biases. This reflects nationwide sentiment better than the behaviour of institutional investors, who are more likely to use professional sources such as Bloomberg.

2.4.4 Search engine data

Research using search data to predict financial markets has generated mixed results. Preis et al. (2010) find no significant relation between returns and search volume for company names. They do, however, find evidence that search volume predicts trading volume. This suggests that they are measuring investor attention rather than sentiment. In a later study, Preis et al. (2013) used general search terms related to finance to predict market returns. They found that their buy and sell strategy based on search volumes would have outperformed the market by 310% over the seven years they examined. However, this research received some criticism of its methodology, as it tests the predictability of historical returns using terms that were chosen based on their usage in the Financial Times during the period researched (2004-2011). This means that keywords from 2011 are used to analyse the years before, injecting future knowledge into the past. Moreover, only three out of ninety-eight terms have a significantly higher return than the mean return at the 95% level, which is fully explainable by chance, as about 5% of the keywords are expected to be significant even without any real effect. To substantiate these concerns, other researchers showed that the same result was obtained using random words that were not related to finance (Challet & Ayed, 2014). Another paper that does find a significant relationship between search volume and returns is the one by Bijl et al. (2016). They find a weak, though significant, relationship between individual stock returns and search volumes for the company name of the stock.
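The multiple-testing argument above can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that the 98 keyword tests are independent and that each has a 5% chance of a false positive; the function name `prob_at_least` is hypothetical and not taken from any of the cited papers.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


n_terms = 98   # number of search terms tested, as cited in the text
alpha = 0.05   # false-positive rate per test at the 95% level

# Expected number of 'significant' terms if no term has any real effect.
expected_false_positives = n_terms * alpha

# Probability of seeing three or more significant terms purely by chance
# (assumption: the tests are independent).
p_three_or_more = prob_at_least(3, n_terms, alpha)

print(f"expected significant terms under the null: {expected_false_positives:.1f}")
print(f"P(at least 3 significant by chance): {p_three_or_more:.2f}")
```

Under these assumptions, three significant terms out of ninety-eight is below the number expected by chance alone, which is the core of the criticism.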

The study that is most related to this research is the study by Da et al. (2014), which uses a search-based measure for investor sentiment. They use daily internet search volume from Google and aggregate a list of negative economic terms to create an investor sentiment index, which they call ‘FEARS’. These researchers find that an increase in the FEARS index leads to a significant return reversal: high FEARS are correlated with low returns today, but lead to high returns over the next two days. Next to that, these researchers find that this effect is the strongest for stocks that attract sentiment investors. However, Da et al. (2014) use a historical regression-based approach to select the most relevant terms. That is, only the n terms with the most negative relationship with historical returns are used in their sentiment index. This makes it unsurprising that they subsequently find that their index correlates significantly with market returns.

To be able to examine the effectiveness of measuring investor sentiment with search data and consequently predicting market returns the methodology is a bit different compared to the literature described above. In order to capture investor sentiment and not just investor attention in this research the focus will not be on stock ticker symbols or company names but on terms that reveal sentiment towards economic conditions. Next to that, to prevent overfitting the model by selecting terms based on their historical correlations, in this thesis all negative terms will be aggregated into the index. The precise way the sentiment index is constructed will be discussed in the next section.

3 Methodology and Data

In this part the methodology and data used will be described. First, it is discussed how the index measuring investor sentiment is constructed. Then, the sources of the data used and some tests on the usability of the data will be described. Finally, the specification of the models used in the analysis will be discussed.

3.1 Investor sentiment index

The main objective of this thesis is to construct a reliable measure of investor sentiment based on the search behaviour of agents. Here, the method used by Da et al. (2014) will be partly followed. The key to obtaining this measure of investor sentiment is identifying relevant terms that reveal sentiment about economic conditions and obtaining their search volume from Google Trends.

In order to create a list of relevant search terms, the General Inquirer based on the Harvard IV-4 Dictionary (http://www.wjh.harvard.edu/~inquirer/spreadsheet_guide.htm) is used. This dictionary is often used for textual sentiment analysis in financial research, for example in Tetlock (2007). In the General Inquirer words are categorized into several groups, among which are the categories ‘economic’, ‘positive’ and ‘negative’. As the papers by Da et al. (2014) and Tetlock (2007) show, negative words seem to be the most informative and to capture sentiment best. This is why in this research all ‘economic’ words that have ‘negative’ sentiment were taken to form the Anxiety index.

This resulted in 49 words, including words such as ‘jobless’, ‘bankrupt’ and ‘inflation’. To include other ways agents might search for these terms, the top ten ‘related top searches’ were downloaded as well. Google Trends defines these as follows: “Top searches are terms that are most frequently searched with the term you entered in the same search session, within the chosen category, country, or region.” For example, related top searches for the term ‘unemployment’ include ‘unemployment office’ and ‘unemployment benefits’.

The 49 original words generate a list of 539 search terms (each word plus its ten related searches), of which 453 remain after removing duplicates. Next, non-economic terms were removed, resulting in 119 remaining terms. An example of a non-economic term is seen in the related search terms of a word such as ‘depression’. Two of the related search terms are ‘the great depression’ and ‘anxiety depression’. It is clear that the second one has nothing to do with economic conditions, and it is thus removed from the list. Finally, after removing the search terms without enough data (more than 100 missing values in the sample of 730 days), 43 search terms remained. The methodology here differs from Da et al. (2014), as they take both ‘negative’ and ‘positive’ words and use a historical regression-based approach to select the terms that have the most negative correlation with the market.
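The three filtering steps described above (removing duplicates, removing non-economic terms, removing terms with too little data) can be sketched as follows. This is only an illustration under stated assumptions: the function `filter_terms`, the use of `None` for a missing daily value, and the toy data are all hypothetical, and the economic screening, which was a manual judgement in this thesis, is represented by a hand-made set of accepted terms.

```python
def filter_terms(svi_by_term, economic_terms, max_missing=100):
    """Keep unique, economic search terms that have enough daily data.

    svi_by_term    : dict mapping a search term to its list of daily SVI
                     values, with None marking a missing value (assumption)
    economic_terms : set of terms judged to be about economic conditions
    max_missing    : terms with MORE than this many missing days are dropped
    """
    kept = {}
    for term, series in svi_by_term.items():
        key = term.strip().lower()
        if key in kept:
            continue  # duplicate of a term that was already kept
        if key not in economic_terms:
            continue  # non-economic term, e.g. 'anxiety depression'
        missing = sum(1 for value in series if value is None)
        if missing > max_missing:
            continue  # not enough data in the sample period
        kept[key] = series
    return kept


# Toy data for a 730-day sample (values are made up).
toy = {
    "unemployment benefits": [50] * 730,
    "Unemployment Benefits": [50] * 730,      # duplicate after normalising
    "anxiety depression": [40] * 730,         # not an economic term
    "bankrupt": [None] * 150 + [30] * 580,    # more than 100 missing days
}
economic = {"unemployment benefits", "bankrupt"}

remaining = filter_terms(toy, economic)
print(sorted(remaining))
```

In the toy example only one term survives all three steps, mirroring how the list in this section shrinks from 539 to 43 terms.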

For the remaining search terms the daily Search Volume Index (SVI) for the period 01.05.2016 – 01.05.2018 was downloaded. This time period is chosen because many cryptocurrencies originated and became known to a wider public only in the most recent years. This led to higher adoption rates, which makes it possible to analyse the market.

Google Trends gives the option to restrict the SVI’s to specific countries. As the search terms are used in English and the returns from the S&P 500 are used for the traditional stock market returns, the SVI’s are downloaded for the United States. This means that the sentiment index will represent American households.

The daily SVI data from Google Trends is only available one quarter at a time; for longer periods only weekly data is available. The data for the full sample period is therefore downloaded one quarter at a time. As described in Section 2.5, Google Trends scales the SVI’s by the series maximum. This means that the daily SVI’s in a certain quarter are scaled by the highest SVI in that quarter. A problem with this normalization within quarters is that the daily SVI change on the first day of a new quarter cannot be computed, because the two days involved are scaled differently. To solve this problem, the downloaded periods were chosen such that the last day of a quarter is the same date as the first day of the next quarter. This makes it possible to compute the daily change in search for every day. Thus, for all days the daily logarithmic SVI change for search term j is defined as:

$$ \Delta SVI_{j,t} = \ln(SVI_{j,t}) - \ln(SVI_{j,t-1}) \tag{1} $$

The last objective is to create the Anxiety index from these remaining terms. This is done by taking the average of the daily SVI changes of all terms for each day t in the sample period. Summary statistics of the daily SVI changes of the 43 terms and of the Anxiety index are presented in Table A.1 in the Appendix.
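The construction described above can be sketched in a few lines of dependency-free Python. Log changes are computed within each downloaded quarter, so that both days of a pair share the same scaling; because consecutive quarters overlap by one day, every day still gets a change. The function names, the toy numbers, and the treatment of zero or missing SVI values (skipped when averaging) are assumptions for illustration, not a description of the exact procedure used in this thesis.

```python
import math


def log_changes(quarters):
    """Daily log changes of the SVI for ONE search term.

    quarters: list of lists of daily SVI values, one list per downloaded
    quarter, where the last day of a quarter is the same calendar day as the
    first day of the next quarter. Changes are only taken within a quarter.
    """
    changes = []
    for quarter in quarters:
        for previous, current in zip(quarter, quarter[1:]):
            if previous and current:  # ln is undefined for 0 or missing values
                changes.append(math.log(current) - math.log(previous))
            else:
                changes.append(None)
    return changes


def anxiety_index(changes_by_term):
    """Average the log changes across terms for each day (None is skipped)."""
    n_days = len(next(iter(changes_by_term.values())))
    index = []
    for t in range(n_days):
        values = [c[t] for c in changes_by_term.values() if c[t] is not None]
        index.append(sum(values) / len(values) if values else None)
    return index


# Toy example: two terms, two 'quarters' of two days each. The shared day has
# a different scaled value in each quarter (100 vs 40 for term_a).
svi = {
    "term_a": [[50, 100], [40, 20]],
    "term_b": [[10, 10], [80, 80]],
}
changes = {term: log_changes(quarters) for term, quarters in svi.items()}
anxiety = anxiety_index(changes)
print(anxiety)
```

Note that the change across the quarter boundary never mixes values from two differently scaled downloads, which is the point of the overlapping download periods.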

Another important aspect to check is seasonality in the data. Figure 3 plots the daily SVI’s for the term ‘unemployment’ for the last two months of the sample period and shows clear intraweek seasonality. Figure 4 shows that this intraweek seasonality is not limited to a few terms: the Anxiety index plotted for the first two months of the sample period also clearly shows intraweek seasonality, with high SVI’s on Mondays which gradually decrease until Sunday. In order to deal with this problem, a control for seasonality has to be included in the regression. This is done by adding dummy variables for the days of the week and the months of the year. When running the regressions, Monday and January are omitted to prevent the dummy variable trap.
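As a minimal sketch of the seasonality controls described above, the function below builds day-of-week and month dummies for a single date, leaving out Monday and January as the reference categories. The function name and the dictionary layout are hypothetical; in practice a statistics package would generate these columns.

```python
from datetime import date

# Reference categories (Monday, January) are left out to avoid the dummy trap.
WEEKDAYS = ["Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Feb", "Mar", "Apr", "May", "Jun", "Jul",
          "Aug", "Sep", "Oct", "Nov", "Dec"]


def season_dummies(day):
    """Return a dict of 0/1 dummies for the weekday and month of `day`."""
    row = {}
    for number, name in enumerate(WEEKDAYS, start=1):  # date.weekday(): Monday = 0
        row[name] = 1 if day.weekday() == number else 0
    for number, name in enumerate(MONTHS, start=2):    # date.month: January = 1
        row[name] = 1 if day.month == number else 0
    return row


# A Monday in January is the baseline: every dummy is zero.
print(season_dummies(date(2018, 1, 1)))
# A Tuesday in May switches on exactly two dummies.
print(season_dummies(date(2016, 5, 3)))
```

For a series observed on trading days only, such as the S&P 500 returns, the Saturday and Sunday dummies would be zero throughout and would have to be dropped as well.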

Figure 3: Weekly seasonality in the SVI’s for the search term ‘unemployment’. Data source: Google Trends (www.google.com/trends).

Figure 4: The Anxiety index in the first two months of the sample period. Some clear intraweek seasonality is seen.

3.2 Other data

The daily returns for the traditional stock market are calculated based on the Standard & Poor’s 500 (S&P 500). This is a stock market index from the United States based on the market capitalizations of the 500 largest companies that are publicly traded on the New York Stock Exchange or NASDAQ. The S&P 500 is one of the most commonly used equity indices and is considered one of the best representations of the U.S. stock market. The opening and closing prices of the S&P 500 are downloaded from the Center for Research in Security Prices (CRSP). In this research the returns are calculated as follows:

$$ Return_t = \frac{ClosingPrice_t - ClosingPrice_{t-1}}{ClosingPrice_{t-1}} \tag{2} $$

where $ClosingPrice_t$ is the price of the S&P 500 index or cryptocurrency index at the end of trading day t, and $ClosingPrice_{t-1}$ is the price at the end of the previous trading day (t-1). This means that $Return_t$ is the daily relative change in the price of the index concerned, either the S&P 500 index or the cryptocurrency index.

Figure 5 suggests that the S&P 500 returns are stationary in the mean; it is harder to judge whether the variance is also stationary. In order to be able to use an Autoregressive Distributed Lag model (ADL-model), all variables should be stationary time series. In the next section stationarity will be formally tested with the Augmented Dickey-Fuller test. To cope with the problem of heteroscedasticity that follows from a possibly varying variance, robust standard errors will be used.

Figure 5: Returns from the S&P 500 for the full sample period.

The daily returns for the cryptocurrency market are calculated based on a self-constructed index. This weighted cryptocurrency index consists of the six largest cryptocurrencies by market capitalization on 01.05.2018 that already existed on 01.05.2016. An index is chosen instead of the returns of a single cryptocurrency to prevent returns from being affected too much by news about that single cryptocurrency. The weighting is based on market capitalizations at time t when calculating the return for time t. Daily prices

same way as the S&P 500 returns. The six cryptocurrencies included in the index are Bitcoin (BTC), Ethereum (ETH), Ripple (XRP), Litecoin (LTC), Stellar (XLM) and Dash (DASH). The current market capitalizations of these six cryptocurrencies range from USD 4 billion (DASH) to USD 150 billion (BTC).
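A capitalization-weighted index return of the kind described above could be computed along the following lines. This is a sketch under assumptions: each coin's return follows Equation (2), the weight of a coin on day t is its share of the total market capitalization on day t, and the function names and toy numbers are invented for illustration. The exact construction used for the index in this thesis may differ in detail.

```python
def simple_returns(prices):
    """Daily simple returns as in Equation (2): (P_t - P_{t-1}) / P_{t-1}."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def weighted_index_returns(prices, caps):
    """Capitalization-weighted average of the individual coin returns.

    prices, caps: dicts mapping a coin to equally long lists of daily closing
    prices and market capitalizations. The return for day t is weighted with
    the capitalizations of day t (assumption based on the text).
    """
    coins = list(prices)
    returns = {coin: simple_returns(prices[coin]) for coin in coins}
    n_returns = len(returns[coins[0]])
    index = []
    for t in range(n_returns):
        total_cap = sum(caps[coin][t + 1] for coin in coins)
        index.append(
            sum(caps[coin][t + 1] / total_cap * returns[coin][t] for coin in coins)
        )
    return index


# Toy example with two coins and two days (numbers are made up).
toy_prices = {"BTC": [100.0, 110.0], "ETH": [10.0, 9.0]}
toy_caps = {"BTC": [280.0, 300.0], "ETH": [110.0, 100.0]}

index_returns = weighted_index_returns(toy_prices, toy_caps)
print(index_returns)
```

In the toy example the larger coin gains 10% and the smaller one loses 10%, so the index return lies between the two and closer to that of the larger coin.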

Figure 6 shows that, just like the returns of the S&P 500, the returns of the cryptocurrency index seem to be stationary in the mean and non-stationary in variance. Another thing that stands out is the big difference in volatility between the S&P 500 returns and the cryptocurrency returns. Whereas almost all daily S&P 500 returns are smaller than 2% in absolute value, with the most extreme day showing a move of about 4%, daily returns of the cryptocurrency index are generally within 10%, with the most extreme days showing moves of about 20%.

In the next section a formal test for stationarity in the mean will be performed. To cope with the problem of heteroscedasticity that follows from possible varying variance, robust standard errors will be used.

Figure 6: Returns from the cryptocurrency index for the full sample period.

Other controls, next to the dummies regarding seasonality, are lagged returns, the Aruoba-Diebold-Scotti Business Conditions Index (ADS) and a news-based measure of economic policy uncertainty (EPU) recently developed by Baker, Bloom, & Davis (2013).

The ADS Business Conditions Index is a daily measure of macroeconomic conditions from the Federal Reserve Bank of Philadelphia. The measure is constructed by Aruoba,

variables of different frequencies. The index includes manufacturing and trade sales, industrial production, quarterly GDP, weekly jobless claims, personal income less transfer payments, and monthly payroll employment (Aruoba, Diebold, & Scotti, 2009). The ADS index reflects changes caused by macroeconomic conditions.

To control for uncertainty related to economic policy, the news-based measure EPU will be used. This measure counts the number of U.S. newspaper articles that contain at least one term from each of the following three categories: i) “economy” or “economic”; ii) “uncertain” or “uncertainty”; and iii) “congress”, “deficit”, “Federal Reserve”, “legislation”, “regulation”, or “White House”. Baker, Bloom, & Davis (2013) show that this measure reflects perceived economic policy uncertainty.

These last two controls are included to control for news events. One might argue that search is endogenous to macroeconomic events. In fact, investor sentiment should be endogenous to macroeconomic events: sentiment is generated by something, and it is logical to expect that its drivers are things like unemployment, GDP, wealth changes and so on. News about macroeconomic events arrives daily, so by including daily returns, the economic policy uncertainty index and the business conditions index in the models, news events are controlled for. This makes the Anxiety index a variable describing sentiment caused by new events.

3.3 Model specification

In this research an Autoregressive Distributed Lag model (ADL-model) will be used. This is an econometric model for analysing time series that may include lags of the dependent variable as well as lags of one or more explanatory variables as regressors.

To be able to use an ADL-model the variables used must be stationary time series. Regressions with only non-stationary variables or with stationary and non-stationary variables can easily lead to spurious relationships. As shown in Figures 4, 5 and 6 in the previous section it appears that the returns of the S&P 500, the returns of the cryptocurrency index, and the Anxiety index are stationary time series. This is also logical as these variables are first differences. However, the variables ADS and EPU, which are used as controls, are level variables and their stationarity also needs to be checked for. Moreover, it

series. That is why these variables will be formally tested for stationarity with the Augmented Dickey-Fuller test (ADF-test). To be able to perform an ADF-test, the number of lags p first needs to be determined. If p is too small, the remaining serial correlation in the errors will bias the test. If p is too large, the power of the test will suffer. The usual method for determining the number of lags is Schwert’s criterion (1989). This criterion is defined as:

$$ \text{Max \# of lags} = S \left( \frac{n}{100} \right)^{1/4} \tag{3} $$

where S is the seasonality, which is equal to seven in this research since mainly intraweek seasonality appears, and n is the length of the time series, i.e. the number of observations. For this research this gives twelve lags to use in the ADF-test.
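Equation (3) can be checked with a short calculation. The sketch below assumes that n is the 730 daily observations of the full sample and that the outcome is rounded to the nearest integer; the text only states the result (twelve), so both choices are assumptions, and the function name `max_lags` is hypothetical.

```python
def max_lags(n_obs, seasonality=7):
    """Maximum number of lags according to Equation (3): S * (n / 100) ** (1/4)."""
    return seasonality * (n_obs / 100) ** 0.25


raw_value = max_lags(730)   # assumption: n = 730 daily observations
lags = round(raw_value)     # assumption: rounded to the nearest integer

print(f"S * (n/100)^(1/4) = {raw_value:.2f}, number of lags used: {lags}")
```

With these inputs the formula gives a value between eleven and twelve, which is consistent with the twelve lags reported in the text.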

The results of the ADF-test for stationarity are presented in Table B.1 in the Appendix. The results show that the null hypothesis of a unit root is rejected at the 1% significance level for all variables, so all variables can be treated as stationary. This means that these variables can be used in an ADL-model.

The last step before running the regressions of the ADL-model is lag selection. As an autoregressive process for a time series includes lags in the model, it is necessary to determine the optimal number of lags to use. There are several model selection criteria that help with lag selection. In this research the lag selection is based on four different criteria: the final prediction error (FPE), Akaike's information criterion (AIC), Schwarz's Bayesian information criterion (SBIC), and the Hannan and Quinn information criterion (HQIC). Results of these selection criteria are presented in Tables C.1 and C.2 in the Appendix. The number of lags used is the one selected by most criteria. The results show that the number of lags is two for ReturnSP, and zero for ReturnCrypto.

This leads to the following equations of ADL-models:

$$ ReturnSP_t = \alpha_0 + \alpha_1 Anxiety_{t-k} + \alpha_2 ReturnSP_{t-k} + \alpha_3 EPU_t + \alpha_4 ADS_t + \sum_m \gamma_m SeasonDummies_t^m + \mu_{i,t+k} \tag{4} $$

where in Equation (4) the dependent variable $ReturnSP_t$ is the return on the S&P 500 on day t. The main independent variable is $Anxiety_{t-k}$ (up to eight lags), where it is important to note that the Anxiety index measures the daily change in search for the aggregate of terms included. Control variables include lagged S&P 500 returns ($ReturnSP_{t-k}$), changes in economic policy uncertainty ($EPU_t$), and changes in the Aruoba-Diebold-Scotti business conditions index ($ADS_t$). Equation (5) is similar to Equation (4) except that the dependent variable is now the return on the cryptocurrency index on day t ($ReturnCrypto_t$).

3.4 Hypotheses

Based on the literature, it is expected that the constructed Anxiety index is able to directly capture investor sentiment and that returns will show a reversal. That is, higher anxiety correlates with lower returns today, but higher returns should follow in the days after. This follows from behavioural finance theory about noise traders. Noise traders push prices below fundamental values in times of anxiety; rational investors subsequently arbitrage these opportunities away, and returns will thus be positive in the days following negative sentiment.

Next to that, in line with theory, it is expected that investor sentiment will play a bigger role in the cryptocurrency market compared to the stock market as companies that are young, small, highly volatile, unprofitable, non-dividend paying, or with extreme growth potential have stocks that are most sensitive to investor sentiment. These characteristics describe the current status of the cryptocurrency market very well.

This leads to the hypotheses formulated in the following way:

1. A search-based measure of investor sentiment is able to predict returns in financial markets.

1.2. Returns will show a return reversal. That is, when anxiety is high, prices are temporarily low and consequently increase until the initial decrease is undone.

2. As the market for cryptocurrency theoretically is more prone to investor sentiment it is expected that anxiety plays a bigger role in that market than in the traditional stock market.

To answer hypothesis 1 the significance of $\alpha_1$ and $\beta_1$ in both regressions (both run for the periods t to t+4) will be checked. For hypothesis 1.2 the signs of $\alpha_1$ and $\beta_1$ for the different periods matter, where it is expected that the sign of $\alpha_1$ and $\beta_1$ is negative in period t and positive in the t+x periods following until the effect is non-significant.

For hypothesis 2 it will be tested if there is a significant difference between the size of the absolute values of $\alpha_1$ and $\beta_1$ for every period, where it is expected that $\beta_1$ from Equation (5) will be significantly larger than $\alpha_1$ from Equation (4), meaning that investor sentiment has bigger predictive power in the cryptocurrency market than in the traditional stock market.

4 Results

In this section the results of the analysis will be presented. First, some regressions are run to observe the contemporaneous correlation between anxiety and returns. Next, regressions on future returns are added to analyse the predictive ability of the Anxiety index. Finally, a cross-market analysis is conducted between the traditional stock market and the cryptocurrency market.

4.1 Sentiment-return relationship

First, some contemporaneous regressions are run to examine if a search-based measure of investor sentiment can explain the sentiment-return relationship. Table 1 presents the results of these regressions for the traditional stock market. The dependent variable in each regression is the contemporaneous return of the S&P 500 index. The main independent variable is the Anxiety index. From regression (1) to (5), extra control variables are added. Note that two lags of the dependent variable are included among the control variables, as that was the optimal number of lags determined by the lag selection criteria. The results show that the constructed Anxiety index does not significantly correlate with contemporaneous returns, as all its coefficients are insignificant. The results also make clear that including seasonality dummies turns the sign of the insignificant coefficient of the Anxiety index from positive to negative. The rest of the added controls hardly change the coefficient for Anxiety. The only coefficients that are significant are those of the two-periods

contemporaneous S&P 500 returns is not being observed. That is, there is no significant relationship between people searching for negative economic terms today and stock market returns today.

Table 1: Contemporaneous relationship between the Anxiety index and S&P 500 returns

                      (1)          (2)          (3)          (4)          (5)
                      Ret(t)       Ret(t)       Ret(t)       Ret(t)       Ret(t)
Anxiety               0.00076      -0.00154     -0.00166     -0.00159     -0.00172
                      (0.00137)    (0.00383)    (0.00375)    (0.00382)    (0.00377)
EPU                   -            -            -            0.00000      0.00001
                                                             (0.00001)    (0.00001)
ADS                   -            -            -            0.00012      0.00005
                                                             (0.00117)    (0.00118)
Ret(t-1)              -            -            -0.08024     -            -0.08079
                                                (0.08058)                 (0.08113)
Ret(t-2)              -            -            -0.15644*    -            -0.15911**
                                                (0.07999)                 (0.07965)
Constant              0.00034      0.00242      0.00242      0.00200      0.00177
                      (0.00030)    (0.00223)    (0.00223)    (0.00293)    (0.00293)
Seasonality dummies   no           yes          yes          yes          yes
Observations          504          504          502          504          502
R²                    0.0005       0.0155       0.0440       0.0161       0.0452

Standard errors are robust and shown in parentheses. Significance levels are shown by *, **, and *** which respectively denote significance at the 10%, 5%, and 1% levels. The sample period is from 2 May 2016 until 1 May 2018.

Table 2 presents the results of the contemporaneous regressions of anxiety on cryptocurrency returns. The dependent variable in each regression is the contemporaneous return of the cryptocurrency index. The main independent variable is the Anxiety index. For every regression (1) to (3) extra control variables are added. The results show that daily change in the constructed Anxiety index does significantly correlate with contemporaneous cryptocurrency returns as the coefficients are significant when control variables are added. The sign of the significant coefficient is positive, indicating that when anxiety increases

today, cryptocurrency returns increase today. Again, besides the seasonality dummies, the rest of the added controls do not change the coefficient for Anxiety.

To conclude, the significant correlation between the Anxiety index and contemporaneous cryptocurrency returns indicates that the search-based measure captures investor sentiment in the market contemporaneously. However, the sign is contrary to the expectation that returns would be low when anxiety is high. The coefficient indicates that on days on which the Anxiety index is up by 1%, cryptocurrency returns are up by 3%. One reason for this result might be that cryptocurrencies are seen as an alternative investment compared to traditional stocks: when anxiety about the economy is high, confidence in traditional stocks decreases and investors move to alternative investments.

Table 2: Contemporaneous relationship between the Anxiety index and cryptocurrency returns

                      (1)          (2)          (3)
                      Ret(t)       Ret(t)       Ret(t)
Anxiety               0.00426      0.03190**    0.03180**
                      (0.00590)    (0.01599)    (0.01586)
EPU                   -            -            0.00003
                                                (0.00003)
ADS                   -            -            0.01086*
                                                (0.00056)
Constant              0.00560      -0.00496     -0.00559
                      (0.00159)    (0.00874)    (0.01046)
Seasonality dummies   no           yes          yes
Observations          730          730          730
R²                    0.0007       0.0349       0.0387

Standard errors are robust and shown in parentheses. Significance levels are shown by *, **, and *** which respectively denote significance at the 10%, 5%, and 1% levels. The sample period is from 2 May 2016 until 1 May 2018.

Next to the contemporaneous relationship between negative sentiment and returns, it is also interesting to observe whether effects persist over time or maybe even reverse. More precisely, to check the return reversal hypothesis, the effect of anxiety on future returns needs to be observed. Table 3 presents the results of the relationship between daily S&P 500 returns and the constructed Anxiety index. The dependent variables are contemporaneous returns (column (1)), future returns in the next four days (columns (2), (3), (5) and (6)), and cumulative future returns over the first two days (column (4)). Regression (4) should give more insight into the cumulative effects of return reversals. The main independent variable is the Anxiety index; control variables are the EPU measure, the ADS Business Conditions index, lagged returns, and dummies to cope with seasonality. As described for Table 1, the coefficient of the contemporaneous relationship is negative but insignificant. The results for the following days do not show convincing evidence either. The coefficients for anxiety in regressions (2) and (3) are significant at the 10% level. This indicates that anxiety today is correlated with low returns one day later and high returns two days later. The signs of the coefficients of anxiety in the first three regressions point to a return reversal, although, due to the insignificance of the coefficient in the first regression, no hard conclusion can be drawn from this. At longer horizons the coefficients become even less significant.

Table 3: Anxiety and S&P 500 index returns

                      (1)          (2)          (3)          (4)          (5)          (6)
                      Ret(t)       Ret(t+1)     Ret(t+2)     Ret[t+1,t+2] Ret(t+3)     Ret(t+4)
Anxiety               -0.00172     -0.00528*    0.00490*     -0.00042     0.00343      0.00253
                      (0.00377)    (0.0032)     (0.00295)    (0.00389)    (0.00365)    (0.00307)
EPU                   0.00001      0.00002**    0.00002**    0.00003***   0.00001      0.00001
                      (0.00001)    (0.00001)    (0.00001)    (0.00001)    (0.00001)    (0.00001)
ADS                   0.00005      -0.00010     -0.00014     -0.00024     -0.00003     -0.00016
                      (0.00118)    (0.00114)    (0.00115)    (0.00156)    (0.00114)    (0.00115)
Ret(t-1)              -0.08079     -0.14946*    0.02671      -0.12251     -0.04180     -0.01742
                      (0.08113)    (0.07784)    (0.08374)    (0.12357)    (0.06587)    (0.05109)
Ret(t-2)              -0.15911**   0.00898      -0.04473     -0.03712     -0.03116     0.05924
                      (0.07965)    (0.07307)    (0.06771)    (0.10621)    (0.04950)    (0.05535)
Constant              0.00177      0.00126      -0.00428*    -0.00305     -0.00261     -0.00260
                      (0.00293)    (0.00228)    (0.00231)    (0.00316)    (0.00286)    (0.00268)
Seasonality dummies   yes          yes          yes          yes          yes          yes
Observations          502          501          500          500          499          498
R²                    0.0452       0.0572       0.0420       0.0715       0.0259       0.0249

Standard errors are robust and shown in parentheses. Significance levels are shown by *, **, and *** which respectively denote significance at the 10%, 5%, and 1% levels. The sample period is from 2 May 2016 until 1 May 2018.

Table 4 presents the results of the relationship between daily cryptocurrency returns and daily anxiety. The dependent variables are contemporaneous returns (column (1)), future returns in the next four days (columns (2), (3), (5) and (6)), and cumulative future returns over the first two days (column (4)). The main independent variable is the Anxiety index; control variables are the EPU measure, the ADS Business Conditions index, lagged returns, and dummies to cope with seasonality. As described for Table 2, the coefficient of the contemporaneous relationship is significant and positive. Nevertheless, the results for the following days show no convincing evidence for a return reversal. The signs of the coefficients of anxiety in the first three regressions point to a return reversal, although, due to the insignificance of the anxiety coefficient in regressions (2) and (3), no conclusion can be drawn from this. At longer horizons the coefficients become even less significant.

Table 4: Anxiety and cryptocurrency index returns

                      (1)          (2)          (3)          (4)          (5)          (6)
                      Ret(t)       Ret(t+1)     Ret(t+2)     Ret[t+1,t+2] Ret(t+3)     Ret(t+4)
Anxiety               0.03180**    -0.00817     -0.01437     -0.01816     0.00514      -0.00924
                      (0.01586)    (0.01642)    (0.01542)    (0.02325)    (0.01133)    (0.01080)
EPU                   0.00003      0.00003      0.00003      0.00002      -0.00001     0.00002
                      (0.00003)    (0.00003)    (0.00003)    (0.00005)    (0.00003)    (0.00003)
ADS                   0.01086*     0.01082      0.01071      0.02062***   0.00672      0.00744
                      (0.00056)    (0.00566)    (0.00568)    (0.00796)    (0.00557)    (0.00552)
Constant              -0.00559     -0.00051     -0.00169     -0.00929     -0.00805     -0.00923
                      (0.01046)    (0.01374)    (0.01124)    (0.01320)    (0.00871)    (0.00936)
Seasonality dummies   yes          yes          yes          yes          yes          yes
Observations          730          729          728          728          727          726
R²                    0.0387       0.0335       0.0071       0.0705       0.0044       0.0478

Standard errors are robust and shown in parentheses. Significance levels are shown by *, **, and *** which respectively denote significance at the 10%, 5%, and 1% levels. The sample period is from 2 May 2016 until 1 May 2018.

To conclude, the constructed Anxiety index does not correlate significantly with contemporaneous stock returns. Analysing future returns shows that the Anxiety index correlates with stock returns one and two periods ahead, but only at the 10% significance level. A return reversal after two days is observed, though due to the insignificance of the contemporaneous coefficient no hard conclusions can be drawn from this. By contrast, the Anxiety index does correlate significantly with contemporaneous returns on the cryptocurrency index. However, the coefficients in the following periods, which show a return reversal, are insignificant. Overall, it can be concluded that the Anxiety index has no real predictive power for returns in either market.

4.2 Cross-market analysis

To test the hypothesis that the cryptocurrency market is more prone to investor sentiment than the traditional stock market, the anxiety coefficients are compared across markets for all periods. The previous tables show that for the contemporaneous period and the two subsequent periods, one of the anxiety coefficients is significant while the other is not. However, this does not imply that the difference between these coefficients is itself significant. Stata offers cross-equation coefficient testing under the name ‘suest’, or ‘seemingly unrelated estimation’. The difference between the coefficients is tested using a Chi² test whose null hypothesis states that the absolute values of the coefficients are equal. Absolute values are used because a significant difference that merely results from opposite signs of the coefficients is not of interest; for this cross-market analysis only the size of the effect matters.
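The logic of such a test can be sketched as a Wald statistic for the difference in absolute values. This is an illustration under stated assumptions, not a reproduction of the Stata procedure: the function name and the input numbers are hypothetical, the variance of the difference is obtained with the delta method, and the cross-equation covariance `cov12` has to be supplied by the user (zero means the two estimates are treated as independent).

```python
import math


def wald_abs_difference(b1, se1, b2, se2, cov12=0.0):
    """Wald statistic for H0: |b1| = |b2|.

    Delta method: g = |b1| - |b2| has gradient (sign(b1), -sign(b2)), so
    Var(g) = se1^2 + se2^2 - 2 * sign(b1) * sign(b2) * cov12.
    Note: |b| is not differentiable at b = 0, so the approximation is
    unreliable when a coefficient is (close to) zero.
    """
    g = abs(b1) - abs(b2)
    s1 = math.copysign(1.0, b1)
    s2 = math.copysign(1.0, b2)
    var_g = se1 ** 2 + se2 ** 2 - 2.0 * s1 * s2 * cov12
    if var_g <= 0:
        raise ValueError("variance of the difference must be positive")
    return g * g / var_g


# Hypothetical inputs, not taken from the thesis.
stat = wald_abs_difference(b1=-0.002, se1=0.003, b2=0.030, se2=0.016)
print(stat)
```

Under the null hypothesis the statistic is compared against a chi-squared distribution with one degree of freedom. With zero covariance the result does not depend on the signs of the coefficients, which is the point of testing absolute values.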

Results are presented in Table 5. The columns represent the different periods for which the regressions are run. The first two rows show the coefficients of anxiety in both regressions. Rows three and four show the Chi² value of the difference test and the corresponding p-value, respectively.

The results show that only in the contemporaneous regression are the anxiety coefficients across markets significantly different, and only when testing at the 10% level. The null hypothesis, which states that there is no difference, is rejected because the p-value of 0.0545 is lower than 0.1. It is, however, more common to test at the 5% level, at which none of the differences is significant. This suggests that anxiety correlates slightly more strongly with contemporaneous cryptocurrency returns than with contemporaneous S&P 500 returns. This difference disappears entirely, however, when future returns are analysed with current anxiety.

Table 5: Test for difference in absolute values of the cross-market coefficient for anxiety

| | t | t+1 | t+2 | (t+1 + t+2) | t+3 | t+4 |
|---|---|---|---|---|---|---|
| Anxiety coefficient in S&P 500 regression | -0.00172 | -0.00528* | 0.00490* | -0.00042 | 0.00343 | 0 |
| Anxiety coefficient in cryptocurrency regression | 0.03180** | -0.00817 | -0.01437 | -0.01816 | 0.00514 | -0.00924 |
| Chi² | 3.70000 | 0.00000 | 0.29000 | 0.60000 | 0.00000 | 0.45000 |
| P-value | 0.05450 | 0.99830 | 0.58900 | 0.43900 | 0.97900 | 0.50000 |
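As a consistency check on Table 5, the reported p-values can be recomputed from the reported test statistics, assuming the test has one degree of freedom (a single restriction). The sketch uses only the standard library; the function name is hypothetical. The columns t+1 and t+3 are left out because their statistics are reported as 0.00, which is too coarse to recover the p-value.

```python
import math


def chi2_pvalue_1df(stat):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    if stat < 0:
        raise ValueError("test statistic must be non-negative")
    # For 1 df: P(X > stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


# (reported statistic, reported p-value) per column of Table 5.
reported = {
    "t": (3.70, 0.0545),
    "t+2": (0.29, 0.589),
    "(t+1 + t+2)": (0.60, 0.439),
    "t+4": (0.45, 0.500),
}

# Absolute gap between the recomputed and the reported p-value.
gaps = {
    column: abs(chi2_pvalue_1df(stat) - p_reported)
    for column, (stat, p_reported) in reported.items()
}

for column, (stat, _) in reported.items():
    p_value = chi2_pvalue_1df(stat)
    print(column, round(p_value, 4), "reject at 10%:", p_value < 0.10,
          "reject at 5%:", p_value < 0.05)
```

Small gaps are to be expected because the statistics are reported to two decimals only.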

These results do not deliver compelling evidence for the hypothesis that investor sentiment plays a bigger role in the market for cryptocurrency than in the traditional stock market.

4.3 Limitations

One might argue that searching for a certain term need not reflect positive or negative sentiment towards economic conditions, but may simply reflect interest in information about that term. This might explain why this research finds no convincing evidence for the sentiment-return relationship. However, other studies find high correlations between search data and confidence indices. For example, Da et al. (2014) find an 85.8% correlation between searches for the word ‘recession’ and the Consumer Confidence Index from the University of Michigan.

Another possible limitation is the use of all ‘negative’ and ‘economic’ terms from the General Inquirer, which is based on the Harvard IV-4 Dictionary: some of these terms may be categorised as ‘negative’ yet correlate positively with the market, and the other way around. This dictionary is often used for textual sentiment analysis in financial research, but other research has shown that a word such as ‘gold’ is indeed categorised as


To be able to observe a return reversal, one could obtain a stronger model by selecting terms based on their historical relation with the market. For the question of whether this search-based measure is feasible for analysing the sentiment-return relationship, however, such overfitting is not desirable.

Another concern might be that search is endogenous to macroeconomic events. In fact, investor sentiment should be endogenous to macroeconomic events: sentiment is generated by something, and it is logical to expect these drivers to be factors such as unemployment, GDP, wealth changes and so on. News about macroeconomic events arrives daily, so by including daily returns, the economic policy uncertainty index and the business conditions index in the models, news events are controlled for. This makes the Anxiety index a variable describing sentiment caused by new events.

5 Discussion and conclusion

This thesis examined the feasibility of using a search-based measure to analyse the sentiment-return relationship. Moreover, it checked whether a return reversal, as predicted by behavioural finance, is observed. In addition, a cross-market analysis was conducted to examine which markets are more prone to investor sentiment. To this end, an investor sentiment index was constructed by aggregating the search behaviour of households. This Anxiety index is based on economic search terms that reveal negative sentiment towards economic conditions. The correlations of the Anxiety index with both traditional stock market returns and cryptocurrency returns are examined, for contemporaneous as well as future returns.

The results show that the expected correlation between the Anxiety index and contemporaneous S&P 500 returns is not observed. That is, there is no significant relationship between people searching for negative economic terms today and stock market returns today. The analysis of future returns shows that the Anxiety index correlates weakly significantly with stock returns both one and two periods ahead. A return reversal is observed, though because the contemporaneous coefficient is insignificant no firm conclusions can be drawn from this.

In contrast, there is a significant correlation between the Anxiety index and contemporaneous cryptocurrency returns but no significant correlation for future
