
UNIVERSITY OF AMSTERDAM

MASTER’S THESIS

The Predictive Power of Social Sentiment Over Cryptocurrencies’ Price Fluctuations. Causal Effects and Forecasts

Vlad-Cristian Marin 10418164

Faculty Economics and Business

Study program Business Administration

Track Digital Business

Version Final

Date of Submission 22-06-2018

Supervisor Gijs Overgoor


Abstract

This study investigates the presence of dynamic causal effects of five top cryptocurrencies’ social sentiment on their prices, using hourly data over a period of 121 days. Granger causality testing is used first to validate the initial assumptions about these relationships. Then, a forecasting algorithm called Prophet is used to confirm the findings. Previous work in the field of cryptocurrencies, as well as in similar fields such as the stock market, suggested that social sentiment can be used to predict future returns. The empirical findings show that for four of the five studied cryptocurrencies there are not only strong correlations between a currency’s social sentiment and its price, but also relationships between their values over time. Bitcoin, EOS, Ethereum and Ripple have all been found to have their current prices affected by past values of the market’s sentiment. The results support the theory that cryptocurrencies are highly influenced by social signals.

Keywords: cryptocurrency, sentiment analysis, text mining, classification, causal effects,


Statement of originality

This document is written by student Vlad-Cristian Marin who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Table of Contents

Abstract
Statement of originality
1. Introduction
2. Literature review
2.1 Blockchain & Cryptocurrencies
2.2 Cryptocurrency - What do we know?
2.3 Online chatter
2.4 Predictive power of social sentiment
2.5 Related work
3. Data collection & classification
3.1 Data Sampling
3.2 Data Analysis - Natural Language Processing
3.2.1 Named Entity Recognition
3.2.2 Sentiment Analysis
4. Methodology
4.1 Causal effects
4.1.1 Non-stationarity testing (unit root testing)
4.1.2 Multicollinearity
4.1.3 Granger causality testing
4.2 Forecasting
4.2.1 The Prophet algorithm
4.2.2 The science behind
4.2.3 Methodological considerations
5. Results
6. Discussion
6.1 Limitations and further research
7. Conclusion
7.1 The predictive power of social sentiment over cryptocurrencies’ price fluctuation
7.2 Contributions
Bibliography


“Tremendous foreboding such as we always feel when there comes an enormous, an unheard-of event whose consequences are imponderable and incalculable”

❖ Heinrich Heine

1. Introduction

The world has seen dramatic changes in terms of technology in the past few decades. Technology has enabled humanity to discover and accumulate knowledge at an exponential rate. The emergence of blockchain and the subsequent development of cryptocurrencies - the first observable instances of the Blockchain technology “in the wild” (Laskowski & Kim, 2016) - have opened up great new opportunities. Blockchain and cryptocurrencies are breaking into the mainstream a little more every day, with rapidly growing numbers of users and applications. Cryptocurrencies and blockchain have already started moving downwards from the top of the Gartner hype cycle of emerging technologies (the so-called “peak of inflated expectations”) towards the lower level, the so-called “trough of disillusionment”, at a much faster speed than other technologies. The trough of disillusionment is the stage a technology goes through before mainstream adoption occurs. This is perhaps the make-or-break point in any technology’s life cycle, as only some of them survive (Gartner, 2018).

In terms of cryptocurrencies, we have witnessed a growth from $18 bln. to over $700 bln. in 2017 alone and a total number of coins of over 1,500 (Coinmarketcap, 2018). However, there seems to be no established set of rules governing the development and the routines of cryptocurrency markets, leaving room for uncertainty, distrust and chaos. These cryptocurrencies are notorious for being extremely volatile (Colianni, Rosales & Signorotti, 2015; Polasik, Piotrowska, Wisniewski, Kotkowski & Lightfoot, 2015; Letra, 2016; Katsiampa, 2017) and, unlike other prevalent monetary systems, they have become an opportunity for speculation by participants in the ecosystem (Kim, Kim, Kim, Im, Kim, Kang & Kim, 2016). Consider, for example, the end of 2017 (22 December) and twice at the beginning of 2018 (18 January and 27 January), when the market saw spectacular fluctuations, with prices of cryptocurrencies suddenly plummeting after unexpected periods of rapid yet unsustainable growth (Coinmarketcap, 2018). Scholars have been interested in finding out what drives this turbulence on cryptocurrency markets. Different stances have been taken in trying to explain the phenomena surrounding cryptocurrencies. Rohr & Wright (2017) recognize in their study that the volatility of cryptocurrencies is caused in part by the lack of regulatory oversight characteristic of this market and the unusual behavior of enthusiastic individuals who invest in crypto-related companies without informed opinions. Other studies, such as Poyser (2017), identify macroeconomic and financial factors as predictors of market movements. His Bitcoin-centered study shows that the aspects of the market that contribute to the price fluctuations of the popular cryptocurrency can be categorized into internal (supply and demand) and external (attractiveness and adoption; macro-financial). An alternative view is offered by Kristoufek (2015), whose study proposes that despite its reputation as a purely speculative asset, Bitcoin is also driven by more traditional factors. He found evidence that Bitcoin’s dynamics can sometimes be in line with standard economic theory and be influenced by money supply, trade level or price level. On the other hand, more research has been concerned with exploring certain dimensions of social sentiment and their effects on cryptocurrency prices (Kim et al., 2016; Laskowski & Kim, 2016; Colianni et al., 2015).

The cryptocurrency sphere is complex far beyond the common lexicon, and consensus on this topic has yet to be reached. Kim et al. (2016), as well as Laskowski & Kim (2016), highlight the lack of consistent evidence for drivers of cryptocurrencies’ price fluctuations, as well as the lack of an interdisciplinary effort to uncover the nature of these fluctuations. In countering this problem, both papers propose cryptocurrency prediction models based on various dimensions of social sentiment, while ultimately recognizing their importance and role as drivers of market movements. The gap in the extant literature is further highlighted by Colianni et al. (2015) and Garcia & Schweitzer (2015), who believe that the very large amounts of available digital trace data about cryptocurrencies have the potential to be turned into actionable insights. The volume of discussion around cryptocurrencies represents an invaluable source of clues about socio-economic factors in the crypto environment. Given the decentralized nature of cryptocurrencies (explained further), as well as the involvement of their respective communities in the decision-making processes, user behavior is an important shaper of the dynamics of their economy.

Serving as grounds for research in the area of cryptocurrency trading and the correlation between price fluctuations and social data streams, findings by scholars in Marketing are more relevant than ever. Advancements in research on topics such as text mining, chatter and user-generated content (UGC), and previously confirmed and established connections between them and stock market performance (Tirunillai & Tellis, 2012), also represent a good starting point for exploring the correlation between cryptocurrencies’ market fluctuations and social dimensions. Online activity has steadily been increasing in recent years, as the Internet has not only become a means of connecting with each other, but also a crucial source of information. Internet use translates to numerous clues about one’s behavior, preferences, interests, concerns and intentions (Ranco, Aleksovski, Caldarelli, Grcar & Mozetic, 2015). There is a significant knowledge gap between the collection, analysis and application of insights derived from social signals in cryptocurrency trading scenarios.

As it has already been proven that sentiment can be a proxy for market mood in various other industries, this paper is concerned with investigating the presence of a dynamic causal relationship between one dimension of social sentiment and the price variations of cryptocurrencies, and with its predictive power. This paper proposes the study of the relationship between social sentiment gathered from online communities and traditional news outlets, and the chaotic movements of cryptocurrencies in terms of price. On top of being methodologically rigorous with the time series analysis, this thesis also elaborates on the data collection procedure, sentiment analysis and forecasting, proposing a distinct approach to the topic.

While an answer to what drives cryptocurrencies’ value has yet to be given, we can only hope to advance our understanding of the mechanics of this environment by uncovering some of the influential drivers of their fluctuations.


2. Literature review

Before diving into the specifics of the proposed model and its potential applications, let us first define some of the concepts and analyze the relevant theory surrounding the current research.

2.1 Blockchain & Cryptocurrencies

To begin with, Blockchain is a “technology for establishing a shared, immutable version of the truth between a network of participants that do not trust one another and therefore has the potential to disrupt [..] industries that rely on third parties to establish trust” (Laskowski & Kim, 2016). In other words, the Blockchain derives its value from the fact that it provides a mathematically verifiable means of exchange between different participants in a network (Laskowski & Kim, 2016) on a distributed ledger where every participant has a copy of the (universal) truth. The same authors point out and praise Blockchain’s applications and prospects for the future, recognizing this technology’s potential and opportunity to fundamentally change whole industries (e.g. Fintech). Conley (2017) recognizes the increasing interest in the blockchain technology since its inception, with security, anonymity, data integrity and the lack of central authorities’ involvement as the main drivers of its popularity. Although initially created with Bitcoin in mind, the blockchain technology has applications beyond the financial world (Swan, 2015). Consequently, besides their store-of-value function, cryptocurrencies have applications ranging from medical to energy, from security to finance, from social to political and humanitarian. Understanding their behavior is key to developing them further.

Built upon the same disruptive technology - the blockchain - cryptocurrencies are “distributed (peer-to-peer) systems of token exchange between users underpinned and mathematically verifiable by virtue of the same cryptographic principles that underlie encryption on the internet” (Laskowski & Kim, 2016). Colianni et al. (2015) define a cryptocurrency as “an alternative medium of exchange consisting of numerous decentralized crypto coin types”. They also underline the fact that it is the cryptographic foundation of each token where its essence lies and that the decentralized nature of the exchange network allows for secure peer-to-peer transactions.

Cryptocurrencies have taken the world by storm and have caused paradigm shifts on numerous levels - regulation, business models, capital markets, etc. Rohr & Wright (2017) compare the disruptive power of cryptocurrencies to that of the Internet in relation to the media and music industries. While the Internet has dramatically reduced the cost and time of sending digital files from one place to another, blockchains (and cryptocurrencies) have the potential to dramatically reduce the cost and time of exchanging value in a trusted and pseudonymous fashion, in turn affecting securities laws and capital markets, for example.

Bitcoin, the most popular cryptocurrency, is regarded as a digital commodity of great interest, with beliefs that its worth will someday be comparable with that of regular money (Colianni et al., 2015). The growth in popularity of this alternative to current monetary systems began with the birth of Bitcoin in 2008. Ever since, cryptocurrencies have emerged in great numbers (e.g. Litecoin, Ripple, Ethereum, Dogecoin) and have become increasingly adopted in various industries (i.e. being often used for online transactions), according to Kim et al. (2016). Bitcoin has been built on the promise of decentralization, giving control and ownership of the network to all its users (Nakamoto, 2008). The vision of its creator comprised a world free of government control, middlemen and unnecessary transaction fees. In spite of the major criticism that has surrounded the cryptocurrency since its break into the mainstream, its value peaked at an enormous $20,089 per BTC on 16 December 2017 (Coinmarketcap, 2018).


Ether is the second most popular cryptocurrency, according to Coinmarketcap, and one of Bitcoin’s main competitors. Ethereum - its place of birth - is a decentralized platform that allows contractual agreement-type applications (smart contracts) to be run (Ethereum, 2014). Numerous other cryptocurrencies (compliant with the ERC20 standard) and decentralized applications are built on top of the Ethereum platform.

Litecoin, a cryptocurrency very similar to Bitcoin, was launched in 2011 on the same protocol and is considered by many its main rival. Litecoin was founded as the “cryptocurrency for payments”, as it promises instant transactions, close-to-zero transaction fees and faster transaction confirmation times than its competitors (Litecoin, 2011). Ripple is a blockchain network developed by the banking industry to facilitate instant international payments. Its payment system and currency (XRP) were launched in 2012 (Ripple, 2012). Ripple was in the spotlight towards the end of 2017, when its price reached an unexpected high, backed by announcements of mass adoption within the global banking ecosystem, followed by a steep plunge immediately after (Coinmarketcap, 2018).

EOS is a blockchain network developed with the ambition of building a decentralized operating system that can support industrial-scale decentralized applications (Grigg, 2017). The team behind the project promises complete removal of transaction fees and a throughput of millions of transactions per second, which makes it one of the most promising projects among thousands of others. The company as well as its fan base has seen immense growth since its launch in 2017 (Coinmarketcap, 2018).

2.2 Cryptocurrency - What do we know?


also referred to as volume. We were able to observe similar patterns in their development throughout time. For example, Bitcoin had not seen any significant changes between 2008 and 2013, when it lacked popularity and penetration into the mainstream. From the beginning of 2014, concurrently with the time it started garnering popularity, Bitcoin saw both an enormous rise and numerous fluctuations in price and volume (Kim et al., 2016). For example, in their 2016 article, Mai, Bai, Shan, Wang & Chiang found that Bitcoin is 41 times more volatile than the USD-EUR exchange rate, rising thousands of dollars from one day to the next, only to plummet the day after. Similarly, according to Kim et al. (2016), Litecoin and Ripple have also seen unusually irregular fluctuations since the end of 2013, which in turn allowed speculation by some participants in the network, while possibly disallowing others to take part.

While these patterns might be considered ordinary in any new technology’s life cycle, the sudden explosion of involvement in the cryptocurrency sphere, and consequently its chaotic development over time, are anything but normal. With a jump from less than $100 million worth of cryptocurrencies sold in Initial Coin Offerings - a new method of raising early capital - in 2016 to over $3.2 billion sold by October 2017 (Rohr & Wright, 2017), it is no wonder this phenomenon has captured so much interest and, unsurprisingly, a lot of criticism.

Cryptocurrencies have not only drawn the attention of enthusiasts, technologists or venture capitalists, but also of financial regulatory bodies across the globe, and have constantly raised controversies. According to Rohr & Wright (2017), in July 2017 the US Securities and Exchange Commission (SEC) began investigating the nature of blockchain-based cryptocurrency sales and imposed a strict set of rules for US citizens’ participation in so-called token sales. Similarly, countries such as China, South Korea, Japan, Singapore, Hong Kong or Canada have questioned the fit between their regulatory frameworks and cryptocurrency trading. The financial regulatory bodies’ involvement in the cryptocurrency sphere is merely an effort to slow down the frenzy once associated with trending Silicon Valley startups, before a universal, established set of rules can be put together. Previously identified as part of the problem, regulatory oversight, or the lack thereof, could explain the lengths to which cryptocurrencies have gone. However, while enforcing certain rules might curb the severity of price fluctuations in the long run, one should remember that the underlying principle of the blockchain is that of a decentralized network, outside of any governmental control.

These transformative technologies benefit from large groups of both supporters and critics. Understandably, critics cast blame upon cryptocurrency enthusiasts for a presumably soon-to-burst asset bubble, as a result of their irrational investments of large amounts of money into projects with no history of producing revenue. At the other end of the spectrum, however, supporters strongly believe in a future where cryptocurrencies will enable greater technological development and will open up opportunities for everyone to participate in a multi-trillion dollar global financial market. Just as Rohr & Wright (2017) outline, cryptocurrencies raise hopes of being key to “truly global capital markets”, thanks to their decentralized, geographically agnostic and universally accessible nature. In spite of the hype surrounding cryptocurrencies, it is difficult to assess which factors contribute to their growth and influence their value.

With such behaviors and heated debates, cryptocurrencies topped the list of the most trending topics in recent years and made the headlines of numerous economic and financial publications. Even though these digital assets’ future has been studied and predicted on many different occasions, the search for meaningful predictors of price fluctuations continues today. And as these technologies mature, it has become a top priority to understand them and the forces that affect them.


2.3 Online Chatter

Garcia & Schweitzer (2015) explain how, in today’s world, society leaves digital traces of its behavior at unforeseen and unprecedented scales. Modern society has grown highly dependent on the Internet, where its people engage and interact. People’s behavior on the Internet is a reliable predictor of their preferences, interests and concerns (Ranco et al., 2015). However, we have yet to learn how to make the best use of these traces’ potential for better understanding the world around us. Central to their study is the concept of sentiment (and particularly sentiment analysis), and they show that social media sentiment has the potential to be turned into profits.

Web 2.0 has enabled individuals to become creators of content and to engage with each other in discussions on social media. Virtual communities have allowed us not only to exchange opinions, but also to network and share content (Asur & Huberman, 2010). Users and use of social media have grown enormously, and digital technology has become an integral part of our lives. Social media can be a representative form of collective wisdom and, as Asur & Huberman (2010) show, it can prove to have significant predictive power for real-world situations. In fact, the study finds that online chatter as a predictor can outperform artificial markets when making quantitative predictions.

Along with the expansion of social media as a platform for communication, one particular instrument has become available on the world wide web - user-generated content (UGC). UGC is what online information seekers produce when they decide to become more engaged on online platforms (communities, blogs, social media), to communicate, share their experiences and leave traces of their activity (Tirunillai & Tellis, 2012). Instant and low-cost availability, great reach and ease of use make user-generated content a valuable resource and a better alternative to word of mouth. UGC can take various shapes and levels of creativity and spans from experiential stories, product reviews and informal discussions to stories about products and services and multimedia content (Berthon, Pitt, Plangger & Shapiro, 2012). Social media empowers individuals to discuss and engage with preferred topics, brands, personalities and, of course, cryptocurrencies. The abundance of valuable clues in the UGC on social media justifies the consideration of this digital technology as the main source of data for this study.

With a growing social media base, traditional news sources have only benefitted. The wide span of social media has allowed traditional news sources to spread further, faster and more cheaply. However, they have long been ignored as a raw source of sentiment data. In their study about social media text mining for stock market prediction, Sun, Lachanski & Fabozzi (2016) include a great deal of text data from traditional news outlets, such as CNNMoney, Reuters, Mail, Yahoo! Finance and many others. Other research (Garcia, Tessone, Mavrodiev & Perony, 2014) suggests that Bitcoin adoption and its value are clearly linked to both social chatter and news reports.

There is very limited previous academic work that uses text analysis of both social media and traditional news as a source from which to extract sentiment dimensions about a certain topic. However, the available evidence suggests that combining the two streams of data in an analysis increases the power of the prediction model. This could prove especially useful within the cryptocurrency sphere.

2.4 Predictive power of social sentiment

Opinions on social media are anything but scarce, regardless of the topic of interest, a fact that has attracted many researchers and practitioners into trying to extract meaning out of them. Attempts to prove relationships between social sentiment and product sales, stock markets, movie sales or book sales have uncovered evidence of positive relationships. Kalampokis, Tambouris & Tarabanis (2013) do a great job of classifying the ways in which social media can be used to make vital predictions. Their pivotal research provides support for uses of social media beyond traditional cases, such as election outcome predictions or disease outbreaks.

Moe & Schweidel (2011) discuss the dynamics behind writing product reviews and the influence others’ opinions have on the decisions of whether to contribute and what to contribute to product reviews. Their study comprises a fundamentally different approach to finding clues about one’s behavior when confronted with others’ past opinions. Their study also outlines the effects of the presence of disruptive factors such as “activist” groups that collectively post negative opinions. Their research is yet another proof of what a powerful predictor social sentiment can be. Other research is concerned with revealing a causal relationship between earned media, namely traditional internet sources (expert reviews) and UGC (online customer reviews), and future online book sales (Bao & Chang, 2014). Movies are no exception to the applications of social media sentiment, as Asur & Huberman (2010) prove in their study that there is a link between the attention a movie receives on social media and its future success. Moreover, revenues can be accurately predicted by making use of insights derived from social media sentiment data.

The importance of social media signals is also outlined in marketing research related to text mining and user-generated content. In their paper, Tirunillai & Tellis (2014) recognize the fact that, while managers and researchers have long been obtaining indications of perceived product quality from customers through surveys or interviews, technology has advanced towards extracting such data from user-generated content (UGC). This now represents a rich source of data from which to extract the dimensions of quality. Quality of a product or service is an important determinant of consumer satisfaction and brand performance. By extrapolation, we could assume that market price fluctuations for cryptocurrencies can be determined in a similar fashion. Numerous studies have shown that UGC is influential in determining demand, sales, or financial performance. In the same context of text-mining research, a new characteristic of digital content is introduced - the sentiment. The sentiment or valence is nothing but the “expression of positive versus negative performance on a dimension or attribute” (Tirunillai & Tellis, 2014). In a 2012 paper, the same authors bridge the concept of user-generated content with the (traditional) stock market performance of firms, proving that the volume of chatter has a significant positive lead effect on returns (specifically suggesting that UGC predicts returns and trading volume). The bridge their research builds creates in turn favorable conditions and raises great interest in exploring the same phenomenon in the cryptocurrency world.

At the forefront of text mining, using a legacy method called the Naive Bayes classification algorithm, Antweiler & Frank (2004) found that Internet stock message boards can be useful in predicting the publication of articles in a reputed financial outlet, the Wall Street Journal. This may not be the first instance of using text classification and sentiment analysis to make sense of clues from online chatter, but their study is a pioneering piece at the forefront of stock market prediction. Many other studies have been concerned with finding significant relationships and drivers of stock market fluctuations.

People’s traces on social media have extensive use in monitoring and predicting companies’ financial outcomes and, implicitly, stock markets (Liu, Wu, Li & Li, 2015). Gathering information has never been easier, thanks in large part to social media. Microblogging services (such as Twitter and Reddit) represent a commonly used source of data for determining a proxy of the market mood. Numerous studies have tried to propose perfected methodologies for mining opinions and emotions.

At the other extreme, Checkley, Anon Higon & Alles (2016), upon a thorough review of different views on theories of market prediction, discover that sentiment metrics only minimally Granger-cause stock market indicators such as volatility, returns and trading volume. Moreover, they find “modest and selective” evidence of better forecast errors as a consequence of making use of sentiment metrics. As can be seen, findings differ and encourage better techniques. Sun et al. (2016) confirm in their study that performing statistical analysis of natural language data from various sources, with the aim of extracting meaning useful for predictions, is a top interest for today’s researchers. Exciting new research constantly adds to the complexity of what we know and understand about the world around us.

2.5 Related work

There are multiple reasons to believe that a likely predictor of cryptocurrencies’ market movements is social sentiment. Evidence from numerous studies suggests it is safe to assume that changes in cryptocurrencies’ value are influenced by the mood of the community as expressed online. In their papers, Garcia & Schweitzer (2015), Laskowski & Kim (2016), Kim et al. (2016) and Colianni et al. (2015) all demonstrate through their own proposed models that, to a certain extent and through different dimensions, social signals have the potential to predict financial returns in the cryptocurrency ecosystem.

Colianni et al. (2015), Garcia & Schweitzer (2015) and Laskowski & Kim (2016) all propose interesting methodologies for analyzing social media data streams, through machine learning and algorithmic trading techniques (borrowed from securities and security-like financial instruments). Their research serves as the foundation of this study.

Colianni et al. (2015), for instance, propose a framework through which Twitter data relating to cryptocurrencies can be developed into a cryptocurrency trading strategy and used as an aid in making investment and trading decisions (specifically for Bitcoin). Their paper has a strong focus on the technical aspects of text classification and sign change prediction accuracy, as well as machine learning algorithms. Their proposed method (which uses a logistic regression) yields, in the optimal setup, an accuracy of 76.23% on the hour-to-hour sign change prediction for Bitcoin.

Kim et al. (2016) propose a model through which user comments in certain online cryptocurrency communities explain the fluctuations in price and volume of the coins. They show that positive comments in particular significantly affected price fluctuations for Bitcoin, while negative comments significantly affected the price fluctuations of the other cryptocurrencies. Moreover, the authors argue that the model can be scaled up and that more currencies can be looked at, with different parameters. By using a classifying algorithm called VADER, Kim et al. (2016) manage to create a low-cost and highly accurate prediction algorithm for three key cryptocurrencies - Bitcoin, Ethereum and Ripple. Unlike other studies, their model takes into account differences between real currencies and cryptocurrencies.

Findings may vary greatly from one study to another, as different techniques are used for both the text mining and the statistical analysis. Kaminski (2014) found that Twitter sentiment over a given period of time has a moderate effect on Bitcoin’s closing price and volume. Additionally, Granger causality testing revealed that Twitter signals are not statistically significant as a predictor, but are rather an emotional reflection of Bitcoin’s price fluctuations.


Garcia et al. (2014), in their study about digital traces of bubbles, claim that analyzing digital footprints such as social media usage and news reports can help us measure socio-economic aspects of a cryptocurrency’s economy. Their research points towards both social chatter and news reporting as influencers of the adoption and pricing of Bitcoin. Their choice of methods validates choosing both social media and traditional news sources for this thesis.

Sovbetov (2018) investigates the presence of dynamic causal relationships for five top cryptocurrencies over the course of nine years. Interestingly, he uses an ARDL model to examine five cryptocurrencies of similar behavior and characteristics to those considered in this study. Bitcoin, Ethereum, Dash, Litecoin and Monero have all proven to be affected both short- and long-term by cryptocurrency market factors such as beta, trading volume and volatility. Moreover, Sovbetov (2018) confirmed what Poyser (2017) demonstrated in his study, namely that cryptocurrencies’ attractiveness is a determinant of price in the long run. Even though his study is not related to sentiment, his efforts in uncovering drivers of cryptocurrency price fluctuations and his methodological choice represent valuable contributions to research in the field.

The common ground of these topics - cryptocurrencies, text mining and prediction models - lacks a catalyst on a theoretical level. There are few relevant and up-to-date clues about the impact of social sentiment on the price and volume fluctuations of alternative digital currencies (cryptocurrencies other than Bitcoin), and even fewer that look at both social media and traditional news sources. This study proposes a multidisciplinary methodology, with concepts borrowed from Natural Language Processing, Marketing, Machine Learning and Finance, for investigating dynamic causal relationships between the collective mood in the online space and cryptocurrencies’ price changes.


3. Data collection and classification

This paper aims to test whether polarity, the social sentiment metric indicating whether sentiment is positive or negative, has any causal effect on cryptocurrencies’ price fluctuations. It has previously been shown that social sentiment data may have predictive power and could be used to derive invaluable insights. Because manual data collection would be inefficient, several automated techniques were used. This chapter covers in detail the collection, processing and nature of the data.

In this light, a sample of relevant data was collected from various online sources and prepared for statistical analysis, using a combination of data crawling, text mining and sentiment analysis techniques. Quantitative research was deemed the most appropriate research method, as it allows patterns of relationships to be demonstrated empirically and a model usable in the real world to be generated.

Data from online communities and relevant sources was gathered and analyzed over a period of 121 days for the five top cryptocurrencies by market capitalization according to Coinmarketcap (2018). The sampling period began on 15 January 2018 and ended on 15 May 2018. As opposed to other research, relatively granular data (equally spaced, 1-hour measurements) was gathered. It must be noted that this period was characterized by weaker fluctuations than usual, as cryptocurrencies were slowly recovering from the latest market crash at the end of January 2018. Consequently, many participants had exited, suspended or reduced their activity in the cryptocurrency ecosystem, a fact that may have reduced the amount of data that would otherwise have been collected.

The five individual data sets amounted to 14,435 observations (with a total of 73,870 unique news items collected, as some of the items retrieved contain references to more than one cryptocurrency), containing the date and time, price and average polarity of each corresponding coin. The cryptocurrencies of choice, as well as their tickers and the number of items retrieved, are shown in Table 1. Descriptive statistics about price and sentiment can be found in Appendix B.

Table 1. Number of items retrieved by cryptocurrency

Cryptocurrency       Ticker   Items retrieved
Bitcoin              BTC      48,278
Ether (Ethereum)     ETH      13,500
Litecoin             LTC      8,067
Ripple               XRP      6,407
EOS                  EOS      5,561

3.1 Data Sampling

Before sentiment could be extracted, data from different sources had to be collected, uniformized and taken through a number of processing steps. Two main streams of data can be distinguished.

Through an on-demand pull model, raw text is scraped every 15 minutes from non-social media sources (herewith defined as “classic”), such as Youtube, Initial Coin Offering (ICO) tracking websites and traditional, established online publications in the cryptocurrency sphere. A function of the pull model asks all the sources for the latest updates, checking for differences between the latest and previous sets. If any are found, a metadata extraction process begins and the newest items are inserted into the database.

This marks the second stage in the news item lifecycle, namely storing (1.Scrape -> 2. Store -> 3. Process). Classic news sources consist of 20 news outlets that were manually picked and that are constantly updated. The full list can be found in Appendix C. Naturally, we can expect some of these sources to contain more bias (be more subjective) or be a product of paid media.


The criteria for choosing the news sources were prevalence, quality (originality, length, style of writing) and RSS feed options - the possibility to read RSS feeds containing XML files.
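For illustration, a minimal sketch of such a pull cycle is shown below. The feed URLs, the 15-minute loop and the in-memory de-duplication are assumptions made for this example; the actual scraper and the full outlet list (Appendix C) are not reproduced here.

```python
import time

import feedparser  # third-party RSS/Atom parser (pip install feedparser)

# Hypothetical subset of the ~20 "classic" outlets; the real list is in Appendix C.
FEED_URLS = [
    "https://example-crypto-news.com/rss",
    "https://another-outlet.example/feed.xml",
]

seen_links = set()  # simple in-memory stand-in for the database de-duplication step


def pull_once():
    """Ask every source for its latest items and keep only those not seen before."""
    new_items = []
    for url in FEED_URLS:
        feed = feedparser.parse(url)
        for entry in feed.entries:
            if entry.link not in seen_links:
                seen_links.add(entry.link)
                # In the thesis pipeline this is where metadata extraction and
                # storage (1. Scrape -> 2. Store -> 3. Process) would begin.
                new_items.append({"title": entry.title, "link": entry.link})
    return new_items


if __name__ == "__main__":
    while True:
        print(f"Retrieved {len(pull_once())} new items")
        time.sleep(15 * 60)  # repeat every 15 minutes, as in the on-demand pull model
```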

The second sampling technique is a real-time/stream model (live) that permanently collects data from Twitter, Reddit and Steem.it - a social platform for user-created content, similar to blogs. The main advantage of this type of data is its unaltered nature (coming directly from the source), while the main disadvantage is the sheer amount of junk present among the more reliable and relevant points. As opposed to classic news sources, where new items are checked intermittently, social media items are retrieved constantly through a “persistent connection” with the sources’ websockets. Each of the three sources is divided into platform-specific sub-sources that were chosen manually. For Steem.it, tags are used as a filter. For Reddit, a curated list of specialty subreddits represents the data stream. A differentiation has been made between specialized (general) and coin-specific (official) subreddits, for example r/Cryptocurrency, r/Bitcoin and r/Ethereum versus r/Ripple and r/EOS, respectively. Both r/Bitcoin and r/Ethereum started as dedicated subreddits, but have become more populated with general cryptocurrency discussions rather than coin-specific threads. In the case of Twitter, only the coins’ official handles are followed. At the same time as a news item is retrieved, the exact price of the coin(s) that the item concerns is recorded from Coinmarketcap. This comes as a consequence of the fact that, unlike stock markets, cryptocurrency markets are open 24 hours a day, 7 days a week, without a break. Information about newly launched coins, as well as about all coins’ official social media channels, is retrieved from Coinmarketcap and Cryptocompare (2018).

As an overarching rule, for social media items, only the original posts and their metadata were gathered and indexed, while comments, shares, retweets, upvotes or any other kind of engagement or redistribution were purposely left out. This comes as a consequence of the fact that, when originally posted, items i) do not have any comments or reactions and ii) are indexed as they show up, without being looked up again at any point in time.

3.2 Data Analysis - Natural Language Processing

3.2.1 Named Entity Recognition

Through a subprocess called Named Entity Recognition (NER), which is the first step of the information extraction process, the text is analyzed to first locate and classify named entities by a) coins and b) others - names, companies, geographical location, miscellaneous.

For coins in particular, what is relevant are their tickers - just like stocks’ tickers (e.g. BTC for Bitcoin or ETH for Ethereum) - and/or their names. The end product of this subprocess is a list of coin tickers. Using regular expressions, the sequence patterns of interest are described and a search for them is performed through two string-searching techniques, pattern matching and pattern recognition.
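A minimal sketch of the ticker-matching step is shown below; the ticker dictionary and the regular expression are simplified assumptions rather than the exact patterns used in this study.

```python
import re

# Simplified dictionary of the five coins studied; the production pipeline also matches full names.
TICKERS = {"BTC": "Bitcoin", "ETH": "Ethereum", "LTC": "Litecoin", "XRP": "Ripple", "EOS": "EOS"}

# Word boundaries ensure that e.g. "ETH" is matched while "ETHEREAL" is not.
TICKER_PATTERN = re.compile(r"\b(" + "|".join(TICKERS) + r")\b")


def extract_tickers(text: str) -> list:
    """Return the sorted list of coin tickers mentioned in a news item or post."""
    return sorted(set(TICKER_PATTERN.findall(text.upper())))


# A headline mentioning two coins yields ['BTC', 'ETH'].
print(extract_tickers("Bitcoin (BTC) climbs as ETH developers announce an upgrade"))
```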

The next step in further processing the data is sanitation and normalization, using the Natural Language Toolkit (NLTK), a Python-based suite of libraries and programs for symbolic and statistical English NLP. The first layer of sanitation is the removal of non-ASCII and non-UTF-8 characters. Next, all characters in the string are latinized and all dates and times are translated into the ISO 8601 standard.

For entities other than coins, text analysis is performed using Stanford NER, also known as CRFClassifier, a Java-based implementation of a Named Entity Recognizer. Using its extensive library, it was possible to extract geographic locations, persons’ names and public institutions’ (organizations’) names. Their importance must not be neglected, as these entities play a crucial role in giving sense to the retrieved information.


3.2.2 Sentiment Analysis

Sentiment analysis is facilitated by a Python library called TextBlob, which is based on NLTK. Before sentiment metrics can be computed, a series of steps needs to be followed as the final sanitation layer of the data. Firstly, using a word extraction algorithm, the text is tokenized. Tokenization means dividing the text into parts, called “tokens” (unique occurrences of a term that is not a stop word). Secondly, stop words are removed. Stop words can be defined as common words that do not add any value to the text analysis, such as “the”, “a”, “and”, “but”, “or”, etc. Next, through a process with its roots in computational linguistics called lemmatization, the lemma of a word is determined based on its intended meaning. This is done by analyzing the context in which the inflected form of the word is found, hence deriving what part of speech a word is. Furthermore, the complementary process called stemming is the action of reducing inflected words to their root - or stem. The stemming process employs a suffix-stripping algorithm called the Porter stemmer. The algorithm removes common morphological and inflectional endings (suffixes) from English words. At this point, terms in the text are normalized and ready to be analyzed.
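The sketch below illustrates these normalization steps (tokenization, stop-word removal, lemmatization, Porter stemming) with NLTK; it is a simplified approximation of the pipeline, not the exact code used in the thesis.

```python
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.tokenize import word_tokenize

# One-time download of the required NLTK resources.
for resource in ("punkt", "stopwords", "wordnet"):
    nltk.download(resource, quiet=True)

STOP_WORDS = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()
stemmer = PorterStemmer()


def normalize(text: str) -> list:
    """Tokenize a news item or post, drop stop words, then lemmatize and stem each token."""
    tokens = word_tokenize(text.lower())
    tokens = [t for t in tokens if t.isalpha() and t not in STOP_WORDS]
    tokens = [lemmatizer.lemmatize(t) for t in tokens]
    return [stemmer.stem(t) for t in tokens]


print(normalize("Bitcoin prices are rising and the miners are celebrating"))
# e.g. ['bitcoin', 'price', 'rise', 'miner', 'celebr']
```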

While online content and particularly social sentiment can be broken down into multiple other dimensions, the metric of interest for this study was restricted to polarity (negative/positive). Polarity refers to whether a word can be classified as positive or negative. Polarity spans from -1.0 (very negative) to +1.0 (very positive). The total (average) polarity of a particular data point (news item or social media post) is calculated as the mean of all words’ polarity. The extensive lexicon of TextBlob contains scores for each of the words in both dimensions of interest. Price refers to the (average) price a particular cryptocurrency trades for at a given point T in time. Upon performing the statistical analysis, polarity was normalized and translated onto a scale from 1 to 100, where 50 is neutral.
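A short sketch of the polarity computation and rescaling is given below. TextBlob’s polarity call is real; since the thesis does not spell out the exact rescaling formula, a simple linear mapping (with 0 mapped to the neutral midpoint of 50) is assumed here.

```python
from textblob import TextBlob  # pip install textblob


def polarity_score(text: str) -> float:
    """Average TextBlob polarity of an item, in [-1.0, +1.0]."""
    return TextBlob(text).sentiment.polarity


def rescale(polarity: float) -> float:
    """Assumed linear mapping of [-1.0, +1.0] onto a 0-100 scale where 50 is neutral."""
    return (polarity + 1.0) * 50.0


item = "Great news for Bitcoin adoption, but transaction fees remain terrible"
p = polarity_score(item)
print(round(p, 3), round(rescale(p), 1))
```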

The proposed sampling methodology is consistent and it causes no data sparsity, as all items are structured in the same way: title, description, thumbnail, metadata, time, etc. The structure of a news item can be seen in Appendix A.


4. Methodology

In creating a framework for both time series analysis and forecasting, several studies that share similarities with this thesis, or that provided solutions to issues of implementation complexity, were drawn upon. Models such as ARIMA (Autoregressive Integrated Moving Average), VAR (Vector Autoregression), ARCH (Autoregressive Conditional Heteroskedasticity), ARDL (Autoregressive Distributed Lag), GA (Generalized Additive) or GADL (Generalized Additive Distributed Lag) are commonly used techniques for either investigating the presence of dynamic causal relationships or for forecasting. Most of them share the same techniques for cleaning and preparing a time series (non-stationarity, decomposition, etc.), as well as some general assumptions. Components from the said studies have been borrowed in building this methodology for 1) investigating Granger causality and 2) testing the results in a forecasting scenario. Their widespread use in and out of the cryptocurrency literature acts as an extra validation mechanism for the choices made in this study.

Particularly interesting is Sovbetov’s (2018) paper about factors influencing some of the most common cryptocurrencies over a nine-year period, using an ARDL model. Tirunillai & Tellis’s 2012 study about the predictive power of user-generated content over stock market movements also uses the ARDL model and shows that there is strong causation from UGC. The ARDL is a popular choice for modelling relationships between economic variables in a single-equation time series setting. One of the main advantages of this model is that cointegration of nonstationary variables is equivalent to an error-correction process. The ARDL model is composed of an “autoregressive” component, as the dependent variable is regressed on its past values, and a “distributed lag” component, because on top of including the current values of the independent (explanatory) variables in the model, their lagged values are also studied. Generally, an autoregressive distributed lag model with p lags of the dependent variable $Y_t$ and q lags of the independent variable $X_t$ is called an ARDL(p,q) model.
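For reference, the general ARDL(p,q) specification described above can be written as follows (a standard textbook form; the notation is chosen here and not taken from the thesis):

$$Y_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i \, Y_{t-i} + \sum_{j=0}^{q} \beta_j \, X_{t-j} + u_t$$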

Finally, relevant for this study’s framework is Garcia & Schweitzer’s (2015) study about social signals in the context of algorithmic trading. Their study uses the VAR model (an extension of the ARDL model) and, subsequently, an impulse response analysis to reveal temporal patterns within the Bitcoin ecosystem, in terms of price returns, exchange volume signals and Twitter sentiment. The Vector Autoregression allows reverse causality among the dependent and independent variables using their own lagged values.

While fitting a time series model to yield the best estimation, it is important to consider the number of lags for each of the variables, for each of the cryptocurrencies considered independently. The optimal combination of order parameters should be found, with sufficient lags considered for all explanatory variables. However, a middle ground between including enough lags and not imposing too many restrictions should be found. Too few lags could mean missing out on information present in past values, while too many lags could increase the error in the regression model. The most common information criteria (Stock & Watson, 2012) are 1) the Bayes Information Criterion (BIC) and 2) the Akaike Information Criterion (AIC). The two criteria can be interpreted as an estimation of the amount of information lost if a particular model is chosen. The AIC is the more popular choice in business and economics time series analysis, despite the criterion usually estimating more lags than its counterpart, the BIC.
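In the lag-selection form given by Stock & Watson (2012), with $SSR(p)$ the sum of squared residuals of the model with $p$ lags and $T$ the number of observations, the two criteria read:

$$\mathrm{BIC}(p) = \ln\!\left(\frac{SSR(p)}{T}\right) + (p+1)\,\frac{\ln T}{T} \qquad \mathrm{AIC}(p) = \ln\!\left(\frac{SSR(p)}{T}\right) + (p+1)\,\frac{2}{T}$$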

In terms of forecasting algorithms, this study borrows from Taylor & Letham’s (2017) article and adapts their method to account for a) hourly data and b) additional regressors. The forecasting method they propose counters issues of implementation complexity and parameter adaptation, as well as of required statistical expertise.


4.1 Causal effects

Given that data over multiple periods of time is available for all the variables, a time series regression is the most appropriate method to estimate dynamic causal effects. However, estimating the coefficients goes beyond the scope of this study, and the investigation only entails the steps up to (and including) Granger causality testing.

4.1.1 Nonstationarity testing (unit root testing)

Stationarity is a very important characteristic of a variable in any reliable time series regression. In a stationary time series, the current distribution of the data equals the past distribution - the mean, variance and autocovariance are time invariant (Stock & Watson, 2012). This means that the time series’ statistical properties are constant over time. Most economic time series are far from stationary when looked at in their original, raw format. However, thanks to mathematical transformations such as differencing, a series can usually be stationarized by eliminating trends or cycles. Differencing assumes that the change from one period to another might have constant properties over time even when the original series does not. The difference is computed by subtracting the previous period’s value from the current period’s value. Following the right steps for transforming a non-stationary time series into a stationary one may also give clues as to what an appropriate model is (VAR, OLS, ARDL). For this study, the variables have been differenced once and log-transformed accordingly.

$$\Delta Y_t = Y_t - Y_{t-1}$$

In order to test for nonstationarity, there are both formal and informal methods used in practice (Stock & Watson, 2012). Informally, autocorrelation plots (of the autocorrelation function) are the most commonly used visual tools to detect whether a series is stationary, by displaying the correlations between a series and its lags. If these correlations persist across many lags, the series presents trends or seasonal components and is therefore non-stationary. Moreover, by computing the first autocorrelation (the correlation of a variable with its first lagged value) it can be determined whether non-stationarity occurs (Stock & Watson, 2012).

On the formal side, the Augmented Dickey-Fuller (ADF) statistic is the most widely used, and the first ever, test for stationarity. Under the null hypothesis, $Y_t$ has a stochastic trend and the series is therefore non-stationary. Under the alternative hypothesis, $Y_t$ is stationary (Stock & Watson, 2012). Through the ADF it can be tested whether lagged values of the dependent variable and a linear trend explain its change. The null hypothesis cannot be rejected, and the series is therefore non-stationary, if the lagged values of Y do not contribute significantly to its change and if a trend component is present (Stock & Watson, 2012).
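A minimal sketch of how the ADF test can be run on a log-differenced price series with statsmodels is shown below; the file name, column names and significance threshold are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

# Hypothetical hourly price series of one coin (in the thesis, prices come from Coinmarketcap).
prices = pd.read_csv("btc_hourly.csv", index_col="timestamp", parse_dates=True)["price"]

# Log transform and first difference, as described above.
log_diff = np.log(prices).diff().dropna()

# ADF test: the null hypothesis is that the series contains a unit root (is non-stationary).
adf_stat, p_value, used_lags, n_obs, critical_values, _ = adfuller(log_diff, autolag="AIC")
print(f"ADF statistic = {adf_stat:.3f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null hypothesis: the differenced series can be treated as stationary.")
```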

A stochastic trend is a trend that is random and that varies over time. The presence of a stochastic trend (also called a unit root) in a time series can lead to one of the following problems: 1) the estimated coefficients of either the dependent variables or of the included regressors, and their t-statistics, can have a nonstandard distribution, even in large samples; 2) the estimator of the coefficient on the first lag of the dependent variable can be biased towards 0 when its true value is 1; and 3) two independent series can misleadingly seem to be related if both contain stochastic trends, which is also known as spurious regression (Stock & Watson, 2012).

According to Stock & Watson (2012), time series can be broken down (when possible) into up to four elements - trend, seasonality, cycle and residuals. Separating these components from one another is called decomposition. Decomposition can be a very useful tool for inferring crucial attributes of any time series and a first step in time series analysis. A trend is a “persistent long-term movement of a variable over time” (Stock & Watson, 2012), or the general pattern of the series. Seasonality refers to fluctuations in the data set in relation to specific points in time (e.g. summer, Christmas, a quarter, etc.). Cycles refer to patterns that are not seasonal. Lastly, the remainder of the series that cannot be attributed to any of the three components is called the residual or error (Stock & Watson, 2012). The series can be broken down either as an additive series ($Y_t = S_t + T_t + E_t$) or as a multiplicative series ($Y_t = S_t \cdot T_t \cdot E_t$).
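The decomposition itself can be obtained with statsmodels as in the sketch below; the additive model and the 24-hour period are assumptions that fit hourly data, not parameters reported in this study.

```python
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Hypothetical hourly price series with a DatetimeIndex.
prices = pd.read_csv("btc_hourly.csv", index_col="timestamp", parse_dates=True)["price"]

# Additive decomposition Y_t = S_t + T_t + E_t; period=24 assumes a daily seasonal cycle.
result = seasonal_decompose(prices, model="additive", period=24)

trend, seasonal, residual = result.trend, result.seasonal, result.resid
print(residual.dropna().head())
```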

4.1.2 Multicollinearity

Perfect multicollinearity is the condition in which a regressor (independent variable) is a perfect linear combination of one or more of the other regressors, which occurs when the correlation between them is 1 (Stock & Watson, 2012). Independent variables should not be collinear, meaning highly or excessively correlated, as this could create issues in the estimation. This should be avoided as much as possible.

With a correlation approaching 1, the tolerance approaches 0. When there is a high correlation, but below 1, two variables suffer from imperfect multicollinearity. Imperfect collinearity does not block the estimation of the regression, but it does however impact the precision of estimating one or more regression coefficients (Stock & Watson, 2012).

Multicollinearity can be looked at as a reduction in the effective sample size, which would implicitly reduce the estimation power of the model. On the other hand, when modelling a dependent variable, it actually makes sense to expect a close relationship with each of the regressors (cointegration). Given that the model only includes one regressor, multicollinearity does not pose a threat.


4.1.3 Granger causality testing

As seen in Stock & Watson (2012), the Granger causality statistic is often used in time series analysis to determine whether a time series is useful in predicting another.

$$\Delta Y_t = \sum_{i=1}^{n} \alpha_i \,\Delta Y_{t-i} + \sum_{j=1}^{n} \beta_j \,\Delta X_{t-j} + u_{1t}$$

$$\Delta X_t = \sum_{i=1}^{n} \gamma_i \,\Delta X_{t-i} + \sum_{j=1}^{n} \delta_j \,\Delta Y_{t-j} + u_{2t}$$

Under the null hypotheses (H01: $\Delta X$ does not Granger-cause $\Delta Y$; H02: $\Delta Y$ does not Granger-cause $\Delta X$), the coefficients on all lags of the respective variable are zero, which means that the variable has no predictive content. Rejection or non-rejection of the null hypothesis is based on the F-statistic. Granger causality (also referred to as “Granger predictability” or “predictive causality”) is rather different from the traditional meaning of “causality”.

$$\Delta \mathrm{PRICE}_t = \sum_{i=1}^{n} \alpha_i \,\Delta \mathrm{PRICE}_{t-i} + \sum_{j=1}^{n} \beta_j \,\Delta \mathrm{POLARITY}_{t-j} + u_{1t}$$

$$\Delta \mathrm{POLARITY}_t = \sum_{i=1}^{n} \gamma_i \,\Delta \mathrm{POLARITY}_{t-i} + \sum_{j=1}^{n} \delta_j \,\Delta \mathrm{PRICE}_{t-j} + u_{2t}$$

This method requires thorough preparation and careful transformation of the data set, in order to preserve the time series’ characteristics. Granger causality is very sensitive to time frames and requires perfectly overlapping datasets. Mishandling or breaks in the set can cause dramatic alterations of the results. Sparsity of data makes it hard to determine causal effects with accuracy. Given that there is no authoritative reference imposing 0.05 as the significance level, a threshold of 0.1 has been established for this study. Certain results proved only marginally insignificant at the 95% confidence level, which is why they were accepted as evidence and later confirmed in the forecasting simulation. In this study, variables are also tested for reverse causality or bidirectional causality.
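A sketch of the bidirectional test with statsmodels is shown below; the data frame, column names and maximum lag are assumptions, and statsmodels is named here as one possible implementation rather than the exact tool used in the thesis.

```python
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

# Hypothetical prepared data set: perfectly overlapping, differenced hourly series for one coin.
df = pd.read_csv("btc_prepared.csv", parse_dates=["timestamp"])

MAXLAG = 24  # assumed maximum lag (one day of hourly observations)

# Does polarity Granger-cause price? statsmodels tests whether the second column
# helps predict the first column, for every lag up to MAXLAG.
price_results = grangercausalitytests(df[["price_diff", "polarity_diff"]], maxlag=MAXLAG, verbose=False)

# Reverse direction: does price Granger-cause polarity?
polarity_results = grangercausalitytests(df[["polarity_diff", "price_diff"]], maxlag=MAXLAG, verbose=False)

# F-test p-values per lag for the first direction, compared with the 0.1 threshold used above.
for lag, (tests, _) in price_results.items():
    p = tests["ssr_ftest"][1]
    print(f"lag {lag:2d}: p = {p:.4f}" + ("  (significant at 0.1)" if p < 0.1 else ""))
```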

4.2 Forecasting

High-caliber forecasting is troublesome for both machines and researchers. Many methods either fail to meet the needs of the researcher or business manager, or are likely to produce erroneous estimates. Experience has shown that a high level of expertise is required in order to adjust the parameters for optimal behavior of a forecast. Moreover, commonly used forecasting methods such as ARIMA not only require an in-depth understanding of econometric principles, but are also very susceptible to parametrization errors. They are generally inflexible and do not allow for heuristics or useful assumptions (Taylor & Letham, 2017).

4.2.1 The Prophet algorithm

Prophet is a highly successful forecasting algorithm open-sourced by the core data science team at Facebook, based on a time series model, which was created specifically to predict relevant growth metrics at Facebook. This method has brought tangible results in the business world and, as a consequence, its use has been increasing steadily since its launch. At Facebook, it was originally used in the context of the Events functionality and has since become the standard in forecasting (Taylor & Letham, 2017). Its success is based on its robustness and simplicity of configuration, even by non-expert users or individuals with limited knowledge about time series models and econometrics. Fully automated and highly customizable, the algorithm is very good at handling outliers, missing data and shifts in trends, and can be used intuitively.

4.2.2 The science behind

As Taylor & Letham (2017) underline, forecasting with other (automated) methods such as ARIMA is usually prone to large trend errors and overfitting. At its core, Prophet uses a decomposable time series model with three main components: trend, seasonality and holidays, combined in the following (additive) equation:

y(t) = g(t) + s(t) + h(t) + \epsilon_t ,

where g(t) is the trend function (non-periodic changes), s(t) represents periodic changes (seasonality), h(t) captures the effects of potential holidays or special events, and \epsilon_t is the error term representing idiosyncratic changes not captured by the model; \epsilon_t is assumed to be normally distributed (Taylor & Letham, 2017).

The setup of Prophet resembles that of a generalized additive model (GAM). Unlike other time series models, Prophet applies a curve-fitting technique to the dataset. While this might imply foregoing the ability to make certain inferences, the advantages of the method outweigh the disadvantages. In fact, it solves numerous issues inherent to time series analysis, on both theoretical and practical levels. On top of accommodating seasonality with multiple periods and being fast to fit, Prophet is also easy to interpret (Taylor & Letham, 2017). It uses a framework called "Analyst-in-the-Loop Modeling" that assumes analysts possess ample knowledge in their areas of expertise but little statistical knowledge. On the one hand the algorithm limits the need for the latter, and on the other hand it allows users to customize the model to their needs. The method is ultimately easy to adapt to allow for multiple regressors (Taylor & Letham, 2017).

The metric of interest is the Root Mean Square Forecast Error (RMSFE). The RMSFE is a measure of the uncertainty of a forecast, reflecting both the estimation of the regression coefficients and the future values of the error term (Stock & Watson, 2012). This measure is widely used across a wide range of domains to quantify the size of forecast errors. The error of the models is computed as the square root of the average of the squared differences between the actual and predicted values of the series, price and sentiment.

RMSFE = \sqrt{E\left[\left(Y_{T+1} - \hat{Y}_{T+1 \mid T}\right)^2\right]}
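As an illustration, once the cross-validated predictions are available, the RMSFE can be computed with a few lines of R; the vector names actual and predicted are hypothetical and stand for the observed test-set values and the corresponding forecasts over the same horizon.

    # Root Mean Square Forecast Error over the test horizon.
    rmsfe <- function(actual, predicted) {
      sqrt(mean((actual - predicted)^2, na.rm = TRUE))
    }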

4.2.3 Methodological considerations

Following the methodology proposed in Taylor & Letham (2017) and tweaking it to allow for an additional regressor, as well as for hourly data (as it initially did not allow for these), predictions have been made on different cryptocurrencies with and without an additional regressor. The steps followed in the Granger causality investigation are very relevant for the forecast as well. The results of the causality testing serve as foundation for the forecasts made using Prophet.

To begin with, normalization of the data is a crucial transformation, essential for obtaining the best results. Even though the data had been normalized at the beginning of the analysis, it is worth reiterating the importance of this step. The nature of the data used in this study calls for bringing the variables to a common scale. To put this in perspective, the price of Bitcoin ranged from $6,105.6 to $14,226, while Bitcoin sentiment polarity ranged from 2.02 to 83.41. At such different scales, the weight of one variable would render the other insignificant and would compromise the results of the forecast.
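As a sketch of such a rescaling (the exact transformation used in this study is part of the pre-processing described earlier), a simple min-max normalization in R could look as follows; btc and its columns are hypothetical names.

    # Min-max scaling to [0, 1], so that price (thousands of dollars) and
    # sentiment polarity (roughly 0-100) carry comparable weight.
    normalize <- function(x) {
      (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
    }
    btc$price_scaled    <- normalize(btc$price)
    btc$polarity_scaled <- normalize(btc$polarity)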


The training set consisted of 75% of the available data, namely an average of 2,150 observations (hours), equal to roughly 90 days. This cut-off point at three quarters is commonly used and, in this case, it allows the algorithm to train on a sufficiently large data set while leaving enough room for cross-validation against the actual values from the test set. Data sparsity is often a threat to the reliability of forecasts. To counter this, the data had initially been inspected manually for missing values and subsequently cleaned, both during the initial analysis and prior to the forecast. Having perfectly overlapping time periods in a series is a very important consideration, as time series are very reactive to even the slightest errors in a data set.
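A chronological 75/25 split of this kind can be obtained as in the sketch below; btc is again a hypothetical data frame ordered by timestamp.

    # 75% of the hourly observations (about 2,150 hours, roughly 90 days) are
    # used for training; the remaining ~720 hours form the test set.
    cutoff <- floor(0.75 * nrow(btc))
    train  <- btc[seq_len(cutoff), ]
    test   <- btc[(cutoff + 1):nrow(btc), ]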

Prophet has only recently become capable of handling sub-daily data, in which case daily seasonality is fit automatically. In this model, 720 horizons (denoted H) are specified, i.e. the number of units of time (hours) the forecast should cover (the test set). Prophet's prediction is a blend of statistical and judgmental forecasting that allows the user to quickly improve the model over a few iterations. The method uses simulated historical forecasts (SHFs) and produces K forecasts at different cutoff points, rather than the single forecast per period of comparable algorithms (Taylor & Letham, 2017). This is advantageous because it allows the algorithm to be evaluated along different paths (the number of which is set by the user), so that the best-fitting model, as well as how fast it should learn, can be chosen. In terms of forecasting period, Prophet is very flexible and can be tuned to produce forecasts over long horizons.
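In Prophet, this cross-validation over simulated historical forecasts can be invoked directly, as in the sketch below; m is assumed to be a model already fitted on the training data (a fuller fitting sketch follows at the end of this subsection), and the period and initial window sizes are illustrative choices only.

    library(prophet)

    # Simulated historical forecasts with a 720-hour horizon (the test set);
    # the period and initial values are illustrative assumptions.
    df_cv <- cross_validation(m, horizon = 720, units = 'hours',
                              period = 180, initial = 1440)
    head(df_cv)  # columns: ds, yhat, yhat_lower, yhat_upper, y, cutoff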

Once the data frames (the structure of the data) have been defined, two different scenarios are forecasted: one where the price of a cryptocurrency is forecasted on its own past (lagged) values, and another where an additional regressor (sentiment polarity) is added to the model. One of Prophet's requirements is that both past and future values of the extra regressor be known in advance, either as a series with known future values or as a series that has been forecasted separately in a different context. Essentially, a prediction for sentiment polarity is also made, to later act as input into the forecast of the multiple regression model (as another time series). The underlying assumption is that the series depends on the explanatory variable as either an additive or a multiplicative factor. Details and comments on the implementation of the Prophet algorithm can be found in Appendix F.
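As a rough sketch of the second, regressor-augmented scenario (the full implementation is documented in Appendix F), the model could be fitted in R along the following lines; the object and column names (train, test, polarity_forecast, price_scaled, polarity_scaled) are hypothetical, and the polarity values beyond the cutoff are taken from the separate Prophet forecast of the sentiment series described above.

    library(prophet)

    # Training frame in Prophet's expected format: ds (timestamp), y (target),
    # plus the extra regressor column.
    train_df <- data.frame(ds       = train$timestamp,
                           y        = train$price_scaled,
                           polarity = train$polarity_scaled)

    m <- prophet(daily.seasonality = TRUE)  # the price-only scenario skips the next line
    m <- add_regressor(m, 'polarity')
    m <- fit.prophet(m, train_df)

    # Future frame covering the history plus the 720-hour test horizon; polarity
    # values past the cutoff come from the separate forecast of the sentiment series.
    future <- data.frame(ds       = c(train$timestamp, test$timestamp),
                         polarity = c(train$polarity_scaled, polarity_forecast$yhat))
    forecast <- predict(m, future)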


5. Results

This chapter presents the results of both the causality testing and the forecasting scenarios. Bitcoin, Ethereum, Ripple, EOS and Litecoin, five top cryptocurrencies by market capitalization, have been tested for dynamic causal effects between their sentiment polarity and their price. Evidence of these relationships has been found for the first four of them, at various lags. The results have been tested and validated in all prediction scenarios by applying the Prophet algorithm proposed in Taylor & Letham (2017). The prediction error has decreased substantially in all cases as a consequence of introducing the additive regressor polarity. While there is no immediately obvious reason why evidence could not be found for the fifth cryptocurrency, the findings are generally consistent with related research. An attempt to explain the results and their variation across currencies is made in Chapter 6.

The procedure starts with confirming that all series are stationary, by visually inspecting their plots and by ensuring that the Augmented Dickey-Fuller statistic is significant at the 1% level. All series have been differenced once and log-transformed. The results of the ADF test on the differenced data can be found in Appendix D.
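For reference, a stationarity check of this kind can be run in R with the adf.test function from the tseries package; btc$price is a hypothetical column holding the hourly closing prices, and the sketch applies a log transform followed by first differencing.

    library(tseries)

    # Log-transform and first-difference the hourly price series, then test for
    # a unit root; the same transformation is applied to the polarity series.
    d_price <- diff(log(btc$price))
    adf.test(d_price)  # H0: the series contains a unit root (non-stationarity)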

As described in the Methodology section, Granger causality is a helpful test for investigating whether the lagged values of an independent variable have any predictive content. The method indicates that one signal may help predict another; it does not establish causality with certainty, but rather correlations across time. Tables 2 to 6 give an overview of the Granger causality tests between sentiment polarity and price, with significance assessed against the 10% threshold. The tables present plausible findings in relation to the expectations. All cryptocurrencies have also been tested for reverse and/or bidirectional causality. For Bitcoin and EOS, evidence of bidirectional causality has been found, suggesting that price might also act as a predictor of sentiment and not only the other way around.

Table 7 presents the Root Mean Square Forecast Errors (RMSFE) of the eight scenarios modeled by Prophet, obtained by cross-validating the predictions against the test set. The four cryptocurrencies for which causality from sentiment has been found were each forecasted with and without the additional regressor. In all four cases, adding the regressor (sentiment polarity), whose values past the cutoff point were also predicted by the same algorithm, reduces the error considerably, validating the findings of the causality testing. The following paragraphs give an individual overview of the results for each cryptocurrency.

Table 2. Granger Causality Bitcoin

H0                   Lags   F-score   p-value
Polarity ⇻ Price      2     3.5925    0.01765 *
                      3     2.5635    0.05308 .
                      4     3.3737    0.009203 **
                      5     2.8437    0.01444 *
                      6     3.0033    0.006294 **
                      7     2.6171    0.01078 *
                      8     2.3925    0.01436 *
                      9     2.783     0.003007 **
                     10     2.5809    0.004123 **
                     11     2.2413    0.01048 *
                     12     2.1779    0.01052 *
Price ⇻ Polarity      2     0.4406    0.6437 .
                      3     4.6413    0.003055 **
                      4     4.1004    0.002572 **
                      5     3.2568    0.006176 **
                      6     2.8864    0.008315 **
                      7     2.2867    0.02534 *
                      8     2.4621    0.01175 *
                      9     2.2779    0.01532 *
                     10     2.5776    0.004172 **
                     11     3.1191    0.0003408 ***
                     12     2.7003    0.001254 **
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Table 3. Granger Causality EOS

H0                   Lags   F-score   p-value
Polarity ⇻ Price      8     2.0854    0.03394 *
                      9     1.8406    0.05647 .
Price ⇻ Polarity      5     2.1455    0.05737 .
                      6     1.8945    0.07809 .
                      7     2.024     0.04866 *
                      8     2.0148    0.04111 *
                      9     1.7935    0.0645 .
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Table 4. Granger Causality Ethereum

H0                   Lags   F-score   p-value
Polarity ⇻ Price      2     4.553     0.01061 *
                      3     3.2592    0.02069 *
                      4     2.4279    0.04583 *
                      5     2.5774    0.02468 *
                      6     2.256     0.03557 *
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Table 5. Granger Causality Ripple

H0                   Lags   F-score   p-value
Polarity ⇻ Price      2     3.7593    0.02342 *
                      3     2.6797    0.04541 *
                      4     2.1831    0.06843 .
                      5     2.0022    0.07528 .
                      6     1.9758    0.06567 .
                      7     1.7366    0.09601 .
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Table 6. Granger Causality Litecoin

H0                   Lags   F-score   p-value
Polarity ⇻ Price      2     1.7537    0.1733
                      3     1.517     0.208
                      4     1.1526    0.3299
                      5     0.9961    0.4185
                      6     0.8015    0.5686
                      7     0.7783    0.6055
Price ⇻ Polarity      2     0.3763    0.6864
                      3     0.5874    0.6233
                      4     0.3308    0.8574
                      5     0.27475   0.9271
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Table 7. RMSFE with and without additive regressor

Coins      Additive Regressor   RMSFE    ΔRMSFE
Bitcoin    No                   0.1794   0.1178
           Polarity             0.0616
Ethereum   No                   0.1337   0.0878
           Polarity             0.0460
Ripple     No                   0.0870   0.0480
           Polarity             0.0390
EOS        No                   0.1915   0.1571
           Polarity             0.0345

Bitcoin (BTC)

Bitcoin, the dominant cryptocurrency, shows indications of bidirectional Granger causality at lags 2 to 12 (2 to 12 hours), as seen in Table 2. Bitcoin's price has been spiraling up and down, spanning from around $14,000 to around $6,000, after previously having reached an
