Bitcoin price prediction using Deep Neural Networks

Author

Michelle Appel

10170359

appel.michelle@gmail.com

Supervisor

Tom Runia

PhD student Deep Learning

Science Park 904, Room C3.250A

A thesis submitted in partial fulfillment for the

degree of Bachelor of Science

in

Beta-Gamma major: Artificial Intelligence

University of Amsterdam


Abstract

Bitcoin is a thriving open-source community and payment network, currently used by approximately 10 million people. Because the value of Bitcoin in US Dollar fluctuates every day, forecasting it is both very interesting for investors and difficult to do. This work focuses on predicting Bitcoin prices using a Long Short-Term Memory (LSTM) network. The effect of scaling methods on prediction effectiveness has been investigated, and it has been found that scaling both the features and the target improves prediction effectiveness. An experiment on the effect of different feature combinations has also been conducted, and an optimal combination of 9 features has been found. Finally, the effect of different sequence lengths and prediction delays has been tested; the best absolute prediction is obtained with a prediction delay of 0 and a sequence length of 1, which is against expectations. Overall, the predictions are slightly better than the baseline, which takes the last known Bitcoin price as the prediction for the day to be predicted.


1 Introduction

1.1 Bitcoin

Satoshi Nakamoto, a pseudonym for the unknown developer of Bitcoin, published a paper in 2008 called Bitcoin: A Peer-to-Peer Electronic Cash System [6], which describes the mechanism of the Bitcoin network, its transactions and how the double-spending problem can be solved. Nakamoto released the Bitcoin software in 2009. It has since grown into a thriving open-source community and payment network, currently used by approximately 10 million people¹.

Bitcoin is a cryptocurrency: a digital currency that is entirely decentralised, meaning it is based on peer-to-peer transactions without going through a financial institution. It has several advantages, such as a controlled and known algorithm for currency creation and transparency of all transactions: every transaction, including its size, time stamp, sender wallet address and receiver wallet address, is stored in the block chain. It also offers a degree of anonymity, because only wallet addresses, and no names, are stored. Altogether this makes it a possibly desirable alternative to classic currencies like the US Dollar, Euro or Chinese Yuan.

The anonymity of Bitcoin transactions is a desirable feature for people who want to commit illegal activities, as they cannot easily be traced. Also, no taxes are paid over Bitcoin transactions, which could possibly lead to a collapse of society if Bitcoin becomes the only payment method in use and no other solution to this problem is found.

1.2 Time series analysis in financial context

The value of Bitcoin in US Dollar fluctuates every day, which makes it difficult to predict future Bitcoin prices. Bitcoin prices are particularly difficult to predict, as Bitcoin has an even more volatile character than most stock markets [8]. Knowing future Bitcoin values would yield profit for investors, as they would know when to buy or sell Bitcoin in order to gain maximum profit: when the Bitcoin price is going to rise they can invest, and when it is going to drop they can sell.

Prediction is possible if there exists a relation between historic data and future Bitcoin value. Machine learning can then be used to recognise patterns between historic data and future Bitcoin prices in order to predict them. If machine learning succeeds in effectively predicting Bitcoin prices, the method can be implemented in Bitcoin prediction software.

¹ Based on wallet addresses richer than 1 USD, retrieved from https://bitinfocharts.


Prediction can be achieved using classification or regression. The major difference between the two is that classification predicts a discrete, unordered class label whereas regression predicts a continuous, ordered value. Regression allows exact future Bitcoin prices to be predicted, whereas classification only allows a limited number of classes to be predicted. This work aims at predicting Bitcoin values in US Dollar using regression.

1.3 Outline of this thesis

To predict the Bitcoin price as effectively as possible, subsection 2.2 first investigates which regression machine learning method has proven effective on similar problems, such as stock prediction. This regression method will be tested on predicting future Bitcoin prices. Subsection 2.4 then investigates which factors are related to future Bitcoin prices and can be collected and used for the predictions.

Next, in subsection 4.1, the effect of feature scaling and target scaling on future Bitcoin price predictions will be investigated. The hypothesis is that scaling features allows regression to find a solution faster, but not necessarily a more accurate one. Predictions will be made using different scaling methods and their errors with respect to the actual values will be calculated. The method with the lowest average error is considered the most effective.

Then, in subsection 4.2, the effect of different features will be tested by training a model and making predictions using different combinations of features. The combination of features whose predictions have the lowest error is considered the best combination.

Finally, in subsection 4.3, the delay between known data and predictions will be tested. The first hypothesis is that the closer the predicted day is to the known data, the better the predictions, because there is less uncertainty. However, it is possible that some features have a relation with Bitcoin prices further in the future than 1 day. The effect of the prediction delay will be tested together with the length of the input sequence. The expectation is that the longer the input sequence, the better the prediction, as there is more information available to base a prediction on. Different combinations of prediction delays and sequence lengths will be used to make predictions, and the combination with the lowest error with respect to the actual values is considered the best.

These experiments will possibly help to achieve an effective future Bitcoin price prediction method.


2 Literature review

2.1 Regression techniques for stock prediction

Regression can be used to predict an output based on a given input. The simplest form of regression is linear regression, where a single variable is used to base a prediction on. A more advanced form is multiple regression, where several variables are used to base a prediction on [1].

2.1.1 Linear regression

Linear regression is used to predict a relationship between a dependent and an independent variable, which can be represented as

$$Y = C + W X$$

where $Y$ is the dependent variable, $X$ is the independent variable, $C$ is a constant and $W$ is the slope of the regression line.

2.1.2 Multiple regression

Multiple regression uses multiple variables to predict a relationship between the dependent variable Y and independent variables, which can be represented as

$$Y = w_0 + w_1 x_1 + \dots + w_n x_n$$

where $w_0$ to $w_n$ are the coefficients, or weights, and $x_1$ to $x_n$ are the independent variables, or features.
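As a minimal sketch of the formula above, the prediction for a single sample can be computed as a bias plus a weighted sum of the features. The function name `predict` and the weights and feature values are hypothetical and chosen only for illustration; they are not taken from this work.

```python
def predict(weights, features):
    """Return w0 + w1*x1 + ... + wn*xn for one sample."""
    bias, coefficients = weights[0], weights[1:]
    if len(coefficients) != len(features):
        raise ValueError("expected one weight per feature plus a bias")
    return bias + sum(w * x for w, x in zip(coefficients, features))


# Hypothetical weights [w0, w1, w2] and feature values [x1, x2].
weights = [1.0, 2.0, -0.5]
sample = [3.0, 4.0]
prediction = predict(weights, sample)  # 1.0 + 2.0*3.0 - 0.5*4.0
```

Linear regression is the special case with a single feature, where `weights` holds only the constant C and the slope W.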

2.2 Machine learning

Machine learning can be used to find the optimal weights of this regression formula such that it describes the relation between features and target as well as possible. The algorithm learns from a set of examples by minimising the error of the regression line with respect to the true values of these examples, with the intention that it will describe the relation for future or unseen cases as well.

2.3 Artificial Neural Networks

Artificial Neural Networks (ANNs) are a form of machine learning based on a collection of connected units which can perform operations. The information flowing through the units and the operations performed are governed by weights, which need to be optimised in order for a model to be fitted to the data. Learning is done using gradient descent.


2.3.1 Gradient descent

Machine learning can find the best fitting values for the weights of the regression function using gradient descent. Gradient descent is used to find the minimum error of a regression function with respect to the true values it has to approach, by taking small steps towards a (local) optimum, which is where the error function is at a minimum. Such error functions are described in subsection 3.3.
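To make the idea of taking small steps towards an optimum concrete, the following sketch fits the linear regression of subsection 2.1.1 to a toy data set with plain gradient descent on the mean squared error. The function name `fit_linear`, the learning rate, the number of steps and the data are assumptions for illustration only.

```python
def fit_linear(xs, ys, learning_rate=0.05, steps=2000):
    """Fit Y = C + W*X by gradient descent on the mean squared error."""
    c, w = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Error of the current regression line on every example.
        errors = [(c + w * x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. C and W.
        grad_c = 2.0 * sum(errors) / n
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        # Take a small step against the gradient.
        c -= learning_rate * grad_c
        w -= learning_rate * grad_w
    return c, w


# Toy data generated from Y = 1 + 2*X.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
c, w = fit_linear(xs, ys)
```

On this noise-free toy data the fitted constant and slope approach 1 and 2; on real data only a (local) minimum of the error is reached.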

2.3.2 Recurrent Neural Networks and Long Short-Term Memory

Recurrent Neural Networks (RNNs) are a family of machine learning methods that have become widely used for extracting patterns from temporal sequences [7], making them possibly effective for predicting time series like the Bitcoin price trend. An RNN is an ANN equipped with temporal memory, as it takes a sequence as input. As can be seen in Figure 1, every element of the input sequence is fed to a separate RNN cell and processed, and its output is passed to the next RNN cell, until the last cell is reached and a final prediction is made.

Figure 1: Image courtesy of Chris Olah. The unrolled architecture of a Recurrent Neural Network, where $x_0$ to $x_t$ are the input values of the sequence, A is a recurrent cell and $h_0$ to $h_t$ are the output values.

The memory of an RNN quickly fades over time because, in a classic RNN, information is passed from step to step through ordinary nodes. This makes time series analysis less effective, in particular for longer input and/or output sequences. This is called the vanishing gradient problem, which can be solved by introducing a Long Short-Term Memory (LSTM) [9].

An LSTM is an RNN with a memory cell which can maintain its state over time by using non-linear gating units which regulate the information flow into and out of the cell [2], as can be seen in Figure 2. This allows the temporal dimension of the data to be taken into account better than with a classic RNN, which may be the reason for the effective results on stock market prediction gained in earlier studies [9].


Figure 2: Image courtesy of Chris Olah. The architecture of an LSTM cell with its forget gate layer, input gate layer and output gate layer.
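The gating mechanism described above can be sketched for a single scalar cell using the commonly used LSTM update equations. This is a simplified illustration, not the implementation used in this work: real LSTM layers use weight matrices over many units, and the parameter layout `(w_x, w_h, b)` and all values below are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell; returns the new (h, c)."""
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    i = gate("input", sigmoid)        # how much new information to write
    g = gate("candidate", math.tanh)  # the new information itself
    o = gate("output", sigmoid)       # how much of the cell state to expose
    c = f * c_prev + i * g            # updated memory cell
    h = o * math.tanh(c)              # output passed to the next step
    return h, c


# Hypothetical parameters, identical for every gate.
params = {name: (0.5, 0.5, 0.0)
          for name in ("forget", "input", "candidate", "output")}
h, c = 0.0, 0.0
for x in [0.1, 0.2, 0.3]:  # a toy input sequence x0 to x2
    h, c = lstm_step(x, h, c, params)
```

When the forget gate is close to 1 and the input gate is close to 0, the cell state is carried over almost unchanged, which is how the memory cell can maintain its state over time.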

2.4 Drivers of the Bitcoin price

In order to predict Bitcoin prices using machine learning, there has to exist data that holds a relation with the Bitcoin price, such that when this data fluctuates the Bitcoin price fluctuates as well. To make effective future Bitcoin price predictions, the data must have a relation with future Bitcoin prices, which makes it a leading indicator of the Bitcoin price. Such factors may be effective features for Bitcoin price prediction.

Multiple studies have shown that there exist certain factors that hold a relationship with the Bitcoin price [3] [4] [5] [10].

2.4.1 Popularity

The popularity of Bitcoin seems to hold such a leading relation [3] [5]. A measurement of popularity is query volume, which is the number of times a subject has been queried.

Google Trends’ query volume of the query “Bitcoin”² represents the number of times “Bitcoin” is searched with the Google search engine per day. It holds a positive and leading relation with the Bitcoin price [3]. Wikipedia query volume seems to hold a positive and leading relation as well [3].

2.4.2 Economic drivers

Bitcoin is a currency, which indicates there are economic drivers of the Bitcoin price. The demand for Bitcoin holds a positive relation with the price of Bitcoin, and the number of transactions being done with Bitcoin has a positive and leading relation with the price of Bitcoin [3].


2.4.3 Technical drivers

Technical factors are unique for the Bitcoin market, as other stock markets do not have technical factors, making them very interesting for predicting future Bitcoin prices.

Madan et al. [4] used a selection of 16 features related to the Bitcoin network retrieved from Blockchain Info³, which can be seen in Table 1. These features were selected manually based on their significance for the problem of predicting Bitcoin trends using classification.

| Feature | Definition |
|---|---|
| Average Confirmation Time | Average time to accept transaction in block |
| Block Size | Average block size in MB |
| Cost per transaction percent | Miners revenue divided by the number of transactions |
| Difficulty | How difficult it is to find a new block |
| Estimated Transaction Volume | Total output volume without change from value |
| Hash Rate | Bitcoin network giga hashes per second |
| Market Capitalization | Number of Bitcoins in circulation * the market price |
| Miners Revenue | (number of BTC mined/day * market price) + transaction fees |
| Number of Orphaned Blocks | Number of blocks mined / day not off blockchain |
| Number of TXN per block | Average number of transactions per block |
| Number of TXN | Total number of unique Bitcoin transactions per day |
| Number of unique addresses | Number of unique Bitcoin addresses used per day |
| Total Bitcoins | Historical total number of Bitcoins mined |
| TXN Fees | Total BTC value of transaction fees miners earn/day |
| Trade Volume | USD trade volume from the top exchanges |
| Transaction to trade ratio | Relationship of BTC transaction volume and USD volume |

Table 1: Names and descriptions of the 16 features chosen in the research of Madan et al. [4] that relate to the Bitcoin network

The data used is a 24-hour time series, which seems to minimize noise concerns from higher granularity measurements and minute volatility [4].

The best result obtained by Madan et al. is a prediction accuracy on the test set of 0.9879 using a Binomial Generalised Linear Model, which indicates that technical features, and in particular this selection of features, are successful for predicting future Bitcoin prices.


3 Method

First, the data is retrieved and pre-processed. Next, the machine learning environment is set up. Then, the best feature scaling technique is determined, the best combination of features is selected and finally different combinations of sequence lengths and prediction delays are tested.

3.1 Data retrieval

The data is retrieved using APIs from Blockchain Info⁴ as a 24-hour time series. The BTC price in USD, visualised in Figure 3, provides the target values for prediction. The data starts at 01-03-2009 and ends one day before the day of retrieval, which in this work is 24-06-2017.

Figure 3: The average Bitcoin price in USD per day, retrieved from Blockchain Info at 24-06-2017.

Other data retrieved from blockchain.info is described in Table 2, including features described in subsection 2.4.3 and other available data.


| Feature | Definition |
|---|---|
| Average USD price | Average USD market price across major bitcoin exchanges. |
| Market capitalization | The total USD value of bitcoin supply in circulation, as calculated by the daily average market price across major exchanges. |
| BTC in circulation | The total number of bitcoins that have already been mined; in other words, the current supply of bitcoins on the network. |
| USD exchange trade volume | The total USD value of trading volume on major bitcoin exchanges. |
| Blockchain size | The total size of all block headers and transactions in MB. Not including database indexes. |
| Average block size | The average block size in MB. |
| No. orphaned blocks | The total number of blocks mined but ultimately not attached to the main Bitcoin blockchain. |
| TXN per block | The average number of transactions per block. |
| Median confirmation time | The median time for a transaction to be accepted into a mined block and added to the public ledger (note: only includes transactions with miner fees). |
| BTC unlimited support | Percentage of blocks signalling Bitcoin Unlimited support. |
| Hash rate | The estimated number of tera hashes per second (trillions of hashes per second) the Bitcoin network is performing. |
| Difficulty | A relative measure of how difficult it is to find a new block. The difficulty is adjusted periodically as a function of how much hashing power has been deployed by the network of miners. |
| Miners revenue | The total value of coinbase block rewards and transaction fees paid to miners. |
| Total TXN fees | The total value of all transaction fees paid to miners in BTC (not including the coinbase value of block rewards). |
| Total TXN fees USD | The total value of all transaction fees paid to miners in USD (not including the coinbase value of block rewards). |
| Cost per TXN percent | Miners revenue as percentage of the transaction volume. |
| Cost per TXN | Miners revenue divided by the number of transactions. |
| N unique addresses | The total number of unique addresses used on the Bitcoin blockchain. |
| N transactions per day | The number of daily confirmed Bitcoin transactions. |
| Total number of transactions | Total number of transactions. |
| N TXN | The total number of Bitcoin transactions, excluding those involving any of the network’s 100 most popular addresses. |
| N TXN exc chains longer than 100 | The total number of Bitcoin transactions per day excluding those part of long transaction chains. There are many legitimate reasons to create long transaction chains; however, they may also be caused by coin mixing or possible attempts to manipulate transaction volume. |
| Output value | The total value of all transaction outputs per day (includes coins returned to the sender as change). |
| Estimated USD transaction value | The estimated transaction value in USD. |

Table 2: Features and their definitions, retrieved from Blockchain Info, that will be used to predict the average USD price of Bitcoin. The average USD price provides the target values that are to be predicted.


3.2 Data pre-processing

After the data has been retrieved, some modification needs to be done before machine learning can be applied. First, the data has to be matched by date. Then, the part where the value of the Bitcoin is 0 USD is removed. Finally, the data will be split in a train and test set and the data will be ready for machine learning to train a model with.

3.2.1 Match by date

The retrieved data needs to be matched by date to make the input vectors for machine learning. Part of the data has a daily resolution whereas other data has an hourly resolution, so the data cannot be matched by simply inserting everything in a matrix. All dates of the data set are therefore converted to Unix time stamps, by which the data is matched and inserted in a matrix.
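The matching step can be sketched as follows. The text above does not state how hourly values are reduced to one value per day; averaging them per day is an assumption made here, as are the function names, the date formats and the sample values.

```python
from collections import defaultdict
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def to_day_stamp(text, fmt):
    """Convert a date string to the Unix time stamp of its day (00:00 UTC)."""
    moment = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    stamp = int(moment.timestamp())
    return stamp - stamp % SECONDS_PER_DAY


def daily_mean(rows, fmt):
    """Average all (date string, value) pairs that fall on the same day."""
    buckets = defaultdict(list)
    for text, value in rows:
        buckets[to_day_stamp(text, fmt)].append(value)
    return {day: sum(values) / len(values) for day, values in buckets.items()}


def match_by_date(series_list):
    """Keep only days present in every series; one matrix row per day."""
    shared = set(series_list[0])
    for series in series_list[1:]:
        shared &= set(series)
    return [[day] + [series[day] for series in series_list]
            for day in sorted(shared)]


# Hypothetical daily and hourly samples.
price = daily_mean([("2017-06-22", 2700.0), ("2017-06-23", 2710.0)],
                   "%Y-%m-%d")
hash_rate = daily_mean([("2017-06-23 00:00", 5.0), ("2017-06-23 12:00", 7.0)],
                       "%Y-%m-%d %H:%M")
matrix = match_by_date([price, hash_rate])
```

Each row of `matrix` holds the day's time stamp followed by one value per feature, and days missing from any series are dropped.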

3.2.2 Bitcoin price equal to 0

In the first 294 entries of the data set the BTC price in USD equals 0. This part of the data does not contribute to training a model and is therefore left out.

3.2.3 Split data in train set, validation set and test set

The data is split into a train set, validation set and test set according to a given ratio. The train set is the first part of the data, the validation set is the second part and the test set is the last part. Using a train-test ratio of 0.9, the data is split as shown in Figure 4. The last 0.1 of the train set is used as validation set.

(a) Train and validation set (b) Test set

Figure 4: The train-validation target set and the test target set with a train-test ratio of 0.9.
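A chronological split with these ratios could look like the sketch below. The function name and the way fractional sizes are rounded down are assumptions; the text only specifies the ratios and the order of the three parts.

```python
def chronological_split(rows, train_test_ratio=0.9, validation_fraction=0.1):
    """Split time-ordered rows into train, validation and test parts."""
    n_train_total = int(len(rows) * train_test_ratio)
    n_validation = int(n_train_total * validation_fraction)
    n_train = n_train_total - n_validation
    train = rows[:n_train]                     # first part of the data
    validation = rows[n_train:n_train_total]   # last part of the train portion
    test = rows[n_train_total:]                # last part of the data
    return train, validation, test


days = list(range(1000))  # stand-in for 1000 time-ordered samples
train, validation, test = chronological_split(days)
```

The data is not shuffled, so the test set always lies later in time than anything the model was trained or validated on.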


3.3 Evaluation metrics

The L1 and L2 errors are used to calculate the difference between the predicted values and the true values as a measurement of how good the predictions are: the smaller the error, the better the predictions.

L1 error: The L1 error of a prediction is the absolute average error, which is calculated by:

$$L1 = \sum_{i=1}^{n} |y_i - f(x_i)|$$

where $y_i$ is the target value and $f(x_i)$ is the predicted value.

L2 error: The L2 error of a prediction is the mean squared error, which is calculated by:

$$L2 = \sum_{i=1}^{n} (y_i - f(x_i))^2$$

where $y_i$ is the target value and $f(x_i)$ is the predicted value.

The main difference is that the L2 error will be much larger than the L1 error in the case of outliers: when an error is already large, squaring it makes it even larger.
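The effect of an outlier on both errors can be illustrated with a small sketch that follows the two formulas as written above, as sums over all samples. The function names and the numbers are hypothetical.

```python
def l1_error(targets, predictions):
    """Sum of absolute differences between targets and predictions."""
    return sum(abs(y - p) for y, p in zip(targets, predictions))


def l2_error(targets, predictions):
    """Sum of squared differences between targets and predictions."""
    return sum((y - p) ** 2 for y, p in zip(targets, predictions))


# Two small errors of 1 and one outlier with an error of 10.
targets = [100.0, 100.0, 100.0]
predictions = [101.0, 99.0, 110.0]
l1 = l1_error(targets, predictions)  # 1 + 1 + 10
l2 = l2_error(targets, predictions)  # 1 + 1 + 100
```

The single outlier contributes 10 of the 12 units of L1 error but 100 of the 102 units of L2 error. Dividing either result by the number of samples gives the averaged variant.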

3.4 Definitions

3.4.1 Sequence length

The sequence length is the length of the input sequence to the LSTM, which can vary between 1 and the length of the entire data set - prediction length.

The sequence length can be seen as the number of inputs $x_0$ to $x_n$, as shown in Figure 6. In this work every element of the sequence holds the values of one day. When the sequence length is 1 day, the LSTM will act like a regular artificial neural network, as there is no time series to analyse but only a single day. With the sequence length at its maximum, the LSTM will only have one example to train with, which is not desirable. A balance between the number of training examples and the sequence length is needed.

3.4.2 Prediction delay

The prediction delay is the amount of steps, in this case days, between known data and prediction, where a prediction delay of 0 days means that the next day with respect to the known data is predicted.
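The two definitions above can be combined into a sketch that turns a day-by-day data matrix into training samples. The function name `make_samples` and the toy data are hypothetical; the indexing follows the convention that a prediction delay of 0 targets the day directly after the input sequence.

```python
def make_samples(rows, target_column, sequence_length, prediction_delay):
    """Build (input sequence, target) pairs from time-ordered daily rows."""
    inputs, targets = [], []
    last_start = len(rows) - sequence_length - prediction_delay - 1
    for start in range(last_start + 1):
        end = start + sequence_length  # index of the first day after the input
        inputs.append(rows[start:end])
        targets.append(rows[end + prediction_delay][target_column])
    return inputs, targets


# Toy data: one feature (the price) for five consecutive days.
rows = [[10.0], [11.0], [12.0], [13.0], [14.0]]
inputs, targets = make_samples(rows, target_column=0,
                               sequence_length=2, prediction_delay=1)
```

With five days, a sequence length of 2 and a delay of 1 only two samples remain, which shows how longer sequences and delays reduce the number of training examples.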


3.4.3 The baseline

The baseline is the simplest prediction possible. For this problem the baseline is defined by taking the Bitcoin price of the last day of the input sequence as the prediction. The L1 error and L2 error can be calculated accordingly. The algorithm is useful only if it predicts better than this baseline, meaning that the errors of its predictions are lower than those of the baseline.

A visualisation of the baseline prediction of the test set with a prediction delay equal to 0 can be seen in Figure 5.

Figure 5: The baseline prediction of the test set, where the baseline is the last day of the sequence, which is in this figure one day before the prediction.
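A sketch of the baseline, assuming each input sequence is a list of daily rows with the Bitcoin price in the first column; the function name, column position and toy values are illustrative assumptions.

```python
def baseline_predictions(input_sequences, price_column=0):
    """Predict the price of the last day of each input sequence."""
    return [sequence[-1][price_column] for sequence in input_sequences]


# Toy input sequences of two days each, with their true target prices.
input_sequences = [[[10.0], [11.0]], [[11.0], [12.0]]]
targets = [12.0, 14.0]

baseline = baseline_predictions(input_sequences)
baseline_l1 = sum(abs(y - p) for y, p in zip(targets, baseline))
baseline_l2 = sum((y - p) ** 2 for y, p in zip(targets, baseline))
```

A trained model is only worth using if its errors on the same targets are lower than `baseline_l1` and `baseline_l2`.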

3.5 Proposed training architecture

3.5.1 Tools

To implement an LSTM, Python 3.5 is used together with Keras⁵, a Python library which uses TensorFlow as a back end. The Keras Sequential model is used to build the model: LSTM layers and dense layers form the architecture, which uses regression to predict future Bitcoin values.


3.5.2 Architecture

The architecture consists of input LSTM layers, which contain 256 units, and an output dense layer, as visualised in Figure 6. The number of 256 units was chosen experimentally as a balance between enough units to fit the model well and not so many that the network would simply ’store’ the train data, which would lead to overfitting.

Figure 6: The architecture of the LSTM used, where $x_0$ to $x_n$ are the elements of the input sequence, A are the LSTM layers, which each contain 256 units, D is the dense layer and h is the predicted output value of the Bitcoin price in US Dollars.

3.5.3 Fit model

The model is fit using the train data, which is 0.9 of the data set as described in subsection 3.2.3, of which the last 0.1 is used as validation set. A batch size of 32 is used, which is the number of samples propagated through the network before each weight update.

3.5.4 Optimisation

For gradient descent, as described in subsection 2.3.1, to find an optimum as effectively as possible, RMSprop is used as the optimiser in this work; it is usually a good choice for recurrent neural networks.
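The core idea of RMSprop, dividing each gradient step by a running root-mean-square of recent gradients, can be sketched for a single weight as below. This is a textbook-style illustration and may differ in details from the Keras implementation; the function name, the hyperparameter values and the toy objective are assumptions.

```python
import math


def rmsprop_step(weight, gradient, mean_square,
                 learning_rate=0.001, rho=0.9, epsilon=1e-7):
    """Apply one RMSprop update; returns the new (weight, mean_square)."""
    # Running average of the squared gradient.
    mean_square = rho * mean_square + (1.0 - rho) * gradient ** 2
    # Step size is normalised by the root of that running average.
    weight -= learning_rate * gradient / (math.sqrt(mean_square) + epsilon)
    return weight, mean_square


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
weight, mean_square = 0.0, 0.0
for _ in range(5000):
    gradient = 2.0 * (weight - 3.0)
    weight, mean_square = rmsprop_step(weight, gradient, mean_square,
                                       learning_rate=0.01)
```

Because the step is normalised, its size depends mostly on the learning rate rather than on the raw gradient magnitude, which is one reason a decaying learning rate is often combined with this optimiser.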


4 Experiments

To predict the Bitcoin price as effectively as possible, three experiments will be conducted. First, three methods of data scaling are compared on effectiveness. Next, the effect of using different feature combinations on prediction effectiveness will be tested. Finally, the effect of prediction delay and sequence length on prediction effectiveness is investigated.

4.1 Data scaling

In order to potentially optimise gradient descent data scaling is applied. The formula by which data is scaled is given as:

$$x' = \frac{x - \min(x_{\text{train}})}{\max(x_{\text{train}}) - \min(x_{\text{train}})}$$

where $x$ is the original data and $x'$ is the scaled data, $\min(x_{\text{train}})$ is the minimum value of the training set and $\max(x_{\text{train}})$ is the maximum value of the training set. The minimum and maximum values of the training set are used to scale all data (including the test set), because when predicting unknown data the maximum and minimum values are unknown, so that data cannot be scaled by its own range. After scaling, the data of the train set lies in the range [0, 1].

Usually feature scaling is done before applying machine learning as it often leads to better gradient descent performance and quicker convergence to a solution. Target scaling is less usual, but seems to affect the effectiveness of the predictions. After prediction, the results are denormalised to obtain a useful prediction value using the following formula:

$$x = x' \times (\max(x_{\text{train}}) - \min(x_{\text{train}})) + \min(x_{\text{train}})$$

The effect of both feature and target scaling on prediction effectiveness will be investigated by comparing predictions made by a trained model to the real values. When training these models, the only feature used is the Bitcoin price in USD, with a sequence length of 1 and no prediction delay.
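The scaling and denormalisation formulas of this subsection can be sketched as follows, with the minimum and maximum taken from the train set only. The function names and the numbers are hypothetical.

```python
def fit_min_max(train_values):
    """Return the minimum and maximum of the train set."""
    return min(train_values), max(train_values)


def scale(values, low, high):
    """Scale values with the train-set minimum and maximum."""
    return [(v - low) / (high - low) for v in values]


def unscale(scaled, low, high):
    """Invert the scaling to obtain values in the original unit."""
    return [s * (high - low) + low for s in scaled]


train = [10.0, 20.0, 30.0]
test = [25.0, 40.0]
low, high = fit_min_max(train)           # statistics from the train set only
scaled_train = scale(train, low, high)   # lies in [0, 1]
scaled_test = scale(test, low, high)     # may fall outside [0, 1]
restored = unscale(scaled_test, low, high)
```

A test value above the train maximum, such as 40.0 here, is scaled to a value larger than 1, which is expected when prices keep rising after the training period.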

Models will be trained using an initial learning rate of 0.005 and the ReduceLROnPlateau callback, which reduces the learning rate if no improvement is seen for 10 epochs. To end training once the model has converged, the EarlyStopping callback is used, which stops the training when no improvement is seen for 25 epochs. A maximum of 800 epochs is set.


4.2 Feature selection

In order to investigate which features contribute most to predicting Bitcoin prices, the results of different combinations of features can be compared. All possible combinations are given by the power set of these features. Having the Bitcoin price as a feature is desirable, as the price prediction can be based on the last price, so it will always be the first element of the feature matrix. The feature combinations are therefore given by the power set of all features excluding the Bitcoin price, each appended to the Bitcoin price feature vector. The power set of these 18 features, with the Bitcoin price pinned as the first feature vector of every combination, has a size of 262144. Training on the entire set would take too long, so a more efficient method must be used to test the effectiveness of feature combinations.

The method used instead is a greedy method: the first feature vector, which is the average USD price, is taken and every other feature vector is appended to it in turn. Using 19 features, 18 different combinations can be made this way. Each combination is used to train a model and predict the test set, of which the errors are calculated. The feature combination with the smallest error is ’pinned down’ and the process is repeated by appending the remaining features to those 2 feature vectors in the next round. This is repeated until the length of 19 features is reached after 18 rounds. This reduces the problem to a size of 180 combinations. The appended features can be seen in Table 5 and the pinned down sets can be seen in Table 6, in subsection 5.2.

A drawback of this method is that some feature combinations that are possibly effective may not be found, because the process does not test every combination. Some features may only be effective in combination with certain other features, and not on their own.
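The greedy procedure can be sketched as below. The `evaluate` argument stands in for training a model on a feature combination and returning its test error; here it is replaced by a hypothetical toy function, and all feature names except the pinned price are placeholders.

```python
def greedy_selection(base_feature, candidates, evaluate):
    """Greedy forward selection; returns the best combination and its error."""
    pinned = [base_feature]
    remaining = list(candidates)
    best_combination, best_error = None, float("inf")
    while remaining:
        # Append every remaining feature in turn to the pinned set.
        scored = [(evaluate(pinned + [feature]), feature)
                  for feature in remaining]
        round_error, round_feature = min(scored)
        if round_error < best_error:
            best_combination = pinned + [round_feature]
            best_error = round_error
        # Pin the best feature of this round, even if the error went up.
        pinned.append(round_feature)
        remaining.remove(round_feature)
    return best_combination, best_error


# Hypothetical stand-in for "train a model and return its test error".
gains = {"b": 3.0, "c": 1.0, "d": -2.0}


def toy_error(features):
    return 10.0 - sum(gains.get(name, 0.0) for name in features)


best, error = greedy_selection("price", ["b", "c", "d"], toy_error)
```

Because a feature is pinned in every round regardless of whether the error improved, the best combination may be found in an intermediate round rather than the last one.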

The different feature combinations are used to train a model, which is then tested on the test set. The prediction delay is set to 0 and the sequence length is set to 1, which is the most basic setting. A consequence is that other feature combinations might be more effective at other sequence lengths and delays.

The results can be viewed in subsection 5.2.

4.3 Testing different prediction delays and sequence lengths

To investigate what prediction delay and sequence length will gain the most effective results, all different combinations must be used to train a model and predict the test set. The prediction delay is the amount of days between known data and prediction, where a prediction delay of 0 days means that the next day of the known data is predicted.


A prediction delay between 0 and 60 days is chosen, because leading factors seem to correlate up to 30 days [3]; the extra 30 days are there to include possible correlations outside of the 30-day boundary. A sequence length between 1 and 20 is chosen.

The data is split using a train-test ratio of 0.9, with 0.1 of the train set used as validation set. Both the target and features are scaled as described in subsection 4.1. The features selected in subsection 4.2, with results described in subsection 5.2, are used for prediction. The models are trained with a batch size of 32 and 256 units in the LSTM layer. The initial learning rate is $10^{-3}$ and is reduced with a factor of 0.4 every time the validation loss does not decrease within 5 epochs. Early stopping is used when the validation loss does not decrease within 25 epochs.
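The interplay of the learning rate reduction and early stopping can be illustrated with a simplified simulation over a list of validation losses. This is not the Keras implementation of the callbacks; it assumes that reducing with a factor of 0.4 means multiplying the learning rate by 0.4, and the function name and the loss values are hypothetical.

```python
def simulate_training(validation_losses, initial_lr=1e-3, factor=0.4,
                      lr_patience=5, stop_patience=25):
    """Return the (epoch, learning rate) history and the final learning rate."""
    lr = initial_lr
    best = float("inf")
    epochs_since_best = 0
    epochs_since_lr_change = 0
    history = []
    for epoch, loss in enumerate(validation_losses):
        history.append((epoch, lr))
        if loss < best:
            best = loss
            epochs_since_best = 0
            epochs_since_lr_change = 0
        else:
            epochs_since_best += 1
            epochs_since_lr_change += 1
            if epochs_since_lr_change >= lr_patience:
                lr *= factor  # reduce the learning rate on a plateau
                epochs_since_lr_change = 0
            if epochs_since_best >= stop_patience:
                break  # early stopping
    return history, lr


# Hypothetical losses: two improving epochs followed by a long plateau.
losses = [1.0, 0.9] + [0.95] * 30
history, final_lr = simulate_training(losses)
```

With these settings the learning rate is reduced five times during the plateau before training stops at epoch 26, so a stalled model gets several chances at a smaller step size before it is abandoned.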


5 Results

5.1 Data scaling

After convergence, the L1 and L2 errors of the predictions with respect to the true values of the test set are calculated for all three methods, as can be viewed in Table 3. Scaling both the features and the target gives the lowest errors, and the difference in performance can clearly be seen in Figure 7, where the predictions of this method are closer to the real values than those of the other methods. As a result, both the features and the target will be scaled in the rest of this research.

| Feature scaling | Target scaling | L1 error | L2 error |
|---|---|---|---|
| False | False | 382.65 | 344430.24 |
| True | False | 245.86 | 202890.29 |
| True | True | 64.99 | 13004.66 |

Table 3: Scaling methods and their errors with respect to true values of the test set.

Figure 7: BTC price prediction of the test set in USD using different scaling methods.


5.2 Feature selection

After running all combinations, the best result is found in round 8, power set number 5, which is the combination of the features:

• average USD price
• BTC in circulation
• median confirmation time
• no orphaned blocks
• output value
• estimated USD transaction value
• transactions per block
• market capitalization
• cost per transaction

The L2 error of the predictions of the test set is 6753.22, as can be seen in Table 4 which is visually represented by Figure 8.

The exact combinations of features per round and set number can be found in Table 5 and Table 6, where Table 5 contains the features that are appended per round and set number and Table 6 contains the features that are pinned down per round.


| round \ set no | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 | 7316.79 |
| 1 | 10398.86 | 13220.72 | 13531.71 | 12985.02 | 14951.17 | 7351.05 | 9416.30 | 18583.32 | 14097.23 | 10344.81 | 10020.29 | 12812.12 | 16566.17 | 39393.31 | 14130.65 | 17691.68 | 14372.63 | 14904.22 | 18127.59 |
| 2 | 7309.82 | 16208.69 | 9385.91 | 11391.40 | 9662.67 | 7481.27 | 15092.97 | 10481.11 | 7318.79 | 7435.80 | 8510.33 | 11524.70 | 37036.92 | 10741.94 | 14684.43 | 10990.45 | 11133.68 | 16650.21 |
| 3 | 7321.70 | 16095.12 | 8773.30 | 10319.26 | 9952.08 | 7054.75 | 12453.11 | 8789.31 | 7262.08 | 8215.36 | 10728.69 | 35137.73 | 10118.61 | 12359.44 | 10197.67 | 10711.17 | 15634.18 |
| 4 | 7211.18 | 15944.04 | 8708.27 | 10574.39 | 9860.73 | 12017.57 | 9010.01 | 7357.12 | 7885.80 | 10005.67 | 26580.76 | 10109.07 | 11941.35 | 10560.06 | 10236.18 | 13522.06 |
| 5 | 7277.08 | 16599.60 | 8912.57 | 10666.75 | 9945.27 | 12413.04 | 9345.54 | 8068.77 | 10476.92 | 41841.29 | 9560.82 | 12132.32 | 10876.14 | 9738.67 | 15022.47 |
| 6 | 7861.70 | 12700.01 | 10223.26 | 14246.74 | 13425.93 | 16596.14 | 12321.77 | 12488.54 | 32514.82 | 13472.18 | 14747.38 | 12698.33 | 13479.34 | 20102.33 |
| 7 | 10334.04 | 8459.10 | 15067.49 | 16760.36 | 16595.91 | 12483.56 | 14443.62 | 103824.72 | 14096.80 | 15343.22 | 15440.76 | 15732.95 | 25634.74 |
| 8 | 8229.77 | 6969.40 | 7245.78 | 7054.94 | 7016.18 | 6753.22 | 16274.46 | 7922.35 | 7395.98 | 7055.87 | 7084.31 | 12724.94 |
| 9 | 7010.27 | 7026.70 | 8441.65 | 7053.65 | 7132.18 | 15913.75 | 7288.37 | 7663.94 | 7129.18 | 7693.55 | 12129.08 |
| 10 | 7025.67 | 8486.93 | 7203.43 | 7186.08 | 12554.88 | 7139.28 | 8968.21 | 7217.74 | 7559.51 | 12017.37 |
| 11 | 7179.25 | 8998.59 | 7324.81 | 7314.82 | 10742.13 | 8746.12 | 7351.84 | 7248.95 | 9425.90 |
| 12 | 7402.76 | 8707.04 | 7482.07 | 16790.67 | 8381.26 | 8406.26 | 8352.71 | 11561.54 |
| 13 | 9556.99 | 12160.97 | 37445.38 | 8984.66 | 10280.09 | 8610.59 | 14834.54 |
| 14 | 8232.15 | 9061.42 | 24618.21 | 11937.05 | 9772.44 | 10906.66 |
| 15 | 9823.88 | 41753.73 | 11442.57 | 9316.75 | 15555.22 |
| 16 | 11132.90 | 23020.09 | 10747.88 | 10783.80 |
| 17 | 9708.93 | 41596.43 | 11032.31 |
| 18 | 21421.63 | 42023.35 |

Table 4: L2 errors of the prediction of the test set of different feature combinations using a prediction delay of 0 and a sequence length of 1.

Figure 8: Visualisation of the L2 errors of the prediction of the test set of different feature combinations using a prediction delay of 0 and a sequence length of 1. The lowest L2 error is at round 8, power set number 5.


set no \ round | 1 | 2 | 3 | 4 | 5 | 6
1 | market capitalization | market capitalization | market capitalization | market capitalization | market capitalization | market capitalization
2 | transactions per block | transactions per block | transactions per block | transactions per block | transactions per block | transactions per block
3 | cost per transaction | cost per transaction | cost per transaction | cost per transaction | cost per transaction | cost per transaction
4 | total transaction fees | total transaction fees | total transaction fees | total transaction fees | total transaction fees | total transaction fees
5 | BTC in circulation | no orphaned blocks | no orphaned blocks | blockchain size | blockchain size | blockchain size
6 | no orphaned blocks | blockchain size | blockchain size | n transactions per day | n transactions per day | n transactions per day
7 | blockchain size | n transactions per day | n transactions per day | output value | estimated USD transaction value | average block size
8 | n transactions per day | median confirmation time | output value | estimated USD transaction value | average block size | miners revenue
9 | median confirmation time | output value | estimated USD transaction value | average block size | miners revenue | n unique addresses
10 | output value | estimated USD transaction value | average block size | miners revenue | n unique addresses | total number of transactions
11 | estimated USD transaction value | average block size | miners revenue | n unique addresses | total number of transactions | n transactions
12 | average block size | miners revenue | n unique addresses | total number of transactions | n transactions | n transactions exc chains longer than 100
13 | miners revenue | n unique addresses | total number of transactions | n transactions | n transactions exc chains longer than 100 | hash rate
14 | n unique addresses | total number of transactions | n transactions | n transactions exc chains longer than 100 | hash rate
15 | total number of transactions | n transactions | n transactions exc chains longer than 100 | hash rate
16 | n transactions | n transactions exc chains longer than 100 | hash rate
17 | n transactions exc chains longer than 100 | hash rate
18 | hash rate

set no \ round | 7 | 8 | 9 | 10 | 11 | 12
1 | market capitalization | cost per transaction | total transaction fees | blockchain size | blockchain size | blockchain size
2 | cost per transaction | total transaction fees | blockchain size | n transactions per day | n transactions per day | n transactions per day
3 | total transaction fees | blockchain size | n transactions per day | average block size | average block size | miners revenue
4 | blockchain size | n transactions per day | average block size | miners revenue | miners revenue | total number of transactions
5 | n transactions per day | average block size | miners revenue | n unique addresses | total number of transactions | n transactions
6 | average block size | miners revenue | n unique addresses | total number of transactions | n transactions | n transactions exc chains longer than 100
7 | miners revenue | n unique addresses | total number of transactions | n transactions | n transactions exc chains longer than 100 | hash rate
8 | n unique addresses | total number of transactions | n transactions | n transactions exc chains longer than 100 | hash rate
9 | total number of transactions | n transactions | n transactions exc chains longer than 100 | hash rate
10 | n transactions | n transactions exc chains longer than 100 | hash rate
11 | n transactions exc chains longer than 100 | hash rate
12 | hash rate

set no \ round | 13 | 14 | 15 | 16 | 17 | 18
1 | n transactions per day | n transactions per day | miners revenue | miners revenue | miners revenue | miners revenue
2 | miners revenue | miners revenue | total number of transactions | total number of transactions | hash rate
3 | total number of transactions | total number of transactions | n transactions | hash rate
4 | n transactions | n transactions | hash rate
5 | n transactions exc chains longer than 100 | hash rate
6 | hash rate

Table 5: The elements that are appended per round, per powerset number.

round 1 | round 2 | round 3 | round 4 | round 5 | round 6
average USD price | average USD price | average USD price | average USD price | average USD price | average USD price
- | BTC in circulation | BTC in circulation | BTC in circulation | BTC in circulation | BTC in circulation
- | - | median confirmation time | median confirmation time | median confirmation time | median confirmation time
- | - | - | no orphaned blocks | no orphaned blocks | no orphaned blocks
- | - | - | - | output value | output value
- | - | - | - | - | estimated USD transaction value

round 7 | round 8 | round 9 | round 10 | round 11 | round 12
average USD price | average USD price | average USD price | average USD price | average USD price | average USD price
BTC in circulation | BTC in circulation | BTC in circulation | BTC in circulation | BTC in circulation | BTC in circulation
median confirmation time | median confirmation time | median confirmation time | median confirmation time | median confirmation time | median confirmation time
no orphaned blocks | no orphaned blocks | no orphaned blocks | no orphaned blocks | no orphaned blocks | no orphaned blocks
output value | output value | output value | output value | output value | output value
estimated USD transaction value | estimated USD transaction value | estimated USD transaction value | estimated USD transaction value | estimated USD transaction value | estimated USD transaction value
transactions per block | transactions per block | transactions per block | transactions per block | transactions per block | transactions per block
- | market capitalization | market capitalization | market capitalization | market capitalization | market capitalization
- | - | cost per transaction | cost per transaction | cost per transaction | cost per transaction
- | - | - | total transaction fees | total transaction fees | total transaction fees
- | - | - | - | n unique addresses | n unique addresses
- | - | - | - | - | average block size

round 13 | round 14 | round 15 | round 16 | round 17 | round 18
average USD price | average USD price | average USD price | average USD price | average USD price | average USD price
BTC in circulation | BTC in circulation | BTC in circulation | BTC in circulation | BTC in circulation | BTC in circulation
median confirmation time | median confirmation time | median confirmation time | median confirmation time | median confirmation time | median confirmation time
no orphaned blocks | no orphaned blocks | no orphaned blocks | no orphaned blocks | no orphaned blocks | no orphaned blocks
output value | output value | output value | output value | output value | output value
estimated USD transaction value | estimated USD transaction value | estimated USD transaction value | estimated USD transaction value | estimated USD transaction value | estimated USD transaction value
transactions per block | transactions per block | transactions per block | transactions per block | transactions per block | transactions per block
market capitalization | market capitalization | market capitalization | market capitalization | market capitalization | market capitalization
cost per transaction | cost per transaction | cost per transaction | cost per transaction | cost per transaction | cost per transaction
total transaction fees | total transaction fees | total transaction fees | total transaction fees | total transaction fees | total transaction fees
n unique addresses | n unique addresses | n unique addresses | n unique addresses | n unique addresses | n unique addresses
average block size | average block size | average block size | average block size | average block size | average block size
blockchain size | blockchain size | blockchain size | blockchain size | blockchain size | blockchain size
- | n transactions exc chains longer than 100 | n transactions exc chains longer than 100 | n transactions exc chains longer than 100 | n transactions exc chains longer than 100 | n transactions exc chains longer than 100
- | - | n transactions per day | n transactions per day | n transactions per day | n transactions per day
- | - | - | n transactions | n transactions | n transactions
- | - | - | - | total number of transactions | total number of transactions
- | - | - | - | - | hash rate

Table 6: The features that are pinned down per round.


5.3 Testing different prediction delays and sequence lengths

Models are trained for all combinations of sequence lengths 1-20 and prediction delays 0-60 and are evaluated on the test set. The resulting L2 errors on the test set are listed in Table 7 and visualised in Figure 9 and Figure 11. The differences between the results and the baselines are listed in Table 8 and visualised in Figure 10.
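To make the two parameters concrete, the following minimal sketch builds input windows and targets for a given sequence length and prediction delay, and scores the last-known-price baseline. The index convention (a delay of 0 means predicting the day directly after the window), the definition of the L2 error as the Euclidean norm of the residuals, and all function names are assumptions for illustration, not the thesis implementation.

```python
import math


def make_windows(prices, seq_len, delay):
    """Build (window, target) pairs from a list of daily prices.

    Assumption: the target lies `delay + 1` days after the last day of
    the window, so delay 0 means predicting the next day.
    """
    offset = seq_len + delay  # distance from window start to target
    n = len(prices) - offset
    if n <= 0:
        raise ValueError("series too short for this seq_len and delay")
    windows = [list(prices[i:i + seq_len]) for i in range(n)]
    targets = [prices[i + offset] for i in range(n)]
    return windows, targets


def persistence_baseline(windows):
    """Baseline: predict the last known price of each window."""
    return [window[-1] for window in windows]


def l2_error(predictions, targets):
    """Euclidean norm of the residuals (assumed definition of the L2 error)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predictions, targets)))
```

For example, `make_windows([1, 2, 3, 4, 5, 6], seq_len=2, delay=1)` yields the windows `[1, 2]`, `[2, 3]`, `[3, 4]` with targets `4`, `5`, `6`.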

The best absolute prediction of the test set is obtained with a sequence length of 1 and a prediction delay of 0. The L2 error of these predictions is 7082.21. The prediction is shown in Figure 12.

The best prediction relative to the baseline is obtained with a sequence length of 15 and a prediction delay of 60 days. Its L2 error differs from the baseline by -439372.67, where a negative value means a lower error than the baseline. The prediction is shown in Figure 13.
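The relation between the absolute errors and the differences with the baseline can be checked with a few values copied from Table 7 and Table 8. The sketch below uses only four of the 1220 combinations; the dictionary layout and variable names are illustrative assumptions.

```python
# Excerpt of the reported test-set L2 errors: {(seq_len, delay): error}.
errors = {
    (1, 0): 7082.21,
    (1, 60): 1045589.31,
    (15, 0): 26592.24,
    (15, 60): 397840.62,
}
# Reported baseline L2 error per prediction delay.
baseline = {0: 7316.79, 60: 837213.29}

# Difference with the baseline; negative means better than the baseline.
diff = {key: round(err - baseline[key[1]], 2) for key, err in errors.items()}

best_absolute = min(errors, key=errors.get)  # lowest L2 error
best_relative = min(diff, key=diff.get)      # lowest error minus baseline
```

In this excerpt the lowest absolute error belongs to sequence length 1 with delay 0, while the largest improvement over the baseline belongs to sequence length 15 with delay 60, because the baseline error itself grows strongly with the prediction delay.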


Seq len \Pred delay 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Baseline 7316.79 14187.09 22112.63 28786.10 36446.29 42255.01 50550.14 62877.24 75758.81 91297.56 106514.81 121977.64 132526.88 148489.17 166783.35 183796.34
1 7082.21 16764.99 35436.30 61333.86 87540.75 99222.07 174220.09 197316.71 252978.68 282450.48 313013.44 385277.56 387115.48 418221.68 440574.17 447795.31
2 7746.27 22029.83 43453.71 91814.82 139324.84 179615.90 263481.54 308202.99 362905.18 464771.65 418967.69 496948.59 475109.70 523849.91 502634.47 490334.23
3 15907.95 15554.75 37388.22 64106.03 131380.21 180894.52 294366.21 291555.38 423145.33 468941.41 521630.46 539567.99 623381.80 576690.29 677197.28 669592.22
4 10416.10 21779.61 53728.15 87168.51 171597.50 249498.83 327613.06 309263.66 682354.81 388419.50 573407.29 420839.45 625915.18 469424.64 494077.97 442910.09
5 11533.71 18068.64 32531.55 91715.15 122218.72 173209.87 304442.77 564976.63 447408.50 429513.99 457991.00 474433.41 560258.24 526733.60 499683.78 492056.25
6 11463.75 25453.82 62779.98 89703.61 144938.15 164772.49 255775.80 309169.29 403376.12 419907.72 469071.63 457444.93 509364.93 481931.62 524424.89 484567.93
7 15389.06 34294.31 52199.73 78025.05 99660.97 158797.28 269207.60 255919.54 389520.90 439633.10 439645.53 447157.34 528647.93 486645.71 488087.58 485687.65
8 41134.64 69892.58 119076.20 203413.95 226213.63 271834.38 406684.93 350962.37 400823.09 453452.76 543819.57 570062.88 548799.73 541614.06 484526.99 526566.24
9 101711.33 333763.86 362756.06 517136.26 481560.91 470185.80 519683.37 552310.22 509157.22 669077.61 481513.84 557968.22 606988.59 577747.76 555828.75 583403.72
10 80621.00 115052.73 167612.95 212163.57 311660.93 372961.86 414191.45 438011.11 470405.24 543442.97 631089.01 639906.90 727313.55 566881.91 588197.34 613674.34
11 38357.07 80727.78 112293.83 152721.02 298129.11 342361.07 339026.07 470016.81 466184.24 483926.97 542308.18 693279.29 516644.32 525616.73 528618.38 540979.55
12 20996.14 63598.82 116481.45 147550.17 227071.43 256695.72 299481.80 354298.87 420253.35 452320.97 493795.43 503968.44 516481.32 545966.80 663302.91 578560.40
13 20390.67 52793.38 92982.21 207993.66 200754.81 292934.85 339396.39 370196.82 459533.74 386549.87 663508.43 472122.81 514982.07 716976.71 583191.00 702278.12
14 16356.18 48594.98 78914.99 108959.81 140518.98 216784.75 272832.84 353388.94 419662.78 437740.68 447498.36 509049.02 553987.64 560912.26 513661.95 776075.36
15 26592.24 47167.49 86687.90 107651.42 169500.70 200503.59 343580.64 338443.82 410694.05 428298.96 460765.63 512798.80 536120.89 718963.56 573689.72 626272.34
16 33856.08 76224.79 105270.62 144042.46 164343.77 198187.18 325545.91 358993.89 352297.14 470301.91 601540.85 622212.06 687088.01 531514.68 657415.55 689872.47
17 73304.78 148191.97 135069.16 241422.85 244467.61 288823.53 333888.96 434887.53 532943.68 570018.34 604516.63 694423.85 537056.55 732617.07 747032.61 755823.56
18 273154.35 257363.34 301179.00 342873.67 453934.89 417986.68 454920.31 547467.80 518362.53 754924.00 761977.05 711069.86 839643.12 832558.14 772717.85 856873.95
19 177803.68 336327.21 289262.57 386911.75 377473.19 435550.90 494947.27 529786.11 644599.08 677575.15 780766.46 801482.41 769412.53 824667.07 834039.02 766943.05
20 92431.60 127738.15 246482.02 328990.30 358445.67 463725.70 506841.50 563060.42 639569.24 701732.54 742159.18 735256.71 709869.04 727289.30 784353.87 818872.62

Seq len \Pred delay 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Baseline 203862.51 220356.84 235782.32 252768.0 271499.43 289583.46 304155.39 318705.00 332807.95 345200.49 356135.51 369818.43 378936.00 389512.29 400302.97
1 465018.50 464018.74 498213.05 520992.74 518259.20 559413.49 560825.06 589616.98 600966.16 637464.81 650806.83 694900.57 716247.85 711559.98 733427.00
2 508735.03 516464.88 544618.53 535204.06 561790.77 587443.87 581202.89 613474.50 655846.76 676901.85 708429.21 720449.22 742855.87 771583.19 762453.62
3 702993.79 731473.38 643794.64 664007.95 722086.76 926664.14 694654.99 1008764.30 800790.94 968380.29 778741.97 921215.13 965336.73 1071960.90 1081181.34
4 503243.92 465439.76 484374.00 526499.41 526061.87 674247.79 709763.41 778140.17 671232.84 758226.86 791630.48 786896.07 760550.18 841316.65 815892.98
5 524968.81 499938.71 535337.56 516742.66 556024.64 662578.89 583794.41 676541.45 673022.32 696082.81 740338.03 682567.58 625559.21 812878.45 514020.65
6 493373.23 511553.66 554080.59 558922.83 554785.50 573241.77 585651.62 611655.16 653284.73 658257.78 712444.54 610123.32 628625.24 581677.97 650709.62
7 510258.99 486906.53 505059.24 534378.80 545777.51 610249.98 585939.58 561316.03 612864.12 637533.69 612082.57 592098.87 682665.53 665693.74 674784.56
8 595969.28 604085.54 623633.50 666930.85 626944.64 653382.16 672234.08 720847.94 695466.83 748473.14 719484.43 740046.00 757656.92 782583.52 828083.16
9 600842.44 653714.08 650798.35 662610.85 699298.56 817336.56 695489.96 770042.61 748371.95 929737.62 804405.19 797312.26 911902.58 1025095.04 999628.13
10 615730.23 657926.49 661954.43 656348.19 656776.41 731845.63 690111.72 752913.68 719077.46 753922.90 847615.22 793421.88 799075.66 810909.78 877902.25
11 573812.35 663554.68 735564.69 708954.11 694698.91 716828.06 753579.60 643582.36 758246.46 787057.59 794466.80 805783.02 805693.73 836930.57 831355.58
12 675487.13 609932.33 637335.37 664105.26 729095.32 719496.11 667497.79 720601.75 761690.64 769737.48 870555.69 821674.12 836087.26 797179.86 846977.19
13 684053.64 712470.83 564748.02 721630.18 657856.05 738567.73 791985.94 757747.11 892588.02 855396.09 765022.03 775623.40 870901.64 772078.72 911957.32
14 594119.67 612758.98 587899.70 625988.66 714258.99 700156.58 870937.20 832972.87 939987.46 840500.38 868857.27 784062.89 731374.07 656221.16 758703.82
15 586764.37 674179.01 623464.48 712339.99 722550.50 754066.89 822219.30 886447.01 804988.99 866735.82 897851.80 677509.96 858976.22 627199.86 671564.73
16 677400.93 713181.82 688821.41 732790.10 740193.65 780021.62 763165.56 779940.64 877623.12 881314.13 867867.76 772169.23 792980.36 724815.84 726459.10
17 761179.85 679171.16 791224.80 843581.19 841680.02 871536.16 869053.19 870345.09 1022355.42 894217.46 661445.58 944482.67 711995.25 755063.36 950209.67
18 860155.79 769733.63 927147.22 938084.23 800876.90 1017844.43 880343.21 1009178.40 1179220.89 1253130.47 754982.89 808261.23 786221.28 894466.47 685521.60
19 831866.10 865393.09 818193.80 912353.27 731456.26 950059.49 948294.16 698699.88 929961.04 802406.83 893976.85 775074.30 742080.66 1224464.87 711889.34
20 784020.48 852447.60 867154.95 831562.48 885873.66 899789.37 730618.95 1048732.00 693482.02 763481.00 812617.18 729273.55 626913.03 725414.63 653858.54

Seq len \Pred delay 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Baseline 409453.83 419214.39 428314.20 435894.95 449262.58 464487.60 474937.63 481525.70 490660.37 497143.08 502482.92 509077.93 511429.60 513962.67 520778.32
1 760507.75 785675.82 787872.36 784783.19 825410.53 851829.01 857614.83 884186.68 913852.36 917328.98 924707.72 937355.95 962280.60 969311.58 974272.24
2 791088.25 858765.46 838750.09 871133.85 925333.66 898322.67 922330.20 917593.61 945255.89 939083.48 965970.87 1005919.14 1022575.19 1007066.24 1030185.93
3 959339.93 1359390.55 1269793.15 1251849.98 1374755.30 1478084.94 1473762.72 1331065.63 1312827.03 1434639.18 1233751.87 1544063.31 1487546.73 1317436.09 1545380.86
4 821383.37 851401.63 873233.40 899035.42 864308.96 849566.46 946864.90 936822.27 948942.85 863807.78 1009157.97 973136.63 1001111.97 963898.18 950551.48
5 835693.88 461709.88 517739.66 523830.21 506987.29 554376.07 609792.90 560113.25 597174.06 948335.84 689863.22 480158.72 559994.17 502392.24 961724.69
6 701083.30 599432.26 539857.35 559524.65 574633.39 538496.62 572174.00 558134.29 537519.87 526179.84 539691.54 525675.96 552239.42 561930.95 502461.19
7 641462.45 732494.33 754222.82 751275.28 761989.47 803166.01 774341.49 752510.97 794976.61 726267.89 696558.91 699992.77 705937.27 693618.31 702088.21
8 847906.60 849056.39 883112.54 856640.36 859562.71 935471.91 884899.30 921791.32 880561.53 873379.66 893231.08 879173.99 883532.10 923839.08 856348.51
9 978041.29 902049.03 990642.43 1002009.49 1002929.20 934367.45 1005391.50 1014807.48 1072382.74 1010742.51 1054349.56 972790.17 846491.42 883798.37 933458.54
10 787019.50 875106.48 801396.87 921471.32 889289.04 796200.98 934718.65 1005520.20 955981.18 916030.04 896321.78 978883.84 944765.80 928628.51 715666.81
11 915478.87 854044.21 847834.71 928215.56 844374.24 831976.84 868463.14 875201.18 886344.19 817118.57 861822.30 941138.80 911827.94 760446.13 862147.12
12 951115.15 826710.16 844872.58 881523.53 880432.35 833510.74 802127.68 922348.14 880611.11 833713.67 877624.77 836441.39 868988.92 896040.09 960588.43
13 713381.21 673903.22 732647.64 700593.62 674193.59 684313.97 993770.98 731617.22 629684.60 671518.43 752815.07 1178478.24 843104.44 857452.82 827953.11
14 750245.33 775296.21 663131.21 821504.56 682599.69 676747.81 834246.56 826255.06 701333.89 605846.14 684910.02 837300.87 841915.14 883977.18 822658.02
15 852270.82 749256.29 774268.53 841549.02 812423.30 859970.37 842311.72 739421.53 864746.97 923480.99 841967.28 758162.68 928735.26 861929.32 839735.53
16 783297.03 853094.75 733203.97 765906.57 1109133.48 651011.00 841294.25 794670.19 845188.90 797164.64 753877.06 892716.75 891717.34 771937.31 764169.94
17 843072.79 834920.76 682223.80 745638.36 755367.04 821988.89 664015.02 789193.82 732234.90 781032.21 851219.36 700430.83 713581.26 686718.65 784790.31
18 868499.73 1100052.62 735484.02 697370.03 614741.06 892650.59 709112.87 832627.05 636513.62 672066.03 726424.03 722068.77 1234088.60 1465394.59 635129.60
19 690543.61 1332536.09 753198.60 746482.70 826189.85 751982.88 694252.41 763648.90 878385.22 772588.51 681869.62 786660.22 730559.77 834668.62 931304.31
20 697481.29 651552.72 644181.46 687370.88 848269.23 626335.70 1167349.77 807595.10 961936.29 650700.56 1004900.05 987796.89 667679.28 705625.35 795956.28

Seq len \Pred delay 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Baseline 529057.97 538527.13 554611.66 575083.60 596309.17 611686.70 629897.62 654966.80 681411.24 709437.02 743835.24 768599.14 793421.46 816462.56 837213.29
1 1001048.53 977113.45 995731.74 1017462.00 994743.42 1013997.62 1008002.52 1019169.40 1011015.00 1014860.33 1055054.14 1044167.74 1070891.32 1087562.03 1045589.31
2 1005197.12 1027349.53 1064231.92 1015901.31 1029975.96 1042544.85 1023461.29 1050658.38 1085112.20 1073864.39 1109756.19 1070984.97 1097591.54 1112534.12 1098227.22
3 1212807.83 1325989.05 1288512.67 1514403.05 1382778.75 1713175.99 1648632.78 1634307.64 1688130.14 1714404.91 1351469.98 1517994.08 1692157.55 1784349.21 1703928.63
4 497683.42 1153125.19 1101699.04 1065366.95 817104.07 1133860.22 1111110.66 1148402.90 1178297.16 1159329.01 1161347.30 1201886.10 1200173.93 1212308.78 1299476.64
5 1035664.06 427127.61 811420.16 1032481.87 720900.14 580875.06 485948.94 1103415.80 990419.27 615215.74 468073.24 650545.85 1107633.16 1248078.22 1302700.82
6 551390.06 428732.58 442335.40 457719.09 466281.31 502880.29 437112.43 494627.91 593946.88 547816.67 536625.53 478016.57 588263.75 577668.08 727087.88
7 630955.39 610354.12 590603.26 631333.75 786052.34 792107.71 557037.16 766961.22 616501.65 682448.44 694424.31 621032.15 619460.03 568190.64 618370.99
8 849749.76 802259.91 737969.39 814907.13 748374.02 823536.63 985450.84 829204.30 790699.71 831912.75 847605.97 786816.60 842208.51 774311.20 764341.38
9 823550.85 863420.17 817457.53 872826.01 895779.19 940512.79 834618.71 989910.34 978484.20 783615.00 1052215.10 1020913.75 913896.06 781045.17 865955.54
10 787976.84 864747.42 889377.80 841413.35 845991.68 794817.46 901290.42 964366.50 877689.25 780986.47 762486.40 824747.25 902407.47 783411.40 799647.95
11 892871.36 909895.23 873063.88 980280.51 857807.95 792949.32 859525.04 868594.05 796708.79 764526.26 728405.10 708347.52 627371.17 730107.69 513804.81
12 905268.50 867376.43 807043.43 831595.38 680315.06 848066.26 642019.60 772323.26 676936.53 665975.12 739664.81 668912.33 677497.76 652740.51 470979.43
13 641931.82 712086.72 762166.04 735535.60 646630.85 821092.53 1049264.49 603180.49 762232.53 545654.56 806877.85 743871.97 877410.95 866223.23 788464.16
14 880924.10 806225.82 673111.91 575472.92 928057.28 1031389.89 831417.38 696805.27 659841.11 925782.19 568920.75 696934.40 464486.72 689255.65 903153.50
15 865926.42 831658.59 580126.67 728736.37 969770.94 836355.96 807992.29 743941.14 834324.77 737492.52 684564.80 797835.53 660932.56 467023.53 397840.62
16 833202.25 810083.07 648769.67 544008.72 748848.20 802023.71 627068.03 795925.88 727246.71 594887.52 533105.44 746921.89 664459.35 421678.10 815168.02
17 662625.49 1035132.85 568225.67 831267.24 904826.61 913893.84 841991.25 732173.68 793970.97 812308.50 463760.97 539350.64 782221.70 620146.85 839382.95
18 863531.16 1179134.39 973754.74 860835.74 613231.34 720852.37 666622.18 862027.85 972764.41 998595.31 968152.21 939857.77 452806.18 882445.01 789274.62
19 729926.73 575648.76 833609.66 816397.84 874686.02 791443.80 843429.20 377782.41 652381.73 767797.03 406688.32 826826.55 678787.23 681171.56 849442.33
20 866793.74 848905.53 598495.94 676503.12 895548.10 613288.93 716007.16 873194.23 791368.97 605052.39 872703.37 717007.06 792284.81 681038.90 673726.48

Table 7: L2 errors of the prediction of the test set for different sequence lengths and prediction delays.


Seq length \Pred delay 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Baseline 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1 -234.58 2577.89 13323.67 32547.76 51094.46 56967.06 123669.95 134439.47 177219.86 191152.92 206498.63 263299.92 254588.61 269732.52 273790.82 263998.98
2 429.49 7842.73 21341.08 63028.72 102878.55 137360.89 212931.39 245325.74 287146.37 373474.09 312452.88 374970.95 342582.82 375360.74 335851.11 306537.89
3 8591.16 1367.66 15275.59 35319.93 94933.92 138639.51 243816.07 228678.14 347386.52 377643.85 415115.65 417590.35 490854.92 428201.12 510413.93 485795.88
4 3099.31 7592.52 31615.53 58382.42 135151.21 207243.83 277062.92 246386.42 606596.0 297121.94 466892.48 298861.81 493388.30 320935.47 327294.62 259113.75
5 4216.92 3881.55 10418.93 62929.05 85772.43 130954.86 253892.63 502099.39 371649.68 338216.43 351476.19 352455.76 427731.36 378244.43 332900.42 308259.91
6 4146.96 11266.73 40667.36 60917.51 108491.87 122517.49 205225.66 246292.05 327617.31 328610.16 362556.82 335467.29 376838.05 333442.45 357641.53 300771.60
7 8072.27 20107.22 30087.10 49238.95 63214.68 116542.27 218657.46 193042.30 313762.09 348335.54 333130.72 325179.70 396121.05 338156.54 321304.23 301891.31
8 33817.85 55705.49 96963.58 174627.85 189767.35 229579.37 356134.79 288085.13 325064.28 362155.20 437304.76 448085.24 416272.85 393124.90 317743.63 342769.90
9 94394.54 319576.77 340643.43 488350.16 445114.62 427930.79 469133.22 489432.98 433398.41 577780.05 374999.03 435990.58 474461.72 429258.59 389045.40 399607.38
10 73304.21 100865.64 145500.32 183377.47 275214.64 330706.85 363641.31 375133.87 394646.43 452145.41 524574.21 517929.26 594786.67 418392.74 421413.99 429878.00
11 31040.28 66540.68 90181.20 123934.92 261682.82 300106.07 288475.93 407139.56 390425.42 392629.41 435793.37 571301.65 384117.45 377127.57 361835.02 357183.21
12 13679.36 49411.73 94368.83 118764.07 190625.14 214440.71 248931.66 291421.63 344494.54 361023.41 387280.63 381990.80 383954.44 397477.63 496519.56 394764.06
13 13073.88 38606.28 70869.58 179207.56 164308.53 250679.84 288846.25 307319.58 383774.93 295252.31 556993.62 350145.17 382455.20 568487.55 416407.64 518481.78
14 9039.39 34407.89 56802.36 80173.71 104072.69 174529.74 222282.70 290511.70 343903.97 346443.12 340983.55 387071.38 421460.76 412423.10 346878.59 592279.03
15 19275.45 32980.40 64575.27 78865.32 133054.41 158248.58 293030.50 275566.58 334935.24 337001.40 354250.82 390821.16 403594.01 570474.39 406906.37 442476.01
16 26539.29 62037.70 83157.99 115256.36 127897.48 155932.17 274995.77 296116.65 276538.33 379004.35 495026.04 500234.42 554561.13 383025.51 490632.20 506076.13
17 65987.99 134004.88 112956.53 212636.75 208021.33 246568.53 283338.82 372010.28 457184.87 478720.78 498001.82 572446.20 404529.67 584127.91 580249.26 572027.23
18 265837.56 243176.25 279066.37 314087.57 417488.61 375731.67 404370.17 484590.56 442603.72 663626.44 655462.25 589092.22 707116.24 684068.97 605934.49 673077.61
19 170486.89 322140.11 267149.94 358125.65 341026.90 393295.90 444397.12 466908.87 568840.27 586277.59 674251.65 679504.77 636885.65 676177.90 667255.67 583146.71
20 85114.81 113551.06 224369.39 300204.20 321999.38 421470.70 456291.36 500183.18 563810.43 610434.98 635644.37 613279.07 577342.16 578800.13 617570.51 635076.28

Seq length \Pred delay 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Baseline 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1 261155.99 243661.90 262430.73 268224.74 246759.77 269830.03 256669.67 270911.98 268158.21 292264.32 294671.32 325082.13 337311.84 322047.69 333124.04
2 304872.52 296108.04 308836.21 282436.07 290291.34 297860.41 277047.50 294769.50 323038.81 331701.36 352293.70 350630.79 363919.86 382070.90 362150.65
3 499131.27 511116.55 408012.32 411239.96 450587.33 637080.68 390499.60 690059.30 467982.98 623179.81 422606.46 551396.70 586400.72 682448.62 680878.37
4 299381.40 245082.92 248591.68 273731.41 254562.44 384664.33 405608.02 459435.17 338424.88 413026.37 435494.97 417077.64 381614.18 451804.36 415590.01
5 321106.30 279581.87 299555.24 263974.67 284525.21 372995.44 279639.02 357836.45 340214.37 350882.32 384202.52 312749.15 246623.21 423366.16 113717.68
6 289510.72 291196.82 318298.28 306154.83 283286.07 283658.31 281496.23 292950.16 320476.77 313057.29 356309.03 240304.89 249689.24 192165.68 250406.65
7 306396.47 266549.69 269276.92 281610.80 274278.08 320666.52 281784.19 242611.03 280056.17 292333.20 255947.06 222280.43 303729.52 276181.46 274481.59
8 392106.77 383728.70 387851.18 414162.86 355445.21 363798.71 368078.69 402142.94 362658.87 403272.65 363348.92 370227.56 378720.92 393071.23 427780.19
9 396979.93 433357.24 415016.04 409842.86 427799.13 527753.11 391334.57 451337.61 415564.0 584537.13 448269.68 427493.82 532966.58 635582.75 599325.17
10 411867.72 437569.65 426172.12 403580.20 385276.98 442262.17 385956.33 434208.68 386269.51 408722.41 491479.71 423603.44 420139.65 421397.50 477599.28
11 369949.84 443197.84 499782.37 456186.12 423199.48 427244.60 449424.21 324877.36 425438.50 441857.10 438331.29 435964.59 426757.73 447418.28 431052.62
12 471624.62 389575.49 401553.05 411337.27 457595.89 429912.65 363342.40 401896.75 428882.69 424536.99 514420.18 451855.69 457151.26 407667.58 446674.22
13 480191.13 492113.99 328965.70 468862.19 386356.62 448984.27 487830.55 439042.11 559780.07 510195.61 408886.52 405804.96 491965.63 382566.43 511654.35
14 390257.16 392402.14 352117.38 373220.67 442759.56 410573.12 566781.81 514267.87 607179.50 495299.89 512721.76 414244.46 352438.06 266708.87 358400.85
15 382901.85 453822.17 387682.16 459572.0 451051.07 464483.43 518063.91 567742.01 472181.03 521535.33 541716.28 307691.53 480040.22 237687.57 271261.76
16 473538.41 492824.98 453039.09 480022.11 468694.22 490438.16 459010.17 461235.64 544815.16 536113.64 511732.25 402350.80 414044.35 335303.55 326156.13
17 557317.34 458814.32 555442.48 590813.19 570180.59 581952.70 564897.80 551640.09 689547.46 549016.97 305310.07 574664.24 333059.25 365551.07 549906.70
18 656293.28 549376.79 691364.90 685316.23 529377.47 728260.98 576187.82 690473.40 846412.94 907929.98 398847.38 438442.80 407285.28 504954.18 285218.63
19 628003.59 645036.25 582411.48 659585.27 459956.83 660476.04 644138.77 379994.88 597153.09 457206.34 537841.33 405255.86 363144.66 834952.58 311586.38
20 580157.97 632090.76 631372.63 578794.48 614374.23 610205.91 426463.56 730027.0 360674.07 418280.51 456481.67 359455.11 247977.03 335902.34 253555.57

Seq length \Pred delay 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Baseline 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1 351053.92 366461.43 359558.16 348888.23 376147.95 387341.41 382677.20 402660.98 423191.98 420185.90 422224.80 428278.02 450851.0 455348.91 453493.91
2 381634.43 439551.08 410435.88 435238.90 476071.09 433835.06 447392.57 436067.91 454595.51 441940.40 463487.96 496841.21 511145.59 493103.57 509407.61
3 549886.11 940176.17 841478.95 815955.03 925492.72 1013597.34 998825.09 849539.92 822166.65 937496.10 731268.95 1034985.38 976117.12 803473.42 1024602.54
4 411929.55 432187.25 444919.20 463140.47 415046.38 385078.86 471927.27 455296.57 458282.48 366664.70 506675.05 464058.71 489682.37 449935.51 429773.16
5 426240.05 42495.49 89425.45 87935.25 57724.72 89888.47 134855.27 78587.55 106513.69 451192.76 187380.30 -28919.20 48564.57 -11570.44 440946.37
6 291629.48 180217.88 111543.15 123629.69 125370.81 74009.02 97236.37 76608.58 46859.49 29036.76 37208.62 16598.03 40809.82 47968.28 -18317.13
7 232008.63 313279.94 325908.62 315380.32 312726.90 338678.41 299403.86 270985.27 304316.24 229124.81 194076.0 190914.85 194507.67 179655.64 181309.89
8 438452.78 429842.01 454798.33 420745.41 410300.14 470984.30 409961.67 440265.62 389901.16 376236.58 390748.16 370096.06 372102.50 409876.41 335570.19
9 568587.46 482834.64 562328.23 566114.54 553666.62 469879.85 530453.87 533281.77 581722.37 513599.43 551866.64 463712.24 335061.82 369835.69 412680.22
10 377565.67 455892.09 373082.67 485576.36 440026.46 331713.38 459781.02 523994.50 465320.80 418886.96 393838.86 469805.91 433336.20 414665.84 194888.49
11 506025.05 434829.82 419520.51 492320.61 395111.67 367489.23 393525.51 393675.48 395683.82 319975.49 359339.38 432060.88 400398.34 246483.45 341368.80
12 541661.33 407495.77 416558.38 445628.57 431169.78 369023.14 327190.05 440822.44 389950.73 336570.59 375141.85 327363.46 357559.32 382077.42 439810.11
13 303927.39 254688.83 304333.44 264698.66 224931.02 219826.36 518833.34 250091.52 139024.23 174375.35 250332.15 669400.31 331674.84 343490.15 307174.79
14 340791.50 356081.82 234817.01 385609.60 233337.11 212260.21 359308.93 344729.36 210673.52 108703.06 182427.10 328222.94 330485.54 370014.51 301879.70
15 442816.99 330041.90 345954.32 405654.07 363160.72 395482.76 367374.09 257895.83 374086.60 426337.91 339484.36 249084.76 417305.66 347966.65 318957.21
16 373843.20 433880.37 304889.77 330011.62 659870.90 186523.40 366356.62 313144.49 354528.52 300021.56 251394.14 383638.82 380287.74 257974.63 243391.61
17 433618.97 415706.37 253909.60 309743.41 306104.46 357501.28 189077.39 307668.12 241574.53 283889.13 348736.44 191352.90 202151.65 172755.98 264011.99
18 459045.91 680838.24 307169.82 261475.08 165478.48 428162.99 234175.24 351101.34 145853.25 174922.95 223941.11 212990.85 722659.0 951431.92 114351.27
19 281089.78 913321.70 324884.40 310587.75 376927.27 287495.28 219314.78 282123.20 387724.85 275445.43 179386.70 277582.29 219130.16 320705.95 410525.98
20 288027.47 232338.34 215867.26 251475.93 399006.65 161848.09 692412.14 326069.40 471275.92 153557.48 502417.14 478718.96 156249.68 191662.68 275177.95

Seq length \Pred delay 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
Baseline 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1 471990.56 438586.32 441120.08 442378.40 398434.24 402310.92 378104.90 364202.60 329603.76 305423.31 311218.90 275568.60 277469.86 271099.47 208376.02
2 476139.15 488822.39 509620.25 440817.72 433666.79 430858.15 393563.66 395691.59 403700.96 364427.37 365920.95 302385.83 304170.08 296071.57 261013.93
3 683749.86 787461.91 733901.01 939319.45 786469.58 1101489.29 1018735.15 979340.84 1006718.90 1004967.89 607634.74 749394.95 898736.09 967886.66 866715.34
4 -31374.55 614598.06 547087.38 490283.36 220794.89 522173.52 481213.04 493436.10 496885.93 449891.98 417512.06 433286.96 406752.48 395846.23 462263.35
5 506606.09 -111399.52 256808.49 457398.28 124590.97 -30811.64 -143948.68 448449.0 309008.03 -94221.29 -275762.0 -118053.29 314211.71 431615.67 465487.53
6 22332.10 -109794.55 -112276.26 -117364.50 -130027.86 -108806.41 -192785.19 -160338.89 -87464.36 -161620.35 -207209.71 -290582.57 -205157.71 -238794.48 -110125.41
7 101897.42 71826.98 35991.60 56250.15 189743.17 180421.01 -72860.47 111994.42 -64909.58 -26988.58 -49410.93 -147566.99 -173961.43 -248271.91 -218842.30
8 320691.80 263732.78 183357.73 239823.53 152064.85 211849.93 355553.21 174237.50 109288.47 122475.73 103770.73 18217.47 48787.06 -42151.35 -72871.91
9 294492.89 324893.03 262845.87 297742.42 299470.01 328826.09 204721.09 334943.54 297072.96 74177.98 308379.86 252314.61 120474.60 -35417.38 28742.25
10 258918.87 326220.28 334766.14 266329.76 249682.51 183130.76 271392.80 309399.71 196278.01 71549.45 18651.15 56148.12 108986.01 -33051.16 -37565.34
11 363813.40 371368.09 318452.22 405196.91 261498.78 181262.62 229627.42 213627.26 115297.55 55089.23 -15430.14 -60251.61 -166050.28 -86354.87 -323408.48
12 376210.53 328849.30 252431.77 256511.78 84005.89 236379.56 12121.98 117356.46 -4474.71 -43461.91 -4170.43 -99686.81 -115923.7 -163722.05 -366233.86
13 112873.85 173559.59 207554.38 160452.0 50321.67 209405.83 419366.86 -51786.30 80821.29 -163782.46 63042.61 -24727.17 83989.49 49760.67 -48749.13
14 351866.13 267698.68 118500.25 389.32 331748.11 419703.18 201519.75 41838.47 -21570.13 216345.17 -174914.50 -71664.73 -328934.73 -127206.90 65940.21
15 336868.45 293131.46 25515.01 153652.78 373461.77 224669.26 178094.67 88974.34 152913.53 28055.50 -59270.44 29236.40 -132488.90 -349439.03 -439372.67
16 304144.28 271555.93 94158.01 -31074.88 152539.03 190337.0 -2829.59 140959.09 45835.47 -114549.51 -210729.81 -21677.24 -128962.11 -394784.45 -22045.27
17 133567.52 496605.72 13614.01 256183.64 308517.44 302207.13 212093.62 77206.88 112559.73 102871.48 -280074.27 -229248.50 -11199.75 -196315.71 2169.66
18 334473.19 640607.25 419143.08 285752.14 16922.17 109165.67 36724.56 207061.05 291353.17 289158.29 224316.97 171258.64 -340615.28 65982.46 -47938.67
19 200868.76 37121.63 278997.99 241314.24 278376.85 179757.10 213531.58 -277184.39 -29029.51 58360.01 -337146.92 58227.42 -114634.23 -135290.99 12229.04
20 337735.78 310378.40 43884.28 101419.52 299238.93 1602.23 86109.53 218227.44 109957.73 -104384.63 128868.13 -51592.07 -1136.65 -135423.65 -163486.81


Figure 9: The L2 losses of the test set with different prediction delays and sequence lengths.

Figure 10: The L2 losses of the test set with different prediction delays and sequence lengths, relative to the baseline.


Figure 11: The L2 losses of the test set for each sequence length, with the prediction delay on the x-axis, shown together with the baseline.


Figure 12: Prediction of the test set using sequence length 1 and prediction delay 0, which is the prediction with the lowest L2 error.

Figure 13: Prediction of the test set using sequence length 15 and prediction delay 60, which is the prediction with the lowest L2 error relative to the baseline.


6 Discussion and conclusion

6.1 Data scaling

Table 3 and Figure 7 show clearly that scaling both the features and the target outperforms the other two methods. This is not the expected result: scaling was expected to improve the convergence speed, but not particularly the final performance. The effect cannot be due to the learning rate, as all three methods used the same initial learning rate of 10^-3, which is decreased with a factor of 0.4 every time the validation loss does not decrease within 5 epochs. Gradient descent therefore led to convergence for all three methods, but not equally effectively. Possibly it is more difficult to fit a model to very large values than to scaled values.
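The learning rate schedule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this thesis: the function name learning_rates is hypothetical, "decreases with factor 0.4" is assumed to mean multiplication by 0.4, and "does not decrease within 5 epochs" is assumed to mean 5 consecutive epochs without a new lowest validation loss.

```python
def learning_rates(val_losses, initial_lr=1e-3, factor=0.4, patience=5):
    """Return the learning rate in effect during each epoch.

    Assumption: the rate is multiplied by `factor` once the validation
    loss has failed to reach a new minimum for `patience` epochs in a row.
    """
    lr = initial_lr
    best = float("inf")
    epochs_without_improvement = 0
    rates = []
    for loss in val_losses:
        rates.append(lr)  # rate used while training this epoch
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                lr *= factor  # reduce the rate and restart the count
                epochs_without_improvement = 0
    return rates


# Hypothetical validation losses: two improving epochs, then a plateau.
example_losses = [1.0, 0.9, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95]
example_rates = learning_rates(example_losses)
```

In this toy run the rate stays at 10^-3 for the first seven epochs and drops to approximately 4 * 10^-4 for the eighth. Since the schedule depends only on the validation loss, it behaves identically in form for all three scaling methods, which is the point made in the paragraph above.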

6.2 Feature selection

It was expected that using more features would lead to more effective predictions: the network has more data to base a prediction on, and when a feature does not contribute to the prediction, its weights can be set to 0 during fitting so that the feature is ignored. However, the best combination found consists of the following 9 features: average USD price, BTC in circulation, median confirmation time, no orphaned blocks, output value, estimated USD transaction value, transactions per block, market capitalization and cost per transaction. This combination performs better than using all 19 features. It may be the case that some features relate differently to the target values in the train set than they do in the test set, which causes the predictions to deteriorate.

The greedy method of finding the best combination of features is incomplete, as many combinations are never tested by this method. Future studies may use other methods for finding the optimal combination of features.
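To illustrate why a greedy search is incomplete, the sketch below shows one common variant, greedy forward selection. It is an assumption that the search in this thesis follows this exact form; the function names and the toy loss are hypothetical, and in practice evaluate would train the network on the given features and return its validation loss.

```python
def greedy_forward_selection(features, evaluate):
    """Repeatedly add the single feature that lowers the loss the most.

    `evaluate` maps a list of feature names to a loss (lower is better).
    The search stops as soon as no remaining feature improves the loss.
    """
    selected = []
    best_loss = float("inf")
    remaining = list(features)
    while remaining:
        # Score every one-feature extension of the current selection.
        scored = [(evaluate(selected + [f]), f) for f in remaining]
        loss, feature = min(scored)
        if loss >= best_loss:
            break  # no candidate helps, so stop searching
        selected.append(feature)
        remaining.remove(feature)
        best_loss = loss
    return selected, best_loss


def toy_loss(subset):
    """Hypothetical loss: useful features lower it, a noisy one raises it."""
    gains = {"price": 5.0, "market_cap": 2.0, "noise": -1.0}
    return 10.0 - sum(gains[f] for f in subset)


chosen, loss = greedy_forward_selection(["noise", "price", "market_cap"], toy_loss)
# chosen == ["price", "market_cap"], loss == 3.0
```

With 19 candidate features, a search of this form evaluates at most 19 + 18 + ... + 1 = 190 feature sets, whereas there are 2^19 - 1 = 524287 non-empty combinations in total, so the vast majority of combinations is never tested.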

6.3 Testing different prediction delays and sequence lengths

The best absolute result is obtained with a sequence length of 1 and a prediction delay of 0. This result is unexpected, as using a sequence length of 1 is essentially similar to making predictions with an ordinary artificial neural network. The expectation was that a longer sequence length would lead to better prediction results, but the experiment shows that a longer sequence length does not necessarily lead to better results. This may be caused by the selection of features in subsection 5.2, which was tested using a sequence length of 1 and a prediction delay of 0: the selected features work best at those settings, and another selection of features may have led to different results.


The result is only slightly better than the baseline: the model has an L2 error of 7082.21 and the baseline has an L2 error of 7316.79, a reduction of about 3%.

The best result relative to the baseline is achieved using sequence length 15 and prediction delay 60, where the L2 error is 439372.67 lower than that of the baseline. This may be because the baseline prediction already has a large error at a large prediction delay, such that even a 'random' prediction may be better than the baseline.
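The baseline comparison can be made concrete with a small sketch. This is an illustration under stated assumptions, not the thesis code: the helper names are hypothetical, the L2 loss is assumed to be the sum of squared errors, and a prediction delay of d is assumed to mean that the target lies d + 1 days after the last input day, so that delay 0 corresponds to next-day prediction.

```python
def make_windows(prices, seq_len, delay):
    """Pair each window of `seq_len` prices with its target price.

    Assumption: the target lies `delay + 1` steps after the last
    price in the window.
    """
    pairs = []
    for start in range(len(prices) - seq_len - delay):
        window = prices[start:start + seq_len]
        target = prices[start + seq_len + delay]
        pairs.append((window, target))
    return pairs


def l2_loss(predictions, targets):
    """Sum of squared errors (assumed definition of the L2 loss)."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets))


def baseline_loss(prices, seq_len, delay):
    """L2 loss when the last known price is used as the prediction."""
    pairs = make_windows(prices, seq_len, delay)
    predictions = [window[-1] for window, _ in pairs]
    targets = [target for _, target in pairs]
    return l2_loss(predictions, targets)


# Hypothetical toy price series, not real Bitcoin data.
toy_prices = [1.0, 2.0, 4.0, 7.0, 11.0]
loss_delay_0 = baseline_loss(toy_prices, seq_len=1, delay=0)  # 30.0
loss_delay_1 = baseline_loss(toy_prices, seq_len=1, delay=1)  # 83.0
```

On this toy series the baseline error rises from 30 to 83 when the delay goes from 0 to 1. A trending series moves further away from the last known price as the delay grows, which is consistent with the observation that the baseline is easier to beat at large prediction delays.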

6.4 Future work

Future studies may focus on the effects of different combinations of features, sequence lengths and prediction delays simultaneously, as those different combinations may lead to very interesting and effective prediction results.

Also, future work may investigate prediction of the Bitcoin price using a different timescale than the 24-hour time series, for example using data with an hourly resolution, which may lead to more effective prediction results.
