
Expected Shortfall Performance in Cryptocurrencies modelled by GARCH-EVT

J. Holtrop


Master Thesis Econometrics & Actuarial Studies

Master Thesis Econometrics and Actuarial Studies

J. Holtrop

s2242540

December 2018

Abstract


Contents

1 Introduction
2 Literature Review
3 Data analysis
 3.1 Stationarity
 3.2 Descriptive Statistics
 3.3 Normality
 3.4 Tail distribution
4 Methodology
 4.1 Autoregressive Moving-Average model
  4.1.1 Model selection
 4.2 Volatility models
  4.2.1 ARCH
  4.2.2 GARCH
  4.2.3 EGARCH
  4.2.4 CS-GARCH
 4.3 Extreme Value Theory
  4.3.1 Peak-Over-Threshold
 4.4 EVT-GARCH
 4.5 Risk measures
  4.5.1 Value-at-Risk
  4.5.2 Expected Shortfall
 4.6 Backtesting
  4.6.1 Unconditional Coverage test
  4.6.2 Independence test
  4.6.3 Conditional Coverage test
  4.6.4 Backtesting ES by Bootstrap
  4.6.5 Backtesting ES by V tests
5 Empirical results
 5.1 VaR breaches
 5.2 Results for Unconditional Coverage test
 5.3 Results for Independence test
 5.4 Results for Conditional Coverage test
 5.5 Bootstrap results
 5.6 V results
6 Concluding remarks


1 Introduction

During the years 2012-2017, the Bitcoin price series underwent exponential growth. According to Urquhart (2016), this growth can be attributed to Bitcoin's innovative features. Many other cryptocurrencies have emerged since the introduction of Bitcoin, such as Litecoin, Ethereum and Monero. The total number of exchangeable cryptocurrencies to date (September 2018) reaches over 1600. The top five leading cryptocurrencies are Bitcoin, Ethereum, XRP, Bitcoin Cash and EOS, with Bitcoin having the largest market capitalization of $113 billion (coinmarketcap.com), followed by Ethereum ($23.9 billion), XRP ($22.9 billion), Bitcoin Cash ($9.5 billion) and EOS ($5.2 billion). These five cryptocurrencies represent 78% of the total cryptocurrency market capitalization.

It is widely known that cryptocurrencies are extremely volatile compared to other available assets, commodities or equities. This has been a hot topic for risk managers, especially since the financial crisis of 2007. After the financial crisis, stricter risk management systems and measurements, such as the Conditional Value-at-Risk (cVaR), have been implemented under the Basel framework. Given that much is changing in the regulation of banks and that the market valuation of cryptocurrencies is large, it makes sense to consider risk measures for cryptocurrency portfolios.

In the current literature, two risk measures are commonly used in practice. In this paper, we consider the Value-at-Risk (VaR) and the Expected Shortfall (ES). The VaR is a well-known measure proposed by J.P. Morgan in 1994, which allows practitioners to estimate the amount of capital required to offset possible future losses at a given probability. A complementary risk measure to the VaR is the ES. While the VaR measures possible losses at a given probability, the ES estimates the losses given that the VaR has been exceeded and thus concentrates on the tail of the losses. In order to estimate these risk measures, we require a modelling approach which can take into account the (conditional) mean and variance of cryptocurrency time series.

A starting point to capture the (conditional) mean of the cryptocurrency series is the Autoregressive Moving-Average (ARMA) model. However, the ARMA fails to model the extreme volatility of cryptocurrency series, since it assumes the volatility to be constant over time. Therefore, the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model by Bollerslev (1986) is introduced, which, together with its extended versions, can capture the volatility over time.

These models can be used to forecast the VaR and ES by using a rolling window approach. However, as there is little research on the effects of different distributional assumptions for the innovations of GARCH-type models in cryptocurrency time series, we propose our first research question:

Does the distribution of the innovations influence the estimates of the risk measures of GARCH-type models in cryptocurrency time series?

Cryptocurrency time series are extremely volatile and fat tailed, as mentioned by Chu, Chan, Nadarajah, and Osterrieder (2017). The GARCH methodology may not fully capture this, as GARCH focuses on the dynamics of the distribution. Therefore, a different modelling approach is required which takes into account extremes and rare events in the tails of the distribution. We propose to adopt Extreme Value Theory (EVT), which has had great success in modelling the tails of financial data series.

Within the EVT methodology, we can choose between two different approaches to model the tails of cryptocurrency series: the Block Maxima (BM) and the Peak-Over-Threshold (POT). The BM approach considers an extreme observation within each "block" and requires a large dataset. In this research a relatively short dataset is used and we are interested in modelling observations which exceed a particular threshold. For this reason, we adopt the POT methodology and model the observations exceeding a threshold by using the Generalized Pareto Distribution (GPD).

In the paper by McNeil and Frey (2000), a two-step approach is proposed which combines the modelling approaches of the GARCH and EVT frameworks. They explain that both methods have a drawback when used individually on the loss series: a GARCH model on its own does not estimate the tails very well, whereas EVT performs poorly in estimating the dynamics of the distribution. Hence, by combining both methods, all aspects of the volatility of the time series can be captured and the risk measures can be estimated more accurately.

Currently, there exists literature on GARCH modelling and little literature on EVT modelling of cryptocurrencies. Until now, no literature is available on GARCH-EVT-POT modelling of cryptocurrencies. Modelling cryptocurrencies using a single method has limitations when estimating risk measures. According to McNeil and Frey (2000), combining GARCH with EVT provides an improved modelling approach for estimating risk measures of financial time series. However, does this apply to cryptocurrency time series as well? For this reason, we state the second research question of this paper:

Does the addition of the EVT framework to GARCH-type models improve the estimates of the risk measures in cryptocurrency time series?

To answer the two research questions of this paper, backtesting methodology is applied to the forecasted VaR and ES estimates of the GARCH-type and GARCH-EVT-type models. For the VaR, we adopt the popular backtesting methodology of Christoffersen (2003). As there have been some recent improvements in the ES backtesting literature, we follow the procedures of McNeil and Frey (2000) and Embrechts, Kaufmann, and Patie (2005).

Our findings from the backtests relating to the first research question suggest that different GARCH-type models under normal innovations do not differ when estimating the VaR and ES. Assuming rolling t-innovations, however, influences the estimation of both risk measures. Therefore, the innovations do not always affect the risk measure estimates of GARCH-type models in cryptocurrency time series.

Considering whether the inclusion of EVT improves the risk measures, we find that the addition significantly improves the estimation of the ES in the models with t-distributed innovations. Assuming normally distributed innovations, we find significant improvements in two out of three cryptocurrency series. However, contradictory results are found in the estimation of the VaR.


2 Literature Review

Since the introduction of the very first cryptocurrency in 2008, many other cryptocurrencies have emerged. The interest in holding a portfolio of cryptocurrencies remains very high, as many have made a fortune by investing in cryptocurrencies in the past. However, the risk of cryptocurrencies is not yet fully understood. This matters because cryptocurrency time series have been highly volatile in recent years, which makes it important to model their risk with GARCH modelling.

The very first to adopt GARCH modelling on Bitcoin were Glaser, Haferkorn, Siering, Weber, and Zimmermann (2014). Their goal was to investigate the intentions which drove users to invest in Bitcoin. A few years later, different GARCH-type models were studied by researchers such as Bouoiyour and Selmi (2015), Bouoiyour, Selmi, Tiwari, and Olayeni (2016), Dyhrberg (2016), Chan, Chu, Nadarajah, and Osterrieder (2017) and Katsiampa (2017). These researchers used popular volatility models such as the AR-CGARCH, TGARCH, CMT-GARCH and EGARCH on Bitcoin's daily price series. Based on goodness-of-fit measures (e.g. information criteria), optimal models were chosen.

Only recently, Chu et al. (2017) extended the number of cryptocurrencies to seven of the top fifteen by market capitalization. They fitted twelve different GARCH-type models, finding that the best fitting volatility models (based on information criteria) were the IGARCH and the GJR-GARCH. Caporale (2018) later extended the work of Chu et al. by including mixture models, such as Markov Switching GARCH-type models, on four cryptocurrency series.

Fitting GARCH-type models to cryptocurrency return series requires an assumption on the distribution of the innovations. In the literature, one can for example choose between the normal, Student's t or generalized error distributions. In the papers of Trucios (2018) and Ghalanos (2018), a wide overview of (other) popular error distributions can be found.

In the literature, different conclusions have been reached as to which distribution of the innovations estimates the volatility best. Liu, Shao, Wei, and Wang (2017), for example, found that t-distributed innovations are optimal for the Bitcoin series, while Chan et al. (2017) found that the Generalized Hyperbolic distribution fits the Bitcoin series best. Others, such as Naimy and Hayek (2018) and Peng, Albuquerque, Sa, Padula, and Montenegro (2018), also compared error distributions on the Bitcoin series when fitting GARCH-type models.


a small sample size suffices when forecasting volatility accurately ahead.

Almost no out-of-sample literature was available until 2018, and it has been growing since. Naimy and Hayek (2018) were among the first to apply one-step ahead volatility forecasts using GARCH models on the Bitcoin series. Trucios (2018) produced one-step ahead volatility forecasts from GARCH-type models and also included the outliers of the Bitcoin series in his research. Others, such as Angelini and Emili (2018), extended the number of cryptocurrencies by exploring the forecasting capabilities of different GARCH-type models. A few months later, Peng et al. (2018) extended this research by comparing the out-of-sample performance of GARCH-type models and machine learning techniques.

When investing in a portfolio of cryptocurrencies, one obtains a very high-risk asset. The volatility is an issue, and so is the possibility of fraud and criminal activity, as mentioned by Frunza (2016). In order to quantify the risk of holding cryptocurrencies, many rely on the famous risk measure proposed by J.P. Morgan in 1994. Ever since it was proposed, the VaR has been a widely used risk measure.

Researchers started by investigating the accuracy of the (one-step ahead) VaR for the Bitcoin series. A few examples are described in the papers of Osterrieder and Lorenz (2017), Stavroyiannis (2018), Trucios (2018) and Gkillas and Katsiampa (2018). Thereafter, researchers expanded the number of cryptocurrencies when estimating the VaR. Chu et al. (2017), for example, tested the performance of the one-step ahead VaR by using the unconditional and conditional coverage VaR exceedance tests of Christoffersen (1998). Similarly, Likitratcharoen, Ranong, Chuengsuksomboon, and Pansriwong (2018) researched different VaR types to quantify the risk and performance of several cryptocurrencies.

The VaR does have a few drawbacks, as seen in the paper by Acerbi and Tasche (2002). One important drawback is that the VaR fails the sub-additivity axiom and is thus not a coherent measure, as mentioned by Artzner, Delbaen, Eber, and Heath (1999). Failing this axiom means that the VaR of a portfolio composed of two assets may exceed the sum of the VaRs of the two individual assets, so the VaR need not reflect diversification benefits. Mathematically speaking, for two losses $\alpha, \beta$, the inequality $f(\alpha) + f(\beta) \ge f(\alpha + \beta)$ does not always hold when $f$ is the VaR.

A second important drawback is that once the VaR has been exceeded, it tells us nothing about the distribution of the losses in the far tail. Therefore, the ES has gained popularity, as this measure fulfills all of the axioms, as seen in the paper by Artzner et al. (1999). Moreover, the ES measures the tail of the distribution once the VaR has been exceeded.


Others estimated risk measures for several cryptocurrencies by applying the historical simulation approach. Caporale (2018) and Stavroyiannis (2018) both applied backtesting procedures to the one-step ahead VaR and ES. Stavroyiannis (2018) concentrated solely on the Bitcoin series using GJR-GARCH modelling, whereas Caporale (2018) extended the GARCH types by using a rolling window on four of the most popular cryptocurrencies.


3 Data analysis

The paper by Chu et al. (2017) considered seven of the top fifteen cryptocurrencies, ranked by market capitalization. As cryptocurrencies are highly volatile, the market capitalizations have changed since the time of that study. In this paper we choose three relevant series which are consistent with Chu et al. (2017), conditional on being in the top ten by market capitalization in September 2018. From coinmarketcap.com (2018), we choose the following three series: Bitcoin, XRP and Monero.

The data used in this paper is publicly available from coingecko.com (2018). It consists of 1462 historical daily closing prices from 22 June 2014 to 22 June 2018 for the Bitcoin series, and 1229 historical closing prices (22 June 2015 to 1 November 2018) for the XRP and Monero series. We choose this starting date for Bitcoin because Chu et al. (2017) use a similar starting date; however, we extend the end date by one year. In the case of XRP and Monero, we use more recent data due to computational issues when using a similar starting date.

3.1 Stationarity

It is common for financial series to exhibit non-stationarity. In very volatile time series like cryptocurrencies, we unquestionably find non-stationarity. This can, for example, be seen from the historical closing prices of the three cryptocurrencies in Figure 1.

We further observe that none of the cryptocurrency price series resembles a stationary process, due to the (upward) trend and peaks starting from 2017. This implies that the mean and the variance are not stable over time. A way to remedy the non-stationarity is to transform the closing price series into returns. By transforming the closing prices into log-returns, the series provides a smooth representation of the changes around zero.

We denote $P_t$ as the observed closing price at time $t$, where $t = 1,\dots,T$. Then the log-returns $R_t$ at time $t$ of the price series can be obtained as follows:

$$R_t = \ln\left(\frac{P_t}{P_{t-1}}\right). \qquad (1)$$

We note that from equation (1) it is possible to turn the daily log-returns into daily log-losses by taking the negative of $R_t$, i.e. $L_t = -R_t$. This will be of use later when calculating risk metrics.
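As a small illustration of equation (1), a minimal Python sketch is given below; it assumes the closing prices are held in a pandas Series indexed by date, and the function name is illustrative.

```python
import numpy as np
import pandas as pd

def log_losses(prices: pd.Series) -> pd.Series:
    """Compute daily log-returns R_t = ln(P_t / P_{t-1}) and negate them
    into log-losses L_t = -R_t, dropping the undefined first observation."""
    log_returns = np.log(prices / prices.shift(1)).dropna()
    return -log_returns
```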


In order to verify stationarity, we require formal testing by using the Augmented Dickey-Fuller (ADF) test by Dickey and Fuller (1979). When using the ADF test, we need to select a lag length. We choose a common rule by Schwert (1989), which is given in equation (2):

$$p_{\max} = \left\lfloor 12\left(\frac{N}{100}\right)^{0.25} \right\rfloor, \qquad (2)$$

where $N$ is the total number of observations.

The ADF test yields statistics of -16.97 (p-value = 0.01), -18.035 (p-value = 0.01) and -15.63 (p-value = 0.01) for the Bitcoin, Monero and XRP series respectively, with corresponding lag lengths 4, 3 and 3. This indicates that we reject the null hypothesis of a unit root and verify stationarity of all the series after taking log-returns as in equation (1).
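A minimal sketch of this test in Python follows, using the adfuller function from statsmodels with the Schwert rule of equation (2) as the maximum lag; the wrapper name is illustrative.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

def adf_with_schwert(returns):
    """Run the ADF test with the Schwert (1989) maximum lag of equation (2)."""
    n = len(returns)
    p_max = int(np.floor(12 * (n / 100) ** 0.25))  # equation (2)
    stat, pvalue, used_lag, *_ = adfuller(returns, maxlag=p_max, autolag="AIC")
    return stat, pvalue, used_lag
```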

3.2 Descriptive Statistics

Table 1 presents the descriptive statistics of the return series of Bitcoin, XRP and Monero. From the statistics, we observe that the daily means are around 0.1-0.45% and that the daily medians for Bitcoin and Monero are positive, whereas the median for XRP is negative. The standard deviations are around 4-7.5%. This indicates that most of the observations fluctuate around a mean of 0%, which agrees with the stylized properties of financial time series. This can also be observed from Figures 2, 3 and 4.

In the final two rows of Table 1 we find the kurtosis and skewness of the return series. The kurtosis of each cryptocurrency is (much) higher than three, the kurtosis of the normal distribution. This provides evidence of leptokurtic behavior in cryptocurrency time series. When observing the skewness, we find that only Bitcoin has a negative skewness, whereas both XRP and Monero have positive skewness.

In Figures 5, 6 and 7 we investigate the AutoCorrelation Function (ACF) of the daily returns and squared returns of Bitcoin, Monero and XRP respectively, for up to a maximum of twenty lags. In each of the figures, we include a dashed line which represents the 95% confidence band.

The ACF plots tell us that most of the autocorrelations of the daily returns are close to zero, except for two lags in the Monero series and three in the XRP series. Moreover, when observing the autocorrelation of the squared returns, we find that more than 5% of the squared daily returns lie outside the confidence band. From these observations, we find evidence for fitting an ARMA model.


3.3 Normality

Before investigating the tail distribution, testing the normality of the distribution is required. First, we plot the empirical distributions against the normal distribution. Then, we formally test the distribution of the return series by using the Jarque-Bera (J-B) and the Shapiro-Wilk (S-W) tests.

In Figures 8, 9 and 10 we present histograms of the empirical distributions of the three cryptocurrencies (denoted by a red line) versus the normal distribution (denoted by a blue line). From each of these histograms we can easily observe that the peak is much higher than that of the normal distribution. Moreover, none of the histograms is symmetric. Therefore, we find a violation of the kurtosis and symmetry properties.

To formally verify that the series deviate from the normal distribution, we apply the J-B test and the S-W test. Both tests reject the null hypothesis of normality (with p-value 0.01) for all the series. Hence, we conclude that none of the three cryptocurrencies follows a normal distribution.
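Both tests are available in scipy; a minimal sketch, assuming `returns` is a one-dimensional array of daily log-returns:

```python
from scipy import stats

def normality_tests(returns):
    """Jarque-Bera and Shapiro-Wilk tests; small p-values reject normality."""
    jb_stat, jb_pvalue = stats.jarque_bera(returns)
    sw_stat, sw_pvalue = stats.shapiro(returns)
    return {"jarque_bera_p": jb_pvalue, "shapiro_wilk_p": sw_pvalue}
```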

3.4 Tail distribution

In the previous subsections, we described the summary statistics of the return series as given in Table 1. Moreover, we encountered many peaks in the series. Based on these findings, it is plausible that cryptocurrencies exhibit fat tails. To formally show this, we make use of Quantile-Quantile (QQ) plots. A QQ-plot compares the quantiles of the empirical distribution with the quantiles of the standard normal distribution. If the series exhibits no fat tails, then the plotted points will be approximately linear.


4 Methodology

In this section an overview of methods is provided in order to answer the research questions. In section 4.1, an introduction is given to ARMA modelling. In section 4.2, we dive into volatility models. In section 4.3, we introduce the EVT framework, and we combine EVT with GARCH in section 4.4. In section 4.5, we give an overview of two popular risk measures. In the final section, we introduce the backtesting procedures used to evaluate the risk measures of section 4.5.

4.1 Autoregressive Moving-Average model

In this section we start by introducing a simple linear model which is required to model the conditional mean of the process. This will be the building block for the conditional variance models.

Before the model is given, we introduce a process $\{X_t\}$ for $t = 1,\dots,T$. Then the autoregressive (AR) model of order $p$ is shown in equation (3):

$$X_t = c + \phi_1 X_{t-1} + \dots + \phi_p X_{t-p} + Z_t = c + \sum_{i=1}^{p}\phi_i X_{t-i} + Z_t, \qquad (3)$$

where $c$ and $\phi_i$ for $i = 1,\dots,p$ are constants and $Z_t$ is a white noise. As seen in equation (3), the model is a summation of its past values and a white noise. The AR($p$) can be reduced to the first order, which we use for the GARCH-type models. The parsimonious form is found in equation (4):

$$X_t = c + \phi_1 X_{t-1} + Z_t, \qquad (4)$$

where this process is weakly stationary if $|\phi_1| < 1$. The one-step ahead forecast of the AR(1) can be computed with the following equation:

$$\hat X_{t,1} = E[X_{t+1} \mid X_t] = c + \phi_1 X_t. \qquad (5)$$

The second model we consider is the moving-average (MA) model. The MA model of order $q$ is a summation of a mean and its current and past innovations. For a process $\{X_t\}$, we present the MA($q$) in equation (6):

$$X_t = \mu + Z_t + \theta_1 Z_{t-1} + \dots + \theta_q Z_{t-q} = \mu + Z_t + \sum_{i=1}^{q}\theta_i Z_{t-i}, \qquad (6)$$

where $\mu$ is the mean and $\theta_i$ for $i = 1,\dots,q$ are constant finite coefficients. Similarly to the AR case, we use the parsimonious form of the MA($q$) as presented in equation (7):

$$X_t = \mu + Z_t + \theta_1 Z_{t-1}, \qquad (7)$$

where the process is weakly stationary. The one-step ahead forecast of the MA(1) is given as follows:

$$\hat X_{t,1} = E[X_{t+1} \mid X_t] = \mu + \theta_1 Z_t. \qquad (8)$$

Combining the AR(1) from equation (4) and the MA(1) from equation (7) yields the ARMA(1,1) model, which is given in equation (9):

$$X_t = c + Z_t + \phi_1 X_{t-1} + \theta_1 Z_{t-1}, \qquad (9)$$

where the process is weakly stationary if $|\phi_1| < 1$ and if the characteristic equations of the MA(1) and the AR(1) have no common roots. The one-step ahead forecast of the ARMA(1,1) is given as follows:

$$\hat X_{t,1} = E[X_{t+1} \mid X_t] = c + \phi_1 X_t + \theta_1 Z_t. \qquad (10)$$

4.1.1 Model selection

In order to find the optimal AR(1), MA(1) or ARMA(1,1) component when modelling the conditional mean of the process, we use information criteria (IC). In this paper, we restrict ourselves to identifying the best model by using a combination of the Akaike information criterion (AIC) and the Schwarz criterion (BIC). We note that for all of these models, the smaller the value of the IC, the better the fit of the model.

When selecting a (nested) model, there is usually a trade-off between parsimony and the quality of the fit. The AIC, originally developed by Akaike (1974), provides a simple way to find the "best" model by choosing the lowest AIC score. The score punishes the number of parameters in a model while rewarding the fit, using the simple formula in equation (11):

$$AIC = 2k - 2\ln L(\hat\Theta), \qquad (11)$$

where $k$ denotes the number of unknown parameters and $\hat\Theta$ denotes the Maximum Likelihood estimate (MLE) of the vector of unknown parameters $\Theta$.

As we would like to benchmark the information criteria against each other, we also use the Bayesian information criterion (BIC), or Schwarz criterion. This criterion was introduced by Schwarz (1978), four years after the introduction of the AIC. Similar to the AIC, the BIC punishes the complexity of the model; however, it punishes the number of parameters more severely. The BIC can be found in equation (12):

$$BIC = k\ln n - 2\ln L(\hat\Theta), \qquad (12)$$

where $n$ denotes the number of observations.

4.2 Volatility models


• The first model we consider is the standard GARCH model. This model can capture the volatility clustering in cryptocurrency time series very well.

• The second model takes into account the leverage effect. Many models qualify, such as the EGARCH, AP-GARCH, T-GARCH and the GJR-GARCH. In this paper, only the EGARCH is treated, as some of the models mentioned are nested within others under specific conditions.

• The last model we are interested in is the CS-GARCH. This model divides the volatility into two different components: a transitory effect and a permanent effect.

In the next subsection we provide an overview of the Autoregressive Conditional Heteroskedasticity (ARCH) model by Engle (1982). This is required since it is the starting point for understanding the volatility models used in this paper. We note that we restrict ourselves to the first order, as higher orders complicate the exposition.

4.2.1 ARCH

The first building block of conditional volatility models is the ARCH model. To model an ARCH process, we define $\{Z_t\}$ to be a white noise with mean zero and unit variance. Moreover, we assume the marginal distribution function $F_Z(z)$ to follow either a normal distribution or a Student's t-distribution with mean zero and unit variance (or a variance scaled to one). Then $\{\epsilon_t\}$ follows an ARCH($q$) process as found in equation (13):

$$\epsilon_t = \sigma_t Z_t, \qquad (13)$$

where $\sigma_t^2$ is modelled in equation (14):

$$\sigma_t^2 = \omega + \alpha_1\epsilon_{t-1}^2 + \dots + \alpha_q\epsilon_{t-q}^2 = \omega + \sum_{i=1}^{q}\alpha_i\epsilon_{t-i}^2, \qquad (14)$$

where $\omega > 0$ and $\alpha_i \ge 0$ for $i > 0$ to ensure a positive variance, $\epsilon_{t-i}^2$ is the ARCH effect, and if $\sum_{i=1}^{q}\alpha_i < 1$, we obtain weak stationarity.

4.2.2 GARCH

The ARCH model by Engle (1982) has a major limitation: the conditional variance depends only on the squared residuals of the previous periods. In 1986, Bollerslev (1986) extended the ARCH model to the GARCH model by also including previous periods' variances.

In order to model the GARCH, we assume that the dynamics of the series are given in equation (15):

$$L_t = \mu_t + \epsilon_t = \mu_t + \sigma_t Z_t, \qquad (15)$$

where $\mu_t$ is the conditional mean of the process, modelled by an AR(1), a MA(1) or an ARMA(1,1), and $Z_t$ has the same properties as mentioned in the previous section. The conditional variance $\sigma_t^2$ is given as follows:

$$\sigma_t^2 = \omega + \sum_{i=1}^{q}\alpha_i\epsilon_{t-i}^2 + \sum_{j=1}^{p}\beta_j\sigma_{t-j}^2, \qquad (16)$$

where $\omega > 0$ and $\alpha_i, \beta_j \ge 0$ for $i, j > 0$.

When restricting to the parsimonious GARCH(1,1), we find the following model:

$$\sigma_t^2 = \omega + \alpha\epsilon_{t-1}^2 + \beta\sigma_{t-1}^2, \qquad (17)$$

which is weakly stationary if $\alpha + \beta < 1$. The one-step ahead forecast of the GARCH(1,1) can be found in equation (18), where the parameters are estimated by Maximum Likelihood (ML):

$$\hat\sigma_{t+1}^2 = \hat\omega + \hat\alpha\epsilon_t^2 + \hat\beta\sigma_t^2. \qquad (18)$$
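A minimal sketch of the GARCH(1,1) recursion and its one-step ahead forecast in Python; the parameters are assumed to come from an ML fit, and initializing the variance at the sample variance is one common choice among several:

```python
import numpy as np

def garch11_forecast(eps, omega, alpha, beta):
    """Filter sigma_t^2 through equation (17) and return the one-step ahead
    forecast of equation (18); `eps` are the mean-model residuals."""
    eps = np.asarray(eps)
    sigma2 = np.empty(len(eps))
    sigma2[0] = eps.var()  # initialize at the sample variance
    for t in range(1, len(eps)):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    sigma2_next = omega + alpha * eps[-1] ** 2 + beta * sigma2[-1]  # eq. (18)
    return sigma2, sigma2_next
```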

4.2.3 EGARCH

The Exponential GARCH by Nelson (1991) is an extension of the GARCH which takes the leverage effect into account through a parameter $\gamma$. Another difference with the GARCH model is that the volatility is now modelled in logs instead of levels and is therefore always positive. If we assume that the dynamics of the series follow equation (15), then the conditional variance $\sigma_t^2$ can be modelled by a parsimonious EGARCH(1,1) as found in equation (19):

$$\ln\sigma_t^2 = \omega + \alpha\left(|\epsilon_{t-1}| - E|\epsilon_{t-1}|\right) + \gamma\epsilon_{t-1} + \beta\ln\sigma_{t-1}^2, \qquad (19)$$

where $\alpha$ and $\beta$ are real constants, and if $\gamma < 0$ the model exhibits a leverage effect. The one-step ahead forecast of the EGARCH(1,1) is given as follows:

$$\ln\hat\sigma_{t+1}^2 = \hat\omega + \hat\alpha\left(|\epsilon_t| - E|\epsilon_t|\right) + \hat\gamma\epsilon_t + \hat\beta\ln\sigma_t^2, \qquad (20)$$

where, similarly, the parameters are estimated by ML.

4.2.4 CS-GARCH

The Component Standard GARCH(1,1) by Engle and Lee (1999) is another extension of the GARCH model, in which the conditional variance is divided into two components as seen in equations (21) and (22). The $\sigma_{t-1}^2 - q_{t-1}$ part of equation (21) captures the transitory effect, whereas equation (22) captures the permanent effect. Moreover, compared to the GARCH model we now have a time-varying mean $q_t$, as seen in equation (21):

$$\sigma_t^2 = q_t + \alpha(\epsilon_{t-1}^2 - q_{t-1}) + \beta(\sigma_{t-1}^2 - q_{t-1}), \qquad (21)$$

where

$$q_t = \omega + \rho q_{t-1} + \phi(\epsilon_{t-1}^2 - \sigma_{t-1}^2), \qquad (22)$$

for $0 < \phi < \beta$, and $\omega$ and $\alpha > 0$. The additional assumption $0 < \alpha + \beta < \rho < 1$ is required for weak stationarity. The one-step ahead forecast of the CS-GARCH(1,1) can be found in the vignette of Ghalanos (2018) and is therefore omitted here.

4.3 Extreme Value Theory

In this section a discussion of the EVT methodology is given. We first compare the GARCH and EVT methodologies. Thereafter, we discuss the two approaches within EVT and explain which of the two is suitable for implementation on cryptocurrency time series.

In the previous subsection, we provided the theory behind the GARCH methodology. As mentioned previously, GARCH models capture the dynamics of the distribution very well. The EVT approach, as opposed to GARCH, focuses on the tail of the distribution. This implies that EVT concentrates on rare events in the time series.

There are two streams of methods within EVT, as mentioned by McNeil, Frey, and Embrechts (2005). The first is the Block Maxima (BM) method (see Gumbel (1958)) and the second is the Peak-Over-Threshold (POT) method. Both concentrate on the tails of the distribution; however, the procedures for estimating the tail differ.

The BM method first divides the data into a number of fixed "blocks" or groups of equal length. It then takes the maximum observation within each block and considers these maxima to follow an extreme value distribution. To make this work for a time series, a large dataset is required, as mentioned by McNeil et al. (2005); otherwise, insufficient blocks are created and a bias will occur in the parameter estimates. In smaller datasets, the POT method is preferred because it considers all observations in the dataset beyond a certain predetermined threshold. These observations exceeding a threshold can be modelled by a GPD (see Pickands (1975)). As there are two methods within the EVT framework, a short discussion follows on which of the two methods is more suitable for application on cryptocurrencies.


4.3.1 Peak-Over-Threshold

The POT approach models the observations exceeding a threshold $\tau > 0$ using the GPD. Therefore, as given in Chapter 7.2 of McNeil et al. (2005), we first present the GPD for a random loss $L$ in equation (23):

$$G_{\xi,\beta}(l) = \begin{cases} 1 - \left(1 + \dfrac{\xi l}{\beta}\right)^{-1/\xi} & \text{if } \xi \neq 0, \\ 1 - \exp\left(-\dfrac{l}{\beta}\right) & \text{if } \xi = 0, \end{cases} \qquad (23)$$

with scale $\beta > 0$, where $l \ge 0$ if the shape $\xi \ge 0$, and $0 \le l \le -\beta/\xi$ if $\xi < 0$.

The observations exceeding a threshold follow the "excess distribution over a threshold $\tau$" as mentioned in McNeil et al. (2005). Let $L$ be a random loss with distribution function $F$. Then the excess distribution over a threshold $\tau$ is given by $F_\tau(l)$ as in equation (24):

$$F_\tau(l) = P(L - \tau \le l \mid L > \tau) = \frac{F(l + \tau) - F(\tau)}{1 - F(\tau)}, \qquad (24)$$

where $0 \le l < L_0 - \tau$, with $L_0$ the finite right end point of the distribution $F$.

Before we apply the theorem of Balkema and Haan (1974) and Pickands (1975), we require some definitions. Let the Generalized Extreme Value (GEV) distribution, as mentioned on page 284 of McNeil et al. (2005), be defined by:

$$H_\xi(l) = \begin{cases} \exp\left(-(1 + \xi l)^{-1/\xi}\right) & \text{if } \xi \neq 0, \\ \exp\left(-e^{-l}\right) & \text{if } \xi = 0, \end{cases} \qquad (25)$$

where $1 + \xi l > 0$, with the shape parameter $\xi$ of $H_\xi(l)$ determining the type of distribution. Further details for different values of $\xi$ can be found in McNeil et al. (2005). Moreover, let us define sequences of constants $\{d_n\}$ and $\{c_n\}$, with $c_n > 0$. Then the Maximum Domain of Attraction (MDA) condition, as mentioned on page 285 of McNeil et al. (2005), is given by:

$$\lim_{n\to\infty} P\left(\frac{M_n - d_n}{c_n} \le l\right) = \lim_{n\to\infty} F^n(c_n l + d_n) = H(l), \qquad (26)$$

where $M_n$ denotes the maximum of the first $n$ observations and $H(l)$ is a non-degenerate distribution function.

Applying the theorem of Balkema and Haan (1974) and Pickands (1975) provides a general rule, for all $\xi$, to approximate the excess distribution by a GPD for large enough $\tau$. Mathematically, this is shown for a function $\Omega(\tau)$ as follows:

$$\lim_{\tau \to L_0}\ \sup_{0 \le l < L_0 - \tau}\ \left|F_\tau(l) - G_{\xi,\Omega(\tau)}(l)\right| = 0. \qquad (27)$$


The crucial part of applying equation (24) lies in choosing a high enough threshold $\tau$. Especially for cryptocurrency time series, we should decide on an appropriate threshold: a too low threshold will include too many irrelevant observations, leading to a bias in the GPD parameters, while a too high threshold will exclude observations which could yield valuable information when estimating the GPD parameters.

In the current literature there exist graphical methods, such as the mean excess function (MEF) and the Hill plot, as well as non-graphical methods, such as double bootstrap methods and taking an arbitrary quantile, in order to determine the appropriate threshold.

In this paper, we determine the appropriate threshold by using a combination of the 90% quantile and the MEF. The MEF, also known as the mean residual life plot, plots the mean excess over a threshold of the residuals of the AR(1), MA(1) or ARMA(1,1)-GARCH-type model. If the threshold $\tau$ is large enough, the plot becomes approximately (positively) linear. Visual inspection thus yields an appropriate threshold and a number of exceedances given the threshold. We denote the number of exceedances by $k$, similarly to McNeil and Frey (2000). After fixing $k$, we use the $(k+1)$th order statistic of the residuals in the window as the GPD threshold.
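A minimal sketch of this threshold choice and the GPD fit in Python, assuming `std_residuals` are the standardized residuals of the fitted mean/volatility model; the helper name is illustrative:

```python
import numpy as np
from scipy.stats import genpareto

def fit_gpd_tail(std_residuals, k=100):
    """Set the threshold at the (k+1)th order statistic of the standardized
    residuals (k exceedances) and fit a GPD to the excesses, in the spirit
    of McNeil and Frey (2000)."""
    z = np.sort(np.asarray(std_residuals))
    tau = z[-(k + 1)]                    # (k+1)th largest residual
    excesses = z[z > tau] - tau          # the k exceedances over tau
    xi, _, beta = genpareto.fit(excesses, floc=0)  # shape xi, scale beta
    return tau, xi, beta
```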

4.4 EVT-GARCH

In sections 4.1 and 4.2 an explanation is given of the ARMA and GARCH methodology, whereas in section 4.3 an overview is given of the EVT-POT literature. Both of these methods can be combined, as mentioned in the paper of McNeil and Frey (2000). We summarize the two-step procedure of McNeil and Frey as follows:

1. Firstly, we fit an AR(1), a MA(1) or an ARMA(1,1)-GARCH-type model to the dynamics of the loss series and extract the residuals.

2. Secondly, we standardize the residuals to a white noise with mean zero and unit variance. Then we fit the EVT-POT model to the tails of the residuals by using the GPD, given a fixed threshold $\tau$.

3. Lastly, we estimate one-step ahead risk measures for different levels of $\alpha \in (0.95, 1)$.

4.5 Risk measures


4.5.1 Value-at-Risk

In risk management, the VaR is defined as the maximum possible (future) loss at a given probability $\alpha \in (0.9, 1)$. Since the definition of the VaR works with losses, we turn the log-returns of equation (1) in section 3.1 into log-losses $L_t$ by multiplying the log-returns by minus one. For risk management purposes this is useful, as the VaR measures the right tail of the loss distribution. The cumulative distribution function of the log-losses $L$ is given in equation (28):

$$F_L(l) = P(L \le l). \qquad (28)$$

Then, the VaR given the loss distribution is defined as follows:

$$VaR_\alpha(L) = \inf\{l \in \mathbb{R} : F_L(l) \ge \alpha\}, \qquad (29)$$

where $VaR_\alpha(L)$ is the VaR at level $\alpha$ of the daily log-losses $L$.

If the distribution of the log-losses $L_t$ follows the dynamics of equation (15) and $Z$ is a white noise, the VaR can be calculated for a location-scale distribution as in equation (30), as seen in McNeil et al. (2005), pages 39-40:

$$VaR_\alpha = \mu + \sigma\zeta_\alpha, \qquad (30)$$

where $\zeta_\alpha$ is the $\alpha$-quantile of the location-scale distribution.

A one-step ahead version of equation (30) allows us to calculate the VaR at time $t$, given confidence level $\alpha$, for the losses at time $t+1$:

$$VaR_\alpha^t(L_{t+1}) = \mu_{t+1} + \sigma_{t+1}\zeta_\alpha(Z), \qquad (31)$$

where $\mu_{t+1}$ and $\sigma_{t+1}$ are the one-step ahead conditional mean and standard deviation from the AR(1), MA(1) or ARMA(1,1)-GARCH-type model's one day ahead forecast.

In general, the one-step ahead VaR can be calculated under a specific location-scale distribution such as the normal or Student's t-distribution, as mentioned in McNeil et al. (2005). If we assume a normal distribution with mean zero and variance one, the one-day ahead VaR is calculated as in equation (32):

$$VaR_\alpha^t(L_{t+1}) = \mu_{t+1} + \sigma_{t+1}\Phi^{-1}(\alpha), \qquad (32)$$

where $\Phi^{-1}(\alpha)$ is the $\alpha$-quantile of the cumulative distribution function of the standard normal. If we instead assume a Student's t-distribution with mean zero and variance scaled to one, we calculate the VaR by using the formula in equation (33):

$$VaR_\alpha^t(L_{t+1}) = \mu_{t+1} + \sigma_{t+1}\sqrt{\frac{v-2}{v}}\, t_v^{-1}(\alpha), \qquad (33)$$

where $t_v^{-1}(\alpha)$ is the $\alpha$-quantile of the Student's t-distribution with $v$ degrees of freedom.


In section 4.4 an introduction to the EVT-GARCH approach of McNeil and Frey (2000) is given. McNeil and Frey suggest a VaR calculation using the parameters of the GARCH-EVT-POT model, where the one-step ahead forecast is carried out for different confidence levels $\alpha$ and the calculation is based on the GPD fitted to the exceedances over a threshold $\tau$. On page 282 of McNeil and Frey (2000) we find the equation for the VaR under the GPD. Combining the VaR under the GPD with equation (31) leads to the following one-step ahead VaR:

$$VaR_\alpha^t(L_{t+1}) = \mu_{t+1} + \sigma_{t+1}\left(\tau + \frac{\beta}{\xi}\left(\left(\frac{1-\alpha}{k/n}\right)^{-\xi} - 1\right)\right), \qquad (34)$$

where $\beta$ is the scale, $\xi$ is the shape, $\tau$ is the threshold, $k$ is the number of exceedances and $n$ is the number of residuals in the window.
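A minimal sketch of the three VaR formulas in Python; the function names are illustrative, and the GPD variant follows the McNeil-Frey form of equation (34):

```python
import numpy as np
from scipy.stats import norm, t

def var_normal(mu, sigma, alpha):
    """Equation (32): one-step ahead VaR under standard normal innovations."""
    return mu + sigma * norm.ppf(alpha)

def var_student_t(mu, sigma, alpha, v):
    """Equation (33): one-step ahead VaR under unit-variance t innovations."""
    return mu + sigma * np.sqrt((v - 2) / v) * t.ppf(alpha, df=v)

def var_gpd(mu, sigma, alpha, tau, xi, beta, k, n):
    """Equation (34): one-step ahead VaR with a GPD tail fitted to the
    k exceedances over threshold tau among n residuals."""
    z_alpha = tau + (beta / xi) * (((1 - alpha) / (k / n)) ** (-xi) - 1)
    return mu + sigma * z_alpha
```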

4.5.2 Expected Shortfall

The ES fulfills all of the axioms in the paper by Artzner et al. (1999) and is a coherent risk measure. In contrast to the VaR, the ES measures the expected loss given that the VaR has been exceeded. Mathematically, this can be defined as follows:

$$ES_\alpha(L) = E(L \mid L > VaR_\alpha(L)). \qquad (35)$$

On pages 45 and 46 of McNeil et al. (2005), we find that for a location-scale distribution the ES can be calculated by assuming that the losses $L$ follow the dynamics of equation (15). If we assume that $Z$ follows a standard normal distribution, the one-step ahead forecast of the ES at time $t$, given confidence level $\alpha$, for the losses at time $t+1$ is:

$$ES_\alpha^t(L_{t+1}) = \mu_{t+1} + \sigma_{t+1}\frac{\phi(\Phi^{-1}(\alpha))}{1-\alpha}. \qquad (36)$$

If one assumes a Student's t-distribution for $Z$ with similar properties as before, we calculate the one-step ahead ES as follows:

$$ES_\alpha^t(L_{t+1}) = \mu_{t+1} + \sigma_{t+1}\sqrt{\frac{v-2}{v}}\,\frac{g_v(t_v^{-1}(\alpha))}{1-\alpha}\,\frac{v + (t_v^{-1}(\alpha))^2}{v-1}, \qquad (37)$$

where $g_v$ is the density function of the standard Student's t-distribution with $v$ degrees of freedom.

Lastly, we introduce the ES calculation of McNeil and Frey (2000). As in section 4.5.1, we can obtain the ES by using the parameters of the GARCH-EVT-POT model. If the losses exceeding the threshold follow a GPD, then, as described on page 293 of McNeil and Frey (2000), we obtain the following one-step ahead ES equation:

$$ES_\alpha^t(L_{t+1}) = \mu_{t+1} + \sigma_{t+1} z_\alpha\left(\frac{1}{1-\xi} + \frac{\beta - \xi\tau}{(1-\xi)z_\alpha}\right), \qquad (38)$$

where $z_\alpha$ denotes the GPD-based quantile from equation (34).
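As with the VaR, a minimal Python sketch of the three ES formulas; the function names are illustrative:

```python
import numpy as np
from scipy.stats import norm, t

def es_normal(mu, sigma, alpha):
    """Equation (36): one-step ahead ES under standard normal innovations."""
    return mu + sigma * norm.pdf(norm.ppf(alpha)) / (1 - alpha)

def es_student_t(mu, sigma, alpha, v):
    """Equation (37): one-step ahead ES under unit-variance t innovations."""
    q = t.ppf(alpha, df=v)
    return mu + sigma * np.sqrt((v - 2) / v) * (
        t.pdf(q, df=v) / (1 - alpha) * (v + q ** 2) / (v - 1))

def es_gpd(mu, sigma, z_alpha, tau, xi, beta):
    """Equation (38): one-step ahead ES given the GPD-based quantile z_alpha."""
    return mu + sigma * z_alpha * (
        1 / (1 - xi) + (beta - xi * tau) / ((1 - xi) * z_alpha))
```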


4.6 Backtesting

In the previous section, we formally discussed the GARCH-type and combined GARCH-EVT frameworks, from which the relevant risk metrics can be computed. In this section, the risk metric estimates are evaluated based on the performance of the one-step ahead VaR and ES estimates from the GARCH and GARCH-EVT-type models. We do this by using backtests on the out-of-sample losses.

We provide a summary of three popular backtesting methodologies for the VaR as discussed in Christoffersen (2003). Thereafter, we summarize the backtesting methodology for the ES. As the literature on backtesting the ES is not yet well established, only the two backtests for the ES proposed by McNeil and Frey (2000) and Embrechts, Kaufmann, and Patie (2005) are presented.

Before introducing the VaR backtests, we require some extra clarification and definitions.

The backtesting methodology requires a fixed window size $m < N$, where $N$ is the total number of observations and $m = 1000$ is fixed, similar to McNeil and Frey (2000). From this window of size $m$, we forecast the one-step ahead VaR and ES step by step, using a rolling window of the GARCH-type and GARCH-EVT-type model's estimates. The rolling window approach drops the first observation and adds a new observation at every step until the end of the dataset. From this, we obtain $N - m = N_1$ estimates for the backtests.

Further, while computing the estimates by rolling window, we obtain a sequence of VaR breaches, or violations, whenever the one-day ahead VaR has been underestimated by the model. Mathematically, this can be shown by using an indicator function. Let us first define $\mathbf{1}_{\{EVENT\}}$ as the indicator function for any particular event "EVENT" as follows:

$$\mathbf{1}_{\{EVENT\}} = \begin{cases} 1, & \text{if event "EVENT" realizes}, \\ 0, & \text{otherwise}. \end{cases} \qquad (39)$$

By using the indicator function, we obtain a sequence of VaR breaches, of which each individual VaR breach follows an independent identically distributed (i.i.d.) Bernoulli distribution.
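A minimal sketch of constructing this breach sequence in Python, assuming aligned arrays of realized losses and rolling one-step ahead VaR forecasts:

```python
import numpy as np

def breach_sequence(losses, var_forecasts):
    """Indicator sequence of equation (39): 1 whenever the realized loss
    exceeds the one-step ahead VaR forecast."""
    return (np.asarray(losses) > np.asarray(var_forecasts)).astype(int)
```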

Next, for each VaR and ES estimate at a quantile level $\alpha \in \{0.95, 0.975, 0.99, 0.995\}$, we can perform a backtest for the coverage rate $\kappa = 1 - \alpha$. For each coverage rate, we obtain a p-value for the GARCH-type and GARCH-EVT-type models respectively.


Simulated p-values are obtained under the null hypothesis of i.i.d. Bernoulli random variables at level $\alpha$. For each of the following three VaR backtests, we estimate a set of simulated LR statistics, denoted by $\{\widehat{LR}(i)\}_{i=1}^{999}$. By using equation (40), we obtain more trustworthy p-values:

$$\text{P-value} = \frac{1}{1000}\left(1 + \sum_{i=1}^{999}\mathbf{1}\{\widehat{LR}(i) > LR\}\right), \qquad (40)$$

where $\mathbf{1}$ is defined in equation (39).

Based on the p-value, conclusions can be drawn on whether the model performs adequately in, for example, accurately estimating the number of allowable VaR breaches or the independence of VaR breaches.

4.6.1 Unconditional Coverage test

The first backtest to verify the VaR estimates is the unconditional coverage test (UCT), as mentioned in Christoffersen (2003) on page 185. The UCT compares the number of VaR breaches with the allowable number of VaR breaches based on the coverage rate $\kappa$ by using a LR test.

We first define $\pi$ as the proportion of breaches under a specified theoretical distribution function. Then, we set up the null hypothesis $H_0: \pi = \kappa$.

The null hypothesis of the UCT can be tested by using a LR test. Let $K_0$ and $K_1$ be defined as the number of non-breaches and breaches respectively. Moreover, let the MLE of $\pi$ be denoted by $\hat\pi = \frac{K_1}{K}$, where $K = K_0 + K_1$. Then the LR of the UCT, which is asymptotically distributed as a Chi-Square distribution with one degree of freedom, can be found in equation (41):

$$LR_{UCT} = 2\ln\left(\frac{L(\hat\pi)}{L(\kappa)}\right) \sim \chi_1^2, \qquad (41)$$

where the numerator of the fraction is the likelihood function based on the Bernoulli distribution, as given in equation (42):

$$L(\hat\pi) = \left(1 - \frac{K_1}{K}\right)^{K_0}\left(\frac{K_1}{K}\right)^{K_1}. \qquad (42)$$

The denominator of the fraction in equation (41) is the likelihood function under the null hypothesis, given as follows:

$$L(\kappa) = (1 - \kappa)^{K_0}\kappa^{K_1}. \qquad (43)$$
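A minimal sketch of the UCT statistic in Python, using the asymptotic chi-square p-value of equation (41) rather than the simulated p-values of equation (40); it assumes the breach sequence contains at least one breach and one non-breach:

```python
import numpy as np
from scipy.stats import chi2

def lr_uct(hits, kappa):
    """Unconditional coverage LR of equations (41)-(43) for a 0/1 breach
    sequence `hits` at coverage rate kappa."""
    k = len(hits)
    k1 = int(np.sum(hits))
    k0 = k - k1
    pi_hat = k1 / k
    ll_mle = k0 * np.log(1 - pi_hat) + k1 * np.log(pi_hat)   # equation (42)
    ll_null = k0 * np.log(1 - kappa) + k1 * np.log(kappa)    # equation (43)
    lr = 2 * (ll_mle - ll_null)                              # equation (41)
    return lr, chi2.sf(lr, df=1)
```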

4.6.2 Independence test


Therefore, we summarize the procedure of the second backtest described in Christoffersen (2003), which is the independence test (IT).

Unlike the UCT, which assumes the sequence of VaR breaches to be independent as mentioned in the previous section, we now allow the sequence to be dependent over time. Christoffersen (2003) mentions that the VaR breaches can be modelled as a first-order Markov Chain (MC). A MC can be described by a transition matrix with transition probabilities $\pi_{ij}$, where $i, j \in \{0, 1\}$. We present the MC as the matrix $M$ as follows:

$$M = \begin{pmatrix} \pi_{00} & \pi_{01} \\ \pi_{10} & \pi_{11} \end{pmatrix} = \begin{pmatrix} 1 - \pi_{01} & \pi_{01} \\ 1 - \pi_{11} & \pi_{11} \end{pmatrix}, \qquad (44)$$

where, for example, $\pi_{00}$ is the probability of having no breach tomorrow, conditional on having no breach today. Similar interpretations hold for $\pi_{01}$, $\pi_{10}$ and $\pi_{11}$.

From the matrix $M$, we set up a likelihood function by using the sample of breaches and non-breaches:

$$L(M) = (1 - \pi_{01})^{K_{00}}\,\pi_{01}^{K_{01}}\,(1 - \pi_{11})^{K_{10}}\,\pi_{11}^{K_{11}}, \qquad (45)$$

where, for example, $K_{00}$ is the number of days with no breach tomorrow, conditional on no breach today. Similar interpretations hold for $K_{01}$, $K_{10}$ and $K_{11}$.

Applying ML, we obtain the MLEs of the transition probabilities of the matrix $\hat M$. The estimated transition probabilities are given as follows:

$$\hat\pi_{00} = \frac{K_{00}}{K_{00} + K_{01}}, \qquad (46)$$

which leads to $\hat\pi_{01} = 1 - \hat\pi_{00}$, and

$$\hat\pi_{10} = \frac{K_{10}}{K_{10} + K_{11}}. \qquad (47)$$

Similarly, $\hat\pi_{11} = 1 - \hat\pi_{10}$. Substituting $\hat\pi_{00}$, $\hat\pi_{01}$, $\hat\pi_{10}$ and $\hat\pi_{11}$ into $\hat M$ yields the estimated MC:

$$\hat M = \begin{pmatrix} 1 - \hat\pi_{01} & \hat\pi_{01} \\ 1 - \hat\pi_{11} & \hat\pi_{11} \end{pmatrix} = \begin{pmatrix} \dfrac{K_{00}}{K_{00}+K_{01}} & \dfrac{K_{01}}{K_{00}+K_{01}} \\ \dfrac{K_{10}}{K_{10}+K_{11}} & \dfrac{K_{11}}{K_{10}+K_{11}} \end{pmatrix}. \qquad (48)$$

As our goal is to test whether there is no clustering (i.e. independence) of VaR breaches, we test the null hypothesis $H_0: \pi_{01} = \pi_{11}$. We set up a LR test, which is asymptotically distributed as a Chi-Square distribution with one degree of freedom:

$$LR_{IT} = 2\ln\left(\frac{L(\hat M)}{L(\hat\pi)}\right) \sim \chi_1^2, \qquad (49)$$

where $L(\hat\pi)$ is equal to equation (42), due to independence of the MC under the null (with $\pi_{01} = \pi_{11} = \pi$).
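A minimal sketch of the IT statistic in Python, assuming all four transition counts are positive:

```python
import numpy as np
from scipy.stats import chi2

def lr_it(hits):
    """Independence LR of equation (49): Markov-chain likelihood (45)
    against the Bernoulli likelihood (42) for a 0/1 breach sequence."""
    hits = [int(h) for h in hits]
    pairs = list(zip(hits[:-1], hits[1:]))
    k00 = pairs.count((0, 0)); k01 = pairs.count((0, 1))
    k10 = pairs.count((1, 0)); k11 = pairs.count((1, 1))
    pi01 = k01 / (k00 + k01)   # MLE, equation (46)
    pi11 = k11 / (k10 + k11)   # MLE, equation (47)
    pi = (k01 + k11) / len(pairs)
    ll_markov = (k00 * np.log(1 - pi01) + k01 * np.log(pi01)
                 + k10 * np.log(1 - pi11) + k11 * np.log(pi11))
    ll_bern = (k00 + k10) * np.log(1 - pi) + (k01 + k11) * np.log(pi)
    lr = 2 * (ll_markov - ll_bern)
    return lr, chi2.sf(lr, df=1)
```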

4.6.3 Conditional Coverage test

Both the UCT and the IT have weaknesses. The UCT assumes independence, whereas the IT does not test the allowable number of VaR breaches. Therefore, the final test of Christoffersen (2003) combines the UCT and the IT into the Conditional Coverage test (CCT). The joint LR test is the sum of equations (41) and (49):

$$LR_{CC} = LR_{UCT} + LR_{IT} \sim \chi_2^2, \qquad (50)$$

which is asymptotically distributed as a Chi-Square distribution with two degrees of freedom.

4.6.4 Backtesting ES by Bootstrap

In this section, we summarize the backtesting procedure presented by McNeil and Frey (2000), obtaining p-values by using the bootstrap method of Efron and Tibshirani (1993). In the next subsection, we introduce the backtest methodology of Embrechts et al. (2005). We choose these backtesting methodologies because they only require the estimates from the previous section (the mean, standard deviation, and one-step ahead VaR and ES).

For the backtest of McNeil and Frey (2000), described on page 294 of their paper, we construct the "exceedance" residuals. These residuals are the differences between the (empirical) losses at time $t+1$ and the predicted ES at time $t$, conditional on a VaR breach. They are defined as $\{r_{t+1} : t \in T,\ L_{t+1} > VaR_\alpha^t(L_{t+1})\}$, where $r_{t+1}$ is given by

$$r_{t+1} = \frac{L_{t+1} - \widehat{ES}_\alpha^t(L_{t+1})}{\hat\sigma_{t+1}}, \qquad (51)$$

where $\widehat{ES}_\alpha^t(L_{t+1})$ and $\hat\sigma_{t+1}$ are the estimated one-step ahead ES and the $(t+1)$th period standard deviation, respectively.

McNeil and Frey (2000) mention that the exceedance residuals are approximately i.i.d. with mean zero, conditional on correct estimation of the dynamics of equation (15). Therefore, we set up the one-sided null hypothesis $H_0: r_0 = 0$ against the alternative $H_a: r_0 > 0$ for the mean of the exceedance residuals.


We construct a test statistic under the null hypothesis, denoted by $\hat\eta$, which equals

$$\hat\eta = \frac{\bar r - r_0}{\bar\sigma / \sqrt{T_1}}, \qquad (52)$$

where $T_1$ is the number of VaR breaches, $\bar r = \frac{1}{T_1}\sum_{i=1}^{T_1} r_i$ and $\bar\sigma = \sqrt{\frac{1}{T_1 - 1}\sum_{i=1}^{T_1}(r_i - \bar r)^2}$.

There is, however, one issue in finding an appropriate distribution that estimates the null distribution $F$ when using the empirical distribution $\hat F$. Therefore, we transform the residuals into "shifted" residuals by using the following equation:

$$\tilde r_i = r_i - \bar r + r_0, \quad \text{for } i = 1,\dots,T_1. \qquad (53)$$

Thereafter, we sample $\tilde r_1^*, \dots, \tilde r_{T_1}^*$ with replacement from the shifted residuals $\tilde r_1, \dots, \tilde r_{T_1}$. For each bootstrap replication $j$, where $j = 1,\dots,10000$, we calculate a test statistic similar to equation (52):

$$\tilde\eta_j^* = \frac{\bar{\tilde r}^* - r_0}{\bar{\tilde\sigma}^* / \sqrt{T_1}}, \qquad (54)$$

with $\bar{\tilde\sigma}^*$ denoting the standard deviation of the bootstrap sample. Lastly, to obtain a p-value, we use equation (55):

$$\text{P-value} = \frac{1 + \sum_{j=1}^{10000}\mathbf{1}\{\tilde\eta_j^* > \hat\eta\}}{10001}. \qquad (55)$$
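A minimal sketch of this bootstrap test in Python; the seed and function name are illustrative:

```python
import numpy as np

def bootstrap_es_pvalue(exceedance_residuals, n_boot=10_000, r0=0.0, seed=0):
    """One-sided bootstrap test of McNeil and Frey (2000) for H0: r0 = 0
    against r0 > 0, following equations (52)-(55)."""
    rng = np.random.default_rng(seed)
    r = np.asarray(exceedance_residuals)
    t1 = len(r)
    eta_hat = (r.mean() - r0) / (r.std(ddof=1) / np.sqrt(t1))  # equation (52)
    r_shift = r - r.mean() + r0                                # equation (53)
    exceed = 0
    for _ in range(n_boot):
        s = rng.choice(r_shift, size=t1, replace=True)
        eta_star = (s.mean() - r0) / (s.std(ddof=1) / np.sqrt(t1))  # eq. (54)
        exceed += eta_star > eta_hat
    return (1 + exceed) / (n_boot + 1)                         # equation (55)
```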

4.6.5 Backtesting ES by V tests

The second backtest we implement for the ES is presented in Embrechts et al. (2005) on page 72. The test is composed of two different measures and a combination of the absolute values of the two measures by means of an average. Embrechts et al. note that for all of these "V" tests, a value near zero means that the model predicts the ES accurately.

Before we introduce the first "V" test, we require an important definition. Let $D_{t+1}$ be defined as the difference between the empirical loss at time $t+1$ and the one-step ahead ES at time $t$. If the difference is positive, the ES is underestimated, and conversely if the difference is negative. $V_1$ is calculated by taking the mean of $D_{t+1}$ conditional on a VaR breach at level $\alpha$. Mathematically, this is shown in equation (56):

$$V_1 = \frac{\sum_{t=1}^{K} D_{t+1}\,\mathbf{1}\{L_{t+1} > \widehat{VaR}_\alpha^t\}}{\sum_{t=1}^{K}\mathbf{1}\{L_{t+1} > \widehat{VaR}_\alpha^t\}}. \qquad (56)$$


The second measure, $V_2$, concentrates on the losses occurring in a "one out of $1/p$" event, as seen in equation (57):

$$V_2 = \frac{\sum_{t=1}^{K} D_{t+1}\,\mathbf{1}\{D_{t+1} > Q_\alpha\}}{\sum_{t=1}^{K}\mathbf{1}\{D_{t+1} > Q_\alpha\}}, \qquad (57)$$

where $Q_\alpha$ is the empirical $\alpha$-quantile of $\{D_{t+1}\}$.

As both $V_1$ and $V_2$ have weaknesses as measures, as mentioned in Embrechts et al. (2005), these weaknesses are remedied by taking the mean of the absolute values of $V_1$ and $V_2$, as seen in equation (58):

$$V_3 = \frac{|V_1| + |V_2|}{2}. \qquad (58)$$
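A minimal sketch of the three V measures in Python, assuming aligned arrays of realized losses and one-step ahead VaR and ES forecasts, with at least one VaR breach in the sample:

```python
import numpy as np

def v_tests(losses, var_forecasts, es_forecasts, alpha):
    """V1, V2 and V3 of equations (56)-(58); D_{t+1} is the realized loss
    minus the one-step ahead ES forecast."""
    losses = np.asarray(losses)
    d = losses - np.asarray(es_forecasts)
    breach = losses > np.asarray(var_forecasts)
    v1 = d[breach].mean()               # equation (56)
    q_alpha = np.quantile(d, alpha)     # empirical alpha-quantile of D
    v2 = d[d > q_alpha].mean()          # equation (57)
    v3 = (abs(v1) + abs(v2)) / 2        # equation (58)
    return v1, v2, v3
```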


5 Empirical results

In this section the empirical results of the GARCH-type models and the combined GARCH-EVT framework, under both normal and Student's t-innovations, are presented. In total, we obtain twelve sets of results, consisting of three GARCH-type models and their respective GARCH-EVT counterparts under two different innovation distributions.

Before presenting the backtesting results, we provide a few comments on the data, the optimal ARMA model, the number of observations used for the threshold, and the notation in the tables.

For each of the datasets, we transformed the loss series $L_t$ to have variance one. This is to prevent any issues with the algorithms in the backtesting procedures.

The "optimal" ARMA component for each cryptocurrency is chosen based on the lowest value of the information criteria. The findings suggest fitting a MA(1) to the Bitcoin series, an ARMA(1,1) to the XRP series and an AR(1) to the Monero series.

Fitting the GPD to the residuals of the GARCH-type models requires a threshold $\tau$. This threshold is determined by using a combination of the MEF and the 90% quantile, which allows us to take the $(k+1)$th order statistic of the residuals. We find that the optimal $k$ ranges between 100 and 106, which is close to the fixed $k$ mentioned in McNeil and Frey (2000).

Lastly, in each of the tables the model names are abbreviated for brevity. The suffixes N and T denote the normal and the Student's t-distribution for the innovations.

In the next subsection we discuss the (estimated) VaR breaches. Thereafter, the results of the VaR backtests are presented. We end the section with the results of the ES backtests.

5.1 VaR breaches

In Tables 2 and 3, the (expected) number of VaR breaches is presented for the GARCH-type and GARCH-EVT-type models respectively. The first column lists the cryptocurrency series, followed by the quantile level $\alpha$ and the expected number of VaR breaches. The final six columns present the estimated number of VaR breaches sorted by the type of GARCH or GARCH-EVT model. We note that the expected number of VaR breaches is computed as $N_1 \cdot \kappa$.


We find underestimation of the VaR when assuming normal innovations and overestimation when assuming t-innovations for both the Bitcoin and XRP series. Interestingly, a contradiction is found in the Monero series: under both innovations, the models seem to (over)predict the number of VaR breaches equally well.

For Bitcoin under the assumption of normal innovations, the addition of EVT always improves the results in the three higher quantiles, as seen in Table 3. However, when assuming t-innovations the results improve only in the two lower quantiles; in the two higher quantiles it is not clear whether the EVT improves the results.

The EVT framework does not always lead to an improvement in the number of VaR breaches, as seen in the XRP and Monero series, where the addition seems to worsen almost all of the results. Moreover, under normal innovations, the addition (compared to the plain GARCH-type model) seems to increase the number of VaR breaches in the upper two quantiles and decrease the number in the two lower quantiles. Furthermore, the VaR is adjusted downwards under t-innovations, implying a higher number of VaR breaches.

5.2 Results for Unconditional Coverage test

In Tables 4 and 5 the results of the UCT for the Bitcoin, XRP and Monero series are presented. Christoffersen (2003) notes that a significance level of 10% is sufficient for risk models in risk management. If the model is rejected at this significance level, this implies that the risk model is incorrect on average.

Concentrating solely on the results of the Bitcoin series presented in Table 4, one can clearly see that all GARCH-type models under normal innovations perform extremely poorly, as seen by the low p-values at all levels $\alpha$. The reverse occurs when assuming t-innovations: almost all models seem to do extremely well, especially at the higher levels.

By contrast, a different picture emerges for the XRP and Monero series under normal innovations. The models at all confidence levels except the highest seem to perform adequately on average. Similar conclusions hold for the models with t-innovations.

We find no large differences across the series when comparing different GARCH-type models under normal innovations. Only under a rolling t-distribution for the innovations do the results start to differ. We suspect that the cause of this difference lies in the estimation of the shape coefficient by the GARCH-type models, which influences the results.

As expected, with the addition of EVT under both innovations, the models perform significantly better from confidence level 0.975 onwards for the Bitcoin series.


For the XRP and Monero series, however, the addition of EVT performs comparatively worse under both innovations, as seen in the much lower p-values.

5.3 Results for Independence test

Almost all GARCH-type models seem to account for the independence of VaR breaches very well, as presented in Table 6. Exceptions are seen in the two higher quantiles under normal innovations for the Bitcoin and XRP series. As before, the type of GARCH model does not seem to matter much, as the p-values are very similar under normal innovations; differences show up under a rolling t-distribution.

Combining with EVT resolves the exceptions for the Bitcoin series, as seen in Table 7. However, for the XRP series, the GARCH-type models under t-innovations start producing VaR breaches which contain information about future breaches. In the Monero series, the EVT seems to perform slightly worse, as seen in the lower p-values in almost all cases.

5.4 Results for Conditional Coverage test

The resulting p-values of the CCT given in Tables 8 and 9 are similar to the results of the UCT in Tables 4 and 5. Therefore, we summarize two important observations:

• GARCH-type models under normally distributed innovations provide similar risk estimates. It is suggested to assume a different distribution (e.g. a rolling t) for the innovations, since this affects the estimation of the risk measures.

• The EVT improves estimation of the VaR in the Bitcoin series under both innovations. However, it provides comparable or worse results in the XRP and Monero series.

5.5 Bootstrap results

In this section, we discuss the backtesting results of the ES test by McNeil and Frey (2000). In the following subsection, the results of the V tests by Embrechts et al. (2005) are presented. McNeil and Frey note that a low p-value (e.g. 5%) indicates underestimation of the ES.

Before discussing the results, we note that in some cases results are unavailable at the higher quantile levels of the XRP and Monero series. A condition for starting the bootstrap procedure of Efron and Tibshirani (1993) is a sample of at least size two. As seen in Tables 2 and 3, we have an insufficient number of VaR breaches in those cases and therefore the bootstrap procedure cannot be started.


Almost all models with normal innovations for the Bitcoin series underestimate the ES, as seen in the rejection of the null hypothesis at the 5% level. For the XRP series, a similar conclusion can be drawn if we increase the significance level to 10%. The only exception is seen in the Monero series, where models with normally distributed innovations do not underestimate the ES.

Assuming t-distributed innovations results in high p-values for almost all models, indicating no underestimation of the ES. However, some results are unavailable due to the lack of a bootstrap sample, as seen in the XRP and Monero series.

As when estimating the VaR, the type of GARCH model under normal innovations provides comparable p-values. The similarities cease to exist under rolling t-innovations.

The addition of EVT remedies the underestimation under normal innovations entirely for both Bitcoin and XRP, and slightly increases the p-values for the Monero series. However, the p-values for the t-innovations are comparably (slightly) lower. This agrees with our earlier observation that models with normal innovations tend to underestimate the VaR and are corrected upwards by the EVT, and conversely for the t-innovations.

5.6 V results

In this subsection the results of the three V tests are explained. All of the results can be interpreted by observing the sign and size of the coefficient. If the sign is positive, meaning $D_{t+1}$ is positive, the model underestimates the ES; the reverse holds if the sign is negative. Embrechts et al. (2005) note that a V value (near) zero indicates an accurate prediction of the ES: if the model predicts the ES accurately, the difference between the one-step ahead ES and the actual losses is (near) zero. In other words, $D_{t+1}$ is a small number.

Before we discuss the results of $V_1$ presented in Tables 12 and 13, we inform the reader, as in the previous subsection, of unavailable results. Computing $V_1$ requires a sample of at least size one. In some cases we have no sample, as seen in Tables 2 and 3, due to the overestimation of the one-step-ahead VaR. Since the mean of the absolute values of $V_1$ and $V_2$ is required to compute $V_3$, we also omit the corresponding results in Tables 16 and 17.

All GARCH-type models with normal innovations underestimate the ES in both the Bitcoin and XRP series, as seen in the positive $V_1$ values resulting from positive $D_{t+1}$. The reverse occurs in the Monero series, with an exception for some of the EGARCH results. The GARCH-type models with t-distributed innovations, however, all (severely) overestimate the ES; this occurs in all three series, as all signs of $V_1$ are negative.

The addition of EVT reduces this underestimation for each GARCH-type model under normal innovations in all series. This is seen in the $V_1$ scores of Table 13, which are lower in comparison. Under t-innovations, however, the reverse occurs. We do note that the $V_1$ results of the GARCH-EVT models are closer to zero in most cases, and thus seem to improve the ES estimates for the Bitcoin and XRP series under both innovations. For the Monero series, EVT only improves the ES under t-innovations.

The results for $V_2$, presented in Tables 14 and 15, lead to similar conclusions for all series as those in Tables 12 and 13. One notable difference is the size of $V_2$: models with normal innovations tend to underestimate the ES more severely than suggested by $V_1$, and conversely for t-innovations.

In almost all cases we find that models with normal innovations estimate the ES better than models with t-innovations, as seen in Table 16. The only exception is the CS-GARCH model with t-innovations in the Bitcoin series, which results in lower $V_3$ scores.

A notable similarity is found among the GARCH-type models under normal innovations across all V scores. As with the estimation of the VaR, the choice of GARCH-type model does not seem to matter when estimating the ES under normal innovations; differences start occurring under a rolling t-distribution.

When comparing GARCH-type models with GARCH-EVT models under both innovations, we find that for the Bitcoin and XRP series the addition of EVT improves the calculation of the ES, as seen in Table 17. The improvement in the Monero series occurs only under t-innovations. This can be seen in the lower estimates of $V_3$, which is, as mentioned before, the average of the absolute values of $V_1$ and $V_2$.

6 Concluding remarks

In this paper, we extend the current literature on EVT modelling and risk assessment of cryptocurrencies. We explore the addition of the EVT framework to three different GARCH-type models. By merging these GARCH-type models with EVT, we obtain the GARCH-EVT framework of McNeil and Frey (2000). According to McNeil and Frey, the combination of GARCH and EVT leads to an improved estimation of the risk measures.

Our investigation shows that cryptocurrency return series are extremely fat-tailed and leptokurtic. We therefore fitted GARCH-type models with normal and t-innovations to the loss series. From each model, we extracted the standardized residuals and applied the EVT-POT framework with the threshold at the $(k+1)$th order statistic of the residuals. The one-step-ahead VaR and ES were then computed using a rolling window. To measure the performance of the models, we applied the backtesting procedures of Christoffersen (2003) for the VaR and those of McNeil and Frey (2000) and Embrechts et al. (2005) for the ES. A sketch of the one-step-ahead computation is given below.
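For concreteness, the sketch below reproduces the tail step of this pipeline: given the standardized residuals of a fitted GARCH model and its one-step-ahead mean and volatility forecasts, it places the threshold at the $(k+1)$th order statistic, fits a GPD to the exceedances, and returns the one-step-ahead VaR and ES via the McNeil and Frey (2000) estimators. It is a minimal Python illustration assuming a tail index $0 < \xi < 1$; the function name and the use of scipy's genpareto are our choices, not the estimation code of this thesis (which, judging from the references, relied on the rugarch package in R, cf. Ghalanos (2018)).

```python
import numpy as np
from scipy.stats import genpareto

def evt_tail_risk(z, mu_next, sigma_next, q=0.99, tail_frac=0.10):
    """One-step-ahead VaR and ES from standardized GARCH residuals
    via the POT method, assuming 0 < xi < 1.

    z          : standardized residuals on the loss scale
    mu_next    : one-step-ahead conditional mean forecast
    sigma_next : one-step-ahead conditional volatility forecast
    tail_frac  : share of residuals above the threshold; 0.10 puts the
                 threshold u at roughly the 90% quantile, i.e. the
                 (k+1)th order statistic with k = floor(0.10 * n)
    """
    z = np.sort(np.asarray(z, dtype=float))
    n = z.size
    k = int(np.floor(tail_frac * n))
    u = z[n - k - 1]                      # threshold: (k+1)th largest value
    excess = z[n - k:] - u                # k exceedances over u

    xi, _, beta = genpareto.fit(excess, floc=0)   # GPD shape and scale

    # GPD tail quantile and expected shortfall of the residuals
    z_q = u + beta / xi * (((1 - q) * n / k) ** (-xi) - 1)
    es_z = z_q / (1 - xi) + (beta - xi * u) / (1 - xi)

    # Scale back to the loss series with the GARCH forecasts
    return mu_next + sigma_next * z_q, mu_next + sigma_next * es_z
```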

The results show that the GARCH-type models under normal innovations estimate the VaR and ES in a similar way, as seen in the size of the estimates. However, when assuming a rolling t-distribution for the innovations, the estimates start to diverge.

This contrast raises our first research question: "Does the distribution of the innovations influence the estimates of the risk measures of GARCH-type models in cryptocurrency time series?" The findings suggest that GARCH-type models under normal innovations behave very similarly when estimating risk measures, whereas assuming a different distribution for the innovations does influence the estimates, as seen for example in the estimation of the ES under the two assumptions. Furthermore, the models under normal innovations tend to underestimate the ES, whereas under t-innovations they tend to overestimate it.

Since the choice of GARCH-type model matters little under normal innovations, and since the models deviate from one another under the t-distribution when estimating risk measures, does extending the models improve the results? This leads to the second research question of this paper: "Does extending the GARCH-type models with the EVT framework improve risk estimation in cryptocurrency time series?"

The results show that the inclusion of EVT leads to mixed VaR results: improvements in the Bitcoin series, but similar or worse results in the Monero and XRP series. For the ES, however, we find substantial improvements under t-innovations, and under normal innovations the ES is improved for two out of three series.

To conclude, we discuss the assumptions made in this paper, which could lead to future research topics.

The first assumption concerns the fixed window size m. We assumed a window size similar to the one used in McNeil and Frey (2000). A smaller window would provide a larger out-of-sample period for the ES backtests, possibly resulting in more (accurate) results.

The second assumption involves the threshold choice, which is based on a combination of the 90% quantile and the mean excess function (MEF). A different threshold would include or exclude observations, which impacts the GPD parameter estimates. It could be that cryptocurrency series require a higher threshold; this remains a topic for further research. A sketch of the MEF diagnostic is given below.
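To make the diagnostic concrete, the sketch below computes the empirical mean excess function $e(u) = \mathbb{E}[z - u \mid z > u]$ over a grid of candidate thresholds; an approximately linear stretch of the plot indicates that the GPD approximation has set in. The grid bounds in the usage comment are illustrative.

```python
import numpy as np

def mean_excess(z, thresholds):
    """Empirical mean excess function e(u) = E[z - u | z > u].

    Above a well-chosen POT threshold, the plot of e(u) against u is
    approximately linear if the exceedances follow a GPD; this is the
    diagnostic used alongside the 90% quantile.
    """
    z = np.asarray(z, dtype=float)
    return np.array([(z[z > u]).mean() - u for u in thresholds])

# Example: scan candidates between the 85% and 97% residual quantiles,
# then plot (us, mean_excess(z, us)) and look for the linear stretch:
# us = np.quantile(z, np.linspace(0.85, 0.97, 25))
```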

Cryptocurrencies are relatively new compared to other financial series, so only a limited amount of useful data is available. As seen in Figure 1, the series remained fairly stable until 2017 and then drifted upwards. It remains to be seen whether the results hold for cryptocurrencies in the (near) future.

As the results section shows that the innovation distribution affects the estimation of the VaR and ES, it would be interesting to assume other innovation distributions, or to fix the number of degrees of freedom of the Student's t-distribution. We chose the simplified approach of assuming rolling normal and rolling t-distributed innovations. Other distributions, such as the skewed t or the Generalized Hyperbolic distribution, could capture the behavior of cryptocurrencies better.


Acknowledgement

Firstly, I would like to express my sincere gratitude to dr. D. Ronchetti for being my supervisor during the whole process of the (overlap) thesis. His guidance, suggestions, comments and corrections helped me in writing the thesis successfully. Moreover, I would like to thank him for introducing me to time-series and GARCH-EVT modelling during the Bachelor and Master.

Secondly, I would like to thank Prof. dr. R.H. Koning for his course in risk management, which provided a solid foundation to the current topic.

Thirdly, I thank Ying for her time on proofreading my thesis.


References

Acerbi, C. and D. Tasche (2002). On the coherence of Expected Shortfall. Journal of Banking & Finance 26 (7), 1487–1503.

Akaike, H. (1974). A New Look at the Statistical Model Identification. IEEE Transactions on Automatic Control 19 (6), 716–723.

Angelini, G. and S. Emili (2018). Forecasting Cryptocurrencies: A Comparison of GARCH Models. Working Paper .

Artzner, P., F. Delbaen, J.M. Eber, and D. Heath (1999). Coherent measures of risk. Mathematical Finance 9 (3), 203–228.

Balkema, A.A. and L. de Haan (1974). Residual Life Time at Great Age. The Annals of Probability 2 (5), 792–804.

Bollerslev, T. (1986). Generalized Autoregressive Conditional Heteroskedastic-ity. Journal of Econometrics 31 (3), 307–327.

Bollerslev, T. (2008). A Glossary to ARCH (GARCH). CREATES Research Paper (49), 46.

Bouoiyour, J. and R. Selmi (2015). What Does Bitcoin Look Like? Annals of Economics and Finance 16 (2), 449–492.

Bouoiyour, J., R. Selmi, A. Tiwari, and O.R. Olayeni (2016). What drives Bitcoin price? Economics Bulletin 36 (2), 843–850.

Caporale, G.M. and T. Zekokh (2018). Modelling Volatility of Cryptocurrencies Using Markov-Switching GARCH Models. CESifo Working Paper Series 1 (7167), 26.

Chan, S., J. Chu, S. Nadarajah, and J. Osterrieder (2017). A Statistical Analysis of Cryptocurrencies. Journal of Risk and Financial Management 10 (2), 1–23.

Christoffersen, P.F. (1998). Evaluating Interval Forecasts. International Economic Review 39 (4), 841–862.

Christoffersen, P.F. (Ed.) (2003). Elements of Financial Risk Management. San Diego: Academic Press.

Chu, J., S. Chan, S. Nadarajah, and J. Osterrieder (2017). GARCH Modelling of Cryptocurrencies. Journal of Risk and Financial Management 10 (17), 15.

Dickey, D.A. and W.A. Fuller (1979). Distribution of the Estimators for Autoregressive Time Series With a Unit Root. Journal of the American Statistical Association 74 (366), 427–431.


Efron, B. and R.J. Tibshirani (Eds.) (1993). An Introduction to the Bootstrap. Chapman and Hall/CRC.

Embrechts, P., R. Kaufmann, and P. Patie (2005). Strategic Long-term Financial Risks: Single Risk Factors. Computational Optimization and Applications 32 (1-2), 61–90.

Engle, R.F. (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica 50 (4), 987–1007.

Engle, R.F. and G. Lee (1999). A Permanent and Transitory Component Model of Stock Return Volatility. In Cointegration, Causality and Forecasting: A Festschrift in Honor of Clive W.J. Granger, pp. 475–497. New York: Oxford University Press.

Frunza, M.C. (Ed.) (2016). Solving Modern Crime in Financial Markets: Analytics and Case Studies. Academic Press.

Ghalanos, A. (2018). Introduction to the rugarch package. (Version 1.3-8). Available at ftp://ubuntu.c3sl.ufpr.br/CRAN/web/packages/rugarch/ vignettes/Introduction_to_the_rugarch_package.pdf.

Gkillas, K. and P. Katsiampa (2018). An application of extreme value theory to cryptocurrencies. Economics Letters 164, 109–111.

Glaser, F., M. Haferkorn, M. Siering, M.C. Weber, and K. Zimmermann (2014). Bitcoin - Asset or Currency? Revealing Users' Hidden Intentions. Working Paper, ECIS 2014, Tel Aviv.

Gumbel, E.J. (Ed.) (1958). Statistics of Extremes. New York: Columbia Uni-versity Press.

Katsiampa, P. (2017). Volatility estimation for Bitcoin: A comparison of GARCH models. Economics Letters 158, 3–6.

Likitratcharoen, D., T.N. Ranong, N. Chuengsuksomboon, R. Sritanee, and A. Pansriwong (2018). Value at Risk Performance in Cryptocurrencies. The Journal of Risk Management and Insurance 22 (1), 11–28.

Liu, R., Z. Shao, G. Wei, and W. Wang (2017). GARCH Model With Fat-Tailed Distributions and Bitcoin Exchange Rate Returns. Journal of Accounting, Business and Finance Research 1 (1), 71–75.

McNeil, A.J. and R. Frey (2000). Estimation of Tail-Related Risk Measures for Heteroscedastic Financial Time Series: An Extreme Value Approach. Journal of Empirical Finance 7 (3-4), 271–300.


Naimy, V.Y. and M.R. Hayek (2018). Modelling and predicting the Bitcoin volatility using GARCH models. International Journal of Mathematical Modelling and Numerical Optimisation 8 (3), 197.

Nelson, D.B. (1991). Conditional Heteroskedasticity in Asset Returns: A New Approach. Econometrica 59 (2), 347–370.

Nelson, D.B. (1992). Filtering and forecasting with misspecified ARCH models I: Getting the right variance with the wrong model. Journal of Econometrics 52 (1-2), 61–90.

Osterrieder, J. and J. Lorenz (2017). A Statistical Risk Assessment of Bitcoin and its Extreme Tail Behavior. Annals of Financial Economics 12 (1), 19.

Peng, Y., P.H.M. Albuquerque, J.M.C. Sa, A.J.A. Padula, and M.R. Montenegro (2018). The best of two worlds: Forecasting High Frequency Volatility for cryptocurrencies and traditional currencies with Support Vector Regression. Expert Systems With Applications 97, 177–192.

Pickands, J. (1975). Statistical Inference Using Extreme Order Statistics. The Annals of Statistics 3 (1), 119–131.

Poon, S.H. and C. Granger (2003). Forecasting volatility in Financial Markets: A Review. Journal of Economic Literature 41 (2), 478–539.

Schwarz, G.E. (1978). Estimating the Dimension of a Model. Annals of Statistics 6 (2), 461–464.

Schwert, G.W. (1989). Why Does Stock Market Volatility Change Over Time. Journal of Finance 44 (5), 1115–1153.

Stavroyiannis, S. (2017). Value-at-Risk and Expected Shortfall for the Major Digital Currencies. Working Paper.

Stavroyiannis, S. (2018). Value-at-risk and related measures for the Bitcoin. The Journal of Risk Finance 19 (2), 127–136.

Trucios, C. (2018). Forecasting Bitcoin risk measures: A Robust Approach. Working Paper .


Figure 2: Bitcoin’s log-return series in percentages


Figure 4: XRP’s log-return series in percentages

Descriptive statistics of the log-return series in percentages

Statistic        Bitcoin     XRP       Monero
Mean              0.1592     0.307     0.428
Median            0.2071    -0.287     0.0880
Standard Dev      3.96       7.35      7.186
Minimum         -25.178    -49.628   -28.465
1st Quantile     -1.186     -1.995    -2.791
3rd Quantile      1.784      1.887     3.434
Maximum          28.71      88.127    62.47
Observations   1461        1228      1228
Kurtosis          7.07      26.890     9.414
Skewness         -0.32       2.560     1.122


Figure 8: Histogram, smoothed empirical probability density function (pdf) and normal pdf of Bitcoin’s log-returns


Figure 10: Histogram, smoothed empirical probability density function (pdf) and normal pdf of XRP’s log-returns
