
Master's Thesis

An investigation to the performance of misspecified VaR models

Student: David Kranenburg

Student number: 10574697

Date of final version: 15 August 2017

Master's programme: Econometrics

Specialisation: Econometrics

Supervisor: Prof. dr. S. A. Broda

Second reader: Prof. dr. H. P. Boswijk

Faculty of Economics and Business


Statement of Originality

This document is written by Student David Kranenburg who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Abstract

This thesis analyses the performance of misspecified Value at Risk (VaR) models. The VaR methods taken into account are the parametric location-scale method, based on a normal distribution or Student's t distribution, and a nonparametric method, the historical simulation method. Forecasts are made using the unconditional variance (unfiltered) or conditional variances (filtered) following from volatility models. The conditional variance models under analysis are the EWMA(0.94) model, GARCH(1,1) model, IGARCH(1,1) model and GAS(1,1) model. Different in-sample sizes are considered, being 500, 1000 and 1500, in order to investigate the effect of the in-sample size on the VaR forecasts. Two datasets are used for research: one consisting of 1000 Monte Carlo simulated time series of returns following from an ASVJt-model, the other being S&P 500 returns covering the last 20 years. For each time series, 1000 out-of-sample one-step-forward VaR, based on a coverage rate of 5% or 1%, are forecasted by a moving window approach. These VaR are backtested using the Christoffersen Unconditional Coverage test, the Christoffersen Conditional Coverage test and the Dynamic Quantile test. The backtest results show that a normal distribution fails to capture a correct VaR, especially for a 1% coverage rate. A Student's t distribution captures the VaR better for this coverage rate; however, the historical simulation method outperforms both of the aforementioned, for both coverage rates. The VaR based on conditional variance models always outperforms the VaR based on the unconditional variance. For a small in-sample size, being 500, the EWMA model has the best performance in most cases, because the other volatility models suffer from inaccuracy issues. For a larger in-sample size, and the parametric method, the GARCH and GAS models have the best performance in most cases. For the larger in-sample sizes, the EWMA and IGARCH model have a similar performance, and the same holds for the GARCH and GAS model. For the historical simulation method and a large enough in-sample size there is little difference between the performances of the four volatility models. This, along with the fact that the historical simulation method outperforms the parametric methods, indicates that it is convenient to apply the filtered historical simulation method for VaR forecasting.


Acknowledgements

This thesis is written under the guidance of KAS BANK N.V. and supervision of prof. dr. S. A. Broda from the University of Amsterdam. I would like to thank the aforementioned for their support in the research of this thesis.


Contents

1 Introduction 1

2 Value at Risk 6

2.1 Value at Risk framework . . . 6

2.1.1 GARCH(p,q) model and extensions . . . 9

2.1.2 EWMA(λ) model . . . 11

2.1.3 GAS(p,q) model . . . 12

2.2 Parametric method . . . 13

2.3 Historical Simulation . . . 14

3 Backtesting Value at Risk 16

3.1 Backtesting framework . . . 16

3.2 Christoffersen test . . . 17

3.2.1 Christoffersen test for unconditional coverage . . . 17

3.2.2 Christoffersen tests for independence and correct conditional coverage . . . 18

3.3 Dynamic Quantile test . . . 20

3.4 Finite sample properties of the backtests . . . 21

3.4.1 Size properties of the backtests . . . 22

3.4.2 Power properties of the backtests . . . 23

3.4.3 Expected backtest properties for this research . . . 25

4 Empirical Methodology 25

5 Monte Carlo study 30

5.1 Simulation of the returns . . . 30

5.2 Backtest results . . . 33

6 Empirical application 39

6.1 S&P 500 returns . . . 39

6.2 Backtest results . . . 41


8 Limitations of the study 44

9 Conclusion 45

References 49


1 Introduction

The Value at Risk (VaR) is an important measure to analyse the risk of portfolios. This measure gives an indication of the loss that will not be exceeded in a future period, at a given confidence level. Many financial institutions use different models to calculate the VaR. However, the application of this risk measure comes with critical limitations, which are important and should be taken into account. These limitations will now be discussed, but only one of them will be researched in this thesis.

One of the main limitations of the VaR is that it is not a coherent risk measure, since it violates the subadditivity condition (Artzner, Delbaen, Eber & Heath, 1999). The subadditivity condition states that the risk of a portfolio should be less than or equal to the sum of the risks of each separate investment within a portfolio. More formally, let ρ be a risk measure and $X_1$ and $X_2$ be two separate investments.

Then the following must hold for ρ so as to be subadditive (Artzner, Delbaen, Eber & Heath, 1999):

$$\rho(X_1 + X_2) \leq \rho(X_1) + \rho(X_2). \qquad (1)$$

Since the VaR violates this subadditivity condition, it is not in agreement with the common practice of diversification, which states that different kinds of investments get bundled into a portfolio so as to decrease risk.

A second limitation of the VaR is that it does not tell anything about the distribution (`shape') of the tail of the distribution of the returns. It is of interest to know the expected loss given that a loss greater than or equal to the VaR occurred. The Expected Shortfall (ES)¹ answers this question and is often taken into account besides the VaR. Moreover, the ES satisfies all the conditions of a coherent risk measure (Artzner, Delbaen, Eber & Heath, 1999).

Another important limitation is that VaR models rely on assumptions which might not reflect reality. In particular, a model which completely and correctly specifies the processes behind financial returns does not exist, i.e. in practice all models are misspecified. Empirical research has to be conducted in order to obtain a model with a good fit to the financial returns. However, to a


certain degree, these models will still be misspecified. Therefore, research has to be done on the robustness of and differences in performance between the models. The evaluation of the performance of VaR models has been the topic of relatively recent studies, for example Bao, Lee and Saltoglu (2006) and Kuester, Mittnik and Paolella (2006).

The scope of this thesis is only the last-mentioned limitation, i.e. research will be conducted in order to conclude which misspecified VaR model has the best performance. In the first study, the real process behind the returns is known and Monte Carlo simulated, which gives the possibility to test the robustness of various models in a controlled environment. However, in order to validate the results of the simulation study, these models will also be analysed on a real financial daily return series, the S&P 500 for the period 1997 to 30-06-2017, covering the last twenty years. Of course, the true DGP of the S&P 500 returns is unknown, so in this case less can be said about the degree of misspecification of the models.

The analysis will focus on the differences between Value at Risk models, i.e. the model choices a risk manager has to make in VaR calculation. These choices will now be discussed. Firstly, when calculating a VaR, one has to make an assumption about the distribution of the returns. This can either be a parametric distribution, for example by assuming a normal or Student's t distribution for the returns, or nonparametric, by means of historical simulation (HS), in which the distribution of the returns is based on a historical sample. All three aforementioned methods will be applied and analysed. Secondly, one has to decide whether to base the VaR model on an unconditional variance or a conditional variance, and in the latter case a volatility model has to be chosen. This thesis will research the performance of a VaR based on an unconditional variance, but also a VaR based on an EWMA (Morgan, 1996), IGARCH (Engle & Bollerslev, 1986), GARCH (Bollerslev, 1986) or GAS model (Creal, Koopman & Lucas, 2008). Lastly, on some occasions a researcher has a limited amount of financial returns data available for Value at Risk forecasting. To this end, the effect of varying in-sample sizes, being 500, 1000 and 1500, on the forecasting of the VaR will be analysed.


Previous studies have investigated the performance of VaR methods and volatility models. A selection of them will now be discussed.

Slim, Koubaa and BenSaïda (2017) conclude that although volatility models based on a normal distribution might be useful for describing the risk in a period with low volatility, they yield unsatisfactory results for high volatility periods. The tail characteristics of the distribution of the returns are of much importance in those periods, especially for a 1% coverage rate VaR, and the normal distribution underestimates the risk in that case. A Student's t distribution will better capture that risk (Hartz, Mittnik & Paolella, 2006); however, Slim, Koubaa and BenSaïda (2017) state that volatility models based on a Student's t distribution tend to overestimate the risk. Slim, Koubaa and BenSaïda (2017) find that the FIGARCH model (Baillie, Bollerslev & Mikkelsen, 1996), a model which distinguishes between short memory (GARCH) and long memory (IGARCH)² in the volatility, performs best for VaR forecasting in developing and developed market portfolios, because it takes long memory into account. However, they find that GARCH and GJR-GARCH models perform best in emerging and frontier markets respectively (Slim, Koubaa & BenSaïda, 2017).

Angelidis, Benos and Degiannakis (2004) find that the Student's t distribution performs better than the normal distribution, which supports the aforementioned results. However, they do not find a volatility model with a best performance among their analysed GARCH models. Moreover, they do not find a clear relationship between the in-sample size and optimal models, and find significant differences between the VaR for different sample sizes.

Hartz, Mittnik and Paolella (2006) improved the normal distribution based GARCH model with a resampling method and a bias correction step and find that its VaR forecasting performance is promising, even better than a GARCH model based on a Student's t distribution.

Marzo and Zagaglia (2010) studied the forecasting performance of various GARCH, EGARCH and GJR-GARCH models with a normal, Student's t or generalized exponential distribution, for crude oil futures. They find that the EGARCH models performed best, closely followed by GARCH models based on a generalized exponential distribution.

² However, an IGARCH process has infinite persistence in volatility, meaning that a shock


Lucas and Zhang (2016) researched the EWMA model, as proposed by Morgan (1996), but in a GAS model framework, by means of an integrated GAS model. Their model is based on a skewed Student's t distribution. They find that this model performs as well as or better than already existing methods for forecasting the volatility of stock returns or exchange rate returns.

Gao and Xiao-Hua (2016) also find that score driven models, based on a skewed Student's t distribution, perform well in VaR forecasting.

The following results are expected, based on the aforementioned results of previous studies. It is expected that a VaR based on the normal distribution will fail to capture the correct risk, leading to many rejections, especially for a VaR based on a 1% coverage rate. A Student's t distribution will capture this risk better and thus lead to fewer rejections; however, it might overestimate the VaR. The HS method is expected to perform best, mainly because it is a nonparametric method.

Taking conditional variances will probably lead to a better performance of the VaR, since it will take correlations between volatilities into account.

The GAS volatility model, which mimics an EGARCH model with the settings of this research, is expected to have the best performance in the prediction of the volatility, because of the results of Lucas and Zhang (2016) and Gao and Xiao-Hua (2016), but also because it is closest to the DGP used for the simulation of the returns (see Section 5.1).

The VaR is expected to differ significantly for varying in-sample sizes, as concluded by Angelidis, Benos and Degiannakis (2004). A larger in-sample size is expected to increase the performance of VaR forecasting, because a larger information set will lead to more accuracy in the forecasting.

The empirical methodology for research is as follows. Firstly, 1000 return series will be Monte Carlo simulated according to an Asymmetric Stochastic Volatility with Jump and Student's t distribution (ASVJt) model. The aforementioned VaR methods (parametric or nonparametric) will then be used to calculate the Value at Risk for each return series, with a 5% and 1% coverage rate. Both methods will first be evaluated unfiltered (unconditional variance). A sample of the simulated returns, with the aforementioned in-sample sizes, will then be used in order to calculate the one-step-ahead out-of-sample VaR. In total


1000 one-step-ahead out-of-sample VaR will be calculated per return series, based on a moving window approach.

Secondly, the methods will be filtered, i.e. using conditional variance models. The following volatility models will be applied for filtering: the EWMA model, GARCH(1,1) model, IGARCH(1,1) model and GAS(1,1) model. The GAS(1,1) model will mimic an EGARCH(1,1) model in the settings of this research.

The different VaR methods will be compared by backtests, in order to analyse which method is preferable. The backtests applied here are the Christoffersen Unconditional Coverage test (Christoffersen, 1998), the Christoffersen Conditional Coverage test (Christoffersen, 1998) and the Dynamic Quantile test (Engle & Manganelli, 1999). The unfiltered and filtered VaR models will be compared based on the results from these backtests. Both the frequency of the violations³, i.e. whether the coverage rate is correct, and the independence of the violations, i.e. that no clustered VaR violations should occur, will be analysed by these backtests.

The same empirical methodology will be applied to the S&P 500 returns, but with a few changes. Noticeably, there will only be one time series of returns in that case. The in-sample size is fixed at 4000 and the out-of-sample size, i.e. the number of forecasted one-step-ahead VaR, is set at 1000. Again, a moving window approach is applied.

The remainder of this thesis is as follows. Section 2 will describe the VaR models and Section 3 will explain and discuss the backtests used for research and their finite sample properties. Section 4 will describe the empirical methodology for research. The ASVJt-model, which is used for the Monte Carlo simulation study, and the backtest results from research on the simulated returns will be explained and analysed in Section 5. Then, in Section 6, the empirical application based on the S&P 500 returns, which is the second dataset for research, will be explained and the backtest results on this dataset will be provided and analysed. Section 7 will compare the results from the simulation study and the empirical application. Section 8 provides limitations of the research. This thesis ends with a short summary of the research, a conclusion based on the results of this study, and a proposal of recommendations for future research.


2 Value at Risk

This section starts with the explanation of the framework behind the VaR. The assumptions and choices made for the calculation of the VaR will be discussed, particularly the different volatility models applied for filtering. After that, the methods, namely the parametric normal or Student's t distribution method and the nonparametric historical simulation method, will be explained.

2.1 Value at Risk framework

Let $r_t$ denote the log-return. The one-step-ahead $VaR(\alpha)_t$ for coverage rate 100α%, conditional on an information set $\mathcal{F}_{t-1}$, is then defined as (Dumitrescu, Hurlin & Pham, 2012)⁴:

$$Pr[r_t < -VaR(\alpha)_t \mid \mathcal{F}_{t-1}] = \alpha, \quad \forall t \in \mathbb{Z}. \qquad (2)$$

In this research, the assumption of symmetrically distributed⁵ returns and the assumption of a zero conditional mean⁶ of the returns, $\mu_t = E[r_t \mid \mathcal{F}_{t-1}] = 0$, are made.

Statistical support for symmetry in the distribution of financial returns is found by Coronel-Brizio et al. (2007). In Peiró (1999), the symmetry got rejected under a normal distribution assumption for the returns, but not in case of a Student's t, or mixed normal, distribution assumption for the returns. On the contrary, some empirical evidence also exists for asymmetry in returns, for example in Gunner, Brooks & Storer (2006), so the assumption of a symmetrical distribution for the returns is not fully supported by empirical research. However, this assumption is still made for every method in this research, mainly for simplicity reasons. Obviously, for the parametric methods in this research (normal or Student's t distribution), the symmetry in the returns is directly assumed by the characteristics of these distributions. On the contrary, the nonparametric historical simulation method does not impose the assumption of symmetry

⁴ Following the actuarial convention, a loss is denoted by a positive number (Dumitrescu, Hurlin & Pham, 2012). For this reason, the VaR has a positive sign.

⁵ This assumption is in general not necessary.

⁶ In practice, $E[r_t|\mathcal{F}_{t-1}] \approx 0$ for daily log-returns. This assumption is made in every model


in the returns. This assumption is restrictive and will negatively influence the results for this method in case the returns show skewness in their distribution, a fact which has to be taken into account when analysing the results from the application of the historical simulation method in this research. The historical simulation method would perform better without the assumption of symmetry in the distribution of the returns.

The zero conditional mean assumption follows from the Efficient Market Hypothesis (EMH), which basically means that stock prices are expected to follow a random walk. See Ţiţan (2015) for a review of the literature and empirical research on the EMH.

Because of the two aforementioned assumptions, the 100α% quantile is minus the 100(1 − α)% quantile and this is applied to Equation (2). Therefore, the $VaR(\alpha)_t$ is the 100(1 − α)% quantile of the returns. Thus, the returns will be greater than or equal to the VaR with a probability of α, in case of a correct VaR calculation. Most VaR calculations are based on a coverage rate⁷ of 5% or 1%. In this research the VaR will also be analysed for these two coverage rates.

The information set, $\mathcal{F}_{t-1}$, should contain enough observations of past returns in order to accurately predict the Value at Risk (Hendricks, 1996; Angelidis, Benos & Degiannakis, 2004). Moreover, Kuester, Mittnik and Paolella (2006) show that VaR models can perform differently under different sample sizes. The effect of the number of observations in the information set on the performance of the Value at Risk methods will be analysed in Section 5, by taking the following three sizes⁸ for the information set: S = 500, 1000, 1500. Given the information set, a total of T = 1000 out-of-sample one-step-forward VaR forecasts are computed by a moving window approach, which will now be explained. Let the

⁷ The Basel Committee on Banking Supervision (2016) states a 1% coverage rate VaR measure as a requirement.

⁸ However, one should be aware of the effect of the sample size on the GARCH(1,1) and GAS(1,1) estimation, for the filtered VaR models. A small sample size might result in less accurate estimation of these two volatility models. Hwang and Valls Pereira (2006) propose that at least 500 observations are needed for accurate GARCH(1,1) estimation; however, in research it will be shown that this number is not enough. Creal, Koopman and Lucas (2008) show by Monte Carlo simulation that GAS estimations get more accurate for increasing sample sizes.


total number of observed returns be denoted by R, i.e. the R-th return is the last observed return. Then, the $VaR(\alpha)_t$, for $t = 1, ..., T$, calculated by a moving window approach, is based on the following sample of returns:

$$\{r_i\}_{i=Q-S}^{Q-1} = [r_{Q-S}, r_{Q-S+1}, ..., r_{Q-2}, r_{Q-1}], \quad Q = R - T + t, \quad t = 1, ..., T. \qquad (3)$$

The VaR series, $\{VaR(\alpha)_t\}_{t=1}^{T}$, will be compared with the corresponding observed return series $\{r_i\}_{i=R-T+1}^{R}$,⁹ for the backtesting procedure, which will be explained in Section 3 and applied in Section 5 and Section 6.
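To make the moving window of Equation (3) concrete, the following Python sketch (all names, such as `var_forecast`, are illustrative and not part of the thesis) shows how the T one-step-ahead VaR forecasts could be produced from a series of R observed returns; the unfiltered historical simulation rule is used only as a placeholder forecasting function.

```python
import numpy as np

def rolling_var_forecasts(returns, S, T, alpha, var_forecast):
    """One-step-ahead VaR forecasts by the moving window of Equation (3).

    returns      : array of R observed log-returns (oldest first)
    S            : in-sample window size (500, 1000 or 1500 in this thesis)
    T            : number of out-of-sample forecasts (1000 in this thesis)
    alpha        : coverage rate, e.g. 0.05 or 0.01
    var_forecast : function (window, alpha) -> positive VaR forecast
    """
    returns = np.asarray(returns, dtype=float)
    R = returns.size
    var_series = np.empty(T)
    for t in range(T):
        q = R - T + t                 # 0-based index of the return being forecast
        window = returns[q - S:q]     # the S most recent returns before it
        var_series[t] = var_forecast(window, alpha)
    return var_series

# placeholder rule: unfiltered historical simulation
# (alpha-quantile of the window, sign-flipped to the positive-loss convention)
hs_rule = lambda window, alpha: -np.quantile(window, alpha)
```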

Hull and White (1998) and Barone-Adesi et al. (2002) separate VaR models into unfiltered VaR models and filtered VaR models (Bao, Lee & Saltoglu, 2006). A VaR model is unfiltered when it is applied to the returns, $r_t$, or when the return's volatility, $\sigma_t$, is assumed to be constant over time, so $\sigma_t = \sigma$. A VaR model is filtered when it is applied to the standardized demeaned return series, $z_t = \frac{r_t - \mu_t}{\sigma_t}$, with the volatility being time varying, following from a volatility model.

Bao, Lee and Saltoglu (2006) apply a GARCH(1,1) model for filtering and conclude that filtering is useful for some VaR models, but harmful for other VaR models, a result which is rather odd when considering that a constant volatility over time might be too restrictive. Kuester, Mittnik and Paolella (2006) find that all of their unfiltered VaR models show clustered VaR violations, and that only a few unfiltered models result in the correct violation frequency, leading to the conclusion that filtering allows for better Value at Risk prediction. Therefore, it is expected that filtering will lead to a better performance of the VaR models, since the assumption of a constant volatility over time might be too restrictive and, moreover, an unfiltered VaR model does not take possible correlations between volatilities into account, leading to clustered VaR violations.

The choice of volatility model is important for correct forecasting of the conditional variance. Various volatility models have been proposed and applied in empirical research on financial returns. The GARCH model (Bollerslev, 1986) and variants of it have been used extensively in previous studies about the VaR (see for example Bao, Lee & Saltoglu, 2006; Kuester, Mittnik & Paolella, 2006; Angelidis, Benos & Degiannakis, 2004) and this model and variants of it will be

⁹ Please note that the index for the returns here is `i', and not `t'. When $r_t$ is mentioned in


applied in this research as well.

Recently, a new form of volatility model has been proposed, based on the score of the distribution of the returns. These volatility models are called Generalized Autoregressive Score (GAS) models (Creal, Koopman & Lucas, 2008). An attractive feature of GAS models is that several other volatility models, like GARCH and EGARCH models, can be derived from them, when making specific choices within the model. Examples of studies in which this model is applied to research the VaR are Gao and Xiao-Hua (2016) and Lucas and Zhang (2016).

The volatility models which will be applied in order to analyse the impact of filtering are the GARCH(1,1) model, IGARCH(1,1) model, EWMA(λ) model and the GAS(1,1) model¹⁰. These volatility models will be explained in the following paragraphs.

2.1.1 GARCH(p,q) model and extensions

The GARCH(p,q) model was first proposed by Bollerslev (1986). This model is as follows:

$$r_t = \mu_t + \epsilon_t,$$
$$\sigma_t^2 = \alpha_0 + \sum_{k=1}^{q} \alpha_k \epsilon_{t-k}^2 + \sum_{k=1}^{p} \beta_k \sigma_{t-k}^2, \qquad (4)$$

in which $\mu_t = E[r_t|\mathcal{F}_{t-1}] = 0$ and $\epsilon_t \sim N(0, \sigma_t^2)$ or $\epsilon_t \sim t_\nu(0, \sigma_t^2)$. However, the original GARCH model as proposed by Bollerslev (1986) only assumed normally distributed residuals. The variance of the residuals, but in this case also the variance of the returns because of the zero conditional mean assumption, is denoted by $\sigma_t^2$. The coefficients $\alpha_0$, $\alpha_k$ and $\beta_k$ can be estimated by maximum likelihood, using a time series of returns, a starting value¹¹ $\epsilon_0$ for $\epsilon_t = r_t - \mu_t$, and a starting value for the conditional variance¹², say $\sigma_0^2$. An explanation of the maximum likelihood estimation for a GARCH(p,q) model with normally distributed residuals can be found in Section 5 of Bollerslev (1986).

In order to generate only positive variances, the following constraints are imposed (Bollerslev, 1986): $\alpha_0 > 0$, $\alpha_i \geq 0$, $\beta_j \geq 0$, for $i = \{1, ..., q\}$, $j = \{1, ..., p\}$. When the sum of the coefficients is less than one, $\sum_{k=1}^{q}\alpha_k + \sum_{k=1}^{p}\beta_k < 1$, the GARCH variance process is said to be stationary, which means that the variance fluctuates around a mean, i.e. reverts to a mean, given by:

$$E[\sigma_t^2] = \frac{\alpha_0}{1 - \sum_{k=1}^{q}\alpha_k - \sum_{k=1}^{p}\beta_k}.$$

¹⁰ With the same form as an EGARCH model.

¹¹ Can be 0, for example.

For the research in this paper, p = 1 and q = 1 is chosen, thus a GARCH(1,1) model. The coefficients are estimated for each sample of returns, with the sample defined by Equation (3). Then, with the use of starting values for $\sigma_0^2$ and $\epsilon_0$, a series of conditional variances can be generated by Equation (4). The estimated coefficients, the sample of returns and the generated series of conditional variances can then be used in order to calculate the out-of-sample, one-step-forward $\sigma_t^2$. This one-step-forward $\sigma_t^2$ is needed in the application of the filtered VaR methods. See Section 4 for details about the calculation of $\sigma_t^2$.
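As an illustration of how the recursion in Equation (4) delivers the out-of-sample one-step-forward variance once coefficients are available (the maximum likelihood estimation itself is handled by the MFE toolbox in the thesis), consider the following sketch; the coefficient values in the usage comment are purely hypothetical.

```python
import numpy as np

def garch11_one_step_variance(returns, alpha0, alpha1, beta1, sigma2_0=None):
    """Filter the GARCH(1,1) variance of Equation (4) (p = q = 1) through a
    window of returns and return the one-step-ahead conditional variance.
    Residuals equal the returns because mu_t = 0 in this research."""
    eps = np.asarray(returns, dtype=float)
    sigma2 = eps.var() if sigma2_0 is None else sigma2_0   # simple starting value
    for e in eps:
        # sigma^2_t = alpha0 + alpha1 * eps^2_{t-1} + beta1 * sigma^2_{t-1}
        sigma2 = alpha0 + alpha1 * e**2 + beta1 * sigma2
    return sigma2   # variance of the period directly after the window

# e.g. garch11_one_step_variance(window, alpha0=1e-6, alpha1=0.05, beta1=0.90)
```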

Various extensions to the GARCH model have been proposed in the literature. The extensions relate to a transformation of the dependent variable $\sigma_t^2$, the addition of a leverage effect¹³ term to the model, the assumption of another distribution for the error $\epsilon_t$ and a violation of the stationarity condition.

Nelson (1991) proposed the EGARCH model, which applies $\log(\sigma_t^2)$ rather than $\sigma_t^2$ in the second line of Equation (4), and which also takes the leverage effect into account (see the third line of Equation (5)). The EGARCH model, for p = 1 and q = 1, is as follows:

$$r_t = \mu_t + \epsilon_t,$$
$$\log(\sigma_t^2) = \omega_0 + \alpha_1 g(z_{t-1}) + \beta_1 \log(\sigma_{t-1}^2),$$
$$g(z_{t-1}) = \theta z_{t-1} + \gamma[\,|z_{t-1}| - E(|z_{t-1}|)\,], \qquad (5)$$

in which $z_t = \frac{r_t - \mu_t}{\sigma_t}$. This model has the attractive property that no restrictions on the coefficients have to be imposed to ensure that $\sigma_t^2$ is positive. However, a leverage effect corresponds to $\theta < 0$ and for stationarity $|\beta_1| < 1$ should hold (Nelson, 1991). The mean of this process is then given by:

$$E[\log(\sigma_t^2)] = \frac{\omega_0}{1 - \beta_1}.$$


Another model which takes the leverage into account is the GJR-GARCH (Glosten, Jagannathan & Runkle, 1993), which is also known as threshold GARCH or TGARCH (Zakoian, 1994). However, this model is not applied in this research, therefore its details will not be discussed.

The distribution of $\epsilon_t$ has been found to deviate from a normal distribution in empirical research (Broda, Haas, Krause, Paolella & Steude, 2013). The distribution violates the normal distribution in that it has fatter tails than a normal distribution or in that it is asymmetric, or both. For fatter tails, one can apply a t-distribution, as also done in this research, and asymmetry can be approached by a skewed t-distribution, for example. When both characteristics of a normal distribution are not satisfied, then a mixed distribution can be applied to the residuals, as done by Broda et al. (2013). These researchers studied mixture GARCH models, and find that a mixed stable GARCH process has the best performance among their analysed mixed distributions.

When the GARCH process is nonstationary, so $\sum_{k=1}^{q}\alpha_k + \sum_{k=1}^{p}\beta_k = 1$, then the GARCH process is integrated (IGARCH), as stated by Engle and Bollerslev (1986). The EWMA model, which will now be explained, is a special case of the IGARCH model.

2.1.2 EWMA(λ) model

The EWMA(λ) model, proposed by Morgan (1996), is as follows:

$$r_t = \mu_t + \epsilon_t,$$
$$\sigma_t^2 = (1 - \lambda)\epsilon_{t-1}^2 + \lambda\sigma_{t-1}^2 = \sigma_{t-1}^2 \cdot (\lambda + (1 - \lambda)z_{t-1}^2), \qquad (6)$$

in which λ is called the decay factor (Morgan, 1996), $\mu_t = E[r_t|\mathcal{F}_{t-1}] = 0$ in this research, and $\epsilon_t \sim N(0, \sigma_t^2)$ or $\epsilon_t \sim t_\nu(0, \sigma_t^2)$. The variance of the returns in this case is $\sigma_t^2$ and $z_t$ is a standardized variable: $z_t \sim N(0, 1)$ or $z_t \sim t_\nu(0, 1)$, with $t_\nu(0, 1)$ being a standardized Student's t distribution.

As mentioned by Morgan (1996), the EWMA model can be seen as a form of GARCH(1,1) model (see Equation (4)) with the restriction¹⁴ $\alpha_1 + \beta_1 = 1$, which is an IGARCH(1,1) model, together with the restriction $\alpha_0 = 0$. Nelson (1990) shows that for such IGARCH(1,1) models, the variance $\sigma_t^2$ is a martingale, $E(\sigma_t^2|\sigma_{t-1}^2) = \sigma_{t-1}^2$, and that its process is analogous to a random walk without drift.

Morgan (1996) proposes 0.94 as the optimal choice for the decay factor for daily returns; this will be applied for this model in the research. The volatility series, including the one-step-ahead variance, can then be generated with Equation (6), using a start value $\sigma_0^2$, the sample of the return series from Equation (3), and this λ of 0.94. In order to analyse the impact of fixing the decay factor of the EWMA model at 0.94, an IGARCH(1,1) model, without constant, will also be estimated and evaluated. See Section 4 for details.
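A minimal sketch of the EWMA recursion of Equation (6) with the fixed decay factor of 0.94; the function name and default starting value are illustrative only.

```python
import numpy as np

def ewma_variance_forecast(returns, lam=0.94, sigma2_0=None):
    """EWMA / RiskMetrics variance recursion of Equation (6); returns the
    one-step-ahead variance for the period after the sample window."""
    eps = np.asarray(returns, dtype=float)      # mu_t = 0, so residuals = returns
    sigma2 = eps.var() if sigma2_0 is None else sigma2_0
    for e in eps:
        # sigma^2_t = (1 - lambda) * eps^2_{t-1} + lambda * sigma^2_{t-1}
        sigma2 = (1.0 - lam) * e**2 + lam * sigma2
    return sigma2
```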

2.1.3 GAS(p,q) model

The GAS(p,q) model is proposed by Creal, Koopman and Lucas (2008). This model is as follows:

$$r_t = \mu_t + \epsilon_t,$$
$$f_t = \omega + \sum_{i=0}^{p-1} A_i s_{t-i} + \sum_{j=1}^{q} B_j f_{t-j}, \qquad (7)$$

in which, as usual for this research, $\mu_t = E[r_t|\mathcal{F}_{t-1}] = 0$ and $\epsilon_t \sim N(0, \sigma_t^2)$ or $\epsilon_t \sim t_\nu(0, \sigma_t^2)$. The parameter vector $f_t$ is a function of the conditional variance $\sigma_t^2$, for example $f_t = \log(\sigma_t^2)$. Vector ω is a vector of constants and $A_i$ and $B_j$ are coefficient matrices. The scaled score, $s_t$, is defined as follows (Creal, Koopman and Lucas, 2008):

$$s_t = S(t, R_1^{t-1}, F_1^{t-1}) \cdot \nabla_t = S_{t-1} \cdot \nabla_t,$$
$$\nabla_t = \partial \log p(r_t \mid f_{t-1}, R_1^{t-1}, X_1^{t}, F_1^{t-2}; \theta) / \partial f_{t-1},$$
$$S_{t-1} = S(t, R_1^{t-1}, X_1^{t}, F_1^{t-1}; \theta), \qquad (8)$$

in which $R_1^t = \{r_1, ..., r_t\}$ is the set of returns, $X_1^t = \{x_1, ..., x_t\}$ the set of covariates¹⁵ and $F_1^t = \{f_1, ..., f_t\}$ the set of functions $f_t$. The parameter vector is θ, i.e. $\theta = (\omega, A_1, ..., A_p, B_1, ..., B_q, \mu, \nu)$, and $p(r_t)$ is the density of the observed returns.


Matrix S(·) is a scaling matrix, which can be the inverse information matrix for example.

The parameters of the GAS model can be estimated by maximum likelihood. The maximum likelihood estimation of the GAS model is explained in Creal, Koopman and Lucas (2008).

Creal, Koopman and Lucas (2008, 2012) show that for specific choices of the function $f_t$, several known volatility models can be derived. They derive that when $f_t = \sigma_t^2$, $\epsilon_t \sim N(0, \sigma_t^2)$ and the inverse information matrix is used as scaling matrix, the GAS model equals the GARCH model¹⁶ (Bollerslev, 1986). For $f_t = \log(\sigma_t^2)$, Creal, Koopman and Lucas (2012) derive the EGARCH model (Nelson, 1991) out of a GAS model.

For this research, p = 1 and q = 1, thus a GAS(1,1) model will be applied to the returns. The function $f_t$ is chosen to be $\log(\sigma_t^2)$, so it mimics an EGARCH model, which is closest to the volatility model used for the simulation of the returns¹⁷. The inverse information matrix is used as a scaling matrix. Then, with the use of the return series of the sample in Equation (3) and starting values $\omega_0$, $A_0$, $B_0$, $\mu_0$, $\nu_0$ and $\sigma_0$, the parameter vector θ will be estimated by maximum likelihood. The volatility series, including the one-step-ahead variance, can then be generated with Equation (7). See Section 4 for details.

¹⁶ However, they mention that one should take note that for $\epsilon_t \sim t_\nu$, the GAS model takes a different form than that of a GARCH model with $\epsilon_t \sim t_\nu$.

¹⁷ The volatility model used for simulation is the Stochastic Volatility model. For details, read Section 5.
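To give a feel for the GAS(1,1) filter with $f_t = \log(\sigma_t^2)$, the sketch below uses the simpler case of normally distributed errors, for which the inverse-information-scaled score reduces to $s_t = \epsilon_t^2/\sigma_t^2 - 1$; the Student's t case used in parts of this research has a different (robust) score, and all parameter names here are illustrative.

```python
import numpy as np

def gas11_logvar_one_step(returns, omega, A1, B1, f0=None):
    """GAS(1,1) filter of Equation (7) with f_t = log(sigma_t^2), normal
    errors and inverse information scaling; returns the one-step-ahead
    variance. (With Student's t errors the scaled score takes another form.)"""
    eps = np.asarray(returns, dtype=float)          # mu_t = 0 in this research
    f = np.log(eps.var()) if f0 is None else f0     # starting value for f_t
    for e in eps:
        s = e**2 / np.exp(f) - 1.0                  # scaled score s_{t-1}
        f = omega + A1 * s + B1 * f                 # f_t = omega + A1*s_{t-1} + B1*f_{t-1}
    return np.exp(f)
```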

2.2 Parametric method

In the parametric method, the $VaR(\alpha)_t$ is calculated by assuming a parametric distribution for the returns and then taking its quantile. The quantiles¹⁸ $\hat{q}_t(\alpha)$ are approached by a location-scale method (Bae & Iscoe, 2012); thus the mean and variance of the returns are needed, as well as an assumed distribution for the standardized residuals. Therefore, the quantiles are obtained as follows (Bao, Lee & Saltoglu, 2006):

$$\hat{q}_t(\alpha) = \hat{\mu}_t + \hat{\sigma}_t G_t^{-1}(\alpha), \qquad (9)$$

in which $\hat{\mu}_t = E[r_t|\mathcal{F}_{t-1}] = 0$ by assumption. For the unfiltered parametric method, the variance follows from the unconditional variance of the returns in the sampling window: $\hat{\sigma}_t^2 = \frac{1}{S-1}\sum_{j=t-S}^{t-1} r_j^2$. The distribution of the standardized residuals is assumed to be $G_t \overset{i.i.d.}{\sim} N(0,1)$, or, with a t-distribution for the standardized residuals, $G_t \overset{i.i.d.}{\sim} t_\nu(0,1)$. Do note that in the assumption of a t-distribution, the degrees of freedom ν can be estimated for each sample of standardized residuals. However, the t-distribution then has to be standardized, by multiplying it with $\sqrt{\frac{\nu-2}{\nu}}$.

¹⁸ Do note that the $VaR(\alpha)_t$ is actually the 100(1−α)% quantile, $\hat{q}_t(1-\alpha)$, in the way it is

For the filtered case, the variance follows from the one-step-ahead conditional variance $\hat{\sigma}_t$ out of the EWMA, GARCH(1,1), IGARCH(1,1) or GAS(1,1) estimation applied to the returns in the sampling window, as described in Section 2.1. The distribution of the standardized residuals, $z_t = \frac{r_t - \mu_t}{\sigma_t}$, is then again $G_t \overset{i.i.d.}{\sim} N(0,1)$ or $G_t \overset{i.i.d.}{\sim} t_\nu(0,1)$, in which ν may vary per sample and can be estimated. Again, the t-distribution has to be standardized.

Since the parametric method already assumes a parametric distribution of the residuals, it is not expected to be very sensitive to the number of observations in the information set. However, a downside to this method is that the assumed distribution might be a poor approximation of the real distribution of the returns. For this reason, a non-parametric method will also be applied in this research. This method is the historical simulation method, which will now be explained.
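A small sketch of the location-scale calculation of Equation (9) under the zero conditional mean assumption, covering both the normal and the standardized Student's t case; scipy is used for the quantiles and the function name is illustrative.

```python
import numpy as np
from scipy import stats

def parametric_var(sigma, alpha, dist="normal", nu=None):
    """VaR(alpha)_t = -(mu_t + sigma_t * G^{-1}(alpha)) with mu_t = 0, so the
    VaR is minus the alpha-quantile of the assumed standardized distribution
    scaled by the (conditional or unconditional) volatility sigma."""
    if dist == "normal":
        q = stats.norm.ppf(alpha)
    elif dist == "t":
        # standardize the Student's t quantile by multiplying with sqrt((nu-2)/nu)
        q = stats.t.ppf(alpha, df=nu) * np.sqrt((nu - 2.0) / nu)
    else:
        raise ValueError("dist must be 'normal' or 't'")
    return -sigma * q      # positive number under the positive-loss convention

# e.g. a 1% VaR with volatility 0.02 under a standardized t with 5 d.o.f.:
# parametric_var(0.02, 0.01, dist="t", nu=5)
```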

2.3 Historical Simulation

The historical simulation method is a non-parametric method for calculating the Value at Risk, based on the key assumption that the sample of past returns is representative for the distribution of the one-step-forward return. Because this method is non-parametric, it is expected to give more accurate Value at Risk calculations than the aforementioned parametric methods (Adcock, Areal & Oliveira, 2011). However, it needs a long enough series of raw data in order to accurately calculate the Value at Risk (Piroozfar, 2009). This subsection will


firstly explain the unfiltered historical simulation method. The filtered historical simulation method is explained afterwards.

The unfiltered historical simulation method is easy and straightforward to apply. For the unfiltered historical simulation, the $VaR(\alpha)_t$ is calculated by a moving window approach, with the sample of returns described as in Equation (3). The historical simulation method then assumes that the distribution of the one-step-forward return is the same as the distribution of the sample. Therefore, a histogram is made of the returns in the sample and the $VaR(\alpha)_t$ is then the 100(1 − α)% quantile based on the histogram of the returns (Hendricks, 1996).

Even though this method is straightforward to apply, its application has some pitfalls. Besides the assumption that the distribution of past returns and future returns is the same, the unfiltered historical simulation method also relies on i.i.d. distributed returns (Bao, Lee & Saltoglu, 2006). These researchers mention that, for turmoil periods with a lot of significant risk changes, the unfiltered historical simulation method turns out to be a very bad measure of risk. Adcock, Areal and Oliveira (2011) also state that the unfiltered historical simulation method performs poorly when volatility clustering is present. Therefore it is proposed that filtered historical simulation, which takes volatility updating into account, is a more appropriate method to calculate the Value at Risk.

The filtered historical simulation method, which will be applied in this research, is explained in Adcock, Areal and Oliveira (2011) and Barone-Adesi, Giannopoulos and Vosper (2002) as follows¹⁹. Let the returns $r_t$ and volatility $\sigma_t^2$ be defined by the GARCH(1,1), IGARCH(1,1), EWMA(λ) or GAS(1,1) model explained in Section 2.1, but with Student's t distributed residuals. Then define standardized residuals $z_j = \frac{r_j - \mu_j}{\sigma_j}$, and calculate this $z_j$ for $j = t-S, t-S+1, ..., t-1$, with S being the sample size. Now take the 100(1 − α)% quantile, $z_q(1-\alpha)$, from the set $[z_{t-S}, ..., z_{t-1}]$. With this quantile and the one-step-forward $\hat{\sigma}_t^2$ from a volatility model, calculate $VaR(\alpha)_t$ as $VaR(\alpha)_t = \hat{\mu}_t + \hat{\sigma}_t z_q(1-\alpha)$.
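The filtered historical simulation rule just described can be sketched as follows (no bootstrap, in line with footnote 19); the input arrays and names are illustrative, and `sigma_path` is assumed to hold the fitted in-sample volatilities from one of the volatility models.

```python
import numpy as np

def filtered_hs_var(window_returns, sigma_path, sigma_next, alpha):
    """Filtered historical simulation VaR: standardize the in-sample returns
    by their fitted volatilities, take the 100(1-alpha)% empirical quantile of
    the standardized residuals and rescale it by the one-step-ahead volatility
    (mu_t = 0 throughout; symmetry of the returns is assumed in the thesis)."""
    z = np.asarray(window_returns, dtype=float) / np.asarray(sigma_path, dtype=float)
    z_q = np.quantile(z, 1.0 - alpha)      # z_q(1 - alpha) from the set {z_j}
    return sigma_next * z_q                # VaR(alpha)_t = mu_hat + sigma_hat * z_q
```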

This section discussed the parametric and non-parametric methods which will be used to calculate the VaR. Moreover, the unfiltered and filtered VaR models

¹⁹ However, in this research the quantile of the set $z_j$ for $j = t-S, t-S+1, ..., t-1$ is taken, instead of the bootstrapping method these papers describe. This will reduce computational effort.


were explained. The next section will discuss backtesting, a procedure used to evaluate the performance of the VaR models.

3 Backtesting Value at Risk

The previous section explained the various VaR methods which will be applied for research. This section explains backtesting, a procedure used to evaluate the performance of VaR models. First, the general framework behind backtesting will be explained. Backtests proposed by Christoffersen (1998) and Engle and Manganelli (1999) will be explained afterwards.

3.1 Backtesting framework

As before, let $r_t$ denote the log-returns and let the VaR be defined as in Equation (2), with coverage rates α of 5% and 1%. In the backtesting procedure, the forecasted VaR is compared to the realized log-returns. In order to implement backtesting, a hit-series $\{I(\alpha)_t\}_{t=1}^{T}$ has to be defined (Christoffersen, 1998):

$$I(\alpha)_t = \begin{cases} 1, & \text{if } r_t < -VaR(\alpha)_t \\ 0, & \text{else.} \end{cases} \qquad (10)$$

Thus, $\{I(\alpha)_t\}_{t=1}^{T}$ is a binary time series indicating whether a loss equal to or greater than the Value at Risk occurred (i.e. whether a hit or violation occurred). Christoffersen (1998) proposes that Value at Risk forecasts are valid only if the corresponding process $I(\alpha)_t$ satisfies the following criteria, written as follows in Dumitrescu, Hurlin and Pham (2012) and Pajhede (2015):

• The unconditional coverage hypothesis: the unconditional probability of a hit must be equal to the coverage rate α,

$$Pr[I(\alpha)_t = 1] = E[I(\alpha)_t] = \alpha.$$

• The independence hypothesis: hits with the same coverage rate, observed at different points in time, must be independently distributed. Formally, hit-variable $I(\alpha)_t$ must be independent of hit-variable $I(\alpha)_{t-k}$, $\forall k \neq 0$. Therefore, the following must hold:

$$Pr[I(\alpha)_t = 1 \mid \mathcal{F}_{t-1}] = Pr[I(\alpha)_t = 1].$$

• The correct conditional coverage hypothesis: this hypothesis combines the two aforementioned criteria. The probability of a hit must be constant and equal to the coverage rate α:

$$Pr[I(\alpha)_t = 1 \mid \mathcal{F}_{t-1}] = Pr[I(\alpha)_t = 1] = \alpha.$$

When these criteria are satisfied (i.e. the Value at Risk model is valid), it follows that the series $\{I(\alpha)_t\}_{t=1}^{T}$ is a series of identically and independently distributed (i.i.d.) Bernoulli random variables (Christoffersen, 1998):

$$I(\alpha)_t \overset{i.i.d.}{\sim} \text{Bernoulli}(\alpha). \qquad (11)$$
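In practice, the hit-series of Equation (10) is obtained directly by comparing the realized out-of-sample returns with minus the forecasted VaR; a minimal sketch (array names are illustrative):

```python
import numpy as np

def hit_series(realized_returns, var_forecasts):
    """Equation (10): 1 when the realized return falls below minus the
    forecasted VaR (a violation/hit), 0 otherwise."""
    r = np.asarray(realized_returns, dtype=float)
    v = np.asarray(var_forecasts, dtype=float)
    return (r < -v).astype(int)
```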

Various tests have been applied in earlier empirical research in order to test these criteria, for example the Christoffersen (1998) test and the Dynamic Quantile test (Engle & Manganelli, 1999). These tests will be applied to evaluate the Value at Risk methods and will be explained in the coming subsections.

3.2 Christoffersen test

In Christoffersen (1998), tests for the three aforementioned hypotheses are proposed. First the test for unconditional coverage will be described and explained, then the test for independence of $I(\alpha)_t$ and afterwards the conditional coverage

test.

3.2.1 Christoffersen test for unconditional coverage

For unconditional coverage one has to test whether the fraction of hits ($\hat{\alpha}$) obtained by the Value at Risk models for research equals the specified coverage rate α. Under the null hypothesis of unconditional coverage, $H_0: \hat{\alpha} = \alpha$. The likelihood of $I(\alpha)_t$, an i.i.d. Bernoulli(α) random variable, is given by (Christoffersen, 1998; Adcock, Areal & Oliveira, 2011):

$$L_T(\alpha) = \prod_{t=1}^{T}(1-\alpha)^{1-I(\alpha)_t}\,\alpha^{I(\alpha)_t} = (1-\alpha)^{T_0}\,\alpha^{T_1}, \qquad (12)$$

with $T_0$ and $T_1$ being the number of zeros and, respectively, the number of ones in the hit-series $\{I(\alpha)_t\}_{t=1}^{T}$. The likelihood of $I(\hat{\alpha})_t$, $L_T(\hat{\alpha})$, follows analogously from Equation (12), with $\hat{\alpha}$ instead of α. The fraction of hits, $\hat{\alpha}$, can be estimated by $\hat{\alpha} = \frac{T_1}{T}$. It follows that, for T → ∞, the Christoffersen test statistic for unconditional coverage is given by (Christoffersen, 1998):

$$LR_{Chr,UC} = -2[\log(L_T(\alpha)) - \log(L_T(\hat{\alpha}))] = -2[\log(1-\alpha)T_0 + \log(\alpha)T_1 - \log(1-\hat{\alpha})T_0 - \log(\hat{\alpha})T_1] \xrightarrow{d} \chi^2_1. \qquad (13)$$

The p-value of this test is obtained as follows:

$$p_{Chr,UC} = 1 - F_{\chi^2_1}(LR_{Chr,UC}), \qquad (14)$$

with $F_{\chi^2_1}(\cdot)$ the cumulative distribution function (CDF) of a $\chi^2_1$ distribution.
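A compact implementation of the unconditional coverage statistic of Equation (13) and its p-value from Equation (14) could look as follows (a sketch; degenerate hit-series with zero violations, or only violations, would need separate handling):

```python
import numpy as np
from scipy import stats

def christoffersen_uc(hits, alpha):
    """Christoffersen (1998) unconditional coverage LR test."""
    hits = np.asarray(hits, dtype=int)
    T = hits.size
    T1 = hits.sum()                      # number of violations
    T0 = T - T1                          # number of non-violations
    alpha_hat = T1 / T                   # observed hit fraction

    def loglik(p):                       # log of Equation (12)
        return T0 * np.log(1.0 - p) + T1 * np.log(p)

    lr_uc = -2.0 * (loglik(alpha) - loglik(alpha_hat))     # Equation (13)
    p_value = 1.0 - stats.chi2.cdf(lr_uc, df=1)            # Equation (14)
    return lr_uc, p_value
```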

3.2.2 Christoffersen tests for independence and correct conditional coverage

To test for independence and correct conditional coverage, let the hit-series $\{I(\alpha)_t\}_{t=1}^{T}$ be modeled as a first order Markov process (Christoffersen, 1998):

$$P(p_{01}, p_{11}) = \begin{pmatrix} 1-p_{01} & p_{01} \\ 1-p_{11} & p_{11} \end{pmatrix},$$

in which $p_{ij}$ are probabilities given by $p_{ij} = Pr[I_t = i \text{ and } I_{t+1} = j]$, $i, j = \{0, 1\}$. The likelihood of this process, for a sample of T observations, is given by (Christoffersen, 1998):

$$L_T(P(p_{01}, p_{11})) = (1-p_{01})^{T_{00}}\, p_{01}^{T_{01}}\, (1-p_{11})^{T_{10}}\, p_{11}^{T_{11}}, \qquad (15)$$

in which $T_{ij}$, for $i, j = \{0, 1\}$, is the number of observations with a j following an i. The observed probabilities, $\hat{p}_{ij}$, are defined as follows (Christoffersen, 1998):

$$\hat{p}_{01} = \frac{T_{01}}{T_{00}+T_{01}}, \quad \hat{p}_{11} = \frac{T_{11}}{T_{10}+T_{11}}, \quad \hat{p}_{00} = 1 - \hat{p}_{01}, \quad \hat{p}_{10} = 1 - \hat{p}_{11}.$$

Now define the unrestricted estimator ($\hat{\alpha}$), the restricted estimator under independence ($\tilde{\alpha}$) and the restricted estimator under correct conditional coverage ($\alpha_0$) as follows, with α being the specified coverage level (Pajhede, 2015):

$$\hat{\alpha} = (\hat{p}_{01}, \hat{p}_{11})', \quad \tilde{\alpha} = H\hat{\phi}, \quad \alpha_0 = H\alpha, \quad H = (1, 1)', \quad \hat{\phi} = \frac{T_{01}+T_{11}}{T_{00}+T_{01}+T_{10}+T_{11}}.$$

Therefore, to test for independence of the hits, the following null hypothesis has to be tested: $H_{Ind}: \hat{\alpha} = \tilde{\alpha}$. To test for conditional coverage, the null hypothesis is given by $H_{CC}: \hat{\alpha} = \alpha_0$.

It follows that, for T → ∞, the Christoffersen test statistic for independence of the hits is given by (Christoffersen, 1998):

$$\begin{aligned} LR_{Chr,Ind} &= -2[\log(L_T(P(\tilde{\alpha}))) - \log(L_T(P(\hat{\alpha})))] \\ &= -2[\log(1-\hat{\phi})(T_{00}+T_{10}) + \log(\hat{\phi})(T_{01}+T_{11}) - \log(1-\hat{p}_{01})T_{00} - \log(\hat{p}_{01})T_{01} - \log(1-\hat{p}_{11})T_{10} - \log(\hat{p}_{11})T_{11}] \xrightarrow{d} \chi^2_1. \end{aligned} \qquad (16)$$

Furthermore, for T → ∞, the Christoffersen test statistic for conditional coverage is given by (Christoffersen, 1998):

$$\begin{aligned} LR_{Chr,CC} &= -2[\log(L_T(P(\alpha_0))) - \log(L_T(P(\hat{\alpha})))] \\ &= -2[\log(1-\alpha)(T_{00}+T_{10}) + \log(\alpha)(T_{01}+T_{11}) - \log(1-\hat{p}_{01})T_{00} - \log(\hat{p}_{01})T_{01} - \log(1-\hat{p}_{11})T_{10} - \log(\hat{p}_{11})T_{11}] \xrightarrow{d} \chi^2_2. \end{aligned} \qquad (17)$$

The Christoffersen test statistic for conditional coverage can also be obtained as follows (Christoffersen, 1998):

$$LR_{Chr,CC} = LR_{Chr,UC} + LR_{Chr,Ind}. \qquad (18)$$

The p-values of the independence and conditional coverage tests are obtained as follows:

$$p_{Chr,Ind} = 1 - F_{\chi^2_1}(LR_{Chr,Ind}), \qquad (19)$$
$$p_{Chr,CC} = 1 - F_{\chi^2_2}(LR_{Chr,CC}). \qquad (20)$$
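The independence and conditional coverage statistics of Equations (16) and (17) only require the transition counts $T_{ij}$ of the hit-series; a sketch (degenerate series without any violations would need separate handling):

```python
import numpy as np
from scipy import stats

def christoffersen_cc(hits, alpha):
    """Christoffersen (1998) independence and conditional coverage LR tests."""
    hits = np.asarray(hits, dtype=int)
    # T[i, j] = number of times a j follows an i in the hit-series
    T = np.zeros((2, 2))
    for prev, curr in zip(hits[:-1], hits[1:]):
        T[prev, curr] += 1
    T00, T01, T10, T11 = T[0, 0], T[0, 1], T[1, 0], T[1, 1]
    p01 = T01 / (T00 + T01)              # observed transition probabilities
    p11 = T11 / (T10 + T11)
    phi = (T01 + T11) / T.sum()          # restricted estimator under independence

    def loglik(a, b):                    # log of Equation (15) with p01 = a, p11 = b
        return (T00 * np.log(1.0 - a) + T01 * np.log(a)
                + T10 * np.log(1.0 - b) + T11 * np.log(b))

    lr_ind = -2.0 * (loglik(phi, phi) - loglik(p01, p11))        # Equation (16)
    lr_cc = -2.0 * (loglik(alpha, alpha) - loglik(p01, p11))     # Equation (17)
    p_ind = 1.0 - stats.chi2.cdf(lr_ind, df=1)                   # Equation (19)
    p_cc = 1.0 - stats.chi2.cdf(lr_cc, df=2)                     # Equation (20)
    return lr_ind, p_ind, lr_cc, p_cc
```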

Both the Christoffersen test for unconditional coverage and the test for conditional coverage will be used for this research. The unconditional coverage test will check whether the VaR has the correct coverage, and when the conditional coverage hypothesis is satisfied, it follows that the criterion of independence of $I(\alpha)_t$ is also satisfied, at the first lag-order.

The Christoffersen test has some limitations (Adcock, Areal & Oliveira, 2011), because it only takes first order dependence within the hit-series into account. Consequently, the Christoffersen test might not reject a hit-series in which higher order dependences, but no first order dependence, are present. This gives reason for the use of another backtest, namely the Dynamic Quantile test (Engle & Manganelli, 1999), which accounts for the first but also for higher order dependences. This test will be explained in the following subsection.

3.3 Dynamic Quantile test

Instead of using a test based on the distribution of the hit-series, like the LR-test proposed by Christoffersen (1998), Engle and Manganelli (1999) proposed a test based on the regression of the demeaned hit-series, $I(\alpha)_t - \alpha$. The regression model is as follows (Engle & Manganelli, 1999; Dumitrescu, Hurlin & Pham, 2012):

$$I(\alpha)_t - \alpha = \delta + \sum_{k=1}^{K}\beta_k(I(\alpha)_{t-k} - \alpha) + \sum_{k=1}^{K}\gamma_k VaR_{t-k} + \epsilon_t, \quad t = 1, ..., T,$$
$$\epsilon_t = \begin{cases} -\alpha, & \text{with probability } (1-\alpha) \\ 1-\alpha, & \text{with probability } \alpha, \end{cases} \qquad (21)$$

with α the specified coverage, K the specified lag-order and $\epsilon_t$ i.i.d. distributed. Now define $\epsilon = \{\epsilon_t\}_{t=1}^{T}$, the T × 1 vector of residuals, $\Psi = [\delta, \beta_1, ..., \beta_K, \gamma_1, ..., \gamma_K]'$ as the (2K+1) × 1 vector of parameters, X as the T × (2K+1) matrix of regressors and Hit as the T × 1 vector of regressands of Equation (21). Then, Equation (21) may be rewritten as follows:

$$Hit = X\Psi + \epsilon. \qquad (22)$$

The coefficient δ should be equal to zero in order to satisfy the unconditional coverage hypothesis, $H_{UC}: E[I_t(\alpha)] = \alpha$. In order to satisfy the hypothesis of independence of variable $I(\alpha)_t$, $H_{Ind}: Pr[I(\alpha)_t = 1|\mathcal{F}_{t-1}] = Pr[I(\alpha)_t = 1]$, the coefficients $[\beta_1, ..., \beta_K, \gamma_1, ..., \gamma_K]$ should be equal to zero. Thus, for correct conditional coverage, the following null hypothesis has to be tested: $H_{CC}: [\delta, \beta_1, ..., \beta_K, \gamma_1, ..., \gamma_K]' = 0$, or equivalently $H_{CC}: \Psi = 0$. To derive a test for this, Engle and Manganelli (1999) first note that the OLS estimator of Ψ, $\hat{\Psi}_{OLS}$, has approximately the following distribution under the null hypothesis:

$$\hat{\Psi}_{OLS} = (X'X)^{-1}X'Hit \overset{a}{\sim} N(0, \alpha(1-\alpha)(X'X)^{-1}). \qquad (23)$$

Engle and Manganelli (1999) then derive the Dynamic Quantile test statistic, which is, in fact, a Wald test statistic:

$$DQ_{CC} = \frac{\hat{\Psi}'_{OLS}(X'X)\hat{\Psi}_{OLS}}{\alpha(1-\alpha)} \overset{a}{\sim} \chi^2_{2K+1}. \qquad (24)$$

The p-value of this test is calculated as follows:

$$p_{DQ_{CC}} = 1 - F_{\chi^2_{2K+1}}(DQ_{CC}), \qquad (25)$$

with $F_{\chi^2_{2K+1}}(\cdot)$ the CDF of a $\chi^2_{2K+1}$ distribution.
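The OLS-based Dynamic Quantile test of Equations (21)-(25) can be sketched as follows; the lag order K is left as a parameter (related studies discussed below use lag orders 1 to 3), and all names are illustrative.

```python
import numpy as np
from scipy import stats

def dynamic_quantile_test(hits, var_forecasts, alpha, K=1):
    """Engle and Manganelli (1999) Dynamic Quantile test, estimated by OLS."""
    hits = np.asarray(hits, dtype=float)
    var_forecasts = np.asarray(var_forecasts, dtype=float)
    y = hits[K:] - alpha                              # demeaned hits (regressand)
    n = y.size
    X = np.ones((n, 2 * K + 1))                       # first column: constant delta
    for k in range(1, K + 1):
        X[:, k] = hits[K - k:-k] - alpha              # lagged demeaned hits
        X[:, K + k] = var_forecasts[K - k:-k]         # lagged VaR forecasts
    psi = np.linalg.lstsq(X, y, rcond=None)[0]        # OLS estimate of Psi
    dq = psi @ (X.T @ X) @ psi / (alpha * (1.0 - alpha))   # Equation (24)
    p_value = 1.0 - stats.chi2.cdf(dq, df=2 * K + 1)       # Equation (25)
    return dq, p_value
```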

One might note that the OLS estimation of Equation (21) will be flawed, since the Hit variable is a binary variable. Therefore an improvement of this test would be to estimate this model by probit or logit. Dumitrescu, Hurlin and Pham (2012) studied the DQ test with a logit or probit estimation of the model and call this the Dynamic Binary (DB) test. However, these researchers do not find much difference in the performance between the DQ test and the DB test. The DB test only has slightly better finite sample properties in their research.

3.4 Finite sample properties of the backtests

The backtests previously explained have as a common condition that the out-of-sample size²⁰ should be infinitely large (T → ∞). However, in research it is often not possible to meet this condition, because of a shortage of data, for example. For this reason, the finite sample properties of the backtests have to be analysed.


The finite sample properties of various backtests have been analysed by Dumitrescu, Hurlin and Pham (2012) and Berkowitz, Christoffersen and Pelletier (2011). These researchers show that most of the backtests do not satisfy the desired properties for the size²¹ and power²², a fact which has to be taken into account when interpreting backtest results. When a test is under the desired size level, it will reject the null hypothesis, given that the null hypothesis is valid, less often than the size level describes; analogously, when the test is above the desired size level it will result in more rejections of the null hypothesis than the size level describes. When a test has low power, this means that there is a high chance of accepting the null hypothesis given that it should be rejected. Thus, it is of importance to analyse the finite sample properties in depth, given the crucial role the two aforementioned backtests have in research. Therefore, the research of Dumitrescu, Hurlin and Pham (2012) and Berkowitz, Christoffersen and Pelletier (2011) will now be extensively discussed.

The finite sample properties of an Unconditional Coverage test, i.e. the Kupiec test (Kupiec, 1995), are only analysed by Berkowitz, Christoffersen and Pelletier (2011). The Kupiec test (Kupiec, 1995) is expected to have a performance similar to the Christoffersen Unconditional Coverage test. Unfortunately, Dumitrescu, Hurlin and Pham (2012) only studied the conditional coverage tests. Both Dumitrescu, Hurlin and Pham (2012) and Berkowitz, Christoffersen and Pelletier (2011) studied the performance of the Christoffersen Conditional Coverage test. Dumitrescu, Hurlin and Pham (2012) studied the DQ test with a lag-order varying from 1 to 3. Berkowitz, Christoffersen and Pelletier (2011) studied the CAViaR test (Engle & Manganelli, 2004), which is the same as the DQ test. However, in their research they estimate the regression (Equation (21)) by logit instead of OLS, and take a lag-order of 1 for the CAViaR test. The performance of this test and the DQ test are expected to be similar.

3.4.1 Size properties of the backtests

In order to analyse the effective size of the tests, both sets of researchers generated a violation series $\{I_t\}$ by drawing from a Bernoulli distribution with probability

²¹ The probability of rejecting the null hypothesis given that it should not be rejected.

²² The probability of rejecting the null hypothesis given that it should be rejected.


of 5% and 1%, reflecting the coverage rates of the VaR, and afterwards applied the backtests to the series. The significance level taken by Dumitrescu, Hurlin and Pham (2012) is 5%²³, whereas Berkowitz, Christoffersen and Pelletier (2011) took a 10% level. Both analysed out-of-sample sizes varying from 250 to 1500; however, only the results for the out-of-sample sizes from 500 to 1500 will be discussed here. The researchers sometimes do not reach the same conclusions, even though both applied the same method to analyse the effective size. The conclusions of the research into the size properties are shown in Table 1.

Table 1: Size properties of the backtests

Sample size  α    Kupiec       Chris. CC       CAViaR | DQ
500          5%   cor / –      over / under    cor / over
500          1%   cor / –      under / under   under / over
1000         5%   cor / –      over / under    cor / cor
1000         1%   cor / –      under / under   under / over
1500         5%   cor / –      over / over     cor / cor
1500         1%   cor / –      under / under   under / over

The `sample size' refers to the out-of-sample size and α is the coverage rate. Per test, the result before the slash is from Berkowitz, Christoffersen and Pelletier (2011) and the result after the slash from Dumitrescu, Hurlin and Pham (2012); a dash means the test was not analysed by that study. Results can be `cor', `over' or `under', meaning correctly sized, oversized or undersized, respectively.

3.4.2 Power properties of the backtests

Both sets of researchers also studied the power of the backtests. Berkowitz, Christoffersen and Pelletier (2011) performed their power analysis as follows. First they fit a t-GARCH(1,1) model to a sample of returns. Then, with the estimated t-GARCH(1,1) model, they simulated a hit-series based on the VaR calculated by the t-GARCH(1,1) model, which is the hit-series under the null hypothesis. Moreover, they simulated a hit-series based on the VaR by the Historical Simulation method, which is the hit-series under the alternative hypothesis. They then compared the LR-statistics of both hit-series to get p-values. The null hypothesis

²³ They do not clearly mention their used significance level, but from their interpretation of


gets rejected for p-values less than or equal to the prespecified significance level. The reader is referred to Sections 5 and 6 of Berkowitz, Christoffersen and Pelletier (2011) for more details on this procedure.

Dumitrescu, Hurlin and Pham (2012) performed a power analysis similar to the one of the aforementioned researchers, but unfortunately, for this analysis they only provide results for a 5% coverage rate and an out-of-sample size of 250²⁴, which is not useful enough for comparison with the results obtained by

Berkowitz, Christoersen and Pelletier (2011). Additionally, they performed an analysis using a hit-series simulated from a Bernoulli distribution with correct coverage rate of 5% and 1%, which is the hit-series under the null hypothesis. The hit-series under the alternative hypothesis is simulated from a Bernoulli distribution with a coverage rate of 3%, i.e. the `false' coverage rate. For this analysis they did report results for varying out-of-sample sizes. Details about their power analysis procedure can be found in Section 4.2 of Dumitrescu, Hurlin and Pham (2012).

Again, Berkowitz, Christoffersen and Pelletier (2011) used a 10% significance level while Dumitrescu, Hurlin and Pham (2012) used a 5% significance level. The results, which are shown in Table 2, indicate that for both coverage rates of 5% and 1%, the power of the backtests increases with the out-of-sample size.

Table 2: Power properties of the backtests

Sample size  α    Kupiec      Chris. CC    CAViaR | DQ
500          5%   VL / –      L / M        M/H / L
500          1%   VL / –      VL / VH      M/H / H
1000         5%   VL / –      L / H        M/H / M
1000         1%   VL / –      VL / VH      M/H / VH
1500         5%   VL / –      L / VH       M/H / H
1500         1%   VL / –      VL / VH      M/H / VH

The `sample size' refers to the out-of-sample size and α is the coverage rate. Per test, the result before the slash is from Berkowitz, Christoffersen and Pelletier (2011) and the result after the slash from Dumitrescu, Hurlin and Pham (2012); a dash means the test was not analysed by that study. Results can be `VL', `L', `M', `H' or `VH', meaning `very low' (0 - 20%), `low' (20 - 40%), `moderate' (40 - 60%), `high' (60 - 80%), or `very high' (80 - 100%), respectively.


3.4.3 Expected backtest properties for this research

All in all, one has to be cautious with regard to the interpretation of the backtest results, since a small out-of-sample size might lead to wrong conclusions. In this research, an out-of-sample size of 1000 will be taken into account. The following can be concluded, based on the previously provided results in Table 1 and Table 2.

For this out-of-sample size and a 5% coverage rate, the Christoffersen UC test is expected to be correctly sized but very low in power. The Christoffersen CC test is expected to be slightly oversized and low in power, while the DQ test is expected to have the right size and a higher power performance than the CC test, but still only a moderate power.

For a 1% coverage rate, the Christoffersen UC test is expected to be correctly sized, but again very low in power. The Christoffersen CC test is expected to be undersized and it might have a low power. For the DQ test not much can be said about the expected size, but its power is expected to be moderate.

This section discussed the backtests performed in this research, and their finite sample properties. The next section will discuss the empirical methodology applied in research.

4 Empirical Methodology

The empirical methodology applied in research consists of four steps. The first step consists of the simulation of the returns or the collection of the S&P 500 return series. This is followed by the calculation of the unconditional variance or the estimation of the parameters of the conditional variance models explained in Section 2.1. These estimated parameters will then be used in order to generate the conditional variances recursively. The third step is the calculation of the various Value at Risk resulting from the parametric method, with the assumption of a normal distribution or a Student's t distribution, or from the non-parametric historical simulation method. The final step is the backtesting of these VaR methods, by the Christoffersen Unconditional Coverage test, the Conditional Coverage test (Christoffersen, 1998) and the Dynamic Quantile test (Engle & Manganelli,


1999). The aforementioned steps will now be explained in more detail.

Firstly, in the simulation step, N = 1000 time series of returns, with R = 5000 observations per time series, get simulated using the ASVJt-model. The variables kt, qt, λt, t and ηt from the DGP are drawn at random from their distributions,

using the default random number generator of Matlab. The values of the other parameters of the DGP are as mentioned in Section 5.1. Since the theoretical model behind the simulated returns is known, the performance of misspecied Value at Risk models can be evaluated in a controlled environment. An accurate analysis can be given, because the returns get Monte Carlo simulated 1000 times. The number of observations, 5000, is divided into the following: 500, 1000 or 1500 in-sample observations, 1000 out-of-sample observations. The rst 3500, 3000 or 2500 observations are left out so as to avoid problems with starting values25.

The second step is the estimation of a GARCH(1,1), IGARCH(1,1) and GAS(1,1) model on the returns in the moving sample window, assuming either a normal distribution or a Student's t distribution for the residuals. For the estimation of the (I)GARCH models the MFE Matlab Toolbox by Sheppard (2009) has been used; for the estimation of the GAS model the Matlab program written by Lit (2017) is applied. The EWMA model does not have to be estimated, since for this model the decay factor is taken as 0.94, the optimal value for daily returns proposed by J.P. Morgan (1996). This optimal value will be tested by also estimating the IGARCH(1,1) model. The GARCH(1,1) and GAS(1,1) models contain a constant in the variance equation; the IGARCH(1,1) and EWMA models do not. The choice of starting values for the models is discussed next.

The following were taken as the starting values for the coefficients of the GARCH model at the first estimation for each time series and moving window sample, with ν only included in the case of a Student's t distribution for the residuals and σ̄²_t being the unconditional variance of the returns in the sampling window:

α_0 = σ̄²_t · (1 − α_1 − β_1),   α_1 = 0.05,   β_1 = 0.94,   ν = 4.   (26)

[25] However, the number of observations that are left out might be somewhat high; for example, Hartz, Mittnik and Paolella (2006) only left out the first 500 observations in their simulation study.


The starting values for the variance, σ²_0, and the residuals, ε_0, are calculated by a back-cast algorithm, explained in Sheppard (2009). When a previous estimation for the same time series exists[26], the values of the estimated coefficients from the previous estimation are used as starting values. However, when the previously estimated values violated the stationarity, nonnegative-variance or d.o.f. conditions, the coefficient(s) that violated the conditions were reset to the respective values in Equation (26).
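A minimal Python sketch of this warm-start logic for the GARCH(1,1) case; the function name, the parameter dictionary and the 2.1 degrees-of-freedom threshold (borrowed from the description of the Student's t and GAS fits below) are assumptions for illustration, not the exact Matlab implementation used in this research.

    import numpy as np

    def garch_starting_values(returns, previous=None, student_t=True):
        """Choose starting values for the GARCH(1,1) coefficients.

        `previous` holds the estimates from the previous moving window (None for
        the first window of a series).  Coefficients violating the stationarity,
        nonnegative-variance or d.o.f. conditions are reset to the defaults of
        Equation (26).
        """
        uncond_var = np.var(returns)
        defaults = {"alpha0": uncond_var * (1.0 - 0.05 - 0.94),
                    "alpha1": 0.05, "beta1": 0.94, "nu": 4.0}
        if previous is None:
            return defaults
        start = dict(previous)
        if start["alpha1"] + start["beta1"] >= 1.0:      # stationarity violated
            start["alpha1"], start["beta1"] = defaults["alpha1"], defaults["beta1"]
        if start["alpha0"] <= 0.0:                       # nonnegative variance violated
            start["alpha0"] = defaults["alpha0"]
        if student_t and start.get("nu", defaults["nu"]) <= 2.1:   # d.o.f. condition
            start["nu"] = defaults["nu"]
        return start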

The starting values for the IGARCH model followed the same steps as for the aforementioned GARCH model, with a few differences. The first difference is that the starting values for the first estimation of the moving window sample per time series are as follows:

α_0 = 0,   α_1 = 0.05,   β_1 = 0.95,   ν = 4.   (27)

The second difference is that instead of a check on the stationarity condition, there is a check that the estimated α_1 and β_1 sum to one. The other checks are the same as for the GARCH model, i.e. the nonnegative-variance and d.o.f. conditions.

The EWMA model took 0.94 as the value of the decay factor, the unconditional variance of the returns in the moving window sample as starting value for σ²_0, and the return one step before the moving window sample for ε_0.[27]
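A minimal Python sketch of this filter for illustration: with a zero constant and coefficients summing to one, the IGARCH(1,1) recursion reduces to the EWMA filter below, where lam is the decay factor of 0.94; the function name and default arguments are illustrative assumptions.

    import numpy as np

    def ewma_variance(returns, lam=0.94, sigma2_0=None, eps_0=0.0):
        """EWMA variance filter: sigma2_t = (1 - lam) * eps_{t-1}^2 + lam * sigma2_{t-1}.

        sigma2_0 is the starting variance (the unconditional variance of the
        window by default) and eps_0 the return one step before the window.
        The last element of the output is the one-step-ahead variance forecast.
        """
        T = len(returns)
        sigma2 = np.empty(T + 1)
        s0 = np.var(returns) if sigma2_0 is None else sigma2_0
        sigma2[0] = (1.0 - lam) * eps_0**2 + lam * s0    # variance of the first window return
        for t in range(1, T + 1):
            sigma2[t] = (1.0 - lam) * returns[t - 1]**2 + lam * sigma2[t - 1]
        return sigma2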

The GAS model takes the following as starting values, based on an EGARCH model, with σ̄²_t the unconditional variance of the returns in the sample:

A_1 = 0.05,   B_1 = 0.945,   ω_t = (1 − B_1) · log(σ̄²_t),   μ_t = 0,   ν = 4.   (28)

If the estimated B_1 violated the stationarity condition (|B_1| < 1 for stationarity), then B_1 took the aforementioned starting value when it concerned the first estimation of each time series with the moving window, and the previously estimated value otherwise. When the estimated ν was lower than 2.1, then ν took the aforementioned starting value when it concerned the first estimation of each time series with the moving window, and the previously estimated value otherwise. If, by error, the value of ω_t was estimated so high that the generated variance diverged to infinity at some point, then the unconditional variance was taken from the moment the generated variance reached infinity. When the estimation resulted in an error because the Hessian matrix was close to singular, the unconditional variance was taken instead of the GAS conditional variance. The latter occurred very rarely, and almost only for an in-sample size of 500.

[26] Because of the moving window sample approach, 1000 variance model estimations are made for each of the time series, in which the in-sample window moves over time.
[27] This can be done since observations from before the first element in the moving window sample are available.

After the maximum likelihood estimation of each model, the conditional variances are generated recursively. For the calculation of the first conditional variance, the estimated coefficients and the starting values for σ²_0 and ε²_0 are used. For later conditional variances, the estimated coefficients, the previously calculated σ²_t from the variance model, and ε²_t, the square of the previous return in the sampling window, are used.
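For the GARCH(1,1) model this recursion can be sketched as follows in Python; the interface is illustrative, and with a zero mean the residuals equal the returns themselves.

    import numpy as np

    def garch_recursion(returns, alpha0, alpha1, beta1, sigma2_0, eps_0):
        """Generate GARCH(1,1) conditional variances recursively.

        sigma2_0 and eps_0 are the (back-cast) starting values; `returns` is the
        moving window sample.  The output has len(returns) + 1 elements, the last
        of which is the one-step-ahead out-of-sample variance forecast.
        """
        T = len(returns)
        sigma2 = np.empty(T + 1)
        sigma2[0] = alpha0 + alpha1 * eps_0**2 + beta1 * sigma2_0
        for t in range(1, T + 1):
            sigma2[t] = alpha0 + alpha1 * returns[t - 1]**2 + beta1 * sigma2[t - 1]
        return sigma2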

The time needed to estimate the volatility models, especially the GAS model, turned out to be quite long given the Monte Carlo setting. In order to reduce the computational time, the estimated parameters of the volatility models are assumed to be the same for each set of ten consecutive VaR.[28] This is reasonable, since the moving sample window only contains one different return for each following VaR, and thus ten different returns between each first VaR and the tenth VaR, and so on. Thus for the first VaR the volatility model is estimated, and with the estimated coefficients and its sample of returns the conditional variances are generated. The second VaR then uses the same estimated coefficients as the first VaR, but generates its conditional variances given its own sample. This continues until the tenth VaR. At the eleventh VaR a new estimation is done, and the same steps are repeated until the twentieth VaR, and so on.
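The scheduling can be sketched as follows in Python; fit_model and forecast_var are dummy placeholders for the actual estimation and VaR calculation, so only the loop structure, re-estimating at every tenth window, reflects the procedure described above.

    import numpy as np

    def rolling_var_forecasts(returns, t_in, n_oos=1000, reestimate_every=10):
        """Moving-window VaR loop that re-estimates the model every tenth window."""
        fit_model = lambda window: {"theta": np.var(window)}                    # dummy estimator
        forecast_var = lambda window, params: 1.645 * np.sqrt(params["theta"])  # dummy 95% VaR

        var_forecasts = np.empty(n_oos)
        params = None
        for i in range(n_oos):
            window = returns[i:i + t_in]                 # moving in-sample window
            if params is None or i % reestimate_every == 0:
                params = fit_model(window)               # re-estimate every tenth VaR
            var_forecasts[i] = forecast_var(window, params)
        return var_forecasts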

In the third step, 1000 one-step-forward out-of-sample VaR are calculated, based on the moving sample window. In all VaR methods the mean μ_t is taken as zero and the returns are assumed to have a symmetric distribution.

For the unfiltered parametric normal method, the returns are assumed to have a normal distribution, so the 95% and 99% quantiles of this distribution are calculated by the location-scale method explained in Section 2.2, with σ_t being the unconditional variance and G_t being the 95% or 99% quantile of a normal distribution. For the filtered parametric normal method, the variances σ_t follow from the EWMA, normal-GARCH(1,1), normal-IGARCH(1,1) and normal-GAS(1,1) models[29], with G_t as aforementioned.

[28] This is only done in the Monte Carlo study of this research and is not applied to the S&P 500 returns.
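A minimal Python sketch of this location-scale calculation; here sigma denotes the volatility, i.e. the square root of the (unconditional or conditional) variance referred to above, and the function name is illustrative.

    from scipy.stats import norm

    def normal_var(sigma, alpha=0.05, mu=0.0):
        """Location-scale VaR under normality: VaR = -(mu + sigma * z_alpha)."""
        return -(mu + sigma * norm.ppf(alpha))

    # example: one-day 99% VaR for a volatility forecast of 1.2%
    print(normal_var(0.012, alpha=0.01))   # approximately 0.0279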

For the unfiltered parametric Student's t method, a t location-scale distribution is fitted to the returns, so the d.o.f. ν is estimated by maximum likelihood together with the scale parameter σ_t; the location parameter μ_t is assumed to be zero. For the filtered parametric Student's t method, a t location-scale distribution is fitted to the standardized residuals and the t distribution of the residuals is then standardized by taking t_ν · √((ν − 2)/ν). The volatility σ_t results from the out-of-sample one-step-ahead conditional variance models, being EWMA, t-GARCH(1,1), t-IGARCH(1,1) and t-GAS(1,1). In both the unfiltered and filtered VaR, when a too low d.o.f. was fitted, i.e. below 2.1, it took the value 4 when it concerned the first estimation of each time series with the moving window, and the previously estimated value otherwise.
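The corresponding calculation for the Student's t case can be sketched as follows; the rescaling by √((ν − 2)/ν) standardizes the t quantile to unit variance, as described above, and the function name is illustrative.

    import numpy as np
    from scipy.stats import t as student_t

    def student_t_var(sigma, nu, alpha=0.05, mu=0.0):
        """Location-scale VaR with a standardized Student's t quantile."""
        q = student_t.ppf(alpha, df=nu) * np.sqrt((nu - 2.0) / nu)
        return -(mu + sigma * q)

    # example: one-day 99% VaR with sigma = 1.2% and nu = 5 degrees of freedom
    print(student_t_var(0.012, nu=5, alpha=0.01))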

The unfiltered historical simulation method takes minus the 5% and 1% quantiles of the returns in the moving sample window, which are the 95% and 99% quantiles under the assumptions stated in Section 2.1, and these quantiles equal the Value at Risk. For the filtered historical simulation method, the Value at Risk is calculated by taking the quantiles of the moving window sample of standardized residuals[30] and multiplying them by the one-step-ahead out-of-sample conditional volatilities following from the EWMA, t-GARCH(1,1), t-IGARCH(1,1) and t-GAS(1,1) models.
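Both historical simulation variants can be sketched as follows in Python; the function names are illustrative and sigma_window / sigma_forecast denote the conditional volatilities from one of the models above.

    import numpy as np

    def hs_var(returns_window, alpha=0.05):
        """Unfiltered historical simulation VaR: minus the empirical alpha-quantile."""
        return -np.quantile(returns_window, alpha)

    def fhs_var(returns_window, sigma_window, sigma_forecast, alpha=0.05):
        """Filtered historical simulation VaR.

        The in-sample returns are standardized by their conditional volatilities,
        the empirical alpha-quantile of these standardized residuals is taken and
        scaled by the one-step-ahead volatility forecast.
        """
        z = returns_window / sigma_window              # standardized residuals
        return -np.quantile(z, alpha) * sigma_forecast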

The last step consists of the backtesting of the VaR models. The calculated series {VaR_t} is compared with the corresponding returns by generating a violation series {I_t}, as defined by Equation (10). The Christoffersen Unconditional Coverage (UC) and Conditional Coverage (CC) tests (Christoffersen, 1998) and the Dynamic Quantile (DQ) test (Engle & Manganelli, 1999) are performed on the violation series to evaluate the performance of the misspecified VaR models. For the UC and CC tests only this series is needed, but for the DQ test one additionally has to specify the number of lags of the series I_t and VaR_t that are used as regressors in the regression equation. The lag order is set to five, which is reasonable for daily return data, since five lags cover the last week of return data. The Matlab code for the backtests is provided by Dumitrescu, Hurlin and Pham (2012).

[29] The prefix `normal' or `t' refers to the distribution of the residuals.
[30] As mentioned before, standardized residuals are calculated as z_t = r_t/σ_t, with σ_t following from the conditional variance models.

The backtests will be performed using a 5% significance level, i.e. the VaR model is rejected for tests with a resulting p-value of less than 5%. It might occur that no violation happens for a time series; in that case the series I_t is a vector of zeros, and the aforementioned tests cannot be performed, since the backtests are not defined in this case. The event of zero violations is treated as a rejection of the VaR model in this research, for the following reason. Since 1000 VaR are taken into account, the expected number of violations is 50 for a 5% coverage rate and 10 for a 1% coverage rate. An event of zero violations then means that the VaR is not representing its coverage rate well and that it has to be rejected.
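For the unconditional coverage part, the violation series and the likelihood ratio statistic can be sketched as follows in Python (the backtest code used in this research is the Matlab code of Dumitrescu, Hurlin and Pham (2012)); the sketch assumes the VaR is reported as a positive number, consistent with the sign convention above, and treats the zero-violation case as a rejection.

    import numpy as np
    from scipy.stats import chi2

    def uc_test(returns, var_forecasts, alpha=0.05):
        """Christoffersen unconditional coverage LR test on the violation series."""
        hits = (np.asarray(returns) < -np.asarray(var_forecasts)).astype(int)
        n, n1 = len(hits), int(hits.sum())
        if n1 == 0 or n1 == n:            # LR statistic undefined: reject the model
            return hits, np.inf, 0.0
        pi_hat = n1 / n
        lr_uc = -2.0 * ((n - n1) * np.log(1 - alpha) + n1 * np.log(alpha)
                        - (n - n1) * np.log(1 - pi_hat) - n1 * np.log(pi_hat))
        return hits, lr_uc, chi2.sf(lr_uc, df=1)   # p-value from a chi-squared(1)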

The aforementioned steps will also be performed on the S&P 500 returns. This section discussed the empirical methodology applied in this research. The following section describes the Monte Carlo simulated dataset and provides and discusses the results of the backtests on this dataset.

5 Monte Carlo study

This section starts with an explanation of the simulation of the returns from an ASVJt-model. The backtest results on these returns are provided and analysed afterwards.

5.1 Simulation of the returns

The Stochastic Volatility (SV) model will be used as the Data Generating Process (DGP) for the simulation of the returns. The application of an SV model as DGP for a study on simulated returns has been put forward by Hartz, Mittnik and Paolella (2006). The Stochastic Volatility model was first proposed by Taylor (1982). However, a few additions are made to the original model, in order to let the simulated returns reflect real financial returns better.

The first addition to this model is that of a jump in the returns (SVJ-model). The accommodation of a jump in the returns is important because it relates to periods of news arrivals, in which the stock market is under stress and less liquid (Hautsch & Ou, 2008). For analysis and results of the estimation of SVJ-models, see the aforementioned paper and references therein.

A second addition to the model is that of a Student's t distribution for the residuals in the return equation (SVt-model). Financial returns often show fat tails in their distribution, as already mentioned in previous sections. The assumption of a normal distribution for the residuals will not capture this fat-tailedness of the returns accurately, whereas the Student's t distribution will capture this characteristic. Therefore the residuals of the model are assumed to follow a Student's t distribution.

The last addition to the original model is that of the leverage effect (ASV-model). Empirical studies show that a drop in returns is associated with a rise in volatility, and a rise in returns with a drop in volatility. Black (1976) was the first to describe this phenomenon and he claims that it is due to the asymmetric effects of changes in a firm's leverage ratio (Hautsch & Ou, 2008). For results of earlier studies of the ASV-model, see Hautsch and Ou (2008) and references therein.

The aforementioned SV-models are shown separately in Hautsch and Ou (2008). Combining them results in the Asymmetric Stochastic Volatility model with Jumps and Student's t distribution (ASVJt) as DGP:

r_t = μ_t + k_t·q_t + exp(h_t/2)·ε_t   (29)
h_t = ω + φ(h_{t−1} − ω) + τ·η_t   (30)
k_t ∼ N(α_k, β_k)   (31)
q_t ∼ Bernoulli(p)   (32)
λ_t ∼ i.i.d. IG(ν/2, ν/2),  ν > 2   (33)
(ε_t, η_t)′ ∼ i.i.d. N( (0, 0)′, λ_t · [[1, ρ], [ρ, 1]] ).   (34)
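A minimal Python sketch of this DGP for illustration; the parameter values below are placeholders rather than the values used in Section 5.1, β_k is treated as the variance of the jump size, and the log-volatility is initialised at its stationary mean ω.

    import numpy as np

    def simulate_asvjt(R, mu=0.0, omega=-9.0, phi=0.97, tau=0.15, rho=-0.3,
                       nu=8.0, p=0.01, alpha_k=-0.05, beta_k=0.1, seed=0):
        """Simulate R returns from the ASVJt DGP of Equations (29)-(34)."""
        rng = np.random.default_rng(seed)
        lam = 1.0 / rng.gamma(nu / 2.0, 2.0 / nu, size=R)         # lambda_t ~ IG(nu/2, nu/2)
        cov = np.array([[1.0, rho], [rho, 1.0]])
        shocks = np.array([rng.multivariate_normal([0.0, 0.0], lt * cov) for lt in lam])
        eps, eta = shocks[:, 0], shocks[:, 1]                     # (eps_t, eta_t), Equation (34)
        k = rng.normal(alpha_k, np.sqrt(beta_k), size=R)          # jump sizes, Equation (31)
        q = rng.binomial(1, p, size=R)                            # jump indicators, Equation (32)
        h = np.empty(R)
        h[0] = omega
        for t in range(1, R):                                     # log-volatility, Equation (30)
            h[t] = omega + phi * (h[t - 1] - omega) + tau * eta[t]
        return mu + k * q + np.exp(h / 2.0) * eps                 # returns, Equation (29)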
