
The Realized Real-Time GARCH model

W.J.P. (Wessel) Schouten

Master’s Thesis to obtain the degree in Financial Econometrics

University of Amsterdam

Faculty of Economics and Business Amsterdam School of Economics

Author: W.J.P. (Wessel) Schouten

Student nr: 10363300

Email: wessel schouten@hotmail.com

Date: August 14, 2018

Supervisor: A.C. (Andreas) Rapp MSc

Second reader:


Statement of Originality

This document is written by Student Wessel Schouten who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


The Realized Real-Time GARCH model — W.J.P. (Wessel) Schouten iii

Abstract

This research addresses a topical subject, namely the GARCH framework; more specifically, an extension (in this thesis the use of the realized kernel) of the new Real-Time GARCH model. The Real-Time GARCH model responds more quickly to new levels of volatility. Due to the availability of more high-frequency data, the model can be improved with realized measures, which are far more informative about the current level of volatility. Furthermore, different specifications of the model and assumptions about the distributions are investigated.

This thesis examines whether the extension described above performs better than the previous models. The research provides a comprehensive description of the realized Real-Time GARCH model, in terms of derivations of the moments, derivatives and limits. Besides that, several test statistics are used for the comparison against other volatility models.

Based on this thesis, the improvement of the model is made clear. An empirical application with several indices shows that the Realized RT-GARCH (linear) structure leads to substantial improvements in the actual fit. The thesis also contains suggestions for further research on new specifications or the use of data on this subject.

Keywords Real-Time GARCH, Realized GARCH, Stochastic volatility, Volatility forecast, Leverage effect


Contents

Preface vi

1 Introduction 1

2 Previous research models 3
2.1 GARCH . . . 3
2.2 Realized GARCH . . . 3
2.3 Stochastic Volatility . . . 4
2.4 Real-Time GARCH . . . 4

3 The Realized Real-Time GARCH model 5
3.1 The General Formulation . . . 5
3.2 Log-Linear Specification . . . 6
3.3 Linear Specification . . . 7
3.4 The leverage effect . . . 7
3.5 Model with Student-t distribution . . . 8

4 Outline of the Estimation Theory 10
4.1 Main results . . . 10
4.2 Quasi-maximum likelihood analyses . . . 13
4.3 Volatility Forecast . . . 13

5 Data description 14
5.1 Returns . . . 14
5.2 Realized Kernel . . . 16
5.2.1 Data Cleaning . . . 17
5.2.2 Parzen Kernel function . . . 17
5.3 Summary of the data . . . 18

6 Results 19
6.1 Empirical results . . . 19
6.2 Results on the volatility forecast . . . 24

7 Conclusion and recommendations 26
7.1 Conclusion of the results . . . 26
7.2 Recommendations for further research . . . 27

Appendices 28
A Theorem 29
A.1 Unconditional moments . . . 30
A.2 Conditional probabilities . . . 30
A.3 First order derivatives . . . 31
A.3.1 Formulas that help derive the score function . . . 32
A.3.2 The score function . . . 33


B Figures and tables 34


Preface

In this Preface, I would like to seize the opportunity to explain my choice of subject for this research. I would also like to thank my supervisor.

Financial markets are nowadays among the most important institutions in the world. Due to the digitization of transactions, more data is available: high-frequency financial data is stored and can be used to predict the future. There is a need for new models that adapt to new volatility levels faster than before. In 2011, the GARCH framework was extended with realized measures, and the variant with a real-time parameter was published last year. I decided to carry out a study on a combination of the new model and the use of realized measures. The models need to respond more quickly to financial shocks and make the market less risky.

I would like to thank my supervisor for helping me bring my thesis to a successful conclusion. At the University of Amsterdam, Andreas Rapp was my supervisor. He helped me with technical points on modelling and testing the realized extension of the Real-Time GARCH model, but also with ways to present the research in this thesis. I really appreciated his feedback and the meetings we had during my research.

Wessel Schouten

Amsterdam, 2018


Chapter 1

Introduction

The volatility process of an asset's return is a widely used estimate of the financial risk associated with the asset. McNees (1979) suggested that "the inherent uncertainty or randomness associated with different forecast periods, seems to vary widely over time", and also that "large and small errors tend to cluster together (in contiguous time periods)". This implies that volatility changes over time and that the realized volatility may hold predictive power about future volatility. Furthermore, Cont (2001) mentions in his research, based on previous studies, that if one examines financial time series from a statistical point of view (as done in the last half century), financial assets share the same stylized statistical properties. The main properties are summarized below and are discussed later on in this thesis.

1. Absence of autocorrelations: (linear) autocorrelations of asset returns are often insignificant, except for very small intraday time scales (∼20 minutes) for which microstructure effects come into play.

2. Aggregational Gaussianity: as one increases the time scale ∆t over which returns are calculated, their distribution looks more and more like a normal distribution. In particular, the shape of the distribution is not the same at different time scales.

3. Intermittency: returns display, at any time scale, a high degree of variability. This is quantified by the presence of irregular bursts in time series of a wide variety of volatility estimators.

4. Volatility clustering: different measures of volatility display a positive autocorre-lation over several days, which quantifies the fact that high-volatility events tend to cluster in time.

5. Slow decay of autocorrelation in absolute returns: the autocorrelation function of absolute returns decays slowly as a function of the time lag, roughly as a power law with an exponent β ∈ [0.2, 0.4]. This is sometimes interpreted as a sign of long-range dependence.

6. Leverage effect: most measures of volatility of an asset are negatively correlated with the returns of that asset.

The development of models started with the seminal paper of Engle (1982) on ARCH models. A few years later, Bollerslev (1986) generalized these to GARCH models. The one-period-ahead forecasts of the ARCH and GARCH models¹ use the volatilities of the past to form expectations about the next period's volatility.

Due to the lack of speed in catching up to new levels of volatility in GARCH models (as discussed in Andersen et al. (2003) and Hansen et al. (2012)), there has been considerable research

¹ Abbreviation for (generalized) autoregressive conditional heteroskedasticity.


into the improvement of this model. High-frequency financial data is available nowadays, and so there is a need for new models that adapt to new levels of volatility faster.

A recent improvement of the GARCH model is to make use of an additional source of observed information, so-called realized measures, as in the research of Hansen et al. (2012). They mention that any of the realized measures of volatility, for example the realized variance, bipower variation and/or the realized kernel, is far more informative about the current level of volatility than the squared return, which is used in the standard GARCH model. Furthermore, they demonstrated that their model is straightforward to estimate and gives a significant improvement over the standard GARCH model.

Stochastic volatility models (an overview of these models is given by Shephard and Andersen (2009)) provide a better in-sample fit than GARCH models according to the tests executed by Kim et al. (1998). Furthermore, an important part of stochastic volatility models is the leverage effect, defined by Yu (2005) as the relationship between volatility and price/return. He mentions that volatility tends to rise in response to bad news but to fall in response to good news; when this leverage effect is not taken into account, option prices are substantially biased. Other non-realized improvements of the GARCH models (as in Hansen and Lunde (2005)) and hybrid models, for example Wang (2009), were specified to get better results.

Smetanina (2017b) proposes the so-called Real-Time GARCH (RT-GARCH) model, which is essentially a mix of a GARCH(1,1) and an SV model. The model uses information up to and including the current time period. This makes the model respond more quickly to new levels of volatility, which is important, for example, in the state prior to a crisis, where the model needs to respond quickly to large volatility changes. The two main implications are that the conditional kurtosis is time-varying and that the return distribution has an extra parameter to allow better tail behavior. She shows that incorporating current information into volatility models gives a quicker response to changes in the unconditional volatility. Furthermore, she proves that the combination of ex ante and ex post volatility measurements leads to better out-of-sample volatility forecasts.

The realized measures may also be an improvement to the RT-GARCH model of Smetanina (2017b), which will be investigated in this thesis. The resulting model is named the Realized Real-Time GARCH (Real-RT-GARCH) model and is specified in chapter three. The model has an autoregressive volatility process, and with the inclusion of the realized measures it has an autoregressive moving average representation.

The research question is as follows: "Can the Real-Time GARCH model be improved by incorporating some realized measures or other distributions?". Hereby a lagged realized measure of volatility x_{t-1} replaces the squared lagged return r_{t-1}^2, as described in the Realized GARCH model of chapter two. The normal distribution used in the Real-Time GARCH model is compared against the Student-t distribution (Hansen and Lunde (2005)) as a possible improvement for the innovations, which is explained in chapter three. Chapter three also mentions other specifications of the volatility process, and the influence of different leverage functions is examined there. Chapter four outlines the estimation theory for our new Realized Real-Time GARCH model: a description of the moments, estimation of the likelihood, and volatility forecasting. After that the model is estimated (results in chapter six) on the data of chapter five. The final chapter concludes.


Chapter 2

Previous research models

The Real-Time GARCH (RT-GARCH) model (Smetanina (2017b)) is a mix of a GARCH(1,1) model (Bollerslev (1986)) and a stochastic volatility (SV) model (for example as in Kim et al. (1998)). These standard models are shown below and are the basis of this research. Furthermore, our model uses realized measures for the volatility as in Hansen et al. (2012); for this reason the realized model is also described below. In the following formulas, r_t is the return series and λ_t the volatility. The new model of chapter three will be compared with the Real-Time GARCH model as described in section four. Previous research has already shown that the real-time model predicts volatility better than the other models of this chapter.

2.1 GARCH

The GARCH(1,1) model of Bollerslev (1986) is described below.

r_t = \lambda_t \epsilon_t \qquad (2.1)

\lambda_t^2 = \alpha + \beta \lambda_{t-1}^2 + \gamma r_{t-1}^2 \qquad (2.2)

With \epsilon_t \sim N(0, 1); \lambda_t^2 follows an autoregressive moving average (ARMA(1,1)) process. The model is F_{t-1}-measurable, so it uses only past data. Therefore it takes a long period to adjust to a new level of volatility.
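As a minimal illustration (with hypothetical parameter values, not estimates from this thesis), the recursion (2.1)-(2.2) can be simulated directly; note that λ_t² is built from t-1 information only:

```python
import numpy as np

# Sketch: simulate the GARCH(1,1) recursion (2.1)-(2.2) with illustrative parameters.
rng = np.random.default_rng(0)
alpha, beta, gamma = 0.05, 0.90, 0.05   # beta + gamma < 1 gives covariance stationarity
T = 1000

eps = rng.standard_normal(T)
lam2 = np.empty(T)                      # lambda_t^2, the conditional variance
r = np.empty(T)
lam2[0] = alpha / (1.0 - beta - gamma)  # start at the unconditional variance
r[0] = np.sqrt(lam2[0]) * eps[0]

for t in range(1, T):
    # F_{t-1}-measurable: only past volatility and the past squared return enter
    lam2[t] = alpha + beta * lam2[t - 1] + gamma * r[t - 1] ** 2
    r[t] = np.sqrt(lam2[t]) * eps[t]    # equation (2.1)
```

Since β + γ < 1, the simulated λ_t² mean-reverts to the unconditional level α/(1 − β − γ).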

2.2 Realized GARCH

An extension of the normal GARCH(1,1) model is the RealGARCH(1,1) of Hansen et al. (2012)², which takes a realized measure into account.

r_t = \sqrt{\lambda_t}\, z_t \qquad (2.3)

\lambda_t = \alpha + \beta \lambda_{t-1} + \delta x_{t-1} \qquad (2.4)

x_t = \xi + \psi \lambda_t + \tau(z_t) + u_t \qquad (2.5)

With z_t \sim i.i.d.(0, 1), u_t \sim i.i.d.(0, \sigma_u^2), and \tau(z_t) a leverage function, which can generate an asymmetric response in volatility to return shocks. The inclusion of the realized measure in the model, and the fact that x_t has an autoregressive moving average (ARMA) representation, motivate the name Realized GARCH. The realized measure x_t replaces the squared returns r_t^2 of the GARCH model. Hansen et al. (2012) state in their research that these realized measures are far more informative about the current level of volatility than the squared returns. Furthermore, a specification can be added to this model, for example log-linear or linear. Chapter three describes these specifications for our new model.

² For simplicity, the model in this paper deviates from the others.


2.3 Stochastic Volatility

Stochastic volatility (SV) models are different from the models described before. An example of an SV model is as follows.

r_t = \lambda_t \epsilon_t \qquad (2.6)

\lambda_t^2 = \alpha + \delta z_t \qquad (2.7)

With z_t \sim i.i.d.(0, \sigma_z^2), \epsilon_t \sim i.i.d.(0, \sigma_\epsilon^2), and corr(z_{t+1}, \epsilon_t) = \rho. The model uses the data up to time t, but the process for the returns r_t is driven by two shocks. The non-zero dependence between the shocks allows the model to pick up the leverage effect, which is investigated by Yu (2005).
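A sketch of this SV model with correlated shocks is given below; the parameter values are illustrative, and the clipping of λ_t² at a small positive number is our own guard, since the linear specification (2.7) does not itself guarantee positivity:

```python
import numpy as np

# Sketch: SV model (2.6)-(2.7) with corr(z_{t+1}, eps_t) = rho, illustrative parameters.
rng = np.random.default_rng(2)
alpha, delta, rho = 1.0, 0.3, -0.5
T = 5000

# Draw the pair (z_{t+1}, eps_t) jointly with correlation rho
cov = [[1.0, rho], [rho, 1.0]]
pairs = rng.multivariate_normal([0.0, 0.0], cov, size=T)
eps = pairs[:, 1]                                             # eps_0 ... eps_{T-1}
z = np.concatenate(([rng.standard_normal()], pairs[:-1, 0]))  # z_t is paired with eps_{t-1}

lam2 = np.maximum(alpha + delta * z, 1e-8)   # (2.7); clipped so lambda_t^2 stays positive
r = np.sqrt(lam2) * eps                      # (2.6)

# Leverage effect: today's return shock is negatively related to tomorrow's volatility
lev = np.corrcoef(r[:-1], lam2[1:])[0, 1]
```

The negative sample correlation between r_t and λ_{t+1}² confirms that the dependence between the two shocks produces a leverage effect.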

2.4 Real-Time GARCH

The RT-GARCH(1,1) model of Smetanina (2017b) is as follows.

r_t = \lambda_t \epsilon_t \qquad (2.8)

\lambda_t^2 = \alpha + \beta \lambda_{t-1}^2 + \gamma r_{t-1}^2 + \phi \frac{r_t^2}{\lambda_t^2} = \alpha + \beta \lambda_{t-1}^2 + \gamma r_{t-1}^2 + \phi \epsilon_t^2 \qquad (2.9)

With \epsilon_t \sim i.i.d.(0, 1). The model is a mix of the GARCH(1,1) model and the SV model because of the term \phi \epsilon_t^2: it uses all the information up to time t instead of time t−1 as in the GARCH model. The new term can be seen as a change of the information set, and as an extra shape parameter for the density of returns, which determines the "peakedness" and/or thickness of the tails. Furthermore, it has only one source of randomness shared by the returns and the volatility, instead of the two in the SV model described above.

Smetanina (2017b) concludes in her research that her model responds more quickly to sudden changes in volatility. Furthermore, it gives a better out-of-sample volatility forecast and empirical fit than the models described above. In her paper about the asymptotic properties, Smetanina (2017a), she uses a notation in which \phi \epsilon_t^2 is isolated from the rest, which makes the derivation of the formulas used to investigate the model easier (see for example the Appendix) and is more similar to the notation of Hansen et al. (2012). This is given by the following two equations:

r_t = \sqrt{\lambda_t}\, z_t \qquad (2.10)
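The RT-GARCH(1,1) recursion (2.8)-(2.9) can be sketched as follows (illustrative parameters); in contrast to the GARCH loop, the contemporaneous shock ε_t enters λ_t² before r_t is formed:

```python
import numpy as np

# Sketch: RT-GARCH(1,1) recursion (2.8)-(2.9), illustrative parameter values.
rng = np.random.default_rng(3)
alpha, beta, gamma, phi = 0.05, 0.85, 0.05, 0.05   # phi > 0, as discussed in the text
T = 1000

eps = rng.standard_normal(T)
lam2 = np.empty(T); r = np.empty(T)
lam2[0] = (alpha + phi) / (1.0 - beta - gamma)     # rough long-run level (E[eps^2] = 1)
r[0] = np.sqrt(lam2[0]) * eps[0]

for t in range(1, T):
    # the contemporaneous shock eps_t enters through phi * eps_t^2
    lam2[t] = alpha + beta * lam2[t - 1] + gamma * r[t - 1] ** 2 + phi * eps[t] ** 2
    r[t] = np.sqrt(lam2[t]) * eps[t]               # one source of randomness is shared
```

Because every term on the right-hand side is non-negative (given α > 0), λ_t² stays positive by construction.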


Chapter 3

The Realized Real-Time GARCH model

Now we combine the RT-GARCH(p,q) model with the RealGARCH(p,q) model. In this thesis we mainly investigate the first-lag linear specification of this model with different distributions. The model will also be compared with (some of) the currently existing models described in chapter two.

3.1 The General Formulation

The general formulation of the Real-RT-GARCH(p,q) model is given by³

r_t = \sqrt{\lambda_t}\, z_t \qquad (3.1)

\lambda_t = v(\lambda_{t-1}, \ldots, \lambda_{t-p}, z_t^2, \ldots, z_{t-k+1}^2, x_{t-1}, \ldots, x_{t-q}) \qquad (3.2)

x_t = m(\lambda_t, z_t, u_t) \qquad (3.3)

With z_t \sim i.i.d.(0, 1) and u_t \sim i.i.d.(0, \sigma_u^2). We can take the square root of λ_t as Hansen et al. (2012) did, because Smetanina (2017b) mentions the following about the ε_t² in her model: "The new volatility process λ_t² instead of σ_t², as equation 2.9 does not correspond to the conditional variance of returns in this system of equations, that is, var(r_t²|F_{t-1}) ≠ λ_t² as λ_t is not independent of ε_t any longer. Note also that the choice of a particular function of ε_t, that is ε_t², is only one of many possible ones subject to the necessary condition that λ_t² > 0. In particular, functions |ε_t|, ε_t⁴, among others, are possible." She also mentions that theoretically φ can be negative, but for practical reasons it is easiest to restrict it to be positive, φ > 0. If the parameter equals zero, the model becomes a simple GARCH model, where λ_t is the variance of r_t given the past, so λ_t = Var[r_t|F_{t-1}], and is positive since the returns are not constant. Therefore the condition λ_t > 0 in our model (described in the next sections) is satisfied. This modification makes it easier to estimate the moments and log-likelihood of the model, which now looks more like the SV model with two random shocks. τ(z_t) is a leverage function; the necessity of this function is investigated and described in section four of this chapter. For simplicity the specification is (as in Hansen et al. (2011)) τ(z_t) = τ_1 z_t + τ_2(z_t^2 − 1), which generates an asymmetric response in volatility to return shocks. As said before, the standard SV model already picks up some leverage effects, because of the dependence between the distributions. The main difference with the standard Real-Time GARCH model is that the lagged squared return is replaced by a realized measure of volatility, x_t. The x_t can represent, for example, the realized variance, bipower variation, intraday range and/or squared return. In this thesis it will

³ For consistency with the Realized GARCH model of Hansen et al. (2012), the equations of Smetanina (2017b) are adjusted.


represent the realized kernel, which is calculated using the Parzen kernel function of Barndorff-Nielsen et al. (2011), as described in chapter five. They concluded that it allows one to utilize high-frequency data and significantly improves predictive models, for example through better calculations of the covariances. This influences the first two stylized statistical properties of Cont (2001) from the introduction; however, the 5-minute trade data is only used for the realized kernel. We use the open-to-close prices for the returns.

3.2 Log-Linear Specification

In the log-linear specification of the Realized Real-Time GARCH model, the returns are specified as follows, where r_t = \sqrt{\lambda_t}\, z_t still holds.

\log r_t^2 = \log(\lambda_t z_t^2) = \log \lambda_t + \log z_t^2 \qquad (3.4)

The volatility process of this specification is described below. We use the logarithm of z_{t-k+1}^2 for consistency; this is not necessary, by the same arguments as we made in section one of this chapter.

\log \lambda_t = \alpha + \sum_{i=1}^{p} \beta_i \log \lambda_{t-i} + \sum_{k=1}^{m} \phi_k \log z_{t-k+1}^2 + \sum_{j=1}^{q} \gamma_j \log x_{t-j} \qquad (3.5)

\log x_t = \xi + \psi \log \lambda_t + \tau(z_t) + u_t \qquad (3.6)

This model specification has advantages, but also some pitfalls, which are well described in previous research. For example, logarithmic returns are continuously compounded and time additive, so multi-period series are easy to derive. Furthermore, Hughson et al. (2006) have shown that logarithmic returns perform better in forecasting than simple returns. A downside of the specification is that, in relation to the simple returns, the mean of the logarithmic returns (L) depends on both the mean and the variance of the simple returns (S), which results approximately in the formula (where the variables with a bar are the means) \bar{x}_L = \bar{x}_S − 0.5\sigma_S^2. So a relationship arises between risk (volatility) and return. Hudson and Gregoriou (2015) conclude their comparison of simple returns against logarithmic returns with the following sentence: "although each method of calculating returns has advantages, the methods may give results that are surprisingly different. It is worthwhile to be aware of this and so not to draw inappropriate conclusions from empirical studies". From the model specification we can conclude that all the derivations become very difficult. In the Realized GARCH model of Hansen et al. (2012), log λ_t = b_{t-1} is still relatively easy to calculate with, but if a term of time t is added behind it, it becomes very messy, as can be seen, for example, in the first derivation of the Appendix. Given the time period and magnitude of this thesis, we leave the logarithmic specification for further research. It remains a very interesting specification, as it is less sensitive to outliers, as shown in the paper of Hansen et al. (2012).


3.3 Linear Specification

The linear specification of our model is described in this section. This specification is the most common in other research and uses the simple return. Smetanina (2017b) also uses this linear specification, with variables lagged one time period. This seems right given point five of the stylized properties of Cont (2001) mentioned in the introduction, which states that observations from the more distant past have less influence.

r_t = \sqrt{\lambda_t}\, z_t \qquad (3.7)

\lambda_t = \alpha + \sum_{i=1}^{p} \beta_i \lambda_{t-i} + \sum_{k=1}^{m} \phi_k z_{t-k+1}^2 + \sum_{j=1}^{q} \delta_j x_{t-j} \qquad (3.8)

x_t = \xi + \psi \lambda_t + \tau(z_t) + u_t \qquad (3.9)

So the Realized Real-Time GARCH(1,1) model discussed in this thesis has the following form.

r_t = \sqrt{\lambda_t}\, z_t \qquad (3.10)

\lambda_t = \alpha + \beta \lambda_{t-1} + \phi z_t^2 + \delta x_{t-1} = b_{t-1} + \phi z_t^2 \qquad (3.11)

x_t = \xi + \psi \lambda_t + \tau(z_t) + u_t \qquad (3.12)

Here we can easily split the term \phi z_t^2 from the variables of time t−1.
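Simulating the system (3.10)-(3.12) makes the timing explicit: b_{t−1} is fixed at time t−1, the current shock z_t then completes λ_t, and the realized measure x_t is generated from λ_t. The parameter values below are hypothetical, chosen only to keep β + δψ < 1:

```python
import numpy as np

# Sketch: simulate the Realized RT-GARCH(1,1) system (3.10)-(3.12).
rng = np.random.default_rng(4)
alpha, beta, phi, delta = 0.04, 0.50, 0.04, 0.40
xi, psi, tau1, tau2, sigma_u = -0.02, 1.0, -0.07, 0.07, 0.15
T = 2000

z = rng.standard_normal(T)
u = sigma_u * rng.standard_normal(T)
lam = np.empty(T); r = np.empty(T); x = np.empty(T)
lam_prev, x_prev = 1.0, 1.0        # arbitrary positive starting values

for t in range(T):
    b = alpha + beta * lam_prev + delta * x_prev   # b_{t-1}: F_{t-1}-measurable part
    lam[t] = b + phi * z[t] ** 2                   # (3.11): current shock completes lambda_t
    r[t] = np.sqrt(lam[t]) * z[t]                  # (3.10)
    x[t] = xi + psi * lam[t] + tau1 * z[t] + tau2 * (z[t] ** 2 - 1) + u[t]  # (3.12)
    lam_prev, x_prev = lam[t], x[t]
```

With these values the persistence is β + δψ = 0.9, so the simulated λ_t fluctuates around the stationary level (α + δξ + φ)/(1 − (β + δψ)) derived in chapter four.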

3.4 The leverage effect

Number six of the stylized statistical properties described by Cont (2001) and mentioned in the introduction is the leverage effect. The first empirical evidence for this phenomenon is documented by, for example, Black (1976), Christie (1982) and Engle and Ng (1993). The leverage effect, or news impact curve, describes the negative correlation between the returns and the asset volatility. But both positive and negative changes in the unexpected returns lead to an increase of the variance, so the negative correlation holds for the negative part of the returns. As Yu (2005) mentions in his paper: "The usual claim is that when there is bad news, which decreases the price and hence increases the debt-to-equity ratio (i.e. financial leverage), it makes the firm riskier and tends to increase future expected volatility." Furthermore, Bouchaud et al. (2001) reported evidence of this return-volatility relation, but conclude that it decays rapidly for individual stocks and even faster (exponentially) for stock indices. Jensen et al. (2003) plot the investment horizon against different levels of return. They showed that downward movements are faster than upward ones, which means there is gain/loss asymmetry, and this influences the leverage effect. For this reason we take an asymmetric convex function as the leverage effect in this thesis. Hansen et al. (2012) conclude in their research that a shifted polynomial works fine and makes the calculation of the log-likelihood easier. For this reason the leverage effect is equal to τ(z_t) = τ_1 z_t + τ_2(z_t^2 − 1), with τ_2 > 0, τ_1 ≠ 0 and an expectation of E[τ(z_t)] = 0. With τ_2 > 0, the conclusions of, for example, Engle and Ng (1993) are satisfied. τ_1 cannot be zero, otherwise the function becomes symmetric around the y-axis; furthermore, this parameter will be negative, reflecting the fact that negative returns affect the volatility more than positive returns.


Figure 3.1: The leverage effect

The values for τ_1 and τ_2 in the above figure (3.1) are chosen based on the outcomes of the research of Hansen et al. (2012). Later, we calculate these factors for the data and model used in this thesis. From the graph we see an asymmetric function, where τ_1 mainly influences the horizontal shift of the graph, and τ_2 affects the vertical shift and the width of the curve.
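The leverage function itself is a one-liner; the sketch below (with illustrative τ values in the spirit of figure 3.1) checks the two properties used above, E[τ(z_t)] = 0 and the stronger volatility response to negative shocks:

```python
import numpy as np

# Sketch: the leverage function tau(z) = tau1*z + tau2*(z^2 - 1), with tau2 > 0 and
# tau1 < 0 so that negative shocks raise volatility more.  Values are illustrative.
tau1, tau2 = -0.07, 0.07

def tau(z):
    """Asymmetric (convex) news impact curve."""
    z = np.asarray(z, dtype=float)
    return tau1 * z + tau2 * (z ** 2 - 1.0)

# Asymmetry: a -2 sigma shock raises volatility more than a +2 sigma shock
neg, pos = tau(-2.0), tau(2.0)

# E[tau(z)] = tau1*E[z] + tau2*(E[z^2] - 1) = 0 for any z with mean 0 and variance 1;
# check by Monte Carlo under the Gaussian assumption
rng = np.random.default_rng(5)
mc_mean = tau(rng.standard_normal(200_000)).mean()
```

Because E[z] = 0 and E[z²] = 1 under the model assumptions, the zero mean of τ(z_t) holds for any standardized innovation, not just the Gaussian one used in the check.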

3.5 Model with Student-t distribution

The distributional assumptions of our model are very important for the prediction of the financial markets. As mentioned in point two of the stylized statistical properties of Cont (2001) in the introduction, the returns become normally distributed as ∆t increases. For that reason, this thesis is based on a Gaussian assumption, but it is commonly known that returns are approximately uncorrelated over time and described by a symmetric distribution with heavy tails, as we can see in the graph below (3.2). Bollerslev (1987) was the first to apply a Student-t distribution to the GARCH framework, finding that the GARCH(1,1)-t performs quite well. Still, the question remained whether other conditional error distributions perform even better.


Figure 3.2: Gaussian vs Student-t distribution

From the figure above (3.2) we can see, though not yet conclude, that the Student-t looks like a better fit for our data. Therefore our model is also investigated with a Student-t distribution (for z_t); the results are shown in chapter six. The derivations (with the Student-t distribution) of all the formulas in this thesis are similar to those for the Gaussian distribution, which is described in more detail, and are therefore not included. Be aware that the formulas change after using the joint conditional probability function as in equation 4.9 of the next chapter. Furthermore, we know that the Student-t distribution (with ν degrees of freedom) converges to a Gaussian distribution as 1/ν goes to zero; if this fraction is bigger than zero, the distribution has "fatter tails". Moreover, the fourth moment only exists if the number of degrees of freedom is higher than four, and the skewness of z_t is zero whenever it exists. This parameter will become an output of the minimization of the negative log-likelihood, so it is interesting to see the results for the degrees of freedom. In the programming of this extra parameter, the optimizer converges faster if we parameterize it as exp(ν): a minor change in the parameter then has more influence, and it produces errors earlier when wrong starting values are used.
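A small sketch of the two ingredients discussed here: drawing unit-variance Student-t innovations (rescaling by √(ν/(ν−2))) and the exp(·) reparameterization of the degrees of freedom. The helper name and the value ν = 12 are our own choices:

```python
import numpy as np

rng = np.random.default_rng(6)

def standardized_t(nu, size, rng):
    """Draw z_t with mean 0 and variance 1 from a Student-t with nu > 2 degrees of freedom."""
    return rng.standard_t(nu, size) / np.sqrt(nu / (nu - 2.0))

# In the optimizer the degrees of freedom can be estimated through nu = exp(theta_nu),
# which keeps nu positive -- the reparameterization mentioned in the text.
theta_nu = np.log(12.0)
nu = np.exp(theta_nu)

z = standardized_t(nu, 500_000, rng)
sample_var = z.var()
excess_kurt = (z ** 4).mean() - 3.0   # population value 3(nu-2)/(nu-4) - 3 = 0.75 for nu = 12
```

The positive excess kurtosis shows the fatter tails relative to the Gaussian; it exists only for ν > 4, consistent with the moment condition stated above.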


Chapter 4

Outline of the Estimation Theory

In this chapter we discuss the properties of the Real-RT-GARCH(1,1) model. We start with a derivation of the moments; after that we look at the conditional probability function. This function is needed to formulate the log-likelihood, which is essential later on in the estimation of the parameters. Furthermore, the volatility forecast is explained, where parameters from the first results of chapter six are used. These forecasts will give us (via test statistics) a good comparison between several models, which is also described in chapter six of this thesis.

4.1 Main results

In this paragraph, we derive some statistical properties of the Real-RT-GARCH(1,1) model. We start with the unconditional moments of r_t^2, λ_t, and x_t. From the fact that z_t \sim i.i.d.(0, 1), u_t \sim i.i.d.(0, \sigma_u^2), and equations 3.10-3.12, we get the following unconditional expectations, where the derivations are shown in equations A.11-A.13.

E[r_t^2] = \alpha + \beta E[\lambda_{t-1}] + \phi E[z_t^4] + \delta E[x_{t-1}] \qquad (4.1)

With τ(z_t) = τ_1 z_t + τ_2(z_t^2 − 1), the expectation of τ(z_t) becomes zero (proof in Appendix, equation A.11), so that

E[x_t] = \xi + \psi E[\lambda_t] \qquad (4.2)

The expectation of λ_t is used as a starting point in the calculation of the quasi-maximum likelihood, and is given by (for λ_0, assuming weak stationarity)

E[\lambda_t] = \alpha + \delta\xi + \phi + (\beta + \delta\psi) E[\lambda_{t-1}] \qquad (4.3)

E[\lambda_1] = \frac{\alpha + \delta\xi + \phi}{1 - (\beta + \delta\psi)} \qquad (4.4)

A combination of the equations described above gives the following expectation for r_t^2 (see equation A.14 for the derivation), where much previous research sets E[z_t^4] − 1 equal to η.

E[r_t^2] = E[\lambda_t] + \phi(E[z_t^4] - 1) \qquad (4.5)

If we use the weak stationarity assumption again, we get the following expression for the expectation of r_1^2.

E[r_1^2] = \frac{\alpha + \delta\xi + \phi}{1 - (\beta + \delta\psi)} + \phi\eta \qquad (4.6)

Furthermore, if we assume normality for z_t, the kurtosis γ_2 = E[z_t^4] = 3, which reduces the previous formulas to the expectations of the squared returns at time t and one described below. The distribution of z_t is symmetric and therefore the skewness equals zero, γ_1 = E[z_t^3] = 0.

E[r_1^2] = \frac{\alpha + \delta\xi + \phi}{1 - (\beta + \delta\psi)} + 2\phi \qquad (4.7)

E[r_t^2] = E[\lambda_t] + 2\phi \qquad (4.8)
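Equations (4.3), (4.4) and (4.8) can be checked numerically: iterating the expectation recursion from any starting value must converge to the stationary level (4.4). Parameter values are illustrative and satisfy β + δψ < 1:

```python
import numpy as np

# Sketch: verify the stationary-mean formulas (4.4) and (4.8) by iterating (4.3).
alpha, beta, phi, delta = 0.04, 0.50, 0.04, 0.40
xi, psi = -0.02, 1.0

persistence = beta + delta * psi                            # 0.9 here
e_lam1 = (alpha + delta * xi + phi) / (1.0 - persistence)   # equation (4.4)

# Iterate E[lambda_t] = alpha + delta*xi + phi + (beta + delta*psi) E[lambda_{t-1}]
# from an arbitrary starting value: it must converge to e_lam1
e_lam = 5.0
for _ in range(200):
    e_lam = alpha + delta * xi + phi + persistence * e_lam

# Under normality E[z_t^4] = 3, so (4.8) gives E[r_t^2] = E[lambda_t] + 2*phi
e_r2 = e_lam1 + 2.0 * phi
```

The fixed point of the recursion is exactly (4.4), since the map is a contraction whenever β + δψ < 1.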

The expectations included above are also used in paragraph three for the volatility forecast. The joint conditional probability density function (pdf) of the return series and the realized measures is used for the calculation of the log-likelihood in paragraph two. First we split the joint conditional pdf into a part for the returns and a part for the realized measures, because we have assumptions for these distributions. The functions are given by (see equations A.15-A.18)

f(r_t, x_t \mid F_{t-1}) = f_r(r_t \mid F_{t-1})\, f_x(x_t \mid r_t, F_{t-1}) = \frac{|r_t|}{d(r_t)\sqrt{b_{t-1}^2 + 4 r_t^2 \phi}}\, f_z(d(r_t) \mid F_{t-1})\, f_u(h(r_t, x_t) \mid r_t, F_{t-1}) \qquad (4.9)

where we know b_{t-1} = \alpha + \beta\lambda_{t-1} + \delta x_{t-1} from equation 3.11. Furthermore, we need expressions for the unknown variables; d(r_t) is equal to (see equations A.4-A.8)

d(r_t) = \begin{cases} \operatorname{sign}(r_t)\sqrt{\dfrac{\sqrt{b_{t-1}^2 + 4 r_t^2 \phi} - b_{t-1}}{2\phi}} & \text{for } \phi \neq 0 \\ \dfrac{r_t}{\sqrt{b_{t-1}}} & \text{for } \phi = 0 \end{cases} \qquad (4.10)

And as shown in equation A.9 of the Appendix, u_t = h(r_t, x_t) is given by

h(r_t, x_t) = x_t - \xi - \psi \frac{r_t^2}{d^2(r_t)} - \tau(d(r_t)) \qquad (4.11)

The \sigma_u^2 = Var[u_t] = Var[h(r_t, x_t)] is used for the likelihood function, but be aware of the fact that this is an exogenous parameter: it is an output of the maximum likelihood estimation, not a parameter to maximize over. The following graph (4.1) shows the variables described above for the maximum likelihood.
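The inversion (4.10)-(4.11) is easy to verify with a round trip: generate (r_t, x_t) from known shocks and recover them. The sign convention sign(r_t) in d(·) is our own reading of the notation:

```python
import numpy as np

def d(r, b, phi):
    """Invert r_t = sqrt(b_{t-1} + phi*z^2) * z for z_t (equation 4.10)."""
    if phi == 0.0:
        return r / np.sqrt(b)
    z2 = (np.sqrt(b ** 2 + 4.0 * r ** 2 * phi) - b) / (2.0 * phi)
    return np.sign(r) * np.sqrt(z2)

def h(r, x, b, phi, xi, psi, tau1, tau2):
    """Recover u_t = x_t - xi - psi*lambda_t - tau(z_t) (equation 4.11)."""
    z = d(r, b, phi)
    lam = r ** 2 / z ** 2          # lambda_t = r_t^2 / d(r_t)^2
    return x - xi - psi * lam - (tau1 * z + tau2 * (z ** 2 - 1.0))

# Round trip with illustrative values: generate (r_t, x_t) from known shocks, recover them
b, phi = 0.7, 0.04
xi, psi, tau1, tau2 = -0.02, 1.0, -0.07, 0.07
z_true, u_true = -1.3, 0.11
lam = b + phi * z_true ** 2
r_t = np.sqrt(lam) * z_true
x_t = xi + psi * lam + tau1 * z_true + tau2 * (z_true ** 2 - 1.0) + u_true

z_hat = d(r_t, b, phi)
u_hat = h(r_t, x_t, b, phi, xi, psi, tau1, tau2)
```

The quadratic φz⁴ + b_{t-1}z² − r_t² = 0 has exactly one positive root for z², which is why the mapping from r_t to z_t is one-to-one given the sign of r_t.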


Figure 4.1: Variables λt, dt, and ut

We can see that there is a big increase in volatility around 2008, which is a result of the credit crisis. This will increase the estimated volatility σ_u. For this reason the data is split (pre- and post-crisis) as in previous research. We will do the same in our analyses; a more extensive description of the data and the rest of the results is given in the next chapters.


4.2 Quasi-maximum likelihood analyses

In this section we derive the QMLE of the linear Realized RT-GARCH model. The score and Hessian appear quite hard to derive, so only the first and second derivatives needed for the score function are included in the Appendix. Here we use two assumptions: one is that all the data is tied to its own latent volatility process, and the other is the independence between z_t and u_t. In the calculation we split the joint log-likelihood into a sum, l(r_t, x_t) = l(r_t) + l(x_t | r_t), where l(r_t, x_t) = \log f(r_t, x_t | F_{t-1}). To give a better overview we insert the parameter θ, with θ = (ω', υ', φ, σ_u^2)', where ω = (α, β, δ)' and υ = (ξ, ψ, τ_1, τ_2)'. Using the last formulation of A.17 for the first term of equation 4.9, we derive the following log-likelihood.

L_T(\theta) = \sum_{t=1}^{T} l_t(\theta) \qquad (4.12)

l_t(\theta) = \log\left(\frac{\partial z_t(\theta)}{\partial r_t}\right) - \frac{1}{2}\left(z_t^2(\theta) + \log \sigma_u^2 + \frac{u_t^2(\theta)}{\sigma_u^2}\right) \qquad (4.13)

The score function s_t(\theta) = \partial l_t(\theta)/\partial\theta is described in equations A.32-A.35 of the Appendix.

In addition we know from the Appendix that E[s_t(θ) | F_{t-1}] = 0, where it is straightforward that the expectations of equations A.34 and A.35 equal zero. For the expectation of the more involved parts of the score function (equations A.32 and A.33), we refer to the Appendix of Smetanina (2017a). Many terms become zero after taking the expectation, and the parts that remain are already explained in that paper. For the asymptotic distribution of θ̂, we proceed as in Hansen et al. (2012).

(i) Suppose that E[u_t | z_t, F_{t-1}] = 0, E[z_t² | F_{t-1}] = 1, and E[u_t² | F_{t-1}] = σ_u². Then s_t(θ) = ∂l_t(θ)/∂θ is a martingale difference sequence.

(ii) Suppose, in addition, that (r_t, x_t, λ_t) is stationary and ergodic. Then

\[
\sqrt{T}\,(\hat{\theta} - \theta_0) \xrightarrow{d} \mathcal{N}(0, V_\theta) \tag{4.14}
\]

where V_θ = I_θ⁻¹ J_θ I_θ⁻¹, I_θ = −(1/T) E_{θ₀}[∂² log L_T(θ) / ∂θ ∂θ′], and J_θ = (1/T) E_{θ₀}[(∂ log L_T(θ)/∂θ)(∂ log L_T(θ)/∂θ′)].

4.3 Volatility Forecast

To determine whether our model predicts reality well and to get a better understanding of how it behaves when applied repeatedly, we look at the volatility forecast. Equations 4.3, 4.4, and 4.5 are used to derive an expression. For equation 4.3 at time t + 1 we obtain the following.

\[
\begin{aligned}
E[\lambda_{t+1}] &= \alpha + \delta\xi + \phi + (\beta + \delta\psi)\, E[\lambda_t] \\
&= (1 - (\beta + \delta\psi))\, E[\lambda_1] + (\beta + \delta\psi)\, E[\lambda_t] \\
&= E[\lambda_1] + (\beta + \delta\psi)(E[\lambda_t] - E[\lambda_1])
\end{aligned} \tag{4.15}
\]

Hereby equation 4.4 is substituted into the above formulation. Filling this into equation 4.5, where the information up to now is known, the volatility forecast becomes as follows.

\[
\begin{aligned}
E[r_{t+k}^2 \mid \mathcal{F}_t] &= E[\hat{\lambda}_{t+k} \mid \mathcal{F}_t] + \hat{\phi}\,(E[z_{t+k}^4 \mid \mathcal{F}_t] - 1) \\
&= E[\lambda_1] + (\hat{\beta} + \hat{\delta}\hat{\psi})^k \,(E[\hat{\lambda}_t \mid \mathcal{F}_t] - E[\lambda_1]) + \hat{\phi}\,(E[z_{t+k}^4 \mid \mathcal{F}_t] - 1)
\end{aligned} \tag{4.16}
\]

Here E[λ̂_t | F_t] is an estimate of λ_t and E[z⁴_{t+k} | F_t] is determined by our distributional assumptions. Results for this formulation are described and analysed in chapter six using the mean squared error (MSE), the QLIKE statistic, the median absolute error (MAE), and Model Confidence Set (MCS) tests (the tests are also explained there).
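Equation 4.16 is a simple recursion once the persistence β̂ + δ̂ψ̂ is computed. A minimal sketch (the function name is our own; E[λ_1] is taken to be the unconditional mean implied by equation 4.15, and the default E[z⁴] = 3 corresponds to Gaussian innovations):

```python
def forecast_r2(k, lam_t, alpha, beta, delta, phi, xi, psi, ez4=3.0):
    """k-step ahead forecast of r_{t+k}^2 following equation 4.16."""
    persistence = beta + delta * psi
    # unconditional mean of lambda implied by equation 4.15
    lam_bar = (alpha + delta * xi + phi) / (1.0 - persistence)
    # mean-reverting forecast plus the real-time term
    return lam_bar + persistence ** k * (lam_t - lam_bar) + phi * (ez4 - 1.0)

# close-to-close point estimates from table 6.1 (illustrative)
est = dict(alpha=-0.0232, beta=0.6850, delta=0.3441,
           phi=0.0490, xi=0.0682, psi=0.8008)
one_step = forecast_r2(1, 2.0, **est)
long_run = forecast_r2(1000, 2.0, **est)  # essentially the unconditional level
```

As k grows, the forecast converges geometrically to the unconditional level at rate β̂ + δ̂ψ̂, so with the persistence around 0.96 found below the current volatility state matters for many days ahead.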


Chapter 5

Data description

In this chapter we describe the data used for the returns and the realized kernel, based on the S&P 500 index. A short overview of the other indices is given in the Appendix in table B.1 and graph B.2. Our dataset of indices comes from the Oxford-Man Institute of Quantitative Finance and is freely available. First we discuss how the returns are calculated and analyse how the models respond to several adjustments. To give a clear overview of the realized kernel, the calculation method is explained, but be aware that these modifications were already made in this dataset. Furthermore we give the difference between the standard squared returns and the realized kernel. We made one modification to the dataset: on some dates (think of holidays or periods without recording) the stock exchanges are closed and no trading takes place, so these values are deleted. Otherwise the model produces errors (for example a logarithm of an empty value or NULL), or the price differences are incorrect and the estimations would be based on wrong values. The overview of the data in Appendix B.2 also gives the number of data points that remain after the adjustments. This influences the height of the log-likelihood, which is higher when we have more data points.

5.1 Returns

In our analyses we estimate the model on both open-to-close and close-to-close prices. High-frequency data is only available between the opening and closing time of the market, so the realized kernel of the next section captures this time slot. The graph below (figure 5.1) shows that the opening and closing prices are similar, but the more zoomed-in graph of Appendix B.1 shows that the prices do differ due to overnight effects. We prefer estimations without these overnight effects and therefore use open-to-close prices in most of the calculations of the next chapter. Furthermore we see a lot of price fluctuations in the first years and around the big credit crisis in 2008. This strongly influences the variance of the returns, as shown in the next section. Therefore we split the data (as shown by the dotted line in the graph) into a pre- and a post-crisis set. Lehman Brothers declared bankruptcy on September 15th, 2008; for this reason the pre-crisis period contains data from March 2000 until September 2008. The post-crisis set also contains data of the crisis itself and therefore runs from October 2008 until October 2017.



Figure 5.1: Closing prices

For the calculation of the returns we need to look carefully at our model. Squared returns are used in the model of Smetanina (2017b), which we replace by the realized kernel. The returns can be expressed in percentages by multiplying them by 100, which gives different outcomes than without the multiplication. Due to floating-point numerical issues in commonly used optimisation algorithms (e.g., regarding stopping criteria and numerical precision), percentage returns give a better chance of convergence when fitting the model. When percentages are used, most returns at time t become bigger than one, so taking squares makes them even bigger, whereas squares are smaller than the original value below one: r_t² > r_t if r_t > 1. The parameters γ and φ influence the model more with percentage returns, because the squared returns are bigger than without the multiplication. We therefore see a difference in the parameter estimations of the Smetanina (2017b) RT-GARCH model on our data, as shown in table 5.1 below. To be consistent with her paper we also use percentage returns, which is moreover the common convention in the GARCH framework, as many earlier papers work with percentage returns. Because of this multiplication we need to adjust the realized kernel as well, which is explained in the next section.

Table 5.1: Influences of ordinary and percentage returns on parameter estimations.

                      Returns                        100 * Returns
Parameter   Estimate   Standard Deviation   Estimate   Standard Deviation
α           -0.0001    0.0000               -0.0210    0.0034
β            0.9134    0.0123                0.9142    0.0099
γ            0.0606    0.0169                0.0654    0.0091



Figure 5.2: Differences of open-to-close and close-to-close daily compounded returns

The formulas for the returns are shown below, where r_{c,t} stands for the close-to-close returns and r_{o,t} for the open-to-close returns. The differences are shown in figure 5.2 above. We use the same notation for the prices, P_t.

\[
r_{c,t} = 100\,(\log P_{c,t} - \log P_{c,t-1}) \tag{5.1}
\]
\[
r_{o,t} = 100\,(\log P_{c,t} - \log P_{o,t}) \tag{5.2}
\]
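Equations 5.1 and 5.2 translate directly into code. A minimal sketch (the function name and the example prices are hypothetical):

```python
import math

def log_returns_pct(closes, opens=None):
    """Percentage log returns: close-to-close (eq. 5.1) if only closing
    prices are given, open-to-close (eq. 5.2) if opening prices are too."""
    if opens is None:
        return [100 * (math.log(c1) - math.log(c0))
                for c0, c1 in zip(closes, closes[1:])]
    return [100 * (math.log(c) - math.log(o)) for o, c in zip(opens, closes)]

closes = [100.0, 101.0, 99.5]
opens = [99.8, 100.6, 100.2]
r_c = log_returns_pct(closes)          # close-to-close: one fewer observation
r_o = log_returns_pct(closes, opens)   # open-to-close: one per trading day
```

Note that the close-to-close series has one observation fewer than the number of trading days, while the open-to-close series does not, which matters when aligning returns with the realized kernel of the next section.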

5.2 Realized Kernel

The realized kernel introduced by Barndorff-Nielsen et al. (2008) can be used in the GARCH framework as a replacement of the squared returns. Because of our assumptions the expectation of the returns is zero, so the realized kernel can be seen as a measurement of the variance (the quadratic variation of the underlying efficient price) of the returns. The realized kernel is a robust estimator of the volatility and is therefore often used for high-frequency data: it remains an appropriate estimator when the returns contain some noise. Moreover it ignores the overnight price fluctuations as well as large errors in the first minutes of the day. In this section we describe the importance of data cleaning and the method used to calculate the realized kernel. This is in our opinion a notable aspect of the realized measures, although, as said before, the realized kernel is already calculated in our dataset.



5.2.1 Data Cleaning

Data cleaning is one of the most important aspects of volatility estimation, in our case computing the realized kernel. The realized kernel does not distinguish between accurate data and outliers, so noisy observations have a lot of influence on the estimations. Although realized kernels are robust to noise, as mentioned above, Barndorff-Nielsen et al. (2009) looked for an improvement of the realized measures by using data cleaning. They concluded that the volatility estimates are more accurate when the cleaning steps, based on the already existing literature, are applied. The data of the Oxford-Man Institute of Quantitative Finance has more variation than in their research, but the same cleaning steps have been used. These steps can be found in the paper of Barndorff-Nielsen et al. (2009) or on the website of the Oxford-Man Institute of Quantitative Finance. We do not describe them in this thesis because, again, they have already been applied to this dataset.

5.2.2 Parzen Kernel function

In our dataset the realized kernel is calculated using the Parzen-Rosenblatt window method. The Oxford-Man Institute of Quantitative Finance uses 1-5 minute return data to investigate the effect of the noise. They calculate the realized variance of small subsets and take the average over many of these estimators, which is called subsampling. In this thesis we only use the realized kernel, which is calculated with the Parzen weight function (described below), where H needs to increase with the sample size to obtain consistent estimates.

\[
RM_t = \sum_{h=-H}^{H} k\!\left(\frac{h}{H+1}\right) \gamma_h \tag{5.3}
\]

Here k(x) stands for the Parzen kernel function, a non-negative function that integrates to one. γ_h is formulated as follows, with t_j the times of trades or quotes (x) on the t-th day:

\[
\gamma_h = \sum_{j=|h|+1}^{n} x_{j,t}\, x_{j-|h|,t} \tag{5.4}
\]

For more information we refer to the website of the Oxford-Man Institute of Quantitative Finance, which includes a reader with details. Furthermore we use percentage returns, as mentioned in the first part of this chapter. We therefore need to be aware that the realized kernel replaces the squared returns, which means that the output of the functions above must be multiplied by 100². This scaled realized kernel is plotted against the squared returns (shown as negative for clarity) in figure 5.3 on the next page, where we can see that the squared returns are in most cases still bigger than the scaled kernel. Furthermore figure 5.3, combined with the plot of the closing prices (5.1), confirms point three of our introduction. We also see a bit of point four: high-volatility events tend to cluster in time.



Figure 5.3: Squared returns vs realized kernel

5.3 Summary of the data

As mentioned in the beginning of this chapter, the S&P 500 index is the main data we use in this thesis, so we give a more detailed description in the table below. In some parts of the results of the next chapter other indices are used; shorter descriptions of these indices are included in Appendix B.1.

Table 5.2: Variables based on the S&P 500 index

Variable          Mean      Maximum   Minimum   Standard Deviation
Closing Prices    1424.70   2575.00   676.70    414.86
Returns           0.01      10.64     -9.69     1.21
Realized Kernel   1.08      93.13     0.02      2.51
Squared Returns   1.47      113.25    0.00      4.70
λ                 1.26      44.47     0.00      2.24
d                 0.03      3.51      -3.71     1.00
h                 0.02      78.84     -23.11    1.75

The variables described above are all used to calculate the log-likelihood 4.9 in the next chapter. We see again that the realized kernel is slightly lower than the squared returns, which are used in the ordinary RT-GARCH model. Furthermore it shows that our set-up of z_t (= d(r)) works fine, because the standard deviation equals one as expected. Since h has a large maximum, we need to program this without the built-in functions in Matlab; otherwise it runs out of limits and produces errors.


Chapter 6

Results

In this thesis the model is tested and compared to other models by a variety of tests: the p-values of the Model Confidence Set test of Hansen et al. (2011) are used, and the likelihood-ratio test is calculated as in Francq et al. (2004). Several loss functions (MSE, QLIKE, and MAE), as in the paper of Patton (2011), are evaluated using the volatility forecast, and the risk management loss function (VaR) is evaluated as well. The central limit theorem is used in the quasi-maximum likelihood analyses (QMLE). All these tests and results are explained in this chapter.

6.1 Empirical results

In this section the results of the parameter estimations and the test statistics used are described. Our baseline is, as said in the previous chapter, the S&P 500 index with open-to-close prices. First we look at whether our parameters are significant and explain the test statistics. After that the data is split and we examine the difference between close-to-close and open-to-close prices and between pre- and post-crisis data. Finally we run our model on different indices (open-to-close prices) and check whether the parameter estimations react the same way.

In the next table (6.1) we see the estimated parameters with their standard error, t-statistic, and p-value⁴. The resulting volatility of the parameter u over the total data is shown. A more extensive overview of this estimate is given in the Appendix (B.3), where the volatility is shown on a yearly basis.

Table 6.1: Results for the Real RT-GARCH(1,1) model based on S&P 500 close-to-close prices

Parameter Estimation Standard error t-value p-value

α     -0.0232   0.0048   -4.8755   0.0001*
β      0.6850   0.0161   42.4133   0.0000*
δ      0.3441   0.0271   12.7144   0.0000*
φ      0.0490   0.0065    7.4922   0.0000*
ξ      0.0682   0.0362    1.8813   0.0303*
ψ      0.8008   0.0419   19.1198   0.0000*
τ1     0.0492   0.0265    1.8456   0.0322*
τ2     0.0887   0.0189    4.6940   0.0001*
σu     1.7467

⁴ All p-values marked with a * are significant at the 5% level.



The p- and t-values in the above table (6.1) are computed under the null hypothesis that the parameter value equals zero. With these estimates our model becomes as follows, where z_t is still Gaussian distributed with zero mean and unit standard deviation, and u_t also has a normal distribution with zero mean and a standard deviation of approximately 1.75, as mentioned in the previous table.

\[
r_t = \sqrt{\lambda_t}\, z_t \tag{6.1}
\]
\[
\lambda_t = -0.02 + 0.69\,\lambda_{t-1} + 0.05\, z_t^2 + 0.34\, x_{t-1} \tag{6.2}
\]
\[
x_t = 0.07 + 0.80\,\lambda_t + 0.05\, z_t + 0.09\,(z_t^2 - 1) + u_t \tag{6.3}
\]

Now we can easily see that the extra real-time parameter φ has a significant influence in the model; this is almost the same as, and in line with, the Smetanina (2017b) paper. The realized part makes a significant contribution to the model too. Only τ₁ is not as expected, but it is very small and therefore has little impact on our model. It becomes negative (as expected) in the next tables for open-to-close prices, and the parameter is mostly negative in table 6.2 below. There we check how the estimates change for different indices (with different volatilities). This gives a better understanding of our model and more confidence in the estimates. Furthermore the S&P 500 index is used in all the other results, so the conclusions of this thesis could be sensitive to the data used. The standard deviations of the estimates in the table below are of similar size as in the other tables of this chapter, but for clarity only a *, for significance, is included. Be aware that σ_u, the standard deviation of u, results from the other parameters. The likelihoods cannot be compared directly because of the different number of data points across indices (see Appendix B.2). We could divide by the number of data points, but our goal here is not to compare the indices; therefore the last two columns do not contain stars.

Table 6.2: Estimates for the linear Real-RT-GARCH(1,1) model

Index      α         β        δ        φ        ξ        ψ        τ1       τ2       σu      l(r, x)
SPX        -0.0201*  0.6577*  0.3821*  0.0296*  0.1232*  0.7773*  -0.1194* 0.1562*  1.7362  -14425
FTSE       -0.0043   0.6418*  0.3214*  0.0236*  0.0480*  0.9726*  -0.0694* 0.0971*  1.0052  -11348
N225       0.1190*   0.5759*  0.1857*  0.0752*  -0.4346* 1.6330*  -0.1273* 0.3001*  1.1197  -12599
GDAXI      0.0109    0.4577*  0.4193*  0.0664*  0.1458*  1.0663*  -0.2229* 0.2453*  2.0381  -16181
RUT        0.0704*   0.6440*  0.4820*  0.0521*  -0.0880  0.6447*  -0.0627* 0.1540*  1.4235  -14936
AORD       0.0528*   0.7328*  0.1314*  0.0225*  -0.3138* 1.5354*  -0.0490* 0.1532*  0.5174  -8169
DJIA       -0.0245*  0.6414*  0.4007*  0.0271*  0.1402*  0.7729*  -0.1340* 0.1730*  1.7370  -14312
IXIC       -0.0214*  0.6809*  0.3886*  0.0408*  0.1236*  0.7165*  -0.0273  0.2078*  1.7777  -15227
FCHI       -0.0099   0.6091*  0.2934*  0.0683*  0.0428   1.1294*  -0.1312* 0.1840*  1.5045  -14654
KS11       0.0137    0.4901*  0.4634*  0.0393*  0.0478   0.9676*  -0.1202* 0.1506*  1.3876  -13544
AEX        -0.0133   0.5963*  0.3212*  0.0536*  0.0484   1.0920*  -0.1386* 0.1640*  1.3128  -13562
SSMI       0.0661*   0.5726*  0.3766*  -0.0014* -0.0431  0.9877*  -0.0456* 0.1436*  0.8100  -10690
IBEX       0.0321*   0.5874*  0.2888*  0.0767*  0.0291   1.1107*  -0.1412* 0.2414*  1.4903  -14845
NSEI       0.2593*   0.3940*  0.0494*  0.1806*  -3.5821* 5.9734*  -0.4580* -0.5175* 2.1058  -13907
MXX        -0.0382*  0.7873*  0.2934*  0.1149*  -0.0195  0.5352*  -0.0829* 0.1084*  0.8619  -12368
BVSP       0.0966*   0.6644*  0.2640*  0.1283*  -0.4445* 1.1080*  -0.2264* 0.3453*  2.3629  -17940
GSPTSE     0.0332*   0.5116*  0.4641*  0.0254*  -0.0533* 0.9373*  -0.0558* 0.0796*  0.8033  -8678
STOXX50E   0.0033    0.7096*  0.2471*  0.0475*  0.1070   0.9696*  -0.2793* 0.4052*  2.3206  -16917
STI        0.0172*   0.5807*  0.2580*  0.0657*  -0.0862* 1.1972*  0.0233*  0.0850*  0.5330  -7379
FTMIB      0.0151    0.5893*  0.3579*  0.0728*  -0.0288  0.9765*  -0.1408* 0.1512*  1.2523  -13922
Average    0.0329    0.6062   0.3194   0.0605   -0.2119  1.2552   -0.1304  0.1514   1.4050  -13280

This table (6.2) shows that the realized part is a significant improvement, with δ having a big influence in the model. Furthermore β becomes slightly lower than in the paper of Smetanina (2017b). This can be explained by the fact that the realized part also contains the lagged λ and therefore partly takes over the role of the β parameter, but it is clearly a better predictor. Moreover the volatility of u is higher than in previous research; this is due to the bigger dataset (more post-crisis data) and becomes clearer in table 6.4.



The tables (6.3 and 6.4) on the next two pages are quite similar. The first table (6.3) gives the estimates for close-to-close and open-to-close prices as described in chapter five. The second table (6.4) pictures the difference between pre- and post-crisis data, where the volatility is much higher in the post-crisis situation. This is very interesting for this research, where we mainly focus on the volatility of the model, as we can now see how the estimates respond. Furthermore the likelihood-ratio (LR) test is calculated as a comparison of the goodness of fit of two models. The Real-RT-GARCH model is the alternative hypothesis in the following tables, and we test whether the (older) models are worse than ours (by rejecting the null hypothesis). The test statistic is as follows.

\[
LR_i = 2\,\bigl(l(r, x)_{RRT\text{-}G} - l(r, x)_i\bigr) \sim \chi^2_{p_{alt} - p_{null}} \tag{6.4}
\]
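Formula 6.4 is straightforward to evaluate; a sketch that compares the statistic against tabulated 5% chi-squared critical values (the function name and the degrees-of-freedom choice below are our own, illustrative reading):

```python
# 5% critical values of the chi-squared distribution (standard tables)
CHI2_CRIT_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def lr_test(loglik_alt, loglik_null, df):
    """Likelihood-ratio statistic of equation 6.4 and a 5%-level decision."""
    stat = 2.0 * (loglik_alt - loglik_null)
    return stat, stat > CHI2_CRIT_5PCT[df]

# e.g. Real-RT-GARCH vs realized GARCH, open-to-close returns (table 6.3);
# df = 1 for the single extra parameter phi (our illustrative choice)
stat, reject = lr_test(-14425, -14494, df=1)
```

With these likelihoods the statistic equals 138, far above any of the tabulated critical values, which matches the clear rejections reported in the tables below.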

Here i can be replaced by the other models, and the parameter p (in the degrees of freedom of the chi-squared distribution) stands for the number of parameters of the models; compared to the outputs in the tables on the next pages this has no influence. Table 6.3 on the next page shows little difference between the open-to-close and close-to-close returns. The ξ parameter is smaller for the close-to-close returns because of the bigger time interval, as noted in the paper of Hansen et al. (2012). In terms of likelihood our model performs better than the realized GARCH model for both types of returns, and the model with the Student-t distribution is also worse than our standard model. The log-likelihood will always be negative here, with higher values (closer to zero) indicating a better fitting model. We can see this in table 6.3, where the likelihood ratio is stated: all log-likelihoods are more negative than our model's, so the likelihood ratios are far bigger than the critical values of the test statistic in formula 6.4. For example, for the open-to-close returns our model has a likelihood of -14425, against -14483 for the Student-t distribution and -14494 for the realized GARCH model. Furthermore the φ parameter is still significant and therefore a good extension of the realized GARCH framework. The main goal of this thesis was to improve the RT-GARCH model of Smetanina (2017b), who already showed that her model performs better than the simple GARCH model. Because of the big difference between the likelihoods, we use the partial⁵ likelihood to compare our model with the GARCH model. Our extension also outperforms the simple RT-GARCH model for both types of returns; for open-to-close the difference was -14425 against -14700.

In table 6.4 the difference in the height of the volatility in the data is shown and reflected in the σ_u parameter. We see that highly fluctuating data (post-crisis) mainly affects the β of our model, which is the most important part in terms of variance. Furthermore we see the same as in table 6.3: our model outperforms the realized GARCH model and the realized RT-GARCH-t model, even more so for the post-crisis data. The likelihood ratios are much bigger: for the realized GARCH, 34 against 900; for the Student-t distribution, 22 against 168. This means that our model performs better when there are a lot of fluctuations, which comes from the fact that the RT-GARCH model reacts faster to changes, as explained in the paper of Smetanina (2017b). Moreover the realized extension works well and better than the standard RT-GARCH model, as we can see in the tables (italic numbers of the LR statistic, computed with the partial likelihood).

The degrees of freedom are for both tables between one and two (exponential). The size of the data makes this happen, and so the distribution converges to a Gaussian distribution.

⁵ Here we compare a model that maximizes the partial likelihood (therefore the LR statistic is shown in italics in the tables) with a model that maximizes a joint likelihood.


Table 6.3: Results for the linear Real RT-GARCH(1,1) model (standard errors in parentheses)

Open-to-close returns

Parameter  G(1,1)            RG(1,1)            RT-G(1,1)          RRT-G(1,1)         RRT-G-t(1,1)
α           0.0101 (0.0020)   0.0136 (0.0049)   -0.0186 (0.0020)   -0.0201 (0.0037)   -0.0001 (0.0001)
β           0.8951 (0.0095)   0.6501 (0.0157)    0.9090 (0.0104)    0.6577 (0.0166)    0.6405 (0.0005)
δ           0.1010 (0.0093)   0.4220 (0.0238)    0.0995 (0.0119)    0.3821 (0.0244)    0.3018 (0.0005)
φ           —                 —                  0.0205 (0.0026)    0.0296 (0.0041)   -0.0007 (0.0000)
ξ           —                 0.1145 (0.0317)    —                  0.1232 (0.0318)    0.1353 (0.0301)
ψ           —                 0.7204 (0.0269)    —                  0.7773 (0.0313)    1.0383 (0.0151)
τ1          —                -0.1120 (0.0268)    —                 -0.1194 (0.0264)   -0.1149 (0.0214)
τ2          —                 0.1322 (0.0159)    —                  0.1562 (0.0187)    0.0377 (0.0083)
σu          —                 1.7424             —                  1.7362             1.7495
ν           —                 —                  —                  —                  1.4364 (0.0510)
l(r, x)    -5884            -14494             -5854             -14425             -14483
LR          —               138                550                —                 116

Close-to-close returns

Parameter  G(1,1)            RG(1,1)            RT-G(1,1)          RRT-G(1,1)         RRT-G-t(1,1)
α           0.0159 (0.0028)   0.0286 (0.0055)   -0.0209 (0.0034)   -0.0232 (0.0048)   -0.0067 (0.0001)
β           0.8916 (0.0097)   0.6706 (0.0152)    0.9125 (0.0099)    0.6850 (0.0161)    0.6600 (0.0005)
δ           0.0955 (0.0090)   0.4116 (0.0244)    0.0652 (0.0091)    0.3441 (0.0271)    0.3393 (0.0006)
φ           —                 —                  0.0385 (0.0048)    0.0490 (0.0065)   -0.0002 (0.0000)
ξ           —                 0.0735 (0.0331)    —                  0.0682 (0.0362)    0.0927 (0.0298)
ψ           —                 0.7000 (0.0275)    —                  0.8008 (0.0419)    0.8639 (0.0129)
τ1          —                 0.0508 (0.0271)    —                  0.0492 (0.0265)    0.0549 (0.0213)
τ2          —                 0.0817 (0.0154)    —                  0.0887 (0.0189)    0.0162 (0.0070)
σu          —                 1.7586             —                  1.7467             1.7632
ν           —                 —                  —                  —                  1.1748 (0.0432)
l(r, x)    -6104            -14797             -6030             -14699             -14805
LR          —               196                450                —                 212


Table 6.4: Results for the linear Real RT-GARCH(1,1) model (standard errors in parentheses)

Pre-crisis

Parameter  G(1,1)            RG(1,1)            RT-G(1,1)          RRT-G(1,1)         RRT-G-t(1,1)
α           0.0095 (0.0029)   0.1101 (0.0216)   -0.0160 (0.0067)    0.0512 (0.0219)    0.1306 (0.0855)
β           0.9237 (0.106)    0.4827 (0.0317)    0.9396 (0.0113)    0.4891 (0.0346)    0.4509 (0.0549)
δ           0.0704 (0.0098)   0.4946 (0.0399)    0.0558 (0.0117)    0.4714 (0.0457)    0.3549 (0.1239)
φ           —                 —                  0.0080 (0.0237)    0.0369 (0.0097)   -0.0001 (0.0033)
ξ           —                 0.0050 (0.0470)    —                  0.0417 (0.0479)   -0.1843 (0.2809)
ψ           —                 0.8633 (0.0545)    —                  0.8884 (0.0618)    1.2853 (0.4934)
τ1          —                -0.0918 (0.206)     —                 -0.0974 (0.0205)   -0.0828 (0.0182)
τ2          —                 0.1616 (0.0141)    —                  0.1558 (0.0157)    0.1388 (0.0126)
σu          —                 0.9436             —                  0.9411             0.9385
ν           —                 —                  —                  —                  1.9899 (0.2911)
l(r, x)    -3083            -5962              -3043             -5945              -5956
LR          —               34                 930                —                 22

Post-crisis

Parameter  G(1,1)            RG(1,1)            RT-G(1,1)          RRT-G(1,1)         RRT-G-t(1,1)
α           0.0188 (0.0037)  -0.0215 (0.0000)   -0.0166 (0.0026)   -0.0216 (0.0040)   -0.0067 (0.0001)
β           0.8470 (0.0170)   0.6850 (0.0014)    0.8386 (0.0190)    0.6135 (0.0212)    0.6671 (0.0011)
δ           0.1525 (0.0170)   0.3451 (0.0002)    0.1811 (0.0219)    0.4304 (0.0340)    0.3380 (0.0011)
φ           —                 —                  0.0249 (0.0037)    0.0286 (0.0049)   -0.0004 (0.0001)
ξ           —                 0.0710 (0.0530)    —                  0.1534 (0.0529)    0.0720 (0.0523)
ψ           —                 0.7980 (0.0171)    —                  0.7658 (0.0427)    0.8014 (0.0180)
τ1          —                 0.0560 (0.0384)    —                 -0.1561 (0.0481)    0.0519 (0.0419)
τ2          —                 0.0980 (0.0108)    —                  0.1661 (0.0338)    0.1014 (0.0179)
σu          —                 2.3111             —                  2.2530             2.2812
ν           —                 —                  —                  —                  1.1085 (0.0612)
l(r, x)    -2816            -8203              -2801             -7753              -7837
LR          —               900                598                —                 168



6.2 Results on the volatility forecast

In this section we look at the volatility forecasts of our model. First we describe the test statistics through their loss functions; after that the results are discussed.

The mean squared error (MSE) loss function is a measure of the quality of an estimator. It calculates the average squared difference between the estimated values and the real values. The formulation is given below, where h stands for the real volatility and σ̂ is calculated from the volatility forecast described in section three of chapter four. The most common way to approximate the real volatility is to use the realized variance, which is also calculated in our intraday high-frequency data on a 5-minute basis.

\[
MSE = \frac{1}{n} \sum_{i=1}^{n} \left( h^2 - \hat{\sigma}_i^2 \right)^2 \tag{6.5}
\]

Another loss function used in the results of this chapter is the QLIKE statistic, which is less sensitive to outliers than the MSE. All tests mentioned in this chapter are robust when used to compare rival volatility forecasting models, as mentioned by Patton (2011). The equation is given by

\[
QLIKE = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{h^2}{\hat{\sigma}_i^2} + \log \hat{\sigma}_i^2 \right) \tag{6.6}
\]

The last loss function shown in the tables below (6.5-6.8) is the median absolute error (MAE), a consistent estimator for the approximation of the standard deviation. In the MAE the deviations of a small number of outliers are largely irrelevant. The formulation of this statistic is as follows.

\[
MAE = \frac{1}{n} \sum_{i=1}^{n} \left| h^2 - \hat{\sigma}_i^2 \right| \tag{6.7}
\]
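The three loss functions (6.5)-(6.7) are one-liners in code. A sketch with hypothetical function names, taking a series of volatility proxies h² and forecasts σ̂²:

```python
import math

def mse(h2, s2):
    # equation 6.5: average squared deviation from the volatility proxy
    return sum((a - b) ** 2 for a, b in zip(h2, s2)) / len(h2)

def qlike(h2, s2):
    # equation 6.6: scale-free loss, less sensitive to outliers than the MSE
    return sum(a / b + math.log(b) for a, b in zip(h2, s2)) / len(h2)

def mae(h2, s2):
    # equation 6.7: absolute instead of squared deviations
    return sum(abs(a - b) for a, b in zip(h2, s2)) / len(h2)
```

All three are averages over the forecast sample, so comparing two models simply means comparing two scalars per loss, which is exactly what the MCS procedure then tests for significance.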

Table 6.5: k-step ahead volatility forecasts (short horizon)

1-step ahead volatility forecasts                                   MCS p-values
Model           MSE               QLIKE             MAE             MSE       QLIKE     MAE
RT-GARCH        3.2996 (60.0089)  0.6431 (1.1161)   0.6638 (1.6265) 1.0000*   1.0000*   1.0000*
Real-RT-GARCH   3.0369 (69.5660)  0.6274 (1.1504)   0.6262 (1.6910) 0.0630*   0.0040    0.0140

Table 6.6: k-step ahead volatility forecasts (short horizon)

5-step ahead volatility forecasts                                   MCS p-values
Model           MSE               QLIKE             MAE             MSE       QLIKE     MAE
RT-GARCH        3.3282 (63.4980)  0.6673 (1.0478)   0.6688 (1.5918) 1.0000*   0.0010    1.0000*
Real-RT-GARCH   2.9694 (71.4844)  0.6907 (1.1261)   0.6604 (1.6975) 0.0270    1.0000*   0.5530*

Table 6.7: k-step ahead volatility forecasts (longer horizon)

10-step ahead volatility forecasts                                  MCS p-values
Model           MSE               QLIKE             MAE             MSE       QLIKE     MAE
RT-GARCH        3.4167 (68.1700)  0.6967 (1.0060)   0.6808 (1.6147) 1.0000*   0.0000    0.0150
Real-RT-GARCH   3.1177 (73.7793)  0.7549 (1.1070)   0.7149 (1.7187) 0.0470    1.0000*   1.0000*



Table 6.8: k-step ahead volatility forecasts (longer horizon)

15-step ahead volatility forecasts                                  MCS p-values
Model           MSE               QLIKE             MAE             MSE       QLIKE     MAE
RT-GARCH        3.5460 (72.5378)  0.7235 (0.9897)   0.6963 (1.6759) 1.0000*   0.0000    0.0000
Real-RT-GARCH   3.4049 (75.9417)  0.8057 (1.0978)   0.7725 (1.7498) 0.3150*   1.0000*   1.0000*

The above tables (6.5-6.8) show the described test statistics with their MCS p-values; p-values with a * are again inside the 5% confidence set. All the standard deviations of our model are smaller than those of the RT-GARCH model, but because of their size we have chosen to report the MCS p-values. The MCS provides a kind of confidence interval that selects the best model under some error measure, where the comparison is made at every time point t of the forecasts. Looking at the mean squared error (MCS p-values, where the big standard errors are not important), our model performs best at all time horizons, whereas the RT-GARCH model does not. But the QLIKE and MAE statistics tell us that our model is not the best predictor at the longer term: the MCS contains the simple RT-GARCH model at almost every time horizon. Hansen et al. (2010) stated in their web appendix that the linear specification is far less persistent than reality, which is not the case for the log-linear model. It is therefore possible that these long-term results can be resolved by implementing the log-linear specification.

With the forecasts we can also calculate another risk management loss function, the Value at Risk (VaR). In the evaluation of the forecasts it is better to look at the violation ratio (VR): the ratio between the number of returns that exceed the VaR and 5% of the total number of returns. The model gives a good forecast if this ratio is between 0.8 and 1.2. Our model (1-step ahead forecast) gives a ratio of 0.95 and is thus a proper forecast. The simple RT-GARCH model returns a value of 0.88 and is therefore also a correct predictor, though not as good as ours.

However, the VR statistic described above is more a measure of the assumed distribution. We therefore look at a more advanced evaluation of the VaR, which comes from Engle and Manganelli (2004). This method investigates a regression on the hit series, where the hit series is one if the value exceeds the VaR and zero otherwise. The unconditional coverage hypothesis, that the expected value of the hit series equals the confidence level, can be tested by checking whether the intercept of the regression is zero. Furthermore it is, especially in our model, desirable that the exceptions are not correlated: if they are, our forecast does not adjust fast enough to the new volatility level when the VaR is exceeded. This is called the independence hypothesis and can be tested by checking whether the estimates for the lagged values are significant. The following table (6.9) shows the results of the regression on the hit series.

Table 6.9: Regression on the hit series

Estimate Standard error p-value

Intercept 0.0001 0.0014 0.9417

I_{t−1}   -0.0099   0.0150   0.5105

I_{t−2}   0.0346    0.0149   0.0211*

I_{t−1} and I_{t−2} stand for the first two lagged variables. Only these are included, because the other past values are not significant and therefore do not affect our conclusion. First we see that the intercept is not significant, so the VaR model is correctly specified. Secondly the first lagged value does not influence the regression, but I_{t−2} shows that there is some correlation in the exceptions. Since this is the only (small) significant term, we are of the opinion that our model reacts fast, but again can be improved.
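The hit series and the violation ratio described above can be computed directly. The sketch below (function names are our own) uses the autocorrelation of the hit series as a rough stand-in for the independence check; the full Engle and Manganelli (2004) regression with two lags, as in table 6.9, would require an OLS step on top of this.

```python
def hit_series(returns, var_forecasts):
    # 1 when the return falls below the (negative) VaR forecast, else 0
    return [1 if r < v else 0 for r, v in zip(returns, var_forecasts)]

def violation_ratio(hits, coverage=0.05):
    # observed exceedances over expected; roughly 0.8-1.2 is acceptable
    return sum(hits) / (coverage * len(hits))

def hit_autocorr(hits, lag):
    # sample autocorrelation of the hit series at a given lag;
    # values far from zero hint at clustered VaR violations
    n = len(hits)
    m = sum(hits) / n
    num = sum((hits[t] - m) * (hits[t - lag] - m) for t in range(lag, n))
    den = sum((h - m) ** 2 for h in hits)
    return num / den if den else 0.0
```

Under the independence hypothesis the hit autocorrelations should be indistinguishable from zero at every lag, which is the informal counterpart of the insignificant lagged coefficients in the regression above.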


Chapter 7

Conclusion and recommendations

The financial market plays an important role in our current society. Because of the big uncertainty and risk, there is a desire for new models to predict the development of asset prices, especially since the credit crisis, when there were massive fluctuations in almost every stock exchange. The financial models so far were good to work with, but nowadays more high-frequency data is available. We therefore attempt to produce better forecasts, more specifically forecasts of the volatility.

Last year Smetanina (2017b) attempted this with the addition of a real-time parameter to the GARCH framework, and showed that her model adjusts faster to changes in volatility. However, she pointed out that her model could be extended with other distributions or realized measures as in Hansen et al. (2012).

We therefore chose to take a closer look at a new model, which we call the Real-RT-GARCH model. This is a combination of the models of the researchers described above and is estimated with a Gaussian or a Student-t distribution. The thesis contains many derivations of the moments, derivatives, and limits. Furthermore the model is tested on several indices and is expected to predict the volatility even better than the current RT-GARCH model, since the realized measures (here the kernel) are a better reproduction of the real volatility than the squared returns. Moreover it is interesting to see what the leverage effect adds to the current model. Finally some remarks are made about the programming of the model, where multiple pitfalls are pointed out and suggestions are made to solve these problems.

7.1 Conclusion of the results

In this thesis we focused mainly on the linear specification of the realized RT-GARCH model, so our conclusion is based on that specification. First, we saw in chapter six that the model behaved as expected on all indices, in both the pre- and post-crisis samples, and that all parameters are significant. The δ parameter had a large influence on the model, which indicates that the realized extension is a genuine addition to the standard model. Furthermore, the news-impact function worked as expected and was significant. We then looked at the (partial) likelihood ratio of our model against several others. We can conclude that our model is better in terms of likelihood and that the Student-t distribution is not necessarily a better assumption (probably because of the size of our data). After that we compared the forecasts of our model with those of the RT-GARCH model, using several loss functions and the model confidence set. The MSE statistic suggested that our realized extension performs very well, but the QLIKE and MAE indicated that it only works well on a short horizon. It is therefore interesting to see how the log-linear specification will perform, as Hansen et al. (2010) mentioned that the linear specification is less persistent than reality. Finally, we examined another risk-management loss function, the Value at Risk, where the comparison was first made via the violation ratio. This ratio gave a better result for our


model compared to the RT-GARCH model. After that we investigated the unconditional coverage hypothesis and the independence hypothesis by running a regression on the hit series. We saw that the VaR was correctly specified, but there was also some correlation in the exceptions, with the second lagged value being significant.
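The violation ratio and the unconditional-coverage test mentioned above can be sketched as follows. This is an illustrative implementation of Kupiec's likelihood-ratio test; the function name and interface are our own assumptions:

```python
import numpy as np

def kupiec_test(hits, p=0.01):
    """Violation ratio and Kupiec's unconditional-coverage likelihood ratio.

    hits : binary array of VaR exceptions (1 = loss exceeded the VaR)
    p    : nominal exception probability of the VaR level
    Under H0 (correct coverage), LR_uc is asymptotically chi-squared(1).
    """
    n = len(hits)
    x = int(np.sum(hits))                       # number of exceptions
    pi_hat = x / n                              # empirical exception frequency
    vr = pi_hat / p                             # violation ratio, ideally ~1

    def ll(prob):                               # Bernoulli log-likelihood
        # convention: 0 * log(0) = 0
        a = x * np.log(prob) if x > 0 else 0.0
        b = (n - x) * np.log(1 - prob) if x < n else 0.0
        return a + b

    lr_uc = -2.0 * (ll(p) - ll(pi_hat))
    return vr, lr_uc
```

A violation ratio near one and a small LR statistic are both evidence that the VaR model has the correct unconditional coverage.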

The conclusion of our thesis, and the answer to the research question, is therefore as follows. Our realized extension of the RT-GARCH model performs well and is a significant improvement over the current model. Furthermore, given the size of the data used, the Student-t distribution does not contribute to better results. However, some statistics showed that our model does not outperform the standard RT-GARCH model, in terms of volatility forecasts, on a longer horizon, whereas it does on shorter horizons. We are therefore curious to see how the (in terms of derivations) more difficult log-linear specification predicts the returns. It is also interesting to determine what impact that specification has on the Value at Risk loss function.
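For reference, the forecast loss functions used in the comparison above are standard and can be sketched as follows (function and key names are our own; the QLIKE form log h + σ²/h is the common one from the forecast-evaluation literature):

```python
import numpy as np

def forecast_losses(sigma2, h):
    """MSE, MAE and QLIKE losses for variance forecasts.

    sigma2 : realized-variance proxy (e.g. realized kernel)
    h      : variance forecast
    """
    sigma2, h = np.asarray(sigma2, float), np.asarray(h, float)
    mse = np.mean((sigma2 - h) ** 2)
    mae = np.mean(np.abs(sigma2 - h))
    qlike = np.mean(np.log(h) + sigma2 / h)   # minimized when h equals sigma2
    return {"MSE": mse, "MAE": mae, "QLIKE": qlike}
```

QLIKE penalizes under-prediction of variance more heavily than MSE, which is one reason the rankings across the three losses can differ, as they did in our results.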

7.2 Recommendations for further research

Chapter three of this thesis contains the model description, including both the linear and the log-linear specification. In this research we used the linear model, but the log-linear model may predict better, as was the case in the paper of Hansen et al. (2012). There the model was relatively easy to work with, because λ_t only contains terms of time t − 1. The new realized RT-GARCH model contains an extra term at time t, which makes the derivations difficult. It remains a very interesting specification, and is therefore left for further research, as it is less sensitive to outliers. One should, however, be aware of the limits of the logarithmic function while programming the model.
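A log-linear variant in the spirit of Hansen et al. (2012), with the real-time term added analogously to the linear specification, could look roughly as follows; this exact parametrisation is our own sketch, not a result from the thesis:

```latex
\begin{align*}
\log \lambda_t &= \alpha + \beta \log \lambda_{t-1} + \phi z_t^2 + \delta \log x_{t-1}, \\
r_t &= \sqrt{\lambda_t}\, z_t, \\
\log x_t &= \xi + \psi \log \lambda_t + \tau(z_t) + u_t.
\end{align*}
```

An attractive feature of this form is that λ_t is automatically positive, so no sign restrictions on the parameters are needed.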

The model with a Student-t distribution is also presented in chapter three. This looks like a better fit for the data, because it has heavier tails. However, as stated in the introduction, as the degrees of freedom increase, the distribution looks more and more like a normal distribution, and we see in the results that it is not an improvement for the linear specification. Bollerslev (1987) was one of the first researchers to investigate the GARCH framework with different distributions; for him the question also remained whether other conditional error distributions perform even better than these two. This is therefore also an interesting extension for further research: the impact will probably be small, but it is easy to implement.

Moreover, our analysed linear Real RT-GARCH model is the variant with p = 1 and q = 1, so it only contains the first lagged values. In the research of Hansen et al. (2012) the (2,2) model outperformed the rest, so it is interesting to investigate whether the same holds for our model. The assumed news-impact curve of chapter three can also be changed; for example, Yu (2005) used a Brownian motion in his research.

The data used in this thesis consists of several indices. Smetanina (2017b) also tested her model on individual stock returns, which gave small differences. Taking different split points for the pre- and post-crisis data might also be interesting. These choices can be examined in a more extensive study.

Finally, the question remains, as Smetanina (2017b) already mentioned, where the model stands between the GARCH framework and the stochastic volatility models. Deriving a continuous-time limit of the model would provide more insight here.


Appendices


Appendix A

Theorem

This appendix states the basic formulation of the model used in this thesis, where z_t ∼ i.i.d.(0, 1) and u_t ∼ i.i.d.(0, σ_u²). For the calculations on the model we need several derivations of the parameters, which are collected here. To keep the main text readable, we left them out there and refer to the equations below.

\[
r_t = \sqrt{\lambda_t}\, z_t \tag{A.1}
\]
\[
\lambda_t = \alpha + \beta \lambda_{t-1} + \phi z_t^2 + \delta x_{t-1} = b_{t-1} + \phi z_t^2 \tag{A.2}
\]
\[
x_t = \xi + \psi \lambda_t + \tau(z_t) + u_t \tag{A.3}
\]

Equations A.1 and A.2 give the following solution for z_t; note that z_t = d(r). For \phi = 0,

\[
\lambda_t = b_{t-1} \tag{A.4}
\]
\[
z_t = \frac{r_t}{\sqrt{b_{t-1}}} \tag{A.5}
\]

Otherwise (\phi \neq 0) we have

\[
\phi z_t^4 + b_{t-1} z_t^2 - r_t^2 = 0 \tag{A.6}
\]
\[
z_t = \pm \sqrt{\frac{\pm\sqrt{b_{t-1}^2 + 4 r_t^2 \phi} - b_{t-1}}{2\phi}} \tag{A.7}
\]

But we are only interested in the real-valued solutions, so

\[
z_t = d(r) = \sqrt{\frac{\sqrt{b_{t-1}^2 + 4 r_t^2 \phi} - b_{t-1}}{2\phi}} \tag{A.8}
\]

Later on, for the calculation of the log-likelihood, we need an expression for u_t, which follows from inserting A.8 into equation A.3:

\[
u_t = h(r, x) = x - \xi - \psi \frac{r^2}{d^2(r)} - \tau\big(d(r)\big) \tag{A.9}
\]
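As a quick sanity check, the real-valued root in A.8 can be verified numerically against the quartic A.6 and against the return equation A.1 (the parameter values below are arbitrary illustrative choices):

```python
import math

def d(r, b, phi):
    """Real-valued root z_t = d(r) of phi*z^4 + b*z^2 - r^2 = 0 (eq. A.8)."""
    return math.sqrt((math.sqrt(b * b + 4 * r * r * phi) - b) / (2 * phi))

# illustrative values for r_t, b_{t-1} and phi
r, b, phi = 0.8, 0.3, 0.1
z = d(r, b, phi)

# residual of the quartic (A.6): should be zero up to floating-point error
residual = phi * z**4 + b * z**2 - r**2

# consistency with A.1/A.2: sqrt(lambda_t) * z_t should recover r_t
r_check = math.sqrt(b + phi * z * z) * z
```

This check is useful when programming the model, since a sign mistake in A.7/A.8 produces a complex or negative root and silently breaks the likelihood evaluation.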
