
Modelling intraday volatility with leverage effects and jumps

Master’s Thesis

by

Sebastiaan Hersmis

Submitted in partial fulfillment of the requirements for the Degree of Master of Science in Econometrics

MSc in Econometrics

Track: Financial Econometrics
Date: December 15, 2017

Supervisor: prof. dr. H.P. Boswijk
Second reader: dr. S. Broda


Abstract

Correctly quantifying volatility is key in pricing any financial asset. This thesis models volatility using high-frequency, tick-by-tick trade data, incorporating stylized facts such as fat tails, volatility clustering and the leverage effect. After splitting the observed volatility series into a continuous and a discontinuous part, the JLHARG-RV model by Alitab, Bormetti, Corsi, & Majewski (2016) is applied to the complete data set, containing 14 years of trade data. This thesis extends the existing model by introducing a time-dependent jump-intensity. The newly introduced specification improves the estimation of the jump-intensity and the point estimator of volatility. However, the quality of the conditional density forecasts decreases, suggesting that the specification of the JLHARG-RV model can be improved.


Contents

1 Introduction
2 Theoretical framework
  2.1 Characteristics of volatility
    2.1.1 Clustering
    2.1.2 Fat tails
    2.1.3 Leverage effects
    2.1.4 Jumps
  2.2 Intraday volatility and microstructure noise
    2.2.1 Integrated and Realized variance
    2.2.2 Subsampling and pre-averaging
    2.2.3 Realized kernel estimators
  2.3 Detecting jumps in high-frequency data
3 Model
  3.1 Overview of existing volatility models
  3.2 Continuous Integrated Variance
    3.2.1 Model
    3.2.2 Estimation
  3.3 Discontinuous Integrated Variance
    3.3.1 Original: model 1
    3.3.2 Time-dependent: model 2
    3.3.3 GARCH-type: model 3
    3.3.4 Estimation
  3.4 Evaluation & methodology
    3.4.1 Evaluation of one-day forecasts
    3.4.2 Density estimation and evaluation
4 Data description
  4.1 Trade data description
  4.2 Data handling and cleaning
5 Results
  5.1 Auxiliary results
  5.2 Continuous model
    5.2.1 Model estimation and evaluation
    5.2.2 Density estimation and specification testing
  5.3 Jump model
    5.3.1 Model estimation and evaluation
    5.3.2 Density estimation and specification testing
6 Conclusion
  6.1 Conclusion
  6.2 Limitations and suggestions for further research


Chapter 1

Introduction

The value of the S&P 500 index can change at any moment, creating uncertainty in price forecasts. This uncertainty, measured by the volatility, is reflected in the price of the underlying asset. Therefore, accurately quantifying volatility is key in assessing risk and thus vital in pricing the asset and its derivative instruments. While the importance of volatility is acknowledged throughout the financial literature, correctly modelling it remains a challenge for existing econometric models.

Modelling volatility starts by identifying its characteristics. Historical volatility series exhibit several stylized facts. Firstly, volatility is highly autocorrelated, which causes it to be clustered and persistent with a long memory. Secondly, negative past returns impact volatility more heavily than positive past returns of the same magnitude, implying a leverage effect. Thirdly, the price changes of the underlying asset can be separated into a continuous and a discontinuous (jump) component. As a continuous time series behaves differently from a discontinuous one, the two volatility series should be estimated using different models. These three stylized facts need to be incorporated when constructing a complete volatility model.

In most standard financial textbooks, volatility is modelled using daily observations. However, mainly due to technological advancement in recent years, newer models use Realized Variance (RV). RV models use intraday observations and estimate one RV measure per day. Intraday observations can be aggregated, for example into 5-minute returns, or non-aggregated, using the raw tick-by-tick trade data. The general philosophy of this thesis is that more data can provide better insights into the underlying process; accordingly, the complete tick-by-tick data set is used where possible.

When using tick-by-tick data, the standard RV measures are contaminated by microstructure noise. This microstructure noise is caused by the practical structure of the market, such as the bid-ask spread and the discreteness of prices (Aït-Sahalia, Mykland, & Zhang, 2005). Several methods can be applied to correct for this. A common measure is to aggregate trade data, for example into 5-minute intervals, instead of using tick-by-tick data. As this method contradicts the philosophy of using all possible data, this thesis uses the Realized Kernel by Barndorff-Nielsen, Hansen, Lunde, & Shephard (2009). This technique smooths out the microstructure noise and yields a non-contaminated RV measure, while using all available trade data.

Corsi (2009) proposed a volatility model consisting of different (heterogeneous) components defined over different time periods, the so-called Heterogeneous Autoregressive RV model (HAR-RV). The HAR-RV model captures the long-memory characteristic of volatility in a simple and intuitive way. Corsi & Reno (2009) extended this model by including heterogeneous leverage components to account for the leverage effect, yielding the LHAR-RV model. Corsi, Fusari, & La Vecchia (2013) extended this model with a noncentral gamma distribution, yielding the LHARG-RV model, which is better suited and more flexible in modelling volatility. Recently, Alitab, Bormetti, Corsi, & Majewski (2016) contributed a jump extension to this model, to account for the difference between the continuous and discontinuous volatility series. Their model, the JLHARG-RV model, outperforms benchmark models and is taken as the benchmark model for this thesis.

The novelty of this thesis lies in the last contribution to the model, the jump component. In the model of Alitab et al. (2016), the number of jumps per day (the jump-intensity) is modelled as constant. However, financial literature suggests that there exists self-excitation in jumps, as shown by Boswijk, Laeven, & Yang (2014). In chapter 2, literature on self-excitation is treated and autocorrelation in the number of jumps is tested to provide an empirical basis for the extension to the model. This extension entails a time-dependent jump-intensity parameter.

Data from the S&P 500 index are used to calculate RV estimates per day, and the JLHARG-RV model is fitted on these RVs. Two new models are introduced, with a time-dependent jump component. The quality of the new model is assessed and compared with the benchmark model by Alitab et al. (2016). All together, this leads to the following research questions:

1. How well does the model by Alitab et al. (2016) describe the observed continuous volatility series?

2. Can the jump-intensity model be improved by using an extended model?

3. When the jump-intensity is modelled by a more elaborate model, how well does the conditional distribution describe the discontinuous volatility series?

These three sub-questions help answer the main research question:

“Can the JLHARG-RV model be improved by including a time-dependent jump-intensity?”

This thesis is structured as follows. The next chapter contains theory about volatility, intraday measures and microstructure noise. Based on this theory, chapter 3 first derives the continuous model and the discontinuous models. Then the step from theory to practice is made by deriving the log-likelihoods, such that the models can be estimated. Chapter 4 describes the data set and the data-cleaning procedure. Chapter 5 contains the results of the models, and chapter 6 concludes.


Chapter 2

Theoretical framework

A starting point of many financial valuation models is the payoff of some asset. In this thesis, company shares are taken as that asset. The price of the shares in the market determines the return of investing in the stock. Usually in finance, daily returns are considered. Here, the price $X_t$ refers to the price at the end of trading day $t$, such that $Y_t$ is the return on day $t$:

$$Y_t = \Delta_t X \approx \log\left(\frac{X_t}{X_{t-1}}\right) \qquad (1)$$

This return consists of some risk-free return, the price of risk and unknown stochastic innovations. The size of the variation in these returns, known generally as the volatility, measures the uncertainty about the share price in the future (Hull, 2006, p. 217). Throughout this thesis, and in line with the current literature, the price process $X_t$ is assumed to be a Brownian semimartingale with a jump component:

$$dX_t = \mu_t\,dt + \sigma_t\,dW_t + c_t\,dN_t \qquad (2)$$

where $\mu_t$ is the predictable risk-free return, the instantaneous volatility $\sigma_t$ is càdlàg, $W_t$ is a standard Brownian motion, $N_t$ is the number of jumps, usually modelled by a Poisson process with an intensity $\lambda_t$ that can be time-dependent, and $c_t$ is the size of the jump at time $t$.
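A minimal simulation sketch (not part of the thesis) can make the dynamics of equation (2) concrete. The code below performs an Euler discretization over one trading day of one-second steps; the constant values of μ, σ and λ and the Gaussian jump sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_jump_diffusion(n=23400, dt=1 / 23400, mu=0.03,
                            sigma=0.2, lam=3.0, jump_sd=0.01):
    """Euler scheme for dX_t = mu dt + sigma dW_t + c_t dN_t over
    one 6.5-hour trading day of n one-second steps."""
    dW = rng.normal(0.0, np.sqrt(dt), n)   # Brownian increments
    dN = rng.poisson(lam * dt, n)          # Poisson jump counts per step
    c = rng.normal(0.0, jump_sd, n)        # (assumed Gaussian) jump sizes
    dX = mu * dt + sigma * dW + c * dN
    return np.cumsum(dX)                   # log-price path

log_price = simulate_jump_diffusion()
```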

This chapter consists of three sections. The first section treats volatility and its characteristics. The second discusses the use of intraday, high-frequency observations and the implications for modelling volatility. The third treats the detection of jumps in high-frequency data.

2.1 Characteristics of volatility

Contrary to what is implied by most textbook models, it is widely recognized that volatility is not constant over time, and certain behaviour can be modelled (Andersen, Bollerslev, Diebold, & Ebens, 2001). When modelling volatility, the following four characteristics are the main dynamics present in volatility series.

2.1.1 Clustering

Periods of high and low volatility are clustered, indicating autocorrelation in volatility (Tsay, 2005, p. 113). This calls for an autoregressive volatility model, such as a (G)ARCH model. Figure 2.1 shows volatility over the period 1999-2013 being clustered, which can also be seen in the autocorrelation function in figure 2.2. To graphically display empirical volatility effects in this chapter, the VIX, a measure of volatility implied by options in the market, is used (Chicago Board Options Exchange, 2017).

Corsi (2009) notes that this persistence lasts for long time periods and each period contributes a separate part to overall volatility. In order to model this, Corsi (2009) proposes an additive, cascade model with three different volatility components, see chapter 3.

Figure 2.1: Daily volatility 1999-2013

Figure 2.2: Autocorrelation of volatility

2.1.2 Fat tails

Corsi (2009, p. 175) shows that return distributions exhibit fat tails, indicating a leptokurtic distribution, where the shape depends on the time scale used. Figure 2.3 displays the distribution of daily returns. Indeed, the distribution appears to be leptokurtic, with a kurtosis of 13.03634. The non-normality is confirmed by both the Shapiro-Wilk test (W = 0.9164, P < 0.001) and the Anderson-Darling test (AD = 56.876, P < 0.001).

2.1.3 Leverage effects

Historically, a relation between volatility and the share price has been part of market folklore (Christie, 1982, p. 408). This effect entails a negative correlation between past returns and future volatility (Bouchaud, Matacz, & Potters, 2001).

Figure 2.3: Distribution of (daily) SPY returns from 1999-2013

There are two common explanations for this stylized fact. The first is that a negative return decreases the value of equity, making the debt-to-equity (leverage) ratio of the firm larger. Therefore, the absolute size of subsequent returns is relatively larger, which implies a larger variance (Christie, 1982). The second explanation is that the asymmetric nature of volatility's response to return shocks simply reflects a time-varying risk premium. Wang & Mykland (2014) refer to this explanation as the volatility feedback effect. In short, the reasoning is that if volatility is priced in, an anticipated increase in volatility (increased risk) raises the required return on equity, leading to an immediate share price decline.

While the precise underlying reason is unclear, research into the cause of the leverage effect lies in the field of behavioural finance and is not the focus of this thesis. The effect itself, however, needs to be included to correctly model volatility.

2.1.4 Jumps

Volatility is not merely a continuous process: jumps are present in the underlying price process. Formally, a jump in the price can be described as an abnormal and more than marginal movement in the price of the stock (Merton, 1976). And indeed, Andersen, Bollerslev, & Diebold (2007) suggest that the continuous volatility and the discontinuous volatility have different dynamics and should thus be modelled separately.

The jump intensity (the number of jump occurrences) is most intuitively modelled by a Poisson process with constant parameter λ, as is done in Alitab et al. (2016); chapter 3 treats their model. However, Boswijk et al. (2014) find that the occurrence of a price jump induces a higher conditional probability of the occurrence of future price jumps, in other words, there exists self-excitation in jumps. The novelty of this work lies in the allowance of jump-clustering, or self-excitation in jumps, by incorporating a time-dependent Poisson-parameter. This parameter can be modelled in various ways, from intuitive, where the expectation equals the number of jumps in the period before, to more complicated, such as a GARCH specification.

2.2 Intraday volatility and microstructure noise

In recent years, an increased amount of financial data has become available, mainly due to technological advancement. As mentioned before, the philosophy of this thesis is that more data contains more information. And indeed, Corsi et al. (2013) find that, when using intraday high-frequency returns, the improvement in volatility modelling over GARCH-type models that use only daily returns is remarkable.

To illustrate the incremental information in intraday data, figure 2.4 shows two trading days where the total daily return (Open to Close) is 0%. The chosen trading days contain the highest and lowest intraday variance of all trading days with zero daily return in the data set. Clearly, both days exhibit a different volatility pattern, while contributing the same to volatility measured on a daily basis. This is one example where the availability and incremental information of intraday returns could improve pricing models; the largest possible amount of available data should therefore be used.

Figure 2.4: Example of two trading days with 0% return

Following this logic, using intraday data could improve volatility modelling. The intraday variance is estimated via the Integrated Variance, discussed in the next subsection. Subsequently, microstructure noise is discussed, including solutions.

2.2.1 Integrated and Realized variance

In line with Hansen & Lunde (2006, p. 129), Integrated Variance is here defined as

$$IV_{[a,b]} = \int_a^b \sigma^2(s)\,ds.$$

The integral is bounded by the parameters a and b, which in this thesis are taken as the beginning and end of a trading day, such that $IV_t$ measures the variance on day t.

The realized variance (RV) is a common measure to estimate IV; the standard form of RV is the sum of the quadratic variation of log-returns:

$$RV_t = \sum_{i=1}^{M} Y_{i,t}^2.$$

Again, if the price process is assumed to be as specified in equations (1) and (2), the theoretical result under this assumption states that

$$\operatorname{plim}_{M \to \infty} RV_t = \operatorname{plim}_{M \to \infty} \sum_{i=1}^{M} Y_{i,t}^2 = \int_0^T \sigma^2(s)\,ds,$$

as is proven by Zhang, Mykland, & Aït-Sahalia (2005, p. 1394). However, when the time between observations becomes smaller and prices are sampled at finer intervals, microstructure noise becomes more prominent. In order to show microstructure noise, both Zhang et al. (2005) and Hansen & Lunde (2006) differentiate between the observed price and the efficient (true) price process. The difference between the two is caused by microstructure noise.

The intuition behind the microstructure noise is that the market is structured in such a way that noise influences the price. Aït-Sahalia et al. (2005) state that this noise can arise due to several microstructure effects, of which the following two are the most prominent in the literature.

1. The discreteness of prices. By the end of January 2001, all stocks on the New York Stock Exchange had converted their price quotations from sixteenths of a dollar to decimals (Graham, Michaely, & Roberts, 2003). However, prices are still observed on a discrete scale, causing the observed price to deviate from the efficient price.

2. The presence of a bid-ask spread and the corresponding bounces. When selling or buying shares, this can be done in the market against the best bid or ask, respectively. The difference between these prices, the bid-ask spread, potentially causes the price to bounce. For example, if an agent decides to buy on the market and pays the 'ask' price, and subsequently an agent sells on the same market and receives the 'bid' price, the observed price moves. Therefore, the efficient price lies in between the bid and ask. Note that the bid-ask spread is not a market anomaly or inefficiency, but reflects an efficient market faced with trading friction and transaction costs (Amihud & Mendelson, 1986, p. 246).

Statistically, this noise can be explained by breaking down the RV estimator. It is assumed that the observed trade price $X_{t,i}$ on day $t$ at time $i$ equals $X_{t,i} = X_{t,i}^* + u_{t,i}$, where $u_{t,i}$ represents the noise around the efficient price $X^*$. Analogously, the observed return is $Y_{t,i} = Y_{t,i}^* + \epsilon_{t,i}$ with $\epsilon_{t,i} = u_{t,i} - u_{t,i-1}$. Incorporating this noise into the RV estimator yields

$$RV_t = \sum_{i=1}^{M} Y_{t,i}^2 = \sum_{i=1}^{M} \left(Y_{t,i}^* + \epsilon_{t,i}\right)^2 = \underbrace{\sum_{i=1}^{M} Y_{t,i}^{*2}}_{(1)} + \underbrace{2\sum_{i=1}^{M} Y_{t,i}^* \epsilon_{t,i} + \sum_{i=1}^{M} \epsilon_{t,i}^2}_{(2)},$$

where $Y$ is the observed return and $Y^*$ the efficient return, analogous to the definition of $X_{t,i}$. Term (1) is equal to IV, see subsection 2.2.1. Term (2) adds a bias, since

$$E[\epsilon_{t,i}^2] = E[u_{t,i}^2 - 2u_{t,i}u_{t,i-1} + u_{t,i-1}^2] = E[u_{t,i}^2] - 2E[u_{t,i}u_{t,i-1}] + E[u_{t,i-1}^2],$$

so that, for i.i.d. noise,

$$E[RV_t] = IV_t + \underbrace{M \cdot 2E[u_{t,i}^2]}_{\text{bias}}.$$

Assuming that the efficient price is not always accurately measured, i.e. there is noise, the bias term is not equal to zero; this is the microstructure noise contamination.
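The size of this bias can be illustrated numerically. The sketch below (illustrative parameter values, not the thesis's data) simulates an efficient log-price, adds i.i.d. noise u, and compares the RV of both series with the theoretical bias of roughly 2·M·E[u²]:

```python
import numpy as np

rng = np.random.default_rng(1)

M = 23400                              # one observation per second
noise_sd = 5e-4                        # assumed noise level
eff = np.cumsum(rng.normal(0, 0.2 / np.sqrt(M), M))  # efficient log-price X*
obs = eff + rng.normal(0, noise_sd, M)                # observed X = X* + u

def rv(x):
    """Realized variance: sum of squared log-returns."""
    return float(np.sum(np.diff(x) ** 2))

print(rv(eff))              # close to the integrated variance (~0.04)
print(rv(obs))              # inflated by the microstructure noise
print(2 * M * noise_sd**2)  # theoretical bias term, about 0.0117
```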

2.2.2 Subsampling and pre-averaging

Zhang et al. (2005) discuss two possible solutions to the microstructure noise problem. They show that microstructure noise makes the most finely sampled data, such as tick-by-tick data, unusable. Therefore, their first solution is taking intervals of a longer period. And indeed, Aït-Sahalia et al. (2005) find that taking 5-minute returns can be optimal. However, taking 5-minute intervals discards 299 out of every 300 seconds and therefore, especially in liquid stocks, discards a large part of the observations. Furthermore, Zhang et al. (2005) argue that 5-minute sampling might limit the impact of microstructure noise, but does not quantify and correct for it.

The second solution of Zhang et al. (2005) is more elegant, as it combines an average of RVs estimated over several subgrids with the RV that uses all return observations. Assuming the observations on day t lie on the full grid G, the full grid can be partitioned into K (nonoverlapping) subgrids, with returns $Y_{t_{n,i}}$ at discrete times $t_{n,i}$, where n is the total number of observations. The Two-Scales RV (TSRV) estimator of Zhang et al. (2005) is defined as follows:

$$RV_t^{(TS)} = RV_t^{(n,K)} - \frac{n - K + 1}{n \cdot K}\, RV_t^{(all)},$$

where

$$RV_t^{(n,K)} = \frac{1}{K} \sum_{t_{n,i} \in G,\, i \geq K} \left(X_{t_{n,i}} - X_{t_{n,i-K}}\right)^2, \qquad RV_t^{(all)} = RV_t^{(n,1)}.$$

Therefore, the TSRV is a combination of the RV that samples every data point and the RV that samples every Kth data point, and it depends on the chosen scale K.
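A minimal implementation of this estimator might look as follows; the subsample spacing K = 300 (one tick per 5 minutes at a 1-second frequency) is an illustrative default, not a value from the thesis.

```python
import numpy as np

def tsrv(prices, K=300):
    """Two-Scales RV of Zhang et al. (2005): the averaged K-spaced RV
    minus a bias correction based on the all-data RV."""
    x = np.asarray(prices, dtype=float)
    n = len(x) - 1                               # number of returns
    rv_all = np.sum(np.diff(x) ** 2)             # RV using every tick
    rv_sub = np.sum((x[K:] - x[:-K]) ** 2) / K   # averaged subgrid RV
    n_bar = (n - K + 1) / K                      # avg. obs. per subgrid
    return rv_sub - (n_bar / n) * rv_all
```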

2.2.3 Realized kernel estimators

Incorporating the techniques of subsampling and pre-averaging and correcting for autocorrelation in noisy high-frequency trade data, Barndorff-Nielsen et al. (2009) introduce realized kernel estimators to estimate intraday variance. Kernel estimation is commonly used in density estimation, where noisy observations are smoothed. Of special interest is the smoothing parameter, called the bandwidth. The kernel estimator used in this thesis has the form

$$K(X) = \sum_{h=-H}^{H} k\!\left(\frac{h}{H+1}\right) \gamma_h, \qquad \gamma_h = \sum_{j=|h|+1}^{n} Y_j \cdot Y_{j-|h|},$$

where k(x) is a kernel weight function. In line with Barndorff-Nielsen et al. (2009), the Parzen kernel is used, given by

$$k(x) = \begin{cases} 1 - 6x^2 + 6x^3 & 0 \le x \le 1/2 \\ 2(1-x)^3 & 1/2 < x \le 1 \\ 0 & x > 1 \end{cases}$$

The Parzen kernel is a smooth kernel, with k(0) = 1 and k(1) = 0, and produces a guaranteed non-negative outcome. The ideal bandwidth here is

$$H^* = c^* \xi^{4/5} n^{3/5}, \qquad c^* = \left(\frac{k''(0)^2}{k^{0,0}}\right)^{1/5}, \qquad \xi^2 = \frac{\omega^2}{\sqrt{T \int_0^T \sigma_u^4\,du}},$$

where, for the Parzen kernel, $c^* = \left(\frac{12^2}{0.269}\right)^{1/5} = 3.5134$. Note that $\omega^2$ and $\int \sigma_u^4\,du$ are unknown and have to be estimated:

$$\hat\omega^2_{(i)} = \frac{RV^{(i)}_{t,\text{dense}}}{2 n^{(i)}}, \qquad \hat\omega^2 = \frac{1}{q} \sum_{i=1}^{q} \hat\omega^2_{(i)}.$$

This generates an RV estimator that is corrected for microstructure noise.
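As a sketch, the kernel estimator itself is only a weighted sum of realized autocovariances; the bandwidth selection via H* is omitted here and H is passed in directly.

```python
import numpy as np

def parzen(x):
    """Parzen kernel weight k(x) for x >= 0."""
    if x <= 0.5:
        return 1 - 6 * x**2 + 6 * x**3
    if x <= 1:
        return 2 * (1 - x) ** 3
    return 0.0

def realized_kernel(returns, H):
    """Realized kernel sketch: sum of realized autocovariances gamma_h,
    weighted by the Parzen kernel at h / (H + 1)."""
    y = np.asarray(returns, dtype=float)

    def gamma(h):
        h = abs(h)
        return float(np.dot(y[h:], y[:len(y) - h]))

    return sum(parzen(abs(h) / (H + 1)) * gamma(h)
               for h in range(-H, H + 1))
```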

2.3 Detecting jumps in high-frequency data

Before the model can be estimated, the volatility series is split into a continuous part, with 'normal' price changes, and a discontinuous part, consisting of jumps. In order to detect jumps in intraday trade data, several methods have been proposed. Quite common is the Bipower Variation (BPV) by Barndorff-Nielsen & Shephard (2004). This BPV measure is extended with a threshold component (TBPV) by Corsi, Pirino, & Reno (2010). This section first treats the BPV, after which the TBPV is discussed.

Assuming that the price process follows equation (2), the daily RV can be decomposed into a continuous component and a discontinuous (jump) component:

$$RV_t = RV_t^C + RV_t^J,$$

where $RV_t^C = \int_t^{t+1} \sigma_s^2\,ds$ and $RV_t^J = \sum_{j=1}^{N_t} c_j^2$. In order to disentangle the continuous component from the jump component, Barndorff-Nielsen & Shephard (2004) introduced the BPV, which is defined as

$$BPV_{\delta,t}(Y) = \mu_1^{-2} \sum_{j=2}^{[1/\delta]} |Y_{j-1}| \cdot |Y_j|,$$

where $\mu_1 = \sqrt{2/\pi}$. The BPV converges in probability to the continuous part of the quadratic variation of equation (2), or formally

$$BPV_{\delta,t}(Y) \xrightarrow{p} \int_t^{t+1} \sigma_s^2\,ds, \quad \text{as } \delta \to 0.$$

Asymptotically, the probability of two jumps directly following each other is zero. Therefore, if $Y_{j-1}$ contains a jump, the probability that $Y_j$ contains a jump goes to zero as the width of the interval $\delta$ approaches zero. Consequently, as BPV multiplies $Y_{j-1}$ and $Y_j$, both these returns have to vanish asymptotically and BPV converges to the integrated continuous variance (Barndorff-Nielsen & Shephard, 2004).

However, as Corsi et al. (2010) state, for finite δ these returns will not vanish, causing a positive bias. When two jumps occur consecutively, this bias will be extremely large. In order to prevent this bias, Corsi et al. (2010) use Threshold Bipower Variation (TBPV) to measure the continuous part of integrated variance, defined as

$$TBPV_{\delta,t}(Y) = \mu_1^{-2} \sum_{j=2}^{[T/\delta]} |Y_{j-1}| \cdot |Y_j| \cdot I_{\{|Y_{j-1}|^2 \le \theta_{j-1}\}} \cdot I_{\{|Y_j|^2 \le \theta_j\}},$$


where $I_{\{\cdot\}}$ is the indicator function and θ is a threshold function estimated as

$$\theta_t = c_\theta^2 \cdot \hat V_t,$$

where $c_\theta^2$ is a scale-free constant and $\hat V_t$ is an auxiliary estimator of $\sigma_t^2$, the 'local' volatility. In line with Corsi et al. (2010), the auxiliary estimator is defined as

$$\hat V_t^Z = \frac{\displaystyle\sum_{i=-L,\, i \neq -1,0,1}^{L} K\!\left(\tfrac{i}{L}\right) (Y_{t+i})^2\, I_{\{(Y_{t+i})^2 \le c_V^2 \hat V_{t+i}^{Z-1}\}}}{\displaystyle\sum_{i=-L,\, i \neq -1,0,1}^{L} K\!\left(\tfrac{i}{L}\right) I_{\{(Y_{t+i})^2 \le c_V^2 \hat V_{t+i}^{Z-1}\}}}.$$

The starting value $\hat V^0 = +\infty$ corresponds to using all observations in the first step. From the second step onwards, large returns are eliminated at each iteration by the indicator condition, and each estimate is multiplied by $c_V^2$ to get the threshold for the next iteration. With high-frequency data, usually 2 or 3 iterations are needed (Corsi et al., 2010). L corresponds to the bandwidth of observations included. Following Corsi et al. (2010), L = 25 and K is a Gaussian kernel function: $K(y) = (1/\sqrt{2\pi})\, e^{-y^2/2}$.

In line with the general philosophy, TBPV is best applied to the complete data set containing all trades. In periods with relatively few trades, such as 1999 until 2001, the TBPV method performs well, as shown by Corsi (2009). However, when trading volume increases, relatively more subsequent trades have the same price. As TBPV uses returns, it incorporates many zero-return observations. This in turn causes $\hat V$ to be low, so that subsequent non-zero returns are more likely to lie above the threshold and be marked as a jump. This potential problem is resolved by removing all zero-return observations in the TBPV method.
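The following sketch implements the plain BPV and a simplified TBPV. The iterative estimation of the local volatility is omitted; the threshold θ is assumed to be a precomputed vector of the same length as the return series.

```python
import numpy as np

MU1 = np.sqrt(2 / np.pi)   # mu_1 = E|Z| for a standard normal Z

def bpv(returns):
    """Bipower variation of Barndorff-Nielsen & Shephard (2004)."""
    r = np.abs(np.asarray(returns, dtype=float))
    return MU1**-2 * np.sum(r[1:] * r[:-1])

def tbpv(returns, theta):
    """Simplified threshold BPV: keep a product of adjacent absolute
    returns only if both squared returns are below their thresholds."""
    r = np.asarray(returns, dtype=float)
    theta = np.asarray(theta, dtype=float)
    keep = (r[1:] ** 2 <= theta[1:]) & (r[:-1] ** 2 <= theta[:-1])
    return MU1**-2 * np.sum(np.abs(r[1:] * r[:-1]) * keep)
```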


Chapter 3

Model

This chapter first describes existing volatility models, incorporating the characteristics described in chapter 2. Then the model for the continuous volatility series is derived and, lastly, three models for the discontinuous part are discussed.

3.1 Overview of existing volatility models

The fact that volatility is clustered, see subsection 2.1.1, suggests the use of an Autoregressive Conditional Heteroskedasticity (ARCH(1)) model, proposed by Engle (1982). However, since volatility is typically a long-memory process, an ARCH(1) model seems too restrictive and a more generalised ARCH model is more appropriate, such as a GARCH(M, N) model (Bollerslev, 1986).

Corsi (2009) states that the standard GARCH models are not always able to reproduce the characteristics of volatility series, and a cascading model with different time scales might be a better fit. His model includes a daily, weekly and monthly estimator, which approximates long memory, termed the Heterogeneous Autoregressive model of Realized Volatility (HAR-RV). This heterogeneous model has a fixed structure, namely a daily (1 trading day), weekly (5 trading days) and monthly (21 trading days) component. Craioveanu & Hillebrand (2010) find that this structure allows for sufficient flexibility.

As described in subsection 2.1.2, stock returns exhibit excess kurtosis. Therefore, instead of a Gaussian-distributed model, models which accommodate fatter tails and allow for more flexibility give a better fit. In line with Corsi et al. (2013), this thesis assumes the volatility to be noncentral gamma distributed. This distribution has the CIR model (Cox, Ingersoll Jr, & Ross, 1985) as its continuous-time limit, as proven by Gouriéroux & Jasiak (2006, p. 137).

Additionally, a leverage element can be added, including the effect of negative past returns, to capture the leverage effect described in subsection 2.1.3. An example of such a model is the EGARCH(m, n) model (Nelson & Cao, 1992). Combining the heterogeneous structure with the gamma distribution (G) and leverage component (L), Corsi et al. (2013) label their model LHARG.

As discussed in subsection 2.1.4, jumps can influence the volatility, and volatility is no longer a merely continuous process. The number of jumps can most intuitively be modelled by a Poisson process, as is done in Alitab et al. (2016), labelled the JLHARG-RV model. The following sections build volatility models based on this model.


3.2 Continuous Integrated Variance

3.2.1 Model

For the continuous RV component, the approach of Corsi et al. (2013) is followed, where $RV_t^C$ follows a noncentral gamma transition distribution (also see Gouriéroux & Jasiak (2006)):

$$RV_{t+1}^C \,|\, \mathcal{F}_t \sim \Gamma\!\left(\delta, \theta, \Theta(\mathbf{RV}_t^C, \mathbf{L}_t)\right).$$

In this notation, δ and θ are the shape and scale parameters. The location Θ is determined by heterogeneous components of RV and leverage, see Majewski, Bormetti, & Corsi (2015):

$$\Theta(\mathbf{RV}_t^C, \mathbf{L}_t) = \beta_1 RV_t^{C(d)} + \beta_2 RV_t^{C(w)} + \beta_3 RV_t^{C(m)} + \alpha_1 l_t^{(d)} + \alpha_2 l_t^{(w)} + \alpha_3 l_t^{(m)},$$

where $\boldsymbol\beta$ and $\boldsymbol\alpha$ are the parameters to be estimated. The heterogeneous terms

$$RV_t^{C(d)} = RV_t^C, \qquad l_t^{(d)} = l_t,$$
$$RV_t^{C(w)} = \frac{1}{4} \sum_{i=1}^{4} RV_{t-i}^C, \qquad l_t^{(w)} = \frac{1}{4} \sum_{i=1}^{4} l_{t-i},$$
$$RV_t^{C(m)} = \frac{1}{17} \sum_{i=5}^{21} RV_{t-i}^C, \qquad l_t^{(m)} = \frac{1}{17} \sum_{i=5}^{21} l_{t-i},$$

consist of daily, weekly and monthly effects. In line with Alitab et al. (2016), the leverage effect in this thesis is defined as

$$l_t = \left(\epsilon_t - \gamma \sqrt{RV_t^C + RV_t^J}\right)^2.$$

This term captures the relationship between returns and volatility; γ is expected to be positive, as we expect a negative relation between past returns and future volatility, see subsection 2.1.3.
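As an illustration of the component construction, the following sketch (helper names are ours, not from the cited papers) computes the daily, weekly and monthly averages for a day index t ≥ 21, plus the leverage term:

```python
import numpy as np

def har_components(x, t):
    """Daily, weekly and monthly heterogeneous averages of a series x
    at day t, matching the component definitions above (t >= 21)."""
    daily = x[t]
    weekly = np.mean(x[t - 4:t])         # lags 1..4, averaged over 4
    monthly = np.mean(x[t - 21:t - 4])   # lags 5..21, averaged over 17
    return daily, weekly, monthly

def leverage(eps, gamma, rv_c, rv_j):
    """Leverage term l_t = (eps_t - gamma * sqrt(RV_t^C + RV_t^J))^2."""
    return (eps - gamma * np.sqrt(rv_c + rv_j)) ** 2
```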

3.2.2 Estimation

The above-specified model is estimated using maximum likelihood. Gouriéroux & Jasiak (2006) show that the noncentral gamma distribution is a mixture of a Poisson distribution and the central gamma distribution. Formally, the noncentral gamma distribution Γ(δ, θ, Θ) exists if, and only if, there exists Z ∼ P(Θ) such that the conditional distribution of Y is given by a central gamma distribution, Γ(δ + Z, θ) (Gouriéroux & Jasiak, 2006). The Poisson distribution and central gamma distribution have the following probability density functions (formally, the Poisson distribution is discrete and has a probability mass function; in application the difference is only one of naming, so this thesis uses the term probability density function (pdf) throughout):

$$f(k; \beta) = \frac{\beta^k e^{-\beta}}{k!} \qquad (3)$$

$$f(x; \delta, \theta) = e^{-\frac{x}{\theta}} \frac{x^{\delta-1}}{\Gamma(\delta)\theta^{\delta}} \qquad (4)$$

Note the difference between the gamma distribution, with at least two parameters, and the gamma function, with one parameter. Combining the equations above yields the pdf of the noncentral gamma distribution:

$$f(x; \delta, \theta, \beta) = \sum_{k=0}^{\infty} \left[ e^{-\frac{x}{\theta}} \frac{x^{(\delta+k)-1}}{\theta^{(\delta+k)}\,\Gamma(\delta+k)} \cdot \frac{\beta^k e^{-\beta}}{k!} \right] = e^{-\frac{x}{\theta}} \cdot e^{-\beta} \cdot \sum_{k=0}^{\infty} \left[ \frac{x^{(\delta+k)-1}}{\theta^{(\delta+k)}\,\Gamma(\delta+k)} \cdot \frac{\beta^k}{k!} \right]$$


Conditional on the past, each observation at time t is independent. Taking the logarithm, the log-likelihood function has the form

$$\mathcal{L}_{t,T}(\delta, \theta, \beta) = \sum_{t=1}^{T} \log f(x_t \,|\, x_{t-1}; \delta, \theta, \beta) \qquad (5)$$

Applying this to the noncentral gamma distribution, the log-likelihood is derived:

$$\mathcal{L}_{t,T}(\delta, \theta, \beta) = \sum_{t=1}^{T} \log\!\left( e^{-\frac{x_t}{\theta}} \cdot e^{-\beta} \cdot \sum_{k=0}^{\infty} \left[ \frac{x_t^{(\delta+k)-1}}{\theta^{(\delta+k)}\,\Gamma(\delta+k)} \cdot \frac{\beta^k}{k!} \right] \right) = -\sum_{t=1}^{T} \left( \frac{x_t}{\theta} + \beta \right) + \sum_{t=1}^{T} \log\!\left( \sum_{k=0}^{\infty} \left[ \frac{x_t^{(\delta+k)-1}}{\theta^{(\delta+k)}\,\Gamma(\delta+k)} \cdot \frac{\beta^k}{k!} \right] \right)$$

Plugging in δ, θ and $\Theta(\mathbf{RV}_t^C, \mathbf{L}_t)$, the final log-likelihood for $RV_t^c$ is obtained, which corresponds to the log-likelihood of Alitab et al. (2016):

$$l_{t,T}^{c}(\delta, \theta, \beta_d, \beta_w, \beta_m, \alpha_d, \alpha_w, \alpha_m) = -\sum_{t=1}^{T} \left( \frac{RV_t^c}{\theta} + \Theta(\mathbf{RV}_{t-1}^C, \mathbf{L}_{t-1}) \right) + \sum_{t=1}^{T} \log\!\left( \sum_{k=0}^{\infty} \frac{(RV_t^c)^{\delta+k-1}}{\theta^{\delta+k}\,\Gamma(\delta+k)} \cdot \frac{\Theta(\mathbf{RV}_{t-1}^C, \mathbf{L}_{t-1})^k}{k!} \right)$$

The likelihood contains an infinite sum; in line with Corsi et al. (2013), this sum is truncated at 50 terms in the calculations.
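Numerically, the truncated sum is best evaluated in log space. The sketch below (our own helper with illustrative parameter values, not the thesis's estimation code) computes the log-pdf of the noncentral gamma for one observation via a log-sum-exp over 50 terms, and sums over a toy sample as in equation (5):

```python
import numpy as np
from scipy.special import gammaln, logsumexp

def ncgamma_logpdf(x, delta, theta, beta, kmax=50):
    """Log-pdf of the noncentral gamma as a Poisson mixture of central
    gammas, truncating the infinite sum at kmax terms."""
    k = np.arange(kmax + 1)
    log_terms = ((delta + k - 1) * np.log(x)
                 - (delta + k) * np.log(theta)
                 - gammaln(delta + k)
                 + k * np.log(beta) - gammaln(k + 1))
    return -x / theta - beta + logsumexp(log_terms)

# Toy log-likelihood over a small sample:
loglik = sum(ncgamma_logpdf(x, 1.5, 0.4, 0.8) for x in (0.3, 0.7, 1.1))
```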

3.3 Discontinuous Integrated Variance

The discontinuous RV component is based largely on Alitab et al. (2016). The number of jumps follows a Poisson distribution and the size of each jump comes from a central gamma distribution:

$$RV_{t+1}^J \,|\, \mathcal{F}_t \sim \sum_{i=0}^{n_{t+1}} Y_i \quad \text{with} \quad n_{t+1} \sim \mathcal{P}(\lambda_{t+1}) \quad \text{and} \quad Y_i \sim \Gamma(\delta, \theta).$$

The novelty of this thesis is a time-dependent Poisson parameter. Three different models are distinguished: the original model, an intuitive time-dependent model, and a model using a GARCH specification.

3.3.1 Original: model 1

The original model by Alitab et al. (2016) estimates the Poisson parameter as constant. In terms of the parameter $\lambda_t$ in equation (7), this is implemented as follows:

$$\lambda_t = \gamma.$$

3.3.2 Time-dependent: model 2

As the literature suggests that there exists self-excitation in jumps, a time-dependent component could improve the modelling of the discontinuous RV. The second model uses an intuitive specification, setting

$$\lambda_t(\alpha, \beta) = \alpha + \beta \cdot n_{t-1},$$

i.e. a constant plus a parameter times the number of jumps in the previous period.


3.3.3 GARCH-type: model 3

The specification of model 2 is intuitive but could be too simplistic to model the number of jumps correctly. The third model uses a GARCH specification to capture possible long-memory effects in jump-intensity modelling; $\lambda_t$ follows a GARCH(1,1) specification:

$$\lambda_t(\omega, \alpha, \beta) = \omega + \alpha \cdot n_{t-1} + \beta \cdot \lambda_{t-1} \qquad (6)$$
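A sketch of this recursion, given a series of daily jump counts and an assumed initial intensity:

```python
def intensity_path(n_jumps, omega, alpha, beta, lam0=1.0):
    """Jump-intensity recursion of equation (6):
    lambda_t = omega + alpha * n_{t-1} + beta * lambda_{t-1}."""
    lam = [lam0]
    for n_prev in n_jumps[:-1]:
        lam.append(omega + alpha * n_prev + beta * lam[-1])
    return lam

# Toy usage with assumed parameter values:
lambdas = intensity_path([2, 0, 5, 1, 0], omega=0.01, alpha=0.6, beta=0.3)
```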

3.3.4 Estimation

The pdfs of the Poisson and gamma distributions are given above, in equations (3) and (4). As stated, the number of jumps, here $n_t$, follows a Poisson distribution, and the size of each jump comes from a central gamma distribution with parameters δ and θ. Again, the pdf of the discontinuous RV series follows a mixture of the two distributions. Here, the density of the sum of multiple gamma variables needs to be derived first. Assuming that $Z = Y_1 + Y_2$, with $Y_1 \sim \Gamma(\delta, \theta)$ and $Y_2 \sim \Gamma(\delta, \theta)$,

$$f(z; \delta, \theta) = \int_{-\infty}^{\infty} f(z - x) f(x)\,dx.$$

It is important to note that (1) $f(x; \delta, \theta) = 0$ when $x < 0$, and (2) $f(z - x; \delta, \theta) = 0$ when $x > z$, so the integral can be limited:

$$f(z; \delta, \theta) = \int_0^z e^{-\frac{z-x}{\theta}} \frac{(z-x)^{\delta-1}}{\Gamma(\delta)\theta^{\delta}} \cdot e^{-\frac{x}{\theta}} \frac{x^{\delta-1}}{\Gamma(\delta)\theta^{\delta}}\,dx = \frac{e^{-\frac{z}{\theta}}}{\Gamma(\delta)\Gamma(\delta)\theta^{2\delta}} \int_0^z (z-x)^{\delta-1} \cdot x^{\delta-1}\,dx$$

Now, introduce a change of variables by substitution: let $u = x/z$. Firstly, the integral bounds are transformed: x ranges from 0 to z, so u ranges from 0 to 1. Secondly, $x = zu$. Lastly, $dx = z\,du$. This yields

$$f(z; \delta, \theta) = \frac{e^{-\frac{z}{\theta}}}{\Gamma(\delta)\Gamma(\delta)\theta^{2\delta}} \int_0^1 (z - zu)^{\delta-1} \cdot (zu)^{\delta-1}\, z\,du = \frac{e^{-\frac{z}{\theta}}\, z^{2\delta-1}}{\Gamma(\delta)\Gamma(\delta)\theta^{2\delta}} \int_0^1 (1-u)^{\delta-1} \cdot u^{\delta-1}\,du$$

$$= \underbrace{e^{-\frac{z}{\theta}} \frac{z^{2\delta-1}}{\theta^{2\delta}\,\Gamma(2\delta)}}_{\text{pdf of a } \Gamma(2\delta,\, \theta) \text{ distribution}} \cdot \underbrace{\int_0^1 \frac{\Gamma(2\delta)}{\Gamma(\delta)\Gamma(\delta)} (1-u)^{\delta-1} \cdot u^{\delta-1}\,du}_{\text{integral of a Beta}(\delta,\, \delta) \text{ pdf}}$$

The Beta distribution has support on the interval $u \in [0, 1]$, so the integral integrates to 1. In conclusion, the sum of two Γ(δ, θ)-distributed variables is itself Γ(2δ, θ)-distributed. This in turn implies that the sum of k Γ(δ, θ) variables is Γ(kδ, θ)-distributed. Therefore, the discontinuous RV series is distributed, at time t, as

$$f(x; \delta, \theta, \lambda_t) = \sum_{k=0}^{\infty} \left[ e^{-\frac{x}{\theta}} \frac{x^{k\delta-1}}{\theta^{k\delta}\,\Gamma(k\delta)} \cdot \frac{e^{-\lambda_t} \lambda_t^k}{k!} \right] = e^{-\frac{x}{\theta}} \cdot e^{-\lambda_t} \cdot \sum_{k=0}^{\infty} \left[ \frac{x^{k\delta-1}}{\theta^{k\delta}\,\Gamma(k\delta)} \cdot \frac{\lambda_t^k}{k!} \right] \qquad (7)$$
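For completeness, a log-space sketch of density (7) for x > 0; the k = 0 term is a point mass at zero and is therefore excluded, and the truncation at 50 terms mirrors the convention used for the continuous model:

```python
import numpy as np
from scipy.special import gammaln, logsumexp

def jump_rv_logpdf(x, delta, theta, lam, kmax=50):
    """Log-density of RV^J: a Poisson(lam) number of Gamma(delta, theta)
    jumps, where k jumps sum to a Gamma(k * delta, theta) variable."""
    k = np.arange(1, kmax + 1)
    log_terms = ((k * delta - 1) * np.log(x)
                 - k * delta * np.log(theta)
                 - gammaln(k * delta)
                 + k * np.log(lam) - gammaln(k + 1))
    return -x / theta - lam + logsumexp(log_terms)
```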


Using equation (5), plugging in the Poisson parameters defined before, and applying the log-likelihood to $RV_t^J$, three log-likelihoods are established:

$$l_{t,T}^{J1}(\delta, \theta, \gamma) = -\sum_{t=1}^{T} \left( \frac{RV_t^J}{\theta} + \gamma \right) + \sum_{t=1}^{T} \log\!\left( \sum_{k=1}^{\infty} \frac{(RV_t^J)^{k\delta-1}}{\theta^{k\delta}\,\Gamma(k\delta)} \cdot \frac{\gamma^k}{k!} \right)$$

$$l_{t,T}^{J2}(\delta, \theta, \alpha, \beta) = -\sum_{t=1}^{T} \left( \frac{RV_t^J}{\theta} + (\alpha + \beta \cdot n_{t-1}) \right) + \sum_{t=1}^{T} \log\!\left( \sum_{k=1}^{\infty} \frac{(RV_t^J)^{k\delta-1}}{\theta^{k\delta}\,\Gamma(k\delta)} \cdot \frac{(\alpha + \beta \cdot n_{t-1})^k}{k!} \right)$$

$$l_{t,T}^{J3}(\delta, \theta, \gamma; \lambda_t) = -\sum_{t=1}^{T} \left( \frac{RV_t^J}{\theta} + \gamma\lambda_t \right) + \sum_{t=1}^{T} \log\!\left( \sum_{k=1}^{\infty} \frac{(RV_t^J)^{k\delta-1}}{\theta^{k\delta}\,\Gamma(k\delta)} \cdot \frac{(\gamma\lambda_t)^k}{k!} \right),$$

where $\lambda_t$ is defined as in equation (6). Estimation of model 3 is done in two steps: first the optimal GARCH parameters ω, α and β are estimated, and subsequently the log-likelihood parameters δ, θ and γ.

3.4 Evaluation & methodology

Alitab et al. (2016) have compared the JLHARG-RV model with several benchmarks, two of which are the standard GARCH model by Heston & Nandi (2000) and the RV model with realized jump variation by Christoffersen, Feunou, & Jeon (2015). Firstly, Alitab et al. (2016) find that all models that incorporate RV measures outperform standard, daily GARCH type models. Secondly, they observe that all HAR-RV type models outperform other benchmark RV models. These results give enough basis to only compare the models in this thesis with each other, i.e. evaluate if a time-dependent jump component further improves the JLHARG-RV model.

The evaluation of the estimations and the specification of the model is discussed in the next two subsections. The first describes the evaluation of the one-day forecasts, and the second discusses density forecasting and specification testing. The results are given in chapter 5, and a conclusion based on the different methods of evaluation is drawn in chapter 6.

3.4.1 Evaluation of one-day forecasts

Before the models can be estimated, the RV measure is constructed using the Realized Kernel. Subsequently, a continuous RV series and a discontinuous RV series are constructed using TBPV, as described in chapter 2. Then the models are fitted on all data by maximum likelihood and the Root Mean Squared Errors (RMSEs) of the models are compared. The RMSE is constructed using the observed and predicted values, where the prediction is the first moment (expectation) of the conditional distribution.

For the continuous model, in line with Gouriéroux & Jasiak (2006), the expectation of the noncentral gamma distribution is

$$E[RV_t^C] = \theta \cdot \delta + \theta \cdot \beta.$$

The expectation of the discontinuous model is

$$E[RV_t^J] = \theta \cdot \delta \cdot \lambda.$$

The RMSE is then defined as

$$RMSE = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left(RV_t - E[RV_t]\right)^2},$$


where N is the total number of days and t indexes the trading day.

Secondly, the data is split into a training set (the first 75% of the observations) and an evaluation set (the last 25% of the observations), and again the RMSEs are compared. The chronological order in the training set is preserved, such that a GARCH specification can be fitted.

3.4.2 Density estimation and evaluation

Lastly, the method of density estimation is applied. At each point in time, an estimated one-day-ahead density is established, which contains more information than an expectation alone. In order to check the fit of the distribution, the specification is tested by plugging the observed values into the estimated conditional cumulative distribution function (CDF) and inspecting the resulting values. Formally, the theory of the Probability Integral Transformation states that if X is a continuous random variable with a continuous, increasing CDF $F_X(x)$, and if $Y = F_X(X)$, then $Y \sim U(0, 1)$ (Diebold, Gunther, & Tay, 1997).

For every model, at every forecast density, Y is computed. This generates a vector of Y values which should be uniformly distributed if the chosen specification is correct. In this thesis, uniformity is evaluated by inspecting the histogram of the Y values.
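A sketch of this check, using a stand-in gamma model in place of the estimated conditional CDF (purely illustrative data and parameters):

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(2)
observed = rng.gamma(shape=2.0, scale=0.5, size=1000)  # stand-in data
pit = gamma.cdf(observed, a=2.0, scale=0.5)            # Y = F_X(X)

# Under a correct specification, the PIT values are U(0, 1):
hist, _ = np.histogram(pit, bins=10, range=(0, 1))
print(hist)   # roughly flat (about 100 per bin) if well specified
```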


Chapter 4

Data description

In order to estimate the specified model, a data set containing trade data of the SPY fund is used. Section 4.1 describes the SPY trade data and section 4.2 explains the cleaning procedure.

4.1 Trade data description

The SPY tracks a market-cap-weighted index of 500 US large- and midcap stocks selected by the S&P Committee (State Street Global Advisors, 2017). The S&P 500 is the benchmark proxy for the US market. The use of the SPY is motivated by the high coverage and high liquidity of the fund, which also averages out idiosyncratic effects that individual stocks may exhibit. The source of the SPY data is the Trade and Quote (TAQ) database (Wharton Research Data Service, 2017).

To include as much data as possible, the maximum data range is chosen, from 01-01-1999 until 31-12-2013. Table 4.1 contains an example of trades from the first second of the first trading day in 2005 (03/01/2005). Per trade, the following characteristics are reported.

SYMBOL: the traded instrument. The data set only contains the SPY instrument.

DATE and TIME: indication of when the trade happened. The trade data is provided at a 1-second frequency, and multiple trades can have the same time stamp. However, it is useful to note that the data is ordered by time of occurrence, so that earlier trades appear earlier in the data set.

PRICE: the price at which the trade was executed.

SIZE: the number of shares traded in this trade.

G127: a combined indicator of several technical rules, outside the scope of this research. See the TAQ user guide for more information.

CORR: an indicator of whether a trade is later corrected, where 0 corresponds to no correction.

COND: the sale condition.

EX: the exchange where the trade happened. Most frequent are P (NYSE Arca SM), Z (Bats Exchange), and A (NYSE MKT Stock Exchange).


Table 4.1: Sample of trade data from 03/01/2005

SYMBOL  DATE        TIME      PRICE   SIZE  G127  CORR  COND  EX
SPY     03/01/2005  09:30:01  121.56  1100  0     0     O     P
SPY     03/01/2005  09:30:01  121.56   500  0     0     O     P
SPY     03/01/2005  09:30:01  121.56   400  0     0     O     P
SPY     03/01/2005  09:30:01  121.56   100  0     0     O     P
SPY     03/01/2005  09:30:01  121.56   100  0     0     O     P
SPY     03/01/2005  09:30:01  121.56   300  0     0     O     P
SPY     03/01/2005  09:30:01  121.56   100  0     0     O     P
SPY     03/01/2005  09:30:01  121.56   500  0     0     O     P
SPY     03/01/2005  09:30:01  121.56  1300  0     0     O     P
SPY     03/01/2005  09:30:01  121.56   200  0     0     O     P
SPY     03/01/2005  09:30:01  121.56   100  0     0           P
SPY     03/01/2005  09:30:01  121.56   100  0     0           P
SPY     03/01/2005  09:30:01  121.56   100  0     0           P

4.2 Data handling and cleaning

The complete data set (1999-2013) was retrieved as separate yearly CSV files, containing 829,248,123 (829 million) trades and corresponding to a total file size of 40GB. The largest yearly file was that of 2008 and contained 148 million observations. As this usually cannot be loaded into working memory (RAM) due to capacity restrictions, a database system is needed. Therefore, using Python, the data was transferred into an SQLite database, where each day was converted into one table. The total size of this database is 63.6GB.
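A chunked ingestion along these lines can be sketched as follows; the file name and chunk size are illustrative assumptions, not the exact scripts used for this thesis, while the column names follow table 4.1.

```python
import sqlite3
import pandas as pd

con = sqlite3.connect("spy_trades.db")
# Stream the yearly CSV in chunks so it never has to fit in RAM,
# writing one table per trading day.
for chunk in pd.read_csv("taq_spy_2008.csv", chunksize=1_000_000):
    for date, day in chunk.groupby("DATE"):
        table = "trades_" + str(date).replace("/", "_")
        day.to_sql(table, con, if_exists="append", index=False)
con.close()
```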

Hansen & Lunde (2006) apply a thorough cleaning procedure to the high-frequency data, discarding a part of the observations. This contradicts the principle outlined in section 2.2, but is done for a valid reason. Barndorff-Nielsen et al. (2009) state that an estimator that makes optimal use of all data will typically put high weight on accurate data and be less influenced by the least accurate observations. Therefore, inaccurate observations need to be removed from the data. Barndorff-Nielsen et al. (2009, p. C7) formalize and complete the cleaning procedure of Hansen & Lunde (2006) and identify the following steps. The P steps are applicable to all data, while the T steps are specifically for cleaning trade data. Each step below contains the cleaning method and a note if the application in this thesis differs from Barndorff-Nielsen et al. (2009); a code sketch of the retained filters follows the list.

P1 Delete entries with a time stamp outside the 9:30-16:00 window, when the exchange is open.

P2 Delete entries with a transaction price equal to zero.

P3 Only retain entries originating from a single exchange. Note: The applied method deviates from Barndorff-Nielsen & Shephard (2004) and retains observations for the most active exchange per day, in order to include as many observations as possible while correcting for asymmetric timing issues.

T1 Delete entries with a Correction Indicator (CORR) unequal to zero.

T2 Delete entries with an abnormal sale condition, i.e. retain only entries with COND equal to zero, empty, '@' or 'F'.

T3 If multiple transactions have the same time stamp, use the median price. Note: This method is not applied in this thesis, and trades with the same time stamp are therefore not deleted. The reason for this is twofold: firstly, the natural order in the databases causes the trades to be chronological, and secondly, this thesis aims to incorporate as much available data as possible, and disregarding valid price changes within the second would not be in line with this aim.

T4 Remove outliers by using the bid-ask spread. Note: As the bid-ask spread is not part of the data set, the standard procedure for detecting outliers cannot be applied.
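The retained filters P1-P3 and T1-T2 can be sketched as follows (a simplified illustration; the handling of the COND values in particular is an assumption):

```python
import pandas as pd

def clean_trades(day: pd.DataFrame, exchange: str) -> pd.DataFrame:
    """Apply steps P1-P3 and T1-T2 to one day of trades;
    column names follow table 4.1."""
    day = day[day["TIME"].between("09:30:00", "16:00:00")]  # P1
    day = day[day["PRICE"] > 0]                             # P2
    day = day[day["EX"] == exchange]   # P3: most active exchange that day
    day = day[day["CORR"] == 0]                             # T1
    cond = day["COND"].fillna("").astype(str).str.strip()   # T2
    return day[cond.isin(["", "0", "@", "F"])]
```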

By far the most complex cleaning procedure is T4, the removal of outliers: observations which lie outside an expected price interval. Generally, outliers can be of two different types. Firstly, the price can be measured incorrectly, so that the trade did not occur at the reported price. Secondly, a short shock in the market can cause the price to jump quickly to a new level and directly back to the old level. Two examples of such a shock are fat-finger errors (see the note below table 4.2) and the flash crash of 2010. Clearly, the first type of outlier needs to be deleted, as the observation is inaccurate, while the second type is a legitimate trade and therefore actually drives volatility.

The problem that arises is that the underlying reason for an outlier is nearly impossible to model statistically. In short, there is a trade-off between providing accurate data while possibly deleting valid trades, and including as much data as possible while including inaccurate observations. In line with the general approach of this thesis, the choice is made not to remove outliers and thus to incorporate as much data as possible.

Applying the cleaning procedure removes about 60% of all observations; see table 4.2 for the number of trades removed per step per year. The last column indicates the percentage of observations that is preserved.

Table 4.2: Data cleaning procedure

Year  Obs. prior   P1         P2  EX  P3          T1      T2          Obs. post   % Obs.
1999  411,715      120        0   A   172,349     3,492   3,602       237,835     58%
2000  427,558      454        0   A   170,087     3,132   5,195       255,817     60%
2001  1,242,448    5,214      0   T   624,911     5,470   27,949      607,125     49%
2002  4,840,088    32,796     0   T   2,448,055   9,146   117,828     2,310,235   48%
2003  6,552,211    79,495     0   T   4,352,472   10,292  209,663     2,117,800   32%
2004  12,673,109   98,686     0   P   5,630,767   6,566   271,097     6,883,513   54%
2005  23,179,317   238,991    0   T   12,208,758  6,162   424,547     10,786,691  47%
2006  20,104,076   371,783    0   T   9,158,739   3,988   720,387     10,556,145  53%
2007  52,020,198   1,067,558  0   T   22,469,300  3,608   10,668,672  28,609,563  55%
2008  148,456,139  3,385,117  0   T   65,659,389  11,466  3,145,596   80,556,252  54%
2009  145,090,224  3,509,405  0   T   79,651,170  4,014   617,900     63,514,791  44%
2010  131,449,261  3,886,100  0   T   89,346,182  6,722   160,373     40,376,779  31%
2011  134,519,527  3,918,789  0   T   91,176,014  5,004   218,257     41,531,221  31%
2012  79,241,910   1,812,914  0   Z   54,514,252  1,676   189,355     23,910,387  30%
2013  67,910,723   0          0   Z   51,226,736  1,367   237,139     16,670,866  25%

Note on fat-finger errors: a fat-finger error is a trading error where an order to buy or sell a stock is placed with a wrong amount, price or other input. This causes the stock to be traded at a seemingly erroneous price, seen as an outlier, but these trades actually took place.


Chapter 5

Results

This chapter contains the results from applying the procedures described in chapter 3. The first section contains the results of auxiliary estimations, including an autocorrelation test for self-excitation in jumps and a GARCH specification for the number of jumps. The second section contains the results of estimating the continuous part of the JLHARG-RV model. The third section contains the modelling results of the three discontinuous models, including a comparison between the models.

5.1 Auxiliary results

The goal of this section is to empirically underpin the theory explained in chapter 2. Firstly, the impact of microstructure noise and the correction by the Realized Kernel are shown. Secondly, volatility clustering is tested. Thirdly, results regarding the modelling of jumps using Threshold Bipower Variation are given. Lastly, an empirical underpinning for self-excitation in jumps is shown and different jump-intensity models are tested.

Table 5.1: Mean, daily values of Realized measures, per year

Year  RV  RV (Realized Kernel)  Continuous RV  Jump RV  No. jumps  Micr. noise (share removed)

1999 2.760E-04 2.678E-04 2.642E-04 3.567E-06 0.9 3%

2000 4.251E-04 4.098E-04 4.014E-04 8.466E-06 1.6 4%

2001 4.948E-04 4.290E-04 3.724E-04 5.661E-05 21.9 13%

2002 7.609E-04 4.183E-04 2.040E-04 2.143E-04 217.5 45%

2003 2.557E-04 1.679E-04 9.884E-05 6.905E-05 49.2 34%

2004 3.415E-05 3.128E-05 3.017E-05 1.117E-06 8.7 8%

2005 3.652E-05 2.922E-05 2.826E-05 9.617E-07 8.1 20%

2006 1.085E-04 3.816E-05 2.945E-05 8.710E-06 65.4 65%

2007 8.121E-05 6.307E-05 6.156E-05 1.507E-06 17.3 22%

2008 5.499E-04 4.271E-04 4.162E-04 1.091E-05 55.7 22%

2009 3.067E-04 1.910E-04 1.895E-04 1.484E-06 6.2 38%

2010 1.669E-04 8.270E-05 8.135E-05 1.346E-06 6.3 50%

2011 1.152E-04 1.078E-04 1.063E-04 1.561E-06 10.6 6%

2012 4.301E-05 3.742E-05 3.640E-05 1.022E-06 5.3 13%

2013 2.739E-05 2.579E-05 2.519E-05 5.934E-07 9.9 6%

As can be seen in table 5.1, the Realized Kernel removes between 3% and 65% of the microstructure noise per year. These results are in line with the theory from chapter 2. A first check of the RV measure is performed by scaling the VIX back to an RV measure. Visually, the VIX and the RV series indeed seem to resemble the same underlying process, see figure 5.1. The difference between the two measures can be attributed to the risk premia investors pay for different types of risk, see the discussion in Alitab et al. (2016).

Figure 5.1: Daily volatility, compared with the VIX, 1999-2013

The theory of volatility clustering is tested by assessing the autocorrelation in the RV measure. Indeed, the ACF plot in figure 5.2 confirms the clustering of volatility, as the RV series appears to be autocorrelated.

Figure 5.2: Autocorrelation of RV

Chapter 2 states that a volatility series can be separated into a continuous part and a discontinuous part. To do so, the Threshold Bipower Variation method is applied. Corsi (2009, p. 10, fig. 2) finds that the contribution of jumps in 1999-2001 is around 10%. As can be seen in figure 5.4, the contribution in 1999-2001 is lower here, around 3%. Around 2003, the contribution peaks at around 75%, which appears to be relatively high. On average, the figure seems to be in line with the theory of a jump contribution of around 10%.


Figure 5.3: Number of jumps
Figure 5.4: Jump contribution to total variance

Figure 5.5: Number of jumps, ACF
Figure 5.6: Jump contribution, ACF

In order to test for self-excitation in the model, the number of jumps and the contribution of jumps to total volatility are tested for autocorrelation. Figures 5.5 and 5.6 show these ACFs. It appears that both the number of jumps and contribution of jumps are autocorrelated. This confirms the theory about self-excitation in jumps and validates the time-dependent extension to the model.

As specified in section 3.3, two alternative, time-dependent models are specified. The first is a basic, intuitive model which models the jump intensity as a constant plus a parameter times the number of jumps in the previous period. The second is more sophisticated and uses a GARCH(1,1) specification to model the number of jumps. Figure 5.7 shows the actual and predicted number of jumps. Comparing the RMSEs of the three models confirms that the GARCH(1,1) model outperforms the other specifications, as the RMSE of model 3 (90.37) is lower than the RMSE of model 1 (98.7) and model 2 (99.1).


5.2 Continuous model

This section contains the results of modelling continuous volatility using maximum likelihood. The results are evaluated in three ways, as described in section 3.4. Firstly, the model is estimated on the complete data set and evaluated using the RMSE. Secondly, the model is estimated on the training set and evaluated on the evaluation set, again using the RMSE. Lastly, the observed values are plugged into the estimated conditional CDF to check the fit of the chosen specification.

5.2.1 Model estimation and evaluation

In order to estimate the nine coefficients of this highly nonlinear model, the maximum likelihood implementation requires the starting values to be chosen tactically. In this thesis, the starting values of the βs are chosen by first estimating an intermediate model, keeping α1,2,3 = 0.1 and γ = 0.1 constant. Subsequently, the final model is estimated, where the starting values of β are the estimated coefficients of the intermediate model.

Table 5.2: Estimation results of modelling continuous RV.

                Complete data set                          Training / evaluation
Parameter   Intermediate          Final                 Intermediate          Final
α1          -                     -0.13402 (0.0275)     -                     -0.1563 (0.0281)
α2          -                     -0.24660 (0.0518)     -                     -0.2099 (0.0505)
α3          -                     -0.24721 (0.0692)     -                     -0.1560 (0.0503)
β1          12985.50 (1049.11)    12956.18 (1222.83)    4220.47 (682.488)     4229.61 (778.60)
β2          12586.33 (2151.88)    12550.64 (1106.97)    3941.51 (1058.79)     3947.20 (-)
β3          12093.97 (1821.31)    12048.38 (1933.94)    3951.04 (984.336)     3954.88 (-)
δ*          -18.97619 (45.4615)   -2.51260 (0.4291)     -15.44319 (320.752)   -15.42113 (-)
θ*          -10.58532 (0.0140)    -10.64584 (0.0154)    -9.71284 (0.04152)    -10.68279 (0.02940)
γ           -                     37.73221 (8.4810)     -                     -227.45571 (31.66059)
-2 log LL   -61765.41             -61898.31             -14478.1              -15296.48
RMSE                  0.000188876                                 0.000133651

* Note that the exponent of the parameter needs to be taken to correspond to the theoretical model.

Two main results can be deduced from table 5.2. Firstly, α1,2,3 and β1,2,3 are both very sensitive to the data set: when the model is trained using 75% of the total data set, the coefficients change heavily. Secondly, the optimization algorithm estimates β relatively close to the starting values. This indicates that there is room for improvement in the fit of the model and in the quality of the optimization algorithm.


From the individual coefficients alone, however, the fit of the model cannot be judged. The fit and specification of the model are evaluated in the next subsection.

5.2.2 Density estimation and specification testing

All observed continuous RVs are plugged into the estimated conditional CDF. If the density is specified correctly, this should result in a uniformly distributed response vector, as explained before. Figure 5.8 shows a histogram of these responses, which appear not to be uniformly distributed. The main result to be drawn from figure 5.8 is that the chosen specification is not suitable for the data. Two possible explanations for the misspecification are given next.

Figure 5.8: Continuous RV density responses

Firstly, the distribution (the noncentral gamma) could be chosen incorrectly. However, as explained by Alitab et al. (2016), the noncentral gamma distribution is very flexible and has proven to improve volatility modelling.

Secondly, the estimated coefficients may not fit the data well enough. This could be due to the quality of the optimization algorithm or to the fact that the coefficients are estimated as constant. Chapter 2 has shown that volatility is not constant; the parameters could therefore vary over time, and fixing them could result in a misspecification.

5.3 Jump model

This section contains the results of modelling the jump component of volatility. As explained in chapter 3, this thesis treats three jump models. Firstly, the models are estimated, evaluated and compared. Secondly, the specification is tested using the estimated conditional CDF.

5.3.1 Model estimation and evaluation

Table 5.3 contains the results of modelling the discontinuous RV. All coefficients are as expected and, where applicable, in line with the previous findings of Alitab et al. (2016). Two results can be deduced. Firstly, the parameters are less dependent on the choice of the training set than those of the continuous RV model, as the complete data set and the training set yield nearly the same estimates. Secondly, the main result is that the RMSE of model 3 is lower than that of models 1 and 2, indicating a better modelling quality. The outperformance of model 3 in terms of RMSE is around 5%; the difference between model 1 and model 2 is negligible.

All in all, these results show that allowing for time-dependency improves the discontinuous RV model, if the jump-intensity is modelled with a GARCH(1,1) specification. This result does not seem to hold for the training and evaluation set, most likely due to the strong dependence of the GARCH model on the complete history. Next, the specification of both models is tested.

Table 5.3: Estimation results of modelling the discontinuous RV

           Complete data set                                              Training/evaluation set
           Model 1              Model 2              Model 3              Model 1              Model 2              Model 3
δ*         -4.83458 (0.03664)   -3.39302 (0.06378)   -2.18343 (0.02443)   -4.45747 (0.07192)   -3.19646 (0.06676)   -2.06802 (0.02804)
θ*         -9.15489 (0.03752)   -9.36827 (0.03749)   -9.89736 (0.03007)   -12.50117 (0.06572)  -9.25560 (0.04151)   -9.81233 (0.03308)
γ          31.05836 (0.98293)   -                    0.23845 (0.00179)    31.16787 (1.90825)   -                    0.26594 (0.00229)
ω          -                    -                    0.00962 (-)          -                    -                    0.00965 (-)
α          -                    7.16472 (0.45766)    0.60475 (0.003325)   -                    6.50313 (0.43984)    0.47974 (0.019030)
β          -                    0.05929 (0.00239)    0.85024 (0.001326)   -                    0.05657 (0.00243)    0.78717 (0.003769)
-2 log LL  -81241.27            -81554.94            210885.4             -24623.7             -58039.92            -53137.83
RMSE       0.0001663            0.0001612            0.0001530            0.0000343            0.0000284            0.0000382

* Note that the exponent of the parameter needs to be taken to correspond to the theoretical model.

5.3.2 Density estimation and specification testing

As with the continuous model, the discontinuous RV observations are plugged into the estimated conditional CDF. Figures 5.9, 5.10 and 5.11 display the specification of the models. Two results are apparent. Firstly, none of the specifications exhibits a uniformly distributed histogram, and therefore the chosen distributions seem to be misspecified. Secondly, model 1 displays a better fit than models 2 and 3. This suggests that a time-dependent jump component actually deteriorates the model fit, even though the quality of the jump-intensity model increased.


Figure 5.9: Density model 1
Figure 5.10: Density model 2
Figure 5.11: Density model 3

Chapter 6

Conclusion

6.1 Conclusion

This chapter concludes the thesis in the same order as the previous chapters. Firstly, the motivation for this research and the research question are discussed. Secondly, the theoretical basis of this thesis is briefly summarized. Subsequently, the model and data set are described. Lastly, based on the estimation results, conclusions are drawn and the research question is answered. The next section discusses the limitations of this research and gives suggestions for further research.

The riskiness of any financial asset is a key driver of the underlying price. By correctly modelling the volatility of the price, the risk of the asset can be quantified. The goal of this thesis is therefore to model volatility as precisely as possible. This leads to the following research question: "Can the JLHARG-RV model be improved by including a time-dependent jump-intensity?"

Chapter 2 describes the behaviour exhibited by observed volatility series. Known stylized facts are fat tails in the underlying distribution, autoregressive behaviour of volatility series, a leverage effect and the distinction between continuous and discontinuous price movements.

The general philosophy in this thesis is that more data give better insights into modelling volatility. Therefore, intraday trade data are used at a tick-by-tick level. Using high-frequency data points introduces noise in the volatility estimators, so a correction for this microstructure noise must be applied. Using the technique of kernel smoothing, the realized kernel gives an unbiased estimate of the Realized Variance (RV) for each day. Using Threshold Bipower Variation (TBPV), the RV series is split into a continuous and a discontinuous part, which can be modelled separately.
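To illustrate the estimators summarized above, the sketch below computes the plain realized variance and a flat-top realized kernel with a Parzen weight function from one day of intraday log-prices. It is a simplified sketch (fixed bandwidth `H`, no treatment of end effects or data-driven bandwidth selection), not the full Barndorff-Nielsen et al. (2009) implementation.

```python
import numpy as np

def parzen(x):
    """Parzen kernel weight for x in [0, 1]."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x**2 + 6.0 * x**3
    if x <= 1.0:
        return 2.0 * (1.0 - x)**3
    return 0.0

def realized_variance(log_prices):
    """Sum of squared intraday returns; biased upwards under microstructure noise."""
    r = np.diff(log_prices)
    return np.sum(r**2)

def realized_kernel(log_prices, H=30):
    """Flat-top realized kernel: RV plus kernel-weighted return autocovariances.

    The autocovariance terms offset the noise-induced bias of plain RV;
    `H` is the bandwidth, which is chosen data-dependently in practice.
    """
    r = np.diff(log_prices)
    rk = np.sum(r**2)  # gamma_0, the plain realized variance
    for h in range(1, H + 1):
        gamma_h = np.sum(r[h:] * r[:-h])  # h-th realized autocovariance
        rk += 2.0 * parzen((h - 1) / H) * gamma_h
    return rk
```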

Chapter 3 describes the models for both the continuous and the discontinuous RV series. The continuous part is modelled by the JLHARG-RV model of Alitab et al. (2016). Three models are applied to the discontinuous part. Firstly, the model of Alitab et al. (2016) is applied, with a constant jump-intensity parameter. Secondly, a linear specification is applied, in which the jump-intensity is a constant plus a coefficient times the number of jumps in the previous period. Lastly, the jump-intensity is modelled by a scaled GARCH(1,1) process.
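The three jump-intensity specifications can be written down compactly. The sketch below generates the intensity path for each model from the history of daily jump counts; the parameter names (`lam0`, `gamma`, `omega`, `alpha`, `beta`, `scale`) are illustrative and do not map one-to-one onto the coefficients reported in Table 5.3.

```python
import numpy as np

def intensity_constant(n_jumps, lam0):
    """Model 1: constant jump-intensity, as in Alitab et al. (2016)."""
    return np.full(len(n_jumps), lam0)

def intensity_linear(n_jumps, lam0, gamma):
    """Model 2: constant plus coefficient times previous day's jump count."""
    lam = np.empty(len(n_jumps))
    lam[0] = lam0
    lam[1:] = lam0 + gamma * n_jumps[:-1]
    return lam

def intensity_garch(n_jumps, omega, alpha, beta, scale=1.0):
    """Model 3: scaled GARCH(1,1)-type recursion driven by past jump counts."""
    lam = np.empty(len(n_jumps))
    lam[0] = omega / max(1.0 - alpha - beta, 1e-8)  # start at unconditional level
    for t in range(1, len(n_jumps)):
        lam[t] = omega + alpha * n_jumps[t - 1] + beta * lam[t - 1]
    return scale * lam
```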

The data used in this thesis are tick-by-tick trade data on the SPY fund, which tracks the S&P 500 index. Chapter 4 describes the complete data set. This thesis uses all trade data from 1999 until 2013, containing over 800 million trades. Using the cleaning procedure outlined by Barndorff-Nielsen et al. (2009), the data are cleaned, preserving around 40% of all trades.
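The Barndorff-Nielsen et al. (2009) procedure consists of a handful of simple filters. The sketch below gives a pandas version of the main steps for trade data; the column names and sale-condition codes are assumptions about the TAQ layout, so this is a sketch of the procedure rather than the exact code used in Chapter 4.

```python
import pandas as pd

def clean_trades(trades: pd.DataFrame) -> pd.DataFrame:
    """Simplified trade cleaning in the spirit of Barndorff-Nielsen et al. (2009).

    Assumes a datetime column 'timestamp' and columns 'price', 'exchange',
    'corr' (correction indicator) and 'cond' (sale condition).
    """
    t = trades.copy()
    # Keep regular trading hours only (9:30-16:00).
    t = t.set_index("timestamp").between_time("09:30", "16:00").reset_index()
    # Drop trades with a non-positive price.
    t = t[t["price"] > 0]
    # Keep the single most active exchange for the ticker.
    t = t[t["exchange"] == t["exchange"].mode().iloc[0]]
    # Drop corrected trades and abnormal sale conditions.
    t = t[(t["corr"] == 0) & (t["cond"].isin(["", "E", "F"]))]
    # Aggregate trades with an identical timestamp using the median price.
    t = t.groupby("timestamp", as_index=False)["price"].median()
    return t
```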

Chapter 5 contains the results of applying the models above. Each model is tested in three ways. Firstly, the model is estimated using Maximum Likelihood on the complete data set and the Root Mean Squared Errors (RMSE) of the models are compared. Secondly, each model is trained on the training set (the first 75% of the data), evaluated on the evaluation set (the remaining 25% of the data), and again the RMSEs are compared. Lastly, the specification of each model is tested by plugging the observed values into the estimated conditional cumulative distribution function (CDF).

The first result for the continuous model is that the estimated coefficients are very sensitive to the starting values, indicating that the Maximum Likelihood algorithm might have found a local optimum instead of the global optimum. The second result is that the specification test based on the estimated conditional CDF does not indicate a good fit of the model to the data.

The three discontinuous models are tested in the same three ways. In terms of RMSE, model 3 performs best on the complete data set, while model 2 performs best on the training and evaluation set; the GARCH(1,1) specification models the jump-intensity itself by far the best. The specification test shows that the original model (model 1) resembles the uniform distribution the best. Models 2 and 3 seem misspecified, as the estimated conditional distribution does not fit the observed distribution.

All in all, the following three conclusions to the three subquestions can be drawn based on the results.

1. The conditional distribution of the continuous realized volatility by Alitab et al. (2016) is misspecified. This can be due to the wrong type of distribution (the noncentral gamma distribution), coefficients poorly estimated by Maximum Likelihood, or coefficients that change over time.

2. The estimation of the number of jumps can be improved by a linear specification including the number of jumps in the previous period (Model 2) and further improved by using a GARCH(1,1) specification (Model 3). The constant model of Alitab et al. (2016) seems too simplistic.

3. Although the RMSEs of models 2 and 3 are slightly lower than that of model 1, the observed distribution of discontinuous volatility is fitted better by model 1.

The answer to the main research question is therefore that by including a time-dependent jump-intensity, the quality of the point estimator (expectation) of the discontinuous volatility model improves, but the quality of the estimated conditional distribution deteriorates.

6.2 Limitations and suggestions for further research

The main limitations of this thesis lie in the fit of the model and the specification of the jump-intensity. This section elaborates on these problems and discusses three limitations, each with a suggestion for further research.

Firstly, the conclusion of the research is not fully satisfactory, as the conditional distributions of both RV series seem misspecified. The noncentral gamma distribution, while flexible, might therefore not be the correct distribution for the continuous RV series. A first suggestion is to model volatility using another distribution, such as the Student’s t-distribution, that is able to accommodate all stylized facts, use intraday data and produce a better-fitting conditional distribution.

Secondly, the estimated coefficients are highly sensitive and do not deviate much from the chosen starting values. This indicates that estimation by Maximum Likelihood is problematic, as the estimated coefficients might correspond to a local maximum of the likelihood. A second suggestion is to use another optimization algorithm or to decrease the number of model parameters.
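One concrete way to implement this suggestion is a multi-start strategy: run the optimizer from many randomized starting points and keep the best local optimum found. A minimal sketch, assuming a `neg_log_likelihood(params)` function and box bounds on the parameters:

```python
import numpy as np
from scipy.optimize import minimize

def multi_start_mle(neg_log_likelihood, bounds, n_starts=50, seed=0):
    """Run L-BFGS-B from random starting points; return the best optimum.

    `bounds` is a list of (low, high) tuples, one pair per parameter.
    """
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_starts):
        x0 = np.array([rng.uniform(lo, hi) for lo, hi in bounds])
        res = minimize(neg_log_likelihood, x0, method="L-BFGS-B", bounds=bounds)
        if res.success and (best is None or res.fun < best.fun):
            best = res
    return best
```

If several starting points converge to clearly different likelihood values, that is direct evidence of the local-optimum problem described above.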

Thirdly, while the fit of the jump-intensity improves using a GARCH(1,1) model, the quality of the discontinuous RV model decreases. This indicates that the discontinuous model is not the right specification for the observed time series. Specifically, the number of jumps might influence the discontinuous volatility differently than specified. A third suggestion is to change the specification of the discontinuous volatility.


References

Aït-Sahalia, Y., Mykland, P. A., & Zhang, L. (2005). How often to sample a continuous-time process in the presence of market microstructure noise. The Review of Financial Studies, 18 (2), 351–416.

Alitab, D., Bormetti, G., Corsi, F., & Majewski, A. A. (2016). A jump and smile ride: Continuous and jump variance risk premia in option pricing.

Amihud, Y., & Mendelson, H. (1986). Asset pricing and the bid-ask spread. Journal of Financial Economics, 17 (2), 223–249.

Andersen, T. G., Bollerslev, T., & Diebold, F. X. (2007). Roughing it up: Including jump components in the measurement, modeling, and forecasting of return volatility. The Review of Economics and Statistics, 89 (4), 701–720.

Andersen, T. G., Bollerslev, T., Diebold, F. X., & Ebens, H. (2001). The distribution of realized stock return volatility. Journal of Financial Economics, 61 (1), 43–76.

Barndorff-Nielsen, O. E., Hansen, P. R., Lunde, A., & Shephard, N. (2009). Realized kernels in practice: Trades and quotes. The Econometrics Journal, 12 (3).

Barndorff-Nielsen, O. E., & Shephard, N. (2004). Power and bipower variation with stochastic volatility and jumps. Journal of Financial Econometrics, 2 (1), 1–37.

Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31 (3), 307–327.

Boswijk, H. P., Laeven, R., & Yang, X. (2014). Testing for self-excitation in jumps.

Bouchaud, J.-P., Matacz, A., & Potters, M. (2001). Leverage effect in financial markets: The retarded volatility model. Physical Review Letters, 87 (22), 228701.

Chicago Board Option Exchange. (2017, Nov). Cboe futures exchange - education. https://cfe.cboe.com/cfe-education/cboe-volatility-index-vx-futures/vix-primer/cboe-futures-exchange-nbsp-nbsp-education.

Christie, A. A. (1982). The stochastic behavior of common stock variances: Value, leverage and interest rate effects. Journal of Financial Economics, 10 (4), 407–432.

Christoffersen, P., Feunou, B., & Jeon, Y. (2015). Option valuation with observable volatility and jump dynamics. Journal of Banking & Finance, 61, S101–S120.

Corsi, F. (2009). A simple approximate long-memory model of realized volatility. Journal of Financial Econometrics, 7 (2), 174–196.

Corsi, F., Fusari, N., & La Vecchia, D. (2013). Realizing smiles: Options pricing with realized volatility. Journal of Financial Economics, 107 (2), 284–304.

Corsi, F., Pirino, D., & Reno, R. (2010). Threshold bipower variation and the impact of jumps on volatility forecasting. Journal of Econometrics, 159 (2), 276–288.

Corsi, F., & Reno, R. (2009). HAR volatility modelling with heterogeneous leverage and jumps. Available at SSRN 1316953.


Cox, J. C., Ingersoll Jr, J. E., & Ross, S. A. (1985). A theory of the term structure of interest rates. Econometrica: Journal of the Econometric Society, 385–407.

Craioveanu, M., & Hillebrand, E. (2010). Why it is OK to use the HAR-RV(1, 5, 21) model. Unpublished manuscript.

Diebold, F. X., Gunther, T. A., & Tay, A. S. (1997). Evaluating density forecasts. National Bureau of Economic Research Cambridge, Mass., USA.

Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica: Journal of the Econometric Society, 987–1007.

Gouriéroux, C., & Jasiak, J. (2006). Autoregressive gamma processes. Journal of Forecasting, 25 (2), 129–152.

Graham, J. R., Michaely, R., & Roberts, M. R. (2003). Do price discreteness and transactions costs affect stock returns? comparing ex-dividend pricing before and after decimalization. The Journal of Finance, 58 (6), 2611–2636.

Hansen, P. R., & Lunde, A. (2006). Realized variance and market microstructure noise. Journal of Business & Economic Statistics, 24 (2), 127–161.

Heston, S. L., & Nandi, S. (2000). A closed-form GARCH option valuation model. The Review of Financial Studies, 13 (3), 585–625.

Hull, J. (2006). Options, futures, and other derivatives. Pearson Education India.

Majewski, A. A., Bormetti, G., & Corsi, F. (2015). Smile from the past: A general option pricing framework with multiple volatility and leverage components. Journal of Econometrics, 187 (2), 521–531.

Merton, R. C. (1976). Option pricing when underlying stock returns are discontinuous. Journal of Financial Economics, 3 (1-2), 125–144.

Nelson, D. B., & Cao, C. Q. (1992). Inequality constraints in the univariate GARCH model. Journal of Business & Economic Statistics, 10 (2), 229–235.

State Street Global Advisors. (2017, Oct). SPDR S&P 500 ETF. https://www.ssga.com/en/products/Products/apac/fund-detail/SPY.html.

Tsay, R. S. (2005). Analysis of financial time series (Vol. 543). John Wiley & Sons.

Wang, C. D., & Mykland, P. A. (2014). The estimation of leverage effect with high-frequency data. Journal of the American Statistical Association, 109 (505), 197–215.

Wharton Research Data Service. (2017, Oct). TAQ. https://www.eur.nl/ub/en/edsc/databases/financial_databases/.

Zhang, L., Mykland, P. A., & Aït-Sahalia, Y. (2005). A tale of two time scales: Determining integrated volatility with noisy high-frequency data. Journal of the American Statistical Association, 100 (472), 1394–1411.
