
Measuring Realized Volatility and Prediction in the HAR-RV Model with Implied Volatility, Jumps and Asymmetries

MSc Thesis Econometrics
University of Amsterdam

Author: Duco van Rossem
Supervisor: Peter Boswijk

May 2014

Abstract

This thesis deals with the measurement as well as the prediction of daily volatility, measured using high-frequency data on the S&P500 exchange-traded fund SPY. Various realized variance measures are employed and examined to control for microstructure noise. The measure of volatility used has large implications for its magnitude as well as its predictability. The HAR-RV model proposed by Corsi (2009) and its various extensions are used as a basis to model daily realized variance over a 14-year horizon. The HAR-RV model allows for separation of short-, medium- and long-term volatility components. The explanatory power of the leverage effect, jumps and the option market is assessed. There is evidence for a persistent leverage effect. Jumps, the option Greeks as well as the shape of the volatility surface add some explanatory power.


Contents

1 Introduction

2 Measuring Volatility
  2.1 Realized Volatility
  2.2 Simple model without Microstructure Noise
  2.3 The effects of Microstructure Noise
  2.4 Bias Correcting and Consistent Estimation
  2.5 Realized Kernel
  2.6 Jumps
    2.6.1 The Small Sample Biased Bipower Variation
    2.6.2 The Not-So Small Sample Biased 'Threshold' Bipower Variation

3 The HAR-RV Model
  3.1 Introduction to HAR-RV
  3.2 The LHAR-RV Model
  3.3 LHAR-RV and Jumps

4 Implied Volatility
  4.1 Introduction
  4.2 Implied Volatility Surface
    4.2.1 Failure of Black-Scholes Model
    4.2.2 Capturing the Implied Volatility Surface
  4.3 Greeks
    4.3.1 Overview
    4.3.2 Implementation with HAR-RV

5 Data
  5.1 High Frequency Trade Data
    5.1.1 Data Description
    5.1.2 Cleaning the High Frequency Trade Data
  5.2 Option Price and IV Data

6 Analysis
  6.1 Comparison of RV Measures
  6.2 HAR-RV and the Leverage Effect
    6.2.1 Specification Test and Structural Change
  6.3 Comparing Volatility Measures
    6.3.1 Across delta
    6.3.2 Comparison across different estimators
    6.3.3 Comparison across different models
  6.4 Option market
    6.4.1 Implied Volatility Surface
    6.4.2 Option Greeks
    6.4.3 Using the Put Option Greeks as Explanatory Variable
    6.4.4 Option Greeks versus the Surface
  6.5 Jumps


1 Introduction

The importance of volatility, a proxy for uncertainty, as an element of a risky asset's value is undeniable. Volatility forecasting is essential to assess the riskiness of future payoffs. It provides insight into the 'known unknowns' that are characterized by volatility. Volatility is fundamental in pricing assets and derivatives as well as in decision making when it comes to risk management and hedging. The aim of this thesis is to explore the forecasting power of the HAR-RV model proposed by Corsi (2009) and its various extensions as a basis to model realized variance (RV). The HAR-RV model allows for separation of short-, medium- and long-term volatility components. Furthermore, various measures of volatility are examined in this thesis to determine how they differ in the HAR-RV model. The extensions to volatility forecasting examined include the roles of jumps, the leverage effect and information from the option market. In building our model for the S&P500 index realized volatility, a number of recent academic contributions are aggregated.

Although daily returns of assets are next to impossible to predict, the volatility of returns is easier to forecast. Returns exhibit a number of properties not replicated by standard volatility models like GARCH. Correctly accounting for these dynamics of returns is important to obtain accurate forecasts for volatility. As Corsi (2009) outlines, the HAR-RV model can reproduce many properties of financial time series, including long memory. Usually long memory is replicated by employing fractional difference operators, as in FIGARCH and ARFIMA models. HAR-RV provides a simple alternative that is economically intuitive and practical to implement. In the practical application of forecasting future volatility, Andersen et al. (2003) and Andersen et al. (2004) show that simple reduced-form time series models for realized variance outperform the commonly used GARCH and related stochastic volatility models.

The research questions investigated in this paper are: To what extent do jumps, the leverage effect, the option Greeks and the implied volatility surface add significant additional information about tomorrow's realized variance modelled through the HAR-RV model? Furthermore, in selecting the realized variance measure to use in the HAR-RV framework, we ask the following: How do RV measures differ in practice and what is the difference in their predictability?

This thesis mainly builds on papers by Busch et al. (2011) and Corsi & Renò (2012), which apply HAR-RV models with various extensions improving volatility forecasts. This thesis combines aspects of those two models and also builds on them in a number of respects. Firstly, Busch et al. (2011) use 'simple' RV estimates while Corsi & Renò (2012) use the two-scale estimator $\mathrm{RV}^{(TS)}_t$ of Zhang et al. (2005). This thesis examines six RV measures and bases models on the more microstructure-noise-robust realized kernel $\mathrm{RV}^{(ker)}_t$ of Barndorff-Nielsen et al. (2008).

Secondly, Busch et al. (2011) use implied volatilities backed out of option prices to enhance the forecasting power of the HAR-RV model. The option market is often thought of as a market for volatility. Since options take a forward-looking view, implied volatility computed from option prices can be used as a predictor of future observed volatility. This thesis further explores the added explanatory power of information from the option market by using the shape of the implied volatility surface and the option Greeks.

Thirdly, Corsi & Renò (2012) introduce the LHAR-RV model, using negative returns over the past day, week and month to improve RV forecasts, thereby imposing a persistent leverage effect. We examine whether this persistent nature of the leverage effect adds significant explanatory power, and examine its explanatory power in relation to the other extensions.

The last extension to the HAR-RV model examined is the nonparametric separation of two components of RV: the jump component ($J_t$) and the integrated variance ($C_t$). These two components are treated separately: Andersen et al. (2007) and Busch et al. (2011) use past $J_t$ and $C_t$, measured using bipower variation (BPV), as separate regressors. Both show that the jump and continuous part are distinct in their forecasting power and differ in their persistence over time and predictability. In this thesis we instead examine jumps measured using the less biased threshold bipower variation (TBPV).

An important component of this paper is the use of high-frequency data to compute the measures of volatility. Over 32GB of high-frequency trade data is used to construct realized variance measures, leading to estimates of the unobservable volatility with higher precision and accuracy than simple volatility measures. To account for the microstructure noise which contaminates high-frequency measures, a number of estimators for RV are applied. Next to the 'simple' measure $\mathrm{RV}_t$, the average estimator $\mathrm{RV}^{(AVG)}_t$, the two-scale estimator $\mathrm{RV}^{(TS)}_t$ of Zhang et al. (2005), the estimator $\mathrm{RV}^{(Zhou)}_t$ of Zhou (1996) and the realized kernel $\mathrm{RV}^{(ker)}_t$ of Barndorff-Nielsen et al. (2008) are applied.

This thesis is organized as follows. Section 2 introduces the measurement of volatility, especially in the presence of microstructure noise. Section 3 discusses the appeal of the HAR-RV model and some of its extensions taking into account the leverage effect and jump processes. Section 4 introduces information from the option market that may be used to enhance volatility estimates. Section 5 discusses the data and cleaning methods. Section 6 carries out the analysis of the model output. Finally, Section 7 concludes.

2 Measuring Volatility

Volatility is a latent variable, and the search to characterize it has led to a growing number of papers analysing high-frequency intraday data. However, the use of high-frequency data brings with it the problem of microstructure noise, which is not related to the efficient price process. As will be shown later, microstructure noise gives a substantial bias to the daily volatility measure.

This section begins by outlining the added value of using high-frequency realized variance measures. Then the problems and approaches associated with measuring volatility, especially surrounding market microstructure, are discussed. Toward the end of the section, methods are discussed which allow for consistent estimation of realized variance and its jump component.

2.1 Realized Volatility

Using high-frequency data allows more information to be incorporated in measuring volatility. One aspect of this can be visualized in an example using open-to-close or close-to-close returns versus intraday returns. In Figure 1 below, intraday log price movements of the S&P500 tracker SPY on 2012-03-12 and 2010-05-19 are depicted. For both days the open-to-close return was zero, so that simply squaring the daily return cannot capture the large difference in volatility that occurred intraday.


Figure 1: Intraday log prices for SPY on 2013-03-12 and 2010-05-19, for which the open-to-close daily returns were both zero

The method by which volatility is measured plays an important role. For example, using 1- or 20-minute returns involves a trade-off between accuracy (which is theoretically optimized using the highest possible frequency) and microstructure noise leading to inconsistencies. There has been a substantial amount of research focused on the measurement of integrated variance in the presence of microstructure noise. As will be elaborated upon in Section 2.3, the two main sources of microstructure noise are the bid-ask bounce and prices that are discrete at the tick size.

2.2 Simple model without Microstructure Noise

Let us first consider the basics without microstructure noise. Suppose we have a stochastic log price process $y_t$ defined as
$$dy_t = \mu_t\,dt + \sigma_t\,dW_t,$$
here $W_t$ is a standard Brownian motion, $\mu_t$ is a drift component and $\sigma_t$ is a continuous random or nonrandom volatility process. We consider the realization $y_t$ for $0 \le t \le T$, one day of intraday tick data. To exclude a leverage effect for now, we also assume orthogonality of $\sigma_t$ and $dW_t$.

As is true in any practical application, the time series $y_t$ is observed only at discrete times $t_0, t_1, \dots, t_N$ where $t_N = T$, giving us $N + 1$ intraday price realisations. The fact that the price is observed at discrete intervals means that we cannot determine volatility at each instant as an Itô process would otherwise suggest.

Our variable of interest is the integrated variance, defined as
$$\mathrm{IV}_t = \int_{t-1}^{t} \sigma_s^2\,ds$$

Here we use $T = 1$, so that $t = 1, 2, \dots$ corresponds to day 1, 2, etc. $\mathrm{IV}_t$ is a measure of the daily variance $\sigma_t^2$. To estimate $\mathrm{IV}_t$ we use the realized variance estimator¹ $\mathrm{RV}_t$, defined as
$$\mathrm{RV}_t = \sum_{j=1}^{M} r_{t,j}^2,$$
here $r_{t,j}$ represents the $j = 1, \dots, M$ intraday returns sampled on subintervals of length $\delta$, with $T = 1$:
$$r_{t,j} = y_{t+\delta j} - y_{t+\delta(j-1)}, \quad j = 1, \dots, M \tag{2}$$

For example, taking 5-minute returns implies $M = 78$ and $\delta = 1/78$, since there are 78 5-minute intervals per day in the typical NYSE trading hours of 9:30 to 16:00. In this section we first consider using all available intraday high-frequency data to compute realized variance; we label this measure $\mathrm{RV}^{(all)}_t$. Using $N_t + 1$ intraday price observations we can construct $M = N_t$ non-zero return intervals, so that
$$\mathrm{RV}^{(all)}_t = \sum_{j=1}^{N_t} r_{t,j}^2$$

Andersen et al. (2003) showed that realized variance defined as $\mathrm{RV}^{(all)}_t$ is a consistent estimator of integrated variance when there is no microstructure noise:
$$\mathrm{RV}^{(all)}_t \xrightarrow{p} \mathrm{IV}_t \quad \text{as } \delta \to 0,\ M \to \infty \tag{3}$$

With no microstructure noise, the asymptotic distribution of realized variance $\mathrm{RV}^{(all)}_t$ derived by Barndorff-Nielsen & Shephard (2002) is
$$\sqrt{N_t}\,\frac{1}{\sqrt{2\,\mathrm{IQ}_t}}\left(\mathrm{RV}^{(all)}_t - \mathrm{IV}_t\right) \xrightarrow{d} N(0,1),$$
where $\mathrm{IQ}_t$ is the integrated quarticity, defined as
$$\mathrm{IQ}_t = \int_{t-1}^{t} \sigma_s^4\,ds$$
Barndorff-Nielsen & Shephard (2002) further showed that the integrated quarticity ($\mathrm{IQ}_t$) is consistently estimated by the realized quarticity ($\mathrm{RQ}_t$), defined as
$$\mathrm{RQ}_t = \frac{N_t}{3}\sum_{j=1}^{N_t} r_{t,j}^4, \tag{4}$$
so that
$$\sqrt{N_t}\,\frac{1}{\sqrt{\frac{2}{3}\mathrm{RQ}_t}}\left(\mathrm{RV}^{(all)}_t - \mathrm{IV}_t\right) \xrightarrow{d} N(0,1)$$

So, given a lack of microstructure noise, $\mathrm{RV}^{(all)}_t$ is a consistent estimator for $\mathrm{IV}_t$.

¹ Not to be confused with realized volatility, which is $\mathrm{RVol}_t = \sqrt{\mathrm{RV}_t}$.
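To make the mechanics concrete, here is a minimal sketch of $\mathrm{RV}_t$ and $\mathrm{RQ}_t$ in Python (a hedged illustration, not the thesis's code; the simulated log prices merely stand in for one day of SPY observations):

```python
import numpy as np

def realized_variance(log_prices: np.ndarray) -> float:
    """RV_t: sum of squared intraday log returns."""
    returns = np.diff(log_prices)
    return float(np.sum(returns ** 2))

def realized_quarticity(log_prices: np.ndarray) -> float:
    """RQ_t = (N_t / 3) * sum of fourth powers of returns, equation (4)."""
    returns = np.diff(log_prices)
    return float(returns.size / 3.0 * np.sum(returns ** 4))

# Illustration: 78 five-minute returns (79 prices) of simulated data
rng = np.random.default_rng(0)
log_p = np.cumsum(rng.normal(0.0, 0.001, size=79))
print(realized_variance(log_p), realized_quarticity(log_p))
```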


2.3 The effects of Microstructure Noise

First the two main sources of noise are discussed before moving onto the effects on the estimation of integrated variance. The two main sources of microstructure noise are:

1. Bid-ask bounce: Trade prices typically move back and forth between the bid and offer levels. For example, suppose the efficient price of an asset is constant and equal to $100.00 throughout the day. Theoretically this would imply a volatility of zero for the day. However, due to the cost of 'making' the market, trades do not actually occur at $100.00; rather, they occur at a spread around the efficient price, for example at an ask price of $100.10 and a bid price of $99.90. The observed intraday price process will therefore 'bounce' between those two prices, implying a biased estimate of the true volatility.

2. Discretization: Noise from the fact that the price process is observed on a discrete scale; errors come from rounding effects due to transaction prices being multiples of a tick size. The tick size for the data used in this paper is one cent, while the efficient price is theoretically continuous.

In Section 2.2 we implicitly assumed the efficient logarithmic price $y_{t,i}$ is observed, where $i = 0, \dots, N_t$ indexes our intraday price realizations occurring at $t_0, t_1, \dots, t_{N_t}$. Note that the number of intraday price realizations is not constant over the days, hence the subscript $t$ in $N_t$. Recall that $t$ refers to the day and that $j = 1, \dots, M$ refers to the intraday returns $r_{t,j}$ using a subinterval $\delta$. A basic model to incorporate microstructure noise supposes that the observed values are not $y_{t,i}$ but $x_{t,i}$. We write the noise as $u_{t,i}$, so that $x_{t,i}$ is a noisy observation of $y_{t,i}$:
$$x_{t,i} = y_{t,i} + u_{t,i} \tag{5}$$
Given some assumptions regarding the structure of this noise, we can derive the implications for the $\mathrm{RV}_t$ estimate. For now we assume the noise is an i.i.d. process with zero mean and finite fourth moment, and that the variance of $\epsilon_{t,i} = u_{t,i} - u_{t,i-1}$ is $O(1)$ as $N_t \to \infty$.

True microstructure noise dynamics of the bid-ask bounce and discretization are not captured by these noise assumptions. For example, discretization errors differ from the error source stated above. However, this model allows us to see how even a simple source of noise can lead to inconsistencies. Moreover, it allows the development of some improved estimators based on its theoretical properties.

Following Zhang et al. (2005), we will now show that $\mathrm{RV}^{(all)}_t$ is asymptotically a biased estimate of $\mathrm{IV}_t$ under microstructure noise. Recall that in the case of $\mathrm{RV}^{(all)}_t$ we have $N_t = M$, since we take the highest frequency and so use all intraday price observations to construct our return intervals. Applying the general realized variance $\mathrm{RV}^{(all)}_t$ to $x_{t,i}$ we obtain:
$$\mathrm{RV}^{(all)}_t = \sum_{i=1}^{N_t}(x_{t,i}-x_{t,i-1})^2 = \sum_{i=1}^{N_t}(y_{t,i}-y_{t,i-1})^2 + 2\sum_{i=1}^{N_t}(y_{t,i}-y_{t,i-1})(u_{t,i}-u_{t,i-1}) + \sum_{i=1}^{N_t}(u_{t,i}-u_{t,i-1})^2 \tag{6}$$
$$= \underbrace{\sum_{i=1}^{N_t} r^{*\,2}_{t,i}}_{A} + \underbrace{2\sum_{i=1}^{N_t} r^{*}_{t,i}\,\epsilon_{t,i}}_{B} + \underbrace{\sum_{i=1}^{N_t} \epsilon^2_{t,i}}_{C}, \tag{7}$$
here we label $r^{*}_{t,i}$ as the true return and $r_{t,i}$ as the noisy return:
$$r_{t,i} = r^{*}_{t,i} + u_{t,i} - u_{t,i-1} = r^{*}_{t,i} + \epsilon_{t,i} \tag{8}$$


Effectively, the observed returns $r_{t,i}$ become an MA(1) process via the correlated $\epsilon_{t,i}$.

The presence of microstructure noise introduces two additional terms. As we will see below, the third term $C$ makes consistency of the estimator fail. Taking the conditional expectation of equation (7) yields
$$\begin{aligned}
E\left[\mathrm{RV}^{(all)}_t \mid y\right] &= E\left[\sum_{i=1}^{N_t} r^{*\,2}_{t,i}\,\Big|\, y\right] + 2\,E\left[\sum_{i=1}^{N_t} r^{*}_{t,i}\,\epsilon_{t,i}\,\Big|\, y\right] + E\left[\sum_{i=1}^{N_t}\epsilon^2_{t,i}\,\Big|\, y\right] \\
&= \mathrm{IV}_t + E\left[\sum_{i=1}^{N_t}\left(u^2_{t,i} - 2u_{t,i}u_{t,i-1} + u^2_{t,i-1}\right)\Big|\, y\right] \\
&= \mathrm{IV}_t + \sum_{i=1}^{N_t} E\left[u^2_{t,i} - 2u_{t,i}u_{t,i-1} + u^2_{t,i-1}\,\Big|\, y\right] \\
&= \mathrm{IV}_t + 2N_t \cdot E\left[u^2_t \mid y\right] \tag{9}
\end{aligned}$$

Above, the second equality follows from the independence of the noise and price processes, $u_{t,j}$ and $y_{t,j}$ respectively. Also, from equation (3) we have that the $\mathrm{RV}^{(all)}_t$ estimate based on the true returns approaches $\mathrm{IV}_t$ in probability. Now, depending on the assumptions surrounding the noise structure, a number of inconsistencies can be shown. In equation (9) we see that applying the realized variance estimate to noisy data yields a biased estimate. The magnitude of the bias grows linearly in $N_t$ with fixed $T$.

More specifically, we can write the asymptotic properties with a statistical error term as below,
$$\mathrm{RV}_t \overset{d}{\approx} \mathrm{IV}_t + \underbrace{2N_t\cdot E\left[u_t^2\right]}_{\text{bias due to noise}} + O\!\left(N_t\,E\left[u_t^4\right] + \frac{1}{N_t}\,\sigma_t^4\right) \tag{10}$$

So far we have assumed i.i.d. noise. Another assumption that can be made is a dependent noise structure: $u_t$ has zero mean, is stationary, and is a strong mixing stochastic process with the mixing coefficient decaying exponentially. Ait-Sahalia et al. (2005) and Zhang (2006) showed that under these assumptions for the noise we have

$$\mathrm{RV}_t \overset{d}{\approx} \mathrm{IV}_t + \underbrace{2N_t\cdot E\left[u_t^2\right]}_{\text{bias due to noise}} + O\!\left(N_t\,\Omega + \frac{1}{N_t}\,\sigma_t^4\right), \tag{11}$$
here
$$\Omega = \mathrm{Var}\!\left[(\epsilon_{t,1}-\epsilon_{t,0})^2\right] + 2\sum_{i=1}^{\infty}\mathrm{Cov}\!\left[(\epsilon_{t,1}-\epsilon_{t,0})^2,\ (\epsilon_{t,i+1}-\epsilon_{t,i})^2\right]$$

These asymptotic results are used as the basis to introduce estimators that correct for the bias and inconsistency arising from microstructure noise. The following two sections introduce estimators with this aim.

2.4 Bias Correcting and Consistent Estimation

Sampling at lower frequencies reduces the effect of microstructure noise. So in a 20-minute-return-based estimate, microstructure noise plays a relatively smaller role than when using one-second returns.


Instead of using all the intraday observations in $\mathrm{RV}^{(all)}_t$, we can take sparsely subsampled returns to reduce the bias. We can see this by considering an augmented version of realized variance, $\mathrm{RV}^{(M)}_t$:
$$\mathrm{RV}^{(M)}_t = \sum_{j=1}^{M} r_{t,j}^2,$$
here $r_{t,j}$ are based on $M$ sparsely, equidistantly sampled price observation times. We refer to $\mathrm{RV}^{(M)}_t$ as the 'simple' realized variance estimate. Similarly to (7),
$$\mathrm{RV}^{(M)}_t = \underbrace{\sum_{j=1}^{M} r^{*\,2}_{t,j}}_{A} + \underbrace{2\sum_{j=1}^{M} r^{*}_{t,j}\,\epsilon_{t,j}}_{B} + \underbrace{\sum_{j=1}^{M}\epsilon^2_{t,j}}_{C} \tag{12}$$

And then, similarly to (9), asymptotically this becomes
$$E\left[\mathrm{RV}^{(M)}_t \mid y\right] = \mathrm{IV}_t + 2M\cdot E\left[u_t^2\right] \tag{13}$$

By reducing $M < N_t$, the problem of microstructure noise bias is alleviated. For example, taking 5-minute returns implies 78 subintervals over a 6.5-hour trading day, compared with 23400 subintervals when using the 1-second intervals that give us $\mathrm{RV}^{(all)}_t$. The bias is given by $2M\cdot E[u_t^2]$. Hence an $\mathrm{RV}^{(M)}_t$ estimate based on 1-second rather than 5-minute returns will be increasingly dominated by microstructure noise rather than the 'true' variance of the efficient price process. Using, for example, 1- to 20-minute returns is a trade-off between accuracy, which is theoretically optimized using the highest possible frequency, and microstructure noise leading to inconsistencies. This is something we will also see in practice in Section 6.1.

Similarly to (10) and (11), asymptotically $\mathrm{RV}^{(M)}_t$ is
$$\mathrm{RV}^{(M)}_t \overset{d}{\approx} \mathrm{IV}_t + \underbrace{2M\cdot E\left[u_t^2\right]}_{\text{bias due to noise}} + O\!\left(M\,\Omega + \frac{1}{M}\,\sigma_t^4\right) \tag{14}$$

By taking sparser observations, $M$ decreases and the microstructure noise bias is alleviated. Although the bias is reduced, the variance due to discretization is increased in equation (14). This leads to the bias-variance trade-off, both in the case of an i.i.d. noise structure and a dependent noise structure. Furthermore, a large disadvantage of simply taking 5-minute returns to control for microstructure noise is that it implies discarding large amounts of data: taking 5-minute returns means using only 1 out of every 300 seconds of data. For the sake of gaining an accurate estimate this is counter-intuitive.

Ait-Sahalia (2005) defines the so-called average RV estimator $\mathrm{RV}^{(AVG)}_t$. The central idea is to use multiple non-overlapping subgrids to estimate $\mathrm{RV}_t$. For example, 5-minute returns could be measured using the intervals 9:30:00-9:35:00, 9:35:00-9:40:00, etc., but also 9:30:01-9:35:01, 9:35:01-9:40:01, etc. This approach allows us to use all available data instead of discarding the data within the interval, as would be the case for the simple $\mathrm{RV}^{(M)}_t$ measure.

Formally, we partition the full grid $G = \{t_0, \dots, t_{N_t}\}$ into $N_k$ non-overlapping subgrids. Define $M^{(k)}$ as the number of return intervals for each subgrid, and define the realized variance for grid $k = 1, \dots, N_k$ as
$$\mathrm{RV}^{(k)}_t = \sum_{j=1}^{M^{(k)}} r_{t,j}^2$$


where $r_{t,j}$ here is the return for each subinterval on the $k$th grid. The average estimator is then defined as:
$$\mathrm{RV}^{(AVG)}_t = \frac{1}{N_k}\sum_{k=1}^{N_k}\mathrm{RV}^{(k)}_t$$

Zhang et al. (2005) propose the two-time-scales (TS) estimator $\mathrm{RV}^{(TS)}_t$ to estimate integrated variance consistently under the presence of microstructure noise. $\mathrm{RV}^{(TS)}_t$ is obtained by combining $\mathrm{RV}^{(all)}_t$ and $\mathrm{RV}^{(AVG)}_t$. The estimator is 'two-scale' in that it relies on very high-frequency data (through $\mathrm{RV}^{(all)}_t$) to identify the bias component, and relies on lower-frequency return data to characterize the individual realized variances prior to averaging. The two-time-scales estimator for daily realized variance is defined as:

$$\mathrm{RV}^{(TS)}_t = \frac{1}{N_k}\sum_{k=1}^{N_k}\mathrm{RV}^{(k)}_t - \frac{\bar N_t}{N_t}\,\mathrm{RV}^{(all)}_t \tag{15}$$

where $N_t$ is the number of observations in the complete grid, and $\bar N_t$ is defined as
$$\bar N_t = \frac{1}{N_k}\sum_{k=1}^{N_k} M^{(k)} = \frac{N_t - N_k + 1}{N_k}$$

Ait-Sahalia et al. (2005) propose a small-sample refinement of (15). The final estimator is defined as
$$\mathrm{RV}^{(TS,adj)}_t = \left(1 - \frac{\bar N_t}{N_t}\right)^{-1}\mathrm{RV}^{(TS)}_t$$
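The averaging and bias-correction logic can be sketched as follows, assuming one day of tick-level log prices in a NumPy array (function names are illustrative, not the thesis's code):

```python
import numpy as np

def rv_all(log_prices: np.ndarray) -> float:
    """RV using every available observation (RV^(all))."""
    return float(np.sum(np.diff(log_prices) ** 2))

def rv_two_scale(log_prices: np.ndarray, n_subgrids: int) -> float:
    """Two-scale estimator of equation (15): the average RV over K offset
    subgrids, minus a bias term identified from RV^(all), with the
    small-sample adjustment of Ait-Sahalia et al. (2005)."""
    n_returns = len(log_prices) - 1                   # N_t
    rv_grids = [rv_all(log_prices[k::n_subgrids])     # RV^(k) on k-th subgrid
                for k in range(n_subgrids)]
    rv_avg = float(np.mean(rv_grids))                 # RV^(AVG)
    n_bar = (n_returns - n_subgrids + 1) / n_subgrids
    rv_ts = rv_avg - (n_bar / n_returns) * rv_all(log_prices)
    return rv_ts / (1.0 - n_bar / n_returns)          # RV^(TS,adj)
```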

2.5 Realized Kernel

Consistently estimating $\mathrm{IV}_t$ under microstructure noise is to some extent similar to the widely used autocorrelation corrections that are often applied when estimating the long-run variances and covariances of stationary processes. In this section we introduce the realized kernel, also employed in Barndorff-Nielsen et al. (2009) using the Parzen kernel. In effect, the realized kernel directly estimates the microstructure noise and uses this estimate to correct our $\mathrm{IV}_t$ estimate. To understand the appeal of the realized kernel, we first consider the following simple kernel-based estimator of Hansen & Lunde (2004):
$$\mathrm{RV}^{(HL)}_t = \mathrm{RV}^{(all)}_t + 2\sum_{h=1}^{H}\frac{N_t}{N_t-h}\,\hat\gamma_h, \qquad \hat\gamma_h = \sum_{j=|h|+1}^{N_t} r_{t,j}\, r_{t,j-|h|}, \tag{16}$$
here $\hat\gamma_h$ are the empirical autocovariances, $N_t$ is the number of return intervals using all data, and $H$ is our bandwidth parameter.

Zhou (1996) was the first to make use of kernels to correct for microstructure noise. Zhou (1996) proposed (16) with $H = 1$. The intuition here is that the second term in (16) estimates the effect of microstructure directly. Using the definition of the noisy return in (8) we have
$$E[\hat\gamma_1] = E\left[\sum_{j=2}^{N_t} r_{t,j}r_{t,j-1}\right] = E\left[\sum_{j=2}^{N_t}(r^{*}_{t,j}+\epsilon_{t,j})(r^{*}_{t,j-1}+\epsilon_{t,j-1})\right] = E\left[\sum_{j=2}^{N_t}(u_{t,j}-u_{t,j-1})(u_{t,j-1}-u_{t,j-2})\right] = -\sum_{j=2}^{N_t}E\left[u^2_{t,j-1}\right] = -N_t\,E\left[u_t^2\right]$$

Appropriately scaled, this result for the autocovariance equals the bias from microstructure noise in our asymptotic results (10) and (11). Equation (16) with $H = 1$ combines the contaminated $\mathrm{RV}^{(all)}_t$ estimate with a correction estimating the variance originating from microstructure noise.

Hansen & Lunde (2006) found the Zhou estimator with $H = 1$ to be unbiased but inconsistent for $\mathrm{IV}_t$. However, Hansen & Lunde (2006) did find that Zhou's kernel estimator has some attractive properties useful for studying microstructure noise, and concluded that (i) the noise process is time dependent, (ii) the noise process is correlated with the efficient price process and (iii) the properties of microstructure noise have changed substantially over time.

Increasing $H$ in the estimator (16) does not solve the problem of inconsistency. Furthermore, (16) is made unbiased via the upward scaling of the empirical autocovariances $\hat\gamma_h$, where the $h$th autocovariance is scaled by $\frac{N_t}{N_t-h}$. This scaling can result in a negative estimate of volatility, and it also increases the variance of the estimator.

Following these problems, a number of kernels have been applied to improve the $\mathrm{IV}_t$ estimates. For example, Hansen & Lunde (2005) introduce the Bartlett kernel² and Barndorff-Nielsen et al. (2006) propose a flat-top kernel-based measure. We focus on the more recent so-called realized kernel estimation by Barndorff-Nielsen et al. (2008) and its application with the Parzen kernel discussed in Barndorff-Nielsen et al. (2009).

The realized kernel by Barndorff-Nielsen et al. (2008) is defined as follows:
$$\mathrm{RV}^{(ker)}_t = \sum_{h=-H}^{H} k\!\left(\frac{h}{H+1}\right)\gamma_h, \qquad \gamma_h = \sum_{j=|h|+1}^{M} r_{t,j}\, r_{t,j-|h|}, \tag{17}$$
where $k(x)$ is a kernel weight function such that $k(0) = 1$ and $k(1) = 0$. Barndorff-Nielsen et al. (2008) show that this $\mathrm{RV}^{(ker)}_t$ is consistent for $\mathrm{IV}_t$ for suitable choices of $H$, provided the covariance between the noise terms $u_{t,j}$ and $u_{t,j-h}$ vanishes for large $h$ and $j$ as $M_t \to \infty$.

A critical element of realized kernel estimation is the bandwidth selection $H$. The bandwidth relies on estimates of the unknown quantities $w^2$ and $\int_0^T \sigma^4\,du$ (the integrated quarticity), as outlined in Barndorff-Nielsen et al. (2009):
$$H^* = c^*\lambda^{4/5} n^{3/5}, \qquad \text{with } \lambda^2 = \frac{w^2}{\sqrt{\int_0^T \sigma^4\,du}}$$


$w^2$ is estimated through:
$$\hat w^2_{(i)} = \frac{\mathrm{RV}^{(i)}_{t,\mathrm{dense}}}{2n^{(i)}},\quad i = 1,\dots,q, \qquad \hat w^2 = \frac{1}{q}\sum_{i=1}^{q}\hat w^2_{(i)}$$
$\mathrm{RV}^{(i)}_{t,\mathrm{dense}}$ implies that we calculate realized variance using every $q$th trade. $n^{(i)}$ is the number of non-zero returns used to calculate $\mathrm{RV}^{(i)}_{t,\mathrm{dense}}$. This RV is 'dense' because $\mathrm{RV}^{(i)}_{t,\mathrm{dense}}$ is calculated over very short time intervals, which makes this RV estimate sensitive to microstructure noise.

The integrated quarticity $\mathrm{IQ}_t = \int_0^T \sigma^4\,dt$ is estimated using an estimate of $\mathrm{IV}_t = \int_0^T \sigma^2\,dt$, since $\mathrm{IV}_t^2 \simeq T\int_0^T \sigma^4\,dt$. $\widehat{\mathrm{IV}}_t = \mathrm{RV}_{\mathrm{sparse}}$ is used by Barndorff-Nielsen et al. (2009), where $\mathrm{RV}_{\mathrm{sparse}}$ is based on 20-minute returns. At low frequency, market microstructure noise has a negligible effect on realized variance, so this initial estimate for IV is a reasonable starting point.

In effect, $\lambda^2$ becomes a ratio of estimated microstructure noise relative to estimated integrated variance. A relatively higher $\hat w^2$ implies a larger bandwidth $H$. In day-to-day estimation of the realized kernel, the bandwidth $H$ changes to reflect the changing degree of bias correction needed. In Section 6 the realized kernel is also estimated keeping $H$ constant over time.

Barndorff-Nielsen et al. (2009) use the Parzen kernel for $k(\cdot)$ in (17), as it satisfies the smoothness conditions $k'(0) = k'(1) = 0$ and guarantees a non-negative estimate. The Parzen kernel is given by
$$k(x) = \begin{cases} 1 - 6|x|^2 + 6|x|^3 & 0 \le |x| \le 1/2 \\ 2(1-|x|)^3 & 1/2 \le |x| \le 1 \\ 0 & |x| > 1 \end{cases}$$
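A compact sketch of (17) with Parzen weights follows; the bandwidth is passed in rather than selected by the $H^*$ rule above (a simplified illustration, not the estimator as implemented for the thesis):

```python
import numpy as np

def parzen(x: float) -> float:
    """Parzen kernel weight, as defined above."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x ** 2 + 6.0 * x ** 3
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0

def realized_kernel(log_prices: np.ndarray, bandwidth: int) -> float:
    """Realized kernel of equation (17) with Parzen weights."""
    r = np.diff(log_prices)
    n = r.size

    def gamma(h: int) -> float:
        h = abs(h)  # gamma_h = sum_{j=|h|+1} r_j * r_{j-|h|}
        return float(np.sum(r[h:] * r[:n - h]))

    return sum(parzen(h / (bandwidth + 1)) * gamma(h)
               for h in range(-bandwidth, bandwidth + 1))
```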

The realized kernel by Barndorff-Nielsen et al. (2008) leads to a consistent estimate of integrated variance. Table 1 summarizes the various estimators and their properties; it shows that different assumptions on the nature of microstructure noise have implications for the unbiasedness and consistency of the variance estimators. In Section 6 the different realized variance estimators are compared when applied to data.


| IV estimator | Unbiased^a? | Consistent? | Noise time dependence | Noise and efficient price |
|---|---|---|---|---|
| RV(all)_t, using all data | No | No | Dependent / i.i.d. | Dependent / Independent |
| RV(M)_t, sparse sampling^b | No | No | - | - |
| RV(TS)_t, Zhang et al. (2005) | Yes | Yes | i.i.d. | Independent |
| RV(TS,adj)_t, Ait-Sahalia et al. (2005) | Yes | Yes | Dependent | Independent |
| RV(HL)_t, Hansen & Lunde (2006) | Yes | No | Dependent / i.i.d. | Independent / Dependent |
| RV(ker)_t, Barndorff-Nielsen et al. (2011) | Yes | Yes | Dependent / i.i.d. | Independent / Dependent |
| RV(ker,H*)_t, Barndorff-Nielsen et al. (2008) | Yes | Yes | Dependent / i.i.d. | Independent / Dependent |

a: Large-sample bias; TSRV for example is biased in smaller samples.
b: Sparse sampling for the simple RV estimate does not necessarily require assumptions on the microstructure noise.
note: H* refers to using optimal bandwidth selection.

Table 1: Properties of estimators for integrated variance in the presence of microstructure noise

2.6 Jumps

The quadratic variation of a process can be split into a continuous component and a jump component. These two components of volatility might follow different processes and have varying predictability. In this section we outline the methods employed to separate the two components.

Our aim is to disentangle the continuous process $C_t$ and the jump process $J_t$ that we assume make up our integrated variance measure $\mathrm{RV}_t$; crudely said:
$$\mathrm{RV}_t = C_t + J_t$$

Although jumps have been shown to be relevant in economic and financial applications, much of the research has found no evidence that jumps help in forecasting volatility. Andersen et al. (2007) and Busch et al. (2011) use jumps to forecast realized volatility with the HAR-RV model and find that jumps have an insignificant or negative impact on future volatility. As Corsi et al. (2010) highlight, this result is puzzling for two reasons. Firstly, qualitative inspection of the time series suggests that jumps precede bursts in volatility. Secondly, volatility is associated with the dispersion of beliefs and heterogeneous information, i.e. jumps go hand in hand with uncertainty about the true value of an asset. According to Corsi et al. (2010), this mismatch between theory and empirical results is due to the nature of the measure used to detect jumps. To measure jumps, Andersen et al. (2007) and Busch et al. (2011) use the so-called bipower variation (BPV), a popular measure of continuous quadratic variation proposed by Barndorff-Nielsen & Shephard (2004). Corsi et al. (2010) show that in finite samples bipower variation is substantially biased upward in the presence of jumps; this bias of BPV leads to an underestimation of the jump components.

The small sample bias can be alleviated by taking smaller intraday return intervals; however, this leads the estimation to be contaminated by market microstructure noise. Corsi et al. (2010) propose the concept of threshold multipower variation and show that their estimation technique is nearly unbiased on continuous trajectories and, critically, also in the presence of jumps. As we will see, using the threshold in effect protects the continuous measure against the presence of jumps.

We now build up to the threshold estimator discussed in Section 2.6.2 and employed in this thesis. In Section 2.6.1 we first show the estimation method of Barndorff-Nielsen & Shephard (2004), to highlight its limitation and to give an intuitive basis for the threshold estimator.

2.6.1 The Small Sample Biased Bipower Variation

We assume that log prices follow a general stochastic volatility jump diffusion model:
$$dy_t = \mu_t\,dt + \sigma_t\,dW_t + \kappa_t\,dq_t \tag{18}$$
$\kappa_t$ is the random jump process, where $q_t$ is a counting process normalized so that $dq_t = 1$ corresponds to a jump at time $t$ and $dq_t = 0$ otherwise. The intensity of the jump arrival process is $\lambda_{J,t}$, which can be time varying. $W_t$ is a standard Brownian motion, the instantaneous volatility $\sigma(\cdot) > 0$ is càdlàg and $\mu_t$ is predictable.

Below we depict the relationship between realized variance and quadratic variation. Note that, in the absence of jumps, the quadratic variation $[Y](t)$ equals our variable of interest, the integrated variance $\mathrm{IV}_t$. For any semimartingale the quadratic variation $[Y](t)$ is defined as
$$[Y](t) = \operatorname*{plim}_{M\to\infty}\sum_{j=1}^{M}(r_{t,j})^2 \tag{19}$$
Generally we can say from (18) that the quadratic variation can be broken up as the integrated volatility plus the sum of squared jumps over time:
$$[Y](t) = \int_0^t \sigma^2(s)\,ds + \sum_{j=1}^{q(t)}\kappa^2(t_j),$$
where $0 \le t_1 < t_2 < \dots$ are the jump times.

The classic estimator of quadratic variation is the realized variance discussed in the previous sections:
$$\mathrm{RV}_t = \sum_{j=1}^{M} r_{t,j}^2$$

To disentangle the continuous quadratic variation $\int_0^t \sigma^2(s)\,ds$ from the discontinuous one, $\sum_{j=1}^{q(t)}\kappa^2(t_j)$, Barndorff-Nielsen & Shephard (2004) propose the multipower variation ($\mathrm{MPV}_t$). Multipower variation is used for the estimation of $\int_0^t \sigma^2(s)\,ds$ and $\int_0^t \sigma^4(s)\,ds$ and is defined as
$$\mathrm{MPV}^{[\gamma_1,\dots,\gamma_L]}_t = \delta^{1-\frac{1}{2}(\gamma_1+\dots+\gamma_L)}\sum_{j=L}^{M}\prod_{k=1}^{L}|r_{t,j-k+1}|^{\gamma_k} \tag{20}$$

where $\delta$ is the subinterval length on which the $M$ intraday returns $r_{t,j}$ are calculated; for 5-minute returns we would have $\delta = 1/78$. As $\delta \to 0$, $\mathrm{MPV}_t$ converges to $\int_0^t \sigma^{\gamma_1+\dots+\gamma_L}(s)\,ds$. The bipower variation $\mathrm{BPV}_t$ is the version of (20) that converges to $\int_0^t \sigma^2(s)\,ds$, the continuous part of the quadratic variation. We define $\mathrm{BPV}_t$ as
$$\mathrm{BPV}_t = \mu_1^{-2}\,\mathrm{MPV}^{[1,1]}_t = \mu_1^{-2}\sum_{j=2}^{M}|r_{t,j}||r_{t,j-1}|, \tag{21}$$

(15)

here $\mu_1 = \sqrt{2/\pi}$. Also here, increasing $M$ is a trade-off between increased precision of the estimators and higher sensitivity to microstructure noise.

The intuition of $\mathrm{BPV}_t$ is the following: assume that the return interval $r^J_{t,j}$ contains a jump. If we compute realized variance, the $|r^J_{t,j}|$ containing a jump is multiplied with itself. Asymptotically these returns should 'vanish' (in terms of their relative contribution); however, because of their relative size, multiplying $|r^J_{t,j}|$ by itself will dominate the measure. $\mathrm{BPV}_t$ relieves the potential domination of squared returns containing a jump by instead multiplying the adjacent returns $|r_{t,j-1}|$ and $|r_{t,j+1}|$. In this way, the effect of a jump on the size of the measure $\mathrm{BPV}_t$ is limited, as opposed to the $\mathrm{RV}_t$ measure.

Realized bipower variation can be used to measure when a jump occurs, since realized bipower variation is consistent for the continuous integrated volatility segment of (19):
$$\mathrm{BPV}_t \xrightarrow{p} \int_{t-1}^{t}\sigma^2(s)\,ds \quad \text{as } M \to \infty$$

This implies that the difference $\mathrm{RV}_t - \mathrm{BPV}_t$ converges to the sum of squared jumps occurring in the relevant period. We still need a measure of when a significant jump component occurs; otherwise, just using the difference between $\mathrm{RV}_t$ and $\mathrm{BPV}_t$ would imply a jump process driven by sample variation from finite sampling. To detect jumps, Huang & Tauchen (2005) apply the following test statistic, in which a large positive value implies that a jump occurred:
$$Z_t = \frac{\sqrt{M}\,(\mathrm{RV}_t - \mathrm{BPV}_t)\,\mathrm{RV}_t^{-1}}{\sqrt{(\mu_1^{-4} + 2\mu_1^{-2} - 5)\,\max\{1,\ \mathrm{TQ}_t\,\mathrm{BPV}_t^{-2}\}}}$$

where $\mathrm{TQ}_t$ is the staggered realized tripower quarticity³; without jumps, $Z_t \xrightarrow{d} N(0,1)$ as $M \to \infty$. In the presence of jumps this test statistic underestimates the number of jumps. Furthermore, Huang & Tauchen (2005) show that microstructure noise may lead to an underestimation of the number of jumps.

$$J_t = I_{\{Z_t > \phi_{1-\alpha}\}}\,(\mathrm{RV}_t - \mathrm{BPV}_t)$$
$I_{\{Z_t > \phi_{1-\alpha}\}}$ is an indicator for the event that a jump occurred. $J_t$ is then the excess realized variance not attributed to $\mathrm{BPV}_t$. If no significant jump occurred in period $t$, then the continuous component is just $\mathrm{RV}_t$:
$$C_t = \mathrm{RV}_t - J_t$$

Consistency of the jump and continuous components may be achieved by letting $\alpha \to 0$ and $M \to \infty$, so that
$$C_t \xrightarrow{p} \int_0^t\sigma^2(s)\,ds, \qquad J_t \xrightarrow{p} \sum_{j=1}^{q(t)}\kappa^2(t_j)$$

The steps outlined above are a powerful method of using high-frequency data in a nonparametric estimation of jumps. However, as discussed, this estimation is biased, leading to an underestimation of the number of jumps detected. The small sample bias could be alleviated by taking smaller intraday return intervals; however, this leads to contamination by market microstructure noise.

In Section 2.6.2 we introduce the threshold multipower variation method used by Corsi et al. (2010) and applied in this thesis. Importantly, the measure by Corsi et al. (2010) is nearly unbiased in small samples in the presence of jumps, unlike the method introduced above.

³ Realized tripower quarticity is defined as
$$\mathrm{TQ}_t = \mu_{4/3}^{-3}\,\frac{M^2}{M-2(k+1)}\sum_{j=2k+3}^{M}|r_{t,j}|^{4/3}\,|r_{t,j-k-1}|^{4/3}\,|r_{t,j-2k-2}|^{4/3}$$
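A sketch of the jump separation of this subsection (BPV, tripower quarticity and the Huang & Tauchen test) under the stated definitions; the stagger $k$ and the significance level are illustrative choices:

```python
import numpy as np
from math import gamma
from scipy.stats import norm

MU1 = np.sqrt(2.0 / np.pi)                       # mu_1 = E|Z|, Z ~ N(0,1)

def bipower_variation(r: np.ndarray) -> float:
    """BPV_t of equation (21)."""
    return MU1 ** -2 * float(np.sum(np.abs(r[1:]) * np.abs(r[:-1])))

def tripower_quarticity(r: np.ndarray, k: int = 0) -> float:
    """Staggered realized tripower quarticity (footnote 3)."""
    m = r.size
    mu43 = 2 ** (2.0 / 3.0) * gamma(7.0 / 6.0) / gamma(0.5)  # E|Z|^{4/3}
    a = np.abs(r) ** (4.0 / 3.0)
    s = np.sum(a[2 * k + 2:] * a[k + 1:m - k - 1] * a[:m - 2 * k - 2])
    return mu43 ** -3 * m ** 2 / (m - 2 * (k + 1)) * float(s)

def jump_component(r: np.ndarray, alpha: float = 0.01) -> float:
    """J_t = I{Z_t > phi_{1-alpha}} (RV_t - BPV_t), Huang & Tauchen (2005)."""
    m = r.size
    rv = float(np.sum(r ** 2))
    bpv = bipower_variation(r)
    tq = tripower_quarticity(r)
    z = (np.sqrt(m) * (rv - bpv) / rv
         / np.sqrt((MU1 ** -4 + 2 * MU1 ** -2 - 5) * max(1.0, tq / bpv ** 2)))
    return rv - bpv if z > norm.ppf(1 - alpha) else 0.0
```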


2.6.2 The Not-So Small Sample Biased 'Threshold' Bipower Variation

The threshold realized variance is defined as
$$\mathrm{TRV}_\delta = \sum_{j=1}^{M} r_{t,j}^2\, I_{\{r_{t,j}^2 \le \Theta(\delta)\}}$$
where $\Theta(\delta)$ is the threshold function. Here the threshold function is scaled to the local spot variance:
$$\vartheta_t = c_\vartheta^2\cdot \hat V_t$$
Including the threshold in the multipower variation in (20), we obtain the threshold multipower variation ($\mathrm{TMPV}_t$):
$$\mathrm{TMPV}^{[\gamma_1,\dots,\gamma_H]}_t = \delta^{1-\frac{1}{2}(\gamma_1+\dots+\gamma_H)}\sum_{j=H}^{M}\prod_{k=1}^{H}|r_{t,j-k+1}|^{\gamma_k}\, I_{\{|r_{t,j-k+1}|^2\le\vartheta_{j-k+1}\}} \tag{22}$$

The intuition of using this threshold function is the following. Assume that the return interval $r^J_{t,j}$ contains a jump. If we compute the bipower variation $\mathrm{BPV}_t$ according to equation (21), the $|r^J_{t,j}|$ containing a jump is multiplied with $|r_{t,j-1}|$ and $|r_{t,j+1}|$. Asymptotically these returns should 'vanish' (in terms of their relative contribution) and $\mathrm{BPV}_t$ would converge to the integrated continuous volatility. In practice, however, for finite $\delta$ this return instance $|r^J_{t,j}|$ will not vanish and will have a substantial presence in the estimation of $\mathrm{BPV}_t$; $\mathrm{BPV}_t$ is biased upward the larger the jump in $|r^J_{t,j}|$. The indicator function forces a $|r^J_{t,j}|$ containing a jump larger than $\vartheta_j$ to 'vanish', thus preventing the continuous measure of volatility from being contaminated by the presence of jumps.

Bipower variation is greatly biased for large jumps and less so for smaller jumps. On the other hand, threshold realized variance is problematic with small jumps, since they might just fall outside the exclusion range, thus not 'vanishing'; multiple small jumps would be able to contaminate the TBPV measure. To protect against this, Corsi et al. (2010) employ a method utilizing both measures, BPV and TBPV. The 'corrected' method allows each measure to compensate for the weakness of the other.

The corrected version of (22) is given by
$$\mathrm{C\text{-}TMPV}^{[\gamma_1,\dots,\gamma_H]}_t = \delta^{1-\frac{1}{2}(\gamma_1+\dots+\gamma_H)}\sum_{j=H}^{M}\prod_{k=1}^{H} Z_{\gamma_k}(r_{t,j-k+1},\vartheta_{j-k+1}),$$
where
$$Z_\gamma(x,y) = \begin{cases} |x|^\gamma & \text{if } x^2 \le y \\ \dfrac{1}{2N(-c_\vartheta)\sqrt{\pi}}\left(\dfrac{2}{c_\vartheta^2}\,y\right)^{\gamma/2}\Gamma\!\left(\dfrac{\gamma+1}{2},\dfrac{c_\vartheta^2}{2}\right) & \text{if } x^2 > y \end{cases}$$
and $N(x)$ is the standard normal cumulative distribution function and $\Gamma(\alpha, x)$ is the upper incomplete gamma function.

For example, the corrected threshold version of (21) is
$$\mathrm{C\text{-}TBPV}_t = \mu_1^{-2}\,\mathrm{C\text{-}TMPV}^{[1,1]}_t = \mu_1^{-2}\sum_{j=2}^{M} Z_1(r_{t,j},\vartheta_j)\,Z_1(r_{t,j-1},\vartheta_{j-1})$$
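As a simplified illustration of the thresholding idea, the sketch below implements the uncorrected TBPV of equation (22) with $\gamma = (1,1)$, assuming a precomputed local variance series $\hat V_j$; the full C-TBPV would replace excluded returns by the expected value $Z_\gamma$ rather than zero:

```python
import numpy as np

MU1 = np.sqrt(2.0 / np.pi)

def threshold_bipower_variation(r: np.ndarray, local_var: np.ndarray,
                                c_theta: float = 3.0) -> float:
    """Uncorrected TBPV: adjacent absolute returns are multiplied only
    when both squared returns fall below the local threshold
    theta_j = c_theta^2 * V_hat_j; local_var is aligned with r."""
    theta = c_theta ** 2 * np.asarray(local_var)
    keep = (r ** 2 <= theta).astype(float)   # indicator I{r^2 <= theta}
    return MU1 ** -2 * float(
        np.sum(np.abs(r[1:]) * keep[1:] * np.abs(r[:-1]) * keep[:-1]))
```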

The test statistic based on this threshold correction in Corsi et al. (2010) is defined by
$$\mathrm{C\text{-}Tz} = \delta^{-1/2}\,\frac{(\mathrm{RV}_t - \mathrm{C\text{-}TBPV}_t)\cdot \mathrm{RV}_t^{-1}}{\sqrt{\left(\frac{\pi^2}{4}+\pi-5\right)\max\left[1,\ \frac{\mathrm{C\text{-}TTriPV}_t}{(\mathrm{C\text{-}TBPV}_t)^2}\right]}}$$


where the test statistic C-Tz is standard normal under the null hypothesis of no jumps; under the alternative, C-TMPV is upwardly biased. In order to mitigate the effects of microstructure noise, Corsi & Renò (2012) use $\mathrm{RV}^{(TS)}_t$ in the above test.

In the threshold, $\hat V_t$ is an auxiliary estimator of $\sigma_t^2$ and $c_\vartheta$ is a scale-free constant. Following Corsi et al. (2010) and Corsi & Renò (2012), we estimate the local variance with a non-parametric filter of length $2L+1$, adapted to the presence of jumps by iterating in $Z$:
$$\hat V_t^{Z} = \frac{\sum_{i=-L,\, i\neq -1,0,1}^{L} K\!\left(\frac{i}{L}\right) r_{t,i}^2\; I_{\{r_{t,i}^2 \le c_V^2\cdot \hat V^{Z-1}_{t+i}\}}}{\sum_{i=-L,\, i\neq -1,0,1}^{L} K\!\left(\frac{i}{L}\right) I_{\{r_{t,i}^2 \le c_V^2\cdot \hat V^{Z-1}_{t+i}\}}}, \qquad Z = 1, 2, \dots$$
With $c_V = 3$ and starting value $\hat V_t^{0} = +\infty$, all observations are used in the first step. At each iteration, large returns are eliminated via the indicator condition; the threshold in the subsequent iteration is based on the updated estimate of the variance multiplied by $c_V = 3$. We follow


3 The HAR-RV Model

3.1 Introduction to HAR-RV

Conditional volatility is latent; it cannot be directly observed. Many latent volatility models fail to describe several stylized facts observed in financial time series. One is that standard latent volatility models do not incorporate the empirical fact that the autocorrelations of squared and absolute returns show sustained persistence lasting for lengthy periods (months). Furthermore, the probability density functions of returns have fat tails, with shapes depending on the time scale used (e.g. 5-minute returns versus monthly returns); return distributions show a very slow convergence to the normal distribution as the time scale increases. The standard GARCH model, which appears like white noise if aggregated over longer time periods, is not able to reflect all these features.

Long memory is a property in which there is large interest in financial econometrics. Figure 2 below depicts the volatility persistence via the long-lasting significant autocorrelation function (ACF). FIGARCH and ARFIMA models of realized volatilities introduce the long memory feature by employing fractional difference operators. However, fractionally integrated models pose a number of problems. Although a convenient mathematical trick, fractional integration lacks a clear economic interpretation (Corsi, 2009). Estimation of fractionally integrated models is non-trivial and cannot easily be extended to multivariate processes. Furthermore, applying fractional difference operators requires a long build-up period, resulting in a loss of many observations.

Figure 2: Autocorrelation function of the daily realized kernel volatility measure of the S&P500 index tracker SPY over a 14-year period; panel (a) shows a 35-day lag structure, panel (b) a 100-day lag structure

The model used in this paper is based on the heterogeneous autoregressive (HAR) realized volatility (RV) model proposed by Corsi (2009). The HAR-RV model reads:
$$\mathrm{RV}^{(d)}_{t+1} = c + \beta^{(d)}\mathrm{RV}^{(d)}_t + \beta^{(w)}\mathrm{RV}^{(w)}_t + \beta^{(m)}\mathrm{RV}^{(m)}_t + \epsilon_{t+1} \tag{23}$$


where $\mathrm{RV}^{(d)}_t$, $\mathrm{RV}^{(w)}_t$ and $\mathrm{RV}^{(m)}_t$ are the daily, weekly and monthly realized variance, whose values for period $t$ are
$$\mathrm{RV}^{(d)}_t = \mathrm{RV}^{(X)}_t, \qquad \mathrm{RV}^{(w)}_t = \frac{1}{5}\sum_{h=0}^{4}\mathrm{RV}^{(d)}_{t-h}, \qquad \mathrm{RV}^{(m)}_t = \frac{1}{22}\sum_{h=0}^{21}\mathrm{RV}^{(d)}_{t-h},$$
where $\mathrm{RV}^{(X)}_t$ can refer to any measure of volatility introduced in Section 2. Sampling returns at time interval $\delta$ yields $M$ intraday returns per day. Note that $t$ indexes the day and $j$ indexes within a day.

The economic intuition of the HAR-RV model of equation (23) is that different market participants take actions based on different temporal horizons of volatility. Muller et al. (1999) first presented this idea in the form of the Heterogeneous Market Hypothesis. Market participants take action over a spectrum of trading frequencies: on one side there are long-horizon participants, such as pension funds, who adjust positions over months; on the other side of the spectrum, short-term traders take only intraday positions. These market participants can be expected to react to, and have an effect on, different temporal components of market volatility.

It has also been found that volatility over longer time intervals has a stronger influence on volatility over shorter time intervals than conversely. The sequence found is a volatility cascade from low frequencies (months) to high frequencies (days). Economically, the intuition is that traders who conduct short-term trades take long-term as well as short-term volatility into account in their trading decisions, while for long-term traders short-term volatility matters less.

To view the ability of the HAR-RV model to replicate the autocorrelation structure of Figure 2, the HAR-RV is simulated, generating the ACF in Figure 4 below. The parameters used to simulate the volatility process via the HAR-RV are based on those calibrated by Corsi (2009): $\beta^{(d)} = 0.36$, $\beta^{(w)} = 0.28$ and $\beta^{(m)} = 0.28$. The general ACF structure is replicated quite closely; however, after roughly 70 lags the long memory feature in this HAR-RV simulation fades. As a comparison with alternative models, Figure 3 depicts the ACF of a volatility process simulated using a GARCH(1,1) and an ARFIMA-eGARCH(1,1) model. The parameters used in these simulations are fitted to the data on which the ACF in Figure 2 is based. A GARCH(1,1) is not able to reproduce the long memory structure beyond a very short horizon. The ARFIMA-eGARCH(1,1) model is able to reproduce long memory, but it decays at a very consistent rate and the initial correlation is high relative to the observed data. So even though Figure 3 uses fitted parameters, it replicates the ACF structure worse than the HAR-RV model with non-fitted parameters.
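A stylized simulation of the HAR-RV recursion with the Corsi (2009) coefficients quoted above can be used to reproduce such an ACF; the constant and innovation scale below are illustrative, and a production version would guard positivity (e.g. simulate in logs):

```python
import numpy as np

def simulate_har_rv(n_days: int, beta_d=0.36, beta_w=0.28, beta_m=0.28,
                    c=0.1, sigma_eps=0.3, seed=0):
    """Simulate daily RV from the HAR-RV recursion of equation (23);
    returns the series and its ACF over lags 1..100."""
    rng = np.random.default_rng(seed)
    rv = np.ones(n_days)
    for t in range(22, n_days - 1):
        rv_w = rv[t - 4:t + 1].mean()       # weekly component RV^(w)
        rv_m = rv[t - 21:t + 1].mean()      # monthly component RV^(m)
        rv[t + 1] = (c + beta_d * rv[t] + beta_w * rv_w + beta_m * rv_m
                     + sigma_eps * rng.standard_normal())

    def acf(x: np.ndarray, lag: int) -> float:
        x = x - x.mean()
        return float(np.sum(x[:-lag] * x[lag:]) / np.sum(x ** 2))

    sample = rv[22:]                        # drop the burn-in period
    return rv, [acf(sample, lag) for lag in range(1, 101)]
```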


Figure 3: Autocorrelation function of simulated daily volatility over 100 days (lags) using the fitted parameters of a GARCH(1,1) model (panel a) and an ARFIMA-eGARCH(1,1) model (panel b)


3.2 The LHAR-RV Model

An attractive property of the HAR-RV model is its OLS nature. An important part of this paper is to judge various sources of potential explanatory power for volatility. The simple structure of the HAR-RV model makes it trivial to add additional regressors and judge their explanatory power. Using HAR-RV, as opposed to a realized GARCH for example, gives an easy way to analyze the additional explanatory power of variables such as jumps, implied volatility and the option Greeks. Following Corsi & Renò (2012), the models are estimated by OLS with Newey-West covariance correction for serial correlation. $h$ denotes the forecast horizon, so that $h = 1$ refers to forecasting one-day-ahead volatility, $h = 5$ to a week-ahead forecast, etc. This thesis focuses on day-ahead forecasts, $h = 1$. Furthermore, realized variances are specified in logs to control for scaling. The HAR-RV structure used in our analysis is thus defined as

$$\log \mathrm{RV}^{(h)}_{t+h} = c + \beta^{(d)}\log \mathrm{RV}^{(d)}_t + \beta^{(w)}\log \mathrm{RV}^{(w)}_t + \beta^{(m)}\log \mathrm{RV}^{(m)}_t + \epsilon^{(h)}_{t+h} \tag{24}$$
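A minimal sketch of estimating (24) by OLS with Newey-West (HAC) standard errors, assuming a pandas Series of daily realized variance (the lag truncation is an illustrative choice, not the thesis's setting):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def fit_har_rv(rv: pd.Series, h: int = 1):
    """Estimate equation (24); rv is a positive daily RV series."""
    X = pd.DataFrame({
        "rv_d": np.log(rv),
        "rv_w": np.log(rv.rolling(5).mean()),    # weekly average, logged
        "rv_m": np.log(rv.rolling(22).mean()),   # monthly average, logged
    })
    # log of the average RV over days t+1 .. t+h, aligned to day t
    y = np.log(rv.rolling(h).mean()).shift(-h).rename("y")
    data = pd.concat([y, X], axis=1).dropna()
    model = sm.OLS(data["y"], sm.add_constant(data[["rv_d", "rv_w", "rv_m"]]))
    return model.fit(cov_type="HAC", cov_kwds={"maxlags": 5})
```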

The literature has widely documented that volatility tends to increase more after a negative return shock than after a positive return shock of the same magnitude. This phenomenon is known as the leverage effect. One economic explanation is that as the equity value of a company decreases, the debt-to-equity ratio (leverage) increases. The equity thereby becomes more risky, leading to returns on equity that are larger in magnitude.

Corsi & Renò (2012) also investigate the possibility of a persistent leverage effect, so that weekly and monthly returns contribute to volatility. To include the leverage effect in the model we define
$$r^{(h)}_t = \frac{1}{h}\sum_{j=1}^{h} r_{t-j+1}, \qquad r^{(h)-}_t = \min\left(r^{(h)}_t, 0\right)$$

To take into account daily, weekly and monthly negative returns, the term $\gamma^{(d)} r^{(d)-}_t + \gamma^{(w)} r^{(w)-}_t + \gamma^{(m)} r^{(m)-}_t$ is added to (24), thereby obtaining the LHAR-RV model:
$$\begin{aligned}\log \mathrm{RV}^{(h)}_{t+h} = c &+ \beta^{(d)}\log \mathrm{RV}^{(d)}_t + \beta^{(w)}\log \mathrm{RV}^{(w)}_t + \beta^{(m)}\log \mathrm{RV}^{(m)}_t \\ &+ \gamma^{(d)} r^{(d)-}_t + \gamma^{(w)} r^{(w)-}_t + \gamma^{(m)} r^{(m)-}_t + \epsilon^{(h)}_{t+h}\end{aligned} \tag{25}$$

The LHAR-RV structure in (25) allows persistent negative returns to be accounted for in their implications for volatility. In the later analysis sections the LHAR-RV model is used as a benchmark to compare volatility measures, and the additional explanatory variables investigated are added onto the LHAR-RV model. The reason for using LHAR-RV is that some variables might implicitly hold information about negative returns rather than providing a unique source of explanatory power.

3.3 LHAR-RV and Jumps

RV time series can be viewed as an aggregate of a continuous and a jump component with different properties and predictability. Separating the two allows more of the dynamics between the separate components to be measured. The OLS structure allows us to easily incorporate jumps as well as the continuous measure. Furthermore, following Busch et al. (2011), the jump series can also be used as a dependent variable to assess its predictability.

(22)

Below, the manner in which the variables, specified in logs, are aggregated over $h$ is defined. Note that jumps are aggregated instead of averaged:
$$C^{(h)}_t = \frac{1}{h}\sum_{j=1}^{h} C^{(d)}_{t-j+1}, \qquad J^{(h)}_t = \sum_{j=1}^{h} J^{(d)}_{t-j+1}$$

The LHAR-RV-CJ model, where CJ stands for continuous-jump, is defined as:
$$\begin{aligned}\log \mathrm{RV}^{(h)}_{t+h} = c &+ \beta^{(d)}\log C^{(d)}_t + \beta^{(w)}\log C^{(w)}_t + \beta^{(m)}\log C^{(m)}_t \\ &+ \alpha^{(d)}\log(1+J^{(d)}_t) + \alpha^{(w)}\log(1+J^{(w)}_t) + \alpha^{(m)}\log(1+J^{(m)}_t) \\ &+ \gamma^{(d)} r^{(d)-}_t + \gamma^{(w)} r^{(w)-}_t + \gamma^{(m)} r^{(m)-}_t + \epsilon^{(h)}_{t+h}\end{aligned} \tag{26}$$

Since the absence of a jump implies $J_t = 0$, we add 1 to our jump time series to be able to take logs.

Putting jumps as the dependent variable, we obtain the LHAR-J-CJ model:
$$\begin{aligned}\log(1+J^{(d)}_{t+1}) = c &+ \beta^{(d)}\log C^{(d)}_t + \beta^{(w)}\log C^{(w)}_t + \beta^{(m)}\log C^{(m)}_t \\ &+ \alpha^{(d)}\log(1+J^{(d)}_t) + \alpha^{(w)}\log(1+J^{(w)}_t) + \alpha^{(m)}\log(1+J^{(m)}_t) \\ &+ \gamma^{(d)} r^{(d)-}_t + \gamma^{(w)} r^{(w)-}_t + \gamma^{(m)} r^{(m)-}_t + \epsilon^{(h)}_{t+h}\end{aligned} \tag{27}$$

The LHAR-J-CJ model (27) judges the predictability of jumps, which Busch et al. (2011) and Corsi & Renò (2012) found to be unpredictable.
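For illustration, the regressors of (26)-(27) can be assembled from daily series of $C_t$, $J_t$ and returns as follows (a sketch with hypothetical inputs, not the thesis's code):

```python
import numpy as np
import pandas as pd

def cj_regressors(c: pd.Series, j: pd.Series, ret: pd.Series) -> pd.DataFrame:
    """Daily/weekly/monthly regressors for (26)-(27): continuous parts
    are averaged and logged, jumps are summed and enter as log(1 + J),
    and the leverage terms are min(mean return over h, 0)."""
    out = pd.DataFrame(index=c.index)
    for label, h in (("d", 1), ("w", 5), ("m", 22)):
        out[f"logC_{label}"] = np.log(c.rolling(h).mean())
        out[f"logJ_{label}"] = np.log1p(j.rolling(h).sum())
        out[f"rneg_{label}"] = np.minimum(ret.rolling(h).mean(), 0.0)
    return out
```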


4 Implied Volatility

4.1 Introduction

In this section we discuss the inclusion of implied volatility, as well as other information from the option market, as a source of explanatory power for volatility. The volatility of an asset's price process plays a large role in the pricing of options on that asset: market prices of options contain some belief about future volatility, needed to assess the distribution of the payoff of the option. Using a pricing model like Black-Scholes, an implied volatility can be backed out of an option's market price. Implied volatility can be viewed as the market's forecast of the integrated volatility of the asset over the time until maturity of the option contract, and thus can be used as a relevant predictor of future volatility.

Busch et al. (2011) use implied volatilities backed out of futures and forecast volatility for non-overlapping month-ahead periods in the HAR-RV model. They find that implied volatilities contain substantial incremental information about future volatility.

This thesis takes a more in-depth approach in that day-ahead instead of month-ahead volatility is forecast. Furthermore, multiple sources of information are investigated, including the implied volatility magnitude, the implied volatility surface (IVS) and the option Greeks, namely vega, gamma, theta and delta.

In the subsequent sections we discuss the methods used to extract information from the option market. In Section 4.2 we discuss methods used to extract information from the shape of the IVS; in Section 4.3 the option Greeks are discussed.

4.2 Implied Volatility Surface

4.2.1 Failure of Black-Scholes Model

Let $C(S, t)$ denote the value of a European call option at time $t$, with stock price $S_t$, strike price $K$ and interest rate $r$. Using the assumption that the asset price follows a geometric Brownian motion, the Black-Scholes (B-S) formula is given by
$$C(S,t) = S_t\,\Phi(d_1) - e^{-r(T-t)}K\,\Phi(d_2), \tag{28}$$
where $\Phi(\cdot)$ is the CDF of the standard normal distribution and
$$d_1 = \frac{\log\frac{S_t}{K} + (r+\sigma^2/2)(T-t)}{\sigma\sqrt{T-t}}, \qquad d_2 = d_1 - \sigma\sqrt{T-t}$$

$\sigma$ is the volatility parameter of the option pricing model and cannot be directly observed. Given observed market prices $C(S, t)$, the volatility parameter $\sigma$ can be backed out of the B-S model, giving us an implied volatility. In practice, the market usually works with implied volatility.

Although the B-S model is elegant, it fails to perform in practice: B-S implied prices and true market prices do not match. This failure can be directly observed in the shape of the IVS shown in Figure 5. The implied volatility surface captures the variation of implied volatilities across time to expiry and delta moneyness; moneyness is defined as strike price divided by spot price. As will be explained later, in our application delta ($\Delta$) is used rather than moneyness to formulate the IVS. If Black-Scholes held true, the IVS would be flat, independent of time to maturity and moneyness. In practice, the surface for equity options is asymmetric toward lower-moneyness options for a given maturity. This is referred to as the volatility skew. Furthermore, as time to maturity increases, implied volatility tends to converge to a constant; for very short-term maturities, however, the skew is stronger, with relatively higher implied volatility for low-moneyness options. The standard explanations for the volatility surface skew are:

• Stock price process is not a geometric Brownian motion with constant volatility. Prices jump with downward jumps being more frequent and larger in magnitude.

• Risk aversion: As prices decrease, fear sets in and volatility increases.

• Supply and demand: there is demand for deep out-of-the-money (OTM) puts because they are a way to hedge downside risk.

• Leverage effect: As equity value decreases, the debt-to-equity ratio (leverage) increases, leading to returns on equity that are larger in magnitude. Stocks tend to be more volatile at relatively lower prices.

As new information is continually processed by the market, the shape of the IVS continually changes over time to reflect changes in forward-looking views. Using regression models, day-to-day changes in the IVS can be captured.

The basis for using the shape of the IVS as an explanatory variable for realized volatility is as follows. Broadly speaking, daily changes in the IVS reflect daily market reassessments of factors that influence the financial market. Some of these factors relate to the realization of future volatility. Regression models provide a manner of extracting this information from the option market without imposing a stochastic volatility model.

Figure 5: Implied volatility surface (IVS) plots over delta and days to expiry on 2013-08-30; panel (a) shows the call IVS of the SPY, panel (b) the put IVS

4.2.2 Capturing the Implied Volatility Surface

A benchmark for capturing the IVS is the Practitioner Black-Scholes model of Christoffersen & Jacobs (2004). Implied volatility is the dependent variable, while moneyness $m$ and time to maturity $\tau$ are the independent variables:
$$\sigma_t(m, \tau) = \alpha_{0,t} + \alpha_{1,t}m + \alpha_{2,t}m^2 + \alpha_{3,t}\tau + \alpha_{4,t}\tau^2 + \alpha_{5,t}m\tau + \epsilon_t,$$

here moneyness $m$ is the strike price divided by the current spot price. In our application, delta ($\Delta$) is used instead of moneyness to formulate the IVS. The delta of an option is the sensitivity of the option price to a change in the price of the underlying asset. Delta is a better indicator than moneyness of how far in- or out-of-the-money an option is, for the following reason: consider a call option on a stock with strike price $K = 110$ and current stock price $S_t = 100$. This option is 10% OTM. How 'far' OTM this is depends partly on the expiry date of the option: if the option expires in a week, it would be considered further out of the money than if it were to expire in a year. Using delta controls for this via its sensitivity.

Based on Badshah (2008), the following regression models are employed on the daily implied volatility surface data:

$$\sigma_t(\Delta, \tau) = \alpha_{0,t} + \epsilon_t \tag{M1.1}$$
$$\sigma_t(\Delta, \tau) = \alpha_{0,t} + \alpha_{1,t}\Delta + \alpha_{2,t}\Delta^2 + \epsilon_t \tag{M1.2}$$
$$\sigma_t(\Delta, \tau) = \alpha_{0,t} + \alpha_{1,t}\Delta + \alpha_{2,t}\Delta^2 + \alpha_{3,t}\tau + \alpha_{4,t}\Delta\tau + \epsilon_t \tag{M1.3}$$
$$\sigma_t(\Delta, \tau) = \alpha_{0,t} + \alpha_{1,t}\Delta + \alpha_{2,t}\Delta^2 + \alpha_{3,t}\tau + \alpha_{4,t}\Delta\tau + \alpha_{5,t}\tau^2 + \epsilon_t \tag{M1.4}$$

The $\alpha_i$ from (M1.1)-(M1.4) are estimated daily, as the shape of the IVS adjusts over time. The idea is that those parameters hold some kind of information on future volatility not contained in the LHAR-RV model. Each model adds a degree of sophistication: the parameter from (M1.1) is simply the mean implied volatility, while (M1.4) takes multiple curvature parameters into account.

As stated, implied volatility can be seen as a forecast of the RV of the asset over the time until maturity of the option contract, so the time horizon of the option used matters. Since we are forecasting day-ahead realized volatility, it is important to consider which implied volatility, over which time horizon, is relevant.

For example, consider extracting information from an IVS including options expiring in up to 12 months. A question to consider is whether the 12-month-based IVS would hold any more information on realized volatility for the coming day than a 1-month-based implied volatility surface. One could argue that any factor that could affect volatility tomorrow would already be found in a nearer-term surface. This would imply that any information relevant for tomorrow's volatility is contained in the 1-month-ahead surface and not necessarily in the additional months of the 12-month-ahead surface.

The ideal would thus be to use implied volatility backed out of options that expire at the end of the next trading day. Day-ahead expiring options would contain only the information relevant for the chosen forecasting horizon. In practice, implied volatilities based on options expiring in the coming day do not exist on a daily basis (at least not for options which are exchange traded and for which historical information is obtainable).

Following this, we additionally investigate using only the nearest to expiry options to model skew. Hence two approaches are taken. The first approach is (M1.1)-(M1.4), it uses the implied volatility surface over 6 months. The second approach uses only options with the nearest expiry date modelling the behaviour seen in Figure 6. M2.1 and M2.2 below use only delta as an explanatory variable since one option expiration is taken.

σ_t(∆) = α_{0,t} + α_{1,t}∆ + α_{2,t}∆^2 + ε_t        (M2.1)


Figure 6: Implied volatility smirk plots over delta with 30 days to expiry, over 3 days. (a) Call implied volatility smirk of the SPY; (b) put implied volatility smirk of the SPY.

Using (M1.1)-(M1.4) is based on the day-to-day changes in the general shape of the implied volatility surface for options expiring within the coming 6 months. (6 months is chosen in relation to the number of options expiring in that horizon: after the 6-month horizon, the next expiry is 12 months out, which is deemed too far away, relative to the closeness of the shorter expiries, to be of value as an extra data point.) Using (M2.1) and (M2.2) is based on the idea that longer-term options might not contain any additional information about tomorrow's volatility that is not already captured by a shorter-term option. Using these two model sets we investigate whether these day-to-day changes contain any additional explanatory information.

Models (M1.1)-(M1.4) are added into the LHAR-RV model by including their estimated coefficients in the regression. For example, the parameters α̂_{i,t}, i = 0, 1, 2 from (M1.2) are included in the LHAR-RV-M2 model:

log RV_{t+h}^{(h)} = c + β^{(d)} log RV_t^{(d)} + β^{(w)} log RV_t^{(w)} + β^{(m)} log RV_t^{(m)}
                     + γ^{(d)} r_t^{(d)-} + γ^{(w)} r_t^{(w)-} + γ^{(m)} r_t^{(m)-}
                     + π_1 α̂_{0,t} + π_2 α̂_{1,t} + π_3 α̂_{2,t} + ε_{t+h}^{(h)}        (29)
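A minimal sketch of how regression (29) can be assembled and estimated is given below. The 5- and 22-day rolling means for the weekly and monthly components, and the construction of the leverage terms r^{(·)-} as the negative part of the aggregated return, are our assumptions in the spirit of Corsi (2009); all series and column names are illustrative.

    import numpy as np
    import pandas as pd

    def lhar_rv_m2(log_rv: pd.Series, ret: pd.Series,
                   alphas: pd.DataFrame, h: int = 1) -> np.ndarray:
        """OLS estimate of the LHAR-RV-M2 regression (29) for horizon h.

        log_rv: daily log realized variance; ret: daily returns;
        alphas: daily alpha-hat estimates from (M1.2), columns alpha0..alpha2.
        All three are indexed by trading date.
        """
        X = pd.DataFrame(index=log_rv.index)
        X["rv_d"] = log_rv                                  # daily component
        X["rv_w"] = log_rv.rolling(5).mean()                # weekly component
        X["rv_m"] = log_rv.rolling(22).mean()               # monthly component
        X["r_d"] = ret.clip(upper=0.0)                      # r_t^(d)-
        X["r_w"] = ret.rolling(5).mean().clip(upper=0.0)    # r_t^(w)-
        X["r_m"] = ret.rolling(22).mean().clip(upper=0.0)   # r_t^(m)-
        X = X.join(alphas[["alpha0", "alpha1", "alpha2"]])
        # Dependent variable: average log RV over the next h days
        y = log_rv.rolling(h).mean().shift(-h).rename("y")
        data = X.join(y).dropna()
        Z = np.column_stack([np.ones(len(data)), data.drop(columns="y").to_numpy()])
        beta, *_ = np.linalg.lstsq(Z, data["y"].to_numpy(), rcond=None)
        return beta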

4.3 Greeks

4.3.1 Overview

We now discuss the sensitivity of option prices to a number of parameters, known as the Greeks. The Greeks are used in the LHAR-RV model to investigate whether the option market holds additional explanatory power for volatility. Delta, gamma, vega and theta are considered. These Greeks can supplement LHAR-RV with the option market's forward-looking nature and give additional insight into current market conditions. An overview of the Greeks is given, with a link to volatility.

Furthermore, to accompany the theory, Figures 7(a)-(d) show values for delta, gamma, vega and theta over a 14-year period for a standardized 30-day at-the-money American put option on the SPY. The values are interpolated so that we consider a put option that is perpetually at the money with 30 days until expiry.

Figure 7 allows a link to be drawn between what we know about the Greeks and the events that have taken place in the financial markets over the last 14 years.

The delta of an option is the sensitivity of the option price f to a change in the price S of the underlying asset:

∆_t = ∂f/∂S

Delta, like all Greeks, is affected by a number of factors. In the Black-Scholes setting, for example, an expression for the delta of a European put can be derived analytically to assess how it changes with variations in other variables. For an American option an exact expression is not attainable; however, many of the intuitions are the same for European and American options. For a European at-the-money put option, so that ln(S_t/K) = ln(1) = 0, with value P, we have:

∆_t = ∂P/∂S = −e^{−q(T−t)} Φ( −(r − q + σ^2/2)(T − t) / (σ√(T − t)) )

Here q is the dividend yield. The delta of a put option is negative, since an increase in the asset price decreases the option's value. For a call option the opposite applies.

In Figure 7a, a strong increase in delta for an at-the-money American put option is observed as the credit crisis hit in 2008. This implies that the price of the option was less sensitive to a change in the underlying asset during this period. The reason is that higher volatility increases delta (moves it toward zero) for an at-the-money put option. Mathematically, the distribution of a lognormal variable is not symmetric: the mean of the lognormal distribution increases as volatility increases, which means there is a smaller chance that the put will be in the money at expiration. Furthermore, a lowered dividend yield further increases delta, since fewer asset-price-decreasing dividends will be paid.
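As an illustration of the volatility effect on delta, the snippet below evaluates the European at-the-money put delta formula above for increasing volatilities. The rate r = 3% and dividend yield q = 2% are arbitrary example values, and the European formula is only an approximation to the American delta plotted in Figure 7a.

    from math import exp, sqrt
    from statistics import NormalDist

    def atm_put_delta(sigma: float, tau: float = 30 / 365,
                      r: float = 0.03, q: float = 0.02) -> float:
        """Black-Scholes delta of a European at-the-money put, ln(S/K) = 0."""
        d1 = (r - q + 0.5 * sigma ** 2) * tau / (sigma * sqrt(tau))
        return -exp(-q * tau) * NormalDist().cdf(-d1)

    # Delta moves toward zero as volatility rises, as seen during 2008:
    for sigma in (0.10, 0.20, 0.40, 0.80):
        print(f"sigma = {sigma:.2f}  ->  delta = {atm_put_delta(sigma):+.3f}")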

The gamma of an option is the sensitivity of the option's delta to a change in the price of the underlying asset:

Γ_t = ∂^2 f/∂S^2

A higher gamma implies a relatively stronger delta reaction to a movement in the price. For an at-the-money option, lower volatility implies a higher gamma. This is because when volatility is high, the market has already 'priced' it into the option value; a small change in the asset price will thus have a larger impact on the option under low volatility than under high volatility. Thus, ceteris paribus, a higher gamma can be a sign that the market expects less extreme price movements. This effect is visible in Figure 7b in relation to the 2008 market uncertainty.

Vega is the sensitivity of the option price to a change in volatility:

ν_t = ∂f/∂σ

A higher vega implies that the option price is more sensitive to a change in the volatility of the underlying price process, and hence that the option value is relatively more governed by volatility than by its intrinsic value (the difference between strike price and asset price, which is governed by delta). If vega is high, it can be a sign that the market does not expect volatility to shift substantially: if the market did expect a substantial movement in volatility, traders would adjust their portfolios in such a way as to protect them against that movement. The adjustment implies an overall re-pricing, so that when volatility does change, its effect is smaller. This mechanism can be seen in Figure 7c, where vega is low during periods of turmoil or uncertainty.


Theta is the sensitivity of the option price to a negative change in time to maturity; it is also referred to as the time decay of the option:

Θ_t = −∂f/∂(T − t)

Theta is usually negative for an option since, as time passes with all else constant, the option tends to become less valuable. Theta for a longer-term option is close to zero, while it is large in absolute value for a short-term option: with less time remaining until expiry, the time value decays faster. Options on assets with higher volatility have a higher theta in absolute value, since their time value is worth relatively more; higher volatility implies a higher chance of propelling the option into higher payoffs. This effect can also be seen in Figure 7d in relation to the increased volatility during the 2008 market crash, when time value increased via a higher theta in absolute value.

4.3.2 Implementation with HAR-RV

Daily values for the Greeks are added to the RV model as explanatory variables, yielding the LHAR-RV-Λ model (30), where Λ signifies the use of option price sensitivities.

log RV_{t+h}^{(h)} = c + β^{(d)} log RV_t^{(d)} + β^{(w)} log RV_t^{(w)} + β^{(m)} log RV_t^{(m)}
                     + γ^{(d)} r_t^{(d)-} + γ^{(w)} r_t^{(w)-} + γ^{(m)} r_t^{(m)-}
                     + π_1 ∆_t + π_2 Γ_t + π_3 ν_t + π_4 Θ_t + ε_{t+h}^{(h)}        (30)


Figure 7: Greeks of a standardized 30-day at-the-money American put option on the SPY over 14 years. (a) Delta; (b) Gamma; (c) Vega; (d) Theta.

5 Data

The underlying asset used for the analysis in this thesis is an exchange traded fund (ETF) that tracks the level of the S&P500 index. It is traded under the SPY symbol while its options are traded under SPX. In section 5.1 we describe the underlying trade data used and its cleaning. Section 5.2 discusses the option data used in the analysis.

5.1 High Frequency Trade Data

5.1.1 Data Description

The underlying asset used is the SPY, an exchange traded fund that tracks the level of the S&P500 index. The S&P500 is a capitalization-weighted index of 500 stocks from a broad range of industries, covering 80% of the total US equity market capitalization, which makes it a good proxy for the total US stock market. The SPY has a daily volume of roughly 115 million shares, making it the most heavily traded exchange traded fund in the financial market.

The source of the high frequency trade data is the Trade and Quote (TAQ) database on Wharton Research Data Services and ranges from 1999-09-01 until 2013-08-01. The raw dataset provides trade information of all SPY trades occurring on various stock exchanges at a 1-second frequency. Below, in table 2, a snippet of 2 seconds of raw trade data of SPY on the 4th of September 2007 is presented.

SYMBOL  DATE      TIME     PRICE   SIZE  G127  CORR  COND  EX
SPY     20070904  9:30:00  147.47   800     0     0        P
SPY     20070904  9:30:00  147.46   100     0     0  T     T
SPY     20070904  9:30:00  147.47   200     0     0        P
SPY     20070904  9:30:00  147.45   100     0     0        P
SPY     20070904  9:30:00  147.45   100     0     0  T     T
SPY     20070904  9:30:00  147.45  1000     0     0  F     T
SPY     20070904  9:30:00  147.45   500     0     0  F     T
SPY     20070904  9:30:00  147.45   400     0     0  F     T
SPY     20070904  9:30:00  147.45   500     0     0  F     T
SPY     20070904  9:30:00  147.46  1000     0     0  F     T
SPY     20070904  9:30:00  147.46   500     0     0  F     T
SPY     20070904  9:30:00  147.45   100     0     0  T     T
SPY     20070904  9:30:01  147.45  1000     0     0  F     P
SPY     20070904  9:30:01  147.45   500     0     0  F     D
SPY     20070904  9:30:01  147.46   300     0     0  F     D
SPY     20070904  9:30:01  147.46   100     0     0  F     D
SPY     20070904  9:30:01  147.46   500     0     0  F     P
SPY     20070904  9:30:01  147.46  1000     0     0  F     P
SPY     20070904  9:30:01  147.46   800     0     0  F     P
SPY     20070904  9:30:01  147.46   800     0     0  F     P

Table 2: Snippet of 2 seconds of raw trade data for the S&P500 tracker on the 4th of September 2007

Next to price, date and time, table 2 contains a number of additional variables important for cleaning: SIZE indicates the volume of shares traded, G127 is a combined indicator for 'G' trades, Rule 127 trades and stopped stock trades, and COND is a description of the sale condition.



CORR indicates whether a correction has occurred, and EX identifies the exchange on which the trade occurred; for example, 'T' refers to the NASDAQ exchange, 'M' to the Chicago Stock Exchange and 'P' to the Pacific Exchange.

The raw data set is 32GB in size, covering 771,735,763 trades. The database management program SQLite was used to handle the raw data, since R/Matlab could not perform the required operations within a reasonable time.
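As an indication of the workflow, the snippet below shows how such a table can be set up and queried per day through Python's sqlite3 interface. The schema simply mirrors the TAQ fields of table 2; the database file name and table layout are illustrative, not the exact setup used.

    import sqlite3

    con = sqlite3.connect("taq_spy.db")  # illustrative file name
    con.execute("""CREATE TABLE IF NOT EXISTS trades (
        symbol TEXT, date INTEGER, time TEXT, price REAL,
        size INTEGER, g127 INTEGER, corr INTEGER, cond TEXT, ex TEXT)""")
    # An index on the date column keeps per-day extracts fast on a 32 GB table.
    con.execute("CREATE INDEX IF NOT EXISTS idx_trades_date ON trades(date)")

    # Pull a single trading day into memory for cleaning:
    day = con.execute(
        "SELECT time, price, size, corr, cond, ex FROM trades WHERE date = ?",
        (20070904,)).fetchall()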

In 1999 relatively few trades in the SPY occurred per second compared to today: roughly 60kB of raw trade data constitutes a day's worth in 1999, versus roughly 20MB in 2013.

5.1.2 Cleaning the High Frequency Trade Data

Barndorff-Nielsen et al. (2009) outline the importance of accurate data cleaning for trade and quote data. In their analysis they show the biases and inconsistencies that may arise through improper use of some sections of the data, and they recommend the steps below before any volatility analysis is applied. In their procedure, Barndorff-Nielsen et al. (2009) differentiate their rules based on whether the data is trade data or quote data. For this analysis only the rules applied to trade data are relevant. Here P refers to cleaning steps that are applied to both trade and quote data, while steps labelled T are applied only to trade data.

The cleaning procedure for high frequency trade data of Barndorff-Nielsen et al. (2009) is given as:

P1 Delete entries with a time stamp outside the 9:30 am - 4 pm window when the exchange is open

P2 Delete entries with a transaction price equal to zero

P3 Retain entries originating from a single exchange

T1 Delete entries with corrected trades (where CORR ≠ 0)

T2 Delete entries with an abnormal sale condition (trades with a letter code other than 'E' or 'F')

T3 If multiple transactions have the same time stamp, use the median price

T4 Delete entries for which the price deviated by more than 10 mean absolute deviations from a rolling centered median (excluding the observation under consideration) of 50 observations (25 observations before and 25 after)

P1 limits the data to trades recorded when the exchange is actually open. P3 is applied due to time delays between trades and quotes occurring on different exchanges within each day. T1 removes serious errors in the database, such as misrecorded prices. T3 limits the data to one trade per second so that RV measures can be applied. T4 removes outliers not already removed by T3.
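A condensed sketch of how steps P1-P2 and T1-T4 can be implemented for a single day of trades is given below, assuming pandas and illustrative column names. As a simplification, T4 is approximated here: the centre observation is not excluded from its own rolling window, and the mean absolute deviation is taken around the rolling median.

    import pandas as pd

    def clean_day(day: pd.DataFrame) -> pd.Series:
        """Apply P1-P2 and T1-T4 to one day of trades; returns a 1-second price series."""
        day = day.set_index(pd.to_datetime(day["timestamp"]))
        day = day.between_time("09:30", "16:00")                 # P1
        day = day[day["price"] > 0]                              # P2
        day = day[day["corr"] == 0]                              # T1
        day = day[day["cond"].fillna("").isin(["", "E", "F"])]   # T2
        px = day["price"].groupby(level=0).median()              # T3
        # T4 (approximate): drop prices > 10 MADs from a centred rolling median
        med = px.rolling(51, center=True, min_periods=1).median()
        mad = (px - med).abs().rolling(51, center=True, min_periods=1).mean()
        return px[(px - med).abs() <= 10 * mad]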

A number of adjustments were made to these steps to be more robust toward exceptions and special scenarios. We discuss these below before presenting the final cleaning algorithm used.

In their application of P3, Barndorff-Nielsen et al. (2009) use a single exchange (the New York Stock Exchange) over a limited time span. Application of P3 to the 14-year data set used in this thesis results in a large number of missing days, due to the fact that the most active exchange changes over time. To account for this we choose the most active exchange per day, i.e. keeping only trades from the exchange which was most active on that particular day, as sketched below. By taking this more adaptive condition we assume that the nature of market microstructure is the same for each exchange across days.
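The per-day exchange selection can then be a one-liner; here 'most active' is taken as the largest number of trade records, which is our reading (trade count rather than share volume):

    import pandas as pd

    def most_active_exchange(day: pd.DataFrame) -> pd.DataFrame:
        """Retain only trades from the busiest exchange of that day (adaptive P3)."""
        busiest = day["ex"].value_counts().idxmax()  # exchange with most trades
        return day[day["ex"] == busiest]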
