
The predictability of the implied volatility index : evidence from Hong Kong market


Academic year: 2021


MSc Finance

Track: Quantitative Finance

Master Thesis

The predictability of the implied volatility index:

Evidence from Hong Kong market

By

Junjian Zhang

11677864

June – 2018


Statement of Originality

This document is written by Student Junjian Zhang who declares to take full

responsibility for the contents of this document.

I declare that the text and the work presented in this document are original and that

no sources other than those mentioned in the text and its references have been used

in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of

completion of the work, not for the contents.


Contents

Abstract
Introduction
Literature Review
    Implied volatility
    Realized Volatility (RV)
    Volatility index and market return
Data and Variables
    Hang Seng Index (HSI)
    Realized Volatility (RV)
    HSI Volatility Index (VHSI)
    Summary statistics
    Time series properties
Methodology and hypothesis
    Realized vs. Implied volatility
    Implied Volatility vs. HSI return
    Robustness – Rolling Window
Conclusion


Abstract

The volatility index is a widely used indicator of market risk. In this paper, using data on the Hang Seng Index (HSI) and its corresponding volatility index (VHSI), we apply several approaches to investigate the relationship between them and the predictability of the VHSI (implied volatility) in the Hong Kong market. Empirically, we find that implied volatility is an efficient but biased forecast of future realized volatility. We also find a strong negative correlation between the index return and its implied volatility, and this correlation is asymmetric, depending on the state of the HSI. Moreover, there is no empirical evidence that implied volatility is an efficient predictor of the future index return.


Introduction

For investors, predicting upcoming market risks and adopting suitable strategies to hedge them is particularly important. The most widely used measure of risk is volatility, which is the standard deviation of the returns of an underlying asset or security. In general, the higher the volatility, the riskier the underlying asset. Volatility plays a leading role in derivative pricing, in the design of arbitrage and hedging strategies, and in diversifying the dynamic risk of a portfolio. Therefore, predicting volatility is essential in financial practice and in the related research. There are mainly two types of volatility prediction models. One is the backward-looking model, which uses historical data to predict future volatility, such as the commonly used GARCH family models and stochastic volatility models. The other is the forward-looking model, in which an option pricing model, usually the Black-Scholes (BS) model (Black & Scholes, 1973), is inverted to obtain the volatility implicit in option prices and to predict future volatility. Volatility obtained from a forward-looking model is also known as implied volatility. Numerous empirical studies compare these two types of models; see (Day & Lewis, 1992) (Christensen & Prabhala, 1998) (Blair, Poon, & Taylor, 2001) (Busch, Christensen, & Nielsen, 2011), among others. The mainstream view is that the implied volatility of options represents the market's expectation of future volatility, so it does contain some information about future volatility. However, whether it subsumes all the information in the volatility forecasts of GARCH family models is inconclusive. Moreover, there is more than one option pricing model from which implied volatility can be extracted. Which of these models yields an implied volatility that contains more information than the others and provides a more accurate forecast of future volatility is also inconclusive.

Overall, it is neither convenient nor efficient for investors, regardless of their own capabilities, to examine all option pricing models and test the forecasting performance of each implied volatility when they are trying to predict future market performance and risk. Hence, a more convenient and time-saving way to forecast is to obtain the implied volatility directly from the volatility index published by the exchange. The purpose of this paper is to examine the predictability of this implied volatility. The data used for the empirical analysis are from the Hong Kong market. More specifically, we obtain the daily Hang Seng Index (HSI) level and its corresponding volatility index value (implied volatility). The actual index volatility cannot be observed and collected directly, so realized volatility constructed from the intra-day HSI level at 5-minute frequency is used as a proxy. The Hang Seng Index plays an important role in the financial markets of Mainland China and Asia. Its close connection with Mainland China can provide valuable experience for the future development of the Mainland derivatives market. To sum up, the goal of this paper is to test:

1. Whether the implied volatility obtained from the volatility index of the Hang Seng Index (VHSI) contains information about future realized volatility.

2. Whether the implied volatility obtained from the VHSI is correlated with the HSI and can be used to predict the index return.

The methodology applied in this paper is adapted from the research of Christensen and Prabhala (1998). Their study, “The relation between implied and realized volatility”, was the first to show that implied volatility is an efficient and unbiased forecast of future realized volatility, and it is a popular reference for many later studies of the information content of implied volatility. They computed implied volatility by inverting the BS model, which is analogous to the way the VHSI is computed. Note that the VHSI methodology is based on the BS model but with adjustments; more detail is given in the “Data and Variables” section later in this paper. In addition, they explored the time series properties of both implied and realized volatility, which has barely been done for the Hang Seng Index.

This paper does not create or investigate a new method of option pricing or of computing implied volatility. Instead, it investigates an existing volatility index that is widely used by various types of investors. The contribution of this paper is twofold. First, it is the first to explore the time series properties of both implied and realized volatility for the HSI. Second, it adds empirical evidence to the existing literature on the topic, focused on the Hong Kong market.


The remainder of this paper is organized as follows: Section 2, Literature Review; Section 3, Data and Variables; Section 4, Methodology; and Section 5, Conclusion.

Literature Review

Implied volatility

Implied volatility was first mentioned by (Black & Scholes, 1973) when they inverted the BS option pricing model to obtain the volatility implicit in option prices. Since the option price represents market investors' expectation of the future price distribution of the underlying asset, implied volatility is widely regarded as better than historical volatility at reflecting future price volatility. However, the BS model assumptions are very strict, and there is a big gap between these assumptions and the real world, which inevitably results in deviations when future volatility is predicted with implied volatility. This drew the attention of many researchers, who found further reasons for the deviation beyond violations of the BS model assumptions.

Among the earlier studies, (Christensen & Prabhala, 1998) demonstrated that measurement errors are a core source of the bias. Furthermore, they showed that the measurement errors were primarily due to misspecification of the option model and bid-ask spreads in option prices. After correcting the measurement error with the instrumental variable technique, they concluded that implied volatility is an efficient and unbiased forecast. Sample selection and overlapping samples were additional factors responsible for the bias in implied volatility forecasts (Engle & Rosenberg, 2000) (Hansen, Prabhala, & Christensen, 2001). Another remarkable finding on the sources of the bias was the volatility risk premium. Approximating the expectation of realized volatility with the previous implied volatility requires risk-neutral valuation. This means that volatility risk either can be hedged or is not systematic. Some studies focused on the robustness of this assumption and found empirically that violation of the risk-neutral valuation assumption is one of the primary sources of the bias in implied volatility (Poteshman, 2000) (Benzoni, 2002) (Bollerslev & Zhou, 2006) (Chernov, 2007).


model to improve or replace the BS implied volatility. Based on this, some studies improve the model by incorporating stochastic volatility and jumps to generate a more realistic distribution of the returns of the underlying asset (Bates, 1996) (Bakshi, Cao, & Chen, 1997), while others apply a non-parametric method to compute a model-free volatility that does not depend on any option pricing model (Britten-Jones & Neuberger, 2000) (Jiang & Tian, 2005). Most of them claim that the alternative implied volatility predicts future volatility more precisely than the BS implied volatility, but the debate is still inconclusive.

Realized Volatility (RV)

(Merton, 1980) was the first to derive realized volatility from the high-frequency sample variance of historical stock return data. Following this method, (Poterba & Summers, 1986) used the variance of daily return data to estimate monthly realized volatility. Given that actual volatility cannot be observed, realized volatility is used as an alternative measure to analyze the predictive power of the volatility index, since RV is a more precise indicator of daily volatility than the daily squared return (Christoffersen, 2003). Also, as studied by (Andersen & Bollerslev, Answering the skeptics: Yes, standard volatility models do provide accurate forecasts., 1998), with high-frequency data, using intraday returns at 5-minute frequency to calculate realized variance is better than using higher frequencies (e.g. 1-minute or higher), since 5-minute returns are not influenced by microstructure noise and still yield a reasonable estimate of the quadratic variation. Further research by (Andersen, Bollerslev, Diebold, & Labys, 2001) showed empirically that realized volatility is consistent with actual volatility under the assumptions of zero mean and a large sample of asset returns. For forecasting applications of RV, see also (Martens, 2002) (Thomakos & Wang, 2003) (Pong, Shackleton, Taylor, & Xu, 2004) (Koopman, Jungbacker, & Hol, 2005) (Maheu & McCurdy, 2011).

Volatility index and market return

Since the Chicago Board Options Exchange (CBOE) launched the Volatility Index (VIX) based on S&P 100 at-the-money options in 1993, the VIX has become a major indicator of future market volatility. A substantial body of research has examined the relationship between the VIX and the stock market's return. As a co-founder of the VIX, (Whaley, Derivatives on Market Volatility: Hedging Tools Long Overdue, 1993) first pointed out the negative relationship between the S&P 100 and its volatility index. This negative relationship is asymmetric. That is, the stock market responds more strongly to a rise in the volatility index than to a decline. A similar conclusion was drawn again by (Whaley, 2000). In addition, Fleming et al. tested the performance of the VIX in forecasting the future realized volatility of the S&P 100 and found it significantly efficient. (Maggie & Thomas, 1999) studied the relation between the VIX and stock market returns, showing that the VIX can be used as an indicator of stock market returns. When the value of the VIX increases, the future yield of a portfolio of large-cap stocks is higher than that of a portfolio of small-cap stocks, and vice versa. (Traub, Ferreira, McArdle, & Antognelli, 2000) explored the relationship between the VIX and cross-market returns. When the VIX is significantly higher than normal, stock markets outperform bond markets over the following six months, and vice versa. (Fernandes, Medeiros, & Scharth, 2014) added more empirical evidence in favor of the negative relationship between the VIX and the S&P 500. They also found that the VIX was, in the long run, negatively affected by both the term spread and the value of the US dollar.

Data and Variables

This empirical study includes 3 main variables: the daily level of the HSI; its implied volatility, which is the daily value of the VHSI; and its intra-day (5-minute) realized volatility. The sample contains 1634 non-overlapping observations spanning from 17 Aug. 2011 to 10 Apr. 2018. Unlike some previous studies (French, Schwert, & Stambaugh, 1987) (Christensen & Prabhala, 1998) (Imlak & Puja, The information content of implied volatility index, 2013), we do not include financial crisis periods like 1983 and 2008, because HKEX only began publishing the volatility index of the HSI in August 2011. The data are collected from two sources. The daily closing price of the HSI and the value of the VHSI are downloaded from the ‘Investing’ website, while the realized volatility is collected from the Realized Library of the Oxford-Man Institute.


Hang Seng Index(HSI)

The object of this empirical study is the Hang Seng Index, which is widely used as a performance indicator for the Hong Kong stock market as well as for Asian stock markets. As stated on the website of the Hong Kong Exchange, the index is calculated with a market-capitalization-weighted approach. Companies with larger market capitalization carry more weight than those with smaller capitalization, which is similar to the S&P 500. Constituent stocks cover 4 main classes: 1. Industry and Commerce (21), 2. Finance (12), 3. Properties (13) and 4. Utilities (5), for a total of 51 representative listed stocks, accounting for 57.68% (see footnote 1) of the market value of all listed Hong Kong stocks as reported in the 2017 annual report of HKEX. The HSI became the main indicator of the Hong Kong market due to the strong representativeness of its constituent stocks, its high calculation frequency and its good continuity. With growing interest in the Hong Kong stock market in the 1980s, the demand for related derivatives also rose. The Hong Kong Futures Exchange launched the Hang Seng Index futures contract in May 1986 and the Hang Seng Index options contract in March 1993. These contracts provide investors with more effective tools to manage portfolio risks and capture arbitrage opportunities.

HSI option contracts are cash-settled at the expiration date with a contract multiplier of 50 Hong Kong dollars per index point. There are 11 groups of standard option contracts of various durations available for trading, both short-term and long-term. The short-term contracts comprise the spot month, the next three calendar months and the next three calendar quarter months, while the long-term contracts comprise the next five months of June and December. For flexible contracts, investors can choose any calendar month, but not further out than the longest expiry month available for trading. The expiry day is the trading (business) day immediately preceding the last trading day of the contract month. Daily trading hours are 9:15 am to 12:00 noon and 1:00 pm to 4:30 pm (closing at 4:00 pm on the last trading day). For the purposes of this study, the close-to-close daily return of the HSI is used as the dependent variable to examine the predictive power of implied volatility (VHSI) for market performance.

Footnote 1: Relevant information can be found in the annual report of the Hong Kong Exchange:

Realized Volatility(RV)

The time series of the ex-post estimate of daily variance at 5-minute frequency is downloaded directly from the Realized Library of the Oxford-Man Institute of Quantitative Finance (see footnote 2). The method behind it is as follows:

$$\text{Realized Variance}_t = \sum_{j=1}^{m/s} R_{t,js}^{2}$$ (3)

where

$$R_{t,js} = r_{t_{js},t} - r_{t_{(j-1)s},t}$$ (4)

$t_{j,t}$ is the time of trades or quotes on the $t$-th day, while $R_{t,js}$ is the intraday return on day $t$. $s$ equals 5 minutes in our case. The realized variances need to be converted to annualized realized volatility for further analysis:

$$RV_t = \sqrt{\text{Realized Variance}_t \times 252}$$ (5)
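Equations (3) to (5) can be sketched in a few lines of Python. This is only an illustration, not the Oxford-Man Institute's implementation: it assumes that $r$ in equation (4) denotes the log index level sampled every 5 minutes (the text does not state this explicitly), and the function name `annualized_realized_vol` is hypothetical.

```python
import math


def annualized_realized_vol(log_prices, trading_days=252):
    """Annualized realized volatility from one day's 5-minute log index levels.

    Assumption: `log_prices` holds log index levels on an evenly spaced
    5-minute grid for a single trading day.
    """
    # Equation (4): intraday returns are differences of consecutive log levels.
    returns = [b - a for a, b in zip(log_prices, log_prices[1:])]
    # Equation (3): realized variance is the sum of squared intraday returns.
    realized_variance = sum(r * r for r in returns)
    # Equation (5): annualize with 252 trading days and take the square root.
    return math.sqrt(realized_variance * trading_days)
```

Multiplying the result by 100 would express it in percentage points, which is the unit used for the volatilities reported in Table 1.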

HSI Volatility Index(VHSI)

Similar to the VIX of the Chicago Board Options Exchange (CBOE), the VHSI measures the expected volatility of the HSI over the next 30 calendar days using the prices of HSI options traded on the Hong Kong Stock Exchange (HKEX). The calculation method is based on the VIX of the CBOE in the US market but is adapted to the trading characteristics of HSI options in the Hong Kong market. The volatility index selects HSI put and call options in the two nearest expiry months (near-term and next-term) to calculate the volatility for the next 30 calendar days. To minimize the impact of abnormal pricing of options that are about to expire, the volatility index rolls from the first and second contract months to the second and third contract months on the third trading day prior to the expiry of the near-term options. The formula to calculate the VHSI is given below.

Footnote 2: The Oxford-Man Institute's Realized Library contains daily non-parametric measures of how volatile financial assets or indexes were in the past. For more, visit https://realized.oxford-man.ox.ac.uk/

$$\sigma^2 = \frac{N_y}{N_{30}} \left\{ T_1 IV_1^2 \left[ \frac{N_{T_2} - N_{30}}{N_{T_2} - N_{T_1}} \right] + T_2 IV_2^2 \left[ \frac{N_{30} - N_{T_1}}{N_{T_2} - N_{T_1}} \right] \right\}$$ (6)

and $VHSI = \sigma \times 100$

where $\sigma$ is the 30-day expected volatility; $IV_1$ and $IV_2$ are the implied volatilities derived from the near-term and next-term options respectively; $N_y$ is the number of days in one year and $N_{30}$ equals 30 days; $N_{T_1}$ and $N_{T_2}$ are the numbers of days to expiration of the near-term and next-term options; $T_1$ and $T_2$ are the times to expiration, in years, of the near-term and next-term options (equal to $N_{T_1}$ divided by $N_y$ and $N_{T_2}$ divided by $N_y$). The VHSI is the expected volatility $\sigma$ expressed in percentage.
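The interpolation in equation (6) can be sketched as follows. The function name `vhsi_from_term_vols` and the default of 365 days for $N_y$ are assumptions made for illustration; the text only says that $N_y$ is the number of days in one year.

```python
import math


def vhsi_from_term_vols(iv1, iv2, n_t1, n_t2, n_30=30, n_y=365):
    """Interpolate near-term and next-term implied volatilities to 30 days.

    iv1, iv2   : implied volatilities as decimals (0.20 means 20%)
    n_t1, n_t2 : days to expiration of the near-term and next-term options
    n_y        : days in one year (365 is an assumption, see the lead-in)
    """
    t1 = n_t1 / n_y  # T1 in years
    t2 = n_t2 / n_y  # T2 in years
    w1 = (n_t2 - n_30) / (n_t2 - n_t1)  # weight on the near-term variance
    w2 = (n_30 - n_t1) / (n_t2 - n_t1)  # weight on the next-term variance
    variance = (n_y / n_30) * (t1 * iv1 ** 2 * w1 + t2 * iv2 ** 2 * w2)
    return 100.0 * math.sqrt(variance)  # VHSI = sigma * 100
```

Two sanity checks follow directly from the formula: when both term volatilities are equal, the VHSI equals that common value, and when the near-term option has exactly 30 days to expiration, all the weight falls on $IV_1$.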

Implied volatility, $IV_1$ and $IV_2$, is derived by using the Black-Scholes model with parameters that value an option whose payoff depends on the underlying risky asset, which in our case is the HSI. The formula (see footnote 3) is the following:

$$IV^2 = \frac{2}{T} \sum_i \frac{\Delta K_i}{K_i^2} e^{RT} Q(K_i) - \frac{1}{T} \left[ \frac{F}{K_0} - 1 \right]^2$$ (7)

where

$$\Delta K_i = \frac{K_{i+1} - K_{i-1}}{2}$$ (8)

and

$$F = K + e^{RT} (C_K - P_K)$$ (9)

$T$ is the time to expiration, counted in days and expressed in years as in equation (6); $K_i$ is the strike price of the $i$-th selected option. The selected option is a call if $K_i$ is larger than $K_0$, a put if $K_i$ is smaller than $K_0$, and both a put and a call if they are equal. $K_0$ is the strike price nearest the forward index level $F$, which is calculated in equation (9). $\Delta K_i$ is the interval between strike prices, that is, half the distance between the strikes on either side of $K_i$. $Q(K_i)$ is the mid-price of each option with strike $K_i$. $R$ is the risk-free rate (see footnote 4) until expiration. $C_K$ and $P_K$ are the mid-prices at strike $K$ for the call and put options respectively.

Footnote 3: For more detail, the VHSI methodology can be found here: https://www.hsi.com.hk/static/uploads/contents/en/dl_centre/methodologies/IM_volatilityindexe.pdf
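Equations (7) to (9) can be sketched as below. The function names are hypothetical, and one detail is an assumption that the text does not cover: for the lowest and highest selected strikes, where equation (8) has no neighbour on one side, the sketch uses the distance to the single adjacent strike, as in the CBOE VIX convention.

```python
import math


def forward_level(strike, call_mid, put_mid, rate, t):
    """Equation (9): forward index level from put-call parity at one strike."""
    return strike + math.exp(rate * t) * (call_mid - put_mid)


def implied_variance(strikes, mid_prices, forward, k0, rate, t):
    """Equation (7) for a single expiry.

    strikes    : selected strikes in ascending order (at least two)
    mid_prices : Q(K_i), the option mid-price at each selected strike
    t          : time to expiration in years
    """
    n = len(strikes)
    total = 0.0
    for i, (k, q) in enumerate(zip(strikes, mid_prices)):
        if i == 0:
            delta_k = strikes[1] - strikes[0]      # edge case (assumption)
        elif i == n - 1:
            delta_k = strikes[-1] - strikes[-2]    # edge case (assumption)
        else:
            delta_k = (strikes[i + 1] - strikes[i - 1]) / 2.0  # equation (8)
        total += delta_k / k ** 2 * math.exp(rate * t) * q
    return (2.0 / t) * total - (1.0 / t) * (forward / k0 - 1.0) ** 2
```

The square root of the returned value gives $IV_1$ or $IV_2$, which then feed into equation (6).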

Option selection has two main parts: strike price and expiration. In theory, we need options with more than 23 days to expiration (near-term) and fewer than 37 days to expiration (next-term). When the near-term option is due to expire within 3 days, the original next-term option becomes the new near-term option and a new next-term option is selected, so the calculation “rolls” to the next pair of contract months. This rolling convention minimizes the pricing error of options that are about to expire.

For the strike price, we first find the strike at which the absolute value of $(C_K - P_K)$ is smallest and use it to calculate the forward level $F$. Rounding the forward level $F$ to the nearest strike gives $K_0$. Secondly, we choose out-of-the-money (OTM) call options with strike prices higher than $K_0$, moving up from the strike right above $K_0$ until two consecutive call options with zero bid prices are found. Any call option in the set with a zero bid price is excluded. Similarly, we need OTM put options with strike prices lower than $K_0$, moving down from the strike right below $K_0$ until two consecutive put options with zero bid prices are found. Any put option in the set with a zero bid price is excluded as well. Thirdly, after selecting the OTM options for the near term and the next term, we plug the corresponding values into equation (7) to get the near-term and next-term implied volatilities. Last, interpolating these two implied volatilities using equation (6) gives the final VHSI. The interpolation is visualized in Figure 1.

[Figure 1. Visualization of VHSI calculation. Panel title: "VHSI Calculation - Typical case", day of calculation 11-Oct-2011. The x-axis shows the date and the y-axis the volatility estimation. Labelled points: near-term implied volatility, VHSI of the day of calculation, next-term implied volatility.]

Footnote 4: The Hong Kong Interbank Offered Rate (HIBOR) 1-week, 1-month and 2-month rates are used to interpolate the risk-free rate for the corresponding near-term and next-term options.
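The strike-selection rule described above (walk away from $K_0$, skip zero-bid options, stop at two consecutive zero bids) can be sketched as follows. The function name `select_otm_strikes` and its argument layout are hypothetical; `bids` is assumed to hold the bid price of the relevant option type (call or put) at each strike.

```python
def select_otm_strikes(strikes, bids, k0, side):
    """Return the OTM strikes kept for one side, in ascending order.

    side : "call" walks upward from K0, "put" walks downward from K0
    """
    if side == "call":
        candidates = sorted((k, b) for k, b in zip(strikes, bids) if k > k0)
    else:
        candidates = sorted(
            ((k, b) for k, b in zip(strikes, bids) if k < k0), reverse=True
        )
    selected = []
    zero_run = 0
    for strike, bid in candidates:
        if bid == 0:
            zero_run += 1
            if zero_run == 2:  # two consecutive zero bids: stop the walk
                break
            continue           # a single zero-bid option is just excluded
        zero_run = 0
        selected.append(strike)
    return sorted(selected)
```

For example, with call bids of 3, 0, 2, 0, 0, 1 at strikes 105 to 130 and $K_0$ = 100, the walk keeps 105 and 115 and stops at 125, so the option at 130 is never reached even though it has a positive bid.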

Summary statistics

Table 1 reports summary statistics of the raw time series in three groups: HSI return, realized volatility and implied volatility (VHSI). The first row of each group reports values in levels, while the second row reports values in natural logarithmic form. Overall, volatility in natural logarithmic form has relatively lower skewness and kurtosis and so appears closer to normality, which encourages us to use the log form in the regression analysis. However, this is not true for the index return. The return in natural logarithmic form has slightly lower skewness and larger kurtosis, which is even further from normality than its value in levels. For consistency with the other two variables, the natural logarithmic form is applied in the later sections. This should not be a problem for the index return, as the deviation between the level and the log level is very small. Figure 2 depicts the daily VHSI together with HSI daily returns from Aug. 2011 to Apr. 2018, with 1634 observations in total. The vertical axis on the left-hand side indicates the return, while the one on the right-hand side indicates volatility. The index return fluctuates around zero and moves further away from zero whenever implied volatility increases, but the direction of the movements is ambiguous. Therefore, it cannot easily be concluded that there is a clear correlation between the index return and its implied volatility. From an economic point of view, this is consistent with an efficient market in which the market return cannot be predicted.

Moving to the second and third groups, implied volatility (VHSI) is on average higher and more volatile than the corresponding series of realized volatility. For instance, the mean of implied volatility is 8.28 higher than that of realized volatility, and the mean of its log form is 0.59 higher than that of log realized volatility. This is shown in the top panel of Figure 3, where the blue area indicates the difference between the implied volatility series and the realized volatility series, measured on the vertical axis on the right-hand side. The VHSI and RV series are also shown in the bottom panel of Figure 3, measured on the vertical axis on the left-hand side. Generally, the difference is below 10 over time, except in high-volatility periods like 2011, 2012, 2015 and 2016 (see footnote 6).

Table 1. Summary statistics

Variable                  Obs.   Mean      Variance   Skewness   Kurtosis
Return                    1634   0.032     1.3        -0.2117    5.6873
Log return                1634   0.0003    0.0001     -0.2915    5.7634
Realized volatility       1634   11.0064   22.0024    3.0594     20.4433
Realized log volatility   1634   2.3323    0.1205     0.7132     4.5074
Implied volatility        1634   19.288    35.0621    1.5408     5.5355
Implied log volatility    1634   2.9198    0.074      0.7999     3.3059

Summary statistics for the daily time series of the Hang Seng Index return and its corresponding volatilities, in levels and in natural logarithmic form. Volatilities and their log forms are reported in percentage. Realized volatility is the annualized ex-post standard deviation of the daily return of the HSI based on 5-minute intraday data. Implied volatility (VHSI) is constructed from the prices of put and call options traded on the Hong Kong Exchange. The entire sample contains 1634 non-overlapping observations spanning from 17 Aug. 2011 to 10 Apr. 2018.
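The four moments reported in Table 1 can be computed as in the sketch below. The exact estimator conventions behind the table are not stated, so the choices here are assumptions: the variance uses the sample (n - 1) denominator, while skewness and kurtosis use population moments, with kurtosis reported in non-excess form so that a normal distribution gives 3.

```python
def summary_moments(values):
    """Mean, sample variance, skewness and (non-excess) kurtosis of a series."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    m2 = sum(d ** 2 for d in deviations) / n  # population second moment
    m3 = sum(d ** 3 for d in deviations) / n  # population third moment
    m4 = sum(d ** 4 for d in deviations) / n  # population fourth moment
    variance = m2 * n / (n - 1)               # sample variance
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2                   # equals 3 for a normal distribution
    return mean, variance, skewness, kurtosis
```

Applying the function to a series and to its natural logarithm gives the two rows of each group, which is how the level and log forms can be compared against the normal benchmark of zero skewness and kurtosis of 3.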

Theoretically, implied volatility tends to exceed the realized volatility of the same underlying asset over time, and the difference between them is known as the volatility risk premium. One reason is behavioral, namely risk aversion. In order to stabilize their portfolio return streams, most investors are willing to pay a certain amount (the risk premium). Another reason is market structural constraints. There is not always a matching pair of buyers and sellers in the market, and a premium is required to compensate for this illiquidity. Consequently, the volatility risk premium results from the combination of these factors (Ge, 2014). The risk premium, as shown in the top panel of Figure 3, varies over time. A possible explanation for the elevated risk premium is that in high-volatility periods investors may overreact to bad news, which increases their willingness to pay a larger premium against uncertain risk and eventually drives up implied volatility.

Moreover, the bottom panel of Figure 3 shows that the implied volatility of the HSI moves in the same direction as realized volatility over time, except for some extreme cases where realized volatility was extremely high during 2015 and 2016, when the stock market crash occurred in mainland China. Compared with (Imlak & Puja, The information content of implied volatility index, 2013), our sample shows a smoother pattern, probably due to the absence of extreme circumstances like the financial crisis. However, this does not mean that there are no measurement errors. As investigated by (Christensen & Prabhala, 1998) and (Hentschel, 2003), violating the assumptions of the Black-Scholes model can cause measurement error in implied volatility. Measurement errors give rise to an errors-in-variables (EIV) problem if BS implied volatility is used as a volatility forecast. A possible solution is the instrumental variable technique. This is discussed further in the methodology section.

[Figure 2. Volatility and Index Return. Panel title: "Comparison between VHSI and Index Return (Aug. 2011 - Apr. 2018)". The x-axis shows the date; the left axis shows the return and the right axis shows volatility.]

Footnote 6: In 2011 and 2012, S&P's unprecedented downgrade of the U.S. credit rating and the escalation of the European debt crisis affected markets globally. In 2015 and 2016, the Hong Kong stock market was heavily influenced by the stock market crash in mainland China.

Time series properties

To assess the time series properties and to determine the most appropriate model, we compute the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) of the three time series. The results are shown in Figure 4. All figures illustrate that the two log-volatility series and the log-HSI series are non-stationary, since their ACFs show a smoothly decreasing pattern. However, the PACFs indicate that the decreasing autocorrelation at higher lags is primarily due to the impact of the first lag for implied volatility and the HSI level, suggesting AR(1) for those two series, while AR(4) is suggested for realized volatility. In addition, there is no sign of seasonality in any of the ACF and PACF figures. Next, we fit the data to ARIMA(p,d,q) models and obtain Table 2. The models take the form proposed by (French, Schwert, & Stambaugh, 1987) and (Christensen & Prabhala, 1998):

$$\phi(B)(\Delta^d X_t - \mu) = \theta(B)\varepsilon_t$$


Table 2. Fitted ARIMA(p,d,q) models for Realized Volatility, Implied Volatility and HSI level.

(p,d,q)   μ          Φ1          Φ2          Φ3         Φ4         Θ1          Θ2   AIC     Q*         DF

Panel A: Realized Volatility $\{RV_t\}$
(4,0,0)   2.33 ***   0.31 ***    0.21 ***    0.15 ***   0.13 ***   -           -    211     45.3 ***   22
(1,0,1)   2.34 ***   0.96 ***    -           -          -          -0.7 ***    -    178     28         22
(1,1,1)   -0.00      0.12 ***    -           -          -          -0.82 ***   -    196     38 **      22
(2,0,0)   2.38 ***   0.40 ***    -0.32 ***   -          -          -           -    298     144 ***    22

Panel B: HSI level $\{P_t\}$
(1,1,0)   0.00       0.015 ***   -           -          -          -           -    -9969   29.2       22
(2,1,0)   0.00       0.016       -0.016      -          -          -           -    -9967   28.4       22
(1,1,1)   -0.00      -0.78 ***   -           -          -          0.80 ***    -    -9968   27.3       22
(2,1,1)   0.00       -0.75 ***   -0.01       -          -          0.77 ***    -    -9967   27.1       22

Panel C: Implied Volatility $\{IV_t\}$
(1,0,0)   2.94 ***   0.98 ***    -           -          -          -           -    -4816   31.5 *     22
(2,0,0)   2.94 ***   0.99 ***    -0.01       -          -          -           -    -4814   31.4 *     22
(1,1,1)   -0.00      0.92 ***    -           -          -          -0.97 ***   -    -4821   32.5 *     22
(1,0,1)   2.94 ***   0.98 ***    -           -          -          0.01        -    -4814   31.4 *     22

The table shows the results of fitting the time series of Realized Volatility, Implied Volatility and HSI level to the ARIMA(p,d,q) model with the general form:

$$\phi(B)(\Delta^d X_t - \mu) = \theta(B)\varepsilon_t$$

The sample contains 1634 non-overlapping daily observations for each panel, spanning from Aug. 2011 to Apr. 2018. The Akaike information criterion (AIC) and the Ljung–Box Q test are reported in the last columns of the table. A lone dash marks a parameter that is not part of the model. *, **, *** indicate significance at the 10%, 5% and 1% levels respectively.


where $X_t$ denotes one of the time series; $\phi(B)$ is the autoregressive polynomial $1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$; $\theta(B)$ is the moving average polynomial $1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$; $B$ is the backshift operator; $\Delta^d = (1 - B)^d$ is the differencing operator of order $d$, which is the first-difference operator when $d = 1$. $\mu$ is the sample mean, while $\varepsilon_t$ is white noise. The log forms of the Realized Volatility ($RV_t$), Hang Seng Index level ($P_t$) and Implied Volatility ($IV_t$) series are used instead of the level series because they are closer to a Gaussian distribution, as mentioned before.

Table 2 displays the results for three series, including the Akaike information criterion (AIC) and portmanteau test (Ljung–Box Q test). The first column shows the type of ARIMA(p,d,q) where 𝑝 is the order of autoregressive model, which is the number of lags as well, 𝑑 is the degree of differencing, and 𝑞 is the order of the moving average model. Column 2 to 8 reports coefficients and the significant level of the model parameters. The last 3 columns report relevant tests for white noise, also to compare fitness among those various types of ARIMA. For instance, the first row gives the summarized information of model ARIMA(4,0,0) which is AR(4) as well. The table shows that all parameters are significant on 1% level as indicated by ***. AIC equals to 211, which is meaningless to interpret alone a single model while it is needed to compare among other models. The one with lowest AIC better describe the data than other candidate models. Noted that AIC alone does not tell how poorly a model fit the data.

Comparing this AIC with that of ARIMA(1,0,1), equivalently ARMA(1,1), in the second row, we conclude that ARIMA(1,0,1) fits the data better than ARIMA(4,0,0). The second-to-last column reports the Ljung–Box Q statistic, and the last column reports the number of lags tested, which is also the degrees of freedom (DF) used when comparing the Q statistic with the chi-square distribution $\chi^2_{df}$. For example, the first row gives a Q statistic of 45.3 with 22 degrees of freedom, so the null hypothesis that the residuals of the fitted ARIMA(4,0,0) model have no autocorrelation up to lag 22 is rejected at the 1% significance level. In other words, the residuals exhibit serial correlation, which leads to incorrect standard errors and unreliable test statistics. We test 22 lags, roughly one month of trading days, to capture any day-of-month seasonal effect, although the PACFs exhibit no seasonal pattern.
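The rejection in the example above can be checked by hand. The sketch below computes the upper-tail probability of a chi-square distribution with an even number of degrees of freedom, using the closed-form series that holds in that case, and applies it to the Q statistic of 45.3 with 22 degrees of freedom quoted in the text. The helper name `chi2_sf_even_df` is ours, and the function deliberately handles only even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only supports positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j  # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total


# Q statistic and degrees of freedom quoted in the text for ARIMA(4,0,0).
p_value = chi2_sf_even_df(45.3, 22)
reject_at_1_percent = p_value < 0.01
print(f"p-value = {p_value:.4f}, reject at 1%: {reject_at_1_percent}")
```

A p-value below 0.01 corresponds to the rejection of the no-autocorrelation null at the 1% level stated above.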


As suggested by the PACF, we first fit the realized volatility series $\{RV_t\}$ to an AR(4) model. All estimated coefficients are significant at the 1% level, but the residuals appear to be serially correlated. Therefore, the non-integrated models ARMA(1,1) and AR(2) and the integrated model ARIMA(1,1,1) are fitted as well. Comparing the AIC values and discarding the models whose residuals are not white noise, the non-integrated ARMA(1,1) model describes the data best, with the lowest AIC and the lowest Q statistic. The next series, in Panel B, is the Hang Seng Index level $\{P_t\}$. As the core interest is the index return, we always take the first difference ($d = 1$) when fitting the models. The table shows that ARIMA(1,1,0) best describes the index level; in other words, AR(1) best describes the index return. Model ARIMA(2,1,0) has a lower Ljung–Box Q statistic and a relatively large (negative) first-order autoregressive coefficient, but this (negative) value is offset by the effect of the first-order moving average term, generating a result similar to that of ARIMA(1,1,0). Following the principle of simplicity, ARIMA(1,1,0) is the most appropriate model for the index level. The last series is implied volatility $\{IV_t\}$. The residuals of all fitted models are white noise at the 5% level. With the lowest AIC, the best model is the integrated ARIMA(1,1,1).

Methodology and hypothesis

This paper has two primary interests. The first is to examine the information content of implied volatility about future realized volatility. The second is to analyze whether implied volatility can forecast the future index (stock) return. To do so, the econometric framework is divided into two parts: in-sample estimation and out-of-sample forecasting.

Realized vs. Implied volatility

As proposed by Christensen and Prabhala (1998), the following specification can be used to test the information content of implied volatility about realized volatility:

$$RV_t = \alpha + \beta_{IV}\, IV_{t-1} + \varepsilon_t \qquad (11)$$

where $RV_t$ is the ex-post log realized volatility on day $t$ and $IV_{t-1}$ is the log implied volatility at the end of day $t-1$. $\varepsilon_t$ is the error term. From equation (11), three hypotheses can be tested:

1. If the coefficient $\beta_{IV}$ is significantly different from zero, then implied volatility contains some information about future realized volatility.

2. If $\alpha = 0$ and $\beta_{IV} = 1$, then implied volatility is an unbiased forecast of realized volatility.

3. If $\varepsilon_t$ is white noise and uncorrelated with any independent variable that contains market information, then implied volatility is efficient.
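Hypothesis 2 is a joint restriction, so it is usually checked with an F-test that compares the residual sum of squares of the unrestricted regression with that of the restricted model $RV_t = IV_{t-1} + \varepsilon_t$. A minimal sketch in plain Python follows, assuming a single regressor with an intercept; the function names and the toy numbers are illustrative assumptions and do not reproduce the thesis data.

```python
def ols_simple(x, y):
    """OLS of y on a constant and x; returns (alpha, beta, rss)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx  # slope = Cov(x, y) / Var(x)
    alpha = mean_y - beta * mean_x
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    return alpha, beta, rss


def f_stat_unbiased(x, y):
    """F statistic for the joint null alpha = 0 and beta = 1 (2 restrictions)."""
    n = len(x)
    _, _, rss_unrestricted = ols_simple(x, y)
    # Under the null the fitted value is simply x, so the residual is y - x.
    rss_restricted = sum((yi - xi) ** 2 for xi, yi in zip(x, y))
    return ((rss_restricted - rss_unrestricted) / 2) / (rss_unrestricted / (n - 2))


# Hypothetical toy data: lagged log IV (x) and log RV (y).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
y_unbiased = [xi + e for xi, e in zip(x, noise)]            # alpha = 0, beta = 1
y_biased = [1.0 + 0.5 * xi + e for xi, e in zip(x, noise)]  # alpha = 1, beta = 0.5

print(f_stat_unbiased(x, y_unbiased))  # small: the joint null is not rejected
print(f_stat_unbiased(x, y_biased))    # large: the joint null is rejected
```

The statistic is compared with an F(2, n - 2) distribution; with n = 1634 observations this is the F(2, 1632) reference distribution.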

The OLS estimates of equation (11) are shown in the second column of Table 3. The estimate of $\beta_{IV}$ is 0.824 and is significantly different from zero, indicating that implied volatility contains some information about future realized volatility. Investors who build trading strategies on the Hang Seng Index do take the corresponding volatility index as a reference. The intercept $\alpha$ (-0.072) is not significantly different from zero, while $\beta_{IV}$ is significantly different from one. The joint hypothesis $\alpha = 0$ and $\beta_{IV} = 1$ is rejected by an F-test with an F(2,1632) statistic of 4035.52. Hence, implied volatility is a biased forecast of realized volatility. The Durbin–Watson (2, 1633) statistic of 1.432 indicates positive correlation between $\varepsilon_t$ and $\varepsilon_{t-1}$, so it can be roughly concluded that implied volatility is a biased and inefficient forecast of realized volatility, with $\alpha = 0$ and $\beta_{IV} < 1$.

The bias might be caused by several of the factors discussed before. Overlapping observations and sample selection are irrelevant in our case because we use daily non-overlapping data. The most likely sources of bias are the other two. One is the risk premium embedded in implied volatility, as illustrated in Figure 3. Realized volatility differs from implied volatility by the risk premium and other variation, which means that implied volatility can predict only a fraction of realized volatility. This can be seen from the OLS slope formula, in which the estimated slope $\beta_{IV}$ equals the covariance between $\ln(RV)_t$ and $\ln(IV)_{t-1}$ divided by the variance of $\ln(IV)_{t-1}$. The risk premium embedded in implied volatility is not constant over time, which drives the ratio $\mathrm{Cov}(\ln(RV)_t, \ln(IV)_{t-1}) / \mathrm{Var}(\ln(IV)_{t-1})$ away from one.


The other is the errors-in-variables (EIV) problem. As discussed in the 'Time series properties' section, both volatilities exhibit autocorrelation at the first lag, which is absorbed into the error term and makes the estimates and test statistics unreliable. EIV arises when implied volatility endogenously depends on past implied volatility. Additional measurement error is introduced by the bid-ask spread. This error should be small, since the VHSI calculation already excludes options with very short maturities and illiquid options. Nevertheless, the measurement error and the variation in the risk premium remain.
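For intuition on why measurement error pushes the slope below one, the textbook classical errors-in-variables case can be written out. This is a stylized sketch under assumptions that are ours rather than the thesis': the observed regressor is $IV_{t-1} = IV^{*}_{t-1} + u_{t-1}$, where $IV^{*}$ is the error-free value with variance $\sigma_{*}^{2}$ and $u$ is noise with variance $\sigma_{u}^{2}$, uncorrelated with $IV^{*}$ and with $\varepsilon_t$.

```latex
\begin{aligned}
RV_t &= \alpha + \beta\, IV^{*}_{t-1} + \varepsilon_t,
\qquad IV_{t-1} = IV^{*}_{t-1} + u_{t-1},\\
\operatorname{plim}\hat{\beta}_{IV}
  &= \frac{\operatorname{Cov}(RV_t,\, IV_{t-1})}{\operatorname{Var}(IV_{t-1})}
   = \beta\,\frac{\sigma_{*}^{2}}{\sigma_{*}^{2} + \sigma_{u}^{2}} .
\end{aligned}
```

Because the ratio on the right is below one whenever $\sigma_{u}^{2} > 0$, the estimated slope is pulled toward zero even if the true $\beta$ equals one.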

We are also interested in comparing the information content of past implied volatility with that of past realized volatility. To do so, the following AR(1) model is estimated:

$$RV_t = \alpha + \beta_{RV}\, RV_{t-1} + \varepsilon_t \qquad (12)$$

The results are reported in the third column of Table 3. Past realized volatility has predictive power for future realized volatility, since $\beta_{RV}$ is significantly different from zero, which is consistent with the pattern observed in the PACFs. Comparing the univariate models (11) and (12), model (11) has a larger slope coefficient and a larger adjusted R-squared. The DW statistics show that the residuals of both models are autocorrelated at the first lag. Both DW statistics are significantly different from 2, meaning that neither variable is an efficient forecast of future realized volatility in isolation. To see whether past implied volatility subsumes all the information content of past realized volatility, we also estimate the following multivariate model:

$$RV_t = \alpha + \beta_{IV}\, IV_{t-1} + \beta_{RV}\, RV_{t-1} + \varepsilon_t \qquad (13)$$

The results are shown in the fourth column of the OLS section of Table 3. We can then test the next hypothesis:

4. If $\beta_{IV}$ is significantly different from zero and $\beta_{RV} = 0$, then implied volatility subsumes the information content of past realized volatility.

Table 3 Estimated coefficients

Dependent variable: $RV_t$

| Model | (11) | (12) | (13) | (13.1) | (14) |
|---|---|---|---|---|---|
| Estimator | OLS | OLS | OLS | OLS | ARIMAX |
| $IV_{t-1}$ | 0.824*** | | 0.575*** | 0.545*** | 0.766*** |
| $RV_{t-1}$ | | 0.593*** | 0.287*** | 0.302*** | 0.885*** |
| $\varepsilon_{t-1}$ | | | | | -0.699*** |
| Const | -0.072 | 0.949*** | -0.014 | 0.036 | 0.096 |
| Instrument | No | No | No | Yes | No |
| MLE | | | | | -29.98 |
| Adj. R2 (%) | 41.63 | 35.14 | 46 | 44 | |
| AIC | 301.19 | 473.293 | 174.684 | 226.897 | 69.985 |
| DW | 1.432 | 2.382 | 2.078 | 2.111 | 1.983 |
| RMSE | 0.265 | 0.280 | 0.255 | 0.259 | 0.246 |

Descriptive statistics of the OLS and ARIMAX estimations with a common dependent variable. The RMSEs in the last row are reported for comparison with the out-of-sample forecasting errors. *** denotes significance at 1%, ** at 5% and * at 10%.
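The DW row of Table 3 is the Durbin–Watson statistic of each model's residuals. As a reminder of how it is built, the sketch below computes it from a residual series; the function name and the toy residuals are illustrative assumptions, not thesis output.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared first differences of the
    residuals divided by their sum of squares."""
    num = sum((curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical residual series chosen to show the two extremes.
persistent = [1.0, 1.0, 1.0, 1.0]      # perfectly positively autocorrelated
alternating = [1.0, -1.0, 1.0, -1.0]   # strongly negatively autocorrelated
print(durbin_watson(persistent), durbin_watson(alternating))
```

Values near 2 indicate no first-order autocorrelation, values below 2 point to positive autocorrelation and values above 2 to negative autocorrelation, which is how the 1.432 of model (11) is read in the text.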

Since both $\beta_{IV}$ and $\beta_{RV}$ are significantly different from zero, both implied volatility and historical realized volatility contain some information about future realized volatility. In other words, today's variation in the Hang Seng Index is influenced by yesterday's trading in both the index and its options. Implied volatility therefore does not capture all the information content and characteristics of the variation in index trading activity. Furthermore, $\beta_{IV}$ is larger than $\beta_{RV}$ in both the univariate and the multivariate models, indicating that implied volatility contains more information about future realized volatility than historical realized volatility does. Moreover, after controlling for autocorrelation, the AIC drops sharply from 301 to 175 and the DW statistic is no longer significantly different from 2, so the multivariate model (13) fits the data better than the univariate model (11). We are also aware that implied volatility is autocorrelated at the first lag and that OLS model (13) might suffer from EIV. To deal with this issue, we use the second lag of implied volatility as an instrumental variable, as suggested by Christensen and Prabhala (1998). The results are shown in the fifth column of Table 3. In short, $\beta_{IV}$ falls slightly from 0.575 to 0.545, and both coefficients remain significant at the 1% level. The adjusted R-squared decreases from 46% to 44% and the AIC rises from 175 to 227.

In our sample, instrumenting with the second lag of implied volatility does not reduce the remaining bias further. First, our dataset and the method used to obtain implied volatility differ from those of Christensen and Prabhala (1998), even though both methods rely on the Black–Scholes model. Second, we use daily data, which are more sensitive to market shocks and therefore carry more variation in the risk premium. Third, the remaining bias might depend heavily on the risk premium: the mean difference between realized and implied volatility is only 1% in their sample, whereas it is more than 8% in ours. From the time series perspective, the best model for the realized volatility series is the non-integrated ARMA(1,1), which includes a moving average component. Today's variation in the error term, including the variation in the risk premium, might be explained by yesterday's variation. Based on this argument, it is preferable to employ the non-integrated ARMAX(p,q,b) model shown below:

$$RV_t = \alpha + \beta_{IV}\, IV_{t-1} + \beta_{RV}\, RV_{t-1} + \beta_{\varepsilon}\, \varepsilon_{t-1} + \varepsilon_t \qquad (14)$$

where $\beta_{\varepsilon}$ is the coefficient of the error term $\varepsilon$ at the first lag. Table 3 reports the results of the non-integrated ARIMAX(1,0,1) model. As shown, the coefficient $\beta_{IV}$ returns to a higher level of 0.766 but is still smaller than one. The impact of historical realized volatility is now larger than that of lagged implied volatility. There is no serial correlation in the error term, with a DW statistic of 1.98. If we apply the hypotheses of Christensen and Prabhala, our conclusion is that implied volatility is an efficient but biased forecast of future realized volatility. However, as discussed before, the existence of the risk premium ensures that implied volatility is significantly higher than realized volatility most of the time, unless investors do not see a market shock coming within a very short period (e.g., less than one day), such as the sudden announcement of a failed merger or a failed debt restructuring. To be rigorous, we can only conclude that implied volatility is an efficient forecast of realized volatility. Whether it is biased and, if so, in which direction is left to further research. Moreover, we can also conclude that the non-integrated ARIMAX(1,0,1), model (14), has the best forecasting performance among the candidate models, with the smallest AIC.

Implied Volatility vs. HSI return

Turning to the second interest of this paper, we investigate the relationship between implied volatility and the index return. As proposed by Bates (2000), the relationship can be tested with the following framework:

$$div_t = \alpha + \beta_{ret}\, ret_t + \varepsilon_t \qquad (15)$$

where

$$ret_t = \ln(p_t) - \ln(p_{t-1}) \qquad (15.1)$$

$$div_t = \ln(iv_t) - \ln(iv_{t-1}) \qquad (15.2)$$

$p_t$ is the average Hang Seng Index level at time $t$ and $iv_t$ is the average value of implied volatility at time $t$. Hence, $ret_t$ measures the log difference (log return) of the HSI at time $t$, while $div_t$ measures the log difference of the corresponding implied volatility. $\varepsilon_t$ is the error term. Implied volatility is first-differenced for two reasons. First, this is suggested by its time series properties. Second, the slope coefficient can then be compared with, and interpreted in terms of, the index return more directly. From equation (15), we can test:

5. If $\beta_{ret}$ is significantly different from zero, the HSI return is correlated with the change in implied volatility.
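The two log-difference series in (15.1) and (15.2) can be built as in the short sketch below. The function name `log_diff` and the index and volatility levels are hypothetical and serve only to show the transformation.

```python
import math


def log_diff(levels):
    """Log differences ln(x_t) - ln(x_{t-1}) of a positive series."""
    return [math.log(curr) - math.log(prev) for prev, curr in zip(levels, levels[1:])]


# Hypothetical index and volatility-index levels, not thesis data.
hsi = [25000.0, 24750.0, 25245.0]
vhsi = [18.0, 19.8, 18.81]
ret = log_diff(hsi)   # log returns of the index, as in (15.1)
div = log_diff(vhsi)  # log changes of implied volatility, as in (15.2)
print(ret, div)
```

In this toy example the index falls on the first day while the volatility index rises, and the pattern reverses on the second day, which is the kind of negative co-movement that equation (15) is designed to measure.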

Based on the results in Table 4, $\beta_{ret}$ is significantly smaller than zero at the 1% level, indicating that the index return is negatively correlated with the change in implied volatility. If the HSI return increases by 1 percentage point today, the change in implied volatility decreases on average by 3.035 percentage points, holding everything else unchanged. The DW statistic of 1.855 is close to 2, so serial correlation should not undermine the reliability of the statistical tests. The constant term $\alpha$ is zero, as expected. Moreover, the PACFs also suggest no serial correlation after first differencing for either series. This can be tested by running the following regression, which adds the first lags of both variables. After controlling for autocorrelation, we obtain the results in the second column of the 'OLS' section of Table 4.

$$div_t = \alpha + \beta_{ret,t}\, ret_t + \beta_{ret,t-1}\, ret_{t-1} + \beta_{div}\, div_{t-1} + \varepsilon_t \qquad (16)$$

The slope coefficient of the index return barely changes, even though both lag coefficients are significant at the 1% level. To eliminate serial correlation in the error term, both lagged variables are kept in the next regression as well. Previous studies also note that the impact of the index return is asymmetric: a negative index return has a larger impact on implied volatility than a positive one (the leverage effect). The degree of asymmetry depends on the prediction horizon, the dataset, market conditions and the empirical framework (Ederington & Guan, 2010). To investigate the leverage effect, we introduce a dummy variable $D$ that equals one if the index return is smaller than zero and zero otherwise. Equation (16) is then modified as follows:

$$div_t = \alpha + \theta D_t + \beta_{ret,t}\, ret_t + \gamma\,(D_t \cdot ret_t) + \beta_{ret,t-1}\, ret_{t-1} + \beta_{div}\, div_{t-1} + \varepsilon_t \qquad (17)$$

where $D_t$ indicates whether the index return on day $t$ is negative, $\theta$ is the estimated coefficient of the dummy variable $D_t$ and $\gamma$ denotes the estimated coefficient of the interaction $D_t \cdot ret_t$. $\gamma$ measures the expected difference between the impact of a positive and of a negative index return on implied volatility. In other words:

6. If the estimated interaction coefficient $\gamma$ is significantly different from zero, positive and negative index returns have different impacts on implied volatility.

The estimates are shown in the third column of the 'OLS' section of Table 4. The main interest is the value of $\gamma$ and its significance. $\gamma$ equals -4.194 and is significant at the 1% level. $\theta$ is not significantly different from zero in our case, but this does not mean that the impact of the index return on implied volatility is symmetric. The results should be read as follows. The sensitivity of implied volatility to the index return is the slope $\beta_{ret,t} + \gamma D_t$; the constant (-0.016) and $\theta$ shift the level of $div_t$ but not this sensitivity. If the index return is positive (D = 0), a return that is lower by 1 percentage point raises the change in implied volatility by 0.949 percentage points. If the index return is negative (D = 1), a return that is lower by 1 percentage point raises it by 0.949 + 4.194 = 5.143 percentage points. The difference is exactly equal to the absolute value of $\gamma$. From an economic point of view, investors become more anxious about the market and require more compensation (higher implied volatility) for writing options when the market is in a "bad state" (negative return) (Traub, Ferreira, McArdle, & Antognelli, 2000). Higher implied volatility does not necessarily mean that the level of the underlying asset (HSI) becomes more variable; it could mean that investors demand more compensation, that is, a higher risk premium, for writing options. In our sample, the compensation for panic in the "bad state" is approximately five times larger than in the "good state".
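The asymmetry can be made concrete with the slope terms of model (17). The sketch below takes the two slope coefficients reported for model (17) in Table 4 and evaluates the sensitivity of $div_t$ to $ret_t$ in each state; the function name is ours, and the intercept, $\theta$ and the lagged terms are deliberately left out because they do not change the slope.

```python
# Slope coefficients of model (17) as reported in Table 4.
BETA_RET = -0.949   # coefficient on ret_t
GAMMA = -4.194      # coefficient on the interaction D_t * ret_t


def sensitivity(index_return):
    """Slope of div_t with respect to ret_t: beta_ret + gamma * D_t,
    where D_t = 1 for a negative index return and 0 otherwise."""
    d = 1 if index_return < 0 else 0
    return BETA_RET + GAMMA * d


good_state = sensitivity(0.01)    # D = 0
bad_state = sensitivity(-0.01)    # D = 1
ratio = bad_state / good_state
print(good_state, bad_state, ratio)
```

The ratio of the two slopes is what lies behind the statement that the reaction in the bad state is roughly five times the reaction in the good state.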

Table 4 Estimated coefficients

| Model | (15) | (16) | (17) | (18) | (19) |
|---|---|---|---|---|---|
| Estimator | OLS | OLS | OLS | OLS | ARIMAX |
| Dependent variable | $div_t$ | $div_t$ | $div_t$ | $ret_t$ | $ret_t$ |
| $ret_t$ | -3.035*** | -3.043*** | -0.949*** | | |
| $div_{t-1}$ | | 0.073*** | 0.083 | 0.001 | 0.001 |
| $ret_{t-1}$ | | 0.624*** | 0.747*** | 0.019 | -0.776*** |
| $D_t$ | | | -0.003 | | |
| $D_t \cdot ret_t$ | | | -4.194*** | | |
| $\varepsilon_{t-1}$ | | | | | 0.797*** |
| Const | 0 | 0 | -0.016*** | 0 | 0 |
| Adj. R2 (%) | 38.87 | 39.76 | 48.66 | 0 | |
| AIC | -5605.703 | -5624.49 | -5879.34 | -9963.61 | -9960.92 |
| DW | 1.855 | 1.994 | 1.880 | 1.998 | 2.002 |
| RMSE | 0.0435 | 0.0431 | 0.0399 | 0.0114 | 0.0114 |

Descriptive statistics of the OLS and ARIMAX estimations with different dependent variables. The RMSEs in the last row are reported for comparison with the out-of-sample forecasting errors. *** denotes significance at 1%, ** at 5% and * at 10%.

The results confirm empirically that the HSI return is negatively correlated with the change in implied volatility and that the magnitude of the correlation depends strongly on market conditions (good state or bad state). Knowing this correlation does not serve all the interests of this paper. We are also interested in whether the index return can be predicted by past implied volatility, even though standard financial and economic theory assumes that the market is efficient and that stock market returns follow a random walk. To examine this question, we regress the index return on the one-day-lagged change in implied volatility, using both OLS and ARIMAX. The regressions are shown below:

$$ret_t = \alpha + \beta_{div}\, div_{t-1} + \beta_{ret}\, ret_{t-1} + \varepsilon_t \qquad (18)$$

$$ret_t = \alpha + \beta_{div}\, div_{t-1} + \beta_{ret}\, ret_{t-1} + \beta_{\varepsilon}\, \varepsilon_{t-1} + \varepsilon_t \qquad (19)$$

The results are shown in the last two columns of Table 4. The estimated coefficient of the one-day-lagged change in implied volatility, $\beta_{div}$, is not significantly different from zero in either model. We therefore find no evidence that the HSI return can be predicted from the previous day's implied volatility. Extending the forecasting horizon or applying a different methodology might yield a different result. For instance, Pierre (2003) tested 1-day, 5-day, 20-day and 60-day forecasting horizons and concluded that extremely high values of the volatility index do generate higher forward-looking index returns in the long term. Nevertheless, this is left to further research.

Robustness – Rolling Window

To verify the robustness of the results reported above, additional tests are required. In this section, we apply a rolling-window procedure, both to check the robustness of our results between in-sample estimation and out-of-sample forecasting and to compare forecast accuracy across models. Empirical research shows that out-of-sample forecasting is more reliable than in-sample prediction, since it is less sensitive to extreme values (outliers) and data mining (White, 2000). Compared with in-sample prediction, an out-of-sample forecast better reflects the information available in real time (Diebold & Rudebusch, 1991) and reduces the probability of model overfitting (Ashley, Granger, & Schmalensee, 1980). As a result, many researchers use out-of-sample forecasts to examine the performance of forecasting models (Stock & Watson, 2007).

From a statistical perspective, the coefficients of time series models are commonly assumed to be constant over time. With the rolling method, the estimated parameters are allowed to vary over time: they are re-estimated in each window with the newest available information. To conduct the robustness check, the rolling framework is set up as follows.

First, the rolling window size $m$ is set to half of the total number of observations, which is 817, leaving 814 sub-samples (windows) for forecasting. A smaller rolling window tends to increase the variance and the mean square forecast error (Pesaran & Timmermann, 2004). A larger rolling window, by contrast, contains earlier data that might be irrelevant to the present data-generating process; it improves the precision of the forecast but suffers from higher bias (Clark & McCracken, 2009). Considering the tradeoff between window lengths and the principle of simplicity, it is reasonable to take half of the sample as the window size. Developing a method for choosing the optimal rolling window length is a research topic beyond the scope of this paper.

Second, as in the time series models above, the forecast horizon $h$ is one day ahead. The first rolling window contains the observations from period 1 to period $m = 817$. Estimating the models on this window gives the parameters of the first window. The one-day-ahead out-of-sample prediction $\hat{y}_{818}$ is then computed by plugging in all independent variables from period 817. Economically, we collect all information available in period 817 (e.g., implied volatility) and use it, together with the parameters estimated from the first window, to predict the dependent variable (e.g., realized volatility) in period 818. Note that for simple OLS models without lagged variables, the predicted value $\hat{y}_{818}$ is computed from the independent variables of the same period, 818. The second window contains the observations from period 2 to period $m + 1 = 818$. Re-estimating the models on the second sub-sample gives the second predicted value $\hat{y}_{819}$, and so on. Continuing in this way, we obtain the out-of-sample predicted series $\{\hat{y}_{818}, \ldots, \hat{y}_{1634}\}$, from which the root mean square forecast error (RMSE) can be computed:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n - 818 + 1} \sum_{t=818}^{n} (y_t - \hat{y}_t)^2} \qquad (20)$$

where $n$ is the sample size of 1632 (see footnote 7) and $t$ starts from the first predicted period, 818. The results are reported in Table 5. Reporting both the in-sample RMSEs, taken directly from the last rows of Tables 3 and 4, and the out-of-sample RMSEs allows us to compare the performance of the models directly.
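The rolling procedure described above can be sketched as follows. This is a minimal illustration for a single lagged predictor estimated by OLS, not the thesis' implementation; the function names, the toy series and the small window size are assumptions made so that the example runs on its own.

```python
import math


def fit_ols(x, y):
    """OLS of y on a constant and x; returns (alpha, beta)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return mean_y - beta * mean_x, beta


def rolling_rmse(x_lagged, y, window):
    """One-step-ahead rolling-window forecasts and their RMSE.

    x_lagged[t] is the predictor already known when y[t] is forecast
    (e.g. IV at t-1 for RV at t), so using it involves no look-ahead.
    """
    errors = []
    for start in range(len(y) - window):
        stop = start + window                      # first index outside the window
        alpha, beta = fit_ols(x_lagged[start:stop], y[start:stop])
        forecast = alpha + beta * x_lagged[stop]   # predict the next period
        errors.append(y[stop] - forecast)
    rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))
    return errors, rmse


# Hypothetical toy series; y is an exact linear function of the lagged predictor.
x_lagged = [float(t % 5) for t in range(20)]
y = [0.5 + 0.8 * xi for xi in x_lagged]
errors, rmse = rolling_rmse(x_lagged, y, window=10)
print(len(errors), rmse)
```

Each pass re-estimates the coefficients on the most recent `window` observations only, so the parameters are free to drift over time, and the loop yields one forecast for every period after the first window.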

Table 5 Robustness check

Panel A: Implied Volatility vs. Realized Volatility

| Model | (11) | (12) | (13) | (13.1) | (14) |
|---|---|---|---|---|---|
| In-sample RMSE | 0.265 | 0.280 | 0.255 | 0.259 | 0.246 |
| Out-of-sample RMSE | 0.281 | 0.322 | 0.267 | 0.267 | 0.260 |

Panel B: Implied Volatility vs. Hang Seng Index Return

| Model | (15) | (16) | (17) | (18) | (19) |
|---|---|---|---|---|---|
| In-sample RMSE | 0.04 | 0.043 | 0.040 | 0.011 | 0.011 |
| Out-of-sample RMSE | 0.049 | 0.049 | 0.045 | 0.011 | 0.011 |

The out-of-sample RMSEs are based on a rolling window with a window size of 817, a forecast horizon of 1 and 814 sub-samples, while the in-sample RMSEs are based on the full sample.

In Panel A of Table 5, in-sample prediction is more accurate than out-of-sample forecasting for all models. There are two possible reasons. First, as mentioned above, the rule used to set the rolling window size is simple rather than optimal, which increases either the variance or the bias of the estimates. Second, each window is smaller than the full sample and contains only half of its information. Nevertheless, comparing the out-of-sample RMSEs, the non-integrated ARIMAX(1,0,1) model has the best forecasting performance, which is consistent with the conclusion drawn from the in-sample estimation. To investigate the direction of the forecast error, we plot the forecast error series in a bar chart (Figure 5). The forecast error is computed as the difference between the forecast value and the observed value. If the error is above zero, the forecast overestimates the observed value, and vice versa. In total, 436 predicted values are overestimated (not shown here). Therefore, there is not enough evidence to draw a conclusion about the direction of the forecasting bias.

Footnote 7: After dropping two periods of data to construct the first lag of the HSI return, the total number of observations decreases from 1634 to 1632.

In Panel B, in-sample prediction also performs better than out-of-sample forecasting. Note that the RMSEs of models 18 and 19 cannot be compared with those of the other models in Panel B, since they have a different dependent variable. For instance, the in-sample RMSE of model 18 (0.011) is lower than that of model 17 (0.04), but this does not mean that model 18 fits the data more precisely than model 17. Models 18 and 19 both measure how well the change in implied volatility forecasts the HSI return. Their out-of-sample RMSEs barely change, indicating that the HSI return still cannot be predicted by implied volatility.

Figure 5. Forecasting bias of ARIMAX(1,0,1) (Implied Volatility vs. Realized Volatility). Horizontal axis: prediction period; vertical axis: variation.

Conclusion

To fulfill the purpose of this paper, we apply several approaches to investigate the predictability of the volatility index (implied volatility) in the Hong Kong market. First, using ACFs, PACFs and ARIMA models, we assess the time series properties of the three main variables: the HSI level, implied volatility and realized volatility. The best ARIMA models are ARMA(1,1) for realized volatility, ARIMA(1,1,0) for the HSI level (or AR(1) for the index return) and ARIMA(1,1,1) for implied volatility. Further, based on these time series properties, this paper uses simple OLS, an instrumental variable technique with 2SLS and the time series model ARIMA to investigate their relationships. Empirically, this paper finds that, for one-day-ahead forecasting, implied volatility is an efficient but biased forecast of future realized volatility and that the most appropriate forecasting model for our sample is ARIMAX(1,0,1). The remaining bias might be caused by variation in the risk premium embedded in implied volatility. What cannot be concluded is the direction of the bias. Overall, investors in the Hong Kong market could use the HSI volatility index to forecast short-term market risk, paying extra attention to the current risk premium when doing so. This paper also finds that the HSI return is negatively correlated with the change in implied volatility. This correlation is asymmetric and depends on the state of the market: if the current market state is bad (negative return on the HSI), a further decrease in the HSI return results in an even larger increase in the corresponding volatility index. This is consistent with most previous research. Furthermore, this study finds no evidence that implied volatility is an efficient predictor of the HSI return, in either in-sample or out-of-sample estimation.


References

Andersen, T., & Bollerslev, T. (1998). Answering the skeptics: Yes, standard volatility models do provide accurate forecasts. International Economic Review, 885-905.

Andersen, T., Bollerslev, T., Diebold, F., & Labys, P. (2001). Modeling and forecasting realized volatility. Econometrica, Vol.71, No.2, 579-625.

Ashley, R., Granger, C., & Schmalensee, R. (1980). Advertising and Aggregate Consumption An Analysis of Causality. Econometrica Vol.45, no.5, 1149-1167.

Bakshi, G., Cao, C., & Chen, Z. (1997). Empirical performance of alternative option pricing models. The Journal of Finance 52(5), 2003-2049.

Bates, D. (1991). Was it expected? the evidence from options markets. Journal of Finance 46, 1009-1044.

Bates, D. (1996). Jumps and stochastic volatility: Exchange rate processes implicit in Deutsche Mark options. The Review of Financial Studies 6, 69-107.

Bates, D. (2000). 'Post-'87 crash fears in the S&P 500 futures option market. Journal of Econometrics, Vol.94 Nos 1-2, 181-238.

Bates, D. (2003). Empirical option pricing: a retrospection. Journal of Econometrics 116, 387-404.

Benzoni, L. (2002). Pricing options under stochastic volatility: an empirical investigation. Carlson School of Management.

Black, F., & Scholes, M. (1973). The pricing of options and corporate liabilities. Journal of Political Economy 81(3), 637-654.

Blair, B., Poon, S., & Taylor, S. (2001). Modelling S&P 100 volatility: The information content of stock returns. Journal of Banking and Finance 25, 1665-1679.

Bollerslev, T., & Zhou, H. (2006). Volatility puzzles: A unified framework for gauging return-volatility regressions. Journal of Econometrics 131, 123-150.

Britten-Jones, M., & Neuberger, A. (2000). Option Prices, Implied Price Processes, and Stochastic Volatility. Journal of Finance, 839-866.

Busch, T., Christensen, B., & Nielsen, M. (2011). The role of implied volatility in forecasting future realized volatility and jumps in foreign exchange, stock and bond markets. Journal of Econometrics 160, 48-57.

Chernov, M. (2007). On the role of risk premia in volatility forecasting. Journal of Business and Economic Statistics 25, 411-426.

Christensen, B., & Prabhala, N. (1998). The relation between implied and realized volatility. Journal of Financial Economics, 125-150.

Christoffersen, P. F. (2003). Elements of Financial Risk Management. Academic Press.

Clark, T. E., & McCracken, M. W. (2009). Improving Forecast Accuracy by Combining Recursive and Rolling Forecasts. International Economic Review Vol.50, No.2, 363-395.

Day, T., & Lewis, C. (1992). Stock market volatility and the information content of stock index options. Journal of Econometrics 52, 267-287.

Dennis, P., Mayhew, S., & Stivers, C. (2006). Stock returns, implied volatility innovations and the asymmetric volatility phenomenon. Journal of Financial and Quantitative Analysis, 381-406.

Diebold, F., & Rudebusch, G. (1991). Forecasting output with the composite leading index: A real-time analysis. Journal of the American Statistical Association 86, 603-610.

Doran, J., Ronn, S., & Ehud, I. (2005). The Bias in Black-Scholes/Black Implied Volatility: An Analysis of Equity and Energy Market. Review of Derivatives Research 8, 177-198.

Ederington, L., & Guan, W. (2010). How asymmetric is US stock market volatility? Journal of Financial Market, Vol.13 No.2, 225-248.

Engle, R., & Rosenberg, J. (2000). Testing the volatility term structure using option hedging criteria. Journal of Derivatives 8, 10-28.

Fernandes, M., Medeiros, M., & Scharth, M. (2014). Modeling and predicting the CBOE market volatility index. Journal of Banking & Finance 40, 1-10.

Fleming, J., Ostdiek, B., & Whaley, R. (1995). Predicting Stock Market Volatility A New Measure. Journal of Futures markets 15(3), 265-302.

French, K., Schwert, G., & Stambaugh, R. (1987). Expected stock returns and volatility. Journal of Financial Economics 19, 3-30.

Ge, W. (2014). Understanding the sources of the insurance risk premium. CBOE.

Hansen, C., Prabhala, N., & Christensen, B. (2001). The Telescoping Overlap Problem in Options Data. AFA 2002 Atlanta Meetings. Retrieved from https://ssrn.com/abstract=276311

Hentschel, L. (2003). Errors in implied volatility estimation. Journal of Financial and Quantitative Analysis 38(4), 779-810.

Hull, J. C. (2006). Option, Futures, and Other Derivatives. Upper Saddle River: Pearson/Prentice Hall.

Imlak, S., & Puja, P. (2013). The information content of implied volatility index. Glob Bus Perspect, 359-378.

Imlak, S., & Puja, P. (2016). On the relationship between implied volatility index and equity index return. Journal of Economic Studies, Vol 43 Issue:1, 27-47.


Jiang, G., & Tian, Y. (2005). Model-Free Implied Volatility and Its Information Content. The Review of Financial Studies, 1305-1342.

Jorion, P. (1995). Predicting Volatility in the Foreign Exchange Market. Journal of Finance, 507-528.

Koopman, S., Jungbacker, B., & Hol, E. (2005). Forecasting daily variability of the S&P 100 stock index using historical, realized and implied volatility measurements. Journal of Empirical Finance 12, 445-475.

Maggie, M., & Thomas, E. (1999). Market Timing: Style and Size Rotation Using the VIX. Financial Analysts Journal Vol 55, Issue 2, 73-81.

Maheu, J., & McCurdy, T. (2011). Do high-frequency measures of volatility improve forecasts of return distributions? Journal of Econometrics 160, 69-76.

Martens, M. (2002). Measuring and forecasting S&P 500 index futures volatility using high-frequency data. Journal of Futures Markets, 497-518.

Merton, R. (1980). Estimating the expected return on the market: An exploratory investigation. Journal of Financial Economics, Vol 8, Issue 4, 323-361.

Neely, C. (2002). Forecasting Foreign Exchange Volatility: Why Is Implied Volatility Biased and Inefficient? And Does It Matter? St. Louis: Federal Reserve Bank.
