Forecasting Stock Market Volatility With Implied Volatility As Benchmark

Koen Reijnders s1348418 University of Groningen Faculty of Economics and Business

1 August 2009

Abstract

This thesis tests whether a GARCH (1,1) (Generalized Autoregressive Conditional Heteroskedastic) model or a model based on historical volatility is the better forecaster of implied volatility, using a clean and fixed benchmark. The test was performed with the GARCH (1,1) model, the EWMA (Exponentially Weighted Moving Average) model and the RW (Random Walk) model. The models are calibrated in-sample and their forecasting performance is tested out-of-sample using four accuracy statistics. The outcomes of the forecasting models are compared to a fixed benchmark, the implied volatility, represented by the VIX and VDAX. In this research the GARCH (1,1) model outperformed the EWMA and the RW model. JEL Classification: C22, C52, C53, G14 and G17


I. Introduction

According to Investopedia, volatility is a measure of the dispersion of returns for a given security or market index. Volatility is therefore an important subject in financial markets: the investor who best predicts volatility can achieve a better performance. Derivative traders are an example of traders trying to profit from the market by predicting volatility. Last year we have seen tremendous turmoil in worldwide financial markets, causing large swings on the exchanges. These large alternating returns caused huge volatility, which made stock and index options more expensive (because of the volatility value in options). This made large short-term gains or losses possible.

In this thesis I try to forecast volatility using a GARCH-type model and two models in the historical volatility class. As input for these models the S&P100 and the DAX30 (the index of the Frankfurt Stock Exchange) are used. The forecasting power of the models is compared and they are benchmarked against the VIX (for the S&P100 input) and the VDAX (for the DAX30 input). In this way I try to come to a conclusion about which of the chosen models is the better forecaster of volatility.


[…] input in the benchmark. In previous research where no fixed benchmark is used, but different benchmarks, the results cannot be seen as volatility forecasting; these results should be interpreted as volatility measuring (e.g. Balaban, et al. 2006 and Aboura, 2005). When a fixed benchmark is used, the results are cleaner, which yields volatility forecasting instead of volatility measuring.

The VIX is an index representing the implied volatility of the Standard and Poor's (S&P) 500 (as of 2008). On October 24th 2008 the VIX reached an intraday high of 89,53, whereas the average over the period 1990-2003 was 20,20 (CBOE). This shows that while the VIX was already an important index since 1990, it became even more prominent in recent history. Nowadays we can buy derivatives on the VIX, even in The Netherlands; examples are the VIX turbo long and the VIX turbo short (ABN AMRO markets).

This thesis is organized as follows. The next chapter is the theoretical framework, where the existing literature on volatility forecasting is discussed; this provides the necessary background for the third chapter, where the data used for this study are discussed. Chapter four then presents the methodology. In chapter five the outcomes of the different models are discussed. The last part of this thesis consists of conclusions and recommendations for further research.

II. Theoretical Framework


[…] and strike price, the volatility can be calculated. The market determines this implied volatility (Black and Scholes, 1973). However, since there are indices like the VIX and the VDAX, implied volatility on the underlying index, respectively the S&P100 (until 2003) and the DAX30, is directly observable.

Because implied volatility can be calculated, given the option price and the other Black and Scholes variables, efficient option and stock markets imply that implied volatility should absorb all information contained in the other variables, and therefore implied volatility should contain all relevant information available in the market (Christensen and Prabhala, 1998). Since implied volatility should contain all relevant information in the market, it is a very good benchmark for volatility forecasts. Changes in volatility in the markets are caused by uncertainty: the more uncertainty there is, the higher the volatility will be. Uncertainty represents, for example, unknown important news; when this news comes out, it should be directly reflected in all prices in all markets. Sometimes this reflection takes some time, but arbitrageurs try to keep markets as close to perfect as they can be.

Nowadays there are indices that portray implied volatility. Two examples that are used in this thesis are the VIX, which represents the implied volatility on the S&P100, and the VDAX, which represents the implied volatility on the DAX30 (Fleming et al., 1995). In the data description in chapter three a more elaborate explanation of the VIX and the VDAX will be given.


[…] group of researchers, e.g. Dunis, et al. (2003) and Noh, et al. (1994), found another outcome: GARCH-type (Generalized Autoregressive Conditional Heteroskedastic) models are the best forecasters of future volatility. This difference in forecasting power could be caused by the dataset chosen, but other explanations are possible as well: the time span chosen, the models used, market conditions, the benchmark chosen or the accuracy statistics used. The underlying reason is that different models can cope with different circumstances, and different circumstances can mean a better or worse performance of a specific model (Poon and Granger, 2003). Regarding the inconclusive part of the literature, I find that one of the main problems is the way benchmarks are used. Many papers use a forecasting model that shares input variables with the benchmark they use (e.g. Balaban, et al. 2006 and Aboura, 2005). In that case this forecasting model starts one step ahead of the other forecasting models, which leaves no fair comparison; therefore I have chosen to deal with this particular problem of the existing literature. In this thesis one clear benchmark is used: the implied volatility (represented by the VIX and the VDAX in this research). In this thesis the VIX and VDAX do not have the same input variables as the forecasting models.


[…] volatility, and thus the conclusion that past volatility has more predictive power than implied volatility is not fully justified. Christensen and Prabhala (1998) claim that their research does not suffer from the above-mentioned problems, as a result of using non-overlapping data and longer time series.

The easiest way to forecast volatility is to use realized volatility: the most recent realized volatility is simply assumed to hold in the future as well. However, is this a good way to forecast volatility? This depends on different factors. The most important factor on which the accuracy of forecasting depends is perhaps the time to maturity of the derivative (Figlewski, 1997). When derivatives were developed, most had a time to maturity of only a couple of months. People who forecasted volatility at that time simply assumed volatility to change gradually; given this assumption, the volatility of the recent past was quite a good predictor of short-term future volatility. Nowadays, however, derivatives are traded with many different times to maturity, and some can have a time to maturity of more than 10 years (Figlewski, 1997). In this thesis a fixed benchmark is used: for the S&P100 this benchmark is the VIX, and for the DAX30 this is the VDAX. A fixed benchmark makes a fair comparison possible. In this thesis the data on the fixed benchmarks from 1992 to 2003 are used. These benchmarks cover the whole time span of the research, and they are based on derivatives with different times to maturity, not only very short ones.


[…] absorbs the information content in past volatility (Christensen and Prabhala, 1998). However, the study of Christensen and Prabhala is a good example of what happened in earlier research: they developed their own method to measure volatility, so their research cannot be compared to earlier research. This is one of the main problems of the existing literature. There are many different methods to research volatility, and these methods can be interpreted differently by researchers, which ultimately leads to incomparable results. In this thesis a fixed benchmark is used. If in the future there are more studies with a fixed benchmark, the results will be comparable, because the benchmark is fixed and therefore the method is similar. This also leads to clean results and to volatility forecasting instead of volatility measuring.

As Figlewski (1997) describes, volatility is subject to non-lognormality: it has fat tails. Volatility is also subject to mean reversion; over a longer time span there are some serious outbursts in volatility, but in the long run it reverts back to the mean. Figlewski (1997) concluded that very advanced models are good for in-sample research, because they are robust against small changes. Out-of-sample, however, it is the other way round: the advanced models are not able to cope with the big changes, whereas the simple models are, and therefore the simple models give a better (more robust) forecast out-of-sample, especially over longer horizons. Because these models are quite simple, they do not perform that well in-sample, since there are fewer possibilities for calibrating them (Figlewski, 1997). In this research I make use of relatively simple models, because this thesis incorporates an out-of-sample part. The most advanced model used in this thesis is the GARCH (1,1) model, a model often used in the literature, yet one of the simpler GARCH-type models.


[…] studies state something like: with this regression it becomes clear that the first or the latter is the better forecaster (e.g. Aboura, 2005). The goal of this study is to go another way: the volatility indices (in this case the VIX and the VDAX) are the benchmark for volatility forecasting. This results in a study that tries to tell whether a GARCH (1,1) model or a model based on historical volatility is the better forecaster of implied volatility. In the existing literature the implied volatility indices are also used as input for the forecasting models, which results in measuring volatility instead of forecasting volatility. These studies (e.g. Aboura, 2005) give little information on the real forecasting power of the models; they compare the information in the implied volatility indices to, for instance, the information in historical volatility. In this thesis I try to come to a conclusion about the forecasting performance of the different models when a fixed benchmark is used. This fixed benchmark ensures that there is one clear method with clear results, which results in forecasting volatility, not measuring volatility.

In standard finance the assumptions of a bell curve (normal distribution) and efficient markets are often made in order to create a world where models apply; Black and Scholes (1973), for example, assumed efficient markets and no transaction costs in buying or selling stocks or options. In the “real” world these assumptions cannot be fully justified.


[…] assumptions have to be made, while we know they are not fully justified; however, these assumptions are close enough to reality to arrive at a good model.

In this thesis I will try to give an answer to the following hypotheses:

H0: A GARCH (1,1) model is a better forecaster of implied volatility than an EWMA or Random Walk model, when a fixed benchmark is used.

H1: An EWMA or Random Walk model is a better forecaster of implied volatility than a GARCH (1,1) model, when a fixed benchmark is used.

III. Data Description

In the past, data on implied volatility had to be calculated from option prices. These days there are indices that directly portray implied volatility, e.g. the VIX on the S&P100 (until 2003, when the underlying was changed to the S&P500) and the VDAX on the DAX30. In the Dutch market implied volatility still has to be calculated from option prices. The VIX is the best-known volatility index and has existed for more than 15 years, so there is sufficient information on the VIX available. The VDAX is, like the VIX, an index constructed on very liquid options, and since it started in 1992 there is also enough information available. Moreover, the VIX and VDAX are designed to portray implied volatility and do so in a consistent manner. These reasons are the basis for using the VIX and the VDAX, and the S&P100 and the DAX30, as the indices for my research.


[…] years of data is a sufficiently large dataset and the underlying and methods for the VIX changed, I chose this dataset (Enders, 2004). On this dataset I will test the forecasting power of the different methods out-of-sample. If a sufficiently large dataset is used, it is normal to use approximately 50 percent of the dataset for the model and approximately 50 percent to test the model out-of-sample (Enders, 2004). I follow this rule of thumb in my research, meaning that the in-sample period spans from 1992 to 1998 and the out-of-sample period is the remaining part of the dataset, which spans from 1999 to 2003.

The data on the VIX and the VDAX are collected from Bloomberg. With these data I am able to use implied volatility as a benchmark for the forecasting models. The daily total returns on the S&P100, which correct for dividends, are obtained from Bloomberg. Because the DAX30 is a performance index it is already corrected for dividends, so I obtained the daily returns from Bloomberg directly.

The VIX is constructed in such a way that it represents a volatility index based on a group of options that in total are at the money, with a constant time to expiry of 22 trading days (Fleming, et al., 1995). “VIX is a weighted index of American implied volatilities calculated from eight near-the-money, near-to-expiry, S&P 100 call and put options and it is constructed in such a way as to eliminate mis-measurement and “smile” effects” (Blair, et al., 2001). The VIX also adjusts for cash dividends with a binomial valuation method, with trees adjusted for these dividends and their timing. In addition the VIX takes the bid-ask bounce into account by using the midpoint of the most recent bid-ask quote (Blair, et al., 2001). In the VIX both put and call options are used, to ensure that there are no put-call parity problems. Much more could be said about the construction of the VIX, but this is quite technical and not within the scope of this thesis. For further information on the construction of the VIX, I recommend the appendix of Whaley (1993) and Whaley (2009).


[…] for this. The basis for calculating the implied volatility of the DAX30 are the DAX option contracts. In addition, eight sub-indices, with a similar time to maturity as the DAX options, are calculated. These eight sub-indices all have a different time to maturity, varying from 1 month to 24 months, and are calculated with series of four at-the-money options. The VDAX and its sub-indices are calculated once a day at 5:45 pm (deutsche-boerse.com). As with the VIX, there is a lot more to the construction of the VDAX than lies within the scope of this thesis; for technical information on the construction of the VDAX I recommend the guide to the volatility indices of Deutsche Börse (2007).

In table 1 the descriptive statistics of the DAX30 and the S&P100 are presented. The logarithmic returns of the indices were calculated and multiplied by 100 to get the percentage change. This was done as in previous research (e.g. Dunis, et al., 2003). The formula that was used to calculate the log return is:

(1) s_t = 100 · log(P_t / P_{t−1})

where s_t stands for the logarithmic return and P_t stands for the closing quote of the relevant index at day t.
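The log-return calculation of equation (1) can be sketched in a few lines of Python; the closing quotes below are made-up illustrative values, not the actual index data used in the thesis.

```python
import math

# Hypothetical closing quotes P_t for an index over six days (illustrative values).
closes = [100.0, 101.5, 100.8, 102.3, 101.9, 103.0]

# Equation (1): s_t = 100 * log(P_t / P_{t-1}) -- daily log return in percent.
log_returns = [100 * math.log(p_t / p_prev)
               for p_prev, p_t in zip(closes, closes[1:])]

print(log_returns)
```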

Table 1: Descriptive Statistics of daily log returns 2-jan-1992 to 30-dec-2003 of the S&P100 and the DAX30


[…] distributed: they are skewed and leptokurtic (Brooks, 2002). However, Brooks (2002) states that there is no single clear solution for non-normality; each method to make a dataset normal has downsides as well. Furthermore Brooks (2002) states: “For sample sizes that are sufficiently large, violation of the normality assumption is virtually inconsequential”. As my dataset is sufficiently large, I chose to leave the data in their original state.
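The Bera-Jarque normality statistic used for these checks can be sketched in pure Python; the sample below is simulated, not the thesis data.

```python
import math
import random

def jarque_bera(x):
    """Bera-Jarque statistic: JB = n/6 * (S^2 + (K - 3)^2 / 4),
    where S is the sample skewness and K the sample kurtosis.
    Large values indicate departure from normality."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)

random.seed(0)
normal_sample = [random.gauss(0, 1) for _ in range(5000)]
print(jarque_bera(normal_sample))  # small for (near-)normal data
```

A leptokurtic (fat-tailed) series, like the log returns in tables 1 and 3, would produce a far larger statistic.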

In table 2 the descriptive statistics of the VIX and VDAX can be found. These statistics are calculated on the daily volatility of these indices. To calculate the daily volatility I used a method proposed by Becker, et al. (2007) which is described in the methodology.

Table 2: Descriptive statistics of daily volatility in the period 2-jan-1992 to 30-dec-2003 on the VIX and VDAX

Table 2 shows non-normal, highly leptokurtic distributions for both the VIX and the VDAX. The daily volatility has the same mean as median; concerning the minimum and maximum there are some outliers, and these outliers all lie above the mean daily volatility. This means that over the sample period the volatility tends around its mean, with some periods of upward outbursts of daily volatility.


Table 3: Descriptive Statistics of daily log returns 2-jan-1992 to 30-dec-1998 of the S&P100 and the DAX30

Table 3 shows a similar picture to table 1. Both the S&P100 and the DAX30 have a slightly positive mean daily log-return, slightly larger than the mean daily log-return over the entire period. The Bera-Jarque test clearly shows that the log-returns are not normally distributed: they are skewed and leptokurtic (Brooks, 2002), even more so than the dataset for the entire period.

In table 4 the descriptive statistics of the VIX and VDAX over the first period (1992-1998) are presented. These statistics are, as for the entire period, calculated on the daily volatility of these indices according to the method proposed by Becker, et al. (2007).


Table 4 shows non-normal, highly leptokurtic distributions for both the VIX and the VDAX, as does table 2 for the entire period. However, the period of 2-jan-1992 to 30-dec-1998 shows a far more skewed and leptokurtic distribution than the entire period. This is probably caused by the big differences between the years: e.g. 1998 was a year with extremely high volatility, whereas 1993 was a year with extremely low volatility, and both years are covered within the first period. The highest value for the VIX in the period 1992-1998 was 48,56, and the lowest was 9,04, a difference of 39,52; the period 1999-2003 only had a difference of 35,13 between the highest and lowest values. Something similar can be seen in the data of the VDAX: the difference between the highest and lowest value was 46,98 in the period 1992 to 1998, against 41,16 within the second time span.

In table 5 the descriptive statistics of the DAX30 and the S&P100 over the second period (1999-2003) are presented.

Table 5: Descriptive Statistics of daily log returns 4-jan-1999 to 30-dec-2003 of the S&P100 and the DAX30


[…] normally distributed: they are just a little skewed and have only a little excess kurtosis (Brooks, 2002).

In table 6 the descriptive statistics of the VIX and VDAX over the second period are presented. These statistics are, as for the entire period, calculated on the daily volatility of these indices.

Table 6: Descriptive statistics of daily volatility in the period 4-jan-1999 to 30-dec-2003 on the VIX and VDAX

Table 6 shows non-normal, leptokurtic distributions for both the VIX and the VDAX, as does table 2 for the entire period. However, the period of 4-jan-1999 to 30-dec-2003 shows a somewhat less skewed and leptokurtic distribution than the entire period. This is probably caused by the high base level of volatility in this period: the lowest VDAX value in this period was 17,09 and the lowest VIX value was 15,35, whereas the lowest values for the VDAX and VIX in the first period were 9,36 and 9,04 respectively.

IV. Methodology


[…] same as the input for the benchmark (e.g. Balaban, et al. 2006 and Aboura, 2005); this gives unclear results and is not really forecasting volatility, but building a model that can best forecast itself. To avoid building a model that tries to forecast itself, I assume the implied volatility to be the best benchmark available. In this study there is one clear benchmark, and all models are compared to this benchmark, leading to a fair comparison. In this way I can come to a conclusion on whether a GARCH (1,1) model or a model based on historical volatility is the best forecaster of implied volatility. I understand that the approach I take to forecast the implied volatility also has problems (as described before), but in this way the existing literature can be seen in a different light.

The aim of this thesis is to forecast volatility with a fixed benchmark, and to give another perspective on the existing literature using this fixed benchmark. As described before, there are different forecasting methods, and the outcomes of these methods will be compared with the implied volatility. The dataset is cut in half: the first half for the in-sample period (1992-1998), the second half for the out-of-sample period (1999-2003). The realized volatility is calculated using daily data. The actual testing of the models is done out-of-sample, in order to see the real forecasting power of the different models. There has to be an in-sample period as well; the purpose of this in-sample period is to calibrate the models, and the outcomes of this calibration are the input for the models in the out-of-sample period. The out-of-sample forecasting for the models based on historical volatility is conducted in Excel. The input for these models are the S&P100 and the DAX30. The GARCH (1,1) forecasting is conducted with EViews, using the static forecasting function in EViews 4.1.


To be able to compare the outcomes of the different models with the VDAX and the VIX, the daily volatility of these indices has to be calculated. Becker, et al. (2007) proposed a method to calculate this daily volatility. I follow this method, which is represented in equation 2; equation 3 shows the calculation for the daily volatility of the VDAX.

(2) Daily volatility VIX = (VIX / 100)² / 252

(3) Daily volatility VDAX = (VDAX / 100)² / 252
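As a sketch, the conversion of equations (2) and (3) from an annualized index level in percent to a daily figure (this reflects my reading of the Becker, et al. (2007) formula as reconstructed above):

```python
# The VIX and VDAX quote annualized volatility in percent, so the daily
# (variance-scale) figure is (index / 100)^2 / 252, with 252 trading days
# per year -- equations (2) and (3).
def daily_volatility(index_level: float) -> float:
    return (index_level / 100) ** 2 / 252

print(daily_volatility(20.20))  # the long-run VIX average cited in the text
```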

The forecasting methods in the ‘historical standard deviation’ class are the RW (equation 4) and the EWMA (equation 5). The RW model is a naïve model that uses the most recent observation as the one-step-ahead forecast of the volatility a day ahead. EWMA is a model that uses past observations and gives an exponentially decreasing weight to these observations: the more recent the observation, the more influence it has on the outcome of the model. In fact the Random Walk model is an EWMA model with a weight of 1 assigned to lambda; when lambda is 1, the daily percentage change is no longer of influence and only the value of the previous day, as in the Random Walk, remains (Hull, 2006).

(4) RW: σ̂_t = σ_{t−1}

(5) EWMA: σ²_n = λ·σ²_{n−1} + (1 − λ)·u²_{n−1}

Here σ̂_t represents the forecasted volatility and σ_{t−1} the volatility on t−1; σ²_n is the forecasted variance for day n; u is the daily percentage change in the market variable, defined as u_i = (S_i − S_{i−1}) / S_{i−1}; and λ is a constant between 0 and 1, which determines the reaction speed of the model (Hull, 2006).
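The RW/EWMA recursion can be illustrated with a short Python sketch; the returns and lambda below are made-up values (the thesis calibrates lambda in Excel with the Solver).

```python
def ewma_variance(returns, lam, var0):
    """One-step-ahead EWMA variance forecasts (equation 5):
    sigma2_n = lam * sigma2_{n-1} + (1 - lam) * u_{n-1}^2.
    With lam = 1 the recursion collapses to the random-walk forecast
    sigma2_n = sigma2_{n-1} (equation 4)."""
    var = var0
    forecasts = []
    for u in returns:
        var = lam * var + (1 - lam) * u ** 2
        forecasts.append(var)
    return forecasts

# Hypothetical daily percentage changes u_t (illustrative values).
u = [0.01, -0.02, 0.015, -0.005]
print(ewma_variance(u, lam=0.94, var0=0.0001))  # RiskMetrics-style lambda
print(ewma_variance(u, lam=1.0, var0=0.0001))   # random walk: unchanged
```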

In the GARCH class, the GARCH (1,1) model (Bollerslev, 1986) will be used; this model is represented in equation 6. GARCH (1,1) is somewhat similar to the EWMA model: the EWMA model is a very specific case of the GARCH (1,1) model, namely a GARCH (1,1) with γ = 0, α = 1 − λ and β = λ. The main difference between GARCH (1,1) and EWMA therefore lies in the weight assigned to the long-run average variance rate (Hull, 2006). According to Dunis, et al. (2003) GARCH (1,1) is superior to other GARCH models, like E-GARCH or GJR-GARCH. This is why I chose to use the GARCH (1,1) model.

The input of the GARCH (1,1) model will be the return data extracted from Bloomberg on the S&P100 and DAX30. From these return data the GARCH (1,1) model computes volatility forecasts, which can be compared with the corresponding benchmark: for the GARCH (1,1) outcomes based on the S&P100 input the relevant benchmark is the VIX, and for the GARCH (1,1) outcomes based on the DAX30 input the relevant benchmark is the VDAX.

(6) GARCH (1,1): σ²_n = γ·V_L + α·u²_{n−1} + β·σ²_{n−1}

In this equation γ is the weight assigned to V_L, α is the weight assigned to u²_{n−1}, β is the weight assigned to σ²_{n−1}, and V_L is the long-run average variance rate, which can be seen as a constant (Hull, 2006).
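A minimal sketch of the GARCH (1,1) recursion of equation 6; the parameters below are illustrative, not the EViews estimates reported in this thesis.

```python
def garch_variance(returns, omega, alpha, beta, var0):
    """GARCH(1,1) one-step-ahead variance (equation 6):
    sigma2_n = omega + alpha * u_{n-1}^2 + beta * sigma2_{n-1},
    where omega = gamma * V_L. Setting omega = 0, alpha = 1 - lam
    and beta = lam recovers the EWMA model of equation 5."""
    var = var0
    out = []
    for u in returns:
        var = omega + alpha * u ** 2 + beta * var
        out.append(var)
    return out

# Hypothetical daily percentage changes (illustrative values).
u = [0.01, -0.02, 0.015, -0.005]
print(garch_variance(u, omega=1e-6, alpha=0.08, beta=0.90, var0=1e-4))
```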


As the name suggests, the RMSE is an accuracy statistic that measures the average size of the error: the difference between the forecast and the actual measure is squared, the squared differences are averaged, and the square root of this average is taken. This makes the RMSE sensitive to big forecasting errors (Brooks, 2002).

(7) RMSE = [ (1/n) · Σ_{t=1..n} (σ_f,t − σ_a,t)² ]^0.5

All accuracy statistics have the variables σ_f and σ_a: σ_f stands for the forecasted volatility, σ_a for the actual observed volatility. As I use the volatility indices VIX and VDAX as a fixed benchmark, the VIX and VDAX give the values I use as actual observed volatility. All accuracy statistics compare the outcomes of the forecasting models to the relevant data of the implied volatility indices. This assures that the outcomes of the accuracy statistics portray the forecasting accuracy of the models with a fixed benchmark (the VIX and VDAX).

The MAE calculates the average absolute forecasting error, subtracting the actual observed value from the forecasted value and taking the absolute value. Because the MAE averages absolute rather than squared forecast errors, it is less sensitive to outliers (Brooks, 2002).

(8) MAE = (1/n) · Σ_{t=1..n} |σ_f,t − σ_a,t|

The MAPE is a percentage error: in contrast to the RMSE and the MAE, the outcome is a percentage. The MAPE measures the forecasting error and divides it by the actual value, which leaves a percentage (Makridakis, 1993). According to Makridakis (1993): “the MAPE is the only measure that means something to decision makers who have trouble even understanding medians, not to mention geometric means.”

(9) MAPE = (100/n) · Σ_{t=1..n} |σ_f,t − σ_a,t| / σ_a,t


Also with Theil's U-statistic a low value represents a more accurate forecast. The advantage of the Theil's U statistic used here is that it necessarily lies between 0 and 1 and that it is independent of the scale of the variables (Makridakis and Hibon, 1979).

(10) Theil's U = √[ (1/n) · Σ_{t=1..n} (σ_f,t − σ_a,t)² ] / ( √[ (1/n) · Σ_{t=1..n} σ_f,t² ] + √[ (1/n) · Σ_{t=1..n} σ_a,t² ] )
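The four accuracy statistics (equations 7-10) can be sketched together in Python; the forecast and benchmark series below are made-up illustrative values, not the thesis data.

```python
import math

def accuracy_stats(forecast, actual):
    """RMSE (eq. 7), MAE (eq. 8), MAPE (eq. 9) and Theil's U (eq. 10),
    sketched from the formulas in the text. `actual` plays the role of
    the fixed benchmark (the VIX or VDAX daily volatility)."""
    n = len(forecast)
    errs = [f - a for f, a in zip(forecast, actual)]
    rmse = math.sqrt(sum(e ** 2 for e in errs) / n)
    mae = sum(abs(e) for e in errs) / n
    mape = 100 * sum(abs(e) / a for e, a in zip(errs, actual)) / n
    theil_u = rmse / (math.sqrt(sum(f ** 2 for f in forecast) / n)
                      + math.sqrt(sum(a ** 2 for a in actual) / n))
    return rmse, mae, mape, theil_u

f = [0.00020, 0.00018, 0.00025]  # hypothetical forecasts
a = [0.00022, 0.00017, 0.00024]  # hypothetical benchmark values
print(accuracy_stats(f, a))
```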

The most accurate forecasting model will be the model with the lowest score on the accuracy statistics. The null-hypothesis will be tested accordingly.

V. Results

In this chapter the results of the modelling part of this thesis are described and discussed. The research consisted of an in-sample part and an out-of-sample part: first the outcomes of the in-sample model calibration are discussed, then the out-of-sample forecasts following the in-sample period are described. These outcomes are discussed with the help of the accuracy statistics, which show the accuracy of the different models in comparison to each other.


EViews calculates the best fit of a GARCH (1,1) model with a maximum likelihood function. The alpha, beta and omega for the GARCH (1,1) model are determined by this maximum likelihood function.
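As a rough illustration of what such a maximum-likelihood fit does (a Python stand-in for EViews, not the thesis's actual estimation), the sketch below simulates returns from a known GARCH (1,1) process and recovers approximate parameters by minimizing the Gaussian negative log-likelihood over a crude grid; omega is held at its true value for simplicity.

```python
import math
import random

def garch_nll(u, omega, alpha, beta, var0):
    """Gaussian negative log-likelihood of a GARCH(1,1) variance path,
    up to constants: 0.5 * sum(log(sigma2_t) + u_t^2 / sigma2_t)."""
    var, nll = var0, 0.0
    for x in u:
        nll += 0.5 * (math.log(var) + x ** 2 / var)
        var = omega + alpha * x ** 2 + beta * var
    return nll

# Simulate returns from a known GARCH(1,1) process.
random.seed(1)
true_o, true_a, true_b = 1e-6, 0.08, 0.90
var, u = 1e-4, []
for _ in range(3000):
    u.append(random.gauss(0, math.sqrt(var)))
    var = true_o + true_a * u[-1] ** 2 + true_b * var

# Crude grid search over (alpha, beta) with the stationarity
# constraint alpha + beta < 1.
best = min(((garch_nll(u, true_o, a / 100, b / 100, 1e-4), a / 100, b / 100)
            for a in range(2, 20, 2) for b in range(80, 98, 2)
            if a / 100 + b / 100 < 1), key=lambda t: t[0])
print(best[1:])  # with enough data this should land near (0.08, 0.90)
```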

Table 7: EWMA in-sample model estimates of the S&P100 and the DAX30 with the leading accuracy statistic, which is minimized using the Solver function in Excel.

The lambda in the EWMA model determines the weight of the previous observation of volatility, whereas the beta represents the moving-average part of the equation. A lower lambda means more weight on the most recent daily percentage change, which means a faster response to the market, but also a more volatile outcome of the model (Hull, 2006). The results in table 7 show that for the DAX30 the daily percentage change clearly is the most important forecaster of volatility within the EWMA model. For the S&P100 the daily percentage change and the previous observation are almost equally important in forecasting volatility within the EWMA model.

Table 8: Variables and coefficients calculated by EViews with the GARCH (1,1) model with probabilities for the DAX30 and the S&P100


Table 9: In-sample accuracy statistics of the three different models for the S&P100

Table 10: In-sample accuracy statistics of the three different models for the DAX30

In order to compare the different models there must be some sort of ranking; I will use the absolute ranking method proposed by Balaban, et al. (2006). Balaban, et al. (2006) rank the different models with each accuracy statistic: the best model according to the RMSE gets a 1, the second model a 2 and the third model a 3, and the same holds for the MAE, MAPE and Theil's U-statistic. The model with the lowest total score is the model with the best performance. In the in-sample period we can clearly see that for both indices and for all accuracy statistics the GARCH (1,1) model performed best, EWMA came in second and the RW model was the worst performer. The actual out-of-sample forecasting performance was the goal of this thesis, so it will be interesting to see whether the models show the same relative performance out-of-sample as in-sample. In tables 11 and 12 the accuracy statistics of the out-of-sample forecasting of the S&P100 and the DAX30 can be found.
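The absolute ranking of Balaban, et al. (2006) can be sketched as follows; the scores loosely echo table 11, but the function itself is a generic illustration.

```python
def rank_models(scores):
    """Absolute ranking as in Balaban et al. (2006): for each accuracy
    statistic, rank the models 1..k (lower statistic = better), then sum
    the ranks per model; the lowest total wins."""
    totals = {model: 0 for model in scores}
    n_stats = len(next(iter(scores.values())))
    for i in range(n_stats):
        ordering = sorted(scores, key=lambda m: scores[m][i])
        for rank, model in enumerate(ordering, start=1):
            totals[model] += rank
    return totals

# (RMSE, MAE, MAPE, Theil's U) per model, loosely echoing table 11.
scores = {"RW":    (0.01388, 0.01051, 37.44, 0.95876),
          "EWMA":  (0.00029, 0.00023, 0.76,  0.47075),
          "GARCH": (0.00015, 0.00013, 0.42,  0.27699)}
print(rank_models(scores))  # GARCH gets the lowest total
```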

Table 11: Out-of-sample accuracy statistics of the three different models for the S&P100

        RMSE      MAE       MAPE     Theil's U
RW      0,01388   0,01051   37,44%   0,95876
EWMA    0,00029   0,00023   0,76%    0,47075
GARCH   0,00015   0,00013   0,42%    0,27699

Table 12: Out-of-sample accuracy statistics of the three different models for the DAX30

        RMSE      MAE       MAPE     Theil's U
RW      0,00853   0,00590   53,57%   0,96573
EWMA    0,00018   0,00011   0,81%    0,51909
GARCH   8,44E-05  6,07E-05  0,44%    0,29242

Tables 11 and 12 clearly show the same line of results as the in-sample model calibration. The GARCH (1,1) model performs best on all accuracy statistics for both the S&P100 and the DAX30, the EWMA model performs second best for both indices, and the RW model has the worst performance on all accuracy statistics for both indices. The different models and accuracy statistics show the same line of results for both the DAX30 and the S&P100, and even approximately the same figures for the different accuracy statistics. This suggests that the VDAX and the VIX have a similar relation to the DAX30 and the S&P100 respectively; when used as fixed benchmarks, both seem fit for their respective indices. From a worldwide perspective, the DAX30, S&P100, VDAX and VIX used in this research were at the time among the biggest and most liquid indices in the world. Seeing the results in tables 11 and 12 and the graphs in figures 1 and 2, it seems that the world economy has a big impact on all four indices: they all show similar patterns and figures. It would be interesting to see in further research whether the same line of results is shown when more indices are used than the ones in this thesis.

Figures 1 and 2 portray the volatility indices and the GARCH (1,1) forecasts. The EWMA and RW models are not included in the figures, because their forecasts deviate too much from the actual volatility indices to fit readably in one graph. Figure 1 shows the GARCH (1,1) forecast for the S&P100 together with the actual VIX.
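The GARCH (1,1) forecasts plotted in the figures follow, in essence, a simple one-step-ahead variance recursion. The sketch below uses hypothetical parameter values; the EViews estimates used in the thesis are not reproduced in this section.

```python
def garch_forecast(returns, omega, alpha, beta):
    """One-step-ahead GARCH(1,1) variance forecasts:

        sigma2[t+1] = omega + alpha * r[t]**2 + beta * sigma2[t]

    The recursion is initialised at the unconditional variance
    omega / (1 - alpha - beta), which requires alpha + beta < 1.
    Returns a list with one forecasted variance per observation.
    """
    sigma2 = omega / (1.0 - alpha - beta)  # unconditional variance as start value
    forecasts = []
    for r in returns:
        sigma2 = omega + alpha * r ** 2 + beta * sigma2
        forecasts.append(sigma2)
    return forecasts
```

With illustrative values such as omega = 1e-6, alpha = 0.08 and beta = 0.90, the forecasts stay strictly positive and mean-revert toward the unconditional variance omega / (1 - alpha - beta), which is the persistence behaviour visible in the figures.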

[Figure 1 here: line graph, January 1999 – October 2003, vertical axis 0,00%–0,12%; series: "Daily Volatility VIX" and "Forecasted GARCH in EViews"]

Figure 1: The daily volatility of the VIX together with the volatility forecasted by the GARCH (1,1) model for the period 1999-2003

Figure 1 shows that the GARCH (1,1) forecast stays quite close to the actual daily volatility according to the VIX: the fixed benchmark has almost the same pattern as the modeled GARCH (1,1) outcomes. In previous research the GARCH (1,1) model was often one of the better forecasters of volatility (Dunis et al., 2003; Noh et al., 1994). This research shows that, with a fixed benchmark, the GARCH (1,1) model is also a good predictor of volatility, in this research even the best.
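Because the VIX is quoted as an annualized percentage, it has to be de-annualized before it can be compared to daily volatility forecasts. A possible conversion, assuming the common 252-trading-day convention (the thesis's exact transformation is not shown in this section):

```python
import math

TRADING_DAYS = 252  # assumed annualization convention

def vix_to_daily_vol(vix_quote):
    """Convert an annualized VIX quote (e.g. 20.0 = 20%) to daily volatility."""
    return (vix_quote / 100.0) / math.sqrt(TRADING_DAYS)

def vix_to_daily_var(vix_quote):
    """Daily variance implied by an annualized VIX quote."""
    return vix_to_daily_vol(vix_quote) ** 2
```

Under this convention a VIX of 20 corresponds to roughly 1,26% daily volatility and roughly 0,016% daily variance, which is the order of magnitude of the vertical axes in Figures 1 and 2.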

[Figure 2 here: line graph, January 1999 – October 2003, vertical axis 0,00%–0,20%; series: "Daily Volatility VDAX" and "Forecasted GARCH in EViews"]

Figure 2: The daily volatility of the VDAX together with the volatility forecasted by the GARCH (1,1) model for the period 1999-2003

Figure 2 shows the same image as Figure 1: the volatility forecasted by the GARCH (1,1) model stays quite close to the actual daily volatility portrayed by the VDAX. As described above, this research shows the same pattern of results as part of the literature (Dunis et al., 2003; Noh et al., 1994): also with a fixed benchmark, here the VDAX, the GARCH (1,1) model is the best forecaster of volatility. Figure 2 clearly shows that even a fixed benchmark displays a pattern similar to the GARCH (1,1) outcomes. Because a fixed benchmark is used, the results above are clean and actually portray a forecast of volatility rather than a measure of volatility.

For both the DAX30 and the S&P100, not one forecasting model used in this thesis uses information that is also used for the benchmark, so the outcome of the forecast is not flawed. In other studies forecasts are sometimes flawed because the benchmark is built on the same information that also serves as input for the forecasting models (e.g. Balaban et al., 2006 and Aboura, 2005).

The results of this thesis are not directly comparable with previous research. As described above, the GARCH (1,1) model clearly outperformed the EWMA and RW models, which is in line with previous findings (e.g. Dunis et al., 2003; Noh et al., 1994). However, this outcome does not carry the same meaning as in previous research, because this thesis uses a fixed benchmark: two volatility indices. A fixed benchmark makes the results incomparable with research that did not use a fixed benchmark, or with research in which the benchmark shared input variables with the forecasting models, which gave those models a head start over models that did not use the same input information. Therefore I can state that in my research the GARCH (1,1) model clearly outperformed the RW and EWMA models; the remaining room for further research is discussed in the recommendations below.

VI. Conclusion and further recommendations

The hypothesis that this thesis tries to test is the following:

H0: A GARCH (1,1) model is a better forecaster of implied volatility than an EWMA or Random Walk model, when a fixed benchmark is used.

H1: An EWMA or Random Walk model is a better forecaster of implied volatility than a GARCH (1,1) model, when a fixed benchmark is used.

Based on this thesis, H1 can be rejected and H0 therefore accepted. However, because this thesis does not test all available models, it cannot be stated that a GARCH (1,1) model is the best forecaster of implied volatility in general. What can be stated is that, according to all four accuracy statistics, the GARCH (1,1) model was a better forecaster of implied volatility than the two historical-volatility models used in this thesis.
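The two historical-volatility models used in this thesis can be sketched as follows. The RiskMetrics decay factor λ = 0.94 is an assumption, as the value calibrated in the thesis is not reproduced in this section.

```python
def ewma_forecast(returns, lam=0.94, init_var=None):
    """One-step-ahead EWMA variance forecasts (RiskMetrics-style):

        sigma2[t+1] = lam * sigma2[t] + (1 - lam) * r[t]**2

    Initialised at the squared first return unless init_var is given.
    """
    sigma2 = init_var if init_var is not None else returns[0] ** 2
    forecasts = []
    for r in returns:
        sigma2 = lam * sigma2 + (1.0 - lam) * r ** 2
        forecasts.append(sigma2)
    return forecasts

def rw_forecast(volatility):
    """Random-walk forecast: tomorrow's volatility equals today's observed value."""
    return list(volatility[:-1])  # the forecast for t+1 is simply the value at t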

For further research it is recommended that more models are tested using this approach. There are more models available in the GARCH family, but also in the historical volatility class there are many more available models. In this way it can be tested if a GARCH (1,1) model really is a better forecaster than other models. Next to this there are also forecasting models based on extreme-value volatility which can be integrated in a fixed benchmark research. In a research with a fixed benchmark it is very important that the forecasting models do not make use of the same input information as the benchmark or at least all forecasting models make use of the same amount of information.

(28)

28

VII. References

Aboura, S., 2005, Predicting future volatility with ATM implied volatility – a study on VX1, VDAX and VIX, The Cyprus Journal of Sciences, Vol. 3, p. 249-276.

Äijö, 2008, Implied volatility term structure linkages between VDAX, VSMI and VSTOXX volatility indices, Global Finance Journal, Vol. 18, p. 290-302.

Amin, K.I., Ng, V. K., 1997, Inferring Future Volatility from the Information in Implied Volatility in Eurodollar Options: A New Approach, Review of Financial Studies, Vol. 10, is. 2, p. 333-367.

Balaban, E., Bayar, A. and Faff, R.W., 2006, Forecasting stock market volatility: Further international evidence, The European Journal of Finance, Vol. 12, No. 2, p. 171-188.

Becker, R., Clements, A.E., White, S.I., 2007, Does implied volatility provide any information beyond that captured in model-based volatility forecasts?, Journal of Banking & Finance, Vol. 31, Issue 8, p. 2535-2549.

Black, F. and Scholes, M., 1973, The Pricing of Options and Corporate Liabilities, Journal of

Political Economy, Vol. 81, Issue 3, p. 637-654.

Blair, B.J., Poon, S. and Taylor, S.J., 2001, Forecasting S&P 100 Volatility: The Incremental Information Content of Implied Volatilities and High-Frequency Index Returns, Journal of

Econometrics, Vol. 105, is. 1, p. 5-26.

Bluhm, H., Yu, J., 2000, Forecasting volatility: evidence from the German Stock Market,

Working Paper, University of Auckland.

Bollerslev, T., 1986, Generalized Autoregressive Conditional Heteroskedasticity, Journal of

(29)

29 Brace, A., Hodgson, A., 1991, Index Futures Options in Australia – An Empirical Focus on Volatility, Accounting and Finance, Vol. 31 n. 2, p. 13-31.

Brooks, C., 2002, Introductory econometrics for finance, Cambridge, 2002.

Christensen, B.J. and Prabhala, N.R., 1998, The relation between implied and realized volatility.

Journal of Financial Economics 50, p. 125–150.

Corrado, C.J., Miller, JR. T.W., 2005, The forecast quality of CBOE Implied volatility indexes,

The Journal of Futures Markets, Vol. 25, No. 4, p. 339-373.

Day, T.E. and Lewis, C.M., (1992). Stock market volatility and informational content of stock index options, Journal of Econometrics, Vol. 52, p. 267-287.

Dunis, C. L. and Chen, Y. X., 2005, Alternative volatility models for risk management and trading: Application to the EUR/USD and USD/JPY rates, Derivatives Use, Trading &

Regulation, Vol. 11, No. 2, p. 126-156.

Dunis, C. L., Laws, J. and Chauvin, A., 2003, FX volatility forecasts and the informational content of market data for volatility, The European Journal of Finance, Vol. 9, p. 242-272.

Enders, W., 2004, Applied Econometric Time Series, John Wiley & Sons, Hoboken.

Figlewski, S., 1997, Forecasting Volatility, Financial Markets, Institutions & Instruments, vol. 6, iss. 1, p. 1-88.

Fleming, J., Ostdiek, B. and Whaley, R.E., 1995, Predicting stock market volatility: a new measure. Journal of Futures Markets, Vol. 15, p. 265–302.

Hull, J.C., 2006, Options, futures, and other derivatives, Sixth Edition, Prentice Hall, New Jersey.

(30)

30 Makridakis, S., 1993, Accuracy measures: Theoretical and practical concerns, International

Journal of Forecasting, Vol. 9, Issue 4, p. 527-529.

Makridakis, S., Hibon, M., 1979, Accuracy of Forecasting: An Empirical Investigation, Journal

of the Royal Statistical Society, Vol. 142, No. 2, p. 97-145.

Mandelbrot, B., 1971, When can price be arbitraged efficiently? A limit to the validity of the random walk and martingale models, Review of Economics & Statistics, Vol. 53 Issue 3, p. 225-237.

Mandelbrot, B., Taleb, N. N., 2005, How the Finance Gurus Get Risk All Wrong, Fortune, vol. 152, Issue 1, cover story.

Noh, J., Engle, R.F., Kane, A., 1994, Forecasting Volatility and option prices of the S&P 500 Index, Journal of Derivatives, vol. 2, p. 17-30.

Poon, S, and Granger C. W. J., 2003, Forecasting Volatility in Financial Markets: A Review,

Journal of Economic Literature, Vol. 41, Issue 2, p. 478-539.

Tse, Y.K., 1991, Stock Return Volatility in the Tokyo Stock Exchange, Japan World

Economy, Vol. 3, p. 285-298.

Walsh, D.M. and Tsou, G.Y., 1998, Forecasting index volatility: sampling integral and non-trading effects, Applied Financial Economics, Vol. 8, p. 477-485.

Whaley, R. E., 1993, Derivatives on market volatility: Hedging tools long overdue, Journal of

Derivatives, vol. 1, p. 71-84.

Whaley, R. E., 2009, Understanding VIX, Journal of Portfolio Management, Forthcoming.

Internet:

(31)

31 http://www.cboe.com/micro/vix/historical.aspx

www.deutsche-boerse.com

www.investopedia.com

Referenties

GERELATEERDE DOCUMENTEN

After determining whether the most accurate volatility estimation model is also not drastically different to the realized volatility (benchmark) obtained from

Next to this, we can conclude that in all cases, except for the stock exchange in Shanghai, the effect of negative news has a larger impact on volatility than positive.. All

This is only done for the three bivariate models, since the Q(5) and Q(30)-statistic suggest that the trivariate model may not adequately capture the

The primary goal of learning by doing is to foster skill development and the learning of factual information in the context of how it will be used. It is based on

 to determine the ecological condition or health of various wetlands by identifying the land-cover types present in wetlands, as well as in the upslope catchments by

al (2013) worden een aantal suggesties gedaan voor lessen die hierop gericht zijn. Mijn lesontwerp zal dus aan zoveel mogelijk van deze suggesties moeten voldoen. Dit betekent

Niet alleen waren deze steden welvarend, ook was er een universiteit of illustere school gevestigd; daarmee wordt nogmaals duidelijk dat de firma Luchtmans zich met hun

SNLMP can be introduced as transition systems with stochastic and non-deterministic labelled transitions over a continuous state space.. Moreover, structure must be imposed over