Forecasting volatility using Artificial Neural Networks and parametric methods

Jhordano S. Aguilar Loyo

11385308

Faculty of Economics and Business, Section of Quantitative Economics

University of Amsterdam

This dissertation is submitted for the degree of

MSc. Econometrics

Supervisor: dr. M. J. van der Leij

Second reader: prof. dr. H. P. Boswijk


Abstract

This study compares the forecasting performance of artificial neural networks and parametric methods for predicting volatility in financial markets. Unlike previous work, the aim is not to establish which approach outperforms the other, but to identify the circumstances under which each method is preferable. The empirical analysis shows that the preferred method depends on the frequency of the volatility being forecast. For daily volatility, a linear model with changing regimes is advisable; for weekly volatility, the artificial neural network technique shows a robust performance.


Statement of Originality

This document is written by Jhordano Sayuri Aguilar Loyo who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Introduction

Financial agents make decisions frequently: they must choose the components of their portfolios, decide whether to sell or buy a particular asset, and so on. To deal with these problems they have data at their disposal, along with tools to interpret it. Nevertheless, agents cannot observe all the variables that determine the outcome of their decisions; a typical example is volatility. Knowledge of volatility informs decisions on portfolio selection, risk management and option pricing.

One of the first attempts to model volatility was made by Engle (1982), who introduced the ARCH model. Several extensions of this model have been proposed to capture stylized facts of financial markets, such as the asymmetric response to negative shocks and the nonlinear relation between current and past volatility. An alternative approach to estimating volatility is realized volatility, which bases its calculations on high-frequency data. Several authors point out that realized volatility approximates volatility more accurately than other proxies.

There are several techniques to forecast volatility, and the most popular are parametric methods such as GARCH and HAR models. A different approach is the Artificial Neural Network (ANN), which, unlike parametric methods, does not assume a predefined functional form for volatility. ANN is a computational technique that models nonlinear relations between target variables and input variables through layers and activation functions. The use of ANN has spread in recent years to many topics in economics and finance, opening the path to exploring new dynamics and modeling complex interactions.

The literature is not conclusive on whether ANN outperforms parametric techniques or vice versa. Instead, we can see these methods as complementary: depending on the circumstances, one method is preferable to the other. The present research evaluates the performance of these two techniques in forecasting volatility at two frequencies, daily and weekly, and determines which approach is preferable for each frequency.


preference for a particular model or group of models depends on the frequency of volatility that will be predicted. For daily volatility, it is preferable to use the HAR-MSW model (a parametric model with a switching regime) and for weekly volatility, it is advisable to use ANN models as well as the HAR-MSW model.

The present work is divided into four sections. The first section presents the literature review. The second section describes the methodology, covering all the models used and the approach to comparing them. In the third section, we analyze the main stock indices of five countries and determine the best model for each frequency. Finally, the fourth section presents the conclusion and possible extensions.

1 Literature review

In finance, knowledge of volatility is an important input for managing risk, pricing derivatives and hedging portfolios. The estimation of volatility is a broadly studied topic; there are many definitions of it, as well as many techniques and methods to estimate it. One important issue is that volatility is not an observable variable. Instead, it is a latent variable that we can only estimate through proxy variables.

One of the most popular approaches to volatility estimation was proposed by Engle (1982). He proposed a model in which volatility changes over time (as opposed to the assumption of homoscedastic volatility) and depends on past values; the family of models that follow the approach of Engle (1982) is known in the literature as GARCH models. Several variations of GARCH models have been proposed to capture stylized facts of financial data. Extensive surveys of the GARCH family are given in Teräsvirta (2008) and Andersen et al. (2006).

Volatility estimation through GARCH models constitutes a noisy proxy for true volatility, since these models associate only one observation (the squared return) with the volatility of a given period. This problem has been addressed by the availability of high-frequency data, which has allowed for new estimates of volatility. Andersen et al. (2001) introduced the concept of realized volatility, a nonparametric estimator that sums the observed squared intraday returns. Several researchers have developed the concept of realized volatility further and used it for forecasting purposes. Hansen and Lunde (2011) present state-of-the-art techniques that use high-frequency data to predict future realizations of volatility.

The majority of models used to predict volatility are parametric, in the sense that they assume a specific functional form for volatility. A different approach is given by data-driven techniques, which try to find patterns in the data instead of fitting pre-assumed functions. The artificial neural network is a promising data-driven technique whose popularity has grown considerably, owing to the availability of large data sets, the increase in computing power and the development of more efficient algorithms.

The development of artificial neural networks has gone through many stages. The first ANN architecture is attributed to McCulloch and Pitts (1943). The authors built a mathematical model to simulate the function of the brain; in particular, they studied neural events and the relations among them. The first modern ANN architecture was developed by Ivakhnenko (1971); he modelled the input-output relationship of a complex system using a multilayer perceptron structure. Another cornerstone contribution was made by Werbos (1982) through the introduction of the backpropagation algorithm.

ANN can reproduce functions with complex structure and nonlinear patterns, which makes it an indispensable tool for evaluating financial time series. The main application of ANN in finance and economics has been forecasting. Nevertheless, there is no broadly accepted procedure for building an ANN: some researchers rely on statistical indicators, while practitioners base their models on a shotgun (trial-and-error) methodology.

Most ANN research is carried out by researchers outside the econometric field, who focus more on case studies and are less concerned with theory-based evaluations. ANN has some drawbacks recognized by many authors, such as the risk of over-parametrization and overfitting. Additionally, the interpretation of ANN parameters is not as straightforward as that of parametric models.


than traditional statistical models. Zhang et al. (1998) made a survey of works that compare ANN and conventional forecasting approaches and found that the result of the comparison depends on the problem that is faced.

The majority of works that compare ANN and conventional statistical models base their studies on observable variables, such as prices and returns. Few studies compare the performance of the two approaches in forecasting latent variables, since for forecasting purposes most ANNs use supervised learning, and this approach requires the desired output to be known.

With the development of more accurate estimations of latent variables, realized volatility being a clear example, comparisons between conventional statistical techniques and ANN have been conducted. The most recent work corresponds to Vortelinos (2017); the author finds that for forecasting daily volatility the Heterogeneous Autoregressive model outperforms other models, including ANN.

The present research extends the work of Vortelinos (2017) by considering different forecast frequencies (daily and weekly). Additionally, we explore different specifications for the ANN and we incorporate a recent extension of the HAR model that considers different regimes.

2 Methodology

In the following sections, we explain the methodology used to compare the forecasting performance of parametric methods and ANN. First, we describe the calculation of realized volatility; next, we describe the parametric models and the ANN models. Finally, we describe the procedure to compare these models.

2.1 Realized volatility

Unlike share prices and their returns, volatility is not an observable variable; instead, we need to approximate it. Several authors suggest the use of realized volatility as a good approximation of the true volatility; see Patton and Sheppard (2009), Patton (2010), and Hansen and Lunde (2011). The calculation of realized volatility relies on the availability of high-frequency data. The notational framework follows Hansen and Lunde (2011).

The daily return is defined as:

$$y_t = Y(t) - Y(t-1),$$

where $Y(t)$ is the logarithmic price of some asset. Additionally, we can observe the price values during the day, $Y(\tau_i)$, and define the intra-day return as:

$$y_{t,i} = Y(\tau_i) - Y(\tau_{i-1}), \quad \text{where } t-1 \le \tau_i \le t \text{ and } i \in [0, 1, \ldots, N_t],$$

where $N_t$ corresponds to the number of observations available in one day. In continuous time, we assume that $Y(t)$ is described by the following equation:

$$dY(t) = \mu(t)\,dt + \sigma(t)\,dW(t).$$

$\mu(t)$ represents the drift, $\sigma(t)$ denotes the spot volatility and $W(t)$ is a standard Brownian motion. When the first two components are independent of $W(t)$, we obtain:

$$y_t \mid \mu_t, IV_t \sim N(\mu_t, IV_t), \quad \text{where } \mu_t = \int_{t-1}^{t} \mu(s)\,ds \text{ and } IV_t = \int_{t-1}^{t} \sigma^2(s)\,ds.$$

For the diffusion process described above, the integrated variance ($IV_t$) associated with day $t$ is the integral of the instantaneous variance over the one-day interval $[t-1, t]$. In discrete time the equivalent of $IV_t$ is given by:

$$QV_t = \lim_{N_t \to \infty} \sum_{i=0}^{N_t} y_{t,i}^2.$$

through realized variance ($RV_t$):

$$RV_t^{(1)} = \sum_{j=0}^{N_t} y_{t,j}^2 \tag{1}$$

Usually, the observed intra-day returns are contaminated by microstructure errors, which are associated mostly (but not exclusively) with the bid-ask spread. Because of this problem, researchers do not use the full sample available. Instead, they calculate realized volatility from a subsample in which observations are separated by an equal time interval, 5 minutes being the most widely accepted choice. We call this calculation of realized volatility $RV_t^{(2)}$:

$$RV_t^{(2)} = \sum_{j=0}^{N_t/k} y_{t,j}^2 \tag{2}$$

where $k$ represents the time separation between intra-day returns. For higher values of $k$ the problem of microstructure errors is diminished, but more data must be discarded. This trade-off between reducing the microstructure error and using all the available data is addressed by the estimator $RV_t^{(3)}$.

$RV_t^{(3)}$, like $RV_t^{(2)}$, uses equally spaced intraday returns but takes different non-overlapping samples. That is to say, the first sample is composed of the first observation and the successive observations $k$ periods ahead $(y_{t,1}, y_{t,1+k}, y_{t,1+2k}, \ldots)$, the second sample is composed of the second observation and the successive observations $k$ periods ahead $(y_{t,2}, y_{t,2+k}, y_{t,2+2k}, \ldots)$, and so on. In total we have $k$ non-overlapping samples that cover all the observations. We calculate the realized volatility for each subsample ($RV_t^{q}$) and take the average of these estimators:

$$RV_t^{(3)} = \frac{\sum_{q=0}^{k} RV_t^{q}}{k}, \tag{3}$$

where

$$RV_t^{q} = \sum_{j=0}^{N_k} y_{t,j,q}^2, \quad y_{t,j,q} \in (y_{t,q}, y_{t,q+k}, y_{t,q+2k}, \ldots, y_{t,q+jk}, \ldots, y_{t,q+N_k k}).$$

Zhang et al. (2005) evaluate the accuracy of several estimators of realized volatility, including the ones presented above, and propose a combination of $RV_t^{(1)}$ and $RV_t^{(3)}$ as the best estimator, which is unbiased and consistent. We denote this estimator as $RV_t^{(4)}$, and it is given by:

$$RV_t^{(4)} = RV_t^{(3)} - \frac{\hat{k}}{N} RV_t^{(1)}, \tag{4}$$

where $\hat{k} = \frac{N - k + 1}{k}$.

The present research uses $RV_t^{(4)}$ as a proxy for the true volatility $\sigma_t$ because of its advantages over the other estimators presented above, as discussed by Zhang et al. (2005).
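As an illustration, the four estimators can be sketched in a few lines of Python. This is a minimal sketch rather than the implementation used in this research: the function names are hypothetical, the input is assumed to be the list of intraday log prices of one day, and the sparse estimator is assumed to take returns between every k-th price.

```python
def intraday_returns(log_prices):
    """Differences of consecutive log prices."""
    return [b - a for a, b in zip(log_prices, log_prices[1:])]


def rv_all(log_prices):
    """First estimator: sum of squared returns using every observation."""
    return sum(r * r for r in intraday_returns(log_prices))


def rv_sparse(log_prices, k, offset=0):
    """Second estimator: squared returns between every k-th price (assumption)."""
    return rv_all(log_prices[offset::k])


def rv_subsampled(log_prices, k):
    """Third estimator: average of the k sparse estimators, one per offset."""
    return sum(rv_sparse(log_prices, k, q) for q in range(k)) / k


def rv_two_scale(log_prices, k):
    """Fourth estimator: subsampled RV minus (k_hat / N) times the full-sample RV."""
    n = len(log_prices) - 1  # number of intraday returns
    k_hat = (n - k + 1) / k
    return rv_subsampled(log_prices, k) - (k_hat / n) * rv_all(log_prices)
```

With 1-minute data, `k = 5` would correspond to the 5-minute spacing mentioned above. On very short samples the bias correction can make the last estimator negative, so the sketch is only meaningful for a realistic number of intraday observations.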

2.2 Parametric methods

2.2.1 GARCH-X

The principal characteristic of conventional statistical methods is the parameterization of volatility. These methods assume a functional form of the true volatility. The widely used GARCH models assume that returns are characterized by the following equations:

$$y_t = \sqrt{h_t}\, z_t, \quad z_t \sim iid(0, 1),$$

$$h_t = \omega + \alpha y_{t-1}^2 + \beta h_{t-1}.$$

The volatility predicted by the GARCH model is given by $h_t$. It is important to highlight that traditional GARCH models only use present and past values of the returns $(y_t, y_{t-1}, y_{t-2}, \ldots)$ to model volatility and do not take into account the intra-day returns.
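The variance recursion above can be illustrated with a short Python sketch. It assumes the parameters are already known (in practice they are estimated, typically by maximum likelihood), and it starts the recursion at the unconditional variance, which is an assumption of this sketch and not something stated in the text. The function name is hypothetical.

```python
def garch_variance(returns, omega, alpha, beta):
    """Conditional variances from h_t = omega + alpha * y_{t-1}^2 + beta * h_{t-1}.

    Returns a list whose last element is the one-step-ahead forecast.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite unconditional variance")
    # Assumption: initialise at the unconditional variance omega / (1 - alpha - beta).
    h = [omega / (1.0 - alpha - beta)]
    for y in returns:
        h.append(omega + alpha * y * y + beta * h[-1])
    return h
```

Each new variance uses only the previous daily return and the previous variance, which makes explicit why intra-day information never enters the standard model.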

Nevertheless, there are some extensions of GARCH models that take into account the intra-day returns through the incorporation of realized volatility, see Engle (2002) and Engle and Gallo (2006). In general, these models are called GARCH-X and use additional exogenous regressors to explain volatility. The exogenous variable in our GARCH model will be the RV of the previous period. The model can be summarized by the following equations.

$$y_t = \sqrt{h_t}\, z_t, \quad z_t \sim iid(0, 1),$$

2.2.2 HAR model

A different approach to predicting volatility is the Heterogeneous Autoregressive model of Realized Volatility (HAR), proposed by Corsi (2009). The paper proposes an additive cascade model of volatility components defined over different time periods. The author bases his idea on the Heterogeneous Market Hypothesis, arguing that there are different kinds of market participants with different trading preferences (short-term traders, medium-term investors and long-term investors). Additionally, he assumes that volatility at low frequencies, such as monthly and weekly, explains volatility at high frequencies.

This model can be represented as an AR-type model in the realized volatility with different volatility components over different time horizons. We use $t$ to index days and $t_w$ to index weeks. The original HAR model to predict daily volatility can be summarized by the following equations:

$$RV_{t+1}^{d} = c + \beta_d RV_t^{d} + \beta_w RV_t^{w} + \beta_m RV_t^{m} + \varepsilon_{t+1}, \tag{5}$$

where:

$$RV_t^{w} = \frac{1}{5}\left(RV_t^{d} + RV_{t-1d}^{d} + RV_{t-2d}^{d} + RV_{t-3d}^{d} + RV_{t-4d}^{d}\right),$$

$$RV_t^{m} = \frac{1}{22}\left(RV_t^{d} + RV_{t-1d}^{d} + \ldots + RV_{t-21d}^{d}\right).$$

Hence the model predicts future volatility using past realized volatilities at different frequencies (daily, weekly and monthly). Corsi (2009) argues that, despite its simplicity, the model can capture some stylized facts of financial time series (long memory and fat tails). Equation (5) is used to predict daily volatility; to predict weekly volatility we use equation (6):

$$RV_{t_w+1}^{w} = c + \beta_w RV_{t_w}^{w} + \beta_m RV_{t_w}^{m} + \varepsilon_{t_w+1}, \tag{6}$$

where:

$$RV_{t_w}^{m} = \frac{1}{4}\left(RV_{t_w}^{w} + RV_{t_w-1w}^{w} + RV_{t_w-2w}^{w} + RV_{t_w-3w}^{w}\right),$$

and $RV_{t_w}^{w}$ represents the weekly volatility. The HAR model is estimated by ordinary least squares.
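To make the daily specification in equation (5) concrete, the sketch below builds the three regressors from a daily series and estimates the coefficients by ordinary least squares through the normal equations. It is a minimal pure-Python sketch under stated assumptions: the function names are hypothetical, the demo series is synthetic (not data from this study), and a production estimate would normally rely on a numerical library.

```python
import math


def har_rows(rv, week=5, month=22):
    """Pairs (RV_{t+1}, [1, RV^d_t, RV^w_t, RV^m_t]) built from a daily RV series."""
    rows = []
    for t in range(month - 1, len(rv) - 1):
        rv_w = sum(rv[t - week + 1 : t + 1]) / week
        rv_m = sum(rv[t - month + 1 : t + 1]) / month
        rows.append((rv[t + 1], [1.0, rv[t], rv_w, rv_m]))
    return rows


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_har(rv):
    """OLS estimates (c, beta_d, beta_w, beta_m) from the normal equations X'X b = X'y."""
    rows = har_rows(rv)
    p = 4
    xtx = [[sum(x[i] * x[j] for _, x in rows) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * y for y, x in rows) for i in range(p)]
    return solve(xtx, xty)


# Synthetic series for illustration only.
demo_rv = [1.0 + 0.5 * math.sin(0.7 * i) + 0.3 * math.cos(1.3 * i) for i in range(120)]
demo_coef = fit_har(demo_rv)
```

The weekly specification in equation (6) would follow the same pattern with two regressors instead of three.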

2.2.3 Markov Switching - HAR Model

Even though the HAR model can capture important features of realized volatility, it is still a linear model that fails to capture nonlinearities in volatility. This issue has been addressed through different modifications of the model. Feng et al. (2015) evaluate many extensions and conclude that the HAR model with a switching regime has the best forecasting performance. We refer to this model as HAR-MSW.

The HAR-MSW is composed of different equations that describe volatility, where each equation represents a specific regime ($s_t$). We consider two regimes, one that corresponds to low volatility ($s_t = 0$) and another that corresponds to high volatility ($s_t = 1$). The equations that represent this model are the following:

$$RV_{t+1}^{d} = \begin{cases} c_0 + \beta_{0d} RV_t^{d} + \beta_{0w} RV_t^{w} + \beta_{0m} RV_t^{m} + \varepsilon_{t+1}, & \text{if } s_t = 0 \\ c_1 + \beta_{1d} RV_t^{d} + \beta_{1w} RV_t^{w} + \beta_{1m} RV_t^{m} + \varepsilon_{t+1}, & \text{if } s_t = 1 \end{cases} \tag{7}$$

The state variable or regime $s_t$ is unobservable and is governed by a Markov chain. The Markov property states that the probability that $s_t$ takes a specific value depends only on its immediate past value ($s_{t-1}$). Hence, the transition probability matrix is denoted as:

$$P = \begin{pmatrix} P(s_t = 0 \mid s_{t-1} = 0) & P(s_t = 1 \mid s_{t-1} = 0) \\ P(s_t = 0 \mid s_{t-1} = 1) & P(s_t = 1 \mid s_{t-1} = 1) \end{pmatrix} = \begin{pmatrix} p_{00} & 1 - p_{00} \\ 1 - p_{11} & p_{11} \end{pmatrix} \tag{8}$$

There are many methods to estimate the parameters of a Markov switching model; maximum likelihood and Bayesian inference (Gibbs sampling) are the most popular. We use maximum likelihood for our estimations, and the Markov chain probabilities are calculated using Hamilton's filter. For a complete description of the maximum likelihood estimation see Frühwirth-Schnatter (2006).
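The filtering step can be sketched as follows for two regimes. This is an illustrative sketch, not the estimation code of this research: it assumes Gaussian errors with a regime-specific standard deviation, takes the regression coefficients as given, starts from the stationary distribution of the chain, and uses hypothetical function and argument names. In a maximum likelihood routine, the returned log-likelihood would be maximized over the parameters.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def hamilton_filter(y, x, betas, sigmas, p00, p11):
    """Filtered regime probabilities and log-likelihood for a two-regime regression.

    y: targets; x: one list of regressors per observation;
    betas[j], sigmas[j]: coefficients and error standard deviation in regime j.
    """
    trans = [[p00, 1.0 - p00], [1.0 - p11, p11]]
    # Assumption: initial probabilities equal the stationary distribution.
    pi0 = (1.0 - p11) / (2.0 - p00 - p11)
    prob = [pi0, 1.0 - pi0]
    filtered, loglik = [], 0.0
    for y_t, x_t in zip(y, x):
        # Prediction step: propagate last period's probabilities through the chain.
        pred = [sum(prob[i] * trans[i][j] for i in range(2)) for j in range(2)]
        # Update step: weight each regime by the likelihood of the new observation.
        dens = [
            normal_pdf(y_t, sum(b * v for b, v in zip(betas[j], x_t)), sigmas[j])
            for j in range(2)
        ]
        joint = [pred[j] * dens[j] for j in range(2)]
        total = joint[0] + joint[1]
        prob = [joint[0] / total, joint[1] / total]
        filtered.append(prob)
        loglik += math.log(total)
    return filtered, loglik
```

For the HAR-MSW model, each element of `x` would hold the constant and the daily, weekly and monthly components, and `betas` would hold the two coefficient vectors of equation (7).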

2.3 Artificial Neural Networks

The artificial neural network is a computational technique that processes information by relating input variables to desired outputs. ANNs base their computations on the interconnection of simple units called artificial neurons. These models were inspired by the functioning of a biological neuron and how it generates and propagates electrical impulses.

The units of an artificial neural network are grouped in different layers. The first layer is the input layer, which is composed mainly of explanatory variables or the raw data that we want to analyze. The last layer is the output layer, whose units are the objective of the analysis. Finally, the middle layers are called hidden layers and help to relate the input layer to the output layer. See Figure 1 for a graphical description.

Figure 1: Artificial Neural Network

In mathematical terms, the equations that relate the inputs and the output variables are the following (for one hidden layer):

$$Z = W'X, \quad Y = \varphi(Z),$$

where $X$ is a vector that represents the input variables, $W$ is a matrix that relates the input variables to the hidden units $Z$, $\varphi(\cdot)$ is an activation function and $Y$ is a vector of output variables. The number of layers and the number of units in each layer determine the structure of the ANN. Additionally, there are many specifications of how the layers are related; the most popular is the multilayer perceptron, in which the value of a unit is related to all the values of the previous layer.
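A forward pass through a multilayer perceptron can be sketched as below. The sketch is illustrative only: the bias terms, the tanh activation in the hidden layer and the linear output are assumptions of this example, and the helper names are hypothetical.

```python
import math


def identity(z):
    """Linear activation, used here for the output layer."""
    return z


def dense(weights, bias, inputs, activation):
    """One fully connected layer: activation(W x + b), with one row of W per unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, bias)
    ]


def forward(layers, inputs):
    """Propagate the inputs through a list of (weights, bias, activation) layers."""
    values = inputs
    for weights, bias, activation in layers:
        values = dense(weights, bias, values, activation)
    return values


# Two inputs, one hidden layer with two tanh units, one linear output unit.
demo_layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], math.tanh),
    ([[1.0, 1.0]], [0.5], identity),
]
demo_output = forward(demo_layers, [1.0, -1.0])
```

Every unit in a layer receives all values of the previous layer, which is the fully connected structure described above.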

The training of the ANN model is the process by which a particular algorithm determines the values of $W$, the back-propagation algorithm being the most widely used among practitioners. The objective of the algorithm is to minimize the discrepancy between the value predicted by the model and the desired value (given by the output node); one indicator of this discrepancy is the mean squared error (MSE). To minimize the MSE, the algorithm updates the values of $W$ in proportion to the deviation between the predicted values and the true values, where $\alpha$ represents the learning rate, that is, the proportion at which the values of $W$ are updated:

$$W_{ij}^{(l+1)} = W_{ij}^{(l)} - \alpha \frac{\partial}{\partial W_{ij}^{(l)}} J(W, b), \quad \text{where } J(W, b) = \frac{1}{2} \lVert Y - \hat{Y} \rVert^2.$$

To guarantee a good generalization of the model and avoid over-fitting, the data is split into three parts: training data, validation data, and testing data. The first two portions of the data are used to train the model. The back-propagation algorithm updates the values of W using the training data, and this process stops when the MSE in the validation data starts to increase. It is worth mentioning that the validation data is not used to update the values of W , only to stop the learning process.
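The stopping rule can be illustrated with a deliberately small example: gradient descent on a one-variable linear model, monitored on a separate validation set. This is a sketch under stated assumptions and not the Keras setup used in this research; the `patience` parameter (how many non-improving epochs are tolerated before stopping) and all names are hypothetical choices for the example.

```python
def mse(w, b, data):
    """Mean squared error of the line y = w * x + b on a list of (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def fit_early_stopping(train, valid, lr=0.05, max_epochs=500, patience=5):
    """Gradient descent on the training MSE, stopped by the validation MSE.

    Returns (best validation MSE, w, b) for the best epoch seen.
    """
    w, b = 0.0, 0.0
    best = (mse(w, b, valid), w, b)
    bad_epochs = 0
    n = len(train)
    for _ in range(max_epochs):
        # The gradient is computed from the training data only.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in train) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in train) / n
        w, b = w - lr * grad_w, b - lr * grad_b
        # The validation data only decides when to stop; it never updates w or b.
        loss = mse(w, b, valid)
        if loss < best[0]:
            best, bad_epochs = (loss, w, b), 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    return best
```

Returning the parameters of the best validation epoch, rather than the last one, keeps the model from the point just before the validation error started to rise.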

Figure 2: Early stopping

Source: Chapter 4 of Haykin, S. (2009)

All the ANN models analyzed in this research are trained using the back-propagation algorithm. The validation data consists of 10 percent of the estimation data (which is not used to evaluate the forecasts). The models are implemented in Python using the Keras package.

2.3.1 Conventional ANN

For forecasting purposes, the majority of papers that compare ANN and parametric models, see Vortelinos (2017) and Tang and Fishwick (1993), use a predefined structure for the ANN, in which the ANN is composed of four layers: one input layer, two hidden layers, and one output layer. The output layer has a single node that corresponds to the prediction of the model. The input variables are the present and past realized volatilities at the same frequency. Finally, the two hidden layers have the same number of nodes as the input layer. Figure 3 depicts this model, which we call ANN.

Figure 3: Artificial Neural Network

The following equation summarizes the ANN model for the prediction of daily volatility, which considers the last p observations of realized volatility.

$$RV_{t+1}^{d} = f(RV_t^{d}, RV_{t-1}^{d}, RV_{t-2}^{d}, \ldots, RV_{t-p}^{d}) \tag{9}$$

To determine the number of past observations included in the input layer, we calculate the MSE for different values of p, and we select the specification with the lowest MSE in the validation data, avoiding in this way the risk of overfitting. For the prediction of weekly volatility the structure described before remains.
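The input-target pairs implied by equation (9) can be built as in the sketch below. The function name is hypothetical, and the sketch follows the equation literally, so each sample contains the current value and p past values.

```python
def lagged_samples(rv, p):
    """Pairs ([RV_t, RV_{t-1}, ..., RV_{t-p}], RV_{t+1}) from a realized volatility series."""
    samples = []
    for t in range(p, len(rv) - 1):
        inputs = [rv[t - lag] for lag in range(p + 1)]
        samples.append((inputs, rv[t + 1]))
    return samples
```

Repeating this construction for several values of p and comparing the validation MSE of the fitted networks gives the lag selection described above.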

2.3.2 ANN-BE

The conventional ANN described previously only explores a narrow set of possible structures. We explore a larger range of specifications, in which the number of nodes in the hidden layers is not necessarily the same as the number of nodes in the input layer. We still consider two hidden layers, given that this structure is complex enough to capture important patterns in the data, as described by Zhang et al. (1998).

combinations, given a number of input nodes. The ANN structures with better performance are those with symmetry between the input layer and the hidden layers. We observe that the performance of the model increases when the number of nodes in the hidden layers is equal to or larger than the number of nodes in the input layer. Nevertheless, when we consider a very large number of nodes in the hidden layers (four or five times the number of nodes in the input layer) the performance of the model starts to decrease (see Appendix Figures 12 and 13).

Given m explanatory variables or m lags in the input layer, we evaluate the structures that have m to 2m nodes in the first hidden layer. For the second hidden layer we consider from the same number of nodes as in the first hidden layer up to twice this number. This gives $\frac{(3m+2)(m+1)}{2}$ possible combinations, of which the conventional ANN considers only one. A graphical description of the possible structures is given in Figure 4.
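The count of candidate structures can be checked by direct enumeration. The sketch below uses hypothetical function names and simply lists every pair of hidden-layer sizes allowed by the rule above.

```python
def candidate_structures(m):
    """All (n1, n2) pairs with m <= n1 <= 2m and n1 <= n2 <= 2 * n1."""
    return [
        (n1, n2)
        for n1 in range(m, 2 * m + 1)
        for n2 in range(n1, 2 * n1 + 1)
    ]


def closed_form_count(m):
    """Number of candidate structures according to (3m + 2)(m + 1) / 2."""
    return (3 * m + 2) * (m + 1) // 2
```

For m = 3, for example, the enumeration yields 22 structures, which matches the closed form.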

Figure 4: Possible structures evaluated

We evaluate different lags taking into account the possible combinations described previously. We select the architecture with the lowest MSE in the validation data. For further reference we call this model ANN-BE.


2.3.3 HAR Model and ANN

We also explore a combination of the HAR model and the artificial neural network, which we call HAR-ANN. This model is an artificial neural network with fixed input nodes, which are given by the explanatory variables of the HAR model. As in the previous ANN models, we consider two hidden layers, and the number of nodes in the hidden layers is determined following the procedure described for the ANN-BE model. Equation 10 summarizes the model for the prediction of daily realized volatility.

$$RV_{t+1}^{d} = f(RV_t^{d}, RV_t^{w}, RV_t^{m}) \tag{10}$$

For the prediction of weekly volatility, we have the same structure described before but the explanatory variables are different. Equation 11 summarizes the model.

$$RV_{t_w+1}^{w} = f(RV_{t_w}^{w}, RV_{t_w}^{m}) \tag{11}$$

2.4 Forecast evaluation and comparison

The prediction accuracy of a model is based on the expected loss or distance to the true volatility. There are different specifications of loss functions; however, not all of them rank the forecasts of different models properly. A loss function is called robust if the resulting ranking does not change whether we use the true volatility $\sigma_t^2$ or a proxy variable $\hat{\sigma}_t^2$. That is,

$$E[L(\sigma_t^2, h_{1t})] \gtreqless E[L(\sigma_t^2, h_{2t})] \Leftrightarrow E[L(\hat{\sigma}_t^2, h_{1t})] \gtreqless E[L(\hat{\sigma}_t^2, h_{2t})],$$

for any $\hat{\sigma}_t^2$ s.t. $E[\hat{\sigma}_t^2 \mid F_{t-1}] = \sigma_t^2$, where $F_{t-1}$ is the set of information until period $t-1$, and $h_{1t}$ and $h_{2t}$ are the forecasts of two different models. Patton (2010) lists a set of conditions that is required for a function to be robust. Likewise, the author presents a family of loss functions that are robust and homogeneous. We use two robust loss functions of that family:

$$MSE: \quad L(\hat{\sigma}^2, h) = \frac{1}{T} \sum_{t=1}^{T} \left(\hat{\sigma}_t^2 - h_t\right)^2,$$

$$QLIKE: \quad L(\hat{\sigma}^2, h) = \frac{1}{T} \sum_{t=1}^{T} \left(\frac{\hat{\sigma}_t^2}{h_t} - \log \frac{\hat{\sigma}_t^2}{h_t} - 1\right).$$
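Both loss functions are straightforward to compute once the proxy and the forecasts are aligned. The sketch below uses hypothetical function names and assumes strictly positive proxies and forecasts, which QLIKE requires.

```python
import math


def mse_loss(proxy, forecast):
    """Average squared difference between the volatility proxy and the forecast."""
    return sum((s - h) ** 2 for s, h in zip(proxy, forecast)) / len(proxy)


def qlike_loss(proxy, forecast):
    """Average of s/h - log(s/h) - 1; zero only when every forecast equals the proxy."""
    return sum(s / h - math.log(s / h) - 1.0 for s, h in zip(proxy, forecast)) / len(proxy)
```

QLIKE depends only on the ratio of proxy to forecast, so it penalizes the same proportional error equally in calm and turbulent periods, whereas MSE is dominated by periods of high volatility.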

To determine whether the accuracy of one model is significantly different from that of another, we use the Diebold-Mariano test. This test is a widely accepted procedure that evaluates the null hypothesis of equal accuracy, which states that the expected losses of two forecasting models are equal, $E(d_t) = E[g(e_{it}) - g(e_{jt})] = 0$, where $e_{it}$ and $e_{jt}$ are the forecast errors of models $i$ and $j$, and $g(\cdot)$ is a particular loss function, which in our case is given by MSE or QLIKE. The alternative hypothesis states that the performance of the two models is different.

According to Diebold and Mariano (1995), given a sample path $\{d_t\}_{t=1}^{T}$, the mean of this difference is asymptotically normally distributed:

$$\sqrt{T}(\bar{d} - \mu) \to N(0, 2\pi f_d(0)),$$

where

$$\bar{d} = \frac{1}{T} \sum_{t=1}^{T} [g(e_{it}) - g(e_{jt})], \quad f_d(0) = \frac{1}{2\pi} \sum_{\tau=-\infty}^{\infty} \gamma_d(\tau) \text{ is the spectral density at frequency 0,}$$

$$\gamma_d(\tau) = E[(d_t - \mu)(d_{t-\tau} - \mu)].$$

Taking the above result into consideration, the null hypothesis of equal accuracy can be tested using $S_1$, which is equal to:

$$S_1 = \frac{\bar{d}}{\sqrt{\dfrac{2\pi \hat{f}_d(0)}{T}}}, \quad \text{where } 2\pi \hat{f}_d(0) = \sum_{\tau=-M}^{M} \hat{\gamma}_d(\tau),$$

$\hat{\gamma}_d(\tau)$ is the sample autocovariance, and we consider $M = T^{1/3}$, that provides an adequate
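The statistic $S_1$ can be computed from two series of per-period losses as sketched below. The sketch makes several assumptions that are not stated in the text: M is rounded to the nearest integer, the autocovariances are summed with equal weights, and the two-sided p-value uses the standard normal limit. The function name is hypothetical.

```python
import math


def diebold_mariano(loss_i, loss_j):
    """Return (S1, two-sided p-value) for the null of equal expected loss."""
    d = [a - b for a, b in zip(loss_i, loss_j)]
    t = len(d)
    d_bar = sum(d) / t
    m = int(round(t ** (1.0 / 3.0)))  # truncation lag M = T^(1/3), rounded (assumption)

    def autocov(lag):
        """Sample autocovariance of the loss differential at the given lag."""
        return sum((d[s] - d_bar) * (d[s - lag] - d_bar) for s in range(lag, t)) / t

    # Sum over lags -M..M, using the symmetry of the autocovariance.
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, m + 1))
    if long_run_var <= 0.0:
        raise ValueError("estimated long-run variance is not positive")
    s1 = d_bar / math.sqrt(long_run_var / t)
    p_value = math.erfc(abs(s1) / math.sqrt(2.0))
    return s1, p_value
```

Because the autocovariances are equally weighted, the variance estimate can turn out non-positive in small samples, which is why the sketch raises an error in that case. A positive $S_1$ indicates that model j has the lower average loss.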

3 Empirical analysis

The empirical analysis is divided into six parts. In the first section, we describe the data used; in the second, we present the calculation of realized volatility and its characteristics. The third section shows the estimation of the models. In the fourth section, we compare the one-period-ahead predictions of daily and weekly volatility for the different models. The fifth section shows the results of the robustness exercise, in which the test data is changed from 8 percent to 15 percent of the total data. Finally, the last section compares our results with previous works.

3.1 Data

The data is composed of the main stock indices of five countries: the Dow Jones Industrial Average for the United States, the Deutscher Aktienindex (DAX) for Germany, the Cotation Assistée en Continu (CAC 40) for France, the Financial Times Stock Exchange 100 (FTSE 100) for the United Kingdom, and the Nikkei heikin kabuka (Nikkei 225) for Japan. We refer to these indices as USA30, DEU30, FRA40, UK100 and JPN225 to emphasize the country of origin and the number of stocks that compose each index. The data is obtained from the website of Dukascopy Bank.

The intra-day sampling frequency is 1 minute. We remove holidays and weekends, and we do not take into account observations with missing values. The sample starts on 2 January 2012 and ends on 31 May 2017. We use 8 percent of the data to compare the forecasts of the models, which corresponds to the period from 2 January 2017 to 31 May 2017. The realized volatility is calculated for daily and weekly data.

3.2 Realized volatility

The four estimators of realized volatility described in equations 1 to 4 capture some stylized facts of financial series, such as the high persistence of autocorrelation (see Figure 14 in the Appendix) and the asymmetric response to negative shocks, i.e. periods of negative returns are characterized by high volatility and periods of positive returns are associated with low volatility.

The values of volatility given by $RV^{(1)}$ are higher than those of $RV^{(4)}$; the former can be seen as a crude approximation of volatility that contains the volatility itself and errors of measurement, as well as microstructure errors; $RV^{(4)}$ isolates these problems and gives a more accurate estimate of volatility. To compare the accuracy of the different models we use the values given by $RV^{(4)}$.

Figure 5 displays the realized volatility and the index in levels for the USA30 index. These graphs illustrate the asymmetric response of volatility to negative shocks. The first months of 2016 were characterized by negative returns and high values of volatility. On the other hand, when the USA30 index started to increase, volatility fell.

Figure 5: Realized daily Volatility for the USA30 Index

Note: The top graph displays the realized volatility estimation of the USA 30 Index for the first semester of 2016 and the graph below shows the index in levels.


3.3 Estimation of the models

3.3.1 Parametric models

From the family of considered parametric models, only the HAR and HAR-MSW take into consideration the high persistence of autocorrelation of realized volatility. The HAR and HAR-MSW model this characteristic by incorporating different frequencies of volatility.

The fact that realized volatility has an asymmetric response to negative shocks supports the idea that there are different regimes of volatility. This fact is only taken into account by the HAR-MSW, through the incorporation of an unobservable state variable. From Figure 6 we can observe that the HAR-MSW fits both the high and the low values of volatility better than the HAR model. In general, the HAR model underestimates volatility in periods of high volatility and overestimates it in periods of low volatility.

Figure 6: Estimation of the HAR and HAR-MSW model for the daily volatility of USA30 Index

3.3.2 Artificial neural networks models

The ANN models require more computation time than the parametric models. Additionally, the computation time differs considerably among the ANN models: the ANN-BE model is the most time-demanding and the conventional ANN requires the least time.

The architectures of the three ANN models considered are displayed in Tables 1 to 3. The ANN-BE has an equal or smaller number of nodes in the input layer than the conventional ANN; to compensate for this loss of information, the ANN-BE has more nodes in the two hidden layers.

Table 1: Structure of the conventional ANN model

          Daily data          Weekly data
          Inputs  n1  n2      Inputs  n1  n2
USA30          8   8   8           6   6   6
DEU30          7   7   7           4   4   4
FRA40          5   5   5           9   9   9
UK100          7   7   7           9   9   9
JPN225         4   4   4           8   8   8

Note: the column Inputs denotes the number of nodes in the input layer; n1 and n2 represent the number of nodes in the first and second hidden layers.

Table 2: Structure of the ANN-BE model

          Daily data          Weekly data
          Inputs  n1  n2      Inputs  n1  n2
USA30          8  13  21           3   3   5
DEU30          5   8  13           4   6  12
FRA40          5   5  10           6   6   8
UK100          6   8  16           4   4   6
JPN225         3   6  10           3   4   5

Table 3: Structure of the HAR-ANN model

          Daily data          Weekly data
          Inputs  n1  n2      Inputs  n1  n2
USA30          3   3   6           2  11  22
DEU30          3  12  20           2   8   8
FRA40          3   8  10           2   5   6
UK100          3  11  14           2  12  20
JPN225         3   7  14           2  12  20

Figure 7 illustrates the selection of lags, i.e. input nodes, for the ANN-BE and the conventional ANN in the prediction of weekly volatility of the USA30 index. Each point represents the MSE of a particular structure, grouped by the number of lags in the input layer. The orange line depicts the lowest MSE for a given number of lags, and the blue line represents the MSE of the conventional ANN structure (same number of nodes in the input and hidden layers). In this particular case, the ANN-BE selects an architecture with three lags (the lowest MSE that we can achieve) and the conventional ANN selects an architecture with six lags.
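The selection logic described for Figure 7 can be sketched in Python as follows. This is only an illustration of choosing a structure by lowest MSE: it does not reproduce the search procedure or the training of the networks, the function names are hypothetical, and the MSE values in the grid are invented.

```python
def select_ann_be(mse_by_structure):
    """Pick the structure (lags, n1, n2) with the lowest MSE over the whole grid."""
    return min(mse_by_structure, key=mse_by_structure.get)


def select_conventional(mse_by_structure):
    """Pick the best structure among those with lags == n1 == n2."""
    symmetric = {s: m for s, m in mse_by_structure.items() if s[0] == s[1] == s[2]}
    return min(symmetric, key=symmetric.get)


def lower_envelope(mse_by_structure):
    """Lowest MSE for each number of lags (the orange line in Figure 7)."""
    best = {}
    for (lags, _n1, _n2), mse in mse_by_structure.items():
        best[lags] = min(mse, best.get(lags, float("inf")))
    return best


# Hypothetical MSE values, invented for illustration only.
grid = {
    (3, 3, 3): 9.0e-10,
    (3, 3, 5): 6.5e-10,
    (6, 6, 6): 6.7e-10,
    (6, 8, 10): 7.0e-10,
}
best_ann_be = select_ann_be(grid)
best_conventional = select_conventional(grid)
envelope = lower_envelope(grid)
```

With this toy grid the unrestricted search picks a three-lag structure, while the search restricted to symmetric structures picks six lags, which mirrors the pattern reported for the USA30 index.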

Figure 7: Selection of the number of lags for the ANN and the ANN-BE for the forecasting of weekly volatility of USA30 Index

The different ANN models fit the data similarly well, as shown in Figure 8. They capture the asymmetric response to negative shocks as well as the high persistence in the autocorrelation. The values predicted by the ANN models tend to be less smooth than those of the HAR model.

Figure 8: Estimation of ANN models for the daily volatility of USA30 Index

3.4 Comparison of the forecasts

We evaluate the forecasting performance of the different models over the first five months of 2017. Figure 9 displays the predictions of all the models together with the realized volatility of the USA30 index. According to these graphs, the model with the worst forecasting performance is the GARCH-X model, which overestimates the values of realized volatility. The most accurate predictions are those of the HAR-MSW model.

Figure 9: One period ahead forecasting for daily volatility of USA30 index

Panels: (a) ANN, (b) ANN-BE, (c) HAR-ANN, (d) GARCH-X

The forecasting performance of all the evaluated models deteriorates for weekly volatility. In general, the predictions of the HAR and GARCH models tend to overestimate the values of realized volatility. The rest of the models have a similar forecasting performance. Figure 10 displays the weekly volatility forecasts for all the models.

Figure 10: One period ahead forecasting for weekly volatility of USA30 index

Panels: (a) ANN, (b) ANN-BE, (c) HAR-ANN, (d) GARCH-X

The forecasting performance of all the models, measured by MSE, is displayed in Figure 11. The prediction accuracy depends on the frequency that is estimated. For daily forecasting, most of the models have a similar performance. For weekly volatility, however, the models split into two groups. On the one hand, the accuracy of the GARCH and HAR models falls considerably relative to their daily forecasts. On the other hand, the HAR-MSW and the different ANN models show only a slight reduction in accuracy relative to their daily forecasts.

Figure 11: MSE for all the models at different frequencies

Comparing the forecasting performance of the different models at high (daily) and low (weekly) frequencies, we cannot find a model that outperforms the others at all frequencies. However, for a given frequency, we can find a model or a group of models that outperforms the rest.

For the forecasting of daily volatility, it is advisable to use the HAR-MSW model. If we measure the prediction accuracy using the MSE, the HAR-MSW is selected as the best model for three of the five analyzed series (USA30, DEU30 and UK100), see Table 4. Additionally, the differences in performance between the HAR-MSW and the other models, for these three series, are significant according to the Diebold-Mariano test. For the other two series, which do not select the HAR-MSW as the best model (FRA40 and JPN225), the HAR-MSW and the best model (HAR-ANN) have a similar performance. Hence, when the HAR-MSW is selected as the best model, its performance is statistically significantly better than that of the other models, and when it is not, its performance is as good as that of the best model. Table 5 displays the Diebold-Mariano test for the model with the best forecasting performance, and Tables 10 and 11 in the Appendix show the complete comparison among all the models.
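For reference, the Diebold-Mariano statistic for two competing one-step-ahead forecasts can be sketched in Python as below. This is a minimal sketch under a stated assumption: at horizon one the loss differential is treated as serially uncorrelated, so its plain sample variance is used without any autocovariance correction. The function name and the loss values are hypothetical and are not taken from this study.

```python
import math


def diebold_mariano(loss_a, loss_b):
    """Diebold-Mariano statistic for equal predictive accuracy.

    loss_a and loss_b are per-period losses (e.g. squared errors) of two
    competing one-step-ahead forecasts over the same test sample.
    Assumption: the loss differential is serially uncorrelated, so its
    sample variance is used directly.
    """
    d = [a - b for a, b in zip(loss_a, loss_b)]
    n = len(d)
    d_bar = sum(d) / n
    var_d = sum((x - d_bar) ** 2 for x in d) / n
    return d_bar / math.sqrt(var_d / n)


# Hypothetical per-period losses of two models, for illustration only.
loss_model_a = [1.0, 4.0, 1.0, 4.0]
loss_model_b = [1.0, 1.0, 1.0, 1.0]
stat = diebold_mariano(loss_model_a, loss_model_b)
```

A positive value indicates that the first model has the larger average loss. Because the function takes per-period losses rather than forecast errors, it can be used with either the MSE or the QLIKE loss.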

Table 4: Forecasting performance of the different models for daily volatility

MSE
Model      USA30          DEU30          FRA40          UK100          JPN225
HAR        1.1 × 10^-10   3.5 × 10^-10   6.1 × 10^-10   1.2 × 10^-10   1.6 × 10^-09
HAR-MSW    5.5 × 10^-11   2.5 × 10^-10   5.0 × 10^-10   5.8 × 10^-11   1.6 × 10^-09
ANN        7.5 × 10^-11   3.8 × 10^-10   4.8 × 10^-10   9.4 × 10^-11   1.5 × 10^-09
ANN-BE     8.2 × 10^-11   3.4 × 10^-10   5.1 × 10^-10   7.3 × 10^-11   1.7 × 10^-09
HAR-ANN    8.5 × 10^-11   3.1 × 10^-10   4.7 × 10^-10   8.8 × 10^-11   1.5 × 10^-09
GARCH      3.5 × 10^-10   1.1 × 10^-09   1.3 × 10^-09   2.0 × 10^-10   6.2 × 10^-09

QLIKE
Model      USA30   DEU30   FRA40   UK100   JPN225
HAR        0.20    0.19    0.14    0.10    0.25
HAR-MSW    0.13    0.17    0.12    0.06    0.27
ANN        0.14    0.19    0.13    0.08    0.24
ANN-BE     0.16    0.18    0.12    0.07    0.25
HAR-ANN    0.15    0.18    0.11    0.08    0.23
GARCH      0.32    0.34    0.21    0.14    0.46

Note: For each series, the model with the best performance according to MSE and QLIKE is enclosed in a square. The models with the second best performance are displayed in bold.

Another measure of forecasting performance is given by the QLIKE loss function. In this case, the HAR-MSW model is again selected as the best model for three of the five analyzed series. However, according to the Diebold-Mariano test, the HAR-MSW and the family of ANN models have a similar performance for the five analyzed series.
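The two accuracy measures can be sketched in Python as follows. The QLIKE form used here, ratio minus log-ratio minus one, is a common normalization and is an assumption of this sketch; it may differ by a constant or a scale from the definition used elsewhere in this work. The function names and input values are hypothetical.

```python
import math


def mse(realized, forecast):
    """Mean squared error between realized and forecast volatility."""
    return sum((r - f) ** 2 for r, f in zip(realized, forecast)) / len(realized)


def qlike(realized, forecast):
    """Average QLIKE loss, assumed form: r/f - log(r/f) - 1.

    The loss is zero when forecast equals realized, and it penalizes
    under-prediction more heavily than over-prediction of the same ratio.
    """
    total = 0.0
    for r, f in zip(realized, forecast):
        ratio = r / f
        total += ratio - math.log(ratio) - 1.0
    return total / len(realized)


# Hypothetical values, for illustration only.
under = qlike([1.0], [0.5])  # forecast is half of the realized value
over = qlike([1.0], [2.0])   # forecast is twice the realized value
```

Unlike the MSE, this loss depends only on the ratio of realized to forecast volatility, so it is not dominated by the scale of the series.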


Table 5: Diebold-Mariano test for the models with the best performance - daily volatility

MSE
Series   Best model   Test statistics against the other models
USA30    HAR-MSW      HAR-ANN (2.58**), ANN-BE (2.60**), ANN (3.02**), HAR (3.92***), GARCH (7.95***)
DEU30    HAR-MSW      HAR-ANN (2.12*), ANN-BE (2.25*), HAR (2.50*), ANN (2.75**), GARCH (8.16***)
FRA40    HAR-ANN      ANN (0.20), HAR-MSW (0.52), ANN-BE (1.92), GARCH (2.63**), HAR (4.39***)
UK100    HAR-MSW      ANN-BE (1.80), HAR-ANN (3.16**), ANN (3.34***), HAR (4.69***), GARCH (6.57***)
JPN225   HAR-ANN      ANN (0.28), HAR-MSW (0.51), HAR (0.92), ANN-BE (2.36*), GARCH (6.46***)

QLIKE
Series   Best model   Test statistics against the other models
USA30    HAR-MSW      ANN (0.30), HAR-ANN (0.86), ANN-BE (0.99), HAR (2.34*), GARCH (5.23***)
DEU30    HAR-MSW      HAR-ANN (0.22), ANN-BE (0.26), HAR (0.48), ANN (0.54), GARCH (4.07***)
FRA40    HAR-ANN      HAR-MSW (0.97), ANN (1.18), ANN-BE (1.76), HAR (3.28**), GARCH (5.88***)
UK100    HAR-MSW      ANN-BE (0.72), HAR-ANN (1.70), ANN (1.91), HAR (3.12**), GARCH (4.88***)
JPN225   HAR-ANN      ANN (0.74), HAR (1.36), HAR-MSW (1.49), ANN-BE (2.17*), GARCH (7.15***)

Note: For each series, the model with the best performance is displayed, followed by the test statistics of the equal accuracy null hypothesis against each of the other models. The significance levels are: * p ≤ 0.05, ** p ≤ 0.01, *** p ≤ 0.001.
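The significance markers can be related to the test statistics with the Python sketch below. It assumes that the statistic is compared with a standard normal distribution in a two-sided test, which is the usual asymptotic approach but is an assumption of this sketch; the function names are hypothetical.

```python
import math


def two_sided_p_value(stat):
    """Two-sided p-value of a statistic under an assumed standard normal law."""
    return math.erfc(abs(stat) / math.sqrt(2.0))


def significance_stars(stat):
    """Map a test statistic to the markers used in the tables."""
    p = two_sided_p_value(stat)
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return ""


# Statistics taken from the MSE panel of Table 5.
markers = [significance_stars(s) for s in (0.20, 2.12, 2.58, 3.34)]
```

Because the tabulated statistics are rounded to two decimals, a value very close to a critical threshold may be classified differently from the table.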


Unlike the forecasting of daily volatility, for weekly data we do not have a single model that outperforms the others. Instead, there is a group of models with similar forecasting performance: the family of ANN models (ANN, ANN-BE and HAR-ANN) and the HAR-MSW model. Table 6 shows the selection of the best model for the two measures of accuracy. For the five analyzed series these models have a similar performance according to the Diebold-Mariano test, see Tables 12 and 13 in the Appendix. It is worth mentioning that these results hold for both measures of accuracy, the MSE and the QLIKE.

Table 6: Forecasting performance of the different models for weekly volatility

MSE
Model      USA30          DEU30          FRA40          UK100          JPN225
HAR        4.3 × 10^-09   7.8 × 10^-09   1.1 × 10^-08   4.2 × 10^-09   2.3 × 10^-08
HAR-MSW    8.2 × 10^-10   1.3 × 10^-09   2.5 × 10^-09   4.3 × 10^-10   1.5 × 10^-08
ANN        6.7 × 10^-10   1.3 × 10^-09   3.4 × 10^-09   6.7 × 10^-10   1.7 × 10^-08
ANN-BE     8.3 × 10^-10   1.2 × 10^-09   4.1 × 10^-09   6.6 × 10^-10   1.7 × 10^-08
HAR-ANN    1.2 × 10^-09   1.3 × 10^-09   2.7 × 10^-09   5.9 × 10^-10   1.4 × 10^-08
GARCH      4.6 × 10^-09   1.2 × 10^-08   2.4 × 10^-08   5.5 × 10^-09   2.3 × 10^-07

QLIKE
Model      USA30   DEU30   FRA40   UK100   JPN225
HAR        0.24    0.14    0.12    0.13    0.34
HAR-MSW    0.10    0.04    0.04    0.03    0.33
ANN        0.08    0.04    0.05    0.03    0.35
ANN-BE     0.08    0.04    0.07    0.03    0.38
HAR-ANN    0.10    0.04    0.04    0.03    0.31
GARCH      0.22    0.17    0.19    0.15    0.68

It should be noted that, for all the frequencies evaluated, the model with the worst performance is the GARCH-X model. With respect to the HAR model, we observe that for daily volatility its performance is comparable to that of the family of ANN models, but its accuracy decreases for the prediction of weekly volatility.

Table 7: Diebold-Mariano test for the models with the best performance - weekly volatility

MSE
Series   Best model   Test statistics against the other models
USA30    ANN          ANN-BE (1.26), HAR-ANN (1.39), HAR-MSW (1.50), GARCH (3.75***), HAR (5.20***)
DEU30    ANN-BE       HAR-MSW (0.21), HAR-ANN (0.76), ANN (1.06), GARCH (4.76***), HAR (5.41***)
FRA40    HAR-MSW      HAR-ANN (0.30), ANN (0.88), ANN-BE (1.67), HAR (4.01***), GARCH (4.01***)
UK100    HAR-MSW      ANN (1.26), HAR-ANN (1.28), ANN-BE (1.35), HAR (6.40***), GARCH (6.81***)
JPN225   HAR-ANN      HAR-MSW (0.26), ANN (0.74), ANN-BE (1.44), HAR (1.57), GARCH (6.48***)

QLIKE
Series   Best model   Test statistics against the other models
USA30    ANN          ANN-BE (0.68), HAR-ANN (1.14), HAR-MSW (1.41), GARCH (3.58***), HAR (3.93***)
DEU30    ANN-BE       HAR-MSW (0.65), HAR-ANN (0.87), ANN (1.07), HAR (3.70***), GARCH (4.37***)
FRA40    HAR-ANN      HAR-MSW (0.07), ANN (1.23), ANN-BE (2.49*), HAR (4.97***), GARCH (5.84***)
UK100    HAR-MSW      ANN (0.64), ANN-BE (0.83), HAR-ANN (0.95), HAR (4.63***), GARCH (5.28***)
JPN225   HAR-ANN      HAR (0.67), ANN (0.76), HAR-MSW (1.15), ANN-BE (1.32), GARCH (3.92***)

Note: For each series, the model with the best performance is displayed, followed by the test statistics of the equal accuracy null hypothesis against each of the other models. The significance levels are: * p ≤ 0.05, ** p ≤ 0.01, *** p ≤ 0.001.

3.5 Robustness

For the robustness exercises, we increase the test data from 8 percent to 15 percent of the total data (from 1 August 2016 to 31 May 2017). For the forecasting of daily volatility, we cannot find enough evidence against the assertion that the HAR-MSW model is the preferable model. For both measures of prediction accuracy, the performance of the HAR-MSW model is similar to that of the ANN models, and for one series (DEU30) it is selected as the best model. An exception is the JPN225 index, for which the ANN-BE model is selected as the best model and its forecasting performance is statistically different from that of the HAR-MSW model. Tables 14 and 15 in the Appendix display the Diebold-Mariano test for all the models in the robustness exercise.
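The change in the size of the test sample can be illustrated with a chronological split, sketched in Python below. The function name is hypothetical, and the rounding rule for the number of test observations is an assumption of this sketch; the study defines the test sample by calendar dates.

```python
def chronological_split(series, test_fraction):
    """Split a time-ordered series into training and test samples.

    The last test_fraction of the observations is held out for testing,
    so that no future observation is used for estimation.
    """
    n_test = round(len(series) * test_fraction)
    split = len(series) - n_test
    return series[:split], series[split:]


# Hypothetical series of 100 observations, for illustration only.
observations = list(range(100))
train_8, test_8 = chronological_split(observations, 0.08)
train_15, test_15 = chronological_split(observations, 0.15)
```

Enlarging the test fraction lengthens the evaluation period backwards in time and shortens the estimation sample by the same number of observations.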

Table 8: Forecasting performance of the different models for daily volatility - robustness exercise

MSE
Model      USA30          DEU30          FRA40          UK100          JPN225
HAR        5.5 × 10^-10   1.0 × 10^-09   1.1 × 10^-09   5.4 × 10^-10   9.0 × 10^-09
HAR-MSW    5.5 × 10^-10   9.6 × 10^-10   1.1 × 10^-09   5.4 × 10^-10   9.7 × 10^-09
ANN        5.0 × 10^-10   1.0 × 10^-09   1.1 × 10^-09   5.2 × 10^-10   8.0 × 10^-09
ANN-BE     5.2 × 10^-10   1.0 × 10^-09   1.0 × 10^-09   5.1 × 10^-10   8.3 × 10^-09
HAR-ANN    5.1 × 10^-10   1.0 × 10^-09   1.1 × 10^-09   5.2 × 10^-10   9.0 × 10^-09
GARCH      1.5 × 10^-09   3.2 × 10^-09   2.5 × 10^-09   1.0 × 10^-09   3.3 × 10^-08

QLIKE
Model      USA30   DEU30   FRA40   UK100   JPN225
HAR        0.27    0.20    0.16    0.12    0.30
HAR-MSW    0.25    0.18    0.14    0.11    0.47
ANN        0.21    0.18    0.15    0.11    0.31
ANN-BE     0.20    0.18    0.13    0.11    0.29
HAR-ANN    0.20    0.18    0.14    0.11    0.30
GARCH      0.30    0.37    0.22    0.19    0.51

With regard to the prediction of weekly data, the robustness exercises reassert our previous finding. For all the analyzed series, independently of the measure of accuracy, the HAR-MSW and the family of ANN models are the best models, and their forecasting performances do not differ significantly from one another, as shown in Tables 16 and 17 in the Appendix.

Table 9: Forecasting performance of the different models for weekly volatility - robustness exercise

MSE
Model      USA30          DEU30          FRA40          UK100          JPN225
HAR        7.7 × 10^-09   1.7 × 10^-08   1.9 × 10^-08   7.9 × 10^-09   1.0 × 10^-07
HAR-MSW    6.3 × 10^-09   1.2 × 10^-08   1.3 × 10^-08   7.0 × 10^-09   1.0 × 10^-07
ANN        4.9 × 10^-09   1.1 × 10^-08   1.2 × 10^-08   6.5 × 10^-09   1.0 × 10^-07
ANN-BE     4.7 × 10^-09   9.7 × 10^-09   1.4 × 10^-08   6.1 × 10^-09   9.5 × 10^-08
HAR-ANN    6.9 × 10^-09   1.1 × 10^-08   1.3 × 10^-08   6.8 × 10^-09   1.0 × 10^-07
GARCH      1.1 × 10^-08   4.1 × 10^-08   1.2 × 10^-07   1.5 × 10^-08   3.8 × 10^-07

QLIKE
Model      USA30   DEU30   FRA40   UK100   JPN225
HAR        0.27    0.16    0.14    0.12    0.31
HAR-MSW    0.20    0.11    0.10    0.08    0.32
ANN        0.16    0.10    0.09    0.07    0.40
ANN-BE     0.15    0.09    0.10    0.07    0.34
HAR-ANN    0.18    0.10    0.09    0.09    0.30
GARCH      0.28    0.19    0.36    0.14    0.60

3.6 Discussion

Weekly data can be regarded as a time series with short memory, i.e. one without high persistence in the autocorrelation (see Figure 15 in the Appendix), and daily data as a time series with long memory (see Figure 14 in the Appendix). Under this view, our results are comparable with previous works that evaluate the performance of artificial neural networks and linear parametric models for the prediction of observable variables. Tang and Fishwick (1993) assert that for time series with long memory both approaches have a similar performance. Nevertheless, for series with short memory, the artificial neural network technique outperforms the linear parametric models.
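The persistence referred to here is read from the sample autocorrelation function, which can be sketched in Python as below. The function name and the two toy series are hypothetical and serve only to contrast a persistent series with one whose autocorrelation changes sign at every lag.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelations of a series for lags 1 to max_lag."""
    n = len(series)
    mu = sum(series) / n
    denom = sum((x - mu) ** 2 for x in series)
    return [
        sum((series[t] - mu) * (series[t - k] - mu) for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


# Toy series, invented for illustration only.
persistent = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
acf_persistent = autocorrelation(persistent, 2)
acf_alternating = autocorrelation(alternating, 2)
```

A series with long memory shows autocorrelations that remain positive and decay slowly over many lags, whereas a short memory series shows autocorrelations that fall towards zero after a few lags.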


4 Conclusions

In summary, the preference for a specific model depends on the frequency of volatility that will be estimated. For daily volatility, the use of the HAR-MSW model is advisable, not only because it has the most accurate forecasting performance, but also because its running time is shorter than that of the family of artificial neural network models. To predict weekly volatility, it is appropriate to use ANN models or a model with switching regimes, like the HAR-MSW model. We confirm the finding of previous works that recommend the use of artificial neural networks for short memory series.

The selection criterion of the ANN-BE tends to reduce the number of nodes in the input layer and to compensate for this loss of information by increasing the number of nodes in the two hidden layers. Although the ANN-BE requires a longer calculation time, its performance is equivalent to that of the rest of the ANN models considered. Taking into account the performance of the model and its running time, we can assert that it is better to combine different approaches, as the HAR-ANN and the HAR-MSW models do, than to refine a single technique, as the ANN-BE does.

Finally, it is worth mentioning that the analysis presented in this work is for univariate series. A further extension corresponds to the multivariate case. Additionally, all the artificial neural networks analyzed in this research use the multilayer perceptron structure, which is the most widely used for prediction, but there are other architectures that can also be used, such as recurrent Hopfield networks and self-organizing Kohonen networks.

References

Andersen, T., Bollerslev, T., Christoffersen, P., and Diebold, F. (2006). Volatility and correlation forecasting. Handbook of Economic Forecasting, pages 778–878.

Andersen, T., Bollerslev, T., Diebold, F., and Labys, P. (2001). The distribution of exchange rate volatility. Journal of the American Statistical Association, 96:42–55.

Corsi, F. (2009). A simple long memory model of realized volatility. Journal of Financial Econometrics, 7:174–196.

Diebold, F. X. (2015). Comparing predictive accuracy, twenty years later: A personal perspective on the use and abuse of Diebold-Mariano tests. Journal of Business and Economic Statistics, 33(1).

Diebold, F. X. and Mariano, R. (1995). Comparing predictive accuracy. Journal of Business and Economic Statistics, 13:253–263.

Engle, R. F. (1982). Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50:987–1007.

Engle, R. F. (2002). New frontiers for ARCH models. Journal of Applied Econometrics, 17:425–446.

Engle, R. F. and Gallo, G. M. (2006). A multiple indicators model for volatility using intra-daily data. Journal of Econometrics, 131:3–27.

Feng, M., Li, L., Zhichao, L., and Yu, W. (2015). Forecasting realized range volatility: a regime-switching approach. Applied Economics Letters, 22(17):1361–1365.

Frühwirth-Schnatter, S. (2006). Finite Mixture and Markov Switching Models. Springer Series in Statistics.

Hansen, P. R. and Lunde, A. (2011). Forecasting volatility using high frequency data. The Oxford Handbook of Economic Forecasting.

Hansen, P. R., Huang, Z., and Shek, H. H. (2009). Realized GARCH: A complete model of returns and realized measures of volatility. Working paper, Stanford University.

Hansen, P. R., Huang, Z., and Shek, H. H. (2012). Realized GARCH: a joint model for returns and realized measures of volatility. Journal of Applied Econometrics, 27(6):877–906.

Hansen, P. R. and Lunde, A. (2005). Consistent ranking of volatility models. Journal of Econometrics, 131:97–121.

Haykin, S. (2009). Neural Networks and Learning Machines. Pearson Education, third edition.

Hinton, G. E. (2006). To recognize shapes, first learn to generate images. Technical Report UTML TR, University of Toronto.

Hinton, G. E. and Salakhutdinov, R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507.

Hochreiter, S. and Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8):1735–1780.

Ivakhnenko, A. G. (1971). Polynomial theory of complex systems. IEEE Transactions on Systems, Man, and Cybernetics, 1(4):364–378.

Krizhevsky, A., Sutskever, I., and Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. NIPS.

McCulloch, W. and Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 7:115–133.

Patton, A. (2010). Volatility forecast comparison using imperfect volatility proxies. Journal of Econometrics.

Patton, A. and Sheppard, K. (2009). Evaluating volatility and correlation forecasts. Handbook of Financial Time Series, pages 801–838.

Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). Learning internal representations by error propagation. Parallel Distributed Processing, 1(8):318–362.

Tang, Z. and Fishwick, P. (1993). Feed-forward neural nets as models for time series forecasting. ORSA Journal on Computing, 5(4):374–385.

Teräsvirta, T. (2008). An Introduction to Univariate GARCH Models, Handbook of Financial Time Series. Springer, New York.

Vortelinos, D. I. (2017). Forecasting realized volatility: HAR against principal components combining, neural networks and GARCH. Research in International Business and Finance, 39:824–839.

Werbos, P. J. (1974). Beyond regression: New tools for prediction and analysis in the behavioral sciences. PhD thesis, Harvard University.

Werbos, P. J. (1982). Applications of advances in nonlinear sensitivity analysis. In Proceedings of the 10th IFIP Conference, 31.8:762–770.

Zhang, G., Patuwo, B. E., and Hu, M. Y. (1998). Forecasting with artificial networks: The state of the art. International Journal of Forecasting, 14:35–62.

Zhang, L., Mykland, P., and Ait-Sahalia, Y. (2005). A tale of two time scales: determining integrated volatility with noisy high-frequency data. Journal of the American Statistical Association, 100:1394–1411.


Appendix

Given a number of lags, the performance of the ANN increases when the number of nodes in the first hidden layer is equal to the number of lags, independently of the number of nodes in the second hidden layer.

Figure 12: Mean squared error for ANN structures with 5, 10 and 20 lags for the estimation of daily realized volatility of USA30 Index

Note: Each graph shows the MSE of the ANN for a given number of lags, each line represents a different number of nodes in the first hidden layer; the x-axis shows the number of nodes considered in the second hidden layer.


Figure 13 is similar to Figure 12. However, each line now represents a number of nodes in the second hidden layer and the x-axis shows the number of nodes in the first hidden layer, the reverse of the previous graphs. The conclusion is the same: the performance of the ANN increases when there is a symmetry between the number of nodes in the input layer and the number of nodes in the second hidden layer.

Figure 13: Mean squared error for ANN structures with 5, 10 and 20 lags for the estimation of daily realized volatility of USA30 Index

Note: Each graph shows the MSE of the ANN for a given number of lags; each line represents a different number of nodes in the second hidden layer, and the x-axis shows the number of nodes considered in the first hidden layer.


Figure 14: Autocorrelation of the daily realized volatility for all the series

Panels: (a) USA30, (b) DEU30, (c) FRA40, (d) UK100

Figure 15: Autocorrelation of the weekly realized volatility for all the series

Panels: (a) USA30, (b) DEU30, (c) FRA40, (d) UK100

Table 10: Diebold-Mariano test for daily volatility, MSE used to measure the accuracy - 8% of the data used for testing

Series  Model     HAR        HAR-MSW    ANN        ANN-BE     HAR-ANN    GARCH
USA     HAR       0.0
        HAR-MSW   4.27***    0.0
        ANN       4.72***    2.25*      0.0
        ANN-BE    2.14*      2.38*      0.75       0.0
        HAR-ANN   3.34***    2.43*      1.55       0.19       0.0
        GARCH     5.37***    5.87***    6.13***    5.55***    6.63***    0.0
DEU     HAR       0.0
        HAR-MSW   2.50*      0.0
        ANN       1.07       2.75**     0.0
        ANN-BE    0.97       2.25*      1.64       0.0
        HAR-ANN   1.94       2.12*      2.40*      1.42       0.0
        GARCH     11.23***   8.16***    10.75***   11.20***   9.73***    0.0
FRA     HAR       0.0
        HAR-MSW   2.00*      0.0
        ANN       2.59**     0.66       0.0
        ANN-BE    4.07***    0.14       0.62       0.0
        HAR-ANN   4.39***    0.52       0.20       1.92       0.0
        GARCH     2.35*      2.50*      2.50*      2.62**     2.63**     0.0
GBR     HAR       0.0
        HAR-MSW   4.69***    0.0
        ANN       4.60***    3.34***    0.0
        ANN-BE    6.15***    1.80       3.89***    0.0
        HAR-ANN   5.15***    3.16**     1.74       2.92**     0.0
        GARCH     6.75***    6.57***    8.52***    7.85***    8.29***    0.0
JPN     HAR       0.0
        HAR-MSW   0.01       0.0
        ANN       0.54       0.40       0.0
        ANN-BE    1.03       0.59       1.83       0.0
        HAR-ANN   0.92       0.51       0.28       2.36*      0.0
        GARCH     6.35***    5.70***    6.20***    6.58***    6.46***    0.0

Table 11: Diebold-Mariano test for daily volatility, QLIKE used to measure the accuracy - 8% of the data used for testing

Series  Model     HAR        HAR-MSW    ANN        ANN-BE     HAR-ANN    GARCH
USA     HAR       0.0
        HAR-MSW   2.34*      0.0
        ANN       5.93***    0.30       0.0
        ANN-BE    2.19*      0.99       1.13       0.0
        HAR-ANN   6.43***    0.86       2.59**     0.16       0.0
        GARCH     10.08***   5.23***    10.71***   6.48***    13.43***   0.0
DEU     HAR       0.0
        HAR-MSW   0.48       0.0
        ANN       0.10       0.54       0.0
        ANN-BE    1.22       0.26       1.20       0.0
        HAR-ANN   0.52       0.22       0.66       0.18       0.0
        GARCH     11.53***   4.07***    9.33***    10.59***   5.75***    0.0
FRA     HAR       0.0
        HAR-MSW   0.94       0.0
        ANN       0.93       0.12       0.0
        ANN-BE    2.80**     0.39       0.53       0.0
        HAR-ANN   3.28**     0.97       1.18       1.76       0.0
        GARCH     5.16***    3.09**     3.30***    6.08***    5.88***    0.0
GBR     HAR       0.0
        HAR-MSW   3.12**     0.0
        ANN       5.62***    1.91       0.0
        ANN-BE    5.90***    0.72       3.95***    0.0
        HAR-ANN   5.41***    1.70       1.67       2.85**     0.0
        GARCH     7.90***    4.88***    9.77***    8.48***    9.13***    0.0
JPN     HAR       0.0
        HAR-MSW   0.69       0.0
        ANN       0.53       1.28       0.0
        ANN-BE    0.25       0.62       0.88       0.0
        HAR-ANN   1.36       1.49       0.74       2.17*      0.0
        GARCH     8.07***    3.22**     6.03***    8.18***    7.15***    0.0

Table 12: Diebold-Mariano test for weekly volatility, MSE used to measure the accuracy - 8% of the data used for testing

Series  Model     HAR        HAR-MSW    ANN        ANN-BE     HAR-ANN    GARCH
USA     HAR       0.0
        HAR-MSW   4.69***    0.0
        ANN       5.60***    1.30       0.0
        ANN-BE    5.90***    0.07       1.44       0.0
        HAR-ANN   5.98***    1.12       1.81       1.82       0.0
        GARCH     0.44       3.73***    4.06***    4.32***    4.77***    0.0
DEU     HAR       0.0
        HAR-MSW   5.04***    0.0
        ANN       5.22***    0.36       0.0
        ANN-BE    5.41***    0.21       1.06       0.0
        HAR-ANN   5.59***    0.36       0.03       0.76       0.0
        GARCH     2.70**     4.54***    4.68***    4.76***    4.80***    0.0
FRA     HAR       0.0
        HAR-MSW   4.01***    0.0
        ANN       5.53***    0.88       0.0
        ANN-BE    4.22***    1.67       0.89       0.0
        HAR-ANN   5.14***    0.30       1.24       2.27*      0.0
        GARCH     3.66***    4.01***    4.66***    4.16***    4.35***    0.0
GBR     HAR       0.0
        HAR-MSW   6.40***    0.0
        ANN       7.52***    1.26       0.0
        ANN-BE    7.79***    1.35       0.21       0.0
        HAR-ANN   7.00***    1.28       0.98       1.01       0.0
        GARCH     4.18***    6.81***    7.99***    8.06***    7.39***    0.0
JPN     HAR       0.0
        HAR-MSW   1.30       0.0
        ANN       1.15       0.58       0.0
        ANN-BE    0.98       1.27       0.09       0.0
        HAR-ANN   1.57       0.26       0.74       1.44       0.0
        GARCH     7.22***    6.28***    6.82***    6.29***    6.48***    0.0

Table 13: Diebold-Mariano test for weekly volatility, QLIKE used to measure the accuracy - 8% of the data used for testing

Series  Model     HAR        HAR-MSW    ANN        ANN-BE     HAR-ANN    GARCH
USA     HAR       0.0
        HAR-MSW   2.32*      0.0
        ANN       3.93***    1.41       0.0
        ANN-BE    4.28***    0.80       0.68       0.0
        HAR-ANN   4.58***    0.19       1.14       1.37       0.0
        GARCH     0.88       2.09*      3.58***    4.13***    4.97***    0.0
DEU     HAR       0.0
        HAR-MSW   3.34***    0.0
        ANN       3.40***    0.21       0.0
        ANN-BE    3.70***    0.65       1.07       0.0
        HAR-ANN   3.74***    0.13       0.06       0.87       0.0
        GARCH     2.27*      3.95***    4.07***    4.37***    4.46***    0.0
FRA     HAR       0.0
        HAR-MSW   3.50***    0.0
        ANN       4.66***    0.60       0.0
        ANN-BE    2.94**     1.77       1.42       0.0
        HAR-ANN   4.97***    0.07       1.23       2.49*      0.0
        GARCH     5.85***    4.65***    6.06***    4.54***    5.84***    0.0
GBR     HAR       0.0
        HAR-MSW   4.63***    0.0
        ANN       5.38***    0.64       0.0
        ANN-BE    5.87***    0.83       0.56       0.0
        HAR-ANN   5.01***    0.95       0.07       0.29       0.0
        GARCH     3.96***    5.28***    6.34***    6.82***    5.80***    0.0
JPN     HAR       0.0
        HAR-MSW   0.27       0.0
        ANN       0.07       0.49       0.0
        ANN-BE    0.37       1.28       0.73       0.0
        HAR-ANN   0.67       1.15       0.76       1.32       0.0
        GARCH     7.77***    3.32***    3.06**     2.42*      3.92***    0.0

Robustness Tables

Table 14: Diebold-Mariano test for daily volatility, MSE used to measure the accuracy - 15% of the data used for testing

Series  Model     HAR        HAR-MSW    ANN        ANN-BE     HAR-ANN    GARCH
USA     HAR       0.0
        HAR-MSW   0.12       0.0
        ANN       3.86***    1.50       0.0
        ANN-BE    1.55       1.26       1.10       0.0
        HAR-ANN   2.70**     1.29       0.87       0.74       0.0
        GARCH     1.66       1.71       1.74       1.70       1.72       0.0
DEU     HAR       0.0
        HAR-MSW   1.61       0.0
        ANN       1.11       1.05       0.0
        ANN-BE    1.09       0.85       0.36       0.0
        HAR-ANN   0.88       1.48       0.20       0.50       0.0
        GARCH     6.59***    6.23***    6.66***    6.64***    6.57***    0.0
FRA     HAR       0.0
        HAR-MSW   1.37       0.0
        ANN       1.52       0.17       0.0
        ANN-BE    2.33*      0.52       0.96       0.0
        HAR-ANN   1.84       0.40       0.39       1.31       0.0
        GARCH     5.49***    5.28***    5.56***    5.65***    5.81***    0.0
GBR     HAR       0.0
        HAR-MSW   0.05       0.0
        ANN       1.18       0.66       0.0
        ANN-BE    2.03*      1.26       0.93       0.0
        HAR-ANN   1.67       1.04       0.25       0.89       0.0
        GARCH     4.67***    4.46***    4.90***    4.83***    4.84***    0.0
JPN     HAR       0.0
        HAR-MSW   2.19*      0.0
        ANN       2.23*      3.00**     0.0
        ANN-BE    2.19*      2.68**     1.21       0.0
        HAR-ANN   0.22       1.64       2.18*      2.14*      0.0
        GARCH     2.02*      1.96       2.06*      2.06*      2.03*      0.0

Table 15: Diebold-Mariano test for daily volatility, QLIKE used to measure the accuracy - 15% of the data used for testing

Series  Model     HAR        HAR-MSW    ANN        ANN-BE     HAR-ANN    GARCH
USA     HAR       0.0
        HAR-MSW   0.39       0.0
        ANN       3.07**     1.34       0.0
        ANN-BE    3.10**     2.81**     0.56       0.0
        HAR-ANN   5.44***    1.89       0.55       0.21       0.0
        GARCH     0.97       0.93       2.68**     2.69**     3.06**     0.0
DEU     HAR       0.0
        HAR-MSW   0.94       0.0
        ANN       2.76**     0.15       0.0
        ANN-BE    1.91       0.21       0.11       0.0
        HAR-ANN   1.62       0.30       0.83       0.72       0.0
        GARCH     11.77***   6.01***    10.33***   8.80***    8.79***    0.0
FRA     HAR       0.0
        HAR-MSW   1.14       0.0
        ANN       0.38       0.57       0.0
        ANN-BE    2.98**     1.27       1.49       0.0
        HAR-ANN   4.20***    0.49       0.96       0.83       0.0
        GARCH     6.24***    3.86***    3.28**     5.91***    8.08***    0.0
GBR     HAR       0.0
        HAR-MSW   0.78       0.0
        ANN       3.70***    0.77       0.0
        ANN-BE    3.61***    1.02       0.36       0.0
        HAR-ANN   4.14***    0.99       0.23       0.14       0.0
        GARCH     7.83***    4.18***    8.56***    7.53***    7.62***    0.0
JPN     HAR       0.0
        HAR-MSW   2.84**     0.0
        ANN       0.17       3.19**     0.0
        ANN-BE    1.00       3.36***    0.67       0.0
        HAR-ANN   0.72       2.88**     0.28       0.77       0.0
        GARCH     5.82***    0.43       2.89**     5.22***    6.18***    0.0

Table 16: Diebold-Mariano test for weekly volatility, MSE used to measure the accuracy - 15% of the data used for testing

Series  Model     HAR        HAR-MSW    ANN        ANN-BE     HAR-ANN    GARCH
USA     HAR       0.0
        HAR-MSW   0.96       0.0
        ANN       2.77**     1.25       0.0
        ANN-BE    3.10**     1.59       0.58       0.0
        HAR-ANN   0.47       1.14       1.26       1.50       0.0
        GARCH     1.63       2.15*      2.41*      2.54*      2.03*      0.0
DEU     HAR       0.0
        HAR-MSW   2.11*      0.0
        ANN       2.66**     1.43       0.0
        ANN-BE    3.45***    1.81       1.26       0.0
        HAR-ANN   3.22**     1.07       0.08       1.30       0.0
        GARCH     3.32***    3.50***    3.52***    3.67***    3.63***    0.0
FRA     HAR       0.0
        HAR-MSW   2.02*      0.0
        ANN       2.74**     1.62       0.0
        ANN-BE    1.35       0.27       0.92       0.0
        HAR-ANN   2.78**     0.79       1.03       0.50       0.0
        GARCH     6.19***    6.11***    6.06***    6.09***    6.25***    0.0
GBR     HAR       0.0
        HAR-MSW   0.92       0.0
        ANN       1.49       0.48       0.0
        ANN-BE    2.43*      1.19       0.73       0.0
        HAR-ANN   1.62       0.14       0.31       0.74       0.0
        GARCH     2.44*      2.54*      2.56*      2.71**     2.98**     0.0
JPN     HAR       0.0
        HAR-MSW   0.12       0.0
        ANN       0.09       0.05       0.0
        ANN-BE    0.63       1.13       0.42       0.0
        HAR-ANN   0.15       0.01       0.04       0.84       0.0
        GARCH     4.75***    4.53***    3.73***    4.33***    4.69***    0.0

Note: * p ≤ 0.05, ** p ≤ 0.01, *** p ≤ 0.001.

Table 17: Diebold-Mariano test for weekly volatility, QLIKE used to measure the accuracy - 15% of the data used for testing

Series  Model     HAR        HAR-MSW    ANN        ANN-BE     HAR-ANN    GARCH
USA     HAR       0.0
        HAR-MSW   1.21       0.0
        ANN       2.43*      1.86       0.0
        ANN-BE    2.54*      2.11*      0.53       0.0
        HAR-ANN   1.84       0.99       1.20       1.79       0.0
        GARCH     0.73       1.32       2.41*      2.44*      1.91       0.0
DEU     HAR       0.0
        HAR-MSW   2.22*      0.0
        ANN       2.83**     1.18       0.0
        ANN-BE    3.48***    1.53       0.93       0.0
        HAR-ANN   3.19**     0.43       0.38       1.56       0.0
        GARCH     1.51       2.76**     3.18**     3.79***    3.68***    0.0
FRA     HAR       0.0
        HAR-MSW   2.02*      0.0
        ANN       2.76**     1.43       0.0
        ANN-BE    2.20*      0.08       0.70       0.0
        HAR-ANN   2.85**     1.56       0.32       0.55       0.0
        GARCH     9.95***    6.58***    7.04***    7.30***    7.49***    0.0
GBR     HAR       0.0
        HAR-MSW   2.42*      0.0
        ANN       2.39*      0.40       0.0
        ANN-BE    2.98**     0.50       0.02       0.0
        HAR-ANN   2.21*      0.60       0.84       1.03       0.0
        GARCH     1.63       3.01**     2.68**     3.22**     3.21**     0.0
JPN     HAR       0.0
        HAR-MSW   0.23       0.0
        ANN       1.06       1.47       0.0
        ANN-BE    0.57       0.82       1.15       0.0
        HAR-ANN   0.16       0.84       1.78       1.83       0.0
        GARCH     6.39***    3.72***    1.60       2.47*      3.46***    0.0