
Evaluating Multivariate Density Forecasts with Different Marginal Specifications

Jakob Blank

MSc Financial Econometrics, University of Amsterdam
Supervisor: Prof. dr. C.G.H. Diks
Second reader: Dr. N.P.A. van Giersbergen


Jakob Blank: Evaluating Multivariate Density Forecasts with different Marginal Specifications © July 2014

Contents

1 Introduction
2 Methodology
  2.1 Copulas
  2.2 Multivariate density forecast evaluation
    2.2.1 Multivariate density forecast evaluation
    2.2.2 The logarithmic scoring rule
    2.2.3 Weighted likelihood-based scoring rules
    2.2.4 Copula-based density forecast evaluation using weighted likelihood scoring rules
    2.2.5 Tests of equal predictive accuracy
  2.3 Copula-based multivariate dynamic models
  2.4 Some practical remarks
3 Monte Carlo Study
  3.1 Simulation setup
  3.2 Size
    3.2.1 Results
    3.2.2 Higher dimensions and other robustness checks
  3.3 Power
    3.3.1 Results
    3.3.2 Time-varying weight functions
    3.3.3 Higher dimensions and other robustness checks
4 Empirical Application
  4.1 Setup
  4.2 Goodness-of-fit of the different marginal distributions
  4.3 Regions evaluated
  4.4 Results
  4.5 Robustness and Extensions
5 Conclusion
Bibliography
A Appendix

List of Figures

Figure 1: Size discrepancy plots
Figure 2: Size-size plots for asymptotic-based tests
Figure 3: Power plots for various levels of the number of degrees of freedom
Figure 4: Size-power plots, true Student-t vs. Normal
Figure 5: Power plot for different regions, true Student-t vs. Normal
Figure 6: Power plot for various levels of skewness in the DGP
Figure 7: Power plot for different regions, true Skew-t
Figure 8: Power plot for different regions, time-varying weight function
Figure 9: TED spread
Figure A.1: QQ-plots for S&P100
Figure A.2: QQ-plots for S&P400

List of Tables

Table 1: Main properties of the different distributions
Table 2: Parameter estimates and goodness-of-fit test results
Table 3: Full support test results
Table 4: Post-crisis sub-period, censored score
Table A.1: Pre-crisis sub-period, censored score
Table A.2: Post-crisis sub-period, conditional score
Table A.3: Pre-crisis sub-period, conditional score
Table A.4: Gumbel copula family test results
Table A.5: Crisis sub-period, censored score

1 Introduction

In the aftermath of the 2007-2008 financial crisis, academics and industry experts alike came to recognise the importance of non-linearity in the dependence between financial asset returns. Many models used to capture dependence between financial assets were revealed during the crisis to be inadequate. These models were often based on an assumption of multivariate Gaussianity and consequently neglected the possibility that crashes may be correlated across assets, leading to devastating losses.

Copulas have recently emerged as the most efficient and popular modelling technique in finance to remedy this situation (see Patton (2009) and Genest et al. (2009)). One of the main advantages of copula-based models is that they allow the marginal distributions and the dependence structure of the financial assets to be modelled separately. However, many different copula and marginal specifications are available, and this multitude of specifications poses the challenge of selecting the best specification for forecasting risk under different types of market dynamics. To overcome these issues, Diks et al. (2013) propose tests for comparing the predictive accuracy of different copula-based multivariate density forecasts in specific regions of the support. Their approach combines the logarithmic score decomposition methods for copula-based models of Diks et al. (2010) with the univariate testing framework using weighted likelihood scoring rules developed by Diks et al. (2011). That is, to evaluate density forecasts they use multivariate extensions of either the conditional likelihood scoring rule, given that the actual observation lies in the region of interest, or the censored likelihood scoring rule, with censoring of the observations outside the region of interest.

Diks et al. (2013) focus in their research on finding the best copula specification for different forecasting purposes. They achieve this by comparing the predictive accuracy of two competing copula-based multivariate density forecasts, which differ only in their copula and have identical predictive marginals. They verify the reliability of their tests by a Monte Carlo study of their power and size. In an empirical application to risk management of a foreign exchange portfolio they demonstrate that the best copula specification varies with the targeted region of the support. In this thesis we extend their analysis by allowing their tests to focus on the predictive accuracy of different marginals. We achieve this by assuming that two competing copula-based multivariate density forecasts differ only in their predictive marginals and use the same copula.


Fermanian and Scaillet (2004) find that misspecification of marginal distributions can lead to severely biased estimates of the copula parameters. Therefore, in practice researchers often use a non-parametric distribution for the marginals. This approach has the advantage that it requires no assumptions on the shape of the distribution. It is, however, only feasible if a large amount of data is available, which may not always be the case in practice. Furthermore, a priori it is not immediately clear whether a non-parametric distribution provides better one-step ahead forecasts than parametric distributions. The analysis in this thesis might therefore help researchers in selecting the best marginal specification for copula-based multivariate density forecasting purposes, under different market circumstances.

The size and power properties of the predictive accuracy tests of Diks et al. (2013), when focusing on the use of different predictive marginals, are examined in this thesis via Monte Carlo simulations. Results demonstrate that the predictive accuracy tests, using both the conditional and censored likelihood scoring rules, have satisfactory size and power properties in realistic sample sizes. We find that the tests based on the censored likelihood scoring rule, which uses more of the relevant information present, perform better in all cases considered.

To illustrate the usefulness of the suggested test for risk management, we consider an empirical application to daily returns of two equity indices. Based on the relative predictive accuracy of one-step-ahead density forecasts we find that different specifications for the marginal distributions achieve superior forecasting performance in different regions of the support. Our analysis shows the importance of using flexible marginal distributions, accommodating skewness and heavy tails in the distribution of the innovations. Specifically, we find that a non-parametric marginal distribution provides the best forecasts under extreme market circumstances, whereas it has a hard time forecasting under normal market circumstances. In that case, it is outperformed by most parametric distributions and we find that the best marginal specification is given by the Skewed Student-t distribution of Hansen (1994).

The remainder of this thesis is organised as follows. In Section 2 we briefly review copulas, density forecast evaluation using weighted scoring rules, and copula-based multivariate dynamic models. In Section 3 we investigate the size and power properties of the predictive accuracy tests, when focusing on the use of different predictive marginals. This is done by an extensive Monte Carlo study. In Section 4 we provide an empirical application of the predictive accuracy test, using a substantial number of different marginal specifications commonly observed in financial applications. In Section 5 we conclude.

2 Methodology

The aim of this thesis is to extend the analysis of Diks et al. (2010) and Diks et al. (2013), by enabling their copula-based tests of predictive accuracy to focus on the predictive accuracy of different marginal specifications, in specific regions of the support.

2.1 Copulas

We briefly discuss copulas here as they are used in the subsequent sections. Several surveys and books on copula theory and its applications have appeared in the literature to date; we provide a few examples here for the interested reader. A brief review of the literature on copula-based models for financial time series is given by Patton (2012), whereas Patton (2013) focuses on the use of copula-based models for economic forecasting purposes. Manner and Reznikova (2012) present a survey specifically focused on time-varying copulas. Finally, Joe (1997) and Nelsen (2006) are two leading textbooks on copula theory, with an emphasis on their statistical and mathematical foundations.

Consider a vector stochastic process $\{Z_t : \Omega \to \mathbb{R}^{k+d}\}_{t=1}^{T}$, defined on a complete probability space $(\Omega, \mathcal{F}, \mathbb{P})$. The process $Z_t$ is identified with $(Y_t', X_t')'$, where $Y_t : \Omega \to \mathbb{R}^d$ represents the $d$-dimensional vector of central interest, and $X_t : \Omega \to \mathbb{R}^k$ a vector of exogenous or pre-determined variables.

Sklar's (1959) theorem states that any $d$-dimensional joint distribution can be decomposed into its $d$ univariate marginal distributions and a $d$-dimensional copula. To be specific, denote by $F(y)$ the cumulative distribution function of $Y_t$ and let $F_i$, for $i = 1, \ldots, d$, denote the univariate marginal distributions of $Y_{i,t}$. Then there exists a copula function $C : [0,1]^d \to [0,1]$ such that, for all $y \in \mathbb{R}^d$,

$$F(y) = C(F_1(y_1), \ldots, F_d(y_d)). \quad (1)$$

The decomposition in Eq. (1) shows the practical usefulness of the copula approach for modelling multivariate distributions. Given any set of $d$ univariate marginal distributions $\{F_1, \ldots, F_d\}$ and any copula $C$, the function $F$ defined by Eq. (1) defines a valid joint distribution. Given that the marginal distributions $F_i$ only contain univariate information on the individual variables $Y_i$, their dependence is completely governed by the copula function $C$. Furthermore, the choice of the marginal distributions does not restrict the choice of the copula, or vice versa. This allows a wide range of joint distributions to be obtained by combining different marginal distributions and copulas. For example, one might combine a Normally distributed variable with an Exponentially distributed variable via a Student-t copula to obtain a strange, but valid, bivariate distribution.
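As a concrete illustration of this flexibility, the following minimal sketch (our own, not code from the thesis; it assumes numpy and scipy are available and uses illustrative parameter values) draws a sample from exactly this construction: a Student-t copula coupling a Normal margin with an Exponential margin.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
nu, rho, n = 5, 0.5, 10_000                      # copula parameters and sample size (illustrative)
cov = np.array([[1.0, rho], [rho, 1.0]])

# Draw from a bivariate Student-t distribution with correlation rho:
# a Gaussian draw divided by the square root of an independent chi-square/nu.
z = rng.multivariate_normal(np.zeros(2), cov, size=n)
w = rng.chisquare(nu, size=n) / nu
t = z / np.sqrt(w)[:, None]

# Map to the copula scale [0, 1]^2 with the univariate t CDF.
u = stats.t.cdf(t, df=nu)

# Apply the inverse marginal CDFs: a Normal margin and an Exponential margin.
y1 = stats.norm.ppf(u[:, 0])
y2 = stats.expon.ppf(u[:, 1])

print("sample correlation of (y1, y2):", round(float(np.corrcoef(y1, y2)[0, 1]), 3))
```

The margins of the resulting sample are exactly Normal and Exponential, while all of the dependence between the two components is inherited from the Student-t copula.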

In the current context of multivariate time series it is more natural to consider the conditional distribution $F(y\,|\,\mathcal{F}_{t-1})$ of $Y_t$, where $\mathcal{F}_{t-1}$ denotes the $\sigma$-field generated by $\{Y_{t-1}, Y_{t-2}, \ldots; X_t, X_{t-1}, \ldots\}$. Patton (2006b) provides an extension of Sklar's theorem for conditional distributions, which allows us to decompose $F(y\,|\,\mathcal{F}_{t-1})$ into its conditional marginal distributions and a conditional copula. That is, let

$$Y_t\,|\,\mathcal{F}_{t-1} \sim F(\cdot\,|\,\mathcal{F}_{t-1}), \qquad Y_{i,t}\,|\,\mathcal{F}_{t-1} \sim F_i(\cdot\,|\,\mathcal{F}_{t-1}), \quad \text{for } i = 1, \ldots, d,$$

then

$$F(y\,|\,\mathcal{F}_{t-1}) = C(F_1(y_1\,|\,\mathcal{F}_{t-1}), \ldots, F_d(y_d\,|\,\mathcal{F}_{t-1})\,|\,\mathcal{F}_{t-1}). \quad (2)$$

2.2 Multivariate density forecast evaluation

In this subsection we recall some of the relevant methodology of Diks et al. (2013) and Diks et al. (2011) on density forecast evaluation through the use of (weighted) likelihood-based scoring rules.

2.2.1 Multivariate density forecast evaluation

Consider the case where two competing multivariate density forecast models are available, each producing one-step ahead predictive densities of the random variable of central interest $Y_{t+1}$, that is, predictive densities of $Y_{t+1}$ based on $\mathcal{F}_t$, the information set at time $t$. The one-step ahead predictive densities of the competing forecast models are denoted by $\hat f_{A,t}(y)$ and $\hat f_{B,t}(y)$ respectively.

We evaluate the relative predictive accuracy of the competing density forecasts through the use of scoring rules; see Diebold and Lopez (1996). In the current context, a scoring rule $S^*(\hat f_t; y_{t+1})$ is a loss function assigning a numerical score based on the predictive density $\hat f_t$ and the actually observed variable of central interest $y_{t+1}$. That is, scoring rules measure the quality of the probabilistic forecasts and allow for a ranking of the competing density forecasts based on their obtained average scores. Following Gneiting and Raftery (2007), we prefer the use of scoring rules for which incorrect density forecasts $\hat f_t$ do not achieve a higher average score than the true conditional density $p_t$. That is,

$$\mathbb{E}_t[S^*(\hat f_t; Y_{t+1})] \le \mathbb{E}_t[S^*(p_t; Y_{t+1})], \quad \text{for all } t,$$

where $\mathbb{E}_t[\cdot] = \mathbb{E}[\cdot\,|\,\mathcal{F}_t]$. In the literature, scoring rules $S^*(\cdot)$ satisfying this condition are called proper.


Note that the true conditional density $p_t$ includes the true parameters (if any), whereas in practice predictive densities typically involve estimated parameters. In the case of non-vanishing estimation uncertainty, the use of proper scoring rules implies that even if a density forecast is correctly specified, its obtained average score may not achieve the upper bound $\mathbb{E}_t[S^*(p_t; Y_{t+1})]$. Therefore, as argued by Diks et al. (2011), a misspecified density forecast with a limited amount of estimation uncertainty may achieve a higher average score than a correctly specified density forecast with more estimation uncertainty. Although this may suggest that proper scoring rules are of limited practical use, they still allow us to give a higher average score to a density forecast which approximates the true conditional density $p_t$ more closely. That is, in the presence of non-vanishing estimation uncertainty, we have to keep in mind that this may be achieved by a density forecast based on a misspecified model. We illustrate this issue in the Monte Carlo study below.

2.2.2 The logarithmic scoring rule

One of the most popular scoring rules, the logarithmic scoring rule, is based on the Kullback-Leibler information criterion (KLIC). The KLIC provides a measure for the divergence between the true conditional density $p_t$ and the evaluated predictive density $f_t$ and is defined as

$$\mathrm{KLIC}(f_t, p_t) = \mathbb{E}_t\left[\log\frac{p_t(y)}{f_t(y)}\right] = \int \log\left(\frac{p_t(y)}{f_t(y)}\right) p_t(y)\,dy;$$

it is non-negative and zero only if $f_t$ and $p_t$ concur. We can use the KLIC to measure the relative predictive accuracy of two density forecasts; see for example Bao et al. (2004, 2007) and Diks et al. (2010). Although in practice the true conditional density $p_t$ is unknown, it cancels out when subtracting two KLICs: $\mathrm{KLIC}(f_{B,t}, p_t) - \mathrm{KLIC}(f_{A,t}, p_t) = \mathbb{E}[\log f_{A,t} - \log f_{B,t}]$. This motivates the use of the logarithmic scoring rule, given by

$$S^l(\hat f_t; y_{t+1}) = \log \hat f_t(y_{t+1}). \quad (3)$$

Hence, the logarithmic score given to a density forecast varies positively with the obtained value of $\hat f_t(y_{t+1})$.

Assume we have $P$ out-of-sample observations available for evaluation, $y_{R+1}, \ldots, y_T$, where $P = T - R$. Using the obtained average logarithmic scores, $P^{-1}\sum_{t=R}^{T-1} \log \hat f_{A,t}(y_{t+1})$ and $P^{-1}\sum_{t=R}^{T-1} \log \hat f_{B,t}(y_{t+1})$, we can rank the competing density forecasts $\hat f_{A,t}$ and $\hat f_{B,t}$, where, naturally, a density forecast with a higher average score would be the preferred one. Furthermore, the observed logarithmic score differences $d^l_{t+1} = \log \hat f_{A,t}(y_{t+1}) - \log \hat f_{B,t}(y_{t+1})$ may be used to test the null hypothesis of equal predictive accuracy, as explained later in this section. Note that the logarithmic score differences correspond with the relative KLIC measure of two competing density forecasts.


2.2.3 Weighted likelihood-based scoring rules

In practice, forecasters are often mostly interested in a particular region of the density. Consider, for example, risk management applications, where an accurate description of the left tail of the distribution is required in order to obtain good estimates of measures of downside risk. To address these issues, Diks et al. (2011) adapt the logarithmic scoring rule in Eq. (3) for evaluating density forecasts in a specific region of interest. As argued by Diks et al. (2011) and Gneiting and Ranjan (2011), this cannot simply be achieved by using the weighted logarithmic score $w_t(y_{t+1}) \log \hat f_t(y_{t+1})$, where $w_t(y_{t+1})$ is a weight function defining the region of interest on the support. By construction, the resulting test statistic will be biased towards density forecasts with more probability mass in the region of interest, reflecting the fact that the weighted logarithmic scoring rule is not a proper scoring rule. Diks et al. (2011) propose two solutions which do not suffer from this problem and remain proper, the conditional likelihood scoring rule and the censored likelihood scoring rule, which are extended to the multivariate case by Diks et al. (2013). The conditional likelihood (cl) scoring rule, which assigns a score given that the observation lies in the targeted region of interest, is given by

$$S^{cl}(\hat f_t; y_{t+1}) = w_t(y_{t+1}) \log\left(\frac{\hat f_t(y_{t+1})}{\int w_t(y) \hat f_t(y)\,dy}\right). \quad (4)$$

The censored likelihood (csl) scoring rule, with censoring of observations outside of the region of interest, is given by

$$S^{csl}(\hat f_t; y_{t+1}) = w_t(y_{t+1}) \log \hat f_t(y_{t+1}) + (1 - w_t(y_{t+1})) \log\left(1 - \int w_t(y) \hat f_t(y)\,dy\right). \quad (5)$$

Under the following assumptions, Diks et al. (2013) argue that the conditional likelihood scoring rule in Eq. (4) and the censored likelihood scoring rule in Eq. (5) are both proper scoring rules.

Assumption 1. The weight function $w_t(y)$ satisfies: (a) it is determined by the information available at time $t$, (b) $0 \le w_t(y) \le 1$, and finally (c) $\int w_t(y) p_t(y)\,dy > 0$.

Assumption 2. The density forecasts $\hat f_{A,t}$ and $\hat f_{B,t}$ satisfy $\mathrm{KLIC}(\hat f_{A,t}, p_t) < \infty$ and $\mathrm{KLIC}(\hat f_{B,t}, p_t) < \infty$.

Specifically, Assumption 1(c) avoids situations where the weight function $w_t(y)$ takes strictly positive values only outside the support of the data, and Assumption 2 makes sure that the expected score differences are finite.
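To fix ideas, here is a minimal univariate sketch (our own illustration, assuming numpy and scipy) of the scores in Eqs. (4) and (5) for a standard Normal density forecast and the threshold weight function $w(y) = I(y \le r)$, for which $\int w(y)\hat f(y)\,dy = \Phi(r)$ is available in closed form.

```python
import numpy as np
from scipy import stats

def cl_score(y, r):
    """Conditional likelihood score, Eq. (4): density renormalised on (-inf, r]."""
    mass = stats.norm.cdf(r)                     # integral of w(y) f_hat(y) dy
    return np.where(y <= r, stats.norm.logpdf(y) - np.log(mass), 0.0)

def csl_score(y, r):
    """Censored likelihood score, Eq. (5): observations outside the region score log(1 - mass)."""
    mass = stats.norm.cdf(r)
    return np.where(y <= r, stats.norm.logpdf(y), np.log(1.0 - mass))

y = np.array([-2.0, -0.7, 0.3, 1.5])             # toy realisations
print(cl_score(y, r=-0.5))
print(csl_score(y, r=-0.5))
```

Note how the censored score retains information from observations outside the region (through the censored term), whereas the conditional score discards them entirely; this is the intuition behind the power differences found below.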


2.2.4 Copula-based density forecast evaluation using weighted likelihood scoring rules

We proceed by focusing on the comparison of two multivariate density forecasts which are obtained using a copula approach. For ease of notation, we now denote conditional distributions with a simple subscript notation, for example $F_t(\cdot) = F(\cdot\,|\,\mathcal{F}_t)$. Recall from Eq. (2) that the conditional multivariate distribution $F_t(y_{t+1})$ can be decomposed into its conditional marginal distributions $F_{i,t}(y_i)$, for $i = 1, \ldots, d$, and a conditional copula $C_t(\cdot)$, that is,

$$F_t(y) = C_t(F_{1,t}(y_1), \ldots, F_{d,t}(y_d)), \quad (6)$$

provided that the marginal conditional CDFs $F_{i,t}$ are continuous. The one-step ahead predictive log-likelihood associated with $y_{t+1}$ based on Eq. (6) is then seen to be given by

$$\sum_{i=1}^{d} \log \hat f_{i,t}(y_{i,t+1}) + \log \hat c_t(\hat F_{1,t}(y_{1,t+1}), \ldots, \hat F_{d,t}(y_{d,t+1})), \quad (7)$$

where $\hat f_{i,t}(y_{i,t+1})$, for $i = 1, \ldots, d$, are the conditional marginal predictive densities and $\hat c_t$ is the conditional copula predictive density, defined as

$$c_t(u_1, \ldots, u_d) = \frac{\partial^d}{\partial u_1 \cdots \partial u_d}\, C_t(u_1, \ldots, u_d),$$

which we will assume to exist throughout. Using Eq. (7), Diks et al. (2013) show that the conditional likelihood (cl) and censored likelihood (csl) scoring rules of a density forecast $\hat f_t$ can be decomposed as

$$S^{cl}_{t+1} = w_t(y_{t+1}) \left( \sum_{i=1}^{d} \log \hat f_{i,t}(y_{i,t+1}) + \log \hat c_t(\hat u_{t+1}) \right) - w_t(y_{t+1}) \log\left(\int w_t(y) \hat f_t(y)\,dy\right), \quad (8)$$

and

$$S^{csl}_{t+1} = w_t(y_{t+1}) \left( \sum_{i=1}^{d} \log \hat f_{i,t}(y_{i,t+1}) + \log \hat c_t(\hat u_{t+1}) \right) + (1 - w_t(y_{t+1})) \log\left(1 - \int w_t(y) \hat f_t(y)\,dy\right), \quad (9)$$

where $\hat u_{t+1} = (\hat F_{1,t}(y_{1,t+1}), \ldots, \hat F_{d,t}(y_{d,t+1}))'$ is the multivariate conditional probability integral transform (PIT). When comparing the predictive accuracy of two competing density forecasts, both are assumed to have well-defined marginal and copula densities.

Unless stated otherwise, the weight function $w_t(y_{t+1})$ used in both scoring rules will take the form of an indicator function of a given fixed subset of the support. We will consider both time-independent and time-varying weight functions. An example of a time-independent weight function is the threshold weight function $w(y) = I(y_1 \le r, \ldots, y_d \le r)$, where the threshold $r$ defines the upper bound of the lower left tail region $(-\infty, r]^d$ under evaluation. Finally, for the special case where $w_t(y_{t+1}) \to 1$, the two considered scoring rules approach each other and in the limit the original logarithmic scoring rule

$$S^l(\hat f_t; y_{t+1}) = \log \hat f_t(y_{t+1}) \quad (10)$$

is retrieved.

2.2.5 Tests of equal predictive accuracy

Assume that two competing density forecasts $\hat f_{A,t}$ and $\hat f_{B,t}$ and corresponding realisations of the vector of central interest $Y_{t+1}$ are available for $t = R, R+1, \ldots, T-1$. We may then formally test whether the average score difference between $\hat f_{A,t}$ and $\hat f_{B,t}$ is statistically significantly different from zero.

Given a scoring rule $S^*$, formal tests of equal predictive accuracy may be based on the obtained score differences

$$d^*_{t+1} = S^*(\hat f_{A,t}; y_{t+1}) - S^*(\hat f_{B,t}; y_{t+1}),$$

for the null hypothesis of equal scores

$$H_0 : \mathbb{E}[d^*_{t+1}] = 0, \quad \text{for all } t = R, R+1, \ldots, T-1.$$

We adopt the framework of Giacomini and White (2006), by assuming that all parameters in the multivariate conditional densities are estimated using a finite rolling window of $R$ past observations. That leaves $P$ out-of-sample observations available for evaluation, for which we obtain the score differences $d^*_{t+1}$, and we let $\bar d^*_{R,P}$ denote the sample average of the score differences, i.e. $\bar d^*_{R,P} = P^{-1}\sum_{t=R}^{T-1} d^*_{t+1}$. To test the null of equal predictive accuracy of the two competing forecasting models, we may use a Diebold and Mariano (1995) type test statistic

$$Q_{R,P} = \frac{\sqrt{P}\,\bar d^*_{R,P}}{\sqrt{\hat\sigma^2_{R,P}}}, \quad (11)$$

where $\hat\sigma^2_{R,P}$ denotes a heteroskedasticity and autocorrelation-consistent (HAC) variance estimator of $\sigma^2_{R,P} = \mathrm{Var}(\sqrt{P}\,\bar d^*_{R,P})$, satisfying $\hat\sigma^2_{R,P} - \sigma^2_{R,P} \to_p 0$. This HAC variance estimator is needed as it is likely that the score differences $d^*_{t+1}$ are serially correlated and heteroskedastic. Throughout this thesis we use as Newey and West (1987) HAC estimator $\hat\sigma^2_{R,P} = \hat\gamma_0 + 2\sum_{j=1}^{m} a_j \hat\gamma_j$, where $\hat\gamma_j$ denotes the lag-$j$ sample covariance of the sequence $\{d^*_{t+1}\}_{t=R}^{T-1}$ and the $a_j$ are the Bartlett weights $a_j = 1 - j/(m+1)$, where the lag length $m$ is given by the Newey and West (1994) plug-in estimate $m = 4(P/100)^{2/9}$.
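A compact sketch of the statistic in Eq. (11), with the Newey-West HAC variance, Bartlett weights and the plug-in lag length described above, might look as follows (our own illustration, assuming numpy; the score differences here are simulated toy data).

```python
import numpy as np

def dm_statistic(d):
    """Q_{R,P} of Eq. (11) for a 1-D array of score differences d."""
    P = len(d)
    dbar = d.mean()
    m = int(4 * (P / 100.0) ** (2.0 / 9.0))      # Newey-West (1994) plug-in lag length
    dc = d - dbar
    gamma0 = dc @ dc / P                                        # sample variance
    gammas = [dc[j:] @ dc[:-j] / P for j in range(1, m + 1)]    # sample autocovariances
    weights = [1.0 - j / (m + 1.0) for j in range(1, m + 1)]    # Bartlett weights
    sigma2 = gamma0 + 2.0 * sum(w * g for w, g in zip(weights, gammas))
    return np.sqrt(P) * dbar / np.sqrt(sigma2)   # asymptotically N(0, 1) under the null

d = np.random.default_rng(1).normal(0.02, 1.0, size=1000)       # toy score differences
print("Q_{R,P} =", round(float(dm_statistic(d)), 3))
```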

The use of a moving estimation window of $R$ past observations considerably simplifies the asymptotic theory of the test of equal predictive accuracy, as argued by Giacomini and White (2006). Under this framework, the asymptotic distribution under the null is characterised by the following theorem.

Theorem 1. The Diebold-Mariano type test statistic $Q_{R,P}$ in Eq. (11) is asymptotically standard normally distributed under the null hypothesis, as $P \to \infty$ with $R$ fixed, if: (a) $\{Z_t\}$ is $\phi$-mixing of size $-q/(2q-2)$ for $q \ge 2$, or $\alpha$-mixing of similar size $-q/(2q-2)$ for $q > 2$; (b) $\mathbb{E}[|d^*_{t+1}|^{2q}] < \infty$ for all $t = R, R+1, \ldots, T-1$; and (c) $\sigma^2_{R,P} > 0$ for all $P$ sufficiently large.

Proof. A proof can be found in Theorem 4 of Giacomini and White (2006).

2.3 Copula-based multivariate dynamic models

The models used in this thesis are assumed to be specified as general copula-based multivariate dynamic models, given by

$$Y_t = \mu_t(\theta_1) + \sqrt{H_t(\theta)}\,\varepsilon_t, \quad (12)$$

where

$$\mu_t(\theta_1) = (\mu_{1,t}(\theta_1), \ldots, \mu_{d,t}(\theta_1))' = \mathbb{E}[Y_t\,|\,\mathcal{F}_{t-1}]$$

is a specification of the conditional mean, parametrised by a finite-dimensional vector of parameters $\theta_1$, and

$$H_t(\theta) = \mathrm{diag}(h_{1,t}(\theta), \ldots, h_{d,t}(\theta)),$$

where

$$h_{i,t}(\theta) = h_{i,t}(\theta_1, \theta_2) = \mathbb{E}[(Y_{i,t} - \mu_{i,t}(\theta_1))^2\,|\,\mathcal{F}_{t-1}], \quad i = 1, \ldots, d,$$

is the conditional variance of $Y_{i,t}$ given $\mathcal{F}_{t-1}$, parametrised by a finite-dimensional vector of parameters $\theta_2$, where $\theta_1$ and $\theta_2$ do not have common elements. This approach allows for a wide variety of models for the conditional mean: ARMA models, vector autoregressions, linear and nonlinear regressions, and others. Furthermore, it allows for a variety of models for the conditional variance: ARCH and any of its numerous parametric extensions (e.g. GARCH, EGARCH and GJR-GARCH, see Teräsvirta (2009)), stochastic volatility, and others.

The innovations $\varepsilon_t = (\varepsilon_{1,t}, \ldots, \varepsilon_{d,t})'$ are assumed to be independent of $\mathcal{F}_{t-1}$ and independently and identically distributed (i.i.d.). Using the extension of Sklar's theorem as given in Eq. (2), the conditional joint distribution $F_t(\varepsilon)$ of the innovations $\varepsilon_t$ can be written as

$$F_t(\varepsilon) = C(F_{1,t}(\varepsilon_1), \ldots, F_{d,t}(\varepsilon_d)) \equiv C(u_{1,t}, \ldots, u_{d,t}; \alpha), \quad (13)$$

where $C(u_{1,t}, \ldots, u_{d,t}; \alpha)$ denotes the parametric copula function with parameter vector $\alpha$ and the $F_{i,t}$ denote the conditional marginal distributions of the innovations for $i = 1, \ldots, d$.

Recall from the Introduction that we focus in this thesis on the predictive accuracy of different specifications for the marginal distributions of the financial returns $Y_t$. Under the assumption of the general copula-based multivariate dynamic models, the marginal distributions of $Y_t$ are defined by specifications for the conditional mean and variance together with a specification for the marginal distributions of the standardised innovations. Bao et al. (2007), who examine the predictive accuracy of a variety of GARCH models, find that the accuracy of density forecasts depends more on the choice of the distribution of the innovations than on the conditional mean and variance specification. Hence, from now on, in comparing the predictive accuracy of different marginal specifications for $Y_t$ we will restrict ourselves to the use of different specifications for the conditional marginal distributions $F_{i,t}$ of the innovations.

The conditional marginal distributions $F_{i,t}$ of the innovations are often treated in one of two ways, either parametrically or non-parametrically. The models with parametrically specified marginal distributions are typically estimated in two stages. First, for a given marginal distribution, univariate quasi-maximum likelihood is used to estimate the parameters $\theta_1$ and $\theta_2$ separately for each marginal distribution. The estimated innovations $\hat\varepsilon_{i,t}$ are then obtained as

$$\hat\varepsilon_{i,t} = \frac{y_{i,t} - \mu_{i,t}(\hat\theta_1)}{\sqrt{h_{i,t}(\hat\theta)}}, \quad i = 1, \ldots, d. \quad (14)$$

Second, the parameters $\alpha$ of the given parametric copula specification are estimated by maximising the corresponding copula log-likelihood, conditioning on the parameters of the marginals obtained in the first stage. Separately maximising the parameters for the copula and the marginals is often referred to as "inference functions for margins", see for example Joe (1997) and Joe and Xu (1996), or more generally as multi-stage maximum likelihood estimation (MSMLE). Clearly, MSMLE is asymptotically less efficient than one-stage maximum likelihood estimation (MLE). This loss does, however, not seem to be substantial in many cases, as shown by simulation studies in Joe (2005) and Patton (2006a). The main appeal of MSMLE relative to (one-stage) MLE is the ease of estimation: by breaking the full parameter vector into parts, the estimation problem is often greatly simplified and no assumptions on the marginal distributions of $\varepsilon_{i,t}$ are required.
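The following stylised sketch illustrates the two stages (our own, under simplifying assumptions not made in the thesis: Student-t margins on toy data and a bivariate Gaussian copula, for which a standard estimator of the copula correlation given the margins is the sample correlation of the normal scores of the PITs).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Toy bivariate data with dependent, heavy-tailed components.
x = rng.standard_t(df=6, size=(5000, 2))
x[:, 1] = 0.6 * x[:, 0] + 0.8 * x[:, 1]

# Stage 1: fit each margin separately by (quasi-)maximum likelihood
# and transform the data into probability integral transforms (PITs).
margins = [stats.t.fit(x[:, i]) for i in range(2)]            # (df, loc, scale) per margin
u = np.column_stack([stats.t.cdf(x[:, i], *margins[i]) for i in range(2)])

# Stage 2: estimate the copula parameter conditioning on the fitted margins.
# For a Gaussian copula, the sample correlation of the normal scores of the
# PITs is a standard estimator of the copula correlation.
z = stats.norm.ppf(u)
rho_hat = np.corrcoef(z[:, 0], z[:, 1])[0, 1]
print("estimated Gaussian-copula correlation:", round(float(rho_hat), 3))
```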


Chen and Fan (2006) propose estimating the marginal distributions of the innovations non-parametrically and introduce the class of semi-parametric copula-based multivariate dynamic models (SCOMDY). They propose a three-stage procedure to obtain parameter estimates in the SCOMDY model. The first stage coincides with the first stage of the parametric estimation, under the assumption of normality of the estimated innovations $\hat\varepsilon_{i,t}$. Second, an estimate of the marginal distribution $F_{i,t}(\cdot)$ is obtained for each univariate series by means of the empirical cumulative distribution function (EDF) transformation of the estimated innovations. The (rescaled) EDF used here is given by

$$\hat F_{i,t}(\varepsilon) = (R+1)^{-1} \sum_{s=t-R+1}^{t} I[\hat\varepsilon_{i,s} \le \varepsilon], \quad (15)$$

where $I[A]$ is an indicator function for the event $A$ and $R$ denotes the rolling window estimation size. Note that we scale the EDF by $1/(R+1)$; this has no effect asymptotically and is useful in finite samples for keeping the estimated probability integral transforms away from the boundaries of the unit interval, where some copula models diverge. That is, PIT values obtained from the rescaled EDF will range from $1/(R+1)$ to $R/(R+1)$, with equal steps in between of $1/(R+1)$. Finally, for the SCOMDY model, the parameters of a given parametric copula are estimated by maximum likelihood, conditioning on the estimates of the marginal EDFs obtained in the second stage.
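A minimal sketch of the rescaled EDF transformation in Eq. (15), evaluated at the window observations themselves (our own illustration, assuming numpy):

```python
import numpy as np

def rescaled_edf_pits(eps_window):
    """Rescaled-EDF value of Eq. (15) for every innovation in the window."""
    R = len(eps_window)
    ranks = np.argsort(np.argsort(eps_window)) + 1            # ranks 1..R
    return ranks / (R + 1.0)                                  # in {1/(R+1), ..., R/(R+1)}

eps = np.random.default_rng(3).standard_normal(1000)
u = rescaled_edf_pits(eps)
print(round(u.min(), 6), round(u.max(), 6))                   # ~1/1001 and ~1000/1001
```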

2.4 Some practical remarks

We conclude this section with some practical remarks on the implementation of the tests of equal predictive accuracy used in this thesis. Firstly, both the conditional likelihood and censored likelihood scoring rules require the evaluation of a multi-dimensional integral

$$A = \int w_t(y) \hat f_t(y)\,dy. \quad (16)$$

In this thesis Monte Carlo integration methods (see for example Newman and Barkema (1999)) are used to numerically evaluate it. This method is especially useful in higher dimensions and works as follows. Given that the weight function $w_t$ defines a subset $\Omega$ of $\mathbb{R}^d$, we can rewrite the integral $A$ as

$$A = \int_\Omega \hat f_t(y)\,dy,$$

where $\Omega$ has volume

$$V = \int_\Omega dy.$$

The Monte Carlo approach is to sample points uniformly on $\Omega$. Given these $N$ uniform samples, $y_1^*, \ldots, y_N^* \in \Omega$, the integral $A$ can then be consistently approximated by

$$A \approx Q_N = \frac{V}{N} \sum_{i=1}^{N} \hat f_t(y_i^*).$$

This follows as a law of large numbers ensures that $\lim_{N\to\infty} Q_N = A$. It can furthermore be shown that the estimation error of $Q_N$ decreases asymptotically to zero at rate $1/\sqrt{N}$, independent of the dimensionality of the integral. In this thesis we set $N = 100{,}000$, which seems sufficiently precise while keeping computational time within bounds.
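As an illustration (our own, assuming numpy and scipy; a bivariate independent standard Normal stands in for the forecast density so the answer can be checked in closed form):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
r, N, d = 0.5, 100_000, 2
V = (2 * r) ** d                                   # volume of Omega = [-r, r]^d
y = rng.uniform(-r, r, size=(N, d))                # N uniform samples on Omega
f = stats.norm.pdf(y).prod(axis=1)                 # joint density of independent N(0, 1)
A_hat = V * f.mean()                               # Q_N; error shrinks at rate 1/sqrt(N)

A_exact = (stats.norm.cdf(r) - stats.norm.cdf(-r)) ** d       # closed form for comparison
print(round(float(A_hat), 4), round(float(A_exact), 4))
```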

Secondly, weight functions used in this thesis focus on subsets of the support $\mathbb{R}^d$. Note that for given CDFs $F_i$, we can rewrite these subsets of $\mathbb{R}^d$ equally as regions defined on the copula support $[0,1]^d$. This follows because for all $i = 1, \ldots, d$ we have $F_i : \mathbb{R} \to [0,1]$, where $F_i$ is non-negative and non-decreasing, and moreover $F_i(\infty) = 1$ and $F_i(-\infty) = 0$. Given that in this thesis we focus on two competing forecasts that use different CDFs $F_i$, the region analysed on the copula support $[0,1]^d$ will not be the same for the two competing forecast methods. For example, assume that we are analysing a subset of $\mathbb{R}^d$ that takes the form of a $d$-dimensional hypercube $[a,b]^d$ with $b > a$. For each forecasting method, using a different specification for the marginal distributions, there is an equivalent area on the copula support $[0,1]^d$ that takes the form of a $d$-dimensional hyper-rectangle. Due to the different CDFs $F_i$ used, the size of the hyper-rectangles will differ between the two forecasting methods. Finally, the shape of a hyper-rectangle follows because in general we have $\hat F_{i,t}(x) \ne \hat F_{j,t}(x)$ for $j \ne i$ and $i, j = 1, \ldots, d$.

Thirdly, in the case of parametric conditional marginal densities, the probability integral transforms (PITs) $\hat u_{t+1}$ can be evaluated directly when a closed-form expression for the CDF is available. If no closed-form expression is available, numerical integration techniques can be used to obtain the PITs. If the marginal densities are modelled non-parametrically, the PITs are similarly estimated non-parametrically. Specifically, we now describe how the PITs $\hat u_{t+1}$ are obtained for the SCOMDY model, which will be used in the empirical application below and was introduced and discussed in Section 2.3. Let $\hat\theta_t$ denote a point estimate of the parameter vector $\theta = (\theta_1', \theta_2')'$ characterising the conditional means and variances, obtained at time $t$ using the $R$ observations $Y_{t-R+1}, \ldots, Y_t$. These estimates can be used to compute the sequence of in-sample innovations $\{\hat\varepsilon_s\}_{s=t-R+1}^{t}$, where $\hat\varepsilon_s = (\hat\varepsilon_{1,s}, \ldots, \hat\varepsilon_{d,s})'$ and

$$\hat\varepsilon_{i,s} = \frac{y_{i,s} - \mu_{i,s}(\hat\theta_{1,t})}{\sqrt{h_{i,s}(\hat\theta_t)}}, \quad i = 1, \ldots, d.$$


In addition, the one-step ahead forecast error $\hat\varepsilon_{i,t+1|t}$ can be obtained as

$$\hat\varepsilon_{i,t+1|t} = \frac{y_{i,t+1} - \mu_{i,t+1}(\hat\theta_{1,t})}{\sqrt{h_{i,t+1}(\hat\theta_t)}}, \quad i = 1, \ldots, d.$$

The PIT $\hat u_{t+1}$ associated with the forecast errors $\hat\varepsilon_{i,t+1|t}$ is then given by $\hat u_{t+1} = (\hat u_{1,t+1}, \ldots, \hat u_{d,t+1})'$, where

$$\hat u_{i,t+1} = \frac{\text{rank of } \hat\varepsilon_{i,t+1|t} \text{ among } \{\hat\varepsilon_{i,t-R+1}, \ldots, \hat\varepsilon_{i,t}, \hat\varepsilon_{i,t+1|t}\}}{R+2}. \quad (17)$$

The division by $R+2$ rather than by the estimation sample size $R+1$ ensures that the estimated out-of-sample PITs do not fall at the boundaries of the copula support $[0,1]^d$, where some copula densities diverge.
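A minimal sketch of the rank-based PIT in Eq. (17) (our own illustration, assuming numpy):

```python
import numpy as np

def oos_pit(eps_window, eps_next):
    """PIT of Eq. (17): rank of eps_next among the window plus itself, over R + 2."""
    pooled = np.append(eps_window, eps_next)
    rank = int(np.sum(pooled <= eps_next))         # rank among the R + 1 pooled values
    return rank / (len(eps_window) + 2.0)

eps = np.random.default_rng(5).standard_normal(1000)
print(round(oos_pit(eps, eps_next=1.8), 4))
```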

Finally, in the empirical application below we use the EDF given in Eq. (15) as a non-parametric estimate of the marginal distribution of the innovations. Note that the EDF does not have a unique counterpart for the density, which is needed to compute the log-likelihood in the tests of equal predictive accuracy. To overcome this, we use a non-parametric density estimate based on a Gaussian kernel:

$$\hat f_{i,t}(x_{i,0}) = (Rh)^{-1} \sum_{s=t-R+1}^{t} K\left(\frac{\hat\varepsilon_{i,s} - x_{i,0}}{h}\right), \quad \text{for } i = 1, \ldots, d,$$

where $K(z) = (2\pi)^{-1/2}\exp(-z^2/2)$. For the bandwidth $h$ we use Silverman's (1986) plug-in bandwidth, given for the Gaussian kernel by $h = 1.059\,\hat\sigma R^{-1/5}$, where $\hat\sigma^2$ is the sample variance of the innovations (which is one for both series). We evaluate $\hat f_{i,t}(x_{i,0})$ at $1{,}000$ equally spaced values of $x_{i,0}$ within the range of the estimated innovations $\{\hat\varepsilon_{i,s}\}_{s=t-R+1}^{t}$. Finally, $\hat f_{i,t}(\hat\varepsilon_{i,t+1|t})$ is obtained by linear interpolation. In the rare event that $\hat\varepsilon_{i,t+1|t}$ falls outside the range of $\{\hat\varepsilon_{i,s}\}_{s=t-R+1}^{t}$, we use as an approximation the obtained log-likelihood value of the Skew-t distribution, introduced in the next section.
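A sketch of this kernel density estimate with the Silverman bandwidth (our own illustration, assuming numpy):

```python
import numpy as np

def kde_gaussian(eps_window, x0):
    """Gaussian-kernel density estimate with Silverman's bandwidth h = 1.059 s R^(-1/5)."""
    R = len(eps_window)
    h = 1.059 * eps_window.std(ddof=1) * R ** (-1.0 / 5.0)
    z = (eps_window[None, :] - x0[:, None]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (R * h * np.sqrt(2.0 * np.pi))

eps = np.random.default_rng(6).standard_normal(1000)
grid = np.linspace(eps.min(), eps.max(), 1000)     # 1,000 equally spaced evaluation points
fhat = kde_gaussian(eps, grid)
print(np.round(fhat[:3], 4))
```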

3 Monte Carlo Study

In this section we use Monte Carlo simulations to examine the finite-sample behaviour of the predictive accuracy tests when comparing alternative specifications for the marginal distributions, in selected regions of the support. Both the conditional likelihood scoring rule in Eq. (8) and the censored likelihood scoring rule in Eq. (9) are evaluated.

3.1 Simulation setup

In all experiments, the data generating process (DGP) for the univariate series $Y_{i,t}$, for $i = 1, \ldots, d$, is based on a GJR-GARCH(1, 1, 1) specification for the conditional variance (see Glosten et al. (1993)). That is, we allow the univariate series in the DGP to account for two well-known stylised facts of financial returns: volatility clustering and the leverage effect. Coefficients used in the DGP are typical for financial applications. In particular,

$$Y_{i,t} = \sqrt{h_{i,t}}\,\varepsilon_{i,t}, \qquad h_{i,t} = 0.017 + (0.02 + 0.13\,I\{Y_{i,t-1} < 0\})\,Y_{i,t-1}^2 + 0.89\,h_{i,t-1}, \quad (18)$$

for $i = 1, \ldots, d$. The standardised innovations $\varepsilon_{i,t}$ are i.i.d. with mean zero and variance one. Their dependence is governed by the Student-t copula, with degrees of freedom parameter $\nu = 5$ and all off-diagonal correlations set at $\rho = 0.5$. The marginal distributions $F_{i,t}$ for the standardised innovations used in this Monte Carlo study are either given by the Normal distribution, the standardised Student-t distribution or the Skew-t distribution of Hansen (1994).
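For concreteness, a univariate sketch of the recursion in Eq. (18) (our own illustration, assuming numpy; here the innovations are i.i.d. standard Normal for simplicity, whereas the thesis draws them jointly through the Student-t copula):

```python
import numpy as np

def simulate_gjr_garch(T, rng):
    """Simulate Eq. (18): h_t = 0.017 + (0.02 + 0.13 I{y_{t-1} < 0}) y_{t-1}^2 + 0.89 h_{t-1}."""
    omega, alpha, gamma, beta = 0.017, 0.02, 0.13, 0.89
    h = omega / (1.0 - alpha - gamma / 2.0 - beta)  # start at the unconditional variance
    y = np.empty(T)
    for t in range(T):
        y[t] = np.sqrt(h) * rng.standard_normal()
        h = omega + (alpha + gamma * (y[t] < 0.0)) * y[t] ** 2 + beta * h
    return y

y = simulate_gjr_garch(2000, np.random.default_rng(7))
print("sample standard deviation:", round(float(y.std()), 3))
```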

The Student-t copula density is given by

$$c(u; \Sigma, \nu) = \frac{|\Sigma|^{-1/2}\,\Gamma\!\left(\frac{\nu+d}{2}\right)\Gamma^{d-1}\!\left(\frac{\nu}{2}\right)}{\Gamma^{d}\!\left(\frac{\nu+1}{2}\right)} \cdot \frac{\left(1 + \frac{T_\nu^{-1}(u)'\,\Sigma^{-1}\,T_\nu^{-1}(u)}{\nu}\right)^{-(\nu+d)/2}}{\prod_{i=1}^{d}\left(1 + \frac{(T_\nu^{-1}(u_i))^2}{\nu}\right)^{-(\nu+1)/2}}, \quad (19)$$

where $T_\nu^{-1}(u) = (T_\nu^{-1}(u_1), \ldots, T_\nu^{-1}(u_d))'$ and $T_\nu^{-1}(\cdot)$ is the inverse of the univariate Student-t cumulative distribution function. Furthermore, $\Gamma(\cdot)$ is the gamma function, $\nu$ is the degrees of freedom parameter with restriction $\nu > 2$, and $\Sigma$ defines the correlation matrix. The Student-t copula nests the Gaussian copula when $\nu \to \infty$.

The Student-t copula has, in contrast to the Gaussian copula, the ability to capture tail dependence. Lower tail dependence is defined as $\lambda_L = \lim_{q\downarrow 0} C(q, \ldots, q)/q$ and upper tail dependence as $\lambda_U = \lim_{q\downarrow 0} \tilde C(q, \ldots, q)/q$, where $\tilde C$ is the survival copula of $\varepsilon_t$, that is, the copula of $-\varepsilon_t$. Specifically, the Student-t copula has symmetric and positive tail dependence. Positive tail dependence between assets is often observed in financial applications and motivates the use of the Student-t copula in the experiments below. For the bivariate case the tail dependence coefficients of the Student-t copula are given by

$$\lambda_L = \lambda_U = 2\,T_{\nu+1}\!\left(-\sqrt{(\nu+1)(1-\rho)/(1+\rho)}\right),$$

which are increasing in $\rho$ and decreasing in $\nu$.
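A quick numerical check of this coefficient for the DGP values $\nu = 5$ and $\rho = 0.5$ (our own illustration, assuming scipy):

```python
import numpy as np
from scipy import stats

def t_copula_tail_dependence(nu, rho):
    """lambda_L = lambda_U = 2 T_{nu+1}(-sqrt((nu+1)(1-rho)/(1+rho)))."""
    return 2.0 * stats.t.cdf(-np.sqrt((nu + 1.0) * (1.0 - rho) / (1.0 + rho)), df=nu + 1)

print(round(float(t_copula_tail_dependence(nu=5, rho=0.5)), 3))
```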

The density of the univariate standard Normal distribution is given by

$$f_N(\varepsilon_t) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{1}{2}\varepsilon_t^2\right). \quad (20)$$

The standard Normal distribution is symmetric around zero, has mean zero, unit variance and no excess kurtosis, and requires no parameters to be estimated.

The density of the univariate standardised Student-t distribution is given by

$$f_{St\text{-}t}(\varepsilon_t; \nu_m) = \frac{\Gamma((\nu_m+1)/2)}{\Gamma(\nu_m/2)\sqrt{(\nu_m-2)\pi}} \left(1 + \frac{\varepsilon_t^2}{\nu_m - 2}\right)^{-(\nu_m+1)/2}, \quad (21)$$

with the restriction $\nu_m > 2$. Note that for this distribution we denote the degrees of freedom parameter by $\nu_m$, where the 'm' stands for marginal. This is done to emphasise the difference with the Student-t copula degrees of freedom parameter $\nu$. The requirement $\nu_m > 2$ is needed for the mean and variance of the non-standardised Student-t marginal distribution to exist, which allows it to be standardised. The standardised Student-t distribution has mean zero, unit variance and is symmetric around zero. It belongs to the family of leptokurtic (or fat-tailed) distributions and implies excess kurtosis $k = 6/(\nu_m - 4)$ for $\nu_m > 4$, and $k = \infty$ for $2 < \nu_m \le 4$. When $\nu_m \to \infty$ the standardised Student-t distribution approaches the standard Normal distribution.

The Skewed Student-t (Skew-t) distribution was first proposed by Hansen (1994); its density is defined as

$$f_{Sk\text{-}t}(\varepsilon_t; \gamma, \lambda) = \begin{cases} bc\left(1 + \frac{1}{\gamma-2}\left(\frac{b\varepsilon_t + a}{1-\lambda}\right)^2\right)^{-(\gamma+1)/2} & \text{if } \varepsilon_t < -a/b, \\[1ex] bc\left(1 + \frac{1}{\gamma-2}\left(\frac{b\varepsilon_t + a}{1+\lambda}\right)^2\right)^{-(\gamma+1)/2} & \text{if } \varepsilon_t \ge -a/b, \end{cases} \quad (22)$$

where $2 < \gamma < \infty$ and $-1 < \lambda < 1$. The constants $a$, $b$ and $c$ are given by

$$a = 4\lambda c\left(\frac{\gamma-2}{\gamma-1}\right), \qquad b^2 = 1 + 3\lambda^2 - a^2, \qquad c = \frac{\Gamma\!\left(\frac{\gamma+1}{2}\right)}{\sqrt{\pi(\gamma-2)}\,\Gamma\!\left(\frac{\gamma}{2}\right)}.$$

The Skew-t distribution is a generalisation of the Student-t distribution which allows for skewness in the distribution. The two parameters $\gamma$ and $\lambda$ control the shape of the Skew-t distribution. The degree of skewness of the distribution is controlled by $\lambda$: if $\lambda$ is negative the probability mass concentrates in the left tail, and vice versa if $\lambda$ is positive. The degrees of freedom parameter $\gamma$ controls the thickness of the tails. Furthermore, the distribution has zero mean and unit variance, and expressions for the higher moments were derived by Jondeau and Rockinger (2003). Note that when $\lambda = 0$ we recover the standardised Student-t distribution, when $\lambda \ne 0$ and $\gamma \to \infty$ we obtain a skewed Normal distribution, and finally when both $\gamma \to \infty$ and $\lambda = 0$ we obtain the standard Normal distribution.
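A sketch implementation of the density in Eq. (22), with the constants $a$, $b$ and $c$ as above (our own; worth verifying against a reference implementation before relying on it):

```python
import numpy as np
from scipy.special import gammaln

def skew_t_pdf(x, gam, lam):
    """Hansen's (1994) Skew-t density, Eq. (22)."""
    c = np.exp(gammaln((gam + 1.0) / 2.0) - gammaln(gam / 2.0)) / np.sqrt(np.pi * (gam - 2.0))
    a = 4.0 * lam * c * (gam - 2.0) / (gam - 1.0)
    b = np.sqrt(1.0 + 3.0 * lam ** 2 - a ** 2)
    denom = np.where(x < -a / b, 1.0 - lam, 1.0 + lam)        # left/right branch
    return b * c * (1.0 + ((b * x + a) / denom) ** 2 / (gam - 2.0)) ** (-(gam + 1.0) / 2.0)

x = np.linspace(-8.0, 8.0, 100_001)
p = skew_t_pdf(x, gam=10.0, lam=-0.3)
print("numerical integral:", round(float(np.trapz(p, x)), 4))  # should be close to 1
```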

All models considered in the Monte Carlo simulation study are fully parametric and estimated using multi-stage maximum likelihood (as explained in Section 2.3). For each of the univariate series $Y_{i,t}$ we assume we know the correct GJR-GARCH(1, 1, 1) specification, but not the parameter values. For the considered parametric marginal distributions of the innovations we estimate the parameters (if any) by maximum likelihood and then use the obtained CDF to transform the innovations into their probability integral transforms (PITs). These PITs are then used to estimate the Student-t copula parameters. Note that estimation of the copula parameters plays an important part in the predictive accuracy of different specifications for the marginal distributions. This follows because Fermanian and Scaillet (2004) find that the use of incorrect specifications for the marginal distributions will lead to biased estimates of the copula parameters. Naturally, a larger bias will have a larger negative impact on the predictive accuracy of a particular marginal distribution specification.

In the simulation experiments the focus is on the bivariate case $d = 2$. The number of observations for the moving in-sample window is set to $R = 1{,}000$ and we compare the results for two different out-of-sample forecasting periods, $P = 1{,}000$ and $P = 5{,}000$. Recall that the asymptotic distribution results for the considered Diebold-Mariano test statistic in Eq. (11) assume that $P$ tends to infinity with $R$ fixed. In practice it is not always possible to have very large $P$, motivating us to also consider the finite-sample properties of the test for the more feasible situation $P = 1{,}000$.


In addition to the asymptotic tests we also performed bootstrap-based tests, which may achieve higher power in finite samples. In particular, we apply the stationary bootstrap methodology of Politis and Romano (1994) directly to the out-of-sample score differences. Given that it is likely that the out-of-sample score differences are serially dependent and the test statistic is based on these score differences, this bootstrap procedure seems appropriate. The probability of sampling the consecutive observation is based on the autocorrelation structure of the original out-of-sample score differences, following the methods detailed in Politis and White (2004) and including the correction of Patton et al. (2009). Finally, for hypothesis testing at level $\alpha$ the number of bootstraps $B$ needs to be chosen such that $\alpha(B+1)$ is an integer; for example, at $\alpha = 0.05$ let $B = 999$ rather than $1{,}000$. If instead $B = 1{,}000$, it is unclear for an upper one-sided alternative test whether the 50th or 51st largest bootstrap t-statistic is the critical value; see MacKinnon (2002). Throughout this thesis we set the number of bootstraps at $B = 999$, which seems sufficiently large and allows us to test at the standard significance levels such as $\alpha = 0.01$ and $\alpha = 0.05$.
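A compact sketch of the stationary bootstrap applied to score differences (our own illustration, assuming numpy; the expected block length $1/p$ is hard-coded here, whereas in practice it would come from the Politis-White plug-in rule mentioned above):

```python
import numpy as np

def stationary_bootstrap_means(d, B=999, p=0.1, seed=None):
    """B bootstrap means of d from circular blocks with geometric lengths (mean 1/p)."""
    rng = np.random.default_rng(seed)
    P = len(d)
    means = np.empty(B)
    for b in range(B):
        idx = np.empty(P, dtype=int)
        t = 0
        while t < P:
            start = rng.integers(P)                              # uniform block start
            length = min(rng.geometric(p), P - t)                # geometric block length
            idx[t:t + length] = (start + np.arange(length)) % P  # wrap around circularly
            t += length
        means[b] = d[idx].mean()
    return means

d = np.random.default_rng(8).standard_normal(500)                # toy score differences
boot = stationary_bootstrap_means(d, B=999, p=0.1, seed=9)
print("bootstrap std of the mean:", round(float(boot.std()), 4))
```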

The number of independent Monte Carlo simulations in each experiment is set to $M = 1{,}000$. We start the Monte Carlo study with a size experiment; the size discrepancies (if any) obtained there for the different tests will be used to obtain size-corrected power in the power study. This is a valid approach as the Diebold-Mariano type test statistic in Eq. (11) is asymptotically pivotal under Assumption 1; see Davidson and MacKinnon (1998) for further details. Furthermore, to improve comparability between experiments, we use the same random number generators for all the size and power experiments.

3.2 Size

In order to assess the size properties of the tests of equal predictive accuracy, a setting is required with two competing marginal specifications that are both 'equally (in)correct'. However, as argued by Diks et al. (2011), whether the null of equal predictive ability holds depends on the weight function $w_t(y)$ used in the scoring rules. For the threshold type weight function $w_t(y) = I(y_1 \le r, \ldots, y_d \le r)$, of particular interest for risk management purposes, it seems impossible to construct a setting with two different marginal specifications having equal predictive ability. Therefore we restrict ourselves to evaluating the size of the tests when focusing on the central part of the distribution. This is achieved by using the weight function $w_t(y) = I(-r \le y_1 \le r, \ldots, -r \le y_d \le r)$. The central part of the distribution may for example be of particular interest to monetary policymakers, aiming to keep inflation between certain bounds.


The marginal distributions of the innovations in the DGP are given by the standard Normal distribution in Eq. (20). We test the null hypothesis of equal predictive accuracy of two equally incorrectly specified Normal distributions for the marginals, with different means equal to $-0.2$ and $0.2$ and identical variance equal to one. That is, both competing marginal specifications are equally distant from the true marginal specification. Results are given for the bivariate case $d = 2$ and threshold $r = 0.5$ in the weight function. They are based on one-sided tests, where the alternative hypothesis is that the left-shifted marginal distribution has a higher predictive accuracy. For the sake of brevity we report only the results of the asymptotic tests; the results of the (stationary) bootstrap-based tests are very similar.

3.2.1 Results

The discrepancies between the actual and nominal size of tests based on both the censored likelihood and conditional likelihood scoring rules are given in Fig. 1, for nominal sizes up to 0.20. We find that overall the tests exhibit similar size properties, although the tests based on the censored likelihood scoring rule tend to over-reject slightly more than the tests based on conditional likelihood scores. Furthermore, the different panels show that the size distortions become smaller for a larger number of out-of-sample evaluations $P$. The small deviations from zero which remain are therefore most likely caused by some estimation uncertainty and so-called "experimental randomness", due to the relatively low number of simulation replications $M = 1{,}000$. Size-size plots, showing the discrepancy for all nominal size values, are given in Fig. 2 for $P = 1{,}000$. They show that the size discrepancies are sufficiently small for both tests and all nominal sizes considered.

3.2.2 Higher dimensions and other robustness checks

We verified the robustness of our results by considering various extensions of the DGP. First, we repeated the experiments using other parameter values for the Student-t copula, e.g. changing the degrees of freedom $\nu$ and the amount of correlation $\rho$ between the series. Finally, we repeated the experiments for higher dimensions, up to $d = 5$. Unreported results show sufficiently small size distortions for all robustness checks considered.


Figure 1: Size discrepancy plots for the tests of equal predictive accuracy. Panels: (a) P = 1,000; (b) P = 5,000; axes: nominal size vs. size discrepancy, for the censored and conditional asymptotic tests.

The panels display the actual size minus nominal size discrepancy of the asymptotic-based tests of equal predictive accuracy, when using the conditional likelihood and censored likelihood scoring rules. The DGPs are specified as in Eq. (18), where the marginal distributions of the innovations are given by the standard Normal distribution. The weight function used is given by $w_t(y) = I(-r \le y_1 \le r, \ldots, -r \le y_d \le r)$ where $r = 0.5$, so the focus is on the central part of the support. The tests compare the predictive accuracy of two equally incorrect specifications for the marginals, $N(-0.2, 1)$ and $N(0.2, 1)$. The moving in-sample estimation window size is fixed at $R = 1{,}000$, we evaluate $P = 1{,}000$ (top) and $P = 5{,}000$ (bottom) out-of-sample observations, and perform $1{,}000$ independent Monte Carlo simulations. The 95% point-wise confidence intervals are given by the thin solid lines.


Figure 2: Size-size plots for asymptotic-based tests, P = 1,000. Panels: (a) censored likelihood; (b) conditional likelihood; axes: nominal size vs. actual size.

The panels display the actual size as a function of the nominal size for the asymptotic-based tests of equal predictive accuracy, when using the censored likelihood (left) and conditional likelihood (right) scoring rules. The DGPs are specified as in Eq. (18), where the marginal distributions of the innovations are given by the standard Normal distribution. The weight function used is given by $w_t(y) = I(-r \le y_1 \le r, \ldots, -r \le y_d \le r)$ where $r = 0.5$, so the focus is on the central part of the support. Results are given for $d = 2$. The tests compare the predictive accuracy of two equally incorrect specifications for the marginals, $N(-0.2, 1)$ and $N(0.2, 1)$. The moving in-sample estimation window size is fixed at $R = 1{,}000$, we evaluate $P = 1{,}000$ out-of-sample observations and perform $1{,}000$ independent Monte Carlo simulations. The dotted line indicates the diagonal where the nominal size equals the actual size.


3.3 Power

The power of the tests is evaluated through a setting where for one of the competing specifications the marginals correspond with those of the DGP, while the distance of the alternative, incorrect marginal specification to the DGP varies depending on a certain parameter in the DGP. That is, the marginal distributions of the innovations in the DGP are given by the standardised Student-t distribution in Eq. (21) with degrees of freedom parameter $\nu_m$ varied over the interval $(4, 40]$. Hence, we assume a finite kurtosis for all DGPs. We compare the predictive accuracy of the correct standardised Student-t marginal specification (with $\nu_m$ estimated) against an incorrect standard Normal marginal specification (with no parameters estimated). Focus is on the left tail region of the support, which is of particular interest for risk management, using a threshold type weight function $w_t(y) = I(y_1 \le r, \ldots, y_d \le r)$ in the tests of equal predictive accuracy. Hence, for this DGP we are particularly interested in whether the proposed tests can distinguish between marginal distributions for the innovations with and without fat tails. Note that the standardised Student-t distribution approaches the standard Normal distribution as $\nu_m$ increases. Intuitively, the larger $\nu_m$ gets in the DGP, the more difficult it will become for the tests to distinguish between these two forecasting methods.

3.3.1 Results

The results are given in Fig. 3 in the form of size-corrected power plots, showing the size-corrected rejection rates as a function of the degrees of freedom parameter $\nu_m$ for the marginals in the DGP. The nominal size is 5% and results are given for the asymptotic and bootstrap-based tests of equal predictive accuracy, using threshold $r = -0.5$ in the weight functions. The null hypothesis is given by equal predictive accuracy of the Student-t and Normal marginal distributions, against the one-sided alternative hypothesis that the correctly specified Student-t marginals have higher predictive accuracy. Given that the true DGP uses the Student-t marginal distributions, we might expect them to perform better. Note, however, that as the number of degrees of freedom $\nu_m$ becomes larger in the DGP, the Normal marginals might outperform the Student-t marginals. This is a consequence of the fact that the Student-t marginals require one parameter to be estimated, whereas the more parsimonious Normal marginals require no estimation. This can indeed be observed in Fig. 3: for the asymptotic-based tests and $\nu_m = 40$ we find that the rejection rates are below the nominal size of 5%. Consequently, Fig. 3 shows that the tests have increasingly higher power for smaller values of $\nu_m$.


Figure 3: Power plots for the tests of equal predictive accuracy, for various levels of the number of degrees of freedom in the DGP. Panels: (a) P = 1,000; (b) P = 5,000; axes: degrees of freedom $\nu_m$ of the marginals in the DGP vs. size-corrected power, for the asymptotic and bootstrap censored and conditional tests.

The panels display the size-corrected rejection rates of a one-sided test of equal performance of Student-t and Normal marginal distributions for the innovations, against the alternative hypothesis that the correctly specified Student-t marginals have a higher average score. The DGPs are specified as in Eq. (18), where the marginal distributions of the innovations are given by the standardised Student-t($\nu_m$) distribution. The horizontal axis displays the degrees of freedom parameter $\nu_m$ of the standardised Student-t marginal, characterising the DGP. The weight function used is given by $w_t(y) = I(y_1 \le r, \ldots, y_d \le r)$ with $r = -0.5$, so the focus is on the left tail region. Results are given for $d = 2$. The tests use either censored or conditional scores. Solid lines are used for the asymptotic tests and dashed lines are used for the stationary bootstrap-based tests. The thin, dotted, horizontal lines indicate the nominal size of 5%. The moving in-sample estimation window size is fixed at $R = 1{,}000$, we evaluate $P = 1{,}000$ (top) and $P = 5{,}000$ (bottom) out-of-sample observations, and perform $1{,}000$ independent Monte Carlo simulations. The number of bootstrap replications is set to $B = 999$.

We also observe that the bootstrap-based tests (dashed lines) attain a higher finite-sample power than the asymptotic tests (solid lines). Naturally, the tests based on the larger number of out-of-sample evaluations $P$ show higher rejection rates. Finally, we see that the censored likelihood based test gives higher rejection rates for all considered numbers of degrees of freedom, where the difference with the conditional likelihood seems to become smaller as $P$ becomes larger.

Fig. 4 shows the observed (size-corrected) power as a function of the nominal size; the results are given for degrees of freedom parameter $\nu_m = 10$. This value of $\nu_m$ may often be observed in financial applications and implies an excess kurtosis of $k = 1$. It can easily be verified from the figures that all the tests of predictive accuracy exhibit nontrivial power for all sizes considered.

Figure 4: Size-power plots of the test of equal predictive accuracy for true Student-t marginals vs. Normal marginals. Panels: (a) P = 1,000; (b) P = 5,000; axes: nominal size vs. size-corrected power, for the asymptotic and bootstrap censored and conditional tests.

The panels display the size-corrected power as a function of the nominal size of a one-sided test of equal performance of Student-t and Normal marginal distributions for the innovations, against the alternative hypothesis that the correctly specified Student-t marginals have a higher average score. The DGPs are specified as in Eq. (18), where the marginal distributions of the innovations are given by the standardised Student-t distribution with $\nu_m = 10$. The weight function used is given by $w_t(y) = I(y_1 \le r, \ldots, y_d \le r)$ with $r = -0.5$, so the focus is on the left tail region. Results are given for $d = 2$. The tests use either censored or conditional scores. Solid lines are used for the asymptotic tests and dashed lines are used for the stationary bootstrap-based tests. The moving in-sample estimation window size is fixed at $R = 1{,}000$, we evaluate $P = 1{,}000$ (top) and $P = 5{,}000$ (bottom) out-of-sample observations, and perform $1{,}000$ independent Monte Carlo simulations. The number of bootstrap replications is set to $B = 999$.


Fig. 5 shows how the observed rejection rates depend on the considered region of interest. We vary the number of out-of-sample evaluations $P$ to maintain approximately the same number of observations $c = 50$ in the different lower tail regions considered in the tests. This is done for a better comparison of the tests based on the different regions. Results are for the bivariate case $d = 2$, degrees of freedom parameter set at $\nu_m = 10$ and nominal size 5%. They show that the considered tests have increasingly higher power as we move further into the joint left tail. Again, tests based on the censored likelihood show higher rejection rates and the stationary bootstrap-based tests seem to provide a finite-sample improvement.

Figure 5: Power plot for different regions, true Student-t marginals vs. Normal marginals. Axes: threshold r vs. size-corrected power, for the asymptotic and bootstrap censored and conditional tests.

The figure displays the size-corrected power as a function of the threshold $r$ used to determine the upper bound of the lower tail region. Results are based on a one-sided test of equal performance of Student-t and Normal marginal distributions for the innovations, against the alternative hypothesis that the correctly specified Student-t marginals have a higher average score. The DGPs are specified as in Eq. (18), where the marginal distributions of the innovations are given by the standardised Student-t distribution with $\nu_m = 10$. The weight function used is given by $w_t(y) = I(y_1 \le r, \ldots, y_d \le r)$. Results are given for $d = 2$. The tests use either censored or conditional scores. Solid lines are used for the asymptotic tests and dashed lines are used for the stationary bootstrap-based tests. The thin, dotted, horizontal lines indicate the nominal size of 5%. The moving in-sample estimation window size is fixed at $R = 1{,}000$ and the number $P$ of out-of-sample evaluations is varied to maintain approximately $c = 50$ observations in the targeted region. The number of bootstrap replications is set to $B = 999$ and results are based on $1{,}000$ independent Monte Carlo simulations.

We then considered a setup for the DGP which focuses on the question whether the tests of predictive accuracy can distinguish between marginal distributions for the innovations with and without skewness. This was achieved using a setup where the marginal distributions in the DGP correspond to the Skew-t distribution given in Eq. (22). Recall that the parameter λ of the Skew-t distribution controls the skewness of the distribution. Focus here is on λ ≤ 0, which implies that the probability mass of the distribution concentrates in the left tail, as often observed in practice for financial returns.



Figure 6: Power plot for the tests of equal predictive accuracy, for various levels of skewness in the DGP

[Two panels, (a) P = 1,000 and (b) P = 5,000: power (vertical axis) plotted against the skewness parameter λ of the marginals in the DGP (horizontal axis), for the asymptotic and bootstrap versions of the censored- and conditional-likelihood tests.]

The panels display the rejection rates of a one-sided test of equal performance of Skew-t and Student-t marginal distributions for the innovations, against the alternative hypothesis that the correctly specified Skew-t marginals have a higher average score. The DGPs are specified as in Eq. (18), where the marginal distributions of the innovations are given by the Skew-t distribution with degrees of freedom fixed at γ = 10. The horizontal axis displays the skewness parameter λ of the Skew-t marginals, characterising the DGP. The weight function used is given by wt(y) = I(y1 ≤ r, . . . , yd ≤ r) with r = −0.5, so focus is on the left tail region. Results are given for d = 2. The tests use either censored or conditional scores. Solid lines are used for the asymptotic tests and dashed lines are used for the stationary bootstrap-based tests. The nominal size is 5%, indicated by the dotted horizontal lines. The moving in-sample estimation window size is fixed at R = 1,000, we evaluate P = 1,000 (panel a) and P = 5,000 (panel b) out-of-sample observations, and perform 1,000 independent Monte Carlo simulations. The number of bootstrap replications is set to B = 999.

The null hypothesis of the test is given by equal predictive accuracy of Skew-t and Student-t specifications for the marginals, against the one-sided alternative hypothesis that the correctly specified Skew-t marginals have higher predictive accuracy. Focus is again on the lower tail region, with threshold r = −0.5 used in the weight functions, and we set the degrees-of-freedom parameter of the Skew-t distribution in the DGP at γ = 10.
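Eq. (22) is not reproduced here. As a point of reference, the sketch below assumes Hansen's (1994) standardised Skew-t, a common parameterisation in which λ ∈ (−1, 1) governs the asymmetry and λ = 0 recovers the standardised Student-t; the exact form used in Eq. (22) may differ in details.

```python
import numpy as np
from scipy.special import gammaln

def skew_t_pdf(z, nu, lam):
    """Density of Hansen's (1994) standardised Skew-t with degrees of
    freedom nu > 2 and skewness lam in (-1, 1); lam < 0 shifts mass
    into the left tail, lam = 0 gives the standardised Student-t."""
    c = np.exp(gammaln((nu + 1) / 2) - gammaln(nu / 2)) / np.sqrt(np.pi * (nu - 2))
    a = 4 * lam * c * (nu - 2) / (nu - 1)
    b = np.sqrt(1 + 3 * lam**2 - a**2)
    scale = np.where(z < -a / b, 1 - lam, 1 + lam)   # left / right branch
    kernel = 1 + ((b * z + a) / scale) ** 2 / (nu - 2)
    return b * c * kernel ** (-(nu + 1) / 2)

# lam = -0.3 and nu = 10 (gamma in the text): mass concentrates in the left tail
print(skew_t_pdf(np.linspace(-4, 4, 5), nu=10, lam=-0.3))
```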



Results are given in Fig. 6 and show the rejection rate as a function of the skewness parameter λ. We find increasingly higher power for smaller values of λ, while for λ close to 0 we see rejection rates below the nominal size of 5%. This is due to the fact that for λ = 0 the Skew-t distribution in the DGP reduces exactly to the Student-t distribution. In that case, the Student-t marginals outperform the Skew-t marginals as they require one parameter fewer to be estimated.
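The rejection rates above come from a Diebold-Mariano type statistic: the average score differential divided by a HAC estimate of its standard error. A minimal sketch follows; the Bartlett-kernel bandwidth rule below is an illustrative assumption, not necessarily the choice behind the reported results.

```python
import numpy as np
from scipy.stats import norm

def dm_test(score_f, score_g, lags=None):
    """One-sided Diebold-Mariano type test on the score differentials
    d_t = S(f; y_t) - S(g; y_t), with a Newey-West (Bartlett) HAC
    variance estimate; positive statistics favour forecast f."""
    d = np.asarray(score_f, dtype=float) - np.asarray(score_g, dtype=float)
    P = d.size
    if lags is None:
        lags = int(np.floor(P ** (1 / 3)))          # illustrative bandwidth rule
    dc = d - d.mean()
    hac = dc @ dc / P                               # lag-0 autocovariance
    for k in range(1, lags + 1):
        gamma_k = dc[k:] @ dc[:-k] / P
        hac += 2 * (1 - k / (lags + 1)) * gamma_k   # Bartlett weights
    t_stat = np.sqrt(P) * d.mean() / np.sqrt(hac)
    return t_stat, 1 - norm.cdf(t_stat)             # statistic, one-sided p-value
```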

Fig. 7 shows, for this setup, how the rejection rates depend on the region of interest. As before, we focus on the left tail region using a threshold-type weight function. The number of out-of-sample observations in the regions considered is again set to approximately c = 50, and we fix the parameters of the Skew-t marginals in the DGP at λ = −0.3 and γ = 10. Results now show that the rejection rates stay roughly the same for the different lower tail regions evaluated. Clearly, the difference between the predictive accuracy of the true Skew-t marginals and the Student-t marginals stays approximately the same as we move further into the tails of the distributions.

Figure 7: Power plot for different regions, true Skew-t marginals vs. Student-t marginals

[Single panel: size-corrected power (vertical axis) plotted against the threshold r (horizontal axis), for the asymptotic and bootstrap versions of the censored- and conditional-likelihood tests.]

The figure displays the size-corrected power as a function of the threshold r used to determine the upper bound on the lower tail region. Results are based on a one-sided test of equal performance of Skew-t and Student-t marginal distributions for the innovations, against the alternative hypothesis that the correctly specified Skew-t marginals have a higher average score. The DGPs are specified as in Eq. (18), where the marginal distributions of the innovations are given by the Skew-t distribution with γ = 10 and λ = −0.3. The weight function used is given by wt(y) = I(y1 ≤ r, . . . , yd ≤ r). Results are given for d = 2. The tests use either censored or conditional scores. Solid lines are used for the asymptotic tests and dashed lines are used for the stationary bootstrap-based tests. The thin, dotted, horizontal lines indicate the nominal size of 5%. The moving in-sample estimation window size is fixed at R = 1,000 and the number P of out-of-sample evaluations is varied to maintain approximately c = 50 observations in the targeted region. The number of bootstrap replications is set to B = 999 and results are based on 1,000 independent Monte Carlo simulations.



3.3.2 Time-varying weight functions

The preceding implies that an appropriate choice of the weight function wt depends on the interest of the forecast user. Furthermore, given a certain weight function, it is up to the forecast user to set the parameter(s) in the weight function, such as the threshold r in wt(y) = I(y1 ≤ r, . . . , yd ≤ r). In practice r could be estimated from historical data and might be set equal to a particular quantile of the R observations in the moving window that is used for constructing the density forecast at time t. This makes the weight function time-varying, i.e. wt(y) = I(y1 ≤ r1,t, . . . , yd ≤ rd,t), while it also involves estimation uncertainty in the thresholds ri,t, for i = 1, . . . , d. As argued by Diks et al. (2011), as long as the weight function wt is conditionally (given Ft) independent of Yt+1, the properness property of the conditional and censored likelihood scoring rules is not affected. However, non-vanishing estimation uncertainty in the thresholds may affect the power of the test of equal predictive accuracy. Diks et al. (2011) verify the power properties for the univariate case; we now extend their analysis to multivariate copula-based density forecasts when focusing on the use of different predictive marginals. A minimal sketch of such a time-varying weight is given below.
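The sketch makes the construction concrete: per series, the threshold is the empirical α-quantile of the R most recent observations, re-computed each period. Function names are illustrative.

```python
import numpy as np

def time_varying_weight(y_next, window, alpha):
    """Indicator weight w_t(y) = I(y_1 <= r^a_{1,t}, ..., y_d <= r^a_{d,t}),
    where r^a_{i,t} is the empirical alpha-quantile of the last R
    observations of series i (window has shape (R, d))."""
    thresholds = np.quantile(window, alpha, axis=0)   # one threshold per series
    return float(np.all(y_next <= thresholds))

# Example: R = 1,000, d = 2, alpha = 0.10
rng = np.random.default_rng(0)
window = rng.standard_normal((1000, 2))
print(time_varying_weight(np.array([-1.5, -1.4]), window, alpha=0.10))
```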

Results are given in Fig. 8 for both tests of equal predictive accuracy, using different empirical α-quantiles to obtain the thresholds rαi,t. We let rαi,t equal the empirical α-quantile of yi,t−R+1, . . . , yi,t, for i = 1, . . . , d. The time-varying weight function used in this experiment is given by wt(y) = I(y1 ≤ rα1,t, . . . , yd ≤ rαd,t). The setup is similar to that of the first power experiment, given by the null of equal predictive accuracy of the true standardised Student-t marginals against the standard Normal marginals. Similar to the previous experiments, the number of out-of-sample evaluations P is varied to maintain approximately the same small number of observations c = 50 in the different regions considered. The results show that the predictive accuracy tests have good power properties when used in combination with time-varying weight functions.

3.3.3 Higher dimensions and other robustness checks

We verified the robustness of the power results by considering various extensions of the DGPs. First, we used other parameter values for the Student-t copula, changing the degrees-of-freedom parameter ν and the amount of correlation ρ between the series. Second, we considered the same settings in higher dimensions, up to d = 5.

In summary, the tests of predictive accuracy in selected regions of the support have satisfactory statistical power when focusing on the use of different predictive marginals. The improved finite-sample performance of the bootstrap-based test is clearly visible, and the test based on the censored likelihood scores performs better than the test based on the conditional likelihood scores.
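For reference, the bootstrap variant of the test referred to throughout resamples the score differentials with the stationary bootstrap of Politis and Romano (1994). A minimal sketch follows; the mean block length is an illustrative assumption, not necessarily the tuning used for the reported results.

```python
import numpy as np

def stationary_bootstrap_means(d, B=999, mean_block=10, seed=0):
    """Politis-Romano stationary bootstrap applied to the score
    differentials d_t: blocks have geometric lengths with mean
    `mean_block`, wrapping around circularly; returns B bootstrap
    means of the differential series."""
    rng = np.random.default_rng(seed)
    P = d.size
    q = 1.0 / mean_block                      # prob. of starting a new block
    means = np.empty(B)
    for b in range(B):
        idx = np.empty(P, dtype=int)
        idx[0] = rng.integers(P)
        for t in range(1, P):
            if rng.random() < q:
                idx[t] = rng.integers(P)      # start a new block
            else:
                idx[t] = (idx[t - 1] + 1) % P # continue the current block
        means[b] = d[idx].mean()
    return means
```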

Figure 8: Power plot for different regions using a time-varying weight function. True Student-t vs. Normal

[Single panel: size-corrected power (vertical axis) plotted against the α-quantile threshold (horizontal axis), for the asymptotic and bootstrap versions of the censored- and conditional-likelihood tests.]

The figure displays the size-corrected power as a function of the α-quantile threshold used to determine the (time-varying) upper bounds on the lower tail region. Results are based on a one-sided test of equal predictive accuracy of Student-t and Normal marginal distributions used for the innovations, against the alternative hypothesis that the correctly specified Student-t marginals have a higher average score. The DGPs are specified as in Eq. (18), where the marginal distributions of the innovations are given by the standardised Student-t distribution with νm = 10. The weight function used is given by wt(y) = I(y1 ≤ rα1,t, . . . , yd ≤ rαd,t). Results are given for d = 2. The tests use either censored or conditional scores. Solid lines are used for the asymptotic tests and dashed lines are used for the stationary bootstrap-based tests. The thin, dotted, horizontal lines indicate the nominal size of 5%. The moving in-sample estimation window size is fixed at R = 1,000 and the number P of out-of-sample evaluations is varied to maintain approximately c = 50 observations in the targeted region. The number of bootstrap replications is set to B = 999 and results are based on 1,000 independent Monte Carlo simulations.

