
Optimal portfolio decisions under conditional higher moments and time-varying correlations


Academic year: 2021


S.J.J. Vonk

Student number: 5742633

Date of final version: April 26th, 2015

Master's programme: Econometrics

Specialisation: Financial Econometrics

Supervisor: dr. N.P.A. van Giersbergen

Second Reader: prof. dr. H.P. Boswijk

Abstract

The widely used traditional Mean-Variance analysis for portfolio optimization assumes normality and constant correlations between assets. Recent evidence shows that individual asset return distributions contain skewness and excess kurtosis and that correlations are time-varying. This thesis incorporates the conditional third and fourth moments of asset return distributions and time-varying correlations between those assets in one model. Marginal distributions, specified for each asset, are linked together by a copula function. A constant relative risk averse (CRRA) investor faces an asset allocation problem with three stock indices: a high-risk/high-return asset, a medium-risk/medium-return asset and a low-risk/low-return asset. To prohibit the investor from taking 'extreme positions' in the market, he or she is only allowed to take long positions. Finally, the importance of incorporating skewness, excess kurtosis and time-varying correlations is examined in an out-of-sample setting. The bootstrap methods of Patton (2004) are used to analyze statistical significance. Evidence is found that using advanced models with conditional parameters results in better portfolio performance.

Contents

1 Introduction

2 Theoretical Background & Methods

2.1 Markowitz

2.2 GARCH-class models

2.3 Modelling skewness and kurtosis

2.4 Copulas

2.5 Contribution

3 Data

3.1 Qualitative description of the data

3.2 Quantitative description of the data

4 Models & Estimation

4.1 Estimation of the conditional mean and conditional variance

4.2 Estimation of the conditional skewness and conditional kurtosis

4.3 Specification tests

4.4 One-step-ahead forecasting in the out-of-sample period

4.5 Copulas & optimization

4.6 Portfolio optimization procedure

5 Results and model comparison

5.1 Performance of different strategies

5.2 Statistical significance

6 Conclusion

References

1 Introduction

In the aftermath of the financial crisis that struck the world in 2008, the risk management, risk exposure and asset management of major financial institutions became subjects of debate. The common view was that many banks, insurance companies and other financial institutions had exposed themselves to an unacceptable amount of risk, with worldwide suffering as a result. Since then, risk regulations have become increasingly strict. Large companies as well as financial institutions benefit from proper risk management, and the ability to monitor, allocate and hedge risks serves the stability of the world economy as a whole. Proper techniques of risk measurement and asset allocation are therefore of great importance to many parties, including pension funds, insurance companies and banks. Even private investors benefit from the ability to properly analyse risk in different ways. For example, the self-employed have to take care of their own retirement plans, which must be done sensibly. This research is performed in order to improve financial decision-making by modelling financial time series more adequately.

With the evolution of econometric techniques it becomes increasingly possible to capture more complicated market specifics. For almost every equity-based asset clustered volatility is observed. Conditional heteroskedasticity can be modelled by a variety of models and extensions from the GARCH class, first introduced by Engle (1982) and shortly thereafter generalized by Bollerslev (1986). It is also observed that for certain assets the return distributions are skewed and exhibit excess kurtosis (Jondeau and Rockinger, 2003; Bauwens and Laurent, 2005; Peiro, 1999), which cannot be captured by a normal distribution. Hansen (1994) proposed an extension of the Student's t-distribution to capture the third and the fourth moment. This skewed Student's t-distribution allows for parametrization of the skewness and the degrees-of-freedom parameter. Skewness and excess kurtosis can thus be modelled to be time-varying.

Building appropriate models to capture all relevant characteristics of individual assets is one thing, but determining an appropriate correlation structure is another. Correlations are typically time-varying as well (Andersen et al., 2007; Tse and Tsui, 2002; Engle, 2002), and tend to be higher in periods of economic turmoil than in periods of economic stability. Engle (2002) proposed a Dynamic Conditional Correlation model, a form of multivariate GARCH model. This model allows for conditional correlations, but is limited in the specification of a conditional third and fourth moment. In order to resolve this problem a copula-based model is used in this research. For an extensive overview of the use of copulas in finance see Cherubini et al. (2004). A copula, first introduced by Sklar (1959), is a multivariate distribution function whose marginal distributions all follow a standard uniform distribution. A copula thus links differently specified marginal distributions together to form a joint distribution. The combination of a very specific dependence structure and complete freedom in constructing the marginal distributions allows for very specific model-building.

This research will draw heavily on Patton (2004). Patton performed an analysis on a two-asset market. He examined whether the difference in performance between complex copula models and simpler models is statistically significant, using an out-of-sample setting. The significance of the differences in portfolio performance is examined using bootstrap techniques. This research will contribute to the existing literature by extending past research to a market with not two, but three assets. This expansion is non-trivial since a correlation structure is more complicated in a three-asset market than in a two-asset market.

This thesis is organized as follows. Firstly, models for the conditional mean and conditional variance will be constructed, using the in-sample period of the dataset only. Lags of five exogenous variables - dividends, the risk-free rate, excess return on the market, the spread in returns between value and growth stocks and the spread in returns between large and small firms, as described by Fama (1981) - as well as lags of returns of all three assets are allowed to enter the mean equation. For the conditional variance a GJR-GARCH specification will be employed, as in Patton (2004). Lags of the exogenous variables are also allowed to enter the variance equation. Information criteria will be used to find the best model specifications for the mean and variance. Model specification will also be checked by looking for serial autocorrelation in the residuals. During the out-of-sample period these models will be recursively re-estimated, but the specification is non-adaptive. Once a conditional mean and conditional variance have been specified for all three assets, the skewed t-distribution with parametrizations for the skewness and degrees-of-freedom parameter will be estimated. After the construction of the marginal models is completed, different copula models will be estimated. The dependence parameters are modelled to be conditional on information of the recent past. Skewness and kurtosis are modelled in the univariate time series, so the correlation structure is estimated using the residuals of this model.

Once the different models are specified completely, different strategies for different types of investors will be examined. Five different levels of relative risk aversion are examined (RRA = 1, 3, 7, 10, 20). The strategies that will be examined are (1) always hold the large cap; (2) always hold the mid cap; (3) always hold the small cap; (4) always hold an equally weighted mix of the three assets; (5) optimize portfolio weights using an unconditional trivariate normal distribution; (6) repeatedly re-optimize portfolio weights using the conditional trivariate normal distribution; (7) repeatedly re-optimize portfolio weights using the conditional Gaussian copula; (8) repeatedly re-optimize portfolio weights using the conditional t-copula. Strategies (9) to (12) are like (5) to (8), but with investors allowed to take a small short position.

The performance of the different strategies will then be tested for economic and statistical significance. In order to do so, bootstrap tests of pairwise comparisons will be deployed. Because of the assumed presence of weak serial autocorrelation in the different portfolio return series, standard bootstrap methods fail. Therefore the stationary bootstrap of Politis and Romano (1994) will be deployed.
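The resampling idea behind the stationary bootstrap can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the mean block length of 10 and the use of a simple mean return difference as the compared statistic are hypothetical choices, not the performance measure or settings used in this thesis or in Patton (2004).

```python
import random


def stationary_bootstrap_indices(n, mean_block_length, rng):
    """Draw n resampling indices using blocks of geometrically distributed length."""
    p = 1.0 / mean_block_length  # probability of starting a new block
    indices = [rng.randrange(n)]
    while len(indices) < n:
        if rng.random() < p:
            # Start a new block at a uniformly chosen position.
            indices.append(rng.randrange(n))
        else:
            # Continue the current block, wrapping around at the end of the sample.
            indices.append((indices[-1] + 1) % n)
    return indices


def bootstrap_mean_difference(returns_a, returns_b, mean_block_length=10,
                              draws=1000, seed=0):
    """Bootstrap distribution of the mean difference between two return series."""
    rng = random.Random(seed)
    n = len(returns_a)
    differences = [a - b for a, b in zip(returns_a, returns_b)]
    means = []
    for _ in range(draws):
        idx = stationary_bootstrap_indices(n, mean_block_length, rng)
        means.append(sum(differences[i] for i in idx) / n)
    return means
```

Because consecutive observations are kept together within a block, weak serial dependence in the return differences is preserved in each bootstrap sample, which is the reason an i.i.d. bootstrap is not used here.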

This paper will proceed as follows: Section 2 provides a theoretical background on portfolio allocation theory, copulas and other relevant subjects. It also makes clear what the contribution of this research to the existing literature is. In section 3 the data that are used are described. Descriptive statistics are given as well. Section 4 deals with all the models that are used and the estimation results. In section 5 the statistical and economic significance of the results will be examined, before conclusions are drawn in section 6.

2 Theoretical Background & Methods

In this section an overview of existing literature will be given. During the past decades, many papers were published on the topics of portfolio management, optimal asset allocation, incorporating higher moments in the modelling of return distributions and (time-varying) correlation structures.

2.1 Markowitz

The pioneer of modern portfolio theory is Harry Markowitz (1952). In his paper "Portfolio selection" he addressed the issue of expectations of future performance of securities and the associated portfolio choices. He emphasized that an investor should consider high return a desirable thing and variance an undesirable thing. The paradigm Markowitz provided in his early work is about optimizing portfolio selection using the first two moments of the available assets. In this early stage of investment theory means and (co)variances were assumed not to vary over time. Using Markowitz' paradigm, an investor minimizes the variance of her or his portfolio given a desired level of excess return. Assuming all excess returns and covariances to be known, a closed-form solution for the problem can be obtained. The still widely used (and taught) mean-variance frontier, as displayed in Figure 1, shows the minimal portfolio variance for every level of excess return. It can easily be seen that a higher expected return comes at the cost of higher risk. Since Markowitz' model is a model suited for one period, means and covariances are assumed to be constant over time. It would take decades for researchers to start questioning the assumptions Markowitz made.

Figure 1: Mean-variance with and without the risk-free asset. Source: Figure 5.1 in Brandt (2009).
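The closed-form solution mentioned in the Markowitz discussion can be illustrated for the case with a risk-free asset: minimizing w'Σw subject to w'μ = m gives w = m Σ⁻¹μ / (μ'Σ⁻¹μ), with the remainder of wealth held in the risk-free asset. The sketch below uses only the Python standard library; the three excess returns and the covariance matrix are made-up numbers for illustration, and the long-only restriction imposed elsewhere in this thesis is deliberately ignored.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def min_variance_weights(mu, sigma, target):
    """Weights in the risky assets that minimize variance for a target excess return."""
    z = solve_linear(sigma, mu)  # z = Sigma^{-1} mu
    scale = target / sum(m * zi for m, zi in zip(mu, z))
    return [scale * zi for zi in z]


def portfolio_variance(weights, sigma):
    n = len(weights)
    return sum(weights[i] * sigma[i][j] * weights[j]
               for i in range(n) for j in range(n))


# Hypothetical annual excess returns and covariance matrix for three assets.
mu = [0.06, 0.08, 0.10]
sigma = [[0.04, 0.03, 0.02],
         [0.03, 0.06, 0.04],
         [0.02, 0.04, 0.09]]
weights = min_variance_weights(mu, sigma, target=0.05)
```

Since the weights scale linearly with the target, the minimal variance grows with the square of the target excess return, which is what produces the straight-line frontier in mean/standard-deviation space when a risk-free asset is available.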

2.2 GARCH-class models

In 1982, Robert F. Engle was the first to address the problem of conditional heteroskedasticity. From the data it was clear that variances were varying over time, and that so-called clustered volatility was present in almost every time series. To loosen the assumption of a constant one-period-ahead forecast variance he introduced a new class of stochastic processes called Autoregressive Conditional Heteroskedastic (ARCH) processes. These processes have nonconstant variances conditional on the past, but constant unconditional variances. For such processes, information on today's volatility provides information for the one-period-ahead forecast variance. The work of Engle (1982) has been cited over 17 thousand times during the past thirty years, and is accepted throughout the world of finance, statistics and econometrics. The basic form of his model, an ARCH(1) model, is displayed in equation (1) (Heij et al., 2004). The term $\varepsilon_{t-1}^2$ is the square of the first lag of the observed innovations. Throughout the years many extensions of the ARCH framework of Engle have been proposed.

\[
\begin{aligned}
y_t &= \mu + \varepsilon_t \\
\varepsilon_t \mid Y_{t-1} &\sim N\left(0, \sigma_t^2\right) \\
\sigma_t^2 &= \alpha_0 + \alpha_1 \varepsilon_{t-1}^2
\end{aligned}
\tag{1}
\]

Bollerslev (1986) was the first to add another term to the ARCH structure of Engle to create a Generalized ARCH (GARCH) model. Where Engle's model only uses lagged squared shocks to predict future volatility, Bollerslev added lags of past predictions of the variance to the variance equation, so that the squared innovations follow an ARMA process. A simple example of a GARCH model, a GARCH(1,1) model, is displayed in equation (2). The term $\sigma_{t-1}^2$ is the first lag of the predicted variances.

\[
\begin{aligned}
y_t &= \mu + \varepsilon_t \\
\varepsilon_t \mid Y_{t-1} &\sim N\left(0, \sigma_t^2\right) \\
\sigma_t^2 &= \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \alpha_2 \sigma_{t-1}^2
\end{aligned}
\tag{2}
\]

Glosten et al. (1993) designed another extension of the GARCH-class models: the Glosten-Jagannathan-Runkle (GJR-GARCH) model. This model allows for asymmetry in the variance process. It adds a dummy term for negative observed shocks to allow for different regimes in different situations. The basic form of the GJR-GARCH model is displayed in equations (3) and (4). The $\alpha_i$-parameters measure the impact of a positive observed shock on the predicted volatility, while the $\beta_i$-parameters measure the additional impact of a negative observed shock on the predicted volatility. As in Patton (2004), all conditional variances are modelled to follow a GJR-GARCH specification. Exogenous predictors (excess return on the market, spread in return between value and growth stocks, spread in return between small- and large-sized firms and the risk-free rate) are allowed to enter the variance equation. The model specification given here will be used throughout this research.

\[
\begin{aligned}
\varepsilon_t &= \sigma_t Z_t, \qquad Z_t \sim \text{i.i.d.}(0, 1) \\
\sigma_t^2 &= \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{i=1}^{q} \beta_i \varepsilon_{t-i}^2 I_{t-i} + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2
\end{aligned}
\tag{3}
\]

\[
I_{t-i} =
\begin{cases}
1 & \text{if } \varepsilon_{t-i} < 0 \\
0 & \text{if } \varepsilon_{t-i} \geq 0
\end{cases}
\tag{4}
\]

Over the years numerous extensions of the GARCH framework pioneered by Engle and Bollerslev have been proposed, but it is beyond the scope of this research to address them all. An extensive overview of all related models can be found in Bollerslev (2008).
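As a minimal sketch of the variance recursion in equations (3) and (4) with p = q = 1, the following Python function filters a series of observed shocks into conditional variances. The argument names and the starting variance are illustrative assumptions; in particular the GARCH coefficient is called `garch1` here only to keep it apart from the asymmetry coefficient `beta1`. Setting `beta1 = 0` gives the GARCH(1,1) model of equation (2), and additionally setting `garch1 = 0` gives the ARCH(1) model of equation (1).

```python
def gjr_garch_variances(shocks, alpha0, alpha1, beta1, garch1, initial_variance):
    """Conditional variances of a GJR-GARCH(1,1) process.

    variances[0] is the supplied starting value; variances[t + 1] is the
    one-step-ahead variance implied by shocks[t] and variances[t].
    """
    variances = [initial_variance]
    for shock in shocks:
        negative = 1.0 if shock < 0 else 0.0  # indicator I_{t-1} of equation (4)
        variances.append(
            alpha0
            + alpha1 * shock ** 2
            + beta1 * shock ** 2 * negative  # extra impact of negative shocks
            + garch1 * variances[-1]
        )
    return variances
```

For example, with `alpha0 = 0.1`, `alpha1 = 0.1`, `beta1 = 0.2`, `garch1 = 0.5` and a starting variance of 1, a shock of +1 leads to a variance of 0.7, whereas a shock of -1 from the same starting point leads to 0.9.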

2.3 Modelling skewness and kurtosis

In 1999, Peiro was among the first to publish a paper on skewness in financial returns. He concludes that under a variety of non-normal distributions symmetry of returns cannot be rejected for most markets. Although they are not significant, he observes some differences between returns below the mean and returns above the mean. Jondeau and Rockinger (2003) investigated the existence and persistence of conditional skewness and kurtosis of various financial series taken at the daily frequency. In order to do so they used a model with a GARCH specification where innovations are modelled to follow a skewed Student's t-distribution, as proposed by Hansen (1994). They used an autoregressive structure for the specification of the skewness parameter and the degrees-of-freedom parameter. They conclude that this structure yields great numerical instability in the estimation. In this research, the skewness parameter and degrees-of-freedom parameter depend solely on lags of the exogenous variables and the forecast mean and variance of the three assets. Jondeau and Rockinger (2003) also conclude that large events of a given sign tend to happen simultaneously. This indicates that a proper way of modelling the dependence structure between assets is of great value. This research will contribute to the existing literature by deploying copulas to model the dependence structure of three assets, as will be discussed later.

In 2004, Jondeau and Rockinger proposed an extension of the traditional mean-variance criterion. They used a Taylor series expansion of expected utility to be able to compute the optimal portfolio allocation numerically. A main advantage of this approach is that it remains operational even when a large number of assets is involved. A main disadvantage is that - like traditional mean-variance optimization - parameters are not rendered conditional and are thus not allowed to vary over time. Jondeau and Rockinger (2003) had shown that the higher-moment parameters, especially the skewness parameter, are in fact time-varying. Given that means and variances are time-varying as well, the model of Jondeau and Rockinger (2004) is too constrained. They recommend modelling conditional moments at the end of their paper.

In 2006, Jondeau and Rockinger follow their own recommendation and propose a Copula-GARCH specification. In their paper they show that Hansen's skewed t-distribution provides a good fit for stock returns. Volatility, skewness and kurtosis are allowed to vary over time. Two different copula functions (Gaussian and t) are used to model the dependence structure. They also describe how the dependency parameter can be rendered conditional. The symmetric copulas used by Jondeau and Rockinger (2006) fail to describe asymmetries in the dependence structure. In this research asymmetric copulas - which allow for greater dependence in periods of economic turmoil - will not be used. Furthermore, Jondeau and Rockinger (2006) do not show whether complicated models outperform simpler ones. Patton (2004) provides a method to do so. He uses bootstrap techniques (Politis and Romano, 1994) to examine whether differences between the performance of the complex copula model and a simpler model are statistically significant. However, he only uses two assets, which means that there is only one conditional dependency parameter to estimate. This research will deal with three assets instead, which renders the correlation structure more complex.

As stated before, Hansen (1994) proposed a skewed t-distribution to capture the third and fourth moment of the asset return distribution. The pdf of this generalization of the t-distribution is displayed in equation (5), where λ is the skewness parameter and ν is the degrees-of-freedom parameter. The distribution is normalized to have zero mean and unit variance. Details on this probability density function and a proof that it is in fact a valid density function can be found in the appendix of Hansen (1994). Hansen's skewed t-distribution has been used extensively.

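\[
g(z \mid \lambda, \nu) =
\begin{cases}
bc \left( 1 + \dfrac{1}{\nu - 2} \left( \dfrac{bz + a}{1 - \lambda} \right)^{2} \right)^{-\frac{\nu + 1}{2}} & z < -\dfrac{a}{b} \\[2ex]
bc \left( 1 + \dfrac{1}{\nu - 2} \left( \dfrac{bz + a}{1 + \lambda} \right)^{2} \right)^{-\frac{\nu + 1}{2}} & z \geq -\dfrac{a}{b}
\end{cases}
\tag{5}
\]

\[
a = 4 \lambda c \left( \frac{\nu - 2}{\nu - 1} \right), \qquad
b^2 = 1 + 3\lambda^2 - a^2, \qquad
c = \frac{\Gamma\left( \frac{\nu + 1}{2} \right)}{\sqrt{\pi (\nu - 2)}\, \Gamma\left( \frac{\nu}{2} \right)}
\]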
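The density of Hansen's skewed t-distribution can be written down directly from its definition. The Python sketch below uses the constants a, b and c as given in Hansen (1994); the function names, the crude trapezoid integrator and the parameter values λ = -0.3 and ν = 8 are illustrative assumptions. Integrating numerically gives a quick check that the density integrates to one and has zero mean and unit variance.

```python
import math


def skewed_t_constants(lam, nu):
    """Constants a, b, c of Hansen's (1994) skewed t density."""
    c = math.exp(math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2))
    c /= math.sqrt(math.pi * (nu - 2))
    a = 4 * lam * c * (nu - 2) / (nu - 1)
    b = math.sqrt(1 + 3 * lam ** 2 - a ** 2)
    return a, b, c


def skewed_t_pdf(z, lam, nu):
    """Density g(z | lambda, nu), requiring -1 < lam < 1 and nu > 2."""
    a, b, c = skewed_t_constants(lam, nu)
    scale = 1 - lam if z < -a / b else 1 + lam
    core = 1 + ((b * z + a) / scale) ** 2 / (nu - 2)
    return b * c * core ** (-(nu + 1) / 2)


def integrate(func, lower, upper, steps=40_000):
    """Composite trapezoid rule, accurate enough for a sanity check."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * h)
    return total * h


lam, nu = -0.3, 8.0  # hypothetical parameter values
mass = integrate(lambda z: skewed_t_pdf(z, lam, nu), -40, 40)
mean = integrate(lambda z: z * skewed_t_pdf(z, lam, nu), -40, 40)
second_moment = integrate(lambda z: z * z * skewed_t_pdf(z, lam, nu), -40, 40)
```

With λ = 0 the constant a is zero and b is one, so the density reduces to a Student's t density standardized to unit variance.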

2.4 Copulas

A copula is a multivariate probability distribution for which the marginals all follow a standard uniform distribution. It links together at least two marginal distributions to form a joint distribution. The use of copulas allows for all kinds of marginal specifications, hence allowing great flexibility. Copulas have been used extensively in recent years. For example, Rosenberg and Schuermann (2006), Patton (2004), Jondeau and Rockinger (2006) and Cherubini et al. (2004) have used copula models in their research. Another advantage of the use of copulas is the ease with which marginal distributions and dependency parameters can be rendered conditional. In 1959, Sklar was the first to describe copulas in a statistical context. Patton (2004) elaborated extensively on Sklar's theorem. In this research three variables are examined: $X_t$, $Y_t$ and $Z_t$. There are also some exogenous variables $W_t$. Then $(X_t, Y_t, Z_t) \mid \mathcal{F}_{t-1} \sim J_t = C_t(F_t, G_t, H_t)$, where $J_t$ is the conditional trivariate distribution with conditional univariate distributions $F_t$, $G_t$ and $H_t$ of $X_t$, $Y_t$ and $Z_t$, and $C_t$ is the conditional copula. $\mathcal{F}_{t-1}$ is the information set up to time $t-1$.

Sklar's theorem for continuous conditional distributions (Patton, 2004).

Let $F$ be the conditional distribution of $X \mid Z$, $G$ be the conditional distribution of $Y \mid Z$ and $H$ be the joint distribution of $(X, Y) \mid Z$. Assume that $F$ and $G$ are continuous in $x$ and $y$, and let $\mathcal{Z}$ be the support of $Z$. Then there exists a unique conditional copula $C$ such that

\[
H(x, y \mid z) = C\left( F(x \mid z), G(y \mid z) \mid z \right), \quad \forall (x, y) \in \bar{\mathbb{R}} \times \bar{\mathbb{R}} \text{ and each } z \in \mathcal{Z}
\tag{6}
\]

Conversely, if we let $F$ be the conditional distribution of $X \mid Z$, $G$ be the conditional distribution of $Y \mid Z$, and $C$ be a conditional copula, then the function $H$ defined by equation (6) is a conditional bivariate distribution function with conditional marginal distributions $F$ and $G$.

In this research two copulas will be tested: the t-copula, used by (amongst others) Demarta and McNeil (2005) and Patton (2004), and the Gaussian copula, used by Patton (2004).
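The converse part of Sklar's theorem suggests a simple recipe for simulation: draw dependent uniforms from a copula and push them through the inverse CDFs of whatever marginals are wanted. The sketch below does this for a bivariate Gaussian copula with a fixed correlation, using only the Python standard library. The function names, the correlation of 0.8 and the two marginals (a unit exponential and a normal with standard deviation 2) are assumptions made purely for illustration; the conditional, trivariate models of this thesis are not reproduced here.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def gaussian_copula_draw(rho, rng):
    """One draw (u, v) from a bivariate Gaussian copula with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    # The probability integral transform turns each normal into a standard uniform.
    u, v = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
    eps = 1e-12  # keep draws strictly inside (0, 1) for the inverse CDFs
    return min(max(u, eps), 1 - eps), min(max(v, eps), 1 - eps)


def sample_joint(rho, inverse_cdf_x, inverse_cdf_y, size, seed=0):
    """Sample (x, y) pairs with the given marginals and Gaussian-copula dependence."""
    rng = random.Random(seed)
    draws = []
    for _ in range(size):
        u, v = gaussian_copula_draw(rho, rng)
        draws.append((inverse_cdf_x(u), inverse_cdf_y(v)))
    return draws


# Hypothetical marginals: unit exponential for X, N(0, 2^2) for Y.
pairs = sample_joint(
    rho=0.8,
    inverse_cdf_x=lambda u: -math.log(1.0 - u),
    inverse_cdf_y=NormalDist(0.0, 2.0).inv_cdf,
    size=5000,
)
```

The marginals and the dependence structure are specified separately, which is the flexibility the text refers to: swapping the exponential for a skewed t marginal would leave the copula part of the code untouched.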

2.5 Contribution

Summarizing, this paper will contribute to the existing literature by modelling three assets using different copulas. For the marginal conditional distributions a GJR-GARCH specification is used in the variance equation, while innovations are modelled using a skewed t-distribution allowing for conditional, time-varying skewness and degrees-of-freedom parameters. The performance of different investment strategies will be compared: optimize the portfolio weights once using the unconditional distributions of returns, optimize portfolio weights monthly using the trivariate normal model (conditional means and variances) and optimize portfolio weights monthly using both the Student's t-copula and the Gaussian copula (where in both cases the marginal distributions follow the skewed t-distribution). Comparison of these four strategies makes it possible to check whether it is the use of conditional moments, the use of more accurately modelled innovations or the use of copulas that yields more accurate predictions. The bootstrap methods of Patton (2004) will be used to examine whether the differences between models are statistically significant.

3 Data

3.1 Qualitative description of the data

Patton (2004) used monthly data from the CRSP (Center for Research in Security Prices) on the top 10% and the bottom 10% of stocks sorted by market capitalization to generate a portfolio consisting of small caps and another consisting of large caps. The `small cap' portfolio typically yields a higher average return at the cost of higher risk. The `large cap' portfolio yields a more modest average return with the benefit of lower risk. In this research a third asset - the so-called `midcap' portfolio - is added. This portfolio consists of the 5th decile of companies listed in the CRSP database sorted by market capitalization. Monthly data for a period of 60 years are available. The first 600 datapoints (January 1954 to December 2003) are used for in-sample model building; the last 120 datapoints (January 2004 to December 2013) are used for out-of-sample estimation and performance comparison.

In addition to these three indices other explanatory variables will be used. For every observation in the data the total monthly return for each portfolio is given, split into income return and asset appreciation return. The first lag of income return - or dividends - is used as the first exogenous variable. The resulting three variables act as proxies for time-varying expected returns. Furthermore, lags of the three Fama and French variables are used as exogenous predictors. The small-minus-big (SMB) factor accounts for the spread in returns between small- and large-sized firms. This variable tracks the cyclical variation in the risk premium on stocks. The high-minus-low (HML) factor accounts for the spread in returns between value and growth stocks. The third Fama-French factor that is used is the excess return on the market (MKTRF). This variable measures general portfolio performance. The last exogenous variable used in this research is the first lag of the risk-free rate. The risk-free rate is used as a proxy for shocks to expected growth in the real economy. The use of these variables is roughly based on Patton (2004).

3.2 Quantitative description of the data

Table 1 displays descriptive statistics of the three time series. In both the full sample and the in-sample period average returns increase with risk. Large caps display a relatively low return with a relatively low risk, whereas small caps display a higher average return and higher risk. The midcap is situated in between. For the out-of-sample period this neat order no longer occurs. The midcap seems to be the 'superior' asset, displaying a higher average return with lower risk compared to the small cap portfolio. The large cap and midcap display negative skewness in all samples. The small cap portfolio displays positive skewness in both the full sample and the in-sample period, while displaying negative skewness in the out-of-sample period. As can be seen from the table, kurtosis tends to increase with the riskiness of the portfolio, meaning that a larger part of the variance is determined by more frequent extreme returns (both negative and positive). In the last four rows test results are displayed. The first reported p-value is under the null hypothesis of the absence of skewness. The second reported p-value is under the null hypothesis of the absence of excess kurtosis. The third reported p-value is under the joint null hypothesis of the absence of both skewness and excess kurtosis. The Jarque-Bera test statistic that results from this last test is displayed in the last row. The test statistic is $\chi^2(2)$-distributed.

This table provides evidence that skewness and excess kurtosis are present in the sample. Modelling skewness and kurtosis may thus prove to be useful.
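For reference, the textbook form of the Jarque-Bera statistic combines sample skewness S and kurtosis K as JB = n/6 · (S² + (K − 3)²/4), which is asymptotically χ²(2)-distributed under normality. The Python sketch below implements this standard formula; it is an illustration only and is not claimed to reproduce the exact test variant or software behind the values in Table 1. For two degrees of freedom the χ² survival function has the closed form exp(−x/2), so no statistics library is needed for the p-value.

```python
import math


def jarque_bera(sample):
    """Sample skewness, kurtosis, Jarque-Bera statistic and asymptotic p-value."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n  # central moments
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    p_value = math.exp(-statistic / 2)  # survival function of chi-squared(2)
    return skewness, kurtosis, statistic, p_value
```

A small p-value rejects the joint null hypothesis of zero skewness and no excess kurtosis.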

Table 1 also shows the unconditional correlations between the three assets. The correlation between the small cap and midcap and the correlation between the midcap and large caps are both higher than the correlation between the small and large caps. This is to be expected, since portfolios of adjacent size are more closely related than the small and the large caps. It is striking that all unconditional correlations are relatively high during the out-of-sample period.

4 Models & Estimation

In this section different models will be estimated and compared. First the models for the conditional mean and conditional variance will be estimated. The standardized residuals (zero mean, unit variance) of these models will be used to determine models for the conditional skewness and conditional kurtosis.¹ Then the standardized residuals of the mean equation will be transformed using the CDF of the skewed t-distribution with time-varying parameters to construct residuals that follow a standard uniform distribution. These transformed residuals will be used to estimate the conditional correlation structure of the two different copulas. The transformed residuals will also be used in a Kolmogorov-Smirnov test for model specification.
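The transformation and specification check described above can be sketched as follows: standardized residuals are mapped through a CDF (the probability integral transform), and the Kolmogorov-Smirnov statistic then measures how far the transformed values are from a standard uniform distribution. In this Python sketch the standard normal CDF is used purely as a stand-in for the CDF of the skewed t-distribution with time-varying parameters, and the residual values are made up; both are assumptions for illustration.

```python
from statistics import NormalDist


def probability_integral_transform(residuals, cdf):
    """Map standardized residuals to (0, 1) through the assumed CDF."""
    return [cdf(z) for z in residuals]


def ks_statistic_uniform(values):
    """Kolmogorov-Smirnov distance between the empirical CDF and Uniform(0, 1)."""
    ordered = sorted(values)
    n = len(ordered)
    distance = 0.0
    for i, u in enumerate(ordered):
        # Compare the uniform CDF with the empirical CDF just after and just before u.
        distance = max(distance, (i + 1) / n - u, u - i / n)
    return distance


# Hypothetical standardized residuals; the normal CDF stands in for the skewed t CDF.
residuals = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.6]
uniforms = probability_integral_transform(residuals, NormalDist().cdf)
ks = ks_statistic_uniform(uniforms)
```

If the marginal model is well specified the transformed residuals should look uniform and the statistic should be small; a large value signals misspecification of the marginal distribution.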

4.1 Estimation of the conditional mean and conditional variance

Possible regressors for the conditional mean are lagged variables of the three asset returns and lagged variables of the exogenous variables. Given the assumption that volatility tends to be higher in periods of economic downturn than in periods of economic prosperity, an asymmetric model specification for the conditional variance (Patton, 2004; Glosten et al., 1993) is used. Lags of the exogenous regressors are also allowed to enter the conditional variance specification. Information criteria (AIC, BIC) were used to determine the optimal model specification, alongside residual analysis to check for possible misspecification. This process was initiated by including all available variables in the model and dropping insignificant variables stepwise. Results of this process are displayed in Table 2. The resulting models for the conditional mean and conditional variance of each asset are displayed in equations (7) to (12). For convenience, the `large cap' portfolio will be denoted as X, the `midcap' portfolio as Y and the `small cap' portfolio as Z. Estimating the mean-variance equations with t-distributed innovations proved to be more accurate than using Gaussian innovations. From the equations it can be seen that the error term is decomposed into two terms. The skewed t innovations (with zero mean and unit variance by construction) are denoted by $\varepsilon_t$. The variance $h_t$ is added by multiplying this distribution term by $\sqrt{h_t}$. The variance equations contain (shown here for asset X) an $h^x_{t-1}\varepsilon^2_{t-1}$-term, an $h^x_{t-1}\varepsilon^2_{t-1}\mathbf{1}\{\varepsilon_{t-1} > 0\}$-term and an $h^x_{t-1}$-term. The first two terms are lags of the last observed squared shock; the GARCH term is the first lag of the estimated variance.

¹ The method of estimation described yields estimates that are not efficient. During this research efficient estimation methods

Table 1: Descriptive statistics & correlations

| | Full sample: Large | Full sample: Mid | Full sample: Small | In-sample: Large | In-sample: Mid | In-sample: Small | Out-of-sample: Large | Out-of-sample: Mid | Out-of-sample: Small |
|---|---|---|---|---|---|---|---|---|---|
| Mean* | 6.8454 | 9.6846 | 11.4225 | 7.2862 | 9.5176 | 11.9577 | 4.6640 | 10.5248 | 8.7907 |
| Std Dev* | 14.4033 | 18.1474 | 22.3139 | 14.5376 | 18.0847 | 22.4079 | 13.7541 | 18.0847 | 21.9165 |
| Skewness | -.4000 | -.4424 | .2832 | -.3335 | -.4174 | .3845 | -.8132 | -.5615 | -.2682 |
| Kurtosis | 4.5217 | 5.4040 | 6.9967 | 4.4701 | 5.5047 | 7.2452 | 4.7232 | 4.9432 | 5.5263 |
| 5% VaR | 6.4483 | 7.9446 | 8.7698 | 6.3872 | 7.4551 | 8.5533 | 7.5259 | 8.6201 | 9.2040 |
| 1% VaR | 10.2308 | 12.7387 | 17.0819 | 10.1613 | 13.7998 | 16.1078 | 10.3208 | 11.8086 | 17.0819 |
| Min | -20.1817 | -27.0558 | -28.5173 | -20.1817 | -27.0558 | -28.5173 | -15.2432 | -20.0341 | -23.2912 |
| Max | 17.1283 | 24.9645 | 39.985 | 17.1283 | 24.9645 | 39.9850 | 9.8449 | 17.3185 | 24.9943 |
| p-value skewness | .0000 | .0000 | .0021 | .0010 | .0000 | .0002 | .0006 | .0124 | .2125 |
| p-value kurtosis | .0000 | .0000 | .0000 | .0000 | .0000 | .0000 | .0057 | .0031 | .0007 |
| p-value joint | .0000 | .0000 | .0000 | .0000 | .0000 | .0000 | .0004 | .0019 | .0039 |
| Jarque-Bera | 36.89 | 53.44 | 64.79 | 27.16 | 45.01 | 61.94 | 15.75 | 12.50 | 11.07 |

Correlations:

| | Full sample: Large | Full sample: Mid | Full sample: Small | In-sample: Large | In-sample: Mid | In-sample: Small | Out-of-sample: Large | Out-of-sample: Mid | Out-of-sample: Small |
|---|---|---|---|---|---|---|---|---|---|
| Large | 1.0000 | | | 1.0000 | | | 1.0000 | | |
| Mid | 0.8445 | 1.0000 | | 0.8323 | 1.0000 | | 0.9121 | 1.0000 | |
| Small | 0.6657 | 0.8789 | 1.0000 | 0.6311 | 0.8685 | 1.0000 | 0.8535 | 0.9330 | 1.0000 |

Statistics with an asterisk are annualized. The in-sample period runs from the start of 1954 until the end of 2003. The out-of-sample period runs from the start of 2004 until the end of 2013. The in-sample period thus consists of 600 datapoints, the out-of-sample period consists of 120 datapoints. The p-value reported corresponds with the value of the Jarque-Bera normality test. In this test the null hypotheses of zero skewness and no excess kurtosis are tested jointly.

\[
\begin{aligned}
X_t &= \alpha_0 + \alpha_1 X_{t-1} + \alpha_2 X_{t-4} + \alpha_3 Z_{t-1} + \alpha_4 DIV1_{t-1} + \alpha_5 DIV5_{t-1} \\
&\quad + \alpha_6 DIV10_{t-1} + \sqrt{h^x_t}\, \varepsilon^x_t
\end{aligned}
\tag{7}
\]

\[
\begin{aligned}
h^x_t &= \alpha_7 + \alpha_8 h^x_{t-1} (\varepsilon^x_{t-1})^2 + \alpha_9 h^x_{t-1} (\varepsilon^x_{t-1})^2 \mathbf{1}\{\varepsilon^x_{t-1} > 0\} + \alpha_{10} h^x_{t-1} + \alpha_{11} DIV1_{t-1} \\
&\quad + \alpha_{12} DIV5_{t-1} + \alpha_{13} SMB_{t-1} + \alpha_{14} HML_{t-1} + \alpha_{15} RF_{t-1}
\end{aligned}
\tag{8}
\]

\[
Y_t = \beta_0 + \beta_1 X_{t-1} + \beta_2 Z_{t-1} + \beta_3 DIV1_{t-1} + \beta_4 MKTRF_{t-1} + \sqrt{h^y_t}\, \varepsilon^y_t
\tag{9}
\]

\[
\begin{aligned}
h^y_t &= \beta_5 + \beta_6 h^y_{t-1} (\varepsilon^y_{t-1})^2 + \beta_7 h^y_{t-1} (\varepsilon^y_{t-1})^2 \mathbf{1}\{\varepsilon^y_{t-1} > 0\} + \beta_8 h^y_{t-1} + \beta_9 DIV1_{t-1} \\
&\quad + \beta_{10} DIV10_{t-1} + \beta_{11} MKTRF_{t-1} + \beta_{12} HML_{t-1} + \beta_{13} RF_{t-1}
\end{aligned}
\tag{10}
\]

\[
Z_t = \gamma_0 + \gamma_1 X_{t-1} + \gamma_2 Z_{t-1} + \gamma_3 DIV1_{t-1} + \gamma_4 MKTRF_{t-1} + \sqrt{h^z_t}\, \varepsilon^z_t
\tag{11}
\]

\[
\begin{aligned}
h^z_t &= \gamma_5 + \gamma_6 h^z_{t-1} (\varepsilon^z_{t-1})^2 + \gamma_7 h^z_{t-1} (\varepsilon^z_{t-1})^2 \mathbf{1}\{\varepsilon^z_{t-1} > 0\} + \gamma_8 h^z_{t-1} + \gamma_9 DIV1_{t-1} \\
&\quad + \gamma_{10} DIV5_{t-1} + \gamma_{11} DIV10_{t-1} + \gamma_{12} MKTRF_{t-1} + \gamma_{13} HML_{t-1} + \gamma_{14} RF_{t-1}
\end{aligned}
\tag{12}
\]

The models were constructed using observations from the in-sample period only (the first 600 observations). Throughout the out-of-sample period the models are recursively re-estimated, but the model specifications as given are not updated.
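The recursive re-estimation scheme can be illustrated with a deliberately simple stand-in model. In the Python sketch below an AR(1) mean equation fitted by ordinary least squares replaces the full mean and GJR-GARCH variance equations, which is an assumption made only to keep the example short; the function names are hypothetical. The point is the loop structure: at each out-of-sample date the parameters are re-estimated on all data available up to the previous period, while the form of the model stays fixed.

```python
def fit_ar1(series):
    """OLS estimates (intercept, slope) of y_t = intercept + slope * y_{t-1}."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def expanding_window_forecasts(series, first_forecast_index):
    """One-step-ahead forecasts for series[first_forecast_index:].

    The forecast for period t only uses observations 0, ..., t - 1.
    """
    forecasts = []
    for t in range(first_forecast_index, len(series)):
        intercept, slope = fit_ar1(series[:t])  # re-estimate on the expanding sample
        forecasts.append(intercept + slope * series[t - 1])
    return forecasts
```

With 720 monthly observations and `first_forecast_index=600`, this loop would produce 120 one-step-ahead forecasts, matching the split between the in-sample and out-of-sample periods described in section 3.1.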

4.2 Estimation of the conditional skewness and conditional kurtosis

In addition to specifications for the conditional mean and conditional variance, the third and fourth moments of the univariate time series need to be modelled properly. As stated by Hansen (1994), a skewed Student's t-distribution is a good fit for the marginal distributions of asset portfolio returns. Both the skewness and degrees-of-freedom parameters are rendered conditional. They are denoted $\lambda_t$ and $\nu_t$ and are allowed to depend on lags of the exogenous variables and the forecast conditional means and variances. Hansen (1994) suggested transformations to ensure parameter feasibility, as can be seen in equations (13) and (14). The degrees-of-freedom parameter has to be larger than 2, and $\Lambda(x) = (1 - e^{-x})/(1 + e^{-x})$ is designed to keep the $\lambda$-parameter on the interval $(-1, 1)$. Ideally, this model is efficiently estimated by maximum likelihood. The conditional log-likelihood function is given in equations (15) to (20). In the following the conditional log-likelihood will be derived for $X_t$ only, but the mechanism is the same for all three assets. Equation (15) is the mean equation of $X_t$. Equation (16) displays the error term split into two components. $\varepsilon_t$ follows the skewed t-distribution of equation (5). This distribution is designed to have zero mean and unit variance (Hansen, 1994); multiplying by $\sqrt{h_t}$ hence adds the conditional variance to the error distribution. Equation (18) shows the expression to be used in maximizing the conditional log-likelihood. This equation contains all parameters for both the mean and the variance equation, while $\lambda$ and the degrees-of-freedom parameter are parameterized in the probability density function itself. The final log-likelihood function is displayed in


Table 2: Comparison of univariate time series models (mean- and variance-equation)

CAP1 (LARGE CAPS)

| Model* | Mean equation | σ²-equation | ε ∼ | AIC | BIC | L=1** | L=2** | L=3** | Lag P<0.05** |
|---|---|---|---|---|---|---|---|---|---|
| GJR-GARCH | All | 4,5,6,7,8,9,10 | Gaussian | -2146.30 | -2049.61 | .6815 | .8966 | .4531 | 6 (.0357) |
| GJR-GARCH | All | 4,5,6,7,8,9,10 | t | -2153.51 | -2052.42 | .2338 | .4922 | .2578 | 5 (.0237) |
| GJR-GARCH | 1,2,3,4,5,6,7,8 | 4,5,6,7,8,9,10 | t | -2157.16 | -2064.86 | .2789 | .5564 | .3083 | 5 (.0326) |
| GJR-GARCH | 1,2,3,4,5,6,7 | 4,5,8,9,10 | t | -2161.13 | -2077.62 | .2690 | .5427 | .2993 | 5 (.0316) |
| GJR-GARCH | 1,2,3,4,5,6 | 4,5,8,9,10 | t | -2161.58 | -2082.47 | .2262 | .4712 | .2486 | 5 (.0220) |
| GJR-GARCH | 1,3,4,5,6 | 4,5,8,9,10 | t | -2162.56 | -2087.84 | .2157 | .4623 | .2124 | 5 (.0184) |
| GJR-GARCH*** | 1,3,4,5,6 | 4,5,8,9,10 | t | -2150.65 | -2071.62 | .3775 | .6774 | .3244 | 12+ |

CAP5 (MID CAPS)

| Model* | Mean equation | σ²-equation | ε ∼ | AIC | BIC | L=1** | L=2** | L=3** | Lag P<0.05** |
|---|---|---|---|---|---|---|---|---|---|
| GJR-GARCH | All | 4,5,6,7,8,9,10 | Gaussian | -1924.40 | -1827.70 | .3146 | .2083 | .3665 | 12+ |
| GJR-GARCH | All | 4,5,7,8,9,10 | t | -1943.76 | -1847.06 | .6337 | .2892 | .4732 | 12+ |
| GJR-GARCH | 1,2,3,4,5,7 | 4,5,7,8,9,10 | t | -1950.53 | -1871.414 | .5743 | .3549 | .5467 | 12+ |
| GJR-GARCH | 1,3,4,7 | 4,5,7,8,9,10 | t | -1953.33 | -1883.01 | .6554 | .3831 | .5811 | 12+ |
| GJR-GARCH | 1,3,4,7 | 4,5,7,9,10 | t | -1954.38 | -1888.45 | .6153 | .3540 | .5490 | 12+ |
| GARCH | 1,3,4,7 | 4,5,7,9,10 | t | -1953.43 | -1891.89 | .7680 | .4493 | .6481 | 12+ |

CAP10 (SMALL CAPS)

| Model* | Mean equation | σ²-equation | ε ∼ | AIC | BIC | L=1** | L=2** | L=3** | Lag P<0.05** |
|---|---|---|---|---|---|---|---|---|---|
| GJR-GARCH | All | 4,5,6,7,8,9,10 | Gaussian | -1734.01 | -1637.31 | .0961 | .2457 | .3258 | 12+ |
| GJR-GARCH | All | 4,5,6,7,8,9,10 | t | -1774.92 | -1673.82 | .1522 | .3584 | .5062 | 12+ |
| GJR-GARCH | 1,3,4,5,6,7,10 | 4,5,6,7,8,9,10 | t | -1780.02 | -1692.11 | .1806 | .4073 | .5756 | 12+ |
| GJR-GARCH | 1,3,4,7 | 4,5,6,7,8,9,10 | t | -1782.61 | -1707.89 | .2626 | .5268 | .6709 | 12+ |
| GJR-GARCH | 1,3,4,7 | 4,5,6,7,9,10 | t | -1783.45 | -1713.13 | .2866 | .5591 | .7014 | 12+ |
| GARCH | 1,3,4,7 | 4,5,6,7,9,10 | t | -1780.23 | -1714.303 | .5499 | .8295 | .8836 | 12+ |

* Both p and q are 1. Further: 1. cap1_{t-1}, 2. cap5_{t-1}, 3. cap10_{t-1}, 4. div1_{t-1}, 5. div5_{t-1}, 6. div10_{t-1}, 7. mktrf_{t-1}, 8. smb_{t-1}, 9. hml_{t-1}, 10. rf_{t-1}
** Residual analysis. H0: All first L lags (L=1,2,3) are jointly insignificant. The last column indicates which lag is the first to be significant.
*** includes an AR(4)-term


equation (20). Estimation of this conditional log-likelihood is preferable since it yields efficient estimates. In practice, however, it yielded spurious results: all parameters in the equation for the degrees-of-freedom parameter $\nu_t$ were estimated to be zero. Both Matlab and Stata were used to estimate this likelihood function, and both failed due to the same problem. This phenomenon may be caused by the large number of parameters to be estimated relative to the small amount of data available. However, using only a constant in all equations ($X_t$, $h_t$, $\lambda_t$, $\nu_t$) yields the same result. Correlation between the mean or the variance and the degrees of freedom may be another cause. Investigating this problem in detail is left for further research.

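\begin{align}
\lambda^x_t &= \Lambda(\alpha_0 + \alpha_1 \mathrm{DIV1}_{t-1} + \ldots) \tag{13} \\
\nu^x_t &= 2.1 + (\beta_0 + \beta_1 \mathrm{DIV1}_{t-1} + \ldots)^2 \tag{14} \\
X_t &= \alpha_0 + \alpha_1 X_{t-1} + \alpha_2 X_{t-4} + \alpha_3 Z_{t-1} + \alpha_4 \mathrm{DIV1}_{t-1} + \alpha_5 \mathrm{DIV5}_{t-1} + \alpha_6 \mathrm{DIV10}_{t-1} + \eta_t \tag{15} \\
\eta_t &= \sqrt{h_t}\,\varepsilon_t \tag{16} \\
h_t &= \alpha_7 + \alpha_8 h^x_{t-1}\varepsilon^2_{t-1} + \alpha_9 h^x_{t-1}\varepsilon^2_{t-1}\mathbf{1}\{\varepsilon_{t-1} > 0\} + \alpha_{10} h^x_{t-1} \notag \\
&\quad + \alpha_{11} \mathrm{DIV1}_{t-1} + \alpha_{12} \mathrm{DIV5}_{t-1} + \alpha_{13} \mathrm{SMB}_{t-1} + \alpha_{14} \mathrm{HML}_{t-1} + \alpha_{15} \mathrm{RF}_{t-1} \tag{17} \\
\varepsilon_t &= \frac{\eta_t}{\sqrt{h_t}} = \frac{X_t - Z_{t-1}\beta_1}{\sqrt{h_t}} \tag{18} \\
\ln L(\theta \mid x_1, x_2, \ldots, x_n) &= \sum_{t=1}^{n} \ln l_t(\theta) \tag{19} \\
\ln l_t(\theta) &= \ln g(\varepsilon_t \mid \lambda_t, \nu_t) - \frac{1}{2} \ln h_t \tag{20}
\end{align}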
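As an illustration of equations (13), (14) and (20), the sketch below evaluates the two feasibility transformations and one log-likelihood contribution. It is a minimal Python example under stated assumptions: the density is Hansen's (1994) skewed t in its usual published form, which is assumed here to coincide with the density referred to as equation (5), and all function names are hypothetical. It is not the Matlab or Stata code used for the thesis.

```python
import math

def logistic_map(x):
    """Lambda(x) = (1 - e^-x) / (1 + e^-x), which maps the real line to (-1, 1)."""
    return math.tanh(x / 2.0)  # algebraically identical and numerically stable

def dof_map(x):
    """nu = 2.1 + x^2, which keeps the degrees of freedom above 2."""
    return 2.1 + x * x

def skewed_t_logpdf(z, lam, nu):
    """Log-density of Hansen's standardized skewed t (zero mean, unit variance)."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(math.pi * (nu - 2)))
    c = math.exp(log_c)
    a = 4 * lam * c * (nu - 2) / (nu - 1)
    b = math.sqrt(1 + 3 * lam ** 2 - a ** 2)
    # The density has a different scale to the left and right of -a/b.
    scale = 1 - lam if z < -a / b else 1 + lam
    core = 1 + ((b * z + a) / scale) ** 2 / (nu - 2)
    return math.log(b) + log_c - 0.5 * (nu + 1) * math.log(core)

def loglik_contribution(x, mu, h, lam, nu):
    """ln l_t = ln g(eps_t | lam_t, nu_t) - 0.5 * ln h_t, as in equation (20)."""
    eps = (x - mu) / math.sqrt(h)
    return skewed_t_logpdf(eps, lam, nu) - 0.5 * math.log(h)
```

Summing `loglik_contribution` over all observations gives the objective in equation (19). In a full estimation, `lam` and `nu` would themselves be `logistic_map` and `dof_map` applied to linear indices of the explanatory variables.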

As an alternative, the model was estimated in two steps. Standardized (zero mean, unit variance) residuals of the mean and variance models are used as the dependent variable to model the conditional skewness and the degrees-of-freedom parameter of the skewed t-distribution. The pdf of the skewed t-distribution has zero mean and unit variance by construction, so using standardized residuals is necessary. These standardized residuals are used as $\varepsilon_t$ in equation (20). As stated, both conditional parameters are allowed to depend on lags of the exogenous variables and on one-step-ahead forecasts of the means and variances of the three assets. Including a constant, this brings the number of independent variables for each dependent parameter to 14, and the number of parameters to be estimated simultaneously to 28. This did not cause any problems in the estimation procedures. Comparing likelihood values, information criteria and p-values for the Kolmogorov-Smirnov specification test (to be addressed later in this section) led to the decision to maintain all variables in the model specifications for the conditional skewness and conditional degrees-of-freedom parameters for all three models. The resulting models are displayed in equations (21) and (22) and hold for all three assets. Plots of the out-of-sample forecasts of these parameters are displayed later.


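\begin{align}
\lambda_t &= \Lambda\big(\beta_0 + \beta_1 \mathrm{DIV1}_{t-1} + \beta_2 \mathrm{DIV5}_{t-1} + \beta_3 \mathrm{DIV10}_{t-1} + \beta_4 \mathrm{MKTRF}_{t-1} + \beta_5 \mathrm{SMB}_{t-1} \notag \\
&\quad + \beta_6 \mathrm{HML}_{t-1} + \beta_7 \mathrm{RF}_{t-1} + \beta_8 \mu^x_t + \beta_9 \mu^y_t + \beta_{10} \mu^z_t + \beta_{11} h^x_t + \beta_{12} h^y_t + \beta_{13} h^z_t\big) \tag{21} \\
\nu_t &= 2.1 + \big(\beta_{14} + \beta_{15} \mathrm{DIV1}_{t-1} + \beta_{16} \mathrm{DIV5}_{t-1} + \beta_{17} \mathrm{DIV10}_{t-1} + \beta_{18} \mathrm{MKTRF}_{t-1} + \beta_{19} \mathrm{SMB}_{t-1} \notag \\
&\quad + \beta_{20} \mathrm{HML}_{t-1} + \beta_{21} \mathrm{RF}_{t-1} + \beta_{22} \mu^x_t + \beta_{23} \mu^y_t + \beta_{24} \mu^z_t + \beta_{25} h^x_t + \beta_{26} h^y_t + \beta_{27} h^z_t\big)^2 \tag{22}
\end{align}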

The likelihood used to estimate the parameters is the conditional density function of the skewed t-distribution given in equation (5), where z denotes the standardized residuals from the mean and variance equations. It is important to note that the in-sample observations were used to construct the model, and that the models for λ and ν were recursively re-estimated throughout the out-of-sample period. However, as was the case for the mean and variance equations, the model specifications are not updated throughout the out-of-sample period.

4.3 Specification tests

With proper model specifications in place for the mean and the variance of the univariate time series, as well as for the parameters of the conditional skewed t-distribution, it is now possible to transform the standardized residuals from the mean and variance equations into standard uniformly distributed residuals. Those transformed residuals serve as input for the copula estimation and allow for the use of the Kolmogorov-Smirnov test for model specification. To check whether the use of the conditional skewed t-distribution is necessary in the first place, the residuals are also transformed using simpler (unconditional) distributions and tested. Equations (23) and (24) show how the residuals have to be transformed in order to be standard uniform. In these equations the residuals are transformed using the skewed t-distribution. The transformation is carried out for all three time series, each with its own model predictions for the mean, variance, skewness parameter and degrees-of-freedom parameter. Note that transformed residuals are also calculated for the out-of-sample period, using the one-step-ahead forecasts of the mean and variance, the most recent observed innovation and the forecasts of the skewness and degrees-of-freedom parameters.

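\begin{align}
\hat{\varepsilon}_t &= \frac{X_t - \hat{\mu}_t}{\sqrt{\hat{h}_t}} \sim \text{Skewed } t\,(\hat{\lambda}_t, \hat{\nu}_t) \tag{23} \\
\hat{\eta}_t &= F(\hat{\varepsilon}_t; \hat{\lambda}_t, \hat{\nu}_t) \sim \text{Uniform}[0, 1] \tag{24}
\end{align}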

To check for proper model specification, a Kolmogorov-Smirnov test (Massey Jr, 1951) is used to check whether the transformed residuals follow the standard uniform distribution. The results are displayed in Figure 2. The null hypothesis is that the residuals follow the desired distribution; the alternative hypothesis is that they do not. The critical value for the test statistic is 0.0553. As can be seen, the model is not rejected by the Kolmogorov-Smirnov test for any of the three time series.
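The test statistic itself is simple to compute once the transformed residuals are available. The sketch below is an illustrative Python version, assuming the inputs already lie in [0, 1]; the function names are hypothetical, and the critical value uses the common large-sample 5% rule 1.36/√n, which for n = 600 gives roughly 0.0555 and is therefore only an approximation of the 0.0553 quoted above.

```python
import math

def ks_statistic_uniform(u):
    """Kolmogorov-Smirnov distance between the sample u and Uniform[0, 1]."""
    xs = sorted(u)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # The empirical CDF jumps from (i - 1)/n to i/n at x; the uniform CDF is x.
        d = max(d, i / n - x, x - (i - 1) / n)
    return d

def ks_critical_value_5pct(n):
    """Approximate large-sample 5% critical value (rule of thumb 1.36 / sqrt(n))."""
    return 1.36 / math.sqrt(n)
```

The null hypothesis of uniformity is rejected at the 5% level when `ks_statistic_uniform(u)` exceeds `ks_critical_value_5pct(len(u))`.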

In the estimation of the mean- and variance-equation innovations were assumed to follow an unconditional regular t-distribution. Therefore it may be of interest to check whether transformation of the residuals using an unconditional regular t-distribution may be a suitable alternative. Fitting the data to an unconditional


Figure 2: Histogram of the transformed residuals for the large cap series & Kolmogorov-Smirnov test results where ε ∼ conditional skewed t-distribution

Kolmogorov-Smirnov tests

| Series | Reject null hypothesis | Test statistic | p-value |
|---|---|---|---|
| $\hat{\eta}^x_t$ | no | .0195 | .9696 |
| $\hat{\eta}^y_t$ | no | .0249 | .8431 |
| $\hat{\eta}^z_t$ | no | .0332 | .5150 |

t-distribution yields an estimate of the degrees-of-freedom parameter of 7.4 for $X_t$, 6.2 for $Y_t$ and 3.7 for $Z_t$. These values are used in the transformation based on a regular, unconditional t-distribution. The results of the Kolmogorov-Smirnov tests and the histogram of the large cap series are displayed in Figure 3. It can be concluded that using the conditional skewed t-distribution is necessary: transforming the residuals with a regular, unconditional distribution fails to produce Uniform[0, 1]-distributed residuals for the mid cap and small cap series, and the large cap series is only marginally not rejected (p-value .0629). Note that the specification tests are all carried out using the in-sample observations only, but the reported results also hold for the full sample.

4.4 One-step-ahead forecasting in the out-of-sample period

In this section the derived models are used to produce one-step-ahead forecasts for the four parameters. These parameter forecasts will eventually be used to re-optimize portfolio weights for every month during the out-of-sample period. In Figure 4 the forecast conditional means and variances during the out-of-sample period are displayed. The first thing that draws the eye is that the large cap index has a far more stable mean than both the mid cap and the small cap. This is to be expected, since the large cap is the `safe' investment: its mean return is moderate, but stable. The fact that the model for the large cap contains an AR(4)-term may be an even more important factor for the smooth mean prediction function. In October 2008 the financial crisis struck the world. This can clearly be seen from the expected returns of both the mid cap and the small cap. The negative expected return is persistent for a few months. More characteristics of the different assets come forward in Figure 4. For example, expected returns of the small cap index fluctuate more than those of the mid cap and the large cap. This characteristic is also to be


Figure 3: Histogram of the transformed residuals for the large cap series & Kolmogorov-Smirnov test results where ε ∼ unconditional regular t-distribution

Kolmogorov-Smirnov tests

| Series | Reject null hypothesis | Test statistic | p-value |
|---|---|---|---|
| $\hat{\eta}^x_t$ | no | .0536 | .0629 |
| $\hat{\eta}^y_t$ | yes | .0622 | .0187 |
| $\hat{\eta}^z_t$ | yes | .0934 | .0000 |

found in the graph for the conditional volatility. The small cap index exhibits the highest volatility, whereas the large cap index shows less variance. Again, in the months after October 2008 volatility rises sharply.

In Figure 5 the one-step-ahead forecasts of the λ- and ν-parameters are displayed for the out-of-sample period. Whereas the volatilities of the three assets generally move in the same direction, the predictions for the skewness parameter are erratic: estimates are very volatile and jumpy. The large cap and the mid cap, however, do exhibit some similar behaviour. A very weak positive trend is observed from the beginning of 2004 to the summer of 2007. After that the skewness parameter declines until the end of 2009, after which it rises again, only to remain more or less constant after May 2011. A few outliers aside, the skewness parameter roughly varies from -0.7 to 0. This is quite an extensive range, so modelling conditional skewness may indeed be useful.

The small cap series exhibits different behaviour. For the first three years of the out-of-sample period the skewness parameter fluctuates between -0.1 and 0.1. After that a slight increase occurs, followed by a steep decline that starts a few months before the financial crisis broke out and ends in January 2009. After January 2009 the skewness parameter of the small caps increases to a value of around 0.58. After that the skewness parameter roughly remains in the range between 0 and 0.2, a few positive and negative outliers aside. The fact that the small cap generally displays a higher value for the skewness parameter was to be expected, given the univariate time series characteristics in Table 1.

The one-step-ahead predictions for the ν-parameter generally move in the same direction. Note that for some months a very high value of the ν-parameter was estimated, but for graphical purposes those values have been truncated to 30. For parameter values higher than 30 the t-distribution approximates a normal distribution, so this is a reasonable truncation. For the first few years the degrees-of-freedom parameter exhibits


quite steady behaviour. The first steep increase in the forecast of this parameter occurs around October 2007. After this peak the degrees-of-freedom parameter slowly falls back to its initial level, only to rise again very fast in the months after October 2008. This is unexpected: in periods of economic turmoil a larger part of the variance is expected to be explained by the regular occurrence of extreme events, so the parameter value was expected to be low instead of high. A low value for the degrees-of-freedom parameter in the skewed t-distribution implies more excess kurtosis, and thus `fatter' tails. After a year the degrees-of-freedom parameter falls back to its initial level (around 5), only to rise again at the end of 2010 and the beginning of 2012. The three assets generally move in the same direction.

Comparing the forecast plots in this paper to the forecast plots of Patton (2004) yields a striking conclusion. Where in this research the forecasts are very jumpy, in Patton's research they exhibit very smooth behaviour. The jumpiness of the forecasts in this paper may be due to the fact that far more explanatory variables are used here to explain λ and ν. Another cause for the difference may be that Patton used autoregressive or moving average terms in his models. He did not report this in his final publication, but the code he published online suggests that he did. The plots of the conditional degrees-of-freedom parameter in particular are not as expected, but the fact that these predictions generate residuals that follow a standard uniform distribution is reassuring.

4.5 Copulas & optimization

In this research two different copulas are compared: the Gaussian copula (equation 27) and Student's t copula (equation 28). It is worth noting that both copulas are symmetric, so asymmetric dependence (as in Patton (2004)) is not captured within these models. To allow for a comparison between the more advanced models introduced in this paper and simpler models, a benchmark model is introduced. This model is the trivariate normal model where the means and (co)variances are rendered conditional (equation 26). It is important to note that for all models that use conditional parameters, portfolio weights are updated monthly during the out-of-sample period. It is also of interest to check whether rendering means and (co)variances conditional in itself improves portfolio decisions, so an unconditional trivariate normal model (equation 25) is included as an investment strategy as well. The four multivariate models that are compared are displayed in equations (25) to (28), with the correlation matrix $\rho_t$ defined in equation (29).


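\begin{align}
\left( \frac{X - \mu_x}{\sqrt{h_x}}, \frac{Y - \mu_y}{\sqrt{h_y}}, \frac{Z - \mu_z}{\sqrt{h_z}} \right) &\sim N\left( 0, \begin{pmatrix} 1 & \rho_{xy} & \rho_{xz} \\ \rho_{xy} & 1 & \rho_{yz} \\ \rho_{xz} & \rho_{yz} & 1 \end{pmatrix} \right) \tag{25} \\
\left( \frac{X_t - \mu^x_t}{\sqrt{h^x_t}}, \frac{Y_t - \mu^y_t}{\sqrt{h^y_t}}, \frac{Z_t - \mu^z_t}{\sqrt{h^z_t}} \right) &\sim N(0, \rho_t) \tag{26} \\
\left( \frac{X_t - \mu^x_t}{\sqrt{h^x_t}}, \frac{Y_t - \mu^y_t}{\sqrt{h^y_t}}, \frac{Z_t - \mu^z_t}{\sqrt{h^z_t}} \right) &\sim C_{norm}\big(\text{Skew } t(\lambda^x_t, \nu^x_t), \text{Skew } t(\lambda^y_t, \nu^y_t), \text{Skew } t(\lambda^z_t, \nu^z_t), \rho_t\big) \tag{27} \\
\left( \frac{X_t - \mu^x_t}{\sqrt{h^x_t}}, \frac{Y_t - \mu^y_t}{\sqrt{h^y_t}}, \frac{Z_t - \mu^z_t}{\sqrt{h^z_t}} \right) &\sim C_{t\text{-}stud}\big(\text{Skew } t(\lambda^x_t, \nu^x_t), \text{Skew } t(\lambda^y_t, \nu^y_t), \text{Skew } t(\lambda^z_t, \nu^z_t), \rho_t, \nu^{cop}_t\big) \tag{28} \\
\rho_t &= \begin{pmatrix} 1 & \rho^{xy}_t & \rho^{xz}_t \\ \rho^{xy}_t & 1 & \rho^{yz}_t \\ \rho^{xz}_t & \rho^{yz}_t & 1 \end{pmatrix} \tag{29}
\end{align}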
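To make the role of the copula in equation (27) concrete, the sketch below draws dependent uniforms from a trivariate Gaussian copula for a given correlation matrix. It is a minimal Python illustration using only the standard library; the function names and the example correlation values are assumptions, and mapping the uniforms back to returns through the inverse skewed t distributions is not shown.

```python
import math
import random

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gaussian_copula_draws(corr, n_draws, seed=0):
    """Draw n_draws vectors of uniforms whose dependence is a Gaussian copula."""
    rng = random.Random(seed)
    lower = cholesky(corr)
    dim = len(corr)
    draws = []
    for _ in range(n_draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Correlated standard normals: z = L e, with L L' equal to corr.
        z = [sum(lower[i][k] * e[k] for k in range(i + 1)) for i in range(dim)]
        draws.append([normal_cdf(v) for v in z])
    return draws
```

A call such as `gaussian_copula_draws([[1.0, 0.8, 0.6], [0.8, 1.0, 0.7], [0.6, 0.7, 1.0]], 10000)` (illustrative correlations) yields 10,000 uniform triples. A t-copula draw would differ in two places: the correlated normals are divided by the square root of an independent chi-square draw over its degrees of freedom, and the Student's t distribution function replaces `normal_cdf`.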

With transformed residuals (that are U[0, 1]-distributed) in place, the estimation of the correlation structure is all that remains. Patton (2004) rendered the single correlation coefficient conditional on lags of the exogenous variables, one-step-ahead forecasts of the conditional means, the observed correlation during the past twelve months and an AR(1)-term. Due to the presence of three assets, the correlation structure here is a 3 × 3 matrix instead of a single correlation parameter. A first attempt at estimating the conditional correlation structure consisted of extending Patton's methods to a 3 × 3 correlation matrix. The same transformation as before was used to ensure parameter feasibility. Individual correlation predictions were modelled and subsequently merged into a 3 × 3 correlation matrix. This method yielded a one-step-ahead forecast correlation matrix for each period of the in-sample period. Estimation of the parameters was attempted using the conditional likelihood of the trivariate Gaussian copula. This method, unfortunately, yielded spurious results. Since these spurious results may have been caused by the large number of parameters to be estimated, simpler models were used for the correlation. For example, the three conditional correlations were modelled to depend only on the observed correlation of the past year and a constant term. These much simpler models yielded more plausible results. The estimation of the conditional correlation structure to be used in the t-copula was even more complex, since the t-copula has a degrees-of-freedom parameter in addition to the correlation structure.

As a solution to this difficulty, conditional correlation structures were estimated in a different, non-parametric way. A so-called rolling window technique was employed in which the one-period-ahead forecast simply equals the observed correlation structure (calculated by means of both the Gaussian and the t-copula) of the past year. Note that for the prediction of the correlation at time t + 2, information up to time t + 1 is available. Subsequently, for estimating the correlation structure at t + 3, information up to time t + 2 is available. Information at time t + 2 includes observed movements, so forecast conditional correlations are always based on observed data, not on predictions. The results of this estimation procedure are displayed in Figure 6. It is worth noting that for every time window the t-copula also estimates a degrees-of-freedom parameter. These parameter estimates are not displayed, but they tended to be very high (even 10,000+). The unconditional degrees-of-freedom parameter value lies around 7.0, so these frequent high estimates are strange. Experimenting with the size of the time window suggests that the high estimate of the


ν-parameter is caused by the very small amount of observations (12) in every window. Therefore the observed nu-forecasts are not used in the simulations.
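The rolling window idea can be sketched in a few lines. The Python example below is an illustration only: it uses the plain Pearson correlation of two series as a stand-in for the copula-based correlation estimates described above, and the function names are hypothetical. The key property it shows is that each forecast uses only observations that precede the forecast period.

```python
import math

def pearson(x, y):
    """Sample correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rolling_correlation_forecasts(x, y, window=12):
    """One-step-ahead forecasts: entry k is the forecast for period window + k.

    Each forecast equals the correlation over the `window` observations
    immediately before the forecast period, so no future data are used.
    """
    return [pearson(x[t - window:t], y[t - window:t])
            for t in range(window, len(x) + 1)]
```

With monthly data and `window=12`, the last element of the returned list is the forecast for the first month after the sample. Repeating the calculation for each pair of assets fills the off-diagonal elements of the matrix in equation (29).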

When comparing the two models one great difference draws the eye immediately. Where the correlation prediction path of the Gaussian copula exhibits a severe downward spike in May 2007, this spike is absent in the prediction path of the t-copula. It may be the case that the absence of this spike is caused by the very large ν-parameter values that were estimated in the same period, but this cannot be said for certain. The difference between the two graphs is striking, but it is beyond the scope of this research to investigate it in further detail. A high correlation is expected during the period following October 2008, but since the correlations are already high throughout the out-of-sample period, this is not clearly distinguishable.

4.6 Portfolio optimization procedure

In this section the optimization procedure is described. The investor wants to maximize utility every month during the out-of-sample period. The investor is assumed to have a CRRA (Constant Relative Risk Aversion) utility function, as displayed in equation (30). The variable $P_0$ is the initial wealth (set to 1 in this research), and the γ-parameter is the level of relative risk aversion (RRA). In this paper only the values RRA ∈ {1, 3, 7, 10, 20} are considered. These values are very common in the financial literature. Brandt (2009, p. 276) and Jondeau and Rockinger (2004) argue that, in order to capture the third and fourth moment, a utility function should be differentiable at least four times. The CRRA utility function is infinitely differentiable. Patton (2004) argues that nonincreasing absolute risk aversion, as is the case for the CRRA utility function, is a desirable property of a utility function.

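\begin{equation}
U(\gamma) = \begin{cases} (1 - \gamma)^{-1} \cdot \big(P_0 (1 + \omega_x X_t + \omega_y Y_t + \omega_z Z_t)\big)^{1-\gamma} & \text{if } \gamma \neq 1 \\ \log\big(P_0 (1 + \omega_x X_t + \omega_y Y_t + \omega_z Z_t)\big) & \text{if } \gamma = 1 \end{cases} \tag{30}
\end{equation}

The parameters of interest are $\omega_x$, $\omega_y$ and $\omega_z$. To optimize portfolio weights, the function in equation (31) needs to be maximized over these parameters. $\hat{F}_{t+1}$, $\hat{G}_{t+1}$ and $\hat{H}_{t+1}$ are the forecast distributions of the three assets (with densities $\hat{f}_{t+1}$, $\hat{g}_{t+1}$ and $\hat{h}_{t+1}$), while $\hat{C}_{t+1}$ is the forecast copula (with density $\hat{c}_{t+1}$). Together they form the joint forecast density $\hat{D}_{t+1}$. For the short-sales constrained investor the following holds: $\mathcal{W} = \{(\omega_x, \omega_y, \omega_z) \in [0, 1]^3 : \omega_x + \omega_y + \omega_z \leq 1\}$, and for the investor that is allowed small short positions the following holds: $\mathcal{W} = \{(\omega_x, \omega_y, \omega_z) \in [-1, 1]^3 : -1 \leq \omega_x + \omega_y + \omega_z \leq 1\}$.

\begin{align}
\omega^*_{t+1} &= \arg\max_{\omega \in \mathcal{W}} E_{\hat{D}_{t+1}}\big[U(1 + \omega_x X_t + \omega_y Y_t + \omega_z Z_t)\big] \notag \\
&= \arg\max_{\omega \in \mathcal{W}} \iiint U(1 + \omega_x x + \omega_y y + \omega_z z) \cdot \hat{f}_{t+1}(x) \cdot \hat{g}_{t+1}(y) \cdot \hat{h}_{t+1}(z) \notag \\
&\qquad \cdot \hat{c}_{t+1}\big(\hat{F}_{t+1}(x), \hat{G}_{t+1}(y), \hat{H}_{t+1}(z)\big) \, dx \, dy \, dz \tag{31}
\end{align}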

The formula that needs to be optimized does not have a closed-form solution (Patton, 2004). Therefore Monte Carlo integration is used: n = 10,000 Monte Carlo replications are simulated to create possible one-step-ahead asset movements using the derived conditional distributions. These 10,000 simulated asset returns are used to obtain the optimal portfolio weights that maximize expected utility. Note


that this optimization procedure is carried out for every point in time during the out-of-sample period.² Details on the exact optimization procedure can be found in the Appendix of Patton (2004).
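The Monte Carlo step can be illustrated as follows. This Python sketch is not the optimization routine of Patton (2004): it replaces the numerical optimizer with a plain grid search over the long-only weight set, takes the simulated return draws as given, and uses hypothetical function names. Initial wealth is set to 1, as in the text.

```python
import math

def crra_utility(wealth, gamma):
    """CRRA utility of end-of-period wealth; log utility when gamma equals 1."""
    if gamma == 1:
        return math.log(wealth)
    return wealth ** (1.0 - gamma) / (1.0 - gamma)

def long_only_grid(step=0.1):
    """All (wx, wy, wz) on a grid with weights in [0, 1] and sum at most 1."""
    k = round(1.0 / step)
    return [(i * step, j * step, m * step)
            for i in range(k + 1)
            for j in range(k + 1 - i)
            for m in range(k + 1 - i - j)]

def optimal_weights(draws, gamma, step=0.1):
    """Grid point with the highest average utility over the simulated draws.

    draws is a list of (x, y, z) simulated one-period returns, each above -1.
    """
    def expected_utility(w):
        return sum(crra_utility(1.0 + w[0] * x + w[1] * y + w[2] * z, gamma)
                   for x, y, z in draws) / len(draws)
    return max(long_only_grid(step), key=expected_utility)
```

The average over the draws approximates the integral in equation (31). A finer grid or a constrained numerical optimizer would refine the weights. Allowing short positions would require a different grid and a check that simulated wealth stays positive.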

The optimization procedure described above is used for the copula models using conditional distributions. Optimizing the conditional multivariate normal model works in a similar way.³ For the unconditional trivariate normal model an asset-specific mean, an asset-specific variance and a correlation structure are calculated at the beginning of the out-of-sample period, using the 600 datapoints of the in-sample period. Optimal portfolio weights are constructed as follows. A Monte Carlo simulation of the trivariate normal distribution with N = 100,000 draws is performed with the unconditional parameters for the means and (co)variances. Portfolio weights are then chosen such that the average CRRA utility over the 100,000 simulations is maximized. The obtained portfolio weights are not updated throughout the out-of-sample period.

In the next section the twelve different models are compared for all five possible values of the γ-parameter.

5 Results and model comparison

In this section a total of twelve different investment strategies are examined: (1) always hold the large cap; (2) always hold the mid cap; (3) always hold the small cap; (4) always hold an equally weighted mix of the three assets; (5) use the unconditional multivariate normal distribution of asset returns to determine the optimal asset allocation, abbreviated MVNu; (6) use the conditional multivariate normal distribution of asset returns to determine the optimal asset allocation, abbreviated MVNc; (7) use the Gaussian copula (NormCop) with skewed t-distributed marginals; and (8) use the t-copula (tCop) with skewed t-distributed marginals. Strategies (9) to (12) are the same as (5) to (8), but with small short positions allowed.

5.1 Performance of different strategies

In Table 3 the monthly realized portfolio return statistics for the out-of-sample period are displayed. The statistics given are the mean return, the standard deviation of the return, the Sharpe ratio (mean divided by standard deviation), skewness, Value at Risk (VaR) and Expected Shortfall (ES). The results are quite similar to those of Patton (2004), with the exception of the portfolios where short positions are allowed. In this research short positions are bounded at one times the initial wealth, whereas in Patton's research unlimited short positions are allowed. Unbounded short positions result in extreme monthly returns, but at the cost of very high standard deviations. It can be seen that the models with conditional parameters (MVNc, NormCop and tCop) in all cases outperform the models without conditional parameters (the naive portfolios and MVNu) regarding the Sharpe ratio. This means that using the more advanced models (with monthly rebalancing of the portfolio) yields higher average returns per unit of standard deviation.⁴ This result is in accordance with the results of Patton. Furthermore, an advantage of the use of copula models over

² As input for the simulations from the t-copula a degrees-of-freedom parameter is required. In this research this parameter is unconditional and fixed at the value 3. The reason for this value is to create maximal contrast between the Gaussian copula (without strong tail dependence) and the t-copula (with strong tail dependence).

³ In this model the one-step-ahead forecast for the correlation is calculated as the unconditional correlation over the last year, using observed monthly returns. Combining these correlation estimates with the forecasts for the standard deviations (estimated by the GJR-GARCH models earlier in this research) results in an estimated covariance matrix. This covariance matrix, combined with the forecast expected returns, is used to simulate 10,000 replications from the trivariate normal distribution. Optimization proceeds as described in the text.

⁴ It is important to note that this conclusion applies to frictionless markets. Transaction costs are not at all taken into account.


the use of the conditional multivariate normal model is the fact that risk-averse investors are able to achieve lower Expected Shortfall and Value at Risk. This property may result from the more appropriate modelling of skewness and kurtosis. The effect is largest in the situation where short-sales are allowed. All three models using conditional moments are able to construct portfolios with positively skewed returns. Positive skewness is a desirable property of portfolios. This positive skewness is also in strong contrast with the results of Patton (2004), who finds very little positive skewness in his portfolios.
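The statistics reported in Table 3 can be computed from a series of realized monthly returns along the following lines. This Python sketch is illustrative: the Sharpe ratio follows the definition used in the text (mean divided by standard deviation), while the empirical quantile convention behind the VaR and ES figures is an assumption, since it is not stated in the text, and the function name is hypothetical.

```python
import math

def performance_summary(returns, level=0.05):
    """Mean, standard deviation, Sharpe ratio, skewness, VaR and ES of returns.

    VaR and ES are reported as positive loss figures. The tail consists of the
    floor(level * n) worst observations (at least one); this is one of several
    common empirical conventions.
    """
    n = len(returns)
    mean = sum(returns) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in returns) / (n - 1))
    skew = sum(((r - mean) / std) ** 3 for r in returns) / n
    ordered = sorted(returns)
    k = max(1, math.floor(level * n))
    tail = ordered[:k]
    return {
        "mean": mean,
        "std": std,
        "sharpe": mean / std,
        "skewness": skew,
        "VaR": -tail[-1],        # loss at the boundary of the tail
        "ES": -sum(tail) / k,    # average loss within the tail
    }
```

With 120 out-of-sample months and `level=0.05`, the tail contains the six worst monthly returns under this convention.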

To allow for a closer comparison between a two-asset market and a three-asset market, the results of Table 3 are replicated for a two-asset market, using the same dataset. These results are to be found in Table 4. In the situation where short-sales are not allowed, the use of the proposed models in a two-asset market yields higher average returns, at the cost of higher risk. In a situation where short-sales are allowed, the use of the proposed models in a three-asset market yields higher returns at the cost of higher standard deviations. Sharpe ratios in both the two-asset and the three-asset market are comparable.

It seems likely that better portfolio performance is obtained by rendering parameters conditional rather than by modelling skewness and kurtosis. However, the significant reduction of Expected Shortfall and Value at Risk for risk-averse investors, while maintaining the same Sharpe ratio as the MVNc model, is a desirable property that possibly results from modelling skewness and kurtosis.

In Table 5 management fees are displayed for both the two-asset market and the three-asset market. This table displays the monthly fee an investor is willing to pay to switch from the unconditional multivariate distribution model to another model, taking the average return into account. The management fees displayed in this table in no way account for risk (for example, standard deviations). For example, an investor in a three-asset market with relative risk aversion 7 is willing to give up .56 percentage points per month to become indifferent between the normal copula model and the benchmark. It can easily be seen that, regarding average return, the unconditional models (both short-sales constrained and short-sales allowed) are strongly outperformed by all models with conditional parameters. This conclusion holds for both markets. Furthermore, it is notable that, especially for risk-averse investors, holding an equally weighted portfolio is not a bad alternative in terms of economic gains. Given that in real life transaction costs are incurred when rebalancing the portfolio, this may even be the preferable option. As stated before, in a market where short positions are not allowed the models perform better in a two-asset market. When short-sales are allowed the models perform better in a three-asset market. For both cases the conclusion that conditional models are preferable over unconditional models still holds. Whether the results described in this section are economically significant is up to the investor. In the next section statistical significance is examined.

5.2 Statistical significance

In this section the statistical significance of the differences in performance is examined. As in Patton (2004), bootstrap methods are used. Pairwise comparisons are conducted by looking at the bootstrap confidence intervals of the difference between a performance measure of portfolio i and the same performance measure of portfolio j. Since the investor in this paper aims to maximize utility using a CRRA utility function, realized utility is the performance measure of interest. Let $\mu_{i,t}$ be the realized utility of portfolio i at time t, and let $\mu_{j,t}$ be the realized utility of portfolio j at time t. The time index t runs through the full out-of-sample period, yielding 120 pairs of portfolio performance measures for each possible pair of portfolios. Due to a relatively


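Table 3: Realized portfolio return statistics (three assets)

Naive portfolios (these do not depend on the level of risk aversion)

| Statistic | Large | Mid | Small | Mix |
|---|---|---|---|---|
| Mean | .46 | .98 | .91 | .78 |
| Std Dev | 3.97 | 5.35 | 6.33 | 5.05 |
| Sharpe ratio | .12 | .18 | .14 | .16 |
| Skewness | -.81 | -.56 | -.27 | -.63 |
| 5% VaR | 8.19 | 8.76 | 9.53 | 8.47 |
| 5% ES | 9.79 | 12.10 | 14.00 | 11.64 |

Model-based portfolios

| RRA | Statistic | No short: MVNu | No short: MVNc | No short: NormCop | No short: tCop | Short allowed: MVNu | Short allowed: MVNc | Short allowed: NormCop | Short allowed: tCop |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Mean | .91 | 1.25 | 1.26 | 1.23 | 1.24 | 2.71 | 2.70 | 2.58 |
| 1 | Std Dev | 6.33 | 5.47 | 5.45 | 5.43 | 9.15 | 9.27 | 9.09 | 8.88 |
| 1 | Sharpe ratio | .14 | .23 | .23 | .23 | .14 | .29 | .30 | .29 |
| 1 | Skewness | -.27 | .36 | .36 | .37 | .20 | 1.37 | 1.43 | 1.21 |
| 1 | 5% VaR | 9.53 | 8.82 | 8.82 | 8.82 | 13.18 | 10.96 | 10.95 | 10.40 |
| 1 | 5% ES | 14.00 | 10.02 | 10.02 | 10.02 | 19.07 | 13.96 | 14.05 | 13.81 |
| 3 | Mean | .76 | 1.15 | 1.10 | 1.14 | .67 | 2.19 | 2.10 | 2.02 |
| 3 | Std Dev | 5.39 | 5.06 | 5.02 | 5.06 | 5.32 | 7.95 | 7.83 | 7.66 |
| 3 | Sharpe ratio | .14 | .23 | .22 | .23 | .13 | .28 | .27 | .26 |
| 3 | Skewness | -.50 | .35 | .20 | .35 | -.47 | .54 | .64 | .75 |
| 3 | 5% VaR | 8.96 | 8.76 | 8.76 | 8.75 | 8.70 | 10.75 | 10.99 | 10.52 |
| 3 | 5% ES | 12.24 | 9.53 | 9.58 | 9.53 | 12.03 | 12.97 | 13.28 | 12.14 |
| 7 | Mean | .39 | .96 | .95 | .93 | .34 | 1.83 | 1.60 | 1.48 |
| 7 | Std Dev | 2.83 | 4.04 | 3.95 | 4.00 | 2.74 | 5.94 | 5.52 | 5.27 |
| 7 | Sharpe ratio | .14 | .24 | .24 | .23 | .13 | .31 | .29 | .28 |
| 7 | Skewness | -.62 | -.11 | -.00 | .06 | -.60 | .23 | .66 | .37 |
| 7 | 5% VaR | 4.88 | 6.78 | 6.88 | 6.84 | 4.78 | 8.81 | 7.61 | 7.35 |
| 7 | 5% ES | 6.52 | 7.96 | 8.07 | 7.99 | 6.37 | 9.64 | 8.66 | 8.57 |
| 10 | Mean | .27 | .89 | .83 | .85 | .22 | 1.63 | 1.43 | 1.23 |
| 10 | Std Dev | 1.94 | 3.54 | 3.36 | 3.25 | 1.92 | 5.09 | 4.59 | 4.31 |
| 10 | Sharpe ratio | .14 | .25 | .25 | .26 | .12 | .32 | .31 | .29 |
| 10 | Skewness | -.61 | -.05 | .01 | -.01 | -.57 | .15 | .65 | .25 |
| 10 | 5% VaR | 3.33 | 5.69 | 5.37 | 4.72 | 3.28 | 7.82 | 5.17 | 5.15 |
| 10 | 5% ES | 4.47 | 6.99 | 6.99 | 6.81 | 4.47 | 8.61 | 6.93 | 7.21 |
| 20 | Mean | .13 | .58 | .49 | .48 | .12 | 1.16 | .97 | .74 |
| 20 | Std Dev | .95 | 2.36 | 1.89 | 1.86 | .94 | 3.48 | 2.77 | 2.45 |
| 20 | Sharpe ratio | .14 | .25 | .26 | .26 | .12 | .33 | .35 | .30 |
| 20 | Skewness | -.60 | -.10 | .07 | .04 | -.58 | .22 | .89 | .56 |
| 20 | 5% VaR | 1.61 | 3.11 | 2.75 | 2.53 | 1.62 | 4.44 | 2.27 | 2.71 |
| 20 | 5% ES | 2.18 | 4.93 | 3.83 | 3.79 | 2.17 | 6.03 | 3.79 | 3.60 |

* MVNu is the unconditional multivariate normal model. This model uses unconditional values for the (co)variances and means. MVNc is the multivariate normal model with conditional means and (co)variances. The mean and variance equations are as specified before.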


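Table 4: Realized portfolio return statistics (two assets, the 'mid cap' is omitted)

Columns marked "(no short)" refer to portfolios for which short positions are not allowed; columns marked "(short)" refer to portfolios for which short positions are allowed.

Single assets:

| Statistic | Large | Small | Mix |
|---|---|---|---|
| Mean | .46 | .91 | .68 |
| Std Dev | 3.97 | 6.33 | 4.97 |
| Sharpe ratio | .12 | .14 | .14 |
| Skewness | -.81 | -.27 | -.62 |
| 5% VaR | 8.19 | 9.53 | 8.56 |
| 5% ES | 9.79 | 14.00 | 11.44 |

Model portfolios:

| RRA | Statistic | MVNu (no short) | MVNc (no short) | NormCop (no short) | tCop (no short) | MVNu (short) | MVNc (short) | NormCop (short) | tCop (short) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Mean | .91 | 2.20 | 2.01 | 2.13 | 1.32 | 2.18 | 1.88 | 2.16 |
| 1 | Std Dev | 6.33 | 7.76 | 7.58 | 7.71 | 9.29 | 7.75 | 7.50 | 7.74 |
| 1 | Sharpe ratio | .14 | .28 | .27 | .28 | .14 | .28 | .25 | .28 |
| 1 | Skewness | -.27 | 1.26 | 1.35 | 1.32 | .17 | 1.26 | 1.39 | 1.27 |
| 1 | 5% VaR | 9.53 | 10.33 | 10.33 | 9.51 | 13.40 | 10.33 | 10.26 | 10.33 |
| 1 | 5% ES | 14.00 | 12.07 | 12.07 | 11.84 | 19.40 | 12.07 | 11.96 | 12.07 |
| 3 | Mean | .74 | 1.77 | 1.83 | 1.65 | .75 | 1.77 | 1.76 | 1.55 |
| 3 | Std Dev | 5.31 | 6.78 | 6.64 | 6.59 | 5.37 | 6.81 | 6.69 | 6.59 |
| 3 | Sharpe ratio | .14 | .26 | .28 | .25 | .14 | .26 | .26 | .24 |
| 3 | Skewness | -.52 | .50 | .71 | .50 | -.50 | .57 | .68 | .53 |
| 3 | 5% VaR | 8.78 | 10.33 | 9.16 | 9.51 | 8.91 | 10.33 | 9.37 | 9.61 |
| 3 | 5% ES | 12.07 | 11.83 | 11.41 | 11.81 | 12.19 | 11.77 | 11.50 | 11.79 |
| 7 | Mean | .39 | 1.47 | 1.34 | 1.37 | .38 | 1.49 | 1.32 | 1.25 |
| 7 | Std Dev | 2.82 | 5.19 | 4.89 | 4.87 | 2.74 | 5.19 | 4.89 | 4.83 |
| 7 | Sharpe ratio | .14 | .28 | .27 | .28 | .14 | .29 | .27 | .26 |
| 7 | Skewness | -.62 | .44 | .70 | .56 | -.60 | .46 | .68 | .50 |
| 7 | 5% VaR | 4.87 | 7.18 | 5.98 | 6.70 | 4.68 | 7.43 | 6.42 | 6.37 |
| 7 | 5% ES | 6.50 | 8.87 | 8.34 | 8.25 | 6.29 | 8.82 | 8.57 | 8.53 |
| 10 | Mean | .27 | 1.29 | 1.19 | 1.12 | .27 | 1.30 | 1.11 | 1.05 |
| 10 | Std Dev | 1.95 | 4.35 | 3.96 | 3.90 | 1.96 | 4.35 | 4.02 | 3.84 |
| 10 | Sharpe ratio | .14 | .30 | .30 | .29 | .14 | .30 | .28 | .27 |
| 10 | Skewness | -.63 | .57 | .98 | .70 | -.62 | .55 | .80 | .53 |
| 10 | 5% VaR | 3.38 | 5.77 | 4.68 | 4.82 | 3.39 | 5.78 | 4.94 | 5.02 |
| 10 | 5% ES | 4.50 | 6.92 | 5.83 | 6.24 | 4.53 | 6.97 | 6.37 | 6.55 |
| 20 | Mean | .13 | .84 | .74 | .68 | .13 | .83 | .73 | .67 |
| 20 | Std Dev | .94 | 2.77 | 2.40 | 2.21 | .96 | 2.77 | 2.37 | 2.17 |
| 20 | Sharpe ratio | .14 | .30 | .31 | .31 | .14 | .30 | .31 | .31 |
| 20 | Skewness | -.61 | .71 | 1.25 | .99 | -.62 | .72 | 1.33 | .88 |
| 20 | 5% VaR | 1.62 | 3.59 | 2.25 | 2.31 | 1.67 | 3.55 | 2.13 | 2.58 |
| 20 | 5% ES | 2.17 | 4.53 | 3.10 | 2.97 | 2.22 | 4.50 | 2.99 | 3.09 |

* MVNu is the unconditional multivariate normal model. This model uses unconditional values for the (co)variances and means. MVNc is the multivariate normal model with conditional means and (co)variances. The mean and variance equations are as specified before.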


Table 5: Management fees in monthly percentage points

| | 3 assets, RRA = 1 | 3 | 7 | 10 | 20 | 2 assets, RRA = 1 | 3 | 7 | 10 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|
| Large cap | -.45 | -.30 | .07 | .19 | .33 | -.45 | -.28 | -.06 | .19 | .33 |
| Mid cap | -.30 | .22 | .59 | .71 | .85 | - | - | - | - | - |
| Small cap | 0 | .15 | .52 | .64 | .78 | 0 | .17 | .52 | .64 | .78 |
| Mix | -.13 | .02 | .39 | .51 | .65 | -.23 | -.06 | .29 | .41 | .55 |
| Short positions not allowed | | | | | | | | | | |
| MVNu | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| MVNc | .34 | .39 | .57 | .62 | .45 | 1.29 | 1.03 | 1.08 | 1.02 | .71 |
| NormCop | .35 | .34 | .56 | .56 | .36 | 1.10 | 1.09 | .95 | .92 | .61 |
| tCop | .32 | .38 | .54 | .58 | .35 | 1.22 | .91 | .98 | .85 | .55 |
| Short positions allowed | | | | | | | | | | |
| MVNu | .33 | -.09 | -.05 | -.05 | -.01 | .41 | .01 | .01 | 0 | 0 |
| MVNc | 1.80 | 1.43 | 1.44 | 1.36 | 1.03 | 1.27 | 1.03 | 1.10 | 1.03 | .70 |
| NormCop | 1.79 | 1.34 | 1.21 | 1.16 | .86 | .97 | 1.02 | .93 | .84 | .60 |
| tCop | 1.67 | 1.26 | 1.09 | .96 | .61 | 1.25 | .81 | .86 | .78 | .54 |

* MVNu is the unconditional multivariate normal model. This model uses unconditional values for the (co)variances and means. MVNc is the multivariate normal model with conditional means and (co)variances. The mean and variance equations are as specified before.

small sample size and the complicated theoretical distribution of the statistic of interest, bootstrapping is a suitable way of obtaining test statistics. Since time-series data of asset returns is stationary and almost surely contains weak autocorrelation, the stationary bootstrap of Politis and Romano (1994) is employed. For every model comparison, B = 10,000 bootstrap samples are drawn: the original sample of 120 observations is resampled (with replacement) 10,000 times to form new sets of 120 observations. To maintain the autocorrelation structure that is present in the data (examined by means of the PACF), this bootstrap method draws blocks of observations instead of individual observations. The block length is distributed as a geometric(1/p) random variable, where p is the average block length. For each model comparison, the average block length is set to the largest significant lag of both return series (at most 36 lags are considered). This resulted in an average block length varying from 24 to 36.

For each newly drawn set the mean return is calculated, resulting in 10,000 bootstrapped mean returns. These are sorted in ascending order. A 90% confidence interval is constructed by taking the 500th value (5% of the bootstrapped means are lower than this value) as the lower bound and the 9,500th value (5% of the bootstrapped means are higher than this value) as the upper bound. When the upper bound of the confidence interval is below zero, model j is concluded to statistically outperform model i. When the lower bound of the confidence interval is above zero, model i is concluded to statistically outperform model j. When zero lies inside the confidence interval, no decisive conclusion can be drawn from the test.
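The resampling and decision rule described above can be sketched in Python as follows. This is a minimal illustration and not the code used for the thesis. It assumes that the bootstrapped series `diff` holds the 120 differences between the performance measures of model i and model j (measure of i minus measure of j), which is a guess consistent with the decision rule, and it wraps blocks around the end of the sample, as is common for the stationary bootstrap. The function names `stationary_bootstrap_indices`, `bootstrap_mean_ci` and `decide` are hypothetical.

```python
import numpy as np


def stationary_bootstrap_indices(n, avg_block_len, rng):
    """Draw one resampled index path of length n.

    At each step a new block starts with probability 1/avg_block_len,
    so block lengths are geometric with mean avg_block_len.
    """
    new_block = rng.random(n) < 1.0 / avg_block_len
    new_block[0] = True                      # the first observation always starts a block
    starts = rng.integers(0, n, size=n)      # candidate starting points for new blocks
    idx = np.empty(n, dtype=int)
    for t in range(n):
        if new_block[t]:
            idx[t] = starts[t]
        else:
            idx[t] = (idx[t - 1] + 1) % n    # continue the block, wrapping around (assumption)
    return idx


def bootstrap_mean_ci(diff, avg_block_len, n_boot=10_000, level=0.90, seed=0):
    """Percentile confidence interval for the mean of diff."""
    rng = np.random.default_rng(seed)
    diff = np.asarray(diff, dtype=float)
    n = diff.size
    means = np.empty(n_boot)
    for b in range(n_boot):
        means[b] = diff[stationary_bootstrap_indices(n, avg_block_len, rng)].mean()
    means.sort()
    k = int(round(n_boot * (1.0 - level) / 2.0))   # 500 for n_boot=10,000 and level=0.90
    # k-th and (n_boot - k)-th sorted values, counted from 1 as in the text
    return means[k - 1], means[n_boot - k - 1]


def decide(lower, upper):
    """Apply the decision rule to a confidence interval for mean(diff)."""
    if upper < 0:
        return "model j outperforms model i"
    if lower > 0:
        return "model i outperforms model j"
    return "inconclusive"
```

With a series of 120 differences, a call such as `decide(*bootstrap_mean_ci(diff, avg_block_len=24))` returns the verdict for one model pair; the average block length of 24 is taken from the range reported above.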

In Table 6 the results of the bootstrap comparisons are displayed. For every pair the significantly better model is given inside the matrix. The full matrix is symmetric with inverse signs in the upper half, so only the lower half is displayed. Regions where comparison is most sensible are framed. In the main text of this paper only the result table for RRA = 1 is displayed. Tables with results for other levels of relative risk aversion are to
