
University of Amsterdam Business School

Master in International Finance

Master Thesis

A REVIEW OF GAUSSIAN AND COPULA-GARCH

AGGREGATION TECHNIQUES FOLLOWING THE 2008

FINANCIAL CRISIS

Student: Dewald Kemp

Student number: 11081775

Thesis Supervisor: Prof. Stefan Arping

Date: August 2016

Abstract

The use of Gaussian assumptions has been shown to underestimate market risk when considering risk measures such as value at risk. Despite this, such assumptions are commonly made by practitioners. This thesis compares a risk aggregation methodology that applies Gaussian assumptions directly to stock returns with the GARCH-Normal and GARCH-t models. We apply these models to a portfolio of 10 sector-based indices created from S&P500 constituents to estimate the portfolio’s value at risk. Estimates are generated and back-tested over an observation period between April 1998 and December 2015. We show clearly that the direct application of Gaussian assumptions to stock returns results in an underestimation of market risk. Furthermore, we find that both the GARCH-Normal and GARCH-t models provide more robust value at risk estimates and recommend that these models be applied as an alternative to conventional aggregation methodologies that rely on underlying Gaussian assumptions.


Acknowledgements

First and foremost I would like to extend my thanks to the staff of the Amsterdam Business School, in particular my supervisor professor Stefan Arping for their guidance in completing this thesis.

I would also like to thank the Barclays Africa Group, in particular Ina de Vry, Jennifer Sathabridg, Dr Frederik van der Walt and Chantelle Houston-McMillan for their support, both financial and otherwise, without which I would not have been able to complete this degree. I will be forever in your debt.


Table of Contents

1. Introduction
2. Literature review
2.1 Quantifying market risk
2.2 Risk aggregation under Gaussian assumptions
2.3 Modelling time dependent volatility
2.4 Copula theory
2.5 Parameterising copulas
2.5.1 Measures of correlation
2.5.2 Parameterising Gaussian copulas
2.5.3 Parameterising t copulas
2.6 Sampling from Copulas
2.6.1 Algorithm for sampling from a Gaussian copula
2.6.2 Algorithm for sampling from a t copula
2.6.3 Transforming uniformly distributed marginal distributions
2.7 The binomial test
3. Methodology
3.1 Data gathering and preparation
3.2 The Gaussian model
3.3 The GARCH-normal model
3.4 The GARCH-t model
4. Data and descriptive statistics
4.1 The S&P500
4.2 Daily returns and GARCH residuals
5. Results
5.1 Back testing of risk aggregation methodologies
5.2 Probability assigned to the crisis
6. Limitations and considerations for practical implementation
6.1 Grouping stocks by sector
6.2 Modelling time dependent volatility
6.3 Model inputs
6.4 Marginal distributions
7. Conclusion
8. List of figures
9. List of tables
10. References


1. Introduction

Obtaining some measure of the total risk inherent in a portfolio of assets is at the core of risk-oriented modelling efforts in finance. Risk estimates are used by both practitioners and regulators to manage and control the amount of risk inherent in portfolios. The concept of measuring the total risk inherent in a portfolio dates back to modern portfolio theory as put forth by Markowitz (1952), which suggests that investors should hold a diversified portfolio that maximises return for some specified level of risk. This theory aims to provide an optimal risk-return trade-off in a mean-variance space where portfolio risk is estimated as the weighted sum of the elements of the variance-covariance matrix. Note that the results produced by the aforementioned procedure rely on the underlying assumption that asset returns have both marginal and joint normal distributions (Sweetling, 2011).

Empirical studies assessing the marginal distributions of stock returns have, however, shown that such assumptions of normality do not hold. Early results include a study by Fama (1965) that found distributions of stock returns exhibited higher peaks than would be expected under a Gaussian distribution (i.e. stock returns are leptokurtic). More importantly, stock return distributions also exhibited high frequencies in the tails of their distributions. This implies a higher likelihood of observing extreme events than expected under Gaussian assumptions. More recent studies have shown similar results, reinforcing the notion of non-Gaussian stock returns. Possible alternatives to the Gaussian distribution have also been suggested, including mixtures of two or three Gaussian distributions (Kon, 1984) and scaled t distributions (Aparicio & Estrada, 2001). Ang and Chen (2000) show that correlations between stocks increase during times of market turmoil, a phenomenon referred to as tail dependence. This asymmetric dependence pattern contradicts the assumption of elliptical dependence patterns that underlies the joint normal distribution.

It has been shown that incorrectly assuming Gaussian stock returns will lead to the underestimation of portfolio risk. Peters (1991) found that the probability of observing a three-sigma event under the empirical distribution of stock returns is almost two times as large as the same probability estimated assuming a Gaussian returns distribution. Even though a substantial body of empirical research suggests that stock returns violate Gaussian assumptions, portfolio selection and risk aggregation techniques based on such assumptions are still commonly used in practice.


This thesis compares a risk aggregation methodology that relies on Gaussian assumptions with alternatives based on more modest assumptions when quantifying the risk inherent in a portfolio of stocks. We conduct this investigation by back-testing and comparing value at risk estimates generated by a conventional Gaussian risk aggregation approach applied directly to stock returns with value at risk estimates generated by GARCH-normal and GARCH-t models. These models rely on the use of copulas. We assess the performance of these models under the market conditions prevalent before, during and after the 2008 global financial crisis.

A literature review of relevant topics relating to each of the three risk aggregation methodologies is presented in section 2, followed by a discussion of the methodology applied in obtaining and back-testing value at risk estimates from each of these models in section 3. Section 4 provides the reader with some background regarding the data used in building each of the three models before the results of our evaluation are presented in section 5. Some limitations of this investigation and considerations for the practical implementation of these models are discussed in section 6, before we present our conclusions in section 7.


2. Literature review

This section introduces the theoretical concepts underlying the three risk aggregation methodologies under investigation. We start by introducing market risk and discuss measures that attempt to quantify this risk, focussing specifically on value at risk, after which the conventional methodology of aggregating risk under Gaussian assumptions is discussed. Next the groundwork is laid for subsequent discussions on the GARCH-normal and GARCH-t models by introducing copula theory as well as considering means of capturing correlation and modelling time dependent volatility. Following these introductory topics we discuss the process of parameterising and sampling from both the Gaussian and t copulas and provide the algorithms that were used in the models under consideration. We conclude by introducing the binomial test that will be used in back-testing value at risk thresholds generated by each of the three risk aggregation methodologies.

2.1 Quantifying market risk

Market risk can be defined as the risk arising from exposures to capital markets (Sweetling, 2011). Although several drivers of market risk might be identified, including firm-specific factors as well as macroeconomic factors such as inflation, interest rates and economic growth, realisations of market risk are most readily observed in adverse movement in the share prices of a particular stock. For this reason, efforts to quantify market risk usually entail arriving at some summary statistic based on the history of a share or index. Although standard deviation is often used to quantify the volatility associated with a particular stock or portfolio, this measure is not easily interpreted and offers little information regarding extreme events (McNeil, Frey, & Embrechts, 2005). For this reason, value at risk is widely used by practitioners as a measure to quantify market risk.

Value at risk refers to the threshold loss for a particular stock or portfolio over a specified time period such that losses will not exceed this threshold in more than some specified percentage of hypothetical future states (Sweetling, 2011). For example, if the 95% 10-day value at risk for a particular portfolio is $1 million, it should be interpreted that the portfolio should not suffer losses exceeding $1 million in more than 5% of hypothetical future 10-day periods. A formal mathematical definition for value at risk is given by

$$\mathrm{VaR}_{\alpha} = \inf\{\, l \in \mathbb{R} : P(L > l) \le 1 - \alpha \,\} , \quad (1)$$

where portfolio losses are denoted by 𝐿 and the level of confidence is denoted by 𝛼 (Artzner, Delbaen, Eber, & Heath, 2011).

Although this measure is widely used it has several limitations. Firstly, it is important to bear in mind that value at risk estimates will only be as reliable as the data used in arriving at these estimates and that historical data might not be appropriate for predicting future conditions. Furthermore, value at risk estimates only describe the loss threshold at a specific confidence level, without providing any information on more severe losses. Alternative measures such as expected shortfall are more appropriate for this purpose (McNeil, Frey, & Embrechts, 2005). Finally, it should be noted that a key step in arriving at a value at risk estimate is choosing some statistical distribution for future losses or returns, a subjective decision made by the modeller. The choice of statistical distribution greatly influences value at risk estimates, especially when considering tail events (Sweetling, 2011). For a portfolio of stocks, this process is more involved since both the marginal distributions of the returns of individual stocks as well as the correlations that exist between stocks in a portfolio should be accounted for (Aparicio & Estrada, 2001). Methodologies for aggregating market risk across stocks in a portfolio will be considered in subsequent sections.

2.2 Risk aggregation under Gaussian assumptions

Modern portfolio theory as put forth by Markowitz (1952) provides a rudimentary framework for aggregating market risk across a portfolio of stocks. Under this framework the correlation is captured by estimating the covariance between stocks. If it is assumed that all portfolio constituents are jointly normally distributed and that the expected return of any security will be zero in the short term, the value at risk is easily estimated. Let 𝑅 denote a 𝑡 × 𝑑 matrix that contains the returns of 𝑑 stocks over 𝑡 time periods. Under the assumption that the average return for each of the 𝑑 stocks is zero, a covariance matrix denoted Σ is calculated as

$$\Sigma = \frac{R^{T}R}{t-1} . \quad (2)$$

Next let 𝒘 be a vector containing the weights of the allocation to each of the 𝑑 stocks; then the portfolio standard deviation can be calculated as

$$\sigma_p = \sqrt{\boldsymbol{w}^{T}\,\Sigma\,\boldsymbol{w}} . \quad (3)$$


The portfolio value at risk is then estimated as

$$\mathrm{VaR}_{1-\alpha} = -\sigma_p\,\Phi^{-1}(1-\alpha) , \quad (4)$$

where Φ−1() is the inverse of the Gaussian distribution function.
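To make the calculation concrete, a minimal Python sketch of equations (2) to (4) is given below; the returns matrix, weights and confidence level are illustrative placeholders rather than data or choices from this thesis.

```python
import numpy as np
from scipy.stats import norm

def gaussian_var(returns, weights, alpha=0.95):
    """Portfolio VaR under joint-normal, zero-mean assumptions (equations (2)-(4))."""
    cov = returns.T @ returns / (len(returns) - 1)   # equation (2), zero-mean assumption
    sigma_p = np.sqrt(weights @ cov @ weights)       # equation (3)
    return -sigma_p * norm.ppf(1 - alpha)            # equation (4)

# Hypothetical example: 500 days of simulated returns on 10 assets, equal weights
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=(500, 10))
weights = np.full(10, 0.1)
print(gaussian_var(returns, weights, alpha=0.99))
```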

Several empirical studies that analyse the distributions of stock returns have found that Gaussian assumptions are violated. These include early studies by Mandelbrot (1963) and Fama (1965) as well as a more recent study by Aparicio and Estrada (1997) where statistical tests rejected Gaussian assumptions. In fact, financial time series exhibit higher densities in the tails of their distributions than is associated with the Gaussian distribution. Furthermore, these series were found to be leptokurtic, with the degree of leptokurtosis decreasing as the period over which returns are calculated increases (Sweetling, 2011). Finally we note that although little evidence of serial correlation is observed in return series, squared or absolute returns show strong evidence of serial correlation, which supports the hypothesis that the volatility of stock prices varies over time (Sweetling, 2011). This so-called volatility clustering can also be observed when plotting return series. Because of the aforementioned attributes of returns, risk aggregation done under Gaussian assumptions will result in inaccurate estimates for risk measures, as will be discussed further in subsequent sections.

2.3 Modelling time dependent volatility

As discussed in Section 2.2, financial time series exhibit changes in volatility over time, an effect known as volatility clustering. As such, the assumption that returns are identically distributed over time is violated and it is therefore important to model this changing volatility in order to obtain more robust estimates of market risk statistics. The Autoregressive Conditional Heteroskedasticity (ARCH) model first proposed by Engle (1982) attempts to capture this effect in a framework where error terms are dependent on the volatility of a time series. Numerous extensions of this model have been proposed by several authors, most notably the Generalised Autoregressive Conditional Heteroskedasticity (GARCH) model that is commonly used by practitioners. Under the GARCH model the stock return at time 𝑡, denoted by 𝑋𝑡, is modelled using a time dependent innovation term 𝑍𝑡 such that

$$X_t = Z_t . \quad (5)$$

This innovation term is in turn comprised of two components, namely a white noise process 𝜖𝑡 and a time dependent volatility component 𝜎𝑡 such that

$$Z_t = \sigma_t\,\epsilon_t . \quad (6)$$

The volatility at time 𝑡 is modelled as a linear combination of 𝑝 previous squared returns and 𝑞 previous squared volatilities such that

$$\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i X_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2 , \quad (7)$$

with 𝜔 being some constant. By using an appropriate number of squared returns and squared volatilities, a series of identically distributed error terms 𝜖𝑡 is produced that is in turn used as input to models such as copulas (Jaworski, Durante, Hardle, & Rychlik, 2009). This will be discussed in the following section.
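As an illustration, the GARCH(1,1) special case of equation (7) can be written as a short recursion; the sketch below assumes numpy and purely illustrative parameter values, which would in practice be obtained by maximum likelihood.

```python
import numpy as np

def garch11_filter(x, omega, alpha, beta):
    """GARCH(1,1) variance recursion of equation (7) with p = q = 1, returning the
    conditional volatilities and the standardised residuals of equation (6)."""
    sigma2 = np.empty_like(x)
    sigma2[0] = x.var()                       # common choice of starting value
    for s in range(1, len(x)):
        sigma2[s] = omega + alpha * x[s - 1] ** 2 + beta * sigma2[s - 1]
    sigma = np.sqrt(sigma2)
    return sigma, x / sigma                   # epsilon_t = X_t / sigma_t

# Illustrative parameter values only; in practice they come from maximum likelihood
rng = np.random.default_rng(1)
returns = rng.normal(0.0, 0.01, 1000)
vol, eps = garch11_filter(returns, omega=1e-6, alpha=0.08, beta=0.90)
```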

2.4 Copula theory

A copula can be defined as a multivariate probability distribution for which the marginal distributions of each of the variables are uniform. The use of copulas provides a means of modelling multivariate dependencies where the explicit forms of joint marginal distributions are not known. Sklar’s theorem shows that any multivariate joint distribution can be written in terms of a set of univariate marginal distribution functions and a copula that describes the dependence between such variables. Thus for every multivariate distribution 𝑭 with marginal distributions 𝐹1… 𝐹𝑑 there exists a copula 𝐶 such that

𝐹(𝑥1, … , 𝑥𝑑) = 𝐶(𝐹1(𝑥1), … , 𝐹𝑑(𝑥𝑑)) . (8)

This can then be used to extract the copula from the joint distribution 𝑭 and its marginal distributions 𝐹1… 𝐹𝑑 for any random vector 𝑿 = (𝑋1, … , 𝑋𝑑), such that

$$C(u_1, \ldots, u_d) = F\!\left(F_1^{-1}(u_1), \ldots, F_d^{-1}(u_d)\right) , \quad (9)$$

with 𝑢1… 𝑢𝑑 ~ 𝑈(0,1) and 𝐹𝑖⁻¹ the quantile function of marginal distribution 𝐹𝑖. Although Sklar’s theorem proves that there always exists a copula 𝐶 as defined above, these copulas do not necessarily have simple closed-form expressions. As such, copulas can be categorised as either implicit or explicit copulas, where implicit copulas are based on well-known multivariate distributions and do not have simple closed-form expressions. These include the normal or Gaussian copula and the t copula. We will restrict ourselves to implicit copulas in the remainder of this literature review. The Gaussian copula denoted 𝐶𝑅𝐺𝑎 is defined as

$$C_R^{Ga}(u_1, \ldots, u_d) = P\!\left(F_1(X_1) \le u_1, \ldots, F_d(X_d) \le u_d\right) = P\!\left(\Phi(X_1) \le u_1, \ldots, \Phi(X_d) \le u_d\right) = \Phi_P\!\left(\Phi^{-1}(u_1), \ldots, \Phi^{-1}(u_d)\right) , \quad (10)$$

where Φ( ) denotes the standard normal distribution function and Φ_P( ) the joint distribution function of 𝑿 (Sweetling, 2011). Similarly, we define the t copula with 𝑣 degrees of freedom by 𝐶𝑣,𝑅𝑡 such that

$$C_{v,R}^{t}(u_1, \ldots, u_d) = P\!\left(F_1(X_1) \le u_1, \ldots, F_d(X_d) \le u_d\right) = t_{v,R}\!\left(t_v^{-1}(u_1), \ldots, t_v^{-1}(u_d)\right) , \quad (11)$$

where 𝑡𝑣( ) denotes the standard t distribution function with 𝑣 degrees of freedom and 𝑡𝑣,R( ) the joint t distribution of 𝑿 with 𝑣 degrees of freedom and correlation matrix 𝑅 (Sweetling, 2011). The process of modelling multivariate data sets using copulas can be separated into three steps, namely the parameterisation of the copula, generating a set of uniformly distributed random variables with an appropriate correlation from the copula and finally transforming each of the uniformly distributed variables to an appropriate marginal distribution. In subsequent sections we discuss each of these three steps as they apply to the use of both Gaussian and t copulas.

2.5 Parameterising copulas

In this section we briefly discuss some measures of correlation before we consider the parameterisation of both Gaussian and t copulas from a 𝑑 dimensional matrix of observations 𝑿 = (𝑿𝟏, … 𝑿𝒅).

2.5.1 Measures of correlation

Before proceeding we will consider briefly three different measures of correlation, namely Pearson’s rho, Spearman’s rho and Kendall’s tau, and present a few crucial observations regarding the appropriateness of their use in modelling multivariate distributions using copulas. The linear correlation coefficient for two variables 𝑋1 and 𝑋2, commonly known as Pearson’s rho, is given by

$$\rho(X_1, X_2) = \frac{\mathrm{Cov}(X_1, X_2)}{\sqrt{\mathrm{Var}(X_1)}\sqrt{\mathrm{Var}(X_2)}} . \quad (12)$$

Although it can be shown that the linear correlation coefficient is invariant under strictly increasing linear transformations, the same does not hold for strictly increasing non-linear transformations (Sweetling, 2011). Furthermore it can be shown that although the linear correlation coefficient is bounded between [−1, 1], the minimum and maximum attainable correlations between two variables 𝑋1 and 𝑋2 depend on the distributions of these variables, such that possible attainable correlations might have a smaller range (McNeil, Frey, & Embrechts, 2005), as shown below:

$$-1 \le \rho_{min} \le \rho \le \rho_{max} \le 1 . \quad (13)$$

Both of the aforementioned shortcomings of the linear correlation coefficient could lead to less robust results, particularly when attempting to capture extreme events (Aas K., 2004). For a more detailed discussion on the shortcomings of the linear correlation coefficient the reader is referred to McNeil, Frey and Embrechts (2005). In order to address the aforementioned shortcomings, it is suggested that a ranked correlation coefficient be used to capture the correlation structure between variables. Two popular choices are Spearman’s rho and Kendall’s tau. Spearman’s rho is equivalent to calculating the linear correlation coefficient after ranking each of the variables. This measure can be represented mathematically as

$$\rho_S(X_1, X_2) = 12 \int_0^1 \!\! \int_0^1 C(u, v)\, du\, dv - 3 , \quad (14)$$

where 𝐶(𝑢, 𝑣) is the copula of the bivariate distribution function of 𝑋1 and 𝑋2 (Aas K., 2004). If 𝑋1 and 𝑋2 have distribution functions 𝐹1 and 𝐹2 respectively, the relationship between Spearman’s rho and the linear correlation coefficient is given by

$$\rho_S(X_1, X_2) = \rho\!\left(F_1(X_1), F_2(X_2)\right) . \quad (15)$$

Similarly, Kendall’s tau is represented mathematically as

$$\rho_\tau(X_1, X_2) = 4 \int_0^1 \!\! \int_0^1 C(u, v)\, dC(u, v) - 1 , \quad (16)$$

where 𝐶(𝑢, 𝑣) is the copula of the bivariate distribution function of 𝑋1 and 𝑋2 (Aas K., 2004).

It can be shown that matrices of both Spearman’s rho and Kendall’s tau estimates are positive semi-definite and that there exists a relationship between these rank correlations and Pearson’s linear correlation (McNeil, Frey, & Embrechts, 2005), as shown in equations (17) and (18):

$$\rho_\tau(X_i, X_j) = \frac{2}{\pi} \arcsin(\rho_{i,j}) , \quad (17)$$

$$\rho_S(X_i, X_j) = \frac{6}{\pi} \arcsin\!\left(\frac{\rho_{i,j}}{2}\right) . \quad (18)$$


The relationships above allow for one of the rank correlation measures to be estimated and transformed to an equivalent linear correlation estimate that is in turn used to parameterise the copula in question.
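A minimal sketch of this route, assuming scipy's kendalltau estimator: pairwise Kendall's tau values are converted to linear correlations by inverting equation (17).

```python
import numpy as np
from scipy.stats import kendalltau

def kendall_to_linear(data):
    """Estimate pairwise Kendall's tau and map each estimate to a linear correlation
    via rho = sin(pi * tau / 2), i.e. the inverse of equation (17)."""
    d = data.shape[1]
    rho = np.eye(d)
    for i in range(d):
        for j in range(i + 1, d):
            tau, _ = kendalltau(data[:, i], data[:, j])
            rho[i, j] = rho[j, i] = np.sin(np.pi * tau / 2)
    return rho
```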

A final consideration regarding correlation matrices is that we might require the matrix to be positive semi-definite in order to calculate the inverse of the matrix. To ensure this we could apply the eigenvalue method as outlined below (Rousseeuw & Molenberghs, 1993).

1. Suppose we have a non-positive semi-definite correlation matrix 𝑹. Calculate the spectral decomposition of the correlation matrix 𝑹 such that 𝑹 = 𝑮𝑳𝑮𝑇.

2. Replace all negative values in 𝑳 with small values 𝛿 > 0 to obtain 𝑳̃.

3. Calculate 𝑸 = 𝑮𝑳̃𝑮𝑇. 𝑸 will be positive semi-definite but not necessarily a correlation matrix since its diagonal entries may not be equal to 1.

4. Calculate 𝑹̃ = (Δ(𝑸))−1 𝑸 (Δ(𝑸))−1 where Δ(𝑸) is a diagonal matrix with diagonal elements (√𝑄1,1 … √𝑄𝑑,𝑑).

5. Use the positive semi-definite correlation matrix 𝑹̃ as a substitute for 𝑹.
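A short numpy sketch of the five steps above, with delta an arbitrary small positive constant.

```python
import numpy as np

def make_positive_semidefinite(R, delta=1e-8):
    """Eigenvalue method of Rousseeuw and Molenberghs (1993) following steps 1-5 above."""
    eigvals, G = np.linalg.eigh(R)                      # step 1: R = G L G^T
    L = np.diag(np.where(eigvals < 0, delta, eigvals))  # step 2: replace negative eigenvalues
    Q = G @ L @ G.T                                     # step 3
    D_inv = np.diag(1.0 / np.sqrt(np.diag(Q)))          # step 4: rescale to unit diagonal
    return D_inv @ Q @ D_inv                            # step 5: adjusted correlation matrix
```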

2.5.2 Parameterising Gaussian copulas

The Gaussian copula is based on the multivariate normal distribution with density function

$$f_{\mathbf{x}}(x_1, \ldots, x_d) = \frac{1}{\sqrt{(2\pi)^d\,|\boldsymbol{\Sigma}|}} \exp\!\left(-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right) , \quad (19)$$

where 𝚺 denotes the covariance matrix and the 𝑑 dimensional vector 𝝁 contains the means of the vectors 𝑿𝟏, … 𝑿𝒅. When considering the standardised joint normal distribution, equation (19) reduces to

$$f_{\mathbf{x}}(x_1, \ldots, x_d) = \frac{1}{\sqrt{(2\pi)^d\,|\boldsymbol{R}|}} \exp\!\left(-\frac{1}{2}\mathbf{x}^T \boldsymbol{R}^{-1} \mathbf{x}\right) , \quad (20)$$

where 𝑹 denotes a correlation matrix. Note that the standardised joint normal distribution as shown in equation (20) is fully parameterised by the correlation matrix, and as such the task of parameterising the Gaussian copula reduces to estimating an appropriate correlation matrix.

2.5.3 Parameterising t copulas


The t copula is based on the multivariate t distribution with density function

$$f_{\mathbf{x}}(x_1, \ldots, x_d) = \frac{\Gamma\!\left(\frac{v+d}{2}\right)}{\Gamma\!\left(\frac{v}{2}\right)(\pi v)^{\frac{d}{2}}\,|\boldsymbol{R}|^{\frac{1}{2}}} \left[1 + \frac{1}{v}\,\mathbf{x}^T \boldsymbol{R}^{-1} \mathbf{x}\right]^{-\frac{v+d}{2}} , \quad (21)$$

where 𝑑 represents the dimension of the copula, 𝑣 represents the copula’s degrees of freedom, 𝑹 is a correlation matrix and Γ represents the gamma function. Unlike the Gaussian copula, the t distribution is not fully parameterised by its correlation matrix since the degrees of freedom need to be estimated. This necessitates the use of maximum likelihood estimation and makes the process of parameterisation more involved.

The first step in parameter estimation consists of constructing a so-called pseudo-sample from observed data. This requires the estimation of marginal distributions 𝐹̂1, … 𝐹̂𝑑 for each of the 𝑑 variables, which is achieved by either parametric or non-parametric estimation. For parametric estimation we find the marginal distributions 𝐹̂1, … 𝐹̂𝑑 by selecting an appropriate parametric model and parameterising this model using maximum likelihood estimation. For non-parametric estimation we estimate 𝐹̂1, … 𝐹̂𝑑 using

$$\hat{F}_j(x) = \frac{1}{T+1} \sum_{t=1}^{T} I\{x_{t,j} \le x\} , \quad (22)$$

where 𝐼{ } is an indicator function. Note that the use of the denominator 𝑇 + 1 rather than 𝑇 ensures that the pseudo-copula lies strictly in the unit cube.

Following the estimation of marginal distributions to arrive at 𝐹̂1, … 𝐹̂𝑑 through either a parametric or non-parametric approach, a pseudo-sample of uniformly distributed variables at time 𝑡 denoted 𝑈𝑡,1, … , 𝑈𝑡,𝑑 is constructed from observations 𝑋𝑡,1, … , 𝑋𝑡,𝑑 such that

$$(U_{t,1}, \ldots, U_{t,d}) = \left(\hat{F}_1(X_{t,1}), \ldots, \hat{F}_d(X_{t,d})\right) . \quad (23)$$

The log likelihood 𝐿 for the pseudo-sample is then calculated as

$$L(\boldsymbol{R}, v, \boldsymbol{U_1}, \ldots, \boldsymbol{U_d}) = \sum_{t=1}^{T} \log\!\left(C_{v,\boldsymbol{R}}^{t}(U_{t,1}, \ldots, U_{t,d})\right) , \quad (24)$$

where 𝐶𝑣,𝑹𝑡 is a t copula as defined in equation (11) with correlation matrix 𝑹 and 𝑣 degrees of freedom.

Although maximum likelihood would ideally be used to arrive at parameter estimates for both the correlation matrix and the degrees of freedom, this process is often too computationally intensive, which necessitates a more pragmatic two-step approach (Jaworski, Durante, Hardle, & Rychlik, 2009). First the correlation is estimated using Spearman’s rho or Kendall’s tau as presented in equations (14) and (16), after which the corresponding estimate is transformed to an equivalent linear correlation estimate using equation (17) or equation (18). This correlation is then used as an input to the likelihood function 𝐿, which is maximised to estimate the degrees of freedom. The parameters 𝑹 and 𝑣 are then used when sampling from the t copula, as discussed in the next section.
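The two-step estimation can be sketched as follows, assuming a reasonably recent scipy (for multivariate_t) and writing the log likelihood of equation (24) in terms of the t copula density (joint t density divided by the product of the marginal t densities); the function names are hypothetical.

```python
import numpy as np
from scipy.stats import t, multivariate_t
from scipy.optimize import minimize_scalar

def t_copula_loglik(nu, U, R):
    """Log likelihood of equation (24) for pseudo-observations U (T x d), written
    with the t copula density: joint t density over the product of the margins."""
    x = t.ppf(U, df=nu)                      # map pseudo-observations to t quantiles
    joint = multivariate_t.logpdf(x, loc=np.zeros(R.shape[0]), shape=R, df=nu)
    margins = t.logpdf(x, df=nu).sum(axis=1)
    return np.sum(joint - margins)

def fit_t_copula_df(U, R):
    """Step two of the pragmatic approach: maximise the log likelihood over nu given R."""
    res = minimize_scalar(lambda nu: -t_copula_loglik(nu, U, R),
                          bounds=(2.1, 100.0), method="bounded")
    return res.x
```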

2.6 Sampling from Copulas

In this section we describe the process of sampling from copulas as would be done during Monte Carlo simulation. For both the Gaussian and t copulas, the aim will be to first generate a set of uniformly distributed random variables with an appropriate correlation structure, after which the uniform random variables can be transformed to appropriate marginal distributions. We first present algorithms for generating uniformly distributed samples from both copulas, after which we consider transforming these uniformly distributed random variables.

2.6.1 Algorithm for sampling from a Gaussian copula

1. Obtain the correlation matrix 𝑹 as discussed in Section 2.5.2.

2. Perform a Cholesky decomposition on the correlation matrix such that 𝑹 = 𝑨𝑇𝑨, where the resulting matrix 𝑨𝑇 will be a lower triangular matrix. Although most statistical software packages contain predefined modules for performing Cholesky decompositions, the reader is referred to Press et al. (1992) for a Cholesky decomposition algorithm.

3. Generate an 𝑛 × 𝑑 matrix of independent and identically distributed standard normal random variables such that 𝒁 = (𝒁𝟏 … 𝒁𝒅)𝑇.

4. Compute 𝑿 = 𝑨𝑇𝒁 such that 𝑿 = (𝑿𝟏 … 𝑿𝒅)𝑇.

5. Return an 𝑛 × 𝑑 matrix of uniformly distributed random variables denoted 𝑼 = (Φ(𝑿𝟏) … Φ(𝑿𝒅)), where Φ( ) is the standard normal distribution function.
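A numpy sketch of this algorithm. Note that numpy's cholesky returns the lower-triangular factor 𝑳 with 𝑹 = 𝑳𝑳ᵀ, so with samples stored row-wise the correlated normals of step 4 are obtained as 𝒁𝑳ᵀ, which is equivalent to the formulation above.

```python
import numpy as np
from scipy.stats import norm

def sample_gaussian_copula(R, n, rng=None):
    """Draw n samples of correlated U(0,1) variables from a Gaussian copula with
    correlation matrix R, following steps 1-5 above."""
    rng = rng or np.random.default_rng()
    L = np.linalg.cholesky(R)                  # step 2: R = L L^T, L lower triangular
    Z = rng.standard_normal((n, R.shape[0]))   # step 3: iid standard normal draws
    X = Z @ L.T                                # step 4: impose the correlation structure
    return norm.cdf(X)                         # step 5: map to uniform margins
```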


2.6.2 Algorithm for sampling from a t copula

1. Obtain the correlation matrix 𝑹 as discussed in Section 2.5.3.

2. Perform a Cholesky decomposition on the correlation matrix such that 𝑹 = 𝑨𝑇𝑨, where the resulting matrix 𝑨𝑇 will be a lower triangular matrix.

3. Generate an 𝑛 × 𝑑 matrix of independent and identically distributed standard normal random variables such that 𝒁 = (𝒁𝟏 … 𝒁𝒅)𝑇.

4. Compute 𝑿 = 𝑨𝑇𝒁 such that 𝑿 = (𝑿𝟏 … 𝑿𝒅)𝑇.

5. Generate an 𝑛 × 1 vector 𝝃 of independent chi-squared distributed variables with 𝑣 degrees of freedom, where 𝑣 is the t copula’s degrees of freedom that can be estimated as discussed in Section 2.5.3.

6. Return an 𝑛 × 𝑑 matrix of uniformly distributed random variables denoted 𝑼 = (𝑡𝑣(𝑿𝟏/√(𝝃/𝑣)) … 𝑡𝑣(𝑿𝒅/√(𝝃/𝑣))), where 𝑡𝑣( ) is the distribution function of the standard t distribution with 𝑣 degrees of freedom.
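The corresponding sketch for the t copula differs only in the chi-squared mixing variable of steps 5 and 6, under the same conventions as the Gaussian sampler above.

```python
import numpy as np
from scipy.stats import t

def sample_t_copula(R, nu, n, rng=None):
    """Draw n samples of correlated U(0,1) variables from a t copula with correlation
    matrix R and nu degrees of freedom, following steps 1-6 above."""
    rng = rng or np.random.default_rng()
    L = np.linalg.cholesky(R)                       # step 2
    Z = rng.standard_normal((n, R.shape[0]))        # step 3
    X = Z @ L.T                                     # step 4
    xi = rng.chisquare(df=nu, size=(n, 1))          # step 5: chi-squared mixing variables
    return t.cdf(X / np.sqrt(xi / nu), df=nu)       # step 6: map to uniform margins
```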

For more detailed discussions on these algorithms the reader is referred to Sweetling (2011) or McNeil, Frey & Embrechts (2005).

2.6.3 Transforming uniformly distributed marginal distributions

The final step in the sampling process consists of transforming the correlated uniformly distributed random variables generated using the algorithms from sections 2.6.1 and 2.6.2 to the appropriate marginal distributions. The transformation is done by applying the inverse distribution function for each of the 𝑑 variables to the corresponding random uniform numbers. The two steps involved in this process are discussed below.

1. Find the parameters of the distribution through maximum likelihood estimation. This estimation will be done using the same data as that used for parameterising the copula. Note that we repeat this process 𝑑 times in order to find an appropriate marginal distribution for each of the variables denoted 𝐺̂1… 𝐺̂𝑑. Also note that these marginal distributions would not necessarily be of the same type.

2. Transform the 𝑛 × 𝑑 matrix of uniformly distributed random variables 𝑼 = (𝑼𝟏 … 𝑼𝒅) to an 𝑛 × 𝑑 matrix $\boldsymbol{V} = \left(\hat{G}_1^{-1}(\boldsymbol{U_1}) \ldots \hat{G}_d^{-1}(\boldsymbol{U_d})\right)$ that consists of 𝑑 variables with marginal distributions 𝐺̂1 … 𝐺̂𝑑 and an underlying correlation structure captured by the copula used in generating 𝑼 = (𝑼𝟏 … 𝑼𝒅).


The resulting matrix of simulated values can now be used to estimate distributions of portfolio returns and to derive risk measures such as value at risk.
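As a brief illustration of steps 1 and 2, the sketch below fits normal margins to the historical data, applies the inverse distribution functions to the copula samples and reads a value at risk estimate off the simulated portfolio returns; the choice of normal margins is a placeholder.

```python
import numpy as np
from scipy.stats import norm

def simulate_portfolio_var(U, data, weights, alpha=0.95):
    """Apply inverse marginal distributions (step 2 above) to copula samples U and
    estimate portfolio value at risk from the simulated scenarios. Normal margins
    fitted to the historical data are used here purely as a placeholder choice."""
    mu = data.mean(axis=0)
    sd = data.std(axis=0, ddof=1)                        # step 1: fit the margins
    V = norm.ppf(U) * sd + mu                            # step 2: transform the margins
    portfolio = V @ weights                              # simulated portfolio returns
    return -np.percentile(portfolio, 100 * (1 - alpha))  # value at risk threshold
```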

2.7 The binomial test

As mentioned in previous sections, this thesis will assess the ability of three risk aggregation methodologies to capture risk as measured by value at risk estimates. One test used to assess the accuracy of these methods is the binomial test. The binomial distribution is a discrete probability distribution of the number of successes in a sequence of 𝑛 independent Bernoulli trials. If the probability of success in each independent trial is given by 𝑝, then the probability of observing 𝑘 successes out of 𝑛 independent trials is given by

$$\binom{n}{k} p^{k} (1-p)^{n-k} . \quad (25)$$

In the context of this thesis we aggregate market risk using Gaussian assumptions as well as GARCH-Normal and GARCH-t models and estimate 95% and 99% value at risk thresholds. These estimates are then compared to actual returns to determine whether the 95% and 99% thresholds have been breached. Each period will be considered an independent Bernoulli trial where a breach of the 95% or 99% value at risk is considered a success with probability of 5% and 1% respectively. This allows the null hypothesis that the aggregation method in question accurately captures market risk to be tested against the alternative hypothesis that risk is either overestimated or underestimated. Tests will be conducted at a 95% level of confidence, where the p-value reflects the probability of observing as many or more breaches as in the experiment. More formally, we define the p-value corresponding to 𝑘 breaches of the value at risk threshold as

$$p\text{-value} = \sum_{i=k}^{n} \binom{n}{i} p^{i} (1-p)^{n-i} , \quad (26)$$

where 𝑛 denotes the number of observations, 𝑝 denotes the assumed probability of breaching the value at risk estimate and 𝑘 is the observed number of breaches.
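The p-value of equation (26) is simply the binomial survival function, so a back-test can be sketched in a few lines; the example numbers are arbitrary.

```python
from scipy.stats import binom

def var_backtest_pvalue(k, n, p):
    """One-sided p-value of equation (26): the probability of observing k or more
    breaches in n periods when the true breach probability is p."""
    return binom.sf(k - 1, n, p)   # P(K >= k)

# Arbitrary example: 16 breaches of a 99% VaR threshold over 1000 periods
print(var_backtest_pvalue(16, 1000, 0.01))
```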

3. Methodology

This section introduces the methodology that was used to assess the ability of the conventional Gaussian model, the GARCH-normal model and the GARCH-t model to capture market risk. We first outline the process followed in gathering and preparing returns data, focussing on the process of constructing ten sector-based indices that were used as individual asset classes. Thereafter we discuss how the Gaussian, GARCH-normal and GARCH-t models were built, referring back to Section 2 where appropriate.

3.1 Data gathering and preparation

As mentioned above, the risk aggregation methodologies in question were assessed using the constituents of the S&P500 index. An observation period between April 1998 and December 2015 was considered since a variable capturing the number of shares outstanding has only been recorded since April 1998 in the Compustat database. In order to reduce the dimensionality of the modelling exercise it was decided to create 10 indices from the constituents of the S&P500 that will be considered as individual asset classes. The remainder of this section discusses the steps followed in creating the aforementioned indices.

Since only stocks that were constituents of the S&P500 at a given time are considered and the S&P500’s constituents change over time, the first step in constructing the indices consisted of generating an indicator variable. Data was extracted from the Compustat database, which indicated that 976 firms formed part of the S&P500 index at some point between April 1998 and December 2015, with some firms exiting and re-entering the index over time. Firms were classified according to the 𝐺𝑉𝐾𝐸𝑌 variable. The indicator variable denoted 𝐼𝑖,𝑡 for each of the 𝑖 = 1 … 976 stocks shows whether that particular stock formed part of the S&P500 on a particular date, such that

$$I_{i,t} = \begin{cases} 1 & \text{if stock } i \text{ is in the S\&P500 at time } t \\ 0 & \text{if stock } i \text{ is not in the S\&P500 at time } t. \end{cases} \quad (27)$$

Next the daily closing prices variable 𝑃𝑅𝐶𝐶𝐷 was downloaded for each of the 976 stocks from the Compustat North America database. These daily close prices were adjusted for stock splits and dividends using two adjustment factor variables named 𝐴𝐽𝐸𝑋𝐷𝐼 and 𝑇𝑅𝐹𝐷. An adjusted close price for stock 𝑖 at time 𝑡 was calculated as

$$\mathit{Adjusted\ Close\ Price}_{i,t} = \frac{PRCCD_{i,t} \times TRFD_{i,t}}{AJEXDI_{i,t}} . \quad (28)$$

From this the daily return for holding each stock was calculated, such that the daily return of stock 𝑖 between times 𝑡 and 𝑡 + 1 is given by

$$\mathit{Daily\ return}_{i,t} = \frac{\mathit{Adjusted\ close\ price}_{i,t+1}}{\mathit{Adjusted\ close\ price}_{i,t}} - 1 . \quad (29)$$


Finally the market capitalisation was calculated for each stock on each day as the product of the daily closing price variable 𝑃𝑅𝐶𝐶𝐷 and the number of shares outstanding variable 𝑆𝐶𝐻𝑂𝐶, such that the market capitalisation of stock 𝑖 at time 𝑡 is given by

$$\mathit{Market\ Capitalisation}_{i,t} = PRCCD_{i,t} \times SCHOC_{i,t} . \quad (30)$$

As mentioned previously, 10 indices were created to reduce the dimensionality of the modelling process. Stocks were classified based on the sector within which firms operate. This was done according to the 𝐺𝑆𝐸𝐶𝑇𝑂𝑅 variable that distinguishes stocks by the following sectors: energy, minerals, industrials, consumer discretionary goods, consumer staples, health care, financials, information technology, telecommunication services and utilities. Stocks were sorted by sector so that we have 𝑗 = 1 … 10 sectors with 𝑖 = 1 … 𝑁𝑗 stocks forming part of a specific sector 𝑗 over the observation period. Indices were constructed by weighting the returns for each S&P500 constituent in the sector at a given time by its market capitalisation in the sector, such that the daily return for an index 𝑗 at time 𝑡 is given by

$$r_{j,t} = \frac{\sum_{i=1}^{N_j} I_{i,t} \times \mathit{Daily\ return}_{i,t} \times \mathit{Market\ Capitalisation}_{i,t}}{\sum_{i=1}^{N_j} I_{i,t} \times \mathit{Market\ Capitalisation}_{i,t}} . \quad (31)$$
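A pandas sketch of equations (27) to (31), assuming a long-format data set with one row per stock per day; the lower-case column names are hypothetical and only the Compustat variable names themselves come from the text.

```python
import pandas as pd

def sector_index_returns(df):
    """Sketch of equations (27)-(31): cap-weighted daily sector index returns.
    df is assumed to hold one row per stock per day with hypothetical columns
    date, gvkey, gsector, in_sp500 (the indicator of equation (27)) and the
    Compustat variables prccd, trfd, ajexdi, schoc."""
    df = df.sort_values(["gvkey", "date"]).copy()
    df["adj_close"] = df["prccd"] * df["trfd"] / df["ajexdi"]        # equation (28)
    df["ret"] = df.groupby("gvkey")["adj_close"].pct_change()        # equation (29)
    df["mcap"] = df["prccd"] * df["schoc"]                           # equation (30)
    df = df[df["in_sp500"] == 1]                                     # apply the indicator
    grouped = df.assign(wret=df["ret"] * df["mcap"]).groupby(["date", "gsector"])
    return grouped["wret"].sum() / grouped["mcap"].sum()             # equation (31)
```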

The process above was also repeated using 10-day intervals to calculate 10-day returns for each sector index. The risk aggregation methodologies in question were assessed using these indices. We discuss how these methodologies were applied in the following sections, starting with the Gaussian model. Note that in each case we fit two models, the first using daily returns and the second using 10-day returns. For the daily returns model we use the past 500 data points to parameterise the model and estimate 1-day value at risk thresholds. Similarly, we use the past 50 data points to parameterise the model for 10-day returns and arrive at a 10-day value at risk threshold.


3.2 The Gaussian model

As described in Section 2.2 the Gaussian model is applied directly to the returns of each asset, in this case the 10 sector indices discussed in section 3.1.

For each model a variance-covariance matrix 𝚺 was calculated as in equation (2), after which the portfolio standard deviation was calculated using equation (3). This was done using the market capitalisations of each of the indices in relation to the entire S&P500 as weights. Finally the 95% and 99% value at risk thresholds were calculated using equation (4). These thresholds were then compared to the actual return observed in the next period to determine if these thresholds were breached.

3.3 The GARCH-normal model

The GARCH-normal model uses a Gaussian copula to capture the dependence between the 10 sector indices. This copula is, however, not parameterised using the daily returns but rather using a set of error terms that are obtained after a GARCH(1,1) model is applied to returns. This is done to account for the fact that the volatility of returns changes over time. First, a GARCH(1,1) model is fitted to the returns of each of the 10 indices using maximum likelihood estimation as discussed in section 2.3. The resulting sets of residuals 𝜖𝑡,𝑗 for each index 𝑗 = 1 … 10 are then used to parameterise the Gaussian copula.

Once GARCH residuals have been obtained we calculate a correlation matrix using Kendall’s tau and transform this matrix to an equivalent linear correlation matrix using equation (17). Since the Gaussian copula is fully parameterised by this correlation matrix, we proceed to generate a set of 50 000 correlated uniformly distributed random variables using the algorithm outlined in section 2.6.1. Next we transform these uniformly distributed random variables to normally distributed random variables by applying the inverse of the appropriate normal distribution function as discussed in section 2.6.3. These distribution functions are parameterised using the means and variances of the GARCH residuals for each index.

Note that the resulting data set does not contain a set of 50 000 simulated returns, but rather a set of 𝑠 = 1 … 50 000 GARCH(1,1) residuals denoted 𝜖̃𝑠,𝑗 for each index 𝑗. In order to obtain simulated returns we transform these residuals using the volatilities obtained by fitting the GARCH(1,1) models denoted 𝜎̂𝑗. Note that 𝜎̂𝑗 corresponds to the volatility at the time of the most recent return. Using equation (6) the 𝑠’th simulated return 𝑟̃𝑠,𝑗 for index 𝑗 is given by


𝑟̃𝑠,𝑗 = 𝜖̃𝑠,𝑗 × 𝜎̂𝑗 (32)

Each of the 𝑠 = 1 … 50 000 sets of simulated returns represents a likely future scenario. We can therefore calculate the corresponding portfolio return under each of these likely future scenarios. This was done using the market capitalisations of each of the indices in relation to the entire S&P500 as weights.

Once the portfolio return was calculated for each scenario we sort these scenarios by portfolio return and find the fifth and first percentiles to estimate 95% and 99% value at risk thresholds. These thresholds were then compared to the actual return observed in the next period to determine if these thresholds were breached.
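Putting the pieces together, the GARCH-normal procedure can be sketched as below. The sketch assumes the third-party arch package for the GARCH(1,1) fits (the thesis does not state which software was used) and reuses the kendall_to_linear and sample_gaussian_copula helpers sketched in sections 2.5.1 and 2.6.1.

```python
import numpy as np
from arch import arch_model          # assumed third-party package, not named in the thesis
from scipy.stats import norm

def garch_normal_var(index_returns, weights, n_sims=50_000, alphas=(0.95, 0.99), rng=None):
    """Sketch of the GARCH-normal model of section 3.3: fit a GARCH(1,1) per index,
    build a Gaussian copula on the standardised residuals, simulate residual scenarios,
    scale them by the latest conditional volatility (equation (32)) and read off VaR."""
    rng = rng or np.random.default_rng()
    resid, sigma_hat = [], []
    for j in range(index_returns.shape[1]):
        fit = arch_model(index_returns[:, j], p=1, q=1, mean="Zero").fit(disp="off")
        resid.append(np.asarray(fit.std_resid))
        sigma_hat.append(np.asarray(fit.conditional_volatility)[-1])
    resid = np.column_stack(resid)
    R = kendall_to_linear(resid)                     # helper sketched in section 2.5.1
    U = sample_gaussian_copula(R, n_sims, rng)       # helper sketched in section 2.6.1
    eps = norm.ppf(U) * resid.std(axis=0, ddof=1) + resid.mean(axis=0)
    sims = eps * np.array(sigma_hat)                 # equation (32)
    portfolio = sims @ weights
    return {a: -np.percentile(portfolio, 100 * (1 - a)) for a in alphas}
```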

3.4 The GARCH-t model

The GARCH-t model uses a t copula to capture the dependence between the 10 sector indices. Similarly to the GARCH-normal model, the GARCH-t model is parameterised using a set of error terms that are obtained after a GARCH(1,1) model is applied to returns. As under the GARCH-normal model, a GARCH(1,1) model is fitted to the returns of each of the 10 indices using maximum likelihood estimation as discussed in section 2.3. The resulting sets of residuals 𝜖𝑡,𝑗 for each index 𝑗 = 1 … 10 are then used to parameterise the t copula.

Once GARCH residuals have been obtained we calculate a correlation matrix using Kendall’s tau and transform this matrix to an equivalent linear correlation matrix using equation (17). Unlike the Gaussian copula, the t-copula is not fully parameterised by this correlation matrix since the degrees of freedom need to be estimated. We ensure that this correlation matrix is positive semi-definite by applying the eigenvalue method as described in section 2.5.1. We then proceed by constructing a pseudo sample using non-parametric marginal distributions as discussed in section 2.5.3 and estimating the degrees of freedom using maximum likelihood estimation.

Once parameters for the t-copula have been found we proceed to generate a set 50 000 correlated uniformly distributed random variables using the algorithm outlined in section 0. Next we transform these uniformly distributed random variables to t distributed random variables by applying the inverse of the appropriate t distribution function as discussed in section 2.6.3. These t distribution functions are parameterised by finding the degrees of freedom though maximum likelihood estimation using of the GARCH residuals for each index.


Note that, as in the case of the GARCH-normal model, the resulting data set does not contain a set of 50 000 simulated returns, but rather a set of 𝑠 = 1 … 50 000 GARCH(1,1) residuals denoted 𝜖̃𝑠,𝑗 for each index 𝑗. In order to obtain simulated returns we again transform these residuals using the volatilities obtained by fitting the GARCH(1,1) models denoted 𝜎̂𝑗. Note that 𝜎̂𝑗 corresponds to the volatility at the time of the most recent return. Using equation (6) the 𝑠’th simulated return 𝑟̃𝑠,𝑗 for index 𝑗 is given by

𝑟̃𝑠,𝑗 = 𝜖̃𝑠,𝑗 × 𝜎̂𝑗 . (33)

As was the case under the GARCH-normal model, each of the 𝑠 = 1 … 50 000 sets of simulated returns represents a likely future scenario from which the portfolio return can be calculated. This was again done using the market capitalisations of each of the indices in relation to the entire S&P500 as weights.

Once the portfolio return was calculated for each scenario, these scenarios were sorted by portfolio return and the fifth and first percentiles were used to estimate 95% and 99% value at risk thresholds, which were compared to the actual return observed in the next period to determine if these thresholds were breached. Following a brief discussion of the 10 sector indices in section 4, we present the results of applying this methodology in section 5.

4. Data and descriptive statistics

Before discussing the results obtained through applying the methodology outlined in the previous section, this section provides exploratory data analysis performed on the S&P500 data set as well as the 10 sector indices. First returns on the S&P500 between January 2003 and December 2015 are considered and three distinct time periods are identified. The performance of each of the three risk aggregation methodologies will be assessed during each of these three time periods. To justify the distinction made between these periods we present summary statistics relating to each of the 10 sector indices for each time period. Finally we discuss the presence of time dependent volatility in the data and how the application of the GARCH(1,1) model is used to address this phenomenon.

4.1 The S&P500

As mentioned in previous sections, the risk aggregation methodologies considered in this thesis are assessed using the constituents of the S&P500. Before proceeding we consider the cumulative return over the entire S&P500 between January 2003 and December 2015, as shown in Figure 1.

Figure 1: Graph of cumulative daily returns on the S&P500 index.

From Figure 1 we identify three distinct time periods within which the performance of the three risk aggregation methodologies will be assessed. We refer to a pre-crisis time period between January 2003 and June 2008; a crisis period between July 2008 and December 2011 that includes both the 2008 global financial crisis and the European debt crisis; and finally a post-crisis period between January 2012 and December 2015. Next we consider both the volatilities and Pearson correlation coefficients for each of the 10 sectors during these three time periods.


Table 1: Summary statistics for daily S&P500 returns between January 2003 and June 2008 by sector.

Pre-Crisis: January 2003 to June 2008

Sector | Volatility | Pearson correlation (lower triangle, in sector order)
Energy | 0.0133 | 1.000
Minerals | 0.0122 | 0.650 1.000
Industrials | 0.0096 | 0.523 0.786 1.000
Consumer Discretionary | 0.0105 | 0.419 0.687 0.852 1.000
Consumer Staples | 0.0069 | 0.411 0.599 0.761 0.781 1.000
Health Care | 0.0083 | 0.406 0.564 0.714 0.686 0.720 1.000
Financials | 0.0123 | 0.411 0.638 0.792 0.833 0.723 0.656 1.000
Information Technology | 0.0122 | 0.403 0.642 0.770 0.765 0.643 0.613 0.669 1.000
Telecommunication Services | 0.0118 | 0.362 0.504 0.631 0.627 0.621 0.565 0.611 0.586 1.000
Utilities | 0.0095 | 0.545 0.552 0.590 0.552 0.588 0.549 0.571 0.469 0.493 1.000
Average volatility: 0.0107 | Average correlation: 0.6120

Table 2: Summary statistics for daily S&P500 returns between July 2008 and December 2011 by sector.

Crisis: July 2008 to December 2011

Sector | Volatility | Pearson correlation (lower triangle, in sector order)
Energy | 0.0244 | 1.000
Minerals | 0.0226 | 0.892 1.000
Industrials | 0.0203 | 0.822 0.897 1.000
Consumer Discretionary | 0.0203 | 0.782 0.861 0.924 1.000
Consumer Staples | 0.0118 | 0.765 0.776 0.815 0.850 1.000
Health Care | 0.0146 | 0.785 0.782 0.809 0.818 0.875 1.000
Financials | 0.0346 | 0.684 0.752 0.824 0.821 0.674 0.694 1.000
Information Technology | 0.0187 | 0.814 0.866 0.886 0.900 0.801 0.803 0.772 1.000
Telecommunication Services | 0.0175 | 0.732 0.735 0.750 0.780 0.783 0.762 0.691 0.779 1.000
Utilities | 0.0156 | 0.812 0.755 0.745 0.742 0.814 0.806 0.599 0.748 0.770 1.000
Average volatility: 0.0205 | Average correlation: 0.7904


Table 3: Summary statistics for daily S&P500 returns between January 2012 and December 2015 by sector.

Post-Crisis: January 2012 to December 2015

Sector | Volatility | Pearson correlation (lower triangle, in sector order)
Energy | 0.0119 | 1.000
Minerals | 0.0100 | 0.765 1.000
Industrials | 0.0090 | 0.718 0.846 1.000
Consumer Discretionary | 0.0089 | 0.622 0.767 0.839 1.000
Consumer Staples | 0.0070 | 0.534 0.635 0.719 0.738 1.000
Health Care | 0.0086 | 0.566 0.682 0.745 0.767 0.740 1.000
Financials | 0.0100 | 0.667 0.777 0.847 0.825 0.693 0.736 1.000
Information Technology | 0.0096 | 0.607 0.746 0.800 0.793 0.655 0.697 0.763 1.000
Telecommunication Services | 0.0087 | 0.453 0.523 0.571 0.560 0.629 0.533 0.564 0.525 1.000
Utilities | 0.0085 | 0.395 0.426 0.475 0.449 0.611 0.472 0.436 0.381 0.457 1.000
Average volatility: 0.0092 | Average correlation: 0.6392

From Table 1, Table 2 and Table 3 we note that the volatilities of daily S&P500 returns increase significantly during the period between July 2008 and December 2011, from an average volatility of 0.0107 to an average volatility of 0.0205, after which they return to pre-crisis levels at an average volatility of 0.0092. Similarly we note that the correlations between each of the 10 sectors under consideration also exhibit a significant increase between July 2008 and December 2011, increasing from an average pre-crisis correlation of 0.6120 to an average correlation of 0.7904. Although not presented here, we observe similar patterns when considering the volatilities and correlations of 10-day returns over the same time periods. The aforementioned results support the decision to assess the performance of the risk aggregation methodologies separately during each of these time periods.

4.2 Daily returns and GARCH residuals.

As discussed in Section 2.3 returns series do not exhibit constant volatilities over time. Time periods with higher volatilities are often observed, a phenomenon referred to as volatility clustering. Volatility clustering is clearly visible in each of the 10 sector indices constructed from the S&P500’s constituents. As an example we consider the returns of the index constructed from financial firms in Figure 2 where times of higher volatility are clearly observed between the second half of 2008 and the end of 2009 as well as at the end of 2011.


Figure 2: Daily returns of financial firms in the S&P500.

Since copulas can only be parameterised under the assumption that data is independent and identically distributed, it is clear that the returns cannot be used directly. As discussed in Sections 3.3 and 3.4, returns data was modelled using a GARCH(1,1) model, after which copulas were parameterised using the residuals from this model. Figure 3 shows the resulting GARCH(1,1) residuals, which exhibit no signs of volatility clustering over time.

Figure 3: GARCH(1,1) residuals from daily returns of financial firms in the S&P500.



5. Results

In this section we consider the results of the methodology outlined in Section 3. First we consider the results of back-testing each of the three risk aggregation methodologies in the pre-crisis, crisis and post-crisis periods identified in Section 4.1, after which we consider the probabilities assigned to an extreme 10-day period in the 2008 global financial crisis by each of the three risk aggregation methodologies right before the period occurred.

5.1 Back testing of risk aggregation methodologies

As discussed in Section 3, the normal model, the GARCH-normal copula as well as the GARCH-t copula were used to calculate 95% and 99% value at risk estimates for a portfolio consisting of 10 sector indices held in proportion to their market capitalisations. These estimates were then compared with actual results to determine whether the estimated value at risk thresholds were breached. Once the number of breaches in each time period was identified, a two-sided binomial test was applied to assess the performance of each model. The null hypothesis in the aforementioned test is that the number of value at risk threshold breaches is consistent with the percentiles (95% or 99%) of these value at risk thresholds. All tests were performed at a 5% significance level. The results of these tests are presented in Table 4 and Table 5.

Table 4: Results for back-testing risk aggregation methodologies applied to daily returns using a two-sided binomial test.

Period | Model | Breaches observed (95% / 99% VaR) | Two-sided binomial test p-value (95% / 99% VaR)

Pre-crisis: Jan 2003 to June 2008 (1383 trading days; expected breaches 69 / 14)
Normal | 68 / 29 | 0.854 / 0.000**
GARCH-Normal | 66 / 20 | 0.668 / 0.085
GARCH-t | 59 / 11 | 0.205 / 0.320

Crisis: July 2008 to December 2011 (884 trading days; expected breaches 44 / 9)
Normal | 60 / 30 | 0.016** / 0.000**
GARCH-Normal | 56 / 16 | 0.065 / 0.018**
GARCH-t | 53 / 13 | 0.157 / 0.130

Post-crisis: January 2012 to December 2015 (998 trading days; expected breaches 50 / 10)
Normal | 36 / 17 | 0.040** / 0.027**
GARCH-Normal | 51 / 18 | 0.800 / 0.014**
GARCH-t | 43 / 13 | 0.271 / 0.266

** indicates rejection of the null hypothesis at the 5% significance level.


Table 4 shows the results from two-sided binomial tests performed on models built using daily returns data. Firstly we note that the normal model applied directly to returns data performs poorly over all periods. A two-sided binomial test rejects the null hypothesis that the observed number of value at risk threshold breaches is consistent with the confidence levels of these thresholds in 2 of the 3 time periods for the 95% value at risk threshold and in all 3 time periods for the 99% value at risk threshold. The GARCH-Normal model performs relatively well when estimating 95% value at risk thresholds since the null hypothesis is not rejected over any of the three time periods. The model does, however, perform poorly when estimating 99% value at risk thresholds since we reject the null hypothesis in both the crisis and post-crisis time periods. Finally we note that the GARCH-t copula exhibits the best performance of the three models. The null hypothesis of the two-sided binomial test is not rejected in any of the three periods for either the 95% or 99% value at risk estimates.

Table 5: Results for back-testing risk aggregation methodologies applied to 10-day returns using a two-sided binomial test.

Period | Model | Breaches observed (95% / 99% VaR) | Two-sided binomial test p-value (95% / 99% VaR)

Pre-crisis: Jan 2003 to June 2008 (138 ten-day periods; expected breaches 7 / 1)
Normal | 8 / 4 | 0.509 / 0.026**
GARCH-Normal | 11 / 5 | 0.089 / 0.006**
GARCH-t | 9 / 3 | 0.308 / 0.101

Crisis: July 2008 to December 2011 (89 ten-day periods; expected breaches 4 / 1)
Normal | 9 / 6 | 0.027** / 0.000**
GARCH-Normal | 11 / 8 | 0.003** / 0.000**
GARCH-t | 9 / 4 | 0.027** / 0.004**

Post-crisis: January 2012 to December 2015 (97 ten-day periods; expected breaches 5 / 1)
Normal | 1 / 3 | 0.047** / 0.033**
GARCH-Normal | 6 / 2 | 0.424 / 0.148
GARCH-t | 3 / 1 | 0.225 / 0.506

** indicates rejection of the null hypothesis at the 5% significance level.

Table 5 shows the results from two-sided binomial tests performed on models built using 10-day returns. Firstly we note that all three models produce less accurate 10-day value at risk estimates than daily value at risk estimates. Once again the normal model fitted directly to returns data performs the worst of the three models, with the null hypothesis rejected in two of the three time periods for 95% value at risk thresholds and in all three time periods for 99% value at risk thresholds. Although the GARCH-Normal and GARCH-t models perform reasonably well pre-crisis and post-crisis, both models fail to produce accurate value at risk thresholds between July 2008 and December 2011.

5.2 Probability assigned to the crisis

Next we consider the probability assigned to a severe 10-day loss observed during the 2008 global financial crisis by each of the risk aggregation methodologies. The loss of 8.353% was observed between the 15th and the 26th of September 2008. All three models were parameterised using data from the 500 trading days preceding the 15th, after which Monte Carlo simulations were done to generate 1 000 000 simulations that predict possible returns over the following 10-day period. These simulations were binned and polynomials were fitted to estimate density functions, as presented in Figure 4.

Figure 4: Estimated density functions of the Normal, GARCH-Normal and GARCH-t models based on Monte Carlo simulations.

From Figure 4 it is clear that the GARCH-t model assigns the most weight to extreme events whilst the Normal model assigns the least weight to extreme events. The estimated probability assigned by each risk aggregation methodology to the aforementioned loss is shown in Table 6.



Table 6: Probabilities assigned to the loss between 15 and 26 September 2008 by each risk aggregation methodology.

Model | Normal | GARCH-Normal | GARCH-t
Probability assigned to 8.353% loss | 0.01% | 0.10% | 0.17%

As expected the GARCH-t model assigns higher probabilities to extreme events and is therefore able to more accurately simulate severe adverse market movements. It should however be noted that the probability assigned to severe events by the GARCH-t model could still be considered relatively low, a problem that could potentially be addressed by techniques such as fitting a generalised Pareto distribution to the tails of loss distributions (Sweetling, 2011).

6. Limitations and considerations for practical implementation

Before presenting a conclusion we first consider some limitations of this investigation and discuss some issues to be considered before implementation of the GARCH-normal or GARCH-t model.

6.1 Grouping stocks by sector

As discussed in section 2.5 the application of copula-based models requires the modeller to reduce the dimensionality of data to a smaller number of asset classes, usually between 10 and 20. Ideally stocks within each asset class should be highly correlated whilst different asset classes should behave independently from one another. For the purpose of this investigation, 10 indices were constructed based on the sector within which companies operate. Although beyond the scope of this investigation, it should be noted that classification by sector is unlikely to prove optimal and practitioners are urged to spend some time refining the classification of stocks into different indices since this will greatly improve the accuracy of models.

6.2 Modelling time dependent volatility

As discussed in previous sections, modelling time dependent volatility improves the accuracy of risk measures. For the purpose of this investigation, time dependent volatility was modelled by fitting a GARCH(1,1) model to each of the 10 sector indices. It should be noted that other GARCH models that use more or fewer previous volatility and return terms will likely be able to model time dependent volatility more accurately. Furthermore, several other models exist that might outperform the GARCH model. These include ARIMA models or variants of the GARCH model that capture asymmetric effects of positive and negative shocks. Therefore we recommend that practitioners spend some time applying model selection techniques to find an optimal model to capture time dependent volatility, since this will improve the accuracy of the model.

6.3 Model inputs

In order to compare the three risk aggregation methodologies considered in this investigation, all models were parameterised using returns data from the previous 500 trading days. Since this number was arbitrarily chosen, it is likely that an alternative observation period could yield more accurate results. Practitioners might consider using longer periods that include stressed conditions to improve modelling results. Similarly, the Monte Carlo simulations done to predict possible future outcomes were limited to 50 000 samples in order to decrease the computation time. Increasing the number of simulations could improve the accuracy of model outputs.

6.4 Marginal distributions

In this investigation, t and normal marginal distributions were applied to transform correlated uniform random variables for the GARCH-t and GARCH-normal models respectively. Other marginal distributions may yield more accurate results and practitioners should therefore spend time finding appropriate distributions. It should be noted that the distributions fitted to the indices need not all be of the same type.
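The transformation itself is a probability integral transform: correlated uniforms drawn from the copula are pushed through the inverse CDF of whichever marginal is chosen for each index. The sketch below shows this for a two-dimensional Gaussian copula with one normal and one Student-t marginal, purely to illustrate that the marginals may differ by index; the correlation value, scales and degrees of freedom are made up.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Step 1: sample from a Gaussian copula (illustrative 2x2 correlation matrix).
corr = np.array([[1.0, 0.6],
                 [0.6, 1.0]])
z = rng.multivariate_normal(mean=np.zeros(2), cov=corr, size=50_000)
u = stats.norm.cdf(z)                      # correlated uniforms in (0, 1)

# Step 2: apply a different marginal to each index via the inverse CDF.
x1 = stats.norm.ppf(u[:, 0], loc=0.0, scale=0.01)      # normal marginal
x2 = stats.t.ppf(u[:, 1], df=5) * 0.01                 # Student-t marginal

simulated = np.column_stack([x1, x2])
print(np.corrcoef(simulated, rowvar=False))
```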

7. Conclusion

This thesis assesses the ability of three risk aggregation methodologies, namely Gaussian assumptions applied directly to returns, a Gaussian copula applied to GARCH residuals, and a t copula applied to GARCH residuals. This was done by back-testing 95% and 99% value at risk estimates obtained from these models over a period of 12 years.

Our results show that Gaussian assumptions are not appropriate when attempting to model equity returns. We showed in Section 4.2 that daily returns of S&P500 stocks exhibit significant evidence of volatility clustering, which is contrary to the assumption of identically distributed observations. Furthermore, we show that stock returns become more correlated during times of stress. Both of these phenomena contradict key underlying assumptions made when fitting a Gaussian model directly to stock returns. We show clearly that this leads to an underestimation of market risk when calculating estimates such as value at risk thresholds, and that this underestimation is more severe for so-called tail events, as evidenced by the severe underestimation of 99% value at risk estimates. Furthermore, we show that the Gaussian model performs particularly poorly during times of stress, as was observed during the 2008 financial crisis. Despite these findings, practitioners frequently estimate value at risk thresholds by applying Gaussian assumptions directly to returns.

Upon comparing the performance of this model to two slightly more sophisticated models, we found that both the GARCH-normal and GARCH-t models outperform a Gaussian model applied directly to returns data. This held when fitting the models to both daily and 10-day return series, and the models consistently outperformed the Gaussian model during the pre-crisis, crisis and post-crisis periods. We also found that the GARCH-t model outperformed the GARCH-normal model over the observed periods.

It should however be noted that the GARCH-t model as applied here is not without its limitations. Even though it outperforms the conventional Gaussian model, its ability to predict severe outcomes over extended periods is limited, as was seen when estimating 99% value at risk thresholds for 10-day returns. Furthermore, the model performed poorly when applied to the stressed conditions between July 2008 and December 2011. For these cases we recommend considering more advanced techniques such as extreme value theory.

In light of the results from this investigation we conclude that, even though its estimates are not perfectly accurate, the GARCH-t model delivers superior estimates for risk measures such as value at risk compared to applying Gaussian assumptions directly to returns. Although slightly more complex than the conventional Gaussian model, we believe that this model can easily be applied by practitioners and recommend its use as a more realistic alternative to the direct application of Gaussian assumptions.


8. List of figures

Figure 1: Graph of cumulative daily returns on the S&P500 index.
Figure 2: Daily returns of financial firms in the S&P500.
Figure 3: GARCH(1,1) residuals from daily returns of financial firms in the S&P500.
Figure 4: Estimated density functions of the Normal, GARCH-Normal and GARCH-t models based on Monte Carlo simulations.

9. List of tables

Table 1: Summary statistics for daily S&P500 returns between January 2003 and June 2008 by sector.
Table 2: Summary statistics for daily S&P500 returns between July 2008 and December 2011 by sector.
Table 3: Summary statistics for daily S&P500 returns between January 2012 and December 2015 by sector.
Table 4: Results for back-testing risk aggregation methodologies applied to daily returns using a two-sided binomial test.
Table 5: Results for back-testing risk aggregation methodologies applied to 10-day returns using a two-sided binomial test.
Table 6: Probabilities assigned to the loss between 15 and 26 September 2008 by each risk aggregation methodology.


10. References

Aas, K. (2004, December). Modelling the dependence structure of financial assets: A survey of four copulas. Retrieved from www.nr.no/files/samba/bff/SAMBA2204b.pdf

Aas, K., & Berg, D. (2009, October). Models for construction of multivariate dependence - A comparison study. The European Journal of Finance, 15(7), 639-659.

Ammann, M., & Süss, S. (2009, May). Asymmetric dependence patterns in financial time series. The European Journal of Finance, 15(7), 703-719.

Ang, A., & Chen, J. (2000). Asymmetric Correlations of Equity portfolios. Retrieved from https://www.nr.no/files/samba/bff/SAMBA2204.pdf

Aparicio, F. M., & Estrada, J. (2001, January). Empirical distributions of stock returns: European securities markets, 1990-95. European Journal of Finance, 7(1), 1-21.

Artzner, P., Delbaen, F., Eber, J., & Heath, D. (2011). Coherent measures of risk. Mathematical Finance, 9(3), 203-228.

Berg, D. (2009, October). Copula goodness-of-fit testing: an overview and power comparison. The European Journal of Finance, 15(7), 675-701.

Derendinger, F. (2015, March 8). Copula-based hierarchical risk aggregation. Retrieved from http://arxiv.org/pdf/1506.03564.pdf

Engel, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50, 987-1006.

Fama, E. (1965, January). The behaviour of stock-market prices. Journal of Business, 38(1), 34-105.

Jaworski, P., Durante, F., Hardle, W., & Rychlik, T. (2009). Copula theory and its applications. Warsaw: Springer.

Kon, S. J. (1984, March). Models of Stock Returns: A Comparison. Journal of Finance, 39(1), 147-165.
