
A rule to optimally use vine copulas


Luit M.L. Elbers

Master’s Thesis to obtain the degree in Actuarial Science and Mathematical Finance
University of Amsterdam
Faculty of Economics and Business
Amsterdam School of Economics

Author: Luit M.L. Elbers
Student nr: 10618058
Email: luitelbers@gmail.com
Date: August 15, 2018

Supervisors: Dr. S.U. Can (UvA), L. Tegels MSc (KPMG), J.J. Enthoven MSc (KPMG)


Statement of Originality

This document is written by Luit Elbers who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Abstract

Vine copulas use bivariate copulas as building blocks to construct a multivariate dependence model. These building blocks can be used to build many different vine structures. This makes it an extremely flexible model, but also complicates finding the optimal way to use vine copulas. In this thesis a rule is developed to optimally use vine copulas by improving Dißmann’s algorithm, a widely used sequential method for selecting and estimating a regular (R-)vine structure. We adjust the algorithm to select two subclasses of R-vines: C- and D-vines. The three algorithms are tested on 7 subsets of a 10-dimensional financial dataset of equity and the S&P 500, observed over the last 13 years. The subsets contain varying dependence structures to find out when which algorithm is optimal. We found that the optimal vine class can be identified beforehand by analyzing the Kendall’s tau matrix and developed a rule to select and execute the optimal algorithm. This improved the model in 3 of the 7 subsets and found the same model as the original algorithm in the other 4. Because of the massive number of possible vine structures, it can still not be guaranteed that this rule consistently finds the globally optimal vine copula model. It does, however, improve Dißmann’s widely used algorithm by selecting the most appropriate vine class for every dependence structure.

Keywords: Vine copula, Dißmann’s algorithm, Structure selection, R-vine, C-vine, D-vine, Dependence modelling, Pair-copula construction, Correlation


As everyone knows, a thesis isn’t written in one day. But neither is it written alone. I am grateful to everyone who took the time to discuss it with me, or even just listen to what it is about. Explaining it to people from a non-mathematical-finance background was a bit hard at the beginning, but very helpful as it made me realize which steps needed extra attention.

There are a few people who contributed significantly. First of all, I would like to thank my supervisor Dr. Umut Can from the University of Amsterdam for his guidance and our valuable discussions. He consistently allowed me to follow my own interests during the process, but at the same time steered me in the right direction whenever something was unclear to me.

I would also like to thank my supervisors from KPMG, Luuk Tegels and Jeroen Enthoven. They always kept an eye on the big picture, which was helpful as I sometimes got lost in the details. The atmosphere at the Financial Risk Management department was perfect for writing this thesis and I would like to thank all my other colleagues as well for their support, interest and laughs.

Last but not least I would like to thank my parents and my girlfriend Judith for their ongoing support and encouragement during my studies.

And thanks to you, reader, for having read at least one page of my thesis. For the non-technical readers, I can advise to start by reading the introduction and the conclusion. These two chapters contain the most important and least technical information and can be read on their own.


Contents

1 Introduction
2 Theoretical Framework
  2.1 Copulas
    2.1.1 Definition of copula functions
    2.1.2 Copula families
    2.1.3 Family selection
    2.1.4 Dependence measures
  2.2 Vine copulas
    2.2.1 Pair-copula constructions
    2.2.2 Graph theory
    2.2.3 Vine structures
    2.2.4 Vine structure selection
3 Research Design
  3.1 Data
  3.2 Time series modelling
  3.3 Implementation of vine copulas
    3.3.1 Adjusting Dißmann’s algorithm to select C- and D-vines
    3.3.2 Testing environments
  3.4 Software packages
4 Results
  4.1 Marginal transformations
  4.2 Vine copulas
    4.2.1 Groups 1 to 4
    4.2.2 Groups 5 to 7
5 Conclusion
Appendix
References


Introduction

In the aftermath of the financial crisis, an often heard excuse referred to ‘the formula that killed Wall Street’. This formula, designed by David X. Li (1999), uses the Gaussian copula to model default correlation in order to price credit default swaps. As with most excuses, this phrase was made up by the people responsible and is highly inaccurate. No formula is able to disrupt the financial system by itself; it was the people who used the formula wrongly, without understanding its consequences (Salmon, 2012). This is a dangerous aspect of any formula, but as this particular formula was accepted and used for measuring correlation and pricing risks by nearly all Wall Street investment bankers, its consequences were enormous.

One of the most important issues in mathematical finance is understanding correlation. Firms have more than one source of income, investment portfolios consist of more than one asset, and even a single asset often consists of multiple underlying securities (think of CDOs or multivariate options). In order to make any sensible decision about buying a certain asset or pricing a certain risk, decision-makers need to understand how this action affects other aspects of their company or portfolio. Understanding the correlation structure of a firm’s assets or a portfolio’s financial products is vital for good decision making. Specifically, in the case above, the correlation structure of default risk was wrongly captured by Li’s formula, which ended up being one of the main causes of the collapse of the financial system.

In statistics, modelling a single variable using historical data is a well-developed area and there are many excellent methods to do this. Techniques for modelling the dependence between those variables are, however, less advanced. For many years, this has been done using the linear correlation coefficient. This coefficient indicates how strongly two variables are correlated, but it has severe shortcomings. Embrechts, McNeil and Straumann (2002) argue that, because the linear correlation coefficient gives just one number, it cannot possibly capture the dependence structure correctly when the multivariate distribution is non-elliptical. The linear correlation coefficient only gives information about how much two variables are correlated, but is unable to capture important properties like tail dependence or asymmetrical dependence (upper/lower tail dependence).

This is a major shortcoming when modelling financial data or insurance losses, as both show a significantly stronger correlation in the lower tail than in the rest of the distribution. This is a result of the interconnectedness of the system: if one bank suffers a big loss, another bank’s assets are affected as well, resulting in a loss for that bank too. The insurance world is similar: in case of a natural disaster, for example, not one but many insurers suffer big losses. Widely used models based on the linear correlation coefficient, like the Capital Asset Pricing Model (CAPM), severely underestimate the dependence in the lower tail (Embrechts et al., 2002), leaving banks and insurers unprepared for crises as their models underestimate the chance of one occurring.

Copulas provide an elegant way of modelling dependence, one that can fully capture the dependence structure. Although first mentioned by Sklar (1959), copulas did not become popular in mathematical finance until the end of the 20th century. Frees and Valdez (1998) describe copulas as functions that link univariate marginal distributions to their full multivariate joint distribution. This creates great flexibility for the dependence model by dividing the complex problem of modelling a multivariate joint distribution into two separate problems: modelling the marginals and modelling the dependence structure using a copula. The marginals can now be described by different univariate distributions, while the copula can focus purely on modelling the dependence.

There are many different copula families used in statistical modelling. One of them is the Gaussian copula, the one used in Li’s formula. The problem with the Gaussian copula is that it completely ignores tail dependence. When people used Li’s formula without understanding this, they thought the dependence structure was perfectly captured by the formula, while in fact the formula failed to address the most important part of the dependence: the dependence in the tails. There are, however, many other copula families which can model tail dependence, or, more specifically, asymmetric dependence such as upper or lower tail dependence. Copulas nevertheless have their own limitations when applied to more than two variables. While there are many different bivariate copula families, each capable of modelling a particular type of dependence, only a few multivariate (more than two dimensions) copulas are available (Gruber and Czado, 2015). Moreover, the few multivariate copulas that are available can only capture limited patterns of dependence. This leads to the need for a better multivariate copula model. One of the most promising models is the vine copula, which Aas, Czado, Frigessi and Bakken (2009) describe as a cascade of bivariate (pair-)copulas. Vine copulas use bivariate copulas as building blocks to construct a multivariate dependence model. This creates enormous flexibility, as every pair of variables can be modelled by its individually best-fitting bivariate copula. Like children’s LEGO building blocks, the pair-copulas in a vine copula model can be used to build many different structures, enabling the model to optimally capture the entire dependence structure. These vine structures can be divided into three classes: regular (R-)vines, canonical (C-)vines and drawable (D-)vines.

The literature agrees upon the fitting qualities of vine copulas and praises their capability of capturing the dependence structure, but at the same time the model is not widely used in practice (A. van Stee, personal communication, May 13, 2018). This can have various causes: perhaps it is too abstract for risk managers to understand, or it is not clear enough how the model should be used. The example of Li’s formula shows that practitioners are not against using a model without completely understanding it, but it must be generally applicable and not overcomplicated.

This thesis discusses the wide possibilities of vine copulas and tries to find an answer to the question: how can vine copulas be optimally used as a dependence model? Optimal here means finding the best trade-off between model fit and complexity, measured by the Akaike Information Criterion (AIC). In higher dimensions, which are often seen in practice, there are extremely many possible vine copula specifications, and the question then is how to find the best-fitting model while keeping an eye on the complexity.

It is of particular interest to see how the different vine classes (R-vines, C-vines and D-vines) perform and to examine how a vine structure should be selected to find the optimal model. And, to avoid underestimating tail dependence like the many bankers who used Li’s formula did: do these vine copulas use the Gaussian copula, or do they use copula families which are capable of capturing tail dependence?

The remainder of this thesis is divided into four chapters. Chapter 2 contains the theoretical framework, discussing the relevant existing theory concerning copulas and vine copulas. The chapter ends by proposing a method to select an R-vine structure in order to find the optimal dependence model. Then a quantitative analysis is performed on a 10-dimensional financial dataset to examine the different vine classes and copula families. How this research is designed is explained in Chapter 3, including information about the dataset used, how this data is transformed and which vine copula models are estimated. Chapter 4 then shows and discusses the results of this analysis and develops a rule for optimally using vine copulas as a dependence model. The rule decides between estimating an R-vine, a C-vine or a D-vine. All results are summarized in Chapter 5, along with some suggestions for further research.


Theoretical Framework

The most important results on copulas are discussed in the first part of this chapter. This includes the discussion of dependence measures which cannot fully describe the dependence structure on their own, but are important for building copula models. Section 2.2 then discusses how bivariate copulas can be used as building blocks to construct vine copulas.

2.1 Copulas

As described in the introduction, the motive for researching vine copulas is to find a proper way to model dependency as the often used linear correlation coefficient has severe shortcomings when modelling financial data. This means finding a way to describe the joint distribution function.

Definition 2.1.1. The joint cumulative distribution function (CDF) of a random vector (X1, . . . , Xd) is defined as:

F (x1, . . . , xd) = P (X1 ≤ x1, . . . , Xd ≤ xd) . (2.1)

It is difficult to find a correct formula for this joint CDF for a given dataset, which is why most of the models used by financial institutions to price or value multivariate assets or a multivariate portfolio are based on the linear correlation coefficient (Embrechts et al., 2002). Using this measure for correlation means tail dependence is completely ignored, as this measure can only capture risks that are elliptically distributed (like the multivariate normal distribution). Financial risks are however known to be non-normally distributed.

Lots of attention in the world of dependence modelling has recently gone to copulas, which provide a way to find a better description of the joint distribution function. This is done by finding a copula function that links the marginals to the joint distribution function.

The first part of this section gives a formal definition of copula functions, along with some important properties. In Section 2.1.2 the most important copula families are given, after which Section 2.1.3 explains how the right copula family can be selected for a given dataset. Section 2.1.4 describes basic properties of dependence measures which are not sufficient to describe the dependence structure, but turn out to be very useful when building a vine copula model.


2.1.1 Definition of copula functions

A copula is a multivariate distribution function with standard uniform margins. This separates the problem of modelling a multivariate joint distribution into (1) modelling the marginals and (2) modelling the dependence structure. The most important result on copulas was found by Sklar (1959) and is now known as Sklar’s theorem.

Theorem 2.1.1 (Sklar’s theorem). Let X = (X1, . . . , Xd) be a random vector with joint cumulative distribution function F (x1, . . . , xd) and with marginal distribution functions Fi(xi) = P (Xi ≤ xi), i ∈ {1, . . . , d}. Then there exists a d-dimensional copula C such that for all x = (x1, . . . , xd)′ ∈ R^d:

F (x1, . . . , xd) = C{F1(x1), . . . , Fd(xd)} . (2.2)

If the joint cumulative distribution function is absolutely continuous and the marginal cumulative distribution functions are strictly increasing and continuous, then the copula C is unique. The theorem also works in the opposite direction: if C is a d-dimensional copula and F1(x1), . . . , Fd(xd) are marginal distribution functions, then F (x1, . . . , xd) defined by equation

The complete proof of Sklar’s theorem can be found in Sklar (1996). We now see that the copula C links the marginal distributions to the multivariate distribution: the problem of modelling a multivariate distribution is successfully separated into two problems. Embrechts, Lindskog and McNeil (2001) use the second part of Sklar’s theorem to find an expression for the copula C :

Corollary 2.1.1. Let F (x1, . . . , xd) be a d-dimensional distribution function with continuous margins F1, . . . , Fd and copula C (where C satisfies equation (2.2)). Then for any u = (u1, . . . , ud) in [0, 1]^d,

C{u1, . . . , ud} = F (F1^{−1}(u1), . . . , Fd^{−1}(ud)) . (2.3)

The intuition is as follows. First, one transforms the marginals into uniformly distributed variables Ui = Fi(Xi). Then the copula C is the joint cumulative distribution function of these standard uniform random variables. The marginals can be recovered using Xi = Fi^{−1}(Ui).

This can be done for any marginals, no matter how they are distributed. Xi and Xj can even have different distributions, which allows for great flexibility in dependence modelling. Before copulas were used, this was impossible. For example, a multivariate normally distributed joint distribution implies that its marginals are all normally distributed, even if this is not the case in the underlying dataset.
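The separation that Sklar’s theorem provides can be sketched numerically. The snippet below (an illustration added for this text, using NumPy/SciPy; it is not the implementation used in this thesis) imposes a Gaussian dependence structure on two completely different marginals:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Step 1: draw from a bivariate normal with correlation rho
# (this fixes the dependence structure: a Gaussian copula).
rho = 0.7
cov = [[1.0, rho], [rho, 1.0]]
z = rng.multivariate_normal([0.0, 0.0], cov, size=10_000)

# Step 2: transform each margin to standard uniform, U_i = Phi(Z_i).
u = stats.norm.cdf(z)

# Step 3: impose arbitrary marginals via the inverse CDFs,
# X_i = F_i^{-1}(U_i) -- here exponential and Student-t margins.
x1 = stats.expon.ppf(u[:, 0], scale=2.0)
x2 = stats.t.ppf(u[:, 1], df=4)

# The margins now differ, but the rank-based dependence of the
# Gaussian copula is preserved under these monotone transformations.
tau_z, _ = stats.kendalltau(z[:, 0], z[:, 1])
tau_x, _ = stats.kendalltau(x1, x2)
print(round(tau_z, 3), round(tau_x, 3))  # both close to (2/pi)*arcsin(rho)
```

Because all transformations involved are strictly increasing, the two Kendall’s tau values are identical: the dependence structure is untouched by the choice of marginals.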

The result of equation (2.2) can also be used to find the joint probability density function by taking the partial derivatives with respect to the marginals:

f (x1, . . . , xd) = ∂^d C{F1(x1), . . . , Fd(xd)} / (∂x1 · · · ∂xd)
                 = c_{1...d}{F1(x1), . . . , Fd(xd)} · f1(x1) · . . . · fd(xd) , (2.4)

where c_{1...d} is the density of the CDF C. This also further clarifies how the problem of modelling a multivariate distribution is separated into two problems: the right-hand side of equation (2.4) consists of a part specifying the dependence structure (c_{1...d}{F1(x1), . . . , Fd(xd)}) and a second part specifying the marginals (f1(x1) · . . . · fd(xd)).

The remaining part of this section contains other important properties of copulas, which help in understanding the concept. To not unnecessarily overcomplicate things, these properties are shown for bivariate copulas, but they apply to multivariate (more than two dimensions) copulas as well. Copulas can be seen as the joint cumulative distribution function of standard uniform margins:

C(u, v) = P (U ≤ u, V ≤ v) , ∀ (u, v) ∈ [0, 1]² . (2.5)

This implies copulas also have the following properties:

C(u, 0) = C(0, v) = 0 , ∀ u, v ∈ [0, 1] ,
C(u, 1) = u and C(1, v) = v , ∀ u, v ∈ [0, 1] . (2.6)

These properties are trivial for any bivariate CDF, but it can be helpful to remember them when modelling copulas. The next theorem is a result found in the work of Fréchet (1957), part of which follows from Höffding (1940), and gives lower and upper bounds for copulas:

Theorem 2.1.2 (Fréchet-Höffding). Let C(u, v) be a bivariate copula, then for every (u, v) ∈ [0, 1]²:

max[u + v − 1, 0] ≤ C(u, v) ≤ min[u, v] . (2.7)
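These bounds can be verified numerically for any candidate copula function. A minimal check (illustrative Python, added for this text; the independence copula C(u, v) = uv is used as the test case):

```python
import numpy as np

# Check the Fréchet-Höffding bounds (2.7) numerically for the
# independence copula C(u, v) = u*v on a grid of (u, v) values.
grid = np.linspace(0.0, 1.0, 101)
u, v = np.meshgrid(grid, grid)

lower = np.maximum(u + v - 1.0, 0.0)   # countermonotonicity bound W
upper = np.minimum(u, v)               # comonotonicity bound M
c_indep = u * v                        # independence copula

assert np.all(lower <= c_indep + 1e-12)
assert np.all(c_indep <= upper + 1e-12)
print("bounds hold on the whole grid")
```

For the independence copula the bounds even follow analytically: uv − (u + v − 1) = (1 − u)(1 − v) ≥ 0 and uv ≤ min[u, v] on the unit square.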

2.1.2 Copula families

Now that it is clear which conditions a copula needs to meet and what properties a copula function has, this section discusses the most popular copula families. As every application might have a different dependence structure, there is no universal best copula (Trivedi & Zimmer, 2007). For every new dataset the best-fitting copula needs to be found and estimated. Infinitely many copula families can be derived, but most of the often-used copula families can be divided into two classes: elliptical copulas and Archimedean copulas. These classes differ in the way the copula functions are derived. The expressions of the most popular copulas are given below for the bivariate case: while there are multivariate elliptical copulas, multivariate Archimedean copulas require very technical conditions. See for example McNeil and Nešlehová (2009) for more information on these conditions. As these conditions are often not met in practice, only the bivariate Archimedean copulas are discussed in this thesis.

Elliptical copulas

Elliptical copulas can be derived by inserting known multivariate distributions into Sklar’s theorem, like the multivariate normal distribution and the multivariate Student-t distribution (Embrechts et al., 2001). These elliptical copulas are easy to derive and relatively easy to understand, but there are two drawbacks. First, as they are elliptical, these copulas cannot capture the asymmetry often found in financial data (negative returns are more correlated than positive returns). The second drawback is that there is no closed-form expression for these copulas.

The Gaussian (Normal) copula is derived from the bivariate Gaussian distribution:

C^N_ρ(u1, u2) = Φ_ρ(Φ^{−1}(u1), Φ^{−1}(u2)) , (2.8)

with Φ_ρ the bivariate normal distribution function with linear correlation coefficient ρ and Φ^{−1} the inverse of the distribution function of the univariate standard normal distribution. Although there is no closed-form expression, this copula can be written in terms of integrals:

C^N_ρ(u1, u2) = ∫_{−∞}^{Φ^{−1}(u1)} ∫_{−∞}^{Φ^{−1}(u2)} 1/(2π(1 − ρ²)^{1/2}) exp( −(s² − 2ρst + t²)/(2(1 − ρ²)) ) ds dt . (2.9)

The Student-t copula is derived likewise, but with the bivariate Student-t distribution instead of the bivariate Gaussian distribution:

C^t_{ν,ρ}(u1, u2) = t_{ν,ρ}(t_ν^{−1}(u1), t_ν^{−1}(u2)) . (2.10)

Here, t_{ν,ρ} denotes the bivariate Student-t distribution function, again with linear correlation coefficient ρ, and with degrees of freedom ν; t_ν^{−1} is the inverse of the distribution function of the univariate Student-t distribution with ν degrees of freedom. Its integral form is:

C^t_{ν,ρ}(u1, u2) = ∫_{−∞}^{t_ν^{−1}(u1)} ∫_{−∞}^{t_ν^{−1}(u2)} 1/(2π(1 − ρ²)^{1/2}) ( 1 + (s² − 2ρst + t²)/(ν(1 − ρ²)) )^{−(ν+2)/2} ds dt . (2.11)

The extra parameter gives the Student-t copula an advantage over the Gaussian copula, as this parameter ensures Student-t copulas can fit data with tail dependence better. In fact, the degrees-of-freedom parameter ν is used to add tail dependence to the copula (Embrechts et al., 2001): the higher ν, the lower the tail dependence. When ν = ∞, the Student-t distribution is equal to the Gaussian distribution. An often used rule of thumb is to use the Gaussian distribution when ν ≥ 30, because the differences between the two distributions are then negligible.

This possibility of modelling tail dependence makes the Student-t copula better suited than the Gaussian copula for distributions with heavy tails, but it is still an elliptical copula and thereby symmetric. The key attribute of financial data (stronger correlation in highly negative returns) can therefore not be modelled, but as the Student-t copula can model symmetric tail dependence this may be sufficient. It will probably overestimate upper tail dependence, but this is possibly not a serious issue: banks do not fail as a result of wrongly modelling highly positive returns; it is the highly negative returns that are most important.
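The difference in tail behaviour between the two elliptical copulas can be made visible by simulation. The sketch below (illustrative Python added for this text; the sample size and parameter values are arbitrary choices, not taken from this thesis) compares the empirical lower-tail concentration of a Gaussian and a Student-t copula with the same correlation parameter:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, rho, nu = 200_000, 0.5, 4
cov = [[1.0, rho], [rho, 1.0]]

# Gaussian copula sample: U_i = Phi(Z_i).
z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
u_gauss = stats.norm.cdf(z)

# Student-t copula sample: T = Z / sqrt(W / nu) with W ~ chi^2(nu),
# then U_i = t_nu(T_i); both samples share the same rho.
w = rng.chisquare(nu, size=n) / nu
t_sample = z / np.sqrt(w)[:, None]
u_t = stats.t.cdf(t_sample, df=nu)

# Empirical lower-tail concentration P(U1 <= q | U2 <= q) for small q.
q = 0.01
ratios = {}
for name, u in [("Gaussian", u_gauss), ("Student-t", u_t)]:
    both = np.mean((u[:, 0] <= q) & (u[:, 1] <= q))
    ratios[name] = both / q
    print(name, round(ratios[name], 3))

# The Student-t value stays clearly above the Gaussian one, reflecting
# the positive tail-dependence coefficient of the t copula.
```

As q shrinks, the Gaussian concentration decays towards zero while the Student-t concentration stabilizes at its positive tail-dependence coefficient.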

Archimedean copulas

The other class, Archimedean copulas, uses a generator function ϕ instead of inserting known multivariate distributions into Sklar’s theorem. The notation of Genest and MacKay (1986) is followed for this class of copulas, as they were the first to compare multiple Archimedean copulas. Theorem 2.1.3 states how and under which conditions a generator function ϕ generates an Archimedean copula.

Theorem 2.1.3 (Archimedean copulas). Let ϕ : [0, 1] → [0, ∞) be a continuous, twice differentiable function that satisfies:

ϕ(1) = 0 , ϕ′(t) < 0 , ϕ′′(t) > 0 , for 0 < t < 1 . (2.12)

This also guarantees that ϕ has an inverse ϕ^{−1} that also has two derivatives. For any ϕ that satisfies these conditions, ϕ generates a bivariate (Archimedean) copula for a pair (U1, U2) as follows:

C(u1, u2) = ϕ^{−1}(ϕ(u1) + ϕ(u2)) . (2.13)

This way of creating copulas results in closed-form expressions and, unlike elliptical copulas, Archimedean copulas allow for asymmetric dependence modelling (Embrechts et al., 2001). This creates more flexibility and enables copulas to capture the lower tail dependence so often seen in financial data. The drawback, however, is that there are no practically usable multivariate Archimedean copulas. This gives a motivation to look at vine copulas, which are discussed in Section 2.2.

The most used Archimedean copula families are the Gumbel, Clayton, Frank and Joe copula families. Table 2.1 shows expressions for these Archimedean copula families, along with their generator functions ϕ. When fitting these copula families, their parameter θ can be estimated by a number of methods (see for example Joe (1997)), but for reasons of reliability and computational complexity Maximum Likelihood Estimation (MLE) is used in this thesis.

Copula family | Generator function ϕ(t) | Copula CDF C_θ(u1, u2)
Gumbel  | (−log(t))^θ | exp{ −[ (−log(u1))^θ + (−log(u2))^θ ]^{1/θ} }
Clayton | (1/θ)(t^{−θ} − 1) | max[ u1^{−θ} + u2^{−θ} − 1 ; 0 ]^{−1/θ}
Frank   | −log[ (exp(−θt) − 1)/(exp(−θ) − 1) ] | −(1/θ) log[ 1 + (exp(−θu1) − 1)(exp(−θu2) − 1)/(exp(−θ) − 1) ]
Joe     | −log(1 − (1 − t)^θ) | 1 − [ (1 − u1)^θ + (1 − u2)^θ − (1 − u1)^θ(1 − u2)^θ ]^{1/θ}

Table 2.1: Archimedean copula expressions

These Archimedean copula families capture different types of tail dependence, which is why it is a good choice to select one of these families when fitting a copula. Smith (2003) compares all four and shows that the Gumbel copula captures weak upper tail dependence and the Clayton copula strong lower tail dependence. The Frank copula is a symmetric copula without tail dependence, suited for dependence concentrated in the centre of the distribution; it is however not elliptical, and thus different from the Gaussian and Student-t copulas discussed earlier. Like the Gumbel copula, the Joe copula captures upper tail dependence, but more strongly. The Gumbel, Clayton and Joe families can thus each capture a specific form of tail dependence.
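Equation (2.13) makes Archimedean copulas straightforward to implement. As a sketch (illustrative Python added for this text; the function names are this text’s own), the Clayton copula can be built directly from its generator in Table 2.1 and checked against the closed-form CDF:

```python
import numpy as np

# Build the Clayton copula from its generator (Theorem 2.1.3 and
# Table 2.1): phi(t) = (1/theta) * (t^-theta - 1), theta > 0.
theta = 2.0

def phi(t):
    return (t ** -theta - 1.0) / theta

def phi_inv(s):
    return (theta * s + 1.0) ** (-1.0 / theta)

def clayton_from_generator(u1, u2):
    # C(u1, u2) = phi^{-1}(phi(u1) + phi(u2))
    return phi_inv(phi(u1) + phi(u2))

def clayton_closed_form(u1, u2):
    # Closed-form CDF from Table 2.1 (for theta > 0 the max is inactive).
    return np.maximum(u1 ** -theta + u2 ** -theta - 1.0, 0.0) ** (-1.0 / theta)

u1, u2 = 0.3, 0.8
print(clayton_from_generator(u1, u2), clayton_closed_form(u1, u2))  # identical
```

Substituting the generator into ϕ^{−1}(ϕ(u1) + ϕ(u2)) reproduces the Table 2.1 expression algebraically, which is exactly what the numerical comparison confirms.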


Furthermore, the asymmetrical Gumbel, Clayton and Joe copulas can be rotated, as described by Embrechts et al. (2001). As a result of their generator functions, these copulas can only be used for positive θ if all necessary conditions from Theorem 2.1.3 are to remain satisfied. To enable these families to capture all types of dependence (upper tail dependence, lower tail dependence, and negative upper or lower tail dependence), they can be rotated by 90, 180 or 270 degrees. This implies that for every type of dependence, the best-fitting copula family can be chosen out of all six discussed families.
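A sketch of these rotations (illustrative Python added for this text; rotation conventions differ slightly between software packages, so the formulas below follow one common convention and are not necessarily the one used in this thesis):

```python
# Rotations of a bivariate copula CDF, one common convention:
#   90:  C_90(u, v)  = v - C(1 - u, v)
#   180: C_180(u, v) = u + v - 1 + C(1 - u, 1 - v)   (survival copula)
#   270: C_270(u, v) = u - C(u, 1 - v)
# Illustrated with a Clayton copula (theta > 0).
theta = 2.0

def clayton(u, v):
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def rotate(c, deg):
    if deg == 90:
        return lambda u, v: v - c(1.0 - u, v)
    if deg == 180:
        return lambda u, v: u + v - 1.0 + c(1.0 - u, 1.0 - v)
    if deg == 270:
        return lambda u, v: u - c(u, 1.0 - v)
    return c

# The 180-degree rotation moves Clayton's lower tail dependence to the
# upper tail; the 90- and 270-degree rotations give negative-dependence
# versions of the family.
survival = rotate(clayton, 180)
print(round(clayton(0.05, 0.05), 4), round(survival(0.95, 0.95), 4))
```

By construction the rotated CDF satisfies C_180(u, v) = u + v − 1 + C(1 − u, 1 − v), so the survival copula evaluated near (1, 1) mirrors the original copula near (0, 0).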

2.1.3 Family selection

Now that we know which copula families are commonly used and how their parameters can be estimated, we must find a way to decide which family fits best. When looking graphically at bivariate data, one can probably find out whether it shows some form of tail dependence, and choose an appropriate copula family based on the plot. However, this method is both very subjective and impossible to implement when many (if not thousands of) variable pairs must be fitted. As this is often the case in finance, a faster method to choose the right copula family is needed.

As the goal of this thesis is to examine how vine copulas can be optimally used as a dependence model and not to test which family selection method is optimal, we follow the results of Manner (2007) and Brechmann (2010). Both Manner and Brechmann performed a Monte Carlo simulation study to test which selection strategy most often leads to the correct result. In his study, Manner compared the Akaike Information Criterion (AIC), the Kolmogorov-Smirnov test, the Chi-square test and the Jarque-Bera test and found that the AIC gives the best results. It only fails to select the correct copula family when dependence is low, but this problem is negligible because in that case the AIC chooses a copula family that is very close to the correct one (Manner, 2007).

Brechmann (2010) questions this result and performs another Monte Carlo simulation study, where he uses the more rigorous goodness-of-fit tests, data characteristics, AIC, BIC and Vuong tests. Although arguing beforehand that the AIC test might not be a satisfying method when dependence is weak and goodness-of-fit tests are more appropriate for non-nested models (which different copula families often are), he too finds that the AIC has the highest accuracy in selecting the correct copula family (Brechmann, 2010). This result holds even when dependence is weak. As the results of both Manner and Brechmann lead to the AIC test, this is the test we will use for selecting a copula family. The AIC test is explained briefly in the next paragraph, more information concerning goodness-of-fit tests or other family selection methods can be found in the papers of Manner (2007) and Brechmann (2010).

The Akaike Information Criterion is a well-known statistic to compare the quality of statistical models.

Definition 2.1.2. The AIC is defined as:

AIC = 2k − 2 log(L̂) , (2.14)

where k is the number of parameters and L̂ is the maximum likelihood of the model.

The AIC test computes the AIC for all chosen models and then chooses the model with the lowest AIC. Note that the factor 2k translates to a penalty for models with more parameters (in this context the Student-t copula, which is the only considered copula with two parameters instead of one).
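As a toy illustration of the AIC test (a Python sketch added for this text, on univariate distributions rather than copulas): fit a normal model (k = 2) and a Student-t model (k = 3) to heavy-tailed data, then select the candidate with the lowest AIC:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Generate heavy-tailed data, then compare a normal fit (2 parameters)
# with a Student-t fit (3 parameters) via AIC = 2k - 2*log(L_hat).
data = stats.t.rvs(df=3, size=2_000, random_state=rng)

def aic(loglik, k):
    return 2 * k - 2 * loglik

mu, sigma = stats.norm.fit(data)
aic_norm = aic(np.sum(stats.norm.logpdf(data, mu, sigma)), k=2)

df, loc, scale = stats.t.fit(data)
aic_t = aic(np.sum(stats.t.logpdf(data, df, loc, scale)), k=3)

# Despite the 2k penalty for its extra parameter, the Student-t
# model wins here because its likelihood is so much higher.
print(aic_norm > aic_t)  # True
```

The 2k term is exactly the penalty mentioned above: a model with more parameters must earn its keep through a sufficiently higher maximum likelihood.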

2.1.4 Dependence measures

Copulas can in theory be used to describe the full dependence structure, but other ways to mea-sure dependency exist as well. This section discusses the linear (Pearson) correlation coefficient and Kendall’s tau and explains if and how they can be related to copulas.

Pearson correlation coefficient

The Pearson or linear correlation coefficient is the most widely used measure of dependence, probably because it is both easy to compute and easy to interpret. It is often referred to simply as ‘correlation’, implying that it is the definition of correlation. This is not true: it is one of many measures of correlation, but because of its popularity the Pearson correlation coefficient is often mistaken for the definition of correlation.

Definition 2.1.3. The Pearson correlation coefficient for two random variables X and Y is defined as:

ρ(X, Y ) = Cov(X, Y ) / ( √Var(X) · √Var(Y ) ) . (2.15)

As the name suggests, it measures linear correlation and this coefficient results in values between -1 (perfect negative dependence) and 1 (perfect positive dependence). This makes it easy to interpret, but because it can only measure linear correlation it can miss important parts of the dependence between two variables. Another weakness is that the Pearson correlation coefficient is not defined when the variance of X or Y is infinite. This is a problem for heavy-tailed distributions with infinite variances. A third weakness is that this dependence measure does not work both ways: when two random variables X and Y are independent the Pearson correlation coefficient is zero, but when the coefficient is zero this does not mean the two variables are independent. This is only the case when the variables are bivariate normally distributed. The last weakness is that the coefficient is not invariant under non-linear strictly increasing transformations, which is a desirable property for dependence measures.

Embrechts et al. (2001) discuss the weaknesses of Pearson’s correlation coefficient in more detail and explain why it is an inappropriate correlation measure to use in finance, mainly because of the non-normality often seen in financial data. As the Pearson coefficient measures dependence correctly only when the joint distribution is normal, it often fails in finance and gives people who use it a false idea about the true dependence.
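Two of these weaknesses are easy to demonstrate empirically (an illustrative Python sketch added for this text):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Weakness 1: Pearson correlation is not invariant under non-linear
# strictly increasing transformations, while Kendall's tau is.
x = rng.normal(size=5_000)
y = 0.8 * x + 0.6 * rng.normal(size=5_000)   # Pearson correlation 0.8

r_before, _ = stats.pearsonr(x, y)
r_after, _ = stats.pearsonr(np.exp(x), np.exp(y))   # monotone transform
tau_before, _ = stats.kendalltau(x, y)
tau_after, _ = stats.kendalltau(np.exp(x), np.exp(y))
print(round(r_before - r_after, 3))     # clearly non-zero
print(round(tau_before - tau_after, 3)) # exactly zero

# Weakness 2: zero Pearson correlation does not imply independence.
z = rng.normal(size=5_000)
r_dep, _ = stats.pearsonr(z, z ** 2)    # strongly dependent, yet r near 0
print(round(r_dep, 2))
```

The pair (z, z²) is a standard counterexample: the two variables are functionally dependent, yet their Pearson correlation is zero in expectation.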


Kendall’s tau

A dependence measure that does not suffer from some of these weaknesses is the Kendall rank correlation coefficient, most often called Kendall’s tau. Because Kendall’s tau measures correlation between the ranks of two random variables, it is able to capture dependence of any monotonic function and not just linear dependence.

Definition 2.1.4. Kendall’s tau for two random variables X and Y is defined as

\[ \tau(X, Y) = P\{(X_1 - X_2)(Y_1 - Y_2) > 0\} - P\{(X_1 - X_2)(Y_1 - Y_2) < 0\} , \tag{2.16} \]

where (X_1, Y_1) and (X_2, Y_2) are two independent pairs of random variables with the same joint distribution as (X, Y).

Hence, Kendall’s tau is the probability of concordance minus the probability of discordance. When calculating Kendall’s tau empirically, we compute how often a pair of datapoints is concordant minus how often a pair is discordant. A pair of datapoints (X1, Y1) and (X2, Y2) is

concordant if either X1 > X2 and Y1 > Y2, or X1 < X2 and Y1 < Y2. This can be written in

one equation as (X1− X2)(Y1− Y2) > 0. A pair is discordant if (X1− X2)(Y1− Y2) < 0.

Embrechts et al. (2001) show that Kendall's tau does not suffer from these weaknesses of the Pearson correlation coefficient: Kendall's tau always exists, is invariant under non-linear strictly increasing transformations, and is independent of the marginal distributions. This last aspect brings another interesting and useful insight. Both copulas and Kendall's tau are independent of their marginals. This implies Kendall's tau is, much like copulas, a measure that carries information purely about the dependence structure and not about the marginal distributions. Embrechts et al. (2001) prove the following relationship between copulas and Kendall's tau:

\[ \tau(X, Y) = 4 \int_0^1 \!\! \int_0^1 C(u, v) \, dC(u, v) - 1 . \tag{2.17} \]

Before this, Genest and MacKay (1986) had already proved that there exists a specific relationship between (the generator function of) Archimedean copulas and Kendall's tau:

\[ \tau(X, Y) = 4 \int_0^1 \frac{\varphi(t)}{\varphi'(t)} \, dt + 1 . \tag{2.18} \]
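As a worked example of (2.18) (not in the original text), take the Clayton copula with generator $\varphi(t) = (t^{-\theta} - 1)/\theta$, $\theta > 0$:

```latex
\varphi'(t) = -t^{-\theta-1}
\quad\Longrightarrow\quad
\frac{\varphi(t)}{\varphi'(t)} = \frac{(t^{-\theta}-1)/\theta}{-t^{-\theta-1}}
  = \frac{t^{\theta+1} - t}{\theta},
\qquad
\int_0^1 \frac{t^{\theta+1} - t}{\theta}\,dt
  = \frac{1}{\theta}\left(\frac{1}{\theta+2} - \frac{1}{2}\right)
  = -\frac{1}{2(\theta+2)},
```

so that $\tau = 4 \cdot \bigl(-\tfrac{1}{2(\theta+2)}\bigr) + 1 = \theta/(\theta+2)$, the well-known closed-form Kendall's tau of the Clayton family.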

There are many other dependence measures, one of which is Spearman's rho, but Kendall's tau has the most suitable properties for use in combination with copulas, so the others are not discussed here. For more information on other dependence measures, see Embrechts et al. (2002).

Section 2.1 discussed the most important results regarding copulas. The Gaussian, Student-t, Gumbel, Clayton, Frank and Joe copulas were discussed in more detail and a method to select one of these copula families was given. The last part of the section explained the Pearson correlation coefficient, which is the most used measure for dependence but often gives misleading results, and Kendall's tau, which is a better alternative if one desires a scalar measure for dependence between two variables. The next section explains vine copulas, a method to use bivariate copulas as building blocks to build a model for multivariate dependence.


2.2 Vine copulas

Since most financial products or portfolios consist of many constituents, dependence modelling techniques must be applicable in a multivariate setting. Elliptical copula models can be used in a multivariate setting, but they are highly restrictive: a multivariate Gaussian copula, for example, imposes the same symmetric, tail-independent type of dependence on every pair of variables. Archimedean copula models require highly technical conditions in order to be applied in a multivariate setting. As these conditions are often not met, they cannot be used in practice. This gives motivation to look for a multivariate dependence model using only bivariate copulas: vine copulas.

Vine copulas use bivariate copulas as building blocks to construct a multivariate dependence model. Joe (1996) was the first to come up with the idea and a few years later Bedford and Cooke (2001, 2002) introduced vines as a graphical model to identify in which order the building blocks are used to construct a multivariate dependence model. This new model was largely overlooked until Aas et al. (2009) wrote an influential paper where they coined this technique pair-copula construction. The term vine copula combines the graphical model (vines) with the technique to use bivariate (pair-)copulas as building blocks (pair-copula construction).

This section starts with an explanation of pair-copula constructions while temporarily pretending the vine structure is not an issue. Then Section 2.2.2 explains basic graph theory which is needed for the discussion of different vine structures in Section 2.2.3. After this a sequential method to select and estimate the right vine structure is discussed.

2.2.1 Pair-copula constructions

This section explains how multiple bivariate copulas can be combined to construct a multivariate dependence model. The discussion is predominantly based on the paper of Aas et al. (2009), who describe it as a cascade of pair-copulas. This results in a multivariate model where the dependence between every pair of variables can be modelled by its own best-fitting copula, making it extremely flexible.

The problem we ultimately want to solve is to find an expression for the joint density function f(x_1, . . . , x_d). The density function can be factorized as follows:

\[ f(x_1, \ldots, x_d) = f_d(x_d) \cdot f(x_{d-1} \mid x_d) \cdot f(x_{d-2} \mid x_{d-1}, x_d) \cdots f(x_1 \mid x_2, \ldots, x_d) . \tag{2.19} \]

To find an expression for this density function, we rewrite it in terms of copulas by using the result of Sklar’s theorem (equation (2.4)):

\[ f(x_1, \ldots, x_d) = c_{1 \ldots d}\{F_1(x_1), \ldots, F_d(x_d)\} \cdot f_1(x_1) \cdots f_d(x_d) . \tag{2.20} \]

The term c_{1...d}{F_1(x_1), . . . , F_d(x_d)} in this equation represents the density of a d-dimensional copula, which is difficult to specify directly in higher dimensions.

This is what pair-copula constructions are used for. By using transformed conditional marginals and density functions, the output of two pair-copulas can be used as input for a new pair-copula. To explain this, the following equations first show how a regular pair-copula is constructed (which works the same as in Section 2.1) and thereafter how the output of two pair-copulas is used as input for a new pair-copula. Writing equation (2.20) in the bivariate version gives

\[ f(x_1, x_2) = c_{12}\{F_1(x_1), F_2(x_2)\} \cdot f_1(x_1) \cdot f_2(x_2) . \tag{2.21} \]

Here, c_{12}{F_1(x_1), F_2(x_2)} is the pair-copula density of the transformed variables F_1(x_1) and F_2(x_2). Using basic conditional probability theory, the conditional density then is

\[ f(x_1 \mid x_2) = \frac{f(x_1, x_2)}{f_2(x_2)} = \frac{c_{12}\{F_1(x_1), F_2(x_2)\} \cdot f_1(x_1) \cdot f_2(x_2)}{f_2(x_2)} = c_{12}\{F_1(x_1), F_2(x_2)\} \cdot f_1(x_1) . \tag{2.22} \]

This conditional density can be inserted into the bivariate version of equation (2.19) to obtain equation (2.21). It gets interesting when expanding this to the 3-dimensional case, where we have to find an expression for

\[ f(x_1, x_2, x_3) = f_3(x_3) \cdot f(x_2 \mid x_3) \cdot f(x_1 \mid x_2, x_3) . \tag{2.23} \]

The second term on the right-hand side of this equation can be found using equation (2.22). The third term, f(x_1 | x_2, x_3), can be decomposed as follows:

\[ f(x_1 \mid x_2, x_3) = c_{13|2}\{F(x_1 \mid x_2), F(x_3 \mid x_2)\} \cdot f(x_1 \mid x_2) , \tag{2.24} \]

where c_{13|2} represents the pair-copula density of x_1 and x_3 conditional on x_2. This new pair-copula is applied to the transformed variables F(x_1 | x_2) and F(x_3 | x_2), which will be derived later. Further decomposing f(x_1 | x_2) in equation (2.24) gives

\[ f(x_1 \mid x_2, x_3) = c_{13|2}\{F(x_1 \mid x_2), F(x_3 \mid x_2)\} \cdot c_{12}\{F_1(x_1), F_2(x_2)\} \cdot f_1(x_1) . \tag{2.25} \]

Inserting these results into equation (2.23) yields

\[ \begin{aligned} f(x_1, x_2, x_3) &= f_3(x_3) \cdot f(x_2 \mid x_3) \cdot f(x_1 \mid x_2, x_3) \\ &= f_3(x_3) \cdot c_{23}\{F_2(x_2), F_3(x_3)\} \cdot f_2(x_2) \\ &\quad \cdot c_{13|2}\{F(x_1 \mid x_2), F(x_3 \mid x_2)\} \cdot c_{12}\{F_1(x_1), F_2(x_2)\} \cdot f_1(x_1) . \end{aligned} \tag{2.26} \]

Rearranging this gives the following expression for our first pair-copula construction:

\[ \begin{aligned} f(x_1, x_2, x_3) &= f_1(x_1) \cdot f_2(x_2) \cdot f_3(x_3) \\ &\quad \cdot c_{12}\{F_1(x_1), F_2(x_2)\} \cdot c_{23}\{F_2(x_2), F_3(x_3)\} \\ &\quad \cdot c_{13|2}\{F(x_1 \mid x_2), F(x_3 \mid x_2)\} . \end{aligned} \tag{2.27} \]


This can be expanded to any dimension by decomposing each term in equation (2.19) into an appropriate pair-copula and a conditional marginal density, as in (2.24). This can be done using the general formula

\[ f(x \mid \mathbf{v}) = c_{x v_j | \mathbf{v}_{-j}}\{F(x \mid \mathbf{v}_{-j}), F(v_j \mid \mathbf{v}_{-j})\} \cdot f(x \mid \mathbf{v}_{-j}) , \tag{2.28} \]

where v_j is an arbitrarily chosen component of \(\mathbf{v}\) and \(\mathbf{v}_{-j}\) is the vector \(\mathbf{v}\) excluding this j-th component.

Then the only remaining step is the specification of the transformed conditional marginals like F(x_1 | x_2) in equation (2.24). Joe (1996) derived the following general equation for this in terms of \(\mathbf{v}\) and j:

\[ F(x \mid \mathbf{v}) = \frac{\partial C_{x, v_j | \mathbf{v}_{-j}}\{F(x \mid \mathbf{v}_{-j}), F(v_j \mid \mathbf{v}_{-j})\}}{\partial F(v_j \mid \mathbf{v}_{-j})} . \tag{2.29} \]

For example, for F(x_1 | x_2) this gives

\[ F(x_1 \mid x_2) = \frac{\partial C_{x_1 x_2}\{F(x_1), F(x_2)\}}{\partial F(x_2)} . \tag{2.30} \]

Aas et al. (2009) introduce the h-function to represent this conditional distribution function when x and v are uniform.

Definition 2.2.1 (h-function).

\[ h(x, v, \Theta) = F(x \mid v) = \frac{\partial C_{x,v}(x, v, \Theta)}{\partial v} , \tag{2.31} \]

where Θ denotes the set of parameters for the pair-copula of the joint distribution function of x and v.
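To make the h-function and the three-dimensional construction of equation (2.27) concrete, the following sketch (my own, with hypothetical function names) uses Gaussian pair-copulas, for which both the copula density and the h-function have closed forms. Only the copula part of (2.27) is evaluated; the marginal densities f_i would multiply the result.

```python
from statistics import NormalDist
import math

N01 = NormalDist()  # standard normal: provides cdf and inv_cdf

def gauss_copula_density(u, v, rho):
    """Bivariate Gaussian copula density c(u, v; rho)."""
    x, y = N01.inv_cdf(u), N01.inv_cdf(v)
    det = 1.0 - rho * rho
    return math.exp(-(rho * rho * (x * x + y * y) - 2.0 * rho * x * y)
                    / (2.0 * det)) / math.sqrt(det)

def h(u, v, rho):
    """h-function of the Gaussian pair-copula: F(u | v) = dC/dv."""
    x, y = N01.inv_cdf(u), N01.inv_cdf(v)
    return N01.cdf((x - rho * y) / math.sqrt(1.0 - rho * rho))

def pcc_density_3d(u1, u2, u3, rho12, rho23, rho13_2):
    """Copula part of eq. (2.27): c12 * c23 * c13|2, the last factor
    evaluated at the h-transformed variables F(u1|u2) and F(u3|u2)."""
    return (gauss_copula_density(u1, u2, rho12)
            * gauss_copula_density(u2, u3, rho23)
            * gauss_copula_density(h(u1, u2, rho12),
                                   h(u3, u2, rho23), rho13_2))
```

With all correlation parameters set to zero every factor equals one (independence), and h(u, v, 0) = u, which is a useful sanity check of the construction.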

2.2.2 Graph theory

Before introducing vines and discussing different vine structures, we need to understand basic graph theory. Graph theory dates back to the Königsberg bridges problem in 1735 (Gross, Yellen & Zhang, 2013), so the treatment of graphs in this section is necessarily incomplete and limited to the definitions of nodes, edges and trees needed to describe vines. For this purpose the work of Bedford and Cooke (2001, 2002) is followed, supplemented by Brechmann (2010). The section also contains a figure of the smallest vine possible, on three variables, to help understand these new concepts.

Definition 2.2.2 (Graph, node, edge, degree). A graph is a pair G = (N, E) of sets such that E ⊆ {{x, y} : x, y ∈ N }. The elements of E are called edges of the graph G and the elements of N are its nodes. The number of neighbouring edges of a node v ∈ N is the degree of v, denoted by d(v).

A graph can be seen as a set of nodes and edges. Figure 2.1 shows the most simplified vine, with d = 3 variables. Tree 1 is a graph with nodes N = {1, 2, 3} that are connected by the edges E = {(1, 2), (2, 3)}. Here nodes 1 and 3 have degree d(1) = d(3) = 1 and node 2 has degree d(2) = 2. If, as in tree 1 of Figure 2.1, every node can be reached from every other node, the graph is connected. If it is possible to go from one node in a graph to another node in more than one way, the graph is cyclic. We now have enough vocabulary to define a tree.

Definition 2.2.3 (Tree). A graph G is a tree T if

1. Any two nodes of T are linked by a unique path.

2. T is minimally connected: T is connected, but removing any edge disconnects it.

3. T is maximally acyclic: T is not cyclic, but adding any single edge makes it cyclic.
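The defining property that a tree is connected with exactly |N| - 1 edges can be checked mechanically; the following small sketch (my own, hypothetical function name) does so with a breadth-first search:

```python
def is_tree(nodes, edges):
    """A graph is a tree iff it has exactly |N| - 1 edges and is connected."""
    if len(edges) != len(nodes) - 1:
        return False
    # Build an adjacency list and check connectivity from an arbitrary node.
    adj = {n: [] for n in nodes}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    start = next(iter(nodes))
    seen, frontier = {start}, [start]
    while frontier:
        n = frontier.pop()
        for m in adj[n]:
            if m not in seen:
                seen.add(m)
                frontier.append(m)
    return seen == set(nodes)
```

Tree 1 of Figure 2.1 (nodes {1, 2, 3}, edges (1, 2) and (2, 3)) passes this check; adding the edge (1, 3) creates a cycle and fails it.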

The name tree has not been given to this type of graph without reason: like real-life trees, the nodes (branches) are all connected, but only through one path. Two different branches never fuse together, so a cycle cannot be formed. A node can however be connected to arbitrarily many new nodes, which leads to many different tree structures. The next section explains how a set of trees forms a vine, and which different vine structures can be created.

Figure 2.1: Vine with 3 variables

2.2.3 Vine structures

This section discusses the definition of a vine and thereafter explains how every vine can be classified as either a regular vine (R-vine), a canonical vine (C-vine), or a drawable vine (D-vine).

Definition 2.2.4 (Vine). V = (T_1, . . . , T_{d-1}) is a vine on d elements if

1. T_1 = (N_1, E_1) is a tree with nodes N_1 = {1, . . . , d} and edges E_1.

2. For i = 2, . . . , d − 1, T_i is a tree with nodes N_i = E_{i−1} and edges E_i.

Looking back at Figure 2.1 we see that indeed tree 1 and tree 2 together form a vine. The edges in tree 1 form nodes in tree 2. This is also called a nested set of trees. This 3-dimensional vine results in the following pair-copula decomposition already given in Section 2.2.1:

\[ \begin{aligned} f(x_1, x_2, x_3) &= f_1(x_1) \cdot f_2(x_2) \cdot f_3(x_3) \\ &\quad \cdot c_{12}\{F_1(x_1), F_2(x_2)\} \cdot c_{23}\{F_2(x_2), F_3(x_3)\} \\ &\quad \cdot c_{13|2}\{F(x_1 \mid x_2), F(x_3 \mid x_2)\} . \end{aligned} \tag{2.32} \]


This is our first, most simplified version of a pair-copula construction, represented by the graphical vine model in Figure 2.1.

Permuting the variables 1, 2 and 3 can be done in six ways, but only three result in different decompositions as for example {1,2,3} and {3,2,1} both result in the decomposition given in equation (2.32). Aas et al. (2009) note that showing the tree structure by drawing a vine is not strictly necessary for applying the pair-copula methodology, but it helps in identifying the different pair-copula decompositions. For example, switching node 1 and node 2 in Figure 2.1 results in the following pair-copula decomposition:

\[ \begin{aligned} f(x_1, x_2, x_3) &= f_1(x_1) \cdot f_2(x_2) \cdot f_3(x_3) \\ &\quad \cdot c_{12}\{F_1(x_1), F_2(x_2)\} \cdot c_{13}\{F_1(x_1), F_3(x_3)\} \\ &\quad \cdot c_{23|1}\{F(x_2 \mid x_1), F(x_3 \mid x_1)\} . \end{aligned} \tag{2.33} \]

When looking at these expressions, it is difficult to see the difference between the two pair-copula constructions. However, when it is shown graphically by drawing the vine structure, it is immediately clear how the two pair-copula decompositions differ (compare Figure 2.1 with Figure 2.2).

Figure 2.2: Another vine with 3 variables

To be able to classify different vine structures, the regular vine is introduced.

Definition 2.2.5 (Proximity condition). A vine V is a regular vine (R-vine) if it satisfies the proximity condition: for i = 2, . . . , d − 1, if {a, b} ∈ E_i with a = {a_1, a_2} and b = {b_1, b_2}, then one of the elements of a must also be an element of b. More compactly, #(a ∩ b) = 1.

It can easily be checked that this is the case in Figure 2.1. In fact, every vine structure used for pair-copula constructions is a regular vine (Aas et al., 2009). The other two classes are subclasses of the regular vine and are defined as follows:
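The proximity condition is a one-line set operation; the sketch below (my own, hypothetical function name) checks it for candidate edges of the next tree, each candidate being a pair of edges from the previous tree:

```python
def satisfies_proximity(edge_a, edge_b):
    """Two edges of tree T_{i-1} may be joined into an edge of tree T_i
    only if they share exactly one node: #(a ∩ b) = 1."""
    return len(set(edge_a) & set(edge_b)) == 1

# Tree 1 of Figure 2.1 has edges (1, 2) and (2, 3). They share node 2,
# so they may be joined in tree 2, giving the pair-copula c_{13|2}.
```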

Definition 2.2.6 (C-vine and D-vine). A regular vine V is

1. A canonical vine (C-vine) if every tree T_i, i = 1, . . . , d − 1, has one node with degree d − i. This node is called the root node of tree T_i.

2. A drawable vine (D-vine) if every node in every tree T_i has degree at most 2.

The vine in Figure 2.1 satisfies all these conditions. In fact, every regular vine with only three variables is both a C-vine and a D-vine. Aas et al. (2009) further show that for four variables, every regular vine is either a C-vine or a D-vine, while for five or more variables three possibilities exist: a vine can then either be an R-vine that is neither, a C-vine, or a D-vine. Figures 2.3-2.5 show three examples of these vine classes for five variables.

Figure 2.3: R-vine with five variables


Figure 2.5: D-vine with five variables

These graphical models are not only useful to spot differences in the structure, they can also be used to quickly derive the pair-copula decomposition of the joint density function. This decomposition is equal to the product of the marginal densities of all variables multiplied by the pair-copula of every edge in the vine, where each of these pair-copulas is applied to the transformed variables of that edge. For example, the pair-copula of the edge in T_4 in Figure 2.3 is applied to the conditional variables F(x_1 | x_2, x_3, x_4) and F(x_5 | x_2, x_3, x_4). These transformed variables can be computed using the h-function given in Definition 2.2.1. This results in the following pair-copula decomposition for the R-vine in Figure 2.3:

\[ \begin{aligned} f(x_1, x_2, x_3, x_4, x_5) = {}& f_1(x_1) \cdot f_2(x_2) \cdot f_3(x_3) \cdot f_4(x_4) \cdot f_5(x_5) \\ & \cdot c_{12}\{F_1(x_1), F_2(x_2)\} \cdot c_{23}\{F_2(x_2), F_3(x_3)\} \cdot c_{34}\{F_3(x_3), F_4(x_4)\} \cdot c_{35}\{F_3(x_3), F_5(x_5)\} \\ & \cdot c_{13|2}\{F(x_1 \mid x_2), F(x_3 \mid x_2)\} \cdot c_{24|3}\{F(x_2 \mid x_3), F(x_4 \mid x_3)\} \cdot c_{45|3}\{F(x_4 \mid x_3), F(x_5 \mid x_3)\} \\ & \cdot c_{14|23}\{F(x_1 \mid x_2, x_3), F(x_4 \mid x_2, x_3)\} \cdot c_{25|34}\{F(x_2 \mid x_3, x_4), F(x_5 \mid x_3, x_4)\} \\ & \cdot c_{15|234}\{F(x_1 \mid x_2, x_3, x_4), F(x_5 \mid x_2, x_3, x_4)\} . \end{aligned} \tag{2.34} \]

The expressions for the pair-copula decompositions of the shown C- and D-vine follow similarly from the vine structures in Figures 2.4 and 2.5 and are given in Appendix A.

These three vine structures are examples only; many different permutations lead to different vine structures. Aas et al. (2009) state that for five variables, 60 unique D-vines and 60 unique C-vines exist. They further found that for a d-dimensional D-vine, there are d! ways to order the variables in tree T_1. As mentioned before, reversing the order does not change the pair-copula decomposition ({1, 2, 3} gives the same joint density as {3, 2, 1}), so this gives d!/2 possibilities to order the variables in T_1. As the following trees are completely determined by the choice of T_1, there are d!/2 different d-dimensional D-vines.

For d-dimensional C-vines, in T_1 there are d possible choices for the root node, while the placement of the other variables makes no difference as they are all connected only to this root node. T_2 has d − 1 possible root nodes, etcetera. In the final tree (T_{d−1}) both nodes can be seen as the root node, as it does not matter which one is placed on the left side (see T_4 in Figure 2.4, where the pair-copula decomposition would be the same if the nodes were reversed). This results in d · (d − 1) · (d − 2) · . . . · 3 = d!/2 unique d-dimensional C-vines.

The total number of different R-vines (including D- and C-vines) is a lot harder to derive and beyond the scope of this thesis; however, Morales-Nápoles, Cooke and Kurowicka (2010) prove this equals \(\binom{d}{2} \cdot (d-2)! \cdot 2^{\binom{d-2}{2}}\). This amounts to 480 regular vines for 5 variables, 23,040 for 6 variables and 2,580,480 different R-vine structures for 7 variables. This number increases so rapidly when adding variables that it becomes impossible to estimate and compare all possible models. This leads to the need for a structure selection mechanism, explained in Section 2.2.4.
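The counting formulas above are easy to verify numerically; the sketch below (my own function names) implements the Morales-Nápoles et al. (2010) R-vine count and the d!/2 count for C- and D-vines:

```python
from math import comb, factorial

def n_regular_vines(d):
    """Number of d-dimensional R-vine structures
    (Morales-Nápoles, Cooke & Kurowicka, 2010):
    C(d, 2) * (d - 2)! * 2^C(d - 2, 2)."""
    return comb(d, 2) * factorial(d - 2) * 2 ** comb(d - 2, 2)

def n_c_or_d_vines(d):
    """Number of distinct d-dimensional C-vines (the D-vine count is equal): d!/2."""
    return factorial(d) // 2
```

Evaluating these reproduces the counts quoted in the text: 480 R-vines and 60 C-vines (or D-vines) for five variables.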

2.2.4 Vine structure selection

Vine copulas are extremely flexible dependence models. This flexibility is a result of two things. First, every single pair-copula can be chosen such that the copula family and the estimated parameters fit the corresponding bivariate data optimally. Second, there are many different vine structures, which result in many possible combinations of data pairs to fit a pair-copula to. This vine structure can be used to choose which variables are coupled directly in T_1, and which variables are coupled conditionally in higher trees. This section discusses the impact of this vine structure and proposes a method to select an R-vine structure, predominantly based on the paper of Dißmann, Brechmann, Czado and Kurowicka (2013).

The problem of fitting a vine copula model now consists of three steps:

1. Selection of the R-vine structure (or possibly the C- or D-vine structure).

2. Copula family selection for each bivariate data pair following from step 1.

3. Estimation of the copula parameters.

Ideally, one would want to carry out step 2 and step 3 for every possible vine structure and then select the best-fitting model. This is however infeasible due to the number of possible R-vines, so another solution has to be found. In their seminal paper introducing pair-copula constructions, Aas et al. (2009) also propose an estimation procedure for C- and D-vines. Although this made it possible to explore these vine classes, their approach is not very rigorous and could not be used to select R-vine structures. Dißmann et al. (2013) improved it and developed a method to select any R-vine structure. It is a sequential method and works as follows.

First, the structure of tree 1 is selected by choosing the pair-copulas to fit the data pairs with the strongest dependencies. This is reasonable because copulas in the first tree have the greatest influence on the model fit and the first trees can be estimated with more precision (Aas et al., 2009). Then, when the structure of tree 1 is determined, step 2 and step 3 can be performed for tree 1 using the theory of Section 2.1 (AIC to select the copula family and MLE to estimate the parameters). By using the h-function (Definition 2.2.1), the outcomes of the pair-copulas from tree 1 are transformed to new conditional variables. Dependence between these conditional variables is again calculated and tree 2 can be formed the same way as tree 1, by letting the pair-copulas model the strongest pairwise dependencies. This is however restricted, as not all transformed variables can be coupled together. Remember the proximity condition (Definition 2.2.5), stating that every two nodes that share an edge must share a node in the previous tree. This shared node is necessary because the new pair-copula has to be conditioned on the variables of this node. When the structure of tree 2 is selected, step 2 and step 3 are performed. This process is then repeated until all d − 1 trees are formed. This sequential process can be written down in the form of an algorithm and is referred to as Dißmann's algorithm. Before writing down this algorithm, the question of how to find the strongest dependency needs to be answered.

As explained in Section 2.1.4, there are multiple dependence measures. But because Dißmann's algorithm needs to work in every case and not only for elliptical distributions, the Pearson correlation coefficient should not be used and Kendall's tau makes sense. Since we are looking for the strongest dependence, it is irrelevant whether this is negative or positive, so the absolute value of Kendall's tau is used. Kendall's tau can easily be calculated for every data pair, but then the problem remains to find a tree which maximizes the sum of these values. This can be solved by using Prim's algorithm (Prim, 1957) to find the maximum spanning tree (MST), with the absolute Kendall's tau of every data pair as edge weights. The algorithms of Dißmann and Prim are given below in pseudocode.

Note that this sequential method does not necessarily lead to the optimal vine structure. It tries to find the optimal model by capturing as much dependency as possible in early trees, but it cannot guarantee to find a global optimum. However, as finding the global optimum is infeasible due to the number of possible R-vine structures, this method leads to a model where as much dependence is modelled as accurately as possible and is by far the most used method in practical applications (Aas, 2016).

This chapter discussed the most important results on copulas and vine copulas. Vine copu-las are extremely flexible dependence models, since (1) every data pair can be modelled by its own best-fitting bivariate copula and (2) the vine structure can be chosen such that the most important dependencies between variables can be modelled optimally. The chapter ended by suggesting a sequential method to select and estimate an R-vine. The next chapter explains how the quantitative research is designed in order to find out how vine copulas can be optimally used as a dependence model.


Algorithm 1 Dißmann’s algorithm for sequentially selecting an R-vine structure based on Kendall’s tau

Input: Data (x1, ..., xd)

Output: R-vine copula specification

1: Calculate the empirical Kendall’s tau ˆτj,k for all possible variable pairs (xj, xk), 1 ≤ j <

k ≤ d.

2: Select the spanning tree that maximizes the sum of absolute empirical Kendall’s taus using Prim’s algorithm, so

max X

e={j,k}in spanning tree

|ˆτj,k|

3: For each edge {j, k} in the selected tree, select a copula family and estimate the correspond-ing parameter(s). Then calculate the transformed variables F (xj|xk) and F (xk|xj) using the

h-function.

4: for i = 2, ..., d − 1 do

5: Calculate the empirical Kendall’s tau ˆτj,k|v for all possible conditional variable pairs

(xj|v, xk|v) that can be part of tree Ti, so all edges that meet the proximity condition.

6: Among these possible edges, select the spanning tree that maximizes the sum of absolute empirical Kendall’s taus using Prim’s algorithm, so

max X

e={j,k|v}in spanning tree

|ˆτj,k|v|

7: For each edge {j, k|v} in the selected tree, select a conditional copula family and estimate the corresponding parameter(s). Then calculate the transformed variables F (xj|xk, v)

and F (xk|xj, v) using the h-function.

8: end for

Algorithm 2 Prim’s algorithm for finding the maximum spanning tree Input: A connected graph G = (N, E) with edge values |ˆτj,k|

Output: The maximum spanning tree T = (N, E0)

1: E0 = {}

2: Use an arbitrary node n as starting node

3: N0= {n}

4: while E0 does not connect all nodes from N do

5: Select an edge e ∈ E with maximal edge value, that connects a new node n0 that is not already connected with E0.

6: N0 = N0+ n0

7: E0 = E0+ e

(27)
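A minimal Python sketch of Prim's maximum spanning tree search, as used in the first step of Dißmann's algorithm (my own function names; edge weights play the role of the absolute empirical Kendall's taus):

```python
def prim_mst_max(nodes, weight):
    """Prim's algorithm for the maximum spanning tree.
    `weight` maps frozenset({j, k}) -> |tau_hat_{j,k}|."""
    nodes = list(nodes)
    in_tree = {nodes[0]}          # arbitrary starting node
    chosen = []
    while len(in_tree) < len(nodes):
        best = None
        # Scan all edges leaving the current tree, keep the heaviest one.
        for j in in_tree:
            for k in nodes:
                if k not in in_tree:
                    w = weight[frozenset((j, k))]
                    if best is None or w > best[0]:
                        best = (w, j, k)
        _, j, k = best
        in_tree.add(k)
        chosen.append(frozenset((j, k)))
    return chosen

# Hypothetical absolute Kendall's tau values for four variables:
tau = {frozenset((1, 2)): 0.9, frozenset((1, 3)): 0.5, frozenset((1, 4)): 0.1,
       frozenset((2, 3)): 0.6, frozenset((2, 4)): 0.7, frozenset((3, 4)): 0.2}
tree1 = prim_mst_max([1, 2, 3, 4], tau)
```

For these weights the selected first tree couples 2 with 1, 3 and 4 (edges {1,2}, {2,3}, {2,4}), the star that captures the strongest pairwise dependencies.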

3 Research Design

This section contains information about the research design. Section 3.1 explains which data is used to test the various vine copula models and describes the main properties of this dataset. As this is financial data, a time series model has to be applied to each marginal variable to filter out time series properties and obtain i.i.d. random variables. Note that the i.i.d. requirement applies to the observations within each marginal and not between marginals; otherwise there would be no dependence left to fit a dependence model on. This is explained in Section 3.2, along with another transformation which is applied to make the marginal residuals uniformly distributed on [0, 1]. The resulting so-called copula data is then ready to be used as input for our copula models. Section 3.3 explains how this transformed data is implemented to fit the different vine copulas. To do this, Dißmann's algorithm is adjusted in Section 3.3.1 to be able to estimate R-, C- and D-vines. Section 3.3.2 then explains on which subsets of the complete dataset the different vine classes are estimated to analyze their performance under different circumstances. The chapter ends with a description of the used software packages.

3.1 Data

This section starts by discussing desirable properties for the dataset and thereafter explains which data is used and where it is found.

As this thesis set out to investigate dependence models and ultimately to find out how vine copulas can be optimally used as a dependence model, the data has to contain a variety of dependence structures. This ensures the models can be tested extensively for different purposes. The dependence in the data should not be too clustered, as this might have too much influence on the outcomes, but it must be ensured that there is enough dependence present. Besides this I have chosen to investigate the time period starting on the first market day of 2005 (before the big financial crisis) and ending on the last market day of 2017, to incorporate both economic upturns and downturns. This results in 3272 datapoints per variable.

For the above reasons the S&P 500 index is chosen, along with the stock prices of its largest components from three different industries. The index is by definition correlated with all its components, and the components themselves are expected to be mostly correlated with components from the same industry and less with components from other industries. An exception is the financial crisis, during which all variables are expected to be more strongly correlated. US stock data is chosen for obvious reasons like liquidity and availability. I expect this data to contain more than enough interesting dependence structures and therefore no data from areas outside the US is used.

The dataset is 10-dimensional, making it possible to explore vine copulas in small dimensions first and then examine what happens when they are applied in higher dimensions. Table 3.1 shows which companies are used, ordered by industry. These are, in 2018 and in terms of market capitalization, the biggest companies in their respective industries, excluding Amazon, Facebook, Berkshire Hathaway and Alphabet, which were not included in the S&P 500 index over the entire period. All data was obtained from Yahoo Finance. Figure 3.1 shows the close prices of the S&P 500 and the chosen nine stocks during this thirteen-year period.

Name                 Industry   Variable number
S&P 500              Index      1
Apple                Tech       2
Microsoft            Tech       3
Intel                Tech       4
JPMorgan             Finance    5
Bank of America      Finance    6
Wells Fargo          Finance    7
Johnson & Johnson    Health     8
United Health Group  Health     9
Pfizer               Health     10

Table 3.1: List of companies and their industry


3.2 Time series modelling

Following from Sklar’s theorem, copulas require the marginals to be i.i.d. standard uniform variables. Because the used dataset consists of financial time series, volatility clustering and possible serial correlation must first be eliminated by applying an ARMA-GARCH filter to the data. This section describes all the data transformations necessary to go from daily stock prices to (approximately) i.i.d. standard uniform variables which we can use as input for vine copulas.

First, daily stock prices P_t are transformed to daily logreturns as follows:

\[ r_t = \log\left(\frac{P_{t+1}}{P_t}\right) , \qquad t = 1, \ldots, 3271 . \tag{3.1} \]
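Equation (3.1) is a one-line transformation; a small sketch (my own function name, illustrative prices):

```python
import math

def logreturns(prices):
    """Daily logreturns r_t = log(P_{t+1} / P_t), as in equation (3.1)."""
    return [math.log(p_next / p) for p, p_next in zip(prices, prices[1:])]

prices = [100.0, 101.0, 99.0]
r = logreturns(prices)
```

A convenient property is that logreturns add up over time: the sum of the series equals log of the last price over the first.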

As financial returns are known to contain some serial correlation and a lot of volatility clustering, an ARMA(1,1)-GARCH(1,1) model is considered separately for every variable. This is a combination of an AutoRegressive Moving-Average (ARMA) model, which describes the mean process, and a Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) model, which describes the variance of the error term. When correctly specified, these two models together contain all information about the process and 'filter out' all serial correlation and volatility clustering. The models were first described by Box and Jenkins in 1970 (ARMA) and by Bollerslev (1986), who added the GARCH part. Definition 3.2.1 describes an ARMA(1,1)-GARCH(1,1) process for logreturns r_t.

Definition 3.2.1 (ARMA(1,1)-GARCH(1,1) model). The ARMA part, describing the mean of r_t, is given by

\[ r_t = \mu + \phi_1 r_{t-1} + \theta_1 \epsilon_{t-1} + \epsilon_t , \tag{3.2} \]

where φ_1 is the AR parameter, θ_1 the MA parameter and ε_t the error term. When correctly specified this filters out serial correlation in r_t. However, ε_t is not yet i.i.d. due to volatility clustering, which is why the following GARCH process is used for ε_t:

\[ \epsilon_t = \sigma_t Z_t , \qquad \sigma_t^2 = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2 , \tag{3.3} \]

where Z_t ∼ i.i.d.(0, 1) and is assumed to follow the (standardized) Student-t distribution.

When correctly specified, this process for ε_t completely describes the error terms of the logreturns. As all volatility clustering is then captured by σ_t, the standardized residuals \(\hat Z_t = \hat\epsilon_t / \hat\sigma_t\) are approximately i.i.d. and can be used for our vine copula models. To test whether there is still serial correlation left in the standardized residuals, the Ljung-Box test is used. The Ljung-Box test (Ljung and Box, 1978) tests whether there is serial correlation left in the first m lags. This test is performed by calculating the Q-statistic given by the following definition.
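To illustrate Definition 3.2.1, the recursions (3.2)-(3.3) can be simulated directly. The sketch below is my own (hypothetical function name) and uses Gaussian innovations instead of the standardized Student-t, to keep the example stdlib-only; it requires α_1 + β_1 < 1 so the unconditional variance exists:

```python
import math
import random

def simulate_arma_garch(n, mu, phi, theta, a0, a1, b1, seed=1):
    """Simulate an ARMA(1,1)-GARCH(1,1) path per equations (3.2)-(3.3),
    with Gaussian innovations Z_t (the thesis assumes standardized Student-t)."""
    rng = random.Random(seed)
    r = [mu]
    eps_prev = 0.0
    sigma2_prev = a0 / (1.0 - a1 - b1)   # start at the unconditional variance
    for _ in range(n - 1):
        sigma2 = a0 + a1 * eps_prev ** 2 + b1 * sigma2_prev   # GARCH recursion
        eps = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        r.append(mu + phi * r[-1] + theta * eps_prev + eps)   # ARMA recursion
        eps_prev, sigma2_prev = eps, sigma2
    return r

path = simulate_arma_garch(500, mu=0.0, phi=0.1, theta=0.05,
                           a0=1e-5, a1=0.05, b1=0.9)
```

Plotting such a path shows the volatility clustering that the GARCH filter is meant to remove from real return series.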

Definition 3.2.2 (Ljung-Box test statistic).

\[ Q(m) = N(N + 2) \sum_{h=1}^{m} \frac{\hat\rho_h^2}{N - h} , \tag{3.4} \]

where N is the number of observations and \(\hat\rho_h\) is the sample ACF (autocorrelation function) of the standardized residuals at lag h.

Under H_0 : ρ_1 = ρ_2 = . . . = ρ_m = 0, this Q-statistic follows a χ²_{m−g} distribution, where g equals the number of parameters of the used time series model. This test can be performed on multiple lags to test if there is serial correlation left in any of these lags.
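Equation (3.4) is straightforward to implement; the following sketch (my own function name) computes the Q-statistic from scratch and evaluates it on an artificially perfectly anti-correlated series:

```python
def ljung_box_q(z, m):
    """Ljung-Box Q(m) statistic of equation (3.4) for a residual series z."""
    n = len(z)
    mean = sum(z) / n
    dev = [x - mean for x in z]
    denom = sum(d * d for d in dev)
    q = 0.0
    for h in range(1, m + 1):
        # Sample autocorrelation at lag h.
        rho_h = sum(dev[t] * dev[t + h] for t in range(n - h)) / denom
        q += rho_h ** 2 / (n - h)
    return n * (n + 2) * q

# An alternating series has lag-1 autocorrelation close to -1,
# so Q(1) far exceeds the 5% chi-square(1) critical value of 3.84.
z = [1.0 if i % 2 == 0 else -1.0 for i in range(100)]
q1 = ljung_box_q(z, 1)
```

For genuinely white-noise residuals the statistic would instead typically stay below the critical value, and the null of no remaining serial correlation would not be rejected.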

Then, when no serial correlation is left in the standardized residuals, the ARCH-LM test is used to assess whether the process still contains ARCH effects (serial correlation in the squared residual series). Engle (1982) first derived this test, given in the following definition.

Definition 3.2.3 (ARCH-LM test). The ARCH-LM test uses the following auxiliary regression:

\[ \hat Z_t^2 = \gamma_0 + \gamma_1 \hat Z_{t-1}^2 + \ldots + \gamma_m \hat Z_{t-m}^2 + e_t , \tag{3.5} \]

where e_t is a white noise error process. It then tests the null hypothesis H_0 : γ_1 = γ_2 = . . . = γ_m = 0 using the test statistic LM = N · R², where N is the number of observations as before and R² is taken from the auxiliary regression. Under H_0, LM follows a χ²_m distribution.

When both these tests do not result in rejecting the null hypothesis, the time series model sufficiently filters out serial correlation and volatility clustering. Otherwise the model must be adjusted by either adding/subtracting lags or adding another filter. Section 4.1 discusses the results of all time series modelling on the used dataset and ultimately gives the final model used for every separate marginal variable.

After these time series properties are filtered out, the resulting standardized residuals $\hat{Z}_t$ are transformed by their empirical CDF to obtain standard uniformly distributed copula data (data which can be used as input for copulas). Note that this is the empirical CDF and not the true CDF, as the true CDFs of the marginal standardized residuals are unknown. The empirical CDF is calculated as

$$\hat{F}(z) = \frac{1}{N} \sum_{t=1}^{N} \mathbf{1}\{\hat{Z}_t \le z\}. \qquad (3.6)$$

Subsequently, Genest, Rémillard and Beaudoin (2009) show that, to avoid potential problems at the boundary of $[0, 1]^d$, the values of the empirical CDF need to be multiplied by a factor $N/(N + 1)$. This results in the following approximately standard uniform pseudo-observations which we can use as input for our copulas:

$$u_t = \frac{1}{N + 1} R_t = \frac{N}{N + 1} \hat{F}(\hat{Z}_t), \qquad t = 1, \dots, N, \qquad (3.7)$$

where $R_t$ represents the rank of $\hat{Z}_t$ within the sample.
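Equation (3.7) is a simple rank transform. A minimal sketch (the helper name is illustrative; ties are broken by position, which is harmless for continuous residuals):

```python
import numpy as np

def pseudo_observations(z):
    """Rank-based pseudo-observations u_t = R_t / (N + 1).

    Equivalent to (N / (N + 1)) * F_hat(z_t), which keeps the values
    strictly inside (0, 1) as recommended by Genest et al. (2009).
    """
    z = np.asarray(z, dtype=float)
    N = len(z)
    ranks = np.argsort(np.argsort(z)) + 1  # rank of each observation
    return ranks / (N + 1)
```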

Section 4.1 discusses the time series properties present in the dataset used and the transformations applied in order to obtain approximately i.i.d. standard uniform marginals. As described in this section, this is done in three steps:

1. Transform the observed price series into log returns $r_t$.
2. Fit an appropriate time series model on $r_t$ to obtain i.i.d. standardized residuals $\hat{Z}_t$.
3. Transform these $\hat{Z}_t$ to obtain i.i.d. standard uniform marginals $u_t = \frac{N}{N+1}\hat{F}(\hat{Z}_t)$.

After these time series properties are filtered out and the marginals are transformed to copula data following the steps described above, the marginals $u_t$ are referred to as $u_i$, $i = 1, \dots, 10$, to indicate which variable they belong to. The subscript $t$ is no longer relevant: the time series properties have been filtered out and the observations are (approximately) i.i.d. In the notation $u_{i,t}$, $u_{i,2}$ would be independent of $u_{i,1}$, so the particular value of $t$ carries no information and it suffices to write $u_i$.

3.3 Implementation of vine copulas

After eliminating all serial correlation and volatility clustering, and transforming the standardized residuals into pseudo-observations, these pseudo-observations $u_i$ can be used as input for our vine copulas. To ultimately find out how vine copulas can be optimally used as a dependence model, this section first explains how the different R-, C- and D-vine copulas can be selected and estimated. Then the different testing environments used to analyze the performance of the different vine classes under varying circumstances are explained.

3.3.1 Adjusting Dißmann’s algorithm to select C- and D-vines

Recall from Section 2.2.4 the three steps that must be taken to estimate a vine copula:

1. Selection of the R-vine structure (or possibly the C- or D-vine structure).
2. Copula family selection for each bivariate data pair following from step 1.
3. Estimation of the copula parameters.

In this analysis the widely used sequential method described by Dißmann et al. (2013) is used. This method, however, only selects an R-vine structure. As this thesis sets out to find an optimal way to use vine copulas, which includes analyzing C- and D-vines, Dißmann's algorithm must be adjusted to estimate these vine classes. Czado, Brechmann and Gruber (2013) briefly describe how to adjust the algorithm but do not show the adjusted algorithms explicitly. The adjusted versions of Dißmann's algorithm for selecting a C- or D-vine are explained in this section.

To help understand these adjustments, a simplified version of Dißmann’s algorithm is given in Algorithm 3. The adjustments made to be able to estimate C- and D-vines take place in lines 2 and 6, where the tree structures are selected. The main idea of choosing the structure of tree 1 by selecting the data pairs with the strongest dependencies stays the same, but an extra restriction is added to ensure the algorithm results in a C- or D-vine.
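For the unrestricted R-vine case, the tree-1 selection amounts to a maximum spanning tree search on the pairwise dependence weights. A minimal Prim-style sketch, assuming absolute empirical Kendall's taus as edge weights (the function name and the simple $O(d^3)$ loop are illustrative, not the thesis implementation):

```python
import numpy as np

def max_spanning_tree(tau):
    """Prim-style maximum spanning tree on |tau|: the unrestricted
    tree-1 choice for an R-vine.

    tau : (d, d) matrix of pairwise empirical Kendall's taus.
    Returns a list of edges (j, k).
    """
    w = np.abs(np.asarray(tau, dtype=float))
    d = w.shape[0]
    in_tree = {0}          # grow the tree from an arbitrary start node
    edges = []
    while len(in_tree) < d:
        best = None
        for j in in_tree:
            for k in range(d):
                if k not in in_tree and (
                        best is None or w[j, k] > w[best[0], best[1]]):
                    best = (j, k)
        edges.append(best)
        in_tree.add(best[1])
    return edges
```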

For C-vines, instead of using Prim’s algorithm (Algorithm 2) to select the (unrestricted) tree that maximizes the sum of all Kendall’s taus, the tree must have one root node of order


Algorithm 3 Simplified version of Dißmann's algorithm

1: Calculate $\hat\tau_{j,k}$ for all pairs $(x_j, x_k)$.
2: Select the tree that maximizes $\sum \hat\tau_{j,k}$.
3: For each edge in the tree, estimate copula parameters (MLE) and select a family (AIC). Then calculate transformed variables using the h-function.
4: for $i = 2, \dots, d - 1$ do
5:   Calculate new $\hat\tau_{j,k|v}$ for all new pairs that satisfy the proximity condition.
6:   Select the tree that maximizes $\sum \hat\tau_{j,k|v}$.
7:   Do line 3.
8: end for

$d - i$, where $d$ is the number of variables and $i$ the index of tree $T_i$ (see Definition 2.2.6). This translates to calculating, for every variable $j$, the sum of all its Kendall's taus and then selecting the tree that maximizes this sum. Lines 2 and 6 then become:

2: Select the tree with root node $j$ that maximizes the sum of absolute empirical Kendall's taus, so $\max_j \sum_k |\hat\tau_{j,k}|$.

6: Among every remaining variable $j$ not in $v$, select the tree with root node $j$ that maximizes the sum of absolute empirical Kendall's taus, so $\max_j \sum_k |\hat\tau_{j,k|v}|$.

Note that for the C-vine, the proximity condition is automatically satisfied in tree $T_i$ for every possible conditional variable pair, because every variable is connected to the root node in tree $T_{i-1}$. The full adjusted Dißmann's algorithm for C-vines is given in Appendix B.
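The C-vine root choice in lines 2 and 6 reduces to comparing row sums of the (absolute) Kendall's tau matrix. A minimal sketch (the function name and `candidates` interface are illustrative):

```python
import numpy as np

def cvine_root(tau, candidates=None):
    """Select the C-vine root: the variable maximizing the sum of
    absolute empirical Kendall's taus with all other variables.

    tau : (d, d) Kendall's tau matrix. `candidates` restricts the
    search to the remaining (not yet conditioned-on) variables, as
    required in the later trees.
    """
    tau = np.asarray(tau, dtype=float)
    d = tau.shape[0]
    if candidates is None:
        candidates = range(d)
    scores = {j: sum(abs(tau[j, k]) for k in range(d) if k != j)
              for j in candidates}
    return max(scores, key=scores.get)
```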

For D-vines, line 2 is altered to find the tree that maximizes the dependence in the first tree, with the extra restriction that every node has a degree of at most 2 (see Definition 2.2.6). This translates to finding a single path connecting all nodes which maximizes the sum of the Kendall's taus. Line 2 then becomes:

2: Select the single path connecting all nodes that maximizes the sum of absolute empirical Kendall's taus, so $\max \sum_{e=\{j,k\} \text{ in a single path}} |\hat\tau_{j,k}|$.
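This restricted line 2 can be sketched as an exhaustive search over all node orderings; the brute force is only feasible for small $d$, and for larger problems a TSP-style heuristic would typically be needed. The `dvine_first_tree` name and strategy are illustrative, not the thesis code.

```python
import numpy as np
from itertools import permutations

def dvine_first_tree(tau):
    """Brute-force search for the Hamiltonian path maximizing the sum
    of absolute Kendall's taus (the D-vine first-tree restriction).

    tau : (d, d) matrix of pairwise empirical Kendall's taus.
    Returns (best_path, best_weight).
    """
    tau = np.asarray(tau, dtype=float)
    d = tau.shape[0]
    best_path, best_weight = None, -np.inf
    for perm in permutations(range(d)):
        if perm[0] > perm[-1]:   # a path equals its reverse; skip duplicates
            continue
        w = sum(abs(tau[perm[i], perm[i + 1]]) for i in range(d - 1))
        if w > best_weight:
            best_path, best_weight = perm, w
    return list(best_path), best_weight
```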

This problem of finding the path that maximizes the sum of absolute Kendall's taus is similar to the Traveling Salesman Problem (TSP) from graph theory. The TSP arose centuries ago (like many graph-theoretic problems) when a salesman had to travel to multiple towns to conduct his business; to do so efficiently, he wanted to know the shortest route connecting all the towns.

To construct a D-vine, we need to find the path maximizing the sum of Kendall's taus instead of minimizing it as in the TSP. But because all $|\hat\tau_{j,k}|$ are between 0 and 1, the problem
