
Multivariate Extensions of Value-at-Risk

Master thesis Actuarial Science

Sophie van Dongen, student nr: 5823161

Summary

This thesis describes some of the most commonly used methods of extending univariate Value-at-Risk to a multivariate framework, using both numerical and mathematical methods. It looks at copulas, CoVaR, upper/lower orthant VaR, linear programming, and the Joint Density Tail. Mathematical background is combined with empirical examples.

A copula is a multivariate distribution function with uniform marginals that is used to bind univariate distribution functions together. CoVaR measures the contribution of the failure of one institution to the economy as a whole. Upper and lower orthant VaR measures the probability that several variables have all (or none) passed a threshold. Linear programming techniques can be used to solve the systems of equations that arise when defining a multivariate VaR. The Joint Density Tail is used to translate multivariate VaR back to a univariate measure along a vector.


Introduction

The Basel accords have brought Value-at-Risk (VaR) back to the forefront of attention. Value-at-Risk measures the maximum incurred loss at probability α; typical values for α are 1% or 0.1%. A benefit of Value-at-Risk is that it can be reliably calculated for almost any probability distribution, no matter how skewed or otherwise strangely behaved. This makes it more versatile than many other commonly used risk measures, such as the standard deviation, as those typically assume a certain underlying distribution, such as the normal distribution.

Another benefit of Value-at-Risk is that it looks at the tail instead of the entire distribution. On the one hand this means losing a lot of data if one only tries to fit the tail instead of the whole distribution. On the other hand this focus on the tail can prevent a false sense of security: when the market rarely has any big shocks that would fall into the tails of a distribution, it becomes easy to underestimate those outliers. Risk can thus 'build up' in the background for a long time.

The popularity of Value-at-Risk has recently led to a surge of attempts to translate this univariate risk measure into a multivariate one, which has met with all the usual problems of scaling a univariate variable up to a multivariate one. This thesis attempts to give an overview of the most commonly used methods to achieve this, consisting of mathematical background mixed with empirical results.

Chapter one discusses the univariate VaR. Chapter two looks at risk and properties of risk measures. Chapter three looks at covariance and quantiles. Chapter four introduces copulas, a commonly used method of binding two univariate risk distributions together. Chapter five continues with the use of copulas to find bounds for the VaR. Chapter six gives empirical evidence for the importance of the tail distribution in the multivariate framework. Chapter seven introduces the CoVaR, the effect one institution has on the VaR of an entire economy. Chapter eight looks at linear programming methods. Chapter nine discusses upper and lower orthants, based on the probability that several variables have all (or none) passed a threshold. Chapter ten introduces the concept of the Joint Density Tail. Chapter eleven gives empirical evidence that multivariate models are preferable when attempting to forecast in a complex system. Chapter twelve gives a visual illustration of the concept of Multivariate Value-at-Risk.

1 VaR

This chapter gives a short argument for the use of VaR as a risk-measure. It gives a definition of VaR and discusses some VaR-based risk measures.

1.1 Why use VaR?

An important part of insurance is to quantify risk: in particular the risk of earning less than amount X on a certain investment, and the risk of losing more than amount Y when a customer claims damages to his insured property. More information makes it possible to fit the risk function better: to determine the relevant distribution (such as the normal distribution, Poisson distribution, etc.) and find the mean and variance. Knowing the full distribution is however not always necessary if you are only interested in tail risks. And assuming an underlying distribution might even steer you in the wrong direction, if you think you are e.g. dealing with a normal distribution but find your tails to be much thicker than those of a normal distribution. Nowadays the consensus holds that risks indeed do not follow a normal distribution (Polanski et al, 2013; Patton, 2004).

Such a misspecified model might still perform quite well most of the time, while the market keeps quiet, but would fail in times of market upheaval, when you are most likely to encounter extreme values, perhaps even several times in a row. Therefore if you are only interested in tail risk it is better to focus on the tails of your distribution instead of the middle part. VaR needs neither mean nor variance; it only measures the part of your distribution function that crosses the risk threshold α.

However, a potential problem with a VaR-based model is that it is based on only a few extreme data points out of the entire set. Thus it might be wise not to completely discount the rest of the distribution when trying to find the tail distribution. An alternative way of adding more data to the calculation (and thus hopefully some predictive value) is by looking at other components of the assets besides historical growth. Components such as size, leverage and interconnectivity (Adrian, 2011) can all be used in the analysis.

1.2 Defining VaR

Given a confidence level 1 − α, the Value-at-Risk of a portfolio with loss L following distribution function F is the threshold such that the probability of the loss exceeding it is α, or

F(VaR_α) = 1 − α ⇔ VaR_α = F^{-1}(1 − α)

Figure 1: VaR
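To make the definition concrete, here is a minimal empirical sketch in Python (the simulated Student-t losses and the chosen levels are illustrative assumptions, not data from this thesis): the empirical VaR at level α is simply the (1 − α)-quantile of the observed losses.

```python
import numpy as np

def var(losses, alpha=0.05):
    """Empirical VaR at level alpha: the (1 - alpha)-quantile of the
    losses, i.e. the threshold exceeded with probability alpha."""
    return np.quantile(losses, 1 - alpha)

rng = np.random.default_rng(1)
losses = rng.standard_t(df=4, size=100_000)   # heavy-tailed loss sample
print(var(losses, 0.05), var(losses, 0.01))   # VaR_1% sits further out
```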

1.3 cVaR and TVaR

One of the drawbacks of VaR is that it does not give a measure of how much loss is made, only that the reserve was insufficient. A lot of theoretical models use commonly-used distributions such as the normal distribution to quantify the amount of money lost, but an unscrupulous financial manager might feel tempted to use a distribution with very thick tails, such that instead of 'a small chance of a big loss' there is 'a small chance of a very, very big loss' hidden in the 5% (or 1% or 0.1%) tail of the distribution. This would not be immediately apparent to regulators or analysts focusing on the 95% (or 99% or 99.9%) quantile representing the typical behavior of the portfolio.

To overcome this shortcoming, variants of the VaR can be used instead. There is the cVaR (conditional Value at Risk), also known as ES (Expected Shortfall), which measures the average shortfall conditional on the fact that a shortfall does occur. Alternatively there is the TVaR (Tail Value at Risk), which measures the average expected shortfall. But as both of those measures are directly derived from VaR, hereafter the focus will remain mostly on VaR itself.
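The 'hidden tail' problem above can be illustrated numerically. The following sketch (illustrative Python; the normal and Student-t samples are made up, and the t sample is rescaled so both share roughly the same 5% VaR) shows that equal VaRs can hide very different expected shortfalls:

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = 0.05

def var_es(losses, alpha):
    v = np.quantile(losses, 1 - alpha)       # VaR: the tail threshold
    es = losses[losses >= v].mean()          # ES/cVaR: mean loss beyond it
    return v, es

normal = rng.normal(size=500_000)
heavy = rng.standard_t(df=3, size=500_000)
# rescale the heavy-tailed sample so both have (almost) the same VaR
heavy *= var_es(normal, alpha)[0] / var_es(heavy, alpha)[0]

print(var_es(normal, alpha))                 # roughly the same VaR ...
print(var_es(heavy, alpha))                  # ... but a clearly larger ES
```

A regulator checking only the VaR would see no difference between these two portfolios.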

2 Risk and risk measures

This chapter discusses a few desirable qualities that one would want a risk measure to have. It then gives short descriptions of the kinds of multivariate risk one might encounter within a company and within an entire economy.

2.1 Coherence

If univariate coherent risk measures are scaled up to multivariate measures, one would aim for the resulting risk measure to still be a coherent risk measure. This means the mapping ρ : X → R should retain the following properties (Artzner et al, 1999):

- Monotone: X1 ≤ X2 ⇒ ρ(X1) ≤ ρ(X2)
- Subadditive: ρ(X1 + X2) ≤ ρ(X1) + ρ(X2)
- Positive homogeneous: ρ(λX) = λρ(X) for all λ ≥ 0
- Translation invariant: ρ(X + λ) = ρ(X) + λ

2.2 Some properties of stochastic orders

For a univariate risk X with continuous, strictly monotonic distribution function F_X, the quantile Q_X(α) is defined as

Q_X(α) = F_X^{-1}(α), for all α ∈ (0, 1)

Some notions of univariate order are (with X and Y random variables):

Stochastic dominance order: X is smaller than Y in stochastic dominance (X ⪯_st Y) if Q_X(α) ≤ Q_Y(α) for all α ∈ (0, 1).

Stop-loss order (Müller, 1997): X is smaller than Y in stop-loss order (X ⪯_SL Y) if for all t ∈ R it holds that E[(X − t)_+] ≤ E[(Y − t)_+], with x_+ := max{x, 0}. Note that stochastic dominance order implies stop-loss order.

Increasing convex order (equivalent to stop-loss order; Müller et al, 2002): X is smaller than Y in increasing convex order (X ⪯_icx Y) if for every non-decreasing convex function f it holds that E[f(X)] ≤ E[f(Y)].

For multivariate order the principle of supermodular order becomes relevant.

Supermodular function: a function f : R^k → R is supermodular if for any x, y ∈ R^k it satisfies f(x) + f(y) ≤ f(x ∧ y) + f(x ∨ y).

Supermodular order: let X and Y be two k-dimensional random vectors. Then X is smaller than Y in supermodular order (X ⪯_sm Y) if for all supermodular functions f it holds that E[f(X)] ≤ E[f(Y)], given that the expectations exist.


2.3 Risks within a company

Whereas most of this thesis deals with the combination of several market assets, such as stocks, the concept of MVaR may also be important within a company. Different units of the company might each warrant an independent risk analysis, which can then be combined to form the risk for the entire company. Conversely, a company as a whole deals with several different kinds of risks which can each be examined individually.

The three most important risks for a company are: market, credit and operational risk (Polanski et al, 2013). Those risks can be dissected into more specific risks, almost none of which follows a normal distribution. The below graph gives an overview of the many risks a company might face and shows the resulting skewed distribution of the combined risks (Kuritzkes et al, 2003).

Figure 2: Overview of the risks that can be measured within a company (Kuritzkes et al,2003)

2.4 Risks within an economy

Quantifying systemic risk tends to run into two problems (Adrian, 2011). On the one hand there is the problem of spill-over effects when trying to add together the risks of individual companies: those spill-over effects can either reinforce each other, so that the total shock is greater than the sum of its parts, or partly cancel each other out. On the other hand systemic risk tends to slowly build up in the background during low-risk times.

Risk tends to be procyclical. For example, an economic downturn might force many traders to sell their assets at roughly the same time, which pushes prices down even further. Regulation tends to follow demands from society and markets, which can lead to regulations being too loose in times of stability and too tight after a downward shock. Through this mechanism regulators can inadvertently contribute further to the procyclicality.


Figure 3: Bank failures demonstrate procyclical risk (Hakwa et al, 2012)

3 Simple multivariate tools

This chapter discusses two relatively simple ways of combining univariate risk measures into a bivariate model.

3.1 Covariance

The simplest way to combine two possibly dependent variables X1 and X2 into a third one Y that is correlated with both is with a formula such as

Y_t = β1 X1,t + β2 X2,t + γ X1,t X2,t

in which the β's measure the effect of each variable X on variable Y and γ measures their combined effect, in a dataset running from t = 0 to t = N. In this model the level of connectedness between X1 and X2 can be quantified by the covariance Cov(X1, X2). This covariance assumes the connectedness remains constant throughout the model.

While this makes for a relatively simple model, history has shown such models to be insufficient when dealing with hedging and portfolio decisions (Embrechts et al, 2002).

Figure 4: Comonotonic dependence (left) versus an alternative dependence structure (right), illustrating different ways univariate variables may be combined, and thus the importance of a multivariate risk measure (Embrechts et al, 2005)


3.2 Quantiles

The use of covariance proves to be unsatisfactory in practice, as the connectedness changes depending on market conditions, more specifically whether one is in a bull market (rising prices) or bear market (dropping prices). Quantiles can be used to indicate that either 'all stocks are in bad shape' or 'all stocks are in good shape'.

Mathematically speaking, the use of quantiles here refers to the four quadrants of a two-dimensional (X, Y) grid. When plotting a point cloud, more points will lie in the two quadrants where X and Y are either both positive or both negative than in the quadrants where one is positive and the other negative.

The following figure shows the difference in correlation for either a low (< 50%) or high (> 50%) cut-off quantile. The portfolios used in the analysis are a small-cap fund and a large-cap fund. As can be seen, the correlation is much higher in times of market downturn (when one is concerned with the lower quantiles) than in times of market upturn (when the high quantiles are more relevant).

Figure 5: correlation (vertical axis) and cut-off quantile (horizontal axis) (Patton,2004)
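The quantity plotted in the figure can be sketched as an exceedance correlation. The code below is a minimal illustration with simulated bivariate normal data (the function name and the correlation of 0.5 are assumptions for the example); for a normal pair the exceedance correlation falls off symmetrically in both tails, so the asymmetry Patton (2004) finds in real data is itself evidence against joint normality.

```python
import numpy as np

def exceedance_corr(x, y, q):
    """Correlation of (x, y) restricted to the joint tail at quantile q:
    lower-tail joint exceedances for q < 0.5, upper-tail for q > 0.5."""
    if q < 0.5:
        m = (x <= np.quantile(x, q)) & (y <= np.quantile(y, q))
    else:
        m = (x >= np.quantile(x, q)) & (y >= np.quantile(y, q))
    return np.corrcoef(x[m], y[m])[0, 1]

rng = np.random.default_rng(3)
z = rng.normal(size=(100_000, 2))
x = z[:, 0]
y = 0.5 * z[:, 0] + np.sqrt(1 - 0.25) * z[:, 1]   # correlation 0.5
print([round(exceedance_corr(x, y, q), 2) for q in (0.1, 0.25, 0.75, 0.9)])
```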

4 Copula

A copula is used to tie two (or more) univariate distribution functions together. The copula itself has uniform marginals. This chapter will discuss copulas and give formulas and graphs illustrating the most widely used copulas.

4.1 Properties of copulas

A copula is a multivariate distribution function with standard uniform marginals. A 2-dimensional copula is a distribution function C : [0, 1]2 → [0, 1] that satisfies (Hakwa et al, 2012):

- Boundary conditions:
(a) for every u ∈ [0, 1]: C(0, u) = C(u, 0) = 0
(b) for every u ∈ [0, 1]: C(1, u) = u and C(u, 1) = u
- Monotonicity (2-increasing) condition:
(c) for every (u1, u2), (v1, v2) ∈ [0, 1] × [0, 1] with u1 ≤ u2 and v1 ≤ v2 it holds that C(u2, v2) − C(u2, v1) − C(u1, v2) + C(u1, v1) ≥ 0


Take H(x, y) to be a bivariate distribution function with marginal distributions F(x) and G(y); then there exists a copula C : [0, 1]² → [0, 1] such that (Sklar, 1959):

H(x, y) = C(F(x), G(y)) for all (x, y) ∈ R².

If F and G are continuous the copula is unique; otherwise the copula is uniquely determined only on the range of F and G.

A copula is bounded by the Fréchet-Hoeffding lower (W) and upper (M) bounds, such that

W(u, v) ≤ C(u, v) ≤ M(u, v), for (u, v) ∈ [0, 1]²

with W(u, v) = max{u + v − 1, 0} and M(u, v) = min{u, v} (Fan et al, 2014). Then for any (x, y) ∈ R² and any bivariate cdf H(x, y) with marginal cdfs F(x) and G(y) the Fréchet-Hoeffding inequality holds:

W(F(x), G(y)) ≤ H(x, y) ≤ M(F(x), G(y))
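As a quick numerical sanity check of the Fréchet-Hoeffding inequality, the following sketch evaluates a Gaussian copula (an assumed example with ρ = 0.5; scipy is assumed available) at a few points and verifies W ≤ C ≤ M:

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def gauss_copula(u, v, rho):
    # C(u, v) = Phi_rho(Phi^{-1}(u), Phi^{-1}(v))
    return multivariate_normal(cov=[[1, rho], [rho, 1]]).cdf(
        [norm.ppf(u), norm.ppf(v)])

for u, v in [(0.3, 0.7), (0.9, 0.2), (0.5, 0.5)]:
    W = max(u + v - 1, 0)       # Fréchet-Hoeffding lower bound
    M = min(u, v)               # Fréchet-Hoeffding upper bound
    C = gauss_copula(u, v, 0.5)
    assert W <= C <= M
    print(f"{W:.3f} <= {C:.3f} <= {M:.3f}")
```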

4.2 Copula formulas and graphs

This section gives formulas (Joe, 1997; Nelsen, 1999) and graphs of the most widely used copulas. Capital C denotes cdfs, lower-case c denotes pdfs, (u, v) denotes the two marginal distributions being bound, and additional parameters indicate the level of interaction between u and v.

Normal copula:
C_N(u, v, ρ) := Φ_ρ(Φ^{-1}(u), Φ^{-1}(v)), where Φ denotes the cdf of the standard normal distribution and Φ_ρ the bivariate standard normal cdf with correlation ρ
c_N(u, v, ρ) := (1 − ρ²)^{-1/2} exp{ −[Φ^{-1}(u)² + Φ^{-1}(v)² − 2ρΦ^{-1}(u)Φ^{-1}(v)] / (2(1 − ρ²)) + [Φ^{-1}(u)² + Φ^{-1}(v)²]/2 }
ρ ∈ (−1, 1)

Clayton copula:
C_C(u, v, θ) := (u^{−θ} + v^{−θ} − 1)^{−1/θ}
c_C(u, v, θ) := (1 + θ)(uv)^{−θ−1}(u^{−θ} + v^{−θ} − 1)^{−2−1/θ}
θ ∈ [−1, ∞)\{0}

Rotated Clayton copula:
C_RC(u, v, θ) := u + v − 1 + C_C(1 − u, 1 − v, θ)
c_RC(u, v, θ) := c_C(1 − u, 1 − v, θ)
θ ∈ [−1, ∞)\{0}

Student's t copula (T_ν is the Student's t cdf, t_ν the Student's t pdf, T_{ν,ρ} the bivariate Student's t cdf):
C_T(u, v, ρ, ν) := T_{ν,ρ}(T_ν^{-1}(u), T_ν^{-1}(v))
c_T(u, v, ρ, ν) := [Γ((ν+2)/2) / (νπ Γ(ν/2) √(1−ρ²))] · [t_ν(T_ν^{-1}(u)) t_ν(T_ν^{-1}(v))]^{-1} · [1 + (T_ν^{-1}(u)² + T_ν^{-1}(v)² − 2ρ T_ν^{-1}(u) T_ν^{-1}(v)) / (ν(1−ρ²))]^{−(ν+2)/2}
ρ ∈ (−1, 1), ν > 2

Joe-Clayton copula:
C_JC(u, v | τ_U, τ_L) := 1 − (1 − {[1 − (1 − u)^κ]^{−γ} + [1 − (1 − v)^κ]^{−γ} − 1}^{−1/γ})^{1/κ}
c_JC(u, v | τ_U, τ_L): too long to write out
κ = [log₂(2 − τ_U)]^{−1}, γ = [−log₂(τ_L)]^{−1}
τ_U ∈ (0, 1), τ_L ∈ (0, 1)

Plackett copula:
C_P(u, v, π) := [1/(2(π − 1))] · [1 + (π − 1)(u + v) − √((1 + (π − 1)(u + v))² − 4π(π − 1)uv)]
c_P(u, v, π) := π(1 + (π − 1)(u + v − 2uv)) / ((1 + (π − 1)(u + v))² − 4π(π − 1)uv)^{3/2}
π ∈ [0, ∞)\{1}

Frank copula:
C_F(u, v, λ) := −(1/λ) log[((1 − e^{−λ}) − (1 − e^{−λu})(1 − e^{−λv})) / (1 − e^{−λ})]
c_F(u, v, λ) := λ(1 − e^{−λ}) e^{−λ(u+v)} / ((1 − e^{−λ}) − (1 − e^{−λu})(1 − e^{−λv}))²
λ ∈ (−∞, ∞)\{0}


Gumbel copula:
C_G(u, v, δ) := exp{−[(−log u)^δ + (−log v)^δ]^{1/δ}}
c_G(u, v, δ) := C_G(u, v, δ) · (log u · log v)^{δ−1} / [uv((−log u)^δ + (−log v)^δ)^{2−1/δ}] · {[(−log u)^δ + (−log v)^δ]^{1/δ} + δ − 1}
δ ∈ [1, ∞)

Rotated Gumbel copula:
C_RG(u, v, δ) := u + v − 1 + C_G(1 − u, 1 − v, δ)
c_RG(u, v, δ) := c_G(1 − u, 1 − v, δ)
δ ∈ [1, ∞)

Figure 6: different copulas, all with standard normal marginal distributions and linear correlation coefficient 0.5 (Patton,2004)
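Several of these copulas are easy to simulate. A standard way to draw from the Clayton copula, sketched below in Python for θ > 0, is the Marshall-Olkin (frailty) construction; the sampler and parameter values are illustrative, not from the thesis:

```python
import numpy as np

def rclayton(n, theta, rng):
    """Sample n pairs from the Clayton copula (theta > 0) with the
    Marshall-Olkin method: U_i = (1 + E_i / V)^(-1/theta), where
    V ~ Gamma(1/theta) and E_i ~ Exp(1)."""
    v = rng.gamma(1.0 / theta, size=n)
    e = rng.exponential(size=(n, 2))
    return (1.0 + e / v[:, None]) ** (-1.0 / theta)

rng = np.random.default_rng(4)
u = rclayton(100_000, theta=2.0, rng=rng)
# Clayton has lower-tail dependence: joint small values cluster together
print(np.mean((u[:, 0] < 0.05) & (u[:, 1] < 0.05)))  # much larger than 0.05**2
```

Feeding the sampled (u, v) pairs through marginal quantile functions produces scatter plots like those in Figure 6.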


5 VaR in a copula framework

This chapter looks at some risk measures that use the copula idea to find bounds on the multivariate VaR.

5.1 Lowest possible bound on VaR

In insurance an important aspect of dealing with risk is to find the 'worst case scenario' when working with a combination of several risks: finding an upper bound based on the most disadvantageous combination of two or more risks. Embrechts et al (2005) have tried to tackle this problem for the VaR with n = 2. They focus on the no-information case, meaning there is no certainty about the shape of the dependence structure.

An n-dimensional copula is an n-dimensional distribution function restricted to [0, 1]^n and with standard uniform marginals (Embrechts et al, 2005). Any copula C lies between the lower and upper Fréchet bounds, W ≤ C ≤ M, with W(u_1, ..., u_n) := (Σ_{i=1}^n u_i − n + 1)_+ (counter-monotonic), M(u_1, ..., u_n) := min_{1≤i≤n} u_i (comonotonic), and Π(u_1, ..., u_n) := Π_{i=1}^n u_i (the independence copula).

Assume there exists a copula C_L such that C ≥ C_L, and a function ψ : R^n → R that is non-decreasing. Take the operator τ_{C_L,ψ} to be the left-continuous version of a distribution function, such that for a random variable K: P[K < s] = τ_{C_L,ψ}(F_1, ..., F_n)(s). Let X^C = (X_1, ..., X_n) be a random vector in R^n with marginal distribution functions F_1, ..., F_n and copula C.

When n = 2 the point-wise best possible bound, which cannot be tightened, is

P[ψ(X_1, ..., X_n) < s] ≥ τ_{C_L,ψ}(F_1, ..., F_n)(s)

or in a VaR context

VaR_α(ψ(X_1, ..., X_n)) ≤ τ_{C_L,ψ}(F_1, ..., F_n)^{-1}(α) for every α ∈ [0, 1] (Embrechts et al, 2005).

When n > 2, W is used as the lower bound (instead of C_L); that bound is no longer sharp because W is not a copula for n > 2 (as for more than 2 random variables it is not possible for each of them to be a non-increasing function of each of the remaining ones (Embrechts et al, 2005)).

Let X^C = (X_1, X_2) be a random vector in R² with marginal distribution functions F_1, F_2 and copula C, and take α = τ_{W,+}(F_1, F_2)(s). Define C_α : [0, 1]² → [0, 1] as

C_α(u_1, u_2) = max{α, W(u_1, u_2)} if (u_1, u_2) ∈ [α, 1]², and min{u_1, u_2} otherwise.

This copula C_α attains the lowest bound mentioned above, which for this specific copula translates to the bound τ_{C_α,+}(F_1, F_2)(s) = α.

As illustrated in the figure below, the assumption of comonotonicity can lead to underestimation of the VaR. Comonotonicity seems like the worst dependence scenario, since under comonotonicity every random variable is a non-decreasing function of the others, such that high values for the one also mean high values for the other. But for every threshold α < 1 there exists a copula C_α yielding a VaR which is higher than that under comonotonicity (Embrechts et al, 2005). Although perhaps not all assumptions are realistic, the above does provide an argument against too easily assuming C ≥ Π. This assumption (C ≥ Π) corresponds to positive lower orthant dependent risks (Nelsen, 1999).


Figure 7: P[X_1 + X_2 < s] for a N(0,1) portfolio, using copula C_0.9857 (Embrechts et al, 2005)

5.2 Test for independence using p.i.t.

Clements et al (2002) have looked at the problem of evaluating interval forecasts, focusing on the 2-dimensional case. An example they give is: 'what are the chances that inflation will remain below 2.2% next year and growth will exceed 2%?'. VaR itself is a point forecast, while cVaR and TVaR are interval forecasts.

Assume one-step-ahead predictive densities p_t(y_t) for t = 1, ..., n. Now take the probability integral transform (p.i.t.) z_t = ∫_{−∞}^{y_t} p_t(u) du for t = 1, ..., n. When the predicted density corresponds to the true predictive density, the sequence {z_t}_{t=1}^n is i.i.d. U[0, 1]. In a time series p_t(y_t) might change over time, so it is important to use the correct density p_t(y_t) for each value of t (Clements et al, 2002).

The test consists of comparing the distribution of {z_t} with the theoretical distribution, which forms a 45-degree line. The Kolmogorov-Smirnov (KS) test tests for uniformity under the assumption of independence. The KS statistic is the maximum difference between the empirical distribution of z_t and its theoretical value.

Clements et al (2002) assume a joint density for {y_{1t}, y_{2t}} and use 3 test statistics (Z^c_{2|1,t} represents y_{2t} conditional on y_{1t} and Z^m_{1,t} the marginal for y_{1t}):

- Stacked: S = [Z^c_{2|1,1}, ..., Z^c_{2|1,n}, Z^m_{1,1}, ..., Z^m_{1,n}], a 2n × 1 stacked vector (Diebold et al, 1998)
- J1, using an n-dimensional vector with elements Z^j_t = Z^c_{2|1,t} × Z^m_{1,t} (Clements et al, 2000)
- J2, using an n-dimensional vector with elements Z^c_{2|1,t} / Z^m_{1,t}

A Monte Carlo analysis confirms that the S-test benefits from having twice as many variables (in the bivariate case), but lacks power in the case of misspecification of the correlation between the variables. This is because such a misspecification would affect Z^c_{2|1,t_0} and Z^m_{1,t_0} jointly, and J1 and J2 combine those two variables together and thus preserve their temporal grouping (Clements et al, 2002).


Figure 8: Illustration of the KS test using a non-linear self-exciting threshold autoregressive (SETAR) model (Clements et al, 2002)

5.3 Rearrangement algorithm

In the homogeneous case (where the risk factors L_i are identically distributed) an analytical formula typically suffices to compute the worst possible VaR for portfolios of any size, assuming the marginal distributions F_i are continuous. However, in the case of inhomogeneous portfolios with arbitrary marginals those formulas soon become next to impossible to calculate, so one turns to a numerical algorithm to find the VaR (Embrechts et al, 2013).

Take \overline{VaR}_α(L⁺) to be the 'worst case' coupling and \underline{VaR}_α(L⁺) to be the 'best case' coupling. The overline and underline refer to approaching the VaR from either above or below (as depicted in chapter 1).

If we assume the marginal risks L_i to be identically distributed with cdf F (meaning F_1 = F_2 = F) and d = 2, then we can analytically find that \underline{VaR}_α(L⁺) = F^{-1}(α) and \overline{VaR}_α(L⁺) = 2F^{-1}((1 + α)/2) for all α ∈ [F(x_F), 1).

Now we drop the assumption of identically distributed risks. In that case, for d = 2, the sharp bounds \overline{VaR}_α(L⁺) and \underline{VaR}_α(L⁺) can still be calculated without too many problems. The homogeneous case also causes no insurmountable problems, not even for d ≥ 3. But taking d ≥ 3 while assuming inhomogeneous risks poses a challenge. This problem can be made much easier if it is possible to divide the risks in L⁺ into n homogeneous subgroups. For the computation of those bounds the rearrangement algorithm is used.

Define vectors a, b ∈ R^N as oppositely ordered when (a_j − a_k)(b_j − b_k) ≤ 0 holds for all 1 ≤ j, k ≤ N. For an (N × d) matrix X the operators s(X) and t(X) are defined as, respectively, s(X) = min_{1≤i≤N} Σ_{1≤j≤d} x_{ij} and t(X) = max_{1≤i≤N} Σ_{1≤j≤d} x_{ij}.

Rearrangement algorithm for \overline{VaR}_α(L⁺) (Embrechts et al, 2013):

1. Fix an integer N and the desired level of accuracy ε > 0.
2. Define matrices \underline{X}^α = (\underline{x}^α_{ij}) and \overline{X}^α = (\overline{x}^α_{ij}) as \underline{x}^α_{ij} = F_j^{-1}(α + (1 − α)(i − 1)/N) and \overline{x}^α_{ij} = F_j^{-1}(α + (1 − α)i/N) for 1 ≤ i ≤ N and 1 ≤ j ≤ d.
3. Randomly permute the elements in each column of \underline{X}^α and \overline{X}^α.
4. Iteratively rearrange the j-th column of the matrix \underline{X}^α so that it becomes oppositely ordered to the sum of the other columns, for 1 ≤ j ≤ d. This way a matrix Y^α is found.
5. Repeat step 4 until s(Y^α) − s(\underline{X}^α) < ε. A matrix \underline{X}* is found.
6. Apply steps 4-5 to \overline{X}^α to find \overline{X}*.
7. Define \underline{s}_N = s(\underline{X}*) and \overline{s}_N = s(\overline{X}*).
8. Then we find that \underline{s}_N ≤ \overline{s}_N and, in practice, \underline{s}_N ≈ \overline{s}_N ≈ \overline{VaR}_α(L⁺) for N → ∞.

Rearrangement algorithm for \underline{VaR}_α(L⁺):

1. Fix an integer N and the desired level of accuracy ε > 0.
2. Define matrices \underline{Z}^α = (\underline{z}^α_{ij}) and \overline{Z}^α = (\overline{z}^α_{ij}) as \underline{z}^α_{ij} = F_j^{-1}(α(i − 1)/N) and \overline{z}^α_{ij} = F_j^{-1}(αi/N) for 1 ≤ i ≤ N and 1 ≤ j ≤ d.
3. Randomly permute the elements in each column of \underline{Z}^α and \overline{Z}^α.
4. Iteratively rearrange the j-th column of the matrix \overline{Z}^α so that it becomes oppositely ordered to the sum of the other columns, for 1 ≤ j ≤ d. This way a matrix W^α is found.
5. Repeat step 4 until t(\overline{Z}^α) − t(W^α) < ε. A matrix \overline{Z}* is found.
6. Apply steps 4-5 to \underline{Z}^α to find \underline{Z}*.
7. Define \underline{t}_N = t(\underline{Z}*) and \overline{t}_N = t(\overline{Z}*).
8. Then we find that \underline{t}_N ≤ \overline{t}_N and, in practice, \underline{t}_N ≈ \overline{t}_N ≈ \underline{VaR}_α(L⁺) for N → ∞.
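A compact sketch of the rearrangement algorithm for the worst-case VaR is given below (Python; the Pareto marginals, grid size and convergence rule are illustrative simplifications of the scheme above, e.g. a single midpoint discretization matrix is used rather than the bracketing pair):

```python
import numpy as np

def worst_var_ra(qfs, alpha, N=2000, eps=1e-6, seed=0):
    """Sketch of the rearrangement algorithm for the worst-case VaR of
    L1 + ... + Ld: discretise the tail of each marginal on a midpoint
    grid, then oppositely order each column against the sum of the
    other columns until the minimal row sum s(X) stops improving."""
    rng = np.random.default_rng(seed)
    p = alpha + (1 - alpha) * (np.arange(N) + 0.5) / N   # midpoint grid
    X = np.column_stack([q(p) for q in qfs])
    for j in range(X.shape[1]):
        rng.shuffle(X[:, j])                             # random start
    s_old = -np.inf
    while True:
        for j in range(X.shape[1]):
            rest = X.sum(axis=1) - X[:, j]
            X[np.argsort(-rest), j] = np.sort(X[:, j])   # oppositely ordered
        s_new = X.sum(axis=1).min()                      # s(X), minimal row sum
        if s_new - s_old < eps:
            return s_new
        s_old = s_new

# three identical Pareto(2) risks (illustrative marginals, not from the thesis)
q = lambda p: (1.0 - p) ** -0.5 - 1.0
print(worst_var_ra([q, q, q], alpha=0.99))
```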

6 Relation between tail shape and multivariate dependency

This chapter discusses an experiment which indicates that the tail of a distribution (thick or thin) might be correlated with the severity of spillover effects, thus illustrating the importance of correctly identifying the tail distribution.

6.1 Theoretical background

Mandelbrot (1963a) remarked on the Pareto nature of returns and the resulting implication of market incompleteness. Because of this Pareto nature, tail probabilities are self-scaling (Feller, 1971). Assuming the self-scaling applies to the entire distribution, it can be concluded that the distribution of asset returns is an infinite-variance sum-stable distribution (Mandelbrot, 1963a). However, Hartmann et al (2010) note we do not need the infinite-variance assumption for the univariate case, as it suffices to assume the scaling holds for just the tail area. This result still calls for the multivariate case to be analyzed with regard to this Pareto nature.

While fat tails and tail dependence are well known to exist in the univariate case, there has not been much study of the bivariate and multivariate cases (Hartmann et al, 2010).

Hartmann et al (2010) indicate there are two basic conditions underlying systemic, widespread crises in currency markets. Firstly, the univariate distributions underlying the exchange rates should have heavy tails. Those heavy tails more or less indicate that the probability of univariate currency collapses is Pareto distributed; the probability of a currency crisis is then much higher than if the underlying univariate distribution followed a normal distribution. Secondly, the nominal exchange rates are linear expressions of the domestic and base currency fundamentals.

The two previous conditions taken together are linked to frequent and large currency crises. This result hints at a relation between the degree of dependency during crises (known as asymptotic dependence) and the thickness of the tails of the univariate fundamentals. It should be noted this result only holds for the tails; in general there is no connection between the univariate distributions and their multivariate dependency.

6.2 Fat tail experiment

Take κ to be the number of simultaneously crashing currencies, that is, all currencies whose return exceeds threshold s. We can then define the expected number of crashes, conditional on the fact that there is at least one crash already taking place (Huang, 1992; Hartmann et al, 2004), thus obtaining a measure for contagion:

E[κ | κ ≥ 1] = (1 · (P[X > s, Y ≤ s] + P[X ≤ s, Y > s]) + 2 · P[X > s, Y > s]) / (1 − P[X ≤ s, Y ≤ s]) = (P[X > s] + P[Y > s]) / (1 − P[X ≤ s, Y ≤ s])

If X and Y are independent and identically distributed, and we define p = P[X > s], then we can rewrite the above formula as E[κ | κ ≥ 1] = 2p / (1 − (1 − p)²) = 2 / (2 − p). A special result in this case is

s → ∞ ⇒ p → 0 ⇒ E[κ | κ ≥ 1] → 1.

Alternatively, define Y = a + bX (b ≠ 0), indicating total dependence between X and Y. Then we find that E[κ | κ ≥ 1] = 2p / (1 − (1 − p)) = 2. In this case p has no influence on the level of contagion, which is always 2.

The above two results indicate that 1 ≤ E[κ | κ ≥ 1] ≤ 2.
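This contagion measure is easy to estimate by Monte Carlo. The sketch below (illustrative Python; the sample size, ν = 3 and ρ = 0.5 are assumptions) compares E[κ | κ ≥ 1] for a bivariate normal and a bivariate Student-t pair with the same correlation, anticipating the results in the next section:

```python
import numpy as np

def expected_crashes(x, y, q=0.99):
    """Monte Carlo version of E[kappa | kappa >= 1] at the q-quantile
    threshold: expected number of crashing series given at least one."""
    sx, sy = np.quantile(x, q), np.quantile(y, q)
    p_any = 1.0 - np.mean((x <= sx) & (y <= sy))
    return (np.mean(x > sx) + np.mean(y > sy)) / p_any

rng = np.random.default_rng(6)
n, rho = 1_000_000, 0.5
z = rng.normal(size=(n, 2))
z[:, 1] = rho * z[:, 0] + np.sqrt(1 - rho**2) * z[:, 1]

w = rng.chisquare(3, size=n) / 3.0
t = z / np.sqrt(w)[:, None]      # bivariate t(3): fat tails, same rho

print(expected_crashes(z[:, 0], z[:, 1]))  # thin tails: barely above 1
print(expected_crashes(t[:, 0], t[:, 1]))  # fat tails: clearly larger
```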

6.3 Results of fat tail experiment

The empirical analysis performed by Hartmann et al (2010) shows that the fragility of a (multivariate) system depends on the marginal distributions of the (univariate) fundamentals. More specifically, it was found that in the case of thick tails the contagion measure E[κ | κ ≥ 1] is closer to 2, and in the case of thin tails it is closer to 1. This supports their hypothesis that if those marginals have a thick tail (such as the Student t-distribution) there is more risk of contagion occurring than if those marginals have thin tails (such as the normal distribution).

Hartmann et al (2010) indicate two policy implications stemming from their empirical results. Firstly, regulators should take into account that the likelihood of multiple simultaneous currency crises is much higher than under the assumption of a multivariate normal distribution. Secondly, macro policy rather than micro market oddities tends to drive large exchange-rate swings, so regulators should take care not to overshoot in their regulations.

Based on the discovered attributes of contagion it would be advisable to pursue 'steady hand' regulation rather than allowing for drastic changes in variables such as money supply or interest rates, in order to diminish the occurrence of fat tails, and even to take active counter-measures in the case of large market fluctuations.


7 CoVaR

CoVaR is a risk measure which measures the contribution of one institution to the risk of the entire economy, taking spillover effects into account. This chapter describes how the CoVaR concept may be used and how to move CoVaR into the copula framework.

7.1 Defining CoVaR

The 'co' in CoVaR stands for conditional, contagion or comovement (Adrian et al, 2011). CoVaR aims at quantifying the amount of distress to the entire economy that results from the distress of just one firm i. In this it is a bit like the Shapley value, which measures the marginal contribution of each player to a coalition.

This can be mathematically defined as: a particular institution's marginal contribution to systemic risk (∆CoVaR) is the difference between the CoVaR conditional on the institution being under distress and the CoVaR in the median state of the institution (Adrian et al, 2011).

This can be extended to spill-over effects between institutions: ∆CoVaR^{j|i} measures the increase in the risk of institution j when institution i fails. Keep in mind that institution i might have a large effect on institution j but not the other way around, so there is no reason to think ∆CoVaR^{j|i} needs to equal ∆CoVaR^{i|j}.

CoVaR_q^{j|i}, the VaR of institution j (or the financial system) conditional on some event C(X^i) befalling institution i, is implicitly defined as (Adrian et al, 2011):

P[X^j ≤ CoVaR_q^{j|C(X^i)} | C(X^i)] = q

The contribution of i to j can then be written as

∆CoVaR_q^{j|i} = CoVaR_q^{j|X^i = VaR_q^i} − CoVaR_q^{j|X^i = median^i}

While one might expect CoVaR to be tightly linked with VaR (when taking 'systemic failure' to be the sum of individual failures), the figure below shows such is not really the case. In this figure the VaR_q^i of an asset is simply defined as the q-th quantile of its returns.


7.2 CoVaR estimation

Quantile regression can be used to estimate the CoVaR (Adrian et al, 2011). First we look at the predicted value of the q-quantile regression of the financial system on portfolio i, which can be defined as \hat{X}_q^{system,i} = \hat{α}_q^i + \hat{β}_q^i X^i. This can be extended with higher-order terms to allow for non-linearity. Secondly, from the definition of Value at Risk it follows that VaR_q^{system} | X^i = \hat{X}_q^{system,i}.

Taking those two formulas together we can now define CoVaR as:

CoVaR_q^{system|X^i = VaR_q^i} := VaR_q^{system} | VaR_q^i = \hat{α}_q^i + \hat{β}_q^i VaR_q^i

and correspondingly define ∆CoVaR_q^i as:

∆CoVaR_q^{system|i} = \hat{β}_q^i (VaR_q^i − VaR_{50%}^i)

Figure 10: Time series average, ∆CoVaR versus VaR (Adrian et al, 2011)

The above definitions deal only with a CoVaR that is constant over time. One can however also define a time-varying CoVaR_t and VaR_t by means of a regression on lagged state variables M_{t−1}. As state variables one should use measures that are (a) known to capture time variation in risk, and (b) liquid and easily tradeable. Based on those demands Adrian et al (2011) choose as their state variables weekly data on (a) the VIX, which measures the implied volatility in the stock market, (b) a short-term 'liquidity spread', here measured by the difference between the three-month repo rate and the three-month bill rate, (c) the change in the three-month Treasury bill rate, and (d) several other assets.

We can use the following pair of regressions to find CoVaR:

X_t^i = α^i + γ^i M_{t−1} + ε_t^i
X_t^{system} = α^{system|i} + β^{system|i} X_t^i + γ^{system|i} M_{t−1} + ε_t^{system|i}

We then generate the predicted values to obtain:

VaR_t^i(q) = \hat{α}_q^i + \hat{γ}_q^i M_{t−1}
CoVaR_t^i(q) = \hat{α}_q^{system|i} + \hat{β}_q^{system|i} VaR_t^i(q) + \hat{γ}_q^{system|i} M_{t−1}

Now we can compute ∆CoVaR_t^i for each institution i as:

∆CoVaR_t^i(q) = CoVaR_t^i(q) − CoVaR_t^i(50%) = \hat{β}^{system|i}(VaR_t^i(q) − VaR_t^i(50%))
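A minimal sketch of the quantile-regression estimation (Python with statsmodels assumed available; the simulated returns and the 5% level are illustrative, and the lagged state variables M_{t−1} are omitted for brevity):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n, q = 2500, 0.05
x_i = rng.standard_t(df=5, size=n)                  # institution i returns
x_sys = 0.4 * x_i + rng.standard_t(df=5, size=n)    # system returns

# q-quantile regression of the system on institution i
res = sm.QuantReg(x_sys, sm.add_constant(x_i)).fit(q=q)
a_hat, b_hat = res.params

var_i_q = np.quantile(x_i, q)                # VaR_q^i as a return quantile
var_i_med = np.quantile(x_i, 0.5)
covar = a_hat + b_hat * var_i_q              # CoVaR_q^{system|i}
delta_covar = b_hat * (var_i_q - var_i_med)  # Delta-CoVaR_q^{system|i}
print(covar, delta_covar)
```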

An important benefit of calculating the forward-∆CoVaR_t by means of the above regression, using characteristics such as size, maturity and loan-loss reserves, is that it moves countercyclically compared with the contemporaneous ∆CoVaR. This makes it a valuable tool for macro-prudential regulation (Adrian et al, 2011).

Figure 11: Counter-movement of contemporaneous ∆CoVaR and forward-∆CoVaR_t (Adrian et al, 2011)

7.3 CoVaR in the Copula framework

Let L^i and L^s be variables representing the losses of institution i and financial system s, with marginal distributions F_i and F_s. Let H be the joint distribution with copula C, or H(x, y) = C(F_i(x), F_s(y)).

Assume, as usual for a copula, that (a) for any v ∈ [0, 1] the partial derivative ∂C(u,v)/∂u exists (for almost all u), is non-decreasing and satisfies 0 ≤ ∂C(u,v)/∂u ≤ 1, and (b) for any u ∈ [0, 1] the partial derivative ∂C(u,v)/∂v exists (for almost all v), is non-decreasing and satisfies 0 ≤ ∂C(u,v)/∂v ≤ 1. Also assume g(u, v) := ∂C(u,v)/∂u is invertible with respect to v.

Now for all l ∈ R the CoVaR_α^{s|L^i=l} can be expressed as (Hakwa et al, 2012):

CoVaR_α^{s|L^i=l} = F_s^{-1}(g^{-1}(α, F_i(l))) for all α ∈ [0, 1]

This expression has the advantage that CoVaR_α^{s|L^i=l} is separated into two components: (a) the marginal distributions F_i and F_s for the univariate risks of institution i and financial system s, and (b) the function g, which represents the dependency between i and s. This allows one to investigate the effect of a change in either of those components on the outcome.


In practice the conditioning level l for financial institution i might often be implicitly defined by a β such that l = F_i^{-1}(β). Then the previous expression may be written as CoVaR_α^{s|L^i=l} = F_s^{-1}(g^{-1}(α, β)). This new expression has only α and β as input parameters, so we can use a simplified version of the CoVaR definition: CoVaR_α^β := CoVaR_α^{s|L^i=l}. This in turn leads to the conclusion that CoVaR_α^{s|L^i=l} does not depend on the marginal distribution F_i (contrary to VaR) but only on the marginal distribution of the system's losses F_s and on the copula that links i and s (Hakwa et al, 2012).

We can say the CoVaR_α^{s|L^i=l} is really just the VaR of the whole system under the transformation \tilde{α} = g^{-1}(α, F_i(l)). Thus we can connect CoVaR_α^{s|L^i=l} to the total Value-at-Risk of the financial system (VaR_α^s) using the fact that VaR_{\tilde{α}}^s = F_s^{-1}(\tilde{α}) = CoVaR_α^{s|L^i=l}.

If the loss of the financial system L^s is normally distributed (L^s ~ N(μ_s, σ_s)) then (Hakwa et al, 2012):

CoVaR_α^{s|L^i=l} = σ_s Φ^{-1}(\tilde{α}) + μ_s
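For the Gaussian copula the partial derivative g(u, v) = ∂C(u, v)/∂u has a closed form, so g^{-1} and hence the CoVaR can be written out explicitly. The sketch below (illustrative Python; all parameter values are assumptions) combines that inversion with the normal-losses formula above:

```python
from scipy.stats import norm

def covar_gaussian(alpha, beta, rho, mu_s, sigma_s):
    """CoVaR under a Gaussian copula: g(u, v) = dC/du has the closed form
    Phi((Phi^-1(v) - rho*Phi^-1(u)) / sqrt(1 - rho^2)), so inverting in v
    gives alpha_tilde = Phi(sqrt(1-rho^2)*Phi^-1(alpha) + rho*Phi^-1(beta));
    with normal system losses CoVaR = mu_s + sigma_s * Phi^-1(alpha_tilde)."""
    a_tilde = norm.cdf((1 - rho**2) ** 0.5 * norm.ppf(alpha)
                       + rho * norm.ppf(beta))
    return mu_s + sigma_s * norm.ppf(a_tilde)

# conditioning on institution i at its own 95% loss level (beta = 0.95)
print(covar_gaussian(alpha=0.95, beta=0.95, rho=0.6, mu_s=0.0, sigma_s=1.0))
print(covar_gaussian(alpha=0.95, beta=0.50, rho=0.6, mu_s=0.0, sigma_s=1.0))
```

Comparing β = 0.95 against β = 0.50 shows how much conditioning on a distressed institution lifts the system's VaR.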

8 Linear programming

This chapter looks at the use of linear programming techniques such as the knapsack algorithm and the use of duals to solve a system of equations in order to find a numerical Multivariate Value-at-Risk measurement.

8.1 Polyhedral scalarization

In order to go from univariate to multivariate, scalars need to be extended to vectors. Noyan and Rudolf (2013) limit themselves to linear scalarization functions in order to obtain computable formulas. A relatively easy way to compare two vectors X and Y is coordinate-wise preference, meaning that X is preferred over Y if X_i ⪰ Y_i for all i = 1, ..., d. In this case the set of scalarization vectors C ⊂ R^d is taken as C = {e_1, ..., e_d}, with e_l = (0, ..., 0, 1, 0, ..., 0) ∈ R^d the unit vectors.

By taking other scalarization vectors c we get the following notion: given two d-dimensional random vectors X and Y, we say that X is preferable (⪰) to Y with respect to C if c^T X ⪰ c^T Y holds for all c ∈ C. We can take this result to the VaR framework. Noyan and Rudolf (2013) prove that we can assume C to be a compact polyhedron, i.e. a polytope, without loss of generality. Let X and Y be two d-dimensional random vectors, C ⊂ R^d a set of scalarization vectors, and α ∈ (0, 1] the confidence level. We say X is cVaR-preferred to Y at level α with respect to C if:

cVaR_α(c^T X) ≥ cVaR_α(c^T Y) for all c ∈ C.

Let (Ω, 2^Ω, Π) be a finite probability space with Ω = {ω_1, ..., ω_n} and Π(ω_i) = p_i. A decision variable z is selected from a set Z, and the outcome map is G : Z × Ω → R^d. Let f : Z → R be an objective function, Y a d-dimensional benchmark random vector, C ⊂ R^d a polytope of scalarization vectors, and α ∈ (0, 1) the confidence level. Now the problem to be solved can be written as:

max f(z)
s.t. G(z) ⪰^C_{cVaR_α} Y, z ∈ Z


The above is the simple form, meaning only a single cVaR constraint is used. The model can however also be used in more general forms, with M benchmarks for cVaR and different scalarization sets:

max f(z)
s.t. G(z) ⪰^{C_{i,j}}_{cVaR_{α_{i,j}}} Y_i, i = 1, ..., M, j = 1, ..., K_i,
z ∈ Z

On the one hand cVaR is a maximization problem. On the other hand it can be viewed as a spectral risk measure (Acerbi, 2002) and thus as a weighted sum of the least favorable outcomes, so that cVaR is also the optimum of minimization problems. Noyan et al (2013) give several equivalent ways of looking at this problem:

Let V be a random variable with realizations v_1, ..., v_n and corresponding probabilities p_1, ..., p_n. Then the cVaR at confidence level α ∈ (0, 1], cVaR_α(V), can be found with any of the following optimizations (1), (2) or (3), which are equivalent to each other:

(1) max{ η − (1/α) Σ_{i=1}^n p_i w_i }
s.t. w_i ≥ η − v_i, i = 1, ..., n
w_i ≥ 0, i = 1, ..., n

The linear programming dual of problem (1) is:

(2) min (1/α) Σ_{i=1}^n γ_i v_i
s.t. Σ_{i=1}^n γ_i = α
0 ≤ γ_i ≤ p_i, i = 1, ..., n

We may assume without loss of generality that v_1 ≤ v_2 ≤ ... ≤ v_n. Let k* = min{k ∈ {1, ..., n} : Σ_{i=1}^k p_i ≥ α}. Problem (2) can be seen as a knapsack problem, for which the greedy solution is optimal:

γ*_i = p_i for i = 1, ..., k* − 1; γ*_{k*} = α − Σ_{i=1}^{k*−1} p_i; γ*_i = 0 for i = k* + 1, ..., n

Analogously, a feasible solution to (2) is γ_i = p_i for i ∈ K; γ_k = α − Σ_{i∈K} p_i; γ_i = 0 for i ∉ K ∪ {k}, with objective value Ψ_α(V, K, k). Setting K* = {1, ..., k* − 1}, the pair (K*, k*) is a feasible solution of (3) with objective value cVaR_α(V) = Ψ_α(V, K*, k*):

(3) min Ψ_α(V, K, k)
s.t. K ⊂ {1, ..., n}, k ∈ {1, ..., n}\K
Σ_{i∈K} p_i ≤ α, α − Σ_{i∈K} p_i ≤ p_k

where Ψ_α(V, K, k) = (1/α) [Σ_{i∈K} p_i v_i + (α − Σ_{i∈K} p_i) v_k]
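The greedy solution of the knapsack form (2) translates directly into code. A minimal sketch (Python; the outcome vector is a made-up example):

```python
import numpy as np

def cvar_greedy(v, p, alpha):
    """Greedy solution of the knapsack form (2): put weight on the
    smallest outcomes until the budget alpha is spent, split at k*."""
    order = np.argsort(v)
    v, p = np.asarray(v, float)[order], np.asarray(p, float)[order]
    gamma = np.zeros_like(p)
    cum = np.cumsum(p)
    k = np.searchsorted(cum, alpha)          # k* (0-based index)
    gamma[:k] = p[:k]
    gamma[k] = alpha - p[:k].sum()           # remainder of the budget
    return float(gamma @ v) / alpha          # Psi_alpha(V, K*, k*)

# five equally likely outcomes; cVaR_0.4 averages the worst 40%
print(cvar_greedy([3, -1, 5, 0, -2], [0.2] * 5, alpha=0.4))  # -1.5
```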


For any nontrivial polyhedron C of scalarization vectors c ∈ C, the cVaR constraint is by definition equivalent to a collection of infinitely many scalar cVaR constraints. However, if we restrict ourselves to finite probability spaces we only need a finite number of those scalarization vectors (Noyan et al, 2013).

Let X and Y be d-dimensional random vectors with realizations x_1, ..., x_n and y_1, ..., y_m and corresponding probabilities p_1, ..., p_n and q_1, ..., q_m, and let C ⊂ R^d be a polytope of scalarization vectors. Then X is cVaR-preferable to Y at level α with respect to C if and only if:

cVaR_α(c_{(l)}^T X) ≥ cVaR_α(c_{(l)}^T Y) for all l = 1, ..., N,

where c_{(1)}, ..., c_{(N)} are the d-vertices of the polyhedron

P(C, Y) = {(c, η, w) ∈ C × R × R^m_+ : w_j ≥ η − c^T y_j, j = 1, ..., m}.

In the above theorem the confidence levels on the two sides are the same; this is however not a necessary condition, and the analogous result holds for constraints of the form cVaR_{α_1}(c^T X) ≥ cVaR_{α_2}(c^T Y) for all c ∈ C, for any α_1, α_2 ∈ (0, 1] (Noyan et al, 2013).

Using the same notation, the random vector X dominates Y in polyhedral linear second order with respect to C if and only if c_{(l)}^T X ⪰_{(2)} c_{(l)}^T Y for all l = 1, ..., N.


8.2 Polyhedral MVaR

Prékopa (2010) similarly uses a simplex method to solve a two-stage problem consisting of a deterministic and a stochastic constraint. The decision variables are x, y ∈ R^r. Take T as an r × r matrix, initially subject to the probabilistic constraint P[Tx ≥ ξ] ≥ p_0. Furthermore define 0 ≤ λ ≤ 1 and take A and B to be convex subsets of R^r.

Prékopa (2010) attempts to find the MVaR by using a combination of a log-concave pdf f and a probability measure P:

f(λx + (1 − λ)y) ≥ (f(x))^λ (f(y))^{1−λ}
P[λA + (1 − λ)B] ≥ (P[A])^λ (P[B])^{1−λ}

The problem to be solved is written as (Prékopa, 1973):

min{ max_{1≤i≤M} c_i^T x + q^T y }
subject to Ax + By ≥ b
P[Tx + Wy ≥ ξ] ≥ p_0
x ≥ 0, y ≥ 0

Take ξ to be discrete and define the set of p_0-efficient points as s_1, ..., s_N. The probabilistic constraint can be replaced by an equivalent constraint by using Tx + Wy ∈ ∪_{i=1}^N (s_i + R^r_+). Then we get the equivalent problem:

min{ max_{1≤i≤M} c_i^T x + q^T y }
subject to Ax + By ≥ b
Tx + Wy ∈ ∪_{i=1}^N (s_i + R^r_+)
x ≥ 0, y ≥ 0

A relaxation of this problem is:

min{ max_{1≤i≤M} c_i^T x + q^T y }
subject to Ax + By ≥ b
Tx + Wy − Σ_{i=1}^N λ_i s_i ≥ 0
Σ_{i=1}^N λ_i = 1
x ≥ 0, y ≥ 0, λ ≥ 0

Introducing a variable t means this problem can be written as:

min{ t + q^T y }
subject to t − c_i^T x ≥ 0, i = 1, ..., M
Ax + By ≥ b
Tx + Wy − Σ_{i=1}^N λ_i s_i ≥ 0
Σ_{i=1}^N λ_i = 1
x ≥ 0, y ≥ 0, λ ≥ 0

The dual of this problem is:

max{ v + b^T u }
subject to v − s_i^T z ≤ 0, i = 1, ..., N
B^T u + W^T z ≤ q
A^T u + T^T z − Σ_{i=1}^M μ_i c_i ≤ 0
Σ_{i=1}^M μ_i = 1
μ ≥ 0, z ≥ 0

Which can be rewritten as (Prékopa, 2010):

max{ min_{1≤i≤N} s_i^T z + b^T u }
subject to W^T z + B^T u ≤ q
P[−T^T z − A^T u ≥ η] ≥ p_1


8.3 Solution Algorithms

The following algorithm is proposed by Noyan et al (2013) as a viable solution method for the optimization problems from section 8.1:

1. Initialize a set of scalarization vectors \tilde{C} = {\tilde{c}_{(1)}, ..., \tilde{c}_{(L)}} ⊂ C
2. Solve the master problem:
max f(z)
s.t. cVaR_α(\tilde{c}_{(l)}^T G(z)) ≥ cVaR_α(\tilde{c}_{(l)}^T Y), l = 1, ..., L
z ∈ Z
3. if the master problem is infeasible then
4. stop
5. else
6. Let z* be an optimal solution
7. Given the optimal decision vector z*, set X = G(z*) and solve the cut generation problem: min_{c∈C} cVaR_α(c^T X) − cVaR_α(c^T Y)
8. if the optimal objective value of the cut generation problem is non-negative then
9. stop
10. else
11. Find an optimal solution \tilde{c}_{(L+1)} of the cut generation problem, which is a d-vertex of P(C, Y). Set \tilde{C} = \tilde{C} ∪ {\tilde{c}_{(L+1)}} and L = L + 1, then go to step 2
12. end

9 Upper and lower orthants

The upper (lower) orthant probability can be defined as the probability that all (none) of your variables have passed a certain threshold. This chapter discusses upper/lower orthant VaR and gives some mathematical background.

9.1 Upper and lower orthants

For a probability α ∈ [0, 1] the Value-at-Risk of a univariate random variable Y is the unique threshold defined as VaR_α(Y) := inf{x ∈ R : G(x) ≥ α}, in which G is the (assumed strictly increasing) distribution function of Y.

However in the multivariate case, even with a continuous distribution function G, there are infinitely many vectors s ∈ R^k for which G(s) = α (Embrechts et al, 2006). This means that when scaling the above univariate problem up to a multivariate framework we need to approach the problem from two angles.

On the one hand one might be interested in bounding from above the probability that the aggregate loss in all subgroups exceeds a given threshold, meaning for each subgroup j ∈ K and one-period loss vector X the quantity P[ψ(X)_j ≥ s_j, j ∈ K]. The multivariate lower-orthant VaR (LO-VaR) for α ∈ [0, 1], using an increasing function G : R^k → [0, 1], can be defined as VaR_α(G) := ∂{x ∈ R^k : G(x) ≥ α}.


Figure 13: LO-VaR for different α-levels for the sum of 2 bivariate Pareto distributions with different θ (left) and bivariate Log-Normal distributions (right) (Embrechts et al, 2006)

On the other hand one might be interested in bounding from below the probability that for none of the subgroups the aggregate loss exceeds the threshold, meaning P[ψ(X)_j < s_j, j ∈ K]. The multivariate upper-orthant VaR (UO-VaR) uses a decreasing function \bar{G} and can be defined in an analogous way as VaR_α(\bar{G}) := ∂{x ∈ R^k : \bar{G}(x) ≤ 1 − α}.

Figure 14: UO-VaR for the sum of 3 bivariate Pareto distributions (left) and bivariate Log-Normal (right) (Embrechts et al, 2006)

The previous graphs used marginal distributions with n equal to two or three, but LO-VaR and UO-VaR can be scaled up to higher n as well, as illustrated in the graph below, which uses n = 5. 'Dual bounds' refers to independence and the 'standard bound' is the worst-case scenario. On the left is P[Σ_{i=1}^5 X_i < (s, s)], which can be interpreted as the probability that both portfolio sums have stayed below threshold s. On the right is the corresponding upper-orthant probability that both portfolio sums have passed it.

Figure 15: LO-VaR (left) and UO-VaR(right) for bivariate Log-Normal portfolios with n=5 (Embrechts et al, 2006)

9.2 Kendall distribution

Multivariate distribution can be examined by means of the Kendal distribution (K). Using multivari-ate distribution function F (X) the Kendal distribution is defined as K(α) = P[F (x ≤ α] for α ∈ [0, 1]. The survival-function of F (X) is K(α) = P[F (X > α]. Keep in mind it is not possible to reconstruct F from K alone, as K does not contain information about the marginal distributions FX1, ..., FXd.

As a consequence of Sklar’s theorem (see section 4.1) one can however say the Kendal distribution only depends on copula C(U). So if we take U = (FX1(X1), ..., FXd(Xd)) the Kendal distribution is

defined as K(α) = P[C(U ≤ α] (Cousin et al, 2013)

The Kendall distribution for d-dimensional Archimedean dependence structures (assuming (1) the generator φ gives C(u_1, ..., u_d) = φ^{-1}(φ(u_1) + ... + φ(u_d)) for all (u_1, ..., u_d) ∈ [0, 1]^d, and (2) φ^{-1} is d-monotone on [0, ∞)):

Archimedean: K(α) = α + Σ_{i=1}^{d−1} (1/i!) (−φ(α))^i (φ^{-1})^{(i)}(φ(α))
Independence: K(α) = α + α Σ_{i=1}^{d−1} (ln(1/α))^i / i!
Comonotonic: K(α) = α

It holds that α ≤ K(α) ≤ 1 for all α ∈ (0, 1). Equivalently: for any random vector U and comonotonic random vector U^C (with copulas C and C^C respectively, and uniform marginals) it holds that C(U) ⪯_st C^C(U^C). This implies that as the level of dependence between X_1, ..., X_d increases, the Kendall distribution also tends to increase (Cousin et al, 2013).

For the most commonly used bivariate Archimedean copulas the Kendall distributions are (with θ as measure of dependence):

Gumbel: K(α, θ) = α(1 − (1/θ) ln α), for θ ∈ [1, ∞)
Frank: K(α, θ) = α + (1/θ)(1 − e^{θα}) ln((1 − e^{−θα})/(1 − e^{−θ})), for θ ∈ (−∞, ∞)\{0}
Clayton: K(α, θ) = α(1 + (1/θ)(1 − α^θ)), for θ ∈ [−1, ∞)\{0}
Ali-Mikhail-Haq: K(α, θ) = α + α(1 − θ + θα)(ln(1 − θ + θα) − ln α)/(1 − θ), for θ ∈ [−1, 1)
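The closed-form Kendall distribution can be checked against simulation. The sketch below (illustrative Python, reusing the Marshall-Olkin Clayton sampler from the chapter 4 sketch with an assumed θ = 2) compares the empirical K(α) = P[C(U) ≤ α] with the formula above:

```python
import numpy as np

theta = 2.0
rng = np.random.default_rng(8)

# sample from the Clayton copula (Marshall-Olkin construction)
v = rng.gamma(1.0 / theta, size=200_000)
e = rng.exponential(size=(200_000, 2))
u = (1.0 + e / v[:, None]) ** (-1.0 / theta)

# empirical Kendall distribution: K(a) = P[C(U1, U2) <= a]
c = (u[:, 0] ** -theta + u[:, 1] ** -theta - 1.0) ** (-1.0 / theta)
for a in (0.1, 0.5, 0.9):
    k_emp = np.mean(c <= a)
    k_thy = a * (1.0 + (1.0 - a**theta) / theta)    # closed form above
    print(f"alpha={a}: empirical {k_emp:.3f} vs closed form {k_thy:.3f}")
```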


Figure 16: Kendall distribution for the Clayton copula (left) and Gumbel copula (right). The dark line represents the comonotonic case (Cousin et al, 2013)

9.3 Math concerning upper and lower orthants

As stated above, the multivariate LO-VaR for an increasing function G : R^d → [0, 1] can be defined as the boundary of its α-upper-level set: ∂{x ∈ R^d : G(x) ≥ α}. Equivalently the UO-VaR for a decreasing function \bar{G} : R^d → [0, 1] can be found as the boundary of its (1 − α)-lower-level set: ∂{x ∈ R^d : \bar{G}(x) ≤ 1 − α}.

Note that those VaR generalizations are represented by an infinite number of points, namely a hypersurface of dimension d − 1 (Cousin et al, 2013). Considering real risk-management problems it is easier to focus on single points in R^d: the conditional expectation of X given that X lies in this set. This means that all measures are real-valued vectors with the same dimension as the portfolio of risks. In addition G (\bar{G}) is chosen as the d-dimensional loss distribution function F (survival function \bar{F}) of the portfolio, in order to be consistent and not add needless complexity. This choice allows capturing information from both the marginal distributions and the multivariate dependence structure, thus eliminating the need for an arbitrary real-valued aggregate transformation.

Furthermore some regularity conditions are assumed:

- The vector X = (X_1, ..., X_d) is non-negative and absolutely continuous
- The multivariate distribution function F is partially increasing, meaning that for at least one component x_j the function F(x_1, ..., x_j, ..., x_d) is increasing
- E[X_i] < ∞ for i = 1, ..., d

We now define the multivariate lower-orthant Value-at-Risk as follows (Cousin et al, 2013):

\underline{VaR}_α(X) = E[X | F(X) = α] = (E[X_1 | F(X) = α], ..., E[X_d | F(X) = α])^T

We define the multivariate upper-orthant Value-at-Risk as:

\overline{VaR}_α(X) = E[X | \bar{F}(X) = 1 − α] = (E[X_1 | \bar{F}(X) = 1 − α], ..., E[X_d | \bar{F}(X) = 1 − α])^T
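These conditional expectations are straightforward to approximate by Monte Carlo when F is known. A minimal sketch (Python; the independent Exp(1) pair and the band width h are illustrative assumptions) estimates the components of \underline{VaR}_α by conditioning on F(X) lying in a small band around α:

```python
import numpy as np

rng = np.random.default_rng(9)
n, alpha, h = 2_000_000, 0.9, 0.002
x = rng.exponential(size=(n, 2))            # independent Exp(1) losses

# the joint cdf is known here: F(x1, x2) = (1 - e^-x1)(1 - e^-x2)
f_vals = (1 - np.exp(-x[:, 0])) * (1 - np.exp(-x[:, 1]))

# LO-VaR component-wise: E[X_i | F(X) = alpha], via a band around alpha
band = np.abs(f_vals - alpha) < h
print(x[band].mean(axis=0))                 # both components roughly equal
```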


We use a limit procedure (Feller, 1966) to interpret those formulas, which gives for the LO-VaR:

E[X_i | F(X) = α] = lim_{h→0} E[X_i | α < F(X) ≤ α + h] = lim_{h→0} [∫_0^∞ x (∫_α^{α+h} f_{(X_i, F(X))}(x, y) dy) dx] / [∫_α^{α+h} f_{F(X)}(y) dy] = [∫_0^∞ x f_{(X_i, F(X))}(x, α) dx] / K'(α), with K'(α) = dK(α)/dα

Define \underline{VaR}^i_α (\overline{VaR}^i_α) as the i-th component of the vector \underline{VaR}_α (\overline{VaR}_α). If X is an exchangeable random vector then \underline{VaR}^i_α = \underline{VaR}^j_α and \overline{VaR}^i_α = \overline{VaR}^j_α for any i, j = 1, ..., d.

Given a univariate random variable X we have E[X | F_X(X) = α] = E[X | \bar{F}_X(X) = 1 − α] = VaR_α(X) for all α ∈ (0, 1). Thus for univariate random variables it holds that \underline{VaR}_α = \overline{VaR}_α.

9.4 Invariance properties UO/LO-VaR

Rewrite the i-th element of the LO-VaR as \underline{VaR}^i_α(h(X)) = E[h_i(X_i) | F_{h(X)}(h(X)) = α] for i = 1, ..., d and h(x_1, ..., x_d) = (h_1(x_1), ..., h_d(x_d)), based on

F_{h(X)}(y_1, ..., y_d) = F_X(h_1^{-1}(y_1), ..., h_d^{-1}(y_d)) if h_1, ..., h_d are non-decreasing functions, and
F_{h(X)}(y_1, ..., y_d) = \bar{F}_X(h_1^{-1}(y_1), ..., h_d^{-1}(y_d)) if h_1, ..., h_d are non-increasing functions.

It thus holds that (Cousin et al, 2013):

- if h_1, ..., h_d are non-decreasing functions, then \underline{VaR}^i_α(h(X)) = E[h_i(X_i) | F_X(X) = α] for i = 1, ..., d
- if h_1, ..., h_d are non-increasing functions, then \underline{VaR}^i_α(h(X)) = E[h_i(X_i) | \bar{F}_X(X) = α] for i = 1, ..., d
- if h_1, ..., h_d are non-decreasing linear functions, then \underline{VaR}_α(h(X)) = h(\underline{VaR}_α(X)) and \overline{VaR}_α(h(X)) = h(\overline{VaR}_α(X))
- if h_1, ..., h_d are non-increasing linear functions, then \underline{VaR}_α(h(X)) = h(\overline{VaR}_{1−α}(X)) and \overline{VaR}_α(h(X)) = h(\underline{VaR}_{1−α}(X))

The above can be used to prove (Cousin et al, 2013) that UO-VaR and LO-VaR satisfy the positive homogeneity and translation invariance properties:

- Positive homogeneity: for all c ∈ R^d_+: \underline{VaR}_α(cX) = c\underline{VaR}_α(X) and \overline{VaR}_α(cX) = c\overline{VaR}_α(X)
- Translation invariance: for all c ∈ R^d_+: \underline{VaR}_α(c + X) = c + \underline{VaR}_α(X) and \overline{VaR}_α(c + X) = c + \overline{VaR}_α(X)

As the univariate VaR satisfies those properties, one would always want any proposed multivariate extension of VaR to satisfy them as well.


10 Joint Density Tail

The Joint Density Tail refers to a mathematical method for combining multivariate tails into a univariate vector, on which the usual univariate methods (such as the univariate VaR) can then be used again.

10.1 Introduction to joint density tails

A Joint Density Tail (JDT) is an unbounded region of Euclidean space that is marked off by cut-off values (Polanski et al, 2013). The JDT O(d, v) projects an N-dimensional space R^N onto a line R and requires a cut-off value v ∈ R and a directional vector d ∈ R^N. Alternatively one can interpret the JDT as the intersection of univariate tails, with u^i a unit vector and O(d_i u^i, v) a half-space in R^N.

The vector d can be interpreted as a portfolio; e.g. a portfolio that holds 2 units of asset 1 and is short 1 unit of asset 2 is written as d = (d_1, d_2) = (2, −1).

JDT: O(d, v) := {y ∈ R^N : y_i/d_i ≥ v, ∀d_i ≠ 0} = ∩_{i: d_i≠0} O(d_i u^i, v)

Assuming an N-dimensional pdf f, the probability mass of the JDT O(d, v) can be defined as

z_d(v, f) = |∫ ... ∫ f(τ_1, ..., τ_N) dτ_1 ... dτ_N|

where the integral over coordinate i runs from d_i·v to d_i·∞ when d_i ≠ 0 and over all of R when d_i = 0.

Assume a line in R^N along d ∈ R^N. Now define the projection x_d of a point x ∈ R^N onto this line:

x_d = v_d(x) · d, where v_d(x) = min_{d_i≠0}{x_i/d_i}

So the point x is projected along the axis that corresponds to the minimum ratio x_i/d_i.

Figure 17: JDT O(d, vα) in R2 (left) with projection x (right) (Polanski et al, 2013)

The above graph gives an intuitive proof of the idea that the projection x_d of a point x that lies inside (outside) a JDT remains inside (outside) this JDT (Polanski et al, 2013). So for a directional vector d ∈ R^N, d ≠ 0 and v ∈ R: x ∈ O(d, v) ⇔ x_d ∈ O(d, v).

Unlike most other multivariate approaches, the JDT approach is only concerned with risk along the pre-specified vector d. The benefits of this are twofold (Polanski et al, 2013). Firstly, it reduces a multivariate problem to a univariate one. Secondly, there is no constraint on d, allowing an investor to focus on a specific portfolio rather than having to consider complete market risk.

According to the definition of univariate VaR, the tail probability mass z_d(v, f) is set equal to α. In analogy to this we can define the α-level multivariate Value-at-Risk in direction d ∈ R^N (MVaR^d_α) as the JDT O(d, v_α) whose probability mass z_d(v_α, f) is at level α.

Figure 18: MVaR as intersection of VaRs, for a bivariate distribution (Polanski et al, 2013)

A value x ∈ R^N is an extreme observation when its projection v_d(x) violates the threshold v_d(f, α): v_d(x) ≥ v_d(f, α) ⇔ x ∈ MVaR^d_α.

Polanski et al (2013) show that for a continuous pdf f, directional vector d ∈ R^N, d ≠ 0, and significance level α ∈ (0, 1) it holds that: x ∈ MVaR^d_α ⇔ z_d(v_d(x), f) ≤ α
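A minimal simulation sketch of these definitions (Python; the bivariate normal losses, the portfolio d = (2, −1) and α = 5% are assumptions, with d oriented so that larger projection coordinates lie deeper in the tail being measured):

```python
import numpy as np

rng = np.random.default_rng(10)
x = rng.normal(size=(500_000, 2))    # bivariate standard normal losses
d = np.array([2.0, -1.0])            # 2 units of asset 1, short 1 of asset 2

# projection coordinate v_d(x) = min over i with d_i != 0 of x_i / d_i
v = (x / d).min(axis=1)

alpha = 0.05
v_alpha = np.quantile(v, 1 - alpha)  # threshold with z_d(v_alpha, f) = alpha
extreme = v >= v_alpha               # x in MVaR_alpha^d
print(v_alpha, extreme.mean())       # the fraction is alpha by construction
```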

This implies that under the correct forecasting model the proportion of values z^d_t ≤ α (for observations {x_t}_{t=1}^T) should approach the level α in large samples. This property is referred to as unconditional accuracy. Conditional accuracy additionally requires that the occurrence of z^d ≤ α is unpredictable when conditioned on available information, or in other words that MVaR^d_α violations should be serially uncorrelated. As the JDT procedure basically transforms a multivariate distribution back to a univariate one, the existence of serial correlation can be confirmed or ruled out using univariate tests.

10.2 JDT-risk dependence

We can define the conditional probability of M V aRαd given that M V aRdα˜˜ as: pαα˜(f, d, ˜d) := Pf(M V ARdα|M V aR ˜ d ˜ α) = Pf(M V aRad)∩M V aR ˜ d ˜ α) Pf(M V aRdα˜˜)

Because of independence it holds that for α = ˜α that: pα(f, d, ˜d) := pαα(f, d, ˜d) = Pf(M V aRdα) = α,

given that M V aRd

α and M V aR ˜ d

αare independend. Thus we can define the dependence between these

two as the relative change in conditional probability: γα(a, d, ˜d) = γα(a, ˜d, d) := (pα(a, d, ˜d) − α)/α.

Another measure for dependence in risk is the Conditional MVaR (CMVaR) (Polasnski et al, 2013) which is quit similar to the CoVaR (Adrian et al, 2011). This is the relative change in M V aRdα when conditioned on M V aRdα˜:

CM V aRd, ˜αd= v

d(f |M V aR

α,α)−vd(f,α)


11 Comparison of univariate and multivariate models

This chapter describes an experiment which compared many univariate and multivariate models in order to find out whether the use of a multivariate model really makes for better predictions.

11.1 Background of the experiment

In econometrics a bigger model is not necessarily a better model: as more variables are added, there is also added uncertainty about their values, adding uncertainty to the model as a whole. Several authors (Berkowitz et al (2002), Brooks et al (2003), Bauwens et al (2006), Christoffersen (2009) and McAleer (2009)) have run simulations and concluded that the univariate VaR model performed better than the multivariate VaR model.

However, Santos et al (2013) argue the models used in those tests were too small, oftentimes containing no more than 3 or 4 assets, whereas the true power of a multivariate model is found in the analysis of a big, complex portfolio. Secondly, Santos et al (2013) dispute the validity of the comparison measure: the coverage/independence criteria used are more suitable for evaluating a single model and less suitable for comparing different models with each other. Santos et al (2013) propose using the Conditional Predictive Ability (CPA) test instead (Giacomini and White, 2006). Thirdly, some of the tested models only used constant conditional correlations, whereas Santos et al (2013) allege those correlations evolve over time and can exhibit asymmetric effects.

11.2 Methodology

Santos et al (2013) performed a Monte Carlo experiment with several variations on the data-generating process (DGP) and analyzed 3 large real market portfolios. Comparison between models was done with back-testing and the CPA test. Portfolio weights are considered known, so that the focus could be on whether the additional information contained in a multivariate model makes up for the additional uncertainty due to the larger number of parameters a multivariate model needs.

The models used are variants of the conditional autoregressive VaR (CAViaR), in which the $\vartheta$-quantile moves as follows:

$$VaR_t^\vartheta = \left[\omega + \beta_1 y_{p,t-1}^2 + \beta_2 \left(VaR_{t-1}^\vartheta\right)^2\right]^{1/2}$$

The portfolio VaR is $VaR_t^\vartheta = \mu_{p,t} + \sigma_{p,t}\, q_\vartheta$.
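The recursion above can be filtered directly. The sketch below traces out the CAViaR path for a return series; the parameter values are illustrative placeholders, in practice they are estimated per quantile.

```python
# Sketch of the indirect-GARCH CAViaR recursion:
# VaR_t = sqrt(omega + beta1 * y_{t-1}^2 + beta2 * VaR_{t-1}^2)
import numpy as np

def caviar_path(y, omega, beta1, beta2, var0):
    var = np.empty_like(y, dtype=float)
    var[0] = var0                       # starting value, e.g. empirical VaR
    for t in range(1, len(y)):
        var[t] = np.sqrt(omega + beta1 * y[t - 1]**2 + beta2 * var[t - 1]**2)
    return var
```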

The fit of the models is tested with a CPA test. The asymmetric linear loss function used in the CPA test is $L_\alpha(e_t) = (\alpha - I(e_t < 0))\,e_t$, with $e_t = y_{p,t} - VaR_t^\alpha$, so that violations ($e_t < 0$) are weighted by $1-\alpha$ and non-violations by $\alpha$.
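A sketch of this loss function follows; averaging it per model is the intuition behind the comparison, while the full Giacomini-White test statistic additionally uses conditioning instruments.

```python
# Asymmetric linear ("tick") loss for VaR forecasts; non-negative by
# construction, with violations penalized heavily for small alpha.
import numpy as np

def tick_loss(y, var_forecast, alpha):
    e = y - var_forecast                        # forecast error e_t
    return (alpha - (e < 0).astype(float)) * e  # L_alpha(e_t) >= 0
```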

In the univariate model the portfolio returns $y_{p,t}$ are conditioned on a linear combination of past asset returns, $y_{p,t} = w_{t-1}' Y_t$. In the multivariate model the portfolio returns are conditioned on the entire vector of past asset returns $Y_t$.

Specifically, the models used are (represented in their simplest form, with only one lag):

GARCH (Bollerslev, 1986): $\sigma_{p,t}^2 = \omega + \alpha y_{p,t-1}^2 + \beta\sigma_{p,t-1}^2$, with $\omega > 0$, $\alpha, \beta \geq 0$, $\alpha + \beta < 1$

GJR (Glosten et al, 1993): $\sigma_{p,t}^2 = \omega + \alpha y_{p,t-1}^2 + \beta\sigma_{p,t-1}^2 + \delta I_{[\varepsilon_{p,t-1} \leq 0]}\, y_{p,t-1}^2$, with $\omega > 0$, $\alpha, \beta, \delta \geq 0$

EGARCH (Nelson, 1991): $\ln(\sigma_{p,t}^2) = \omega + \alpha\left(|\varepsilon_{p,t-1}| - E|\varepsilon_{p,t-1}|\right) + \delta\varepsilon_{p,t-1} + \beta\ln(\sigma_{p,t-1}^2)$

APARCH (Ding et al, 1993): $\sigma_{p,t}^\lambda = \omega + \alpha\left(|\varepsilon_{p,t-1}| + \delta\varepsilon_{p,t-1}\right)^\lambda + \beta\sigma_{p,t-1}^\lambda$, with $\omega, \alpha \geq 0$, $-1 \leq \delta \leq 1$
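As an illustration of how such a variance recursion turns into a VaR forecast, the sketch below filters a GARCH(1,1) variance path and applies $VaR_t^\vartheta = \mu_{p,t} + \sigma_{p,t} q_\vartheta$. The Gaussian quantile, the initialization, and the parameter values are illustrative assumptions; in practice the parameters come from maximum likelihood estimation.

```python
# Sketch: GARCH(1,1) variance filter plus one-step VaR forecast for a
# portfolio return series y; parameters are placeholders, not estimates.
import numpy as np
from scipy import stats

def garch_var(y, omega, a, b, theta=0.01, mu=0.0):
    sigma2 = np.empty_like(y, dtype=float)
    sigma2[0] = y.var()                   # a common initialization choice
    for t in range(1, len(y)):
        sigma2[t] = omega + a * y[t - 1]**2 + b * sigma2[t - 1]
    return mu + np.sqrt(sigma2) * stats.norm.ppf(theta)   # VaR path
```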

The conditional covariance matrix $H_t$ can be decomposed as follows: $H_t = D_t R_t D_t$, where $D_t$ is the diagonal matrix of conditional standard deviations and $R_t$ the conditional correlation matrix. Santos et al (2013) used three different kinds of $R_t$: Constant Conditional Correlation (CCC), Dynamic Conditional Correlation (DCC), and an asymmetric DCC specification.
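The following sketch shows how, for one time point, this decomposition yields the portfolio volatility that feeds into the VaR formula; the weights and the correlation matrix below are illustrative, not taken from Santos et al (2013).

```python
# Sketch: portfolio volatility from the decomposition H_t = D_t R_t D_t.
import numpy as np

def portfolio_vol(vols, R, w):
    D = np.diag(vols)          # conditional standard deviations
    H = D @ R @ D              # conditional covariance matrix H_t
    return float(np.sqrt(w @ H @ w))

# Three assets, a CCC-style constant correlation matrix, equal weights.
R = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.2],
              [0.3, 0.2, 1.0]])
print(portfolio_vol(np.array([0.010, 0.020, 0.015]), R, np.full(3, 1 / 3)))
```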

11.3 Results

The Monte Carlo experiment gave mixed results in the in-sample comparison, as in 80% of simulations the CPA test was indifferent between the univariate and multivariate models. However, in the out-of-sample test the CPA test was indifferent in only 40% of simulations, and in the remaining 60% the multivariate model was usually preferred over the univariate model. The difference can be explained by the fact that univariate models are more limited and less flexible than multivariate models.

For the real market data two kinds of out-of-sample forecasting were used: a fixed-window estimate and a rolling-window estimate. When a fixed window was used, multivariate models outperformed univariate models for two of the three portfolios tested. This outperformance was especially clear for multivariate models that used time-varying conditional correlations and less clear for multivariate models using constant conditional correlations.

It should be noted that the financial crisis of 2007-2008 started just after the end of the chosen forecasting window. As expected, the rolling-window estimate gave better forecasting results than the fixed-window estimate. In the case of a rolling estimate the preference for multivariate models is even more pronounced than in the case of fixed estimates.

Thus Santos et al (2013) found that when large, diverse portfolios are analyzed and dynamic correlations are allowed for in the model, multivariate models tend to outperform univariate models in one-step-ahead VaR forecasting.

12 MVaR illustrations

This chapter gives some visual illustrations of bivariate quantiles by comparing empirical data on the movement of three currencies.

12.1 Data

The data used are the exchange rates of several currencies against the euro: the US dollar (US$), the British pound (£) and the Japanese yen (¥). The rates are daily rates for the period January 1999 to November 2014, taken from the website of the European Central Bank. Subsequently the daily growth rate of each currency is measured as

$$growth_t = \frac{Value_t - Value_{t-1}}{Value_{t-1}}$$
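A minimal sketch of this computation, assuming the ECB series have been saved to a local file `ecb_rates.csv` with one column per currency (the file name and column layout are assumptions, not part of the original data description):

```python
# Sketch: daily growth rates (V_t - V_{t-1}) / V_{t-1} for each currency.
import pandas as pd

rates = pd.read_csv("ecb_rates.csv", index_col=0, parse_dates=True)
growth = rates.pct_change().dropna()   # one growth series per currency
```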

By taking such a long interval it is ensured that the sample contains both periods of economic growth and periods of crisis, as well as a large number of data points on which to base the analysis.

Although exchange rates fluctuate both in the short and the long term, they tend not to jump too much, as most governments employ strict macro-economic policies aimed at preventing exactly such jumps, in order to provide stability to their economy and to investors. Those policies might, for example, consist of influencing the risk-free interest rate in order to steer inflation.

Also used are 'soft policies' aimed at ensuring investor confidence. Such a policy might, for example, consist of clearly communicating financial regulations and the activities of regulators.


Figure 19: Time series of the exchange rates, normalized to 1 at t = 0

Summary statistics of the exchange rates:

                 Dollar     Pound        Yen
Min              0.8252    0.5711    89.3000
Max              1.5990    0.9786   169.7500
Mean             1.2252    0.7384   127.3294
Median           1.2755    0.6956   128.9300
Std              0.1827    0.0966    18.2680
Kurtosis         2.3621    1.6421     2.4142
Skewness        -0.5237    0.2900     0.2444
5% quantile      0.8829    0.6122    99.5700
95% quantile     1.4736    0.8898   162.2600

Correlations: dollar-pound 0.7636, dollar-yen 0.5630, pound-yen 0.0349.
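A sketch of how such a table can be reproduced, assuming the exchange-rate series are in a pandas DataFrame like the `rates` object from the earlier sketch; the kurtosis convention used here (non-excess Pearson kurtosis) is an assumption about how the reported figures were computed.

```python
# Sketch: summary statistics and pairwise correlations per currency.
import pandas as pd
from scipy import stats

def summarize(df):
    table = pd.DataFrame({
        "Min": df.min(), "Max": df.max(), "Mean": df.mean(),
        "Median": df.median(), "Std": df.std(),
        "Kurtosis": df.apply(lambda c: stats.kurtosis(c, fisher=False)),
        "Skewness": df.skew(),
        "5% quantile": df.quantile(0.05),
        "95% quantile": df.quantile(0.95),
    })
    return table, df.corr()   # statistics table and correlation matrix
```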


12.2 Univariate VaR

As a starting point some univariate analyses are performed. The histograms of the one-period differences show that each currency is clustered around zero, as would be expected since these exchange rates stay more or less constant over time.

The kurtosis values all being well above 3 suggest tails that are thicker than those of the normal distribution (which has kurtosis 3), or alternatively that the distribution has a high degree of 'peakedness', with much of the probability mass in the tails being the result of infrequent but very extreme events.

The correlation of the dollar with the pound and of the dollar with the yen are not very different. This could be explained by the dollar being a leading currency whose movements other currencies follow. The correlation between the pound and the yen is, however, relatively low, which can be explained by those two countries being far apart and having very different economies.

Summary statistics of the daily exchange-rate movements:

                 Dollar      Pound        Yen
Min             -0.0412    -0.0340    -0.0525
Max              0.0485     0.0269     0.0597
Mean (x10^-3)    0.0054    -0.0167     0.0075
Std              0.0064     0.0049     0.0079
Kurtosis         5.6407     6.7126     6.7789
Skewness         0.0013    -0.3078     0.2457
5% quantile     -0.0102    -0.0078    -0.0121
95% quantile     0.0104     0.0076     0.0129

Correlations: dollar-pound 0.5078, dollar-yen 0.6070, pound-yen 0.2897.

Figure 20: Histograms of one-period exchange rate movements for the dollar (left), pound (middle) and yen (right) against the euro


12.3 Multivariate histograms

When drawing 3-dimensional histograms one can see that all exchange rates tend to cluster around a few areas. Those areas are, however, not at the center of the histogram, which indicates a skewed distribution. This skewness also follows from the high skewness values of the univariate variables. Presumably the region with the most clustering indicates the usual exchange rate between those currencies when they are exchanged directly, without conversion to the euro.

The histograms showing the change in exchange rate are, however, all neatly centered on the point (0,0), which is to be expected when each consists of two variables that are also centered on zero in the univariate case.

Figure 21: Histograms of the exchange rate (left) and the change in exchange rate (right) for the dollar against the pound

Figure 22: Histograms of the exchange rate (left) and the change in exchange rate (right) for the dollar against the yen


Figure 23: Histograms of the exchange rate (left) and the change in exchange rate (right) for the pound against the yen

12.4 Contour plots

In order to identify the quantiles for threshold $\alpha$ we first need to calculate the cumulative area under each histogram. This can then be translated into a 2-dimensional contour plot in order to find the lines at which the Value-at-Risk of a portfolio containing equal amounts of these two assets is at 10%, 50% or 90%. This is the kind of correlation that would be of interest to someone trying to hedge risk by holding two of these currencies in a portfolio.
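A hedged sketch of this construction: bin two return series into a bivariate histogram, accumulate it into an empirical joint CDF, and draw the 10%/50%/90% level curves. The bin count and contour levels are illustrative choices.

```python
# Sketch: contour lines of the empirical joint CDF of two return series.
import numpy as np
import matplotlib.pyplot as plt

def joint_cdf_contours(x, y, bins=50, levels=(0.10, 0.50, 0.90)):
    h, xedges, yedges = np.histogram2d(x, y, bins=bins)
    p = h / h.sum()                          # empirical joint pmf on the grid
    cdf = p.cumsum(axis=0).cumsum(axis=1)    # F(x, y) = P(X <= x, Y <= y)
    xc = 0.5 * (xedges[:-1] + xedges[1:])    # bin centers
    yc = 0.5 * (yedges[:-1] + yedges[1:])
    cs = plt.contour(xc, yc, cdf.T, levels=levels)  # contour expects Z[y, x]
    plt.clabel(cs, fmt="%.2f")
    plt.show()
```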


Figure 25: Cumulative distribution function of dollar and yen


The same can also be done for the histograms of the movements of these currencies. The square shape of these CDFs is the result of most of the probability mass being centered at (0,0), with only a few large outliers at the edges of the histogram. This is the kind of correlation useful to a trader trying to predict the day-to-day movement of currencies based on what other currencies are doing.

Figure 27: Cumulative distribution function of the movements of dollar and pound


Literature

Acerbi, C. (2002) Spectral measures of risk: a coherent representation of subjective risk aversion, Journal of Banking & Finance 26(7), 1505-1518

Adrian, T., Brunnermeier, M.K. (2011) CoVaR, NBER Working Paper Series, Working Paper 17454

Artzner, P., Delbaen, F., Eber, J., Heath, D. (1999) Coherent measures of risk, Mathematical Finance 9(3), 203-228

Bauwens, L., Laurent, S., Rombouts, J.V.K. (2006) Multivariate GARCH models: a survey, Journal of Applied Econometrics 21(1), 79-109

Berkowitz, J., O'Brien, J. (2002) How accurate are Value-at-Risk models at commercial banks?, The Journal of Finance 57(3), 1093-1111

Brooks, C., Persand, G. (2003) Volatility forecasting for risk management, Journal of Forecasting 22(1), 1-22

Christoffersen, P. (2009) Value-at-Risk models, in: Handbook of Financial Time Series

Clements, M.P., Smith, J. (2000) Evaluating the forecast densities of linear and non-linear models: applications to output growth and unemployment, Journal of Forecasting 19, 255-276

Clements, M.P., Smith, J. (2002) Evaluating multivariate forecast densities: a comparison of two approaches, International Journal of Forecasting 18, 397-407

Cousin, A., Di Bernardino, E. (2013) On multivariate extensions of Value-at-Risk, Journal of Multivariate Analysis 119, 31-46

Diebold, F.X., Gunther, T.A., Tay, A.S. (1998) Evaluating density forecasts, with applications to financial risk management, International Economic Review 39, 863-883

Embrechts, P., Höing, A., Puccetti, G. (2005) Worst VaR scenarios, Insurance: Mathematics and Economics 37, 115-134

Embrechts, P., Puccetti, G. (2006) Bounds for functions of multivariate risks, Journal of Multivariate Analysis 97(2), 526-547

Embrechts, P., Puccetti, G., Rüschendorf, L. (2013) Model uncertainty and VaR aggregation, Journal of Banking & Finance 37, 2750-2764

Feller, W. (1966) An Introduction to Probability Theory and Its Applications, Vol. II

Giacomini, R., White, H. (2006) Tests of conditional predictive ability, Econometrica 74(6), 1545-1578

Hakwa, B., Jäger-Ambrozewicz, M., Rüdiger, B. (2012) Measuring and analysing marginal systemic risk contribution using CoVaR: a copula approach, Cornell University eprint arXiv:1210.4713

Hartmann, P., Straetmans, S., de Vries, C.G. (2004) Asset market linkages in crisis periods, Review of Economics and Statistics 91, 19-24

Hartmann, P., Straetmans, S., de Vries, C.G. (2010) Heavy tails and currency crises, Journal of Empirical Finance 17, 241-254

Huang, X. (1992) Statistics of Bivariate Extreme Values, PhD thesis, Tinbergen Institute

Joe, H. (1997) Multivariate Models and Dependence Concepts, Monographs on Statistics and Applied Probability 73

Kuritzkes, A., Schuermann, T., Weiner, S.C. (2003) Risk measurement, risk management, and capital adequacy in financial conglomerates, Brookings-Wharton Papers on Financial Services 2003, 141-193

Mandelbrot, B. (1963a) The variation of certain speculative prices, Journal of Business 36, 394-419

Mandelbrot, B. (1963b) New methods in statistical economics, Journal of Political Economy 71, 421-440

McAleer, M. (2009) The Ten Commandments for optimizing Value-at-Risk and daily capital charges, Journal of Economic Surveys 23(5), 831-849
