
The dependence of the value at risk on the time period

Aris Meems

Master's Thesis to obtain the degree in Actuarial Science and Mathematical Finance
University of Amsterdam

Faculty of Economics and Business Amsterdam School of Economics

Author: Aris Meems

Student nr: 10007989

Email: Arismeems@hotmail.com

Date: September 9, 2014

Supervisor: Dr. S.U. Can
Second reader: Dr. R.J.A. Laeven



Abstract

This study analyses the dependence of the VaR on the time period and discusses the quality of the value at risk as a capital requirement. It is a broad study, with the goal of finding risk characteristics that influence the effect of the time period on the VaR. We study four characteristics that might have an influence: the thickness of the tail, the symmetry of the risk distribution, the profit margin and the height of the confidence level. We also introduce a second risk measure, the maximum value at risk (MVaR), which is closely related to the ordinary VaR. It considers quantiles of the maximum loss during the time period instead of quantiles of the total loss over the entire time period. We use a discrete approximation of continuous risk distributions to compute the multiple-year risks. We find that thick tails result in the most predictable time effect. The probability of losing the one-year VaR at confidence level 99% approximately equals the length of the time period in years, expressed as a percentage: 2% for a time period of two years, 3% for a time period of three years, and so on. For thin-tailed distributions, this probability increases much faster, so using the one-year VaR as a capital requirement might be misleading when the risk distribution has a thin tail. The effect of the tail thickness dominates the effect of the symmetry. A high profit margin has a large effect for the longer time periods; for low confidence levels, the risk even becomes negative. For these cases, the MVaR gives a much better representation of the risk.

Keywords

Value at Risk, Thick tails, Symmetry, Profit Margin, Confidence level, Maximum value at risk, Time dependence, Discrete approximation


Contents

Preface
1 Introduction
  1.1 Handling risk
  1.2 Time period
    1.2.1 Illustration of effects
    1.2.2 Consequences for capital requirements
  1.3 Similarity and differences between criticism
  1.4 Literature
  1.5 Research question
  1.6 Towards a solution
2 Methodology
  2.1 Choosing distributions and values of parameters
  2.2 Choosing what to calculate
  2.3 Definitions of risk measures
    2.3.1 VaR
    2.3.2 MVaR
  2.4 Calculation method
    2.4.1 VaR
    2.4.2 MVaR
  2.5 Confidence level
    2.5.1 Two-year distribution
    2.5.2 Multiple-year distribution
3 Results
  3.1 Normal distribution
  3.2 Student's t-distribution
  3.3 Exponential distribution
  3.4 Lognormal distribution
4 Discussion
  4.1 Profit margin
  4.2 Shape of distribution
  4.3 Reliability of results
5 Conclusion
References


Preface

With this thesis, there comes an end to seven years of study for me: five years of mathematics and two years of actuarial science. I want to thank all of my fellow students with whom I worked together in these seven years. Without exception, they were all pleasant and instructive experiences. I want to thank my parents for always supporting me and my roommates for always providing enough distraction. I especially want to thank Umut Can for supervising this thesis; his comments were very clear and elaborate. Furthermore, I want to thank Roger Laeven for being the second reader of the thesis.


Chapter 1

Introduction

1.1 Handling risk

The financial world is full of uncertainties. Changes in interest and exchange rates, stock values, inflation and the mortality table all can have a huge impact on the financial position of banks or insurers. Some of these risks, such as interest rate risk, can be (partially) hedged, but for other risks like mortality risk this cannot be done. Furthermore, high risk offers high rewards and so risky investments have higher expected returns. This means that risk is not necessarily a bad thing, but it has to be managed with great care.

To manage risk, financial experts have tried to quantify risk in so-called risk measures. The requirements for risk measures are that they are both meaningful and easy to understand, so that they can be used well by the management. In practice this means that the risk must be captured in a single number, which inevitably neglects certain aspects of the actual risk.

For a long time the standard deviation (sd) was the risk measure of choice. It is easy to compute or estimate and has the intuitive meaning of the average distance from the mean. Another advantage is that the standard deviation and the mean together completely define a normal distribution, so for normally distributed risks every characteristic can be deduced from these two numbers. Since the normal distribution occurs on many occasions due to the central limit theorem, the use of the standard deviation made sense.

However, after a while some serious drawbacks came to light. Many of the supposedly normally distributed risks were not normally distributed after all, but had fatter tails. This meant that extremely high losses occurred more often than expected. These extremely high losses can have devastating effects, including underfunding and ultimately bankruptcy, so they are the most important. For these kinds of risks, the standard deviation is not a good risk measure, since it contains no information about the shape, in particular the tail behavior, of a distribution.

As a consequence, a new risk measure has become popular since the 1990s and is now widely used: the value at risk (VaR). The VaR focuses on the weak point of the standard deviation: the extreme losses. The idea is very simple. For a certain confidence level p, the maximum loss in the best p proportion of cases is determined. So for a p of 99%, you lose less than the VaR99 99% of the time and more than the VaR99 1% of the time; essentially, the value at risk is just a quantile of the loss distribution.

Besides the confidence level p, the value at risk has one other very important, but underexposed, parameter: the time period of the risk. The effect of this time period on the value at risk will be the main topic of this thesis.

The strongest point of the value at risk is that it contains information that can be used directly. When a company has own funds that are greater than or equal to the value at risk at a certain level p, it knows that it can meet its liabilities with probability at least p. Since p can be chosen arbitrarily high, the amount of own funds required for any degree of certainty can be determined. Many regulators, for example EIOPA for European insurance companies, use the VaR at a certain high level, 99.5% for instance, as a capital requirement: the minimal required amount of own funds.

The value at risk, however, is not embraced by everyone. Most critics emphasize that the value at risk is not a coherent risk measure. A coherent risk measure is a measure that satisfies certain objectively determined axioms formulated by Artzner et al. (1998). The value at risk does not satisfy the axiom of sub-additivity. This axiom states that there must always be a benefit of diversification when two risks are merged, meaning that the measure of the merged risk cannot be larger than the sum of the measures of the separate risks. It is easy to construct examples of risks for which the VaR does not satisfy this axiom, but in practice they will rarely be encountered.

To counter this drawback, alternative risk measures have been suggested and implemented, such as the tail value at risk (TVaR). This risk measure is closely related to the VaR, but also looks at what happens in the more extreme cases: it is in fact the average of the losses that exceed the value at risk. A problem is that this value is a lot harder to determine and harder to explain to the management, so the VaR is still the most widely used risk measure by both companies and regulators. A complete overview of the history and origin of the VaR can be found in a book by Holton (2003). The paper of Duffie & Pan also gives a complete overview of the VaR, but is more mathematical and also contains model descriptions and numerical examples.

1.2 Time period

An underexposed aspect of risk measures, and of the VaR in particular, is the time period of the risk. It is clear that a longer time period increases the risk, but it is not clear at what rate. For the standard deviation this rate is known under standard assumptions, since variances of independent risks increase linearly and the standard deviation is the square root of the variance; for the VaR, however, it can take any form. A doubling of the time period could have an arbitrarily large effect on the value at risk, but also none at all.
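For i.i.d. one-year risks this scaling of the standard deviation follows in one line from the additivity of variances (the sum $S_t$ is defined formally in (1.1) below):

$$\operatorname{Var}(S_t) = \sum_{i=1}^{t}\operatorname{Var}(X_i) = t\operatorname{Var}(X_1) \quad\Longrightarrow\quad \operatorname{sd}(S_t) = \sqrt{t}\,\operatorname{sd}(X_1).$$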

1.2.1 Illustration of effects

The following examples illustrate this. In the examples the time unit is one year. The one-year risks are i.i.d. and denoted by $X_i$. The risk over longer periods of time is the sum of the one-year risks:

$$S_t = \sum_{i=1}^{t} X_i. \tag{1.1}$$

In the first example we consider a risk where the value at risk increases dramatically when we look at the two-year VaR instead of the one-year VaR. The confidence level is 95%. The one-year risks are defined through the following probability mass function:

$$f_{X_1}(y) = \begin{cases} 0.96 & \text{if } y = 0; \\ 0.04 & \text{if } y = x. \end{cases}$$

The number x is a constant that can take any value. It is clear that the VaR95 equals 0 in this example. When we sum up two independent copies we get the following probability mass function:

$$f_{S_2}(y) = \begin{cases} 0.9216 & \text{if } y = 0; \\ 0.0768 & \text{if } y = x; \\ 0.0016 & \text{if } y = 2x. \end{cases}$$



Now the VaR95 equals x, which can be made arbitrarily large.

The second example illustrates a change in the time period without effect. We use the same framework as the first example, but another distribution of the one-year risk:

$$f_{X_1}(y) = \begin{cases} 0.94 & \text{if } y = 0; \\ 0.06 & \text{if } y = x. \end{cases}$$

The VaR95 now equals x in this example. For the two-year time period we get:

$$f_{S_2}(y) = \begin{cases} 0.8836 & \text{if } y = 0; \\ 0.1128 & \text{if } y = x; \\ 0.0036 & \text{if } y = 2x. \end{cases}$$

The VaR95 still equals x, so the risk has not grown according to this measure.

The one-year risk distributions in both examples are very similar and could even be chosen closer to each other, but the effect on the value at risk of a change in the time period is completely different.
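Both examples are easy to verify numerically. The following sketch (our own illustration, not part of the thesis's computations; the helper names are made up) convolves a discrete probability mass function with itself and reads off the VaR as the smallest value whose cumulative probability reaches p:

```python
def convolve_pmf(pmf1, pmf2):
    """PMF of the sum of two independent discrete risks."""
    out = {}
    for v1, p1 in pmf1.items():
        for v2, p2 in pmf2.items():
            out[v1 + v2] = out.get(v1 + v2, 0.0) + p1 * p2
    return out

def var(pmf, p):
    """Value at risk: the smallest s with P(loss <= s) >= p."""
    cumulative = 0.0
    for value in sorted(pmf):
        cumulative += pmf[value]
        if cumulative >= p:
            return value

x = 100  # the examples hold for any constant x
example1 = {0: 0.96, x: 0.04}   # one-year VaR95 = 0, two-year VaR95 = x
example2 = {0: 0.94, x: 0.06}   # one-year VaR95 = x, two-year VaR95 = x
for f in (example1, example2):
    print(var(f, 0.95), var(convolve_pmf(f, f), 0.95))
```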

1.2.2 Consequences for capital requirements

The examples in the previous subsection were artificial and chosen in a way to maximize the effect of a doubling of the time period. In practice these risks will not occur, but the examples show clearly that the effect cannot always be estimated by a simple rule of thumb.

This unpredictability is a serious drawback of the VaR, but the real problem lies in the application of the VaR as a capital requirement. A good capital requirement should ensure a healthy financial position of the firm, but the VaR only does that for a single time period, often one year. It is very well possible that a company loses half of its capital requirement in the first year and the other half in the second year. Since the probability of smaller losses will in general be higher than the probability of larger losses, the probability of losing the one-year capital requirement within two years can easily be a lot higher than the confidence level of the one-year VaR would suggest.

The following example illustrates the problem. The notation and assumptions are the same as in the previous examples. We now choose a distribution with a lot of mass around half the VaR95:

$$f_{X_1}(y) = \begin{cases} 0.60 & \text{if } y = 0; \\ 0.34 & \text{if } y = 50; \\ 0.06 & \text{if } y = 100. \end{cases}$$

The VaR95 equals 100 in this example. We again sum up two i.i.d. one-year risks:

$$f_{S_2}(y) = \begin{cases} 0.36 & \text{if } y = 0; \\ 0.408 & \text{if } y = 50; \\ 0.1876 & \text{if } y = 100; \\ 0.0408 & \text{if } y = 150; \\ 0.0036 & \text{if } y = 200. \end{cases}$$

The VaR95 still equals 100 now, but more important is that the probability that the two-year loss is larger than or equal to 100 is 23.2%. Only 11.64% of this is caused by a direct loss of 100 in year one or year two, so another approximately 11.6% is caused by two subsequent smaller losses. This example is of course still artificial, but its main characteristic, higher probabilities for small losses than for high losses, is perfectly natural.
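Both percentages follow by direct addition from the mass function of $S_2$:

$$P(S_2 \ge 100) = 0.1876 + 0.0408 + 0.0036 = 0.232,$$

$$P(X_1 = 100 \text{ or } X_2 = 100) = 0.06 + 0.06 - 0.06^2 = 0.1164.$$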

Therefore, the VaR may offer false security: its confidence level is easily interpreted in the wrong way. The mistaken thought is that a VaR with confidence level 99.5% implies that a company loses its capital requirement once every 200 years. We have shown that this is not true. The possible consequences of this misinterpretation can be huge, since it underestimates the long-term risk.

An important remark is that here the problem is that the VaR contains no information about losses that are smaller than the value at risk, whereas most criticism of the value at risk, such as the lack of sub-additivity in the paper of Artzner et al., focuses on the lack of information about losses that are greater than the VaR.

1.3 Similarity and differences between criticism

At first sight, the often heard criticism that the value at risk is not sub-additive, see Artzner et al. (1998), and the criticism that it might not behave nicely under changes in the time period seem unrelated. However, when we look more closely we see that there are some similarities. Sub-additivity focuses on what happens when risks are merged. These risks can be anything: two different portfolios of stocks, assets merged with liabilities or a complete merger between two firms, but merging could also be seen as adding up the risk of year one and the risk of year two. So extending the time period is essentially the same as merging portfolios, but with the difference that the risks are to a high degree independent and identical in distribution. For adding i.i.d. risks the problem of sub-additivity still exists; we have even seen an example earlier in this chapter, in section 1.2.1. The first example there adds up two risks, each with a VaR95 of 0, and the sum has a VaR95 of an arbitrarily high x.

The difference between the two criticisms is that sub-additivity focuses on a variable value at a fixed probability, whereas the criticism on the time period focuses on a variable probability at a fixed value. For sub-additivity you always have to look at the same confidence level, because otherwise you compare two quantities with different units. In this thesis we fix the one-year value at risk at a certain confidence level and take that as a capital requirement. Then we calculate the confidence levels for which the value at risk for other time periods equals this capital requirement; in other words, we calculate the probability that we have lost the money at the end of the period. This is an essential difference, because the two problems are caused by risk distributions with different characteristics.

We recall two examples we have already seen: one where sub-additivity is a problem but a change of time period behaves normally, and one with the opposite case. As before we work with the risks X1 and X2, which are identically distributed and independent, and their sum S2.

$$f_{X_1}(y) = \begin{cases} 0.96 & \text{if } y = 0; \\ 0.04 & \text{if } y = 100. \end{cases} \qquad f_{S_2}(y) = \begin{cases} 0.9216 & \text{if } y = 0; \\ 0.0768 & \text{if } y = 100; \\ 0.0016 & \text{if } y = 200. \end{cases}$$

The VaR95 equals 0 for a one-year risk, but 100 for a two-year risk. This is definitely not sub-additive. However, the probability that you lose more than 0 goes from 4% to 7.84%. This is almost a doubling, which makes sense, since we also doubled the time period. The problem regarding sub-additivity is caused by the fact that there is a large probability, 4%, compared to 1 − p = 5%, of a loss that is significantly larger than the VaR95. This characteristic is present in every counterexample against sub-additivity.

The next example lacks that characteristic and is therefore sub-additive:

$$f_{X_1}(y) = \begin{cases} 0.60 & \text{if } y = 0; \\ 0.34 & \text{if } y = 50; \\ 0.06 & \text{if } y = 100. \end{cases}$$


$$f_{S_2}(y) = \begin{cases} 0.36 & \text{if } y = 0; \\ 0.408 & \text{if } y = 50; \\ 0.1876 & \text{if } y = 100; \\ 0.0408 & \text{if } y = 150; \\ 0.0036 & \text{if } y = 200. \end{cases}$$

The VaR95 now equals 100 for both a one-year risk and a two-year risk, so the sub-additivity axiom is clearly satisfied. The probability of losing at least 100, however, goes from 6% for one year to 23.2% for two years. This increase is significantly more than linear and therefore shows the problem of changing the time period. The characteristic of the previous example, a high probability for values significantly higher than the VaR95, is completely absent, since the VaR95 is now the upper bound of the distribution. The characteristic that now causes the problem is that there is a high probability of losing half the VaR95. Adding half the VaR95 to half the VaR95 of course results in the complete VaR95. More generally, we can say that when there is a high probability of values between half the VaR95 and the VaR95, there is a high probability that the sum of two independent copies of this risk will exceed the VaR95.

The first example is far more artificial than the second one and has an unrealistic characteristic, because in general high losses become increasingly rare. Furthermore, just by looking at the one-year risk it is clear that the VaR95 is in this case not a good risk measure. The second example is far more interesting. It is still artificial because it takes only three values, but its main characteristic is very common, because it has a higher probability for the small loss than for the big loss. Also, based on the one-year loss, the VaR95 seems like a good risk measure, maybe even a too conservative one.

We conclude that there is an important similarity between both problems with the value at risk: they both occur when adding up two risks and are caused by the fact that you do not know what happens below or above the specified quantile. The difference is that the problem with sub-additivity is caused by a lack of information above the p-quantile, and the problem with changing the time period mainly by a lack of information below the p-quantile. The characteristics of distributions that form a problem for sub-additivity are not common and are mostly artificially created, while the characteristics of distributions that potentially form a problem when doubling the time period are perfectly normal. Therefore, the unpredictability of a change in the time period is potentially the bigger danger.

1.4 Literature

Not a lot of research has been done on the relationship between the short- and long-term value at risk. Most of the literature about the VaR considers relatively short periods of time, while the longer horizons of pension funds and insurers have received limited attention in the literature, as pointed out in a paper by Dowd et al. (2003).

The paper of Danielson & Zigrand (2005), for example, studies the relationship between the 1-day VaR and the 10-day and 20-day VaR. They discuss the quality of the square root rule (SRR). This rule states that the VaR grows with the square root of the time period T. An important reason for studying it is that the Basel Committee on Banking Supervision (1996) suggests estimating the 10-day VaR from the 1-day VaR in this way. We already noticed that this is the case for normally distributed risks, but the authors compared it with the S&P 500. They conclude that the SRR is biased downwards: the risk is actually bigger than the square root rule predicts. Furthermore, when the time period grows, the difference becomes increasingly large. The ratio between the actual VaR and the predicted VaR grows from 1.02 after 10 days to 1.42 after 60 days. The reason they give for this behavior is that the risk is built up from parts that all behave differently: a Brownian motion part increases with the square root of t, while the return increases linearly and a jump part in a risk even increases exponentially. This results in the domination of one aspect of the risk over the others for longer periods of time.
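In formula form, using the time-period notation introduced later in this thesis (section 2.3), the square root rule reads:

$$\mathrm{VaR}^T_p \approx \sqrt{T}\cdot\mathrm{VaR}^1_p.$$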

The SRR is also discussed briefly by Blake et al. (2000). Their paper focuses on drifts and concludes that a positive drift leads to an overestimation of the risk when looking at raw returns.

A more recent paper is that of Kaplanski & Levy (2010). They performed an error analysis of the actual VaR's and the SRR. They conclude that for short terms the square root rule overestimates the risk and for long terms it underestimates the risk; their paper considers time periods between 1 and 250 days. This contradicts the explanation given in the paper of Danielson & Zigrand, since that explanation implies an ever-increasing error. Two distributions are examined: the normal distribution and the Student's t-distribution. The over- and underestimation are much larger for the Student's t-distribution.

The results in these two papers say a lot about the behavior over time of the VaR. However, there is a big difference between periods measured in days and periods measured in years.

Dowd et al. acknowledge this. Their paper also focuses on the square root rule, but considers periods of up to 60 years. Instead of using the SRR, they estimate the parameters of the return distribution and, under the assumption that the portfolio stays the same, calculate the VaR's. They make some interesting remarks about the behavior of the value at risk for different confidence levels. They notice for instance that the µ of a log-normal distribution that represents asset returns is very important. When this value is positive, meaning that there is an expected profit, the VaR first rises over time, but eventually starts to decline and converges to minus infinity. This holds for every positive µ, so also for very small values. However, when µ is exactly 0, the behavior changes completely: instead of dropping, the VaR converges to the value of the initial investment, meaning that all the money will be lost. They also notice that the speed of both effects depends on the confidence level. For positive µ the peak is reached more quickly, and the decline afterwards is faster, for low confidence levels than for high confidence levels. When µ is zero, the VaR converges a lot faster to the value of the initial investment for high confidence levels. The authors also note that the square root rule is very inaccurate in the long run, since the VaR is bounded by the initial investment when µ equals zero and eventually starts to decline when µ is positive.

1.5 Research question

The papers discussed in the previous section all approach the time dependence of value at risk in a different way than this thesis. Our main goal is to determine the probability that you lose a capital requirement based on a one-year VaR in a longer time period for various risks, while the papers look at the development of the absolute money value of the VaR. However, some remarks about the influence of certain variables on the results are very useful to us. In the paper of Dowd et al. we could see the huge impact on the behavior of a minor change in the expected profit. The profit margin will therefore certainly be a variable in our research. The paper of Kaplanski and Levy show that the distribution of the risk gives some very different results. Just taking one risk distribution will therefore give insufficient information to generalize our results. We therefore will take different risk distributions with different characteristics.

The considerations in the first sections and the limited amount of research done in this area determined the choice of the research question:


How does the value at risk depend on the time period of the risk?

This is a very broad question, depending on many variables like the already mentioned risk distribution and profit margin. Therefore, we must specify which aspects of a risk are likely to influence the answer. Once we have done that, we can vary these aspects and compare the results to look for statements that can be made about a general risk.

The most important aspect of the risk will most likely be the shape of its distribution function and, more specifically, the thickness of the right tail. This thickness indicates how rare large losses are: the thicker the tail, the more likely large losses are. In the literature there is a clear definition of tail-thickness, with the exponential distribution functioning as a benchmark: when its tail is thicker than that of the exponential distribution, a distribution is called thick-tailed. We will study the effect of a change in the time period for both thick- and thin-tailed distributions. Another characteristic of the shape is whether it is symmetric or not. For symmetric distributions the possible profits are unbounded, like the losses, while for an asymmetric distribution this is not necessarily the case. The effect of symmetry, or the difference between one-tailed and two-tailed distributions, is hard to predict.

The second aspect of a risk that becomes especially important when we change the time period from one to multiple years is the profit margin. In the explanation of the problem we sketched a situation where two smaller losses added up to become a big one, but the opposite can also happen: a big loss can be compensated by a couple of profitable years. The likelihood of this happening depends on the expected profit. Especially when we consider periods of time longer than 10 years, this variable will play a crucial role, as we have already seen in the paper of Dowd et al.

The effect might also depend on the confidence level. It is possible that for extremely high confidence levels the effect will be different from that for lower confidence levels. This is hard to predict, but it makes sense to assume that high confidence levels are affected by one extreme loss, while low confidence levels profit a lot from the benefits of diversification.

The above three aspects of a risk form our first three subquestions:

• How does the distribution of the risk influence the effect?
• How much does a higher profit margin lower the value at risk?
• Is the effect different for high and low confidence levels?

Once we have obtained the answers to these questions, we will look critically at the VaR. Is it still a good measure for long periods or not? The biggest problem we foresee is that in a long time period there might be a large probability that somewhere during the time period the cumulative loss is very big, while at the end of the time period this is compensated. This is a serious problem, because once such a big loss is reached before the end of the period, it is very likely that most of our assumptions, like independence, cease to hold. A company might even go bankrupt before the end of the period, while the VaR indicates that there is nothing wrong. Especially for high profit margins, the VaR's for long time periods will probably be very small. We will therefore compare the VaR to a new risk measure designed for longer periods: the maximum value at risk (MVaR). Like the VaR, this is a quantile, but instead of a quantile of the value at time T it is a quantile of the running maximum of the random walk that describes the risk. The size of the difference between the VaR and the MVaR will be our last subquestion:

• How big is the VaR compared to the maximal cumulative loss in a time period?

1.6 Towards a solution

To answer our research question and its subquestions we must take two steps. First we must make a selection of risk distributions and a selection of the values for the profit margins and confidence levels. The choices will be made in such a way that the effect of each variable is clear. Since all variables interact with each other, this means that we take a few values at crucial points rather than a lot of values with small differences between them. Our goal is not to find a closed-form formula for the development of the value at risk, but to spot general trends.

The second step is more mathematical. We have to calculate the VaR's for the longer time periods. Only for stable distributions, like the normal distribution and the Cauchy distribution, is this easy. This is, however, just a small subclass of all probability distributions, and most of the other distributions that are encountered in practice are not stable. Summing up multiple independent copies will therefore not give a nice closed-form formula. We therefore have two possibilities: either use an approximation or do a simulation. We choose the first and make a discrete approximation of every risk distribution. This can be done relatively quickly and has the advantage that we can obtain an exact confidence interval for the values of the VaR we compute. Simulations would take longer to give reliable results, especially when we consider the maximum loss in the time period.


Chapter 2

Methodology

This chapter consists of five sections. First we will explain which distributions and values for the profit margins we have chosen and why we have chosen them. Secondly, we determine for which confidence levels we calculate the VaR and what other calculations we perform to gain the most insight into time dependence. The next step is properly defining and introducing the new risk measure MVaR mentioned in the previous chapter. Then we will introduce our calculation method with the discrete approximation for both the VaR and the MVaR. Finally we will deduce a confidence interval for the approximation of the VaR.

2.1 Choosing distributions and values of parameters

In the introduction we already pointed out which variables and characteristics of risk distributions are likely to have an influence on the time dependence: the thickness of the tail, the profit margin, the height of the confidence level and possibly symmetry. We also have to determine which values the time period can take. The last thing we have to do is to determine how we perform the scaling in order to compare the results of the different distributions.

We already noticed that there are two characteristics of a risk distribution that may have an effect: almost certainly the thickness of the tail, and possibly symmetry. We will therefore use four distributions, one for every thick/thin-tailed and symmetry/asymmetry combination. Table 2.1 shows our choices.

Table 2.1: The four chosen distributions

              | Symmetric                | Asymmetric
Thin-tailed   | Normal distribution      | Exponential distribution
Thick-tailed  | Student's t-distribution | Log-normal distribution

These distributions are not only chosen because of their characteristics, but also because they occur often in practice. The normal distribution occurs almost everywhere. The central limit theorem states that, under mild assumptions, sums of independent variables with finite variance, when properly scaled, converge to a normal distribution. So every risk that is built out of multiple smaller risks is approximately normally distributed. The exponential distribution is also a commonly used distribution with many applications. It has the special property that it is memoryless. This property makes the exponential distribution especially useful for the modeling of unpredictable waiting times, ranging from the time till the next phone call to the time till the next volcano eruption. The Student's t-distribution looks like the normal distribution and is derived from it, but has fatter tails. Its main application lies in statistics, where the famous Student's t-test is one of the most used tests. The Student's t-distribution also has a degrees-of-freedom parameter, which determines its shape: the lower the degrees of freedom, the thicker the tail. For one or two degrees of freedom the variance does not exist, so we choose a Student's t-distribution with three degrees of freedom. The last distribution is the log-normal distribution. This is a distribution whose logarithm is normally distributed. It is used as a model for price changes in the stock market, is the assumption leading to the Black-Scholes formula for option pricing, and is therefore very important in the financial world.

Figure 2.1: The four chosen distributions, shifted to µ = 0 and scaled to σ = 1.

Figure 2.1 shows all the distributions in one graph. The asymmetric distributions are shifted to the left, so that their expectations equal 0. All the standard deviations equal 1. Based on this graph we can sort the distributions by tail-thickness. The log-normal distribution clearly has the thickest right tail. The difference between the Student's t-distribution with three degrees of freedom and the exponential distribution is smaller, but that is partially caused by the fact that the Student's t-distribution is symmetric and has a lot of its density on the left-hand side as well. The normal distribution is by far the thinnest-tailed distribution.

In order to compare the distributions with each other, we use the same standard deviation of 1 for all distributions. This is not completely fair, because the symmetric distributions have two tails that cause variance, while the asymmetric ones have only one tail. However, we are more interested in probabilities and they do not depend on the scaling. Fixing the standard deviation only makes sure that all the values lie in the same neighborhood.

The other important variable is the profit margin. We define the profit margin as minus the expected value of the risk. Since we used the standard deviation to scale the distributions, we effectively measure the profit margin in standard deviations. The first case we want to include is of course the case where there is no profit margin. This is mathematically the most interesting case and also realistic in practice. On top of that, when we consider shorter time periods, so days or weeks instead of years, the profit margin will be very close to zero, so the results of this case can easily be translated to these other time scales.

Besides the zero profit margin, we also want to consider a moderate profit margin and a large profit margin. It is hard to determine the right values beforehand. When we choose the wrong value, there is a chance that the effect is minimal and almost invisible, or that the profit margin is very dominant and disturbs the results completely. Because this research is broad and not linked to one company, and only slightly linked to the insurance world, we are not confined to a certain range, as long as the profit margin does not become very large. We therefore first performed some preliminary calculations to see which values gave useful results, where trends are visible. Based on these calculations we found that a large profit margin of 0.25 standard deviations and a moderate profit margin of 0.1 standard deviations gave insightful results.

2.2 Choosing what to calculate

For each combination of distributions and profit margins we want to calculate a number of VaR’s and MVaR’s that gives the best understanding of the possible trends. In total, five different calculations are performed per distribution/profit margin combination.

The first two are VaR's for fixed confidence levels. We want to use a high and a low confidence level, in order to see if the behavior is different. Because our calculation method approximates the risk distributions in a number of discrete points, the results become more unreliable for extremely high confidence levels, since then there are only a few numbers larger than the VaR left. Therefore, we choose the VaR99 as our high confidence level. For the low confidence level we choose 95%. Confidence levels lower than this are rarely encountered in practice, so it is useless to go any lower. We expect that the effect of a fat tail is already visible with the VaR99, but not yet, or much less, for the VaR95.

The third calculation is based on the possible rule of thumb we already described in the Introduction. When the probability that you lose a certain amount in one year is 1%, the probability that you lose the same amount in two years might be 2%, considering that you can lose it in the first year with a 1% probability and in the second year with a 1% probability. For three years the probability of losing that amount would be around 3%, and so on. The possibilities of subsequent losses or compensations are ignored, so this reasoning is incorrect, but it is interesting to see how good this approximation is. We therefore use the value of the one-year VaR99 as the fixed value and compare it to the two-year VaR98, and so on. So for every time period of length T we consider, we calculate the VaR100−T.

The fourth series of calculations is also based on this rule of thumb, but in a more sophisticated way. Now we do not calculate the VaR for a fixed confidence level, but we calculate the confidence level for a fixed value of the VaR. This fixed value is the one-year VaR99. So we calculate the probability that you have lost at least the one-year VaR99 at the end of a T-year period. We will refer to this type of calculation as matching the confidence level.

The fifth calculation is almost the same as the fourth, but instead of the VaR we consider the MVaR. This means that instead of calculating the probability that you have lost at least the one-year VaR99 at the end of a T-year period, you calculate the probability that you have lost this amount somewhere during that time period. This is an essential difference, because there is a possibility that you compensate an intermediate loss with a number of profitable years. The end-of-period probabilities will therefore be lower than the MVaR-based probabilities, especially when the profit margin is large.
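In symbols, using the notation defined in section 2.3 (the shorthand $p_{\mathrm{VaR}}, p_{\mathrm{MVaR}}$ is ours), the fourth and fifth calculations match the confidence levels

$$p_{\mathrm{VaR}}(T) = P\big(S_T \ge \mathrm{VaR}^1_{99}\big), \qquad p_{\mathrm{MVaR}}(T) = P\big(S_T^* \ge \mathrm{VaR}^1_{99}\big),$$

where $S_T^* = \max\{S_t : t = 1, \ldots, T\}$. Since $S_T \le S_T^*$, we always have $p_{\mathrm{VaR}}(T) \le p_{\mathrm{MVaR}}(T)$.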

The last thing we have to determine is the set of time periods for which we calculate the results, and especially the maximum time period. We set this at 20 years, because in the financial world that is already a very long time. It is unrealistic to think that the one-year risk distributions stay the same for a period longer than 20 years. As intermediate values we take 2, 5 and 10 years, so that the time period almost or exactly doubles at every step. The five values for which we calculate the VaR's and match the confidence levels are thus: 1, 2, 5, 10 and 20 years.

2.3 Definitions of risk measures

In this section we will give the definitions of the risk measures we are going to use. The VaR is of course well known, but its notation is extended with a time period parameter in the superscript. The MVaR risk measure is new and will be defined and explained in detail.

2.3.1 VaR

The formal definition of the value at risk, as given in Kaas et al. (2008), is as follows:

$$\mathrm{VaR}[S; p] := F_S^{-1}(p) := \inf\{s : F_S(s) \ge p\}. \tag{2.1}$$

In this definition S is the risk and p the confidence level. The use of the cumulative distribution function is in line with the intuitive meaning of a quantile. The infimum on the right hand side is needed for the cases when there is no unique value for the inverse of the distribution function for the given p.

The time period is missing in this definition, but it is essential for the purpose of this research. We will therefore extend the notation with the time period. Before we do that, we take a closer look at risks. In our context, a risk is a random variable for which positive values represent a loss and negative values represent a profit. In this thesis we make the very important assumption that the risk distribution stays the same over the years and that the risks of different years are independent of each other. For large time scales, both assumptions are unrealistic. However, when the risk is small compared to the capital of a firm and the time period is short enough, these are reasonable assumptions; most importantly, the calculations for dependent risks can become extremely complicated, while for independent variables they are much easier. We denote the one-year risk variables by $X_t$. These are all independent copies of the risk of the first year, $X_1$. The sum of these risks forms the total risk over longer time periods, $S_T$:

$$S_T = \sum_{t=1}^{T} X_t. \tag{2.2}$$

With this definition it is easy to extend the definition in (2.1) with the time period parameter:

$$\mathrm{VaR}^t_p := \mathrm{VaR}[S_t; p] := F_{S_t}^{-1}(p) := \inf\{s : F_{S_t}(s) \ge p\}. \tag{2.3}$$

Since we will always be speaking about a general risk or about a clearly specified risk, there is no need to include the risk X or $S_t$ in the notation.

2.3.2 MVaR

The other risk measure we will be using is the maximum value at risk (MVaR). This is a new risk measure, but it is closely related to the VaR. Just like the VaR, it has a strong intuitive meaning. It is also a quantile at a certain confidence level, but instead of looking at the distribution of $S_T$, the value at the end of time period T, we look at the distribution of $\max\{S_t : t = 1, \ldots, T\}$. Intuitively, this means that the MVaR is the maximal amount of money you lose during the time period [0, T], with probability p. To determine this maximum, we only look at the cumulative losses at the end of the years and not at the intermediate values. This will not make a big difference, especially for time periods longer than five years.

Figure 2.2: Determining VaR and MVaR: the difference between the maximum during the time period and the value at the end of the period, for one simulated Brownian-motion path.

Before giving a formal definition, we first introduce a notation for the maximum value of the $S_t$:

$$S_T^* := \max\{S_t : t = 1, \ldots, T\}. \tag{2.4}$$

The formal definition is:

$$\mathrm{MVaR}^T_p := \mathrm{VaR}[S_T^*; p] := F_{S_T^*}^{-1}(p) := \inf\{s : F_{S_T^*}(s) \ge p\}. \tag{2.5}$$

We will clarify the difference between the VaR and the MVaR with some pictures. The first is figure 2.2. In this figure one path generated with a Brownian motion is shown. Furthermore, we see two straight lines. The upper line shows the maximum value of the Brownian motion. On this value the MVaR is based. This is of course only one sample of the Brownian motion, but when we generate a large number of paths, we can make an accurate estimation of the p-quantile. The VaR is based on the lower line. The value of this line is negative, because the value of the Brownian motion at T = 20 is negative. The VaR is the quantile of the distribution of this value.

Based on figure 2.2, it looks like there is a lot of difference between the VaR and the MVaR. We see for example that the value at the end of the time period is negative, while the MVaR is based on a maximum of a random walk that starts at zero, so it will never be negative. Furthermore, the distribution at the end of the period will be symmetric, because we examine a Brownian motion, while the distribution of the maximum cannot be.

Figure 2.3: Histograms for VaR and MVaR: the distribution of the value at T = 20 and the distribution of the maximum value.

Figure 2.3 confirms these statements, but also shows us that at the right tail the distributions are more similar. This figure contains two histograms, based on 10,000 Brownian motion simulations. Although the left side is completely different, the right-hand tails are more alike. This makes sense, since the values on the right-hand side are based on samples that reach high values. High values of the maximum are more likely to be reached at the end of the time period than at the beginning, since the expected distance from zero of the random walk grows with time. Therefore, a high maximum value of the random walk often implies that the random walk is high somewhere close to the end of the period. This means that the value at the end of the period will still be high, explaining the similarity of the tails.

When there is a profit margin, this is less the case. Since the expected overall profit grows every year, the probability that the maximum value is reached at an earlier time point also increases. This is the main reason for introducing the MVaR.
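A small Monte Carlo sketch (our own illustration, mirroring the Brownian-motion figures above; the thesis itself uses the discrete approximation of section 2.4 instead) of the gap between the two quantiles; a negative drift plays the role of a profit margin:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_paths, drift = 20, 10_000, 0.0   # set drift < 0 to model a profit margin

# yearly increments of the loss random walk, one row per simulated path
steps = rng.normal(loc=drift, scale=1.0, size=(n_paths, T))
paths = steps.cumsum(axis=1)          # S_1, ..., S_T for every path

p = 0.99
var_T = np.quantile(paths[:, -1], p)        # p-quantile of S_T   -> VaR
mvar_T = np.quantile(paths.max(axis=1), p)  # p-quantile of S_T^* -> MVaR
print(f"VaR: {var_T:.2f}, MVaR: {mvar_T:.2f}")
```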

2.4 Calculation method

In most cases, we cannot calculate the desired values of the VaR and the MVaR directly. In the introduction we pointed out that there are two solutions: simulation or using a discrete approximation of the distributions. We choose the latter because of the calculation speed and the possibility of obtaining an exact confidence interval.

2.4.1 VaR

The idea behind the approximation of the distribution is that we transform a continuous distribution into a uniform discrete distribution: a distribution consisting of n points, each with probability $\frac{1}{n}$. The locations of these points are chosen based on the cumulative distribution function (CDF) of the risk, in such a way that the difference between the CDF of the actual risk and the CDF of the approximation is at most $\frac{1}{2n}$. This discrete stochastic variable will be called $\bar{X}$, and we will call the atoms of its distribution $x_i$, for $i = 1, \ldots, n$. The $x_i$ and $\bar{X}$ are defined by the two following equalities:

$$P(\bar{X} = x_i) = \frac{1}{n} \quad \forall i, \tag{2.6}$$

$$P(X \le x_i) = \frac{i - \frac{1}{2}}{n} \quad \forall i. \tag{2.7}$$

Especially definition (2.7) is important, since it leads to the following inequality:

$$|P(\bar{X} \le x) - P(X \le x)| = |P(\bar{X} \le x_i) - P(X \le x)| \quad \text{for a certain } i, \ \forall x \tag{2.8}$$

$$= \Big|\frac{i}{n} - P(X \le x)\Big| \tag{2.9}$$

$$\le \max\Big(\Big|\frac{i}{n} - P(X \le x_i)\Big|, \Big|\frac{i}{n} - P(X \le x_{i+1})\Big|\Big) \tag{2.10}$$

$$= \max\Big(\Big|\frac{i}{n} - \frac{i - \frac{1}{2}}{n}\Big|, \Big|\frac{i}{n} - \frac{i + \frac{1}{2}}{n}\Big|\Big) \tag{2.11}$$

$$= \frac{1}{2n} \quad \forall x. \tag{2.12}$$

Figure 2.4 shows the discrete approximation in a graph that compares the two CDF's for a normal random variable. Looking at this graph makes it immediately clear that the maximum difference between the two CDF's is indeed $\frac{1}{2n}$. Another thing we see is that the horizontal difference is larger in the tails than in the middle of the graph. This indicates that this method is more reliable for comparing confidence levels than for exact values of the VaR, although the number of intervals in this graph is very small compared to what we will use in practice. It is, however, impossible to derive a bound for the difference between the exact value of the VaR of the approximation and that of the real risk, except for a few special cases.

Figure 2.4: Discrete approximation of the cumulative distribution function of the standard normal distribution.
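The construction in (2.6)–(2.7) is straightforward to implement; a sketch under our own naming, here for a standard normal risk:

```python
import numpy as np
from scipy.stats import norm

def discretize(ppf, n):
    """n atoms x_i with F(x_i) = (i - 1/2)/n; each atom carries probability 1/n."""
    return ppf((np.arange(1, n + 1) - 0.5) / n)

atoms = discretize(norm.ppf, 10_000)  # discrete approximation of a N(0,1) risk
```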

The next step is generating approximations of multiple-year distributions. We assumed that all the one-year risks are independent and identically distributed, which makes this relatively easy. For a two-year distribution, we can just take the sum of the n possible values for the first year and the n possible values for the second year, resulting in $n^2$, not necessarily distinct, values. To each of these values we assign the probability $\frac{1}{n^2}$. The CDF of the sum $\bar{X}_1 + \bar{X}_2$ can easily be written with the aid of an indicator function:

$$P(\bar{X}_1 + \bar{X}_2 \le x) = \sum_{i=1}^{n}\sum_{j=1}^{n} P(\bar{X}_1 = x_i;\ \bar{X}_2 = x_j)\,1_{\{x_i + x_j \le x\}} \quad \forall x \tag{2.13}$$

$$= \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} 1_{\{x_i + x_j \le x\}} \quad \forall x. \tag{2.14}$$

The problem with this method is that with every year that is added, the number of points increases by a factor n, and since n will be very large, this immediately causes a problem. Therefore, we reduce the number of points to n after every sum we take. The way we do this is almost the same as the discrete approximation of the continuous distribution. We define the two-year risk $\bar{S}_2$ as a discrete uniform distribution over n points, which we will call $s^2_i$. These values are deduced from the $n^2$ values of the sum $\bar{X}_1 + \bar{X}_2$. The $n^2$ sorted values are divided into n groups: the first group contains the n smallest values, the second group the smallest n values that are left, and so on, until group n, which contains the n largest values. Then the median of every group is taken; the median of group i will be the value of $s^2_i$. When a group has an even number of values, the median is the average of the two numbers with index closest to $\frac{n}{2}$. This step is repeated for every new year that is added: the CDF of $\bar{S}_t$ and the CDF of $\bar{X}$ generate a unique new distribution function over $n^2$ points, these points are sorted again and divided into n groups, and the medians of these groups define the numbers $s^{t+1}_i$, which form the atoms of the distribution function of $\bar{S}_{t+1}$.

When we have the approximate distribution for the time T we are interested in, we just take the p-quantile of the approximated distribution to obtain an approximation of the $\mathrm{VaR}^T_p$.
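One convolve-and-reduce step, and the resulting VaR approximation, can be sketched as follows (our own implementation of the median-in-groups reduction; note that with n = 10,000 each step forms $n^2 = 10^8$ sums, so memory is the binding constraint):

```python
import numpy as np

def add_year(s_atoms, x_atoms):
    """Add an independent one-year risk, then reduce n^2 atoms back to n."""
    n = len(s_atoms)
    sums = np.sort((s_atoms[:, None] + x_atoms[None, :]).ravel())
    # group i holds the i-th block of n consecutive sorted values;
    # np.median averages the two middle values of each even-sized group
    return np.median(sums.reshape(n, n), axis=1)

def var_approx(x_atoms, T, p):
    """Approximate the T-year VaR at confidence level p."""
    s = np.sort(x_atoms)
    for _ in range(T - 1):
        s = add_year(s, x_atoms)
    return np.quantile(s, p)
```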

Figure 2.5 shows how this process works. The blue line represents the true continuous distribution of $S_2$. The green line represents a CDF over 100 points, obtained by adding two independent copies of the discrete approximation of a normal distribution with variance 1. The red line is in turn an approximation of the green line, obtained by taking medians in groups as described above.

We can already make two interesting remarks based on this picture. The first is that the approximation over $n^2$ points is too low on the left-hand side and too high on the right-hand side. This means that the approximation has thinner tails than the real distribution, which is a serious drawback. Later in this thesis, we will derive a bound for this deviation. The second remark is that the deviation between the blue and the red line is clearly greater than $\frac{1}{2n}$. This is also best visible at the tails. Both remarks are disadvantages of our method. However, for the generation of this graph we used an n of only 10, which is extremely low. In practice we will use an n of 10,000, which minimizes these effects.

Figure 2.5: Reducing the approximation to n points. The figure compares the true normal distribution with variance 2 (the continuous CDF of $S_2$), the approximation $\bar{X}_1 + \bar{X}_2$ over $n^2$ points, and the reduced approximation $\bar{S}_2$ over n points.

2.4.2 MVaR

The calculation method for the MVaR is more complicated. This is natural, because the MVaR depends on the entire random walk that describes the development of the loss and not just on the final state. There are, however, also similarities between the two methods: the calculation method of the MVaR can be seen as an extension of the calculation method of the VaR.

We cannot directly calculate the MVaR for a given confidence level, but we can calculate an approximation of the confidence level for a given value of the MVaR. This is the probability that the cumulative loss stays below that given value during the entire time period.

The first step is the same as for the approximation of the VaR. So we add two n-point distributions, resulting in a distribution of $n^2$ points, which we then reduce to a new distribution of n points (see figure 2.5). The difference is that we now assign a weight to each of these points. This weight represents the probability that the random walk has not been higher than the fixed value, given that the current position is that specific point.

For the first one-year distribution, representing the loss in year 1, these weights are trivial. When the value of a point is greater than or equal to the given fixed value of the MVaR, its weight is 0; when it is lower, the weight is 1:

$$w^1_i = \begin{cases} 1 & \text{if } x_i < \mathrm{MVaR}; \\ 0 & \text{if } x_i \ge \mathrm{MVaR}. \end{cases}$$

When we look at longer time periods, we also set the weight to 0 when the sum exceeds the given value of the MVaR, but we do not set it to 1 when it does not. Instead, we look at the weights of the previous time step. Below we describe this procedure in detail.


The distribution of the previous time step t − 1 consists of n points, each with its own weight. When we add another one-year distribution, we calculate $n^2$ sums. Each sum consists of one number of the distribution at time t − 1, which carries a unique weight, and one number of the one-year risk, without a weight. Each of the $n^2$ sums takes over the weight of the first number. Then we check whether the sum exceeds the given value of the MVaR; when this is the case, the weight is set to 0.

The next step remains the same as for the calculation of the VaR: the $n^2$ points are sorted and divided into n groups based on their size. Each weight is linked to a unique sum, so the weights stay linked to their sums after sorting. We now have n groups, each consisting of n pairs of a sum value and a weight. Of the n sum values in a group we take the median as before, but of the weights we take the average. The reason for this is that in this way we take into account that the same value can be reached through different paths. This median and average form the point-weight combination of time t.

We repeat this procedure until we have reached the desired time T. We then have n points and n weights. The final step is to take the average of these n weights. All the weights are numbers between zero and one, so this average is a well-defined probability, which approximates the probability that the random walk never exceeds the given value of the MVaR.

The above calculation method defines a function with the value of the MVaR as input and the confidence level belonging to this MVaR, the probability that the cumulative loss never exceeds it, as output. This is clearly a monotonically increasing function, since in order to exceed a given value of the MVaR, all smaller values are of course exceeded as well. Because of this monotonicity, it is also possible to calculate the MVaR for a given confidence level by trial and error.
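The weight-propagation scheme can be sketched as follows (our own code, following the description above; `m` is the candidate MVaR value):

```python
import numpy as np

def mvar_confidence(x_atoms, T, m):
    """Approximate P(max_t S_t < m): the confidence level of the MVaR value m."""
    n = len(x_atoms)
    x = np.sort(x_atoms)
    atoms = x.copy()
    weights = (atoms < m).astype(float)   # year 1: weight 1 below m, 0 otherwise
    for _ in range(T - 1):
        sums = (atoms[:, None] + x[None, :]).ravel()
        w = np.repeat(weights, n)         # each sum inherits its S_{t-1} weight
        w[sums >= m] = 0.0                # paths that cross m get weight 0
        order = np.argsort(sums)
        sums, w = sums[order], w[order]
        atoms = np.median(sums.reshape(n, n), axis=1)  # group medians: new atoms
        weights = w.reshape(n, n).mean(axis=1)         # group averages: new weights
    return weights.mean()                 # P(cumulative loss never reaches m)
```

Because this function is increasing in m, the MVaR for a given confidence level can be found by repeatedly adjusting m, for example by bisection.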

2.5 Confidence level

In this section we will construct an upper bound for the difference between the CDF's of the real sum $S_T$ and the approximated sum $\bar{S}_T$, which is uniformly distributed over n points. In mathematical notation this difference is:

$$|P(S_T \le x) - P(\bar{S}_T \le x)|. \tag{2.15}$$

The bound will of course depend on the number of years T, since we throw away some information at every time step. It will also depend on the number of intervals n. This makes it possible to make the difference arbitrarily small in exchange for a longer calculation time.

We already saw the bound for the approximation of the one-year distribution in (2.12), which was $\frac{1}{2n}$. The other bounds will be obtained in two steps. First we will determine the bound for the two-year distribution. Then we will use induction to extend the bound with an additional year, resulting in a bound depending on T.

2.5.1 Two-year distribution

For the bound of the two-year distribution, we use a mathematical trick known as the triangle inequality. The idea behind this method is that we do not look at the difference between the real CDF of $S_2$ and the approximated CDF of $\bar{S}_2$ (consisting of n points) directly, but use the CDF of $\bar{X}_1 + \bar{X}_2$ (consisting of $n^2$ points) as an intermediate step. So we look at the difference between the CDF's of $S_2$ and $\bar{X}_1 + \bar{X}_2$, and at the difference between the CDF's of $\bar{S}_2$ and $\bar{X}_1 + \bar{X}_2$. The following expression shows how this is used:

$$|P(S_2 \le x) - P(\bar{S}_2 \le x)| = |P(S_2 \le x) - P(\bar{X}_1 + \bar{X}_2 \le x) + P(\bar{X}_1 + \bar{X}_2 \le x) - P(\bar{S}_2 \le x)| \tag{2.16}$$

$$\le |P(S_2 \le x) - P(\bar{X}_1 + \bar{X}_2 \le x)| + |P(\bar{X}_1 + \bar{X}_2 \le x) - P(\bar{S}_2 \le x)| \quad \forall x. \tag{2.17}$$

Figure 2.6: Construction of $X^+$ and $X^-$: upper and lower bounds of the continuous CDF.

We must now construct an upper bound for both terms on the right-hand side of (2.17). The bound for the first term is the hardest. The way we solve this is by constructing two new CDF's, one that is larger than or equal to the CDF of X and one that is smaller than or equal to it; we will refer to these as the CDF's of $X^-$ and $X^+$, respectively. These are random variables that are independent of each other and of all other random variables we consider. This is because we only need their CDF's to obtain the bound and nothing else.

These two new CDFs are closely related to the CDF of X. This is best seen in Figure 2.6. Both CDFs are, just like the CDF of $\tilde{X}$, step functions with steps of size $1/n$. The CDF of $X^+$ starts at 0 and jumps exactly at the points where it hits the CDF of the real risk distribution X. The CDF of $X^-$ starts at $1/n$ and jumps at the same points as the CDF of $X^+$. The CDF of $X^+$ never reaches 1 and the CDF of $X^-$ never reaches 0, but that is not a problem: it means that the random variable $X^+$ has a positive probability of being plus infinity and the random variable $X^-$ has a positive probability of being minus infinity.

\[
P(X^+ = x_i^+) = \frac{1}{n}, \qquad P(X^- = x_i^-) = \frac{1}{n} \quad \forall i, \tag{2.18}
\]
\[
P(X \le x_i^+) = \frac{i}{n}, \qquad P(X \le x_i^-) = \frac{i-1}{n} \quad \forall i, \tag{2.19}
\]
\[
x_1^- = -\infty, \qquad x_n^+ = \infty. \tag{2.20}
\]

The numbers $x_{i+1}^-$ and $x_i^+$ are the same.

The random variables $X^+$ and $X^-$ are constructed in such a way that they have the properties:

\[
P(X^- \le x) \ge P(X \le x) \quad \forall x, \tag{2.21}
\]
\[
P(X^+ \le x) \le P(X \le x) \quad \forall x. \tag{2.22}
\]
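A small numerical check of (2.21)-(2.22) is straightforward (our own illustration, assuming a standard normal X): the atoms follow directly from the quantile function, and the two step CDFs can be compared to the continuous CDF on a grid.

```python
import numpy as np
from scipy.stats import norm

n = 100
q = norm.ppf(np.arange(1, n) / n)           # quantiles q_i with F(q_i) = i/n
x_minus = np.concatenate(([-np.inf], q))    # x^-_1 = -inf, x^-_{i+1} = q_i
x_plus = np.concatenate((q, [np.inf]))      # x^+_i = q_i,  x^+_n = +inf

grid = np.linspace(-4.0, 4.0, 1001)
cdf_minus = np.searchsorted(x_minus, grid, side="right") / n
cdf_plus = np.searchsorted(x_plus, grid, side="right") / n
assert (cdf_minus >= norm.cdf(grid)).all()  # (2.21): CDF of X^- dominates F
assert (cdf_plus <= norm.cdf(grid)).all()   # (2.22): CDF of X^+ stays below F
```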

When we have sequences of independent identically distributed random variables $(X_k^+)$, $(X_k^-)$ and $(X_k)$, the above result can be translated to the sum:

\[
P\Big(\sum_{k=1}^{n} X_k^- \le x\Big) \ge P\Big(\sum_{k=1}^{n} X_k \le x\Big) \quad \forall x, \tag{2.23}
\]
\[
P\Big(\sum_{k=1}^{n} X_k^+ \le x\Big) \le P\Big(\sum_{k=1}^{n} X_k \le x\Big) \quad \forall x. \tag{2.24}
\]

With the above inequalities we can find a new upper bound for $|P(S_2 \le x) - P(\tilde{X}_1 + \tilde{X}_2 \le x)|$ by using the definition of the absolute value:

\[
\begin{aligned}
|P(S_2 \le x) - P(\tilde{X}_1 + \tilde{X}_2 \le x)|
&= \max\!\big(P(S_2 \le x) - P(\tilde{X}_1 + \tilde{X}_2 \le x),\ P(\tilde{X}_1 + \tilde{X}_2 \le x) - P(S_2 \le x)\big) \\
&\le \max\!\big(P(X_1^- + X_2^- \le x) - P(\tilde{X}_1 + \tilde{X}_2 \le x),\ P(\tilde{X}_1 + \tilde{X}_2 \le x) - P(X_1^+ + X_2^+ \le x)\big).
\end{aligned} \tag{2.25}
\]

The last step holds because in the first argument of the maximum we replace $P(S_2 \le x)$ by the larger $P(X_1^- + X_2^- \le x)$, and in the second argument we replace $P(S_2 \le x)$ by the smaller $P(X_1^+ + X_2^+ \le x)$; because of the minus sign in front of the $P(S_2 \le x)$ term, both substitutions can only make the maximum larger.

We will only write out the first argument of this maximum, since both derivations are exactly the same and yield the same answer. Since only discrete random variables are involved, we can calculate the probabilities by simply writing down all the possible values:

\[
\begin{aligned}
P(X_1^- + X_2^- \le x) - P(\tilde{X}_1 + \tilde{X}_2 \le x)
&= \sum_{i=1}^n \sum_{j=1}^n P(X_1^- = x_i^-,\, X_2^- = x_j^-)\,\mathbf{1}_{\{x_i^- + x_j^- \le x\}}
 - \sum_{i=1}^n \sum_{j=1}^n P(\tilde{X}_1 = x_i,\, \tilde{X}_2 = x_j)\,\mathbf{1}_{\{x_i + x_j \le x\}} && (2.26)\\
&= \Big(\frac{1}{n}\Big)^2 \sum_{i=1}^n \sum_{j=1}^n \big(\mathbf{1}_{\{x_i^- + x_j^- \le x\}} - \mathbf{1}_{\{x_i + x_j \le x\}}\big). && (2.27)
\end{aligned}
\]

Now we use the definition of the $x_i^-$: they are smaller than the $x_i$ with the same index, but larger than the $x_{i-1}$, where the index is just one lower. Shifting the index in the first double sum therefore pairs each term $\mathbf{1}_{\{x_{i+1}^- + x_{j+1}^- \le x\}}$ with a term $\mathbf{1}_{\{x_i + x_j \le x\}}$ that is at least as large, and splits off the remaining boundary terms:

\[
\begin{aligned}
P(X_1^- + X_2^- \le x) - P(\tilde{X}_1 + \tilde{X}_2 \le x)
&= \Big(\frac{1}{n}\Big)^2 \Bigg( \sum_{i=1}^{n-1} \sum_{j=1}^{n-1} \big(\mathbf{1}_{\{x_{i+1}^- + x_{j+1}^- \le x\}} - \mathbf{1}_{\{x_i + x_j \le x\}}\big) \\
&\qquad + \sum_{i=1}^{n} \big(\mathbf{1}_{\{x_i^- + x_1^- \le x\}} - \mathbf{1}_{\{x_i + x_n \le x\}}\big)
 + \sum_{j=1}^{n-1} \big(\mathbf{1}_{\{x_1^- + x_{j+1}^- \le x\}} - \mathbf{1}_{\{x_n + x_j \le x\}}\big) \Bigg) && (2.28)\\
&\le \Big(\frac{1}{n}\Big)^2 \Bigg( \sum_{i=1}^{n} \big(\mathbf{1}_{\{x_i^- + x_1^- \le x\}} - \mathbf{1}_{\{x_i + x_n \le x\}}\big)
 + \sum_{j=1}^{n-1} \big(\mathbf{1}_{\{x_1^- + x_{j+1}^- \le x\}} - \mathbf{1}_{\{x_n + x_j \le x\}}\big) \Bigg) && (2.29)\\
&\le \Big(\frac{1}{n}\Big)^2 \big(n + (n-1)\big) && (2.30)\\
&\le \frac{2}{n}. && (2.31)
\end{aligned}
\]

As we already mentioned, the bound for $X^+$ can be derived in a similar way and yields the same result:

\[
P(\tilde{X}_1 + \tilde{X}_2 \le x) - P(X_1^+ + X_2^+ \le x) \le \frac{2}{n}. \tag{2.32}
\]

We can fill in (2.31) and (2.32) in (2.25), resulting in:

\[
|P(S_2 \le x) - P(\tilde{X}_1 + \tilde{X}_2 \le x)| \le \max\!\Big(\frac{2}{n}, \frac{2}{n}\Big) = \frac{2}{n}. \tag{2.33}
\]

The only thing left to do is to derive a bound for the other term in (2.17): the difference between the $n^2$-point approximation and the n-point approximation. This is much less complicated. In fact the deduction is the same as we have seen before in (2.12) and yields the same bound:

\[
|P(\tilde{X}_1 + \tilde{X}_2 \le x) - P(\tilde{S}_2 \le x)| \le \frac{1}{2n}. \tag{2.34}
\]

Filling in (2.33) and (2.34) in (2.17) gives us the final bound:

\[
|P(S_2 \le x) - P(\tilde{S}_2 \le x)| \le \frac{2}{n} + \frac{1}{2n} = \frac{5}{2n} \quad \forall x. \tag{2.35}
\]
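The two-year bound can be checked numerically. The following sketch (our own, not from the thesis) discretises a standard normal one-year risk into n interval medians, builds the n-point approximation of $S_2$ by the sort-and-take-medians reduction described in the methodology, and compares it with the exact N(0, 2) CDF:

```python
import numpy as np
from scipy.stats import norm

n = 2000
# one-year discretisation: medians of n equiprobable intervals
x = norm.ppf((np.arange(n) + 0.5) / n)
# n^2-point sum, sorted and reduced to n group medians
sums = np.sort((x[:, None] + x[None, :]).ravel())
s2 = np.median(sums.reshape(n, n), axis=1)
# compare the approximated CDF with the exact N(0, 2) CDF from both
# sides of each atom, where the sup of the difference is attained
F = norm.cdf(s2, scale=np.sqrt(2))
i = np.arange(1, n + 1)
diff = np.maximum(np.abs(i / n - F), np.abs((i - 1) / n - F)).max()
print(diff, 5 / (2 * n))   # observed difference vs the theoretical bound
```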

2.5.2 Multiple-year distribution

We will now generalize the bound to a risk over a T-year period. Our claim is that for the approximation of the T-year risk, the maximal difference between the real CDF and the approximated CDF is $(2T+1)/n$. We will use induction to prove that this holds. The derivation uses many of the same techniques as for the two-year distribution.

We already know that the claim holds for T = 2: the bound $5/(2n)$ derived above is smaller than $(2 \cdot 2 + 1)/n = 5/n$. Therefore, we now have to show that it also holds for T + 1 given that it holds for T. So we want to show that:

\[
|P(S_{T+1} \le x) - P(\tilde{S}_{T+1} \le x)| \le \frac{2(T+1)+1}{n}. \tag{2.36}
\]

In order to prove this we again apply the triangle inequality, by again adding and subtracting the $n^2$-point approximation:

\[
\begin{aligned}
|P(S_{T+1} \le x) - P(\tilde{S}_{T+1} \le x)|
&= |P(S_{T+1} \le x) - P(\tilde{S}_T + \tilde{X}_{T+1} \le x) + P(\tilde{S}_T + \tilde{X}_{T+1} \le x) - P(\tilde{S}_{T+1} \le x)| && (2.37)\\
&\le |P(S_{T+1} \le x) - P(\tilde{S}_T + \tilde{X}_{T+1} \le x)| + |P(\tilde{S}_T + \tilde{X}_{T+1} \le x) - P(\tilde{S}_{T+1} \le x)|. && (2.38)
\end{aligned}
\]

For the first term on the right-hand side we do the same step as in (2.25), writing the absolute value as a maximum. We also split $S_{T+1}$ into the first T years and the last one:

\[
\begin{aligned}
|P(S_{T+1} \le x) - P(\tilde{S}_T + \tilde{X}_{T+1} \le x)|
= \max\!\big(&P(S_T + X_{T+1} \le x) - P(\tilde{S}_T + \tilde{X}_{T+1} \le x),\\
&P(\tilde{S}_T + \tilde{X}_{T+1} \le x) - P(S_T + X_{T+1} \le x)\big). && (2.39)
\end{aligned}
\]

The next step is again creating a CDF that is always larger than or equal to the CDF of the real sum $S_T$ and one that is always smaller than or equal to it, to which we will refer as the CDFs of $S_T^-$ and $S_T^+$ respectively. We will again use uniform discrete distributions on n points, and use the assumed maximal difference of $(2T+1)/n$ to ensure that the two new CDFs are indeed larger and smaller.

For the n-point approximation of the sum we have the following two properties:

\[
P(\tilde{S}_T = s_{T,i}) = \frac{1}{n} \quad \forall i, \tag{2.40}
\]
\[
|P(\tilde{S}_T \le s_{T,i}) - P(S_T \le s_{T,i})| \le \frac{2T+1}{n} \quad \forall i. \tag{2.41}
\]

This means that the difference is at most $2T+1$ times the probability of an atom. So by shifting the index by $2T+1$, we obtain an upper and a lower bound for the CDF of $S_T$. We therefore define the two CDFs as:

\[
P(S_T^- \le s_{T,i}) = \begin{cases} \frac{i+(2T+1)}{n} & \text{if } i = 1,\dots,n-(2T+1);\\[2pt] 1 & \text{if } i = n-2T,\dots,n, \end{cases}
\qquad
P(S_T^+ \le s_{T,i}) = \begin{cases} 0 & \text{if } i = 1,\dots,2T+1;\\[2pt] \frac{i-(2T+1)}{n} & \text{if } i = 2T+2,\dots,n. \end{cases}
\]

The n atoms of these distributions are the same as the atoms of the discrete approximation, but with the index shifted. The values that cannot be shifted, because the index would become larger than n or smaller than 1, are set to minus or plus infinity:

\[
s_{T,i}^- = \begin{cases} -\infty & \text{if } i = 1,\dots,2T+1;\\ s_{T,i-(2T+1)} & \text{if } i = 2T+2,\dots,n, \end{cases}
\qquad
s_{T,i}^+ = \begin{cases} s_{T,i+(2T+1)} & \text{if } i = 1,\dots,n-(2T+1);\\ \infty & \text{if } i = n-2T,\dots,n. \end{cases}
\]

With these two bounds, we can further rewrite expression (2.39):

\[
\begin{aligned}
|P(S_{T+1} \le x) - P(\tilde{S}_T + \tilde{X}_{T+1} \le x)|
\le \max\!\big(&P(S_T^- + X_{T+1}^- \le x) - P(\tilde{S}_T + \tilde{X}_{T+1} \le x),\\
&P(\tilde{S}_T + \tilde{X}_{T+1} \le x) - P(S_T^+ + X_{T+1}^+ \le x)\big). && (2.42)
\end{aligned}
\]


We will again only consider one argument of the maximum, since both deductions are the same and yield the same result. We first rewrite the expression in terms of indicator functions:

\[
\begin{aligned}
P(S_T^- + X_{T+1}^- \le x) - P(\tilde{S}_T + \tilde{X}_{T+1} \le x)
&= \sum_{i=1}^n \sum_{j=1}^n P(S_T^- = s_{T,i}^-,\, X_{T+1}^- = x_j^-)\,\mathbf{1}_{\{s_{T,i}^- + x_j^- \le x\}} \\
&\qquad - \sum_{i=1}^n \sum_{j=1}^n P(\tilde{S}_T = s_{T,i},\, \tilde{X}_{T+1} = x_j)\,\mathbf{1}_{\{s_{T,i} + x_j \le x\}} && (2.43)\\
&= \Big(\frac{1}{n}\Big)^2 \sum_{i=1}^n \sum_{j=1}^n \big(\mathbf{1}_{\{s_{T,i}^- + x_j^- \le x\}} - \mathbf{1}_{\{s_{T,i} + x_j \le x\}}\big). && (2.44)
\end{aligned}
\]

We now split the sum into the part that is with absolute certainty non-positive and the parts that might be positive:

\[
\begin{aligned}
P(S_T^- + X_{T+1}^- \le x) - P(\tilde{S}_T + \tilde{X}_{T+1} \le x)
&\le \Big(\frac{1}{n}\Big)^2 \Bigg( \sum_{i=1}^{n-(2T+1)} \sum_{j=1}^{n-1} \big(\mathbf{1}_{\{s_{T,i+(2T+1)}^- + x_{j+1}^- \le x\}} - \mathbf{1}_{\{s_{T,i} + x_j \le x\}}\big) \\
&\qquad\quad + (2T+1)n + (n-(2T+1)) \Bigg) && (2.45)\\
&\le \Big(\frac{1}{n}\Big)^2 \big((2T+1)n + (n-(2T+1))\big) && (2.46)\\
&\le \frac{2(T+1)}{n}. && (2.47)
\end{aligned}
\]

This bound is the same for the expression with $S_T^+$, so we have the final bound for (2.39):

\[
|P(S_{T+1} \le x) - P(\tilde{S}_T + \tilde{X}_{T+1} \le x)| \le \frac{2(T+1)}{n}. \tag{2.48}
\]

We can now finally return to the second term in (2.38). This is again the reduction from an $n^2$-point distribution to an n-point distribution, which we have seen before, so this bound is known:

\[
|P(\tilde{S}_T + \tilde{X}_{T+1} \le x) - P(\tilde{S}_{T+1} \le x)| \le \frac{1}{2n}. \tag{2.49}
\]

Having deduced a bound for both terms of (2.38), we now have the total bound for the difference between the real CDF and the approximated CDF:

\[
|P(S_{T+1} \le x) - P(\tilde{S}_{T+1} \le x)| \le \frac{2(T+1)}{n} + \frac{1}{2n} \le \frac{2(T+1)+1}{n}. \tag{2.50}
\]

This bound satisfies our induction hypothesis, so the bound holds for the approximation of $S_{T+1}$ given that it holds for $S_T$. This concludes the proof.

3 Results

In this chapter we present the results we obtained for the four risk distributions we examined. We start with the two symmetric distributions: the normal distribution with light tails and the Student's t-distribution with fat tails. Subsequently we show the results for the asymmetric distributions: the exponential distribution without fat tails and the lognormal distribution with fat tails. For every distribution we calculated the same VaR's and confidence levels, which are presented in three tables and one graph.

Each table belongs to a risk with a different profit margin, taking the values 0, 0.1 and 0.25 standard deviations. The tables contain calculated values for five different time periods: 1, 2, 5, 10 and 20 years. For each time period we calculated the $\text{VaR}_{95}$ and the $\text{VaR}_{99}$. Furthermore, we calculated the $\text{VaR}^T_{100-T}$: for T = 1 the $\text{VaR}_{99}$, for T = 2 the $\text{VaR}_{98}$, and so on. This tells us whether the relation between the time period T and the confidence level for which the VaR equals a certain value is approximately linear. The final two columns are calculated confidence levels. For this calculation the $\text{VaR}^1_{99}$ was set as a fixed value. Then, for each time period T we determined the confidence level p for which the $\text{VaR}^T_p$ was exactly equal to the $\text{VaR}^1_{99}$. We repeated the same procedure for the MVaR, where we used the same fixed value to match the confidence levels, because of the equality $\text{VaR}^1_{99} = \text{MVaR}^1_{99}$.

In the calculation process of the confidence levels, we also found the matching confidence levels for all intermediate time points up to 20, for both the VaR and the MVaR. Since the confidence levels are the most important results, we present all of them in one single plot per distribution. The plot contains six curves: the VaR and the MVaR for each of the three profit margins.

3.1 Normal distribution

The normal distribution is symmetric and has no fat tails. The sum of two independent normally distributed random variables is again normally distributed, with mean and variance equal to the sums of the individual means and variances. The calculations of the VaR's are done with n = 10,000; the calculation of the MVaR is done with n = 5,000. For the MVaR calculations we have not obtained a reliable error bound.

Table 3.1: Normal distribution, profit margin 0 sd's

 T   VaR^T_95   VaR^T_99   VaR^T_{100-T}   p: VaR^T_p = VaR^1_99   p: MVaR^T_p = VaR^1_99
 1     1.64       2.32        2.32                0.990                   0.990
 2     2.32       3.28        2.90                0.950                   0.946
 5     3.67       5.17        3.67                0.851                   0.795
10     5.18       7.28        4.037               0.770                   0.636
20     7.29      10.2         3.73                0.700                   0.482

We see in Table 3.1 that the $\text{VaR}_{95}$ and the $\text{VaR}_{99}$ increase clearly. From theory we know that they increase with the square root of T, and the results support that quite accurately. The $\text{VaR}^T_{100-T}$ increases with T at first, but reaches its peak near T = 10 and then starts to decrease slowly. All the values are significantly higher than the $\text{VaR}^1_{99}$. This is also clearly visible in the matched confidence levels: the probability of losing an amount equal to the $\text{VaR}^1_{99}$ within two years is 5%, and even 15% over a 5-year period. This shows that the relation between p and T is far from linear. The confidence level of the MVaR follows the same trend as the confidence level of the VaR. The matched confidence level of the MVaR is by definition never higher than that of the VaR, and the difference increases with T. For T = 20 this difference is over 20 percentage points, but since the confidence level of the VaR was already low at 70%, the relative difference is limited.
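Because sums of independent normal variables are again normal, the matched confidence levels in Table 3.1 can be verified in closed form: with zero profit margin $\text{VaR}^T_p = \sqrt{T}\,z_p$, so solving $\sqrt{T}\,z_p = z_{0.99}$ gives $p = \Phi(z_{0.99}/\sqrt{T})$. A short check (our own illustration, not the thesis code):

```python
import numpy as np
from scipy.stats import norm

z99 = norm.ppf(0.99)
for T in (1, 2, 5, 10, 20):
    p = norm.cdf(z99 / np.sqrt(T))
    print(T, round(p, 3))   # 0.990, 0.950, 0.851, 0.769, 0.699
```

The values agree with the table up to the discretisation error of the n-point approximation.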

Table 3.2: Normal distribution, profit margin 0.1 sd's

 T   VaR^T_95   VaR^T_99   VaR^T_{100-T}   p: VaR^T_p = VaR^1_99   p: MVaR^T_p = VaR^1_99
 1     1.54       2.22        2.22                0.990                   0.990
 2     2.12       3.08        2.70                0.957                   0.953
 5     3.17       4.67        3.17                0.889                   0.836
10     4.18       6.28        3.037               0.847                   0.720
20     5.29       8.20        1.73                0.830                   0.613

Table 3.2 shows the results when a small profit margin of 0.1 standard deviations is added to the one-year risk. Compared to the VaR's without a profit margin, the VaR's for a fixed confidence level decrease linearly in T (by 0.1T). The profit margin is small enough to ensure that the $\text{VaR}_{95}$ and the $\text{VaR}_{99}$ still increase in T, but it shifts the peak of the $\text{VaR}^T_{100-T}$ closer to 0, with the peak now lying around T = 5. The matching of p has some interesting outcomes. Naturally, the confidence levels p are higher with a profit margin than without, but the difference behaves nicely as a function of T: it lies everywhere roughly between 0.7T and T percentage points. The most interesting result concerns the MVaR. Its confidence levels are of course also higher because of the profit margin, but the difference between the confidence levels of the VaR and the MVaR stays almost the same for every time T. The differences for T = 20, for example, are 0.218 without a profit margin and 0.217 with a profit margin. This is very remarkable: since the confidence levels of the VaR are higher with the profit margin, an equal absolute difference means a larger relative difference. Especially for T = 10 and T = 20, there is a large probability that the net loss exceeds the $\text{VaR}^1_{99}$ somewhere during the time period but is compensated by the end of the period.
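The same closed-form check extends to a profit margin of m standard deviations per year: $\text{VaR}^T_p = \sqrt{T}\,z_p - mT$ and $\text{VaR}^1_{99} = z_{0.99} - m$, so the matched level becomes $p = \Phi\big((z_{0.99} + m(T-1))/\sqrt{T}\big)$. A sketch (our own illustration):

```python
import numpy as np
from scipy.stats import norm

z99 = norm.ppf(0.99)
for m in (0.1, 0.25):
    for T in (2, 5, 10, 20):
        p = norm.cdf((z99 + m * (T - 1)) / np.sqrt(T))
        print(m, T, round(p, 3))   # reproduces the p columns of Tables
                                   # 3.2 and 3.3 up to discretisation error
```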

Table 3.3: Normal distribution, profit margin 0.25 sd's

 T   VaR^T_95   VaR^T_99   VaR^T_{100-T}   p: VaR^T_p = VaR^1_99   p: MVaR^T_p = VaR^1_99
 1     1.39       2.07        2.07                0.990                   0.990
 2     1.82       2.78        2.40                0.966                   0.961
 5     2.42       3.92        2.42                0.932                   0.884
10     2.68       4.78        1.537               0.927                   0.819
20     2.29       5.20       -1.27                0.945                   0.771

Table 3.3 shows the results for a normally distributed risk with a profit margin of 0.25 standard deviations. Again, the VaR's for fixed confidence levels decrease linearly (by 0.25T). The difference between the $\text{VaR}_{95}$ and the $\text{VaR}_{99}$ is notable: while the $\text{VaR}_{99}$ keeps increasing until T = 20, the $\text{VaR}_{95}$ starts to decrease before that, resulting in a $\text{VaR}^{20}_{95}$ that is lower than both the $\text{VaR}^{10}_{95}$ and the $\text{VaR}^{5}_{95}$. This clearly indicates that the time dependence depends on the confidence level, and that the effect of a profit margin is larger for lower confidence levels. The results of the $\text{VaR}^T_{100-T}$ support this: for T = 10 and T = 20 the values clearly drop, and the $\text{VaR}^{20}_{80}$ is even negative. The confidence level for which the $\text{VaR}^T_p$ equals the $\text{VaR}^1_{99}$ no longer decreases monotonically in T: it reaches its lowest value around T = 10 and rises again to 0.945 for T = 20.
