
Department of Economics and Business

Implications of Switching from Value at Risk to

Expected Shortfall

Comparing the adequacy of backtests for expected shortfall

Gwendolyn Stuitje

Supervisor:

dr. S.A. Broda

June 2014

 

Abstract

This paper investigates the adequacy of two pre-existing techniques for expected shortfall backtesting: the residual approach of McNeil and Frey (2000) and the truncated dispersion approach of Righi and Ceretta (2012). Two Monte Carlo simulation studies are performed, one simulating normally distributed returns and one simulating returns from a student-t distribution. In both simulations the ES is computed under the normal and the student-t assumption and the backtests are applied. Through p-values and rejection rates it is shown that both backtests are, in both cases, inaccurate for expected shortfall backtesting.


1 Introduction

In the article Expected Shortfall: A Natural Coherent Alternative to Value at Risk (2002), Acerbi and Tasche describe the need for a change in risk management. This need arose when risk professionals and researchers came to the conclusion that the broadly used risk measure, Value at Risk (VaR), is not a coherent risk measure¹. The awareness that the use of VaR is not indisputable has been part of the progress made in the risk management area, which started with the articles Thinking Coherently (1997) and Coherent Measures of Risk (1999) by Artzner et al. These articles examine and define for the first time the properties that a statistic of a portfolio should have for it to be considered a sensible, and thus coherent, risk measure. To put it differently, Artzner et al. made risk management into a science by expressing general statements that can be used to draw logically certain conclusions about risk measures. Using this deductive framework, one can examine whether a risk measure meets the standards of a coherent risk measure.

Since 1996 the Basel Committee on Banking Supervision has required the use of VaR as a measure for the market-based capital requirement of banks. The abovementioned progress in the science of risk management made it possible to investigate whether VaR satisfies all the axioms of a coherent risk measure. When tested, it became clear that VaR does not fulfil all of the axioms of coherence; it violates the axiom of sub-additivity. Researchers like Acerbi and Tasche (2002) apply the axioms of coherence strictly and thus refuse to call a statistic of a portfolio a risk measure if it is not coherent. They see the non-coherency of VaR as an indication of the need for a change in risk management. However, other researchers in the area of risk management are not as rigorous as Acerbi and Tasche. Yamai and Yoshiba (2002), for instance, state that the relevance of the problems with the non-sub-additivity of VaR depends on the preferences of risk managers. In the case that sub-additivity is irrelevant for some risk practitioners and VaR does measure all the other relevant aspects of risk, Yamai and Yoshiba would consider VaR a valid risk measure.

However, there is another pressing problem with VaR that might be more urgent, because it is related to the insolvency of financial institutions caused by adverse market situations. In an adverse market situation, a substantial part of the losses of a portfolio will lie in the tails of the distribution. The problem is that VaR only focuses on the threshold of the possible losses and is thus indifferent about the losses beyond that level, the tail risk. Consequently, VaR does not present an accurate picture of the risk that is faced in adverse market situations.

¹ $V$ is a set of real-valued random variables and a function $\rho: V \to \mathbb{R}$ is a coherent risk measure if it is monotonous ($X, Y \in V,\ X \le Y \Rightarrow \rho(Y) \le \rho(X)$), sub-additive ($X, Y, X+Y \in V \Rightarrow \rho(X) + \rho(Y) \ge \rho(X+Y)$), positively homogeneous ($X \in V,\ h > 0,\ hX \in V \Rightarrow \rho(hX) = h\rho(X)$) and translation invariant ($X \in V,\ a \in \mathbb{R} \Rightarrow \rho(X + a) = \rho(X) - a$).

During the financial crisis there was an adverse market situation that induced insolvency of financial institutions, because VaR did not present an accurate picture of the risk that these institutions faced. Thus, while in 1997 and 1999 Artzner et al. already wrote about changing the risk measure because of theoretical shortcomings of VaR, it was the financial crisis that painfully showed the need for a change in risk management. During the crisis it became clear that the level of required capital against trading book exposures, calculated using VaR, could not absorb the extreme losses that appeared, which led to bankruptcies and job losses. In order to prevent future losses of this size, the Basel Committee on Banking Supervision ("the Committee") announced in a consultative document, Fundamental review of the trading book: A revised market risk framework, a revision of the market risk framework "to contribute to a more resilient banking sector by strengthening capital standards for market risk" (Basel, 2013, p. 1). By reforming the regulatory standards for banks in response to the financial crisis, the required change is being made in the practice of risk management.

A part of the revision of the market risk framework consists of a switch from VaR to a risk measure that satisfies the theoretical requirements set by Artzner et al. (1999) and that is more robust in times of adverse market situations: Expected Shortfall (ES). ES had previously been proposed by Artzner et al. (1999) as a coherent alternative to VaR. In addition to the fact that ES satisfies the axioms of coherence, ES is the natural estimator for the expected loss after the threshold, that is to say, the losses beyond VaR level. This implies that ES accounts for the tail risk of a portfolio.

One implication of the switch from VaR to ES is the need to adjust the way the regulatory capital requirements are determined. For determining regulatory capital requirements, the Committee has to choose a confidence level to measure the size and likelihood of losses above the VaR level (Basel, 2013). With a confidence level of 90% there is a 10% chance that the portfolio will drop, over the selected time horizon, by a value equal to or higher than the one given by the VaR. The current confidence level for the VaR measure is 99%. For ES, the Committee could decide to use a 99% confidence level so that both risk measures are calculated based on the same tail of the distribution. However, because ES is a more sensitive risk measure, using the same 1% quantile for both ES and VaR might not be appropriate from the viewpoint of capital reserve determination. From this perspective, Kerkhof and Melenberg (2004) propose to set the levels in such a way that the risk measurement methods result in a similar level of capital reserve. Consequently, in comparison with the VaR level, the confidence level for ES has to be lower. In their consultative document the Committee decided to use a 97.5% ES, which is in line with the current 99% confidence level for the VaR measure.

Another implication of the change of risk measure is the need for methods that verify the risk measure estimates used by financial institutions. Thus, with the introduction of ES as a new risk measure, the Committee requires that there are backtests for ES. A backtest is necessary in order to compare the actual trading results with the model-generated risk measures (Basel, 1996). If the comparison of the model and the actual results is close enough, there will be no problems regarding the performance of the model. One of the reasons why ES was not present in earlier documents of the Committee, for example Basel II, is the difficulty of backtesting ES (Yamai and Yoshiba, 2002). When there are no adequate backtests for ES, the quality and accuracy of the risk measurement system of a bank or other financial institution cannot be evaluated.

The existing backtests for VaR focus on the number of times that a realized loss is worse than the estimated VaR level; these events are called violations. VaR backtests are not appropriate for ES backtesting because they only pay attention to the number of violations of the VaR level and not to the size of the exceeding losses. Therefore these tests lack statistical power and are not a good way to test the model-generated ES (see Wong, 2008). As a solution to this problem, different backtests for ES have been designed that focus on the expected size of the violations. In Backtesting trading risk of commercial banks using expected shortfall (2008), Wong describes some of these techniques together with their deficiencies for backtesting ES and puts forward another technique to backtest trading risk using ES. Some backtesting techniques are the censored Gaussian approach (Berkowitz, 2001), the functional delta approach (Kerkhof and Melenberg, 2004), the residual approach (McNeil and Frey, 2000) and Wong's own saddlepoint technique. The problems with the three former ES backtests are related to the fact that their test statistics rely on large samples for convergence to the limiting distributions. Put differently, when the test samples are small, the tests might be inaccurate due to the reliance on an asymptotic test statistic (see Wong, 2008). The saddlepoint technique proposed by Wong is unaffected by these disadvantages.


However, Righi and Ceretta (2012) make clear that there are still some disadvantages to the saddlepoint technique, and therefore they propose their own backtesting approach, which extends and improves the aforementioned techniques (see Righi and Ceretta, 2012).

Since ES came into the picture as a possible alternative to VaR, various steps have been taken to contribute to the commissioning of ES. Examples of this are the aforementioned backtesting techniques that are designed to test the model-generated ES. However, there are still some shortcomings in the designed ES backtests and the Committee does not yet prescribe which of these backtesting techniques has to be used. In their latest consultative document they state that "it is often argued that neither expected shortfall nor value-at-risk measures can be compared against actual trading outcomes, since the actual outcomes will reflect changing in portfolio composition during the holding period" (Basel, 2013, p. 106). Thus, one of the implications of the switch from VaR to ES is that an adequate backtesting technique for ES needs to be found.

In this paper I compare, through a Monte Carlo simulation, the adequacy of two of the available backtests for ES: the one of Righi and Ceretta (2012) and the one of McNeil and Frey (2000). A large number of scenarios of 250 daily returns are generated, and from this the conditional distribution of the daily return, and hence the daily ES, can be computed. Monte Carlo simulation is suitable for determining risk measures for the daily returns and can be used to determine whether a backtest is accurate and powerful. By evaluating the obtained results, I will make a suggestion about which backtest is most accurate.

The structure of the paper is as follows. Section 2 explains the research method. In Section 3 the concepts of VaR and ES are defined and a short analysis of the shortcomings of VaR and the benefits of ES is given. Section 4 considers two available backtesting approaches for ES and Section 5 treats the adequacy testing of these backtests. Section 6 describes the Monte Carlo simulation and presents the simulation results. Section 7 gives the general conclusion of the paper.


2 Research Method

2.1 Backtests for ES

This paper concentrates on two backtest methods for ES: the residual approach of McNeil and Frey (2000) and the truncated dispersion approach of Righi and Ceretta (2012). When a violation occurs, McNeil and Frey (2000) look at the discrepancy between the return at time t, which in case of a violation is called an exceedance, and the estimated ES that assumes normality. They divide this difference by the standard deviation of the whole return distribution and call the result the exceedance residual. To test whether the model rightly specifies the ES, McNeil and Frey (2000) use the null hypothesis that the exceedance residuals have zero mean.

  Righi and Ceretta (2012) also test how far the loss exceeding the VaR level is spread from its expected value and this is done for the truncated normal and student t distribution. Righi and Ceretta state that “in order to calculate how far a loss is from its expected value, we need to use some dispersion measure intrinsic to this expectation rather than linked with the absolute distribution expectation” (2012, p. 7). So, instead of using the whole distribution standard deviation as a dispersion measure, they use the dispersion that is truncated by the VaR. Righi and Ceretta (2012) use the same null hypothesis as McNeil and Frey (2000) for model verification.

2.2 Comparing the adequacy of backtests for ES

In this paper I will test and compare the adequacy of the two aforementioned backtesting procedures. This will be done by employing the Monte Carlo method. The Monte Carlo Method simulates a real process multiple times, which leads to a set of simulations that will provide the needed data. In finance, the Monte Carlo method is used to simulate the underlying processes of the financial market. In other words, it simulates the effects of random variables that cause an increase or decrease in the value of an asset. This makes it possible to simulate the returns of a portfolio over a given time horizon.

To compare the adequacy of the backtests, I use the Monte Carlo method to apply the backtest procedures to a set of simulated portfolio returns for which the ES at the corresponding 5% quantile is computed. Because the portfolio returns are simulated, it is possible to find the real VaR level, which is the 5% quantile of the vector of returns. In other words, the simulated returns have a value that is less than the VaR 5 per cent of the time. With the calculated real VaR level, the real ES is computed as the average value of the asset returns with a value lower than the VaR.
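For illustration, this "real" VaR and ES computation can be sketched in a few lines of Python (the thesis implementation itself is in MATLAB; the function name below is introduced only for exposition):

```python
import numpy as np

def empirical_var_es(returns, alpha=0.05):
    """Real VaR as the alpha-quantile of the simulated returns and
    real ES as the average of the returns below that VaR."""
    returns = np.asarray(returns)
    var = np.quantile(returns, alpha)        # 5% quantile: the real VaR level
    exceedances = returns[returns < var]     # returns worse than the VaR
    es = exceedances.mean()                  # real ES: mean return beyond the VaR
    return var, es

# Example with one year (250 days) of standard normal returns
rng = np.random.default_rng(42)
var, es = empirical_var_es(rng.standard_normal(250))
```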

Testing the accuracy of a backtest is done by looking at the size and power of the test. The power of a backtest is the probability that the backtest rejects the null hypothesis given that the null hypothesis is false, whereas the size of the test is the maximum probability of rejecting the null hypothesis when the null hypothesis is true. To obtain the size and power of the backtests of McNeil and Frey (2000) and Righi and Ceretta (2012), they are applied to the ES computed from the simulated returns. When the model-generated ES is rightly estimated, it is expected that an accurate backtest approves the model that was used to estimate the ES. In this case, an accurate backtest would have a size of 0.05. When the model-estimated ES differs from the real ES of the returns, the results of an accurate backtest should indicate that the used model does not rightly estimate the ES. Therefore it is expected that the backtest has high power.

2.2.1 Data

The backtest procedures are tested by means of the returns $x_t$ for $t = 1, \dots, 250$. The number 250 is chosen because it represents the number of trading days in one year. The required returns are obtained through Monte Carlo simulation.

2.2.2 Simulation

The Monte Carlo simulation is performed in MATLAB. The first step is simulating the returns $x_t$ for $t = 1, \dots, 250$ twice. In one simulation the returns are drawn from a standard normal distribution and in the second simulation they are drawn from a t location-scale distribution. After simulating the data, the ES is computed based on normality and on a student-t distribution in both simulations. The third step in the simulations is applying four forms of backtests. First the backtest of McNeil and Frey (2000) is applied under a normal distribution and again with a student-t distribution. The same is done for Righi and Ceretta's (2012) approach.

These simulations are repeated for $i = 1, \dots, 10{,}000$. After 10,000 repetitions, the mean rejection percentage of the backtests is calculated, which gives an estimate of the power or size of each individual backtest.
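The overall experiment can be summarised in the following Python-style sketch (the original code is in MATLAB; run_backtests is a hypothetical helper that would return one p-value per backtest configuration):

```python
import numpy as np

def simulate_returns(n_days=250, dist="normal", df=5, rng=None):
    """Draw one year of i.i.d. daily returns under the chosen distribution."""
    rng = np.random.default_rng() if rng is None else rng
    if dist == "normal":
        return rng.standard_normal(n_days)
    # Raw Student-t draws; the exact t location-scale parameterisation used
    # in the thesis is not specified here (assumption).
    return rng.standard_t(df, n_days)

# 10,000 replications of 250 returns, four backtests per replication
# (two methods, each testing the normal ES and the student-t ES).
n_reps, rejections = 10_000, np.zeros(4)
rng = np.random.default_rng(0)
for _ in range(n_reps):
    x = simulate_returns(dist="normal", rng=rng)
    p_values = run_backtests(x)              # hypothetical helper, 4 p-values
    rejections += np.asarray(p_values) < 0.05
rejection_rate = rejections / n_reps          # empirical size or power per backtest
```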


3 Risk measures

In this section, background on the returns of a portfolio is provided, followed by some preliminary definitions of the risk measures utilized in this paper. After each definition, the problems related to the discussed risk measure are explained.

3.1 Returns of a portfolio

Due to regulators such as the Committee, financial institutions are obliged to use risk measures to determine how much capital should be kept in reserve in order to cover potential losses. To determine this level of reserve capital, a risk measure needs information about the returns of the financial assets of the financial institutions. These returns are given by

(3.1)  $x_t = \mu_t + \epsilon_t, \qquad \epsilon_t = \sigma_t z_t$

Here $x_t$, $t \in \mathbb{Z}$, represents the return on the value of a portfolio at time $t$. Because $x_t$ is a return, losses have a negative sign and profits a positive sign. $\mu_t$ and $\sigma_t^2$ are respectively the conditional mean and variance of the return at time $t$, and $\epsilon_t$ is the shock that hits the expected value of the return at time $t$, with $z_t$ a collection of i.i.d. continuous random variables from a location-scale family with zero mean and unit variance. These random variables can assume different probability distribution functions.

3.2 Value at Risk (VaR)

Value at Risk (VaR) has been appointed by regulators like the Committee as the basis of a firm's risk measurement model since 1996. Because of this, VaR became one of the most widely used risk measures, providing risk managers with insight into the probable amount of loss that can occur during a certain period.

Acerbi and Tasche (2002) construct the VaR of a given portfolio by defining $X$ as the random variable that describes the future profit or loss of the portfolio at time $T$, measured from today. $\alpha = A\% \in (0,1)$ is a percentage which represents the $A\%$-quantile of the returns from $t = 1$ till $t = T$.


Acerbi and Tasche define the quantile and VaR of a portfolio as follows:

Let $x^{(\alpha)}$ be the related quantile of the distribution, $\alpha \in (0,1)$, and let $X$ be a random variable.

(3.2)  $x^{(\alpha)}(X) = \sup\{\,x \mid P[X \le x] \le \alpha\,\}$

(3.3)  $\mathrm{VaR}^{(\alpha)}(X) = x^{(\alpha)}(X)$

Thus, the VaR of a portfolio provides us with the minimum loss incurred in the A% worst cases of our portfolio. In other words, VaR is a threshold value for which the probability that the portfolio exceeds this value is smaller than the given probability level

α. This can be written as

(3.4)  $P\!\left[\,x_{t+1} < \mathrm{VaR}_t^\alpha\,\right] = \alpha$

3.2.1 Problems with VaR

Acerbi and Tasche (2002) state that $\mathrm{VaR}^{\alpha}(X)$ is the minimum loss incurred in the $A\%$ worst cases of our portfolio. The problem that arises with this is as follows: the probable minimal loss only gives us the threshold of the possible $A\%$ losses, and is thus indifferent about the losses beyond that level. In times of crisis, or periods with low market confidence, it can occur that the returns are in the left tail of the profit and loss distribution, due to which VaR is not a good approximation of the probable market risk in these periods. An example is given in a review of the financial market events in the autumn of 1998, which shows that during the financial market crisis of that autumn "events were in the "tails" of distribution and that therefore VaR models were useless for measuring and monitoring market risk" (Committee on the Global Financial System, 1999, p. 41). To overcome this problem one should look at the expected loss in the $A\%$ worst cases of the portfolio instead of only looking at the threshold given by the VaR.

Another problem with VaR is that it does not conform to the description of being a coherent risk measure because it is not sub-additive (see Artzner et al., 1999). This means that there is the possibility that diversification of a portfolio does not decrease the VaR of this portfolio.


3.3 Expected Shortfall (ES)

The fact that VaR is not a reliable risk measure when markets are stressed and returns tend to be in the left tail of the probability density distribution became impossible to ignore after the last financial crisis. Thus, in 2013 the Committee proposed the switch from VaR to Expected Shortfall (ES) to model market risk capital requirements (Basel, 2013). The ES of a portfolio can be described as the expected loss of a portfolio conditional on the loss being worse than the VaR. It measures the average of the losses beyond the VaR level and thus gives the expected return of the portfolio in the $A\%$ worst cases. However, because ES considers the loss beyond the VaR level as a conditional expectation, the effectiveness of ES relies on an accurate estimation of the tail of the distribution (see Yamai and Yoshiba, 2002).

A way to define ES is as follows:

(3.5)  $ES_t^\alpha = E\!\left[\,x_{t+1} \mid x_{t+1} < \mathrm{VaR}_t^\alpha\,\right]$

The ES with probability level $\alpha$ at time $t$ is equal to the expected value of the return $x_{t+1}$ conditional on the return being smaller than the VaR at time $t$ with confidence level $\alpha$. Because the returns that exceed the VaR level are negative, the value of ES is also negative.

3.3.1 Problems with ES

Although ES does not face the problems of VaR (it is a coherent risk measure and it works better in adverse market situations because it measures the expected value of the losses beyond the VaR level), there are still problems with ES. One of these problems concerns the user-friendliness of the risk measure. VaR is an easily operated risk measure, but measuring the ES of a portfolio requires more knowledge and increases the workload of risk managers. Furthermore, getting a good estimate of the ES of a portfolio requires more data than VaR estimation, because losses beyond the VaR level occur only rarely. For instance, a 99% VaR is only violated 1 per cent of the time, which results in about 2.5 violations per year that should provide a basis for the ES. The switch in itself is also costly; it requires training of risk managers, changing the systems and designing new backtest procedures. And besides the costs related to designing new backtesting procedures, the backtesting procedures for ES are in essence problematic, for both financial institutions and regulators.


Backtesting procedures are needed to check if the model used by risk managers to estimate their risk measures is accurate enough. While the backtesting procedure of VaR is straightforward, the backtesting of ES is complex and requires more data. To backtest the model estimated VaR with 99% confidence level one checks if the returns on the portfolio exceeded this VaR more than 1% of the time. Backtesting an ES model involves more than only checking the number of returns that exceed the VaR, it also needs to compare the average of these returns that exceeded the VaR level with the estimated ES. And as previously mentioned, estimation of ES requires more data than VaR estimation, and consequently the backtest of ES requires even more data.

4 Backtesting Approaches

This section treats the definition of backtesting and two existing backtests for ES. First, the backtesting approach of McNeil and Frey (2000) is presented together with the difficulties of this method. Subsequently, the problems of other existing methods for ES backtesting are explained on the basis of the paper of Righi and Ceretta (2012), followed by Righi and Ceretta's own method of backtesting ES.

4.1 Backtesting

One of the implications of the switch from VaR to ES is the set of problems that arise with the backtesting of the model-generated ES. Backtesting methods are used to examine which model is best used in estimating the ES of a portfolio, by comparing the actual value of the VaR exceedances with those predicted by the model-generated ES measure.

When risk managers work with models in which the real average of the returns that exceed the VaR level is significantly worse than the model-generated ES, this leads to an underestimation of the risk of a portfolio and, consequently, a large chance of not being able to absorb the losses that are made in the portfolio. In these cases, a backtest should indicate that the used model underestimates the ES and that this model is therefore not suitable for risk management. For regulatory reasons it is of great importance that there are accurate backtests for ES.

In this paper two pre-existing backtesting procedures for ES are considered and tested for their adequacy: the one of McNeil and Frey (2000) and the one of Righi and Ceretta (2012). One of the reasons why these papers are chosen for this research is the time span between them. Because of this gap of twelve years, Righi and Ceretta (2012) had the opportunity to improve on the defects of previously designed ES backtests, including the one of McNeil and Frey (2000). The question that therefore arises is whether the knowledge about the weaknesses of older backtests has made it possible for Righi and Ceretta (2012) to design a backtest that works considerably better than other existing backtests such as the one of McNeil and Frey (2000). If this is the case, designing a backtest for ES that is not sensitive to the current problems might be a future possibility and also an answer for regulators such as the Committee.

4.2 McNeil and Frey

4.2.1 McNeil and Frey backtesting procedures

For backtesting ES, McNeil and Frey (2000) are interested in the size of the discrepancy between the returns $x_t$ and the model-generated $ES_t$ in the event of a violation of the $\mathrm{VaR}_t$ level. The first step in finding the size of this discrepancy is to define a distinct variable that contains the returns that exceed the $\mathrm{VaR}_t$. These returns are called exceedances throughout this paper, and are defined by

(4.1)  $exc_{t+1} = x_{t+1} \mid x_{t+1} < \mathrm{VaR}_t^\alpha$

Subsequently, the discrepancy between these exceedances and the model-generated $ES_t^\alpha$ is calculated by means of the following statistic, which is called the exceedance residual:

(4.2)  $r_{t+1} = \dfrac{exc_{t+1} - ES_t^\alpha}{\sigma_{t+1}}$

McNeil and Frey (2000) state that if the ES is rightly estimated, the size of the discrepancy between the occurred returns and the estimated ES is as such that the exceedance residuals are i.i.d. and have zero mean.

When backtesting the model-generated $ES_t$, McNeil and Frey (2000) check for each of the occurred exceedances whether it differs significantly from the estimated risk measure. In order to carry this out, they compute the corresponding exceedance residual of each exceedance and use an upper one-sided test of the hypothesis that the sample of these exceedance residuals has zero mean ($H_0$) against the alternative hypothesis $H_1$ that the mean of the sample residuals is greater than zero. If $H_1$ is accepted, the estimated risk measure wrongly predicts the average of the returns that exceed the $\mathrm{VaR}_t$ level. The choice for an upper one-sided test is made because underestimation of ES can lead to the insolvency of financial institutions and is thus of more importance to regulators than overestimation of the ES of the portfolio. This does not mean that a model that overestimates the ES of a portfolio has no bad effects; in the long term, overestimation is inefficient and may cause problems due to failing to achieve sufficient profit.

The null hypothesis is accepted or rejected on the basis of the p-value of the test. A p-value gives the smallest significance level at which $H_0$ can be rejected, based on the value of the test statistic. Thus, the p-value is the probability that a value at least as extreme as the observed average of the exceedance residuals occurs under the null hypothesis. A p-value of 0.03 leads to accepting the null hypothesis at a 0.01 significance level and to rejecting it at a 0.05 significance level. The p-value is calculated as follows:

(4.3)  $p\text{-value} = 1 - \Phi(z_0), \qquad z_0 = \dfrac{\bar r}{\sigma_r / \sqrt{m}}, \qquad \bar r = \dfrac{1}{m}\sum_{i=1}^{m} r_i$

with $\sigma_r$ the standard deviation of the exceedance residuals, $\Phi$ the cumulative distribution function of the random variables, and $m$ the number of exceedances.
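Under the definitions above, the test can be sketched in Python as follows (an illustrative sketch, not the authors' original code; sigma stands for the unconditional standard deviation used in equation (4.2)):

```python
import numpy as np
from scipy import stats

def mcneil_frey_pvalue(returns, var_level, es_estimate, sigma):
    """Upper one-sided test of H0: the exceedance residuals have zero mean."""
    returns = np.asarray(returns)
    exceedances = returns[returns < var_level]        # violations of the VaR level
    m = exceedances.size
    if m < 2:
        return np.nan                                 # too few violations to test
    residuals = (exceedances - es_estimate) / sigma   # exceedance residuals, eq. (4.2)
    z0 = residuals.mean() / (residuals.std(ddof=1) / np.sqrt(m))
    return 1.0 - stats.norm.cdf(z0)                   # p-value as in eq. (4.3)
```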

4.2.2 Problem with McNeil and Frey

One of the problems with the backtest of McNeil and Frey (2000) can be related to the problems of ES that were previously discussed in section 3.3.1. In this section it was mentioned that a large amount of data is needed to estimate the ES of a portfolio and in the same sense the backtest of McNeil and Frey also requires large amounts of data to obtain accurate exceedance residuals. This means that the test statistic relies on a large sample for convergence to the limiting distributions, which can cause problems in calculating the p-value of the test when test samples are small (see Wong, 2008, p. 1405). This will be further discussed by Righi and Ceretta (2012) in the next section.


4.3 Righi and Ceretta

Righi and Ceretta (2012) perceive some disadvantages in the backtest approach of several researchers including McNeil and Frey. They describe this as follows:

The backtests […] rely on asymptotic test statistics that might be inaccurate when the sample size is small, which could penalize financial institutions based on incorrect estimation. Furthermore, these authors compute the required p-value based on the full sample size rather than conditional on the number of exceptions. (p. 4)

Righi and Ceretta (2012) took the above-mentioned disadvantages into account when they designed their own backtesting approach. The first way in which their approach improves on the older approaches is the use, in their test statistic, of the dispersion of the distribution truncated by the estimated VaR limit, instead of an asymptotic test statistic. The dispersion of the truncated distribution denotes how the distribution of the exceedance residuals, which is restricted by the VaR level, is spread. Righi and Ceretta (2012) refer to this dispersion as the shortfall deviation (SD). The second improvement is that the backtest is not limited to the standard normal assumption. By allowing different distributions, the approach of Righi and Ceretta (2012) should be more flexible than the previously proposed backtests. Finally, Righi and Ceretta's (2012) backtesting procedure makes it possible to test whether individual violations of the VaR level differ significantly from the ES. This allows faster verification of model error, making it possible to act quickly in situations where extreme financial losses can occur due to market risk.

4.3.1 Righi and Ceretta's backtesting procedure

To test models for ES, Righi and Ceretta (2012) estimate the expected loss of the exceedances and their dispersion through the ES and SD. Thus, just as McNeil and Frey (2000), they test whether an empirical exceedance is significantly worse than predicted by the model-generated ES, with the only difference being the use of the SD instead of the whole-distribution standard deviation. The test statistic of Righi and Ceretta is

(4.4)  $r_{t+1} = \dfrac{exc_{t+1} - ES_t^\alpha}{SD_t^\alpha}$

with the SD equal to

(4.5)  $SD_t^\alpha = \left(\mathrm{Var}\!\left[\,x_{t+1} \mid x_{t+1} < \mathrm{VaR}_t^\alpha\,\right]\right)^{1/2} = \left(\mathrm{Var}\!\left[\,exc_{t+1}\,\right]\right)^{1/2}$

The SD is perceived as a better estimate than the whole-sample standard deviation because it focuses on the negative returns in the left tail, which are most important for risk management. Also, Righi and Ceretta (2012) are of the opinion that it is better to use a dispersion measure that is intrinsic to the returns that exceed the VaR level in order to calculate the size of the discrepancy between the exceedances and the ES, instead of a dispersion measure that depends on the whole sample distribution.

In this paper, the backtest of Righi and Ceretta (2012) is implemented in the same way as McNeil and Frey's (2000). Using the test statistic in (4.4), the null hypothesis that the exceedance residuals have zero mean is tested through an upper one-sided test as in equation (4.3).
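A corresponding Python sketch of this variant, again purely illustrative: the only changes relative to the McNeil and Frey sketch are the SD in the denominator and, when a student-t distribution is assumed, a t-based p-value (the df argument is an assumption introduced here for that case):

```python
import numpy as np
from scipy import stats

def righi_ceretta_pvalue(returns, var_level, es_estimate, sd_estimate, df=None):
    """Zero-mean test of the exceedance residuals scaled by the shortfall deviation."""
    returns = np.asarray(returns)
    exceedances = returns[returns < var_level]
    m = exceedances.size
    if m < 2:
        return np.nan
    residuals = (exceedances - es_estimate) / sd_estimate   # test statistic, eq. (4.4)
    z0 = residuals.mean() / (residuals.std(ddof=1) / np.sqrt(m))
    if df is None:
        return 1.0 - stats.norm.cdf(z0)       # normal case
    return 1.0 - stats.t.cdf(z0, df)          # student-t case
```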

5 Adequacy testing

This section treats the adequacy of backtests and how it is tested whether a backtest is adequate. First, the concept of adequacy testing and the motivation for testing the adequacy of backtests are explained. Important elements of adequacy testing are the test size and power, so this section also explains how to obtain the size and power of a backtest. Finally, the analytical forms of VaR, ES and SD are provided in order to derive the test size and power.

5.1 Testing the adequacy of backtests

Testing the adequacy of a backtest means testing if a backtest is accurate enough to be used for model testing. The aforementioned complications with ES backtesting underline the urgency of adequacy testing in order to find the most suitable backtest for ES.

Furthermore, testing the adequacy of backtests can provide insight into the deficiencies of pre-existing backtests and can thus show in which areas these backtests should be improved. Additionally, the announced change from VaR to ES places even more pressure on finding an adequate ES backtest; the usage of an improper backtest to assess the model-generated ES can lead to accepting risk measures that do not estimate the risk of a portfolio correctly.

Investigation of the empirical size and power of a backtest can provide insight into the adequacy of the test. The size of a backtest is the maximum probability of it rejecting the null hypothesis when the null hypothesis is true, and the power of a backtest is the probability that the test rejects the null hypothesis when the null hypothesis is false. In order to investigate empirical test sizes and powers, a Monte Carlo simulation is needed to apply the backtests to a set of returns.

5.2 Statistics

To gain insight into the size and power of the backtests, the results of the tests need to be reported in terms of the p-values as seen in equation (4.3). These p-values are based on the value of the test statistic, and in order to determine the test statistics, the analytical forms of VaR, ES and SD are needed. Thus, the first step in adequacy testing is simplifying the definitions that were given in sections 3 and 4 in order to apply the backtests to a simulated set of returns. Since the Monte Carlo simulation simulates returns under normality and under student-t, the analytical forms of ES and SD need to be computed twice: assuming normality and assuming student-t. In this paper the VaR, ES and SD used are unconditional.

5.2.1 Analytical form of VaR

The returns $x_t$ are defined as $\mu_t + \sigma_t z_t$, which gives the following unconditional VaR:

(5.1)  $P\!\left[\,x_t < \mathrm{VaR}^\alpha\,\right] = \alpha \;\;\Longleftrightarrow\;\; P\!\left[\,\mu + \sigma z_t < \mathrm{VaR}^\alpha\,\right] = \alpha \;\;\Longleftrightarrow\;\; P\!\left[\,z_t < \tfrac{\mathrm{VaR}^\alpha - \mu}{\sigma}\,\right] = \alpha$

$z_t$ is an i.i.d. continuous random variable which can assume different probability distribution functions. Therefore, the cumulative distribution function and the probability density function of $z_t$ are denoted as $F(z)$ and $f(z)$, or as $\Phi$ and $\phi$. Thus, equation (5.1) becomes

(5.2)  $F\!\left(\tfrac{\mathrm{VaR}^\alpha - \mu}{\sigma}\right) = \alpha \;\;\Longleftrightarrow\;\; \tfrac{\mathrm{VaR}^\alpha - \mu}{\sigma} = F^{-1}(\alpha) \;\;\Longleftrightarrow\;\; \mathrm{VaR}^\alpha = \mu + \sigma F^{-1}(\alpha)$
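Equation (5.2) translates directly into code; a small Python sketch (the dist argument and its default are introduced here only for illustration):

```python
from scipy import stats

def analytical_var(mu, sigma, alpha=0.05, dist=stats.norm):
    """Unconditional VaR of equation (5.2): mu + sigma * F^{-1}(alpha)."""
    return mu + sigma * dist.ppf(alpha)

var_normal = analytical_var(0.0, 1.0)                    # about -1.645
var_t5 = analytical_var(0.0, 1.0, dist=stats.t(df=5))    # about -2.015
```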

5.2.2 Analytical form of ES under normality

The ES of a portfolio is the expected value of the returns that exceed the $\mathrm{VaR}^\alpha$ level. Therefore, the unconditional ES is denoted by

(5.3)  $ES^\alpha = E\!\left[\,x_t \mid x_t < \mathrm{VaR}^\alpha\,\right] = \mu + \sigma\, E\!\left[\,z_t \mid z_t < F^{-1}(\alpha)\,\right]$

Under normality the expectation truncated by a superior limit $L$ can be written as $E[X \mid X < L] = -\dfrac{\phi(L)}{\Phi(L)}$ (see Righi and Ceretta, 2012). This transforms the expectation truncated at the limit $F^{-1}(\alpha)$ into

(5.4)  $E\!\left[\,z_t \mid z_t < \Phi^{-1}(\alpha)\,\right] = -\dfrac{\phi\!\left(\Phi^{-1}(\alpha)\right)}{\Phi\!\left(\Phi^{-1}(\alpha)\right)} = -\dfrac{\phi\!\left(\Phi^{-1}(\alpha)\right)}{\alpha}$

Substituting this in equation (5.3) gives the following analytical form of ES under normality:

(5.5)  $ES^\alpha = \mu - \sigma\, \dfrac{\phi\!\left(\Phi^{-1}(\alpha)\right)}{\alpha}$
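As a quick numerical check, equation (5.5) can be evaluated in Python (illustrative sketch):

```python
from scipy import stats

def es_normal(mu, sigma, alpha=0.05):
    """Analytical ES under normality, equation (5.5)."""
    q = stats.norm.ppf(alpha)                      # Phi^{-1}(alpha)
    return mu - sigma * stats.norm.pdf(q) / alpha

# For mu = 0, sigma = 1 and alpha = 0.05 this gives about -2.06.
```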


5.2.3 Analytical form of ES under student-t

For this analytical form it is assumed that the random variables $z_t$ follow a student-t distribution. This means that the expectation truncated by a superior limit, as seen in equation (5.4), now needs to be deduced from the truncated student-t distribution. Based on the moments of the truncated t distribution in Nadarajah and Kotz (2008), the first moment of a student-t variable truncated from above at $L$ can be written as

(5.6)  $E[X \mid X < L] = \dfrac{1}{\sqrt{v}\,\beta\!\left(\tfrac{v}{2},\tfrac{1}{2}\right) t_v(L)} \left[\, \dfrac{L^2}{2}\, G\!\left(\dfrac{v+1}{2},\, 1;\, 2;\, -\dfrac{L^2}{v}\right) - \dfrac{v}{v-1} \,\right]$

with $v$ the degrees of freedom, $t_v(L) = F(L)$, and $\beta(\cdot\,,\cdot)$ and $G(\cdot\,,\cdot\,;\cdot\,;\cdot)$ respectively the Beta and Gauss hypergeometric functions, defined by $\beta(a,b) = \int_0^1 w^{a-1}(1-w)^{b-1}\,dw$ and $G(a,b;c;x) = \sum_{k=0}^{\infty} \dfrac{(a)_k (b)_k}{(c)_k} \dfrac{x^k}{k!}$. Equation (5.6) is only finite for $v > 1$. Using equation (5.6), the expectation truncated at the limit $F^{-1}(\alpha)$ can be written as

(5.7)  $E\!\left[\,z_t \mid z_t < t_v^{-1}(\alpha)\,\right] = \dfrac{1}{\sqrt{v}\,\beta\!\left(\tfrac{v}{2},\tfrac{1}{2}\right) \alpha} \left[\, \dfrac{\left(t_v^{-1}(\alpha)\right)^2}{2}\, G\!\left(\dfrac{v+1}{2},\, 1;\, 2;\, -\dfrac{\left(t_v^{-1}(\alpha)\right)^2}{v}\right) - \dfrac{v}{v-1} \,\right]$

Substituting this in equation (5.3) gives the following analytical form of ES under student-t, for $v > 1$:

(5.8)  $ES^\alpha = \mu + \sigma\, \dfrac{1}{\sqrt{v}\,\beta\!\left(\tfrac{v}{2},\tfrac{1}{2}\right) \alpha} \left[\, \dfrac{\left(t_v^{-1}(\alpha)\right)^2}{2}\, G\!\left(\dfrac{v+1}{2},\, 1;\, 2;\, -\dfrac{\left(t_v^{-1}(\alpha)\right)^2}{v}\right) - \dfrac{v}{v-1} \,\right]$

There is also a less complicated way to express the first moment of a truncated student-t variable with $v$ degrees of freedom. Instead of the expression of the first moment in (5.6), Broda and Paolella (2011) write it, for $L = t_v^{-1}(\alpha)$, as

(5.9)  $E[X \mid X < L] = -\dfrac{1}{\alpha}\, f_v\!\left(t_v^{-1}(\alpha)\right) \dfrac{v + \left(t_v^{-1}(\alpha)\right)^2}{v-1}$

with $f_v$ the probability density function of the student-t distributed random variables. Substituting this in equation (5.3) gives the following analytical form of ES under student-t, for $v > 1$:

(5.10)  $ES^\alpha = \mu - \sigma\, \dfrac{1}{\alpha}\, f_v\!\left(t_v^{-1}(\alpha)\right) \dfrac{v + \left(t_v^{-1}(\alpha)\right)^2}{v-1}$
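Equation (5.10) is straightforward to evaluate; a short Python sketch for illustration:

```python
from scipy import stats

def es_student_t(mu, sigma, df, alpha=0.05):
    """Analytical ES under a student-t assumption, equation (5.10)."""
    q = stats.t.ppf(alpha, df)                                   # t_v^{-1}(alpha)
    tail_mean = stats.t.pdf(q, df) * (df + q**2) / ((df - 1) * alpha)
    return mu - sigma * tail_mean

# For mu = 0, sigma = 1, df = 5 and alpha = 0.05 this gives about -2.89.
```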

5.2.4 Analytical form of SD under normality

The SD is the square root of the variance of a return that is truncated by the VaR. The unconditional SD of a variable is defined as

(5.11)  $SD^\alpha = \left(\mathrm{Var}\!\left[\,x_t \mid x_t < \mathrm{VaR}^\alpha\,\right]\right)^{1/2} = \left(\mathrm{Var}\!\left[\,\mu + \sigma z_t \,\middle|\, z_t < \tfrac{\mathrm{VaR}^\alpha - \mu}{\sigma}\,\right]\right)^{1/2} = \left(\sigma^2\, \mathrm{Var}\!\left[\,z_t \mid z_t < F^{-1}(\alpha)\,\right]\right)^{1/2}$

Under normality the variance truncated by a superior limit $L$ can be written as $\mathrm{Var}[X \mid X < L] = 1 - L\,\dfrac{\phi(L)}{\Phi(L)} - \left(\dfrac{\phi(L)}{\Phi(L)}\right)^2$ (see Righi and Ceretta, 2012). This transforms the variance truncated at the limit $F^{-1}(\alpha)$ into

(5.12)  $\mathrm{Var}\!\left[\,z_t \mid z_t < \Phi^{-1}(\alpha)\,\right] = 1 - \Phi^{-1}(\alpha)\, \dfrac{\phi\!\left(\Phi^{-1}(\alpha)\right)}{\alpha} - \left(\dfrac{\phi\!\left(\Phi^{-1}(\alpha)\right)}{\alpha}\right)^2$

Substituting this in equation (5.11) gives the following analytical form of SD under normality:

(5.13)  $SD^\alpha = \sigma \left[\, 1 - \Phi^{-1}(\alpha)\, \dfrac{\phi\!\left(\Phi^{-1}(\alpha)\right)}{\alpha} - \left(\dfrac{\phi\!\left(\Phi^{-1}(\alpha)\right)}{\alpha}\right)^2 \,\right]^{1/2}$
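Equation (5.13) in Python (illustrative sketch):

```python
import numpy as np
from scipy import stats

def sd_normal(sigma, alpha=0.05):
    """Shortfall deviation under normality, equation (5.13)."""
    q = stats.norm.ppf(alpha)              # truncation point Phi^{-1}(alpha)
    ratio = stats.norm.pdf(q) / alpha      # phi(q) / Phi(q)
    return sigma * np.sqrt(1.0 - q * ratio - ratio**2)

# For sigma = 1 and alpha = 0.05 this gives about 0.37.
```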


5.2.5 Analytical form of SD under student-t

To deduce the analytical form of SD under student-t, the second and first moments of the truncated student-t distribution are needed to calculate the variance of the truncated $z_t$. Based on the truncated moments in Nadarajah and Kotz (2008), the variance of a student-t variable truncated from above at $L$ can be written as

(5.14)  $\mathrm{Var}[X \mid X < L] = \dfrac{1}{t_v(L)} \left[\, \dfrac{v}{2(v-2)} + \dfrac{L^3}{3\sqrt{v}\,\beta\!\left(\tfrac{v}{2},\tfrac{1}{2}\right)}\, G\!\left(\dfrac{v+1}{2},\, \dfrac{3}{2};\, \dfrac{5}{2};\, -\dfrac{L^2}{v}\right) \,\right] - \left(E[X \mid X < L]\right)^2$

When equation (5.9) is used for the first moment, the variance of $z_t$ truncated at the superior limit $F^{-1}(\alpha) = t_v^{-1}(\alpha)$ becomes

(5.15)  $\mathrm{Var}\!\left[\,z_t \mid z_t < t_v^{-1}(\alpha)\,\right] = \dfrac{1}{\alpha} \left[\, \dfrac{v}{2(v-2)} + \dfrac{\left(t_v^{-1}(\alpha)\right)^3}{3\sqrt{v}\,\beta\!\left(\tfrac{v}{2},\tfrac{1}{2}\right)}\, G\!\left(\dfrac{v+1}{2},\, \dfrac{3}{2};\, \dfrac{5}{2};\, -\dfrac{\left(t_v^{-1}(\alpha)\right)^2}{v}\right) \,\right] - \left[\, \dfrac{1}{\alpha}\, f_v\!\left(t_v^{-1}(\alpha)\right) \dfrac{v + \left(t_v^{-1}(\alpha)\right)^2}{v-1} \,\right]^2$

Substituting this in equation (5.11) gives the following analytical form of SD under student-t, which is finite for $v > 2$:

(5.16)  $SD^\alpha = \sigma \left(\mathrm{Var}\!\left[\,z_t \mid z_t < t_v^{-1}(\alpha)\,\right]\right)^{1/2}$
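For a numerical check of the student-t SD, the truncated moments can also be obtained by numerical integration, which avoids the hypergeometric expressions; an illustrative Python sketch:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def sd_student_t(sigma, df, alpha=0.05):
    """Shortfall deviation under a student-t assumption via numerical integration.
    The closed forms (5.14)-(5.16) give the same value."""
    q = stats.t.ppf(alpha, df)
    first, _ = quad(lambda x: x * stats.t.pdf(x, df), -np.inf, q)
    second, _ = quad(lambda x: x**2 * stats.t.pdf(x, df), -np.inf, q)
    trunc_mean = first / alpha
    trunc_var = second / alpha - trunc_mean**2
    return sigma * np.sqrt(trunc_var)

# For sigma = 1, df = 5 and alpha = 0.05 this gives about 1.04.
```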


6 Monte Carlo simulations

In this section Monte Carlo simulations are carried out to test the adequacy of the backtests of McNeil and Frey (2000) and Righi and Ceretta (2012). For the Monte Carlo simulations, a 95% confidence level is used for computation of VaR, ES and SD. The portfolio returns are simulated assuming normality and student-t. For each simulation, the p-values of the backtests are reported and a 5% significance level is used to check if the backtest rejects or accepts the null hypothesis. After 10,000 simulations, the mean rejection rate of the backtests is calculated in order to compute the empirical size or power of the backtest.

6.1 The simulations

Two Monte Carlo simulations are carried out to test the adequacy of the backtests. The first simulation assumes normality; this means that the portfolio returns are drawn i.i.d. from a standard normal distribution. The second simulation assumes student-t, so the portfolio returns are drawn i.i.d. from a t distribution.

In both simulations ES is computed under the null and under the alternative hypothesis and the backtests of McNeil and Frey (2000) and Righi and Ceretta (2012) are applied to test whether these ES estimates are correct. Afterwards, the sizes and powers of the tests are calculated by using the rejection rate of the backtests.

6.1.2 Hypothesis testing

When the returns are generated under normality, it is expected that the backtests accept the null hypothesis when the ES is computed under normality and reject the null hypothesis when the ES assumes a student-t distribution. For the simulation that assumes student-t, the opposite is true. Because both test size and power are needed to test the adequacy of the backtests, two backtests of both McNeil and Frey (2000) and Righi and Ceretta (2012) are needed in each simulation: one testing the standard normal ES and the other testing the student-t distributed ES. Thus, four backtests of ES are carried out per simulation. Depending on the distribution used to simulate the returns, one backtest will give information about the test size and the other about the power of the test.

Given that the backtests are testing ES computed under normality and under student-t, the test statistics of the backtests as seen in (4.2) and (4.4) take two forms in the simulations: one to test ES computed under normality and the other to test ES computed under student-t. For the backtest of McNeil and Frey (2000) this means that to test the ES that assumes normal returns, equation (4.2) is performed with the unconditional standard deviation and the ES as denoted in equation (5.5). To test the ES that assumes student-t distributed returns, the ES from equation (5.10) is used. In their own paper, McNeil and Frey (2000) do not consider a distribution other than the normal distribution. Because of this, the p-value of their test is in both cases calculated using the normal distribution.

When Righi and Ceretta's (2012) backtest is performed to test ES computed under normality, their test statistic as seen in equation (4.4) uses the ES from equation (5.5) and the SD from equation (5.13). In the case of normality, the p-value of their test is calculated by means of the normal distribution. Testing ES computed under student-t is done by using the ES and SD from equations (5.10) and (5.16), respectively, in their test statistic. Because Righi and Ceretta's (2012) backtest is not limited to the standard normal assumption, the p-value is in this case calculated under the student-t distribution.

6.1.3 Empirical test size and power

For each simulation, the p-values of the backtests are recorded and a 5% significance level is used to test if the empirical value of the test statistic differs significantly from the null hypothesis. Rejection of the null hypothesis occurs if the p-value has a value that is less than the significance level of 0.05. By dividing the number of times that a backtest rejects in one Monte Carlo simulation by the total number of simulations, the size or power of the test is obtained.

Under the standard normal null hypothesis, this rejection rate is equal to the size of backtests that tested normal distributed ES and to the power of backtests that tested student-t distributed ES. The opposite is true under the student-t null hypothesis. An accurate backtest has a size close to the chosen significance level of the test.
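Given the stored p-values of one backtest configuration, the empirical size or power is simply the rejection frequency; in Python, for illustration:

```python
import numpy as np

def rejection_rate(p_values, significance=0.05):
    """Share of Monte Carlo replications in which the null hypothesis is rejected.
    Under a true null this estimates the size; under a false null, the power."""
    return np.mean(np.asarray(p_values) < significance)
```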


6.2 Results

6.2.1 Monte Carlo with normally distributed returns

The first results are from the Monte Carlo simulation in which the returns were simulated under normality. Here, the empirical test sizes and powers of the backtests of McNeil and Frey (2000) and Righi and Ceretta (2012) are based on 10,000 simulations of 250 normally distributed returns. The confidence level of the VaR, ES and SD is set to 95% and the p-values of each simulation are calculated using equation (4.3). The null hypothesis is rejected if a simulation has a p-value below the significance level of 0.05. It is recorded for how many simulations a backtest rejects the null hypothesis, and this number is divided by the total number of simulations.

Table 1 shows that the backtests of both McNeil and Frey (2000) and Righi and Ceretta (2012) rejected the ES that was simulated under the null 1181 of the 10,000 times. This means that both backtests reject a true null hypothesis 11.81 per cent of the time, which makes the backtests inadequate in the standard normal case, given that the acceptable level of error is set at 0.05. Rejecting a false null hypothesis is referred to as the power of the test and, as seen in Table 1, both backtests have relatively low power. McNeil and Frey's (2000) backtest has a power of 12.19 per cent and Righi and Ceretta's (2012) power is even lower than the size of their test; they reject a false null hypothesis only 9.88 per cent of the time, which also indicates inadequacy.

The power of a test is equal to one minus the type II error probability of the test: $1 - P[\text{accepting } H_0 \mid H_0 \text{ is false}]$. Thus, a low power indicates that the test has a high probability of making a type II error. Based on the consequences of the errors, type II errors might be more dangerous for the determination of capital requirements because they lead to the acceptance of an ES that underestimates the risk of the portfolio while it is still believed that the used ES is right. A rejection of a true null hypothesis is also unfavourable, but it forces us to check the model that was used to estimate the ES, which gives the possibility that the type I error will be detected. Thus, even though both backtests are inadequate, the backtest of McNeil and Frey (2000) should be preferred over the one of Righi and Ceretta (2012) because of the higher power of the former.


Table 1

Empirical test size and power

ES                            McNeil and Frey    Righi and Ceretta
H0   Number of rejections     1181               1181
     Size of test             0.1181             0.1181
H1   Number of rejections     1219               988
     Power of test            0.1219             0.0988

This table reports the size and power of the ES backtests of McNeil and Frey and Righi and Ceretta based on 10,000 Monte Carlo simulations, each with 250 normally distributed returns. Under the null hypothesis the ES is calculated by means of equation (5.5) and under the alternative the ES is calculated using equation (5.10). In both calculations $\alpha$ was set to 0.05 and a rejection was made if the p-value of the test was lower than 0.05.

6.2.2 Monte Carlo with student-t distributed returns

Secondly, the backtests are applied to a simulated set of student-t distributed returns in order to find their size and power when returns are not normally distributed. Again, 10,000 simulations of 250 student-t distributed returns are carried out with the confidence level of the VaR, ES and SD set to 95%, and the p-values of each simulation are calculated using equation (4.3). The null hypothesis is rejected if a simulation has a p-value below the significance level of 0.05.

Table 2 shows a slight change in results compared to Table 1. Righi and Ceretta's (2012) test size is close to the significance level and McNeil and Frey's (2000) test size is better than in the case of normally distributed returns, but it is not good enough to call their backtest adequate. Furthermore, the power of both McNeil and Frey's (2000) and Righi and Ceretta's (2012) backtest is equal to 0.0231. This is too low; a wrongly estimated ES is detected only 2.31 per cent of the time.


Table 2

Empirical test size and power

ES                            McNeil and Frey    Righi and Ceretta
H0   Number of rejections     821                547
     Size of test             0.0821             0.0547
H1   Number of rejections     231                231
     Power of test            0.0231             0.0231

This table reports the size and power of the ES backtests of McNeil and Frey and Righi and Ceretta based on 10,000 Monte Carlo simulations, each with 250 student-t distributed returns. Under the null hypothesis the ES is calculated by means of equation (5.10) and under the alternative the ES is calculated using equation (5.5). In both calculations $\alpha$ was set to 0.05 and a rejection was made if the p-value of the test was lower than 0.05.

7 Conclusions

The adequacy of two pre-existing backtests for ES was tested through Monte Carlo simulation studies, which gave insight into the size and power of the backtests of McNeil and Frey (2000) and Righi and Ceretta (2012). The results of these simulations show that both backtests are not adequate, which again highlights the problems with backtesting techniques for ES. While it was expected that the approach of Righi and Ceretta (2012) would work better for every distribution, the results of the Monte Carlo simulation show the opposite when returns follow a normal distribution. Thus, with normally distributed returns, the use of the shortfall deviation to measure the size of the discrepancy between the exceedances and the model-generated ES makes no difference for the size of the test and it even decreases the power of the test.

However, a difference in the performance of the backtests becomes visible when they are applied to a set of student-t distributed returns. In this case, the empirical test size of Righi and Ceretta's (2012) backtest is close to the significance level of the test, but there is still no difference in power compared to McNeil and Frey (2000). This improvement was expected because Righi and Ceretta (2012) did not limit their backtest to the standard normal case, which makes their backtest more accurate when returns follow a student-t distribution. Thus, it can be concluded that Righi and Ceretta's (2012) backtest works better when returns assume a student-t distribution.

For further studies, the null hypothesis that the residuals have zero mean could be tested by means of a nonparametric bootstrap test. However, for the switch from VaR to ES it is important that there are ES backtesting techniques with high power, in order to detect incorrectly estimated ES. Thus, another task for future research would be to find a way to create an ES backtest with high power.


References

Acerbi, C., & Tasche, D. (2002). Expected Shortfall: a natural coherent alternative to Value at Risk. Economic notes, 31(2), 379-388.

Artzner, P., Delbaen, F., Eber, J. M., & Heath, D. (1997). Thinking coherently: Generalized scenarios rather than VaR should be used when calculating regulatory capital. Risk, 10, 68-71.

Artzner, P., Delbaen, F., Eber, J. M., & Heath, D. (1999). Coherent measures of risk. Mathematical Finance, 9(3), 203-228.

Basel Committee on Banking Supervision (1996). Supervisory Framework for the Use of "Backtesting" in Conjunction with the Internal Models Approach to Market Risk Capital Requirements. Bank for International Settlements, Basel.

Basel Committee on Banking Supervision (2013). Fundamental review of the trading book: A revised market risk framework. Consultative document, Bank for International Settlements, Basel.

Berkowitz, J. (2001). Testing density forecasts, with applications to risk management. Journal of Business & Economic Statistics, 19(4), 465-474.

Broda, S. A., & Paolella, M. S. (2011). Expected shortfall for distributions in finance. In Statistical Tools for Finance and Insurance (pp. 57-99). Springer Berlin Heidelberg.

Committee on the Global Financial System (1999). A Review of Financial Market Events in Autumn 1998. Bank for International Settlements, Basel.

Efron, B., & Tibshirani, R. J. (1993). An Introduction to the Bootstrap. Chapman & Hall.

Kerkhof, J., & Melenberg, B. (2004). Backtesting for risk-based regulatory capital. Journal of Banking & Finance, 28(8), 1845-1865.

McNeil, A. J., & Frey, R. (2000). Estimation of tail-related risk measures for heteroscedastic financial time series: an extreme value approach. Journal of empirical finance, 7(3), 271-300.

Nadarajah, S., & Kotz, S. (2008). Moments of truncated t and F distributions. Portuguese Economic Journal, 7(1), 63-73.

Righi, M. B., & Ceretta, P. S. (2012). Individual and flexible expected shortfall backtesting. Available at SSRN 2155659.


Wong, W. K. (2008). Backtesting trading risk of commercial banks using expected shortfall. Journal of Banking & Finance, 32(7), 1404-1415.

Yamai, Y., & Yoshiba, T. (2002). On the validity of value-at-risk: comparative analyses with expected shortfall. Monetary and economic studies, 20(1), 57-85.
