
Master's Thesis

Application of Uncertainty Quantification on credit portfolio losses

Author: Nick Boon

Supervisor: Jori Hoencamp

A thesis submitted in partial fulfilment of the requirements for the degree of Master of Science in Computational Science

in the

Computational Science Lab Informatics Institute


Declaration of Authorship

I, Nick Boon, declare that this thesis, entitled ‘Application of Uncertainty Quantification on credit portfolio losses’, and the work presented in it are my own. I confirm that:

• This work was done wholly or mainly while in candidature for a research degree at the University of Amsterdam.

• Where any part of this thesis has previously been submitted for a degree or any other qualification at this University or any other institution, this has been clearly stated.

• Where I have consulted the published work of others, this is always clearly attributed.

• Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this thesis is entirely my own work.

• I have acknowledged all main sources of help.

• Where the thesis is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself.

Date: 12 June 2020


Abstract

Faculty of Science Informatics Institute

Master of Science in Computational Science

Application of Uncertainty Quantification on credit portfolio losses by Nick Boon

Quantifying the uncertainty in credit portfolios would be a first step in systematically reducing that uncertainty. In this thesis the first step is made to apply uncertainty quantification to credit portfolio losses. Taking inspiration from Chen et al. [1], McDiarmid's inequality is applied to US Treasury bonds, obtaining a bound for the maximum loss incurred on a single trading day. This is extended to include semiannual windows from 2000 up to May 2020. In the obtained bounds, we find increased bounds during the credit crisis and, more recently, the coronavirus pandemic.

As an extension, we apply Hoeffding's inequality to a two-factor Merton default threshold model, simulating default events and loss given default for a portfolio of 125 counterparties. The expected loss and 99.9 percentile of the portfolio loss are investigated. Using Hoeffding's inequality, stricter bounds than Chebyshev's inequality are obtained for the expected loss, in agreement with Chen's earlier findings. The 99.9 percentile bounds are wider than traditional bootstrap and jackknife estimates, although now bounds are obtained instead of estimates. For a dataset of $10^8$ generated realizations, a bound of 8.97% is obtained with 95% confidence, whereas the best estimate, based on $5 \times 10^8$ realizations, gives a 99.9 percentile of 8.31%.

Due to discontinuities and unconstrained random variables in the model, no definitive conclusion can be drawn from the decomposition of uncertainty at the parameter level. In future research another approach should be sought to apply the methods described here, such as a modified McDiarmid's inequality capable of bounding even unconstrained random variables. The methods in this thesis can then guide other works in reducing uncertainty where it is most needed.


Acknowledgements

First of all, I want to thank Jori Hoencamp for his endless support and guidance during the entire project. The discussions and (online) meetings were always very helpful for my understanding of the subject.

Furthermore, I want to thank Ioannis Anagnostou for his help in generating credit portfolio data.

Finally, thanks to everyone at the Informatics Institute of the University of Amsterdam who has supported me throughout this project.


Contents

Declaration of Authorship
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
Abbreviations

1 Introduction

2 Theory
   2.1 Finance
      2.1.1 Bonds
      2.1.2 Default Risk Charge (DRC)
      2.1.3 Probability of default
   2.2 Generating portfolio losses
      2.2.1 Generate correlated random variables
      2.2.2 Generate default events
      2.2.3 Simulate LGD in credit losses
      2.2.4 Portfolio loss
      2.2.5 Checking the generated losses
   2.3 Uncertainty Quantification
      2.3.1 Hoeffding's lemma
      2.3.2 McDiarmid's inequality
      2.3.3 Hoeffding's inequality
   2.4 Other percentile methods
      2.4.1 Traditional bootstrap
      2.4.2 Jackknife

3 Application on treasury bonds
   3.1 Data preparation
   3.2 Principal Component Analysis
   3.3 Calculating portfolio result
   3.4 Independent risk factors
      3.4.1 Cubic splines
      3.4.2 Generating independent observations
   3.5 McDiarmid's Inequality

4 Application on credit portfolio
   4.1 Bounding the expectation
   4.2 Hoeffding's inequality and tail bounds
      4.2.1 Obtaining an upper bound
      4.2.2 Upper bound for a given confidence level
      4.2.3 Obtaining a lower bound
   4.3 Parameter uncertainty

5 Results
   5.1 Treasury bonds
   5.2 Credit portfolio losses
      5.2.1 Expected loss
      5.2.2 99.9 percentile loss
      5.2.3 Parameter uncertainty

6 Discussion
   6.1 Treasury bond losses
   6.2 Credit portfolio losses

7 Conclusion
   7.1 Further research
      7.1.1 Alternative methods
      7.1.2 Alternative models

A Parameters for portfolio loss generation

List of Figures

5.1 The 30 components for the first three eigenvectors of $\hat{\xi}$. The three eigenvectors explain 99.91% of the variance in the data. Using two eigenvectors reduces this to 99.40%.
5.2 Scatter plot of risk factors from treasury bond yield data between 2013-03-04 and 2013-07-24. The complete domain is divided into 10 segments for each risk factor.
5.3 Unnormalized empirical density function obtained using spline interpolation from the data shown in figure 5.2.
5.4 Example of generating 1000 risk factors from the empirical density function using two independent uniform random variables.
5.5 Graph of obtained $\kappa$ using semiannual windows starting from 2000-01-01. The last window only runs until 2020-05-29 as newer data was not yet available.
5.6 Graph of obtained expected loss bounds using semiannual windows starting from 2000-01-01, where $\epsilon$ is set to 0.10.
5.7 Graph of obtained $\kappa$ using 2-month windows starting from 2019-02-01.
5.8 Graph of obtained expected loss bounds using 2-month windows starting from 2019-02-01 and $\epsilon$ set to 0.10.
5.9 Obtained bounds for the expected loss of the credit portfolio for all confidence levels and 500M observations.
5.10 Obtained (two-tailed) 95% confidence intervals for several values of N, the number of simulations.
5.11 Results for 20M simulations, with the confidence level ranging from 90% to 99%. The computation time was 7.85, 1252.0, and 2.17 seconds for the HB, TB, and JK methods respectively.
5.12 Results for the portfolio loss bound at the 99.9 percentile with N between 1M and 100M at a confidence level of 95%. The computation time needed to generate the complete curves was 42.6, 41284.7, and 55.5 seconds for the HB, TB, and JK methods respectively.
5.13 McDiarmid subdiameter for all variables in the credit portfolio loss generation for a single counterparty with exposure 1/125 or 0.008. The four standard normal random variables are allowed to attain values between ±20.
5.14 McDiarmid subdiameter for all variables in the credit portfolio loss generation for a single counterparty with exposure 1/125 or 0.008. The four standard normal random variables are allowed to attain values between ±100.

List of Tables

2.1 Parameters for default generation.
2.2 Parameters for LGD generation.
A.1 Values for $\gamma$ and $\sigma$ for corporations and sovereigns. The values are taken from table 3 of Wilkens and Predescu (2017) [2].
A.2 Values for the correlation matrix for the factor parameters. The regions are Africa, Asia, Eastern Europe, Europe, Unused, Latin America, Middle East, Unused, Oceania.

Abbreviations

BCBS Basel Committee on Banking Supervision
CDF Cumulative distribution function

DRC Default Risk Charge

EAD Exposure at Default

ECDF Empirical Cumulative Distribution Function

HB Hoeffding Bound

JK Jackknife

JTD Jump-To-Default

LGD Loss Given Default

PD Probability of Default

ppf Percent Point Function

TB Traditional Bootstrap

UQ Uncertainty Quantification

VaR Value-at-Risk

YTM Yield to maturity

1 Introduction

Banks play a major role in stabilizing the economy, providing credit and access to many financial derivatives. Because of their importance and the great risks involved in their activities, most banks are regulated. The Basel Committee on Banking Supervision (BCBS), established by the G10 central bank governors in 1974, develops the bank regulation standards that form the basis of many sovereigns’ regulation and supervision on banks [3].

The credit crisis of 2007-2009 spurred the development of new rules on bank supervision. An improved framework was necessary, as the crisis uncovered weaknesses in the design and implementation of the old framework [4]. These weaknesses can be disastrous, as the credit crisis has shown. In his speech at the 2017 CIRSF conference, the chairman of the Financial Stability Institute at the Bank for International Settlements, Fernando Restoy, mentioned that "the Basel Committee on Banking Supervision (BCBS) estimates that its member countries alone lost output worth more than $76 trillion as a result of the crisis, up to the end of 2015" [5]. The implementation of the new framework, Basel III, is still ongoing. The full implementation is expected to be finalised on January 1st, 2022 [6].

With Basel III, additional supervision and regulation guidelines are added, including the default risk charge (DRC), meant to protect against unforeseen or very unlikely losses by banks. This requires banks to simulate a large number of realizations of the financial instruments in their portfolios and set aside capital in order to account for the 99.9 percentile of the simulated losses [7]. However, current techniques may not yield an upper bound on the losses, so an obtained or generated sample can underestimate the true 99.9 percentile of the possible outcomes, and therefore potentially put the bank at an increased risk.


Methods to estimate the percentile of a distribution from an empirical sample include bootstrapping methods as described by Chadyšas (2008) [8]. However, these methods attempt to obtain an estimate of the confidence interval, and do not attempt to obtain a bound. It is possible the estimate is based on insufficient data, which makes an incorrect estimation of the percentile likely. Using bootstrapping methods for obtaining percentile estimates is also not optimal, as the bootstrapping samples are no longer independently drawn from the true population.

Before applying Uncertainty Quantification (UQ) techniques on percentiles, we will first apply UQ as inspired by the work of Chen et al. (2017) [1]. In their work, McDiarmid’s inequality is applied on a simple portfolio of US Treasury bonds to obtain an upper bound on the expected loss of this portfolio. The obtained bound is then also compared to Chebyshev’s inequality. The work of Chen et al. is one of the few papers that attempt to use McDiarmid’s inequality in a financial context. Its use is more common in the statistical learning domain [9], [10], and for decision trees for mining data streams [11], [12].

Our work applies UQ methods to credit portfolio losses generated by a two-factor Merton default threshold model, modified from the description by Bade et al. (2011) [13], and a separate model for the loss given default (LGD), using parameters and techniques from Wilkens and Predescu (2017) [2]. The combination generates default events and an LGD percentage, thereby allowing losses to be simulated for many scenarios. We use a provided dataset containing a credit portfolio with 125 counterparties. From the generated losses, we use Hoeffding's inequality to obtain bounds on the expected loss. The obtained bound is compared to Chebyshev's inequality. We also obtain a probability bound on the true 99.9 percentile exceeding the empirically found percentile by more than a given θ. However, it is also possible, and more interesting, to set a specific probability or confidence level and instead calculate the minimum θ needed for such a bound at the chosen confidence level. These bounds are then compared to other estimators, the Traditional Bootstrap (TB) and Jackknife (JK) methods.

The rest of this thesis is structured as follows. We begin with the introduction of the necessary theory in chapter 2. First, bonds are introduced, after which we consider the credit-based probability of default (PD); a counterparty with a low credit quality is considered more likely to default. The thesis continues with a short description of the default risk charge (DRC) in section 2.1.2. In section 2.2 the method of generating portfolio losses is introduced, including default events and the loss given default. In section 2.3, we introduce relevant methods from UQ such as Hoeffding's lemma, Hoeffding's inequality [14] and McDiarmid's inequality [15]. Finally, in section 2.4, two percentile estimate methods are introduced.


After the theory section, we continue with the application of UQ to US treasury bonds in chapter 3. Here, McDiarmid’s inequality is used to certify that the true expected loss of a simple treasury bond portfolio is, with some defined confidence level, within a specific range of the calculated empirical loss average. The approach is inspired by the work of Chen et al. (2017) [1], although some modifications have been made.

In chapter 4, we describe in more detail the procedure of applying Hoeffding's inequality to the generated credit portfolio losses. The chapter ends with the application of McDiarmid's inequality to parameter uncertainty in the generation of the credit portfolio losses.

The results are presented in chapter 5, where we present the results of both methods. This is followed by a discussion in chapter 6, where the results are interpreted. The limitations and assumptions made are also discussed in this chapter. We conclude the thesis in chapter 7, where we summarize our findings and conclusions and make suggestions for further research.

2 Theory

In the first section of this chapter several financial concepts are introduced. This includes treasury and zero-coupon bonds, the probability of default (PD), and the default risk charge (DRC) introduced by Basel III. The second section introduces the generation of credit portfolio losses. In the third section, we introduce uncertainty quantification methods to obtain bounds on the expectation and percentile of the generated credit portfolio losses. In the last section, we introduce other percentile methods to provide a comparison to the uncertainty quantification methods.

2.1 Finance

2.1.1 Bonds

A bond is a financial instrument through which the buyer purchases debt from the seller of the bond. Depending on the type of bond bought, several payments can occur from the seller to the buyer of the bond. In this section we will only discuss the treasury bond and the zero-coupon bond. Many more variants exist, but these are beyond the scope of this thesis.

A treasury bond, also called a T-bond, indicates that the bond was issued by a government. For treasury bonds two types of payment are defined: recurring interest coupons, and a final payment in which the face value is returned to the buyer of the bond. Historically, coupon payments were attached as coupons to the physical, paper bond. Each coupon could then be exchanged for the interest payment when it was due. Coupon payments are usually semiannual or annual, depending on the conditions of the bond. The face value is the amount over which the issuer of the bond pays interest to the buyer and is the final amount returned by the issuer at maturity.


Treasury bonds are often considered risk-free, as governments are very unlikely to default. This is almost entirely true when the government controls the currency in which the bond is sold, as the government could change policies to repay its debt, such as printing additional money or increasing taxes, both in the same currency as the debt itself. However, a default is still possible if a government does not allow itself to take on more debt to pay its legal obligations. For example, if the United States government does not increase its debt ceiling when it needs additional funding, it would default on its treasury bonds [16]. Although governments are highly likely to meet their obligations, inflation is still a large risk for long-term treasury bonds if inflation is higher than the interest rate of the bond.

The bond is characterised by its face value, maturity date, and its yield. Note that the market value of the bond also depends on the market interest rates. If the market interest rates are high, the bond will trade lower, as it would be more attractive to invest in markets with the higher interest rates. The market value also depends on the quality of the issuer of the bond. If the issuer is in financial distress, buyers might only buy its debt at far below the face value, as it is uncertain whether the issuer will still be in business at maturity.

On the maturity date, the issuer repays the face value of the bond. After this final payment, the bond has matured if all payments have been made by the issuer.

The yield of the bond equals its rate of return. Several methods exist to calculate a yield, such as the current yield, defined by

$$\text{Current yield} = \frac{\text{Annual coupon payment}}{\text{Bond price}} \tag{2.1}$$

and the yield to maturity is calculated from

$$\text{Current bond price} = \sum_{t=1}^{T} \frac{\text{Cash flows}_t}{(1 + \text{YTM})^t} \tag{2.2}$$

The yield to maturity (YTM) calculates the rate of return considering all future payments on the bond and the current price.

A zero-coupon bond does not have any coupon payments from the issuer of the bond to the buyer throughout the bond's lifetime. An example is the US treasury bill, or T-bill, which has a maturity date of less than one year without any interest payments. The yield of a zero-coupon bond is a special case of equation 2.2 and can be calculated directly as

$$\text{Yield to maturity} = \sqrt[n]{\frac{\text{Face value}}{\text{Current bond price}}} - 1 \tag{2.3}$$

with n the number of years to maturity.

2.1.2 Default Risk Charge (DRC)

The Default Risk Charge (DRC) has been introduced to repair a weakness in the Basel II framework. Its sensitivities-based method, in which "the sensitivities of financial instruments to a prescribed list of risk factors are used to calculate the delta, vega and curvature risk capital requirements" [7], did not capture jump-to-default (JTD) risk. The Bank for International Settlements defines JTD as "the risk that a financial product, whose value directly depends on the credit quality of one or more entities, may experience sudden price changes due to an unexpected default of one of these entities" [17].

For example, if an entity suddenly defaults on a select number of bonds without markets expecting this default, the bond value will experience a sudden jump to a value of zero; the bonds have become worthless. This risk is not captured when sensitivities are used, as the jump is far too sudden. If such a default happens, however, the loss is very large. The default risk charge contains a value-at-risk (VaR) calculation that "must be done weekly and be based on a one-year time horizon at a one-tail, 99.9 percentile confidence level", as stated by paragraph 33.20 subnote 5 of BCBS (2019) [7]. For this reason we will consider the 99.9 percentile bound of credit portfolio losses.

2.1.3 Probability of default

An important property of counterparties in a credit portfolio is their probability of default (PD). This probability cannot be exactly determined in practice, as the default of a specific counterparty is an exceptional event. A method to estimate the PD is by using historical experiences with the counterparty, or similar counterparties. Another option is to use credit ratings from external rating agencies, or a combination of historical experiences and external credit ratings in an internal model of the bank.

A different method is to use the market-implied PD. The market-implied PD uses market rates of credit default swaps, financial instruments offering protection against a counterparty default. Multiple methods exist to calculate the market-implied PD, but its use is not allowed for the DRC [7]. For a description of market-implied PD, we refer the reader to Gregory (2015) [18].

An explicit calculation of the probability of default was not required for this thesis, as it was provided in the portfolio dataset.

2.2 Generating portfolio losses

The credit portfolio loss can be split into three distinct parts [13]. The first is the default event, denoted by an indicator function. If the counterparty does not default, it will meet its legal obligations and return the credit balance without any incurred losses. If a counterparty does default, some percentage of the exposure may still be recovered. This recovery rate (RR) can also be given as 1 − RR, the loss given default (LGD). Finally, the exposure at default (EAD) gives the credit balance at risk when a default occurs. The credit portfolio contains 125 counterparties, each with an exposure at default of 1/125 for simplicity. The product of the three parts gives the loss of the credit portfolio.

To obtain enough data on portfolio losses, a method is needed to generate many realizations of the losses. In this section, we first introduce a method to obtain correlated variables. These are required for correct statistics, as defaults are more likely to occur jointly for counterparties in well-connected or correlated regions, due to shared economic distress, than in regions without any connection. In the second subsection, a two-factor Merton-like default threshold model to generate default events is introduced. Next, we introduce a second model to generate the LGD of each counterparty, and finally combine everything to calculate the loss of a single iteration. Many iterations can then be made to obtain enough data to calculate loss expectations and the 99.9 percentile of the portfolio losses.

2.2.1 Generate correlated random variables

To obtain correct statistics and correlations between different counterparties, a method to generate correlated variables is required. For example, if Europe is largely impacted in a scenario it is reasonable to expect that the US will also be largely impacted, whereas other regions may be less affected.

Using Cholesky decomposition, it is possible to decompose a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose, or $Q = LL^T$. For more details of the entire procedure, see Higham (2009) [19]. A vector $X$ consisting of uncorrelated random variables $X_i$ with zero mean and unit variance is generated. As the random variables are uncorrelated, $\mathbb{E}[X_i X_j] = \sigma_{ij}$, where $\sigma_{ij} = 1$ if $i = j$ and $0$ otherwise. The covariance matrix is then given by

$$\mathrm{cov}(X, X) = \mathbb{E}[XX^T] - \mathbb{E}[X]\,\mathbb{E}[X]^T = \mathbb{E}[XX^T] = I$$

where $I$ is the identity matrix.

Now, using Cholesky decomposition, a matrix that satisfies $Q = LL^T$ can be constructed, where $Q$ is the original covariance matrix. Next, we define the random vector $Z = LX$, and find that

$$\mathbb{E}[ZZ^T] = \mathbb{E}[(LX)(LX)^T] = \mathbb{E}[LXX^TL^T] = L\,\mathbb{E}[XX^T]\,L^T = LIL^T = Q$$

such that the random vector $Z$ has the desired covariance $Q$.
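The following is a minimal sketch (not taken from the thesis code) of this construction with numpy, using a small hypothetical correlation matrix $Q$ in place of the factor correlation matrix of table A.2:

```python
# Minimal sketch: correlated standard-normal factors via Cholesky decomposition.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3x3 correlation matrix standing in for the factor correlations.
Q = np.array([
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.4],
    [0.3, 0.4, 1.0],
])

L = np.linalg.cholesky(Q)                  # Q = L L^T
X = rng.standard_normal((3, 100_000))      # uncorrelated, zero mean, unit variance
Z = L @ X                                  # correlated factors with cov(Z) ≈ Q

print(np.round(np.cov(Z), 2))              # empirical covariance, close to Q
```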

2.2.2 Generate default events

A counterparty defaults when it can no longer meet its legal obligations. A credit portfolio may contain many counterparties, each having their own probability of default. For the purpose of this thesis, we will assume the probability of default as given by the provided dataset is accurate. We include the correlations from a pre-calibrated model by using a two-factor Merton default threshold model, modified from the description by Bade et al. (2011) [13]. In their factor model the asset return is factorized into multiple systematic random variables and a single idiosyncratic variable. The return is constructed in such a way that it is normally distributed with a variance of 1. The default threshold is set to zero, such that the counterparty defaults if the return is below zero.

We use a similar approach, which differs on two distinct points. First, the probability of default of each counterparty is given by the dataset. Second, a global and a regional parameter are also provided by the portfolio dataset. This already suggests a different formulation of the factor model. A critical threshold variable is therefore constructed which is normally distributed with unit variance. Two systematic factors are used: a global factor, which is equal for every counterparty, and a regional factor.

The portfolio dataset contains 125 counterparties with sensitivity values to the global and their regional factor, and contains a correlation matrix between all factors. The correlation matrix is shown in table A.2. The random factor values are generated by Cholesky decomposition of the correlation matrix as described in section 2.2.1.

The critical threshold variable itself is calculated by

$$x_i = \sqrt{\beta_i}\,\frac{\alpha_{G,i} F_G + \alpha_{R_j,i} F_{R_j}}{\psi_i} + \sqrt{1 - \beta_i}\;\epsilon_{1,i} \tag{2.4}$$

where we assume each counterparty is only active in a single region $R_j$. The different parameters are described in table 2.1.

Parameter          Description
$x_i$              Measure of impact of the current simulation on counterparty $i$.
$\beta_i$          Sensitivity to the global and regional factors.
$\alpha_{G,i}$     Sensitivity of counterparty $i$ to the global factor $F_G$.
$F_G$              Global factor for all counterparties.
$\alpha_{R_j,i}$   Sensitivity of counterparty $i$ to the regional factor.
$F_{R_j}$          Regional factor for the region $R_j$.
$\psi_i$           Scaling factor to rescale back to standard normal.
$\epsilon_{1,i}$   Random factor not explained by global or regional factors.

Table 2.1: Parameters for default generation.

The values are calibrated such that the final $x_i$ follows a standard normal distribution. As $F_G$, all $F_{R_j}$, and $\epsilon_{1,i}$ are already standard normal, the values must be scaled by $\beta_i$ and $\psi_i$. The calibration is not performed in this thesis, but is provided as-is. Several checks are performed on the generated data, as described in section 2.2.5.

The generated value of $x_i$ is then compared with $\Phi^{-1}(PD_i)$, where $\Phi^{-1}$ is the percent point function (ppf) of the standard normal distribution, also known as the inverse CDF, and $PD_i$ is the probability of default (PD) of counterparty $i$. $\Phi^{-1}(PD_i)$ returns the value of $x$ for which the cumulative standard normal distribution equals the PD itself. In other words, the probability of drawing a value lower than this threshold from a standard normal distribution is exactly the PD. As $x_i$ is compared to this threshold, the probability of $x_i < \Phi^{-1}(PD_i)$ being true is exactly equal to the probability of default. The empirical correlation found in the model calibration is, however, retained by correlating the random factor values.
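As an illustration, a minimal sketch (with hypothetical parameter values, not those of the portfolio dataset) of the default test $x_i < \Phi^{-1}(PD_i)$; the sensitivities are chosen here so that $x_i$ has unit variance:

```python
# Minimal sketch: default events from the threshold of equation 2.4,
# compared against Phi^{-1}(PD_i). Parameter values are hypothetical.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n_sims = 100_000

pd_i, beta_i, a_g, a_r, psi_i = 0.02, 0.3, 0.8, 0.6, 1.0   # a_g^2 + a_r^2 = 1

f_g = rng.standard_normal(n_sims)       # global factor
f_r = rng.standard_normal(n_sims)       # regional factor (factor correlation omitted)
eps1 = rng.standard_normal(n_sims)      # idiosyncratic factor

x_i = np.sqrt(beta_i) * (a_g * f_g + a_r * f_r) / psi_i + np.sqrt(1 - beta_i) * eps1
defaulted = x_i < norm.ppf(pd_i)        # indicator 1_{x_i < Phi^{-1}(PD_i)}

print(defaulted.mean())                 # empirical default rate, close to pd_i
```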

2.2.3 Simulate LGD in credit losses

When a counterparty defaults, some loss may still be recovered. We use the work of Wilkens and Predescu (2017) [2] to obtain a recovery rate for each default event. The log recovery rate is given by

$$Y_i = \gamma + \sigma\left( \sqrt{\rho_Y}\, F_G + \sqrt{1 - \rho_Y}\;\epsilon_{2,i} \right) \tag{2.5}$$

where the same global factor $F_G$ is used as in the default simulation in section 2.2.2. A short description of each parameter is tabulated in table 2.2. For a list of parameter values, we refer the reader to appendix A.

Parameter                   Description
$\gamma_C^{\{corp,\,sov\}}$  Base recovery rate for a corporation or sovereign with credit rating C.
$\sigma_C^{\{corp,\,sov\}}$  Recovery rate from global and idiosyncratic factors.
$\rho_Y$                     Fraction of recovery due to global factors.
$F_G$                        Global factor which is equal for all counterparties.
$\epsilon_{2,i}$             Idiosyncratic factor for counterparty $i$.

Table 2.2: Parameters for LGD generation.

From the log recovery rate, the loss given default (LGD) for counterparty $i$ can be calculated by

$$\mathrm{LGD}_i = 1 - \exp(Y_i) = 1 - \exp\left(\gamma + \sigma\left(\sqrt{\rho_Y}\, F_G + \sqrt{1-\rho_Y}\;\epsilon_{2,i}\right)\right) \tag{2.6}$$

2.2.4 Portfolio loss

To compute the total loss of the portfolio for a single realization, we assume the exposure at default (EAD) to be equal for all $n$ counterparties; it is therefore given by

$$\mathrm{EAD} = \frac{1}{n} \tag{2.7}$$

Combining the default event, loss given default, and exposure at default, we obtain the total loss of the portfolio

$$\mathrm{Loss} = \sum_i \mathrm{Loss}_i = \sum_i \mathbb{1}_{\{x_i < \Phi^{-1}(PD_i)\}} \times \mathrm{LGD}_i \times \mathrm{EAD} \tag{2.8}$$

where the indicator function $\mathbb{1}$ indicates whether counterparty $i$ has defaulted. This concludes a single realization. Millions more realizations can be generated by randomly generating new global and regional factors from the correlation matrix. The generation of $10^6$ simulations took around 67 seconds to complete.

Note that the loss is normalized to 1. In the most extreme case, where every counterparty defaults with an LGD of 1, the complete portfolio of unit value would be lost.
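A minimal sketch of one loss realization, combining the pieces above; all parameter values are hypothetical stand-ins for the dataset and for the Wilkens-Predescu parameters:

```python
# Minimal sketch: a single portfolio-loss realization from equations 2.4-2.8.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 125                                   # counterparties
ead = 1.0 / n                             # equation 2.7

pd = np.full(n, 0.01)                     # hypothetical probabilities of default
beta = np.full(n, 0.3)
a_g, a_r, psi = 0.8, 0.6, 1.0             # chosen so x_i has unit variance
gamma, sigma, rho_y = -1.0, 0.5, 0.25     # hypothetical LGD parameters

f_g = rng.standard_normal()               # one global factor per scenario
f_r = rng.standard_normal()               # one regional factor (single region here)
eps1 = rng.standard_normal(n)
eps2 = rng.standard_normal(n)

x = np.sqrt(beta) * (a_g * f_g + a_r * f_r) / psi + np.sqrt(1 - beta) * eps1
defaulted = x < norm.ppf(pd)                              # equation 2.4 threshold
y = gamma + sigma * (np.sqrt(rho_y) * f_g + np.sqrt(1 - rho_y) * eps2)
lgd = 1.0 - np.exp(y)                                     # equation 2.6
loss = np.sum(defaulted * lgd * ead)                      # equation 2.8

print(loss)
```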

2.2.5 Checking the generated losses

The generated losses can be checked on several properties. The threshold variable $x_i$ from equation 2.4 should be a standard normal variable; the empirical mean and variance of the generated $x_i$ should therefore be approximately 0 and 1, respectively.

Another check is the covariance of the generated factors $F_G$ and $F_{R_j}$. Using Cholesky decomposition, independent random numbers are transformed to correlated random factors. The correlation of these random factors should approximately equal the original correlation matrix as shown in table A.2.

Finally, the probability of default in the generated data should agree with the theoretical PD given as input to the model. Considering all simulations, each counterparty should default with approximately its theoretical probability in the generated realizations.

2.3 Uncertainty Quantification

Uncertainty quantification (UQ) originates from the engineering disciplines, where parts of a larger system can be isolated and studied independently [1]. Sullivan (2015) describes uncertainty quantification as "a combination of probability theory and statistics with 'the real world'" [15].

From uncertainty quantification, we are mostly interested in inequalities known as concentration-of-measure inequalities, which provide rigorous certification criteria, as high-dimensional random variables are highly concentrated around their mean [15]. We will consider McDiarmid's inequality and its corollary, Hoeffding's inequality.

This section is structured as follows. First, we give the proof of Hoeffding's lemma, which will be used to prove McDiarmid's inequality in the subsection thereafter. Finally, Hoeffding's inequality is proven as a special case of McDiarmid's inequality.

2.3.1 Hoeffding’s lemma

In this section we introduce Hoeffding’s lemma as a first step in proving McDiarmid’s inequality. The proof of Hoeffding’s lemma is given by Sullivan (2015) [15].

Let X be a random variable with mean zero taking values in the range [a, b]. We can then show that, for any t ≥ 0,

$$\mathbb{E}\left[e^{tX}\right] \le \exp\left(\frac{t^2 (b-a)^2}{8}\right)$$

We begin the proof by using the convexity of the exponential function, for each x ∈ [a, b],

$$e^{tx} \le \frac{b-x}{b-a}\, e^{ta} + \frac{x-a}{b-a}\, e^{tb}$$

We then apply the expectation operator and use the fact that $\mathbb{E}[X] = 0$ as initially assumed. This results in

$$\mathbb{E}\left[e^{tX}\right] \le \frac{b}{b-a}\, e^{ta} - \frac{a}{b-a}\, e^{tb} = e^{\phi(t)}$$

Now we introduce the following definitions

$$h = t(b-a), \qquad p = \frac{-a}{b-a}, \qquad L(h) = -hp + \ln\left(1 - p + p e^{h}\right)$$

and we show that $\exp(L(h)) \equiv e^{\phi(t)}$:

$$\begin{aligned}
\exp(L(h)) &= \exp\left(-hp + \ln\left(1 - p + p e^{h}\right)\right) \\
&= \exp\left(ta + \ln\left(1 + \frac{a}{b-a} - \frac{a}{b-a}\, e^{t(b-a)}\right)\right) \\
&= e^{ta}\left(1 + \frac{a}{b-a} - \frac{a}{b-a}\, e^{t(b-a)}\right) \\
&= \frac{b}{b-a}\, e^{ta} - \frac{a}{b-a}\, e^{tb}
\end{aligned}$$

Note that

$$L(0) = 0, \qquad L'(h) = -p + \frac{p e^{h}}{1 - p + p e^{h}}, \qquad L'(0) = 0,$$

$$L''(h) = \frac{p e^{h}}{1 - p + p e^{h}}\left(1 - \frac{p e^{h}}{1 - p + p e^{h}}\right) = \theta(1-\theta) \le \frac{1}{4}$$

where we defined $\theta = \frac{p e^{h}}{1 - p + p e^{h}}$ with $\theta > 0$. We now use Taylor's theorem, which states that for any real $u$ there must exist a $v$ between $0$ and $u$ such that

$$f(u) = f(0) + u f'(0) + \tfrac{1}{2} u^{2} f''(v)$$

Hence, since $e^{\phi(t)} = e^{L(h)}$ and $L''(v) \le \tfrac{1}{4}$, we obtain

$$\mathbb{E}\left[e^{tX}\right] \le e^{\phi(t)} = e^{L(h)} = \exp\left(L(0) + h L'(0) + \tfrac{1}{2} h^{2} L''(v)\right) \le \exp\left(\frac{t^{2}(b-a)^{2}}{8}\right) \tag{2.9}$$

2.3.2 McDiarmid's inequality

In this subsection McDiarmid’s inequality is introduced and a mathematical proof is given. The complete proof is given by Sullivan (2015) [15].

McDiarmid’s inequality provides a method of obtaining a global sensitivity measure for a function with n independent parameters. Each parameter is individually stressed to obtain a maximum variation in the function outcome. This results in n McDiarmid subdiameters, with the ith McDiarmid subdiameter of f defined by

$$D_i[f] = \sup\Bigl\{\, |f(x) - f(y)| \;:\; x, y \in \mathcal{X} \text{ such that } x_j = y_j \text{ for } j \ne i \,\Bigr\} \tag{2.10}$$

or equivalently

$$D_i[f] = \sup\Bigl\{\, |f(x) - f(x_1, \ldots, x_{i-1}, x_i', x_{i+1}, \ldots, x_n)| \;:\; x = (x_1, \ldots, x_n) \in \mathcal{X} \text{ and } x_i' \in \mathcal{X}_i \,\Bigr\}. \tag{2.11}$$

The full McDiarmid diameter of f is then given by

$$D[f] = \sqrt{\sum_{i=1}^{n} D_i[f]^2}. \tag{2.12}$$

This expression can be interpreted as the Euclidean norm of the n-dimensional vector with elements Di[f ].

McDiarmid’s inequality is then defined by

$$P[f(X) \ge \mathbb{E}[f(X)] + t] \le \exp\left(-\frac{2t^2}{D[f]^2}\right) \tag{2.13}$$

$$P[f(X) \le \mathbb{E}[f(X)] - t] \le \exp\left(-\frac{2t^2}{D[f]^2}\right) \tag{2.14}$$

$$P[|f(X) - \mathbb{E}[f(X)]| \ge t] \le 2\exp\left(-\frac{2t^2}{D[f]^2}\right) \tag{2.15}$$

for any $t \ge 0$.

Again, the proof is given by Sullivan (2015) [15] and we closely follow his proof of McDiarmid's inequality below. Let $\mathcal{F}_i$ be the $\sigma$-algebra generated by $X_1, \ldots, X_i$, and define random variables $Z_0, \ldots, Z_n$ by $Z_i \stackrel{\text{def}}{=} \mathbb{E}[f(X) \mid \mathcal{F}_i]$. Note that $Z_0 = \mathbb{E}[f(X)]$ and $Z_n = f(X)$. Now consider $(Z_i - Z_{i-1}) \mid \mathcal{F}_{i-1}$. First observe that

$$\mathbb{E}[Z_i - Z_{i-1} \mid \mathcal{F}_{i-1}] = 0$$

such that the sequence $(Z_i)_{i \ge 0}$ is a martingale, i.e. the sequence's expected value does not change. Secondly, observe that

$$L_i \le (Z_i - Z_{i-1} \mid \mathcal{F}_{i-1}) \le U_i$$

where

$$L_i \stackrel{\text{def}}{=} \inf_{l}\, \mathbb{E}[f(X) \mid \mathcal{F}_{i-1}, X_i = l] - \mathbb{E}[f(X) \mid \mathcal{F}_{i-1}]$$

$$U_i \stackrel{\text{def}}{=} \sup_{u}\, \mathbb{E}[f(X) \mid \mathcal{F}_{i-1}, X_i = u] - \mathbb{E}[f(X) \mid \mathcal{F}_{i-1}]$$

Since $U_i - L_i \le D_i[f]$, Hoeffding's lemma (equation 2.9) implies that

$$\mathbb{E}\left[ e^{s(Z_i - Z_{i-1})} \,\middle|\, \mathcal{F}_{i-1} \right] \le e^{s^2 D_i[f]^2 / 8}$$

For any $s \ge 0$, we obtain

$$\begin{aligned}
P[f(X) - \mathbb{E}[f(X)] \ge t] &= P\left[e^{s(f(X) - \mathbb{E}[f(X)])} \ge e^{st}\right] \\
&\le e^{-st}\, \mathbb{E}\left[e^{s(f(X) - \mathbb{E}[f(X)])}\right] \\
&= e^{-st}\, \mathbb{E}\left[e^{s \sum_{i=1}^{n} (Z_i - Z_{i-1})}\right] \\
&= e^{-st}\, \mathbb{E}\left[\mathbb{E}\left[e^{s \sum_{i=1}^{n} (Z_i - Z_{i-1})} \,\middle|\, \mathcal{F}_{n-1}\right]\right] \\
&= e^{-st}\, \mathbb{E}\left[e^{s \sum_{i=1}^{n-1} (Z_i - Z_{i-1})}\, \mathbb{E}\left[e^{s(Z_n - Z_{n-1})} \,\middle|\, \mathcal{F}_{n-1}\right]\right]
\end{aligned}$$

where Markov's inequality is used. From the last step, Hoeffding's lemma is applied to obtain

$$P[f(X) - \mathbb{E}[f(X)] \ge t] \le e^{-st}\, e^{s^2 D_n[f]^2 / 8}\, \mathbb{E}\left[e^{s \sum_{i=1}^{n-1} (Z_i - Z_{i-1})}\right]$$

Now, the same steps are applied $n - 1$ more times to obtain

$$P[f(X) - \mathbb{E}[f(X)] \ge t] \le \exp\left(-st + \frac{s^2}{8} D[f]^2\right).$$

We are free to choose the value of $s$ that gives the strictest bound possible. As the natural exponential function is strictly increasing, minimizing the argument is sufficient. We take the derivative of the argument with respect to $s$ and set it to zero:

$$\frac{d}{ds}\left(-st + \frac{s^2}{8} D[f]^2\right) = -t + \frac{s}{4} D[f]^2 = 0 \implies s = \frac{4t}{D[f]^2}$$

Inserting this optimal value of $s$ into the previous expression, the final form of McDiarmid's inequality is obtained:

$$P[f(X) \ge \mathbb{E}[f(X)] + t] \le \exp\left(-\frac{2t^2}{D[f]^2}\right) \tag{2.16}$$

2.3.3 Hoeffding's inequality

A special case of McDiarmid’s inequality is its application on a random variable X = (X1, . . . , Xn) with independent bounded components. Each component is bounded by

Each component is bounded by $X_i \in [a_i, b_i]$ and $S_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the empirical mean of the components. Then the McDiarmid diameter is given by

$$D[f]^2 = \frac{1}{n^2}\sum_{i=1}^{n} (b_i - a_i)^2$$

Inserting this into equation 2.13, we obtain Hoeffding's inequality, which is given by

$$P[S_n - \mathbb{E}[S_n] \ge t] \le \exp\left(\frac{-2n^2 t^2}{\sum_{i=1}^{n}(b_i - a_i)^2}\right) \tag{2.17}$$

and similarly for deviations below the mean [14], [15]. Because this inequality can be easily derived from McDiarmid's inequality, Hoeffding's inequality is a corollary to McDiarmid's inequality.
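To make the inequality concrete, a minimal sketch (not part of the thesis) of the one-sided bound for the mean of bounded variables:

```python
# Minimal sketch: one-sided Hoeffding bound on P[S_n - E[S_n] >= t] (equation 2.17).
import numpy as np

def hoeffding_bound(t, a, b):
    """Upper bound for the mean S_n of independent X_i with X_i in [a_i, b_i]."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n = a.size
    return np.exp(-2.0 * n**2 * t**2 / np.sum((b - a) ** 2))

# Example: 10,000 losses bounded in [0, 1] and a deviation of t = 0.02.
n = 10_000
print(hoeffding_bound(0.02, np.zeros(n), np.ones(n)))   # exp(-2 n t^2) ≈ 3.4e-4
```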

2.4 Other percentile methods

In this section two methods to obtain a percentile estimate are introduced. Both methods are described by Chadyšas (2008) [8]. Note that both methods estimate the percentile with a given confidence level, and do not attempt to find a bound on the percentile of the true population, unlike the McDiarmid and Hoeffding bounds discussed in section 2.3.


2.4.1 Traditional bootstrap

With a bootstrap method, a single sample is taken from a population and this sample is taken to be equal to the entire population. Effectively, this allows one to obtain statistics such as confidence intervals which would normally require multiple samples from the true population.

Each subsample is obtained by sampling with replacement from the original bootstrap sample. After sampling B subsamples and calculating B percentiles, one can estimate the percentile at the desired confidence level.

Step-by-step, the procedure can be described by:

• Draw a simple random sample of size N from the population.

• From this sample, draw a simple random subsample with replacement of size N and let $\hat{K}_q^{(1)}$ be the desired quantile. Repeat this process B times, obtaining $\hat{K}_q^{(1)}, \hat{K}_q^{(2)}, \ldots, \hat{K}_q^{(B)}$.

• Taking the chosen confidence level, denoted by $1 - \alpha$, of this set of desired quantile estimates, we obtain an estimate for the confidence interval of the desired quantile $K_q$, $\hat{K}_q^{TB}(1-\alpha)$, at the chosen confidence level $1 - \alpha$.

We slightly deviate from the procedure as described by Chadyšas (2008) [8], as our interest lies not in estimating a confidence interval, but an upper bound, with the upper tail containing even larger losses. In Chadyšas, the interest is in bounding the deviation away from the estimate, which is why a two-tailed confidence level of $1 - \alpha/2$ is taken on both ends to obtain a confidence interval with a confidence level of $1 - \alpha$.

One of the difficulties of the traditional bootstrap method is choosing an optimal value of B. This optimal value depends on the data itself and the statistic used, and cannot be determined a priori. In general, a minimum value of B = 1000 for bootstrap confidence intervals is recommended by Efron and Tibshirani (1986) [20], although Davidson and MacKinnon (2000) [21] recommend at least B = 399 for a 95% bootstrap confidence level. We choose the safer, but more computationally expensive, option of B = 1000.
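A minimal sketch of this procedure (the function name and the toy loss sample are our own, not the thesis implementation):

```python
# Minimal sketch: traditional-bootstrap upper confidence bound for a q-quantile.
import numpy as np

def bootstrap_quantile_upper(losses, q=0.999, conf=0.95, B=1000, seed=0):
    rng = np.random.default_rng(seed)
    n = losses.size
    # B quantile estimates from subsamples drawn with replacement.
    estimates = np.array([
        np.quantile(rng.choice(losses, size=n, replace=True), q)
        for _ in range(B)
    ])
    # One-tailed upper bound at the chosen confidence level.
    return np.quantile(estimates, conf)

losses = np.random.default_rng(1).beta(0.5, 60.0, size=100_000)   # toy loss sample
print(bootstrap_quantile_upper(losses))
```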

2.4.2 Jackknife

The jackknife method takes a single sample from the population like the traditional bootstrap method. However, it then creates subsamples by eliminating one observation at a time. From each subsample, of size $n - 1$, the percentile is calculated. From this set, the desired confidence interval is then estimated.

Step-by-step, the procedure is:

• Draw a simple random sample of size N from the population.

• The desired quantile is estimated from the sample with the i-th observation eliminated, obtaining $\hat{K}_q^{(i)}$ with $i = 1, \ldots, n$.

• Taking the chosen confidence level, denoted by $1 - \alpha$, of this set of desired quantile estimates $\hat{K}_q^{(1)}, \hat{K}_q^{(2)}, \ldots, \hat{K}_q^{(n)}$, we obtain the one-tailed confidence interval for the desired quantile $K_q$, $\hat{K}_q^{J}(1-\alpha)$, at the chosen confidence level $1 - \alpha$.

We again slightly deviate from Chadyšas, as our objective is different. In implementing this algorithm we note that many percentile estimates are identical. For example, if the 99.9 percentile is considered, 99.9% of the estimates come from subsamples that eliminate an observation lower than the observation determining this percentile, and almost all percentile estimates are therefore identical. The only difference is whether an observation before the determining observation is eliminated, the determining observation itself, or an observation after it. By not constructing the complete subsample from the original sample, the computation time is drastically reduced. This is only possible because the jackknife procedure itself is deterministic: once the random sample is drawn, every following step contains no stochastic elements.
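A minimal, unoptimized sketch of the jackknife estimate (recomputing the quantile for every leave-one-out subsample; the shortcut described above avoids this for large samples):

```python
# Minimal sketch: jackknife one-tailed upper bound for a q-quantile (naive version).
import numpy as np

def jackknife_quantile_upper(losses, q=0.999, conf=0.95):
    n = losses.size
    estimates = np.array([
        np.quantile(np.delete(losses, i), q)   # i-th observation eliminated
        for i in range(n)
    ])
    return np.quantile(estimates, conf)

losses = np.random.default_rng(2).beta(0.5, 60.0, size=2_000)     # small toy sample
print(jackknife_quantile_upper(losses))
```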

3 Application on treasury bonds

In this chapter we will apply uncertainty quantification to a simple portfolio of (theoretical) zero-coupon US Treasury bonds. Real-world data will be used here, gathered by Gürkaynak et al. (2007) [22]. This dataset is updated once per week and is freely available from https://www.federalreserve.gov/data/nominal-yield-curve.htm. The objective is to obtain an upper bound on the portfolio losses. The portfolio contains equally weighted bonds with value 1/30 for each of the 30 maturities in the portfolio. Every day, the portfolio is rebalanced to weights of 1/30 for each maturity, requiring a certain amount of money to be paid or received in return.

The price changes of this portfolio are calculated from the bond yield data for a given period of time. The effect of these price changes is then calculated at the portfolio level. From the distribution of portfolio losses, it is then possible to obtain a bound on the expected portfolio loss.

Do note that most of the following is inspired by and adapted from Chen et al. (2017) [1], although some parts are changed such as obtaining the independent risk factors in section 3.4.

This chapter is structured as follows. In the first section, the data preparation is explained. The second section introduces principal component analysis (PCA), which allows us to reduce the dimensionality of the data while retaining most of the variance. This improves tractability and visualization of the results. The third section discusses obtaining independent risk factors, a requirement for applying McDiarmid's inequality. In the last section, the application of McDiarmid's inequality is discussed.

3.1 Data preparation

A dataset containing US Treasury bond yield data is freely available from https://www.federalreserve.gov/data/nominal-yield-curve.htm [22]. The zero-coupon yield is used, named SVENYXX in the data, where XX denotes the maturity in years. All maturities, 1, 2, . . . , 30 years, are considered, with the yield given by continuous compounding. Next, the data is cleaned by dropping all NA-values, which can be holidays for which no data is available, and selecting only data between pre-defined dates. Finally, the data is divided by 100 to obtain a fraction.

Once the data is cleaned, the change in bond value $\xi_m(t)$ is computed for every maturity and each date using the equation

$$\xi_m(t) \stackrel{\text{def}}{=} e^{(y_t(m) - y_{t+1}(m))\,m} - 1 \tag{3.1}$$

with m the maturity in years. This is stored in a matrix defined by

$$\xi(t) = \begin{pmatrix} \xi_1(t) \\ \xi_2(t) \\ \vdots \\ \xi_{30}(t) \end{pmatrix} \tag{3.2}$$

with the mth row containing maturity m and the ith column containing the bond price fluctuation between day i and i + 1.

From ξ(t), the empirical average is computed for each maturity m. We define

$$\bar{\xi}_m = \frac{1}{N}\sum_{i=1}^{N} \xi_m(i) \tag{3.3}$$

where we sum over all dates in the cleaned dataset.
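A minimal sketch of this preparation with pandas; the local file name and the exact layout of the downloaded CSV are assumptions, not fixed by the thesis:

```python
# Minimal sketch: computing the bond value changes of equation 3.1 from the
# Federal Reserve zero-coupon yields (columns SVENY01..SVENY30).
import numpy as np
import pandas as pd

# Hypothetical local copy of the dataset, with dates as the index column.
df = pd.read_csv("nominal_yield_curve.csv", index_col=0, parse_dates=True)

cols = [f"SVENY{m:02d}" for m in range(1, 31)]
y = df.loc["2013-03-04":"2013-07-24", cols].dropna() / 100.0   # yields as fractions

maturities = np.arange(1, 31)
# xi_m(t) = exp((y_t(m) - y_{t+1}(m)) * m) - 1, one column per trading day.
xi = (np.exp((y.values[:-1] - y.values[1:]) * maturities) - 1).T   # 30 x N
xi_bar = xi.mean(axis=1)                                           # equation 3.3
```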

3.2 Principal Component Analysis

The 30 maturities in the matrix $\xi$ are highly correlated. If the bond value for a single maturity increases, it is likely that other maturities have also increased in value. It is therefore possible to reduce the dimensionality of the data using Principal Component Analysis (PCA) while retaining most of the variance. This isolates the directions and amount of variance in the data, ordered in magnitude. Using a 30-dimensional dataset, we obtain 30 eigenvectors with 30 corresponding eigenvalues. This large correlation was observed earlier by Alexander (2008) in UK spot rates [23], where 91.05% of all variance could be captured by a single eigenvector.

The PCA procedure decomposes the original matrix into the form

$$E^{T} \hat{\Sigma} E = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_{30})$$

with $\hat{\Sigma}$ the covariance matrix of $\hat{\xi}$, and $\lambda_i$ the $i$th eigenvalue with $\lambda_1 > \lambda_2 > \cdots > \lambda_{30}$. Furthermore, $E$ is an orthonormal matrix with the columns $e_m$ being the eigenvectors of $\hat{\Sigma}$. The decomposition isolates the dominant fluctuations in $\hat{\xi}$ and allows us to reduce the dimensionality of the data while retaining the most important features.

To calculate what variance is accounted for by using only a limited number of eigenvectors, the sum of all included eigenvalues is divided by the total sum of all eigenvalues:

$$\text{explained variance} = \frac{\sum_{i=1}^{n} \lambda_i}{\sum_{i=1}^{N} \lambda_i} \tag{3.4}$$

with $0 < n \le N$.

In our tests, using data from March 4th 2013 to July 24th 2013, we find that using two factors accounts for 99.40% of all variance, whereas adding an additional factor increases this percentage to 99.91%. Like Chen (2017) [1], we continue with two factors. Using two factors improves tractability and allows easier visualization, while still retaining more than 99% of the variance of the original data. The two risk factors are defined by

$$\hat{X}_t^{(j)} \stackrel{\text{def}}{=} e_j^{T} \left( \hat{\xi}(t) - \bar{\xi} \right) \tag{3.5}$$

with $e_j^{T}$ the transposed $j$th eigenvector.
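A minimal sketch of the PCA step via the eigendecomposition of the covariance matrix (the $\xi$ matrix here is a random stand-in for the real data):

```python
# Minimal sketch: explained variance (equation 3.4) and the two risk factors
# of equation 3.5 from an eigendecomposition of the covariance matrix.
import numpy as np

rng = np.random.default_rng(3)
xi = rng.standard_normal((30, 100)) * 0.01        # stand-in for the 30 x N data

cov = np.cov(xi)                                  # 30 x 30 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)            # ascending eigenvalues
order = np.argsort(eigvals)[::-1]                 # sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = np.cumsum(eigvals) / np.sum(eigvals)
print(explained[:3])                              # variance explained by 1, 2, 3 factors

# Risk factors: project the de-meaned data onto the first two eigenvectors.
xi_centered = xi - xi.mean(axis=1, keepdims=True)
X_hat = eigvecs[:, :2].T @ xi_centered            # 2 x N risk-factor series
```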

3.3 Calculating portfolio result

From the previous sections, we have obtained the change in bond value $\xi_m(t)$ for maturity $m$ in equation 3.1, and applied PCA to obtain two risk factors $\hat{X}_t^{(j)}$ in equation 3.5. The approximate profit can now be defined in terms of the risk factors by

$$\tilde{P} \stackrel{\text{def}}{=} \bar{P} + c_1 \hat{X}_t^{(1)} + c_2 \hat{X}_t^{(2)} \tag{3.6}$$

with $\bar{P}$, the average profit for the entire portfolio, defined by

$$\bar{P} \stackrel{\text{def}}{=} \frac{1}{30}\left\langle \mathbf{1}, \bar{\xi} \right\rangle_{\mathbb{R}^{30}} \tag{3.7}$$

and $c_j$, the average across all 30 elements of the $j$th eigenvector, defined by

$$c_j \stackrel{\text{def}}{=} \frac{1}{30}\left\langle \mathbf{1}, e_j \right\rangle_{\mathbb{R}^{30}} \tag{3.8}$$

Finally, we only allow positive losses and bound the loss by K, the total value of the portfolio. This results in a final loss expression

$$L(\hat{x}) \stackrel{\text{def}}{=} \left\{ -\left( \bar{P} + c_1 \hat{x}^{(1)} + c_2 \hat{x}^{(2)} \right) \right\}^{+} \wedge K \tag{3.9}$$

with $\hat{x} = (\hat{x}^{(1)}, \hat{x}^{(2)})$, or explicitly written out as

$$L(\hat{x}^{(1)}, \hat{x}^{(2)}) = \min\left( \max\left( -\tilde{P},\, 0 \right),\, K \right) \tag{3.10}$$

3.4 Independent risk factors

To satisfy the requirements for McDiarmid's inequality, we need to construct an expression for the loss with independent variables. From applying PCA we have obtained two uncorrelated variables, but PCA does not provide independent variables.

To create independent variables, a density function is constructed from the N observations of price changes in the data. To reduce overfitting of the data, the domain $D$ of all observations of the two risk factors is divided into $D_X$ and $D_Y$ segments. The observations in each of the $D_X \times D_Y$ cells are then counted, and the counts are interpolated by cubic splines. From this empirical distribution function (EDF), an empirical cumulative distribution function (ECDF) is created from which independent risk factors may be sampled with the same statistics as the original risk factors.

This section is structured as follows. First, we introduce cubic splines as an interpolation method to obtain the distribution function. Next, we describe how the cumulative distribution function may be used to obtain new independent risk factors with the same statistics as the original risk factors.

3.4.1 Cubic splines

To obtain a distribution function from the observed risk factor, we interpolate the count per cell using spline interpolation. A description of spline interpolation can be found in the lecture notes of McKinley and Levine (1998) [24].

Splines are piecewise-defined polynomials, where each division has its own coefficients defined for the polynomial. A cubic spline is of order 3 and is defined by

$$f_i(x) = a_i(x - x_i)^3 + b_i(x - x_i)^2 + c_i(x - x_i) + d_i \tag{3.11}$$

for the $i$th interval. Furthermore, the first and second derivative of this cubic spline are given by

$$f_i'(x) = 3a_i(x - x_i)^2 + 2b_i(x - x_i) + c_i \tag{3.12}$$

$$f_i''(x) = 6a_i(x - x_i) + 2b_i \tag{3.13}$$

The cubic spline needs to interpolate all data points. Also, the cubic spline itself and its first and second derivatives all need to be continuous on every interval. We set

$$f_i(x_i) = y_i, \qquad f_i'(x_i) = f_{i-1}'(x_i), \qquad f_i''(x_i) = f_{i-1}''(x_i)$$

This system is only uniquely defined by also setting boundary conditions. A natural spline sets the second derivative at the end points to zero. The cubic spline can then be efficiently solved in matrix notation. We use the implementation of the Python package scipy to obtain a cubic spline fit of the data.

As a distribution function is considered, negative values are not allowed in the final interpolation. To ensure only positive values, the square root of the original values is interpolated, and the fit is squared afterwards. This will, however, introduce small inaccuracies in areas where a minimum would be negative: using this method, any such minimum becomes a small maximum.
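A one-dimensional illustration of this positivity trick with scipy (the binning and data below are toy assumptions; the thesis applies the interpolation to the two-dimensional cell counts):

```python
# Minimal sketch: natural cubic spline on the square root of binned counts,
# squared afterwards so the interpolated density stays non-negative.
import numpy as np
from scipy.interpolate import CubicSpline

rng = np.random.default_rng(4)
data = rng.normal(size=10_000)

counts, edges = np.histogram(data, bins=10)          # counts per segment
centers = 0.5 * (edges[:-1] + edges[1:])

spline = CubicSpline(centers, np.sqrt(counts), bc_type="natural")
grid = np.linspace(centers[0], centers[-1], 200)
density = spline(grid) ** 2                          # squared fit stays non-negative
density /= np.sum(density) * (grid[1] - grid[0])     # normalize to a density
```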

3.4.2 Generating independent observations

In this section we will derive a method to generate independent risk factors from independent uniform random variables with the same statistics as the original risk factors. We first suppose that $X$ is an $\mathbb{R}$-valued random variable with a cumulative distribution function $F_X$ defined by

$$F_X(t) \stackrel{\text{def}}{=} P[X \le t] \tag{3.14}$$

and we define the inverse, with $u \in [0, 1]$, by

$$F_X^{-1}(u) \stackrel{\text{def}}{=} \inf\{s \in \mathbb{R} : F_X(s) \ge u\} \tag{3.15}$$

Now if $U$ is a uniform random variable on $[0, 1]$, then $X \stackrel{(\mathcal{L})}{=} F_X^{-1}(U)$. Next, we define

$$\zeta_1 \stackrel{\text{def}}{=} F_1^{-1}(U_1) \tag{3.16}$$

As we have shown, $\zeta_1 \stackrel{(\mathcal{L})}{=} X^{(1)}$, i.e. $\zeta_1$ has the same law as $X^{(1)}$. We define $F_{2,x}$ to be the conditional cumulative distribution function of $X^{(2)}$, given that $X^{(1)} = x$. We can then define

$$\zeta_2 = F_{2,\zeta_1}^{-1}(U_2) \tag{3.17}$$

and apply the same method twice:

$$\begin{aligned}
P\left[\zeta_1 \le t_1, \zeta_2 \le t_2\right] &= \mathbb{E}\left[ P[\zeta_2 \le t_2 \mid \zeta_1]\, \mathbb{1}_{\{\zeta_1 \le t_1\}} \right] \\
&= \mathbb{E}\left[ P[U_2 \le F_{2,\zeta_1}(t_2) \mid \zeta_1]\, \mathbb{1}_{\{\zeta_1 \le t_1\}} \right] \\
&= \mathbb{E}\left[ F_{2,\zeta_1}(t_2)\, \mathbb{1}_{\{\zeta_1 \le t_1\}} \right] \\
&= \mathbb{E}\left[ F_{2,X^{(1)}}(t_2)\, \mathbb{1}_{\{X^{(1)} \le t_1\}} \right] \\
&= \mathbb{E}\left[ P\left[X^{(2)} \le t_2 \mid X^{(1)}\right] \mathbb{1}_{\{X^{(1)} \le t_1\}} \right] \\
&= P\left[ X^{(2)} \le t_2,\, X^{(1)} \le t_1 \right]
\end{aligned} \tag{3.18}$$

We can therefore statistically replace $X^{(1)}$ and $X^{(2)}$ in equation 3.6 by $\zeta_1$ and $\zeta_2$ respectively, so the approximate profit is given by

$$\tilde{P} = \bar{P} + c_1 \zeta_1 + c_2 \zeta_2 = \bar{P} + c_1 F_1^{-1}(U_1) + c_2 F_{2, F_1^{-1}(U_1)}^{-1}(U_2) \tag{3.19}$$

with $U_1$ and $U_2$ independent uniform random variables. The loss can now be calculated by

$$L(U_1, U_2) = \left\{ -\left( \bar{P} + c_1 F_1^{-1}(U_1) + c_2 F_{2, F_1^{-1}(U_1)}^{-1}(U_2) \right) \right\}^{+} \wedge K \tag{3.20}$$
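A minimal sketch (not the thesis implementation) of this conditional inverse-transform sampling on a discretized empirical density, using the marginal CDF of the first factor and the conditional CDFs of the second:

```python
# Minimal sketch: sampling dependent risk factors zeta_1, zeta_2 from two
# independent uniforms via the (conditional) inverse empirical CDFs.
import numpy as np

rng = np.random.default_rng(5)
x1 = rng.standard_normal(5_000)
x2 = 0.5 * x1 + rng.standard_normal(5_000)           # toy correlated risk factors

H, xe, ye = np.histogram2d(x1, x2, bins=(10, 10))    # D_X x D_Y cell counts
xc, yc = 0.5 * (xe[:-1] + xe[1:]), 0.5 * (ye[:-1] + ye[1:])

F1 = np.cumsum(H.sum(axis=1)) / H.sum()              # marginal CDF of factor 1
F2 = np.cumsum(H, axis=1) / np.maximum(H.sum(axis=1, keepdims=True), 1)

def sample(n):
    u1, u2 = rng.uniform(size=n), rng.uniform(size=n)
    i = np.minimum(np.searchsorted(F1, u1), xc.size - 1)   # zeta_1 = F_1^{-1}(U_1)
    j = np.minimum(
        np.array([np.searchsorted(F2[k], u) for k, u in zip(i, u2)]), yc.size - 1
    )                                                      # zeta_2 = F_{2,zeta_1}^{-1}(U_2)
    return xc[i], yc[j]

z1, z2 = sample(100_000)
print(np.corrcoef(z1, z2)[0, 1])   # roughly recovers the original correlation
```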

3.5 McDiarmid's Inequality

Now that we have constructed the loss function with independent variables, we can apply McDiarmid's inequality as given by equation 2.16 to obtain a bound on the expected loss. Using equation 3.20 we identify a McDiarmid diameter of

$$D^2 = \frac{N \kappa^2}{N^2} \tag{3.21}$$

where $\kappa$ is the supremum of the change in losses caused by variation in a single parameter. $\kappa$ is defined by

$$\kappa^2 = \left( \sup_{u_1, u_2, v \in \mathbb{R}} \left| L(u_1, u_2) - L(v, u_2) \right| \right)^2 + \left( \sup_{u_1, u_2, v \in \mathbb{R}} \left| L(u_1, u_2) - L(u_1, v) \right| \right)^2 \tag{3.22}$$

with $L(u_1, u_2)$ the loss of equation 3.20, where each parameter, $u_1$ and $u_2$, is individually stressed such that the loss variability is maximal. Note that in the first supremum the argument of $F_1^{-1}$ is changed from $u_1$ to $v$, whereas in the second supremum the argument of $F_{2,\cdot}^{-1}$ is changed from $u_2$ to $v$.

We therefore obtain a McDiarmid bound of

$$P\left[ l - \bar{l}_N \ge L \right] \le \exp\left( -\frac{2L^2}{D^2} \right)$$

Now, we set $\theta_{N,\epsilon}^{McDiarmid}$ as the value of $L$ which makes the right-hand side equal to $\epsilon$. This means $\theta$ is given by

$$\epsilon = \exp\left[-\frac{2\,\theta_{N,\epsilon}^2}{N\kappa^2 / N^2}\right], \qquad \ln\epsilon = -\frac{2\,\theta_{N,\epsilon}^2\, N}{\kappa^2}, \qquad \theta_{N,\epsilon}^{McDiarmid} = \kappa\sqrt{\frac{1}{2N}\ln\frac{1}{\epsilon}} \tag{3.23}$$

and we finally obtain the McDiarmid bound given by

$$P\left\{ l \ge \bar{l}_N + \theta_{N,\epsilon}^{McDiarmid} \right\} \le \epsilon$$
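A minimal numerical sketch of equation 3.23 (the $\kappa$, $N$ and $\epsilon$ values below are hypothetical):

```python
# Minimal sketch: the McDiarmid margin theta of equation 3.23.
import numpy as np

def theta_mcdiarmid(kappa, n_obs, eps):
    """theta such that P[l >= l_bar_N + theta] <= eps."""
    return kappa * np.sqrt(np.log(1.0 / eps) / (2.0 * n_obs))

# Example: kappa = 0.05, 100 trading days, eps = 0.10.
print(theta_mcdiarmid(0.05, 100, 0.10))
```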

4 Application on credit portfolio

4.1 Bounding the expectation

In this section we will apply Hoeffding's inequality to the expected loss of the credit portfolio as described in section 2.2. We start from Hoeffding's inequality in equation 2.17, and note that the loss is bounded between 0 and 1 for each observation. We can therefore write

$$P[S_n - \mathbb{E}[S_n] \ge t] \le \exp\left(\frac{-2n^2 t^2}{\sum_{i=1}^{n}(1-0)^2}\right) = \exp\left(\frac{-2n^2 t^2}{n}\right) = \exp(-2nt^2) \tag{4.1}$$

where $t$ is the maximum value by which the empirical measure can exceed the true expected loss. The same holds for the lower bound of the expected loss; we can simply apply Hoeffding's inequality to the expression

$$P[-(S_n - \mathbb{E}[S_n]) \ge t] \le \exp\left(\frac{-2n^2 t^2}{\sum_{i=1}^{n}(1-0)^2}\right) = \exp(-2nt^2) \tag{4.2}$$

We can identify the obtained bound as a confidence level, denoted by $1 - \alpha$. This also allows us to obtain an optimal value of $t$ for a given $\alpha$:

$$\exp(-2nt^2) \equiv \alpha, \qquad -2nt^2 = \ln\alpha, \qquad t^2 = \frac{\ln\alpha}{-2n}, \qquad t = \sqrt{\frac{\ln\alpha}{-2n}} \tag{4.3}$$

We can now state that the probability of the true expectation $\mathbb{E}[S_n]$ exceeding the empirical expectation $S_n$ plus $t \equiv \sqrt{\frac{\ln\alpha}{-2n}}$, or lying below the empirical expectation $S_n$ minus $t$, is bounded by $\alpha$.

To understand the performance of the application of Hoeffding's inequality, we will follow Chen (2017) [1] in comparing the Hoeffding bound to Chebyshev's inequality. Chebyshev's inequality is given by

$$P[S_n - \mathbb{E}[S_n] \ge t] \le \epsilon \tag{4.4}$$

with

$$t = \sqrt{\frac{K}{\epsilon n}} \tag{4.5}$$

where $K$ is the variance of the data, which can be at most equal to 1, and $\epsilon$ is again the desired confidence level. The performance of both methods is then measured by the value of $t$. A lower $t$ means a stricter bound on the expectation value, and is therefore a more useful bound.
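A minimal numerical comparison of the two margins, assuming the reconstructed form of equation 4.5 and losses bounded in [0, 1]:

```python
# Minimal sketch: Hoeffding margin (equation 4.3) versus Chebyshev margin (equation 4.5).
import numpy as np

def t_hoeffding(n, alpha):
    return np.sqrt(np.log(alpha) / (-2.0 * n))

def t_chebyshev(n, eps, variance=1.0):
    # K is the variance of the data, at most 1 for losses in [0, 1].
    return np.sqrt(variance / (eps * n))

n = 500_000_000
print(t_hoeffding(n, 0.05), t_chebyshev(n, 0.05))   # Hoeffding is markedly tighter
```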

4.2 Hoeffding's inequality and tail bounds

4.2.1 Obtaining an upper bound

In this section we will apply Hoeffding's inequality, given by equation 2.17, to obtain a probability bound on the true loss percentile exceeding the empirical loss percentile by θ or more (θ > 0). The derivation is adapted from lecture notes by Hung (2004) [25].

We denote the true $p$ percentile by $\xi_p$, and the empirical $p$ percentile by $\hat{\xi}_p$. Starting from the probability that the true percentile exceeds the empirical percentile by $\theta$ or more, we obtain

$$\begin{aligned}
P\left[\xi_p - \hat{\xi}_p \ge \theta\right] &= P\left[\xi_p \ge \hat{\xi}_p + \theta\right] \\
&= P\left[p \ge F(\hat{\xi}_p + \theta)\right] \\
&= P\left[F_N(\hat{\xi}_p + \theta) + p \ge F_N(\hat{\xi}_p + \theta) + F(\hat{\xi}_p + \theta)\right] \\
&= P\left[F_N(\hat{\xi}_p + \theta) - F(\hat{\xi}_p + \theta) \ge F_N(\hat{\xi}_p + \theta) - p\right] \\
&= P\left[\frac{1}{N}\sum_i \mathbb{1}_{\{X_i \le \hat{\xi}_p + \theta\}} - \mathbb{E}\left[\mathbb{1}_{\{X_1 \le \hat{\xi}_p + \theta\}}\right] \ge F_N(\hat{\xi}_p + \theta) - p\right]
\end{aligned} \tag{4.6}$$

Combining this expression with Hoeffding's inequality, as given by equation 2.17, we identify $X_i = \mathbb{1}_{\{X_i \le \hat{\xi}_p + \theta\}}$ and $t = F_N(\hat{\xi}_p + \theta) - p$. We know the bounds of $X_i$, as the indicator function $\mathbb{1}$ can only be 1 or 0. We therefore obtain the inequality

$$P\left[\xi_p - \hat{\xi}_p \ge \theta\right] \le \exp\left(-2n^2 t^2 \Big/ \sum_i (b_i - a_i)^2\right) = \exp\left(-2nt^2\right) \equiv \alpha \tag{4.7}$$

Note that, in this calculation, we have only made the assumption that $\hat{\xi}_p$ is the percentile obtained from an empirical distribution. By using Hoeffding's inequality, this is enough to guarantee a bound on the percentile of the true distribution, improving on the uncertain bound obtained by bootstrapping methods.

4.2.2 Upper bound for a given confidence level

We can also rewrite equation 4.7 to obtain $\theta$, taking the upper bound given by $\alpha$ as an equality:

$$\begin{aligned}
\alpha &= \exp(-2nt^2) \\
\ln\alpha &= -2nt^2 \\
\frac{\ln\alpha}{-2n} &= t^2 \\
\sqrt{\frac{\ln\alpha}{-2n}} &= t = F_N(\hat{\xi}_p + \theta) - p \\
\sqrt{\frac{\ln\alpha}{-2n}} + p &= F_N(\hat{\xi}_p + \theta) \\
F_N^{-1}\left[\sqrt{\frac{\ln\alpha}{-2n}} + p\right] &= \hat{\xi}_p + \theta \\
F_N^{-1}\left[\sqrt{\frac{\ln\alpha}{-2n}} + p\right] - \hat{\xi}_p &= \theta
\end{aligned} \tag{4.8}$$

Note that, due to the use of the inverse empirical CDF, we need a minimal number of simulated scenarios in order to find a value for $\theta$. This minimal number $n_{min}$ can be expressed by

$$\sqrt{\frac{\ln\alpha}{-2n_{min}}} + p = 1 \implies \frac{\ln\alpha}{-2n_{min}} = (1-p)^2 \implies n_{min} = \frac{\ln\alpha}{-2(1-p)^2} \tag{4.9}$$

Using this equation, we calculate that it is necessary to have at least $n_{min} \approx 1.5 \times 10^6$ simulated scenarios to calculate a bound for the 99.9 percentile with $\alpha = 0.05$.
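A minimal sketch of equations 4.8 and 4.9, using np.quantile of the sample as a stand-in for the inverse empirical CDF and a toy loss sample:

```python
# Minimal sketch: upper-bound margin theta for the p-quantile (equation 4.8)
# and the minimum sample size n_min (equation 4.9).
import numpy as np

def theta_upper(losses, p=0.999, alpha=0.05):
    n = losses.size
    xi_hat = np.quantile(losses, p)                  # empirical p-quantile
    target = np.sqrt(np.log(alpha) / (-2.0 * n)) + p
    return np.quantile(losses, target) - xi_hat      # F_N^{-1}(...) - xi_hat

def n_min(p=0.999, alpha=0.05):
    return np.log(alpha) / (-2.0 * (1.0 - p) ** 2)

print(n_min())                                       # about 1.5e6 scenarios
losses = np.random.default_rng(6).beta(0.5, 60.0, size=2_000_000)
print(theta_upper(losses))
```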

4.2.3 Obtaining a lower bound

A lower bound can also be calculated for the percentile. We start from the probability that the empirical percentile exceeds the true percentile by $\theta$ or more ($\theta > 0$), or equivalently, that the true percentile plus $\theta$ is still lower than the empirical percentile. Combined with the upper bound, we have essentially decomposed the absolute difference $|\xi_p - \hat{\xi}_p|$ into both its positive (upper bound) and negative (lower bound) part. For the lower bound, we have the expression

$$\begin{aligned}
P\left[-\left(\xi_p - \hat{\xi}_p\right) \ge \theta\right] &= P\left[\xi_p - \hat{\xi}_p \le -\theta\right] \\
&= P\left[\xi_p \le \hat{\xi}_p - \theta\right] \\
&= P\left[p \le F(\hat{\xi}_p - \theta)\right] \\
&= P\left[F(\hat{\xi}_p - \theta) - F_N(\hat{\xi}_p - \theta) \ge p - F_N(\hat{\xi}_p - \theta)\right] \\
&= P\left[-\left(F_N(\hat{\xi}_p - \theta) - F(\hat{\xi}_p - \theta)\right) \ge p - F_N(\hat{\xi}_p - \theta)\right] \\
&= P\left[-\left(\frac{1}{N}\sum_i \mathbb{1}_{\{X_i \le \hat{\xi}_p - \theta\}} - \mathbb{E}\left[\mathbb{1}_{\{X_1 \le \hat{\xi}_p - \theta\}}\right]\right) \ge p - F_N(\hat{\xi}_p - \theta)\right]
\end{aligned} \tag{4.10}$$

Due to the symmetries in Hoeffding's inequality, the minus sign in front of $(F_N(\ldots) - F(\ldots))$ makes no difference in the application of the inequality. We identify $X_i = \mathbb{1}_{\{X_i \le \hat{\xi}_p - \theta\}}$ and $t = p - F_N(\hat{\xi}_p - \theta)$. The bounds of $X_i$ are again 0 and 1. Using Hoeffding's inequality, we therefore obtain

$$P\left[-\left(\xi_p - \hat{\xi}_p\right) \ge \theta\right] \le \exp\left(-2n^2 t^2 \Big/ \sum_i (b_i - a_i)^2\right) = \exp\left(-2nt^2\right) \equiv \alpha \tag{4.11}$$

with $t = p - F_N(\hat{\xi}_p - \theta)$.

Like the upper bound case before, we can again find an optimal bound for any given confidence level $\alpha$. We calculate

$$\begin{aligned}
\alpha &= \exp(-2nt^2) \\
\ln\alpha &= -2nt^2 \\
\frac{\ln\alpha}{-2n} &= t^2 \\
\sqrt{\frac{\ln\alpha}{-2n}} &= t = p - F_N(\hat{\xi}_p - \theta) \\
p - \sqrt{\frac{\ln\alpha}{-2n}} &= F_N(\hat{\xi}_p - \theta) \\
F_N^{-1}\left[p - \sqrt{\frac{\ln\alpha}{-2n}}\right] &= \hat{\xi}_p - \theta \\
\hat{\xi}_p - F_N^{-1}\left[p - \sqrt{\frac{\ln\alpha}{-2n}}\right] &= \theta
\end{aligned} \tag{4.12}$$

4.3 Parameter uncertainty

In this section we will apply McDiarmid's inequality not to the losses, as we have done up to now, but to the parameters. This allows us to consider the contribution of the uncertainty in each parameter to the total variability and to the bounds of the losses. For example, if the global sensitivity factor $\alpha_{G,i}$ is not exactly known but is only known within a 5% margin of the value given by the portfolio dataset, we can quantify the additional variability and the increased upper bound observed in the 99.9 percentile.

Using McDiarmid subdiameters as defined by equation 2.11, it is possible to consider each parameter individually and its effect on the total McDiarmid diameter as defined in equation 2.12. Note that the loss itself will always have some variability, as the parameters $F_G$, $F_{R_j}$, $\epsilon_{1,i}$, and $\epsilon_{2,i}$ are randomly sampled from a standard normal distribution.

The total credit portfolio loss is bounded between 0, when nothing is lost, and 1, when the entire portfolio is lost. These bounds are necessary for the application of McDiarmid’s inequality in equation 2.13. However, the underlying parameters of the portfolio loss, as shown in equations 2.4, 2.6, and 2.8, have no such bounds: the random numbers drawn from a standard normal distribution can theoretically attain any real value. Since the McDiarmid subdiameter in equation 2.11 is defined through a supremum, every possibility must be included, even if it occurs with vanishingly small probability. Exploring this unbounded parameter space exhaustively is computationally infeasible.


To obtain the supremum, we use two global optimizers, dual annealing and differential evolution, both as supplied by the Python package scipy. The optimizers require finite bounds on the space over which they search. For this reason, the random variables are bounded at multiple truncation levels.

For each subdiameter, the ‘known’ parameter values given by the portfolio dataset are used, with the exception of a single parameter. This parameter value is allowed to deviate by a set percentage from its known value. The random variables are always allowed to vary, although in the supremum their values will be fixed. The optimizer then searches the 5D parameter space, composed of the four random variables and the single uncertain parameter, to find the maximum difference between two losses obtained by only varying the uncertain parameter and keeping all other parameters and random variables fixed. By using two optimizers we attempt to minimize inaccuracies in case a single optimizer fails. If both optimizers succeed in finding a supremum, we continue with the larger of the two. A sketch of this setup is given below.
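The following is a schematic sketch of how a single subdiameter could be computed with scipy’s dual_annealing and differential_evolution. The loss function, the parameter vector, the 5% margin, and the truncation level of the random variables are placeholders; in particular, the search space here contains two candidate values of the uncertain parameter next to the four truncated random variables, which is one way to realize the supremum in equation 2.11 and may differ in detail from the implementation used in this thesis.

\begin{verbatim}
import numpy as np
from scipy.optimize import dual_annealing, differential_evolution

def subdiameter(loss_fn, nominal, idx, rel_margin=0.05, rv_bound=5.0, seed=0):
    # Approximate the McDiarmid subdiameter of loss_fn with respect to the
    # parameter nominal[idx]. loss_fn(params, rvs) should return a loss in
    # [0, 1]; rvs are the four standard-normal draws (F_G, F_R, eps_1, eps_2),
    # truncated to [-rv_bound, rv_bound] so the optimizers have a finite box.
    lo = nominal[idx] * (1.0 - rel_margin)
    hi = nominal[idx] * (1.0 + rel_margin)
    bounds = [(-rv_bound, rv_bound)] * 4 + [(min(lo, hi), max(lo, hi))] * 2

    def neg_abs_diff(x):
        rvs = x[:4]
        p1, p2 = nominal.copy(), nominal.copy()
        p1[idx], p2[idx] = x[4], x[5]
        # Only the uncertain parameter differs between the two evaluations.
        return -abs(loss_fn(p1, rvs) - loss_fn(p2, rvs))

    res_da = dual_annealing(neg_abs_diff, bounds, seed=seed)
    res_de = differential_evolution(neg_abs_diff, bounds, seed=seed)
    # Continue with the larger supremum found by the two optimizers.
    return max(-res_da.fun, -res_de.fun)
\end{verbatim}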


5 Results

In this chapter, results on both McDiarmid’s inequality applied to treasury bonds and Hoeffding’s inequality applied to credit portfolio losses are presented. Results on the parameter uncertainty analysis of the portfolio loss generation function, as shown in equation 2.8, are also presented.

The interpretation and discussion of the results presented in this chapter are given in chapter 6.

5.1 Treasury bonds

We follow Chen’s example and use the US Treasury bond data from Gürkaynak et al. (2007) [22]. This dataset is updated once per week at https://www.federalreserve.gov/data/nominal-yield-curve.htm. To compare results with Chen, the same period of 100 trading days, between 2013-03-04 and 2013-07-24, is used.

After performing the data preparation described in section 3.1, PCA is performed on the price change vector $\hat{\xi}$. The 30 components, one for each maturity, of the first three eigenvectors are shown in figure 5.1. These three eigenvectors explain 99.91% of the variance. However, the following steps are carried out using two eigenvectors, which explain 99.40% of the total variance in the data.
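As an illustration of this step, the following sketch computes the eigenvectors, the explained variance, and the projected risk factors directly from the covariance matrix, assuming the daily price changes are stored row-wise in a matrix; this is not the thesis code, and the exact projection used there (equation 3.5) may differ.

\begin{verbatim}
import numpy as np

def pca_risk_factors(price_changes, n_factors=2):
    # price_changes: array of shape (days, 30), one column per maturity.
    cov = np.cov(price_changes, rowvar=False)        # 30 x 30 covariance matrix
    eigval, eigvec = np.linalg.eigh(cov)             # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1]                 # sort descending by variance
    eigval, eigvec = eigval[order], eigvec[:, order]
    explained = np.cumsum(eigval) / eigval.sum()     # cumulative explained variance
    centred = price_changes - price_changes.mean(axis=0)
    factors = centred @ eigvec[:, :n_factors]        # project onto leading eigenvectors
    return factors, eigvec[:, :n_factors], explained
\end{verbatim}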


Figure 5.1: The 30 components of the first three eigenvectors of the covariance matrix of $\hat{\xi}$. The three eigenvectors explain 99.91% of the variance in the data; using two eigenvectors reduces this to 99.40%.

The two uncorrelated, but possibly still dependent, risk factors are then calculated using equation 3.5 and used to create the scatter plot of all 99 price changes in the 100-day period, shown in figure 5.2. For both risk factors, 10 segments are used to fit the empirical distribution function.

Figure 5.2: Scatter plot of risk factors from treasury bond yield data between 2013-03-04 and 2013-07-24. The complete domain is divided into 10 segments for each risk factor.

Next, the number of risk factor sets inside each cell is counted. From the number of points per cell, spline interpolation is applied with 100 interpolation points per dimension to obtain the density function of the two risk factors. This interpolated empirical density function is shown in figure 5.3.

Note that this interpolation is not normalized to a density of 1. To generate new independent risk factors from independent uniform random variables, an empirical cumulative distribution function (ECDF) is needed from which values with the correct statistics can be sampled. It is far easier to normalize the ECDF, as it only needs to be divided by its maximum value.
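A sketch of how the gridded counts can be turned into an interpolated density and a normalized ECDF is given below; the grid sizes follow the text (10 segments per factor, 100 interpolation points per dimension), but the exact spline construction used in the thesis may differ.

\begin{verbatim}
import numpy as np
from scipy.interpolate import RectBivariateSpline

def empirical_density_and_ecdf(factor1, factor2, n_seg=10, n_interp=100):
    # Count risk-factor pairs per cell on an n_seg x n_seg grid.
    counts, edges1, edges2 = np.histogram2d(factor1, factor2, bins=n_seg)
    centres1 = 0.5 * (edges1[:-1] + edges1[1:])
    centres2 = 0.5 * (edges2[:-1] + edges2[1:])
    # Spline-interpolate the counts onto a finer grid (unnormalized density).
    spline = RectBivariateSpline(centres1, centres2, counts, kx=3, ky=3)
    grid1 = np.linspace(centres1[0], centres1[-1], n_interp)
    grid2 = np.linspace(centres2[0], centres2[-1], n_interp)
    density = np.clip(spline(grid1, grid2), 0.0, None)
    # Cumulative sums give the (unnormalized) ECDF; dividing by its
    # maximum value, i.e. the last entry, normalizes it to 1.
    ecdf = density.cumsum(axis=0).cumsum(axis=1)
    ecdf /= ecdf[-1, -1]
    return grid1, grid2, density, ecdf
\end{verbatim}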

Figure 5.3: Unnormalized empirical density function obtained using spline interpolation from the data shown in figure 5.2, plotted against the first and second principal components.

For the application of McDiarmid’s inequality as given by equation 2.13, the parameters must be independent. By constructing the ECDF it is now possible to sample risk factors from the ECDF using two independent uniform random numbers, as derived in equation 3.20. As an example, 1,000 risk factors are sampled, resulting in figure 5.4.
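Equation 3.20 is not reproduced here; as a generic illustration, the sketch below draws risk-factor pairs by inverse-transform sampling on the density grid from the previous sketch, with one uniform driving the marginal of the first factor and the other the conditional of the second factor given the first. The construction based on the joint ECDF in the thesis may differ in detail.

\begin{verbatim}
import numpy as np

def sample_risk_factors(grid1, grid2, density, n_samples, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    u1 = rng.uniform(size=n_samples)
    u2 = rng.uniform(size=n_samples)
    # Marginal CDF of factor 1 (sum the density over the factor-2 axis).
    marginal1 = density.sum(axis=1)
    cdf1 = np.cumsum(marginal1) / marginal1.sum()
    idx1 = np.clip(np.searchsorted(cdf1, u1), 0, len(grid1) - 1)
    # Conditional CDF of factor 2 given the sampled factor-1 grid point.
    conditional = np.cumsum(density[idx1, :], axis=1)
    conditional = conditional / np.maximum(conditional[:, -1:], 1e-12)
    idx2 = np.clip((conditional < u2[:, None]).sum(axis=1), 0, len(grid2) - 1)
    return grid1[idx1], grid2[idx2]
\end{verbatim}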


Figure 5.4: Example of generating 1,000 risk factors from the empirical density function using two independent uniform random variables.

For the final results, 300 sets of risk factors are independently generated. For each set, another 300 values for v are sampled to obtain the supremum as defined in the expression for κ in equation 3.22. Remember that κ captures the worst-case losses produced by fluctuations in the two risk factors. A larger value of κ denotes the possibility of larger losses, and will therefore produce a larger upper bound.

For the treasury bond data between 2013-03-04 and 2013-07-24, consisting of 100 trading days, a value of κ = 0.181 is obtained. From this κ, the bound can be calculated by equation 3.23. Using ε = 0.1, a value of θ = 0.0195 is obtained. The empirical mean of the loss in this period was 0.0035, which results in an expected loss bound of 0.0230. As we have used a portfolio with its value normalized to 1, we can write this bound as 2.30% of the portfolio value.

Taking semiannual windows from 2000-01-01 to 2019-12-31 and a last window from 2020-01-01 to 2020-05-29, values for κ are obtained as shown in figure 5.5. In figure 5.6 the expected loss bound is shown for the same semiannual windows. The expected loss bound is given by the sum of the expected loss and $\theta^{N,\epsilon}_{\text{McDiarmid}}$, as given by equation 3.23, with ε = 0.10.


Figure 5.5: Graph of obtained κ per semiannual window (January 1st 2000 – May 29th 2020), using semiannual windows starting from 2000-01-01. The last window only runs until 2020-05-29, as newer data was not yet available.

Figure 5.6: Graph of obtained expected loss bounds per semiannual window (January 1st 2000 – May 29th 2020), using semiannual windows starting from 2000-01-01, where ε is set to 0.10.

As the last window shows the largest κ and expected loss bound, a zoomed-in version is also created using 2-month windows from February 1st 2019 to May 29th 2020. The resulting graph for κ is shown in figure 5.7 and the graph for the expected loss bound in figure 5.8. The bound reaches a maximum of 4.39% in the window from 2020-02-01 to 2020-03-31. Note that the obtained bound is larger than with semiannual windows over the same period, as we have used a different N in calculating both bounds. Using fewer observations, i.e. a smaller N, creates larger bounds, as shown by equation 3.23.

Figure 5.7: Graph of obtained κ per 2-month window (February 1st 2019 – May 29th 2020), using 2-month windows starting from 2019-02-01.

Figure 5.8: Graph of obtained expected loss bounds per 2-month window (February 1st 2019 – May 29th 2020), using 2-month windows starting from 2019-02-01 and ε set to 0.10.

5.2 Credit portfolio losses

In this section the results on the credit portfolio losses are presented. We start with a comparison between the confidence intervals on the expected loss obtained with Hoeffding’s inequality and with Chebyshev’s inequality. Next, results on the 99.9 percentile are presented, with a comparison between Hoeffding’s inequality and the traditional bootstrap and jackknife estimators. This section is concluded by the results on parameter uncertainty using McDiarmid’s inequality.

5.2.1 Expected loss

For the expected loss two methods are used: Hoeffding’s inequality and Chebyshev’s inequality. An empirical average is also shown, computed as the mean of all observed losses. The result on a dataset of 500M simulations with a wide range of α is shown in figure 5.9. The Hoeffding bound is shown to be stricter for all confidence levels, and also does not explode as quickly as the Chebyshev bound when the confidence level approaches 100%. Note that the shown confidence interval is two-tailed. A sketch of both bounds is given below.
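The following is a small sketch of the two-sided bounds compared in figure 5.9, for losses bounded in [0, 1]; the use of the sample variance as a plug-in for the Chebyshev bound is an assumption on my part and not necessarily the thesis implementation.

\begin{verbatim}
import numpy as np

def expected_loss_bounds(losses, confidence=0.95):
    # Two-sided bounds on the expected loss for losses bounded in [0, 1].
    n = len(losses)
    mean = losses.mean()
    alpha = 1.0 - confidence
    # Hoeffding: P(|mean - mu| >= t) <= 2 exp(-2 n t^2) = alpha
    t_hoeffding = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))
    # Chebyshev: P(|mean - mu| >= t) <= var / (n t^2) = alpha (sample variance plug-in)
    t_chebyshev = np.sqrt(losses.var(ddof=1) / (n * alpha))
    return mean, mean + t_hoeffding, mean + t_chebyshev
\end{verbatim}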

In figure 5.10 the results for a range of N with a fixed confidence level of 95% are shown. Note that results with a negative bound are floored to 0%, as a negative loss makes no sense in the current context. Again, the Hoeffding bound is shown to be stricter for all N.

Figure 5.9: Obtained bounds for the expected loss of the credit portfolio (empirical average, Chebyshev bound, and Hoeffding bound) for all confidence levels and 500M observations.
