
Faculty of Economics and Business

Managing Swaption Risk

with a Dynamic SABR Model

MSc in Econometrics

Financial Econometrics track

Frank de Zwart

10204245

supervised by Dr. S.A. Broda and, at ABN AMRO, Ms. Hiltje Bijkersma

July 28, 2017

ABN AMRO Bank N.V.


Statement of Originality

This document is written by Student Frank de Zwart who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Abstract

This thesis focuses on models that can be used to estimate risk measures such as Value at Risk and Expected Shortfall. The displaced Black's model and the displaced SABR volatility model are used to price a portfolio of swaptions. The aim is to capture the dynamics of the SABR parameters in a time series model in order to obtain more accurate swaption risk estimates. This time series model is used to simulate the one-day-ahead profit and loss distribution and is then compared to the Historical Simulation method. In an empirical study, we compute Value at Risk and Expected Shortfall estimates based on the Historical Simulation method as well as on the time series model. These models are analyzed with several backtests and diagnostic tests to answer the following research question: can one outperform the Historical Simulation Value at Risk and Expected Shortfall forecasts by fitting a time series model to the calibrated SABR model parameters instead?

A vector autoregressive model is used as well as a local level model. Based on these two models we are not able to outperform the Historical Simulation estimates of the risk measures. Diagnostic tests show remaining significant autocorrelation as well as heteroskedasticity in the residuals of the vector autoregressive model. The backtests that are carried out also show that the vector autoregressive model performs worse than the Historical Simulation method.


Contents

1 Introduction
2 Preliminaries on financial notation
  2.1 Interest rate instruments
  2.2 Bootstrapping the zero curve
  2.3 Swaptions
  2.4 Martingales and Measures
3 Literature review
4 Models and method
  4.1 Option pricing models
  4.2 Time series analysis
  4.3 Risk measurement
  4.4 Backtests
5 Data
  5.1 Calculating the implied volatilities
  5.2 Leaving out some strikes
6 Empirical study and results
  6.1 Calibrating the SABR model parameters
  6.2 Fitting a model through the SABR parameters time series
  6.3 Risk measurement
  6.4 Backtests
  6.5 Robustness check: Local level model
7 Conclusion
References


1 Introduction

The Basel Committee (2013) has introduced the Fundamental Review of the Trading Book (FRTB). To contribute to a more resilient banking sector, it has decided to replace the current framework's reliance on Value at Risk (VaR) with the Expected Shortfall (ES) measure for estimating market risk. At the same time, Pérignon and Smith (2010) report that most banks use Historical Simulation (HS) to estimate their VaR. The Historical Simulation method computes the VaR from past returns of the portfolio's present assets, so that one obtains the distribution of price changes that would have been realized had the current portfolio been held throughout the observation period. The decision described in the FRTB shows that it is becoming even more important for financial institutions to estimate their market risk accurately. Yet a relatively simple method is still widely used to obtain these risk measures. One of the main drawbacks of the Historical Simulation method is that it does not take the decreasing predictability of older returns into account.
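As a concrete illustration of the Historical Simulation logic described above, the following minimal Python sketch computes VaR and ES as an empirical quantile and a tail average of equally weighted past profit-and-loss values. The function name and the simulated P&L window are illustrative assumptions, not part of this thesis.

```python
import numpy as np

def hs_var_es(pnl, level=0.99):
    """Historical Simulation VaR and ES from a window of past P&L values.

    pnl:   historical one-day profit-and-loss values (losses are negative)
    level: confidence level, e.g. 0.99
    Returns (VaR, ES), both reported as positive loss amounts.
    """
    losses = -np.asarray(pnl)                 # switch from P&L to losses
    var = np.quantile(losses, level)          # empirical quantile of the losses
    es = losses[losses >= var].mean()         # average loss beyond the VaR level
    return var, es

# Illustrative example: a window of 500 equally weighted historical P&L values
pnl_window = np.random.default_rng(0).normal(0.0, 1.0, size=500)
print(hs_var_es(pnl_window, 0.99))
```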

Derivatives are traded extensively these days, and one of these products is the swap option, or swaption. A swaption is an option on an interest rate swap. Swaptions are traded over the counter, so compared to exchange-traded derivatives, information on them is scarcer and not publicly available. This makes it an interesting challenge to find an accurate method to assess the risk of holding these derivatives. Besides this, negative interest rates affect almost all valuation methods for these options. In the current interest rate environment, the Historical Simulation method that is used to produce the VaR and ES estimates of market risk may not be reliable. Hence, finding a method to obtain more reliable VaR and ES estimates, based on historical swaption data, is of interest.

This leads to the following research question, which defines the main purpose of this thesis: can one outperform the Historical Simulation Value at Risk and Expected Shortfall forecasts by fitting a time series model to the calibrated SABR model parameters instead? An empirical study is performed to answer this question. The study is based on an ICAP data set of swaption premiums, interest rate deposits, and interest rate swaps. The time series of the displaced SABR volatility model parameters, on a time grid of approximately 2.5 years, is analyzed to obtain a one-day-ahead forecast of the price of a portfolio of swaptions. Finally, a backtesting procedure assesses the quality of this new method compared to the well-known Historical Simulation method.

The remainder of this research report is structured as follows. First, in Section 2, the necessary background theory for this research is discussed. This includes theory on interest rate instruments in general as well as a description of an interpolation method, called bootstrapping, that is used to obtain the zero and discount curves. We then describe swaptions and discuss some of their relevant trading strategies; the section concludes with a description of martingales and measures. We briefly discuss the relevant literature for this research in Section 3 and subsequently continue in Section 4 with the theory that is used to price the swaptions. In that section, the well-known model of Black (1976) is described in detail. Besides this, we focus on the SABR volatility model of Hagan et al. (2002) and the correction to their work from Obłój (2008). We then discuss the implications of negative interest rates for these models. In Section 4.2, the basic time series models used in this research are discussed. We then continue with the risk measurement concepts, such as Value at Risk and Expected Shortfall. Finally, several different backtests are elaborated; different backtests are used to assess the quality of our model estimates as thoroughly as possible. The data set is described in Section 5: not only the raw data, but also the different pre-processing steps, together with the argumentation on why some adjustments are made. In the next section, Section 6, the empirical study and results are described. This section follows the structure of Section 4: it starts with the calibrated SABR parameters and continues with the time series analysis, risk measurement, and the backtesting procedure. Besides the backtests themselves, some diagnostic tests are carried out to assess the quality of the fit of the time series analysis. The results are elaborated and discussed at every single step. Finally, in Section 7, the main findings are summarized and a conclusion is drawn. The research question is answered and some limitations and recommendations for further research are provided.

2 Preliminaries on financial notation

Trading in derivatives has become an indispensable part of the financial industry. There are multiple different derivatives for every type of investment asset. The magnitude of this market shows that it is of great importance to understand how these derivatives work. Consequently, many researchers have focused on these derivatives, and numerous papers and books describe how they work and what risks the holder of an open position in them is taking. We will first explain some basic but crucial concepts of interest rate instruments. Then, in Section 4, we will describe the models and methods that are applied in the empirical analysis of this research.

2.1 Interest rate instruments

Interest rates are crucial in the valuation of derivatives. Especially the 'risk-free' rate is of concern when evaluating derivatives. Hull (2012) explains that the interest rates implied by Treasury bills are artificially low because of favorable tax treatment and other regulations. For this reason the LIBOR rate became commonly used instead. However, when rates ascended during the crisis of 2007, many derivatives dealers had to review their practices. The LIBOR rate is the short-term opportunity cost of capital of AA-rated financial institutions, and it lifted off due to the unwillingness of banks to lend to each other during the crisis. Many dealers have now switched to using overnight indexed swaps (OIS), because they are closer to being 'risk-free'. This research focuses on the Euro market and makes use of the Euribor rate. The Euribor rate is similar to the LIBOR rate, but it is based on a panel of European banks only: the rate at which these banks borrow funds from one another is the Euro Interbank Offered Rate (Euribor). Although the Euribor rate is not theoretically risk-free, it is still considered a good alternative against which to measure the risk and return trade-off.

There is one very important assumption that makes the risk-free rate even more crucial: the assumption of a risk-neutral world. In a risk-neutral world it is assumed that all investors are risk-neutral; in other words, they do not require a higher expected return from an investment that is more risky.

Theorem 2.1. This leads to the following two characteristics of a risk-neutral world (Hull, 2012):

1. The expected return on an investment is the risk-free rate.

2. The discount rate used for the expected payoff on a financial instrument is the risk-free rate.

These properties make the pricing of derivatives much simpler. The world is not actually risk-neutral; however, it can be shown that if we compute the price of a derivative under the risk-neutral world assumption, we obtain the correct price for the derivative in all worlds. This makes a significant difference, because much is still unknown about the risk preferences of buyers and sellers of derivatives.

The main focus of this research is on swaptions. The underlying of this product is an interest rate swap, and therefore this derivative is discussed first. Before swaps are considered, however, we briefly discuss forward rate agreements; these agreements give insight into how a swap can be priced. We then continue with an interpolation method, known as bootstrapping, that is used to obtain the zero curve. The swaption is then explained together with some of its most common trading strategies. Finally, we continue with the theory of martingales and measures. These measures are used to compute the discounted expected value of a certain future payoff.

A forward rate agreement (FRA) is an agreement designed to ensure that a certain interest rate will apply to either borrowing or lending a certain principal during a specified future period of time. We define R_K as the interest rate agreed to in the FRA and R_F as the forward of the reference rate at time T_α for the period between times T_α and T_β. The value at time t of a FRA in which R_K is received is

$$V_{FRA}(t) = L\,(R_K - R_F)(T_\beta - T_\alpha)\,P(t, T_\beta), \qquad (2.1)$$

where L is the principal of the FRA and P(t, T) is the present value at time t of 1 Euro received at time T (Brigo and Mercurio, 2007).

The forward interest rate used in FRAs is implied by current zero rates for future periods of time. An n-year zero rate is the rate of interest earned on an investment that starts today and lasts for n years, with all interest and principal realized at the end of the n years. A curve of zero rates can be created from market quotes by using a popular interpolation method known as the bootstrap method. This method is described in more detail in Section 2.2.

A fixed-for-floating swap, also known as a payer swap, is the most common type of swap. In this swap an investor agrees to pay interest at a predetermined fixed rate on a notional principal for a predetermined number of years, and in return receives interest at a floating rate on the same notional principal for the same period of time. A swap can be characterized as a portfolio of forward rate agreements, and this can be used to determine its value. The value of the swap is simply the sum of multiple FRAs, so the value of a payer swap is given by

$$V_{swap}(t) = L \sum_{i=\alpha+1}^{\beta} (R_K - R_{F_i})(T_i - T_{i-1})\,P(t, T_i), \qquad (2.2)$$

where the length of the swap, T_β − T_α, is called the tenor, with n years between T_α and T_β and m cash flows per year. Throughout this entire paper we denote by m the swap payment frequency per annum. In total we thus have n × m cash flows, which can be valued like FRAs. This leads to the sum shown in (2.2), which runs over all n × m cash flows (Brigo and Mercurio, 2007).
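To make the FRA decomposition concrete, the sketch below values a swap as the sum of per-period FRA values, following (2.1) and (2.2) with the sign convention used there (the fixed rate R_K is received). The function name, the flat discount curve, and all numbers are illustrative assumptions.

```python
import numpy as np

def swap_value(L, R_K, fwd_rates, schedule, discount):
    """Value a swap as a portfolio of FRAs, cf. (2.1)-(2.2).

    L:         notional principal
    R_K:       fixed rate agreed in the swap
    fwd_rates: forward rates R_{F_i} for the periods [T_{i-1}, T_i]
    schedule:  payment times T_alpha, T_{alpha+1}, ..., T_beta in years
    discount:  function t -> P(0, t), the price of a zero-coupon bond
    """
    value = 0.0
    for i in range(1, len(schedule)):
        tau = schedule[i] - schedule[i - 1]        # year fraction T_i - T_{i-1}
        # one FRA per period, valued as in (2.1)
        value += L * (R_K - fwd_rates[i - 1]) * tau * discount(schedule[i])
    return value

# Illustrative example: 2-year swap, semi-annual payments, flat curve
disc = lambda t: np.exp(-0.005 * t)
sched = [0.0, 0.5, 1.0, 1.5, 2.0]
print(swap_value(1e6, 0.004, [0.003, 0.004, 0.005, 0.006], sched, disc))
```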

2.2 Bootstrapping the zero curve

Only spot rates are quoted in the market, so the bootstrapping method is used to obtain the forward rates and forward swap rates. This method works by incrementally computing zero-coupon bonds in order of increasing maturity.

As curve inputs, multiple market quotes based on the Euribor rate are used first. More precisely, interest rate deposits with maturities varying from overnight up to 3 weeks are used. To expand the time grid out to 30 years, swaps are used with maturities varying from 1 month up to 30 years. Interpolation over these market quotes gives all zero-coupon prices, which we can use to compute the forward rates and the forward swap rates.

Uri (2000) lists the different payment frequencies, compounding frequencies, and day count conventions applicable to each currency-specific interest rate type. The conventions for the Euro rates are used for this research: for Euro deposit rates the day count convention is ACT/360 and for Euro swap rates it is 30/360.

The deposit rates that are used for the time grid of the swap curve up to 3 weeks are inherently zero-coupon rates. For this reason they only need to be converted to the base currency swap rate compounding frequency and day count convention. The day count convention of the deposit is ACT/360, so we can directly interpolate the data points to obtain the first part of the zero curve.

For the middle part of the curve one could use market quotes of forward rate agreements, as described by Uri (2000). This can be preferable, because they carry a fixed time horizon to settlement and settle at maturity. However, FRAs can lack liquidity, which results in inaccurate market quotes; for this reason only swaps and deposits are used. The annually compounded zero swap rate is used to construct most of the zero curve, and the different day count convention of the swaps is taken into account. The discount rates are computed based on the deposit and swap rates. Brigo and Mercurio (2007) define the zero curve at time t as a graph of the simply compounded interest rates for maturities up to one year and of the annually compounded rates for maturities larger than one year. The simply compounded and the annually compounded interest rates are defined as follows:

$$L(t, T) = \frac{1 - P(t, T)}{(T - t)\,P(t, T)}, \qquad (2.3)$$

and

$$Y(t, T) = \frac{1}{[P(t, T)]^{1/(T - t)}} - 1, \qquad (2.4)$$

where L(t, T) represents the simply compounded interest rate at time t for maturity T and Y(t, T) the annually compounded interest rate, respectively. The simply compounded interest rates that make up the first part of the zero curve are now combined with the annually compounded rates used for the remainder of the curve. To do so, we first define R(t_i) as the interest rate corresponding to maturity t_i, where i is the market observation index. Hence, R(t_i) represents the simply compounded interest rate if t_i ≤ 1 and the annually compounded interest rate if t_i > 1. There is no single correct way to construct the complete zero curve. It is however important that the derived yield curve is consistent, smooth, and closely tracks the observed market points. Uri (2000) also mentions that over-smoothing the yield curve might eliminate valuable market pricing information. Piecewise linear interpolation and piecewise cubic spline interpolation are two commonly used methods that are appropriate for market pricing.

The piecewise linear interpolation method is simple to implement: the value of a new data point is assigned according to its position along a straight line between observed market data points. One drawback of this method, however, is that it produces kinks in the areas where the yield curve changes slope. The piecewise linear interpolation can be written in closed form as

$$R(t) = R(t_i) + \frac{t - t_i}{t_{i+1} - t_i}\,\left[R(t_{i+1}) - R(t_i)\right], \qquad (2.5)$$

where t_i ≤ t ≤ t_{i+1}.
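As an aside, (2.5) is exactly the interpolation performed by numpy's interp, so the first part of the zero curve can be evaluated in one call. The maturities and rates below are illustrative placeholders rather than the market data of this study.

```python
import numpy as np

# Illustrative observed maturities (years) and zero rates
ts = np.array([0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 30.0])
rs = np.array([-0.004, -0.003, -0.002, -0.001, 0.002, 0.005, 0.009])

# np.interp evaluates R(t) = R(t_i) + (t - t_i)/(t_{i+1} - t_i) * [R(t_{i+1}) - R(t_i)],
# i.e. the piecewise linear rule of (2.5), over a fine grid of maturities
t_grid = np.linspace(ts[0], ts[-1], 200)
zero_curve = np.interp(t_grid, ts, rs)
```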

To avoid the kinks produced by the linear method, one can instead fit a polynomial function through the observed market data points. It is possible to use either a single high-order polynomial or a number of lower-order polynomials. The latter is preferred, because the extra degrees of freedom can be used to impose additional constraints that ensure smoothness of the curve. The piecewise cubic spline technique goes through all observed data points and creates the smoothest curve that fits the observations while avoiding kinks.

We can construct a cubic polynomial for each of the n − 1 splines between the n market observations. Let Q_i(t) denote the cubic polynomial associated with the segment [t_i, t_{i+1}]:

$$Q_i(t) = a_i(t - t_i)^3 + b_i(t - t_i)^2 + c_i(t - t_i) + R(t_i), \qquad (2.6)$$

where R(t_i) again represents market observation i and t_i the time to maturity of market observation i.

With three unknown coefficients per spline and n − 1 splines, we have 3n − 3 unknown coefficients, on which we impose the following constraints:

$$a_i(t_{i+1} - t_i)^3 + b_i(t_{i+1} - t_i)^2 + c_i(t_{i+1} - t_i) = R(t_{i+1}) - R(t_i),$$
$$3a_{i-1}(t_i - t_{i-1})^2 + 2b_{i-1}(t_i - t_{i-1}) + c_{i-1} - c_i = 0,$$
$$6a_{i-1}(t_i - t_{i-1}) + 2b_{i-1} - 2b_i = 0,$$
$$b_1 = 0,$$
$$6a_{n-1}(t_n - t_{n-1}) + 2b_{n-1} = 0. \qquad (2.7)$$

The first set of n − 1 constraints forces the polynomials to fit the observations exactly at the knot points. To also let the first and second order derivatives of adjacent polynomials match, we impose the second and third sets of constraints (2n − 4 in total). Finally, two endpoint constraints set the second derivative equal to zero at both ends. We end up with a linear system of 3n − 3 equations in 3n − 3 unknowns, which is solved to obtain the optimal piecewise cubic spline.
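Note that the endpoint conditions b_1 = 0 and 6a_{n−1}(t_n − t_{n−1}) + 2b_{n−1} = 0 correspond to zero second derivatives at the first and last knots, the so-called 'natural' boundary condition. Under that reading, scipy's natural cubic spline reproduces the solution of the 3n − 3 system in (2.7) without assembling it by hand; the data points below are again illustrative.

```python
import numpy as np
from scipy.interpolate import CubicSpline

ts = np.array([0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 30.0])   # illustrative maturities
rs = np.array([-0.004, -0.003, -0.002, -0.001, 0.002, 0.005, 0.009])

# bc_type='natural' imposes zero second derivatives at both endpoints,
# matching the constraints b_1 = 0 and 6a_{n-1}(t_n - t_{n-1}) + 2b_{n-1} = 0 of (2.7)
spline = CubicSpline(ts, rs, bc_type="natural")

t_grid = np.linspace(ts[0], ts[-1], 200)
zero_curve = spline(t_grid)                              # smooth, kink-free curve
```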

Both methods are used and plotted in Figure 2.1 below. The main advantage of the linear interpolation method is that it is closed form; the piecewise cubic spline interpolation method takes longer to compute. The figures show almost no difference between the two methods. This relatively small difference can be explained by the large number of data points and the smooth structure of the rates with respect to their time to maturity.

[Figure 2.1: The interpolated zero and discount curves. Four panels for 02-Jun-2016: the zero curve (market quotes) and the discount curve, each shown once with the piecewise linear interpolated line and once with the cubic spline interpolated line; rates are plotted against time to maturity (0 to 30 years).]

The zero curves as well as the discount curves are computed daily over the entire time grid. The discount curves are used to price the swaptions and are obtained for a maturity up to 30 years. However, we will only focus on swaptions with a maximum maturity of 10 years and a maximum underlying tenor of the swap of 10 years.

As shown in Figure 2.1, the two interpolation methods differ very little, while the computation time of the cubic spline method is longer. For this research we have therefore chosen to use only the linear interpolation method.

2.3 Swaptions

Swap options, or swaptions, are options on interest rate swaps: they give the holder the right to enter into a certain interest rate swap at a certain time in the future. Depending on whether the swaption is a call or a put option, we call it a payer swaption or a receiver swaption, respectively. The strike of the swaption equals the fixed rate of the underlying swap contract. In a payer swaption the owner pays the fixed leg and receives the floating leg; in a receiver swaption it is the other way around. For example, a '2y10y' European payer swaption with a strike of 1% represents a contract in which the owner has the right to enter, two years from now, a swap with a tenor of ten years in which he pays a fixed rate of 1%.

First we define the annuity factor A that is used for discounting:

$$A_{\alpha,\beta}(t) = \sum_{i=\alpha+1}^{\beta} (T_i - T_{i-1})\,P(t, T_i), \qquad (2.8)$$

where we again have n × m cash flows, with n the number of years and m the number of cash flows per year. We also define the forward swap rate at time t for the set of times T_i, like Brigo and Mercurio (2007). The forward swap rate is the rate in the fixed leg of the interest rate swap that makes the contract fair at the present time. We denote the forward swap rate as

$$S_{\alpha,\beta}(t) = \frac{P(t, T_\alpha) - P(t, T_\beta)}{A_{\alpha,\beta}(t)}. \qquad (2.9)$$

One can now define the value of a payer swaption with strike K, resetting at T_α, . . . , T_{β−1}, as

$$V_{swaption}(t) = A_{\alpha,\beta}(t)\,\mathbb{E}_t\!\left[\left(S_{\alpha,\beta}(T_\alpha) - K\right)^+\right]. \qquad (2.10)$$

The value of the swaption clearly depends on the expected value of the difference between the forward swap rate and the strike rate. To obtain an arbitrage-free price of a swaption, we need to define the corresponding measure under which this expected value is taken. This is elaborated in more detail in Section 2.4.
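A small sketch of (2.8) and (2.9): the annuity factor and the forward swap rate follow directly from a discount-factor function. The flat curve and the '2y10y' schedule below are illustrative assumptions.

```python
import numpy as np

def annuity(discount, schedule):
    """Annuity factor A_{alpha,beta}(0), cf. (2.8)."""
    return sum((schedule[i] - schedule[i - 1]) * discount(schedule[i])
               for i in range(1, len(schedule)))

def forward_swap_rate(discount, schedule):
    """Forward swap rate S_{alpha,beta}(0), cf. (2.9)."""
    return (discount(schedule[0]) - discount(schedule[-1])) / annuity(discount, schedule)

# Illustrative example: flat curve, 2y expiry, 10y tenor, annual payments
disc = lambda t: np.exp(-0.01 * t)
sched = np.arange(2.0, 12.0 + 1e-9, 1.0)
print(forward_swap_rate(disc, sched))
```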

Some swaptions or combinations of swaptions will briefly be explained in this section, because of their relevance for this research. An at-the-money (ATM) swaption is a swaption that has a strike equal to the par swap rate of the underlying swap of the swaption. There are multiple trading strategies involving swaptions. We define a straddle as the sum of an ATM payer swaption and an ATM receiver swaption with the same ATM strike. If the interest rate is close to the strike rate at expiration of the options, the straddle leads to a loss. However, if there is a sufficiently large move in either direction, a significant profit will be the result.

We also define a strangle, which is the sum of a receiver swaption with a strike of 'ATM − offset' and a payer swaption with a strike of 'ATM + offset'. The market normally refers to strangles as, for example, a '2Y into 10Y 100 out/wide skew strangle', in which 100 is the width (in basis points) between the payer and receiver strikes; the offset from the ATM strike to both the payer and receiver strikes is thus width/2. For example, if we assume an ATM strike of 1%, the receiver strike is 0.5% and the payer strike is 1.5%. A strangle is a similar strategy to a straddle: the investor is betting that there will be a large movement in the interest rate, but is uncertain whether it will be an increase or a decrease. If we compare the payoffs of both strategies, we see that the interest rate has to move farther in a strangle than in a straddle for the investor to make a profit. However, the downside risk if the interest rate ends up at a central value is smaller with a strangle.

Finally, a collar is also defined. A collar is a payer swaption with a strike of ’ATM + offset’ minus a receiver swaption with a strike of ’ATM - offset’. A collar is normally quoted as a ’2Y into 10Y 100 out/wide skew collar’ in which the width of 100 basis points is again the width between the payer and receiver strike. So, you will pay floating if the swap rate is within the interval of ’ATM ± offset’ and pay a fixed rate for the range of the swap rate outside this interval.

2.4 Martingales and Measures

The models that are used to price derivatives try to estimate the expected payoff of the derivative. These models are based on a stochastic process, which is simply a variable whose value changes over time in an uncertain way. A process in which only the current value of the variable is relevant for predicting the future is called a Markov process. The Markov property is very useful, because it states that the future value of a variable is independent of the path it has followed in the past. This corresponds to the assumption of weak market efficiency: all relevant information is captured in the current value of the variable (Hull, 2012).

We now focus on a particular kind of Markov process, known as a Wiener process (or Brownian motion). Formally, we define a P-Wiener process as stated in the theorem below (Tsay, 2005).

Theorem 2.2. A real-valued stochastic process {W_t}_{t≥0} is a P-Wiener process if, for some real constant σ, under P,

1. for each s ≥ 0 and t ≥ 0 the random variable W_{t+s} − W_s has the normal distribution with mean zero and variance σ²t,

2. for each n ≥ 1 and any times 0 ≤ t_0 ≤ t_1 ≤ · · · ≤ t_n, the random variables {W_{t_r} − W_{t_{r−1}}} are independent,

3. W_0 = 0,

4. W_t is continuous in t ≥ 0.

The probability measure P is defined as the probability of each event A ∈ F, where we can think of F as a collection of subsets of the entire sample space. Finally, F_t contains all the information about the evolution of the stochastic process up until time t.

The price of a non-dividend-paying stock is often modelled as a geometric Brownian motion. Before we define this geometric Brownian motion, we first define a standard Brownian motion as a Wiener process with zero drift and a variance proportional to the length of the time interval. This corresponds to a rate of change of the expectation equal to zero and a rate of change of the variance equal to one. We now consider a generalized Wiener process, where the expectation has a drift rate equal to µ and the rate of change of the variance equals σ² (Tsay, 2005). This leads to the following generalized Wiener process

$$dx_t = \mu\,dt + \sigma\,dW_t, \qquad (2.11)$$

where W_t is a standard Brownian motion.

We then consider the modelled change in the price of a non-dividend-paying stock over time, which results in the following geometric Brownian motion

$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t \quad\Rightarrow\quad \frac{dS_t}{S_t} = \mu\,dt + \sigma\,dW_t, \qquad (2.12)$$

where µ and σ are constant. Itô's lemma can now be used to derive the process followed by the logarithm of S_t (Itô, 1951). First consider the general case of a continuous-time stochastic process x_t with dynamics dx_t = µ(x_t, t)dt + σ(x_t, t)dW_t, generalizing (2.11). For a differentiable function G(x_t, t) of x_t and t we find

$$dG = \left(\frac{\partial G}{\partial x}\,\mu(x_t, t) + \frac{\partial G}{\partial t} + \frac{1}{2}\frac{\partial^2 G}{\partial x^2}\,\sigma^2(x_t, t)\right)dt + \frac{\partial G}{\partial x}\,\sigma(x_t, t)\,dW_t. \qquad (2.13)$$

We then apply Itô's lemma to obtain a continuous-time model for the logarithm of the stock price, taking G(S_t, t) = ln(S_t). This leads to

$$d\ln(S_t) = \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dW_t. \qquad (2.14)$$

This stochastic process has a constant drift rate of µ − σ²/2 and a constant variance rate of σ². This implies that the price of the stock at some future time T is log-normally distributed, given the current value of the stock at time t:

$$\ln S_T \sim \phi\!\left(\ln S_t + \left(\mu - \frac{\sigma^2}{2}\right)\Delta,\; \sigma^2\Delta\right), \qquad (2.15)$$

where ∆ is the fixed time interval T − t and φ(m, v) denotes the normal distribution with mean m and variance v. Black's model is based on this lognormal property together with the risk-neutral world assumption, as will be further explained in Section 4.1.
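A quick Monte Carlo check of (2.15): simulating exact terminal values via (2.14) and comparing the sample mean and variance of ln S_T with the theoretical values. All parameter values are illustrative.

```python
import numpy as np

S_t, mu, sigma, delta = 100.0, 0.05, 0.2, 1.0   # illustrative parameters, delta = T - t
rng = np.random.default_rng(42)

# Exact terminal log-prices via (2.14): ln S_T = ln S_t + (mu - sigma^2/2)*delta + sigma*W_delta
W = rng.normal(0.0, np.sqrt(delta), size=100_000)
log_ST = np.log(S_t) + (mu - 0.5 * sigma**2) * delta + sigma * W

print(log_ST.mean(), np.log(S_t) + (mu - 0.5 * sigma**2) * delta)  # sample vs. theoretical mean
print(log_ST.var(), sigma**2 * delta)                              # sample vs. theoretical variance
```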

In order to normalize different asset prices, one can use a numeraire Z as the reference asset. A numeraire is defined as any positive non-dividend-paying asset. A key result used in the pricing of derivatives is the relation between the absence of arbitrage and the existence of a probability measure such as the martingale measure (or risk-neutral measure). Brigo and Mercurio (2007) denote this relation, based on a numeraire Z, as

$$\frac{S_t}{Z_t} = \mathbb{E}^Z\!\left[\frac{S_T}{Z_T} \,\middle|\, \mathcal{F}_t\right], \qquad 0 \le t \le T, \qquad (2.16)$$

where the price of any traded asset S (without intermediate payments) relative to Z is a martingale under the probability measure Q^Z. This probability measure Q is equivalent to the real-world probability measure P. A martingale is a zero-drift stochastic process, so under the probability measure Q^Z we have, for a sequence of random variables S_0, S_1, . . .,

$$\mathbb{E}^Z\left[S_i \mid S_{i-1}, S_{i-2}, \ldots, S_0\right] = S_{i-1}, \qquad \forall\, i > 0. \qquad (2.17)$$

The preferred numeraire depends on the derivative that is priced. Two frequently used numeraires are now briefly described: first a numeraire based on the zero-coupon bond, and second a numeraire based on the annuity of a swap.

A zero-coupon bond with maturity T equal to that of the derivative is commonly used as a numeraire. We denote the value of this numeraire at time t as Z_t and note that Z_T = P(T, T) = 1. We also denote the measure associated with this numeraire as the T-forward measure Q^T, with expectation E^T. This way we are able to price a derivative by computing the expectation of its payoff under this measure, which leads to the following price of a derivative at time t:

$$V(t) = P(t, T)\,\mathbb{E}^T\!\left[\frac{V(T)}{P(T, T)} \,\middle|\, \mathcal{F}_t\right], \qquad (2.18)$$

for 0 ≤ t ≤ T (Brigo and Mercurio, 2007). Notice that the forward rate is a martingale under this measure, which makes the forward measure convenient to work with.

The annuity of a swap is a linear combination of zero-coupon bonds. Since a numeraire is defined as a positive non-dividend-paying asset, the annuity of a swap can also be used as a numeraire. The numeraire in this case is the following portfolio of zero-coupon bonds:

$$Z_T = A_{\alpha,\beta}(T) = \sum_{i=\alpha+1}^{\beta} (T_i - T_{i-1})\,P(T, T_i), \qquad (2.19)$$

which leads to the swap measure Q^{α,β}. Under this measure we find that the swap rate S_{α,β}(t) is a martingale:

$$S_{\alpha,\beta}(t) = \frac{P(t, T_\alpha) - P(t, T_\beta)}{A_{\alpha,\beta}(t)} \qquad (2.20)$$

$$\Rightarrow\quad \frac{P(t, T_\alpha) - P(t, T_\beta)}{Z_t} = \mathbb{E}^{\alpha,\beta}\!\left[\frac{P(T, T_\alpha) - P(T, T_\beta)}{Z_T} \,\middle|\, \mathcal{F}_t\right], \qquad 0 \le t \le T. \qquad (2.21)$$

These numeraires and their related measures are used in arbitrage-free pricing, which is an essential part of the option pricing models that we use. These models and their assumptions are further explained in Section 4.1. First, however, we review several studies that are relevant for this research in the next section.

3 Literature review

This study combines several methods with different underlying assumptions. First of all, the risk-neutral world assumption is used, and both the Black model and the SABR volatility model are used to price a swaption on an interval of strike rates. When pricing these swaptions we also take negative interest rates into account by using a displacement parameter. The risk measures Value at Risk and Expected Shortfall are then computed. The estimated underlying profit and loss distribution is based on real-world probabilities in order to estimate valid risk measures; hence, we build a bridge between the Q-measure and the P-measure. To obtain the forecasts of the risk measures, we use two different methods, and the quality of their estimates is evaluated by several different backtests. The methods differ in several ways. The Historical Simulation method gives all historical returns in the estimation window an equal weight and uses them to construct the profit and loss distribution of the portfolio. The time series analysis, on the other hand, simulates one-day-ahead forecasts of the SABR model parameters. These SABR parameters represent the characteristics of the volatility structure of the individual swaptions. As a result, we estimate the risk measures based on one-day-ahead simulations of this volatility structure instead of the value of the portfolio itself.

We now discuss some studies that have focused on the aspects we are looking at. Pérignon and Smith (2010), for instance, compare the disclosed quantitative information and VaR estimates of up to fifty international commercial banks. They use panel data over the period 1996-2005 and find that VaR estimates are in general excessively conservative, and also note that there is no improvement in the VaR estimates over time. Besides this, they find that the most popular VaR method is the Historical Simulation method, and they conclude that this method helps little in forecasting the volatility of future trading revenues. Pérignon and Smith (2010) use the Unconditional Coverage test of Kupiec (1995) to test whether the proportion of VaR violations equals the promised coverage level; the null hypothesis of unconditional coverage is rejected for every year except 1998 at the 5% confidence level. This study clearly shows the relevance of finding an improved and less conservative method to estimate risk measures like the Value at Risk.

There are multiple improvements proposed in the literature that build on the Historical Simulation method, such as the Filtered Historical Simulation (FHS) method described by Barone-Adesi et al. (2002). While the Historical Simulation method was found to be excessively conservative by Pérignon and Smith (2010), it is also known to underestimate risk in some particular situations. This is because the method assumes that risks do not change over time: when market conditions change and the market becomes more volatile, the risk is underestimated. This can be remedied by first standardizing the historical returns and then scaling them to the current volatility, as is done in the Filtered Historical Simulation method. In this method a GARCH model is fitted to the historical data and the residuals are divided by their corresponding volatility estimates. These standardized residuals are then randomly drawn and used to simulate the one-day-ahead profit and loss distribution. Even though this method overcomes a shortcoming of the Historical Simulation method, it still needs some care. According to Gurrola and Murphy (2015), the filtering process changes the return distribution in ways that may not be intuitive. Furthermore, it is important to select carefully in which applications the FHS method is used; re-calibration and re-testing are essential to ensure that the model remains relevant. Finally, Pritsker (2001) shows that one has to be careful when dealing with limited data sets: he shows, for example, that two years of historical data is not sufficient for the FHS method to estimate the Value at Risk accurately at a 10-day horizon.
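A compact sketch of the FHS recipe just described, assuming the third-party arch package for the GARCH fit; the zero-mean GARCH(1,1) specification and the function name are illustrative choices, not prescriptions from the literature discussed here.

```python
import numpy as np
from arch import arch_model

def fhs_scenarios(returns, n_scenarios=1000, seed=0):
    """Filtered Historical Simulation: standardize historical returns by their
    fitted GARCH volatility, then rescale draws to the current volatility."""
    am = arch_model(returns, vol="Garch", p=1, q=1, mean="Zero")
    res = am.fit(disp="off")
    std_resid = np.asarray(res.resid / res.conditional_volatility)   # filtered residuals
    sigma_next = np.sqrt(res.forecast(horizon=1).variance.values[-1, 0])
    rng = np.random.default_rng(seed)
    # bootstrap the standardized residuals, scaled to the one-day-ahead volatility
    return rng.choice(std_resid, size=n_scenarios) * sigma_next
```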

The Historical Simulation method is based on historical returns; however, to obtain these returns we first need to price the swaptions. There are numerous models that can be used to price a swaption, but we also need to take the smile risk into account, because the volatility of swaptions differs across strike rates. To capture this smile risk in the derivatives market, Hagan et al. (2002) introduced the SABR volatility model. West (2005) calibrates the parameters of the SABR model in a situation where input data is very scarce. The calibration is based on equity futures traded at the South African Futures Exchange. The study focuses on packages of options that combine multiple derivatives, like a collar or a butterfly. Some of these packages are traded about 800 times in total, while there are more than double that number of strike combinations. West (2005) compares two cases: first, he estimates all of the SABR model parameters daily; in the second case he keeps one of the parameters (β) fixed while still estimating the other parameters daily. This is because hedging efficiency can be ensured by changing the parameters only once a month while changing the input values of F and σ_ATM daily. West (2005) finds that the calibrated parameters of the model change only infrequently when the value of β is fixed: they are always changing up to a very high precision, but they remain unchanged up to a fairly high precision. For this reason, he finds that keeping β fixed leads to infrequent changes in the other SABR parameters, which in the end result in lower hedging costs. Hence, his research provides a robust algorithm to capture the volatility smile with the SABR model when input data is very scarce, and also shows the advantages of keeping the parameter β fixed.

Bogerd (2015) also uses the SABR volatility model, but combines it with the Historical Simulation method. He focuses specifically on the volatility structure of swaptions. He uses daily observations of the calibrated SABR model parameters and a displacement parameter to deal with negative interest rates. He simulates 1000 one-day-ahead estimates of the profit and loss distribution based on historical changes in the SABR model parameters. A distinction is made between the curvature and the level of the volatility structure: varying only one of the SABR parameters (i.e., α) results in just a vertical shift of the volatility skew. Bogerd (2015) notes that this is a reasonable approximation, because most of the variation in swaption volatility over time is caused by vertical movements of the volatility smile. He performs an unconditional coverage test as well as an independence test, and only rejects the independence property for the Historical Simulation method applied to all of the SABR model parameters. The independence property is tested here with the backtest of Du and Escanciano (2015), which is based on the Ljung-Box statistic. These results imply that it is possible to obtain valid forecasts of the risk measures based on estimates of the one-day-ahead volatility structure. We note, however, that the SABR parameters that represent the volatility structure depend on each other, as described in Section 4.1.2. When dealing with such a time series of interdependent parameters, a multivariate time series model can be used to capture the dynamics of the parameters over time. This makes it interesting to investigate whether the one-day-ahead forecasts of the volatility structure can be improved by using a time series analysis.

There are, however, some difficulties in applying the Historical Simulation method to the SABR model parameters. Moni (2014) explains that it is questionable whether it is meaningful to add past changes in the SABR parameters to their current values. A change in the SABR parameters changes the entire volatility structure, and such a change may not always be valid, especially if the historical values of the SABR parameters differ significantly from their current values. For this reason, the Historical Simulation method is not applied to the SABR parameters in this study. We compare the estimated risk measures of the Historical Simulation method based on the portfolio returns with the estimated risk measures based on a time series analysis of the SABR model parameters.

In this study we make use of two different measures, each with its own underlying assumptions. The risk-neutral world assumption makes it possible to compute the expected value of future payoffs without having to deal with the different risk preferences of buyers and sellers of derivatives. Giordano and Siciliano (2013) clarify that this risk-neutral hypothesis is acceptable for pricing derivatives; however, they also note that the risk-neutral assumption cannot be used to forecast the future value of a financial product. So, if we estimate the one-day-ahead value of a swaption, we need to take the risk premium into account. Hence, we compute the estimated profit and loss distribution based on the real-world probability measure P. We therefore use the risk-neutral world assumption only to compute the volatility structure of the derivatives based on the quoted historical swaption premiums. These volatility structures are then used, together with the risk-neutral assumption, to price the swaptions up to and including the last day of the estimation window. The methods that are then used to estimate the one-day-ahead profit and loss distribution do not depend on the risk-neutral assumption: the one-day-ahead forecasts of the swaption prices are estimated under the real-world probabilities, and the risk measures are computed from these estimates of the profit and loss distribution.

The adequacy of the forecasts based on these models is assessed by several backtests. Piontek (2009) reviews various backtests that assess the quality of models producing VaR estimates. He analyzes some commonly used backtesting methods and focuses on the problems of limited data sets and the low power of the tests. The simulations are performed for different sample sizes, with the number of observations between 100 and 1000. He finds a low power for the backtest of Kupiec (1995), for example when testing a model that gives 3% or 7% violations instead of the chosen tolerance level of 5%. In this example the backtest rejects the model in only 35% of the draws, so an inaccurate model in such a situation is not rejected in 65% of the cases at a 5% significance level. A low power is also found for other backtests, which shows that we cannot assume that a model is correct merely because it is not rejected by a backtest. In the empirical study of this research we also have to deal with a limited backtesting sample size of 363 observations. For this reason, we apply numerous different backtests that enable us to assess the quality of our methods more extensively.

In the next section we first discuss the models and methods that are used in the empirical part of this research. We then continue with a description of the data and a discussion of the results of the empirical study.

4 Models and method

The SABR volatility model will be explained in more detail in Section 4.1.2. It is used to convert the quoted market swaption premiums into a volatility surface that allows us to price swaptions at arbitrary non-quoted strikes. This is done for a selected combination of expiry and tenor, so not the entire surface is taken into account.

4.1 Option pricing models

Under the right corresponding measure, we have seen that both the forward rate and the swap rate are martingales. In this research we use the Euribor forward rate, which is a martingale under the forward measure Q^T, while forward swap rates are martingales under their measure Q^{α,β}. The option pricing models are based on the following stochastic process:

$$dF_t = c(t, \ldots)\,dW_t. \qquad (4.1)$$

The Brownian motion W_t and the coefficient c can be deterministic or random. Note that the dynamics have no drift term, since the forward rate is a martingale under its corresponding measure.

4.1.1 Black’s model

Black (1976) introduced a model that gives a closed-form solution for the price of an option under the assumption that movements of the forward rate F_t follow a log-normal distribution. The dynamics in Black's model depend on the current value of the forward rate F_t and one parameter σ_B, called Black's volatility, and are given by

$$dF_t = \sigma_B F_t\,dW_t, \qquad F_0 = F > 0. \qquad (4.2)$$

The standard continuous-time stochastic process is denoted in (2.11). Notice that the drift parameter µ has dropped out of Black's differential equation, which implies that the equation is independent of risk preferences. Black, Scholes and Merton use in their analysis the fact that a riskless portfolio can be set up from the stock and the derivative. This portfolio is riskless for an instantaneously short period, but can be rebalanced frequently. This way one can assume that investors are risk-neutral and use the following results: the expected return on all securities is the risk-free interest rate r, and the present value of any cash flow can be obtained by discounting its expected value at the risk-free rate (Tsay, 2005).

The expected payoff of a European call option on a futures contract under the forward measure is

$$\mathbb{E}^T\left[\max(V(T) - K, 0)\right], \qquad (4.3)$$

where E^T denotes the expected value under the forward measure and V(T) is the value of the underlying of the option at time t = T. We denote the price of this call option at time t as

$$c_t = P(t, T)\,\mathbb{E}^T\left[\max(V(T) - K, 0)\right]. \qquad (4.4)$$

Using the dynamics of (4.1), the following well-known solution for the price of a European call option on a futures contract can be derived:

$$c_0(F_0, K, T; \sigma_B) = P(0, T)\left[F_0\,\phi(d_1) - K\,\phi(d_2)\right],$$
$$d_1 = \frac{\ln\frac{F_0}{K} + \frac{\sigma^2}{2}T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T}. \qquad (4.5)$$

Besides this general formula, one can also compute the price of a payer swaption with Black's formula, as described in Hull (2012):

$$d_1 = \frac{\ln\frac{S_{\alpha,\beta}(T_\alpha)}{K} + \frac{\sigma^2}{2}T}{\sigma\sqrt{T}}, \qquad d_2 = d_1 - \sigma\sqrt{T},$$
$$V_{Swaption}(t) = L\,A_{\alpha,\beta}(t)\left[S_{\alpha,\beta}(T_\alpha)\,N(d_1) - K\,N(d_2)\right], \qquad (4.6)$$

where L is the notional principal value of the contract. In this formula the forward swap rate is used instead of the discounted futures price; based on this swap rate and the swap measure, we can price a swaption in a manner similar to an option on a futures contract.
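For reference, a direct implementation of (4.6); the function name and argument layout are illustrative, and the annuity factor is passed in rather than computed from a curve.

```python
import numpy as np
from scipy.stats import norm

def black_payer_swaption(L, A, S0, K, sigma, T):
    """Black's formula for a payer swaption, cf. (4.6).

    L:     notional principal
    A:     annuity factor A_{alpha,beta}(t)
    S0:    forward swap rate S_{alpha,beta}(T_alpha)
    K:     strike (fixed rate of the underlying swap)
    sigma: Black implied volatility
    T:     time to expiry in years
    """
    d1 = (np.log(S0 / K) + 0.5 * sigma**2 * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return L * A * (S0 * norm.cdf(d1) - K * norm.cdf(d2))
```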

4.1.2 SABR volatility model

One of the assumptions of the Black model is that a fractional change in the futures price over any interval follows a lognormal distribution (Black, 1976). If this assumption is violated, some of the outcomes will change as a result. If, for example, the probability of a large positive movement in the interest rate were significantly higher than implied by the lognormal property, this would lead to a higher expected payoff of an out-of-the-money (OTM) payer swaption with a strike rate in this region; the corresponding price of such a swaption would then also need to be higher than the price based on the lognormal assumption. This phenomenon is observed in the market and leads to a volatility that varies across strike rates, as opposed to the constant Black's volatility. For this reason, we introduce a volatility model that takes this volatility skew into account.

The Stochastic Alpha Beta Rho model, as derived by Hagan et al. (2002), is given by a system of two stochastic differential equations. The state variables F_t and α_t are defined as the forward interest rate and a volatility parameter, respectively. The dynamics of the model are

$$dF_t = \alpha_t F_t^{\beta}\,dW_t^{(1)}, \qquad F_0 = F > 0,$$
$$d\alpha_t = \nu\,\alpha_t\,dW_t^{(2)}, \qquad \alpha_0 = \alpha > 0, \qquad (4.7)$$
$$dW_t^{(1)}\,dW_t^{(2)} = \rho\,dt,$$

where the power parameter β ∈ [0, 1] and ν > 0 is the volatility of α_t, i.e., the volatility of the volatility of the forward rate. W_t^{(1)} and W_t^{(2)} are two ρ-correlated Brownian motions. The factors F and α are stochastic; the parameters β, ρ and ν are not.

West (2005) describes the parameters in more detail. α is a 'volatility-like' parameter: not equal to the volatility, but there is a functional relationship between this parameter and the at-the-money volatility. Including the constant ν acknowledges that volatility obeys the well-known clustering in time. The parameter β ∈ [0, 1] defines the relationship between the futures spot and the at-the-money volatility. A value of β close to one indicates that the user believes that if the market were to move up or down in an orderly fashion, the at-the-money volatility level would not be affected significantly, whereas a value of β ≪ 1 indicates that if the market were to move, the at-the-money volatility would move in the opposite direction; the closer β is to zero, the more distinct this effect. Moreover, the value of β also gives insight into the distribution of the underlying: if β is close to one, the stochastic model is said to be more lognormal, and the closer β is to zero, the more closely the model follows the normal distribution instead.

Hagan et al. (2002) show that the price of a vanilla option under the SABR model is given by the appropriate Black's formula, provided the correct implied volatility is used. For given α, β, ρ, ν and τ, this volatility is given by

$$\sigma(K, F, \tau) = \frac{\alpha\left[1 + \left(\frac{(1-\beta)^2}{24}\frac{\alpha^2}{(FK)^{1-\beta}} + \frac{1}{4}\frac{\rho\beta\nu\alpha}{(FK)^{(1-\beta)/2}} + \frac{2-3\rho^2}{24}\nu^2\right)\tau\right]}{(FK)^{(1-\beta)/2}\left[1 + \frac{(1-\beta)^2}{24}\ln^2\frac{F}{K} + \frac{(1-\beta)^4}{1920}\ln^4\frac{F}{K}\right]}\,\frac{z}{\chi(z)}, \qquad (4.8)$$

where

$$z = \frac{\nu}{\alpha}\,(FK)^{(1-\beta)/2}\,\ln\frac{F}{K}, \qquad (4.9)$$

and

$$\chi(z) = \ln\!\left(\frac{\sqrt{1 - 2\rho z + z^2} + z - \rho}{1 - \rho}\right), \qquad (4.10)$$

for an option with strike K, given that the current value of the forward price is F. We note that in our case the forward value equals the par swap rate; hence F = S_{α,β}(T_α), and if F = K the swaption is said to be at-the-money. For the ATM strike rate we can remove the terms z and χ(z) from the equation, because in the limit z/χ(z) = 1. So for the at-the-money volatility, one can rewrite the equation as

$$\sigma_{ATM}(F, \tau) = \frac{\frac{(1-\beta)^2\tau}{24F^{2-2\beta}}\,\alpha^3 + \frac{\rho\beta\nu\tau}{4F^{1-\beta}}\,\alpha^2 + \left(1 + \frac{2-3\rho^2}{24}\nu^2\tau\right)\alpha}{F^{1-\beta}}, \qquad (4.11)$$

where τ is the year fraction to maturity. This formula is closed form, which makes the model very convenient for pricing an option.

There is, however, one main drawback of Hagan's formula: it is known to produce wrong prices in the region of small strikes for large maturities. Obłój (2008) therefore proposes an improvement to the original volatility formulas of Hagan et al. (2002). In his paper he gives several arguments for using the formula derived by Berestycki et al. (2004). To understand why we use the formula of Berestycki et al. (2004), consider the Taylor expansion of the implied volatility surface

$$\sigma(K, F, \tau) = \sigma_0(K, F)\left[1 + \sigma_1(K, F)\,\tau\right] + O(\tau^2). \qquad (4.12)$$

Obłój (2008) then compares the explicit expressions of Hagan et al. (2002) and Berestycki et al. (2004), which turn out to be the same when either K = F, ν = 0 or β = 1. However, when β < 1 the results for σ_0(K, F) of the two papers differ, and Obłój (2008) argues that the formula of Berestycki et al. (2004) is correct and should be used. This conclusion is based on two arguments: first, Hagan's formula is inconsistent if β → 0; second, the formula suggested by Obłój (2008) produces, in most cases, correct prices in the region of small strikes for large maturities, unlike Hagan's formula.

The formula for the implied volatility is now obtained by combining σ_0(K, F) from Berestycki et al. (2004) with σ_1(K, F) from Hagan et al. (2002). We define the fine-tuned implied volatility as

$$\sigma(K, F, \tau) = \frac{\nu\,\ln\frac{F}{K}}{\chi(z)}\left[1 + \left(\frac{(1-\beta)^2}{24}\frac{\alpha^2}{(FK)^{1-\beta}} + \frac{1}{4}\frac{\rho\beta\nu\alpha}{(FK)^{(1-\beta)/2}} + \frac{2-3\rho^2}{24}\nu^2\right)\tau\right], \qquad (4.13)$$

where

$$z = \frac{\nu}{\alpha}\,\frac{F^{1-\beta} - K^{1-\beta}}{1 - \beta}, \qquad (4.14)$$

and

$$\chi(z) = \ln\!\left(\frac{\sqrt{1 - 2\rho z + z^2} + z - \rho}{1 - \rho}\right), \qquad (4.15)$$

which is used instead of (4.8) whenever there is reason to assume that β < 1.
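A sketch of (4.13)-(4.15), falling back to the ATM expression (4.11) when F ≈ K, where z/χ(z) → 1. It assumes β < 1, as found in the empirical part of this thesis, and the function name is illustrative.

```python
import numpy as np

def sabr_vol_obloj(K, F, tau, alpha, beta, rho, nu):
    """Fine-tuned SABR implied volatility, cf. (4.13)-(4.15); requires beta < 1."""
    correction = 1 + ((1 - beta)**2 / 24 * alpha**2 / (F * K)**(1 - beta)
                      + rho * beta * nu * alpha / (4 * (F * K)**((1 - beta) / 2))
                      + (2 - 3 * rho**2) / 24 * nu**2) * tau
    if np.isclose(F, K):                       # ATM limit, cf. (4.11)
        return alpha / F**(1 - beta) * correction
    z = nu / alpha * (F**(1 - beta) - K**(1 - beta)) / (1 - beta)          # (4.14)
    chi = np.log((np.sqrt(1 - 2 * rho * z + z**2) + z - rho) / (1 - rho))  # (4.15)
    return nu * np.log(F / K) / chi * correction                           # (4.13)
```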

We now discuss the method that is used to calibrate the SABR model parameters. In the empirical part of this research we only find values of β < 1, so we only work with (4.13) instead of (4.8). Nevertheless, Obłój (2008) showed that the expressions from Hagan et al. (2002) and Berestycki et al. (2004) are exactly the same for the volatility of an at-the-money swaption; for this reason, (4.11) remains valid. We now follow the steps from West (2005) and notice the relation

$$\ln \sigma_{ATM} = \ln \alpha - (1 - \beta)\ln F + \ldots, \qquad (4.16)$$

so an appropriate value of β can be estimated from a log-log plot of σ_ATM against F. Hagan et al. (2002) suggest that it is appropriate to fit this parameter in advance and never change it, so the appropriate value for β is chosen first. Then (4.11) is inverted to obtain an expression for α in terms of the other SABR parameters and the at-the-money volatility; this is done by setting the equation equal to zero and selecting the smallest positive real root. In the final step we minimize the difference between the market volatilities and the volatilities computed with the SABR model:

$$\min_{\rho,\nu}\; \left|\sigma_M - \sigma_{SABR}(\alpha, \beta, \rho, \nu, \tau)\right|, \qquad (4.17)$$

where β is already estimated and α = α(σ_ATM, β, ρ, ν, τ) follows from the inversion of (4.11). The time to maturity τ is also known, so we calibrate ρ and ν by minimizing this difference. In this method, the parameters are calibrated such that the produced at-the-money volatilities exactly equal the market quotes. The at-the-money volatilities are important to match, because they are traded most frequently. Finally, when all of the parameters are calibrated and we have computed the SABR volatility for a swaption, we can use (4.6) to price the swaption. The steps to calibrate the SABR model parameters are applied and described in more detail in Section 6.1.
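The calibration loop can then be sketched as follows: α is recovered as the smallest positive real root of the cubic obtained by rearranging (4.11), and ρ and ν are found by minimizing a least-squares variant of (4.17). This assumes the sabr_vol_obloj sketch above and β < 1; the starting values and bounds are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

def alpha_from_atm(sigma_atm, F, tau, beta, rho, nu):
    """Invert (4.11): the smallest positive real root of the cubic in alpha."""
    c3 = (1 - beta)**2 * tau / (24 * F**(2 - 2 * beta))
    c2 = rho * beta * nu * tau / (4 * F**(1 - beta))
    c1 = 1 + (2 - 3 * rho**2) / 24 * nu**2 * tau
    roots = np.roots([c3, c2, c1, -sigma_atm * F**(1 - beta)])
    real = roots[np.isreal(roots)].real
    return real[real > 0].min()

def calibrate_rho_nu(strikes, market_vols, sigma_atm, F, tau, beta):
    """Minimize a least-squares variant of (4.17) over (rho, nu);
    alpha follows from the ATM quote, so the ATM volatility is matched exactly."""
    def objective(x):
        rho, nu = x
        alpha = alpha_from_atm(sigma_atm, F, tau, beta, rho, nu)
        model = [sabr_vol_obloj(K, F, tau, alpha, beta, rho, nu) for K in strikes]
        return np.sum((np.asarray(model) - np.asarray(market_vols))**2)
    res = minimize(objective, x0=[0.0, 0.3], bounds=[(-0.999, 0.999), (1e-4, None)])
    return res.x
```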

4.1.3 Pricing in a negative interest rate environment

Before we continue with the time series analysis, we first need a method that enables us to price derivatives in a negative interest rate environment. The option pricing models used in this research do not allow interest rates to become negative. However, a lot has changed since these models were constructed, and we need to adjust them to deal with the negative interest rates that have occurred over the past years. Frankema (2016) describes the displaced Black's model as well as the displaced SABR model, which allow interest rates to be negative: a shift s > 0 allows rates larger than −s to be modelled. This leads to the following adjusted dynamics of Black's model, also known as a displaced diffusion process,

$$dF_t = d(F_t + s) = \sigma_B (F_t + s)\,dW_t, \qquad (4.18)$$

where s is the constant displacement (or shift) parameter. Note that F̂_t ≡ F_t + s follows a lognormal (or Black) process. This fact, together with the fact that the payoff of a European call option, max(F_T − K, 0), can be written as

$$\max(F_T - K, 0) = \max\left((F_T + s) - (K + s), 0\right) \equiv \max(\hat{F}_T - \hat{K}, 0), \qquad (4.19)$$

leads to the conclusion that European calls and puts can be valued under the displaced diffusion model by plugging F̂_0 ≡ F_0 + s and K̂ = K + s into Black's model.

A similar adjustment leads to the following dynamics of the displaced SABR model:

$$dF_t = \alpha_t (F_t + s)^{\beta}\,dW_t^{(1)},$$
$$d\alpha_t = \nu\,\alpha_t\,dW_t^{(2)}, \qquad (4.20)$$
$$\mathbb{E}\left[dW_t^{(1)}\,dW_t^{(2)}\right] = \rho\,dt.$$

Hence, we use the formulas from Black's model (4.6) and the SABR model (4.13) with the displaced values F̂_0 and K̂ instead of F_0 and K. A drawback of the displaced models, however, is that the shift parameter needs to be selected a priori, so an assumption has to be made on the minimum of the interest rate. To overcome this drawback, Antonov et al. (2015) describe the free boundary model; for this research, however, the displaced SABR model is preferred.
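In code, the displaced versions reduce to evaluating the earlier sketches at the shifted arguments. The wrappers below assume the sabr_vol_obloj and black_payer_swaption sketches from above; the shift s = 3% is purely illustrative, since the shift must be chosen a priori.

```python
def displaced_sabr_vol(K, F, tau, alpha, beta, rho, nu, s=0.03):
    """Displaced SABR smile: evaluate at F + s and K + s, cf. (4.20).
    The shift s is an a-priori choice; 3% here is illustrative only."""
    return sabr_vol_obloj(K + s, F + s, tau, alpha, beta, rho, nu)

def displaced_black_payer_swaption(L, A, S0, K, sigma, T, s=0.03):
    """Displaced Black: plug F_0 + s and K + s into (4.6), cf. (4.18)-(4.19)."""
    return black_payer_swaption(L, A, S0 + s, K + s, sigma, T)
```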

4.2 Time series analysis

The SABR volatility model parameters are estimated on a daily basis. Since the aim of this research is to estimate the risk related to a portfolio of swaptions, an analysis of these SABR parameters over time is of interest, so that the one-day-ahead volatility structure can be forecast. In this section we discuss the models that are used to capture the dynamics of the parameters α_t, ρ_t and ν_t over time.

4.2.1 Vector Autoregressive model

A time series {γ_t} is called white noise if all of its autocorrelations are equal to zero; for a white noise series, all sample autocorrelation functions (ACFs) should be close to zero. To achieve this, we apply time series models that capture the dynamic structure of our series. Tsay (2005) first presents the simple autoregressive model of order 1, or AR(1) model, defined as

$$\gamma_t = \phi_0 + \phi_1 \gamma_{t-1} + a_t, \qquad (4.21)$$

where {a_t} is assumed to be a white noise series with mean zero and variance σ_a².

The model described above could make sense for the individual parameters, but we need to forecast all of the SABR parameters jointly. These parameters clearly depend on each other, as described in (4.7). Hence, a model that takes the correlation between these time series into account is desired. The vector autoregressive (VAR) model can be used for this kind of linear dynamic structure in a multivariate time series. We fit a VAR model to the three time series $\alpha$, $\rho$ and $\nu$:

$\Gamma_t = \phi_0 + \Phi \Gamma_{t-1} + a_t, \qquad \text{where } \Gamma_t = \begin{pmatrix} \alpha_t \\ \rho_t \\ \nu_t \end{pmatrix}.$  (4.22)

The vectors $\Gamma_t$ and $\phi_0$ are $k$-dimensional, $\Phi$ is a $k \times k$ matrix, and $\{a_t\}$ is a sequence of serially uncorrelated random vectors with mean zero and covariance matrix $\Sigma$. Since we model three SABR parameters over time, we have $k = 3$.

For our VAR(p) model estimation, we have to decide how many lags $p$ to include. A vector autoregressive model of lag length $p$ refers to a time series in which the current value depends on its first $p$ lagged values. Several tools can be used to decide on the lag length. Firstly, the sample autocorrelation function (ACF) of the parameters can be used to check their level of autocorrelation. For a weakly stationary series $\gamma_t$, we define the lag-$l$ autocorrelation of $\gamma_t$, $\text{ACF}_l$, as the correlation coefficient between $\gamma_t$ and $\gamma_{t-l}$ (Tsay, 2005):

$\text{ACF}_l = \dfrac{\text{Cov}(\gamma_t, \gamma_{t-l})}{\sqrt{\text{Var}(\gamma_t)\,\text{Var}(\gamma_{t-l})}} = \dfrac{\text{Cov}(\gamma_t, \gamma_{t-l})}{\text{Var}(\gamma_t)},$  (4.23)

where the second equality uses that $\text{Var}(\gamma_t) = \text{Var}(\gamma_{t-l})$ for a weakly stationary series.
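As an illustration, the sample ACF of each calibrated parameter series can be computed with statsmodels; params_df is an assumed DataFrame with columns ['alpha', 'rho', 'nu'] and one row per business day:

```python
from statsmodels.tsa.stattools import acf

# Sample autocorrelations up to lag 20 for each SABR parameter series, as in
# (4.23); params_df is an assumed DataFrame, not an object from the thesis.
for col in ['alpha', 'rho', 'nu']:
    print(col, acf(params_df[col], nlags=20))
```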

Another method to determine the optimal lag length is to use information criteria. Criteria like the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the Hannan-Quinn criterion (HQC) measure the relative quality of statistical models for a given set of data. Liew (2004) compares these criteria in a simulation study to determine the best lag length criterion for an autoregressive model. He finds that for a relatively large sample, with 120 or more observations, the Hannan-Quinn criterion outperforms the rest in correctly identifying the true lag length.
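A hedged sketch of this lag selection and the VAR fit with statsmodels follows, reusing the assumed params_df from above; the maximum lag of 10 is an arbitrary illustrative choice:

```python
from statsmodels.tsa.api import VAR

model = VAR(params_df)                       # the three SABR parameter series
order = model.select_order(maxlags=10)       # AIC, BIC, HQIC comparison
p = max(order.selected_orders['hqic'], 1)    # lag length chosen by the HQC
result = model.fit(p)

# One-day-ahead forecast of (alpha, rho, nu), used to build the next
# trading day's volatility structure.
gamma_forecast = result.forecast(params_df.values[-p:], steps=1)
```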

4.2.2 Local level model

The local level model is a type of state space model which, like the VAR model, can be used for time series analysis. In a classical regression model, a trend and an intercept are estimated. When focusing on a time series, however, this intercept might in reality not be fixed over time. When the level component changes over time, it applies only locally, which is why the model is known as the local level model. The local level model allows the intercept to change over time and is defined as follows

$\mu_{t+1} = I_m \mu_t + B \eta_t, \qquad \text{where } \mu_t = \begin{pmatrix} \mu_t^{(1)} \\ \mu_t^{(2)} \\ \mu_t^{(3)} \end{pmatrix},$  (4.24)

$\Gamma_t = C \mu_t + D \varepsilon_t,$  (4.25)

where $\Gamma_t$ is the vector of SABR parameters defined in (4.22). The observation or measurement equation (4.25) contains the values of the three observed time series at time $t$. Besides this, we also have an $m \times 1$ vector of unobserved variables $\mu_t$. Three unobserved variables are used in this model, and we define (4.24) as the state equation. We define $\varepsilon_t$ as the observation disturbances and $\eta_t$ as the state disturbances, respectively. These disturbances are independent and follow the standard normal distribution.

The state disturbance coefficient matrix $B$ is defined here as a $3 \times 3$ matrix, which results in a state covariance matrix equal to $BB'$. The observation innovation coefficient matrix $D$ is defined similarly as a $3 \times 3$ matrix, which leads to an observation innovation covariance matrix equal to $DD'$. Both the state disturbance coefficient matrix and the observation innovation coefficient matrix are defined as diagonal matrices, whose diagonal elements are estimated by maximum likelihood.

Furthermore, we note that $I_m$ is the identity matrix of size $m = 3$. Finally, the $3 \times 3$ matrix $C$ links the unobservable factors of the state vector $\mu_t$ with the observation vector $\Gamma_t$. All coefficients of the matrix $C$ are also estimated by maximum likelihood.

The state equation is defined as a random walk, and in the measurement equation an irregular component $\varepsilon_t$ is added, which makes this model a random walk plus noise. The state equation is essential in time series analysis, because the time dependencies in the observed series are dealt with by letting the state at time $t+1$ depend on the state at time $t$ (Commandeur and Koopman, 2007).
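The sketch below fits a univariate local level model to each parameter series with statsmodels. This is a simplification of the multivariate specification above: it ignores the loading matrix $C$ and the cross-correlations between the series, and is intended only to illustrate the random-walk-plus-noise estimation by maximum likelihood.

```python
import statsmodels.api as sm

filtered_level = {}
for col in ['alpha', 'rho', 'nu']:           # params_df as in the VAR sketch
    # 'llevel' = local level: a random walk state plus observation noise;
    # the two disturbance variances are estimated by maximum likelihood.
    mod = sm.tsa.UnobservedComponents(params_df[col], level='llevel')
    res = mod.fit(disp=False)
    filtered_level[col] = res.predict()      # one-step-ahead predictions
```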

4.3 Risk measurement

The option pricing models are based on a probability measure Q related to a risk-neutral world, whereas the real-world probability measure P is used to estimate the risk of a portfolio. These two measures assign different weights to the same possible outcomes for the same derivatives. Risk measures are based on estimates of the profit and loss distribution, and the probability of observing a certain value from this distribution needs to be equivalent to the real-world probability P to obtain a valid risk measure. In this research, we use option pricing models together with the risk-neutral measure to price the swaptions. These swaption prices, as well as the calibrated parameters of the SABR model, are then used to derive the profit and loss distribution under the real-world measure. In this section, the concepts of financial risk and some methods of measuring risk are introduced, including definitions of Value at Risk and Expected Shortfall as well as their limitations.

4.3.1 Risk measures

Financial risk can be seen as the chance of a loss in a financial position, caused by an unexpected change in the underlying risk factor. In this research we focus on a portfolio of swaptions, so the relevant risk is the risk of losses arising from movements in market prices. The risk we are trying to measure is called market risk, and in our specific case, interest rate risk.

Now a formal definition of a risk measure is provided. We have a finite set of states of nature $\Omega$ and the set of all risks $\chi$, i.e. the set of all real-valued functions $X$ on $\Omega$ that represent the final net worth of an instrument in each state. We define a risk measure $\rho(X)$ as a mapping from $\chi$ into $\mathbb{R}$ (Roccioletti, 2016).

To assess whether a risk measure is acceptable, the axioms of a coherent risk measure are defined. In other words, a risk measure is said to be coherent if it satisfies the following four properties.

Axiom 1. Translation Invariance
For all $X \in \chi$ and for all $m \in \mathbb{R}$, we have

$\rho(X + m) = \rho(X) - m.$  (4.26)

In words, translation invariance means that adding a sure amount of capital reduces the risk by the same amount.

Axiom 2. Sub-additivity
For all $X_1 \in \chi$ and $X_2 \in \chi$, we have

$\rho(X_1 + X_2) \leq \rho(X_1) + \rho(X_2).$  (4.27)

So, the risk of two portfolios combined cannot be worse than the sum of the two risks separately.

Axiom 3. Positive Homogeneity

For all $X \in \chi$ and for all $\tau > 0$, we have

$\rho(\tau X) = \tau \rho(X).$  (4.28)

Again in words, positive homogeneity implies that the risk of a position is proportional to its size.

Axiom 4. Monotonicity

For all $X_1 \in \chi$ and $X_2 \in \chi$ with $X_1 \leq X_2$, we have

$\rho(X_1) \geq \rho(X_2).$  (4.29)

Finally, as described by Roccioletti (2016), the monotonicity axiom states that if, in each state of the world, position $X_2$ performs better than position $X_1$, then the risk associated with $X_1$ should be higher than the risk associated with $X_2$.

The Value at Risk measure is a single estimate of the amount by which an institution's position in a risk category could decline due to general market movements during a given holding period. Define $\Delta V_l$ as the change in value of the assets of a financial position from time $t$ to $t+l$. This quantity is measured in euros and is a random variable at time index $t$. The cumulative distribution function of $\Delta V_l$ is denoted $F_l(x)$. The Value at Risk measure is defined such that a loss will not exceed the VaR with probability $1-p$ over a given time horizon (Tsay, 2005). The VaR is given by

$p = \Pr[\Delta V_l \leq -\text{VaR}] = F_l(-\text{VaR}).$  (4.30)

Although VaR is widely used among banks, it has several limitations. First of all, as described by the Basel Committee (2013), the VaR measure does not capture tail risk. As it is a single estimate of the minimal potential loss in an adverse market outcome, it will underestimate the actual potential loss; the Value at Risk measure gives no estimate of the magnitude of the loss in such an event. Besides this, the sub-additivity property fails to hold for VaR in general, meaning that it is not a coherent risk measure and we can have

$\text{VaR}(X_1 + \cdots + X_d) > \text{VaR}(X_1) + \cdots + \text{VaR}(X_d).$  (4.31)

While portfolio diversification generally leads to risk reduction, this reduction is not necessarily reflected by the VaR measure. This is especially a problem when we consider the capital adequacy requirements of a financial institution comprising several businesses. With a decentralized approach, where a VaR number is calculated for each branch separately, we cannot be sure that the aggregated overall risk is estimated accurately. We note, however, that although VaR is not sub-additive in general, whether it is in a particular case depends on the properties of the joint loss distribution.

To overcome the shortcomings of the Value at Risk measure, the Expected Shortfall measure can be used instead. Expected Shortfall is the expected return of the portfolio given that the loss has exceeded the VaR. We define the ES as

$-\text{ES}_{(1-p)} = \mathbb{E}[\Delta V_l \mid \Delta V_l \leq -\text{VaR}_{(1-p)}],$  (4.32)

$-\text{ES}_{(1-p)} = \frac{1}{p} \int_{-\infty}^{-\text{VaR}_{(1-p)}} x f_l(x)\, dx,$  (4.33)

where $f_l(x)$ is the probability density function of $\Delta V_l$. In these formulas we assume a long position in the portfolio, but the same can be derived for a short position.

Expected Shortfall fulfills all four axioms above, so it is a coherent risk measure, and tail risk is taken into account. Still, the ES measure has some issues of its own. To obtain an ES forecast, we first need the VaR estimate before we can compute the tail expectation, which introduces additional uncertainty into the estimation. There is also some difficulty in validating risk models' ES forecasts: as shown by Gneiting (2011), Expected Shortfall is not elicitable. A functional is elicitable if there exists a scoring function that is strictly consistent for it. The difficulty with ES forecasts is that the measure captures all risk in the tail of the return distribution, while some losses far out in the tail will not be observed in regular backtesting. Despite these drawbacks, ES is still proposed as a replacement for the VaR measure.

To assess the risk related to the swaptions, we want to compute Value at Risk and Expected Shortfall forecasts. The Historical Simulation method will now be described. This procedure uses historical returns to predict the VaR. It is easy to implement, but it has some shortcomings: all returns are given the same weight, so the procedure ignores the decreasing predictive value of data further away from the present.

Let $r_t, r_{t-1}, \ldots, r_{t-K}$ be the returns of the portfolio in the sample period. First, the changes in swaption price over our sample are computed. Then we sort the returns in ascending order: $r_{[1]}, r_{[2]}, \ldots, r_{[K]}$.

The one-day-ahead Value at Risk is given by

$-\text{VaR}_{(1-p)} = r_{[k]},$  (4.34)

where $k = Kp$. The Expected Shortfall follows from the previous steps and can be computed as

$-\text{ES}_{(1-p)} = \frac{1}{k} \sum_{i=1}^{k} r_{[i]}.$  (4.35)

We note that classical HS is in theory only valid when the volatility and the correlation are constant over time; when dealing with time-varying volatility, we need to use another method.
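A minimal sketch of the Historical Simulation estimates (4.34) and (4.35), assuming returns holds the $K$ historical one-day profit and loss values of the swaption portfolio in euros:

```python
import numpy as np

def historical_var_es(returns, p=0.025):
    """Historical Simulation VaR and ES as in (4.34)-(4.35)."""
    r_sorted = np.sort(np.asarray(returns))   # ascending: worst returns first
    k = max(int(len(returns) * p), 1)         # k = Kp, at least one observation
    var = -r_sorted[k - 1]                    # -VaR = r_[k]
    es = -r_sorted[:k].mean()                 # -ES = average of r_[1], ..., r_[k]
    return var, es

# Example with simulated P&L; in this research the inputs would be the
# portfolio's historical swaption returns.
rng = np.random.default_rng(0)
var_975, es_975 = historical_var_es(rng.normal(0.0, 1e5, size=500))
```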

4.4 Backtests

Backtesting can be described as checking whether realizations are in line with the model forecasts. Financial institutions base their decisions partly on their estimates of risk measures, so it is very important to test whether these estimates are accurate. Various tests have been developed over time to assess the quality of the models that produce these estimates. Even though it may seem like a simple task, there are some complications. The main difficulty is that the methods produce an estimate of the profit and loss distribution every day, while only one true profit or loss is observed to assess the quality of this estimated distribution. The evaluation of the accuracy of an Expected Shortfall estimate is especially challenging. If we focus on the ES(0.975), for example, a loss that exceeds the VaR should in theory occur in only 2.5% of the cases. Moreover, beyond the VaR itself, we need to assess whether the ES forecast actually represents the true expected value of the tail loss. Besides this, the tail loss is estimated based on a different profit and loss distribution for every new forecast. Fortunately, there are methods to backtest the models we are using. We will mainly focus on backtests based on the VaR estimates, but we will also perform a backtest to assess the performance of the models with regard to their Expected Shortfall estimates.

Campbell (2007) reviews a variety of backtests. He defines a hit function that creates a sequence such as $(0, 0, 0, 1, 0, 0, \ldots, 1)$, where a 1 stands for a loss that exceeds the VaR measure. Determining the accuracy of the VaR measure can then be reduced to determining whether the hit sequence satisfies two properties. First, the probability of incurring a loss that exceeds the $(1-p)$ VaR measure must be $p$. Second, any two elements of the hit sequence must be independent of each other. Only hit sequences satisfying both properties can be regarded as evidence of an accurate VaR model. Let this hit function be defined as follows

$I_t = \begin{cases} 1, & \text{if } r_{t+1} < -\text{VaR}_{(1-p)} \\ 0, & \text{if } r_{t+1} \geq -\text{VaR}_{(1-p)}. \end{cases}$  (4.36)

The hit function is used to test the unconditional coverage property with the backtest proposed by Kupiec (1995), and to test the independence property with the backtest proposed by Christoffersen (1998). In addition, the magnitude of losses exceeding the VaR can be taken into account with a magnitude-based test.

4.4.1 Unconditional coverage backtesting

The unconditional coverage backtest, proposed by Kupiec (1995), tests the null hypothesis $\mathbb{E}[I_t] = p$. The hit function defined at the beginning of this section is used, and we first compute the total number of hits

$n_1 = \sum_{t=1}^{T} I_t,$  (4.37)

and we also define $n_0 = T - n_1$ as the total number of returns larger than $-\text{VaR}_{(1-p)}$. The estimated probability now becomes

$\hat{\pi} = \frac{n_1}{n_0 + n_1}.$  (4.38)

This corresponds to the following hypotheses for the true hit probability $\pi$, based on the returns and the Value at Risk measure:

$H_0: \pi = p, \qquad H_1: \pi \neq p.$  (4.39)

The likelihood under the null hypothesis is defined as

$L(p;\, I_1, I_2, \ldots, I_T) = (1-p)^{n_0} p^{n_1},$  (4.40)

and under the alternative hypothesis as

$L(\pi;\, I_1, I_2, \ldots, I_T) = (1-\pi)^{n_0} \pi^{n_1}.$  (4.41)

This can be tested with a standard likelihood ratio test

$LR_{uc} = -2 \log\left(\frac{L(p;\, I_1, I_2, \ldots, I_T)}{L(\hat{\pi};\, I_1, I_2, \ldots, I_T)}\right) \overset{\text{asy}}{\sim} \chi^2(m-1),$  (4.42)

where $m = 2$ is the number of possible outcomes of the hit function, so the statistic is asymptotically chi-squared with one degree of freedom.
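The test is straightforward to implement. In the sketch below, returns and var_forecasts are assumed to be aligned arrays of realized one-day returns and the corresponding VaR forecasts:

```python
import numpy as np
from scipy.stats import chi2

def kupiec_uc_test(returns, var_forecasts, p=0.025):
    """Unconditional coverage LR test of (4.36)-(4.42)."""
    hits = np.asarray(returns) < -np.asarray(var_forecasts)  # hit sequence (4.36)
    n1 = hits.sum()
    n0 = hits.size - n1
    pi_hat = n1 / (n0 + n1)                   # observed hit rate (4.38)
    if pi_hat in (0.0, 1.0):                  # degenerate sample: LR undefined
        return np.nan, np.nan
    log_lik = lambda q: n0 * np.log(1 - q) + n1 * np.log(q)
    lr_uc = -2.0 * (log_lik(p) - log_lik(pi_hat))            # (4.42)
    return lr_uc, chi2.sf(lr_uc, df=1)        # reject H0 for small p-values
```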
