Calibration Techniques for Affine Short Rate Models


Richard Daniëls

A thesis presented for the degree of

Master of Science in Computational Science

Faculty of Science

University of Amsterdam

September 26, 2014

Academic Supervisor: Dr. Drona Kandhai (UvA)

Company Supervisor: Dr. Norbert Hari (ING Group)


Acknowledgment

I would like to thank Drona Kandhai for giving me the opportunity to work on this project at ING Bank. He was also my academic supervisor and provided me with excellent feedback on my thesis. My daily supervisor at ING was Norbert Hari. I would like to thank him for his help while I was learning about term structure models, for all the discussions we had about my results and for his comments on my thesis. I would like to thank Vytautas Savickas for providing me with a multi-curve term structure model and for the help he gave me with understanding the multi-curve framework. Finally, I would like to thank everyone else that I have worked with at ING.


Abstract

In this work several calibration techniques for affine term structure models are tested empirically, based on Euro swap data. The intended use of the models is for credit risk management, i.e. the calculation of expected exposure and potential future exposure. For credit risk management it is important that the parameter estimates are stable and that the characteristics of simulated yield curves are realistic, both over short and long time periods.

Three calibration techniques were applied to one and two factor Gaussian affine models. The first method fits model implied yield volatility to historical volatility. The second method extends the first method by also fitting model implied correlations between yields to historical correlations. The third method is maximum likelihood on a panel of yield data using the Kalman filter. It was found that there are large advantages to two factor models. Both the volatility+correlation fitting method and maximum likelihood result in stable parameter estimates and reasonable model implied stylized facts.

The experience obtained from the calibrations of Gaussian affine models was used to calibrate a multi-curve model. A multi-curve model contains both the risk free (OIS) yield curve and at least one risky (LIBOR) yield curve. The distinction between OIS and LIBOR yield curves is important for the pricing and risk management of swaps and other derivatives. The multi-curve model that is used in this work has two Hull-White factors for the OIS short rate and one CIR factor for the instantaneous LIBOR-OIS spread. In the model both OIS and LIBOR yields are affine functions of the underlying factors. The model was calibrated using maximum likelihood and the research question was whether the model/calibration method combination could be used for credit risk management.

The parameter estimates of the two Hull-White factors are stable and the model implied OIS yields behave realistically. Calibration of the CIR factor is more problematic, among other reasons because there are multiple parameter combinations with the same likelihood. The main problem is that completely different parameter combinations are found for 3-month and 6-month LIBOR. For future research it is suggested that simulation methods are used to calibrate the multi-curve model. Simulation methods reduce the bias of the estimator and might solve the current issues.


1 Introduction

Many financial institutions have a large exposure to interest rates. Financial products with interest rate exposure can be relatively simple like loans, bonds and swaps, but over the last decades also more complex interest rate derivatives like caps, swaptions and spread options have become popular. The BIS estimates that as of June 2013, the market value of OTC interest rate derivatives was over $15 trillion with total notional amount of $561 trillion. This makes the OTC interest rate derivatives market the largest market in the world in terms of notional value. Not surprisingly a huge literature has been written on models that can be used for pricing and risk management of interest rate derivatives.

Since the financial crisis of 2008 accurate risk management has become more important to banks. This is further stimulated by stronger regulatory requirements in the Basel III accord. A distinction can be made between market risk management and credit risk management. Market risk management is about possible losses that can occur due to changes in yields. Credit risk management is about the losses that can result from the default of a counterparty.

Credit Value Adjustment (CVA) is the provision that has to be taken on a portfolio due to the possible future default of a counterparty. CVA has to be calculated for each counterparty and is a function of exposure, probability of default and loss given default. Yield curve simulations can be used to calculate potential future values of financial products in a portfolio. Exposure is the sum of the value of all financial products with one counterparty, taking into account collateral and netting agreements. In order to calculate CVA, simulations have to be made over the entire duration of a financial product. Interest rate swaps can have a large time to maturity of over 20 years, so models used for CVA should be able to make realistic predictions of interest rate distributions over long time periods.

Interest Rate Models

One type of interest rate model that can be used for risk management is the dynamic term structure model (also called a short rate model). In a short rate model the dynamics of the short rate are specified. The short rate is the theoretical interest rate with an infinitesimal time to maturity; it is not observable. The short rate dynamics are specified both under the real world measure and the equivalent martingale measure. Bond prices are seen as derivatives on the short rate and can be priced by no-arbitrage arguments under the equivalent martingale measure. The real world measure is needed in order to have a realistic behavior of bond returns over time. Some of the first (and most famous) short rate models are the models by Merton (1973), Vasicek (1977) and Cox, Ingersoll and Ross (CIR) (1985). There also exist more complicated versions of these models, where the short rate is a function of multiple unobservable factors.

Most research has been done on the class of affine term structure models defined by Duffie and Kan (1996). The affine class includes the Vasicek and CIR models, but also multi-factor models with stochastic volatility, like the Fong and Vasicek (1991) and Chen (1996) models. The essentially affine class by Duffee (2001) and the semi-affine class by Duarte (2004) extend the affine class with a more complicated specification of the market price of risk.


Short rate models can be used both for risk management and for pricing and hedging of interest rate derivatives. For derivatives pricing it is important that a short rate model exactly fits the current yield curve. Hull and White (1990) extended the Vasicek and CIR models with a time dependent parameter in order to fit the current yield curve exactly.

Calibration

There exist a large number of short rate models. Selection of the best model depends on the application of the model and on the current market conditions. At least as important as the choice of an appropriate model is the choice of a calibration method. The parameter values have a large influence on the properties of a model. For the pricing of derivatives it is important that model implied prices are close to the observed market prices. Short rate models can be calibrated by fitting model implied derivatives prices to market prices. Brigo and Mercurio (2006) give details of this method for the calibration of several short rate models to cap and swaption prices. The method is similar to fitting model implied volatilities to the implied volatility of options. In risk management it is preferable that the model implied volatilities are close to historical volatilities. In Gaussian short rate models it is possible to calibrate the model by fitting model implied volatilities to historical volatilities. Park (2004) uses this method on the Hull-White model.

A large literature has been written on using econometric techniques to calibrate short rate models. Standard econometric techniques, like the efficient method of moments, simulated method of moments and maximum likelihood, can be used when observable variables are used as proxies for the unobservable factors. Chapman et al. (1999) discuss the bias of the method of moments for several models when the short rate is replaced by an observable short-term yield. Litterman and Scheinkman (1991) show that the variation in yields can almost completely be explained by three factors that they call level, slope and curvature. It is possible to use these three components as proxies for the unobservable factors.

An alternative is to invert the bond pricing formula to obtain the factor values from n different bond yields that are assumed to be observed without error. The problem with this approach is that the results depend to a large degree on which yields are used. Chen and Scott (2003) and Dai and Singleton (2000) use this method. It is also possible to assume that bond prices are observed with measurement errors. In that case the model can be written in state space form and the Kalman filter can be used to calculate the likelihood. Babbs and Nowman (1999) use this approach for the 1, 2 and 3 factor Vasicek model. De Jong (2000) shows that at least (n+1) bonds must be used to identify all the parameters in the n factor Vasicek model. Maximum likelihood using the Kalman filter is only exact for Gaussian models. For other short rate models the Kalman filter can be used as a quasi maximum likelihood (QML) estimator. Brandt and He (2006) show that the parameter estimates can be improved when simulated maximum likelihood is used. Aït-Sahalia and Kimmel (2010) also improve on QML by using closed-form likelihood approximations; this method is computationally less expensive than simulation. The assumptions that are made about the measurement errors influence the parameter estimates. In most papers the measurement errors are assumed to be uncorrelated, but Dempster and Tang (2011) recently proposed a specification that includes cross-sectional and serial correlation of the measurement errors.

It is possible to include macroeconomic variables as explanatory variables. Duffee (2011) shows that parameter estimates can be improved by using macroeconomic factors; he uses the growth of industrial production and inflation as factors. Dynamic term structure models impose no-arbitrage restrictions on bond prices; Carriero and Giacomini (2011) investigate whether these restrictions are useful for forecasting.

Multi-Curve Models

Another change brought about by the financial crisis is that the risk premium in LIBOR rates has increased. The existence of a LIBOR risk premium complicates the pricing of interest rate derivatives based on LIBOR rates. The size of the cash flows of such derivatives depends on future LIBOR rates. Traditionally those cash flows are discounted by LIBOR yields. However, the cash flows of collateralized derivatives are considered to be risk free and should be discounted with the risk free rate. The LIBOR rate with the least risk is the overnight rate. Discount factors derived from Overnight Indexed Swap (OIS) rates are close to risk free and can be used to discount collateralized cash flows.

Interest rate models that specify both the OIS and a LIBOR yield curve are needed for pricing and risk management of LIBOR based derivatives. The literature on multi-curve models is relatively small and focuses mostly on derivatives pricing. For example Mercurio (2010) extends the LIBOR market model to a multi-curve framework. A short-rate setting is most common for risk management. Examples of multi-curve short-rate models are by Kenyon (2010) and Savickas (2014). Filipović and Trolle (2012) developed a multi-curve short-rate model where the LIBOR risk premium is separated into a default risk component and a liquidity component. They also calibrate the model with maximum likelihood using real data.

Problem Statement

The focus of this thesis is on the calibration of dynamic term structure models using real data with risk management (mainly CVA) as intended use. First several calibration techniques are tested on the single curve one and two factor Gaussian short rate models. The calibration methods are:

1. The historical volatility fitting method used by Park (2004).

2. A new extension to the volatility fitting method where both model implied volatility and correlation are fitted to historical data.

3. Maximum likelihood using a panel of bond data as used by Babbs and Nowman (1999).

It is possible that other models are more suitable for risk management. However, the focus of this thesis is on calibration and due to the limited time that is available only Gaussian models are considered. The research questions for single curve model calibration are:

• What is the influence of the calibration technique on the properties of the Gaussian short rate model?

• Which calibration technique is the most suitable for risk management purposes?


• Do the beneficial characteristics of the two factor model justify the additional complexity?

In the multi-curve context the model by Savickas (2014) is calibrated using maximum likelihood with a panel of bond data. The research question is whether the model calibrated with maximum likelihood is suitable for risk management applications.

Thesis Structure

In section 2 an introduction to interest rates and dynamic term structure models is given. Relevant equations, like the bond price and yield volatility, are provided for all the single and multi-curve models that are used. In section 3 all the calibration methods are explained. First a general description of maximum likelihood on state space models using the Kalman filter is given. It is described in detail how each model can be written in state space form, such that the Kalman filter can be used to calculate the likelihood. The results section (4) starts with a discussion of some stylized facts of interest rates. For each calibration method it is investigated how well the Gaussian model fits the stylized facts. Also the parameter stability when the model is recalibrated is tested. For the multi-curve model parameter stability and some basic properties of simulated OIS and LIBOR yield curves are investigated. Section 5 concludes.


2 Interest Rates and Term Structure Models

2.1 Interest Rate Basics

An interest rate is the rate of return that a lender receives on a loan, expressed in annual terms. There exist different interest rates, depending on the credit quality of the borrower and the time to maturity of the loan. The term structure of interest rates shows the level of interest rates with the same credit quality for different times to maturity. In this thesis only risk free interest rates are considered, i.e. the probability that the borrower will default is zero.

There are multiple ways to represent the same interest rate. The day counting convention and compounding rule must be specified to know the meaning of an interest rate. In this thesis all interest rates have the same day counting convention and are continuously compounded if not stated otherwise. In this subsection the definitions of bonds, day counting conventions and compounding rules are given. Also the most important derivatives on interest rates are introduced. The information in this section is based on Brigo and Mercurio (2006) chapter 1 and on Gibson et al. (2010).

2.1.1 Bonds and the Term Structure of Interest Rates

Zero-Coupon Bonds

A bond is a tradable loan with a principal amount N and a maturity date T . Most bonds are coupon bonds, meaning that interest payments are made periodically and the principal is repaid at maturity. The bonds that are considered in this thesis are zero-coupon bonds, the only payment is the repayment of the principal at maturity. P (t, T ) is the price at time t of a zero-coupon bond with maturity date T and principal amount 1, i.e. P (T, T ) = 1. The time to maturity of a bond is given by T − t and is always displayed in years. The zero-coupon bond price P (t, T ) can be seen as the discount factor of payments made at time T . Consecutive discount factors (and thus bond prices) can be multiplied to obtain the discount factor over the entire period,

P (t, T ) = P (t, S)P (S, T ). (2.1)

Compounding Rules

To convert zero-coupon bond prices and interest rates into each other the compounding rule must be specified. The compounding rule specifies how often the interest on a bond is reinvested. As an example consider a bond that pays $101 after 0.25 years and costs $100 today (i.e. it pays 1% interest over a quarter year). This interest rate can be annualized with quarterly compounding to 4%. However, when the return of the bond is reinvested in similar bonds over a year, the annual return would be 1.01^4 − 1 = 4.06%. This is the interest rate with annual compounding. Continuous compounding means that interest is paid and reinvested continuously.

Brigo and Mercurio (2006) give formulas for the conversion between zero-coupon prices and interest rates with any compounding rule. The only compounding rules that are used in this thesis are simple compounding and continuous compounding. The simply-compounded interest rate L(t, T ) between time t and T is given by

$$L(t, T) = \frac{1 - P(t, T)}{(T - t)\,P(t, T)}, \tag{2.2}$$

and the bond price in terms of L by

$$P(t, T) = \frac{1}{1 + L(t, T)(T - t)}. \tag{2.3}$$

The continuously-compounded interest rate R(t,T) between time t and T is given by

$$R(t, T) = -\frac{\ln P(t, T)}{T - t}, \tag{2.4}$$

and the bond price in terms of R by

$$P(t, T) = \exp\bigl(-(T - t)\,R(t, T)\bigr). \tag{2.5}$$
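The conversions between zero-coupon bond prices and simply or continuously compounded rates can be checked numerically. The following is a minimal sketch in Python; the function names are illustrative and not part of the thesis. It reuses the example of a bond that costs $100 today and pays $101 after a quarter year.

```python
import math


def simple_rate(price, tau):
    """Simply-compounded rate L = (1 - P) / (tau * P) for time to maturity tau."""
    return (1.0 - price) / (tau * price)


def price_from_simple(rate, tau):
    """Zero-coupon bond price from a simply-compounded rate, as in (2.3)."""
    return 1.0 / (1.0 + rate * tau)


def continuous_rate(price, tau):
    """Continuously-compounded rate R = -ln(P) / tau, as in (2.4)."""
    return -math.log(price) / tau


def price_from_continuous(rate, tau):
    """Zero-coupon bond price from a continuously-compounded rate, as in (2.5)."""
    return math.exp(-tau * rate)


# Bond from the text: costs 100 today and pays 101 after 0.25 years,
# so the price per unit of principal is 100 / 101.
tau = 0.25
price = 100.0 / 101.0

L = simple_rate(price, tau)            # 4% with simple (quarterly) compounding
R = continuous_rate(price, tau)        # 4 * ln(1.01), slightly below 4%
annual = 1.01 ** 4 - 1.0               # about 4.06% with annual compounding

print(f"simple: {L:.4%}, continuous: {R:.4%}, annual: {annual:.4%}")
```

The three numbers describe the same bond; only the compounding rule differs.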

Day Counting Conventions

Interest rates are always expressed in annual terms; the day counting convention specifies what is meant by annual. Popular day counting conventions are the actual number of days to maturity divided by 360 or 365. To be able to compare different interest rates it is important that they all use the same day counting convention and compounding rule. It is not important which day counting convention is used, as long as it is used consistently.

The Bank Account and Short Rate

A bank account is an interest paying investment that has no maturity date, funds can be withdrawn at any time. The bank account is equivalent to continuously reinvesting in zero-coupon bonds P (t, t + δt) with infinitesimal time to maturity δt. The interest rate on such a bond is called the instantaneous spot rate, or short rate. The short rate r(t) is stochastic and not observable in the market.

The value of the bank account evolves according to the differential equation

$$dB(t) = r(t)B(t)\,dt,$$

with initial condition B(0) = 1. This differential equation has the solution

$$B(t) = \exp\left(\int_0^t r(s)\,ds\right). \tag{2.6}$$

The future value of the bank account is unknown, since the short rate is stochastic. The stochastic discount factor D(t, T) is defined as the value of the bank account at time t such that the value increases to 1 at time T, and is given by

$$D(t, T) = \exp\left(-\int_t^T r(s)\,ds\right). \tag{2.7}$$

The difference between the stochastic discount factor and the bond price is that only the latter is observable in the market.
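As an illustration of (2.6) and (2.7), the sketch below computes the bank account and the stochastic discount factor along one short rate path. The path is made up for the example and is assumed to be constant within each monthly step, so that the integral reduces to a sum; the function names are hypothetical.

```python
import math


def bank_account(short_rates, dt):
    """Return B at the grid points 0, dt, 2*dt, ... for a piecewise-constant path.

    short_rates[k] is the short rate on the interval [k*dt, (k+1)*dt).
    """
    values = [1.0]                 # initial condition B(0) = 1
    integral = 0.0
    for r in short_rates:
        integral += r * dt         # integral of r(s) ds over one step
        values.append(math.exp(integral))
    return values


def stochastic_discount_factor(values, i, j):
    """D(t_i, t_j) = B(t_i) / B(t_j) for grid indices i <= j."""
    return values[i] / values[j]


# Hypothetical monthly short rate path over one year (made-up numbers).
path = [0.010, 0.012, 0.011, 0.013, 0.015, 0.014,
        0.016, 0.015, 0.017, 0.018, 0.016, 0.019]
B = bank_account(path, dt=1.0 / 12.0)

D_0_1 = stochastic_discount_factor(B, 0, 12)       # D(0, 1)
D_0_half = stochastic_discount_factor(B, 0, 6)     # D(0, 0.5)
D_half_1 = stochastic_discount_factor(B, 6, 12)    # D(0.5, 1)
print(B[-1], D_0_1)
```

Along a single path the discount factors multiply, D(0, 1) = D(0, 0.5) D(0.5, 1). The value of D(0, 1) is only known once the whole path has been realized, which is why it is not observable at time 0.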

Forward Rates

The simply-compounded forward rate F (t, S, T ) is the rate of return that is agreed upon at time t on a risk-free investment between the date of expiration S and the date of maturity T . It can be seen as the expectation at time t of the simply-compounded rate L(S, T ). The forward rate is defined as

$$F(t, S, T) = \frac{1 - P(S, T)}{(T - S)\,P(S, T)} = \frac{P(t, S) - P(t, T)}{(T - S)\,P(t, T)}, \tag{2.8}$$

where the second equality follows from (2.1). The forward discount rate in terms of F is given by

$$P(S, T) = \frac{1}{1 + F(t, S, T)(T - S)}. \tag{2.9}$$
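The forward rate can be computed directly from two zero-coupon bond prices observed at time t. The sketch below uses hypothetical prices for a 1-year and a 2-year bond and checks that (2.9) returns the ratio of the two prices; the function names are illustrative.

```python
def forward_rate(p_t_s, p_t_t, tau):
    """Simply-compounded forward rate F(t, S, T) from P(t, S) and P(t, T).

    tau is the length T - S of the forward period in years.
    """
    return (p_t_s - p_t_t) / (tau * p_t_t)


def forward_discount(fwd, tau):
    """Forward discount factor over [S, T] implied by the forward rate, as in (2.9)."""
    return 1.0 / (1.0 + fwd * tau)


# Hypothetical bond prices observed today (t = 0).
p_1y = 0.98    # P(0, 1)
p_2y = 0.95    # P(0, 2)

F = forward_rate(p_1y, p_2y, tau=1.0)
print(f"1y-into-1y forward rate: {F:.4%}")
print(forward_discount(F, 1.0), p_2y / p_1y)
```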

The instantaneous forward rate f (t, T ) is the forward rate F (t, S, T ) for an infinitesimal time interval (T − S) and can be seen as an expectation at time t of the short rate r(T ). The forward rate f (t, t) is equal to the short rate r(t). The instantaneous forward rate is given by

$$f(t, T) \equiv \lim_{T \to S^+} F(t, S, T) = -\lim_{T \to S^+} \frac{1}{P(t, T)} \frac{P(t, T) - P(t, S)}{T - S} = -\frac{1}{P(t, T)} \frac{\partial P(t, T)}{\partial T} = -\frac{\partial \ln P(t, T)}{\partial T}. \tag{2.10}$$

This can be rewritten to

$$P(t, T) = \exp\left(-\int_t^T f(t, u)\,du\right). \tag{2.11}$$

Term Structures

The term structure of interest rates, also called yield curve, maps the time to maturity T − t to the yields R(t, T ). The same information is contained in the term structures that relate time to maturity to bond prices P (t, T ) or forward rates f (t, T ).
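That the three term structures carry the same information can be verified numerically. The sketch below starts from a hypothetical linear yield curve R(0, T) = a + bT (an assumption made only for this example), converts it to bond prices with (2.5), differentiates ln P numerically to obtain instantaneous forward rates as in (2.10), and integrates the forward rates to recover the bond price.

```python
import math

# Hypothetical yield curve R(0, T) = INTERCEPT + SLOPE * T (made-up numbers).
INTERCEPT = 0.01
SLOPE = 0.002


def yield_curve(T):
    """Continuously-compounded yield for time to maturity T."""
    return INTERCEPT + SLOPE * T


def bond_price(T):
    """P(0, T) = exp(-T * R(0, T))."""
    return math.exp(-T * yield_curve(T))


def inst_forward(T, h=1e-5):
    """f(0, T) = -d ln P(0, T) / dT, approximated by a central difference."""
    return -(math.log(bond_price(T + h)) - math.log(bond_price(T - h))) / (2.0 * h)


def price_from_forwards(T, n=1000):
    """P(0, T) = exp(-integral of f(0, u) du over [0, T]), midpoint rule."""
    du = T / n
    integral = sum(inst_forward((k + 0.5) * du) for k in range(n)) * du
    return math.exp(-integral)


T = 5.0
print(yield_curve(T), inst_forward(T))          # forward rate lies above the yield
print(bond_price(T), price_from_forwards(T))    # both routes give the same price
```

For this upward sloping curve the forward rate equals a + 2bT, so it lies above the yield at every maturity.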

Another type of term structure is the term structure of yield volatilities, also called volatility curve. The volatility curve maps the time to maturity to the standard deviation (volatility) of the yield with that time to maturity.

2.1.2 Interest Rate Derivatives

The goal of this thesis is to make predictions of the distribution of yields in the future, with the application of risk management of contingent claims on interest rates. Such contingent claims are called interest rate derivatives. The price of a zero-coupon bond is a function of an interest rate, thus a zero-coupon bond can be seen as an interest rate derivative. The most important other interest rate derivatives are:

Swaps

An interest rate swap is a financial contract that can be seen as a combination of a floating rate and a fixed rate bond with the same notional and maturity date. In a payer swap a fixed rate is paid on the notional at the coupon dates, a floating rate is received at the same dates. The opposite position is called a receiver swap. The floating rate is usually the LIBOR rate with a time to maturity equal to the time between coupon dates.

Bond Options

A European bond option gives the holder the right to buy (call option) or sell (put option) the underlying bond on the maturity date for the strike price. A call (put) option $O_P(t, S, T)$ on the bond P (t, T ) is equivalent to a put (call) option $O_R(t, S, T)$ on spot rate R(t, T ), both with the same maturity date S. The only difference is that physical delivery is not possible when the underlying is an interest rate.

Caps/Floors

A cap is an agreement that gives the buyer protection against rising interest rates. Consider a floating rate bond with nominal value $N$ and maturity date $T$. At dates $T_i$, $i = 2, \ldots, m$, a coupon with value $N \tau_i L(T_{i-1}, T_i)$ is paid, where $\tau_i = T_i - T_{i-1}$ is the time between two payments and $L(T_{i-1}, T_i)$ is the LIBOR rate at time $T_{i-1}$ with maturity $T_i$. The interest rate of the bond is determined at dates $T_i$, $i = 1, \ldots, (m-1)$. The seller of the bond can cap the interest rate that has to be paid at time $T_i$ to $R_K$ by buying a European call option on the LIBOR rate $L(T_{i-1}, T_i)$ with strike price $R_K$. The payoff is $\max(L(T_{i-1}, T_i) - R_K, 0)$. Such an option is called a caplet. A cap consists of $m - 1$ caplets, where the payoff is paid one time step later than the time when the value of the payoff is determined. The implied volatility of a caplet is the volatility that must be entered into a Black-Scholes type of formula for the pricing of bond options. Each caplet inside a cap can have a different implied volatility, but only a single market price is observed for a cap. It is possible to retrieve caplet volatilities when the prices of multiple caps with different maturity dates $T_m$ are observed.

A floor consists of m − 1 floorlets, a floorlet is equivalent to a European put option on an interest rate.
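The text does not spell out the pricing formula behind the implied volatility. As a sketch, the code below assumes the standard Black (1976) formula for a caplet on a forward LIBOR rate, with the payoff discounted by a given discount factor to the payment date. All function names and input values are hypothetical. The implied volatility is recovered by bisection, which works because the price is increasing in the volatility.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def caplet_payoff(libor, strike, notional, tau):
    """Payoff N * tau * max(L - R_K, 0), paid at the end of the accrual period."""
    return notional * tau * max(libor - strike, 0.0)


def black_caplet(forward, strike, vol, fixing_time, tau, discount,
                 notional=1.0, is_floorlet=False):
    """Black (1976) price of a caplet, or a floorlet if is_floorlet is True.

    forward: forward LIBOR rate for the accrual period
    fixing_time: time in years until the rate is determined
    tau: length of the accrual period in years
    discount: discount factor to the payment date
    """
    std = vol * math.sqrt(fixing_time)
    d1 = (math.log(forward / strike) + 0.5 * std ** 2) / std
    d2 = d1 - std
    if is_floorlet:
        undiscounted = strike * norm_cdf(-d2) - forward * norm_cdf(-d1)
    else:
        undiscounted = forward * norm_cdf(d1) - strike * norm_cdf(d2)
    return notional * tau * discount * undiscounted


def implied_caplet_vol(price, forward, strike, fixing_time, tau, discount,
                       lo=1e-6, hi=5.0, iterations=100):
    """Volatility that reproduces a caplet price (unit notional), by bisection."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if black_caplet(forward, strike, mid, fixing_time, tau, discount) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical inputs: 2% forward, 2.5% strike, 30% volatility,
# fixing in 1 year, 6-month accrual period, discount factor 0.97.
cap = black_caplet(0.02, 0.025, 0.30, 1.0, 0.5, 0.97)
floor = black_caplet(0.02, 0.025, 0.30, 1.0, 0.5, 0.97, is_floorlet=True)
vol = implied_caplet_vol(cap, 0.02, 0.025, 1.0, 0.5, 0.97)
print(cap, floor, vol)
```

The difference between the caplet and the floorlet price equals the discounted value of a forward contract on the rate, which gives a simple consistency check on the two formulas.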

Swaptions

A European swaption gives the holder the right to enter a payer/receiver swap at a predetermined date. All aspects of the underlying swap are specified in the swaption, including the notional, maturity, rate of the fixed leg (this is also the strike of the swaption) and how the rate of the floating leg is determined. Like with other options the price of swaptions is usually expressed with the implied volatility. Implied volatilities of at-the-money swaptions exist for two time dimensions, the time to maturity of the swaption and the time to maturity of the swap after the swaption has been exercised.

2.1.3 The LIBOR-OIS Spread

There exist different types of swaps, depending on how the floating rate is determined. In a LIBOR based swap the time between coupon payments determines which LIBOR rate is used; 3 and 6 month LIBOR swaps are the most liquid, but 1 and 12 month LIBOR swaps also exist. In an Overnight Indexed Swap (OIS) the floating rate is determined by taking the geometric average of the overnight rate (EONIA for the Euro) between coupon payments. The LIBOR-OIS spread is the difference between the fixed rate of a LIBOR based swap and the OIS rate.
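The geometric averaging of the overnight rate can be sketched as follows. The code assumes daily compounding of the overnight fixings with an ACT/360 day count, which is the usual convention for EONIA swaps but is not stated in the text; the fixings are made-up numbers and the function name is hypothetical.

```python
def compounded_overnight_rate(fixings, days=None, basis=360.0):
    """Annualized floating rate of an OIS period from overnight fixings.

    fixings: overnight rates, one per business day in the period
    days: number of calendar days each fixing applies to (3 over a weekend);
          defaults to one day per fixing
    basis: day count basis (assumed ACT/360)
    """
    if days is None:
        days = [1] * len(fixings)
    growth = 1.0
    for rate, n in zip(fixings, days):
        growth *= 1.0 + rate * n / basis      # compound each overnight accrual
    return (growth - 1.0) * basis / sum(days)


# Hypothetical week of fixings; the Friday fixing applies for three days.
week_fixings = [0.0010, 0.0012, 0.0011, 0.0013, 0.0012]
week_days = [1, 1, 1, 1, 3]
rate = compounded_overnight_rate(week_fixings, week_days)
print(f"compounded overnight rate: {rate:.6%}")
```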

Both the overnight rate and LIBOR rates are based on unsecured loans between banks and contain a risk premium. The amount of default risk and thus the size of the risk premium are increasing in the tenor of the loans, thus the overnight rate contains the least default risk. Up to the beginning of the financial crisis in 2007 the LIBOR-OIS spread was only a few basis points and was ignored in the valuation of derivatives. However, in the financial crisis LIBOR-OIS spreads increased dramatically and they are currently used in the pricing of many derivatives. Note that there are two dimensions to the LIBOR-OIS spread: the first dimension is the tenor of the LIBOR rate and the second dimension is the time to maturity of a yield or zero-coupon bond.

The use of collateral and netting agreements has made derivatives close to risk free. In the valuation of such derivatives the cashflows should be discounted using OIS discount factors, since they are the closest to risk-free discount factors. OIS discount factors can be bootstrapped from OIS rates using traditional methods, because both the cashflows and discount factors are based on the overnight rate. However, for many derivatives the sizes of the cashflows are based on a LIBOR rate.

To explain how LIBOR discount factors are bootstrapped from LIBOR based swaps, first consider the valuation of a swap with notional $N$ based on 3 month LIBOR. Payments are made at times $t_k$, $k = 1, \ldots, K$, with time intervals $t_k - t_{k-1} = \Delta_k$ of 3 months. The fixed rate is denoted by $S^{\mathrm{LIBOR}}$ and the floating rate paid at time $t_k$ is equal to the LIBOR rate $L(t_{k-1}, t_k)$. The expectation at time 0 of the LIBOR rate $L(t_{k-1}, t_k)$ is equal to the forward rate $F^{\mathrm{LIBOR}}(0, t_{k-1}, t_k)$. The fixed leg of the swap can be valued by

$$N \times S^{\mathrm{LIBOR}} \sum_{k=1}^{K} \Delta_k \times D_k^{\mathrm{OIS}},$$

and the floating leg by

$$N \sum_{k=1}^{K} F^{\mathrm{LIBOR}}(0, t_{k-1}, t_k) \times \Delta_k \times D_k^{\mathrm{OIS}},$$

where $D_k^{\mathrm{OIS}}$ are OIS discount factors.

Swap rates are quoted such that the price equals zero, so

$$S^{\mathrm{LIBOR}} \sum_{k=1}^{K} \Delta_k \times D_k^{\mathrm{OIS}} = \sum_{k=1}^{K} F^{\mathrm{LIBOR}}(0, t_{k-1}, t_k) \times \Delta_k \times D_k^{\mathrm{OIS}}.$$

Using the market price of the fixed rate one LIBOR forward rate can be determined if all the other LIBOR forwards are known. Thus LIBOR forwards can be bootstrapped from LIBOR swap rates.
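The bootstrap can be sketched as a forward substitution. The code below assumes that a par swap rate is available for every payment date (swaps with 1, 2, ..., K payments) and that the fixed and floating legs share the same payment dates, as in the valuation above. In practice swap quotes are sparser and interpolation is needed; that step is left out. All names and numbers are hypothetical.

```python
def par_swap_rate(forwards, accruals, ois_discounts):
    """Fixed rate that sets the swap value to zero, given LIBOR forwards."""
    floating = sum(f * dl * d for f, dl, d in zip(forwards, accruals, ois_discounts))
    annuity = sum(dl * d for dl, d in zip(accruals, ois_discounts))
    return floating / annuity


def bootstrap_libor_forwards(swap_rates, accruals, ois_discounts):
    """Recover LIBOR forwards from par swap rates and OIS discount factors.

    swap_rates[K-1] is the par rate of the swap with K payments. For each K the
    par condition is solved for the one forward that is not yet known.
    """
    forwards = []
    annuity = 0.0        # running sum of accrual * OIS discount factor
    floating_pv = 0.0    # running value of the floating leg per unit notional
    for s, dl, d in zip(swap_rates, accruals, ois_discounts):
        annuity += dl * d
        f = (s * annuity - floating_pv) / (dl * d)
        forwards.append(f)
        floating_pv += f * dl * d
    return forwards


# Hypothetical quarterly grid over one year (made-up numbers).
accruals = [0.25, 0.25, 0.25, 0.25]
ois_discounts = [0.999, 0.997, 0.995, 0.992]
true_forwards = [0.003, 0.004, 0.005, 0.006]

# Build par swap rates from the known forwards, then bootstrap them back.
swap_rates = [par_swap_rate(true_forwards[:k], accruals[:k], ois_discounts[:k])
              for k in range(1, 5)]
recovered = bootstrap_libor_forwards(swap_rates, accruals, ois_discounts)
print(swap_rates)
print(recovered)
```

The round trip from forwards to swap rates and back only checks internal consistency; it says nothing about how market quotes should be interpolated.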

In the empirical part of this thesis continuously compounded OIS and LIBOR yields are used. Discount factors, yields and forward rates can easily be converted into each other by using equations (2.4)-(2.5), (2.8)-(2.9) and (2.1).


2.2 Term Structure Models

Over the last decades an immense literature has been written on models of the term structure. In section 2.2.1 a short history is given of how term structure models have developed, what the key differences between models are and what the models are used for. In section 2.2.2 the mathematical framework relating to short rate models is explained and the class of affine term structure models is presented in section 2.2.3. Some examples of famous short rate models are given in section 2.2.4 and finally the equations that will be used in the empirical part of this thesis are derived in sections 2.2.5 and 2.2.6.

2.2.1 History of Term Structure Models

The return obtained on a risk-free bond is only known with certainty when the bond is held to maturity. Uncertainty arises both when investing in shorter and longer maturity bonds than the investment period. With shorter maturity bonds the funds have to be reinvested at some time and the interest rates at that date are currently unknown. With longer maturity bonds the bond has to be sold at some time and the price of the bond at that date is currently unknown.

Theories that try to explain bond returns over some time period for bonds with different times to maturity have been around for a long time. One such theory is the expectations hypothesis. There are many versions of the expectations hypothesis, but they all have in common that forward rates are expectations of future spot rates. The expected return over a fixed time period is the same for bonds with all times to maturity; in particular the expected return over an infinitesimally small time period is equal to the short rate for all bonds. Several forms of the expectations hypothesis are the local expectations hypothesis,

$$P(t, T) = E\left[\exp\left(-\int_t^T r(s)\,ds\right) \middle|\, \mathcal{F}_t\right], \tag{2.12}$$

the return-to-maturity expectations hypothesis,

$$\frac{1}{P(t, T)} = E\left[\exp\left(\int_t^T r(s)\,ds\right) \middle|\, \mathcal{F}_t\right],$$

and the yield-to-maturity expectations hypothesis,

$$-\frac{\ln P(t, T)}{T - t} = \frac{1}{T - t}\, E\left[\int_t^T r(s)\,ds \,\middle|\, \mathcal{F}_t\right].$$

If future short rates were known with certainty all these forms would be equivalent. Jensen's inequality can be used to show that the theories are mutually inconsistent when the short rate is stochastic, see CIR (1981) for more information.
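The inconsistency can be made concrete with a small numerical example. The sketch below takes a made-up two-scenario distribution for the integrated short rate (the integral of r(s) over the life of the bond) and computes the bond price implied by each of the three forms of the expectations hypothesis. The function and variable names are illustrative only.

```python
import math


def eh_bond_prices(integrated_rates, probabilities):
    """Bond prices implied by three forms of the expectations hypothesis.

    integrated_rates: possible values of the integrated short rate over [t, T]
    probabilities: probability of each value
    Returns (local, return_to_maturity, yield_to_maturity) prices.
    """
    pairs = list(zip(integrated_rates, probabilities))
    local = sum(p * math.exp(-x) for x, p in pairs)            # E[exp(-I)]
    rtm = 1.0 / sum(p * math.exp(x) for x, p in pairs)         # 1 / E[exp(I)]
    ytm = math.exp(-sum(p * x for x, p in pairs))              # exp(-E[I])
    return local, rtm, ytm


# Stochastic case: the integrated rate is 2% or 6% with equal probability.
local_price, rtm_price, ytm_price = eh_bond_prices([0.02, 0.06], [0.5, 0.5])
print(local_price, ytm_price, rtm_price)

# Deterministic case: the integrated rate is 4% with certainty.
det_prices = eh_bond_prices([0.04], [1.0])
print(det_prices)
```

With a stochastic rate, Jensen's inequality orders the three prices: the local form gives the highest price and the return-to-maturity form the lowest. In the deterministic case all three coincide.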

A problem with the expectations hypothesis is that empirically long-term bonds have a higher expected return than short-term bonds. There is a risk premium associated with holding longer term bonds. The liquidity preference theory and preferred habitat theory are extensions to the expectations hypothesis. In the liquidity preference theory investors prefer to hold short-term bonds and demand a higher expected return to invest in longer maturity bonds. This implies that the average yield curve over time is upward sloping. The preferred habitat theory is more general; it states that borrowers and lenders have a preferred time to maturity. The risk premium for each time to maturity is determined by demand and supply. This allows for any shape of the average yield curve.

The liquidity preference and preferred habitat theories do not provide a method to predict the risk premium and test the predictions empirically. Another problem with all the classical theories of the term structure is that they do not make a clear distinction between the space and the time dimension of the yield curve. The space dimension is given by the yield curve at one moment in time. Yield curves are usually upward sloping, but can also be downward sloping (inverted) or have a humped shape. The time dimension can be used to measure properties of yield changes. Expected bond returns depend both on the current yield levels and the expected changes in yields. Both the space and the time dimension are needed to estimate expected returns.

Black and Scholes used a continuous time model to derive option prices based on the absence of arbitrage. A similar approach is taken by dynamic term structure models. In a dynamic term structure model the continuous time dynamics of a vector of Markovian state variables are defined. Bonds, bond options, swaptions and other financial products are assumed to be a function of the state variables and can be priced under the assumption that there are no arbitrage opportunities. In most models the short rate is one of the state variables; for this reason dynamic term structure models are often called short rate models.

The time dimension is specified by the dynamics of the state vector under the real world measure. The space dimension (yield curve) at a specific time is a function of the values of the state vector at that time. The function depends on the dynamics of the state vector under the risk neutral measure.

A distinction can be made between equilibrium and arbitrage-free short rate models. In equilibrium models the current yield curve is an output of the model, so the model implied yield curve cannot be fitted exactly to the yield curve observed in the market. For derivatives pricing it is unacceptable that the current yield curve is not fitted exactly. Arbitrage-free models have a time dependent parameter that can be set in such a way that the model implied yield curve exactly fits the observed yield curve. The name equilibrium model comes from the theory of general equilibrium. In some short rate models, for instance the CIR (1985) model, the state vector dynamics are derived from an economy in general equilibrium. However, in most equilibrium models the theory of general equilibrium is not used. The dynamics of the state vector are simply specified and bond prices are calculated by absence of arbitrage arguments. Examples of equilibrium models are the Merton (1973), Vasicek (1977), Beaglehole and Tenney (1991), Fong and Vasicek (1991) and Chen (1996) models. Hull and White (1990) extended the Vasicek and CIR models with time dependent parameters. The arbitrage-free version of the Vasicek model is called the Hull-White (1994) model.

Heath, Jarrow and Morton (HJM) (1992) introduced an arbitrage-free model where a stochastic differential equation is defined for the entire instantaneous forward curve f(t, T). Bond prices can be calculated using equation (2.11). In the HJM framework both the initial forward curve f(0, T) and the volatility structure are inputs. In general the bond price P(t, T) is path dependent, but for certain specifications of the volatility structure the model is Markovian. The short rate is also part of the HJM framework, because r(t) = f(t, t). As a result all the widely used arbitrage-free short rate models are special cases of the HJM framework. The specification of the volatility structure determines which model is obtained.

In the LIBOR market model (1997) a set of simply-compounded forward interest rates F(t, T, S) is modeled. An advantage over short rate and HJM models is that forward LIBOR rates are observed in the market, whereas the short rate and instantaneous forward rates are not. The forward rates are assumed to have a log-normal distribution. The pricing formula for caps that results from the LIBOR market model is equal to Black's formula for option pricing. It is possible to reproduce caplet market prices exactly using the LIBOR market model, when the implied volatility is known. Short rate models are only capable of exactly fitting a limited number of caplets. For the pricing of caps the LIBOR market model is the most suitable model, but for risk management a short rate model should be used.

When calibrating any interest rate model to historical bond prices it is convenient to have a time series of yields with a constant time to maturity. In the market only a limited number of bonds is observed and most bonds are coupon bonds. To obtain constant maturity yields the observed prices must be interpolated. Several models have been developed that specify a functional form of the yield curve, depending on some parameters. The parameters can be chosen in such a way that observed (coupon) bond prices are fitted as closely as possible. An example of such a model is the Nelson-Siegel-Svensson model (Svensson 1994) that is commonly used by central banks.

Diebold and Li (2006) added a time dimension to the Nelson-Siegel model by letting the parameters evolve by an AR process. These types of models are an alternative to short rate models for the purpose of risk management.

2.2.2 Dynamic Term Structure Models

The key elements of the mathematical framework relating to multi-factor short rate models are given in this section. The reader is referred to the lecture notes by Lund (1998) for an introduction to one factor models and a more complete description of multi-factor models.

The most important assumptions underlying multi-factor short rate models are:

• The short rate r(t, x(t)) and zero-coupon bond prices P(t, T, x(t)) for any time to maturity T are functions of an [n x 1] vector of state variables x(t).
• The state vector follows a continuous time Markov process. The dynamics of the state vector under the real world measure P are given by

$$ dx(t) = \mu(t, x(t))\,dt + \sigma(t, x(t))\,dW(t), \tag{2.13} $$

where µ is an [n x 1] drift vector, σ is a diagonal [n x n] volatility matrix and W is an n-dimensional Brownian motion with correlation between elements given by $dW_i(t)\,dW_j(t) = \rho_{ij}\,dt$.

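By applying Itô's lemma to the bond price P(t, T) and using no-arbitrage arguments, the bond price can be shown to satisfy the following PDE,

$$ \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial^2 P(t,T)}{\partial x_i(t)\,\partial x_j(t)}\,\rho_{ij}\,\sigma_i(\cdot)\,\sigma_j(\cdot) + \sum_{i=1}^{n}\frac{\partial P(t,T)}{\partial x_i(t)}\Big(\mu_i(\cdot) - \sigma_i(\cdot)\lambda_i(\cdot)\Big) + \frac{\partial P(t,T)}{\partial t} = r\,P(t,T), \tag{2.14} $$

with boundary condition P(T, T) = 1. Drift $\mu_i(\cdot)$, market price of risk $\lambda_i(\cdot)$ and volatility $\sigma_i(\cdot)$ are all functions of time t and the state vector x(t).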

By the Feynman-Kac formula the solution to PDE (2.14) can be represented by the risk-neutral conditional expectation

$$ P(t,T) = E^{Q}\left[\exp\left(-\int_t^T r\big(s, x(s)\big)\,ds\right)\,\Big|\,\mathcal{F}_t\right]. \tag{2.15} $$

The dynamics of the state vector under the risk neutral measure Q are given by

$$ dx(t) = \Big(\mu(t, x(t)) - \sigma(t, x(t))\,\lambda(t, x(t))\Big)\,dt + \sigma(t, x(t))\,dW^{Q}(t). \tag{2.16} $$

Note that the bond pricing formula (2.15) is equal to the local expectations hypothesis under the risk neutral measure. By substitution of equation (2.15) into equation (2.4), it can be seen that the spot rate is given by

$$ R(t,T) = -\frac{1}{T-t}\,\ln E^{Q}\left[\exp\left(-\int_t^T r(s)\,ds\right)\,\Big|\,\mathcal{F}_t\right]. \tag{2.17} $$

2.2.3 Affine Term Structure Models

The class of affine term structure models was introduced by Duffie and Kan (1996). An affine term structure model satisfies the following conditions:

1. The short rate is an affine function of the state variables,

$$ r(t) = \phi + \sum_{i=1}^{n} w_i x_i(t) = \phi + w' x(t), \tag{2.18} $$

where φ is a scalar parameter and w is an [n x 1] vector of parameters.

2. The drift and volatility of the state vector are affine functions of the state variables. The dynamics of the state vector under the risk neutral measure are given by

$$ dx(t) = \tilde{K}\big(\tilde{\Theta} - x(t)\big)\,dt + \Sigma\sqrt{S(t)}\,dZ^{Q}(t), \tag{2.19} $$

where $\tilde{K}$ is an [n x n] matrix of mean reversion parameters, $\tilde{\Theta}$ is an [n x 1] vector of mean level parameters, $Z^{Q}(t)$ is an [n x 1] vector of independent Brownian motions, $S(t) = \mathrm{diag}\big(\alpha_i + \beta_i' x(t)\big)$, $\alpha_i$ is a scalar parameter, $\beta_i$ is an [n x 1] vector of parameters and Σ is an [n x n] matrix.

3. The market price of risk is an affine function of the state variables of the form,

$$ \lambda(t, x(t)) = \sqrt{S(t)}\,\lambda, \tag{2.20} $$

where λ is an [n x 1] parameter vector.

These assumptions imply that the dynamics of the state vector under the real world measure are also affine,

$$ dx(t) = K\big(\Theta - x(t)\big)\,dt + \Sigma\sqrt{S(t)}\,dZ(t). \tag{2.21} $$

The parameters of the risk neutral and real world measures are linked by

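$K = \tilde{K} - \Sigma\Lambda B'$ and $K\Theta = \tilde{K}\tilde{\Theta} + \Sigma\Lambda\alpha$, where $\Lambda = \mathrm{diag}(\lambda)$ is an [n x n] matrix, $\alpha = (\alpha_1; \ldots; \alpha_n)$ is an [n x 1] vector and $B = (\beta_1, \ldots, \beta_n)$ is an [n x n] matrix.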

Note that this specification with uncorrelated Brownian motions is equivalent to a specification with correlated Brownian motions. The main difference is that the correlation between factors is not given by a single parameter, but is a function of several parameters.

Duffie and Kan (1996) show that for affine models the solution of the bond price can be written as,

$$ P(t, t+\tau) = \exp\left(A(\tau) - \sum_{i=1}^{n} B_i(\tau)\,x_i(t)\right), \tag{2.22} $$

where τ = T − t is the time to maturity. A(τ) and B(τ) are deterministic functions of time to maturity and the model parameters; note that B(τ) is an [n x 1] vector of functions. The bond price P(t, T) depends on the current time t only through the value of the state vector x(t) at time t. By substitution of equation (2.22) into equation (2.4) it can be seen that spot rates are given by

$$ R(t, t+\tau) = -\frac{A(\tau)}{\tau} + \sum_{i=1}^{n}\frac{B_i(\tau)}{\tau}\,x_i(t). \tag{2.23} $$

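The functions A(τ) and B(τ) are the solutions of the system of n + 1 ordinary differential equations,

$$ \frac{dA(\tau)}{d\tau} = -\big(\tilde{K}\tilde{\Theta}\big)' B(\tau) + \frac{1}{2}\sum_{i=1}^{n}\big[\Sigma' B(\tau)\big]_i^2\,\alpha_i - \phi, $$

$$ \frac{dB(\tau)}{d\tau} = -\tilde{K}' B(\tau) - \frac{1}{2}\sum_{i=1}^{n}\big[\Sigma' B(\tau)\big]_i^2\,\beta_i + w, \tag{2.24} $$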

with initial conditions A(0) = 0 and B(0) = 0 implied by P(T, T) = 1. In some cases there is an analytic solution to this system; in the worst case the system of ordinary differential equations needs to be solved numerically. Either way it is easier to solve for the bond price in the affine model than in the general case, where a partial differential equation needs to be solved.
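As an illustration of the numerical route, the following sketch integrates system (2.24) for the simplest Gaussian case (n = 1, α = 1, β = 0) with a classical fourth-order Runge-Kutta scheme and provides the known closed form of B(τ) for comparison. The function names, the step count and the parameter values are assumptions made for this example; they are not taken from the thesis.

```python
import math


def solve_affine_odes(kappa, theta, sigma, tau, phi=0.0, w=1.0, steps=1000):
    """Integrate (2.24) for n = 1, alpha = 1, beta = 0 from 0 to tau with RK4."""

    def rhs(a, b):
        # Right-hand sides of the ODEs for A and B (a is unused: A does not feed back).
        da = -kappa * theta * b + 0.5 * (sigma * b) ** 2 - phi
        db = -kappa * b + w
        return da, db

    h = tau / steps
    a, b = 0.0, 0.0  # initial conditions A(0) = 0, B(0) = 0
    for _ in range(steps):
        k1a, k1b = rhs(a, b)
        k2a, k2b = rhs(a + 0.5 * h * k1a, b + 0.5 * h * k1b)
        k3a, k3b = rhs(a + 0.5 * h * k2a, b + 0.5 * h * k2b)
        k4a, k4b = rhs(a + h * k3a, b + h * k3b)
        a += h / 6.0 * (k1a + 2 * k2a + 2 * k3a + k4a)
        b += h / 6.0 * (k1b + 2 * k2b + 2 * k3b + k4b)
    return a, b


def closed_form_b(kappa, tau):
    """Closed-form B(tau) for the one-factor Gaussian case with w = 1."""
    return (1.0 - math.exp(-kappa * tau)) / kappa


# Hypothetical parameter values, for illustration only.
A_num, B_num = solve_affine_odes(kappa=0.3, theta=0.04, sigma=0.01, tau=5.0)
```

The numerical B(τ) can be compared with `closed_form_b(0.3, 5.0)` as a sanity check before moving to models without an analytic solution.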

Not all parameters in the affine model are identified. Dai and Singleton (2000) classify an affine model by the number of factors m that have an influence on the conditional variance of the state vector. There are n + 1 subclasses $A_m(n)$ of the n factor model. The maximal model of each subclass $A_m(n)$ contains the maximum number of identifiable parameters. Many previously known term structure models can be obtained by setting further parameter restrictions on the maximal models.

2.2.4 Examples of Dynamic Term Structure Models

The short rate dynamics of several dynamic term structure models are shown in table 1. All models except the Dothan and exponential Vasicek models are affine term structure models. The Hull-White model is the only arbitrage-free model that is shown; all other models are equilibrium models. In the affine models the short rate is itself one of the state variables, which implies the parameter restrictions φ = 0 and w = (1; 0; . . . ; 0).

In section 4.1 empirical properties of interest rates are discussed. From the definition of a dynamic term structure model it can be assessed whether the model satisfies the following properties:

1. Number of factors. Any one factor model has the disadvantage that the flexibility in the shape of yield curves is limited and that the correlation between any two interest rates is perfect. Multi-factor models are more realistic in both these properties.

2. Interest rates have never been negative. The column 'r > 0' in table 1 shows whether the short rate is strictly positive. Interest rate positivity depends on the distribution of the short rate: in a normal distribution negative values are possible, while in a log-normal or χ² distribution the probability of negative values is zero.

3. Interest rates are mean reverting. In most models the short rate follows an Ornstein-Uhlenbeck process, which ensures mean reversion of interest rates. The Merton (1973) model is not mean reverting. The Dothan (1978) model is only mean reverting if the parameter κ is negative; in that case the mean reversion level is 0 and there will be a large probability of negative interest rates.

4. Yield changes have a fat-tailed distribution. This property depends directly on the distribution of the short rate. The χ² distribution of the CIR (1985) model has fat tails. The Fong and Vasicek (1991) and Chen (1996) models contain a stochastic volatility factor, which causes excess kurtosis in the short rate distribution.

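Table 1: Examples of dynamic term structure models, including the distribution of the short rate, the possibility of negative interest rates and the availability of an analytic bond pricing formula.

| model | year | n | specification | short rate distribution | r > 0 | analytic bond price |
|---|---|---|---|---|---|---|
| Merton | 1973 | 1 | $dr(t) = \mu\,dt + \sigma\,dW(t)$; $\lambda(t, r(t)) = \lambda$ | Normal | No | Yes |
| Vasicek | 1977 | 1 | $dr(t) = \kappa(\theta - r(t))\,dt + \sigma\,dW(t)$; $\lambda(t, r(t)) = \lambda$ | Normal | No | Yes |
| CIR | 1985 | 1 | $dr(t) = \kappa(\theta - r(t))\,dt + \sigma\sqrt{r(t)}\,dW(t)$; $\lambda(t, r(t)) = \lambda\sqrt{r(t)}$ | Non-central χ² | Yes | Yes |
| Dothan | 1978 | 1 | $dr(t) = \kappa r(t)\,dt + \sigma r(t)\,dW(t)$; $\lambda(t, r(t)) = \lambda r(t)$ | Log-normal | Yes | Yes* |
| Exponential Vasicek | | 1 | $r(t) = \exp(x(t))$; $dx(t) = \kappa(\theta - x(t))\,dt + \sigma\,dW(t)$; $\lambda(t, x(t)) = \lambda$ | Log-normal | Yes | No |
| Hull-White | 1994 | 1 | $dr(t) = \kappa(\theta(t) - r(t))\,dt + \sigma\,dW(t)$; $\lambda(t, r(t)) = \lambda$ | Normal | No | Yes |
| Beaglehole & Tenney | 1991 | 2 | $dr(t) = \kappa_1(\theta(t) - r(t))\,dt + \sigma_1\,dW_1(t)$; $d\theta(t) = \kappa_2(\bar\theta - \theta(t))\,dt + \sigma_2\,dW_2(t)$; $\lambda_i(t, x(t)) = \lambda_i$; $dW_1(t)\,dW_2(t) = \rho\,dt$ | Normal | No | Yes |
| Fong-Vasicek | 1991 | 2 | $dr(t) = \kappa_1(\theta - r(t))\,dt + \sqrt{s(t)}\,dW_1(t)$; $ds(t) = \kappa_2(\bar s - s(t))\,dt + \sigma\sqrt{s(t)}\,dW_2(t)$; $\lambda_i(t, x(t)) = \lambda_i\sqrt{s(t)}$; $dW_1(t)\,dW_2(t) = \rho\,dt$ | | No | Yes* |
| Chen | 1996 | 3 | $dr(t) = \kappa_1(\theta(t) - r(t))\,dt + \sqrt{s(t)}\,dW_1(t)$; $d\theta(t) = \kappa_2(\bar\theta - \theta(t))\,dt + \sigma_1\sqrt{\theta(t)}\,dW_2(t)$; $ds(t) = \kappa_3(\bar s - s(t))\,dt + \sigma_2\sqrt{s(t)}\,dW_3(t)$ | | Yes | Yes* |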

5. Computational complexity. This is not an interest rate property, but for calibration it is important that the computation of the bond price is tractable. The last column of table 1 shows whether an analytic expression for the bond price exists. The Dothan model is the only model with a log-normal short rate distribution that has an analytic bond price formula. The entry Yes* means that an analytic expression exists, but that it is complicated. For instance the gamma function or a modified Bessel function can be part of the solution. For the affine models (Fong-Vasicek and Chen) it might be easier to solve the ODEs numerically than to use the analytic expression of the bond price.

6. The models with a log-normal short rate have the theoretical disadvantage that the expected value of the bank account is infinite.

Term premia are mainly influenced by the specification of the market price of risk. All the models in table 1 have a restrictive market price of risk. Models in the essentially affine class of Duffee (2001) and the semi-affine class of Duarte (2004) have a more flexible market price of risk specification. These models are outside the scope of this thesis.

2.2.5 Generalized Vasicek Short Rate Models

Only Gaussian affine term structure models are used in the empirical part of this thesis. This class of affine models is obtained by setting B = 0; in the notation of Dai and Singleton (2000) this is the $A_0(n)$ class. There are many ways in which the remaining parameters can be restricted such that all parameters are identified; a particularly simple form is given by De Jong (2000). This form of the class of Gaussian affine models is also used by Babbs and Nowman (1999), who call it the class of generalized Vasicek short rate models. The only difference is that affine term structure models are equilibrium models, whereas Babbs and Nowman (1999) also define an arbitrage-free version of the generalized Vasicek models. In this thesis the generalized Vasicek form is used with the modification that the Brownian motions can be correlated. To keep all the parameters identified, the matrix Σ is restricted to be the identity matrix.

In the generalized Vasicek short rate model, the short rate is given by

$$ r(t) = \phi(t) + \sum_{i=1}^{n} x_i(t) = \phi(t) + \iota' x(t). \tag{2.25} $$

The dynamics of the state vector under the risk neutral measure are given by

$$ dx_i(t) = -\kappa_i x_i(t)\,dt + \sigma_i\,dW_i^{Q}(t), \quad i = 1 \ldots n, \tag{2.26} $$

with correlated Brownian motions, $dW_i^{Q}(t)\,dW_j^{Q}(t) = \rho_{ij}\,dt$. The market price of risk of factor $x_i$ is defined as $\sigma_i\lambda_i$, thus the dynamics of the state vector under the real world measure are given by

$$ dx_i(t) = \big(\sigma_i\lambda_i - \kappa_i x_i(t)\big)\,dt + \sigma_i\,dW_i(t), \quad i = 1 \ldots n. \tag{2.27} $$

In the equilibrium form of the generalized Vasicek model the initial conditions are unknown, x(0) = x0, and φ(t) is a scalar parameter. In the arbitrage-free version x(0) = 0 and φ(t) is a deterministic function that is defined in such a way that the yield curve that is implied by the model is equal to the yield curve that is observed in the market.

It might seem restrictive that all the factors are mean reverting to 0 under the risk neutral measure. However, Babbs and Nowman (1999) mention that all Gaussian affine models can be written in this form and show it for the Beaglehole and Tenney (1991) model. The G2++ model is the arbitrage-free version of the two factor generalized Vasicek model; Brigo and Mercurio (2006) show that the G2++ model is equivalent to the two factor Hull-White model.

2.2.6 Probabilistic Derivation of the Bond Price

A formula for the bond price in the Gaussian affine model can be found by solving the system of ordinary differential equations (2.24). In this section a different approach is taken. By assuming that the local expectations hypothesis holds under the risk neutral measure it is possible to derive the bond price using probabilistic arguments. It is not necessary to create a hedging portfolio and assume that there are no arbitrage opportunities.

The solution of $x_i(t)$ under the real world measure can be found by application of Itô's formula to $y_i(t) = e^{\kappa_i t}\big(x_i(t) - \lambda_i\sigma_i/\kappa_i\big)$, where $dx_i(t)$ is given in equation (2.27):

$$ x_i(t) = \frac{\lambda_i\sigma_i}{\kappa_i}\Big(1 - e^{-\kappa_i(t-s)}\Big) + e^{-\kappa_i(t-s)}\,x_i(s) + \sigma_i\int_s^t e^{-\kappa_i(t-u)}\,dW_i(u), \tag{2.28} $$

where time s is before time t.

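The factors $x_i(t)$ can be shown to be normally distributed, conditional on the filtration $\mathcal{F}_s$. From equation (2.25) it can be seen that the short rate is a sum of normally distributed random variables. This means that the short rate is also normally distributed. The mean of the short rate is given by

$$ E^{P}\big[r(t)\,\big|\,\mathcal{F}_s\big] = \phi(t) + \sum_{i=1}^{n}\left[\frac{\lambda_i\sigma_i}{\kappa_i}\Big(1 - e^{-\kappa_i(t-s)}\Big) + e^{-\kappa_i(t-s)}\,x_i(s)\right], \tag{2.29} $$

and the variance by

$$ Var^{P}\big[r(t)\,\big|\,\mathcal{F}_s\big] = \sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\rho_{ij}\sigma_i\sigma_j}{\kappa_i+\kappa_j}\Big(1 - e^{-(\kappa_i+\kappa_j)(t-s)}\Big), \tag{2.30} $$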

where $E^{P}$ means the expectation is taken under the real world measure. The results (2.28)-(2.30) under the risk neutral measure can be found by setting λ = 0. Note that the variance depends on the time difference (t − s), but not on the factors x(s), and that the variance is equal under both probability measures.
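The conditional moments (2.29) and (2.30) are straightforward to evaluate. The sketch below is a plain Python transcription for an n factor model; the function name, argument layout and example values are assumptions made for illustration.

```python
import math


def short_rate_moments(phi, x_s, kappa, sigma, lam, rho, dt):
    """Conditional mean (2.29) and variance (2.30) of r(t) given F_s, with dt = t - s.

    x_s, kappa, sigma and lam are lists of length n; rho is an n x n correlation matrix.
    """
    n = len(kappa)
    mean = phi
    for i in range(n):
        decay = math.exp(-kappa[i] * dt)
        mean += lam[i] * sigma[i] / kappa[i] * (1.0 - decay) + decay * x_s[i]
    var = 0.0
    for i in range(n):
        for j in range(n):
            k_sum = kappa[i] + kappa[j]
            var += rho[i][j] * sigma[i] * sigma[j] / k_sum * (1.0 - math.exp(-k_sum * dt))
    return mean, var


# Hypothetical two factor example; setting lam to zero gives the risk neutral moments.
mean, var = short_rate_moments(
    phi=0.02, x_s=[0.01, -0.005], kappa=[0.05, 0.5], sigma=[0.01, 0.008],
    lam=[0.0, 0.0], rho=[[1.0, -0.7], [-0.7, 1.0]], dt=1.0,
)
```

Changing `lam` alters only the mean, which mirrors the remark that the variance is equal under both measures.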

By substitution of the short rate formula (2.25) into the local expectations hypothesis (2.12), the bond price is given by

$$ P(t,T) = \exp\left(-\int_t^T \phi(s)\,ds\right) E^{Q}\left[\exp\left(-\sum_{i=1}^{n}\int_t^T x_i(s)\,ds\right)\Big|\,\mathcal{F}_t\right]. \tag{2.31} $$

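It can be shown that $\int_t^T x_i(s)\,ds$ is normally distributed, thus the sum $\sum_{i=1}^{n}\int_t^T x_i(s)\,ds$ is also normally distributed. The mean of this sum is given by

$$ M(t,T) \equiv E^{Q}\left[\sum_{i=1}^{n}\int_t^T x_i(s)\,ds\,\Big|\,\mathcal{F}_t\right] = \sum_{i=1}^{n}\frac{1}{\kappa_i}\Big(1 - e^{-\kappa_i(T-t)}\Big)\,x_i(t), \tag{2.32} $$

and the variance by

$$ V(T-t) \equiv Var^{Q}\left[\sum_{i=1}^{n}\int_t^T x_i(s)\,ds\,\Big|\,\mathcal{F}_t\right] = \sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\rho_{ij}\sigma_i\sigma_j}{\kappa_i\kappa_j}\left[T - t - \frac{1 - e^{-\kappa_i(T-t)}}{\kappa_i} - \frac{1 - e^{-\kappa_j(T-t)}}{\kappa_j} + \frac{1 - e^{-(\kappa_i+\kappa_j)(T-t)}}{\kappa_i+\kappa_j}\right]. \tag{2.33} $$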

The expectation in equation (2.31) is taken over the exponent of a normal random variable, and thus over a log-normally distributed random variable. The expectation is given by

$$ E^{Q}\left[\exp\left(-\sum_{i=1}^{n}\int_t^T x_i(s)\,ds\right)\Big|\,\mathcal{F}_t\right] = \exp\left(-M(t,T) + \frac{V(T-t)}{2}\right). $$

The function φ(t) is needed to find the bond price in the arbitrage-free model. It can be found by equating the theoretical bond price P(0, T) to the market price $P^{M}(0, T)$,

$$ \phi(t) = f^{M}(0,t) - \frac{\partial M(0,t)}{\partial t} + \frac{1}{2}\frac{\partial V(t)}{\partial t}, \tag{2.34} $$

where $f^{M}(0,t) = -\partial \ln P^{M}(0,t)/\partial t$ is the instantaneous forward rate observed in the market.

In the empirical part of this thesis continuously compounded yields will be used, not bond prices. By inserting the bond price into equation (2.4) and by using τ = T − t, the spot rate in the equilibrium model is found to be given by

$$ R(t, t+\tau) = \phi - \frac{V(\tau)}{2\tau} + \sum_{i=1}^{n}\frac{1 - e^{-\kappa_i\tau}}{\kappa_i\tau}\,x_i(t), \tag{2.35} $$

and in the arbitrage-free model by

$$ R(t, t+\tau) = \frac{t+\tau}{\tau}\,R(0, t+\tau) - \frac{t}{\tau}\,R(0, t) + \frac{V(t+\tau) - V(t) - V(\tau)}{2\tau} + \sum_{i=1}^{n}\frac{1 - e^{-\kappa_i\tau}}{\kappa_i\tau}\,x_i(t). \tag{2.36} $$
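To make the yield formula concrete, the following sketch evaluates V(τ) from equation (2.33) and the equilibrium spot rate (2.35). It is a direct transcription under the assumption that the parameters are passed as plain Python lists; the function names and the example values are hypothetical.

```python
import math


def integrated_variance(tau, kappa, sigma, rho):
    """V(tau) of equation (2.33): variance of the integrated factors over horizon tau."""
    n = len(kappa)
    total = 0.0
    for i in range(n):
        for j in range(n):
            ki, kj = kappa[i], kappa[j]
            bracket = (
                tau
                - (1.0 - math.exp(-ki * tau)) / ki
                - (1.0 - math.exp(-kj * tau)) / kj
                + (1.0 - math.exp(-(ki + kj) * tau)) / (ki + kj)
            )
            total += rho[i][j] * sigma[i] * sigma[j] / (ki * kj) * bracket
    return total


def equilibrium_yield(tau, phi, x, kappa, sigma, rho):
    """Spot rate R(t, t + tau) of equation (2.35) for state vector x = x(t)."""
    loading = sum(
        (1.0 - math.exp(-k * tau)) / (k * tau) * xi for k, xi in zip(kappa, x)
    )
    return phi - integrated_variance(tau, kappa, sigma, rho) / (2.0 * tau) + loading


# Hypothetical two factor parameters and a 5 year time to maturity.
y5 = equilibrium_yield(
    5.0, phi=0.03, x=[0.01, -0.02], kappa=[0.05, 0.5], sigma=[0.01, 0.008],
    rho=[[1.0, -0.7], [-0.7, 1.0]],
)
```

The arbitrage-free yield (2.36) reuses `integrated_variance` and replaces φ by terms built from the initial yield curve.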

The conditional covariance of two spot rates is equal in the equilibrium and arbitrage-free forms of the generalized Vasicek model and is given by

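$$ Cov\big[R(t, t+\tau_1),\,R(t, t+\tau_2)\,\big|\,\mathcal{F}_s\big] = \sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\rho_{ij}\sigma_i\sigma_j}{\kappa_i\kappa_j(\kappa_i+\kappa_j)\,\tau_1\tau_2}\Big(1 - e^{-\kappa_i\tau_1}\Big)\Big(1 - e^{-\kappa_j\tau_2}\Big)\Big(1 - e^{-(\kappa_i+\kappa_j)(t-s)}\Big). \tag{2.37} $$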

Note that the conditional (co)variance of the spot rate only depends on time s through (t − s) and not through the state vector x(s). This means that the conditional and unconditional (co)variance of changes in yields are equal.
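Equation (2.37) is the building block of the fitting methods, so a small reference implementation may be useful. The sketch below assumes list-based parameters; the function name and the values in the example are illustrative assumptions.

```python
import math


def yield_covariance(tau1, tau2, dt, kappa, sigma, rho):
    """Conditional covariance (2.37) of two spot rates, with dt = t - s."""
    n = len(kappa)
    total = 0.0
    for i in range(n):
        for j in range(n):
            ki, kj = kappa[i], kappa[j]
            k_sum = ki + kj
            total += (
                rho[i][j] * sigma[i] * sigma[j]
                / (ki * kj * k_sum * tau1 * tau2)
                * (1.0 - math.exp(-ki * tau1))
                * (1.0 - math.exp(-kj * tau2))
                * (1.0 - math.exp(-k_sum * dt))
            )
    return total


# Hypothetical example: covariance of the 1 year and 10 year yields over one trading day.
cov_1y_10y = yield_covariance(
    1.0, 10.0, 1.0 / 252.0, kappa=[0.05, 0.5], sigma=[0.01, 0.008],
    rho=[[1.0, -0.7], [-0.7, 1.0]],
)
```

The state vector does not appear among the arguments, which is the code-level counterpart of the statement that the covariance does not depend on x(s).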

2.2.7 A Three Factor Affine LIBOR-OIS Model

In section 2.1.3 it was explained that both the OIS yield curve and the LIBOR yield curve are needed to price LIBOR swaps. Obviously both curves are also needed to price derivatives based on LIBOR swaps, like caps and swaptions. In this section a three factor affine model is presented where both the OIS and LIBOR curves depend on the underlying factors. This model was first used by Savickas (2014) for pricing non-liquid caps.

In the model the OIS short rate r(t) follows the two factor Gaussian affine model specified in equations (2.25)-(2.27). Continuously compounded OIS yields can be calculated by equation (2.35) if the Vasicek model is used or equation (2.36) if the Hull-White model is used.

The instantaneous LIBOR-OIS spread is defined as

$$ s(t) = \rho_{rs}\,r(t) + x_3(t), \tag{2.38} $$

where $\rho_{rs}$ can be seen as a correlation parameter between the OIS short rate and the instantaneous LIBOR-OIS spread. The third factor follows CIR dynamics. The Brownian motion $W_3(t)$ of the third factor is uncorrelated with the Brownian motions of the first two factors. This is done in order to have an analytic solution for the LIBOR bond price.

CIR Factor Properties

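The risk neutral dynamics of a CIR factor $x_i(t)$ are defined as

$$ dx_i(t) = \kappa_i\big(\theta_i - x_i(t)\big)\,dt + \sigma_i\sqrt{x_i(t)}\,dW_i^{Q}(t), \tag{2.39} $$

and the market price of risk as $\lambda_i(t, x_i(t)) = \lambda_i\sqrt{x_i(t)}$. From this it can be derived that the real world dynamics of a CIR factor are given by

$$ dx_i(t) = \big(\kappa_i\theta_i + \sigma_i\lambda_i x_i(t) - \kappa_i x_i(t)\big)\,dt + \sigma_i\sqrt{x_i(t)}\,dW_i(t). \tag{2.40} $$

A solution for $x_i(t)$ can be found by substitution of

$$ y_i(t) = e^{(\kappa_i - \sigma_i\lambda_i)t}\left(x_i(t) - \theta_i\Big(1 + \frac{\sigma_i\lambda_i}{\kappa_i - \sigma_i\lambda_i}\Big)\right) $$

into equation (2.40) and by application of Itô's lemma. The solution for $x_i(t)$ can be written as

$$ x_i(t) = \theta_i\Big(1 + \frac{\sigma_i\lambda_i}{\kappa_i - \sigma_i\lambda_i}\Big) + e^{-(\kappa_i - \sigma_i\lambda_i)(t-s)}\left(x_i(s) - \theta_i\Big(1 + \frac{\sigma_i\lambda_i}{\kappa_i - \sigma_i\lambda_i}\Big)\right) + \sigma_i\int_s^t e^{-(\kappa_i - \sigma_i\lambda_i)(t-u)}\sqrt{x_i(u)}\,dW_i(u). \tag{2.41} $$

Shao (2012) shows that the conditional distribution of $x_i(t)$ is non-central χ² up to a scalar factor,

$$ x_i(t)\,|\,x_i(s) \sim c \times nc\chi^2(d, \mu), \quad c = \frac{\sigma_i^2\big(1 - e^{-(\kappa_i - \sigma_i\lambda_i)(t-s)}\big)}{4(\kappa_i - \sigma_i\lambda_i)}, \quad d = \frac{4\theta_i\kappa_i}{\sigma_i^2}, \quad \mu = \frac{4(\kappa_i - \sigma_i\lambda_i)\,e^{-(\kappa_i - \sigma_i\lambda_i)(t-s)}}{\sigma_i^2\big(1 - e^{-(\kappa_i - \sigma_i\lambda_i)(t-s)}\big)}\,x_i(s), \tag{2.42} $$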

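where d is the number of degrees of freedom and µ is the non-centrality parameter. The conditional expectation of $x_i(t)$ is given by

$$ E^{P}\big[x_i(t)\,\big|\,\mathcal{F}_s\big] = \theta_i\Big(1 + \frac{\sigma_i\lambda_i}{\kappa_i - \sigma_i\lambda_i}\Big)\Big(1 - e^{-(\kappa_i - \sigma_i\lambda_i)(t-s)}\Big) + e^{-(\kappa_i - \sigma_i\lambda_i)(t-s)}\,x_i(s), \tag{2.43} $$

and the conditional variance by

$$ Var^{P}\big[x_i(t)\,\big|\,\mathcal{F}_s\big] = x_i(s)\,\frac{\sigma_i^2}{\kappa_i - \sigma_i\lambda_i}\Big(e^{-(\kappa_i - \sigma_i\lambda_i)(t-s)} - e^{-2(\kappa_i - \sigma_i\lambda_i)(t-s)}\Big) + \theta_i\Big(1 + \frac{\sigma_i\lambda_i}{\kappa_i - \sigma_i\lambda_i}\Big)\frac{\sigma_i^2}{2(\kappa_i - \sigma_i\lambda_i)}\Big(1 - e^{-(\kappa_i - \sigma_i\lambda_i)(t-s)}\Big)^2. \tag{2.44} $$

Note that the conditional variance of a CIR factor depends on its level.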
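The moments (2.43) and (2.44) can be checked numerically with a few lines of code. The sketch below assumes a single CIR factor with κ − σλ > 0; the function name and the parameter values are hypothetical.

```python
import math


def cir_conditional_moments(x_s, kappa, theta, sigma, lam, dt):
    """Conditional mean (2.43) and variance (2.44) of a CIR factor, with dt = t - s.

    Assumes kappa - sigma * lam > 0 so that the factor is mean reverting
    under the real world measure.
    """
    k_real = kappa - sigma * lam                     # real world mean reversion speed
    level = theta * (1.0 + sigma * lam / k_real)     # real world mean level
    decay = math.exp(-k_real * dt)
    mean = level * (1.0 - decay) + decay * x_s
    var = (
        x_s * sigma ** 2 / k_real * (decay - decay ** 2)
        + level * sigma ** 2 / (2.0 * k_real) * (1.0 - decay) ** 2
    )
    return mean, var


# Hypothetical values: the variance grows with the starting level x_s.
low = cir_conditional_moments(0.001, kappa=0.4, theta=0.005, sigma=0.05, lam=0.0, dt=0.25)
high = cir_conditional_moments(0.010, kappa=0.4, theta=0.005, sigma=0.05, lam=0.0, dt=0.25)
```

Comparing `low[1]` and `high[1]` shows the level dependence of the conditional variance that distinguishes the CIR factor from the Gaussian factors.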

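The LIBOR short rate is defined as the sum of the OIS short rate and the instantaneous LIBOR-OIS spread,

$$ l(t) = r(t) + s(t) = (1 + \rho_{rs})\big(\phi(t) + x_1(t) + x_2(t)\big) + x_3(t). \tag{2.45} $$

The conditional expectation is given by

$$ E^{P}\big[l(t)\,\big|\,\mathcal{F}_s\big] = (1 + \rho_{rs})\,E^{P}\big[r(t)\,\big|\,\mathcal{F}_s\big] + E^{P}\big[x_3(t)\,\big|\,\mathcal{F}_s\big], \tag{2.46} $$

where $E^{P}[r(t)\,|\,\mathcal{F}_s]$ is given in equation (2.29) and $E^{P}[x_3(t)\,|\,\mathcal{F}_s]$ in equation (2.43). Likewise the conditional variance is given by

$$ Var^{P}\big[l(t)\,\big|\,\mathcal{F}_s\big] = (1 + \rho_{rs})^2\,Var^{P}\big[r(t)\,\big|\,\mathcal{F}_s\big] + Var^{P}\big[x_3(t)\,\big|\,\mathcal{F}_s\big], \tag{2.47} $$

where $Var^{P}[r(t)\,|\,\mathcal{F}_s]$ is given in equation (2.30) and $Var^{P}[x_3(t)\,|\,\mathcal{F}_s]$ in equation (2.44).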

The LIBOR short rate has an affine specification, thus discount factors can be computed by solving a system of ordinary differential equations. The system of ordinary differential equations can be solved analytically for the LIBOR-OIS model. However, there is an easier way to derive the formula for the discount factors.

$$ P(t,T) = E\left[\exp\left(-\int_t^T l(u)\,du\right)\right] = E\left[\exp\left(-\int_t^T (1 + \rho_{rs})\big(\phi(u) + x_1(u) + x_2(u)\big)\,du\right)\right] E\left[\exp\left(-\int_t^T x_3(u)\,du\right)\right] = P^{(1+\rho_{rs})GV2F}(t,T)\,P^{CIR1F}(t,T). $$

The expectation of the Gaussian factors and the CIR factor can be separated, because they are uncorrelated. The bond price $P^{(1+\rho_{rs})GV2F}(t,T)$ of $(1 + \rho_{rs})$ times the two factor generalized Vasicek short rate can be derived in the same way as in section 2.2.6, since the factors remain Gaussian when they are multiplied by a constant. The discount factor $P^{(1+\rho_{rs})GV2F}(t,T)$ is given by equation (2.35) or (2.36), where φ(t) and M(t, T) are multiplied by $(1 + \rho_{rs})$ and where V(T − t) is multiplied by $(1 + \rho_{rs})^2$. The bond price $P^{CIR1F}(t,T)$ of a one factor CIR model can be found in many papers and textbooks, see for instance Brigo and Mercurio (2006) page 66.

If both the OIS and the LIBOR yields need to be fitted perfectly to the initial term structures, it is possible to make both φ(t) and the parameter $\theta_3(t)$ time dependent. Note that $\theta_3(t)$ is defined in such a way that the LIBOR-OIS spread is fitted perfectly, not the LIBOR yield curve.

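The formulas for continuously compounded LIBOR yields are provided below; they will be used in the next section. For the 2 factor Vasicek + 1 factor CIR (V2C1) model LIBOR yields are given by

$$ R^{LIBOR}(t, t+\tau) = (1 + \rho_{rs})\left(\phi + \sum_{i=1}^{2}\frac{1 - \exp(-\kappa_i\tau)}{\kappa_i\tau}\,x_i(t)\right) - (1 + \rho_{rs})^2\frac{V(\tau)}{2\tau} - \frac{2\kappa_3\theta_3}{\sigma_3^2\,\tau}\log\left(\frac{2g\exp\big((\kappa_3 + g)\tau/2\big)}{2g + (\kappa_3 + g)(\exp(g\tau) - 1)}\right) + \frac{x_3(t)}{\tau}\,\frac{2(\exp(g\tau) - 1)}{2g + (\kappa_3 + g)(\exp(g\tau) - 1)}, \tag{2.48} $$

and for the 2 factor Hull-White + 1 factor CIR (H2C1) model by

$$ R^{LIBOR}(t, t+\tau) = \frac{1 + \rho_{rs}}{\tau}\left((t+\tau)\,R^{OIS}(0, t+\tau) - t\,R^{OIS}(0, t) + \frac{V(t+\tau) - V(t)}{2}\right) + (1 + \rho_{rs})\sum_{i=1}^{2}\frac{1 - \exp(-\kappa_i\tau)}{\kappa_i\tau}\,x_i(t) - (1 + \rho_{rs})^2\frac{V(\tau)}{2\tau} - \frac{2\kappa_3\theta_3}{\sigma_3^2\,\tau}\log\left(\frac{2g\exp\big((\kappa_3 + g)\tau/2\big)}{2g + (\kappa_3 + g)(\exp(g\tau) - 1)}\right) + \frac{x_3(t)}{\tau}\,\frac{2(\exp(g\tau) - 1)}{2g + (\kappa_3 + g)(\exp(g\tau) - 1)}, \tag{2.49} $$

where $g = \sqrt{\kappa_3^2 + 2\sigma_3^2}$.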

3 Calibration & Simulation

In the literature many different methods are used to calibrate dynamic term structure models. A distinction can be made between the calibration of equilibrium and arbitrage-free models.

Equilibrium models are usually calibrated using econometric techniques, like maximum likelihood or the efficient method of moments. Both of these estimators are unbiased and asymptotically efficient. Duffee and Stanton (2012) use both methods to calibrate affine term structure models and find that the small sample properties of maximum likelihood are better than those of the efficient method of moments. For this reason the method of moments will not be investigated in this thesis; maximum likelihood is the only econometric method that is used.

Arbitrage-free models were developed for more accurate pricing and hedging of derivative securities. The time varying parameter is chosen such that the model exactly fits the current yield curve. For pricing purposes the other parameters should be calibrated by fitting model implied derivatives prices to market prices. Brigo and Mercurio (2006) give examples of this method for several short rate models.

For risk management of simple interest rate derivatives (bonds, swaps, bond options) up to one year in the future, the most important property is (historical) yield volatility. Park (2004) discusses a method in which the generalized Vasicek model is calibrated to historical volatilities. Meng et al. (2013) find that calibration of the one factor Hull-White model to historical volatilities provides a better estimate of future volatility than calibration to swaption prices. The calibration to historical volatilities is called the ’volatility fitting’ method in this thesis.

One of the goals of this thesis is to investigate the differences between the one and two factor Gaussian affine models. An advantage of the two factor model is that it allows for decorrelation between yields. It depends on the parameter values whether yields are actually imperfectly correlated in the model. Inspired by the volatility fitting method, a calibration method is developed where model implied correlations are fitted to historical correlations. This method is called the ’correlation fitting’ method in this thesis. In the ’combined fitting’ method both historical volatilities and historical correlations are fitted simultaneously.

The volatility fitting method is only available for Gaussian models. The three factor LIBOR-OIS model also contains a non-Gaussian CIR factor. So maximum likelihood using the Kalman filter is the only calibration method that is used on the LIBOR-OIS model.

In the remainder of this section all the calibration techniques are explained in detail. In all cases the models need to be discretized, so first the notation that will be used in the discrete time framework is explained.

Discrete Time Framework

Dynamic term structure models are usually defined in continuous time, but the data that is used for calibrations is observed in discrete time. Some of the equations that were derived in sections 2.2.6 and 2.2.7, will be discretized in this section.

It is assumed that the dataset consists of time series of p yields $R(t_k, t_k + \tau_j)$, j = 1, . . . , p. There are N observations of each yield, at times $t_1, \ldots, t_N$. For simplicity the time between observations $\Delta t = t_k - t_{k-1}$ is assumed to be equal for all observations. For some calibration methods yield changes are needed; there are N − 1 yield changes in the dataset for each time to maturity. Yield changes are expressed as

$$ \Delta R_k(\tau) = R(t_k, t_k + \tau) - R(t_{k-1}, t_{k-1} + \tau). $$

3.1 Calibration: Fitting Methods

In section 2.2.6 it was shown that for the Gaussian affine model the conditional covariances of yields do not depend on the value of the state vector. Thus the conditional covariances are constant over time and the unconditional covariances are equal to the conditional covariances. Yield covariances can be calculated with equation (2.37). In the fitting methods a cost function is defined on the difference between the model implied and empirically observed volatilities and correlations. The parameter estimate results from a minimization of the cost function.

Volatility Fitting

The model implied unconditional volatility over a time period ∆t can be calculated from equation (2.37) by

$$ \sigma^{model}_{\Delta t}(\tau) = \sqrt{Cov\big[R(t, t+\tau),\,R(t, t+\tau)\,\big|\,\mathcal{F}_{t-\Delta t}\big]}, \tag{3.1} $$

where τ is the time to maturity. The unconditional empirical volatility of yield changes with time ∆t between yields is just their standard deviation,

$$ \sigma^{data}_{\Delta t}(\tau) = \sqrt{\frac{1}{N-2}\sum_{k=2}^{N}\Big(\Delta R_k(\tau) - \mu^{data}_{\Delta t}(\tau)\Big)^2}, \tag{3.2} $$

where

$$ \mu^{data}_{\Delta t}(\tau) = \frac{1}{N-1}\sum_{k=2}^{N}\Delta R_k(\tau). $$
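The empirical side of the volatility fit, equation (3.2), can be sketched as follows. The input is assumed to be a plain list holding the N observations of one constant maturity yield; the function names and the toy series are illustrative only.

```python
import math


def yield_changes(yields):
    """Return the N - 1 first differences of a series of N yield observations."""
    return [yields[k] - yields[k - 1] for k in range(1, len(yields))]


def empirical_volatility(yields):
    """Sample standard deviation (3.2) of the yield changes."""
    changes = yield_changes(yields)
    mean_change = sum(changes) / len(changes)           # divisor N - 1
    squared = sum((c - mean_change) ** 2 for c in changes)
    return math.sqrt(squared / (len(changes) - 1))      # divisor N - 2


# Toy series of four observations (N = 4), purely for illustration.
vol = empirical_volatility([0.01, 0.02, 0.02, 0.04])
```

With N observations there are N − 1 changes, so the divisors `len(changes)` and `len(changes) - 1` correspond to N − 1 and N − 2 in the formulas.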

The cost function is the sum over all times to maturity of the squared difference between the model implied and empirical volatilities. The parameters in the parameter vector

$$ \psi = \{\kappa_i, \sigma_i, \rho_{ij},\ i = 1 \ldots n,\ j = (i+1) \ldots n\}, \tag{3.3} $$

are estimated in the volatility fitting method by

$$ \hat{\psi} = \arg\min_{\psi}\left[\sum_{j=1}^{p}\Big(\sigma^{model}_{\Delta t}(\tau_j) - \sigma^{data}_{\Delta t}(\tau_j)\Big)^2\right]. \tag{3.4} $$
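A minimal sketch of the volatility fit (3.4) for the one factor model is given below. For n = 1 equation (3.1) reduces to a closed form in κ and σ. The coarse grid search is only a stand-in chosen to keep the example dependency-free; it is an assumption of this sketch and not the optimizer used in the thesis, and all names and values are hypothetical.

```python
import math
from itertools import product


def model_volatility(tau, dt, kappa, sigma):
    """One factor case of (3.1): square root of (2.37) with tau1 = tau2 = tau."""
    loading = (1.0 - math.exp(-kappa * tau)) / (kappa * tau)
    return sigma * loading * math.sqrt((1.0 - math.exp(-2.0 * kappa * dt)) / (2.0 * kappa))


def volatility_cost(params, taus, data_vols, dt):
    """Cost function of (3.4): sum of squared volatility differences."""
    kappa, sigma = params
    return sum(
        (model_volatility(tau, dt, kappa, sigma) - vol) ** 2
        for tau, vol in zip(taus, data_vols)
    )


def grid_search(taus, data_vols, dt, kappas, sigmas):
    """Return the (kappa, sigma) pair on the grid with the lowest cost."""
    return min(
        product(kappas, sigmas),
        key=lambda p: volatility_cost(p, taus, data_vols, dt),
    )


# Synthetic "data" generated from known parameters, to show that the fit recovers them.
taus, dt = [1.0, 5.0, 10.0], 1.0 / 252.0
data_vols = [model_volatility(tau, dt, 0.1, 0.01) for tau in taus]
best = grid_search(taus, data_vols, dt, [0.05, 0.1, 0.2, 0.5], [0.005, 0.01, 0.02])
```

In practice the grid would be replaced by a numerical minimizer, and `data_vols` by the empirical volatilities of equation (3.2).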


Correlation Fitting

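The model implied unconditional correlations can be calculated from equations (2.37) and (3.1) by

$$ \rho^{model}_{\Delta t}(\tau_i, \tau_j) = \frac{Cov\big[R(t, t+\tau_i),\,R(t, t+\tau_j)\,\big|\,\mathcal{F}_{t-\Delta t}\big]}{\sigma^{model}_{\Delta t}(\tau_i)\,\sigma^{model}_{\Delta t}(\tau_j)}. \tag{3.5} $$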

The empirical unconditional correlations are given by

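$$ \rho^{data}_{\Delta t}(\tau_i, \tau_j) = \frac{\frac{1}{N-2}\sum_{k=2}^{N}\Big(\Delta R_k(\tau_i) - \mu^{data}_{\Delta t}(\tau_i)\Big)\Big(\Delta R_k(\tau_j) - \mu^{data}_{\Delta t}(\tau_j)\Big)}{\sigma^{data}_{\Delta t}(\tau_i)\,\sigma^{data}_{\Delta t}(\tau_j)}. \tag{3.6} $$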

The cost function is the sum of squared differences between model implied and empirical correlations. The parameter vector of equation (3.3) is estimated in the correlation fitting method by

$$ \hat{\psi} = \arg\min_{\psi}\left[\sum_{i=1}^{p}\sum_{j=i+1}^{p}\Big(\rho^{model}_{\Delta t}(\tau_i, \tau_j) - \rho^{data}_{\Delta t}(\tau_i, \tau_j)\Big)^2\right]. \tag{3.7} $$

Combined Fitting

In the combined fitting method the generalized Vasicek model is fitted to both empirical volatilities and correlations. The cost function is the volatility cost function in equation (3.4) plus a weighting factor times the correlation cost function in equation (3.7). In the empirical part of this thesis the weighting factor is chosen such that the volatility and correlation contributions to the combined cost function are close to each other.

Market Price of Risk

The variances and correlations of yields are not affected by the probability measure. For this reason the market price of risk parameters λ cannot be estimated with the fitting methods. When the market price of risk is not estimated, it will be assumed that λ = 0.

The parameter φ in the Vasicek model also has no influence on the model implied yield volatilities and correlations and cannot be estimated with the fitting methods. Even though the parameter estimates of the fitting methods are equal for the Vasicek and Hull-White models, they will only be used with the Hull-White model.


3.2 Calibration: Maximum Likelihood

In maximum likelihood the log-likelihood function is maximized with respect to the model parameters. The log-likelihood of a data sample is given by

$$ \log L(Y_N) = \sum_{k=1}^{N}\log f(y_k\,|\,Y_{k-1}), \tag{3.8} $$

where $y_k$ is the observation vector at time $t_k$, $Y_k = \{y_1, \ldots, y_k\}$ contains all observations up to time $t_k$ and $f(y_k\,|\,Y_{k-1})$ is the (multivariate) density function of vector $y_k$ given all information up to time $t_{k-1}$.
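The decomposition (3.8) can be illustrated with a deliberately simplified case. The sketch below assumes scalar observations whose conditional densities are Gaussian with given one-step-ahead means and variances; in the thesis these quantities come from the Kalman filter for a vector of yields, which is not reproduced here. The function name and inputs are hypothetical.

```python
import math


def gaussian_log_likelihood(observations, predicted_means, predicted_variances):
    """Sum of log conditional densities as in (3.8), for scalar Gaussian densities.

    predicted_means[k] and predicted_variances[k] describe the density of
    observations[k] given all earlier observations.
    """
    total = 0.0
    for y, mean, var in zip(observations, predicted_means, predicted_variances):
        total += -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)
    return total


# Toy inputs, purely illustrative.
loglik = gaussian_log_likelihood(
    observations=[0.021, 0.019, 0.020],
    predicted_means=[0.020, 0.0205, 0.0195],
    predicted_variances=[1e-6, 1e-6, 1e-6],
)
```

Maximum likelihood then amounts to maximizing this sum over the model parameters that determine the predicted means and variances.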

It is relatively easy to use maximum likelihood (or the method of moments) if the factors are known. However, in dynamic term structure models no interpretation of the factors is given. It is possible to assume that the factors (in a two factor model) correspond to, for instance, the 1 month yield and the 10 year yield. A disadvantage of this method is that the results depend on the arbitrary decision of which yields (or other factors) to use. Chapman et al. (1999) investigate the effect of using the 3 month yield as the factor in one factor models.

In this thesis a different approach is taken where no interpretation is given to the factors. The Kalman filter is used to filter the values of the factors from a panel of yield data and the filtered factor values are used to compute the log-likelihood. The Kalman filter can be applied to any model that can be written in state space form. In a state space model the dynamics of an unobservable state vector are specified in the so-called transition equation. The second part of the model expresses the data as a function of the state vector; this is called the measurement equation.

All dynamic term structure models can be written in state space form. The dynamics of the state vector are just the factor dynamics defined by the model; they only need to be discretized in order to use the Kalman filter. The measurement equation depends on the type of data. In this thesis only continuously compounded yields are used. For the Vasicek model, for instance, the measurement equation is given by equation (2.35). It is also possible (but much less common) to use other instruments, like cap or swaption prices.

In the remainder of this subsection the linear Gaussian state space model is defined and the Kalman filter applied to linear Gaussian models is explained. The state space form of the Gaussian affine term structure model is shown in detail. The adjustments that need to be made to the Kalman filter when it is applied to the CIR model (or the LIBOR-OIS model defined in section 2.2.7) are discussed. Finally, an extension to the regular state space form of dynamic term structure models is provided.

The notation used in this subsection is based on Durbin and Koopman (2012).

3.2.1 The Linear Gaussian State Space Model

The measurement equation of the linear Gaussian state space model is defined as

$$y_k = Z_k x_k + d_k + \varepsilon_k, \qquad \varepsilon_k \sim NID(0, H_k), \tag{3.9}$$

where $y_k$ is a vector of observations, $x_k$ is a vector of state variables and $\varepsilon_k$ is a vector of measurement errors, all at time $t_k$. The matrices/vectors $Z_k$, $H_k$ and $d_k$ are deterministic and may depend on model parameters. The transition equation is defined as

$$x_k = T_k x_{k-1} + c_k + R_k \eta_k, \qquad \eta_k \sim NID(0, Q_k), \tag{3.10}$$

where $\eta_k$ is a vector of unexpected changes in the state vector between time $t_{k-1}$ and $t_k$. The matrices/vectors $T_k$, $c_k$, $R_k$ and $Q_k$ are deterministic and may depend on model parameters. The dimensions of all the vectors and matrices in the linear state space model are given in table 2.

NID (normally and independently distributed) means that the error vectors are assumed to be multivariate normally distributed and that the error vectors at time $t_k$ are independent of the error vectors at any other time. In addition, $\varepsilon_k$ and $\eta_k$ are independent of each other.

Table 2: Dimensions of the linear state space model.

Measurement equation        Transition equation
y_k    p × 1                x_k    n × 1
Z_k    p × n                T_k    n × n
d_k    p × 1                c_k    n × 1
ε_k    p × 1                η_k    r × 1
H_k    p × p                R_k    n × r
                            Q_k    r × r
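To make the roles and dimensions of these objects concrete, the sketch below simulates a linear Gaussian state space model with a measurement equation of the form $y_k = Z x_k + d + \varepsilon_k$ and the transition equation (3.10). It assumes time-invariant system matrices for simplicity; the function name and all numerical values are hypothetical.

```python
import numpy as np

def simulate_state_space(Z, d, H, T, c, R, Q, x0, n_steps, seed=0):
    """Simulate states and observations of a time-invariant linear
    Gaussian state space model.

    Shapes: Z (p, n), d (p,), H (p, p), T (n, n), c (n,), R (n, r), Q (r, r).
    """
    rng = np.random.default_rng(seed)
    n, r = R.shape
    p = Z.shape[0]
    x = np.asarray(x0, dtype=float)
    states = np.empty((n_steps, n))
    observations = np.empty((n_steps, p))
    for k in range(n_steps):
        eta = rng.multivariate_normal(np.zeros(r), Q)  # state shock
        x = T @ x + c + R @ eta                        # transition equation
        eps = rng.multivariate_normal(np.zeros(p), H)  # measurement error
        states[k] = x
        observations[k] = Z @ x + d + eps              # measurement equation
    return states, observations

# Toy example: n = 2 factors, r = 2 shocks, p = 3 observed yields.
Z = np.array([[1.0, 1.0], [0.8, 0.9], [0.5, 0.7]])
d = np.array([0.010, 0.015, 0.020])
H = 1e-6 * np.eye(3)
T = np.array([[0.95, 0.0], [0.0, 0.80]])
c = np.zeros(2)
R = np.eye(2)
Q = 1e-4 * np.eye(2)
states, observations = simulate_state_space(Z, d, H, T, c, R, Q, np.zeros(2), 100)
```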

3.2.2 The Kalman Filter

The Kalman filter is a method that recursively estimates the expected value and variance of the unobservable state vector $x_k$ of a state space model. In the Gaussian state space model those estimates can be used to calculate the likelihood of the data. Parameter estimation is done by maximum likelihood. Note that the model parameters are fixed within a run of the Kalman filter: the filter is run once for each parameter combination tried by the optimization algorithm that searches for the maximum likelihood. Some definitions that are used in the Kalman filter are:

• The state estimate given all information up to the current time step

$\tilde{x}_{k|k} \equiv E[x_k \mid Y_k]$

• The state estimate given all information up to the previous time step

$\tilde{x}_k \equiv E[x_k \mid Y_{k-1}]$

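• The variance of the state estimate given all information up to the current time step

$P_{k|k} \equiv E[(x_k - \tilde{x}_{k|k})(x_k - \tilde{x}_{k|k})' \mid Y_k]$

• The variance of the state estimate given all information up to the previous time step

$P_k \equiv E[(x_k - \tilde{x}_k)(x_k - \tilde{x}_k)' \mid Y_{k-1}]$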

• The prediction errors

$v_k \equiv y_k - E[y_k \mid Y_{k-1}] = y_k - (Z_k \tilde{x}_k + d_k) = Z_k (x_k - \tilde{x}_k) + \varepsilon_k$

In the Gaussian state space model both $x_k$ and $\varepsilon_k$ are multivariate normally distributed. This means that the prediction errors $v_k$ are also normally distributed. The conditional mean and variance of $v_k$ are

$$E[v_k \mid Y_{k-1}] = 0,$$

$$F_k \equiv E[v_k v_k' \mid Y_{k-1}] = Z_k P_k Z_k' + H_k.$$

The log-likelihood of observation vector $y_k$ is given by the multivariate normal density of $v_k$,

$$\log f(y_k \mid Y_{k-1}) = \log f(v_k \mid Y_{k-1}) = -\frac{p}{2} \log(2\pi) - \frac{1}{2} \log |F_k| - \frac{1}{2} v_k' F_k^{-1} v_k.$$
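The log-likelihood contribution of a single observation can be evaluated as in the following sketch. The function name is hypothetical. The sketch uses `numpy.linalg.slogdet` and `numpy.linalg.solve` rather than an explicit determinant and inverse, which is a numerical design choice of this example and not something prescribed by the thesis.

```python
import numpy as np

def gaussian_loglik_term(v, F):
    """Log density of a prediction error v with zero mean and variance F."""
    p = v.shape[0]
    sign, logdet = np.linalg.slogdet(F)
    if sign <= 0:
        raise ValueError("F must be positive definite")
    quad = v @ np.linalg.solve(F, v)  # v' F^{-1} v without forming F^{-1}
    return -0.5 * (p * np.log(2.0 * np.pi) + logdet + quad)

# Toy example with two observed yields.
v_example = np.array([1.0e-3, -5.0e-4])
F_example = np.array([[4.0e-6, 1.0e-6], [1.0e-6, 3.0e-6]])
term = gaussian_loglik_term(v_example, F_example)
```

Summing these terms over $k = 1, \ldots, N$ gives the log-likelihood of equation (3.8).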

The values $\tilde{x}_k$ and $P_k$, $k = 1, \ldots, N$, are needed to compute the log-likelihood. In the Kalman filter $\tilde{x}_k$ and $P_k$ are computed recursively. The initial conditions are usually taken to be the unconditional mean and variance of $x_k$,

$$\tilde{x}_1 = (I_n - T)^{-1} c, \qquad \operatorname{vec}(P_1) = (I_{n^2} - T \otimes T)^{-1} \operatorname{vec}(R Q R'), \tag{3.11}$$

where $I_n$ is the identity matrix of size $n$, $\otimes$ is the Kronecker product and $\operatorname{vec}(\cdot)$ stacks the columns of a matrix into a single vector.
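The initial conditions can be computed directly from the system matrices, as sketched below for time-invariant $T$, $c$, $R$ and $Q$. The function name and the example values are hypothetical. The sketch assumes a stationary state process (all eigenvalues of $T$ inside the unit circle); otherwise the unconditional moments do not exist.

```python
import numpy as np

def unconditional_moments(T, c, R, Q):
    """Unconditional mean and variance of the state vector."""
    n = T.shape[0]
    x1 = np.linalg.solve(np.eye(n) - T, c)
    RQR = R @ Q @ R.T
    # vec() stacks columns, which corresponds to Fortran ("F") ordering.
    vec_P1 = np.linalg.solve(np.eye(n * n) - np.kron(T, T),
                             RQR.reshape(-1, order="F"))
    P1 = vec_P1.reshape((n, n), order="F")
    return x1, P1

# Toy two factor example.
T = np.array([[0.9, 0.0], [0.1, 0.5]])
c = np.array([0.1, 0.2])
R = np.eye(2)
Q = np.array([[0.04, 0.01], [0.01, 0.09]])
x1, P1 = unconditional_moments(T, c, R, Q)
```

A quick consistency check is that the results are fixed points of the prediction step: `x1` equals `T @ x1 + c` and `P1` equals `T @ P1 @ T.T + R @ Q @ R.T`.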

The update of the state vector estimate given a new observation is done by a linear transformation of the prediction errors,

$$\tilde{x}_{k|k} = \tilde{x}_k + K_k v_k,$$

where the Kalman gain is given by $K_k = P_k Z_k' F_k^{-1}$. Prediction is then done by applying the transition equation to the updated estimate of the previous time step,

$$\tilde{x}_k = T_k \tilde{x}_{k-1|k-1} + c_k.$$

The update of the state variance estimate is done by $P_{k|k} = P_k - K_k Z_k P_k$, and the prediction by

$$P_k = T_k P_{k-1|k-1} T_k' + R_k Q_k R_k'.$$
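Putting the recursions together, a minimal Kalman filter that returns the log-likelihood and the updated state estimates could look as follows. This is a sketch under simplifying assumptions: the system matrices are time-invariant, there are no missing observations, and the initial values `x1` and `P1` are supplied by the caller. The function name is hypothetical.

```python
import numpy as np

def kalman_loglik(y, Z, d, H, T, c, R, Q, x1, P1):
    """Run the Kalman filter over y (shape (N, p)).

    Returns the log-likelihood and the updated state estimates (N, n).
    """
    N, p = y.shape
    x_pred = np.asarray(x1, dtype=float)
    P_pred = np.asarray(P1, dtype=float)
    RQR = R @ Q @ R.T
    filtered = np.empty((N, x_pred.shape[0]))
    loglik = 0.0
    for k in range(N):
        v = y[k] - (Z @ x_pred + d)             # prediction error
        F = Z @ P_pred @ Z.T + H                # prediction error variance
        # Kalman gain P Z' F^{-1}, using the symmetry of F and P_pred.
        K = np.linalg.solve(F, Z @ P_pred).T
        loglik += -0.5 * (p * np.log(2.0 * np.pi)
                          + np.linalg.slogdet(F)[1]
                          + v @ np.linalg.solve(F, v))
        x_upd = x_pred + K @ v                  # state update
        P_upd = P_pred - K @ Z @ P_pred         # variance update
        filtered[k] = x_upd
        x_pred = T @ x_upd + c                  # state prediction
        P_pred = T @ P_upd @ T.T + RQR          # variance prediction
    return loglik, filtered

# Scalar toy example in which the state is pure noise (T = 0).
one = np.array([[1.0]])
y_toy = np.array([[1.0], [-1.0]])
loglik_toy, filtered_toy = kalman_loglik(
    y_toy, Z=one, d=np.zeros(1), H=one, T=np.zeros((1, 1)),
    c=np.zeros(1), R=one, Q=one, x1=np.zeros(1), P1=one)
```

For parameter estimation, a function of this kind would be called once per candidate parameter vector by a numerical optimizer that maximizes the returned log-likelihood.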

The Kalman filter is a best linear unbiased estimator (BLUE) for the linear Gaussian state space model when Kalman gain Kk = PkZk0F

−1
