University of Amsterdam

Master Thesis

Pricing Guaranteed Annuity Options

using two hybrid models

Author¹: Duncan Barker

Supervisor (UvA)²: Dr. S.A. Broda

Supervisors (Towers Watson)³: Drs. G. De Lange, A. Stroij MSc

March 26, 2015

Final Version

Abstract

A Guaranteed Annuity Option gives the holder the right to either money equal to his investment, or a life annuity on this investment paid out at a fixed rate. We propose to price these options under a hybrid Heston Hull-White or a Heston Gaussian two-factor interest rate model. We focus on the independent calibration of the equity and interest rate models. The former is done by implementing four semi-analytical pricing techniques to price puts and calls, and the latter by the implementation of swaption pricing. The calibration performance of the two interest rate models is compared in great detail. Because the volatility of the Heston model follows a CIR process, we revert to a Monte Carlo simulation. Four simulation schemes to price the options are implemented, and the results are compared to a different stochastic volatility model. They reveal that the use of a correct interest rate model is essential in pricing the GAO.

¹University of Amsterdam, student nr. 6136516, duncanbarkerw@gmail.com
²University of Amsterdam, Department of Quantitative Economics


Contents

1 Introduction 1

2 Pricing the GAO 4

3 Model Specifications 6

3.1 Heston model . . . 6

3.2 Two interest rate models . . . 6

3.2.1 Hull-White model . . . 6

3.2.2 Gaussian two-factor model . . . 7

3.3 Hybrid models . . . 8

3.4 Pricing zero-coupon bonds . . . 10

3.5 Schöbel-Zhu Model . . . 11

4 Calibration 13

4.1 Calibrating the Heston model . . . 13

4.2 Semi-analytical pricing methods . . . 14

4.2.1 Characteristic function . . . 14

4.2.2 Black-Scholes type solution . . . 15

4.2.3 Attari’s approach . . . 16

4.2.4 Fast Fourier Transform (FFT) . . . 16

4.2.5 COS method . . . 18

4.2.6 Results . . . 19

4.3 Calibrating the interest rate models . . . 21

4.3.1 Hull-White model . . . 21

4.3.2 G2++ model . . . 22

4.3.3 Results . . . 22

5 Monte Carlo Simulation 26

5.1 Heston Hull-White/Gaussian two-factor model . . . 26

5.2 Volatility process . . . 27

5.2.1 Euler and Milstein Scheme . . . 28

5.2.2 Truncated Gaussian scheme . . . 29

5.2.3 Quadratic Exponential scheme (QE) . . . 30

5.3 Stock Process . . . 32

5.3.1 Leaking Correlation . . . 33

5.4 Scheme comparison . . . 35

5.4.1 Bias . . . 35


6 Numerical Results 38

6.1 Heston Hull-White . . . 38

6.2 Heston Gaussian two-factor model . . . 39

7 Conclusion 41

8 Appendix A 45

9 Appendix B 46

10 Appendix C 47

10.1 2012 . . . 48

10.2 2008 . . . 49

10.3 Comparison Brigo & Mercurio . . . 49


1

Introduction

There is a wide range of guarantees which an insurance company offers in its products. Some of these products, especially pension products, are long-term contracts with maturities of up to 40 years. These guarantees are issued far out of the money and, as such, have no intrinsic value and are therefore seen as having negligible value by insurers. However, due to the length of these options, unexpected economic fluctuations can transform a seemingly negligible sum into very costly liabilities. A guaranteed annuity option (GAO) is an example of one of these products. This guarantee gives the policyholder the option to convert his accumulated funds into a life annuity which pays out at a fixed rate, or to receive money equal to the amount invested in equity.

GAOs were very popular throughout the 1970s and the beginning of the 1980s. However, sales were stopped at the end of the 1980s due to the drop in long-term interest rates and changing mortality tables, the two components that the conversion rates were based upon. Nevertheless, many guarantees were still running due to the long contract lengths, and the GAOs have become a significant burden for insurance companies. For example, according to The Telegraph (2014), the rising liabilities resulting from the GAOs forced Equitable Life, the oldest mutual insurer in the UK, worth 26 billion pounds and having 1.5 million policyholders at its peak, to close to new business in 2000 and reduce payouts to current policyholders.

As noted, the drop in interest rates impacted the GAOs. When these guarantees were issued, long-term interest rates were at a high of 11%. Insurance companies assumed that, given the mortality tables they used, the no-loss interest rates would not be reached, that these guarantees would stay far out of the money, and that the options would never be exercised. But as time progressed interest rates did drop, and by the late 90s they were approaching 5%, steering the options into the money.

Bolton et al. (1997) noted that when these guarantees were issued, insurance companies still used the mortality tables from the 1950s, and did not expect drastic changes. Yet there were severe changes in mortality rates amongst the people for whom these contracts were issued. For example, the life expectancy for a 65 year old in the mortality table used was 14.3 years, yet the expectancy using the table of 1998 (PMA92) was 19.8 years. This implies that the interest rates at which the GAOs broke even were higher than expected, increasing the number of options that were exercised.

Past literature has varied significantly in how GAOs have been priced. Boyle and Hardy (2001 & 2003), Zhou (2007) and Chu and Kwok (2007) all used standard geometric Brownian motions as the underlying process for equity prices. Chu and Kwok (2007) claimed that a closed form expression for the GAO, after introducing a specific stochastic interest rate model, was not possible, and priced the option with three different approximation methods. They also advised that, due to the long maturities of the options, a stochastic volatility model should be used. Haastrecht, Plat and Pelsser (2009) followed that advice and compared a closed form solution of the Schöbel-Zhu Hull-White stochastic volatility model to the constant-volatility Black-Scholes Hull-White model, and found that the use of a stochastic volatility model significantly impacts the price of GAOs: prices are much higher under such models. Additionally, they also derived a closed form solution for the Schöbel-Zhu Gaussian two-factor model and compared it to the three approximation methods of Chu and Kwok.

As seen above, there are many approaches to pricing options. A popular method is the Black-Scholes model, which assumes positive prices and has a closed form expression for both European plain vanilla options and many exotic options. However, a disadvantage of this model is that it assumes both constant volatility and constant interest rates. These assumptions are the main reason why this model is, in practice, somewhat inaccurate, especially for long maturities, where interest rates need to be modeled. One could modify the BS model to incorporate the term structure of the implied volatilities, but it would still not be able to capture the volatility smile. A possible alternative is to use a stochastic volatility (SV) model, which assumes that the volatility of the asset follows a random process. A major concern once the volatility has been made stochastic is that closed form solutions for options do not always exist. The Heston model, however, is an attractive SV model as it allows for semi-closed form solutions for plain vanilla options. One can then revert to Monte Carlo simulation to price exotic options. Additionally, the model is robust and tractable, allows for non-normal probability distributions, has mean-reverting volatility and takes the leverage effect (Black, 1976) into account.

But this model also has its disadvantages. The volatility is unobserved, and hence its parameters cannot be estimated directly; moreover, the prices are very sensitive to the parameters, so one must be careful when calibrating the model. Furthermore, the model still assumes deterministic interest rates. This can be overcome by extending the Heston model with a model for stochastic interest rates, but difficulties then lie in calibrating the model when a full correlation structure is assumed. As a result, in this paper we will use a Hull-White one-factor and a Gaussian two-factor model to represent the interest rates, while disregarding the correlation between the volatility and the interest rate. We then obtain a hybrid model to price our option.

To price our option, we need to obtain the unknown parameters for both the Heston and the interest rate models. To do this, the models need to be calibrated to real market data. Calibration is the procedure where an objective function is minimized such that market and model prices, at one specific point in time, are near identical. As a result, the parameters for the models depend on the current market. For a calibration procedure it is important to specify a correct objective function and pricing technique. In our thesis we will compare both the bias and speed of four semi-analytical pricing techniques and compare three different objective functions for the Heston model. As for the interest models, we will only compare two different objective functions.

When the parameters are obtained, we are able to price our option by means of a Monte Carlo simulation. The Heston model assumes the volatility follows a CIR process. In our thesis we will compare four different discretization schemes for the volatility process. Speed-efficient and often-used discretization schemes, the Euler and Milstein schemes, disregard the true distribution of the CIR process and use a left Riemann sum to discretize the SDE. The Truncated Gaussian and Quadratic Exponential schemes, which do take the true distribution into account, take longer but are expected to be less biased because they approximate the true distribution by means of moment matching.
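The contrast between the two scheme families can be sketched in a few lines. The snippet below (a Python sketch rather than the MatLab used for the thesis' implementation; parameter values are illustrative, not the calibrated ones) shows a plain Euler step with the common full-truncation fix, which discretizes the SDE with a left Riemann sum and simply floors the variance fed into the square root:

```python
import numpy as np

def euler_full_truncation_cir(v0, kappa, theta, lam, dt, n_steps, rng):
    """One CIR variance path under the full-truncation Euler scheme.

    The drift and diffusion use max(v, 0), so a step can push v below zero
    but never feeds a negative value into the square root.
    """
    v = np.empty(n_steps + 1)
    v[0] = v0
    for i in range(n_steps):
        v_plus = max(v[i], 0.0)
        dw = rng.standard_normal() * np.sqrt(dt)
        v[i + 1] = v[i] + kappa * (theta - v_plus) * dt + lam * np.sqrt(v_plus) * dw
    return v

# Illustrative parameters (not the thesis' calibrated values): one year of daily steps
rng = np.random.default_rng(42)
path = euler_full_truncation_cir(v0=0.05, kappa=0.74, theta=0.06, lam=0.5,
                                 dt=1 / 252, n_steps=252, rng=rng)
```

The truncation is one standard way of handling the negative variances the Euler scheme can produce; the Truncated Gaussian and QE schemes discussed in Section 5 instead sample from a moment-matched approximation of the true CIR distribution.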


The remainder of this thesis is structured as follows. Section 2 shows how the GAO is priced. Section 3 presents the Heston model, the two interest rate models we are going to use, and the corresponding pricing of zero-coupon bonds. Section 4 will explain the Heston calibration and introduce swaptions, options closely related to the zero curve which can be expressed in terms of zero-coupon bonds, used here for calibrating the interest rate models. Section 5 will present our Monte Carlo schemes. Section 6 will discuss the numerical results and finally Section 7 will conclude.


2

Pricing the GAO

The GAO is priced as follows: assume we have a single individual who commits to this guarantee. At maturity date T, the amount invested in equity is S(T). The guarantee offers a guaranteed fixed payout rate g. Following this contract, the GAO thus gives the policyholder the right to either money equal to the investment in equity, amounting to S(T), or a life annuity with fixed rate g and payoff gS(T)a_x(T), where the value of the annuity is:

a_x(T) = Σ_{n=1}^{ω−x} _np_x D(T, t_n),   (2.1)

where _np_x is the probability that a person aged x survives n years, ω denotes the maximum age of the policyholder, t_n = T + n is the time after pension date and D(T, t_n) is the discount factor, which denotes the market value at time T of a zero-coupon bond that matures at time t_n. This discount factor between two instances, T and t_n, is the amount at time T that is "equivalent" to one unit of currency payable at time t_n; it is given by:

D(T, t_n) = B(T)/B(t_n) = e^{−∫_T^{t_n} r_u du}.   (2.2)

A zero-coupon bond P(T, t_n) is closely related to the discount factor. It guarantees one unit of currency at maturity date to the holder of the bond. The relationship between the two, given that interest rates are stochastic, is that P(T, t_n) can be viewed as an expectation of the discount factor above:

P(T, t_n) = E[D(T, t_n)] = E[e^{−∫_T^{t_n} r_u du}].   (2.3)
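Equation (2.1) is simply a survival-probability-weighted sum of discount factors, as the following sketch illustrates (hypothetical survival probabilities and a flat discount curve; the thesis uses market discount curves and mortality tables):

```python
import math

def annuity_value(survival_probs, discount_factors):
    """a_x(T) = sum over n of  _np_x * D(T, t_n)  -- equation (2.1)."""
    assert len(survival_probs) == len(discount_factors)
    return sum(p * d for p, d in zip(survival_probs, discount_factors))

# Toy example: three annual payments, flat 2% continuously compounded rate,
# and hypothetical survival probabilities _np_x
r = 0.02
discounts = [math.exp(-r * n) for n in (1, 2, 3)]
survival = [0.95, 0.90, 0.85]
ax = annuity_value(survival, discounts)
```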

Thus, a rational individual who holds a GAO will choose the option with the higher value. The value of the guarantee beyond the investment itself is therefore:

(gS(T)a_x(T) − S(T))^+ = S(T)(g a_x(T) − 1)^+ = gS(T)(a_x(T) − K)^+,   (2.4)

where K = 1/g. We must also take into account that the holder may die before the maturity date. The GAO then reduces to:

_Tp_{x−T} gS(T)(a_x(T) − K)^+.   (2.5)
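The rewriting in (2.4) can be checked numerically. The sketch below evaluates the payoff of the option part both directly and in the strike form with K = 1/g (all numbers are toy values):

```python
def gao_payoff_direct(g, S, ax):
    """Value of the option part at maturity: (g*S*ax - S)^+."""
    return max(g * S * ax - S, 0.0)

def gao_payoff_strike_form(g, S, ax):
    """The same payoff written as a call on the annuity with strike K = 1/g."""
    K = 1.0 / g
    return g * S * max(ax - K, 0.0)

# Toy values: a 9% conversion rate and an annuity factor of 12.5
in_the_money = gao_payoff_direct(0.09, 100.0, 12.5)   # equals the strike form
```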

We can now see that the GAO is a call option with strike K on the annuity a_x(T). Under the no-arbitrage assumption, the value at time t of a derivative with payoff V(T) at maturity time T is

V(t) = E^Q[D(t, T)V(T) | F_t],   (2.6)

where Q is the risk-neutral probability measure. We may assume that the survival probabilities of an insurance company are well diversified and thus do not change from P to Q. We can then denote the price of our GAO at time zero, under the risk-neutral measure Q, as:

C(0) = _Tp_{x−T} E^Q[D(0, T) gS(T)(a_x(T) − K)^+].   (2.7)

The above equation is priced using the bond price as numeraire. It is often easier to price options under the stock price numeraire. To show this, take the Black-Scholes model with stochastic interest rates. Evaluating the expectation is still valid but depends on two stochastic factors, namely the stock and interest rate. After changing numeraire, the expectation only has to be evaluated under one stochastic factor.

Haastrecht et al. (2009) found a closed form solution for the GAO under the Sch¨obel-Zhu model by changing numeraire and expressing the price as follows:

C(0) = _Tp_{x−T} S(0) g E^{Q_S}[(a_x(T) − K)^+] = _Tp_{x−T} S(0) g E^{Q_S}[(Σ_{n=1}^{ω−x} _np_x D(T, t_n) − K)^+]   (2.8)

Under their model they were then able to express the GAO solely in terms of the dynamics of the interest rate model, which is incorporated in the discount factor. They claim that the dynamics of the interest rate are of Gaussian form and can be expressed analytically; for the derivation, we refer the reader to their paper. Unfortunately, the Heston model assumes that the volatility follows a CIR process, and a closed form solution is not possible.


3

Model Specifications

In this section we describe the models used in further detail. First, the equity model for stochastic volatility is presented. Then we move on to the two interest rate models: a one-factor Hull-White and a Gaussian two-factor model. Next we explain how these models can be combined and estimated simultaneously. Finally, we price the zero-coupon bonds needed for the GAO and for the calibration of the interest rate models.

3.1

Heston model

The Heston model (Heston, 1993) assumes that the logarithmic price of the asset is determined by a stochastic process, which under the risk-neutral measure Q, is defined as:

d ln S(t) = (r(t) − ½v(t)) dt + √(v(t)) dW_s(t),   (3.1)

dv(t) = κ(θ − v(t)) dt + λ√(v(t)) dW_v(t),   (3.2)

where dW_s(t) and dW_v(t) are standard Brownian motions with correlation ρ_vS. Furthermore, θ is the long-run average variance, κ determines both the speed at which v(t) reverts to θ and the volatility clustering, and λ is the strength of the volatility smile: the vol of vol. Finally, r denotes the instantaneous short rate, which will be modeled using either a Hull-White one-factor or Gaussian two-factor model, considered in the next subsection. It must also be noted that the process {v(t)}_{t≥0} cannot become zero if the parameters obey the Feller condition: 2κθ > λ². One main advantage of the Heston model is that it can capture the volatility smile, which is displayed in Figure 1.

3.2

Two interest rate models

As mentioned before, the Heston model must be extended to overcome the deterministic interest rates. Many stochastic interest rate models are known. The first type are so-called equilibrium models. Examples of such models are the Vasicek (1977) model, the Cox, Ingersoll, and Ross (1985) model and the Rendleman and Bartter (1979) model. The downside of such models is that they do not exactly fit the current term structure of interest rates and, as such, are not used much in practice.

Although Hull (2012) states that equilibrium models may be useful when modeling long-term options, because the term structure then has little effect on the risks of the value of the portfolio, we will only look into no-arbitrage models. Such models exactly fit the term structure of the current interest rates, are thus much more reliable, and are used more often in practice.

3.2.1 Hull-White model

The first model we consider is the Hull-White (HW) interest rate model (1990), which is a no-arbitrage extension of the Vasicek model. The HW model is defined as:

dr(t) = (θ(t) − a r(t)) dt + σ dW_r(t),   (3.3)

Figure 1: Volatility smile of the Heston model with parameters κ = 0.74, θ = 0.06, λ = 0.5, v0 = 0.05, ρ = −0.33

where a is the mean reversion rate, σ the standard deviation of the short rate and dW_r a standard Brownian motion. Additionally, we assume correlation between the equity and interest rate as dW_r(t)dW_s(t) = ρ_rS dt and dW_r(t)dW_v(t) = ρ_rv dt = 0. Here θ(t) is the time-varying function which ensures that the model fits today's term structure of interest rates, calculated as:

θ(t) = F_t(0, t) + aF(0, t) + (σ²/(2a))(1 − e^{−2at}),   (3.4)

where F is the forward curve, which depends on the yield curve, and F_t is its partial derivative with respect to t. From this we can see that, when σ is small, the interest rate will in general follow the slope of the forward rate curve, and will revert back to the curve at rate a when it deviates from it. The short rate of the model above is in the category of Gaussian interest rate models. This carries the advantage that the model is analytically tractable: one can obtain closed form expressions for zero-coupon bonds and price the swaptions used for the calibration of the interest rate models. A disadvantage, according to Brigo & Mercurio (2007), is that there is a positive probability that the model produces negative interest rates. Although they state that this probability is negligible, that was when interest rates were a lot higher than in the current market. As such, the model may not perform as reliably in markets where interest rates are low.

3.2.2 Gaussian two-factor model

The second model we consider is the two-factor Gaussian (G2++) interest rate model (1994). The short rate of this model is the sum of two correlated Gaussian factors, including a function which


fits the term structure of the interest rates. Just like the HW model discussed above, this model is analytically tractable, so closed form expressions for zero-coupon bonds and swaptions can be derived, now with the addition of a second factor. This second factor allows for more variability and can thus capture more information from the swap volatility surface we calibrate to. For example, a two-factor model will capture more of the imperfect correlation between a 1 year rate and a 30 year swap volatility. A disadvantage of the G2++ model in relation to the HW model described above is that the interpretation of the model is less clear. Also, because this model is also in the category of Gaussian models, it has the same drawbacks as mentioned for the HW model. The G2++ model can be written as:

r(t) = x(t) + y(t) + φ(t),   (3.5)

where

dx(t) = −a x(t) dt + σ dW_1(t),   x(0) = 0,
dy(t) = −b y(t) dt + η dW_2(t),   y(0) = 0,   (3.6)

with correlation ρ_xy between the two Brownian motions, and where a, b, σ and η are positive constants. φ(t) is the term which fits the current term structure.

We now have a one-factor and a two-factor model. The question is then: why not add more factors if the two-factor model improves on the one-factor model? Jamshidian and Zhu (1997) show that one factor explains 68%-76% of the total variation and two factors explain 85% to 90%. They go on to show that three factors explain 93% to 94%. Thus, at least a two-factor model is needed to explain a good amount of the total variation of the volatility surface in our calibration procedure. One major advantage of the one- and two-factor models is that they are tractable and easy to implement. Although a three-factor model may explain more of the total variation, for reasons of tractability and implementation we will use the one- and two-factor models.

We subsequently have what we need to price zero-coupon bonds, which will be used to calibrate the interest models to market data. Also, we are able to formulate an expression for our hybrid models which will help us in pricing the GAO.

3.3

Hybrid models

We now consider the models jointly. As mentioned in the introduction, the Heston model is a popular model because it has a semi-closed form solution for plain vanilla options. In Section 2 we found that a closed form solution for the GAO was not possible, because the volatility of the Heston model follows a CIR process. Duffie et al. (2000) showed that, when a system is affine, a semi-closed form solution can be derived. But, after extending the Heston model to account for the stochastic interest rates, this no longer holds as the model is not affine anymore. Oosterlee (2009) shows that both the joint Heston-Hull-White (H-HW) and the Heston-Gaussian two-factor (H-G2++) models are able to be made affine by introducing a time varying approximation for the volatility. We would then be able to obtain a characteristic function for both models and price our GAO in a semi-closed form.


However, past literature (Kuhne, 2012) has found that the results will not converge to a stable price because of a sensitivity issue with the parameters.

As a result we are going to revert to Monte Carlo simulation. For both models we need the correlation between the equity and the interest rate models, ρ_rS. This correlation is obtained from historical estimation, which is done as follows. First we need historical returns from both the Eurostoxx50 (equity) and European swap rates (interest) with a tenor that is relatively liquid. Then the correlation between the two data sets is calculated over a range equal to the maturity of the GAO. As we assume the GAO's maturity date is at least 10 years away, we will create a 2-year rolling window of the correlation over 10 years, and average the correlations. Figure 2 shows the difference between using 1-year and 10-year correlations. As can be seen, the 1-year correlations vary a lot, while the 10-year correlations are near constant. It should be noted that the two are near identical when taking the average over the 2-year window.

Figure 2: Rolling correlations between the first difference of EUSA30 and log Eurostoxx50
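A minimal sketch of this historical estimation, with synthetic series standing in for the EUSA30 first differences and Eurostoxx50 log returns (which are not reproduced here):

```python
import numpy as np

def rolling_correlation(x, y, window):
    """Correlation between two series over each rolling window of given length."""
    out = [np.corrcoef(x[i:i + window], y[i:i + window])[0, 1]
           for i in range(len(x) - window + 1)]
    return np.array(out)

# Synthetic correlated return series standing in for the swap-rate and equity data
rng = np.random.default_rng(0)
z1, z2 = rng.standard_normal((2, 500))
true_rho = 0.3
rates_ret = z1
equity_ret = true_rho * z1 + np.sqrt(1 - true_rho**2) * z2

corr_1y = rolling_correlation(rates_ret, equity_ret, window=252)
avg_corr = corr_1y.mean()   # the averaged estimate used for rho_rS
```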

Correlated Brownian motions are needed to capture the relationships within and between the equity and interest rate models. These correlated Brownian motions are created in MatLab as dŴ(t) = C dW(t), where dW(t) are independent Brownian motions and C is the Cholesky decomposition obtained from the correlation matrix. For the HW model these are:

ρ = [ 1      ρ_vr   ρ_vS
      ρ_vr   1      ρ_rS
      ρ_vS   ρ_rS   1    ]
  = [ 1      0      ρ_vS
      0      1      ρ_rS
      ρ_vS   ρ_rS   1    ]   (3.7)


C = [ 1      0                                   0
      ρ_vr   √(1 − ρ_vr²)                        0
      ρ_vS   (ρ_rS − ρ_vr ρ_vS)/√(1 − ρ_vr²)     √(1 − ρ_vS² − ((ρ_rS − ρ_vr ρ_vS)/√(1 − ρ_vr²))²) ]

  = [ 1      0      0
      0      1      0
      ρ_vS   ρ_rS   √(1 − ρ_vS² − ρ_rS²) ]   (3.8)

For the H-G2++ model, we find the following Cholesky decomposition:

C = [ 1      0      0              0
      0      1      0              0
      0      ρ_xy   √(1 − ρ_xy²)   0
      ρ_vS   ρ_xS   q              √(1 − ρ_vS² − ρ_xS² − q²) ],   (3.9)

where q = (ρ_yS − ρ_xS ρ_xy)/√(1 − ρ_xy²).

Note that we assume no correlation between the volatility and interest rate process. With these correlated Brownian motions, we are able to implement our Monte Carlo procedure explained in Section 5, but we first need an expression for the zero-coupon bonds.
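The construction dŴ(t) = C dW(t) is a one-line matrix multiplication once the Cholesky factor is available. A Python sketch for the H-HW matrix (3.7), with illustrative correlation values:

```python
import numpy as np

def correlated_increments(rho, n_steps, dt, rng):
    """Brownian increments with correlation matrix rho, built via Cholesky."""
    C = np.linalg.cholesky(rho)            # rho = C @ C.T, C lower-triangular
    dW = rng.standard_normal((n_steps, rho.shape[0])) * np.sqrt(dt)
    return dW @ C.T                        # each row: one set of correlated increments

# H-HW correlation matrix (3.7) with rho_vr = 0; correlation values illustrative
rho_vS, rho_rS = -0.33, 0.25
rho = np.array([[1.0,    0.0,    rho_vS],
                [0.0,    1.0,    rho_rS],
                [rho_vS, rho_rS, 1.0]])
C = np.linalg.cholesky(rho)
```

Note that numpy returns the lower-triangular factor, so the last row of C matches the closed form in (3.8).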

3.4

Pricing zero-coupon bonds

In Section 2 we obtained an expression for our option, in which we need to price zero-coupon bonds under the two interest rate models described above. The price of a zero-coupon bond, P(t, T), under the Hull-White model is given by (2.3). Because the short rate is normally distributed, the integral of the short rate in the exponent of (2.3) is as well. Brigo & Mercurio then show that the price of a zero-coupon bond can be written as

P(t, T) = A(t, T) e^{−B(t,T) r(t)},   (3.10)

where

A(t, T) = (P^M(0, T)/P^M(0, t)) e^{B(t,T)F(0,t) − (σ²/(4a))(1 − e^{−2at})B(t,T)²},   (3.11)

B(t, T) = (1 − e^{−a(T−t)})/a,   (3.12)

F(0, t) = −∂ ln P^M(0, t)/∂t.   (3.13)

The price of a zero-coupon bond under the Hull-White model then only depends on the unknown parameters a and σ. Furthermore, P^M(0, m) denotes the market zero discount factor maturing at time m.
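Formulas (3.10)-(3.13) translate directly into code. A sketch, assuming for illustration a flat market discount curve P^M(0, t) = e^{−0.02t} (so that F(0, t) = 0.02):

```python
import math

def hw_bond_price(t, T, r_t, a, sigma, pm, f0):
    """Hull-White zero-coupon bond price P(t,T) = A(t,T) e^{-B(t,T) r(t)}.

    pm(t) is the market discount factor P^M(0,t); f0(t) is the instantaneous
    forward rate F(0,t) = -d ln P^M(0,t)/dt.
    """
    B = (1.0 - math.exp(-a * (T - t))) / a                          # (3.12)
    A = (pm(T) / pm(t)) * math.exp(
        B * f0(t) - (sigma**2 / (4.0 * a)) * (1.0 - math.exp(-2.0 * a * t)) * B**2
    )                                                               # (3.11)
    return A * math.exp(-B * r_t)                                   # (3.10)

# Flat 2% market curve: P^M(0,t) = e^{-0.02 t}, hence F(0,t) = 0.02
pm = lambda t: math.exp(-0.02 * t)
f0 = lambda t: 0.02
p = hw_bond_price(t=1.0, T=5.0, r_t=0.02, a=0.1, sigma=0.01, pm=pm, f0=f0)
```

A useful sanity check is P(t, t) = 1, since B(t, t) = 0 and A(t, t) = 1.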

The price of a zero-coupon bond under the G2++ model follows from (2.3) and Model (3.5) by substituting r_u as follows:

P(t, T) = E[e^{−∫_t^T r_u du}] = e^{−∫_t^T φ(u)du − M(t,T) + ½V(t,T)},   (3.14)

where

M(t, T) = ((1 − e^{−a(T−t)})/a) x(t) + ((1 − e^{−b(T−t)})/b) y(t),

and

V(t, T) = (σ²/a²)[(T − t) + (2/a)e^{−a(T−t)} − (1/(2a))e^{−2a(T−t)} − 3/(2a)]
        + (η²/b²)[(T − t) + (2/b)e^{−b(T−t)} − (1/(2b))e^{−2b(T−t)} − 3/(2b)]
        + (2ρση/(ab))[(T − t) + (e^{−a(T−t)} − 1)/a + (e^{−b(T−t)} − 1)/b − (e^{−(a+b)(T−t)} − 1)/(a + b)].   (3.15)

(3.14) follows from the fact that if X is normally distributed with mean m_x and variance σ_x², then E[e^X] = e^{m_x + ½σ_x²}. The model for the short rate dynamics of r(t) fits the currently-observed term structure of discount factors if and only if

e^{−∫_t^T φ(u)du} = (P^M(0, T)/P^M(0, t)) e^{−½[V(0,T) − V(0,t)]}.   (3.16)

Note that it may seem that we have to derive the whole φ curve, which involves interpolation and would thus mean we are approximating the curve. But Brigo & Mercurio state that only the market discount curve is needed, as φ only has to be computed at the two instances shown in (3.16), for which no interpolation is needed. The zero-coupon bond prices are then given by

P(t, T) = (P^M(0, T)/P^M(0, t)) e^{A(t,T)},

A(t, T) = ½[V(t, T) − V(0, T) + V(0, t)] − M(t, T).   (3.17)

For proofs we refer the reader to Brigo & Mercurio (2007).

We then see that the pricing of our zero-coupon bond under the two-factor Gaussian model requires five unknown parameters: a, b, σ, η and ρ. These can be obtained via calibration which will be discussed in Section 4.
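A sketch of the G2++ bond price (3.15)-(3.17), again with a flat market curve and illustrative parameter values:

```python
import math

def g2pp_V(t, T, a, b, sigma, eta, rho):
    """The variance term V(t,T) of equation (3.15)."""
    tau = T - t
    v1 = (sigma**2 / a**2) * (tau + (2/a)*math.exp(-a*tau)
                              - (1/(2*a))*math.exp(-2*a*tau) - 3/(2*a))
    v2 = (eta**2 / b**2) * (tau + (2/b)*math.exp(-b*tau)
                            - (1/(2*b))*math.exp(-2*b*tau) - 3/(2*b))
    v3 = (2*rho*sigma*eta/(a*b)) * (tau + (math.exp(-a*tau) - 1)/a
                                    + (math.exp(-b*tau) - 1)/b
                                    - (math.exp(-(a+b)*tau) - 1)/(a + b))
    return v1 + v2 + v3

def g2pp_bond_price(t, T, x_t, y_t, a, b, sigma, eta, rho, pm):
    """G2++ zero-coupon bond price, equations (3.15)-(3.17)."""
    M = (1 - math.exp(-a*(T-t)))/a * x_t + (1 - math.exp(-b*(T-t)))/b * y_t
    A = 0.5*(g2pp_V(t, T, a, b, sigma, eta, rho)
             - g2pp_V(0, T, a, b, sigma, eta, rho)
             + g2pp_V(0, t, a, b, sigma, eta, rho)) - M
    return pm(T)/pm(t) * math.exp(A)

# Illustrative parameters and a flat 2% market curve (not calibrated values)
pm = lambda t: math.exp(-0.02 * t)
p = g2pp_bond_price(1.0, 5.0, 0.0, 0.0, a=0.5, b=0.05,
                    sigma=0.01, eta=0.008, rho=-0.7, pm=pm)
```

Sanity checks are V(t, t) = 0 and P(t, t) = 1, both of which follow directly from (3.15) and (3.17).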

3.5

Schöbel-Zhu Model

In the results we will compare the Heston HW and Heston G2++ models with the Schöbel-Zhu HW and Schöbel-Zhu G2++ closed form solutions from Haastrecht et al. (2009). This subsection gives a short overview of the Schöbel-Zhu (SZ) model.


The SZ model is similar to the Heston model, except that the volatility itself, rather than the variance, is modeled, and its distribution is Gaussian. As such, the SZ model can be written as

d ln S(t) = (r(t) − ½v(t)²) dt + v(t) dW_s(t),   (3.18)

dv(t) = κ(θ − v(t)) dt + λ dW_v(t),   (3.19)

where the parameters have the same interpretation as in the Heston model.

As we can see, the volatility then follows an Ornstein-Uhlenbeck process. The main difference between the Heston and SZ models is that the Heston model can capture more of the volatility smile, and that the calibrated parameters will differ. One downside of the SZ model is that it can produce negative values for the volatility process.


4

Calibration

To price our option, we must first obtain the unknown parameters of the models such that they most accurately portray the market. This is done by means of calibration: an objective function measuring the fit to market data at one moment in time is minimized over the model parameters. An example is the error between market and model prices; however, there are numerous other objective functions that may be chosen, each with its (dis)advantages. Minimizing such an objective function may seem straightforward, but it will not be convex and has no known structure. This comes with the disadvantage that we cannot simply find the parameter set for which the gradient of our objective function is zero; as a result, finding the global minimum will be difficult. In the next subsections we will go through the calibration procedure for the Heston and interest rate models.

4.1

Calibrating the Heston model

There are numerous objective functions that can be used to calibrate the Heston model. Chen (2007) briefly describes two popular ones. He states that using a relative pricing error assigns more weight to short-term rather than long-term options. This can be overcome by using a quadratic norm, which instead favors long-term options. Gatheral (2006) states that the Heston model will have large errors when using short-term options because it cannot capture the large skews they portray. As such, we advise using the quadratic norm. The objective function we will be using is as follows:

√( Σ_{i=1}^N w_i {C_i^{ξ}(K_i, T_i) − C_i^{mkt}(K_i, T_i)}² ),

where ξ is the parameter set of the model, C_i^{ξ}(K_i, T_i) and C_i^{mkt}(K_i, T_i) are the ith option price of the model and the market respectively, with strike K_i and maturity T_i, N is the number of options and w_i the weight attached to option i.
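The objective function is straightforward to wire to an optimizer. In the sketch below a toy one-parameter "pricer" stands in for the Heston pricer, so only the calibration mechanics are illustrated (the non-convexity discussed above does not arise in this toy case):

```python
import numpy as np
from scipy.optimize import minimize

def objective(params, pricer, strikes, maturities, market_prices, weights):
    """Square root of the weighted sum of squared pricing errors."""
    model = np.array([pricer(params, K, T) for K, T in zip(strikes, maturities)])
    return np.sqrt(np.sum(weights * (model - market_prices) ** 2))

# Toy one-parameter "pricer": price = p * K * T (purely illustrative)
toy_pricer = lambda p, K, T: p[0] * K * T
strikes = np.array([90.0, 100.0, 110.0])
maturities = np.array([1.0, 5.0, 10.0])
market = 0.05 * strikes * maturities          # "market" prices generated with p = 0.05
weights = np.full(3, 1.0 / 3.0)               # uniform weights

res = minimize(objective, x0=[0.01], method="Nelder-Mead",
               args=(toy_pricer, strikes, maturities, market, weights))
```

In the toy case the minimizer recovers the parameter that generated the "market" prices; with the real Heston pricer, multiple restarts or a global search are needed because of the non-convexity.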

Detlefsen and Härdle (2006) looked into pricing OTM exotic options with four different objective functions, and concluded that using implied volatilities (IVs) instead of prices significantly reduces the errors. But the IVs are not available in closed form, and minimization in IV space would be too time-consuming. Thus, to compare different calibration methods, we will use different weights in the objective function described above.

The first is to use vega-weighted prices, as suggested by Cont (2005). This weight approximates minimization in IV space and, as such, is a good alternative to the use of IVs in the objective function. The further an option goes OTM or ITM, the smaller its vega becomes, so we correct for this by capping the weight, ensuring we do not overweight an option. The weights we will use here are thus w_i = min(1/vega_i, 100).

As noted above, vega is largest for options that are ATM, so inverse-vega weights would otherwise overweight options that are OTM or ITM. One possible way to solve this problem is to use uniform weights, w_0 = w_1 = ... = w_N, or simply, each weight is 1/N.

A third possibility is to choose the weights such that the relative pricing errors of all options, weighted by their market price, are equal. The weights to be used are thus w_i = 1/C_i^{mkt}, where C_i^{mkt} is the market price of option i.

One possible way to check the performance of the weights is to compare the relative difference between the market prices and the model prices. However, when doing so it is obvious that the third weight will come out best, as that was exactly its weighting criterion. Hence, to compare the different weights, we will calculate the implied volatilities given the model parameters, and compare these to the volatility surface obtained from Bloomberg.

The Heston model will be calibrated to put and call options on the Eurostoxx 50 Index as of 31 December 2014. An important aspect in finding the parameters of the Heston model, given the option we are pricing, is to find instruments that have a relevant maturity date. It is not advisable to calibrate the model to a surface of short-term options when the option to be priced is long term. But long-term equity options may not be very liquid and, consequently, not suitable for calibration. As the GAO expires far into the future, we should use options with maturity dates around 10 years; due to the liquidity of such options, we use maturities between 1 and 10 years. The strike prices considered range from 90% to 110% of the current stock price in increments of 2.5%, except 92.5% and 107.5%. Hence we look at ITM, ATM and OTM options. The implied volatility surface is shown in Appendix A.

4.2

Semi-analytical pricing methods

As mentioned in the introduction, the Heston model has a semi-closed form solution for pricing plain vanilla options, which is used for calibrating the model to market data. In short, this semi-closed form solution is obtained by integrating the conditional density function, which can be obtained from the characteristic function (CF) in several different ways. The Heston model has a known characteristic function, which will be given in the next subsection. In our thesis we will compare four pricing methods for obtaining call and put prices under the dynamics of the Heston model. These methods can be used to calibrate the model efficiently.

4.2.1 Characteristic function

For all the methods to find semi-closed form solutions for plain vanilla options under the Heston model, we need the CF. The CF uniquely determines the probability distribution of any real-valued random variable and, unlike the moment generating function, exists even when moments are not finite. It is defined, together with its Fourier pair, as

\phi(u) = E\left[e^{iuX}\right] = \int_{\mathbb{R}} e^{iux} f(x)\,dx \qquad (4.1)

f(x) = \frac{1}{2\pi} \int_{\mathbb{R}} e^{-iux} \phi(u)\,du, \qquad (4.2)

where f(x) is the probability density function. The cumulative distribution function is then obtained by integrating (4.2) as

F(x) = \int_{-\infty}^{x} f(y)\,dy = \frac{1}{2} - \frac{1}{\pi} \int_0^{\infty} \Re\left[\frac{e^{-iux}\phi(u)}{iu}\right] du. \qquad (4.3)
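As a quick numerical sanity check of (4.3) (our illustration, not part of the thesis), the inversion can be implemented with a truncated trapezoid rule; for the standard normal CF, φ(u) = e^{-u²/2}, the recovered CDF should match Φ:

```python
import math
import numpy as np

def cdf_from_cf(x, phi, umax=40.0, n=20001):
    """Gil-Pelaez inversion, eq. (4.3): F(x) = 1/2 - (1/pi) int_0^inf Re[e^{-iux} phi(u)/(iu)] du.
    The semi-infinite integral is truncated at umax and evaluated with the trapezoid rule."""
    u = np.linspace(1e-10, umax, n)          # skip u = 0 (removable singularity)
    g = (np.exp(-1j * u * x) * phi(u) / (1j * u)).real
    h = u[1] - u[0]
    integral = h * (g.sum() - 0.5 * (g[0] + g[-1]))
    return 0.5 - integral / math.pi

phi_norm = lambda u: np.exp(-0.5 * u ** 2)   # CF of the standard normal
```

The truncation level `umax` must be chosen so that |φ(u)| has decayed to negligible size; heavier-tailed CFs need a larger `umax` and a finer grid.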

Heston (1993) showed that under the risk-neutral measure, with X(T) = \log S(T), the CF is

\phi_{Heston}(u) = e^{iu(\log S(0) + r\tau) + B + C}, \qquad (4.4)

where

B = \frac{v(0)}{\lambda^2}\,\frac{1 - e^{-D\tau}}{1 - G e^{-D\tau}}\,(\kappa - \rho\lambda iu - D), \qquad C = \frac{\kappa\theta}{\lambda^2}\left[\tau(\kappa - \rho\lambda iu - D) - 2\log\frac{1 - G e^{-D\tau}}{1 - G}\right] \qquad (4.5)

and

D = \sqrt{(\kappa - \rho\lambda iu)^2 + (u^2 + iu)\lambda^2}, \qquad G = \frac{\kappa - \rho\lambda iu - D}{\kappa - \rho\lambda iu + D}.
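In code, (4.4)–(4.5) can be written down directly; the sketch below (ours, vectorized over u) also exposes the two standard sanity checks φ(0) = 1 and φ(−i) = S(0)e^{rτ}:

```python
import numpy as np

def heston_cf(u, tau, S0, r, kappa, theta, lam, v0, rho):
    """Heston characteristic function of log S(T), eq. (4.4)-(4.5); lam is the vol-of-vol."""
    u = np.asarray(u, dtype=complex)
    beta = kappa - rho * lam * 1j * u
    D = np.sqrt(beta**2 + (u**2 + 1j * u) * lam**2)
    G = (beta - D) / (beta + D)
    B = v0 / lam**2 * (1 - np.exp(-D * tau)) / (1 - G * np.exp(-D * tau)) * (beta - D)
    C = kappa * theta / lam**2 * (tau * (beta - D) - 2 * np.log((1 - G * np.exp(-D * tau)) / (1 - G)))
    return np.exp(1j * u * (np.log(S0) + r * tau) + B + C)
```

φ(0) = 1 holds for any CF, and φ(−i) = E[S(T)] = S(0)e^{rτ} is the risk-neutral martingale condition.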

Given this CF, we will now show how to price call options under four different methods.

4.2.2 Black-Scholes type solution

The first pricing method is the one that most resembles the Black-Scholes (BS) solution. We are able to price a European call with strike K and spot S(t) as

C(S_0, K, T) = e^{-rT} E^{Q}\left[(S_T - K)^+\right]
= e^{-rT} \int_0^\infty (S_T - K)^+ f(S_T)\,dS_T
= e^{-rT}\left[\int_K^\infty S_T f(S_T)\,dS_T - K \int_K^\infty f(S_T)\,dS_T\right]
= e^{-rT}\left[\int_k^\infty e^x f(x)\,dx - e^k \int_k^\infty f(x)\,dx\right]
= S_0 P_1 - e^{-rT+k} P_2, \qquad (4.6)

where k = \log K and x = \log S_T. P_1 and P_2 are conditional probabilities: P_1 is computed with the stock price as numeraire and P_2 with the bond price as numeraire. We can easily see that the term P_2 is the probability P(x > k), which after integrating (4.2) simplifies to

P_2 = \frac{1}{2} + \frac{1}{\pi}\int_0^\infty \Re\left[\frac{e^{-iuk}\phi(u)}{iu}\right] du. \qquad (4.7)

As for P_1, after changing the numeraire to the stock price, we obtain

P_1 = \frac{1}{2} + \frac{1}{\pi}\int_0^\infty \Re\left[\frac{e^{-iuk}\phi(u - i)}{iu\,\phi(-i)}\right] du. \qquad (4.8)

Combining these two results we then get a semi-closed form solution for the price of a call option under the dynamics of the Heston model.
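The following sketch (our illustration) combines (4.6)–(4.8), with the semi-infinite integrals truncated and evaluated by the trapezoid rule; the CF is repeated to keep the sketch self-contained. With a tiny vol-of-vol and v(0) = θ, the Heston price should be very close to the Black-Scholes price with σ = √θ, which gives a convenient check:

```python
import numpy as np

def heston_cf(u, tau, S0, r, kappa, theta, lam, v0, rho):
    # Heston characteristic function, eq. (4.4)-(4.5)
    u = np.asarray(u, dtype=complex)
    beta = kappa - rho * lam * 1j * u
    D = np.sqrt(beta**2 + (u**2 + 1j * u) * lam**2)
    G = (beta - D) / (beta + D)
    B = v0 / lam**2 * (1 - np.exp(-D * tau)) / (1 - G * np.exp(-D * tau)) * (beta - D)
    C = kappa * theta / lam**2 * (tau * (beta - D) - 2 * np.log((1 - G * np.exp(-D * tau)) / (1 - G)))
    return np.exp(1j * u * (np.log(S0) + r * tau) + B + C)

def heston_call_bs_type(S0, K, T, r, kappa, theta, lam, v0, rho, umax=100.0, n=20001):
    """BS-type Heston call price C = S0*P1 - K*exp(-rT)*P2, eq. (4.6)-(4.8)."""
    k = np.log(K)
    u = np.linspace(1e-8, umax, n)
    phi = lambda w: heston_cf(w, T, S0, r, kappa, theta, lam, v0, rho)
    f2 = (np.exp(-1j * u * k) * phi(u) / (1j * u)).real                  # integrand of P2
    f1 = (np.exp(-1j * u * k) * phi(u - 1j) / (1j * u * phi(-1j))).real  # integrand of P1
    h = u[1] - u[0]
    trap = lambda f: h * (f.sum() - 0.5 * (f[0] + f[-1]))
    P1 = 0.5 + trap(f1) / np.pi
    P2 = 0.5 + trap(f2) / np.pi
    return S0 * P1 - K * np.exp(-r * T) * P2
```

Note that both integrals must be re-evaluated for every strike, which is the inefficiency that the remaining methods address.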

4.2.3 Attari’s approach

One downside of the previous brute force approach is that P_1 and P_2 must be evaluated separately. Attari (2004) modified the BS-type solution shown above by combining the two probabilities P_1 and P_2. He shows that the price of a call option, after combining the two probabilities and applying Euler's identity, is

C(S_0, K, T) = S_0 - e^{-rT} K \left[\frac{1}{2} + \frac{1}{\pi}\int_0^\infty \frac{\left(\Re[\phi(u)] + \frac{\Im[\phi(u)]}{u}\right)\cos(uk) + \left(\Im[\phi(u)] - \frac{\Re[\phi(u)]}{u}\right)\sin(uk)}{1 + u^2}\,du\right]. \qquad (4.9)

The CF then only needs to be evaluated once per integration point, as the value of \phi(u) can be cached.

4.2.4 Fast Fourier Transform (FFT)

One disadvantage of computing the price in a BS-type solution is that it can be slow, even with the changes that Attari made: the integrals are complex and, more importantly, must be evaluated for each strike price. One can also use the FFT to price options. The FFT has the advantage of being fast as well as accurate. But one cannot directly apply the FFT to the integral in (4.2), because the integrand is singular at u = 0. This is because the call price in (4.6) tends to S(0) as k goes to -\infty, so a Fourier transform does not exist. Carr and Madan (1999) proposed to introduce an exponential dampening factor e^{\alpha k}, defining the modified price c_T(k) = e^{\alpha k} C_T(k), which is an integrable function.

We can then consider the Fourier transform of the modified call price,

\psi(u) = \int_{-\infty}^{\infty} e^{iuk} c_T(k)\,dk, \qquad (4.10)

and evaluate it as

\psi(u) = \int_{-\infty}^{\infty} e^{iuk} \int_{-\infty}^{\infty} e^{\alpha k} e^{-rT}(e^x - e^k)^+ f(x)\,dx\,dk
= \int_{-\infty}^{\infty} e^{-rT} f(x)\left[\int_{-\infty}^{x} (e^x - e^k)\,e^{iuk} e^{\alpha k}\,dk\right] dx
= e^{-rT}\int_{-\infty}^{\infty} f(x)\left[\frac{e^{(\alpha+1+iu)x}}{\alpha + iu} - \frac{e^{(\alpha+1+iu)x}}{\alpha + 1 + iu}\right] dx
= \frac{e^{-rT}\,\phi_T(u - (\alpha+1)i)}{\alpha^2 + \alpha - u^2 + i(2\alpha+1)u}. \qquad (4.11)

We can then obtain the call price by applying the inverse transform of the Fourier pair of (4.10), defined as

C_T(k) = \frac{e^{-\alpha k}}{2\pi}\int_{-\infty}^{\infty} e^{-iuk}\psi(u)\,du = \frac{e^{-\alpha k}}{\pi}\int_0^{\infty} \Re\left[e^{-iuk}\psi(u)\right] du. \qquad (4.12)

This formulation is faster than the brute force BS-type solution because the FFT is a fast and accurate method to compute the sum

X_j = \sum_{n=0}^{N-1} x_n e^{-i 2\pi j n / N}, \qquad (4.13)

where N is a power of 2. The FFT reduces the computational complexity of this sum from N^2 to N \log N. We can discretize the integral in (4.12) using the trapezoid rule and formulate the result as a function that corresponds to (4.13).

Taking the real operator out of the integral, applying the trapezoid rule and truncating the range of integration at B = (N-1)h, where h is the step size, we find that (4.12) can be written as

C_T(k_j) = \frac{e^{-\alpha k_j}}{\pi}\,\Re\left[\int_0^{B} e^{-iuk_j}\psi(u)\,du\right] \approx \frac{e^{-\alpha k_j}}{\pi}\,\Re\left[\sum_{n=0}^{N-1} \psi(nh)\,e^{-inh(k_0 + j\lambda)}\,w_n h\right], \qquad (4.14)

where k_j = k_0 + j\lambda, with k_0 chosen so that the log strikes are centered around \ln S_0, and w_n are the trapezoid weights. By defining \psi(nh)e^{-inhk_0} w_n h as x_n and \lambda as \frac{2\pi}{Nh}, we can formulate our call price as

C_T(k_j) \approx \frac{e^{-\alpha k_j}}{\pi}\,\Re\left[\sum_{n=0}^{N-1} x_n e^{-i2\pi j n/N}\right], \qquad (4.15)

which is of the same form as (4.13). An added benefit is that, as the log strikes are centered around \ln S_0, the strikes for one maturity date can be passed in as a vector and the corresponding prices obtained in a single transform.

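To make the recipe concrete, the sketch below (our illustration) implements (4.11)–(4.15) with NumPy's FFT. For brevity it uses the Black-Scholes CF, for which a closed-form benchmark exists; the Heston CF of (4.4) can be substituted without changing anything else:

```python
import numpy as np

def carr_madan_fft(S0, r, sigma, T, strikes, alpha=1.5, N=4096, eta=0.25):
    """Carr-Madan call pricing on an FFT grid of log strikes, following (4.11)-(4.15).
    Black-Scholes CF used so the output can be benchmarked against the closed form."""
    mu = np.log(S0) + (r - 0.5 * sigma**2) * T
    cf = lambda u: np.exp(1j * u * mu - 0.5 * sigma**2 * u**2 * T)   # CF of log S(T)
    u = eta * np.arange(N)
    # damped-call transform psi(u), eq. (4.11)
    psi = np.exp(-r * T) * cf(u - (alpha + 1) * 1j) / (alpha**2 + alpha - u**2 + 1j * (2 * alpha + 1) * u)
    lam = 2 * np.pi / (N * eta)              # log-strike spacing, lambda = 2*pi/(N*h)
    b = 0.5 * N * lam                        # grid of log strikes centered on ln S0
    k = np.log(S0) - b + lam * np.arange(N)
    w = np.ones(N); w[0] = 0.5               # trapezoid weights
    x = psi * np.exp(-1j * u * (np.log(S0) - b)) * w * eta
    prices = np.exp(-alpha * k) / np.pi * np.fft.fft(x).real
    return np.interp(np.log(np.asarray(strikes, dtype=float)), k, prices)
```

The strike grid returned by the FFT is equidistant in log strike, so market strikes are obtained by interpolation — the interpolation error mentioned above.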

Figure 3: Effect of the dampening factor

The effect of the dampening factor \alpha is shown in Figure 3. If \alpha is close to zero, the computed call price is meaningless, confirming that the integrand of (4.2) is singular at u = 0. As \alpha rises, the price approaches its true value.

Additionally, it must be noted that though this method is fast and accurate, it remains an approximation, due to the truncation error in (4.14). Also, as the market strikes will generally not coincide with the equidistant log-strike grid, some interpolation error is introduced.

4.2.5 COS method

Fang and Oosterlee (2009) state that although the FFT method is fast, it has the disadvantages that the error converges slowly (due to the equally spaced integration rule) and that N has to be a power of 2. They proposed to solve the inverse Fourier integral, defined in (4.2), via its Fourier-cosine expansion. After truncating the integral to the domain [a, b], we can formulate the cosine expansion of f(x) from (4.2) as

f(x) = {\sum_{k=0}^{N-1}}' A_k \cos\left(k\pi\frac{x-a}{b-a}\right), \qquad (4.16)

where A_k = \frac{2}{b-a}\int_a^b f(x)\cos\left(k\pi\frac{x-a}{b-a}\right)dx, and the prime in the summation means that the first term is multiplied by one half. By inserting the definition of f(x) into the equation for A_k we obtain an expression in terms of known quantities:

A_k = \frac{2}{b-a}\,\Re\left[\phi\left(\frac{k\pi}{b-a}\right)e^{-ik\pi\frac{a}{b-a}}\right]. \qquad (4.17)

If we then go back to the second step in (4.6), using that \phi(w) = \phi_{Heston}(w)e^{iwx} with x = \ln\frac{S_0}{K}, we can denote the price of a plain vanilla option with payoff V(y,T) = [\alpha K(e^y - 1)]^+, where y = \ln\frac{S_T}{K} and \alpha = 1 for calls and -1 for puts, at t = 0, as

V(x, 0) = e^{-rT}\int_a^b V(y,T)f(y|x)\,dy
= e^{-rT}{\sum_{k=0}^{N-1}}' \Re\left[\phi_{Heston}\left(\frac{k\pi}{b-a}\right)e^{ik\pi\frac{x-a}{b-a}}\right]\frac{2\alpha K}{b-a}\int_c^d (e^y - 1)\cos\left(k\pi\frac{y-a}{b-a}\right)dy
= e^{-rT}{\sum_{k=0}^{N-1}}' \Re\left[\phi_{Heston}\left(\frac{k\pi}{b-a}\right)e^{ik\pi\frac{x-a}{b-a}}\right]\frac{2\alpha K}{b-a}\,(\chi_k - \psi_k), \qquad (4.18)

where the last step is obtained by using integration by parts and standard integration rules to get the following expressions for \chi_k and \psi_k:

\chi_k = \frac{1}{1+\omega^2}\left[\cos(\omega(d-a))e^d - \cos(\omega(c-a))e^c + \omega\sin(\omega(d-a))e^d - \omega\sin(\omega(c-a))e^c\right] \qquad (4.19)

\psi_k = \begin{cases} d - c, & \text{if } k = 0 \\ \frac{\sin(\omega(d-a)) - \sin(\omega(c-a))}{\omega}, & \text{otherwise,} \end{cases} \qquad (4.20)

where \omega = \frac{k\pi}{b-a}, and c = 0, d = b if a call and c = a, d = 0 if a put is being priced.

This COS method, just like the FFT, can take the strikes as a vector, but needs no interpolation rule to obtain the corresponding prices. And again, although this is a faster method, it remains an approximation to the brute force methods explained in Sections 4.2.2 and 4.2.3.
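A compact implementation of (4.18)–(4.20) is sketched below (ours; the Black-Scholes CF of the log-return is used so a closed-form benchmark exists — the Heston CF of (4.4) can be dropped in instead, with the truncation range widened accordingly):

```python
import numpy as np

def cos_call(S0, K, T, r, sigma, N=256, L=10.0):
    """COS pricing of a European call, eq. (4.18)-(4.20) (Fang & Oosterlee)."""
    x = np.log(S0 / K)
    # truncation range [a, b] from the first two cumulants of the BS log-return
    c1, c2 = (r - 0.5 * sigma**2) * T, sigma**2 * T
    a, b = c1 - L * np.sqrt(c2), c1 + L * np.sqrt(c2)
    omega = np.arange(N) * np.pi / (b - a)
    cf = np.exp(1j * omega * c1 - 0.5 * c2 * omega**2)   # CF of ln(S_T/S_0)
    # chi_k and psi_k on [c, d] = [0, b] for the call payoff
    c, d = 0.0, b
    chi = (np.cos(omega * (d - a)) * np.exp(d) - np.cos(omega * (c - a)) * np.exp(c)
           + omega * (np.sin(omega * (d - a)) * np.exp(d)
                      - np.sin(omega * (c - a)) * np.exp(c))) / (1 + omega**2)
    psi = np.empty(N)
    psi[0] = d - c
    psi[1:] = (np.sin(omega[1:] * (d - a)) - np.sin(omega[1:] * (c - a))) / omega[1:]
    Vk = 2.0 / (b - a) * K * (chi - psi)
    terms = (cf * np.exp(1j * omega * (x - a))).real * Vk
    terms[0] *= 0.5                                      # primed sum halves the first term
    return np.exp(-r * T) * terms.sum()
```

With N = 256 and L = 10 the price already agrees with the Black-Scholes closed form to high accuracy, illustrating the fast error convergence claimed above.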

4.2.6 Results

First we will discuss the convergence speed of the two approximation methods and their relation to the brute force approaches. Figure 4 displays the convergence in price of the COS and FFT methods as N = 2^l varies. As can be seen, the COS method converges much faster than the FFT method, confirming that the FFT introduces some interpolation error in obtaining the prices. For N = 2^9, we observe that the two approximation methods produce the same prices as the brute force methods.

When discussing numerical efficiency, one also has to look at the computational time it takes to price an option. One can imagine that a method which is slightly biased but very fast may be preferable to one that is extremely accurate but slow. When inspecting the computational time of the methods used in this thesis, the COS method was the fastest, followed by the FFT, Attari and the BS solution. One surprising result was that in the calibration procedure the FFT method was very sensitive to the initial parameters, and the procedure took, on average, just as long as Attari's method. Thus, we found that the COS method was both the fastest and just as accurate as the brute force methods.

Figure 4: Convergence in price

In Table 1 the calibration results for the Heston model, given the data used, are displayed for different weights attached to the objective function. The performance of the weights is shown in the final column, which displays the average absolute difference between implied and market volatilities. The first noticeable observation is that the uniform weights perform the worst. This result could have been anticipated, as one would not expect all prices to be weighted equally.

In Section 4.1 it was noted that minimization in IV space should perform significantly better than comparing prices. But minimizing in IV space would be too time consuming, and vega weights should approximate it. Table 1 indeed shows that vega weights perform better than uniform weights, but slightly worse than relative weights, although the difference between the two is minimal.

After calibrating using relative weights, the difference between the implied and market volatilities is displayed in Figure 5. We observe that apart from two or three prices, located in the extremes of the surface, the differences between model and market volatilities are very small, especially given that we are fitting 35 options with only 5 parameters.

Weight    κ        θ        λ        v0       ρ_vS      Absolute Error
Uniform   0.2965   0.0870   0.4070   0.0483   -0.3699   0.1656
Vega      0.3589   0.0816   0.4153   0.0486   -0.4128   0.1482
Market    0.3792   0.0794   0.4071   0.0482   -0.4255   0.1481

Table 1: Heston calibration results


Figure 5: Difference between market and implied volatilities

In Appendix A the calibration procedure is repeated for implied volatility surfaces from different dates. This is because the calibration is done at a single moment in time and thus depends heavily on prevailing market conditions.

4.3 Calibrating the interest rate models

The HW and the G2++ model will be calibrated separately using European swaptions. Swaptions are basic interest rate derivatives whose prices can be expressed in terms of zero-coupon bonds, given a swap volatility surface. They give the right, but not the obligation, to enter into an interest rate swap (IRS) at the swaption maturity T_\alpha. The underlying IRS then has a separate length, called the tenor, T_\beta - T_\alpha, where T_\beta is the combined length of the swaption and the IRS. As swaptions can be expressed in terms of zero-coupon bonds, they can be priced under both the HW and the G2++ model we use. Given these swaption prices, the objective function used for calibrating the interest rate models is

\sqrt{\sum_{i=1}^{N} w_i \left(S_i^{\xi} - S_i^{mkt}\right)^2}, \qquad (4.21)

where the w_i are the weights, which are either uniform or relative, and S_i^{\xi} denotes the model swaption price given the parameter vector \xi.

The interest rate models will be calibrated to payer and receiver swaptions constructed from the swap rates at 31 December 2014, obtained from Bloomberg. Again, as the GAO has a long maturity, we need to use relevant dates, and we have to avoid swap rates that are not liquid. As a result, we calibrate to swaptions with maturities between 10 and 25 years and tenors between 10 and 25 years. The swaptions are priced ATM. The following sections show the analytical formulas for the swaptions in the HW and G2++ models.

4.3.1 Hull-White model

The pricing of swaptions under the Hull-White framework is closely related to the pricing of zero-coupon bonds. We may formulate the price at time t of a European call or put option, maturing at time T, on a pure discount bond maturing at time t_i, where t < T < t_i, as

ZB = w\left[P(t, t_i)\Phi(wh) - K P(t, T)\Phi(w(h - \sigma_p))\right], \qquad (4.22)

where w = 1 for a call (ZBC) and w = -1 for a put (ZBP), and

\sigma_p = \sigma\sqrt{\frac{1 - e^{-2a(T-t)}}{2a}}\,B(T, t_i), \qquad h = \frac{1}{\sigma_p}\ln\frac{P(t, t_i)}{P(t, T)K} + \frac{\sigma_p}{2}. \qquad (4.23)

Brigo & Mercurio show that a swaption can be priced under the Hull-White interest rate model by first solving

\sum_{i=1}^{n} c_i A(T, t_i) e^{-B(T, t_i) r^*} = 1, \qquad (4.24)

where c_i = K\tau_i (\tau_i is the year fraction from t_{i-1} to t_i) for i = 1, \ldots, n-1 and c_n = 1 + K\tau_n, and the individual strikes are K_i := A(T, t_i)e^{-B(T, t_i) r^*}. Here r^* is obtained with a root-finding algorithm. We can then denote the swaption price as

S = N\omega\sum_{i=1}^{n} c_i\left[P(t, t_i)\Phi(\omega h_i) - K_i P(t, T)\Phi(\omega(h_i - \sigma_{p,i}))\right], \qquad (4.25)

where h_i and \sigma_{p,i} are as in (4.23) with strike K_i, \omega = -1 (\omega = 1) for a payer (receiver) swaption, and N is the notional of the swaption. For proofs we refer the reader to Brigo & Mercurio.
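To illustrate (4.22)–(4.25), the sketch below (ours) prices a payer swaption on an assumed flat initial curve P(0,t) = e^{-r_0 t} — a simplification, since in the thesis A(T, t_i) is built from the Bloomberg discount curve — using plain bisection for r*:

```python
import math

def B(a, t, T):
    # Hull-White B(t, T)
    return (1 - math.exp(-a * (T - t))) / a

def A(a, sigma, r0, t, T):
    # bond reconstitution A(t, T) on a flat curve, where f(0, t) = r0
    PT, Pt = math.exp(-r0 * T), math.exp(-r0 * t)
    return PT / Pt * math.exp(B(a, t, T) * r0
                              - sigma**2 / (4 * a) * (1 - math.exp(-2 * a * t)) * B(a, t, T)**2)

def hw_payer_swaption(a, sigma, r0, K, T, pay_times, notional=1.0):
    """Jamshidian decomposition, eq. (4.24)-(4.25): find r* so the coupon bond prices
    to par at T, then sum zero-coupon bond puts struck at K_i."""
    tau = [pay_times[0] - T] + [t2 - t1 for t1, t2 in zip(pay_times, pay_times[1:])]
    c = [K * t for t in tau]
    c[-1] += 1.0                                    # final exchange of notional
    bond = lambda rs: sum(ci * A(a, sigma, r0, T, ti) * math.exp(-B(a, T, ti) * rs)
                          for ci, ti in zip(c, pay_times)) - 1.0
    lo, hi = -1.0, 1.0                              # bond() is decreasing: bond(lo)>0>bond(hi)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if bond(mid) > 0 else (lo, mid)
    rstar = 0.5 * (lo + hi)
    N01 = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))
    price = 0.0
    for ci, ti in zip(c, pay_times):
        Ki = A(a, sigma, r0, T, ti) * math.exp(-B(a, T, ti) * rstar)
        sp = sigma * math.sqrt((1 - math.exp(-2 * a * T)) / (2 * a)) * B(a, T, ti)
        h = math.log(math.exp(-r0 * ti) / (math.exp(-r0 * T) * Ki)) / sp + sp / 2
        # zero-coupon bond put, i.e. (4.22) with w = -1
        price += ci * (Ki * math.exp(-r0 * T) * N01(-h + sp) - math.exp(-r0 * ti) * N01(-h))
    return notional * price
```

A payer swaption gains value as the fixed rate K falls below the par rate, which gives a simple monotonicity check.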

4.3.2 G2++ model

The G2++ model has three extra parameters that need to be taken into account. Following the framework of Brigo & Mercurio, the price of a swaption under the G2++ model is

S(0, T) = N\omega P(0, T)\int_{-\infty}^{\infty} \frac{e^{-\frac{1}{2}\left(\frac{x - \mu_x}{\sigma_x}\right)^2}}{\sigma_x\sqrt{2\pi}}\left[\Phi(-\omega h_1(x)) - \sum_{i=1}^{n}\lambda_i(x)e^{\kappa_i(x)}\Phi(-\omega h_2(x))\right]dx, \qquad (4.26)

where \omega = -1 (\omega = 1) for a receiver (payer) swaption. Expressions for the terms in (4.26) are given in Appendix B.

In the HW model a root-finding algorithm was needed to obtain r^*. In the G2++ model the same type of algorithm is used to obtain \bar{y} from the equation \sum_{i=1}^{n} c_i A(T, t_i)e^{-B(a, T, t_i)x - B(b, T, t_i)\bar{y}} = 1. Because \bar{y} depends on the integration variable x, a root has to be found for every point in the integration range. This procedure is computationally heavy, and as a result calibrating this model takes far longer than calibrating the HW model.

4.3.3 Results

Now that we have closed form expressions for the swaptions in our objective function, we can calibrate our models. In the Heston calibration the different weights were compared in terms of volatilities, which was possible because we could calculate model implied volatilities. This is not the case for the interest rate models, as we cannot go from swaption prices to swap volatilities. As a result, we simply compare the relative differences between market and model swaption prices, acknowledging that the results may be less reliable, although it is intuitive to expect the second weight to show the smallest error. Tables 2 and 3 show the calibration results for the HW and G2++ models; the performance of the weights can be found in the final column. Finally, the swap volatility surface can be found in Appendix C. Note that we only calibrate to tenors and maturities larger than 10 years.

For the HW model, the second weight does indeed perform better than the first, but the differences are minimal, and the parameters differ only in the fourth decimal place. One possible explanation is that there is hardly any variation within this part of the swap volatility surface and, as such, the weights perform almost equally. When looking at the entire surface, which will be shown later, the differences are indeed larger.

Surprisingly, the G2++ model performs nearly identically to the HW model. Again, due to the small variation within the swap volatilities, the addition of a second factor appears to be negligible. We observe that the parameters a and \sigma in the G2++ model are almost identical to those in the HW model. We also find that the volatility of the second factor tends to zero, implying that the G2++ model treats the second factor as constant. We must note that, due to the speed issues of the G2++ model and the observation that the relative weights performed better than the uniform weights in the HW model, we only calibrated the G2++ model with relative weights.

Weight    a        σ      Average error
Uniform   0.0451   0.01   1.9703%
Market    0.0454   0.01   1.9497%

Table 2: Hull-White calibration results

Weight    a        σ        b        η       ρ_xy      Average error
Market    0.0437   0.0101   0.6177   0.002   -0.6496   1.9274%

Table 3: Gaussian two-factor calibration results

To visualize the performance of the calibration, Figure 6 displays the relative error between market and model swaption prices. As we stated in the previous paragraph, both models seem to give identical results. We find that the difference in price tends to zero when both tenor and maturity are approximately equal (the diagonal of the surface). Prices start to differ substantially when this is not the case, especially for large maturity and relatively low tenor.

We found that the HW and G2++ models produce identical results here. But, as noted in Section 4.2.6, we also have to look at computational time. The HW model, although marginally less accurate, calibrates 25 swaptions in about one minute, whereas the G2++ model takes an hour. The integration in (4.26) is numerically inefficient, as it involves a root-finding algorithm for the entire surface and for each dx. We found that the parameters of one of the factors in the G2++ model were close to the parameters of the HW model. As a result, if one wants to find parameters for the G2++ model, we advise first calibrating the one-factor model, using the calibrated parameters a_{HW} and \sigma_{HW} as initial guesses for the corresponding two terms in the two-factor model, and randomizing the other parameters:

a_{G2++} = a_{HW} \qquad (4.27)
b_{G2++} = U(0, 1) \qquad (4.28)
\sigma_{G2++} = \sigma_{HW} \qquad (4.29)
\eta_{G2++} = U\left(0, \tfrac{1}{10}\right) \qquad (4.30)
\rho_{xy}^{G2++} = 2\,U(0, 1) - 1, \qquad (4.31)

where U(\cdot,\cdot) denotes a draw from a uniform distribution. This may reduce the time spent searching among local minima.
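The initialization (4.27)–(4.31) can be sketched as follows (function name and dictionary keys are our own naming):

```python
import random

def g2pp_initial_guess(a_hw, sigma_hw, rng=random):
    """Initial guess for the G2++ calibration, eq. (4.27)-(4.31):
    reuse the calibrated HW parameters for one factor, randomize the rest."""
    return {
        "a":      a_hw,                          # (4.27)
        "b":      rng.uniform(0.0, 1.0),         # (4.28)
        "sigma":  sigma_hw,                      # (4.29)
        "eta":    rng.uniform(0.0, 0.1),         # (4.30)
        "rho_xy": 2 * rng.uniform(0.0, 1.0) - 1, # (4.31)
    }
```

In a full calibration one would draw several such starting points and keep the best local optimum.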

Figure 6: left: Difference market and Hull-White model right: Difference market and Gaussian two-factor model.

As the variation within the surface used above was small, it is useful to look at the full volatility surface. Figure 7 shows the calibration results for the full surface. The parameters obtained were a = 0.0188 and \sigma = 0.0067 for the HW model, and a = 0.0506, b = 0.5231, \sigma = 0.0121, \eta = 0.0235 and \rho_{xy} = -0.9907 for the G2++ model. The average difference between prices is 16.02% for the HW model and 5.72% for the G2++ model. Comparing the weights, we observe that the uniform weights in the HW model show an average price difference of 17.40%, where the parameters were a = 0.0298 and \sigma = 0.0079. This is a much larger difference than for the smaller, less varying surface.

The first main observation is that the G2++ model now performs much better than the HW model. The HW model is unable to capture the variation within the full volatility surface, while the two-factor model partially can. Inspecting the parameters, we see that the volatility of the factor with large mean reversion is now much larger (0.0235) than in the smaller surface (0.002). We thus find that the second factor, which behaved as a constant in the small surface, now has a larger volatility than the first factor.

The very negative \rho, which reduces the hump at low maturity and tenor, is surprising, as Brigo & Mercurio state that swaptions, contrary to caps, contain information regarding the correlation between forward rates. It must be noted that they used a surface from 2001, with far less variation than the one used here. Also, for the smaller surface in Figure 6, which has little variation, we find a value of \rho far away from -1. In general, if one wants the G2++ model to reproduce the hump at low tenor and maturity, the lower bound for \rho has to be set above -1. Appendix C shows the calibration results of the HW and G2++ models for the swap volatility surfaces of two different dates, as well as a comparison between our results and those of Brigo & Mercurio.

Figure 7: left: Difference market and Hull-White model right: Difference market and Gaussian two-factor model.

Finally, it must be noted that as \rho \approx -1 the G2++ model tends to a one-factor short rate process. But because a \neq b, the process will still be non-Markovian (Brigo & Mercurio, 2007), which is the main reason why the G2++ model outperforms its one-factor counterpart. So, given the computational time, we advise using the one-factor model when there is little variation within the surface. But as soon as one wants to calibrate to a larger (or even the full) surface, it is beneficial to use the G2++ model.


5 Monte Carlo Simulation

As there is no closed form solution for the GAO under the hybrid models we specify, we resort to pricing the option with a Monte Carlo analysis. There are several schemes to discretize the Heston model; we compare four: the Euler and Milstein schemes, which do not take the true distribution of the volatility process into account, and the Truncated Gaussian and Quadratic Exponential schemes, which do.

5.1 Heston Hull-White/Gaussian two-factor model

Recall that the Heston Hull-White model can be expressed as three SDEs,

d\ln S(t) = \left(r(t) - \tfrac{1}{2}v(t)\right)dt + \sqrt{v(t)}\,d\hat{W}_S(t)
dv(t) = \kappa(\theta - v(t))dt + \lambda\sqrt{v(t)}\,dW_v(t)
dr(t) = (\theta(t) - ar(t))dt + \sigma\,dW_r(t), \qquad (5.1)

and the Heston G2++ model only alters the final SDE according to (3.5) and (3.6). Since S(t) is a function of r(t) and v(t), we must first propose schemes for the volatility and the interest rate before discretizing the stock. We start with the interest rate models. For discretizing the HW model, all we need to do is generate the correlated Brownian motions using the Cholesky decomposition obtained in Section 3; as we assume independence between the volatility and the interest rate, we find d\hat{W}_r = dW_r.

For simulating r, Brigo & Mercurio show that the short rate can be written as

r(t) = x(t) + \alpha(t), \qquad dx(t) = -ax(t)dt + \sigma\,dW_r(t), \qquad (5.2)

where x(t) is an Ornstein-Uhlenbeck process and \alpha(t) = f^M(0, t) + \frac{\sigma^2}{2a^2}\left(1 - e^{-at}\right)^2 fits the current term structure. We can then simulate x as

x(t + dt) = x(t)e^{-a\,dt} + \sqrt{\frac{\sigma^2}{2a}\left(1 - e^{-2a\,dt}\right)}\,Z_r. \qquad (5.3)

We then only need to insert the simulated path for x into (5.2) to obtain r. As for the G2++ model, we obtain d\hat{W}_x = dW_x and d\hat{W}_y = \rho_{xy}dW_x + \sqrt{1 - \rho_{xy}^2}\,dW_y. We can then simulate paths for x and y as

x(t + dt) = x(t)e^{-a\,dt} + \sqrt{\frac{\sigma^2}{2a}\left(1 - e^{-2a\,dt}\right)}\,Z_x
y(t + dt) = y(t)e^{-b\,dt} + \sqrt{\frac{\eta^2}{2b}\left(1 - e^{-2b\,dt}\right)}\left(\rho_{xy}Z_x + \sqrt{1 - \rho_{xy}^2}\,Z_y\right). \qquad (5.4)
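These recursions are exact in distribution, so unlike an Euler step they introduce no time-discretization bias. A sketch of the single-factor step (ours, with illustrative rather than calibrated parameters):

```python
import numpy as np

def ou_exact_step(x, a, sigma, dt, z):
    """One exact step of the OU factor, eq. (5.3)/(5.4):
    x(t+dt) = x(t) e^{-a dt} + sqrt(sigma^2/(2a) * (1 - e^{-2a dt})) * Z."""
    return x * np.exp(-a * dt) + np.sqrt(sigma**2 / (2 * a) * (1 - np.exp(-2 * a * dt))) * z

# simulate 10 years in quarterly steps (illustrative parameters)
rng = np.random.default_rng(0)
n_paths, a, sigma, dt = 200_000, 0.05, 0.01, 0.25
x = np.zeros(n_paths)
for _ in range(40):
    x = ou_exact_step(x, a, sigma, dt, rng.standard_normal(n_paths))
```

Because the step is exact, the sample variance after any horizon t should match the analytic value σ²/(2a)·(1 − e^{−2at}) regardless of the step size.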

As for the discretization of the curve that fits the term structure, Brigo & Mercurio show that the integral \int_t^{t+dt}\varphi(u)\,du is known analytically; this will be shown in Section 5.3.1.

There are many schemes available for simulating the volatility process; they differ mainly in speed and bias. In the coming sections we discuss the properties of the volatility process and present four schemes to discretize it, coupled with two schemes for the stock process.

5.2 Volatility process

According to Anderson (2006) the true distribution of v(t + dt), conditional on v(t), is a constant C times a non-central chi-square distribution with non-centrality parameter \lambda_{nc} and d degrees of freedom:

P(v(t + dt) \leq x \mid v(t)) = F_{\chi^2_d(\lambda_{nc})}\left(\frac{x}{C}\right) = \sum_{i=0}^{\infty}\frac{e^{-\lambda_{nc}/2}\left(\frac{\lambda_{nc}}{2}\right)^i}{i!}\;\frac{\int_0^{x/C} u^{i + d/2 - 1}e^{-u/2}\,du}{2^{i + d/2}\,\Gamma\!\left(i + \frac{d}{2}\right)}, \qquad (5.5)

where C := \frac{\lambda^2}{4\kappa}\left(1 - e^{-\kappa\,dt}\right), d := \frac{4\kappa\theta}{\lambda^2} and \lambda_{nc} := \frac{e^{-\kappa\,dt}v(t)}{C} (we write \lambda_{nc} to avoid confusion with the vol-of-vol \lambda). He notes that if v(t) is large, the non-centrality parameter approaches \infty, and v(t + dt) can be approximated by a Gaussian variable whose first two moments are matched to the non-central chi-square distribution, as shown in the next paragraph.

But for small v(t), the non-centrality parameter approaches zero. As a result, the distribution of v(t + dt) tends to an ordinary chi-square distribution, whose density is very large around 0 due to the small number of degrees of freedom (DoF) implied by the calibration results. This is illustrated in Figure 8, where the solid line corresponds to the calibrated parameters. For the remainder of the schemes

Figure 8: Chi-square distribution.

we will denote the mean and variance of v(t + dt) conditional on v(t) as

m = E[v(t + dt) \mid v(t)] = \theta + (v(t) - \theta)k_1, \qquad (5.6)

s^2 = Var[v(t + dt) \mid v(t)] = \frac{v(t)\lambda^2 k_1}{\kappa}(1 - k_1) + \frac{\theta\lambda^2}{2\kappa}(1 - k_1)^2, \qquad (5.7)

where k_1 = e^{-\kappa\,dt}.

5.2.1 Euler and Milstein Scheme

An Euler scheme is the simplest way to discretize the SDEs of the Heston model and does not use the properties of the process described in the previous subsection. It approximates the integrals by the left Riemann sum. One drawback is that the discretized process can become negative, but if handled correctly the Euler scheme is computationally the most efficient. Additionally, it is easily implemented and extends to models other than the Heston model. Putting the process in integral form, we obtain

v(t + dt) = v(t) + \int_t^{t+dt}\kappa(\theta - v(u))\,du + \int_t^{t+dt}\lambda\sqrt{v(u)}\,dW_{v,u}. \qquad (5.8)

We can then use the left Riemann sum to approximate the integrals as

\int_t^{t+dt}\kappa(\theta - v(u))\,du \approx \kappa(\theta - v(t))dt, \qquad \int_t^{t+dt}\lambda\sqrt{v(u)}\,dW_{v,u} \approx \lambda\sqrt{v(t)}\,(W_{t+dt} - W_t) = \lambda\sqrt{v(t)\,dt}\,Z_v, \qquad (5.9)

where Z_v \sim N(0, 1). This leaves us with an Euler scheme for the stochastic volatility process defined as

v(t + dt) = v(t) + \kappa(\theta - v(t))dt + \lambda\sqrt{v(t)\,dt}\,Z_v. \qquad (5.10)

But we then observe that the probability that the process becomes negative,

P(v(t + dt) < 0) = P\left(Z_v < \frac{-v(t) - \kappa(\theta - v(t))dt}{\lambda\sqrt{v(t)\,dt}}\right) = \Phi\left(\frac{-v(t) - \kappa(\theta - v(t))dt}{\lambda\sqrt{v(t)\,dt}}\right), \qquad (5.11)

is positive. Thus, if we want to make sure that the process does not fall below zero, we must adapt the scheme. Lord et al. (2008) proposed modifications of (5.10) that apply either the identity, truncation, or reflection to the volatility terms. They argue that the best of the proposed schemes sets the first term to its identity (v(t) = v(t)) and truncates the other two terms (v(t) \to v(t)^+ = \max(v(t), 0)), yielding the full truncation Euler scheme

v(t + dt) = v(t) + \kappa(\theta - v(t)^+)dt + \lambda\sqrt{v(t)^+\,dt}\,Z_v. \qquad (5.12)
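A sketch of (5.12) follows (ours; the parameters below are illustrative, not the calibrated ones). Note that v itself is allowed to go negative — only the arguments of the drift and diffusion are truncated:

```python
import numpy as np

def euler_full_truncation(v0, kappa, theta, lam, dt, n_steps, z):
    """Full-truncation Euler for the variance, eq. (5.12):
    v <- v + kappa*(theta - v+)*dt + lam*sqrt(v+ * dt)*Z, with v+ = max(v, 0).
    z has shape (n_steps, n_paths); returns all intermediate values."""
    v = np.full(z.shape[1], v0)
    path = [v.copy()]
    for i in range(n_steps):
        vp = np.maximum(v, 0.0)
        v = v + kappa * (theta - vp) * dt + lam * np.sqrt(vp * dt) * z[i]
        path.append(v.copy())
    return np.array(path)

rng = np.random.default_rng(1)
z = rng.standard_normal((200, 20_000))
paths = euler_full_truncation(0.02, 2.0, 0.04, 0.3, 0.05, 200, z)
```

With the vol-of-vol set to zero the recursion is deterministic and converges geometrically to θ, which gives a bias-free check of the drift handling.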

The Milstein scheme is related to the Euler scheme discussed above, but its accuracy is higher because it uses a second-order approximation of the SDE. As a result, one more term is added to (5.12). Introducing \sigma_t = \lambda\sqrt{v(t)}, the Euler scheme is extended with

\int_t^{t+dt}\int_t^s \sigma_u'\sigma_u\,dW_u\,dW_s = \tfrac{1}{4}\lambda^2\,dt\left(Z_v^2 - 1\right), \qquad (5.13)

which follows from applying It\^o's Lemma to obtain \int_t^{t+dt} W_s\,dW_s = \tfrac{1}{2}\left(W_{t+dt}^2 - W_t^2 - dt\right).

5.2.2 Truncated Gaussian scheme

The TG scheme (Anderson, 2006) uses the properties of the process described in Section 5.2. It samples from a moment-matched truncated Gaussian density: when v(t) is large, v(t + dt) is approximated by a Gaussian variable whose first two moments match m and s^2, while for small v(t) the true distribution approaches an ordinary chi-square, as discussed in Section 5.2. The TG scheme can be denoted as

v(t + dt) = (\mu + \sigma Z_v)^+, \qquad (5.14)

where \mu and \sigma depend on the Heston parameters, dt and v(t), and Z_v is drawn from a standard normal distribution.

The moment matching is then done as follows. Match m from (5.6) to the first moment of (5.14):

m = E[v(t + dt)] = \int_{-\mu/\sigma}^{\infty}(\mu + \sigma x)\phi(x)\,dx = \left[\mu\Phi(x) - \sigma\phi(x)\right]_{-\mu/\sigma}^{\infty} = \mu\Phi\left(\frac{\mu}{\sigma}\right) + \sigma\phi\left(\frac{\mu}{\sigma}\right), \qquad (5.15)

where the lower integration limit is chosen such that (5.14) is positive, and we used \int x\phi(x)\,dx = -\phi(x) and \mu\left(\Phi(\infty) - \Phi\left(-\frac{\mu}{\sigma}\right)\right) = \mu\Phi\left(\frac{\mu}{\sigma}\right) via \Phi(-x) = 1 - \Phi(x). After denoting r = \frac{\mu}{\sigma} we can express the above equation as m = \sigma(r\Phi(r) + \phi(r)), and then obtain the following expressions for \sigma and \mu:

\sigma = \frac{m}{r\Phi(r) + \phi(r)}, \qquad (5.16)

\mu = \frac{m}{\Phi(r) + r^{-1}\phi(r)}, \qquad (5.17)

where \mu = r\sigma.

Then we are able to match the second moment:

m^2 + s^2 = E[v(t + dt)^2] = \int_{-\mu/\sigma}^{\infty}(\mu + \sigma x)^2\phi(x)\,dx = \mu^2\Phi\left(\frac{\mu}{\sigma}\right) + \sigma\mu\,\phi\left(\frac{\mu}{\sigma}\right) + \sigma^2\Phi\left(\frac{\mu}{\sigma}\right). \qquad (5.18)


Now, given the expressions for \sigma and \mu above and substituting r = \frac{\mu}{\sigma}, we can simplify (5.18) to

m^2 + s^2 = m^2\left[\frac{1}{\Phi(r) + r^{-1}\phi(r)} + \frac{\Phi(r)}{(r\Phi(r) + \phi(r))^2}\right]. \qquad (5.19)

If we then denote \psi = \frac{s^2}{m^2}, (5.19) can be rearranged to

r\phi(r) + \Phi(r)(1 + r^2) = (1 + \psi)(\phi(r) + r\Phi(r))^2,

which depends only on \psi. Given that r is a function of \psi we can now reformulate (5.16) and (5.17) as

\sigma = f_\sigma(\psi)\,s, \qquad f_\sigma(\psi) = \frac{\psi^{-1/2}}{\phi(r(\psi)) + r(\psi)\Phi(r(\psi))}, \qquad (5.20)

\mu = f_\mu(\psi)\,m, \qquad f_\mu(\psi) = \frac{r(\psi)}{\phi(r(\psi)) + r(\psi)\Phi(r(\psi))}. \qquad (5.21)

Here, r is a nonlinear function of \psi which has to be recovered with a root-finding algorithm. For simulating v(t) we therefore create a cache of the values of f_\mu(\psi) and f_\sigma(\psi), so that they can be looked up for the \psi obtained at each time step. Figure 9 shows what r, f_\mu(\psi) and f_\sigma(\psi) look like for \psi \in \left[\frac{1}{52}, \frac{\lambda^2}{2\kappa\theta}\right], where the lower bound is a value close to zero and the upper bound is the maximum of \psi; the parameters are taken from the calibration results. For large values of v(t), \psi will be small, f_\mu(\psi) and f_\sigma(\psi) will be close to one, and v(t + dt) approximates a Gaussian distribution. For smaller values of v(t), f_\mu(\psi) will be smaller and f_\sigma(\psi) larger than one, and the scheme produces a mass at zero, approximating an ordinary chi-square distribution with a low number of DoF. The two schemes discussed in Section 5.2.1 implicitly assume f_\mu(\psi) \approx f_\sigma(\psi) \approx 1 for all \psi.

Thus, for simulating v(t + dt), we first calculate the moments m and s^2, which depend on v(t). Given these two moments, we compute \psi and look up f_\mu(\psi) and f_\sigma(\psi) in the cache created earlier. We can then compute \mu and \sigma, draw a standard normal variable, and set v(t + dt) as in (5.14).
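The root-finding and caching step can be sketched as follows (our illustration; plain bisection in place of a library root-finder, with a bracket informed by the large-r behaviour r ≈ ψ^{-1/2}):

```python
import math

def tg_r(psi):
    """Solve r*phi(r) + Phi(r)*(1 + r^2) = (1 + psi)*(phi(r) + r*Phi(r))^2 by bisection.
    In practice r(psi), f_mu(psi) and f_sigma(psi) are cached on a grid of psi values."""
    phi = lambda x: math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)
    Phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))
    g = lambda r: r * phi(r) + Phi(r) * (1 + r * r) - (1 + psi) * (phi(r) + r * Phi(r))**2
    lo, hi = -3.0, 2.0 / math.sqrt(psi) + 3.0    # g(lo) > 0 > g(hi) for realistic psi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

def tg_f_mu_f_sigma(psi):
    """f_mu and f_sigma of (5.20)-(5.21)."""
    r = tg_r(psi)
    phi = math.exp(-0.5 * r * r) / math.sqrt(2 * math.pi)
    Phi = 0.5 * (1 + math.erf(r / math.sqrt(2)))
    denom = phi + r * Phi
    return r / denom, psi**-0.5 / denom
```

For small ψ both factors should be close to one (Gaussian regime), while for large ψ one finds f_μ < 1 < f_σ, matching Figure 9.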

5.2.3 Quadratic Exponential scheme (QE)

Anderson noted that although the TG scheme is less biased than an Euler scheme, it does not perform well when v(t) is small, because the density then decays too fast: (5.14) produces too many values of v(t + dt) that are too small. Moreover, the root-finding for r(\psi) required to cache f_\mu(\psi) and f_\sigma(\psi) slows the scheme down considerably, especially if \lambda is large or if \kappa and/or \theta is small.

In the QE scheme, Anderson proposes that for large values of v(t), v(t + dt) can be approximated by a quadratic function of a Gaussian variable, i.e. a scaled non-central chi-square distribution with one DoF:

v(t + dt) = a(b + Z_v)^2. \qquad (5.22)


Figure 9: Functions of the Truncated Gaussian scheme

By matching the first two moments of (5.22) to those of the true distribution, we obtain

m = E[v(t + dt)] = a(1 + b^2) \;\Rightarrow\; a = \frac{m}{1 + b^2}

s^2 = Var[v(t + dt)] = 2a^2(1 + 2b^2) \;\Rightarrow\; b^2 = z - 1 + \sqrt{z}\sqrt{z - 1}, \qquad (5.23)

where z = 2\psi^{-1} and \psi is defined below (5.19).

For small values of v(t) he notes that this moment matching fails, and thus proposes to replace (5.22) by sampling from a density that approximates the asymptotic density of an ordinary chi-square distribution,

P(v(t + dt) \in [x, x + dx]) \approx \left(p\,\delta(x) + \beta(1 - p)e^{-\beta x}\right)dx, \qquad x \geq 0, \qquad (5.24)

where \delta is the Dirac delta function, giving a point mass at x = 0 just like the true distribution of the volatility process. To sample from (5.24), Anderson notes that the inverse of its CDF is available in closed form. Integrating,

\Psi(x) = \int_0^x\left(p\,\delta(u) + \beta(1 - p)e^{-\beta u}\right)du = p + (1 - p)\left(1 - e^{-\beta x}\right), \qquad (5.25)

we can then compute v(t + dt) as

v(t + dt) = \Psi^{-1}(u; p, \beta) = \begin{cases} 0, & \text{if } 0 \leq u \leq p \\ \beta^{-1}\ln\frac{1 - p}{1 - u}, & \text{if } p < u \leq 1, \end{cases} \qquad (5.26)

where u is a uniform random number. Then the only thing left to do is to compute both p and \beta. Anderson argues that these can be obtained by matching m and s^2 to the mean and variance of (5.24). He obtains

p = \frac{\psi - 1}{\psi + 1}, \qquad (5.27)

\beta = \frac{1 - p}{m}. \qquad (5.28)

Given these two results we are now fully able to implement the scheme.

Thus, for simulating v(t + dt), we first calculate the moments m and s^2, which depend on v(t), and from them compute \psi. Anderson proposes to use the Gaussian approximation if \psi \leq 1.5 and the chi-square approximation otherwise. For the Gaussian approximation, a and b are computed and v(t + dt) is set according to (5.22); for the chi-square approximation, \beta and p are computed and v(t + dt) is set as in (5.26).
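Putting the pieces together, one QE update can be sketched as follows (our illustration; ψ_c = 1.5 as suggested above, vectorized over paths, with illustrative rather than calibrated parameters):

```python
import numpy as np

def qe_step(v, kappa, theta, lam, dt, rng, psi_c=1.5):
    """One QE update of the variance vector v, following the switching rule above."""
    k1 = np.exp(-kappa * dt)
    m = theta + (v - theta) * k1                                         # eq. (5.6)
    s2 = v * lam**2 * k1 / kappa * (1 - k1) + theta * lam**2 / (2 * kappa) * (1 - k1)**2  # (5.7)
    psi = s2 / m**2
    out = np.empty_like(m)
    quad = psi <= psi_c                       # quadratic Gaussian branch, (5.22)-(5.23)
    z = 2.0 / psi[quad]
    b2 = z - 1 + np.sqrt(z) * np.sqrt(z - 1)
    a = m[quad] / (1 + b2)
    out[quad] = a * (np.sqrt(b2) + rng.standard_normal(quad.sum()))**2
    expo = ~quad                              # exponential branch, (5.26)-(5.28)
    p = (psi[expo] - 1) / (psi[expo] + 1)
    beta = (1 - p) / m[expo]
    u = rng.uniform(size=expo.sum())
    out[expo] = np.where(u <= p, 0.0, np.log((1 - p) / (1 - u)) / beta)
    return out

rng = np.random.default_rng(2)
v1 = qe_step(np.full(1_000_000, 0.05), 0.3, 0.04, 0.9, 1.0, rng)   # psi > 1.5: exponential
v2 = qe_step(np.full(200_000, 0.05), 2.0, 0.04, 0.3, 0.25, rng)    # psi < 1.5: quadratic
```

By construction both branches match the conditional mean m exactly, which gives a direct statistical check.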

One could also construct a scheme by sampling directly from the true distribution of the volatility process instead of matching its moments; this requires inverting a non-central chi-square distribution. This method is too slow in practice, however, and fast approximations of the inverse have proven difficult (Anderson, 2006).
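For reference, such exact sampling is simple to sketch with a library routine for the non-central chi-square distribution, although, as noted, it is considerably slower than the QE scheme. The sketch below uses SciPy's `ncx2`; the function name and constants are ours, with `lam` again the vol-of-vol.

```python
import numpy as np
from scipy.stats import ncx2

def cir_exact_step(v, dt, kappa, theta, lam, rng):
    """Sample v(t + dt) exactly: the CIR transition law is a scaled
    non-central chi-square, c * chi'^2(d, nc), with the constants below."""
    c = lam**2 * (1.0 - np.exp(-kappa * dt)) / (4.0 * kappa)
    d = 4.0 * kappa * theta / lam**2            # degrees of freedom
    nc = v * np.exp(-kappa * dt) / c            # non-centrality parameter
    return c * ncx2.rvs(d, nc, random_state=rng)
```

Since E[c χ'²(d, nc)] = c(d + nc) = θ(1 − e^{−κdt}) + v e^{−κdt}, the scheme reproduces the conditional mean exactly.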

5.3 Stock Process

Given the paths of the volatility and interest rate processes, we are now able to simulate the stock prices. This can be done in two ways. The first is to discretize using an Euler scheme, in the same manner as the volatility process:

ln S(t + dt) = ln S(t) + ∫_t^{t+dt} (r(u) − ½ v(u)) du + ∫_t^{t+dt} √v(u) dŴ_{S,u},   (5.29)

where we have put the SDE into integral form, and where r and v were simulated previously. After applying the left Riemann sum, we only need the correlated Brownian motion dŴ_{S,u}, which can be decomposed using the Cholesky decomposition stated in Section 3:

dŴ_{S,u} = ρ_vS dW_{v,u} + ρ_rS dW_{r,u} + √(1 − ρ_vS² − ρ_rS²) dW_{S,u}.

Truncating the volatility process as described in Section 5.2, we obtain the Euler discretizations for the two hybrid models:

ln S(t + dt) = ln S(t) + (r_HW(t) − ½ v(t)⁺) dt + √(v(t)⁺ dt) (ρ_vS Z_v + ρ_rS Z_r + √(1 − ρ_vS² − ρ_rS²) Z_S),   (5.30)

ln S(t + dt) = ln S(t) + (r_G2++(t) − ½ v(t)⁺) dt + √(v(t)⁺ dt) (ρ_vS Z_v + ρ_xS Z_x + ((ρ_yS − ρ_xS ρ_xy)/√(1 − ρ_xy²)) Z_y + √(1 − ρ_vS² − ρ_xS² − ((ρ_yS − ρ_xS ρ_xy)/√(1 − ρ_xy²))²) Z_S).   (5.31)

For the Milstein scheme the diffusion coefficient of the log stock price does not depend on S, so the Milstein correction vanishes and the discretization is unchanged.
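A single Euler step of (5.30) for the Heston Hull-White hybrid can be sketched as follows; the names are ours, `r_hw` is the previously simulated short rate, `v` the variance, and `zv`, `zr`, `zs` independent standard normals.

```python
import numpy as np

def euler_logstock_step_hw(lnS, r_hw, v, dt, rho_vS, rho_rS, zv, zr, zs):
    """One Euler step of eq. (5.30): log-stock under the Heston Hull-White hybrid."""
    vp = max(v, 0.0)                  # full truncation v(t)^+
    drift = (r_hw - 0.5 * vp) * dt
    # correlated shock from the Cholesky decomposition of dW_S
    corr = (rho_vS * zv + rho_rS * zr
            + np.sqrt(1.0 - rho_vS**2 - rho_rS**2) * zs)
    return lnS + drift + np.sqrt(vp * dt) * corr
```

The G2++ version (5.31) differs only in the Cholesky weights on the two rate factors.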

5.3.1 Leaking Correlation

One large problem with discretizing the SDE as above, when taking the volatility process of Section 5.2 into account, is that the correlation between the stock and the volatility will be closer to 0 than to ρ_vS (Anderson, 2006). This is due to the fact that v(t + dt) and Z_v have a non-linear relationship. As a result, the scheme above will simulate paths of S with a poor fit in the tails, implying that options with strikes far ITM or OTM will not be priced accurately.

To solve this, we can rearrange (5.8) in such a way that we obtain an expression for the dW_{v,u} term in the Cholesky decomposition:

∫_t^{t+dt} √v(u) dW_{v,u} = (1/λ) (v(t + dt) − v(t) − κθ dt + κ ∫_t^{t+dt} v(u) du).   (5.32)


Combining dŴ_{S,u} and (5.32), we obtain for the final term in (5.29)

∫_t^{t+dt} √v(u) dŴ_{S,u} = ρ_vS ∫_t^{t+dt} √v(u) dW_{v,u} + ρ_rS ∫_t^{t+dt} √v(u) dW_{r,u} + √(1 − ρ_vS² − ρ_rS²) ∫_t^{t+dt} √v(u) dW_{S,u}

  = (ρ_vS/λ) (v(t + dt) − v(t) − κθ dt + κ ∫_t^{t+dt} v(u) du) + ρ_rS ∫_t^{t+dt} √v(u) dW_{r,u} + √(1 − ρ_vS² − ρ_rS²) ∫_t^{t+dt} √v(u) dW_{S,u}

  = (ρ_vS/λ) (v(t + dt) − v(t) − κθ dt + κ (dt/2)(v(t) + v(t + dt))) + ρ_rS √((dt/2)(v(t) + v(t + dt))) Z_r + √((dt/2)(1 − ρ_vS² − ρ_rS²)(v(t) + v(t + dt))) Z_S,   (5.33)

where ∫_t^{t+dt} √v(u) dW_{L,u} ≈ √((dt/2)(v(t) + v(t + dt))) Z_L.

Combining this with (5.29), and letting K = (dt/2)(v(t) + v(t + dt)), we obtain the full discretization for the stock process

ln S(t + dt) = ln S(t) + ∫_t^{t+dt} r(u) du + (ρ_vS/λ)(v(t + dt) − v(t) − κθ dt) + (ρ_vS κ/λ − ½) K + ρ_rS √K Z_r + √((1 − ρ_vS² − ρ_rS²) K) Z_S

            = ln S(t) + ∫_t^{t+dt} r(u) du + K₀ + K₁ v(t) + K₂ v(t + dt) + ρ_rS √K Z_r + √((1 − ρ_vS² − ρ_rS²) K) Z_S,   (5.34)

where K₀ = −dt ρ_vS κθ/λ, K₁ = (dt/2)(ρ_vS κ/λ − ½) − ρ_vS/λ, K₂ = (dt/2)(ρ_vS κ/λ − ½) + ρ_vS/λ, ∫_t^{t+dt} r(u) du ≈ (dt/2)(r(t) + r(t + dt)), and we assume that the dividend is zero.
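The resulting update (5.34) can be sketched as follows; the names are ours, `v_t` and `v_tdt` come from the QE variance scheme, `r_t` and `r_tdt` from the simulated short-rate path, and `lam` is the vol-of-vol λ.

```python
import numpy as np

def corrected_logstock_step(lnS, r_t, r_tdt, v_t, v_tdt, dt,
                            kappa, theta, lam, rho_vS, rho_rS, zr, zs):
    """One step of eq. (5.34): stock update that avoids leaking correlation
    by substituting (5.32) for the variance Brownian increment."""
    K = 0.5 * dt * (v_t + v_tdt)
    K0 = -dt * rho_vS * kappa * theta / lam
    K1 = 0.5 * dt * (rho_vS * kappa / lam - 0.5) - rho_vS / lam
    K2 = 0.5 * dt * (rho_vS * kappa / lam - 0.5) + rho_vS / lam
    r_int = 0.5 * dt * (r_t + r_tdt)      # trapezoidal rule for the rate integral
    return (lnS + r_int + K0 + K1 * v_t + K2 * v_tdt
            + rho_rS * np.sqrt(K) * zr
            + np.sqrt((1.0 - rho_vS**2 - rho_rS**2) * K) * zs)
```

Note that K₀ + K₁ v(t) + K₂ v(t + dt) is algebraically identical to the first form in (5.34), which the test below verifies for one parameter set.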

As for the Gaussian two-factor model, we find

dŴ_{S,u} = ρ_vS dW_{v,u} + ρ_xS dW_{x,u} + ((ρ_yS − ρ_xS ρ_xy)/√(1 − ρ_xy²)) dW_{y,u} + √(1 − ρ_vS² − ρ_xS² − ((ρ_yS − ρ_xS ρ_xy)/√(1 − ρ_xy²))²) dW_{S,u},

after which the stock discretization follows along the same lines as (5.34).
