
Comparison of Black-Scholes, g-and-h distribution and Variance Gamma in modelling S&P500 index option prices

Thesis

2013-08-26

Author:

Erwin Abrahamse (E.C.Abrahamse@asys.nl) MBA PT2 - 6160166

Supervisor:


Summary

The standard Black-Scholes option pricing model is not able to fully replicate the skewness and kurtosis observed in the distribution of asset returns. Models based on the g-and-h distribution and the Variance Gamma model are able to introduce these higher moments into the asset return distribution, which should make them better at replicating observed option prices. In this paper, these models have been calibrated to more than 500,000 historical S&P500 index option prices from the 5-year period 2007-2011. It appears that models based on a g-and-h transformation are not able to improve upon the Black-Scholes model in terms of the sum of squared errors. The calibrated Variance Gamma model is more successful in replicating observed option prices, generating the smallest sum of squared errors both during calibration and for an out-of-sample dataset of almost 12,000 option prices from early 2012. However, this accuracy comes at the price of increased mathematical complexity, resulting in more than a 200-fold increase in required computation time, from 2 µs to more than 400 µs per option price.


Table of Contents

INTRODUCTION

I. PREVIOUS RESEARCH

II. THE OPTION PRICE MODELS

III. DATASET ANALYSIS

IV. ESTIMATING MODEL PARAMETERS

V. MODEL COMPARISON

VI. CONCLUSIONS

APPENDIX A – DERIVATION OF EQUATIONS (25) AND (30)

APPENDIX B – VARIANCE GAMMA Ψ FUNCTION

APPENDIX C – S&P500 INDEX OPTIONS STATISTICS

APPENDIX D – MATLAB CODE


Introduction

As a derivative financial instrument, the price of an option depends on many factors: the difference between the current price of the underlying asset and the option exercise price, the volatility of the asset's price, the interest rate, the time to maturity of the option, and so on. There are no-arbitrage boundaries within which the option price must lie, but these still leave significant room for the option price to float.

Most well-known option pricing models assume that the underlying asset prices follow a geometric Brownian motion and are lognormally distributed. This implies that the underlying asset returns should follow a Gaussian distribution. In practice, however, the distribution of stock returns differs strongly from a normal distribution with the same mean and standard deviation. Empirical return distributions often have fatter left and right tails than a normal distribution, which indicates a higher probability of extreme stock price movements than a normal distribution would predict. In addition, stock returns are usually not symmetric around the mean, unlike normal distributions.

These deviations from lognormality cause models which assume a normal distribution of asset returns, like the famous Black-Scholes model (Black and Scholes (1973)), to generate option prices which are not consistent with the risks taken in practice. To accommodate these deviations, skewness and kurtosis must be incorporated in the option pricing models. By introducing these so-called higher moments, the modelled distribution of asset returns should "fit" the actual distribution better, thereby generating option prices which are more consistent with observed asset price movements.

Several ways have been proposed to introduce skewness and kurtosis in option pricing models, among which:

• The g-, h- and g-and-h distributions introduced by Tukey (1977) with thorough analysis by Hoaglin (1985). This model was successfully applied to model daily returns of individual stocks traded on the New York Stock Exchange by Badrinath and Chatterjee (1988, 1991), to model FTSE Index return distributions by Mills (1995), to model LIBOR rates (Dutta and Babbel (2002)) and to price interest rate options (Dutta and Babbel (2005)).


• The Variance Gamma model introduced by Madan and Seneta (1990).

This model was successfully applied to model option prices by Madan, Carr and Chang (1998).

In this document option pricing models based on the methods above as well as the standard Black-Scholes model will be implemented in MATLAB® and their parameters will be calibrated to historical S&P500 index option prices over the 5-year period 2007 to 2011. Then we will compare their effectiveness in approaching observed implied volatilities in the option market.

In section I we will look at previous relevant research. In section II we will discuss the option pricing models: Black-Scholes, the g-and-h distribution-based models and the Variance Gamma model. In section III the S&P500 index return distribution is analyzed and a dataset of historical S&P500 index call option prices for the period 2007-2011 is described. In section IV this dataset is used to estimate the parameters of the option price models. In section V the calibrated models are compared to determine which model replicates the historical prices and the historically observed implied volatilities best. Finally, we will give our conclusions in section VI, along with possible directions for further research.


I. Previous research

Since Fama (1965) it has been known that daily stock returns display more kurtosis than is consistent with a Gaussian distribution. In addition, skewness is present as well, even when measured with more robust estimators (Kim and White (2003)). These deviations cause the famous Black-Scholes option pricing model (Black and Scholes (1973)) to generate incorrect prices, because this model assumes a normal distribution of asset returns by taking the following stochastic process for the underlying asset price S(t):

$$S(t) = S(0)\,e^{\mu t + \sigma W} \tag{1}$$

with μ the expected rate of return for this asset, σ the volatility of the asset returns and W a standard Brownian motion.

To accommodate those deviations from normality, many different models have been proposed, some of which are reviewed and compared by Kon (1984) and Akgiray and Booth (1988). However, they all seem to suffer from computational complexity and relative inflexibility in modelling the distributional shapes found in stock return data.

Tukey (1977) introduced the g-and-h distribution, which was later examined by Martinez and Iglewicz (1984) and Hoaglin (1985). This transformation creates a distribution which incorporates skewness and kurtosis using parameters g and h respectively:

$$Y_{g,h}(Z) = \frac{e^{gZ} - 1}{g}\,e^{\frac{hZ^2}{2}} \tag{2}$$

for a standard normal variable Z.

This distribution was successfully applied to US index return data and stock return data by Badrinath and Chatterjee (1988, 1991). In Badrinath and Chatterjee (1988) they use a percentile-based approach to estimate parameters g and h to fit daily as well as monthly returns on the CRSP value-weighted index and CRSP equal-weighted index for the period 1962-1985 (CRSP = Center for Research in Security Prices). This approach is similar to the method of moments, but allows multiple parameter estimates; moreover, maximum likelihood estimation and the method of moments are computationally quite tedious. A value for h is estimated conditionally on an estimated value for g, because introducing skewness also influences kurtosis. Using constant values for g and h results in a good fit for the data, except for the tail-ends of the distribution. This leads the researchers to use separate values for both g and h


for the tail-ends and the central part of the distribution, resulting in a much better fit. Badrinath and Chatterjee (1991) follows a similar approach, but now applied to daily returns on individual stocks for all firms continuously listed on the New York Stock Exchange and the American Exchange for the period 1963-1986.

Mills (1995) used the g-and-h distribution to model daily FTSE 100, FTSE Mid 250 and FTSE 350 index return data from 1986-1992. Like Badrinath and Chatterjee, Mills also takes a percentile-based approach to match the parameters of the observed distribution with the standard normal distribution. Where Badrinath and Chatterjee (1988) used separate values for

g and h for the central part and the tail-ends, Mills considers the half-spreads of the distribution separately. In addition, the researcher allows the values for g and h to be functions of $Z^2$:

$$g(Z) = g_0 + g_2 Z^2 + g_4 Z^4, \qquad h(Z) = h_0 + h_2 Z^2 + h_4 Z^4 \tag{3}$$

Using this approach, Mills is able to determine $g_i$ and $h_i$ such that the g-and-h distribution provides an excellent fit for the daily returns on the FTSE indices.

Dutta and Babbel (2002) apply the g-and-h distribution to model US Dollar 1- and 3-month LIBOR interest rates of the period 1987-2001. They compare the resulting model with the Generalized Beta Distribution of the Second Kind (GB2). Like the previous researchers, Dutta and Babbel use percentiles to fit the model to the observed data, as they feel that the method of moments and maximum likelihood estimation would limit the use of the structural richness of the g-and-h distribution. Like Mills (1995), the researchers allow g to be a function of $Z^2$:

$$g(Z) = g_0 + g_2 Z^2 + g_4 Z^4 + g_6 Z^6 \tag{4}$$

But parameter h is determined by a linear regression of $\ln\!\left(\frac{g\,(X_p - X_{1-p})}{e^{gZ_p} - e^{-gZ_p}}\right)$ on $\frac{Z_p^2}{2}$, given g.

The researchers conclude that the g-and-h distribution combined with percentile fitting is able to provide a much better fit than the GB2 distribution using the method of moments.


Dutta and Babbel (2005) use several distributions, among which the g-and-h distribution, to model the pricing of European interest rate options, more specifically interest rate caplets. Such a caplet can be considered a call option on short-term interest rates and can be priced in the same way as an option. The researchers use the g-and-h distribution, the GB2 distribution, the Burr-3 distribution (a special case of the GB2 distribution) and the Weibull distribution (a limiting case of both the g-and-h distribution and the GB2 distribution). For these distributions closed-form formulas were determined to calculate the price of a caplet. For the g-and-h distribution, this closed-form formula is:

$$c = e^{-rT}\left[(a-K)\,[1-N(K)] - \frac{b}{g\sqrt{1-h}}\left[1-N\!\left(K\sqrt{1-h}\right)\right] + \frac{b}{g\sqrt{1-h}}\,e^{\frac{g^2}{2(1-h)}}\left[1-N\!\left(K\sqrt{1-h} - \frac{g}{\sqrt{1-h}}\right)\right]\right] \tag{6}$$

In this formula, written in slightly different notation than used by Dutta and Babbel (2005), c is the call price, K the exercise price, and a and b are a location and a scaling parameter, respectively. The function N() is the cumulative distribution function of the standard normal distribution.

This result differs slightly from that of Dutta and Babbel (2005). We used their equation (13) as our starting point, leading to differently placed brackets in our equivalent of their equation (14).

To determine the distribution’s parameters a dataset of end-of-day closing prices of US dollar caps with different strikes and maturities (tenors) in the period October 23, 2000 to September 19, 2001 was used. The authors conclude that “the g-and-h distribution exhibits the highest accuracy in extracting the implied distribution”.

As stated, we will compare the g-and-h distribution with the so-called Variance Gamma (VG) model, introduced by Madan and Seneta (1990). It belongs to the family of stochastic volatility models, where volatility is not a constant as assumed by the previously discussed models. The distribution in returns is assumed to be lognormal, but the volatility V in asset returns is itself a stochastic variable, with V assumed to have a gamma distribution. Their first model is symmetrical with parameter ν to add excess kurtosis and uses the following stochastic process for the asset price S(t):

$$S(t) = S(0)\,e^{\mu t + Y(t;\sigma,\nu)} \tag{7}$$


in which $Y(t;\sigma,\nu) = b(\gamma(t;1,\nu);\sigma)$ is a VG process defined in terms of a Brownian motion $b(t;\sigma)$ without drift, volatility σ and variance rate of gamma time change ν.

The authors compare the option prices generated based on this model with the prices generated by the model of Black and Scholes (1973) for given maturity, exercise price, interest rate and volatility. They conclude that the VG model creates a differential effect on at-the-money (ATM) options as opposed to in-the-money (ITM) and out-of-the-money (OTM) options: the model generates higher option prices for OTM and ITM options than for ATM options when kurtosis is increased. No calibration of their model to observed option prices is performed in their paper.

In Madan, Carr and Chang (1998) the VG model is modified and a second parameter is added to produce a skewed distribution. This yields the following process for the asset price:

$$S(t) = S(0)\,e^{\mu t + X(t;\sigma,\nu,\theta) + \omega t}, \qquad \omega = \frac{1}{\nu}\ln\!\left(1 - \theta\nu - \frac{\sigma^2\nu}{2}\right) \tag{8}$$

in which 𝑋(𝑡; 𝜎, 𝜈, 𝜃) = 𝑏(𝛾(𝑡; 1, 𝜈); 𝜃, 𝜎) is a VG process defined in terms of a Brownian motion 𝑏(𝑡; 𝜃, 𝜎) with drift 𝜃, volatility 𝜎 and variance rate of gamma time change 𝜈. Parameter 𝜃 controls skewness in the distribution and 𝜈 again controls excess kurtosis.

This time, the researchers calibrate their model to 8,245 prices of S&P500 futures options traded at the Chicago Mercantile Exchange in the period 1992-1994, using maximum likelihood estimation. Both the symmetric variant and the full VG model with skewness and kurtosis are compared with the Black-Scholes model. The researchers conclude that the symmetric model is sufficient to model the underlying asset price, but the full VG model performs best when replicating option prices.

This paper follows an approach similar to Bakshi, Cao and Chen (1997), who compare several types of option pricing models using historical S&P500 index option prices.


II. The option price models

We will assume a complete market, i.e. any claim depending on a future state of the world (a contingent claim) can be replicated with a portfolio of existing securities. We will also assume that those securities can be traded without transaction costs and that the market is arbitrage-free. Under these circumstances there will be a unique system of state prices for securities with a contingent payoff depending on a particular future state.

So, suppose we have an asset with price $S_t$ at time t and a security depending on $S_t$, with a payoff function $X(S_T)$ at time T. Then the expected payoff at time T for this security is the sum of all the contingent payoffs at time T weighted by that payoff's probability:

$$\mathbb{E}_P[X(S_T)] = \int_{-\infty}^{\infty} X(S_T)\,p(S)\,dS \tag{9}$$

Here p(S) is the probability density function (pdf) of asset price S under the so-called physical measure (denoted by the subscript P in $\mathbb{E}_P$).

The asset price S is influenced by risk preferences of investors. This is usually represented in the expected rate of return that investors demand on the asset. As this differs across investors, this makes it difficult to determine the asset price. If the investors were all risk-neutral, then the expected rate of return would be equal to the risk-free interest rate for all investors. This leads to a simpler pricing method, based on an expected rate of return that we can observe. The method is based on the result that under the above assumptions a probability measure (the risk-neutral measure) exists such that the price obtained by discounting at the risk-free rate is equal to the price obtained under the physical measure. This is more thoroughly explained in Gisiger (2010).

In what follows, we will use risk-neutral valuation to determine option prices. Since risk-neutral probabilities are nothing more than compounded state prices (Gisiger (2010)), risk-neutral valuation implies discounting at the risk-free interest rate r to get the real-world value. So, using risk-neutral valuation, a European call option price c on a non-dividend paying asset is valued at (Hull (2009)):

$$c(S, K, T, r) = e^{-rT}\,\mathbb{E}_Q[\max\{S_T - K, 0\}] = e^{-rT}\int_K^{\infty} \max\{S_T - K, 0\}\,q(S)\,dS \tag{10}$$


The variables are defined as asset price S, annualized risk-free interest rate r, option time to maturity T (years), option strike price K and q(S) as the risk-neutral pdf of the underlying distribution of S, the so-called risk-neutral measure. We used subscript Q in $\mathbb{E}_Q$ to indicate the risk-neutral expectation.

Thus, the value of such a call option is equal to the risk-neutral expected value of the difference between the asset value and the exercise price at the time of maturity, discounted by the risk-free interest rate. Obviously a call option cannot have a negative price; it will always have a positive price as long as there is still a probability that the asset price will exceed the strike price. When the option expires at its time of maturity and the asset value is lower than the exercise price at that time, then the call option’s value at expiry will be zero.
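As an aside, equation (10) can be approximated directly by Monte Carlo simulation. A minimal sketch in MATLAB (the variable names are ours; S_T is assumed to hold draws of the terminal asset price under some risk-neutral model):

```matlab
% Monte Carlo sketch of equation (10): the discounted risk-neutral expectation
% of the call payoff. S_T: assumed vector of risk-neutral draws of the terminal
% asset price; K, r, T as defined in the text.
payoff = max(S_T - K, 0);           % contingent payoff at maturity
c = exp(-r * T) * mean(payoff);     % discount at the risk-free rate
```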

Black-Scholes option price model

The famous option price model of Black and Scholes (1973) assumes that asset price behavior is given by the stochastic process

$$\frac{dS}{S} = \mu\,dt + \sigma\,dW \tag{11}$$

with the variables defined as asset price S, annualized expected rate of return μ, annual volatility of the asset price σ, time t (years) and standard Brownian motion W. Since we will use risk-neutral valuation, μ will be considered equal to the risk-free rate r.

This model makes the following important assumptions, among others:

• the expected rate of return μ and the volatility σ are constant;
• the risk-free rate of interest is constant and the same for all maturities;
• there are no dividends in the period from now to the derivative's maturity;
• there are no jumps in the underlying asset price and pricing is continuous.


Under these assumptions, the stochastic process implies a normal distribution of asset returns, so the risk-neutral distribution function of these asset returns will be the standard normal distribution

$$q(x) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}x^2} \tag{12}$$

After substitution in equation (10) this results in the following call price c (Hull 2009, appendix of chapter 13)

$$c = e^{-rT}\left[S_0\,e^{rT}\,N(d_1) - K\,N(d_2)\right] \tag{13}$$

where

$$d_1 = \frac{\ln(S_0/K) + (r + \sigma^2/2)\,T}{\sigma\sqrt{T}}, \qquad d_2 = \frac{\ln(S_0/K) + (r - \sigma^2/2)\,T}{\sigma\sqrt{T}} = d_1 - \sigma\sqrt{T}$$

This option pricing model will be the standard for the other models to improve upon. It is a very successful model because of its computational simplicity, but it has several shortcomings as the assumptions already indicate.
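To make equation (13) concrete, a minimal MATLAB implementation might look as follows (a sketch; the function name bs_call is ours, and the standard normal cdf is computed via erfc to avoid toolbox dependencies):

```matlab
% Sketch of the Black-Scholes call price of equation (13), no dividends.
% S0, K, r, T and sigma may be vectors of equal size.
function c = bs_call(S0, K, r, T, sigma)
    d1 = (log(S0 ./ K) + (r + sigma.^2 / 2) .* T) ./ (sigma .* sqrt(T));
    d2 = d1 - sigma .* sqrt(T);
    N  = @(x) 0.5 * erfc(-x / sqrt(2));   % standard normal cdf via erfc
    c  = exp(-r .* T) .* (S0 .* exp(r .* T) .* N(d1) - K .* N(d2));
end
```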


g-and-h transformation (Dutta and Babbel)

One of the major shortcomings in the Black-Scholes model is the assumption that the asset returns follow a normal distribution. As pointed out in the next chapter, empirical evidence exists that asset returns generally do not match a normal distribution but show skewness and kurtosis.

One way to introduce skewness and kurtosis into the option model is to use the g-and-h transformation. The g-and-h transformation $Y_{g,h}(Z)$ of a standard normal variable Z is given by

$$Y_{g,h}(Z) = \frac{e^{gZ} - 1}{g}\,e^{\frac{hZ^2}{2}} \quad \text{for } g \neq 0 \tag{14}$$

For h = 0, this becomes the g-distribution $Y_g(Z)$

$$Y_g(Z) = \frac{e^{gZ} - 1}{g} \quad \text{for } h = 0 \tag{15}$$

For g = 0, this becomes the symmetrical h-distribution $Y_h(Z)$

$$Y_h(Z) = Z\,e^{\frac{hZ^2}{2}} \quad \text{for } g = 0 \tag{16}$$

This results in the following stochastic process for S under the risk-neutral measure:

$$S = A + B\,Y_{g,h}(Z) = A + B\,\frac{e^{gZ} - 1}{g}\,e^{\frac{hZ^2}{2}} \quad \text{for } g \neq 0, \qquad S = A + B\,Z\,e^{\frac{hZ^2}{2}} \quad \text{for } g = 0 \tag{17}$$

with the variables defined as asset price S, location parameter A, scaling factor B, skewness parameter g, kurtosis parameter h and standard normal variable Z.
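A direct MATLAB rendering of equations (14)-(17) might look as follows (a sketch; the function name gh_price is ours):

```matlab
% Sketch of the g-and-h transformation of equations (14)-(17).
% Z may be a vector of standard normal draws; g = 0 selects the h-distribution.
function S = gh_price(Z, A, B, g, h)
    if g == 0
        Y = Z .* exp(h * Z.^2 / 2);                       % h-distribution, eq (16)
    else
        Y = (exp(g * Z) - 1) / g .* exp(h * Z.^2 / 2);    % g-and-h, eq (14)
    end
    S = A + B * Y;                                         % eq (17)
end
```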


As Badrinath and Chatterjee (1991) remarked as well, equation (17) is only monotone increasing in Z for h ≥ 0. Applying this transformation to the standard normal distribution function and substituting in equation (10) gives Dutta and Babbel (2005) the following result:

$$c = e^{-rT}\,\frac{1}{\sqrt{2\pi}}\int_K^{\infty}\left(a + b\,\frac{(e^{gZ}-1)\,e^{\frac{hZ^2}{2}}}{g} - K\right)e^{-\frac{Z^2}{2}}\,dZ \tag{18}$$

$$c = e^{-rT}\left[(a-K)[1-N(K)] - \frac{b}{g\sqrt{1-h}}\left[1-N\!\left(K\sqrt{1-h}\right)\right] + \frac{b}{g\sqrt{1-h}}\,e^{\frac{g^2}{2(1-h)}}\left[1-N\!\left(K\sqrt{1-h} - \frac{g}{\sqrt{1-h}}\right)\right]\right] \quad \text{for } g \neq 0,\ h < 1$$
$$c = e^{-rT}\left[(a-K)[1-N(K)] - \frac{b}{\sqrt{2\pi}}\,\frac{e^{\frac{(h-1)K^2}{2}}}{h-1}\right] \quad \text{for } g = 0,\ h < 1 \tag{19}$$

$$\mathbb{E}[S_T] = a + \frac{b\left(e^{\frac{g^2}{2(1-h)}} - 1\right)}{g\sqrt{1-h}} = S_0 \tag{20}$$

If we take this result and rewrite equation (20) to isolate a and substitute in equation (19):

$$a = S_0 - v(u-1) \quad \text{for } g \neq 0,\ h < 1, \qquad a = S_0 \quad \text{for } g = 0,\ h < 1 \tag{21}$$

$$c = e^{-rT}\Big\{(S_0 - v(u-1) - K)[1-N(K)] - v\left[1-N\!\left(K\sqrt{1-h}\right)\right] + v\,u\left[1-N\!\left(K\sqrt{1-h} - \tfrac{g}{\sqrt{1-h}}\right)\right]\Big\} \quad \text{for } g \neq 0,\ h < 1$$
$$c = e^{-rT}\left[(S_0 - K)[1-N(K)] - \frac{b}{\sqrt{2\pi}}\,\frac{e^{\frac{(h-1)K^2}{2}}}{h-1}\right] \quad \text{for } g = 0,\ h < 1 \tag{22}$$

with $u = e^{\frac{g^2}{2(1-h)}}$ and $v = \frac{b}{g\sqrt{1-h}}$.


This would leave the following parameters to be estimated using empirical data:

• b: scaling factor
• g: controls skewness
• h: controls kurtosis, h ∈ [0, 1)

However, we can already see that there will be some issues with this option formula. Dutta and Babbel (2005) used the transformation to model the behaviour of the underlying asset S, in their case LIBOR interest rates. The resulting option formula may work well to price interest rate options, but we would like to use it to price S&P 500 index options. For index options a term like N(K) seems meaningless as it will always be 1, since K in itself is a relatively large number and there is no normalization with respect to S.

So, all 3 terms in equation (22) would quickly go to 0 for K > 3, which represents 3 standard deviations. However, when dealing with strike prices for S&P 500 index options, K is usually in the order of ~1000. In order to still result in some kind of option price, h must be very close to 1, in the order of 0.9999. This would blow up v to the order of $\frac{300b}{g}$, and u would explode ($e^{50{,}000\,g^2}$) unless $g^2$ is a very small number to compensate. If g is small then b must be small as well, otherwise v would become very large.

In other words, for the strike prices we are dealing with, equation (22) is numerically unstable and very hard to calibrate to a dataset. This could be caused by the fact that Dutta and Babbel (2005) use strike price K as the integral’s lower limit (called c in their paper). However, S is a transformed variable of Z, so the lower limit should have been transformed as well. Instead of using S = K, we need to use Z = q as lower limit, with q being that specific value for Z that makes the term 𝐴 + 𝐵𝑌𝑔,ℎ(𝑍) − 𝐾 equal to zero.


This results in equations (23) and (24):

$$c = e^{-rT}\,\frac{1}{\sqrt{2\pi}}\int_q^{\infty}\left(a + b\,\frac{(e^{gZ}-1)\,e^{\frac{hZ^2}{2}}}{g} - K\right)e^{-\frac{Z^2}{2}}\,dZ \tag{23}$$

$$c = e^{-rT}\left[(a-K)[1-N(q)] - \frac{b}{g\sqrt{1-h}}\left[1-N\!\left(q\sqrt{1-h}\right)\right] + \frac{b}{g\sqrt{1-h}}\,e^{\frac{g^2}{2(1-h)}}\left[1-N\!\left(q\sqrt{1-h} - \frac{g}{\sqrt{1-h}}\right)\right]\right] \quad \text{for } g \neq 0,\ h < 1$$
$$c = e^{-rT}\left[(a-K)[1-N(q)] - \frac{b}{\sqrt{2\pi}}\,\frac{e^{\frac{(h-1)q^2}{2}}}{h-1}\right] \quad \text{for } g = 0,\ h < 1 \tag{24}$$

There is only a closed-form solution for q for the g- and the h-distribution:

$$q = \frac{1}{g}\ln\!\left(\frac{g(K-a)}{b} + 1\right) \quad \text{for } h = 0$$
$$q = \frac{K-a}{b}\,e^{-\frac{1}{2}W\!\left(h\left(\frac{K-a}{b}\right)^2\right)} = \sqrt{\frac{W\!\left(h\left(\frac{K-a}{b}\right)^2\right)}{h}} \quad \text{for } g = 0 \tag{25}$$

with W(.) being the Lambert W product log function. See the appendix for the derivation of equation (25). For the combined g/h distribution we must use numerical methods to determine q.
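In MATLAB the lower limit q might be computed as follows (a sketch; the function name gh_lower_limit is ours, lambertw requires the Symbolic Math Toolbox, and the combined case falls back on fzero as the text prescribes):

```matlab
% Sketch of the lower integration limit q of equations (23)-(25).
function q = gh_lower_limit(K, a, b, g, h)
    m = (K - a) / b;
    if h == 0                                    % g-distribution, equation (25)
        q = log(g * m + 1) / g;
    elseif g == 0                                % h-distribution, equation (25)
        q = m * exp(-0.5 * lambertw(h * m^2));   % Lambert W product log
    else                                         % combined g/h: numerical root
        Ygh = @(z) (exp(g*z) - 1)/g .* exp(h*z.^2/2);
        q = fzero(@(z) a + b*Ygh(z) - K, 0);     % Y is monotone for h >= 0
    end
end
```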

g-and-h transformation (Mills)

Mills (1995) on the other hand uses a different stochastic process for S under the risk-neutral measure, because he applies the g/h-distribution to daily index returns X:

$$X = \ln\frac{S}{S_0} = A + B\,Y_{g,h}(Z) = A + B\,\frac{e^{gZ}-1}{g}\,e^{\frac{hZ^2}{2}} \quad \text{for } g \neq 0 \text{ (g/h-distribution)}$$
$$\phantom{X} = A + B\,Z\,e^{\frac{hZ^2}{2}} \quad \text{for } g = 0 \text{ (h-distribution)} \tag{26}$$


Note that for g = h = 0, equation (26) would become the same as equation (11) of the Black-Scholes model.

Combining with equation (10) results in:

$$\mathbb{E}[\max\{S_T - K, 0\}] = \int_K^{\infty} \max\{S_T - K, 0\}\,f_T(S)\,dS \tag{27}$$

$$= \int_q^{\infty}\left(S_0\,e^{a + b\,Y_{g,h}(Z)} - K\right)\frac{e^{-\frac{Z^2}{2}}}{\sqrt{2\pi}}\,dZ \tag{28}$$

$$= K\,[N(q) - 1] + \frac{S_0\,e^{a}}{\sqrt{2\pi}}\int_q^{\infty} e^{\,b\,Y_{g,h}(Z) - \frac{Z^2}{2}}\,dZ \tag{29}$$

There is no closed-form solution for the second term, so this integral must be determined numerically. Note that the integrand will impose restrictions on b, g and h in order to converge to a defined value for the infinite upper limit. As for the lower limit q, again there is only a closed-form solution for the g-distribution (h = 0) and the h-distribution (g = 0):

$$q = \frac{1}{g}\ln\!\left(1 + \frac{g}{b}\left(\ln\frac{K}{S_0} - a\right)\right) \quad \text{for } h = 0$$
$$q = \sqrt{\frac{W\!\left(h\left(\frac{\ln(K/S_0) - a}{b}\right)^2\right)}{h}} \quad \text{for } g = 0 \tag{30}$$

with W(.) being the Lambert W product log function. See the appendix for the derivation of equation (30). For the combined g/h distribution we must use numerical methods to determine q. This leaves the following parameters which must be estimated from empirical data:

• a: location parameter
• b: scaling factor
• g: controls skewness
• h: controls kurtosis
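A sketch of the resulting Mills-style call price for the g ≠ 0 case (the function name mills_call is ours; integral() performs the numerical integration the text calls for, subject to the convergence restrictions on b, g and h noted above):

```matlab
% Sketch of equations (10) and (27)-(29); q is the lower limit from
% equation (30) or from a numerical root finder.
function c = mills_call(S0, K, r, T, a, b, g, h, q)
    Ygh = @(z) (exp(g*z) - 1)/g .* exp(h*z.^2/2);
    integrand = @(z) (S0*exp(a + b*Ygh(z)) - K) .* exp(-z.^2/2) / sqrt(2*pi);
    c = exp(-r*T) * integral(integrand, q, Inf);   % discounted expectation
end
```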


Variance Gamma option price model

Madan, Carr and Chang (1998) suggest the following stochastic process for asset price behaviour:

$$S = S_0\,e^{\mu t + X(t;\sigma,\nu,\theta) + \omega t} \tag{31}$$

$$\omega = \frac{1}{\nu}\ln\!\left(1 - \nu\theta - \frac{1}{2}\sigma^2\nu\right) \tag{32}$$

$$X(t;\sigma,\nu,\theta) = b(\Gamma(t;1,\nu);\theta,\sigma) \tag{33}$$

$$= \theta\,\Gamma(t;1,\nu) + \sigma\,W\!\big(\Gamma(t;1,\nu)\big) \tag{34}$$

with the variables defined as asset price S, annualized expected rate of return μ, annual volatility of the asset price σ, time t (years), Variance Gamma process $X(t;\sigma,\nu,\theta)$, standard Brownian motion W(t), Brownian motion process $b(t;\theta,\sigma)$ with drift θ and volatility σ, and Gamma process $\Gamma(t;1,\nu)$ with unit mean rate (one) and variance rate ν.

Substituting the distribution function under the risk-neutral measure (μ = risk-free rate r) in equation (10), Madan, Carr and Chang (1998) evaluate this to

$$c = e^{-rT}\left[S_0\,e^{rT}\,\Psi\!\left(d\sqrt{\frac{1-c_1}{\nu}},\ (\alpha+s)\sqrt{\frac{\nu}{1-c_1}},\ \frac{t}{\nu}\right) - K\,\Psi\!\left(d\sqrt{\frac{1-c_2}{\nu}},\ \alpha\sqrt{\frac{\nu}{1-c_2}},\ \frac{t}{\nu}\right)\right] \tag{35}$$

where

$$d = \frac{1}{s}\left[\ln\frac{S_0}{K} + rt + \frac{t}{\nu}\ln\frac{1-c_1}{1-c_2}\right], \quad s = \frac{\sigma}{\sqrt{1 + \left(\frac{\theta}{\sigma}\right)^2\frac{\nu}{2}}}, \quad \alpha = -\frac{\theta}{\sigma^2}\,s, \quad c_1 = \frac{\nu(\alpha+s)^2}{2}, \quad c_2 = \frac{\nu\alpha^2}{2}$$

and function Ψ is a combination of terms of the modified Bessel function of the second kind $K_\alpha(x)$ (not to be confused with strike price K) and the degenerate hypergeometric function of


two variables Φ(𝛼, 𝛽, 𝛾; 𝑥, 𝑦). See equation (57) in the appendix for the complete equation as it was presented by Madan, Carr and Chang (1998).

This introduces the following parameters which must be estimated from empirical data:

• ν: variance rate of the Gamma process; controls kurtosis
• θ: drift of the Brownian motion process; controls skewness
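Equations (31)-(34) can be made concrete with a small simulation. A sketch of a single VG draw (variable names are ours; gamrnd is from the Statistics Toolbox):

```matlab
% Sketch of one draw from the VG process: Brownian motion with drift theta
% evaluated at a gamma time change. Gamma(t;1,nu) has mean t and variance nu*t.
G     = gamrnd(t/nu, nu);                          % gamma time change, eq (33)
X     = theta*G + sigma*sqrt(G)*randn;             % b(G; theta, sigma), eq (34)
omega = log(1 - theta*nu - sigma^2*nu/2) / nu;     % eq (32)
S     = S0 * exp(mu*t + X + omega*t);              % eq (31)
```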

Fast Fourier Transformation applied to Variance Gamma model

The closed-form solution given by equation (35) can be calculated but takes a significant amount of time. A faster way to calculate the option price can be achieved by applying a Fast Fourier Transformation (FFT) as proposed by Carr and Madan (1999).

If $\phi_T(u)$ denotes the characteristic function of the risk-neutral density of the returns, defined by

$$\phi_T(u) = \int_{-\infty}^{\infty} e^{iuS}\,f_T(S)\,dS \tag{36}$$

then Carr and Madan (1999) show that the following Fourier transformation $\psi_T(v)$ of call option price c

$$\psi_T(v) = \int_{-\infty}^{\infty} e^{ivk}\,e^{\alpha k}\,c(S, K, T, r)\,dk \tag{37}$$

can be rewritten in terms of the characteristic function $\phi_T(u)$ of the density of the returns:

$$\psi_T(v) = \frac{e^{-rT}\,\phi_T\big(v - (\alpha+1)i\big)}{\alpha^2 + \alpha - v^2 + i(2\alpha+1)v} \tag{38}$$

with k denoting the log of strike price K and c(S, K, T, r) the call option price as given in equation (10). Coefficient α > 0, a dampening or convergence parameter, is required to assist the numerical integrability of the expression, as α = 0 would introduce a singularity in the integrand in equation (38). This is caused by the fact that $c \to S_0$ as $k \to -\infty$ (or $K \to 0$), which would render equation (10) non-integrable. Carr and Madan (1999) show that for the Variance Gamma model α must satisfy the upper bound

$$\alpha < \sqrt{\frac{\theta^2}{\sigma^4} + \frac{2}{\sigma^2\nu}} - \frac{\theta}{\sigma^2} - 1 \tag{39}$$

By introducing the FFT, Carr and Madan (1999) determine that, after performing the inverse Fourier transformation

$$c = \frac{e^{-\alpha k}}{\pi}\int_0^{\infty} e^{-ivk}\,\psi_T(v)\,dv \tag{40}$$

call option price c can be approximated by numerical computation of

$$c \approx \frac{e^{-\alpha k}}{\pi}\sum_{j=1}^{N} e^{-iv_j k}\,\psi_T(v_j)\,\eta \tag{41}$$

Kienitz and Wetterau (2012) slightly rewrite Carr and Madan (1999)'s result to:

$$c \approx \frac{e^{-\alpha k_u}}{\pi}\sum_{j=0}^{N-1} e^{-i\frac{2\pi}{N}ju}\,e^{ibv_j}\,\psi_T(v_j)\,\frac{\eta}{3}\left(3 + (-1)^{j+1} - \delta_j\right) \tag{42}$$

where N is typically a power of 2 for the number of terms, η > 0 a spacing parameter, $v_j = \eta j$ (j = 0, …, N-1), $b = \frac{N\lambda}{2}$ and $k_u = -b + \lambda u$ (u = 0, …, N-1). $\delta_n$ is the delta function with unity for n = 0 and zero otherwise.

Expression (42) can be calculated efficiently with the Fast Fourier Transform, and this is what will be used in our calculations.
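As an illustration of equations (36)-(42), a compact FFT pricer for the VG model might look as follows (a sketch; the function name vg_call_fft is ours, the settings are those of Carr and Madan (1999), and the VG characteristic function of the log price follows Madan, Carr and Chang (1998)):

```matlab
% Sketch of Carr-Madan FFT pricing under the Variance Gamma model.
function [Kgrid, calls] = vg_call_fft(S0, r, T, sigma, nu, theta)
    alpha = 1.5; eta = 0.25; N = 2^12;      % Carr and Madan (1999) settings
    lambda = 2*pi / (N*eta);                % log-strike spacing
    b = N*lambda/2;                         % log strikes run over [-b, b)
    v = eta * (0:N-1)';
    % risk-neutral VG characteristic function of ln(S_T)
    omega = log(1 - theta*nu - sigma^2*nu/2) / nu;               % eq (32)
    phi = @(u) exp(1i*u*(log(S0) + (r + omega)*T)) ...
             .* (1 - 1i*theta*nu*u + sigma^2*nu*u.^2/2).^(-T/nu);
    % damped transform psi_T(v), equation (38)
    psi = exp(-r*T) * phi(v - (alpha+1)*1i) ...
        ./ (alpha^2 + alpha - v.^2 + 1i*(2*alpha+1)*v);
    % Simpson-weighted FFT sum, equation (42)
    w = (eta/3) * (3 + (-1).^(1:N)' - [1; zeros(N-1,1)]);
    k = -b + lambda*(0:N-1)';               % grid of log strikes
    calls = real(exp(-alpha*k)/pi .* fft(exp(1i*b*v) .* psi .* w));
    Kgrid = exp(k);
end
```

Prices for the strikes actually traded would then be read off (or interpolated) from the returned grid Kgrid.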


III. Dataset analysis

The dataset used for this research consists of the daily closing prices of S&P 500 index options, along with the closing prices of the S&P 500 index itself.

S&P 500 index closing prices

S&P 500 index closing prices have been extracted from the Thomson Reuters Datastream database (S&P 500 COMPOSITE - PRICE INDEX) for the period January 1st, 2007 to December 31st, 2011. This resulted in 1305 closing prices (Datastream data type MP).

Table 1: S&P500 Index return statistics for period Jan 1, 2007 - Dec 31, 2011 This table presents various distribution statistics for the dataset of the daily S&P500 Index returns for the period of January 1st, 2007 to December 31st, 2011.

Number of samples 1305

Mean -9.2144e-05 (daily) -0.0240 (annually)*

Standard Deviation 0.0165 (daily) 0.2668 (annually)*

Skewness -0.2492

Kurtosis 9.8704

6.8704 (excess)

Minimum -0.094695 (on Oct 15, 2008)

Maximum 0.109572 (on Oct 13, 2008)

(* assumes 1305/5 = 261 trading days per year)

See Figure 1 for a graph of the S&P500 Index return distribution over the period January 1st, 2007 to December 31st, 2011. We can see clearly that the distribution is highly leptokurtic: it has a relatively high peak and heavier tails compared to the normal distribution. Also, it seems to show a slight negative skewness.


Figure 2 shows a Normal Probability Plot of the daily S&P500 Index returns. If the return distribution were normal, the blue graph would follow the red dashed line. It is clear that the middle part of the return distribution is approximately normally distributed while the tails are not: the probability of extreme values is much higher than a normal distribution would dictate.

Figure 1: Distribution of daily S&P500 Index returns over Jan 1, 2007 - Dec 31, 2011
This figure shows the return distribution of the S&P500 Index for the period January 1st, 2007 to December 31st, 2011, together with a standard normal distribution (frequency in % against Z value).


Figure 2: Normal Probability Plot of daily S&P500 Index returns

This figure shows a normal probability plot for the return distribution of the S&P500 Index for the period January 1st, 2007 to December 31st, 2011, together with the plot expected for a standard normal distribution.

To verify that the S&P500 Index return distribution is indeed not normally distributed, we calculate the standard error of skewness (SES) and standard error of kurtosis (SEK):

$$SES = \sqrt{\frac{6\,n(n-1)}{(n-2)(n+1)(n+3)}} = 0.16596 \tag{43}$$

$$SEK = 2\,SES\,\sqrt{\frac{n^2-1}{(n-3)(n+5)}} = 0.3315 \tag{44}$$

The observed skewness of -0.2492 falls inside the interval [-1.96 SES, 1.96 SES]. This means that the observed skewness is not significant at the 95% confidence level. The observed normalized kurtosis of 6.8704 clearly falls outside the interval [-1.96 SEK, 1.96 SEK]. This means that the return distribution is leptokurtic with 95% confidence.

To test for normality, we can use the D'Agostino-Pearson test:

$$DP = \left(\frac{\mathit{skewness}}{SES}\right)^2 + \left(\frac{\mathit{normalized\ kurtosis}}{SEK}\right)^2 = 431.7 \tag{45}$$

Alternatively, the Jarque-Bera test:

$$JB = \frac{n}{6}\left(\mathit{skewness}^2 + \frac{1}{4}\,\mathit{normalized\ kurtosis}^2\right) \tag{46}$$

Both results are clearly greater than the critical value of the χ²-distribution (2 df) of 15.20 at the 99.95% level, leading to the conclusion that the S&P500 Index return distribution is not a normal distribution with 99.95% confidence. Of course, we can also see this in a graphical way in Figure 1 and Figure 2.
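For reference, these statistics can be reproduced in a few lines of MATLAB (a sketch; sp500 is an assumed vector of index closing prices, and skewness and kurtosis require the Statistics Toolbox):

```matlab
% Sketch of the normality checks of equations (43)-(46) on daily returns.
x   = diff(log(sp500));                 % daily log returns
n   = numel(x);
sk  = skewness(x);
ku  = kurtosis(x) - 3;                  % normalized (excess) kurtosis
SES = sqrt(6*n*(n-1) / ((n-2)*(n+1)*(n+3)));
SEK = 2*SES * sqrt((n^2-1) / ((n-3)*(n+5)));
DP  = (sk/SES)^2 + (ku/SEK)^2;          % D'Agostino-Pearson statistic
JB  = n/6 * (sk^2 + ku^2/4);            % Jarque-Bera statistic
% both DP and JB are compared with the chi-square(2 df) critical value 15.20
```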


S&P 500 Index option prices

S&P 500 index call option prices have been extracted from the Thomson Reuters Datastream database (lists LOPTSPXLC and LOPTSPXDC) for the period January 1st, 2007 to December 31st, 2011. This resulted in 749,665 closing prices (Datastream data type MP).

From this set, the following option prices were removed:

• 211,291 call option prices less than 𝑆 − 𝐾 𝑒−𝑟𝑇 were left out, because they are below the minimum arbitrage price (which can happen when the last transaction for this particular option has been made before the close of day).

• 29,247 options with a time-to-maturity shorter than 6 days were left out, because these prices may be influenced by liquidity issues which are beyond the scope of the models used.

• 151 option prices with a moneyness S/K of more than 5 were removed because of their illiquidity. These options are relatively seldom traded and the last transaction price may not correspond to the index closing price, skewing implied volatility calculations and model fitting.

• 46 option prices equalling 998.5 were removed, because these were clearly incorrect (assumed to be a problem with the Datastream data, mainly concerning CALL SPX APR07 1435 and several option prices on November 28th and December 1st, 2008).

This results in a dataset of 508,976 S&P500 index option prices over the 5-year period 2007-2011.
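In MATLAB these filters might look as follows (a sketch; the column vectors price, S, K, r and T for the raw option records are assumed names, with T in calendar years):

```matlab
% Sketch of the dataset filters described above.
keep  = price >= S - K .* exp(-r .* T) ...   % at or above the arbitrage bound
      & T * 365 >= 6 ...                     % time-to-maturity of at least 6 days
      & S ./ K <= 5 ...                      % moneyness S/K of at most 5
      & price ~= 998.5;                      % drop the known bad Datastream prints
price = price(keep); S = S(keep); K = K(keep); r = r(keep); T = T(keep);
```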

See Table 2 for a breakdown of the dataset across the options' moneyness and time-to-maturity. OTM stands for Out-of-The-Money options (low underlying S over strike price K ratio), ATM stands for At-The-Money options with a strike price similar to the underlying value, and ITM stands for In-The-Money options with a high S/K ratio.


Table 2: Average price and sample size of S&P500 Index Options for calibration dataset
This table presents average option prices and sample sizes for subsets of the S&P500 Index option prices for the period January 1st, 2007 to December 31st, 2011. The subsets are created based on the time to maturity of the options as well as their "moneyness" (asset value to exercise price ratio). Options with a moneyness less than 0.97 are considered Out-of-The-Money (OTM), those with a moneyness of 0.97 up to 1.03 are At-The-Money (ATM) and those with a moneyness greater than 1.03 are In-The-Money (ITM). Each cell gives the average option price with the sample size in parentheses.

Moneyness S/K      < 60 days            60-180 days          ≥ 180 days           Subtotal
OTM   < 0.94       $1.61    (71,601)    $5.38    (61,210)    $25.52   (91,382)    256,646
      0.94-0.97    $9.68    (13,846)    $27.23   (10,118)    $93.25    (8,489)
ATM   0.97-1.00    $21.74   (13,324)    $43.85   (10,039)    $111.93   (8,473)     62,049
      1.00-1.03    $41.32   (12,514)    $63.59    (9,397)    $131.51   (8,302)
ITM   1.03-1.06    $66.25   (11,068)    $86.44    (8,701)    $150.43   (7,965)    190,281
      ≥ 1.06       $206.40  (56,608)    $190.80  (45,619)    $266.54  (60,320)
Subtotal           178,961              145,084              184,931              508,976

Table 16 (see appendix) gives the implied volatilities as calculated by the Black-Scholes option pricing model for various periods. The presence of the so-called "volatility smile" can be clearly seen.

For the interest rate, the closing prices of US T-BILL SEC MARKET 3 MONTH (D) - MIDDLE RATE were extracted from Datastream for all dates for which option prices were collected.


Figure 3: Distribution of option prices over moneyness and time-to-maturity This graph depicts the distribution of the S&P500 options in the dataset in terms of the options’ Time-To-Maturity (TTM) and their “moneyness” (asset value to exercise price ratio).

Figure 3 depicts this distribution of option prices with respect to moneyness and time-to-maturity in a graphical way. As can be seen, most option prices have a time-to-maturity of less than a year with an exercise price near the underlying asset price (moneyness near 1.0).

For the option prices in the dataset the Black-Scholes implied volatility can be calculated. For this purpose, the interest rate given by the Thomson Reuters Datastream database (US T-BILL SEC MARKET 3 MONTH (D) - MIDDLE RATE) for the period January 1st, 2007 to December 31st, 2011 has been used. When plotted against an option’s moneyness and time-to-maturity this results in the implied volatility surface given in Figure 4.
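A sketch of that calculation (blsimpv is from the Financial Toolbox; any root finder applied to equation (13) would do instead):

```matlab
% One Black-Scholes implied volatility per option record, as used in Figure 4.
iv = blsimpv(S, K, r, T, price);
```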


Figure 4: Black-Scholes implied volatility surface of the option prices in the dataset
This graph shows the implied volatility of the S&P500 Index option prices in the dataset in terms of the Time-To-Maturity of the options and their "moneyness" (asset value to exercise price ratio).


IV. Estimating model parameters

Following Dutta and Babbel (2005), the set of parameters $\Delta_{min}$ must be estimated that minimizes the Sum of Squared Errors (SSE):

$$SSE = \sum_{i=1}^{n}\left(c^i_{observed} - c^i_{model|\Delta}\right)^2 \tag{47}$$

in which $c^i_{observed}$ are the observed call option prices, $c^i_{model|\Delta}$ the call option prices calculated by a model for a given parameter set Δ, and n the number of call option price observations.

Function lsqnonlin of MatLab vR2013a will be used to find parameter set Δmin for the respective option pricing models. This function implements the Levenberg-Marquardt minimization algorithm (Levenberg (1944), Marquardt (1963)), also known as the damped least-squares algorithm. It requires an initial estimate of the parameters Δinit to start the iterative process of finding Δmin. However, lsqnonlin finds a local minimum, not necessarily the global minimum. So some models will converge to a global Δmin regardless of the chosen initial set Δinit, but less stable models can yield wildly varying solutions for Δmin depending on this chosen initial set. Therefore a grid of initial parameters will be used to increase the chance of finding the global minimum.
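A sketch of this procedure (priceFun, initGrid and observedPrices are hypothetical names: a handle mapping a parameter set to model prices for all options in the dataset, the grid of starting points, and the observed prices):

```matlab
% Calibration loop: lsqnonlin minimizes the residual vector whose sum of
% squares is the SSE of equation (47); the best local minimum is kept.
best = struct('p', [], 'sse', Inf);
opts = optimoptions('lsqnonlin', 'Algorithm', 'levenberg-marquardt');
resid = @(p) priceFun(p) - observedPrices;     % lsqnonlin squares and sums this
for j = 1:size(initGrid, 1)                    % grid of starting points
    [p, sse] = lsqnonlin(resid, initGrid(j, :), [], [], opts);
    if sse < best.sse
        best.p = p; best.sse = sse;
    end
end
```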

The calculations are performed on an Intel Core i5-3570 CPU, with 4 cores running at a clock speed of 3.4 GHz. The OS used was Windows 7 SP1 and the MatLab Parallel Computing module was used to increase throughput where possible. Unfortunately MatLab is not compatible with the OpenCL standard for using the available AMD graphics processors to accelerate the computations.


The Black-Scholes option pricing model

The Black-Scholes model does not have a parameter to calibrate, as all variables in equation (13) are given by the specific option, the current interest rate and the underlying asset. We can use sample moments for all these parameters, but we choose to calibrate the model to the dataset by selecting volatility σ as the calibration variable. This determines the implied volatility by minimizing equation (47).

See the appendix for the implementation of the algorithm in MatLab. The Black-Scholes model is computationally very fast: MatLab’s lsqnonlin function is able to find the implied volatility for the collection of option prices in less than a second for all starting points. This amounts to 1-2 µs per option price calculation.

The Black-Scholes model is also very stable, always converging to the same volatility σ regardless of the value of initial vector Δinit. We used starting values for σ of 0.05 up to and including 1.0, with an increment of 0.05.

See Table 3 for the resulting σ and the accompanying SSE.

Table 3: Implied Parameters and In-Sample Fit for Black-Scholes model

This table presents the implied Black-Scholes volatility σ for the 4 datasets, which was calculated by using a given start value for σ and then using the MatLab lsqnonlin function to determine the optimal σ for which the total Sum-of-Squared-Errors (SSE) was smallest. The resulting SSEs are also presented, which will be used to compare this result to the other models.

Dataset Sample size Initial Δinit Optimal Δmin SSE

ATM 62,049 σ = [0.05, 1.00]* σ = 0.1977 19.96M

ITM 190,281 σ = [0.05, 1.00]* σ = 0.2166 40.11M

OTM 256,646 σ = [0.05, 1.00]* σ = 0.1821 38.47M

ALL 508,976 σ = [0.05, 1.00]* σ = 0.1959 106.48M


The option pricing model based on the g-and-h transformation

Badrinath and Chatterjee (1991) suggest first estimating a, b and g by matching different percentiles of the asset price distribution and the standard normal distribution, because they deem maximum likelihood estimation quite difficult. We will use MatLab’s lsqnonlin function instead.

We will first consider the h-transformation (g = 0). This should introduce kurtosis in the distribution. Model H1 based on Dutta and Babbel’s transformation (2005) and model H2 based on Mills’ transformation (1995) will be calibrated.

Different initial vectors have been tried:

• Parameter a = [-1, 2] with step 0.5 (H2 model only)
• Parameter b = [0.5, 2] with step 0.5
• Parameter h = (0.3, 0.5, 0.6, 0.9)

This resulted in 16 calibrations for the H1 model and 112 calibrations for the H2 model. See Table 4 for the results. Only the Δinit that led to the lowest SSE is shown.

For these models, by far the most computation time is spent in the calculation of the Lambert W function, which is required to determine the lower boundary of the integral. Apparently MatLab is not able to parallelize the calculation of the lambertw function for a vector. This results in a relatively high calculation time of 17.6 ms for the H1 model, which increases with the length of the vector (i.e. the number of option prices in the dataset), indicating an unwanted serialization in MatLab's calculation of this function. Therefore it may be better to use a MatLab parfor loop and call the lambertw function for each element individually, especially when there is a large number of option prices in the dataset. In our case this reduced the calculation of the lambertw function for 256,646 option prices from about 30 minutes to 4 minutes, averaging between 0.7 and 0.9 ms per option price.
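A sketch of that workaround (m is an assumed vector of (K - a)/b values; parfor requires the Parallel Computing Toolbox):

```matlab
% Per-element lambertw calls inside a parfor loop instead of one vectorized call.
q = zeros(size(m));
parfor i = 1:numel(m)
    q(i) = m(i) * exp(-0.5 * lambertw(h * m(i)^2));   % equation (25), g = 0
end
```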


Table 4: Calibration results of models H1 and H2 (h-distribution)

This table shows the starting values and the resulting calibrated values of the parameters for the H1 and H2 models for which the total Sum-of-Squared-Errors (SSE) was the smallest. The SSEs can be compared to the other models to determine relative performance. The total time needed to calibrate the model is also presented.

Model H1: h-distribution based on Dutta and Babbel (2005)

Dataset   Initial Δinit        Optimal Δmin                 SSE        Calibration Time
ATM       b = 1.0, h = 0.5     b = 35.008, h = 0.77032      119.06M    ~ 1h (serial lambertw)
ITM       b = 0.5, h = 0.5     b = 82.372, h = 0.48187      194.67M    ~ 11h (serial lambertw)
OTM       b = 0.5, h = 0.3     b = 95.948, h = 0.33025      165.98M    ~ 15h (serial lambertw)
ALL       b = 0.5, h = 0.5     b = 96.113, h = 0.37651      486.30M    ~ 4h (parallel lambertw)

Model H2: h-distribution based on Mills (1995)

Dataset   Initial Δinit               Optimal Δmin                                    SSE        Calibration Time
ATM       a = 0, b = 0.1, h = -0.3    a = 0.017600, b = 0.096795, h = -0.00060057     118.07M    ~ 35m
ITM       a = 0, b = 0.1, h = -0.3    a = 0.049609, b = 0.11470, h = -1.6544          186.52M    ~ 3h30m
OTM       a = 0, b = 0.1, h = -0.3    a = -0.17709, b = 0.24362, h = -4.7428e-08      168.21M    ~ 4h
ALL       a = 0, b = 0.1, h = -0.3    a = 0.0089207, b = 0.11461, h = -1.0685e-07     480.12M    ~ 4h


Table 5: Calibration results of g-and-h based model GH1

This table shows the starting values and the resulting calibrated values of the parameters for the GH1 model for which the total Sum-of-Squared-Errors (SSE) was the smallest. The SSEs can be compared to the other models to determine relative performance. The total time needed to calibrate the model is also presented.

Model GH1: g/h-distribution based on Dutta and Babbel (2005)

Dataset   Initial Δinit                 Optimal Δmin                               SSE        Calibration Time
ATM       b = 30, g = -0.1, h = 0.5     b = 30.473, g = -0.048890, h = 0.79778     118.90M    ~ 1h
ITM       b = 30, g = -0.1, h = 0.5     b = 89.971, g = -0.19757, h = 0.38297      194.64M    ~ 4h
OTM       b = 30, g = -0.1, h = 0.5     b = 91.761, g = -0.094412, h = 0.37571     165.97M    ~ 6h30m
ALL       b = 30, g = -0.1, h = 0.5     b = 91.042, g = -0.15519, h = 0.39471      479.71M    ~ 9h


Table 6: Calibration results of g-and-h based model GH2

This table shows the starting values and the resulting calibrated values of the parameters for the GH2 model for which the total Sum-of-Squared-Errors (SSE) was the smallest. The SSEs can be compared to the other models to determine relative performance. The total time needed to calibrate the model is also presented.

Model GH2: g/h-distribution based on Mills (1995)

Dataset   Initial Δinit                               Optimal Δmin                                                   SSE        Calibration Time
ATM       a = 0, b = 0.1, g = 0.00001, h = -0.3       a = 0.022487, b = 0.10875, g = -0.30414, h = -8.8037e-05       118.08M    ~ 3h
ITM       a = 0.01, b = 0.1, g = 0.0001, h = -0.001   a = 0.010720, b = 0.10027, g = 1.2841e-04, h = -9.6018e-04     192.06M    ~ 8h (*)
OTM       a = 0, b = 0.1, g = 0.00001, h = -0.3       a = 0.019589, b = 0.15685, g = -0.63347, h = -1.1068e-06       178.10M    ~ 12h30m
ALL       a = 0, b = 0.1, g = 0.00001, h = -0.3       a = 0.010491, b = 0.10032, g = 1.2838e-04, h = -9.5302e-04     488.59M    ~ 19h (*)

*) MatLab’s lsqnonlin function was interrupted because it started looping over the same parameter values after the calibration time mentioned, remaining stuck in an endless loop.


The Variance Gamma option pricing model

Madan, Carr and Chang (1998) use maximum likelihood estimation for all VG parameter estimations, but as discussed MatLab’s lsqnonlin function is used here.

Unfortunately MatLab does not have a built-in function for the degenerate hypergeometric function of two variables $\Phi(\alpha, \beta, \gamma; x, y)$ used in Madan, Carr and Chang (1998) (see appendix). This means that numerical methods must be used to approximate the integral using MatLab's quad function, which makes the model computationally expensive.

For this paper the option price will be calculated using the FFT-based implementation presented by Kienitz and Wetterau (2012). This algorithm still requires long computation times though. Note that this method introduces 3 extra parameters: convergence parameter α, grid spacing η and a number of terms N, as described in chapter II.

For these additional parameters, we can use the settings suggested by Carr and Madan (1999): α = 1.5, η = 0.25, N = 2^12. At these settings, MatLab requires about 0.4 ms to calculate a single option price.

Alternatively, the more granular values suggested by Kienitz and Wetterau's (2012) implementation can be used: α = 0.5, η = 0.05, N = 2^14. The expected trade-off is a more accurate calibration (lower SSE) at the expense of longer calculation times. At these settings, MatLab requires about 1.3 ms to calculate a single option price, 3 times as long as for the settings used by Carr and Madan (1999).

Volatility σ is initially set to the standard deviation of all returns in the complete dataset, resulting in calibration VG(1). See Table 7 for the results for both the low granular Carr and Madan (1999) settings as well as the more granular Kienitz and Wetterau (2012) settings. The total runtime for the parameter calibration is mentioned as well for the specific datasets. In a second calibration VG(2) the implied Black-Scholes volatility for a specific dataset is used, but it is still kept constant. See Table 8 for the results.

Finally, in a third calibration VG(3) we make volatility σ an additional calibration parameter with the implied Black-Scholes volatility as its initial value. This allowed MatLab an additional degree of freedom to better fit the dataset at the expense of a much longer calibration time. See Table 9 and Table 10 for the results. Verification afterwards shows that α satisfies the upper bound stated by equation (39) in all cases.


Table 7: Calibration results of Variance Gamma model VG(1L/H)

This table shows the starting values and the resulting calibrated values of the parameters for the VG(1) model for which the total Sum-of-Squared-Errors (SSE) was the smallest. In this model, the global dataset volatility was used for σ and kept constant. Two calibrations were done, with low granular settings VG(1L) and more granular settings for better accuracy VG(1H). The SSEs can be compared to the other models to determine relative performance. The total time needed to calibrate the model is also presented.

VG(1L): fixed global σ, α = 1.5, η = 0.25, N = 2^12

Dataset   Constants     Initial Δinit         Optimal Δmin                 SSE        Calibration Time
ATM       σ = 0.2668    ν = 2.0, θ = -0.1     ν = 2.146, θ = -0.02633      33.98M     ~ 5h
ITM       σ = 0.2668    ν = 2.0, θ = -0.1     ν = 4.338, θ = -0.006697     45.75M     ~ 18h30m
OTM       σ = 0.2668    ν = 2.0, θ = -0.1     ν = 6.450, θ = -0.08477      53.05M     ~ 1h
ALL       σ = 0.2668    ν = 2.0, θ = -0.1     ν = 5.774, θ = -0.06452      153.10M    ~ 2h

VG(1H): fixed global σ, α = 0.5, η = 0.05, N = 2^14

Dataset   Constants     Initial Δinit         Optimal Δmin                 SSE        Calibration Time
ATM       σ = 0.2668    ν = 2.0, θ = -0.1     ν = 2.160, θ = -0.0261       33.95M     ~ 12h30m
ITM       σ = 0.2668    ν = 2.0, θ = -0.1     ν = 3.721, θ = -0.0148       46.34M     ~ 34h
OTM       σ = 0.2668    ν = 2.0, θ = -0.1     ν = 6.450, θ = -0.0848       53.05M     ~ 2h
ALL       σ = 0.2668    ν = 2.0, θ = -0.1     ν = 5.776, θ = -0.0645       153.00M    ~ 6h30m


Table 8: Calibration results of Variance Gamma model VG(2L/H)

This table shows the starting values and the resulting calibrated values of the parameters for the VG(2) model for which the total Sum-of-Squared-Errors (SSE) was the smallest. In this model, the Black-Scholes implied volatility for the specific dataset was used for σ and kept constant. Two calibrations were done, with low granular settings VG(2L) and more granular settings for better accuracy VG(2H). The SSEs can be compared to the other models to determine relative performance. The total time needed to calibrate the model is also presented.

VG(2L): fixed BS σ, α = 1.5, η = 0.25, N = 2^12

Dataset   Constants     Initial Δinit         Optimal Δmin                     SSE        Calibration Time
ATM       σ = 0.1977    ν = 2.0, θ = -0.1     ν = -0.00005995, θ = -0.1077     19.96M     ~ 1h
ITM       σ = 0.2166    ν = 2.0, θ = -0.1     ν = 0.3415, θ = -0.08585         39.04M     ~ 1h
OTM       σ = 0.1821    ν = 2.0, θ = -0.1     ν = 0.0005419, θ = -0.03520      38.47M     ~ 2h30m
ALL       σ = 0.1959    ν = 2.0, θ = -0.1     ν = 0.2040, θ = -0.1463          103.52M    ~ 2h

VG(2H): fixed BS σ, α = 0.5, η = 0.05, N = 2^14

Dataset   Constants     Initial Δinit         Optimal Δmin                     SSE        Calibration Time
ATM       σ = 0.1977    ν = 2.0, θ = -0.1     ν = -0.0001347, θ = -0.1074      19.97M     ~ 3h
ITM       σ = 0.2166    ν = 2.0, θ = -0.1     ν = 0.3415, θ = -0.08583         39.04M     ~ 3h
OTM       σ = 0.1821    ν = 2.0, θ = -0.1     ν = 0.001494, θ = -0.03520       38.47M     ~ 8h30m
ALL       σ = 0.1959    ν = 2.0, θ = -0.1     ν = 0.2047, θ = -0.1462          103.51M    ~ 6h30m


Table 9: Calibration results of Variance Gamma model VG(3L)

This table shows the starting values and the resulting calibrated values of the parameters for the VG(3) model for which the total Sum-of-Squared-Errors (SSE) was the smallest. The model differs from the previous VG models in the volatility σ, which has now become a calibration parameter with the Black-Scholes implied volatility for the specific dataset as its initial value. Low granularity of the numerical integration is used in this particular model. The SSEs can be compared to the other models to determine relative performance. The total time needed to calibrate the model is also presented.

VG(3L): variable σ, α = 1.5, η = 0.25, N = 2^12

Dataset   Initial Δinit                    Optimal Δmin                                   SSE        Calibration Time
ATM       σ = 0.1977, ν = 2.0, θ = -0.1    σ = 0.18542, ν = 0.00021583, θ = 4.6513        19.95M     ~ 1h
ITM       σ = 0.2166, ν = 2.0, θ = -0.1    σ = -0.0027525, ν = 0.20924, θ = -0.46674      37.94M     ~ 10h30m
OTM       σ = 0.1821, ν = 2.0, θ = -0.1    σ = 0.11911, ν = 0.0012550, θ = 3.8312         38.40M     ~ 6h30m
ALL       σ = 0.1959, ν = 2.0, θ = -0.1    σ = 0.099590, ν = 0.066997, θ = -0.68477       101.64M    ~ 6h30m


Table 10: Calibration results of Variance Gamma model VG(3H)

This table shows the starting values and the resulting calibrated values of the parameters for the VG(3) model for which the total Sum-of-Squared-Errors (SSE) was the smallest. The model differs from the previous VG models in the volatility σ, which has now become a calibration parameter with the Black-Scholes implied volatility for the specific dataset as its initial value. The numerical integration is performed in a more granular way in this particular calibration of the model, resulting in a more accurate calibration but with much longer computation time. The SSEs can be compared to the other models to determine relative performance. The total time needed to calibrate the model is also presented.

VG(3H): variable σ, α = 0.5, η = 0.05, N = 2^14

Dataset   Initial Δinit                    Optimal Δmin                               SSE        Calibration Time
ATM       σ = 0.1977, ν = 2.0, θ = -0.1    σ = 0.1857, ν = 0.0002455, θ = 4.301       19.95M     ~ 4h30m
ITM       σ = 0.2166, ν = 2.0, θ = -0.1    σ = 0.001596, ν = 0.2206, θ = -0.4546      37.94M     ~ 11h
OTM       σ = 0.1821, ν = 2.0, θ = -0.1    σ = 0.1133, ν = 0.001282, θ = 3.925        38.40M     ~ 26h
ALL       σ = 0.1959, ν = 2.0, θ = -0.1    σ = 0.09948, ν = 0.06716, θ = -0.6842      101.63M    ~ 24h


V. Model comparison

After the several calibrations of the previous chapter, it is possible to compare the performance of the models:

1. BS – the standard Black-Scholes model;
2. H1 – the model based on the underlying process $S = A + B\,Y_h(Z)$;
3. H2 – the model based on the underlying process $\ln(S/S_0) = A + B\,Y_h(Z)$;
4. GH1 – the model based on the underlying process $S = A + B\,Y_{g,h}(Z)$;
5. GH2 – the model based on the underlying process $\ln(S/S_0) = A + B\,Y_{g,h}(Z)$;
6. VG(1L) – the Variance Gamma model using fixed global volatility, low granularity;
7. VG(1H) – the Variance Gamma model using fixed global volatility, high granularity;
8. VG(2L) – the Variance Gamma model using fixed BS volatility, low granularity;
9. VG(2H) – the Variance Gamma model using fixed BS volatility, high granularity;
10. VG(3L) – the Variance Gamma model using variable volatility, low granularity;
11. VG(3H) – the Variance Gamma model using variable volatility, high granularity.

In-Sample calibration comparison

See Table 11 for a comparison of the Sum-of-Squared-Errors (SSEs) of the various models for the dataset on which they were calibrated; see Figure 5 for a graphical view. Based on the in-sample SSEs, the VG models perform only marginally better than the Black-Scholes model for In-The-Money (ITM) options and the total dataset, provided that the Black-Scholes implied volatilities for the dataset are used to calibrate the VG model. In those cases, SSEs for VG(2) are about 2.7% lower than Black-Scholes. If the volatility is treated as a calibration variable, as is done in model VG(3), then the SSE is 5.4% lower for ITM options and 4.6% lower for the total dataset.

It is disappointing to see that using more granular calculations does not yield any noticeable improvement in the VG models. SSEs for the VG(xH) calibrations are very similar to those of the VG(xL) calibrations, even though they take much more time to calibrate. Also, making the volatility σ a calibration parameter sometimes results in nonsensical values for σ from a volatility point of view. For the ATM dataset, σ calibrates to 18.57%, which still makes sense, but for OTM and TOTAL it drops to ~10% and for ITM even to -0.28%, which cannot be regarded as a measure of volatility anymore.


Figure 5: Comparison of in-sample SSEs

Here the Sum-of-Squared-Errors (SSE) of all the models are shown in a comparative graph for the 4 datasets used in calibration. A lower score is better.

Table 11: Comparison of in-sample SSEs to BS model

This table presents the Sum-of-Squared-Errors (SSE) for each model and each dataset, normalized to the Black-Scholes result, which is set to 100% (a lower score is better). For example, the H1 model yielded an SSE for the At-The-Money (ATM) dataset of 596.5% of the Black-Scholes SSE. In other words, its SSE was almost 6 times as high as that of Black-Scholes, a much worse performance.

          ATM       ITM       OTM       TOTAL
H1        596.5%    485.3%    431.5%    456.7%
H2        591.5%    465.0%    437.3%    450.9%
GH1       595.7%    485.3%    431.4%    450.5%
GH2       591.6%    478.8%    463.0%    458.9%
BS        100.0%    100.0%    100.0%    100.0%
VG(1L)    170.3%    114.1%    137.9%    143.8%
VG(1H)    170.1%    115.5%    137.9%    143.7%
VG(2L)    100.0%    97.3%     100.0%    97.2%
VG(2H)    100.1%    97.3%     100.0%    97.2%
VG(3L)    100.0%    94.6%     99.8%     95.5%
VG(3H)    99.9%     94.6%     99.8%     95.4%

As is clear, the models based on the g-and-h distribution (H1, H2, GH1 and GH2) perform much worse than the standard Black-Scholes model, with SSEs about 6 times as high for ATM options and about 4.5 times as high for all other categories. The H2 model generally performs a little better than the H1 model in this respect, with SSEs up to a few percent lower.

Because of the added g parameter, one would expect the GHx models to perform slightly better than the Hx models. Only slightly, because the g parameter introduces skewness and, as we have seen in chapter III, there is hardly any skewness in the distribution of returns. The GH1 model indeed performs marginally better than the H1 model. The GH2 model, however, performs noticeably worse than the H2 model. The reason for this can be found in the difficulties in calibrating this model, as the model is highly sensitive and unstable: for only very small changes in the values of g and h the integrand might not yield a definite result, which makes it very hard to find a suitable initial vector for lsqnonlin.


Out-of-Sample test results

To provide an out-of-sample comparison, 11,871 S&P 500 index option prices of the period January 2nd, 2012 – February 7th, 2012 have been taken. See Table 12 and Table 13 for an overview of the characteristics of this dataset.

Table 12: Average price and sample size of out-of-sample data

This table presents average option prices and sample sizes for subsets of the S&P500 index option prices for the period January 2nd to February 7th, 2012, our dataset for the out-of-sample test. The subsets are created based on the time to maturity of the options as well as their "moneyness" (asset value to exercise price ratio). Options with a moneyness less than 0.97 are considered Out-of-The-Money (OTM) options, those with a moneyness of 0.97 up to 1.03 are At-The-Money (ATM) options and those with a moneyness greater than 1.03 are In-The-Money (ITM) options.

                        Average price (Days-to-Expiration)    Sample size (Days-to-Expiration)
      Moneyness S/K     < 60      60-180    >= 180            < 60    60-180   >= 180   Subtotal
OTM   < 0.94            $0.67     $2.88     $29.94            850     1,176    1,964
      0.94 - 0.97       $4.22     $19.55    $95.54            470     263      197      4,920
ATM   0.97 - 1.00       $14.55    $36.51    $113.83           454     244      205
      1.00 - 1.03       $36.03    $58.90    $136.47           410     218      170      1,701
ITM   1.03 - 1.06       $65.02    $85.40    $155.63           354     236      202
      >= 1.06           $242.47   $189.58   $288.64           1,551   1,309    1,598    5,250
      Subtotal                                                4,089   3,446    4,336    11,871
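The subset construction itself is straightforward. The following MATLAB sketch, with hypothetical vectors S, K, dte and price describing the out-of-sample options, shows how the moneyness and maturity buckets of Table 12 could be formed.

    m = S ./ K;                               % moneyness
    isOTM = m < 0.97;
    isATM = m >= 0.97 & m <= 1.03;
    isITM = m > 1.03;
    shortDTE = dte < 60;                      % days-to-expiration buckets
    midDTE   = dte >= 60 & dte < 180;
    longDTE  = dte >= 180;
    % e.g. one cell of Table 12: ATM options expiring within 60 days
    avgATMshort = mean(price(isATM & shortDTE));
    nATMshort   = nnz(isATM & shortDTE);      % corresponding sample size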


Table 13: Black-Scholes Implied Volatility for out-of-sample dataset

This table presents the average Black-Scholes implied volatility for subsets of the S&P500 index option prices in the out-of-sample dataset. The subsets are created based on the time to maturity of the options as well as their "moneyness" (asset value to exercise price ratio). Options with a moneyness less than 0.97 are considered Out-of-The-Money (OTM) options, those with a moneyness of 0.97 up to 1.03 are At-The-Money (ATM) options and those with a moneyness greater than 1.03 are In-The-Money (ITM) options.

Jan 2, 2012 up to Feb 7, 2012

                        Days-to-Expiration
      Moneyness S/K     < 60      60-180    >= 180    Subtotal
OTM   < 0.94            18.98%    16.81%    16.08%
      0.94 - 0.97       14.97%    15.89%    18.39%    18.74%
ATM   0.97 - 1.00       15.23%    17.28%    18.78%
      1.00 - 1.03       17.01%    18.85%    19.56%    20.18%
ITM   1.03 - 1.06       19.30%    20.11%    19.86%
      > 1.06            66.64%    22.96%    20.31%    17.16%
      Total                                           18.47%
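The implied volatilities in such a table can be obtained by inverting the Black-Scholes formula for each observed option price, for example with the MATLAB Financial Toolbox function blsimpv. The sketch below assumes hypothetical input vectors S, K, r, T, price and dte for the out-of-sample options.

    iv = blsimpv(S, K, r, T, price);          % one Black-Scholes implied vol per option
    % e.g. the deep in-the-money, short-dated cell of the table:
    ivCell = mean(iv(S./K > 1.06 & dte < 60));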

For all examined models, the calibrated parameters are used to determine their SSEs for these 11,871 option prices. This gives an indication of how well the calibrated models are able to predict option prices outside of the data they have been calibrated on. See Figure 6 and Table 14 for the results.
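The evaluation itself keeps the calibrated parameters fixed and merely recomputes the SSE on the new prices. In MATLAB this could look as follows, where priceFn is a hypothetical handle wrapping any of the pricing models, parVG3 a hypothetical calibrated parameter vector, and S, K, r, T and marketPrice now describe the 11,871 out-of-sample options.

    sseOut = @(priceFn, par) sum((priceFn(S, K, r, T, par) - marketPrice).^2);
    sseVG3 = sseOut(@vgCallPrice, parVG3);   % e.g. the calibrated VG(3) model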

Figure 6: Comparison of out-of-sample SSEs (lower is better)

Here the Sum-of-Squared-Errors (SSEs) of all models are shown in a comparative graph for the four out-of-sample subsets, using the calibrated parameters. A lower score is better.



Table 14: Comparison of out-of-sample SSEs to BS model

This table presents the SSE for each model and each out-of-sample dataset, normalized to the Black-Scholes result, which is set to 100% (a lower score is better). For example, the H1 model yielded an SSE for the At-The-Money (ATM) out-of-sample dataset of 3,927.6% of the Black-Scholes value; in other words, its SSE was almost 40 times that of Black-Scholes, a much worse performance.

Model      ATM         ITM         OTM         TOTAL
H1         3,927.6%    1,750.4%    2,037.7%    1,581.7%
H2         4,147.9%    1,873.4%    2,055.0%    1,725.3%
GH1        3,928.2%    1,749.2%    2,039.4%    1,584.2%
GH2        4,148.6%    1,805.1%    2,328.6%    1,647.6%
BS         100.0%      100.0%      100.0%      100.0%
VG(1L)     459.1%      198.0%      292.0%      340.1%
VG(1H)     458.1%      414.2%      291.9%      340.0%
VG(2L)     100.4%      115.5%      100.1%       87.5%
VG(2H)     100.4%      115.5%      100.0%       87.4%
VG(3L)     101.0%       96.7%      105.3%       66.0%
VG(3H)     100.7%       96.9%      105.4%       66.0%

As we can see, most models perform much worse than the standard Black-Scholes model in predicting call option prices in the out-of-sample data. Even the VG(1) model yields very poor results, with SSEs 2 to 4.5 times those of Black-Scholes. The VG(2) model did reasonably well, scoring similar to Black-Scholes on the ATM and OTM subsets and achieving a 12.5% improvement in SSE on the TOTAL dataset. However, the results for ITM options are disappointing: VG(2) achieved SSEs about 2.7% below Black-Scholes during calibration, but for the out-of-sample dataset its SSEs are 15.5% worse.

The big surprise is the VG(3) model, which performs quite similarly to the Black-Scholes model both during calibration and in this out-of-sample test. However, for the TOTAL out-of-sample dataset the VG(3) model achieves a 34% lower SSE than Black-Scholes, an extraordinary improvement.


It is remarkable that the H2 model performs much worse out-of-sample than its H1 counterpart, even though it calibrated to lower SSEs. This may be an indication of overfitting: apparently the H2 model can be calibrated well, but it has less predictive power for data outside its calibration domain.

In this comparison, the VG(3) model clearly performs best for the out-of-sample TOTAL dataset and slightly better than Black-Scholes for the ITM dataset, while the Black-Scholes model performs best for the ATM and OTM subsets.


Computational complexity

Each model requires a certain amount of time to compute an option price, given its set of calibrated parameters; see Table 15 for a comparison. The values have not been measured with great precision, but they are close enough to indicate how the various models compare to each other in terms of computational complexity.

As can be seen, the Black-Scholes model is very fast, up to 800 times faster than the other models. Of the remaining models, the VG model with low-granularity parameters performs best, being "only" 200 times slower than Black-Scholes.

Table 15: Option price computation time

This table presents the average computation time for each model to calculate the price of one option, as well as the indexed computation time with Black-Scholes set to 1. For example, the H1 model took about 0.8 ms to calculate a single option price, 400 times more than the Black-Scholes model.

Option price model                                  Calculation time             Indexed
                                                    (milliseconds per option)    (Black-Scholes = 1)
Black-Scholes (BS)                                  0.002                        1
h-distribution (H1)                                 0.8                          400
h-distribution (H2)                                 1.1                          550
g/h-distribution (GH1)                              1.4                          700
g/h-distribution (GH2)                              1.6                          800
Variance Gamma (VGxL), low granularity
  (α = 1.5, η = 0.25, N = 2^12)                     0.4                          200
Variance Gamma (VGxH), high granularity
  (α = 0.5, η = 0.05, N = 2^14)                     1.3                          650
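Per-option timings of this kind could be obtained by averaging the wall-clock time of many repeated pricing calls, as in the following MATLAB sketch; priceOneOption is a hypothetical handle pricing a single option with fixed, already-calibrated parameters. Averaging is needed because a single 2 µs Black-Scholes call is far below the resolution of a wall-clock timer.

    nReps = 10000;                    % average over many calls; one call alone
    tic                               % is too short to time reliably
    for k = 1:nReps
        p = priceOneOption();
    end
    msPerOption = 1000 * toc / nReps; % milliseconds per option price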
