
Does the price difference in theoretical and market prices of options have a predictive value?



MSc Thesis Financial Econometrics

By Z.Z. Hu (10013709) Amsterdam, July 27, 2015

University of Amsterdam Supervisor prof. dr. J.G. de Gooijer Second reader prof. dr. H.P. Boswijk

Abstract

The aim of this thesis is to provide a framework for estimating the effects of the price difference in theoretical and market prices of options on future option prices. Using coefficients from a cross-sectional ordinary least squares regression of option returns on the price difference in theoretical and market prices, we obtain predicted returns. A set of trading rules driven by one-step ahead forecasts of option returns is devised. The trading performance is evaluated by means of trading simulations. Both in-sample and out-of-sample forecasts are considered.

1 Introduction

With the establishment of the Chicago Board Options Exchange (CBOE) in 1973, options started to be traded on an organized exchange. On April 26, the first day of trading on the CBOE, 911 option contracts were traded. In the same year, Black and Scholes (1973) formulated an equation to price financial options. Merton (1973) extended their work and coined the term “Black-Scholes options pricing model”. In 1997, Scholes and Merton were awarded the Nobel Prize in Economics, illustrating the impact their work had on financial markets.¹ Since then, option trading has experienced remarkable growth; according to statistics by the Futures Industry Association, 9.5 billion option contracts were traded worldwide in 2012. With the phenomenal growth of option trading in the past decades, extensive research has been carried out, both in the academic world and in the financial industry, on improving the accuracy of option pricing models.

¹ Black was mentioned for his important contributions to the Black-Scholes (1973) option pricing model, but was not awarded the Nobel Prize in Economics because of his death in 1995.

The Black-Scholes (BS) (1973) model became one of the best-known option pricing models, and it is commonly used as a benchmark within the financial industry and by academics. It is popular for its computational and conceptual simplicity and its reasonable accuracy. However, the model has some flaws, and other option pricing models have been developed to address them. One shortcoming of the BS model is that it assumes a constant volatility of the underlying asset. Various other option pricing models have been proposed that incorporate a non-constant volatility and produce theoretical option prices that are closer to market prices. Despite abundant research on improving option pricing models, there has been little research on whether options move along a corrective path, so that price differences are eliminated, or whether a large pricing difference indicates that the options will move in a “counter-corrective” direction.² Therefore, in this thesis, we investigate whether the difference between theoretical and market prices of options has predictive value for future option prices. If option mispricing is found to have predictive value, this could have substantial implications for participants in the financial markets. The financial industry actively searches for methods to predict future stock and option prices through statistical methods and technical analysis. If this study finds such predictive value, it can be used to support investors in their trading decisions.

Theoretical option prices are generated from three different option pricing models. To create these option prices, only data that is available at the moment of trading is used. We derive theoretical option prices based on historical data for the S&P 500 index. For computational convenience, and because of limited data availability, we focus on discrete-time data by evaluating daily closing prices of options. Option data can be categorized by time, strike price and expiration date. However, because we only consider highly traded options and because options expire, we do not obtain balanced panel data. Therefore, in order to answer the research question, we perform a cross-sectional analysis. An ordinary least squares (OLS) regression with White standard errors is proposed, as we find strong evidence of heteroscedasticity in our data-set. Using the coefficients from this regression, option return forecasts are obtained. Trading simulations guided by these forecasts are run in order to investigate whether a trading system using mispricing information is consistently profitable.

The remainder of this thesis is organized as follows. In the following section, we review the literature on option pricing models. Furthermore, we discuss the literature that investigates the predictive power of variables for the returns of financial assets and options. In Section 3, we formulate the option pricing models which are used for this research. In Section 4, we describe our data-set and the methodology used to answer the research question. Section 5 presents our results; first, we refine the data-set based on the mispricing errors, and then the results from the OLS regressions and the trading simulations are discussed. Finally, Section 6 concludes the thesis.

² A corrective movement is a movement in which an option that is overpriced (underpriced) according to an option pricing model at time 𝑡 − 1 decreases (increases) in price at time 𝑡.

2 Literature

The BS model assumes a constant volatility of the underlying asset, which is not in line with reality. In practice, a volatility smile is observed, implying that options with different strike prices and different maturities have different implied volatilities. This became much more evident after the market crash of 1987. Hence, extensions to the BS model and alternative option pricing models have been proposed that take a non-constant volatility into account. Some of these models have been shown to produce significantly more accurate option prices. Mispricing by the BS model could occur, for instance, simply because of a change in volatility, which would make it less likely to find predictive value in option mispricing if the BS model were used for our research. Therefore, it is crucial to use option pricing models that incorporate a non-constant volatility in order to deliver more accurate theoretical option prices.

Several option pricing models have been proposed that take non-constant volatility into account. The most popular alternatives can be divided into three groups. The first group of alternative option pricing models consists of minor modifications of the BS model. For instance, Dumas et al. (1998) propose the ad-hoc Black-Scholes (ad-hoc BS) model, where implied volatilities of options are smoothed along strike prices and maturities. Within the ad-hoc BS approach, there are some variations in how implied volatilities are smoothed. The second group of alternative option pricing models incorporates stochastic volatility (SV). The two best-known option pricing models of this type are the SV models by Hull and White (1987) and Heston (1993). The third group of popular option pricing models incorporates time-varying volatility. This group is based on the autoregressive conditional heteroscedasticity (ARCH) model proposed by Engle (1982) and its generalization, the generalized autoregressive conditional heteroscedasticity (GARCH) model by Bollerslev (1986). Duan (1995) proposes a GARCH option pricing model, where the volatility changes over time according to a GARCH process. Duan’s (1995) GARCH model relies on Monte Carlo simulation and is therefore computationally intensive. Heston and Nandi (2000) formulate a closed-form GARCH option pricing model, reducing computing time.

Many of these alternative models are found to produce more accurate theoretical option prices than the BS model. For instance, Dumas et al. (1998) find that their ad-hoc BS model prices options more accurately than the original BS model. Kim and Kim (2004) find that Heston’s (1993) SV model outperforms both the ad-hoc BS model by Dumas et al. (1998) and the GARCH model by Heston and Nandi (2000). Bakshi et al. (1997) find that pricing options under the assumptions of Hull and White’s (1987) SV model delivers more accurate results than the BS model. Duan and Zhang (2001) compare the empirical performance of Duan’s (1995) GARCH pricing model with the BS model, and conclude that the GARCH model delivers more accurate option prices. Lehar et al. (2002) find that the GARCH option pricing model by Duan (1995) significantly outperforms both the SV model by Hull and White (1987) and the BS model. Hsieh and Ritchken (2005) find that both Duan’s (1995) and Heston and Nandi’s (2000) GARCH models perform better than the BS model. When both GARCH models are compared, they conclude that Duan’s (1995) GARCH model outperforms the GARCH model by Heston and Nandi (2000). Kokoszczyński et al. (2012) find promising results for a modification of the Black (1976) model, which they refer to as the “Black model with implied volatility (BIV)”. Their approach is similar to the ad-hoc model by Dumas et al. (1998). In follow-up research, Kokoszczyński et al. (2010) compare Heston’s (1993) SV model, Duan’s (1995) GARCH model, and the BIV model.³ Similar to the results of Kim and Kim (2004), who find that Heston’s (1993) SV model outperforms Heston and Nandi’s (2000) GARCH model, Kokoszczyński et al. (2010) show that Heston’s (1993) SV model outperforms the GARCH model by Duan (1995). Their main results indicate that the BIV model outperforms both Duan’s (1995) GARCH model and Heston’s (1993) SV model. Table 1 summarizes the evaluation measures employed by these authors.

From the literature, the SV model by Heston (1993), the BIV model by Kokoszczyński et al. (2012), and Duan’s (1995) GARCH model seem to be among the most accurate option pricing models.⁴ Therefore, to investigate whether option mispricing has predictive value, these three option pricing models will be used to generate theoretical option prices. However, it is worth mentioning that in order to implement the GARCH model, we follow the same approach as Duan (1995), who uses data on the past returns of the underlying asset to obtain the parameters for the proposed GARCH model. By contrast, Bakshi et al. (1997), Duan and Zhang (2001), Lehar et al. (2002), and Hsieh and Ritchken (2005) use a cross-section of option prices to obtain the parameter estimates for Duan's (1995) GARCH option pricing model. Also, regarding the BIV model, we apply the approach of Kokoszczyński et al. (2012) to the BS model rather than to the Black (1976) model. The reasons for these two alternative approaches for the BIV model and the GARCH model are detailed in Section 3, where we formulate the three selected option pricing models.

³ The paper of Kokoszczyński et al. (2010) has not yet been published, which is why the year in parentheses is earlier than that of Kokoszczyński et al. (2012).

4


Moving on, given the potential to earn substantial profits, researchers have widely investigated the predictive performance of available data for future market returns. Welch and Goyal (2008) investigate the performance of several variables that, throughout the literature, have been suggested to have predictive value for market returns. Using the suggested variables to predict the returns of the S&P 500 index, they find little evidence of return predictability.

However, in the field of option trading, there might be somewhat more evidence of variables with predictive value. A certain degree of option mispricing can be caused by investors with inside information. In the literature, there is some evidence for this, as informed investors frequently use leveraged financial products such as options to gain larger profits. For instance, Arnold et al. (2006) investigate the behavior of stocks with traded options versus stocks without traded options in the period before takeover bids are publicly announced.⁵ They find that stocks without traded options experience exceptionally high trading volumes, whereas for stocks with traded options, only the corresponding options experience abnormally high trading volumes. Cremers and Weinbaum (2010) find that a violation of the put-call parity in option prices can predict future stock returns. Specifically, they find that stocks with relatively expensive call options perform better than stocks with relatively expensive put options. Such literature suggests that counter-corrective movements of option prices are more likely to occur than corrective movements. However, the aforementioned authors focus their research on individual stock options rather than on a well-diversified market index. Investors with non-public information are more likely to buy individual stock options than diversified market index options, making it less likely that, with our data-set, we would find that option mispricing predicts counter-corrective price movements.

Related to the research on the effects of mispricing of options on future returns, Kanoh and Takeuchi (2011) investigate the mispricing of options. They find that the variables maturity and moneyness of options affect the magnitude of option mispricing. Hence, these are the variables that we consider adding to the OLS regression in order to measure the pure effects of mispricing on the returns of options. Additionally, these authors find evidence of heteroscedasticity in their data, which justifies testing for the presence of heteroscedasticity in our data-set.

⁵ Stocks and their corresponding options usually experience very high upward price movements after takeover announcements.

3 Option pricing models

In this section, we formulate the three option pricing models from which we generate the theoretical option prices. For the first option pricing model, we apply the option pricing approach by Kokoszczyński et al. (2012) on the BS model. The second option pricing model is the SV model by Heston (1993). The third and last pricing model that is defined in this section is the GARCH option pricing model by Duan (1995). Finally, at the end of this section, we will elaborate on the different causes and implications of mispricing by the used option pricing models.

3.1 Black-Scholes Model

Black and Scholes (1973) assume an option pricing model in which the underlying asset $S_t$ has a constant volatility $\sigma$ and follows the stochastic differential equation

$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t, \tag{3.1}$$

where $\mu$ is the drift rate of $S_t$ and $W_t$ is a Wiener process. The call and put option prices satisfy the put-call parity

$$C(S,t) - P(S,t) = S_t - K e^{-r(T-t)}. \tag{3.2}$$

Hence, the BS option prices can be formulated as

$$C_{BS}(S,t) = S_t\,\Phi(d_1) - K e^{-r(T-t)}\,\Phi(d_2), \tag{3.3}$$

$$P_{BS}(S,t) = K e^{-r(T-t)}\,\Phi(-d_2) - S_t\,\Phi(-d_1), \tag{3.4}$$

where $\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz$ and

$$d_1 = \frac{\ln(S_t/K) + (r + \sigma^2/2)(T-t)}{\sigma\sqrt{T-t}}, \qquad d_2 = d_1 - \sigma\sqrt{T-t}. \tag{3.5}$$

Throughout this whole section, 𝐾 denotes the strike price, 𝑇 denotes the date of expiration and 𝑟 denotes the risk-free interest rate. We refer to the paper of Black and Scholes (1973) for the complete derivation of the BS model.
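As a rough illustration of (3.3) to (3.5), the following Python sketch computes BS call and put prices and checks the put-call parity (3.2) numerically. The function names and the input values (index level 1400, 15% volatility, 30 days to expiration) are illustrative assumptions and are not taken from the thesis, whose own computations were done in Matlab.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF Phi(x), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_prices(s: float, k: float, r: float, sigma: float, tau: float):
    """Return (call, put) Black-Scholes prices; tau = T - t in years."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    disc_k = k * math.exp(-r * tau)  # discounted strike K * exp(-r * tau)
    call = s * norm_cdf(d1) - disc_k * norm_cdf(d2)
    put = disc_k * norm_cdf(-d2) - s * norm_cdf(-d1)
    return call, put


# Hypothetical inputs, chosen only for illustration.
tau = 30 / 365
call, put = bs_prices(s=1400.0, k=1400.0, r=0.01, sigma=0.15, tau=tau)

# Put-call parity (3.2): C - P should equal S - K * exp(-r * tau).
parity_gap = (call - put) - (1400.0 - 1400.0 * math.exp(-0.01 * tau))
```

The parity gap should be zero up to floating-point rounding, which is a convenient sanity check for any implementation of the two formulas.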

3.2 Black-Scholes model with implied volatility

We use an adjusted version of the approach of Kokoszczyński et al. (2012) for computing option prices with the BS model. As noted before, they refer to their method as the “Black model with implied volatility (BIV)”. Kokoszczyński et al. (2012) modify the Black (1976) model, which is used to price options on futures contracts, rather than the BS model for stocks and indices. They state two reasons for this. First, they can relax the continuous dividend assumption. Second, and more importantly, they are able to use additional data, as options and futures are traded much longer than the underlying asset.⁶,⁷ The authors use a 5-minute data interval and therefore prefer to use the index future as the underlying asset of the options. As we focus only on daily closing prices of options, and additional intraday data is costly, we apply their method of option pricing to the BS model for stocks and indices rather than to the Black (1976) model for futures. We refer to the resulting model as the BSIV model.

In the method of Kokoszczyński et al. (2012), the volatility of the underlying asset is no longer a free input of the BS model. In order to compute theoretical option prices at time 𝑡, the implied volatilities of the previous observations at 𝑡 − 1 have to be calculated first. The implied volatility is the volatility for which the BS price of an option equals its market price. As part of the approach by Kokoszczyński et al. (2012), the implied volatilities are averaged within the same moneyness and maturity group. Based on their data-set, they define five moneyness groups and five maturity groups. By evaluating both call and put options, they thus obtain 50 (5x5x2) different implied volatility values. Afterwards, the averaged implied volatilities are treated as inputs to the BS model to compute option prices one time step ahead. This procedure takes the observed volatility smile into account, where options with different moneyness and maturities have different implied volatilities. The algorithm is repeated for each time step.
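The two steps described above, inverting the BS formula for the implied volatility at 𝑡 − 1 and averaging within groups, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: bisection is one possible root finder and not necessarily the one used in the thesis, only calls are shown, and the group labels ("ATM", "near") and all numbers are hypothetical.

```python
import math
from collections import defaultdict


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, k, r, sigma, tau):
    """Black-Scholes call price, as in (3.3)."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


def implied_vol(price, s, k, r, tau, lo=1e-4, hi=5.0, tol=1e-8):
    """Bisection on sigma; valid because the BS call price increases in sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(s, k, r, mid, tau) > price:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def average_iv_by_group(observations):
    """observations: iterable of (group_key, implied_vol) -> {group_key: mean}."""
    buckets = defaultdict(list)
    for key, iv in observations:
        buckets[key].append(iv)
    return {key: sum(v) / len(v) for key, v in buckets.items()}


# Hypothetical quotes at t - 1: (group key, strike, market price).
s_prev, r, tau = 1400.0, 0.01, 30 / 365
quotes = [
    (("ATM", "near", "call"), 1390.0, bs_call(s_prev, 1390.0, r, 0.18, tau)),
    (("ATM", "near", "call"), 1410.0, bs_call(s_prev, 1410.0, r, 0.22, tau)),
]
ivs = [(key, implied_vol(p, s_prev, k, r, tau)) for key, k, p in quotes]
group_iv = average_iv_by_group(ivs)

# The group average is then used as sigma when pricing the same group at time t.
price_t = bs_call(1405.0, 1400.0, r, group_iv[("ATM", "near", "call")], tau - 1 / 365)
```

Here the "market" prices are generated from known volatilities (0.18 and 0.22) so that the inversion can be checked; with real quotes the same functions would be applied to observed prices.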

For this study, we consider option data for the two nearest expiration months, and we exclude observations where the expiration is less than seven days ahead. As we consider options with at most 60 days until expiration, at most two expiration dates are considered at each time step. Hence, 20 (5x2x2) implied volatility groups are defined: five moneyness groups, two maturity groups, and two option types. A further inspection of the data-set is needed in order to categorize our options into moneyness groups. Moneyness groups for the BSIV model are defined in Section 5, where we proceed with the data-set refinement for the option pricing models.

⁶ Futures prices reflect the market expectation of the price of the underlying at the expiration date; therefore, futures prices also take into account dividend payouts that occur before expiration.

7

3.3 Heston’s stochastic volatility model

Heston (1993) obtains a closed-form solution for pricing European style call options, assuming stochastic volatility. The underlying asset 𝑆𝑡 follows a stochastic differential process as in the BS model, whereas the variance 𝑣𝑡 follows a Cox-Ingersoll-Ross (1985) process. The diffusion of the SV model is formulated as

$$dS_t = \mu S_t\,dt + \sqrt{v_t}\,S_t\,dW_{1,t}, \tag{3.6}$$

$$dv_t = \kappa(\theta - v_t)\,dt + \sigma\sqrt{v_t}\,dW_{2,t}. \tag{3.7}$$

The variables of (3.6) and (3.7) are defined as follows:

• $W_{1,t}$ and $W_{2,t}$ are Wiener processes with correlation $\rho$, that is, $dW_{1,t}\,dW_{2,t} = \rho\,dt$,

• $\theta$ is the long-term price variance,

• $\kappa$ is the speed of mean reversion to the long-term price variance $\theta$,

• $\sigma$ is the volatility of the volatility $\sqrt{v_t}$.

In order for the variance to be positive, the Feller (1951) condition

$$2\kappa\theta > \sigma^2 \tag{3.8}$$

needs to hold.
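To make the dynamics and the Feller condition concrete, the sketch below checks (3.8) for a given parameter set and simulates one path of a mean-reverting square-root variance process driving the asset price, with an Euler discretization. The "full truncation" device (using max(v, 0) inside the square root) is an assumption of this sketch, a common way to keep the discretized variance usable; it is not taken from the thesis, and all parameter values are hypothetical.

```python
import math
import random


def feller_ok(kappa: float, theta: float, sigma: float) -> bool:
    """Feller condition (3.8): 2 * kappa * theta > sigma^2."""
    return 2.0 * kappa * theta > sigma ** 2


def simulate_heston_path(s0, v0, mu, kappa, theta, sigma, rho, n_steps, dt, rng):
    """Euler scheme with full truncation; returns the terminal (S, v)."""
    s, v = s0, v0
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second shock with correlation rho to the first one.
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        v_pos = max(v, 0.0)  # truncate negative variance before using it
        s *= math.exp((mu - 0.5 * v_pos) * dt + math.sqrt(v_pos * dt) * z1)
        v += kappa * (theta - v_pos) * dt + sigma * math.sqrt(v_pos * dt) * z2
    return s, v


rng = random.Random(42)
s_T, v_T = simulate_heston_path(
    s0=1400.0, v0=0.04, mu=0.05, kappa=2.0, theta=0.04, sigma=0.3, rho=-0.7,
    n_steps=60, dt=1 / 252, rng=rng,
)
```

With kappa = 2, theta = 0.04 and sigma = 0.3 the condition holds (0.16 > 0.09); lowering kappa to 0.5 violates it (0.04 < 0.09), in which case the continuous-time variance process can reach zero.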

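Combining Ito’s lemma and the standard arbitrage arguments of Black and Scholes (1973), Heston (1993) shows that the call option has to satisfy the following partial differential equation

$$\frac{1}{2} v S^2 \frac{\partial^2 C}{\partial S^2} + \rho\sigma v S \frac{\partial^2 C}{\partial S\,\partial v} + \frac{1}{2}\sigma^2 v \frac{\partial^2 C}{\partial v^2} + rS\frac{\partial C}{\partial S} + \left[\kappa(\theta - v_t) - \lambda(S,v,t)\right]\frac{\partial C}{\partial v} - rC + \frac{\partial C}{\partial t} = 0, \tag{3.9}$$

where $\lambda$ denotes the price of volatility risk. Furthermore, for a European call option the following boundary conditions must hold:

$$\begin{aligned}
C(S,v,T) &= \max(0, S-K),\\
C(0,v,t) &= 0,\\
\frac{\partial C}{\partial S}(\infty,v,t) &= 1,\\
rS\frac{\partial C}{\partial S}(S,0,t) + \kappa\theta\frac{\partial C}{\partial v}(S,0,t) - rC(S,0,t) + \frac{\partial C}{\partial t}(S,0,t) &= 0,\\
C(S,\infty,t) &= S.
\end{aligned} \tag{3.10}$$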

By analogy with the BS formula, Heston (1993) guesses the value of a non-dividend paying European call option $C_H$ as

$$C_H(S_t, K, t, T) = (S_t - D_t)\,p_1 - K e^{-r(T-t)}\,p_2, \tag{3.11}$$

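where $p_1$ and $p_2$ are risk-neutral probabilities. Furthermore, when (3.11) is substituted in (3.9), $p_j$ has to satisfy the following partial differential equation

$$\frac{1}{2} v \frac{\partial^2 p_j}{\partial x^2} + \rho\sigma v \frac{\partial^2 p_j}{\partial x\,\partial v} + \frac{1}{2}\sigma^2 v \frac{\partial^2 p_j}{\partial v^2} + (r + u_j v)\frac{\partial p_j}{\partial x} + (\kappa\theta - b_j v)\frac{\partial p_j}{\partial v} + \frac{\partial p_j}{\partial t} = 0, \tag{3.12}$$

for $j = 1, 2$ and $x = \ln S$. For the option price to satisfy the terminal condition in (3.10), the risk-neutral probabilities must satisfy

$$p_j(x, v, T; \ln K) = \mathbf{1}_{\{x \ge \ln K\}}. \tag{3.13}$$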

As $x$ follows the diffusion

$$dx_t = (r + u_j v_t)\,dt + \sqrt{v_t}\,dW_{1,t}, \qquad dv_t = (\kappa\theta - b_j v_t)\,dt + \sigma\sqrt{v_t}\,dW_{2,t}, \tag{3.14}$$

it can be shown that

$$p_j(x, v, t; \ln K) = \Pr\left[x_T \ge \ln K \mid x_t = x,\ v_t = v\right]. \tag{3.15}$$

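Moreover, the characteristic functions $f_j$ of the probabilities $p_j$ satisfy (3.12), subject to the terminal condition

$$f_j(x, v, T, T, \phi) = e^{i\phi x}, \tag{3.16}$$

from which it can be shown that the characteristic function solution is given by

$$f_j(x, v, t, T, \phi) = e^{C(T,t,\phi) + D(T,t,\phi)\,v_t + i\phi x}, \tag{3.17}$$

where the functions in (3.17) are defined as

$$C(T,t,\phi) = r\phi i\,(T-t) + \frac{\kappa\theta}{\sigma^2}\left[(b_j - \rho\sigma\phi i + d)(T-t) - 2\ln\left(\frac{1 - g e^{d(T-t)}}{1 - g}\right)\right], \tag{3.18}$$

$$D(T,t,\phi) = \frac{b_j - \rho\sigma\phi i + d}{\sigma^2}\left(\frac{1 - e^{d(T-t)}}{1 - g e^{d(T-t)}}\right), \qquad g = \frac{b_j - \rho\sigma\phi i + d}{b_j - \rho\sigma\phi i - d},$$

$$d = \sqrt{(\rho\sigma\phi i - b_j)^2 - \sigma^2(2u_j\phi i - \phi^2)},$$

$$u_1 = 1/2, \quad u_2 = -1/2, \quad b_1 = \kappa + \lambda - \rho\sigma, \quad b_2 = \kappa + \lambda.$$

The risk-neutral probabilities are then recovered by inverting the characteristic functions:

$$p_j = \frac{1}{2} + \frac{1}{\pi}\int_0^\infty \mathrm{Re}\left[\frac{e^{-i\phi\ln K}\, f_j(x, v, t, T, \phi)}{i\phi}\right] d\phi, \qquad j = 1, 2. \tag{3.19}$$

In (3.19), the integral is taken over the real part of the complex-valued integrand, where $i$ is the imaginary unit, which satisfies $i^2 = -1$. We refer to Heston’s (1993) paper for the full derivation of the SV model.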
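The pricing formula (3.11) with the probabilities (3.19) can be evaluated numerically. The Python sketch below does so with the standard library only, under several assumptions that are not from the thesis: the price of volatility risk is set to zero, the integral is truncated at an upper limit and approximated by a midpoint rule, and the characteristic function is written in the algebraically equivalent form that uses −d in place of d, which is commonly preferred for numerical stability. All parameter values are hypothetical.

```python
import cmath
import math


def heston_cf(phi, j, x, v, tau, r, kappa, theta, sigma, rho, lam=0.0):
    """Characteristic function f_j of (3.17), in the equivalent '-d' form."""
    u = 0.5 if j == 1 else -0.5
    b = kappa + lam - rho * sigma if j == 1 else kappa + lam
    beta = b - rho * sigma * phi * 1j
    d = cmath.sqrt(beta ** 2 - sigma ** 2 * (2.0 * u * phi * 1j - phi ** 2))
    g = (beta - d) / (beta + d)
    e = cmath.exp(-d * tau)
    big_c = r * phi * 1j * tau + kappa * theta / sigma ** 2 * (
        (beta - d) * tau - 2.0 * cmath.log((1.0 - g * e) / (1.0 - g))
    )
    big_d = (beta - d) / sigma ** 2 * (1.0 - e) / (1.0 - g * e)
    return cmath.exp(big_c + big_d * v + 1j * phi * x)


def heston_prob(j, s, k, v, tau, r, kappa, theta, sigma, rho, upper=100.0, n=4000):
    """Probability p_j of (3.19), midpoint rule on (0, upper)."""
    x, log_k, h = math.log(s), math.log(k), upper / n
    total = 0.0
    for m in range(n):
        phi = (m + 0.5) * h  # midpoints avoid evaluating at phi = 0
        f = heston_cf(phi, j, x, v, tau, r, kappa, theta, sigma, rho)
        total += (cmath.exp(-1j * phi * log_k) * f / (1j * phi)).real
    return 0.5 + total * h / math.pi


def heston_call(s, k, v, tau, r, kappa, theta, sigma, rho, pv_div=0.0):
    """Call price as in (3.11); pv_div plays the role of D_t."""
    s_adj = s - pv_div
    p1 = heston_prob(1, s_adj, k, v, tau, r, kappa, theta, sigma, rho)
    p2 = heston_prob(2, s_adj, k, v, tau, r, kappa, theta, sigma, rho)
    return s_adj * p1 - k * math.exp(-r * tau) * p2


def bs_call(s, k, r, vol, tau):
    """Black-Scholes call, used only as a sanity check."""
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    d1 = (math.log(s / k) + (r + 0.5 * vol ** 2) * tau) / (vol * math.sqrt(tau))
    return s * cdf(d1) - k * math.exp(-r * tau) * cdf(d1 - vol * math.sqrt(tau))


# With almost no volatility of volatility and v = theta, the SV price
# should be close to the BS price with sigma_BS = sqrt(theta) = 0.2.
near_bs = heston_call(100.0, 100.0, 0.04, 0.5, 0.02, 2.0, 0.04, 0.01, -0.5)
reference = bs_call(100.0, 100.0, 0.02, 0.2, 0.5)

# A more typical (hypothetical) parameter set at two strikes.
call_90 = heston_call(100.0, 90.0, 0.04, 0.5, 0.02, 2.0, 0.04, 0.3, -0.7)
call_110 = heston_call(100.0, 110.0, 0.04, 0.5, 0.02, 2.0, 0.04, 0.3, -0.7)
```

Comparing against the BS price in the near-constant-volatility limit is a cheap way to catch sign or branch errors in the characteristic function before any calibration is attempted.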

The parameters of the SV model are not directly observable from market data; hence, they have to be calibrated. For this purpose, we introduce a time index such that the parameters $\Omega_t = (\kappa_t, \theta_t, \sigma_t, v_t, \rho_t)$ can be calibrated from the market prices on a daily basis. It is worth mentioning that Heston (1993) assumes that the parameters $\kappa$, $\theta$, $\sigma$, and $\rho$ are time-independent, as he does not consider daily calibration of the parameters. We choose to calibrate the parameters for each day separately in order to increase the goodness-of-fit of the SV model. With $N_t$ observations at time $t$, we calibrate the parameters $\Omega_t$ daily by minimizing the non-linear least squares error of the relative option price difference:

$$\min_{\Omega_t} \sum_{i=1}^{N_t} \left[\frac{C_{m,i}(S_t, K_i, t, T_i) - C_{H,i}(S_t, K_i, t, T_i)}{C_{m,i}(S_t, K_i, t, T_i)}\right]^2. \tag{3.20}$$

We choose a calibration method that minimizes the relative pricing difference instead of the absolute price difference. The disadvantage of the latter method is that highly valued options are likely to have consistently very low percentage pricing errors, while lowly valued options are likely to have consistently high percentage pricing errors. Minimizing the relative pricing difference can avoid such mispricing.
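The difference between the two objectives can be seen in a small numerical example. The sketch below evaluates the relative objective of (3.20) and an absolute counterpart for two hypothetical options, one expensive and one cheap, that are both mispriced by 0.5; the function names and numbers are illustrative assumptions, and the minimization over the parameters itself is not shown.

```python
def relative_sse(market_prices, model_prices):
    """Sum of squared relative pricing errors, the objective in (3.20)."""
    return sum(((m - h) / m) ** 2 for m, h in zip(market_prices, model_prices))


def absolute_sse(market_prices, model_prices):
    """Sum of squared absolute pricing errors, shown for comparison."""
    return sum((m - h) ** 2 for m, h in zip(market_prices, model_prices))


# Hypothetical market and model prices: both options are off by 0.5.
market = [100.0, 1.0]
model = [100.5, 1.5]

rel = relative_sse(market, model)   # (0.005)^2 + (0.5)^2
abs_ = absolute_sse(market, model)  # 0.25 + 0.25
```

Under the absolute objective both options contribute equally, although the cheap option is mispriced by 50% and the expensive one by 0.5%. Under the relative objective the cheap option dominates, so a calibration routine is pushed to reduce its percentage error.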

After obtaining the parameter values from calibration, the theoretical option prices can be computed. The value of an SV put option follows from the put-call parity, written in terms of the dividend-adjusted asset price $S_t - D_t$ of (3.11):

$$P_H(t, T, K) = C_H(t, T, K) - (S_t - D_t) + K e^{-r(T-t)}, \tag{3.21}$$

where 𝑃𝐻 represents the SV put price. It is possible to derive the SV put prices from the market data of the call options. However, we choose to derive these prices from the market data for put options. This approach is preferred because market liquidity is concentrated in different strike prices for call and put options as moneyness categories are defined differently. Moreover, by using the market data of put options, we are able to obtain more accurate put option prices. With the put-call parity using only the data of put options, the market call options can be derived as


Afterwards, the calibration method of (3.20) is performed on the transformed call market prices from (3.23). Finally, the SV put prices can be obtained as

$$P_H(t, T, K) = C_H(t, T, K) - (S_t - D_t) + K e^{-r(T-t)}. \tag{3.24}$$

3.4 GARCH option pricing model

Duan (1995) formulates the following model, in which the one-period returns of the underlying asset $S_t$ are conditionally lognormally distributed:

$$\ln\frac{S_t}{S_{t-1}} = r + \lambda\sqrt{h_t} - \frac{1}{2}h_t + \varepsilon_t. \tag{3.25}$$

Here $\lambda$ is the unit risk premium, $\varepsilon_t \mid \phi_{t-1} \sim N(0, h_t)$, with $\phi_{t-1}$ the information set containing all information up to and including time $t-1$, and the GARCH process is defined as

$$h_t = \alpha_0 + \sum_{i=1}^{q}\alpha_i\,\varepsilon_{t-i}^2 + \sum_{j=1}^{p}\beta_j\,h_{t-j}, \tag{3.26}$$

where $p \ge 1$ and $q \ge 1$ are the orders of the GARCH($p$, $q$) model. The GARCH parameters $\alpha_0$, $\alpha_i$ ($i = 1, \dots, q$), and $\beta_j$ ($j = 1, \dots, p$) are assumed to be nonnegative in order for the conditional variance to be positive. Furthermore, $\sum_{i=1}^{q}\alpha_i + \sum_{j=1}^{p}\beta_j < 1$ has to be satisfied in order for the GARCH process to be covariance stationary.

Under the locally risk-neutral valuation relationship (LRNVR) proposed by Duan (1995), a pricing measure $Q$ is mutually absolutely continuous with respect to the measure $P$, $S_t/S_{t-1} \mid \phi_{t-1}$ is lognormally distributed under $Q$,

$$E^Q\!\left(\frac{S_t}{S_{t-1}} \,\Big|\, \phi_{t-1}\right) = e^{r}, \tag{3.27}$$

where $E^Q$ denotes the conditional expectation with respect to the measure $Q$, and

$$\mathrm{Var}^Q\!\left(\ln\frac{S_t}{S_{t-1}} \,\Big|\, \phi_{t-1}\right) = \mathrm{Var}^P\!\left(\ln\frac{S_t}{S_{t-1}} \,\Big|\, \phi_{t-1}\right). \tag{3.28}$$

The superscripts $Q$ and $P$ in (3.28) denote the conditional variance with respect to $Q$ or $P$. The LRNVR implies that, under the pricing measure $Q$,

$$\ln\frac{S_t}{S_{t-1}} = r - \frac{1}{2}h_t + \xi_t, \tag{3.29}$$

where $\xi_t \mid \phi_{t-1} \sim N(0, h_t)$, and

$$h_t = \alpha_0 + \sum_{i=1}^{q}\alpha_i\left(\xi_{t-i} - \lambda\sqrt{h_{t-i}}\right)^2 + \sum_{i=1}^{p}\beta_i\,h_{t-i}. \tag{3.30}$$

From (3.29) and (3.30), Duan (1995) derives that the discounted price process of the underlying asset $S_t$ possesses the martingale property under $Q$. From this, the value of the underlying asset at the time of maturity $T$ can be expressed as

$$S_T = S_t \exp\left[(T-t)\,r - \frac{1}{2}\sum_{s=t+1}^{T}h_s + \sum_{s=t+1}^{T}\xi_s\right]. \tag{3.31}$$

Finally, under a GARCH($p$, $q$) process, the price of a European option can be expressed as the discounted expected payoff at maturity under $Q$, which depends on the asset price $S_T$ at the time of maturity $T$ and the strike price $K$. Hence, the GARCH call and put option prices are given by, respectively,

$$C_t = e^{-r(T-t)}\,E^Q\!\left[\max(S_T - K, 0) \mid \phi_t\right], \tag{3.32}$$

$$P_t = e^{-r(T-t)}\,E^Q\!\left[\max(K - S_T, 0) \mid \phi_t\right]. \tag{3.33}$$

We choose the order ($p$, $q$) of the GARCH model to be (1,1), following most researchers (e.g. Duan (1995), Duan and Zhang (2001), and Hsieh and Ritchken (2005)). On each day, the parameters $\alpha_0$, $\alpha_1$, $\beta_1$ and $\lambda$ of (3.25) and (3.26) are estimated by maximum likelihood on S&P 500 index closing prices, using a rolling window going back three years up to the previous trading day. With the GARCH parameters obtained at time $t$, price paths are simulated, creating the terminal prices at the expiration $T$ as in (3.31). For each time step, 10,000 price paths are simulated, the number of simulations proposed by Duan and Zhang (2001). From these, the GARCH call and put prices can be computed as in (3.32) and (3.33).
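The simulation step can be sketched as follows for a GARCH(1,1) model under the pricing measure: each path draws the shocks, updates the conditional variance with the risk-neutral recursion, accumulates the log return, and the option prices are discounted average payoffs. This is a plain Monte Carlo sketch in Python under stated assumptions, not the Matlab code of the thesis: the rate is expressed per trading day, the next-day conditional variance `h_next` is taken as known, and all parameter values are hypothetical rather than maximum likelihood estimates.

```python
import math
import random


def garch_mc_prices(s0, k, r, n_days, h_next, alpha0, alpha1, beta1, lam,
                    n_paths=10000, seed=0):
    """Monte Carlo (call, put) prices under a risk-neutral GARCH(1,1).

    r is the risk-free rate per day; h_next is the conditional variance of
    the first simulated day, which is known at the pricing date.
    """
    rng = random.Random(seed)
    discount = math.exp(-r * n_days)
    call_sum = put_sum = 0.0
    for _ in range(n_paths):
        h, log_ret = h_next, 0.0
        for _ in range(n_days):
            xi = rng.gauss(0.0, math.sqrt(h))
            log_ret += r - 0.5 * h + xi            # one-day log return under Q
            h = alpha0 + alpha1 * (xi - lam * math.sqrt(h)) ** 2 + beta1 * h
        s_T = s0 * math.exp(log_ret)               # terminal price of the path
        call_sum += max(s_T - k, 0.0)
        put_sum += max(k - s_T, 0.0)
    return discount * call_sum / n_paths, discount * put_sum / n_paths


# Hypothetical inputs: 30 trading days, about 1% daily volatility at the start.
r_daily = 0.01 / 252
call, put = garch_mc_prices(
    s0=1400.0, k=1400.0, r=r_daily, n_days=30, h_next=1e-4,
    alpha0=1e-6, alpha1=0.08, beta1=0.90, lam=0.05,
)
```

Because the discounted asset price is a martingale under the pricing measure, the simulated call and put prices should satisfy the put-call parity up to Monte Carlo error, which gives a simple check on the simulation.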

As noted in Section 2, our approach for the GARCH model differs from that of some researchers. In particular, a calibration method based on cross-sections of option prices is computationally more cumbersome for the GARCH model than for the SV model, because the Monte Carlo simulation of the asset price path from (3.31) has to be run for many different parameter values. We choose to follow the original method of Duan (1995), as this is computationally much faster. Moreover, the parameters obtained in this way rely on past asset price movements rather than on a cross-section of option prices at the moment of trading. In addition, parameter calibration based on cross-sectional option data is already performed for the SV model. With our approach, we include one option pricing model that prices options based solely on the past price movements of the underlying asset.

3.5 Causes and implications of mispricing

In order to price options, we only use information that is available at that time. Option pricing with the BSIV model relies on data at the moment of trading and from one time period before. Part of the mispricing with this model can be caused by averaging the implied volatilities in each moneyness group. This feature might decrease our chance of finding predictive value in mispricing by the BSIV model, even if the theoretical option prices are very accurate. However, the implied volatility is only one of the inputs of the BSIV model and, as the BSIV model uses data at the moment of trading and from one time unit before, mispricing can still indicate that an option is too cheap or too expensive compared to past market prices.

The calibration procedure for the SV model only uses information at the moment of trading and does not take past information into consideration. Consequently, mispricing from the SV model can only imply that some options are relatively cheap and some are relatively expensive compared to other options priced in the same period. Hence, this mispricing cannot give an indication of a general future market direction. If there were any predictive value in mispricing with the SV model, we would expect relatively expensive options to tend to drop in price and relatively cheap options to tend to rise in price.

In contrast with the procedures for the BSIV model and the SV model, the procedure to obtain the parameters for the GARCH model relies solely on past asset price movements. As this captures the dynamics of the markets, mispricing by the GARCH model could imply that options are priced either too low or too high compared to expectations based on past price movements of the underlying asset.

4 Data and Methodology

In this section, we first provide a description of our data, as well as describing how we select our data-sample. With the data-set, we compute the theoretical option prices of the models as described in Section 3 with Matlab; see the Appendix for the computer code. Next, we elaborate on the regression model for this study. Regressions and statistical tests are performed with Stata. Finally, we design a trading simulation using the returns forecasts from the regressions.

4.1 Data

The data-set consists of options on the S&P 500 index, a market index based on the 500 largest companies listed on US stock exchanges. We focus on S&P 500 index options as these are among the most actively traded options on the global markets. For simplicity, daily closing prices are considered, which are the average of the bid and ask prices at the end of the day. Hence, the bid-ask spread is ignored.⁸ The expiration day of the S&P 500 index options is the third Friday of the contract month. Options on the S&P 500 index are European-style options and thus cannot be exercised before maturity. Consequently, there are no complications due to early exercise.

Several researchers find that, due to relatively high levels of volatility, option pricing models substantially misprice options within the week of maturity. Therefore, options that mature in less than seven days are discarded from our data-set. It is also crucial to gather only data for options with sufficient liquidity: the prices of options with low trading volumes are very volatile and move more arbitrarily, so that option pricing models price such options with relatively low accuracy. To avoid problems associated with illiquidity, we only consider options with a trading volume of at least 250 over the day.⁹ Another reason for restricting the sample to highly traded options is that, in our study, we ignore the bid-ask spread of options. Thinly traded options usually have a higher percentage bid-ask spread, making it less likely to profit from trading such options. As options with high trading volumes generally expire within two months, we do not consider options where the expiration is more than 60 days ahead.
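The selection rules above (at least seven and at most 60 days to expiration, a daily volume of at least 250, and mid prices from bid and ask) can be expressed as a simple filter. The record layout and field names in this Python sketch are hypothetical and do not reflect the format of the data provider; the quotes are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class OptionQuote:
    trade_date: date
    expiry: date
    strike: float
    kind: str      # "call" or "put"
    bid: float
    ask: float
    volume: int


def mid_price(q: OptionQuote) -> float:
    """Closing price taken as the average of bid and ask."""
    return 0.5 * (q.bid + q.ask)


def keep(q: OptionQuote, min_days=7, max_days=60, min_volume=250) -> bool:
    """Apply the maturity and liquidity filters described in the text."""
    days_to_expiry = (q.expiry - q.trade_date).days
    return min_days <= days_to_expiry <= max_days and q.volume >= min_volume


# Hypothetical quotes on one trading day.
today = date(2012, 3, 1)
quotes = [
    OptionQuote(today, date(2012, 3, 16), 1400.0, "call", 20.5, 21.5, 900),
    OptionQuote(today, date(2012, 3, 5), 1375.0, "put", 3.0, 3.4, 1200),    # too close to expiry
    OptionQuote(today, date(2012, 4, 20), 1450.0, "call", 8.0, 8.6, 100),   # too thinly traded
]
kept = [q for q in quotes if keep(q)]
```

Only the first quote survives the filter; the other two are dropped for maturity and volume, respectively.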

The data-set covers the period from January 3, 2012 to December 31, 2012. During this period, the S&P 500 index had the smallest price range of the more recent years (2010 until 2014). In 2012, the annual volatility of the S&P 500 index returns was relatively low at 12.71%, and the average daily closing price was 1379.35 with a range of 188.71. With the described data selection, we initially obtain 9,027 observations on call options and 13,514 observations on put options. We discard some observations to decrease the mispricing errors in the data-sample: see Section 5, where the valuation errors with respect to moneyness ratios are given.

Daily closing prices of the S&P 500 index are obtained from the website finance.yahoo.com. Daily closing prices of S&P 500 index options are obtained from optiondata.net. Other data necessary for our research are the S&P 500 dividends, obtained from us.spindices.com. As a proxy for the risk-free interest rate, the 3-month United States Treasury yield from finance.yahoo.com is used.

8

The bid-ask spread is the difference between the prices at which one is able to buy and sell at the same time.

9

4.2 Regression model

We partly follow the methodology by Kanoh and Takeuchi (2011), who propose an FGLS model that is able to explain the mispricing of Nikkei 225 index options. Specifically, they find that the mispricing is dependent on moneyness ratio 𝑀𝑡 and time till maturity. The authors estimate the following models to investigate the mispricing of call and put options:

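$$\frac{1}{M_{itk\tau}}\left(C_{m,itk\tau} - C_{f,itk\tau}\right) = \frac{1}{M_{itk\tau}}\left(\beta_M M_{itk\tau} + \sum_{s=1}^{4}\beta_{\tau,s}D_{\tau,s} + \sum_{j=4}^{5}\beta_{op,j}D_{op,j} + u_{itk\tau}\right), \qquad (4.1)$$

$$M_{itk\tau}\left(P_{m,itk\tau} - P_{f,itk\tau}\right) = M_{itk\tau}\left(\beta_M M_{itk\tau} + \sum_{s=1}^{4}\beta_{\tau,s}D_{\tau,s} + \sum_{j=4}^{5}\beta_{op,j}D_{op,j} + u_{itk\tau}\right). \qquad (4.2)$$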

Kanoh and Takeuchi (2011) categorize options according to strike prices, survival period, and day of expiration. In (4.1) and (4.2), the mispricing of call and put options is $C_{m,itk\tau} - C_{f,itk\tau}$ and $P_{m,itk\tau} - P_{f,itk\tau}$ respectively. Furthermore, $D_{\tau,s}$ is a dummy variable which takes the value 1 when the time until maturity is $s$ months and $D_{op,j}$ is a dummy that takes the value 1 when the option trading period is $j$ months. The survival period is $\tau = t - 4(i - 1)$. The error term is decomposed as $u_{itk\tau} = \varepsilon_t + \varepsilon_k + \varepsilon_\tau + \varepsilon_{itk\tau}$. The error term $\varepsilon_t$ depends on the transaction date, $\varepsilon_k$ on the strike price, and $\varepsilon_\tau$ on the survival period within the same maturity, while $\varepsilon_{itk\tau}$ is an entirely random error term. Moreover, the authors assume that all components of $u_{itk\tau}$ are mutually independent. Kanoh and Takeuchi (2011) observe that the sample variance becomes smaller as options become more out of the money. This occurs because out of the money options have lower prices than in the money options. Thus, to adjust for this heteroscedasticity, the authors have multiplied (4.1) and (4.2) by the terms $1/M_{itk\tau}$ and $M_{itk\tau}$ respectively.

In our study, we simply consider the logarithmic returns of options to correct for this heteroscedasticity. By taking the logarithmic returns, we correct for the price level of the options. In order to answer the question whether the difference between the theoretical and market prices of options has a predictive value, we estimate the following OLS regressions:

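$$\log\frac{C_{m,t+1}}{C_{m,t}} = \alpha + \beta_{\text{mispr}}\log\frac{C_{m,t}}{C_{f,t}} + \varepsilon_{t,C}, \qquad (4.3)$$

$$\log\frac{P_{m,t+1}}{P_{m,t}} = \alpha + \beta_{\text{mispr}}\log\frac{P_{m,t}}{P_{f,t}} + \varepsilon_{t,P}, \qquad (4.4)$$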

where $\log\frac{C_{m,t}}{C_{f,t}}$ and $\log\frac{P_{m,t}}{P_{f,t}}$ are the logarithms of the mispricing, and $\varepsilon_{t,C}$ and $\varepsilon_{t,P}$ are the error terms.

As Kanoh and Takeuchi (2011) find that the variables moneyness and maturity affect the mispricing, we consider regressions where we include the variables moneyness 𝑀𝑡 and time till maturity 𝑇𝑇𝑀𝑡 in (4.3) and (4.4).10 This is done to control for the relation between mispricing and these variables, so that the pure effect of mispricing on the returns can be estimated.

The variables 𝑀𝑡 and 𝑇𝑇𝑀𝑡 are expected to have no clear effect on the returns of the options, as opposed to their first differences 𝑀𝑡+1− 𝑀𝑡⁡ and 𝑇𝑇𝑀𝑡+1− 𝑇𝑇𝑀𝑡. We do not include the first differences of these variables as 𝑀𝑡+1− 𝑀𝑡 is unknown at time 𝑡. The decision whether or not to include the variables 𝑀𝑡 and 𝑇𝑇𝑀𝑡 depends on whether the root mean square error (RMSE) is lowered after adding the variables. The regressions of (4.3) and (4.4) with these two additional independent variables are done for the BSIV model, the SV model, the GARCH model, and each option type separately.
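As a minimal sketch of the baseline regression (4.3), without the additional moneyness and maturity regressors, the coefficients and the RMSE can be computed in closed form. The thesis runs its regressions in Stata; the Python below is only an illustration, the function names are hypothetical, the prices are made-up, and the degrees-of-freedom correction in the RMSE is an assumption about how the RMSE is defined.

```python
import math


def fit_simple_ols(x, y):
    """Return (alpha, beta) of the least-squares line y = alpha + beta * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


def rmse(x, y, alpha, beta, n_params=2):
    """Root mean squared error with a degrees-of-freedom correction (assumed)."""
    ssr = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(ssr / (len(x) - n_params))


# Made-up prices: market price today, model price today, market price tomorrow.
market_t = [10.0, 5.0, 2.0, 8.0]
model_t = [9.5, 5.2, 2.1, 7.6]
market_next = [9.8, 5.1, 2.05, 7.7]

log_mispricing = [math.log(m / f) for m, f in zip(market_t, model_t)]
log_return = [math.log(nxt / m) for nxt, m in zip(market_next, market_t)]

alpha, beta = fit_simple_ols(log_mispricing, log_return)
fit_error = rmse(log_mispricing, log_return, alpha, beta)
```

Comparing `fit_error` across specifications mirrors the selection rule in the text: a candidate regressor is kept only if it lowers the RMSE.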

4.3 Trading system

Let the variable $\mathit{Option}_{m,t}$ represent either of the option types $C_{m,t}$ or $P_{m,t}$. With the regression results, the predicted log returns $\widehat{\log}\frac{\mathit{Option}_{m,t+1}}{\mathit{Option}_{m,t}}$ can be computed. Next, we design a trading system where the executed transactions depend on the predicted log returns. When the predicted log return is positive (negative), the option is bought (sold). In contrast with reality, we assume we can buy fractions of options, allowing the trade of larger positions when larger returns are predicted. The dollar value of the transaction equals $100 \cdot \widehat{\log}\frac{\mathit{Option}_{m,t+1}}{\mathit{Option}_{m,t}}$, where the factor 100 is the contract size of the S&P 500 index options. Consequently, the return of the trade is $100 \cdot \widehat{\log}\frac{\mathit{Option}_{m,t+1}}{\mathit{Option}_{m,t}} \cdot \frac{\mathit{Option}_{m,t+1} - \mathit{Option}_{m,t}}{\mathit{Option}_{m,t}}$. The described trading system is run with both in-sample and out-of-sample forecasts, using the model with the lowest RMSE.
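The trading rule can be illustrated with a short sketch. The function name `trade_return` and the numbers in the examples are hypothetical; the formula itself is the one stated above, with a signed position of 100 times the predicted log return.

```python
def trade_return(predicted_log_return, price_t, price_next, contract_size=100):
    """Dollar return of one trade under the rule described in the text.

    The position (in dollars) is contract_size * predicted_log_return, so it is
    positive for a purchase and negative for a sale.
    """
    position = contract_size * predicted_log_return
    realized_simple_return = (price_next - price_t) / price_t
    return position * realized_simple_return


# Predicted rise of 2%, realized rise from 10.0 to 10.5: a long position that gains.
gain_long = trade_return(0.02, 10.0, 10.5)

# Predicted fall of 3%, realized fall from 4.0 to 3.0: a short position that gains.
gain_short = trade_return(-0.03, 4.0, 3.0)

# Predicted rise of 2%, realized fall from 10.0 to 9.0: a long position that loses.
loss_long = trade_return(0.02, 10.0, 9.0)
```

A trade is profitable whenever the predicted and realized returns have the same sign, and its size scales with the magnitude of the forecast.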

Trading simulations using out-of-sample forecasting are performed by means of a rolling window. In these simulations, only information available at the moment of trading is used to trade options. The data-set is split into two parts; the first part covers the period January 3, 2012 to June 29, 2012 and the second part covers July 2, 2012 to December 31, 2012. The first part is used solely for the estimation of the coefficients. Out-of-sample forecasting is done in the second part, where the first trading day is July 2 and the coefficients are estimated on the first part of the data. For each following trading day, the data window used for coefficient estimation is extended up to the previous trading day. This procedure is repeated until the end of the trading period. Note that the window is stretched after each day rather than moved, that is, it is an expanding window. We do this as it allows us to use more data to estimate the coefficients. For simplicity, the bid-ask spread and transaction costs are ignored in all trading simulations.

10

In contrast with Kanoh and Takeuchi (2011), we use a daily data frequency; thus the effects of the days until maturity, instead of monthly effects, are considered.
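The expanding estimation window can be sketched as a generator that pairs each trading day with all earlier days. The function name and the day labels are hypothetical; the sketch only shows which days enter each estimation, not the estimation itself.

```python
def expanding_window_splits(dates, first_trading_index):
    """Yield (estimation_dates, trading_date) pairs.

    The estimation window always starts at the first date and grows by one
    day per step, so it is stretched rather than moved.
    """
    for i in range(first_trading_index, len(dates)):
        yield dates[:i], dates[i]


# Made-up labels: three estimation days followed by two trading days.
days = ["est-1", "est-2", "est-3", "trade-1", "trade-2"]
splits = list(expanding_window_splits(days, first_trading_index=3))
# splits[0] uses est-1..est-3 to trade on trade-1;
# splits[1] additionally uses trade-1 to trade on trade-2.
```

In the setting of the text, the first trading day would be July 2, 2012 and the initial estimation window the first half of the year.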

5 Results and analysis

In this section, we first evaluate the mispricing by the three chosen option pricing models: the BSIV model, the SV model, and the GARCH model. For this purpose, we adopt the following three error measures: the mean percentage error (MPE), the mean absolute percentage error (MAPE) and the mean absolute error (MAE), respectively defined as:

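$$\text{MPE} = \frac{1}{N}\sum_{i=1}^{N}\frac{\mathit{Option}_{f,i} - \mathit{Option}_{m,i}}{\mathit{Option}_{m,i}}, \qquad (5.1)$$

$$\text{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{\mathit{Option}_{f,i} - \mathit{Option}_{m,i}}{\mathit{Option}_{m,i}}\right|, \qquad (5.2)$$

$$\text{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\mathit{Option}_{f,i} - \mathit{Option}_{m,i}\right|, \qquad (5.3)$$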

where 𝑁 denotes the number of observations in the data-sample.
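The three error measures translate directly into code. The sketch below uses hypothetical function names and made-up prices; it also shows how an overpriced and an underpriced option cancel in the MPE but not in the MAPE.

```python
def mpe(model, market):
    """Mean percentage error, as in (5.1): signed errors may cancel."""
    return sum((f - m) / m for f, m in zip(model, market)) / len(market)


def mape(model, market):
    """Mean absolute percentage error, as in (5.2)."""
    return sum(abs((f - m) / m) for f, m in zip(model, market)) / len(market)


def mae(model, market):
    """Mean absolute error, as in (5.3), in price units."""
    return sum(abs(f - m) for f, m in zip(model, market)) / len(market)


# Made-up prices: one option overpriced by 10%, one underpriced by 10%.
model_prices = [11.0, 9.0]
market_prices = [10.0, 10.0]

signed_error = mpe(model_prices, market_prices)     # errors cancel
relative_error = mape(model_prices, market_prices)  # errors do not cancel
absolute_error = mae(model_prices, market_prices)
```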

These three error measures have been chosen to provide an overall picture of mispricing. The MPE gives a general idea of the direction in which the mispricing tends to occur. The MAPE provides a more accurate estimate of the magnitude of mispricing, as underpricing and overpricing of options do not cancel each other out, as they do in the case of the MPE (making the MPE a less ideal error measure). The MAE shows the absolute values of mispricing. However, the MAE might reflect differences in price level rather than in pricing errors, even when a relative error measure such as the MAPE shows a decrease in pricing errors. Therefore, the MAPE is considered the “best” error measure for mispricing.

Ideally, we do not want the MAPE in our subsamples to be too large, as this is an indication that the theoretical option prices are not accurate. It is known that the accuracy of theoretical option prices in a sample strongly depends on the moneyness. Hence, we evaluate the MAPE on smaller subsamples, where observations with some moneyness ratios are excluded. Based on the MAPE, we select the samples with acceptable pricing errors. Afterwards, the OLS regressions using the selected data-samples are provided. Finally, we show the results from the trading simulations driven by in-sample forecasts and out-of-sample forecasts.


5.1 Mispricing in the Black-Scholes implied volatility model

Following the approach by Kokoszczyński et al. (2010) for the BSIV model, we define symmetric moneyness groups. The moneyness of options at time 𝑡 is defined as the ratio 𝑀𝑡 = 𝑆𝑡/𝐾 (index level over strike price). Based on the distribution of the observations by moneyness as shown in the histograms of Figure 1, for the call options we define the classes of moneyness ratio as:

• deep out of the money for 0.90 ≤ 𝑀𝑡 < 0.95,

• out of the money for 0.95 ≤ 𝑀𝑡 < 0.99,

• at the money for 0.99 ≤ 𝑀𝑡 < 1.01,

• in the money for 1.01 ≤ 𝑀𝑡 < 1.05,

• deep in the money for 1.05 ≤ 𝑀𝑡 < 1.10.

The classes of moneyness ratio for put options are defined in the opposite order.
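The moneyness classes can be expressed as a small lookup. This sketch assumes that "opposite order" for put options means the same intervals with the labels reversed; the names `CALL_CLASSES`, `moneyness` and `classify` are hypothetical.

```python
# Intervals [lower, upper) and their labels for call options, as listed above.
CALL_CLASSES = [
    (0.90, 0.95, "deep out of the money"),
    (0.95, 0.99, "out of the money"),
    (0.99, 1.01, "at the money"),
    (1.01, 1.05, "in the money"),
    (1.05, 1.10, "deep in the money"),
]


def moneyness(spot, strike):
    """Moneyness ratio M_t = S_t / K."""
    return spot / strike


def classify(m, option_type="call"):
    """Return the moneyness class, or None if m lies outside [0.90, 1.10)."""
    if option_type not in ("call", "put"):
        raise ValueError("option_type must be 'call' or 'put'")
    labels = [label for _, _, label in CALL_CLASSES]
    if option_type == "put":
        labels = labels[::-1]  # assumption: same intervals, labels reversed
    for (lower, upper, _), label in zip(CALL_CLASSES, labels):
        if lower <= m < upper:
            return label
    return None


example_call = classify(moneyness(1400.0, 1400.0), "call")  # index equals strike
example_put = classify(0.92, "put")
```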

Tables 2a and 2b show the valuation errors of call and put options by the BSIV model. When observations with lower moneyness ratios are discarded from the data-sample for call options, the MAPE decreases substantially. Table 2b demonstrates that the reverse is true for put options, as moneyness categories are defined in a reverse order. Tables 2a and 2b also show that the MAE increases when out of the money options are removed. This can be explained by the removal of low-priced options, which raises the average option price in the subsamples.

Furthermore, the MPE is low for the mispricing of both call and put options. This can be explained as both overpricing and underpricing in the BSIV model occur, possibly caused by averaging the implied volatilities within each moneyness group.

Ideally, we want mispricing errors to be low while preserving a high number of observations. Table 2a shows that a moneyness of 0.99 ≤ 𝑀𝑡 < 1.10 seems reasonable for call options. Table 2b indicates that a moneyness of 0.90 ≤ 𝑀𝑡 < 1.01 seems reasonable for put options. In both cases, a low MAPE of approximately 6% is obtained while a decent number of observations is maintained. Choosing the data-samples with the above moneyness ratios, we have 2,309 (1,850) observations on call (put) options.

5.2 Mispricing in the SV model

Tables 3a and 3b show the valuation errors in pricing the call options and put options with the SV model. For the full data-sample of call options, there are 9,027 observations, the lowest moneyness ratio is 0.81 and the MAPE of this sample is 321.95%. For the full data-sample of put options, where the highest moneyness ratio is 9.43, there are 13,514 observations and the MAPE is 346.44%. These large mispricing errors can be explained by the inclusion of deep out of the money observations. Market prices of deep out of the money options are typically low, resulting in high percentage pricing errors, while absolute price differences are small. This is confirmed by the MPE values in Tables 3a and 3b, as the MPE is large and positive for the full data-sample of both call and put options; options are overpriced more often than underpriced. Kokoszczyński et al. (2010) find with their data-set on the Nikkei 225 index options that the SV model greatly misprices deep out of the money options. For the SV model, they find the best fit for deep in the money options.

Tables 3a and 3b show additional data-samples where the out of the money observations are removed. For these samples the daily SV parameter estimates are recalibrated on the corresponding samples and, thus, are different for each sample. The parameters have been recalibrated for each sample to increase the goodness-of-fit. The tables show that the value of all three error measures decreases when out of the money observations are discarded. For an MAPE value of around 5%, the number of observations in the data-samples is considered large. Hence, for the call options we consider the sample with a minimum moneyness ratio of 0.96 and for put options we consider the sample with a maximum moneyness ratio of 1.02. There are 5,389 observations for call options considered and 2,877 observations for put options.

5.3 Mispricing in the GARCH option pricing model

Tables 4a and 4b show that the degree of mispricing for the GARCH model is much larger than for the BSIV model and the SV model. The cause of such large mispricing is that, in contrast to the other two models, the chosen approach for the parameter calibration of the GARCH model does not employ any information from the market prices of options. The GARCH option pricing model consistently overprices call options and underprices in the money put options. This implies that the market expectation for future stock prices is lower than the GARCH model assumes. For call options, low mispricing errors can be found for the sample where 𝑀𝑡 ≥ 1.10 with 198 observations and an MAPE of 7.62%. For put options, similar accuracy can be achieved for 𝑀𝑡 ≤ 0.90 with only 29 observations. We consider these numbers of observations to be too small for this study. We choose to filter our data-set by considering an MAPE of approximately 17%. Thus, we consider call observations where 𝑀𝑡 ≥ 1.05, resulting in 355 observations and an MAPE of 16.56%. We obtain 119 put observations with 𝑀𝑡 ≤ 0.95. These numbers of observations are still not large, but choosing samples with more observations results in larger mispricing errors in the data-samples.

5.4 Regression estimates

Figure 2 shows scatter plots of the logarithm of option mispricing on the horizontal axis and the logarithm of option returns on the vertical axis. The scatter plots are shown for call and put options and for the three option pricing models separately. From the scatter plots, we cannot observe a clear relation between log mispricing and log returns of options. The points appear to be scattered randomly, indicating that a regression using mispricing information explains little of the variation in returns.

Before we estimate the relation of log returns with the log mispricing of options by means of OLS regressions, we first perform the Breusch-Pagan test for heteroscedasticity. Tables 5a-f show the test results on the mispricing data of options, where the dependent variable of the regression is the logarithmic return and the independent variables are: logarithmic mispricing, moneyness ratio, and time until maturity. This is done for six regressions, where the bottom three rows of the tables indicate whether the independent variable has been included. Throughout the whole thesis, the nominal significance level is set at 0.05. In all but two cases, regressions with GARCH put options as shown in Table 5f, we reject the null hypothesis of a constant variance. In addition, we also provide the results of the Breusch-Pagan test on the OLS regression with the lowest RMSE that excludes the variable mispricing.
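To make the idea of the test concrete, the sketch below implements one common variant for a single regressor: the studentized (Koenker) form, where the statistic is N times the R-squared of a regression of squared residuals on the regressor. This is an assumption for illustration; the thesis does not state which variant of the test was run in Stata, and the function names and data are hypothetical.

```python
import math


def _simple_ols(x, y):
    """Return (alpha, beta) of the least-squares line y = alpha + beta * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - beta * x_bar, beta


def breusch_pagan_single(x, y):
    """Studentized Breusch-Pagan statistic and p-value for one regressor."""
    n = len(x)
    alpha, beta = _simple_ols(x, y)
    sq_resid = [(yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y)]

    # Auxiliary regression of squared residuals on the regressor.
    g0, g1 = _simple_ols(x, sq_resid)
    mean_sq = sum(sq_resid) / n
    total = sum((v - mean_sq) ** 2 for v in sq_resid)
    unexplained = sum((v - g0 - g1 * xi) ** 2 for xi, v in zip(x, sq_resid))
    r_squared = 1.0 - unexplained / total

    lm = max(n * r_squared, 0.0)
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


# Made-up data whose spread grows with x, so the variance is clearly not constant.
x = [float(i) for i in range(1, 21)]
y = [((-1) ** i) * float(i) for i in range(1, 21)]
lm_stat, p_val = breusch_pagan_single(x, y)
```

A p-value below the 0.05 level used in this thesis leads to rejection of the null hypothesis of constant variance.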

Tables 6a-6f present the regression estimates. For the regressions where we reject the null hypothesis of homoscedasticity, we account for heteroscedasticity by reporting White standard errors. In every table, the setting with the lowest RMSE is underlined, and we discuss the results from this setting, as its variables are used to compute forecasts for the returns. The sign of the variable log mispricing is negative, except in Table 6e for the GARCH call options, where this coefficient is positive. A negative coefficient implies that when options are relatively expensive compared to the theoretical prices, the future market prices of options decrease. Table 6a shows that the coefficient of the logarithm of mispricing is -0.112, implying that when the market price is 1.0% higher than the theoretical price, the option price changes by about -0.112% over the next period. Furthermore, the coefficient for mispricing is statistically significant in three of the six cases. In addition, the last column of Tables 6a-6f presents the estimation results for the model with the lowest RMSE that excludes the variable log mispricing. This is provided in order to examine whether a model that includes the variable log mispricing has a better goodness-of-fit than a model that excludes this variable. In Tables 6a-6e, we find that a model that includes the variable mispricing has the lowest RMSE, although differences are very small even when the variable mispricing is statistically significant. Table 6f shows that, when GARCH put options are used, a model that excludes the variable mispricing has the lowest RMSE. Tables 6a-6f show that all RMSE values are relatively high and all of the 𝑅2 values are very low. Thus, including mispricing can be helpful in explaining the variation in option returns, albeit only to a very small extent.
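For a single regressor, White's heteroscedasticity-robust standard error of the slope has a simple closed form. The sketch below computes the original White (HC0) version; statistical packages often apply an additional small-sample scaling, so the reported numbers in the tables need not coincide with this form. The function name and data are hypothetical.

```python
import math


def white_slope_se(x, y):
    """Return (beta, se) where se is the HC0 standard error of the OLS slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)

    beta = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    alpha = y_bar - beta * x_bar
    resid = [yi - alpha - beta * xi for xi, yi in zip(x, y)]

    # Each observation contributes its own squared residual, weighted by its
    # squared deviation from the mean of x, instead of a common error variance.
    var_beta = sum((d * e) ** 2 for d, e in zip(dx, resid)) / sxx ** 2
    return beta, math.sqrt(var_beta)


# Made-up data with a symmetric pattern, so the fitted slope is zero.
slope, robust_se = white_slope_se([-1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
```

The ratio of the slope to this standard error gives the robust t-statistic used to judge significance.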

Next, the coefficients of the variables moneyness and time till maturity are discussed. As stated before, these two variables are added to the regressions in order to estimate the pure effects of mispricing on the option returns, as both moneyness and time till maturity affect the mispricing. The variables should be included in the model when the RMSE is lowered. The results of the underlined columns in Tables 6a-6f show that the variables moneyness and time till maturity are all statistically significant in the cases where they are included. As the coefficients for moneyness and maturity are not the main focus of this study, we only elaborate on the estimated coefficients from the regression with the BSIV call options. Table 6a shows that the coefficient for time till maturity is 0.002, implying that when the time till expiration is shortened by one day, the returns drop by 0.20%. The coefficient for moneyness is 1.125, implying that an absolute increase of the moneyness ratio by 0.01 results in a rise of the returns by 1.13%. This effect seems quite large, but the magnitude of the estimated coefficient for moneyness can be partly attributed to the estimated coefficient for the constant term, which is -1.239. The constant term sets the level of the regression line, and this level is for a large part offset by the moneyness term, given that the chosen data-sample for the BSIV call options has a moneyness ratio 0.99 ≤ 𝑀𝑡 ≤ 1.10.
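As a worked check of this offsetting effect, using only the two estimates quoted from Table 6a and leaving the mispricing and maturity terms aside, the combined contribution of the constant and the moneyness term at the two ends of the selected moneyness range is:

$$-1.239 + 1.125 \times 0.99 = -1.239 + 1.11375 = -0.12525,$$

$$-1.239 + 1.125 \times 1.10 = -1.239 + 1.2375 = -0.0015.$$

Over the selected sample the two terms together therefore contribute between roughly -0.125 and -0.002 to the predicted log return, far less in magnitude than either coefficient suggests on its own.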

5.5 In-sample trading performance

Table 7 shows the averages along with the standard errors of the predicted returns on the complete data-sample for each option pricing model for both calls and puts. From this table, we can see that there are relatively more negative return forecasts for the in-sample trading simulation. The averages of only the positive and only the negative predicted returns are provided separately. Table 8a shows the results of the trading simulation guided by in-sample forecasts. Figures 3a-f show the graphs of the trading returns over the time period. The horizontal axis shows the number of trades. Note that, on many days, multiple trades occur, and each individual trade is shown in the graphs. The six simulations with in-sample forecasts ended up being profitable, albeit with highly volatile returns and periods of losses during the trading process. Even though the levels of returns are quite different, the return patterns are similar for the BSIV and SV call options on the one hand, and for the BSIV and SV put options on the other. Table 8a summarizes the end results of the trading. In each trading simulation, the percentage of profitable trades is over 50%. The final returns for the SV call options are the highest, and the returns are the lowest for GARCH options. This can be partly explained by the large number of included observations for the SV call options, and the low number of included observations for the GARCH model.
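The two summary quantities used here, the accumulated return over the sequence of trades and the percentage of profitable trades, can be sketched as follows. The function name and the trade returns are hypothetical.

```python
from itertools import accumulate


def summarize_trades(trade_returns):
    """Return (cumulative returns per trade, share of profitable trades)."""
    cumulative = list(accumulate(trade_returns))
    profitable = sum(1 for r in trade_returns if r > 0)
    hit_rate = profitable / len(trade_returns)
    return cumulative, hit_rate


# Made-up dollar returns of four consecutive trades.
cumulative_path, share_profitable = summarize_trades([1.0, -0.5, 2.0, -0.25])
```

Plotting `cumulative_path` against the trade number gives a curve of the kind shown in Figures 3a-f. A final value above zero can coexist with long losing stretches along the path.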

Figures 4a-f show the accumulated returns over time after running the trading simulation driven by out-of-sample forecasts. As noted before, this trading simulation covers the period from July 2, 2012 to December 31, 2012. As observed in the simulation with in-sample forecasts, the patterns of the return movements with out-of-sample forecasts are similar for the BSIV and SV call options, and likewise for the BSIV and SV put options. Unlike in the case with the in-sample forecasts, the out-of-sample trading simulations using the BSIV and SV call options end with negative returns. This can be partly explained by the large drop in returns for the BSIV and SV call options in the second part of the trading period, which is already visible in the simulation with in-sample forecasts; the same drop in returns is observed in the simulation with out-of-sample forecasts. As profits are not consistent over time in the simulations driven by in-sample forecasts as well as out-of-sample forecasts, mispricing information does not seem to be consistently profitable when used for trading in practice. Also, as noted before, transaction costs and the bid-ask spread are ignored, implying that the actual profits would be lower.

6 Conclusion

In this study, we have investigated the predictive performance of option mispricing on future option returns. This has been done in three steps. First, we created theoretical S&P 500 index option prices with the BSIV model, the SV model, and the GARCH option pricing model. Second, cross-sectional OLS regressions of the logarithmic returns on logarithmic mispricing and the variables moneyness ratio and time till maturity have been proposed. White standard errors have been provided as we found evidence of heteroscedasticity in the data. Third, with the estimates of the OLS regressions, we obtained forecasts of option returns. The performance of both in-sample and out-of-sample forecasts has been evaluated by means of a trading simulation.

From the regressions, we found that in some cases the coefficients of mispricing of options are statistically significant at the 0.05 level. Furthermore, most of the coefficient estimates of mispricing are negative, implying that options that are relatively expensive compared to the theoretical prices tend to decrease in price. However, as the RMSE values in the regressions are relatively large and the 𝑅2 values are low, the mispricing is capable of explaining little of the total variation associated with the returns of options. In addition, the trading simulations showed that a trading system driven by forecasts is not able to predict option returns accurately. Even when profits are obtained at the end of the simulation, there were no consistent profits over time, as the returns are highly volatile. In conclusion, both the results from the OLS regressions and the simulations indicate that the price difference between the theoretical and market prices of options has no predictive value.

Referring back to the literature, our results are in line with those by Welch and Goyal (2008), who studied the predictive value of several variables on the S&P 500 index. In this study, we found no predictive value from the mispricing of options on the S&P 500 index option returns, confirming that financial returns are highly unpredictable. However, research in the field of this study is new and, given the potential for high returns, further research is recommended. Some future research topics are as follows. First, we suggest the use of a panel data structure, following options with the same strike prices and expiration dates over time. Because we did not consider observations with a trading volume below 250, we did not obtain a balanced panel data structure. A panel data structure can bring a substantial improvement in the goodness-of-fit compared to the use of a cross-section structure, since financial returns are known to have time-varying volatilities. Second, instead of examining options on a market index, options on other assets such as individual stocks, commodities, bonds, and currency pairs can be investigated. Options on stocks can be especially interesting, as the literature confirmed that there are some variables that have a predictive value on stock option returns (see e.g. Arnold et al. (2006) and Cremers and Weinbaum (2010)). Third, different option pricing models can be considered to generate theoretical option prices.

Acknowledgements

This thesis has been carried out under the supervision of prof. dr. J.G. de Gooijer. I am truly grateful for all efforts he put in to help me write this thesis. During his guidance, he provided me many helpful comments and fundamental insights for improving this thesis. He inspired me to work hard and grow.


References

Arnold, T., Erwin, G., Nail, L., and Nixon, T. (2006). Do option markets substitute for stock markets? Evidence from trading on anticipated tender offer announcements. International Review of Financial Analysis, 15, 247-255.

Bakshi, G., Cao, C., and Chen, Z. (1997). Empirical performance of alternative option pricing models. The Journal of Finance, 52(5), 2003-2049.

Black, F., and Scholes, M. (1973). The pricing of options and corporate liabilities. Journal of Political Economy, 81(3), 637-654.

Bollerslev, T. (1986). Generalized autoregressive conditional heteroscedasticity. Journal of Econometrics, 31, 307-327.

Cox, J.C., Ingersoll, J.E., and Ross, S.A. (1985). A theory of the term structure of interest rates. Econometrica, 53(2), 385-407.

Cremers, M., and Weinbaum, D. (2010). Deviations from put-call parity and stock return predictability. Journal of Financial and Quantitative Analysis, 45(2), 335-367.

Duan, J. (1995). The GARCH option pricing model. Mathematical Finance, 5, 13-32.

Duan, J., and Zhang, H. (2001). Pricing Hang Seng index options around the Asian financial crisis – A GARCH approach. Journal of Banking & Finance, 25(11), 1989-2014.

Dumas, B., Fleming, J., and Whaley, R. (1998). Implied volatility functions: Empirical tests. Journal of Finance, 53, 2059-2106.

Engle, R.F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica, 50, 987-1007.

Feller, W. (1951). Two singular diffusion problems. Annals of Mathematics, 54, 173-182.

Heston, S.L. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency options. The Review of Financial Studies, 6(2), 327-343.

Heston, S.L., and Nandi, S. (2000). A closed-form GARCH option valuation model. The Review of Financial Studies, 13(3), 585-625.

Hsieh, K.C., and Ritchken, P. (2005). An empirical comparison of GARCH option pricing models. Review of Derivatives Research, 8, 129-150.

Hull, J., and White, A. (1987). The pricing of options on assets with stochastic volatility. Journal of Finance, 42, 281-300.

Kanoh, S., and Takeuchi, A. (2011). An analysis of Nikkei 225 call and put option price differences between market price and theoretical price. Waseda University Working Paper, 48, 61-79.

Kim, I.J., and Kim, S. (2004). Empirical alternative stochastic volatility option pricing models: Evidence from Korean KOSPI 200 index options market. Pacific-Basin Finance Journal, 12(2), 117-142.

Kokoszczyński, R., Sakowski, P., and Ślepaczuk, R. (2010). Which option pricing model is the best? High frequency data for Nikkei225 index options. University of Warsaw Working Paper, 16(39), 1-29.

Kokoszczyński, R., Sakowski, P., and Ślepaczuk, R. (2012). Option pricing models with HF data: An application of the Black model to the WIG20 index. The Business and Economics Research Journal, 5, 70-90.

Lehar, A., Scheicher, M., and Schittenkopf, C. (2002). GARCH vs. stochastic volatility: Option pricing and risk management. Journal of Banking & Finance, 26(2-3), 323-345.

Welch, I., and Goyal, A. (2008). A comprehensive look at the empirical performance of equity premium prediction. Review of Financial Studies, 21(4), 1455-1508.


Appendix

Appendix A provides the figures and tables of this thesis. Appendices B-F provide the Matlab programs used for this thesis, together with a brief explanation and, where applicable, the sources of the code. The start of a new code is indicated by either “%Name.m” or “[output]=function(Name).m”, where Name indicates the name of the code. The end of a code is indicated by %End of function: function(Name).m. In addition, the start and the end of each program and function are written in bold.

Appendix A

All figures and tables that we refer to in this thesis are in this part of the appendix.

Figure 1: Histograms for call and put options observations

a) b)

Figure 2: Scatter plots of Black-Scholes IV call and put percentage mispricing.


c) d)

e) f)

Figure 3: Cumulative returns of the trading simulation - in-sample forecasts.


c) d)

e) f)

Figure 4: Cumulative returns of the trading simulation – out-of-sample forecasts.


c) d)

e) f)

Table 1: Summary of studies with corresponding evaluation measures.

Author Index Period Interval RMSE MSE MPE MAE MAPE Other(s)

Dumas et al. (1998) S&P 500 06/88 – 05/91 daily X X X**

Kim and Kim (2004) KOSPI 200 01/99 - 12/00 intraday X X X X

Bakshi et al. (1997) S&P 500 06/88 – 05/91 daily X X

Duan and Zhang (2001) HSI 01/97 - 01/98 weekly X

Lehar et al. (2002) FTSE 100 01/93 – 10/97 intraday X X

Hsieh and Ritchken (2005) S&P 500 01/91 – 12/95 intraday* X

Kokoszczyński et al. (2012) Nikkei 225 01/08 - 06/08 intraday X X***

Notes: the error measures are: root mean squared error (RMSE), mean squared error (MSE), mean percentage error (MPE), mean absolute error (MAE), and mean absolute percentage error (MAPE).

*Intraday of every Wednesday.

**Akaike Information Criterion (AIC) and mean outside error (MOE), a self-defined error measure, and frequency that the ad-hoc BS model has a lower RMSE than the BS model.


Table 2a: Valuation errors for BSIV call options with respect to moneyness.

BS IV Call 𝑁 MPE MAPE MAE

0.90 ≤ 𝑀𝑡 < 1.10 7,820 0.0313 0.2070 0.8546

0.95 ≤ 𝑀𝑡 < 1.10 5,715 0.0379 0.1476 1.0738

0.99 ≤ 𝑀𝑡 < 1.10 2,309 0.0166 0.0602 1.3702

1.01 ≤ 𝑀𝑡 < 1.10 649 0.0121 0.0405 1.6199

1.05 ≤ 𝑀𝑡 ≤ 1.10 52 0.0019 0.0147 1.3203

Notes: Samples are categorized by moneyness ratio as indicated in the first column. 𝑁 denotes sample size. The underlined row denotes the selected data-sample to perform a regression on. These notes also apply to Tables 2b, 3a-b and 4a-b.

Table 2b: Valuation errors for BSIV put options with respect to moneyness.

BS IV Put 𝑁 MPE MAPE MAE

0.90 ≤ 𝑀𝑡 ≤ 1.10 7,246 -0.0073 0.1606 1.2526

0.90 ≤ 𝑀𝑡 < 1.05 4,630 -0.0055 0.1146 1.4317

0.90 ≤ 𝑀𝑡 < 1.01 1,850 -0.0019 0.0606 1.5194

0.90 ≤ 𝑀𝑡 < 0.99 435 -0.0089 0.0420 1.8153

0.90 ≤ 𝑀𝑡 < 0.95 19 -0.0002 0.0207 1.9372

Table 3a: Valuation errors for SV call options with respect to moneyness.

Moneyness ratio 𝑁 MPE MAPE MAE

𝑀𝑡 ≥ 0.81 9,027 3.1240 3.2195 4.2354

𝑀𝑡 ≥ 0.90 8,633 2.8592 2.9407 4.1990

𝑀𝑡 ≥ 0.95 6,273 0.0679 0.1082 0.8611

𝑀𝑡 ≥ 0.96 5,389 0.0075 0.0304 0.3645

𝑀𝑡 ≥ 0.97 4,511 -0.0006 0.0129 0.3334

𝑀𝑡 ≥ 1.00 1,890 0.0005 0.0065 0.4882

Table 3b: Valuation errors for SV put options with respect to moneyness.

Moneyness ratio 𝑁 MPE MAPE MAE

𝑀𝑡 ≤ 9.43 13,514 0.5332 3.4644 1.5046

𝑀𝑡 ≤ 1.20 10,715 0.1018 1.2607 1.6250

𝑀𝑡 ≤ 1.03 3,618 -0.0026 0.0802 1.3269

𝑀𝑡 ≤ 1.02 2,877 0.0038 0.0518 1.0554

𝑀𝑡 ≤ 1.01 2,131 0.0008 0.0432 1.0754

𝑀𝑡 ≤ 1.00 1,237 0.0034 0.0254 0.9073


Table 4a: Valuation errors for GARCH call options with respect to moneyness.

Moneyness ratio 𝑁 MPE MAPE MAE

𝑀𝑡 ≥ 0.81 9,027 3.5272 3.5966 14.4137

𝑀𝑡 ≥ 1.00 1,890 0.6301 0.6301 23.8148

𝑀𝑡 ≥ 1.04 420 0.1940 0.1940 25.0786

𝑀𝑡 ≥ 1.05 355 0.1656 0.1656 25.4723

𝑀𝑡 ≥ 1.10 198 0.0762 0.0762 25.6096

𝑀𝑡 ≥ 1.15 155 0.0460 0.0460 25.2000

Table 4b: Valuation errors for GARCH put options with respect to moneyness.

Moneyness ratio 𝑁 MPE MAPE MAE

𝑀𝑡 ≤ 9.43 13,514 0.0549 0.6514 3.7330

𝑀𝑡 ≤ 1.10 7,707 0.1443 0.5236 5.7706

𝑀𝑡 ≤ 1.00 1,237 -0.1403 0.2570 11.0892

𝑀𝑡 ≤ 0.96 189 -0.1996 0.2016 19.4456

𝑀𝑡 ≤ 0.95 119 -0.1646 0.1672 19.9134

𝑀𝑡 ≤ 0.90 29 -0.0647 0.0692 17.4768

Table 5a: Breusch-Pagan test for heteroscedasticity - BSIV call options.

(1) (2) (3) (4) (5)

𝜒2(𝑘) 41.67 195.31 577.96 717.50 705.81

𝑝-value 0.000 0.000 0.000 0.000 0.000

Log Mispricing Yes Yes Yes Yes No

Moneyness ratio No Yes No Yes Yes

Time till maturity No No Yes Yes Yes

Notes: 𝐻0: constant variance. The bottom three rows indicate which variables are included in the Breusch-Pagan test. The significance level used in this study is 0.05. These notes also apply to Tables 5b-5f.
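The 𝑝-values in Tables 5a-5f can be related to the reported 𝜒2(𝑘) statistics with a short calculation. The sketch below assumes that 𝑘 equals the number of variables marked "Yes" in the corresponding column, which is the usual degrees-of-freedom choice for the Breusch-Pagan test but is not stated explicitly in the notes. The helper `chi2_sf` is a hypothetical name; it evaluates the upper-tail probability of a chi-square distribution with integer degrees of freedom using the standard closed-form series.

```python
import math


def chi2_sf(x, k):
    """Upper-tail probability P(X > x) for a chi-square variable
    with integer degrees of freedom k >= 1."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if k % 2 == 0:
        # Even k: exp(-x/2) * sum_{i=0}^{k/2-1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, k // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd k: erfc(sqrt(x/2))
    #        + sqrt(2x/pi) * exp(-x/2) * sum_{i=1}^{(k-1)/2} x^(i-1) / (1*3*...*(2i-1))
    term, total = 1.0, 0.0
    for i in range(1, (k - 1) // 2 + 1):
        if i > 1:
            term *= x / (2 * i - 1)
        total += term
    tail = math.erfc(math.sqrt(half))
    return tail + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total


# Statistics taken from the tables; k is the assumed number of included variables.
p_5b_col1 = chi2_sf(3.91, 1)  # Table 5b, column (1)
p_5f_col1 = chi2_sf(0.12, 1)  # Table 5f, column (1)
p_5f_col3 = chi2_sf(5.07, 2)  # Table 5f, column (3)
```

With the 0.05 significance level, a 𝑝-value below 0.05 leads to rejection of 𝐻0, so under this assumption only columns (1) and (3) of Table 5f fail to reject constant variance.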

Table 5b: Breusch-Pagan test for heteroscedasticity - BSIV put options.

(1) (2) (3) (4) (5)

𝜒2(𝑘) 3.91 39.33 452.17 486.03 489.37

𝑝-value 0.048 0.000 0.000 0.000 0.000

Log Mispricing Yes Yes Yes Yes No

Moneyness ratio No Yes No Yes Yes


Table 5c: Breusch-Pagan test for heteroscedasticity - SV call options.

(1) (2) (3) (4) (5)

𝜒2(𝑘) 4.76 57.29 1275.7 1327.79 1330.80

𝑝-value 0.029 0.000 0.000 0.000 0.000

Log Mispricing Yes Yes Yes Yes No

Moneyness ratio No Yes No Yes Yes

Time till maturity No No Yes Yes Yes

Table 5d: Breusch-Pagan test for heteroscedasticity - SV put options.

(1) (2) (3) (4) (5)

𝜒2(𝑘) 8.18 100.72 722.35 808.03 799.63

𝑝-value 0.004 0.000 0.000 0.000 0.000

Log Mispricing Yes Yes Yes Yes No

Moneyness ratio No Yes No Yes Yes

Time till maturity No No Yes Yes Yes

Table 5e: Breusch-Pagan test for heteroscedasticity - GARCH call options.

(1) (2) (3) (4) (5)

𝜒2(𝑘) 19.36 41.66 56.58 75.24 38.17

𝑝-value 0.000 0.000 0.000 0.000 0.000

Log Mispricing Yes Yes Yes Yes No

Moneyness ratio No Yes No Yes Yes

Time till maturity No No Yes Yes No

Table 5f: Breusch-Pagan test for heteroscedasticity - GARCH put options.

(1) (2) (3) (4) (5)

𝜒2(𝑘) 0.12 18.56 5.07 23.67 14.96

𝑝-value 0.729 0.000 0.079 0.000 0.000

Log Mispricing Yes Yes Yes Yes No

Moneyness ratio No Yes No Yes Yes
