
University of Amsterdam

Amsterdam School of Economics

Faculty of Economics and Business

A Master’s Thesis to obtain the Degree in Financial Econometrics

Irregular Market Inefficiencies:

A Cointegration-based Statistical Arbitrage Approach

Justo Engelander

Student number: 10631275

Date of final version: July 15, 2018

Master’s Programme: Econometrics

Specialisation: Financial Econometrics

Supervisor: dr. K.A. Lasak


Acknowledgement

First, I would like to thank my thesis supervisor dr. Kasia Lasak for her guidance, her intellectual support and the freedom she gave me during the whole research process. Furthermore, I would like to express my gratitude to my family, especially my mother Henny, my father René and my little sister Tess, and to my girlfriend Carlijn for their infinite support. Without their love, I would never have been able to write this master's thesis.


Statement of Originality

This document is written by Student Justo Engelander who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Contents

1 Introduction
2 The Efficient Market Hypothesis
  2.1 Definition of the EMH and its implications
  2.2 Testing the EMH
  2.3 The EMH and Irregular Market Inefficiencies
  2.4 Suggestions when a Forecasting Framework might work
3 Cointegration
  3.1 Definitions
  3.2 Cointegration Test
    3.2.1 Motivation
    3.2.2 VECM representation
    3.2.3 Johansen's LR-test
4 Cointegration-based Statistical Arbitrage
  4.1 Introduction to Cointegration-based Statistical Arbitrage
  4.2 Theoretical Foundation
  4.3 The Cointegration-based Statistical Arbitrage Forecasting Framework
    4.3.1 Historical Data
    4.3.2 Pairs Identification Procedure
    4.3.3 Pairs Identification Period Length and Pairs Trading Period Length
    4.3.4 Spread Modeling and Forecasting
    4.3.5 Implementation of Trading Rules
  4.4 Statistical Arbitrage and Irregular Market Inefficiencies
  4.5 Literature Review
5 Data
6 Methodology
  6.1 Basic Cointegration-based Statistical Arbitrage Forecasting Framework
    6.1.1 Johansen Cointegration Test as Identification Procedure
    6.1.2 Multiple Testing Problem and Preselection of Assets
    6.1.3 Forecasting and Trading the Spread
    6.1.4 Return Calculation
  6.2 Financial Performance and Risk Measures
7 Results
  7.1 Analysis of Performance and Risk Characteristics
    7.1.1 Portfolios generated by Specification 1, 2 & 3
    7.1.2 Portfolios generated by Specification 4
    7.1.3 Portfolios generated by Specification 5
    7.1.4 Portfolios generated by Specification 6
  7.2 Investigation of Irregular Market Inefficiencies
8 Conclusion
References
Appendix
  A Data
  B Return Calculations of Individual Pair
  C Financial Performance Measures and Tests
    C.1 Cumulative Excess Return Index
    C.2 Mean Daily/Weekly Excess Return
    C.3 Annualized Excess Return
    C.4 Daily/Weekly Sharpe Ratio
    C.5 Annualized Sharpe Ratio
    C.6 Daily/Weekly Market Alpha
  D Financial Risk Measures
    D.1 Maximum Drawdown Percentage and Maximum Drawdown Period Length
    D.2 Daily/Weekly Market Beta
    D.3 Value-at-Risk
    D.4 Risk Factor Models
      D.4.1 Fama-French Factor Model
      D.4.2 Currency Risk Factor Model of Lustig, Roussanov, and Verdelhan
  E Different Specifications of Forecasting Framework


1 Introduction

While closely related but distinct theories such as the random walk theory (Pearson, 1905) had been introduced earlier, the term Efficient Market Hypothesis (EMH) was first used six decades later, by Roberts (1967). Like others, such as Fama (1965), he argued that all financial markets are efficient. Applying the definition of Jensen (1978), a market is said to be efficient with respect to an information set if it is not possible to make economic profits by trading on the basis of this information set. In other words, if the EMH holds, it is impossible to forecast price series in financial markets in such a way that consistent positive returns can be generated. Although a vast amount of literature has been written in which the EMH is questioned or attacked, the mainstream academic view defends the hypothesis. An overview is given by Sewell (2011).

There are, however, instances in which the EMH is clearly rejected for a period of time, resulting in an irregular market inefficiency. A famous example among practitioners dates back to 1985, and the underlying trading strategy can be described as univariate statistical arbitrage, also known as pairs trading (Pole, 2011). Specifically, a small group of researchers at Morgan Stanley discovered that, by simultaneously buying and selling a pair of fundamentally related assets, positive market-neutral returns could be earned due to the mean-reverting property of the price spread. Pole (2011) showed that this pairs trading strategy was profitable for at least 15 years, after which the positive returns faded. For a short but significant period of time, a market inefficiency was present.

Due to improvements in fields such as time series econometrics and machine learning, the EMH remains a subject of intense study. Proponents largely argue that, through the creative self-destruction of predictability, eventually all markets are efficient. Opponents of the EMH use examples such as the one given above to show that there will always be opportunities present. Timmermann and Granger (2004) combine both views and argue that the argument of the EMH opponents, namely the existence of irregular market inefficiencies, fits perfectly within the EMH. Based on the work of Black (1986), they conclude that markets are efficient most of the time, but that there are instances in which certain models and methods can be exploited such that the underlying market is irregularly forecastable.

Timmermann and Granger (2004) even go a step further by suggesting which forecasting approaches might work and when, and this is the point of departure for this thesis. Specifically, in this thesis, the characteristics of a forecasting framework to exploit irregular market inefficiencies are investigated. To narrow down the scope, this research focuses solely on the cointegration-based statistical arbitrage forecasting framework. Moreover, because statistical arbitrage is a widely known trading strategy among researchers and practitioners, its potential for consistent returns should be marginal under the EMH. In this way, it can be argued that the statement of Black (1986) and Timmermann and Granger (2004) is supported even more if market inefficiencies can still irregularly be found with a widely known trading strategy such as statistical arbitrage. Finally, a further reason to investigate only the cointegration-based statistical arbitrage framework is that it is built on sound econometric theory.

In summary, this thesis investigates the characteristics of a cointegration-based statistical arbitrage forecasting framework that can possibly exploit irregular market inefficiencies. First, six different specifications of the underlying forecasting framework are developed. Each distinct specification of the forecasting framework is then back-tested with either equity or currency pair data consisting of either daily or weekly observations. Every back-test essentially generates a portfolio whose characteristics can be analyzed by evaluating several financial performance and risk measures. Based on these analyses, it can be argued which specification of the forecasting framework can possibly exploit irregular market inefficiencies.

The remainder of this thesis is organized as follows. Section 2 presents the theory of the EMH. Then, in Section 3, the theory of cointegration is presented. Section 4 provides information about cointegration-based statistical arbitrage in general. Section 5 discusses the data and Section 6 outlines the methodology of this research. The empirical results and the underlying analysis are stated in Section 7. Finally, Section 8 presents the conclusion.

2 The Efficient Market Hypothesis

This section presents an overview of the Efficient Market Hypothesis (EMH) from a forecasting perspective. Specifically, it is shown why formal tests on the EMH are complex and, furthermore, it is shown how irregular market inefficiencies are incorporated in the theory of the EMH. From this, it immediately follows that this thesis is not concerned with testing the validity of the EMH. Instead, this thesis is interested in investigating a forecasting framework that can possibly exploit irregular market inefficiencies even if the EMH holds.

This section is organized as follows. First, the definition of the EMH is presented in Subsection 2.1, together with its implications. In Subsection 2.2, it is argued why formally testing the EMH can be difficult. Thereafter, it is shown how irregular market inefficiencies are incorporated in the theory of the EMH; this is done in Subsection 2.3. Finally, in Subsection 2.4, suggestions are given for how to maximize the chance of finding and exploiting irregular market inefficiencies. The discussion closely mirrors the work of Timmermann and Granger (2004).

2.1 Definition of the EMH and its implications

The EMH is a hypothesis developed in the mid 1960s by Fama (1965), Samuelson (1965) and Roberts (1967), among others, and simply states that financial markets are efficient. Multiple definitions can be found in the literature, but here the definition of Jensen (1978) is used. He states that "a market is efficient with respect to an information set $\Omega_t$ if it is impossible to generate economic profits by trading on the basis of information set $\Omega_t$". Although this definition seems rather simple and intuitive, it has important implications. To see this, note that the definition of an efficient market effectively consists of three components, namely (1) the information set $\Omega_t$, (2) a forecasting framework that makes use of this information set $\Omega_t$ and that is translated into a trading strategy, and (3) the economic profits generated by this trading strategy, which eventually determine whether a market is efficient.


First, the information set $\Omega_t$ has to be clearly defined. In the literature, there are three main forms of the EMH. If $\Omega_t$ consists only of past and current asset prices, then the EMH is in its weak form. Also incorporating all other publicly available information in $\Omega_t$ results in the EMH in semi-strong form. Finally, if $\Omega_t$ consists of all publicly and privately available information, then the EMH is in its strong form.

If the information set $\Omega_t$ is defined, then a forecasting framework expressed as a trading strategy can be specified. An important implication of the EMH is that it does not directly state that returns are impossible to forecast. Timmermann and Granger (2004) show this as follows. Let $\Omega_t$ be the information set at time $t$ and let $R_{t+1}$ be the return forecast generated by a trading strategy over the interval $[t, t+1]$. Finally, denote the stochastic discount factor or pricing kernel that accounts for variations in economic risk premia over the interval $[t, t+1]$ by $Q_{t+1}$, such that $Q_{t+1}R_{t+1}$ is the risk-adjusted return. Then, ignoring transaction costs, the EMH is represented by:

$$\mathrm{E}\left[Q_{t+1}R_{t+1} \mid \Omega_t\right] = 0. \qquad (1)$$

Under certain assumptions, the above equation is equivalent to the moment equation found by Harrison and Kreps (1979). Rewriting (1) yields:

$$\mathrm{E}\left[R_{t+1} \mid \Omega_t\right] = -\frac{\mathrm{Cov}\left(Q_{t+1}, R_{t+1} \mid \Omega_t\right)}{\mathrm{E}\left[Q_{t+1} \mid \Omega_t\right]}. \qquad (2)$$

The expression in (2) shows that, under the EMH, forecasting frameworks are able to successfully predict future returns, but only through the conditional covariance between the future return $R_{t+1}$ and the stochastic discount factor $Q_{t+1}$.
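For clarity, the rewriting step from (1) to (2) uses only the conditional covariance decomposition:

$$\mathrm{Cov}\left(Q_{t+1}, R_{t+1} \mid \Omega_t\right) = \mathrm{E}\left[Q_{t+1}R_{t+1} \mid \Omega_t\right] - \mathrm{E}\left[Q_{t+1} \mid \Omega_t\right]\mathrm{E}\left[R_{t+1} \mid \Omega_t\right],$$

so that substituting $\mathrm{E}\left[Q_{t+1}R_{t+1} \mid \Omega_t\right] = 0$ from (1) and dividing by $\mathrm{E}\left[Q_{t+1} \mid \Omega_t\right] \neq 0$ yields (2).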

Even if a forecasting framework is found that can successfully predict future returns based on the information set $\Omega_t$, this does not by itself imply whether a market is efficient or not. That depends on whether it is possible to generate economic profits with the underlying trading strategy. Timmermann and Granger (2004) define economic profits as profits that are risk-adjusted and net of transaction costs. To formally incorporate transaction costs in the EMH, (1) has to be extended. First, let $c_t$ be the transaction costs that have to be paid at time $t$ in order to follow the trading strategy that is based on the forecasting framework. Define $R^*_{t+1} = Q_{t+1}R_{t+1}$ as the risk-adjusted return forecast. Then, a complete trading strategy at time $t$ simply maps the forecast of the risk-adjusted return $R^*_{t+1}$ and the corresponding transaction costs $c_t$ to the generated economic profits $f_t$. In other words, $f_t$ is a function of the transaction costs $c_t$ at time $t$ and the risk-adjusted return forecast $R^*_{t+1}$. Formally, the EMH is then denoted by:

$$\mathrm{E}\left[f_t\left(c_t, R^*_{t+1}\right) \mid \Omega_t\right] = 0. \qquad (3)$$

Thus, under the EMH, the expected economic profit generated by a trading strategy at time $t$ that produces a risk-adjusted return forecast $R^*_{t+1}$ is zero.

A further implication is that the EMH does not state how financial asset prices are related to their underlying intrinsic value or to the prices of other financial assets. This implies that, without violating the EMH, it is perfectly possible that asset prices deviate from their intrinsic value and, more extremely, that speculative bubbles can arise. For a deeper understanding, the interested reader is referred to Timmermann and Granger (2004).

2.2 Testing the EMH

Using (3), it is intuitive to set up a formal test of the form

$$H_0: \mathrm{E}\left[f_t\left(c_t, R^*_{t+1}\right) \mid \Omega_t\right] \leq 0 \quad \text{vs.} \quad H_a: \mathrm{E}\left[f_t\left(c_t, R^*_{t+1}\right) \mid \Omega_t\right] > 0 \qquad (4)$$

to investigate whether the EMH holds. While this intuition is correct, there are complications in the execution of this testing procedure. To see this, note that the risk-adjusted return forecast $R^*_{t+1}$ consists of two random variables, $R_{t+1}$ and $Q_{t+1}$, so that the above testing procedure is in fact a joint hypothesis testing problem. Furthermore, the stochastic discount factor $Q_{t+1}$ that accounts for the variations in the risk premia is hard to measure, and thus the economic profits $f_t$ are difficult to calculate. Because of these complications, among others, this thesis does not formally try to test the EMH.

2.3 The EMH and Irregular Market Inefficiencies

If the EMH as presented in (3) holds for every possible forecasting framework, then it necessarily also holds for the best forecasting framework. Implicitly, this presumes that investors or forecasters face no uncertainty about what the best forecasting framework is at any given time, and thus it implies that markets will always be efficient. In reality, however, investors or forecasters cannot know ex ante with certainty whether a forecasting model is optimal. This uncertainty about the optimal forecasting framework among investors can thus create market inefficiencies locally in time, or equivalently, irregular market inefficiencies. Timmermann and Granger (2004) define these within the framework of the EMH as follows.

First, market efficiency locally in time $t$ is defined by expanding (3). To do this, let $m_{it}\left(z_t; \hat{\theta}_t\right)$ denote the $i$-th forecasting model at time $t$ from the set $\mathcal{M}_t$ of all forecasting models available at time $t$, such that $m_{it} \in \mathcal{M}_t$. Specifically, the actual prediction of the forecasting model $m_{it}\left(z_t; \hat{\theta}_t\right)$ depends on its estimated parameters, stored in the vector $\hat{\theta}_t$, and on the input vector $z_t \in \Omega_t$ at time $t$. Now, note that the economic profits $f_t$ do not only depend on the transaction costs $c_t$ and the risk-adjusted return forecast $R^*_{t+1}$ generated by the forecasting model $m_{it}$, but also on the forecasting model itself. A market is then said to be locally efficient at time $t$ if:

$$\mathrm{E}\left[f_t\left(c_t, R^*_{t+1}, m_{it}\left(z_t; \hat{\theta}_t\right)\right) \mid \Omega_t\right] = 0, \qquad \forall\, m_{it} \in \mathcal{M}_t,\ z_t \in \Omega_t. \qquad (5)$$

The time index in (5) is necessary to avoid cases in which forecasting models are used before their discovery. Otherwise, researchers could possibly show a violation of the EMH by applying these models to data from before the models were invented. For example, employing a neural network forecasting model before it was published might result in ex post economic profits. In this way, the EMH based on an equation of the form (5) is not violated by this example, while the EMH based on (3) is violated.

As stated above, if model uncertainty is taken into account, markets can be locally inefficient. If this is the case, then there exists a time interval $Y = [t_{\mathrm{start}}, \ldots, t_{\mathrm{end}}]$ with $0 \leq t_{\mathrm{start}} \leq t_{\mathrm{end}} \leq T$ for time points $t \in \{0, 1, \ldots, T\}$, and there exists a model $m_{it} \in \mathcal{M}_t$ such that:

$$\mathrm{E}\left[f_t\left(c_t, R^*_{t+1}, m_{it}\left(z_t; \hat{\theta}_t\right)\right) \mid \Omega_t\right] > 0, \qquad z_t \in \Omega_t,\ \forall\, t \in Y. \qquad (6)$$

Finally, Timmermann and Granger (2004) incorporate these market inefficiencies in the EMH framework by stating a more specific definition of an efficient market with respect to a time horizon $\tau$. They state that a market is efficient over horizon $\tau$ if the length of the intervals in which irregular market inefficiencies are present is smaller than $\tau$ for all forecasting models $m_{it} \in \mathcal{M}_t$; that is, if $t_{\mathrm{end}} - t_{\mathrm{start}} < \tau$ for all $m_{it} \in \mathcal{M}_t$.

The definition of Timmermann and Granger (2004) thus states that irregular market inefficiencies can exist in an efficient market as long as these inefficiencies eventually disappear. In other words, even if the EMH based on (5) holds, it is still possible to forecast asset returns such that positive economic profits can be generated for some time interval without violating the EMH. In this way, the self-destruction of predictability is more explicitly incorporated in the EMH: if an irregular inefficiency becomes publicly known, a large group of investors will try to exploit this inefficiency such that it will quickly disappear.

Based on the EMH as outlined by Timmermann and Granger (2004), it immediately follows that this research does not try to reject the EMH. Instead, this research is solely concerned with investigating the statistical arbitrage forecasting framework that possibly exploits irregular market inefficiencies.

2.4 Suggestions when a Forecasting Framework might work

After arguing that irregular market inefficiencies can exist within the theory of the EMH, Timmermann and Granger (2004) also give suggestions as to when a forecasting framework might work. Three suggestions are highlighted:

1. Wide searches across models and assets

First, Timmermann and Granger (2004) suggest that, because of the uncertainty about the optimal forecasting model among investors, forecasters should review as many models from the set of available models $\mathcal{M}_t$ as possible. They argue that in this way economic profits can be discovered for a short period of time. Also, financial assets that are 'under-researched' are more likely to have inefficient market prices than 'over-researched' assets. Using this argument, Timmermann and Granger (2004) also suggest that investors should look at as many financial assets as possible.

2. Multiple data windows and time frequencies

If investors are uncertain about the optimal forecasting model, then it follows implicitly that they are also uncertain about which information $z_t \in \Omega_t$ to use for estimating the optimal parameters and which information $z_t \in \Omega_t$ to use for forecasting the asset returns. This suggests that a forecaster should investigate multiple data windows and multiple time frequencies to increase the chance of finding irregular market inefficiencies.

3. Thick modeling

Another suggestion that Timmermann and Granger (2004) give for finding irregular market inefficiencies is to combine the outcomes of multiple forecasting models and data windows and to base the eventual trading decision on this combination. This is also known as thick modeling. Aiolfi and Favero (2005) state that this approach, when used correctly, can substantially improve forecasting accuracy and thus have a positive effect on generating economic profits.

The goal of this thesis is to investigate the statistical arbitrage forecasting framework, which can potentially exploit irregular market inefficiencies. In searching for these irregular market inefficiencies, the suggestions made above will be incorporated into the statistical arbitrage forecasting framework.

3 Cointegration

As will become apparent, the theory of cointegration of financial time series plays a key role in developing our statistical arbitrage forecasting framework to identify potential irregular market inefficiencies. For this reason, the essential components of the underlying theory will be discussed in this section.

This section consists of the following parts. First, formal definitions concerning cointegration are given in Subsection 3.1. Subsection 3.2 then discusses the statistical testing procedure that is extensively used in this thesis to test for cointegration in pairs of asset prices.

3.1 Definitions

Cointegration is a phenomenon present in multivariate time series analysis in which two or more individual unit-root nonstationary time series have a common underlying trend such that these time series cannot drift too far apart. Put differently, there exists a linear combination of these individual nonstationary time series such that this combination is unit-root stationary. Examples of such time series are interest rates with different term lengths or stock prices of the same company that trade on different exchanges.

The above shows that cointegration relates to fundamental time series concepts such as multivariate stochastic processes and stationarity. To give the formal definition of cointegration, we must first state some of these underlying concepts.

First, it has to be clear what a time series is. A $k$-dimensional time series of $T$ observations is just a $k$-dimensional stochastic process, denoted by $\{x_t\}_{t=1}^{T}$. Given its filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t=1,\ldots,T}, \mathbb{P})$, a $k$-dimensional stochastic process is defined as follows (Etheridge, 2002):

Definition 3.1. (Stochastic process) A $k$-dimensional stochastic process is a sequence of $k$-dimensional random variables $\{x_t\}_{t=1}^{T}$ on $\Omega$ such that $x_t$ is $\mathcal{F}_t$-measurable for each $t = 1, \ldots, T$.

As Tsay (2010, p. 30) puts it, stationarity is the foundation of time series analysis. Two main forms of stationarity can be found in the literature: weak stationarity and strong stationarity. In this thesis, only the former is of interest (Spanos, 1999, pp. 419–420):

Definition 3.2. (Weak stationarity) A $k$-dimensional time series $\{x_t\}_{t=1}^{T}$ is said to be weakly stationary if:

1. $\mathrm{E}\left[x_t\right] = \mu$, $\forall t = 1, \ldots, T$, where $\mu$ is a $k$-dimensional vector of constants, and;

2. $\mathrm{E}\left[(x_t - \mu)(x_{t-l} - \mu)'\right] = \Gamma_l$, $\forall t = 1, \ldots, T$, where $\Gamma_l$ is only a function of $l$.

Most financial time series of asset prices are expected to be nonstationary. Modeling these time series by traditional ARMA models will most likely result in a corresponding AR polynomial with 1 as a characteristic root. This type of nonstationarity is often called unit-root nonstationarity. One method to overcome unit-root nonstationarity is by differencing the time series such that the d-th differenced series is an ARMA-process. More formally (Engle & Granger, 1987, p. 252):

Definition 3.3. (Integration) A time series $\{x_t\}_{t=1}^{T}$ is said to be integrated of order $d$, denoted by $x_t \sim I(d)$, if the series has a stationary and invertible ARMA representation after it is differenced $d$ times; that is, $\Delta^d x_t \sim \mathrm{ARMA}$, where $\Delta$ is the difference operator.

Now, the concept of cointegration can be formally defined (Engle & Granger, 1987, p. 253):

Definition 3.4. (Cointegration) The components of a $k$-dimensional time series $\{x_t\}_{t=1}^{T}$ are said to be cointegrated of order $d, b$, denoted by $x_t \sim CI(d, b)$, with $b > 0$ if:

1. all components of $x_t$ are $I(d)$, and if;

2. there exists a vector $\beta \neq 0$ such that $w_t = \beta' x_t \sim I(d - b)$.

$\beta$ is called the cointegrating vector.

As already implicitly stated, financial time series are most likely to be $I(1)$. Because $b > 0$, it follows that only cointegration of order $d = 1$, $b = 1$ is relevant in this thesis: $x_t \sim CI(1, 1)$. Moreover, because our statistical arbitrage forecasting framework is only interested in the cointegrating relation within pairs of two assets, we can restrict our attention to 2-dimensional or bivariate time series from now on.
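To make Definition 3.4 concrete, the following minimal Python sketch (an illustration with simulated data, not part of the thesis) generates two $I(1)$ series that share a common stochastic trend and checks with an ADF test that a suitable linear combination is stationary, so that the pair is $CI(1,1)$:

import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(42)
T = 500

# Common stochastic trend: a random walk shared by both series
trend = np.cumsum(rng.normal(size=T))

# Two I(1) series loading on the same trend, plus stationary noise
y = 1.0 * trend + rng.normal(scale=0.5, size=T)
z = 0.5 * trend + rng.normal(scale=0.5, size=T)

# Each series on its own is unit-root nonstationary ...
print("ADF p-value of y:", adfuller(y)[1])        # typically well above 0.05
print("ADF p-value of z:", adfuller(z)[1])

# ... but the combination w_t = y_t - 2 z_t eliminates the common trend,
# so (y, z) is CI(1, 1) with cointegrating vector beta = (1, -2)'
w = y - 2.0 * z
print("ADF p-value of spread:", adfuller(w)[1])   # typically well below 0.05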

The next subsection discusses the statistical procedure which is used throughout this thesis to formally test for cointegration in pairs of (log) asset prices.


3.2 Cointegration Test

Having stated all the necessary definitions regarding cointegration, the statistical procedure that is used extensively in this thesis to formally test for a cointegrating relation between pairs of asset prices is now presented. Specifically, this procedure is known as Johansen's LR test. In this subsection, first a motivation is given for why this procedure is chosen. Then, the vector error correction model (VECM) representation is highlighted. Finally, the Johansen LR test is stated.

3.2.1 Motivation

In the literature, two main cointegration testing approaches can be distinguished. The first approach is named after Engle and Granger (1987) and can be summarized as a two-step residual-based testing procedure. The other approach is due to Johansen (1988, 1995) and is implicitly based on a system of equations estimator (Lim & Martin, 1995).

The main idea behind the Engle-Granger approach is as follows. For simplicity, assume that the 2-dimensional time series under consideration, denoted by $\{x_t\}_{t=1}^{T}$, is potentially cointegrated of order $d = 1$, $b = 1$; that is, $x_t \sim CI(1, 1)$. If this assumption is true then, using Definition 3.4, the two components of $x_t$ are both $I(1)$ and there exists a linear relation between the two components that is unit-root stationary or $I(0)$. The Engle-Granger approach simply tests both conditions sequentially. The first step consists of testing whether both individual components are unit-root nonstationary. If this is the case, then the second condition can be evaluated. Specifically, in the second step, a regression model between the two individual components of the series $\{x_t\}_{t=1}^{T}$ is first estimated by OLS. That is, if $x_t = (y_t, z_t)'$, then the following regression model is estimated:

$$y_t = \alpha + \gamma z_t + \varepsilon_t, \qquad t = 1, \ldots, T. \qquad (7)$$

If the components of the time series $x_t$ are cointegrated, then the residual series $\{\hat{\varepsilon}_t\}_{t=1}^{T}$ from (7) represents the linear relation that should be unit-root stationary. This can once again be tested with unit-root stationarity testing procedures.
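A minimal Python sketch of this two-step procedure, assuming two one-dimensional numpy arrays of prices (a simplified illustration: statsmodels also provides a single-call wrapper, statsmodels.tsa.stattools.coint, and a proper residual-based test uses Engle-Granger critical values rather than the plain ADF values used here):

import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

def engle_granger_two_step(y, z, alpha=0.05):
    """Rough two-step check that the pair (y, z) is CI(1, 1)."""
    # Step 1: both components should be I(1), i.e. nonstationary in levels
    # but stationary after taking first differences.
    for series in (y, z):
        if adfuller(series)[1] < alpha:            # levels already stationary
            return False
        if adfuller(np.diff(series))[1] >= alpha:  # differences not stationary
            return False
    # Step 2: estimate the regression (7) by OLS and test the residuals.
    residuals = sm.OLS(y, sm.add_constant(z)).fit().resid
    # In practice the regression with z as the dependent variable should
    # also be checked, since most residual-based tests are not invariant
    # to the formulation of (7) (see the discussion below).
    return adfuller(residuals)[1] < alpha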

Different unit-root stationarity tests can be used in the two steps of the Engle-Granger approach. Examples are the augmented Dickey-Fuller unit-root test of Dickey and Fuller (1979), the Phillips-Perron unit-root test of Phillips and Perron (1988), the KPSS stationarity test of Kwiatkowski, Phillips, Schmidt, and Shin (1992) and the Phillips-Ouliaris test of Phillips and Ouliaris (1990). For an overview and a simulation study of these tests in the Engle-Granger approach, see Harlacher (2016, pp. 32–53).

Because of its simplicity, the Engle-Granger approach is widely used in the empirical cointegration literature. However, this approach has a serious disadvantage from the perspective of a statistical arbitrage forecasting framework. The disadvantage is that almost all of the unit-root stationarity tests mentioned above (except the Phillips-Ouliaris test) are not invariant to the formulation of the regression equation in (7) (Harlacher, 2016, p. 40). This means that, to correctly identify a cointegrating relation, not only the residual series of the regression in (7) has to be tested for stationarity, but also the residual series of the regression model in which $z_t$ is the dependent variable and $y_t$ is the regressor. From a theoretical perspective, this is not a big problem. However, from a practical perspective, this double estimation and testing in the second step of the Engle-Granger approach can drastically increase computing time and it can magnify the so-called multiple testing problem, which will be discussed in the Methodology. To see the computational burden, assume that we have an asset universe of 100 different assets. In total, these 100 different assets make up $\frac{100(100-1)}{2} = 4{,}950$ unique pairs. Applying the Engle-Granger approach to these pairs would then result in performing 4,950 unit-root stationarity tests in step 1 and 9,900 OLS estimations and unit-root stationarity tests in step 2. This is computationally not very efficient.

Even if the invariance to the formulation of the regression equation is not taken into account (e.g. the Phillips-Ouliaris test is used), the Engle-Granger method is still less efficient than the Johansen (1988) approach. In this case, the Engle-Granger approach still has to perform two unit-root stationarity tests and one OLS estimation sequentially.

As shown by Lim and Martin (1995), for example, the cointegration testing approach of Johansen (1988) is more efficient from a computing perspective than the Engle-Granger approach. This follows from the fact that the approach of Johansen (1988) simultaneously tests for cointegration in a system of equations instead of testing individual equations sequentially. For this reason, the approach of Johansen (1988) is used exclusively in our statistical arbitrage forecasting framework.

3.2.2 VECM representation

As already stated, the cointegration testing approach of Johansen (1988) is based on a system of equations. This system of equations is often represented by the multivariate vector autoregressive moving average (VARMA) model. However, for its simplicity in estimation, the attention in this research is restricted to the more intuitive vector autoregressive model of order $p$, or VAR($p$)-model. This paragraph states the VECM representation of a bivariate VAR($p$)-model, from which it becomes apparent how potential cointegrating relations can be formally tested.

Consider a bivariate VAR($p$) time series $\{x_t\}_{t=1}^{T}$ of which the components are possibly cointegrated:

$$x_t = \mu_t + \sum_{i=1}^{p} \Phi_i x_{t-i} + a_t, \qquad t = 1, \ldots, T, \qquad (8)$$

where $a_t$ is the innovation series, which is assumed to be normally distributed, and where $\mu_t = \mu_0 + \mu_1 t$ contains the deterministic constant and trend terms in the series if necessary. Also, assume that $x_t$ is at most $I(1)$, which is plausible for financial price series. Then, the VECM representation of the VAR($p$)-process $x_t$ is:

$$\Delta x_t = \mu_t + \Pi x_{t-1} + \sum_{i=1}^{p-1} \Phi^*_i \Delta x_{t-i} + a_t, \qquad t = 1, \ldots, T, \qquad (9)$$

where $\Pi = -\Phi(1) = -I_2 + \sum_{i=1}^{p} \Phi_i$ and $\Phi^*_i = -\sum_{j=i+1}^{p} \Phi_j$, and where $\Pi x_{t-1}$ is the error-correction term. It follows that the rank of the $(2 \times 2)$-matrix $\Pi$ solely determines the number of linearly independent cointegrating vectors and thus the number of common stochastic trends in the system of $x_t$ (Tsay, 2010, p. 433). The rank of $\Pi$ can only take 3 possible values:

1. $\mathrm{rank}(\Pi) = 0$. This implies $\Pi = 0$, and thus $\Delta x_t$ follows a VAR($p-1$)-model. In this case, $x_t$ is not cointegrated.

2. $\mathrm{rank}(\Pi) = 2$. This implies $\det(\Pi) \neq 0$ and thus $\det(\Phi(1)) \neq 0$. Hence, $x_t$ contains no unit root, or equivalently $x_t \sim I(0)$.

3. $\mathrm{rank}(\Pi) = 1$. In this case, $x_t$ is cointegrated such that there exists one cointegrating relation between the components of the series. Furthermore, $x_t$ has one common stochastic trend.

The components of a bivariate time series $x_t$ are thus cointegrated if the matrix $\Pi$ in the VECM has rank equal to 1. In this case, the matrix $\Pi$ can be decomposed as:

$$\Pi = \alpha\beta' = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}\begin{pmatrix} \beta_1 & \beta_2 \end{pmatrix},$$

where $\alpha$ is the adjustment vector and $\beta$ is the cointegrating vector. For our cointegration-based statistical arbitrage forecasting framework, the cointegrating vector will be important because it determines which position sizes must be taken to exploit a potential irregular market inefficiency in a cointegrated pair of assets. However, the above decomposition is not unique, in the sense that multiplying $\alpha$ by a constant $c \neq 0$ and dividing $\beta$ by the same constant yields the same matrix $\Pi$. To overcome this minor issue, it is common to set $\beta_1 = 1$ so that $\beta = (1, \beta_2)'$. Finally, note that $\alpha$ also needs to satisfy restrictions in order for $\beta' x_t \sim I(0)$ to hold.

The above discussion shows that the matrix $\Pi$ determines whether the components of the time series $x_t$ are cointegrated. The approach of Johansen (1988, 1995) to formally test for cointegration makes use of this fact by testing the rank of the matrix $\Pi$ in the VECM. This approach or, more explicitly, the Johansen (1988) LR-test, is discussed next.

3.2.3 Johansen’s LR-test

This paragraph states the LR-test of Johansen (1988), also known as the Johansen trace test, which can be used to formally test for cointegration in a multivariate time series. The derivation of this test and the underlying maximum-likelihood estimation procedure of the VECM are not discussed here but can be found in Tsay (2010, pp. 432–437).

The Johansen LR-test formally tests the number of cointegrating relations $m$ in a time series $x_t$ by testing the null hypothesis $H_0$ that the rank of the matrix $\Pi$ in the VECM representation in (9) is equal to $m$. That is,

$$H_0: \mathrm{rank}(\Pi) = m \quad \text{vs.} \quad H_a: \mathrm{rank}(\Pi) > m.$$

The corresponding likelihood ratio test statistic $LR(m)$ is given by:

$$LR(m) = -(T - p)\sum_{i=m+1}^{2} \ln\left(1 - \hat{\lambda}_i\right),$$

where $T$ is the number of time series observations used and $\hat{\lambda}_i$ is the $i$-th squared canonical correlation between the residuals of the regression equations for $\Delta x_t$ and $x_{t-1}$. The LR test statistic is not asymptotically $\chi^2$-distributed, as might be expected, but is in fact a function of standard Brownian motions. This follows from the fact that unit roots can be present under the null hypothesis. For this reason, the critical values must be obtained by simulation, in a similar manner as the critical values of the ADF unit-root stationarity test. It can be shown that these critical values depend on the dimension of the underlying time series, on the number of cointegrating relations $m$ assumed under the null hypothesis and, finally, on the form of the deterministic function $\mu_t$ in the VECM representation.

As stated before, only bivariate time series are considered here, such that the number of cointegrating relations $m$ can only take the value 0 or 1. To test for a cointegrating relation, the Johansen LR-test can be used as follows. First, the null hypothesis $H_0$ of no cointegrating relation ($m = 0$) is tested versus the alternative hypothesis that there is at least one cointegrating relation in the series $x_t$ ($m \geq 1$). The corresponding test statistic is $LR(0) = -(T - p)\left[\ln\left(1 - \hat{\lambda}_1\right) + \ln\left(1 - \hat{\lambda}_2\right)\right]$. If this null hypothesis $H_0$ is rejected, then the null hypothesis $H_0$ of only one cointegrating relation ($m = 1$) is tested versus the alternative hypothesis that $x_t \sim I(0)$. The corresponding test statistic reduces to $LR(1) = -(T - p)\ln\left(1 - \hat{\lambda}_2\right)$. If this last test does not reject the null hypothesis, then it is plausible that $x_t$ is cointegrated.
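As an illustration of this sequential testing logic, a minimal Python sketch using coint_johansen from statsmodels is given below (the deterministic-term specification and the lag order are placeholders; the choices actually used in this research are specified in the Methodology):

from statsmodels.tsa.vector_ar.vecm import coint_johansen

def johansen_pair_test(log_prices, det_order=0, k_ar_diff=1):
    """Sequential Johansen trace test for a bivariate (T x 2) array of log prices.

    Returns the cointegrating vector, normalized so that beta_1 = 1, if exactly
    one cointegrating relation is found at the 95% level; otherwise None.
    """
    result = coint_johansen(log_prices, det_order, k_ar_diff)
    trace_stats = result.lr1      # [LR(0), LR(1)]
    crit_95 = result.cvt[:, 1]    # 95% critical values for m = 0 and m = 1

    # Step 1: is H0: m = 0 (no cointegrating relation) rejected?
    if trace_stats[0] <= crit_95[0]:
        return None               # no evidence of any cointegrating relation
    # Step 2: is H0: m = 1 (exactly one relation) not rejected?
    if trace_stats[1] > crit_95[1]:
        return None               # evidence that the series is already I(0)

    beta = result.evec[:, 0]      # eigenvector of the largest eigenvalue
    return beta / beta[0]         # normalize so that beta_1 = 1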

4 Cointegration-based Statistical Arbitrage

While the last two sections were somewhat theoretical, they were necessary; we can now finally turn our attention to cointegration-based statistical arbitrage in general. As might be guessed from the introduction, statistical arbitrage is concerned with exploiting potential inefficiencies that are identified on the basis of a deviation from a statistical relationship between a pair of asset prices. If the statistical relationship is based on the theory of cointegration, then we speak of cointegration-based statistical arbitrage.

To be more specific, this section gives an overview of cointegration-based statistical arbitrage. First, in Subsection 4.1, a short introduction to cointegration-based statistical arbitrage is presented. Subsection 4.2 discusses some theoretical foundations of statistical arbitrage. Then the cointegration-based statistical arbitrage forecasting framework, which forms the basis of many scientific statistical arbitrage papers, is outlined in Subsection 4.3. Subsection 4.4 discusses how cointegration-based statistical arbitrage fits in the theory of the EMH as presented in Section 2. Finally, we then have enough tools to review the existing literature on cointegration-based statistical arbitrage; this is done in Subsection 4.5.


4.1 Introduction to Cointegration-based Statistical Arbitrage

Statistical arbitrage is best described as a trading strategy that tries to generate excess returns by exploiting inefficiencies in the form of deviations from previously identified statistical relationships between two or more speculative assets. The simplest form of statistical arbitrage, which is the main focus of this thesis, is called statistical arbitrage pairs trading or univariate statistical arbitrage and focuses on the statistical relationship between a single pair of assets. Statistical arbitrage is one of the main classes of trading strategies used by investment professionals such as investment banks, hedge funds and other institutional investors. Examples of other classes of trading strategies found in practice are global macro, event-driven, long-short equity or emerging market strategies. The investment approach of statistical arbitrage is seen as highly quantitative because the statistical relationships are identified by quantitative methods.

The intuition behind univariate statistical arbitrage is as follows (Vidyamurthy, 2004). The main principle of security investing from a valuation point of view is to buy securities that are valued below their intrinsic value and sell securities that are valued above their intrinsic value. In this way, assuming that security prices will eventually converge to their intrinsic value, positive returns should be generated. However, a practical limitation of this approach is that it is very hard, if not impossible, to correctly estimate the intrinsic or absolute value of a security. Statistical arbitrage tries to resolve this practical problem by looking at the relative value of securities in relation to other securities. This is called relative pricing. In this way, the absolute prices of the securities are no longer important as long as there exists a relationship between the assets. If, at some future point in time, one of the assets becomes overpriced in relation to the other assets, an inefficiency or mispricing is found that can be exploited by a trading strategy in which the relatively overpriced asset is sold and the relatively underpriced asset is bought. Hence, by taking a long-short position, the investor speculates that the relative mispricing will correct itself.

To eventually exploit the relative asset pricing approach in terms of a trading strategy, there has to be some relation between a pair of asset prices. In the case of statistical arbitrage, this relation between asset prices is for a large part based on statistical methods. If the theory of cointegration as presented in Section 3 is applied to investigate the statistical relationship between asset prices, then we speak of cointegration-based statistical arbitrage, which is the main focus of this thesis. Cointegration-based statistical arbitrage tries to find asset prices that are cointegrated, such that there exists a linear combination of these asset prices that is weakly stationary and thus mean-reverting. This linear combination of asset prices can be found with the cointegrating vector, and the resulting process is often called the cointegrating error or the spread process. Based on the spread, trading decisions on long and short positions are made, under the assumption that the asset prices remain cointegrated in the future such that the future spread converges back to its mean. Hence, the portfolio that is generated by the trading strategy is a long-short portfolio, and cointegration-based statistical arbitrage is a long-short trading strategy.

The above implicitly indicates that the success of cointegration-based statistical arbitrage crucially depends on the fact that the cointegrated asset prices remain cointegrated in the future.


In other words, the weakly stationary spread process that is identified with historical data must remain stationary, and thus mean-reverting, in the future. If this is not the case, the spread might never converge to the mean or it might even diverge. In that case, a trading strategy that is purely based on the assumption of mean reversion of the spread can potentially result in significant losses. Thus, in contrast to pure arbitrage, statistical arbitrage is not riskless. If the relation between asset prices is solely based on statistical methods such as cointegration, it might happen by chance that the prices of assets seem cointegrated for the given data window while in general the asset prices are not cointegrated at all. This happens, for example, when a cointegration test rejects the null hypothesis of no cointegration while the null hypothesis was in fact true. In other words, the statistical test produces a type I error. The spurious result of a type I error causes the trading strategy to fail, as is empirically shown by Gatev, Goetzmann, and Rouwenhorst (2006). Hence, to decrease the likelihood of spurious results in the identification of cointegrated pairs, the underlying assets of the pair should not only be statistically related but also fundamentally related. A theoretical foundation for this fact is given in the next subsection.

To be as clear as possible, a simple graphical example illustrating some of the statements above is now given. Consider the daily stock prices of the companies Deere & Co. and Honeywell Int'l Inc. over the period December 29, 2007 - December 29, 2009. Both companies operate in the same sector, namely the industrials sector, such that they are fundamentally related. The corresponding prices are given in Figure 1. It seems that both stock prices move closely together and thus are possibly cointegrated.

Figure 1: Daily stock prices of Deere & Co. (red) and Honeywell Int’l Inc. (blue) between December 29, 2007 and December 29, 2009

To formally test whether the stock prices are cointegrated, a Johansen trace test is applied to the logarithm of these prices. This test implies that the daily stock prices are indeed cointegrated. Moreover, writing $p^D_t$ for the logarithm of the stock price of Deere & Co. at time $t$ and $p^H_t$ for the logarithm of the stock price of Honeywell Int'l Inc. at time $t$, the cointegrating vector suggests that the spread $m_t$ obtained by the linear combination

$$m_t = p^D_t - 1.758\, p^H_t + 17.153 \qquad (10)$$

is weakly stationary. Indeed, Figure 2 graphically indicates that the spread $m_t$ is weakly stationary and thus mean-reverting. Assuming that (10) also holds in the future, a trading strategy can be developed in which one of the stocks is bought (sold) and a fraction of the other stock is sold (bought) whenever the spread exceeds some predetermined threshold value. How this is done is shown in Subsection 4.3 and in the Methodology.

Figure 2: Spread process obtained from a linear combination between the logarithm of the daily stock prices of Deere & Co. and Honeywell Int’l Inc. between December 29, 2007 and December 29, 2009
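For illustration, a short Python sketch of how such a spread series could be constructed once log prices and a cointegrating vector are available (the arrays log_price_deere and log_price_honeywell are hypothetical placeholders; the coefficients are those reported in (10)):

import numpy as np

def spread_series(log_price_a, log_price_b, gamma=1.758, const=17.153):
    """Spread m_t = p^A_t - gamma * p^B_t + const, as in equation (10)."""
    m = log_price_a - gamma * log_price_b + const
    # Standardizing by the identification-period mean and standard deviation
    # makes it easy to apply threshold rules of the form |z_t| > k later on.
    z = (m - m.mean()) / m.std()
    return m, z

# Example usage with hypothetical numpy arrays of log prices:
# m, z = spread_series(log_price_deere, log_price_honeywell)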

In summary, cointegration-based statistical arbitrage tries to find pairs of assets in which the underlying prices are cointegrated. If the asset prices are cointegrated, this means that there exists a linear combination of these asset prices, called the spread, that is weakly stationary and thus mean-reverting. To increase the likelihood that asset prices remain cointegrated in the future the prices should not only be statistically related but also the underlying companies should be fundamentally related.

4.2 Theoretical Foundation

The previous subsection introduced the concept of cointegration-based statistical arbitrage and the main intuition why the trading strategy might be successful or might fail. This subsection gives a more theoretical motivation for statistical arbitrage. Specifically, the Arbitrage Pricing Theory (APT) of Ross (1976) is used to argue that the inclusion of a fundamental relationship between assets in the cointegration-based statistical arbitrage strategy increases the likelihood of success of the trading strategy.

The APT was first introduced by Ross (1976) and its simplest version is based on three assumptions (Luenberger, 1998). First, the APT assumes that investors prefer greater returns over lesser returns if these returns are certain. Secondly, it assumes that the universe of assets being considered is infinitely large. Of course, this assumption is not met in reality, but it can be argued that the current universe of financial assets is sufficiently large for this assumption to be well approximated. Finally, the return $r_i$ of the $i$-th individual asset is governed by a factor model consisting of $m$ factors $f = (f_1, \ldots, f_m)$ without an error term:

$$r_i = a_i + \sum_{j=1}^{m} b_{ij} f_j, \qquad i = 1, 2, 3, \ldots, \qquad (11)$$

where $a_i$ and the factor loadings $b_{ij}$ for all $j = 1, \ldots, m$ are fixed constants. Because (11) does not contain an error component, the uncertainty in the asset return $r_i$ is solely caused by the uncertainty in the factors $f$.

The APT states that the fixed constants $a_i$ and $b_{ij}$ for all $j = 1, \ldots, m$ should be related to each other in equilibrium if arbitrage opportunities are to be excluded. That is, there are constants $\lambda_0, \lambda_1, \ldots, \lambda_m$ such that:

$$\mathrm{E}\left[r_i\right] = \lambda_0 + \sum_{j=1}^{m} b_{ij} \lambda_j, \qquad i = 1, 2, 3, \ldots,$$

where $\mathrm{E}\left[r_i\right]$ is the expected return of the $i$-th asset and $\lambda_j$ is called the factor price of factor $f_j$.

Vidyamurthy (2004) was the first to apply the APT to the framework of statistical arbitrage. He argues that financial assets with similar risk exposures should have approximately the same values for the fixed constants $a_i$ and $b_{ij}$ for all $j = 1, \ldots, m$ in the factor model (11). If this is the case then, by the APT, the expected returns of these assets should be approximately the same for all time windows. Hence, the underlying prices of fundamentally related assets should move together and thus are possibly cointegrated.

The above discussion suggests that cointegration-based statistical arbitrage is less prone to spurious results if not only statistical but also fundamental relations are incorporated in the identification of cointegrated asset prices. In other words, the implicit assumption of statistical arbitrage that the historically identified spread will remain weakly stationary in the near future is, based on the APT, more likely to hold if not only a statistical but also a fundamental motivation is incorporated. As will become clear in the Methodology, the incorporation of a fundamental relation in the identification of potential pairs can be accomplished by preselecting all assets based on the industry or sector in which the underlying companies operate.
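A small Python sketch of such a sector-based preselection (the mapping from assets to sectors and the asset names are hypothetical; pairs are only formed within a sector):

from itertools import combinations

def pairs_within_sectors(sector_map):
    """sector_map maps an asset name to its sector; return all unique pairs of
    assets that share a sector, so that only fundamentally related pairs are
    passed on to the cointegration test."""
    by_sector = {}
    for asset, sector in sector_map.items():
        by_sector.setdefault(sector, []).append(asset)
    pairs = []
    for assets in by_sector.values():
        pairs.extend(combinations(sorted(assets), 2))
    return pairs

# Example usage with a hypothetical mapping:
# pairs = pairs_within_sectors({"DE": "Industrials", "HON": "Industrials", "XOM": "Energy"})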

4.3 The Cointegration-based Statistical Arbitrage Forecasting Framework

This subsection presents the general cointegration-based statistical arbitrage forecasting framework as it can be found in most scientific statistical arbitrage papers. The term cointegration-based statistical arbitrage forecasting framework is best described as the set of procedures or steps that are necessary to back-test and employ a complete cointegration-based statistical arbitrage trading strategy. As implicitly became apparent from Subsection 4.1, cointegration-based statistical arbitrage consists mainly of two parts, namely pairs identification and pairs trading. The first part is concerned with identifying cointegrated pairs that are suitable for statistical arbitrage. In the second part the cointegrated pairs are actually traded. However, each of these two parts consists of multiple procedures in which choices have to be made that influence the success of the forecasting framework. For this reason, the statistical arbitrage trading framework is split up here into five smaller steps instead of the two main parts.

The five steps of the cointegration-based statistical arbitrage forecasting framework will now be discussed in short. To ease the discussion, for each of the five steps we consider only one hypothetical pair of two assets in the cointegration-based statistical arbitrage forecasting framework. These two assets are denoted by A and B. Once all five steps have been discussed, the framework is easily generalized to multiple pairs of assets.

4.3.1 Historical Data

The first step in the cointegration-based statistical arbitrage forecasting framework is to gather the necessary historical data. Because only one single pair is considered, this amounts to obtaining the historical prices of the assets underlying the pair. In this step, it has to be decided which historical prices are gathered. Examples are daily closing prices or opening prices of one minute intervals. Let the obtained historical prices for both assets be denoted as the time series $\{P^A_t\}_{t=1}^{T}$ and $\{P^B_t\}_{t=1}^{T}$ such that both consist of $T$ observations.

4.3.2 Pairs Identification Procedure

The second step in the cointegration-based statistical arbitrage forecasting framework consists of determining which approach is used to identify whether a pair of asset prices is cointegrated. In the statistical arbitrage literature, two main approaches are used. The first approach is a nonparametric method and is often called the minimum distance approach or, for short, the distance approach. The other approach is based on pure cointegration theory and is thus simply called the cointegration approach.

The distance approach was first introduced by Gatev et al. (2006) and tries to identify cointegrated pairs by measuring the comovement of the underlying asset prices in a nonparametric way. Specifically, it calculates the sum of squared differences (SSD) between the normalized prices of both assets. For the assets A and B this is done as follows. Assume that the pairs identification period consists of $\tau$ observations and let $\{\tilde{P}^A_t\}_{t=1}^{\tau}$ and $\{\tilde{P}^B_t\}_{t=1}^{\tau}$ be the normalized prices of assets A and B, where:

$$\tilde{P}^A_t = \frac{P^A_t}{P^A_1}, \qquad \tilde{P}^B_t = \frac{P^B_t}{P^B_1}, \qquad t = 1, \ldots, \tau.$$

Then the SSD between the normalized prices of assets A and B over an identification period of length $\tau$ is given by:

$$SSD_\tau(A, B) = \frac{1}{\tau}\sum_{t=1}^{\tau}\left(\tilde{P}^A_t - \tilde{P}^B_t\right)^2.$$


If two asset prices are cointegrated, then the prices should move closely together such that the SSD is expected to be small. If asset prices are not cointegrated, the SSD is expected to be large. Hence, pairs can be identified as suitable pairs for statistical arbitrage if their corresponding SSD is smaller than some threshold value.

In the statistical arbitrage literature, the distance approach is the most widely used method for identifying cointegrated pairs. Different reasons for this can be given. First of all, the method is very intuitive and easy to use. Another reason is that the method is nonparametric: no parameters have to be estimated and thus there is no risk of making estimation errors. Furthermore, it is argued by Gatev et al. (2006) that pairs identified with the distance approach implicitly contain the property of cointegration; they state that high-potential pairs with cointegrating components are likely to be identified by the distance approach. However, the distance approach also has disadvantages. The main disadvantage concerns the practical application of the approach. As stated, a cointegrated pair is identified if the underlying SSD is smaller than some threshold value. In practice, this threshold value is very hard, if not impossible, to determine. To overcome this problem, many papers do not use a threshold value at all and only select the n pairs with the smallest SSD for statistical arbitrage. For example, a researcher can choose to select only the 20 pairs with the smallest SSD from a set of 1000 pairs; this is done by Gatev et al. (2006). However, applying the distance approach in this fashion results in the new problem that pairs will always be selected for statistical arbitrage, even if in reality none of the underlying asset prices are cointegrated.
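A minimal Python sketch of the distance approach as described above (a simplified illustration; the dictionary structure, the asset names and the number of selected pairs are placeholders, and the top-n selection follows the common practice just mentioned rather than a threshold):

import numpy as np
from itertools import combinations

def ssd(prices_a, prices_b):
    """Mean sum of squared differences between prices normalized to 1 at t = 1."""
    normalized_a = prices_a / prices_a[0]
    normalized_b = prices_b / prices_b[0]
    return np.mean((normalized_a - normalized_b) ** 2)

def select_pairs_by_distance(price_dict, n_pairs=20):
    """Rank all unique pairs by their SSD and keep the n_pairs smallest ones.

    price_dict maps an asset name to a numpy array of identification-period prices.
    """
    scores = {
        (a, b): ssd(price_dict[a], price_dict[b])
        for a, b in combinations(sorted(price_dict), 2)
    }
    return sorted(scores, key=scores.get)[:n_pairs]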

As the name suggests, the cointegration approach is a more sophisticated approach for identifying cointegrated pairs. Cointegration was discussed comprehensively in Section 3. While the cointegration approach is parametric, such that there is a risk of making estimation errors, this approach is, unlike the distance approach, based on sound statistical theory. As shown in Section 3, cointegration procedures can explicitly test whether there is a linear combination of a pair of asset prices that is weakly stationary and thus mean-reverting, whereas the distance approach only tries to identify cointegrated pairs implicitly by measuring the comovement in the asset prices. For these reasons, next to the inclusion of a fundamental relation, the cointegration approach is used exclusively for the identification of cointegrated pairs in this thesis. How cointegration is exactly used in the framework of cointegration-based statistical arbitrage is shown in the Methodology.

4.3.3 Pairs Identification Period Length and Pairs Trading Period Length

If the necessary historical data is gathered and the approach for identifying cointegrated pairs is chosen, then it has to be decided how many data points are actually used in the identification period. This is called the pairs identification period length. Furthermore, it has to be decided how long it is assumed that pairs of asset prices remain cointegrated in the future. In other words, it has to be decided how long the trading period is before new cointegrated pairs must be identified. This is called the pairs trading period length.
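As a simple illustration of how these two lengths interact, a small Python sketch of a rolling identification/trading split is given below (the window lengths are arbitrary placeholders, not the choices made in this thesis):

def rolling_windows(n_obs, ident_len=252, trade_len=126):
    """Yield (identification, trading) index ranges that roll forward through
    a sample of n_obs observations, one trading period at a time."""
    start = 0
    while start + ident_len + trade_len <= n_obs:
        ident = range(start, start + ident_len)
        trade = range(start + ident_len, start + ident_len + trade_len)
        yield ident, trade
        start += trade_len

# Example: for 1000 daily observations, roughly one year of identification data
# followed by half a year of trading, rolled forward half a year at a time.
# for ident_idx, trade_idx in rolling_windows(1000): ...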

Both period lengths can have an impact on the performance of a cointegration-based statistical arbitrage forecasting framework when improperly chosen. Furthermore, because both period lengths depend on the time frequency of the data and on the pairs identification approach, care must be taken in the determination of those lengths (Krauss, 2017).

4.3.4 Spread Modeling and Forecasting

With the above three steps, one can investigate whether the asset prices of A and B are cointegrated. If the prices of assets A and B are cointegrated, then it follows from the definition of cointegration that there is a linear combination of the asset prices that is a weakly stationary process. Based on this weakly stationary spread process, trading decisions in the form of a trading strategy can eventually be made. For this reason, it might be beneficial to model the underlying spread process such that forecasts can be produced which can be used to optimize trading rules. Because the cointegration tests already concluded that the spread is weakly stationary, one can decide to model the spread as a traditional ARMA process. Moreover, one can go a step further by using more sophisticated forecasting techniques from the fields of nonlinear time series analysis or machine learning. This is, however, not done in this thesis.

As will become apparent from Subsection 4.5, many papers do not model the underlying spread processes. The main reason is that cointegration theory already concluded that the spread process is weakly stationary. This implies that the spread process has the mean reversion property, which can easily be exploited by introducing simple trading rules. The implementation of trading rules is discussed next.
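If one did nevertheless want to model and forecast the spread, a minimal sketch could look as follows (the ARMA order and forecast horizon are illustrative choices, not those of the thesis; spread is assumed to be a numpy array from the identification period):

from statsmodels.tsa.arima.model import ARIMA

def fit_and_forecast_spread(spread, order=(1, 0, 1), horizon=5):
    """Fit an ARMA(p, q) model to a weakly stationary spread series and
    forecast it a few steps ahead (d = 0: no differencing is needed)."""
    fitted = ARIMA(spread, order=order).fit()
    return fitted.forecast(steps=horizon)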

4.3.5 Implementation of Trading Rules

To make the cointegration-based statistical arbitrage forecasting framework complete, all the above steps have to be converted into an actual trading strategy. This is done by implementing explicit trading rules. The most widely used trading rule in the framework of cointegration-based statistical arbitrage is based on the mean reversion property of the weakly stationary spread series. The trading rule states that one has to either buy or sell the spread when the spread deviates from its mean by more than a threshold of k times its standard deviation. Buying the spread means that one of the underlying assets of the spread has to be bought and a fraction of the other asset has to be sold. A more explicit explanation is given in the Methodology. Other trading rules that can be implemented are, for example, stop-loss rules.
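A minimal Python sketch of such a threshold rule (the threshold k, the entry and exit logic and the array names are illustrative assumptions; a position of +1 means buying the spread, -1 selling it, and 0 being flat):

import numpy as np

def threshold_positions(spread, mu, sigma, k=2.0):
    """Open a position when the spread is more than k standard deviations away
    from its mean; close the position when the spread has reverted to the mean."""
    positions = np.zeros(len(spread))
    current = 0
    for t, value in enumerate(spread):
        z = (value - mu) / sigma
        if current == 0:
            if z > k:
                current = -1   # spread too high: sell the spread
            elif z < -k:
                current = 1    # spread too low: buy the spread
        elif (current == 1 and z >= 0) or (current == -1 and z <= 0):
            current = 0        # spread has crossed its mean again: close
        positions[t] = current
    return positions

# Example usage with the spread of Subsection 4.1:
# positions = threshold_positions(m, mu=m.mean(), sigma=m.std(), k=2.0)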

The above five steps together make up a single specification of the cointegration-based statistical arbitrage forecasting framework. The performance of a specification of the forecasting framework can be investigated by executing a back-test in which certain financial performance measures, such as the cumulative return and the Sharpe ratio, are examined. The above five steps of the cointegration-based statistical arbitrage forecasting framework show that the forecasting framework can possibly be improved by making different decisions in each of the five steps. This is exactly the approach taken to investigate the cointegration-based statistical arbitrage forecasting framework that can exploit irregular market inefficiencies.

4.4 Statistical Arbitrage and Irregular Market Inefficiencies

This subsection describes how the cointegration-based statistical arbitrage forecasting framework fits in the theory of the EMH and the corresponding irregular market inefficiencies as presented in Section 2. Recall that a market is efficient over a time horizon τ if, in general, (5) holds over a time horizon of length τ. However, it can happen that for some time interval with a length smaller than τ irregular market inefficiencies in the form of (6) are present.

Using the notation as in (5) and (6), we can state the following. Evidently, the complete cointegration-based statistical arbitrage forecasting framework eventually generates economic profits or losses, so that the complete framework is denoted by $f_t$. The economic profits or losses $f_t$ are influenced by (1) the underlying forecasting model $m_{it}$ that is a function of information $z_t$ and estimated parameters $\hat{\theta}_t$, (2) the risk-adjusted return $R^*_{t+1} = Q_{t+1} R_{t+1}$ that the forecasting model predicts over the next time period and (3) transaction costs $c_t$.

In the case of the cointegration-based statistical arbitrage forecasting framework, it can be argued that step 2 through step 5 of the forecasting framework together form a forecasting model $m_{it}$. If the identification of cointegrated pairs is only based on cointegration tests, then it follows that the information $z_t \in \Omega_t$ used for producing a forecast consists solely of current and historical asset prices that are available at time $t$. Thus, the EMH in its weak form is considered. Furthermore, note that in steps 2, 4 and 5 of the forecasting framework parameters possibly have to be estimated; these are denoted by $\hat{\theta}_t$.

If the parameters $\theta_t$ of the forecasting model $m_{it}$ are estimated, then it is possible to generate a forecast of the future return $R_{t+1}$ by using information $z_t \in \Omega_t$ as input. However, as suggested in Subsection 2.2, it is not possible to measure the stochastic discount factor $Q_{t+1}$ exactly and thus to measure the risk-adjusted return $R^*_{t+1} = Q_{t+1} R_{t+1}$ directly. While it is not possible to measure the risk-adjusted return $R^*_{t+1}$ directly, it is still possible to investigate it by evaluating risk-adjusted performance statistics and applying appropriate risk factor models to the normal return series. This is explained more carefully in the Methodology. Finally, using the transaction costs $c_t$, the actual economic profits $f_t$ of the cointegration-based statistical arbitrage forecasting framework can be calculated.

Having described how the forecasting framework fits in the theory of the EMH, we can now turn our attention to the irregular market inefficiencies that can possibly be captured with the forecasting framework. As became clear from the previous subsection, cointegration-based statistical arbitrage first identifies cointegrated pairs and then possibly trades them. It can thus be argued that a potential irregular market inefficiency can arise in the asset prices of a cointegrated pair. The identification of this potential irregular market inefficiency is then done by using the forecasting procedure in step 2 to step 5 of the forecasting framework. Finally, to investigate whether the potential inefficiency was an actual irregular market inefficiency, the corresponding economic profits must be calculated. If these profits are positive, it is assumed that the forecasting framework successfully exploited the irregular market inefficiency.

4.5 Literature Review

At this point, enough background has been given to review the existing literature on statistical arbitrage. This subsection reviews that literature from the viewpoint of the forecasting framework discussed in Subsection 4.3. Only the most popular statistical arbitrage literature is reviewed here. For a more extensive statistical arbitrage literature review, the reader is referred to Krauss (2017).

While statistical arbitrage was fairly popular among practitioners immediately after its introduction in the mid 1980s, it took more than 20 years before the scientific community caught up. The surge of academic interest in statistical arbitrage can be attributed to Gatev et al. (2006), who introduced an intuitive statistical arbitrage trading strategy and whose paper became the most cited work on statistical arbitrage. For this reason, the work of Gatev et al. (2006) is reviewed first.

The forecasting framework of Gatev et al. (2006) is as follows. To investigate their forecasting framework, they use daily price series of the most liquid stocks in the CRSP database over the time period 1962 to 2002. As pairs identification procedure, they employ the minimum distance approach described in Paragraph 4.3.2 and select pairs for trading based on the ranking of the SSD. Specifically, they use the top 5 and top 20 pairs with the lowest SSD in the identification period for trading in the trading period. They also consider the 101st-120th pairs with the lowest SSD in the identification period for trading. Gatev et al. (2006) fix the pairs identification period to 12 months and the pairs trading period to 6 months. To actually trade the pairs identified in steps 2 and 3 of the forecasting framework, Gatev et al. (2006) do not model or forecast the resulting spread series but instead directly implement a mean reversion trading rule: if the spread exceeds two times the standard error of the spread in absolute terms, then the pair is traded under the assumption that the spread converges back to its mean.
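For illustration, the minimum distance identification step of Gatev et al. (2006) could be sketched as follows: each price series is normalized to start at one, the SSD is computed for every possible pair and the pairs with the smallest SSD are retained. The variable names and the cut-off of 20 pairs are illustrative assumptions.

# A sketch of the minimum distance (SSD) pairs identification: `prices` is assumed
# to be a DataFrame of daily prices, one column per stock, over the identification
# period.
from itertools import combinations
import pandas as pd

def top_ssd_pairs(prices: pd.DataFrame, n_pairs: int = 20):
    normalized = prices / prices.iloc[0]               # normalized price paths
    ssd = {
        (a, b): ((normalized[a] - normalized[b]) ** 2).sum()
        for a, b in combinations(prices.columns, 2)
    }
    return sorted(ssd, key=ssd.get)[:n_pairs]          # pairs with the lowest SSD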

Gatev et al. (2006) eventually show that the annualized excess return generated by this forecasting framework over the period 1963 to 2002 is approximately 11% with a Sharpe ratio of around 0.55. Furthermore, Gatev et al. (2006) investigate the risk characteristics by applying the Fama-French risk factor model to the generated return series and conclude that a large part of the generated returns cannot be explained by these risk factors.
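The risk attribution step can be illustrated with a simple OLS regression of the strategy's excess returns on the three Fama-French factors; the intercept then measures the part of the return left unexplained by the factors. The input names below are hypothetical and only serve as a sketch of this type of regression.

# A sketch of a Fama-French three-factor regression: `excess_returns` is a
# hypothetical Series of strategy excess returns and `factors` a DataFrame with
# columns 'Mkt-RF', 'SMB' and 'HML' over the same dates.
import pandas as pd
import statsmodels.api as sm

def fama_french_regression(excess_returns: pd.Series, factors: pd.DataFrame):
    X = sm.add_constant(factors[["Mkt-RF", "SMB", "HML"]])
    model = sm.OLS(excess_returns, X, missing="drop").fit()
    return model.params["const"], model.summary()      # alpha and full regression output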

The forecasting framework of Gatev et al. (2006) has some clear advantages. First of all, the generated trading returns are impressive from a performance and risk point of view. Second, the pairs identification procedure is very intuitive and free of parameters. Finally, because of the simplicity of the overall trading strategy, the forecasting framework is robust to data snooping bias. However, the forecasting framework also has several disadvantages. Without going into too much detail, the main disadvantages are that Gatev et al. (2006) did not incorporate transaction costs and that the minimum distance approach, as argued in Paragraph 4.3.2, is suboptimal.

(26)

The paper of Gatev et al. (2006) boosted both the quality and the quantity of the literature on statistical arbitrage. First of all, many researchers applied the forecasting framework to different data sets. Bianchi, Drew, and Zhu (2009), for example, apply the forecasting framework to daily price series of commodity market futures over the time period 1990 to 2008. They find that the framework generates positive and significant excess returns with a minor exposure to systematic risk. Broussard and Vaihekoski (2012) apply the forecasting framework to daily data obtained from the Finnish stock market over the period 1987 to 2008 and find similar results as Gatev et al. (2006). Finally, Bowen and Hutchinson (2016) use the forecasting framework of Gatev et al. (2006) on daily observations of the UK stock market over the interval 1979 to 2012.

Second, many researchers tried to improve the forecasting framework of Gatev et al. (2006). Do and Faff (2010, 2012), for example, incorporated transaction costs in the framework using the same data as Gatev et al. (2006) and concluded that most of the excess returns faded. Chen, Chen, Chen, and Li (2017) tried to improve the pairs identification procedure by investigating the correlation between the return series instead of the price series. Using the same data as Gatev et al. (2006), a pairs identification period of 5 years and a pairs trading period of 1 month, they report an average monthly return of 1.70%, which is twice as large as the returns reported by Gatev et al. (2006). However, they also did not incorporate transaction costs.

The statistical arbitrage research in the above mentioned papers all identified pairs using the minimum distance approach. As indicated, this approach is not optimal because it does not explicitly incorporate a statistical relationship such as cointegration or mean reversion. For this reason, a significant group of researchers applied cointegration theory instead of the minimum distance measure to identify pairs of asset prices for trading in step 2 of the forecasting framework. The book of Vidyamurthy (2004) is the most cited work by researchers who applied the cointegration approach for identifying pairs. A reason for this is that Vidyamurthy (2004) gives a theoretical motivation for why the theory of cointegration can be used to form a statistical arbitrage forecasting framework. This motivation was presented in Subsection 4.2. While Vidyamurthy (2004) mentioned cointegration explicitly for identifying pairs, he did not suggest finding cointegrated pairs with the Engle-Granger approach or the approach of Johansen as presented in Paragraph 4.3.2. Instead, he suggested a more practical approach in which mostly the zero crossing frequency of the spread is used. Furthermore, Vidyamurthy (2004) does not give empirical results for his suggested forecasting framework.
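The zero crossing measure suggested by Vidyamurthy (2004) can be illustrated with the following sketch, which simply counts how often a demeaned spread changes sign during the identification period; a higher count indicates more frequent mean reversion and thus more potential trading opportunities.

# A sketch of counting zero crossings of a demeaned spread series.
import numpy as np
import pandas as pd

def zero_crossings(spread: pd.Series) -> int:
    demeaned = spread - spread.mean()
    signs = np.sign(demeaned.values)
    return int(np.sum(signs[1:] * signs[:-1] < 0))   # sign changes between consecutive observations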

Papers that have empirically tested Vidyamurthy's proposed forecasting framework include the following. Dunis, Rudy, Giorgioni, and Laws (2010) tested their forecasting framework on different high-frequency data samples of the stocks in the Eurostoxx 50 index. Specifically, they used data with time intervals of 5, 10, 20, 30 and 60 minutes over the period July 3, 2009 to November 17, 2009. To identify pairs for trading, Dunis et al. (2010) first preselected the 50 stocks based on underlying industry groups. Then, they applied the Engle-Granger two step approach with a significance level of 5%. Instead of selecting all the pairs that pass the cointegration tests, they only select the top 5 pairs with the highest ADF t-test statistic for the underlying spread obtained in the second step of the Engle-Granger approach. Furthermore, Dunis et al. (2010) only use one pairs identification period, ranging from July 3, 2009 to September 9, 2009, and one trading period, ranging from September 10, 2009 to November 17, 2009. Like most papers, Dunis et al. (2010) do not model or forecast the underlying spread series and only implement a trading rule similar to the one Gatev et al. (2006) used. After accounting for transaction costs, the forecasting framework produces positive annualized excess returns of 1.92%, 7.83%, 10.33%, 14.08% and 14.08% respectively for the 5, 10, 20, 30 and 60 minutes interval data. While the results are appealing, the framework should be tested on a larger data set such that more than 5 pairs can be taken into account. Moreover, Dunis et al. (2010) only investigate the performance measures and thus do not take the risk characteristics into account.
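This type of pair selection can be illustrated as follows: the Engle-Granger test is applied to every candidate pair and the pairs whose residual ADF t-statistic provides the strongest rejection of a unit root (i.e. the most negative statistic) are retained. The 5% level and the cut-off of 5 pairs follow the description above; the function itself is only an illustrative sketch, not the procedure of Dunis et al. (2010) in every detail.

# A sketch of Engle-Granger based pair selection: keep the pairs whose residual
# ADF t-statistic is most negative among the pairs that pass the test at `alpha`.
from itertools import combinations
import pandas as pd
from statsmodels.tsa.stattools import coint

def top_engle_granger_pairs(prices: pd.DataFrame, n_pairs: int = 5, alpha: float = 0.05):
    results = []
    for a, b in combinations(prices.columns, 2):
        tstat, pvalue, _ = coint(prices[a], prices[b])   # EG two step: OLS plus ADF on residuals
        if pvalue < alpha:
            results.append((tstat, a, b))
    results.sort()                                        # most negative t-statistic first
    return [(a, b) for _, a, b in results[:n_pairs]]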

Caldeira and Moura (2013) identify pairs with the cointegration approach by using both the Engle-Granger two step method and the Johansen cointegration method. As data, Caldeira and Moura (2013) use daily observations of the 50 most liquid stocks of the Brazilian stock index IBovespa, ranging from January 2005 to October 2012. Furthermore, their identification and trading period lengths are respectively 1 year and 4 months. Because Caldeira and Moura (2013) do not preselect the Brazilian stocks based on their industry group, they test a total of 1225 unique pairs and eventually find 90 cointegrated pairs. From these pairs, they choose the 20 pairs with the highest in-sample Sharpe ratio for trading. Like most papers, Caldeira and Moura (2013) also do not forecast the underlying spread series and instead implement a trading rule similar to that of Gatev et al. (2006). Caldeira and Moura (2013) conclude that the developed forecasting framework shows annualized excess returns after transaction costs of 16.38% and a corresponding Sharpe ratio of 1.34.
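The in-sample Sharpe ratio ranking used by Caldeira and Moura (2013) can be sketched as follows, assuming a hypothetical DataFrame of daily in-sample strategy returns with one column per candidate pair; the use of 252 trading days per year and the top-20 cut-off are illustrative choices.

# A sketch of ranking candidate pairs by their in-sample annualized Sharpe ratio.
import numpy as np
import pandas as pd

def top_sharpe_pairs(pair_returns: pd.DataFrame, n_pairs: int = 20) -> list:
    sharpe = pair_returns.mean() / pair_returns.std() * np.sqrt(252)   # annualized Sharpe ratio
    return sharpe.sort_values(ascending=False).head(n_pairs).index.tolist()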

None of the papers discussed so far model or forecast the underlying spread series; instead, they only implement a simple trading rule. Examples of papers that do model the underlying spread series are Do, Faff, and Hamza (2006), Puspaningrum (2012) and Huck (2010). Do et al. (2006) try to model the underlying spread in state space form, Puspaningrum (2012) models the spread to find optimal threshold values for trading and Huck (2010) applies machine learning techniques to forecast the spread of pairs.

The above literature review is not exhaustive by any means. For a complete literature review on statistical arbitrage, the interested reader is referred to Krauss (2017).

5 Data

Before stating the methodology, the data used in this research is presented. As suggested by Timmermann and Granger (2004), to increase the chance of finding irregular market inefficiencies, it is best to apply the cointegration-based statistical arbitrage forecasting framework to different asset classes and different time frequencies. For this reason, this thesis uses two different asset classes, namely the equity and the currency asset class, and two different frequencies, namely daily and weekly observations, to develop and investigate the forecasting framework.
