Faculty of Economics and Business, Amsterdam School of Economics Bachelor thesis - BSc Econometrics & Operations Research

Intraday Momentum and Parameter Instability

Behaviour of the intraday momentum model and forecasting under parameter instability

26 June 2018

Abstract

This thesis investigates the behaviour of the intraday momentum model as in Gao, Han, Li and Zhou (2017) over time and studies the performance of models that account for parameter instability. Evidence shows that the coefficients of the intraday momentum model do not stay stable over time and are heavily influenced by a financial crisis. Using an estimation method that weights only the past observations shows substantial gains in forecasting accuracy over Recursive Least Squares.

Author: Gino Bransen

Student number: 10973893

Supervisor: MSc Hao Li

Statement of Originality

This document is written by Gino Bransen who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Contents

1 Introduction
2 Literature Review
2.1 Medium-term momentum
2.1.1 Existence of Medium-term momentum
2.1.2 Causes of Medium-term momentum
2.2 Intraday Momentum
2.2.1 Existence of Intraday momentum
2.2.2 Causes of Intraday momentum
2.3 Parameter Instability
2.4 Hypothesis
3 Methodology
3.1 Data
3.2 Methodology
3.2.1 Existence of intraday momentum
3.2.2 Evidence for parameter instability
3.2.3 Forecasting under parameter instability
4 Results
4.1 Existence of intraday momentum
4.2 Parameter Instability
4.3 Forecasting performance under parameter instability
4.3.1 OOS R2
4.3.2 Return performance
5 Conclusion
6 Appendix
6.1 Descriptive Statistics
6.2 Existence of Intraday momentum
6.3 Parameter Instability
6.4 Forecasting Performance

1 Introduction

In today’s world, more than 50 percent of all US equity trading is done by high frequency traders (Cheng, 2017). This practice, which uses computer algorithms to make trading decisions with a short holding period, has grown strongly in popularity since 2005 (Kaya, 2016). Surprisingly, the literature on intraday stock prediction has not grown with it. The vast literature on stock return prediction still mainly concerns long-term prediction, although intraday stock prediction is equally interesting.

While there are many studies on different predictors, Goyal and Welch (2007) argue that regressions of excess returns on predictor variables are not suitable for forecasting future returns. They state that those predictors perform badly both in and out of sample and argue that the historical average excess return forecasts future returns better than predictor variables do. Campbell and Thompson (2008) contradict this conclusion: they show that many predictive regressors beat the historical average return once restrictions are imposed. This contradiction motivates further research on predicting stock returns.

An important long-term predictor is studied by Jegadeesh and Titman (1993). They find that a stock that is a winner in the first six months, meaning it earns a large positive return, continues to be a winner for the next six months. This is an interesting finding, as it means that previous returns can predict future returns. The finding is supported by Asness, Moskowitz and Pedersen (2013), who find that momentum strategies deliver abnormal returns across different asset classes. However, once again these studies focus on long-term stock prediction.

One of the few studies on short-term prediction is that of Gao, Han, Li and Zhou (2017). Gao et al. (2017) study the existence and usability of intraday momentum patterns across different ETFs (Exchange Traded Funds). They conclude that the first half-hour return of the market predicts the last half-hour return, and hence that there exists an intraday pattern. The coefficients in their study are estimated by recursive least squares on a 1999-2013 sample and on a sub-sample covering the financial crisis of 2007-2009. They find that the parameters differ during the financial crisis, which suggests that the parameters are not constant over time.

There are multiple studies that support structural breaks in markets and suggest that market parameters do not remain constant over time. For example, Pesaran and Timmermann (2002) state that financial time series are likely to undergo large changes. In a later study, Paye and Timmermann (2006) conclude that the empirical evidence of predictability is not constant over time and is concentrated in certain periods. Also, Fama and French (1998) assume that parameters change over time, as they estimate the CAPM betas with a rolling sample.

The goal of this thesis is to study how the intraday momentum predictive regressors vary over time and whether there are estimation methods that outperform simple Recursive Least Squares in predicting the intraday momentum. It is therefore an extension of the study of Gao et al. (2017), as they use no estimation method other than Recursive Least Squares and only briefly study the behaviour of the coefficients. It also extends the scarce literature on short-term stock prediction, as no comparable study could be found. To answer the research question, multiple econometric models are used with intraday data of different ETFs. Furthermore, this thesis indicates the usefulness of its findings by showing the profitability of intraday momentum and its estimation methods.

The remainder of this thesis is organized as follows. In section 2 literature is discussed to construct an empirical framework. Section 3 describes the methods and data used to answer the research question. Section 4 provides the results and analyzes those results. Section 5 concludes.

2 Literature Review

The literature review is organized in four subsections. The first subsection discusses evidence and theory on the existence of momentum. The second subsection focuses solely on intraday momentum. It discusses literature that provides evidence of the existence of intraday momentum and also theories on its cause. The third subsection discusses the change of parameters over time. As there is only one paper that discusses this for intraday momentum parameters, the subsection does not focus only on intraday momentum parameters. The last subsection discusses hypotheses that can be formulated from the literature discussed in the first three subsections.

2.1 Medium-term momentum

2.1.1 Existence of Medium-term momentum

The existence and profitability of momentum in stocks was first studied by De Bondt and Thaler (1985). They find evidence that there exists a long-term momentum effect. Their results show that stocks that performed poorly over the past three to five years are likely to perform well over the next three to five years. This momentum phenomenon is further studied by Jegadeesh and Titman (1993), who use a shorter time frame than De Bondt and Thaler (1985). Jegadeesh and Titman (1993) find that there also exists a momentum effect over a shorter term. Their results show that stocks that performed well during the first six months are likely to be good performers in the following six months. In the rest of this thesis, momentum refers to momentum effects with a time frame approximately equal to that in Jegadeesh and Titman (1993).

In a more recent study, Asness, Moskowitz, and Pedersen (2013) find that there exist momentum patterns across different markets and asset classes. This implies that momentum cannot be attributed to only a certain time period and a certain market. As the evidence of existence of momentum is clear, it is interesting to discuss what causes momentum.

2.1.2 Causes of Medium-term momentum

According to Jegadeesh and Titman (1993), the momentum effect is caused by delayed price reactions to firm-specific information. This explanation is found to be partly true by Scowcroft and Sefton (2005). They find that in a small-cap universe (small-cap stocks), the majority of momentum profits can be attributed to firm-specific momentum. The finding of Scowcroft and Sefton (2005) supports the explanation of Jegadeesh and Titman (1993), as they also attribute the cause of momentum to firm-specific information.

Momentum in a large-cap universe (large-cap stocks) is caused by a different effect according to Scowcroft and Sefton (2005). They find that momentum in a value-weighted large-cap universe is mainly driven by industry momentum and not by firm-specific momentum. This explanation of momentum of large-cap stocks is partly contradicted by Berk, Green, and Naik (1999). They state that changes in a firm’s growth opportunities can generate momentum returns, and these growth opportunities are most likely to be correlated within industries. This suggests that in the end, firm-specific momentum causes industry momentum, and hence large-cap stock momentum is caused by firm-specific momentum. Given the literature discussed so far, what causes the overall momentum effect remains to be explained.

Several studies state that the overall momentum effect is caused by the behaviour of investors. For example, Barberis, Shleifer and Vishny (1998) state that investors exhibit conservatism. They find that investors are slow to update their prior beliefs in the event of good or bad news. The slow updating of prior beliefs causes a delayed price effect: prices do not adjust directly but do so later on. This supports the explanation of Jegadeesh and Titman (1993).

Another example is the study of Daniel, Hirshleifer, and Subrahmanyam (1998), who conclude that a psychological bias can cause momentum and long-term price overreaction. The underlying idea is that this effect is caused by a self-attribution bias, which means that investors attribute positive outcomes to their skills and negative outcomes to bad luck. Investors who have this bias are more likely to buy more of a previously bought stock when they receive good news but are not likely to sell when they receive bad news. This causes prices to rise too far in the short term, after which they eventually correct themselves (Daniel, Hirshleifer, & Subrahmanyam, 1998).

The literature discussed in this subsection explains what might cause the momentum effect. The literature suggests that the momentum effect for both large-cap and small-cap stocks is caused by firm-specific momentum. Further, this momentum effect appears to be driven by the behaviour of investors.

2.2 Intraday Momentum

2.2.1 Existence of Intraday momentum

Intraday trading patterns and return patterns were studied as early as 1988, by Jain and Joh. They find that on average, the largest returns occur during the first and last trading hours of the day. Jain and Joh (1988) also conclude that there exists an intraday pattern of trading volume. Their data shows that volume is high in the first half hour, then decreases significantly, and rises again in the last trading hours of the day.

A more recent study also concludes that there exists an intraday pattern, as returns continue at half hour intervals (Heston, Korajczyk, & Sadka, 2010). Further, Heston et al. (2010) find that the return continuation is more pronounced for the first half hour and the last half hour of the trading day. Their results show that if the first half hour of the trading day is positive, the last half hour of the trading day is also likely to be positive, but during the day there is likely to be a reversal. Additionally, they find patterns in trading volume and volatility, but those patterns do not explain the return pattern. Moreover, Gao, Han, Li, and Zhou (2017) find that the first half hour return of the trading day can predict the last half hour return of the trading day.

Taking into account the literature in this subsection, multiple studies show evidence of an intraday momentum effect. The evidence also suggests that intraday patterns are persistent over time, as they were first documented in 1988 and are still present today.

2.2.2 Causes of Intraday momentum

There are various explanations of the forces that drive intraday momentum. First, investor sentiment might be a force that drives the intraday momentum. It is well known that when investors are highly positive about a stock, that stock tends to rise because of the investor sentiment. Renault (2017) conducted research on online investor sentiment and intraday returns of the S&P500 Index ETF. He finds that the change in investor sentiment during the first half hour of the market can predict the last half hour return after controlling for lagged market returns. This finding also holds for multiple ETFs. Najand, Shen, and Sun (2016) theorize why investor sentiment can have predictive power and why it is concentrated at the end of the trading day. First, they state that investors are risk averse and might want to wait a few hours before taking a position in the market. Second, arbitrageurs are likely to trade against sentiment traders during the first trading hour rather than the last few hours, due to the uncertainty of overnight news. Those theories are supported by the results of Renault (2017).

Another possible force driving intraday momentum is the trading behaviour of informed traders (Gao et al., 2017). Admati and Pfleiderer (1988) show that informed traders are likely to time their trading during high-volume periods. Those high-volume trading periods are often during the first and last half hour of the trading day, as shown by Jain and Joh (1988). Also, Hora (2006) shows that an optimal trading strategy is to trade mostly during the beginning and the end of the trading day. Hence, if there is overnight news, this might cause rapid trading at the beginning of the day, pushing prices up. Then, following the optimal trading strategy of Hora (2006), trading is continued during the last half hour of the day. In theory this process yields the intraday momentum. The results of Gao et al. (2017) also support this theory, as the intraday momentum effect is stronger when the first half hour is positive than when the first half hour is negative.

Infrequent rebalancing of portfolios can also be a cause of intraday momentum according to Gao et al. (2017). When a large liquidity shock arises, traders end up with excessive positions compared to their original weights, and rebalancing at the open and close of the day might contribute to the intraday momentum. This infrequent rebalancing mechanism is studied by Bogousslavsky (2016). His model of rebalancing at the beginning and end of the day can explain the results of Gao et al. (2017). The rebalancing explanation is also in line with, for example, Hora (2006) and Admati and Pfleiderer (1988), as they state that the optimal trading period also includes the last hours of the trading day.

Taking a broad view of the literature in this subsection suggests that intraday momentum is likely caused by the behaviour of traders or by investor sentiment. Most traders are likely to trade in the first and last half hours of the trading day, and they are also likely to rebalance portfolios at those times, as Hora (2006) suggests that those are the optimal trading hours.

2.3 Parameter Instability

In the study of Gao et al. (2017), the robustness of the intraday momentum parameters is reported. Gao et al. (2017) split the test sample into smaller samples; for example, they distinguish between the two phases of the business cycle, expansion and recession. The parameters are estimated for both phases. The findings suggest that intraday momentum has a more significant impact during recessions than during expansions (Gao et al., 2017). They report that the predictive regressors during a recession are twice as large as during an expansion, and their results also show that the predictive power changes. Likewise, returns from the intraday momentum strategy during a recession are six times larger than during an expansion (respectively 16.79% and 2.35%). Further, Gao et al. (2017) also take the financial crisis of 2007-2009 into consideration. Once again, their results show that the intraday momentum parameters are different during a recession, as the parameters in a regression on a sample that consists only of the financial crisis are almost twice as high as in a regression on the full sample. Combining all the results of their study, they suggest that the parameters of the model are time dependent and change over time.

The parameters of the intraday momentum model are not the only parameters that appear to differ over time. Pesaran and Timmermann (1995) find that the predictive power of multiple economic factors tends to change over time. Their results indicate that this change partly depends on the business cycles, but also on the magnitude of economic shocks. Further, they find that the predictive power also tends to vary with the volatility of the returns. In a later study, Pesaran and Timmermann (2002) state that financial time series are likely to undergo large changes. Their results suggest that parameters change over time and that using a stable prediction model could lead to asset misallocation. Those sudden market changes and changes of parameters are often caused by large institutional changes, regime switches or market breakdowns like the financial crisis (Pesaran & Timmermann, 2002).

Various models also seemed to have significant predictive power but, after testing, appear to be spurious or unstable (Goyal & Welch, 2007). Goyal and Welch (2007) test the predictive power and stability of various models. Their results show that most of their models are only significant because of their performance during unusual years, for example the oil crisis. As of 2005, most of the models they investigate are insignificant and mostly underperform the prevailing mean.

Combining all literature discussed in this subsection, there is ample evidence that the parameters of models are unstable and that the predictive power varies over time. Those changes are mostly caused by breaks, as Pesaran and Timmermann (1995) and Gao et al. (2017) show that performance differs across business cycles. Goyal and Welch (2007) even show that the significance of models can be caused by unusual years. However, that does not seem to be the case for the intraday momentum model of Gao et al. (2017), as they verify that the model is still significant when excluding, for example, the financial crisis.

2.4 Hypothesis

The research in this thesis will contain three parts. The first part will focus on the existence of the intraday momentum effect. The second part of this thesis will research the stability of the intraday parameters. The last part focuses on forecasting performance.

In section 2.2 the existence and theory of intraday momentum is discussed. According to the discussed literature, when the first half hour return is positive, the last half hour is also likely to be positive. Further, this first half hour return can predict the last half hour return. Those results bring up the following hypothesis that can be attributed to the first part of this thesis:

Hypothesis 1: The first half hour returns are significant predictors of last half hour returns.

Section 2.3 focuses on the parameter instability of different models. The literature shows that the parameters of the intraday momentum model are different during the financial crisis and across business cycles. It is reasonable to assume that correcting for those changes in parameters yields a better model, as it should fit the data better. The theory in section 2.3 yields the following hypothesis that can be attributed to part two of this thesis:

Hypothesis 2: The intraday momentum model parameters are unstable over time.

The study of Gao et al. (2017) also uses a kind of time varying parameters as they estimate out-of-sample forecasting power by the use of recursive regressions. It is of interest if other models that account for parameter instability outperform these simple recursive regressions in forecasting performance and profitability. Combining this with the theory of parameter instability brings up the following two hypotheses:

Hypothesis 3: The forecasting accuracy of Recursive Least Squares on the intraday momentum model is outperformed by other methods that account for parameter instability.

Hypothesis 4: The performance of a trading strategy based on Recursive Least Squares on the intraday momentum model is outperformed by other methods that account for parameter instability.

The focus of this thesis lies on hypotheses two, three and four. The method used to test these hypotheses is discussed in section 3.

3 Methodology

In this section, the data is discussed first, followed by the methodology.

3.1 Data

The intraday data used in this thesis is part of a larger data set available at Kibot [1]. This data set contains half hour prices and volume of the 50 most liquid ETFs, with a sample size that varies per ETF. It is chosen because most ETFs have half hour data going back more than 15 years. This benefits this thesis, as a long time frame is necessary to study parameter instability over time. The data is adjusted for splits and dividends and is checked for intraday spikes by the provider.

In this thesis I will mainly focus on the following ETFs: SPDR S&P500 (SPY), SPDR S&P Midcap 400 (MDY), SPDR Dow Jones Industrial Average (DIA) and Powershares QQQ (QQQ). These ETFs each capture a specific part of the market and seek to produce returns equal to a specific index [2]. They all have data going back more than 18 years; ETF specific sample sizes and exchanges can be found in Table 6.1.

The dataset contains half hour prices and volume during trading hours and after-market hours. The trading hours for the NYSE and NASDAQ are 9:30AM - 4:00PM Eastern Time, and all ETFs chosen are traded on one of those exchanges. To study the intraday momentum effect, I define $r_{j,t}$, the return made in trading half hour $j$ at date $t$, as:

$$ r_{j,t} = \frac{p_{j,t} - p_{j-1,t}}{p_{j-1,t}}, \qquad j = 1, \ldots, 13 \tag{1} $$

with $p_{j,t}$ the closing price of trading half hour $j$ at date $t$ [3]. Hence, the first half hour return at date $t$ is denoted as $r_{1,t}$ and the last half hour return as $r_{13,t}$. For the first half hour, the closing price of the previous day is used as $p_{0,t}$. Therefore, the first half hour return also captures the overnight return.

The half hour returns for the ETFs are calculated from the prices available in the dataset. Trading days that are not complete or do not have normal trading hours [4] are deleted from the sample, but as the time frame is large, this does not result in too few observations. The resulting returns for all ETFs are shown in Figure 6.1.
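As an illustration of equation (1), the sketch below computes the 13 half hour returns of a single trading day with NumPy. The function name `half_hour_returns` and the example prices are hypothetical and are not taken from the Kibot data. The previous day's close plays the role of $p_{0,t}$, so the first return includes the overnight move.

```python
import numpy as np

def half_hour_returns(prev_close, closes):
    """Simple returns r_{j,t} for j = 1..13 (hypothetical helper).

    prev_close: closing price of the previous trading day, used as p_{0,t}.
    closes: the 13 half hour closing prices p_{1,t}, ..., p_{13,t}.
    """
    prices = np.concatenate(([float(prev_close)], np.asarray(closes, dtype=float)))
    # Equation (1): (p_j - p_{j-1}) / p_{j-1} for consecutive half hours.
    return prices[1:] / prices[:-1] - 1.0

# Made-up prices: a 2% move by 10:00, flat intraday, 1% in the last half hour.
day = [102.0] * 12 + [103.02]
r = half_hour_returns(100.0, day)
first_half_hour, last_half_hour = r[0], r[-1]
```

A day with fewer than 13 closing prices would produce fewer returns, which is one simple way to detect the incomplete trading days that are dropped from the sample.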

[1] Website: www.Kibot.com

[2] SPY tracks the S&P 500 Index, MDY the S&P MidCap 400 Index, DIA the Dow Jones Industrial Average Index and QQQ the Nasdaq-100 Index.

[3] Using log returns makes little difference in the results; the forecasting accuracy using log returns is equal or slightly lower. The differences are not noteworthy.

Time series are prone to non-stationarity, which could lead to spurious regressions and wrong results. Figure 6.1 shows no evidence of non-stationarity, and the Augmented Dickey-Fuller test is performed to support this; full test specifications and statistics are given in Appendix 6.1. The test statistics reject the null hypothesis of a unit root. This indicates that the time series are stationary, which is necessary for the methods used in the following section.

3.2 Methodology

The research in this thesis consists of three parts. First, I test whether there is evidence of an intraday momentum effect for the four ETFs. This part corresponds with hypothesis one, and the methodology is discussed in section 3.2.1. The second part studies how the parameters of the intraday momentum model behave over time; this corresponds with hypothesis two, and the method used is discussed in section 3.2.2. The last part studies estimation methods that account for parameter instability and how they perform as forecasters or as a trading tool. The corresponding methods are addressed in section 3.2.3 and correspond with hypotheses three and four.

3.2.1 Existence of intraday momentum

The main model of Gao et al. (2017), in which the first half hour return predicts the last half hour return, is used to study the existence of intraday momentum. Using the same notation, this leads to the following model:

$$ r_{13,t} = \alpha + \beta r_{1,t} + \epsilon_t, \qquad t = 1, \ldots, T \tag{2} $$

where $r_{13,t}$ and $r_{1,t}$ represent respectively the last half hour return and the first half hour return at date $t$ as defined in section 3.1 [5]. The most recent observation is denoted by $T$. The coefficients are estimated by Ordinary Least Squares for all four ETFs.

It is known that financial time series often contain volatility clustering (Heij, De Boer, Franses, Kloek, & Van Dijk, 2004, p. 621) and hence are prone to heteroskedasticity and autocorrelation. As a result, the standard errors from Ordinary Least Squares can be too low. Therefore, I have conducted the Engle ARCH test to test for heteroskedasticity and autocorrelation (Engle, 1982). The Newey-West (HAC) standard errors are used to compute the t-statistics if there is evidence of heteroskedasticity and autocorrelation.
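To make the estimation step concrete, the following sketch fits equation (2) by OLS and computes Newey-West standard errors with Bartlett kernel weights in plain NumPy. It is a minimal illustration under stated assumptions: the function name `newey_west_se`, the lag length of 5 and the simulated returns are all hypothetical, no small-sample correction is applied, and the thesis does not state which lag length or software routine was used.

```python
import numpy as np

def newey_west_se(X, y, lags):
    """OLS coefficients with Newey-West (HAC) standard errors (sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.linalg.solve(X.T @ X, X.T @ y)   # OLS estimate
    u = y - X @ beta                           # residuals
    Xu = X * u[:, None]                        # rows are x_t * u_t
    S = Xu.T @ Xu                              # lag-0 (White) term
    for l in range(1, lags + 1):
        w = 1.0 - l / (lags + 1.0)             # Bartlett kernel weight
        G = Xu[l:].T @ Xu[:-l]                 # lag-l autocovariance term
        S += w * (G + G.T)
    XtX_inv = np.linalg.inv(X.T @ X)
    cov = XtX_inv @ S @ XtX_inv                # sandwich covariance
    return beta, np.sqrt(np.diag(cov))

# Simulated data (assumption): slope 0.05, small intercept, Gaussian noise.
rng = np.random.default_rng(0)
r1 = rng.normal(0.0, 0.01, size=500)
r13 = 0.0001 + 0.05 * r1 + rng.normal(0.0, 0.002, size=500)
X = np.column_stack([np.ones_like(r1), r1])    # x_t = (1, r_{1,t})
beta, se = newey_west_se(X, r13, lags=5)
t_stats = beta / se
```

With `lags=0` the loop is skipped and the result reduces to White's heteroskedasticity-robust standard errors, which is a convenient sanity check.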

[5] I have also run regressions with the first half trading hour return and the overnight return split; these regression results are shown in Table 6.4. These results indicate that when excluding the overnight return, the first half hour is in no case a significant predictor of the last half hour return.

The estimated coefficients and the corresponding t-statistics (and significance) indicate whether the first half hour returns are significant predictors for the last half hour returns. This is used to validate or reject hypothesis one. The results are discussed in section 4.1.

3.2.2 Evidence for parameter instability

Parameter instability occurs when the parameters of a model fluctuate over time and hence do not stay constant. This can show up when the coefficients are estimated recursively, as is done for out-of-sample forecasting.

Recursive estimation means estimating the coefficients at each period in time using all observations available up to that period. As a result, the estimated coefficients can fluctuate over time and hence become time dependent. In matrix notation, this gives the following model:

$$ y_t = x_t \beta_t + \epsilon_t \tag{3} $$

In this notation the observations at time $t$ are captured in the $1 \times 2$ vector $x_t$. The vector $\beta_t$ contains the coefficients at time $t$ and is a $2 \times 1$ vector. This matrix notation is used in the rest of this thesis to ease the mathematical expressions.

The method that recursively estimates the coefficients is defined as Recursive Least Squares (RLS) and has the following estimator for the coefficients:

$$ \hat{\beta}^{RLS}_t = (X_t' X_t)^{-1} X_t' y_t \tag{4} $$

with

$$ y_T = (y_1, y_2, \ldots, y_T)', \qquad X_T = (x_1, x_2, \ldots, x_T)' \tag{5} $$

Recursively estimating these coefficients for all periods in the sample yields an ever-increasing sample size.

The RLS method indicates how the intraday momentum parameters behave over time. This is shown by plotting the estimated coefficients and the corresponding t-statistics over time.
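The expanding-window estimator in equation (4) does not have to be refit from scratch at every date. The sketch below uses the standard rank-one (Sherman-Morrison) update, which is algebraically equivalent to re-running OLS on the growing sample; it is not necessarily how the thesis computes the estimates. The function name `rls_path`, the starting window and the simulated data are assumptions made for illustration.

```python
import numpy as np

def rls_path(X, y, start):
    """Coefficient path of Recursive Least Squares (illustrative sketch).

    Row k of the result is the OLS estimate on the first start + k observations.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    P = np.linalg.inv(X[:start].T @ X[:start])      # (X'X)^{-1} on the initial window
    beta = P @ X[:start].T @ y[:start]
    path = [beta.copy()]
    for t in range(start, len(y)):
        x = X[t]
        Px = P @ x
        P = P - np.outer(Px, Px) / (1.0 + x @ Px)   # Sherman-Morrison update of (X'X)^{-1}
        beta = beta + P @ x * (y[t] - x @ beta)     # correct by the one-step prediction error
        path.append(beta.copy())
    return np.array(path)

# Simulated returns (assumption), with x_t = (1, r_{1,t}).
rng = np.random.default_rng(1)
r1 = rng.normal(0.0, 0.01, size=400)
r13 = 0.0001 + 0.05 * r1 + rng.normal(0.0, 0.002, size=400)
X = np.column_stack([np.ones_like(r1), r1])
path = rls_path(X, r13, start=200)
slope_over_time = path[:, 1]   # the series one would plot against time
```

Plotting `slope_over_time` against the dates gives the kind of coefficient plot described above.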

To statistically test whether the coefficients are stable, two tests for structural breaks are conducted that make no assumptions about the break date: the Cusum squares test as described in Brown, Durbin and Evans (1975) and the Quandt-Andrews breakpoint test (Andrews, 1993). The latter is used because the Quandt-Andrews likelihood statistic does not perform poorly when autocorrelation and heteroskedasticity are present.

The results of these tests and the plots of the estimated coefficients are discussed in section 4.2 and will be used to reject or not-reject hypothesis two.

3.2.3 Forecasting under parameter instability

To forecast the last half hour return at time $T + 1$, only data up to $T$ is used; this is done for every time period $t$ in the sample. All estimations are done recursively, resulting in a new coefficient vector for every $t$. These coefficients, together with the first half hour return of the following period ($r_{1,T+1}$), are used to forecast the last half hour return of the following period ($r_{13,T+1}$).

The forecasting performance is measured for three different methods, all of which are estimated recursively. The first method, which is also the base method, is Recursive Least Squares (RLS), already discussed in section 3.2.2. The estimations start once 200 observations are available.
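The recursive forecasting scheme can be sketched as a loop that refits on all data up to the previous day and then applies the coefficients to the next day's first half hour return. This is a minimal illustration for the RLS base method only; the function name `recursive_forecasts` and the simulated data are hypothetical, and the 200-observation start follows the text.

```python
import numpy as np

def recursive_forecasts(r1, r13, start=200):
    """One-step-ahead forecasts of r13 from expanding-window OLS (sketch).

    The forecast for day T uses only days 0..T-1 for estimation, plus the
    first half hour return of day T itself, which is known before the close.
    """
    r1 = np.asarray(r1, dtype=float)
    r13 = np.asarray(r13, dtype=float)
    X = np.column_stack([np.ones_like(r1), r1])
    preds = np.full(len(r13), np.nan)            # no forecast before `start`
    for T in range(start, len(r13)):
        beta, *_ = np.linalg.lstsq(X[:T], r13[:T], rcond=None)
        preds[T] = X[T] @ beta
    return preds

# Simulated returns (assumption).
rng = np.random.default_rng(2)
r1 = rng.normal(0.0, 0.01, size=300)
r13 = 0.0001 + 0.05 * r1 + rng.normal(0.0, 0.002, size=300)
preds = recursive_forecasts(r1, r13, start=200)
```

Refitting with `lstsq` at every step is the simplest form; for long samples a recursive update of the estimate would give the same forecasts at lower cost.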

The remaining two methods are discussed next, followed by the method used to measure performance.

Recursive Discounted Least Squares

The recursive discounted least squares (RDLS) method gives less weight to older observations. This yields a model whose learning process is biased towards more recent observations; hence it is assumed that older observations are less relevant for the coefficient estimation than more recent ones.

The coefficients are estimated by Weighted Least Squares, which results in the following estimator:

$$ \hat{\beta}^{WLS}_t = (X_t' W_t X_t)^{-1} X_t' W_t y_t \tag{6} $$

where $W_t$ is a weighting matrix whose diagonal elements are the weights attributed to the observations, as determined by a discount function. I have weighted this model in two different ways. In the standard weighting, denoted by RDLS, the intercept is also weighted. In the second, denoted by RDLS*, only the returns are weighted.

The discount function, which assigns a weight to an observation at $t$, is given by:

$$ w_T(t) = \left( \frac{1}{1 + \alpha} \right)^{T - t}, \qquad t = 1, 2, \ldots, T \tag{7} $$

sample window by recursively estimating. The estimations start after an observation window of at least 200 is reached. The Matlab code used to perform Discounted Least Squares is shown in Appendix 6.5.
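Equations (6) and (7) can be sketched in a few lines of NumPy. This is not the Matlab code of the appendix; the function names `discount_weights` and `discounted_ls`, the value $\alpha = 0.01$ and the simulated data are assumptions chosen only for illustration, and only the standard RDLS weighting (all regressors weighted) is shown, not RDLS*.

```python
import numpy as np

def discount_weights(T, alpha):
    """Weights w_T(t) = (1 / (1 + alpha))^(T - t) for t = 1, ..., T."""
    t = np.arange(1, T + 1)
    return (1.0 / (1.0 + alpha)) ** (T - t)

def discounted_ls(X, y, alpha):
    """Weighted least squares with a diagonal discount matrix W (sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = discount_weights(len(y), alpha)
    Xw = X * w[:, None]                           # W X without forming the T x T matrix
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)    # (X'WX)^{-1} X'Wy

# Simulated returns (assumption); alpha = 0.01 is a hypothetical discount rate.
rng = np.random.default_rng(3)
r1 = rng.normal(0.0, 0.01, size=300)
r13 = 0.0001 + 0.05 * r1 + rng.normal(0.0, 0.002, size=300)
X = np.column_stack([np.ones_like(r1), r1])
beta_rdls = discounted_ls(X, r13, alpha=0.01)
```

The most recent observation always receives weight 1, and setting `alpha=0` gives every observation weight 1, so the estimator reduces to ordinary least squares.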

Reverse Ordered Cusum

Conducting a break test often requires an estimate of the location of the break point or of the number of breaks in a sample (for example with the Chow test or the Bai-Perron test). The Cusum squares test by Brown et al. (1975) does not require any assumptions about the location of the break. As shown in Pesaran and Timmermann (2002), using the Cusum squares test in a forecasting method might benefit the performance.

The Reverse Ordered Cusum method (ROC), as in Pesaran and Timmermann (2002), reverses the order of the observations used in the Cusum squares test, which is normally run forward in time. When a break is detected, the estimation sample is restricted to the observations after the most recent break. The coefficients are then estimated with Ordinary Least Squares on this new sample. To explain this method, I use the same notation as Pesaran and Timmermann (2002).

First, the observations are reversed from time \(\tau\) until \(T\), which yields the following observation matrices:

\[ \tilde{y}_{T,\tau} = (y_T, y_{T-1}, \ldots, y_\tau)', \qquad \tilde{X}_{T,\tau} = (x_T, x_{T-1}, \ldots, x_\tau)' \tag{8} \]

with

\[ \tau = \tilde{T}, \tilde{T}-1, \ldots, 2, 1 \tag{9} \]

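and the most recent breakpoint denoted by \(\tilde{T}\). Then the least squares estimator is defined; this is simply the OLS estimator, conditioned on \(\tilde{X}_{T,\tau}\) and \(\tilde{y}_{T,\tau}\):

\[ \hat{\beta}_\tau = (\tilde{X}_{T,\tau}' \tilde{X}_{T,\tau})^{-1} \tilde{X}_{T,\tau}' \tilde{y}_{T,\tau} \tag{10} \]

The one-step-ahead residuals are computed as:

\[ \hat{v}_t = \frac{y_\tau - \hat{\beta}_{\tau-1}' x_\tau}{\sqrt{1 + x_\tau' (\tilde{X}_{T,\tau}' \tilde{X}_{T,\tau})^{-1} x_\tau}} \tag{11} \]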

and the Cusum squares test statistic is given by:

\[ WW_{\tau,T} = \frac{\sum_{j=p+1}^{\tau} \hat{v}_j^2}{\sum_{j=p+1}^{T} \hat{v}_j^2} \tag{12} \]


with \(p\) the number of regressors in the model. To prevent false break point signals, the significance level is set at the strictest level, 1%.

By testing for the most recent break and conditioning the sample on it, the test may find a break that leaves only a small sample. To solve this, I include at least 200 observations, which is also the minimum for the RLS and RDLS methods. As a result, pre-break data may be included; when the break is small, this might even benefit the forecast (Pesaran & Timmermann, 2004).

Further, I refer to Appendix 6.5 which includes the Matlab code and function that is used to forecast with the ROC method.
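The reversed recursive residuals and the Cusum squares path can be sketched in Python as below. This is an illustrative sketch under my own assumptions, not the Matlab function of Appendix 6.5: the function name is hypothetical, each residual is standardised with the observations used to estimate the coefficients, and the comparison with the critical bounds of Brown et al. (1975) and the 200-observation minimum are left out.

```python
import numpy as np

def reversed_cusum_squares(X, y):
    """Cusum-of-squares path computed on the sample in reverse time order.

    X is (T, p) with rows in chronological order, y has length T.
    Returns an array s of length T - p, where s[k] is the share of the total
    sum of squared recursive residuals accounted for by the first k + 1 of them.
    """
    Xr, yr = X[::-1], y[::-1]                # most recent observation first
    T, p = Xr.shape
    v = np.empty(T - p)
    for i in range(p, T):
        Xi, yi = Xr[:i], yr[:i]              # the i most recent observations
        XtX_inv = np.linalg.inv(Xi.T @ Xi)
        beta = XtX_inv @ Xi.T @ yi           # OLS on the reversed sub-sample
        x_new = Xr[i]                        # the next older observation
        scale = np.sqrt(1.0 + x_new @ XtX_inv @ x_new)
        v[i - p] = (yr[i] - x_new @ beta) / scale
    return np.cumsum(v ** 2) / np.sum(v ** 2)
```

Under stable coefficients the returned path rises roughly linearly from 0 to 1; a break shows up as a departure from that line, which a full implementation would compare with the critical bounds.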

Performance Measurements

To measure the out-of-sample forecasting performance, the OOS R2 of Campbell and Thompson (2008) is used. This measure shows whether the forecasts of the different methods outperform the historical average. The OOS R2 is defined as follows:

\[ R^2_{OS} = 1 - \frac{\sum_{t=1}^{T} (r_t - \hat{r}_t)^2}{\sum_{t=1}^{T} (r_t - \bar{r}_t)^2} \tag{13} \]

with \(r_t\) the realized last half hour return, \(\hat{r}_t\) the forecasted last half hour return using data until \(t-1\), and \(\bar{r}_t\) the mean of all realized last half hour returns until \(t-1\).
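Equation (13) can be computed along the following lines. This is a hedged sketch: the function name is hypothetical, forecasts that are not yet available are assumed to be stored as NaN, and the first observation is dropped because no historical mean exists for it.

```python
import numpy as np

def oos_r2(realized, forecast):
    """Out-of-sample R2 of the forecasts against the expanding historical mean."""
    realized = np.asarray(realized, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    # Mean of all realized returns up to t - 1 (expanding mean, shifted by one).
    cum_mean = np.cumsum(realized) / np.arange(1, len(realized) + 1)
    hist_mean = np.concatenate(([np.nan], cum_mean[:-1]))
    ok = ~np.isnan(forecast) & ~np.isnan(hist_mean)
    sse_model = np.sum((realized[ok] - forecast[ok]) ** 2)
    sse_mean = np.sum((realized[ok] - hist_mean[ok]) ** 2)
    return 1.0 - sse_model / sse_mean
```

A forecast that coincides with the historical mean gives a value of zero, so a positive value indicates an improvement over that benchmark.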

The OOS R2 is used to compare the forecasting performance of the different estimation methods for all four ETFs and is used to study hypothesis three. This measure also makes the results comparable with those in Gao et al. (2017). The OOS R2 results are discussed in section 4.3.1.

The performance of the different methods is also measured by constructing a simple trading strategy that uses their forecasts. When the forecast at time \(t\) passes a certain threshold level \(\theta\), the strategy takes either a long or a short position. Mathematically:

\[ \eta(\hat{r}_{13,t}, \theta) = \begin{cases} r_{13,t}, & \text{if } \hat{r}_{13,t} > \theta \\ -r_{13,t}, & \text{if } \hat{r}_{13,t} < -\theta \end{cases} \tag{14} \]

The performance of this trading strategy⁶ is measured over the full sample for all methods and all four ETFs. The performance of each method is evaluated by calculating the compounded total return, the compounded average yearly return and the yearly standard deviation.⁷ Transaction fees can have a great impact on the performance of a method when trading on a daily basis, so the average yearly return after transaction costs is also calculated. For the transaction costs, the fees of the Dutch broker De Giro⁸ are used, which are $0.006 per ETF. This fee is assumed to be constant⁹ over the full sample, and the base fee of $0.50 is ignored. To compare the performance of the methods, the Sharpe ratio is introduced, which is given by the following formula:

\[ S = \frac{r_p - r_f}{\sigma_p} \tag{15} \]

with \(r_p\) the yearly average return of the trading strategy and \(\sigma_p\) the yearly standard deviation. The risk-free rate \(r_f\) is taken from the Kenneth R. French Data Library¹⁰ ¹¹ and is derived from the one-month Treasury bill rate. Two Sharpe ratios are reported: the standard Sharpe ratio, in which the risk-free rate and transaction fees are ignored, and the corrected Sharpe ratio¹², which adjusts for the risk-free rate and transaction fees.

⁶Not trading is also considered as a choice, hence a return of 0% is used when there is no trade at a

Three threshold levels are used when evaluating the performance of the methods: a threshold equal to zero, a threshold of 0.10%, and a threshold equal to the sum of the risk-free rate and the expected transaction fees.
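The trading rule of equation (14) and the Sharpe ratio of equation (15) can be sketched as follows. The function names are hypothetical, returns and thresholds are assumed to be expressed as decimals (0.10% = 0.001), and the annualisation with 252 trading days per year is my assumption rather than a detail stated in the text.

```python
import numpy as np

def strategy_returns(realized, forecast, theta=0.0):
    """Long if the forecast exceeds theta, short if below -theta, else no trade (0)."""
    realized = np.asarray(realized, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    position = np.where(forecast > theta, 1.0,
                        np.where(forecast < -theta, -1.0, 0.0))
    return position * realized

def compounded_yearly_return(returns, periods_per_year=252):
    """Compounded average yearly return implied by a series of daily returns."""
    returns = np.asarray(returns, dtype=float)
    total = np.prod(1.0 + returns)           # compounded total growth factor
    return total ** (periods_per_year / len(returns)) - 1.0

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """(r_p - r_f) / sigma_p with yearly return and yearly standard deviation."""
    returns = np.asarray(returns, dtype=float)
    r_p = compounded_yearly_return(returns, periods_per_year)
    sigma_p = np.std(returns, ddof=1) * np.sqrt(periods_per_year)
    return (r_p - risk_free) / sigma_p
```

Setting `risk_free=0.0` corresponds to the standard Sharpe ratio; the corrected version would pass a yearly risk-free rate and use returns net of transaction fees.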

⁷It is worth noting that Gao et al. (2017) do not use compounded returns. Using only arithmetic returns loses information on how the strategy performs over time, so I chose to use only compounded results.

⁸PDF file containing transaction cost can be found at: https://www.degiro.nl/data/pdf/Tarievenoverzicht.pdf

⁹It is worth noting that this assumption is not realistic, as it is not likely that the transaction fees were this low at the start of the samples.

¹⁰Website: http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html

¹¹The database does not contain data until the end of each sample and misses the last month. To solve this, the last available risk-free rate is used for the missing dates.

¹²As the returns are made over a long period of time, the average yearly risk-free rate is from the start of


4 Results

The results are discussed in three subsections. The first subsection focuses on the existence of intraday momentum. The second subsection is centered around parameter instability and the last subsection addresses the forecasting performance of the different estimation methods.

4.1 Existence of intraday momentum

To study the existence of intraday momentum, the model in equation (2) is estimated for all ETFs.

Financial time series are known to be prone to autocorrelation and heteroskedasticity, so the Engle test for conditional heteroskedasticity is performed. The test statistics reject the null hypotheses, which indicates that there is sufficient evidence of conditional heteroskedasticity for all ETFs; the full test specification is shown in Appendix 6.2. This implies that the standard t-statistics are unreliable and possibly too high, so Newey-West standard errors are used to compute the t-statistics. The regression results for all ETFs are given in Table 4.1.
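For readers unfamiliar with the correction, t-statistics based on Newey-West standard errors can be computed as in the sketch below. It is a generic illustration with a Bartlett kernel: the function name is hypothetical and the lag length `lags` is an assumption, since the lag choice is not stated in the text.

```python
import numpy as np

def newey_west_tstats(X, y, lags):
    """OLS coefficients and t-statistics with Newey-West (Bartlett) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    Z = X * resid[:, None]                   # score contributions x_t * e_t
    S = Z.T @ Z                              # lag-0 term (White estimator)
    for l in range(1, lags + 1):
        w = 1.0 - l / (lags + 1.0)           # Bartlett kernel weight
        G = Z[l:].T @ Z[:-l]                 # lag-l cross-product of the scores
        S += w * (G + G.T)
    cov = XtX_inv @ S @ XtX_inv              # sandwich covariance of beta
    return beta, beta / np.sqrt(np.diag(cov))
```

With `lags=0` the result reduces to heteroskedasticity-robust (White) t-statistics, which is a convenient check on the implementation.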

Table 4.1: Regression results all ETFs

This table reports the regression results for all four ETFs of the model specified in equation (2), using all observations available for each ETF. The coefficients are estimated using Ordinary Least Squares. The t-statistics with Newey-West standard errors are reported in parentheses under the estimated coefficients. Significance at the 1%, 5% and 10% level is indicated by ***, ** and *. The regression coefficients are scaled by a factor 100. The bottom part of the table reports the sample window, the number of observations and the standard R2.

                             SPY           DIA           QQQ           MDY
Intercept α                  -0.004        -0.006        -0.008        0.000
                             (-0.85)       (-1.31)       (-1.23)       (-0.19)
First half-hour return β     6.41***       7.10***       6.00***       5.99***
                             (3.44)        (2.74)        (4.65)        (3.81)
R2                           1.6%          1.9%          1.5%          1.8%
Sample Window                05/98-05/18   01/98-05/18   03/99-05/18   01/98-05/18
Observations                 5088          5054          4791          5045

The regression results in Table 4.1 show that all estimated coefficients for the first half hour return are statistically significant at the 1% level and hence differ from zero. Therefore, the first half hour return is found to be a significant predictor for all studied ETFs for the chosen sample window. All intercepts in the regression are insignificant even at the 10% level; the intercept is nevertheless retained, as it is widely known that including an intercept leads to a better fit of the model.

The estimated coefficients for SPY are slightly lower and less significant than those in Gao et al. (2017). The estimated β is 0.53 lower and the corresponding t-statistic is 0.64 lower; the intercept is 1.6 higher and its t-statistic 0.31 higher. These differences are plausibly caused by the different sample window, as the sample window in Table 4.1 starts around 5 years later and ends more recently than the sample used in Gao et al. (2017).

The results in Gao et al. (2017) indicate that excluding the financial crisis of 2007-2009 results in a lower significance level. To verify whether this is also the case for DIA, QQQ and MDY, the model in equation (2) is also estimated on a sample consisting of only the financial crisis and on a sample without the financial crisis. The results for the crisis-only sample are shown in Table 6.2 and those for the sample excluding the crisis in Table 6.3.

Table 6.2 shows that all estimated coefficients and the R2 are higher than in the regression results in Table 4.1. The significance of the coefficients is weaker than in Table 4.1, which might be explained by the sample size, which is roughly 10% of the full sample. The results in Table 6.3 show that when the financial crisis is excluded, the coefficients become less significant and the R2 is lower than in Table 4.1. However, all coefficients are still significant at the 1% level.

To add robustness, the model in equation (2) is also estimated for six more ETFs. Information on these ETFs is shown in Table 6.5 and the regression results in Table 6.6. The regression results show that the first half hour return is a significant predictor of the last half hour return for four of the six ETFs. For the other two ETFs, the first half hour return is insignificant even at the 10% level. These findings imply that the intraday momentum model in equation (2) does not hold across all ETFs. This contrasts with the robustness ETFs used in Gao et al. (2017), for all of which the first half hour return is a significant predictor.¹³

Combining all results in this subsection indicates that the first half hour return is a significant predictor of the last half hour return for all four studied ETFs. The significance is lower without the financial crisis, but the predictors remain significant. Hence there is no evidence to reject hypothesis one.

¹³It must be noted that this thesis considers ETFs other than those used in Gao et al. (2017). ETFs used


4.2 Parameter Instability

To illustrate how the coefficients of model (3) behave over time, the recursively estimated coefficients and corresponding Newey-West t-statistics by using the RLS method are shown in Figure 4.1.

Figure 4.1: Recursively estimated coefficients

This figure shows the estimated \(\beta_t\) for all studied ETFs, scaled by a factor 100. The coefficients are recursively estimated by the RLS method, with a minimum observation window of 200. The reported t-statistics are calculated using Newey-West standard errors.

The estimated coefficients for the first half hour return for SPY and DIA in Figure 4.1 show that the financial crisis of 2007-2009 has a big impact on the estimated coefficients, which is also seen in Gao et al. (2017). The pre- and post-crisis estimated coefficients seem to be more stable for both ETFs, with the exception of the early part of the sample. During the period from 10/98 to 10/02 the coefficients of both are volatile without a clear direction. The period after the financial crisis shows declining coefficients for both ETFs. This decline is not seen in Gao et al. (2017), as their sample stops in 2013. The t-statistic for SPY starts at 0.3 and increases over time as new observations become available; the estimate is significant at the 1% level from early 2003 onwards. The t-statistics for DIA start from a higher value than those for SPY and slowly increase, becoming significant at the 1% level from early 2002 onwards.

Figure 4.1 shows that the estimated coefficients for QQQ behave differently from the coefficients of the other ETFs. The financial crisis had only little impact on the estimated coefficient for QQQ. However, there is a big shift early in the sample: the estimated coefficients from 1999 to 2000 are high and then decline rapidly until approximately mid 2001. It is plausible that the dot-com bubble caused the high coefficients and that the burst of the bubble caused the decline, as the same pattern is seen for the financial crisis in SPY and DIA. After the rapid decline, the coefficients are quite stable during the rest of the sample. The t-statistic starts from an estimated value of 3, slowly increases to 5 and stays around that level, which shows that the estimated coefficients are significant at the 1% level for the full sample period.

The plot of the estimated coefficients of MDY in Figure 4.1 contradicts the theory provided in section 2. Early in the sample, until 10/2002, the estimated coefficients for the first half hour return are negative, which indicates an inverse momentum effect. The corresponding t-statistic starts highly significant and slowly becomes less significant over time as the estimated coefficient nears 0; after reaching 0, the significance starts to increase again. The estimated coefficients shift during the financial crisis and stay stable after it.

To statistically test whether the models of equation (3) contain structural breaks, the Cusum squares test and the Quandt-Andrews tests are performed. The results of the Cusum squares test are shown in Figure 6.2 and the results of the Quandt-Andrews test are shown in Table 6.7.

The Cusum squares test rejects coefficient stability for all ETFs. Figure 6.2 shows that the test statistic for SPY, DIA and MDY crosses the critical value during the financial crisis. The recursive residuals for those ETFs stay relatively small pre-crisis and increase rapidly during the crisis, which results in crossing the critical values at the 1% significance level. This rapid increase is also seen in the estimated coefficients in Figure 4.1, which shift during the same period. After the financial crisis the recursive residuals no longer increase significantly.

The test statistic for QQQ passes the critical value early in the sample. The recursive residuals increase rapidly after the first few observations and cross the critical value early in 2000. This period corresponds with the dot-com bubble and a large shift in the estimated coefficients for QQQ, which can be seen in Figure 4.1. Hence, all performed Cusum squares tests lead to the conclusion that parameter stability is rejected during a crisis that has a large impact on the tested ETF.

Table 6.7 shows that the Quandt-Andrews tests reject the null hypothesis of no breakpoint for all ETFs at a significance level of at least 5%. The estimated breakpoint dates for SPY, DIA and QQQ correspond with the Cusum squares test. The estimated breakpoint of MDY differs from that found in Figure 6.2: the break occurs on 11/12/2001 instead of during the financial crisis.¹⁴ This point approximately corresponds with the moment that the estimated coefficient switches sign from negative to positive.

The results in this section indicate that the coefficients of the model specified in equation (3) are not constant over time. Figure 4.1 shows that a relevant crisis results in a shift in the estimated coefficient. The Cusum squares test and the Quandt-Andrews test support that the model contains at least one structural break. Therefore, there is no evidence to reject hypothesis two.

4.3 Forecasting performance under parameter instability

4.3.1 OOS R2

The models discussed in sections 3.2.2 and 3.2.3 are estimated recursively for all ETFs to calculate the OOS R2, which is shown in Table 4.2. The discount factor for RDLS and RDLS* is set at α = 0.002, as trial and error showed that this factor results in the highest OOS R2 for all ETFs. With this factor, an observation made one year earlier receives a weight of only roughly 60% in the estimation.
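The figure of roughly 60% follows directly from the discount function in equation (7). The short check below assumes about 252 trading days per year, which is my assumption and not a number given in the text.

```python
alpha = 0.002          # discount factor used for RDLS and RDLS*
trading_days = 252     # assumed number of observations in one year

# Weight of an observation that is one year older than the most recent one.
weight = (1.0 / (1.0 + alpha)) ** trading_days
print(f"weight after one year: {weight:.2f}")
```

Under this assumption the weight comes out at about 0.60, in line with the statement above.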

¹⁴When conducting the Quandt-Andrews breakpoint test on a sample of MDY starting after 11/12/2001,


Table 4.2: OOS R2

This table shows the OOS R2 calculated for the different models, with the maximum sample size available. All estimations are done recursively with a minimum of 200 observations. The discount factor for RDLS and RDLS* is set at α = 0.002. The OOS R2 is computed as stated in equation (13). RDLS* represents the Recursive Discounted Least Squares method with no weighting of the intercept.

            RLS      RDLS     RDLS*    ROC
SPY         1.47%    1.42%    1.54%    1.00%
DIA         1.69%    2.30%    2.39%    2.15%
QQQ         0.94%    1.05%    1.27%    0.89%
MDY         1.52%    2.17%    2.23%    2.03%
Average     1.40%    1.73%    1.85%    1.51%

The OOS R2 varies per ETF. The forecasting accuracy of the RLS method for SPY is 0.27 percentage points higher than that found in Gao et al. (2017). It is reasonable to assume that this difference is caused by the different sample rather than by a structural difference in the method or data. Further, the OOS R2 values for DIA and MDY using the RLS method are approximately the same size, and the OOS R2 for DIA is 0.66 percentage points higher than in Gao et al. (2017). The RLS method performs worst on QQQ, with an OOS R2 of only 0.94%; this value is nevertheless higher than in Gao et al. (2017), where the OOS R2 is 0.70%.

Table 4.2 shows that the RDLS* method, which does not weight the intercept, always outperforms the RDLS method. This implies that, for these ETFs, weighting the intercept is not preferable. Therefore, the results of RDLS are not discussed further and the focus lies on the RDLS* method. Using the RDLS* method leads to substantial gains in the OOS R2: on average the OOS R2 is 32% higher than for the RLS method.¹⁵
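As a quick check of the 32% figure, using the rounded averages reported in Table 4.2 (1.85% for RDLS* and 1.40% for RLS), the relative gain works out as:

\[ \frac{1.85 - 1.40}{1.40} \approx 0.32 \]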

The ROC method outperforms the RLS method only on DIA and MDY. It underperforms the RDLS* method in all cases, and its average increase in OOS R2 over RLS is also lower (6%). To illustrate how the ROC method behaves over time, the estimated coefficients and the sample size are shown in Figures 6.3 and 6.4, respectively. Figure 6.4 shows that the Cusum squares tests find many breaks in the sample of each ETF; the sample size rarely exceeds 400. This influences the coefficient estimates, as Figure 6.3 shows that the coefficients fluctuate much more over time than in Figure 4.1. The underperformance compared to RDLS* might be caused by the detection of false breaks, which keeps the average number of observations low and results in higher estimation errors. This is supported by the t-statistics in Figure 6.3, which on average appear to be very low for SPY, DIA and QQQ.

¹⁵This is first taking the average of both methods and then calculating the relative increase in average

As a robustness test, all estimation methods are applied to six other ETFs; information on these ETFs can be found in Table 6.5 and the OOS R2 results in Table 6.8. The RDLS* method again proves to be the best performer: it underperforms the RLS method on only one ETF and the RDLS method in only one case, and it always outperforms the ROC method. Hence there is no indication that the RDLS* method performs well only on a small number of specific ETFs.¹⁶

This section shows that, when performance is measured only by the OOS R2, the Recursive Discounted Least Squares method without weighting the constant outperforms all other methods. The Reverse Ordered Cusum method outperforms the RLS method in only two of the four cases. Because the RDLS* method shows a higher OOS R2 than the simple RLS method on all four ETFs, hypothesis three is not rejected.

4.3.2 Return performance

To illustrate whether intraday momentum can be profitable for a trader, the performance of the trading strategy specified in equation (14) is evaluated in this section. The performance of all models is discussed per ETF: first the results for the full sample, then the split sample¹⁷, followed by a summary for that ETF. Finally, all results are combined and summarized. All results exclude transaction costs and the risk-free rate unless mentioned otherwise. The corrected Sharpe ratio includes the risk-free rate and transaction costs.¹⁸

SPY

The performance of the trading strategies for SPY using the full sample can be found in Panel A of Table 6.9. The benchmark of always buying at the start of the last half hour shows a negative total return of −16%, the other benchmark of the basic RLS method with no threshold shows a total return of 80% and a Sharpe of 0.54. These benchmark returns are lower than those found in Gao et al. (2017). The average yearly return by using the RLS method is 3.05%, which is half of that reported in Gao et al. (2017).

To show what possibly causes the difference between the results in Table 6.9 and those in Gao et al. (2017), the compounded returns over time of all the different methods are shown in Figure 6.5. This figure shows that for the RLS method most of the total return is made during the financial crisis; the returns made during that period and excluding that period are shown in Panel B of Table 6.9. Panel B shows that for the RLS method the average return during the crisis is around six times higher than before and after the crisis. Figure 6.5 shows that the returns increase only slowly after the crisis and eventually start decreasing after 2014. This explains the differences between Table 6.9 and the results in Gao et al. (2017), as their sample stops at the peak in Figure 6.5 and does not include the drawdown that started after 2014.

¹⁶The gain in average OOS R2 when using the RDLS* method is still 30% compared to the RLS method.

¹⁷Results reported on the sub samples are always made when setting no threshold.

¹⁸It is noteworthy to state that the returns in this section do not include the bid-ask spread, as it is not

Further, Panel A of Table 6.9 shows that the RDLS* method outperforms all other methods (with the same threshold) when using no threshold or a threshold of 0.1%, as its average return and Sharpe ratio are higher than for the other methods. The highest Sharpe ratio of 0.74 is achieved by RLS when the threshold is set equal to the sum of the transaction costs and the risk-free rate. However, RDLS* and ROC always perform equal to or better than the benchmarks.

The impact of the transaction costs on the average returns for SPY in Table 6.9 shows that the transaction fees reduce returns, but even after costs all methods show a positive return and RDLS* and ROC still outperform the benchmarks. When the risk-free rate and transaction costs are included in the Sharpe ratio, the corrected Sharpe ratios are near zero or negative for most methods. This indicates that the methods do not perform well after costs when taking risk into account.

The results during the crisis for SPY show more promising returns, as the Sharpe ratios for RDLS* and ROC are both higher than 1 after including the transaction costs and the risk-free rate. When the financial crisis is excluded, all methods perform worse and all have a negative corrected Sharpe ratio, caused by a very low average return.

Summarizing all the results in Table 6.9 and Figure 6.5 shows that most returns using different methods on SPY are made during the financial crisis. Considering the full sample and the split sample, there is no method that performs best, but the RDLS* method and ROC method always outperform the benchmarks. After correcting for transaction fees and the risk-free rate the Sharpe ratio becomes very low, which indicates that using any method on SPY gives no good return when taking risk into account.

DIA

The performance results for DIA using the full sample are reported in Panel A of Table 6.10. As was the case for SPY, the Always Buy benchmark makes a negative return over the full sample. The RLS method shows an average return of 3.38% and a Sharpe of 0.63; these results are slightly better than those for SPY. The RDLS* method with a threshold always outperforms the benchmarks, and the ROC method only outperforms the benchmark when the threshold is set equal to the transaction costs plus the risk-free rate. Again, the transaction fees decrease the average return, but all corrected average returns stay positive. The corrected Sharpe ratio nears zero or becomes negative when adjusting for the risk-free rate and transaction fees, indicating that the risk-reward ratio is low.

The compounded returns for DIA are shown in Figure 6.6. This figure shows that most returns are made during the financial crisis. The RLS and RDLS* methods seem to have declining or flat compounded returns after the financial crisis. It is interesting to note that the ROC method did not outperform RLS in Table 6.10, but Figure 6.6 shows that after 2010 this method does generate some return, which is not the case for RLS and RDLS*.

Panel B of Table 6.10 shows the returns when splitting the sample. As could be seen in Figure 6.6, the returns made during the financial crisis are high. The highest average yearly return is 31.74%, achieved by the ROC method with a Sharpe of 2.87. The RDLS* method is the worst performer during the financial crisis. When the financial crisis is not taken into account, all returns become low or negative after correcting for transaction fees. This indicates that the positive returns shown in Panel A of Table 6.10 are mostly caused by the returns during the financial crisis.

Considering all results shown in Table 6.10 and Figure 6.6, there is no method that performs best, as the performance heavily depends on the threshold. Splitting up the sample shows that all returns over the full sample are caused by the financial crisis and that, when the crisis is excluded, the returns after correcting for transaction fees are not positive. This indicates that trading with an intraday momentum strategy on DIA was only profitable during the financial crisis.

QQQ

Panel A of Table 6.11 reports the performance results for QQQ when using the full sample. The Always Buy benchmark shows a negative average yearly return of −1.85%. The RLS benchmark shows an average yearly return of 6.95%, which is more than double the average of SPY and DIA. Comparing the methods across the different thresholds shows that there is no overall best performer. The highest Sharpe of 1.15 is achieved by the RLS method when the threshold is set equal to the sum of the risk-free rate and transaction costs. The ROC method performs worst, as it never outperforms the RLS benchmark when comparing the Sharpe ratios. The average returns decrease slightly after correcting for transaction fees but remain positive for all methods. The corrected Sharpe ratio is significantly lower than the standard Sharpe ratio. The corrected Sharpe ratios for RLS and RDLS* are roughly 0.50 when setting a threshold, which is substantially higher than the corrected Sharpe ratios found for SPY or DIA.

The compounded returns for QQQ are shown in Figure 6.7. This figure shows once again that most returns are made during the financial crisis. The returns after the financial crisis seem to keep increasing for longer than those of SPY and DIA. However, the returns slowly reverse into a decline.

The results in Panel B of Table 6.11 confirm what is seen in Figure 6.7, the returns are substantially up during the financial crisis. The Sharpe ratios during the crisis are all higher than 1, with the highest Sharpe of 1.73 made by the RDLS* method. When excluding the financial crisis, the returns are significantly higher than those found for SPY and DIA, even after correcting for transaction fees the returns stay positive. However, the corrected Sharpe is negative or close to zero when excluding the financial crisis.

Combining all results in Table 6.11 and Figure 6.7 indicates that there is no best performing method across the different thresholds. The results are more promising than those for SPY and DIA, as overall the average yearly return and Sharpe ratio are higher than for the other two ETFs. Again, most returns are made during the financial crisis, with non-corrected Sharpe ratios that are higher than 1. The returns excluding the financial crisis are substantially lower, but remain positive even after correcting for transaction costs and are more promising than those for SPY and DIA.

MDY

The performance results for MDY using the full sample are shown in Panel A of Table 6.12. The RDLS* and ROC methods both show promising results when setting no threshold, as they reach a Sharpe of 1.06 and 1.21 respectively, which are higher than the RLS benchmark of 1.03. The Always Buy benchmark performs well compared to the other ETFs, as this is the only instance where its average yearly return is positive. RDLS* and ROC both outperform the RLS benchmark when setting no threshold or a threshold equal to the sum of the transaction costs and the risk-free rate. However, there is no overall best performer, as no method performs best across threshold levels. After correcting for transaction costs, the average yearly return stays positive for all methods.

In Figure 6.8 the compounded returns are shown; the returns seem to be steadily increasing over time, with a spike during the crisis. It is remarkable that returns are still made after the financial crisis, as this is not the case for SPY, DIA and QQQ (except for the ROC method on DIA).

The results of the split sample are reported in Panel B of Table 6.12. As was the case for the other ETFs, the returns during the financial crisis are up substantially, resulting in high Sharpe ratios. The corrected Sharpe ratios during the crisis are all higher than 1. The average yearly returns when excluding the financial crisis are high compared to those for SPY and DIA but are comparable with those for QQQ. After correcting for transaction costs, the average returns stay positive and are higher than those for QQQ.

Summarizing the results for MDY shows that there is no method that performs best, but the RLS benchmark can easily be outperformed by choosing either the RDLS* or ROC method and setting no threshold or setting it equal to the sum of the transaction costs and risk-free rate. The returns during the financial crisis are up substantially, but Figure 6.8 shows that returns are made even after the crisis. The returns when excluding the financial crisis are lower, but even after correcting for transaction costs they are positive and higher than for any other ETF.

Summary

Combining all individual results shows that overall there is no evidence that one method is the best performer and that most of the returns are caused by the financial crisis. The performance of each method depends on the ETF chosen and varies largely per threshold. However, in most cases the RLS benchmark can be outperformed by choosing either the RDLS* or ROC method and setting no threshold. But as this is not always the case, and as there are results in which RLS outperforms the other two methods, there is enough evidence to reject hypothesis four.¹⁹

¹⁹Optimal mean-variance portfolios have also been constructed based on the forecasts of the different estimation methods; the results of those portfolios are shown in Table 6.13 and the method is discussed in Appendix 6.4. These results lead to a similar conclusion: there is no overall best performing method.


5 Conclusion

This thesis studied the behaviour of the intraday momentum parameters over time and the forecasting performance of models that account for parameter instability. The study was done in three parts with four corresponding hypotheses to help answer the research question.

The first part of this study focused on the existence of the intraday momentum effect, in which the first half hour return predicts the last half hour return. Regression results show that the first half hour return is a significant predictor of the last half hour return for the four studied ETFs. Splitting the sample showed that the predictability is highest during the financial crisis of 2007-2009. Nevertheless, the first half hour return remains a significant predictor when the financial crisis is excluded from the sample. Taken together, these findings give no evidence to reject hypothesis one: the first half hour return is a significant predictor of the last half hour return for the studied ETFs.
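The predictive regression summarized above can be sketched in a few lines. This is a minimal illustration and not the estimation code used in this thesis: it assumes a single regressor (the first half hour return) plus a constant, uses plain homoskedastic standard errors rather than any robust correction, and runs on made-up numbers. The function name `ols_slope_tstat` is hypothetical.

```python
import math

def ols_slope_tstat(x, y):
    """OLS of y on a constant and x. Returns (alpha, beta, t-statistic of beta).

    Assumption: plain (homoskedastic) standard errors, which is a
    simplification for return data.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    se_beta = math.sqrt(rss / (n - 2) / sxx)
    return alpha, beta, beta / se_beta

# Made-up half hour returns, for illustration only (not thesis data).
r_first = [0.002, -0.001, 0.003, -0.004, 0.001, 0.000, -0.002, 0.004]
r_last = [0.0005, -0.0002, 0.0004, -0.0007, 0.0003, -0.0001, -0.0001, 0.0006]

alpha, beta, t_beta = ols_slope_tstat(r_first, r_last)
print(f"alpha={alpha:.6f} beta={beta:.4f} t(beta)={t_beta:.2f}")
```

A positive and significant slope is what the intraday momentum effect predicts; the split-sample analysis amounts to running the same regression on sub-periods.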

The regressions in the first part already show that the coefficients differ during the financial crisis, and those findings are confirmed in part two, which focuses on parameter instability. A plot of the recursively estimated coefficients for all ETFs over time indicates that a relevant financial crisis results in a shift in the coefficients. Further, Gao et al. (2017) state that the estimated coefficient for SPY stays stable after the financial crisis, whereas Figure 4.1 shows a clear declining trend in the estimated coefficients. To test statistically whether a structural break exists, the CUSUM of squares test and the Quandt-Andrews breakpoint test are performed. Both tests indicate a structural break in the coefficient and hence that the parameters are not stable over time. Based on Figure 4.1 and the test results, there is no evidence to reject hypothesis two, which states that the parameters are unstable over time.
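To make the idea behind the CUSUM of squares test concrete, the sketch below computes the recursive residuals of Brown, Durbin and Evans (1975) for a regression on a constant and one regressor, and then the cumulative share of their squared sum. Under stable parameters this path should stay close to a straight line from 0 to 1. The sketch only reports the largest deviation from that line; the critical values needed for a formal test are not reproduced here. The data are made up (with a deliberate jump in noise halfway), and all function names are hypothetical.

```python
import math

def fit_line(x, y):
    """OLS of y on a constant and x. Returns (alpha, beta, mean of x, Sxx)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    beta = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - beta * mx, beta, mx, sxx

def recursive_residuals(x, y, start=2):
    """Standardised one-step-ahead errors, each using only data before t.

    start=2 is the smallest usable value with two parameters; the first
    two x values must differ so that the first fit is defined.
    """
    w = []
    for t in range(start, len(x)):
        alpha, beta, mx, sxx = fit_line(x[:t], y[:t])
        # 1 + x_t'(X'X)^{-1} x_t for the regressor vector (1, x_t)
        scale = 1.0 + 1.0 / t + (x[t] - mx) ** 2 / sxx
        w.append((y[t] - alpha - beta * x[t]) / math.sqrt(scale))
    return w

def cusum_of_squares(w):
    """Cumulative share of the squared recursive residuals (ends at 1)."""
    total = sum(wi * wi for wi in w)
    path, running = [], 0.0
    for wi in w:
        running += wi * wi
        path.append(running / total)
    return path

def max_deviation(path):
    """Largest gap between the path and its expected straight line."""
    m = len(path)
    return max(abs(s - (j + 1) / m) for j, s in enumerate(path))

# Made-up series: the noise is five times larger in the second half.
x = [((7 * i) % 11 - 5) / 1000 for i in range(40)]
noise = [((5 * i) % 7 - 3) / 10000 for i in range(40)]
y = [0.1 * x[i] + (noise[i] if i < 20 else 5 * noise[i]) for i in range(40)]

path = cusum_of_squares(recursive_residuals(x, y))
print(f"max deviation from the expected line: {max_deviation(path):.3f}")
```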

The last part addresses the forecasting performance of the different estimation methods. When the OOS R² is used as performance measure, significant gains in forecasting accuracy are made by choosing the Recursive Discounted Least Squares method without weighting the intercept (RDLS*). The average OOS R² of RDLS* is 32% higher in relative terms than the average OOS R² of the Recursive Least Squares method (RLS). The RDLS* method also outperforms the standard RDLS method for all four ETFs, which suggests that weighting the intercept does not lead to a gain in forecasting accuracy. The Reverse Ordered Cusum method (ROC) has a higher forecasting accuracy than RLS only for DIA and MDY, and never has a higher OOS R² than RDLS*. These findings give no evidence to reject hypothesis three, as RDLS* has a higher forecasting accuracy than RLS for all ETFs.
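The comparison above can be illustrated with a small sketch of recursive forecasting and the OOS R². Several things here are assumptions and not taken from this thesis: the discount weights are geometric (the observation k days back gets weight discount**k), the discount factor 0.98 is arbitrary, and the OOS R² benchmark is the historical mean of the dependent variable in the spirit of Campbell and Thompson (2008). The sketch weights all terms alike, so it corresponds to a plain discounted least squares fit; the RDLS* variant that leaves the intercept unweighted is not reproduced. Data and function names are made up.

```python
def weighted_fit(x, y, w):
    """Weighted least squares of y on a constant and x. Returns (alpha, beta)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    beta = sxy / sxx
    return my - beta * mx, beta

def recursive_forecasts(x, y, start, discount=1.0):
    """One-step-ahead forecasts of y[t] for t = start, ..., len(y) - 1.

    Coefficients use observations before t only; x[t] itself is known at
    forecast time (the first half hour precedes the last half hour).
    discount=1.0 gives expanding-window OLS, i.e. RLS.
    """
    preds = []
    for t in range(start, len(x)):
        weights = [discount ** (t - 1 - s) for s in range(t)]
        alpha, beta = weighted_fit(x[:t], y[:t], weights)
        preds.append(alpha + beta * x[t])
    return preds

def oos_r2(y, preds, start):
    """1 - SSE(model) / SSE(historical mean), over t = start, ..., len(y) - 1."""
    num = den = 0.0
    for i, t in enumerate(range(start, len(y))):
        hist_mean = sum(y[:t]) / t
        num += (y[t] - preds[i]) ** 2
        den += (y[t] - hist_mean) ** 2
    return 1.0 - num / den

# Made-up returns, for illustration only.
x = [((7 * i) % 11 - 5) / 1000 for i in range(60)]
y = [0.2 * x[i] + ((5 * i) % 7 - 3) / 10000 for i in range(60)]

for name, d in (("RLS", 1.0), ("discounted LS", 0.98)):
    r2 = oos_r2(y, recursive_forecasts(x, y, start=20, discount=d), start=20)
    print(f"{name}: OOS R2 = {r2:.4f}")
```

A positive value means the model forecasts beat the historical mean out of sample; a discount factor below one lets recent observations dominate, which is the motivation for discounting when coefficients shift.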

Furthermore, the performance of the estimation methods is also evaluated by constructing a market timing strategy based on the forecasts of those methods. The market timing strategy showed that the overall profitability depends heavily on the ETF chosen and that most of the returns are made during the financial crisis. When the financial crisis is excluded, all corrected Sharpe ratios become very low and in most cases negative, indicating a poor risk-reward trade-off. Comparing the estimation methods shows that no method always outperforms the RLS method for all four ETFs. The performance also varies considerably and depends on the threshold chosen. Therefore, there is enough evidence to reject hypothesis four, as the estimation methods studied in this thesis do not always outperform the RLS method.
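A threshold-based timing rule of the kind evaluated above can be sketched as follows. The exact trading rule and the definition of the corrected Sharpe ratio are not restated in this section, so the sketch makes its own assumptions: the position is long when the forecast exceeds the threshold, short when it is below minus the threshold and flat otherwise, a fixed cost is charged per trade, and the Sharpe ratio is annualised with 252 trading days. All names and numbers are illustrative.

```python
import math

def timing_returns(forecasts, realised, threshold=0.0, cost=0.0):
    """Daily strategy returns for the last half hour.

    Assumed rule: long if forecast > threshold, short if
    forecast < -threshold, otherwise no position (return 0).
    """
    out = []
    for f, r in zip(forecasts, realised):
        if f > threshold:
            out.append(r - cost)
        elif f < -threshold:
            out.append(-r - cost)
        else:
            out.append(0.0)
    return out

def sharpe_ratio(returns, rf=0.0, periods_per_year=252):
    """Annualised mean excess return divided by its sample standard deviation."""
    n = len(returns)
    excess = [r - rf for r in returns]
    mean = sum(excess) / n
    var = sum((e - mean) ** 2 for e in excess) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

# Made-up forecasts and realised last half hour returns.
forecasts = [0.0010, -0.0020, 0.0001, 0.0030, -0.0004, 0.0015]
realised = [0.0008, -0.0010, -0.0005, 0.0012, 0.0006, -0.0003]

for thr in (0.0, 0.0005):
    rets = timing_returns(forecasts, realised, threshold=thr, cost=0.0001)
    print(f"threshold={thr}: Sharpe = {sharpe_ratio(rets):.2f}")
```

Raising the threshold trades less often, which lowers the total transaction costs but also discards days with small forecasts. This is one reason why the ranking of methods can change with the threshold.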

Combining all results in this thesis shows strong evidence that the intraday momentum parameters do not stay constant over time and contain at least one structural break. The profitability of a market timing strategy based on intraday momentum was highly influenced by the financial crisis, and the parameters of the models shifted during a crisis. When the risk-free rate and transaction costs are taken into account and the financial crisis is excluded, the Sharpe ratio became negative or close to zero for most estimation methods. Overall, substantial gains in forecasting accuracy over the standard Recursive Least Squares method could be made by using Recursive Discounted Least Squares without weighting the intercept. However, under a market timing strategy, no estimation method always performed best.

This thesis focused mainly on four ETFs, but further studies could easily extend it by using more and different ETFs. Parameter instability can also be studied in, for example, currency markets or individual stocks. Another interesting topic is how other estimation methods perform, as there are many more estimation methods that account for parameter instability than those used in this thesis. Hence, there are multiple topics for further research, all of which would extend the scarce literature on intraday stock prediction.


References

Admati, A. R., Pfleiderer, P. (1988). A theory of intraday patterns: Volume and price variability. The Review of Financial Studies, 1 (1), 3-40.

Andrews, D. W. (1993). Tests for parameter instability and structural change with unknown change point. Econometrica: Journal of the Econometric Society, 821-856.

Asness, C. S., Moskowitz, T. J., Pedersen, L. H. (2013). Value and momentum everywhere. The Journal of Finance, 68 (3), 929-985.

Barberis, N., Shleifer, A., Vishny, R. (1998). A model of investor sentiment. Journal of Financial Economics, 49 (3), 307-343.

Berk, J. B., Green, R. C., Naik, V. (1999). Optimal investment, growth options, and security returns. The Journal of Finance, 54 (5), 1553-1607.

Bondt, W. F., Thaler, R. (1985). Does the stock market overreact? The Journal of Finance, 40 (3), 793-805.

Bogousslavsky, V. (2016). Infrequent rebalancing, return autocorrelation, and seasonality. The Journal of Finance, 71 (6), 2967-3006.

Brown, R. L., Durbin, J., Evans, J. M. (1975). Techniques for testing the constancy of regression relationships over time. Journal of the Royal Statistical Society. Series B (Methodological), 149-192.

Campbell, J. Y., Thompson, S. B. (2008). Predicting excess stock returns out of sample: Can anything beat the historical average? The Review of Financial Studies, 21 (4), 1509-1531.

Cheng, E. (2017). Just 10% of trading is regular stock picking, JPMorgan estimates. CNBC. Retrieved from https://www.cnbc.com/2017/06/13/death-of-the-human-investor-just-10-percent-of-trading-is-regular-stock-picking-jpmorgan-estimates.html

Daniel, K., Hirshleifer, D., Subrahmanyam, A. (1998). Investor psychology and security market under- and overreactions. The Journal of Finance, 53 (6), 1839-1885.

Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica: Journal of the Econometric Society, 987-1007.

Fama, E. F., French, K. R. (1998). Value versus growth: The international evidence. The Journal of Finance, 53 (6), 1975-1999.


Gao, L., Han, Y., Li, S. Z., Zhou, G. (2017). Market intraday momentum. Forthcoming in The Review of Financial Studies.

Heij, C., De Boer, P., Franses, P. H., Kloek, T., Van Dijk, H. K. (2004). Econometric methods with applications in business and economics. OUP Oxford.

Heston, S. L., Korajczyk, R. A., Sadka, R. (2010). Intraday Patterns in the Cross-section of Stock Returns. The Journal of Finance, 65 (4), 1369-1407.

Hora, M. (2006). The practice of optimal execution. Trading, (1), 52-60.

Jain, P. C., Joh, G. H. (1988). The dependence between hourly prices and trading volume. Journal of Financial and Quantitative Analysis, 23 (3), 269-283.

Jegadeesh, N., Titman, S. (1993). Returns to buying winners and selling losers: Implications for stock market efficiency. The Journal of Finance, 48 (1), 65-91.

Kaya, O. (2016). High-frequency trading: Reaching the limits. Deutsche Bank Research.

Paye, B. S., Timmermann, A. (2006). Instability of return prediction models. Journal of Empirical Finance, 13 (3), 274-315.

Pesaran, M. H., Timmermann, A. (1995). Predictability of stock returns: Robustness and economic significance. The Journal of Finance, 50 (4), 1201-1228.

Pesaran, M. H., Timmermann, A. (2002). Market timing and return prediction under model instability. Journal of Empirical Finance, 9 (5), 495-510.

Pesaran, M. H., Timmermann, A. (2004). How costly is it to ignore breaks when forecasting the direction of a time series? International Journal of Forecasting, 20 (3), 411-425.

Renault, T. (2017). Intraday online investor sentiment and return patterns in the US stock market. Journal of Banking & Finance, 84, 25-40.

Scowcroft, A., Sefton, J. (2005). Understanding momentum. Financial Analysts Journal, 61 (2), 64-82.

Sun, L., Najand, M., Shen, J. (2016). Stock return predictability and investor sentiment: A high-frequency perspective. Journal of Banking & Finance, 73, 147-164.

Welch, I., Goyal, A. (2007). A comprehensive look at the empirical performance of equity premium prediction. The Review of Financial Studies, 21 (4), 1455-1508.


6 Appendix

6.1 Descriptive Statistics

Table 6.1: Information on studied ETFs

Ticker  Name                                    Sample                Exchange
SPY     SPDR S&P500                             05/01/98 - 29/05/18   NYSE
DIA     SPDR Dow Jones Industrial Average ETF   21/01/98 - 31/05/18   NYSE
QQQ     PowerShares QQQ                         11/03/99 - 30/05/18   NASDAQ
MDY     SPDR S&P MIDCAP 400                     05/01/98 - 30/05/18   NYSE

Figure 6.1: Graph of last half hour return

This figure contains graphs of the last half hour returns of the studied ETFs. It is worth noting that SPY, DIA, QQQ and MDY all show high spikes during the financial crisis of 2007-2009. The graph of QQQ also contains high spikes around 2000, which correspond to the dot-com bubble.
