Modelling conditional heteroskedasticity in forecasting intraday returns using overnight returns: an analysis of the DAX 30

Academic year: 2021

Modelling conditional heteroskedasticity in forecasting intraday returns using overnight returns - An analysis of the DAX 30

ABSTRACT - A UNIVARIATE ANALYSIS OF THE DAX 30 FROM OCTOBER 1, 2015 UNTIL OCTOBER 1, 2017 SHOWS THAT THE AVERAGE OVERNIGHT RETURN IS 0.0535%, WHILE THE AVERAGE INTRADAY RETURN IS 0.0096%. IN-SAMPLE REGRESSIONS SHOW THAT THE OVERNIGHT LOG RETURN AND THE LAST HALF-HOUR LOG RETURN OF THE PREVIOUS TRADING DAY HAVE A POSITIVE EFFECT ON THE SUBSEQUENT FIRST HALF-HOUR LOG RETURN OF, RESPECTIVELY, 0.030 AND 0.30 TO 0.41. ADDITIONALLY, THE OVERNIGHT LOG RETURN HAS A NEGATIVE EFFECT ON BOTH THE PERIOD BETWEEN THE FIRST AND LAST HALF-HOUR AND ON THE LAST HALF-HOUR ITSELF OF, RESPECTIVELY, -0.081 AND -0.024. FINALLY, AN OUT-OF-SAMPLE ANALYSIS SHOWS THAT THE FORECAST INTERVAL RANGES ALMOST SYMMETRICALLY AROUND ZERO, WHICH MAKES IT RATHER USELESS FOR INVESTORS.

Name: Mark van Kampen, BSc Student number: 10610618 Supervisor: Hao Li, MSc Date: December 22, 2017

University of Amsterdam

Statement of Originality

This document is written by Mark van Kampen who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of

Contents

1 Introduction

2 Literature review
2.1 Overnight versus intraday return
2.2 Possible explanations
2.2.1 Intraday return pattern and high opening prices
2.2.2 Volatility
2.2.3 Bid-ask bounce
2.2.4 Role of market makers
2.3 Forecasting the intraday return
2.4 Expectations for this dissertation

3 Methodology
3.1 Data
3.2 Research method
3.2.1 Univariate analysis
3.2.2 Forecasting the intraday return

4 Results
4.1 Univariate analysis
4.1.1 The overnight and intraday period
4.1.2 Distribution analysis
4.2 Forecasting the intraday return
4.2.1 In-sample analysis
4.2.2 Out-of-sample analysis

5 Conclusion

6 Reference list

7 Appendix
7.1 Components of the DAX 30
7.2 Data Filtering
7.3 Econometric tests
7.4 Distribution analysis

1. Introduction

Due to the different time zones in the world, relevant information for investors accumulates twenty-four hours per day. Imagine a day when Mario Draghi, president of the European Central Bank, announces a cutback of the European asset purchase program. A couple of hours later, when the European stock markets are already closed, Janet Yellen, Chair of the Federal Reserve, announces that the Federal Reserve is raising the federal funds rate by 0.25 percentage points. Finally, when both the European and American stock markets are closed, Kim Jong-un, the supreme leader of North Korea, decides to launch another rocket and lets it land in Japanese waters.

Due to the introduction of the Internet, thirty years ago, the dissemination of these noteworthy events has become easier and more efficient. Moreover, as a result of increased globalisation, stock markets are more and more influenced by developments in the rest of the world (Ansari, 2009). However, since most financial markets are only trading for five days a week and a limited number of hours per day, investors cannot always respond immediately to this new information.

According to the efficient market hypothesis, all publicly available information is (almost) immediately processed in the price of a stock (Fama, 1970). However, due to the continuous flow of information, vital information can become available during non-trading hours. During these non-trading hours, investors are unable to respond to this information and the efficient market hypothesis is, at least temporarily, violated. As soon as the market opens, investors respond to the overnight events, which leads to an adjustment of the prices of the individual stocks. Hence, a stock’s opening price is likely to deviate from its closing price of the previous day. It is possible that a stock that seemed to be a good investment at the end of the previous trading day turns out to be a rather bad investment in the morning. Holding stocks during non-trading hours can therefore be a risky business and the importance of this overnight period should not be underestimated.

So far, the overnight period has not been of great concern in economic theory. Returns in asset pricing models, such as the CAPM, APT and Fama-French, are estimated on a close-to-close basis. However, a close-to-close return of 0% does not mean that the price of the stock has not changed at all during that day. Branch and Ma (2006) were among the first to investigate the relation between the overnight and intraday period. They show that there is a negative autocorrelation for American stock returns in the period of 1994 until 2005. This result, however, contradicts the weak form of the efficient market hypothesis, which argues that time series of financial returns are uncorrelated (Fama, 1970).

The weak form of the efficient market hypothesis has another important implication. Since financial time series are uncorrelated, forecasting stock returns using historical data should not be possible (Fama, 1970). However, a negative autocorrelation might indicate that this is possible. This idea is confirmed by Liu and Tse (2017), who show that it is possible to predict the first and last half-hour return of the trading day using overnight returns. If it really turns out that the overnight return can be used to forecast intraday returns, this would have major consequences for investors. By just observing the opening price, and hence the overnight return, they would already know approximately what a stock is going to do that day. Hence, the following research question arises: To what extent is it possible to use overnight returns to forecast the subsequent intraday returns?

This dissertation analyses the DAX 30 from October 1, 2015 until October 1, 2017. The research consists of two parts. First, the overnight and intraday returns are compared and their distributions are analysed. Second, an in-sample and an out-of-sample analysis are conducted to check the predictability of the intraday return. Basic regression models are then extended to volatility-ARMA and ARMA-GARCH models to correct for conditional heteroskedasticity. Finally, the out-of-sample performance is evaluated on the basis of both the RMSPE and MAPE.

The results in this thesis show an average overnight return of 0.0535% and an average intraday return of 0.0096%. Although the overnight return seems to be higher, it is not possible to say that it is significantly higher than the intraday return. For the in-sample analysis, this dissertation is distinctive in taking care of conditional heteroskedasticity. Results show that it is necessary to model the relation with either a volatility-ARMA model or an ARMA-GARCH model to capture the relation between the overnight and intraday period. These models show that the overnight return has a positive marginal effect of approximately 0.030 on the subsequent first half-hour log return and a negative marginal effect of about -0.024 on the subsequent last half-hour log return. Moreover, the overnight return has a negative effect of about -0.081 on the log return between the first and last half-hour. Additionally, the last half-hour log return of the previous trading day has a positive marginal effect of 0.30 to 0.41 on the subsequent first half-hour log return. Finally, the 95% out-of-sample forecast interval turns out to range almost symmetrically around zero, which makes it rather useless for investors.

This dissertation proceeds as follows: chapter 2 consists of an overview of the most important literature on the subject. Chapter 3 describes the dataset and research method. Chapter 4 presents the empirical results. Chapter 5 summarises and concludes.

2. Literature review

This literature review consists of four sections. The first section discusses the main findings concerning the difference between the overnight and intraday returns and the relation between these returns. The second section then describes four explanations for the results in the first section. The third section addresses the existing literature on forecasting intraday returns. The goal of this chapter is not only to provide a description of the literature, but also to provide a motivation for the research in this thesis. Hence, the fourth section concludes with the expectations for this thesis in the form of five hypotheses.

2.1 Overnight versus intraday return

According to Cooper, Cliff and Gulen (2008), the premium of stock prices is mainly due to the overnight return (close-to-open return). The average overnight return of the individual S&P 500 stocks, for the period of 1993 to 2006, is between 0.0282% and 0.0476%. The average intraday return ranges from -0.0285% to an insignificant 0.022% (Cooper et al., 2008). Their results are robust to different days of the week and across different securities and American markets.

Kelly and Clark (2010) confirm that the average overnight return is higher than the average intraday return. For the period of 1999 to 2006, the overnight return of several ETFs on American indices ranges from 0.037% to 0.093%. The intraday return, however, ranges from -0.089% to -0.039%. According to Berkman, Koch, Tuttle and Zhang (2012), the average overnight return of the 3000 largest US companies in the period of 1996 to 2008 is 0.10%. In contrast, the average intraday return is -0.07% (Berkman et al., 2012). Liu and Tse (2017) find results of the same magnitude for American ETFs and several international index futures for the period between 1999 and 2014.

Additionally, Branch and Ma (2006) and Cooper et al. (2008) find that the overnight and intraday returns tend to move in an opposite direction. It turns out that when the overnight return is positive, the intraday return is usually negative and vice versa. This result is confirmed by Branch and Ma (2012) who show an autocorrelation of -0.252. On the basis of a multiple regression model, they claim that a one percent increase in the overnight return leads to a change of -0.37% intraday. Branch and Ma (2012) also show that this reversal effect is reinforced for more extreme overnight returns.

Several explanations for the overall findings of a higher overnight return and a reversal effect intraday are discussed in the next section.

2.2 Possible explanations

2.2.1 Intraday return pattern and high opening prices

One possible way of explaining the higher overnight return and the reversal effect intraday is by analysing the intraday return pattern. Jain and Joh (1988) show that the intraday hourly returns of the stocks on the NYSE, for the period of 1979 to 1983, follow a U-shaped pattern. The return in the first hour is the highest of the day, after which the return decreases for a couple of hours and increases again towards the end of the day (Jain & Joh, 1988). This U-shaped pattern for the intraday return was already observed by Wood, McInish and Ord (1985) and Harris (1986). If the intraday return indeed follows a U-shaped pattern, it could be argued that it represents a high opening price which is then adjusted during the rest of the trading day (Jain & Joh, 1988).

Hong and Wang (2000) argue that information asymmetry between informed and uninformed investors increases during the night and gradually decreases during the subsequent trading day. To avoid overnight risk, investors tend to sell their positions before the close and buy them back at the open (Hong & Wang, 2000). Kelly and Clark (2010) agree on this tendency to close overnight positions, introducing the argument of the ’illusion of control’. They argue that investors are self-confident during the trading day, because they believe in their own ability to trade. However, during non-trading hours investors are unable to trade and the result of their portfolio is out of their control (Kelly & Clark, 2010). According to both Hong and Wang (2000) and Kelly and Clark (2010), the above-mentioned arguments lead to high opening prices.

So, from different angles, researchers argue that the market usually opens with higher prices, relative to the previous closing price. By showing that the close-to-close return is not significantly different from zero, Cooper et al. (2008) and Berkman et al. (2012) claim that the higher opening price is just a temporary mispricing of the stock. They argue that the high opening prices can hardly be explained by the stock’s characteristics and therefore are corrected during the trading day. Hence, a negative autocorrelation arises.

2.2.2 Volatility

A second possible explanation is the volatility. According to the risk-return trade-off, a potential return should increase with the risk. With stocks, it is expected that stocks with a higher average return also have a higher volatility. Since existing literature shows that the average overnight return is usually higher than the average intraday return, it is also expected that the overnight volatility is higher.

However, Cooper et al. (2008) show that the overnight variance of the stocks is significantly lower than the intraday variance. These results are confirmed by Kelly and Clark (2010) and Liu and Tse (2017). Since most investors are more interested in the Value at Risk (VaR) and the Expected Shortfall (ES) as measures of risk, Liu and Tse (2017) have also investigated whether these measures can be used to explain the results. However, they show that the VaR and the ES during non-trading hours are at most half as large as the VaR and ES during trading hours, which is again in contrast with the risk-return trade-off.

The result of a higher intraday variance was already observed in 1990 by Lockwood and Linn. They also show that the hourly variances follow a U-shaped pattern. This is in line with earlier results of Wood, McInish and Ord (1985) and Harris (1986). Although the results on volatility cannot be used to explain the higher overnight return, they are in accordance with the U-shaped intraday return, as discussed in section 2.2.1.

2.2.3 Bid-ask bounce

A third possible explanation is the bid-ask bounce. A bid-ask bounce is present when stocks are bouncing between their bid and ask prices, leading to opposite overnight and intraday returns (Branch & Ma, 2012). Branch and Ma argue that this can happen when stocks are trading at the same quotes for a number of consecutive days. It is then possible that a stock closes at the bid and opens at the ask, resulting in a positive overnight return. Branch and Ma (2012) argue that when the stock closes again at the bid, the intraday return is negative and a negative autocorrelation is observed. This can go on, even when the underlying quotes are changing marginally (Branch & Ma, 2012).

Lease, Masulis and Page (1991) already addressed the importance of using the midpoint quote, the price halfway between the bid and ask price, to correct for the potential influence of the bid-ask bounce. Moreover, Heston et al. (2010) argue that the intraday reversals are merely due to the bid-ask bounce. However, Branch and Ma (2012) show that using this midpoint quote changes the results only slightly. Moreover, Cooper et al. (2008) and Berkman et al. (2012) show that their results are robust to the bid-ask bounce. Hence, it remains unclear what effect the bid-ask bounce has on the relation between the overnight and intraday return.

2.2.4 Role of market makers

The last possible explanation is the role of market makers. Branch and Ma (2012) argue that the closing price of a stock, somewhere between the bid and ask, is a rather good approximation of the intrinsic value at that moment. Even if there is a large bid-ask spread, the closing price of the stock should be a reliable estimate of the value the market thinks is fair (Branch & Ma, 2012).

When the market closes, all open day orders of both individual investors and market makers are deleted from the order book, while only the good-till-canceled (GTC) orders remain. Due to the deletion of a large part of the order book, Branch and Ma (2012) argue that the bid-ask spread is likely to widen. Therefore, the market makers have freedom to choose an opening price for the next day which is usually somewhere between the highest bid and lowest ask price (Branch & Ma, 2012). However, they argue that if there is an inequality between the buy and sell orders, market makers tend to set a price outside these bounds.

As information is accumulating twenty-four hours a day, investors are likely to enter orders during non-trading hours. Following these overnight orders, it is possible that an imbalance of buy and sell orders arises (Branch & Ma, 2012). If there is such an imbalance, market makers have roughly two strategies when it comes to quoting their opening prices: ’leaning against the wind’ or opening away from the prior close (Branch & Ma, 2012).

The ’leaning against the wind’ strategy is discussed at length by Weill (2007). He describes this strategy as trading against the market to restore the order imbalance. If, for example, there is selling pressure in the market and the market maker is implementing the ’leaning against the wind’ strategy, the market maker buys the stocks and stores them in the inventory. Weill (2007) claims that market makers usually dispose of these stocks when the selling pressure has diminished. However, Branch and Ma (2012) argue that, when market makers implement this strategy, they usually trade at an unfavorable price.

The second strategy is to quote an opening price away from the prior close. If there is selling pressure in the market, the market maker has to make sure that enough buy limit orders are triggered (Branch & Ma, 2012). The limit prices of these buy orders are usually somewhat below the previous close and therefore the market maker quotes an opening price that is below the close. Now that buy limit orders are triggered, the imbalance gradually disappears without the need for the market maker to buy the stocks. Branch and Ma (2012) argue that this strategy leads to more transaction revenue for the market makers, because they receive a fee for every transaction. Moreover, the market makers limit changes in the inventory, because they do not have to buy the stocks.

The choice to take on this strategy is reinforced by the intraday pattern of the trading volume. Brock and Kleidon (1992) show that the trading volume is highest and less elastic at the open and close. Market makers can use this situation by quoting further away from the opening price. According to Chan and Fong (2000), this market maker behavior is also supported by other empirical studies such as Glosten and Harris (1988) and Huang and Stoll (1997).

According to Branch and Ma (2012), the opening price that results from this strategy is actually an overshoot, because it is not a good representation of the true value at that moment. They argue that the price is likely to go back gradually to its intrinsic value during the trading day, causing the negative autocorrelation. This reasoning is also in line with the arguments of Cooper et al. (2008) and Berkman et al. (2012), as discussed in subsection 2.2.1.

Although this explanation is intuitive and backed by several researchers, it is hard to verify this explanation empirically. Since data on the net order flow is not publicly available, there is not much research on this topic. Still, Liu and Tse (2017) argue that the market maker explanation is not sufficient. Their paper is focused on index futures which are traded on a market without market makers and they still find positive overnight returns and negative intraday returns. If the explanation turns out to be true, only a buying pressure could explain the positive overnight returns, because then the market maker tends to quote a higher opening price. Still, both a buying and selling pressure would be sufficient to explain the negative autocorrelation. However, at this time, it is unclear whether an imbalance exists and if so, what kind of imbalance.

2.3 Forecasting the intraday return

As was argued in the introduction, the weak form of the efficient market hypothesis rules out the possibility of forecasting financial returns. However, Heston, Korajczyk and Sadka (2010) were among the first to show that it is possible to forecast the intraday return of stocks on the NYSE for the period of 2001 until 2005. They find that the return in a half-hour interval predicts the return in the same interval on subsequent trading days. They even show that this predictability stays valid for a minimum of forty trading days.

Gao, Han, Li and Zhou (2017) show that the first half-hour return positively predicts the last half-hour return of a trading day for S&P 500 stocks in the period of 1993 until 2013. They show that this prediction becomes empirically stronger when adding the half-hour before the last half-hour. The predictions are more significant for trading days with a higher volatility and a lower trading volume (Gao et al., 2017).

Liu and Tse (2017) show insignificant results when predicting the last half-hour returns using the first half-hour returns. These results are robust to using the first half-hour as a single independent variable and to using it in combination with the overnight return and the return in the period between the first and last half-hour. In this last prediction model, the other two coefficients (overnight and period between first and last half-hour) are positive and significant.

Liu and Tse (2017) also show that the overnight return can be used to predict the first and last half-hour of the subsequent trading day. They show that the first half-hour is negatively predicted, with its estimator ranging from -0.366 to -0.017. In contrast, the last half-hour is positively predicted, with its estimator ranging from 0.030 to 0.089. They show that these predictions are significant both in-sample and out-of-sample. The estimators for the period between the first and last half-hour are mainly insignificant and hence can hardly be used to draw any conclusions.

2.4 Expectations for this dissertation

The research in this dissertation is twofold and consists of a univariate analysis on the overnight and intraday period and a forecasting analysis. The literature discussed in section 2.1 is convincing on the fact that the overnight return is positive on average, while the average intraday return is negative and sometimes insignificant. Moreover, it was shown that there is a reversal effect intraday, resulting in a negative autocorrelation. Hence, the following hypotheses arise:

• Hypothesis 1: The average overnight return is positive.

• Hypothesis 2: The average intraday return is negative.

• Hypothesis 3: The autocorrelation between the overnight and intraday return is negative.

The second part of this dissertation focuses on forecasting the intraday return, using overnight returns, as was partly described in section 2.3. The weak form of the efficient market hypothesis says that it should not be possible to predict future returns using historical data (Fama, 1970). However, based on existing literature in section 2.3, the expectation is still that it is possible to predict intraday returns. Since Liu and Tse (2017) are the only researchers who have investigated the predictability of intraday returns using overnight returns, the hypotheses are based on their paper. This results in the following hypotheses:

• Hypothesis 4: The overnight return predicts the first half-hour return with a negative coefficient.

• Hypothesis 5: The overnight return predicts the last half-hour return with a positive coefficient.

3. Methodology

This chapter discusses the methodology for investigating the hypotheses stated in section 2.4. The first section of this chapter describes the dataset that is used in this dissertation. The second section discusses the research method, which is again divided into a univariate analysis and forecasting the intraday return in-sample and out-of-sample.

3.1 Data

The dataset for this dissertation is a time series of the DAX 30, a blue chip stock market index consisting of 30 major German companies trading on the Frankfurt Stock Exchange (Appendix 7.1). The research focuses on Germany, because Germany has the largest GDP in the European Union (OECD, 2016) and fifteen of the thirty stocks in the DAX 30 are also listed in the Euro Stoxx 50 (Stoxx, 2017). The Euro Stoxx 50 is an index of 50 blue chip stocks which are the largest and most liquid in the Euro Area. Hence, analysing the stock market of Germany is highly relevant. Moreover, the focus on Germany is distinctive, because existing literature mainly focuses on stock markets in the United States.

The dataset consists of high-frequency data, with a five-minute interval on bid and ask prices, downloaded from Dukascopy, a Swiss bank which is specialized in providing Internet and mobile trading services. The data cover the period from October 1, 2015 until October 31, 2017: two years for the univariate and in-sample analysis, plus one month reserved for the out-of-sample analysis. Since most of the literature is focused on the last ten years of the 20th century and the first ten years of the 21st century, this dissertation is distinctive in focusing on a more recent period. The Frankfurt Stock Exchange is open five days a week from 09:00 until 17:30. Hence, the non-trading hours during the nights and weekends are deleted from the dataset by an algorithm written in VBA (Appendix 7.2.1). Moreover, an additional fourteen non-trading days are deleted because of (inter)national holidays (Appendix 7.2.2). After these deletions, 529 trading days remain with 103 observations per trading day, leading to a total of 54,487 observations. However, since this dissertation consists of both an in-sample and an out-of-sample analysis, the last month (October 2017) is omitted until the out-of-sample analysis. Therefore, a total of 509 trading days is used for the univariate analysis and the in-sample analysis.
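The observation counts above can be checked with a short sketch. The thesis performs this filtering in VBA (Appendix 7.2.1), which is not reproduced here; the Python functions below, including their names and the `holidays` argument, are hypothetical illustrations and assume that both the 09:00 and the 17:30 quotes are kept.

```python
from datetime import datetime, timedelta

def intraday_grid(open_time="09:00", close_time="17:30", step_minutes=5):
    """Return the timestamps (HH:MM) from open to close, both ends inclusive."""
    current = datetime.strptime(open_time, "%H:%M")
    end = datetime.strptime(close_time, "%H:%M")
    stamps = []
    while current <= end:
        stamps.append(current.strftime("%H:%M"))
        current += timedelta(minutes=step_minutes)
    return stamps

def is_trading_observation(timestamp, holidays=frozenset()):
    """Keep a quote only on a weekday that is not a holiday, within trading hours."""
    if timestamp.weekday() >= 5:          # Saturday or Sunday
        return False
    if timestamp.date() in holidays:      # assumed: a set of datetime.date objects
        return False
    return "09:00" <= timestamp.strftime("%H:%M") <= "17:30"

grid = intraday_grid()
print(len(grid))        # 103 observations per trading day
print(529 * len(grid))  # 54487 observations over 529 trading days
```

The 103 observations per day follow from 102 five-minute steps between 09:00 and 17:30 plus the opening quote.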

As was discussed in section 2.2.3, it might be possible that the bid-ask bounce has an influence on the differences between the overnight and intraday return. To make sure that the results in this dissertation are not influenced by the bid-ask bounce, the midpoint quote of the bid and ask price at close is used in this research.

3.2 Research method

3.2.1 Univariate analysis

As was already discussed in section 2.4, the research in this dissertation consists of two parts. The first part is a univariate analysis which focuses on the first three hypotheses. To compare the overnight and intraday returns, the returns are calculated for every trading day in-sample, leading to 509 returns in total. The overnight return is calculated as the relative difference between the opening price at 09:00 and the closing price at 17:30 on the previous trading day. The intraday return is calculated as the relative difference between the closing price at 17:30 and the opening price at 09:00 on the same trading day. The average returns are then tested on their significance with a Z-test. To check whether the average overnight return is higher than the average intraday return, both a 90% and a 95% confidence interval are constructed. The econometric background of the Z-test and the confidence intervals is elaborated in Appendix 7.3.1 and 7.3.2.

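• Overnight return:

\[ OR_t = \frac{P_{09:00,\,t} - P_{17:30,\,t-1}}{P_{17:30,\,t-1}}, \qquad \overline{OR} = \frac{1}{509} \sum_{t=1}^{509} OR_t \]

• Intraday return:

\[ IR_t = \frac{P_{17:30,\,t} - P_{09:00,\,t}}{P_{09:00,\,t}}, \qquad \overline{IR} = \frac{1}{509} \sum_{t=1}^{509} IR_t \]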
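As an illustration of the two return definitions, the following minimal sketch computes them from lists of daily opening (09:00) and closing (17:30) prices. The function names and the example prices are hypothetical. The `z_statistic` function assumes the usual large-sample form for testing a zero mean, the sample mean divided by its standard error; the exact test used in the thesis is specified in Appendix 7.3.1, which is not reproduced here.

```python
from math import sqrt
from statistics import mean, stdev

def overnight_returns(opens, closes):
    """OR_t = (open_t - close_{t-1}) / close_{t-1}; the first day has no overnight return."""
    return [(opens[t] - closes[t - 1]) / closes[t - 1] for t in range(1, len(opens))]

def intraday_returns(opens, closes):
    """IR_t = (close_t - open_t) / open_t for every trading day."""
    return [(c - o) / o for o, c in zip(opens, closes)]

def z_statistic(returns):
    """Assumed form: sample mean divided by its standard error (null hypothesis of zero mean)."""
    return mean(returns) / (stdev(returns) / sqrt(len(returns)))

# Hypothetical index levels for three consecutive trading days.
opens = [100.0, 102.0, 101.0]
closes = [101.0, 100.0, 103.0]

print(overnight_returns(opens, closes))
print(intraday_returns(opens, closes))
```

Because the overnight return on day t needs the close of day t-1, the overnight series is one observation shorter than the intraday series unless the close before the first sample day is also available.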

For the third hypothesis, the sample autocorrelation between the overnight and intraday return is calculated. To calculate this autocorrelation the sample variance-covariance matrix is constructed. The diagonal represents the variance of the overnight and intraday return, while the covariance between these two returns can be read from the off-diagonal elements. The sample autocorrelation can be calculated by dividing the sample covariance by the product of both standard deviations. To make sure that the estimators for the variance and covariance are unbiased, the denominator is corrected with one degree of freedom (Heij, De Boer, Franses, Kloek, & Van Dijk, 2004, p. 45). The autocorrelation is then tested on its significance with a t-test. The econometric background can be found in Appendix 7.3.3.

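\[ V = \frac{1}{508} \sum_{t=1}^{509} \begin{pmatrix} (OR_t - \overline{OR})^2 & (OR_t - \overline{OR})(IR_t - \overline{IR}) \\ (IR_t - \overline{IR})(OR_t - \overline{OR}) & (IR_t - \overline{IR})^2 \end{pmatrix} = \begin{pmatrix} s^2_{OR} & s_{OR,IR} \\ s_{IR,OR} & s^2_{IR} \end{pmatrix} \]

\[ \rho = \frac{s_{OR,IR}}{\sqrt{s^2_{OR}} \, \sqrt{s^2_{IR}}} \]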
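The variance-covariance matrix and the correlation derived from it can be sketched in a few lines. This is a minimal illustration with hypothetical function names; it uses the n - 1 denominator described in the text (508 for the 509 in-sample returns).

```python
from math import sqrt

def sample_covariance_matrix(x, y):
    """Return the 2x2 sample variance-covariance matrix with an n - 1 denominator."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    s_yy = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    return [[s_xx, s_xy], [s_xy, s_yy]]

def sample_correlation(x, y):
    """Covariance divided by the product of both standard deviations."""
    v = sample_covariance_matrix(x, y)
    return v[0][1] / (sqrt(v[0][0]) * sqrt(v[1][1]))

# Perfectly opposite series give a correlation of -1.
print(sample_correlation([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))  # -1.0
```

The off-diagonal elements are equal by construction, so only one covariance needs to be computed.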

Before moving on to the second part of this dissertation, a distribution analysis is performed on both the overnight and intraday period. The main focus of this analysis is the effect of potential outliers on the results.

3.2.2 Forecasting the intraday return

Data transformation

The second part of this dissertation consists of both an in-sample and an out-of-sample analysis and focuses on forecasting the intraday returns using overnight returns. First, the original dataset with close prices is divided into four subsequent periods: the overnight period, the first half-hour of the trading day, the period between the first and last half-hour and the last half-hour itself. The prices in these periods are then tested on stationarity with the augmented Dickey-Fuller test, with both a trend and an intercept. However, Appendix 7.3.4 shows that non-stationarity is not rejected for any of the four series.

According to Heij et al. (2004, p. 647), the variables have to be transformed to stationary series to prevent possible nonsense correlations or spurious regressions. Moreover, stationary series provide the possibility of testing coefficients with an ordinary t-test. A possible solution, suggested by Shang (2017), is transforming the close prices to log returns.

\[ \text{logReturn}_t = \log\left( \frac{P_{t-i}}{P_t} \right) = \log(P_{t-i}) - \log(P_t) \tag{3.1} \]

Shang (2017) argues that using this transformation leads to stationary time series. Since both the intercept and the trend are now removed from the series, the augmented Dickey-Fuller test on these log returns is performed without an intercept and trend. The second part of Appendix 7.3.4 shows that all series are now stationary at a 1% significance level.

Estimating basic regression models and other robustness tests

Now that the series in the dataset have been shown to be stationary, the relations can be modelled. The basic regression models of Liu and Tse (2017) are used as a starting point for this in-sample analysis. Since the overnight return is used to explain the intraday return, which is subdivided into three periods, a total of three basic regression models are estimated. This leads to the following general prediction model:

\[ IR_t = \alpha + \beta \, OR_t + \varepsilon_t \tag{3.2} \]
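For a single regressor, the least-squares estimates of the intercept and slope in this model have a closed form. The sketch below is a plain illustration of that closed form, not the estimation routine used in the thesis; the function name and the example data are hypothetical.

```python
def ols_simple(x, y):
    """Estimate y = alpha + beta * x + error by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta = s_xy / s_xx                 # slope: covariance over variance of x
    alpha = mean_y - beta * mean_x     # intercept: line passes through the means
    residuals = [b - alpha - beta * a for a, b in zip(x, y)]
    return alpha, beta, residuals

# Hypothetical overnight and first half-hour log returns for five days.
overnight = [0.0020, -0.0010, 0.0005, -0.0030, 0.0015]
first_half_hour = [0.0003, -0.0002, 0.0001, -0.0004, 0.0002]

alpha, beta, residuals = ols_simple(overnight, first_half_hour)
print(alpha, beta)
```

The residuals returned here are the series on which tests for heteroskedasticity and serial correlation would subsequently be run.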

However, Hansen (1995) argued that testing the stationarity of individual variables in univariate time series often has low power and hence is not sufficient. According to him, the power can be increased by taking into account important information in multivariate time series. This can be done by performing a Covariate Augmented Dickey-Fuller test on the estimated relations. Hence, as a final check, the three basic regression models are tested on stationarity in Appendix 7.3.5. The results show somewhat smaller test statistics than the ordinary ADF test, but they are still significant and hence the relations are stationary as well.

These three basic regression models are only used as a starting point. As an extension to the paper of Liu and Tse (2017), this dissertation attempts to extend and improve the results. Therefore, the basic regression models are extended with the one-lagged intraday periods. This means that in total seven additional models are tested for every intraday period. After adding these one-lagged intraday periods, it is checked whether there is a significant improvement in loglikelihood in comparison to the basic regression models. The estimation output is displayed in the results only if the extended model turns out to be a significant improvement according to the Likelihood Ratio Test (Appendix 7.3.7).
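The comparison of loglikelihoods can be sketched as follows for the simplest case, in which the extended model adds exactly one regressor, so that the LR statistic is compared with a chi-square distribution with one degree of freedom. The function name and the loglikelihood values are hypothetical; extensions that add more than one lagged period would need a chi-square distribution with correspondingly more degrees of freedom.

```python
import math


def lr_test_one_restriction(loglik_basic, loglik_extended):
    """LR statistic and p-value when the extended model adds one regressor."""
    lr = 2.0 * (loglik_extended - loglik_basic)
    # A chi-square(1) variable is a squared standard normal variable,
    # so P(X > lr) = erfc(sqrt(lr / 2)).
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return lr, p_value


# Hypothetical loglikelihood values, for illustration only.
lr_stat, p_value = lr_test_one_restriction(2270.0, 2274.5)
significant_at_5pct = lr_stat > 3.841  # chi-square(1) critical value at 5%
print(lr_stat, p_value, significant_at_5pct)
```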

To extend the distribution analysis of the univariate analysis, the regression models are estimated both with and without outliers. Brownlees and Gallo (2006) argue for deleting observations that lie more than three standard deviations away from the mean. However, this would lead to a deletion of 30 observations, which is almost 6% of the sample. Therefore, it is decided to use the full sample in the remaining in-sample and out-of-sample analysis.
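The three-standard-deviation rule can be sketched as below. This is a minimal example under the assumption that the mean and the sample standard deviation are computed once on the full series; the function name, the default threshold argument and the data are hypothetical.

```python
from statistics import mean, stdev


def drop_outliers(returns, k=3.0):
    """Keep observations within k sample standard deviations of the mean."""
    mu, sd = mean(returns), stdev(returns)
    return [r for r in returns if abs(r - mu) <= k * sd]


# Hypothetical returns with one extreme observation, for illustration only.
sample = [0.1, -0.1] * 10 + [5.0]
cleaned = drop_outliers(sample)
print(len(sample), len(cleaned))
```

Changing `k` gives the alternative thresholds that are considered in the distribution analysis.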

After estimating the regression models, other tests for robustness are needed to make sure that the results can be interpreted in the usual way. One of the assumptions of the Classical Linear Regression Model (CLRM) is homoskedasticity, which means that the disturbance variance is constant over time. This is tested with a Breusch-Pagan-Godfrey test for heteroskedasticity. Moreover, a Breusch-Godfrey test is used to check for potential serial correlation between the disturbance terms. When one of these tests rejects at a 5% significance level, the ordinary standard errors are replaced by Huber-White or Newey-West standard errors. The outcomes of these tests are not displayed, but a comment is made when one of the above-mentioned standard errors is used.

Another important assumption of the CLRM is exogeneity, which requires the explanatory variable to be uncorrelated with the error term. The explanatory variable in this thesis is either the overnight return or the lagged intraday return. Hence, these regressors belong to the information set $Y_{t-1}$ and are thus exogenous, which is shown as follows (Heij et al., 2004, p. 538):

$$\begin{aligned}
\mathrm{Cov}(y_{t-k}, \varepsilon_t) &= E[(y_{t-k} - \mu)(y_t - E[y_t \mid Y_{t-1}])] \\
&= E[(y_{t-k} - \mu) y_t] - E[E[(y_{t-k} - \mu) y_t \mid Y_{t-1}]] \\
&= E[(y_{t-k} - \mu) y_t] - E[(y_{t-k} - \mu) y_t] = 0
\end{aligned}$$

Extension to basic regression models

To check whether the estimated basic regression models capture the relations between the overnight and intraday period well, the residuals of the models are analysed. A Jarque-Bera test for normality is used to check whether the residuals follow a normal distribution (Appendix 7.3.6). Since the log returns almost seem to follow a standard normal distribution, the ultimate models are also tested on this with a Kolmogorov-Smirnov test (Appendix 7.3.9). Moreover, when a model is specified well, the residuals tend to follow a white noise process, which is defined by a zero mean, constant variance and zero autocorrelation (Heij et al., 2004, p. 537). Heij et al. argue that this white noise process can be tested with the Ljung-Box test, or equivalently with the Q-statistics (Heij et al., 2004, p. 365).

In order to use these Q-statistics to test the residuals for a white noise process, the variance of the residuals must be constant (Heij et al., 2004, p. 621). However, they argue that this is usually not the case in financial time series due to volatility clustering. This conditional heteroskedasticity can be observed by analysing the squared residuals. Either significant Q-statistics or a significant LM statistic in the ARCH LM-test indicates conditional heteroskedasticity. The econometric background of this ARCH LM-test is shown in Appendix 7.3.8. Only if there is not enough evidence for conditional heteroskedasticity can the residuals be tested for a white noise process with the method mentioned in the previous paragraph.
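The Q-statistic on the squared residuals can be sketched with the standard Ljung-Box formula, Q = n(n+2) Σ ρ_k² / (n − k), summed over the lags k = 1, ..., m. The function name and the residual values below are hypothetical, and the sketch does not reproduce the exact output of the software used in this dissertation.

```python
def ljung_box_q(series, max_lag):
    """Ljung-Box Q-statistic of a series for lags 1..max_lag."""
    n = len(series)
    m = sum(series) / n
    dev = [x - m for x in series]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k.
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# Hypothetical residuals with a calm and a turbulent stretch.
residuals = [0.001, -0.002, 0.001, 0.000, -0.001, 0.009, -0.011, 0.010, -0.008, 0.012]
squared = [e * e for e in residuals]
q_squared = ljung_box_q(squared, 3)
# Under the null of no autocorrelation, Q for three lags is compared with
# the chi-square(3) critical value of 7.815 at the 5% level.
print(q_squared)
```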

So, Heij et al. (2004, p. 621) argue that conditional heteroskedasticity is likely to be present in financial time series. Therefore, as a further extension to the basic regression models, the relations are tested for conditional heteroskedasticity. According to Heij et al. (2004), this conditional heteroskedasticity indicates non-linearities, but it is not known which model should be used. In this dissertation, three different solutions are used and compared.

First, the non-linearities might indicate time-varying parameters. Therefore, the basic regression models are extended to a threshold model and checked for an improvement in loglikelihood after adding a dummy variable. Since there is no clear-cut breakpoint in the dataset, it is decided that the value of this dummy depends on the value of the overnight log return. The dummy has a value of one when the overnight log return is non-negative and zero otherwise.
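The construction of the threshold dummy can be sketched as follows. The interaction of the dummy with the overnight log return is included because the threshold models in the results add such an interaction term; the function name and the data are hypothetical.

```python
def add_threshold_terms(overnight_returns):
    """Return (return, dummy, dummy * return) for every observation.

    The dummy equals 1 when the overnight log return is non-negative
    and 0 otherwise.
    """
    rows = []
    for r in overnight_returns:
        dummy = 1 if r >= 0 else 0
        rows.append((r, dummy, dummy * r))
    return rows


# Hypothetical overnight log returns, for illustration only.
for row in add_threshold_terms([0.012, -0.004, 0.0, -0.015]):
    print(row)
```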

Next to time-varying parameters, the conditional heteroskedasticity could be solved by modelling the variance. This can be done either by adding an explicit explanatory variable for the volatility or by modelling the variance of the innovation term with a GARCH process (Heij et al., 2004, p. 621). The former is accomplished by summing all five-minute intraday absolute returns, because Forsberg and Ghysels (2007) show that this measure is a good predictor of the volatility. This measure for volatility is then added to the basic regression models and checked for its contribution to the loglikelihood.
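The volatility measure described above can be sketched as the sum of absolute five-minute log returns within one intraday period. The function name and the prices are hypothetical, and the input is assumed to be a list of consecutive five-minute prices for the period concerned.

```python
import math


def realized_abs_volatility(prices):
    """Sum of absolute log returns between consecutive five-minute prices."""
    return sum(abs(math.log(p1 / p0)) for p0, p1 in zip(prices, prices[1:]))


# Hypothetical five-minute index levels for a first half-hour (7 prices, 6 returns).
first_half_hour_prices = [12000.0, 12012.0, 12005.0, 12020.0, 12018.0, 12030.0, 12025.0]
print(realized_abs_volatility(first_half_hour_prices))
```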

The latter is just an alternative estimation. To find the right GARCH-model, all models up to GARCH(5,5) are tried to capture potential weekly effects. Eventually, the model with the lowest AIC and SIC is shown in the results. The GARCH processes are optimized with the Broyden-Fletcher-Goldfarb-Shanno algorithm and the Marquardt iterative algorithm.

A possible extension to these volatility and GARCH models is leveling the returns with ARMA terms. To find the right ARMA model, an ARMA(5,5) model is used as a starting point, again to capture potential weekly effects. The coefficients in this model are then gradually deleted, based on their significance. This process is stopped when all ARMA terms are significant at a 5% two-sided significance level. The combination of ARMA and GARCH modelling is known as ARMA(p,q)-GARCH(r,s) and can be presented in the following way:

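$$y_t = \alpha + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t \qquad (3.3)$$

where

$$\sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \dots + \alpha_r \varepsilon_{t-r}^2 + \beta_1 \sigma_{t-1}^2 + \dots + \beta_s \sigma_{t-s}^2, \qquad \varepsilon_t = \sigma_t \xi_t \ \text{ with } \ \xi_t \sim \mathrm{IID}(0, 1)$$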
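To make the variance equation concrete, the sketch below runs the conditional variance recursion for the special case with one lagged squared innovation and one lagged variance, given already known parameter values. It does not estimate the parameters; the function name, the parameter values and the start value (the unconditional variance, which requires alpha + beta < 1) are assumptions made for illustration.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances sigma_t^2 for a GARCH process with one
    lagged squared innovation and one lagged variance term."""
    # Assumed start value: the unconditional variance (needs alpha + beta < 1).
    sigma2 = [omega / (1.0 - alpha - beta)]
    for e in residuals[:-1]:
        sigma2.append(omega + alpha * e * e + beta * sigma2[-1])
    return sigma2


# Hypothetical residuals and parameters, for illustration only.
eps = [0.004, -0.012, 0.009, -0.001, 0.002]
variances = garch11_variance(eps, omega=1e-6, alpha=0.10, beta=0.85)
print(variances)
```

A large innovation raises the next conditional variance, which then decays at rate beta; this is how the recursion reproduces volatility clustering.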

Out-of-sample analysis

As was already pointed out in the description of the dataset, the last month of the dataset is used for the out-of-sample analysis. This month is omitted in the in-sample analysis and is only used to check the out-of-sample forecast performance of the estimated regression models. The period of October 1, 2015 until September 30, 2017 is used as estimation sample which consists of 509 observations. The last month, from October 1, 2017 until October 31, 2017 is the prediction sample and consists of 20 observations. The basic linear models are tested on a structural break at October 1, 2017, using a Chow Break test. The results of these tests are displayed in Appendix 7.3.10.

The out-of-sample analysis is performed with a static forecasting method, which calculates one-step-ahead forecasts using only the actual, rather than the forecasted, values. The basic regression models are compared to the ultimate volatility-ARMA and ARMA-GARCH models. The out-of-sample forecast performance is compared on the basis of the Root Mean Squared Prediction Error (RMSPE) and the Mean Absolute Prediction Error (MAPE). Models can only be compared when they have the same dependent variable, and a lower value of RMSPE and MAPE is preferred. Moreover, forecast graphs and 95% forecast intervals are compared and discussed.
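The two forecast criteria can be sketched as below. Note that MAPE here denotes the Mean Absolute Prediction Error, as defined above, not a percentage error. The function names and the data are hypothetical.

```python
import math


def rmspe(actual, forecast):
    """Root Mean Squared Prediction Error."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mape(actual, forecast):
    """Mean Absolute Prediction Error (in the units of the series)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical realised and one-step-ahead forecasted log returns.
realised = [0.0012, -0.0007, 0.0003, 0.0015]
predicted = [0.0002, 0.0001, -0.0001, 0.0004]
print(rmspe(realised, predicted), mape(realised, predicted))
```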

4. Results

In this chapter, the results of the research described in the third chapter are presented and analysed. The first section describes the univariate analysis, which consists of a comparison of the overnight and intraday period and a distribution analysis of both periods. In the second section, the in-sample and out-of-sample regressions are shown.

4.1 Univariate analysis

4.1.1 The overnight and intraday period

Table 4.1: Overnight versus intraday return

| | Average | St. dev. | St. dev. of average | Z-stat. |
|---|---|---|---|---|
| Overnight return | 0.0535% | 0.7570% | 0.0336% | 1.592∗ |
| Intraday return | 0.0096% | 0.8313% | 0.0368% | 0.261 |

∗ for 10% significance level, ∗∗ for 5% significance level, ∗∗∗ for 1% significance level (one-sided)

In Table 4.1, the average and standard deviation of the overnight and intraday return of the DAX 30 are displayed for the period of October 1, 2015 until October 1, 2017. The results show an average overnight return of 0.0535%, while the average intraday return is 0.0096%. The average overnight return is only significantly larger than zero at a 10% significance level. The average intraday return shows an even smaller z-statistic which indicates that this average is insignificant. As was described in section 2.1, Berkman et al. (2012) showed a significant positive overnight return and an either significant negative or insignificant intraday return. Although the results are not that convincing, they still tend to follow the existing literature. A larger sample or deleting some outliers might improve the results. Based on these first results, the results are at a 10% significance level in line with hypothesis one, but not with hypothesis two.
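The z-statistics in Table 4.1 can be reproduced from the reported averages and standard deviations, assuming the full sample of 509 observations and the usual standard error of a mean. The function name is hypothetical; small differences in the third decimal arise because the table uses the rounded standard deviation of the average.

```python
import math
from statistics import NormalDist


def mean_z_stat(average, st_dev, n):
    """z-statistic for the null hypothesis that the mean return is zero."""
    return average / (st_dev / math.sqrt(n))


z_overnight = mean_z_stat(0.0535, 0.7570, 509)  # about 1.59
z_intraday = mean_z_stat(0.0096, 0.8313, 509)   # about 0.26

# One-sided critical values of the standard normal distribution.
crit_10 = NormalDist().inv_cdf(0.90)
crit_5 = NormalDist().inv_cdf(0.95)
print(z_overnight > crit_10, z_overnight > crit_5, z_intraday > crit_10)
```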

Although the average overnight return is more than five times the size of the average intraday return, it remains doubtful whether the overnight return is truly higher than the intraday return, given the large standard deviations. Therefore, 90% and 95% confidence intervals are calculated for both average returns. As can be seen from Table 4.2 and Figure 4.1, the intervals of the two averages overlap. Hence, it is not possible to conclude that the average overnight return is higher than the average intraday return.
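The confidence intervals can be reproduced, up to rounding, from the averages and standard deviations in Table 4.1, assuming normal critical values and 509 observations. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(average, st_dev, n, level):
    """Two-sided normal confidence interval for a mean return (in %)."""
    se = st_dev / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return average - z * se, average + z * se


def intervals_overlap(a, b):
    """True when two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


overnight_90 = mean_confidence_interval(0.0535, 0.7570, 509, 0.90)
intraday_90 = mean_confidence_interval(0.0096, 0.8313, 509, 0.90)
print(overnight_90, intraday_90, intervals_overlap(overnight_90, intraday_90))
```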

Although there is no conclusive evidence that the overnight return is higher than the intraday return, these first results point in that direction. In section 2.2.2, volatility was discussed as one of the possible explanations for a higher overnight return. The results show that the overnight standard deviation is about 0.07 percentage points lower than the intraday standard deviation. This is in line with the results of, among others, Cooper et al. (2008).

Table 4.2: Confidence intervals

| | Lower bound | Upper bound |
|---|---|---|
| 90%-CI Overnight | -0.0018% | 0.1088% |
| 90%-CI Intraday | -0.0509% | 0.0701% |
| 95%-CI Overnight | -0.0124% | 0.1194% |
| 95%-CI Intraday | -0.0625% | 0.0817% |

Figure 4.1: Confidence intervals

Finally, to check the third hypothesis, the autocorrelation between the overnight and intraday return and its t-statistic are displayed in Table 4.3. The autocorrelation observed for this sample is -0.00762. The t-statistic is only -0.1716, which is highly insignificant. Therefore, it is not possible to conclude that there is a negative autocorrelation, or even a significant autocorrelation, between the overnight and intraday return. Hence, the result contrasts both with existing literature, such as Berkman et al. (2012) and Branch and Ma (2012), and with the third hypothesis from section 2.4.

Table 4.3: Autocorrelation between overnight and intraday return

| | Correlation | T-stat. |
|---|---|---|
| Overnight, Intraday | -0.00762 | -0.1716 |
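The t-statistic in Table 4.3 is consistent with the standard test statistic for a correlation coefficient, t = r√(n − 2) / √(1 − r²), evaluated at n = 509. The sketch below assumes that this is the formula behind the reported value; the function name is hypothetical.

```python
import math


def correlation_t_stat(r, n):
    """t-statistic for the null hypothesis of zero correlation."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


t_full_sample = correlation_t_stat(-0.00762, 509)
print(round(t_full_sample, 4))
```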

4.1.2 Distribution analysis

In this section, a deeper analysis of the distribution of the overnight and intraday returns is per-formed. Table 4.4 shows the remaining descriptive statistics for the overnight and intraday return.

Table 4.4: Remaining descriptive statistics overnight and intraday return

| | Min. | Max. | Skew. | Kurt. | JB-stat. |
|---|---|---|---|---|---|
| Overnight | -6.7534% | 2.3657% | -1.59 | 16.66 | 4169.69 |
| Intraday | -3.5070% | 4.6581% | -0.02 | 6.91 | 324.32 |

The overnight return has a negatively skewed distribution with a skewness of -1.59. The thickness of the tails is measured with the kurtosis which is 16.66. This indicates that the distribution has a long left tail which is also confirmed by the histogram in Appendix 7.4.1. As can be seen from the histogram, this left tail is mainly caused by the outlier -6.7534%. A normal distribution has a skewness of zero and a kurtosis of 3. Whether a distribution follows a normal distribution can be tested by the Jarque-Bera test statistic (Appendix 7.3.6). The test statistic is equal to 4169.69 which leads to a clear rejection of the normality hypothesis.
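The Jarque-Bera statistic combines skewness S and kurtosis K as JB = n/6 · (S² + (K − 3)²/4) and is compared with a chi-square distribution with two degrees of freedom, whose 5% critical value is 5.991. The sketch below plugs in the rounded values from Table 4.4 with n = 509, so it matches the reported statistics only approximately; the function name is hypothetical.

```python
import math


def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic from sample size, skewness and kurtosis."""
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


jb_overnight = jarque_bera(509, -1.59, 16.66)
jb_intraday = jarque_bera(509, -0.02, 6.91)

# For a chi-square(2) distribution the p-value is exp(-JB / 2).
p_intraday = math.exp(-jb_intraday / 2.0)
print(jb_overnight, jb_intraday, p_intraday)
```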

As already mentioned in the methodology, Brownlees and Gallo (2006) suggest deleting observations that lie at least three standard deviations away from the mean. Table 7.1 in Appendix 7.4.1 shows four such outlier detections. The results show that the overnight returns follow a normal distribution when observations more than 2.327 standard deviations from the mean are deleted, which corresponds to 1% in each tail. This can be achieved by deleting fifteen of the 509 observations, leaving 494 observations. Table 7.2 shows that the average overnight return changes to 0.0659%, which is now significantly larger than zero at a 1% significance level. Thus, after deleting fifteen outliers from the sample, the average overnight return is more in line with existing literature and the first hypothesis from section 2.4.

Table 7.1 shows that the intraday return has a more symmetric distribution with a skewness of only -0.02. The kurtosis is equal to 6.91 which indicates that the tails of the intraday returns are thinner than for the overnight returns. This can also be seen from Figure 7.3 in Appendix 7.4.2. The skewness and kurtosis are already somewhat closer to the characteristics of a normal distribution which is also confirmed by the JB-test statistic of 324.32. However, this test statistic is still a clear rejection of the hypothesis of normality.

Table 7.3 shows once more that the intraday returns become normally distributed when observations that lie at least 2.327 standard deviations from the mean are deleted. This can be achieved by deleting 19 observations, resulting in 490 remaining observations. Surprisingly, the average intraday return increases to 0.0236%. However, this is still not significant at either the 5% or the 10% significance level. Hence, the result is still in contrast with the second hypothesis from section 2.4.

Since not all of the outliers for the overnight period coincide with the outliers for the intraday period, both outlier deletions are combined in Appendix 7.4.3. This combination leads to a sample of 476 observations. According to Table 7.5, both returns now follow a normal distribution with a JB-statistic of, respectively, 1.58 and 5.68. Figures 7.5 and 7.6 show the histogram of both returns.

Table 4.5: Overnight and intraday return after outlier deletion (n=476)

| | Average | St. dev. | St. dev. of average | T-stat. |
|---|---|---|---|---|
| Overnight return | 0.0663% | 0.5858% | 0.0269% | 2.469∗∗∗ |
| Intraday return | 0.0223% | 0.6568% | 0.0301% | 0.741 |

Table 4.5 shows that the average overnight return is now 0.0663% and is again significantly higher than zero at a 1% significance level. The intraday return is 0.0223%, but is not significantly different from zero. Therefore, the results found in this dissertation are in line with most existing literature of a significant positive overnight return and an insignificant intraday return. However, only the first hypothesis of section 2.4 can be confirmed.

In Table 7.6 and Figure 7.7, the confidence intervals are once more calculated and shown in a graph. They still overlap and hence, it is not possible to say that the average overnight return is higher than the intraday return. The overnight standard deviation is still 7 basis points lower than the intraday standard deviation.

Finally, the autocorrelation between both returns has now, surprisingly, increased to 0.04589 with a t-statistic of 1 which is not significant. So instead of becoming more in line with the negative autocorrelation of Branch and Ma (2012), it even increased. Hence, the results are not in line with literature and the third hypothesis in this dissertation.

4.2 Forecasting the intraday return

4.2.1 In-sample analysis

Basic regression models

Table 4.6: Estimation output of the basic regression models with the intraday period divided into the first half-hour, the period in-between and the last half-hour. The regression models are estimated for the full sample of 509 observations, including outliers.

| Variable | (1) log(IR_first30) | (2) log(IR_first30) | (3) log(IR_between) | (4) log(IR_last30) |
|---|---|---|---|---|
| Constant | 4.28E-05 | 1.70E-05 | -9.69E-05 | 0.000120 |
| | (0.000127a) | (0.000125a) | (0.000335a) | (8.38E-05a) |
| log(OR) | 0.023198 | 0.027128 | -0.030956 | -0.000324 |
| | (0.025696) | (0.025169) | (0.056500) | (0.013489) |
| log(IR_last30)(-1) | | 0.292398∗∗ | | |
| | | (0.102040) | | |
| AIC/SC | -8.914/-8.897 | -8.954/-8.929 | -6.957/-6.941 | -9.723/-9.706 |
| N | 509 | 509 | 509 | 509 |

a Huber-White SEs between parentheses

∗ for 5% significance level, ∗∗ for 1% significance level (two-sided)

Table 4.7: Estimation output of the four basic regression models, excluding observations that lie at least three standard deviations away from the average.

| Variable | (1) log(IR_first30) | (2) log(IR_first30) | (3) log(IR_between) | (4) log(IR_last30) |
|---|---|---|---|---|
| Constant | -1.34E-05 | -3.78E-05 | 9.31E-05 | 0.000162∗∗ |
| | (0.000104a) | (0.000105a) | (0.000289) | (6.60E-05a) |
| log(OR) | 0.038625∗ | 0.039080∗ | 0.018729 | 0.003474 |
| | (0.019478) | (0.018722) | (0.045757) | (0.011069) |
| log(IR_last30)(-1) | | 0.184901∗ | | |
| | | (0.091819) | | |
| AIC/SC | -9.316/-9.298 | -9.333/-9.307 | -7.308/-7.290 | -10.222/-10.204 |
| N | 479 | 479 | 479 | 479 |

a Huber-White SEs between parentheses

∗ for 5% significance level, ∗∗ for 1% significance level (two-sided)

Tables 4.6 and 4.7 show the four basic regression models for this dissertation. The first, third and fourth columns contain the three models from Liu and Tse (2017), with the overnight log return as sole explanatory variable. In the second column, the one-lagged last half-hour log return is used as an additional variable for the first half-hour log return. Adding this one-lagged last half-hour log return to the first model turned out to be the only significant contribution to the loglikelihood. Hence, no other models are estimated in this dissertation. The results of the LR test for this second model can be found in Appendix 7.3.7.

The four regression models in Table 4.6 are estimated with the full sample of 509 observations. All models are estimated with Huber-White standard errors to correct for heteroskedasticity. The first model shows a positive coefficient for the overnight log return, but the coefficient is insignificant at a 5% level. The coefficient remains insignificant in the second model, but the coefficient for the one-lagged last half-hour log return is positive and significant at a 1% significance level. The coefficient indicates that a one percent increase in the last half-hour return of the previous trading day leads to an average increase of 0.29% in the subsequent first half-hour return. The third and fourth model both show a negative and insignificant coefficient for the overnight log return.

Table 4.7 shows the estimation output for the same regression models, but after excluding the 30 outliers that lie more than three standard deviations away from the average. For the first, second and fourth model, Huber-White standard errors are used. The first model shows a positive coefficient for the overnight log return which is now significant at a 5% level. This coefficient increases only slightly in the second model and remains significant at a 5% level. The coefficient of the one-lagged last half-hour log return has become less significant, but is still significant at a 5% level. Surprisingly, the coefficients for the overnight log return in the third and fourth model are now positive, but they are still insignificant at a 5% level.

When comparing Tables 4.6 and 4.7, it is observed that, for the first and second model, the overnight coefficient becomes significant when the outliers are excluded. Moreover, its magnitude increases by respectively 66% and 44%. Conversely, the lagged last half-hour coefficient loses significance and decreases by roughly 37%, from 0.292 to 0.185. The overnight coefficient in models 3 and 4 switches sign, but is highly insignificant in both models. Since the results are rather volatile and insignificant, it is too early to draw any conclusions.

In the rest of this chapter, the analysis is continued with the full sample. So far, the results of these regressions have not been that satisfactory. Therefore, a short analysis of the residuals is performed. Although the correlogram of the residuals shows insignificant Q-statistics, the correlogram of the squared residuals shows significant Q-statistics at a 1% level from the second lag term onwards. Moreover, ARCH LM-tests are performed for all four regression models, up to lag term three. The null hypothesis of conditional homoskedasticity is rejected at a 5% significance level for all twelve tests. As was already discussed in the methodology, there are three different solutions to correct for conditional heteroskedasticity. The results of these solutions are discussed in the next subsection.

Solutions for conditional heteroskedasticity

The previous subsection showed that there is enough evidence for conditional heteroskedasticity in the four basic regression models. A first possible solution is using a threshold model by introducing a dummy variable. The dummy has a value of one when the overnight log return is non-negative and zero otherwise. The four basic regression models are extended by adding an interaction term of the explanatory variable(s) with the dummy variable. The improvement of these models is based on the contribution to the loglikelihood. Whether this contribution is significant is decided with an LR-test, and the results can be found in Appendix 7.3.7. The LR-tests on the threshold models do not show any significant improvements. Hence, it can be concluded that the non-linearities in the residuals do not find their origin in time-varying parameters.

A second and third possibility are modelling the variance, either by incorporating an explicit variable for the volatility or by estimating the variance of the innovation term with a GARCH process. As was discussed in the methodology, the volatility is calculated by summing the five-minute absolute returns of the intraday period concerned. The volatility that is added to the models is only the volatility for the intraday period that is used as dependent variable. In contrast to the results of the LR-tests on the threshold models, the results for these models are all significant at a 5% significance level. Hence, the estimation output of the four so-called volatility models is shown in Table 4.8. Moreover, the estimation output of the four GARCH models is presented in Table 4.9.

Table 4.8: Estimation output with volatility as additional explanatory variable. The volatility is the summation of the five-minute absolute returns for the intraday period concerned. For example, the volatility in the first model is the volatility in the first half-hour.

| Variable | (1) log(IR_first30) | (2) log(IR_first30) | (3) log(IR_between) | (4) log(IR_last30) |
|---|---|---|---|---|
| Constant | -0.000489 | -0.000579∗ | 0.003689∗∗ | 0.000359∗ |
| | (0.000227a) | (0.000268a) | (0.000857a) | (0.000165a) |
| log(OR) | 0.033016 | 0.038331 | -0.071425 | -0.000906 |
| | (0.023058) | (0.022172) | (0.054628) | (0.013825) |
| log(IR_last30)(-1) | | 0.304183∗∗ | | |
| | | (0.097004) | | |
| Volatility | 0.117740 | 0.131725 | -0.077213∗∗ | -0.081258 |
| | (0.078517) | (0.075265) | (0.020405) | (0.074687) |
| AIC/SC | -8.926/-8.901 | -8.971/-8.938 | -7.024/-6.999 | -9.731/-9.706 |
| N | 509 | 509 | 509 | 509 |

a Huber-White SEs between parentheses

∗ for 5% significance level, ∗∗ for 1% significance level (two-sided)

Table 4.9: Estimation output after modelling the variance of the innovation term with an appropriate GARCH(r,s) process. In the first row, below the dependent variable, the used GARCH process can be found.

| Variable | (1) log(IR_first30) | (2) log(IR_first30) | (3) log(IR_between) | (4) log(IR_last30) |
|---|---|---|---|---|
| GARCH process | GARCH(1,0) | GARCH(1,0) | GARCH(1,1) | GARCH(1,1) |
| Constant | 4.17E-05 | 1.47E-05 | 8.72E-05 | 3.51E-05 |
| | (0.000126) | (0.000127) | (7.58E-05) | (4.72E-05) |
| log(OR) | 0.027603∗∗ | 0.031150∗∗ | -0.090690∗ | -0.016944∗ |
| | (0.009636) | (0.009245) | (0.035437) | (0.007370) |
| log(IR_last30)(-1) | | 0.282449∗∗ | | |
| | | (0.044686) | | |
| AIC/SC | -8.936/-8.903 | -8.967/-8.925 | -7.129/-7.088 | -10.217/-10.175 |
| N | 509 | 509 | 509 | 509 |

∗ for 5% significance level, ∗∗ for 1% significance level (two-sided)

Table 4.8 shows that, after adding the volatility of the first half-hour, the coefficient of the overnight log return is still insignificant at a 5% significance level. However, when compared to the basic model in Table 4.6, its magnitude has increased by roughly 42%. This coefficient increases by 0.005 in the second model, but remains insignificant. The coefficient for the one-lagged last half-hour log return is positive and significant, and it shows approximately the same magnitude as in Table 4.6: 0.30. Although the LR-tests result in test statistics of, respectively, 8.10 and 10.58, the coefficients of volatility are insignificant in both models. The coefficients of the overnight log return remain insignificant in the third and fourth model. Both coefficients are still negative, as was the case in Table 4.6, but the magnitude has more than doubled in the third model and almost tripled in the fourth model.

Table 4.9 shows the estimation output of the four GARCH models; the GARCH process used can be found in the first row of the table. The coefficient of the overnight log return is now significant at a 1% significance level in both the first and second model. The magnitude seems to lie around 0.029, which indicates that a 1% increase in the overnight return leads on average to an increase of 0.029% in the subsequent first half-hour return. As was the case in the previous results, the coefficient of the one-lagged last half-hour log return is positive and significant. The third and fourth model still show a negative coefficient for the overnight log return. However, it is noteworthy that both coefficients are now significant at a 5% significance level.

The GARCH models in Table 4.9 show more promising results than the volatility models. This can also be observed from the AIC and SIC, which are on average lower in the GARCH models than in the basic and volatility models. However, when comparing the squared residuals, both the volatility models and the GARCH models show significant Q-statistics from the second lag term onwards, which indicates that the conditional heteroskedasticity is still present. Additionally, the ARCH LM-test still rejects for the second and third lag term.

So far, the relations are modelled with basic regression models and with three different representations that attempt to take care of conditional heteroskedasticity. The threshold models did not give any significant improvement in loglikelihood, but the volatility models did. The GARCH models show on average the lowest AIC and SIC. The results show a positive effect of the overnight log return on the first half-hour log return and a negative effect on the last half-hour log return. These results are exactly opposite to the findings of Liu and Tse (2017) and to the fourth and fifth hypothesis in this dissertation. However, it seems that there is still conditional heteroskedasticity and apparently, just estimating a volatility or GARCH model is not sufficient to correct for this. Hence, the models in the next subsection are further extended by adding appropriate AR and MA terms.

Extension to ARMA modelling

A final effort to take care of the conditional heteroskedasticity is adding AR and MA terms. The appropriate ARMA models are shown in Appendix 7.5. The AR and MA terms that are used in those models are also used in the volatility-ARMA models and the ARMA-GARCH models. The estimation output of both representations is shown in Tables 4.10 and 4.11. In both tables, the AR parts all satisfy the stationarity assumption, while the MA parts all satisfy the assumption of invertibility.

Table 4.10: Estimation output of four volatility-ARMA models, using AR and MA terms from ARMA models in Appendix 7.5.

| Variable | (1) log(IR_first30) | (2) log(IR_first30) | (3) log(IR_between) | (4) log(IR_last30) |
|---|---|---|---|---|
| Constant | -0.000477 | -0.000705∗ | 0.003390∗∗ | 0.000429∗ |
| | (0.000308) | (0.000343) | (0.000718) | (0.000217) |
| log(OR) | 0.031910∗∗ | 0.041317∗∗ | -0.073097∗ | -0.027162∗∗ |
| | (0.012169) | (0.013462) | (0.036073) | (0.008802) |
| log(IR_last30)(-1) | | 0.334067∗∗ | | |
| | | (0.046238) | | |
| Volatility | 0.115156∗∗ | 0.160395∗∗ | -0.071033∗∗ | -0.104158∗∗ |
| | (0.033608) | (0.035515) | (0.009162) | (0.025687) |
| AIC/SC | -8.917/-8.851 | -8.964/-8.889 | -7.034/-6.950 | -9.783/-9.709 |
| N | 509 | 509 | 509 | 509 |

AR and MA terms (standard errors in parentheses):
AR(1): -0.422676∗∗ (0.160616)
AR(2): -0.712909∗∗ (0.152445), 0.404598 (0.612535), -0.476353∗∗ (0.027593), 0.107673∗∗ (0.041715)
AR(3): -0.667769∗∗ (0.027022)
AR(4): 0.383707 (0.445890), -0.516229∗∗ (0.030440)
AR(5): 0.659787∗∗ (0.037923), 0.372822∗∗ (0.043444)
MA(1): 0.415601∗∗ (0.142461)
MA(2): 0.782467∗∗ (0.140412), -0.307027 (0.598929), 0.436340∗∗ (0.028073)
MA(3): 0.838516∗∗ (0.069849)
MA(4): -0.410140 (0.385743), 0.477677∗∗ (0.024870)
MA(5): -0.701800∗∗ (0.046180), -0.237089∗∗ (0.027535)

∗ for 5% significance level, ∗∗ for 1% significance level (two-sided)

Table 4.11: Estimation output of four ARMA-GARCH models, after adding AR and MA terms and modelling the variance of the innovation term with an appropriate GARCH(r,s) process. The used GARCH process can be found in the first row.

| Variable | (1) log(IR_first30) | (2) log(IR_first30) | (3) log(IR_between) | (4) log(IR_last30) |
|---|---|---|---|---|
| GARCH process | GARCH(0,3) | GARCH(4,5) | GARCH(2,0) | GARCH(2,2) |
| Constant | 1.52E-05 | 2.07E-05 | 0.000304 | 1.96E-05 |
| | (9.50E-05) | (8.02E-05) | (0.000250) | (5.95E-05) |
| log(OR) | 0.028400∗∗ | 0.032892∗∗ | -0.081735∗∗ | -0.023876∗∗ |
| | (0.010215) | (0.008164) | (0.029394) | (0.005972) |
| log(IR_last30)(-1) | | 0.415258∗∗ | | |
| | | (0.055249) | | |
| AIC/SC | -9.216/-9.133 | -9.301/-9.159 | -7.137/-7.045 | -10.232/-10.132 |
| N | 509 | 509 | 509 | 509 |

AR and MA terms (standard errors in parentheses):
AR(1): -0.452405∗∗ (0.022439)
AR(2): -0.958975∗∗ (0.028081), -0.262421∗∗ (0.005016), 0.221496∗∗ (0.043281), 0.056268 (0.045505)
AR(3): -0.725117∗∗ (0.062755)
AR(4): -0.999961∗∗ (3.21E-05), -0.179881∗∗ (0.030887)
AR(5): 0.789969∗∗ (0.031634), 0.219135∗∗ (0.072000)
MA(1): 0.452700∗∗ (0.012625)
MA(2): 0.982735∗∗ (0.016288), 0.256883∗∗ (0.008655), -0.280437∗∗ (0.038251)
MA(3): 0.805162∗∗ (0.045618)
MA(4): 0.979331∗∗ (0.005364), 0.407911∗∗ (0.039003)
MA(5): -0.997397∗∗ (0.010486), -0.207579∗∗ (0.044139)

The first model in Tables 4.10 and 4.11 shows a positive and significant coefficient of the overnight log return. The magnitude of this coefficient, between 0.028 and 0.032, is in accordance with the earlier results in Tables 4.6, 4.8 and 4.9. The coefficient for the volatility in the first half-hour is significant and indicates a positive effect on the first half-hour log return. Both the AR and MA terms are significant at a 1% significance level in the two estimations. Finally, the ARMA-GARCH model shows the lower AIC and SIC, which leads to a preference for the ARMA-GARCH model.

The coefficient of the overnight log return in the second volatility model deviates from the results found before. It is significant at a 1% significance level, but the magnitude increased by approximately 45%. The coefficient in the ARMA-GARCH model is in line with earlier results, namely around 0.033. For the coefficient on the one-lagged last half-hour log return, the results are just the opposite. The coefficient of the volatility model is now in line with earlier results, while the coefficient in the ARMA-GARCH model increased by about 33%. The coefficient on the volatility is again significant and positive. A noteworthy difference between the two models is that the AR and MA terms are insignificant in the volatility model, while they are significant at 1% in the GARCH model. Again, the AIC and SIC show a preference for the ARMA-GARCH model.

The third model in Tables 4.10 and 4.11 shows rather similar results across the two estimations. The coefficient on the overnight log return is negative and significant in both models, although only at a 5% significance level in the volatility model. The magnitude of the coefficient is approximately equal to earlier results, namely -0.081. This indicates that a 1% increase in the overnight return leads on average to a decrease of approximately 0.081% in the return over the subsequent intraday period between the first and last half-hour. The coefficient on the volatility is now negative and significant. Moreover, there is no difference between the two estimations in the significance of the AR and MA terms. Although the results are not that different from each other, the AIC and SIC show a preference for the ARMA-GARCH model.

As in the third model, the results for the fourth model in the volatility and ARMA-GARCH estimations are almost the same. The coefficient on the overnight log return is negative and significant at a 1% significance level. The result is in line with the GARCH model from Table 4.9 and shows a magnitude of -0.024. This indicates that a 1% increase in the overnight return leads on average to a decrease of 0.024% in the subsequent last half-hour return. Table 4.10 shows that the volatility has a negative effect on the last half-hour log return. Both tables show significant AR and MA terms, except for the AR(2) coefficient in the ARMA-GARCH model, which is insignificant. Finally, the AIC and SIC are again lower for the ARMA-GARCH model.

In summary, the results for the coefficients of interest have not changed much after adding AR and MA terms. Moreover, the results of the volatility-ARMA and ARMA-GARCH models do not differ much from each other. However, the values of the AIC and SIC show a clear preference for the ARMA-GARCH models. To check whether the goal of accounting for the conditional heteroskedasticity is achieved, the residuals of all models are analysed.
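The AIC/SC comparison can be illustrated with a minimal sketch. It assumes the criteria are scaled by the sample size, i.e. AIC = -2l/n + 2k/n and SC = -2l/n + k ln(n)/n, which is a common convention in econometrics software but is not stated in the text. The function names are hypothetical. Under this assumption the gap between SC and AIC depends only on the number of estimated parameters k and the sample size n, so k can be backed out from the AIC/SC pairs and N = 509 reported above for the four ARMA-GARCH estimations.

```python
import math


def aic_per_obs(loglik, k, n):
    """Akaike criterion scaled by sample size: -2*l/n + 2*k/n (assumed convention)."""
    return -2.0 * loglik / n + 2.0 * k / n


def sic_per_obs(loglik, k, n):
    """Schwarz criterion scaled by sample size: -2*l/n + k*ln(n)/n (assumed convention)."""
    return -2.0 * loglik / n + k * math.log(n) / n


def implied_param_count(aic, sic, n):
    """Under the scaling above, sic - aic = k*(ln(n) - 2)/n, so k can be recovered."""
    return (sic - aic) * n / (math.log(n) - 2.0)


# AIC/SC pairs as reported for the four ARMA-GARCH estimations (N = 509 each).
reported = {
    1: (-9.216, -9.133),
    2: (-9.301, -9.159),
    3: (-7.137, -7.045),
    4: (-10.232, -10.132),
}

for model, (aic, sic) in reported.items():
    k = implied_param_count(aic, sic, 509)
    print(f"model {model}: implied number of parameters ~ {k:.1f}")
```

If the scaling assumption holds, the implied counts should come out close to whole numbers. Because the reported criteria are rounded to three decimals, only an approximate match can be expected. For both criteria, lower values indicate the preferred model.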

The squared residuals of the first and second volatility-ARMA model show significant Q-statistics from the second lag term onwards. In contrast, the Q-statistics of the squared residuals of the first and second ARMA-GARCH model are insignificant from the first lag term onwards. Moreover, the ARCH LM-test does not reject the null hypothesis of conditional homoskedasticity for these two ARMA-GARCH models. Because of this, the correlogram of the residuals is now valid, and it shows insignificant Q-statistics, which indicates that the residuals follow a white noise process.
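The Q-statistics on squared residuals discussed here can be sketched as follows. This is a minimal illustration using the Ljung-Box form Q(m) = n(n+2) Σ ρ_k²/(n−k), which is assumed to be the statistic behind the correlograms. The residual series are simulated for illustration (an ARCH(1) process and an i.i.d. process with hypothetical parameters), not the thesis data, and all function names are hypothetical.

```python
import random


def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(x, max_lag):
    """Ljung-Box Q-statistics for lags 1..max_lag (cumulative, one per lag)."""
    n = len(x)
    running = 0.0
    stats = []
    for k in range(1, max_lag + 1):
        rho = autocorrelation(x, k)
        running += rho * rho / (n - k)
        stats.append(n * (n + 2) * running)
    return stats


def simulate_arch1(n, omega, alpha, seed):
    """Simulate eps_t = z_t * sqrt(omega + alpha * eps_{t-1}^2); alpha = 0 gives i.i.d. noise."""
    rng = random.Random(seed)
    prev = 0.0
    out = []
    for _ in range(n):
        sigma2 = omega + alpha * prev ** 2
        prev = rng.gauss(0.0, 1.0) * sigma2 ** 0.5
        out.append(prev)
    return out


# Hypothetical residual series: one homoskedastic, one with ARCH effects.
iid_resid = simulate_arch1(2000, 1.0, 0.0, seed=1)
arch_resid = simulate_arch1(2000, 0.5, 0.5, seed=1)

# The test is applied to the SQUARED residuals to detect remaining ARCH effects.
q_iid = ljung_box_q([e * e for e in iid_resid], 5)
q_arch = ljung_box_q([e * e for e in arch_resid], 5)

# Under the null of no autocorrelation, Q(5) is approximately chi-squared with
# 5 degrees of freedom (5% critical value about 11.07).
print("Q(5), squared i.i.d. residuals: ", round(q_iid[-1], 2))
print("Q(5), squared ARCH(1) residuals:", round(q_arch[-1], 2))
```

A large Q-statistic on the squared residuals signals remaining conditional heteroskedasticity, which is the pattern reported for the volatility-ARMA models; small values correspond to the pattern reported for the first and second ARMA-GARCH model.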

The JB test statistics on the normality of the standardized residuals decrease from about 250 in the volatility-ARMA models to 50 in the ARMA-GARCH models. Additionally, a Kolmogorov-Smirnov test is performed on the standardized residuals of all four ARMA-GARCH models, with the null hypothesis that their empirical distribution equals the normal distribution with mean 0 and variance 1. Appendix 7.3.9 shows that, at a 5% significance level, this null hypothesis is not rejected for the second and third model, so their standardized residuals are consistent with a standard normal distribution. Moreover, the first model would not be rejected at a 4.5% significance level. Only the fourth ARMA-GARCH model shows a clear rejection of the standard normal distribution.
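The two normality checks can be sketched as below. This is a minimal illustration under the textbook definitions, JB = n/6 · (S² + (K−3)²/4) and the Kolmogorov-Smirnov distance to the standard normal CDF. The function names and the simulated sample are hypothetical and do not reproduce the thesis output.

```python
import math
import random


def jarque_bera(x):
    """Jarque-Bera statistic from sample skewness S and kurtosis K."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def std_normal_cdf(z):
    """CDF of the normal distribution with mean 0 and variance 1."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ks_statistic_std_normal(x):
    """Largest distance between the empirical CDF of x and the standard normal CDF."""
    xs = sorted(x)
    n = len(xs)
    d = 0.0
    for i, v in enumerate(xs):
        f = std_normal_cdf(v)
        # Compare against the empirical CDF just after and just before the jump at v.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Hypothetical standardized residuals drawn from a standard normal distribution.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(509)]

# JB is asymptotically chi-squared with 2 degrees of freedom (5% critical value about 5.99).
print("JB statistic:", round(jarque_bera(sample), 3))
# Large-sample 5% critical value for the KS distance is roughly 1.36 / sqrt(n).
print("KS distance: ", round(ks_statistic_std_normal(sample), 4),
      "vs. approx. critical value", round(1.36 / math.sqrt(len(sample)), 4))
```

The JB test reacts to skewness and excess kurtosis only, while the KS test compares the whole distribution with the standard normal, so the two tests need not agree for the same residual series.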

The squared residuals of the third volatility-ARMA model show significant Q-statistics only from the seventh lag term onwards, yet the ARCH LM-tests reject conditional homoskedasticity. The third ARMA-GARCH model shows significant Q-statistics from the sixth lag term onwards, but the ARCH LM-tests do not reject conditional homoskedasticity. Hence, the correlogram of the standardized residuals can be checked. It turns out that the Q-statistics are all significant, so the residuals do not follow a white noise process. The JB statistics are 201 for the volatility-ARMA model and 78 for the ARMA-GARCH model.

The fourth volatility-ARMA model shows significant Q-statistics in the squared residuals from the first lag term onwards, and the ARCH LM-tests all reject at a 1% significance level. In contrast, the fourth ARMA-GARCH model shows insignificant Q-statistics in the squared residuals, and the ARCH LM-tests do not reject. From the correlogram of the residuals, it can be concluded that the residuals follow a white noise process. The JB statistic decreased from 305 in the volatility-ARMA model to 169 in the ARMA-GARCH model.

In the end, it can be concluded that the ARMA-GARCH models are best at capturing the relation between the overnight and intraday returns. They consistently show the lowest AIC and SIC, and only in the third model do the residuals not follow the desired white noise process. Additionally, the ARMA-GARCH models consistently show a lower JB test statistic.

From these models, it is observed that the overnight log return has a positive effect on the subsequent first half-hour log return of about 0.03. This result is in contrast with earlier findings from Liu and Tse (2017) and with the fourth hypothesis of this dissertation, because a negative sign was expected. The overnight log return turns out to have a negative effect on the subsequent last half-hour log return of about -0.024, which is again exactly opposite to the findings of Liu and Tse (2017) and to the fifth hypothesis. A rather new effect is the positive effect of the last half-hour log return of the previous trading day on the first half-hour log return, which ranges between 0.30 and 0.41. Although the third model shows a negative effect of the overnight log return, of approximately -0.08, on the period between the first and last half-hour, caution is called for when drawing any conclusions, because conditional heteroskedasticity is still present.


4.2.2 Out-of-sample analysis

Now that the in-sample analysis has been conducted, the results of the out-of-sample analysis are presented in this subsection. As described in the methodology, the first two years of the in-sample analysis are used as the estimation sample for the forecast of October 2017. Appendix 7.3.10 shows the results of the Chow Break Test for a possible structural break at October 1, 2017. The observed F-statistics are all small, and hence there is not enough evidence for a structural break between the in-sample and out-of-sample periods. Table 4.12 shows the RMSPE and MAPE for the out-of-sample forecasts of the basic, volatility-ARMA and ARMA-GARCH models.

Table 4.12: Root Mean Squared Prediction Error (RMSPE) and Mean Absolute Prediction Error (MAPE) of all four models and three different estimations: basic, volatility-ARMA, ARMA-GARCH. For every model, the lowest RMSPE and MAPE are marked with an asterisk.

             Basic       Volatility-ARMA   ARMA-GARCH
RMSPE (1)    0.000981    0.000979*         0.000989
MAPE (1)     0.000798    0.000772*         0.000793
RMSPE (2)    0.001037    0.001028*         0.001103
MAPE (2)     0.000748    0.000731*         0.000760
RMSPE (3)    0.003444    0.003655          0.003416*
MAPE (3)     0.002734    0.002928          0.002644*
RMSPE (4)    0.000927    0.001074          0.000837*
MAPE (4)     0.000761    0.000873          0.000661*

The results in Table 4.12 show that the volatility-ARMA estimation has the lowest RMSPE and MAPE for the first and second model, whereas the ARMA-GARCH estimation has the lowest RMSPE and MAPE for the third and fourth model. Although no single estimation clearly performs best, it is observed that the basic estimation never achieves the lowest RMSPE or MAPE. This was already expected from the in-sample analysis. Still, the differences between the RMSPE and MAPE of the three estimations are small, and therefore caution should be exercised when drawing any conclusions from these out-of-sample forecasts.
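The two accuracy measures in Table 4.12 can be sketched as follows. This is a minimal illustration assuming the usual definitions: RMSPE is the square root of the mean squared forecast error, and MAPE is the mean absolute forecast error (an absolute error, not a percentage error). The function names and the example numbers are hypothetical and are not taken from the thesis data.

```python
import math


def rmspe(actual, forecast):
    """Root mean squared prediction error: sqrt(mean((actual - forecast)^2))."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mape(actual, forecast):
    """Mean absolute prediction error: mean(|actual - forecast|)."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical half-hour log returns and their out-of-sample forecasts.
actual_returns = [0.0012, -0.0008, 0.0003, -0.0015, 0.0007]
forecast_returns = [0.0002, -0.0001, 0.0001, -0.0003, 0.0002]

print("RMSPE:", round(rmspe(actual_returns, forecast_returns), 6))
print("MAPE: ", round(mape(actual_returns, forecast_returns), 6))
```

Because squaring weights large errors more heavily, the RMSPE is never smaller than the MAPE for the same forecasts, which matches the pattern in every row pair of Table 4.12.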

Figure 4.2: Forecast with 95% FI of volatility-ARMA model 1

Figure 4.3: Forecast with 95% FI of volatility-ARMA model 1
