
Overnight returns and retail investors : an analysis of intraday predictability


Academic year: 2021


OVERNIGHT RETURNS AND RETAIL INVESTORS

AN ANALYSIS OF INTRADAY PREDICTABILITY

Wietse Steenstra (11004487)

Supervised by: MSc. Hao Li

In partial fulfillment of the requirements for the Degree of Bachelor of Econometrics and Operations Research

Presented to the Faculty of Economics and Business of the University of Amsterdam


This document is written by student Wietse Steenstra who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Abstract

The relation between intraday returns and overnight returns has been studied by numerous researchers. This paper combines the theory of inaccurate opening prices by Branch and Ma (2012) with that of the price effect of retail investors’ attention-driven trading by Berkman et al. (2012). Using historical prices for the US500 index proxy, it is found that using proxies for investors’ attention does not significantly improve forecasting models for close-to-open, first half-hour or last half-hour returns. The addition of proxy variables does improve model fit, with short-term proxies like volatility and trading volume yielding more powerful results than long-term macro proxies. Investors can use this result to gain additional insight into the potential effects of incorporating extra variables into their forecasting models.


1. Introduction

The ever-accelerating globalization brings more integration of international markets and means news spreads from all corners of the world, at all times (Ahoniemi & Lanne, 2013). Nowadays, investors accumulate relevant information twenty-four hours a day, but are generally still only able to act on that information during the restricted opening hours of the stock exchange. Market closures are an increasingly interesting phenomenon in this environment, because when the market prices remain constant during this time, the price can no longer be used as a valuation tool for the underlying security and investors have to rely solely on their private information (Hong & Wang, 2000). This private information is dependent on the input of news and external information sources, which are often subjective and can lure smaller investors away from their rational decision-making. When the market opens on the following day, it often occurs that smaller (retail) investors are net buyers of stocks that attracted their attention on the last day or during the night, regardless of what happened to the intrinsic security value (Berkman, Koch, Tuttle, & Zhang, 2012).

As studied by Berkman et al. (2012), this irrational behavior by retail investors often leads to upwardly biased opening prices, followed by a reversal during the day. When these opening prices are higher than usual, it is a boost for overnight returns that are measured close-to-open and a reduction for intraday returns that are calculated open-to-close.

Interestingly, this leads to higher average returns overnight, while it is generally found that the volatility of overnight returns is lower (Liu & Tse, 2017). This conflicts with both conventional financial risk theory and the hypothesis of market efficiency, a premise often used in economic research.

Aside from this interesting conflict, another important implication of the biased opening prices is that it creates a negative autocorrelation between overnight returns and intraday returns, which can also be used in causal predictions (Liu & Tse, 2017). Liu and Tse (2017) find that overnight returns can significantly predict intraday returns, specifically for the first half-hour of the day. Branch and Ma (2012) show that these predictions are more accurate when lagged variables are added. Predictions like this are valuable for academics and practitioners alike, but none of these models incorporate the research done by Berkman et al. (2012), who show that retail investors have a significant impact on the biased prices and thus on the autocorrelation between overnight returns and intraday returns. This paper is dedicated to joining the theory of overnight returns as a predictor of intraday returns with that of smaller investors that impact opening prices by buying stocks that attracted their attention. The main purpose is to investigate to what extent the estimated amount of retail investors’ attention-driven trading can be used to improve the predictions of intraday returns. To research this, proxies for investors’ attention are incorporated into existing models, thus building on previous work.

The text adopts the following structure: the second section is an extensive review of the existing literature on the topic, in the third section the data sample and research methods are described, section four presents the main results and section five concludes.


2. Literature review

Overnight returns have historically drawn considerable attention from researchers, as the periodic market closures create an interesting disruption of the assumed perfection that is intraday trading. When this disruption leads to predictable anomalies, strategies that exploit them become possible, which makes such patterns immediately applicable for investors. The hypothesis of weak market efficiency forbids such predictability (Jensen, 1978), but it is nonetheless found by numerous researchers.

Mean and volatility of overnight returns

The feature most often described by these researchers is the difference between the mean overnight (close-to-open) returns and the mean intraday (open-to-close) returns, often accompanied by a comparison of their respective volatilities. Reports on the average overnight return disagree, but the studies at least agree that overnight volatility is lower than intraday volatility (Lockwood & Linn, 1990; Hong & Wang, 2000; Branch & Ma, 2012; Liu & Tse, 2017). This result is consistent across different timeframes, different types of securities and different methods of volatility characterization. During market closures, investors cannot move the price and thus cannot reveal their own expectations to other investors via price variations of the security. Considering this, the conventional explanation for lower overnight volatility is that investors cannot reveal their private information and cannot react to the preference indications of others, which stabilizes the price (Hong & Wang, 2000). A common method to quantify this stabilization is to use the variance of the returns. French and Roll (1986) first used this estimator for volatility to note that overnight volatility is lower than intraday volatility, which was later supported by studies conducted by Lockwood and Linn (1990), Hong and Wang (2000) and Branch and Ma (2012).

Another interesting approach to examine overnight volatility is taken by Liu and Tse (2017). Instead of using the variance of returns as an estimator of volatility, the researchers use value at risk (VaR) and expected shortfall (ES). VaR estimates the loss that is not exceeded at a given confidence level and ES is the average of all losses beyond this level, thus quantifying the weight in the tails of the distribution. This is a valuable addition, because variance of returns is known to structurally underestimate the tail-end risk of investments (Jorion, 2006). Jorion (2006) shows that VaR and ES do incorporate this risk and are also consistent when returns aren’t normally distributed, giving further credibility to the results. For all studied ETFs and futures, Liu and Tse (2017) find the VaR and ES to be significantly lower overnight than during daytime. This means that the observed volatility is lower during market closures, regardless of the evaluation method.
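To make the two tail-risk measures concrete, the following is a minimal sketch of how VaR and ES could be estimated from a sample of returns by historical simulation. It is not the procedure of Liu and Tse (2017); the function name `historical_var_es`, the empirical-quantile rule and the example returns are assumptions made purely for illustration.

```python
import math


def historical_var_es(returns, confidence=0.95):
    """Return (VaR, ES) as positive loss figures, in the same units as `returns`."""
    if not returns:
        raise ValueError("returns must not be empty")
    # Losses are negated returns, sorted from the smallest to the largest loss.
    losses = sorted(-r for r in returns)
    # Empirical quantile: the loss that is not exceeded in `confidence` of the cases.
    index = min(len(losses) - 1, max(0, math.ceil(confidence * len(losses)) - 1))
    var = losses[index]
    # ES averages the losses at or beyond the VaR threshold (assumed convention).
    tail = [loss for loss in losses if loss >= var]
    es = sum(tail) / len(tail)
    return var, es


# Illustrative, made-up returns in percent (not taken from the thesis data).
overnight = [-4.0, -3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0]
var_75, es_75 = historical_var_es(overnight, confidence=0.75)
```

Because ES averages over the whole tail rather than reading off a single quantile, it is never smaller than VaR at the same confidence level.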

Because of the lower volatility, one would expect overnight mean returns to be lower, in accordance with financial risk theory, which says that lower risk should yield a smaller return. However, this is not a triviality, as there are many disagreeing reports on the mean overnight and intraday returns and the relation seems to differ per timeframe and type of security. The paper of Branch and Ma (2012) illustrates this nicely. In this paper, two types of securities are examined: an index proxy of the S&P500 and the weighted average of a group of American stocks. Their index proxy reported subzero mean returns during trading hours and significantly positive returns overnight. This outcome is interesting as it contradicts market efficiency, a hypothesis that is itself supported by numerous studies, and enables exploiting strategies like increased overnight holding. Also worth noting is that this result does not hold when they examine the weighted average of stocks. In contrast to the index, these stocks reported a lower mean return overnight than intraday, giving rise to the idea that overnight returns may be higher for ETFs and futures, but lower for stocks. In other words, a reversed effect could exist between different types of securities. Research by other authors provides further evidence for this. Liu and Tse (2017) examine nine sector ETFs and twelve international index futures and find that mean overnight returns are significantly greater than intraday returns for all of these instruments. Looking at stocks, Hong and Wang (2000) find that the mean intraday return is higher than the mean overnight return, also strengthening the results found by Branch and Ma (2012). As of now, none of these researchers have proposed an explanation as to why there would be a difference between ETFs and stocks.

Intraday movements

Regardless of the difference between the instruments, the fact that there are securities for which returns are higher in periods where volatility is lower is counterintuitive. Multiple researchers have found that this is mainly because of one notable pattern: securities often open with a price that slightly overshoots the equilibrium price, before decreasing and reverting to a stable level during the day (Berkman et al., 2012; Branch & Ma, 2012). At the end of the day, the price then increases again up to market close, closing on average only slightly lower than the opening price (Hong & Wang, 2000).

This phenomenon is comprehensively described by Hong and Wang (2000), who call this trend the “U-shaped pattern of returns”. They support this claim by combining the theoretical effects of information asymmetry and hedging trade on the price. Hedging trade is the trading of derivatives to shield a portfolio, which investors do at market open to protect their investments from negative shocks. These hedging positions need to be predominantly closed when the day ends, as they can have a detrimental effect if they cannot be sold instantly. Because of this, hedging demand is high at market open and then decreases during the day, down to a low point when the markets close. Lower hedging demand means lower overall demand and trade in the market, which lowers the price of securities. This creates the initial downward slope of the price pattern.

During the day information asymmetry decreases, as more investors trade for speculative reasons instead of hedging reasons and thus reveal their true preferences. This decrease in information asymmetry means investors are better informed and risk is lower. In turn demand, and thus price, increases. Because hedging positions decrease exponentially before market closure, the effect of decreasing information asymmetry dominates at this time, boosting the price and creating the eventual upward slope of the U-shaped patterns (Hong & Wang, 2000).

Although the U-shape is common, it does not occur in all situations, as it is dependent on the frequency and duration of closing periods, the need for investors to hedge risks and the transparency of information on the market (Hong & Wang, 2000).

Predictive power of overnight returns

The previously described U-shaped patterns are reliant on the initial overshooting of the equilibrium price at market open, which happens often. This shows that the opening price does not always reflect the true value of the security, a distortion that has important implications: when the opening price is too high it boosts overnight returns, as those are measured close-to-open, and it reduces intraday returns when the price reverts to its intrinsic value. The opposite is true when the opening price is too low, which suggests a possible negative autocorrelation between overnight returns and intraday returns (Hong & Wang, 2000; Branch & Ma, 2012). This theory has been tested and confirmed for several data samples. Branch and Ma (2012), Liu and Tse (2017) and Berkman et al. (2012) all find a significantly negative correlation coefficient. This correlation can in turn be an argument to investigate a possible causal relationship between overnight returns and intraday returns, which is done by Branch and Ma (2012). In their study they estimate four different models with varying numbers of lagged variables and interaction terms and find that in all the configurations, overnight returns have a significant effect on subsequent intraday returns. This result was found in-sample and found to be consistent out-of-sample, using a predictive regression model.

Expanding on this result is the prediction of price movements during more specific parts of the day. Price movements are often not constant or gradual, because of the U-shaped patterns described earlier, which means stronger results may be obtained when predictions are made for different parts of the day separately. Liu and Tse (2017) do this by dividing the day into three parts: the first half-hour, the last half-hour and everything in between. They find that overnight returns can be used to predict the first half-hour returns with a negative relation and the last half-hour returns with a positive relation. Furthermore, they find that the first half-hour returns can predict the last half-hour returns with a negative relation, which is consistent with the opposing slopes seen in the U-shapes.

The intraday predictability is studied more comprehensively by Gao, Han, Li and Zhao (2017), who come to this same conclusion, but also find that the predictive power is greater on days where volatility and trading volume are greater. Furthermore, they show that the predictive power for the first and last half-hour returns is greater than that for the whole day and persists through out-of-sample analysis, which allows for more specific exploiting investing strategies.

Explanations for the opening price bias

The predictive power of overnight returns is already a valuable discovery, but the models can be further improved when it is understood what causes the biased opening prices and subsequent intraday reversal in the first place. This would help to build a model that predicts the amount of autocorrelation and estimates (partial) intraday returns with more accuracy.


Several possible explanations for this opening price bias have been offered by researchers, among which the hypothesis of intentionally inaccurate price-setting by market makers. Market makers maximize profit by ensuring many trades are issued, as they receive a small fee for every transaction. When the trading day begins, they thus have an incentive to set the price in such a way that maximizes the number of transactions conducted. A way to do this is to set the price in a way that triggers some limit orders, which are conducted when the price is above or below a certain level, so that buy and sell orders are nearly equal (Branch & Ma, 2012).

An example of this strategy could be the following: it often occurs that there are more buy orders, as investors want to open their hedging positions at the start of the day. The market maker could in this case set the price slightly higher than the equilibrium price to trigger some sell limit orders, which balances the number of buy and sell orders and thus maximizes the number of transactions. The described strategy would also destabilize the price somewhat, as the chosen price is not equal to the intrinsic value of the security and will have to adjust as the day progresses (Branch & Ma, 2012).

This hypothesis issued by Branch and Ma (2012) seems plausible and is probably contributing to the upwardly biased opening prices in some cases, but it cannot fully explain the effect, as it is only relevant in classic markets that are regulated by a designated market maker. ETFs and futures do not fall in this category, as they are traded and priced electronically, which makes it impossible to intentionally bias the price (Liu & Tse, 2017).

For these types of securities another explanation is more likely: the theory of attention-driven retail buying at the start of the day, studied by Berkman et al. (2012) and building on Barber and Odean (2007). Barber and Odean (2007) show that retail investors, who are small individual investors, are largely dependent on news and on the securities that (accidentally) catch their attention. They argue that investors have an enormous group to choose from when deciding what instruments to buy, which makes it impossible to compare all of them and make a calculated decision based on private information or preferences. Because of this, investors only base their choice on stocks that have recently caught their attention, meaning news and external information have a big impact on the individual investor’s behavior (Barber & Odean, 2007). In contrast, according to Berkman et al. (2012), when selling stocks, retail investors generally only choose between the small set of stocks that they already own. This means buy orders increase when attention for a stock is high, while sell orders are largely unaffected, creating an order imbalance and a price shift (Berkman et al., 2012).

Berkman et al. (2012) further investigate this result and link it to the relatively high opening prices that are often observed. They proxy retail investors’ attention by looking at the previous day’s volatility and notice that increased attention from retail investors does indeed lead to increased prices when the stock market opens on the subsequent day. They find this effect to be stronger for stocks that are more difficult to value and when retail investors’ sentiment is high, strengthening the idea that the increased prices are indeed caused by the retail investors.

Hypotheses

As described, numerous researchers have investigated the subject of overnight returns and proposed different theories. In this research paper, some of these theories are tested, along with other ideas that arose from the previous work. The hypotheses can be split up into three distinct parts. First of all, it is hypothesized that overnight volatility is lower, but mean returns are higher. The second hypothesis is that opening prices are upwardly biased, which creates a U-shaped pattern in returns and a negative autocorrelation between overnight and intraday returns. Finally, it is also hypothesized that this biased opening price is caused by attention-driven buying from retail investors, who drive up the price. Because of this causality, a predictive model that uses overnight returns and incorporates proxies for retail attention could yield better estimations of intraday returns, along with an increased understanding of the trends and effects perceived by other researchers.


3. Sample and method

To test and investigate these hypotheses, high-frequency financial data will be collected, described and used within a model that estimates intraday returns with the overnight returns and other supporting variables. Based on existing literature, different models, variables and transformations will be tested to derive the most successful approach.

Data sample

For this analysis, the most important data used are the prices and volumes of the US500 index proxy from 2015 through 2017, which are retrieved from the databank of Dukascopy. The US500 index proxy is an ETF designed to track the weighted average of the 500 most actively traded stocks in the United States. The interval between observations for this data is five minutes, which enables a more detailed analysis in the minutes around market open and market close. For the proxies for investors’ behavior, additional data is collected on the daily long-term interest rate and investors’ sentiment, measured as the percentage of bullish investors. These datasets are retrieved from the US treasury and the online database of Ycharts respectively.

To generate the desired variables from this data, a number of transformations are conducted. First, additional data points will be generated by interpolation with a cubic spline, to correct for the difference in time-intervals between observations. Then, the weekends are filtered out, as the price of the securities will remain constant during these days, potentially leading to a distortion of the results. The weekend is instead treated as if it was one long night. Trading hours are defined as 7:00 through 21:00 GMT, based on daily trading volume, with adjustments made to correct for daylight saving time.

Of this final sample, only the first 445 days will be used to fit the initial models, leaving the last 60 observations for out-of-sample evaluation.

Regression model and methods

The aforementioned data sample is then used to build models and test the hypotheses. First the mean intraday and overnight returns are calculated and it is tested whether overnight returns exceed intraday returns. Then, a model is estimated to test if there is indeed a negative autocorrelation between the intraday and nighttime returns and if this correlation translates into a useful predictive model. The optimal specification for this model is very much dependent on the characteristics of the utilized data. Seeing as the data sample is a time series and other researchers have already predicted patterns to exist in the realized volatility (Ahoniemi & Lanne, 2013), stationarity of parameters is not certain. Because of the importance of this assumption, the stationarity is tested with both the augmented Dickey-Fuller (ADF) test and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test. Results of these tests are presented in the appendix (A.1). For all dependent variables, the ADF test rejects the null hypothesis of presence of a unit root and the KPSS test does not reject the null hypothesis of stationarity, which is strong evidence for parameter stationarity and supports the use of ordinary least squares (OLS) estimation.

The suggested OLS regression model can be represented in the following way:

$$OTC_t = \beta_0 + \beta_1 OTC_{t-1} + \beta_2 CTO_{t-1} + \beta_3 CTO_{t-2} + \beta_4 IS_t + \beta_5 IT_t + \beta_6 VT_{t-1} + \beta_7 VO_{t-1} + \varepsilon_t \tag{2.1}$$

in which $OTC_t$ and $CTO_t$ are the intraday (open-to-close) returns and overnight (close-to-open) returns in period $t$ respectively, which are calculated as follows:

$$OTC_t = \log\left(\frac{CP_t}{OP_t}\right) \times 100 \tag{2.2}$$

$$CTO_t = \log\left(\frac{OP_t}{CP_{t-1}}\right) \times 100 \tag{2.3}$$

where $OP_t$ is the opening price and $CP_t$ is the closing price on a given day. Two lagged effects for the intraday returns are immediately included in equation (2.1), as Branch and Ma (2012) show that this significantly improves estimations. Even more lagged effects of the response variable can also be added, depending on the necessity as seen in the partial autocorrelation function (PACF) of the response variable.
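The two return definitions can be illustrated with a short sketch. The function names and the example prices are assumptions for illustration only; the logic follows equations (2.2) and (2.3), with returns expressed in percent. Because only trading days are listed, the step from a Friday close to a Monday open is automatically treated as one long night.

```python
import math


def open_to_close(open_price, close_price):
    """Intraday return in percent, as in equation (2.2)."""
    return math.log(close_price / open_price) * 100


def close_to_open(previous_close, open_price):
    """Overnight return in percent, as in equation (2.3)."""
    return math.log(open_price / previous_close) * 100


def daily_returns(opens, closes):
    """Return a list of (CTO_t, OTC_t) pairs for consecutive trading days.

    The first day has no previous close, so its overnight return is None.
    """
    if len(opens) != len(closes):
        raise ValueError("opens and closes must have the same length")
    rows = []
    for t in range(len(opens)):
        cto = close_to_open(closes[t - 1], opens[t]) if t > 0 else None
        rows.append((cto, open_to_close(opens[t], closes[t])))
    return rows


# Made-up prices for two consecutive trading days (not thesis data).
example = daily_returns(opens=[100.0, 102.0], closes=[101.0, 100.0])
```

A useful property of log returns is that the overnight and intraday returns of a day add up to the close-to-close return of that day.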

To test the added prediction benefit of accounting for retail investors’ behavior, four variables are tested to proxy the amount of retail investing in this security on a certain day. The approximation for this amount can be split up into two parts: the general amount of retail investors’ activity and their attention for this specific security. In this equation the degree of retail investors’ general activity is proxied by two variables, $IS_t$ and $IT_t$, which denote the investors’ sentiment and the long-term interest rate in period $t$. The underlying assumption is that retail investing will increase when investors’ sentiment is higher and when the long-term interest rate is lower, as the latter would presumably drive some investors away from treasury bonds and in the direction of riskier securities. The attention from those retail investors for the US500 stocks is also incorporated with two variables. The first is $VT_{t-1}$, the volatility of the previous day, measured as the standard deviation of the previous day’s price in intervals of five minutes. The second one is $VO_{t-1}$, the previous day’s trading volume in millions of units. Based on Berkman et al. (2012), it is assumed that greater price movements on the previous day attract the attention of retail investors. An increased trading volume is also an indication that the security attracted investors’ attention.
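As an illustration of how the two attention proxies could be constructed from five-minute data, consider the minimal sketch below. The function name, the input layout (one list of five-minute prices and volumes per trading day) and the use of the sample standard deviation are assumptions; the text does not state whether the sample or population standard deviation is used.

```python
import statistics


def attention_proxies(prices_by_day, volumes_by_day):
    """For each day t >= 1, return (VT_{t-1}, VO_{t-1}).

    prices_by_day[t]  : five-minute prices observed on trading day t
    volumes_by_day[t] : five-minute traded units on trading day t
    """
    if len(prices_by_day) != len(volumes_by_day):
        raise ValueError("both inputs must cover the same trading days")
    proxies = []
    for t in range(1, len(prices_by_day)):
        # Volatility proxy: standard deviation of the previous day's prices.
        vt_lag = statistics.stdev(prices_by_day[t - 1])
        # Volume proxy: previous day's total volume in millions of units.
        vo_lag = sum(volumes_by_day[t - 1]) / 1_000_000
        proxies.append((vt_lag, vo_lag))
    return proxies


# Made-up data for two trading days (not thesis data).
prices = [[100.0, 102.0, 104.0], [101.0, 101.5, 102.0]]
volumes = [[500_000, 1_500_000, 1_000_000], [400_000, 300_000, 300_000]]
proxy_rows = attention_proxies(prices, volumes)
```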

Using combinations of the same variables, extra models are evaluated that separately estimate the returns in the first and last half-hour, denoted as $First30_t$ and $Last30_t$ respectively, to test if this improves the prediction accuracy and investigate the intraday causality suggested by Gao et al. (2017).
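The half-hour returns can be sketched in the same way as the daily returns. The example below assumes that a trading day is given as a list of five-minute prices that starts at the opening price and ends at the closing price, so that a half-hour spans six intervals; the function name and this layout are illustrative assumptions, not taken from the thesis.

```python
import math

INTERVALS_PER_HALF_HOUR = 6  # 30 minutes divided by five-minute intervals


def half_hour_returns(day_prices):
    """Return (First30_t, Last30_t) in percent for one trading day."""
    if len(day_prices) <= 2 * INTERVALS_PER_HALF_HOUR:
        raise ValueError("not enough observations for two separate half-hours")
    # Log return from the open to the price 30 minutes later.
    first30 = math.log(day_prices[INTERVALS_PER_HALF_HOUR] / day_prices[0]) * 100
    # Log return from 30 minutes before the close to the close.
    last30 = math.log(day_prices[-1] / day_prices[-1 - INTERVALS_PER_HALF_HOUR]) * 100
    return first30, last30


# Made-up five-minute prices for one (shortened) trading day.
day = [100.0] * 6 + [101.0] * 10 + [102.0]
first30, last30 = half_hour_returns(day)
```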

After fitting the models, the significance of parameters, the adjusted R-squared and the Akaike information criterion (AIC) are compared across all the estimated models to quantify the goodness of fit, as proposed by Bozdogan (1987). The best model without proxies and the best model with proxies are selected, after which the in-sample root mean squared error (RMSE) is calculated for the last 60 days of the fitting sample and the out-of-sample RMSE for the 60 days that were not used for fitting the model. Based on a comparison of both the in-sample and out-of-sample RMSE, conclusions are drawn to evaluate to what extent the models are improved by adjusting for retail investors’ attention-driven trading.
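The RMSE comparison can be sketched as follows. The function names are hypothetical, and the sketch assumes that actual and predicted returns are available as chronological lists in which the first `n_fit` days form the fitting sample; the defaults of 445 and 60 days are those stated in the data description.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def in_and_out_of_sample_rmse(actual, predicted, n_fit=445, n_eval=60):
    """Compare the last n_eval fitting days with the first n_eval held-out days."""
    in_sample = rmse(actual[n_fit - n_eval:n_fit], predicted[n_fit - n_eval:n_fit])
    out_sample = rmse(actual[n_fit:n_fit + n_eval], predicted[n_fit:n_fit + n_eval])
    return in_sample, out_sample


# Tiny made-up example: four fitting days followed by two held-out days.
actual = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
predicted = [9.0, 9.0, 1.0, 1.0, 2.0, 2.0]
rmse_in, rmse_out = in_and_out_of_sample_rmse(actual, predicted, n_fit=4, n_eval=2)
```

An out-of-sample RMSE that is clearly higher than the in-sample RMSE would suggest that an improvement in fit does not carry over to forecasting.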


4. Results

After the initial collection of data, the first step is to calculate the mean overnight and intraday returns. From figure 1, it becomes clear that both the open-to-close (OTC) returns and the close-to-open (CTO) returns are not significantly different from zero, which is not in line with the a priori assumptions. However, the standard deviation of the returns is larger during daytime, which indicates higher volatility. This is in line with the results from e.g. Branch and Ma (2012). The fact that the median is considerably lower than the mean for the OTC returns and the volatility signifies a positively skewed distribution. The close-to-open, open-to-close, first half-hour and last half-hour returns all exhibit minima and maxima that lie numerous standard deviations away from the mean, which indicates thicker tail ends of their distributions.

Notably, the mean return in the first 30 minutes of a trading day is the only return that is significantly positive. This partially supports the results of Hong and Wang (2000), who find that returns are high around market open, as a part of the U-shaped pattern they describe. The last half-hour returns don’t yield the same result. Returns in this time-frame are insignificant, even though the volatility is twice as high as in the first half-hour, which means that this market does not comply with general risk-reward theory and is (partially) inefficient. The standard deviations of the returns in the first and last half-hour are respectively 12.5% and 25% of the total intraday volatility. As they only cover approximately 4% of the total trading time each, it is clear that a disproportionate amount of volatility is realized around market open and market close.


Variable correlations

To determine which variables or combination of variables could be of interest for the fitting and forecasting models, the first step is to look at the correlation between the independent variables and the response variables. When looking at the correlations between the non-proxy variables depicted in the appendix (A.2), several relations catch the eye. Firstly, the OTC returns show a slight negative autocorrelation with the returns on the previous day and a much stronger correlation with the preceding overnight returns. This is in line with the theory of day-time reversals of overnight returns, proposed by Berkman et al. (2012). Interestingly, the returns in the last 30 minutes of the previous trading day also show a negative correlation with the returns in the first half-hour of the next trading day. This could indicate that the last half-hour returns have some forecasting power for the returns in the beginning of the next day, providing a cause for testing and possible inclusion in the model.
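The correlations discussed here are of the ordinary (Pearson) type, in some cases between a variable and a lagged value of another variable. A minimal sketch of such a lagged correlation is given below; the function names and the example series are assumptions for illustration and do not reproduce the values in the appendix.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least two")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(x, y, lag=1):
    """Correlation between x on day t - lag and y on day t."""
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])


# Made-up series: last half-hour returns and next-day first half-hour returns.
last30_series = [1.0, 2.0, 3.0, 4.0]
first30_series = [0.0, -1.0, -2.0, -3.0]
rho = lagged_correlation(last30_series, first30_series, lag=1)
```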

After evaluating the correlations between non-proxy variables and building models to incorporate those variables into a predictive model, the proxies for investors’ attention are examined. Looking at the correlations between the proxies and the response variables helps with deciding which proxies could be powerful in a predictive model. The trading volume of the previous day and the interest rate show the strongest correlation with the daytime returns. Previous-day volatility seems to be less important for this, but shows a strong correlation with the first half-hour returns. Investor sentiment has a weak correlation with all the response variables, which is likely to translate to insignificant effects in a predictive model. Because the proxy variables are also correlated with each other, it is necessary to be careful while adding the proxies into the model, as too many of them could lead to variance inflation. The correlations in (A.2) also support and add to some assumptions based on previous research and economic theory. A high volatility on the previous trading day increases returns around market open. This could mean that the U-shaped patterns in returns described by Hong and Wang (2000) are strongest on days following days with volatile prices, which supports the explanation of attention-driven retail buying that Berkman et al. (2012) proposed. Although the correlations are weaker, the same reasoning can be made for the trading volume of the previous day, as it also proxies for retail attention.


Tests and model adjustments

Based on these correlations, numerous variables are identified that could have an interesting effect on the response variables. In addition to this, the partial autocorrelation functions (PACFs) of the dependent variables are examined to identify potentially effective extra lags to be included. The figure in the appendix (A.3) shows that the fifth lag of the open-to-close return shows significant autocorrelation with the returns on day t. For the last half-hour returns, the value of ten days ago is significantly correlated with the present returns in the last 30 minutes. Consequently, both variables are investigated in their relevant models to test whether this autocorrelation can be used to obtain a better fit and forecast accuracy. Because the returns also include some observations that lie very far away from the mean, results are prone to be distorted by unduly influential observations. To cope with this, the Cook’s distance is calculated after estimating the models. As suggested by Cook and Weisberg (1982), observations with a Cook’s distance greater than one are removed from the regression sample. For the used dataset, this is the case for three observations, or days, consistently across all the estimated models. Specifically, these days are 25 August 2015, 24 June 2016 and 9 November 2016, corresponding to the China flash-crash, Brexit referendum and Trump election respectively.
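To show how the Cook’s distance rule works, the sketch below computes the distances for a regression with a single explanatory variable and an intercept. This is a simplification chosen for illustration: the models in this paper contain several regressors, for which the leverage values come from the full hat matrix. The function names and the example data are assumptions.

```python
def cooks_distances(x, y):
    """Cook's distance of every observation in a simple OLS regression of y on x."""
    n, p = len(x), 2  # p counts the intercept and the slope
    if n != len(y) or n <= p:
        raise ValueError("need equally long inputs with more than two observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance estimate
    distances = []
    for xi, e in zip(x, residuals):
        leverage = 1 / n + (xi - mean_x) ** 2 / sxx
        distances.append(e * e / (p * s2) * leverage / (1 - leverage) ** 2)
    return distances


def drop_influential(x, y, threshold=1.0):
    """Remove observations whose Cook's distance exceeds the threshold."""
    keep = [d <= threshold for d in cooks_distances(x, y)]
    return ([xi for xi, k in zip(x, keep) if k],
            [yi for yi, k in zip(y, keep) if k])


# Made-up data in which the last observation is an extreme, influential point.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
y_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 0.0]
d_demo = cooks_distances(x_demo, y_demo)
x_clean, y_clean = drop_influential(x_demo, y_demo)
```

In this example only the last observation exceeds the threshold of one, so it is the only one removed.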

Even after removing these observations, not all assumptions necessary for OLS are completely satisfied within the fitted models. The most impactful issues are heteroscedasticity and autocorrelation in the residuals. Because not all models suffer from these problems, each representation is individually tested for heteroscedasticity with the Breusch-Pagan test and for residual autocorrelation with the Durbin-Watson test. The results of these tests are reported in the appendix (A.4). If the null hypothesis of homoscedasticity is rejected in the Breusch-Pagan test, Huber-White standard errors are used to correctly calculate the p-values in the corresponding model. If the null hypothesis that residuals are serially uncorrelated is rejected in the Durbin-Watson test, or if both tests reject the null hypothesis, Newey-West standard errors are used. The method for variance calculation is mentioned for every model representation.
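The decision rule above can be summarised in a short sketch. The Durbin-Watson statistic follows its standard definition. The function `choose_standard_errors` is a hypothetical helper that takes the two test outcomes as given, and the fallback to conventional OLS standard errors when neither test rejects is an assumption that the text leaves implicit.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: near 2 suggests no first-order autocorrelation."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

def choose_standard_errors(reject_homoscedasticity, reject_no_autocorrelation):
    """Select the variance estimator from the two test outcomes."""
    if reject_no_autocorrelation:
        # Covers autocorrelation alone and both problems together.
        return "Newey-West"
    if reject_homoscedasticity:
        return "Huber-White"
    return "OLS"  # assumed default when neither null hypothesis is rejected

# Hypothetical residual series: one alternating in sign, one constant.
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
print(dw_alternating, dw_constant)
print(choose_standard_errors(True, False))
```

Values of the statistic well below two point to positive autocorrelation and values well above two to negative autocorrelation; the rejection decision itself still requires the appropriate critical values.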

Fitting open-to-close returns

After settling the model specifications, a first model is estimated and depicted in figure 2, starting with the prediction of the intraday returns as a whole. This model is the most basic representation, used by, e.g., Branch and Ma (2012) and Berkman et al. (2012), and serves as a starting point for the further addition of variables. It approximates the intraday returns based solely on the returns of the previous day and the preceding overnight returns. None of the coefficients are significant for this dataset, and the model can be improved considerably by adding and removing variables.

Model 2 is an example of this and is the best observed model that does not incorporate proxies for investors’ attention. The fifth OTC lag, which was added based on the PACF of the open-to-close returns, yields an important improvement. Although the effect of the returns in the previous night is not significant, the variable is kept in the model because it increases the adjusted R-squared and decreases the AIC.

Interestingly, the double-lagged variable for nighttime returns has a stronger effect than that of the immediately preceding overnight returns. Branch and Ma (2012) also find the earlier overnight returns to be significant, but do not offer an explanation for this phenomenon. A possible cause is that high nighttime returns two nights ago indicate an overshooting of the equilibrium price, followed by a daytime reversal. This reversal could make investors wary of buying at the open and lead to an opening price on the following day that lies below the intrinsic value. This would mean that overnight returns show negative autocorrelation, a result that is also visible in (A.2). The following day the price would revert upwards and create higher returns, producing the observed positive correlation with the returns two nights ago. The investigation of this hypothesis is beyond the scope of this paper, but it could be a topic for further research.

When proxies for investors’ attention are added in the third and fourth model, specifically the trading volume on the previous day and the interest rate, the fit improves even further, without diminishing the significance of the variables used in the second model. An increase in the previous day’s trading volume of 1 million units leads to an expected increase in intraday return of 0.003 percentage points (pp), significant at the 5% level. This is not much, but because the variation in volume is very high (standard deviation of 29.43 million units), volume is still expected to account for changes of up to 0.3 pp. The effect of the interest rate is interpreted as follows: an increase of the interest rate by one percentage point leads to a mean decrease in daytime returns of 0.25 percentage points. Its coefficient is insignificant, but model four is preferable nonetheless, as adding the interest rate lowers the information criterion and increases the adjusted R-squared without causing excessive variance inflation. The 6.7-point decrease in AIC from model two to model four means that the model with proxies for investors’ attention is $e^{6.7/2} \approx 28$ times more likely to minimize the information loss than the second model.

Figure 2: Fitted models for open-to-close returns
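The AIC comparison between model two and model four can be checked with a few lines of arithmetic. The sketch below uses the standard relative-likelihood interpretation of an AIC difference, exp(ΔAIC / 2), with the 6.7-point difference reported in the text; the function name is a hypothetical helper.

```python
import math

def aic_relative_likelihood(delta_aic):
    """How many times more likely the lower-AIC model is to minimize information loss."""
    return math.exp(delta_aic / 2.0)

# AIC decrease from model two to model four, as reported in the text.
ratio = aic_relative_likelihood(6.7)
print(round(ratio, 1))
```

The result is roughly 28, in line with the figure quoted for the open-to-close models.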

Fitting first half-hour returns

Along with fitting the daytime returns, measured open-to-close, models are also fitted for the first and last half-hour returns specifically. As described in section two, the returns near market open are expected to follow a different pattern than the intraday returns. Figure 3 displays several fitted models for the returns in the first half-hour of a trading day. In the first model, which makes no use of proxies for investors’ attention, the lagged OTC returns show a significant effect, but interestingly, the overnight returns do not. This is in contrast with the theory of day-time reversals of overnight gains, as described by Berkman et al. (2012), which would suggest a significant negative coefficient. The double-lagged overnight returns added in model two do show a significant effect, similar to that in the open-to-close models. A difference between the model fitting the daytime returns and the model specifically targeting the first half-hour is the positive effect of the lagged last half-hour returns. Apparently the returns in the last 30 minutes tend to carry over slightly, with a coefficient significant at the 5% level.

The fit of the second model can hardly be improved by adjusting for the effects of attention-driven investors. Neither the interest rate nor investor sentiment yields strong results in this case, possibly because the time-frame is so small that these long-term effects have hardly any predictive power. Short-term variables like the previous day’s trading volume and volatility should be more likely to work, which is tried in models three and four. However, correcting for the previous day’s trading volume or volatility does not yield significant coefficients and leads to a decrease in the adjusted R-squared and an increase in the AIC. Because the difference is so slight, the last three models will all be evaluated for in- and out-of-sample forecast quality.

Figure 3: Fitted models of first half-hour returns

Predicting last half-hour returns

Like the first half-hour returns, the returns in the last 30 minutes are harder to predict than the full daytime returns. One important cause is likely to be the age of the available information. Because hour-by-hour returns are not included in this research, the most recent available price movements, like overnight returns or first half-hour returns, are already 12 hours old, with their effects probably diminished.


[...] models than it is for first half-hour returns. The first model is the simplest and worst representation, where overnight returns and lagged intraday returns are the only (insignificant) explanatory variables. Model two improves this by using the overnight returns from two nights ago instead of those during the previous night. It is strange that this even older variable works slightly better when the overall quality of the model seems to suffer precisely because of the age of the information. However, a hypothesis for this effect has already been proposed and it is not investigated further here. The other addition is the lagged effect of the last half-hour returns from exactly ten trading days, or two weeks, ago. The PACF of $\text{Last30}_t$ showed significant correlation between this lag and the last half-hour returns of the present day, and this effect apparently translates into an improvement of the fit. The second model is the best model that does not incorporate proxies for investors’ attention, as incorporating the returns in the first half-hour as a predictor does not provide any extra benefit. In the third and fourth representation, the proxy variables are added, specifically previous day’s trading volume and present day’s volatility. The fit improves when the first proxy is added in model three, and improves even further with the additional proxy for volatility. Although both coefficients are very small, they significantly increase the portion of explained variance, measured as the adjusted R-squared. According to the Akaike information criterion, the last model with both proxies incorporated is more than 5 times as likely to minimize the information loss.

Forecasting

After fitting the models described above and comparing them based on the adjusted R-squared and the AIC, the best models with proxies and without proxies are chosen for further examination. The in- and out-of-sample forecasting quality is evaluated for these models, measured by the root mean squared error (RMSE). Figure 5 depicts these values, with an added ‘control’ RMSE, which is the error achieved by just using the in-sample mean as a predictor for all other observations.
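The evaluation criterion and the ‘control’ benchmark can be sketched as follows. The return series and model forecasts are hypothetical values chosen for illustration, not the thesis data; only the definitions of the RMSE and of the in-sample-mean control follow the text.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between realised and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def control_rmse(in_sample, out_of_sample):
    """RMSE obtained by predicting every out-of-sample value with the in-sample mean."""
    mean = sum(in_sample) / len(in_sample)
    return rmse(out_of_sample, [mean] * len(out_of_sample))

# Hypothetical returns in percent.
in_sample_returns = [0.1, -0.1, 0.2, 0.0]
out_of_sample_returns = [0.05, 0.15, -0.05]
model_forecasts = [0.2, -0.1, 0.1]  # hypothetical forecasts from a fitted model

model_error = rmse(out_of_sample_returns, model_forecasts)
benchmark_error = control_rmse(in_sample_returns, out_of_sample_returns)
print(model_error, benchmark_error)
```

A model is only useful for forecasting when its RMSE falls below that of the control; in this illustrative case it does not.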

As seen in the figure, the models perform worse than the control predictor in all but one case, which means that the better fit does not directly translate into useful predictive qualities. In forecasting the open-to-close returns and the last half-hour returns, the models without proxies for investors’ attention yield a lower or equal RMSE in- and out-of-sample than the representations with proxies included. For predictions of the returns in the first 30 minutes, representations with proxies for investors’ attention slightly improve out-of-sample forecasting, even though it is still worse than the control model.

Generally, incorporating additional variables to correct for the effect of attention-driven trading does not lead to the benefits that would be expected from the improvement of the fit. One possible explanation is a break in the data that creates a disparity between the predicted returns and the actual values and thereby the poor forecast quality. Another is that the model selection procedure used is not efficient. Further investigation of possible causes is relevant, but beyond the scope of this research.

5. Conclusion

The relation between intraday returns and overnight returns has been studied by numerous researchers. Hong and Wang (2000) find U-shaped patterns in returns and volatility, that can be explained by inaccurate opening prices and subsequent reversal during the trading day. Branch and Ma (2012) find that mean overnight returns exceed mean open-to-close returns, while volatility is lower. They also find that the overnight returns can be used to predict intraday returns, possibly due to the same intraday reversal of overnight returns found by Hong and Wang (2000). Berkman et al. (2012) explain and forecast the inaccurate opening prices by looking at proxies for retail investors’ attention and find that this is a likely cause and a good predictor.

This study was dedicated to investigating to what extent the incorporation of the proxies for investors’ attention found by Berkman et al. (2012) can help improve the predictions for (partial) intraday returns. The study was conducted using the US500 index proxy and utilized the interest rate, trading volume, investors’ sentiment and volatility as possible proxies for investors’ awareness. It became apparent that the addition of proxies significantly improved the model fit, measured with the Akaike information criterion and the adjusted R-squared, but did not yield meaningful forecast improvement.

Forecasting quality increased slightly for the first half-hour returns with a representation that includes proxies. However, it was still worse than a forecasting model that uses only the in-sample mean, which means this result is not directly applicable for investors. The reason for the poor forecasting is unclear, but it could be attributed to a potential unidentified break in the dataset.

Thus, the proxies were unsuccessful in improving predictions, but their addition did succeed in improving the model fit. This improvement of fit can largely be attributed to the short-term proxies. Trading volume significantly increased fit in all tested time-frames, while volatility improved the models for the first and last half-hour returns. Of the long-term proxy variables, only the interest rate showed some result, yielding a slight improvement in the predictions of open-to-close returns. Investors’ sentiment was insignificant in every situation.

Aside from this, other hypotheses that lie at the foundation of the theory of day-time reversals and the effect of retail investors were also tested. As opposed to the results found by Branch and Ma (2012) and Liu and Tse (2017), mean nighttime returns were found to be lower than mean intraday returns. This is in accordance with standard risk-reward financial theory, as the nighttime also showed lower volatility, but does not support previous work. Also, the disproportionate amount of volatility around market open and market close signifies U-shaped patterns in volatility, but little evidence was found for U-shaped patterns in returns. Although the first half-hour returns did show significantly positive gains on average, the mean last half-hour returns were found to be insignificant and should have been positive for a real U-shape to occur. Returns around market open and market close do show a slight positive correlation, also visible in the prediction of first half-hour returns using the previous day’s last half-hour returns. If this correlation were stronger, it could mean that the U-shaped patterns would be joined by an approximately equal number of inverse U-shaped patterns, which would then still explain the insignificant mean return in the last half-hour. The evidence, however, is too weak to support this claim.

In investigating the previously described hypotheses, this research has some important shortcomings. The most important one is that this is an analysis of a single security. Even though the US500 is an index proxy and not subject to company- or industry-specific effects, the reaction of the price to changing trading volume or volatility can still be particular to this index, and relations could be different for other securities. Also, the poor translation of fit into forecast quality indicates that unidentified breaks in the data may exist or that the process of model selection was inefficient. Finally, there may have been more powerful proxies for investors’ attention, such as institutional ownership or net retail buying, for which appropriate data was unavailable.

Other researchers can build on this paper by trying to reproduce the results in a model that deals with one or more of the deficiencies described above. There are also other interesting results that deserve further investigation, like the counterintuitive forecasting power of the double-lagged overnight returns. The other main audience for this research consists of investors specialized in technical analysis. They could use the results found in this paper to gain additional insight into the potential effects of incorporating extra variables for investors’ attention into their forecasting models and use this as a theoretical basis to alter the models where necessary.

References

Ahoniemi, K., & Lanne, M. (2013). Overnight stock returns and realized volatility. International Journal of Forecasting, 29(4), 592-604.

Barber, B. M., & Odean, T. (2007). All that glitters: The effect of attention and news on the buying behavior of individual and institutional investors. The Review of Financial Studies, 21(2), 785-818.

Berkman, H., Koch, P. D., Tuttle, L., & Zhang, Y. J. (2012). Paying attention: overnight returns and the hidden cost of buying at the open. Journal of Financial and Quantitative Analysis, 47(4), 715-741.

Branch, B., & Ma, A. X. (2012). Overnight return, the invisible hand behind intraday returns? Journal of Applied Finance, 2, 1-11.

Bozdogan, H. (1987). Model selection and Akaike's information criterion (AIC): The general theory and its analytical extensions. Psychometrika, 52(3), 345-370.

Cook, R. D., & Weisberg, S. (1982). Residuals and influence in regression. New York: Chapman and Hall.

French, K. R., & Roll, R. (1986). Stock return variances: The arrival of information and the reaction of traders. Journal of financial economics, 17(1), 5-26.

Gao, L., Han, Y., Li, S. Z., & Zhou, G. (2017). Market intraday momentum. Retrieved from: https://ssrn.com/abstract=2440866

Hong, H., & Wang, J. (2000). Trading and returns under periodic market closures. The Journal of Finance, 55(1), 297-354.

Jensen, M. C. (1978). Some anomalous evidence regarding market efficiency. Journal of Financial Economics, 6(2-3), 95-101.

Jorion, P. (2006). Value at Risk: The New Benchmark for Managing Financial Risk (3rd ed.). New York: McGraw-Hill.

Liu, Q., & Tse, Y. (2017). Overnight returns of stock indexes: Evidence from ETFs and futures. International Review of Economics & Finance, 48, 440-451.

Lockwood, L. J., & Linn, S. C. (1990). An examination of stock market return volatility during overnight and intraday periods, 1964–1989. The Journal of Finance, 45(2), 591-601.

Appendix

A.2: Pearson correlations of variables and lagged effects

A.3: PACFs of dependent variables
