
Understanding short information : how market efficiency benefits from industry-level short selling


Academic year: 2021


UNDERSTANDING SHORT INFORMATION

How market efficiency benefits from industry-level short selling

ABSTRACT

Our aim was to research short information, to discover where its power is concentrated and whether it originates from publicly available information or from newly generated private intelligence. By looking at industry-level forecasting abilities we find that short sellers' information is concentrated in only part of the market and that they function as macroeconomic modelers. On top of that, our output shows suggestive evidence that short sellers are in fact inter-industry information transporters as well.

Roger van Buuren

MSc Finance, Asset Management

Supervised by: Esther Eiling

Master Thesis

Universiteit van Amsterdam

June 2017


Table of contents

Introduction

Literature Review

Data

Methodology & Results

Industry predictability

In-sample analysis

Out-of-sample tests

SII predictability

Shorting information into the market

Shorting information into industries

Macroeconomic forecasting

Comparing long/short information

Controlling for SII

Coefficient analysis

Robustness

Conclusion

Reference list

Appendices

Appendix 1: SII’s per industry

Appendix 2: Median amount of Industry SII securities

Appendix 3: Industry portfolios and control descriptives

Appendix 6: AR(1) coefficients of industry excess returns, controls and SII’s

Appendix 4: Correlations of industry portfolios

Appendix 5: Predictive power of industry stone excess returns 1973-01 – 2016-12

Appendix 7: Long-term SII performance

Appendix 8: Economic activity explained by industry SII’s

Appendix 9: Economic activity explained by industry excess returns 1973-01 – 2016-12

Appendix 10: Comparing long-short information

Appendix 11: Short Interest per industry, without screen

Appendix 12: Economic activity by SII, alternative

Appendix 13: Informed industry selection, without opposite sign industries

Appendix 14: Predictive power of combined money, reits SII


Statement of Originality

This document is written by Roger van Buuren who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Introduction

Future stock market returns are hard to predict, despite attempts dating back to the 1920s. Understanding time-varying risk and possible deviations is crucial for equity investors and has implications for many other fields of economic study, such as market efficiency, behavioral finance and game theory. Over the past century many researchers have contributed to the understanding of market predictability that the academic world has today. Because they all use different approaches, Goyal and Welch summarized the state of the literature in 2008 in an orderly and structured way, and their summary has since become the main reference for other academics investigating the matter. However, recent developments in this field are not yet as established as the older predictors. Their true source of information is sometimes unknown despite lengthy theoretical substantiation.

Short sellers have recently proven to be informed traders both in the cross-section (Drechsler & Drechsler, 2016) and in the time series (Rapach, Ringgenberg & Zhou, 2016, hereafter RRZ). They have been shown to predict stock returns better than the average investor, and their positions can be used to forecast market returns. Moreover, according to RRZ their information is currently the strongest known market predictor. The question therefore arises how they have gained this competitive advantage. It is possible that short sellers utilize publicly accessible data better, or they might have an edge over other investors in anticipating new information.

Figure 1. Out-of-sample predictive power of the SII 1995-01 – 2016-12.

The line displays the cumulative difference in squared forecast errors (SSE) between the null model, which is the simple average of the observed market excess return during the estimation period, and the monthly estimation model based on the market SII. The out-of-sample model uses data from 1973-1994 as its estimation period and produces 6-month predictions. When the line goes down, the null model outperforms the estimations by the SII, and vice versa. The SII model is clearly superior overall, as shown by the end value above 0. For instance, the credit crisis in 2008 is largely accounted for by the SII compared to the null model.
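The quantity plotted in Figure 1 can be sketched as follows. This is a minimal illustration, not the thesis code: the function name and the sample numbers are hypothetical, and the forecasts of both models are taken as given inputs rather than estimated here.

```python
from itertools import accumulate


def cumulative_sse_difference(realized, model_forecasts, benchmark_forecasts):
    """Cumulative squared-error difference, benchmark minus model.

    A rising line means the model beats the benchmark over that stretch,
    a falling line means the benchmark (null model) does better.
    """
    diffs = [
        (r - b) ** 2 - (r - m) ** 2
        for r, m, b in zip(realized, model_forecasts, benchmark_forecasts)
    ]
    return list(accumulate(diffs))


# Hypothetical numbers purely for illustration (not thesis data).
realized = [0.02, -0.05, 0.01, 0.03]
model = [0.01, -0.03, 0.00, 0.02]          # e.g. SII-based forecasts
benchmark = [0.005, 0.005, 0.005, 0.005]   # e.g. historical average
line = cumulative_sse_difference(realized, model, benchmark)
```

A final value above zero corresponds to the model having the lower total squared error over the evaluation period.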


Furthermore, it is known that slow information diffusion between industries is an important reason why single industries are able to predict overall market performance (Hong, Torous & Valkanov, 2007, hereafter HTV). Not all publicly available information is incorporated in all share prices immediately, because it is apparently impossible to stay up to date on the complete market spectrum. Analysts nowadays specialize per industry in order to remain fully aware of all relevant events influencing a specific set of stocks. Cross-sector specialists, however, are hard to find, which means it is an advantage to have one.

Perhaps someday artificial intelligence will make human interpretation (and its deficiencies) redundant, but until technology and transparency allow us to monitor the market perfectly, semi-strong market efficiency is an ideal rather than a reality. Gradually improving the process of correct pricing, and thereby the functioning of the financial world, is appreciated by all real-world market participants looking to finance their own or others' operations. Speeding up the discovery of fundamental value by transporting the information available in one industry to the others would be beneficial for market efficiency and therefore valuable. In that light it makes sense to suggest that the actors doing so receive compensation. Such an investor can make highly profitable trades by simply picking up information from one industry and translating it into the valuations of other industries.

Research is still ambiguous about whether short sellers are these information transporters or rather predictors of macroeconomic conditions. Because hedge funds have the (human) resources to model the economy to the same extent as public institutions do, it is not unthinkable that the true roots of their edge lie in their ability to forecast macroeconomic conditions, rather than in exploiting the shortsightedness of investors who do not look past their own industry.

The main focus of this thesis is to understand short information. We try to discover where the predictive power is concentrated and ultimately whether short sellers are in fact solely cross-sector information transporters rather than macroeconomic forecasters. Are they trading on inter-sector information, or do they actually have the skill to predict returns themselves? The answer has several implications for the way we look at short selling in the current literature, as the debate on its desirability is still undecided.

We will start by examining the current state of the academic literature. This involves taking a closer look at short selling in recent history and at what we know about the information that short interest contains. A few papers on industry-level research and slow information diffusion will be discussed as well. Thereafter the empirical research will open with a description of the data used and its sources, to allow for replication.

Within the Methodology & Results section we first need to establish the level of information diffusion within our sample period, which demands both in-sample and out-of-sample testing. The same analysis will then, to a certain extent, be applied to our short data to determine the concentration of short information over industries and to test the predictive power of the variable. We already touch upon cross-industry information transportation within short interest when we look at industry return predictability. Next we zoom in on the predictability of market fundamentals, testing the hypothesis that short sellers are macroeconomic condition forecasters more than anything else. In the last part we examine whether the information is concentrated within the same industries, enabling us to distinguish whether predictors of economic activity are significant market forecasters and whether the predictive power of the long investors within an industry is related to the power of the short predictor in the same industry. This final analysis answers the questions of whether predictive power originates from macroeconomic interpretation and whether information is transported across industries.

The thesis finishes with comments on the robustness of the results, the additional work done to guarantee a solid outcome, and an overall conclusion. Throughout, the reader is referred to additional tables and figures that can be examined in the Appendices.

To our knowledge, this is the first extensive study of industry-level short selling. While industry-specific information has been used widely for regular long trading, the homogeneity of short information has so far been silently assumed. There still exists a gap in the financial literature between the information processing abilities of short sellers and the origin of their information. At the same time, predictive power with regard to the market and its relation to market efficiency remains a lively discussion. Being the first to combine the two fields, we look at both long and short information to clarify how they are interconnected and to see whether information flows can be defined. This is relevant both theoretically, because it can redirect the way academia looks at short sellers, and in practice, because information concentration has many possible applications for investors, regulators and other stakeholders.


Literature Review

Short selling is the act of borrowing a financial instrument with the intention of selling it immediately. This creates the obligation to buy it back at a future time. If the market price decreases in the meantime, the short seller has realized a profit: sell high first, buy low later. The lender receives a fee for the (stock) loan. This type of transaction has become increasingly popular among investors aiming to exploit a price drop (RRZ, p. 47). As the lending market developed, short selling became less costly and more stocks were shorted. The number of shares sold short is known as the short interest of a stock, and it is documented and published publicly by the SEC.
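As a worked illustration with hypothetical numbers (not taken from the thesis), the profit $\pi$ on a short position of $N$ shares sold at price $P_0$ and bought back at price $P_1$, with a total lending fee $F$, can be written as:

$$\pi = N \, (P_0 - P_1) - F$$

Shorting 100 shares at 50 and repurchasing them at 40 with a fee of 20 gives $\pi = 100 \cdot 10 - 20 = 980$. If the price instead rises to 65, $\pi = 100 \cdot (-15) - 20 = -1520$. The gain is capped at $N \cdot P_0 - F$ because the price cannot fall below zero, whereas $P_1$ has no upper bound.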

The nature of short selling has made it controversial. Borrowing requires collateral, but with short selling the losses are unlimited in theory, which can bankrupt the short seller and leave them unable to return the stock to its owner. On top of that, rising stock market prices benefit all stakeholders and entail economic growth. Having an incentive to prefer falling prices can be seen as counterproductive or even dangerously perverse. For this reason the authorities can restrict short selling for specific stocks or entire markets.

Short selling constraints are a lively topic of academic research (e.g. Jones & Lamont (2002), Haruvy & Noussair (2006), Lim (2011), Philips (2011), Beber & Pagano (2013)). Opponents of shorting point to the negative effects discussed above, whereas advocates claim that short selling improves price discovery and creates buying pressure that serves as a lifeline in periods of panic.

Who short sellers are and what drives them are other interesting questions. As mentioned before, short sellers are informed traders compared to long-side investors (Drechsler & Drechsler, 2016). The source of most sustained shorting activity is hedge funds (D’Avolio, 2002, p. 277). They either sell short directly or cause it indirectly by using derivatives whose dealers hedge the risk of the trade by short selling. Market makers also sell short, but they try to offset their positions immediately to minimize exposure. Boehmer, Jones & Zhang (2008, pp. 492-493) argue that institutional shorters are better informed because they can afford to purchase valuable research, as opposed to retail investors, who mostly need to exploit accidental negative private information despite regulatory insider trading restrictions. Diamond and Verrecchia (1987) add that relatively few noise traders populate the short selling side because short sellers cannot use the proceeds of their sales, which drives away liquidity traders. However, Boehmer et al. (2008) counter that this effect might be cancelled out to some extent by statistical arbitrageurs who have no regard for the fundamental value of the stock they are shorting.

Drechsler and Drechsler (2016) find that the short rebate fee, which is the compensation for borrowing the stock, is a strong cross-sectional return predictor. Low rebate fees make stocks cheap to short, and these cheap stocks turn out to outperform expensive stocks. This can be explained economically by D’Avolio (2002). The number of shares outstanding of a stock is fixed. Because investors usually decide either to make their portfolio available for lending or not, supply is rigid and shocks are rare. The fee rate is therefore highly dependent on the demand from short sellers. Thus high fee rates are caused by short sellers who expect the stock price to fall. This suggests that short sellers are able to act earlier than long investors, making them more informed. Drechsler and Drechsler create a cheap-minus-expensive (CME) risk factor to supplement the Fama-French model and discover that eight well-known asset pricing anomalies disappear by doing so. They argue that CME is a legitimate risk factor because shorting involves undiversifiable risk: overvalued stocks have a high covariance with each other, which a small portion of the market participants has to bear.

Given that short sellers are informed, Boehmer et al. (2008) zoom in on short order data to see whether distinctions can be made within short selling. Their micro approach involves identifying to what degree investors are able to spot overvalued stocks and which type of trader does it best. In their findings they refer to short sellers as “extremely well informed” and discover that heavily shorted stocks underperform lightly shorted stocks by 15.6% on an average risk-adjusted annualized basis. Institutional shorts are the most informed and can similarly achieve a 19.6% return. Another finding is that the effects on stock prices are permanent, which means that the short information is truly based on fundamental effects rather than on manipulation opportunities.

Next, the research by Engelberg, Reed & Ringgenberg (2012) digs deeper into how short sellers are informed. With the aid of news archives they look at the timing of short orders and find that short volume relative to total trade volume remains stable before news events, but jumps directly afterwards. A portfolio based on short trading volume would even yield a high theoretical return. This outcome confirms that the informational advantage is not a matter of news anticipation but rather of superior news processing. They verify that this volume is not due to increased market maker trading, which would mean that the event information is simply misinterpreted by long investors. This teaches us that short sellers are skilled information processors, though it remains unclear exactly how they exploit the information.

Nevertheless, long investors have information processing tools as well. The most important information processing agents are the analysts, whose prime goal is to close the information asymmetry gap between investors and firms. These financial intermediaries provide guidance based on publicly available information in the form of signals together with target prices, which reflect the presumed fundamental valuation of the company. Buyers use these recommendations as an indicator of future stock performance. A study by Drake, Rees & Swanson (2011) compares the information processing abilities of analysts with those of short sellers. Their main finding is that analysts are less accurate than short sellers. Dechow et al. (2001) already suggested that short sellers look at some firm fundamentals. However, whereas analysts tend to overweight particular accounting measures that have proven not to work properly, short sellers base their decisions on a broader set of data according to Drake et al. (2011). They discover that on average 1.11% can be earned each month with a zero-investment strategy that follows the short sellers' information when it contradicts the analysts’ recommendations. Short sellers therefore not only outperform the average investor as information interpreters, they also beat the most informed long investors as a group at news processing.

Since short sellers are such highly skilled traders, RRZ come up with the idea of using their information as a market predictor. Their research is the most recent major development in the academic literature on short information. The Short Interest Index (SII) they compiled turned out to be the strongest predictor of market excess returns for horizons up to 12 months, dominating all conventional forecasters. This index represents an equally weighted, detrended monthly level of relative short interest, which means it contains aggregate short selling information about the market. The authors suggest that this informational advantage can exist in the market because shorting involves several unique risks (recalls, fee changes) that need to be compensated. Additionally, regulatory short selling restrictions for different market participants reduce the opportunity to exploit the phenomenon, as it is subject to limits of arbitrage. Alternatively, the predictive power could stem from a fluctuating aggregate risk premium, yet this question has not been settled. RRZ are also the first to link information on macroeconomic conditions to short selling, by finding evidence that aggregate cash flows are anticipated by their index.


In parallel, financial researchers have focused on other subtopics of asset pricing as well. Among others, the slow diffusion of information started to interest academics when Eleswarapu & Tiwari (1996) first introduced market return predictability by industry-based portfolios. They find that when the market is split up into 12 portfolios, the different returns show business-cycle-like behavior. Portfolios containing exposure to basic industry and textiles are negatively correlated with total market excess return, whereas consumer durables has a positive relation with it. These findings link market forecasting to macroeconomic factors and point to information spreading across sectors.

The distinction between industries that are negatively and positively correlated with the market is the topic of a working paper by Menzly & Ozbas (2004). They argue that it has to do with the supply chain, in which upstream and downstream sectors can be separated. Splitting up the market into 85 industries and assigning them a place in the supply chain divides the market into two groups whose cross-industry momentum is thought to be a stronger predictor (triple the magnitude) than basic industry momentum. This suggests that individual industry returns are a mere hub in the total web of information exchange circulating in the market.

Hou (2007) endorses the phenomenon of slow diffusion, naming it the origin of the lead-lag effect within the financial market. His research includes monitoring intra-industry information flows, which happen to be directed from large to small stocks, and where responsiveness to bad news is especially weak. Moreover, within industries the analysts mentioned earlier are a major determinant of intra-industry information distribution. If industries are therefore the primary channel of information dispersion, one might wonder how long the inter-industry internalization of stock news takes.

Building further on this work, HTV confirm the delayed response to news between industries by taking the Fama-French 38 industries (plus REITs) as predictor variables. They find that 14 of them are significant predictors when forecasting up to 2 months ahead, and they explain the economic meaning of their signs as well. The authors point out that commodities serve as inputs for the economy, so that jumps in their prices usually lead to a downturn of the broad market index. On the other hand, industries such as retail and apparel are positive indicators that historically direct the market upwards. A macroeconomic connection is observed as well, as these industries also contain leading information on economic activity. The HTV results are extended and confirmed abroad for the largest 8 non-US markets, which suggests that the effect is consistent rather than a matter of chance. Their findings attracted great interest in the academic world, pushing the authors to update their research and release their raw data in a Note in 2014. In the extended sample the number of significant industries shrinks, but the core of the initial findings remains intact.

Yiuman (2015) critically reexamined all of the HTV results and concluded that they are less robust than previously believed. Using the same methodology, but a dataset that has since been revised, the significance of the output decreases. Furthermore he increases the robustness of the standard errors, extends the sample period, and investigates subsamples. He concludes that only 1 to 7 industries predict the market excess return, depending on how cautiously the results are interpreted. Yiuman does find evidence of predictability in the opposite direction, where the aggregate stock market forecasts industry returns. Therefore he claims there are no true leading industries, which supports market efficiency.

It is clear that short sellers are informed traders due to their high skill in information processing. This makes them leaders in terms of market return predictability. Due to some conflicting evidence there is still an ongoing debate about leading industries. However, it is widely accepted that information is disseminated gradually over the market. By looking at industry-level short data, we will see whether the two concepts are linked.

Data

For the empirical research, data is used from a few sources. Monthly short interest outstanding is taken from the WRDS Compustat Supplemental Short Interest File and merged with the CRSP database to obtain prices, shares outstanding and SIC codes for US securities traded from January 1973 until December 2016. Short interest outstanding is reported as of the 15th of the month until February 2010, when the outstanding short interest on the month's last trading day is made available as well. These month-end observations are dropped to guarantee consistency. The trailing digits of the CUSIPs in the Short Interest File need to be removed in order to match their CRSP counterparts, which allows 89.9% of the short data to be merged successfully.
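The identifier matching described above can be sketched as follows. This is a minimal illustration under stated assumptions: that the short interest file carries 9-character CUSIPs while the CRSP side uses 8-character ones, and that the field names (`cusip`, `cusip8`, `month`) and the sample rows are hypothetical rather than the actual database columns.

```python
def merge_short_with_crsp(short_rows, crsp_rows):
    """Join short-interest rows to CRSP rows on (8-character CUSIP, month)."""
    crsp_index = {(r["cusip8"], r["month"]): r for r in crsp_rows}
    merged = []
    for row in short_rows:
        # Assumption: dropping everything after the 8th character aligns the IDs.
        key = (row["cusip"][:8], row["month"])
        match = crsp_index.get(key)
        if match is not None:
            merged.append({**match, **row})
    match_rate = len(merged) / len(short_rows) if short_rows else 0.0
    return merged, match_rate


# Made-up identifiers, for illustration only.
short_rows = [
    {"cusip": "12345678X", "month": "2001-03", "short_interest": 1500},
    {"cusip": "87654321Y", "month": "2001-03", "short_interest": 700},
]
crsp_rows = [
    {"cusip8": "12345678", "month": "2001-03", "price": 25.0, "shares_out": 100000},
]
merged, match_rate = merge_short_with_crsp(short_rows, crsp_rows)
```

The returned match rate is the analogue of the merge success percentage reported in the text.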

The replication of the SII of RRZ starts by calculating the short interest as a fraction of the total shares outstanding. After removing the securities with prices below 5 USD, the mean percentage short interest is taken for each month, establishing the Equally Weighted Short Index (EWSI). This EWSI then only needs detrending to become the final SII, using Equation (1).

$$\ln(EWSI_t) = a + b \cdot t + SII_t \quad \text{for } t = 1, \ldots, T \qquad \text{Eq. (1)}$$

In this regression the natural logarithm of every $EWSI_t$ is the dependent variable. The explanatory variable is the date $t$, and the residuals represent the $SII_t$.
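The two construction steps, the equally weighted monthly mean and the linear detrending of its logarithm, can be sketched in plain Python. This is a minimal sketch rather than the thesis code: the row fields (`month`, `price`, `short_interest`, `shares_out`) and the sample values are hypothetical, and months are assumed to be consecutive so that $t = 1, \ldots, T$.

```python
import math
from collections import defaultdict


def equally_weighted_short_index(rows, min_price=5.0):
    """Monthly mean of short interest / shares outstanding, price screen applied."""
    ratios = defaultdict(list)
    for row in rows:
        if row["price"] < min_price or row["shares_out"] <= 0:
            continue  # drop securities priced below the threshold
        ratios[row["month"]].append(row["short_interest"] / row["shares_out"])
    return {month: sum(v) / len(v) for month, v in sorted(ratios.items())}


def detrend_log(series):
    """OLS of ln(series_t) on t = 1..T; returns intercept, slope and residuals."""
    T = len(series)
    t = list(range(1, T + 1))
    y = [math.log(v) for v in series]
    t_bar, y_bar = sum(t) / T, sum(y) / T
    b = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y)) / sum(
        (ti - t_bar) ** 2 for ti in t
    )
    a = y_bar - b * t_bar
    residuals = [yi - (a + b * ti) for ti, yi in zip(t, y)]
    return a, b, residuals


# Hypothetical rows, for illustration only.
rows = [
    {"month": 1, "price": 10.0, "short_interest": 100, "shares_out": 10_000},
    {"month": 1, "price": 4.0, "short_interest": 900, "shares_out": 10_000},
    {"month": 1, "price": 20.0, "short_interest": 300, "shares_out": 10_000},
    {"month": 2, "price": 11.0, "short_interest": 250, "shares_out": 10_000},
    {"month": 3, "price": 12.0, "short_interest": 300, "shares_out": 10_000},
]
ewsi = equally_weighted_short_index(rows)
a, b, sii = detrend_log(list(ewsi.values()))
```

Because the residuals come from an OLS fit with an intercept, they sum to zero over the sample.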

Because we use a larger sample period than RRZ, the slope $b$ in Equation (1) deviates from the original paper, which retroactively changes all $SII_t$ values. Composing an index in this way requires continuous adjustment of the historical data, implying a look-ahead bias. Using a rolling trend regression (adjusting the slope each month for next month’s estimate, requiring no future data) would solve this problem and make the SII more convenient for investors to use in practice, provided it remains reliable. For now we continue to construct the SII as designed by RRZ, with an extended sample period.

Figure 2. Short Interest Index (SII) 1973-01 – 2016-12.

The SII is constructed following Rapach, Ringgenberg & Zhou (2016). It represents the relative short interest outstanding. Upward spikes indicate periods in which a large portion of the shares outstanding in the market is shorted, as opposed to downward spikes, which refer to periods with little short selling.

Despite the phenomenon described above the constructed SII is still very similar to the original RRZ SII with a correlation of 0.881. RRZ scale the SII to have a standard deviation of 1. Because this is not relevant for predictive power, the adjustment has been skipped.


Another feature of this detrending measure is that by definition of OLS the mean of the residuals is 0. This makes below average SII observations easy to spot and compare.

Likewise, separate industry SII’s are created. The stocks are split up by SIC code in accordance with Ken French’s 38 portfolio classification. Additionally, a special portfolio of REITs is compiled for securities with SIC code 6798, as proposed by HTV, bringing the total number of portfolios to 39. The industry SII’s are composed following the same procedure as for the entire market SII. This means, among other things, that the industry SII’s assign equal weight to all index members. Therefore a combination of all industry SII’s does not result in the market SII, as the weights on firms differ per index.
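The classification step can be sketched as a range lookup with the REIT code checked first. The three ranges below are an illustrative subset chosen as an assumption for this sketch; the full 38-industry definitions should be taken from Ken French's published definition file, and the function name is hypothetical.

```python
REIT_SIC = 6798  # from the text: REITs are split out, following HTV

# Illustrative subset only (assumed ranges, not the complete classification).
EXAMPLE_RANGES = [
    ("oil", 1300, 1399),
    ("smoke", 2100, 2199),
    ("money", 6000, 6999),
]


def assign_industry(sic, ranges=EXAMPLE_RANGES):
    """Return the industry label for a SIC code, or None if unclassified."""
    if sic == REIT_SIC:
        return "reits"  # checked first so REITs are not absorbed by a wider range
    for name, low, high in ranges:
        if low <= sic <= high:
            return name
    return None
```

Checking the REIT code before the ranges matters because 6798 would otherwise fall inside the wider finance range used in this sketch.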

Appendix 1 summarizes the market SII and 35 remaining industry SII outputs. All of their means are zero by definition as mentioned before.

Out of these 39 industry portfolios 4 have to be dropped (garbg, steam, water and othr) due to insufficient data, a problem HTV also faced when investigating the industries. Table 1 shows that for some industries relative short interest can be measured more precisely, because the final $SII_t$ is based upon the mean of a large number of securities. Some of the industry short interests are estimated on a small number of underlying firms, causing the industry estimator to be less precise. This estimation error can cause a problem in the upcoming analyses when it produces additional noise in the variance of the SII’s. It is therefore no surprise that the average number of underlying securities per industry portfolio in Table 1 is negatively correlated with the standard deviation of the industry SII (correlation of -0.25). Overall there is an upward trend in the number of industry SII constituents, as shown graphically in Appendix 2.
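One way to see why the number of constituents matters is the following, under the simplifying assumption (not made in the thesis) that the $N$ firm-level relative short interests $s_{i,t}$ within an industry are independent with a common variance $\sigma^2$:

$$\operatorname{Var}\left(\frac{1}{N}\sum_{i=1}^{N} s_{i,t}\right) = \frac{\sigma^2}{N}$$

Under that assumption, the standard error of an industry mean built on roughly 3 securities (smoke, with a mean of 2.69 constituents in Table 1) would be about $\sqrt{909.0 / 2.69} \approx 18$ times that of one built on roughly 909 securities (money). In practice short interest is correlated across firms and its dispersion differs per industry, so this is only an order-of-magnitude illustration.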

The short selling data is supplemented by monthly industry excess return data found on Amit Goyal’s webpage, which is also used by HTV. Real estate data is added from the NAREIT website, in accordance with HTV as well. Furthermore I downloaded monthly market excess returns and risk-free rates from Ken French’s webpage and special control variables from Harrison Hong’s online dataset. These controls are the default spread (AAA rated bond yields minus BAA bond rates) and market volatility (of daily CRSP value-weighted market portfolio excess returns). They were not available in Goyal’s general predictor database and are hence only available up to 2013. The other controls are inflation (US CPI growth rate) and dividend yield (dividends paid out in the previous 12 months divided by share prices of the CRSP portfolio, as a natural logarithm), both retrieved from Goyal. Finally I used the book-to-market ratio of the Dow Jones Industrial Average from Matthew Ringgenberg’s webpage.


Table 1. Short Interest per industry 1973-01 – 2016-12.

Segregating the market into Ken French industries causes the shorting information to be split up as well. This table provides descriptive statistics on the distribution of this information and thus on the reliability of the final portfolio index. The first column lists the industry portfolios that are investigated, followed by the description of the industry classes. The last 5 columns contain data on the number of securities assigned to the respective industries. Observations (Obs.) is the number of observed data points of the industry SII, meaning the number of months within the sample period in which at least one security with short interest information is assigned to that particular industry. Our sample period consists of 528 months, which is therefore the upper bound. Any number below 528 reflects missing SII observations for that industry. The Mean column displays how many securities on average contribute to the industry SII, and the remaining columns provide the standard deviation (St. Dev.), lowest observed value (Min.) and largest observed value (Max.) of the number of contributors to the respective industry SII’s.

| Industry | Description | Obs. | Mean | St. Dev. | Min. | Max. |
|---|---|---|---|---|---|---|
| agric | Agriculture, forestry, and fishing | 499 | 4.8 | 2.85 | 1 | 12 |
| mines | Mining | 527 | 26.9 | 15.10 | 4 | 70 |
| oil | Oil and Gas Extraction | 528 | 77.6 | 38.76 | 6 | 159 |
| stone | Nonmetallic Minerals Except Fuels | 509 | 4.4 | 2.49 | 1 | 11 |
| cnstr | Construction | 528 | 25.2 | 10.41 | 1 | 45 |
| food | Food and Kindred Products | 528 | 43.5 | 17.88 | 2 | 73 |
| smoke | Tobacco Products | 449 | 2.69 | 1.26 | 1 | 7 |
| txtls | Textile Mill Products | 527 | 10.2 | 5.43 | 1 | 25 |
| apprl | Apparel and other Textile Products | 521 | 16.6 | 8.46 | 1 | 37 |
| wood | Lumber and Wood Products | 527 | 10.1 | 3.88 | 1 | 18 |
| chair | Furniture and Fixtures | 484 | 10.8 | 5.68 | 1 | 24 |
| paper | Paper and Allied Products | 525 | 17.4 | 6.12 | 1 | 31 |
| publi | Printing and Publishing | 528 | 21.2 | 8.90 | 1 | 38 |
| chems | Chemicals and Allied Products | 528 | 151.2 | 100.79 | 13 | 392 |
| ptrlm | Petroleum and Coal Products | 528 | 24.4 | 5.45 | 2 | 34 |
| rubbr | Rubber and Miscellaneous Plastics Products | 527 | 17.7 | 5.32 | 2 | 31 |
| lethr | Leather and Leather Products | 489 | 7.3 | 3.28 | 1 | 15 |
| glass | Stone, Clay and Glass Products | 527 | 12.8 | 5.41 | 1 | 24 |
| metal | Primary Metal Industries | 528 | 33.0 | 11.88 | 2 | 57 |
| mtlpr | Fabricated Metal Products | 527 | 29.5 | 7.85 | 3 | 44 |
| machn | Machinery, Except Electrical | 528 | 97.9 | 47.74 | 5 | 204 |
| elctr | Electrical and Electronic Equipment | 528 | 111.5 | 78.76 | 11 | 292 |
| cars | Transportation Equipment | 528 | 51.7 | 15.24 | 7 | 79 |
| instr | Instruments and Related Products | 528 | 85.7 | 64.90 | 2 | 235 |
| manuf | Miscellaneous Manufacturing Industries | 528 | 14.5 | 6.75 | 1 | 32 |
| trans | Transportation | 528 | 55.5 | 38.98 | 2 | 129 |
| phone | Telephone and Telegraph Communication | 527 | 30.6 | 20.53 | 2 | 78 |
| tv | Radio and Television Broadcasting | 527 | 24.2 | 15.97 | 2 | 60 |
| utils | Electric, Gas, and Water Supply | 528 | 100.4 | 34.38 | 8 | 148 |
| garbg | Sanitary Services | 474 | 6.7 | 3.62 | 1 | 14 |
| steam | Steam Supply | 0 | - | - | - | - |
| water | Irrigation Systems | 0 | - | - | - | - |
| whlsl | Wholesale | 528 | 56.5 | 32.85 | 2 | 119 |
| rtail | Retail Stores | 528 | 115.9 | 64.87 | 8 | 253 |
| money | Finance and Insurance | 528 | 909.0 | 917.66 | 5 | 2796 |
| srvc | Services | 528 | 247.5 | 215.73 | 7 | 672 |
| govt | Public Administration | 527 | 9.1 | 5.33 | 1 | 24 |
| othr | Almost Nothing | 0 | - | - | - | - |


The correlations between the discussed control variables are reported in Table 2, which shows that the controls are in general complementary rather than overlapping.

Table 2. Correlation of control variables 1973-01 – 2016-12.

Correlations between the different variables are reported below. The first two columns and the first row identify the variables. The first five variables are control variables suggested by previous literature and the last row reports correlations with the market excess return. Naturally the correlation of each variable with itself is 1.00. The correlation between dividend yield and book-to-market ratio is strikingly high, which can be partially traced back to the way both are calculated. All other correlations remain below 0.50 (and above -0.50).

bm dy dspr mktvol infl mktrf

bm Book-to-market ratio 1.00 0.90 0.45 -0.10 0.49 -0.06

dy Dividend yield 0.90 1.00 0.48 -0.13 0.38 0.05

dspr Default spread 0.45 0.48 1.00 0.35 -0.01 0.04

mktvol Market volatility -0.10 -0.13 0.35 1.00 -0.19 -0.30

infl Inflation 0.49 0.38 -0.01 -0.19 1.00 -0.11

mktrf Market excess return -0.06 0.05 0.04 -0.30 -0.11 1.00

Appendix 3 summarizes the descriptive statistics of the variables mentioned above. Panels A and B display the data for the period 1973 – 2016, whereas Panels C and D show the statistics for 1946 – 2016, which starts when the HTV sample period started. Appendix 4 then reports the correlations between the different industry portfolio excess returns. Correlations vary between 0.19 (smoke, mines) and 0.89 (elctr, machn). Naturally some industries have more in common than others, and this is reflected in the correlation matrix as well.

The lack of observations in the dropped industries discussed before (steam, water, garbg and othr) is also reflected in the availability of the industry excess returns. On top of that the govt portfolio is missing many observations as well and will also be left out of the remaining analysis, leaving us with 34 industry portfolios to investigate.

Finally we need a non-market proxy for economic activity. HTV (p. 373) use the coincident index for economic activity from Stock and Watson. However, according to the NBER this measure is outdated and it was replaced in December 2003. We use its replacement, the Chicago FED National Activity Index (CFNAI), which combines 85 macroeconomic indicators each month according to the Stock and Watson (1989) methodology. It is designed specifically to provide an indication of the current state of production instead of future growth conditions. The April 2017 version of the index is used, downloaded from the Chicago FED website.

Figure 3. National activity index 1973-01 – 2016-12.

In this graph the monthly Chicago FED National Activity Index is shown for our sample period. It is an approximation of economic activity. The mean and standard deviation are set to 0 and 1 respectively. The index is high when macroeconomic conditions are good and falls in months of economic stress. For this reason the oil crisis in the mid-70’s and the credit crisis in the late 00’s can be recognized immediately by their downward spikes.

Methodology & Results

Industry predictability

As discussed before we first need to establish the degree of slow information diffusion and identify the leading industries. When researching time series predictability, in-sample significance can provide a broad indication of which factors are relevant, but out-of-sample significance is truly essential. Goyal and Welch (2008) therefore make a sharp distinction between the two types of analysis when evaluating predictive models. In-sample regression estimates are only known ex-post. Because investment decisions are made ex-ante, in-sample models are useless in practice.


In-sample analysis

In order to establish any relationship we need to examine the current status of the predictive power of the industry portfolios. If there are any consistently leading industry portfolios, it means that there is public information available in some parts of the market which is not fully exploited by investors. This would contradict market efficiency in the semi-strong sense.

To investigate this we will replicate the research done by HTV, which is key because Yiuman (2015) has already shown that their results are sensitive to various data selection criteria. This procedure involves running the Ordinary Least Squares regressions specified in Equation (2) for our own period of interest, 1973 - 2016.

$$R_{m,t:t+h} = \alpha_i + \beta_{1,i} R_{i,t-1} + \beta_{2,i} Z_{t-1} + \varepsilon_{i,t:t+h} \quad \text{for } t = 1, \dots, T-h$$

Eq.(2)

Equation (2) is a standard framework for research on time series predictability, where $R_{m,t:t+h}$ is the market excess return that we attempt to forecast with the lagged variables $R_{i,t-1}$ and $Z_{t-1}$. The regression parameters $\alpha$ and $\beta$ and the error term $\varepsilon$ all carry the subscript $i$ because they are industry specific. The horizon in months is set by ℎ. We check ℎ = 1, 3, 6 and 12 because the SII predictability is concentrated in these shorter horizons (see Appendix 7). Our main variable of interest is $R_{i,t-1}$, the lagged industry portfolio excess return. $Z_{t-1}$ represents a vector of (combined) control variables with the same lag as the explanatory industry excess returns.
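As a minimal sketch of how a regression like Equation (2) could be estimated, the Python snippet below builds the h-month market excess return and regresses it on the lagged industry return and a single lagged control. The helper names (`cumulative_returns`, `ols`, `predictive_regression`), the reading of the cumulative return as a simple sum of monthly returns, and the toy data are assumptions made for illustration; they are not the code or data behind the reported estimates.

```python
import random

def cumulative_returns(r, h):
    """Sum of h consecutive monthly returns starting in month t (assumed reading of R_{t:t+h})."""
    return [sum(r[t:t + h]) for t in range(len(r) - h + 1)]

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda q: abs(A[q][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for q in range(k):
            if q != c:
                f = A[q][c] / A[c][c]
                A[q] = [a - f * b for a, b in zip(A[q], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predictive_regression(r_mkt, r_ind, z, h):
    """Return [alpha, beta_1, beta_2] for one industry and one control variable."""
    y_all = cumulative_returns(r_mkt, h)
    # Start at t = 1 so that the lagged regressors at t - 1 exist.
    X = [[1.0, r_ind[t - 1], z[t - 1]] for t in range(1, len(y_all))]
    return ols(X, y_all[1:])

# Hypothetical toy data: the market follows last month's industry return and control exactly.
random.seed(0)
r_ind = [random.gauss(0.0, 0.05) for _ in range(120)]
z = [random.gauss(0.0, 1.0) for _ in range(120)]
r_mkt = [0.0] + [0.01 + 0.5 * r_ind[t - 1] - 0.02 * z[t - 1] for t in range(1, 120)]
alpha, beta_1, beta_2 = predictive_regression(r_mkt, r_ind, z, h=1)
```

Running the function once per industry and per horizon would give one $\beta_{1,i}$ for each cell of a table such as Table 3. Robust standard errors are deliberately left out of this sketch.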

Using all controls at once in a so-called ‘kitchen sink’ regression would invite unreliable estimators through overfitting, which would be an out-of-sample disaster even though it might provide significant in-sample outcomes. For this reason the control variables discussed in the Data section are implemented one by one in $Z_{t-1}$. Along with market volatility, default spread, book-to-market ratio, inflation and dividend yield, the lagged market excess return is used to capture time-varying risk. Because lagged market excess returns are used in the autoregressive model, which is known to be a basic but effective ‘internal’ estimation method (Kenett et al (2015)), they are combined with the ‘external’ control variables in many specifications. These controls match the HTV procedure.


The OLS estimates can be seen in Table 3. The difference with the HTV conclusions is remarkable. With p=0.10, on average 3.5 of the industries should be significant predictors by pure chance. For the one month and three month horizon the number of significant industries is 4 and 3 respectively, making an argument for true predictive powers harder. The market excess returns for longer horizons have more significant predictor industries (9 for ℎ = 6 and 10 for ℎ = 12). This signals that the distribution of information takes longer than previously believed.
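The chance benchmark in the paragraph above can be made concrete with a small binomial calculation. This is only a rough sketch: it assumes that the industry tests are independent, which correlated industry returns will violate, and it sets the number of tests to 35 purely so that the expected count matches the 3.5 quoted in the text. The function name `prob_at_least` is hypothetical.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

n_tests = 35    # assumption: chosen so that n * p reproduces the 3.5 quoted in the text
p_level = 0.10  # significance level used in the text

expected_by_chance = n_tests * p_level
p_four_or_more = prob_at_least(4, n_tests, p_level)
print(f"expected false positives: {expected_by_chance:.1f}")
print(f"P(at least 4 significant by chance): {p_four_or_more:.2f}")
```

Under these assumptions four or more significant industries would arise by chance in a bit under half of all samples, which illustrates why the short-horizon counts are weak evidence of true predictive power.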

Throughout the horizons the core of the predictive industries remains stable, which is in line with the current paradigm in which the market leaders have a structural place in the information hierarchy due to their fundamentals. Stone persists in all horizons as a significant predictor. Petroleum is significant in all but one case, although it is a better short-term than long-term predictor. The 6 and 12 month horizons add more predictive industries to the selection, with oil, mines, cnstr, metal, mtlpr and trans as the most distinctive ones. A commonality in these industries is often the link (direct or indirect) to commodities, the top of the supply chain, as also pointed out by Menzly and Ozbas (2004). Interestingly almost all significant signs are negative, with the exception of phone, tv and money. A possible explanation for this is that those industries are more technology related and operate at the other end of the supply chain. When critical upstream components of the business are under pressure to cut prices, this can be profitable for the downstream part of the market. It is possible that the other positive market predictors in the research by HTV have since lost their leading status.

The R²’s of the significant predictors are usually between 1% and 2%. This is partly due to the head start in terms of R² provided by the significance of the lagged market excess return itself, which is included in the regression as well. For example, this variable alone already accounts for 0.5% at the ℎ = 1 horizon.

When looking at the R²’s it should be noted that comparisons across horizons are not accurate, because the R²’s increase mechanically with the horizon due to the autoregressive nature of the predictive excess returns (Cochrane, p. 392). The AR(1) betas of the industry excess returns are not as close to 1 as those of most predictive factors, but as can be seen in Appendix 6 the effect is large enough that it cannot be ignored and interpretation requires care.

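For reference, an AR(1) beta of the kind reported in Appendix 6 can be obtained by regressing a series on its own one-month lag. The sketch below shows that calculation in plain Python; the function name `ar1_coefficient` and the toy series are hypothetical and only serve to illustrate the mechanics.

```python
def ar1_coefficient(x):
    """OLS slope of x_t on x_{t-1} (with intercept)."""
    lagged, current = x[:-1], x[1:]
    mean_lag = sum(lagged) / len(lagged)
    mean_cur = sum(current) / len(current)
    cov = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lagged, current))
    var = sum((a - mean_lag) ** 2 for a in lagged)
    return cov / var

# Hypothetical toy series: a highly persistent one and a sign-flipping one.
persistent = [0.9 ** t for t in range(50)]      # x_t = 0.9 * x_{t-1}
alternating = [(-1.0) ** t for t in range(50)]  # x_t = -x_{t-1}
rho_persistent = ar1_coefficient(persistent)
rho_alternating = ar1_coefficient(alternating)
```

The closer the coefficient of a predictor is to 1, the more its long-horizon R² is inflated mechanically, which is why the text treats cross-horizon comparisons with caution.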

Table 3. Market predictability by industries 1973-01 – 2016-12.

This table reports the estimates for the HTV regressions. The dependent variable is the market excess return and the industry excess returns listed in column one are the main independent variable, as in Equation (2). The control used is the lagged market excess return. The horizon is either 1, 3, 6 or 12 months, as indicated under horizon of predictability. In-sample OLS estimates, denoted $\beta_{1,i}$ in Equation (2), are shown in the first column of each horizon with robust standard errors below in brackets. The second column of each horizon states the R²’s of the regression. The top R² is based on in-sample performance, whereas the bottom one is the out-of-sample R² for the period 1995-01 – 2016-12. The out-of-sample R² calculation follows Equation (3) and uses OLS coefficients estimated over the sample period 1973-01 – 1994-12. Estimates that are statistically significantly different from zero are marked with *, **, *** for significance levels of 10%, 5% and 1% respectively.


Compared to the original HTV paper the number of significant industries is small. However the overall picture is the same, with similar informed industries and similar signs of these informative excess returns. The updated note by HTV (2014) already shifts towards these results, with less significance after data revisions and extension. The R²’s are small, even the significant ones (1% - 3%), when compared to the predictors examined by Goyal and Welch (2008) (3% - 5%). This is possibly due to the publication of the HTV industries, which may have made investors aware of the effect and led them to exploit it, thereby partially trading it away and putting the leading industry returns under pressure. As mentioned in the introduction, it is also possible that technological advancements have sped up the circulation of information, making the leads less powerful in their forecasts.

Out-of-sample tests

As mentioned before we need to look at out-of-sample performance to evaluate the true predictive capacities of our estimators. In order to do so, the sample period is split into two subsamples. Based on the first subsample, 1973 – 1994, the predictive model for market excess returns is estimated using Equation (2), again with lagged market excess returns as the sole control. The market excess return forecasts based on the model can then be compared to the actually observed market excess returns to obtain the forecast errors. The sum of squared errors (SSE) divided by the total sum of squares (TSS) yields the unexplained part of the variation in the dependent variable. This way we can find the explained part of the model, R², as well.

$$R_i^2 = 1 - \frac{SSE_i}{TSS_i}$$

Eq.(3.1)

$$TSS_i = \sum_{t=1}^{T-h} \left( R_{m,t:t+h} - \left( \hat{\alpha}_i + \hat{\beta}_i R_{m,t-1} \right) \right)^2$$

Eq.(3.2)

Because this estimation model is not by definition optimal for out-of-sample excess return prediction, it is possible for the values of $R_i^2$ to become negative. A negative $R_i^2$ indicates a predictor whose forecasts are further off than those of the benchmark estimated on the first subsample, in which the next observed excess return is predicted only by the lagged market excess return. By adjusting the $R_i^2$ calculation in this way (Equation (3)) the final values will be lower. Because the variation captured by the lagged market excess return (the time-varying risk premium) is already accounted for in the TSS, the denominator is reduced, which pushes $R_i^2$ downwards. This makes the test harder to pass for our industry portfolio excess returns. It allows us to isolate the true predictive effect of the industry excess return coefficient as opposed to that of the full model.

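The benchmark-adjusted out-of-sample R² described above can be sketched in a few lines of Python. The function name `out_of_sample_r2` and all numbers are hypothetical; the only assumption carried over from the text is that the denominator uses the squared forecast errors of the benchmark model (lagged market excess return only) rather than deviations from a mean.

```python
def out_of_sample_r2(actual, model_forecast, benchmark_forecast):
    """1 - SSE/TSS, where TSS is the squared forecast error of the benchmark model."""
    sse = sum((a - f) ** 2 for a, f in zip(actual, model_forecast))
    tss = sum((a - b) ** 2 for a, b in zip(actual, benchmark_forecast))
    return 1.0 - sse / tss

# Hypothetical out-of-sample returns and forecasts.
actual = [0.02, -0.01, 0.03, 0.00]
benchmark = [0.01, 0.01, 0.01, 0.01]    # forecasts from lagged market return only
good_model = [0.02, 0.00, 0.02, 0.00]   # industry model that beats the benchmark
poor_model = [-0.02, 0.03, -0.01, 0.04] # industry model that is worse than the benchmark

r2_good = out_of_sample_r2(actual, good_model, benchmark)
r2_poor = out_of_sample_r2(actual, poor_model, benchmark)
r2_same = out_of_sample_r2(actual, benchmark, benchmark)
```

A model that merely reproduces the benchmark forecasts scores exactly zero, so any positive value reflects the contribution of the industry return on top of the lagged market return.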

Where the significant industries already underperformed in in-sample testing compared to the original HTV paper, this trend continues in the out-of-sample test. These results can be seen in the second row of each estimator’s R² column in Table 3. Only the stone industry keeps a positive R² for the shortest two horizons, while ptrlm, tv and money fail to do so. This suggests once again that information diffusion takes at least a medium term to take place, because for ℎ = 6 and ℎ = 12 only a few predictor industries (mines, metal) do not meet the zero out-of-sample $R_i^2$ threshold. All in all agric (ℎ = 12), oil, stone, cnstr, paper (ℎ = 6), ptrlm (ℎ = 12), mtlpr, manuf (ℎ = 6), trans, phone (ℎ = 6) and tv (ℎ = 12) are the main predictor industries to watch in the medium to long term. These industries are typically related to the input-output relation of the business cycle defined in the Literature Review section. The explained deviation of the market excess return from the model based solely on its own lag is still 2.3% at most, and usually around 0.6% - 1.0%.

Finally the stability of the estimators can be examined graphically by looking at the difference between the cumulative SSE of a model and that of a baseline model over time, as done for all time series predictors by Goyal and Welch (2008). Figure 4 illustrates that when we compare against the null model, which contains only the lagged market excess return as an independent variable, we can disentangle the effect through time. Where the money industry plot collapses almost immediately in the late 90’s and continues to fall afterwards, the stone predictor model follows its in-sample counterpart closely. By default the in-sample estimator wins in the end, but continuous near-parallel movements are a good sign in terms of true predictive power. A stable predictor supports the theory of slow information diffusion, because it shows that its power is not predominantly a matter of chance.

When looking at Figure 4 it is no surprise that the money industry’s information edge is not powerful enough to withstand out-of-sample testing. Critically reviewing the stone industry predictability leads to the question why both lines do not rise above 0 until the financial crisis of 2008, but that goes beyond our focus for now. Needless to say these plots can be generated for every time series predictor and every horizon.

We have now established insights into where we can find the informational advantage in the market. Public information asymmetry does not appear to be resolved in the short term, as we can only observe weak evidence of it. For medium term horizons however some industry portfolios have excess returns that predict market excess returns in such a way that a constant effect of slow information diffusion cannot be ignored. News in these industries receives too little attention from investors, allowing for the predictive power of their returns.

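The curve plotted in this type of diagnostic can be sketched as a running sum of squared-error differences. The snippet below is an illustrative Python version under the sign convention used in Figure 4 (the line rises when the industry model beats the baseline); the function name `cumulative_sse_difference` and the toy numbers are hypothetical.

```python
from itertools import accumulate

def cumulative_sse_difference(actual, benchmark_forecast, model_forecast):
    """Running sum of (benchmark squared error - model squared error).

    The path rises in months where the model forecast is closer to the
    realised return than the benchmark forecast, and falls otherwise.
    """
    diffs = [(a - b) ** 2 - (a - m) ** 2
             for a, b, m in zip(actual, benchmark_forecast, model_forecast)]
    return list(accumulate(diffs))

# Hypothetical toy inputs with easy-to-check arithmetic.
path = cumulative_sse_difference([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
print(path)  # [1.0, 4.0, 9.0]
```

Plotting one such path with full-sample coefficients and one with first-subsample coefficients gives the dotted and solid lines respectively.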

Figure 4. Performance of the in-sample and out-of-sample industry models 1995-01 – 2016-12.

The two lines displayed are the cumulative SSE differences between the estimation models based on industry portfolio excess returns plus the lagged market excess return and a stable benchmark based on lagged market excess returns only. The dotted line represents the in-sample model, which uses observations from 1973-2016 for ℎ = 1. The solid line in turn is the out-of-sample model that uses data from 1973-1994 for ℎ = 1. When the line goes down, it means that the baseline model outperforms the estimations by the industry portfolio, and vice versa. The left plot shows the model based on the stone industry excess returns, which has been found to be reliable both in- and out-of-sample. The right example is the money industry. This industry was significant in-sample at p=0.10, but clearly does not work as well out-of-sample.

SII predictability

Now that we have touched base with industry predictability we can extend this knowledge to the industry SII predictability. We need to determine the exact predictive power of the SII and the individual SII’s in order to see whether the short industry information shows a similar pattern with regard to information on the market excess returns. RRZ assume that the total market SII is the best predictor for market excess returns. This would make sense when multiple industry SII’s contain additional information which is not diluted by underperforming industry SII’s.

Shorting information into the market

First we take the simple forecasting model proposed by RRZ and adjust it for multiple SII inputs. This results in Equation (4), which is very similar to Equation (2), except for the independent variable and the absence of a control vector.

$$R_{m,t:t+h} = \alpha_i + \gamma_i \, SII_{i,t-1} + \varepsilon_{i,t:t+h}$$

Eq.(4)

This specification enables us to test the forecasting ability of every short index for the market portfolio. There are a variety of reasons why these individual indices could be predictive. We know from the HTV analysis that industries are of different importance to market excess returns. Because some industries have more effect on others, the outstanding short positions in different industries also carry a different exposure for investors. Shorting key stocks can be a way to lever up the investment when market swings are expected to affect one set of stocks more than others (CAPM beta > 1). Shorting stocks instead of the market can also be a way of bypassing the reporting obligation for diversified ETFs and remaining under the radar of competing investors.

Table 4 provides the required estimations to discuss the predictive powers of the industry SII’s. First of all it makes sense intuitively that all significant signs are negative. This is in line with RRZ and other literature (e.g. Engelberg et al (2012), Drake et al (2011)) and means that high shorting activity forecasts low excess returns, which is exactly what short sellers are aiming to time. From this we can conclude that their skills are up to the level that they are not consistently wrong. Of course short sellers would not exist at all if the market did not allow for the skill to be profitable.

The economic magnitude of the coefficients needs to be seen relative to the SII values. Because SII values lie roughly between -2.5 and 2.5, a coefficient of -0.010 predicts that when the respective SII falls by three standard deviations of 0.5 each, the market excess return rises by 1.5% the next month. By comparison, industry excess returns in the HTV analysis have standard deviations barely reaching 0.010. When their coefficient is 0.100 and a drop of three standard deviations occurs, the forecasted effect on the market performance is only 0.3%. They need a shock five times as large (in standard deviations) to produce the same forecasted effect on the market, which makes them a less attractive prediction method in practice.

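The arithmetic behind this comparison can be checked with a few lines of Python. All inputs are the illustrative values quoted in the paragraph above; the variable names are hypothetical.

```python
# Illustrative values quoted in the text.
sii_coef, sii_sd = -0.010, 0.5    # SII coefficient and SII standard deviation
ind_coef, ind_sd = 0.100, 0.010   # industry return coefficient and standard deviation
shock_in_sd = -3                  # a fall of three standard deviations

sii_effect = sii_coef * shock_in_sd * sii_sd  # forecasted change in market excess return
ind_effect = ind_coef * shock_in_sd * ind_sd
ratio = abs(sii_effect) / abs(ind_effect)

print(f"SII shock:      {sii_effect:+.3%}")
print(f"Industry shock: {ind_effect:+.3%}")
print(f"Ratio of magnitudes: {ratio:.1f}")
```

The SII shock moves the forecast by 1.5 percentage points and the industry return shock by 0.3, which is where the factor of five comes from.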

Table 4. Market predictability by industry SII’s 1973 - 2016.

This table reports the estimates for the predictive regressions of market excess returns on industry SII’s. The dependent variable is the market excess return and the industry SII’s listed in column one are the independent variable, as in Equation (4), with the exception of the first row, where the market SII is the independent variable. The horizon is either 1, 3, 6 or 12 months, as indicated under horizon of predictability. In-sample OLS estimates, denoted $\gamma_i$ in Equation (4), are shown in the first column of each horizon with robust standard errors below in brackets. The second column of each horizon states the R²’s of the regression. The top R² is based on in-sample performance, whereas the bottom one is the out-of-sample R² for the period 1995-01 – 2016-12. The out-of-sample R² calculation follows Equation (5) and uses OLS coefficients estimated over the sample period 1973-01 – 1994-12. Estimates that are statistically significantly different from zero are marked with *, **, *** for significance levels of 10%, 5% and 1% respectively.

Horizon of predictability

Ind.    h=1 Est.  h=1 R2(%)  h=3 Est.  h=3 R2(%)  h=6 Est.  h=6 R2(%)  h=12 Est.  h=12 R2(%)
sii     -0.014*  0.8  -0.047***  2.6  -0.096***  5.2  -0.177***  8.8
        (0.008)  0.1  (0.015)  3.9  (0.022)  8.2  (0.033)  12.5
agric   -0.003  0.3  -0.005  0.2  -0.003  0.1  0.031***  2.4
        (0.003)  0.3  (0.005)  0.5  (0.008)  1.2  (0.010)  2.0
mines   -0.002  0.0  0.004  0.0  0.000  0.0  0.028  0.5
        (0.005)  -0.3  (0.010)  0.2  (0.016)  0.2  (0.018)  1.7
oil     -0.012**  1.0  -0.040***  3.5  -0.085***  7.5  -0.123***  7.8
        (0.005)  -1.8  (0.009)  -1.8  (0.012)  5.3  (0.017)  6.7
stone   -0.004  0.4  -0.008  0.6  -0.011  0.5  -0.005  0.1
        (0.003)  -0.3  (0.006)  -0.8  (0.008)  0.5  (0.011)  2.3
cnstr   -0.005  0.4  -0.016**  1.1  -0.029***  1.8  -0.026*  0.7
        (0.004)  -1.0  (0.006)  -2.5  (0.010)  -2.1  (0.015)  0.3
food    0.002  0.1  0.006  0.2  0.013*  0.5  -0.002  0.0
        (0.004)  0.1  (0.006)  -0.4  (0.007)  -3.0  (0.011)  -4.1
smoke   0.001  0.0  0.000  0.0  -0.001  0.0  -0.006  0.1
        (0.003)  -0.2  (0.005)  0.1  (0.006)  0.8  (0.009)  2.5
txtls   -0.002  0.1  -0.005  0.1  -0.006  0.1  0.002  0.0
        (0.003)  0.1  (0.007)  0.5  (0.011)  0.6  (0.015)  0.4
apprl   -0.001**  1.1  -0.024***  2.2  -0.051***  4.9  -0.069***  4.4
        (0.004)  0.4  (0.008)  0.8  (0.012)  3.7  (0.015)  6.2
wood    -0.002  0.1  -0.002  0.0  -0.006  0.1  -0.015  0.3
        (0.003)  -0.4  (0.006)  -2.2  (0.009)  -4.6  (0.013)  -3.1
chair   -0.006*  0.8  -0.01  0.7  -0.019*  1.1  -0.044***  2.9
        (0.003)  1.0  (0.007)  1.5  (0.010)  2.4  (0.013)  5.4
paper   0.003  0.1  0.003  0.0  0.004  0.0  0.009  0.1
        (0.003)  -0.1  (0.006)  -0.2  (0.010)  -0.1  (0.013)  0.9
publi   -0.001  0.0  -0.011  0.3  -0.034**  1.3  -0.085***  4.1
        (0.006)  -0.7  (0.009)  -0.3  (0.015)  1.1  (0.019)  5.6
chems   0.001  0.0  0.001  0.0  -0.003  0.0  -0.014  0.1
        (0.005)  -0.3  (0.009)  -1.0  (0.012)  -1.3  (0.016)  0.0
ptrlm   0.005  0.4  0.005  0.1  0.008  0.2  0.020  0.5
        (0.004)  0.1  (0.006)  0.4  (0.009)  0.9  (0.013)  2.4
rubbr   -0.008**  1.0  -0.021***  2.3  -0.041***  4.3  -0.091***  10.3
        (0.003)  0.8  (0.007)  2.5  (0.010)  4.6  (0.013)  8.9
lethr   -0.002  0.1  0.000  0.0  -0.002  0.0  -0.014  0.5
        (0.002)  0.0  (0.004)  0.4  (0.006)  1.2  (0.009)  2.2
glass   -0.009**  1.0  -0.022***  2.1  -0.028***  1.6  -0.035**  1.2
        (0.004)  0.3  (0.007)  0.9  (0.011)  1.5  (0.015)  2.2
metal   -0.004  0.1  -0.016  0.5  -0.034**  1.0  -0.079***  2.7
        (0.006)  0.1  (0.011)  0.7  (0.016)  1.8  (0.022)  4.9
mtlpr   -0.002  0.1  -0.005  0.1  -0.007  0.1  -0.016  0.2
        (0.005)  0.0  (0.009)  0.3  (0.014)  0.6  (0.019)  1.6
machn   0.010  0.3  0.025  0.7  0.064***  2.0  0.122***  3.6
        (0.009)  -1.6  (0.016)  -2.0  (0.024)  -4.0  (0.036)  -1.1
elctr   -0.014***  1.9  -0.037***  4.1  -0.075***  8.0  -0.112***  9.0
        (0.005)  -1.9  (0.008)  2.1  (0.010)  8.0  (0.014)  9.2
cars    -0.011**  0.7  -0.039***  2.6  -0.075***  4.5  -0.114***  5.2
        (0.006)  0.9  (0.011)  3.2  (0.016)  5.5  (0.023)  7.6
instr   -0.004  0.1  -0.019*  0.5  -0.042**  1.1  -0.057**  1.0
        (0.006)  0.1  (0.012)  0.1  (0.018)  1.7  (0.024)  2.6
manuf   0.000  0.0  0.004  0.1  0.011  0.3  0.033**  1.5
        (0.004)  -0.6  (0.007)  -0.1  (0.012)  0.4  (0.013)  3.0
trans   -0.004  0.1  -0.014  0.3  -0.037**  1.0  -0.059**  1.4
        (0.006)  -1.8  (0.011)  -2.0  (0.016)  -3.6  (0.024)  -7.1
phone   -0.005*  0.7  -0.014***  1.7  -0.029***  3.5  -0.046***  4.4
        (0.003)  0.6  (0.005)  2.1  (0.007)  4.5  (0.010)  6.9
tv      -0.008  0.4  -0.013  0.3  -0.033*  0.8  -0.044*  0.8
        (0.006)  0.5  (0.012)  -0.3  (0.019)  -1.0  (0.023)  0.2
utils   -0.006  0.4  -0.013*  0.6  -0.025***  1.1  -0.051***  2.4
        (0.004)  0.3  (0.007)  1.0  (0.009)  2.2  (0.013)  4.6
whlsl   -0.006  0.3  -0.028***  2.5  -0.067***  6.9  -0.125***  11.8
        (0.005)  -0.1  (0.009)  0.5  (0.013)  1.2  (0.015)  2.4
rtail   -0.004  0.1  -0.019  0.4  -0.046*  1.2  -0.112***  3.5
        (0.008)  -0.1  (0.015)  0.8  (0.024)  2.3  (0.033)  1.9
money   -0.005  0.2  -0.019**  1.1  -0.041***  2.4  -0.078***  4.3
        (0.005)  -0.3  (0.009)  -0.8  (0.013)  1.2  (0.019)  6.5
srvc    -0.009  0.4  -0.015  0.3  -0.015  0.2  -0.027  0.2
        (0.007)  0.4  (0.013)  0.5  (0.020)  0.6  (0.025)  2.2
reit    -0.004  0.2  -0.004  0.1  -0.005  0.0  -0.013  0.2
        (0.004)  0.2  (0.008)  -1.9  (0.013)  -2.8  (0.018)  -1.3


Because the SII’s do not require additional variables in their regression, their out-of-sample R² is calculated differently from that of the industry excess returns, see Equation (5). This calculation is both more conventional and less conservative, which makes comparisons with the HTV R²’s harder. However, by comparing the SII regression results with each other we can still safely draw other conclusions.

$$TSS_i = \sum_{t=1}^{T-h} \left( R_{m,t:t+h} - \bar{R}_{m,t:t+h} \right)^2$$

Eq.(5)
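A minimal Python sketch of this more conventional out-of-sample R² for a single-predictor regression such as Equation (4) is given below. It fits the line on a first subsample and evaluates it on a second one. Taking the mean in the TSS from the first subsample is an assumption of this sketch, as are the function names (`fit_line`, `oos_r2_mean_benchmark`) and the toy data.

```python
def fit_line(x, y):
    """Univariate OLS: returns (intercept, slope)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope

def oos_r2_mean_benchmark(x_train, y_train, x_test, y_test):
    """Out-of-sample R2 with a mean-based TSS.

    Assumption: the benchmark mean is taken from the estimation subsample.
    """
    alpha, gamma = fit_line(x_train, y_train)
    mean_train = sum(y_train) / len(y_train)
    sse = sum((y - (alpha + gamma * x)) ** 2 for x, y in zip(x_test, y_test))
    tss = sum((y - mean_train) ** 2 for y in y_test)
    return 1.0 - sse / tss

# Hypothetical toy data: returns fall linearly in the lagged short interest index.
sii_train, ret_train = [-1.0, 0.0, 1.0, 2.0], [0.03, 0.02, 0.01, 0.00]
sii_test, ret_test = [1.5, -0.5], [0.005, 0.025]
alpha_hat, gamma_hat = fit_line(sii_train, ret_train)
r2_oos = oos_r2_mean_benchmark(sii_train, ret_train, sii_test, ret_test)
```

Because the benchmark here is a plain mean rather than a fitted model, a given forecast will generally score higher than under Equation (3), which is the sense in which this measure is less conservative.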

Similarly to the results from the HTV analyses, the SII’s also perform best for medium to long term forecasts, which suggests that the short sellers’ informational advantage is obtained at the same moment in time as when industries start to lead. As mentioned earlier the performance over even longer horizons is examined as well. Appendix 7 shows that the R²’s decrease for ℎ = 24 and ℎ = 36 even though they are boosted by the AR effect described earlier. This effect is even more crucial when looking at the SII’s because their AR(1) coefficients are closer to 1 than those of excess returns, as displayed in Panel C of Appendix 6. On top of that, when single-period market excess returns are forecasted ($R_{m,t+h}$ instead of $R_{m,t:t+h}$) the significance disappears for these longer horizons.

Comparing the SII’s shows that the market information is not spread equally over all industries. Ultimately at ℎ = 12 only 11 industries (out of 34) are not significant at the p=0.10 level. The overall trend is that predictor SII’s gain power as the horizon lengthens. Of the 130 estimates (market SII excluded) 60 are in-sample significant, of which a fifth have negative out-of-sample R²’s. Two thirds of the observed significant coefficients are from the last two horizons. On top of that, in 25 cases (of which 19 in ℎ = 6 or ℎ = 12) the out-of-sample R² exceeds its in-sample counterpart. The SII models whose out-of-sample performance consistently tops their in-sample performance are cars, phone and utils, which are industries that remained relatively low profile in the HTV analysis. The same holds for elctr and rubbr, which have high out-of-sample R²’s as well. This suggests that the short sellers’ informational gain arises in a different location than the long information. In contrast, stone, ptrlm, tv and mtlpr have significant predictive power in their industry excess returns but perform poorly as SII’s. This can be partly due to the estimation error described in Table 1, where for instance stone is one of the hardest SII’s to approximate. This finding can also stem from the distinction between negatively and positively predicting industries described before. Metal is the exception here, because it contains information relevant for market excess returns in its returns as well as in its SII.


The aggregate market SII is the best predictor compared to the industry SII’s, with significant coefficients and strong out-of-sample performance, reaching an R² of 12.5% for ℎ = 12. That in itself means that no matter how informative the short interest is within an industry, when it is combined with others the synergy effect trumps the dilution effect, suggesting that informed shorters are active over a broad range of industries.

Shorting information into industries

We have investigated the links between industry SII’s and the market and determined that some industry SII’s are unable to predict future market excess returns. Another question is whether all industry SII’s can predict their own industry excess returns. If this is not the case, it would suggest that the market SII information is still being diluted to some extent by the inability of short sellers to forecast certain industry excess returns. This can be the result of a lack of usable information that is not yet reflected in prices, which would mean that market efficiency is higher within some industries. This can have a negative impact on the power of the SII as well.

Specifications in Equations (4) and (5) are slightly adjusted to predict the industry 𝑖 instead of the market. Estimation results of Equation (6) can be found in Table 5.

$$R_{i,t:t+h} = \alpha_i + \gamma_i \, SII_{i,t-1} + \varepsilon_{i,t:t+h} \quad \text{for } t = 1, \dots, T-h$$

Eq.(6.1)

$$TSS_i = \sum_{t=1}^{T-h} \left( R_{i,t:t+h} - \bar{R}_{i,t:t+h} \right)^2$$

Eq.(6.2)

Overall almost a third of the investigated industry SII’s (10 out of 34) are unable to explain their own industry’s future excess return for any horizon, even in-sample. This means that their SII contains no relevant information that is not yet incorporated into current stock prices. The lack of an informational edge for the short sellers can result from semi-strong market efficiency in those industries or from the shorters’ expertise not being applicable to those industries. We should not forget to account for the estimation error we encounter at the industry level, which makes it harder to obtain significant outcomes through the noise. The fact that we use an equally weighted estimator to forecast value-weighted industry excess returns may also have an effect. All significant signs are negative (the only exception being agric for ℎ = 12, but that estimate does not function out-of-sample) and have similar sizes as for the market predictability.


Table 5. Industry predictability by industry SII’s 1973 - 2016.

This table reports the estimates for the predictive regressions of industry excess returns on industry SII’s. The dependent variable is the industry excess return of the corresponding SII predictor. The industry SII’s listed in column one are the independent variable, as in Equation (6). The horizon is either 1, 3, 6 or 12 months, as indicated under horizon of predictability. In-sample OLS estimates, denoted $\gamma_i$ in Equation (6), are shown in the first column of each horizon with robust standard errors below in brackets. The second column of each horizon states the R²’s of the regression. The top R² is based on in-sample performance, whereas the bottom one is the out-of-sample R² for the period 1995-01 – 2016-12. The out-of-sample R² calculation follows Equation (6.2) and uses OLS coefficients estimated over the sample period 1973-01 – 1994-12. Estimates that are statistically significantly different from zero are marked with *, **, *** for significance levels of 10%, 5% and 1% respectively.

Again the estimations for horizons ℎ = 6 and ℎ = 12 are the most accurate, although the out-of-sample check rejects almost three quarters of their significant estimators (30 out of 43). If we take out-of-sample R2’s > 0 as the critical value we can only confirm the informational advantage of the short sellers in 4

Horizon of predictability
h=1 | h=3 | h=6 | h=12 || h=1 | h=3 | h=6 | h=12
Est. R2 (%) | Est. R2 (%) | Est. R2 (%) | Est. R2 (%) || Est. R2 (%) | Est. R2 (%) | Est. R2 (%) | Est. R2 (%)

-0.000 0.0 | -0.000 0.0 | 0.006 0.1 | 0.040*** 1.9 || -0.010* 0.5 | -0.028*** 1.3 | -0.033** 0.9 | -0.052** 1.2
(0.004) 0.1 | (0.006) 0.3 | (0.009) 0.7 | (0.013) -2.1 || (0.006) -0.1 | (0.011) -0.3 | (0.016) -0.9 | (0.021) -4.1
-0.003 0.0 | 0.011 0.1 | 0.012 0.0 | 0.038 0.2 || -0.001 0.0 | -0.002 0.0 | 0.002 0.0 | -0.020 0.1
(0.009) 0.0 | (0.017) 0.1 | (0.024) 0.0 | (0.034) -1.8 || (0.010) -0.1 | (0.017) -0.2 | (0.024) -0.9 | (0.035) -0.1
-0.001 0.0 | -0.018 0.3 | -0.051** 0.9 | -0.087*** 1.3 || -0.008 0.5 | -0.022** 1.0 | -0.042*** 1.8 | -0.084*** 4
(0.009) -1.6 | (0.016) -1.4 | (0.023) 1.1 | (0.034) 1.8 || (0.005) 0.2 | (0.009) -0.6 | (0.014) -2.7 | (0.018) -6.1
-0.005 0.2 | -0.011 0.3 | -0.021* 0.6 | -0.035* 0.7 || 0.014 0.3 | 0.038* 0.7 | 0.089*** 1.7 | 0.186*** 3.4
(0.005) 0.4 | (0.008) 0.3 | (0.012) -0.1 | (0.018) 0.6 || (0.011) -0.3 | (0.020) 0.3 | (0.030) 0.8 | (0.044) 3.3
-0.005 0.2 | -0.017 0.5 | -0.032** 0.8 | -0.027 0.3 || -0.018*** 1.3 | -0.050*** 3.2 | -0.105*** 6.2 | -0.154*** 5.9
(0.006) -1.1 | (0.011) -3.3 | (0.016) -3.9 | (0.022) -1.4 || (0.007) 0.6 | (0.012) 3.3 | (0.018) 6.2 | (0.027) 5.3
-0.001 0.0 | -0.003 0.1 | 0.009 0.3 | 0.005 0.0 || -0.014* 0.7 | -0.055*** 3 | -0.103*** 4.9 | -0.153*** 4.9
(0.003) -5.1 | (0.005) -21.2 | (0.008) -60.8 | (0.012) -91.9 || (0.007) 0.4 | (0.014) 1.9 | (0.020) 1.7 | (0.030) -1.0
-0.000 0.0 | -0.002 0.0 | -0.007 0.1 | -0.008 0.1 || -0.002 0.0 | -0.005 0.0 | -0.016 0.1 | -0.017 0.1
(0.003) -0.1 | (0.006) -1.2 | (0.010) -3.3 | (0.015) -9.6 || (0.007) -0.1 | (0.014) -3.1 | (0.020) -3.0 | (0.029) -3.2
-0.005 0.1 | -0.011 0.2 | -0.018 0.2 | -0.026 0.2 || -0.006 0.4 | -0.021** 1.2 | -0.044*** 2.7 | -0.078*** 4.5
(0.006) 0.1 | (0.011) -1.0 | (0.017) -4.6 | (0.025) -10.5 || (0.005) -0.9 | (0.008) -2.3 | (0.012) -4.2 | (0.016) -4.6
-0.008 0.4 | -0.025** 1.0 | -0.064*** 2.9 | -0.096*** 3.1 || -0.009 0.3 | -0.030** 1.0 | -0.077*** 3.2 | -0.144*** 5.9
(0.006) 0.0 | (0.011) -0.6 | (0.016) -1.3 | (0.024) -4.8 || (0.007) -2.3 | (0.013) -2.0 | (0.019) -4.3 | (0.026) -6.7
0.001 0.0 | 0.010 0.2 | 0.020 0.3 | 0.024 0.2 || -0.006** 0.8 | -0.017*** 2.4 | -0.034*** 4.2 | -0.055*** 4.2
(0.006) -0.8 | (0.010) -3.5 | (0.015) -9.9 | (0.021) -12.8 || (0.003) 0.8 | (0.005) 2.7 | (0.007) 4.1 | (0.012) 0.5
-0.006 0.3 | -0.007 0.1 | -0.007 0.1 | -0.028 0.4 || -0.015* 0.6 | -0.033** 0.8 | -0.065*** 1.3 | -0.064* 0.6
(0.005) 0.1 | (0.009) -0.6 | (0.013) -1.1 | (0.019) -0.6 || (0.008) 0.2 | (0.016) 0.2 | (0.025) -0.2 | (0.037) -3.6
0.000 0.0 | -0.004 0.1 | -0.011 0.2 | -0.022 0.4 || -0.004 0.2 | -0.007 0.3 | -0.022** 1.2 | -0.051*** 2.7
(0.004) -0.5 | (0.007) -1.3 | (0.010) -1.9 | (0.015) -2.2 || (0.003) -0.2 | (0.006) -0.3 | (0.009) -0.6 | (0.014) -1.8
-0.005 0.2 | -0.019 0.5 | -0.042** 1.1 | -0.131*** 4.8 || -0.014*** 1.4 | -0.043*** 4 | -0.086*** 7.6 | -0.149*** 10.7
(0.006) -0.2 | (0.012) -0.5 | (0.018) -1.3 | (0.026) 0.9 || (0.005) 1.1 | (0.009) 1.9 | (0.013) 0.0 | (0.019) -4.5
0.007 0.3 | 0.019** 0.9 | 0.034*** 1.4 | 0.056*** 1.7 || -0.012 0.4 | -0.043*** 1.4 | -0.104*** 3.9 | -0.256*** 11.4
(0.005) -1.0 | (0.008) -3.9 | (0.012) -6.9 | (0.019) -8.2 || (0.008) 0.1 | (0.016) 0.2 | (0.023) -2.2 | (0.031) -21.4
-0.000 0.0 | -0.004 0.1 | -0.006 0.1 | -0.029** 0.8 || -0.002 0.0 | -0.014 0.4 | -0.036** 1.2 | -0.083*** 3.0
(0.004) 0.0 | (0.006) 0.1 | (0.009) -0.1 | (0.014) -0.4 || (0.005) -0.3 | (0.010) -1.9 | (0.014) -2.8 | (0.021) -0.3
-0.006 0.4 | -0.019** 1.0 | -0.036*** 1.8 | -0.083*** 4.7 || -0.008 0.1 | -0.005 0.0 | 0.003 0.0 | -0.014 0.0
(0.004) 0.0 | (0.008) 0.2 | (0.012) -0.1 | (0.017) -0.1 || (0.009) 0.1 | (0.017) -1.4 | (0.024) -4.6 | (0.034) -7.1
-0.004 0.2 | -0.005 0.1 | -0.016 0.4 | -0.029* 0.6 || -0.006 0.3 | -0.012* 0.5 | -0.018 0.5 | -0.034** 0.8
(0.004) 0.2 | (0.008) -0.3 | (0.011) 0.3 | (0.016) 0.1 || (0.004) 0.3 | (0.007) 0.5 | (0.011) -0.3 | (0.017) -0.3

Ind.: food, smoke, txtls, apprl, wood, agric, mines, oil, stone, cnstr, instr, publi, chems, ptrlm, rubbr, whlsl, paper, chair, lethr, glass, metal, mtlpr, machn, elctr, cars, rtail, money, srvc, reit, manuf, trans, phone, tv, utils


industries, oil, machn, elctr and phone. Because individual industry excess returns aggregate to the excess market return, it is no surprise that the insignificant SII predictors in Tables 4 and 5 show a similar pattern.

Of the 10 powerless industries mentioned above, 7 also fail the tests for market prediction power. Only the metal, instr and chair SII’s serve as (long-term) predictors of the market excess return without being able to predict their own industry excess returns. The correlation matrix in Appendix 4 offers no immediate explanation for this, and the finding appears contradictory at first sight. Industry SII’s that predict the market without predicting their own excess returns suggest that the flow of information does not by definition run through the industry in question. Because the tests involve out-of-sample performance, simply attributing the result to the type I error margin of p = 0.10 is not justified either. Perhaps investors mistakenly assume that market downturns will be reflected in these industries as well and therefore choose to short them. Another possibility is that short sellers specialize in predicting fluctuations of the market as a whole and short every part of the market in an ETF-like way, without having information on individual stocks or industries. This effect might in turn be unobservable in other industries due to a larger fraction of noise short sellers. Unfortunately, Table 5 does not directly provide enough evidence to fully support that hypothesis. Moreover, Boehmer et al. (2008) reject the hindering presence of underinformed short sellers. Finally, this result could also be a consequence of the small number of available firms per SII, mentioned in Table 1, which creates measurement error.

The industry SII’s that are able to predict their own industry excess return (oil, elctr, machn and phone) have been mentioned in the previous tests. Elctr and machn share common ground in their fundamental characteristics as manufacturers, as also illustrated by their excess return correlation of 0.89. Together with phone, they can be placed in the later, downstream phase of the supply chain. Oil, on the other hand, is the textbook example of a commodity and thus, as an industry, only weakly dependent on other industries (prices are known to be influenced primarily by the supply side). This shows that the selection cannot be grouped under a single theoretical explanation.
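
To make the distinction between in-sample and out-of-sample predictive power concrete, the following minimal sketch shows one common way such statistics can be computed. It is an illustration under stated assumptions, not the exact procedure used for the tables: it assumes a univariate regression of the cumulative excess return over the next h months on the current SII, an expanding estimation window, and the historical mean as the out-of-sample benchmark. The function names and the `first` parameter (the first forecast origin) are hypothetical.

```python
import numpy as np

def horizon_returns(r, h):
    """Cumulative return r[t+1] + ... + r[t+h] for every t with a full window.
    (Assumption: the h-month return is a simple sum of monthly excess returns.)"""
    r = np.asarray(r, dtype=float)
    c = np.concatenate(([0.0], np.cumsum(r)))
    return c[h + 1:] - c[1:len(r) - h + 1]

def in_sample(x, r, h):
    """OLS slope and in-sample R2 (%) of the h-month-ahead return on x[t]."""
    y = horizon_returns(r, h)
    x = np.asarray(x, dtype=float)[:len(y)]
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dev = y - y.mean()
    return beta[1], 100.0 * (1.0 - (resid @ resid) / (dev @ dev))

def out_of_sample_r2(x, r, h, first):
    """Out-of-sample R2 (%) against the historical-mean benchmark.
    At each origin t only pairs whose return window has closed are used."""
    if first - h + 1 < 2:
        raise ValueError("need at least two observations in the first window")
    y = horizon_returns(r, h)
    x = np.asarray(x, dtype=float)[:len(y)]
    err_model, err_mean = [], []
    for t in range(first, len(y)):
        k = t - h + 1  # pairs 0..t-h are fully observed at time t
        X = np.column_stack([np.ones(k), x[:k]])
        beta, *_ = np.linalg.lstsq(X, y[:k], rcond=None)
        err_model.append(y[t] - (beta[0] + beta[1] * x[t]))
        err_mean.append(y[t] - y[:k].mean())
    err_model, err_mean = np.array(err_model), np.array(err_mean)
    return 100.0 * (1.0 - (err_model @ err_model) / (err_mean @ err_mean))
```

Under these assumptions, a negative out-of-sample R2 means that the SII-based forecast performs worse than simply extrapolating the historical mean, which is why a significant in-sample coefficient does not by itself establish usable predictive power.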

Macroeconomic forecasting

We have seen earlier that informational leads occasionally go hand in hand with the ability to forecast macroeconomic fundamentals. As explained before, hedge funds have the resources and the incentive to make complicated estimates of crucial economic variables. It has also been suggested before that their
