
The long-term relationship between ETF flows and returns of the underlying index


Academic year: 2021


University of Amsterdam
Amsterdam Business School
MSc Finance: Corporate Finance

Master Thesis

The Long-Term Relationship between ETF Flows and Returns of the Underlying Index

Menno Pater

July 2018

Statement of Originality

This document is written by Menno Pater who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of

Abstract

This research investigates the long-term relationship between ETF flows and the returns of their underlying indices. A multivariate time-series dataset is used, comprising monthly data on US equity ETFs and their underlying indices for the period 2000 up to and including 2017. A vector autoregression model is used to test the predictive power of the lagged values of the ETF flows and index returns for the future values of both variables. In the full-sample analysis, index returns positively predict ETF flows in the next month, which is indicative of return-chasing behavior. ETF flows do not predict returns of the underlying index, and the information hypothesis is rejected. The analysis of the top 10% of ETFs by assets under management shows evidence in favor of predictive power: ETF inflows predict negative returns over the next three months, and positive index returns predict ETF outflows over the next three months. For these funds, the opposite of both the information and return-chasing hypotheses is found. This research contributes to the flow-return relationship discussion and the academic literature on ETFs and ETF flows.

Keywords: Flow-return relationship, ETF flow, index return, information hypothesis, return-chasing hypothesis


Table of contents

1. Introduction
2. Background
2.1 Literature Review
2.2 Flow-Return Relationship
2.3 The Creation/Redemption Process
2.4 Hypotheses
3. Methodology
4. Data
4.1 Data Collection
4.2 ETF Flows
4.3 Returns
4.4 Descriptive Statistics
5. Results
6. Robustness Checks
7. Conclusion & Discussion
Reference List

1. Introduction

ETFs are a relatively new investment product. The first ETF was created in 1990 on the Toronto Stock Exchange and tracked the TSE-35 Index (Simpson, 2018). Since then the market for ETFs has grown significantly, especially so in the past decade. For the US alone, the ETF market increased from $100 billion in 2002 to $1 trillion in 2010 to $3.4 trillion in 2017 ("What is the history of ETFs?", n.d.). The largest ETF, the SPDR SPY, had assets under management of $300 billion on its 25th anniversary on the 22nd of January 2018 (Van Poll, 2018). This increase of investing in ETFs has given rise to discussions and research on the effects and implications for the market stability as well as for investors.

A typical ETF is an open-end fund that aims to track a certain benchmark as closely as possible. This benchmark typically is an index, whether it be an equity, bond, commodity or any other type of index. The ETF holds the basket of securities of the underlying index it seeks to track and replicate. Once placed in the secondary market, shares of the ETF can be bought and sold on public exchanges like regular stocks throughout the opening hours of an exchange. Because ETFs track an index rather than try to outperform it, they are regarded as passively managed funds, as opposed to most mutual funds, which are actively managed. As mentioned by Koesterich (2008), "virtually all studies of mutual fund performance suggest that traditional active funds fail to add value once fees are accounted for." Because mutual funds do not add value relative to commonly used benchmarks such as the S&P 500, investors might regard passive ETFs as an excellent substitute. Other advantages that make ETFs more attractive relative to other types of funds are the low expense ratios, relatively high liquidity, lower taxes, increased transparency and easier access for smaller investors. All of these factors have contributed to the rapid rise of the ETF market over the past years.

The passive ETFs have been taking market share from mutual funds (Wigglesworth, 2018) as investors realize that active management in many cases does not lead to outperformance of the market in general (Koesterich, 2008). The flows in and out of ETFs would contain more information about the expectations of the underlying index relative to actively managed funds, since no managerial investment skill is involved (Kalaycioglu, 2004). Moreover, the elastic supply of ETF shares leads to flows containing mostly information about the expectations of the underlying index rather than factors specific to that ETF. Due to the ETF flows being unique in their informational content, the flow-return relationship between ETF flows and the returns of the underlying index comprises an interesting research area.


The research question central to this research is: What is the long-term relationship between ETF flows and the returns of the underlying indices? The research question is answered by testing whether ETF flows have predictive power on the future returns of the underlying index, or whether the underlying index returns have predictive power on the future flows of the ETFs tracking that index. If predictive power is found, it is checked whether these results align with the two dominant theories in the flow-return discussion: the information hypothesis and the return-chasing hypothesis. The first argues that investors have informed expectations about future returns and will invest where these returns will be realized. The latter argues that investors invest based on historical returns and momentum. Hence, if ETF flows have positive predictive power for future index returns, the information hypothesis is confirmed. If index returns have positive predictive power for future flows, the return-chasing hypothesis is confirmed. Other results are not in line with the leading theories regarding the flow-return relationship and may give rise to new theories.

This research contributes to the existing research in multiple ways. First, the growth of the ETF market should be accompanied by a growing body of literature to uncover their impact on the market as a whole and on the different parties involved. Second, this research adds to literature on asset flows and the flow-return relationship due to the unique characteristics of ETF flows. Within the ETF literature, this research is of added value because it investigates the long-term ETF flow and index return relationship in addition to existing research on the short-term flow-return relationship. Within this long-term setting, that analyzes predictive power over a period of 10 months, the information and return-chasing hypotheses are the focus rather than the price pressure hypothesis. Also, the period that is covered is relatively long, covering 18 years, and the dataset is relatively large, since the focus in this research is not only on the largest ETFs but on a sample of all sizes.

The method used to test the hypotheses is a vector autoregression with a multivariate time-series dataset. The dataset comprises unlevered US equity ETFs and covers the period from the beginning of 2000 until the end of 2017. The dataset initially consists of daily data; it is, however, transformed into monthly observations for the analysis. The variables used in the vector autoregression are the main variables in the flow-return relationship: ETF flows and index returns. In the analysis, both are regressed on their own lagged values as well as lagged values of the other. In order to be able to interpret the results, orthogonalized impulse response functions are generated.

The outline of this paper is as follows. First, the scene is set by looking at existing literature covering ETFs and the flow-return relationship, the dominant theories in the flow-return relationship discussion are described and the creation/redemption process that drives ETF flows is explained. From this, multiple hypotheses are formulated to be tested later on. In the second section, the methodology used in answering the research question will be explained. The dataset is covered in the next section, which also explains how the main variables, the ETF flow and index return, are defined. This section will also provide descriptive statistics from which preliminary inferences regarding the hypotheses or analysis are made. The fourth section will give the results and interpretation of the VAR analysis. These results will be followed by multiple robustness checks. The final section will conclude by answering the research question and providing implications of the findings and limitations of the research.

2. Background

2.1 Literature Review

Since the origination of ETFs, these investment funds have gained popularity as part of investment strategies and therefore as well in academic literature. ETFs allow investors to invest in a basket of securities with one single transaction. This makes ETFs attractive as they provide a low-cost diversified investment and allow for passive portfolio management.

Even though the ETF market has grown significantly in size, not all academics agree on whether this increased presence of ETFs is a positive development for the financial markets. For example, Ben-David et al. (2014) have found a positive relationship between ETF ownership of a stock and its volatility, indicating that ETFs have a destabilizing effect on the stock market. A similar result stems from the research by Bhattacharya & O'Hara (2017), who conclude that herding and instability can result from the presence of ETFs, moving prices away from their fundamentals. In addition, the increased use of ETFs which make use of derivative products “can lead to a build-up of systemic risks in the financial system”, as argued by Ramaswamy (2011). However, as Perignon et al. (2014) show, some worries about the distorting effect of ETFs are unfounded. In their research on the db-X ETF series, issued by Deutsche Bank, they find that ETFs that are constructed through securities lending do not show a high collateral risk. Fang & Sanger (2012) find that ETFs “seem to play a significant role in the price discovery process” and that the trading of ETFs is motivated by acquired information. Another positive effect of ETF trading on the market is found by Xu & Yin (2017). They find that ETF trading volumes increase the price efficiency of the underlying index.

has found regarding this main variable. Clifford et al. (2014) find that ETF flows decrease with fund size and high expenses, whereas they increase with premiums of the ETF price over its net asset value and conditions of the stock exchange. In researching ETF liquidity, Broman & Shum (2018) find that “relative liquidity is an important determinant of monthly net fund flows” and that it “predicts fund inflows and outflows”.

A lot of research has been done on the impact of ETFs on the markets; however, few studies focus on the predictive power of the flows. Research on the predictive power of flows and the flow-return relationship has been done for other asset classes. Froot et al. (2001), for example, find that "inflows have positive forecasting power for future equity returns" while looking at international flows and stocks. In line with the results by Froot et al. (2001) are the findings of Bennett & Sias (2001). They also find that money flows have explanatory power for future returns of stocks listed on the NYSE, especially over the longer term of 30 to 40 days. Bekaert et al. (2002) find a price pressure effect of capital flows into equity markets of emerging markets, which is partly reversed in following periods. One can also observe return-chasing behavior in the Real Estate Investment Trust (REIT) asset class; flows, however, do not have predictive power on returns (Downs et al., 2014).

Existing research on ETF flows focuses on the short term. Henderson & Buetow (2014) conclude that flows follow returns rather than predict them in a 2-day time window. Staer (2017), Osterhoff and Overkott (2016) and Kalaycioglu (2004) find that ETF flows result in price pressure in the short term. Levy & Lieberman (2015) look at the relation between ETF flows and their return on a weekly basis, aggregating funds per investment product. They find that passive investments show a smaller relationship between returns and flows than active investments; however, flows follow returns for all strategies. Clifford et al. (2014) find “little evidence of superior market timing in ETF flows” by creating a long-short portfolio in their research based on US equity ETFs.

One asset class that is very comparable with ETFs is mutual funds. Agapova (2011) finds that passive mutual funds, which also track indices, and ETFs are substitutes. This indicates that the mutual fund literature might prove to be valuable for this research. Lou (2009) looks at how mutual fund flows affect the return on a stock level, and he finds that better past returns predict higher inflows as well as future returns. Hence, inflows themselves do predict future returns but the actual prediction comes from higher previous returns which trigger the inflow. In line with these results are the conclusions drawn about the Australian market by Watson & Wickramanayake (2012), who find that managed fund flows tend to follow returns rather than predict them. They also find that excess returns are followed by unexpected increases in managed fund flows. Ben-Rephael et al. (2011) find similar results in Israel, where daily flows of equity mutual funds also show price pressure which is partially corrected within 10 trading days.

The paper that comes closest to this research on the long-term flow-return relationship between the ETF flows and the returns of the underlying index is the one by Staer (2017). He also investigates the relationship between ETF fund flows and the return of the underlying securities and adds how that differs from mutual funds. He finds a price pressure effect of ETF fund flows, which was reversed within 5 days. The focus in his paper is on the short term, whereas my thesis will focus on predictive powers of the fund flows in a more long-term perspective.

A study on the long-term predictive power of ETF flows is yet to be done. This thesis uses monthly data and analyzes the flow-return relationship over a period of up to one year, depending on the model fit. This longer-term approach is where this research fills the gap in existing literature on this relatively new investment product. Within the flow-return discussion, long-term is defined differently from its general usage. Following Bennett & Sias (2001), longer-term analysis runs from one month onwards, rather than over multiple years. Knowledge on whether ETF flows predict future returns is useful for investors in order to predict future returns of stocks. It also gives an indication of the impact of ETFs on the efficiency of the financial markets.

From the abovementioned it can be concluded that the existing literature on flows and returns provides no clear answer to whether flows predict returns. The results differ per asset class, period, sector and region that is researched. Flows may predict return or the other way around. The next paragraph provides the theories most relevant to the flow-return relationship discussion.

2.2 Flow-Return Relationship

In the flow-return relationship discussion there are three theories that give an explanation for the correlation that is observed between flows and returns. These are the information hypothesis, the return-chasing hypothesis, and the price pressure hypothesis. The three theories are briefly described in this section.

The first hypothesis is the information hypothesis. This hypothesis ascribes the correlation between flows and returns to investment behavior. It states that as soon as information arises about developments that could affect the future performance of a stock, for


or out of that specific stock (or any other type of investment). This hypothesis hence argues that investors make profitable investments and that returns follow flows as the flows contain the information about future profitability and hence returns.

The second hypothesis is the return-chasing hypothesis. This hypothesis states that investors tend to put their money in investments that have shown profits in the past. These investors invest on stock momentum and historical performance rather than new information about the future. According to this hypothesis, flows follow returns, as opposed to the order described by the information hypothesis.

The price-pressure hypothesis is different in that it looks at the technicalities rather than investors' strategies. This hypothesis states that the in- and outflow of funds from assets have a direct effect on that asset's price due to the interplay of supply and demand. Because price pressure is a short-term phenomenon (Staer, 2017; Osterhoff and Overkott, 2016; Kalaycioglu, 2004) and because of the size of the ETF market (Levy & Lieberman, 2015), the price-pressure hypothesis is not relevant to this research.

2.3 The Creation/Redemption Process

It is important to understand the mechanisms of the creation and redemption process of ETFs in order to understand how flows of ETFs come into existence. This way, the theories explained in the previous section can be linked to ETFs. There are three main types of players in the ETF creation/redemption process: an ETF issuer, an Authorized Participant (henceforth: AP), and the investors in the secondary market. The first, the ETF issuer, is the company behind the ETF, such as BlackRock, which issues the iShares ETFs. Second are the APs, parties that have enough capital to buy the shares that comprise the underlying indices of the ETFs in order to create new ETF shares. The APs are designated by the ETF issuer, and an ETF can have multiple APs. Third come the investors in the secondary market, who buy and sell the ETF shares among themselves and do not actively participate in the creation/redemption process. In the creation/redemption process, creation occurs when the AP buys the shares necessary to replicate the underlying index of the ETF. The ETF issuer then trades ETF shares for these stocks with the AP. The AP can then sell these ETF shares on the secondary market. Redemption is the reverse: the AP takes ETF shares off the market and exchanges them with the ETF issuer for the underlying shares.

The creation and redemption of ETF shares are the inflows and outflows respectively. The AP has incentives to take on this role because there are arbitrage opportunities in the ETF market that can be exploited. This arbitrage opportunity arises when the price of an ETF share deviates from the net asset value (NAV) of the ETF share, which is the value of the underlying shares. If this deviation is enough to exceed transaction costs, the AP can make a riskless profit by either creating new ETF shares, in case the ETF share trades at a premium, or by redeeming ETF shares, in case the ETF share trades at a discount. These in- and outflows are also the mechanism that ensures that the prices of ETF shares do not deviate significantly or persistently from their NAV. Since the APs are designated by the ETF issuer and are the only party, or parties, that can create or redeem ETF shares, only they can profit from the arbitrage opportunities.

ETF prices can deviate from their NAV if demand for the shares differs from the supply. They will trade at a premium if demand exceeds supply, or trade at a discount if the reverse occurs. Hence, as explained before, high demand will make ETF shares trade at a premium. If the ETF shares trade at a premium, the AP will create shares because of the arbitrage opportunity that arises. This creation of shares is what constitutes the fund inflow.
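The arbitrage rule described in this section can be illustrated with a short sketch. The function name `ap_action`, the per-share transaction cost and the example prices are assumptions made for illustration only; actual APs face costs and constraints that are not modeled here.

```python
def ap_action(etf_price, nav_per_share, transaction_cost):
    """Return the stylized AP action implied by the gap between price and NAV.

    All inputs are per-share dollar amounts (hypothetical values).
    """
    gap = etf_price - nav_per_share
    if gap > transaction_cost:
        # Premium: buy the basket, exchange it for new ETF shares, sell them.
        return "create"  # corresponds to a fund inflow
    if -gap > transaction_cost:
        # Discount: buy ETF shares, redeem them for the basket, sell the basket.
        return "redeem"  # corresponds to a fund outflow
    return "none"  # deviation too small to cover transaction costs


print(ap_action(100.30, 100.00, 0.10))  # premium larger than costs
print(ap_action(99.70, 100.00, 0.10))   # discount larger than costs
print(ap_action(100.05, 100.00, 0.10))  # deviation within costs
```

In this stylized rule, flows only occur once the premium or discount exceeds the cost threshold, which is one reason why creations and redemptions tend to arrive in blocks rather than continuously.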

Because of this mechanism, it can be argued that high demand leads to inflow. If we consider investors to be ‘smart’, it can be argued that they place their money where the returns will be. Hence, an increased demand by ‘smart’ investors leads to an ETF share trading at a premium, inducing an inflow created by the AP. Following the return chasing hypothesis however, return-chasing behavior of investors may be the cause of the increased demand from the markets.

2.4 Hypotheses

Joining the flow-return discussion already present in the literature and following the reasoning explained in the creation/redemption process section, the information and return-chasing hypotheses will be tested. The two hypotheses to be tested are:

1) ETF flows have long-term predictive power on the returns of the underlying indices.
2) Returns of indices have long-term predictive power on the flows of ETFs tracking that index.

It is expected that in hypothesis 1 the coefficient will show a positive sign, or in other words, that investors put their money where the positive returns will be. If outflows occur investors expect negative returns, also indicative of a positive coefficient. If investors instead


hypothesis will be positive, indicating that positive returns in the past predict future ETF inflows. In the case that the unexpected sign occurs as a result, we can still speak of predictive power in the flows or the returns. In this scenario, further research is necessary to reveal the theoretical explanation for such a phenomenon.

3. Methodology

To answer the research question and test the hypotheses, multivariate time series will be used. In order to be able to assess whether flows predict the returns of the underlying index, we need a sample of ETFs and their flows over time, as well as the returns over time of the underlying index.

This research follows the approach taken by existing flow-return relationship and forecasting research, such as Froot et al. (2001), Bekaert et al. (2002), Ben-Rephael et al. (2011), Fang & Sanger (2011), Downs et al. (2014) and Staer (2017). These studies on forecasting with flows and returns make use of a vector autoregressive (VAR) model, a methodology first introduced by Sims (1980). VAR models are used because in many economic situations the direction of impact is not clear. In the case of flows and returns, one could wonder whether returns follow inflows, or whether inflows follow increased returns. A VAR allows more than one variable to evolve over time and hence can show interdependencies rather than only dependencies. In addition, the problem of classifying variables as exogenous or endogenous is largely avoided, as a VAR treats all included variables as endogenous. A VAR model will also be able to tell us whether none, one or both of the hypotheses are supported by the statistical analysis.

The regression formulas will take the following form:

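$$\text{i)}\quad \mathit{Return}_{j,t} = \alpha_0 + \alpha_1\,\mathit{Fund\ flow}_{i,j,t-1} + \dots + \alpha_K\,\mathit{Fund\ flow}_{i,j,t-K} + \epsilon_{1,t}$$

$$\text{ii)}\quad \mathit{Fund\ flow}_{i,j,t} = \beta_0 + \beta_1\,\mathit{Return}_{j,t-1} + \dots + \beta_K\,\mathit{Return}_{j,t-K} + \epsilon_{2,t}$$

Here, $\mathit{Fund\ flow}_{i,j,t}$ is the fund flow of ETF $i$ tracking index $j$ at time $t$, and $\mathit{Return}_{j,t}$ is the return of the underlying index $j$ at time $t$. Capital letter $K$ is the total lag length. Another notation for this two-variable VAR is the following:

$$\text{iii)}\quad \begin{pmatrix} \mathit{Return}_{j,t} \\ \mathit{Fund\ flow}_{i,j,t} \end{pmatrix} = a_0 + A_1 \begin{pmatrix} \mathit{Return}_{j,t-1} \\ \mathit{Fund\ flow}_{i,j,t-1} \end{pmatrix} + \dots + A_K \begin{pmatrix} \mathit{Return}_{j,t-K} \\ \mathit{Fund\ flow}_{i,j,t-K} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \end{pmatrix}$$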

In equation iii), $a_0$ is the vector of intercepts and $A_1$ through $A_K$ are the two-by-two matrices of the coefficients on the lagged values of Fund flow and Return. The number of lags, $K$, is determined using the Akaike information criterion (AIC).
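To make the structure of equation iii) concrete, the sketch below builds, for a single ETF and index pair, the target vector and the regressor row for each month. It follows the matrix form, in which each variable is regressed on lags of both variables. The function name `var_rows` and the toy series are hypothetical; estimation and lag selection are left to a statistics package.

```python
def var_rows(returns, flows, n_lags):
    """Build (target, regressors) pairs for a bivariate VAR with n_lags lags.

    target     = [Return_t, Flow_t]
    regressors = [1, Return_{t-1}, Flow_{t-1}, ..., Return_{t-K}, Flow_{t-K}]
    """
    if len(returns) != len(flows):
        raise ValueError("series must have equal length")
    rows = []
    for t in range(n_lags, len(returns)):
        regressors = [1.0]  # constant, multiplies the intercept
        for lag in range(1, n_lags + 1):
            regressors += [returns[t - lag], flows[t - lag]]
        rows.append(([returns[t], flows[t]], regressors))
    return rows


# Hypothetical monthly index returns and ETF flows (fractions, not percent).
returns = [0.01, -0.02, 0.03, 0.00, 0.015]
flows = [0.05, 0.02, -0.01, 0.04, 0.03]
for target, regressors in var_rows(returns, flows, n_lags=2):
    print(target, regressors)
```

With K lags, the first K observations are lost as starting values and each row holds 1 + 2K regressors, so a larger K trades sample size and parsimony for flexibility; this trade-off is what an information criterion such as the AIC weighs.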

Before a VAR model can be put to use, the dataset needs to be tested for stationarity and, if it is nonstationary, for cointegration. Stationarity might be an issue because ETFs have gained in popularity as an investment vehicle, so that inflows would exceed outflows due to asset allocation determinants rather than expectations about returns. To test for stationarity, a Fisher-type test based on the Phillips-Perron test is used, because the panel data is unbalanced and in order to account for structural breaks.
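A Fisher-type panel test combines the p-values of unit-root tests run separately on each panel. The sketch below shows one common variant, the inverse chi-squared statistic, and assumes that the per-ETF Phillips-Perron p-values have already been obtained elsewhere; the function name and inputs are hypothetical, and the software used for the thesis may report additional variants of the statistic.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine N independent unit-root p-values with Fisher's method.

    Statistic: P = -2 * sum(ln p_i), chi-squared with 2N degrees of freedom
    under the joint null that every panel contains a unit root.
    """
    n = len(p_values)
    if n == 0 or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("need at least one p-value in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    # For even degrees of freedom 2N, the chi-squared survival function has
    # the closed form exp(-x/2) * sum_{k=0}^{N-1} (x/2)^k / k!.
    half = statistic / 2.0
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical p-values from three per-ETF Phillips-Perron tests.
statistic, p_value = fisher_combined_pvalue([0.04, 0.20, 0.01])
print(round(statistic, 3), round(p_value, 4))
```

Because each panel contributes its own p-value, the panels do not need the same number of observations, which is why this type of test suits an unbalanced panel.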

After the VAR has been estimated, it is checked for stability. A VAR model needs to be stable in order for the results to be valid. As shown by Lütkepohl (2005), a stable VAR model with p lags "has finite variance and an auto covariance sequence that converges to zero at an exponential rate" (Rossi, 2004). Stability can be checked by verifying that all eigenvalues of the companion matrix formed from the coefficient matrices A of equation iii) lie within the unit circle. After it has been confirmed that the VAR model is stable, the model is tested for Granger causality. When looking at equations i) and ii) separately, we can speak of Granger causality if the lags of the independent variable, given the lags of the dependent variable, are jointly statistically significant in a Granger causality Wald test. Although Granger causality is not proof of causality, the purpose of this research is not to find a causal relationship. Granger causality is useful in that it looks at the informational content of the lagged variables for the future value of one of these variables; hence it is suited to this research on predictive power.
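The eigenvalue condition can be illustrated for the simplest case of a bivariate VAR with one lag, where the relevant matrix is the two-by-two coefficient matrix itself. For more lags the same check applies to the larger companion matrix, which is not shown here. The coefficient values below are made up for illustration.

```python
import cmath


def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix [[a11, a12], [a21, a22]]."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    # Roots of the characteristic polynomial x^2 - trace * x + det.
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def is_stable_var1(a1):
    """True if both eigenvalues of A_1 lie strictly inside the unit circle."""
    return all(abs(lam) < 1.0 for lam in eigenvalues_2x2(a1))


a1_stable = [[0.5, 0.1], [0.2, 0.3]]    # hypothetical coefficients
a1_unstable = [[1.1, 0.0], [0.0, 0.5]]  # one eigenvalue outside the circle
print(is_stable_var1(a1_stable))
print(is_stable_var1(a1_unstable))
```

Using `cmath` keeps the check valid when the eigenvalues are complex, in which case the modulus rather than the real part is compared with one.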

Since the VAR regression results cannot be interpreted as simply as those of an OLS regression, impulse response functions (IRFs) are used to shed light on the complexity of the coefficients in the table displaying the VAR results. Impulse response analyses show how the future values of the variables included in the VAR respond to impulses, or shocks, to that same or another variable included in the VAR. Basic IRFs assume that this impulse only occurs in one of the variables. With the two variables used in this VAR, the likelihood that an impulse will only affect one of the two variables is small, and this will be checked by means of the covariance matrix. Because it cannot be stated with certainty that an impulse will only affect one variable, orthogonalized IRFs are used (Rossi, 2004). The covariance matrix is orthogonalized using the Cholesky decomposition, resulting in uncorrelated error terms. Hence the impulses affect only one variable, rather than both. With orthogonalized IRFs, the impulse is equal to one standard deviation (Sánchez, 2011). The orthogonalized IRFs should provide a clear picture that helps in answering the research question.
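A minimal sketch of orthogonalized IRFs is given below for a bivariate VAR with one lag. The variable ordering (Return first, Fund flow second), the coefficient matrix and the residual covariance matrix are assumptions chosen for illustration and are not estimates from this research. With a Cholesky decomposition the ordering matters: a shock to the first variable may move the second variable in the same period, but not the other way around.

```python
import math


def cholesky_2x2(sigma):
    """Lower-triangular P with P P' = sigma for a 2x2 covariance matrix."""
    p11 = math.sqrt(sigma[0][0])
    p21 = sigma[1][0] / p11
    p22 = math.sqrt(sigma[1][1] - p21 * p21)
    return [[p11, 0.0], [p21, p22]]


def matmul_2x2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def orthogonalized_irf(a1, sigma, horizon):
    """Responses Theta_h = A_1^h P for h = 0..horizon (VAR with one lag).

    Theta_h[i][j] is the response of variable i, h periods ahead, to a
    one-standard-deviation orthogonalized shock in variable j.
    """
    theta = cholesky_2x2(sigma)
    responses = [theta]
    for _ in range(horizon):
        theta = matmul_2x2(a1, theta)
        responses.append(theta)
    return responses


a1 = [[0.5, 0.1], [0.2, 0.3]]         # hypothetical coefficients
sigma = [[0.04, 0.01], [0.01, 0.09]]  # hypothetical residual covariance
for h, theta in enumerate(orthogonalized_irf(a1, sigma, horizon=3)):
    # Response of Fund flow (row 1) to a Return shock (column 0).
    print(h, round(theta[1][0], 5))
```

The zero in the upper-right corner of the impact matrix at horizon 0 is the identifying restriction imposed by the ordering, so results should be read with the chosen ordering in mind.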

4. Data

4.1 Data Collection

Data is used for the period January 2000 up to and including December 2017. Even though Svetina & Wahal (2008) use ETF data from the inception of ETFs, this research follows Kalaycioglu (2004) and Staer (2017) in starting the dataset in the year 2000. This is done because, from that point, the popularity of investing in ETFs would have given the flows sufficient informational content. This research extends the period, relative to these two authors, to the most recent year end, 2017, in order to increase the number of observations and incorporate the more recent rise of the ETF market. This research focuses on US equity ETFs, following Staer (2017). The reasons for this are that the US equity ETF market is large and “the transparency of the creation-redemption process” relative to the European market. Staer (2017) also mentions the “prevalence of the physical replication of the index” as a reason, since his focus is on the short-term pricing pressure of the flows. However, due to the long-term forecasting focus of this research, that reason is not applicable here.

In order to answer the research question on whether ETF flows predict returns of the underlying index, the main variables arise from the question itself. ETF flows and the returns of the underlying index are the focus of this research. Fund flows and the prices of the underlying indices from which the returns are calculated are retrieved from Bloomberg. The literature points to Bloomberg as the prime source for ETF data. First, the ETF Screener of Bloomberg was used to select only equity ETFs and the US market. Then that list of 744 tickers was used to download daily data on the fund flow, shares outstanding, net asset values of the funds (NAV), total assets under management of the fund (AUM), fund price, underlying index ticker, index price, as well as other variables used for controlling the dataset, variable construction or other applications. From the Center for Research in Security Prices (CRSP), the permanent numbers of the ETFs, used for ticker consistency, and the funds' end-of-day prices are retrieved.

For several ETFs, the underlying index ticker was not available through Bloomberg. Going through a sample showed that these funds do not track a specific index, and therefore these funds were dropped from the dataset. Several funds did not provide the correct share code as used by CRSP and were dropped as well. Also dropped were funds with a "Bear Market" strategy, leveraged funds, funds based on derivatives, actively managed funds and those that were hedged for currency fluctuations. The reason for this is that coefficients are hard to interpret if the returns of the index are not tracked on a one-to-one basis by the ETF. CRSP provides negative prices for the funds if the given price is the bid/ask average; hence, the absolute values are taken. Multiple funds show values for the AUM and NAV only periodically. These values are carried forward until a new value is reported. Missing values for which no previous value is available to be carried forward are taken out of the sample, because carrying values backwards would deteriorate the data. Following Clifford et al. (2014), the observations of funds within six months of their inception date are omitted, to account for what Clifford et al. (2014) call the incubation bias. This bias is a result of newly incepted funds having inflows that are excessively large relative to mature ETFs due to the mere fact that these ETFs have just come into existence. Bloomberg did not provide the number of shares outstanding for all observations; in these cases the missing values were filled with the shares outstanding as provided by CRSP. Bloomberg values for the shares outstanding are preferred, since they are updated more frequently. Whilst exploring the data, several funds showed abnormal returns in a single period. Manually checking these funds showed that the abnormal returns were due to stock splits or convergences, as AUM and NAV remained stable. For these funds, the prices and shares outstanding were corrected, making the previous values correspond to the later values. In the end, the variables are winsorized at the 1st and 99th percentile.
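Two of the cleaning steps above, carrying reported values forward and winsorizing, can be sketched as follows. The function names and sample values are hypothetical, and the nearest-rank percentile rule is an assumption: statistical packages use various interpolation rules that give slightly different cut-offs.

```python
import math


def carry_forward(values):
    """Fill missing entries (None) with the last reported value.

    Leading missing entries have nothing to carry forward and are dropped,
    so the matching dates would have to be dropped alongside them.
    """
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        if last is not None:
            filled.append(last)
    return filled


def winsorize(values, lower_pct=1.0, upper_pct=99.0):
    """Clamp values to the [lower_pct, upper_pct] percentile range."""
    ordered = sorted(values)
    n = len(ordered)

    def nearest_rank(pct):
        # Nearest-rank percentile: smallest value with at least pct% at or below.
        rank = max(1, math.ceil(pct * n / 100.0))
        return ordered[rank - 1]

    low, high = nearest_rank(lower_pct), nearest_rank(upper_pct)
    return [min(max(v, low), high) for v in values]


print(carry_forward([None, 10.0, None, None, 12.0, None]))
clipped = winsorize([float(i) for i in range(1, 201)])
print(clipped[0], clipped[-1])
```

Winsorizing replaces extreme observations with the cut-off values instead of deleting them, so the number of observations is unchanged.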

These filters decreased the number of ETFs in the sample from 744 to 509, with the total number of indices tracked decreasing from 633 to 489. The development of the number of US equity ETFs within this sample is shown in figure 1. It shows a relatively stable number of ETFs until 2011, when the number of ETFs started to rise rapidly. This is in line with the growth of the popularity of the ETF market as depicted in the introduction. There are many ETFs that follow different strategies or asset classes and are not in this sample; hence figure 1 is an illustration rather than a complete picture of the growth of the number of ETFs. The figure also shows the in-sample development of the total assets in the US equity ETF industry over time, which shows a similar picture to the number of ETFs.


Figure 1: The in-sample developments of the number of ETFs and total assets under management in the US equity ETF market.

4.2 ETF Flows

Since ETF share creation and redemption, and hence the in- and outflows, generally occur in bulk through the AP, monthly data can be used to mitigate the effect these bulk flows might have on the data (Clifford et al., 2014). In line with Broman & Shum (2018) and the reasoning in the section The Creation/Redemption Process, flows act as a proper proxy for demand and hence can be used to translate investor sentiment and market demand into the flow-return relationship.

Multiple definitions of flows can be used. One is the percentage change in shares outstanding, as used by Staer (2017). Another is the one used in the Bloomberg Terminal, from which a large portion of the data is retrieved, as well as by Ben-Rephael et al. (2011). These two sources define flow as the difference in ETF shares outstanding between period t and t-1, times the NAV at time t. A third approach is to take fund size into account by dividing the Bloomberg definition by AUM, as done by Broman & Shum (2018). Downs et al. (2016) use the absolute net euro flow divided by the fund size of the previous period in their REIT flow analysis. Edelen & Warner (2001) define the flow as the percentage change of AUM minus the percentage change of NAV.

[Figure 1 plot. Axes: Total AUM (millions of $), Number of ETFs, Date (2000 to 2018). Series: ETFs, AUM.]


This research follows Edelen & Warner (2001) in defining the flow as the percentage change of total AUM of the fund minus the percentage change of NAV. The formula used is as follows:

$$\mathit{Flow}_{i,t} = \frac{\Delta \mathit{AUM}_{i,t}}{\mathit{AUM}_{i,t-1}} - \frac{\Delta \mathit{NAV}_{i,t}}{\mathit{NAV}_{i,t-1}}$$

Here, $\mathit{Flow}_{i,t}$ is the flow of ETF i at time t, $\mathit{AUM}_{i,t}$ is the total assets under management of ETF i at time t, and $\mathit{NAV}_{i,t}$ is the net asset value of ETF i at time t. Using this definition has several advantages over the other definitions. First, by subtracting the percentage change in NAV from the percentage change in AUM, the price return of the underlying index is eliminated from the equation. This is a necessity, since the other dependent variable in the VAR and its lags are those same returns, and these returns would otherwise be incorporated in both the return and the flow variables. By this construction, flow represents only that part of the change in dollar value of the ETF that is driven by the creation/redemption process. Another advantage of this definition is that taking percentage changes rather than absolute flows eliminates a size bias. As will be illustrated in figure 2, size differences are large and should therefore be corrected for.
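As a worked illustration of this definition, the sketch below computes the flow for one fund-month. The function name and the input figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def etf_flow(aum_prev, aum_now, nav_prev, nav_now):
    """Percentage change in AUM minus percentage change in NAV."""
    aum_growth = (aum_now - aum_prev) / aum_prev
    nav_return = (nav_now - nav_prev) / nav_prev
    return aum_growth - nav_return


# Hypothetical fund: AUM grows from $100m to $110m while NAV rises from $50 to $52.
flow = etf_flow(100.0, 110.0, 50.0, 52.0)
print(f"{flow:.2%}")  # 6.00%: 10% asset growth less a 4% price return
```

If AUM had grown by exactly the 4% NAV return, the computed flow would be zero, which is the sense in which the price return is removed from the flow variable.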

4.3 Returns

The second main variable in the VAR is returns. Returns will be calculated on a monthly basis from end-of-month closing prices obtained from Bloomberg. This will be done using the arithmetic return formula:

$$\mathit{Return}_{j,t} = \frac{\mathit{Price}_{j,t}}{\mathit{Price}_{j,t-1}} - 1$$

Here, $\mathit{Return}_{j,t}$ is the monthly return of index j over month t, and $\mathit{Price}_{j,t}$ is the end-of-month price of index j in month t.
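The return formula can be applied to a series of end-of-month prices as in the short sketch below. The function name and the price series are hypothetical.

```python
def arithmetic_returns(prices):
    """Monthly arithmetic returns from consecutive end-of-month prices."""
    return [now / prev - 1 for prev, now in zip(prices, prices[1:])]


# Hypothetical index levels at three consecutive month ends.
monthly_returns = arithmetic_returns([100.0, 110.0, 99.0])
```

The first month of a price series has no previous price, so a return series is always one observation shorter than the price series it is computed from.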

4.4 Descriptive Statistics

Now that the variables have been defined, summary statistics, tables and graphs will be provided and commented on to illustrate the data sample and the variables of interest. These will help in identifying remarkable aspects of the variables or other data characteristics. Table 1 shows the summary statistics of the variables of the final sample with monthly observations.

Table 1

Descriptive statistics for the ETFs and underlying indices.

This table reports descriptive statistics for the variables used for the ETFs and the underlying indices. The number of observations, mean, standard deviation, minimum value, median and maximum value are reported. The sample includes ETF-month or index-month observations for the period 2000-2017. ETF Price, ETF AUM, ETF NAV and Index Price are in US dollars, with ETF AUM in millions. ETF Return, ETF Flow and Index Return are expressed as decimal fractions (0.01 corresponds to 1%).

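| Variable | N | Mean | St.Dev | Min | Median | Max |
| --- | --- | --- | --- | --- | --- | --- |
| ETF Price | 25203 | 44.768 | 30.777 | 9.89 | 33.68 | 175.82 |
| ETF AUM | 25203 | 2296.484 | 6166.711 | 2.171 | 187.454 | 42209.79 |
| ETF NAV | 25203 | 44.723 | 30.8 | 9.85 | 33.641 | 175.901 |
| ETF Return | 24695 | 0.006 | 0.048 | -0.144 | 0.008 | 0.14 |
| ETF Flow | 24695 | 0.034 | 0.121 | -0.261 | 0.001 | 0.663 |
| Index Price | 25203 | 2803.434 | 4160.719 | 83.82 | 1473.34 | 28528.35 |
| Index Return | 24695 | 0.008 | 0.047 | -0.139 | 0.011 | 0.139 |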

Table 1 shows that the amount of assets under management differs substantially between ETFs. The smallest ETF in the sample has assets of $2 million, while the largest has total assets of $42 billion, which is roughly 19,000 times as large. The median total assets of the ETF sample is $187 million, whereas the mean is almost $2.3 billion. From this it can be concluded that a relatively small group of ETFs holds a large share of the assets. Figure 2 illustrates the extent of this phenomenon. It shows the aggregate AUM per decile as a percentage of the total assets of all ETFs at the end of 2017. This reaffirms that a small number of ETFs comprise a large part of the US equity ETF market, as the figure shows that nearly 90% (89.4%) of the total assets are in the 10% largest ETFs. Table 1 also shows that the index prices are not nearly the same as the ETF NAV. This occurs because ETFs track returns rather than the exact prices. Hence, the value of the ETF may differ significantly from the price of the index whose returns it aims to track. Furthermore, there seem to be small differences between ETF return and index return, which could arise because of tracking error, the error that occurs if the ETF does not track the index closely enough. Besides the large differences in AUM, no surprising results can be found in the table.
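The concentration measure behind figure 2 can be computed as sketched below. This is a minimal illustration with a hypothetical function name and made-up fund sizes; how the thesis assigns funds to deciles in the presence of ties or uneven group sizes is not stated, so the rank-based rule here is an assumption.

```python
def aum_share_by_decile(aums):
    """Share of aggregate AUM held by each size decile (index 0 = smallest tenth)."""
    ordered = sorted(aums)
    n = len(ordered)
    total = sum(ordered)
    shares = [0.0] * 10
    for rank, aum in enumerate(ordered):
        decile = rank * 10 // n  # assumption: deciles are assigned by size rank
        shares[decile] += aum / total
    return shares


# Hypothetical cross-section: 18 small funds and 2 very large ones (in $ millions).
shares = aum_share_by_decile([1.0] * 18 + [41.0, 41.0])
top_decile_share = shares[9]
```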

Figure 3 shows the mean and median ETF flow, as defined in section 4.2, over time. The variance has clearly decreased over time. A possible reason is the increase in the number of ETFs, which smooths the lines in figure 3 because a single ETF has a lower impact on the whole. Figure 3 also shows that the ETFs in the sample have generally had an average inflow between 0% and 5%. As stated in table 1, the mean and median ETF flow are 3.4% and 0.1% respectively. The ETF flows can differ significantly between ETFs. There are 100 ETF-month combinations that show a 0% flow for the ETF in that month. There are 25,103 ETF-month combinations that do show a flow for the ETF in that month, of which 15,000 show inflows and 9,503 outflows. In the dataset, there is no month without at least one ETF showing an in- or outflow.

Figure 2: ETFs sorted in deciles by total AUM.
[Plot title: Fraction of Aggregate AUM per Decile. Vertical axis: Fraction of AUM (0 to 1); horizontal axis: Deciles (1 to 10).]

Figure 3: Mean and median of ETF flows per month.
[Vertical axis: Flow (%); horizontal axis: Date (2000 to 2018). Series: Mean Flow, Median Flow.]

Figure 4: Mean flow and return per month.

Figure 4 shows the mean monthly flow of all ETFs in the sample and the mean monthly returns of all the underlying indices. This graph may give an indication of whether a trend can be observed in the two main variables and whether the mean of one follows the other, which might indicate which of the two hypotheses will be supported. First, there appears to be no trend in either variable. Second, the graph does not provide any convincing evidence for either one of the hypotheses. The aggregate variables, however, do not necessarily show the same movements as the within-panel movements. Table 2 gives the pairwise correlations between the two main variables and the monthly lags of themselves and of each other, up to half a year, and should give a clearer picture than figure 4.

Table 2 leads us to expect the following for the VAR analysis. First, there appears to be a small amount of serial autocorrelation in index returns. This might indicate that returns contain some predictive power for future returns. Second, past flows tend to be negatively correlated with current index returns. This is the opposite of what the information hypothesis argues, because here an inflow would predict negative returns. Third, there is serial autocorrelation in ETF flows as well, which indicates that ETF flows contain information about future ETF flows. Fourth, returns and lagged returns are positively correlated with ETF flows, and all these correlation coefficients are significant. This is indicative of return-chasing behavior.

Table 2

Pairwise correlations for Returns and Flow and their lagged values

This table, separated in panels A, B, C and D, represents the pairwise correlations for the variables ETF Flow and Index Return and their lagged values up to 6 months back. Panel A represents the correlation matrix for the Index Returns and their own lagged values. Panel B represents the correlation matrix of the Index Return and the ETF Flow and lagged ETF Flows. Panel C represents the correlation matrix for the ETF Flows and their own lagged values. Panel D represents the correlation matrix for ETF Flows and Index Returns and lagged Index Returns. Statistical significance at the 10% level is indicated by *.

A) Returns - Past Returns

| Variables | Index Return | Index Return t-1 | Index Return t-2 | Index Return t-3 | Index Return t-4 | Index Return t-5 | Index Return t-6 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Index Return | 1.000 | | | | | | |
| Index Return t-1 | 0.040* | 1.000 | | | | | |
| Index Return t-2 | -0.037* | 0.040* | 1.000 | | | | |
| Index Return t-3 | 0.006 | -0.037* | 0.038* | 1.000 | | | |
| Index Return t-4 | 0.025* | 0.006 | -0.040* | 0.038* | 1.000 | | |
| Index Return t-5 | 0.084* | 0.025* | 0.007 | -0.042* | 0.042* | 1.000 | |
| Index Return t-6 | 0.012* | 0.080* | 0.025* | 0.005 | -0.044* | 0.041* | 1.000 |

B) Returns – ETF Flow & Past ETF Flow

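| Variables | Index Return | ETF Flow | ETF Flow t-1 | ETF Flow t-2 | ETF Flow t-3 | ETF Flow t-4 | ETF Flow t-5 | ETF Flow t-6 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Index Return | 1.000 | | | | | | | |
| ETF Flow | 0.103* | 1.000 | | | | | | |
| ETF Flow t-1 | 0.004 | 0.149* | 1.000 | | | | | |
| ETF Flow t-2 | -0.009 | 0.127* | 0.149* | 1.000 | | | | |
| ETF Flow t-3 | -0.014* | 0.106* | 0.127* | 0.151* | 1.000 | | | |
| ETF Flow t-4 | -0.019* | 0.070* | 0.104* | 0.125* | 0.148* | 1.000 | | |
| ETF Flow t-5 | -0.000 | 0.061* | 0.069* | 0.103* | 0.122* | 0.147* | 1.000 | |
| ETF Flow t-6 | -0.004 | 0.050* | 0.061* | 0.067* | 0.102* | 0.120* | 0.146* | 1.000 |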


C) ETF Flows - Past ETF Flows

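| Variables | ETF Flow | ETF Flow t-1 | ETF Flow t-2 | ETF Flow t-3 | ETF Flow t-4 | ETF Flow t-5 | ETF Flow t-6 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ETF Flow | 1.000 | | | | | | |
| ETF Flow t-1 | 0.149* | 1.000 | | | | | |
| ETF Flow t-2 | 0.127* | 0.149* | 1.000 | | | | |
| ETF Flow t-3 | 0.106* | 0.127* | 0.151* | 1.000 | | | |
| ETF Flow t-4 | 0.070* | 0.104* | 0.125* | 0.148* | 1.000 | | |
| ETF Flow t-5 | 0.061* | 0.069* | 0.103* | 0.122* | 0.147* | 1.000 | |
| ETF Flow t-6 | 0.050* | 0.061* | 0.067* | 0.102* | 0.120* | 0.146* | 1.000 |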

D) ETF Flows - Returns & Past Returns

| Variables | ETF Flow | Index Return | Index Return t-1 | Index Return t-2 | Index Return t-3 | Index Return t-4 | Index Return t-5 | Index Return t-6 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ETF Flow | 1.000 | | | | | | | |
| Index Return | 0.103* | 1.000 | | | | | | |
| Index Return t-1 | 0.114* | 0.040* | 1.000 | | | | | |
| Index Return t-2 | 0.042* | -0.037* | 0.040* | 1.000 | | | | |
| Index Return t-3 | 0.022* | 0.006 | -0.037* | 0.038* | 1.000 | | | |
| Index Return t-4 | 0.025* | 0.025* | 0.006 | -0.040* | 0.038* | 1.000 | | |
| Index Return t-5 | 0.023* | 0.084* | 0.025* | 0.007 | -0.042* | 0.042* | 1.000 | |
| Index Return t-6 | 0.020* | 0.012* | 0.080* | 0.025* | 0.005 | -0.044* | 0.041* | 1.000 |

* shows significance at the .1 level
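A lagged pairwise correlation of the kind reported in table 2 can be sketched as follows. This simplified version works on a single time series, whereas the table pools ETF-month observations; the function names and the sample data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(current, other, lag):
    """Correlation between current[t] and other[t - lag]."""
    if lag == 0:
        return pearson(current, other)
    return pearson(current[lag:], other[:-lag])


# Hypothetical series in which flows react to last month's return.
returns = [0.01, -0.02, 0.03, 0.00, 0.02, -0.01]
flows = [0.0] + [2 * r for r in returns[:-1]]
flow_vs_lagged_return = lagged_correlation(flows, returns, 1)
```

In panel data the lag has to be taken within each ETF before pooling, so that the last month of one fund is never paired with the first month of another.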

5. Results

This section covers the VAR analysis, as described in the methodology section, and the results derived from it. The first step in the analysis is to test for stationarity. This is done by performing Fisher-type tests based on Phillips-Perron. The results are displayed in appendix A.1. The tests show that for both the ETF flow and the index return variable the null hypothesis of non-stationarity can be rejected at the 1% significance level, which supports the stationarity of the variables. Because of this stationarity it is concluded that there is no trend and hence that fixed effects do not have to be controlled for.
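A Fisher-type panel test combines the p-values of unit-root tests run separately on each panel. The sketch below shows only that combination step, using the inverse chi-squared statistic; it does not compute the Phillips-Perron tests themselves, the input p-values are hypothetical, and the statistics reported in appendix A.1 may include other variants of the combination. Independence across panels is assumed.

```python
from math import exp, log


def fisher_combined_pvalue(p_values):
    """Fisher's method: P = -2 * sum(ln p_i), chi-squared with 2N degrees of freedom.

    Returns the statistic and its p-value under the joint null hypothesis.
    """
    n = len(p_values)
    statistic = -2.0 * sum(log(p) for p in p_values)
    # For an even number of degrees of freedom (2N) the chi-squared survival
    # function has the closed form exp(-x/2) * sum_{k<N} (x/2)^k / k!.
    half = statistic / 2.0
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return statistic, exp(-half) * total


# Hypothetical per-fund unit-root p-values.
statistic, p_value = fisher_combined_pvalue([0.04, 0.20, 0.01, 0.08])
```

A small combined p-value rejects the null hypothesis that every panel contains a unit root.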

The VAR is performed on the full sample and full period, and the results are reported in table 3. According to the AIC, nine lags should be used in order to provide the best-fitting model. Column 1 shows the first equation of the VAR, with current index returns as the dependent variable and the lags of index returns and ETF flows as independent variables. Column 2 shows the ETF flow as dependent variable with the lagged index returns and flows as independent variables. Table 3 shows a mixed picture for the effect of lagged index returns on index returns: multiple coefficients are significant, but the sign differs among the significant coefficients. The two, three, four and eight month lagged ETF flows show a significant effect on index returns (10% level for t-2, 1% level for the other lags), with negative coefficients indicating that inflows are followed by negative returns rather than by the positive returns that the information hypothesis predicts. The one, two, four and five month lagged index returns have a significant and positive relationship with ETF flows at time t, indicative of return-chasing behavior. Lastly, ETF flows tend to positively predict future ETF flows, with all lags except the seven-month lag being significant at the 10% level or lower.

Table 3

Long-term relationship between ETF Flows and Index Returns

This table reports the coefficients from the ETF-month vector autoregression of the returns of the underlying index and ETF flows. Column 1 shows the results of the regression of the index returns on its own lagged values and ETF flow lagged values. Column 2 shows the results of the regression of the ETF flows on the lagged values of the index returns and on its own lagged values.

| VARIABLES | (1) Index Return | (2) ETF Flow |
| --- | --- | --- |
| Index Return t-1 | 0.0346*** (0.0100) | 0.280*** (0.0195) |
| Index Return t-2 | -0.0440*** (0.00984) | 0.0759*** (0.0192) |
| Index Return t-3 | -0.00437 (0.00974) | 0.0291 (0.0183) |
| Index Return t-4 | 0.0188* (0.00970) | 0.0529*** (0.0178) |
| Index Return t-5 | 0.0663*** (0.00923) | 0.0502*** (0.0184) |
| Index Return t-6 | -0.00286 (0.00900) | 0.0243 (0.0172) |
| Index Return t-7 | -0.0705*** (0.00888) | -0.0191 (0.0177) |
| Index Return t-8 | 0.0366*** (0.00880) | -0.00260 (0.0168) |
| Index Return t-9 | -0.0276*** (0.00858) | 0.0129 (0.0166) |
| ETF Flow t-1 | -0.000608 (0.00344) | 0.105*** (0.0127) |
| ETF Flow t-2 | -0.00564* (0.00336) | 0.0775*** (0.0113) |
| ETF Flow t-3 | -0.0101*** (0.00341) | 0.0564*** (0.0107) |
| ETF Flow t-4 | -0.0122*** (0.00333) | 0.0226** (0.00960) |
| ETF Flow t-5 | -0.00424 (0.00320) | 0.0258*** (0.00964) |
| ETF Flow t-6 | 9.86e-06 (0.00311) | 0.0149* (0.00869) |
| ETF Flow t-7 | -0.00332 (0.00311) | 0.0115 (0.00879) |
| ETF Flow t-8 | -0.00834*** (0.00310) | 0.0233*** (0.00869) |
| ETF Flow t-9 | -0.00257 (0.00298) | 0.0209** (0.00837) |
| Observations | 19,844 | 19,844 |

Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
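The two-equation structure of such a VAR can be illustrated with a much smaller model. The sketch below fits a bivariate VAR with a single lag by ordinary least squares, one equation at a time, on one time series. It is an assumption-laden simplification: the thesis uses nine lags on ETF-month panel data, and all function names and data here are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(regressors, target):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(regressors[0])
    xtx = [[sum(row[i] * row[j] for row in regressors) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(regressors, target)) for i in range(k)]
    return solve(xtx, xty)


def fit_var1(returns, flows):
    """Return [constant, return lag, flow lag] for the return and the flow equation."""
    lagged = [[1.0, returns[t - 1], flows[t - 1]] for t in range(1, len(returns))]
    return ols(lagged, returns[1:]), ols(lagged, flows[1:])


# Hypothetical data in which flows chase last month's return.
returns = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01, 0.04, -0.03, 0.02, 0.01, -0.01, 0.03]
flows = [0.01]
for t in range(1, len(returns)):
    flows.append(0.005 + 0.3 * returns[t - 1] + 0.1 * flows[t - 1])
return_equation, flow_equation = fit_var1(returns, flows)
```

Both equations share the same regressors, which is why each column of a VAR table can be read as a separate regression on the same set of lags.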

The VAR is checked for stability and Granger causality, with results in appendix A.2 and A.3 respectively. Convincing evidence is found that the lags of ETF flows provide additional information about future values of returns above and beyond the information already contained in past returns themselves. The same applies to the informational content of returns for the future values of flows. The VAR system as a whole is stable as well, as indicated by the unit circle in appendix A, and the null hypothesis of no Granger causality can be rejected at the 1% significance level.
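The stability condition referred to here requires every eigenvalue of the VAR's companion matrix to lie inside the unit circle. For a two-variable VAR with one lag the companion matrix is the 2x2 coefficient matrix itself, so the check reduces to the sketch below. The function names are hypothetical, and the example matrix uses only the first-lag coefficients of table 3 for illustration; it is not the full nine-lag system tested in the appendix.

```python
import cmath


def var1_eigenvalues(coef):
    """Eigenvalues of a 2x2 coefficient matrix from its characteristic polynomial."""
    (a, b), (c, d) = coef
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def is_stable(coef):
    """A VAR(1) is stable when all eigenvalues have modulus below one."""
    return all(abs(value) < 1 for value in var1_eigenvalues(coef))


# Rows: return equation, flow equation. Columns: lagged return, lagged flow.
first_lag_only = [[0.0346, -0.000608], [0.280, 0.105]]
stable = is_stable(first_lag_only)
```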

Figure 5: Orthogonalized Impulse Response Functions of ETF flows and index returns

[Figure 5 panels: ETF Flow : ETF Flow; Index Return : ETF Flow; ETF Flow : Index Return; Index Return : Index Return. Horizontal axis: Months (0 to 10). Legend: 95% CI, Orthogonalized IRF.]

It is important to note that the coefficients cannot be interpreted in the same way as those of a regular OLS regression. As mentioned in the methodology section, orthogonalized impulse response functions (IRFs) can be used to provide a visual representation of how one variable moves after a one standard deviation shock to the other variable. The full sample IRFs are displayed in figure 5. ETF flows respond to a one standard deviation (10.57%) increase in flows with a relatively low amount of future inflows, gradually decreasing from 1.1% in the next month to 11 basis points 10 months after the impulse. Returns do not seem to respond to an impulse in flows, resulting in a flat line. All return responses to the 10.57% increase in flows are negative and lie between 0.6 and 15 basis points in magnitude, none of which is economically significant. Flows do respond to an increase in returns of 4.59%. The next-month response is the largest, with flows increasing by 1.4%. In the second month the effect decreases to 63 basis points, and this trend continues as the effect decreases to near zero over the following months. An impulse in returns does not seem to be followed by further returns: the responses fluctuate around zero, with values between -37 and 15 basis points over the next 10 months. For all four IRFs, the 95% confidence interval lies close to the line depicted, which indicates that the conclusions drawn are likely to hold.
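How an orthogonalized IRF is obtained can be sketched for a two-variable VAR with one lag. The responses are the powers of the coefficient matrix multiplied by the Cholesky factor of the residual covariance matrix. Everything in the example is an assumption made for illustration: the function names, the coefficient and covariance values, the single lag, and the variable ordering (return first, flow second), which matters for a Cholesky decomposition and is not stated for the thesis.

```python
from math import sqrt


def cholesky_2x2(cov):
    """Lower-triangular P with P P' = cov for a 2x2 covariance matrix."""
    (a, b), (_, d) = cov
    p11 = sqrt(a)
    p21 = b / p11
    p22 = sqrt(d - p21 * p21)
    return [[p11, 0.0], [p21, p22]]


def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def orthogonalized_irf(coef, cov, horizons):
    """Responses A^h P for h = 0..horizons of y_t = A y_{t-1} + u_t, Var(u_t) = cov.

    Element [h][i][j] is the response of variable i, h periods after a
    one standard deviation orthogonal shock to variable j.
    """
    response = cholesky_2x2(cov)
    responses = [response]
    for _ in range(horizons):
        response = matmul(coef, response)
        responses.append(response)
    return responses


# Hypothetical system, ordered (return, flow): flows react to last month's return.
coef = [[0.0, 0.0], [0.3, 0.1]]
cov = [[0.04, 0.0], [0.0, 0.01]]
irf = orthogonalized_irf(coef, cov, horizons=10)
flow_response_to_return_shock = [step[1][0] for step in irf]
```

With these made-up numbers a one standard deviation return shock of 0.2 raises flows by 0.06 in the next period, and the effect then decays geometrically.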

With regard to the research question and the hypotheses, it can be concluded that hypothesis 1 is rejected and hypothesis 2 is confirmed. Regarding hypothesis 1, the predictive power of ETF flows for future returns is statistically significant. However, it is economically insignificant, in that the magnitudes of the responses in the IRFs are small and close to zero. This means that no inferences can be made from current ETF flows regarding future index returns. Therefore the hypothesis stating that ETF flows have predictive power for future index returns is rejected. Hypothesis 2 stated that underlying index returns have predictive power for the future flows of ETFs tracking that specific index. This hypothesis is confirmed, as it is found that current positive index returns are accompanied by current ETF inflows and by inflows in the subsequent months, with the largest portion of inflows occurring in the month following the increased return. Linking these findings to the flow-return relationship discussion, evidence is found in favor of the return-chasing hypothesis, as inflows follow positive returns. The information hypothesis cannot be confirmed, as the evidence suggests that increases in ETF flow are not followed by positive returns.

6. Robustness Checks

In order to check whether the results found and discussed in the previous section hold in different scenarios, multiple robustness checks are performed. For these robustness checks, the approach to the VAR is similar to that explained in the methodology section and applied in the original analysis.

The first robustness check is performed by separating the period in which the number of ETFs was stable from the growth period, as depicted in figure 1 in the data section. This means that a separate VAR is performed for a sample between 2000 and 2010 and another one for a sample between 2011 and 2017. The VAR and the IRF results are displayed in appendix B. For both periods, no significantly different results are found compared with the full sample.

Different ETFs can have the same underlying index; therefore the flows of different ETFs tracking the same index are aggregated. As discussed by Staer (2017), this may be useful because flows may occur between ETFs that track the same index. Such outflows and inflows from one ETF to another affect the analysis because they are regarded as outflows from and inflows into the underlying index, whereas in fact they do not represent the sentiment that the index will underperform: the money is merely reallocated to another ETF tracking the same index. Since the number of ETFs tracking the same index is relatively small, the results are not expected to differ. The results can be found in appendix C and, as expected, do not show any significant differences from the original analysis.
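One way to aggregate flows across ETFs that share an index is sketched below. The text does not state how the aggregation is done, so the rule used here, converting each percentage flow into a dollar flow and dividing the summed dollar flows by the summed lagged AUM, is an assumption, as are the function name and the index labels.

```python
from collections import defaultdict


def aggregate_flows_by_index(observations):
    """Index-level flow for one month from (index_id, lagged_aum, pct_flow) tuples."""
    dollar_flow = defaultdict(float)
    lagged_aum = defaultdict(float)
    for index_id, aum_prev, pct_flow in observations:
        dollar_flow[index_id] += pct_flow * aum_prev
        lagged_aum[index_id] += aum_prev
    return {idx: dollar_flow[idx] / lagged_aum[idx] for idx in dollar_flow}


# Hypothetical month: $50m moves between two ETFs that track the same index.
month = [
    ("INDEX_A", 1000.0, -0.05),  # large ETF loses 5% of $1,000m
    ("INDEX_A", 500.0, 0.10),    # smaller ETF gains 10% of $500m
    ("INDEX_B", 200.0, 0.02),
]
index_flows = aggregate_flows_by_index(month)
```

In this example the two offsetting flows net to zero at the index level, which is the reallocation effect the aggregation is meant to remove.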

It became clear from table 1 and figure 2 that the largest 10% of ETFs hold nearly 90% of the assets in the ETF market. Even though the definition of ETF flow that is used takes fund size into account, the last robustness check looks at whether the results are consistent when only these 10% largest ETFs are used in the VAR. The results are reported in table 4, columns 1 and 2. If the results differ from the original VAR, the bottom 90% of ETFs by AUM might be the driver of the results in the original VAR. Therefore a separate VAR is performed for the bottom nine deciles as well, with the results reported in columns 3 and 4. Results for the stationarity, stability and Granger causality tests are reported in appendix D.

Table 4 confirms that an analysis of the top 10% of ETFs by AUM gives different results than that of the bottom 90% and the original VAR displayed in table 3. The results in columns 3 and 4 are similar to those in table 3. However, the coefficients are larger, hinting at an opposite or smaller effect in the top 10% that puts downward pressure on the coefficients of the original full sample VAR relative to the bottom 90%. An interesting outcome from columns 1 and 2 is that ETF flows of the previous quarter tend to have predictive power for returns, as the one, two and three month lagged ETF flows are significant. All three have a negative sign, though; hence inflows are followed by negative returns. This suggests that money flowing into the top decile of ETFs has, in general, been a poor investment. For the two subsamples the orthogonalized IRFs are plotted in figure 6 and figure 7 to allow a proper interpretation of the results. The results from the VAR as described above can be retraced in the IRFs.

In figure 6 it can be seen that for the top 10% of ETFs by assets, inflows are followed by outflows the next month. An ETF inflow of 5.5% is followed by outflows amounting to 1.1% the next month. In the following months in- and outflows alternate and are of a magnitude of less than half a percent. The flow impulse and return response IRF shows that ETF flows are followed by index returns in the opposite direction for the next 6 months. The size of these movements is between 13 and 47 basis points. These results confirm hypothesis 1 of predictive power of ETF flows for the returns of the underlying index. Interestingly, though, the opposite is found of what the information hypothesis argues: inflows are followed by negative returns rather than positive ones. The third IRF shows how the ETF flows move after an impulse of 3.9% to returns. The first three months show outflows of 23, 25 and 30 basis points. From month 4 onwards the flows alternate between positive and negative, with a magnitude between 1 and 20 basis points. This result differs from the original VAR, as it indicates that investors tend to sell rather than buy ETF shares after positive returns. Hence, evidence is found for hypothesis 2, which states that index returns have predictive power for future ETF flows, although the direction is the opposite of what the return-chasing hypothesis argues. The return-return IRF shows results similar to the full sample analysis. As expected and as indicated by columns 3 and 4 of table 4, the ETFs in the bottom 9 deciles have IRFs similar to the full sample IRFs.

Table 4

Long-term relationship between ETF Flows and Index Returns

This table reports the coefficients from the ETF-month vector autoregression of the returns of the underlying index and ETF flows. Columns 1 and 2 show the coefficients for the VAR on the top decile of ETFs by AUM. Columns 3 and 4 show the coefficients for the VAR on the bottom nine deciles. Columns 1 and 3 show the results of the regression of the index returns on its own lagged values and ETF flow lagged values. Columns 2 and 4 show the results of the regression of the ETF flows on the lagged values of the index returns and on its own lagged values.

| VARIABLES | Top 10% ETFs: (1) Index Return | Top 10% ETFs: (2) ETF Flow | Bottom 90% ETFs: (3) Index Return | Bottom 90% ETFs: (4) ETF Flow |
| --- | --- | --- | --- | --- |
| Index Return t-1 | 0.00307 (0.0321) | -0.0309 (0.0388) | 0.0335*** (0.0107) | 0.318*** (0.0214) |
| Index Return t-2 | -0.0347 (0.0295) | -0.0788** (0.0393) | -0.0443*** (0.0106) | 0.0903*** (0.0210) |
| Index Return t-3 | -0.0123 (0.0296) | -0.0922** (0.0385) | -0.00954 (0.0105) | 0.0363* (0.0199) |
| Index Return t-4 | 0.0171 (0.0289) | 0.0537 (0.0359) | 0.0171 (0.0105) | 0.0567*** (0.0195) |
| Index Return t-5 | 0.0388 (0.0281) | 0.0437 (0.0358) | 0.0713*** (0.00989) | 0.0519** (0.0201) |
| Index Return t-6 | -0.0904*** (0.0276) | -0.0279 (0.0381) | 0.00818 (0.00967) | 0.0296 (0.0188) |
| Index Return t-7 | -0.0570** (0.0286) | 0.0422 (0.0313) | -0.0727*** (0.00953) | -0.0288 (0.0193) |
| Index Return t-8 | 0.0660** (0.0275) | 0.0602 (0.0378) | 0.0314*** (0.00943) | -0.00527 (0.0183) |
| Index Return t-9 | -0.0495** (0.0251) | -0.0168 (0.0320) | -0.0225** (0.00923) | 0.0170 (0.0181) |
| ETF Flow t-1 | -0.0859*** (0.0211) | -0.204*** (0.0391) | 0.00251 (0.00350) | 0.114*** (0.0133) |
| ETF Flow t-2 | -0.0862*** (0.0202) | 0.0130 (0.0478) | -0.00299 (0.00343) | 0.0690*** (0.0116) |
| ETF Flow t-3 | -0.0516** (0.0202) | 0.0166 (0.0355) | -0.00946*** (0.00347) | 0.0534*** (0.0111) |
| ETF Flow t-4 | -0.0324 (0.0204) | -0.00917 (0.0346) | -0.0125*** (0.00339) | 0.0204** (0.0101) |
| ETF Flow t-5 | -0.0410** (0.0191) | -0.0177 (0.0379) | -0.00358 (0.00325) | 0.0248** (0.00996) |
| ETF Flow t-6 | -0.0313 (0.0195) | 0.0178 (0.0367) | 0.00103 (0.00316) | 0.0126 (0.00901) |
| ETF Flow t-7 | 0.00171 (0.0214) | -0.0196 (0.0305) | -0.00344 (0.00314) | 0.00924 (0.00909) |
| ETF Flow t-8 | -0.0180 (0.0179) | -0.0944*** (0.0301) | -0.00738** (0.00314) | 0.0244*** (0.00907) |
| ETF Flow t-9 | -0.0117 (0.0205) | 0.0355 (0.0304) | -0.00280 (0.00301) | 0.0153* (0.00866) |
| Observations | 1,951 | 1,951 | 17,157 | 17,157 |

Figure 6: Orthogonalized IRFs of ETF flows and index returns (top 10% by AUM)
[Plot title: Top Decile IRFs. Panels (impulse : response): ETF Flow : ETF Flow; Index Return : ETF Flow; ETF Flow : Index Return; Index Return : Index Return. Horizontal axis: Months (0 to 10). Legend: 95% CI, Orthogonalized IRF.]

Figure 7: Orthogonalized IRFs of ETF flows and index returns (bottom 90% by AUM)
[Panels (impulse : response): ETF Flow : ETF Flow; Index Return : ETF Flow; ETF Flow : Index Return; Index Return : Index Return. Horizontal axis: Months (0 to 10). Legend: 95% CI, Orthogonalized IRF.]

All in all, the original VAR appears to be robust. The only robustness check that showed significantly different results was the VAR on the top decile. In contrast to the full-sample analysis, the results for the top 10% of ETFs by AUM provide evidence for both hypotheses of predictive power. However, the direction found is the opposite of what the information and return-chasing hypotheses, the dominant theories in the flow-return discussion, argue.

7. Conclusion & Discussion

This research aims to provide insights into the long-term relationship between ETF flows and the returns of their underlying index. A VAR model has been applied to a monthly multivariate time-series dataset of US equity ETFs and the indices tracked by the ETFs in the sample. It is tested whether ETF flows have long-term predictive power on the returns of the underlying index, and whether returns of the underlying index have long-term predictive power on ETF flows. Within this flow-return discussion, the information hypothesis and return-chasing hypothesis are dominant.


returns occurs to the underlying index, equal to a return of 4.59%, a 1.4% increase in flows is observed the next month. The subsequent months this increase gradually declines to near zero. The hypothesis that ETF flows predict underlying index returns is rejected. Translated to the theories in the flow-return discussion it is found that the information hypothesis can be rejected and the return-chasing hypothesis is confirmed.

In the robustness checks, separating a period with a stable number of ETFs from a period with a growing number of ETFs does not result in any different inferences. Aggregating ETF flows for ETFs tracking the same index provides the same results as well. However, when separate VAR analyses are performed for the top decile and the bottom nine deciles by assets under management, significantly different results are found for the top decile. For this subsample, evidence is found in favor of predictive power of ETF flows for future index returns and of index returns for future ETF flows. However, the relation in both cases is negative. These results are exactly the opposite of what is theorized in both the information hypothesis and the return-chasing hypothesis.

These findings are in line with what was previously found for ETFs in the short term, mutual funds and REITs by Henderson & Buetow (2014), Lou (2012) and Downs et al. (2016), respectively. An interesting addition is the distinction between large and small ETFs by AUM and the different results for the two subsamples. With regard to the dominant theories in the flow-return discussion, it is interesting to observe that this research finds no evidence supporting the information hypothesis. The return-chasing hypothesis is confirmed for the full sample. However, these results are largely driven by the bottom 90% of ETFs by AUM.

Several limitations of the research should be mentioned. First, the top decile was shown to comprise nearly 90% of the ETF market in the sample and gave different results with regard to the return-chasing hypothesis than the full sample. Levy & Lieberman (2015) and Staer (2017) therefore focus their research on a selection of the largest ETFs in the market. However, completely ignoring the effect smaller ETFs can have on the market and on investors is not the proper approach either. Another limitation arises from the dataset only including ETFs that intend to track the index returns one-to-one and are long positions. The ETF market includes numerous different types of ETFs that all have their own characteristics and hence might show totally different results regarding the flow-return relationship. The last limitation arises from the fact that Bloomberg only provides ETFs that are still in the market, and therefore a survivorship bias could occur. Staer (2017), however, mentions that this is “unlikely to introduce a survivorship bias, as low assets under management (AUM) and generally low flows” are the reason for the ETFs delisting. Because the creation/redemption process creates shares in large bulks, the flows of these ETFs tend to be mostly in the secondary market, and therefore the translation of market demand to flow would provide less information regarding investor sentiment with regard to the underlying indices of the delisted ETFs.

The results of this research may be useful for investors wanting to exploit the flow-return relationship in order to generate superior investment results. Since this research does not find support for the information hypothesis, and inflows hence do not predict positive future returns, flow information cannot be used to anticipate returns. Only for the top decile can it be anticipated that returns tend to be slightly negative following an inflow. This might provide a signal for shareholders of these large ETFs to sell shares when inflows are high. The difference between large and smaller ETFs by AUM might provide investment opportunities as well; further research is required, however, to identify these opportunities.

Several topics for future research follow from the results found in this research. As mentioned, the different results found for the larger and the smaller ETFs might induce researchers to investigate this discrepancy further. While the large ETFs might have a larger impact on the market in factors such as liquidity and price discovery, analyses of the smaller ETFs might provide useful information regarding investment behavior and strategies. As mentioned as a limitation, the type of ETF used is only one of many in the vast ETF landscape. Since it is hard to combine these ETFs in a single analysis like this one, comparison studies or studies like this one using multiple datasets of different ETF types might provide valuable insights into the different impacts on the market or the investment tactics available to investors. Lastly, limited evidence is found for the dominant theories in the flow-return relationship. Hence, the results might incentivize researchers to look for explanations of why investors behave the way the results indicate and to find complementary theories to add to the flow-return discussion.

Reference List

Agapova, A. (2011). Conventional mutual index funds versus exchange-traded funds. Journal of Financial Markets, 14(2), 323-343.

Bekaert, G., Harvey, C. R., & Lumsdaine, R. L. (2002). The dynamics of emerging market equity flows. Journal of International Money and Finance, 21(3), 295-350.

Bennett, J. A., & Sias, R. W. (2001). Can money flows predict stock returns? Financial Analysts Journal, 57(6), 64-77.

Ben-David, I., Franzoni, F., & Moussawi, R. (2014). Do ETFs increase volatility? (No. w20071). National Bureau of Economic Research.

Ben-Rephael, A., Kandel, S., & Wohl, A. (2011). The price pressure of aggregate mutual fund flows. Journal of Financial and Quantitative Analysis, 46(2), 585-603.

Bhattacharya, A., & O'Hara, M. (2017). Can ETFs increase market fragility? Effect of information linkages in ETF markets.

Clifford, C. P., Fulkerson, J. A., & Jordan, B. D. (2014). What drives ETF flows? Financial Review, 49(3), 619-642.

Downs, D. H., Sebastian, S., Weistroffer, C., & Woltering, R. O. (2016). Real estate fund flows and the flow-performance relationship. The Journal of Real Estate Finance and Economics, 52(4), 347-382.

Edelen, R. M., & Warner, J. B. (2001). Aggregate price effects of institutional trading: a study of mutual fund flow and market returns. Journal of Financial Economics, 59(2), 195-220.

Fang, Y., & Sanger, G. C. (2011). Index price discovery in the cash market.

Froot, K. A., O'Connell, P. G., & Seasholes, M. S. (2001). The portfolio flows of international investors. Journal of Financial Economics, 59(2), 151-193.

Henderson, B. J., & Buetow, G. W. (2014). Are Flows Costly to ETF Investors? The Journal of Portfolio Management, 40(3), 100-112.

Kalaycıoğlu, S. (2004). Exchange traded fund flows.

Koesterich, R. (2008). The ETF strategist. New York, NY: Portfolio.

Levy, A., & Lieberman, O. (2015). Active flows and passive returns. Review of Finance, 20(1), 373-401.

Lou, D. (2012). A flow-based explanation for return predictability. The Review of Financial Studies, 25(12), 3457-3489.

Lütkepohl, H. (2005). New introduction to multiple time series analysis. Springer Science & Business Media.

Osterhoff, F., & Overkott, M. (2016). ETF Flows and Underlying Stock Returns: The True Cost of NAV-based Trading.

Perignon, C., Yeung, S., Hurlin, C., & Iseli, G. (2014). The Collateral Risk of ETFs (No. 1050). HEC Paris.

Ramaswamy, S. (2011). Market structures and systemic risks of exchange-traded funds. Bank for International Settlements.

Rompotis, G. G. (2011). Predictable patterns in ETFs' return and tracking error. Studies in Economics and Finance, 28(1), 14-35.

Rossi, E. (2004). Impulse response functions [PowerPoint Slides]. Retrieved from http://economia.unipv.it/pagp/pagine_personali/erossi/dottorato_svar.pdf

Sánchez, G. (2011). Technical tips on time series with Stata [PowerPoint Slides]. Retrieved from https://www.stata.com/meeting/mexico11/materials/gsanchez.pdf

Sims, C. A. (1980). Macroeconomics and reality. Econometrica: Journal of the Econometric Society, 1-48.
