
The data used in this research come from several databases. I obtain the options data from the Ivy DB US database of OptionMetrics. For each ETF, these data consist of daily observations of option volume for call and put contracts, implied volatilities for calls and puts, open interest for calls and puts separately, and option deltas. The option volume available in OptionMetrics is unsigned and non-directional (Zhou, 2022). According to Cremers and Weinbaum (2010) and Han et al. (2017), the implied volatilities and deltas of the American options on ETFs are calculated using the binomial tree method of Cox et al. (1979), which accounts for the possibility of early exercise.

My sample period runs from January 2013 to December 2020, which corresponds to 2,015 trading days. The data set contains 178 U.S. ETFs, of which only 109 had listed options.

Furthermore, I obtain the ETF data from the Center for Research in Security Prices (CRSP): daily observations of returns, closing prices, trading volume, shares outstanding, and bid and ask quotes for every ETF. The return data start in December 2012 to allow for the construction of the rolling 30-day realized volatility, momentum and short-term reversal measures. Moreover, I obtain the daily beta coefficient of each ETF, estimated over a monthly rolling window, from the Beta Suite of Wharton Research Data Services (Beta Suite, 2016). Additionally, from Kenneth French’s database available on Wharton Research Data Services, I retrieve the CAPM excess market return, Fama and French’s (1993) three factors, and Fama and French’s (1993) three factors together with Carhart’s (1997) momentum factor. Last, I retrieve the Lipper codes for each ETF in the sample from the Eikon database of Refinitiv.
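The reason for starting the return series one month early can be illustrated with a minimal sketch. It assumes that realized volatility is the sample standard deviation of the last 30 daily returns and that a window-based return measure is a compounded return over a chosen slice; the exact definitions of the momentum and reversal measures are not given in this section, so the function names and window choices below are hypothetical.

```python
from statistics import stdev


def rolling_realized_vol(returns, window=30):
    """Sample standard deviation of the last `window` daily returns.

    The output is aligned with `returns`; entries are None until a full
    window is available, which is why the return data must start before
    the first day of the analysis sample.
    """
    out = []
    for t in range(len(returns)):
        if t + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(stdev(returns[t + 1 - window : t + 1]))
    return out


def cumulative_return(returns, start, end):
    """Compounded return over returns[start:end] (end exclusive).

    Assumed building block for window-based measures such as momentum.
    """
    growth = 1.0
    for r in returns[start:end]:
        growth *= 1.0 + r
    return growth - 1.0
```

With a 30-day window, the first 29 entries are undefined, so the extra returns from December 2012 give the first trading days of January 2013 a (nearly) complete window.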

Similarly to previous studies (An et al., 2014; Han & Li, 2021), I use at-the-money ETF options with 30 days to expiration, where at-the-money means a delta of 0.5. The argument behind this decision is that at-the-money options are informationally rich because they are actively traded, which places them among the most liquid options (Stephan & Whaley, 1990; Xing et al., 2010). Equally important, Pan and Poteshman (2006) report that shorter-maturity options offer investors a higher degree of leverage. They also document that shorter-maturity options are preferred for trading on information that has the potential to fade quickly. Therefore, I also use at-the-money options with a 91-day expiration in one of the robustness analyses, similar to An et al. (2014), to evaluate the viability of the main independent variables in a slightly different setting.
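The selection rule described above can be sketched as a simple filter. The record layout is an assumption made for illustration: each option record is a dictionary with hypothetical keys `days` (days to expiration) and `delta`, and put deltas are assumed to be stored as negative numbers, so the filter compares absolute deltas.

```python
def select_atm(rows, days=30, abs_delta=0.5, tol=1e-9):
    """Keep records with the requested maturity and an absolute delta of 0.5.

    `rows` is a list of dicts with hypothetical keys "days" and "delta".
    """
    return [
        r
        for r in rows
        if r["days"] == days and abs(abs(r["delta"]) - abs_delta) <= tol
    ]


# Illustrative records only; the values are made up.
sample = [
    {"days": 30, "delta": 0.5, "impl_volatility": 0.20},
    {"days": 30, "delta": -0.5, "impl_volatility": 0.21},
    {"days": 91, "delta": 0.5, "impl_volatility": 0.22},
    {"days": 30, "delta": 0.25, "impl_volatility": 0.19},
]
main_sample = select_atm(sample)                # 30-day at-the-money options
robustness_sample = select_atm(sample, days=91)  # 91-day robustness setting
```

Calling the same function with `days=91` gives the robustness sample, so the main and robustness analyses differ only in the maturity argument.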

With regard to the ETFs, the initial sample included 178 U.S. ETFs, but because some have no listed options or no option data on OptionMetrics, only 109 ETFs remain valid for the analysis. Due to the good quality of the data, only a negligible number of missing observations had to be dropped. Past literature on stock return prediction using option volume or option implied volatility measures performed analyses at the intraday level (Chan et al., 2002), at daily or weekly levels (Atilgan et al., 2015; Cremers & Weinbaum, 2010; Johnson & So, 2012; Pan & Poteshman, 2006), at monthly levels (An et al., 2014; Bali & Hovakimian, 2009; Zhou, 2022) and at quarterly levels (Jones et al., 2018). In the case of ETF return prediction using option information, however, Han and Li (2021) studied all the frequencies, although only for one index. Therefore, my study also aims to fill this gap in the literature by using a sample of multiple ETFs and by conducting a comprehensive prediction analysis at the daily frequency.

4.1 Descriptive Statistics

Panel A of Table 1 presents descriptive statistics on the returns and other variables of the ETFs for the full sample and for the sample of ETFs with available options, which is used in the rest of the analysis. Looking at the average daily shares outstanding, one can notice that ETFs with available options have over 40% more shares outstanding. Likewise, their average daily volume and market capitalization are higher. This shows that ETFs with listed options are more heavily traded, and thus more liquid and larger, than those without traded options, similar to what previous literature found (Brown et al., 2021; Mayhew & Mihov, 2004). As already stated, the full sample includes 178 ETFs, while the sample with available options includes 109 ETFs. Nevertheless, 8 years of data yield 2,015 trading days for each ETF, which meets the criteria for the validity of the analysis regarding the number of observations. The average daily ETF return is 0.04% for all the ETFs and 0.05% for ETFs with available options, with a daily standard deviation of 1.23% for the former and a slightly higher figure, 1.35%, for the latter. Both distributions are non-normal, with moderate negative skewness of around −0.8 and high excess kurtosis of approximately 37. Furthermore, Panel B of Table 1 shows the classification of the ETFs in both samples by Lipper categories. From this, one can notice that the sample with available options includes slightly more ETFs focused on broad equities and almost double the proportion of bond ETFs, while all other categories have lower proportions.

Table 1

Descriptive Statistics for the ETF Sample

                                                All ETFs    ETFs with available options
Panel A: ETF characteristics
Average shares outstanding (millions)           83.58       119.89
Average daily volume (millions)                 2.55        4.03
Average ETF market capitalization (billions)    8.70        13.01
Number of daily observations                    358,670     219,635
Number of ETFs                                  178         109
Return descriptive statistics
Mean (%)                                        0.04        0.05
Standard deviation (%)                          1.23        1.35
Skewness                                        −0.83       −0.81
Excess kurtosis                                 37.10       36.72
Panel B: ETF classification by Lipper categories
Broad equities (%)                              44.95       48.59
Sector equities (%)                             33.03       25.42
Bonds (%)                                       9.17        18.08
Alternative equities (%)                        5.50        3.39
Natural resources (%)                           3.67        2.26
Commodities (%)                                 2.75        1.69
Mixed assets (%)                                0.92        0.56

Note. Panel A reports the daily descriptive statistics for the full ETF sample and the sample of ETFs with options available. The sample consists of daily data covering January 2013 to December 2020. The average shares outstanding and the average daily volume are reported in millions. The average market capitalization is reported in billions and is equal to the average of the shares outstanding multiplied by the share price of an ETF. The mean return and the standard deviation are reported in percentage. Excess kurtosis is calculated by subtracting 3 from the kurtosis estimate. Panel B reports a percentage classification of the two ETF samples in Lipper categories.
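The return statistics in Panel A can be sketched as follows. The estimators are assumptions: the note only states that excess kurtosis is kurtosis minus 3, so the sketch uses moment-based skewness and kurtosis with population (divide-by-n) central moments and a sample (divide-by-n−1) standard deviation. The function names are hypothetical.

```python
import math


def describe_returns(returns):
    """Mean, sample standard deviation, skewness and excess kurtosis.

    Skewness and kurtosis use population central moments (an assumption);
    excess kurtosis subtracts 3, the kurtosis of a normal distribution.
    """
    n = len(returns)
    mean = sum(returns) / n
    dev = [r - mean for r in returns]
    m2 = sum(d ** 2 for d in dev) / n
    m3 = sum(d ** 3 for d in dev) / n
    m4 = sum(d ** 4 for d in dev) / n
    return {
        "mean": mean,
        "std": math.sqrt(sum(d ** 2 for d in dev) / (n - 1)),
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
    }


def market_cap(shares_outstanding, price):
    """Market capitalization as shares outstanding times share price."""
    return shares_outstanding * price
```

Under these conventions a normal distribution has skewness 0 and excess kurtosis 0, so values of about −0.8 and 37 indicate a left-skewed, heavy-tailed return distribution.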

Regarding the main independent variables of this study, Table 2 reports the average daily mean and standard deviation of the put-call ratio and IV spread across ETFs for each year from 2013 to 2020 inclusive. First, the average daily put-call ratio for the whole sample is about 35%, indicating that put option volume is, on average, lower than call option volume in the ETF options market, very similar to what Pan and Poteshman (2006) find. This means that traders might be more inclined to use the options market for trading on positive news rather than negative news. Second, the average daily IV spread over the sample is negative, indicating that the put implied volatility is on average higher than the call implied volatility, which is consistent with prior findings (Cremers & Weinbaum, 2010; Jin et al., 2012; Jones et al., 2018; Lin et al., 2013). Noticeably, the year 2020, marked by the downturn caused by the outbreak of Covid-19, registered the highest daily put-call ratio levels and one of the highest daily IV spread levels in the whole sample. Additionally, the standard deviation of the put-call ratio was the lowest in that year, indicating a general predisposition of the market towards put options trading.

At the same time, the IV spread recorded its highest standard deviation, indicating that the highest daily values of both call and put implied volatilities in the sample were caused by the market turmoil.

Table 2

Descriptive Statistics for the Main Independent Variables

                P/C                 IVS
Year            M        SD         M        SD
2013            32.49    35.63      −0.85    1.68
2014            36.80    36.45      −0.69    2.81
2015            36.68    35.65      −0.60    4.63
2016            35.72    36.29      −1.21    4.14
2017            33.54    35.38      −0.51    3.21
2018            34.24    34.04      −0.17    3.56
2019            35.06    34.57      −0.40    2.83
2020            39.06    32.50      −0.36    4.93
Whole sample    35.45    35.14      −0.60    3.63

Note. The averages (M) and standard deviations (SD) of the put-call ratio (P/C) and IV spread (IVS) across the sample of optionable ETFs are reported on a yearly basis from January 2013 to December 2020. The figures are expressed in percentages. The last row reports the averages and standard deviations for the two measures over the whole sample period.
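The two measures summarized in Table 2 can be sketched in a few lines. The formulas are assumptions inferred from the discussion in this section rather than definitions stated here: the put-call ratio is taken as put volume divided by total option volume (so a value below 50% means puts trade less than calls), and the IV spread is taken as call implied volatility minus put implied volatility (so a negative value means put implied volatility is higher). The function names are hypothetical.

```python
def put_call_ratio(put_volume, call_volume):
    """Put volume as a fraction of total option volume (assumed definition).

    Returns None when there is no option volume on the day.
    """
    total = put_volume + call_volume
    if total == 0:
        return None
    return put_volume / total


def iv_spread(call_iv, put_iv):
    """Call implied volatility minus put implied volatility (assumed)."""
    return call_iv - put_iv
```

For instance, 35 put contracts against 65 call contracts give a ratio of 0.35, close to the whole-sample average in Table 2, and a call implied volatility of 20.0% against a put implied volatility of 20.6% gives a spread of −0.6 percentage points.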

Table 3 reports the ETF cross-correlations between the main independent variables, their component variables and two measures of the ETF market, namely trading volume and realized volatility. Call and put option volumes are very highly correlated with each other (.96) and highly correlated with ETF share trading volume (.77 and .78). This means that the two markets tend to experience periods of high trading activity at the same time. Nevertheless, the put-call ratio has a very low correlation with all three volumes. Similarly, for the volatility measures, the put and call implied volatilities have a very high correlation with each other and high correlations with the realized volatility of the ETFs. An et al. (2014) describe this as a general volatility phenomenon: when ETF volatility is high, the implied volatilities are high as well. Furthermore, if put-call parity held exactly in terms of implied volatilities, their correlation would be perfect (An et al., 2014). Because put-call parity violations happen in practice, however, their correlation is almost perfect, at 0.95 (An et al., 2014). This high positive correlation, although counterintuitive, is normal, particularly during put-call parity violations, according to An et al. (2014).

On the other hand, in Table 3 the IV spread has a small positive correlation with the call implied volatility (.17) and a small negative correlation with the put implied volatility (−.15), which is also what An et al. (2014) document in one of their robustness checks. Intuitively, this follows from the design of the IV spread: when the call implied volatility increases, the IV spread increases, and when the put implied volatility increases, the IV spread decreases. However, the IV spread is almost uncorrelated with the ETF realized volatility. An et al. (2014) argue that the call and put implied volatilities respond to more than the information in realized volatility, meaning that a measure that combines them, such as the IV spread, would convey novel information that is not reflected in the realized volatility. A similar case can be made for the put-call ratio, its components and ETF trading volume.
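A lower-triangular correlation table of this kind can be sketched with a plain Pearson correlation. How the correlations are aggregated across ETFs is not specified in this section, so the sketch simply correlates pooled series; the function names and the dictionary layout are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def lower_triangle(series):
    """Pairwise correlations below the diagonal.

    `series` maps a variable name to its observations; the result maps
    (row_name, column_name) to the correlation, with rows listed after
    columns as in a lower-triangular table.
    """
    names = list(series)
    return {
        (names[i], names[j]): pearson(series[names[i]], series[names[j]])
        for i in range(len(names))
        for j in range(i)
    }
```

Because the spread is a difference of two nearly collinear series, most of their common variation cancels out, which is one way to see why it can be weakly correlated with each component and with realized volatility even though the components themselves are correlated at 0.95.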

Table 3

Correlations across variables

Variable                         1     2     3     4     5     6     7     8
1. Volume calls                  −
2. Volume puts                 .96     −
3. P/C                         .08   .10     −
4. Call implied volatility     .02   .01   .05     −
5. Put implied volatility      .01   .00   .05   .95     −
6. IVS                         .02   .02   .03   .17  −.15     −
7. Trading volume              .77   .78   .15   .12   .11   .03     −
8. Realized volatility         .01   .00   .03   .85   .86  −.03   .08     −

Note. This table reports ETF cross-correlations of the volume of calls and puts, the put-call ratio (P/C), the implied volatilities of calls and puts, the implied volatility spread (IVS), share trading volume and realized volatility.