
The lead-lag relationship between stock index and stock index futures : empirical study based on high-frequency data from Chinese market


Academic year: 2021


MSc Business Economics

Finance

Master Thesis

The lead-lag relationship between stock index and stock index futures:

Empirical study based on high-frequency data from Chinese market

Name: Pingsheng Wang Student number: 11089024

Supervisor: Dr. L. Zou Date: 07-07-2016


Statement of Originality

This document is written by Student Pingsheng Wang who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


Abstract

This paper investigates the lead-lag relationship between the stock index and stock index futures in the Chinese market using high-frequency data. A vector autoregression (VAR) model and a vector error correction model (VECM) are used. The empirical results show that a bidirectional lead-lag relationship exists and that the lead from the futures market to the spot market is stronger. In addition, a dichotomous analysis is carried out to study this relationship separately in the upper and lower markets. The results show that the lead from the futures market to the spot market is asymmetric: it is greater in the lower market than in the upper market.


Table of Contents

Statement of Originality

Abstract

1. Introduction

2. Literature Review

3. Methodology

3.1 Vector Autoregression Model

3.2 Information Criteria

3.3 Augmented Dickey-Fuller Test

3.4 Granger Causality Test

3.5 Vector Error Correction Model (VECM)

3.6 Dichotomous Analysis

4. Data

5. Empirical Results

5.1 Model Selection Information Criteria

5.2 The VAR Regression Results

5.3 Granger Causality Test

5.4 Dichotomous Analysis

6. Robustness Check

7. Conclusion and Discussion


1. Introduction

Under the efficient market hypothesis, rational investors and arbitrage ensure that market prices efficiently reflect all available information. Under the no-arbitrage hypothesis, the stock index and stock index futures should be perfectly and contemporaneously correlated. In real markets, however, transaction costs, differences in market microstructure and other market frictions mean that different markets may react to new information at different speeds. Information can be incorporated in one market first and transmitted to the other later, resulting in a lead-lag relationship between the spot and futures markets.

Understanding the lead-lag relationship is important because it indicates how well the two markets are linked and how fast one market reacts to new information compared to the other. It has implications for local investors and regulators. First, investors can profit from arbitrage opportunities by discovering and exploiting price differences. They can also adjust their positions to hedge risk more effectively using stock index futures contracts. Investors can thus increase their profits while managing their risk better.

Understanding this lead-lag relationship is also crucial to regulators. Because futures are more liquid and have lower transaction costs than stocks, one important function of the futures market is price discovery. Identifying the lead-lag relation between the stock index and stock index futures helps regulators understand the information transmission mechanism between the two markets and develop more appropriate policies to form a more efficient market.

So it is crucial to investigate whether the spot market leads the futures market, whether the futures market leads the spot market or whether the bidirectional feedback between the two markets exists. This study intends to answer this question.

The lead-lag relationship between the stock index and stock index futures has been studied extensively, but most of these papers focus on developed markets. Differences in the level of economic and stock market development make findings from developed markets hard to generalize to emerging markets. Only a few studies have been carried out in emerging markets, and they reach no consistent conclusions. Moreover, much of the research on this question is rather old, some of it from the 1990s. Thanks to the development of modern information technology, trading is now quite different from the old days and orders can be executed much more quickly, so the relationship might be different now.

In this study, I focus on the Chinese stock and futures markets and investigate the lead-lag relationship between them. The Chinese capital market is still at an early stage in terms of market efficiency, regulatory system and competitiveness of market participants, and its development still cannot satisfy the needs of economic development. The Chinese stock market, an important component of the Chinese capital market, is rather immature. Limited investment instruments, an overregulated system and irrational investors make the market respond slowly to new information, and market prices often deviate from intrinsic value. The characteristics of the Chinese stock market differ from those of developed markets. The Chinese equity market is dominated by individual investors, and, more importantly, the trading experience of Chinese investors is unlikely to be comparable to that of investors in developed markets. The Chinese market is therefore more likely to be irrational, and conclusions drawn for developed markets cannot be directly applied to it.

The China Shanghai Shenzhen 300 Stock Index Futures contract was introduced by CFFEX (China Financial Futures Exchange) on April 16, 2010. Its underlying stock index is the HuShen 300 (Shanghai is abbreviated to “Hu” and Shenzhen to “Shen”), also known as the China Securities Index 300. The introduction of HuShen 300 index futures provides risk-averse investors with a necessary financial instrument for risk management. It helps to improve the market structure and accelerates the price discovery process in the Chinese stock market. After several years of development, the HuShen 300 index futures market attracts increasing attention from investors and growing trading volume, but it is still not very mature. Whether the introduction of HuShen 300 index futures is beneficial to the development of the Chinese capital market is still heatedly discussed. By investigating the lead-lag relationship, this paper provides evidence for this discussion. If the results suggest that the futures market leads the spot market, indicating that the futures market is faster in the price discovery process, then they support the argument that the introduction of HuShen 300 index futures helps the price discovery process of the Chinese capital market.

In this research, I collect minute-to-minute high-frequency price data for the HuShen 300 stock index and stock index futures. I use the vector autoregression (VAR) model and the Granger causality test to investigate the lead-lag relationship between HuShen 300 stock index returns and index futures returns. Both methods provide insight into the causal relationship. Furthermore, I dichotomize the market into an upper and a lower market to examine whether this relationship is asymmetric.

The empirical results show that there is a bidirectional lead-lag relationship between the HuShen 300 stock index and HuShen 300 stock index futures, and that the lead from the futures market to the spot market is stronger than that from the spot market to the futures market. Furthermore, I find that this relationship is asymmetric: the lead from stock index futures to the stock index is greater in the lower market than in the upper market. This result can be explained by the lack of instruments for short selling stocks in the spot market when the market is trending downward. This makes the stock market even less efficient, so information is disseminated even more slowly in the lower market.

To make the conclusion more reliable, I adopt the vector error correction model (VECM) as a robustness check. This is also a widely used method for determining the lead-lag relation. The empirical results from the VECM also suggest that, although a bidirectional relationship between the two markets exists, the lead effect is stronger from the futures market to the spot market. This is consistent with the conclusions from the VAR model, which makes the conclusion more reliable.

The rest of this paper is organized as follows. Section 2 explains the theoretical relationship between the stock index and stock index futures and reviews relevant research on this topic. Section 3 explains the theoretical background and the methodologies used in this paper. Section 4 presents the descriptive statistics and interpretations of the high-frequency data. Section 5 shows and interprets the empirical regression results. Section 6 presents the empirical results of the robustness check. Section 7 summarizes the conclusions and further discusses the topic.

2. Literature Review

Under the no-arbitrage argument, the prices of the stock index and stock index futures should be perfectly and contemporaneously correlated; otherwise there will be arbitrage opportunities. Arbitrageurs will take advantage of these opportunities and prices will be pushed back to their equilibrium relation. The theoretical relationship is:

$$F_0 = S_0 \, e^{(r-d)(T-t)}$$

where $F_0$ is the futures price, $S_0$ is the index price, $r$ is the risk-free rate and $d$ is the dividend yield of the stock index portfolio, so $r-d$ is the cost of carrying the underlying stocks in the index. $T$ is the maturity date of the futures and $T-t$ is its remaining time to maturity. If this relationship is not satisfied, arbitrageurs can lock in risk-free profits. For example, if the futures price is too high, arbitrageurs can lock in a risk-free profit by buying the stock index and shorting the stock index futures. However, due to market imperfections, this relationship does not always hold in real markets, especially over short periods. So a lead-lag relationship between the stock index and stock index futures might exist.
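The cost-of-carry relation above can be illustrated with a minimal sketch. The function names, the transaction-cost band and all numerical inputs below are hypothetical and are not taken from the thesis; rates are assumed to be annualized and continuously compounded, and time is measured in years.

```python
import math


def theoretical_futures_price(spot, r, d, time_to_maturity):
    """Cost-of-carry price F_0 = S_0 * exp((r - d) * (T - t))."""
    return spot * math.exp((r - d) * time_to_maturity)


def arbitrage_signal(futures, spot, r, d, time_to_maturity, cost=0.0):
    """Compare the quoted futures price with its theoretical value.

    `cost` is a hypothetical band for transaction costs (in index points)
    inside which no trade is worthwhile.
    """
    fair = theoretical_futures_price(spot, r, d, time_to_maturity)
    if futures > fair + cost:
        return "buy index, short futures"   # futures overpriced
    if futures < fair - cost:
        return "short index, buy futures"   # futures underpriced
    return "no arbitrage"


# Hypothetical inputs: index at 3000, 3% risk-free rate, 1% dividend yield,
# half a year to maturity.
fair_value = theoretical_futures_price(3000.0, 0.03, 0.01, 0.5)
print(round(fair_value, 2))
print(arbitrage_signal(3100.0, 3000.0, 0.03, 0.01, 0.5, cost=5.0))
```

The short leg of the second case is exactly what short selling constraints in the spot market make difficult, which is one reason the relation need not hold minute by minute.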

This field has attracted extensive research, and many empirical studies have investigated this relationship. However, because markets differ in their characteristics and studies differ in their methods, contradictory results are found across markets. Most studies find that the index futures market tends to lead the spot market.

Kawaller, Koch, and Koch (1987) empirically examine the intraday price relationship between S&P 500 futures and the S&P 500 index using minute-to-minute data. They aim to find whether movements in futures prices have predictive power over movements in the index. They investigate this temporal relationship by estimating distributed lags between the returns of the index and futures using three-stage least squares regression. The empirical results show that S&P 500 futures prices lead stock index prices by 20 to 45 minutes, while the lead from spot to futures prices rarely extends beyond one minute. They also find that the size of the contemporaneous coefficients relative to the lag coefficients suggests that it is unlikely to be possible to profit from the price movements implied by the lag structure. Herbst, McCormack and West (1987) examine the lead-lag relationship between the spot and futures markets for the S&P 500 and VLCI (Value Line Composite Index) indices. They find that for the S&P 500 the futures lead the spot market by between zero and eight minutes, while for the VLCI the lead time is up to sixteen minutes. Stoll and Whaley (1990) study the dynamics of stock index and stock index futures returns using 5-minute data. They find that S&P 500 and Major Market Index (MMI) futures returns lead stock index returns by about five minutes on average, and occasionally by more than 10 minutes, while the feedback from the cash market into the futures market is much shorter. They attribute this finding to the fact that not all stocks are continuously traded. They also find that futures returns lead even the returns of actively traded stocks, and that the effects of infrequent trading and the bid-ask spread can be adequately described by an ARMA(2,3) process.

Chan (1993) investigates the intraday lead-lag relation between returns of the Major Market cash index (MMI) and returns of the Major Market Index futures and S&P 500 futures. Because the component stocks of the MMI are more actively traded, using the MMI instead of the S&P 500 index makes the infrequent trading problem less serious. The empirical results show strong evidence that the futures lead the spot index and weak evidence that the spot index leads the futures. Furthermore, he studies how the lead-lag pattern changes under different conditions. First, he examines whether the relation differs between bad news and good news by sorting the observations by the sign and size of spot index returns and allocating them to quintiles. The quintile with the highest returns represents the good news group and the quintile with the lowest returns the bad news group. The empirical results provide no evidence for the hypothesis that futures lead the spot market only under bad news: the tendency for futures to lead the stock index is not stronger under bad news than under good news. He then studies the impact of the relative intensity of trading activity, measured by the number of transactions in the stock index and stock index futures markets. On this basis, subgroups are formed to reflect low, medium and high trading activity in the stock market. The paper finds no empirical evidence that the lead-lag relation is affected by the relative intensity of trading activity in the two markets. Finally, he investigates the lead-lag relation under market-wide information. He stratifies the observations into five quintiles, where a higher quintile means that more stocks move together because of market-wide information. The empirical results suggest that the lead-lag relationship is not symmetric: when more stocks move together because of market-wide information, the degree to which futures lead the spot index is greater.

Instead of examining the relationship between each futures contract and its associated cash index, Kim, Szakmary and Schwarz (1999) examine intraday price leadership across the S&P 500, NYSE Composite and MMI futures, and across the respective spot indexes, in order to hold market microstructure effects constant. They examine price leadership using forecast error variance decompositions and impulse response functions from a VAR model. They find that, because transaction costs are lower in the S&P 500 futures than in the other index futures, the S&P 500 futures show the strongest price leadership, leading the other index futures by about 5 minutes. In the spot market, the Major Market Index (MMI) has the highest predictive power: it leads the S&P 500 and NYSE indexes by 5 minutes. They conclude that trading costs and price leadership are linked even in markets with a similar microstructure. Brooks, Rew and Ritson (2001) use ten-minute observations from June 1996 to 1997 to examine the lead-lag relationship between the FTSE 100 index and index futures prices and propose a trading strategy based on this relationship. They find that lagged changes in the futures price help to predict changes in the spot price, meaning that new market-wide information is first incorporated in the futures market and then transmitted to the spot market by arbitrageurs trading across both markets. They propose a “buy and hold” trading strategy derived from the cost-of-carry error correction model and find that this strategy outperforms the benchmark when transaction costs are assumed away. Once transaction costs are considered, however, the strategy can no longer beat the market.


Some studies find a bidirectional relation between the spot market and index futures. Ghosh (1993) presents a new methodology that considers both short-term dynamic adjustments and the long-term relation between economic variables and uses it to investigate whether stock index and index futures price changes are predictable. He finds that index spot and futures prices are integrated. Furthermore, the results suggest that the error correction model is potentially useful for forecasting spot index and index futures prices. Pizzi et al. (1998) investigate the relationship between the S&P 500 stock index and the three-month and six-month futures contracts. Cointegration analysis and a vector error correction model are used. They find that the stock index is cointegrated with both the three-month and the six-month futures. The error correction model shows that both the three- and six-month futures markets lead the spot market by at least 20 minutes. Meanwhile, the spot market leads the three-month futures by more than three minutes and the six-month futures by at least four minutes. A bidirectional lead-lag relationship is thus found, with the stock index futures markets having the stronger lead effect. Kavussanos, Visvikis and Alexakis (2008) focus on the Greek market and investigate the lead-lag relationship in daily returns and volatilities between stock index futures and the underlying spot indices. The empirical results show a bidirectional relationship between spot and futures prices, with futures returns responding more rapidly to economic events than stock prices. As for volatility, they find evidence that futures volatilities spill over to spot market volatilities, while spot market volatilities have no effect on the futures market.
Kim and Ryu (2014) use a VAR(1)-asymmetric BEKK-MGARCH model to examine the intraday relationships among the spot index, index futures and the implied volatility index using a high-frequency dataset from the Korean financial market. They find strong linkages among the spot, futures and options markets. Specifically, they find a bidirectional causal relationship between the spot and futures markets, with futures return shocks affecting the spot market more severely. A few studies find that the stock index plays the more dominant role in this relationship. For example, Amrit and Tipprapa (2014) use daily data to investigate whether a lead-lag relationship exists between the spot market and the futures market in Thailand during the period 2006 through 2012. Their results suggest that the error correction model is the best forecasting model for this research question and show that changes in stock index prices lead changes in stock index futures prices. Based on their findings, they propose a “buy predicted positive and sell predicted negative” trading strategy that can outperform the market even after allowing for transaction costs. Chen and Gau (2009) study the competition between the stock index, stock index futures and stock index options in the price discovery process on the Taiwan stock exchange. They find that the stock index is more dominant in the price discovery process than stock index futures and options.

Most of the papers focus on developed markets; very few studies investigate this relationship in Chinese markets, and their results are contradictory. Yang, Yang and Zhou (2010) use an ECM-GARCH model to investigate intraday price discovery and volatility transmission between the Chinese stock index and stock index futures. They find that the spot market is dominant in the price discovery process and conclude that the stock index futures did not function well in price discovery at that time. Meanwhile, they find a strong bidirectional intraday volatility dependence between the stock index and stock index futures, meaning that volatility originating in either market transmits to the other. Zhou and Wu (2015) investigate intraday price discovery and volatility transmission between the Chinese stock index and stock index futures markets using high-frequency data. A vector autoregression (VAR) model and a multivariate generalized autoregressive conditional heteroscedasticity (MGARCH) model are adopted. The empirical results suggest that there is a significant bidirectional Granger causal relationship between the Chinese stock index (HuShen 300) and stock index futures and that the lead from futures to spot is stronger, meaning that the futures market plays the more dominant role in the price discovery process. There are also significant bidirectional volatility spillovers between the stock index and stock index futures, but this effect is almost equal in both directions. Several papers also shed light on my research even though they do not directly study the lead-lag relationship between the stock index and stock index futures.

Kroner and Ng (1998) examine the differences between several widely used multivariate generalized autoregressive conditional heteroscedasticity (GARCH) models and introduce a general dynamic covariance matrix model that allows for asymmetric effects in the variances and covariances. They apply this model to the dynamic relation between weekly returns of large-firm and small-firm portfolios and find that large-firm returns affect the volatility of small-firm returns while the reverse effect is not significant. Their findings also suggest that the choice of a multivariate volatility model can have a significant impact on the results of the analysis. Hamilton (2003) uses a flexible method to study the relation between GDP growth and oil price changes and reports that the relationship is nonlinear: oil price increases are much more useful for forecasting GDP growth than oil price decreases. Zou (2005) studies cross-asset derivative securities and derives a dichotomous asset pricing model (DAPM) that significantly enriches the Sharpe-Lintner-Black capital asset pricing model. The DAPM dichotomizes the market and separately predicts relations between the assets’ expected returns and betas under upper market and lower market conditions. These studies provide the methodology that my paper uses to study the asymmetric relation between the stock index and stock index futures. Ng and Wu (2007) analyze and compare the trading behavior of individual and institutional investors across China and find that investors with different levels of wealth tend to develop different trading strategies. One important finding is that individual investors’ monthly buys and sells have no significant impact on stock returns in the following month, while institutional investors’ buys and sells can be used to forecast future stock returns. Their findings suggest that institutional investors are better informed than individual investors in the Chinese market. Cabrera, Wang and Yang (2009) investigate which electronic trading market leads the price discovery process. They focus on the price discovery function of futures on foreign exchange markets for two currency pairs, the Euro/USD and the Yen/USD.
The empirical results show that transaction prices in the spot market are more informative than prices in the futures market, meaning that the spot foreign exchange market leads the price discovery process. Hernandez and Torero (2010) use Granger causality tests to investigate the direction of information flow between spot and futures prices of agricultural commodities by examining their dynamic relationship. The results indicate that changes in futures prices lead changes in spot prices more often than the reverse, meaning that the futures market is more dominant in price discovery. Their findings support the price discovery function of futures contracts. Jackline and Deo (2011) examine the relationship between the futures and spot markets for lean hogs and pork bellies using data from January 2001 through May 2010. They find that, in the short term, futures prices Granger-cause spot prices and vice versa, meaning that there is bidirectional causality between these markets. They conclude that the selected markets are perfectly efficient and that no profitable arbitrage opportunity exists. Wang, Wu and Yang (2013) study the relationship between oil prices and stock markets, differentiating oil-exporting countries from oil-importing countries, using a structural VAR analysis. They propose a model that identifies asymmetric effects by separating positive changes from negative changes using two auxiliary variables and find no significant asymmetric effects of oil price shocks on stock market returns across all the countries in their sample.

Although plenty of research has tried to identify the lead-lag relationship between the stock index and stock index futures, no consistent conclusions have been reached, and the relationship remains an unsolved puzzle. My research investigates this relationship and extends it to specific circumstances to find out whether it differs between upper and lower markets, so that it can provide investors and regulators with more specific guidance.

A futures contract is an agreement to buy or sell an asset at a future time for a certain price. The futures market has two important functions: risk management and price discovery. Futures benefit investors by providing a risk management instrument and make the market more efficient by accelerating the price discovery process. The futures price is determined by demand and supply and reflects market participants’ expectations about the stocks. Market prices are driven by expectations, so stock prices also move toward the level that market participants expect. I therefore expect the stock index futures to lead the stock index.

So my first hypothesis is that:

There exists a lead-lag relationship between the stock index and stock index futures, and the futures tend to lead in this relationship.

Furthermore, because of short selling constraints in the Chinese stock market, investors lack usable ways to short sell stocks when they expect stock prices to go down. This further diminishes the price discovery function of the stock market when the market is falling. So I expect the relationship to differ across market trends.

My second hypothesis is that:

The lead relation from stock index futures to stock index is greater when the market goes down.

3. Methodology

This paper aims to investigate the lead-lag relationship between the stock index and stock index futures, and find out whether the spot market leads the futures market, whether the futures market leads the cash market or whether there is a bidirectional relationship between the two markets.

In order to test the lead-lag relationship over short time periods, I use minute-to-minute price data for the HuShen 300 stock index futures and the underlying HuShen 300 spot index. Compared to the low-frequency data (daily and weekly data) mostly used in previous studies, high-frequency data contain more detailed information about price changes and more faithfully reflect how prices respond to information arriving in the market. High-frequency data are increasingly widely used in academic studies.

From the data obtained, I calculate the continuously compounded 1-minute returns using the following equations, where $r_s$ denotes the return of the spot market, $r_f$ denotes the return of the futures market, and $P_{s,i}$ and $P_{f,i}$ are the spot and futures prices at minute $i$.

$$r_s = 100 \ln\left(\frac{P_{s,i}}{P_{s,i-1}}\right), \qquad r_f = 100 \ln\left(\frac{P_{f,i}}{P_{f,i-1}}\right)$$
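The return calculation can be sketched in a few lines. The helper name and the price series below are hypothetical placeholders, not the thesis data.

```python
import math


def log_returns(prices):
    """Continuously compounded returns in percent: 100 * ln(P_i / P_{i-1})."""
    return [100.0 * math.log(curr / prev)
            for prev, curr in zip(prices, prices[1:])]


# Hypothetical 1-minute closing prices (not real HuShen 300 data).
spot_prices = [3200.0, 3201.6, 3200.8, 3203.2]
futures_prices = [3210.0, 3212.4, 3211.0, 3214.6]

r_s = log_returns(spot_prices)     # spot returns, one fewer than prices
r_f = log_returns(futures_prices)  # futures returns
print(len(r_s), len(r_f))
```

Each return series has one observation fewer than its price series, because the first price has no predecessor.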

3.1 Vector Autoregression Model

The stock index and stock index futures may each convey predictive information about subsequent price variation in their own market and in the other market. To test this relationship empirically, I use the vector autoregression (VAR) model, an econometric model that captures the linear interdependencies among multiple time series. The VAR model allows more than one variable and treats the variables symmetrically. The current value of each variable is regressed on a fixed number of lagged values of all variables, so the model can estimate the dynamic relationship between endogenous variables without any prior constraints. It requires little knowledge of the underlying economic mechanism, only a list of variables that can reasonably be hypothesized to affect each other, and it is easy to implement. The equations are as follows:

$$r_{s,t} = \alpha_1 + \sum_{i=1}^{n} \beta_{1i} r_{s,t-i} + \sum_{i=1}^{n} \gamma_{1i} r_{f,t-i} + u_{1,t}$$

$$r_{f,t} = \alpha_2 + \sum_{i=1}^{n} \beta_{2i} r_{f,t-i} + \sum_{i=1}^{n} \gamma_{2i} r_{s,t-i} + u_{2,t}$$

where $r_s$ is the return of the stock index, $r_f$ is the return of the stock index futures, $u$ is the error term and $n$ is the maximum lag. The number of lags $n$ is chosen using model selection information criteria.

The vector autoregression model has several advantages. First, it is easy to implement. Every equation in the VAR has the same right-hand-side variables, so the coefficients are easily estimated by applying ordinary least squares (OLS) to each equation separately. In large samples, the OLS estimator is consistent and normally distributed. Also, once the VAR model has been estimated, it is easy to implement the Granger causality test.

From the regression results, if some of the coefficients $\gamma_{1i}$, $i = 1, 2, \dots, n$, are significant, we can say that the stock index futures lead the stock index, because lagged futures returns can be used to predict the current spot return. Similarly, if some of the coefficients $\gamma_{2i}$, $i = 1, 2, \dots, n$, are significant, we can say that the stock index leads the stock index futures. If some of both the $\gamma_{1i}$ and $\gamma_{2i}$ coefficients are significant, there is a bidirectional feedback effect between the two variables. By comparing the values of the significant coefficients, we can see how strong the lead effects are. Using this model, we can thus empirically test the lead-lag relationship between the stock index and stock index futures.
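The equation-by-equation OLS estimation described above can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not the estimation code used in the thesis: the helper names are hypothetical, the data are simulated so that futures lead spot by one period with a true cross coefficient of 0.5, and standard errors and significance tests are omitted.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def fit_var(rs, rf, n_lags):
    """Estimate both VAR equations separately by OLS.

    Each result is [alpha, beta_1..beta_n, gamma_1..gamma_n]:
    intercept, own lags, then lags of the other market.
    """
    X_s, X_f, y_s, y_f = [], [], [], []
    for t in range(n_lags, len(rs)):
        lags_s = [rs[t - i] for i in range(1, n_lags + 1)]
        lags_f = [rf[t - i] for i in range(1, n_lags + 1)]
        X_s.append([1.0] + lags_s + lags_f)  # spot equation
        X_f.append([1.0] + lags_f + lags_s)  # futures equation
        y_s.append(rs[t])
        y_f.append(rf[t])
    return ols(X_s, y_s), ols(X_f, y_f)


# Simulated returns (assumption): futures lead spot by one period.
random.seed(0)
N = 5000
rf = [random.gauss(0.0, 1.0) for _ in range(N)]
rs = [0.0] + [0.5 * rf[t - 1] + random.gauss(0.0, 0.5) for t in range(1, N)]

spot_eq, futures_eq = fit_var(rs, rf, n_lags=1)
gamma_1 = spot_eq[2]     # lagged futures return in the spot equation
gamma_2 = futures_eq[2]  # lagged spot return in the futures equation
print(round(gamma_1, 3), round(gamma_2, 3))
```

On this simulated sample, `gamma_1` should come out near 0.5 and `gamma_2` near zero, which is the one-directional pattern the text describes as futures leading spot. A full analysis would add t-statistics for each coefficient before drawing that conclusion.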

3.2 Information Criteria

Information criteria are model selection tools that can be used to compare different models. The best model is the one that minimizes the information criterion. In this paper, I compare AIC, SBIC and HQIC to determine the lag order of the VAR model.

AIC is the Akaike information criterion:

$$AIC(p) = \ln\left[\frac{SSR(p)}{T}\right] + (p+1)\frac{2}{T}$$

where $SSR(p)$ is the sum of squared residuals of the model with $p$ lags and $T$ is the number of observations. The chosen lag order is the value of $p$ that minimizes $AIC(p)$ among the candidates $p = 0, 1, \dots, p_{max}$, where $p_{max}$ is the largest value of $p$ considered and $p=0$ means that the model contains only an intercept. AIC is a relative estimate of the information lost when a given model is chosen. It can only be used to compare models and provides no information about the absolute quality of a model. BIC is the Bayes information criterion, also called the Schwarz information criterion (SIC). AIC and BIC are both widely used in practice. The equation of BIC is:

$$BIC(p) = \ln\left[\frac{SSR(p)}{T}\right] + (p+1)\frac{\ln(T)}{T}$$

The difference between the AIC and the BIC is that the term “2” in the AIC is replaced by “ln(T)” in the BIC, so the penalty for additional lags is larger in the BIC whenever ln(T) > 2. In large samples, BIC is more appropriate, because AIC overestimates p with nonzero probability even when the sample is large.

HQIC is the Hannan-Quinn information criterion, an alternative to AIC and BIC. It is given by

$$HQIC = -2L_{max} + 2k\log(\log n),$$

where $L_{max}$ is the log-likelihood, $k$ is the number of parameters and $n$ is the number of observations.

Using the information criteria described above, the lag order of the VAR model is determined by choosing the value that minimizes the most appropriate criterion.
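As an illustration of how the penalty terms drive lag selection, the AIC and BIC formulas above can be written as a short Python sketch. The SSR values and the sample size below are hypothetical numbers chosen only to show the mechanics, and the function names are my own.

```python
import math

def aic(ssr, n_obs, p):
    """AIC(p) = ln[SSR(p)/T] + (p + 1) * 2 / T."""
    return math.log(ssr / n_obs) + (p + 1) * 2 / n_obs

def bic(ssr, n_obs, p):
    """BIC(p) = ln[SSR(p)/T] + (p + 1) * ln(T) / T."""
    return math.log(ssr / n_obs) + (p + 1) * math.log(n_obs) / n_obs

def select_lag(ssr_by_lag, n_obs, criterion):
    """Return the lag order that minimizes the given criterion."""
    return min(ssr_by_lag, key=lambda p: criterion(ssr_by_lag[p], n_obs, p))

# Hypothetical sums of squared residuals for p = 0..3 (not taken from the thesis).
ssr_by_lag = {0: 120.0, 1: 100.0, 2: 99.0, 3: 98.9}
n_obs = 500

lag_aic = select_lag(ssr_by_lag, n_obs, aic)
lag_bic = select_lag(ssr_by_lag, n_obs, bic)
print(lag_aic, lag_bic)  # prints "2 1"
```

With these numbers the small SSR improvement from the second lag is enough to offset the AIC penalty but not the heavier BIC penalty, so BIC selects the more parsimonious model.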


3.3 Augmented Dickey-Fuller Test

The augmented Dickey-Fuller (ADF) test is a widely used econometric method for testing whether a time series has a unit root, and thus whether the series is stationary. One underlying assumption of the VAR model is that the time series in the model are stationary. So before applying the VAR model, the ADF test is needed to check whether the two time series are stationary. The ADF regression is:

$$\Delta r_t = \beta_0 + \delta r_{t-1} + \sum_{i=1}^{p} \gamma_i \Delta r_{t-i} + u_t$$

where $\Delta$ is the first-difference operator and $\delta$ is the coefficient of interest. $p$ is the number of lagged differences; here it is set equal to the lag order of the VAR model. The null hypothesis is $H_0: \delta = 0$ and the alternative hypothesis is $H_1: \delta < 0$. Under the alternative hypothesis, the time series is stationary. The ADF statistic is the OLS t-statistic testing $\delta = 0$, and it is usually a negative number. If the ADF statistic is less than the critical value, the null hypothesis is rejected, meaning that the time series is stationary. The more negative the ADF statistic, the stronger the rejection.
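The decision rule above can be sketched in Python for the simplest case. This is only a sketch under stated assumptions: it omits the lagged differences (p = 0, i.e. a plain Dickey-Fuller regression with a constant), it uses simulated series rather than the thesis data, and the function name `dickey_fuller_stat` is hypothetical. The -3.43 threshold is the 1% critical value quoted in Table 1.

```python
import math
import random

def dickey_fuller_stat(series):
    """t-statistic on delta in diff(r)_t = beta0 + delta * r_{t-1} + u_t."""
    x = series[:-1]                                              # r_{t-1}
    y = [series[t] - series[t - 1] for t in range(1, len(series))]  # diff(r)_t
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    delta = s_xy / s_xx
    beta0 = y_bar - delta * x_bar
    ssr = sum((b - beta0 - delta * a) ** 2 for a, b in zip(x, y))
    se_delta = math.sqrt(ssr / (n - 2) / s_xx)
    return delta / se_delta

# Simulated examples: white noise (stationary) and its cumulative sum (unit root).
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(2000)]
walk = []
level = 0.0
for shock in noise:
    level += shock
    walk.append(level)

CRITICAL_1PCT = -3.43
stat_noise = dickey_fuller_stat(noise)
stat_walk = dickey_fuller_stat(walk)
print("white noise rejects unit root:", stat_noise < CRITICAL_1PCT)
print("random walk rejects unit root:", stat_walk < CRITICAL_1PCT)
```

For the stationary series the statistic is far below the critical value, which mirrors the very negative ADF statistics reported for the return series in this study.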

3.4 Granger Causality Test

I use the Granger causality test to analyze the relationship further. The Granger causality test is a statistical hypothesis test for determining whether one time series is useful in forecasting another. It was proposed by Clive W. J. Granger in 1969. The VAR model with n lags is:

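$$r_{s,t} = \alpha_1 + \sum_{i=1}^{n} \beta_{1i} r_{s,t-i} + \sum_{i=1}^{n} \gamma_{1i} r_{f,t-i} + u_{1,t}$$

$$r_{f,t} = \alpha_2 + \sum_{i=1}^{n} \beta_{2i} r_{f,t-i} + \sum_{i=1}^{n} \gamma_{2i} r_{s,t-i} + u_{2,t}$$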


In the first equation of this model, the Granger causality statistic is the F-statistic testing the null hypothesis that the coefficients on all lagged values of $r_f$ are zero. The null hypothesis implies that these regressors have no predictive power for $r_s$. If the null hypothesis is rejected, we can infer that the return of the stock index futures Granger-causes the return of the stock index. The test for the second equation is analogous.

One important feature of the Granger causality test is that Granger causality has little to do with causality in the everyday sense. If X Granger-causes Y, then X is a useful predictor of Y; this does not necessarily prove that a causal relationship exists. Nevertheless, the Granger causality test can still provide insight into the lead-lag relation between the stock index and stock index futures by analyzing the predictive power of these variables.
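The F-statistic behind the test can be sketched as a comparison between a restricted regression (without the lags of the other market) and the unrestricted VAR equation. The snippet below uses the standard homoskedastic F formula; the SSR values and sample size are hypothetical, and the function name is my own.

```python
def granger_f_stat(ssr_restricted, ssr_unrestricted, n_restrictions, n_obs, n_params):
    """F = [(SSR_r - SSR_u) / q] / [SSR_u / (T - k)].

    q is the number of excluded lags and k is the number of
    regressors in the unrestricted equation (including the constant).
    """
    numerator = (ssr_restricted - ssr_unrestricted) / n_restrictions
    denominator = ssr_unrestricted / (n_obs - n_params)
    return numerator / denominator

# Hypothetical numbers: 5 excluded futures lags, 11 regressors (constant + 5 + 5).
f_stat = granger_f_stat(110.0, 100.0, n_restrictions=5, n_obs=1011, n_params=11)
wald_chi2 = 5 * f_stat  # in large samples q * F is approximately chi-square with q d.o.f.
print(f_stat, wald_chi2)
```

The last line reflects the large-sample relationship between the F-statistic and the chi-square form of the test: multiplying F by the number of restrictions gives a statistic that is approximately chi-square distributed.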

3.5 Vector Error Correction Model (VECM)

The vector autoregression (VAR) model does have some limitations. To analyze the relationship further, I adopt the vector error correction model (VECM) as a robustness check and investigate whether its conclusions are consistent with those from the VAR model.

An error correction model is commonly used to estimate both the short-term and the long-term effects of one time series on another. It is useful when the underlying variables share a common stochastic trend, in other words, when they are cointegrated. The model includes an error correction term, based on the fact that past deviations from equilibrium affect short-run changes, and it can estimate the speed at which variables return to equilibrium after a deviation. The vector error correction model is a generalization of the error correction model that allows multiple cointegrating relationships.

The vector error correction model has several advantages. First, it avoids the spurious regression problem of the VAR model when the variables are cointegrated. Second, the results are easier to interpret, since the long-term relationship is included in the model and the long-term and short-term relations can be analyzed separately.

Since the VECM investigates the relationship through cointegration, it is no longer appropriate to use the returns of the stock index and stock index futures as target variables. I therefore generate the log prices of the stock index and the stock index futures using the following equation:

$$Lp_i = \log(p_i), \quad i \in \{s, f\}$$

Then, as with the VAR model, the lag order used in the VECM needs to be determined using model selection information criteria such as AIC and BIC.

Before using the vector error correction model, a cointegration test should be performed to determine whether $Lp_s$ and $Lp_f$ are cointegrated, since the VECM requires that the variables in the model are cointegrated. Here, I use the Johansen cointegration test.

The Engle-Granger two-step method is a simple and widely used way to test whether two time series are cointegrated. To test whether the time series $X_t$ and $Y_t$ are cointegrated, the cointegration coefficient $\theta$ is estimated in the first step by OLS from the regression:

$$Y_t = \alpha + \theta X_t + Z_t$$

In the second step, a Dickey-Fuller test is used to test whether the regression residual $Z_t$ is stationary. If $Z_t$ is stationary, then $Y_t$ and $X_t$ are cointegrated. However, the Engle-Granger test has limitations: it can only identify a single cointegrating relation. An alternative method for testing cointegration is the Johansen cointegration test.

The Johansen cointegration test is a systematic method that determines the number of cointegrating relationships and estimates them by maximum likelihood within a unified framework. It tests for multiple cointegrating vectors and investigates the long-run relationship between the variables. Because it permits more than one cointegrating relationship, it is more generally applicable than the Engle-Granger test.
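For comparison with the Johansen approach used in this thesis, the Engle-Granger two-step procedure described earlier can be sketched in Python. The data are simulated so that the two series are cointegrated by construction (true $\alpha = 1$, $\theta = 2$); the helper names are hypothetical, and the residual test omits lagged differences. Note that residual-based tests require critical values more negative than the standard Dickey-Fuller ones, so the statistic is reported without a formal decision.

```python
import math
import random

def simple_ols(x, y):
    """Return (intercept, slope, residuals) of y = a + b x + e."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((a - x_bar) ** 2 for a in x)
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    return intercept, slope, resid

def df_stat(z):
    """Dickey-Fuller t-statistic for diff(z)_t = c + delta * z_{t-1} + e_t."""
    lagged = z[:-1]
    diff = [z[t] - z[t - 1] for t in range(1, len(z))]
    _, delta, resid = simple_ols(lagged, diff)
    n = len(diff)
    mean_lag = sum(lagged) / n
    s_xx = sum((a - mean_lag) ** 2 for a in lagged)
    se = math.sqrt(sum(e * e for e in resid) / (n - 2) / s_xx)
    return delta / se

# Simulated cointegrated pair: X is a random walk, Y = 1 + 2 X + stationary noise.
random.seed(2)
x = []
level = 0.0
for _ in range(2000):
    level += random.gauss(0.0, 1.0)
    x.append(level)
y = [1.0 + 2.0 * x_t + random.gauss(0.0, 1.0) for x_t in x]

# Step 1: estimate the cointegrating coefficient theta by OLS.
alpha_hat, theta_hat, z_hat = simple_ols(x, y)
# Step 2: test the residual Z_t for a unit root.
residual_stat = df_stat(z_hat)
print(round(theta_hat, 3), round(residual_stat, 1))
```

Because the simulated residual is stationary, the second-step statistic is strongly negative, which is the outcome that would indicate cointegration.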


Then the vector error correction model is estimated using the following equations:

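$$\Delta Lp_{s,t} = a_1 + \sum_{i=1}^{p} \alpha_{1,i} \Delta Lp_{s,t-i} + \sum_{i=1}^{p} \beta_{1,i} \Delta Lp_{f,t-i} + \gamma_1 (Lp_{s,t-1} - \theta Lp_{f,t-1}) + u_{1t}$$

$$\Delta Lp_{f,t} = a_2 + \sum_{i=1}^{p} \alpha_{2,i} \Delta Lp_{s,t-i} + \sum_{i=1}^{p} \beta_{2,i} \Delta Lp_{f,t-i} + \gamma_2 (Lp_{s,t-1} - \theta Lp_{f,t-1}) + u_{2t}$$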

where $Lp_s$ and $Lp_f$ denote the logarithmic stock index and stock index futures prices respectively, and $\Delta$ is the first-difference operator, so that $\Delta Lp_s$ and $\Delta Lp_f$ are the (log) returns of the stock index and stock index futures. $\theta$ is the long-run multiplier, and the term $Lp_{s,t} - \theta Lp_{f,t}$ is called the error correction term; its past value helps to predict future values of $\Delta Lp_s$ and $\Delta Lp_f$. $\gamma_1$ and $\gamma_2$ are the speed-of-adjustment coefficients: if $Lp_s$ and $Lp_f$ were out of equilibrium in the previous period, they adjust toward the equilibrium at the speeds given by $\gamma_1$ and $\gamma_2$. $\alpha_{1,i}$, $\alpha_{2,i}$, $\beta_{1,i}$ and $\beta_{2,i}$, i = 1, 2, …, p, are the short-run responses. The equations above show that the vector error correction model is similar to the vector autoregression model. The main difference is that the VECM includes the error correction term, which captures the long-term relationship between the two variables.

If the prices of the stock index and stock index futures are cointegrated, then Granger causality must exist in at least one direction (Granger, 1988). The lead-lag relationship can be determined by analyzing the regression coefficients of the model:

1. If some of the $\beta_{1,i}$ coefficients are significantly different from zero, then there exists a unidirectional causal relationship from futures to spot.

2. If some of the $\alpha_{2,i}$ coefficients are significantly different from zero, then there exists a unidirectional causal relationship from spot to futures.

3. If both variables Granger-cause each other, then there is a bidirectional relationship between the stock index and stock index futures. The magnitudes of the short-term coefficients indicate the strength of the lead or lag relationship and allow the two directions to be compared.


3.6 Dichotomous Analysis

To test for asymmetric effects, I adopt the following model, which tests whether the lead effect of stock index futures on the stock index is greater when the market is going down. I separate the positive returns from the negative ones using auxiliary variables, defined as:

$$r^{+} = \max(0, r)$$

$$r^{-} = \min(0, r)$$

Here, $r^{+}$ represents returns in the upper market and contains all the positive returns, while $r^{-}$ represents returns in the lower market and contains all the negative returns. I then modify the model put forward by Wang, Wu and Yang (2013) to estimate the relationships. They used this model to study the dynamic relationship between stock and oil price changes in upper and lower markets. Because the research questions are similar, it is reasonable to use this model to investigate the relationship between the returns of the stock index and stock index futures in upper and lower markets.

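$$r_{s,t} = c + \sum_{i=1}^{p} \alpha_{1,i} r_{s,t-i} + \sum_{i=1}^{p} \beta_{1,i} r_{f,t-i} + \sum_{i=1}^{p} \gamma_{1,i} r^{\#}_{f,t-i} + u_{1,t}$$

$$r_{f,t} = c + \sum_{i=1}^{p} \alpha_{2,i} r_{f,t-i} + \sum_{i=1}^{p} \beta_{2,i} r_{s,t-i} + \sum_{i=1}^{p} \gamma_{2,i} r^{\#}_{s,t-i} + u_{2,t}$$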

where $r^{\#} \in \{r^{+}, r^{-}\}$. It is reasonable to set the lag order p equal to the lag order of the VAR model. Asymmetric effects can be identified by analyzing the γ coefficients. In this test, the null hypothesis is that there are no asymmetric effects in the relationship between the stock index return and the stock index futures return.

$$H_0: \gamma_{1,1} = \gamma_{1,2} = \cdots = \gamma_{1,p} = 0$$

If some of the $\gamma_{1,i}$, i = 1, 2, …, p, are significant, then the lead or lag relation differs between the upper and lower markets and we can reject the null hypothesis. From the signs of the $\gamma_{1,i}$ coefficients, we can determine whether the relationship is strengthened or weakened. If the sign of a significant coefficient $\gamma_{1,i}$ is the same as the sign of the corresponding coefficient on futures returns, the relationship is strengthened. If the signs differ, we can conclude that the relation is weakened.
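The construction of the auxiliary variables can be illustrated with a few lines of Python. The return values below are made up for illustration, and `split_returns` is a hypothetical helper name.

```python
def split_returns(returns):
    """Split a return series into its positive part r+ and negative part r-."""
    r_plus = [max(0.0, r) for r in returns]
    r_minus = [min(0.0, r) for r in returns]
    return r_plus, r_minus

# Illustrative one-minute returns in percent (not from the thesis data).
returns = [0.12, -0.05, 0.0, 0.30, -0.21]
r_plus, r_minus = split_returns(returns)
print(r_plus)   # [0.12, 0.0, 0.0, 0.3, 0.0]
print(r_minus)  # [0.0, -0.05, 0.0, 0.0, -0.21]
```

By construction the two parts add up to the original return in every period, so each regression contains the full return series plus one of the two parts as the additional regressor.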

4. Data

The HuShen 300 Index is a benchmark consisting of the 300 largest and most liquid stocks listed on either the Shanghai Stock Exchange or the Shenzhen Stock Exchange. It broadly reflects price movements in the Chinese stock markets. Since only the most liquid stocks are included, the index reacts to market information relatively quickly, and the infrequent trading problem is less serious.

I use data from the S&P YongHua database. S&P YongHua is a financial data service provider in China that provides minute-to-minute prices of the stock index and stock index futures. I obtained minute-to-minute time series of the HuShen 300 stock index and the HuShen 300 index futures prices.

The data are from 2015, a year in which the market fluctuated sharply. The sample therefore covers a variety of market conditions, which makes it representative. In the first half of the year, both the spot index and the futures followed a clear upward trend, and then both declined sharply in June. In the second half of the year, the fluctuations were much greater: the markets fluctuated from mid-July to mid-August and then both experienced another sharp decline, after which they trended upward with fluctuations. Because the sample contains different market trends, it is helpful for analyzing the relationship in upper and lower markets separately.

The trading hours of the stocks in the HuShen 300 index and of the futures market differ. Specifically, the Shanghai and Shenzhen Stock Exchanges trade from 9:30 to 11:30 a.m. and from 1:00 to 3:00 p.m. (Beijing Time), while the HuShen 300 index futures contract trades from 9:15 to 11:30 a.m. and from 1:00 to 3:15 p.m. (Beijing Time). Stock index futures are thus traded thirty minutes longer per day than stocks. Since no spot index observations are available when the stock market is closed, this study is confined to stock trading hours: futures prices recorded before the stock exchanges open or after they close are deleted.

After eliminating weekends and holidays, when the Shanghai and Shenzhen Stock Exchanges are closed, a total of 58,560 one-minute observations are obtained. The plot of the spot index and index futures over the sample period shows that the two variables move very closely together in the long term, as shown in Figure 1.
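As a quick consistency check on the sample size, the observation count can be related to the stock trading hours stated above. The sketch assumes exactly one observation per minute and two 120-minute sessions per day; the implied number of trading days is derived, not a figure reported in the thesis.

```python
# Two stock trading sessions per day: 9:30-11:30 and 13:00-15:00 (Beijing Time).
MINUTES_PER_SESSION = 120
minutes_per_day = 2 * MINUTES_PER_SESSION          # 240 one-minute bars per day

total_observations = 58560                          # reported sample size
trading_days, remainder = divmod(total_observations, minutes_per_day)
return_observations = total_observations - 1        # first differences lose one observation

print(trading_days, remainder, return_observations)  # prints "244 0 58559"
```

Under these assumptions the sample corresponds to 244 full trading days, and taking returns leaves 58,559 observations, which matches the count reported in Table 1.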

Figure 1. The time series plot of HuShen 300 stock index and HuShen 300 index futures prices

Table 1 shows the descriptive statistics and ADF statistics of the returns of the stock index and stock index futures. The descriptive statistics of spot returns and index futures returns are similar in some respects: the mean of both returns is very close to zero, and the standard deviation is much larger than the mean.

However, the skewness and kurtosis of the two returns differ. Skewness is a measure of the asymmetry of the probability distribution of a random variable, and kurtosis is a measure of the “tails” of the distribution. The skewness of the spot return is -6.1595, which means that spot returns are negatively skewed, while the skewness of the futures return is 1.7604, which means that futures returns are positively skewed and less skewed in absolute terms. The kurtosis is large for both returns, but much larger for the spot return than for the futures return. Both returns therefore follow a leptokurtic distribution, with a higher peak and fatter tails than the normal distribution.
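The two statistics can be made concrete with a small Python sketch based on population central moments. This is an assumption about the convention: statistical packages may apply small-sample corrections, and the toy data sets below are illustrative only. Under this convention a normal distribution has kurtosis 3.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    """Third central moment divided by the cubed standard deviation."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Fourth central moment divided by the squared variance (normal = 3)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]     # no asymmetry
left_tailed = [-10.0, 1.0, 2.0, 3.0, 4.0]   # one large negative outlier

print(skewness(symmetric), kurtosis(symmetric))
print(skewness(left_tailed))
```

A single large negative observation is enough to make the skewness negative, which is the pattern seen in the spot returns, where the minimum (-7.6765%) is larger in absolute value than the maximum (6.4036%).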

Table 1. Descriptive statistics

| | Spot Return | Futures Return |
|---|---|---|
| Mean (%) | 0.0000762 | 0.00000389 |
| SD (%) | 0.1354 | 0.1840 |
| Max (%) | 6.4036 | 7.4187 |
| Min (%) | -7.6765 | -5.6808 |
| Skewness | -6.1595 | 1.7604 |
| Kurtosis | 488.5910 | 130.4207 |
| Obs | 58,559 | 58,559 |
| ADF | -177.881* | -239.169* |

The 1% critical value for the ADF test is -3.43. * represents rejection of the null hypothesis at the 1% level of significance.

As shown in the table, the ADF statistics of both returns are negative with large absolute values, -177.881 and -239.169 respectively. Both are less than the 1% critical value of the augmented Dickey-Fuller test. The null hypothesis of the augmented Dickey-Fuller test is that the time series has a stochastic trend; here it is rejected at the 1% level of significance. The stock index return and stock index futures return series can therefore be regarded as stationary, and both can be used in the vector autoregression model.


Table 2. Autocorrelation coefficients of one-minute returns of spot index and index futures

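| Lag | Spot | Futures |
|---|---|---|
| 1 | 0.2984 | 0.0117 |
| 2 | 0.0603 | 0.0093 |
| 3 | -0.0723 | -0.0064 |
| 4 | -0.1018 | -0.0068 |
| 5 | -0.0577 | 0.0052 |
| 6 | -0.0094 | -0.0098 |
| 7 | 0.0249 | 0.0105 |
| 8 | 0.0242 | -0.0100 |
| 9 | 0.0087 | -0.0082 |
| 10 | -0.013 | -0.0081 |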

Table 2 shows the autocorrelation structure of the one-minute returns of the spot index and index futures. The autocorrelations of the spot return at the first 5 lags are relatively large, as are those at lags 7 and 8. The autocorrelation of the one-minute spot return is thus quite pronounced, which indicates that past spot returns have predictive power for the current spot return. In contrast, the autocorrelations of the index futures return are rather small. The coefficient at the first lag is only 0.0117, which is the largest among the first ten lags. The autocorrelation between the current stock index futures return and its past values is therefore much weaker. The difference in autocorrelation between spot and futures returns may be due to the slow dissemination of market-wide information in the spot market: the spot market takes longer to digest new information, so the current return is more strongly related to its past values.

The cross-correlation between the returns of spot index and index futures is shown in Table 3. It provides a preliminary look at the lead-lag relation between the two markets and also a simple suggestion of the leads and lags that should be used in the regression.


Table 3. Cross-correlation coefficients of one-minute returns of spot index and index futures

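| Lag | Corr | Lag | Corr |
|---|---|---|---|
| -20 | 0.0031 | 1 | 0.3339 |
| -19 | 0.0097 | 2 | 0.1667 |
| -18 | -0.0091 | 3 | 0.0517 |
| -17 | -0.0144 | 4 | -0.0184 |
| -16 | -0.0018 | 5 | -0.0342 |
| -15 | 0.0033 | 6 | -0.0366 |
| -14 | 0.0187 | 7 | -0.015 |
| -13 | 0.007 | 8 | 0.0039 |
| -12 | -0.0001 | 9 | 0.0112 |
| -11 | -0.0038 | 10 | 0.0069 |
| -10 | -0.017 | 11 | -0.0058 |
| -9 | -0.0112 | 12 | -0.0135 |
| -8 | -0.0064 | 13 | -0.0116 |
| -7 | 0.0033 | 14 | 0.0015 |
| -6 | 0.0097 | 15 | 0.0054 |
| -5 | 0.0056 | 16 | 0.0067 |
| -4 | -0.0077 | 17 | 0.0105 |
| -3 | -0.0222 | 18 | -0.0065 |
| -2 | -0.0143 | 19 | -0.0049 |
| -1 | 0.0166 | 20 | 0.0068 |
| 0 | 0.5718 | | |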

As shown in the table, the contemporaneous correlation coefficient between the spot return and the futures return is 0.5718, which means that the two time series are not perfectly correlated. The correlation coefficients between the current spot return and lead futures returns are small: the coefficient for the third lead of the futures return is -0.0222, which has the largest absolute value among the lead correlations. In contrast, the correlations between the current spot return and the first three lagged futures returns are considerably larger (0.3339, 0.1667, 0.0517). The current spot return is thus strongly correlated with the first three lagged stock index futures returns, which provides preliminary evidence that a lead-lag relationship might exist.
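The cross-correlations discussed above can be computed with a short Python sketch. The sign convention is an assumption matching the reading of Table 3: a positive k pairs the current spot return with the futures return k minutes earlier, and a negative k pairs it with a future (lead) futures return. The data are simulated with futures leading spot by one period, and the function names are hypothetical.

```python
import math
import random

def correlation(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    cov = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    var_a = sum((x - a_bar) ** 2 for x in a)
    var_b = sum((y - b_bar) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def cross_correlation(r_s, r_f, k):
    """corr(r_s[t], r_f[t - k]); k > 0 uses lagged futures, k < 0 uses lead futures."""
    if k >= 0:
        return correlation(r_s[k:], r_f[:len(r_f) - k])
    return correlation(r_s[:k], r_f[-k:])

# Simulated returns (an assumption): spot reacts to the previous futures return.
random.seed(3)
n_obs = 5000
r_f = [random.gauss(0.0, 1.0) for _ in range(n_obs)]
r_s = [random.gauss(0.0, 1.0)] + [
    0.5 * r_f[t - 1] + random.gauss(0.0, 1.0) for t in range(1, n_obs)
]

corr_lag1 = cross_correlation(r_s, r_f, 1)    # spot vs. first lagged futures return
corr_lead1 = cross_correlation(r_s, r_f, -1)  # spot vs. first lead futures return
print(round(corr_lag1, 3), round(corr_lead1, 3))
```

In the simulated data the correlation with the lagged futures return is clearly positive while the correlation with the lead futures return is close to zero, the same qualitative asymmetry as in Table 3.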


In this section, the regression results are presented, including the model selection information criteria, the vector autoregression results, the Granger causality test and the dichotomous analysis. All computations are carried out using Stata 14.

5.1 Model Selection Information Criteria

Table 4 presents the information criteria used to determine the lag order of the VAR model. AIC and HQIC choose 10 lags and 6 lags respectively. The sample in this study is rather large, and AIC tends to overestimate the lag order in large samples, so BIC is more appropriate here. The BIC is minimized at 5 lags, so a VAR model with 5 lags is selected.

Table 4. Model selection information criteria

| Lag | AIC | HQIC | BIC |
|---|---|---|---|
| 0 | -2.10466 | -2.10456 | -2.10435 |
| 1 | -2.30568 | -2.30539 | -2.30476 |
| 2 | -2.33556 | -2.33508 | -2.33402 |
| 3 | -2.3588 | -2.35813 | -2.35665 |
| 4 | -2.37246 | -2.3716 | -2.3697 |
| 5 | -2.37598 | -2.37493 | -2.37261* |
| 6 | -2.3765 | -2.37526* | -2.37251 |
| 7 | -2.37665 | -2.37522 | -2.37205 |
| 8 | -2.37665 | -2.37503 | -2.37143 |
| 9 | -2.37675 | -2.37494 | -2.37093 |
| 10 | -2.377* | -2.375 | -2.37056 |

* indicates lag order selected by the criterion

AIC: Akaike information criterion

BIC: Bayes information criterion

HQIC: Hannan-Quinn information criterion
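The selection rule applied to Table 4 is simply an argmin per column. As a check, the following Python sketch re-enters the values of Table 4 and picks the lag that minimizes each criterion; only the variable names are my own.

```python
# Values copied from Table 4: lag -> (AIC, HQIC, BIC).
table4 = {
    0: (-2.10466, -2.10456, -2.10435),
    1: (-2.30568, -2.30539, -2.30476),
    2: (-2.33556, -2.33508, -2.33402),
    3: (-2.3588, -2.35813, -2.35665),
    4: (-2.37246, -2.3716, -2.3697),
    5: (-2.37598, -2.37493, -2.37261),
    6: (-2.3765, -2.37526, -2.37251),
    7: (-2.37665, -2.37522, -2.37205),
    8: (-2.37665, -2.37503, -2.37143),
    9: (-2.37675, -2.37494, -2.37093),
    10: (-2.377, -2.375, -2.37056),
}
names = ("AIC", "HQIC", "BIC")

# For each criterion, choose the lag with the smallest value.
selected = {
    name: min(table4, key=lambda lag, j=j: table4[lag][j])
    for j, name in enumerate(names)
}
print(selected)  # {'AIC': 10, 'HQIC': 6, 'BIC': 5}
```

The result reproduces the starred entries of the table: the more heavily a criterion penalizes additional lags, the shorter the selected lag order.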

5.2 The VAR Regression Results

Table 5 shows the regression results of the 5-lag VAR model. As shown in the second column, the coefficients on the first 5 lags of the spot return are all significant at the 1% level, which indicates strong autocorrelation in the stock index return. Past stock index returns can thus be used to forecast future stock index returns, so there is short-term predictability in stock index returns. For stock index futures returns, this autocorrelation is not pronounced. The only coefficient significant at the 1% level is that on the second lag, which is only 0.0218. Past futures returns therefore have limited power to forecast future futures returns.

Regarding the information transmission through returns, the current stock index return is significantly affected by the first 5 lags of futures returns. Their coefficients are 0.2152, 0.1455, 0.1094, 0.0741 and 0.0416, all statistically significant at the 1% level. The current stock index return is thus positively influenced by the first five lags of stock index futures returns, and past futures returns can be used to forecast future stock index returns. In contrast, as shown in the third column, the coefficients on the second and third lags of the spot return are -0.0414 and -0.0370, both significant at the 1% level. The current futures return is negatively influenced by lagged spot returns, so past stock index returns can also be used to predict futures returns. But since these coefficients are relatively small, the predictive power of past spot returns for index futures returns is limited.


Table 5. The VAR regression results

| | Spot return coefficients | Futures return coefficients |
|---|---|---|
| Spot return lag | | |
| 1 | 0.0626* (-11.94) | 0.0206 (-2.64) |
| 2 | -0.1009* (-19.34) | -0.0414* (-5.32) |
| 3 | -0.1387* (-26.78) | -0.0370* (-4.79) |
| 4 | -0.1063* (-20.65) | -0.0097 (-1.27) |
| 5 | -0.0438* (-9.26) | 0.0028 (-0.39) |
| Futures return lag | | |
| 1 | 0.2152* (61.12) | 0.0027 (0.52) |
| 2 | 0.1455* (39.49) | 0.0218* (3.98) |
| 3 | 0.1094* (29.36) | 0.0165 (2.96) |
| 4 | 0.0741* (19.88) | 0.0101 (1.82) |
| 5 | 0.0416* (11.59) | 0.0127 (2.37) |
| N | 58554 | 58554 |
| adj. R-sq | 0.1705 | 0.0015 |

* denotes the 1% significance level

These results indicate that there is a significant bidirectional relationship between the stock index return and the stock index futures return over the sample period. Furthermore, because more lags of futures returns are significant in forecasting spot returns and their coefficients are larger, we can infer that the lead relationship from the index futures to the stock index is much stronger.

In general, we can conclude that there is a bidirectional relationship between the stock index and stock index futures, but the index futures market tends to lead the underlying stock market, and this direction is dominant.


5.3 Granger Causality Test

Furthermore, I use the Granger causality test to investigate this relationship. The results for the stock index returns and stock index futures returns are shown in Table 6. For large samples, Stata reports a chi-square statistic instead of an F-statistic; it is interpreted in the same way.

As shown in the table, the chi-square statistics of the two tests are 67.106 and 4947.6 respectively. Both are far larger than the 1% critical value, and the p-values are very close to zero. The two null hypotheses are therefore both rejected at the 1% level, indicating that a significant bidirectional Granger-causal relationship exists between the stock index returns and stock index futures returns.

Furthermore, the chi-square statistic for the hypothesis “Futures returns do not Granger-cause spot returns” is 4947.6, which is much larger than 67.106. This indicates that the lead effect from stock index futures to the stock index is much stronger. The Granger causality test thus leads to the same conclusion as above: there is a bidirectional lead-lag relationship between the stock index futures return and the stock index return, and the lead relationship from futures to spot is stronger.

Table 6. Granger Causality tests

| The null hypothesis | Lag | Chi^2 statistic | P-value | Conclusion |
|---|---|---|---|---|
| Spot returns do not Granger-cause futures returns | 5 | 67.106 | 0.000 | Rejected |
| Futures returns do not Granger-cause spot returns | | 4947.6 | 0.000 | Rejected |

5.4 Dichotomous Analysis

The analysis above shows that the lead effect of stock index futures on the stock index is stronger than the effect in the opposite direction, so my first hypothesis is verified. I then dichotomize the market into upper and lower markets and use the model presented in Section 3.6 to investigate the relationship under different market trends and to test whether it is asymmetric. The regression results are presented in Table 7.

As shown in Table 7, in both upper and lower markets, the coefficients on all lagged stock index futures returns are significant at the 1% level, which means that stock index futures still significantly lead the stock index in the dichotomized markets.

Specifically, in the upper market, the coefficients on the first two lags of $r_f^{+}$ are -0.0342 and -0.0205, both significant at the 1% level. These two coefficients are negative while the coefficients on lagged futures returns are all positive, which indicates that the lead relation from stock index futures to the stock index is weakened in the upper market. In the lower market, shown in the second column, the coefficients on the first two lags of $r_f^{-}$ are 0.0341 and 0.0202, both significant at the 1% level. In contrast to the upper market, these two coefficients are positive, which means that the lead effect is strengthened in the lower market.

In conclusion, the lead-lag relationship holds in both upper and lower markets. Stock index futures lead the stock index, and this lead effect is asymmetric: it is greater in the lower market than in the upper market. This result can be explained by the fact that investors in the spot market lack the necessary instruments to sell stocks short, so the stock market is even less efficient when the market trends downward. Compared with the stock market, the stock index futures market can respond to new information even faster in the lower market because short positions are available, so the lead relation from the stock index futures to the stock index is stronger in the lower market.


Table 7. The Dichotomous Analysis

Dependent variable: spot return

| | Upper market coefficients | Lower market coefficients |
|---|---|---|
| Spot return lag | | |
| 1 | 0.0619* (11.80) | 0.0619* (11.80) |
| 2 | -0.101* (-19.27) | -0.101* (-19.27) |
| 3 | -0.139* (-26.80) | -0.139* (-26.80) |
| 4 | -0.107* (-20.70) | -0.107* (-20.70) |
| 5 | -0.0441* (-9.33) | -0.0441* (-9.32) |
| Futures return lag | | |
| 1 | 0.233* (50.60) | 0.199* (43.90) |
| 2 | 0.156* (32.88) | 0.135* (29.20) |
| 3 | 0.108* (22.53) | 0.111* (23.70) |
| 4 | 0.0701* (14.68) | 0.0784* (16.80) |
| 5 | 0.0335* (7.17) | 0.0508* (11.12) |
| Dichotomous futures return lag (rf#) | | |
| 1 | -0.0342* (-5.90) | 0.0341* (5.88) |
| 2 | -0.0205* (-3.53) | 0.0202* (3.49) |
| 3 | 0.00302 (0.52) | -0.00300 (-0.52) |
| 4 | 0.00830 (1.43) | -0.00829 (-1.43) |
| 5 | 0.0173* (2.99) | -0.0173* (-2.98) |
| con | 0.0007786 (1.29) | 0.000773 (1.28) |
| N | 58544 | 58554 |
| adj. R-sq | 0.1712 | 0.1712 |
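The close correspondence between the two columns of Table 7 can be understood from a short piece of algebra. The superscripts U and L below are my own notation for the upper-market and lower-market specifications, and the argument assumes that both are estimated on the same sample. Since every return satisfies $r_f = r_f^{+} + r_f^{-}$, the upper-market terms can be rewritten as

$$\beta^{U}_{1,i}\, r_{f,t-i} + \gamma^{U}_{1,i}\, r^{+}_{f,t-i} = \left(\beta^{U}_{1,i} + \gamma^{U}_{1,i}\right) r_{f,t-i} - \gamma^{U}_{1,i}\, r^{-}_{f,t-i},$$

so that

$$\beta^{L}_{1,i} = \beta^{U}_{1,i} + \gamma^{U}_{1,i}, \qquad \gamma^{L}_{1,i} = -\gamma^{U}_{1,i}.$$

For the first lag in Table 7 this gives $0.233 + (-0.0342) = 0.1988 \approx 0.199$ and $-(-0.0342) = 0.0342 \approx 0.0341$, in line with the reported lower-market coefficients up to rounding. Under this reading, the response of the spot return to a negative lagged futures return is about 0.233 and the response to a positive one is about 0.199, which is the asymmetry discussed in the text.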


6. Robustness Check

The VAR model has limitations. One important problem is that spurious regression may arise when the variables are cointegrated: t-statistics can be highly significant and the R-squared high although there is no true relationship between the variables. The VECM is therefore adopted as a robustness check. This section presents the empirical results of the vector error correction model. Table 8 shows the information criteria used to determine the lag order of the VECM. As shown in the table, AIC and HQIC choose 10 lags and 7 lags respectively, while BIC chooses 6 lags. As with the lag selection for the VAR model, I use the VECM with 6 lags according to BIC.

Table 8. Model selection information criteria for the vector error correction model

| Lag | AIC | HQIC | BIC |
|---|---|---|---|
| 0 | -5.83657 | -5.83648 | -5.83627 |
| 1 | -20.5271 | -20.5269 | -20.5262 |
| 2 | -20.7272 | -20.7268 | -20.7257 |
| 3 | -20.7569 | -20.7562 | -20.7547 |
| 4 | -20.7799 | -20.7791 | -20.7772 |
| 5 | -20.7935 | -20.7925 | -20.7902 |
| 6 | -20.797 | -20.7958 | -20.793* |
| 7 | -20.7975 | -20.7961* | -20.7929 |
| 8 | -20.7977 | -20.7961 | -20.7925 |
| 9 | -20.7977 | -20.7959 | -20.7918 |
| 10 | -20.7978* | -20.7958 | -20.7913 |

* indicates lag order selected by the criterion

AIC: Akaike information criterion

BIC: Bayes information criterion

HQIC: Hannan-Quinn information criterion

The Johansen cointegration test is then performed to determine whether the log stock index price and the log stock index futures price are cointegrated. The results are shown in Table 9. At rank 0, the null hypothesis of no cointegrating equation is rejected, because the trace statistic of 27.2259 exceeds the 5% critical value of 15.41. At rank 1, the null hypothesis is that there is at most one cointegrating equation between the variables, while the alternative hypothesis is that there is more than one. The trace statistic at rank 1 is 2.1448, which is smaller than the critical value of 3.76, so this null hypothesis cannot be rejected. The Johansen test therefore indicates that there is one cointegrating relation between the log stock index price and the log stock index futures price. In light of this result, I estimate the relation using the VECM.

Table 9. Johansen tests for cointegration

| Maximum rank | Trace statistic | 5% critical value |
|---|---|---|
| 0 | 27.2259 | 15.41 |
| 1 | 2.1448* | 3.76 |
| 2 | | |
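The sequential reading of the trace test can be written as a short Python sketch using the values of Table 9. The rule shown (stop at the first rank whose trace statistic falls below the critical value) is the usual way the table is interpreted; the function name `select_rank` is hypothetical.

```python
# (rank under the null, trace statistic, 5% critical value), copied from Table 9.
trace_table = [
    (0, 27.2259, 15.41),
    (1, 2.1448, 3.76),
]

def select_rank(rows):
    """Return the first rank whose null hypothesis is not rejected."""
    for rank, trace_stat, critical_value in rows:
        if trace_stat < critical_value:
            return rank
    # Every null was rejected: the rank exceeds all values tested.
    return len(rows)

selected_rank = select_rank(trace_table)
print(selected_rank)  # prints 1
```

The null of rank 0 is rejected and the null of rank 1 is not, so the procedure selects one cointegrating relation, which is the starred row of the table.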

The regression results of the VECM are shown in Table 10. The speed-of-adjustment coefficients ($\gamma_1$ and $\gamma_2$) indicate that the stock index and stock index futures behave differently. $\gamma_1$ is 0.000763 and is significant at the 1% level, while $\gamma_2$ is not significant. The significance of $\gamma_1$ means that the spot market responds to the previous period's deviation from equilibrium. The insignificance of $\gamma_2$ means that the current change in the stock index futures price does not respond to deviations from equilibrium in previous periods; any adjustment in the current period's futures price is therefore driven by the short-run lagged changes in futures and spot prices. However, we cannot conclude that the spot market does not lead the futures market merely because $\gamma_2$ is insignificant. The short-run responses should also be analyzed.

As shown in the second column, the coefficients on the first five lags of the stock index futures short-term responses are 0.215, 0.145, 0.110, 0.0744 and 0.0419 respectively. They are all significant at the 1% level, suggesting that the futures market leads the spot market.

Meanwhile, in the last column, the coefficients on the second and sixth lags of the stock index short-term responses are 0.0203 and -0.0182, which are also significant at the 1% level. This indicates that past spot returns can also be used to predict the current futures return, meaning that the stock index also has a lead effect on stock index futures. However, these coefficients are smaller than those in the second column, so we can infer that this lead effect is weaker.

From the analysis above, we conclude that there is a bidirectional lead-lag relationship between the stock index and stock index futures, and that the lead relation from futures to spot is stronger. Using the VECM as a robustness check thus leads to the same conclusion as the VAR model, which makes the conclusion more reliable.


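Table 10. The VECM Regression Results

| | Stock index equation | Stock index futures equation |
|---|---|---|
| Error correction term | 0.000763* (-3.48) | -0.0002 (-0.6) |
| Stock index short-term responses | | |
| 1 | 0.0625* (11.90) | 0.00226 (0.43) |
| 2 | -0.101* (-19.30) | 0.0203* (3.67) |
| 3 | -0.140* (-26.64) | 0.0141 (2.51) |
| 4 | -0.107* (-20.50) | 0.00739 (1.32) |
| 5 | -0.0440* (-8.49) | 0.00988 (1.77) |
| 6 | -0.00659 (-1.39) | -0.0182* (-3.40) |
| Stock index futures short-term responses | | |
| 1 | 0.215* (60.90) | 0.0222* (2.84) |
| 2 | 0.145* (39.31) | -0.0382* (-4.88) |
| 3 | 0.110* (29.21) | -0.0328* (-4.20) |
| 4 | 0.0744* (19.76) | -0.00568 (-0.73) |
| 5 | 0.0419* (11.19) | 0.00567 (0.74) |
| 6 | 0.00455 (1.26) | 0.0216* (3.06) |
| N | 58553 | 58553 |
| adj. R-sq | 0.1707 | 0.0017 |

* represents significance at the 1% level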


7. Conclusion and Discussion

This paper studies the lead-lag relationship between the stock index and stock index futures using a vector autoregression (VAR) model based on minute-by-minute high-frequency data from Chinese markets. The empirical results show that there exists a significant bidirectional Granger causal relationship between the HuShen 300 stock index return and the HuShen 300 stock index futures return. Moreover, the relation from the stock index futures to the stock index is much stronger. In general, therefore, stock index futures tend to lead the stock index and dominate the price discovery process. To make this conclusion more reliable, I adopt the vector error correction model (VECM) as a robustness test. The VECM reaches a conclusion consistent with the VAR model, which supports the reliability of this conclusion.

Furthermore, I investigate this relationship under specific market conditions. I dichotomize the market into an upper and a lower market and study the relationship in each separately. The result indicates that the lead-lag relationship between the stock index and stock index futures is asymmetric: the lead from the stock index futures to the stock index is greater in the lower market than in the upper market.

There are several reasons that might contribute to these findings. First, the transaction cost in the futures market is lower, which attracts more investors, so more information is digested in this market. Second, trading is faster in the futures market. By investing in stock index futures, investors do not need to trade the stocks within the index separately, which is more efficient. Third, the index contains stocks that are traded infrequently, which makes the stock index respond more slowly to new market-wide information. Furthermore, there are short-selling constraints in the stock market. For example, investors are unable to borrow stocks to short in the stock market. These constraints make the stock market less efficient than the stock index futures market when the market goes down.

On the other hand, although the relationship is weaker, the spot market does lead the futures market under certain circumstances. This means that, under certain circumstances, the spot market can react to information faster than the futures market. One possible explanation is that the spot market might react better to firm-wide information. For instance, if a listed company
