Master Thesis Quantitative Finance
The Influence of Investor Sentiment on Chinese Stock Market Returns
Author name: Shuyan Qiao Student ID: 13142917
Thesis Supervisor: Theresa Spickers Date: January, 2022
Specialization: MSc Quantitative Finance Word Count: 13415
Faculty of Economics and Business, University of Amsterdam
Statement of Originality
This document is written by student Shuyan Qiao who declares to take full responsibility for the contents of this document.
I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it.
The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.
This paper investigates the influence of investor sentiment on the Chinese stock market in recent years. It applies principal component analysis to direct and indirect market sentiment indicators to build an up-to-date Chinese investor sentiment index.
After constructing two vector autoregression (VAR) models with the investor sentiment index and the returns of stocks with different capitalizations, the evidence shows that investor sentiment is size-sensitive: it affects small-cap stocks more than large-cap stocks. Three hypotheses are tested in this paper: 1) Direct investor sentiment proxy is a better measure for investor sentiment. 2) Compared with the return of large-cap stocks, the effect of investor sentiment is larger on the return of small-cap stocks. 3) The overall effect of investor sentiment on stock return is positive.
This study contributes to research on investor sentiment in the Chinese market. It uses up-to-date data to build the models and to test whether the effect of investor sentiment has changed in recent years. In sum, investor sentiment is a reliable short-term signal for both small-cap and large-cap stock returns. The effect of investor sentiment is larger and longer-lasting on the returns of small-cap stocks.
Keywords: Investor Sentiment; Market Capitalization; Lag Effect; Chinese Market
1. Introduction
2. Literature Review
2.1 Investor Sentiment Definition
2.2 Introduction of Chinese stock market
2.3 Classification of Investor Sentiment Indicators
2.3.1 Direct Sentiment Indicator
2.3.2 Indirect Sentiment Indicator
2.4 Research & Hypotheses
3. Investor Sentiment Index Design
3.1 Descriptive Statistical Analysis
3.2 Determine Lead and Lag Variables
3.3 Investor Sentiment Index
4. Empirical Design and Analysis
4.1 Investor Sentiment Index Validity Test
4.2 Stability Test
4.3 Large-cap Stock VAR Model
4.4 Small-cap Stock VAR Model
4.5 Sorts
4.6 Comparison and results
4.7 Robustness check
5. Conclusion
1. Introduction
Since the last century, stock markets worldwide have been subject to frequent fluctuations. To satisfy investors’ needs for asset pricing, standard finance theory was introduced and refined. According to classic finance theory, investors are always rational and diversify to optimize their portfolios. If some investors are irrational, the ‘abnormal’ prices caused by their demand will be offset by arbitrageurs. However, classic finance theory could not explain many observed anomalies. Investors’ overconfidence, herd behavior and other psychological biases have led stock markets to overreact, leaving them at times excessively depressed and at other times full of bubbles. Furthermore, these biases have prevented financial markets from fulfilling their role of financing the real economy and optimizing the allocation of social and economic resources. This gave birth to behavioral finance theory, which studies investors’ irrational behavior and investor sentiment.
Against this background, it is important to analyze the influence of investor sentiment on the stock market in order to provide investors with reasonable investment suggestions and improve their returns. Investor sentiment is a belief about future cash flows and investment risks that is not justified by the facts at hand. Kenneth L. Fisher and Meir Statman (2000) stated that this belief reveals not only the biases in stock market forecasts but also the opportunities to earn extra returns. As modern behavioral finance develops, the focus has shifted to quantifying the effects of investor sentiment.
“Now, the question is no longer, as it was a few decades ago, whether investor sentiment affects stock prices, but rather how to measure investor sentiment and quantify its effects.”
Malcolm Baker and Jeffrey Wurgler (2007)
Han and Li (2017) indicate that the investor sentiment effect matters more for smaller stocks than for larger stocks. Building on this point, my research tries to answer the question: How does investor sentiment affect large-cap and small-cap stock returns? By studying information on investor emotion, this paper explores whether the returns of large-cap and small-cap stocks are influenced by investor sentiment. The study measures investor sentiment with several indicators, namely the discount rate of closed-end funds, the stock turnover rate, the financing amount in the A-share market, the number of new investors opening accounts, the investor confidence index and so on. I classify the indicators, based simply on their source of data, into indirect ones, such as the stock turnover rate and the discount rate of closed-end funds, and direct ones, such as the bullish sentiment index and the investor confidence index.
This paper studies stock returns in the Chinese market, the world’s biggest emerging market, which has distinctive financial policies. Since the start of the 21st century, the Chinese stock market has developed extremely rapidly. By the end of 2020, there were 4114 listed companies on the Chinese stock market, with a total market value of around 84.51 trillion. There are over 180 million qualified investors in China, making up 13.8% of the total population.
The Chinese stock market is well known for several features: a large number of listed companies, an imperfect supervision mechanism, and an imperfect delisting system (which may cause inefficient allocation of market resources). A contagious, immature investor mentality drives investors toward short-term trading and speculation, which also results in overreaction of the market as a whole.
Since the 1997 Asian financial crisis, financial scholars have paid more attention to emerging markets, asking whether investor behavior in emerging markets differs from that in developed markets. Also, given that the irrational investing behavior of international investors probably leads to excessive volatility in developing markets, results may deviate considerably from conventional findings if we simply apply past investor sentiment theories to such markets. Research into the influence of investor sentiment on Chinese stock market returns will therefore also provide new insights for studies of other developing markets.
Research on investor sentiment is of great importance both academically and practically. Timothy (2020) states that the returns of individual stocks are more synchronous with the aggregate market during periods of high investor sentiment. Because more synchronous stock prices are associated with a less efficient allocation of economic resources, investor sentiment can also impede the capital allocation function of financial markets.
My study makes academic contributions in the following respects. First, it applies up-to-date data to investigate how the effect has changed in recent years, as COVID-19 has brought huge changes to human life and has undoubtedly affected investor behavior. Second, I choose new indicators to form my index, since the special stock listing policy and trading system in China make it difficult to obtain the indicators that foreign scholars have used. This helps the investor sentiment index I construct to better reveal Chinese investors’ real sentiment. Moreover, it provides further support for sentiment research in the largest and most distinctive emerging market, which may also apply to other similar developing economies.
Third, it studies the relationship between direct and indirect sentiment indicators and investigates the relationship between investor sentiment and market capitalization with a new model, which may provide new ideas for further study. The study also makes practical contributions. It helps investors to correct their investment attitudes and financial management ideas, which may improve their returns. It also helps policy makers to improve the supervision system and, at the same time, the overall environment of the Chinese stock market.
The paper is organized as follows. Part Two describes the origin and history of investor sentiment research. It also introduces the history and main characteristics of the Chinese stock market, as well as the investor sentiment indicators that are later used to build the investor sentiment index. At the end of Part Two, I list the hypotheses to be tested, along with the related theories and supporting literature. Part Three describes the steps to build the composite investor sentiment index. I use principal component analysis, as adopted in Baker and Wurgler (2007). In order to remove unnecessary factors and keep the index clean and effective, I conduct a correlation analysis and screen out the factors and lags of low relevance. Principal component analysis simplifies the research problem and improves model efficiency while discarding only relatively unimportant information.
Once the investor sentiment index has been constructed, the analysis turns to the key question.
In the fourth part, I first test the validity of the sentiment index against SHI (Shanghai Composite Index) and SZI (Shenzhen Component Index) to see whether it reflects changes in real market sentiment. In order to test whether the Chinese investor sentiment index is size-sensitive, stocks are classified on the basis of their market capitalizations. The top 30% of all stocks by market capitalization form the large-cap group and the bottom 30% form the small-cap group. I then apply the same steps to both groups. Firstly, I determine the optimal lag order to build the vector autoregression (VAR) model. Secondly, I run the Granger causality test to examine the statistical Granger relation between sentiment and returns. Thirdly, to show the comprehensive dynamic influence and the contribution rate of each shock, I compute the impulse response function and the variance decomposition. Fourthly, I replicate the portfolio sorts of Baker and Wurgler (2004) to test the cross-sectional effect of investor sentiment. In the end, I compare the VAR models of large-cap and small-cap stocks. The result is consistent with my theoretical prediction that the influence of investor sentiment is larger on small-cap stocks than on large-cap stocks. A robustness test is applied to check whether the investor sentiment index is affected by other stock market return indices. Part Five concludes.
2. Literature Review
2.1 Investor Sentiment Definition
Sentiment, as a person’s conscious experience of external things, is the mental activity and physiological reaction that accompany the brain’s subjective perception of its environment. Sentiment is purported to affect returns since investors’ optimism or pessimism may induce mispricing in stock markets worldwide. Brown and Cliff (2005) find that sentiment affects mispricing in the US stock market. Past empirical findings indicate that stock returns are influenced not only by investor sentiment but also by other sentiments, such as manager sentiment. Fuwei Jiang and Joshua Lee (2017) built a manager sentiment index based on aggregate managerial textual tone and found that manager sentiment negatively predicts stock returns. The reason is that manager sentiment is able to capture more mispricing than other external fundamental information. In this paper, I focus on another dimension, namely investor sentiment.
When people talk about investor sentiment, they refer not only to individual emotion but also to the aggregate attitude of the investment community. In 1990, De Long, Shleifer and Summers first laid out the assumption that investors are subject to sentiment. Their research shows that the risk created by investors’ opinions can significantly reduce the attractiveness of arbitrage. Also, many financial anomalies can be explained by noise trader risk. That means financial irrationality can substantially affect market prices even in the absence of fundamental risk. This is the origin of investor sentiment theory.
There is no single agreed definition of investor sentiment to date. De Long (1990) defined investor sentiment as the degree to which noise traders’ expectations of future stock prices deviate from the beliefs of rational arbitrageurs. Stein (1996) defined investor sentiment as a systematically biased view of returns. Brown and Cliff (2004) saw investor sentiment as the optimistic or pessimistic expected pricing of financial assets. Baker and Wurgler (2012) regarded investor sentiment as a non-traditional psychological deviation in the perception of risk and return. Wai Mun Fong (2014) argued that investor sentiment captures the propensity of investors to speculate and also reflects investors’ optimism or pessimism about stocks in general. Based on those theories, this paper defines investor sentiment as investors’ reflection of the current status of the financial market. Because investors may have cognitive biases towards market information, and psychological preferences commonly exist, there is always a certain systematic deviation in investor sentiment. Thus, it cannot be measured with complete accuracy.
In recent papers, researchers use various direct, indirect and composite variables to measure investor sentiment. Survey-based indices such as Investors Intelligence, the investor confidence index, the consumer confidence index, the China investors’ sentiment index and the CCTV Bullish Sentiment Index are regarded as direct indicators. Brown and Cliff (2004) cast more light on direct measures such as surveys. Schmeling (2009) used the consumer confidence index as a measure of individual investor sentiment. Among market-based measures, Baker and Wurgler constructed a sentiment index based on six proxies, including dividend premium, trading volume, closed-end fund discount, first-day return on IPOs and equity share in new issues. Tetlock (2007) used daily content from a Wall Street Journal column to indicate investor sentiment. Moreover, recent literature has explored non-financial indicators from other dimensions. Adrian and Alexandre (2020) developed a measure of investor sentiment based on music emotion. Spotify data is used to generate a music sentiment measure, which captures investor moods during trading time more directly. The drawback is that the effect of music sentiment on stock market returns is likely weaker than that of financial variables. That is why I choose to rely mainly on financial indicators as indirect indicators in my model.
Indices such as the stock turnover rate, the P/E ratio, new account volume and the financing amount in the A-share market are regarded as indirect indicators. Composite indicators use principal component analysis to combine direct and indirect indicators in different proportions. Composite indicators can retain the strengths of their component indices while mitigating their weaknesses, which is why they are the most widely used nowadays and will be used in the following part.
2.2 Introduction of Chinese stock market
There are two official markets in Mainland China, namely the Shanghai Stock Exchange and the Shenzhen Stock Exchange. Since they began to operate in the early 1990s, the two markets have expanded dramatically and made the Chinese market one of the leading equity markets worldwide. Given that the Chinese market is the second largest market in Asia and the fastest growing market in the world, and that China has a fast-growing gross domestic product, it is reasonable to expect that the Chinese market will continue to grow at a remarkable pace.
Nevertheless, the Chinese stock market is still not as mature as many stock markets in developed economies. Green (2003) states that, in 2001, market capitalization was about 45% of gross domestic product in China. Meanwhile, the corresponding figure for the US was over 300%. The Chinese financial market is tightly governed by the central government. In order for state-owned enterprises to develop quickly and steadily, interest rates are controlled and kept below market rates. Market volatility is low due to central bank supervision and policy restrictions, which, along with short-sale constraints, have made speculation difficult and unrewarding. Moreover, policy restricts and supervises transnational investment, which leaves investors only a few financial products to choose from.
What distinguishes the Chinese stock market from developed stock markets is its composition and transparency. In the Chinese stock market, nearly two-thirds of the outstanding shares are not publicly traded. In past years, Chinese financial markets have been widely known for their lack of transparency. It is observed that reporting requirements for listed companies in China are neither well-developed nor extensive, and significantly less stringent than those in developed countries (Demirer & Kutan, 2006). That undoubtedly affects the information environment and in turn changes investor behavior. Under such circumstances, individual investors may base their investing decisions on the actions or “suggestions” of others who appear to be better informed about internal information.
In recent years, the Chinese government has implemented several measures to improve the Chinese stock market environment. The restriction prohibiting controlling shareholders from buying and selling in the open market has been lifted. Companies can now repurchase stocks and convert non-tradable shares into tradable shares. The China Securities Regulatory Commission has also set rules requiring companies to disclose more information and has tightened auditing procedures. With those measures, the markets are becoming more transparent.
Given the particularity of the Chinese stock markets and the number of financial policies introduced in recent years, it is essential to investigate whether investor sentiment in China differs from that in other economies and how it has changed over the past years.
2.3 Classification of Investor Sentiment Indicators
Investor sentiment indicators can be categorized into direct sentiment indicators and indirect sentiment indicators based on data sources.
2.3.1 Direct Sentiment Indicator
Direct sentiment indicators are derived from survey statistics collected by organizations such as periodicals, newspapers, TV programs and financial institutions. The indicators were mainly sold to investors as financial products or provided in reports.
II (Investors Intelligence) is the world’s longest-standing provider of technical research and has been in operation for over 50 years. Its sentiment report has been heralding major market moves since 1963. It has been widely adopted by the investment community as a stable indicator and is followed closely by the financial media. It has a record of staying broadly consistent with the market trend and of predicting major market turning points. II reviews comments from hundreds of independent institutional commentators and classifies them into three kinds: positive, neutral and negative. The Investors Intelligence index is the difference between the percentage of positive comments and the percentage of negative comments. II reflects the fact that most investors tend to be affected by market sentiment and make wrong judgements at turning points. They are too greedy at the top of the market and too panicked when the market reaches the bottom. As a result, II is widely used as a contrarian index.
Since the participants of II are mostly stock market practitioners, the II index is regarded as a standard direct sentiment indicator for institutional investors.
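The calculation described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (percentage of positive comments minus percentage of negative comments); the function name `bull_bear_spread` and the sample counts are hypothetical and do not come from Investors Intelligence.

```python
def bull_bear_spread(positive: int, neutral: int, negative: int) -> float:
    """Percentage of positive comments minus percentage of negative comments.

    Illustrative helper (hypothetical name): neutral comments enter only
    through the total, so they dilute the spread without changing its sign.
    """
    total = positive + neutral + negative
    if total == 0:
        raise ValueError("at least one comment is required")
    return 100.0 * (positive - negative) / total


# Made-up example: 50 positive, 30 neutral and 20 negative comments.
spread = bull_bear_spread(50, 30, 20)
print(spread)
```

Read as a contrarian signal, an unusually high spread would be taken as a warning of excessive optimism rather than as a buy signal.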
By comparison, AAII is seen as a representative indicator for individual investors. AAII (American Association of Individual Investors) first surveyed its members about their stock market forecasts in 1987. The results are released every Thursday.
Drawing on the experience of foreign investor sentiment indices, China Securities Investor Protection Fund Co. introduced the ICI (Investor Confidence Index) to observe changes in domestic investors’ investing psychology. ICI not only reflects the change in Chinese investor sentiment since 2008, but also captures current changes in investor sentiment in a timely manner. By sending questionnaires to the individual and institutional investors in its sample database and collecting their responses, the company publishes investor confidence survey reports and month-on-month change tables every month. The report contains several indices, such as the investor confidence index, stock valuation index, market optimism index, market rebound index, market resistance index and stock buying index. The indices range from 0 to 100 and are presented in tables and plots. An index over 50 means there are more optimistic investors than pessimistic investors. An increase in ICI means that investor optimism is rising. An index below 50 means there are more pessimistic investors than optimistic investors. A decrease in ICI means that investor pessimism is rising. With the index trend charts and month-on-month change tables, ICI intuitively shows the change in investor sentiment.
Similarly, China Central Television (CCTV) Finance released a Bullish Sentiment Index (BSI) based on the ratio of bullish investors to bearish investors. It collected sentiment data from securities and consulting agencies and released a forecast report before every trading day.
Surveys can take into account the psychological dimension of individuals in accordance with their socioeconomic characteristics. They use standardized questions, which makes measurement more precise and allows for regular time series. Nevertheless, surveys also have limitations. First, the sample is often limited to a specific group. Second, respondents may not judge themselves objectively or answer all questions truthfully. As a result, the results do not always correspond to the true investor sentiment during the period. That is why scholars introduced indirect sentiment indicators.
2.3.2 Indirect Sentiment Indicator
Indirect indicators reveal investor sentiment by accessing and processing transaction data in security markets. However, without further processing and analysis, they can neither tell investors directly how the market will perform in the future nor reflect investors’ subjective investing intentions. Since any market transaction data related to investor sentiment can serve as an indirect sentiment indicator, there are many kinds that can be applied in the model. In this part, I introduce some commonly used indicators.
The discount rate of closed-end funds (DCEF) is the difference between the net asset value per fund unit and the market price per unit of the closed-end fund, divided by the net asset value per fund unit. De Long (1991) argued that DCEF can be used to measure investor sentiment. In a later study, Brown and Cliff (2005) argued that DCEF cannot be applied in forecasting future stock returns. Since a large portion of closed-end funds is held by individual investors, DCEF is commonly used as an individual investor sentiment indicator at home and abroad. Generally speaking, when DCEF increases, market investor sentiment tends to be pessimistic.
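As a minimal sketch of the definition above, the discount rate can be computed as follows. The function name `closed_end_fund_discount` and the sample values are hypothetical and chosen only for illustration.

```python
def closed_end_fund_discount(nav_per_unit: float, market_price: float) -> float:
    """DCEF = (net asset value per unit - market price per unit) / net asset value per unit.

    A positive value means the fund trades at a discount to its net asset
    value; a negative value means it trades at a premium.
    """
    if nav_per_unit <= 0:
        raise ValueError("net asset value per unit must be positive")
    return (nav_per_unit - market_price) / nav_per_unit


# Made-up example: net asset value 1.00 per unit, market price 0.90 per unit.
discount = closed_end_fund_discount(1.00, 0.90)
print(round(discount, 4))
```

Under the interpretation given in the text, a rising value of this ratio would be read as a sign of increasing pessimism.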
The stock turnover rate (TURN) is the frequency at which stocks are bought and sold in the market over a period of time. TURN tends to increase when investors are optimistic about the market and overall trading volume increases. As a result, the turnover rate is a positive investor sentiment indicator that shows investors’ optimism or pessimism about the future market. Baker and Stein (2004) show that market liquidity can serve as a sentiment indicator, since investors generally have higher sentiment in a bull market and lower sentiment in a bear market. That is why the stock turnover rate, as a measure of liquidity, is incorporated as one of the indicators in this model.
The financing amount in the A-share market (FAM) includes the initial financing amount, the additional issuance financing amount and the allotment financing amount. Generally speaking, the more optimistic investors are, the more easily listed companies can obtain financing and the more financing they can raise, so the higher FAM is. Conversely, the more pessimistic investors are, the lower the success rate of obtaining financing and the lower FAM is. In short, FAM is a positive investor sentiment indicator.
The number of new investors opening accounts (NAV) is the number of new accounts opened in the stock market during a certain period. It reflects the enthusiasm of off-market (OTC) investors to join the stock market and their optimistic expectations of the future market. When investors tend to be optimistic, off-market investors are more likely to enter the market, so NAV increases. When investors tend to be pessimistic, NAV decreases, since off-market investors are not optimistic about the stock market.
Overbought/oversold (OBOS) is an index calculated as the difference between the number of rising stocks and the number of falling stocks over a certain period. Usually, the difference over ten days is used to measure and forecast the overall trend of the market. If OBOS is above 0, the market is in an overbought stage and investors may consider selling; if OBOS is below 0, the market is in an oversold stage and investors may consider buying.
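The ten-day calculation can be sketched as below. This assumes that "the difference within ten days" means the rolling sum of daily rising-stock counts minus the rolling sum of daily falling-stock counts; the function name `obos` and the input data are hypothetical.

```python
from typing import List, Sequence


def obos(advancers: Sequence[int], decliners: Sequence[int], window: int = 10) -> List[int]:
    """Rolling sum of (rising stocks - falling stocks) over `window` trading days.

    Assumption: the indicator is the windowed sum of daily net advances.
    The first value corresponds to the first day with a full window.
    """
    if len(advancers) != len(decliners):
        raise ValueError("advancers and decliners must have the same length")
    if window <= 0:
        raise ValueError("window must be positive")
    net = [a - d for a, d in zip(advancers, decliners)]
    return [sum(net[i - window + 1 : i + 1]) for i in range(window - 1, len(net))]


# Made-up daily counts, using a short window so the arithmetic is easy to follow.
values = obos([3, 5, 2, 6], [4, 1, 3, 2], window=3)
print(values)
```

Values above zero would indicate an overbought market and values below zero an oversold one, in line with the reading given in the text.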
Indirect indicators have clear advantages over direct indicators. They are much easier to construct, since they can be accessed through financial databases. Indirect indicators can reflect both the buying and selling power of investors and the influence of their sentiment. However, some problems remain when using indirect sentiment indicators. The indicators are endogenous to the market and to economic activity, so they may not exclusively measure investor sentiment. Because both direct and indirect measures have their pros and cons, composite variables are more popular and will be applied in this study.
2.4 Research & Hypotheses
To explain why asset prices persistently deviate from intrinsic value, De Long (1986) presented an overlapping generations model and proposed the noise trading theory. The unpredictability of noise traders’ beliefs creates a risk in the price of the asset. As a result, prices can diverge significantly from intrinsic values even in the absence of fundamental risk. The model sheds light on a number of financial anomalies and prompted further discussion among many scholars. It is generally recognized that investors are influenced by market sentiment, which is why neither rational trading nor arbitrage can make the asset price equal to intrinsic value.
H1: Direct investor sentiment proxy is a better measure for investor sentiment.
The first step of my study is to determine the composition of the investor sentiment indicator, which is the main variable in this paper. In recent papers, different measures are applied to construct investor sentiment. Lily Qiu and Ivo Welch (2004) compare investor sentiment measures based on consumer confidence surveys with measures extracted from the discount rate of closed-end funds. They find that the discount rate of closed-end funds is not a suitable single measure of sentiment, whereas the consumer confidence measure explains closed-end fund activity and investor sentiment well. This measure relies less on sentiment theory and more on direct survey outcomes.
As a result, the construction of investor sentiment in this paper combines the investor confidence index, the discount rate of closed-end funds and other indicators. The first reason is to obtain a more accurate indicator to represent investor sentiment. The second is to see whether indirect indicators carry more weight than direct indicators in the investor sentiment index.
Brown and Cliff (2005) investigated investor sentiment and its relation to near-term stock market returns, with a focus on practical implementation. In their paper they separated direct and indirect measures and compared their abilities to predict returns. For direct measures, they concentrated on the whole market instead of individual stocks because of data limitations. Their result is consistent with many other papers in this area, which suggests that these indicators are effective sentiment measures. Their approach to direct and indirect measures is close to the one used in this paper.
There are various kinds of investor behavior, and stock markets worldwide differ from each other due to different economic policies. Much empirical research has been done in this field. One of the approaches is ‘bottom-up’, which uses biases in individual investor psychology, such as overconfidence and overoptimism, to explain how individual investors react to past returns. These models make predictions about patterns in market-wide investor sentiment, stock prices, and volume. A representative model, discussed by Hong and Stein (1999), relies on differences of opinion across investors, sometimes combined with short-sale constraints, to generate misvaluation. The model divides investors into two groups, “newswatchers” and “momentum traders”, each of which is boundedly rational and can only process a subset of information. Newswatchers make their decisions based on news about future prices, while momentum traders predict future prices from past price changes. It turns out that if there is any short-run underreaction to news on the part of one set of traders, there must eventually be overreaction in the longer run as well.
H2: Compared with the return of large-cap stocks, the effect of investor sentiment is larger on the return of small-cap stocks.
The main focus of many past papers is on the aggregate market level. In my paper, the point is to determine whether investor sentiment is size-sensitive. Han and Li (2017) used market capitalization as a proxy to test the power of investor sentiment in the cross section. They sorted stocks into large-, medium- and small-cap portfolios to represent different stocks. The result shows that the short-term momentum effect matters more for smaller stocks. While market sentiment serves as a contrarian predictor in the long run, the negative impact gets weaker from large-cap to small-cap stocks.
Malcolm Baker and Jeffrey Wurgler (2007) suggested a top-down approach, which focuses on the measurement of reduced-form, aggregate sentiment and traces its effects to market returns and individual stocks. The point is to explain which stocks are most likely to be affected by investor sentiment, rather than to pin down the level of stock price change that depends on sentiment. They combined six proxies to measure investor sentiment: detrended log turnover, number of IPOs, first-day return on IPOs, dividend premium, equity share in new issues, and CEFD (closed-end fund discount). Share turnover, IPO volume, IPO first-day returns and equity share in new issues are positively related to investor sentiment, and the remaining ones are negatively associated with investor sentiment. Stocks are sorted according to volatility. The conclusion is that the stocks of younger, smaller, more volatile, unprofitable companies may be more sensitive to investor sentiment. ‘Bond-like’ stocks are less driven by sentiment. It also shows that sentiment can predict stock returns to some extent. When sentiment is high, market returns are subsequently lower.
H3: The overall effect of investor sentiment on stock return is positive.
In order to study the lead-lag structure between investor sentiment and stock prices, Kun Guo and Yi Sun (2016) collected investor comment data from a popular Chinese financial networking site called Snowball as a direct sentiment proxy. They proposed the thermal optimal path (TOP) method to determine the optimal ‘path’ for each sample variable. Their research indicates that investor sentiment does not always lead stock prices; it does so only when a stock has high investor attention. This suggests that the predictive power of investor sentiment is not yet settled.
Delong and Summers (1990) found that one of the strongest behavioral tendencies is to chase the trend. The ‘bandwagon effect’ suggests that sentiment traders reinforce each other by hopping on the bandwagon. Noise traders herd in and out, leading prices to deviate further from their fundamental value. This also causes sentiment to rise higher and higher over time and, more dramatically, to persist for periods of time. Based on this finding, Xing Han and Youwei Li (2017) suggest that a high level of sentiment leads to high stock returns in subsequent periods. They formed their sentiment index with the PCA method using the market turnover rate, new investor accounts and the value-weighted PE ratio. Their results show that the effect of investor sentiment on stock return is positive, and that the strong positive predictability is more pronounced and long-lasting for small-cap stocks than for large-cap stocks.
3. Investor Sentiment Index Design
In this paper, I use principal component analysis (PCA) to construct the investor sentiment index. Principal component analysis is a quantitatively rigorous method for reducing a set of variables to a smaller number of components, which improves model efficiency. A typical principal component analysis contains four steps. First, carry out a descriptive analysis of the data and standardize the variables so that they share a common unit. Second, examine the correlation between the variables to verify whether they are suitable for the analysis. Third, calculate the loading matrix of each factor and obtain the principal component loading matrix, in order to express the principal components in terms of the standardized variables. Fourth, take the variance contribution rate of each component as its weight to calculate the expression of the target variable.
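The four steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses NumPy, synthetic data in place of the sentiment proxies, unrotated components (no varimax rotation), and a hypothetical helper name `pca_composite_index`. It is not the exact procedure used to produce the tables in this chapter.

```python
import numpy as np

def pca_composite_index(data, min_eigenvalue=1.0):
    """Variance-weighted composite of the retained principal components."""
    x = np.asarray(data, dtype=float)
    # Step 1: standardize each column (zero mean, unit variance).
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # Step 2: correlation matrix of the variables.
    corr = np.corrcoef(z, rowvar=False)
    # Step 3: eigen-decomposition; eigh returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > min_eigenvalue          # retain eigenvalues above the cut-off
    scores = z @ eigvecs[:, keep]            # component scores per observation
    # Step 4: weight each retained component by its share of total variance.
    weights = eigvals[keep] / eigvals.sum()
    return scores @ weights, eigvals, weights

# Synthetic example: four noisy proxies driven by one common factor.
rng = np.random.default_rng(0)
common = rng.normal(size=144)
toy = np.column_stack([common + 0.5 * rng.normal(size=144) for _ in range(4)])
index, eigvals, weights = pca_composite_index(toy)
```

The cut-off of 1 for the eigenvalues mirrors the retention rule applied later in this chapter; the weights are the variance shares of the retained components.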
Based on past papers, this study replicates the principal component analysis model of Baker and Wurgler (2007), which creates an investor sentiment index by extracting principal components from a group of proxy variables. For this model I collect the financial indicators of all 3121 A-share stocks from the CSMAR, RESSET, CSCD and Wind databases. CSMAR (China Stock Market & Accounting Research Database) is currently the largest, most accurate and most comprehensive financial and economic database in China. It is a research-oriented database developed by Shenzhen Xishima Data Technology Co., drawing on the professional standards of foreign academic databases such as CRSP, TAQ and Thomson. It covers 18 series and more than 160 sub-libraries, including green economy, stocks, companies, overseas, funds, bonds, industries and economy. RESSET is a database designed by C9 League experts which provides strong professional support for financial model testing and investment research. CSCD (Chinese Science Citation Database) is known as ‘China’s SCI’. It has recorded more than 3 million papers and 17 million citations since its foundation in 1989, and it provides comprehensive academic papers across a wide range of fields such as mathematics, physics, chemistry, astronomy and geoscience. Wind is the most comprehensive Chinese financial database, developed by Shanghai Wind Information Co. Ltd. It contains more than 300 data sources and 1.5 million financial indicators, and provides up-to-date financial data such as real-time quotes, news and investment strategy throughout the day.
The sample period spans from January 2009 to December 2020, since ICI starts from April 2008. Each indicator has pros and cons in comparison to the others. However, each indicator captures sentiment from a specific point of view in the financial market. The choice of indicators is based on the following reasons. First, given that both direct and indirect measures have been used in previous studies of foreign markets, it remains unclear which indicators measure sentiment better, and to what extent they carry over to the Chinese market. Combining direct and indirect indicators takes multiple sources of information into account and should better reflect the change in investor sentiment during past years. Second, one must adapt the set of proxy variables to local conditions when constructing a sentiment index in different countries, since the indices published by different countries and their market rules differ. Due to the particularity of
The key issue with the principal component method is how to choose proxy variables. As discussed in the previous section, the composite investor sentiment index is constructed by principal component analysis, using the following indicators: discounted rate of closed-end fund, stock turnover rate, new account volume, over bought over sold index, financing-amount in A-share market and investor confidence index. The number of IPOs, which was adopted in Baker and Wurgler’s model, is not introduced in this model because it is not decided by the market but strictly controlled by the China Securities Regulatory Commission. The indicators are defined as follows:
Table 3-1 Indicators
Name    Variable    Calculation    Origin
Discounted rate of closed-end fund    DCEF    $DCEF_t = \frac{\sum_{i=1}^{n} (P_{it} - NAV_{it}) \, N_i}{\sum_{i=1}^{n} N_i \, NAV_{it}}$    CSMAR
Stock turnover rate    TURN    $TURN = \frac{\text{monthly trading volume} \times \text{average trading days}}{TSO \times \text{trading days}}$    RESSET
New account volume    NIA    $NIA_t = AV_{t+1} - AV_t$    CSCD
Over bought over sold    OBOS    $OBOS = \overline{OBOS}_{10} \times \text{average trading days}$    WIND
Financing-amount in A-share market    FAM    FAM    WIND
Investor confidence index    ICI    Research statistic    SIPF
In the formula of DCEF, n is the number of closed-end funds, $P_{it}$ is the closing price of fund i on the last trading day of the month, and $NAV_{it}$ is the average NAV on the last trading day of the month. $\overline{OBOS}_{10}$ is the difference between the number of stocks going up and the number falling in the past ten days.
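As a worked illustration of the DCEF formula in Table 3-1, the sketch below computes the weighted discount for three hypothetical funds. The function name and all input values are illustrative, and the weight N_i is assumed here to be the number of shares of fund i, which the table does not state explicitly.

```python
def closed_end_fund_discount(prices, navs, shares):
    """DCEF = sum((P_i - NAV_i) * N_i) / sum(N_i * NAV_i) for one month."""
    if not (len(prices) == len(navs) == len(shares)):
        raise ValueError("inputs must have the same length")
    numerator = sum((p - v) * n for p, v, n in zip(prices, navs, shares))
    denominator = sum(n * v for v, n in zip(navs, shares))
    return numerator / denominator

# Hypothetical month-end data for three funds (illustrative values only).
prices = [0.90, 1.80, 0.95]      # closing prices P_i
navs = [1.00, 2.00, 1.00]        # net asset values NAV_i
shares = [100.0, 50.0, 200.0]    # assumed weights N_i (fund shares)
dcef = closed_end_fund_discount(prices, navs, shares)
```

A negative value means the funds trade below their net asset value on average, which is the sign observed for DCEF throughout the sample.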
3.1 Descriptive Statistical Analysis
As is shown in Table 3-2, the average, maximum and skewness of DCEF are all lower than zero, which means China’s closed-end funds have traded at a discount for a long time. This can to some extent explain the recession phenomena in the stock market. It also implies that buying closed-end funds is a good choice in the long run. However, as is shown in the table, the average of ICI is 54.87, with a minimum of 29.81. This suggests that investors are mostly overconfident and overoptimistic even when the Chinese stock market is depressed. When investors are overconfident, they tend to trade more, which causes a high turnover rate.
Table 3-2 Descriptive Analysis
Var. Num. Min. Max. Aver. Std. Skew. Kurt.
DCEF 144 -0.270 -0.010 -0.097 0.055 -1.146 1.290
TURN 144 0.090 0.680 0.247 0.116 1.389 1.839
NIA 144 9.190 165.840 39.799 26.460 2.440 8.717
FAM 144 90.390 3686.230 956.861 663.612 1.262 2.043
OBOS 144 -14.850 18.470 6.820 7.154 -0.679 -0.029
ICI 144 29.810 137.890 54.866 18.975 1.922 6.306
3.2 Determine Lead and Lag Variables
Different sentiment indicators may lead or lag one another in time. Yanhui and Hanhui (2020) introduced first-order lag terms of NIA, margin balance, TURN and investor attention in their PCA to test the lead-lag structure of investor sentiment. They found that the lagged values of NIA, margin balance and TURN are more closely related to investor sentiment than the current values. Therefore, to replicate this process, this paper includes the six indicators of the current period, namely DCEF, TURN, NIA, FAM, OBOS and ICI, together with the same indicators one period in advance, namely DCEF_t, TURN_t, NIA_t, FAM_t, OBOS_t and ICI_t, in the principal component analysis.
Firstly, I ran the KMO and Bartlett’s tests to check whether the variables are independent. If the KMO coefficient is close to 1, the correlation between the variables is strong and the partial correlation is weak, which means the variables are suitable for factor analysis. If the KMO statistic is lower than 0.5, another method should be used instead of factor analysis. As is shown in Table 3-3, the KMO statistic is 0.670 with a Bartlett significance lower than 0.001. This means the correlation between the twelve indicators is high and factor analysis is applicable in this model.
Table 3-3 KMO and Bartlett’s Test
KMO statistic 0.747
Appr. Chi-square 851.136
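For readers who want to see what the two statistics measure, the following sketch computes the overall KMO measure and Bartlett's sphericity statistic with NumPy, using the standard textbook formulas. The data are synthetic and the function name `kmo_and_bartlett` is hypothetical; the figures in Table 3-3 come from the author's statistical software, not from this code.

```python
import numpy as np

def kmo_and_bartlett(data):
    """Return (KMO, Bartlett chi-square, degrees of freedom) for a data matrix."""
    x = np.asarray(data, dtype=float)
    n, p = x.shape
    corr = np.corrcoef(x, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations follow from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diag = ~np.eye(p, dtype=bool)
    r2 = np.sum(corr[off_diag] ** 2)       # squared simple correlations
    q2 = np.sum(partial[off_diag] ** 2)    # squared partial correlations
    kmo = r2 / (r2 + q2)
    # Bartlett: chi2 = -(n - 1 - (2p + 5) / 6) * ln|R|, df = p(p - 1) / 2.
    _, logdet = np.linalg.slogdet(corr)
    chi_square = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return kmo, chi_square, dof

# Synthetic example: four proxies sharing one common factor, 144 months.
rng = np.random.default_rng(1)
factor = rng.normal(size=144)
toy = np.column_stack([factor + rng.normal(size=144) for _ in range(4)])
kmo, chi_square, dof = kmo_and_bartlett(toy)
```

KMO approaches 1 when the partial correlations are small relative to the simple correlations, which is the situation described in the paragraph above.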
Secondly, I used principal component analysis to reduce the twelve variables to three factors, as is shown in Table 3-4. The variance in the correlation matrix is redistributed over 12 eigenvalues. Each eigenvalue represents the amount of variance captured by one component. When the components are ordered by eigenvalue, it is clear that some of them can be ignored since they explain little variance. The first three components have eigenvalues greater than 1, and together they explain 76.421% of total variation.
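The retention rule can be checked directly against the eigenvalues reported in Table 3-4. The short sketch below is only an arithmetic illustration; it assumes that total variance equals the number of standardized variables (12), so small rounding differences from the table are expected.

```python
# Initial eigenvalues as reported in Table 3-4.
eigenvalues = [5.151, 2.873, 1.146, 0.942, 0.601, 0.500,
               0.339, 0.272, 0.104, 0.045, 0.021, 0.004]

n_variables = len(eigenvalues)  # trace of a 12 x 12 correlation matrix
retained = [ev for ev in eigenvalues if ev > 1.0]          # eigenvalue > 1 rule
shares = [100.0 * ev / n_variables for ev in retained]     # % of variance each
cumulative = sum(shares)                                   # cumulative % explained

print(len(retained), [round(s, 2) for s in shares], round(cumulative, 2))
```

Three components pass the cut-off, and their cumulative share agrees with the 76.421% in the table up to rounding of the printed eigenvalues.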
Table 3-4 Total Variance Explained
Component; Initial Eigenvalues (Total, % of Variance, Cumulative %); Extraction Sums of Squared Loadings (Total, % of Variance, Cumulative %); Rotation Sums of Squared Loadings (Total, % of Variance, Cumulative %)
1 5.151 42.929 42.929 5.151 42.929 42.929 4.909 40.912 40.912
2 2.873 23.941 66.870 2.873 23.941 66.870 2.857 23.809 64.721
3 1.146 9.551 76.421 1.146 9.551 76.421 1.404 11.700 76.421
4 0.942 7.851 84.273
5 0.601 5.010 89.283
6 0.500 4.170 93.453
7 0.339 2.827 96.280
8 0.272 2.269 98.549
9 0.104 0.866 99.416
10 0.045 0.378 99.794
11 0.021 0.172 99.966
12 0.004 0.034 100.000
Extraction Method: Principal Component Analysis
The data after factor rotation is shown in Table 3-5:
Table 3-5 Rotated Component Matrix
Variable Component 1 Component 2 Component 3
NIA_t 0.939 0.061 -0.086
ICI_t 0.917 0.182 -0.110
TURN_t 0.840 -0.273 0.128
NIA 0.837 0.084 0.360
ICI 0.828 0.206 0.322
TURN 0.767 -0.278 0.428
DCEF -0.209 0.936 -0.054
DCEF_t -0.199 0.928 -0.041
FAM_t 0.424 0.671 -0.097
FAM 0.424 0.655 0.144
OBOS -0.063 -0.025 0.825
OBOS_t 0.246 -0.011 0.487
Extraction Method: Principal Component Analysis
Rotation Method: Varimax with Kaiser Normalization
a. Rotation converged in 4 iterations
Thirdly, weighting the three factors by their percentage of variance after rotation gives the preliminary investor sentiment index $ISI_1$:
$ISI_1 = 0.40912\,F_1 + 0.23809\,F_2 + 0.11700\,F_3$ (3.1)
In order to remove unnecessary variables, I carried out a correlation analysis between $ISI_1$ and the twelve sentiment variables. Table 3-6 shows that, for each indicator, one of the two versions (current or one period in advance) has the higher correlation with $ISI_1$: DCEF_t, TURN, NIA, FAM, OBOS_t and ICI. As a result, I used these six variables to run the principal component analysis again and obtain the index $ISI_2$.
Table 3-6 Correlation Analysis
Correlation DCEF TURN NIA FAM OBOS ICI
𝐼𝑆𝐼1 0.269** 0.611** 0.829** 0.710** 0.133 0.873**
Correlation DCEF_t TURN_t NIA_t FAM_t OBOS_t ICI_t
𝐼𝑆𝐼1 0.276** 0.602** 0.797** 0.660** 0.318** 0.832**
** means a 1% level of significance. * means a 5% level of significance.
3.3 Investor Sentiment Index
In this part, I use the six variables, namely DCEF_t, TURN, NIA, FAM, OBOS_t and ICI, to repeat the above PCA process and build the investor sentiment index ISI.
The KMO statistic is 0.626 with a Bartlett significance lower than 0.001, which means that factor analysis still applies. In the total variance explanation (Table 3-7), I chose the first two components, which have eigenvalues greater than 1 and together explain 74.69% of total variation.
Table 3-7 Total Variance Explained
Component; Initial Eigenvalues (Total, % of Variance, Cumulative %); Extraction Sums of Squared Loadings (Total, % of Variance, Cumulative %); Rotation Sums of Squared Loadings (Total, % of Variance, Cumulative %)
1 3.003 50.042 50.042 3.003 50.042 50.042 2.964 49.397 49.397
2 1.479 24.648 74.690 1.479 24.648 74.690 1.518 25.293 74.690
3 0.834 13.899 88.589
4 0.438 7.300 95.889
5 0.233 3.888 99.776
6 0.013 0.224 100.000
Extraction Method: Principal Component Analysis
The data after factor rotation is shown in Table 3-8:
Table 3-8 Rotated Component Matrix
Variable Component 1 Component 2
NIA 0.943 0.140
ICI 0.910 0.266
TURN 0.877 -0.246
OBOS_t 0.490 -0.018
DCEF_t -0.253 0.891
FAM 0.416 0.757
Extraction Method: Principal Component Analysis
Rotation Method: Varimax with Kaiser Normalization
a. Rotation converged in 3 iterations
Weighting the two factors by their percentage of variance after rotation gives the final investor sentiment index ISI:
$ISI = 0.4940\,F_4 + 0.2530\,F_5$
$ISI = 0.5012\,NIA + 0.5168\,ICI + 0.3710\,TURN + 0.2375\,OBOS_t + 0.1004\,DCEF_t + 0.3970\,FAM$ (3.2)
The second line substitutes the rotated loadings of Table 3-8 into the first. For TURN, for example, 0.877 × 0.4940 − 0.246 × 0.2530 ≈ 0.3710.
4. Empirical Design and Analysis
4.1 Investor Sentiment Index Validity Test
Baker and Wurgler (2007) argued that an effective sentiment index should reflect true stock market volatility. In order to test the validity of this investor sentiment index, I first examine its correlation with SHI (Shanghai Composite Index) and SZI (Shenzhen Component Index). Figure 4-1 shows a trend chart that reflects the correlation. The chart shows that, although all three indices are highly volatile, ISI moves almost in line with SHI and SZI, which means that the investor sentiment index in this paper has nearly the same trend as the stock market. It is able to reflect changes in market sentiment, which supports its validity.
Figure 4-1 Validity Test
[Figure omitted: two time-series panels over 2009 to 2020, with axis scales 20 to 140 and -0.3 to 0.3]
4.2 Stability Test
In order to test whether the investor sentiment index is stable (stationary), I carry out a unit root test in this part. The augmented Dickey-Fuller (ADF) test checks stability by testing whether there is a unit root in the series. The null hypothesis is that there is at least one unit root in the series, and the alternative hypothesis is that the series is stable.
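The logic of the test can be illustrated with a minimal Dickey-Fuller regression that includes a constant but no trend and no lagged differences, which corresponds to the (1, 0, 0) specification reported in Table 4-2. The sketch below uses only the Python standard library and a simulated stationary AR(1) series; the function name is hypothetical, and the 5% critical value is simply the one printed in Table 4-2 rather than one derived here.

```python
import math
import random

def dickey_fuller_t(series):
    """t-statistic of rho in  dy_t = c + rho * y_{t-1} + e_t."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_y = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, diffs))
    rho = sxy / sxx                               # OLS slope on the lagged level
    intercept = mean_y - rho * mean_x
    sse = sum((y - intercept - rho * x) ** 2 for x, y in zip(lagged, diffs))
    std_err = math.sqrt(sse / (n - 2) / sxx)      # standard error of rho
    return rho / std_err

# Simulated stationary series: y_t = 0.5 * y_{t-1} + noise, 144 observations.
random.seed(42)
toy = [0.0]
for _ in range(143):
    toy.append(0.5 * toy[-1] + random.gauss(0.0, 1.0))

t_stat = dickey_fuller_t(toy)
CRITICAL_5PCT = -2.882  # 5% critical value as reported in Table 4-2
rejects_unit_root = t_stat < CRITICAL_5PCT
```

A test statistic below the critical value rejects the unit-root null, which is the decision rule applied to ISI, LargeSAR and SmallSAR in Table 4-2.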
As stated above, I choose monthly returns of the A-share composite market as the stock market return index. In particular, to test the different influences of the investor sentiment index on large-cap and small-cap stocks, I rank the stocks by their capitalization, choosing the top 30% of the stocks as large-cap stocks (LargeSAR) and the bottom 30% as small-cap stocks (SmallSAR). Large-cap stocks represent sentiment-immune stocks, while small-cap stocks represent sentiment-prone stocks. The data of LargeSAR and SmallSAR come from CSMAR. The result of the ADF test is listed below in Table 4-2.
Table 4-2 Stability Test
Variable D (C, T, K)* ADF 1% 5% p-value result
ISI 0 (1, 0, 0) -4.611 -3.476 -2.882 0.0002 stable
LargeSAR 0 (1, 0, 0) -10.910 -3.476 -2.882 0.0000 stable
SmallSAR 0 (1, 0, 0) -10.446 -3.476 -2.882 0.0000 stable
* Δ means first order difference; (C, T, K) mean the constant term, time trend term and maximum lag term in the equation; C (or T) = 0 means there is no constant (time trend) term.
The result shows that we can reject the null hypothesis at a level of 5%, which means the sequences are stable and thus they can be used to build the VAR model.
4.3 Large-cap Stock VAR Model
To find out the different influences of investor sentiment on large-cap and small-cap stocks, I build one vector autoregression (VAR) model on ISI and LargeSAR, and another one on ISI and SmallSAR.
4.3.1 Optimal Lag Order
The optimal lag order is determined from the LR, FPE, AIC, SC and HQ criteria, with the log-likelihood (LogL) reported alongside them. A greater lag order leaves fewer degrees of freedom.
Table 4-3 Optimal Lag Order
Lag LogL LR FPE AIC SC HQ
0 -438.811 NA 1.862 6.297 6.339 6.314
1 -366.761 141.012 0.704 5.325 5.451* 5.376*
2 -362.045 9.095 0.697 5.315 5.525 5.400
3 -354.354 14.613 0.661* 5.262* 5.556 5.382
4 -352.068 4.278 0.678 5.287 5.665 5.440
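The choices implied by the minimizing criteria in Table 4-3 can be reproduced with a few lines of Python. The sketch below copies the FPE, AIC, SC and HQ columns from the table and picks the lag with the smallest value for each; LR is left out because it is based on sequential testing rather than on a minimum. The dictionary layout is just one convenient representation.

```python
# FPE, AIC, SC and HQ by lag order, copied from Table 4-3.
criteria_by_lag = {
    0: {"FPE": 1.862, "AIC": 6.297, "SC": 6.339, "HQ": 6.314},
    1: {"FPE": 0.704, "AIC": 5.325, "SC": 5.451, "HQ": 5.376},
    2: {"FPE": 0.697, "AIC": 5.315, "SC": 5.525, "HQ": 5.400},
    3: {"FPE": 0.661, "AIC": 5.262, "SC": 5.556, "HQ": 5.382},
    4: {"FPE": 0.678, "AIC": 5.287, "SC": 5.665, "HQ": 5.440},
}

def best_lag(criterion):
    """Lag order with the smallest value of the given criterion."""
    return min(criteria_by_lag, key=lambda lag: criteria_by_lag[lag][criterion])

choices = {name: best_lag(name) for name in ("FPE", "AIC", "SC", "HQ")}
print(choices)  # {'FPE': 3, 'AIC': 3, 'SC': 1, 'HQ': 1}
```

The split between lag 3 (FPE, AIC) and lag 1 (SC, HQ) reflects the heavier penalty that SC and HQ place on additional parameters.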
Among these criteria, LR, FPE and AIC choose the third order, while SC and HQ choose the first order. I choose the second order as a compromise and obtain the VAR model below:
$LargeSAR_t = 0.00019\,ISI_{t-1} - 0.00061\,ISI_{t-2} + 0.06954\,LargeSAR_{t-1} - 0.05153\,LargeSAR_{t-2} + 0.03160$ (4.1)
$ISI_t = 0.57114\,ISI_{t-1} + 0.18469\,ISI_{t-2} + 63.31278\,LargeSAR_{t-1} - 3.82515\,LargeSAR_{t-2} + 13.06332$ (4.2)
Figure 4-4 Stability Test (Inverse Roots of AR Characteristic Polynomial)
[Figure omitted: inverse roots plotted against the unit circle, both axes from -1.5 to 1.5]
From the model we can see that the return of large-cap stocks has a lagged effect on the investor sentiment index. To test the stability of this VAR model, I plot the inverse roots of the AR characteristic polynomial, as shown in Figure 4-4. If all points fall within the unit circle, the model is stable. This model is stable since all points fall within the unit circle.
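The same stability condition can be checked numerically from the estimated coefficients. The sketch below builds the companion matrix of the VAR(2) in equations (4.1) and (4.2) with NumPy and inspects the moduli of its eigenvalues, which are the inverse roots plotted in Figure 4-4. It assumes that the last LargeSAR term in (4.1) is the second lag, and the variable ordering [LargeSAR, ISI] is a choice made here for illustration.

```python
import numpy as np

# Lag coefficient matrices, variable order [LargeSAR, ISI].
# Row 1 comes from equation (4.1), row 2 from equation (4.2).
A1 = np.array([[0.06954, 0.00019],
               [63.31278, 0.57114]])
A2 = np.array([[-0.05153, -0.00061],   # assumes the -0.05153 term is lag 2
               [-3.82515, 0.18469]])

# Companion form of a VAR(2): [[A1, A2], [I, 0]].
companion = np.block([[A1, A2],
                      [np.eye(2), np.zeros((2, 2))]])

moduli = np.abs(np.linalg.eigvals(companion))
is_stable = bool(np.all(moduli < 1.0))  # all inverse roots inside the unit circle
```

With these coefficients every modulus is below one, in line with the graphical result.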
4.3.2 Granger Causality Test
To further study the interrelation between investor sentiment and large-cap stock returns, I carry out a Granger causality test on LargeSAR and ISI. Granger causality, introduced by Granger in 1969, is a ‘forecasting’ notion of causality. It does not identify a true causal relation, only a predictive relation at the statistical level. If a variable is affected by the lags of another variable, the latter is said to Granger cause the former. The null hypothesis is ‘X does not Granger cause Y’. If the p-value of the test is below 0.05, the null hypothesis can be rejected and X Granger causes Y. If the p-value exceeds 0.05, the null cannot be rejected and X does not Granger cause Y. Furthermore, if X and Y Granger cause each other, there is bidirectional causality between X and Y. If the relation holds in only one direction, there is unidirectional causality between X and Y. The result is shown in Table 4-5.
Table 4-5 Granger Causality Test
H0 Chi-sq df Prob.
LARGESAR does not Granger cause ISI 17.889 2 0.0001
ISI does not Granger cause LARGESAR 2.152 2 0.3409
From the table we can see that LargeSAR Granger causes ISI but ISI does not Granger cause LargeSAR, so there is unidirectional causality running from LargeSAR to ISI. At a significance level of 10%, changes in the return of large-cap stocks help to predict changes in the investor sentiment index, whereas changes in investor sentiment do not help to predict changes in the return of large-cap stocks. This result is consistent with model (4.2).
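The p-values in Table 4-5 can be recovered from the chi-square statistics, because the survival function of a chi-square variable with 2 degrees of freedom has the closed form exp(-x/2). The sketch below uses only the standard library; the 0.05 threshold is the decision rule described above, and small differences in the last digit come from the rounding of the reported statistics.

```python
import math

def chi2_pvalue_2df(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)

# Chi-square statistics as reported in Table 4-5 (df = 2 in both cases).
statistics = {
    "LARGESAR does not Granger cause ISI": 17.889,
    "ISI does not Granger cause LARGESAR": 2.152,
}

p_values = {h0: chi2_pvalue_2df(stat) for h0, stat in statistics.items()}
rejected = {h0: p < 0.05 for h0, p in p_values.items()}

for h0, p in p_values.items():
    print(f"{h0}: p = {p:.4f}, reject H0: {rejected[h0]}")
```

Only the first null hypothesis is rejected, which is the unidirectional pattern discussed in the text.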
4.3.3 Impulse Response Function
In practice, the coefficients of a vector autoregression model only reflect a partial dynamic relationship. To further examine the interrelation and the overall influence between ISI and LargeSAR, I compute the impulse response function (IRF), which reflects the dynamic influences more comprehensively. The impulse response function traces the response of each variable in the system to an orthogonalized innovation in one of its components, so that the effect of a one-standard-deviation orthogonalized shock on future periods can be clearly seen. Figure 4-6 and Figure 4-7 show the responses of ISI and LargeSAR to one-standard-deviation shocks in ISI and LargeSAR.
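To make the mechanics concrete, the sketch below iterates the VAR(2) coefficients of equations (4.1) and (4.2) to obtain simple, non-orthogonalized responses to a unit shock. This is a simplification: the figures in this section use Cholesky one-standard-deviation shocks, which require the residual covariance matrix that is not reported here, so the magnitudes are not comparable with the figures. The code also assumes that the last LargeSAR term in (4.1) is the second lag.

```python
def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(a, b):
    """Element-wise sum of two 2 x 2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

# Variable order [LargeSAR, ISI]; rows from equations (4.1) and (4.2).
A1 = [[0.06954, 0.00019], [63.31278, 0.57114]]
A2 = [[-0.05153, -0.00061], [-3.82515, 0.18469]]

def impulse_responses(horizon):
    """Phi_0 = I and Phi_h = A1 Phi_{h-1} + A2 Phi_{h-2};
    Phi_h[i][j] is the response of variable i, h periods after a unit shock to j."""
    phi = [[[1.0, 0.0], [0.0, 1.0]]]
    for h in range(1, horizon + 1):
        step = matmul(A1, phi[h - 1])
        if h >= 2:
            step = matadd(step, matmul(A2, phi[h - 2]))
        phi.append(step)
    return phi

LARGE, ISI = 0, 1
phi = impulse_responses(10)
isi_to_large = [phi[h][ISI][LARGE] for h in range(11)]  # ISI after a LargeSAR shock
```

In this unit-shock version the response of ISI is zero on impact, appears one period later through the lagged LargeSAR coefficient, and then fades as the horizon grows.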
The left side of Figure 4-6 shows the response of ISI to a shock in itself. After a positive shock in the current period, ISI shows its largest positive response in the first period, which then gradually declines and reaches 0 around the tenth period.
The right side of Figure 4-6 shows the response of ISI to a shock in LargeSAR. After a positive shock in the current period, ISI does not show an immediate fluctuation in the first period. The positive response increases between the first and second period, declines gradually after the third period and converges to 0 after the tenth period. This means that a shock to LargeSAR causes a positive fluctuation in ISI in the short term, and that the impulse response tends to converge to zero in the long term.
Figure 4-6 Response to Cholesky One S.D. (d.f. adjusted) Innovations ± 2 S.E.
[Figure omitted: left panel, Response of ISI to ISI; right panel, Response of ISI to LARGESAR; horizontal axis, periods 1 to 10]
The left side of Figure 4-7 shows the response of LargeSAR to a shock in ISI. After a positive shock in the current period, LargeSAR shows its largest positive response immediately in the first period, which gradually declines between the first and third period. There is even a slight negative response in the third period, after which the response fades and converges to 0 after the tenth period. This means that a shock to ISI causes a positive fluctuation in LargeSAR in the short term, and that the impulse response tends to converge to zero in the long term. The right side of Figure 4-7 shows the response of LargeSAR to a shock in itself. After a positive shock in the current period, LargeSAR shows an immediate fluctuation in the first period, which declines gradually to zero around the fifth period.
Figure 4-7 Response to Cholesky One S.D. (d.f. adjusted) Innovations ± 2 S.E.
[Figure omitted: left panel, Response of LARGESAR to ISI; right panel, Response of LARGESAR to LARGESAR; horizontal axis, periods 1 to 10]
4.3.4 Variance Decomposition
Variance decomposition is usually applied together with the impulse response function to show the contribution rate of each shock. Table 4-8 and Table 4-9 show the results of the variance decomposition of ISI and LargeSAR.
As is shown in Table 4-8, the contribution rate of LargeSAR to ISI is 0 in the first period and gradually increases. By the tenth period, ISI explains 91.003% of its own forecast variance, and LargeSAR explains 8.997%. This means that in the long run, changes in ISI are mainly driven by ISI itself.
Table 4-8 Variance Decomposition of ISI
Period S.E. ISI LARGESAR
1 12.047 100.000 0.0000
2 15.426 92.323 7.677
3 17.253 91.797 8.203
4 18.172 91.345 8.655
5 18.695 91.202 8.798
6 18.993 91.107 8.893
7 19.167 91.058 8.942
8 19.268 91.029 8.971
9 19.327 91.013 8.987
10 19.361 91.003 8.997
As is shown in Table 4-9, the contribution rate of ISI to LargeSAR is 14.424% in the first period and gradually increases. By the tenth period, ISI explains 16.048% of the forecast variance of LargeSAR, and LargeSAR explains 83.952%. This means that in the long run, changes in LargeSAR are mainly driven by LargeSAR itself.
Table 4-9 Variance Decomposition of LARGESAR
Period S.E. ISI LARGESAR
1 0.0730 14.424 85.576
2 0.0732 14.651 85.349
3 0.0736 15.366 84.634
4 0.07378 15.668 84.332
5 0.07387 15.842 84.158
6 0.07392 15.932 84.068
7 0.07395 15.986 84.014
8 0.07396 16.018 83.982
9 0.07397 16.037 83.963
10 0.07398 16.048 83.952
4.4 Small-cap Stock VAR Model
Lee (1991) first stated that investor sentiment is more of a size story: small firms are influenced more than large firms. Han and Li (2017) found that the short-term momentum effect matters more for small firms. In this part, I’m going