Do word choices in the Management Discussion and Analysis by bank managers reveal prior
knowledge about the level of risk taking prior to the financial crisis?
MSc. Thesis
Name: Jhonny Yep Sang Xian
Student number: 10015280
Study: MSc. Business Economics
Field: Finance
Supervisor: Dr. Jochem, Torsten
Statement of Originality
This document is written by Jhonny Yep Sang Xian who declares to take full responsibility for
the contents of this document.
I declare that the text and the work presented in this document are original and that no sources
other than those mentioned in the text and its references have been used in creating it.
The Faculty of Economics and Business is responsible solely for the supervision of completion of
Abstract
This study explores whether the content of the Management Discussion and Analysis in disclosures
(10-Q and 10-K) contains valuable information about risk and inside information that goes
beyond financial measures. This linguistic research uses wordlists from the financial field to
analyze the tone of the disclosures. Mortgage originators and bank managers have been blamed
for their conduct ahead of the financial crisis, but sufficient evidence has been lacking. The study
finds that before the collapse of the economy, the tone of the disclosures changed compared to
earlier years. Statistically significant evidence is not found, but the results suggest that bank
managers were aware of the excessive risk they took, e.g. risky lending and originating mortgages
to non-creditworthy
Table of Contents
Section 1 Introduction ... 5
Section 2.1 Literature review ... 7
Section 3 Methodology ... 14
Section 3.1 The Wordlist ... 16
Section 3.2 Defining tone ... 17
Section 3.3 Dependent variable ... 18
Excessive risk taking level of delisted companies ... 19
The start of the financial crisis ... 21
Control variables ... 23
The regression ... 25
Section IV Data ... 25
Extracting Management Discussion and Analysis ... 26
Merging data ... 28
Section V Results ... 29
Section VI Conclusion ... 35
Section 1 Introduction
The valuation of a firm’s assets and the probability of future returns are based on quantitative
information provided in quarterly and annual reports. However, there is also research on
qualitative data that uses textual analysis to examine the sentiment of these reports. Negative
sentiment and tone in reports carry useful information: Loughran & McDonald (2010) show that
negative word classifications are significantly correlated with financial variables. Loughran &
McDonald (2013) search for predictors of future outcomes using the Initial Public Offering filings
submitted to the United States Securities and Exchange Commission. They investigate how the
level of uncertain words in these filings affects the returns of the IPO. Their research shows that
word choices have an influence on future outcomes: the higher the proportion of uncertain or
weak modal words, the more difficult it is for investors to assimilate value-relevant information,
which in turn influences investors’ expectations of firm value. This research continues previous
work in linguistic analysis by using the negative word classification to examine the tone of a
report and, on that basis, asks whether word choices reveal any prior knowledge. The data that
will be used in this
This research is important for the banking sector, with a particular focus on the housing
market, because it investigates whether word choices reveal prior private knowledge about
risk-taking levels; if so, the relevant supervisory committee can act on this. This study follows
other linguistic research based on performance returns and tone measures, but differs from
earlier research in that it focuses on one specific sector, namely the banking sector, where the
financial crisis was triggered. It is the sector in which banks initiated the subprime mortgage
market for liquidity. In line with earlier linguistic research, the aim is to analyze disclosures, or
the content of certain media, to learn how the banking sector functioned before the crisis. How
much do bank managers reveal about their knowledge of their actions? A distinction is made
between bank managers who were indeed aware of the level of risk taking and those who did not
participate in these risky lending programs. Do bank managers use more negative words in the
MD&A section than in earlier years? Does their prior knowledge, as reflected in the annual
reports of different banks, have an impact on future outcomes? Do word choices in the reports
reveal valuable information about risk that only insiders may have?
The remainder of this research is organized as follows. Section one introduces the linguistic
research and explains its relevance. Section two reviews previous research in the same field.
Section three describes the methodology and the setup of this research, and section four presents
the collected data and some descriptive statistics. Section five presents the main results and the
discussion, while section six is the end section of
Section 2.1 Literature review
This section reviews research that uses the tone of directors to analyze annual reports.
Tetlock (2007) was one of the first to quantify the interaction between the content of
the Wall Street Journal and the daily stock market, that is, the influence of media on the stock
market. Tetlock found evidence that the direction of stock market movements can be predicted
by news media content. First, a high level of pessimism in media content predicts downward
pressure on stock prices, which subsequently revert to their fundamental values. However, when
the price impact of pessimism is large, the reversion to fundamental value is slow. Second, low
values of media pessimism forecast higher market trading volume. Finally, low market returns
lead to higher media pessimism. These findings show that the level of pessimism in media
content has a significant downward influence on stock prices and also affects trading volume,
but the same does not hold for risk. Trading volume is affected only when there is exposure to
the media content. By contrast, pessimism in media content does not seem to be related to risk:
Tetlock (2007) did not find evidence that pessimism increases market volatility. Pessimism in
media content therefore only decreases stock market returns temporarily and does not increase
risk. However, Tetlock noted that this may be attributable to the word categories used in his
research, which were obtained from psychologists, who have a different interpretation of
negative sentiment. These word categories are referred to again later in this study, to which word
In line with Tetlock’s research on pessimism, Larcker & Zakolyukina (2012) conducted
linguistic research on the misstatement of financial statements by the CEO and/or CFO. Their
research is based on the discussions during quarterly earnings conference calls. Their model
labels each call as either “truthful” or “deceptive”. On the deceptive calls they perform a
linguistic test, which shows that deceptive executives make more references to general knowledge
and fewer references to shareholder value. Furthermore, deceptive CEOs use significantly more
extreme positive expressions and fewer anxiety words. This conflicts with what Tetlock (2007)
found, because deceptive executives not only conceal misstatements in the financial statements
by using extreme positive words, but also reduce their use of anxiety words, i.e. pessimism. The
emotion perspective described by Vrij (2008), however, hypothesizes that deceivers feel guilty
and are afraid of being caught in a deceptive act. In addition, Larcker & Zakolyukina (2012)
assume that the CEO and CFO know when financial statements have been manipulated, so that
testing the executives’ language can deliver valuable information about their prior deceptive
behavior. This is hard to control for when the CEO and CFO manipulate their word choices. In
the end, however, theories from psychology and linguistics support the view that liars are
generally more negative and use fewer self-references. The explanation is that liars feel
uncomfortable because they are lying and have never experienced what they claim; e.g. according
to Larcker & Zakolyukina (2012), a person who makes a claim about something he has never
experienced is put in a more negative position towards that claim.
non-verbal behavior during deception is also applicable to verbal context. One of the four
perspectives is the control perspective, which holds that liars intentionally avoid regretful
words. Liars are therefore very careful in their word choices, so that listeners do not easily
perceive the lies in the statement. This results in fewer self-references and short statements
with fewer details; as a substitute for the information that the liar does not want to expose, he
provides more irrelevant information. For example, liars speak with caution and use unique
words, which gives more lexical diversity, while truthful speakers often repeat their information,
which leads to less lexical diversity. In general, however, the linguistic and psychological
theories suggest that liars are more negative. We therefore expect deceptive executives to express
more negative sentiment in their word choices.
Kothari et al. (2008) found that the litigation risk that a firm faces under SEC Rule 10b-5 is
likely to motivate the manager to disclose bad news. This rule, together with other laws, states
that management has the duty to disclose material information when it becomes known to
management. In other words, when management knows material information, it is legally
required to disclose it, whether the news is bad or good.
Other research on tone in conference calls is that of Davis et al. (2011). They show that
firms manipulate the tone of corporate communication when earnings are announced. Relevant
to this thesis, they show that markets seem to be influenced by the manipulated tone in
corporate communication, which is the investor’s
(2012) is that the linguistic tone of the conference call is a significant predictor of abnormal
returns and trading volumes of a stock, and that the tone influences the 60 trading days after the
conference call.
As to what happened in the financial crisis: on the one hand, the number of creditworthy
borrowers was limited; on the other hand, mortgage lenders were able to easily lower the
underwriting standards of mortgages, which led them to originate riskier mortgages. When
non-creditworthy borrowers meet mortgage lenders that offer mortgages with lower standards,
it is not surprising that an imbalance results. The risky subprime mortgage market is the result
of the actions of different entities that might be held responsible, for example credit rating
agencies, underwriters and investors, but also lenders and central banks. It started when central
banks provided more liquidity by reducing interest rates, which in turn led investors to look for
riskier projects in order to obtain a higher return, according to Brunnermeier (2008). Mortgage
lenders took part in this by granting subprime mortgages to borrowers of poor creditworthiness;
the originators seemed willing to provide such borrowers with the funds to buy a house. Figure 1
displays the roles in the subprime mortgage market according to Ashcraft & Schuermann (2008).
Figure 2, from Ashcraft & Schuermann (2008), shows the top ten subprime mortgage originators,
all of which are multinational banking and financial services holding companies. Ashcraft &
Schuermann (2008) describe the subprime mortgage market in terms of frictions between
different parties, and they believe these frictions have been the reason for the
Products with a complex structure are offered to subprime borrowers, which leads them
to misunderstand the product or leads to misrepresentation.
Moral hazard refers to changes in behavior in response to a redistribution of risk. They
explain this friction as follows: insurance may lead to risk-taking behavior when the insured
does not bear the consequences of bad outcomes. In other words, the insured has limited
downside and is therefore not responsible for bad outcomes, which leads to moral hazard.
The principal-agent problem occurs when asset managers are evaluated against a
benchmark. Asset managers then have the incentive to reach for higher yields by buying
structured debt issues with the same credit rating but higher coupons. However, when
other asset managers underperform, the asset manager keeps the same portfolio but no
longer exercises the same effort.
Their study finds that incentives played a role in the breakdown of the subprime mortgage
market. Bank managers have limited downside, which may have encouraged them to take
excessive risk.
According to the research of Cziraki (2015), bank executives who were more exposed to
the housing market sold a larger proportion of their shares than bank executives who were
less exposed to the housing market. When housing prices first started to decline, bank
executives sold off a large proportion of their shares, because they understood the exposure of
their bank to housing prices. While Fahlenbrach & Stulz (2011) show in their research that there
in 2008, Cziraki (2015) divided bank executives into groups according to their exposure to the
housing market instead of treating them as one group, and started his study earlier, in 2006, to
examine their pattern of ownership. The main finding of the study offers a new perspective on
insider trading: the role of bank executives matters for insider trading related to the housing
market. When executives are differentiated by risk exposure to the housing market, insiders in
the high-exposure group are more likely to sell off stock than those in the low-exposure group,
by 20%. The study by Cziraki (2015) thus questions what executives and insiders do with the
information they gather beforehand. Whereas Cziraki (2015) sheds light on what insiders do with
the information they have, this study uses word analysis to examine whether executives tend to
use more negative words, and therefore a more negative tone, in their disclosures regarding the
excessive risk they take on the housing market and mortgage-backed loans.
The main finding of Feldman et al. (2009) that contributes to this research is that when
managers’ assessment of future prospects becomes more negative (positive), they use
more negative (positive) words in their disclosures.
One key element of this research is the bank manager, who controls the portfolio and
accounts for its exposures. The research by Cantrell (2013) focuses on the ability of bank
managers. It finds that bank managers with high ability provide higher-quality accounting.
A bank manager with higher ability is able to offer fair values of their
ability manager had a better estimate of the exposure in their securities portfolio. Better
estimates thus depend on the manager’s ability to handle and process information. Hence, the
prior information that the bank manager has will be taken into account in decision making. The
role of the bank manager, given his ability, affects what he controls. The portfolio that the bank
manager controls is in line with his ability and hence his predictive power. This suggests a
correlation between portfolio management and how a bank manager deals with inside
information. The inside information that the bank manager has will influence how he acts.
Section 2.2 Hypothesis development
Prior studies in this field of linguistic research suggest that company disclosures hide valuable
information. A long-standing literature suggests that there is a link between linguistic content
and firm performance. Regarding the collapse of the economy as a result of the subprime
mortgage market, an interesting question, besides the discussion of who is to blame, is whether
the so-called big players were aware of the excessive risk taking beforehand and, if so, how long
in advance. I develop a test that focuses on disclosures by banks, since they made up the
majority of the originators in the subprime mortgage market (see figure 2). By analyzing the
content of the banks’ disclosures, I relate it to the banks’ returns. Some
performance measurement that is comprehensive to use as an indicator how they perform after
information they possessed and if they do reveal some characteristics of risk before the financial
crisis took place (Cziraki (2015)). Risk is associated with negative feelings, and Larcker &
Zakolyukina (2012), using the theory of Vrij (2008), showed that even liars at the top of the
company, e.g. the CEO or CFO, would reveal the risk and not be able to hold in the lie. Feldman
et al. (2009) showed that if management is negative about future prospects, it uses more
negative words in the disclosure. Combining this evidence on management word choices,
performance measurement and the negative tone that reflects risk, disclosures are expected to be
negative, and when the disclosures hide some risk, bank performance is also expected to be
negatively associated.
Section 3 Methodology
The MD&A is an important section of the 10-K report, because these filings contain more details
and information than the annual report and are required by the SEC. The MD&A section is
interesting because in it management gives an overview of the operations of the past years and
describes the situation the company is in. Structurally, 10-Ks need to be filed within 60 days
after the end of the fiscal year. If this is not the case, the return of the security does not match
the 10-X. The focus is on the 10-Ks and the 10-Qs, since both reports are required to contain an
MD&A section. Excluding 10-Ks from the sample would bias the observations and, more
importantly, the 10-Qs make it possible to see whether the tone of management changed
compared to the previous 10-Q. The 10-Q gives a smaller time window before the occurrence of
the financial crisis. Although this section is unaudited, bank managers
Therefore, this section is essential for studying risk, because bank managers might
reveal some private information about risk exposures they were aware of.
To quantify the six word categories (uncertain, weak modal, negative, positive,
legal and strong modal), the same method is used as in Loughran & McDonald (2013). The
wordlists have been used in prior literature to determine the magnitude of the tone. The six
wordlists were created specifically for financial reports. The first step requires programming: all
six word categories are quantified using the parse function described by Loughran & McDonald
(2011). After parsing, we remove all ASCII parts consisting of non-alphabetic characters and then
summarize the matched words in a table. After defining the tone of the annual reports, we
estimate excess returns and the performance of the bank in relation to the tone of the firm.
Control variables, such as firm size and outstanding debt, are taken from the literature (Feldman,
Govindaraj, Livnat, and Segal, 2010). Both Tetlock (2007) and Loughran & McDonald (2011) note
that positive words are harder to measure by program, since positive words can be used in a
sentence that has a negative meaning. For example, “good” is a positive word, but in “not very
good” the words “not very” make the phrase negative. This research therefore focuses on
negativity instead of measuring positivity. This study covers SEC filings¹ that need to be
collected manually. One way to do this is to write a program in Visual Basic for Applications
(VBA), which is a supporting program. A VBA program can download all the listed SEC filings
automatically. After the download process is finished and all the files are collected, a second
program is needed specifically for extracting the MD&A part of the report from the 10-Xs. The
quarterly reports are from U.S. publicly listed firms, including the banking sector.
¹ SEC filings refers to the 10-K MD&A section.
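The parsing and counting step described above can be illustrated with a minimal sketch in Python. This is not the parse function of Loughran & McDonald; the function names `tokenize` and `count_categories` and the tiny wordlists are hypothetical and serve only to show the idea of stripping non-alphabetic characters and tallying words per category.

```python
import re

# Hypothetical mini wordlists for illustration only; the actual
# Loughran & McDonald lists contain far more words and six categories.
WORDLISTS = {
    "negative": {"loss", "losses", "impairment", "default"},
    "positive": {"gain", "improved"},
    "uncertain": {"may", "uncertain"},
}

def tokenize(text):
    """Lower-case the text and keep alphabetic tokens only (digits and symbols are dropped)."""
    return re.findall(r"[a-z]+", text.lower())

def count_categories(text, wordlists=WORDLISTS):
    """Return the number of tokens per word category plus the total token count."""
    tokens = tokenize(text)
    counts = {
        category: sum(1 for token in tokens if token in words)
        for category, words in wordlists.items()
    }
    counts["total"] = len(tokens)
    return counts

sample = "Losses may increase; impairment of $1,200 was recorded in 2007."
print(count_categories(sample))
```

In this sketch the dollar amount and the year are discarded by the tokenizer, so only alphabetic words enter the counts.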
Section 3.1 The Wordlist
This linguistic research is based on a basic computational approach that counts the frequency of
words. Most researchers measure the tone of a disclosure based on frequency counts of words,
sometimes assigning weights to the words. The 10-Ks and 10-Qs are treated as documents full of
words in which the sequence of the words no longer plays a role. This representation is called a
vector of term-frequency counts. To measure the tone of a document, a wordlist has to be
developed that separates positive words from negative words. Wordlists widely used in earlier
studies are Diction and the General Inquirer. Diction was developed by Roderick Hart and is used
for political communication, while the General Inquirer was developed by Philip Stone and has
been used for decades in social psychology. Because these wordlists are general English
dictionaries, they are not specific enough for the domain of financial disclosure. According to Li
(2010), tone classification based on Diction and the General Inquirer does not provide sufficient
accuracy. According to Henry & Leone (2009), general wordlists omit words that are considered
positive or negative in the financial domain, while including other words that should not be
included. This calls for another wordlist. Loughran & McDonald showed in their study that, of
the 10-Ks analyzed, almost three-fourths (73.8%) of the words that were identified as
contexts (Loughran & McDonald, 2008). Another wordlist is made available by Loughran &
McDonald. It is described in their paper “When Is a Liability Not a Liability? Textual Analysis,
Dictionaries, and 10-Ks” (2010). The wordlist was created by analyzing all 10-Ks filed from
1994 to 2007; words that occur in more than five percent of the analyzed documents are
classified into their six-category wordlist. Since I am doing research in the field of financial
reports, I use Loughran & McDonald’s wordlists, updated until March 2015, for analyzing the
10-Xs². The Loughran & McDonald wordlists have an advantage over other wordlists for this
research, because they do not require words to be collapsed to their roots, e.g. aardvark versus
aardvarks. The wordlist already includes both aardvark and aardvarks, so there is no need to
collapse aardvarks into aardvark.
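A small Python sketch may help to illustrate the two ideas in this section: a document reduced to a vector of term-frequency counts, and a wordlist that already contains inflected forms so that no stemming is required. The negative wordlist excerpt and the helper `term_frequencies` are hypothetical and are not taken from the Loughran & McDonald files.

```python
from collections import Counter

# Hypothetical excerpt of a negative wordlist that lists inflected forms explicitly,
# so "defaults" and "defaulted" do not need to be collapsed to "default".
NEGATIVE = {"default", "defaults", "defaulted", "loss", "losses"}

def term_frequencies(tokens):
    """Bag-of-words representation: word order is discarded, only counts remain."""
    return Counter(tokens)

tokens = ["the", "borrower", "defaulted", "and", "losses", "followed", "the", "defaults"]
tf = term_frequencies(tokens)

# Direct lookup against the wordlist, without any stemming step.
negative_count = sum(count for word, count in tf.items() if word in NEGATIVE)

# Reordering the tokens leaves the term-frequency vector unchanged.
same_vector = term_frequencies(list(reversed(tokens))) == tf
print(tf["the"], negative_count, same_vector)
```

Because every inflection is looked up as its own entry, a word that is missing from the list is simply not counted, which is the trade-off of avoiding stemming.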
Section 3.2 Defining tone
Davis et al. (2008) study the language of earnings press releases. They count
optimism-increasing and optimism-decreasing words. Their sentiment measure is the
difference between the percentage of optimistic words and the percentage of pessimistic words.
To compare sentiment before and after the earnings press release, they use the lagged difference
between the two periods.
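As an illustration of this sentiment measure, the following Python sketch computes the difference between the percentages of optimistic and pessimistic words and the change between two consecutive periods. The function names and the word counts are hypothetical; the sketch assumes that both percentages are taken relative to the total number of words in the release.

```python
def net_optimism(optimistic, pessimistic, total_words):
    """Percentage of optimistic words minus percentage of pessimistic words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * (optimistic - pessimistic) / total_words

def lagged_differences(series):
    """Change in the measure from each period to the next."""
    return [current - previous for previous, current in zip(series, series[1:])]

# Illustrative counts for two consecutive press releases (not real data).
tone = [net_optimism(30, 10, 1000), net_optimism(20, 25, 1000)]
print(tone, lagged_differences(tone))
```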
In this study, I apply only the word frequency measure, using the same method as Henry (2008).
While Henry used different wordlists, among others Diction and the General Inquirer, for
financial disclosures, I use the wordlist from the Loughran & McDonald Master Dictionary,
updated to March 2015.
² 10-X refers to the 10-Qs and the 10-Ks used in this research. I use as much as possible available
Doran et al. (2012) and Henry & Leone (2010) define the tone of earnings announcements using
the Henry measure in two ways. The first (H1) is the ratio of the number of positive words to the
number of negative words, expressed as follows:
$$H_1 = \frac{\#\,\text{Positive Words}}{\#\,\text{Negative Words}}$$
The second Henry measure (H2) is:
$$H_2 = \frac{\#\,\text{Positive Words} - \#\,\text{Negative Words}}{\#\,\text{Positive Words} + \#\,\text{Negative Words}}$$
So the tone is calculated as the number of positive words minus the number of negative
words, divided by the sum of the positive and negative words.
Loughran & McDonald’s Master Dictionary is an aggregate wordlist that is useful for textual
analysis in financial applications and has been extended to 10-K documents.
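The two Henry measures can be written as a short Python sketch. The function names are hypothetical, and the handling of documents without any tone words (returning zero for H2, raising an error for H1) is an assumption of this sketch rather than part of the cited studies.

```python
def henry_h1(positive, negative):
    """H1: number of positive words divided by number of negative words."""
    if negative == 0:
        raise ValueError("H1 is undefined when there are no negative words")
    return positive / negative

def henry_h2(positive, negative):
    """H2: (positive - negative) / (positive + negative), bounded between -1 and 1."""
    total = positive + negative
    if total == 0:
        return 0.0  # assumption: a document without tone words is treated as neutral
    return (positive - negative) / total

# Illustrative counts: 30 positive and 60 negative words.
print(henry_h1(30, 60), henry_h2(30, 60))
```

H2 has the practical advantage that it stays defined and bounded when a document contains no negative words, whereas H1 does not.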
Section 3.3 Dependent variable
Many researchers in the field of linguistic research, e.g. Engelberg et al. (2012), Feldman et al.
(2008) and Demers and Vega (2011), use the event study methodology and define the dependent
variable as the cumulative abnormal return (CAR) in a predetermined event window. The CAR is
precise and well suited to short-term event studies, or where the event window is
study faces performance measurement over a few years, and therefore the CAR is not
applicable. Numerous empirical studies have relied on the financial theory that underlies stock
return as a performance measure. This theory implies that the stock price reflects the
current status of the company. Using the stock return may therefore reflect the performance
during the crisis.
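To make the notion of stock return over a multi-year period concrete, the sketch below computes per-period simple returns and the return over the whole holding period from a price series. The exact return definition used in this thesis is not stated in this section, so the formulas, the function names and the prices are assumptions for illustration; dividends and other distributions are ignored.

```python
def simple_returns(prices):
    """Per-period simple returns from a series of prices."""
    return [later / earlier - 1.0 for earlier, later in zip(prices, prices[1:])]

def holding_period_return(prices):
    """Return over the whole period; equals the compounded per-period returns."""
    if len(prices) < 2:
        raise ValueError("at least two prices are needed")
    return prices[-1] / prices[0] - 1.0

# Illustrative year-end prices (not real data): the stock loses about 60% over two years.
prices = [50.0, 40.0, 20.0]
print(simple_returns(prices), holding_period_return(prices))
```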
Section 3.4 Excessive risk taking level of delisted companies
Cziraki (2015) discusses the role of insiders. He uses data from the years prior to the crisis to
measure how much insiders sell, after acquiring information, in response to stock price increases
or changes in the book-to-market ratio, that is, how many shares insiders sell off in a normal
scenario without being influenced by inside information. I add an equivalent component in order
to see the effect of inside information. Banks are differentiated according to whether they were
more affected by the financial crisis or were not, or only barely, affected by it. In other words,
banks that were delisted during this period are assumed to have taken excessive risk and were
therefore removed from the list of traded securities. A stock is removed from the exchange when
the security is no longer in compliance with the listing requirements. Consequently, the
expectation is that banks that suffered from the financial crisis use significantly more negative
words in the MD&A section, because they took more excessive risk. It is convenient to build a
treatment group consisting of banks that were affected during the financial crisis, and a control
group consisting of banks that were either not aware of the risks or did not participate in
excessive risk taking in mortgage-backed lending. To understand if
divide the banks into categories: delisted and not-delisted group. With these groups, the
excess return or the stock return over the time period can be measured. If the excess return is
stable over time, it can be assumed that no inside information leaked into the stock price.
Following Cziraki (2015), banks with CRSP³ delisting codes 500-599 (“dropped”) and 200-299
(“mergers”) are assigned to the low performance group, while the rest of the banks are assigned
to the high performance group. In summary, companies that were delisted during the crisis for
these reasons are treated as the group that did not survive the crisis and are assigned to the
non-survivor group. Banks that did survive the crisis are assigned to the survivor group. The
non-survivor group serves as the group that actually had excessive risk on its balance sheet,
because the crisis worsened their position to the point that they did not survive without external
help or ceased to exist because they violated the listing requirements.
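The grouping rule can be sketched in Python as follows. The function name `survival_group` is hypothetical, and treating a missing delisting code as a surviving bank is an assumption of this sketch; only the two code ranges are taken from the text.

```python
def survival_group(delisting_code):
    """Assign a bank to the survivor or non-survivor group from its CRSP delisting code.

    Codes 200-299 ("mergers") and 500-599 ("dropped") mark non-survivors.
    Assumption: a missing code (None) or any other code counts as a survivor.
    """
    if delisting_code is None:
        return "survivor"
    if 200 <= delisting_code <= 299 or 500 <= delisting_code <= 599:
        return "non-survivor"
    return "survivor"

# Illustrative codes only; the bank labels are placeholders, not sample firms.
banks = {"Bank A": None, "Bank B": 231, "Bank C": 574, "Bank D": 100}
groups = {name: survival_group(code) for name, code in banks.items()}
print(groups)
```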
Section 3.5 The start of the financial crisis
Subprime lending refers to loans made to borrowers who have difficulty maintaining the
repayment schedule. To offset the higher credit risk that results from these difficulties, these
loans differ from normal loans: they have poor collateral and, most of the time, a higher interest
rate. According to Lemke et al. (2013), the majority of subprime loans were packaged into
mortgage-backed securities, which in turn defaulted, causing the financial crisis of 2007-2008.
Subprime lending thus triggered the financial meltdown, which peaked in August 2008. To
illustrate the financial crisis, I use the historical TED spread chart from the Federal Reserve. The
TED spread has two components: the 3-month LIBOR rate and the 3-month Treasury bill rate.
LIBOR is the interbank lending rate, an average of the rates at which banks borrow from other
banks. The 3-month Treasury bill is used by the U.S. government to raise capital from the
public. The TED spread is
$$\text{TED Spread} = \text{3-Month LIBOR Rate} - \text{3-Month Treasury Bill Rate}$$
This is a measure of perceived credit risk in the U.S. economy. Since LIBOR measures the
lending rate between banks, a rise in the spread between LIBOR and the 3-month Treasury bill
rate signals a lack of trust between banks and their counterparties. Figure 1 shows the TED
spread for the period 2005-2009. The TED spread peaks in September 2008 at 315.25 basis
points. Normally, when the economy is stable, the TED spread varies only between 30 and 50
basis points; the lowest value, 23 basis points, occurred in February 2005. The remarkable
increase in August 2008 marks the collapse of the economy, but the start of the increase can be
traced to July 2007. That was the first increase of the TED spread, and from that point it
fluctuates until it reaches its highest peak in August 2008. July 2007 is of interest because it
comes right after the Q2-2007 announcement. Because the study covers the Management
Discussion and Analysis, it is interesting to test risk awareness right before the collapse of the
economy. How far in advance were bank managers aware of the excessive risk they took? To test
whether, and how far in advance, bank managers were aware of the excessive risk, each 10-X is
assigned to the quarter of the year in which it is announced and published. So the quarterly
report of June 2006 is the 10-Q for Q2-2006. One rough approximation has to be made for the
date, however, because not all quarterly reports and yearly reports have the same
for this study the rough approximation is made at every March, June, September and
December.
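The two computations in this section, the TED spread and the assignment of a report to a quarter, can be sketched in Python. The function names are hypothetical, the rates are illustrative inputs rather than observed data, and mapping a report date to its calendar quarter by month is an assumption that follows the March, June, September and December approximation.

```python
from datetime import date

def ted_spread_bps(libor_3m_pct, tbill_3m_pct):
    """TED spread in basis points, from two rates quoted in percent."""
    return (libor_3m_pct - tbill_3m_pct) * 100.0

def quarter_label(period_end):
    """Map a report date to a label such as 'Q2-2006' using the calendar quarter."""
    quarter = (period_end.month - 1) // 3 + 1
    return f"Q{quarter}-{period_end.year}"

# Illustrative rates: LIBOR at 5.0% and the Treasury bill at 4.5% give 50 basis points.
print(ted_spread_bps(5.0, 4.5))          # 50.0
print(quarter_label(date(2006, 6, 30)))  # Q2-2006
```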
Section 3.6 Control variables
In this linguistic research it is important to understand why a certain tone of the document
would reveal prior knowledge of excessive risk taking. I therefore include control variables for
firm characteristics and common market variables, such as the book-to-market ratio and trading
volume. Trading volume is an essential control variable, because studies have found that textual
sentiment has a significant influence on trading volume. Tetlock (2007) found that either
extremely low or extremely high pessimism leads to temporarily high trading volume. Das and
Chen (2007) found similar results: a strong correlation between trading volume and sentiment.
However, neither study showed that sentiment can predict future trading volume. The
book-to-market ratio is used to capture the current status of a publicly traded company. It
compares the company’s net asset value per share to its share price. This comparison helps to
determine how the market price of the company relates to its actual worth. A higher
book-to-market ratio indicates that the company is undervalued. It is a common control variable,
because it allows the company to be compared with other companies.
Ferguson et al. (2015) test whether positive and negative words in media content carry
relevant information. They find a significant predictive relationship between the
high book to market ratio. They also find that high-attention news, either positive or negative,
affects the subsequent-period return, and they examine how the predictive power of media
content for returns depends on firm size. Big firms tend to receive more media attention than
small firms. Media attention for big firms comes together with investor recognition, and the
authors investigate whether the stock of smaller firms, which have lower investor recognition, is
compensated with a higher stock return. Their results show that firm size significantly affects the
influence of media attention. News released in the media, whether positive or negative, has
significant predictive power for the firm's next-period abnormal return. However, this predictive
power is significant for small firms only; for the larger firms listed in the FTSE 100 (footnote 4),
media content turns out to have no influence. Firm size thus appears to influence the predictive
power for returns. In my study I control the stock return for firm size, because firm size has a
significant influence on stock return.
Market capitalization is not available on CRSP; however, it can be calculated as follows.
Following the instructions given by CRSP, I take the average number of shares outstanding
between 2005 and July 2007. Subsequently, I multiply this average by the average share price
over the same period, which gives the market capitalization of the company over that
timeframe.
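As a minimal sketch of this calculation, assuming the shares outstanding and prices for one company are available as plain lists (the function name and the sample figures are hypothetical, not CRSP data):

```python
from statistics import mean


def market_capitalization(shares_outstanding, prices):
    """Average shares outstanding times average share price.

    Both inputs are sequences of observations for one company over
    the same window (here: 2005 until July 2007).
    """
    return mean(shares_outstanding) * mean(prices)


# Hypothetical company with two observations in the window.
print(market_capitalization([100, 120], [10.0, 12.0]))
```

Note that the product of two averages is generally not the same as the average of the period-by-period products; the sketch follows the order of operations described in the text.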
4 FTSE is the stock index listed on the London stock exchange where the 100 firms with the highest market
Section 3.7 The regression
$$\text{Stock return} = \alpha + \beta_1 \cdot H2 + \gamma_2 \ln(\text{Market Capitalization}) + \gamma_3 \ln(\text{Long Term Debt}) + \gamma_4 \ln(\text{Net Income}) + \varepsilon$$
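The regression can be estimated by ordinary least squares. The sketch below is a plain-Python illustration of that estimator, not the statistical package used for the results in this study. The dictionary keys (`stock_return`, `h2`, `market_cap`, `ltd`, `ni`) are hypothetical names, the natural logarithm is assumed as written in the equation, and observations with non-positive values are dropped because their logarithm is undefined (also an assumption).

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """OLS via the normal equations; returns [intercept, slope_1, ...]."""
    x = [[1.0] + list(r) for r in rows]  # prepend the constant term
    k = len(x[0])
    xtx = [[sum(xi[a] * xi[b] for xi in x) for b in range(k)] for a in range(k)]
    xty = [sum(xi[a] * yi for xi, yi in zip(x, y)) for a in range(k)]
    return solve_linear_system(xtx, xty)


def fit_tone_regression(observations):
    """Estimate [alpha, beta_1, gamma_2, gamma_3, gamma_4] of the equation."""
    usable = [o for o in observations
              if o["market_cap"] > 0 and o["ltd"] > 0 and o["ni"] > 0]
    rows = [[o["h2"], math.log(o["market_cap"]),
             math.log(o["ltd"]), math.log(o["ni"])] for o in usable]
    return ols(rows, [o["stock_return"] for o in usable])
```

Solving the normal equations directly is adequate for a handful of regressors; if the regressors are perfectly collinear the solver fails with a division by zero.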
Section 4 Data
This section describes the data and sampling approach. I use quarterly and annual
data from CRSP from fiscal year 2005 to fiscal year-end 2009. I use U.S. firms because
publicly listed companies, including those in the banking sector, are required to file their
disclosures and earnings announcements with the SEC. The initial sample is retrieved
from the Compustat banking sector. For financial, accounting and industry data I refer to
Compustat; for historical prices and market capitalization I refer to CRSP.
Table 1 provides the number of extracted files and the number of positive and negative words
found in the files. Of the 13,945 extracted Management Discussion and Analysis sections,
there were 53 files in which no negative words were found and 74 files in which no positive
words were found. The maximum numbers of positive and negative words found in a
Management Discussion and Analysis section are 790 and 3,260, respectively. Further
investigation showed that the file with the most positive words is also the file with the most
negative words. This can be explained by file size: a longer file contains more words in total
than other files. This will result in that
words and negative words, these percentages give the positive words found divided by the total
of positive words in the wordlist. The minimum negative percentage of 0.00042 shows that some
files contain barely any negative words. Each file included in these statistics has at least one
positive or one negative word; see the minimum value of 1 for the number of positive and
negative words found in Table 1. To compute the tone of the document I use both the Henry 1
and the Henry 2 measure. Across all analyzed files, both tone measures indicate a larger share
of negative words. For Henry 1 the denominator is on average larger than the numerator, which
implies that more negative than positive words were found in the documents
(mean = 0.4044). The Henry 2 measure shows the same result: its mean is -0.4516, which
implies that, on average, the tone of the documents is negative.
This is in line with Cziraki (2015) and Larcker & Zakolyukina (2012), who argue that executives
tend to use more negative words, either to reflect risk or because they are uncomfortable
lying, so that a more negative tone is present in disclosures regarding risk.
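The word counting and the two tone measures can be illustrated with a short sketch. It follows the definitions given in the variable list (Henry 1 is positive words divided by negative words; Henry 2 is positive minus negative words, divided by their total). The two mini wordlists and the function names are hypothetical placeholders, not the wordlists used in this study.

```python
import re

# Hypothetical mini wordlists, for illustration only.
POSITIVE_WORDS = {"strong", "improve", "growth"}
NEGATIVE_WORDS = {"risk", "loss", "decline", "weak"}


def count_words(text, wordlist):
    """Count how many tokens of the text appear in the wordlist."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(1 for token in tokens if token in wordlist)


def henry_tone(n_positive, n_negative):
    """Return (H1, H2); None where a measure is undefined."""
    h1 = n_positive / n_negative if n_negative > 0 else None
    total = n_positive + n_negative
    h2 = (n_positive - n_negative) / total if total > 0 else None
    return h1, h2


sample = "Strong growth offsets credit risk and loan loss risk."
n_pos = count_words(sample, POSITIVE_WORDS)
n_neg = count_words(sample, NEGATIVE_WORDS)
print(n_pos, n_neg, henry_tone(n_pos, n_neg))
```

As a consistency check, 5 positive words and 1 negative word give H1 = 5 and H2 ≈ 0.667, values that coincide with the maxima of H1 and H2 reported in Table 1. A file without negative words has no defined H1, which may explain why H1 and H2 have fewer observations than the word counts.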
Table 2 gives the descriptive statistics for 2005Q1. Since each subgroup has different
descriptive statistics, I do not display them all in the Appendix. Table 2A lists all the variables
used in this study, each with a short description.
Section 4.1 Extracting Management Discussion and Analysis
With the help of Compustat I gathered the banks operating between the beginning of 2005 and
the end of 2009. The banks are identified by their CIK identifier (footnote 5). I downloaded the
filings of the banking sector for 2005 through 2009. The first step is to download
the Master Files and then to identify the target forms in the master file and map them to specific
target file names. In total, 15,774 10-X filings were downloaded for this period. To analyze
the tone of the MD&A section, this part needs to be cut out of the 10-X document. With a
supporting tool written in VBA, provided by T. Jochem, the MD&A is extracted using
keywords. A keyword marks the start of the Management Discussion and Analysis section, and
all text following it is selected. Typical start keywords are Management or Discussion. I then
look for keywords that mark the end of the section; all the selected text is taken as the
Management Discussion and Analysis section, and a new text file is created containing only this
selection. Since the files are downloaded from the SEC's EDGAR system, the 10-X filings are all
in HTML format. Converting them into text files and removing the HTML tags gives the files a
different layout from the original HTML. Older files are also less structured than newer files, so
the layout differs across years. Of the 15,774 examined 10-X filings, 1,009 contained only a
reference to the MD&A, in some cases to the annual report or the stockholders' report. Another
178 10-X filings did not have a Management Discussion and Analysis section, and 51 are lost
because the Management Discussion and Analysis section is mixed with the following section,
e.g. Financial Report or Quantitative and Qualitative Disclosures about Market Risk. A further
370 10-X filings are excluded from the data, because of the different layout caused by
5 CIK-identifier is the Central Index Key used by EDGAR Company Filings for companies who have filed disclosures
converting the files into Doc files; as a result, the specific Management Discussion and Analysis
section got lost. The section was not traceable with the supporting tool, because either the start
or the end of the search criteria was missing, so the criteria were not specific enough to cut
out the Management Discussion and Analysis section. More specifically, in these files the
section mostly does not start with the regular keyword. The sample is thereby reduced from
15,774 10-X filings to 13,945 documents.
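The keyword-based extraction can be sketched in a few lines. This is an illustration only and not the VBA tool provided by T. Jochem: the two regular expressions are assumed examples of a start and an end keyword, and the function names are hypothetical.

```python
import html
import re

# Assumed example keywords; real filings need a broader set of patterns.
START = re.compile(r"management(?:.s)?\s+discussion\s+and\s+analysis", re.IGNORECASE)
END = re.compile(r"quantitative\s+and\s+qualitative\s+disclosures", re.IGNORECASE)


def strip_html(raw):
    """Remove tags, decode entities and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", raw)
    text = html.unescape(text)
    return re.sub(r"\s+", " ", text).strip()


def extract_mdna(raw_html):
    """Return the text between the start and end keyword, or None."""
    text = strip_html(raw_html)
    start = START.search(text)
    if start is None:
        return None  # no start keyword: section not traceable
    end = END.search(text, start.end())
    if end is None:
        return None  # no end keyword: section cannot be delimited
    return text[start.start():end.start()].strip()


filing = (
    "<html><body><p>Item 1. Financial Statements</p>"
    "<p>Item 2. Management&#8217;s Discussion and Analysis</p>"
    "<p>Credit risk increased.</p>"
    "<p>Item 3. Quantitative and Qualitative Disclosures about Market Risk</p>"
    "</body></html>"
)
print(extract_mdna(filing))
```

Returning `None` when either keyword is missing mirrors the exclusions described above. The sketch takes the first match of the start keyword, so a filing whose table of contents repeats the section title would need additional handling.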
Section 4.2 Merging data
From the extracted MD&A files I have the CIK code of each bank, which is the identifier used by
EDGAR company filings. I then used the CRSP/Compustat Merged Database from WRDS and
searched the entire database from January 2005 until December 2009, which yields 8,047
unique companies by CIK entry and 10,528 unique companies by CUSIP entry. Subsequently, I
used the lookup function in Microsoft Excel to find the CUSIP code corresponding to each CIK
code retrieved earlier. With this method I obtain exactly as many CUSIP codes as extracted files
from the previous section, 13,946 files. Because the CRSP data set contains only the
CUSIP identifier, the remaining securities files from CRSP are matched on CUSIP. Using the
retrieved CUSIP codes of the banks, I can merge the CRSP securities files with the
tone of the 10-X filings. The CRSP data set provides the securities data, including stock prices
and shares outstanding, and Compustat provides Net Income and Long Term Debt. After
defining H1 and H2 I need their averages for the regression, and I had to calculate the average
raw data collected from the 2007 and 2009 securities in the CRSP data set covered more
than 7,000 companies but, as explained above, by using the lookup function in Microsoft Excel I
obtained an exact match for each bank in the sample data.
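The lookup step is equivalent to a keyed join from CIK to CUSIP. The sketch below shows the idea with a dictionary instead of the Excel lookup function used in this study; the function name, the record layout and the identifiers in the example are hypothetical.

```python
def merge_on_cik(tone_rows, cik_to_cusip):
    """Attach a CUSIP to every tone record whose CIK is in the lookup table.

    Returns the merged records and the records without a match, so that
    lost observations can be counted explicitly.
    """
    merged, unmatched = [], []
    for row in tone_rows:
        cusip = cik_to_cusip.get(row["cik"])
        if cusip is None:
            unmatched.append(row)
        else:
            merged.append({**row, "cusip": cusip})
    return merged, unmatched


# Hypothetical identifiers, for illustration only.
tone = [{"cik": "C1", "h2": -0.4}, {"cik": "C2", "h2": -0.5}]
merged, unmatched = merge_on_cik(tone, {"C1": "CUSIP-A"})
print(len(merged), len(unmatched))
```

Keeping the unmatched records in a separate list makes it easy to report how many observations are dropped because one database lacks the corresponding identifier.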
Section 5 Results
This section presents the results of the tests. The focus is on how far in advance the bank
managers were aware of the excessive risk they took. I use different event windows
to capture the effect of the negative tone used in the Management Discussion and Analysis
section. Once the financial crisis breaks out, all bank managers are aware of the risk and will
extensively discuss the risk and the future prospects of the company. This study therefore
focuses on the periods before the collapse of the economy. Table 3 summarizes the different
time periods before the collapse in July 2007 and reports, for each quarter, the average tone
according to the Henry 1 and Henry 2 measures. While 13,945 MD&A files were obtained from
the SEC EDGAR filings, each company has multiple documents over the years. Some banks
published, for example, five 10-X filings before they were delisted, while others were not
delisted, survived until July 2007 and published a 10-X for every quarter. In total 8,213 files are
assigned to the period 2005Q1 to 2007Q2, while the remaining 5,732 files date from after the
outbreak of the financial crisis. The first lag, or the first period analyzed, is 2007Q2 to 2007Q2,
the period right before the crisis broke out. On average
the tone of Henry 1 and Henry 2 is 0.4338 and -0.4251, respectively. While inspecting these tone
measure as the Henry 2 measure. The theory of Tetlock (2007) is in line with this
finding. According to Tetlock (2007) and Larcker & Zakolyukina (2012), a deceiver who is afraid
of being caught in a deceptive act will use more general terms, fewer self-references and
shorter answers and, most interestingly, will be more negative in tone.
Next, I show the correlations between the variables described in section 4, summarized in
Table 4. Afterwards I run the regression for the period from 1 January until 1 July 2007 on the
stock return between 1 July 2007 and 1 January 2009. The quarter (Q) indicates the period
before the financial crisis. For each period I run the regression to relate the sentiment of the
Management Discussion and Analysis section to the stock return. Table 4 shows the correlation
matrix with significance at the 5% level. In these data I expect multicollinearity only between
meanH1 and meanH2, because these two variables are computed from the same data and are
therefore highly intercorrelated. The correlation coefficient between meanH1 and meanH2 is
0.9236***, which is high and significant at the 1% level. This would distort my estimates, so I am
not able to include both tone measures at the same time and check which one provides the
better measure of tone. Therefore, I had to omit one of the tone measures. I use Henry 2 as the
tone measure, because it shows the net value of positive words as a ratio, in which a more
negative value corresponds to a more negative tone of the Management Discussion
and Analysis section.
correlation of -0.1869. This significant correlation is consistent with Longin & Solnik
(2001), Ang & Chen (2002) and Hong et al. (2003), who show that stock returns are more
highly correlated when the market declines. Firm size, expressed as market capitalization,
is significantly negatively correlated with the stock return; the correlation is relatively
weak, but significant and in line with the prior literature. The market did not decline
between 2005 and 2006 (see figure 4); instead the economy, measured by GDP per capita
(footnote 6), i.e. the sum of the gross value added by all the producers in the economy, was
booming between 2005 and 2006, and in turn the relationship between stock return and firm
size is relatively weak. Since the correlation coefficient is negative, stock return and firm size
move in opposite directions. If the economy were facing a downturn, then according to the
literature the correlation would be stronger, and vice versa.
Net income and Long Term Debt move in the same direction, with a strong and significant
correlation of 0.8366***. I found no literature that explains this high correlation, so both
variables are used only as control variables across banks. There is no literature
that shows the relationship between long term debt and net income; however, long term debt
appears on the balance sheet, and the balance sheet and the income statement, where net
income is calculated, are related. One affects the other through double-entry accounting,
where Assets = Liabilities + Owner's equity.
6 GDP is the gross domestic product divided by the midyear population. It estimates the total sum of gross value
Table 5 presents the regressions for the subgroups from January 2005, denoted 2005Q1, to
1 July 2007. The first thing to notice is that each quarter has a different number of observations
(N): while 721 banks are evaluated in 2005Q1, only 583 banks have available information on
stock return, market capitalization, long term debt and net income in 2007Q2. Each column
presents one subgroup, up to 2007Q2, when the financial crisis began. Banks with missing
values are omitted from the sample. Table 6 shows the percentage change in the number of
banks analyzed between subgroups. The subgroup starting in 2005Q1 contains more banks
because more Management Discussion and Analysis sections are available for it. Whenever a
bank's Management Discussion and Analysis is missing, I omit the bank from the sample
consistently. And since the timespan between 2005Q1 and 2007Q2 covers eight quarterly
reports, it is only logical that it has more observations (N=721) than the timespan between
2006Q2 and 2007Q2 (N=664). The intercept of the regression is significantly different from
zero in every quarter. The economic meaning of the intercept, or constant term, is that when all
the independent variables are equal to zero, the expected stock return equals this value.
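The percentages in Table 6 can be reproduced from the observation counts in Table 5. The sketch below assumes that each percentage is the drop in the number of banks relative to the later subgroup; this convention is inferred from the printed figures and is not stated explicitly in the text.

```python
# Number of banks per subgroup, as listed in Table 5.
BANKS = {
    "2005Q1": 721, "2005Q2": 712, "2005Q3": 699, "2005Q4": 696,
    "2006Q1": 676, "2006Q2": 664, "2006Q3": 656, "2006Q4": 638,
    "2007Q1": 615, "2007Q2": 583,
}


def percentage_changes(counts):
    """Percentage difference between consecutive counts, relative to the later one."""
    return [round((earlier - later) / later * 100, 2)
            for earlier, later in zip(counts, counts[1:])]


print(percentage_changes(list(BANKS.values())))
```

For example, (721 - 712) / 712 gives 1.26%, the first value in the table.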
The inclusion of the control variables lnmarketcap, lnLTD and lnNI serves to exclude
alternative explanations that are correlated with both the stock return and the tone measure
while testing the explanatory variable, the H2 tone measure. The correlation matrix
showed a high correlation between the control variables lnLTD and lnNI. When a
control variable does not change my explanatory variable, the H2 measure, it is better to
but the relevance of net income to stock return and risk was argued to be important in section
III. I therefore keep this variable as a control, because the test of joint significance showed that
lnNI does contribute to predicting the outcome.
The economic meaning of meanH2 is the net positive tone and how the stock return reacts
to it. Table 5 presents the mean of the Henry 2 measure for every subgroup in the sample.
A subgroup may contain several reports per bank (10-K and 10-Q); for each subgroup an
average of the Henry 2 tone measure is computed and assigned to that subgroup. The average
of the Henry 2 tone measure between 2005Q1 and 2007Q2 is -0.42744. Recall that the Henry 2
tone measure is the number of positive words minus the number of negative words, divided by
the total number of positive and negative words. A negative value means that more negative
than positive words were found in the Management Discussion and Analysis section. This gives
a rough idea of how negative the tone of the document is. According to Table 5, the sentiment
of the documents is quite stable from 2005Q1 until 2006Q2, with two outliers that lie above the
average in 2006Q1 and 2006Q2. From 2006Q3 onwards the use of negative words seems to
rise, which is noticeable in the ratio and indicates that more negative words are used in the
Management Discussion and Analysis section. As discussed in sections 2 and 3, Cziraki (2015)
finds that bank managers are aware of the risk, and negative word choices seemingly increase
in the run-up to the financial crisis. This is especially visible in 2007Q1 and 2007Q2.
The average tone in these period are
negative tone is already forming in that period.
Table 5 presents the regression coefficient of the independent variable, the Henry 2 tone
measure. Recall from Feldman et al. (2009) that the slope coefficient can be interpreted as a
stock return. The economic meaning is that if meanH2 increases by one standard deviation, the
stock return increases or decreases by that percentage, since stock returns are already
expressed in percentages. For 2005Q1, if the Henry 2 tone measure goes up by one unit, all
else equal, the stock return increases by 0.635. The coefficient is positive but insignificant for
the subgroups 2005Q1 until 2006Q4, and negative for 2007Q1 and 2007Q2. This result only
makes sense if the literature review holds, namely that when bank managers speak about risk
using negative sentiment, the stock return goes up. Although this may seem inconsistent with
prior studies of stock returns, compared with figure 4, where the economy was in fact in a
growth stage, a more negative tone results in an increase in stock return only if the
bank manager is straightforward and is not deliberately withholding any risk concerning the
bank. Note that all the tone coefficients are insignificant. The coefficient of the Henry 2 tone
measure is negative and not statistically significant, but its negative sign has an important
meaning. The negative sign makes sense when bank managers really are negative about their
position, about originating subprime mortgages and about lending to less creditworthy
borrowers, as discussed in section 2, thereby using more negative words and
revealing their excessive risk taking on account of others according to the same study result of
2007Q2. Keeping in mind that the regression results are insignificant for the Henry 2 tone
measure, the following statement can be made: the bank managers were aware of the risk in
2007Q1. This is because the regression coefficient of the Henry 2 tone measure changed
dramatically from a positive coefficient, which only makes sense when bank managers reveal
risk exposure that benefits the bank and results in a positive stock return, to a negative
coefficient. This change can be related to the analysis of the mean of the Henry 2 tone measure.
Note that since 2007Q1 there has been an increase in negative word choices. The preceding
quarters, 2006Q3 and 2006Q4, already show an increase in negative word choices, but the
change between those subgroups is slight, while the change in 2007Q1 is notable.
Table 7 reports the regression for 2007Q1. The coefficient of meanH2 is -0.6845 (1.0004). So for
every unit increase in meanH2, a change of -0.6845 in the stock return is expected, holding the
other variables constant. This implies that the tone of the analyzed documents is negatively
related to the stock return. If the number of negative words increases, meanH2 changes with it,
which shifts the tone of the document. A shift in tone towards more negative words in the
document goes together with a decrease in the stock return.
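Whether a coefficient is statistically significant can be checked from the coefficient and its standard error. The sketch below uses the meanH2 row of the regression output printed in the Appendix (coefficient -0.3772874, standard error 0.958737, 578 residual degrees of freedom). The critical value of 1.964 is an approximate two-sided 5% value of the t distribution for that many degrees of freedom and is an assumption of this illustration, as are the function names.

```python
def t_statistic(coef, std_err):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return coef / std_err


def confidence_interval(coef, std_err, t_crit):
    """Two-sided confidence interval: coef +/- t_crit * std_err."""
    half_width = t_crit * std_err
    return coef - half_width, coef + half_width


COEF, STD_ERR = -0.3772874, 0.958737  # meanH2 row, Appendix regression output
T_CRIT = 1.964                        # assumed approximate critical value, df = 578

t_value = t_statistic(COEF, STD_ERR)
lower, upper = confidence_interval(COEF, STD_ERR, T_CRIT)
print(round(t_value, 2), round(lower, 3), round(upper, 3))
```

The absolute t statistic is well below the critical value and the interval contains zero, so the tone coefficient is not statistically distinguishable from zero. If the value 1.0004 reported in parentheses for 2007Q1 is a standard error (an assumption), the same calculation gives a t statistic of about -0.68 for that quarter, which is also insignificant.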
Section 6 Conclusion
This study focused on linguistic research on disclosures by banks that were involved in the
financial crisis of 2007 to 2008. I analyzed 10-Q and 10-K disclosures and focused on the
Management Discussion and Analysis section. This section of the 10-Q and the
This section is meant for discussion of different objectives of the company, including
management and management style, but more importantly the prospects of the company and
the risks it faces. With a supporting tool written in VBA I extracted the Management
Discussion and Analysis section from all the 10-Q and 10-K filings downloaded for January 2005
until December 2009. After the extraction I wrote a program that reads in all the Management
Discussion and Analysis files and counts the number of positive and negative words in each
file. After the word counting, I collected raw data from CRSP and Compustat to merge with the
Management Discussion and Analysis files, and I was able to determine the tone of each
document using the Henry 1 and Henry 2 tone measures. The next step is to test the
hypotheses and answer the research questions. The first question is how much bank managers
reveal about their knowledge of their actions. There was no variable that directly describes how
much they knew about their actions, but section V provides some results on how much they
knew beforehand. Using the average Henry 2 tone measure, I find that negative word choice
started to increase in 2007Q1 and continued until 2007Q2, right before the crisis, which leaks
information about their acknowledgement of excessive risk taking regarding risky mortgage
lending. Further, bank managers can be separated into those who were indeed aware of the
level of risk taking and those who did not participate in these risky lending programs. This could
be investigated using the delisted banks in the sample, because the delisted banks needed
help in order to survive, or did not survive the crisis at all. Delisted banks could be assigned to be
taking on lending, they were not able to recover from an outbreak or an economic
shock. It turns out that delisted banks were a better estimator for performance measurement.
The next question, whether bank managers use more negative words in the MD&A
section than in the years before, can again be tested using the mean of the Henry 2 measure.
There was no significant change in the tone measure, except that in 2007Q1 there is a
notable increase in negative word choices, leading to a more negative sentiment
in the disclosures. The next question is whether word choices in the reports reveal
valuable information regarding risk that only insiders may have. This is not validated by this
study, but the existing literature, including Larcker & Zakolyukina (2012), suggests
that there is no differentiation between a bank manager who possesses inside information
and one who does not. According to Larcker & Zakolyukina (2012), in the context of
linguistic research insiders are not able to conceal information: following the theory of Vrij
(2008), deceivers or liars are assumed to be unable to conceal the information, which is the
risk in this case, because they do not feel comfortable when they lie. In contrast,
Cziraki (2015) finds that insiders of high-exposure banks are 20% more likely to sell off stock
than insiders of low-exposure banks. Inside information thus turns out to be usable as a
portfolio strategy, but linguistic research shows no difference between someone who
possesses inside information and someone who does not. In other words, inside information
regarding risk cannot be revealed through linguistic research, that is, through written words and
word choices.
Discussion and Analysis section of the disclosures by banks are associated with the risk the
bank managers took. To check whether the stock return is influenced by the sentiment of the
Management Discussion and Analysis, ten different subgroups are created, one for each
quarterly report, starting in 2005Q1. If bank managers were aware of the risk, then the
Management Discussion and Analysis should display some change in tone. This can be
understood as more negative, because risk is associated with a negative tone, as measured by
the Henry 2 tone measure.
The results show that negative tone changes in the Management Discussion and Analysis
are correlated with risk, expressed as negative word choices. Therefore, a change in the tone of
the disclosure reveals some predictive information about the performance measure. This shows
that the Management Discussion and Analysis section does have information content. The
results of this study are limited by a few factors. The first is the data collection from different
databases. Of all the analyzed Management Discussion and Analysis files, many had missing
values: when one database does not provide the corresponding values, as a result of using
different identifiers, I omit the observation. Due to this problem many observations were lost;
approximately 30% of the analyzed banks did not have corresponding values and are omitted.
Second, different banks face different risks. In my study I did not make subgroups for different
kinds of banks, e.g. investment banks, commercial banks, land development banks, savings
banks, national banks, exchange banks and many more. If I had the option, I would have tracked
down the banks and assigned them to different subgroups, while now in this study all banks are treated
This study contributes to the field of linguistic research and to studies of the effect of risk as
expressed by negative word choices in the Management Discussion and Analysis section. It
also sheds light on bank managers' private information, that is, how many periods beforehand
they were aware of the risk, expressed in subgroups by the number of periods in advance.
Finally, it addresses the relationship between tone and stock return, which is relevant for those
interested in the behavior of securities or in selecting securities using non-financial disclosures
(Feldman et al., 2009). Regulators may also benefit from this study regarding bank
managers' behavior.
Appendix
Figure 1 Key players and friction in subprime mortgage credit securitization.
Source: Ashcraft, A.B. & Schuermann, T. (2008) "Understanding the Securitization of Subprime
Figure 1 top ten subprime mortgage originators of 2005 and 2006
Source: Inside Mortgage Finance (2007): “The 2007 Mortgage Market Statistical Annual.”
Figure 2 The TED spread between year 2005 and year 2009
Source: TED Spread (2015) Retrieved from
Figure 3 GDP per capita in $ in U.S.
Source: GDP per capita (2015) Retrieved from
                     Observation  Mean      Std. Dev.  Min       Max
MD&A                 13945        495.4893  284.1115   1         987
negative             13892        124.5687  129.6858   1         3260
positive             13871        44.90981  41.439     1         790
Positive percentage  13871        0.126864  0.117059   0.002825  2.231638
Negative percentage  13892        0.053144  0.055327   0.000427  1.390785
H1                   13866        0.404426  0.22203    0.019802  5
H2                   13866        -0.45169  0.184458   -0.96117  0.666667
Variable Observation Mean Std. Dev. Min Max
Stock return 771 .4006467 4.25734 -1 78.13309
meanH2 771 -.4229542 .1641455 -.9375 .4492754
Long Term Debt 771 4390.544 22645.61 0 253228.2
Net Income 771 90.68383 449.1042 -28.656 5864.303
Market Capitalization 771 3252615 1.73e+07 460.92 2.43e+08
Stockreturn The return of the stock calculated as the price change including
dividend from 1 July 2007 to 1 January 2009
marketcapital Market capitalization calculated as the price times the shares
outstanding mid 2006 in million $
lnmarketcap Log 10 of the market capital
LTD Long term debt given in million $
lnLTD Log 10 of the long term debt
NI Net income given in million $
lnNI Log 10 of the net income
Delistedcompany Dummy variable for the companies within the timeframe if it got
delisted because of excessive risk taken. Takes on value 1 if delisted.
Foundpositivewords The total count of positive words
Foundnegativewords The total count of negative words
Posipercentage The amount of positive words found divided by the total amount of
positive words in the positive wordlist
Negapercentage The amount of negative words found divided by the total amount of
negative words in the negative wordlist
H1 The tone of the document set by positive words divided by negative
H2 The tone of the document set by positive words minus negative words
then divided by the total of positive and negative words
meanH1 The Henry 1 tone measure is calculated as the average tone during the
quarter before the financial crisis in 1 July 2007, depending on how
many periods before the crisis.
meanH2 The Henry 2 tone measure is calculated as the average tone during the
quarter before the financial crisis in 1 July 2007, depending on how
many periods before the crisis.
Period                   Window         Total analyzed files  average H1 tone  Std. Dev.  average H2 tone  Std. Dev.
2007-07-01 - 2007-04-01  2007Q2-2007Q2  715                   0.4159255        0.231487   -0.4440956       0.192966
2007-07-01 - 2007-01-01  2007Q2-2007Q1  754                   0.4275165        0.209601   -0.4303758       0.174818
2007-07-01 - 2006-10-01  2007Q2-2006Q4  783                   0.4363359        0.206338   -0.4220579       0.171633
2007-07-01 - 2006-07-01  2007Q2-2006Q3  805                   0.439557         0.207008   -0.4186862       0.170869
2007-07-01 - 2006-04-01  2007Q2-2006Q2  819                   0.4325627        0.205821   -0.4256192       0.171046
2007-07-01 - 2006-01-01  2007Q2-2006Q1  832                   0.4322163        0.214128   -0.4259615       0.164032
2007-07-01 - 2005-10-01  2007Q2-2005Q4  862                   0.4342687        0.215028   -0.4242811       0.163979
2007-07-01 - 2005-07-01  2007Q2-2005Q3  868                   0.4409433        0.22046    -0.4193069       0.165761
2007-07-01 - 2005-04-01  2007Q2-2005Q2  882                   0.4399043        0.220471   -0.420695        0.166704
2007-07-01 - 2005-01-01  2007Q2-2005Q1  893                   0.4396274        0.210229   -0.4203377       0.163757
Average                                                       0.43388576       0.214057   -0.42514169      0.170556
              Stock return  meanH1     meanH2   lnLTD      lnNI    lnmarketcap
Stock return  1.0000
meanH1        0.0129        1.0000
              0.7204
meanH2        0.0236        0.9236***  1.0000
              0.5124        0.0000
lnLTD         0.0705        -0.0244    -0.0452  1.0000
              0.0544        0.5061     0.2181
lnNI          0.0350        -0.0014    -0.0275  0.8366***  1.0000
              0.3399        0.9692     0.4544   0.0000
lnmarketcap   -0.1869***    0.0053     -0.0202  0.0332     0.0457  1.0000
              0.0000        0.8838     0.5752   0.3654     0.2135
Table 5 Regression result on stock return from 2005Q1 - 2007Q2
2005Q1 2005Q2 2005Q3 2005Q4 2006Q1 2006Q2 2006Q3 2006Q4 2007Q1 2007Q2
721 712 699 696 676 664 656 638 615 583
1.26% 1.86% 0.43% 2.96% 1.81% 1.22% 2.82% 3.74% 5.49%
Source SS df MS Number of obs = 583
F( 4, 578) = 7.28
Model 574.943716 4 143.735929 Prob > F = 0.0000
Residual 11404.8316 578 19.7315426 R-squared = 0.0480
Adj R-squared = 0.0414
Total 11979.7753 582 20.5838064 Root MSE = 4.442
Stockreturn  Coef.      Std. Err.  t      P>t    [95% Conf. Interval]
meanH2       -.3772874  .958737    -0.39  0.694  -2.260321  1.505746
lnmarketcap  -1.117247  .2376577   -4.70  0.000  -1.584025  -.6504689
lnLTD        .8531035   .3544818   2.41   0.016  .1568741   1.549333
lnNI         -.4276372  .3649933   -1.17  0.242  -1.144512  .2892377
_cons        4.942673   1.476173   3.35   0.001  2.043355   7.841991
Table 7 Regression on 2007Q2