Do word choices in the Management Discussion and Analysis by bank managers reveal prior knowledge about the level of risk taking prior to the financial crisis?


MSc. Thesis

Name: Jhonny Yep Sang Xian

Student number: 10015280

Study: MSc. Business Economics

Field: Finance

Supervisor: Dr. Jochem, Torsten


Statement of Originality

This document is written by Jhonny Yep Sang Xian who declares to take full responsibility for

the contents of this document.

I declare that the text and the work presented in this document is original and that no sources

other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of

Abstract

This study explores whether the Management Discussion and Analysis (MD&A) section of 10-Q and
10-K disclosures contains valuable information about risk and inside information that goes
beyond financial measures. This linguistic research uses wordlists developed for the financial
field to analyze the tone of the disclosures. Mortgage originators and bank managers have been
accused of excessive risk taking ahead of the financial crisis, but sufficient evidence has been
lacking. The study finds that, before the collapse of the economy, the tone of the disclosures
changed compared to earlier years. The evidence is not statistically significant, but it suggests
that bank managers were aware of the excessive risk they took, e.g. risky lending and
originating mortgages to non-creditworthy


Section 1 Introduction

The valuation of a firm's assets and the probability of future returns are based on quantitative
information provided in quarterly and annual reports. However, a growing body of research uses
textual analysis of qualitative data to examine the sentiment of these reports. Negative
sentiment and tone in reports carry useful information: Loughran & McDonald (2010) show that
their negative word classification is significantly correlated with financial variables. In
another study, Loughran & McDonald (2013) look for predictors of future outcomes in the United
States Securities and Exchange Commission filings for Initial Public Offerings (IPOs). They
investigate the level of uncertain words used, which in turn affects the returns of the IPO.
The research shows that word choices have some influence on future outcomes: the higher the
proportion of uncertain or weak modal words, the more difficult it is for investors to
assimilate value-relevant information, which in turn influences investors' expectations of firm
value. This research connects to previous linguistic analysis by using the negative word
classification to examine the tone of a report and, based on that, asking whether word choices
reveal any prior knowledge. The data that will be used in this


This research is important for the banking sector, with a particular focus on the housing
market, because it investigates whether word choices reveal prior private knowledge about
risk-taking levels; if so, the relevant supervisory committee can act accordingly. This study
follows other linguistic research based on return performance and tone measures, but its scope
differs from earlier work because it focuses on a specific sector, namely the banking sector,
where the financial crisis was triggered. It is the sector in which banks initiated the subprime
mortgage market for liquidity. As in other linguistic research, the aim is to analyze
disclosures, or the content of certain media, to learn how the banking sector functioned before
the crisis. How much do the disclosures reveal about what the banks knew about their own
actions? A distinction is made between bank managers who were indeed aware of the level of risk
taking and those who did not participate in these risky lending programs. Do bank managers use
more negative words in the MD&A section than in earlier years? Does their prior knowledge, as
reflected in the reports of different banks, have an impact on future outcomes? Do word choices
in the reports reveal valuable information about risk that only insiders may have?

The remainder of this thesis is organized as follows. Section one introduces the linguistic
research and explains its relevance. Section two reviews previous research in the same field.
Section three describes the methodology and the setup of this research, and section four
presents the collected data and some descriptive statistics. Section five presents the main
results and the discussion, while section six is the final section of


Section 2.1 Literature review

This section reviews research that uses the tone of directors' language to analyze annual reports.

Tetlock (2007) was one of the first to quantify the interaction between the content of a Wall
Street Journal column and the daily stock market, that is, the influence of media on the stock
market. Tetlock found evidence that movements of the stock market can be predicted by news
media content. First, a high level of pessimism in media content predicts downward pressure on
stock prices, followed by a reversion to fundamental value. However, when the price impact of
pessimism is large, the reversion to fundamental value is slow. Second, unusually high or low
values of media pessimism forecast high market trading volume. Finally, low market returns lead
to a higher level of media pessimism. These findings show that the level of pessimism in media
content significantly depresses stock market prices and also affects trading volume, but the
same does not hold for risk. Trading volume is affected only when there is exposure to the
media content. By contrast, pessimism in media content does not seem to be related to risk:
Tetlock (2007) did not find evidence that pessimism increases market volatility. Pessimism in
media content thus only temporarily decreases stock market returns and does not increase risk.
However, Tetlock notes that this may be due to the word categories used in his research, which
were developed by psychologists who interpret negative sentiment differently. This study
returns to these word categories later, when discussing which word


In line with Tetlock's research on pessimism, Larcker & Zakolyukina (2012) conducted linguistic
research on the misstatement of financial statements by the CEO and/or CFO. Their research is
based on the discussions during quarterly earnings conference calls. Their model labels each
call as “truthful” or “deceptive”. On the deceptive calls they perform a linguistic test, which
shows that deceptive executives make more references to general knowledge and fewer references
to shareholder value. Furthermore, deceptive CEOs use significantly more extreme positive
expressions and fewer anxiety words. This conflicts with what Tetlock (2007) found, because
deceptive executives not only conceal misstatements in the financial statements by using
extreme positive words, but also reduce their use of anxiety words, i.e. pessimism. Yet the
emotion perspective described by Vrij (2008) hypothesizes that deceivers feel guilty and are
afraid of being caught in a deceptive act. In addition, Larcker & Zakolyukina (2012) assume
that the CEO and CFO know when financial statements have been manipulated, so that testing the
language of the executives can deliver valuable information about their prior deceptive
behavior. This is hard to control for when the CEO and CFO manipulate their word choices. In
the end, however, theories from psychology and linguistics support the view that liars are
generally more negative and use fewer self-references. The explanation is that liars feel
uncomfortable because they lie about something they have never experienced; according to
Larcker & Zakolyukina (2012), making a claim about something one has never experienced puts the
speaker in a more negative position towards that claim.

non-verbal behavior during deception is also applicable to verbal context. One of the four
perspectives is the control perspective, which holds that liars intentionally avoid regretful
words. Liars choose their words very carefully so that listeners do not easily perceive the
lies in the statement. This results in fewer self-references and short statements with fewer
details; as a substitute for the information that the liar does not want to expose, he provides
more irrelevant information. For example, liars speak with caution and use unique words, which
gives more lexical diversity, while truthful speakers often repeat their information, which
leads to less lexical diversity. In general, however, linguistic and psychological theories
suggest that liars are more negative. We therefore expect deceptive executives to show more
negative sentiment in their word choices.

Kothari et al. (2008) found that the litigation risk a firm faces under SEC Rule 10b-5 is
likely to motivate managers to disclose bad news. This rule, together with other laws, states
that management has the duty to disclose material information when it becomes known to
management. In other words, when management knows material information, it is legally obliged
to disclose it, whether the news is good or bad.

Other research on tone in conference calls is that of Davis et al. (2011). They show that firms
manipulate the tone of corporate communication when earnings are announced. Relevant to this
thesis, they show that markets seem to be influenced by the manipulated tone in corporate
communication, which is the investor’s


(2012) is that the linguistic tone of a conference call is a significant predictor of a stock's
abnormal returns and trading volume, and that the tone influences the 60 trading days after the
conference call.

As for what led to the financial crisis, on the one hand the number of creditworthy borrowers
was limited. On the other hand, mortgage lenders were able to ease the underwriting standards
of mortgages, which led them to originate riskier mortgages. When borrowers who are not
creditworthy meet mortgage lenders who apply lower standards, it is not surprising that an
imbalance results. The risky subprime mortgage market is the result of the actions of different
entities, for example credit rating agencies, underwriters and investors, but also lenders and
central banks, all of which might be blamed. It started when central banks provided more
liquidity by reducing the interest rate, which in turn led investors to look for riskier
projects to obtain a higher return, according to Brunnermeier (2008). Mortgage lenders took
part by granting subprime mortgages to borrowers of poor creditworthiness; originators seemed
willing to provide these borrowers with the funds to buy a house. Figure 1 displays the roles
in the subprime mortgage market described by Ashcraft & Schuermann (2008). Figure 2 shows their
top ten subprime mortgage originators, all of which are multinational banking and financial
services holding companies. Ashcraft & Schuermann (2008) describe the subprime mortgage market
in terms of frictions between different parties, and they believe these frictions have been the
reason for the

• Products with a complex structure are offered to subprime borrowers, which leads to
misunderstanding or misrepresentation of the product.

• Moral hazard refers to changes in behavior in response to a redistribution of risk. They
explain this friction as follows: insurance may lead to risk-taking behavior when the insured
does not bear the consequences of bad outcomes. In other words, the insured's downside is
limited, so the insured is not responsible for bad outcomes, which leads to moral hazard.

• The principal-agent problem occurs when asset managers are evaluated against a benchmark.
Asset managers then have an incentive to reach for higher yield by buying structured debt
issues with the same credit rating but higher coupons. When other asset managers underperform,
however, the asset manager keeps the same portfolio but no longer exerts the same effort.

Their study finds that incentives played a role in the breakdown of the subprime mortgage
market. Bank managers' downside was limited, which may have encouraged them to take excessive
risk.

According to Cziraki (2015), bank executives who were more exposed to the housing market sold a
larger proportion of their shares than bank executives who were less exposed to the housing
market. When housing prices first started to decline, bank executives sold off a large
proportion of their shares because they understood their bank's exposure to housing prices.
While Fahlenbrach & Stulz (2011) show in their research that there


in 2008, Cziraki (2015) divided bank executives into groups according to their exposure to the
housing market instead of treating them as one group, and started his study earlier, in 2006,
to examine their pattern of ownership. The study offers a new perspective on insider trading:
it shows that the role of bank executives matters for insider trading related to the housing
market. When banks are differentiated by their risk exposure to the housing market, insiders in
the high-exposure group are more likely to sell off stock than those in the low-exposure group,
by 20%. Cziraki (2015) thus examines what executives and insiders do with information gathered
beforehand. Whereas Cziraki (2015) sheds light on what insiders do with the information they
have, this study uses word analysis to test for a comparable result: whether executives tend to
use more negative words, and therefore a more negative tone, in their disclosures regarding the
excessive risk they take on the housing market and mortgage-backed loans.

The main finding of Feldman et al. (2009) that contributes to this research is that when
managers' assessment of future prospects becomes more negative (positive), they use more
negative (positive) words in their disclosures.

One key element of this research is the bank manager who controls the portfolio and accounts
for its exposures. Cantrell (2013) focuses on the ability of bank managers and finds that bank
managers of high ability provide higher-quality accounting. Bank managers with higher ability
are able to offer fair values of their


ability manager had a better estimate of the exposure in their securities portfolio. Better
estimates thus depend on the manager's ability to handle and process information. Hence, the
prior information a bank manager has will be taken into consideration in decision making. The
role a bank manager plays affects what he controls: the portfolio a bank manager controls
reflects his ability and hence his predictive power. This suggests a correlation between
portfolio management and how a bank manager deals with inside information. The inside
information a bank manager has will influence how he acts.

Section 2.2 Hypothesis development

Prior linguistic research suggests that company disclosures contain valuable hidden
information. A longstanding literature suggests that there is a link between linguistic content
and firm performance. Given the collapse of the economy as a result of the subprime mortgage
market, it is interesting, beyond the discussion of who is to blame, to ask whether the
so-called big players were aware of the excessive risk taking beforehand and, if so, how long
in advance. I develop a test that focuses on disclosures by banks, since they were the majority
of the originators in the subprime mortgage market (see figure 2). By analyzing the content of
the banks' disclosures, I relate it to the banks' returns. A comprehensive performance measure
is used as an indicator of how they perform after


information they possessed and whether they reveal some characteristics of risk before the
financial crisis took place (Cziraki, 2015). Risk is associated with negative feelings, and
Larcker & Zakolyukina (2012), using the theory provided by Vrij (2008), already showed that
even liars at the top of the company, e.g. the CEO or CFO, would reveal the risk and be unable
to conceal the lie. Feldman et al. (2009) showed that if management is negative about future
prospects, it uses more negative words in the disclosure. Combining this evidence on
management's word choices, performance measurement and the negative tone that reflects risk,
disclosures are expected to be negative and, when the disclosures conceal some risk, bank
performance is expected to be negatively associated with them as well.

Section 3 Methodology

The MD&A is an important section of the 10-K report, because these filings contain more detail
and information than the annual report and are required by the SEC. The MD&A section is
interesting because it provides management's overview of the operations over the past years and
of the situation the company is in. Structurally, 10-Ks need to be filed within 60 days after
the end of the fiscal year. If this is not the case, the return of the security does not match
the 10-X. The focus is on the 10-Ks and the 10-Qs, since the latter are also required to
include an MD&A section. Excluding 10-Ks from the sample would bias the observations and, more
importantly, the 10-Qs make it possible to see whether management's tone changed compared to
the previous 10-Q. The 10-Q provides a shorter time window before the onset of the financial
crisis. Although this section is unaudited, bank managers


Therefore, this section is essential for studying risk, because bank managers might reveal some
private information about risk exposures they were aware of.

To quantify the six word categories (uncertain, weak modal, negative, positive, legal and
strong modal), the same method is used as in Loughran & McDonald (2013). The wordlists have
been used in prior literature to determine the magnitude of the tone. The six wordlists were
created specifically for financial reports. The first step is to write a program that counts
the words in all six categories using the parsing procedure described by Loughran & McDonald
(2011); after parsing, we remove all ASCII parts consisting of non-alphabetic characters and
then summarize the affected words in a table. After defining the tone of the annual reports, we
estimate the excess returns and performance of each bank according to the tone of the firm.
Control variables such as firm size and outstanding debt are taken from the literature
(Feldman, Govindaraj, Livnat, and Segal, 2010). Both Tetlock (2007) and Loughran & McDonald
(2011) note that positive words are harder to measure by program, since positive words can be
used in a sentence with a negative meaning. For example, “good” is a positive word, but in “not
very good” the words “not very” make the phrase negative. This research therefore focuses on
negative rather than positive tone. This study covers SEC filings¹ that would otherwise need to
be collected manually. One way to do this is to write a program in Visual Basic for
Applications (VBA). A VBA program can download all the listed SEC filings automatically. After
the download process is finished and all the files are collected, a second program is needed to
extract the MD&A part of the report from the 10-X’s. The quarterly reports are from U.S.
publicly listed firms, including the banking sector.

¹ SEC filings refers to the MD&A section of the 10-K.
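The cleaning step described above can be illustrated with a minimal Python sketch that strips
non-alphabetic characters from an MD&A extract and splits it into upper-case tokens. The
function name `tokenize` and the sample sentence are assumptions made for illustration; this is
neither the Loughran & McDonald parser nor the VBA program used in this study.

```python
import re


def tokenize(text: str) -> list[str]:
    """Return upper-case alphabetic tokens from raw filing text."""
    # Replace every run of non-letter characters (digits, punctuation,
    # markup debris, whitespace) with a single space, then split.
    letters_only = re.sub(r"[^A-Za-z]+", " ", text)
    return letters_only.upper().split()


# Hypothetical MD&A fragment, used only for illustration.
sample = "Net charge-offs increased 12% in Q2; credit losses may worsen."
tokens = tokenize(sample)
print(tokens)
```

Note that this simple rule also splits hyphenated terms and drops numbers, which may or may not
be desirable depending on how the wordlists treat such terms.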

Section 3.1 The wordlist

This linguistic research is based on a basic computational approach that counts the frequency
of words. Most researchers measure the tone of a disclosure based on frequency counts of words,
assigning a weight to each word. The 10-Ks and 10-Qs are treated as documents full of words in
which the order of the words no longer plays a role. This representation is called a vector of
term-frequency counts. To measure the tone of a document, a wordlist has to be developed that
separates positive words from negative words. Wordlists widely used in earlier studies are
Diction and the General Inquirer. Diction was developed by Roderick Hart and is used for
political communication, while the General Inquirer was developed by Philip Stone and has been
used for decades in social psychology. Because these wordlists are general English
dictionaries, they are not specific enough for the domain of financial disclosure. According to
Li (2010), the classification of tone based on Diction and the General Inquirer does not
provide sufficient accuracy. According to Henry & Leone (2009), general wordlists omit words
that are considered positive or negative in the financial domain, while including other words
that should not be included. This calls for another wordlist. Loughran & McDonald showed in
their study that, in the 10-Ks they analyzed, almost three-fourths (73.8%) of the words that
were identified as


contexts (Loughran & McDonald, 2008). Another wordlist is made available by Loughran &
McDonald. The wordlist is described in their paper “When Is a Liability Not a Liability?
Textual Analysis, Dictionaries, and 10-Ks” (2010). The wordlist was created by analyzing all
10-Ks filed from 1994 to 2007; words that occur in more than five percent of the analyzed
documents are classified into the six-category wordlist. Since I am doing research on financial
reports, I use Loughran & McDonald’s wordlists, updated until March 2015, for analyzing the
10-X’s². An advantage of the Loughran & McDonald wordlists over other wordlists in this
research is that they do not require words to be collapsed to their roots, e.g. aardvarks to
aardvark. The wordlist already includes both aardvark and aardvarks, so there is no need to
collapse aardvarks into aardvark.
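To make the term-frequency idea concrete, the following Python sketch counts how many tokens of
a text fall into each word category. The two miniature wordlists are hypothetical stand-ins; the
actual categories and entries come from the Loughran & McDonald wordlists and are far longer.
Inflected forms (here LOSS and LOSSES) are listed separately, so no stemming is applied.

```python
import re
from collections import Counter

# Hypothetical miniature wordlists, for illustration only.
WORDLISTS = {
    "negative": {"LOSS", "LOSSES", "IMPAIRMENT", "DEFAULT"},
    "uncertain": {"MAY", "UNCERTAIN", "APPROXIMATELY"},
}


def category_counts(text: str, wordlists: dict = WORDLISTS) -> dict:
    """Count tokens per word category, ignoring word order."""
    tokens = re.findall(r"[A-Za-z]+", text.upper())
    freq = Counter(tokens)  # term-frequency vector of the document
    counts = {
        category: sum(freq[word] for word in words)
        for category, words in wordlists.items()
    }
    counts["total"] = len(tokens)  # total word count, useful for proportions
    return counts


text = "Losses may increase. A default may cause further loss."
print(category_counts(text))
```

Dividing a category count by the total word count gives the proportion of, for example,
negative words in the document.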

² 10-X’s refers to the 10-Qs and the 10-Ks used in this research. I use as much as possible available

Section 3.2 Defining tone

Davis et al. (2008) study the language of earnings press releases. They count
optimism-increasing and optimism-decreasing words. As their sentiment measure they use the
difference between the percentage of optimistic words and the percentage of pessimistic words.
To compare the periods before and after the earnings press release, they use the lagged
difference between the two periods.

In this study, I apply only the frequency measure of words, using the same method as Henry
(2008). While Henry used different wordlists for financial disclosures, among others Diction
and the General Inquirer, I use the wordlist from the Loughran & McDonald Master Dictionary,
updated to March 2015. Doran et al. (2012) and Henry & Leone (2010) define the tone of earnings
announcements using the Henry measure in two ways. The first (H1) is the ratio of the number of
positive words to the number of negative words:

$$H_1 = \frac{\#\,\text{Positive Words}}{\#\,\text{Negative Words}}$$

The second Henry measure (H2) is:

$$H_2 = \frac{\#\,\text{Positive Words} - \#\,\text{Negative Words}}{\#\,\text{Positive Words} + \#\,\text{Negative Words}}$$

So the tone is calculated as the number of positive words minus the number of negative words,
divided by the total number of positive and negative words.
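The two Henry measures can be sketched in Python as follows. The function names and the word
counts in the example are hypothetical; how documents without any tone words should be treated
is not specified in the text, so the sketch simply raises an error in that case.

```python
def henry_h1(positive: int, negative: int) -> float:
    """H1: number of positive words divided by number of negative words."""
    if negative == 0:
        raise ValueError("H1 is undefined when there are no negative words")
    return positive / negative


def henry_h2(positive: int, negative: int) -> float:
    """H2: (positive - negative) / (positive + negative), bounded in [-1, 1]."""
    total = positive + negative
    if total == 0:
        raise ValueError("H2 is undefined when there are no tone words")
    return (positive - negative) / total


# Hypothetical counts: 30 positive and 60 negative words in one MD&A section.
print(henry_h1(30, 60))
print(henry_h2(30, 60))
```

H2 is bounded between -1 (only negative words) and 1 (only positive words), which makes it
easier to compare across documents of different length than H1.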

Loughran & McDonald’s Master Dictionary is an aggregate wordlist which is useful for textual

analysis in financial applications and extended to 10-K documents.

Section 3.3 Dependent variable

Many researchers in the field of linguistic research (Engelberg et al., 2012; Feldman et al.,
2008; Demers and Vega, 2011) use the event study methodology and define the dependent variable
as the cumulative abnormal return (CAR) over a predetermined event window. However, the CAR is
precise and well suited to short-term event studies, or to studies where the event window is


study measures performance over a few years, and therefore the CAR is not applicable. Numerous
empirical studies rely on the financial theory that underlies stock return as a performance
measure. This theory implies that the stock price reflects the current status of the company,
so its return may reflect performance during the crisis.

Section 3.4 Excessive risk taking level of delisted companies

Cziraki (2015) discusses the role of insiders. He uses data from the years prior to the crisis
to measure how much insiders sell in response to stock price increases or changes in the
book-to-market ratio, that is, how many shares insiders sell off in a normal scenario without
being influenced by inside information. I add an equivalent component in order to see the
effect of inside information. Banks are differentiated according to whether they were strongly
affected by the financial crisis or were not, or only barely, affected. In other words, banks
that were delisted during this period are assumed to have taken excessive risk. A stock is
removed from the exchange when the security is no longer in compliance with the listing
requirements. Consequently, the expectation is that banks that suffered from the financial
crisis have significantly more negative words in the MD&A section because they took more
excessive risk. It is convenient to build a treatment group consisting of banks that were
affected during the financial crisis and a control group consisting of banks that were either
not aware of the risks or did not participate in the excessive risk taking on mortgage-backed
lending. To understand if


divide the banks into two categories: a delisted and a non-delisted group. With these groups,
the excess return or the stock return over the time period can be measured. If the excess
return is stable over time, it can be assumed that no inside information leaked into the stock
price. Following Cziraki (2015), banks with a CRSP³ delisting code of 500-599 (“dropped”) or
200-299 (“mergers”) are assigned to the low-performance group, while the remaining banks are
assigned to the high-performance group. In summary, banks that were delisted during the crisis
for these reasons are treated as the group that did not survive the crisis and are assigned to
the non-survivor group. Banks that did survive the crisis are assigned to the survivor group.
The non-survivor group serves as the group that actually had excessive risk on its balance
sheet, because the crisis worsened its position to the point that these banks did not survive
without external help or ceased to exist because they violated the listing requirements.
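The grouping rule described above can be sketched in Python. The function name, the use of
`None` for banks without a delisting record, and the example codes are assumptions for
illustration; only the code ranges 200-299 and 500-599 are taken from the text.

```python
from typing import Optional


def survival_group(delisting_code: Optional[int]) -> str:
    """Assign a bank to the survivor or non-survivor group.

    delisting_code: CRSP delisting code, or None if no delisting is recorded
    (an assumed convention for this sketch).
    """
    if delisting_code is None:
        return "survivor"
    merged = 200 <= delisting_code <= 299   # "mergers"
    dropped = 500 <= delisting_code <= 599  # "dropped"
    return "non-survivor" if merged or dropped else "survivor"


# Hypothetical banks and codes, for illustration only.
banks = {"Bank A": None, "Bank B": 231, "Bank C": 574, "Bank D": 100}
groups = {name: survival_group(code) for name, code in banks.items()}
print(groups)
```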


Section 3.5 The start of the financial crisis

Subprime lending refers to loans made to borrowers who have difficulty maintaining the
repayment schedule. Because of the higher credit risk that results from this difficulty, these
loans come with poor collateral and, most of the time, a higher interest rate than normal loans
to offset that risk. According to Lemke et al. (2013), the majority of subprime loans were
packaged into mortgage-backed securities, which in turn defaulted, causing the financial crisis
of 2007-2008. Subprime lending thus triggered the financial meltdown, which peaked in August
2008. To illustrate the financial crisis, I use the historical TED spread chart from the
Federal Reserve. The TED spread has two components: the 3-month LIBOR rate and the 3-month
Treasury bill rate. LIBOR is the interbank lending rate, an average of the rates at which banks
borrow from other banks. The 3-month Treasury bill is used by the U.S. government to raise
capital from the public. The TED spread is

$$\text{TED Spread} = \text{3-Month LIBOR Rate} - \text{3-Month Treasury Bill Rate}$$

This is a measure of the perceived credit risk in the U.S. economy. Since LIBOR measures the
lending rate between banks, a widening spread between LIBOR and the 3-month Treasury bill rate
indicates a lack of trust between banks and their counterparties. Figure 1 shows the TED spread
over the time window 2005-2009. The TED spread peaks in September 2008 at 315.25 basis points.
Normally, when the economy is stable, the TED spread varies only between 30 and 50 basis
points; its lowest value, 23 basis points, occurred in February 2005. The remarkable increase
in August 2008 marks the collapse of the economy, but the start of the increase can be traced
to July 2007. That was the first increase of the TED spread, and from that point it fluctuated
up to its highest peak in August 2008. July 2007 is of particular interest because it falls
right after the Q2-2007 announcement. Because the study covers the Management Discussion and
Analysis section, it is interesting to test risk awareness right before the collapse of the
economy. How far in advance were bank managers aware of the excessive risk they took? To test
whether, and how far in advance, bank managers were aware of the excessive risk, each 10-X is
assigned to the quarter of the year in which it is announced and published. For example, the
quarterly report of June 2006 is the 10-Q for Q2-2006. A rough approximation has to be made for
the date, however, because not all quarterly and annual reports have the same


for this study the rough estimate is made for every March, June, September and December.
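The two calculations in this section can be sketched in Python: the TED spread in basis points
and the assignment of a report to a quarter label such as Q2-2006. The function names and the
example rates are hypothetical, and mapping a report to a quarter by its calendar month is an
assumption that mirrors the rough March, June, September and December approximation.

```python
from datetime import date


def ted_spread_bp(libor_3m_pct: float, tbill_3m_pct: float) -> float:
    """TED spread in basis points from two rates quoted in percent."""
    return (libor_3m_pct - tbill_3m_pct) * 100


def quarter_label(report_date: date) -> str:
    """Map a report date to a label such as 'Q2-2006' (calendar quarters)."""
    quarter = (report_date.month - 1) // 3 + 1
    return f"Q{quarter}-{report_date.year}"


# Hypothetical rates in percent; not actual market data.
print(round(ted_spread_bp(5.36, 4.90), 2))

# The quarterly report of June 2006 is assigned to Q2-2006.
print(quarter_label(date(2006, 6, 30)))
```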

Section 3.6 Control variables

In this linguistic research it is important to understand why the tone of a document would
reveal prior knowledge of excessive risk taking. Therefore, I include control variables for
firm characteristics and common market variables, such as the book-to-market ratio and trading
volume. Trading volume is an essential control variable, because studies have found that
textual sentiment has a significant influence on trading volume. Tetlock (2007) found that
either extremely low or extremely high pessimism leads to temporarily high trading volume. Das
and Chen (2007) found a similar result: a strong correlation between trading volume and
sentiment. However, neither study showed that sentiment predicts future trading volume. The
book-to-market ratio is used to capture the current status of a publicly traded company. It
compares the company’s net asset value per share to its share price. This comparison helps to
determine how the market price of the company relates to its actual worth. A higher
book-to-market ratio indicates that the company is undervalued. It is a common control variable
because it allows the company to be compared with other companies.

Ferguson et al. (2015) test in their study whether positive and negative words in media content convey relevant information. They find a significant predictive relationship between the

high book to market ratio. They also find that high-attention news, whether positive or negative, affects the subsequent period's return, and they examine how the predictive power of media content for returns depends on firm size. Big firms tend to receive more media attention than small firms. Media attention to big firms comes together with investor recognition, and they investigate whether the stock of smaller firms, which have lower investor recognition, is compensated with a higher stock return. Their results show that firm size has a significant influence on the effect of media attention. After an announcement in the media, both positive and negative news have significant predictive power for the firm's next-period abnormal return. However, this predictive power is significant for small firms, whereas media content turns out to have no influence on the larger firms listed in the FTSE 100 (see footnote 4). It seems that firm size does influence the predictive power for returns. In my study I control the stock return for firm size, because firm size has a significant influence on stock return.

Market capitalization is not available on CRSP; however, it can be calculated as follows. Following the instructions given by CRSP, I collect the average number of shares outstanding between 2005 and July 2007. Subsequently, I multiply the shares outstanding by the average share price, also between 2005 and July 2007, which gives the market capitalization of the company over that timeframe.
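As a minimal sketch of this calculation, the average shares outstanding can be multiplied by the average share price over the same window. The function name and the numbers below are hypothetical, and the units depend on how the data provider reports shares and prices.

```python
import math
from statistics import mean


def average_market_cap(shares_outstanding, prices):
    """Average shares outstanding times average share price over one window."""
    return mean(shares_outstanding) * mean(prices)


# Hypothetical observations for one bank between 2005 and July 2007.
shares = [100, 120]        # shares outstanding at two dates
prices = [10.0, 12.0]      # share price at the same dates

market_cap = average_market_cap(shares, prices)
ln_market_cap = math.log(market_cap)   # natural log, as used for the control variable
```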

4 FTSE is the stock index listed on the London Stock Exchange where the 100 firms with the highest market

Section 3.7 The regression

$$\text{Stock return} = \alpha + \beta_1 \cdot H2 + \gamma_2 \cdot \ln(\text{Market Capitalization}) + \gamma_3 \cdot \ln(\text{Long Term Debt}) + \gamma_4 \ln(\text{Net Income}) + \varepsilon$$
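A regression of this form can be estimated by ordinary least squares. The sketch below is not the statistical package used for the results in this study; it solves the normal equations in plain Python, and the helper names and toy numbers are hypothetical. It only shows how an intercept and slope coefficients are obtained from a response and a set of regressors.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (perfectly collinear regressors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y, columns):
    """Return [intercept, slope_1, slope_2, ...] from the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Toy data for five hypothetical banks: stock return, mean H2 tone, ln(market cap).
returns = [0.1, -0.2, 0.3, -0.5, 0.0]
mean_h2 = [-0.5, -0.4, -0.3, -0.45, -0.2]
ln_mcap = [10.0, 12.0, 9.0, 14.0, 11.0]

coefficients = ols(returns, [mean_h2, ln_mcap])  # [alpha, beta_1, gamma_2]
```

Further controls such as ln(Long Term Debt) and ln(Net Income) would be passed as additional columns. Standard errors and significance tests are deliberately left out of this sketch.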

Section 4 Data

This section describes the data and the sampling approach. I use quarterly and annual data from CRSP, starting in fiscal year 2005 and ending at fiscal year end 2009. The reason for using U.S. firms is that publicly listed companies, including those in the banking sector, are required to file their disclosures and earnings announcements with the SEC. The first sample is retrieved from the Compustat banking sector. For financial, accounting and industrial data I refer to Compustat. For historical data on prices and market capitalization I refer to CRSP.

Table 1 provides the number of extracted files and the number of positive and negative words found in the files. Of the 13,945 extracted Management Discussion and Analysis sections, there were 53 files in which no negative words were found and 74 files in which no positive words were found. The maximum numbers of positive and negative words found in a Management Discussion and Analysis section are 790 and 3,260 respectively. Further investigation of these maxima showed that the file with the most positive words is exactly the same file that has the most negative words. This can be explained by the file being larger than other files, so that its total word count is larger than that of other files. This will result in that

words and negative words; these percentages give the number of positive words found divided by the total number of positive words in the wordlist. A closer look at the minimum negative percentage, 0.00042, shows that some files contain barely any negative words. This follows because each file has at least one positive or one negative word; see the minimum value of 1 for the total number of positive and negative words found in Table 1. To compute the tone of a document I use the Henry 1 measure as well as the Henry 2 measure. Across all analyzed files, both tone measures indicate a higher share of negative words. For Henry 1 the denominator is on average larger than the numerator, which implies that more negative than positive words were found (mean = 0.4044). The Henry 2 measure shows the same result. The mean of Henry 2 is -0.4516, which implies that, on average, the tone of the documents is negative. This is in line with the literature: Cziraki (2015) and Larcker & Zakolyukina (2012) argue that executives tend to use more negative words, either to reflect the risk or because they are uncomfortable lying, so that a more negative tone is present in disclosures regarding risk.
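The two tone measures can be illustrated with a short sketch, taking Henry 1 as positive words divided by negative words and Henry 2 as positive minus negative words divided by their sum. The tiny wordlists and the function names are hypothetical placeholders, not the financial wordlists or the program used in this study.

```python
import re

# Illustrative wordlists only; the study uses full financial wordlists.
POSITIVE = {"growth", "improve", "strong"}
NEGATIVE = {"loss", "risk", "decline", "weak"}


def count_words(text, wordlist):
    """Count how many tokens of the text appear in the wordlist."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(1 for token in tokens if token in wordlist)


def henry_tone(text, positive=POSITIVE, negative=NEGATIVE):
    """Return (positive count, negative count, H1, H2) for one document.

    H1 = pos / neg and H2 = (pos - neg) / (pos + neg); a measure is None
    when its denominator is zero.
    """
    pos = count_words(text, positive)
    neg = count_words(text, negative)
    h1 = pos / neg if neg else None
    h2 = (pos - neg) / (pos + neg) if (pos + neg) else None
    return pos, neg, h1, h2


sample = ("Strong growth offset the loss; credit risk and the decline "
          "in margins remain a risk.")
pos, neg, h1, h2 = henry_tone(sample)
```

A document with more negative than positive words gives H1 below one and H2 below zero, which matches the sign of the sample means reported above.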

Table 2 gives the descriptive statistics for 2005Q1; since each subgroup has different descriptive statistics, I do not display them all in the Appendix. Table 2A lists all the variables used in this study, each with a short description.

Section 4.1 Extracting Management Discussion and Analysis

With the help of Compustat I gathered the banks operating between the beginning of 2005 and the end of 2009. Primarily, the operating banks are listed in the

CIK-identifier (see footnote 5). I downloaded the filings of the banking sector from 2005 until 2009. The first step is to download the Master Files and then to map the target forms from the master file to specific target file names. In total, 15,774 10-X’s were downloaded for this period. In order to analyze the tone of the MD&A section, this part needs to be cut out of the 10-X document. With a supporting tool written in VBA, provided by T. Jochem, the MD&A is extracted using keywords. A keyword, typically Management or Discussion, marks the start of the Management Discussion and Analysis section, and all the text that follows is selected. I then look for keywords that mark the end; all the text selected up to that point is taken as the Management Discussion and Analysis section, and a new text file is created containing only this selection. Since the files are downloaded from the SEC’s EDGAR system, the 10-X’s are all in HTML format. Converting them into text files and removing the HTML tags gives each file a layout that differs from its original HTML layout. Older files are also less structured than newer files, so the layout differs across the years. Of the 15,774 examined 10-X’s, 1,009 contained only a reference to the MD&A, in some cases to the annual report or the stockholders’ report. Another 178 10-X’s did not have a Management Discussion and Analysis section, and 51 are lost because the Management Discussion and Analysis section is mixed with the following section, e.g. Financial Report or Quantitative and Qualitative Disclosures about Market Risk. A further 370 10-X’s are excluded from the data, because of the different layout caused by

5 CIK-identifier is the Central Index Key used by EDGAR Company Filings for companies who have filed disclosures

converting files into Doc files, the specific Management Discussion and Analysis section got lost. The section was not traceable with the supporting tool, because either the start or the end of the search criteria was missing, so the match was not specific enough to cut out the Management Discussion and Analysis section. More specifically, the section in these files mostly does not start with the regular keyword. The 15,774 10-X’s are thereby reduced to 13,945 documents.
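The keyword-based extraction described in this section can be sketched as follows. This is not the VBA tool provided by T. Jochem; it is a simplified illustration in Python, and the start and end patterns are assumptions chosen for the example. Real filings contain tables of contents and cross-references that a production tool would have to handle.

```python
import html
import re

# Assumed start and end markers; the actual tool may use other keywords.
START = re.compile(r"management(?:['’]s)?\s+discussion\s+and\s+analysis",
                   re.IGNORECASE)
END = re.compile(r"quantitative\s+and\s+qualitative\s+disclosures?"
                 r"\s+about\s+market\s+risk", re.IGNORECASE)


def strip_html(raw):
    """Remove HTML tags, decode entities and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", raw)
    text = html.unescape(text)
    return re.sub(r"\s+", " ", text).strip()


def extract_mdna(raw_html):
    """Return the text between the start and end keyword, or None.

    None signals that either marker is missing, which corresponds to the
    filings that could not be traced and were dropped from the sample.
    """
    text = strip_html(raw_html)
    start = START.search(text)
    if start is None:
        return None
    end = END.search(text, start.end())
    if end is None:
        return None
    return text[start.start():end.start()].strip()
```

Returning `None` instead of a partial match mirrors the decision to exclude filings where the start or the end of the section could not be identified.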

Section 4.2 Merging data

From the extracted MD&A files I have the CIK code of the banks concerned, which is used by EDGAR company filings. From that point I used the CRSP/Compustat Merged Database from WRDS and searched the entire database for CIK entries, which resulted in 8,047 unique companies, or 10,528 unique companies if I use the CUSIP entries listed in the CRSP/Compustat Merged Database from January 2005 until December 2009. Subsequently, I used the lookup function in Microsoft Excel to match each CIK code retrieved earlier to its corresponding CUSIP code. Using this method I obtain a CUSIP code for every extracted file from the previous section, 13,945 files. Because the CRSP data set has only the CUSIP identifier, the remaining securities files from CRSP are matched on the CUSIP identifier. Using the retrieved CUSIP codes of the banks, I can merge the CRSP securities files with the tone of the 10-X’s. The CRSP data set provides the securities data, including stock prices and shares outstanding, and Compustat provides the data on Net Income and Long Term Debt. After defining H1 and H2 I need their averages for the regression, and I had to calculate the average

raw data collected from 2007 securities and 2009 securities from the CRSP data set had more than 7,000 companies, but as explained above, by using the lookup function in Microsoft Excel I obtained an exact match for each bank in the sample data.
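The identifier lookup, done in this study with the lookup function in Microsoft Excel, can be sketched with a dictionary. The function names and the identifiers below are hypothetical and made up for the example; the sketch only assumes that a CIK can be written as a ten-digit, zero-padded string.

```python
def normalize_cik(cik):
    """Write a CIK as a ten-digit string with leading zeros."""
    return str(int(cik)).zfill(10)


def attach_cusip(mdna_rows, cik_to_cusip):
    """Add a CUSIP to every row whose CIK is found in the lookup table.

    Returns (matched, unmatched) so that dropped observations stay visible.
    """
    matched, unmatched = [], []
    for row in mdna_rows:
        cusip = cik_to_cusip.get(normalize_cik(row["cik"]))
        if cusip is None:
            unmatched.append(row)
        else:
            matched.append({**row, "cusip": cusip})
    return matched, unmatched


# Hypothetical tone records and a hypothetical CIK-to-CUSIP table.
tone_rows = [
    {"cik": "1", "quarter": "2007Q1", "h2": -0.41},
    {"cik": "0000000002", "quarter": "2007Q1", "h2": -0.35},
    {"cik": 3, "quarter": "2007Q1", "h2": -0.50},
]
lookup = {"0000000001": "AAAAAA101", "0000000003": "CCCCCC303"}

matched, unmatched = attach_cusip(tone_rows, lookup)
```

Keeping the unmatched rows in a separate list makes it easy to report how many observations are lost when identifiers differ between databases.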

Section 5 Results

This section presents the results of the tests. The focus is on how far in advance the bank managers were aware of the excessive risk they took. For this part I use different event windows to capture the effect of the negative tone used in the Management Discussion and Analysis section. Once the financial crisis breaks out, all bank managers are aware of the risk and will extensively discuss the risk and the future prospects of the company. This study therefore focuses on the periods before the collapse of the economy. Table 3 summarizes the different time periods before the collapse in July 2007 and reports, for each period, the average of the Henry 1 and the Henry 2 tone measure. While 13,945 files were extracted from the SEC EDGAR filings, each company has multiple documents over the years. Some banks published, for example, five 10-X’s before they were delisted, while other banks were not delisted, survived until July 2007 and published a 10-X for every quarterly report. In total, 8,213 files are assigned to the period 2005Q1 – 2007Q2, while the remaining 5,732 files fall after the outbreak of the financial crisis. The first lag, or the first period analyzed, is 2007Q2 – 2007Q2. This is the period right before the crisis broke out. On average the Henry 1 and Henry 2 tone measures are 0.4338 and -0.4251 respectively. While inspecting these tone

measure as the Henry 2 measure. The theory of Tetlock (2007) is in line with this finding. According to Tetlock (2007) and Larcker & Zakolyukina (2012), a deceiver who is afraid of being caught in a deceptive act will use more general terms, fewer self-references and shorter answers, and, most interestingly, will be more negative in tone.

For the next part I show the correlations between the variables described in section 4, summarized in table 4. Afterwards I run the regression for the period from 1 January until 1 July 2007, using the stock return between 1 July 2007 and 1 January 2009. The quarter (Q) indicates the period before the financial crisis. For each period I run the regression to measure the sentiment of the Management Discussion and Analysis section and to analyze its relation to the stock return. Table 4 shows the correlation matrix at the 5% significance level. In the collected data I expect multicollinearity only between the variables meanH1 and meanH2, because these two variables are computed from the same data and are therefore highly intercorrelated. The correlation coefficient between meanH1 and meanH2 is 0.9236***, which is high and significant at the 1% level. This would disturb my estimates, and for this reason I am not able to use both tone measures at the same time and check which of them provides the better measure of tone. Therefore, I had to omit one of the tone measures. I use Henry 2 as the tone measure, because it shows the net value of positive words, and by using the ratio, it shows that a higher value for this ratio is equal to a more negative tone of the Management Discussion and Analysis section.
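The high correlation between the two tone measures follows from their construction: with H1 = pos / neg and H2 = (pos - neg) / (pos + neg), it holds that H2 = (H1 - 1) / (H1 + 1), so H2 is an increasing function of H1. The sketch below illustrates this with a hand-written Pearson correlation; the word counts are hypothetical and the function name is chosen for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (positive, negative) word counts for five documents.
counts = [(10, 40), (20, 40), (30, 50), (15, 60), (40, 45)]

h1 = [pos / neg for pos, neg in counts]
h2 = [(pos - neg) / (pos + neg) for pos, neg in counts]

correlation = pearson(h1, h2)
```

Because one measure is a monotone transformation of the other, including both in one regression adds little independent information, which is why only one of them is kept.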

correlation of -0.1869. This significant correlation is explained by Longin & Solnik (2001), Ang & Chen (2002) and Hong et al. (2003), who find that stock returns are more highly correlated when the market declines. Firm size, expressed as market capitalization, is significantly negatively correlated with the stock return; the correlation is relatively weak, but significant and in line with the prior literature. The market did not decline between 2005 and 2006 (see figure 4); instead the economy, measured by GDP per capita (see footnote 6) as the sum of the gross value by all the producers in the economy, was booming between 2005 and 2006, and in turn the relationship between stock return and firm size is relatively weak. Since the correlation coefficient is negative, the two variables move in opposite directions. If the economy were facing a downturn, then according to the literature the correlation coefficient would be higher, and vice versa.

Net income and Long Term Debt seem to move in the same direction, with a strong and significant correlation of 0.8366***. I found no literature that explains this high correlation or that documents the relationship between long term debt and net income; the two variables are used only as control variables across banks. However, from financial knowledge it is known that long term debt appears on the balance sheet, and the balance sheet and the income statement, where net income is calculated, are linked. One affects the other through double-entry accounting, where Assets = Liabilities + Owner’s equity.

6 GDP per capita is the gross domestic product divided by the midyear population. It estimates the total sum of gross value

Table 5 presents the regressions from January 2005, stated as 2005Q1, to 1 July 2007. The first thing to notice is that each quarter has a different number of observations (N): while 721 banks are evaluated in 2005Q1, only 583 banks in 2007Q2 have information available on stock return, market capitalization, long term debt and net income. Each column presents one subgroup, up to 2007Q2, when the financial crisis took place. Banks with missing values are omitted from the sample data. Table 6 shows the percentage increase in the number of banks analyzed. The subgroup starting in 2005Q1 contains more banks because more Management Discussion and Analysis sections are available for it. For each missing Management Discussion and Analysis section, I consistently omitted the bank from the sample data. And since the timespan between 2005Q1 and 2007Q2 covers ten quarterly reports, it is only logical that it has more observations (N=721) than the timespan from 2006Q2 until 2007Q2 (N=664). The intercept of the regression is significantly different from zero for every quarter. The economic meaning of this intercept, or constant term, is that when all the independent variables are equal to zero, the stock return is equal to this value.

The control variables lnmarketcap, lnLTD and lnNI are included to exclude alternative explanations that are correlated with the stock return and the tone measure while testing the explanatory variable, which is the H2 tone measure. The correlation matrix showed a high correlation between the control variables lnLTD and lnNI. When the control variables do not change my explanatory variable, the H2 measure, it is better to

but the relevance of net income to stock return and risk is discussed as important in section III; therefore I keep this variable as a control, because the joint significance test showed that lnNI does contribute to predicting the outcome.

The economic meaning of meanH2 is how the net positive tone relates to the stock return. Table 5 presents the mean of the Henry 2 measure for every subgroup in the sample. A subgroup may contain several reports (10-K and 10-Q) per bank, and for each subgroup an average of the Henry 2 tone measure is computed and assigned to that subgroup. The average of the Henry 2 tone measure is -0.42744 between 2005Q1 and 2007Q2. Recall the definition of the Henry 2 tone measure: the positive words minus the negative words, divided by the total of positive and negative words. A negative value means that more negative than positive words were found in the Management Discussion and Analysis section. This gives a rough idea of how negative the tone of the document is. According to Table 5, the sentiment of the documents is quite stable from 2005Q1 until 2006Q2, with two outliers that lie above the average in 2006Q1 and 2006Q2. From 2006Q3 the use of negative words seems to rise, which is noticeable since the ratio is increasing, indicating that more negative words are used in the Management Discussion and Analysis section. As discussed in sections 2 and 3, the results of Cziraki (2015) suggest that bank managers are aware of the risk, and negative word choices seemingly lead up to the financial crisis. This is especially recognizable when taking a closer look at 2007Q1 and 2007Q2. The average tone in these period are

negative tone is already forming in that period.

Table 5 presents the regression coefficients of the independent variable, the Henry 2 tone measure. Recall from Feldman et al. (2009) that the slope coefficient can be interpreted as a stock return. The economic meaning is that if meanH2 increases by one standard deviation, the stock return increases or decreases by that percentage, since stock returns are already given in percentages. If the Henry 2 tone measure goes up by one unit, all else equal, the stock return increases by 0.635 for 2005Q1. The coefficient is positive for the subgroups 2005Q1 until 2006Q4 and negative for 2007Q1 and 2007Q2, but it is insignificant throughout. This regression only makes sense if the literature review holds, namely that when bank managers speak about risk using negative sentiment, the stock return goes up. Although this may seem inconsistent with prior studies about stock return, compare figure 4: while the economy was actually growing, a more negative tone results in an increase in stock return only if the bank manager is straightforward and purposely not withholding any risk concerning the bank. Note that all the tone measures are insignificant. The coefficient of the Henry 2 tone measure is negative and not statistically significant, but the minus sign before the coefficient has an important meaning. The minus sign makes sense when bank managers really are negative about their position, originating subprime mortgages and lending to less creditworthy borrowers as discussed in section 2, thereby using more negative words and revealing their excessive risk taking on account of others according to the same study result of

2007Q2. Keeping in mind that the regression results are insignificant for the Henry 2 tone measure, the following statement can be made: the bank managers were aware of the risk in 2007Q1. This is because the regression coefficient of the Henry 2 tone changed dramatically from a positive coefficient, which only makes sense when bank managers reveal risk exposure that benefits the bank and results in a positive stock return, to a negative coefficient. This change can be related to the analysis of the mean of the Henry 2 tone measure. Note that since 2007Q1 there has been an increase in negative word choices. The quarters before 2007Q1, namely 2006Q3 and 2006Q4, already show an increase in negative word choices, but the change between those subgroups is weak, while the change in 2007Q1 is notable.

Table 7 is the regression of 2007Q1. The coefficient of meanH2 is -0.6845 (1.0004). So for every unit increase in meanH2, a change of -0.6845 in the stock return is expected, holding the other variables constant. This implies that the tone of the analyzed documents is negatively related to the stock return. If the number of negative words increases, so will meanH2, which increases the tone of the document. An increase in tone means more negative words in the document, and a decrease in the stock return.

Section 6 Conclusion

This study focused on linguistic research on disclosures by banks that were involved in the financial crisis of 2007 – 2008. For this study I analyzed 10-Q and 10-K disclosures, focusing on the Management Discussion and Analysis section. This section of the 10-Q and the

This section is meant for discussion of different objectives of the company, including management and management style, but more importantly the prospects of the company and the risks it is involved in. With a supporting tool written in VBA I extracted the Management Discussion and Analysis section of all the downloaded 10-Q and 10-K filings from January 2005 until December 2009. After the extraction I wrote a program that reads in all the Management Discussion and Analysis files and counts the number of positive and negative words in each file. After the word counting, I collected raw data from CRSP and Compustat to merge with the Management Discussion and Analysis files, and I was able to set the tone of each document using the Henry 1 and Henry 2 tone measures. The next step is to test the hypotheses and answer the questions mentioned in section. The first question is how much bank managers reveal about their knowledge of their actions. There was no variable that described how much they knew about their actions, but section V yields a few results on how much they knew beforehand. Using the average Henry 2 tone measure I found that from 2007Q1 negative word choices started to increase until 2007Q2, right before the crisis, which leaks information about the acknowledgement of their excessive risk-taking behavior regarding risky mortgage lending. Further, there is a separation between bank managers: some were indeed aware of the level of risk taking, and some did not participate in these risky lending programs. This could be investigated using the delisted banks in the sample, because the delisted banks needed help in order to survive, or did not survive the crisis. Delisted banks could assigned to be

taking on lending, they were not able to recover from an outbreak or an economic shock. It turns out that delisted banks were a better estimator for performance measurement. The following question, whether bank managers use more negative words in the MD&A section than in the years before, can again be tested using the mean of the Henry 2 measure. Note that there was no significant change in the tone measure, except that in 2007Q1 there is a notable increase in negative word choices, leading to a more negative sentiment of the disclosures. The next question is whether word choices in the reports reveal valuable information regarding risk that only insiders may have. This is not validated by this study; moreover, the existing literature and the study by Larcker & Zakolyukina (2012) suggest that there is no differentiation between a bank manager who possesses inside information and one who does not. According to Larcker & Zakolyukina (2012), in the context of linguistic research, insiders are not able to conceal information: following the theory of Vrij (2008), deceivers or liars are assumed to be unable to conceal the information, which is the risk in this case, because they do not feel comfortable when they lie. In contrast, Cziraki (2015) finds that insiders with high exposure are 20% more likely to sell off stock than insiders with low exposure. Inside information turns out to be something that can be implemented as a portfolio strategy, but linguistic research shows no difference between someone who possesses inside information and someone who does not. In other words, inside information regarding risk cannot be revealed through linguistic research, that is, through written words and word choices.

Discussion and Analysis section of the disclosures by banks are associated with the risk the bank managers took. To check whether the stock return is influenced by the sentiment of the Management Discussion and Analysis, ten different subgroups are created, one for each quarterly report, starting in 2005Q1. If the bank managers were aware of the risk, then the Management Discussion and Analysis should display some change in tone. This can be understood as a more negative tone, because risk is associated with a negative tone, as measured by the Henry 2 tone measure.

The results show that negative tone changes in the Management Discussion and Analysis are correlated with risk, expressed as negative word choices. Therefore, a change in the tone of a disclosure reveals some predictive information about the performance measurement. This shows that the Management Discussion and Analysis section does have information content. The results of this study are limited by a few factors. First is the data collection from different databases. Among the analyzed Management Discussion and Analysis files there were many missing values: when one database does not provide the corresponding values, as a result of using different identifiers, I omit the observation. Due to this problem many observations were lost; approximately 30% of the analyzed banks did not have corresponding values and are omitted. Second, different banks face different risks. In my study I did not create subgroups for different kinds of banks, e.g. investment banks, commercial banks, land development banks, savings banks, national banks, exchange banks and many more. If I had the option, I would have tracked down and assigned the banks to different subgroups, while now in this study all banks are treated

This study contributes to the field of linguistic research and to studies that are interested in the effect of risk expressed as negative word choices in the Management Discussion and Analysis section. It also contributes evidence on bank managers' private information, namely how many periods beforehand they were aware of the risk, expressed in subgroups by the number of periods in advance. Finally, it addresses the relationship between the tone and the stock return, for those who are interested in the behavior of securities or in selecting securities using non-financial disclosures (Feldman et al. (2009)). Regulators may also benefit from this study regarding bank managers' behavior.

Appendix

Figure 1 Key players and friction in subprime mortgage credit securitization.

Source: Ashcraft, A.B. & Schuermann, T. (2008) "Understanding the Securitization of Subprime

Figure 1 top ten subprime mortgage originators of 2005 and 2006

Source: Inside Mortgage Finance (2007): “The 2007 Mortgage Market Statistical Annual.”

Figure 2 The TED spread between 2005 and 2009

Source: TED Spread (2015) Retrieved from

Figure 3 GDP per capita in $ in U.S.

Source: GDP per capita (2015) Retrieved from

                     Observation  Mean      Std. Dev.  Min       Max
MD&A                 13945        495.4893  284.1115   1         987
negative             13892        124.5687  129.6858   1         3260
positive             13871        44.90981  41.439     1         790
Positive percentage  13871        0.126864  0.117059   0.002825  2.231638
Negative percentage  13892        0.053144  0.055327   0.000427  1.390785
H1                   13866        0.404426  0.22203    0.019802  5
H2                   13866        -0.45169  0.184458   -0.96117  0.666667

Variable               Observation  Mean       Std. Dev.  Min      Max
Stock return           771          .4006467   4.25734    -1       78.13309
meanH2                 771          -.4229542  .1641455   -.9375   .4492754
Long Term Debt         771          4390.544   22645.61   0        253228.2
Net Income             771          90.68383   449.1042   -28.656  5864.303
Market Capitalization  771          3252615    1.73e+07   460.92   2.43e+08

Stockreturn The return of the stock calculated as the price change including

dividend from 1 July 2007 to 1 January 2009

marketcapital Market capitalization calculated as the price times the shares

outstanding mid 2006 in million $

lnmarketcap Log 10 of the market capital

LTD Long term debt given in million $

lnLTD Log 10 of the long term debt

NI Net income given in million $

lnNI Log 10 of the net income

Delistedcompany Dummy variable for the companies within the timeframe if it got

delisted because of excessive risk taken. Takes on value 1 if delisted.

Foundpositivewords The total count of positive words

Foundnegativewords The total count of negative words

Posipercentage The amount of positive words found divided by the total amount of

positive words in the positive wordlist

Negapercentage The amount of negative words found divided by the total amount of

negative words in the negative wordlist

H1 The tone of the document set by positive words divided by negative

H2 The tone of the document set by positive words minus negative words

then divided by the total of positive and negative words

meanH1 The Henry 1 tone measure is calculated as the average tone during the

quarter before the financial crisis in 1 July 2007, depending on how

many periods before the crisis.

meanH2 The Henry 2 tone measure is calculated as the average tone during the

quarter before the financial crisis in 1 July 2007, depending on how

many periods before the crisis.

                                        Total analyzed files  average H1 tone  Std. Dev.  average H2 tone  Std. Dev.
2007-07-01 - 2007-04-01  2007Q2-2007Q2  715                   0.4159255        0.231487   -0.4440956       0.192966
2007-07-01 - 2007-01-01  2007Q2-2007Q1  754                   0.4275165        0.209601   -0.4303758       0.174818
2007-07-01 - 2006-10-01  2007Q2-2006Q4  783                   0.4363359        0.206338   -0.4220579       0.171633
2007-07-01 - 2006-07-01  2007Q2-2006Q3  805                   0.439557         0.207008   -0.4186862       0.170869
2007-07-01 - 2006-04-01  2007Q2-2006Q2  819                   0.4325627        0.205821   -0.4256192       0.171046
2007-07-01 - 2006-01-01  2007Q2-2006Q1  832                   0.4322163        0.214128   -0.4259615       0.164032
2007-07-01 - 2005-10-01  2007Q2-2005Q4  862                   0.4342687        0.215028   -0.4242811       0.163979
2007-07-01 - 2005-07-01  2007Q2-2005Q3  868                   0.4409433        0.22046    -0.4193069       0.165761
2007-07-01 - 2005-04-01  2007Q2-2005Q2  882                   0.4399043        0.220471   -0.420695        0.166704
2007-07-01 - 2005-01-01  2007Q2-2005Q1  893                   0.4396274        0.210229   -0.4203377       0.163757
                                                              0.43388576       0.214057   -0.42514169      0.170556

              Stock return  meanH1     meanH2    lnLTD      lnNI     lnmarketcap
Stock return  1.0000
meanH1        0.0129        1.0000
              0.7204
meanH2        0.0236        0.9236***  1.0000
              0.5124        0.0000
lnLTD         0.0705        -0.0244    -0.0452   1.0000
              0.0544        0.5061     0.2181
lnNI          0.0350        -0.0014    -0.0275   0.8366***  1.0000
              0.3399        0.9692     0.4544    0.0000
lnmarketcap   -0.1869***    0.0053     -0.0202   0.0332     0.0457   1.0000
              0.0000        0.8838     0.5752    0.3654     0.2135

Table 5 Regression result on stock return from 2005Q1 - 2007Q2

2005Q1  2005Q2  2005Q3  2005Q4  2006Q1  2006Q2  2006Q3  2006Q4  2007Q1  2007Q2
721     712     699     696     676     664     656     638     615     583
1.26%   1.86%   0.43%   2.96%   1.81%   1.22%   2.82%   3.74%   5.49%

Source    SS          df   MS           Number of obs = 583
                                        F( 4, 578)    = 7.28
Model     574.943716  4    143.735929   Prob > F      = 0.0000
Residual  11404.8316  578  19.7315426   R-squared     = 0.0480
                                        Adj R-squared = 0.0414
Total     11979.7753  582  20.5838064   Root MSE      = 4.442

Stockreturn  Coef.      Std. Err.  t      P>t    [95% Conf. Interval]
meanH2       -.3772874  .958737    -0.39  0.694  -2.260321  1.505746
lnmarketcap  -1.117247  .2376577   -4.70  0.000  -1.584025  -.6504689
lnLTD        .8531035   .3544818   2.41   0.016  .1568741   1.549333
lnNI         -.4276372  .3649933   -1.17  0.242  -1.144512  .2892377
_cons        4.942673   1.476173   3.35   0.001  2.043355   7.841991

Table 7 Regression on 2007Q2