
Did U.S. households diversify their stock portfolios in more companies after the financial crisis? : do the independent variables still add value to the model after the crisis?


Academic year: 2021


DID U.S. HOUSEHOLDS DIVERSIFY THEIR STOCK PORTFOLIOS IN MORE COMPANIES AFTER THE FINANCIAL CRISIS?

Do the independent variables still add value to the model after the crisis?

Author:

M.P. Zwagerman

Student number:

10787844

Thesis supervisor: Dr. J.J.G. Lemmen

UNIVERSITY OF AMSTERDAM

AMSTERDAM SCHOOL OF ECONOMICS

BSc Economics & Business


PREFACE AND ACKNOWLEDGEMENTS

This thesis is part of the Bachelor study Economics and Business at the University of Amsterdam. The thesis was written between 31 October 2017 and 31 January 2018.

I would like to thank my supervisor, Dr. J.J.G. Lemmen, for his supervision and guidance through this period.

Statement of Originality

This document is written by Student Mark Zwagerman who declares to take full responsibility for the contents of this document.

I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it.

The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.


ABSTRACT

This research examines whether U.S. households diversified their stock portfolios across more companies after the financial crisis. The research question is answered by comparing the averages from before and after the crisis. Data from the Survey of Consumer Finances are used. This is a pooled cross-sectional dataset that is available every three years. The data from after the crisis cover the years 2010, 2013 and 2016. The data from before the crisis cover the years 2004 and 2007. The outcome is that the average number of companies held by U.S. households is lower after the crisis. The null hypothesis is rejected. However, in 2010 households were still recovering from the economic downturn of the crisis. When 2010 is removed from the dataset, the averages from before and after the crisis are almost the same. Again, the null hypothesis is rejected.

The sub research question examines whether the independent variables still add value to the model after the crisis. Age, total market value of the stock portfolio, education and income are the independent variables. The dependent variable is the average number of stocks of different companies held by U.S. households. To test whether these variables still add value after the crisis, an F-test is performed. The F-test statistic is significant. Therefore, at least one of the independent variables still adds value to the model.

Keywords: U.S. households, Survey of Consumer Finances, diversification, stocks, crisis.

JEL Classification: G14


TABLE OF CONTENTS

PREFACE AND ACKNOWLEDGEMENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 Introduction
1.1 Introducing topic
1.2 Earlier studies
CHAPTER 2 Literature review
2.1 Literature review
CHAPTER 3 Methodology
3.1 Research model
3.2 Hypothesis
CHAPTER 4 Data
4.1 The variables, descriptive statistics
CHAPTER 5 Results
5.1 F-test
CHAPTER 6 Conclusion
6.1 Conclusion
6.2 Future research
6.3 Limitations
REFERENCES
APPENDIX


LIST OF TABLES

Table 1 Means of the age variable 2004-2016
Table 2 Means of the total market value of stocks variable 2004-2016
Table 3 Dummy variable education 2016
Table 4 Means dummy variable education 2004-2016
Table 5 Median variable income 2016
Table 6 Median variable income 2013
Table 7 Median variable income 2010
Table 8 Median variable income 2007
Table 9 Median variable income 2004
Table 10 Average number of stocks of different companies by U.S. households
Table 11 Regression result from data 2004-2016
Table 12 Regression result from data 2004-2007


LIST OF FIGURES

Figure 1 Average number of different companies held stock from by U.S. households over time


CHAPTER 1 Introduction

1.1 Introducing topic

When will the next financial crisis occur? Nobody knows. But according to an article in The New York Times (‘The Next Bubble’, 2010) a new crisis could happen at any time, because its foundation has already been laid. After the housing bubble of 2008, the bubble described by The New York Times would be the next crisis. There are some interesting similarities between the two: both feature irrational investment followed by a sudden flight of capital. The next bubble that The New York Times describes starts with the huge capital inflows that developing countries receive from foreign investors. These inflows complicate their macroeconomic management, for example by pushing up the value of their currencies and promoting fast credit expansion. Fast credit expansion usually leads to inflation, asset bubbles and bad loans. This is remarkably similar to the problems that occurred during the housing bubble of 2008. If a shock occurs, such as a default in Greece, the capital inflows will stop. The emerging markets of developing countries will then collapse and damage the weak balance sheets of American banks. As a result, the stock markets will also drop, just as they did in 2008.

In 2008 U.S. households could have limited some of their losses by holding a diversified portfolio.

Polkovnichenko (2005) found that the median number of stocks held by U.S. households was 4, and that 90% of the investors held fewer than 10 different stocks. This is interesting because those investors are more vulnerable to firm-specific risk. With diversification investors can reduce this risk and remain exposed only to systematic risk, which cannot be removed by diversification. According to Kelly (1995) more than 20 different stocks are needed to eliminate idiosyncratic risk; only undiversifiable risk then remains. There are a few reasons why investors do not diversify. According to Benartzi (2001) a possible reason is that employees like to invest in their own employer’s stock. Campbell (2006) confirms this. In line with this is the familiarity bias: investors invest in firms they are familiar with (DeMarzo, Kaniel and Kremer, 2004). But despite these findings, did U.S. households learn from the past that holding a better diversified portfolio could shield them from adverse stock movements? So, my main research question is: did U.S. households hold stocks from more different companies after the crisis?

1.2 Earlier studies

This paper adds value to the existing literature because the period analyzed differs from that of existing papers. The research periods in previous studies were mostly shorter than the research period of this paper. The research periods of Yunker and Melkumian (2013), King and Leape (1987) and Kelly (1995) lasted one year, while Blume and Friend (1975) had [...] The research period of Polkovnichenko (2005) lasted 18 years but preceded the time period of this paper, so there is no overlap: his time period was 1983-2001 while the time period of this paper is 2004-2016. Therefore, this paper gives new insights into the development of the diversification level of U.S. households’ stock portfolios. Because the research period in the papers mentioned above was often very short, it is not possible in those papers to compare the diversification level before and after an event. Only Polkovnichenko (2005) could have investigated whether the level of diversification differs after some event. But that research did not look at a before-and-after effect; its interest was the general extent of diversification of individual households over time.


CHAPTER 2 Literature review

2.1 Literature review

Previous literature showed that the number of stocks held is an effective way to measure diversification. Therefore, diversification will be measured by the average number of stocks a U.S. household possesses. If this number increases, the portfolios of U.S. households are considered better diversified. This way of measuring the diversification level was also used by Uhler and Cragg (1971), Kelly (1995), King and Leape (1987) and Polkovnichenko (2005). There is, however, a complication in using the number of stocks held as a measurement tool: it can overstate the level of diversification (Kelly, 1995). This is because it is not certain that the level of diversification increases when the number of stocks held increases. A portfolio that contains a few stocks but is well diversified is still better protected against firm-specific risk than a portfolio that contains many stocks but is poorly diversified. This observation must be accounted for in the final conclusion of this paper. An advantage of this measurement tool is that the results are easy to interpret: it will be easy to notice whether the number of stocks held increases after the crisis. The only difference between previous research and the current research is that this research focuses on the number of different companies in which stock is held. Therefore, it is a better measurement tool for diversification. This measurement tool was also chosen partly because it was included in the dataset in this form.
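Why the count of stocks relates to firm-specific risk can be illustrated with a small sketch. It assumes an equally weighted portfolio whose stocks have the same idiosyncratic variance and uncorrelated idiosyncratic components. These simplifying assumptions and the function name are made here for illustration and do not come from the cited papers. Under them, the idiosyncratic variance of the portfolio is 1/N of that of a single stock.

```python
def remaining_idiosyncratic_share(n_stocks):
    """Share of one stock's idiosyncratic variance left in the portfolio.

    Assumes equal weights, equal idiosyncratic variances and uncorrelated
    idiosyncratic components (illustrative assumptions, not the thesis data).
    """
    if n_stocks < 1:
        raise ValueError("a portfolio needs at least one stock")
    return 1 / n_stocks


# Compare the holdings sizes mentioned in the literature discussed above.
for n in (1, 4, 10, 20):
    share = remaining_idiosyncratic_share(n)
    print(f"{n:>2} stocks: {share:.0%} of single-stock idiosyncratic variance remains")
```

When the idiosyncratic components are positively correlated, for example for stocks in the same industry, the reduction is smaller. That is one way in which the raw count can overstate diversification.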

To answer the sub research question of this paper a model has to be made. Which independent variables are included in this model is based on previous literature. The dependent variable is the average number of stocks of different companies held by U.S. households. An independent variable is included if it can be expected that it has a significant effect on the dependent variable. Age will be included as an independent variable.

According to Dicks-Mireaux and King (1984) Canadian households are better diversified when the head of the household is older than forty. The probability that a Canadian household held each of the eleven assets was higher for households where the head of the household was older than forty. That age has a positive influence on the level of diversification in households is shown by Goetzmann and Kumar (2008). The effect of age was positive (0.117) and significant (t-statistic of 10.80). This is in line with King and Leape (1987). They state that the information possessed by many individuals about sophisticated investment opportunities may be significantly incomplete. Information about diversification is acquired over time, and therefore better investments are made at an older age. In both the OLS and the Poisson model the age term was positive and significant.

Nevertheless, some research papers found opposite results. Uhler and Cragg (1971) showed that the variable age was negative but insignificant (for each of their four alternatives considered). Blume and Friend (1975) and Kelly (1995) had similar outcomes. First of all, Blume and Friend (1975) conclude that households are very undiversified. Furthermore, they also included age in their model but the outcome was insignificant (t-values of -0.69 and 0.03). Kelly (1995) showed that the effect of age was positive but insignificant. According to King and Leape (1987) there are two reasons why these papers have opposite results. First, these papers imposed a simple linear relationship between portfolio composition and age and may therefore fail to detect the true nonlinear relationship. Secondly, the number of assets distinguished in these papers was small. So, the results of these papers are considered less reliable here. Thus, it is still appropriate to include age in the regression. It is expected that the parameter of the variable will be positive.

Total market value of the stock portfolio will be included in the model as an independent variable. From economic intuition there are several reasons why this variable should be included. The more money is invested, the larger the benefits of diversification. The potential loss from not diversifying increases with the amount invested. Therefore, diversification is more interesting for households that have bigger stock portfolios. According to Yunker and Melkumian (2013) it is widely accepted that the wealthier the household, the more diversification it practises. Thus, the expectation is that the effect is positive. This is supported by several previous studies. Kelly (1995) included the variable ‘total value of stock owned, including stock held through investment clubs’ in his model. The effect was positive and statistically significant at 5%. Uhler and Cragg (1971) also included a comparable variable in their model (gross amount of all corporate stock, including mutual fund shares).

Education will be included in the model as an independent variable. Again, from economic intuition it is probable that education explains part of the level of diversification of the household. Education here means whether or not the head of the household has a college degree (or more). This distinction is made because college is for most people the first time they are confronted with the concept of diversification; in high school diversification is not taught. Later in this paper more will be explained about how this distinction has been made in the data. With a college degree it is more probable that the head of the household understands the benefits of diversification. Therefore, the expectation is that the effect is positive. Several studies have shown that education explains part of the level of diversification. King and Leape (1987) included the variables post-graduate education and college education in their model. The effect of both variables was positive and significant (0.87 (0.16) and 0.48 (0.12)). Kelly (1995) and Goetzmann and Kumar (2008) also included education in their models. Goetzmann and Kumar (2008) found that the effect was positive (0.107) and significant (t-value of 7.72). Kelly (1995) included education in the model as a dummy variable: did the head of the household attend college or not? The effect was positive. In the model of this paper the variable education is 1 if the head of the household has a college degree or more, and 0 if not.

Income is also included in the model. Several studies have used income as an independent variable. Uhler and Cragg (1971) included the variable ‘current disposable income (x1000 dollars)’ in their model. This variable was strongly significant at the 1% level. Therefore, it is likely that income explains part of the level of diversification. The same result was found by Goetzmann and Kumar (2008): the variable income in their model was also significant at the 1% level. In both models the effect of income was positive. Neither paper explains why the effect is positive, although several economic intuitions could account for it. Nevertheless, the expectation is that the effect of the variable in this paper is also positive.

More variables could be added to the model, but there are reasons why that has not been done. First of all, the data source, the U.S. Survey of Consumer Finances, has its limitations. Some variables that probably influence the number of stocks held are not available and are therefore not included. For example, the survey did not ask about the favourite colour of the head of the household, so that variable cannot be added to the model.


CHAPTER 3 Methodology

3.1 Research model

The main research question is answered by comparing the averages from before and after the crisis. The years after the crisis are 2010, 2013 and 2016. The years before the crisis are 2004 and 2007.

For the sub research question the following research model, based on previous research, is used:

$$Y = \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Totalmarketvalue} + \beta_3\,\text{Education} + \beta_4\,\text{Income} + \varepsilon$$

Where:

Y = average number of stocks of different companies held by U.S. households

β0 = constant, the intercept

Age = the age of the head of the household

Totalmarketvalue = the total market value of stocks a household possesses

Education = a dummy variable. 1 if the head of the household has a college degree, 0 if not

Income = the income of the household in U.S. dollars

ε = the error term

This model is estimated by an OLS regression in Stata.
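The estimation step can be sketched outside Stata as well. The block below is a minimal illustration in plain Python of OLS via the normal equations; it is not the author's Stata code. The data rows, the units (total market value in 100,000 dollars, income in 10,000 dollars) and the coefficients are invented for the example, and the helper names `solve_linear_system` and `fit_ols` are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return (coefficients with intercept first, sum of squared residuals)."""
    design = [[1.0] + list(r) for r in rows]  # prepend the constant term
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return beta, ssr


# Invented households: (age, total market value, education dummy, income).
rows = [
    (30, 0.5, 0, 4.0), (45, 2.0, 1, 9.0), (52, 1.2, 1, 7.0), (60, 8.0, 1, 15.0),
    (38, 0.2, 0, 3.5), (71, 4.0, 0, 6.0), (66, 15.0, 1, 30.0), (25, 0.1, 1, 4.5),
]
true_beta = [1.0, 0.1, 0.5, 2.0, 0.3]  # hypothetical coefficients, no noise
y = [true_beta[0] + sum(b * v for b, v in zip(true_beta[1:], r)) for r in rows]

beta_hat, ssr = fit_ols(rows, y)
print([round(b, 4) for b in beta_hat])
```

Because the invented outcome contains no error term, the fit recovers the hypothetical coefficients. With real survey data a statistical package is preferable, since it also reports standard errors.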

Two additional regressions will be made: one with only the data from before the crisis and one with the data from all the years. To see whether the data from after the crisis add value to the model, an F-test will be performed. The test statistic follows an F distribution (1):

$$F \sim F_{q,\,n-k-1} \qquad (1)$$

Where:

r = restricted, of the regression with only the years before the crisis included

ur = unrestricted, of the regression with all the years included

q = number of restrictions

n = number of observations

k = number of independent variables
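A common textbook form of such a statistic for nested models is F = ((SSR_r - SSR_ur) / q) / (SSR_ur / (n - k - 1)), where SSR is the sum of squared residuals. Whether equation (1) was written in this form or in the equivalent R-squared form cannot be recovered from the text, so the sketch below is an assumption, and the function name and input numbers are hypothetical.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, q, n, k):
    """Homoskedasticity-only F-statistic for q linear restrictions.

    n is the number of observations and k the number of independent
    variables (excluding the intercept) in the unrestricted regression.
    """
    if q <= 0:
        raise ValueError("q must be positive")
    if n - k - 1 <= 0:
        raise ValueError("not enough observations for the unrestricted model")
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / (n - k - 1)
    return numerator / denominator


# Hypothetical inputs, chosen only to make the arithmetic easy to follow:
# (120 - 100) / 5 = 4 in the numerator, 100 / (106 - 5 - 1) = 1 below it.
f_value = f_statistic(ssr_restricted=120.0, ssr_unrestricted=100.0, q=5, n=106, k=5)
print(f_value)
```

The resulting value is compared with the critical value of the F distribution with q and n - k - 1 degrees of freedom.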


3.2 Hypothesis

Two hypotheses are considered: one for answering the main research question and one for answering the sub research question. The first hypothesis, for the main research question, is the following:

H0: The average number of companies households held stock from is higher after the crisis.

H1: The average number of companies households held stock from is the same or lower after the crisis.

The expectation is that after the crisis households have learnt from the past. Households that did not hold a diversified portfolio made bigger losses during the crisis. This is caused by firm-specific risk, which can be reduced by diversification. Therefore, households are expected to invest in more stocks and more companies in the years after the crisis to prevent this from happening again.

The second hypothesis, for the sub research question, is the following:

H0: The data from 2010, 2013 and 2016 add value to the model.

H1: The data from 2010, 2013 and 2016 do not add value to the model.

This can also be formulated parametrically. The null hypothesis is the following:

$$H_0: \beta_4 = \beta_5 = \beta_6 = \beta_7 = \beta_8 \neq 0$$

The alternative hypothesis is the following:

$$H_1: \beta_4 = \beta_5 = \beta_6 = \beta_7 = \beta_8 = 0$$

The expectation is that the variables also add value to the model after the crisis. Before the crisis these variables were important for the explanation of the dependent variable. It is expected that the crisis will have some effect on the independent variables. Despite that, the effect of the independent variables is expected to be significant after the crisis.


CHAPTER 4 Data

4.1 The variables, descriptive statistics

The data used come from the Survey of Consumer Finances (SCF), a free data source from the Federal Reserve Board (FED). For this survey NORC, a research organization at the University of Chicago, gathers information about families across the U.S. The survey is conducted every three years. The years used in this research are 2004, 2007, 2010, 2013 and 2016. The population is all U.S. households. Unfortunately, it is not possible to collect data on the whole population, and therefore a random sample is taken every three years.

Therefore, the data is pooled cross-sectional data rather than panel data. Pure panel data follows the same subjects over time, whereas the Survey of Consumer Finances draws a new random sample in each wave. So, changes for the same individuals over time cannot be tracked. Still, a strong attempt is made to select families from all economic strata in the survey. Furthermore, the population differs every three years: people get older, people die and new people enter the population. Therefore, the observations are not identically distributed, but they are still independent. In this paper the term ‘head’ of the household is used several times. The term refers to the male in a mixed-sex couple or the older individual in a same-sex couple. The term head can also refer to a single core individual where there is no core couple.

One big problem with the dataset is that the standard errors of all variables are incorrect. According to the website of the Federal Reserve (https://www.federalreserve.gov/econres/scfindex.htm) this is caused by the failure to account for multiple imputation and the complex sample design. Because of the multiple imputation, the dataset contains 5 times the number of actual observations. There is a way to calculate the correct standard errors, but this could not be done in this paper because of a lack of knowledge. Still, all the calculated coefficients in this paper are correct and usable; they are not affected by this.
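One standard way to account for the imputation part of this problem is Rubin's combining rules, applied to an estimate computed separately on each of the 5 imputed copies. The sketch below only illustrates that rule and is not what was done in this paper. The estimates and variances are hypothetical, the function name is invented, and obtaining correct within-copy variances under the complex sample design is outside the sketch.

```python
from statistics import mean, variance


def combine_implicates(estimates, variances):
    """Combine per-copy estimates and variances with Rubin's rules.

    Returns the pooled point estimate and its standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two entries")
    point = mean(estimates)        # pooled estimate: average over the copies
    within = mean(variances)       # average sampling variance within a copy
    between = variance(estimates)  # variance across copies, divisor m - 1
    total = within + (1 + 1 / m) * between
    return point, total ** 0.5


# Hypothetical coefficient estimates from 5 imputed copies of one survey year.
estimates = [0.10, 0.12, 0.11, 0.09, 0.13]
variances = [0.0004] * 5  # hypothetical squared standard errors

point, se = combine_implicates(estimates, variances)
print(round(point, 3), round(se, 4))
```

In this example the pooled standard error is larger than the 0.02 implied by a single copy, because the disagreement between the copies is added as extra uncertainty.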

According to the National Bureau of Economic Research (NBER) the crisis began in December 2007. The data from before the crisis are from 2004 and 2007. The data from the 2007 survey are usable because most variables are measured over the year before the survey, which is still before the crisis. The variable income in 2007, for example, is the income that families earned over 2006. The survey of 2001 is not part of this research because of the dot-com bubble. According to the NBER the U.S. economy was in a recession from March 2001 to November 2001. The data from after the crisis are from 2010, 2013 and 2016. According to the NBER the final month of the recession was June 2009. Some variables of 2010 are therefore still partly measured during the recession. The data can still be used because the 2010 measurement period is not entirely within the crisis, so the first effects of post-crisis behavior among families can be visible.


The mean of each of the age variables is between 55 and 60. This is a satisfying outcome because it suggests that the randomly selected samples are comparable. The samples are similar but not identical. If the means were very different, the selected samples would probably not be comparable. See table 1 for the precise means.

Table 1 Means of the age variable 2004-2016

Variable    Number of observations    Mean      Standard deviation    Minimum    Maximum
Age 2004    7398                      55.003    13.632                19         95
Age 2007    6937                      56.610    14.735                19         95
Age 2010    7189                      55.966    14.374                19         95
Age 2013    6570                      57.151    14.232                20         95
Age 2016    6587                      58.007    14.667                19         95

Note: values are rounded to 3 decimals. The descriptive statistics of the age variables are shown for 2004 up to and including 2016.

Table 2 Means of the total market value of stocks variable 2004-2016

Variable                              Number of observations    Mean       Standard deviation    Minimum    Maximum
Total market value of stocks 2004     7398                      3766793    1.67e+07              10         2.00e+08
Total market value of stocks 2007     6937                      5484150    2.72e+07              1          6.90e+08
Total market value of stocks 2010     7189                      3239398    1.78e+07              1          4.00e+08
Total market value of stocks 2013     6570                      4305038    2.46e+07              1          7.00e+08
Total market value of stocks 2016     6587                      6123547    3.76e+07              10         1.00e+09

Note: values are rounded to 3 decimals. The descriptive statistics of the total market value of stocks variables are shown for 2004 up to and including 2016.


The averages of the variable total market value are listed in table 2. The averages are very high, but that is not a problem. The cause is the same as for the variable income: there are quite a few households with very large portfolios, and those households pull up the averages of the total market value variables.

The variable education is a dummy variable that is 1 if the head of the household has a college degree and 0 if not. In the dataset, however, this distinction is not yet made. In the survey of 2016 the following answers were possible to the question of what the highest level of education completed by the head of the household was:

1: 1st, 2nd, 3rd, or 4th grade
2: 5th or 6th grade
3: 7th and 8th grade
4: 9th grade
5: 10th grade
6: 11th grade
7: 12th grade, no diploma
8: High school graduate - high school diploma or equivalent
9: Some college but no degree
10: Associate degree in college - occupation/vocation program
11: Associate degree in college - academic program
12: Bachelor's degree (for example: BA, AB, BS)
13: Master's degree (for example: MA, MS, MENG, MED, MSW, MBA)
14: Professional school degree (for example: MD, DDS, DVM, LLB, JD)
15: Doctorate degree (for example: PHD, EDD)
-1: Less than 1st grade

To make a dummy variable, a value of 12 or higher is coded as 1. If the value is lower than 12, the dummy is 0. See table 3 for how this is done for the year 2016. For the years 2013, 2010, 2007 and 2004, see table 1 up to and including table 4 of the appendix. On average the dummy variable education was between 0.73 and 0.88, see table 4. In 2016 the mean of the dummy variable education was 0.732, which is much lower than in all the other years. A possible explanation is the economic growth in the U.S. In the period 2013-2016 the crisis was left behind and many growth opportunities were available. In such a period it is more attractive to start working earlier, so the number of people who finish a college degree may drop. This could explain why the mean of the education dummy in 2016 is lower than in all the other years. Related to the research question, the fact that the level of education drops in the period 2013-2016 is interesting. In 2016 the average number of different companies held stock from was 15.855 (table 10), while the average number in 2013 was 18.486. The drop in the level of education could be an explanation for the drop in the average number of companies held stock from.

On the other hand, it is also possible that in times of crisis the number of people who finish a college degree increases. In times of crisis there are fewer work opportunities available. Therefore, it is more necessary to study: more people will study because otherwise they would become unemployed. Unfortunately, there is no evidence for this idea in the data. The means of the variable education are not remarkably higher in the crisis period than before or after it.

Table 3 Dummy variable education 2016

             Dummy education
Education    0       1       Total
-1           0       0       0
1            5       0       5
2            5       0       5
3            10      0       10
4            30      0       30
5            25      0       25
6            15      0       15
7            51      0       51
8            667     0       667
9            568     0       568
10           181     0       181
11           211     0       211
12           0       2432    2432
13           0       1423    1423
14           0       964     964
Total        1768    4819    6587

The column Education reflects the possible answers in the survey. In this survey the answers ranged from -1 up to and including 14. See the text above this table for the meaning of each number. The two columns under Dummy education classify those answers: 1 is given if the answer is 12 or higher, 0 is given if the answer is lower than 12. For the reason behind this classification, see also the text above the table.
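The recoding described above can be checked with a short sketch that rebuilds the dummy from the counts in table 3. The counts are copied from that table; the row printed between codes 3 and 5 is taken to be code 4, which is an assumption. The names `counts_2016` and `education_dummy` are hypothetical.

```python
# Number of households per education code in 2016, copied from table 3.
# Assumption: the row printed between codes 3 and 5 belongs to code 4.
counts_2016 = {
    -1: 0, 1: 5, 2: 5, 3: 10, 4: 30, 5: 25, 6: 15, 7: 51,
    8: 667, 9: 568, 10: 181, 11: 211, 12: 2432, 13: 1423, 14: 964,
}


def education_dummy(code, threshold=12):
    """Return 1 for a bachelor's degree or higher (code >= 12), else 0."""
    return 1 if code >= threshold else 0


total = sum(counts_2016.values())
with_degree = sum(n for code, n in counts_2016.items() if education_dummy(code) == 1)
share_with_degree = with_degree / total

print(total, with_degree, round(share_with_degree, 3))
```

The share computed this way matches the 2016 mean of the education dummy reported in table 4.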


Table 4 Means dummy variable education 2004-2016

Variable          Number of observations    Mean     Standard deviation    Minimum    Maximum
Education 2004    7398                      0.867    0.339                 0          1
Education 2007    6937                      0.871    0.335                 0          1
Education 2010    7189                      0.876    0.330                 0          1
Education 2013    6570                      0.883    0.321                 0          1
Education 2016    6587                      0.732    0.443                 0          1

Note: values are rounded to 3 decimals. The descriptive statistics of the education variables are shown for 2004 up to and including 2016.

The income variable brings together all the different kinds of income a household earns. So, the income variable is the total income of a household. It includes the following parts:

-The family’s annual income from wages and salaries.
-The family’s net annual income from a sole proprietorship.
-The family’s annual income from non-taxable investments such as municipal bonds and other interests.
-The family’s annual income from dividends.
-The family’s annual income from gains or losses from mutual funds or from the sale of stocks, bonds or real estate.
-The family’s annual income from other businesses or investments, net rent, trusts, or royalties.
-The family’s annual income from unemployment or worker’s compensation.
-The family’s annual income from child support or alimony.
-The family’s annual income from TANF, SNAP (food stamps), or other forms of welfare or assistance such as SSI.
-The family’s annual income from Social Security or other pensions, annuities, or other disability or retirement programs.
-The family’s annual income from any other sources.


The average incomes for 2004, 2007, 2010, 2013 and 2016 were 768726, 1106429, 608831, 846279 and 792500, respectively. These numbers seem very high, but that is due to large outliers. Nevertheless, these outliers remain in the dataset. NORC included these wealthy families on purpose to create a dataset that includes families from all economic strata. In the U.S. income is not normally distributed. Negative values are omitted from the dataset. In the survey -1 was recorded if the household did not earn any money, and -9 was recorded if the household made a negative amount of money that year. How large that negative amount was is not given. Therefore, all the negative numbers are removed from the dataset. The median gives a more representative figure. On average the median was about 62000. Tables 5 up to and including 9 show the drop in the median income after the crisis, and also show that the median income recovers afterwards. The median is the 50th percentile.

Table 5 Median variable income 2016

Percentile    Value       Smallest    Mean
1%            5100        10          792500
5%            12000       10
10%           16000       10
25%           31000       20
50%           67000       Largest
75%           158500      2.74e+08
90%           944000      2.84e+08
95%           3180000     3.00e+08
99%           1.34e+07    3.02e+08

Note: values such as 2.74e+08 are in scientific (exponential) notation. The percentiles for the variable income in 2016 are shown. The 50% percentile is the median. All the values are in U.S. dollars.


Table 6 Median variable income 2013

Percentile    Value       Smallest    Mean
1%            5200        1           846279
5%            11000       1
10%           15000       1
25%           28000       1
50%           59000       Largest
75%           141000      1.67e+08
90%           720000      1.75e+08
95%           2540000     1.77e+08
99%           1.59e+07    1.77e+08

Note: values such as 1.67e+08 are in scientific (exponential) notation. The percentiles for the variable income in 2013 are shown. The 50% percentile is the median. All the values are in U.S. dollars.

Table 7 Median variable income 2010

Percentile    Value       Smallest    Mean
1%            5400        500         608831.4
5%            10000       500
10%           14000       500
25%           26000       500
50%           55000       Largest
75%           120000      3.13e+08
90%           500000      3.29e+08
95%           1530000     3.32e+08
99%           1.00e+07    3.56e+08

Note: values such as 3.13e+08 are in scientific (exponential) notation. The percentiles for the variable income in 2010 are shown. The 50% percentile is the median. All the values are in U.S. dollars.


Table 8 Median variable income 2007

Percentile    Value       Smallest    Mean
1%            5500        600         1106429
5%            10000       600
10%           15000       600
25%           31000       600
50%           70000       Largest
75%           214000      1.82e+08
90%           1640000     1.82e+08
95%           3820000     1.82e+08
99%           2.50e+07    1.82e+08

Note: values such as 1.82e+08 are in scientific (exponential) notation. The percentiles for the variable income in 2007 are shown. The 50% percentile is the median. All the values are in U.S. dollars.

Table 9 Median variable income 2004

Percentile    Value       Smallest    Mean
1%            4000        400         768725.7
5%            8400        400
10%           13000       400
25%           28000       400
50%           60000       Largest
75%           171000      1.02e+08
90%           1240000     1.02e+08
95%           3300000     1.02e+08
99%           1.59e+07    1.02e+08

Note: values such as 1.02e+08 are in scientific (exponential) notation. The percentiles for the variable income in 2004 are shown. The 50% percentile is the median. All the values are in U.S. dollars.
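The contrast between the means and the medians in tables 5 to 9 can be made explicit with a short sketch. The numbers are copied from those tables; the variable names are hypothetical and the ratio of mean to median is used here only as an informal indicator of how strongly the outliers pull up the mean.

```python
from statistics import mean

# Median and mean income per survey year, copied from tables 5 to 9.
median_income = {2004: 60000, 2007: 70000, 2010: 55000, 2013: 59000, 2016: 67000}
mean_income = {2004: 768725.7, 2007: 1106429, 2010: 608831.4, 2013: 846279, 2016: 792500}

# Average of the five yearly medians.
average_median = mean(list(median_income.values()))

# How many times larger the mean is than the median in each year.
mean_to_median = {year: mean_income[year] / median_income[year] for year in median_income}

print(average_median)
for year in sorted(mean_to_median):
    print(year, round(mean_to_median[year], 1))
```

In every year the mean is more than ten times the median, which is in line with the remark that a small number of very wealthy households drives the averages. The medians also show the drop from 2007 to 2010 and the recovery afterwards.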


CHAPTER 5 Results

5.1 Average numbers of different companies held stock from

Table 10 shows the average number of different companies in which U.S. households held stock for each survey year. In the years before the crisis (2004 and 2007) the average was 17.306. In the years after the crisis (2010, 2013 and 2016) the average was 16.220. The expectation was that households would hold stock in more companies after the crisis to increase their level of diversification. The data do not support this expectation: the average number of companies in which stock is held is lower after the crisis.

One comment on this result is that the post-crisis average is itself influenced by the crisis: the year 2010 is the cause of the low combined average. The effect of the crisis is most visible in the 2010 data. According to the NBER, the crisis lasted from December 2007 up to and including June 2009. In the period 2007-2010 households could not yet have learned from the mistakes made during the crisis; they were probably still busy recovering from it. The number of different companies in which U.S. households held stock drops in 2010 because of the overall economic downturn.

But even if the year 2010 is removed from the post-crisis average, that average remains lower than the pre-crisis average. Without 2010 the post-crisis average is 17.171, which is still below the pre-crisis average of 17.306. What can be said is that the average number of different companies in which U.S. households hold stock was restored after the crisis: the averages 17.171 and 17.306 are very close. The effect of the crisis on the average is clearly visible in figure 1, where it produces a U-shape. The U-shape starts in 2007, where the average is not yet affected by the crisis. The crisis causes a drop by 2010, and in 2013 the line returns to its former level.


Table 10 Average number of stocks of different companies held by U.S. households

Summary statistics dependent variable

Year    Average number of different companies
2004    16.378
2007    18.234
2010    14.320
2013    18.486
2016    15.855

Note: the yearly averages are rounded to three decimals.
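The pre-crisis and post-crisis averages discussed in this section can be recomputed from the yearly values in Table 10. The sketch below assumes that the combined averages are simple, unweighted means of the yearly averages; that assumption reproduces the reported 17.306 and 17.171, and the function and variable names are illustrative.

```python
# Yearly averages copied from Table 10.
yearly_mean = {2004: 16.378, 2007: 18.234, 2010: 14.320, 2013: 18.486, 2016: 15.855}

def combined_mean(years):
    """Unweighted mean of the yearly averages (an assumption of this sketch)."""
    return sum(yearly_mean[y] for y in years) / len(years)

before = combined_mean([2004, 2007])           # pre-crisis survey years
after = combined_mean([2010, 2013, 2016])      # post-crisis survey years
after_excl_2010 = combined_mean([2013, 2016])  # post-crisis without 2010

print(f"before crisis:        {before:.3f}")
print(f"after crisis:         {after:.3f}")
print(f"after crisis, no 2010: {after_excl_2010:.3f}")
```

Under this assumption the three post-crisis years average about 16.22, and the post-crisis value stays below the pre-crisis value whether or not 2010 is included.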

Figure 1 Average number of different companies held stock from by U.S. households over time

[Figure 1: line chart of the average number of different companies held stock from (vertical axis, 0 to 20) by survey year (horizontal axis: 2004, 2007, 2010, 2013, 2016).]

5.2 F-test

Two regressions were run. One regression includes all the data (table 11) and one includes only the data from before the crisis (table 12). In table 11 the first four variables correspond to the years 2004 and 2007. The last four variables correspond to the years 2010, 2013 and 2016. Whether the variables for 2010, 2013 and 2016 add value to the model in table 11 is answered with an F-test (1).


something of the dependent variable. For the model where all the data is included (table 11) the F-value is 63.54. The null hypothesis that all coefficients on the independent variables are equal to zero is therefore rejected at the 1% level (p-value = 0.000). For the model where only the data from 2004-2007 is included (table 12), the F-value is 71.06. Thus, for this model too, at least one of the independent variables explains something of the dependent variable. Again, the null hypothesis that all coefficients on the independent variables are equal to zero is rejected at the 1% level (p-value = 0.000).
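The link between the overall F-value and the fit of a regression can be sketched with the textbook formula F = (R²/k) / ((1 - R²)/(n - k - 1)). The sketch below uses the rounded R² values and degrees of freedom reported in the notes to tables 11 and 12; the function name is illustrative, and the formula is the classical one, which need not coincide exactly with a statistic computed from robust standard errors.

```python
def overall_f(r2, k, n):
    """Classical F-statistic for H0: all k slope coefficients are zero."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

# n follows from the reported degrees of freedom: F(8, 6561) implies 6561 + 8 + 1.
n_obs = 6570

f_full = overall_f(0.0719, 8, n_obs)  # table 11: data from 2004-2016
f_pre = overall_f(0.0415, 4, n_obs)   # table 12: data from 2004-2007

print(f"F, 2004-2016 model: {f_full:.1f}")
print(f"F, 2004-2007 model: {f_pre:.1f}")
```

With the rounded inputs the results land close to the reported values of 63.54 and 71.06; small differences are due to rounding of R².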

To test whether the variables for 2010, 2013 and 2016 add value to the model, another F-test is performed with the following Stata command: test (age_after = 0), (stockvalue_after = 0), (education_after = 0), (income_after = 0). The F-value that results from this test is 53.74. Therefore, it can be concluded that at least one of the added variables adds value to the model; the four variables are jointly significant. This conclusion is supported by the adjusted R² of both models. R² is the share of the variance of the dependent variable that is explained by the independent variables. It is better to look at the adjusted R² of both models than at the ordinary R², because the adjusted R² corrects for the number of terms in the model. The ordinary R² never decreases when a variable is added, even a useless one, whereas the adjusted R² penalises such additions. Taking that into account, the adjusted R² of the model with the data from 2004-2016 is higher than that of the model with only the data from 2004-2007: 0.0708 against 0.0409. Still, it is good to be careful when comparing models on the basis of R² or adjusted R². R² describes how well the model fits the data, so statistically the larger model fits better, but R² says nothing about causality. However, in this case some causality can be expected: previous research also used these variables, so it is plausible that there is some causality.
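The adjusted R² values and the joint F-test for the four added variables can be illustrated with two standard formulas: adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), and the F-statistic for q added variables, F = ((R²_full - R²_restricted)/q) / ((1 - R²_full)/(n - k - 1)). The sketch below is not the Stata computation itself; it applies the classical formulas to the rounded R² values reported with tables 11 and 12, and the function names are illustrative.

```python
def adjusted_r2(r2, k, n):
    """Adjusted R-squared for a model with k regressors and n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def incremental_f(r2_full, r2_restricted, q, k_full, n):
    """Classical F-statistic for H0: the q added coefficients are all zero."""
    return ((r2_full - r2_restricted) / q) / ((1.0 - r2_full) / (n - k_full - 1))

n_obs = 6570                      # implied by F(8, 6561) and F(4, 6565)
r2_full, r2_pre = 0.0719, 0.0415  # reported R-squared of tables 11 and 12

adj_full = adjusted_r2(r2_full, 8, n_obs)
adj_pre = adjusted_r2(r2_pre, 4, n_obs)
f_added = incremental_f(r2_full, r2_pre, q=4, k_full=8, n=n_obs)

print(f"adjusted R2, 2004-2016 model: {adj_full:.4f}")
print(f"adjusted R2, 2004-2007 model: {adj_pre:.4f}")
print(f"F for the four added variables: {f_added:.2f}")
```

The adjusted values agree with the reported 0.0708 and 0.0409, and the incremental F comes out close to the reported 53.74; the remaining gap is consistent with rounding of the R² inputs.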


Table 11 Regression result from data 2004-2016

The effect of age, total market stock value, education and income on the number of different companies held stock from. Data from before and after the crisis.

Dependent variable: number of different companies held stock from

Variable                                     Coefficient   Robust standard error   T-value   P-value   95% CI lower bound   95% CI upper bound
Age before crisis                            0.113***      0.015                   7.63      0.000     0.084                0.142
Total market value of stocks before crisis   1.25e-07***   9.54e-09                13.14     0.000     1.07e-07             1.44e-07
Education before crisis                      3.46***       0.626                   5.52      0.000     2.232                4.688
Income before crisis                         -2.47e-08     4.00e-08                -0.62     0.537     -1.03e-07            5.37e-08
Age after crisis                             0.193***      0.018                   10.71     0.000     0.157                0.228
Total market value of stocks after crisis    5.68e-08      9.54e-09                5.96      0.000     3.81e-08             7.55e-08
Education after crisis                       4.82          0.680                   7.09      0.000     3.485                6.149
Income after crisis                          -5.86e-09     3.72e-08                -0.16     0.875     -7.89e-08            6.71e-08
Constant                                     -8.476        1.570                   -5.40     0.000     -11.554              -5.399

Note: coefficients are rounded to three decimals. Dependent variable = number of different companies held stock from. Coefficients significant at the 10% level are denoted by *, at the 5% level by **, and at the 1% level by ***. Values written with e are in scientific notation. R² is 0.0719, adjusted R² is 0.0708. F(8,6561) = 63.54, prob > F = 0.000. Robust standard errors are used to correct for serial correlation and heteroskedasticity.


Table 12 Regression result from data 2004-2007

The effect of age, total market stock value, education and income on the number of different companies held stock from. Data from before the crisis.

Dependent variable: number of different companies held stock from

Variable                                     Coefficient   Robust standard error   T-value   P-value   95% CI lower bound   95% CI upper bound
Age before crisis                            0.115***      0.015                   7.66      0.000     0.086                0.145
Total market value of stocks before crisis   1.26e-07***   9.68e-09                13.02     0.000     1.07e-07             1.45e-07
Education before crisis                      3.063***      0.635                   4.82      0.000     1.819                4.308
Income before crisis                         -1.22e-08     4.06e-08                -0.30     0.763     -9.19e-08            6.74e-08
Constant                                     6.973***      1.049                   6.65      0.000     4.917                9.029

Note: coefficients are rounded to three decimals. Dependent variable = number of different companies held stock from. Coefficients significant at the 10% level are denoted by *, at the 5% level by **, and at the 1% level by ***. Values written with e are in scientific notation. R² is 0.0415, adjusted R² is 0.0409. F(4,6565) = 71.06, prob > F = 0.000. Robust standard errors are used to correct for serial correlation and heteroskedasticity.


CHAPTER 6 Conclusion

6.1 Conclusion

This research analyses whether U.S. households diversified their stock portfolios over more companies after the financial crisis. A comparison is made by looking at the average number of different companies in which households hold stock. To answer this question the dataset from the Survey of Consumer Finances (SCF) is used. This is a pooled panel dataset. The null hypothesis was that the average number of different companies held is higher after the crisis. This is based on the expectation that households have learnt from the past: households that did not hold a diversified portfolio during the crisis made bigger losses. The alternative hypothesis was that the average number is the same or lower after the crisis. The results show that the average number of different companies held did not increase after the crisis; the post-crisis average is lower than the pre-crisis average. Therefore, the null hypothesis is rejected. Whether the year 2010 should be in the dataset is debatable. The crisis has a serious effect on the outcome: in the period 2007-2010 households were still busy recovering from the crisis, so the post-crisis average is lower than it otherwise would be. Even if the year 2010 is omitted from the dataset, the null hypothesis is still rejected, because the post-crisis average is then almost the same as the pre-crisis average. Possible reasons for this result can be mentioned. The familiarity bias described by DeMarzo, Kaniel and Kremer (2004) states that investors invest in firms they are familiar with. The same could apply to households. Another possible cause is that employees like to invest in their own employer's stock, according to Campbell (2006) and Benartzi (2001). This could apply to the head of the household, because many of them are employees.

The sub research question analyses whether the independent variables still add value to the model after the crisis. The dependent variable in this model is the average number of different companies in which U.S. households hold stock. The independent variables are age, total market value of the stock portfolio, level of education and income. The null hypothesis is that at least one of these variables still adds value to the model. The alternative hypothesis is that none of these variables adds value to the model. To test whether the independent variables add value to the model an F-test is done. The F-value is significant, with a p-value of 0.000. Therefore, the null hypothesis is not rejected. The results are confirmed by the adjusted R²: the adjusted R² of the model with all the data is larger than that of the model with only the data from before the crisis. Nevertheless, it is good to be careful when comparing the two adjusted R² values, because they do not give any certainty about causality.


6.2 Future research

When reading this thesis, the following comments should be taken into account for future research. The number of different companies in which U.S. households hold stock is not a perfect measure of diversification. Families probably also invested in stock indirectly, for example through hedge funds and trust funds. Such funds are known to be very well diversified, because the people who run them work with knowledge about diversification. So if U.S. households invested more in indirect stock, then their level of diversification will have increased. Hedge funds are a different case because their investment strategy is secret. Indirect stock investments are not included in this thesis because that is hard to do. Therefore, this research focuses entirely on direct stock holdings.

Furthermore, an increase in the number of different companies in which stock is held does not always mean that the level of diversification increases as well. The companies may be very similar to each other (for example, companies in the same sector or industry) and show the same fluctuations. What can be said is that the level of diversification probably increases if the number of different companies in which stock is held increases.

6.3 Limitations

Comparing averages is a very thin basis for answering the research question. Still, the outcome is very reliable: the number of observations is very large and, according to the NORC, the sample represents the American population very well. Originally, comparing averages was not the intention of this research. The intention was to include a dummy variable crisis in the model that is now used for the sub research question. This dummy variable would equal 1 for observations from after the crisis and 0 for observations from before the crisis. Running a regression would give a coefficient on the dummy, and a T-test would show whether that coefficient is significant. That in turn would say something about whether U.S. households hold stock in more or fewer different companies after the crisis. If the coefficient is positive and significant, then it is likely that U.S. households diversify their stocks better after the crisis. Another consideration was to include the dummy variable crisis as an interaction term. The dummy variable crisis would interact with the variable average number of different companies held stock from. Running a regression on this would give a coefficient, which can be tested for significance with a T-test. Neither option was carried out because of the difficult sample. The dataset is panel data which means that there are multiple observations over time. There is not one continuous


The observations in the dataset over the years are not the same households. Every three years a new random sample is chosen. This fact also makes it hard to work with.
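The dummy-variable approach described in this section can be sketched as follows. This is a minimal illustration on made-up numbers, not the SCF data: it regresses a hypothetical outcome on a constant and a crisis dummy, coded here as 1 for post-crisis observations (an assumption of this sketch), and reports the coefficient with a classical, non-robust t-value. All names and values are hypothetical.

```python
from math import sqrt
from statistics import mean

def dummy_regression(y, d):
    """OLS of y on a constant and a 0/1 dummy d.

    Returns (intercept, slope, t_value) using classical standard errors.
    """
    n = len(y)
    d_bar, y_bar = mean(d), mean(y)
    sxx = sum((di - d_bar) ** 2 for di in d)
    sxy = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
    slope = sxy / sxx
    intercept = y_bar - slope * d_bar
    residuals = [yi - intercept - slope * di for di, yi in zip(d, y)]
    sigma2 = sum(e ** 2 for e in residuals) / (n - 2)  # residual variance
    se_slope = sqrt(sigma2 / sxx)
    return intercept, slope, slope / se_slope

# Hypothetical number of companies held by eight households (not SCF data).
companies = [10, 14, 18, 22, 8, 12, 16, 20]
crisis = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = before the crisis, 1 = after

intercept, slope, t_value = dummy_regression(companies, crisis)
print(f"pre-crisis mean (intercept): {intercept:.2f}")
print(f"post minus pre (dummy coefficient): {slope:.2f}")
print(f"t-value: {t_value:.2f}")
```

With only a constant and the dummy in the model, the coefficient equals the difference between the post-crisis and pre-crisis group means, so a negative value means fewer companies held after the crisis. Because each SCF wave is a new random sample, such a regression would treat the waves as pooled cross-sections.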


REFERENCES

Benartzi, S. (2001). Excessive Extrapolation and the allocation of 401 (k) Accounts to Company Stock. Journal of Finance, 56(5), 1747-1764.

Berk, J., & DeMarzo, P. (2014). Investor Behavior and Capital Market Efficiency. Corporate Finance, third edition. Harlow, England: Pearson Education, 442-445.

Blume, M. E., Friend, I. (1975). The Demand for Risky Assets. The American Economic Review, 65(5), 900-922.

Campbell, J.Y. (2006). Household Finance. Journal of Finance, 61(4), 1553-1604.

DeMarzo, P., Kaniel, R., Kremer, I. (2004). Diversification as a Public Good: Community Effects in Portfolio Choice. Journal of Finance, 59(4), 1677-1715.

Dicks-Mireaux, L., King, M. (1984). Pension wealth and household saving: test of robustness. Journal of Public Economics, 23(1-2), 115-139.

Kelly, M. (1995). All their eggs in one basket: Portfolio diversification of US households. Journal of Economic Behavior & Organization, 27(1), 87-96.

King, M., Leape, J. (1987). Asset Accumulation, Information and the Life Cycle. National Bureau of Economic Research, working paper No. 2392.

Polkovnichenko, V. (2005). Household Portfolio Diversification: A Case for Rank Dependent Preferences. Review of Financial Studies, 18(4), 1467-1502.

The Next bubble. (2010, October 14). Retrieved from https://search.proquest.com/docview/1458404543?rfr_id=info%3Axri%2Fsid%3Aprimo

Uhler, R. S., Cragg, J. G. (1971). The Structure of the Asset Portfolios of Households. The Review of Economic Studies, 38(3), 341-357.

Yunker, J. A., Melkumian, A. (2013). Optimal diversification and risk-taking: a theoretical and empirical analysis. Applied Economics, 45(11), 1481-1492.


APPENDIX

Tables dummy variable education

To make a dummy variable, every value of the original survey variable has to be converted into 1 or 0. For the education variable this is done five times, once for each survey year. How this is done for the years 2013, 2010, 2007 and 2004 is shown in tables 1 to 4 of this appendix.

Table 1 dummy variable education 2013

             Dummy education
Education    0       1       Total
4            5       0       5
6            5       0       5
7            5       0       5
8            15      0       15
9            30      0       30
10           50      0       50
11           45      0       45
12           611     0       611
13           0       171     171
14           0       513     513
15           0       156     156
16           0       2289    2289
17           0       2675    2675
Total        766     5804    6570

The column Education lists the answers given in the survey; possible answers range from -1 up to and including 17. For the exact meaning of each code, see the codebook on the website of the Federal Reserve. Codes that do not appear in this column were not given by any respondent. The two Dummy education columns show how the answers are classified: 1 is given if the answer is 13 or higher, 0 is given if the answer is lower than 13. A value of 13 or higher means that the head of the household received at least a bachelor's degree or an even higher educational diploma.
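The coding rule described in the note above can be written out as a short sketch. It applies the threshold of 13 to the household counts of Table 1 (2013) and recovers the column totals; the function and variable names are illustrative and not taken from the original Stata code.

```python
def education_dummy(code):
    """Return 1 if the survey education code is 13 or higher, else 0."""
    return 1 if code >= 13 else 0

# Household counts per education code, copied from Table 1 (2013).
counts_2013 = {4: 5, 6: 5, 7: 5, 8: 15, 9: 30, 10: 50, 11: 45, 12: 611,
               13: 171, 14: 513, 15: 156, 16: 2289, 17: 2675}

# Sum the counts per dummy value to reproduce the column totals.
totals = {0: 0, 1: 0}
for code, households in counts_2013.items():
    totals[education_dummy(code)] += households

print(totals)                # {0: 766, 1: 5804}
print(sum(totals.values()))  # 6570
```

The totals match the last row of Table 1, which is a quick consistency check on the recoding.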


Table 2 dummy variable education 2010

             Dummy education
Education    0       1       Total
2            5       0       5
3            5       0       5
4            5       0       5
6            5       0       5
8            10      0       10
9            10      0       10
10           45      0       45
11           35      0       35
12           774     0       774
13           0       270     270
14           0       590     590
15           0       235     235
16           0       2524    2524
17           0       2676    2676
Total        894     6295    7189

The column Education lists the answers given in the survey; possible answers range from -1 up to and including 17. For the exact meaning of each code, see the codebook on the website of the Federal Reserve. Codes that do not appear in this column were not given by any respondent. The two Dummy education columns show how the answers are classified: 1 is given if the answer is 13 or higher, 0 is given if the answer is lower than 13. A value of 13 or higher means that the head of the household received at least a bachelor's degree or an even higher educational diploma.

Table 3 dummy variable education 2007

             Dummy education
Education    0       1       Total
-1           5       0       5
6            5       0       5
8            20      0       20
9            10      0       10
10           70      0       70
11           65      0       65
12           720     0       720
13           0       150     150
14           0       580     580
15           0       208     208
16           0       2612    2612
17           0       2492    2492
Total        895     6042    6937

The column Education lists the answers given in the survey; possible answers range from -1 up to and including 17. For the exact meaning of each code, see the codebook on the website of the Federal Reserve. Codes that do not appear in this column were not given by any respondent. The two Dummy education columns show how the answers are classified: 1 is given if the answer is 13 or higher, 0 is given if the answer is lower than 13. A value of 13 or higher means that the head of the household received at least a bachelor's degree or an even higher educational diploma.


Table 4 dummy variable education 2004

             Dummy education
Education    0       1       Total
2            10      0       10
3            5       0       5
5            5       0       5
6            1       0       1
7            5       0       5
8            25      0       25
9            15      0       15
10           45      0       45
11           36      0       36
12           835     0       835
13           0       265     265
14           0       593     593
15           0       139     139
16           0       2463    2463
17           0       2956    2956
Total        982     6416    7956

The column Education lists the answers given in the survey; possible answers range from -1 up to and including 17. For the exact meaning of each code, see the codebook on the website of the Federal Reserve. Codes that do not appear in this column were not given by any respondent. The two Dummy education columns show how the answers are classified: 1 is given if the answer is 13 or higher, 0 is given if the answer is lower than 13. A value of 13 or higher means that the head of the household received at least a bachelor's degree or an even higher educational diploma.
