1 © 2015 Patrick Petrus Wilhelmus Suiker. All Rights Reserved
Vrije Universiteit Amsterdam Universiteit van Amsterdam Faculty of Economics and Business Master Entrepreneurship
DOES ENTREPRENEURSHIP PAY?
A REPLICATION STUDY OF HAMILTON (2000)
PATRICK PETRUS WILHELMUS SUIKER 2509999 & 1088863
Academic year 2014-2015
Supervisor prof. dr. P.D. (Philipp) Koellinger July 1st 2015
Statement of Originality
This document is written by Student Patrick Suiker who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.
Acknowledgements
Amsterdam, 2015
After my Bachelor in Business Administration, I had the wonderful opportunity to start this completely new master program, a program that is in line with my aspirations of becoming an entrepreneur myself. I wrote many interesting papers and studied a lot of challenging theories this year, and now is the moment to bring it all to an end. My master thesis will be the last paper I write as a student. I learned a lot over the past years and I feel ready for the ‘grown up world’, a world full of opportunities that I’m ready to jump on!
Before I let your eyes go through my thesis, I would like to take a moment to thank Prof. dr. Philipp Koellinger for his knowledge, expertise and positive energy during the useful feedback sessions. I also want to thank my group members Arne Weijling, Hugo Borja Bustamante, Quentin Merelle and Umer Saqib for their support. It was amazing to work together and I had a great time with them!
Abstract
Objective: This is a replication study of Hamilton (2000) that examines the differences in taxable income between wage workers and self-employed workers, using a LISS panel dataset of 32,992 observations of Dutch individuals from 2008 until 2013. The researcher investigates which variables influence taxable income and which cause the difference between the taxable income of wage workers and entrepreneurs. Method: Ordinary least squares regression, quantile regression and the fixed effect estimator are used to test the variables. Results: The results indicate that wage workers earn more taxable income at the median as well as on average. Additionally, the earnings distribution for self-employment is more right-skewed and has a higher variation. Conclusion: The results are in line with some of the findings of Hamilton (2000). Apparently, his results apply not only to the United States but also to the Netherlands. The outcome of this study is of interest to individuals who face the decision whether or not to become an entrepreneur and to researchers who are interested in entrepreneurship and are looking for future research opportunities.
Table of contents
Statement of Originality……….. i
Acknowledgements……… ii
Abstract………. iii
Table of contents……….. iv
List of tables………. vii
List of figures………. vii
Glossary………. viii
1. Introduction………. 1
1.1 Popularity of entrepreneurship……… 1
1.2 Theoretical relevance………. 2
1.3 Practical relevance……… 3
1.4 Research question………. 3
2. Literature review……….. 4
2.1 Introduction………. 4
2.2 The trend in entrepreneurship……….. 4
2.3 Why people become entrepreneurs ………. 5
2.4 Characteristics of potential entrepreneurs……… 6
2.5 Situations that direct people into entrepreneurship………. 7
2.6 How are the earnings of entrepreneurship being measured?... 8
2.7 Does entrepreneurship pay?... 8
2.7.1 Comparing wage workers with entrepreneurs……….. 8
2.7.2 Switching in employment status……….. 10
3. Methodology……….. 11
3.1 Introduction………. 11
3.2 Research strategy……… 11
3.3 Research design……… 11
3.3.1.1 Dataset………. 12
3.3.1.2 Variables……….. 13
3.3.2 Data analysis……… 14
3.3.2.1 Introduction……….. 14
3.3.2.2 Pooled OLS and Quantile regression……….. 15
3.3.2.3 Fixed Effect estimator……… 17
3.3.2.4 Kernel Density graph……….. 18
3.3.2.5 Variables………. 18
3.4 Summary……… 18
4. Results………. 20
4.1 Summary statistics………. 20
4.1.1 Explanatory variables……… 20
4.1.2 Dependent variable ……….. 23
4.2 Regression analyses……….. 25
4.2.1 Introduction OLS and Quantile regressions……… 25
4.2.2 Results OLS and Quantile regressions……… 27
4.2.2.1 Tables with the coefficients……….. 27
4.2.2.2 Constant……….. 31
4.2.2.3 Explanatory variables………. 31
4.2.2.3.1 Age………... 32
4.2.2.3.2 Work_experience……… 32
4.2.2.3.3 Agesq and Work experiencesq………. 33
4.2.2.3.4 Early_retirement………. 33
4.2.2.3.5 Household_head………. 33
4.2.2.3.6 Origin……….. 33
4.2.2.3.7 Education………. 33
4.2.2.3.8 Civil status………. 34
4.2.2.3.9 Year………. 35
4.3 Fixed effect……….. 35
4.3.1 Fixed effect model……… 35
4.3.2 Change in employment………. 38
5. Discussion………. 41
6. Conclusion……… 43
7. Evaluation and self-reflection……….. 45
7.1 My expectations……… 45
7.2 The process of my team: what went well for the team?... 45
7.3 The process of the team: what went well for me?... 45
7.4 My individual limitations……… 46
7.5 In conclusion……… 46
References………. 47
Appendix……….. 51
Appendix A Interview questions and variables……….. 51
Appendix B1 Summary statistics for the years 2008-2013……… 54
Appendix B2 Summary statistics taxable income for years 2008-2013……… 60
List of tables
Table 1 Variable Descriptions and Summary Statistics , year 2010……… 21
Table 2 Independent samples t-test for variables in 2010……… 22
Table 3 Summary Statistics: Taxable Income 2010……… 25
Table 4 Regressions for wage workers……… 27
Table 5 Regressions for self-employed workers……….. 29
Table 6 Fixed effect………. 37
Table 7 Change in employment status……….. 39
Table 8 Interview questions and variables……… 51
Table 9 Variable Descriptions and Summary Statistics, year 2008………. 54
Table 10 Variable Descriptions and Summary Statistics, year 2009………. 55
Table 11 Variable Descriptions and Summary Statistics, year 2010……….. 56
Table 12 Variable Descriptions and Summary Statistics, year 2011……… 57
Table 13 Variable Descriptions and Summary Statistics, year 2012……… 58
Table 14 Variable Descriptions and Summary Statistics, year 2013……… 59
Table 15 Summary Statistics Taxable Income 2008………. 60
Table 16 Summary Statistics Taxable Income 2009………. 60
Table 17 Summary Statistics Taxable Income 2010………. 61
Table 18 Summary Statistics Taxable Income 2011……….. 61
Table 19 Summary Statistics Taxable Income 2012……….. 62
Table 20 Summary Statistics Taxable Income 2013………. 62
List of figures
Figure 1 The average taxable income for the years 2008-2013……… 14
Figure 2 Visualization of a linear regression line……….. 16
Figure 3 Visualization of a negative skew and a positive skew distribution……… 18
Figure 4 Kernel Density for year 2010………. 24
Figure 5 Kernel Density for taxable income in 2008……… 63
Figure 6 Kernel Density for taxable income in 2009……… 63
Figure 7 Kernel Density for taxable income in 2010……… 64
Figure 8 Kernel Density for taxable income in 2011……… 64
Figure 9 Kernel Density for taxable income in 2012……… 65
Glossary
Autocorrelation = The error term at one date can be correlated with the error terms in the previous
periods (Fall, 2008).
Clustered standard error = Observations within group i are correlated in some unknown way, inducing correlation in e_it within i, but groups i and j do not have correlated errors (Nichols & Schaffer, 2007).
Fixed effect = An estimator that imposes time-independent effects for each entity that are possibly correlated with the regressors (Plümper & Troeger, 2007).
Hausman test = A test of H0: that random effects would be consistent and efficient, versus H1: that random effects would be inconsistent. If the Hausman test statistic is large, one must use fixed effect. If the statistic is small, one must use random effect.1
Heteroskedasticity = Occurs when the variance of the unobservable error u, conditional on the independent variables, is not constant (Hoechle, 2007).
Kernel Density graph = A smoothed graph that shows which income levels occur most frequently and how skewed the income distribution is.
Life-cycle effect = Wages increase with age as people become more experienced, but at a higher age wages start to increase at a decreasing rate because people are no longer as healthy. At some point the optimum wage level is reached and, instead of growing, wage earnings start to fall. The relationship between wage and age is therefore inverted U-shaped, which is also called the life-cycle effect (Blanchflower & Oswald, 2008).
LISS panel data = The LISS panel is a representative sample of Dutch individuals who participate
in monthly Internet surveys. The panel is based on a true probability sample of households drawn from the population register. Households that could not otherwise participate are provided with a computer and Internet connection. A longitudinal survey is fielded in the panel every year, covering a large variety of domains including work, education, income, housing, time use, political views, values and personality.
Meso-level analysis = The people of interest for the analysis are groups, in this research wage
workers and self-employed workers.
Necessity-driven entrepreneurship = The concept where an inconvenient initial situation leads to the decision to become self-employed. This could be the case if the current situation as an employee offers poor future prospects or insufficient financial rewards (Davidsson, 2005).
Non-pecuniary benefits = Benefits that are unrelated to money, such as being your own boss (Hamilton, 2000).
Opportunity-driven entrepreneurship = The concept where someone becomes self-employed due to the opportunities available in self-employment. The possible access to higher financial returns is an example of an opportunity that can be obtained by becoming self-employed (Davidsson, 2005).
Pooled OLS regression = An estimate that approximates the conditional mean of the response variable given certain values of the predictor variables (Hartog, Pereira & Vieira, 2001).
Quantile regression = A regression that aims at estimating either the conditional median or other
quantiles of the response variable (Hartog, Pereira & Vieira, 2001).
Qualitative paradigm = Research approach that is associated with an explorative way of examining, using small datasets and focusing on forming new theories and insights (Hussey and Hussey, 1997).
Quantitative paradigm = It is about holding or rejecting hypotheses. Statistical programs are used
in order to examine the collected data and come to the conclusion of holding or rejecting the hypotheses (Hussey and Hussey, 1997).
Replication study = Repeating a study using the same methods but with different subjects and
experimenters.2
SPSS = A statistical computer program called ‘Statistical Package for the Social Sciences’.3
Stata = A complete, integrated statistics package that provides everything needed for data analysis.4
2 Explorable.com (Jun 12, 2009). Replication Study. Retrieved Jun 21, 2015 from Explorable.com: https://explorable.com/replication-study
3 http://www-01.ibm.com/software/analytics/spss/
Entrepreneurs or self-employed workers = People who consider themselves a freelancer, self-employed in a one-person business, company owner or partner in a partnership, owner of a private limited liability company, or an entrepreneur in some other way.
Square variable = A squared variable, which makes it possible to test for a quadratic relationship between variables. This is particularly applicable to the age variable, see ‘life-cycle effect’ (Blanchflower & Oswald, 2008).
Superstar theory = Comparisons of mean earnings of self-employment and paid employment will
be strongly influenced by a handful of high-income entrepreneurial superstars (Rosen, 1981).
Taxable income = The income that is stated on the tax return form.
Wage workers = People who worked for an employer in the current year.
1. Introduction
1.1 Popularity of entrepreneurship
Recently, an article appeared in Exchange Magazine about the increasing popularity of entrepreneurship. It says that “Be successful!”, “Get rich!” and “Run your own business!” are frequently heard phrases these days. For centuries, people have found it important to be successful and powerful, but this belief is now stronger than ever.5 People relate power and success to entrepreneurship because of prominent examples like Donald Trump, Steve Jobs and Mark Zuckerberg. They think “Those people aren’t different from me, so why can’t I be successful?”6 Because of that, more and more people dream of becoming an entrepreneur, and society has started to stimulate this thought. Nowadays, we see a growing trend of universities, colleges and organizations that encourage people to go into business for themselves and show them how to do it. The Master Entrepreneurship (joint degree VU and UvA) is a great example of that.
However, is it only because of a better image that people want to start a business themselves? The answer is no. According to Towler (2008) and Parker (2009), people are bored with their daily job and in search of a new adventure. This adventure cannot always be found in another job because of a lack of education, a lack of skills or age limitations. Starting your own business overcomes these obstacles and gives people the freedom to start an adventure all on their own and make their hard work benefit themselves directly. Parker (2006) implies that being your own boss creates greater job satisfaction and stimulates people to work harder.
Entrepreneurship might become more popular for non-pecuniary reasons, but does entrepreneurship pay? Hamilton (2000) conducted an empirical analysis of the returns to self-employment in the U.S. and compared these to the salaries of wage workers. He found that entrepreneurs have both lower initial earnings and lower earnings growth than workers in paid employment. The results follow the superstar model, in that a few entrepreneurs earn substantial returns in self-employment (Georgellis, Sessions & Tsitsianis, 2005).
Hamilton (2000) concludes that people are willing to sacrifice substantial earnings in exchange for the non-pecuniary benefits of being their own boss.
5 http://jansimson.com/2012/10/26/why-do-you-want-to-be-successful/
The increasing popularity of entrepreneurship is also recognized in the Netherlands. Nowadays, it is more often a topic of discussion and an increasing number of people aspire to become an entrepreneur. Therefore, it is valuable to find an answer to the question of Hamilton (2000): “Does entrepreneurship pay?”
Consequently, a replication of Hamilton (2000) is carried out to explore whether his results are also applicable to the Netherlands. It gives insight into the difference in returns between wage workers and self-employed workers in the Netherlands and into whether this difference is the same as the one Hamilton (2000) found for the United States.
1.2 Theoretical relevance
Entrepreneurship is a relatively new field of study that receives more and more attention from scholars. Every year, new scientific articles about entrepreneurship are published. 7 This thesis contributes to the existing body of literature on self-employment and wage worker earnings and is an extension of the paper of Hamilton (2000). It provides insight into the question “Does entrepreneurship pay?”. This question has not yet been answered for the Netherlands in the current literature and is therefore interesting to investigate. The thesis builds on the following researchers, who have also addressed this question.
Astebro (2012) found that 75% of all self-employed would have been better off not entering entrepreneurship as their wages in employment would have been higher.
Hamilton (2000) found that entrepreneurs have both lower initial earnings and lower earnings growth than in paid employment.
Levine & Rubinstein (2013) found that entrepreneurs earn much more per hour and work many more hours than their salaried counterparts.
Astebro, Braunerhjelm & Brostrom (2013) found that the earnings of academics who entered entrepreneurship are similar before and after becoming an entrepreneur, and that dividends and capital gains are inconsequential. However, the income risk is more than three times higher in entrepreneurship.
Braguinsky & Oyama (2007) found that entrepreneurship does generate significant positive monetary returns for those who actually make intensive use of high-level education in their jobs.
These academic articles all have different outcomes; some of them may be in line with the results of this thesis and some may contradict it. These differences can motivate other researchers to do further research in this field of study.
1.3 Practical relevance
It is relevant to know whether entrepreneurship pays because many people enter self-employment expecting to earn a lot of money in a short time. They follow their role models and are overenthusiastic about doing it themselves. This can result in a very disappointing reality. The outcomes of this paper give insight into what people can expect from being an entrepreneur in terms of monetary earnings. This will help individuals decide whether or not to become an entrepreneur.
1.4 Research question
As pointed out earlier in this chapter, the researcher focuses on Dutch households, for which the incomes of wage workers and self-employed workers are investigated. In order to examine this subject, the following research question is applied:
“Does entrepreneurship pay?”
Because the research question cannot be answered in a few words, a number of sub-questions are formulated. Answering these questions will provide enough knowledge to answer the research question.
- What is entrepreneurship?
- What is the trend in entrepreneurship?
- Why do people want to become an entrepreneur?
- What are specific characteristics of an entrepreneur?
- What situations direct people into entrepreneurship?
- How is entrepreneurship income measured?
- How is wage worker’s income measured?
- Which variables affect the taxable income?
2. Literature review
2.1 Introduction
This section contains an overview of the existing literature on why people become self-employed and on how their earnings compare with the salaries of employees. The aim is to provide relevant views, outcomes and theories from the highly regarded scholars who focus on this topic. First, the trend in entrepreneurship is explained. After that, an explanation is given of why people become entrepreneurs, what characteristics they have, which situations direct people into self-employment, how the income of entrepreneurs is measured and whether entrepreneurship pays in comparison with employee salaries.
2.2 The trend in entrepreneurship
Entrepreneurship is one of the most elusive and least understood kinds of economic behavior (Oswald & Blanchflower, 1990). The topic of entrepreneurship is currently experiencing a revived interest, as shown by the research agendas of today’s empirical researchers Baron (2006), Rindova, Barry & Ketchen (2009), Fisher (2012), Schipper (2012), Koellinger, Minniti & Schade (2007) and Frese & Gielnik (2014).
Astebro (2012) noticed a temporary decline in the number of entrepreneurs after 1960, and Hamilton (2000) stated that entrepreneurship increased markedly from the mid-1970s. The Kauffman Index of Entrepreneurial Activity shows a peak during the years between 2007 and 2012. The economic downturn is a logical explanation for this: if people get fired and there are no jobs available, they are more or less compelled to start their own business. This trend is also identified for the Netherlands, where the number of entrepreneurs has been increasing by 4% per year since the crisis. 8 This upward trend is in line with the thesis introduction, which described the increasing number of people who dream about entrepreneurship and start following courses and studies for it.
The increase in entrepreneurial activity causes a lot of dynamics in society. Examples are financial deregulation, changing taxation rules, boosting policies and rising house prices (Astebro, 2012). Van Praag & Van Ophem (1995) mention that entrepreneurship is of theoretical as well as political interest. Governments are of the opinion that business creation is necessary for a healthy economy and that the natural supply is insufficient. Therefore, a society needs to stimulate it. New businesses improve our standard of living through their innovations and create wealth for the economy. They also create additional jobs and the conditions for a flourishing society.
2.3 Why people become entrepreneurs
According to the literature, earnings are not the only important reason why people become entrepreneurs. Oswald & Blanchflower (1990) imply that entrepreneurs need not earn more than employees because part of the return comes in non-pecuniary form. Self-employed workers derive utility partly from income and partly from being their own boss.
This is in accordance with Hamilton (2000) and Astebro (2012), who state that non-pecuniary benefits are important for individuals who work for themselves and that non-pecuniary effects, rather than earnings expectations, may dominate the choice to become an entrepreneur. However, it is important to mention that, although Hamilton (2000) said that entrepreneurs are willing to remain in business despite the lower realized earnings, he only observes stay probabilities for two years after entry. Over time, lower-performing entrepreneurs do exit (Astebro, 2012). People cannot live on non-pecuniary benefits alone, because bills have to be paid and children have to be taken care of. If there is not enough money available to cover that, entrepreneurs have to exit and try to get a job elsewhere.
Despite the lower earnings, there are many reasons to choose entrepreneurship. An entrepreneur has more autonomy because he is the one who makes the decisions and directs the employees. He also has more flexibility, greater skill utilization and greater job security, in the sense of not having to fear being let go at the end of a contract period. Being entrepreneurial is a risky form of economic behavior, which can affect job security. However, this risk lies in the hands of the entrepreneur himself.
Mostly because of the interesting work and the greater autonomy, self-employed people are more satisfied with their work than wage workers (Astebro, 2012). According to Parker (2006), people who chose entrepreneurship were also motivated by the prospect of better working conditions, such as work hours, work environment and colleagues. Other reasons that influence the decision are parents with a family business, a lack of available jobs and seeing the opportunity for a new business. The last two motivations are in line with Davidsson’s (2005) distinction between necessity-driven entrepreneurship and opportunity-driven entrepreneurship. Necessity entrepreneurship is the concept where an inconvenient initial situation leads to the decision to become self-employed. This could be the case if the current situation as an employee offers poor future prospects or insufficient financial rewards. Opportunity entrepreneurship is the concept where someone becomes self-employed due to the opportunities available in self-employment. The possible access to higher financial returns is an example of an opportunity that can be obtained by becoming self-employed (Davidsson, 2005).
Schumpeter offers more intrinsic motivations. He states that individuals feel the impulse to fight, to prove oneself superior to others, to succeed for the sake, not of the fruits of success, but of success itself (Parker, 2006). This intrinsic motivation is especially applicable to men. The social researchers Dyke & Murphy (2006) identified gender differences in what it means to be successful. Whereas women find it important to have balance and relationships, men focus more on material success and strive to be better than others. This is one of the reasons that entrepreneurial activity is higher among men than among women.9
Nevertheless, most people are still convinced that they will earn more money as an entrepreneur and name that as their main motivation. They cannot stop thinking about the famous successful entrepreneurs and are convinced by the idea of “if they can do it, I can definitely do it!” (Parker, 2006).
2.4 Characteristics of potential entrepreneurs
Every person is different and those differences ensure that people belong to certain groups. Entrepreneurs are one of those groups and have their own characteristics.
Oswald & Blanchflower (1990) use microeconometric methods to discover what those characteristics are and found that individuals are more likely to be self-employed if they
(i) are men, with children and have a self-employed wife,
(ii) have a father who was a manager of an enterprise employing fewer than 25 people,
(iii) live in an area of low unemployment.
It is remarkable that having children belongs to these characteristics, because Drucker (2015) concludes that this is an obstacle for an entrepreneur. He wrote in his book that children bring responsibilities and that entrepreneurship is too risky to meet these.
Evans and Leighton (1989) show that individuals preferring greater autonomy are more likely to become self-employed. Conversely, Kanbur (1982) emphasizes the role of risk aversion in the self-employment decision, suggesting that business owners may earn a risk premium because of the greater uncertainty of their earnings. Entrepreneurs are real risk seekers or are at least not risk-averse. They are optimists, over-confident, and skew lovers (Astebro, 2012).
In general, most of those who select entrepreneurship have high human capital. It seems that they have a higher return than workers with low human capital. Parker (2006) found that self-employed people on average have more balanced skills than wage workers and that every extra year of education increases enterprise income by 6.1%.
A reason for this difference in human capital is that entrepreneurs are on average older, and therefore more experienced, than wage workers when entering (Astebro, 2012).
However, Parker (2006) also found that highly educated individuals prefer wage work because their earnings as employees are higher than they will ever be as entrepreneurs.
According to Bhide (2000), most successful founders have college degrees and come from a middle-class background.
Levine & Rubinstein (2013) say that a potential entrepreneur has a very distinct mixture of cognitive, non-cognitive, and family traits that differs from that of salaried workers. The entrepreneur tends to be better educated and more likely to come from a high-earning, two-parent family. Furthermore, as a child, he tends to have higher learning competence and self-esteem scores and exhibits more risky behavior than his salaried counterpart.
The cited findings show that potential entrepreneurs share a number of characteristics. The potential entrepreneur is an educated, independent man with no fear of risk who is surrounded by an entrepreneurial atmosphere.
2.5 Situations that direct people into entrepreneurship
There are some situations that make entrepreneurship more attractive.
One of them is long unemployment duration, which has been found to make workers accept less desirable jobs. When these less desirable jobs are not available, one of the options left is entering entrepreneurship and creating one’s own job (Parker, 2006). The same holds for people who receive relatively low wages. They would like to earn more and expect this to happen when they become an entrepreneur (Astebro, 2012).
The second situation concerns people who have unstable work and have to change jobs many times. People with more unstable work histories are more likely to enter entrepreneurship.
An advantage for people who changed jobs frequently is that they learned the various necessary tasks of entrepreneurship before becoming an entrepreneur. This makes them more experienced and increases the probability of success.
The third situation concerns low-skilled individuals. After working in different job categories, they conclude that they have no skills worth specializing in. As a result, they can only apply for jobs with low pay and prestige and will eventually choose self-employment in the hope of a better future (Astebro, 2012).
2.6 How are the earnings of entrepreneurship being measured?
The income of employees is simply their wages, but the income of self-employed workers is often difficult to measure and to interpret. Entrepreneurs are responsible for reporting their own incomes to the tax authorities, which makes it easy for them to evade income tax. Entrepreneurs who answer survey questionnaires might therefore be unwilling to report their earnings to the interviewer (Parker, 2006).
Hamilton (2000) acknowledges an alternative measure of self-employment earnings:
revenues - expenses = draw + retained earnings
The difference between revenues and expenses (including depreciation) is the net profit. This amount of money may be withdrawn from the business in the form of salary or reinvested in the business. The salary is called the draw and the reinvestment the retained earnings.
The net profit is generally an accounting profit that is often thought to be understated because of the overstatement of expenses and/or the underreporting of profit.
An economic definition of self-employment earnings should account for the opportunity cost of equity invested in the firm. The draw in the equation is the amount of consumption the business generates for its owner. The equity-adjusted draw (EAD) is therefore the draw plus the change in business equity between the beginning of period t and period t+1. The EAD is adjusted to account for the opportunity cost of business equity (Hamilton, 2000). Astebro (2012) and Parker (2006) agree with these three measures (net profit, draw and EAD).
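The relationship between the three earnings measures can be illustrated with a small Python sketch. It is not taken from Hamilton (2000): the class and function names and the example figures are hypothetical, and the way the opportunity cost is deducted (a flat rate applied to start-of-period equity) is an assumption made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class BusinessYear:
    """Hypothetical one-period record of a self-employed worker's business."""
    revenues: float
    expenses: float        # including depreciation
    draw: float            # amount withdrawn by the owner as salary
    equity_start: float    # business equity at the beginning of period t
    equity_end: float      # business equity at period t+1


def net_profit(year: BusinessYear) -> float:
    # revenues - expenses = draw + retained earnings
    return year.revenues - year.expenses


def retained_earnings(year: BusinessYear) -> float:
    # The part of net profit that is reinvested rather than withdrawn.
    return net_profit(year) - year.draw


def equity_adjusted_draw(year: BusinessYear, opportunity_cost_rate: float = 0.0) -> float:
    # Draw plus the change in business equity over the period.
    # ASSUMPTION: the opportunity cost is modelled as a flat rate on
    # start-of-period equity; the source does not specify the adjustment.
    change_in_equity = year.equity_end - year.equity_start
    return year.draw + change_in_equity - opportunity_cost_rate * year.equity_start


# Hypothetical figures, invented for illustration only.
example = BusinessYear(revenues=100_000, expenses=70_000, draw=20_000,
                       equity_start=50_000, equity_end=60_000)
print(net_profit(example), retained_earnings(example),
      equity_adjusted_draw(example, opportunity_cost_rate=0.05))
```

In this invented example the draw understates what the business yields for its owner, because part of the net profit is retained and shows up as growth in equity; the EAD picks up that growth.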
According to Parker (2006), net profit is the most used measure of self-employment income.
2.7 Does entrepreneurship pay?
2.7.1 Comparing wage workers with entrepreneurs
“Does entrepreneurship pay?” is a frequently asked question. It is based on a comparison with wage salaries: people find entrepreneurship interesting when it pays more than their potential salary. Astebro (2012) finds that the earnings of entrepreneurs never overtake the salaries of wage workers and that the earnings profile is more concave for entrepreneurs than for wage workers. However, Levine & Rubinstein (2013) contradict this conclusion, reporting that wage workers who become self-employed earn about 18% more than they were earning before. This difference might be explained by the population that is studied. Astebro (2012) investigated academics who start a new business, whereas Levine & Rubinstein (2013) looked into a random population in which all kinds of people were questioned. Parker (2006) stated that the earnings of highly educated individuals will always be higher as wage workers than as entrepreneurs, and this can be the reason that Astebro (2012), who observed highly educated people, has different results than Levine & Rubinstein (2013), who observed all kinds of individuals.
Hamilton (2000) reports that both earnings and earnings growth are lower for entrepreneurs than for wage workers. He also states that entrepreneurs sometimes earn as much as 25% less than wage workers, and that an entrepreneur who switched to a wage job would immediately earn more, no matter how long he had already been in business. Nevertheless, there are entrepreneurs who do earn more, the so-called superstars. These are the top 25% of entrepreneurs. The superstar theory suggests that comparisons of the mean earnings of self-employment and paid employment will be strongly influenced by a handful of high-income entrepreneurial superstars (Rosen, 1981). This corresponds with the positively skewed distribution of self-employment income mentioned by Astebro (2012). He compared it with that of wage workers and found that the distribution of self-employment income is more positively skewed than that of wage income and that its standard deviation is consistently higher. He noticed that the 99th percentile of the self-employment income is higher than 50% of all income of wage workers.
In line with this large standard deviation, Parker (2006) found that in most countries the incomes of entrepreneurs are more unequal than those of employees. Higher proportions of the self-employed earn both very low and very high incomes compared with employees. Of those with very low earnings, one sixth earn less than the minimum wage. For someone aged 23 or older, the annual minimum wage is €18,021.60 in the Netherlands and $15,080 in the United States. 10,11
Fairlie (2005) conducted a longitudinal study to examine the same question and looked at young men and women from disadvantaged families. He found some evidence that male entrepreneurs earn more than wage workers and that female entrepreneurs earn less than wage workers. This is associated with the trait differences between genders described in section 2.4.
Hamilton (2000) analyzed median rather than mean incomes to avoid distortions arising from the pronounced skew in the distribution of self-employment incomes. Astebro (2012) and Levine & Rubinstein (2013) confirm his conclusion that the median self-employed person earns less than his salaried counterpart, while having comparable cognitive and non-cognitive traits.
10 https://www.minimumloon.nl 11
2.7.2 Switching in employment status
Individuals who switched from wage labor to entrepreneurship experienced a loss in income: 75% would have been better off staying in wage work. Only the top 0.5% earn about 1400% more. People with a lower income as wage workers have a greater chance of experiencing an increase in income; for people with high incomes this chance is much lower (Holtz-Eakin et al., 2000; Astebro, 2012). However, according to Gentry and Hubbard (2004), there is great upward income mobility among entrepreneurs, which makes it a matter of time until earnings rise.
Additionally, it should not be forgotten that self-employed people often lack access to social security benefits and that paid employees are more likely to have all or part of their health insurance paid for by their employer. Such non-wage benefits represent 20% of paid-employment compensation (Hamilton, 2000; Parker, 2006). So there might be an increase in income by
3. Methodology
3.1 Introduction
This chapter covers the methodology used to answer the research question. First, the research strategy is reviewed, explaining which strategy is applied to test the main research question. Subsequently, the research design is introduced, presenting the data description, the data analysis and the variables. A brief summary is given at the end of this chapter in section 3.4.
3.2 Research strategy
This section contains a short explanation of two research paradigms. According to Hussey and Hussey (1997) the research paradigm is about how research should be conducted in a scientifically accepted manner. Those two paradigms are the qualitative and the quantitative paradigm.
The qualitative research approach is associated with an explorative way of examining, focusing on forming new theories. In-depth interviews and face to face conversations are examples of research methods for this paradigm.
The quantitative paradigm is about testing hypotheses. Statistical programs such as SPSS or Stata are used to examine the dataset and to conclude whether the hypotheses are rejected or not.
For this thesis, the quantitative paradigm is chosen. Thousands of answers from internet surveys have been collected and are now ready for analyzing.
The answers from the respondents will provide further understanding about the income of entrepreneurs and wage workers.
The researcher started with a broad dataset consisting of more than 100 variables and filtered out more than 75% of them before doing the analyses. The analyses are done at the meso level, which means that the units of interest are groups: in this research, wage workers and self-employed workers.
3.3 Research design
This paragraph will cover the data description, data analysis and a description of the variables. It provides a clear understanding of the dataset and the variables which are used in statistical analyses in chapter 4.
3.3.1 Data Description
3.3.1.1 Dataset
In this paper use is made of data of the LISS (Longitudinal Internet Studies for the Social sciences) panel administered by CentERdata (Tilburg University, The Netherlands). The LISS panel is a representative sample of Dutch individuals who participate in monthly Internet surveys. The panel is based on a true probability sample of households drawn from the population register. Households that could not otherwise participate are provided with a computer and Internet connection. A
longitudinal survey is fielded in the panel every year, covering a large variety of domains including work, education, income, housing, time use, political views, values and personality.12
The time-series data chosen for this thesis focus mainly on work and income and consist of six waves. The questionnaires covered everything related to possible earnings and income deductions. Some questions were multiple choice and others were open questions.
This dataset did not contain information about gender and ethnicity. The background variables and ethnicity variables were therefore merged in using the statistical program SPSS. This was done for all six waves separately, after which the six years were merged together. At that point the cross-sectional datasets became a panel dataset. A longitudinal, or panel, dataset is one that follows a given sample of individuals over time and thus provides multiple observations on each individual in the sample (Hsiao, 2003). The panel data for this research are unbalanced because not every individual has the same number of observations.
Initially, the data included 32992 observations of Dutch people from 2008 until 2013. The number of observations equals the number of individuals N times the number of interviews over time. Only males between the ages of 18 and 65 with a Dutch or western background are selected. The remaining observations are left out.
After filtering out the missing values of taxable income, 13923 observations are left. After removing the female respondents from the dataset, 6794 observations remain. After applying the filter for origin, only 3499 observations are left. After applying the filter for age, the dataset ends up with 3358 observations.
Most of the observations were filtered out because respondents did not fill in their taxable income: 57.80% did not answer the question. This variable of main interest is asked as an exact amount, and people might find it too personal to disclose.
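The selection steps above can be sketched as follows. The observation counts are the ones reported in the text; the record layout, the field names and the origin codes 0, 101 and 201 for Dutch, first-generation western and second-generation western backgrounds are assumptions made for illustration only.

```python
# Observation counts reported in the text after each selection step.
STEPS = [
    ("all observations 2008-2013", 32992),
    ("taxable income reported", 13923),
    ("males only", 6794),
    ("Dutch or western background", 3499),
    ("aged 18 to 65", 3358),
]

# ASSUMED origin codes: Dutch, first- and second-generation western background.
DUTCH_OR_WESTERN = {0, 101, 201}


def retention(steps):
    """Return (label, kept, share kept relative to the previous step)."""
    rows = []
    previous = steps[0][1]
    for label, kept in steps:
        rows.append((label, kept, kept / previous))
        previous = kept
    return rows


def in_sample(record):
    """Selection rule from the text applied to one hypothetical record."""
    return (
        record["taxable_income"] is not None
        and record["male"] == 1
        and record["origin"] in DUTCH_OR_WESTERN
        and 18 <= record["age"] <= 65
    )


for label, kept, share in retention(STEPS):
    print(f"{label:32s} {kept:6d}  {100 * share:6.2f}% of previous step")

example = {"taxable_income": 38000, "male": 1, "origin": 0, "age": 42}
print(in_sample(example))
```

The share dropped in the first step, 1 - 13923/32992, reproduces the 57.80% of observations without a reported taxable income.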
12
3.3.1.2 Variables
After removing all unnecessary variables and creating some new ones, thirteen variables are left. These consist of the independent variables wage worker and self-employed worker, the dependent variable taxable income, and the control variables gender, age, age squared (agesq), origin, education, civil status, position in the household, early retirement, work experience and work experience squared.
Gender is a dummy variable where 0 is female and 1 is male. Only the males are selected; they represent 48.80% of the data. This percentage is representative of the Dutch population.13
Age is a numerical variable and only the values between 18 and 65 are kept because they belong to the workforce.14 Age squared is a numerical variable computed as age times age.
Origin is measured on a scale with four numbers from 0 to 202, ranging from Dutch background to second generation with a non-western background. 47.20% of the respondents consider themselves Dutch, 1.7% are first-generation foreigners with a western background and 2.6% are second generation with a western background. Only the people with a Dutch or a western background are kept.
Education is measured on a scale from 1 to 7, ranging from primary school to WO (university). Civil status is measured on a scale from 1 to 5, ranging from married to never been married. Position in the household is measured on a scale from 1 to 7, ranging from household head to family member.
Work experience is measured with the formula EXP = AGE – ED – 6, where EXP is work experience, AGE is age and ED is education (Garvey & Reimers, 1979). Work experience squared is computed as work experience times work experience.
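As a minimal sketch, the two derived variables can be computed as below. The function names are hypothetical, and the example assumes that education enters the formula in years; the thesis does not state the unit explicitly.

```python
def potential_experience(age, education):
    """EXP = AGE - ED - 6, as in the formula above."""
    return age - education - 6


def squared(value):
    """Squared term used to capture a non-linear relationship."""
    return value * value


# Hypothetical respondent: 40 years old with 16 years of education (assumed unit).
exp = potential_experience(40, 16)
exp_sq = squared(exp)
print(exp, exp_sq)
```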
The dependent variable taxable income is distributed as in figure 1. The results of the data analyses will show what part of this average taxable income is related to wage workers and what part is related to self-employed workers.
13 www.indexmundi.com
14
Figure 1: The average taxable income for the years 2008-2013.
3.3.2 Data analysis
3.3.2.1 Introduction
After structuring the data with SPSS, the analyses are done using the statistical analysis program Stata. The data are analyzed as follows. First, descriptive analyses are done to gain further understanding of the dependent and independent variables. Each variable is investigated by means of summary tabulations. For the independent variables, the tabulations contain the mean separated by employment status. For the dependent variable, the mean, the standard deviation and three quantiles are computed.
After these descriptive statistics, the dataset is analyzed using pooled OLS regressions and quantile regressions. Whereas the method of least squares results in estimates that approximate the
conditional mean of the response variable given certain values of the predictor variables, quantile regression aims at estimating either the conditional median or other quantiles of the response variable (Hartog, Pereira & Vieira, 2001).
Finally, a fixed effects estimator is used to allow for time-invariant effects for each entity that are possibly correlated with the regressors.
3.3.2.2 Pooled OLS regression and quantile regression
A regression on panel data differs from a regression on cross-sectional data in that its variables carry a double subscript. The formula that underlies the regression analysis is
$$y_{it} = \alpha + X_{it}'\beta + u_{it}, \qquad i = 1, \dots, N; \quad t = 1, \dots, T \qquad \text{(eq. 1.1)}$$
where i denotes individuals (the cross-section dimension) and t denotes time (the time-series dimension). $\alpha$ is the intercept, which is the level of income when $X_{it}$ is zero. $\beta$ is a K x 1 vector, also known as the slope of the regression, and $X_{it}$ is the it-th observation on K explanatory variables. For this research, $y_{it}$ measures the earnings of an individual, whereas $X_{it}$ contains a set of variables such as work experience, education and gender (Baltagi, 2012). Most panel data applications use a one-way error component model for the disturbances, with
$$u_{it} = \mu_i + v_{it} \qquad \text{(eq. 1.2)}$$
where $\mu_i$ indicates the unobservable individual-specific effect and $v_{it}$ indicates the remainder disturbance. Note that $\mu_i$ is time-invariant and accounts for any individual-specific effect that is not included in the regression. In this case, we could think of it as an unobserved ability, characteristic or skill of the individual. The remainder disturbance $v_{it}$ varies with individuals and time and can be thought of as the usual disturbance in the regression (Baltagi, 2012).
To visualize this equation, a regression line looks like the linear line in figure 2 where the intercept, slope and residual or error term are shown.
Figure 2: Visualization of a linear regression line with an intercept, slope and residual error. (Not based on real data)
As an example of this formula, suppose that income rises with education. Every individual earns the minimum wage of €18,021.60, which is the value of α at X = 0. For every additional unit of X, income rises by the slope of 1.025, and an error term is added for a specific skill, such as sales skills, that the individual possesses.
For the quantile regression, the equation is a little bit different.
$$y_i = \beta_0^{(p)} + \beta_1^{(p)} x_i + \varepsilon_i^{(p)} \qquad \text{(eq. 1.3)}$$
where 0 < p < 1 indicates the proportion of the population having scores below the quantile at p (Hao & Naiman, 2007).
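To illustrate what distinguishes a quantile regression from a least-squares regression, the sketch below fits the simplest possible model, a constant only, by minimizing the asymmetric absolute ("check") loss that underlies quantile regression. The incomes are hypothetical and the function names are not taken from any package; Stata's own routine is not reproduced here.

```python
from statistics import mean


def check_loss(values, constant, p):
    """Sum of asymmetric absolute deviations for quantile p (0 < p < 1)."""
    total = 0.0
    for value in values:
        residual = value - constant
        # Positive residuals are weighted by p, negative ones by (1 - p).
        total += p * residual if residual >= 0 else (p - 1) * residual
    return total


def best_constant(values, p):
    """Constant that minimizes the check loss; a minimizer always lies
    at one of the observed values, so only those are tried."""
    return min(sorted(values), key=lambda c: check_loss(values, c, p))


# Hypothetical, right-skewed incomes in euros.
incomes = [15000, 22000, 30000, 44000, 250000]

print(mean(incomes))                 # pulled upwards by the single high earner
print(best_constant(incomes, 0.50))  # the median
print(best_constant(incomes, 0.25))  # the lower quartile
```

With p = 0.5 the minimizer is the median, which is insensitive to the one very high income, whereas the mean is not. This is the reason for reporting quantile regressions next to OLS when the income distribution is skewed.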
The hypotheses for a regression are
H0 = coefficient is equal to zero (no effect)
H1 = coefficient is not equal to zero (effect)
A coefficient with a p-value less than 0.05 indicates a rejection of the null hypothesis. In that situation, a predictor variable is likely to be a meaningful addition to the regression model because changes in predictor’s value are related to changes in the response variable (Frost, 2013).
A coefficient with a p-value greater than 0.05 means that the null hypothesis cannot be rejected: the data provide no evidence, at the 5% level, that changes in the predictor are related to changes in the response variable.
3.3.2.3 Fixed effect estimator
After the regressions, a summary effect is estimated. The meaning of a summary effect differs between the fixed effect and the random effect. The fixed effect assumes one true effect size that underlies all studies in the analysis, so that differences among studies are purely random error due to chance. Fixed effects models, also known as the within estimator, control for the effects of time-invariant variables with time-invariant effects. This means that the model cannot estimate coefficients for time-invariant variables, because these are absorbed when controlling for individual characteristics. Examples of time-invariant variables are gender and race, because they do not change over time. The fixed effect ignores heterogeneity, but the random effect does not. Heterogeneity is meant here in the sense that studies are taken from different populations, different countries and different times. The random effect incorporates this heterogeneity and considers both within-study variance and between-study variance. It assumes that the true effect size varies from one study to the next, and that the studies in the analysis represent a random sample of effect sizes that could have been observed. The summary effect is then the estimate of the mean of these effects (Borenstein et al., 2009).
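To decide which one to choose, a Hausman test is conducted. This test does not look at standard deviations but at coefficients: it compares the fixed effects and random effects estimates, and its test statistic is chi-squared distributed under the null hypothesis. The hypotheses are
H0 = coefficients are the same
H1 = coefficients are different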
With a chi-squared of 27.51 and a p-value of 0.0000, the null hypothesis is rejected, so the fixed effects estimator is the better choice for this research. Using this estimator on panel data allows for time-invariant effects for each entity that are possibly correlated with the regressors.
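For reference, the Hausman statistic is usually written in the following textbook form. The notation is an assumption of this sketch and is not taken from the thesis: $\hat\beta_{FE}$ and $\hat\beta_{RE}$ are the fixed effects and random effects estimates of the k coefficients that both models identify, and $\widehat{V}$ denotes their estimated covariance matrices.

```latex
H = (\hat\beta_{FE} - \hat\beta_{RE})'
    \left[ \widehat{V}(\hat\beta_{FE}) - \widehat{V}(\hat\beta_{RE}) \right]^{-1}
    (\hat\beta_{FE} - \hat\beta_{RE})
    \;\sim\; \chi^2_k \quad \text{under } H_0
```

A large value of H, such as the reported 27.51, indicates that the two sets of coefficients differ systematically, which points towards the fixed effects estimator.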
Fixed effects models need repeated observations for each group of self-employed workers and wage workers. This is because the model relies on within-group variation and requires a reasonable amount of variation in the key X variables within each group.
It is important to estimate fixed effects models because data often fall into categories and the characteristics of these categories might affect the dependent variable.
Unfortunately, researchers can never be certain that they have all the relevant control variables, so if they estimate an ordinary least squares (OLS) model, they have to worry about unobservable factors that are correlated with the variables included in the regression. Omitted variable bias would be the result. If these unobservable factors are believed to be time-invariant, then fixed effects regression has the power to eliminate omitted variable bias (Plümper & Troeger, 2007).
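The way the within estimator removes a time-invariant individual effect can be shown on a tiny, noise-free panel. All numbers and names below are hypothetical: two individuals are observed three times each, the true slope is 2, and the individual effect is deliberately correlated with x so that pooled OLS is biased.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Within transformation: subtract each individual's own mean."""
    group_means = {
        g: mean([v for v, h in zip(values, groups) if h == g])
        for g in set(groups)
    }
    return [v - group_means[g] for v, g in zip(values, groups)]


# Hypothetical panel with y_it = 2 * x_it + mu_i and no remainder disturbance.
ids = ["A", "A", "A", "B", "B", "B"]
x = [1, 2, 3, 4, 5, 6]
mu = {"A": 0, "B": 10}  # individual effect, higher for the high-x individual
y = [2 * xi + mu[i] for xi, i in zip(x, ids)]

pooled_slope = ols_slope(x, y)
within_slope = ols_slope(demean_by_group(x, ids), demean_by_group(y, ids))

print(pooled_slope)  # overstates the slope because mu_i is omitted
print(within_slope)  # recovers the true slope of 2
```

Because the individual effect is constant over time, it drops out when each observation is expressed as a deviation from the individual's own mean. The same step also removes any regressor that does not vary within an individual.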
3.3.2.4 Kernel density graph
Finally, a kernel density graph is created to show the differences between the taxable income of wage workers and self-employed workers. It is a smooth graph that shows which income levels have the highest density and how skewed the income distribution is. The area under the curve is always equal to one. It also shows which employment group has the largest outliers. When the graph has a negative (left) skew, the highest density lies at high amounts of taxable income and the outliers earn a small income. When the graph has a positive (right) skew, most of the density lies around a low amount of taxable income and the outliers earn a high income. The distribution of self-employment income described in the literature review corresponds with the right-hand panel of figure 3.
Figure 3: Visualization of a negative (left) skew and a positive (right) skew distribution (not based on real data).
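A kernel density estimate of this kind can be sketched in a few lines. The sample, the Gaussian kernel and the bandwidth of 10 are assumptions chosen for illustration; they are not the settings used for the graphs in this thesis.

```python
import math
from statistics import mean, median


def gaussian_kde(sample, bandwidth):
    """Return a density function: the average of Gaussian bumps, one per observation."""
    scale = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return scale * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


def trapezoid_area(f, lower, upper, steps):
    """Numerical integral of f between lower and upper."""
    width = (upper - lower) / steps
    total = 0.5 * (f(lower) + f(upper))
    for k in range(1, steps):
        total += f(lower + k * width)
    return total * width


# Hypothetical incomes in thousands of euros, with one high outlier.
incomes = [15, 22, 30, 40, 44, 53, 250]
density = gaussian_kde(incomes, bandwidth=10.0)

area = trapezoid_area(density, -85.0, 350.0, 4000)
print(round(area, 4))                   # area under the curve
print(mean(incomes) > median(incomes))  # mean above median: positive (right) skew
```

The integral confirms that the area under the estimated curve is one, and the single high income pulls the mean above the median, which is the signature of a positive skew.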
3.3.2.5 Variables
The answers to the questions in the internet surveys are formed into variables and used in the regressions and fixed effect estimator. Appendix A displays the relevant questions that were asked and the related variables.
3.4 Summary
I started with a dataset of 32992 observations and ended up with 3358 observations. The dataset consists of answers to questions relating to the income of wage workers and self-employed workers. The paradigm used in this research is quantitative, and the analyses are conducted at the meso level. The variables used in this research are summarized in Appendix A. The descriptive analyses contain tabulations of the characteristics of the individuals. These are followed by more extensive analyses of the determinant and consequence variables, using a pooled OLS regression, quantile regressions and a fixed effects model. The kernel density graph shows the differences between the taxable income of wage workers and self-employed workers.
Merging the necessary data is done with SPSS, the analyses are done with Stata and the results are reported with Microsoft Word.
4. Results
In this section, the results for the methods described in the methodology are shown and interpreted.
4.1 Summary statistics
4.1.1 Explanatory variables
Table 1 presents the summary statistics for 2010 of the variables separated by employment status. Tables for the other five years are presented in appendix B.
Wage workers are people who worked for an employer in the current year. Entrepreneurs are people who consider themselves a freelancer, self-employed in a one-person business, a company owner or a partner in a partnership, the owner of a private limited liability company, or an entrepreneur in some other way.
Of the 3358 observations over six years, 667 are for the year 2010. Of these, 94.45% are wage workers and 5.55% are entrepreneurs. This is representative of the whole dataset, in which 3143 observations are wage workers and 215 are self-employed, or 93.60% and 6.40% respectively.
For these wage workers and entrepreneurs, the means of work experience, levels of education, marital status, origin, early retirement and household head are given in table 1.
Table 1
Variable Descriptions and Summary Statistics by Employment Sector, year 2010

Variable name      Description                                              Mean, Paid Employees  Mean, Self-Employed
Work_Experience    Potential labor market experience = age – education – 6  35.57                 38.70
No_education       Individual without education                             0.00                  0.00
Primary_school     Individual with a primary school degree                  0.01                  0.00
Secondary_school   Individual with a secondary school degree                0.31                  0.43
Junior College     Individual with a junior college degree                  0.25                  0.16
College            Individual with a college degree                         0.31                  0.43
University         Individual with a university degree                      0.13                  0.22
Marital_status     Married, spouse present                                  0.62                  0.51
Dutch_background   Dutch from origin                                        0.94                  0.92
Early_retirement   Whether someone is retired earlier or not                0.12                  0.17
Household_head     Main provider of income                                  0.87                  0.78
Observations                                                                630                   37
Table 1 suggests that entrepreneurs have, on average, more work experience than wage workers. Both groups hold educational degrees, but entrepreneurs score higher on the share of individuals with a college or university degree. However, they do not score higher on everything: entrepreneurs seem less likely to be married and less likely to be the main provider of income. More work experience and better education, but less often the main provider of income: that is a remarkable result. Entrepreneurs also appear less likely to be of Dutch origin and more likely to take early retirement. This comparison of means for wage workers and self-employed workers does not lead to clear conclusions, however, because it is not accompanied by p-values.
Therefore, an independent samples t-test is conducted to investigate which of the variables differ significantly between wage workers and entrepreneurs in 2010.
To test this, the hypotheses are:
H0: μ1 = μ2 (Means are the same)
H1: μ1 ≠ μ2 (Means are different)
To run this parametric test, several assumptions need to be fulfilled, including equal variances. To test for homogeneity of variances, a Levene’s test is done. Based on the F-value and the p-value from that test, the decision is made whether to use a t-test assuming equal variances or doing a t-test assuming unequal variances. The hypotheses of a Levene’s test are
H0: $\sigma_1^2 = \sigma_2^2$ (Variances are equal)
Ha: $\sigma_1^2 \neq \sigma_2^2$ (Variances are unequal)
Whenever the p-value of Levene’s test is below 0.05 (95% confidence level), the variances of the two groups differ significantly.
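The two variants of the t-test differ in how the standard error and the degrees of freedom are computed. The sketch below contrasts them on two small hypothetical samples; it uses the textbook formulas for the pooled-variance test and for Welch's unequal-variance test, and it is not the SPSS or Stata implementation.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_1, sample_2):
    """t statistic and degrees of freedom assuming equal variances."""
    n1, n2 = len(sample_1), len(sample_2)
    pooled_var = (
        (n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)
    ) / (n1 + n2 - 2)
    t = (mean(sample_1) - mean(sample_2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(sample_1, sample_2):
    """t statistic and Welch-Satterthwaite degrees of freedom (unequal variances)."""
    n1, n2 = len(sample_1), len(sample_2)
    v1 = variance(sample_1) / n1
    v2 = variance(sample_2) / n2
    t = (mean(sample_1) - mean(sample_2)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical samples with clearly different spreads.
group_1 = [10, 12, 14]
group_2 = [8, 12, 16, 20]

print(pooled_t(group_1, group_2))
print(welch_t(group_1, group_2))
```

The unequal-variance version generally yields non-integer degrees of freedom, which is why some rows of a t-test table report a fractional df while rows tested under equal variances report n1 + n2 - 2.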
The results of this Levene’s test and the t-test done afterwards are summed up in table 2.
Table 2
Independent samples t-test for variables in 2010

                   Levene’s Test for Equality of Variances   T-test for Equality of Means
Variables          F       p-value                           df      p-value  t
Work_Experience    0.23    0.63                              665     0.13     -1.53
No_education       0.71    0.40                              665     0.67     0.42
Primary_school     1.95    0.16                              665     0.49     0.69
Secondary_school   4.75    0.03                              39.67   0.15     -1.45
College            4.75    0.03                              39.67   0.15     -1.45
University         6.56    0.01                              38.86   0.24     -1.19
Marital_status     2.10    0.15                              665     0.21     1.26
Dutch_background   1.39    0.24                              665     0.55     0.60
Early_retirement   0.95    0.33                              201     0.61     -0.51
Household_head     6.27    0.01                              38.89   0.25     1.16
The t-tests for work experience, no education, primary school, marital status, Dutch background and early retirement are done while assuming equal variances. The secondary school, junior college, college, university and household head are done while assuming unequal variances.
None of the variables differs significantly between the two groups. At the 95% confidence level we therefore fail to reject the null hypothesis for every variable. This implies that there is no evidence that the means differ, so the earlier comparison of the results is not statistically supported.
4.1.2 Dependent variable
The variables above might influence the level of income that individuals earn. Hamilton (2000) measured income by wage, equity-adjusted draw, draw and net profit. In this paper it is measured by taxable income for both wage workers and entrepreneurs. Figure 4 shows the empirical distribution of taxable income for 2010, compared for wage workers and entrepreneurs. The kernel density graphs for the other years are in appendix B3.
Figure 4
Kernel density for year 2010
The figure exhibits two notable characteristics. First, the central tendency of the distribution of self-employment returns is lower than that of wage workers. Second, the taxable income of entrepreneurs is more skewed to the right, but the superstar theory seems to fit the wage workers better, as they have outliers that earn more than 600,000 dollars. However, this result about superstars might not hold for the whole Dutch population of interest because it is based on a small sample. It could simply be that not enough people have been surveyed to capture superstar entrepreneurs as well.
Consequently, simple comparisons of the mean earnings of business owners and paid employees hide substantial differences in the properties of the distributions of the two groups. Table 3 shows the mean, standard deviation and percentiles of the taxable income in year 2010. The other five years are in appendix B2.
Table 3
Summary Statistics: Taxable Income 2010

Statistic            Paid Employees  Self-Employed
Mean                 45550           41000
Standard Deviation   42877           54920
25th Percentile      30000           15000
50th Percentile      39702           22000
75th Percentile      53290           44000
Observations         630             37
The first row corresponds with the central tendency shown in the kernel density graph. On average, paid employees earn more per year than self-employed workers: they receive €4550 more taxable income, a difference of 11.10% relative to the self-employed mean. The income of entrepreneurs exhibits a greater dispersion. The median and the lower and upper quartiles emphasize the skewness of the distribution of taxable income in figure 4. For example, the median of paid employees is 80.46% higher than that of the self-employed. This confirms the conclusion of Astebro (2012), Levine & Rubinstein (2013) and Hamilton (2000) that the median self-employed person earns less than his salaried counterpart.
The results for the paid employees are probably more precise because there is a higher number of observations for them in comparison with the amount for self-employed workers.
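The percentage differences quoted above depend on which group is taken as the base. The short sketch below recomputes them from the Table 3 values; the function names are hypothetical.

```python
def percent_higher(value, reference):
    """Percentage by which `value` exceeds `reference`."""
    return 100.0 * (value - reference) / reference


def percent_lower(value, reference):
    """Percentage by which `value` falls short of `reference`."""
    return 100.0 * (reference - value) / reference


# Taxable income 2010, taken from Table 3.
paid = {"mean": 45550, "median": 39702}
self_employed = {"mean": 41000, "median": 22000}

# Paid employees relative to the self-employed (the base used in the text).
print(round(percent_higher(paid["mean"], self_employed["mean"]), 2))      # 11.1
print(round(percent_higher(paid["median"], self_employed["median"]), 2))  # 80.46

# The same gaps expressed relative to paid employees.
print(round(percent_lower(self_employed["mean"], paid["mean"]), 2))
print(round(percent_lower(self_employed["median"], paid["median"]), 2))
```

Expressed relative to paid employees, the same gaps are noticeably smaller, so the base of the comparison should be stated whenever such percentages are reported.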
4.2 Regression analyses
4.2.1 Introduction OLS and quantile regressions
Given the observed skewness in the data, both ordinary least squares (OLS) and quantile
regressions are conducted. The regressions present the taxable income of entrepreneurs and wage workers. Differences in productivity across individuals are captured by indicators for year,
for whether the individual has gone on early retirement. For wage workers, these variables explain 22.32% (R²) of the variation in income; for self-employed workers they explain 38.77% (R²).
Age and work experiences are also included as square variables. This means that the quadratic relationship of age and work experience on taxable income is tested.
A regression formula with a square variable included looks like
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_2 + \varepsilon \qquad \text{(Eq. 2.1)}$$
The reason to include a squared variable is, for example, that wages increase with age as people become more experienced. However, as people grow older, wages start to increase at a decreasing rate, because they are no longer as healthy. At some point the maximum wage level is reached and, instead of growing, wage earnings start to fall. The relationship between wage and age is therefore inverted U-shaped, which is also called the life-cycle effect (Blanchflower & Oswald, 2008).
The filled in equation for this research is:
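$$\begin{aligned}
\text{Taxable income} = {} & \beta_0 + \beta_1 \text{Age} + \beta_2 \text{Age}^2 + \beta_3 \text{Work Experience} + \beta_4 \text{Work Experience}^2 \\
& + \beta_5 \text{Early Retirement} + \beta_6 \text{Household Head} + \beta_7 \text{Origin} + \beta_8 \text{Education} \\
& + \beta_9 \text{Civil Status} + \beta_{10} \text{Year} + \varepsilon
\end{aligned} \qquad \text{(Eq. 2.2)}$$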
Adding the squared term makes it possible to model the effect of age more accurately, since age may have a non-linear relationship with taxable income. For instance, the effect of age could be positive up to, say, the age of 45 and negative thereafter.
Adding age squared alongside age allows the effect to differ across ages, rather than assuming that it is linear for all ages (Blanchflower & Oswald, 2008).
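With a linear and a squared term, the point at which the effect changes sign follows from setting the derivative β1 + 2·β2·x to zero, which gives x* = −β1 / (2·β2). The sketch below applies this to the OLS coefficients on work experience for wage workers reported in Table 4 (4746 and -74). It is an illustration that holds all other regressors fixed; the function names are hypothetical.

```python
def turning_point(beta_linear, beta_squared):
    """Value of x at which beta_linear * x + beta_squared * x**2 peaks or bottoms out."""
    if beta_squared == 0:
        raise ValueError("no turning point without a squared term")
    return -beta_linear / (2.0 * beta_squared)


def is_inverted_u(beta_linear, beta_squared):
    """An inverted U needs a positive linear and a negative squared coefficient."""
    return beta_linear > 0 and beta_squared < 0


# OLS coefficients on work experience and its square for wage workers (Table 4).
peak = turning_point(4746, -74)
print(round(peak, 2), is_inverted_u(4746, -74))
```

With these two coefficients the estimated profile rises until roughly 32 years of potential experience and declines afterwards, the inverted U shape described in the text.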
4.2.2 Results OLS and Quantile regressions
4.2.2.1 Tables with the coefficients
Tables 4 and 5 present the coefficients of the ordinary least squares regression and the quantile regressions for wage workers and self-employed workers respectively. The number in brackets is the clustered standard error. Clustering means that observations within group i are correlated in some unknown way, inducing correlation in e_it within i, but that groups i and j do not have correlated errors (Nichols & Schaffer, 2007).
Table 4
Regression for Wage workers
Dependent Variable: Taxable income (N = 1707; R² = 0.2232)

Variable  OLS  .25  .50  .75
Constant  11843 (31302)  -44932* (10263)  36176* (12115)  254305 (145957)
Age  -3791 (3057)  4368* (1048)  -548 (1016)  -20116 (12433)
Agesq  52 (30)  -47* (11)  3 (10)  158 (111)
Work_experience  4746* (2277)  -1613 (862)  677 (714)  16295 (8813)
Work_experienceSQ  -74* (27)  21 (11)  -5.5 (9)  -145 (91)
Early_retirement  1662 (3469)  2956 (1685)  -336 (623)  -129 (6958)
Household_head  7218* (2557)  1566 (969)  1731* (701)  -4173 (7571)
Origin
First generation foreigner with Western Background  15365 (16962)  5134 (4085)  22 (1215)  33912 (33698)
Second generation with Western Background  9600 (7667)  38 (1775)  1681 (856)  17976 (16127)
Education
Junior High School  6939* (3087)  1218 (1691)  2115 (1255)
Senior High School  16318* (4229)  -1204 (1937)  1903 (1394)  2165 (8815)
Junior College  7976* (3005)  343 (1673)  2446* (975)  -2707 (5473)
College  19090* (3726)  2053 (1926)  4308* (852)  10719 (6219)
University  35445* (5579)  656 (2754)  4808* (860)  32868* (7978)
Other  11972* (6782)  -1785 (2764)  23 (2244)  19579* (7781)
Not yet completed  Omitted  Omitted  Omitted  Omitted
Not yet started
Civil status
Separated  -11991 (7207)  -2863 (4532)  -4003* (714)  -9786 (8871)
Divorced  -2317 (4748)  -2303 (1297)  -478 (773)  13770 (14896)
Widow or Widower  -2556 (6597)  5733* (1647)  -2516* (948)  -13057* (6091)
Never been married  -5398* (2180)  -316 (1129)  -349 (645)  -4248 (4622)
Year
2009  2166 (1397)  774 (616)  83 (358)  167 (4181)
2010  5592 (3039)  3092* (1430)  695 (570)  578 (7660)
2011  5440 (3612)  4528* (1350)  892 (646)  725 (9063)
2012  6226 (3269)  2942 (1655)  1288 (725)  -1844 (8540)
2013  5424 (3740)  5353* (1666)  2336* (676)  -4203 (9326)
Observations  1707  550  565  592
R-squared  0.2232  0.3814  0.1071  0.1009

Table 5
Regressions for self-employed workers
Dependent Variable: Taxable income (N = 142; R² = 0.3877)

Variable  OLS  .25  .50  .75
Constant  -198917 (102227)  130340 (33090)  130340 (71321)  -497139 (761372)
Age  27260* (10811)  1848 (3390)  -7750 (6912)  61706 (60644)
Agesq  -248 (215)  5.9 (74)  65.6 (5541)  -688 (693)
Work_experience  -19725* (5956)  -1735 (1468)  5669 (56)  -49161 (39193)
Work_experienceSQ  204 (192)  -8.4 (65)  -63 (5864)  672 (606)
Early_retirement  17692 (12006)  9976* (3724)  3602 (3501)  25608 (34228)
Household_head  18839* (8368)  -974 (2618)  1703 (7610)  29041 (39633)
Origin
First generation foreigner with Western Background  -35249* (11894)  4739 (3744)
Second generation with Western Background  -2177 (11900)  -1398 (4650)  -2 (7611)
Education
Junior High School  -70389 (50825)  10030 (16590)
Senior High School  -45883 (36565)  6291 (12631)  -5302 (5306)  65157 (53138)
Junior College  -68411* (24602)  2159 (7900)  -6687 (3522)  27828 (60748)
College  -45884* (15527)  -3614 (3934)  -2368 (3644)  53899 (57764)
University  -34740 (20274)  -4753 (5025)  -7553 (5058)  94017 (87010)
Other  -65929* (25468)  Omitted  Omitted
Not yet completed
Not yet started  Omitted  Omitted
Civil status
Separated
Divorced  -14452 (10134)  4228 (2275)  -4070 (4347)  -26648 (24008)
Widow or Widower  11094 (9629)  1905 (2562)  -2749 (5058)  14563 (31057)
Never been married  -37735* (12610)  2480 (2174)  -5051 (4298)  -42732 (23646)
Year
2009  13668 (8784)  2183 (1861)  802 (3297)  56726 (31221)
2010  4340 (8497)  3176 (2179)  2827 (6223)  30583 (23811)
2011  15122 (11003)  -2404 (2601)  -1120 (4024)  58023 (29190)
2012  9607 (10751)  -2046 (2510)  -1291 (4441)  44716 (27611)
2013  -9268 (11665)  -1484 (2995)  6910 (6794)  38732 (31839)
Observations  142  49  50  43
R-squared  0.3877  0.4870  0.3848  0.5891

4.2.2.2 Constant
The OLS constant is 11843 for wage workers and -198917 for self-employed workers. The constant is the expected taxable income when all independent (predictor) variables are set to zero. The negative value for self-employed workers is not a concern: some independent variables are not meaningful at values near zero, and taxable income itself may not be able to reach such low values at all. What matters most in the linear regression model is the overall relationship between the variables, not the value of the constant. 15 Because the independent variables will never all be zero at the same time (a worker's age, for example, is never zero), the intercept itself is of no interest. It says nothing about the relation between X and the amount of taxable income. 16
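The point that the constant carries no substantive meaning can be illustrated with a small sketch. It uses synthetic, made-up data (not the sample of this thesis) and a hypothetical helper `fit_ols`. It shows that shifting a predictor so that zero corresponds to its mean changes the intercept, while the slope, which describes the relation between X and income, stays the same.

```python
import numpy as np

def fit_ols(x, y):
    """Hypothetical helper: return [intercept, slope] from ordinary least squares."""
    design = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Synthetic data for illustration only (assumed values, not the thesis sample).
rng = np.random.default_rng(0)
age = rng.uniform(25, 65, size=200)
income = 20000 + 500 * age + rng.normal(0, 3000, size=200)

# Raw predictor: the intercept is the predicted income at age 0,
# a value that lies far outside the observed data.
raw = fit_ols(age, income)

# Centered predictor: the intercept is the predicted income at the mean age.
centered = fit_ols(age - age.mean(), income)

print("raw:      intercept", raw[0], "slope", raw[1])
print("centered: intercept", centered[0], "slope", centered[1])
```

In this sketch the two fits describe the same line. Only the reference point of the intercept moves, which is why the sign or size of the constant on its own is not informative.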
4.2.2.3 Explanatory variables
The regression coefficients represent the mean change in the response variable for a one-unit change in the predictor (explanatory) variable, holding the other predictors in the model constant.
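For a variable that enters the model together with its square, the effect of one extra unit depends on the level of the variable. The sketch below works this out for the OLS coefficients of Work_experience (4746) and Work_experienceSQ (-74) in the wage worker regression. It assumes that Work_experienceSQ is the square of Work_experience measured in years, which this section does not state explicitly. The function name `marginal_effect` is hypothetical.

```python
def marginal_effect(linear_coef, squared_coef, x):
    """Derivative of linear_coef*x + squared_coef*x**2 with respect to x."""
    return linear_coef + 2.0 * squared_coef * x

# OLS point estimates for wage workers, taken from the regression table.
WORK_EXP = 4746.0
WORK_EXP_SQ = -74.0  # assumed to multiply Work_experience squared

for years in (10, 20, 30):
    effect = marginal_effect(WORK_EXP, WORK_EXP_SQ, years)
    print(f"{years} years of experience: {effect:.0f} per extra year")

# Level of experience at which the estimated effect becomes zero.
turning_point = -WORK_EXP / (2.0 * WORK_EXP_SQ)
print(f"turning point: {turning_point:.1f} years")
```

Under these assumptions the estimated gain in taxable income from one extra year of work experience falls from 3266 at 10 years to 306 at 30 years and turns negative after roughly 32 years. These are point estimates only and do not take the standard errors into account.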
15 http://www.researchgate.net/post/A_negative_coefficient_for_a_constant_in_a_linear_regression 16