
Academic year: 2021


(1)

Master’s Thesis

Analysing the Drivers of Income Inequality in the United States: Skill, Innovation and Urbanisation in the 21st Century

Philip Koch

S3513866

p.k.a.koch@student.rug.nl

University of Groningen

Faculty of Economics and Business

MSc International Economics and Business

Supervisor: Prof. Dr. B. van Ark

Co-Assessor: Prof. Dr. R. Inklaar

(2)

Abstract

Contrary to the Kuznets hypothesis of ultimately decreasing inequality in a developed nation, the past decades have shown an upward trend in income inequality. This thesis identifies educational attainment, innovation, urbanisation and economic expansion as factors contributing to this increase over the recent time period from 2001 to 2015. The results indicate that the level of tertiary educational attainment, the number of patent applications, population density and GDP growth have a positive and significant effect on income inequality, rejecting the Kuznets hypothesis for recent periods and adding to the literature on skill-biased technological change as well as on agglomeration.

(3)

Table of Contents

1. Introduction ...5

2. Literature Review ...8

2.1 Income Inequality and its Relevance ...8

2.2 Determinants of within-country income inequality ...9

2.2.1 GDP Growth ... 10

2.2.2 Population Density ... 12

2.2.3 Educational Attainment ... 13

2.2.4 Innovation ... 15

2.3 Gini Coefficient as Inequality Measure ... 17

2.4 Conceptual Model and Hypotheses ... 18

3. Data and Methodology ... 19

3.1 Time Frame and Region ... 20

3.2 Dependent Variable: Inequality ... 20

3.3 Explanatory Variables ... 21

3.3.1 Educational Attainment ... 21

3.3.2 Urbanisation... 22

3.3.3 Innovation ... 23

3.3.4 GDP Growth ... 23

3.4 Control Variables: GDP per capita ... 24

3.5 Descriptive Statistics ... 24

3.6 Empirical Specification ... 25

4. Results ... 27

4.1 Model 1 ... 27

4.2 Model 2 ... 29

4.3 Robustness ... 30

5. Conclusion and Limitations ... 36

5.1 Limitations ... 36

5.2 Concluding Remarks ... 37

(4)

List of Figures

Figure 1: Evolution of the Gini Coefficient in the USA, 1917-2015 ...9

Figure 2: Evolution of the Gini Coefficient in the USA, 2001-2015 ... 11

Figure 3: Histogram of Model 1 Residuals ... 31

Figure 4: Kernel Density Plot Model 1 ... 31

List of Tables

Table 1: Descriptive Statistics ... 25

Table 2: Estimation Results for Models 1 and 2 ... 27

Table 3: Estimation Results for Models 1A and 2A ... 32

Table 4: Estimation Results for Models 1 and 2 using the Top Decile Income Share as Dependent Variable ... 33

(5)

1. Introduction

The determinants of (income) inequality have been the subject of an ongoing debate among scholars and policy makers in economics. While Thomas Piketty (2014) has gained much attention in recent years with his work on wealth inequality and the “top 1 percent”, inequality in income from labour and employment is also an intensely debated topic: it is highly relevant not only for academic research but is also said to affect well-being and redistributive preferences in society, although this varies substantially between countries (Chapple, Förster and Martin, 2009). Therefore, inequality is an issue to be addressed by policy makers.

(6)

One possible cause of the upward trend in US income inequality is urbanisation. Glaeser, Resseger and Tobio (2009) point to the large differences in human capital endowments found in urban systems compared to more rural regions. Baum-Snow and Pavan (2013) find that increases in income inequality are related to the large variance of wages in cities, which has driven up overall inequality. Another possible cause is skill-biased technological change, where technological change increases the relative demand for skilled workers, leading to higher wages in these sectors of the economy while the wages of other skill groups remain constant (Milanovic, 2016). Skill-biased technological change is found to increase income inequality in various studies and is often used as an explanation for the upward trend in income inequality in recent years (Acemoglu, 1998, 2002; Goldin and Katz, 2007; Autor, 2014). A further possible explanation for the recent evolution of inequality is innovation. The presence of a new technological or information revolution as described by Piketty and Saez (1998) may help to explain this trend, and Aghion (2002) argues that such technological phenomena may increase the skill premium in the labour market and ultimately affect income inequality. In addition, the hypothesised relationship between development and income inequality leads to the suspicion that economic growth itself could determine levels of income inequality in society, be it negatively or positively (Kuznets, 1955; Milanovic, 2016).

(7)

reasons for these developments meaningful and interesting. Since the beginning of the decade, the United States as a whole, as well as its individual federal states, has shown an approximately linear increase in inequality, which also suits an investigation using a linear regression model. Furthermore, the USA is considered one of the leading innovative nations when it comes to technology, services and communication technologies (Schwab and Sala-i-Martín, 2017). Key explanatory factors for increased inequality can be measured well in the United States, and the existence of regional data strongly increases the number of observations and thereby the robustness of the relationships outlined previously.

Taking the Kuznets hypothesis of decreasing income inequality in developed nations as a point of departure, and acknowledging that this prediction has not held up to the test of reality for the United States, this thesis investigates the possible causes of increasing income inequality along the following research question: To what extent can economic growth, urbanisation, educational attainment and innovation explain the increases in income inequality from 2001 to 2015 in the 51 US federal states (the 50 states plus Washington D.C.)?

(8)

2. Literature Review

The following section outlines the main theories behind the hypothesised relationship of GDP growth, urbanisation, educational attainment, innovation and income inequality. First, an overview of the term income inequality and its relevance in the literature will be given. Afterwards, the determinants of within-country inequality will be discussed individually. Then, an overview of the Gini coefficient as measure of inequality will be provided. The hypotheses will be outlined in the final section of the literature review.

2.1 Income Inequality and its Relevance

(9)

for the United States. Kuznets had predicted that during the development of a country, inequality would rise initially due to industrialisation and a corresponding gap between agricultural and industrial wages, but would then fall with increasing development due to structural change towards high-productivity sectors. It is now recognised that, especially for the USA, the opposite of Kuznets's predictions occurred. For top-income shares in particular, a distinct U-shaped pattern becomes visible (Piketty & Saez, 2003). The authors also suggest that the fall in inequality after World War II was not only due to technical and structural change, but also due to changing social norms concerning labour, which expressed changing societal preferences. As reasons for the increase in income inequality since the 1970s, Piketty and Saez (2003) point to another industrial or technical revolution occurring, with patterns in inequality similar to those of previous industrial revolutions. Further, social institutions and norms could have changed again, with society in the United States accepting the existence of a few top earners.

Figure 1: Evolution of the Gini Coefficient in the USA, 1917-2015

2.2 Determinants of within-country income inequality

So far, the literature outlined has focused on income inequality on the national level. Although these national accounts are very relevant in the comparison of countries, it is possibly even more relevant for national policy makers to be informed about differences in income inequalities within regions of a country (Panizza, 2002). First, the assessment of income

(10)

distribution on a sub-national, regional level allows for the identification of “problematic” areas with respect to an uneven distribution, and may lead policy makers to adopt stimuli targeted to a specific region.

The regional measurement of income inequality is also a useful tool to complement the analysis of inter-regional mean income disparities by adding a picture of the regional concentration of inequalities (Piacentini, 2014). Since individuals likely assess their own life satisfaction with reference to a geographically limited region and act on this intuition, regional indicators can lead to a better understanding of individuals’ behaviour (Royuela, Veneri, & Ramos, 2014). In addition to regional data on income inequality, the collection of data on other indicators has also advanced in recent years, although data on, for example, educational attainment are already more widely available than inequality data. The OECD has made efforts in the past years to collect harmonised data on income inequality and other regional statistics, coining the term “regional well-being”, in order to allow for a more fine-grained analysis of the societal challenges ahead. In addition, Frank (2009) shows how the use of regional inequality measures can produce more reliable results compared to the use of national data, since even regions within a country can be very heterogeneous.

2.2.1 GDP Growth

Which factors affect the income distribution within the regions of a country? One factor seen as decisive for national income inequality is the level of development, as measured by GDP, or more precisely by GDP growth. Although GDP differences on a regional level within a country are less pronounced than differences between countries, regions do have strongly different domestic products and also differing growth rates according to their economic success or adaptation to new technological circumstances. As stated earlier, Kuznets hypothesized that at a high level of development, income inequality will become lower than at the medium stages. Therefore, an increase in GDP measured by GDP growth should reduce inequality in a developed nation such as the United States (Kuznets, 1955).

(11)

Barro (2000) finds the existence of the inverted-U shape, although stating that the mere empirical regularity found is not fit to explain the variation in inequality across countries over time. In testing simultaneously for the relationship between GDP growth and inequality, Lundberg and Squire (2003) find that there is a significant positive relationship between growth and inequality, that is, increases in growth lead to increased inequality, although the quantitative impact of a relatively large increase in growth is quite small. Further, the authors find that the somewhat simultaneous nature of growth and inequality has important consequences for policy makers, as a focus on merely one of the two measures may lead to adverse effects on the other. Although the Kuznets hypothesis is not an accurate characterisation of recent decades, it is still an interesting tool for analysis, since further economic development in the future may well see a decrease in inequality, thus leading to a wave-like pattern.

For the time period at hand, the evolution of income inequality in the United States follows an approximately linear pattern shown in Figure 2, such that a long-term analysis of the Kuznets hypothesis is not possible. Nonetheless, the relationship is contrary to the predictions for developed nations, and various factors influence this upward-sloping relationship. Therefore, the inclusion of GDP growth as a predictor of income inequality enables a test of the Kuznets relationship for recent time periods. One contribution of this thesis is the reversal of the most frequently tested relationship of inequality affecting growth. In the present analysis, current and lagged values of GDP growth will be used to predict the development of income inequality.

Figure 2: Evolution of the Gini Coefficient in the USA, 2001-2015

(12)

2.2.2 Population Density

Another prime reason for disparities is seen in the urban-rural divide, which is often seen as most drastic in developing nations but is also relevant for OECD countries. The starting point for the disparity between rural and urban areas is the location decision of the firm. A large agglomeration literature finds that although input costs are higher in urban areas, firms tend to locate their business where they can benefit from agglomeration economies such as increasing returns to scale and knowledge-sharing and thus increase their productivity (Baum-Snow & Pavan, 2013; Roca and Puga, 2017). Productive workers then self-select into the larger cities, because their high productivity guarantees employment in a highly productive firm and is associated with a high income. Therefore, regions that contain both a large urban area and very rural areas are expected to exhibit higher regional inequality than regions consisting of only one settlement pattern, due to the large productivity differences between agriculture and employment in the services and industrial sectors, which are often located in urban agglomerations. However, it also has to be noted that there is not only an urban-rural divide in wages leading to income inequality, but that there also exists a within-urban component to inequality, stemming from large wage differentials in cities. This could be due to large differences in human capital endowments between the inhabitants (Glaeser, Resseger, & Tobio, 2009). One motivation for less educated individuals to move to a city can be seen in the ample supply of jobs not requiring high skills (for example in the service sector). On the other hand, cities are also typically the places where highly skilled individuals move in order to gain high returns on their human capital by working productively in high-paying employment (Glaeser et al., 2009). Therefore, it is expected that income differentials in cities are quite large.

Baum-Snow and Pavan (2013) have shown that the variance of US log wages would have been 34% smaller between 1979 and 1999, and 23% smaller between 1979 and 2007, had inequality in urban areas grown at the same rate as inequality in more rural areas. More precisely, the authors demonstrate that within-group inequality has been the driving force behind the urban-specific developments. Although this could point to a higher ability dispersion within the groups, the more likely explanation is that income inequality increases due to returns on unobserved skill in urban areas. This leads to the expectation that for regions with a higher proportion of individuals living in urban areas and for regions encompassing large cities, income inequality will be more pronounced than for more equally distributed regions.

(13)

kept rising. Furthermore, in combination with the other explanatory variables, it will be informative to find out whether urbanisation will have an effect on inequality when accounting for educational attainment and innovation, which are both found to be significantly larger in urban versus rural areas in the agglomeration literature (Glaeser & Maré, 2001).

2.2.3 Educational Attainment

Another factor helping to explain regional income inequalities can be seen in the degree of educational attainment. Historically, educational attainment has been seen as a source of decreasing inequality. According to the theory, a higher share of educated persons will increase competition for positions requiring educational credentials and will thereby drive wages down (Tinbergen, 1975). However, the proposed negative relationship between education and inequality seems to hold more for a country on the full development path, that is going through different stages of development (Nielsen & Alderson, 1997). For industrialised nations, Nielsen and Alderson (1997) also make the argument that education affects income inequality no longer through the average education level, but rather through the dispersion of education in society, that is through growing skill differences.

(14)

quite closely matches the recent evolution of inequality in the United States, with the rising college premium after the late 1970s possibly connected to rising inequality since the 1980s.

A further phenomenon connecting inequality and education is job polarisation, where highly skilled individuals can augment their productivity by using technology, while the uneducated have to compete with technology that may well replace them at some point (Autor, Dorn, & Hanson, 2016). During the past decades, the demand for skills has shifted away from manual labour towards more cognitive skills, specific knowledge of technology and, more generally, the ability to solve more complex problems. According to Autor (2014), a distinction has to be made between manual, routine and abstract skills. While manual tasks (e.g. jobs in the health sector) cannot yet be replaced by offshoring or technology, and abstract tasks can often still be carried out most efficiently by humans, it is the routine tasks such as factory work that have come under pressure of replacement. Further, Acemoglu (1998) points to the phenomenon that technological innovations in the 18th and 19th centuries led to a replacement of labour (for example the steam engine and the spinning jenny), while current innovations are rather skill-complementary, that is, these innovations do not replace human labour but rather allow workers to increase their productivity. Manual tasks, however, show a low complementarity to technology and are thus showing low wage increases. At the same time, the supply of skilled labour also expanded in the developed countries as the education systems produced more college graduates and individuals in tertiary education. However, demand seemingly still outpaces the supply of skilled labour, as the economies of many developed countries further transition from an industrial to a service-driven economy (Autor, 2014). From the preceding reasoning, one may expect income inequality to be more pronounced in regions with a higher share of individuals in tertiary education.

(15)

The inclusion of educational attainment as an explanatory factor for income inequality has been studied quite extensively. However, this thesis will contribute to the existing literature by scrutinising an interesting time period in the development of the United States: while the share of persons with tertiary education is almost universally increasing, so is the demand for skilled labour. Therefore, the analysis will contribute to the research on the skills premium and its effect on income inequality in the United States since the beginning of the 21st Century, a period characterised by fast increases in technological developments.

2.2.4 Innovation

In addition to modelling the “skill” portion of skill-biased technological change, the factor of technological change, or innovation, also has to be considered in its effect on inequality. This can be measured by expenditure on research & development, as is often done when assessing innovative potential. However, R&D expenditure is a measure of research inputs, that is the budget allocated by firms in order to develop new products (Acs, Anselin, & Varga, 2002). Patents on the other hand represent the outcome of R&D efforts and are therefore an interesting measure of innovation (although the use of patents as a measure of innovation suffers from a different set of limitations, for example patent hoarding to extract royalty payments).

(16)

break-through to create a new product and diffuse it in mass markets. The effect of the GPT would be seen in an increased skill premium due to the increased demand for skilled labour, which would, in turn, increase income inequality. A further characteristic of GPT is that much of the skill-intensive innovation surrounding the GPT takes place in a relatively short time frame, so that even with an increase in the supply of skilled labour, the skills premium would rise. Within the time period from 2001 to 2015, important innovations have taken place especially in the United States, an example being the diffusion of smart phones into mass markets over the past 10 years. Therefore, an increase in innovation could be related to an increase in income inequality.

For the USA, a positive correlation is found between a state’s level of innovation, measured by the flow of patents or citations, and top-income inequality in the respective state (Aghion, Akcigit, Bergeaud, Blundell, & Hémous, 2015). However, less positive and even negative correlations are found between the measures of innovation and income inequality that is not measuring only top-incomes. The authors hypothesise that innovation affects inequality through an increase in the entrepreneurial share of income, since patents lead to the possibility of entering the business sector with a new product rather than being simply employed. Frank (2009) finds a positive relationship between top-income shares and growth in the US states, so a point can be made here that growth is fuelled by innovation and that there is thus a relationship between innovation and income inequality. As the data set at hand uses income inequality measured by the Gini coefficient over all income shares it is informative to test for the significance of innovation for society at large.

(17)

Although some theoretical channels through which innovation may affect income inequality have been identified, the state of research on this topic is still quite thin. Aghion et al. (2015) indeed find a positive relationship between innovation and income inequality if measured by top-income shares. This thesis will add to this literature by testing the hypothesised relationship for a recent time period and by analysing the effect of innovation on income inequality measured by the Gini Coefficient, that is the effect of innovation on inequality in society as a whole will be more closely researched.

2.3 Gini Coefficient as Inequality Measure

(18)

2.4 Conceptual Model and Hypotheses

In the previous sections, the existing literature has been reviewed. From the relationships found in previous research, the following set of testable hypotheses has been developed, relating GDP growth, urbanisation, educational attainment and innovation to the increasing levels of income inequality in the United States over the past decades.

H1: There exists a negative relationship between inequality and the growth of GDP by US state, even when controlling for the level of GDP per capita.

H2: A higher degree of urbanisation by US state, as measured by population density, will lead to a higher level of income inequality.

H3: US states with a larger share of highly educated persons in the labour force will exhibit higher inequality than states with more uniform skills attainment.

(19)

3. Data and Methodology

In this section, the data and methodology of the analysis will be explained. First, an overview of the dataset and its sources will be provided. The time frame and region at hand will then be discussed. In the following, the dependent variable, the explanatory variables and the control variable will be described including the descriptive statistics. In the methodology section, the empirical specification of the model will be provided.

The data used for this research stem from two data sources. The first is the so-called Frank-Sommeiller-Price series, a research effort in which the authors collected data on income inequality in the United States from 1917 to 2009, obtained from Internal Revenue Service (IRS) data (Frank, 2009). The dataset is updated by the author, and currently extends to 2015 with annual measurements of the variables. The variables in the dataset are the top 10% to top 0.01% income shares, the Atkinson Index, the Theil Index and the Gini Index, which ranges from 0 to 1 and serves as the dependent variable in this analysis. Over the course of the century, adjustments have been made to the way the Gini Index is calculated, and changes in the IRS recordings have been smoothed out. For the 2001-2015 period, the data collection is consistent.

(20)

The variables extracted from the datasets above have been merged into one dataset with 765 observations (15 years × 51 states), yielding a strongly balanced panel, meaning that the number of observations does not vary between regions or over time.
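The panel dimensions described above can be checked with a short sketch. The state codes below are hypothetical placeholders, and real rows would also carry the variables; the example only illustrates what "strongly balanced" means for 51 units observed over 15 years.

```python
from itertools import product

# Hypothetical identifiers: 50 states plus Washington D.C. (placeholder codes)
states = [f"S{i:02d}" for i in range(1, 52)]
years = range(2001, 2016)  # 2001 to 2015 inclusive, i.e. 15 years

# One row per state-year pair
panel = [{"state": s, "year": y} for s, y in product(states, years)]


def is_strongly_balanced(rows):
    """True if every state is observed exactly once in the same set of years."""
    years_by_state = {}
    for row in rows:
        years_by_state.setdefault(row["state"], set()).add(row["year"])
    year_sets = list(years_by_state.values())
    no_duplicates = len(rows) == sum(len(ys) for ys in year_sets)
    same_years = all(ys == year_sets[0] for ys in year_sets)
    return no_duplicates and same_years


print(len(panel))                   # 15 * 51 = 765
print(is_strongly_balanced(panel))  # True
```

Dropping a single state-year row makes the check fail, which is the situation a strongly balanced panel rules out.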

3.1 Time Frame and Region

The time frame for the measurement of inequality from the Frank-Sommeiller-Price Series begins in 1917 and runs until 2015. However, the OECD regional statistics dataset limits the availability of much of the necessary data to the late 1990s and early 2000s. For this reason, the time frame has been specified to 2001 to 2015. The resulting time frame is sufficiently long for a panel data analysis and is recent in a way that allows for conclusions of relevance for policy makers and scholars. As outlined above, income inequality in the United States of America has been rising between 2001 and 2015.

The region chosen for analysis consists of the 50 states of the United States of America (plus the federal district Washington D.C., which will be treated as a “normal” state for the purpose of this analysis). The federal states correspond to the territorial level of TL2, defined as the first administrative tier of sub-national government, and therefore correspond to the data supplied in the OECD regional statistics (OECD, 2018b).

3.2 Dependent Variable: Inequality

(21)

population, as their income lies below a certain threshold level depending on age and marital status. Nonetheless, the IRS dataset is less likely to suffer from over- or under-reporting bias, to which household surveys may be more prone, the reason being the penalisation of false statements to the tax authorities.

In their seminal paper, Deininger and Squire have also laid out some requirements regarding the quality of the underlying data used for the calculation of the Gini coefficient, namely the individual as unit of observation, comprehensive coverage of the population and the comprehensive measurement of income. The Frank-Sommeiller-Price series, based on IRS tax data, fulfils these three requirements, since tax data is available for the individual, the population is covered in much more breadth than with household surveys, and the tax data measures a relatively broad subset of different forms of income.
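For orientation, a Gini coefficient can be computed from individual incomes as in the sketch below. It uses the standard rank-based formula on made-up incomes and is only an illustration of the measure; it is not a reproduction of how the Frank-Sommeiller-Price series derives its values from IRS data.

```python
def gini(incomes):
    """Gini coefficient of a list of non-negative incomes (0 = perfect equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one income and a positive total")
    # Rank-weighted sum: the i-th smallest income is weighted by its rank i
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Illustrative, made-up income vectors
equal = [1, 1, 1, 1]       # everyone earns the same
extreme = [0, 0, 0, 10]    # one person earns everything

print(gini(equal))    # 0.0
print(gini(extreme))  # 0.75, i.e. (n - 1) / n for n = 4
```

With a finite number of individuals, the maximum of this formula is (n - 1) / n, which approaches 1 as the population grows.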

3.3 Explanatory Variables

In the following section, the four explanatory variables educational attainment, urbanisation, innovation and GDP growth will be outlined concerning their measurement. Possible reverse causality issues will also be addressed in this section.

3.3.1 Educational Attainment

Another factor identified as possibly influencing income inequality is the educational attainment of the labour force. As outlined previously, it is expected that a higher share of persons with a tertiary education would be positively associated with income inequality. The variable used in the empirical analysis is therefore the share of the labour force with a tertiary education as the highest level of educational attainment, measured in percent. The proposed channel here is that those with tertiary education will likely take on abstract and non-routine occupations, rather than manual type labour, and will thus earn higher wages in their occupations, which could in turn influence income inequality.

(22)

for the use of the lag, but it is also sensible conceptually. In a theoretical sense, education may affect inequality through the channel of competition for high-wage employment and through the phenomenon of job polarisation, and this is the hypothesised relationship in the analysis. The lag of only one year is not the preferred time period, but due to data availability it remains the only possibility. Ideally, the lagged time period would be slightly larger, because it is likely that the effect of an increase of persons with tertiary education will take some more time than just one year, for example due to transitional periods until employment is taken up.

3.3.2 Urbanisation

The variable population density was chosen in order to model the hypothesis of higher inequality in urban areas. It is defined as the number of persons living in one square kilometre (OECD, 2018b). The goal of the measure is to capture the variation in urbanisation and settlement patterns between the states of the United States (Long, Rain, & Ratcliffe, 2001). Although population density is only an approximate measure of urbanisation, it does capture the intended concept: in 2015, Alaska had a population density of 0.5 inhabitants per square kilometre and New Jersey one of 465 inhabitants per square kilometre, which describes the settlement patterns in these states relatively accurately. However, the measure is imperfect because it does not fully model the size distribution of cities in the United States, where the highest values should correspond to the urban areas with the highest population (density). The inclusion of the population density measure is nonetheless warranted by its availability over longer time periods. Further, the measure depends only on the unchanging territorial area of US states and the changing population, and not on artificial definitions of, for example, metropolitan areas.

(23)

chosen due to a decreasing correlation of population density and income inequality with increasing lag periods. The lag of one year therefore addresses the issue of reverse causality without decreasing the predictive power of the model too strongly.

3.3.3 Innovation

As a proxy for innovation and research and development, the number of patent applications has been chosen as the relevant variable. The values express the number of patent applications by the inventor’s place of residence and the year of the application, and thereby provide a relatively direct measure of the strength of innovation in the respective state and time frame.

For the explanatory variable patent, again, a lag is introduced. The hypothesis at hand is interested in the relationship of innovation affecting income inequality, so the explanatory variable patent will be lagged by three years. The reason for this time frame can be seen in Aghion et al. (2015), where it is stated that there is a delay between the application of a patent and its grant, so that the wage and thus income inequality effect only materialises after a few years. Regressing the original variable patent and its lag on the Gini coefficient shows that the predictive power of the model with the lag of patent is higher.
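A lag of this kind has to be taken within each state, so that a value from one state is never carried over into another. The sketch below shows one way to do this; the function name `add_lag`, the field names and the patent counts are illustrative assumptions, not the thesis's actual code or data.

```python
def add_lag(rows, var, periods):
    """Return copies of rows with an extra `<var>_lag<periods>` field.

    The lagged value is looked up for the same state `periods` years
    earlier; years without such a predecessor receive None.
    """
    lookup = {(r["state"], r["year"]): r[var] for r in rows}
    name = f"{var}_lag{periods}"
    return [
        {**r, name: lookup.get((r["state"], r["year"] - periods))}
        for r in rows
    ]


# Made-up patent counts for two hypothetical states, 2001 to 2005
rows = [
    {"state": "A", "year": 2001 + k, "patent": 100 + 10 * k} for k in range(5)
] + [
    {"state": "B", "year": 2001 + k, "patent": 500 + 50 * k} for k in range(5)
]

lagged = add_lag(rows, "patent", 3)
print(lagged[3])  # state A in 2004 carries the 2001 value as patent_lag3
print(lagged[5])  # state B in 2001 has no predecessor, so patent_lag3 is None
```

The same helper with `periods=1` would produce the one-year lags used for the other explanatory variables.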

3.3.4 GDP Growth

The final explanatory variable at hand is GDP growth, measured as an index of real growth starting in 2001. The nominal growth rate of GDP has therefore been adjusted for inflation to measure the true expansion or contraction of the economy. As the dataset begins with values of 100 in 2001, the data have been transformed by subtracting 100 from these values, such that the first measurement in 2001 is 0.
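The transformation described above amounts to a simple shift of the index. The sketch below uses invented index values; after subtracting the base of 100, each value reads as cumulative real growth since 2001 in percentage points.

```python
def rebase_index(index_values, base=100.0):
    """Shift an index with the given base so that the base year equals 0."""
    return [round(v - base, 2) for v in index_values]


# Invented real GDP index for one hypothetical state (2001 = 100)
index_from_2001 = [100.0, 102.1, 104.5, 103.0]

growth = rebase_index(index_from_2001)
print(growth)  # [0.0, 2.1, 4.5, 3.0]
```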


3.4 Control Variables: GDP per capita

GDP per capita is introduced as the control variable in the estimation of the models. To make the values comparable over time, GDP per head in constant 2010 prices has been chosen as the measure. GDP per capita can be viewed as a measure of a country's level of development and is useful as a control variable because it controls both for population changes and for the initial level of development. As Milanovic (2016) points out, the initial level of development affects GDP growth: at lower levels of development, economic growth is likely to be faster than at higher levels, simply because the same absolute economic expansion corresponds to a higher percentage growth rate when the starting level is lower.

3.5 Descriptive Statistics


Table 1: Descriptive Statistics

Variable Obs. Mean St. Dev. Min. Max.

Gini 765 .599549 .0363184 .5218447 .7114252

Population Density 765 146.7352 523.909 .43 4216.21

Population Density, lag 765 145.4966 518.0637 .42 4144.69

Education 765 36.20562 5.983608 24.2 66.9

Education, lag 765 36.04366 5.95423 21.6 63.9

Patent 765 933.5955 1775.642 5.5833 15934.2

Patent, lag 765 855.3114 1570.359 4.3333 14422.5

GDP Growth 765 17.84641 16.63811 -13.2 142.2

GDP Growth, lag 765 8.357778 14.72343 -24.3 92.10001

GDP per Capita 765 49099.38 18424.29 28571 171570

GDP per Capita, lag 765 47405.71 17704.91 27967 171570

3.6 Empirical Specification


presented is the average of the individual intercept of each state. In addition, since it is plausible that the error terms in the model, $\varepsilon_{i,t}$, are correlated within a cluster, that is, within the respective US federal state, cluster-robust standard errors have been used at a later stage of the empirical analysis (Hill et al., 2011).
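To illustrate the idea behind fixed-effects estimation with cluster-robust standard errors, the following Python sketch implements the within estimator for a single regressor. It is a simplified illustration under stated assumptions, not the estimation routine used in this thesis: it handles only one explanatory variable, applies no small-sample correction, and the function name `within_estimator` and the toy data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def within_estimator(states, x, y):
    """Fixed-effects (within) slope for one regressor, with a standard
    error clustered by state (no small-sample correction)."""
    # Accumulate sums of x and y and the number of observations per state.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for s, xi, yi in zip(states, x, y):
        sums[s][0] += xi
        sums[s][1] += yi
        sums[s][2] += 1

    # Demean within each state; this removes the state-specific intercepts.
    x_dm = [xi - sums[s][0] / sums[s][2] for s, xi in zip(states, x)]
    y_dm = [yi - sums[s][1] / sums[s][2] for s, yi in zip(states, y)]

    sxx = sum(v * v for v in x_dm)
    beta = sum(a * b for a, b in zip(x_dm, y_dm)) / sxx

    # Sum the scores (x * residual) within each cluster before squaring,
    # which allows arbitrary error correlation inside a state.
    scores = defaultdict(float)
    for s, xd, yd in zip(states, x_dm, y_dm):
        scores[s] += xd * (yd - beta * xd)
    meat = sum(v * v for v in scores.values())
    se = sqrt(meat) / sxx
    return beta, se


# Hypothetical toy panel: two states observed over three years each.
states = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = [0.50, 0.52, 0.55, 0.70, 0.73, 0.74]
beta, se = within_estimator(states, x, y)
```

In the toy data, the large difference in levels between the two states does not affect the slope, because only the variation within each state is used.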

The first step in the regression analysis will be to control for the effects of GDPgrowth and the level of GDP per capita (GDPcap), as these are expected to affect income inequality. The explanatory variables popdens, educ and patent are then also included in the model to yield equation 1:

$$\mathit{GINI}_{i,t} = \beta_0 + \beta_1\,\mathit{popdens}_{i,t} + \beta_2\,\mathit{educ}_{i,t} + \beta_3\,\mathit{patent}_{i,t} + \beta_4\,\mathit{GDPcap}_{i,t} + \beta_5\,\mathit{GDPgrowth}_{i,t} + \varepsilon_{i,t} \qquad (1)$$

In addition to the first model, a second model will be introduced to include the lagged variables of all explanatory variables in order to mitigate the possibility of reverse causality bias. The specification will stay the same, but the variables will be lagged by different time periods as outlined above to yield equation 2, which shows the lag length in years:


4. Results

This section presents and discusses the results of Models 1 and 2. Afterwards, various robustness tests will be discussed for both models. Table 2 presents the results of the estimation for both models.

Table 2: Estimation Results for Models 1 and 2

VARIABLES (1) Model 1 (2) Model 2

Population Density 0.0000263 (0.0000259) 0.0000628** (0.0000313)

Education 0.0027565*** (0.0004024) 0.0028359*** (0.0004288)

Patent 0.00000872*** (0.00000220) 0.0000102*** (0.00000249)

GDP per Capita -0.00000207*** (0.000000429) -0.00000113*** (0.000000196)

GDP Growth 0.0009895*** (0.0001317) 0.0006398*** (0.0001061)

Constant 0.4728035*** (0.0157983) 0.5278405*** (0.019422)

Observations 765 765

R-squared 0.2394 0.2489

Number of federal states 51 51

Standard errors in parentheses. ***p<0.01, **p<0.05, *p<0.1

4.1 Model 1


the fixed-effects estimation.1 Therefore, the following estimations are all carried out using fixed-effects estimations.

Model 1 includes the dependent variable gini and the explanatory variables population density, education, patent and GDP growth as well as the control variable GDP per capita, with the estimation results being presented in Table 2. Turning to the first explanatory variable, population density is statistically insignificant by a wide margin. Therefore, urbanisation as measured by population density does not seem to have a significant effect on the evolution of income inequality. Since Hypothesis 2 had stated that urbanisation is positively associated with income inequality, this hypothesis has to be rejected here. This finding is quite surprising, as Glaeser et al. (2009) had shown the heterogeneity in skill endowments in cities, while Baum-Snow and Pavan (2013) had found that increases in log wage variance in cities had contributed strongly to the increase in income inequality over past decades in the United States. A possible reason for the insignificant relationship could lie in the absence of a lagged independent variable of population density. As outlined above, when using a non-lagged variable, it may become impossible to identify the direction of causality correctly, that is, population density may affect income inequality while income inequality may simultaneously affect population density. This issue will be addressed in Model 2.

The second explanatory variable, the share of the labour force with tertiary education, is significant at the 1% level with a small positive coefficient of 0.0027565. This signifies that a one percentage point increase in the share of persons with a college degree in the labour force will increase the Gini coefficient by 0.0027565. Although this may seem like an extremely marginal increase, it is important to keep in mind that the Gini coefficient is measured on a 0 to 1 scale, with the extreme values of complete (in)equality never being reached in practice (Milanovic, 2016). Between 2001 and 2015, the average Gini coefficient for income inequality in the United States has increased from 0.59 to 0.64, showing that the coefficient for educational attainment has a larger impact than one would think. As the coefficient is positive, it can be concluded that skill-biased technological change and the increase of the skills premium are a reality, at least for the United States (Autor, 2014). Therefore, Hypothesis 3 can be accepted.
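To put the size of this coefficient into perspective, the following Python sketch compares the Gini change implied by the linear model with the observed rise in the average Gini coefficient. The ten percentage point scenario is a purely hypothetical illustration, and the helper name `implied_gini_change` is not part of the original analysis.

```python
COEF_EDUC = 0.0027565               # Model 1 coefficient on education (Table 2)
GINI_2001, GINI_2015 = 0.59, 0.64   # average Gini values quoted in the text


def implied_gini_change(coefficient, change_in_regressor):
    """Linear-model prediction: change in Gini = coefficient * change in x."""
    return coefficient * change_in_regressor


# Hypothetical scenario: tertiary attainment rises by 10 percentage points.
delta = implied_gini_change(COEF_EDUC, 10.0)
share_of_observed_rise = delta / (GINI_2015 - GINI_2001)
```

Under this scenario the implied change is about 0.028, which corresponds to roughly half of the observed 0.05 rise in the average Gini coefficient, holding all other variables constant.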

The third explanatory variable refers to the effect of innovation, as measured by patent applications, on income inequality. The results are significant at the 1% significance level, with a positive coefficient of 0.00000872. As the explanatory variable is measured as the number of patent applications per year in the respective federal state, one additional patent application would increase the Gini coefficient by 8.72e-6. Again, the effect seems minuscule, but the range of values the Gini coefficient takes is simply very small. Further, the range of the number of patent applications in the federal states is extremely large, ranging from just over 30 in Alaska to over 14000 in California.
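Because the per-application effect is so small, it can help to scale the coefficient by the spread of the variable. The sketch below multiplies the Model 1 coefficient by the standard deviation of patent from Table 1; choosing one standard deviation as the yardstick is an illustrative assumption, not a step taken in the original analysis.

```python
COEF_PATENT = 0.00000872   # Model 1 coefficient on patent (Table 2)
SD_PATENT = 1775.642       # standard deviation of patent (Table 1)

# Implied Gini change for one additional application and for a
# one-standard-deviation increase in applications.
effect_one_application = COEF_PATENT * 1
effect_one_sd = COEF_PATENT * SD_PATENT
```

A one-standard-deviation increase in patent applications implies a Gini change of roughly 0.015, which is no longer negligible on the 0 to 1 scale.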

The final explanatory variable, GDP growth, is significant at the 1% level of significance with a coefficient of 0.0009895. The interpretation here is that a 1 percentage point increase in GDP growth would lead to an increase in the Gini coefficient by 0.0009895. From the graphical intuition of increasing income inequality with increasing development in the United States over the past decades, this would be the expected relationship between GDP growth and inequality, and therefore Hypothesis 1 can be accepted for Model 1. However, the results contradict the predictions of Kuznets (1955), rejecting the idea that increasing development decreases inequality in the long term, at least for the time period at hand.

The control variable, GDP per capita, is significant at the 1% significance level with a negative coefficient of -0.00000207. This is puzzling in a way, as it supports the hypothesis of Kuznets, that in a developed country (which the USA can be considered to be), inequality would decrease with further development. However, this contradicts what has been shown in the data for the United States, where income inequality has actually increased severely over the past decades.

4.2 Model 2


To test for the preferred model, the Akaike Information Criterion and the Bayesian Information Criterion have been utilised, with the result that Model 2 is preferred over Model 1.²
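The two criteria can be sketched in Python as follows. The snippet uses the standard definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L together with the values reported in footnote 2. The number of parameters, k = 6 (five slopes plus the constant), is an assumption made for this illustration, and the log-likelihood is backed out from the reported AIC rather than taken from the estimation output.

```python
from math import log


def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return k * log(n) - 2 * log_likelihood


N_OBS = 765       # observations reported in Table 2
K_PARAMS = 6      # assumption: five slopes plus the constant

AIC_MODEL_1 = -3770.724   # reported in footnote 2
AIC_MODEL_2 = -3780.252   # reported in footnote 2

# Log-likelihood of Model 1 implied by its AIC under the assumed k.
loglik_model_1 = (2 * K_PARAMS - AIC_MODEL_1) / 2
bic_model_1 = bic(loglik_model_1, K_PARAMS, N_OBS)

# For both criteria, the model with the lower value is preferred.
preferred = "Model 2" if AIC_MODEL_2 < AIC_MODEL_1 else "Model 1"
```

Under the assumed k, the implied BIC for Model 1 is approximately -3742.885, which agrees with the value reported in footnote 2 and suggests that the assumption is reasonable.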

The most notable change in the model can be seen in the variable population density, which turns significant at the 5% significance level, contrary to the previously found statistical insignificance. An increase of one person per square kilometre in year t-1 would increase the Gini coefficient in the respective state by 0.0000628. The inclusion of the lagged population density variable therefore leads to the acceptance of Hypothesis 2, that an increase in population density increases income inequality. Overall, the statistical tests lead to a preference for Model 2, and this can also be defended theoretically, as the explanatory variables used in the regression take some time to show visible effects on income inequality.

4.3 Robustness

Several model diagnostics have been conducted to assess and increase the validity of the models at hand. The first test is concerned with the normality of the regression errors in Model 1 and Model 2. For Model 1, the skewness amounts to 0.4996 while the kurtosis is 3.6807. The corresponding histogram is displayed in Figure 3 below and shows an approximately normal distribution. In addition, a normal distribution has been added as an overlay, again indicating the approximate normality of the residuals of Model 1, although the distribution is slightly skewed to the right. Finally, a quantile plot of the Model 1 residuals shows some indication of non-normally distributed residuals, especially in the upper quantiles, as shown in Figure 4.

A possible explanation for these indications lies in the presence of various extreme values in the data. For example, the number of patent applications in California is much higher than in any other state, likely due to the agglomeration of technology companies in the so-called Silicon Valley and an accordingly high research output going hand in hand with the desire to protect innovations from being used by competitors. As for GDP per capita, the District of Columbia shows roughly 3 times higher per capita GDP than most other states. Most of these “outliers”, however, cannot simply be excluded from the data: they are genuine measurements, and deleting them would artificially overfit the model. The case of the District of Columbia is however interesting. For example, population density takes highly extreme values, the reason being the much smaller area of the district compared to all other federal states, combined with a relatively high population in Washington D.C. Because the District of Columbia is defined differently from a federal state and therefore shows very unusual values for multiple variables in the model, Models 1 and 2 will be re-estimated excluding the observations for the District of Columbia, yielding Model 1A and Model 2A.

2 The test statistic for the Akaike Information Criterion was found to be -3770.724 for Model 1 and -3780.252 for Model 2. Regarding the Bayesian Information Criterion, the statistic was found to be -3742.885 and

Figure 3: Histogram of Model 1 Residuals


Table 3: Estimation Results for Models 1A and 2A

VARIABLES (1) Model 1A (2) Model 2A

Population Density 0.002125*** (0.0003042) 0.0017858*** (0.000298)

Education 0.002294*** (0.0004047) 0.002441*** (0.0004362)

Patent 0.00000680*** (0.00000215) 0.00000791*** (0.00000248)

GDP per Capita -0.000000813 (0.000000578) -0.000000688** (0.000000380)

GDP Growth 0.000526*** (0.0001727) 0.0003954*** (0.0001294)

Constant 0.3817126*** (0.0410246) 0.4014862*** (0.0328181)

Observations 750 750

R-squared 0.2844 0.2810

Number of federal states 50 50

Standard errors in parentheses. ***p<0.01, **p<0.05, *p<0.1


in predictive power. Again, this is not surprising, as the linear fit should improve when extreme values are omitted from the data set. Looking at the Akaike Information Criterion and the Bayesian Information Criterion, however, neither Model 1A nor Model 2A is preferred over the initial Model 2 with all observations present.³ In addition, for both adjusted models the skewness and kurtosis of the residual distribution have actually increased compared to Models 1 and 2, meaning that a more normal distribution could not be achieved with the adjustments. Therefore, Model 2 and, to a lesser extent, Model 1 remain the preferred models, and the District of Columbia will remain included in the sample.
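The skewness and kurtosis statistics used in this comparison can be computed from the residuals with the usual moment-based formulas, as in the Python sketch below. The thesis does not state which exact formula the statistical software applied, so the population-moment version shown here is an assumption, and the residual values are made up for illustration.

```python
def skewness_and_kurtosis(values):
    """Moment-based skewness and (non-excess) kurtosis.

    A normal distribution has skewness 0 and kurtosis 3.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical residuals, for illustration only.
residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
skew, kurt = skewness_and_kurtosis(residuals)
```

Values of skewness further from 0 and of kurtosis further from 3 indicate a stronger departure from normality, which is the sense in which the adjusted models perform worse.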

Table 4: Estimation Results for Models 1 and 2 using the Top Decile Income Share as Dependent Variable

VARIABLES (1) Model 1 Top 10 (2) Model 2 Top 10

Population Density -0.0025728 (0.0023913) 0.0059765** (0.0029224)

Education 0.2587217*** (0.0372151) 0.3327767*** (0.0400731)

Patent 0.0009283*** (0.0002031) 0.0006924*** (0.0002329)

GDP per Capita -0.0001904*** (0.0000387) -0.0002286*** (0.0000277)

GDP Growth 0.1261823*** (0.012181) 0.1033611*** (0.0099154)

Constant 41.6022*** (2.22021) 40.87825*** (1.815028)

Observations 765 765

R-squared 0.3537 0.3519

Number of federal states 51 51

Standard errors in parentheses. ***p<0.01, **p<0.05, *p<0.1


A further robustness check has been carried out by switching the dependent variable of Models 1 and 2 from the Gini coefficient to the top 10% income share, obtained from Frank (2009).⁴ This variable has been chosen as there is a relatively clear correlation of 0.679 between the Gini coefficient and the top income shares, and the evolution of the top income decile share closely relates to overall income inequality (Förster & Tóth, 2015). However, the effect of the explanatory variables on the top decile income share may actually be stronger than that on the Gini coefficient over all income levels. Although innovation has been found to significantly affect income inequality, this had not been the case in Aghion et al. (2015). Tertiary education may also have a larger effect on the evolution of the top income share, as those with college education will likely earn wages in the upper income deciles after finishing their studies. Therefore, it will be interesting to re-estimate the models using the alternative measure of inequality. The results of the re-estimation are shown in Table 4. In comparison with the models using the original dependent variable, the effects of the explanatory variables have remained robust in the re-estimation. The largest change can be seen in the increase in R-squared of both models, from 0.2394 to 0.3537 for Model 1 and from 0.2489 to 0.3519 for Model 2. This suggests that the explanatory variables at hand may be better able to predict the development of the top income decile than that of the Gini coefficient over all incomes.

4 While the Gini Coefficient in Frank (2009) had been measured on a scale from 0 to 1, the top income shares are


Table 5: Estimation Results for Models 1 and 2 using Cluster-robust Standard Errors

VARIABLES (1) Model 1 robust (2) Model 2 robust

Population Density 0.0000263 (0.0000176) 0.0000628** (0.0000262)

Education 0.0027565*** (0.0003248) 0.0028359*** (0.0003627)

Patent 0.00000872*** (0.00000161) 0.0000102*** (0.00000208)

GDP per Capita -0.00000207** (0.000000781) -0.00000113** (0.000000435)

GDP Growth 0.0009895*** (0.0002279) 0.0006398*** (0.0001749)

Constant 0.5717526*** (0.0333856) 0.5278405*** (0.0001749)

Observations 765 765

R-squared 0.2394 0.2489

Number of federal states 51 51

Standard errors in parentheses. ***p<0.01, **p<0.05, *p<0.1


5. Conclusion and Limitations

The final section of this thesis will start out with a discussion of various limitations in the empirical model and the results. Afterwards, a direction for future research will be pointed out, followed by a discussion of some policy implications of the results at hand.

5.1 Limitations

One limitation of the analysis at hand is the relatively short time period it spans. Although a clear pattern arises when viewing the evolution of income inequality from 2001 to 2015, it would have been value-adding if the time frame had begun earlier, from approximately 1985 to present day. The reason is that significant increases in income inequality became visible from this point in time. Although the regional Gini coefficients are available for a long time frame, the other variables obtained from the OECD dataset are not. Thus, more data would have been necessary to expand the estimation. Further, although 51 US states have been considered in the estimation, and this leads to a dataset with many observations, it would have been interesting to carry out the analysis for more than just one country. First, using a sample of regions from developed countries with increasing income inequality over the past decades would have provided a clearer picture on the effect of the independent variables. Second, the use of regions from developing parts of the world, with different patterns of income inequality evolution could have provided more insights on the effect of urbanisation, educational attainment and innovation where high development levels have not yet been achieved.

Furthermore, the effect of globalisation on income inequality has been omitted from the analysis, although it is likely that globalisation does have an effect on employment and wage levels in industries that are prone to competition from abroad (Autor et al., 2016). A regional measure for trade openness or trade as a percentage of GDP could not be obtained but would be an interesting addition to the model as an additional explanatory variable.


5.2 Concluding Remarks

This thesis has investigated to what extent urbanisation, educational attainment, innovation and GDP growth can be used to explain the upward trend in income inequality in the regions of the United States from 2001 to 2015. A panel data set constructed from two data sources spanning 51 US states has been used to answer the question at hand. In the preferred Model 2, the independent variables have been lagged by varying time periods in order to decrease the likelihood of an endogeneity problem in the estimation.

Urbanisation, measured by population density, is found to have a positive and significant effect on income inequality, which is in line with Baum-Snow and Pavan (2013). Educational attainment, measured by the share of persons in the labour force with tertiary education, and innovation, measured by patent applications in the US states, also show positive and significant effects on income inequality, as predicted by the literature on skill-biased technological change and innovation (Goldin and Katz, 2010; Autor, 2014; Aghion, Akcigit, Bergeaud, Blundell, & Hémous, 2015). Overall, educational attainment was found to have the largest effect on income inequality and can thus be seen as the most important contributor within the confines of this research. This is also a point of departure for future research, where it would be interesting to examine not only the effect of high-skill educational attainment on income inequality in a developed country, but also the effect of broader educational measures on income inequality. The broad diffusion of knowledge into society may well lead to decreasing inequality, and therefore it would be of interest to differentiate between skill levels in the analysis. Growth of GDP is also positively related to income inequality, even when controlling for the level of per capita GDP, which is found to negatively affect inequality. Thus, the graphical relationship between increasing development (as measured by GDP growth) and rising inequality can be confirmed, at least for the period from 2001 to 2015. The empirical results of the analysis should however be treated with caution, as heteroskedasticity and serial correlation could not be rejected for the model.


6. References

Acemoglu, D. (1998). Why Do New Technologies Complement Skills? Directed Technical Change and Wage Inequality. The Quarterly Journal of Economics, 113(4), 1055–1089.

Acemoglu, D. (2002). Technical Change, Inequality, and the Labor Market. Journal of Economic Literature, 40(1), 7–72.

Acemoglu, D., & Autor, D. (2011). Skills, Tasks and Technologies: Implications for Employment and Earnings. In Handbook of Labor Economics (Vol. 4B, pp. 1043–1171). Elsevier.

Acs, Z. J., Anselin, L., & Varga, A. (2002). Patents and innovation counts as measures of regional production of new knowledge. Research Policy, 31(7), 1069–1085. https://doi.org/10.1016/S0048-7333(01)00184-6

Aghion, P. (2002). Schumpeterian Growth Theory and the Dynamics of Income Inequality. Econometrica, 70(3), 855–882. https://doi.org/10.1111/1468-0262.00312

Aghion, P., Akcigit, U., Bergeaud, A., Blundell, R., & Hémous, D. (2015). Innovation and Top Income Inequality (Working Paper No. 21247). National Bureau of Economic Research. https://doi.org/10.3386/w21247

Autor, D. H. (2014). Skills, education, and the rise of earnings inequality among the “other 99 percent”. Science, 344(6186), 843–851. https://doi.org/10.1126/science.1251868

Autor, D. H., Dorn, D., & Hanson, G. H. (2016). The China Shock: Learning from Labor-Market Adjustment to Large Changes in Trade. Annual Review of Economics, 8(1), 205–240. https://doi.org/10.1146/annurev-economics-080315-015041

Barro, R. J. (2000). Inequality and Growth in a Panel of Countries. Journal of Economic Growth, 5(1), 5–32.

Baum-Snow, N., & Pavan, R. (2013). Inequality and City Size. The Review of Economics and

Chapple, S., Förster, M., & Martin, J. P. (2009). Inequality and Well-being in OECD Countries: What do we Know? Presented at the 3rd OECD World Forum on ‘Statistics, Knowledge and Policy’, Busan, Korea: OECD.

Corak, M. (2013). Income Inequality, Equality of Opportunity, and Intergenerational Mobility. Journal of Economic Perspectives, 27(3), 79–102. https://doi.org/10.1257/jep.27.3.79

Deininger, K., & Squire, L. (1996). A New Data Set Measuring Income Inequality. The World Bank Economic Review, 10(3), 565–591.

Deininger, K., & Squire, L. (1998). New ways of looking at old issues: inequality and growth. Journal of Development Economics, 57(2), 259–287.

Foellmi, R., & Zweimüller, J. (2006). Income Distribution and Demand-Induced Innovations. The Review of Economic Studies, 73(4), 941–960.

Förster, M. F., & Tóth, I. G. (2015). Cross-Country Evidence of the Multiple Causes of Inequality Changes in the OECD Area. In A. B. Atkinson & F. Bourguignon (Eds.), Handbook of Income Distribution (Vol. 2, pp. 1729–1843). Elsevier. https://doi.org/10.1016/B978-0-444-59429-7.00020-0

Frank, M. W. (2009). Inequality and Growth in the United States: Evidence from a New State-Level Panel of Income Inequality Measures. Economic Inquiry, 47(1), 55–68. https://doi.org/10.1111/j.1465-7295.2008.00122.x

Glaeser, E. L., & Maré, D. C. (2001). Cities and Skills. Journal of Labor Economics, 19(2), 316–342. https://doi.org/10.1086/319563

Glaeser, E. L., Resseger, M., & Tobio, K. (2009). Inequality in Cities. Journal of Regional Science, 49(4), 617–646. https://doi.org/10.1111/j.1467-9787.2009.00627.x

Goldin, C., & Katz, L. F. (2007). The Race between Education and Technology: The Evolution of U.S. Educational Wage Differentials, 1890 to 2005 (Working Paper No. 12984). National

Hill, R. C., Griffiths, W. E., & Lim, G. C. (2011). Principles of Econometrics (4th ed.). Hoboken, New Jersey: Wiley.

Kuznets, S. (1955). Economic Growth and Income Inequality. The American Economic Review, 45(1), 1–28.

Long, J. F., Rain, D. R., & Ratcliffe, M. R. (2001). Population Density vs. Urban Population: Comparative GIS Studies in China, India, and the United States. Presented at the IUSSP Conference, Salvador, Brazil: U.S. Census Bureau.

Lundberg, M., & Squire, L. (2003). The Simultaneous Evolution of Growth and Inequality. The Economic Journal, 113(487), 326–344.

Milanovic, B. (2016). Global Inequality: A new approach for the age of globalization. Cambridge, Massachusetts: The Belknap Press of Harvard University.

Milanovic, B., Lindert, P. H., & Williamson, J. G. (2011). Pre-Industrial Inequality. The Economic Journal, 121(551), 255–272. https://doi.org/10.1111/j.1468-0297.2010.02403.x

Nielsen, F., & Alderson, A. S. (1997). The Kuznets Curve and the Great U-Turn: Income Inequality in U.S. Counties, 1970 to 1990. American Sociological Review, 62(1), 12–33. https://doi.org/10.2307/2657450

OECD. (2018a, March). OECD Regional Database. Retrieved from https://www.oecd.org/cfe/regional-policy/regionalstatisticsandindicators.htm

OECD. (2018b, March). Regional Demography: Population Density and Regional Area. Retrieved from http://stats.oecd.org/Index.aspx?DataSetCode=REGION_DEMOGR

Panizza, U. (2002). Income Inequality and Economic Growth: Evidence from American Data. Journal of Economic Growth, 7(1), 25–41.

Piacentini, M. (2014). Measuring Income Inequality and Poverty at the Regional Level in OECD Countries. https://doi.org/10.1787/5jxzf5khtg9t-en

Piketty, T., & Saez, E. (2003). Income Inequality in the United States, 1913–1998. The Quarterly Journal of Economics, 118(1), 1–41. https://doi.org/10.1162/00335530360535135

Roca, J. D. L., & Puga, D. (2017). Learning by Working in Big Cities. The Review of Economic Studies, 84(1), 106–142. https://doi.org/10.1093/restud/rdw031

Rogerson, P. A. (2013). The Gini coefficient of inequality: a new interpretation. Letters in Spatial and Resource Sciences, 6(3), 109–120. https://doi.org/10.1007/s12076-013-0091-x

Royuela, V., Veneri, P., & Ramos, R. (2014). Income Inequality, Urban Size and Economic Growth in OECD Regions. OECD iLibrary. Retrieved 12 March 2018, from http://www.oecd-ilibrary.org/urban-rural-and-regional-development/income-inequality-urban-size-and-economic-growth-in-oecd-regions_5jxrcmg88l8r-en?crawler=true

Schwab, K., & Sala-i-Martín, X. (2017). The Global Competitiveness Report 2017-2018. Geneva: World Economic Forum. Retrieved from https://www.weforum.org/reports/the-global-competitiveness-report-2017-2018/
