The effect of the recent surge in E-commerce on agglomeration of physical retail stores in the Netherlands

Academic year: 2021


Author: L.K. van Maanen, s2249146, L.K.van.Maanen@student.rug.nl
Supervisor: prof. dr. S. Brakman
Co-Assessor: dr. T.M. Stelder

MSc. International Economics and Business

Faculty of Economics and Business, University of Groningen


Abstract

Keywords: Agglomeration, E-commerce, Geography


Contents

1 Introduction
2 Literature Review
2.1 Localization Economies and Urban Economies
2.2 Agglomeration and Dispersion Forces
2.3 Tomahawk Model and Bell Shaped Curve
2.4 E-commerce as a Dispersion or Agglomeration Force
2.5 Hypotheses
3 Theory of the Gini coefficient
4 Methodology
4.1 Data
4.2 Empirical model
5 Results
5.1 ∆Log Gini regression
5.2 Log Gini regression
5.3 Comparison of results
6 Discussion
6.1 Limitations
7 Conclusion

1 Introduction

Nearly 84% of the Dutch population currently lives in urban areas and, by 2030, almost 90% of the Dutch population is expected to live in cities (guardian, 2009). Of the entire world’s population, about 54% lived in urban areas in 2014, a share that is expected to increase to 66% by 2050, according to the United Nations. Moreover, the United Nations predicts that the rural population will decline in absolute terms and identifies important opportunities for economic development offered by cities. Many studies have researched the agglomeration process and the two opposing agglomeration and dispersion forces (Marshall, 1890, 1920; Jacobs, 1969; Henderson, 1986; Krugman and Venables, 1995; Helpman, 1998; Puga, 2010).

Over the last fifteen years, the usage and number of users of the Internet have grown very rapidly, particularly in countries of the Western world and, specifically, E-commerce has been a strong driver of growth in the retail industry¹ of developed economies (Clarke, Thompson, and Birkin, 2015; Soopramanien and Robertson, 2007). If E-commerce affects physical retail stores in agglomerates differently than “lone” physical retail stores, then E-commerce must have a (net) agglomeration or dispersion effect. We are therefore led to raise the following question: What is the effect of the recent surge in E-commerce on the agglomeration processes of physical retail stores in the Netherlands? More precisely, we study whether physical retail stores in urban areas or “lone” physical retail stores in remote areas suffer more from E-commerce.

Research has shown that the shopping process of the majority of individuals starts on the Internet. Individuals use the Internet both for research before they visit physical retail stores and to substitute on-line purchases for their in-store purchases (Ward & Morganowsky, 2002). Moe and Fader (2004) highlighted the fact that the costs associated with on-line browsing and/or searching for information about products and services are significantly lower than the costs of searching for information without the use of the Internet, which causes consumers to search on-line more frequently than off-line.

In 2006, consumers bought a total of €3.69 billion (Thuiswinkel.org, 2007) in products and services on-line, which is approximately €12 billion less than in 2015. According to the Central Agency for Statistics, the Dutch E-commerce industry is expected to reach a revenue of €19.28 billion in 2016. Comparing the years 2016 and 2006, the Dutch E-commerce industry has expanded enormously in ten years, by more than 400%. These massive increases in on-line revenues show that not only on-line product search frequency, but also on-line product purchasing frequency has increased tremendously in the Netherlands.

¹ We use the term “retail sector” to denote the set of all physical retail stores engaged in
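The growth figure quoted above can be checked with a few lines of arithmetic. The sketch below uses only the two revenue figures and the approximate 12 billion gap cited in the text; the variable names are illustrative and not part of the original study.

```python
# Revenue figures as quoted in the text (in billions).
revenue_2006 = 3.69    # Thuiswinkel.org (2007)
revenue_2016 = 19.28   # expected 2016 revenue

# Percentage growth over the ten-year period.
growth_pct = (revenue_2016 - revenue_2006) / revenue_2006 * 100

# The text states 2006 was roughly 12 billion below 2015,
# which implies an approximate 2015 revenue.
implied_2015 = revenue_2006 + 12

print(f"Growth 2006-2016: {growth_pct:.0f}%")
print(f"Implied 2015 revenue: about {implied_2015:.2f} billion")
```

The computed growth is roughly 422%, consistent with the statement that the industry expanded by more than 400%.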

Research from Thuiswinkel.org (2016a) in collaboration with postNL and GfK found that in the first quarter of 2016, a total of 11.12 million Dutch people of fifteen years or older made one or more on-line purchases, which is approximately 80% of the total Dutch population, and a 2% increase with respect to the first quarter of 2015. During 2015, on-line shopping constituted between 4.7% (first quarter) and 5.9% (fourth quarter) of the total market. Overall, in 2015, 78% of the revenues from on-line shopping were generated by the sale of products and 22% by the sale of services. Additionally, three-quarters of all expenditures on services were generated from on-line shopping. The research (Thuiswinkel.org, 2016b) also showed that on-line buyers in the Netherlands are, in general, more highly educated and have a significantly above-average income. Furthermore, individuals from households with children or households consisting of three to five persons are more likely to shop on-line. Lastly, consumers aged between twenty and forty-nine years old, as well as heavy Internet users², are also more likely to buy on-line.

Many studies have analyzed on-line product searching and buying behavior (e.g. Ingham, Cadieux, and Berrada, 2015; Cao, Xu, and Douma, 2012; Weltevreden and Rietbergen, 2007), but only a few include the effect of geographical factors on on-line buying behavior (e.g. Ren and Kwan, 2009; Farag, Weltevreden, Van Rietbergen, Dijst, and van Oort, 2006). Moreover, the question of how the on-line retail industry influences geography, specifically the spatial distribution of physical retail stores, has not yet been investigated. Since very little research into E-commerce has paid attention to geographic effects, and the research that did address geographic effects focused on the consumers’ perspective, it remains unclear whether E-commerce acts as an agglomeration or dispersion force. The research presented in this article adds to the current body of knowledge on the effect of E-commerce by exploring its effects on the spatial distribution of physical retail stores, specifically by investigating E-commerce’s nature as an agglomeration or dispersion force. Namely, the aim of this paper is to study whether E-commerce has a net agglomerative or dispersive effect, through competing more with “lone” physical retail stores or more with physical retail stores in agglomerations, respectively. We focus exclusively on sales from businesses to consumers (B2C) and do not consider business to business (B2B) and consumer to consumer (C2C) platforms such as eBay. The B2C purchases considered include products and services regardless of where the products or services are purchased: at home, at work, on the road, in the street or at an (on-line) store.

Given the expectation that E-commerce revenues will continue to grow in the future, the understanding of whether it has an agglomerating or dispersive effect on physical retail stores is very valuable for urban planners and policymakers alike.


Section 2 reviews the literature relevant to the topics of agglomeration and E-commerce and highlights the areas this study contributes to. Section 3 describes the theory of the Gini coefficient. Section 4 describes the data and methodology used in the empirical research, and section 5 presents the results of this study. Section 6 discusses the implications of these results and the limitations of this study. Section 7 forms the conclusion of this paper.

2 Literature Review

This study approaches the effect of E-commerce on agglomeration as a “disruptive” force, i.e. we assume that, before the introduction of E-commerce, the spatial distribution of physical retail stores was more or less in equilibrium with respect to the agglomeration and dispersion forces. The interplay between agglomeration forces and dispersion forces can be explained in different ways. Scitovsky (1954) makes the distinction between internal and external economies of scale, based on Marshall (1920). Agglomeration economies result from external economies of scale, which refer to how the scale of the urban environment adds to the productivity of the firm. There are three dimensions over which these externalities may extend, namely the industrial, geographic, and temporal dimensions (Rosenthal & Strange, 2004). This study will focus on the industrial and geographic dimensions, which explain the externalities that arise from the concentration of economic activities and why urban areas (i.e. cities) exist, respectively. According to Puga (2010), companies and workers have a higher productivity when they are located in an agglomerated environment. Marshall (1890, 1920) was the first researcher to describe the benefits of agglomeration economies, which will be further discussed in subsection 2.1.

2.1 Localization Economies and Urban Economies

internal increasing returns to scale, from labor market pooling, which refers to the fact that agglomeration economies allow a better match between the demand and supply of labor, and from knowledge spillovers (which are explained below). An increasing body of literature supports Marshall’s theory that spatial concentration within industries raises growth and productivity, which shows the importance of localization economies (Moomaw, 1981; Nakamura, 1985; Henderson, 1986; Ciccone & Hall, 1993; Henderson, 2003).

In contrast to Marshall’s sector-specific spillovers, also known as Marshall, Arrow, Romer (MAR) economies, Jacobs (1969) shows the importance of urban density, highlighting city-specific spillovers that cross the boundaries between industries. According to Marshall (1920), city growth is driven by specialization, with the so-called localization externalities as the main driving force of growth, whereas according to Jacobs (1969) the growth of areas is not driven only by industry-specific externalities, but also by externalities shared among firms from different industries. Jacobs’ (1969) idea is that larger urbanized areas have a higher productivity and all firms in a specific city benefit from the superior infrastructure, which is a positive externality of urban areas and is not industry specific. Note that Marshall (1920) did recognize the value of urban diversity, which reduces risk and achieves domestic complementarity (Rosenthal & Strange, 2004). Puga (2010) describes that cities, which are clusters of many different types of economic activities, are evidence of the existence of urbanization economies.

2.2 Agglomeration and Dispersion Forces

From reviewing the literature, it becomes clear that there are several important agglomeration and dispersion forces, which we will now discuss. Important agglomeration forces, as described by Marshall (1920), are input sharing, knowledge spillovers, and labor market pooling. Many other agglomeration forces exist that were not discussed by Marshall (1920), like the home market effect and urban consumption opportunities.

Input sharing is influenced by the existence of scale economies in input production. Without scale economies, a lone firm in a remote area obtains inputs at the same price as firms in cities, but in the case of scale economies, the lone firm will be at a disadvantage (Holmes, 1999; Rosenthal and Strange, 2004). Moreover, sharing of infrastructure, transportation and other resources makes firms located in cities more profitable.

Knowledge spillovers are the most discussed agglomeration force, as they are analyzed in many different areas of economics. Through the close connection between firms, knowledge travels more easily among firms, which facilitates innovation and growth. Knowledge spillovers support localization economies, as one of the advantages of being located close to other firms in the same industry is that it is easier to share knowledge. Knowledge spillovers can also exist between different industries and bolster innovation, but only if the knowledge is complementary, which stimulates urban economies.

An individual firm benefits from the pool of industry-specific skilled workers that a large city supplies (Brakman, Garretsen, & Van Marrewijk, 2009). Labor market pooling can be explained either as a localization economies force or as an urban economies force. Namely, labor supply and demand are better matched in larger cities, which supports urban economies, but labor supply and demand are also better matched in industrial concentrations, which supports localization economies.

Labor mobility intensifies the agglomeration force due to the home market effect (Nishikimi, 2008). The so-called home market effect makes larger regions or cities more attractive, making them even larger.

behavior. If E-commerce erodes agglomeration forces, but does not affect dispersion forces, E-commerce is expected to shift the retail industry to a more dispersed equilibrium, as it decreases the advantages of agglomeration, but leaves the disadvantages unaffected.

2.3 Tomahawk Model and Bell Shaped Curve

According to Fujita, Krugman, and Venables (2001), transport costs represent one of the most crucial factors for determining the balance between agglomeration and dispersion forces. Furthermore, Glaeser and Kohlhase (2004) find that the transport costs between cities and regions are an important factor for explaining urban growth. Thus, transport costs drive the formation of cities and agglomeration economies. We use the term transport costs to denote all costs associated with overcoming barriers to trade between different locations, such as language and culture barriers, communication costs, shipping costs, and tariffs. Transport costs are the main feature of geography in the core model of Krugman (1991), as without transport costs, there is no role for location or geography to begin with. An important feature of the model of Krugman (1991) is that it predicts that high transport costs dictate dispersion and low transport costs stimulate agglomeration, which is displayed graphically in the tomahawk model (Baldwin, Forslid, Martin, Ottaviano, & Robert-Nicoud, 2003). The formulation of the impact of transport costs in the tomahawk model is vital to the understanding of their effect on agglomeration, and of their relevance to the hypotheses below. The Tomahawk model describes the distribution of firms over two regions as a function of transport costs. For sufficiently low transport costs, any equilibrium in which firms are distributed over both regions is unstable: if one firm relocates from one region to the other, all other firms will follow, leading to a completely agglomerated equilibrium where all firms are located in the same region (which is always stable in the low-transport-costs regime). Conversely, for high transport costs, the completely agglomerated equilibrium is unstable, and the stable equilibrium is given by the situation in which the firms are equally distributed over both regions.

for low and high levels. For low transport costs, agglomeration is unattractive as the dispersion forces dominate over the agglomeration forces, because access to labor, consumers, and other firms is effortless and uncomplicated. For intermediate levels of transport costs, firms agglomerate to enjoy the externalities that arise from clustering. However, for very high levels of transport costs, firms again disperse over the different regions, in order to match the final demand in each region. The bell-shaped or U-curve model is further explained in, e.g., Combes, Mayer, and Thisse (2008), Puga (1999), or Krugman and Venables (1995).

Sufficiently low transport costs allow consumers to travel to retail agglomerates to experience a wider variety of choice. However, as E-commerce provides a similar variety of choice without the transportation costs, the relative cost of traveling to an agglomerate increases. On the other hand, consumers who previously favored shopping at the nearest lone store over shopping at an agglomerate in order to minimize transport costs could also choose to shop on-line, as they can enjoy a wider variety of choice without having to travel.

2.4 E-commerce as a Dispersion or Agglomeration Force

Since the introduction of the Internet, individuals have had the option to go on-line to browse the webpages of e-retailers for (information about) products or services they might wish to purchase. If the information about the product is found on-line and the purchase is made on-line, the consumer is no longer interested in where the retailer is located, as long as shipping fees and time are not affected, which, at least for domestic shipping, is generally the case (Anderson, Chatterjee, and Lakshmanan, 2003; Farag, Krizek, and Dijst, 2006). The introduction of the Internet and, specifically, the recent surge in E-commerce revenues therefore give rise to the question of how geography, which is highly relevant for physical retail stores, affects the competition these stores experience from E-commerce.

Much of the research on Internet marketing revolves around the discussion of whether Internet shopping is necessarily a substitute for physical store shopping or whether the two can also complement each other. Several studies (Hernandez, Gomez-Insausti, and Biasiotto, 2001; Ferrell, 2004; Farag, Weltevreden, et al., 2006) have found complementary effects, with the complete shopping process in many cases consisting of interactions with physical and virtual stores for the same purchasing decision. However, the key difference between these studies and the research presented in this paper is that here we consider only the actual purchase at the Internet retail store (not browsing Internet retail stores’ web pages in general), which must necessarily be a substitute for the purchase of the same product at a physical store, just like the purchase of a product at one physical retail store is a substitute for the purchase of the same product at another physical store³.

Very few studies have examined E-commerce from a spatial perspective, and the studies that did so were performed from the perspective of the consumer (through surveys), focusing on the e-shoppers’ decision process and experience (Clarke et al., 2015; Cao et al., 2012; Ren and Kwan, 2009; Weltevreden and Rietbergen, 2007; Farag, Krizek, and Dijst, 2006; Farag, Weltevreden, et al., 2006; Sinai and Waldfogel, 2004; Zmud and Arce, 2000). In contrast, this research takes the novel approach of focusing on the effects of E-commerce on agglomeration, by measuring how E-commerce affects the distribution of physical retail stores through its competition with both stores that are part of agglomerates and “lone” stores.

³ Since the consumer has a limited income at his disposal to satisfy his desires to the best

Clarke et al. (2015) examined the growth of E-commerce in British retailing and tested the spatial variations in E-commerce usage. Commercial consumer surveys were their main data source, in combination with census data. The research questions they posed were: “Does E-commerce vary by geodemographic group?” and “Does the usage vary by geographical region?”, for which they used binary logistic regression and surveys with several different questions⁴. Clarke et al. (2015) find evidence of increasingly high Internet usage in rural areas due to increased quality of broadband service.

Ren and Kwan (2009) examined why individuals adopt E-commerce and what kind of changes in activities will occur due to E-shopping, and included a spatial perspective in their analysis⁵. This study focused on the accessibility to local stores, the effect of the residential background on the adoption of E-commerce, and the on-line buying frequency. They used Internet diary surveys, which collected detailed data about the Internet-use patterns, frequency, amount of products and/or services purchased, and the amount of money spent on the Internet. Several analytical methods, including logistic regression, Poisson and negative binomial regression models, were used to test the research questions. They concluded that people with low levels of accessibility to local stores are more likely to shop on-line.

Several studies on E-commerce from the consumer’s perspective highlight different interesting aspects of the E-shopping process. Firstly, Sinai and Waldfogel (2004) found that the further away individuals live from the nearest book or clothing store, the more they buy on-line, as compared to their off-line expenditures. Secondly, Weltevreden and Rietbergen (2007) found that when the city center is perceived as being more attractive, Internet users are less inclined to shop on-line. Conversely, and somewhat surprisingly, Farag, Krizek, and Dijst (2006) found that Dutch respondents are more likely to buy on-line as travel times to shops are shorter, which they explained by arguing that cities are innovation centers, which makes their inhabitants more likely to be early adopters of E-commerce⁶.

⁴ “How often do they use the Internet to buy goods and services?”, “Where do you live?” and a question to classify whether the households live in urban or rural areas

⁵ This study is the second study on E-commerce that takes geographic factors into account,

The higher use of E-commerce in rural areas suggests that lone stores in remote areas suffer more from the competition with E-commerce than stores in agglomerates (especially those in attractive city centers). On the other hand, the findings of Farag, Krizek, and Dijst (2006) suggest that the latter suffer more from the competition with E-commerce than the former.

Thus, the literature is inconclusive on whether E-commerce is an agglomeration or dispersion force, which is why this paper will investigate this further. The understanding of the effect of E-commerce as an agglomeration or dispersion force is important to, for example, urban planners, as it helps them forecast the evolution of the retail industry under the growing competition from E-commerce. Moreover, it has the potential to expand agglomeration models to include a force that has become very relevant over the last decade, namely that of E-commerce. As no previous studies have analyzed the effect of E-commerce on agglomeration, we cannot build upon empirical methods that are “standard” in the field. Therefore, we choose Ordinary Least Squares regression as the starting point for empirics in this novel field, because of its widespread use and well-understood properties.

2.5 Hypotheses

We formulate two opposing hypotheses, based on the previously discussed literature. The (external) scale advantages of agglomeration identified by Marshall and Jacobs for physical retail stores are, most importantly, the wider variety of choice offered to consumers (in addition to effects like spillovers of effective sales techniques). On the other hand, from the Tomahawk and Bell curve models, we learn that the most important disadvantage of shopping at an agglomerate is that the nearest agglomerate is typically located farther away than the nearest lone store, resulting in higher transport expenses (in addition to other dispersion forces such as congestion costs). Thus, consumers who prioritize freedom of choice are expected to shop at agglomerates, whereas consumers looking to minimize travel times and expenditures will typically shop at lone stores. However, E-shopping offers consumers a wide variety of choice, with (almost) zero travel times and expenses. Therefore, E-commerce has the potential of attracting both the variety-loving clientèle of agglomerates and the travel-averse clientèle of lone stores.

⁶ As, E-commerce in the Netherlands has moved past the early adoption stage, these

We formulate the “Efficiency” hypothesis as follows: lone stores in remote areas suffer a disproportionally large amount of competition from E-commerce as compared to stores located in agglomerates, because before the Internet the clientèle of these lone stores typically shopped there simply because they did not consider their purchase worth the effort of traveling to the nearest agglomerate to have a wider variety of purchasing options. However, since the introduction of the Internet, consumers have the option to enjoy a wide variety of purchasing options without having to travel to an agglomerate. Namely, by using the Internet, consumers can enjoy the entire spectrum of products and/or services offered by web shops from the comfort of their homes. Thus, the “Efficiency” hypothesis implies that E-commerce is a net agglomeration force, as it causes lone stores to suffer more than stores in agglomerates, which we state as:

Hypothesis 1. E-commerce is an agglomeration force.

The second hypothesis, which we will refer to as the “Love of Variety” hypothesis, opposes the Efficiency hypothesis through the following reasoning: Retail stores that are part of agglomerates attract extra customers (on top of their local customer base that shops there simply because it is the nearest option), who travel to the agglomerate because they wish to enjoy the extra product variety the agglomerate offers as compared to their nearest lone store. However, since the introduction of the Internet, consumers who would previously be forced to travel to an agglomerate to enjoy a wide variety of products and/or services on offer can now choose to spare themselves the trip to the agglomerate, as they can find at least as wide a variety of products and/or services on offer by accessing the Internet. Thus, the “Love of Variety” hypothesis theorizes that stores that are part of agglomerates suffer a disproportionally large amount of competition from web stores, as compared to lone stores in remote areas, as it hampers their ability to attract extra consumers on top of their local customer base. Therefore, the “Love of Variety” hypothesis implies that E-commerce is a net dispersion force, as it causes stores in agglomerates to suffer more than lone stores, which we state as:

Hypothesis 2. E-commerce is a dispersion force.

Both hypotheses hinge on the assertion that E-commerce offers both the advantage of the local lone store, i.e. little to no travel time and expenses, and the wide variety of choice of stores in agglomerates. Namely, consumers can access the entire spectrum of products and services offered on the Internet and have them delivered to their front door, without ever leaving their homes. The efficiency aspect of E-shopping (no traveling involved) diminishes a store’s “local” base of consumers that prefers not to travel for their purchases. This is expected to affect lone stores most, as this type of consumer typically makes up their entire clientèle. On the other hand, the variety aspect of E-shopping (wide variety on offer) makes agglomeration less feasible as it diminishes the additional client base that stores in agglomerates are able to attract through offering a wide variety of choice. The stores in the agglomerates need these additional clients to subsist, as the higher concentration of stores also has downsides, such as congestion costs and having to share the local consumer base with a larger number of stores (i.e. attracting additional customers has allowed the agglomerates to grow beyond the number of stores that can be sustained by the local customer base).

than purchasing in a physical store (because of shipping fees) and that some consumers are still averse to Internet banking. Lastly, physical retail stores also attract “funshoppers”, i.e. customers who experience shopping as an enjoyable pastime and travel to, specifically, agglomerates to have a pleasant day out. This is an experience E-commerce does not offer, which makes the “shopping experience” an important advantage physical stores have over web-shops, benefiting especially stores in agglomerates, as agglomerates are much more attractive to “funshoppers” than lone stores.

Our choice of opposing hypotheses is similar to that of Farag, Weltevreden, et al. (2006). Their first hypothesis was that E-commerce is predominantly an urban phenomenon, because technology adoption usually starts in centers of innovation (innovation-diffusion hypothesis), which implies that urban stores (which are more likely to be agglomerated) suffer more from the competition with E-commerce. Their second hypothesis was that people are more likely to adopt E-commerce when their accessibility to shops is relatively low (analogous to our efficiency hypothesis), which implies that stores in rural areas (which are less likely to be agglomerated) suffer more from the competition with E-commerce. Their findings indicated that on-line buying is still largely an urban phenomenon in the Netherlands, but that there is a trend towards diffusion to the weakly urbanized and rural areas.

However, it is important to realize that E-commerce is a rapidly developing sector, and the analysis of Farag, Weltevreden, et al. was performed over 10 years ago, making the study presented in this paper very relevant. Specifically, the innovation-diffusion hypothesis formulated by Farag, Weltevreden, et al. (2006) has become quite dated, as 98% of all Dutch households have since gained access to the Internet (European-Commission, 2016), and E-commerce has passed the early adoption stage, becoming mainstream practice. The high degree of Internet access has enabled the Dutch to buy products and services on-line with relative ease both in urbanized and rural areas. Although the “Innovation” hypothesis of Farag, Weltevreden, et al. (2006) has likely lost its relevance, the “Love of Variety” hypothesis tested in this paper explains how (urban) stores that are part of agglomerates might still suffer more from the competition with E-commerce than the lone physical retail stores (in rural areas), which would make E-commerce a dispersion force.

expect that E-commerce will act as an agglomeration force, as, in addition to the “Love of Variety” effect, agglomerates are also attractive to “funshoppers”, who are indifferent to competition from E-commerce. As the customer bases of all physical retail stores will suffer from the competition with E-commerce, except for the agglomerates’ “funshopping” customer base, we expect E-commerce to be a net agglomeration force.

3 Theory of the Gini coefficient

In order to measure how E-commerce affects agglomeration, we need to first quantify agglomeration, i.e. “how” agglomerated a certain sector in a certain region is. To quantify the agglomeration of the retail industry in the Netherlands, we use the Gini coefficient. The Gini coefficient is a measure of inequality that has a wide range of applications, the most well-known being the Gini coefficient for wage inequality. For quantifying agglomeration, the Gini coefficient of concentration of the labor force of an economic sector in a country is defined as (Rosenthal & Strange, 2001)

Gj =

X

i

(xi− sij)2, (1)

where $x_i$ is the share of the entire country’s population living in region $i$, and $s_{ij}$ is the share of the total labor force of sector $j$ that is employed in region $i$. Thus, $G_j$ measures the concentration of the labor force of sector $j$ with respect to the distribution of the country’s population. Let us now apply the Gini coefficient to labor concentration in the retail industry. In the complete absence of agglomeration effects (both economies of scale and congestion costs), and if we assume homogeneous consumers, retail employees, and stores, we expect a fixed ratio of the number of retail employees, $L_i$, to the number of inhabitants, $N_i$, of a region. Namely, the ratio $L_i / N_i = Q^{-1}$ is determined by


equilibrium situation, this assumption is expected to be accurate, but during transition processes it definitely will not be. Thus, the entire country’s retail labor force $L^{\mathrm{tot}}_j$ is given by the entire country’s population $N^{\mathrm{tot}}$, divided by $Q$.

Going back to the Gini coefficient, we have

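$$G_j = \sum_i \left(\frac{N_i}{N^{\mathrm{tot}}} - \frac{L_{ij}}{L^{\mathrm{tot}}_j}\right)^2, \tag{2}$$

$$G_j = \sum_i \left(\frac{N_i}{N^{\mathrm{tot}}} - \frac{L_{ij} \cdot Q}{N^{\mathrm{tot}}}\right)^2, \tag{3}$$

$$G_j = \sum_i \left(\frac{N_i}{N^{\mathrm{tot}}} - \frac{N_i}{N^{\mathrm{tot}}}\right)^2 = 0. \tag{4}$$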

Thus, in the absence of agglomeration effects and for homogeneous customers, retail employees, and stores, the Gini coefficient is zero. Any (agglomeration) effect that causes a region’s retail employment to deviate from the employment level $L_{ij} = N_i / Q$ will increase the Gini coefficient to a value larger than zero. Namely, expanding our equation

$$s_{ij} = \frac{L_{ij} \cdot Q}{N^{\mathrm{tot}}} = \frac{N_i}{N^{\mathrm{tot}}}, \tag{5}$$

to include agglomeration effects yields

$$s_{ij} = \frac{N_i}{N^{\mathrm{tot}}} + f(F^D_{ij}, F^A_{ij}). \tag{6}$$

Here, $f(F^D_{ij}, F^A_{ij})$ is determined by a region’s equilibrium between agglomeration forces and dispersion forces, denoted by $F^A$ and $F^D$, respectively, and describes the regions’ fluctuations away from the “dispersion equilibrium” $s_{ij} = N_i / N^{\mathrm{tot}}$ ($G = 0$ corresponds to a perfectly dispersed equilibrium).
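As a minimal sketch of equations 1 to 4, the Gini coefficient can be computed directly from regional population and employment counts. The region sizes, the value of `Q` and the function name `locational_gini` are illustrative assumptions, not values or code from this study.

```python
def locational_gini(population, employment):
    """Return G_j = sum_i (x_i - s_ij)^2 for a single sector j."""
    n_tot = sum(population)   # N^tot
    l_tot = sum(employment)   # L^tot_j
    return sum((n / n_tot - l / l_tot) ** 2
               for n, l in zip(population, employment))


# Hypothetical regions (made-up numbers).
population = [100_000, 50_000, 25_000]   # N_i
Q = 50                                   # assumed inhabitants per retail employee

# Employment exactly proportional to population: L_ij = N_i / Q.
proportional = [n / Q for n in population]
# Same total employment, but shifted toward the largest region.
concentrated = [2_500, 700, 300]

g_proportional = locational_gini(population, proportional)
g_concentrated = locational_gini(population, concentrated)
print(g_proportional, g_concentrated)
```

In this toy example the proportional distribution gives a Gini coefficient of zero (up to floating-point rounding), whereas the concentrated distribution gives a strictly positive value.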

If we keep the total number of retail employees needed to satisfy domestic demand fixed, i.e. $L^{\mathrm{tot}}_j = N^{\mathrm{tot}}/Q$, then

$$\sum_i s_{ij} = \frac{\sum_i N_i}{N^{\mathrm{tot}}} + \sum_i f(F^D_{ij}, F^A_{ij}), \tag{7}$$

$$1 = \frac{N^{\mathrm{tot}}}{N^{\mathrm{tot}}} + \sum_i f(F^D_{ij}, F^A_{ij}) = 1 + \sum_i f(F^D_{ij}, F^A_{ij}). \tag{8}$$

Thus, $\sum_i f(F^D_{ij}, F^A_{ij}) = 0$, i.e. $f(F^D_{ij}, F^A_{ij})$ causes a redistribution of the total


At this point, it is important to make a distinction between the macroscopic, country-level agglomeration measured by the Gini coefficient and the microscopic region-level agglomeration described by the function $f(F^D_{ij}, F^A_{ij})$. Namely, the former describes the country’s inequality in the distribution of retail employment, whereas the latter describes each individual region’s relative agglomeration, i.e. it describes each region’s retail employment-to-population ratio with respect to the country’s other regions. If a region has a retail employment-to-population ratio below the country average of $L^{\mathrm{tot}}_j / N^{\mathrm{tot}} = Q^{-1}$, then the region has negative relative agglomeration, and when the ratio is above the country average, the region has positive relative agglomeration. Thus, at the microscopic level, $f(F^D_{ij}, F^A_{ij})$ describes each region’s balance of agglomeration and dispersion forces with respect to the other regions, such that when its agglomeration forces are higher and dispersion forces lower than the country average, the region will be positively agglomerated (i.e. have a higher than average retail employment-to-population ratio), whereas if its agglomeration forces are lower and dispersion forces higher, the region will be negatively agglomerated (i.e. have a lower than average retail employment-to-population ratio).
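The distinction between the region-level term $f$ and the country-level Gini coefficient can be illustrated with a short sketch. It computes $f_i = s_{ij} - N_i/N^{\mathrm{tot}}$ per region, labels each region's relative agglomeration by the sign of $f_i$, and recovers the Gini coefficient as the sum of squares. All numbers and the helper name `relative_agglomeration` are hypothetical.

```python
def relative_agglomeration(population, employment):
    """Return f_i = s_ij - N_i / N^tot for every region i."""
    n_tot = sum(population)
    l_tot = sum(employment)
    return [l / l_tot - n / n_tot for n, l in zip(population, employment)]


population = [100_000, 50_000, 25_000]   # hypothetical N_i
employment = [2_500, 700, 300]           # hypothetical L_ij

f = relative_agglomeration(population, employment)

# Microscopic view: the sign of f_i tells whether a region lies above or
# below the country-average employment-to-population ratio.
labels = ["positive" if v > 0 else "negative" if v < 0 else "neutral" for v in f]

# Macroscopic view: the Gini coefficient only sees the magnitude of f.
gini = sum(v ** 2 for v in f)

print(labels, sum(f), gini)
```

The terms $f_i$ sum to zero, reflecting that $f$ only redistributes a fixed total, while the Gini coefficient is non-negative by construction.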

On the other hand, at the macroscopic level, the agglomeration minimum corresponds to $G = 0$ (perfect dispersion), i.e. it cannot be negative, and thus agglomeration forces increase the fluctuations in the retail employment-to-population ratio over the different regions, and dispersion forces decrease these fluctuations. Thus, macroscopic agglomeration forces increase the average magnitude of $f(F^D_{ij}, F^A_{ij})$ and macroscopic dispersion forces decrease the average magnitude of $f(F^D_{ij}, F^A_{ij})$⁷.

⁷ In the limit that the dispersion forces completely dominate over the agglomeration forces, the average magnitude of $f(F^D_{ij}, F^A_{ij})$ is zero, i.e. $f(F^D_{ij}, F^A_{ij})$ is zero everywhere,

To better understand how the function $f(F^D_{ij}, F^A_{ij})$ can be parametrized, we will now discuss Duranton’s canonical model of agglomeration. The canonical model of agglomeration describes the balance between the advantages and disadvantages of agglomeration. It assumes increasing returns to scale, i.e. per capita productivity increases as a city’s population increases, but also increasing marginal costs for increasing population sizes of cities, which are referred to as congestion costs. This interplay between congestion costs and agglomeration economies (e.g. increasing returns to scale) is formulated, for a specific city, as

$$u(L) = A \cdot L^{\beta} - B \cdot L^{\gamma}, \tag{9}$$

where $u$ denotes the marginal utility for a person relocating from the countryside to the city, $L$ the size of the labor force of the city, $\beta$ the elasticity of agglomeration economies with respect to labor force size, and $\gamma$ the elasticity of congestion with respect to the labor force size. $A$ denotes the Total Factor Productivity (TFP), which is intrinsic to the specific city considered. $B$ denotes the city-specific congestion cost factor, i.e. $B$ is analogous to $A$ but for costs instead of profits (by redefining the factor $A$ as $A/B$, the congestion cost factor $B$ is typically eliminated without the model losing much of its validity, but as we will not actually fit equation 9, we do not need to try and reduce the number of free parameters).

The utility function $u$ is normalized to the countryside’s utility (which is assumed to be constant), such that when a city’s marginal utility falls to zero, people from the countryside will stop relocating to the city, and (long-term) negative utilities are impossible, as people will relocate back to the countryside. $\gamma$ is taken to be larger than $\beta$ ($0 < \beta < \gamma$), such that for large population sizes the congestion costs term starts to dominate over the agglomeration economies term and the marginal utility becomes a decreasing function of population size. This prevents the distribution from collapsing to the situation where the entire labor force resides in a single city. Moreover, for non-empty cities, $A > B > 1$, such that when a city’s population is close to zero, its marginal utility function is an increasing function of population.
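The shape of equation 9 can be explored numerically. The sketch below evaluates $u(L)$, the labor force size at which $u$ peaks (from $\mathrm{d}u/\mathrm{d}L = 0$) and the size at which $u$ returns to zero (from $A L^{\beta} = B L^{\gamma}$). The parameter values are purely illustrative assumptions chosen to satisfy $0 < \beta < \gamma$ and $A > B > 1$; they are not estimates.

```python
def utility(L, A, B, beta, gamma):
    """Marginal utility u(L) = A * L**beta - B * L**gamma (equation 9)."""
    return A * L ** beta - B * L ** gamma


def peak_size(A, B, beta, gamma):
    """Labor force size where u(L) is maximal: A*beta*L**(beta-1) = B*gamma*L**(gamma-1)."""
    return (A * beta / (B * gamma)) ** (1.0 / (gamma - beta))


def zero_utility_size(A, B, beta, gamma):
    """Positive labor force size where u(L) = 0: A * L**beta = B * L**gamma."""
    return (A / B) ** (1.0 / (gamma - beta))


# Illustrative parameters only (assumed, not fitted).
A, B, beta, gamma = 4.0, 2.0, 0.5, 1.0

L_peak = peak_size(A, B, beta, gamma)
L_zero = zero_utility_size(A, B, beta, gamma)
print(L_peak, utility(L_peak, A, B, beta, gamma))
print(L_zero, utility(L_zero, A, B, beta, gamma))
```

For these values, utility first rises with city size, peaks, and then falls back to zero once congestion costs offset the agglomeration economies.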


population causes the city’s population to run away from the equilibrium population (a decrease in population causes the city’s marginal utility to fall and the population to decrease further and, similarly, an increase in population causes the city’s marginal utility to rise and the population to increase further).

If we now envision equation 9 as describing specifically the marginal utility of someone taking up the profession of retail employee, then we take it as being normalized to the employees’ productivity in other professions, such that individuals will stop taking up the retail profession if they can be more productive in other lines of work. Equation 9 describes the scale advantages of employing extra retail employees; e.g. extra employees mean that a larger variety of goods can be offered, as they can run more or larger stores and specialize in specific types of retail goods. On the other hand, equation 9 also describes the scale disadvantages of employing extra retail employees, e.g. congestion costs from more or larger physical retail stores driving up rents, and retail employees becoming scarcer, having to travel farther to the job, or being less qualified for it. Thus, a parametrization of the form of equation 9 can be used to find $f(F^D_{ij}, F^A_{ij})$, which describes the excess or lack of retail employees in a region with respect to equation 5.

However, the parametrization in equation 9 describes absolute agglomeration, whereas $f(F^D_{ij}, F^A_{ij})$ describes relative agglomeration. For relative agglomeration, the difference between agglomeration and dispersion forces is no longer the only determinant of the level of agglomeration. For example, from the perspective of Duranton’s model, a region that has $A > B$ will have non-zero retail employment, where the equilibrium employment level is given by $A \cdot L^{\beta} = B \cdot L^{\gamma}$. However, it is possible that this region’s “retail agglomeration potential” ($= A - B$)⁸, which describes how attractive a region is to the retail industry (with respect to the region’s population), is smaller than the country’s average retail potential. Thus, $f(F^D_{ij}, F^A_{ij})$ will dictate negative agglomeration in this region.

⁸ Although $\beta \neq \gamma$, we assume their values to be close enough, such that we can use

We are testing the effect of E-commerce as a macroscopic (country-wide) agglomeration or dispersion force. That is, using the Gini coefficient, we measure the country-level agglomeration. Therefore, we do not need to explicitly describe the microscopic agglomeration and dispersion forces in each region. Thus, the most straightforward parametrization of $f(F^D_{ij}, F^A_{ij})$ can be achieved by assuming each region’s (sector-specific) relative retail potential to be a randomly distributed variable $-1 < R_{ij} < 1$, under the constraint that $\sum_i R_{ij} = 0$, and

$$f(F^D_{ij}, F^A_{ij}) \equiv C_j \cdot R_{ij} \cdot \frac{F^A_j}{F^D_j}, \tag{10}$$

where $F^A_j$ and $F^D_j$ are the macroscopic (country-wide) agglomeration and dispersion forces, respectively, and $C_j$ is a sector-specific normalization factor, which depends on how $F^A_j$ and $F^D_j$ are quantified. Thus, regions with a positive relative retail potential have an above-average retail employment-to-population ratio and will see their ratio increase for increasing (macroscopic) agglomeration forces and decreasing dispersion forces, whereas the retail employment-to-population ratio of regions with a negative relative retail potential is below the average and will increase for decreasing levels of agglomeration forces and increasing levels of dispersion forces.
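To make the parametrization in equation 10 concrete, the following sketch draws random relative retail potentials that sum to zero and shows how the resulting Gini coefficient, $G_j = \sum_i f_i^2$, responds to the ratio $F^A_j / F^D_j$. The way the potentials are drawn, the value of `C` and the force values are assumptions made for illustration only.

```python
import random


def relative_potentials(n_regions, seed=0):
    """Draw R_i with -1 < R_i < 1 and sum_i R_i = 0 (one simple construction)."""
    rng = random.Random(seed)
    raw = [rng.uniform(-0.5, 0.5) for _ in range(n_regions)]
    mean = sum(raw) / n_regions
    return [r - mean for r in raw]   # centering enforces the sum-to-zero constraint


def gini_from_forces(potentials, C, F_A, F_D):
    """Gini coefficient implied by f_i = C * R_i * F_A / F_D (equation 10)."""
    f = [C * r * F_A / F_D for r in potentials]
    return sum(v ** 2 for v in f)


R = relative_potentials(10)
C = 0.05   # hypothetical normalization factor

g_base = gini_from_forces(R, C, F_A=1.0, F_D=1.0)
g_more_agglomeration = gini_from_forces(R, C, F_A=2.0, F_D=1.0)
g_more_dispersion = gini_from_forces(R, C, F_A=1.0, F_D=2.0)
print(g_base, g_more_agglomeration, g_more_dispersion)
```

Under this parametrization the Gini coefficient scales with $(F^A_j / F^D_j)^2$, so a stronger macroscopic agglomeration force raises it and a stronger dispersion force lowers it.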

Using the Gini coefficient, we test whether increasing levels of E-commerce lead to an increase of retail employment-to-population ratios in regions where the ratio was already above average, and a decrease in regions where the ratio was already below average, which would mean E-commerce is an agglomeration force, or to a decrease of retail employment-to-population ratios in regions where the ratio is above average, and an increase in regions where the ratio is below average, which would mean E-commerce is a dispersion force.


4 Methodology

4.1 Data

This study focuses on data from the Netherlands, as obtaining accurate data on other countries proved too challenging. The time horizon used in this study is also completely determined by data availability, as we include all years in our analysis for which we have E-commerce revenues. We use data on employment and population density both at the level of individual municipalities and at the level of four-digit ZIP-codes, which allows us to study agglomeration processes at different geographic scales. The analysis at the level of four-digit ZIP-codes captures more detailed shifts or disappearances of physical retail stores: at the level of municipalities, for example, shifts of physical retail stores from the middle of the city-center to the edge of the city cannot be detected, whereas at the level of four-digit ZIP-codes they can. We use data from the LISA foundation (“Landelijk Informatiesysteem van Arbeidsplaatsen”), which are based on employment records from the Dutch government. LISA is a database containing information on all branches of business in the Netherlands where paid work is performed. The data have a spatial component, namely the businesses’ addresses, and a socio-economic component, namely employment numbers and type⁹ (full-time or part-time). The LISA data cover the period from 1996 to 2014, at both municipality and ZIP-code level. This study uses only 5 out of the 99 branches, as the other branches are not retail-related. The LISA database is based on 21 regional employment records, which are supported by the provinces and (partly) by the municipalities.

Both for the analysis at the level of municipalities and for the analysis at the level of four-digit ZIP-codes, we calculate the Dutch, country-wide, Gini coefficient of each retail sector during each year, using either the population and retail employment distributions at the municipality level, or four-digit ZIP-codes level, respectively.
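The per-sector, per-year calculation described above could be organized as in the sketch below. The record layout (tuples of year, sector, region and jobs), the field names and the tiny data set are hypothetical; the actual LISA and CBS files are structured differently and are not reproduced here.

```python
from collections import defaultdict


def gini_by_sector_year(employment_rows, population):
    """Country-wide Gini coefficient per (sector, year).

    employment_rows: iterable of (year, sector, region, jobs)   [assumed layout]
    population:      dict {(year, region): inhabitants}          [assumed layout]
    """
    jobs = defaultdict(dict)
    for year, sector, region, n_jobs in employment_rows:
        by_region = jobs[(sector, year)]
        by_region[region] = by_region.get(region, 0) + n_jobs

    result = {}
    for (sector, year), by_region in jobs.items():
        regions = [r for (y, r) in population if y == year]
        n_tot = sum(population[(year, r)] for r in regions)
        l_tot = sum(by_region.values())
        # Regions without any employment in the sector count as a zero share.
        result[(sector, year)] = sum(
            (population[(year, r)] / n_tot - by_region.get(r, 0) / l_tot) ** 2
            for r in regions
        )
    return result


# Made-up example: two regions, one sector, two years.
population = {(2011, "A"): 100, (2011, "B"): 100,
              (2012, "A"): 100, (2012, "B"): 100}
rows = [(2011, "books", "A", 10), (2011, "books", "B", 10),
        (2012, "books", "A", 15), (2012, "books", "B", 5)]

gini = gini_by_sector_year(rows, population)
print(gini)
```

The same function can be applied to municipality-level or ZIP-code-level records, since only the definition of a region changes.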

⁹ The LISA foundation classifies full-time employees as working more than 12 hours a week, part-timers as working less than 12 hours a week.

In addition to the data from the LISA foundation, this study uses data from thuiswinkel.org¹⁰. The data from thuiswinkel.org are based on the Global Online Measurement Standard for B2C E-commerce (GOMSEC), which is the standard for the measurement of B2C E-commerce for products and services and is determined by the European organization Ecommerce Europe. Only the absolute, total revenues per sector earned by B2C Internet sales were available. That is, we did not have access to E-commerce revenues relative to a sector’s total revenue, nor to spatially disaggregated E-commerce revenues. The data span 19 sectors and cover the period from 2010 to 2013. Unfortunately, no additional business-specific data were available, as all contacted corporations, without exception, chose not to share data about their performance. The 19 sectors are combined into 14 sectors, to match the sector classification that LISA uses.

Finally, we use data from CBS (StatLine) on population numbers at the municipality and ZIP-code level. The data at the municipality level cover the period from 1996 to 2014, and the data at the ZIP-code level cover the period from 2011 to 2014.

4.2 Empirical model

In the previous section, we assumed the underlying agglomeration forces described by $f(F^D_{ij}, F^A_{ij})$ to be at a stable equilibrium before the introduction of E-commerce, such that changes to the Gini coefficient can be regarded as being caused by E-commerce. However, even when these forces are undergoing changes, we can still find the effect of E-commerce as an agglomeration or dispersion force, by testing how (changes to) the Gini coefficients correlate with the E-commerce revenues, and controlling for year- and sector-specific levels of agglomeration. To do so, we need to consider how the yearly Gini coefficients are influenced by revenues in E-commerce. As explained previously, E-commerce revenues come at the cost of physical stores’ revenues. However, retail employment will not react instantaneously to the physical retail stores’ revenues. Nevertheless, the retail industry typically works with short-term contracts and low-educated, abundantly available labor. Therefore, we expect (changes to retail) employment of a certain sector in a certain year to be strongly influenced by the performance of the retail sector in the year directly prior to it.


However, as we do not have population data at the four-digit ZIP-code level from before the year 2011, we cannot calculate the change of the Gini coefficient of 2011 with respect to the previous year for the analysis at this level. As we already have very limited statistics (56 observations), excluding the 2011 observations, which would reduce our data by a quarter, is not an option. Nevertheless, all else being equal (for which we control using time and industry fixed effects), if a positive or negative relation between changes to the Gini coefficient and E-commerce revenues in the previous year exists, then a, respectively, positive or negative relation also exists between the actual level of the Gini coefficients and the E-commerce revenues in the previous year. Therefore, for the analysis at the level of four-digit ZIP-codes, we regress the log of a sector’s Gini coefficient against the previous year’s E-commerce revenues of the sector. We also perform the same analysis (i.e. regressing the logs of the Gini coefficients against the E-commerce revenues in the previous year) at the level of individual municipalities, such that we can compare the results directly. In both cases, we take the log of the Gini coefficients to reduce their skewness and the heteroskedasticity in their linear regression on E-commerce revenues in our data¹¹.

Although the E-commerce revenues are also somewhat skewed to the right (see figure 5 in Appendix I), we choose to regress on the revenues themselves, and not on their logarithms, as regressing on the logarithms would lead us to study the effect of relative changes to E-commerce revenues on the Gini coefficient. Additionally, taking the log of the E-commerce revenues actually “overcompensates” somewhat, as the log of the E-commerce revenues shows some skewness to the left (see figure 6 in Appendix I).
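The kind of skewness comparison referred to here, before and after taking logarithms, can be sketched with the moment-based sample skewness. This is not Stata's skewness/kurtosis test, only the underlying descriptive statistic, and the data values are invented for illustration.

```python
import math


def skewness(values):
    """Moment-based sample skewness m3 / m2**1.5 (positive means right-skewed)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Invented, right-skewed values standing in for Gini coefficients.
gini_values = [0.001, 0.002, 0.002, 0.003, 0.004, 0.006, 0.010, 0.030]

skew_raw = skewness(gini_values)
skew_log = skewness([math.log(g) for g in gini_values])
print(skew_raw, skew_log)
```

For a strongly right-skewed variable the logarithm reduces the skewness; for a variable that is only mildly right-skewed, the same transformation can push the skewness below zero, which is the "overcompensation" described for the E-commerce revenues.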

¹¹ The Gini coefficients, at both the level of the municipalities and the level of four-digit ZIP-codes, are checked for skewness by using Stata’s built-in skewness/kurtosis test, and their linear regression on E-commerce revenues is checked for heteroskedasticity by using the Breusch-Pagan / Cook-Weisberg test. We find evidence of skewness and heteroskedasticity at 5% significance, which are strongly reduced after taking the logarithm of the Gini coefficients (see figures 1 to 4 in Appendix I).

¹² We cannot build upon empirical methods that are “standard” in the field, because there

Both at the level of municipalities and of four-digit ZIP-codes, we choose Ordinary Least Squares regression as the starting point for empirics in this novel field, because of its widespread use and well-understood properties¹², to estimate the relation between the Gini coefficient of retail sector $j$ in year $y$, denoted as $G_{jy}$, and the E-commerce revenues of retail sector $j$ in year $y-1$, denoted as $E_{j,y-1}$:

$$\ln(G_{jy}) \sim \alpha_L \cdot E_{j,y-1}, \tag{11}$$

where $\alpha_L$ measures the correlation between the level of E-commerce revenues and the log of the Gini coefficient. Furthermore, at the level of municipalities, we also estimate the following relation using Ordinary Least Squares regression:

$$\Delta \ln(G_{jy}) \sim \alpha_\Delta \cdot E_{j,y-1}, \tag{12}$$

where $\Delta \ln(G_{jy}) = \ln(G_{jy}) - \ln(G_{j,y-1}) = \ln\left(G_{jy} / G_{j,y-1}\right)$, and $\alpha_\Delta$ measures the correlation between the level of E-commerce revenues and changes to the log of the Gini coefficient. Thus, the “Log Gini” analyses, for a constant level of E-commerce revenues, assume a constant level of the Gini coefficient in the following years, i.e. after a change in E-commerce revenues, the Gini coefficient is assumed to have completely adapted to the new E-commerce level in the year directly following it. On the other hand, the “∆Log Gini” analysis, for a constant level of E-commerce revenues, assumes constant relative changes to the Gini coefficient in the following years, i.e. the Gini coefficient is assumed to adapt to a new level of E-commerce revenues by constant relative changes over an indefinite number of years. Thus, the “Log Gini” analyses let us predict the short-term effects (in the directly following year) of increases in E-commerce revenues on agglomeration, whereas the “∆Log Gini” analysis, at least theoretically (as our data only span a short period of time), studies the long-term effect of

To find the correlation coefficients $\alpha_L$ and $\alpha_\Delta$, sufficient data and, specifically, sufficient variation in E-commerce revenue levels are needed. Therefore, the entire dataset, including all different retail sectors, is used to determine one global correlation parameter $\alpha$. Thus, we determine the effect of E-commerce as an agglomeration or dispersion force for the entire retail branch, and not for specific retail sectors.
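Pooling all sectors into one set of observations requires pairing each sector-year Gini coefficient with the same sector's E-commerce revenues of the previous year. A minimal sketch of that bookkeeping is given below; the dictionary layout and the numbers are hypothetical and only illustrate how $\ln(G_{jy})$, $\Delta\ln(G_{jy})$ and $E_{j,y-1}$ line up.

```python
import math


def build_observations(gini, revenue):
    """Pair each Gini coefficient with the previous year's revenue.

    gini:    {(sector, year): G_jy}        [assumed layout]
    revenue: {(sector, year): E_jy}        [assumed layout]
    Returns rows (sector, year, ln G_jy, delta ln G_jy or None, E_j,y-1).
    """
    rows = []
    for (sector, year), g in sorted(gini.items()):
        lagged_revenue = revenue.get((sector, year - 1))
        if lagged_revenue is None:
            continue  # no E-commerce revenue for the previous year
        previous_gini = gini.get((sector, year - 1))
        delta = math.log(g / previous_gini) if previous_gini is not None else None
        rows.append((sector, year, math.log(g), delta, lagged_revenue))
    return rows


# Made-up numbers; revenues in billions of euros.
gini = {("books", 2011): 0.02, ("books", 2012): 0.01, ("travel", 2012): 0.05}
revenue = {("books", 2010): 1.0, ("books", 2011): 1.5, ("travel", 2011): 8.0}

observations = build_observations(gini, revenue)
for row in observations:
    print(row)
```

Rows without a previous-year Gini coefficient can still enter the "Log Gini" regression but not the "∆Log Gini" regression, which mirrors the situation at the ZIP-code level for 2011.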


appropriate. Therefore, we use a fixed effects regression model, where we use industry fixed effects to control for time-independent effects of sector-specific characteristics on the sector’s Gini coefficient.

This paper includes different econometric tests to verify and improve the validity of the performed analysis. As we are using a linear regression model, we first test for heteroskedasticity. Namely, the OLS estimator is based on the assumption that for each value of $E_{j,y-1}$, the values of $\ln(G_{jy})$ are distributed around their mean with the same variance. The data in this model are checked for heteroskedasticity by using the Breusch-Pagan / Cook-Weisberg test, and when heteroskedasticity is found, robust standard errors are added to the OLS regression to mitigate some of its effect. The second test we perform checks for the existence of time fixed effects in the data and the necessity to add them to our regression, and the last test we perform identifies potential outliers.

The Breusch-Pagan / Cook-Weisberg test finds that some degree of heteroskedasticity is still present in the data after taking the logarithm, for all three analyses. We therefore use Stata’s robust standard errors (Huber-White sandwich estimators) to mitigate the effect of heteroskedasticity, but observe that these do not strongly affect the results of the regressions.
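To illustrate what a Huber-White (sandwich) standard error does, the sketch below compares the classical and a heteroskedasticity-robust standard error for the slope of a regression with one regressor and an intercept. This is a simplified stand-in, assuming a single regressor and an HC1-type small-sample scaling of $n/(n-2)$; it is not the Stata routine used in this study, and the data are invented.

```python
import math


def ols_slope_with_se(x, y):
    """Return (slope, classical SE, robust SE) for y = a + b*x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classical SE assumes one common error variance for all observations.
    classical = math.sqrt(sum(e ** 2 for e in resid) / (n - 2) / sxx)

    # Sandwich SE weights each squared residual by its own leverage term.
    sandwich_var = sum(((xi - x_bar) * e) ** 2 for xi, e in zip(x, resid)) / sxx ** 2
    robust = math.sqrt(sandwich_var * n / (n - 2))   # HC1-type scaling (assumed)
    return slope, classical, robust


# Invented data in which the spread of y grows with x.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0, 0.1, -0.1, 0.5, -0.6, 1.2]

slope, se_classical, se_robust = ols_slope_with_se(x, y)
print(slope, se_classical, se_robust)
```

Both standard errors belong to the same slope estimate; robust standard errors change the inference, not the coefficient itself.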

As the Gini coefficient varies over different sectors, because of the nature of these industries (Rosenthal & Strange, 2001), we include sector fixed effects¹³. Moreover, we include time fixed effects to control for factors such as economic upturns or downturns (the business cycle) influencing employment and, possibly, the Gini coefficient, as Stata’s (time) fixed effects test finds their joint inclusion to be significant at 5% for all three analyses. The method of Osborne and Overbay (2004) is used to find outliers. Osborne and Overbay identify potential outliers as being located more than three standard deviations away from the mean. Both at the level of individual municipalities and at the level of four-digit ZIP-codes, no outliers in the logarithm of the Gini coefficient are found using the method of Osborne and Overbay (2004). Furthermore, in the E-commerce revenues, we identify the travel sector as an outlier, given its high E-commerce revenues (see figures 7-9 in the appendix). However, as excluding this sector would reduce our number of observations further to only 52, and because we believe this sector to be a very informative case for what happens to agglomeration at higher levels of E-commerce revenues, we choose to include this sector in our estimations. Nevertheless, we investigate the effect of excluding this sector on our estimations and observe that this does not disproportionally affect our estimates¹⁴ (see table 3 in the appendix).

¹³ Stata’s built-in fixed effect test confirms the significance of the joint inclusion of the
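The three-standard-deviation rule for flagging potential outliers is simple to express in code. The sketch below is a generic implementation of that rule applied to invented numbers; it is not the procedure or the data used in this study.

```python
import statistics


def three_sigma_outliers(values, threshold=3.0):
    """Return the values lying more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > threshold * sd]


# Invented revenues: twenty ordinary sector-years and one very large value.
revenues = [0.8, 0.9, 1.0, 1.1, 1.2] * 4 + [10.0]

flagged = three_sigma_outliers(revenues)
print(flagged)
```

Because the outlier itself inflates the standard deviation, the rule becomes less sensitive in small samples, which is one reason to also compare estimates with and without the flagged observations.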

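Thus, the Ordinary Least Squares regressions performed on the data at the level of individual municipalities are formulated as

$$\ln(G^{m}_{jy}) = \alpha_{Lm} \cdot E_{j,y-1} + \beta_{Y_1} \cdot \delta_{Y_1} + \dots + \beta_{Y_N} \cdot \delta_{Y_N} + \beta_{S_1} \cdot \delta_{S_1} + \dots + \beta_{S_N} \cdot \delta_{S_N} + u_{jy}, \tag{13}$$

and

$$\Delta \ln(G^{m}_{jy}) = \alpha_{\Delta} \cdot E_{j,y-1} + \beta_{Y_1} \cdot \delta_{Y_1} + \dots + \beta_{Y_N} \cdot \delta_{Y_N} + \beta_{S_1} \cdot \delta_{S_1} + \dots + \beta_{S_N} \cdot \delta_{S_N} + u_{jy}, \tag{14}$$

and the regression at the level of four-digit ZIP-codes is formulated as

$$\ln(G^{z}_{jy}) = \alpha_{Lz} \cdot E_{j,y-1} + \beta_{Y_1} \cdot \delta_{Y_1} + \dots + \beta_{Y_N} \cdot \delta_{Y_N} + \beta_{S_1} \cdot \delta_{S_1} + \dots + \beta_{S_N} \cdot \delta_{S_N} + u_{jy}, \tag{15}$$

where $\delta_Y$ denote the year dummies, $\delta_S$ denote the sector dummies, and $u_{jy}$ denotes the error term, for which robust standard errors are computed.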
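For a balanced panel, the coefficient on $E_{j,y-1}$ in a regression with a full set of sector and year dummies can equivalently be obtained by a two-way "within" transformation, i.e. by subtracting sector means and year means (and adding back the grand mean) before running a simple regression. The sketch below demonstrates this on synthetic data constructed with a known coefficient; the data layout, names and numbers are assumptions, and the sketch covers the point estimate only, not the standard errors.

```python
def two_way_within_slope(panel):
    """Slope on x after removing sector and year effects.

    panel: {(sector, year): (y, x)} for a balanced panel   [assumed layout]
    """
    sectors = sorted({s for s, _ in panel})
    years = sorted({t for _, t in panel})

    def demean(idx):
        grand = sum(v[idx] for v in panel.values()) / len(panel)
        s_mean = {s: sum(panel[(s, t)][idx] for t in years) / len(years) for s in sectors}
        t_mean = {t: sum(panel[(s, t)][idx] for s in sectors) / len(sectors) for t in years}
        return {(s, t): panel[(s, t)][idx] - s_mean[s] - t_mean[t] + grand
                for s in sectors for t in years}

    y_tilde, x_tilde = demean(0), demean(1)
    return (sum(x_tilde[k] * y_tilde[k] for k in panel)
            / sum(x_tilde[k] ** 2 for k in panel))


# Synthetic panel: ln(G) = alpha * E + sector effect + year effect, alpha = -0.9.
alpha_true = -0.9
sector_fx = {"a": -3.0, "b": -4.0, "c": -5.0}
year_fx = {2011: 0.0, 2012: 0.1, 2013: 0.3}
lagged_revenue = {("a", 2011): 0.5, ("a", 2012): 0.7, ("a", 2013): 1.2,
                  ("b", 2011): 1.0, ("b", 2012): 1.1, ("b", 2013): 1.3,
                  ("c", 2011): 4.0, ("c", 2012): 5.5, ("c", 2013): 8.0}
panel = {k: (alpha_true * e + sector_fx[k[0]] + year_fx[k[1]], e)
         for k, e in lagged_revenue.items()}

alpha_hat = two_way_within_slope(panel)
print(alpha_hat)
```

Since the synthetic data contain no noise, the estimate reproduces the coefficient used to construct them, which confirms that the sector and year effects are fully absorbed.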

Thus, finding a significant negative correlation between $\ln(G_{jy})$, or $\Delta \ln(G_{jy})$, and $E_{j,y-1}$ supports hypothesis 2, indicating that E-commerce is a dispersion force, and finding a significant positive correlation between $\ln(G_{jy})$ and $E_{j,y-1}$ supports hypothesis 1, indicating that E-commerce is an agglomeration force.

5 Results

The results of the Ordinary Least Squares regressions testing the two formulated hypotheses are presented in table 1 and will be analyzed in this section (see table 2 in the appendix for summary statistics).

¹⁴ The E-commerce coefficients do not change sign, but p-values are somewhat increased

As shown in table 1, the R-squared for the analysis at the level of four-digit ZIP-codes equals 0.979, i.e. 97.9% of the variation in the log of the Gini coefficient is explained by the E-commerce revenues, the fourteen sector dummies and the four year dummies. At the level of individual municipalities, we find an R-squared of 0.471 for the “∆Log Gini” analysis, and 0.958 for the “Log Gini” analysis, indicating that, respectively, 47.1% and 95.8% of the variation in the (changes to the) log of the Gini coefficient is explained by the E-commerce revenues, the fourteen sector dummies and the four year dummies. Typically, an R-squared value above 0.9 is seen as indicative of a good fit of the regression.

Table 1: Regression results (OLS)

Note: ∗∗∗ p<0.01, ∗∗ p<0.05, ∗ p<0.1. Robust standard errors in parentheses. The first column shows the estimated effect of E-commerce revenues on the log of the Gini coefficient at the level of four-digit ZIP-codes, the second column shows the estimated effect of E-commerce revenues on the log of the Gini coefficient at the level of individual municipalities, and the third column shows the estimated effect of E-commerce revenues on the changes to the log of the Gini coefficient at the level of individual municipalities.

5.1 ∆Log Gini regression

We start by discussing the results of regressing the changes to the logs of the Gini coefficients against the E-commerce revenues at the level of individual municipalities. Since we are estimating the relation (denoting the additional fixed effects terms and intercept as “FE”)

$$\ln\left(\frac{G_y}{G_{y-1}}\right) = \alpha_\Delta \cdot E_{y-1} + FE, \tag{16}$$

we have

$$\frac{G_y}{G_{y-1}} = \exp\left(\alpha_\Delta \cdot E_{y-1} + FE\right) = e^{\alpha_\Delta \cdot E_{y-1}} \cdot e^{FE}, \tag{17}$$

$$G_y = e^{\alpha_\Delta \cdot E_{y-1}} \cdot e^{FE} \cdot G_{y-1}, \tag{18}$$

so if $\alpha_\Delta > 0$, the Gini coefficient is an increasing function of E-commerce revenues, making the latter an agglomeration force, and if $\alpha_\Delta < 0$, the Gini coefficient is a decreasing function of E-commerce revenues, making the latter a dispersion force ($G_{y-1}$ is greater than or equal to zero by definition and strictly larger than zero in any realistic scenario). If a sector’s annual E-commerce revenue increases by 1 unit, $\hat{E}_{y-1} = E_{y-1} + 1$ (where the E-commerce revenues are measured in units of €$10^9$), the sector’s Gini coefficient in the next year is expected to be $e^{\alpha_\Delta}$ times larger than it would have been, had the sector’s annual E-commerce revenue remained unchanged. Namely,

$$\hat{G}_y = e^{\alpha_\Delta \cdot \hat{E}_{y-1}} \cdot e^{FE} \cdot G_{y-1} = e^{\alpha_\Delta \cdot (E_{y-1} + 1)} \cdot e^{FE} \cdot G_{y-1}, \tag{19}$$

$$\hat{G}_y = e^{\alpha_\Delta} \cdot e^{\alpha_\Delta \cdot E_{y-1}} \cdot e^{FE} \cdot G_{y-1} = e^{\alpha_\Delta} \cdot G_y. \tag{20}$$

We find a value of $-0.92 \pm 0.44$ for the coefficient $\alpha_\Delta$, which means that a one billion Euro increase in a sector’s E-commerce revenues is associated with the sector’s Gini coefficient in the next year being $e^{-\alpha_\Delta} = 2.5^{+1.4}_{-0.9}$ times smaller than it would have been, had the E-commerce revenues remained unchanged. The coefficient $\alpha_\Delta$ is smaller than zero at 5% significance, providing evidence for the Love of Variety Hypothesis (2) (which states that E-commerce acts as a dispersion force) at the level of individual municipalities.
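The conversion from the estimated coefficient to a multiplicative factor with asymmetric uncertainties can be retraced as follows. The sketch simply exponentiates the point estimate and the estimate shifted by one standard error in either direction; the function name is an assumption, and the input values are the ones reported above.

```python
import math


def shrink_factor(alpha, std_err):
    """Factor by which G is smaller per unit of revenue, for a negative alpha.

    Returns (central, upward uncertainty, downward uncertainty), obtained by
    exponentiating -alpha and -alpha +/- one standard error.
    """
    central = math.exp(-alpha)
    upper = math.exp(-alpha + std_err)
    lower = math.exp(-alpha - std_err)
    return central, upper - central, central - lower


central, plus, minus = shrink_factor(alpha=-0.92, std_err=0.44)
print(round(central, 1), round(plus, 1), round(minus, 1))
```

The uncertainties are asymmetric because the exponential function is convex: a symmetric interval around the coefficient maps to a wider range above the central factor than below it.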

5.2 Log Gini regression

We now continue with the results of regressing the logs of the Gini coefficients against the E-commerce revenues, where we estimate the relation

$$\ln(G_y) = \alpha_L \cdot E_{y-1} + FE, \tag{21}$$

$$G_y = \exp\left(\alpha_L \cdot E_{y-1} + FE\right) = e^{\alpha_L \cdot E_{y-1}} \cdot e^{FE}, \tag{22}$$

so we again have that if $\alpha_L > 0$, the Gini coefficient is an increasing function of E-commerce revenues, making the latter an agglomeration force, and if $\alpha_L < 0$, the Gini coefficient is a decreasing function of E-commerce revenues, making the latter a dispersion force. We also again find that if a sector’s annual E-commerce revenue increases by 1%, $\hat{E} = 1.01E$, the sector’s Gini coefficient in the next year is expected to be $(e^{0.01 \cdot \alpha_L \cdot E} - 1) \cdot 100\%$ larger than it would


equal, if a sector’s annual E-commerce revenue is 1 unit higher than it was in the previous year ($E_y = E_{y-1} + 1$), then next year’s Gini coefficient ($G_{y+1}$) is $e^{\alpha_L}$ times larger than the current year’s Gini coefficient ($G_y$):

$$G_{y+1} = e^{\alpha_L \cdot E_y} \cdot e^{FE} = e^{\alpha_L \cdot (E_{y-1} + 1)} \cdot e^{FE} = e^{\alpha_L} \cdot G_y. \tag{23}$$

At the level of individual municipalities, we find the coefficient $\alpha_{Lm}$ to be equal to $-0.86 \pm 0.43$, which means that a one billion Euro increase in a sector’s E-commerce revenues is associated with the sector’s Gini coefficient in the next year being $e^{-\alpha_{Lm}} = 2.4^{+1.3}_{-0.8}$ times smaller than the sector’s current year’s Gini coefficient. The coefficient $\alpha_{Lm}$ is smaller than zero at 10% significance, providing evidence for the Love of Variety Hypothesis (2) (which states that E-commerce acts as a dispersion force) at the level of individual municipalities. At the level of four-digit ZIP-codes, we find the coefficient $\alpha_{Lz}$ to be equal to $0.119 \pm 0.157$, indicating that a one billion Euro increase in a sector’s E-commerce revenues is associated with the sector’s Gini coefficient in the next year being $e^{\alpha_{Lz}} = 1.13^{+0.19}_{-0.16}$ times larger than the sector’s current year’s Gini coefficient. The coefficient $\alpha_{Lz}$ is not significantly different from zero (with a p-value of a little over 0.4). Thus, although the sign of the coefficient $\alpha_{Lz}$ suggests that E-commerce acts as an agglomeration force, supporting the Efficiency Hypothesis (1), the sign of the coefficient could as well be due to chance, with the actual effect of E-commerce on agglomeration being non-existent or slightly negative at the level of four-digit ZIP-codes.
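As a rough plausibility check of the reported significance at the ZIP-code level, the ratio of the estimate to its standard error can be converted into a two-sided p-value. The sketch below uses a standard normal approximation, which is an assumption: the p-values reported in this study come from Stata and are based on the finite-sample distribution and unrounded estimates, so they can differ somewhat.

```python
import math


def two_sided_p_normal(estimate, std_err):
    """Two-sided p-value of estimate / std_err under a standard normal approximation."""
    z = abs(estimate / std_err)
    return math.erfc(z / math.sqrt(2.0))   # equals 2 * (1 - Phi(z))


# Rounded ZIP-code-level values as reported in the text.
p_zip = two_sided_p_normal(0.119, 0.157)
print(round(p_zip, 2))
```

The approximation gives a p-value between 0.4 and 0.5, in line with the statement that the ZIP-code-level coefficient is not significantly different from zero.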

5.3 Comparison of results

At the level of individual municipalities, we find the coefficient $\alpha_\Delta$ of the “∆Log Gini” analysis (which measures E-commerce-induced relative changes to the Gini coefficient that are sustained over the following years) to be more significant, and slightly larger in magnitude, than the coefficient $\alpha_L$ of the “Log Gini” analysis (which measures the effect of a change in E-commerce revenues on the Gini coefficient in the year directly following it). Thus, our findings suggest that the effect of a change in E-commerce revenues on agglomeration is not contained within the year directly following the change.


municipalities, and that our measurement of the effect of E-commerce on agglomeration at the four-digit ZIP-code level includes the (significant) retail-dispersion effect of E-commerce we find between municipalities (as we calculate country-wide Gini coefficients). Although smaller in magnitude and insignificant, the sign of the effect of E-commerce on agglomeration that we find at the level of four-digit ZIP-codes is opposite to the sign of the effect that we find at the level of municipalities. Namely, at the level of municipalities, we find that E-commerce acts as a dispersion force, whereas at the level of four-digit ZIP-codes, our results suggest that E-commerce acts as an agglomeration force (although we do not find significant evidence for this).

It is theoretically possible for the dispersion effect of E-commerce that we find at the municipality level to be “undetectable” to the regression at the ZIP-code level and the positive coefficient $\alpha_L$ we find at the ZIP-code level to be simply due to chance, e.g. if statistical fluctuations at the ZIP-code level are much larger than those at the municipality level¹⁵.

However, since we know that E-commerce acts as a dispersion force between municipalities, it is more likely that the insignificant and positive coefficient of the effect of E-commerce on agglomeration that we find at the level of four-digit ZIP-codes is caused by E-commerce acting as an agglomeration force within municipalities, contrary to its effect as a dispersion force between municipalities. That would mean that at the level of four-digit ZIP-codes, the dispersion effect of E-commerce between municipalities competes with the agglomeration effect of E-commerce within municipalities, with the latter canceling the former, resulting in the statistically insignificant and slightly positive value of the coefficient $\alpha_{Lz}$.

6 Discussion

As explained in the previous section, our findings suggest that the competition that physical retail stores suffer from E-commerce acts as a dispersion force between municipalities, but as an agglomeration force within municipalities. Although this might seem counterintuitive, the following explanation shows that it is possible nevertheless:

¹⁵ Which could explain the lower R-squared of the ∆Log Gini regression, but the fact that


On the one hand, E-commerce could act as a dispersion force at the level of municipalities by decreasing the willingness to travel to an agglomerate in another municipality, as a result of the wide variety of products offered through E-commerce (i.e. by eroding the “Love of Variety” effect). This would lead to a distribution of physical retail stores over the different municipalities that is more proportional to the municipalities’ number of inhabitants, thus lowering the Gini coefficient. Moreover, this effect would be less relevant within municipalities, as distances are smaller and travel times shorter.

On the other hand, within municipalities, E-commerce could act as an agglomeration force through price competition. Price competition forces physical retail stores that were previously distributed uniformly over the city and its periphery to concentrate in areas where production factors, specifically land rents and transportation accessibility, are the least expensive (typically select commercial zones in a city’s periphery) in order to remain profitable. However, it is also possible that “Funshoppers” sustain physical retail stores in (attractive) city-centers, whereas physical retail stores spread throughout the city perish under the competition from E-commerce. Lastly, E-commerce could also act as an agglomeration force within municipalities by driving smaller stores out of business through price competition, such that only the largest stores, which can endure the price competition better through scale advantages, remain. This would lead specifically to agglomeration of retail employment, but not (necessarily) to agglomeration of retail stores.
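The link between a population-proportional distribution and a low Gini coefficient can be illustrated with a minimal sketch. It assumes one common Lorenz-curve formulation of a Gini coefficient of employment relative to population; the exact definition used in Section 3 may differ, and the function name `locational_gini` and all numbers are hypothetical.

```python
def locational_gini(employment, population):
    """Gini coefficient of employment relative to population (Lorenz-curve form).

    Assumes every region has a strictly positive population.
    """
    total_e = sum(employment)
    total_p = sum(population)
    # Order regions from the lowest to the highest employment per inhabitant.
    regions = sorted(zip(employment, population), key=lambda r: r[0] / r[1])
    cum_e = 0.0   # cumulative employment share
    area = 0.0    # twice the area under the Lorenz curve
    for e, p in regions:
        prev_e = cum_e
        cum_e += e / total_e
        area += (p / total_p) * (prev_e + cum_e)
    return 1.0 - area


# Hypothetical municipalities: retail jobs proportional to inhabitants.
print(locational_gini([10, 20, 30], [100, 200, 300]))  # close to 0
# Same inhabitants, but all retail jobs in the largest municipality.
print(locational_gini([0, 0, 60], [100, 200, 300]))    # 0.5
```

In this sketch a perfectly proportional distribution yields a value of (approximately) zero, while concentrating all retail employment in one municipality raises it, which is the direction of change the dispersion argument relies on.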


an agglomeration force within municipalities, between which the results of our analysis cannot distinguish. Therefore, future research should reveal whether or not such interactions are indeed the cause of the discrepancy between the effect of E-commerce on agglomeration at the level of municipalities and at the level of four-digit ZIP-codes. Specifically, instead of using the Gini coefficient as an indicator of the level of agglomeration, such studies should analyze the geographic movement of retail employment and stores in detail, such that it is understood which zones in cities or regions lose physical retail stores and employment because of competition from E-commerce, and to where the retail activity relocates. Furthermore, other aspects of this research upon which future studies should improve are discussed in the following section.

6.1

Limitations

Given the opposite findings at different geographical scales, as discussed in the previous section, and, specifically, the lack of significant evidence for the effect of E-commerce within municipalities, it is important to discuss the limitations of this study, the most important of which are related to data availability.

Firstly, the E-commerce revenues we used only span four years. The Internet has been on the rise in the Netherlands since 1988 (Posthumus, 2013)16, whereas the data on E-commerce revenues that we use in this study start in 2010, i.e. 12 years after the first Dutch web-shop was established in 1998. The optimal situation for this study would have been to use E-commerce revenues starting from 1998, which could have produced more accurate results in our analysis. Unfortunately, E-commerce revenues from before 2010 were not available for this study, because of the confidentiality of these data and the fact that web-shops would not share them with individuals outside of the company.

Secondly, the regression analysis we performed on the effect of web-shop revenues on the Gini coefficient grouped all sectors together. This was done because we did not have sufficient data to test the effect of E-commerce on the Gini coefficient for each sector individually. Although it is very likely that the effect of E-commerce on the Gini coefficient depends on the type of sector considered, and it would be useful to measure this effect (for policymakers, city-planners, etc.), this sector-dependency is impossible to determine with our current data.

16The first web-shop in the Netherlands was established in 1998 and bol.com was found

Thirdly, the accuracy of this study is limited by the fact that the web-shop revenues we used are not region specific, but measure E-commerce revenues for the entirety of the Netherlands together. Farag, Weltevreden, et al. (2006) find that E-commerce intensities differ significantly over different regions in the Netherlands, whereas we cannot make such a region-specific distinction in our analysis. Therefore, we cannot calculate municipality-specific Gini coefficients and regress these against region-specific E-commerce revenues in order to determine whether E-commerce indeed acts as an agglomeration force within these municipalities. As the data from CBS on population numbers, and from LISA on retail employment, are region specific already, namely coded by individual municipalities and four-digit ZIP-codes, data on region-specific E-commerce revenues have the potential to greatly increase the accuracy of the analysis.

The fourth way in which the accuracy of this research is limited is that the LISA database does not provide physical stores’ revenues (which would allow us to evaluate the effect of E-commerce on physical retail stores much more straightforwardly), forcing us to use employment as a proxy for the physical retail stores’ performance. However, contrary to these stores’ revenues, there is a certain “lagging period” involved in the response of stores’ employment to competition from E-commerce. In this research, we assumed the employment of a certain year to be based on the stores’ performance in the previous year, but in reality E-commerce can have both shorter- and longer-term effects on employment as well, making it more difficult to find a causal relationship between changes in E-commerce revenues and changes in physical retail stores’ employment.
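The one-year lag assumption can be made concrete with a small sketch. The helper `pair_with_lag` and all values below are hypothetical and only illustrate how employment in year t is matched to E-commerce revenue in year t minus the lag; they are not the data used in this study.

```python
def pair_with_lag(employment_by_year, revenue_by_year, lag=1):
    """Pair employment in year t with E-commerce revenue in year t - lag."""
    pairs = []
    for year in sorted(employment_by_year):
        lagged_year = year - lag
        # Years without a matching lagged revenue observation are dropped.
        if lagged_year in revenue_by_year:
            pairs.append((year, employment_by_year[year], revenue_by_year[lagged_year]))
    return pairs


# Hypothetical illustrative values, not actual LISA or web-shop figures.
employment = {2011: 100, 2012: 98, 2013: 95}
revenue = {2010: 8.0, 2011: 9.0, 2012: 10.0}

print(pair_with_lag(employment, revenue, lag=1))  # three matched years
print(pair_with_lag(employment, revenue, lag=2))  # only two matched years
```

Under these assumptions, every additional year of lag removes one usable observation, which matters for a revenue series that spans only a few years.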


7

Conclusion

The aim of this study was to analyze the effect of E-commerce on the agglomeration of physical retail stores. We used the Gini coefficient to measure the inequalities in the geographic distribution of retail employment, with respect to the distribution of the Dutch population. A Gini coefficient closer to zero indicates a lower level of agglomeration, and a higher Gini coefficient indicates more agglomeration. Both at the level of individual municipalities and four-digit ZIP-codes, we calculated (the log of) the Gini coefficients (and, at the former level, also their changes) for Dutch retail employment and regressed these (using OLS) against E-commerce revenues, where all three regressions are controlled for sector-specific and year-specific effects.
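As a rough illustration of such a regression, the sketch below estimates the OLS slope of the log Gini coefficient on E-commerce revenue while absorbing sector-specific effects by demeaning within each sector. This is a simplified stand-in for the specification in Section 4.2: year-specific effects are left out for brevity, the function name and the data are hypothetical, and the rows are generated from an exact relation so that the recovered slope is known in advance.

```python
import math
from collections import defaultdict


def sector_fixed_effects_slope(rows):
    """OLS slope of log(Gini) on revenue with sector fixed effects.

    rows: iterable of (sector, gini, revenue) tuples. Demeaning both
    variables within each sector is equivalent to including one dummy
    per sector in the regression.
    """
    by_sector = defaultdict(list)
    for sector, gini, revenue in rows:
        by_sector[sector].append((math.log(gini), revenue))
    num = 0.0
    den = 0.0
    for obs in by_sector.values():
        mean_y = sum(y for y, _ in obs) / len(obs)
        mean_x = sum(x for _, x in obs) / len(obs)
        for y, x in obs:
            num += (x - mean_x) * (y - mean_y)
            den += (x - mean_x) ** 2
    return num / den


# Hypothetical data: log(Gini) = sector effect - 0.02 * revenue, exactly.
sector_effects = {"food": -1.0, "fashion": -0.5}
rows = [
    (sector, math.exp(effect - 0.02 * revenue), revenue)
    for sector, effect in sector_effects.items()
    for revenue in (8.0, 9.0, 10.0, 11.0)
]
print(sector_fixed_effects_slope(rows))  # close to -0.02
```

A negative slope in this sketch corresponds to E-commerce acting as a dispersion force, and a positive slope to an agglomeration force.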

The results of our analysis indicate opposite effects of E-commerce on agglomeration at the two different geographic scales under consideration. Namely, at the level of individual municipalities, we find significant evidence for E-commerce driving the dispersion of physical retail stores, supporting the Love of Variety Hypothesis (2), whereas our analysis at the level of four-digit ZIP-codes, although not yielding significant results, suggests that E-commerce stimulates agglomeration within municipalities, which would support the Efficiency Hypothesis (1).

Previous studies on the effects of E-commerce paid very little attention to geographic effects, and research was typically performed from the perspective of the consumer. Therefore, the effect of E-commerce as an agglomeration or dispersion force was unclear. Clarke et al. (2015), Weltevreden and Rietbergen (2007), Farag, Krizek, and Dijst (2006), and Farag, Weltevreden, et al. (2006) are some of the few studies on the relation between geography and E-commerce, but are also all performed from the consumers’ perspective.


(2007) that Internet users are less likely to shop online when the city-center is perceived as more attractive could support our finding that E-commerce is likely to act as an agglomeration force within municipalities, but only under the assumption that city-centers where physical retail stores are more concentrated (specifically, where retail employment is more concentrated with respect to the population) are perceived as more attractive.
